Nothin' Like Good POO
In a reaction to the whole fad of using components for everything, the Java folks came up with a back to grass roots movement called Plain Old Java Objects (POJO). That’s all fine and good, but Java’s not the only language with objects. So in this article I’m writing about Plain Old Objects, or POO for short.
Fragile POO and Testing Foo
Writing good unit tests is its own challenge. In order to have confidence that you are testing what you believe you are testing, you really need to isolate the code. I don’t know about you, but most objects I write use other objects. In fact, you’ll find it fairly common to see object trees in your application. That’s true whether you are creating a web application that manages people’s information or writing games. There are probably many ways to create the tree structure, but a few things will cause a fragile architecture and easily broken tests.
Let’s say you have some code that needs to make decisions based on when a user first opened their account. Now, to make this difficult, the code is separated by a couple intermediate objects from the user account object. The fragile approach would be to chain the calls back to the user account. For example:
if ( Owner.Owner.Account.StartDate < DecisionDate )
{
    DoSomethingSpecial();
}
You can substitute code from your own language of choice if you like. So what’s so bad about this approach? The problem is we are making some assumptions that may not always hold true. Let’s say you got a Null Pointer/Reference exception in your if statement. Where do you look first? It could be that your object doesn’t have an owner. That wouldn’t be uncommon in a unit test environment. It could be your object’s owner doesn’t have an owner. Easy to miss in unit testing. Or it could be that there is no account. Perhaps it’s not the account but there is no date and the comparison is throwing the exception. That’s four things that can go wrong just on the left side of the comparison statement.
Degrees of Separation
So, we still need to make a decision on the user’s account start date. If we have corrupt data, or we are migrating accounts from one back end server to another, our code might break suddenly. If we were to be truly defensive, we would need a very complex null checking condition before we made this simple check. In the case we outlined above, we have three degrees of separation from the information we need. Really, what we need is direct access to the account.
You should have no more than one degree of separation from any information you need.
Something the component folks got right is that, by definition, every component has direct access to the information it needs. The container, or controlling object, takes care of that for us. The mechanisms to get at that information vary between the component systems, and can even be more complex than necessary.
If the Account object is central to your application, and multiple objects need to access it, you have a couple choices. You can make the Account static and centrally accessible. That’s great for easy access, and might even work if you can guarantee only one user’s account will ever be needed at one time. It’s a solution that would work for a desktop application with a single document interface. If you work on the web, or you want to let your users have more than one document open at a time, a centrally accessible static object will introduce other fragilities that are not worth the trouble.
Another solution would be to provide a lookup interface to find the right Account. This is the solution that some component frameworks use (ahem. JNDI, Avalon, etc.). A pro here is that you can do some cool things to make sure you get the right account. It allows you the flexibility of serving some accounts from LDAP while some are in a database. Problem is, we have left the world of POO and introduced a lot of complexity we probably just don’t need.
A better solution would be the judicious use of the Hollywood Principle: “Don’t call us, we’ll call you.” Some of the POJO (or POO) proponents will object, exclaiming that’s exactly the thing we wanted to avoid. Component frameworks like Spring, Castle, and PicoContainer use this approach extensively. The only thing the component frameworks do for you is automate how the component objects are assigned to each other. We don’t need to go all the way down the path of using a component architecture. All we need to do is ensure that every parent object assigns its child objects the same Account object that it holds.
Now we have one degree of separation. We write one test to make sure child objects get assigned the account object that the parent has. We write another test to make sure the child object can use the account properly. This makes it easier to mock out the Account object and simplify our test setup.
In our tests, we should only ever have to assign the object(s) that will be used directly by the code we are testing. We should never have to create a whole hierarchy of objects for code to work. When something goes wrong with your object chain, the only way to find out exactly what it is will be to break out the debugger and step through your program. Any change to any object in the chain can potentially break the code you are writing. That is why we only want one degree of separation between any information we need and the object we are writing.
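Here’s a minimal C# sketch of that one degree of separation. Every name in it (IAccount, StubAccount, Folder, Document) is hypothetical and invented for illustration; the only point is that the parent hands its own account to each child, so a test can give the child a stub directly instead of building the whole tree.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: the only thing the decision code needs.
public interface IAccount
{
    DateTime StartDate { get; }
}

// Hypothetical test stub: no database and no owner chain required.
public class StubAccount : IAccount
{
    public DateTime StartDate { get; set; }
}

public class Document
{
    private readonly IAccount account;

    public Document(IAccount account)
    {
        if (account == null) throw new ArgumentNullException("account");
        this.account = account;
    }

    public IAccount Account { get { return account; } }

    // One degree of separation: account.StartDate,
    // not Owner.Owner.Account.StartDate.
    public bool QualifiesForSpecial(DateTime decisionDate)
    {
        return account.StartDate < decisionDate;
    }
}

public class Folder
{
    private readonly IAccount account;
    private readonly List<Document> documents = new List<Document>();

    public Folder(IAccount account)
    {
        if (account == null) throw new ArgumentNullException("account");
        this.account = account;
    }

    // The parent assigns its own account to every child it creates.
    public Document AddDocument()
    {
        Document doc = new Document(account);
        documents.Add(doc);
        return doc;
    }
}
```

The null checks live in the constructors, so a missing account fails at the point where the object is wired up rather than deep inside an if statement.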
Say it with me, Singletons are Evil
I’m sure we can all point to a situation where having a singleton was very useful, and made code a lot easier. If we are honest with ourselves, we’ll realize the number of times it worked we can count on one hand, or maybe even one finger. The problem isn’t necessarily the singleton at the time it was created. The problem has more to do with how code evolves.
When you have an object that is easily accessible from anywhere in your program, more and more code will depend on that object being there. Then the code will depend on the objects that it can get from the singleton. Then you have code that accesses the singleton that would be better off written differently. Over time that singleton becomes a point of contention in your application.
In this day of multi-core processors and the demand for multi-threaded or multi-process applications, having a single object that all other objects access introduces subtle concurrency errors. Even if you handle the concurrency problems well with mutexes and a slick locking mechanism, you introduce bottlenecks and even the risk of deadlock because all your code is accessing this one singleton.
It’s worse if you mix the singleton with the multiple degrees of separation problem we talked about earlier. Now, instead of proper object oriented design, you are developing a procedural application that mixes the worst from both object programming and procedural programming metaphors.
Singletons also are a sore point with testing. The problem is that if the singleton carries any state whatsoever, the effects of one test will affect other tests. If the tests are not executed in the correct order, they may not pass. There’s nothing more frustrating than trying to figure out why a test provides different results when it is doing the same thing. The code behaves insanely, when the tests expect it to behave the same way every time.
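A small C# sketch of that order dependence, using hypothetical names (Counter, Registry, Greeting) that I made up for the illustration. The first method reads hidden global state, so its result depends on what ran before it; the second depends only on its argument.

```csharp
using System;

public class Counter
{
    public int Count { get; private set; }
    public void Record() { Count++; }
}

// Hypothetical singleton-style holder: one shared, stateful instance.
public static class Registry
{
    public static readonly Counter Shared = new Counter();
}

public static class Greeting
{
    // Result depends on every earlier call anywhere in the process.
    public static string ViaSingleton()
    {
        Registry.Shared.Record();
        return "Visit #" + Registry.Shared.Count;
    }

    // Result depends only on the counter handed in.
    public static string ViaParameter(Counter counter)
    {
        counter.Record();
        return "Visit #" + counter.Count;
    }
}
```

A test of ViaParameter can always start from a fresh Counter. A test of ViaSingleton passes or fails depending on which other tests touched Registry.Shared first.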
There’s More to Good Architecture
I don’t have the time or space to write out every little thing that can go wrong. The important approach is to simplify your system as much as possible. If you can compose your application of several well contained classes that behave predictably in tests, you stand a better chance of writing a solid application. With every version of your application you release, ask yourself, “Is there anything I can get rid of?” or “Can I do the same work with less code?”
To me, an impressive mark of a solid application is how little code is needed to do all that it does. The more lines of code, the more opportunities for something to go wrong. The more moving parts, the more unpredictable the application becomes. Pursue simplicity, one step at a time.
Multi-threading and .Net 4.0
.Net 4.0 has some new features that will make it easier to work with multiple threads. This will in turn make it easier to use up all those cores on your multi-core processors. The usual warnings and caveats apply with ensuring your classes are thread safe. So, what is there to get excited about?
First, .Net finally gets its Barrier class. A barrier makes all the threads running in parallel wait for each other at a common point before any of them continues. Java has had its equivalent since Java 5 (more than a couple of years old). I find it most useful when I am using multiple threads to do some number crunching, but I need to make sure I’ve incorporated all their work before I go on. The initial purpose is to have the multiple threads sync up before doing the next round of processing. Using the multi-threaded ray tracer problem from yesterday, let’s look at the problem it will solve:
ThreadPool.SetMaxThreads(screenWidth,25);
for (int y = 0; y < screenHeight; y++)
{
    for (int x = 0; x < screenWidth; x++)
    {
        int cx = x;
        int cy = y;
        ThreadPool.QueueUserWorkItem(state =>
        {
            Color c = RenderColor(cx, cy, scene);
            RenderPixelDelegate dl = RenderPixel;
            pictureBox.Invoke(dl, new Object[] { cx, cy, c });
        }, (y+1) * (x+1));
    }
}
Now, the RenderPixel(x,y,color) method takes care of plotting the pixel on the bitmap and telling the screen to redraw. The problem is the screen redrawing. That takes a lot of time, and the whole program performs better when you only redraw once the whole raster line is done. The catch is that when you compute the pixels in parallel, you can’t be sure the raster line is complete just because the last pixel in the row finished. Truly we only care about the last line. In Java we could poll the ThreadPool to see if there are any remaining workers. Unfortunately .Net doesn’t give us that option. This is where the Barrier comes into play. By using a Barrier we can ensure all the pixels are rendered before issuing the final redraw command:
ThreadPool.SetMaxThreads(screenWidth,25);
Barrier barrier = new Barrier(screenHeight * screenWidth + 1);
for (int y = 0; y < screenHeight; y++)
{
    for (int x = 0; x < screenWidth; x++)
    {
        int cx = x;
        int cy = y;
        ThreadPool.QueueUserWorkItem(state =>
        {
            Color c = RenderColor(cx, cy, scene);
            RenderPixelDelegate dl = RenderPixel;
            pictureBox.Invoke(dl, new Object[] { cx, cy, c });
            barrier.SignalAndWait();
        }, (y+1) * (x+1));
    }
}
barrier.SignalAndWait();
barrier.Dispose();
pictureBox.Refresh();
A couple things I’ll point out here. When you want the barrier to force the main thread to wait, you have to include it in the number of Barrier participants. I added a participant for each thread. The Java variation allowed for a thread to signal that it got to the synchronization point but not pause. That’s more useful in a situation like this where we only need to be notified that the thread is done. Having too many open Barriers may have bad effects on the ThreadPool because the worker queue items can’t officially close until all the barriers are synced up. Unfortunately I do not have .Net 4 installed on my machine so I can’t say for sure. It’s my suspicion based on my experience with the Java equivalent.
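For the “signal that I’m done, but don’t pause” behavior described above, .Net 4.0 also ships CountdownEvent, which looks like the closer fit. It is worth considering here because Barrier documents a limit of 32,767 participants, and every SignalAndWait call holds a pool thread until all participants arrive. Below is a minimal sketch of the idea, not the ray tracer itself: PixelJobs, RenderAll and the renderColor callback are hypothetical stand-ins, and the result goes into a plain int array instead of a PictureBox.

```csharp
using System;
using System.Threading;

public static class PixelJobs
{
    // Queues one work item per pixel and blocks the caller until
    // every item has signaled. Workers never wait on each other.
    public static int[] RenderAll(int width, int height,
                                  Func<int, int, int> renderColor)
    {
        int[] pixels = new int[width * height];
        using (CountdownEvent remaining = new CountdownEvent(width * height))
        {
            for (int y = 0; y < height; y++)
            {
                for (int x = 0; x < width; x++)
                {
                    int cx = x;   // copy the loop variables for the closure
                    int cy = y;
                    ThreadPool.QueueUserWorkItem(state =>
                    {
                        try
                        {
                            pixels[cy * width + cx] = renderColor(cx, cy);
                        }
                        finally
                        {
                            remaining.Signal();   // count down without pausing
                        }
                    });
                }
            }
            remaining.Wait();   // only the caller blocks
        }
        return pixels;
    }
}
```

Only the calling thread waits, so the pool threads are free to pick up the next pixel as soon as they finish one.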
Another key threading improvement is the Parallel class. In some ways I think it is a step up from the Java Fork/Join Task architecture. The nice thing about the Parallel class is that you can use the Action delegate instead of the WaitCallback delegate. The WaitCallback delegate passes in an object for the thread state; however, you rarely need it. The Parallel class will let you invoke the same delegate several times, or perform a parallel loop for you. Essentially, the code block I’ve been using can look like this now:
Parallel.For(0, screenHeight, y =>
{
    Parallel.For(0, screenWidth, x =>
    {
        int cx = x;
        int cy = y;
        Color c = RenderColor(cx, cy, scene);
        RenderPixelDelegate dl = RenderPixel;
        pictureBox.Invoke(dl, new Object[] { cx, cy, c });
    });
});
// Parallel.For returns only after every iteration has finished, so no
// Barrier is needed here. Waiting on one inside the loop body would hang,
// because the calling thread could never reach its own SignalAndWait.
pictureBox.Refresh();
Of course, I’m assuming that the Parallel.For loop has the same mutable integer problem that I experienced with delegate BeginInvoke, which is why I still copy x and y into cx and cy. The problem still exists for the ThreadPool worker pool, so there is no reason for me to assume any different here. I like the simplicity and lack of clutter that the Parallel class provides.
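That assumption is easy to check once .Net 4.0 is installed. As far as I can tell from the API, Parallel.For hands the index to the body as a lambda parameter rather than as a captured loop variable, so each invocation should see its own value and the extra copies would be harmless but unnecessary. A minimal sketch of such a check (IndexCheck and Fill are hypothetical names):

```csharp
using System;
using System.Threading.Tasks;

public static class IndexCheck
{
    // Each iteration records the index it was given in its own slot.
    public static int[] Fill(int count)
    {
        int[] seen = new int[count];
        Parallel.For(0, count, i =>
        {
            // i is a parameter of this lambda, not a shared loop variable
            seen[i] = i;
        });
        return seen;
    }
}
```

If any slot ended up holding a value other than its own index, the mutable integer problem would apply to Parallel.For as well.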
Please do note that I chose a highly parallelizable problem to show off these features. Not all complex actions are as responsive. Below is a checklist to determine if a type of problem/algorithm can be ridiculously parallel:
- No state needs to be shared between threads
- All state necessary for the function was passed in as parameters
- There are no reference or output parameters
Ray tracing fits this bill well because each pixel can be computed and plotted completely independent of the other pixels. If we added anti-aliasing to the mix, we would need to create more samples than we display. Web applications also fit this bill well because HTTP is a stateless protocol. Each request is handled independently from the others. There are several other problems that fit this bill.
When you have a set of data that is shared between threads and it has to remain consistent (i.e. race conditions would cause major problems or unstable behavior), you have to be more careful with your threads. Sure you have locks, mutexes, etc. but they essentially turn a multi-threaded application into a single threaded application and carry more overhead than if you never dealt with threads in the first place. By rethinking the problem a bit, you might be able to make the overall solution a bit more friendly to multiple threads. Below are some ways of making room:
- Copy all data into each thread
  - Avoids synchronizations because the data is being used only in one thread at a time
  - Adds memory overhead and requires efficient and safe copy routines
- Don’t do micro threads
  - The ThreadPool and Parallel classes handle micro processing needs, reusing threads as necessary
  - The costs outweigh the benefits. Always look for the major points that can be parallelized before looking at smaller items.
- Understand the goals you have for multi-threading
  - The raw processing time might be shorter if you did everything in one thread
  - You can process more things at once if you have multiple cores, but you won’t gain much from having more threads than processing units (in fact you might lose some)
- Is it the data or the process that can be run in parallel? Sometimes it’s not recommended to split an algorithm up into multiple threads, but you still might be able to process the data in chunks.
- Stop if you can’t understand what the code is supposed to be doing. Debugging parallel code is very difficult, but if you have no working mental model for how the code is supposed to work it is impossible.
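To illustrate the “process the data in chunks” point from the list above, here is a minimal sketch, with hypothetical names (ChunkedSum, SumOfSquares, chunkCount). Each chunk reads its own slice of the input and writes only to its own slot in the partial results, so the threads never coordinate until the final single-threaded sum.

```csharp
using System;
using System.Threading.Tasks;

public static class ChunkedSum
{
    public static double SumOfSquares(double[] values, int chunkCount)
    {
        if (values == null) throw new ArgumentNullException("values");
        if (chunkCount < 1)
            throw new ArgumentOutOfRangeException("chunkCount", chunkCount,
                "chunkCount must be at least 1.");

        double[] partial = new double[chunkCount];
        // Round up so the last chunk picks up any remainder.
        int chunkSize = (values.Length + chunkCount - 1) / chunkCount;

        Parallel.For(0, chunkCount, chunk =>
        {
            int start = chunk * chunkSize;
            int end = Math.Min(start + chunkSize, values.Length);
            double sum = 0.0;
            for (int i = start; i < end; i++)
            {
                sum += values[i] * values[i];
            }
            partial[chunk] = sum;   // each chunk writes only its own slot
        });

        // Combine the partial results on the calling thread.
        double total = 0.0;
        foreach (double p in partial)
        {
            total += p;
        }
        return total;
    }
}
```

A chunk count close to the number of cores is a reasonable starting point; far more chunks than cores drifts back toward micro threads.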
The trick with multi-threaded programming is minimizing the dependencies between the threads. The less they have to coordinate with each other the more efficiently the threads can use your computer. That’s a lesson from Erlang.
Fun with Delegates, or How to make .Net more like Ruby
I’ll admit it, I’m a Ruby fan. There are certain aspects of programming with that language that are really fun. While I don’t have the privilege of working with it every day, I’m working with it enough. With yesterday’s look at the world of LINQ, I decided to play a bit more with delegates and extension methods. For the other Ruby fans out there, it is true that C# is a statically typed language and that its types are locked in at compile time. However, there are ways to make C# behave more like Ruby. Sure, you can use the var keyword, but that just means the compiler infers the type from the value you assign at the declaration. Instead, I’m going to look at imitating the way Ruby does looping in C#.
list = ["one", "two", "three"]
list.each {|item| puts item}
# Alternately:
list.each do |item|
puts item
end
Perusing the .Net APIs, you’ll find that your IEnumerable<T> or List<T> objects don’t have an Each() method, although it’s not difficult to add one after the fact. The first step is to create an extension method. Extension methods are much like Ruby mixins, and are used very similarly. As long as our mixin class is in scope, so is our extension method. Let’s look at how it would look in C#:
string[] list = {"one", "two", "three"};
list.Each( item => Console.WriteLine(item) );
// Alternately:
list.Each(
    delegate(string item)
    {
        Console.WriteLine(item);
    });
The code to implement this really is not difficult to do, but it requires some understanding of the underlying mechanisms that make it possible:
public static class IEnumerableExtensions
{
    public static void Each<T>(this IEnumerable<T> list,
                               Action<T> callback)
    {
        foreach(T item in list)
        {
            callback(item);
        }
    }
}
To better understand what’s going on here, I’m going to have to call out a few things. First, the mixin class IEnumerableExtensions must be static. Without the static keyword in front of the class, the compiler will not look inside it for extension methods. That keyword also tells the compiler that all members of this class will be static. The class cannot be instantiated in any way. Functionally, it behaves a lot like a Ruby module, which is why I’m calling it a mixin class. Note: the name of the class really doesn’t matter, but convention appends the word “Extensions” to the name of the class or interface we are adding functionality to.
Next, the first parameter of an extension method carries the keyword “this” in front of the type we are extending. Without the keyword “this” we would have just another random static method. Also note, this is a feature that is not possible in Java without bytecode manipulation or dynamic proxy chicanery. Perhaps that will change in the future. For those that used to know how C++ objects worked, the structure here makes a lot of sense. It’s similar to the idiom popular among C programmers of passing the struct being modified as the first parameter. I chose the IEnumerable interface because the same method will work on anything that implements that interface, which is exactly how Ruby attacks the problem. Essentially Ruby has an Enumerable module (mixin) that defines all the extra methods the collection classes can use.
We used generics here to provide the same idioms and levels of type checking that come with the .Net platform. Even when you are mixing in concepts derived from other languages, you should never completely throw away the principles in the language you are using. The generics allow us to use the method preserving all the type safety built into the language without requiring us to write a number of overloads. This helps using the method feel a little more Ruby-like without ignoring C# principles.
Lastly, I want to point out the type Action. It is a delegate defined by the .Net framework, along with its companion “Func”. The only difference between the two is that Action does not return anything and Func does. Instead of creating my own delegates, I simply used what was already available. With the pair of delegates supplied, we can have a lot of fun. For example, let’s say we wanted to derive a list of answers from executing the same function across all the members of a set of values. One example would be to derive the Root Mean Square (RMS) value on a set of numbers. Our extension method would look like this:
public static List<TResult> Collect<T,TResult>(
    this IEnumerable<T> list, Func<T,TResult> callback)
{
    List<TResult> answers = new List<TResult>();
    foreach (T item in list)
    {
        answers.Add(callback(item));
    }
    return answers;
}
The way you would use the “Collect” method we just defined to calculate the RMS would look something like this:
double[] values = {1.0, 4.0, 3.0};
double rms = Math.Sqrt(
    values.Collect(x => x * x).Sum() / values.Count());
Console.WriteLine("RMS is: {0}", rms);
The “Collect” method as it is written will also allow you to do a mass conversion of all the elements in one IEnumerable object into a new list. For example, if you had an array of numbers and you wanted a list of strings, it can be done with just one line:
double[] values = {1.0, 4.0, 3.0};
List<string> conversion = values.Collect( item => item.ToString() );
conversion.Each( item => Console.WriteLine(item) );
Your creativity is bounded only by your imagination.
Java and C# suffer from the same ailment
I have an interest in language design, even though I have no direct outlet for it at the moment. So as I’ve been contemplating what I like and what I don’t like about the languages I have been exposed to, I’ve realized that both Java and C# are suffering from the same core ailment. That ailment is the conceptual complexity underlying these platforms. I have to say platform because both Java and C# use a virtual machine that has been used to host other languages as well. C# without the CLR is like Java without the JVM: useless. This is in stark contrast to the almost sublime conceptual simplicity of Lisp, Smalltalk, and even Ruby.
Both C# and Java have bolted on several different features to deal with the underlying complexities, much like the English language has imported words from several different languages. English, technically a Germanic language, borrows significantly from Latin and the Romance languages, and even some from Greek. We won’t mention some imported words from vastly different languages like Japanese (kimono, karaoke, katana, kanji). So it is with Java and C#. A short list of concepts shared by both languages includes:
- Autoboxing
- Attributes/Annotations
- Dynamic binding (.Net 4.0 has a DLR and Java 7 has new JVM opcodes for this purpose)
- For each style iterating
- API document generation
- and more…
The problem isn’t so much the features in and of themselves. The problem is more subtle than that. In order to deal with the complexity of the language itself, these features are necessary. In some ways, a language like Lisp has conceptual appeal, even though its syntax is hard to wrap your head around. If everything is a list, from parameters passed in to a function to data values, and the language is built around set theory, it maps pretty well to a discipline of math. Heck, with Lisp a function is just a list of operations. Although perhaps in some ways Lisp is too conceptually simple.
The problem I’m getting at is being able to form a reasonable hypothesis of how the software is addressing your problems. I remember reading a PR piece on how Java was better than C# that had a small snippet of code asking how many method invocations there were. The two or three line snippet actually ended up invoking an unexpectedly large number of methods, from attribute accessors to delegates and some other magic. The intent of the developer was clear, although the impact of the code was unexpectedly complex. That’s not to say that C# is bad. The article was a PR piece to help Java developers still feel good about themselves. However, Java is just as guilty. Have you ever tried to debug dynamic proxy code? Have you worked with features that injected functionality into your code for you (Spring/Hibernate comes to mind)?
Other than the general second law of thermodynamics, what is it that drives languages to be more complex? Rather than truly seeking simplicity, both Java and C# have progressively moved toward sweeping the inherent complexity under the rug. Essentially moving the problem from something the developer has to worry about to something the platform has to worry about. To paraphrase my wife’s favorite movie:
“There are three kinds of pipe. You have nickel, and you can see where that’s gotten you. You have bronze, which is very good… until something goes wrong. Something always goes wrong. And then you have copper, which is the only kind I use.” (from Moonstruck)
Programming is a complex process. Translating the sometimes conflicting desires of a human into something a computer can understand is not easy. That complexity is further compounded by the moving parts we need to work together to accomplish our goals. My goal in exploring the world of language design is to find the right path for true simplicity. While we are approaching that ideal from different programming paradigms, we haven’t quite reached it yet. It feels like we live in a world where there is only nickel and bronze, and copper has yet to be discovered. I’m not the only one thinking about this for sure.
To All API and Library Designers
Creating APIs and libraries can be a difficult task. There are many concerns that you have to worry about, such as design consistency, correctness, function, performance, security, and the list goes on. Something that typically gets lost in the list of concerns is usability. I know you might be thinking I’m nuts because APIs and libraries don’t have a graphical user interface. Yet they do have a user interface. The users of APIs and libraries are developers, and they use the exposed functionality provided by the API. Here are a few pointers that will cause developers to shout your praises rather than curse your name:
- Error messages/Exceptions should be clear
- Design patterns should be consistent
- Documentation should be useful
I’ll spend a little more on each topic to flesh out what I mean.
Error messages should be clear
Whether your language uses exceptions or another method of notification, the user of your API needs to know if they are the cause of the problem, and more importantly what they can do to fix it. Let’s take the example of bad parameters. They happen, particularly when a user doesn’t understand what is written in the API docs or just makes an honest mistake. When a function has a half dozen parameters, it really helps to know which parameter is causing the problem. A generic “invalid parameter exception” with no indication of which parameter is at fault isn’t very helpful. If the parameter has to be within a certain range, say so.
There’s nothing more frustrating than spending a whole day trying to figure out what you are doing wrong, searching forums for possible clues, only to turn up empty. The more you can help your users (the developer) stay focused on writing software, the better. If an exception is caused by something outside of your control, it really helps to give that information back to the developer. They may be able to fix something that your library depends on.
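As a minimal sketch of what “say so” can look like in C#: the Thumbnailer class, its parameters, and the ranges below are all hypothetical, but ArgumentNullException and ArgumentOutOfRangeException are the standard .Net exceptions, and both carry the name of the offending parameter.

```csharp
using System;

// Hypothetical API used only to illustrate parameter validation.
public static class Thumbnailer
{
    public static string Describe(string path, int width, int height, int quality)
    {
        if (path == null)
            throw new ArgumentNullException("path");
        if (width < 1 || width > 4096)
            throw new ArgumentOutOfRangeException("width", width,
                "width must be between 1 and 4096 pixels.");
        if (height < 1 || height > 4096)
            throw new ArgumentOutOfRangeException("height", height,
                "height must be between 1 and 4096 pixels.");
        if (quality < 1 || quality > 100)
            throw new ArgumentOutOfRangeException("quality", quality,
                "quality must be between 1 and 100.");

        return string.Format("{0} at {1}x{2}, quality {3}",
            path, width, height, quality);
    }
}
```

The caller learns which parameter was wrong, what value it had, and what range is allowed, all without leaving the stack trace.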
Design patterns should be consistent
Whether you are using formal design patterns or simply a programming idiom to help convey the intent of the API, use it the same way throughout your API. There is nothing more confusing than having exceptions to the rule. We get enough of that learning English; we don’t need it in our APIs as well. The problem with exceptions to the rule is that they raise the mental complexity of the API. Every time a developer accesses a function, they have to consider whether this function operates differently (or worse: which difference applies here?).
There’s no question that designing a consistent API is very difficult. The nature of the problem is trying to find an abstraction that helps the developer solve a problem. The problem is that if you choose the wrong abstraction you will make it harder to solve the problem or write the API. It’s a delicate balance. However, if you find yourself always needing to break your abstraction in a couple of places, perhaps you chose the wrong one to begin with?
Documentation should be useful
Developers really need two types of documentation. Many API writers recognize the importance of API docs (such as JavaDocs or .Net Docs). In order for these to be useful, the person referring to the documentation needs to know more than the parameters and the name of the function. They need to know what the function does for them, and if there are related functions that handle different tasks what they are. In an API reference, that is sufficient. You just need to refresh the developer’s memory, and help point them in the best direction if there are related (but different) functions.
What usually gets lost is describing the design patterns or abstraction the API is using. It’s one thing to know what the functions are; it’s quite another to be able to put them together in the right combination. Writing down the API design approach also helps the API developers understand how they are supposed to be solving the problem. Also, by writing down the API approach, you see just how difficult it is to convey. If it is too hard to explain, it is too hard to understand, and far too hard to use. With this foundation, the examples of how to use the API make a lot more sense.
Conclusion
The three concerns I listed here are the top frustrations I have with any given platform. Whether it’s Java’s extensive APIs, the .Net library, or Ruby’s API, I’ve come across violations of at least one of the three. My biggest frustration at the moment has to do with bad or confusing error messages. The worst thing you can do as an API writer is have your users play “bring me a rock” with your API. It really helps to know that parameter X is invalid because it is not within the range of Y-Z. It does not help to know that at least one parameter in a list of seven is invalid: go figure out which one and how to fix it yourself. Additionally, if the network gets dropped and you no longer have a connection to a server, that information needs to make it to the developer instead of other exceptions that are just a consequence. For example, if the problem is a file permissions problem, throwing a null pointer exception only confuses the user. They will quite rightly think you don’t know what you are doing.
Understandably, proper exception handling is important. Which exceptions do you expose, and which do you handle internally? When the system breaks, are you doing something unexpected as a fail-over? Java’s Remote Method Invocation API failed terribly in this manner. Instead of completely failing to call the remote object, it would silently fail over to an unencrypted HTTP tunneling approach first, even if the original connection was encrypted. That’s an epic security failure. Be reasonable: sometimes it is better to fail completely than it is to fail over to something insecure without notifying the developer in any way.
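To make the file permissions example concrete, here is a minimal sketch of passing the real cause along instead of letting a consequence surface. SettingsLoader and SettingsUnavailableException are hypothetical names, and the file read is passed in as a delegate purely so the sketch can be exercised without touching the disk.

```csharp
using System;

// Hypothetical exception type that keeps the original cause attached.
public class SettingsUnavailableException : Exception
{
    public SettingsUnavailableException(string message, Exception inner)
        : base(message, inner)
    {
    }
}

public static class SettingsLoader
{
    // readAllText stands in for something like File.ReadAllText.
    public static string Load(string path, Func<string, string> readAllText)
    {
        try
        {
            return readAllText(path);
        }
        catch (UnauthorizedAccessException ex)
        {
            // Say what failed, where, and what the developer can do,
            // and keep the original exception as InnerException.
            throw new SettingsUnavailableException(
                "Cannot read settings file '" + path +
                "': access denied. Check the file permissions.", ex);
        }
    }
}
```

The developer sees a message about the settings file and can still drill into InnerException for the original stack trace.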
Concurrent Programming Lessons, and some abstract thought
Taking a break from my self serving blogging about my machine geekery, I’d like to jot down my thoughts on building a concurrent language that merges some powerful concepts from other existing languages. Based on lessons from Erlang, Ruby, and JavaScript I think it is possible to approach a working model for how to do safe concurrency in an object oriented manner.
First, let’s examine some concepts from Erlang, or Concurrency Oriented Programming:
- The world around us is concurrent
- Each cognizant being maintains their own state
- Exchange of ideas is performed by passing messages
- By responding to messages, each cognizant being may change their state
Based on these observations, Erlang makes a few restrictions. Variables are write-once (i.e. immutable once set). Because of this, Erlang does not need locks, mutexes, semaphores, or other fancy concurrency control mechanisms the popular languages use. All reads will be the same, regardless of timing issues. Processes are first class citizens, and no memory is shared between processes. Again, the same evils are avoided. If information is passed from one process to another, it is done by sending messages. The approach echoes observations smarter people than myself have made in using SAX vs. DOM for XML parsing. The event (message) based architecture of SAX was less memory demanding and made it easier to maintain throughput in highly concurrent systems (i.e. web servers) than the DOM alternative. There are a few more things in here that deal with reliability, such as the VM’s ability to monitor and restart processes that fail. The end result is programs that scale easily with the processing nodes available, both local CPU cores and remote machines.
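A rough C# approximation of that message-passing style, assuming .Net 4.0’s BlockingCollection as the mailbox. CounterProcess is a hypothetical name, and this is only a sketch of the shape: it is nowhere near Erlang’s process model, since nothing here enforces immutability or restarts a failed worker.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// A tiny "process": private state, changed only in response to messages.
public class CounterProcess
{
    private readonly BlockingCollection<int> mailbox =
        new BlockingCollection<int>();
    private readonly Thread worker;
    private int total;   // touched only by the worker thread while it runs

    public CounterProcess()
    {
        worker = new Thread(Run);
        worker.IsBackground = true;
        worker.Start();
    }

    // Other threads never touch total; they only send messages.
    public void Send(int amount)
    {
        mailbox.Add(amount);
    }

    // Closes the mailbox, waits for the worker to drain it,
    // then reports the final state.
    public int Stop()
    {
        mailbox.CompleteAdding();
        worker.Join();
        return total;
    }

    private void Run()
    {
        foreach (int amount in mailbox.GetConsumingEnumerable())
        {
            total += amount;
        }
    }
}
```

No lock appears in the class itself because the state has a single owner; the only shared structure is the mailbox, and the collection handles that synchronization internally.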
Next, let’s examine some concepts from Ruby, or Object Oriented Programming:
- The world around us consists of things
- Things can act on other things, or can be the recipient of actions
- Each thing should maintain its internal state
- Things act on other things by sending messages (i.e. calling methods)
Based on this set of basic rules, there is a fair amount of overlap between the highly concurrent Erlang concepts and the object oriented view of Ruby (or Java, or C# if you prefer). In practice there are a few different types of things. First, there are things (objects) that cannot act on their own (i.e. value objects like color, money, or dates) but will respond accordingly when acted on by others. These value objects never change state and are completely passive. Next there are objects that represent the current state of the world. These business objects, as some call them, maintain their own state and respond to messages from other objects. In some cases, the business objects will act on other business objects. Finally, there are things that act, or service objects. Service objects are a little different from the other two types because they do not represent something physical, but they take care of complex logic, workflow, etc.
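A value object of the kind described above might look like this in Python. This is a sketch under the assumption that money is stored as whole cents; the `Money` class and its `plus` method are illustrative names, not from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """A value object: compared by value, never changes after creation."""
    amount: int      # in the smallest unit, e.g. cents (an assumption)
    currency: str

    def plus(self, other):
        if other.currency != self.currency:
            raise ValueError(
                f"cannot add {other.currency} to {self.currency}"
            )
        # Return a new value instead of modifying this one.
        return Money(self.amount + other.amount, self.currency)


price = Money(500, "USD")
total = price.plus(Money(250, "USD"))
```

Because the instance is frozen, assigning to `price.amount` raises an error, and `price` is still 500 cents after the addition. That is what makes such objects safe to pass between concurrent parts of a system.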
Finally, let’s examine JavaScript, or Prototype Based Programming:
- There are no object descriptions (classes), only objects
- Objects can send or respond to messages (i.e. calling methods)
- Objects should maintain their internal state
So there is some overlap here as well. The major difference between prototype languages and class-based object languages is the lack of a class. In essence, instead of defining how an object should look and behave using a class, you copy an existing object prototype. In the copy process you can extend the prototype by adding methods (new message receivers), properties, or whatever you like. You can also simply use the object as it is. A side effect is that the system tends to have fewer objects overall than an equivalent object oriented system, which helps with pesky matters like garbage collection. However, the objects tend to be a bit more powerful.
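Python is class-based, but the prototype style can be approximated. In the sketch below, `Proto`, `clone`, and `send` are invented names. Note one deliberate difference from the description above: instead of physically copying the prototype, a clone delegates any lookup it cannot satisfy to its prototype, which is closer to how JavaScript resolves properties.

```python
class Proto:
    """A classless-style object: slots live on the instance, misses go
    to the prototype it was cloned from."""

    def __init__(self, prototype=None, **slots):
        self.__dict__["_prototype"] = prototype
        self.__dict__.update(slots)

    def __getattr__(self, name):
        # Called only when normal lookup fails; delegate up the chain.
        prototype = self.__dict__["_prototype"]
        if prototype is None:
            raise AttributeError(name)
        return getattr(prototype, name)

    def send(self, message, *args):
        # Functions stored in slots receive the receiver explicitly.
        return getattr(self, message)(self, *args)

    def clone(self, **slots):
        return Proto(prototype=self, **slots)


animal = Proto(
    name="animal",
    sound="...",
    speak=lambda self: f"{self.name} says {self.sound}",
)
dog = animal.clone(name="Rex", sound="woof")
# Extend only this object; the prototype is untouched.
dog.fetch = lambda self, item: f"{self.name} fetches the {item}"
```

Here `dog.send("speak")` finds `speak` on `animal` but runs it against `dog`, and `fetch` exists only on `dog`.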
Now, some personal observations based on working with these languages:
- I come from an object oriented background. It makes sense to me, so it’s hard to make the mental shift to the other programming approaches.
- There is a fair amount of overlap in the concepts, to the point where we can start formulating how to merge them.
- Defining a class for an object that will have only one instance seems a bit excessive. The system has to keep both the definition of the object and the object itself resident in memory. Perhaps the prototype approach can help rein that in.
- Tying a process to an object gives us the concurrency of Erlang and the familiarity of objects. Essentially, the messages are methods and each object manages itself in its own process. Garbage collection can be much quicker since the collector can be optimized for one process’s data.
- Straight value objects don’t necessarily need their own process; they can run within the process (object) that uses them.
I’ve also identified a few challenges with the process/object approach. Processes will have to be monitored to see if they are still in use. A special garbage collector would need to be written for that purpose. The Erlang concept of a write-once variable matches the mathematical ideal well. For example, X=X+1 is a mathematical impossibility but a common programming concept in many languages. Yet objects need to vary their state over time. A special distinction needs to be made between state that can change and state that cannot. In some ways the concept of a Map for maintaining the internal state of an object is a natural approach. It might be how Erlang programmers maintain state in their processes.
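One way to reconcile write-once values with changing state is to never modify the state map at all: each message produces a new map. The Python sketch below shows the idea. The `handle` and `run` functions and the message names are hypothetical. It mirrors a common Erlang idiom, where a receive loop calls itself with the new state, without being a translation of any real Erlang program.

```python
from functools import reduce
from types import MappingProxyType


def handle(state, message):
    """Return the next state; the state passed in is never modified."""
    kind, value = message
    if kind == "deposit":
        return MappingProxyType(
            {**state, "balance": state["balance"] + value}
        )
    if kind == "rename":
        return MappingProxyType({**state, "owner": value})
    return state  # unknown messages leave the state unchanged


def run(state, messages):
    # Fold the messages over the state: each step yields a fresh map.
    return reduce(handle, messages, state)


start = MappingProxyType({"owner": "Ann", "balance": 100})
end = run(start, [("deposit", 50), ("rename", "Bob")])
```

After the run, `start` still holds the original owner and balance, while `end` reflects both messages. Nothing like X=X+1 ever happens to an existing map.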
There are a few things that I am concerned with, no matter what the language is or how concurrency is performed:
- Security. There are bad people out there wanting to do bad things. Unfortunately, most security models are a pain to work with and consequently don’t get used.
- Internationalization. We live on a planet with many languages and cultures. At the very least, Unicode (UTF-8, for example) should be the default internal representation of strings. This is still a field the industry is trying to figure out.
- Robustness. Error handling has to be given special attention. If you get it right, you will help people create software that won’t easily break. If you get it wrong, you will help people create monstrosities that break more easily.
- Testability. Anyone who has done unit testing seriously has learned that the design of the code affects how easy it is to test. The easier it is to test, the easier it is to catch bugs, and the less likely people will complain about writing unit tests.
- Scalability. The platform should make it easier to take advantage of new features like multicore processors and remote machines. Ideally, the software performance should scale along with the hardware. I hate jumping through hoops to do what should really be done in the platform.
These are just random thoughts. Please shoot holes in them. I know I’ll have to figure out a lot of details to make something like that work.
Naked Objects, Point - Counterpoint
The Pragmatic Programmer folks have a new book out called Domain-Driven Design Using Naked Objects, and the title caught my attention. I figured the author was using Naked Objects in the same vein as Jamie Oliver’s “The Naked Chef” (old series on Food Network): the ingredients are used to their full potential, complementing each other without the heavy use of spice. So I decided to do some research on where this came from. My suspicions were confirmed, and made even more sense when I found the original thesis came from someone at Trinity College, Dublin.
I found the original thesis by Richard Pawson, entitled Naked objects, where he details the principles behind the concept. The thesis is very readable, as theses go. It is broken up into an introduction, a case study, guiding principles, etc. What I found more interesting was the foreword written by the pioneer of the MVC pattern, Trygve Reenskaug. The concept of Naked Objects isn’t exactly new, and it should be lauded for its intent of getting back to the proper aims of object oriented design and programming. Of course, both as a technology and in some of its design constraints, Naked Objects is not without detractors. See, for example, a short paper by Larry Constantine called The Emperor Has No Clothes: Naked Objects Meet the Interface.
The true value in something like Naked Objects is to get you to adjust the way you are thinking. The main concept is to build the complete logic of the system using a finite set of domain objects. The framework is designed to take care of database persistence and user interface. According to the foreword in the thesis, the spirit behind MVC is that each view is mapped to only one object, although each object might be mapped to many views. The controller is responsible for mapping the events of the view (inputs, etc.) to the domain model. Essentially the domain model (or model) uses views for output and controllers for input. This is different from the way the pattern was originally described to me and from how I originally understood it. Pawson argues that the framework can generate the user interface views and controllers automatically. Further advances in the concept also automatically map the domain model to a relational database using Hibernate.
The software developer side of me likes this concept. It’s less plumbing to worry about. I don’t have to know how to code a user interface. I can get my work done quicker. However, the user interface design side of me loathes the concept because the user interface (by Pawson’s own admission) is not easily grasped without training, nor is it particularly accessible. The architect in me is thinking about how I can have my cake and eat it too. Ignoring the problem of database mapping for a moment, the real challenge is in the view/controller (VC) layer. Pawson cites arguments from advocates of Object Oriented User Interface (OOUI) design that there is only one true correct way of representing an object. Yet he turns around and presents two: an icon to represent the object and a dialog box to represent the content in the object. In the project I am working on now, there are at least two representations of every object: the view in a list, and the view of the full content of the object. Nevertheless, there remain concepts I can leverage.
In some respects Wicket would be an ideal candidate for dynamic generation of VC code. Or at the very least, due to its attempt to treat the view layer in an object oriented manner, some extensions to the application can dynamically generate the controller side. I have some reservations about pursuing that too far at the moment. The real conundrum is in the presentation layer. Managing information and behavior is something that object oriented languages are designed to handle. It is right and good to take advantage of the features of your language to properly model the business domain. However, representing that same information to the user in a way that makes sense to the user is a completely different discipline. I can argue against the principles in OOUI till I’m blue in the face, but that doesn’t solve the fundamental problem.
What we need is a way for the programmers to create the functionally complete object oriented domain model, while the user interface specialists concentrate on their responsibility. While frameworks such as Wicket have tried to address that very problem, it is my personal opinion that they fall a little short. I don’t think the fault lies with Wicket. The fault lies within the current set of W3C standards and differing levels of browser compliance. The W3C is still stuck on a model that prefers static information. If the W3C were to truly pursue a model where the user interface layer is bound to certain objects and the browser makes calls to the server to render these objects, we might have a better solution. We’ve already started down this path with AJAX and the myriad of JavaScript frameworks to make this work. Needless to say, there is a lot of future work to be done before we truly see a synergy between functionally complete domain models and object oriented user interfaces.
The goal of such an endeavor should be to allow user interface designers these freedoms:
- Create the representations of the objects as they see fit
- Create the rules of how to select the correct view from the different possibilities
The controller logic should be built into the browser already, in terms of invoking the domain model (or representations of a remote object).
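To illustrate the two freedoms, here is a small Python sketch of a view registry. Everything in it is hypothetical (`Person`, `VIEWS`, `render`, and the "list" and "detail" contexts). The point is only that the domain object knows nothing about its representations, and that the selection rule lives in a table the designers could own.

```python
class Person:
    """A domain object with no knowledge of how it is displayed."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


# Designers own this table: one entry per (domain type, context).
VIEWS = {
    (Person, "list"): lambda p: p.name,
    (Person, "detail"): lambda p: f"{p.name} <{p.email}>",
}


def render(obj, context):
    """Pick the view registered for this object's type and context."""
    try:
        view = VIEWS[(type(obj), context)]
    except KeyError:
        raise LookupError(
            f"no {context!r} view registered for {type(obj).__name__}"
        ) from None
    return view(obj)


ada = Person("Ada", "ada@example.com")
summary = render(ada, "list")
full = render(ada, "detail")
```

Adding a third representation means adding one entry to `VIEWS`; the `Person` class does not change.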
While I’m on the subject of Naked Objects and domain models, I’d like to make a minor rant on Object/Relational Mapping tools. One of the problems is that ORM tools tend to require accessor and mutator methods (getters and setters) for every field that is going to be persisted to the database. While you are technically encapsulating the internal state of the object, in 99.44% of the cases there is no difference between using the accessor and mutator methods and directly accessing the underlying attributes of the class. In a properly designed object, you only expose via accessors the information the user is allowed to see, and you only provide mutator methods for what the user is allowed to change. ORM tools require you to violate those principles if you want to persist the information down to the database. Some ORM tools (ActiveRecord) generate these accessors and mutators dynamically for you. That’s great for convenience, but terrible for a properly designed domain model. For the time being, there really is no alternative unless you write the ORM layer yourself, which is not recommended if you can help it.
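For contrast, here is roughly what a "properly designed object" in the sense above could look like in Python. The `Account` class is a made-up example, and `to_row` is a hypothetical persistence hook. It amounts to a tiny piece of a hand-written mapping layer, which is exactly the cost being complained about.

```python
class Account:
    """Exposes only what callers may see or do; no blanket setters."""

    def __init__(self, owner, balance=0):
        self._owner = owner
        self._balance = balance

    @property
    def balance(self):
        # Read-only: there is deliberately no balance setter.
        return self._balance

    def deposit(self, amount):
        # State changes go through behavior, not through a mutator.
        if amount <= 0:
            raise ValueError(f"amount must be positive, got {amount}")
        self._balance += amount

    def to_row(self):
        # Hypothetical persistence hook the object itself controls,
        # instead of public getters and setters for every field.
        return {"owner": self._owner, "balance": self._balance}


account = Account("Ann", 10)
account.deposit(5)
row = account.to_row()
```

A caller can read `account.balance` but cannot assign to it, and the only way to change the balance is through `deposit`, which enforces its own rule.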
Web Accessibility
I’ve been spending a lot of time looking into accessibility for web sites to finally get a grip on what needs to be done. Sadly, it is overlooked by the big guys when it doesn’t really have to be so hard. There are several types of disabilities including:
- Visual (full, partial, or color blindness; Photo-Epileptic Seizure [PES] susceptibility)
- Motor control (full or partial disabilities preventing the use of a mouse)
- Cognitive disorders (dyslexia and the like)
- Auditory (full or partial deafness)
The main issue with accessibility is the lack of knowledge and reasonable resources available to help. In particular, the tools that are needed to support people who have these disabilities are expensive and complicated. However, there are tools that everyone can have at their fingertips that will help. Most are Firefox plugins; some require you to use an external site. Below are a few of the resources I’ve found:
The folks at WebAIM have several good recommendations. For example, you know those “edit in place” controls on Flickr? You can make them keyboard accessible by applying the attribute tabindex="0". The control will then be in the page’s natural tab order. I did a little experimentation and the control supplied by Script.aculo.us can take care of all these things for you. The only issue is that IE 6 won’t put the control in the tab order unless it is in the original markup. Dynamically changing the DOM to add it won’t put the control in the tab order with that browser.
Web Design 101
This is a departure from the film and darkroom printing posts, so apologies and I’ll be back on that later.
I’d like to think I’m a pretty good application designer, although some of the things that I’ve seen on other sites have left me a bit jealous. Face it, I didn’t have a background in web design. I know what I should be able to do, and what it ends up looking like. The problem is familiar to anyone who has designed for the web: IE and non-standard implementations of HTML, JavaScript, CSS, etc. The way it’s supposed to work is you whip open the standards, see what CSS can do for your style, and do it. When you see impressive use of CSS from the CSS Zen Garden you start thinking, why can’t I do that?
I come from an application development background, where we can design our desktop apps with nice grid based layouts and do all kinds of fancy things. From there I went on to web development, and have always been frustrated with my inability to escape using tables for layout. Tables are great for lining things up, but they can cause their own issues when you start incorporating AJAX and other goodies. Then you have to worry about what your clients’ browsers can do and what they can’t do. At least I don’t have to worry about Netscape 4 and its 30-second lag time to sort out nested tables anymore.
So, I think I’ve solved the worst of my problems. I solved the nonstandard JavaScript implementation problem by using Prototype.js, which made it easy to take care of AJAX support for tags and made some of my complex form handling easier. I was still doing my own JavaScript implementations of type-ahead controls and the like. Then I discovered Scriptaculous, which took the pain out of 90% of the more advanced control issues I had.
But what could be done about the presentation hell that Internet Explorer puts me through compared to other browsers on the market? IE positioned things just a bit off, and handled the positioning of divs a little crazy as well. If I was going to both have a nice visual grid layout and use markup the way it was intended, I needed a solution like Prototype and Scriptaculous to help me out. Enter Blueprint CSS to take away the pain and suffering of using divs to align your content. Blueprint also provides some nice little “plugins” that do pretty buttons or decorate links with graphic icons, all with pure CSS. The graphic icons degrade gracefully, as IE 6 does not support the type of selectors they use. However, the library includes a couple of PNGs with transparencies. So what does one do about the fact that IE 6 does not render those nicely? Fix it up with Unit PNG Fix of course.
Now my header looks like this:
<head>
<title>My Site</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" href="css/screen.css" type="text/css" media="screen, projection">
<link rel="stylesheet" href="css/print.css" type="text/css" media="print">
<!-- Fix nonstandard IE spacing -->
<!--[if lt IE 8]><link rel="stylesheet" href="css/ie.css" type="text/css" media="screen, projection"><![endif]-->
<!-- Site CSS -->
<link type="text/css" href="css/mysite.css" rel="stylesheet"/>
<!-- Fix IE6 PNG transparency issues -->
<!--[if lte IE 6]><script type="text/javascript" src="lib/unitpngfix.js"></script><![endif]-->
<!-- Prototype.js/Script.aculo.us -->
<script type="text/javascript" src="lib/prototype/prototype.js"></script>
<script type="text/javascript" src="lib/scriptaculous/scriptaculous.js"></script>
</head>
It seems like a lot, but it makes my life so much easier. I highly recommend Head First Web Design as it covers all these wonderful concepts and more.
Adobe Lightroom: Bringing Fun Back to Digital Photography
Adobe Lightroom is a vast improvement over Photoshop Elements, to which a hundred of you will chime in and say “no duh!” and another hundred of you will chime in and say “wait till you try Aperture!”. The truth is that all the adjustments I need are taken care of in both Lightroom and Aperture. I’ll have to wait to try Aperture until I get my own Mac. The family Mac is used all too much for me to have time with it. My wife is doing transcripts, my son is creating and editing music, and my daughter is playing with video and her own canned music. That says something about the utility of the Mac out of the box. Dell, HP, Gateway, Sony, etc., pay attention: the set of software included in a Mac is both fun and useful for a family. It is not the load of crap that slows down the PC and gets in the way.
Back to Lightroom. All my complaints about tagging and organizing my photos have been addressed in this piece of software. It’s how things should be. Of course, it does highlight a problem I noticed about my tags: over time some of the supporting players on my son’s team had their jersey numbers change. I shouldn’t have included the numbers in the name tags. The cool thing is that I was able to tag and organize my pictures from a game in about half an hour. Considering I am tagging the pictures with the players that can be seen, that’s better than Flickr. I’m sure that Aperture is equally cool in this regard. I just can’t play with it yet.
It’s important to point out that when you make the things people have to do fun, they are in turn more productive. They also have a certain loyalty to the brand. I thought it was interesting that in the O’Reilly experiment, where their Lightroom columnist tried Aperture for a week and vice versa, both columnists stayed loyal to the app that first made managing their pictures fun. They appreciated several features of the application they weren’t used to, but because the two work differently they had to relearn how to do the same things they already knew how to do, which is never fun. It’s like a baller fixing their shot. They already know they can shoot, and they continually fall back into their old shot until they have practiced so much that the new shot becomes natural. It’s work, and work is not fun. The only thing that will help them pursue the change is that they will become even better. Aperture and Lightroom suit different styles of working with your pictures. Aperture is very non-linear, and Lightroom is more structured. One avoids forcing a workflow on the user, and the other encourages a workflow.
Structure can help people make sense of their world, but the wrong structure can get in your way more than it supports you. If your discipline is to organize the pictures before you spend any time touching them up, then you will enjoy Lightroom. If you like to make multiple passes and organize on the fly as you touch up the pictures (a much more organic process), then Aperture will better suit you. Lightroom forces you to adapt to its way of thinking before you can enjoy it. Organize, then touch up. I tried organizing, and touching up the pictures one by one and that took too long. I adapted by organizing, selecting the top 10 and then touching them up one by one. It was annoying to have to change modes to change pictures, but it was more productive that way. I prefer a more organic approach. I’ll definitely try out Aperture (which is at a nicer price point anyway) in the near future.
The bottom line is that when the cost per picture is fairly low you will take more pictures. You then have to do something to manage them all. With large format photography, you have a much lower production pace because it takes longer to shoot, costs more per frame, and you usually don’t have to compromise to get a “close enough” picture. You can manage it efficiently with proofs and notes on paper. Programs like Aperture and Lightroom exist because of the demands unique to digital photography.
