Nothin' Like Good POO
In a reaction to the whole fad of using components for everything, the Java folks came up with a back to grass roots movement called Plain Old Java Objects (POJO). That’s all fine and good, but Java’s not the only language with objects. So in this article I’m writing about Plain Old Objects, or POO for short.
Fragile POO and Testing Foo
Writing good unit tests is its own challenge. In order to have confidence that you are testing what you believe you are testing, you really need to isolate the code. I don’t know about you, but most objects I write use other objects. In fact, you’ll find it fairly common to see object trees in your application. That’s true whether you are creating a web application that manages people’s information or writing games. There are probably many ways to create the tree structure, but a few of them lead to a fragile architecture and easily broken tests.
Let’s say you have some code that needs to make decisions based on when a user first opened their account. Now, to make this difficult, the code is separated by a couple intermediate objects from the user account object. The fragile approach would be to chain the calls back to the user account. For example:
if ( Owner.Owner.Account.StartDate < DecisionDate )
{
    DoSomethingSpecial();
}
You can substitute code from your own language of choice if you like. So what’s so bad about this approach? The problem is we are making some assumptions that may not always hold true. Let’s say you got a Null Pointer/Reference exception in your if statement. Where do you look first? It could be that your object doesn’t have an owner. That wouldn’t be uncommon in a unit test environment. It could be your object’s owner doesn’t have an owner. Easy to miss in unit testing. Or it could be that there is no account. Perhaps it’s not the account but there is no date and the comparison is throwing the exception. That’s four things that can go wrong just on the left side of the comparison statement.
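Here is a rough Ruby sketch of that situation. The Account and Node structs, the decision date, and the helper methods are all names I made up for illustration. It builds four differently broken object trees and shows that every one of them fails with the same kind of error, which is why the exception alone doesn't tell you where to look.

```ruby
require 'date'

# Hypothetical stand-ins for the real objects.
Account = Struct.new(:start_date)
Node    = Struct.new(:owner, :account)

DECISION_DATE = Date.new(2006, 1, 1)

# Same shape as the check above: owner, owner's owner, account, start date.
def special_treatment?(node)
  node.owner.owner.account.start_date < DECISION_DATE
end

# Builds a three-level chain with the account hanging off the top object.
def build_chain(account)
  top = Node.new(nil, account)
  middle = Node.new(top, nil)
  Node.new(middle, nil)
end

# Four different ways the chain can be broken.
def broken_setups
  [
    Node.new(nil, nil),                 # the object has no owner
    Node.new(Node.new(nil, nil), nil),  # the owner has no owner
    build_chain(nil),                   # there is no account
    build_chain(Account.new(nil))       # the account has no start date
  ]
end

# Every setup raises the same error class.
broken_setups.each do |node|
  begin
    special_treatment?(node)
  rescue NoMethodError => e
    puts e.class
  end
end
```

The error message will name the method that was called on nil, which helps a little, but two of the four failures are both a missing owner.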
Degrees of Separation
So, we still need to make a decision on the user’s account start date. If we have corrupt data, or we are migrating accounts from one back end server to another, our code might break suddenly. If we were to be truly defensive, we would need a very complex null checking condition before we made this simple check. In the case we outlined above, we have three degrees of separation from the information we need. Really, what we need is direct access to the account.
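To make the cost concrete, this is roughly what the fully defensive version of the check looks like in Ruby. The struct names are hypothetical, and I am assuming that a missing link should simply mean "no special treatment", which your application may or may not want.

```ruby
require 'date'

# Hypothetical stand-ins for the real objects.
Account = Struct.new(:start_date)
Node    = Struct.new(:owner, :account)

# Every link in the chain needs its own guard before the one comparison
# we actually care about.
def special_treatment?(node, decision_date)
  !node.owner.nil? &&
    !node.owner.owner.nil? &&
    !node.owner.owner.account.nil? &&
    !node.owner.owner.account.start_date.nil? &&
    node.owner.owner.account.start_date < decision_date
end
```

Four guards for one comparison, and the method still knows far too much about how the objects are wired together.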
You should have no more than one degree of separation from any information you need.
Something the component folks got right is that, by definition, every component has direct access to the information it needs. The container, or controlling object, takes care of that for us. The mechanisms to get at that information vary between component systems, and can be more complex than necessary.
If the Account object is central to your application, and multiple objects need to access it, you have a couple choices. You can make the Account static and centrally accessible. That’s great for easy access, and might even work if you can guarantee only one user’s account will ever be needed at one time. It’s a solution that would work for a desktop application with a single document interface. If you work on the web, or you want to let your users have more than one document open at a time, a centrally accessible static object will introduce other fragilities that are not worth the trouble.
Another solution would be to provide a lookup interface to find the right Account. This is the solution that some component frameworks use (ahem. JNDI, Avalon, etc.). A pro here is that you can do some cool things to make sure you get the right account. It allows you the flexibility of serving some accounts from LDAP while some are in a database. Problem is, we have left the world of POO and introduced a lot of complexity we probably just don’t need.
A better solution would be the judicious use of the Hollywood Principle: “Don’t call us, we’ll call you.” Some of the POJO (or POO) proponents will object, exclaiming that’s exactly the thing we wanted to avoid. Component frameworks like Spring, Castle, and PicoContainer use this approach extensively. The only thing that the component frameworks do for you is automate how the component objects are assigned to each other. We don’t need to go all the way down the path of using a component architecture. All we need to do is ensure that every parent assigns its child objects the same Account object that the parent itself has.
Now we have one degree of separation. We write one test to make sure child objects get assigned the account object that the parent has. We write another test to make sure the child object can use the account properly. This makes it easier to mock out the Account object and simplify our test setup.
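A minimal Ruby sketch of that arrangement follows. The Node class, its add_child method, and the FakeAccount stand-in are my own hypothetical names, and I am assuming the parent is the one that creates its children.

```ruby
require 'date'

# Hypothetical tree node that holds its account directly.
class Node
  attr_reader :account, :children

  def initialize(account = nil)
    @account = account
    @children = []
  end

  # The parent creates the child and hands over its own account.
  def add_child
    child = Node.new(@account)
    @children << child
    child
  end

  # One degree of separation: the account is right here.
  def special_treatment?(decision_date)
    account.start_date < decision_date
  end
end

# A stand-in for the real account with only the attribute the code uses.
FakeAccount = Struct.new(:start_date)

account = FakeAccount.new(Date.new(2005, 1, 1))
grandchild = Node.new(account).add_child.add_child
puts grandchild.account.equal?(account)                      # prints true
puts grandchild.special_treatment?(Date.new(2006, 1, 1))     # prints true
```

Notice that testing special_treatment? no longer needs a tree at all. A single node and a fake account are enough.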
In our tests, we should only ever have to assign the object(s) that will be used directly by the code we are testing. We should never have to create a whole hierarchy of objects for code to work. When something goes wrong with your object chain, the only way to find out exactly what it is will be to break out the debugger and step through your program. Any change to any object in the chain can potentially break the code you are writing. That is why we only want one degree of separation between any information we need and the object we are writing.
Say it with me, Singletons are Evil
I’m sure we can all point to a situation where having a singleton was very useful, and made code a lot easier. If we are honest with ourselves, we’ll realize we can count the times it worked on one hand, or maybe even one finger. The problem isn’t necessarily the singleton at the time it was created. The problem has more to do with how code evolves.
When you have an object that is easily accessible from anywhere in your program, more and more code will depend on that object being there. Then the code will depend on the objects that it can get from the singleton. Then you have code that accesses the singleton that would be better off written differently. Over time that singleton becomes a point of contention in your application.
In this day of multi-core processors and the demand for multi-threaded or multi-process applications, having a single object that all other objects access introduces subtle concurrency errors. Even if you handle the concurrency problems well with mutexes and a slick locking mechanism, you introduce bottlenecks and even the risk of deadlock because all your code is accessing this one singleton.
It’s worse if you mix the singleton with the multiple degrees of separation problem we talked about earlier. Now, instead of proper object oriented design, you are developing a procedural application that combines the worst of both the object oriented and procedural programming styles.
Singletons also are a sore point with testing. The problem is that if the singleton carries any state whatsoever, the effects of one test will affect other tests. If the tests are not executed in the correct order, they may not pass. There’s nothing more frustrating than trying to figure out why a test provides different results when it is doing the same thing. The code behaves insanely, when the tests expect it to behave the same way every time.
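Here is a small Ruby sketch of that failure, using the Singleton module from the standard library. The VisitCounter class and the two "tests" are hypothetical. Both tests do exactly the same thing, yet only the one that runs first passes.

```ruby
require 'singleton'

# Hypothetical singleton that carries a little state.
class VisitCounter
  include Singleton

  attr_reader :count

  def initialize
    @count = 0
  end

  def visit
    @count += 1
  end
end

# Two checks that each assume they start with a fresh counter.
def first_test_passes?
  VisitCounter.instance.visit
  VisitCounter.instance.count == 1
end

def second_test_passes?
  VisitCounter.instance.visit
  VisitCounter.instance.count == 1
end

puts first_test_passes?    # prints true
puts second_test_passes?   # prints false, the first test left a visit behind
```

Swap the order of the two calls and the results swap with them, which is exactly the order dependence described above.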
There’s More to Good Architecture
I don’t have the time or space to write out every little thing that can go wrong. The important approach is to simplify your system as much as possible. If you can compose your application of several well contained classes that behave predictably in tests, you stand a better chance of writing a solid application. With every version of your application you release, ask yourself, “Is there anything I can get rid of?” or “Can I do the same work with less code?”
To me, an impressive mark of a solid application is how little code is needed to do all that it does. The more lines of code, the more opportunities for something to go wrong. The more moving parts, the more unpredictable the application becomes. Pursue simplicity, one step at a time.
Naked Objects, Point - Counterpoint 2
The Pragmatic Programmer folks have a new book out called Domain-Driven Design Using Naked Objects. The title caught my attention, and I figured the author was using Naked Objects in the same vein as Jamie Oliver’s “The Naked Chef” (old series on Food Network). Essentially, the ingredients are used to their full potential, complementing each other without the heavy use of spice. So I decided to do some research on where this came from. My suspicions were confirmed, and made even more sense when I found the original thesis came from someone at Trinity College, Dublin.
I found the original thesis by Richard Pawson entitled Naked objects, where he details the principles behind the concept. The thesis is very readable, as theses go. It is broken up into an introduction, a case study, guiding principles, etc. What I found more interesting was the foreword written by the pioneer of the MVC pattern, Trygve Reenskaug. The concept of Naked Objects isn’t exactly new, and it should be lauded for its aim of getting back to the proper intent of object oriented design and programming. Of course, both as a technology and in some of its design constraints, Naked Objects is not without detractors. For example, there is a short paper by Larry Constantine called The Emperor Has No Clothes: Naked Objects Meet the Interface.
The true value in something like Naked Objects is to get you to adjust the way you are thinking. The main concept is to build the complete logic of the system using a finite set of domain objects. The framework is designed to take care of database persistence and user interface. According to the foreword in the thesis, the spirit behind MVC is that each view is mapped to only one object, although each object might be mapped to many views. The controller is responsible for mapping the events of the view (inputs, etc.) to the domain model. Essentially the domain model (or model) uses views for output and controllers for input. This is different from the way it was originally described to me and from how I originally understood the pattern. Pawson argues that the framework can generate the user interface views and controllers automatically. Further advances in the concept also automatically map the domain model to a relational database using Hibernate.
The software developer side of me likes this concept. It’s less plumbing to worry about. I don’t have to know how to code a user interface. I can get my work done quicker. However, the user interface design side of me loathes the concept because the user interface (as Pawson himself admits) is not easily grasped without training, nor is it particularly accessible. The architect in me is thinking about how I can have my cake and eat it too. Ignoring the problem of database mapping for a moment, the real challenge is in the view/controller (VC) layer. Pawson cites arguments from advocates of Object Oriented User Interface (OOUI) design that there is only one true correct way of representing an object. Yet he turns around and presents two: an icon to represent the object and a dialog box to represent the content in the object. In the project I am working on now, there are at least two representations of every object: the view in a list, and the view of the full content of the object. Nevertheless, there still remain concepts I can leverage.
In some respects Wicket would be an ideal candidate for dynamic generation of VC code. Or at the very least, due to its attempt to treat the view layer in an object oriented manner, some extensions to the application can dynamically generate the controller side. I have some reservations about pursuing that too far at the moment. The real conundrum is in the presentation layer. Managing information and behavior is something that object oriented languages are designed to handle. It is right and good to take advantage of the features of your language to properly model the business domain. However, representing that same information to the user in a way that makes sense to the user is a completely different discipline. I can argue against the principles in OOUI till I’m blue in the face, but that doesn’t solve the fundamental problem.
What we need is a way for the programmers to create the functionally complete object oriented domain model, while the user interface specialists concentrate on their responsibility. While frameworks such as Wicket have tried to address that very problem, it is my personal opinion that they fall a little short. I don’t think the fault lies with Wicket. The fault lies within the current set of W3C standards and differing levels of browser compliance. The W3C is still stuck on a model that prefers static information. If the W3C were to truly pursue a model where the user interface layer is bound to certain objects and the browser makes calls to the server to render these objects, we might have a better solution. We’ve already started down this path with AJAX and the myriad of JavaScript frameworks to make this work. Needless to say, there is a lot of future work to be done before we truly see a synergy between functionally complete domain models and object oriented user interfaces.
The goal of such an endeavor should be to allow user interface designers these freedoms:
- Create the representations of the objects as they see fit.
- Create the rules of how to select the correct view from the different possibilities.
The controller logic should be built into the browser already, in terms of invoking the domain model (or representations of a remote object).
While I’m on the subject of Naked Objects and domain models, I’d like to make a minor rant on Object/Relational Mapping tools. One of the problems is that ORM tools tend to require accessor and mutator methods (getters and setters) for every field that is going to be persisted to the database. While you are technically encapsulating the internal state of the object, in 99.44% of the cases there is no difference between using the accessor and mutator methods and directly accessing the underlying attributes of the class. In a properly designed object, you only need to expose information via accessors that the user is allowed to see, and you only provide mutator methods for what the user is allowed to change. ORM tools require you to violate those principles if you want to persist the information down to the database. Some ORM tools (ActiveRecord) generate these accessors and mutators dynamically for you. That’s great for convenience, but terrible for a properly designed domain model. For the time being, there really is no alternative unless you write the ORM layer yourself. Not recommended if you can help it.
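To show what I mean by a properly designed object, here is a plain Ruby sketch with no ORM involved. The Account class, its fields, and the deposit rule are hypothetical. Callers can read two values, and the only way to change the balance is through a method that enforces a rule.

```ruby
# Hypothetical domain object, written by hand rather than generated.
class Account
  # Readers only: callers can see these values but cannot assign them.
  attr_reader :owner_name, :balance

  def initialize(owner_name, balance)
    @owner_name = owner_name
    @balance = balance
  end

  # The one permitted change goes through a method that enforces a rule.
  def deposit(amount)
    raise ArgumentError, 'deposit must be positive' unless amount > 0
    @balance += amount
  end
end

account = Account.new('Pat', 100)
account.deposit(25)
puts account.balance                  # prints 125
puts account.respond_to?(:balance=)   # prints false
```

A blanket setter for every persisted field would let any caller write a balance directly and skip the deposit rule entirely.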
Using Many Objects at Once
Last article was meant to introduce you to the terminology that is common to object oriented languages. You should have a clear idea of what an Object is and what a Method is. Today, we are going to learn about how to use more than one object at a time. We’ll start with simple variables, which are simply names for objects you are using so you can get to them again. Then we’ll talk about some special objects called containers, whose sole purpose is to organize a group of objects for you.
A container is an object that holds a bunch of other objects
Variables
So there’s a little more terminology here. We can assign objects to individual variables fairly easily just by giving a name, using the equals sign, and the value. For example:
my_variable = 12345
As long as the variable name starts with a letter or an underscore (“_”) character, the rest of the characters can be letters, numbers, and underscores. Almost all languages have this requirement, so that the computer doesn’t get confused between what is supposed to be a number and what is supposed to be a variable. Just a little helpful advice from someone with experience here: use variable names that make sense. It is better to type a few extra keys than wonder whether _cntr meant counter or center later on. It’s also better to use names that reflect the variable’s purpose rather than what it is made of.
array = ['cero', 'uno', 'dos', 'tres']
ones_column = ['cero', 'uno', 'dos', 'tres']
The variable named ‘array’ is kind of silly. It describes what kind of object is assigned to the variable rather than what we are using the variable for. The variable named ‘ones_column’ describes the purpose for the variable better. We are using an array of names to write out the ones column of a number.
Variables let us hold on to objects with a user friendly name so we can use them later
Arrays
Since we already introduced how to create an array above, let’s talk about how they are used. Arrays hold a bunch of related values together so that we can use them in interesting ways. Using the ‘ones_column’ example above, we can use any single digit number to get the name of that number.
Arrays are lists of information
puts ones_column[2] # prints 'dos'
Remember in the last lesson, when we had a number do something using the times method? The number passed back to our code started at zero. The number inside the square brackets is an index number that starts at zero. So the first element in the array is always 0, not 1. That’s why we started the array with ‘cero’ (Spanish for zero) instead of ‘uno’ (Spanish for one).
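If the zero-based counting still feels odd, a few quick lookups may help. This little snippet only uses the ones_column array from above, plus the length method, which tells you how many elements an array holds.

```ruby
ones_column = ['cero', 'uno', 'dos', 'tres']

puts ones_column[0]           # prints 'cero', the first element
puts ones_column[3]           # prints 'tres', the last element
puts ones_column.length       # prints 4, one more than the last index
puts ones_column[4].inspect   # prints nil, there is nothing at index 4
```

Asking for an index that doesn't exist gives you nil (Ruby's way of saying "nothing") rather than an error, so keep an eye out for that.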
An Array is a Collection, so like all collections it has an each method. The each method works just like the times method on the number objects. The method will pass back the elements (the objects) of the array one at a time in order. So, if we wanted to count in Spanish, we would use our ones_column array like this:
ones_column.each do |number|
  puts number
end
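To see how each relates to times, here is a sketch that walks the same array both ways and compares the results. The with_each and with_times names are just ones I picked, and the << you see is how you add an element to the end of an array.

```ruby
ones_column = ['cero', 'uno', 'dos', 'tres']

# each hands us the elements themselves, one at a time.
with_each = []
ones_column.each do |number|
  with_each << number
end

# times hands us index numbers, so we look each element up ourselves.
with_times = []
ones_column.length.times do |index|
  with_times << ones_column[index]
end

puts with_each == with_times   # prints true
```

Both approaches visit the same elements in the same order. The each version is usually the nicer one because you never have to think about index numbers.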
Hash
A Hash allows you to map one object as a key and another object as the value. Other languages will call the Hash a Map or a Dictionary, essentially letting you look up one object (a definition) using another object (the word to look up). Think of the word key in the terms of a map key. A symbol represents a concept, so to speak.
Hash lets you associate key objects with value objects
In Ruby, Hashes are created using the curly braces, and the key/value pairs are marked with the phrase: key => value . They can be quite handy when you have a lot of information to pass from one place to the next. One variable would take all the name/value pairs. If it helps, think of it as a collection of variables. It is most common to see string names be the key, but in Ruby we also have something called a symbol. Strings are surrounded by quotes, and they can change. Symbols have a colon in front and they never change.
my_map = {'string' => 'surrounded by quotes',
:symbol => 'starts with colon',
:string => 'not the same as a string'}
puts my_map['string'] # shows 'surrounded by quotes'
puts my_map[:symbol] # shows 'starts with colon'
puts my_map[:string] # shows 'not the same as a string'
puts my_map[:string.to_s] # shows 'surrounded by quotes'
In the above code, we set up a map of keys to string values. Just so you know, the objects don’t have to be the same type. In fact you’ll see that we used one string and two symbols for keys. You’ll also see that the Hash object treats them differently. The last lookup call might cause a little confusion, so look at the method being called on the symbol. The to_s method is on every Ruby object, and it is a convenient way to get a string out of the object. For a symbol, the to_s returns the name of the symbol (everything except for the colon). Because that happens to be the same name as a string key that is already in the map, it returns the value that belongs to the string.
This brings up a very important point, particularly for newbies. Save yourself a lot of headaches, and only use keys of the same type until you know exactly what you are doing. If you ever use many types of objects for different keys make sure you document your code with your comments. You don’t want to run the chance of spending hours tracking down a bug only to find it is the same thing I showed here. Talk about your ‘Doh!’ moments.
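Here is roughly what that bug looks like in practice. The ages hash and the names in it are made up for illustration.

```ruby
# Hypothetical data: the hash was filled using string keys...
ages = { 'alice' => 30, 'bob' => 25 }

# ...but a later lookup uses a symbol by mistake.
puts ages['alice'].inspect   # prints 30
puts ages[:alice].inspect    # prints nil, with no error to warn you

# Asking whether the key exists makes the mismatch visible.
puts ages.key?('alice')      # prints true
puts ages.key?(:alice)       # prints false
```

The nasty part is that the wrong lookup doesn't fail. It quietly gives you nil, and the problem shows up somewhere else later.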
Strings
Ok, the Array and Hash are good enough to get you going. There are more collection types and more things you can do with these collections, but strings are pretty important in programming. The string is your best tool for communicating with the user. On web forms, your values are all strings, and when you add content onto a page, they are strings. There are three ways of working with strings in Ruby, depending on how much power you need. We will introduce them from the simplest to the most complex.
Strings hold text
simple_string = 'single quotes and no processing'
The above string is rather simple. The single quotes mean to take the text AS IS, with no formatting and almost no escaped characters. The only exceptions are that you can write \' to put a single quote inside the string and \\ to put in a backslash. You’d be surprised by how much you can use this simple declaration. If we ever need to stick two strings together, we can use the plus sign (’+’).
hello = 'Hello'
world = 'World'
puts hello + ' ' + world # shows 'Hello World'
Things can get messy when you have to show information your program has back to the user. The ’+’ concatenation only works for strings, so each object that is not a string needs to call the to_s method to be concatenated. Oh, and then you still have the problem of formatting the text to be pretty. There is a more powerful way to declare and use strings, and that is to use double quotes:
number = 4
puts "Number:\t#{number}"
In the string we sent directly to the puts method, we see a bunch of stuff that might look confusing at first. The first funny looking characters would be the ”\t”. The backslash (’\’) is an escape character, so you can access more special characters with it. You can embed double quotes by adding a backslash in front, or in this case you can embed a tab character. Essentially, we told the string to embed some white space. The next set of funny looking characters would be ”#{number}”, but when you realize that the number inside the curly braces is the same name as the variable we defined before, it suddenly becomes a little less confusing. Basically anything between the ’#{’ and ‘}’ will be treated like regular ruby. You can even do math in there if you really wanted. It’s best to keep it simple and only use it to display the value of variables in your string.
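To put the two styles side by side, here is a short sketch. The with_plus and with_interpolation names are just ones I made up for the comparison.

```ruby
number = 4

# Concatenation needs to_s because '+' only joins strings.
with_plus = 'Number: ' + number.to_s

# Interpolation converts the value for us.
with_interpolation = "Number: #{number}"

puts with_plus == with_interpolation   # prints true

# A little math inside the braces works too.
puts "Double: #{number * 2}"           # prints Double: 8

# A backslash lets us embed double quotes.
puts "She said \"hola\""               # prints She said "hola"

# Single quotes leave all of it alone.
puts 'No processing: #{number}\t'      # prints No processing: #{number}\t
```

The last line is a good reminder that the special characters only work inside double quotes.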
That’s all well and good, but what if I have a lot of text, I only need to embed a little bit of Ruby variables, and I want to do all the spacing and carriage returns myself? There is a way to define a big string by providing your own “terminator”. The terminator is just so that Ruby can know when the string is done and it can go back to normal Ruby processing.
really_big = <<-END
This string will keep going and going
on and on until I use the terminator
to make it end.
END
puts really_big
You can embed variables just like before using the ’#{}’ construct, but there really isn’t a whole lot of need for escape characters and such. When you have a big message you want to give to the user, this is usually the best way to get it done. The important thing is that you choose the terminator so it will never be text in your block. The characters ‘<<-’ basically tell ruby to keep adding lines to the string until we reach X where X is the terminator. In my example, I chose the word END in all capital letters, but you could just as well use TERMINAR or BATTLE_FLAG or any other gibberish you want. The string will start on the following line and continue until it hits the terminator you set up. Just make sure it is clear what you are doing so that when you come back to the code later you aren’t thinking What in the world did I do?
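Here is a small sketch of embedding a variable in one of these big strings. The name and greeting variables are hypothetical.

```ruby
name = 'World'

greeting = <<-END
Hello, #{name}!
  This line keeps its two leading spaces.
END

puts greeting
```

The string holds exactly what sits between the opening line and the terminator, including the spacing and the line break at the end of each line. The terminator itself is not part of the string.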
Intro to Programming with Ruby
First of all, Ruby is a programming language that has become pretty popular with the advent of Ruby on Rails. That said, Ruby is a great first programming language to learn. I won’t bore you with its history, even though the language has been in existence for over a decade. For the summary about Ruby, check out http://www.ruby-lang.org/en/about/ and also look at the different tutorials. This tutorial is meant for people who don’t know squat about programming. If it’s confusing at all, let me know in the comments.
Before we delve into objects and things like that, let’s consider what happens when we have to use Ruby to help with other tasks. Those tasks can be deployment scripts, test support, etc. One of the most important things you will learn has nothing to do with actually making stuff work. It’s the comment. Basically you want to leave notes for yourself so that you can get back into the swing of things after you leave something alone for a while. To do that with Ruby, all you need is the ’#’ symbol. It marks the beginning of a comment, and the comment is over at the end of the line.
#
# A block of comments looks like this, with a '#' symbol
# at the beginning of every line. Comments are supposed
# to help you remember things later on.
#
Start with good commenting habits, and keep them up. Some things are pretty self explanatory so make sure you document the big things and not what every line is doing. Remember that these are clues you leave to yourself, or anyone else who will work with the code about what’s going on.
Now, Ruby is an object oriented language, but before I get into making objects let’s look at what they can do for you. In Ruby, everything is an object—your numbers, ranges, collections, etc. To help wrap your brain around objects, think of them as “things”.
Objects are things you can tell what to do.
Let’s say you need something to be done five times. The number five is an object, which means you can tell it to do something for you. Core Ruby Doc has all the standard objects that are part of the language, but it can be a little overwhelming at the beginning. First of all, the number five is an Integer (there are no decimal points so math people tell us that the correct name is an Integer). If you look up Integer in the Ruby docs, you’ll find a list of things that it can do. There are two that look pretty interesting to us, but for now we just want to do something five times. Here’s how we do it:
5.times do |num|
  puts 'Do something!'
end
Ok, what’s going on here? The number five is an object, and the ”.” symbol means “tell the object to do something”. The name after the ”.” is what we are telling it to do. So we are basically saying “Number 5, I want you to .times”. OK, so we are missing part of the picture. That part is the do block. Everything between the do and end is also being passed along with the message ”.times”. Let’s call the ”.times” message a method.
A method is something you can tell an object to do.
Continuing on, we have the do block followed by some pipe symbols ”|” and a name in the middle. This is how the number 5 passes something back into your do block so that you can use it. The name you give this something is called a variable. You can use it or ignore it, it’s up to you. We are ignoring it for now, but if we want to use it, we just have to use that name.
5.times do |count|
  # to_s turns the number into text so that '+' can join it to the string
  puts 'Do something on time ' + count.to_s
end
What you see will change based on the value of the variable “count” that the number 5 is giving us. The documentation tells us that the number will start at 0 and go to just under our number. So we will see five lines counting up from 0 to 4. A lot of programming languages do this, so it’s just something you have to get used to. Everything between the do and the end markers gets run each time. The important thing is that the variable you named inside the ”|” symbols is how you use that variable.
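If you would rather show your user a count that starts at 1, one option is to add one to the variable before printing it. In this sketch the seen list is just something I added so we can look at every value the number 5 handed us.

```ruby
seen = []

5.times do |count|
  seen << count   # remember the value we were given
  # to_s turns the number into text so '+' can join it to the string
  puts 'Do something on time ' + (count + 1).to_s
end

puts seen.inspect   # prints [0, 1, 2, 3, 4]
```

The lines printed inside the block count from 1 to 5, even though the values handed to the block still run from 0 to 4.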
OK, we are taking some baby steps here, but I’ll stop for today. First of all, we learned that things are called Objects and we tell Objects what to do by calling Methods (some people call them messages, but other languages use the word Method so we are just being consistent). We also learned that you could pass in a whole block into a method and have that object run that block for us. We learned that when we pass in a block to an object’s message, it can pass something back into our block so that we can use it. I want to point out that the whole “pass a block of code” thing is something that not every language can do. For instance at the time I am writing this, Java, C#, and C++ can’t do that.
Lastly, I want to start you thinking about something. The best code is self documenting, but it will never read like a book. As long as the details are clear, all you have to worry about in your comments is why you are doing something five times. With the last snippet of code above, it almost reads like English. We are saying “five times do something using ‘count’”. The word “puts” is actually a method on the text console object. Ruby makes some assumptions to make the code a little more readable.
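To see the assumption Ruby is making for you, here are the short form and the spelled out form side by side. In Ruby the console output object is available as $stdout.

```ruby
puts 'Hello'           # the short form used in this tutorial
$stdout.puts 'Hello'   # the same message sent to the console object by name
```

Both lines print the same thing. The short form just saves you from naming the console every time.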
