Intro to Functions

Posted by Berin Loritsch Fri, 22 Aug 2008 11:58:00 GMT

When you are just writing quick scripts, you can use Ruby all you want and be happy. However, there comes a point where you have to do the same thing in a bunch of places. Functions are a way to organize the logic in your code so that you can re-use it in more than one place. I’ll introduce how to do math at the beginning, but functions aren’t only for numbers as we will show later.

Doing Some Math

As long as you are working with numbers, you will have to remember some symbols. In your math textbooks you will see symbols that just don’t exist on keyboards and require different key combinations to make them show up. The good news is that the conventions for replacing mathematical symbols in code are pretty standard across languages. You only have to learn them once, which helps.

  • + addition
  • - subtraction
  • * multiplication
  • / division
  • % modulus
  • ** exponent (in Ruby, ^ is bitwise XOR, not a power)
  • () group expressions

Math expressions are performed in algebraic order. In short, that means that expressions are evaluated in the reverse order from what I listed. Parentheses first, exponents next, then multiplication, division and modulus, finally addition and subtraction. Just to make it clear, look at the following code:

puts 4 + 5 * 6
# 34

puts (4 + 5) * 6
# 54

puts 4 + (5 * 6)
# 34

It’s a good habit to use parentheses to make things clearer. There are a few more symbols that allow you to do bit manipulation, but then I would have to explain the math behind them. Let’s focus on this level of math for now. Let’s say we want to do a little geometry and calculate the area of a circle. The mathematical formula for the area of a circle is πr². So how do we get hold of the value of π? There is a Ruby module called Math that has the value of π and other more advanced functions.

OK, so what does the expression look like in Ruby?

radius = 5

puts Math::PI * (radius ** 2)
# 78.53981633974483

I added the parentheses to make it clearer that the exponent (raising to the power of two) comes first. So what if we wanted to reuse this calculation anywhere? We would have to create a function to do it. It’s pretty easy, and you will use the same construct in another post when we talk about creating our own methods. Let’s create our function:

def area radius
    Math::PI * (radius ** 2)
end

So what’s going on here? The word def is a Ruby keyword that tells Ruby that you are creating a function. After that comes the name of the function. Finally we have the list of parameters. A parameter is a name we give to a value that you pass to the function. Basically, the function is going to do something with that value, even though it doesn’t know in advance what the value is. The next line bears some explaining.

Functions can return a value, which is usually their whole point. However, we don’t see any words that say “return this”. It’s probably the most unintuitive thing you’ll run into with Ruby, but the value of the last expression in a function is the value that’s returned. It’s a carryover from Smalltalk, and once you know that, it makes a little more sense. If we had one more line that just had the number 2 on it, then the function would always return the number 2, which is wrong for what we want. What some people do to make things a bit clearer is to use the keyword return. That keyword is designed for letting you leave a method early in some cases, but it works just as well on the last line. It’s probably not a bad habit, as other languages require you to use it. The method would now look like this:

def area radius
    return Math::PI * (radius ** 2)
end

The keyword end is something we saw already when we were doing loops in the last lesson. This keyword is used to end any block, so you will use it a lot.
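To see the return rules in action, here is a small sketch. The three method names are made up for this comparison. The first two behave identically, while the third shows how a stray last line changes what comes back.

```ruby
# Implicit return: the value of the last expression is the result.
def area_implicit(radius)
    Math::PI * (radius ** 2)
end

# Explicit return: same result, spelled out.
def area_explicit(radius)
    return Math::PI * (radius ** 2)
end

# A stray last line changes the result: this always returns 2.
def area_broken(radius)
    Math::PI * (radius ** 2)
    2
end

puts area_implicit(5) == area_explicit(5)  # true
puts area_broken(5)                        # 2
```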

Not All Functions Are for Math

I introduced functions with math because that’s where the idea came from. But most problems don’t require the use of heavy math. Ruby isn’t designed to be a math engine anyway. Just for fun, let’s create a function that will turn a number into words, Japanese words to be exact. It’s only fitting, as Ruby came from Japan after all. Just to save us some work, we’ll limit ourselves to the range from 0 to 99. To do that we need to use an if statement. The if statement lets us do something if a condition is true, but skips the code inside if it is not true. We also want to raise an issue so that the calling code knows that it asked for something we can’t deliver. The keyword is raise, which is rather convenient. You can raise an exception class or an exception object, but we will just use a string, which Ruby wraps in a RuntimeError for us. The code looks like this:

if not (0..99).include? number
    raise "We can only translate numbers between 0 and 99" 
end
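Here is a sketch of what the calling code sees when the check fails. The check_range helper is a made-up name for this illustration, and it writes the same test with unless instead of if not. Raising a plain string produces a RuntimeError, which the caller can rescue.

```ruby
# Hypothetical helper wrapping the same range check, written with unless.
def check_range(number)
    unless (0..99).include?(number)
        raise "We can only translate numbers between 0 and 99"
    end
    number
end

begin
    check_range(150)
rescue RuntimeError => error
    # raise with a string creates a RuntimeError carrying that message
    puts "Caught: #{error.message}"
end
```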

I’ll include the solution below, and just expound on things in comments. Your job is to expand the method to do up to 999, or to change it to another language. I’m using Japanese partly because it’s easy to do with code. Other languages have more exceptions.

def to_japanese(number)
    #
    # Protect our method from trying to work on numbers
    # it doesn't support
    #
    if not (0..99).include? number
        raise  "We can only translate numbers between 0 and 99" 
    end

    #
    # Keep it simple, use the variations of four and
    # nine that work in the tens column as well as
    # the ones column.  These are the numbers from
    # zero to nine.
    # 
    numbers = ['rei', 'ichi', 'ni', 'san', 'yon', 'go',
               'roku', 'nana', 'hachi', 'kyu']

    #
    # Modulus gives the remainder.
    # 12 divided by 10 is 1 with a remainder of 2.
    # It's a good way to get just the ones column.
    # Then we use regular division to get just the tens column
    #
    ones = number % 10
    tens = number / 10

    case tens
        # When we are doing 10 - 19
        when 1
            japanese = (0 == ones) ? 'ju' : 'ju ' + numbers[ones]

        # When we are doing 20 - 99
        when 2..9
            japanese = numbers[tens] + ' ju'

            if (ones > 0)
                japanese = [japanese, numbers[ones]].join(' ')
            end

        # Otherwise we are doing 0-9
        else
            japanese = numbers[ones]
    end

    return japanese
end

So there are a couple things I need to explain above. First is the case, when, else construct. The case statement tells Ruby that we are going to use the following expression (in this example the expression is a variable) with a bunch of comparisons. It’s a little nicer than doing a bunch of if/else statements. The first match is what gets run. Each case that we are checking is marked with the when statement. To translate it to English, it’s like saying “when tens is 1 do this”, “when tens is in the range 2..9 do that”, “otherwise do this”.
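Here is a stripped-down sketch of the same construct. The column_kind method and its labels are invented for this example. Each when is tried in order, a range such as 2..9 matches any value inside it, and the case as a whole produces a value that the method returns.

```ruby
# Hypothetical example: case compares tens against each when, in order.
def column_kind(tens)
    case tens
        when 1
            'teens'
        when 2..9
            'twenty and up'
        else
            'single digit'
    end
end

puts column_kind(1)  # teens
puts column_kind(4)  # twenty and up
puts column_kind(0)  # single digit
```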

The next thing I have to explain is the (something) ? true : false construct. It’s a shorthand for an if/else statement. Essentially, we are saying that if the ones column is 0, the result is ‘ju’; otherwise it is ‘ju ’ plus the translation of the ones column. Have fun!
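To make the shorthand concrete, here is a sketch that writes the teens rule both ways. The two helper names are made up, and each takes the word for the ones digit as a second argument so the example stands alone.

```ruby
# Hypothetical helper: the ternary form, as used in to_japanese.
def teen_ternary(ones, word)
    (0 == ones) ? 'ju' : 'ju ' + word
end

# Hypothetical helper: the same logic as a full if/else.
def teen_if_else(ones, word)
    if 0 == ones
        'ju'
    else
        'ju ' + word
    end
end

puts teen_ternary(0, 'rei')  # ju
puts teen_ternary(2, 'ni')   # ju ni
puts teen_if_else(2, 'ni')   # ju ni
```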

What if Programming Languages Followed the Social Paradigm?

Posted by Berin Loritsch Mon, 28 Jan 2008 13:54:00 GMT

Sometimes, all it takes is a subtle shift in your viewpoint to open your eyes to new possibilities. The big problem with many existing programming languages is that they don’t always lend themselves to natural parallelization. Yes, that includes Java, C++, C#, and Ruby. It’s not impossible with those languages, it just doesn’t come for free. The reason it’s a big deal nowadays is that multi-core chips are here to stay. Architectures like the PS3’s Cell architecture are likely to become the norm. The result is like fitting a round peg into a square hole. It’s not impossible, it just requires a lot of work.

So how do social paradigms work to allow an expressive and powerful language with a natural ability to be parallelized? I’m not fully sure, but it might work for the expressive and powerful part. Think about the concept of tagging. A tag is just metadata, and what that metadata means is up to us to decide. Of course, as soon as I use the word metadata, I’m sure I lose a part of my audience. I know my ears turned off the first time I heard that word. It didn’t mean anything to me, and it was an intangible concept that didn’t hold any value. Until, that is, we introduced the concept of tagging.

Metadata is the adjectives your language uses. It’s how you tell a fast car from a slow car.

So if objects are actors, and methods are verbs, how do I make these adjectives work for me? Who assigns these adjectives, and when can they be assigned? Here’s where the social paradigm comes in. Anybody (the developer, the actors or classes, the environment) can assign these adjectives to any other actor or thing in the system. So what does that buy me? The purpose of tags is to find things again. What if we do something special with the tags? If a piece of code tags another piece of code, it’s because it wants to do something with it later. In fact the system can use that same mechanism.

For example, what if the language could tell dynamically whether the flyweight pattern was more applicable for you, and you didn’t have to do a thing? If the runtime environment can determine how long it takes to create an instance of an object, it can tag the class as “Fast”, “Slow”, or “Average”. With that information, the environment can determine whether it is worth it to keep creating new instances of the object or switch context with the same memory-resident object. Alternatively, it might decide to turn a reference to that object into a Future or asynchronous object. Sure, you can send messages to the object, and expect the messages to be answered in the order you need them, but the application doesn’t have to stop in its tracks while you are waiting on an answer from a remote source.
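As a rough sketch of the tagging step only, the idea might look like this in plain Ruby. The thresholds and method names are assumptions made up for this illustration, and a real runtime would measure far more carefully than a single timed call to new.

```ruby
# Hypothetical thresholds, in seconds; a real runtime would tune these.
FAST_LIMIT = 0.001
SLOW_LIMIT = 0.1

# Map a measured creation time to one of the three tags.
def speed_tag(seconds)
    if seconds < FAST_LIMIT
        'Fast'
    elsif seconds > SLOW_LIMIT
        'Slow'
    else
        'Average'
    end
end

# Time a single instantiation of a class using the monotonic clock.
def creation_time(klass)
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    klass.new
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

puts speed_tag(0.00001)                # Fast
puts speed_tag(0.01)                   # Average
puts speed_tag(2.0)                    # Slow
puts speed_tag(creation_time(Object))  # tag depends on the machine
```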

OK, so now that we’ve seen something potentially useful, what about the powerful part? Sure, it’s pretty cool to use asynchronous calls without declaring that you want something to be asynchronous, but what about other useful things like being able to perform the same function on all objects that were tagged specially? You know, kind of like telling all the stealth enemies to come out of hiding when they’ve been located? Rules engines work this way. You tell the rules to monitor all the objects with certain facts and do something when they match. Oh, isn’t this Functional Programming? Why, how astute of you. Then shouldn’t we use Scala? Scala requires you to be too explicit, and I can’t see any examples of it actually making life easier or code easier to understand.
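The stealth enemies idea can be approximated in today’s Ruby with an ordinary object. This is only a sketch: TagRegistry, Enemy, and the :stealth tag are names invented here, not features of any existing language or library.

```ruby
# Hypothetical registry mapping a tag to the objects carrying it.
class TagRegistry
    def initialize
        @tagged = Hash.new { |hash, tag| hash[tag] = [] }
    end

    # Anyone (developer, another object, the environment) may tag anything.
    def tag(object, name)
        @tagged[name] << object
    end

    # Run the block against every object carrying the tag.
    def each_tagged(name, &block)
        @tagged.fetch(name, []).each(&block)
    end
end

Enemy = Struct.new(:name, :hidden)

registry = TagRegistry.new
ninja = Enemy.new('ninja', true)
guard = Enemy.new('guard', true)
registry.tag(ninja, :stealth)

# Tell every enemy tagged :stealth to come out of hiding.
registry.each_tagged(:stealth) { |enemy| enemy.hidden = false }

puts ninja.hidden  # false
puts guard.hidden  # true (never tagged, so untouched)
```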

One of my frustrations with the Java Virtual Machine is its security model. In most cases, it is too inflexible and difficult to use, so the application just runs unprotected, relying on the underlying operating system to enforce any security constraints. It might work for Unix-based machines, but Windows machines are usually not protected as well by default. Also, if the application is run as the super user, it can cause some serious problems. Sometimes you want to be able to set up a sandbox, set some attributes for it, and run things inside of that. Kind of like setting up a virtual world for a set of components, or plugins. You can allow that plugin to access only the things you want it to access, and nothing more. What’s better to decide this than code you already trust? You can set up a separate work directory, and have the plugin use it as if it were the default system work directory (or temp directory). The code that set up the world for the plugin can judge whether the plugin is behaving nicely by how often it bumps into the security constraints.

The concept of the sandboxed little worlds fits well into the “Groups” concept that is present in social applications. Everyone that is part of a group has a common goal and function. Of course, the same actors can be members of several groups, and they have to obey the rules of whichever group they are in. It’s the same exact instance, it’s just that the context that it is working in is different.

There are some more possibilities, but this is enough to chew on for now. Of course, many of these things can be done without the need for a new programming language. It’s just that a new syntax would make it easier to work with and hide the implementation details. The resulting language shouldn’t be strictly object oriented, or functional. It should be designed so that the language can optimize at runtime based on the resources being used. It should also be designed to be readable, understandable, and predictable.

How to Approach a New Language

Posted by Berin Loritsch Wed, 05 Sep 2007 11:54:00 GMT

Whether you are new to a spoken language or a computer language, the principles are similar. There is so much to learn and it can seem so foreign to you that you can easily get overwhelmed. You can always start with survival phrases: little conversational Swiss Army knives that can get you a long way. They are really designed to get you to a point where you can find someone who speaks your language to finish the conversation, so it’s not like you are going to understand what many people are saying back to you most of the time. There are three major parts to understanding a language: the vocabulary, the grammar, and the writing system.

Programming languages are usually easier to learn than spoken languages, primarily because the vocabulary is intentionally kept small. The grammar is also kept consistent and simple to keep the parsers sane and predictability high. The writing system usually entails what can be typed from a keyboard and a few rules for mathematical symbols. There is some punctuation that you have to worry about, but not that much. So what makes a language so difficult? In my short experience, it has to do with translating the simple rules into something useful. You need to learn the libraries that come with the language to help you get something done with the operating system. A more subtle problem, akin to dialects in spoken language, is finding the standard idioms for doing things.

Spoken languages are usually tougher to learn because of the volume of the vocabulary and the different grammar rules you have to learn. Sure, there is a basic grammar that is consistent throughout a language, but there are always exceptions you have to learn. After all that, you’ll invariably get stumped by some phrases and slang. For example, if you translate the Arabic phrase for “How are you?” literally into English you would get “What color are you?” For someone who has grown up using the language it’s as natural as breathing, but for someone else it’s not that intuitive. How should you take it if someone wishes you to “be enlarged with fatness”? Should you be insulted or flattered? In many languages and cultures it’s a compliment.

Even though the two types of languages have very different challenges and end goals, there are a few strategies to help you in the process. Surprisingly, these strategies are the same for both endeavors. As someone who has learned Spanish and Classical Greek in a classroom environment, and everything else by self-study and pestering people, I can say that the classroom only gets you so far. This is what I’ve found to be helpful to me:

  1. Use it. What good will a language do you if you don’t use it for something? Even if you don’t know a lot, use what you know. You’ll find out more about what you need to know by stealing Nike’s advertising slogan and Just Do It.
  2. Develop a system or schedule for how you are going to expand your vocabulary. You can only take information in so fast before your brain overloads. It needs time to process what you’ve learned so far.
  3. Review constantly. You’re making mistakes, you just don’t know it yet. Go back over what you’ve done in the past in light of what you know now.
  4. Immerse yourself in the language. Do what you need to do to see and hear the language used properly. Watch shows, listen to podcasts, get involved in an open source project, read books, whatever you can do, do it. You’ll eventually find a good support group that will help you with the difficult stuff, and find new friends in the process.
  5. Learn the slang. Textbooks and classrooms teach you the “correct” way to do things, from an academic standpoint. That might be all well and good for some situations, but the real world is different than the classroom.
  6. Don’t strive for mastery. Strive to be better than you are. Mastery comes with a large amount of personal investment in the process. You’ll get overwhelmed if you try to master what you are studying. Just try to make incremental improvements and you’ll be more productive and happy with the progress.

Ok, so considering everything, how can I make these general guidelines? After all, what have I taken the time to learn, and how good am I with it? Even I’m surprised when I look at the list:

  • English—fluent, my natural language, 14 years of classroom (grade school and college) plus being competent in the idioms and slang.
  • Spanish—somewhat conversant, my second language, 2 years of classroom and a few years of talking to Spanish-speaking people. I can speak, read, and write, but I’m still really slow at listening. My vocabulary has waned a bit from lack of use.
  • Classic Greek—1 year of classroom, some self study using the Bible and study helps. It’s no longer a spoken language, but I can read and write the characters and I know where to look for answers that I need.
  • Japanese—I just started learning this language mostly out of curiosity. I’ve got some cultural ties to Japan both from my wife’s family and from mine. I’ve got martial arts, culinary, and cultural interests. Imagine my delight to find JapanesePod101.com to aid in my endeavors here.
  • There are some languages where I just picked up a smattering of phrases, words, etc. Hardly useful for more than breaking the ice: Arabic, Punjabi, French, German, Russian, Swahili, Finnish.
  • BASIC—My first computer was a Commodore 64, which came with BASIC and some other language options. I learned BASIC well enough to work on some toy programs. I also learned the IBM and TRS-80 variants.
  • LOGO—I didn’t do more than turtle graphics with this one, but I’m dating myself aren’t I?
  • 6502 Assembly Language—BASIC was too slow, so I figured out how to make things happen at a lower level. Graphics were the same either way, so I found Assembly much more powerful and even expressive than the limited BASIC that was native to the Commodore machine. BTW geoProgrammer was excellent (more later).
  • GameMaker—Game making infrastructure, including sprite editors and rule editors.
  • COMAL—I had a class at school with this one. I never used it for anything more than the classroom. However, I did learn some cool tricks to use the modulus operator to handle certain corner cases with leap year handling.
  • C++—I skipped C because I believed that C++ had more of a future. By this time I had left my C64 behind, and I was working with PCs. I first learned with GCC, and then with Microsoft Visual C++. When I started, neither was standards compliant (although I don’t think any compilers were). One of my first personal projects involved CORBA. I’m very well conversant in this language, and if I need to I can get right back into it.
  • ColdFusion—I try to block this experience from my mind, but it did drive home the need for a good MVC architecture.
  • Java—I started by developing a data migration tool, which was actually well constructed even for a first project. I then got into Cocoon and Avalon as an answer to the ColdFusion fiasco. I’m fluent and still using this today.
  • D—The first of the languages I explored for sheer curiosity. It touted itself as a successor to C++ , having enough influence from Java to correct some memory handling snafus and binary compatibility across machines, yet enough C++ to be able to use local libraries natively. Obviously, binary compatibility has to do with how the methods are bound together, which is something that was left undefined in C++. That means it doesn’t matter which compiler is used or what machine the binary was compiled on, it will work if all the supporting libraries are present.
  • C#—While I was never really sold on anything that is born from Microsoft, ignoring it completely is more of a mistake. I wanted to find out where the utility began and the hype ended. Sadly, it was nothing more than a language very similar to Java. Sure, it had some nice conveniences which were adopted fairly quickly in Java, so as an evolution it had some value. Nevertheless, I can’t entirely trust a culture which does nothing more than parrot one voice. Java at least has a diverse world of opinions which provides a healthy base to gain experience from. It’s probably my firm cultural bias from C++ and Java aesthetics that makes me think that C# code looks ugly. C++ had some tools that would extract specially formatted comments into docs, a system that Java took wholesale and extended into the JavaDoc system. C# ignored the precedent and decided to do things differently from Java just to be different.
  • Lisp/JESS—I had to introduce myself for a project where we were using the Java Expert System Shell for a part of it. It’s based on a dialect of Lisp, so it took a lot of getting used to for me. I barely developed a working understanding of the tool.
  • Smalltalk—The second of the languages I explored for sheer curiosity. I wanted to get back to the roots of object oriented programming, and see how these guys did things. It had a profound influence on me, even though I don’t use it everyday. I highly recommend introducing yourself to the language even if you have no intentions of using it.
  • Ruby—I’ve heard so much about how nice this language is to use, and I have to say it is a sheer pleasure to use once you’ve caught on to the Ruby culture. Remember how one of the keys to learning a language is to immerse yourself in it? You’ll be rewarded. Of course, now I love Ruby on Rails, which has greatly influenced how I think web applications should be developed.
  • Perl—I can’t say I’m fluent, but I am conversant enough. I was hired to migrate an application to a more modern architecture (with newer versions of Perl and DBI libraries).

That’s a lot of stuff. I’ve got all these languages, both spoken and programming, rummaging around my head with different levels of proficiency. However, by the time you’ve learned your third language it becomes easier. You start to see similarities between them, and you start thinking about them abstractly. Those notions help you learn the new languages more quickly. When you realize that Greek adjectives are positioned grammatically like Spanish, and that Japanese sentence structure is somewhat like Greek (with the verb at the end) you’re building on concepts you’ve already learned.

I’m not finished yet. There will be new languages I take an interest in in the future. They’ll likely be programming languages, but I’m not ruling out learning another spoken language, or even expanding my knowledge in one of the ones I barely know. Constant learning keeps you sharp, and the good thing is that you can do it a little bit at a time. I don’t have the time, spare money, or the inclination to sit down in a classroom these days. However, if I can fit a little study in here and there, I’m happy. I’ve learned that there are more similarities than differences in all these languages and cultures.

Tag Suggestion from Content

Posted by Berin Loritsch Tue, 28 Aug 2007 12:59:00 GMT

I am researching which companies have technologies that can suggest tags from the content of posts. For example, if I post a blog entry, the technology would automatically tag my content with appropriate tags. The way most link tagging sites like del.icio.us and ma.gnolia.com perform this task is by taking tags that you and other people have used to suggest something. That’s great when you have several different people all tagging and marking links differently. That’s not so great when there is a central content source like Flickr or a blog like this one. There’s one copy of the article, one copy of a picture, etc. There just isn’t a wider pool of tags to suggest from. The only way then is to analyze the content, and see how other similar content has been marked.

The academic approach involves natural language processing, storing contextual models of both the tag space and the content. You can get pretty accurate with that kind of approach, and even discover when there are tags that are misspelled but mean the same thing, etc. I’ve done some searching around and found Language Computer, which is the research arm of Lymba. I also found a paper from TagAssist about the topic.

No matter how you slice it, this approach is going to take some number crunching and disk space. That means multiple machines to process the content on the way in. For very low volume submission sites like my blog, it might be possible to do everything on one machine. For higher volume submission sites like the one I’m working on, that’s a real problem to work through.

The question I have, and I haven’t been able to find much on the subject, is whether there are low-tech solutions that will get us 50 percent of the way there for a little investment. We may have to do this “cool” integration at a later stage, depending on the costs involved. I need to find a set of alternatives and choose what will be the best match, but this is a relatively new application for this type of technology. If anyone has some clues, please let me know.
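To give a sense of what low-tech might mean here, this is one naive possibility, offered as a sketch rather than something I have evaluated: count the words in a post, ignore common filler words, and suggest the most frequent ones. The stop-word list, the method name, and the sample text are all made up for the illustration.

```ruby
# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = %w[the a an and or of to in is it that for on with as this are be]

# Suggest the most frequent non-trivial words in the text as tags.
def suggest_tags(text, limit = 3)
    words = text.downcase.scan(/[a-z]+/)
    words = words.reject { |word| word.length < 3 || STOP_WORDS.include?(word) }
    counts = Hash.new(0)
    words.each { |word| counts[word] += 1 }
    # Sort by descending count, then alphabetically so ties are stable.
    counts.sort_by { |word, count| [-count, word] }.first(limit).map(&:first)
end

post = "Ruby functions organize Ruby code. Functions reuse Ruby logic."
puts suggest_tags(post, 2).inspect  # ["ruby", "functions"]
```

An approach like this knows nothing about synonyms, misspellings, or context, which is exactly the ground the natural language processing techniques are meant to cover.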