2009-12-29

Object-Oriented Thinking, Part One

For me, starting to learn Object-Oriented Programming (OOP) was quite the pain. It wasn't what I'd grown accustomed to over the past 15 years, and I was learning C++ at the same time from a prof who was also learning these things at the same time. (Not that there's anything wrong with that—profs do it all the time—but sometimes it's not the best learning environment.)

One of the consequences of this was that we were all caught up in the minutiae of C++, and kinda skimped on the big OOP picture.

So here it is: "OOP is good if you have a bunch of things that have attributes and do stuff." OOP wraps up the things, their attributes, and the stuff they do in a nice way that can make the code clean and clear. Things like GUIs (which have a bunch of things like windows and buttons, and the windows and buttons have attributes like their locations on the screen, and they can do stuff like get clicked or move) are natural fits to OOP.

Three Antelope instances. The one on the left is calling its move() method, and the one in the middle is calling its eat() method.

(Photo by Mr Raja Purohit)

In OOP, you have particular "classes" of things. For example, "Baxter", "Jane", and "Sue" could all be antelopes (of class "Antelope"). In that case, each of them is called an "instance" of the "Antelope" class, or, more casually, "they are Antelope instances".

So the class definition, in a way, is the master description of what data the instances will have and what stuff the instances will be able to do. In general, the class doesn't really do anything on its own—you make new instances based off it. (The preceding sentence, like most everything in this post, has exceptions, but you can worry about those later.)

Do antelopes have attributes? Yup! In this example, they have a "stomach fullness" and a "location" (described as an \(x\) and \(y\) coordinate). The class would describe how these attributes were stored.

Do antelopes do stuff? You bet! They can "eat a vegetable" and "walk a direction". The class would describe what these actions did. These actions are often called methods in OOP parlance, and consist of code that does stuff—other programming paradigms keep their code in "procedures", "subroutines", or "functions", but OOP has "methods".

In many OOP languages, the code for defining the class is in a different place than the code that makes the instances. For the moment, let's take the code for the Antelope class definition, here written in Java:

class Antelope {
    public String name;
    private int stomachFullness;
    private int xLocation;
    private int yLocation;

    /**
     * Constructor--make a new Antelope instance with a given name
     */
    public Antelope(String name) {
        this.name = name;
        this.stomachFullness = 0;
        this.xLocation = 0;
        this.yLocation = 0;
    }

    /**
     * Eat some food
     */
    public void eat(int vegetableSize) {
        this.stomachFullness += vegetableSize;

        System.out.println(this.name + " now has fullness " +
            this.stomachFullness);
    }

    /**
     * Move a direction
     * 0=north, 1=south, 2=west, 3=east
     */
    public void move(int direction) {
        switch (direction) {
            case 0: this.yLocation--; break; // north
            case 1: this.yLocation++; break; // south
            case 2: this.xLocation--; break; // west
            case 3: this.xLocation++; break; // east
        }

        System.out.println(this.name + " has moved to (" +
            this.xLocation + "," + this.yLocation + ")");
    }
}

(Ignore the keywords public and private for the moment.)

Notice that the code on its own isn't actually doing anything; it merely describes what attributes an Antelope has and what its actions (its "methods") do.

Before we can actually do anything with Antelopes, we need to make instances of them (doing this is known as "instantiating Antelopes".) You can't do anything without any Antelope instances. The good news is you can make as many instances of Antelopes as you want! Feel the power!

Another thing to see in the above code, if you're not familiar with OOP, is the this keyword. It's a little bit funky to think about, but this refers to "this instance of the Antelope class"—it only makes sense in the context of an instance. So if we have three instances of Antelope, baxter, jane, and sue, when jane refers to "this.xLocation", jane is referring to her "xLocation" value. baxter and sue have their own xLocations. ("This is my xLocation. There are many others like it, but this one is mine.")
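Here's a minimal sketch of that idea. SimpleAntelope is a trimmed-down, hypothetical cousin of the Antelope class, and moveEast() and describe() are helpers I made up for the demo; they are not part of the class above. The point is just that each instance's this.xLocation is its own.

```java
// A trimmed-down, hypothetical cousin of the Antelope class above.
class SimpleAntelope {
    public String name;
    private int xLocation;

    public SimpleAntelope(String name) {
        this.name = name;
        this.xLocation = 0;
    }

    // "this" is whichever instance the method was called on.
    public void moveEast() {
        this.xLocation++;
    }

    // Hypothetical helper (not in the original class) so the demo can
    // report the private value.
    public String describe() {
        return this.name + " is at x=" + this.xLocation;
    }
}

class ThisDemo {
    public static void main(String[] args) {
        SimpleAntelope jane = new SimpleAntelope("Jane");
        SimpleAntelope sue = new SimpleAntelope("Sue");

        jane.moveEast(); // changes jane's xLocation only
        jane.moveEast();
        sue.moveEast();  // changes sue's xLocation only

        System.out.println(jane.describe()); // Jane is at x=2
        System.out.println(sue.describe());  // Sue is at x=1
    }
}
```

Same method, same code, but jane and sue end up in different places because this pointed at a different instance each time.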

Some languages, like Python and Objective-C, call "this instance of the class" "self" instead of "this". Further complicating matters, in some languages like Objective-C, C++, and Java, this is optional: when you name an instance variable, the compiler will automatically assume it, as long as no local variable or parameter of the same name is in scope. Personally I always explicitly write this so other engineers are clear that I'm referring to the object's instance variables. (Objective-C takes this even further: writing "self.foo" and "foo" can have different effects even if they both refer to the same instance variable! But this is way more than I want to talk about here.)
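To make the "optional this" business concrete in Java, here's a small sketch. The ShadowAntelope class and all of its methods are hypothetical, invented just to show what happens when a parameter has the same name as an instance variable.

```java
// Hypothetical class showing when "this" is optional in Java and when it isn't.
class ShadowAntelope {
    private int xLocation = 0;

    public void stepEast() {
        // No local variable or parameter is named xLocation here, so the
        // compiler treats this as this.xLocation++.
        xLocation++;
    }

    public void setX(int xLocation) {
        // The parameter hides the instance variable, so "this." is required
        // to reach the instance variable.
        this.xLocation = xLocation;
    }

    public void brokenSetX(int xLocation) {
        // Both names refer to the parameter: it is assigned to itself and
        // the instance variable never changes.
        xLocation = xLocation;
    }

    public int getX() {
        return this.xLocation;
    }
}

class ShadowDemo {
    public static void main(String[] args) {
        ShadowAntelope a = new ShadowAntelope();

        a.stepEast();
        System.out.println(a.getX()); // 1

        a.setX(5);
        System.out.println(a.getX()); // 5

        a.brokenSetX(9);
        System.out.println(a.getX()); // still 5
    }
}
```

The brokenSetX() version compiles without complaint, which is a decent argument for writing this everywhere.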

Now, you probably notice that I snuck some code in the definition of the Antelope class that I hadn't previously mentioned: the Constructor. The constructor is the code that executes on a new instance of a class when that instance is first brought to life ("constructed", or "instantiated".) It's very common to put initialization code in there. Also in this case, I allow the constructor to be passed a parameter: the Antelope's name. This name is used later in the eat() and move() methods so that when the program runs, the user will be able to tell which Antelope instance is doing what. The constructor dutifully stores a copy of the name reference in the instance so it can use it later.

In Java, the constructor looks like a method with the same name as the class. Other languages might be similar (like C++) or completely different (like Python, where the constructor is always named __init__().)

In any case, it's time to fabricate some Antelopes! We have now defined the "Antelope" class, and it's time to make some instances and use them in the main program. Then we'll have them move and eat!

class Main {
    public static void main(String[] args) {
        Antelope baxter = new Antelope("Baxter");
        Antelope jane = new Antelope("Jane");
        Antelope sue = new Antelope("Sue");

        baxter.move(0); // 0 means north
        jane.move(2);   // 2 means west
        sue.eat(10);    // eat a big veggie of size 10

        System.out.println("jane's name is: " + jane.name);
    }
}

At the top of main(), you see we construct three Antelopes right away with the new keyword. (new tells Java to instantiate a new instance of a class.) This causes the Antelopes to be allocated and their constructors called—you see we pass the constructor a human-readable version of their names, nicely capitalized.

Following this, baxter is told to move north, jane is told to move west, and sue is told to eat a veggie of size 10. When these methods are called, each instance will print the result (as you see in the Antelope's move() and eat() code.) More jargon: "calling an object's method" is sometimes referred to as "passing a message to an object" in OOP contexts.

Finally, the main routine prints out the value of jane's name instance variable.

Here's the output:

Baxter has moved to (0,-1)
Jane has moved to (-1,0)
Sue now has fullness 10
jane's name is: Jane

Now, you remember those "private" and "public" keywords I told you to ignore? To revisit those, they tell you which variables and methods are visible outside the class. In the case of the Antelope class, anyone can access the name instance variable because it's public. (Indeed, the main code does this when it prints out jane's name.) However, stomachFullness is private, and can't be accessed from anywhere except inside the Antelope class. (If main were to try to print it like it prints the name, it would be a compile-time error.)
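Here's a rough sketch of what that looks like from the outside. GuardedAntelope is a hypothetical cut-down version of the class, and getStomachFullness() is an accessor I've added for illustration; the original Antelope class has no such method.

```java
// Hypothetical cut-down Antelope: one public field, one private field.
class GuardedAntelope {
    public String name;
    private int stomachFullness;

    public GuardedAntelope(String name) {
        this.name = name;
        this.stomachFullness = 0;
    }

    public void eat(int vegetableSize) {
        this.stomachFullness += vegetableSize;
    }

    // Hypothetical read-only accessor: code inside the class is allowed
    // to touch the private field and hand back its value.
    public int getStomachFullness() {
        return this.stomachFullness;
    }
}

class AccessDemo {
    public static void main(String[] args) {
        GuardedAntelope sue = new GuardedAntelope("Sue");
        sue.eat(10);

        System.out.println(sue.name); // fine: name is public

        // Uncommenting the next line would be a compile-time error,
        // because stomachFullness is private to GuardedAntelope:
        // System.out.println(sue.stomachFullness);

        System.out.println(sue.getStomachFullness()); // 10
    }
}
```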

The reason for making class members private is to hide the implementation details of the Antelope from the outside world in order to increase the ease-of-use of the Antelope, as well as increasing the reliability and maintainability of the code. As long as no other code in the entire Universe refers to the private instance variable stomachFullness it can be changed in (or even removed from) the Antelope class later without a problem—declaring it to be private ensures no one else can use it. Likewise, declaring something to be "public" allows people to use it, and you won't be able to change it later without also overhauling all the code that uses it. This is why it's very important to be sure about the solid, clean, good design of your public class elements before sharing them with the world.
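As a sketch of why that matters, imagine two versions of the same class. Both classes below are hypothetical, as is the getFullness() method. Version 2 throws away the private int and remembers each meal instead, but the public methods look identical, so calling code doesn't have to change.

```java
// Version 1 (hypothetical): fullness is a single private int.
class AntelopeV1 {
    private int stomachFullness = 0;

    public void eat(int vegetableSize) {
        this.stomachFullness += vegetableSize;
    }

    public int getFullness() {
        return this.stomachFullness;
    }
}

// Version 2 (hypothetical): the private int is gone, replaced by a list of
// meals. The public methods have the same names, parameters, and results.
class AntelopeV2 {
    // Fully qualified names are used only to keep the sketch import-free.
    private final java.util.List<Integer> meals = new java.util.ArrayList<>();

    public void eat(int vegetableSize) {
        this.meals.add(vegetableSize);
    }

    public int getFullness() {
        int total = 0;
        for (int size : this.meals) {
            total += size;
        }
        return total;
    }
}

class VersionDemo {
    public static void main(String[] args) {
        AntelopeV1 oldSue = new AntelopeV1();
        oldSue.eat(10);
        oldSue.eat(3);
        System.out.println(oldSue.getFullness()); // 13

        // Same calls, same answer, completely different private innards.
        AntelopeV2 newSue = new AntelopeV2();
        newSue.eat(10);
        newSue.eat(3);
        System.out.println(newSue.getFullness()); // 13
    }
}
```

If stomachFullness had been public, every piece of outside code that touched it would have broken on the way from version 1 to version 2.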

Finally, here's a quick set of guidelines for coming up with your OO design for a particular problem. First, just describe the problem in plain English (or the language of your choice). Describe what the types of entities in the problem are, what attributes they possess, what actions they can take, and what those actions need to work. ("Well, we'll have some antelopes and they'll have a location, and they'll be able to move and eat food, so I guess they'll need a stomach fullness, then. And when they move, they'll need a direction to go, and when they eat they'll need to be told the amount.") Then a class can be written for each type of entity, where the attributes are instance variables in the class, and the actions are its methods, and what the actions need to work are passed as parameters to those methods. Lastly, you instantiate instances of the objects you need, and call their methods!
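To try those guidelines on something other than antelopes, take the GUI button mentioned earlier. In plain English: "We'll have some buttons, they'll have a location on the screen, and they can move and get clicked. When they move, they'll need to be told how far to go." Here's one way that might turn into a class. Everything here (the ScreenButton name, the click counter, describe()) is made up for illustration and has nothing to do with any real GUI toolkit.

```java
// Hypothetical class derived from the plain-English description:
// attributes become instance variables, actions become methods, and
// what an action needs becomes its parameters.
class ScreenButton {
    private int x;
    private int y;
    private int clickCount; // assumption: we just count clicks

    public ScreenButton(int x, int y) {
        this.x = x;
        this.y = y;
        this.clickCount = 0;
    }

    // Moving needs to know how far to go in each direction.
    public void move(int dx, int dy) {
        this.x += dx;
        this.y += dy;
    }

    // Getting clicked needs nothing extra, so no parameters.
    public void click() {
        this.clickCount++;
    }

    public String describe() {
        return "button at (" + this.x + "," + this.y + "), clicked "
            + this.clickCount + " time(s)";
    }
}

class DesignDemo {
    public static void main(String[] args) {
        // Last step of the guidelines: instantiate and call methods.
        ScreenButton ok = new ScreenButton(10, 20);
        ok.move(5, -5);
        ok.click();
        ok.click();

        System.out.println(ok.describe()); // button at (15,15), clicked 2 time(s)
    }
}
```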

That's everything there is to know about Object-Oriented Programming! Of course, that's a lie—this is only the first step out of about a zillion steps, but it's a very important big first step that's almost as big as all the other steps combined.

(This is continued in Object-Oriented Thinking Part Two.)

Beej's Bit Bucket

⚡ Tech and Programming Fun
