Some Words on Object Orientation | The Ruby Way, Second Edition: Solutions and Techniques in Ruby Programming (2nd Edition)

The Ruby Way, by Hal Fulton

Before talking about Ruby specifically, it is a good idea to talk about object-oriented programming in the abstract. These first few pages review those concepts with only cursory references to Ruby before we proceed to a review of the Ruby language itself.

In object-oriented programming, the fundamental unit is the object, which is an entity that serves as a container for data and also controls access to the data. Associated with an object is a set of attributes, which are essentially no more than variables belonging to the object. (In this book, we will loosely use the ordinary term variable for an attribute.) Also associated with an object is a set of functions that provide an interface to the functionality of the object. These functions are called methods.

It is essential that any OOP language provide encapsulation. As the term is commonly used, it means first that the attributes and methods of an object are associated specifically with that object or bundled with it. Secondly, it means that the scope of those attributes and methods is by default the object itself (an application of the well-known principle of data hiding, which is not specific to OOP).

An object is considered to be an instance or manifestation of an object class (usually simply called a class). The class may be thought of as the blueprint or pattern; the object itself is the thing created from that blueprint or pattern. A class is often thought of as an abstract type: a more complex type than, for example, an integer or character string.

When an object (an instance of a class) is created, it is said to be instantiated. Some languages have the notion of an explicit constructor and destructor for an object: functions that perform whatever tasks are needed to initialize an object and, respectively, to "destroy" it. We may as well mention prematurely that Ruby has what might be considered a constructor but certainly does not have any concept of a destructor (because of its well-behaved garbage-collection mechanism).
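
As a rough illustration of the ideas so far (attributes, methods, encapsulation, and instantiation), here is a minimal Ruby sketch. The Dog class and its method names are hypothetical, chosen only for this example; initialize plays the role of the constructor.

```ruby
# Hypothetical class for illustration only.
class Dog
  # Called automatically by Dog.new; this is Ruby's "constructor."
  def initialize(name, age)
    @name = name   # attributes (instance variables) belong to the object
    @age  = age
  end

  # A method that exposes one attribute to the outside world.
  def name
    @name
  end

  # A method that changes the object's state in a controlled way.
  def birthday!
    @age += 1
  end

  def to_s
    "#{@name} (#{@age})"
  end
end

fido = Dog.new("Fido", 3)   # instantiation
fido.birthday!
puts fido                   # Fido (4)
puts fido.name              # Fido
# fido.age would raise NoMethodError, because no method exposes @age.
```

Note that @age is reachable only through the methods the class chooses to provide; that is encapsulation at work.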

Occasionally a situation arises in which a piece of data is more "global" in scope than a single object, and it is inappropriate to put a copy of the attribute into each instance of the class. For example, consider a class called MyDogs, from which three objects are created: fido, rover, and spot. For each dog, there might be such attributes as age and date of vaccination. But suppose we want to store the owner's name. We could certainly put it in each object, but that is wasteful of memory and at the very least a misleading design. Clearly the owner_name attribute belongs not to any individual object but rather to the class itself. When it is defined that way (and the syntax will vary from one language to another), it is called a class attribute (or class variable).

Of course, there are many situations in which a class variable might be needed. For example, suppose we want to keep a count of how many objects of a certain class have been created. We could use a class variable that was initialized to zero and incremented with every instantiation; the class variable would be associated with the class and not with any particular object. In scope, this variable would be just like any other attribute, but there would only be one copy of it for the entire class and the entire set of objects created from that class.

To distinguish between class attributes and ordinary attributes, the latter are sometimes explicitly called object attributes (or instance attributes). We will use the convention that any attribute is assumed to be an instance attribute unless we explicitly call it a class attribute.

Just as an object's methods are used to control access to its attributes and provide a clean interface to them, so is it sometimes appropriate or necessary to define a method that is associated with a class. A class method, not surprisingly, controls access to the class variables and also performs any tasks that might have class-wide effects rather than merely object-wide effects. As with data attributes, methods are assumed to belong to the object rather than the class, unless stated otherwise.
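
One way the MyDogs example might look in Ruby is sketched below. The owner's name ("Pat") and the method names are assumptions made for illustration. Ruby marks class variables with @@ and class methods with the self. prefix.

```ruby
# A sketch of class attributes and class methods; names are hypothetical.
class MyDogs
  @@owner_name = "Pat"   # one copy, shared by the class and all instances
  @@count      = 0       # how many dogs have been created so far

  attr_reader :name, :age   # ordinary (instance) attributes

  def initialize(name, age)
    @name = name
    @age  = age
    @@count += 1            # incremented with every instantiation
  end

  # Class methods: called on MyDogs itself, not on a particular dog.
  def self.count
    @@count
  end

  def self.owner_name
    @@owner_name
  end
end

fido  = MyDogs.new("Fido", 3)
rover = MyDogs.new("Rover", 5)
spot  = MyDogs.new("Spot", 1)

puts MyDogs.count        # 3
puts MyDogs.owner_name   # Pat
puts rover.age           # 5
```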

It is worth mentioning that there is a sense in which all methods are class methods. We should not suppose that when a hundred objects are created, we actually copy the code for the methods a hundred times! However, the rules of scope assure us that each object method operates only on the object whose method is being called, providing us with the extremely necessary illusion that object methods are associated strictly with their objects.

We come now to one of the real strengths of object-oriented programming: inheritance. Inheritance is a mechanism that allows us to extend a previously existing entity by adding features to create a new entity. In short, inheritance is a way of reusing code. (Easy, effective code reuse has long been the Holy Grail of computer science, resulting in the invention decades ago of parameterized subroutines and code libraries. OOP is only one of the later efforts in realizing this goal.)

Typically we think of inheritance at the class level. If we have a specific class in mind and there is a more general case already in existence, we can define our new class to inherit the features of the old one. For example, suppose we have the class Polygon, which describes convex polygons. If we then find ourselves wanting to deal with the Rectangle class, we can inherit from Polygon so that Rectangle now has all the attributes and methods that Polygon has. For example, there might be a method that would calculate perimeter by iterating over all the sides and adding their lengths. Assuming everything is implemented properly, this method would automatically work for the new class; the code would not have to be rewritten.
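
The Polygon and Rectangle example could be sketched in Ruby roughly as follows. Storing the side lengths in an array is an assumption made for this sketch; the point is only that perimeter is written once and works unchanged in the subclass.

```ruby
# A sketch only; representing a polygon as a list of side lengths is an assumption.
class Polygon
  attr_reader :sides

  def initialize(*sides)
    @sides = sides
  end

  # Iterate over all the sides and add their lengths.
  def perimeter
    @sides.sum
  end
end

# Rectangle inherits everything Polygon has and adds a feature of its own.
class Rectangle < Polygon
  def initialize(width, height)
    super(width, height, width, height)
  end

  def area
    sides[0] * sides[1]
  end
end

r = Rectangle.new(3, 4)
puts r.perimeter   # 14 (inherited, not rewritten)
puts r.area        # 12
```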

When class B inherits from class A, we say that B is a subclass of A, or conversely, that A is the superclass of B. In slightly different terminology, we may say that A is a base class or parent class, and B is a derived class or child class.

A derived class, as you have seen, may treat a method inherited from its base class as if it were its own. On the other hand, it may redefine that method entirely, if it is necessary to provide a different implementation; this is referred to as overriding a method. In addition, most languages provide a way for an overridden method to call its namesake in the parent class; that is, the method foo in B knows how to call method foo in A if it wants to. (Any language not providing this feature is under suspicion of not being truly object oriented.) Essentially the same is true for data attributes.
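
In Ruby, an overridden method reaches its namesake in the parent class through the super keyword. A minimal sketch, using the class and method names from the paragraph above:

```ruby
class A
  def foo
    "A#foo"
  end
end

class B < A
  # Overrides A's foo, but still calls the parent's version via super.
  def foo
    "B#foo, then " + super
  end
end

puts A.new.foo   # A#foo
puts B.new.foo   # B#foo, then A#foo
```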

The relationship between a class and its superclass is an interesting and important one; it is usually described as the is-a relationship, because a Square "is a" Rectangle, and a Rectangle "is a" Polygon, and so on. Therefore, if we create an inheritance hierarchy (which tends to exist in one form or another in any OOP language), we see that the more specific entity "is a" subclass of the more general entity at any given point in the hierarchy. Note that this relationship is transitive: in the preceding example, you can easily see that a Square "is a" Polygon. Note also that the relationship is not symmetric: we know that every Rectangle is a Polygon, but not every Polygon is a Rectangle.
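
Ruby lets us ask about the is-a relationship directly. The following sketch uses empty placeholder classes, since only the hierarchy matters here:

```ruby
# Empty classes; only the inheritance chain is of interest.
class Polygon; end
class Rectangle < Polygon; end
class Square < Rectangle; end

sq = Square.new
puts sq.is_a?(Rectangle)            # true
puts sq.is_a?(Polygon)              # true  (the relationship is transitive)
puts Polygon.new.is_a?(Rectangle)   # false (it does not run in reverse)
puts Square.superclass              # Rectangle
puts Square < Polygon               # true  (Square is a descendant of Polygon)
```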

This brings us to the topic of multiple inheritance. It is conceivable that there might be more than one class from which a new class could inherit. For example, the classes Dog and Cat can both inherit from the class Mammal, and Sparrow and Raven can inherit from WingedCreature. But what if we want to define the class Bat? It can reasonably inherit from both Mammal and WingedCreature. This corresponds well with our experience in real life, in which things are not members of just one category but of many non-nested categories.

Multiple inheritance (MI) is probably the most controversial area in OOP. One camp will point out the potential for ambiguity that must be resolved. For example, if Mammal and WingedCreature both have an attribute called size (or a method called eat), which one will be referenced when we refer to it from a Bat object? Another related difficulty is the "diamond inheritance problem" (so called because of the shape of its inheritance diagram), with both superclasses inheriting from a single common superclass. For example, imagine that Mammal and WingedCreature both inherit from Organism; the hierarchy from Organism to Bat forms a diamond. But what about the attributes that the two intermediate classes both inherit from their parent? Does Bat get two copies of each of them, or are they merged back into single attributes because they come from a common ancestor in the first place?

These are both issues for the language designer rather than the programmer. Different OOP languages deal with the issues in different ways. Some will provide rules allowing one definition of an attribute to "win out," or a way to distinguish between attributes of the same name, or even a way of aliasing or renaming the identifiers. This in itself is considered by many to be an argument against MI: the mechanisms for dealing with name clashes and the like are not universally agreed upon but are very much language dependent. C++ offers a fairly minimal set of features for dealing with ambiguities; those of Eiffel are probably better, and those of Perl are different from both.

The alternative, of course, is to disallow MI altogether. This is the approach taken by such languages as Java and Ruby. This sounds like a drastic compromise; however, as you'll see later, it is not as bad as it sounds. We will look at a viable alternative to traditional multiple inheritance, but we must first discuss yet another OOP buzzword: polymorphism.

Polymorphism is the term that perhaps inspires the most semantic disagreement in the field. Everyone seems to know what it is, but everyone has a different definition. (In recent years, "What is polymorphism?" has become a popular interview question. If it is asked of you, I recommend quoting an expert like Bertrand Meyer or Bjarne Stroustrup; that way, if the interviewer disagrees, his beef is with the expert and not with you.)

The literal meaning of polymorphism is "the ability to take on multiple forms or shapes." In its broadest sense, this refers to the ability of different objects to respond in different ways to the same message (or method invocation).

Damian Conway, in his book Object-Oriented Perl, distinguishes meaningfully between two kinds of polymorphism. The first, inheritance polymorphism, is what most programmers are referring to when they talk about polymorphism.

When a class inherits from its superclass, we know (by definition) that any method present in the superclass is also present in the subclass. Therefore, a chain of inheritance represents a linear hierarchy of classes that can respond to the same set of methods. Of course, we must remember that any subclass can redefine a method; that is what gives inheritance its power. If I call a method on an object, typically it will be either the one it inherited from its superclass or a more appropriate (more specialized) method tailored for the subclass.

In statically typed languages such as C++, inheritance polymorphism establishes type compatibility down the chain of inheritance (but not in the reverse direction). For example, if B inherits from A, then a pointer to an A object can also point to a B object. However, the reverse is not true. This type compatibility is an essential OOP feature in such languages (indeed it almost sums up polymorphism), but polymorphism certainly exists in the absence of static typing (as in Ruby).
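
Inheritance polymorphism might be sketched in Ruby as follows. The class names echo the earlier bird examples, but the methods (call_sound, announce) and the Albatross class are hypothetical. The same message produces a different response depending on the class of the receiver.

```ruby
# Hypothetical methods for illustration.
class WingedCreature
  # Default behavior, used by any subclass that does not override it.
  def call_sound
    "(silence)"
  end

  # Defined once; it picks up whichever call_sound the receiver's class supplies.
  def announce
    "#{self.class.name}: #{call_sound}"
  end
end

class Sparrow < WingedCreature
  def call_sound
    "chirp"
  end
end

class Raven < WingedCreature
  def call_sound
    "caw"
  end
end

class Albatross < WingedCreature
end

[Sparrow.new, Raven.new, Albatross.new].each do |bird|
  puts bird.announce
end
# Sparrow: chirp
# Raven: caw
# Albatross: (silence)
```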

The second kind of polymorphism Conway identifies is interface polymorphism. This does not require any inheritance relationship between classes; it only requires that the interfaces of the objects have methods of a certain name. The treatment of such objects as being the same "kind" of thing is therefore a type of polymorphism (although in most writings it is not explicitly referred to as such).

Readers familiar with Java will recognize that it implements both kinds of polymorphism. A Java class can extend another class, inheriting from it via the extends keyword, or it may implement an interface, acquiring a known set of methods (which must then be implemented) via the implements keyword. Because of the syntax requirements, the Java compiler is able to determine at compile time whether a method can be invoked on a particular object.

Ruby supports interface polymorphism but in a different way, providing modules whose methods may be mixed in to existing classes (interfacing to user-defined methods that are expected to exist). This, however, is not the way modules are usually used. A module consists of methods and constants that may be used as though they were actual parts of the class or object that includes them; when a module is mixed in via the include statement, this is considered to be a restricted form of multiple inheritance. (According to the language designer Yukihiro Matsumoto, this can be viewed as "single inheritance with implementation sharing.") This is a way of preserving the benefits of MI without suffering all the consequences.
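
Returning to the Bat example, a mixin-based design might look roughly like this. The Winged module and the wingspan_cm attribute are hypothetical; Comparable is a standard Ruby module that supplies <, >, and related methods to any class that defines <=>.

```ruby
class Mammal
  def warm_blooded?
    true
  end
end

# A module cannot be instantiated; its methods are mixed in to classes.
module Winged
  # Relies on the including class to provide a wingspan_cm method.
  def describe_wings
    "wingspan of #{wingspan_cm} cm"
  end
end

class Bat < Mammal     # single inheritance from Mammal...
  include Winged       # ...plus the methods of Winged
  include Comparable   # standard module; expects us to define <=>

  attr_reader :wingspan_cm

  def initialize(wingspan_cm)
    @wingspan_cm = wingspan_cm
  end

  def <=>(other)
    wingspan_cm <=> other.wingspan_cm
  end
end

small = Bat.new(25)
large = Bat.new(150)

puts small.warm_blooded?    # true (inherited from Mammal)
puts small.describe_wings   # wingspan of 25 cm (mixed in from Winged)
puts small < large          # true (supplied by Comparable)
p Bat.ancestors.first(4)    # [Bat, Comparable, Winged, Mammal]
```

Because the modules are searched in a fixed order (visible in the ancestors list), a name clash between them is resolved by a single rule rather than left ambiguous.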

It's worth noting that Ruby supports implicit interface polymorphism by virtue of the simple fact that any class can "masquerade" as another class. In many cases, the only type information we care about is whether a certain set of methods is implemented, that is, whether an object responds to certain messages. Sometimes we write code for a Duck object when really all we care about is for it to implement a quack method. Yet, if something "quacks" like a Duck, for our purposes it is a Duck (with no need to inherit from that class at all). The set of available methods is arguably the most important type information.
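
A small sketch of this idea follows. The Robot class and the make_it_quack helper are hypothetical; respond_to? is the standard way to ask whether an object understands a given message.

```ruby
class Duck
  def quack
    "Quack!"
  end
end

# Not related to Duck by inheritance in any way.
class Robot
  def quack
    "Beep-quack."
  end
end

# Cares only about whether the object responds to quack, not about its class.
def make_it_quack(thing)
  if thing.respond_to?(:quack)
    thing.quack
  else
    "#{thing.class} cannot quack"
  end
end

puts make_it_quack(Duck.new)    # Quack!
puts make_it_quack(Robot.new)   # Beep-quack.
puts make_it_quack("hello")     # String cannot quack
```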

Languages such as C++ contain the concept of abstract classes: classes that must be inherited from and cannot be instantiated on their own. This concept does not exist in the more dynamic Ruby language, although if the programmer really wants, it is possible to fake this kind of behavior by forcing the methods to be overridden. Whether this is useful or not is left as an exercise for you, the reader.
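
One common way to fake an abstract class is to have the base method raise an exception unless a subclass overrides it. This is only one possible idiom, and the AbstractShape and Box names are hypothetical:

```ruby
class AbstractShape
  # "Abstract" method: subclasses are expected to override it.
  def area
    raise NotImplementedError, "#{self.class} must implement area"
  end

  # Concrete method that depends on the overridden one.
  def describe
    "#{self.class.name} with area #{area}"
  end
end

class Box < AbstractShape
  def initialize(width, height)
    @width  = width
    @height = height
  end

  def area
    @width * @height
  end
end

puts Box.new(2, 5).describe   # Box with area 10

begin
  AbstractShape.new.describe
rescue NotImplementedError => e
  puts e.message              # AbstractShape must implement area
end
```

Note that nothing prevents AbstractShape.new itself from succeeding; the failure appears only when the missing method is called.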

The creator of C++, Bjarne Stroustrup, also identifies the concept of a concrete type. This is a class that exists only for convenience; it is not designed to be inherited from, nor is it expected that there will ever be another class derived from it. In other words, the benefits of OOP are basically limited to encapsulation. Ruby does not specifically support this concept through any special syntax (nor does C++), but it is naturally well suited for the creation of such classes.

Some languages are considered to be more "purely" object-oriented than others. (We also use the term radically object oriented.) This refers to the concept that every entity in the language is an object; every primitive type is represented as a full-fledged class, and variables and constants alike are recognized as object instances. This is in contrast to such languages as Java, C++, and Eiffel. In these, the more primitive data types (especially constants) are not first-class objects, although they may sometimes be treated that way with "wrapper" classes.
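
Ruby falls into the "purely" object-oriented camp, which a few lines of code can demonstrate. Even literals are instances of classes and respond to methods:

```ruby
puts 5.class              # Integer (Fixnum on Ruby versions before 2.4)
puts 3.14.class           # Float
puts nil.class            # NilClass
puts true.class           # TrueClass
puts "abc".length         # 3
puts 5.+(3)               # 8 (even + is a method call on the object 5)
puts 5.respond_to?(:+)    # true
puts Integer.class        # Class (classes themselves are objects)
puts nil.is_a?(Object)    # true
```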

Most object-oriented languages are fairly static; the methods and attributes belonging to a class, the global variables, and the inheritance hierarchy are all defined at compile time. Perhaps the largest conceptual leap for a Ruby programmer is that these are all handled dynamically in Ruby. Definitions and even inheritance can happen at runtime; in fact, we can truly say that every declaration or definition is actually executed during the running of the program. Among many other benefits, this obviates the need for conditional compilation and can produce more efficient code in many circumstances.
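
The following sketch shows what "definitions are executed" can mean in practice. The Greeter and LoudGreeter classes are hypothetical, and the platform test is only an assumed stand-in for the kind of decision that conditional compilation would make in a static language.

```ruby
class Greeter
  # The class body is ordinary code that runs when execution reaches it,
  # so a plain `if` can choose between definitions at runtime.
  if RUBY_PLATFORM =~ /mswin|mingw/
    def line_ending
      "\r\n"
    end
  else
    def line_ending
      "\n"
    end
  end

  # Methods can also be generated in a loop.
  %w[hello goodbye].each do |word|
    define_method("say_#{word}") do |name|
      "#{word.capitalize}, #{name}!"
    end
  end
end

# Even a class and its superclass can be created while the program runs.
LoudGreeter = Class.new(Greeter) do
  def say_hello(name)
    super.upcase
  end
end

g = Greeter.new
puts g.say_hello("world")                 # Hello, world!
puts g.say_goodbye("world")               # Goodbye, world!
puts LoudGreeter.new.say_hello("world")   # HELLO, WORLD!
puts LoudGreeter.superclass               # Greeter
```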

This sums up the whirlwind tour of OOP. Throughout the rest of the book, we have tried to make consistent use of the terms introduced here. Let's proceed now to a brief review of the Ruby language itself.