6.3. Design and Implementation Issues


Defining good class interfaces goes a long way toward creating a high-quality program. The internal class design and implementation are also important. This section discusses issues related to containment, inheritance, member functions and data, class coupling, constructors, and value-vs.-reference objects.

Containment ("has a" Relationships)

Containment is the simple idea that a class contains a primitive data element or object. A lot more is written about inheritance than about containment, but that's because inheritance is more tricky and error-prone, not because it's better. Containment is the work-horse technique in object-oriented programming.

Implement "has a" through containment One way of thinking of containment is as a "has a" relationship. For example, an employee "has a" name, "has a" phone number, "has a" tax ID, and so on. You can usually accomplish this by making the name, phone number, and tax ID member data of the Employee class.
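As a minimal sketch of containment, the Employee class below simply holds its name, phone number, and tax ID as member data. The PhoneNumber and TaxId types and the accessor names are hypothetical; the text names only the attributes.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical value types; the text names only the attributes.
struct PhoneNumber { std::string digits; };
struct TaxId { std::string value; };

// Employee "has a" name, phone number, and tax ID: each is plain member data.
class Employee {
public:
    Employee(std::string name, PhoneNumber phone, TaxId taxId)
        : name_(std::move(name)), phone_(std::move(phone)), taxId_(std::move(taxId)) {
        assert(!name_.empty());  // illustrative precondition, not from the text
    }
    const std::string& Name() const { return name_; }
    const PhoneNumber& Phone() const { return phone_; }
    const TaxId& GetTaxId() const { return taxId_; }

private:
    std::string name_;
    PhoneNumber phone_;
    TaxId taxId_;
};
```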

Implement "has a" through private inheritance as a last resort In some instances you might find that you can't achieve containment through making one object a member of another. In that case, some experts suggest privately inheriting from the contained object (Meyers 1998, Sutter 2000). The main reason you would do that is to set up the containing class to access protected member functions or protected member data of the class that's contained. In practice, this approach creates an overly cozy relationship with the ancestor class and violates encapsulation. It tends to point to design errors that should be resolved some way other than through private inheritance.

Be critical of classes that contain more than about seven data members The number "7±2" has been found to be a number of discrete items a person can remember while performing other tasks (Miller 1956). If a class contains more than about seven data members, consider whether the class should be decomposed into multiple smaller classes (Riel 1996). You might err more toward the high end of 7±2 if the data members are primitive data types like integers and strings, more toward the lower end of 7±2 if the data members are complex objects.

Inheritance ("is a" Relationships)

Inheritance is the idea that one class is a specialization of another class. The purpose of inheritance is to create simpler code by defining a base class that specifies common elements of two or more derived classes. The common elements can be routine interfaces, implementations, data members, or data types. Inheritance helps avoid the need to repeat code and data in multiple locations by centralizing it within a base class.

When you decide to use inheritance, you have to make several decisions:

For each member routine, will the routine be visible to derived classes? Will it have a default implementation? Will the default implementation be overridable?
For each data member (including variables, named constants, enumerations, and so on), will the data member be visible to derived classes?

The following subsections explain the ins and outs of making these decisions:

Implement "is a" through public inheritance When a programmer decides to create a new class by inheriting from an existing class, that programmer is saying that the new class "is a" more specialized version of the older class. The base class sets expectations about how the derived class will operate and imposes constraints on how the derived class can operate (Meyers 1998).

The single most important rule in object-oriented programming with C++ is this: public inheritance means "is a." Commit this rule to memory.
Scott Meyers

If the derived class isn't going to adhere completely to the same interface contract defined by the base class, inheritance is not the right implementation technique. Consider containment or making a change further up the inheritance hierarchy.

Design and document for inheritance or prohibit it Inheritance adds complexity to a program, and, as such, it's a dangerous technique. As Java guru Joshua Bloch says, "Design and document for inheritance, or prohibit it." If a class isn't designed to be inherited from, make its members non-virtual in C++, final in Java, or non-overridable in Microsoft Visual Basic so that you can't inherit from it.
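The sketch below shows one way this guideline might look in C++, assuming a C++11 or later compiler, where the final specifier can prohibit derivation outright. Report and SalesReport are hypothetical classes invented for illustration.

```cpp
#include <cassert>
#include <string>

// Designed for inheritance: the single extension point is explicit and documented.
class Report {
public:
    virtual ~Report() = default;

    // Non-virtual: derived classes cannot change the overall format.
    std::string Render() const {
        const std::string body = Body();
        assert(!body.empty());  // documented contract for overriders
        return "[" + body + "]";
    }

protected:
    // Override to supply the body text; must not return an empty string.
    virtual std::string Body() const = 0;
};

// Not designed for inheritance: "final" makes the compiler reject
// any attempt to derive from SalesReport.
class SalesReport final : public Report {
protected:
    std::string Body() const override { return "sales"; }
};
```

Marking the class final turns "this class was not designed to be extended" from a comment into something the compiler enforces.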

Adhere to the Liskov Substitution Principle (LSP) In one of object-oriented programming's seminal papers, Barbara Liskov argued that you shouldn't inherit from a base class unless the derived class truly "is a" more specific version of the base class (Liskov 1988). Andy Hunt and Dave Thomas summarize LSP like this: "Subclasses must be usable through the base class interface without the need for the user to know the difference" (Hunt and Thomas 2000).

In other words, all the routines defined in the base class should mean the same thing when they're used in each of the derived classes.

If you have a base class of Account and derived classes of CheckingAccount, SavingsAccount, and AutoLoanAccount, a programmer should be able to invoke any of the routines derived from Account on any of Account's subtypes without caring about which subtype a specific account object is.

If a program has been written so that the Liskov Substitution Principle is true, inheritance is a powerful tool for reducing complexity because a programmer can focus on the generic attributes of an object without worrying about the details. If a programmer must be constantly thinking about semantic differences in subclass implementations, then inheritance is increasing complexity rather than reducing it. Suppose a programmer has to think this: "If I call the InterestRate() routine on CheckingAccount or SavingsAccount, it returns the interest the bank pays, but if I call InterestRate() on AutoLoanAccount I have to change the sign because it returns the interest the consumer pays to the bank." According to LSP, AutoLoanAccount should not inherit from the Account base class in this example because the semantics of the InterestRate() routine are not the same as the semantics of the base class's InterestRate() routine.
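A rough sketch of the account example follows, under the assumption that the base-class contract is "InterestRate() is the rate the bank pays." The rates, the InterestPaidByBank() helper, and the LoanInterestRate() name are all made up for illustration. AutoLoanAccount is kept out of the hierarchy because its rate means something different.

```cpp
#include <cassert>

// Contract: InterestRate() is the annual rate the bank PAYS the customer.
class Account {
public:
    virtual ~Account() = default;
    virtual double InterestRate() const = 0;
};

class CheckingAccount : public Account {
public:
    double InterestRate() const override { return 0.01; }  // illustrative rate
};

class SavingsAccount : public Account {
public:
    double InterestRate() const override { return 0.03; }  // illustrative rate
};

// Works for every subtype without knowing which one it was given.
double InterestPaidByBank(const Account& account, double balance) {
    assert(balance >= 0.0);
    return balance * account.InterestRate();
}

// Deliberately not derived from Account: this rate is what the customer
// pays the bank, which is a different contract.
class AutoLoanAccount {
public:
    double LoanInterestRate() const { return 0.07; }  // illustrative rate
};
```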

Be sure to inherit only what you want to inherit A derived class can inherit member routine interfaces, implementations, or both. Table 6-1 shows the variations of how routines can be implemented and overridden.

Table 6-1. Variations on Inherited Routines
	Overridable	Not Overridable
Implementation: Default Provided	Overridable Routine	Non-Overridable Routine
Implementation: No Default Provided	Abstract Overridable Routine	Not used (doesn't make sense to leave a routine undefined and not allow it to be overridden)

As the table suggests, inherited routines come in three basic flavors:

An abstract overridable routine means that the derived class inherits the routine's interface but not its implementation.
An overridable routine means that the derived class inherits the routine's interface and a default implementation and it is allowed to override the default implementation.
A non-overridable routine means that the derived class inherits the routine's interface and its default implementation and it is not allowed to override the routine's implementation.
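In C++ the three flavors map onto pure virtual, virtual, and non-virtual member functions. The sketch below uses hypothetical Shape and Square classes to show one routine of each kind.

```cpp
#include <cassert>
#include <string>

class Shape {
public:
    virtual ~Shape() = default;

    // Abstract overridable routine: interface only, no default implementation.
    virtual double Area() const = 0;

    // Overridable routine: interface plus a default that derived classes may replace.
    virtual std::string Name() const { return "shape"; }

    // Non-overridable routine: interface plus an implementation that is fixed.
    bool IsLargerThan(const Shape& other) const { return Area() > other.Area(); }
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) { assert(side >= 0.0); }
    double Area() const override { return side_ * side_; }
    std::string Name() const override { return "square"; }

private:
    double side_;
};
```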

When you choose to implement a new class through inheritance, think through the kind of inheritance you want for each member routine. Beware of inheriting implementation just because you're inheriting an interface, and beware of inheriting an interface just because you want to inherit an implementation. If you want to use a class's implementation but not its interface, use containment rather than inheritance.

Don't "override" a non-overridable member function Both C++ and Java allow a programmer to override a non-overridable member routine, kind of. If a function is private in the base class, a derived class can create a function with the same name. To the programmer reading the code in the derived class, such a function can create confusion because it looks like it should be polymorphic, but it isn't; it just has the same name. Another way to state this guideline is, "Don't reuse names of non-overridable base-class routines in derived classes."
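The sketch below illustrates the confusion with a closely related C++ case: a public non-virtual routine (rather than a private one) whose name is reused in a derived class. Logger, FileLogger, and the helper functions are hypothetical names.

```cpp
#include <cassert>
#include <string>

class Logger {
public:
    // Non-virtual, so it is not overridable.
    std::string Prefix() const { return "base"; }
};

class FileLogger : public Logger {
public:
    // Same name, but this hides Logger::Prefix() instead of overriding it.
    std::string Prefix() const { return "file"; }
};

// The static type of the reference decides which routine runs.
std::string PrefixSeenThroughBase(const Logger& logger) {
    return logger.Prefix();
}

// The same object answers differently depending on how it is viewed.
void DemonstrateHiding() {
    FileLogger logger;
    assert(logger.Prefix() == "file");
    assert(PrefixSeenThroughBase(logger) == "base");
}
```

A reader who sees FileLogger::Prefix() would reasonably expect polymorphic behavior, which is exactly the trap the guideline warns about.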

Move common interfaces, data, and behavior as high as possible in the inheritance tree The higher you move interfaces, data, and behavior, the more easily derived classes can use them. How high is too high? Let abstraction be your guide. If you find that moving a routine higher would break the higher object's abstraction, don't do it.

Be suspicious of classes of which there is only one instance A single instance might indicate that the design confuses objects with classes. Consider whether you could just create an object instead of a new class. Can the variation of the derived class be represented in data rather than as a distinct class? The Singleton pattern is one notable exception to this guideline.

Be suspicious of base classes of which there is only one derived class When I see a base class that has only one derived class, I suspect that some programmer has been "designing ahead" trying to anticipate future needs, usually without fully understanding what those future needs are. The best way to prepare for future work is not to design extra layers of base classes that "might be needed someday"; it's to make current work as clear, straightforward, and simple as possible. That means not creating any more inheritance structure than is absolutely necessary.

Be suspicious of classes that override a routine and do nothing inside the derived routine This typically indicates an error in the design of the base class. For instance, suppose you have a class Cat and a routine Scratch() and suppose that you eventually find out that some cats are declawed and can't scratch. You might be tempted to create a class derived from Cat named ScratchlessCat and override the Scratch() routine to do nothing. This approach presents several problems:

It violates the abstraction (interface contract) presented in the Cat class by changing the semantics of its interface.
This approach quickly gets out of control when you extend it to other derived classes. What happens when you find a cat without a tail? Or a cat that doesn't catch mice? Or a cat that doesn't drink milk? Eventually you'll end up with derived classes like ScratchlessTaillessMicelessMilklessCat.
Over time, this approach gives rise to code that's confusing to maintain because the interfaces and behavior of the ancestor classes imply little or nothing about the behavior of their descendants.

The place to fix this problem is not in the derived class, but in the original Cat class. Create a Claws class and contain that within the Cat class. The root problem was the assumption that all cats scratch, so fix that problem at the source, rather than just bandaging it at the destination.
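One possible reading of this advice is sketched below: Cat contains an optional Claws object, so "can this cat scratch?" becomes data rather than a subclass. The Claws interface, CanScratch(), and ScratchCount() are assumptions made for illustration; the text says only that Cat should contain a Claws class.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical Claws class; the text names it but does not define it.
class Claws {
public:
    void Scratch() { ++scratchCount_; }
    int ScratchCount() const { return scratchCount_; }

private:
    int scratchCount_ = 0;
};

class Cat {
public:
    Cat() = default;  // a declawed cat contains no Claws
    explicit Cat(std::unique_ptr<Claws> claws) : claws_(std::move(claws)) {}

    bool CanScratch() const { return claws_ != nullptr; }

    // Precondition: CanScratch() is true.
    void Scratch() {
        assert(CanScratch());
        claws_->Scratch();
    }

    int ScratchCount() const { return claws_ ? claws_->ScratchCount() : 0; }

private:
    std::unique_ptr<Claws> claws_;
};
```

With this shape there is no ScratchlessCat class at all; whether a cat scratches is a property of the one Cat class.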

Avoid deep inheritance trees Object-oriented programming provides a large number of techniques for managing complexity. But every powerful tool has its hazards, and some object-oriented techniques have a tendency to increase complexity rather than reduce it.

In his excellent book Object-Oriented Design Heuristics (1996), Arthur Riel suggests limiting inheritance hierarchies to a maximum of six levels. Riel bases his recommendation on the "magic number 7±2," but I think that's grossly optimistic. In my experience most people have trouble juggling more than two or three levels of inheritance in their brains at once. The "magic number 7±2" is probably better applied as a limit to the total number of subclasses of a base class rather than the number of levels in an inheritance tree.

Deep inheritance trees have been found to be significantly associated with increased fault rates (Basili, Briand, and Melo 1996). Anyone who has ever tried to debug a complex inheritance hierarchy knows why. Deep inheritance trees increase complexity, which is exactly the opposite of what inheritance should be used to accomplish. Keep the primary technical mission in mind. Make sure you're using inheritance to avoid duplicating code and to minimize complexity.

Prefer polymorphism to extensive type checking Frequently repeated case statements sometimes suggest that inheritance might be a better design choice, although this is not always true. Here is a classic example of code that cries out for a more object-oriented approach:

C++ Example of a Case Statement That Probably Should Be Replaced by Polymorphism

 switch ( shape.type ) {
    case Shape_Circle:
       shape.DrawCircle();
       break;
    case Shape_Square:
       shape.DrawSquare();
       break;
    ...
 }

In this example, the calls to shape.DrawCircle() and shape.DrawSquare() should be replaced by a single routine named shape.Draw(), which can be called regardless of whether the shape is a circle or a square.
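A minimal sketch of that replacement might look like the following. To keep it testable, the hypothetical Draw() routines return a text description rather than rendering anything, and DrawAll() is an invented helper.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    // Returns a text description instead of drawing, to keep the sketch testable.
    virtual std::string Draw() const = 0;
};

class Circle : public Shape {
public:
    std::string Draw() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string Draw() const override { return "square"; }
};

// No switch on a type code: each shape knows how to draw itself.
std::string DrawAll(const std::vector<std::unique_ptr<Shape>>& shapes) {
    std::string output;
    for (const auto& shape : shapes) {
        assert(shape != nullptr);
        output += shape->Draw() + ";";
    }
    return output;
}
```

Adding a new shape now means adding a class, not hunting down every case statement that switches on the shape type.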

On the other hand, sometimes case statements are used to separate truly different kinds of objects or behavior. Here is an example of a case statement that is appropriate in an object-oriented program:

C++ Example of a Case Statement That Probably Should Not Be Replaced by Polymorphism

 switch ( ui.Command() ) {
    case Command_OpenFile:
       OpenFile();
       break;
    case Command_Print:
       Print();
       break;
    case Command_Save:
       Save();
       break;
    case Command_Exit:
       ShutDown();
       break;
    ...
 }

In this case, it would be possible to create a base class with derived classes and a polymorphic DoCommand() routine for each command (as in the Command pattern). But in a simple case like this one, the meaning of DoCommand() would be so diluted as to be meaningless, and the case statement is the more understandable solution.

Make all data private, not protected As Joshua Bloch says, "Inheritance breaks encapsulation" (2001). When you inherit from an object, you obtain privileged access to that object's protected routines and data. If the derived class really needs access to the base class's attributes, provide protected accessor functions instead.
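As a sketch of this guideline, the hypothetical Account class below keeps its balance private and gives derived classes a protected SetBalance() routine, so the non-negative invariant is checked in one place. The invariant itself is an assumption added for illustration.

```cpp
#include <cassert>

class Account {
public:
    explicit Account(double balance) : balance_(balance) { assert(balance >= 0.0); }
    virtual ~Account() = default;
    double Balance() const { return balance_; }

protected:
    // Derived classes go through this routine, so the invariant is enforced once.
    void SetBalance(double balance) {
        assert(balance >= 0.0);
        balance_ = balance;
    }

private:
    double balance_;  // private, not protected
};

class SavingsAccount : public Account {
public:
    explicit SavingsAccount(double balance) : Account(balance) {}
    void AddInterest(double rate) { SetBalance(Balance() * (1.0 + rate)); }
};
```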

Multiple Inheritance

Inheritance is a power tool. It's like using a chain saw to cut down a tree instead of a manual crosscut saw. It can be incredibly useful when used with care, but it's dangerous in the hands of someone who doesn't observe proper precautions.

The one indisputable fact about multiple inheritance in C++ is that it opens up a Pandora's box of complexities that simply do not exist under single inheritance.
Scott Meyers

If inheritance is a chain saw, multiple inheritance is a 1950s-era chain saw with no blade guard, no automatic shutoff, and a finicky engine. There are times when such a tool is valuable; mostly, however, you're better off leaving the tool in the garage where it can't do any damage.

Although some experts recommend broad use of multiple inheritance (Meyer 1997), in my experience multiple inheritance is useful primarily for defining "mixins," simple classes that are used to add a set of properties to an object. Mixins are called mixins because they allow properties to be "mixed in" to derived classes. Mixins might be classes like Displayable, Persistent, Serializable, or Sortable. Mixins are nearly always abstract and aren't meant to be instantiated independently of other objects.

Mixins require the use of multiple inheritance, but they aren't subject to the classic diamond-inheritance problem associated with multiple inheritance as long as all mixins are truly independent of each other. They also make the design more comprehensible by "chunking" attributes together. A programmer will have an easier time understanding that an object uses the mixins Displayable and Persistent than understanding that an object uses the 11 more-specific routines that would otherwise be needed to implement those two properties.
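The following sketch shows two independent abstract mixins combined through multiple inheritance. The Invoice class, the routine names, and the string formats are hypothetical; only the mixin names Displayable and Persistent come from the text.

```cpp
#include <cassert>
#include <string>

// Mixins: small abstract classes that are independent of each other.
class Displayable {
public:
    virtual ~Displayable() = default;
    virtual std::string Display() const = 0;
};

class Persistent {
public:
    virtual ~Persistent() = default;
    virtual std::string Serialize() const = 0;
};

// Invoice mixes in both properties. The mixins share no base class,
// so no diamond arises.
class Invoice : public Displayable, public Persistent {
public:
    explicit Invoice(int number) : number_(number) { assert(number > 0); }
    std::string Display() const override { return "Invoice #" + std::to_string(number_); }
    std::string Serialize() const override { return "invoice:" + std::to_string(number_); }

private:
    int number_;
};

// Client code can depend on just the property it needs.
std::string Show(const Displayable& item) { return item.Display(); }
std::string Save(const Persistent& item) { return item.Serialize(); }
```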

Java and Visual Basic recognize the value of mixins by allowing multiple inheritance of interfaces but only single-class inheritance. C++ supports multiple inheritance of both interface and implementation. Programmers should use multiple inheritance only after carefully considering the alternatives and weighing the impact on system complexity and comprehensibility.

Why Are There So Many Rules for Inheritance?

This section has presented numerous rules for staying out of trouble with inheritance. The underlying message of all these rules is that inheritance tends to work against the primary technical imperative you have as a programmer, which is to manage complexity. For the sake of controlling complexity, you should maintain a heavy bias against inheritance. Here's a summary of when to use inheritance and when to use containment:

If multiple classes share common data but not behavior, create a common object that those classes can contain.
If multiple classes share common behavior but not data, derive them from a common base class that defines the common routines.
If multiple classes share common data and behavior, inherit from a common base class that defines the common data and routines.
Inherit when you want the base class to control your interface; contain when you want to control your interface.

Cross-Reference

For more on complexity, see "Software's Primary Technical Imperative: Managing Complexity" in Section 5.2.

Member Functions and Data

Here are a few guidelines for implementing member functions and member data effectively.

Cross-Reference

For more discussion of routines in general, see Chapter 7, "High-Quality Routines."

Keep the number of routines in a class as small as possible A study of C++ programs found that higher numbers of routines per class were associated with higher fault rates (Basili, Briand, and Melo 1996). However, other competing factors were found to be more significant, including deep inheritance trees, large number of routines called within a class, and strong coupling between classes. Evaluate the tradeoff between minimizing the number of routines and these other factors.

Disallow implicitly generated member functions and operators you don't want Sometimes you'll find that you want to disallow certain functions: perhaps you want to disallow assignment, or you don't want to allow an object to be constructed. You might think that, since the compiler generates operators automatically, you're stuck allowing access. But in such cases you can disallow those uses by declaring the constructor, assignment operator, or other function or operator private, which will prevent clients from accessing it. (Making the constructor private is a standard technique for defining a singleton class, which is discussed later in this chapter.)
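The sketch below shows the private-declaration technique on a hypothetical Connection class. It also shows an alternative that is not mentioned in the text: with a C++11 or later compiler, the same operations can be marked = delete (the Session class). The static_assert lines are only there to confirm the effect at compile time.

```cpp
#include <cassert>
#include <type_traits>

// Private-declaration technique: copying is declared private and never defined.
class Connection {
public:
    explicit Connection(int handle) : handle_(handle) { assert(handle >= 0); }
    int Handle() const { return handle_; }

private:
    Connection(const Connection&);
    Connection& operator=(const Connection&);
    int handle_;
};

// C++11 alternative (an addition to the text's technique): delete the operations.
class Session {
public:
    Session() = default;
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;
};

static_assert(!std::is_copy_constructible<Connection>::value, "Connection must not be copyable");
static_assert(!std::is_copy_assignable<Connection>::value, "Connection must not be assignable");
static_assert(!std::is_copy_constructible<Session>::value, "Session must not be copyable");
```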

Minimize the number of different routines called by a class One study found that the number of faults in a class was statistically correlated with the total number of routines that were called from within a class (Basili, Briand, and Melo 1996). The same study found that the more classes a class used, the higher its fault rate tended to be. These concepts are sometimes called "fan out."

Minimize indirect routine calls to other classes Direct connections are hazardous enough. Indirect connections such as account.ContactPerson().DaytimeContactInfo().PhoneNumber() tend to be even more hazardous. Researchers have formulated a rule called the "Law of Demeter" (Lieberherr and Holland 1989), which essentially states that Object A can call any of its own routines. If Object A instantiates an Object B, it can call any of Object B's routines. But it should avoid calling routines on objects provided by Object B. In the account example above, that means account.ContactPerson() is OK but account.ContactPerson().DaytimeContactInfo() is not.
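One way to shorten such a call chain is to let each class answer the question itself, as in the sketch below. The class layout and the routine names DaytimePhoneNumber() and ContactDaytimePhoneNumber() are assumptions; the text gives only the original chain.

```cpp
#include <cassert>
#include <string>
#include <utility>

class ContactInfo {
public:
    explicit ContactInfo(std::string phone) : phone_(std::move(phone)) {
        assert(!phone_.empty());  // illustrative precondition
    }
    const std::string& PhoneNumber() const { return phone_; }

private:
    std::string phone_;
};

class Person {
public:
    explicit Person(ContactInfo daytime) : daytime_(std::move(daytime)) {}
    // Person answers the question itself rather than exposing ContactInfo.
    const std::string& DaytimePhoneNumber() const { return daytime_.PhoneNumber(); }

private:
    ContactInfo daytime_;
};

class Account {
public:
    explicit Account(Person contact) : contact_(std::move(contact)) {}
    // Clients make one call on the object they hold.
    const std::string& ContactDaytimePhoneNumber() const {
        return contact_.DaytimePhoneNumber();
    }

private:
    Person contact_;
};
```

The tradeoff is a wider Account interface in exchange for clients that no longer depend on Person or ContactInfo.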

Constructors

Following are some guidelines that apply specifically to constructors. Guidelines for constructors are pretty similar across languages (C++, Java, and Visual Basic, anyway). Destructors vary more, so you should check out the materials listed in this chapter's "Additional Resources" section for information on destructors.

Initialize all member data in all constructors, if possible Initializing all data members in all constructors is an inexpensive defensive programming practice.
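A small sketch of this practice follows, using a hypothetical Employee class with two constructors. The default values are placeholders chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Employee {
public:
    // Every constructor leaves every member in a known state.
    Employee() : name_("unknown"), id_(0), salary_(0.0) {}

    Employee(std::string name, int id)
        : name_(std::move(name)), id_(id), salary_(0.0) {
        assert(id >= 0);  // illustrative precondition
    }

    const std::string& Name() const { return name_; }
    int Id() const { return id_; }
    double Salary() const { return salary_; }

private:
    std::string name_;
    int id_;
    double salary_;
};
```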

Enforce the singleton property by using a private constructor If you want to define a class that allows only one object to be instantiated, you can enforce this by hiding all the constructors of the class and then providing a static GetInstance() routine to access the class's single instance. Here's an example of how that would work:

Java Example of Enforcing a Singleton with a Private Constructor

 public class MaxId {
    // constructors and destructors
    private MaxId() {       <-- 1
       ...
    }
    ...
    // public routines
    public static MaxId GetInstance() {       <-- 2
       return m_instance;
    }
    ...
    // private members
    private static final MaxId m_instance = new MaxId();       <-- 3
    ...
 }

(1) Here is the private constructor.
(2) Here is the public routine that provides access to the single instance.
(3) Here is the single instance.

The private constructor is called only when the static object m_instance is initialized. In this approach, if you want to reference the MaxId singleton, you would simply refer to MaxId.GetInstance().

Prefer deep copies to shallow copies until proven otherwise One of the major decisions you'll make about complex objects is whether to implement deep copies or shallow copies of the object. A deep copy of an object is a member-wise copy of the object's member data; a shallow copy typically just points to or refers to a single reference copy, although the specific meanings of "deep" and "shallow" vary.

The motivation for creating shallow copies is typically to improve performance. Although creating multiple copies of large objects might be aesthetically offensive, it rarely causes any measurable performance impact. A small number of objects might cause performance issues, but programmers are notoriously poor at guessing which code really causes problems. (For details, see Chapter 25, "Code-Tuning Strategies.") Because it's a poor tradeoff to add complexity for dubious performance gains, a good approach to deep vs. shallow copies is to prefer deep copies until proven otherwise.

Deep copies are simpler to code and maintain than shallow copies. In addition to the code either kind of object would contain, shallow copies add code to count references, ensure safe object copies, safe comparisons, safe deletes, and so on. This code can be error-prone, and you should avoid it unless there's a compelling reason to create it.
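The difference in behavior can be sketched as follows. Report and SharedReport are hypothetical classes, and the shallow version leans on std::shared_ptr to do the reference counting instead of the hand-written counting code the text describes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Deep copy: each Report owns its data, so the compiler-generated copy suffices.
class Report {
public:
    void AddValue(int value) { values_.push_back(value); }
    std::size_t Size() const { return values_.size(); }

private:
    std::vector<int> values_;
};

// Shallow copy: every copy refers to one shared, reference-counted buffer.
class SharedReport {
public:
    SharedReport() : values_(std::make_shared<std::vector<int>>()) {}

    void AddValue(int value) {
        assert(values_ != nullptr);
        values_->push_back(value);
    }
    std::size_t Size() const { return values_->size(); }

private:
    std::shared_ptr<std::vector<int>> values_;
};
```

Changing a copy of Report leaves the original alone, whereas changing a copy of SharedReport changes what every other copy sees. That shared state is the source of the extra care the shallow approach demands.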

If you find that you do need to use a shallow-copy approach, Scott Meyers's More Effective C++, Item 29 (1996) contains an excellent discussion of the issues in C++. Martin Fowler's Refactoring (1999) describes the specific steps needed to convert from shallow copies to deep copies and from deep copies to shallow copies. (Fowler calls them reference objects and value objects.)