3.6 The Proliferation of Classes Problem


We will see many heuristics in this book that discuss trade-offs and improvements of one design over another. Many of these are small adjustments to a design and are local in nature; a single violation rarely has major ramifications for the entire application. Here we examine a group of 10 heuristics that focus on a particularly nasty pitfall in the design of object-oriented software, known as the proliferation of classes problem. At a conference some years back, a speaker made the interesting point that there is no way to get spaghetti code in an object-oriented system; you get ravioli code instead.

Consider the solution to our fictitious problem consisting of f1() through f5(). In our object-oriented solution, we developed a design consisting of D1 and D2. Imagine the problems if our solution to this tiny problem had 18 classes, or 180 classes. This is the notion of class proliferation. Our extensibility problem is no longer, "I need to change this data, now whom do I need to tell? OOPS!" In the object-oriented paradigm, the problem has been transformed into, "I want to add this feature to my system; which 54 of the 16,786 classes in my system need to be modified? OOPS!" You have the same maintenance problem; it just manifests itself differently.

Of the 10 known places where proliferation occurs, three are intuitively obvious at design time because they lead to an exponential explosion in the number of classes. We will examine all 10 within the area of the paradigm in which they occur (many occur through the inheritance relationship). For those wishing to look ahead, the heuristics in question are 2.11, 3.7, 3.8, 3.9, 3.10, 4.9, 5.14, 5.15, 5.16, and 5.18. Three of these heuristics are discussed below.

Heuristic 3.7

Eliminate irrelevant classes from your design.

An irrelevant class is one that has no meaningful behavior in the domain of your system. Such classes are usually detected by looking for classes that have no operations besides set, get, and print type functions. Sets, gets, and prints are not counted as meaningful behavior because all too often they operate solely on the descriptive attributes of a system. The fact that a car will give you its color field is generally not interesting behavior in the domain of a system. There are notable exceptions to using get and set operations in the detection of irrelevant classes. Sensors and transducers often have meaningful get and set operations: getting is the behavior of sensors, and setting is the behavior of transducers. When we discuss eliminating irrelevant classes, we do not necessarily remove the information from our design. Typically, the class is demoted to an attribute.
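The demotion can be sketched in a few lines of Python. This is only an illustration under assumed names: the Color and Car classes and their fields are hypothetical and do not come from any design discussed here.

```python
from dataclasses import dataclass


# Before: a class whose public interface is nothing but get, set, and print.
# (Color is a hypothetical example class.)
class Color:
    def __init__(self, name: str) -> None:
        self._name = name

    def get_name(self) -> str:
        return self._name

    def set_name(self, name: str) -> None:
        self._name = name

    def print_name(self) -> None:
        print(self._name)


# After: the information is kept, but demoted to an attribute of the
# class that it describes. (Car is also hypothetical.)
@dataclass
class Car:
    model: str
    color: str  # formerly the Color class


car = Car(model="sedan", color="red")
```

Nothing is lost in the second form: the color is still recorded, but the design carries one class fewer.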

Heuristic 3.8

Eliminate classes that are outside the system.

This heuristic is really a special case of the previous heuristic. If a class is outside the system, it is irrelevant with respect to the given domain. Classes outside the system are not always easy to detect. During successive iterations of design, it eventually becomes clear that some classes do not require any methods to be written for them. These are classes that are outside of the system. The hallmark of such classes is an abstraction that sends messages into the system domain but does not receive message sends from other classes in the domain. I have seen this to be a problem in three case studies, including the following:

A company was building a product registration system for processing consumer-purchased equipment such as blenders, toasters, televisions, etc. A question arose as to whether the blender should be a class. One group argued that it should be, since blenders do have methods like whip, chop, puree, liquify, etc. If blender is a class, what about the person who fills out the registration card, the registration card itself, and the information on the registration card? While all of these abstractions have behavior, only the data on the card has behavior in this domain. Our system should not care if someone filled out the card or if a squirrel arrived carrying it in its mouth. The only interest in the domain of the system is that there exists product registration information, and we somehow received it. It is important to note that although the blender is not a class in this domain, it will most likely appear as the value of an attribute in the product registration information that will be modeled as a card.
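A minimal Python sketch of that outcome follows. The class name ProductRegistration, its fields, and the is_complete operation are assumptions made for illustration; the case study does not specify them.

```python
from dataclasses import dataclass


@dataclass
class ProductRegistration:
    """The information on the registration card (hypothetical fields)."""

    product_type: str    # "blender" is only a value here, not a class
    serial_number: str
    purchaser_name: str  # the person filling out the card is also just data

    def is_complete(self) -> bool:
        # Assumed domain behavior: a registration can be processed only
        # when every field has been filled in.
        return all([self.product_type, self.serial_number, self.purchaser_name])


registration = ProductRegistration("blender", "SN-001", "A. Customer")
```

No Blender, Person, or squirrel class appears; the system models only what it actually receives and processes.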

It is important not to laugh at any pitfall discussed in these designs, lest you tempt the fates and bring the same problem to your design.

Another company I dealt with listened to my discussion of the pitfall that the first company ran into, and they stated they would never fall into such a design flaw. The second company collected large quantities of automobile engine test data from specialized hardware including various dynamometers. They were designing a report generation system where a user would describe the report layout and the number of test points he or she wanted to see. In their initial design, they began discussing operations on the dynamometer. Dynamometers clearly have a well-defined public interface; unfortunately, it is not used in their domain. Like the first company, they chose to model a class outside their system. The collected data was interesting in their domain and should be modeled as a class, but the method of collection is uninteresting to report generators. They do not care if a squirrel is typing in data behind the computer or if fancy million-dollar machines are collecting it. They simply report what they have. If a process control system were being designed to run the dynamometer, then clearly the dynamometer should be modeled as a class. Watch out for the seduction of physical devices that are outside the system. Of all the irrelevant classes, they are the most frequently modeled.

The third case of classes outside the system occurred in the design of the automatic teller machine. This has become something of a standard problem for design textbooks to model. If we consider the top-level classes one needs to model in order to facilitate a user withdrawing and depositing money, we might consider the ATM itself, the bank it will need to talk to, and the customer. Clearly, there are some other classes like deposit slots, keypads, display screens, etc. For now let us assume they are contained somewhere in the top-level classes (e.g., the ATM). If you claim the customer is a class, then a good question to ask is, "What does the customer do in your system?" A typical answer might be, "The customer sends a message to the ATM asking the ATM to withdraw $100." It turns out that sending messages is not meaningful behavior in a given domain. A class must receive messages, and therefore define messages/methods, in order to be meaningful. What message does the ATM send to the customer? If the answer is none, then the customer is a class outside the system and should not be modeled. If an appropriate answer can be found, then by all means model the customer.
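The distinction between sending and receiving messages can be made concrete with a small Python sketch. Everything here is assumed for illustration: the method names withdraw and debit, the account number, and the opening balance are hypothetical.

```python
class Bank:
    """Receives messages from the ATM, so it defines methods."""

    def __init__(self) -> None:
        # Hypothetical account data for the sketch.
        self._balances = {"12345": 500}

    def debit(self, account_id: str, amount: int) -> bool:
        balance = self._balances.get(account_id, 0)
        if amount > balance:
            return False
        self._balances[account_id] = balance - amount
        return True


class ATM:
    """Receives messages from outside the system, so it defines methods."""

    def __init__(self, bank: Bank) -> None:
        self._bank = bank

    def withdraw(self, account_id: str, amount: int) -> bool:
        return self._bank.debit(account_id, amount)


# The customer appears only as the sender of this message. Nothing in the
# system ever sends a message to a customer, so there is no Customer class.
atm = ATM(Bank())
dispensed = atm.withdraw("12345", 100)
```

If a later requirement had the ATM or the bank sending a message to the customer, that would be the point at which a Customer class earns its place.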

Heuristic 3.9

Do not turn an operation into a class. Be suspicious of any class whose name is a verb or is derived from a verb, especially those that have only one piece of meaningful behavior (i.e., do not count sets, gets, and prints). Ask if that piece of meaningful behavior needs to be migrated to some existing or undiscovered class.

Violations of this heuristic are a leading cause of proliferation of classes. Be on the lookout for any class that has only one piece of meaningful behavior. Ask, "Should this class be an operation on one of my other classes, or is it really capturing some key abstraction?" Watch out for designers who request, "I need a class that does ." The word "does" sounds too much like a behavior, not a request for an abstraction. Is it a violation of this heuristic or a slip of the language? It is certainly worth examining. Classes whose names are verbs, or are derived from verbs, are especially suspect. Newcomers to the object-oriented paradigm are especially prone to violations of this heuristic. These developers are accustomed to functions being the entity of decomposition and often capture each method in a single class. They have not yet made the leap to the larger granularity of abstractions found in the object-oriented paradigm.
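As a rough Python illustration of the migration this heuristic asks about, consider the pair of designs below. The names ComputeInvoiceTotal and Invoice are hypothetical and were chosen only to show the shape of the problem.

```python
# Suspect: the class name is derived from a verb, and the class has a
# single piece of meaningful behavior.
class ComputeInvoiceTotal:
    def run(self, line_amounts: list) -> int:
        return sum(line_amounts)


# Alternative: the behavior migrates to the abstraction that owns the data.
class Invoice:
    def __init__(self) -> None:
        self._line_amounts = []

    def add_line(self, amount: int) -> None:
        self._line_amounts.append(amount)

    def total(self) -> int:
        # Formerly the whole of the ComputeInvoiceTotal class.
        return sum(self._line_amounts)


invoice = Invoice()
invoice.add_line(10)
invoice.add_line(5)
```

The first form is a function wearing a class; the second keeps the operation next to the data it works on and removes a class from the design.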

A telecommunications project with which I was recently involved introduced the two classes shown in Figure 3.13 to the object-oriented design of their system. These two classes are really modeling operations on a class that is yet to be discovered. If we were to look at the public interfaces of these two classes, we would very likely find a single method. This method is the implementation of one piece of functionality that the undiscovered class requires. The better design is shown in Figure 3.14.

Figure 3.13. Classes which should be operations.


Figure 3.14. A better design for telephone services.


It is important to note that not all classes whose names are verbs need to be eliminated. In the context of a design course, some students who are asked to design an ATM system from a set of requirement specifications often produce the partial design contained in Figure 3.15.

Figure 3.15. Partial ATM solution.


The deposit, withdraw, and balance classes are good candidates for operations that have been accidentally turned into classes. The names of these classes are verbs, and they have only one meaningful operation in their public interface. Many students criticize the design and state that the ATM system should be designed as in Figure 3.16.

Figure 3.16. A better partial ATM solution?


If we consider only logical design information, then the second design is more desirable. Why have three additional classes with only one operation each when we can simply give the bank class three additional methods? A problem with the second design is encountered when a requirement is added stating that the bank is responsible for printing monthly statements for its customers. The bank customers do not want to simply see their balance for the month; they want an itemized list of their transactions. This implies that deposits, withdrawals, and balance requests are persistent; that is, they must be stored for future use by the system. The fact that these entities are persistent implies that they should be modeled by a class. For that reason we revert to the first design and model withdrawal, deposit, and balance as classes.
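The effect of the statement requirement can be sketched in Python as follows. This is one possible arrangement, not the design in the figures: the Transaction base class, the Account class, and all method names are assumptions introduced for the example.

```python
from datetime import date


class Transaction:
    """A persistent record of one request (hypothetical base class)."""

    label = "transaction"

    def __init__(self, when: date, amount: int = 0) -> None:
        self.when = when
        self.amount = amount

    def apply_to(self, balance: int) -> int:
        return balance  # by default a transaction leaves the balance alone

    def statement_line(self) -> str:
        return f"{self.when.isoformat()} {self.label} {self.amount}"


class Deposit(Transaction):
    label = "deposit"

    def apply_to(self, balance: int) -> int:
        return balance + self.amount


class Withdrawal(Transaction):
    label = "withdrawal"

    def apply_to(self, balance: int) -> int:
        return balance - self.amount


class BalanceInquiry(Transaction):
    label = "balance inquiry"


class Account:
    def __init__(self) -> None:
        self.balance = 0
        self.history = []  # stored transactions, kept for later statements

    def post(self, transaction: Transaction) -> None:
        self.balance = transaction.apply_to(self.balance)
        self.history.append(transaction)

    def monthly_statement(self, year: int, month: int) -> list:
        # Itemized list of the month's transactions, oldest first.
        return [t.statement_line() for t in self.history
                if t.when.year == year and t.when.month == month]


account = Account()
account.post(Deposit(date(2024, 1, 5), 200))
account.post(Withdrawal(date(2024, 1, 9), 50))
account.post(BalanceInquiry(date(2024, 2, 1)))
```

Had deposit and withdraw remained plain methods, nothing would be left to itemize once each call returned; storing each request as an object is what makes the statement possible.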
