Conclusion
In many ways, the Open/Closed Principle is at the heart of object-oriented design. Conformance to this principle is what yields the greatest benefits claimed for object-oriented technology: flexibility, reusability, and maintainability. Yet conformance to this principle is not achieved simply by using an object-oriented programming language. Nor is it a good idea to apply rampant abstraction to every part of the application. Rather, it requires a dedication on the part of the developers to apply abstraction only to those
Bibliography
[Jacobson92] Ivar Jacobson, Patrik Jonsson, Magnus Christerson, and Gunnar Övergaard, Object-Oriented Software Engineering: A Use Case Driven Approach, Addison-Wesley, 1992.
[Meyer97] Bertrand Meyer, Object-Oriented Software Construction, 2nd ed., Prentice Hall, 1997.
Chapter 10. The Liskov Substitution Principle (LSP)
Jennifer M. Kohnke
The primary mechanisms behind the Open/Closed Principle are abstraction and polymorphism. In statically typed languages, such as C#, one of the key mechanisms that supports abstraction and polymorphism is inheritance. It is by using inheritance that we can create derived classes that implement abstract
What are the design rules that
Barbara Liskov wrote this principle in 1988. [1] She said:
The importance of this principle becomes obvious when you consider the consequences of violating it. Presume that we have a function f that takes as its argument a reference to some base class B. Presume also that when passed to f in the guise of B, some derivative D of B causes f to misbehave. Then D violates LSP. The authors of f will be tempted to put in some kind of test for D so that f can behave properly when a D is passed to it. This test violates OCP because now f is not closed to all the various derivatives of B. Such tests are a code smell that is the result of inexperienced developers or, what's
Violations of LSP
A Simple Example
Violating LSP often results in the use of runtime type checking in a manner that grossly
Listing 10-1. A violation of LSP
struct Point {double x, y;}

public enum ShapeType {square, circle};

public class Shape
{
    private ShapeType type;

    public Shape(ShapeType t) {type = t;}

    public static void DrawShape(Shape s)
    {
        if(s.type == ShapeType.square)
            (s as Square).Draw();
        else if(s.type == ShapeType.circle)
            (s as Circle).Draw();
    }
}

public class Circle : Shape
{
    private Point center;
    private double radius;

    public Circle() : base(ShapeType.circle) {}
    public void Draw() {/* draws the circle */}
}

public class Square : Shape
{
    private Point topLeft;
    private double side;

    public Square() : base(ShapeType.square) {}
    public void Draw() {/* draws the square */}
}
Clearly, the DrawShape function in Listing 10-1 violates OCP. It must know about every possible derivative of the Shape class, and it must be changed whenever new derivatives of Shape are created. Indeed, many rightly view the structure of this function as anathema to good design. What would drive a programmer to write a function like this?
Consider Joe the Engineer. Joe has studied object-oriented technology and has concluded that the overhead of polymorphism is too high to pay. [2] Therefore, he defined class Shape without any abstract functions. The classes Square and Circle derive from Shape and have Draw() functions, but they don't override a function in Shape. Since Circle and Square are not substitutable for Shape, DrawShape must inspect its incoming Shape, determine its type, and then call the appropriate Draw function.
[2] On a reasonably fast machine, that overhead is on the order of 1ns per method invocation, so it's difficult to see Joe's point.
The fact that Square and Circle cannot be substituted for Shape is a violation of LSP. This violation forced the violation of OCP by DrawShape. Thus, a violation of LSP is a latent violation of OCP.
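For contrast, here is a minimal sketch of what the hierarchy might look like had Joe been willing to pay for polymorphism. It is not a listing from the original code. Having Draw return a string rather than render anything, and the later-added Triangle class, are assumptions made only so that the sketch is small and checkable.

```csharp
using System;

public abstract class Shape
{
    // Every derivative must supply its own drawing behavior.
    // Returning a string instead of rendering is an assumption made for this sketch.
    public abstract string Draw();

    // No type code and no casts: DrawShape is closed against new derivatives of Shape.
    public static string DrawShape(Shape s)
    {
        return s.Draw();
    }
}

public class Circle : Shape
{
    public override string Draw() { return "circle"; }
}

public class Square : Shape
{
    public override string Draw() { return "square"; }
}

// Hypothetical derivative added later, without touching Shape or DrawShape.
public class Triangle : Shape
{
    public override string Draw() { return "triangle"; }
}
```

Because every derivative is now substitutable for Shape, DrawShape no longer needs to know which derivatives exist.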
Of course there are other, far more subtle ways of violating LSP. Consider an application that uses the Rectangle class as described in Listing 10-2.
public class Rectangle
{
    private Point topLeft;
    private double width;
    private double height;

    public double Width
    {
        get { return width; }
        set { width = value; }
    }

    public double Height
    {
        get { return height; }
        set { height = value; }
    }
}
Imagine that this application works well and is installed in many sites. As is the case with all successful software, its users demand changes from time to time. One day, the users demand the ability to manipulate squares in addition to rectangles.
It is often said that inheritance is the IS-A relationship. In other words, if a new kind of object can be said to fulfill the IS-A relationship with an old kind of object, the class of the new object should be derived from the class of the old object.
For all normal intents and purposes, a square is a rectangle. Thus, it is logical to view the Square class as being derived from the Rectangle class. (See Figure 10-1.)
This use of the IS-A relationship is sometimes thought to be one of the fundamental techniques of object-oriented analysis, a
Our first clue that something has gone wrong might be the fact that a Square does not need both height and width member
Let's assume, for the moment, that we are not very
public new double Width
{
    set
    {
        base.Width = value;
        base.Height = value;
    }
}

public new double Height
{
    set
    {
        base.Height = value;
        base.Width = value;
    }
}
Now, when someone sets the width of a Square object, its height will change correspondingly. And when someone sets the height, its width will change with it. Thus, the invariants (those properties that must always be true regardless of state) of the Square
Square s = new Square();
s.Width = 1;  // Fortunately sets the height to 1 too.
s.Height = 2; // Sets width and height to 2. Good thing.
But consider the following function:
void f(Rectangle r)
{
    r.Width = 32; // calls the Rectangle.Width setter
}
If we pass a reference to a Square object into this function, the Square object will be corrupted, because the height won't be changed. This is a clear violation of LSP. The f function does not work for derivatives of its arguments. The reason for the failure is that Width and Height were not declared virtual in Rectangle and are therefore not polymorphic.
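The corruption can be reproduced in a few lines. The following is a sketch rather than the original code: the get accessors on Square and the HidingDemo class are assumptions added so that the result can be read back and checked.

```csharp
using System;

public class Rectangle
{
    private double width;
    private double height;

    // Not virtual: calls are bound by the compile-time type of the reference.
    public double Width { get { return width; } set { width = value; } }
    public double Height { get { return height; } set { height = value; } }
}

public class Square : Rectangle
{
    // 'new' hides the base property; it does not override it.
    public new double Width
    {
        get { return base.Width; }
        set { base.Width = value; base.Height = value; }
    }

    public new double Height
    {
        get { return base.Height; }
        set { base.Height = value; base.Width = value; }
    }
}

// Hypothetical harness for the sketch.
public static class HidingDemo
{
    // r is typed as Rectangle, so this always calls the Rectangle.Width setter.
    public static void F(Rectangle r) { r.Width = 32; }

    public static string Run()
    {
        Square s = new Square();
        s.Width = 1;   // Square.Width: both sides become 1
        F(s);          // Rectangle.Width: only the width becomes 32
        return s.Width + " x " + s.Height;   // "32 x 1"
    }
}
```

After the call to F, the Square is 32 wide and 1 high, which is no longer a square at all.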
We can fix this easily by declaring the setter properties to be virtual. However, when the creation of a derived class causes us to make changes to the base class, it often implies that the design is faulty. Certainly, it violates OCP. We might counter this by saying that
Still, let's assume that we accept the argument and fix the classes. We wind up with the code in Listing 10-3.
public class Rectangle
{
    private Point topLeft;
    private double width;
    private double height;

    public virtual double Width
    {
        get { return width; }
        set { width = value; }
    }

    public virtual double Height
    {
        get { return height; }
        set { height = value; }
    }
}

public class Square : Rectangle
{
    public override double Width
    {
        set
        {
            base.Width = value;
            base.Height = value;
        }
    }

    public override double Height
    {
        set
        {
            base.Height = value;
            base.Width = value;
        }
    }
}
Square and Rectangle now appear to work. No matter what you do to a Square object, it will remain consistent with a mathematical square. And regardless of what you do to a Rectangle object, it will remain a mathematical rectangle. Moreover, you can pass a Square into a function that accepts a Rectangle, and the Square will still act like a square and will remain consistent.
Thus, we might conclude that the design is now self-consistent and correct. However, this conclusion would be amiss. A design that is self-consistent is not
void g(Rectangle r)
{
    r.Width = 5;
    r.Height = 4;
    if(r.Area() != 20)
        throw new Exception("Bad area!");
}
This function invokes the Width and Height
Clearly, it is reasonable to assume that changing the width of a rectangle does not affect its height! However, not all objects that can be passed as Rectangles
Function g shows that there exist functions that take Rectangle objects but that cannot
One might contend that the problem lay in function g, that the author had no right to make the assumption that width and height were independent. The author of g would
It is the author of Square who has violated the invariant. Interestingly enough, the author of Square did not
The Liskov Substitution Principle leads us to a very important conclusion:
A model,
When considering whether a particular design is appropriate, one cannot simply view the solution in isolation. One must view it in terms of the reasonable assumptions made by the users of that design. [3]
[3] Often, you will find that those reasonable assumptions are asserted in the unit tests written for the base class. This is yet another good reason to practice test-driven development.
Who
So, what
Not as far as the author of g is concerned! A square might be a rectangle, but from g's point of view, a Square object is definitely not a Rectangle object. Why? Because the behavior of a Square object is not consistent with g's expectation of the behavior of a Rectangle object. Behaviorally, a Square is not a Rectangle, and it is behavior that software is really all about. LSP makes it clear that in OOD, the IS-A relationship pertains to behavior that can be reasonably assumed and that clients depend on.
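The behavioral mismatch is easy to observe. The sketch below combines the virtual properties of Listing 10-3 with function g. The Area method and the ClientDemo wrapper are assumptions: the listings call Area() but never show it, so it is taken here to be width times height.

```csharp
using System;

public class Rectangle
{
    private double width;
    private double height;

    public virtual double Width { get { return width; } set { width = value; } }
    public virtual double Height { get { return height; } set { height = value; } }

    // Assumed for this sketch; the listings do not show Area().
    public double Area() { return Width * Height; }
}

public class Square : Rectangle
{
    // Overriding only the set accessor is legal; the get accessor is inherited.
    public override double Width { set { base.Width = value; base.Height = value; } }
    public override double Height { set { base.Height = value; base.Width = value; } }
}

// Hypothetical wrapper so that the outcome of g can be examined.
public static class ClientDemo
{
    public static void G(Rectangle r)
    {
        r.Width = 5;
        r.Height = 4;
        if (r.Area() != 20)
            throw new Exception("Bad area!");
    }

    // True if G accepts r, false if G throws.
    public static bool Passes(Rectangle r)
    {
        try { G(r); return true; }
        catch (Exception) { return false; }
    }
}
```

A plain Rectangle passes. A Square ends up 4 by 4, reports an area of 16, and makes g throw, even though the Square itself is perfectly self-consistent.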
Many developers may feel uncomfortable with the notion of behavior that is "reasonably assumed." How do you know what your clients will really expect? There is a technique for making those reasonable assumptions explicit and thereby enforcing LSP. The technique is called design by contract (DBC) and is expounded by Bertrand Meyer. [4]
[4] [Meyer97], p. 331
Using DBC, the author of a class explicitly states the contract for that class. The contract informs the author of any client code of the behaviors that can be relied on. The contract is specified by declaring preconditions and postconditions for each method. The preconditions must be true in order for the method to execute. On completion, the method
We can view the postcondition of the Rectangle.Width setter as follows:
assert((width == w) && (height == old.height));
where old is the value of the Rectangle before Width is called. Now the rule for preconditions and postconditions of derivatives, as stated by Meyer, is: "A routine
[5] [Meyer97], p. 573
In other words, when using an object through its base class interface, the
Clearly, the postcondition of the Square.Width setter is weaker [6] than the postcondition of the Rectangle.Width setter, since it does not enforce the constraint (height == old.height). Thus, the Width property of Square violates the contract of the base class.
[6] The term weaker can be confusing. X is weaker than Y if X does not enforce all the constraints of Y. It does not matter how many new constraints X enforces.
Certain languages, such as Eiffel, have direct support for preconditions and postconditions. You can declare them and have the runtime system verify them for you. C# has no such feature. In C#, we must manually consider the preconditions and postconditions of each method and make sure that Meyer's rule is not violated. Moreover, it can be very helpful to document these preconditions and postconditions in the comments for each method.
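One way to do this by hand, sketched below, is to keep the public setter nonvirtual, let it check the contract, and route the actual work through a protected virtual method. This is only an illustration of manual checking, not a feature of C# and not code from the original application. The names SetWidthCore and SetBoth are hypothetical, and the precondition that widths be non-negative is assumed purely for the example.

```csharp
using System;

public class Rectangle
{
    private double width;
    private double height;

    public double Height
    {
        get { return height; }
        set { height = value; }
    }

    // Nonvirtual: derivatives cannot bypass the contract checks.
    public double Width
    {
        get { return width; }
        set
        {
            // Precondition (assumed for illustration): widths are non-negative.
            if (value < 0)
                throw new ArgumentOutOfRangeException("value");

            double oldHeight = height;
            SetWidthCore(value);

            // Postcondition from the text: (width == w) && (height == old.height)
            if (width != value || height != oldHeight)
                throw new InvalidOperationException("Width postcondition violated");
        }
    }

    // Derivatives customize behavior here; the checks above still run.
    protected virtual void SetWidthCore(double w) { width = w; }

    // Hypothetical helper that lets a derivative change both dimensions.
    protected void SetBoth(double side) { width = side; height = side; }
}

public class Square : Rectangle
{
    // Changes the height as well, which weakens the base class postcondition.
    protected override void SetWidthCore(double w) { SetBoth(w); }
}
```

With this arrangement, setting the Width of a Square to 3 fails immediately inside the setter, at the point where the contract is broken, rather than at some distant client that happened to rely on it.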
Contracts can also be specified by writing unit tests. By thoroughly testing the behavior of a class, the unit tests make the behavior of the class clear. Authors of client code will want to review the unit tests in order to know what to reasonably assume about the classes they are using.
Enough of squares and rectangles! Does LSP have a
In the early 1990s I purchased a third-party class library that had some container classes. [7] The containers were
[7] The language was C++, long before the standard container library was available.
The constructor for BoundedSet specified the maximum number of elements the set could hold. The space for these elements was preallocated as an array within the BoundedSet. Thus, if the creation of the BoundedSet succeeded, we could be sure that it had enough memory. Since it was based on an array, it was very fast. There were no memory
UnboundedSet, on the other hand, had no declared limit on the number of elements it could hold. So long as heap memory was available, the UnboundedSet would continue to accept elements. Therefore, it was very flexible. It was also economical in that it used only the memory necessary to hold the elements that it currently contained. It was also slow, because it had to allocate and deallocate memory as part of its normal operation. Finally, a danger was that its normal operation could exhaust the heap.
I was
I created an interface, called Set, that presented abstract Add, Delete, and IsMember functions, as shown in Listing 10-4. [8] This structure unified the unbounded and bounded varieties of the two third-party sets and allowed them to be accessed through a common interface. Thus, some client could accept an argument of type Set and would not care whether the actual Set it worked on was of the bounded or unbounded variety. (See the PrintSet function in Listing 10-5.)
[8] The original code has been translated into C# here to make it easier for .NET programmers to understand.
public interface Set
{
    // Interface members are implicitly public, so no access modifier is written.
    void Add(object o);
    void Delete(object o);
    bool IsMember(object o);
}
void PrintSet(Set s)
{
    // Assumes that Set can also be enumerated, for example by implementing IEnumerable.
    foreach(object o in s)
        Console.WriteLine(o.ToString());
}
It is a big advantage not to have to know or care what kind of Set you are using. It means that the programmer can decide which kind of Set is needed in each particular instance, and none of the client functions will be affected by that decision. The programmer may choose an UnboundedSet when memory is tight and speed is not critical or may choose a BoundedSet when memory is plentiful and speed is critical. The client functions will manipulate these objects through the interface of the base class Set and will therefore not know or care which kind of Set they are using.
I wanted to add a PersistentSet to this hierarchy. A persistent set is one that can be written out to a stream and then read back in later, possibly by a different application. Unfortunately, the only third-party container that I had access to that also
Note that PersistentSet contains an instance of the third-party persistent set, to which it delegates all its
On the surface, this might look all right. However, there is an
When a client is adding members to the base class Set, that client cannot be sure whether the Set might be a PersistentSet. Thus, the client has no way of knowing whether the elements it adds ought to be derived from PersistentObject.
Consider the code for PersistentSet.Add() in Listing 10-6. This code makes it clear that if any client
void Add(object o)
{
    PersistentObject p = (PersistentObject)o;
    thirdPartyPersistentSet.Add(p);
}
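To see the cast fail, consider the following sketch. The third-party classes are not available here, so ThirdPartyPersistentSet and PersistentObject are hypothetical stand-ins with guessed members. The SetClient class and its method names are likewise assumptions, and the Set interface is repeated only so that the sketch is self-contained.

```csharp
using System;
using System.Collections.Generic;

public interface Set
{
    void Add(object o);
    void Delete(object o);
    bool IsMember(object o);
}

// Hypothetical stand-ins for the third-party classes described in the text.
public abstract class PersistentObject { }

public class ThirdPartyPersistentSet
{
    private readonly List<PersistentObject> items = new List<PersistentObject>();

    public void Add(PersistentObject p) { if (!items.Contains(p)) items.Add(p); }
    public void Delete(PersistentObject p) { items.Remove(p); }
    public bool IsMember(PersistentObject p) { return items.Contains(p); }
}

public class PersistentSet : Set
{
    private readonly ThirdPartyPersistentSet thirdPartyPersistentSet =
        new ThirdPartyPersistentSet();

    public void Add(object o)
    {
        // Throws InvalidCastException unless o derives from PersistentObject.
        PersistentObject p = (PersistentObject)o;
        thirdPartyPersistentSet.Add(p);
    }

    public void Delete(object o)
    {
        thirdPartyPersistentSet.Delete((PersistentObject)o);
    }

    public bool IsMember(object o)
    {
        PersistentObject p = o as PersistentObject;
        return p != null && thirdPartyPersistentSet.IsMember(p);
    }
}

// Hypothetical client written against Set alone.
public static class SetClient
{
    // Nothing in the Set interface says that a string is an unacceptable element.
    public static void Remember(Set s, string name) { s.Add(name); }

    // True if Remember fails with a cast error for the given Set.
    public static bool FailsOn(Set s)
    {
        try { Remember(s, "Bob"); return false; }
        catch (InvalidCastException) { return true; }
    }
}
```

Remember compiles against Set and is correct for any set that accepts arbitrary objects, yet it throws as soon as it is handed a PersistentSet.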
Is this a problem? Certainly. Functions that never before failed when passed a derivative of Set may now cause runtime errors when passed a PersistentSet. Debugging this kind of problem is relatively difficult, since the runtime error occurs very far away from the logic flaw. The logic flaw is the decision either to pass a PersistentSet into a function or to add an object to the PersistentSet that is not derived from PersistentObject. In either case, the decision might be millions of instructions away from the invocation of the Add method. Finding it can be a bear. Fixing it can be
How do we solve this problem? Several years ago, I
This module was responsible for reading and writing all the containers to and from the persistent store. When a container needed to be written, its contents were
This solution may seem overly
Did this solution work? Not really. The convention was violated in several
How would I solve this now? I would