Item 28: Avoid returning "handles" to object internals.

Suppose you're working on an application involving rectangles. Each rectangle can be represented by its upper left corner and its lower right corner. To keep a Rectangle object small, you might decide that the points defining its extent shouldn't be stored in the Rectangle itself, but rather in an auxiliary struct that the Rectangle points to:

    class Point {                      // class for representing points
    public:
      Point(int x, int y);
      ...
      void setX(int newVal);
      void setY(int newVal);
      ...
    };

    struct RectData {                  // Point data for a Rectangle
      Point ulhc;                      // ulhc = "upper left-hand corner"
      Point lrhc;                      // lrhc = "lower right-hand corner"
    };

    class Rectangle {
      ...
    private:
      std::tr1::shared_ptr<RectData> pData;   // see Item 13 for info on
    };                                        // tr1::shared_ptr

Because Rectangle clients will need to be able to determine the extent of a Rectangle, the class provides the upperLeft and lowerRight functions. However, Point is a user-defined type, so, mindful of Item 20's observation that passing user-defined types by reference is typically more efficient than passing them by value, these functions return references to the underlying Point objects:

    class Rectangle {
    public:
      ...
      Point& upperLeft() const { return pData->ulhc; }
      Point& lowerRight() const { return pData->lrhc; }
      ...
    };

This design will compile, but it's wrong. In fact, it's self-contradictory. On the one hand, upperLeft and lowerRight are declared to be const member functions, because they are designed only to offer clients a way to learn what the Rectangle's points are, not to let clients modify the Rectangle (see Item 3). On the other hand, both functions return references to private internal data, references that callers can use to modify that internal data! For example:

    Point coord1(0, 0);
    Point coord2(100, 100);

    const Rectangle rec(coord1, coord2);   // rec is a const rectangle from
                                           // (0, 0) to (100, 100)

    rec.upperLeft().setX(50);              // now rec goes from
                                           // (50, 0) to (100, 100)!
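If you'd like to see the contradiction run rather than take it on faith, here is a minimal sketch. It makes several assumptions that are not in the text: std::shared_ptr and std::make_shared (C++11) stand in for tr1::shared_ptr, the Rectangle constructor is a guess at the elided one, and the x() and y() getters on Point as well as the modifyThroughHandle function are hypothetical additions that exist only so the result can be observed.

```cpp
#include <cassert>
#include <memory>

// Point with the setters from the text. The x()/y() getters are
// hypothetical additions so the example can observe the state.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    void setX(int newVal) { x_ = newVal; }
    void setY(int newVal) { y_ = newVal; }
    int x() const { return x_; }
    int y() const { return y_; }
private:
    int x_;
    int y_;
};

struct RectData {
    Point ulhc;   // upper left-hand corner
    Point lrhc;   // lower right-hand corner
};

class Rectangle {
public:
    // Assumed constructor; the text elides the real one.
    Rectangle(const Point& ul, const Point& lr)
        : pData(std::make_shared<RectData>(RectData{ul, lr})) {}

    // The flawed signatures: const member functions that hand out
    // non-const references to data stored outside the object.
    Point& upperLeft() const { return pData->ulhc; }
    Point& lowerRight() const { return pData->lrhc; }

private:
    std::shared_ptr<RectData> pData;   // stand-in for tr1::shared_ptr
};

// Hypothetical helper: writes through the handle returned by a const
// Rectangle, then reports the x coordinate of its upper left corner.
int modifyThroughHandle() {
    const Rectangle rec(Point(0, 0), Point(100, 100));
    rec.upperLeft().setX(50);      // compiles, although rec is const
    return rec.upperLeft().x();    // the const object has changed
}
```

Calling modifyThroughHandle() from a test or from main yields 50, not the 0 the rectangle was constructed with.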
Here, notice how the caller of upperLeft is able to use the returned reference to one of rec's internal Point data members to modify that member. But rec is supposed to be const!

This immediately leads to two lessons. First, a data member is only as encapsulated as the most accessible function returning a reference to it. In this case, though ulhc and lrhc are declared private, they're effectively public, because the public functions upperLeft and lowerRight return references to them. Second, if a const member function returns a reference to data associated with an object that is stored outside the object itself, the caller of the function can modify that data. (This is just a fallout of the limitations of bitwise constness; see Item 3.)

Everything we've done has involved member functions returning references, but if they returned pointers or iterators, the same problems would exist for the same reasons. References, pointers, and iterators are all handles (ways to get at other objects), and returning a handle to an object's internals always runs the risk of compromising an object's encapsulation. As we've seen, it can also lead to const member functions that allow an object's state to be modified.

We generally think of an object's "internals" as its data members, but member functions not accessible to the general public (i.e., that are protected or private) are part of an object's internals, too. As such, it's important not to return handles to them. This means you should never have a member function return a pointer to a less accessible member function. If you do, the effective access level will be that of the more accessible function, because clients will be able to get a pointer to the less accessible function, then call that function through the pointer.

Functions that return pointers to member functions are uncommon, however, so let's turn our attention back to the Rectangle class and its upperLeft and lowerRight member functions.
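Before going back to Rectangle, the point about member function pointers is easy to sketch. Everything in the example below is hypothetical and chosen only for illustration: the Account class, its private reset function, the public resetHandle function that leaks a pointer to it, and the callPrivateThroughHandle helper.

```cpp
#include <cassert>

// Hypothetical class: reset() is private, but a public function
// returns a pointer to it.
class Account {
public:
    typedef void (Account::*Action)();   // pointer-to-member-function type

    Account() : balance_(100) {}
    int balance() const { return balance_; }

    // The leak: a public function handing out a handle to a private one.
    static Action resetHandle() { return &Account::reset; }

private:
    void reset() { balance_ = 0; }       // not meant for clients
    int balance_;
};

// Hypothetical client code: it cannot name Account::reset directly,
// yet it can still invoke it through the returned pointer.
int callPrivateThroughHandle() {
    Account acct;
    Account::Action action = Account::resetHandle();
    (acct.*action)();                    // runs the private reset()
    return acct.balance();
}
```

Access control is checked where a name is used, not where a pointer is called, so callPrivateThroughHandle() returns 0 even though writing acct.reset() in the same place would not compile.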
Both of the problems we've identified for those functions can be eliminated by simply applying const to their return types:

    class Rectangle {
    public:
      ...
      const Point& upperLeft() const { return pData->ulhc; }
      const Point& lowerRight() const { return pData->lrhc; }
      ...
    };

With this altered design, clients can read the Points defining a rectangle, but they can't write them. This means that declaring upperLeft and lowerRight as const is no longer a lie, because they no longer allow callers to modify the state of the object. As for the encapsulation problem, we always intended to let clients see the Points making up a Rectangle, so this is a deliberate relaxation of encapsulation. More importantly, it's a limited relaxation: only read access is being granted by these functions. Write access is still prohibited.

Even so, upperLeft and lowerRight are still returning handles to an object's internals, and that can be problematic in other ways. In particular, it can lead to dangling handles: handles that refer to parts of objects that don't exist any longer. The most common sources of such disappearing objects are function return values. For example, consider a function that returns the bounding box for a GUI object in the form of a rectangle:

    class GUIObject { ... };

    const Rectangle                         // returns a rectangle by
      boundingBox(const GUIObject& obj);    // value; see Item 3 for why
                                            // return type is const

Now consider how a client might use this function:

    GUIObject *pgo;                         // make pgo point to
    ...                                     // some GUIObject

    const Point *pUpperLeft =               // get a ptr to the upper
      &(boundingBox(*pgo).upperLeft());     // left point of its
                                            // bounding box

The call to boundingBox will return a new, temporary Rectangle object. That object doesn't have a name, so let's call it temp. upperLeft will then be called on temp, and that call will return a reference to an internal part of temp, in particular, to one of the Points making it up. pUpperLeft will then point to that Point object.
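The statement just described can be traced with a small instrumented sketch, which also counts destructor calls so you can see what has happened to temp's data once the statement is over. Treat it as an illustration under assumptions, not as the real classes: std::shared_ptr (C++11) replaces tr1::shared_ptr, the constructors, the x() getter, the body of boundingBox, the g_rectDataDestroyed counter, and the two helper functions are all hypothetical. The pointer is never dereferenced, because doing so would be undefined behavior.

```cpp
#include <cassert>
#include <memory>

class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }          // hypothetical getter
private:
    int x_;
    int y_;
};

static int g_rectDataDestroyed = 0;       // hypothetical instrumentation

struct RectData {
    RectData(const Point& ul, const Point& lr) : ulhc(ul), lrhc(lr) {}
    ~RectData() { ++g_rectDataDestroyed; }   // record each destruction
    Point ulhc;
    Point lrhc;
};

class Rectangle {
public:
    Rectangle(const Point& ul, const Point& lr)   // assumed constructor
        : pData(std::make_shared<RectData>(ul, lr)) {}
    const Point& upperLeft() const { return pData->ulhc; }
    const Point& lowerRight() const { return pData->lrhc; }
private:
    std::shared_ptr<RectData> pData;
};

class GUIObject {};

// Stand-in body: the text only declares boundingBox.
const Rectangle boundingBox(const GUIObject&) {
    return Rectangle(Point(0, 0), Point(100, 100));
}

// Counts how many RectData objects were destroyed by the time the
// statement that produced pUpperLeft has finished.
int destroyedAfterStatement() {
    GUIObject go;
    const int before = g_rectDataDestroyed;
    const Point* pUpperLeft = &(boundingBox(go).upperLeft());
    (void)pUpperLeft;                     // never dereferenced here
    return g_rectDataDestroyed - before;
}

// One safer alternative: copy the Point while the temporary is alive.
int copyWhileAlive() {
    GUIObject go;
    Point upperLeft = boundingBox(go).upperLeft();   // copy, not a handle
    return upperLeft.x();
}
```

destroyedAfterStatement() returns 1: the only RectData the pointer could refer to is gone as soon as the statement ends. copyWhileAlive() returns 0, the x coordinate it copied before the temporary was destroyed.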
So far, so good, but we're not done yet, because at the end of the statement, boundingBox's return value temp will be destroyed, and that will indirectly lead to the destruction of temp's Points. That, in turn, will leave pUpperLeft pointing to an object that no longer exists; pUpperLeft will dangle by the end of the statement that created it!

This is why any function that returns a handle to an internal part of the object is dangerous. It doesn't matter whether the handle is a pointer, a reference, or an iterator. It doesn't matter whether it's qualified with const. It doesn't matter whether the member function returning the handle is itself const. All that matters is that a handle is being returned, because once that's being done, you run the risk that the handle will outlive the object it refers to.

This doesn't mean that you should never have a member function that returns a handle. Sometimes you have to. For example, operator[] allows you to pluck individual elements out of strings and vectors, and these operator[]s work by returning references to the data in the containers (see Item 3), data that is destroyed when the containers themselves are. Still, such functions are the exception, not the rule.

Things to Remember