Item 44: Factor parameter-independent code out of templates

Templates are a wonderful way to save time and avoid code replication. Instead of typing 20 similar classes, each with 15 member functions, you type one class template, and you let compilers instantiate the 20 specific classes and 300 functions you need. (Member functions of class templates are implicitly instantiated only when used, so you should get the full 300 member functions only if each is actually used.) Function templates are similarly appealing. Instead of writing many functions, you write one function template and let the compilers do the rest. Ain't technology grand?

Yes, well...sometimes. If you're not careful, using templates can lead to code bloat: binaries with replicated (or almost replicated) code, data, or both. The result can be source code that looks fit and trim, yet object code that's fat and flabby. Fat and flabby is rarely fashionable, so you need to know how to avoid such binary bombast.

Your primary tool has the imposing name commonality and variability analysis, but there's nothing imposing about the idea. Even if you've never written a template in your life, you do such analysis all the time.

When you're writing a function and you realize that some part of the function's implementation is essentially the same as another function's implementation, do you just replicate the code? Of course not. You factor the common code out of the two functions, put it into a third function, and have both of the other functions call the new one. That is, you analyze the two functions to find the parts that are common and those that vary, you move the common parts into a new function, and you keep the varying parts in the original functions. Similarly, if you're writing a class and you realize that some parts of the class are the same as parts of another class, you don't replicate the common parts. Instead, you move the common parts to a new class, then you use inheritance or composition (see Items 32, 38, and 39) to give the original classes access to the common features. The parts of the original classes that differ the varying parts remain in their original locations.

When writing templates, you do the same analysis, and you avoid replication in the same ways, but there's a twist. In non-template code, replication is explicit: you can see that there's duplication between two functions or two classes. In template code, replication is implicit: there's only one copy of the template source code, so you have to train yourself to sense the replication that may take place when a template is instantiated multiple times.

For example, suppose you'd like to write a template for fixed-size square matrices that, among other things, support matrix inversion.

 template<typename T,           // template for n x n matrices of          std::size_t n>        // objects of type T; see below for info class SquareMatrix {           // on the size_t parameter public:   ...   void invert();              // invert the matrix in place }; 

This template takes a type parameter, T, but it also takes a parameter of type size_t a non-type parameter. Non-type parameters are less common than type parameters, but they're completely legal, and, as in this example, they can be quite natural.

Now consider this code:

 SquareMatrix<double, 5> sm1; ... sm1.invert();                  // call SquareMatrix<double, 5>::invert SquareMatrix<double, 10> sm2; ... sm2.invert();                  // call SquareMatrix<double, 10>::invert 

Two copies of invert will be instantiated here. The functions won't be identical, because one will work on 5x5 matrices and one will work on 10 x 10 matrices, but other than the constants 5 and 10, the two functions will be the same. This is a classic way for template-induced code bloat to arise.

What would you do if you saw two functions that were character-for-character identical except for the use of 5 in one version and 10 in the other? Your instinct would be to create a version of the function that took a value as a parameter, then call the parameterized function with 5 or 10 instead of replicating the code. Your instinct serves you well! Here's a first pass at doing that for SquareMatrix:

 template<typename T>                   // size-independent base class for class SquareMatrixBase {               // square matrices protected:   ...   void invert(std::size_t matrixSize); // invert matrix of the given size   ... }; template<          typename T, std::size_t n> class SquareMatrix: private SquareMatrixBase<T> { private:   using SquareMatrixBase<T>::invert;   // avoid hiding base version of                                        // invert; see Item 33 public:   ...   void invert() { this->invert(n); }   // make inline call to base class };                                     // version of invert; see below                                        // for why "this->" is here 

As you can see, the parameterized version of invert is in a base class, SquareMatrixBase. Like SquareMatrix, SquareMatrixBase is a template, but unlike SquareMatrix, it's templatized only on the type of objects in the matrix, not on the size of the matrix. Hence, all matrices holding a given type of object will share a single SquareMatrixBase class. They will thus share a single copy of that class's version of invert.

SquareMatrixBase::invert is intended only to be a way for derived classes to avoid code replication, so it's protected instead of being public. The additional cost of calling it should be zero, because derived classes' inverts call the base class version using inline functions. (The inline is implicit see Item 30.) These functions use the "this->" notation, because otherwise, as Item 43 explains, function names in templatized base classes (such as SquareMatrixBase<T>) are hidden from derived classes. Notice also that the inheritance between SquareMatrix and SquareMatrixBase is private. This accurately reflects the fact that the reason for the base class is only to facilitate the derived classes' implementations, not to express a conceptual is-a relationship between SquareMatrix and SquareMatrixBase. (For information on private inheritance, see Item 39.)

So far, so good, but there's a sticky issue we haven't addressed yet. How does SquareMatrixBase::invert know what data to operate on? It knows the size of the matrix from its parameter, but how does it know where the data for a particular matrix is? Presumably only the derived class knows that. How does the derived class communicate that to the base class so that the base class can do the inversion?

One possibility would be to add another parameter to SquareMatrixBase::invert, perhaps a pointer to the beginning of a chunk of memory with the matrix's data in it. That would work, but in all likelihood, invert is not the only function in SquareMatrix that can be written in a size-independent manner and moved into SquareMatrixBase. If there are several such functions, all will need a way to find the memory holding the values in the matrix. We could add an extra parameter to all of them, but we'd be telling SquareMatrixBase the same information repeatedly. That seems wrong.

An alternative is to have SquareMatrixBase store a pointer to the memory for the matrix values. And as long as it's storing that, it might as well store the matrix size, too. The resulting design looks like this:

 template<typename T> class SquareMatrixBase { protected:   SquareMatrixBase(std::size_t n, T *pMem)     // store matrix size and a   : size(n), pData(pMem) {}                    // ptr to matrix values   void setDataPtr(T *ptr) { pData = ptr; }     // reassign pData   ... private:   std::size_t size;                            // size of matrix   T *pData;                                    // pointer to matrix values }; 

This lets derived classes decide how to allocate the memory. Some implementations might decide to store the matrix data right inside the SquareMatrix object:

 template<typename T, std::size_t n> class SquareMatrix: private SquareMatrixBase<T> { public:   SquareMatrix()                             // send matrix size and   : SquareMatrixBase<T>(n, data) {}          // data ptr to base class   ... private:   T data[n*n]; }; 

Objects of such types have no need for dynamic memory allocation, but the objects themselves could be very large. An alternative would be to put the data for each matrix on the heap:

 template<typename T, std::size_t n> class SquareMatrix: private SquareMatrixBase<T> { public:   SquareMatrix()                          // set base class data ptr to null,   : SquareMatrixBase<T>(n, 0),            // allocate memory for matrix     pData(new T[n*n])                     // values, save a ptr to the   { this->setDataPtr(pData.get()); }      // memory, and give a copy of it   ...                                     // to the base class private:   boost::scoped_array<T> pData;           // see Item 13 for info on };                                        // boost::scoped_array 

Regardless of where the data is stored, the key result from a bloat point of view is that now many maybe all of SquareMatrix's member functions can be simple inline calls to base class versions that are shared with all other matrices holding the same type of data, regardless of their size. At the same time, SquareMatrix objects of different sizes are distinct types, so even though, e.g., SquareMatrix<double, 5> and SquareMatrix<double, 10> objects use the same member functions in SquareMatrixBase<double>, there's no chance of passing a SquareMatrix<double, 5> object to a function expecting a SquareMatrix<double, 10>. Nice, no?

Nice, yes, but not free. The versions of invert with the matrix sizes hardwired into them are likely to generate better code than the shared version where the size is passed as a function parameter or is stored in the object. For example, in the size-specific versions, the sizes would be compile-time constants, hence eligible for such optimizations as constant propagation, including their being folded into the generated instructions as immediate operands. That can't be done in the size-independent version.

On the other hand, having only one version of invert for multiple matrix sizes decreases the size of the executable, and that could reduce the program's working set size and improve locality of reference in the instruction cache. Those things could make the program run faster, more than compensating for any lost optimizations in size-specific versions of invert. Which effect would dominate? The only way to know is to try it both ways and observe the behavior on your particular platform and on representative data sets.

Another efficiency consideration concerns the sizes of objects. If you're not careful, moving size-independent versions of functions up into a base class can increase the overall size of each object. For example, in the code I just showed, each SquareMatrix object has a pointer to its data in the SquareMatrixBase class, even though each derived class already has a way to get to the data. This increases the size of each SquareMatrix object by at least the size of a pointer. It's possible to modify the design so that these pointers are unnecessary, but, again, there are trade-offs. For example, having the base class store a protected pointer to the matrix data leads to the loss of encapsulation described in Item 22. It can also lead to resource management complications: if the base class stores a pointer to the matrix data, but that data may have been either dynamically allocated or physically stored inside the derived class object (as we saw), how will it be determined whether the pointer should be deleted? Such questions have answers, but the more sophisticated you try to be about them, the more complicated things become. At some point, a little code replication begins to look like a mercy.

This Item has discussed only bloat due to non-type template parameters, but type parameters can lead to bloat, too. For example, on many platforms, int and long have the same binary representation, so the member functions for, say, vector<int> and vector<long> would likely be identical the very definition of bloat. Some linkers will merge identical function implementations, but some will not, and that means that some templates instantiated on both int and long could cause code bloat in some environments. Similarly, on most platforms, all pointer types have the same binary representation, so templates holding pointer types (e.g., list<int*>, list<const int*>, list<SquareMatrix<long, 3>*>, etc.) should often be able to use a single underlying implementation for each member function. Typically, this means implementing member functions that work with strongly typed pointers (i.e., T* pointers) by having them call functions that work with untyped pointers (i.e., void* pointers). Some implementations of the standard C++ library do this for templates like vector, deque, and list. If you're concerned about code bloat arising in your templates, you'll probably want to develop templates that do the same thing.

Things to Remember

  • Templates generate multiple classes and multiple functions, so any template code not dependent on a template parameter causes bloat.

  • Bloat due to non-type template parameters can often be eliminated by replacing template parameters with function parameters or class data members.

  • Bloat due to type parameters can be reduced by sharing implementations for instantiation types with identical binary representations.

Effective C++ Third Edition 55 Specific Ways to Improve Your Programs and Designs
Effective C++ Third Edition 55 Specific Ways to Improve Your Programs and Designs
ISBN: 321334876
Year: 2006
Pages: 102 © 2008-2017.
If you may any questions please contact us: