15.8 STATIC VERSUS DYNAMIC BINDING FOR FUNCTIONS IN C++ | Programming with Objects: A Comparative Presentation of Object Oriented Programming with C++ and Java

15.8 STATIC VERSUS DYNAMIC BINDING FOR FUNCTIONS IN C++

Ordinarily, a compiler is able to figure out which function definition to bind to a given function call. If function names are overloaded, the compiler may have to resort to overload resolution (as discussed in Chapter 9), but it will know which one of the overload definitions to use. When the exact function definition to be invoked can be determined at compile time, we refer to that as the static binding of a function definition to a function call. In the following example, the constructor definition at (P) is statically bound to the constructor invocation at (R). Similarly, the constructor definition at (Q) is statically bound to the constructor invocation at (S).

 
    //StaticBinding.cc

    #include <string>
    using namespace std;

    class UserProfile {
        string name;
        int age;
        // ...
    public:
        UserProfile( string str, int yy )
                 : name( str ), age( yy ) {}                       //(P)
        UserProfile( string str )
                 : name( str ) { age = averageAge(); }             //(Q)
        int averageAge() { return 48; }
        // ...
    };

    int main()
    {
        UserProfile user1( "Zaphod", 112 );                        //(R)
        UserProfile user2( "Trillion" );                           //(S)
        //. . .
        return 0;
    }

Now let's consider the print loop in line (E) of main() in the VirtualPrint2.cc program of the previous section. A print() function cannot be statically bound to the invocation in the print loop because the true identity of the object pointed to by the iterator in line (F) of that program cannot be determined at compile time. The compiler knows only that an overridable function (overridable because print() was declared virtual in the base class in line (A) of VirtualPrint2.cc) is being invoked on an iterator that points to an object of type Employee; the more precise identity of that object within the Employee hierarchy is not known to the compiler. It can be ascertained only at run time.

Therefore, which particular version of print() to use for each invocation of this function in the print loop in line (F) of VirtualPrint2.cc can only be determined at run time. This is referred to as dynamic binding. When we declare a function virtual in the base class and thus make possible a run-time invocation of the correct form of the function for each derived class, we make it possible for that function to become dynamically bound.
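The difference between the two kinds of binding can be seen in a small sketch. Since VirtualPrint2.cc is not reproduced in this section, the Employee and Manager classes below are simplified stand-ins, and label() and callThroughBasePointers() are hypothetical names introduced only for this illustration. The virtual print() is bound at run time, whereas the nonvirtual label() is bound at compile time according to the declared type of the pointer.

```cpp
// BindingSketch.cc (illustrative only; every name here is hypothetical)
#include <string>
#include <vector>

class Employee {                                   // BASE
public:
    virtual ~Employee() {}
    // virtual: the version to run is chosen at run time (dynamic binding)
    virtual std::string print() const { return "Employee"; }
    // nonvirtual: the version to run is chosen at compile time (static binding)
    std::string label() const { return "Employee"; }
};

class Manager : public Employee {                  // DERIVED
public:
    std::string print() const { return "Manager"; }   // overrides Employee::print
    std::string label() const { return "Manager"; }   // only hides Employee::label
};

// Invokes both functions through Employee* pointers and records the results.
std::string callThroughBasePointers() {
    Employee e;
    Manager m;
    std::vector<Employee*> staff;
    staff.push_back( &e );
    staff.push_back( &m );

    std::string result;
    for ( std::vector<Employee*>::iterator it = staff.begin();
          it != staff.end(); ++it ) {
        // print() follows the object; label() follows the pointer type
        result += (*it)->print() + "/" + (*it)->label() + " ";
    }
    return result;
}
```

For the Manager object, print() yields "Manager" while label() still yields "Employee", because only the virtual function is looked up on the basis of the object's actual type.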

Since the choice of which specific function to invoke can, for virtual functions, be made only at run time through dynamic binding, one might wonder what costs are associated with dynamic binding and how these costs depend on the complexity of a class (in terms of, say, the number of virtual functions defined for the class). The rest of this section should give the reader a sense of these costs. For most programming, they are insignificant and can safely be ignored.

Each class that either has its own virtual functions or inherits virtual functions from a superclass is provided with a virtual table, commonly referred to as vtbl, that contains pointers to the implementations of the virtual functions for that class. So given a class X and its subclass Y as

    class X {                                 // BASE
        // ....
    public:
        X();
        virtual void f1();
        virtual void f2();
        void f3();
        ~X();
    };

    class Y : public X {                      // DERIVED
        // ...
    public:
        void f4();
        virtual void f5();
    };

the class X would have associated with it the virtual table shown in Table 15.1, and the class Y the virtual table shown in Table 15.2.

Table 15.1
virtual table for class X
f1	pointer to the implementation code for f1
f2	pointer to the implementation code for f2

Table 15.2
virtual table for class Y
f1	pointer to the implementation code for f1
f2	pointer to the implementation code for f2
f5	pointer to the implementation code for f5

The virtual table of a class is stored somewhere in memory outside the objects of the class, and each object is given a pointer to the table. This pointer is called a virtual table pointer, referred to here by the symbol vtpr (more commonly written vptr). A vtpr pointer is a hidden data member in every object of a class that has at least one virtual function, whether that function is defined directly in the class or inherited from a superclass.

To actually see the object enlargement caused by the concealed vtpr data member, consider the following test program in which the destructor has been declared to be virtual in order to make the class a polymorphic type:

 
    //VtprConcealed.cc

    #include <iostream>
    using namespace std;

    class X {
        int n;
    public:
        X( int nn ) : n( nn ) {}
        virtual ~X(){}
    };

    int main()
    {
        cout << sizeof( X ) << endl;      // 8 on a 32-bit platform
        X xobj( 10 );
        cout << sizeof( xobj ) << endl;   // 8 on a 32-bit platform
    }

On a platform with 4-byte ints and 4-byte pointers, this small program prints 8 for the size of X: 4 bytes for the int data member n and an additional 4 bytes for the concealed pointer data member vtpr. (On a typical 64-bit platform the pointer occupies 8 bytes and, with alignment padding, the program usually prints 16.) If you make the destructor nonvirtual and run the same program, you will get only 4 for the size of X. The additional concealed data member must therefore be stored on a per-object basis, since the sizeof operator applied to a class type returns only the memory needed for each object.
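The comparison between the virtual and the nonvirtual case can also be made within a single program. The following is a minimal sketch in which Plain, Polymorphic, and extraBytesPerObject() are hypothetical names chosen for the illustration. The exact difference in size depends on the platform's pointer size and alignment rules, so the sketch computes the difference instead of assuming a particular number.

```cpp
// SizeComparison.cc (illustrative only; every name here is hypothetical)
#include <cstddef>

class Plain {                          // no virtual functions: no concealed pointer
    int n;
public:
    Plain( int nn ) : n( nn ) {}
    ~Plain() {}
    int get() const { return n; }
};

class Polymorphic {                    // virtual destructor: concealed pointer added
    int n;
public:
    Polymorphic( int nn ) : n( nn ) {}
    virtual ~Polymorphic() {}
    int get() const { return n; }
};

// Extra bytes carried by each object of the polymorphic class. On common
// implementations this is the pointer size plus any alignment padding.
std::size_t extraBytesPerObject() {
    return sizeof( Polymorphic ) - sizeof( Plain );
}
```

On common implementations extraBytesPerObject() returns at least sizeof( void* ), although the language standard itself does not prescribe how virtual functions are to be implemented.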

Now let's see how a virtual table is actually used, so that we can appreciate what performance issues may arise if our program were suffused with a very large number of calls to functions that require dynamic binding. When the compiler sees the function call in line (T) of main in the program shown below

 
    //VirtualFunctionCost.cc

    class X {                                 // BASE
        // ...
    public:
        virtual void foo();
    };

    class Y : public X {                      // DERIVED
        // ...
    public:
        void foo();
    };

    int main()
    {
        X* p;
        // ....
        // p could be made to point to either
        // an X object or a Y object
        // ...
        p->foo();                                              //(T)
        // ...
    }

it has no way to know (at compile time) the true identity of the object to which p in main points. So all it does is replace the call p->foo() with a piece of code that says to (i) first ascertain the true identity of the object to which the pointer p is pointing; (ii) then reach into the virtual table of the class corresponding to that object through the vtpr pointer data member in the object; and, finally, (iii) invoke the implementation of foo reached through the table. So the invocation of a polymorphic function through dynamic binding takes a few more steps than the invocation of a statically bound function. To the extent that a few CPU cycles are consumed by these extra steps, there is a slight performance penalty associated with the polymorphic invocation of functions.
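To make these steps concrete, the lookup can be imitated by hand with a table of function pointers. This is only a sketch of the general idea and not the layout used by any particular compiler; the names XObj, VTable, fooForX, fooForY, makeX, makeY, and callFoo are all hypothetical. In such a scheme step (i) needs no separate work, because the vtpr stored in the object already leads to the table of the object's own class.

```cpp
// HandRolledVtable.cc (illustrative only; every name here is hypothetical)
#include <string>

struct XObj;                                   // forward declaration

// Stand-in for a virtual table: one slot per virtual function.
struct VTable {
    std::string (*foo)( const XObj* );
};

// Stand-in for an object of a polymorphic class.
struct XObj {
    const VTable* vtpr;                        // explicit here, concealed in real C++
};

// Stand-ins for the two implementations of foo.
std::string fooForX( const XObj* ) { return "X::foo"; }
std::string fooForY( const XObj* ) { return "Y::foo"; }

// One table per class, shared by all objects of that class.
const VTable tableForX = { fooForX };
const VTable tableForY = { fooForY };

// "Constructors": each object is given a pointer to its class's table.
XObj makeX() { XObj obj = { &tableForX }; return obj; }
XObj makeY() { XObj obj = { &tableForY }; return obj; }

// Roughly what p->foo() turns into: follow vtpr into the table,
// fetch the slot for foo, and call through the function pointer.
std::string callFoo( const XObj* p ) {
    return p->vtpr->foo( p );
}
```

Calling callFoo() on an object made by makeY() reaches fooForY even though callFoo() itself knows nothing about which kind of object it was handed. The extra work relative to a direct call is the two pointer dereferences in its body.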

Calls to virtual functions also exact another performance penalty: interference with compiler optimizations through function inlining. As we mentioned in Chapter 9, when a compiler is allowed to inline a function, it replaces a call to that function with the body of the function. But a call to a virtual function that requires dynamic binding does not, in general, allow for such code replacement, since the compiler does not know at compile time which version of the function to use.
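The connection between inlining and binding can be seen from the opposite direction: when a call to a virtual function is written with an explicit class qualifier, C++ suppresses the virtual call mechanism and binds the call statically. In the sketch below, dynamicCall() and staticCall() are hypothetical helper names, and foo() returns a string only so that the outcome can be inspected. Because the target of the qualified call is known at compile time, it becomes a candidate for inlining; whether a given compiler actually inlines it is not something this sketch demonstrates.

```cpp
// QualifiedCall.cc (illustrative only; the helper names are hypothetical)
#include <string>

class X {                                 // BASE
public:
    virtual ~X() {}
    virtual std::string foo() const { return "X::foo"; }
};

class Y : public X {                      // DERIVED
public:
    std::string foo() const { return "Y::foo"; }
};

// Ordinary virtual call: the version of foo is selected at run time.
std::string dynamicCall( const X* p ) { return p->foo(); }

// Qualified call: always X::foo, selected at compile time.
std::string staticCall( const X* p ) { return p->X::foo(); }
```

Given a pointer to a Y object, dynamicCall() returns "Y::foo" while staticCall() returns "X::foo".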