Tracing Memory Leaks in C | Memory as a Programming Concept in C and C++

Tracing Memory Leaks in C++

Unfortunately, the situation here is not as straightforward as with C programs; in fact, it is downright convoluted. Of course, we can overload the global operators new and delete as in the following sample code.

    extern "C" {
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <stdint.h>    // uintptr_t: an integer wide enough to hold a pointer
    }
    #include <iostream>
    using std::cout;

    //location new
    void* operator new(size_t size,const char* src,int line)
    {
      void *p;
      p = malloc(size);
      cout << "location new in " << src << " at line "
           << line << " size " << size << " alloc "
           << (uintptr_t)p << '\n';
      return p;
    }

    //location delete
    void operator delete(void* p,const char* src,int line)
    {
      cout << "location delete in " << src << " at line "
           << line << " alloc " << (uintptr_t)p << '\n';
      free(p);
    }

    //overloaded delete
    void operator delete(void* p)
    {
      cout << "overloaded delete alloc " << (uintptr_t)p << '\n';
      free(p);
    }

    // from here on, every new expression reports its file and line
    #define new new(__FILE__,__LINE__)

    class X
    {
    public:
      int *val;
      char filler[2];
      X() { val=NULL; }
      X(int x) {
        cout << "CON\n";        //entering constructor
        val = new int;
        *val = x;
        cout << "EXIT CON\n";   //exiting constructor
      }
      ~X() {
        cout << "DES\n";        //entering destructor
        delete val;
        cout << "EXIT DES\n";   //exiting destructor
      }
    };//end class X

    void doit();

    // function main ----------------------------------------------
    int main()
    {
      doit();
      return 0;
    }//end main

    // function doit ----------------------------------------------
    void doit()
    {
      X x(32);
      cout << "address of object x is " << (uintptr_t) &x << '\n';
      X *p = new X(33);
      cout << "p="<< (uintptr_t) p << '\n';
      delete p;
    }//end doit

Execution of the program will output something like the following (the annotations in [brackets] are ours; the line numbers and addresses depend on the source layout and the platform).

    CON        [entering constructor for the auto object x]
    location new in x.cpp at line 43 size 4 alloc 4391008
               [allocated x.val]
    EXIT CON   [exiting constructor for the auto object x]
    address of object x is 1244952
    location new in x.cpp at line 69 size 8 alloc 4398224
               [allocated p]
    CON        [entering constructor for the dynamic object *p]
    location new in x.cpp at line 43 size 4 alloc 4390960
               [allocated p->val]
    EXIT CON   [exiting constructor for the dynamic object *p]
    p=4398224
    DES        [entering destructor for *p]
    overloaded delete alloc 4390960
               [deleted p->val]
    EXIT DES   [exiting destructor for *p]
    overloaded delete alloc 4398224
               [deleted p]
    DES        [entering destructor for x]
    overloaded delete alloc 4391008
               [deleted x.val]
    EXIT DES   [exiting destructor for x]

As you can see, the global operator new has been replaced by the location operator new (due to the macro #define new new(__FILE__,__LINE__)), but the location operator delete apparently did not participate, though the overloaded delete did. As discussed in Chapter 8, a delete expression cannot take placement arguments, so the location delete cannot be invoked through the delete syntax in a program. This is why the global (and overloaded) delete is called instead.

So why have the location delete at all? During construction of an object of class X, the compiler keeps track of which new is used; if an exception is thrown during the construction, the pieces constructed so far are destroyed and the memory is released using the matching delete. However, once the construction is completed, the information concerning which new was used is erased; hence the compiler has no way of selecting the matching delete, and the global one is used by default. To summarize: we have the location delete there just for the case of an exception during construction, and any other delete is performed by the overload of the global delete. So here is our first disappointment with the approach that worked so well with C: there is no simple way to have delete announce its location.

Our second problem involves placement of the macro #define new new(__FILE__,__LINE__). If it is placed after the definition of the class X, then the new used in the constructor of X will not be the location new. The picture can get even murkier, for we might have a class-specific new that we do not want to "modify" to the location new. In a given class there can actually be mixed use of new: in some places the global new is used and elsewhere the class-specific new is used. We might want to "change" the former but not the latter, or the other way around, and a single macro cannot tell them apart.

On the other hand, if we want each new to report its location and if we want to log and keep track of what is allocated (as in the debug version of malloc()), we would have to "change" not only the global operators but also every new and delete in all classes in the program, which is not an easy task. Since new and delete are operators and not (as in C) standard functions, the macro substitution cannot reach code that has already been compiled, and thus you have no chance of obtaining located reports of memory leaks in external object code that is written in C++. And we have not even discussed new[] and delete[] yet. All these troubles arise because new and delete are operators (and hence a part of the language), unlike in C, where the allocators are simply standard functions that can be plugged in or unplugged at will.

By now the reader must be convinced that the task cannot be done without significant changes to the C++ program being debugged for memory leaks. However, we can certainly employ certain programming strategies to produce C++ code that at least facilitates reasonable tracing.

Mark the entry to every function using TRACE(function_name). If the program is already written, a simple preprocessor can provide this.

Mark every exit from a void function using RETURN and from a nonvoid function using RETURN1(..), and do not let the code "fall off" at the end of a function; use either RETURN or RETURN1. (Again, if it has not been done, a simple preprocessor can do it.) Thus we have each entry and exit of any function marked in a special way.

Introduce a global variable const char* LOC. For a production build, use

    #define TRACE(a) LOC=#a;
    #define RETURN return;
    #define RETURN1(a) return(a);
    #define FILELOC(n) LOC

Thus, for instance, for a production run

    void doit()
    {
      TRACE(doit)
      .... RETURN
      ....
      RETURN
    }

will be preprocessed to

    void doit()
    {
      LOC="doit";
      .... return;
      ....
      return;
    }

In the program we can use FILELOC(0) as a location reference for logging (if our system is supposed to provide logging in production runs); this will report the function in which the report is being made. The overhead during execution is minimal.

For a debugging run we can define

    #define TRACE(a) push_on_stack(#a);
    #define RETURN {pop_stack(); return;}
    #define RETURN1(a) {pop_stack(); return(a);}
    #define FILELOC(n) show_stack(n)

which will modify our example to

    void doit()
    {
      push_on_stack("doit");
      .... {pop_stack(); return;}
      ....
      {pop_stack(); return;}
    }

We must now link the program that is being debugged with our additional debugging functions void push_on_stack(const char*), void pop_stack(), and const char* show_stack(int n). The role of the first two functions is self-explanatory; the function show_stack(n) returns the string that is located n positions from the top of the stack. In the program we can use FILELOC(m) ... FILELOC(0) as location references, which will report the m + 1 most recent active calls in the correct order. The overhead during execution is not trivial, but this is for a debugging run and so performance is not an issue. FILELOC(m) ... FILELOC(0) together with __FILE__ and __LINE__ will give us a decent location reference for debug_malloc() and for the location overloaded new and delete, while FILELOC(m) ... FILELOC(0) alone will give us a somewhat less decent location reference for the overloaded delete.

For each class suspected of leaking memory, we can employ two strategies. In the first we overload its class-specific new and delete (as we did for the global operators) and then check the logs and the final statistics to determine if memory is leaking in these objects. In the second strategy we implement an "object counting" approach to see whether some objects are never deallocated. Which approach is more suitable depends on the circumstances. Generally speaking, the first strategy should be used if you suspect that memory is leaking somewhere in the destruction of the objects, whereas the second should be used if you suspect that memory is leaking on account of whole objects not being deallocated. In the worst case, both approaches can be combined.

Doing all of this for a large program that is leaking memory may prove to be quite laborious. It is thus highly advisable to design the program with these debugging features right from the start and to have these features turned on or off based on compilation flags (in this way, the program may be compiled either for a production build or for debugging).