Watch Those Pointer Variables | Designing Highly Useable Software

In the previous section I talked about how you can use pointers in your container variables. But as they say, with such power comes a certain responsibility. Pointers can be a mess if you’re not careful.

You know the old story: Delete every object you create. But sometimes this is easier said than done. What if you’re creating a library that other people will use, and one of your functions creates a new object and returns a pointer to the object? Who is responsible for deleting the object, you or the user? If you delete the object eventually, will the programmers making use of your library know that the object got deleted? And if you don’t delete it, will the programmers think you deleted it, leaving it hanging around, using up memory? Before I move on, I’ll go ahead and solve this one for you.

RULE

The choice is yours as to whether you want to delete the object or you want the programmer using the library to delete the object. But whatever you choose, document it and make sure the people using your library are aware of your decision.
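One way to make that decision hard to miss is to put it where the caller looks first: in a comment on the function, or in the return type itself. The following is a minimal sketch, not anything prescribed by the rule above. The names CreateObject and CreateOwnedObject are hypothetical, and the second option assumes a C++14 compiler so that std::make_unique is available.

```cpp
#include <memory>

// Hypothetical library type, standing in for whatever your library returns.
struct MyObject {
    int value = 0;
};

// Option 1: a raw pointer, with the ownership decision stated in the docs.
// OWNERSHIP: the CALLER owns the returned object and must delete it.
MyObject *CreateObject() {
    return new MyObject();
}

// Option 2: the return type documents the decision. The caller owns the
// object, and it is deleted automatically when the unique_ptr is destroyed.
std::unique_ptr<MyObject> CreateOwnedObject() {
    return std::make_unique<MyObject>();
}
```

With the first function the caller has to read the comment and remember to call delete. With the second, forgetting is much harder because the compiler tracks the owner.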

Ultimately, when dealing with pointers, you should have one goal in mind: to make the absolute best software that has no errors and no problems. All your efforts in worrying about memory allocation and handling orphan objects are for naught if your program isn’t highly useable. What difference does it make? Junk that’s slightly better than other junk is still junk. And who decides if your software is junk? The users, of course.

Some of you might think I’m being a bit harsh here, but I would guess that at least in the world of C++ programming, pointer problems are the single biggest contributor to bugs in software, particularly crashes. Want to see a really easy way to crash your program? Here goes:

 int main() {
     int *x = 0;
     *x = 0;
 }

This tiny program, of course, tries to write the integer 0 to memory address 0. And with the virtual memory systems of today’s processors, address 0 probably isn’t even mapped into physical memory. Or if it is, who knows where the operating system mapped it. The end result? A big ol’ crash.

Now, of course, you wouldn’t write code like that, but you might find yourself writing code like this:

 int main() {
     MyObject *inst = AllocateObject();
     inst->value = 0;
 }

where MyObject and AllocateObject are part of a third-party library you obtained. Why is this code bad? It’s not, unless AllocateObject returns a 0 or NULL pointer for whatever reason. Maybe AllocateObject requires an Internet connection, and during development and testing you and the testers all had Internet connections, but you forgot to test what happens if the connection goes down. And then you send out your software, and it works fine on 99 out of 100 systems. But that 100th system is struggling to hold a dial-up connection over an old co-op phone system in a rural area. Alas, the connection fails, so AllocateObject can’t get the data it needs, and it simply returns a NULL pointer. Then your program will crash and you’ll have an unhappy customer.

The solution, of course, is good error checking, like so:

 int main() {
     MyObject *inst = AllocateObject();
     if (inst) {
         inst->value = 0;
     }
 }

Easy! But of course, you need to handle this error and not just skip the work and end the program as my little sample does. If the pointer is NULL, you’ll want to ask the user how to proceed, whether to use some presaved data or whatever. (Look at what Internet Explorer does: If it can’t get a connection, it offers to load the data from the cache if present.)
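To give a rough idea of what handling the error might look like, here is a minimal sketch that falls back to presaved data when the allocation fails. Everything in it is an assumption made for illustration: this AllocateObject is a stand-in for the third-party call, its online parameter exists only to simulate a dropped connection, and LoadFromCache and GetObject are hypothetical helpers. A real program would probably ask the user before falling back.

```cpp
#include <string>

// Hypothetical library type; the source field exists only for this demo.
struct MyObject {
    int value = 0;
    std::string source;
};

// Stand-in for the third-party call. The 'online' flag is an assumption
// used to simulate a failed connection; it returns nullptr on failure.
MyObject *AllocateObject(bool online) {
    if (!online) {
        return nullptr;
    }
    MyObject *obj = new MyObject();
    obj->source = "network";
    return obj;
}

// Hypothetical fallback that builds the object from presaved data.
MyObject *LoadFromCache() {
    MyObject *obj = new MyObject();
    obj->source = "cache";
    return obj;
}

// Checks the pointer once, at the point where it is obtained, and recovers
// instead of letting a null pointer travel further into the program.
// OWNERSHIP: the caller must delete the returned object.
MyObject *GetObject(bool online) {
    MyObject *inst = AllocateObject(online);
    if (!inst) {
        inst = LoadFromCache();
    }
    return inst;
}
```

Code that calls GetObject can then write to inst->value without another check, because the failure was already dealt with in one place.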

But this brings up a troubling question: Should you test a pointer every single time you use it? What if you call a routine, passing the pointer in, and for all you know that routine might have just wiped it out? Let’s consider that for a moment. Look at this line of code:

 Wipeout(inst);

The inst variable is a pointer to an object. Can Wipeout wipe out the pointer? Maybe. But to be sure what Wipeout does, look at the prototype. Suppose it’s this:

 void Wipeout(MyObject *someinst);

You’re a little bit safer here, because Wipeout doesn’t take a reference. Therefore, whatever Wipeout does is on a copy of the pointer, not the original pointer. But that only means you’re a little bit safer, because while Wipeout may be getting a copy of the pointer, that copy still points to the same object you’re working with, which isn’t a copy. In this case, Wipeout can still corrupt the data in the object or call delete on the object. (Don’t forget about delete!)
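The difference is easy to see in a small experiment. In this sketch the bodies of the functions are made up for illustration, and WipeoutByRef is a hypothetical variant added for contrast that takes the pointer itself by reference.

```cpp
// Hypothetical object type used throughout this section.
struct MyObject {
    int value = 0;
};

// Takes the pointer by value: someinst is a copy of the caller's pointer.
void Wipeout(MyObject *someinst) {
    someinst->value = -1;  // reaches the caller's object through the copy
    someinst = nullptr;    // changes only the local copy of the pointer
}

// Contrast case (hypothetical): takes the pointer by reference, so the
// assignment below changes the caller's own pointer variable.
void WipeoutByRef(MyObject *&someinst) {
    someinst = nullptr;
}
```

After calling Wipeout(inst), the caller's inst still holds the same address, but the object it points to has been changed. After calling WipeoutByRef(inst), the caller's inst is null.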

If Wipeout, instead, takes a reference, then you still want to be suspicious, although in some sense you’re actually a little safer. Here’s a sample prototype:

 void Wipeout(MyObject &inst);

To call Wipeout, of course, you need to dereference your pointer:

 Wipeout(*inst);

This is indeed safer, because your pointer never even makes it into Wipeout. Wipeout gets a reference to the object (which internally is really the function’s own pointer variable totally separate from your pointer variable). Thus, there’s no way for Wipeout to change the value stored in your pointer. Wipeout can, however, corrupt the object by maliciously (or accidentally) putting bad data inside the object.

Another really nasty thing that Wipeout can do is delete your object, which is something you might not expect it to be able to do considering it doesn’t even take a pointer. Here’s a sample Wipeout routine that deletes the object:

 void Wipeout(MyObject &inst) {
     delete &inst;
 }

This is particularly nasty because the calling routine really has no way to test whether this action took place without using some advanced memory management libraries. The reason is that the original pointer is still intact and nonzero, even though the object is no longer valid. Yet I have seen libraries with code like this frightening Wipeout sample. And really, a Wipeout function that takes a pointer can do the same thing, of course. However, deleting through a reference by taking the object’s address is especially bad because you don’t have to pass in a heap object; you can pass in an object that’s on the stack, like so:

 MyObject inst;
 inst.value = 10;
 Wipeout(inst);
 cout << inst.value << endl;

No routine I call should even be attempting to delete this inst object! Yet the compiler builds this code without an error, even though deleting a stack object is undefined behavior. Yuck!

But do you want to know what’s even scarier about code like this? The calling routine might be able to continue using the object for a while! This, of course, depends on the runtime library, but the runtime might not have cleared out the object’s memory. Instead, the runtime may have just left the memory there but noted that the memory is available for use elsewhere. This means your program might happily go along and only much later (after another call to new, most likely) your pointer will suddenly point to bad data and you’ll get a crash. My personal experience is that this is the absolute hardest possible bug to track down. I’m not exaggerating; memory corruption takes hours upon hours to track down because you can’t immediately identify when the corruption takes place. (For what it’s worth, the gcc compiler and library I tested did clear the object’s memory to 0 after deleting the object if the object was on the heap but not if the object was on the stack; however, values of all 0 for the data members might not cause your program to crash just yet.)
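A raw pointer gives the caller no way to ask whether its object is still alive. If you control how the objects are created, one alternative (a different technique from the raw-pointer code discussed here, offered only as a sketch) is to hold the object in a std::shared_ptr and keep a std::weak_ptr as an observer. The Wipeout signature and the IsAlive helper below are hypothetical.

```cpp
#include <memory>

// Hypothetical object type used throughout this section.
struct MyObject {
    int value = 0;
};

// Hypothetical routine that releases the object it was handed. It takes
// the shared_ptr by reference so it can drop the caller's ownership.
void Wipeout(std::shared_ptr<MyObject> &inst) {
    inst.reset();  // destroys the object if this was the last owner
}

// Reports whether the observed object still exists.
bool IsAlive(const std::weak_ptr<MyObject> &observer) {
    return !observer.expired();
}
```

A caller would create the object with std::make_shared, hand a std::weak_ptr to any code that only needs to watch it, and call lock() on that weak_ptr before each use. If the object is gone, lock() returns an empty pointer instead of a dangling one.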

But the good news is this: This kind of evil code doesn’t occur just in third-party libraries where you have no control. It can also occur in your own code, where you can find it and fix it. When you’re grinding out hundreds of lines of code a day, it’s easy to forget exactly what you did in some function. Or, if you’re working on a team, it’s very easy for one of your less-experienced teammates to include such evil code.

My recommendations, then, are these:

Be extremely conscientious of your news and deletes.
Document your functions carefully so people writing code to call into your functions know whether your functions will be deleting the objects.
Consider purchasing a memory management tool that provides instrumentation.

As for the final point, several vendors create memory management tools that can be lifesavers in helping you track down problems. The way they work is that they start up in the background, and when your program starts, the utility performs an instrumentation, which means the utility analyzes your software for the function calls. Then you just use your program as you normally would. Once you finish using your program, you look at the utility’s report. It will tell you if it found any problems, such as attempts to access an object after the object was deleted. It will tell you the line where the access occurred and provide a call stack showing the function flow that led to the line.

In addition, you can find various libraries that you link into your program that track all your new calls and delete calls, saving everything to a log file. Or, if you’re ambitious, you can write your own such library by overloading the new and delete operators. Since this isn’t a book on C++, I’m not going to show you how to do this; instead, I recommend doing a web search because I found lots of websites through Yahoo! that have tips on overloading the new and delete operators. (However, I did actually have a need to overload the new and delete operators in Chapter 9, “When Your Software Starts, Stops, or Dies a Quick Death,” in the section “What About Exceptions in Constructors and Destructors in C++?” So you can find an example there, although I don’t do any logging.)
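For readers who want to see the shape of the idea before searching, here is a minimal sketch of such a tracker. It only counts calls instead of writing a log file, it is not thread-safe, and it does not cover the over-aligned forms of new added in C++17. The counter names and the OutstandingAllocations function are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counters. They are zero-initialized before any code runs,
// so they are safe to touch even from allocations made before main.
static std::size_t g_newCalls = 0;
static std::size_t g_deleteCalls = 0;

// Replacement for the global operator new: count the call, then allocate.
void *operator new(std::size_t size) {
    ++g_newCalls;
    if (size == 0) {
        size = 1;  // operator new must return a unique non-null pointer
    }
    void *p = std::malloc(size);
    if (!p) {
        throw std::bad_alloc();
    }
    return p;
}

// Replacement for the global operator delete: count non-null frees only.
void operator delete(void *p) noexcept {
    if (p) {
        ++g_deleteCalls;
    }
    std::free(p);
}

// Sized form, which C++14 and later compilers may call instead.
void operator delete(void *p, std::size_t) noexcept {
    if (p) {
        ++g_deleteCalls;
    }
    std::free(p);
}

// Number of objects created with new that have not yet been deleted.
std::size_t OutstandingAllocations() {
    return g_newCalls - g_deleteCalls;
}
```

Comparing OutstandingAllocations() before and after a block of code gives a quick check for leaks in that block. A logging version would record the size and address in each operator, taking care not to allocate with new while doing so.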