Schedule Time for Building Debugging Systems | Debugging Applications for Microsoft® .NET and Microsoft Windows® (Pro-Developer)


As you're doing the design and initial scheduling for your project, make sure to add in time for building your debugging systems. You need to decide up front how you're going to implement your crash handlers (a topic covered in Chapter 9), file data dumpers, and other tools you'll need to help you reproduce problems reported from the field. I've always liked to treat the error handling systems as if they were a product feature. That way, others in the company can see how you're going to handle bugs proactively when they come up.

As you're planning your debugging systems, you need to establish your preventive debugging policies. The first and most difficult part of this process is determining how you're going to return error conditions in the project. Whatever you do, make sure you pick only one way and stick with it. One project I encountered long ago (and fortunately wasn't a part of) had three different ways to return errors: return values, setjmp/longjmp exceptions, and a global error variable similar to the C run-time library's errno variable. Those developers had a very difficult time tracking errors across subsystem boundaries.

Unfortunately, I can't make a blanket recommendation for a particular way of returning errors because Windows development involves too many dependencies on technologies and third-party components. Technologies such as the Component Object Model (COM) enforce an error-return standard. In general, across subsystem boundaries, I prefer the COM approach, in which you check a return value instead of throwing objects such as C++ exceptions. I realize that some hardcore C++ developers might disagree with my preference, but I always err on the side of simplicity and understandability, both of which are apparent in the COM approach.
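As a rough illustration of the return-value approach, here is a minimal C++11 sketch of one error convention used on both sides of a subsystem boundary. The names Status, Succeeded, OpenDataFile, and LoadDocument are hypothetical and made up for this example; they are not COM definitions, although Succeeded plays the role that the SUCCEEDED macro plays for HRESULT values.

```cpp
#include <cassert>
#include <string>

// Hypothetical status codes shared by every subsystem in the project.
enum class Status { Ok, InvalidArgument, NotFound };

// One helper for testing results, in the spirit of COM's SUCCEEDED macro.
inline bool Succeeded(Status s) { return s == Status::Ok; }

// A subsystem entry point: failure is reported only through the return value.
Status OpenDataFile(const std::string& path, int* handleOut)
{
    if (handleOut == nullptr) return Status::InvalidArgument;
    if (path.empty())         return Status::NotFound;
    *handleOut = 1;   // placeholder handle for the sketch
    return Status::Ok;
}

// A caller in another subsystem checks and propagates the same type.
Status LoadDocument(const std::string& path)
{
    int handle = 0;
    const Status s = OpenDataFile(path, &handle);
    if (!Succeeded(s)) return s;   // pass the error along unchanged
    assert(handle != 0);           // success must have produced a handle
    return Status::Ok;
}
```

Because every layer returns the same type and tests it the same way, an error can be followed from the point of failure to the outermost caller without translating between conventions.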

Build All Builds with Debugging Symbols

Some of the debugging system recommendations that I do make aren't that controversial. I've been harping on my first recommendation for years: build all builds, including release builds, with full debugging symbols. Debugging symbols are the data that lets the debugger show you source and line information, variable names, and data type information for your program. I don't relish the task of debugging a release build binary completely at the assembly-language level. If you like challenges, I guess you could do it, but I like to save time.

Of course, debugging release builds with symbols has its drawbacks. For example, the optimized code the compiler produces won't always match the flow of execution in the source code, so you might find that stepping through release code is a little harder than stepping through debug code. Another problem to watch out for in release builds is that sometimes the compiler optimizes the stack registers such that you can't see the complete call stack, as you would in a straight debug build. Also be aware that when you do add debugging symbols to the binary, it will grow a small amount. However, the size increase is negligible when compared to the ease of being able to solve bugs quickly.

Turning on debug symbols for a release build is quite easy. For Microsoft Visual Basic, on the Project Properties dialog box Compile tab, check Create Symbolic Debug Info. For Microsoft Visual C++ projects, two separate steps are required. The first step is to set the compiler, CL.EXE, to put debug symbols into the OBJ files. In the Project Settings dialog box, select Win32 Release in the Settings For combo box to modify your release builds only. On the C/C++ tab, General Category, Debug Info combo box, select Program Database. This setting will add the /Zi switch to your compiles. Make sure you don't select Program Database For Edit And Continue (/ZI)—that option adds all sorts of padding and other information to your binary so that you can edit the source code as you debug.

The second step for Visual C++ projects is to have the linker, LINK.EXE, generate the actual debug symbols. Select Win32 Release in the Settings For combo box, and on the Link tab, General Category, check Generate Debug Info. This setting turns on the /DEBUG switch to the linker. You also need to type /OPT:REF in the Project Options edit box on the Link tab. Using the /DEBUG switch with the linker automatically tells it to bring in all functions whether or not they are referenced, which is the default for debug builds and is fine there. The /OPT:REF switch tells the linker to bring in only the functions that your program actually references. If you forget to add the /OPT:REF switch, your release application will also contain the functions that are never called, making it much larger than it should be.

Although you might be concerned that turning on debugging symbols will make reverse engineering your application easier, it doesn't. When you select the Program Database (PDB) setting in your project, all debugging symbols are stored in the separate PDB files your program generates. Because you don't ship those files to your customers, the extra debugging symbols won't make reverse engineering your application any easier.

After you build your release builds with full PDB files, you need to store the PDB files in a safe place along with any binary files you ship to customers. If you lose your PDB files, you'll be right back to debugging at the assembly-language level. Treat your PDB files as you would your distributed binaries.

Treat Warnings as Errors—Maybe

Because Visual Basic is much more sensitive to compilation errors than C++ is, anything the compiler reports to you is an error. C++, as anyone who has compiled any program larger than "Hello, World!" knows, is a much looser language and lets you get away with murder and mayhem. Like Visual Basic, Visual C++ has some hard errors that will abort the compilation. Errors such as C2037, "left of 'operator' specifies undefined struct/union 'identifier'," mean that the compiler can't continue. What makes Visual C++ different from Visual Basic is that it can also report warnings.

These warnings generally mean that some part of your code is ambiguous, but the compiler will take an educated guess at the correct meaning. A warning such as C4244, "'conversion' conversion from 'type1' to 'type2', possible loss of data," which is reported when a value is converted to a type that might not be able to hold it (for example, an int assigned to a short), is an excellent example. Although some people feel that warnings are just that, I feel that any warning is the same as an error and that you need to treat it as such. As soon as I see a compiler warning, I stop and fix my code so that it is unambiguous to the compiler.
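Here is a minimal sketch of what "unambiguous to the compiler" can look like for a possible-loss-of-data conversion. The helper name ToShortChecked is hypothetical; the point is only that the range check and the cast state the intent explicitly instead of leaving the compiler to guess.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: narrows an int to a short only after checking that
// the value fits, so the conversion is explicit to compiler and reader.
short ToShortChecked(int value)
{
    assert(value >= SHRT_MIN && value <= SHRT_MAX);
    return static_cast<short>(value);
}
```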

If you ever had the opportunity to learn about compiler construction, particularly parsing, you probably walked away with one thought: parsing is very hard, especially with a language as complex as C++. If the Visual C++ compiler writers go to all the trouble to report a warning, they are trying to tell you that something in your code is ambiguous and they will have to guess what you mean. I don't like letting a tool guess for me—it's a perfect way to introduce bugs. When someone asks me to help with a bug, the first thing I do is verify that the code compiles with no warnings. If it doesn't, I tell them that I'll be glad to help, but not until their code compiles cleanly.

The default projects that the Visual C++ wizards create are at warning-level 3, which corresponds to the /W3 switch to CL.EXE. The next step up is warning-level 4, /W4, and you can even have the compiler treat all warnings as errors with /WX. These levels are all easy to set in the Visual C++ integrated development environment (IDE) in the Project Settings dialog box. On the C/C++ tab, General Category, you set the warning level in the Warning Level combo box, and in the check box right below it, Warnings As Errors, you set the /WX switch.

Although I can almost justify making the global statement "All builds should compile with warning-level 4, and you should treat all warnings as errors," reality intrudes to force me to temper this remark. First off, some common header files won't compile with /W4 and /WX set. The compiler itself has a couple of unnecessary information warnings that it treats as real warnings, so using /WX will stop the compile. The Standard Template Library (STL) that comes with Visual C++ has many warning-level 4 issues in it. The compiler also has a few problems with templates. Fortunately, you can work around most of these issues.

You might think that just setting the warning level to 4 and turning off treating warnings as errors would be fine; in fact, that scheme defeats the purpose. I've found that developers quickly become desensitized to warnings in the Build window. If you don't fix all the warnings as they happen, no matter how innocuous a warning seems, you'll start to lose more important warnings because they'll be hidden amid the output stream. The trick is to be more explicit about which warnings you want to handle. Although your goal should be to get rid of most warnings by writing better code, you can also turn off specific warnings with the #pragma warning directive. Additionally, you can use the #pragma warning directive to control the warning level around specific headers.

A good example of lowering the warning level is when you're including headers that don't compile at warning-level 4. The extended #pragma warning directive, first offered in Visual C++ 6, can lower the warning level. In the following code snippet, I lower the warning level before including the suspect header and restore it afterward so that my code compiles with warning-level 4:

 #pragma warning ( push , 3 )
 #include "IDoNotCompileAtWarning4.h"
 #pragma warning ( pop )

You can also disable individual warnings with the #pragma warning directive. This directive comes in handy when you're using a nameless structure or union and you get a C4201 warning, "nonstandard extension used : nameless struct/union," with warning-level 4. To turn off that warning, you use the #pragma warning directive as in the following code. Notice that I commented what I was turning off and explained why I was turning it off. When disabling individual warnings, be sure to restrict the scope of the #pragma warning directive to specific sections of code. If you place the directive at too high a level, you can mask other problems in your code.

 // Turning off "nonstandard extension used : nameless struct/union"
 // because I'm not writing portable code
 #pragma warning ( disable : 4201 )
 struct S
 {
     float y;
     struct
     {
         int a ;
         int b ;
         int c ;
     } ;
 } *p_s ;
 // Turn warning back on.
 #pragma warning ( default : 4201 )

If you're not using STL, the scheme above works well. If you're using STL, it might work, but it might not. Always try to get the STL headers to compile at warning-level 4 before you lower the warning level around them with #pragma warning ( push , 3 ). You might have to turn off some additional individual warnings, but strive to keep the warning level at 4, if possible. On a couple of projects, I never did get the code to compile without warnings no matter what workaround I tried. In those cases, I dropped the global warning level to 3. Even then, however, I still kept the Warnings As Errors option on.

The bottom line is that you should try to compile with the highest warning level possible and treat all warnings as errors from the start of your project. When you first boost the warning level for your project, you'll probably be surprised by the number of warnings you get. Go through and fix each one. You'll probably notice that just fixing the warnings will solve a bug or two. For those of you who think getting your program to compile with /W4 and /WX is impossible, I have proof otherwise: all the sample code on this book's companion CD compiles with both flags set for all configurations.

Know Where Your DLLs Load

If you've ever been hiking in the woods, you know that landmarks can be very important in keeping you from getting lost. When you don't have any landmarks, you can end up going around in circles. When your application crashes, you need a similar kind of landmark to help point you in the right direction so that you're not wandering around in the debugger.

The first big landmark for crashes is the base address of your dynamic-link libraries (DLLs) and ActiveX controls (OCXs), which indicates where they loaded into memory. When a customer gives you a crash address, you should be able to narrow down which DLL it came from quickly by the first two or three digits of the address. I don't expect you to have all the system DLLs memorized, but you should memorize at least your project's DLL base addresses.
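The lookup you do in your head can be sketched in a few lines of C++11. This is only an illustration: ModuleRange and ModuleForAddress are hypothetical names, and the module sizes you would put in the table are whatever your own binaries occupy.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One loaded module: its name, load (base) address, and size in bytes.
struct ModuleRange {
    std::string   name;
    std::uint32_t base;
    std::uint32_t size;
};

// Returns the name of the module whose range contains addr, or "" if none.
std::string ModuleForAddress(const std::vector<ModuleRange>& modules,
                             std::uint32_t addr)
{
    for (const ModuleRange& m : modules) {
        assert(m.size > 0);
        // Subtract first so the comparison cannot overflow 32 bits.
        if (addr >= m.base && addr - m.base < m.size) return m.name;
    }
    return "";
}
```

With unique base addresses in the table, a crash address such as 0x61001234 resolves to exactly one module; with colliding bases, no fixed table like this can be trusted.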

If all your DLLs load at unique addresses, you have some good landmarks to help guide your search for the crash. But what do you think would happen if all your DLLs had the same load address? Obviously, the operating system doesn't map them all into the same place in memory. It has to "relocate" any incoming DLL that wants to occupy memory that's already filled by putting the incoming DLL into a different place. The issue then becomes one of trying to figure out which DLL is loaded where. Unfortunately, you have no way of knowing what the operating system will do on different machines. Consequently, you'd have no idea where the crash came from, and you'd spend days searching through the debugger looking for it.

By default for wizard-created projects, Visual Basic DLLs load at 0x11000000, and Visual C++ DLLs load at 0x10000000. I'm willing to bet that at least half the DLLs in the world today try to load at one of those addresses. Changing the base address for your DLL is called rebasing, and it's a simple operation in which you specify a different load address than the default.

Before we jump into rebasing, let's look at an easy way to find out whether you have load conflicts in your DLLs. If you see the following notification in the Visual C++ debugger Output window, you need to stop and fix the load addresses of the conflicting DLLs immediately so that you don't forget to fix them later. Make sure you fix the load addresses for both release and debug builds.

 LDR: Dll xxx base 10000000 relocated due to collision with yyy

The xxx and yyy in this statement are the names of the DLLs that are conflicting.

In addition to making it difficult to find a crash, when the operating system has to relocate a DLL, your application slows down. When relocating, the operating system needs to read all the relocation information for the DLL, run through each place in the code that accesses an address within the DLL, and change the address because the DLL is no longer at its preferred place in memory. If you have a couple of load address conflicts in your application, startup can sometimes take more than twice as long!
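To give a feel for the work involved, here is a deliberately simplified C++11 model of relocation. It assumes every fixup is the offset of a 32-bit absolute address stored in the image; the function name ApplyRelocations and the flat byte-vector image are my own simplifications, not the loader's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified model of relocating an image: each fixup is the offset of a
// 32-bit absolute address stored inside the image. Moving the image means
// adding the same delta to every one of those stored addresses.
void ApplyRelocations(std::vector<unsigned char>& image,
                      const std::vector<std::size_t>& fixupOffsets,
                      std::uint32_t preferredBase,
                      std::uint32_t actualBase)
{
    // Unsigned arithmetic wraps, so this also works when moving downward.
    const std::uint32_t delta = actualBase - preferredBase;
    for (std::size_t offset : fixupOffsets) {
        assert(offset + sizeof(std::uint32_t) <= image.size());
        std::uint32_t address;
        std::memcpy(&address, &image[offset], sizeof(address));
        address += delta;
        std::memcpy(&image[offset], &address, sizeof(address));
    }
}
```

Every fixup is a read, an add, and a write to a page of the DLL, which is why an image with many fixups is noticeably slower to load when it cannot have its preferred address.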

There are two ways to rebase the DLLs in your application. The first method is to use the REBASE.EXE utility that comes with the Platform SDK. REBASE.EXE has many different options, but your best bet is to call it using the /b command-line switch with the starting base address and put the appropriate DLLs on the command line.

Table 2-1 shows a table from the Platform SDK documentation for rebasing your DLLs. As you can see, the recommended format is to use an alphabetical scheme. I generally follow this scheme because it's simple. The operating system DLLs load from 0x70000000 to 0x78000000, so using the range in Table 2-1 will keep you from conflicting with the operating system.

Table 2-1 DLL Rebasing Scheme

DLL First Letter	Starting Address
A-C	0x60000000
D-F	0x61000000
G-I	0x62000000
J-L	0x63000000
M-O	0x64000000
P-R	0x65000000
S-U	0x66000000
V-X	0x67000000
Y-Z	0x68000000

If you have four DLLs in your application, APPLE.DLL, DUMPLING.DLL, GINGER.DLL, and GOOSEBERRIES.DLL, you would run REBASE.EXE three times to get all the DLLs rebased appropriately. The following three commands show how you would run REBASE.EXE with those DLLs:

 REBASE /b 0x60000000 APPLE.DLL
 REBASE /b 0x61000000 DUMPLING.DLL
 REBASE /b 0x62000000 GINGER.DLL GOOSEBERRIES.DLL

If multiple DLLs are passed on the REBASE.EXE command line, as shown here with GINGER.DLL and GOOSEBERRIES.DLL, REBASE.EXE will rebase the DLLs so that they are loaded back to back starting at the specified starting address.
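The back-to-back layout can be sketched as simple arithmetic. This C++11 example is not REBASE.EXE's implementation: PackBases is a hypothetical name, and the 64 KB alignment of each base address is an assumption of mine that the text does not state.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed alignment for DLL base addresses (64 KB); adjust if needed.
const std::uint32_t kBaseAlignment = 0x10000;

// Given a starting address and image sizes, returns the base address
// each image would get if the images were packed back to back.
std::vector<std::uint32_t> PackBases(std::uint32_t start,
                                     const std::vector<std::uint32_t>& sizes)
{
    assert(start % kBaseAlignment == 0);
    std::vector<std::uint32_t> bases;
    std::uint32_t next = start;
    for (std::uint32_t size : sizes) {
        bases.push_back(next);
        // Round the end of this image up to the next aligned address.
        next += (size + kBaseAlignment - 1) / kBaseAlignment * kBaseAlignment;
    }
    return bases;
}
```

Under these assumptions, two images of 0x12345 and 0x8000 bytes packed from 0x62000000 would start at 0x62000000 and 0x62020000.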

The other method of rebasing a DLL is to specify the load address when you link the DLL. In Visual Basic, set the address in the DLL Base Address field on the Compile tab of the Project Properties dialog box. In Visual C++, specify the address by selecting the Link tab on the Project Settings dialog box, choosing Output from the Category combo box, and then entering the address in the Base Address edit box. Visual C++ translates the address you enter in the Base Address edit box into the /BASE switch to LINK.EXE.

Although you can use REBASE.EXE to automatically handle setting multiple DLL load addresses at a time, you have to be a little more careful when setting the load address at link time. If you set the load addresses of multiple DLLs too close together, you'll see the loader relocation message in the Output window. The trick is to set the load addresses far enough apart that you never have to worry about them after you set them.

Using the same DLLs from the REBASE.EXE example, I'd set their load address to the following:

 APPLE.DLL         0x60000000
 DUMPLING.DLL      0x61000000
 GINGER.DLL        0x62000000
 GOOSEBERRIES.DLL  0x62100000

The two DLLs to watch are GINGER.DLL and GOOSEBERRIES.DLL because they begin with the same character and therefore share a starting address in Table 2-1. When that happens, I use the third-highest digit to differentiate the load addresses. If I were to add another DLL that started with "G," its load address would be 0x62200000.
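The scheme is regular enough to express as a small function. The following C++11 sketch encodes Table 2-1 (three letters per 0x01000000 slot starting at 0x60000000) and the third-highest-digit rule; SuggestedBase and its indexInGroup parameter are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>

// Base address for a DLL following Table 2-1: three letters per slot,
// slots 0x01000000 apart, starting at 0x60000000. indexInGroup separates
// DLLs that share a slot in 0x00100000 steps (the third-highest hex digit).
std::uint32_t SuggestedBase(char firstLetter, unsigned indexInGroup)
{
    assert(std::isalpha(static_cast<unsigned char>(firstLetter)));
    assert(indexInGroup < 16);
    const unsigned letter = static_cast<unsigned>(
        std::toupper(static_cast<unsigned char>(firstLetter)) - 'A');
    const unsigned slot = letter / 3;   // A-C -> 0, D-F -> 1, ..., Y-Z -> 8
    return 0x60000000u + slot * 0x01000000u + indexInGroup * 0x00100000u;
}
```

For example, SuggestedBase('G', 0) and SuggestedBase('G', 1) give the 0x62000000 and 0x62100000 addresses used above for GINGER.DLL and GOOSEBERRIES.DLL.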

To see a project in which the load addresses are set manually, look at the WDBG project in the section "WDBG: A Real Debugger" in Chapter 4. The /BASE switch also allows you to specify a text file that contains the load addresses for each DLL in your application. In the WDBG project, I use the text-file scheme.

Either method, using REBASE.EXE or rebasing the DLLs manually, will rebase your DLLs and OCXs, but it might be best to follow the second method and rebase your DLLs manually. I manually rebased all the sample DLLs on this book's companion CD. The main benefit of using this method is that your MAP file will contain the specific address you set. A MAP file is a text file that indicates where the linker put all the symbols and source lines in your program. You should always create MAP files with your release builds because they are the only straight text representation of your symbols that you can get. MAP files are especially handy in the future when you need to find a crash location and your current debugger doesn't read the old symbols. If you use REBASE.EXE to rebase a DLL instead of rebasing it manually, the MAP file created by the linker will contain the original base address, and you'll have to do some arithmetic to convert an address in the MAP file to a rebased address. In Chapter 8, I'll explain MAP files in more detail.
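The arithmetic mentioned above is just a change of base. A minimal C++11 sketch, with hypothetical function names, assuming 32-bit addresses and that you know both the base address the linker used and the base address the DLL has after rebasing:

```cpp
#include <cassert>
#include <cstdint>

// Converts an address listed in a MAP file (relative to the base address
// the linker used) to the corresponding address in the rebased image.
std::uint32_t MapToRebased(std::uint32_t mapAddress,
                           std::uint32_t linkedBase,
                           std::uint32_t rebasedBase)
{
    assert(mapAddress >= linkedBase);
    return mapAddress - linkedBase + rebasedBase;
}

// The reverse: turns a crash address from the field into a MAP file address.
std::uint32_t RebasedToMap(std::uint32_t crashAddress,
                           std::uint32_t linkedBase,
                           std::uint32_t rebasedBase)
{
    assert(crashAddress >= rebasedBase);
    return crashAddress - rebasedBase + linkedBase;
}
```

If the DLL was linked at 0x10000000 and rebased to 0x62000000, a crash at 0x62001234 corresponds to 0x10001234 in the MAP file. Setting the base address at link time makes both bases equal, so the conversion disappears.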

One of the big questions I get when I tell people to rebase their files is, "What files am I supposed to rebase?" The rule of thumb is simple: If you or someone on your team wrote the code, rebase it. Otherwise, leave it alone. If you're using third-party components, your binaries will have to fit around them.

Common Debugging Question
What additional compiler and linker options will help me with my proactive debugging?
A number of compiler and linker switches can help you control your application's performance and better debug your application. Additionally, I don't completely agree with the default compiler and linker settings that the Visual C++ project wizards give you, as I mentioned earlier in the chapter. Consequently, I always change some of the settings.

Compiler Switches for CL.EXE

You can type all these compiler switches directly into the Project Options edit control on the bottom of the C/C++ tab of the Project Settings dialog box.

/P (preprocess to a file)

If you're having trouble with a macro, the /P switch will preprocess your source file, expanding all macros and including all include files, and send the output to a file with the same name but with an .I extension. You can look in the .I file to see how your macro expanded. Make sure that you have sufficient disk space because the .I files can be several megabytes apiece. If your .I files are too big for the disk, you can use the /EP switch with /P to suppress the #line directives output by the preprocessor. The #line directives are what the preprocessor uses to coordinate line numbers and source file names in a preprocessed file so that the compiler can report the location of compilation errors.

/X (ignore standard include paths)

Getting a correct build can sometimes be a pain if you have multiple compilers and SDKs installed on your machine. If you don't use this switch, the compiler, when invoked by a MAK file, will use the INCLUDE environment variable. To control exactly which header files are included, the /X switch will cause the compiler to ignore the INCLUDE environment variable and look only for header files in the locations you explicitly specify with the /I switch.

/Zp (structure member alignment)

You should not use this flag. Instead of specifying on the command line how structure members should be aligned in memory, you should align structure members by using the #pragma pack directive inside specific headers. I've seen some huge bugs in code because the development team originally built by setting /Zp. When they moved to a new build or another team went to use their code, the /Zp switch was forgotten, and structures were slightly different because the default alignment was different. It took a long time to find those bugs.
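A minimal sketch of keeping the packing with the structure itself follows. WireHeader and CheckWireHeaderLayout are hypothetical names, and static_assert requires a C++11 compiler, which is newer than the Visual C++ 6 toolset discussed here; the runtime assertion shows the same check for older compilers.

```cpp
#include <cassert>
#include <cstddef>

// Packing applied in the header itself, so every translation unit sees
// the same layout regardless of command-line switches.
#pragma pack(push, 1)
struct WireHeader {          // hypothetical on-disk or on-wire record
    char  tag;
    int   length;
    short flags;
};
#pragma pack(pop)

// Compile-time check that the structure has no padding.
static_assert(sizeof(WireHeader) == sizeof(char) + sizeof(int) + sizeof(short),
              "WireHeader must have no padding");

// Runtime version of a layout check for compilers without static_assert.
inline void CheckWireHeaderLayout()
{
    assert(offsetof(WireHeader, length) == sizeof(char));
    assert(offsetof(WireHeader, flags) == sizeof(char) + sizeof(int));
}
```

Because the push and pop travel with the header, a team that includes it under a different build configuration still gets the layout the original authors intended.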

/GZ (catch release-build errors in debug build)

Visual C++ 6 introduced an outstanding debugging feature in which the compiler automatically initializes a function's local variables and checks the call stack after function calls. This flag is on by default for debug builds, but you can also use it in release builds. If you're having trouble with uninitialized memory reads (wild reads), uninitialized memory writes (wild writes), or memory overwrites, create a new project configuration that is based on your release build and add this switch to the compile options. With all your local variables filled with 0xCC as they're created, you can start looking around to see what changed the values at the wrong time.

Additionally, the /GZ switch will generate code that saves the current stack pointer before an indirect function call (such as a call to a DLL function) and verifies that the stack pointer is unchanged after the call. Validating the stack pointer helps protect you against one of the most insidious bugs around: a mismatched calling convention declaration. This bug occurs when you call a __stdcall function but you misdeclare it as a __cdecl function. These two calling conventions clean up the stack differently, so you'll crash later in the program if you get the calling convention wrong.
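Since locals are filled with 0xCC under /GZ, a small helper can flag values that still hold the fill pattern. This C++11 sketch is only a diagnostic aid under that assumption; LooksUninitialized is a hypothetical name, and a match suggests, but does not prove, that the memory was never written.

```cpp
#include <cassert>
#include <cstddef>

// Fill byte the text says /GZ writes into local variables.
const unsigned char kUninitFill = 0xCC;

// Returns true if every byte of the object still holds the fill pattern,
// which suggests (but does not prove) that it was never written.
bool LooksUninitialized(const void* object, std::size_t size)
{
    assert(object != nullptr && size > 0);
    const unsigned char* bytes = static_cast<const unsigned char*>(object);
    for (std::size_t i = 0; i < size; ++i) {
        if (bytes[i] != kUninitFill) return false;
    }
    return true;
}
```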

/O1 (minimize size)

By default, a project created by the Microsoft Foundation Class (MFC) library AppWizard uses /O2 (maximize speed) for its release-build configurations. However, Microsoft builds all its commercial applications with /O1, and that's what you should be using. What Microsoft has found is that after picking the best algorithm and writing tight code, avoiding page faults can help speed up your application considerably. As I've heard it said, "Page faults can ruin your day!"

Page faults occur when your executing code touches a page of memory (4 KB for x86 Intel) that isn't currently in your process's working set. To resolve a page fault, the operating system must stop executing your program and make the page available to it. If the page fault is soft, meaning that the page is already in memory, the overhead isn't too terrible, but it's extra overhead nonetheless. If the page fault is hard, however, the operating system must go out to disk and bring the page into memory. As you can imagine, this little trip will cause hundreds of thousands of instructions to execute, slowing down your application. By minimizing the size of your binary, you decrease the total number of pages your application uses, thereby reducing the number of page faults. Granted, the operating system's loaders and cache management are quite good, but why take more page faults than you have to?
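The relationship between binary size and page count is simple arithmetic, sketched here in C++11. PagesFor and PagesSaved are hypothetical names, and the byte counts you feed in are your own; the only figure taken from the text is the 4 KB page size.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kPageSize = 4096;   // 4 KB pages on x86, per the text

// Number of pages needed to hold an image of the given size.
std::size_t PagesFor(std::size_t imageBytes)
{
    return (imageBytes + kPageSize - 1) / kPageSize;
}

// Pages saved by shrinking an image from oldBytes to newBytes.
std::size_t PagesSaved(std::size_t oldBytes, std::size_t newBytes)
{
    assert(newBytes <= oldBytes);
    return PagesFor(oldBytes) - PagesFor(newBytes);
}
```

Shrinking a 500,000-byte image to 400,000 bytes, for instance, takes it from 123 pages to 98, which is 25 fewer pages that could ever fault.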

In addition to using /O1, you should look at using the Working Set Tuner (WST) utility from the Platform SDK. WST will help you order your most commonly called functions to the front of your binary so that you minimize your working set, the number of pages kept in memory. With your common functions up front, the operating system can swap out the unneeded pages. Thus, your application runs faster. For more on using WST, see my February 1999 "Bugslayer" column in Microsoft Systems Journal on MSDN.

Linker Switches for LINK.EXE

You can type all these linker switches directly into the Project Options edit control on the bottom of the Link tab of the Project Settings dialog box.

/MAP (generate MAP file)

/MAPINFO:LINES (include line information in the MAP file)

/MAPINFO:EXPORTS (include export information in the MAP file)

These switches build a MAP file for the linked image. (See Chapter 8 for instructions on how to read a MAP file.) You should always create a MAP file because it's the only way to get textual symbolic information. Use all three of these switches to ensure that the MAP file contains the most useful information.

/NODEFAULTLIB (ignore libraries)

Many system header files include #pragma comment ( lib, XXX ) records to specify what library file to link with, where XXX is the name of the library. /NODEFAULTLIB tells the linker to ignore the pragmas. This switch lets you control which libraries to link with and in what order. You'll need to specify each necessary library on the linker command line so that your application will link, but at least you'll know exactly which libraries you're getting and in which order you're getting them. Controlling the order in which libraries are linked can be important any time the same symbol is included in more than one library, which can lead to very difficult-to-find bugs.

/ORDER (put functions in order)

After you've run WST, the /ORDER switch allows you to specify the file that contains the order for the functions. /ORDER will turn off incremental linking, so use it only on release builds.

/PDBTYPE:CON (consolidate PDB files)

You should always turn on /PDBTYPE:CON for all your builds, both release and debug. Visual C++ projects don't have this switch on by default. This switch consolidates all the debugging information for a module into a single PDB file instead of spreading it into multiple files. Having a single PDB file makes it much easier for multiple users to debug the same binaries; it also simplifies the archiving of your debugging information.

/VERBOSE (print progress messages)

/VERBOSE:LIB (print only the libraries-searched progress messages)

If you're having trouble linking, these messages can show you what symbols the linker is looking for and where it finds them. The output can get voluminous, but it can show you where you're having a build problem. I've used /VERBOSE and /VERBOSE:LIB when I've had an odd crash because a function being called didn't look, at the assembly-language level, anything like I thought it should. It turned out that I had two functions with identical signatures, but different implementations, in two different libraries, and the linker was finding the wrong one.

/WARN:3

Generally, I don't use this switch all the time, but a couple times during the project's life I look to see which libraries I'm actually referencing. Turning on /WARN:3 will tell you whether libraries passed to LINK.EXE are referenced. Personally, I like to control exactly which libraries I link against, and I remove unreferenced libraries from the link list.

Design a Lightweight Diagnostic System for Release Builds

The bugs I hate the most are those that happen only on the machines of one or two users. Every other user is merrily running your product, but one or two users have something unique going on with their machines, something that is almost impossible to figure out. Although you could always have the user ship the misbehaving machine to you, this strategy isn't always practical. If the customer is in the Caribbean, you could volunteer to travel there and debug the problem. For some reason, however, I haven't heard of too many companies that are that quality conscious. Nor have I heard of many developers who would volunteer to go to the Arctic Circle to fix a problem.

When you do have a problem situation that occurs on only one or two machines, you need a way to see the program's flow of execution on those machines. Many developers already track the flow of execution through logging files and writing to the event log, but I want to stress how important that log is to solving problems. The problem-solving power of flow logging increases dramatically when the whole team approaches tracking the program's flow of execution in an organized fashion.

When logging your information, following a template is especially important. With the information in a consistent format, developers will find it much easier to parse the file and report the interesting highlights. If you log information correctly, you can record tons of information and have Perl scripts pull out the significant items so that you don't need to spend 20 minutes reading a text file just to track down one detail.
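To make the idea of a template concrete, here is a minimal C++11 sketch of one fixed, pipe-delimited line layout. The layout itself, the Severity levels, and the FormatLogLine name are all assumptions for illustration; what matters is that every line has the same fields in the same order so that a script can split on the delimiter.

```cpp
#include <cassert>
#include <string>

// Hypothetical severity levels for the sketch.
enum class Severity { Info, Warning, Error };

// Builds one log line in a fixed, pipe-delimited layout:
//   <severity>|<module>|<message>
std::string FormatLogLine(Severity sev, const std::string& module,
                          const std::string& message)
{
    // The delimiter must not appear in the module name or parsing breaks.
    assert(module.find('|') == std::string::npos);
    const char* label = "INFO";
    if (sev == Severity::Warning) label = "WARN";
    if (sev == Severity::Error)   label = "ERROR";
    return std::string(label) + "|" + module + "|" + message;
}
```

A line produced this way, such as ERROR|FileIO|Open failed, can be filtered by its first field without reading the rest of the file.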

What you need to log is mostly project-dependent, but at a minimum, you should definitely log failure and abnormal situations. You also want to try to capture a logical sense of the program operation. For example, if your program is performing file operations, you wouldn't want to log fine-grained details such as "Moving to offset 23 in the file," but you would want to log the opening and closing of the file so that if the last entry in the log is "Preparing to open D:\Foo\BAR.DAT," you know that BAR.DAT is probably corrupt.

The depth of the logging also depends on the performance hit associated with the logging. I generally log everything I could possibly want and keep an eye on the release-build performance when not logging. With today's performance tools, you can quickly see whether your logging code is getting in the way. If it is, you can start to back off on the logging a little bit until you strike enough of a balance that you get sufficient logging without slowing down the application too much.

For C++ code, I like to use a macro such as the following to do the logging. Note that G_IsLogging is a global variable that all modules can see. Because the macro tests the global variable inline, you avoid the performance cost of a function call when logging is turned off.

 // Visual C++ macro to do logging
 #define LOGGING(x)              \
     if ( TRUE == G_IsLogging )  \
     {                           \
         LoggingInfo ( x ) ;     \
     }

For Visual Basic code, since there are no macros in the language, I just check the global variable manually. If you were ambitious, you could write a simple Visual Basic IDE add-in that, at the click of a button, would insert everything but the string to pass to the logging function.

 ' Visual Basic example of calling the logging function
 If ( 1 = G_IsLogging ) Then
     LoggingInfo ( "Preparing to open " & sFile )
 End If

You set the global logging flag in one of two ways. If your target audience is experienced computer users, you can set it with an environment variable. Because most of us are targeting ordinary people, however, I recommend making the flag a registry setting. Additionally, I'd create a small utility to set the special registry flag and install it with the application. If users reported problems, your technical support engineers could have them run the utility and ensure that they have the logging turned on. Providing a utility to set the registry flag also relieves your support staff of having to take a novice user on a long and potentially damaging trip through the registry over the phone.
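For the environment-variable option, a minimal portable C++11 sketch might look like the following. The function names are hypothetical, and the rule that only the exact value "1" turns logging on is an assumption chosen for the example; the registry option would use the Windows registry API instead and is not shown.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Interprets a raw setting value: only the exact string "1" enables logging.
bool IsLoggingValue(const char* value)
{
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// Reads the named environment variable; an unset variable means "off".
bool LoggingEnabledFromEnvironment(const char* variableName)
{
    assert(variableName != nullptr && variableName[0] != '\0');
    return IsLoggingValue(std::getenv(variableName));
}
```

At startup you would call LoggingEnabledFromEnvironment once with your own variable name (for example, a made-up MYAPP_LOGGING) and store the result in the global flag, so that the per-call check stays a simple variable test.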