14.6 Dynamic-analysis tools

14.6.1 Dynamic analysis

There are several reasons why dynamic analysis may be more effective than static analysis in diagnosing a bug. Dynamic analysis can include system and third-party libraries, because source code isn’t required. It evaluates only the code that is executed, so it is likely to take less system time than static analysis. And with dynamic analysis, an error is caught right before it occurs.

On the other hand, dynamic analysis isn’t perfect. Because it evaluates only code that is executed, it won’t find defects in code not exercised by your tests. To get the full benefit from a dynamic-analysis tool, you need a comprehensive test suite in place. You should also have a test coverage tool that enables you to determine whether your test suite actually exercises all parts of your application.

14.6.2 Insure++

Insure++ is a product of ParaSoft. It is available on Windows™, as well as on popular versions of UNIX™ and Linux™.

14.6.2.1 Features

Insure++ detects the following pointer-reference problems:

References to null pointers
References to uninitialized pointers
References to pointers that don’t point to valid memory locations
Comparison of pointers that point to different memory blocks
Attempts to execute functions through pointer references that don’t actually point to functions

Insure++ detects the following memory-leak problems:

Freeing blocks of memory that contain pointers to other blocks of memory
Functions that ignore pointers to storage allocated by a subfunction and returned as a result
Functions that return without freeing storage pointed to by local pointers

Insure++ detects the following miscellaneous pointer problems:

Freeing a memory block more than once
Freeing static memory (global variables)
Freeing stack memory (local variables)
Applying free or delete to an address that isn’t the beginning of an allocated memory block
Applying free or delete to a null or uninitialized pointer
Passing invalid arguments to malloc, calloc, realloc, or free
Mismatching invocations of new [] and delete []
Mixing allocation and deallocation between malloc, calloc, realloc, or free and the new and delete operators
Invalid overloading of operators new and delete

Insure++ detects the following library API errors:

Mismatched argument types
Invalid argument values
Errors returned by library functions

Insure++ knows how to check system calls and X/Motif API calls, as well as calls to other popular libraries.

14.6.2.2 Technology

To get the full benefit of Insure++, you must recompile your source code, using the tool instead of your normal compiler. If you want to use the tool in Chaperon mode, you don’t need to recompile or relink. This mode doesn’t do error checking as extensively as normal mode. It operates on the executable program. The Chaperon mode slows down the application, but not nearly as much as the normal mode. Insure++ is based on patented technology (U.S. Patents 5,581,696 and 5,842,019), which is the basis for the following description:

The normal mode of operation for Insure++ is Source Code Instrumentation mode. In this mode, the tool is invoked instead of a compiler. The tool parses your program and identifies constructs that must be tracked at runtime. It generates revised source code for the constructs that must be monitored and passes this code to your regular compiler. When you run your program, the tool checks data values and memory references against a database to verify consistency and correctness.

There are seven types of instrumentation described in U.S. Patent 5,581,696:

Detecting a read of an uninitialized simple variable
Detecting a read or write through an invalid address for an aggregate variable
Detecting a dynamic-memory error while using a pointer variable
Detecting an invalid use of a pointer variable
Detecting a memory leak
Detecting a function call argument error
User-definable instrumentation

To find reads of uninitialized simple variables, Insure++ places calls to runtime procedures in two places. After space is allocated on the stack for the variable, it calls the runtime library to record its uninitialized state. Before the value of the variable is referenced, it makes another call that checks whether the variable has been assigned since the previous call.
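The two call sites can be pictured with a small hand-instrumented sketch. The names rt_declare, rt_assign, and rt_check_read are hypothetical stand-ins for the runtime procedures, not the actual Insure++ runtime interface. The sketch also assumes that assignments are reported to the runtime so that the read check has something to consult, and it uses a fixed-size table only to stay short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical runtime: one record per tracked simple variable. */
#define MAX_TRACKED 64

struct var_record {
    const void *addr;   /* address of the tracked variable */
    int assigned;       /* 0 until the first assignment    */
};

static struct var_record records[MAX_TRACKED];
static size_t n_records;

static struct var_record *rt_find(const void *addr)
{
    for (size_t i = 0; i < n_records; i++)
        if (records[i].addr == addr)
            return &records[i];
    return NULL;
}

/* Inserted after stack space is allocated for the variable. */
void rt_declare(const void *addr)
{
    struct var_record *r = rt_find(addr);
    if (r == NULL) {
        assert(n_records < MAX_TRACKED);
        r = &records[n_records++];
        r->addr = addr;
    }
    r->assigned = 0;    /* a fresh declaration is uninitialized */
}

/* Assumed helper: inserted after each assignment to the variable. */
void rt_assign(const void *addr)
{
    struct var_record *r = rt_find(addr);
    if (r != NULL)
        r->assigned = 1;
}

/* Inserted before each read. Returns 1 if the variable has been
   assigned since it was declared, 0 otherwise. */
int rt_check_read(const void *addr)
{
    struct var_record *r = rt_find(addr);
    return r != NULL && r->assigned;
}
```

In instrumented code, a declaration such as int x; would be followed by rt_declare(&x), and an expression that reads x would be preceded by rt_check_read(&x).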

To find reads or writes through an invalid address for an aggregate variable, Insure++ places calls to runtime procedures in two places. After the declaration of an aggregate variable, Insure++ places a call to the runtime library to record the length of a homogeneous aggregate (array) or the size of an inhomogeneous aggregate (structure). Before the value of an element is referenced or assigned, it calls a runtime routine that checks whether the address generated is within the bounds of the aggregate.
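As a rough illustration of this bounds check, the sketch below records the size of each aggregate and validates a byte offset before an access. The function names and the table layout are assumptions made for this example and are not taken from Insure++.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_AGGREGATES 64

struct agg_record {
    uintptr_t start;    /* address of the first byte of the aggregate */
    size_t size;        /* total size of the aggregate in bytes       */
};

static struct agg_record aggs[MAX_AGGREGATES];
static size_t n_aggs;

/* Inserted after the declaration of an array or structure. */
void rt_declare_aggregate(const void *start, size_t size)
{
    assert(n_aggs < MAX_AGGREGATES);
    aggs[n_aggs].start = (uintptr_t)start;
    aggs[n_aggs].size = size;
    n_aggs++;
}

/* Inserted before an element is read or assigned. base identifies the
   aggregate, offset is the byte offset produced by the subscript or
   member calculation, and width is the size of the access.
   Returns 1 if the whole access lies inside the aggregate. */
int rt_check_access(const void *base, size_t offset, size_t width)
{
    for (size_t i = 0; i < n_aggs; i++) {
        if (aggs[i].start == (uintptr_t)base)
            return width <= aggs[i].size &&
                   offset <= aggs[i].size - width;
    }
    return 0;           /* unknown aggregate: treat as invalid */
}
```

The check works on offsets rather than on pointers so that the example never has to form an out-of-bounds address itself.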

There are six problems that Insure++ categorizes as dynamic-memory errors while using a pointer variable:

Reading from or writing through an invalid pointer
Passing an invalid pointer as a function argument
Returning an invalid pointer as a function result
Freeing the same memory block more than once
Freeing addresses that are on the stack
Freeing addresses that don’t point to the beginning of a memory block

To find errors using pointer variables, Insure++ places calls to runtime procedures in several places. After the declaration of a pointer variable, Insure++ places a call to the runtime library that generates a record of the pointer variable and its contents. After memory allocation operations that are assigned to a pointer variable, it makes a call to record the size and starting address of the memory block, addresses of pointers that point to or are contained by the memory block, and miscellaneous information. After assignments to pointer variables, it makes a call to note that the pointer contents may be changed or invalid.

Once these are set up, it’s possible to check for dynamic-memory errors. Before a pointer variable is used to read or write memory, a call is made to check that the pointer contains a valid address. Before a pointer variable is passed as an argument or is returned as a result, it makes a call to check that the pointer contains a valid address. Before a memory deallocation, it makes a call to check that the address being freed isn’t already freed, isn’t a stack address, and is the start of a memory block.
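The deallocation check can be sketched as a lookup in a table of known memory blocks. Everything here is hypothetical: the record layout, the function names, and the status codes are illustrative assumptions, not the data structures Insure++ actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCKS 64

enum free_status { FREE_OK, FREE_TWICE, FREE_NOT_BLOCK_START, FREE_NOT_HEAP };

struct block_record {
    uintptr_t start;    /* starting address of the block */
    size_t size;        /* size of the block in bytes    */
    int freed;          /* nonzero once the block is freed */
};

static struct block_record blocks[MAX_BLOCKS];
static size_t n_blocks;

/* Inserted after a memory allocation is assigned to a pointer. */
void rt_record_alloc(const void *p, size_t size)
{
    uintptr_t start = (uintptr_t)p;
    size_t i;
    for (i = 0; i < n_blocks; i++)
        if (blocks[i].start == start)
            break;                  /* allocator reused an address */
    if (i == n_blocks) {
        assert(n_blocks < MAX_BLOCKS);
        n_blocks++;
    }
    blocks[i].start = start;
    blocks[i].size = size;
    blocks[i].freed = 0;
}

/* Inserted before a memory deallocation. */
enum free_status rt_check_free(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    for (size_t i = 0; i < n_blocks; i++) {
        struct block_record *b = &blocks[i];
        if (a == b->start) {
            if (b->freed)
                return FREE_TWICE;
            b->freed = 1;
            return FREE_OK;
        }
        if (a > b->start && a - b->start < b->size)
            return FREE_NOT_BLOCK_START;
    }
    return FREE_NOT_HEAP;           /* for example, a stack address */
}
```

A stack address is reported here simply because it matches no recorded block. A real tool could also compare the address against the stack bounds to give a more specific message.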

There are five problems that Insure++ categorizes as invalid uses of a pointer variable:

Operating on a null pointer
Operating on an uninitialized pointer
Operating on a pointer that doesn’t point to valid data
Comparing pointers that point to different objects
Invoking a function through a pointer that doesn’t point to a function

To find errors operating on pointer variables, Insure++ places calls to runtime procedures in two places. After the declaration of a pointer variable, Insure++ places a call to the runtime library that generates a record of the pointer variable and its contents. Before operations on the value of a pointer variable, Insure++ places a call to the runtime library that checks that none of the five problems mentioned above occur.

There are three problems that Insure++ categorizes as memory-leak errors:

An assignment to the only variable containing the address of a memory block before that block is freed
Returning the address of a memory block as a function result and not assigning that return value to a pointer variable
Returning from a function before freeing all memory blocks that are pointed to by local pointer variables

To find memory-leak errors, Insure++ places calls to runtime procedures in two places. After the declaration of a pointer variable, Insure++ places a call to the runtime library that generates a record of the pointer variable and its contents. Before operations that decrease the stack size, Insure++ places a call to the runtime library to note that local variables have gone out of scope. When these calls are evaluated, the runtime library checks to see if any memory blocks were only pointed to by local pointer variables. To increase the effectiveness of its leak tracking, Insure++ supplements the tracking described above with a reference count for memory blocks. Insure++ uses both a dynamic search for leaked blocks and a static scan through allocated memory at the end of program execution.
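The reference-count idea and the end-of-run scan can be approximated as follows. This is a simplified sketch under stated assumptions: the runtime is told about every change to a tracked pointer through the hypothetical rt_pointer_change call, and a block is considered leaked as soon as its count drops to zero while it is still allocated.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 64

struct block_record {
    const void *start;  /* starting address of the block          */
    int refs;           /* tracked pointers currently holding it  */
    int freed;          /* nonzero once the block has been freed  */
};

static struct block_record blocks[MAX_BLOCKS];
static size_t n_blocks;

static struct block_record *find_block(const void *p)
{
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n_blocks; i++)
        if (blocks[i].start == p && !blocks[i].freed)
            return &blocks[i];
    return NULL;
}

void rt_record_alloc(const void *p)
{
    assert(n_blocks < MAX_BLOCKS);
    blocks[n_blocks].start = p;
    blocks[n_blocks].refs = 0;
    blocks[n_blocks].freed = 0;
    n_blocks++;
}

void rt_record_free(const void *p)
{
    struct block_record *b = find_block(p);
    if (b != NULL)
        b->freed = 1;
}

/* Called when a tracked pointer changes from old_value to new_value.
   A pointer going out of scope is reported with new_value == NULL.
   Returns 1 if the change dropped the last reference to a live block. */
int rt_pointer_change(const void *old_value, const void *new_value)
{
    struct block_record *nb = find_block(new_value);
    struct block_record *ob = find_block(old_value);
    if (nb != NULL)
        nb->refs++;
    if (ob != NULL && --ob->refs == 0)
        return 1;
    return 0;
}

/* End-of-run scan: count blocks that were never freed. */
size_t rt_count_unfreed(void)
{
    size_t n = 0;
    for (size_t i = 0; i < n_blocks; i++)
        if (!blocks[i].freed)
            n++;
    return n;
}
```

The immediate report from rt_pointer_change corresponds to the dynamic search, and rt_count_unfreed corresponds to the scan at the end of execution.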

14.6.2.3 Usage

As with all dynamic techniques, the checking done by Insure++ is only as good as the coverage of the code that results from the test suite you use. Insure++ can be licensed with an optional module that does test coverage analysis. Use this optional module to increase your confidence level with the problem reports Insure++ generates.

14.6.3 BoundsChecker

BoundsChecker is a product of Compuware Corporation. It is available on Windows™ operating systems. It checks memory errors and API calls in C and C++ source code.

14.6.3.1 Features

You can use BoundsChecker in two modes. In ActiveCheck mode, you use BoundsChecker with Microsoft Visual Studio. When you run your program in Visual Studio, BoundsChecker will run in the background checking for errors. When the tool finds a bug, it stops your application and displays the error description, the call stack, and the source line where the problem occurred.

In FinalCheck mode, you use BoundsChecker with standalone Windows™ applications. As you build your application, BoundsChecker inserts error-detection code into the intermediate representation used by the Visual C++ compiler. When your application runs, the inserted code finds memory and pointer errors as before.

BoundsChecker finds pointer errors and memory leaks in the usage of static, automatic (stack), and dynamic (heap) memory.

BoundsChecker validates calls to Windows™, ODBC, ActiveX, DirectX, COM, and Internet APIs. It will check for the following API errors:

Invalid parameters
Invalid return codes
Wrong number of parameters
Out-of-range parameters
Invalid flags
Uninitialized fields
Invalid pointers

You can extend it to check calls to custom libraries (DLLs) that you produce.

You can customize the analysis and output of BoundsChecker to fit your needs. You can control which types of errors to check for, which files and modules to check, and which reports to suppress. You can also control which Windows™ version should be used to check for API errors.

14.6.3.2 Technology

BoundsChecker instruments the intermediate representation generated by the Visual C++ compiler. Modifying this representation is faster than generating modified source code. BoundsChecker doesn’t have to write and read the modified source file or analyze the lexical and syntactic structure of the modified program. These time savings can speed up the edit-compile-run cycle significantly in a large program.

Modifying the compiler’s internal representation provides more context information than modifying the object code. Tools that modify object code can’t relate an error back to source code without extra annotations provided by the compiler, which may or may not be available.

These points are advantages of BoundsChecker over its competitors, Insure++ and Purify. The downside of its tight integration with the Visual C++ compiler is that it is available only on Windows™ platforms. In addition, BoundsChecker versions 7.0 and later don’t support Windows 98.

14.6.3.3 Usage

As with all dynamic techniques, the checking done by BoundsChecker is only as good as the coverage of the code that results from the test suite you use. BoundsChecker can be licensed with an optional module, TrueCoverage, which does test coverage analysis. Use this optional module to increase your confidence level with the problem reports BoundsChecker generates.

14.6.4 Purify

Purify is a product of the Rational Corporation. It is available on Windows™ and UNIX™ operating systems. It checks memory errors and API calls in C and C++ source code, and garbage-collection problems in Java code.

14.6.4.1 Features

The following problems are found by Purify:

Reading or writing beyond memory block bounds
Reading or writing freed memory
Freeing memory multiple times
Reading uninitialized memory
Reading or writing through invalid or null pointers
Reading or writing beyond stack end
Overflowing the stack
Memory leaks
File descriptor leaks
Windows™ API usage errors
COM API usage errors

The Windows™ product is integrated with Microsoft Visual Studio, and the UNIX™ product provides a GUI. The level of checking can be set to minimal or precise.

14.6.4.2 Technology

Purify works by modifying the object code used to build your application. This means that it’s useful for debugging assembly code, as well as code generated by compilers. It will even modify object code that comes from system libraries. Of course, there isn’t much you can do with problems for which you don’t have the source code, except report them to someone else.

Purify is based on patented technology (U.S. Patent 5,335,344), which is the basis for the following description:

Purify modifies object files, which originate from compiled code, assembled code, or archive libraries. It inserts instructions in front of every instruction that accesses data from memory. These instructions call functions from a special library. After inserting the instructions, it performs necessary changes to symbol tables, instruction-relocation structures, or data-relocation structures as needed.

Purify monitors the state of memory with two bits of state per byte of accessible memory. One bit refers to the allocation status of the memory; the other bit refers to the initialization status. The extra instructions that are inserted before the original set read these bits to detect invalid memory accesses.
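A minimal sketch of such a two-bit state table is shown below. It models a small arena addressed by byte offset rather than real process memory, and all names are assumptions made for illustration. It is not Purify's implementation.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 64u          /* bytes of monitored memory (simulated) */
#define ST_ALLOCATED   1u       /* bit 0: allocation status     */
#define ST_INITIALIZED 2u       /* bit 1: initialization status */

enum access_result { ACCESS_OK, ERR_UNALLOCATED, ERR_UNINITIALIZED };

/* Two bits per monitored byte, packed four states to a shadow byte. */
static unsigned char shadow[ARENA_SIZE / 4];

static unsigned get_state(size_t i)
{
    return (shadow[i / 4] >> ((i % 4) * 2)) & 3u;
}

static void set_state(size_t i, unsigned state)
{
    unsigned shift = (unsigned)(i % 4) * 2;
    unsigned cleared = shadow[i / 4] & ~(3u << shift);
    shadow[i / 4] = (unsigned char)(cleared | (state << shift));
}

static int in_arena(size_t off, size_t len)
{
    return off <= ARENA_SIZE && len <= ARENA_SIZE - off;
}

void shadow_alloc(size_t off, size_t len)
{
    assert(in_arena(off, len));
    for (size_t i = off; i < off + len; i++)
        set_state(i, ST_ALLOCATED);     /* allocated, not yet written */
}

void shadow_free(size_t off, size_t len)
{
    assert(in_arena(off, len));
    for (size_t i = off; i < off + len; i++)
        set_state(i, 0);
}

/* Check inserted before a store; marks the bytes initialized. */
enum access_result shadow_check_write(size_t off, size_t len)
{
    if (!in_arena(off, len))
        return ERR_UNALLOCATED;
    for (size_t i = off; i < off + len; i++)
        if (!(get_state(i) & ST_ALLOCATED))
            return ERR_UNALLOCATED;
    for (size_t i = off; i < off + len; i++)
        set_state(i, ST_ALLOCATED | ST_INITIALIZED);
    return ACCESS_OK;
}

/* Check inserted before a load. */
enum access_result shadow_check_read(size_t off, size_t len)
{
    if (!in_arena(off, len))
        return ERR_UNALLOCATED;
    for (size_t i = off; i < off + len; i++) {
        if (!(get_state(i) & ST_ALLOCATED))
            return ERR_UNALLOCATED;
        if (!(get_state(i) & ST_INITIALIZED))
            return ERR_UNINITIALIZED;
    }
    return ACCESS_OK;
}
```

Bytes that are never passed to shadow_alloc stay in the unallocated, uninitialized state, so any access that strays into them is reported.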

Purify also modifies the data sections of object files to insert dummy storage around the original application variables. It notes that these dummy variables are unallocated and uninitialized, as far as the original application is concerned, and thus any attempt to read or write them is an error.

To check access to stack memory, Purify inserts code before instructions that change the stack pointer. Increasing the stack is treated as a memory allocation, and decreasing it as a deallocation. Purify uses the same checking methods for heap and stack memory. Rather than look up each byte in the stack, Purify uses a shortcut convention where it compares the address to be accessed with the current stack pointer. If the address is beyond the stack pointer, the memory is treated as unallocated.
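The stack-pointer shortcut amounts to a pair of comparisons. The sketch below simulates it with plain integers and assumes a stack that grows toward lower addresses, so "beyond the stack pointer" means a numerically smaller address. The base address and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated machine state. The stack occupies addresses from
   stack_pointer up to, but not including, stack_base. */
static uintptr_t stack_base = 0x8000;
static uintptr_t stack_pointer = 0x8000;

/* Growing the stack is treated as an allocation. */
void on_stack_grow(uintptr_t bytes)
{
    assert(bytes <= stack_pointer);
    stack_pointer -= bytes;
}

/* Shrinking the stack is treated as a deallocation. */
void on_stack_shrink(uintptr_t bytes)
{
    assert(bytes <= stack_base - stack_pointer);
    stack_pointer += bytes;
}

/* Returns 1 if addr lies in the live part of the stack. No per-byte
   lookup is needed: one comparison with the stack pointer decides. */
int stack_address_allocated(uintptr_t addr)
{
    return addr >= stack_pointer && addr < stack_base;
}
```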

To track the allocation and deallocation of heap memory, Purify replaces references to malloc and free with references to functions in the Purify library. These functions call the original library functions after they record information in the Purify data structures about the memory allocated or freed.
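The replacement functions can be thought of as thin wrappers around the real allocator. The following sketch uses hypothetical names (traced_malloc, traced_free, traced_live_bytes) and a small fixed table. It illustrates the wrapping technique in general and says nothing about how the Purify library is actually organized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 128

static void *live_ptr[MAX_LIVE];    /* addresses of live blocks */
static size_t live_size[MAX_LIVE];  /* their sizes in bytes     */
static size_t n_live;

/* Stand-in for malloc: allocate, then record the block. */
void *traced_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        assert(n_live < MAX_LIVE);
        live_ptr[n_live] = p;
        live_size[n_live] = size;
        n_live++;
    }
    return p;
}

/* Stand-in for free: release the block only if it is known to be
   live. Returns 1 on success, 0 if the address was rejected. */
int traced_free(void *p)
{
    for (size_t i = 0; i < n_live; i++) {
        if (live_ptr[i] == p) {
            live_ptr[i] = live_ptr[n_live - 1];
            live_size[i] = live_size[n_live - 1];
            n_live--;
            free(p);
            return 1;
        }
    }
    return 0;   /* not a live heap block */
}

/* Total number of bytes currently allocated through the wrappers. */
size_t traced_live_bytes(void)
{
    size_t total = 0;
    for (size_t i = 0; i < n_live; i++)
        total += live_size[i];
    return total;
}
```

Because every allocation passes through the wrappers, whatever remains in the table when the program exits is a candidate leak.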

Purify provides for watch points to be set on monitored memory by recording a state of unallocated and uninitialized for the address and putting the address on a special list. When a reference to the address is detected, the address is looked up on the list of watch points; if it’s found, the watch point procedure is called instead of normal error reporting.

14.6.4.3 Usage

Under what circumstances does it make sense to use Purify? Purify handles C and C++ source code. Most of its features aren’t relevant to Java because of error-checking features built into that language. It does, however, provide support for the analysis of garbage-collection issues in Java.

The Purify literature says that it performs checks for “array bound read and write errors.” It defines an array as a block of contiguous memory. This isn’t the same as array-subscript checking. Purify doesn’t deal with array subscripts, but with addresses.

Purify will find array-subscript errors for simple arrays, which are the vast majority of arrays used in typical C and C++ code. It is possible, however, to have array-subscript errors in arrays of arrays, in which the address generated from subscript calculations is wrong, but is within the bounds of the memory block. Purify will not, in general, find these errors.
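A short calculation shows why an address-based check misses this kind of error. The sketch assumes a hypothetical array declared as int grid[3][4] and compares a subscript check with a check on the generated address. The function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 3
#define COLS 4

/* Byte offset that grid[i][j] generates for int grid[ROWS][COLS]. */
size_t element_offset(size_t i, size_t j)
{
    return (i * COLS + j) * sizeof(int);
}

/* Address-based check: is the generated address inside the block? */
int address_in_block(size_t i, size_t j)
{
    return element_offset(i, j) < ROWS * COLS * sizeof(int);
}

/* Subscript-based check: is each subscript within its own bound? */
int subscripts_valid(size_t i, size_t j)
{
    return i < ROWS && j < COLS;
}
```

The reference grid[0][5] has an invalid column subscript, yet it generates the same offset as grid[1][1], so address_in_block accepts it while subscripts_valid rejects it.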

14.6.5 mpatrol

mpatrol is an open-source memory allocation library that can help you find runtime memory problems. A comprehensive set of tools is provided with the library to make it useful for a variety of memory debugging tasks. It works with both C and C++ memory allocation functionality. It has been ported to Windows™, Linux™, and numerous UNIX™ implementations. It is available for download from several Web sites, including SourceForge (sourceforge.net/projects/mpatrol) and FreshMeat (freshmeat.net/projects/mpatrol).

14.6.5.1 Features

The level of modification that you must make to your application depends on the operating system you’re using. In some cases, you must relink your application to use mpatrol. In other cases, you can attach it to your application when it executes. To use it as a substitute for the other dynamic tools described in this section, you must recompile your sources, including a single mpatrol header file.

mpatrol creates a comprehensive log of all dynamic memory operations that occur while a program is executing. mpatrol performs extensive runtime checking for invalid operations performed on dynamically allocated memory. Numerous library settings can be changed at runtime through environment variables and command-line options.

mpatrol includes a complete set of replacements for those C and C++ library functions that allocate and manipulate memory. This includes C dynamic-memory-allocation functions, C++ dynamic-memory-allocation operators, and C memory-manipulation functions.

mpatrol can produce a summary of memory-allocation statistics. An accompanying tool can read and summarize this information and generate profile reports.

mpatrol can produce a concise trace of all memory allocations, reallocations, and deallocations. An accompanying tool can read this file and generate a trace of the memory events in a tabular or graphical format.

14.6.5.2 Technology

mpatrol coexists with the gcc command-line option -fcheck-memory-usage. This option tells the compiler to place calls to functions that check each memory access. This option supports the GNU Checker tool. The Checker tool itself doesn’t coexist with mpatrol.

Memory operation logging includes a call stack traceback wherever possible. Since these logs can be quite large, you can request that memory-allocation events be recorded in a very concise format suitable for analysis by tracing.

When you log allocation information, you can specify that allocations are recorded in a special leak table. The leak table records the memory-allocation behavior between two points in a program. The library provides functions that can dynamically start and stop recording in the leak table.

You can have all allocated and freed memory filled with special bytes to track operations that are using uninitialized or freed memory. You can also keep some or all freed memory blocks to track these types of errors.
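The fill-byte technique, combined with keeping freed blocks around, can be sketched in a few lines. The fill values 0xAA and 0x55 and the dbg_ function names are arbitrary choices for this example. They are not mpatrol's defaults or its API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ALLOC_FILL 0xAA     /* arbitrary pattern for fresh memory */
#define FREE_FILL  0x55     /* arbitrary pattern for freed memory */

struct dbg_block {
    unsigned char *data;
    size_t size;
    int freed;
};

/* Allocate a block and fill it so that reads of uninitialized
   memory return a recognizable pattern. Returns 1 on success. */
int dbg_alloc(struct dbg_block *b, size_t size)
{
    b->data = malloc(size);
    if (b->data == NULL)
        return 0;
    b->size = size;
    b->freed = 0;
    memset(b->data, ALLOC_FILL, size);
    return 1;
}

/* Logically free the block but keep its memory, filled with the
   free pattern, so that later writes to it can be detected. */
void dbg_free(struct dbg_block *b)
{
    assert(!b->freed);
    memset(b->data, FREE_FILL, b->size);
    b->freed = 1;
}

/* Returns 1 if a freed block still holds only the free pattern. */
int dbg_freed_block_intact(const struct dbg_block *b)
{
    assert(b->freed);
    for (size_t i = 0; i < b->size; i++)
        if (b->data[i] != FREE_FILL)
            return 0;
    return 1;
}

/* Really return the memory to the system. */
void dbg_release(struct dbg_block *b)
{
    free(b->data);
    b->data = NULL;
}
```

A write through a stale pointer disturbs the free pattern, which dbg_freed_block_intact then reports. The cost is that retained blocks are not available for reuse until they are released.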

You can have special buffers placed on either side of each block of allocated memory, prefilled with a special value, to catch code that is running off either end of a memory block. mpatrol also provides features to allocate these buffers in write-protected memory.
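The guard-buffer idea can be illustrated with a small wrapper around malloc. The guard size, the guard byte, and the guarded_ function names are assumptions made for this sketch and do not reflect mpatrol's actual settings or interface. The sketch also assumes that an 8-byte offset keeps the returned pointer adequately aligned for the data stored in it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 8        /* bytes of guard on each side (arbitrary) */
#define GUARD_BYTE 0xC3     /* arbitrary prefill value                 */

/* Allocate size usable bytes with a prefilled guard on each side. */
void *guarded_alloc(size_t size)
{
    unsigned char *raw = malloc(size + 2 * GUARD_SIZE);
    if (raw == NULL)
        return NULL;
    memset(raw, GUARD_BYTE, GUARD_SIZE);
    memset(raw + GUARD_SIZE + size, GUARD_BYTE, GUARD_SIZE);
    return raw + GUARD_SIZE;
}

/* Returns 0 if both guards are intact, -1 if the guard before the
   block was overwritten, 1 if the guard after it was overwritten. */
int guarded_check(const void *user, size_t size)
{
    const unsigned char *raw = (const unsigned char *)user - GUARD_SIZE;
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (raw[i] != GUARD_BYTE)
            return -1;
        if (raw[GUARD_SIZE + size + i] != GUARD_BYTE)
            return 1;
    }
    return 0;
}

void guarded_free(void *user)
{
    if (user != NULL)
        free((unsigned char *)user - GUARD_SIZE);
}
```

Only writes change the guard contents, so a check of this kind cannot see a read that runs past the end of the block. The damage is also discovered when the check runs, not when the bad write happens.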

mpatrol provides convenient symbols that you can use to set breakpoints in the allocation library and procedures for printing information about memory allocations. It also provides hooks so that you can execute your own procedures when the library starts up and shuts down, and each time a memory block is allocated or freed. There are functions to query for detailed information about any memory block, to iterate over every allocated or freed memory block, and to check the library’s data structures. There are numerous functions that can be called to generate output to the log file.

There are a number of utility programs included with the library. mleak checks the log file for memory leaks. mprof is analogous to the popular UNIX™ gprof command and generates summaries of allocation behavior on a call graph basis. mptrace can be used to trace the history of every memory allocation and can generate output in graphical or tabular form on systems that support the X Window System.

14.6.5.3 Usage

Under what circumstances does it make sense to use mpatrol? It supports both C and C++ sources. It has been ported to more operating systems and hardware platforms than any other tool in this chapter. It is well integrated with the GNU program development environment. It can be built as a thread-safe library on a number of platforms, although the serialization is done at a somewhat coarse level.

It provides more information about memory allocation than the other dynamic tools, but collecting this information comes with a runtime performance penalty. If you need a high degree of flexibility and control for your analysis, this penalty will be justified.

The method of placing buffers around allocated blocks is useful for detecting writes off the ends of blocks, but doesn’t catch reads off the ends. The method of using protected virtual memory pages to detect these errors can be prohibitively expensive in terms of memory space. If you think that your problem is reading or writing past the end of a block, mpatrol may not be the best way to find your problem, unless you can use the gcc compiler with the -fcheck-memory-usage option.

14.6.6 Examples

The following listings show the relevant output of mpatrol when used to diagnose the bugs described.

14.6.6.1 mpatrol output for case 1, bug 3

    @(#) mpatrol 1.4.8 (02/01/08)
    Copyright (C) 1997-2002 Graeme S. Roy
    ...
    operating system:       UNIX
    system variant:         Linux
    processor architecture: Intel 80x86
    ...
    allocation peak:   22 (456907 bytes)
    allocation limit:  0 bytes
    allocated blocks:  15 (1871 bytes)
    marked blocks:     0 (0 bytes)
    freed blocks:      0 (0 bytes)
    free blocks:       3 (518321 bytes)
    internal blocks:   35 (573440 bytes)
    total heap usage:  1093632 bytes
    total compared:    0 bytes
    total located:     0 bytes
    total copied:      0 bytes
    total set:         594984 bytes
    total warnings:    0
    total errors:      0
    ERROR: [ILLMEM]: illegal memory access at address 0x00000008
        0x00000008 not in heap
        call stack
            0x08049FA9 insert__4Heap+53
            0x08049F3C __4HeapiPi+212
            0x0804B5CF main+75
            0x400FC177 __libc_start_main+147
            0x08049D51 _start+33

14.6.6.2 mpatrol output for case 2, C++ version, part 1

    @(#) mpatrol 1.4.8 (02/01/08)
    Copyright (C) 1997-2002 Graeme S. Roy
    ...
    operating system:       UNIX
    system variant:         Linux
    processor architecture: Intel 80x86
    ...
    MEMCOPY: memmove (0x08076BF4, 0x08079A0C, 10572 bytes, 0x00) [-|-|-]
        0x08062BAD __copy_trivial__H1Zi_PCX01T0PX01_PX01+33
        0x080623CE copy__t15__copy_dispatch3ZPCiZPiZ11__true_typePCiT1Pi+26
        0x080615DA copy__H2ZPCiZPi_X01X01X11_X11+26
        0x08062786 __uninitialized_copy_aux__H2ZPCiZPi_X01X01X11G11__true_type_X11+30
        0x08061FE3 __uninitialized_copy__H3ZPCiZPiZi_X01X01X11PX21_X11+35
        0x08060821 uninitialized_copy__H2ZPCiZPi_X01X01X11_X11+45
        0x0805F5F4 __t6vector2ZiZt9allocator1ZiRCt6vector2ZiZt9allocator1Zi+120
        0x0804B0C8 lexSortTuples__FGt6vector2ZPt6vector2ZiZt9allocator1ZiZt9allocator1ZPt6vector2ZiZt9allocator1Zi+3192
        0x0804BC87 test__FiPiT1+67
        0x0804BE1E main+30
        0x400FC177 __libc_start_main+147
        0x0804A371 _start+33
    ERROR: [RNGOVF]: memmove: range [0x08076BF4,0x0807953F] overflows [0x080768FC,0x08076E13]
        0x080768FC (1304 bytes) {malloc:80:0} [-|-|-]
            0x08063607 _S_chunk_alloc__t24__default_alloc_template2b1i0UiRi+239
            0x08063388 _S_refill__t24__default_alloc_template2b1i0Ui+28
            0x08062F6E allocate__t24__default_alloc_template2b1i0Ui+122
            0x08062751 allocate__t12simple_alloc2ZiZt24__default_alloc_template2b1i0Ui+25
            0x08061FA5 _M_allocate__t18_Vector_alloc_base3ZiZt9allocator1Zib1Ui+21
            0x08061397 _M_insert_aux__t6vector2ZiZt9allocator1ZiPiRCi+179
            0x0805FDAF push_back__t6vector2ZiZt9allocator1ZiRCi+83
            0x0804BB15 makeTuples__FiPiT1+305
            0x0804BC5C test__FiPiT1+24
            0x0804BE1E main+30
            0x400FC177 __libc_start_main+147
            0x0804A371 _start+33

14.6.6.3 mpatrol output for case 2, C++ version, part 2

    allocation count:  121
    allocation peak:   54 (456907 bytes)
    allocation limit:  0 bytes
    allocated blocks:  54 (25427 bytes)
    marked blocks:     0 (0 bytes)
    freed blocks:      0 (0 bytes)
    free blocks:       3 (494765 bytes)
    internal blocks:   38 (622592 bytes)
    total heap usage:  1142784 bytes
    total compared:    0 bytes
    total located:     0 bytes
    total copied:      1852 bytes
    total set:         609144 bytes
    total warnings:    0
    total errors:      1
    ERROR: [ILLMEM]: illegal memory access at address 0x00000007
        0x00000007 not in heap
        call stack
            0x08060499 size__Ct6vector2Zt6vector2ZiZt9allocator1ZiZt9allocator1Zt6vector2ZiZt9allocator1Zi+17
            0x0806011E __t6vector2Zt6vector2ZiZt9allocator1ZiZt9allocator1Zt6vector2ZiZt9allocator1ZiRCt6vector2Zt6vector2ZiZt9allocator1ZiZt9allocator1Zt6vector2ZiZt9allocator1Zi+42
            0x0804B136 lexSortTuples__FGt6vector2ZPt6vector2ZiZt9allocator1ZiZt9allocator1ZPt6vector2ZiZt9allocator1Zi+3302
            0x0804BC87 test__FiPiT1+67
            0x0804BE1E main+30
            0x400FC177 __libc_start_main+147
            0x0804A371 _start+33

14.6.6.4 mpatrol output for case 3, bug 3

    @(#) mpatrol 1.4.8 (02/01/08)
    Copyright (C) 1997-2002 Graeme S. Roy
    ...
    operating system:       UNIX
    system variant:         Linux
    processor architecture: Intel 80x86
    ...
    allocation count:  94
    allocation peak:   26 (456907 bytes)
    allocation limit:  0 bytes
    allocated blocks:  26 (3079 bytes)
    marked blocks:     0 (0 bytes)
    freed blocks:      0 (0 bytes)
    free blocks:       4 (517113 bytes)
    internal blocks:   36 (589824 bytes)
    total heap usage:  1110016 bytes
    total compared:    0 bytes
    total located:     0 bytes
    total copied:      0 bytes
    total set:         596664 bytes
    total warnings:    0
    total errors:      0
    ERROR: [ILLMEM]: illegal memory access at address 0x00000000
        0x00000000 not in heap
        call stack
            0x0804C109 main+137
            0x400FC177 __libc_start_main+147
            0x08049DE1 _start+33