7.2. Memory-Checking Tools Included in glibc

The GNU C Library (glibc) includes three simple memory-checking tools. The first two, mcheck() and MALLOC_CHECK_, enforce heap data structure consistency checking, and the third, mtrace(), traces memory allocation and deallocation for later processing.

7.2.1. Finding Memory Heap Corruption

When memory is allocated from the heap, the memory management functions need someplace to store information about the allocations. That place is the heap itself, so the heap is composed of alternating areas of memory that are used by the program and by the memory management functions themselves. As a result, buffer overflows or underruns can damage the data structures that the memory management functions use to keep track of what memory has been allocated. When this happens, all bets are off, except that it is a pretty good bet that the memory management functions will eventually cause the program to crash.

If you set the MALLOC_CHECK_ environment variable, a different and somewhat slower set of memory management functions is chosen that is more tolerant of errors and can check for calling free() more than once on the same pointer and for single-byte buffer overflows. If MALLOC_CHECK_ is set to 0, the memory management functions are simply more tolerant of errors but do not give warnings. If MALLOC_CHECK_ is set to 1, the memory management functions print warning messages on standard error when they notice problems. If MALLOC_CHECK_ is set to 2, the memory management functions call abort() when they notice problems.

Setting MALLOC_CHECK_ to 0 may be useful if you are prevented from finding one memory bug by another that is not convenient to fix at the moment; it might allow you to use other tools to chase down the other memory bug.
It may also be useful if you are running code that works on another system but not on Linux and you want a quick workaround that may allow the code to function temporarily, before you have a chance to resolve the error.

Setting MALLOC_CHECK_ to 1 is useful if you are not aware of any problems and just want to be notified if any exist.

Setting MALLOC_CHECK_ to 2 is most useful from inside the debugger, because it lets you get a backtrace as soon as the memory management functions discover the error, which gets you closest to the point at which the error happened.

    $ MALLOC_CHECK_=1 ./broken
    malloc: using debugging hooks
    1: 12345
    free(): invalid pointer 0x80ac008!
    2: 12345678
    3: 12345678
    4: 12345
    5: 12345
    6: 12345
    7: 12345

    $ MALLOC_CHECK_=2 gdb ./broken
    ...
    (gdb) run
    Starting program: /usr/src/lad/code/broken
    1: 12345

    Program received signal SIGABRT, Aborted.
    0x00c64c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
    (gdb) where
    #0  0x00c64c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
    #1  0x00322969 in raise () from /lib/tls/libc.so.6
    #2  0x00324322 in abort () from /lib/tls/libc.so.6
    #3  0x0036d9af in free_check () from /lib/tls/libc.so.6
    #4  0x0036afa5 in free () from /lib/tls/libc.so.6
    #5  0x0804842b in broken () at broken.c:17
    #6  0x08048520 in main () at broken.c:47

Another way to ask glibc to do heap consistency checking is with the mcheck() function:

    typedef void (*mcheckCallback)(enum mcheck_status status);

    int mcheck(mcheckCallback cb);

When the mcheck() function has been called, malloc() places known byte sequences before and after the returned memory region in order to make it possible to spot buffer overflow and buffer underrun conditions. free() looks for those signatures, and if they have been disturbed, it calls the function pointed to by the cb parameter. If cb is NULL, the library exits instead.
Running a program linked against mcheck() through gdb can show you exactly which memory regions have been corrupted, as long as those regions are properly free()ed. However, the mcheck() method does not pinpoint exactly where the corruption occurred; it is up to the programmer to figure that out based on an understanding of the program flow. Linking our test program against the mcheck library yields the following results:

    $ gcc -ggdb -o broken broken.c -lmcheck
    $ ./broken
    1: 12345
    memory clobbered past end of allocated block

Because mcheck merely complains and exits, this does not really pinpoint the error. To pinpoint the error, you need to run the program inside gdb and tell mcheck to abort() when it notices a problem. You can simply call mcheck() from within gdb, or you can call mcheck() with an aborting callback as the first statement of your program (before you ever call malloc()). (Note that you can call mcheck() from within gdb without linking your program against the mcheck library!)

    $ rm -f broken; make broken
    $ gdb broken
    ...
    (gdb) break main
    Breakpoint 1 at 0x80483f4: file broken.c, line 14.
    (gdb) command 1
    Type commands for when breakpoint 1 is hit, one per line.
    End with a line saying just "end".
    >call mcheck(&abort)
    >continue
    >end
    (gdb) run
    Starting program: /usr/src/lad/code/broken

    Breakpoint 1, main () at broken.c:14
    47          return broken();
    $1 = 0
    1: 12345

    Program received signal SIGABRT, Aborted.
    0x00e12c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
    (gdb) where
    #0  0x00e12c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
    #1  0x0072c969 in raise () from /lib/tls/libc.so.6
    #2  0x0072e322 in abort () from /lib/tls/libc.so.6
    #3  0x007792c4 in freehook () from /lib/tls/libc.so.6
    #4  0x00774fa5 in free () from /lib/tls/libc.so.6
    #5  0x0804842b in broken () at broken.c:17
    #6  0x08048520 in main () at broken.c:47

The important part of this is where it tells you that the problem was detected in broken.c at line 17.
That lets you see that the error was detected during the first free() call, which indicates the problem was in (or more precisely, bordering) the dyn memory region. (freehook() is just the hook that mcheck uses to do its consistency checks.)

mcheck does not help you find overflows or underruns in local or global variables, only in malloc()ed memory regions.

7.2.2. Using mtrace() to Track Allocations

A simple way to find all of a program's memory leaks is to log all its calls to malloc() and free(). When the program has completed, it is straightforward to match each malloc()ed block with the point at which it was free()ed, or to report a leak if it was never free()ed.

Unlike mcheck(), mtrace() has no library against which you can link to enable it. This is no great loss; you can use the same technique with gdb to start tracing. However, for mtrace() to enable tracing, the environment variable MALLOC_TRACE must be set to a valid filename: either an existing file that the process can write to (in which case it is truncated) or a filename that the process can create and write to.

    $ MALLOC_TRACE=mtrace.log gdb broken
    ...
    (gdb) break main
    Breakpoint 1 at 0x80483f4: file broken.c, line 14.
    (gdb) command 1
    Type commands for when breakpoint 1 is hit, one per line.
    End with a line saying just "end".
    >call mtrace()
    >continue
    >end
    (gdb) run
    Starting program: /usr/src/lad/code/broken

    Breakpoint 1, main () at broken.c:47
    47          return broken();
    $1 = 0
    1: 12345
    2: 12345678
    3: 12345678
    4: 12345
    5: 12345
    6: 12345
    7: 12345

    Program exited normally.
    (gdb) quit
    $ ls -l mtrace.log
    -rw-rw-r--  1 ewt ewt 220 Dec 27 23:41 mtrace.log
    $ mtrace ./broken mtrace.log

    Memory not freed:
    -----------------
       Address     Size     Caller
    0x09211378      0x5  at /usr/src/lad/code/broken.c:20

Note that the mtrace program has found the memory leak exactly.
The mtrace program can also report memory that is free()ed without ever having been allocated, provided that the bad free() shows up in the log file. In practice it rarely does, because the program should crash immediately when it attempts to free() the unallocated memory.