Creating and Reading a MAP File


Many people have asked me why I keep recommending that everyone create MAP files with their release builds. Simply put, MAP files are the only textual representation of your program's global symbols and source file and line number information. Although using CrashFinder is far easier than deciphering a MAP file, your MAP files can be read anywhere and anytime, without requiring a supporting program and without requiring all your program's binaries to get the same information. Trust me, at some point in the future, you're going to need to figure out where a crash happened on an older version of your software, and the only way you'll be able to find the information is with your MAP file.

MAP files are useful only for release builds because when creating a MAP file, the linker has to turn off incremental linking. Setting up to generate MAP files is much easier in Microsoft Visual C++ .NET than in previous versions. In your project Property Pages dialog box, Linker folder, Debugging property page, you just have to set the Generate Map File, Map Exports, and Map Lines fields to Yes. Doing so turns on the linker /MAP, /MAPINFO:EXPORTS, and /MAPINFO:LINES options. As you can guess, SettingsMaster from Chapter 9, with the default project files, will add these settings automatically.

If you're working on a real-world project, you probably have your binary files going to their own output directory. By default, the linker writes the MAP file to the same directory as the intermediate files, so you need to specify that the MAP file goes to the binary file output directory. In the Map File Name edit box, you can type $(OutDir)/$(ProjectName).map. The $(OutDir) is a built-in macro that the build system will substitute with the real output directory, and $(ProjectName) substitutes the project name. Figure 12-3 shows the completed MAP file settings for the release build of the MapDLL project, which is included with this book's sample files.

Figure 12-3: The MAP file settings in the project Property Pages dialog box

Although you might not need the MAP files in your day-to-day operation, chances are that you'll need them in the future. CrashFinder and your debugger rely on symbol tables and a symbol engine to read them. If the format of the symbol table changes or if you forget to save the Program Database (PDB) files, you're completely out of luck. Forgetting to save the PDB files is your fault, but you have no control over symbol table formats. They change frequently. For example, many people who upgraded from Microsoft Visual Studio 6 to Microsoft Visual Studio .NET noticed that tools such as CrashFinder quit working with programs compiled with Visual Studio .NET. Microsoft changed the symbol table format and does so on a regular basis. MAP files are your only savior at that time.

Even though you, as a developer, might be up to Windows Server 2008 with Visual Studio .NET 2007 Service Pack 6 in five years, I can assure you that you'll still have customers who will be running the software you released back in 2003. When they call you in alarm and give you a crash address, you could spend the next two days trying to find the Visual Studio .NET CDs so that you can read your saved PDB files. Or if you have the MAP files, you can find the problem in five minutes.

MAP File Contents

Listing 12-1 shows an example MAP file. The top part of the MAP file contains the module name, the timestamp indicating when LINK.EXE linked the module, and the preferred load address. After the header comes the section information that shows which sections the linker brought in from the various OBJ and LIB files.

Listing 12-1: Example MAP file

start example
 MapDLL

 Timestamp is 3e2b44a3 (Sun Jan 19 19:36:51 2003)

 Preferred load address is 03900000

 Start         Length     Name                   Class
 0001:00000000 00000304H .text                   CODE
 0002:00000000 00000028H .idata$5                DATA
 0002:00000030 000000f8H .rdata                  DATA
 0002:00000128 00000063H .rdata$debug            DATA
 0002:00000190 00000004H .rdata$sxdata           DATA
 0002:00000194 00000004H .rtc$IAA                DATA
 0002:00000198 00000004H .rtc$IZZ                DATA
 0002:0000019c 00000004H .rtc$TAA                DATA
 0002:000001a0 00000004H .rtc$TZZ                DATA
 0002:000001a4 00000014H .idata$2                DATA
 0002:000001b8 00000014H .idata$3                DATA
 0002:000001cc 00000028H .idata$4                DATA
 0002:000001f4 00000082H .idata$6                DATA
 0002:00000280 0000007bH .edata                  DATA
 0003:00000000 00000004H .CRT$XCA                DATA
 0003:00000004 00000004H .CRT$XCZ                DATA
 0003:00000008 00000004H .CRT$XIA                DATA
 0003:0000000c 00000004H .CRT$XIZ                DATA
 0003:00000010 00000004H .data                   DATA
 0003:00000014 00000014H .bss                    DATA

  Address         Publics by Value              Rva+Base     Lib:Object

 0000:00000001   ___safe_se_handler_count      00000001     <absolute>
 0001:00000000   _DllMain@12                   03901000 f   MapDLL.obj
 0001:00000006   ?MapDLLFunction@@YAHXZ        03901006 f   MapDLL.obj
 0001:00000023   ?MapDLLHappyFunc@@YAPADPAD@Z  03901023 f   MapDLL.obj
 0001:0000003c   __CRT_INIT@12                 0390103c f   MSVCRT:crtdll.obj
 0001:000000fa   __DllMainCRTStartup@12        039010fa f   MSVCRT:crtdll.obj
 0001:000001de   __initterm                    039011de f   MSVCRT:MSVCR71.dll
 0001:000001e4   __onexit                      039011e4 f   MSVCRT:atonexit.obj
 0001:0000020a   _atexit                       0390120a f   MSVCRT:atonexit.obj
 0001:0000021c   __RTC_Initialize              0390121c f   MSVCRT:initsect.obj
 0001:00000260   __RTC_Terminate               03901260 f   MSVCRT:initsect.obj
 0001:000002a4   ___CppXcptFilter              039012a4 f   MSVCRT:MSVCR71.dll
 0001:000002ac   __SEH_prolog                  039012ac f   MSVCRT:sehprolg.obj
 0001:000002e7   __SEH_epilog                  039012e7 f   MSVCRT:sehprolg.obj
 0001:000002f8   __except_handler3             039012f8 f   MSVCRT:MSVCR71.dll
 0001:000002fe   ___dllonexit                  039012fe f   MSVCRT:MSVCR71.dll
 0002:00000000   __imp__printf                 03902000     MSVCRT:MSVCR71.dll
 0002:00000004   __imp__free                   03902004     MSVCRT:MSVCR71.dll
 0002:00000008   __imp___initterm              03902008     MSVCRT:MSVCR71.dll
 0002:0000000c   __imp__malloc                 0390200c     MSVCRT:MSVCR71.dll
 0002:00000010   __imp___adjust_fdiv           03902010     MSVCRT:MSVCR71.dll
 0002:00000014   __imp____CppXcptFilter        03902014     MSVCRT:MSVCR71.dll
 0002:00000018   __imp___except_handler3       03902018     MSVCRT:MSVCR71.dll
 0002:0000001c   __imp____dllonexit            0390201c     MSVCRT:MSVCR71.dll
 0002:00000020   __imp___onexit                03902020     MSVCRT:MSVCR71.dll
 0002:00000024   \177MSVCR71_NULL_THUNK_DATA   03902024     MSVCRT:MSVCR71.dll
 0002:0000007c   ??_C@_0CE@EBHAJKCA@Whoops?0?5a?5crash?5is?5about?5to?5occu@
                                               0390207c     MapDLL.obj
 0002:000000a0   ??_C@_0CD@OILENIKO@Hello?5from?5InternalStaticFunctio@
                                               039020a0     MapDLL.obj
 0002:000000c4   ??_C@_0BM@DFMPKPOD@Hello?5from?5MapDLLFunction?$CB?6?$AA@
                                               039020c4     MapDLL.obj
 0002:000000e0   __load_config_used            039020e0     MSVCRT:loadcfg.obj
 0002:00000190   ___safe_se_handler_table      03902190     <linker-defined>
 0002:00000194   ___rtc_iaa                    03902194     MSVCRT:initsect.obj
 0002:00000198   ___rtc_izz                    03902198     MSVCRT:initsect.obj
 0002:0000019c   ___rtc_taa                    0390219c     MSVCRT:initsect.obj
 0002:000001a0   ___rtc_tzz                    039021a0     MSVCRT:initsect.obj
 0002:000001a4   __IMPORT_DESCRIPTOR_MSVCR71   039021a4     MSVCRT:MSVCR71.dll
 0002:000001b8   __NULL_IMPORT_DESCRIPTOR      039021b8     MSVCRT:MSVCR71.dll
 0003:00000000   ___xc_a                       03903000     MSVCRT:cinitexe.obj
 0003:00000004   ___xc_z                       03903004     MSVCRT:cinitexe.obj
 0003:00000008   ___xi_a                       03903008     MSVCRT:cinitexe.obj
 0003:0000000c   ___xi_z                       0390300c     MSVCRT:cinitexe.obj
 0003:00000010   ___security_cookie            03903010     MSVCRT:seccook.obj
 0003:00000018   __adjust_fdiv                 03903018     <common>
 0003:0000001c   ___onexitend                  0390301c     <common>
 0003:00000020   ___onexitbegin                03903020     <common>
 0003:00000024   __pRawDllMain                 03903024     <common>

 entry point at        0001:000000fa

 Static symbols

 0001:00000016   ?InternalStaticFunction@@YAXXZ 03901016 f   MapDLL.obj

Line numbers for .\Release\MapDLL.obj(d:\dev\booktwo\disk\chapter examples\chapter 12\mapfile\mapdll\mapdll.cpp) segment .text

    11 0001:00000000    20 0001:00000000    21 0001:00000003    26 0001:00000006
    25 0001:00000006    27 0001:00000012    28 0001:00000015    31 0001:00000016
    32 0001:00000016    33 0001:00000022    37 0001:00000023    36 0001:00000023
    38 0001:00000028    39 0001:00000033    41 0001:0000003b

Line numbers for R:\VSNET2003\Vc7\lib\MSVCRT.lib(f:\vs70builds\2292\vc\crtbld\crt\src\atonexit.c) segment .text

    81 0001:000001e4    76 0001:000001e4    90 0001:00000209    96 0001:0000020a
    95 0001:0000020a    97 0001:0000021b

Line numbers for R:\VSNET2003\Vc7\lib\MSVCRT.lib(f:\vs70builds\2292\vc\crtbld\crt\src\crtdll.c) segment .text

   134 0001:0000003c   129 0001:0000003c   135 0001:00000044   136 0001:0000004c
   158 0001:00000052   163 0001:00000065   168 0001:0000007a   170 0001:0000007e
   172 0001:00000081   178 0001:0000008b   179 0001:00000090   184 0001:0000009a
   189 0001:000000ab   192 0001:000000b2   219 0001:000000b8   220 0001:000000c1
   225 0001:000000c3   226 0001:000000cf   234 0001:000000e5   236 0001:000000ec
   234 0001:000000f3   240 0001:000000f4   241 0001:000000f7   249 0001:000000fa
   250 0001:00000106   252 0001:0000010c   257 0001:00000111   258 0001:0000011e
   260 0001:00000124   262 0001:0000012d   263 0001:00000136   265 0001:00000142
   266 0001:0000014b   268 0001:0000015a   269 0001:0000015c   272 0001:0000015e
   275 0001:0000016e   283 0001:00000177   286 0001:00000181   288 0001:0000018a
   289 0001:00000198   291 0001:0000019b   292 0001:000001a9   298 0001:000001b7
   294 0001:000001bc   295 0001:000001d4   299 0001:000001d6

 Exports

  ordinal    name

        1    ?MapDLLFunction@@YAHXZ (int __cdecl MapDLLFunction(void))
        2    ?MapDLLHappyFunc@@YAPADPAD@Z (char * __cdecl MapDLLHappyFunc(char *))
end example

After the section information, you get to the good stuff, the public function information. Notice the "public" part. If you have static-declared functions, they are placed in a similar table after the public functions table. Fortunately, the line numbers are not separated out and appear together.

The important parts of the public function information are the function names and the information in the Rva+Base column, which is the starting address of the function. The f after some of the Rva+Base addresses indicates that the address is an actual function and not a global variable or imported address of some kind. The line information follows the public function section. The lines are shown as follows:

24 0001:00000006

The first number is the line number, and the second is the offset from the beginning of the code section in which this line occurred. Yes, that sounds confusing, but later I'll show you the calculation you need to convert an address into a source file and line number.
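If you would rather let a script read the line information, the pairs are easy to pick apart. The following is a minimal Python sketch, not part of the book's sample files; the name parse_line_entries is hypothetical, and it assumes each entry has the "line section:offset" shape shown above.

```python
import re

# One "<line> <section>:<offset>" pair, for example "24 0001:00000006".
LINE_ENTRY = re.compile(r"\b(\d+)\s+([0-9A-Fa-f]{4}):([0-9A-Fa-f]{8})")

def parse_line_entries(text):
    """Return (source_line, section, offset) tuples found in MAP line info text."""
    return [
        (int(line), int(section, 16), int(offset, 16))
        for line, section, offset in LINE_ENTRY.findall(text)
    ]

# First row of the mapdll.cpp line information in Listing 12-1.
row = "11 0001:00000000  20 0001:00000000  21 0001:00000003  26 0001:00000006"
entries = parse_line_entries(row)
for source_line, section, offset in entries:
    print(f"line {source_line}: section {section}, offset 0x{offset:x}")
```

The source line is decimal, but the section and offset are hexadecimal, which is why the two are converted differently.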

If the module contains exported functions, the final section of a MAP file lists the exports. You can get this same information by running DUMPBIN /EXPORTS <modulename>.

Finding the Source File, Function Name, and Line Number

The algorithm for extracting the source file, function name, and line number from a MAP file is straightforward, but you need to do a few hexadecimal calculations when using it. As an example, let's say that a crash in MAPDLL.DLL, the module shown in Listing 12-1, occurs at address 0x03901038.

The first step is to look in your project's MAP files for the file that contains the crash address. First look at the preferred load address and the last address in the public function section. If the crash address is between those values, you're looking at the correct MAP file.
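As a rough illustration of that range check, here is a small Python sketch. The function names and the second table entry are hypothetical; the MapDLL values come from Listing 12-1. It also assumes each module actually loaded at its preferred load address.

```python
def address_in_module(crash_address, preferred_load_address, last_public_address):
    """True when the crash address lies inside the range one MAP file covers."""
    return preferred_load_address <= crash_address <= last_public_address

# MAP file name -> (preferred load address, last Rva+Base in the public section)
MAP_RANGES = {
    "MapDLL.map": (0x03900000, 0x03903024),   # values from Listing 12-1
    "Other.map": (0x10000000, 0x10004000),    # hypothetical second module
}

def find_map_file(crash_address, ranges):
    """Return the name of the first MAP file whose range contains the address."""
    for name, (base, last) in ranges.items():
        if address_in_module(crash_address, base, last):
            return name
    return None

print(find_map_file(0x03901038, MAP_RANGES))
```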

To find the function, scan down the Rva+Base column until you find the first function address that's greater than the crash address. The preceding entry in the MAP file is the function that had the crash. For example, in Listing 12-1, the first function address greater than the 0x03901038 crash address is 0x0390103c (__CRT_INIT@12), so the function that crashed is the preceding entry at 0x03901023, ?MapDLLHappyFunc@@YAPADPAD@Z. Any function name that starts with a question mark is a C++ decorated name.
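The scan is a search for the last entry at or below the crash address, so it can be sketched in a few lines of Python. The table holds a handful of rows copied from Listing 12-1; find_function is a hypothetical helper, not a tool that ships with the book.

```python
import bisect

# (Rva+Base, name) pairs from the public function section of Listing 12-1.
PUBLICS = [
    (0x03901000, "_DllMain@12"),
    (0x03901006, "?MapDLLFunction@@YAHXZ"),
    (0x03901023, "?MapDLLHappyFunc@@YAPADPAD@Z"),
    (0x0390103C, "__CRT_INIT@12"),
    (0x039010FA, "__DllMainCRTStartup@12"),
]

# The static symbols table lives separately in the MAP file.
STATICS = [(0x03901016, "?InternalStaticFunction@@YAXXZ")]

def find_function(crash_address, symbols):
    """Return the name of the last symbol whose address is <= crash_address."""
    ordered = sorted(symbols)
    addresses = [address for address, _ in ordered]
    index = bisect.bisect_right(addresses, crash_address) - 1
    return ordered[index][1] if index >= 0 else None

print(find_function(0x03901038, PUBLICS + STATICS))
```

Merging the static symbols into the search matters: an address such as 0x03901018 falls in ?InternalStaticFunction@@YAXXZ, which a scan of the public table alone would attribute to ?MapDLLFunction@@YAHXZ.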

You're probably wondering why I didn't mention the C++ name decoration when I talked about the calling convention name decoration in Chapter 6. Although both serve similar purposes, they come from different places. The calling convention name decoration simply tells the code generator how to generate the parameter pushes and stack cleanup, and it comes from the operating system definitions. C++ name decoration comes as a result of the language. Since you can have overloaded methods, the compiler has to have some way to differentiate them. It "decorates" the name with the return type, calling convention, and parameter information. That way it will know exactly what function you meant to call. To translate the name, pass it as a command-line parameter to the program UNDNAME.EXE, which is included with Visual Studio .NET. In the example, ?MapDLLHappyFunc@@YAPADPAD@Z translates into char * __cdecl MapDLLHappyFunc(char *). You probably could have figured out that MapDLLHappyFunc was the function name just by looking at the decorated name. Other C++ decorated names are harder to decipher, especially when overloaded functions are used.

To find the line number, you get to do a little hexadecimal subtraction by using the following formula:

(crash address) – (preferred load address) – 0x1000

Remember that the line addresses in the MAP file are offsets from the beginning of the first code section, so the formula does that conversion. You can probably guess why you subtract the preferred load address, but you earn extra credit if you know why you still have to subtract 0x1000. The line addresses are offsets from the beginning of the code section, but the code section isn't the first part of the binary. The first part of the binary is the PE (portable executable) header and associated DOS stub, which take up the first 0x1000 bytes of the loaded image. Yes, all Win32 binaries still have that MS-DOS heritage in them.

I'm not sure why the linker still generates MAP files that require this odd calculation. The linker team put in the Rva+Base column a while ago, so I don't see why they didn't just fix up the line number at the same time.

Once you've calculated the offset, look through the MAP file line information until you find the closest number that isn't over the calculated value. Keep in mind that during the generation phase the compiler can jiggle the code around so that the source lines aren't in ascending order. With my crash example, I used the following formula:

0x03901038 – 0x03900000 – 0x1000 = 0x38

If you look through the MAP file in Listing 12-1, you'll see that the closest line that isn't over 0x38 is 39 0001:00000033 (Line 39) in MAPDLL.CPP.
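The same arithmetic and lookup can be sketched in Python. The table is the mapdll.cpp line information from Listing 12-1; code_offset and find_line are hypothetical names, and the 0x1000 header size is the value used in this chapter's example rather than something the sketch reads from the binary.

```python
HEADER_SIZE = 0x1000  # distance from the load address to the code section here

# (source line, offset) pairs for mapdll.cpp, in the order the MAP file lists them.
MAPDLL_LINES = [
    (11, 0x00), (20, 0x00), (21, 0x03), (26, 0x06), (25, 0x06),
    (27, 0x12), (28, 0x15), (31, 0x16), (32, 0x16), (33, 0x22),
    (37, 0x23), (36, 0x23), (38, 0x28), (39, 0x33), (41, 0x3B),
]

def code_offset(crash_address, preferred_load_address, header_size=HEADER_SIZE):
    """(crash address) - (preferred load address) - 0x1000"""
    return crash_address - preferred_load_address - header_size

def find_line(offset, entries):
    """Return the source line with the greatest offset that isn't over `offset`.

    Entries need not be in ascending order. When two lines share an offset,
    the higher line number wins, which is an arbitrary choice in this sketch.
    """
    candidates = [(entry_offset, line) for line, entry_offset in entries
                  if entry_offset <= offset]
    return max(candidates)[1] if candidates else None

offset = code_offset(0x03901038, 0x03900000)
print(hex(offset), find_line(offset, MAPDLL_LINES))
```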

PDB2MAP—Map Files After the Fact

One issue that keeps coming up when I discuss finding crash addresses with other developers is the lament that you've already got code out in the field for which you don't have MAP files. Other eagle-eyed developers have also pointed out that having perfect MAP files means you have to set the base address of all your DLLs as part of the build. If you're working on an existing project that's about to ship, you might not want to destabilize the build by changing a bunch of settings. Additionally, without SettingsMaster from Chapter 9, Visual Studio doesn't make it convenient to make those global project settings changes. That's a primary reason why people simply default to using REBASE.EXE to take care of setting their DLLs' base addresses.

Being one not to let any challenge go unmet, I took a look at the problem. Really all I needed was a way to enumerate functions, source files, and source lines. Given that the DBGHELP.DLL symbol engine already does that, it was a piece of cake to take the next step to generate a MAP file from a PDB file.

The first problem I ran into was that the SymGetSymNext and SymGetSymPrev functions don't return what you would expect. I thought I could get an address in a source file, call SymGetSymPrev until I got to the beginning of the source file, and walk down to the end of the source file with SymGetSymNext. What I forgot to take into account are small things called inline functions. Those functions and source lines can occur in the middle of a function, so the source line information is really stored in ranges. This meant that I had to come up with a scheme to keep track of all the ranges so that I could condense the source and line information. Once I got over that hurdle, the program was pretty easy to develop.

The only other thing that got me had nothing to do with symbol engines—it was the Standard Template Library (STL). I first started out implementing my data structures in STL and quickly found that even a partial implementation of PDB2MAP.EXE was excruciatingly slow. That was mostly my fault because I was using the vector class in a linear search way that was just plain stupid. After fiddling some more, I realized that STL was always going to be much slower as it was doing quite a bit of memory allocation and copying behind the scenes. After much gnashing of teeth trying to make sense of some of the STL implementation details, I figured out I was making the problem much more complicated than it needed to be. I ended up manually coding a simple multiple array system that was blindingly fast and super simple to understand. It also had the added benefit of being much more maintainable than anything I could have created in STL.

The files produced by PDB2MAP are close to actual MAP files. Since the DBGHELP.DLL symbol engine doesn't return static functions, there's no way for me to output that information. As you look at a .P2M file, you'll see that you should have no trouble reading it. I considered using the crazy MAP file line number system for old-times' sake, but instead I brought PDB2MAP into the modern age. My line information is generated using real addresses that appear in memory.

One other interesting tidbit of data that you might be interested in is output in your .P2M file. As I mentioned back in Chapter 2, small code is good code. However, other than looking at the total size of the binary, there's no way to see how different compiler switches will affect the size of individual functions. Additionally, there's no way to see what effect inline functions have on a particular function. Since I was doing PDB2MAP, I figured I might as well report symbol sizes because DBGHELP.DLL's symbol engine can report the individual sizes. After the header information in your .P2M file, the function information shows the size of each function between the function address and name, as shown in Listing 12-2, which is an abbreviated .P2M file. Although nearly all functions will have their sizes, DBGHELP.DLL doesn't guarantee that sizes will be returned, so you might see sizes of 0.

Listing 12-2: An abbreviated .P2M file

start example
PDB2MAP Generated Map File

Image: AssertTest
Timestamp is 3E0E7E2A -> Sat Dec 28 23:46:34 2002
Preferred load address is 00400000

Address      Size  Function
0x00401050     36  ??2@YAPAXI@Z
0x00401080    260  ?MyThread@@YGKPAX@Z
0x00401190     38  ?SleepThread@@YGKPAX@Z
0x004011C0    535  ?TestThree@@YAXPAD@Z
0x004013E0    258  ?TestTwo@@YAXXZ
0x004014F0    421  ?TestOne@@YAXPAG@Z
0x004016A0    453  _wWinMain@16
0x00401A5E      6  _InitCommonControls@0
0x00401A64      6  _SuperAssertionW
. . .

Line numbers for d:\dev\booktwo\disk\bugslayerutil\tests\asserttest\asserttest.cpp
    16 : 0x00401080     18 : 0x0040109F     19 : 0x004010CB     20 : 0x004010DE
    21 : 0x004010E5     22 : 0x0040113C     23 : 0x0040113E     26 : 0x00401190
    27 : 0x00401194     28 : 0x004011A8     29 : 0x004011AA     32 : 0x004011C0
    33 : 0x004011D7     39 : 0x004011DE     40 : 0x00401201     41 : 0x00401207
    43 : 0x00401223     44 : 0x0040127A     45 : 0x0040129D     46 : 0x004012B2
    47 : 0x0040131A     48 : 0x0040131F     49 : 0x00401334     50 : 0x0040139C
    53 : 0x004013E0     55 : 0x004013F7     57 : 0x0040140F     59 : 0x00401427
    60 : 0x0040143A     61 : 0x0040143C     62 : 0x0040143E     63 : 0x00401498
    64 : 0x004014A5     67 : 0x004014F0     68 : 0x00401515     70 : 0x00401527
    74 : 0x0040153E     76 : 0x00401548     78 : 0x004015CF     80 : 0x00401632
    81 : 0x00401638     82 : 0x0040163D     90 : 0x004016A0     91 : 0x004016C4
    92 : 0x0040171B     93 : 0x00401772     94 : 0x004017D2     96 : 0x00401829
    97 : 0x00401838     98 : 0x00401845     99 : 0x00401852    100 : 0x00401854

Line numbers for f:\vs70builds\2292\vc\crtbld\crt\src\atonexit.c
    76 : 0x00402810     81 : 0x00402814     90 : 0x0040284B     95 : 0x00402850
    96 : 0x00402853     97 : 0x00402866
end example
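Because a .P2M file carries real addresses and sizes, a script can rank functions by size or test whether an address falls inside a function without any offset arithmetic. The Python sketch below is only an illustration: it assumes one function per row in the "address size name" layout of Listing 12-2, and the helper names are hypothetical.

```python
import re

# One function row: "0x00401080    260  ?MyThread@@YGKPAX@Z"
P2M_FUNCTION = re.compile(r"^(0x[0-9A-Fa-f]{8})\s+(\d+)\s+(\S+)$")

def parse_p2m_functions(text):
    """Return (address, size, name) tuples for each function row in the text."""
    functions = []
    for raw_line in text.splitlines():
        match = P2M_FUNCTION.match(raw_line.strip())
        if match:
            address, size, name = match.groups()
            functions.append((int(address, 16), int(size), name))
    return functions

def function_at(address, functions):
    """Return the function whose [address, address + size) range holds `address`."""
    for start, size, name in functions:
        if start <= address < start + size:
            return name
    return None

SAMPLE = """\
Address      Size  Function
0x00401050     36  ??2@YAPAXI@Z
0x00401080    260  ?MyThread@@YGKPAX@Z
0x00401190     38  ?SleepThread@@YGKPAX@Z
0x004011C0    535  ?TestThree@@YAXPAD@Z
0x004013E0    258  ?TestTwo@@YAXXZ
"""

functions = parse_p2m_functions(SAMPLE)
largest = sorted(functions, key=lambda entry: entry[1], reverse=True)
for start, size, name in largest:
    print(f"{size:6d}  0x{start:08X}  {name}")
```

A function reported with a size of 0 never matches in function_at, so a fuller tool would need to fall back to the nearest preceding address for those entries.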




Debugging Applications for Microsoft .NET and Microsoft Windows (Pro-Developer)
ISBN: 0735615365
EAN: 2147483647
Year: 2003
Pages: 177
Authors: John Robbins
