Implementing LIMODS


The LIMODS implementation turned out to be quite interesting. I had to do some finagling to get the DBGHELP.DLL symbol engine to work, but the really interesting parts involved writing __cdecl import hook functions and hooking imports by ordinal value.

Determining Source Code Ranges

You probably won't be surprised to find out that I'm using my old friend DBGHELP.DLL yet again. (I introduced this DLL in Chapter 4.) With the source file and line number functions in the DBGHELP.DLL symbol engine, I figured that I could find the first and last addresses (what I call the address range) that correspond to a particular source file. Armed with the address range, I could hook OutputDebugString using the techniques I discussed in Chapter 12 and look at the return address to see whether it's in a source file the user wants to allow trace statements from. Although this approach was simple in theory, the actual implementation had me going around my elbow to get the information the way I needed it.

There's no specific application programming interface (API) function that enumerates the source file address ranges, but I figured I could make do with the symbol enumeration function SymEnumerateSymbols. I'd use SymEnumerateSymbols to retrieve the first symbol, and in my symbol enumeration function, I'd move back to the start of the source file with SymGetLinePrev and then walk to the end of the source file with SymGetLineNext. Using SymEnumerateSymbols worked great on my simple test cases, but when I ran it against GENLIMODS.EXE, I noticed that the source ranges didn't jibe with what the disassembly showed. I seemed to be missing entire sections of the source file.

When I manually calculated the ranges, they came out looking like those listed in Table 14-1. The issue was that the SymGetLineNext and SymGetLinePrev functions enumerate only the contiguous ranges. As you can see in Table 14-1, source files with inline functions occur between the first part of GENLIMODS.CPP and the second part. I quickly realized that this isn't a bug but rather just how the compiler operates. The misunderstanding was on my part; I was focusing on the source file when I really needed to be thinking about the ranges first.

Table 14-1 GENLIMODS.EXE Sample Address Ranges

Start       End         Source File
0x00401900  0x00401A8A  COMMANDLINE.CPP
0x00401D00  0x00402F1F  GENLIMODS.CPP
0x00403450  0x00403774  RESSTRING.H
0x004037C0  0x004037DD  GENLIMODS.H
0x00403D60  0x004040F9  SYMBOLENGINE.H
0x00404690  0x004046AC  GENLIMODS.CPP
0x00407080  0x0040852E  LOMFILE.CPP
0x00409D50  0x0040A532  READIGNOREFILES.CPP
0x0040C800  0x0040C894  VERBOSE.CPP
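
To make the "ranges first" idea concrete, here is a minimal C++ sketch that looks up the source file for a code address using the rows of Table 14-1. The AddressRange structure and the SampleRanges and FindSourceFile functions are hypothetical names made up for this illustration, not the ones LIMODS uses, and treating the end address as inclusive is an assumption.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One contiguous run of code generated from a single source file.
// (Hypothetical structure for illustration only.)
struct AddressRange
{
    std::uint32_t start ;  // First address in the run
    std::uint32_t end ;    // Last address in the run (assumed inclusive)
    std::string   file ;   // Source file that produced the code
} ;

// The rows of Table 14-1. Note that GENLIMODS.CPP appears twice because
// the compiler placed inline functions from headers between its parts.
inline std::vector<AddressRange> SampleRanges ( )
{
    return {
        { 0x00401900 , 0x00401A8A , "COMMANDLINE.CPP"     } ,
        { 0x00401D00 , 0x00402F1F , "GENLIMODS.CPP"       } ,
        { 0x00403450 , 0x00403774 , "RESSTRING.H"         } ,
        { 0x004037C0 , 0x004037DD , "GENLIMODS.H"         } ,
        { 0x00403D60 , 0x004040F9 , "SYMBOLENGINE.H"      } ,
        { 0x00404690 , 0x004046AC , "GENLIMODS.CPP"       } ,
        { 0x00407080 , 0x0040852E , "LOMFILE.CPP"         } ,
        { 0x00409D50 , 0x0040A532 , "READIGNOREFILES.CPP" } ,
        { 0x0040C800 , 0x0040C894 , "VERBOSE.CPP"         }
    } ;
}

// Returns the file that owns addr or an empty string if no range holds it.
inline std::string FindSourceFile ( const std::vector<AddressRange> & ranges ,
                                    std::uint32_t                     addr    )
{
    for ( const AddressRange & r : ranges )
    {
        if ( ( addr >= r.start ) && ( addr <= r.end ) )
        {
            return ( r.file ) ;
        }
    }
    return ( std::string ( ) ) ;
}
```

With this data, addresses in either 0x00401D00 through 0x00402F1F or 0x00404690 through 0x004046AC map to GENLIMODS.CPP, which is exactly the case that a single start-to-end walk of one source file misses.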

LOM Files Explained

The LOM files GENLIMODS.EXE generates are just INI files, as you can see in Listing 14-2. The first section of the file is where I keep the main module information, which includes the name, load address, and date and timestamp of the module used to build the LOM file. When LIMODSDLL.DLL looks at the module in memory, it checks the module against the LOM file; if the module's date and timestamp differs from the one recorded in the LOM file, LIMODSDLL.DLL has GENLIMODS.EXE generate a new LOM file for the module. I store the base address of the module so that if the image loader relocates the module, LIMODSDLL.DLL can recalculate the address ranges on the fly. LIMODSDLL.DLL will also let you know through calls to OutputDebugString that a module has been relocated.
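
The on-the-fly recalculation can be pictured as shifting each stored range by the difference between the actual load address and the base address saved in the LOM file. The following is a minimal sketch under that assumption; the Range structure and the RebaseRange function are hypothetical names for illustration and are not taken from the LIMODS source.

```cpp
#include <cstdint>

// A start/end address pair as stored in a LOM file. (Hypothetical type.)
struct Range
{
    std::uint32_t start ;
    std::uint32_t end ;
} ;

// Shifts a range recorded against storedBase so that it matches a module
// that the loader actually placed at actualBase. Unsigned arithmetic wraps
// modulo 2^32, so the same code works whether the module moved up or down.
inline Range RebaseRange ( Range         r          ,
                           std::uint32_t storedBase ,
                           std::uint32_t actualBase  )
{
    std::uint32_t delta = actualBase - storedBase ;
    Range moved = { r.start + delta , r.end + delta } ;
    return ( moved ) ;
}
```

For example, a range stored as 0x004017D0 to 0x00401C8E against a base of 0x00400000 becomes 0x100017D0 to 0x10001C8E if the module loads at 0x10000000.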

Listing 14-2 A sample LOM file

[Module Info]
DateTimeStamp=380b75e8
BaseAddress=400000
ModuleName=LIMODS.exe

[Ranges]
RangeCount=11
Range0=0x004017D0|0x00401C8E|0|D:\Book\SourceCode\LIMODS\About.cpp
Range1=0x00401EF0|0x00402313|0|D:\Book\SourceCode\LIMODS\BigIcon.CPP
Range2=0x00402430|0x00402A5E|0|D:\Book\SourceCode\LIMODS\LIMODS.cpp
Range3=0x00402D60|0x00403727|1|D:\Book\SourceCode\LIMODS\LIMODSDoc.cpp
Range4=0x004044B0|0x0040480D|0|D:\Book\SourceCode\LIMODS\LIMODSOptions.cpp
Range5=0x00404950|0x00405823|0|D:\Book\SourceCode\LIMODS\LIMODSView.cpp
Range6=0x00405D70|0x00405DB0|0|D:\Book\SourceCode\LIMODS\LIMODSDoc.h
Range7=0x00406150|0x00407521|0|D:\Book\SourceCode\LIMODS\LOMFile.cpp
Range8=0x00408D00|0x004090FF|0|D:\Book\SourceCode\LIMODS\MainFrm.cpp
Range9=0x00409270|0x00409516|0|D:\Book\SourceCode\LIMODS\OptionsDialog.cpp
Range10=0x0040A0A0|0x0040A140|0|appmodul.cpp

[Sources]
Source0=0|D:\Book\SourceCode\LIMODS\About.cpp
Source1=0|D:\Book\SourceCode\LIMODS\BigIcon.CPP
Source2=0|D:\Book\SourceCode\LIMODS\LIMODS.cpp
Source3=1|D:\Book\SourceCode\LIMODS\LIMODSDoc.cpp
Source4=0|D:\Book\SourceCode\LIMODS\LIMODSOptions.cpp
Source5=0|D:\Book\SourceCode\LIMODS\LIMODSView.cpp
Source6=0|D:\Book\SourceCode\LIMODS\LIMODSDoc.h
Source7=0|D:\Book\SourceCode\LIMODS\LOMFile.cpp
Source8=0|D:\Book\SourceCode\LIMODS\MainFrm.cpp
Source9=0|D:\Book\SourceCode\LIMODS\OptionsDialog.cpp
Source10=0|appmodul.cpp
SourceCount=11

The format of the [Ranges] section defines the address ranges for the source files and is primarily what LIMODSDLL.DLL uses to determine what trace statements to show and when to show them. In order, the fields are start address, end address, the show trace Boolean value, and the source file name. LIMODS.EXE uses the [Sources] section to show the source file names in its tree view control. I used the INI file format originally to make initial testing easier and had it hidden by an accessor class in LOMFILE.H and LOMFILE.CPP. As I kept working on LIMODS as a whole, I found the performance acceptable, so I never changed to a different format.
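
As an illustration of the four-field layout, the sketch below splits the value part of a RangeN entry at its pipe characters. It is only one plausible way to read the format; the LomRange structure and the ParseRangeValue function are hypothetical names and are not the accessor class from LOMFILE.H. The sketch assumes the flag field is the single character 1 when tracing is on.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

// The decoded fields of one RangeN value. (Hypothetical structure.)
struct LomRange
{
    std::uint32_t start ;       // Start address
    std::uint32_t end ;         // End address
    bool          showTrace ;   // The show trace Boolean value
    std::string   sourceFile ;  // The source file name
} ;

// Parses "start|end|flag|file". Returns false if fewer than four fields
// are present. Only the first three pipes are treated as separators.
inline bool ParseRangeValue ( const std::string & value , LomRange & out )
{
    std::size_t p1 = value.find ( '|' ) ;
    if ( std::string::npos == p1 ) { return ( false ) ; }
    std::size_t p2 = value.find ( '|' , p1 + 1 ) ;
    if ( std::string::npos == p2 ) { return ( false ) ; }
    std::size_t p3 = value.find ( '|' , p2 + 1 ) ;
    if ( std::string::npos == p3 ) { return ( false ) ; }

    // strtoul with base 16 accepts the optional "0x" prefix.
    out.start = static_cast<std::uint32_t> (
        std::strtoul ( value.substr ( 0 , p1 ).c_str ( ) , nullptr , 16 ) ) ;
    out.end = static_cast<std::uint32_t> (
        std::strtoul ( value.substr ( p1 + 1 , p2 - p1 - 1 ).c_str ( ) ,
                       nullptr , 16 ) ) ;
    out.showTrace  = ( "1" == value.substr ( p2 + 1 , p3 - p2 - 1 ) ) ;
    out.sourceFile = value.substr ( p3 + 1 ) ;
    return ( true ) ;
}
```

Applied to the Range3 value from Listing 14-2, this yields a start of 0x00402D60, an end of 0x00403727, a show trace value of true, and the LIMODSDoc.cpp path.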

Excluding Source Files from LOM Files

GENLIMODS.EXE offers an exclusion capability to limit the source files in the resulting LOM file to the ones that include trace statements. Although you might find it interesting to see that you have half the Standard Template Library (STL) in the generated code, those files contain no trace statements, and they just cause LIMODSDLL.DLL to take up more memory and run more slowly. GENLIMODS.EXE looks for two files that Visual C++ uses to exclude files from its dependency checking: SYSINCL.DAT and the optional, user-supplied MSVCINCL.DAT. SYSINCL.DAT is just a list of files that appear in your <VC Dir>\Include and <VC Dir>\MFC\Include directories. MSVCINCL.DAT can contain any list of headers you don't want dependency checking used on. In addition to those files, GENLIMODS.EXE looks for a LIMODSINCL.DAT in your PATH directories for any extra files you want to exclude. For example, if you don't want to see the C run-time library files in your LOM files, you can include them in LIMODSINCL.DAT. On the companion CD in the \SourceCode\LIMODS directory is a version of LIMODSINCL.DAT that will exclude all the C run-time library source files.

What LIMODSDLL.DLL Hooks

When LIMODSDLL.DLL starts, it intercepts the key imported tracing functions in all process modules. For this version of LIMODS, those functions are OutputDebugStringA and OutputDebugStringW from KERNEL32.DLL, DiagOutputA and DiagOutputW from BUGSLAYERUTIL.DLL, _CrtDbgReport from MSVCRTD.DLL, and AfxTrace from MFC42(U)D.DLL. In addition, I hooked the LoadLibrary family of functions so that I'd know when additional modules are loaded into the address space.

For LIMODS to work with Visual Basic, I also needed to hook GetProcAddress so that I could return the appropriate function when MSVBVM60.DLL tries to get OutputDebugStringA. I talked about hooking functions in Chapter 12, and you might've thought that I'd exhausted this topic. As it turned out, hooking DiagOutputA, DiagOutputW, and the AfxTrace functions posed some unique challenges. For one thing, these functions are __cdecl functions instead of the __stdcall functions I showed how to hook in Chapter 12. Also, AfxTrace is exported by ordinal value.

Handling __cdecl Hooks

As you saw in Chapter 12, __stdcall functions are easy to hook because the function itself cleans up the stack; with __cdecl functions, the caller cleans up the stack. The DiagOutputA, DiagOutputW, and AfxTrace functions also have variable-length parameters, so intercepting them is that much more difficult. The act of hooking is the same as with __stdcall exported functions, but the __cdecl hook function processing has to be different. In LIMODSDLL.DLL, I wanted the hook function to grab the return address and determine whether it falls in an address range from which the user wants to see trace statements. After checking the source, I'll either let the trace function execute or ignore the trace function and return to the caller. With __stdcall functions, this processing is all very simple. I can just call the trace function directly and return right from my hook function to the caller because I clean up the stack from the hook function. With __cdecl functions, I have to get the stack back to the original state and then jump to (not call) the trace function if I need to execute it.

Listing 14-3 A __cdecl hook function with macros expanded

VOID NAKEDDEF LIMODS_DiagOutputA ( void )
{
    // Holds the return address of the caller
    DWORD_PTR dwRet ;
    // Holds the saved ESI so that Visual C++ 6 debug builds work. (The
    // chkesp function inserted with the /GZ switch uses ESI.)
    DWORD_PTR dwESI ;

    __asm PUSH  EBP                 /* Set up the standard frame.       */
    __asm MOV   EBP , ESP
    __asm SUB   ESP , __LOCAL_SIZE  /* Save room for the local          */
                                    /* variables.                       */
    __asm MOV   EAX , EBP           /* EBP points to the original stack.*/
    __asm ADD   EAX , 4             /* Account for PUSH EBP.            */
    __asm MOV   EAX , [EAX]         /* Get the return address.          */
    __asm MOV   [dwRet] , EAX       /* Save the return address.         */
    __asm MOV   [dwESI] , ESI       /* Save ESI so chkesp in debug      */
                                    /* builds works.                    */

    // Call the function that determines whether this address is one to
    // show. The return value is in EAX after this call and is checked
    // below. A return of TRUE means execute the trace function. A return
    // of FALSE means skip the trace function.
    CheckIfAddressIsOn ( dwRet ) ;

    __asm MOV   ESI , [dwESI]       /* Restore ESI.                     */
    __asm ADD   ESP , __LOCAL_SIZE  /* Take away local variable space.  */
    __asm MOV   ESP , EBP           /* Restore the standard frame.      */
    __asm POP   EBP

    // Here's where the fun begins! The preceding four lines of asm code
    // restored the stack to exactly what it looked like coming into
    // this function, so I'm now prepared to jump to the trace function.
    // pReadDiagOutputA holds the trace function address that I got
    // during initialization.
    __asm TEST  EAX , EAX           /* Test EAX for 0.                  */
    __asm JZ    lblDiagOutputA      /* If 0, just return.               */
    __asm JMP   pReadDiagOutputA    /* Do it! THE JUMP WILL RETURN TO   */
                                    /* THE CALLER, NOT TO THIS FUNCTION.*/
lblDiagOutputA:                     /* Skipped the TRACE! Just return   */
    __asm RET                       /* to the caller.                   */
}

Listing 14-3 shows a hook function, with macros expanded, that takes care of DiagOutputA from BUGSLAYERUTIL.DLL. To make it easier to reuse common assembly-language routines, such as __cdecl prolog code, I define several assembly-language macros in LIMODSDLL.CPP for use in my hook functions. I strongly encourage you to step through the macros in the Visual C++ debugger's Disassembly window so that you can watch each instruction in action.

Hooking Functions Exported by Ordinal Value

I have to be honest and say that I almost didn't support hooking functions exported by ordinal value because the endeavor is so error prone, especially because different versions of the MFC DLLs use different ordinal values. Once you get past these version problems, however, the process of hooking by ordinal value is almost identical to hooking by name. Compare the HookOrdinalExport function, shown in Listing 14-4, with the HookImportedFunctionsByName function shown in Chapter 12 and you'll see that both functions perform many of the same operations.

Listing 14-4 The HookOrdinalExport function

BOOL BUGSUTIL_DLLINTERFACE __stdcall
    HookOrdinalExport ( HMODULE hModule     ,
                        LPCTSTR szImportMod ,
                        DWORD   dwOrdinal   ,
                        PROC    pHookFunc   ,
                        PROC *  ppOrigAddr   )
{
    // Assert the parameters.
    ASSERT ( NULL != hModule ) ;
    ASSERT ( FALSE == IsBadStringPtr ( szImportMod , MAX_PATH ) ) ;
    ASSERT ( 0 != dwOrdinal ) ;
    ASSERT ( FALSE == IsBadCodePtr ( pHookFunc ) ) ;

    // Perform the error checking for the parameters.
    if ( ( NULL == hModule                                      ) ||
         ( TRUE == IsBadStringPtr ( szImportMod , MAX_PATH )    ) ||
         ( 0    == dwOrdinal                                    ) ||
         ( TRUE == IsBadCodePtr ( pHookFunc )                   )   )
    {
        SetLastErrorEx ( ERROR_INVALID_PARAMETER , SLE_ERROR ) ;
        return ( FALSE ) ;
    }

    if ( NULL != ppOrigAddr )
    {
        ASSERT ( FALSE ==
                     IsBadWritePtr ( ppOrigAddr , sizeof ( PROC ) ) ) ;
        if ( TRUE == IsBadWritePtr ( ppOrigAddr , sizeof ( PROC ) ) )
        {
            SetLastErrorEx ( ERROR_INVALID_PARAMETER , SLE_ERROR ) ;
            return ( FALSE ) ;
        }
    }

    // Get the specific import descriptor.
    PIMAGE_IMPORT_DESCRIPTOR pImportDesc =
                    GetNamedImportDescriptor ( hModule , szImportMod ) ;
    if ( NULL == pImportDesc )
    {
        // The requested module wasn't imported. Don't return an error.
        return ( TRUE ) ;
    }

    // Get the original thunk information for this DLL. I can't use
    // the thunk information stored in pImportDesc->FirstThunk
    // because the loader has already changed that array to fix up
    // all the imports. The original thunk gives me access to the
    // function names.
    PIMAGE_THUNK_DATA pOrigThunk =
                        MakePtr ( PIMAGE_THUNK_DATA               ,
                                  hModule                         ,
                                  pImportDesc->OriginalFirstThunk  ) ;

    // Get the array that pImportDesc->FirstThunk points to because I'll
    // do the actual hooking there.
    PIMAGE_THUNK_DATA pRealThunk = MakePtr ( PIMAGE_THUNK_DATA       ,
                                             hModule                 ,
                                             pImportDesc->FirstThunk  ) ;

    // The flag is going to be set from the thunk, so make it
    // easier to look up.
    DWORD dwCompareOrdinal = IMAGE_ORDINAL_FLAG | dwOrdinal ;

    // Loop through and find the function to hook.
    while ( NULL != pOrigThunk->u1.Function )
    {
        // Look only at functions that are imported by ordinal value,
        // not those that are imported by name.
        if ( IMAGE_ORDINAL_FLAG ==
                    ( pOrigThunk->u1.Ordinal & IMAGE_ORDINAL_FLAG ) )
        {
            // Did I find the function to hook?
            if ( dwCompareOrdinal == pOrigThunk->u1.Ordinal )
            {
                // I found the function to hook. Now I need to change
                // the memory protection to writable before I overwrite
                // the function pointer. Note that I'm now writing into
                // the real thunk area!
                MEMORY_BASIC_INFORMATION mbi_thunk ;

                VirtualQuery ( pRealThunk                          ,
                               &mbi_thunk                          ,
                               sizeof ( MEMORY_BASIC_INFORMATION )  ) ;

                if ( FALSE == VirtualProtect ( mbi_thunk.BaseAddress ,
                                               mbi_thunk.RegionSize  ,
                                               PAGE_READWRITE        ,
                                               &mbi_thunk.Protect     ) )
                {
                    ASSERT ( !"VirtualProtect failed!" ) ;
                    // There's nothing I can do but fail the function.
                    SetLastErrorEx ( ERROR_INVALID_PARAMETER ,
                                     SLE_ERROR                ) ;
                    return ( FALSE ) ;
                }

                // Save the original address if requested.
                if ( NULL != ppOrigAddr )
                {
                    *ppOrigAddr = (PROC)pRealThunk->u1.Function ;
                }

                // Microsoft has two different definitions of the
                // PIMAGE_THUNK_DATA fields as they are moving to
                // support Win64. The W2K RC2 Platform SDK is the
                // latest header, so I'll use that one and force the
                // Visual C++ 6 Service Pack 3 headers to deal with it.

                // Hook the function.
                DWORD * pTemp = (DWORD*)&pRealThunk->u1.Function ;
                *pTemp = (DWORD)(pHookFunc) ;

                DWORD dwOldProtect ;

                // Change the protection back to what it was before I
                // overwrote the function pointer.
                VERIFY ( VirtualProtect ( mbi_thunk.BaseAddress ,
                                          mbi_thunk.RegionSize  ,
                                          mbi_thunk.Protect     ,
                                          &dwOldProtect          ) ) ;
                // Life is good.
                SetLastError ( ERROR_SUCCESS ) ;
                return ( TRUE ) ;
            }
        }
        // Increment both tables.
        pOrigThunk++ ;
        pRealThunk++ ;
    }

    // Nothing was hooked. Technically, this isn't an error. It just
    // means that the module is imported but the function isn't.
    SetLastError ( ERROR_SUCCESS ) ;
    return ( FALSE ) ;
}
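
The ordinal comparison in Listing 14-4 comes down to a little bit arithmetic on the 32-bit thunk value, which the following portable sketch isolates. The constants mirror the 32-bit IMAGE_ORDINAL_FLAG32 and IMAGE_ORDINAL32 definitions in WINNT.H but are redefined here so that the sketch compiles without the Windows headers. The function names are hypothetical, and the ordinal 1179 used in the note below is an arbitrary illustration, not the real AfxTrace ordinal.

```cpp
#include <cstdint>

// High bit set in a 32-bit thunk means "imported by ordinal value."
// Same value as IMAGE_ORDINAL_FLAG32 in WINNT.H.
constexpr std::uint32_t kOrdinalFlag32 = 0x80000000u ;

// True if the original thunk describes an import by ordinal value rather
// than an import by name.
inline bool IsImportedByOrdinal ( std::uint32_t thunk )
{
    return ( 0 != ( thunk & kOrdinalFlag32 ) ) ;
}

// Extracts the ordinal value itself, which lives in the low 16 bits
// (the same mask that the IMAGE_ORDINAL32 macro applies).
inline std::uint16_t OrdinalFromThunk ( std::uint32_t thunk )
{
    return ( static_cast<std::uint16_t> ( thunk & 0xFFFFu ) ) ;
}

// The single comparison the hooking loop performs: the flag ORed with the
// requested ordinal must equal the whole thunk value.
inline bool ThunkMatchesOrdinal ( std::uint32_t thunk , std::uint32_t ordinal )
{
    return ( thunk == ( kOrdinalFlag32 | ordinal ) ) ;
}
```

A thunk value of 0x8000049B, for instance, is an ordinal import of ordinal 1179, whereas a value such as 0x00012345 has the flag clear and is treated as a by-name import that the loop skips.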

If I had tried to handle AfxTrace without hooking it, I would've had to do some stack walking on each call to get back to the real caller of OutputDebugString. The extra work on each call would've been slow compared with hooking AfxTrace directly. Also, if I had ignored AfxTrace, LIMODS would be basically useless for MFC programmers. In the end, I opted for making LIMODS as complete as possible, even though I had to double-check the MFC DLL versions.

General Implementation Issues

Once I got past hooking the ordinal value exports, I didn't have too many problems implementing LIMODS as a whole. One interesting feature I implemented in LIMODS.EXE was the autochecking behavior of its tree view control. When you check or uncheck the root node of the tree view (the module name), the tree view automatically checks or unchecks all the child nodes (the source files). To make the autochecking work, I had to implement check-toggling notification. Refer to LIMODSVIEW.CPP on the companion CD to see how I implemented everything.

The biggest problem I had with the implementation of LIMODS was with STL. I realize that engineers much, much smarter than I wrote STL, but still I was unprepared for how impenetrable the Visual C++ STL code is. Just deciphering compilation errors took me quite a while, and I absolutely dreaded stepping into the code to see why things failed or how something worked. As I recommend in Chapter 2, I use level 4 compiler warnings and treat all warnings as errors, so I would appreciate it if the STL code would compile at warning level 4 without errors and if the Microsoft compiler would stop producing the C4786 warning, "'identifier' : identifier was truncated to '255' characters in the debug information," with STL templates for any class that has more than two characters in the class name.

The secret to shutting off the C4786 warning is to disable the warning through the #pragma warning directive before including any STL headers. This #pragma warning technique also works best if you include STL headers only in the main precompiled header and disable the warning in the precompiled header once and for all. Even though I had to tweak the build a little, I saved some time by using STL instead of implementing my own growable arrays and map classes.
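
A minimal sketch of that arrangement follows, written as it might appear near the top of a precompiled header. The _MSC_VER guard is an addition so that the fragment also compiles cleanly with other compilers, and the FileToAddressesMap type and CountTrackedFiles function are hypothetical names used only to give the STL headers something to do.

```cpp
// Turn off C4786 before the compiler sees any STL header. The guard is
// an assumption added here so that non-Microsoft compilers ignore it.
#ifdef _MSC_VER
#pragma warning ( disable : 4786 )
#endif

#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A nested template type of the sort that produces very long decorated
// names in debug information. (Hypothetical type for illustration.)
typedef std::map< std::string , std::vector< unsigned long > >
                                                    FileToAddressesMap ;

// Builds a tiny map and reports how many source files it tracks.
inline std::size_t CountTrackedFiles ( )
{
    FileToAddressesMap files ;
    files[ "LOMFILE.CPP" ].push_back ( 0x00407080UL ) ;
    files[ "VERBOSE.CPP" ].push_back ( 0x0040C800UL ) ;
    return ( files.size ( ) ) ;
}
```

The important detail is the ordering: the #pragma has to precede the first STL #include in the translation unit.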

As for the last problem I ran into, I can't tell whether it was a bug in the compiler or a misunderstanding on my part. In LIMODSDLL.DLL, I use a static array of HOOKFUNCDESC to hold the real function pointers for DiagOutputA and DiagOutputW from BUGSLAYERUTIL.DLL. In the hook functions, I use the real function pointer out of the structure as the destination for the jump. The problem was that referencing the second item in the array would produce an invalid reference. The inline assembler source line

JMP g_stBugslayerUtilRealFuncs[0].pProc

would generate the assembly-language code

JMP g_stBugslayerUtilRealFuncs+4h

which was correct. However, the source line that referenced the second item in the structure,

JMP g_stBugslayerUtilRealFuncs[1].pProc

would generate

JMP g_stBugslayerUtilRealFuncs+5h

when I thought it should generate

JMP g_stBugslayerUtilRealFuncs+0Ch

Consequently, the generated code was jumping off into never-never land. I worked around the problem by using

JMP g_stBugslayerUtilRealFuncs[0x8].pProc

as the reference. This is an isolated problem and shouldn't cause trouble for anyone, but it could affect you if you want to add your own special trace functions to LIMODSDLL.DLL. If you do add your own functions, use the BUGSLAYERUTIL.DLL tables as an example to follow.
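
One reading that fits all three offsets is that the inline assembler treats the bracketed number as a raw byte offset instead of scaling it by the element size the way C++ indexing does. Microsoft's inline assembler documentation describes brackets in __asm blocks as an unscaled index operator, but whether that fully explains the behavior seen here is an assumption. The sketch below simply checks the arithmetic with a stand-in for a 32-bit HOOKFUNCDESC, assumed to consist of a 4-byte name pointer followed by a 4-byte pProc pointer; HookFuncDesc32, ScaledOffset, and UnscaledOffset are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the assumed 32-bit layout of HOOKFUNCDESC: a 4-byte
// function name pointer followed by the 4-byte function pointer.
struct HookFuncDesc32
{
    std::uint32_t szFunc ;
    std::uint32_t pProc ;
} ;

// Byte offset of pProc inside one element (4 with this layout).
constexpr std::size_t kProcOffset = offsetof ( HookFuncDesc32 , pProc ) ;

// Offset of element[index].pProc when the index is scaled by the element
// size, as C++ array indexing does.
constexpr std::size_t ScaledOffset ( std::size_t index )
{
    return ( index * sizeof ( HookFuncDesc32 ) + kProcOffset ) ;
}

// Offset when the bracketed value is taken as a raw byte count.
constexpr std::size_t UnscaledOffset ( std::size_t bracket )
{
    return ( bracket + kProcOffset ) ;
}
```

Under these assumptions, the scaled offset for index 1 is 0Ch, the unscaled offset for 1 is 5h, and the unscaled offset for 0x8 is 0Ch, which matches the generated code and the workaround described above.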



Debugging Applications
Debugging Applications for Microsoft® .NET and Microsoft Windows® (Pro-Developer)
ISBN: 0735615365
EAN: 2147483647
Year: 2000
Pages: 122
Authors: John Robbins
