I thought the best way to show you how a debugger worked was to write one, so I did. Although WDBG might not replace the Visual C++ debugger any time soon, it
Figure 4-2 WDBG in action
Overall, I'm happy with WDBG because it's a good sample. Looking at the WDBG
Before moving into the specifics of debugging, let's take a closer look at WDBG. Table 4-2 lists all the main subsystems of WDBG and describes what they do. One of my intentions in creating WDBG was to define a neutral interface between the UI and the debug loop. With a neutral interface, if I wanted to make WDBG.EXE support remote debugging over a network, I'd just have to replace the local debugging DLLs.
Table 4-2 WDBG Main Subsystems
| Subsystem | Description |
|---|---|
| WDBG.EXE | This module contains all the UI code. Additionally, all the breakpoint processing is taken care of here. Most of this debugger's work occurs in WDBGPROJDOC.CPP. |
| LOCALDEBUG.DLL | This module contains the debug loop. Because I wanted to be able to reuse this debug loop, the user code, WDBG.EXE in this case, |
| LOCALASSIST.DLL | This simple module is just a wrapper around the API functions for manipulating the debuggee's memory and registers. By using the interface defined in this module, WDBG.EXE and I386CPUHELP.DLL can instantly handle remote debugging just by replacing this module. |
| I386CPUHELP.DLL | This module is the IA32 (Pentium) helper module. Although this module is specific to Pentium processors, its interface, defined in CPUHELP.H, is CPU-independent. If you wanted to port WDBG to a different processor, this module is the only one you should have to replace. The disassembler in this module came from the Dr. Watson sample code that ships on the Platform SDK. Although the disassembler works, it appears to need updating to support the later Pentium CPU variants. |
Reading from a debuggee's memory is simple: ReadProcessMemory takes care of it for you. A debugger has full access to the debuggee if the debugger started it, because the process handle returned with the CREATE_PROCESS_DEBUG_EVENT debug event has PROCESS_VM_READ and PROCESS_VM_WRITE access. If your debugger attaches to the process with DebugActiveProcess, you must use OpenProcess to get a handle to the debuggee, and you need to specify both read and write access.
Before I can talk about writing to the debuggee's memory, I need to
Writing to the debuggee memory is almost as straightforward as reading from it. Because the memory pages you want to write to might be
An interesting detail about the Win32 Debugging API is that the debugger is responsible for getting the string to output when an OUTPUT_DEBUG_STRING_EVENT comes through. The information passed to the debugger includes the location and the length of the string. When it receives this event, the debugger reads the string out of the debuggee's memory. In Chapter 3, I mentioned that trace statements could easily change your application's behavior when running under a debugger. Because all threads in the application stop when the debug loop is processing an event, calling OutputDebugString in the debuggee means that all your threads stop. Listing 4-3 shows how WDBG handles the OUTPUT_DEBUG_STRING_EVENT. Notice that the DBG_ReadProcessMemory function is the wrapper function around ReadProcessMemory from LOCALASSIST.DLL.
Listing 4-3 OutputDebugStringEvent from PROCESSDEBUGEVENTS.CPP
static
DWORD OutputDebugStringEvent(CDebugBaseUser *           pUserClass,
                             LPDEBUGGEEINFO             pData,
                             DWORD                      dwProcessId,
                             DWORD                      dwThreadId,
                             OUTPUT_DEBUG_STRING_INFO & stODSI)
{
    TCHAR  szBuff[512];
    HANDLE hProc = pData->GetProcessHandle();
    DWORD  dwRead;

    // Read the memory.
    BOOL bRet = DBG_ReadProcessMemory(hProc,
                                      stODSI.lpDebugStringData,
                                      szBuff,
                                      min(sizeof(szBuff),
                                          stODSI.nDebugStringLength),
                                      &dwRead);
    ASSERT(TRUE == bRet);
    if (TRUE == bRet)
    {
        // Always NULL terminate the string.
        szBuff[dwRead + 1] = _T('\0');
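The clamp-then-terminate step in the listing is easy to get subtly wrong, so here is a minimal portable sketch of it. It is not WDBG code: `CopyDebugString` is a hypothetical helper, a plain `memcpy` stands in for ReadProcessMemory, and the sketch assumes an ANSI string whose reported length includes the terminator. It reserves one slot in the buffer so the terminating write always stays in bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of WDBG. Copies a debug string of
// reportedLen bytes into a fixed buffer and always terminates it.
// One slot is reserved for the terminator, so dst[copied] is always
// inside the buffer.
std::size_t CopyDebugString(const char * src,
                            std::size_t  reportedLen,
                            char *       dst,
                            std::size_t  dstSize)
{
    assert(src != nullptr && dst != nullptr && dstSize > 0);

    // Stand-in for ReadProcessMemory: copy from a local buffer.
    const std::size_t copied = std::min(reportedLen, dstSize - 1);
    std::memcpy(dst, src, copied);

    // Always terminate, even when the string was truncated.
    dst[copied] = '\0';
    return copied;
}
```

A real debugger would also check the fUnicode member of OUTPUT_DEBUG_STRING_INFO before deciding how to interpret the bytes it read.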
Most
The concept of setting a breakpoint is simple. All you need to do is have a memory address where you want to set a breakpoint, save the opcode (the value) at that location, and write the breakpoint instruction into the address. On the Intel Pentium family, the breakpoint instruction mnemonic is "INT 3" or an opcode of 0xCC, so you need to save only a single byte at the address you're setting the breakpoint. Other CPUs, such as the Intel Merced, have different opcode sizes, so you would need to save more data at the address.
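Before looking at the real code, the save-and-overwrite idea can be sketched against a plain byte vector. This is only an illustration: `FakeMemory` and `SetBreakpointSketch` are hypothetical names, and a vector stands in for the debuggee's address space, so none of the memory protection or cross-process work appears here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kBreakOpcode = 0xCC;   // x86 INT 3

// Hypothetical stand-in for the debuggee's address space.
using FakeMemory = std::vector<std::uint8_t>;

// Saves the byte at addr into *savedOpcode and writes INT 3 over it.
// Returns false if addr is out of range or the byte is already 0xCC,
// in which case nothing is modified.
bool SetBreakpointSketch(FakeMemory &   mem,
                         std::size_t    addr,
                         std::uint8_t * savedOpcode)
{
    assert(savedOpcode != nullptr);
    if (addr >= mem.size())
    {
        return false;
    }
    if (kBreakOpcode == mem[addr])
    {
        return false;
    }
    *savedOpcode = mem[addr];   // Remember the original opcode byte.
    mem[addr]    = kBreakOpcode;
    return true;
}
```

The saved byte is the only state the debugger needs to keep per breakpoint on this CPU family, which is why the single-byte INT 3 encoding is so convenient.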
Listing 4-4 shows the code for the SetBreakpoint function. As you read through this code, keep in mind that the DBG_* functions are those that come out of LOCALASSIST.DLL and help isolate the various process manipulation routines, making it easier to add remote debugging to WDBG. The SetBreakpoint function illustrates the processing (described earlier in the chapter) necessary for changing memory protection when you're writing to it.
Listing 4-4 SetBreakpoint from I386CPUHELP.C
int CPUHELP_DLLINTERFACE __stdcall
    SetBreakpoint(PDEBUGPACKET dp,
                  ULONG        ulAddr,
                  OPCODE *     pOpCode)
{
    DWORD dwReadWrite = 0;
    BYTE  bTempOp     = BREAK_OPCODE;
    BOOL  bReadMem;
    BOOL  bWriteMem;
    BOOL  bFlush;
    MEMORY_BASIC_INFORMATION mbi;
    DWORD dwOldProtect;

    ASSERT(FALSE == IsBadReadPtr(dp, sizeof(DEBUGPACKET)));
    ASSERT(FALSE == IsBadWritePtr(pOpCode, sizeof(OPCODE)));
    if ((TRUE == IsBadReadPtr(dp, sizeof(DEBUGPACKET))) ||
        (TRUE == IsBadWritePtr(pOpCode, sizeof(OPCODE))))
    {
        TRACE0("SetBreakpoint : invalid parameters!\n");
        return (FALSE);
    }

    // If the operating system is Windows 98 and the address is above
    // 2 GB, just leave quietly.
    if ((FALSE == IsNT()) && (ulAddr >= 0x80000000))
    {
        return (FALSE);
    }

    // Read the opcode at the location.
    bReadMem = DBG_ReadProcessMemory(dp->hProcess,
                                     (LPCVOID)ulAddr,
                                     &bTempOp,
                                     sizeof(BYTE),
                                     &dwReadWrite);
    ASSERT(FALSE != bReadMem);
    ASSERT(sizeof(BYTE) == dwReadWrite);
    if ((FALSE == bReadMem) ||
        (sizeof(BYTE) != dwReadWrite))
    {
        return (FALSE);
    }

    // Is this new breakpoint about to overwrite an existing
    // breakpoint opcode?
    if (BREAK_OPCODE == bTempOp)
    {
        return (-1);
    }

    // Get the page attributes for the debuggee.
    DBG_VirtualQueryEx(dp->hProcess,
                       (LPCVOID)ulAddr,
                       &mbi,
                       sizeof(MEMORY_BASIC_INFORMATION));

    // Force the page to copy-on-write in the debuggee.
    if (FALSE == DBG_VirtualProtectEx(dp->hProcess,
                                      mbi.BaseAddress,
                                      mbi.RegionSize,
                                      PAGE_EXECUTE_READWRITE,
                                      &mbi.Protect))
    {
        ASSERT(!"VirtualProtectEx failed!!");
        return (FALSE);
    }

    // Save the opcode I'm about to whack.
    *pOpCode = (void*)bTempOp;

    bTempOp = BREAK_OPCODE;
    dwReadWrite = 0;

    // The opcode was saved, so now set the breakpoint.
    bWriteMem = DBG_WriteProcessMemory(dp->hProcess,
                                       (LPVOID)ulAddr,
                                       (LPVOID)&bTempOp,
                                       sizeof(BYTE),
                                       &dwReadWrite);
    ASSERT(FALSE != bWriteMem);
    ASSERT(sizeof(BYTE) == dwReadWrite);
    if ((FALSE == bWriteMem) ||
        (sizeof(BYTE) != dwReadWrite))
    {
        return (FALSE);
    }

    // Change the protection back to what it was before I blasted the
    // breakpoint in.
    VERIFY(DBG_VirtualProtectEx(dp->hProcess,
                                mbi.BaseAddress,
                                mbi.RegionSize,
                                mbi.Protect,
                                &dwOldProtect));

    // Flush the instruction cache in case this memory was in the CPU
    // cache.
    bFlush = DBG_FlushInstructionCache(dp->hProcess,
                                       (LPCVOID)ulAddr,
                                       sizeof(BYTE));
    ASSERT(TRUE == bFlush);
    return (TRUE);
}
After you set the breakpoint, the CPU will execute it and will tell the debugger that an EXCEPTION_BREAKPOINT (0x80000003) occurred—that's where the fun begins. If it's a regular breakpoint, the debugger will locate and display the breakpoint location to the user. After the user decides to continue execution, the debugger has to do some work to restore the state of the program. Because the breakpoint overwrote a portion of memory, if you, as the debugger writer, were to just let the process continue, you would be executing code out of sequence and the debuggee would probably crash. What you need to do is to move the current instruction pointer back to the breakpoint address and replace the breakpoint with the opcode you saved when you set the breakpoint. After restoring the opcode, you can continue executing.
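The rewind-and-restore step can be sketched in the same simulated style. The names below (`FakeThread`, `ResumeFromBreakpointSketch`) are hypothetical, a byte vector again stands in for debuggee memory, and the sketch assumes the x86 behavior in which the thread's instruction pointer sits one byte past the INT 3 once the breakpoint has executed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kBreakOpcode = 0xCC;   // x86 INT 3

// Hypothetical register state: only the instruction pointer matters here.
struct FakeThread
{
    std::size_t ip;
};

// Undoes a breakpoint that has just fired: puts the saved opcode back
// and moves the instruction pointer back onto the restored instruction.
// Returns false, changing nothing, if the state doesn't look like a
// breakpoint hit at bpAddr.
bool ResumeFromBreakpointSketch(std::vector<std::uint8_t> & mem,
                                FakeThread &                thread,
                                std::size_t                 bpAddr,
                                std::uint8_t                savedOpcode)
{
    if ((bpAddr >= mem.size()) || (kBreakOpcode != mem[bpAddr]))
    {
        return false;
    }
    // After INT 3 executes, the instruction pointer is one past it.
    if (thread.ip != bpAddr + 1)
    {
        return false;
    }
    mem[bpAddr] = savedOpcode;   // Restore the original opcode byte.
    thread.ip   = bpAddr;        // Re-execute from the breakpoint address.
    return true;
}
```

In a real debugger the two assignments correspond to a write into the debuggee's memory and an update of the thread context, followed by a flush of the instruction cache.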
There's only one small problem: How do you reset the breakpoint so that you can stop at that location again? If the CPU you're working on supports single-step execution, resetting the breakpoint is trivial. In single-step execution, the CPU executes a single instruction and generates another type of exception, EXCEPTION_SINGLE_STEP (0x80000004). Fortunately, all CPUs that Win32 runs on support single-step execution. For the Intel Pentium family, setting single-step execution requires that you set bit 8 on the flags register. The Intel reference manual calls this bit the TF, or Trap Flag. The code in Listing 4-5 shows the SetSingleStep function and the work needed to set the TF. After replacing the breakpoint with the original opcode, the debugger marks its internal state to reflect that it's expecting a single-step exception, sets the CPU into single-step execution, and then continues the process.
Listing 4-5 SetSingleStep function from I386CPUHELP.C
BOOL CPUHELP_DLLINTERFACE __stdcall
    SetSingleStep (PDEBUGPACKET dp)
{
    BOOL bSetContext ;
    ASSERT (FALSE == IsBadReadPtr (dp , sizeof (DEBUGPACKET))) ;
    if (TRUE == IsBadReadPtr (dp , sizeof (DEBUGPACKET)))
    {
        TRACE0 ("SetSingleStep : invalid parameters!\n") ;
        return (FALSE) ;
    }
    // For the i386, just set the TF bit.
    dp->context.EFlags |= TF_BIT ;
    bSetContext = DBG_SetThreadContext (dp->hThread ,
                                        &dp->context) ;
    ASSERT (FALSE != bSetContext) ;
    return (bSetContext) ;
}
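The bit manipulation itself is small enough to show in isolation. The sketch below uses hypothetical helper names and a plain 32-bit integer in place of the EFlags member of the thread context. The point it illustrates is that the Trap Flag has to be ORed in: assigning the mask would wipe out every other flag the debuggee was relying on.

```cpp
#include <cassert>
#include <cstdint>

// Bit 8 of the flags register is the Trap Flag (TF), value 0x100.
constexpr std::uint32_t kTrapFlag = 1u << 8;

// Turns single-step execution on while preserving all other flags.
std::uint32_t EnableSingleStep(std::uint32_t eflags)
{
    return eflags | kTrapFlag;
}

// Turns single-step execution off while preserving all other flags.
std::uint32_t DisableSingleStep(std::uint32_t eflags)
{
    return eflags & ~kTrapFlag;
}

// Reports whether the Trap Flag is currently set.
bool IsSingleStepEnabled(std::uint32_t eflags)
{
    return (eflags & kTrapFlag) != 0;
}
```

For example, a flags value of 0x246 becomes 0x346 with single-stepping enabled, and clearing the bit gives back 0x246 exactly.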
After the debugger releases the process by calling ContinueDebugEvent, the process immediately generates a single-step exception after the single instruction executes. The debugger checks its internal state to verify that it was expecting a single-step exception. Because the debugger was expecting a single-step exception, it
If you want to see all the breakpoint processing in action, look for the CWDBGProjDoc::HandleBreakpoint method in the WDBGPROJDOC.CPP file on the companion CD. I defined the breakpoints
One of the neater features I implemented in WDBG was the Debug Break menu option. This option means that you can break into the debugger at any time while the debuggee is running. Although WDBG uses the breakpoint operations described earlier in the chapter, the breakpoints used to implement the Debug Break option are referred to as one-shot breakpoints because the breakpoints are removed just as soon as they trigger. Getting those one-shot breakpoints set is pretty interesting. The full details are in CWDBGProjDoc::OnDebugBreak in WDBGPROJDOC.CPP, but I'll go into greater detail here because I think you'll find the explanation enlightening. Listing 4-6 shows the CWDBGProjDoc::OnDebugBreak function from WDBGPROJDOC.CPP. (To find out more about one-shot breakpoints, see the section "Step Into, Step Over, and Step Out" later in this chapter.)
Listing 4-6 Debug Break processing in WDBGPROJDOC.CPP
void CWDBGProjDoc::OnDebugBreak()
{
    // Just for my own peace of mind.
    ASSERT(m_vDbgThreads.size() > 0);

    // The idea here is to get all the debuggee's threads suspended and
    // set a breakpoint at the current instruction pointer for each.
    // That way, I can guarantee that at least one of the threads
    // will trip the one-shot breakpoints.
    // One situation in which setting a breakpoint on each thread
    // won't work is when an application is hung. Because no
    // threads are turning over, the breakpoints never get called.
    // To make the deadlock case work, I'd need to use an algorithm such
    // as the following:
    // 1. Set the breakpoints as this function does.
    // 2. Set a state flag indicating that I'm waiting on a Debug Break
    //    breakpoint.
    // 3. Set a background timer to wait for the breakpoint.
    // 4. If one of the breakpoints goes off, clear the timer. Life is
    //    good.
    // 5. If the timer goes off, the application is hung.
    // 6. After the timer, set the instruction pointer of one of the
    //    threads to another address and put a breakpoint at that
    //    address.
    // 7. Restart the thread.
    // 8. When this special breakpoint fires, clear the breakpoint and
    //    reset the instruction pointer back to the original location.

    // To avoid problems, I'll boost the priority of this thread so
    // that I get through setting these breakpoints as fast as possible
    // and keep any of the debuggee's threads from being scheduled.
    HANDLE hThisThread = GetCurrentThread();
    int iOldPriority = GetThreadPriority(hThisThread);
    SetThreadPriority(hThisThread, THREAD_BASE_PRIORITY_LOWRT);

    HANDLE hProc = GetDebuggeeProcessHandle();
    DBGTHREADVECT::iterator i;
    for (i = m_vDbgThreads.begin();
         i != m_vDbgThreads.end();
         i++)
    {
        // Suspend this thread. If it has a suspend count already, I
        // don't really care. That's why I set a breakpoint on each
        // thread in the debuggee. I'll hit an active one eventually.
        DBG_SuspendThread(i->m_hThread);

        // Now that the thread is suspended, I can get the context.
        CONTEXT ctx;
        ctx.ContextFlags = CONTEXT_FULL;

        // If GetThreadContext fails, I have to handle the error message
        // carefully. Because this thread's priority is set to real-time,
        // if I use an ASSERT, the computer might hang on the message
        // box, so in the else statement, I can indicate the error only
        // with a trace statement.
        if (FALSE != DBG_GetThreadContext(i->m_hThread, &ctx))
        {
            // Find the address that the instruction pointer is about to
            // execute. That address is where I'll set the breakpoint.
            DWORD dwAddr = ReturnInstructionPointer(&ctx);
            COneShotBP cBP;
            // Set the breakpoint.
            cBP.SetBreakpointLocation(dwAddr);
            // Arm it.
            if (TRUE == cBP.ArmBreakpoint(hProc))
            {
                // Add this breakpoint to the Debug Break list only if
                // the breakpoint was successfully armed. The debuggee
                // could easily have multiple threads sitting on the
                // same instruction, so I want only one breakpoint set
                // on that address.
                m_aDebugBreakBPs.Add(cBP);
            }
        }
        else
        {
            TRACE("GetThreadContext failed! Last Error = 0x%08X\n",
                  GetLastError());
#ifdef _DEBUG
            // Because GetThreadContext failed, I probably need to take a
            // look at what happened. Therefore, I'll pop into the
            // debugger debugging the WDBG debugger. Even though
            // the WDBG thread is running at a real-time priority level,
            // calling DebugBreak will immediately pull this thread out
            // of the operating system scheduler, so the priority drops.
            DebugBreak();
#endif
        }
    }

    // All the threads have breakpoints set. Now I'll restart them all
    // and post a thread message to each one. The reason for posting
    // the thread message is simple. If the debuggee is chugging away
    // on messages or other processing, it will break immediately.
    // However, if it's just idling in a message loop, I need to give it
    // a tickle to force it into action. Because I have the thread ID,
    // I'll just send the thread a WM_NULL message. WM_NULL is supposed
    // to be a benign message, so it shouldn't screw up the debuggee. If
    // the thread doesn't have a message queue, this function just fails
    // for that thread with no harm done.
    for (i = m_vDbgThreads.begin();
         i != m_vDbgThreads.end();
         i++)
    {
        // Let this thread resume so that it hits the breakpoint.
        DBG_ResumeThread(i->m_hThread);
        PostThreadMessage(i->m_dwTID, WM_NULL, 0, 0);
    }

    // Now drop the priority back down.
    SetThreadPriority(hThisThread, iOldPriority);
}
When you want to stop a debuggee that's churning like mad, you need to get a breakpoint
Although breaking into the debugger is
To ensure that the debuggee
The only question then becomes, What message do I post? You don't want to post a message that could cause the debuggee to do any real processing, because that would let the debugger change the behavior of the debuggee. For example, posting a WM_CREATE message probably wouldn't be a good idea. Fortunately, the WM_NULL message is supposed to be a benign message and is what you're supposed to use in hooks if you change a message. If the thread doesn't have a message queue, such as in a console application, calling PostThreadMessage with WM_NULL simply fails and does no damage. Because console-based applications will always be processing, even if waiting for a keystroke, setting the breakpoint at the current executing instruction will cause the break.
Another issue involves multithreading. If you're going to suspend only a single thread and the application is multithreaded, how do you know which thread to suspend? If you suspend and set the breakpoint in the wrong thread, say one that is blocked waiting on an event that is signaled only when background printing occurs, your breakpoint might never go off unless the user decides to print something. If you want to break on a multithreaded application, the only safe course is to suspend all the threads and set a breakpoint in each one.
Suspending all the threads and setting a breakpoint in each one works just great on an application that has only two threads. If you want to break on an application that has many threads, however, you could leave yourself
So far, my algorithm for breaking in a multithreaded application sounds reasonable. However, the debugger still needs to deal with one last issue to make the Debug Break option work completely. If you have all the breakpoints set in all the threads and you resume the threads, you still face one situation in which the break won't happen. By setting the breakpoints, you're relying on at least one of the threads to execute in order to trigger the breakpoint exception. What do you think happens if the process is in a deadlock situation? Nothing happens—no threads execute and your carefully positioned breakpoints never trigger the exception.
I told you the Debug Break business gets interesting. When you're breaking in a deadlock, you need to set up a timer to mark when you added the break. After your period of time elapses (the Visual C++ debugger uses 3 seconds), you need to take some drastic action. When the Debug Break option times out, you'll need to set one thread's instruction pointer to another address, set a breakpoint at that new address, and restart the thread. When that special breakpoint fires, you need to set the thread's instruction pointer back to its original location. In WDBG, I didn't implement the anti-deadlock processing, but I left the implementation as an exercise for you in the CWDBGProjDoc::OnDebugBreak function in WDBGPROJDOC.CPP on the companion CD. The complete infrastructure is in place to handle the anti-deadlock processing, and it would probably take no more than a couple of hours to put in. By the time you had it implemented, you'd have a good idea how WDBG works.
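If you take up that exercise, the timeout bookkeeping might look something like the following sketch. It is not WDBG code: `DebugBreakWatchdog` and `BreakAction` are hypothetical names, the class covers only the "waiting flag plus timer" part of the algorithm, and the drastic action itself (moving a thread's instruction pointer and planting the special breakpoint) is left to the caller. Time is passed in explicitly so the logic can be exercised without real delays.

```cpp
#include <cassert>
#include <chrono>

// What the debugger should do after polling the watchdog.
enum class BreakAction
{
    None,        // Nothing to do yet, or the break already happened.
    ForceBreak   // Timed out: the debuggee looks hung, so force the break.
};

// Hypothetical helper that tracks a pending Debug Break request.
class DebugBreakWatchdog
{
public:
    using Clock = std::chrono::steady_clock;

    explicit DebugBreakWatchdog(std::chrono::milliseconds timeout)
        : m_timeout(timeout)
    {
    }

    // Call right after the one-shot breakpoints are set.
    void Arm(Clock::time_point now)
    {
        m_pending = true;
        m_armedAt = now;
    }

    // Call when any one-shot breakpoint fires: the debuggee isn't hung.
    void OnBreakpointHit()
    {
        m_pending = false;
    }

    // Call periodically. Returns ForceBreak exactly once per timeout.
    BreakAction Poll(Clock::time_point now)
    {
        if (!m_pending || (now - m_armedAt < m_timeout))
        {
            return BreakAction::None;
        }
        m_pending = false;
        return BreakAction::ForceBreak;
    }

    bool IsPending() const
    {
        return m_pending;
    }

private:
    std::chrono::milliseconds m_timeout;
    Clock::time_point         m_armedAt{};
    bool                      m_pending = false;
};
```

In a UI-based debugger, Poll would typically be driven from a window timer, with the 3-second figure mentioned above as one reasonable timeout.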
The real black art to writing a debugger involves symbol engines, the code that manipulates symbol tables. Debugging at the straight assembly-language level is interesting for the first couple of minutes you have to do it, but it gets old quickly. Symbol tables, also called debugging symbols, are what turn hexadecimal
Before diving into a discussion of accessing symbol tables, I need to go over the various symbol formats available. I've found that people are a little
The first format, SYM, is an older format that used to be common in the MS-DOS and 16-bit Windows days. The only current use of SYM is for the debugging symbols for Windows 98; the SYM format is used here because most of the
Common Object File Format (COFF) was one of the original symbol table formats and was introduced with Windows NT 3.1, the first version of Windows NT. The Windows NT team was
The C7, or CodeView, format first appeared as part of Microsoft C/C++ version 7 back in the MS-DOS days. If you're an old timer, you might have
If you're interested in symbol tables and would like to write one, the C7 specification is on MSDN. Look for it in the "VC5.0 Symbolic Debug Information" topic. The specification lists only the raw byte structure and type definitions. If you'd like to see the actual type definitions in C, the Dr. Watson source code, included on the MSDN CDs, has some old C7 format header files in its include directory. Although those header files are considerably dated, they can give you an idea what the structures look like.
Although you could use the C7 format for your applications if you wanted to, you probably shouldn't. The main reason for not using C7 is that it automatically turns off incremental linking. With incremental linking turned off, link times increase dramatically. The other reason for avoiding C7 is that it makes binary files incredibly large. Although you could strip out the symbol information with REBASE.EXE, other formats, namely PDB, automatically remove the symbol information for you.
The PDB format is the most common symbol format used today, and both Visual C++ and Visual Basic support it. Unlike the C7 format, PDB symbols are stored in a separate file or files, depending on how the application is linked. By default, Visual C++ 6 links with /PDBTYPE:SEPT, which puts the type information into VC60.PDB and the symbols into <binary name>.PDB. Separating the type information from the debug symbols makes linking faster and requires less disk space. However, the documentation states that if you're building binaries that others could be debugging, you should use /PDBTYPE:CON so that all the type information and debug symbols are consolidated into a single PDB file. Fortunately, Visual Basic automatically uses /PDBTYPE:CON.
To see whether a binary contains PDB symbol information, open it in a hex editor and move to the end of the file. You'll see a marker to the debugging information. If the marker starts with "NB10" and ends with the complete
DBG files are unique because, unlike the other symbol formats, the linker doesn't create them. A DBG file is basically just a file that holds other types of debug symbols, such as COFF or C7. DBG files use some of the same structures defined by the Portable Executable (PE) file format—the format used by Win32 executables. REBASE.EXE produces DBG files by stripping the COFF or C7 debugging information out of a module. There's no need to run REBASE.EXE on a module that was built using PDB files because the symbols are already separate from the module. If you're generating C7 symbols and you need to strip them, read the MSDN documentation on REBASE.EXE to see how to do it. Microsoft distributes the operating system debugging symbols in DBG files, and with Windows 2000, the PDB files are included as well. Before you get your hopes up that the operating system symbols include everything you need to reverse engineer the entire operating system, let me warn you that the files include only the public and global information. Using these files makes it much easier to see where you are when you're dropped into the middle of the Disassembly window.
If you're interested in symbol engines and you start
The Working Set Tuner (WST) program that comes with the Platform SDK
if (TRUE == bIsError)
{   <- The basic block starts here.
    // Do the error handling here.
}   <- The basic block ends here.
The Microsoft tool moves the error handler to the end of the binary so that only the most common code goes into the front. The OMAP symbols seem to be some
To access symbol information, you can use Microsoft's DBGHELP.DLL symbol engine. DBGHELP.DLL can read COFF and C7 symbol formats by itself. To read PDB files, you must also have MSDBI.DLL, which DBGHELP.DLL uses internally. In the past, the symbol engine was in IMAGEHLP.DLL, but Microsoft wisely pulled the symbol engine out of the core system and put it in a DLL that was easier to upgrade. If you have a program that was using the symbol engine when it was part of IMAGEHLP.DLL, IMAGEHLP.DLL still includes the symbol engine exports. The new IMAGEHLP.DLL forwards those functions to DBGHELP.DLL. At the time I was writing this book, the MSDN documentation for the symbol engine was still included as part of IMAGEHLP.DLL.
The DBGHELP.DLL symbol engine allows you to turn an address into the
For WDBG, I used a simple C++ wrapper class, shown in Listing 4-7, that I originally wrote as part of my BUGSLAYERUTIL.DLL library. It is a paper-thin layer of the existing DBGHELP.DLL symbol engine API, but it does provide some workarounds to problems that I've
Listing 4-7 SYMBOLENGINE.H
/*----------------------------------------------------------------------
"Debugging Applications" (Microsoft Press)
Copyright (c) 1997-2000 John Robbins -- All rights reserved.
------------------------------------------------------------------------
This class is a paper-thin layer around the DBGHELP.DLL symbol engine.
This class wraps only those functions that take the unique
HANDLE value. Other DBGHELP.DLL symbol engine functions are global in
scope, so I didn't wrap them with this class.
------------------------------------------------------------------------
Compilation Defines:
DO_NOT_WORK_AROUND_SRCLINE_BUG - If defined, the class will NOT work
                                 around the SymGetLineFromAddr bug where
                                 PDB file lookups fail after the first
                                 lookup.
USE_BUGSLAYERUTIL - If defined, the class will have another
                    initialization method, BSUSymInitialize, which will
                    use BSUSymInitialize from BUGSLAYERUTIL.DLL to
                    initialize the symbol engine and allow the invade
                    process flag to work for all Win32 operating systems.
                    If you use this define, you must use
                    BUGSLAYERUTIL.H to include this file.
----------------------------------------------------------------------*/
#ifndef _SYMBOLENGINE_H
#define _SYMBOLENGINE_H

// You could include either IMAGEHLP.DLL or DBGHELP.DLL.
#include "imagehlp.h"
#include <tchar.h>

// Include these in case the user forgets to link against them.
#pragma comment (lib, "dbghelp.lib")
#pragma comment (lib, "version.lib")

// The great Bugslayer idea of creating wrapper classes on structures
// that have size fields came from fellow MSJ columnist, Paul DiLascia.
// Thanks, Paul!
// I didn't wrap IMAGEHLP_SYMBOL because that is a variable-size
// structure.

// The IMAGEHLP_MODULE wrapper class
struct CImageHlp_Module : public IMAGEHLP_MODULE
{
    CImageHlp_Module()
    {
        memset(this, NULL, sizeof(IMAGEHLP_MODULE));
        SizeOfStruct = sizeof(IMAGEHLP_MODULE);
    }
};

// The IMAGEHLP_LINE wrapper class
struct CImageHlp_Line : public IMAGEHLP_LINE
{
    CImageHlp_Line()
    {
        memset(this, NULL, sizeof(IMAGEHLP_LINE));
        SizeOfStruct = sizeof(IMAGEHLP_LINE);
    }
};
// The symbol engine class
class CSymbolEngine
{
/*----------------------------------------------------------------------
Public Construction and Destruction
----------------------------------------------------------------------*/
public:
    // To use this class, call the SymInitialize member function to
    // initialize the symbol engine and then use the other member
    // functions in place of their corresponding DBGHELP.DLL functions.
    CSymbolEngine(void)
    {
    }
    virtual ~CSymbolEngine(void)
    {
    }
/*----------------------------------------------------------------------
Public Helper Information Functions
----------------------------------------------------------------------*/
public:
    // Returns the file version of DBGHELP.DLL being used.
    // To convert the return values into a readable format:
    //   wsprintf(szVer,
    //            _T("%d.%02d.%d.%d"),
    //            HIWORD(dwMS),
    //            LOWORD(dwMS),
    //            HIWORD(dwLS),
    //            LOWORD(dwLS));
    // szVer will contain a string like: 5.00.1878.1
    BOOL GetImageHlpVersion(DWORD & dwMS, DWORD & dwLS)
    {
        return (GetInMemoryFileVersion(_T("DBGHELP.DLL"),
                                       dwMS,
                                       dwLS));
    }
    BOOL GetDbgHelpVersion(DWORD & dwMS, DWORD & dwLS)
    {
        return (GetInMemoryFileVersion(_T("DBGHELP.DLL"),
                                       dwMS,
                                       dwLS));
    }
    // Returns the file version of the PDB reading DLLs
    BOOL GetPDBReaderVersion(DWORD & dwMS, DWORD & dwLS)
    {
        // First try MSDBI.DLL.
        if (TRUE == GetInMemoryFileVersion(_T("MSDBI.DLL"),
                                           dwMS,
                                           dwLS))
        {
            return (TRUE);
        }
        else if (TRUE == GetInMemoryFileVersion(_T("MSPDB60.DLL"),
                                                dwMS,
                                                dwLS))
        {
            return (TRUE);
        }
        // Just fall down to MSPDB50.DLL.
        return (GetInMemoryFileVersion(_T("MSPDB50.DLL"),
                                       dwMS,
                                       dwLS));
    }
    // The worker function used by the previous two functions
    BOOL GetInMemoryFileVersion(LPCTSTR szFile,
                                DWORD & dwMS,
                                DWORD & dwLS)
    {
        HMODULE hInstIH = GetModuleHandle(szFile);
        // Get the full filename of the loaded version.
        TCHAR szImageHlp[MAX_PATH];
        GetModuleFileName(hInstIH, szImageHlp, MAX_PATH);
        dwMS = 0;
        dwLS = 0;
        // Get the version information size.
        DWORD dwVerInfoHandle;
        DWORD dwVerSize;
        dwVerSize = GetFileVersionInfoSize(szImageHlp,
                                           &dwVerInfoHandle);
        if (0 == dwVerSize)
        {
            return (FALSE);
        }
        // Got the version size, now get the version information.
        LPVOID lpData = (LPVOID)new TCHAR[dwVerSize];
        if (FALSE == GetFileVersionInfo(szImageHlp,
                                        dwVerInfoHandle,
                                        dwVerSize,
                                        lpData))
        {
            delete [] lpData;
            return (FALSE);
        }
        VS_FIXEDFILEINFO * lpVerInfo;
        UINT uiLen;
        BOOL bRet = VerQueryValue(lpData,
                                  _T("\\"),
                                  (LPVOID*)&lpVerInfo,
                                  &uiLen);
        if (TRUE == bRet)
        {
            dwMS = lpVerInfo->dwFileVersionMS;
            dwLS = lpVerInfo->dwFileVersionLS;
        }
        delete [] lpData;
        return (bRet);
    }
/*----------------------------------------------------------------------
Public Initialization and Cleanup
----------------------------------------------------------------------*/
public:
    BOOL SymInitialize(IN HANDLE hProcess,
                       IN LPSTR  UserSearchPath,
                       IN BOOL   fInvadeProcess)
    {
        m_hProcess = hProcess;
        return (::SymInitialize(hProcess,
                                UserSearchPath,
                                fInvadeProcess));
    }
#ifdef USE_BUGSLAYERUTIL
    BOOL BSUSymInitialize(DWORD  dwPID,
                          HANDLE hProcess,
                          PSTR   UserSearchPath,
                          BOOL   fInvadeProcess)
    {
        m_hProcess = hProcess;
        return (::BSUSymInitialize(dwPID,
                                   hProcess,
                                   UserSearchPath,
                                   fInvadeProcess));
    }
#endif  // USE_BUGSLAYERUTIL
    BOOL SymCleanup(void)
    {
        return (::SymCleanup(m_hProcess));
    }
/*----------------------------------------------------------------------
Public Module Manipulation
----------------------------------------------------------------------*/
public:
    BOOL SymEnumerateModules(IN PSYM_ENUMMODULES_CALLBACK
                                         EnumModulesCallback,
                             IN PVOID    UserContext)
    {
        return (::SymEnumerateModules(m_hProcess,
                                      EnumModulesCallback,
                                      UserContext));
    }
    BOOL SymLoadModule(IN HANDLE hFile,
                       IN PSTR   ImageName,
                       IN PSTR   ModuleName,
                       IN DWORD  BaseOfDll,
                       IN DWORD  SizeOfDll)
    {
        return (::SymLoadModule(m_hProcess,
                                hFile,
                                ImageName,
                                ModuleName,
                                BaseOfDll,
                                SizeOfDll));
    }
    BOOL EnumerateLoadedModules(IN PENUMLOADED_MODULES_CALLBACK
                                            EnumLoadedModulesCallback,
                                IN PVOID    UserContext)
    {
        return (::EnumerateLoadedModules(m_hProcess,
                                         EnumLoadedModulesCallback,
                                         UserContext));
    }
    BOOL SymUnloadModule(IN DWORD BaseOfDll)
    {
        return (::SymUnloadModule(m_hProcess, BaseOfDll));
    }
    BOOL SymGetModuleInfo(IN  DWORD            dwAddr,
                          OUT PIMAGEHLP_MODULE ModuleInfo)
    {
        return (::SymGetModuleInfo(m_hProcess,
                                   dwAddr,
                                   ModuleInfo));
    }
    DWORD SymGetModuleBase(IN DWORD dwAddr)
    {
        return (::SymGetModuleBase(m_hProcess, dwAddr));
    }
/*----------------------------------------------------------------------
Public Symbol Manipulation
----------------------------------------------------------------------*/
public:
    BOOL SymEnumerateSymbols(IN DWORD BaseOfDll,
                             IN PSYM_ENUMSYMBOLS_CALLBACK
                                      EnumSymbolsCallback,
                             IN PVOID UserContext)
    {
        return (::SymEnumerateSymbols(m_hProcess,
                                      BaseOfDll,
                                      EnumSymbolsCallback,
                                      UserContext));
    }
    BOOL SymGetSymFromAddr(IN  DWORD            dwAddr,
                           OUT PDWORD           pdwDisplacement,
                           OUT PIMAGEHLP_SYMBOL Symbol)
    {
        return (::SymGetSymFromAddr(m_hProcess,
                                    dwAddr,
                                    pdwDisplacement,
                                    Symbol));
    }
    BOOL SymGetSymFromName(IN  LPSTR            Name,
                           OUT PIMAGEHLP_SYMBOL Symbol)
    {
        return (::SymGetSymFromName(m_hProcess,
                                    Name,
                                    Symbol));
    }
    BOOL SymGetSymNext(IN OUT PIMAGEHLP_SYMBOL Symbol)
    {
        return (::SymGetSymNext(m_hProcess, Symbol));
    }
    BOOL SymGetSymPrev(IN OUT PIMAGEHLP_SYMBOL Symbol)
    {
        return (::SymGetSymPrev(m_hProcess, Symbol));
    }
/*----------------------------------------------------------------------
Public Source Line Manipulation
----------------------------------------------------------------------*/
public:
    BOOL SymGetLineFromAddr(IN  DWORD          dwAddr,
                            OUT PDWORD         pdwDisplacement,
                            OUT PIMAGEHLP_LINE Line)
    {
#ifdef DO_NOT_WORK_AROUND_SRCLINE_BUG
        // Just pass along the values returned by the main function.
        return (::SymGetLineFromAddr(m_hProcess,
                                     dwAddr,
                                     pdwDisplacement,
                                     Line));
#else
        // The problem is that the symbol engine finds only those source
        // line addresses (after the first lookup) that fall exactly on
        // a zero displacement. I'll walk backward 100 bytes to
        // find the line and return the proper displacement.
        DWORD dwTempDis = 0;
        while (FALSE == ::SymGetLineFromAddr(m_hProcess,
                                             dwAddr - dwTempDis,
                                             pdwDisplacement,
                                             Line))
        {
            dwTempDis += 1;
            if (100 == dwTempDis)
            {
                return (FALSE);
            }
        }
        // I found it and the source line information is correct, so I'll
        // change the displacement if I had to search backward to find
        // the source line.
        if (0 != dwTempDis)
        {
            *pdwDisplacement = dwTempDis;
        }
        return (TRUE);
#endif  // DO_NOT_WORK_AROUND_SRCLINE_BUG
    }
    BOOL SymGetLineFromName(IN     LPSTR          ModuleName,
                            IN     LPSTR          FileName,
                            IN     DWORD          dwLineNumber,
                            OUT    PLONG          plDisplacement,
                            IN OUT PIMAGEHLP_LINE Line)
    {
        return (::SymGetLineFromName(m_hProcess,
                                     ModuleName,
                                     FileName,
                                     dwLineNumber,
                                     plDisplacement,
                                     Line));
    }
    BOOL SymGetLineNext(IN OUT PIMAGEHLP_LINE Line)
    {
        return (::SymGetLineNext(m_hProcess, Line));
    }
    BOOL SymGetLinePrev(IN OUT PIMAGEHLP_LINE Line)
    {
        return (::SymGetLinePrev(m_hProcess, Line));
    }
    BOOL SymMatchFileName(IN  LPSTR   FileName,
                          IN  LPSTR   Match,
                          OUT LPSTR * FileNameStop,
                          OUT LPSTR * MatchStop)
    {
        return (::SymMatchFileName(FileName,
                                   Match,
                                   FileNameStop,
                                   MatchStop));
    }
/*----------------------------------------------------------------------
                    Public Miscellaneous Members
----------------------------------------------------------------------*/
public      :
    LPVOID SymFunctionTableAccess ( DWORD AddrBase )
    {
        return ( ::SymFunctionTableAccess ( m_hProcess , AddrBase ) ) ;
    }
    BOOL SymGetSearchPath ( OUT LPSTR SearchPath       ,
                            IN  DWORD SearchPathLength  )
    {
        return ( ::SymGetSearchPath ( m_hProcess       ,
                                      SearchPath       ,
                                      SearchPathLength  ) ) ;
    }
    BOOL SymSetSearchPath ( IN LPSTR SearchPath )
    {
        return ( ::SymSetSearchPath ( m_hProcess , SearchPath ) ) ;
    }
    BOOL SymRegisterCallback ( IN PSYMBOL_REGISTERED_CALLBACK
                                                     CallbackFunction ,
                               IN PVOID              UserContext       )
    {
        return ( ::SymRegisterCallback ( m_hProcess       ,
                                         CallbackFunction ,
                                         UserContext       ) ) ;
    }
/*----------------------------------------------------------------------
                       Protected Data Members
----------------------------------------------------------------------*/
protected   :
    // The unique value that will be used for this instance of the
    // symbol engine. This value doesn't have to be an actual
    // process value, just a unique value.
    HANDLE m_hProcess ;
} ;
#endif // _SYMBOLENGINE_H
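The backward search in the SymGetLineFromAddr wrapper above is easier to follow in isolation. Here is a minimal, portable sketch of the same idea. It doesn't call DBGHELP.DLL at all: the LineTable type and the ExactLineLookup and LineFromAddr functions are hypothetical stand-ins that model a lookup that succeeds only on a zero displacement. The extra guard against walking below address zero is my addition and isn't in the wrapper.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the symbol engine's line table: it maps the
// first address of each source line to that line's number.
using LineTable = std::map<std::uint32_t, unsigned>;

// Models the problematic lookup: it succeeds only when addr is exactly
// the first address of a source line (a zero displacement).
inline bool ExactLineLookup(const LineTable& table, std::uint32_t addr,
                            unsigned* line)
{
    auto it = table.find(addr);
    if (it == table.end())
        return false;
    *line = it->second;
    return true;
}

// Tries addr, addr - 1, ... addr - 99 and reports how far back it had to
// go as the displacement, mirroring the wrapper's 100-byte limit.
inline bool LineFromAddr(const LineTable& table, std::uint32_t addr,
                         unsigned* line, std::uint32_t* displacement)
{
    assert(line != nullptr && displacement != nullptr);
    for (std::uint32_t back = 0; back < 100 && back <= addr; ++back)
    {
        if (ExactLineLookup(table, addr - back, line))
        {
            *displacement = back;
            return true;
        }
    }
    return false;
}
```

With line starts at 0x1000 and 0x1010, a lookup of 0x1004 walks back four bytes and reports a displacement of 4. An address more than 99 bytes past the last known line start fails, just as the wrapper returns FALSE.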
Before Windows 2000 became available, getting the Microsoft-supplied symbol engine working was no easy task. The main reason for the difficulty was that the symbol engine was in IMAGEHLP.DLL and many programs used it. Because you can't replace a DLL that is currently loaded, getting a
Having DBGHELP.DLL installed is only part of the battle, because you need to ensure that you have your symbol files accessible to the symbol engine in order to load them. For DBG files, the DBGHELP.DLL symbol engine will look for them in the following places:
The environment variables must point to directories that are set up a specific way. For example, if your symbols are located in c:\MyFiles, you must create a directory named Symbols under your main directory. Under the Symbols directory, you must create a directory for each extension your binary files use. For example, if you have an EXE and a couple of DLLs, your final directory tree would look like the following. The DBG files for each of your particular extensions go in the appropriate places.
c:\MyFiles
c:\MyFiles\Symbols
c:\MyFiles\Symbols\Exe
c:\MyFiles\Symbols\Dll
For PDB files, the only difference is that the DBGHELP.DLL symbol engine will look in the binary for the original PDB path and try to load the PDB from that absolute directory. If the DBGHELP.DLL symbol engine can't load the PDB file from that directory, it will attempt to load the PDB file using the same steps I described previously for DBG files.
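To make the directory convention concrete, here's a small sketch that builds the location where a DBG file would be expected for a given binary. DbgPathFor is a hypothetical helper of my own, not part of DBGHELP.DLL, and it assumes that the DBG file carries the binary's base name with a .dbg extension. It takes the extension directory straight from the binary's name, so "exe" rather than "Exe"; Windows file systems don't care about the case.

```cpp
#include <cassert>
#include <string>

// Builds <root>\Symbols\<ext>\<name>.dbg for a binary such as "MyApp.exe".
// Returns an empty string if the binary name has no usable extension.
inline std::string DbgPathFor(const std::string& root,
                              const std::string& binary)
{
    const std::string::size_type dot = binary.rfind('.');
    if (dot == std::string::npos || dot == 0 || dot + 1 == binary.size())
        return std::string();
    const std::string name = binary.substr(0, dot);   // "MyApp"
    const std::string ext  = binary.substr(dot + 1);  // "exe"
    return root + "\\Symbols\\" + ext + "\\" + name + ".dbg";
}
```

For example, DbgPathFor("c:\\MyFiles", "MyApp.exe") yields c:\MyFiles\Symbols\exe\MyApp.dbg.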
Fortunately for all of us, we don't have to write our own stack-walking code. DBGHELP.DLL provides the StackWalk API function. StackWalk is straightforward and takes care of all your stack-walking needs. WDBG uses the StackWalk API function just as the Visual C++ debugger does. The only snag you might encounter is that the documentation isn't explicit about what needs to be set in the STACKFRAME structure. Listing 4-8 shows you the exact fields that need to be filled out in the STACKFRAME structure.
StackWalk does such a good job of taking care of the details that you might not be aware that stack walking can be difficult with optimized code. The reason for the difficulty is that the compiler can optimize away the stack frame, the place where the code pushes stack entries, for some functions. The Visual C++ and Visual Basic compilers are
Listing 4-8 InitializeStackFrameWithContext from I386CPUHELP.C
BOOL CPUHELP_DLLINTERFACE __stdcall
    InitializeStackFrameWithContext ( STACKFRAME * pStack ,
                                      CONTEXT *    pCtx    )
{
    ASSERT ( FALSE == IsBadReadPtr ( pCtx , sizeof ( CONTEXT ) ) ) ;
    ASSERT ( FALSE == IsBadWritePtr ( pStack , sizeof ( STACKFRAME ) ) ) ;
    if ( ( TRUE == IsBadReadPtr ( pCtx , sizeof ( CONTEXT ) ) )       ||
         ( TRUE == IsBadWritePtr ( pStack , sizeof ( STACKFRAME ) ) )   )
    {
        return ( FALSE ) ;
    }
    // The program counter, stack pointer, and frame pointer all come
    // from the thread context and use flat addressing.
    pStack->AddrPC.Offset    = pCtx->Eip ;
    pStack->AddrPC.Mode      = AddrModeFlat ;
    pStack->AddrStack.Offset = pCtx->Esp ;
    pStack->AddrStack.Mode   = AddrModeFlat ;
    pStack->AddrFrame.Offset = pCtx->Ebp ;
    pStack->AddrFrame.Mode   = AddrModeFlat ;
    return ( TRUE ) ;
}
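To see why a missing stack frame makes walking hard, it helps to look at what a walker can do when every function does set up a standard frame. The sketch below is not how StackWalk is implemented. It's a portable model in which Memory and WalkFrames are hypothetical names and debuggee memory is a simple map. It relies only on the conventional IA32 frame layout, in which [EBP] holds the caller's EBP and [EBP + 4] holds the return address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model of debuggee memory: 32-bit values keyed by address.
using Memory = std::map<std::uint32_t, std::uint32_t>;

// Follows the saved-EBP chain starting from the current EIP and EBP and
// returns the program counter for each frame, innermost first.
inline std::vector<std::uint32_t> WalkFrames(const Memory& mem,
                                             std::uint32_t eip,
                                             std::uint32_t ebp,
                                             std::size_t maxFrames = 64)
{
    std::vector<std::uint32_t> pcs;
    pcs.push_back(eip);
    while (pcs.size() < maxFrames)
    {
        auto savedEbp = mem.find(ebp);       // [EBP]     caller's EBP
        auto retAddr  = mem.find(ebp + 4);   // [EBP + 4] return address
        if (savedEbp == mem.end() || retAddr == mem.end() ||
            retAddr->second == 0)
            break;
        pcs.push_back(retAddr->second);
        // The stack grows down, so each caller's frame must sit at a
        // higher address. Anything else means the chain is broken.
        if (savedEbp->second <= ebp)
            break;
        ebp = savedEbp->second;
    }
    return pcs;
}
```

If a function in the chain was compiled without a frame, EBP no longer points at a saved EBP and return address pair, and a simple chain walk like this one reports garbage or stops early.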
Now that I've described breakpoints and the symbol engine, I want to explain how debuggers implement the excellent Step Into, Step Over, and Step Out functionality. I didn't implement these features in WDBG because I wanted to concentrate on the core portions of the debugger. Step Into, Step Over, and Step Out require source and disassembly views that allow you to keep track of the current executing line or instruction. After you read the discussion in this section, you'll see that the core architecture of WDBG has the infrastructure you need to wire these features in and that adding these features is mostly an exercise in UI programming.
Step Into, Step Over, and Step Out all work with one-shot breakpoints, which, as you'll recall from earlier in the chapter, are breakpoints that the debugger discards after the breakpoints trigger. In the Debug Break discussion earlier in the chapter, you saw another instance in which the debugger uses one-shot breakpoints to stop the processing.
Step Into works differently depending on whether you're debugging at the source level or the disassembly level. When debugging at the source level, the debugger must rely on one-shot breakpoints because a single high-level language line
At the source level, the debugger knows the source line you're on. When you execute the debugger's Step Into command, the debugger uses the symbol engine to look up the address of the next line to execute. The debugger will do a partial disassembly at the next line address to see whether the line is a call instruction. If the line is a call instruction, the debugger will set a one-shot breakpoint on the first address of the function the debuggee is about to call. If the next line address isn't a call instruction, the debugger sets a one-shot breakpoint there. After setting the one-shot breakpoint, the debugger will release the debuggee so that it runs to the freshly set one-shot breakpoint. When the one-shot breakpoint triggers, the debugger will replace the opcode at the one-shot location and free any memory associated with the one-shot breakpoint. If the user is working at the disassembly level, Step Into is much easier to implement because the debugger will just force the CPU into single-step execution.
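The life cycle of a one-shot breakpoint can be sketched in a few lines. This is a portable model rather than WDBG code: the OneShotBreakpoints class is hypothetical, and a byte vector stands in for the debuggee's code, which a real debugger would read and write with ReadProcessMemory and WriteProcessMemory. The sketch also leaves out backing the instruction pointer up over the INT 3 before the debuggee resumes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

const std::uint8_t kInt3Opcode = 0xCC;   // IA32 INT 3 breakpoint opcode

// Hypothetical model: debuggee code is a byte vector indexed by address.
class OneShotBreakpoints
{
public:
    explicit OneShotBreakpoints(std::vector<std::uint8_t>& code)
        : m_code(code) {}

    // Saves the original opcode and writes INT 3 over it.
    bool Set(std::size_t addr)
    {
        if (addr >= m_code.size() || m_saved.count(addr) != 0)
            return false;
        m_saved[addr] = m_code[addr];
        m_code[addr] = kInt3Opcode;
        return true;
    }

    // Call when a breakpoint exception arrives at addr. If the address
    // belongs to a one-shot breakpoint, this restores the opcode and
    // discards the breakpoint so that it can never trigger again.
    bool OnTriggered(std::size_t addr)
    {
        auto it = m_saved.find(addr);
        if (it == m_saved.end())
            return false;
        m_code[addr] = it->second;
        m_saved.erase(it);
        return true;
    }

    std::size_t Pending() const { return m_saved.size(); }

private:
    std::vector<std::uint8_t>& m_code;             // simulated debuggee code
    std::map<std::size_t, std::uint8_t> m_saved;   // address -> original opcode
};
```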
Step Over is similar to Step Into in that the debugger must look up the next line in the symbol engine and does the partial disassembly at the line address. The difference is that in Step Over the debugger will set a one-shot breakpoint after the call instruction if the line is a call.
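The difference between the two commands comes down to where the one-shot breakpoint goes. The following sketch computes both targets. It is a simplification under stated assumptions: StepTargets and ComputeStepTargets are hypothetical names, the "partial disassembly" recognizes only the direct near call (opcode 0xE8 followed by a 32-bit relative displacement), and the code bytes are passed in as a vector instead of being read from the debuggee. A real implementation would also have to handle indirect calls and calls that appear later in the line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const std::uint8_t  kCallRel32Opcode = 0xE8;  // CALL rel32
const std::uint32_t kCallRel32Size   = 5;     // opcode plus 4 displacement bytes

struct StepTargets
{
    std::uint32_t stepInto;   // where Step Into puts its one-shot breakpoint
    std::uint32_t stepOver;   // where Step Over puts its one-shot breakpoint
};

// code holds the debuggee bytes starting at address base, and lineAddr is
// the address of the next line to execute, as reported by the symbol engine.
inline StepTargets ComputeStepTargets(const std::vector<std::uint8_t>& code,
                                      std::uint32_t base,
                                      std::uint32_t lineAddr)
{
    assert(lineAddr >= base);
    StepTargets targets = { lineAddr, lineAddr };   // not a call: stop at the line
    const std::size_t off = lineAddr - base;
    if (off + kCallRel32Size <= code.size() && code[off] == kCallRel32Opcode)
    {
        // Assemble the little-endian displacement byte by byte.
        const std::uint32_t rel =
            static_cast<std::uint32_t>(code[off + 1])         |
            (static_cast<std::uint32_t>(code[off + 2]) << 8)  |
            (static_cast<std::uint32_t>(code[off + 3]) << 16) |
            (static_cast<std::uint32_t>(code[off + 4]) << 24);
        const std::uint32_t next = lineAddr + kCallRel32Size;
        targets.stepInto = next + rel;   // first address of the called function
        targets.stepOver = next;         // instruction after the call
    }
    return targets;
}
```

The displacement is relative to the instruction after the call, and unsigned wraparound takes care of backward calls.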
The Step Out operation is in some ways the simplest of the three. When the user selects the Step Out command, the debugger walks the stack to find the return address for the current function and sets a one-shot breakpoint on that address.
The processing for Step Into, Step Over, and Step Out seems straightforward, but there's one small twist that you need to consider. If you write your debugger to handle Step Into, Step Over, and Step Out, what are you going to do if you've set the one-shot breakpoint for those cases and a regular breakpoint triggers before the one-shot breakpoint? As a debugger writer, you have two choices. The first is to leave your one-shot breakpoints alone so that they trigger. The other option is to remove your one-shot breakpoint when the debugger notifies you that a regular breakpoint triggered. The latter option is what the Visual C++ debugger does.
Either way of handling this case is correct, but by removing the one-shot breakpoint for Step Into, Step Over, and Step Out, you avoid user confusion. If you allow the one-shot breakpoint to trigger after the normal breakpoint, the user can easily be left wondering why the debugger stopped at an odd location.
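The second option, the one the Visual C++ debugger takes, might look something like the following sketch. BreakpointTable, StopReason, and OnBreakpointHit are hypothetical names, and the sketch tracks only addresses. A real debugger would also restore the original opcode for each one-shot breakpoint it discards.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical bookkeeping for a debugger's breakpoints, by address.
struct BreakpointTable
{
    std::set<std::uint32_t> regular;   // user breakpoints, kept until removed
    std::set<std::uint32_t> oneShot;   // stepping breakpoints
};

enum class StopReason { NotOurs, RegularBreakpoint, StepComplete };

// Call for each breakpoint exception. Whenever the debugger stops at one
// of its own breakpoints, it discards every pending one-shot breakpoint,
// so a step interrupted by a regular breakpoint can't stop the debuggee
// again later at a location the user doesn't expect.
inline StopReason OnBreakpointHit(BreakpointTable& table, std::uint32_t addr)
{
    const bool isRegular = table.regular.count(addr) != 0;
    const bool isOneShot = table.oneShot.count(addr) != 0;
    if (!isRegular && !isOneShot)
        return StopReason::NotOurs;
    table.oneShot.clear();
    return isRegular ? StopReason::RegularBreakpoint
                     : StopReason::StepComplete;
}
```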
In general, I didn't have much trouble developing WDBG. However, one problem that proved to be rather interesting did come up. If you run the Visual C++ debugger, the Output window shows you the complete path to the modules as they load. Because I was trying to make WDBG as complete as I could, I wanted to duplicate that functionality. I didn't think doing so would be that difficult.
If you look at the following definition of the LOAD_DLL_DEBUG_INFO structure passed to the debugger on LOAD_DLL_DEBUG_EVENT notifications, you'll see a field for lpImageName , which you would think would be the name of the module loading. That's exactly what it is, but none of the Win32 operating systems ever fills it out.
typedef struct _LOAD_DLL_DEBUG_INFO {
    HANDLE hFile;
    LPVOID lpBaseOfDll;
    DWORD  dwDebugInfoFileOffset;
    DWORD  nDebugInfoSize;
    LPVOID lpImageName;
    WORD   fUnicode;
} LOAD_DLL_DEBUG_INFO;
Because I was loading the module into the DBGHELP.DLL symbol engine as I got the LOAD_DLL_DEBUG_EVENT notifications, I thought I could just look up the complete module name after loading it. The SymGetModuleInfo API function takes an IMAGEHLP_MODULE structure, as shown here, and there is space for the complete module name.
typedef struct _IMAGEHLP_MODULE {
    DWORD    SizeOfStruct;
    DWORD    BaseOfImage;
    DWORD    ImageSize;
    DWORD    TimeDateStamp;
    DWORD    CheckSum;
    DWORD    NumSyms;
    SYM_TYPE SymType;
    CHAR     ModuleName[32];
    CHAR     ImageName[256];
    CHAR     LoadedImageName[256];
} IMAGEHLP_MODULE, *PIMAGEHLP_MODULE;
The puzzling thing I noticed was that SymGetModuleInfo would return that the module symbol information was loaded, but the name of the module would be the name of the DBG symbol file or the module name would be missing completely. This behavior surprised me, but when I thought for a minute, I could see how it might be happening. When I got the LOAD_DLL_DEBUG_INFO structure, the hFile member was valid and I would in turn call SymLoadModule with that hFile. Because I never gave the DBGHELP.DLL symbol engine a full filename to load, it just
I just wanted to get the complete name of the module that was loaded. At first, I thought I could use the file handle
After pondering the problem a bit, I figured that there had to be an API function that would take a handle value and tell you the complete name of the open file. When I
Using the problem-solving approach that I outlined in Chapter 1, I took stock of the situation and set about formulating a hypothesis to explain the problem. As I read up on the PSAPI.DLL GetModuleFileNameEx function, I started to realize why it might not work when I was calling it. The LOAD_DLL_DEBUG_EVENT notification was telling me that a DLL was about to load into the address space, not that the DLL had loaded. Because the memory hadn't yet been mapped to hold the DLL, GetModuleFileNameEx was failing; when I stepped through it at the assembly-language level, it appeared to be looking through a mapped memory list that the operating system held for each process.
Now that I knew the source of the problem, I just needed a way to find out when the operating system fully mapped the module into memory. Although I probably could have gone to extreme measures to get this information, such as reverse engineer the image loader in NTDLL.DLL and set a breakpoint there, I
Common Debugging Question
Why can't I step into system functions or set breakpoints in system memory on Windows 98?
If you've ever tried to step into certain system functions on Windows 98, you've seen that the debugger doesn't let you. Windows 2000, on the other hand, allows you to step anywhere you want in your user-mode processes. The reason is that Windows 2000 completely implements copy-on-write, whereas Windows 98 does copy-on-write only for addresses below 2 GB.
As I described in the section "Reading and Writing Memory" earlier in the chapter, copy-on-write allows processes to have their own private copies of mapped memory pages when they, or the debugger, write to a page. Because of the architecture of Windows 98, all processes share the address space above 2 GB. Because Windows 98 doesn't implement copy-on-write for those addresses, if Windows 98 allowed you to set a breakpoint in the shared memory, the first process that executed that address would cause a breakpoint exception. Because that process probably isn't running under a debugger, the process would terminate with a breakpoint exception. Although some system DLLs, such as COMCTL32.DLL, load below 2 GB, the main system DLLs such as KERNEL32.DLL and USER32.DLL load above 2 GB, which means that unless you have a kernel debugger running on Windows 98, you can't step into them with a user-mode debugger.
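The rule reduces to a single address comparison. The sketch below shows the kind of check a user-mode debugger could make before writing a breakpoint. TargetOs and CanPatchBreakpoint are hypothetical names, and the function assumes that the address it's given is already a valid user-mode code address.

```cpp
#include <cassert>
#include <cstdint>

enum class TargetOs { Windows98, Windows2000 };

// On Windows 98, everything at or above 2 GB is shared by all processes.
const std::uint32_t kSharedMemoryStart = 0x80000000u;

// Returns true if writing a breakpoint at addr would affect only the
// debuggee's private copy of the page.
inline bool CanPatchBreakpoint(TargetOs os, std::uint32_t addr)
{
    if (os == TargetOs::Windows98)
        return addr < kSharedMemoryStart;   // copy-on-write only below 2 GB
    return true;                            // copy-on-write everywhere
}
```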