ADPlus


Now that I've discussed WinDBG and SOS, we can finally turn to ADPlus. For some strange reason, your network administrators are never too keen on letting you install Visual Studio on a production server and single-step that mission-critical application. In those cases, the best you are going to get is a minidump, which ADPlus produces with aplomb.

By installing the Debugging Tools for Windows, you not only get the trinity of WinDBG, CDB, and NTSD debuggers, you also get ADPlus.vbs, which is the Visual Basic Script file that is ADPlus. The good news is that once you've installed the Debugging Tools for Windows on one machine, to install ADPlus on a production server is simply a matter of using XCOPY to copy the entire Debugging Tools for Windows directory tree. That's extremely helpful because when you tell a network administrator that you want to install software on a production server, they look at you as if you had just told them that you were outlawing all caffeinated drinks and donuts from the office.

What ADPlus does is build a script file containing all the commands you want to execute and passes that script file to the debugger on the command line. As you can imagine, this makes your life much easier because ADPlus can abstract the grunt work for you. More importantly, ADPlus offers a consistent configuration scheme so you can easily reuse the configurations for other projects. Once you get used to the ADPlus way of operating, it's quite easy to take it into directions you didn't think possible. By the way, by default, ADPlus uses CDB as the debugger, but you can specify WinDBG or NTSD with the -dbg command-line switch to ADPlus.

ADPlus has two execution modes you can use for snapping managed minidumps. Running ADPlus in hang mode means that the CDB debugger will do a noninvasive attach to the target process and write a minidump of the process. When run in crash mode, ADPlus will configure CDB to attach as a normal native debugger, and you can have it perform specific actions, such as running specific SOS commands or creating minidumps, when the configured exception or breakpoint is hit. There's an undocumented third mode, called Quick. However, that mode produces only basic minidumps, so you can't process them with SOS.

Before I jump into techniques for configuring ADPlus, you should look at the ADPlus documentation in the Debugging Tools for Windows help file, Debugger.chm. You'll find the documentation in Debugging Tools for Windows\Extra Tools\ADPlus. If you really want to see how ADPlus works, it's also not a bad idea to read the ADPlus.vbs file. It's surprisingly well commented, and you'll learn even more tricks on how to use it. As with nearly everything else in this book, I'm assuming that you've at least scanned the ADPlus documentation.

One major issue with ADPlus is that the documentation does not discuss at all the correct way to use command-line parameters with ADPlus. Simple switches are fine, but if you pass an option that takes a file or directory, make sure to pass the complete path to the item. ADPlus has a small problem in that it assumes that the current directory is where ADPlus.vbs is located, so any relative paths to directories or files in command line switches will be incorrect.

Hang Mode

Having the ability to get vital information out of your application automatically is a huge boon for debugging. In most cases, your production application debugging will revolve around using ADPlus hang mode to get the information so you can look at the live data. If you read the ADPlus documentation, you saw that the ADPlus command line alone is sufficient for you to grab a minidump of a process at any time.

However, it's far better to set up a configuration file that contains all the options you'll want so you can reuse it. Listing 6-1 shows my standard configuration file that I use for all .NET applications to get a minidump and a few other key pieces of information. You can also find this file in the .\ADPlus directory in the code.

Listing 6-1. DNHANG.XML ADPlus Configuration File


<!-- Default ADPlus HANG mode configuration for all process types. -->
<ADPlus>
    <Settings>
        <!-- Set the mode to HANG -->
        <RunMode>HANG</RunMode>
        <!-- Snap the dumps, don't tell me about it -->
        <Option>Quiet</Option>
    </Settings>
    <HangActions>
        <!-- For custom actions, I want to see all -->
        <!-- the handle info, the managed CLR version, -->
        <!-- managed threads, managed call stacks, and -->
        <!-- bigger objects. -->
        <CustomActions>
            !handle 0 f;
            .loadby sos mscorwks;
            !eeversion;
            !threads;
            ~*e!clrstack;
            !dumpheap -stat -min 100;
        </CustomActions>
    </HangActions>
</ADPlus>



The <Settings> element contains the global options for the whole configuration file, and you can see that I'm specifying hang mode. The Quiet in the <Option> element is extremely important to set. By default, ADPlus wants to pop up a message box and let you know where it's writing files. As we all know, any extraneous message boxes are not a great idea, so I tell ADPlus to shut up and just do its job. The Quiet option is especially important if you are going to use ADPlus on a test machine and you have a script set up to call ADPlus at specific intervals. With the message box up, the VBScript file will never end.

The <HangActions> element, pardon the bad pun, is where the action occurs. In nearly all cases, the default actions for hangs, defined in the <Options> element, are very worthwhile to run. I've listed all the commands run by default in Table 6-4 because the documentation does not report what's run. The first column of the table lists the ADPlus keyword for the command. To keep you from typing a long command string every time for each command, ADPlus allows text substitution from the configuration file. Additionally, these keywords expand to include any appropriate directories for the current run, which has a timestamp in the name, so you won't know the names beforehand. Note that I have not discussed some of the commands in this chapter because they are geared toward native development.

Table 6-4. Standard Commands in ADPlus Hang Mode

ADPlus Keyword  | Actual Command                    | Meaning
----------------+-----------------------------------+--------------------------------------------------------------
FullDump        | .dump /ma /c <comment> <filename> | Creates an SOS-compatible minidump.
Stacks          | ~*kb250                           | Walk the first 250 items on the call stacks for each thread.
LoadedModules   | lm v                              | Display all the version information for loaded modules.
MatchingSymbols | lm l                              | Show all modules that had symbols loaded.
Heap            | !heap 0 -k                        | Displays all operating system heap information, and on x86 systems, shows the stack back trace associated with each entry.
Handle          | !handle 0 0                       | Displays the statistics table from the !handle command.
Dlls            | !dlls                             | Show the table entries of all loaded modules.
Locks           | !ntsdexts.locks                   | Display all acquired critical sections.
ThreadUsage     | !runaway                          | Show how much time is used by each thread.


The <CustomActions> element is where even more action occurs. It allows you to specify any custom commands you want to execute after those specified in the <Options> element. As you can see from the comments in Listing 6-1, I'm running numerous SOS commands to get more information about the state of the application. If you've read the chapter to this point, the commands should be self-explanatory.

One issue with ADPlus configuration files that the documentation does not make clear is that you must explicitly separate all keywords and commands with semicolons. That's why I like to line up the items under the elements so I can scan down the list and ensure that the semicolon is present. Nothing is worse than building up a complicated ADPlus configuration file and having the debugger not report any data because ADPlus generates an invalid command. Fortunately, ADPlus properly parses the separate lines, so it's no trouble. Of course, there would have to be one special case to the rule of semicolons on each line: the Clear keyword. In the <HangActions><Option> element, you can tell ADPlus not to do any of the default commands with Clear. If a semicolon follows Clear, ADPlus will report an invalid command error.
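Because a missing semicolon is so easy to overlook, a small checker can scan the text of an actions element before you ship the configuration file. The following is only a sketch in Python, assuming the one-command-per-line layout I use in my listings; lint_actions is a hypothetical helper of my own, not part of ADPlus, and matching Clear without regard to case is a convenience I've assumed.

```python
def lint_actions(element_text):
    """Return a list of problems found in the text of an ADPlus actions element.

    Assumes one keyword or command per line, the layout used in the listings.
    """
    problems = []
    for number, raw in enumerate(element_text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are harmless
        if line.rstrip(";").strip().lower() == "clear":
            # Clear is the one keyword that must NOT be followed by a semicolon.
            if line.endswith(";"):
                problems.append(f"line {number}: remove the semicolon after Clear")
            continue
        if not line.endswith(";"):
            problems.append(f"line {number}: missing semicolon after '{line}'")
    return problems


# Example: the third command is missing its semicolon.
custom_actions = """
    !handle 0 f;
    .loadby sos mscorwks;
    !eeversion
    !threads;
"""
for problem in lint_actions(custom_actions):
    print(problem)
```

The checker only looks at line endings, so it won't catch two commands squeezed onto one line without a separator; lining the commands up one per line remains the best defense.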

Because the configuration file in Listing 6-1 does not include a <ProcessID> or <ProcessName> element under the <Settings> element, you need to specify the process to run against on the ADPlus command line. To specify a process ID, use the -p switch, and to specify the process by name, use the -pn switch. What most people don't realize about ADPlus is that it will happily script the debugger to attach to multiple processes so you can specify multiple -p or -pn switches as required.

If you are working with IIS 6.0 and want to snap only a minidump of a particular application pool, the command Tlist.exe -v will show you all running processes and their command lines. (Tlist.exe comes as part of the Debugging Tools for Windows.) Look through the list for the different W3wp.exe instances and their -ap command-line options to identify the application pool in which each instance is running. You'll then have the process ID of the exact W3wp.exe instance you can use with ADPlus's -p switch.
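If you have many worker processes, picking the right one by eye gets tedious. Here's a rough Python sketch of the matching step. It assumes you've already pulled (process ID, command line) pairs out of the Tlist.exe -v output by hand or with your own parsing, because I'm not relying on the exact output layout here; find_worker_pids and the sample command lines are hypothetical.

```python
import re

# Matches the -ap switch on a W3wp.exe command line, with or without quotes.
_APP_POOL = re.compile(r'-ap\s+(?:"([^"]+)"|(\S+))', re.IGNORECASE)


def find_worker_pids(processes, app_pool):
    """Return the PIDs of W3wp.exe instances hosting the given application pool.

    processes: iterable of (pid, command_line) pairs that you extracted
    yourself from the Tlist.exe -v output (an assumption of this sketch).
    """
    matches = []
    for pid, command_line in processes:
        if "w3wp.exe" not in command_line.lower():
            continue
        found = _APP_POOL.search(command_line)
        if found is None:
            continue
        pool = found.group(1) or found.group(2)
        if pool.lower() == app_pool.lower():
            matches.append(pid)
    return matches


# Hypothetical sample data; real command lines will differ.
sample = [
    (2184, r'c:\windows\system32\inetsrv\w3wp.exe -ap "DefaultAppPool"'),
    (3320, r'c:\windows\system32\inetsrv\w3wp.exe -ap "OrdersPool"'),
    (1044, r'c:\windows\system32\svchost.exe -k netsvcs'),
]
print(find_worker_pids(sample, "OrdersPool"))
```

Each process ID returned is a candidate for ADPlus's -p switch.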

The two other ADPlus switches you'll need to specify if you are using my DNHANG.xml file are -c, which tells ADPlus the full path to the configuration file, and -o, which is the output directory for all files. As part of the debugger script buildup, ADPlus will create a directory in the output location called Hang_Mode__Date_month-day-year__Time_hour-minute-seconds. This is great because that means you can continue to run the same ADPlus command line repeatedly without losing your data.
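To pull the command-line pieces together, here's a minimal Python sketch that assembles an ADPlus command line from the -c, -o, -p, and -pn switches and refuses relative paths, since ADPlus resolves them against its own directory. The build_adplus_command helper and the sample directories are hypothetical; running the script under cscript.exe is the usual way to start ADPlus.vbs.

```python
import ntpath  # Windows path rules, importable on any platform


def build_adplus_command(adplus_dir, config_file, output_dir, pids=(), names=()):
    """Build the argument list for running ADPlus.vbs under cscript.exe."""
    # ADPlus assumes the current directory is its own, so insist on full paths.
    for label, path in (("ADPlus directory", adplus_dir),
                        ("configuration file", config_file),
                        ("output directory", output_dir)):
        if not ntpath.isabs(path):
            raise ValueError(f"{label} must be a complete path: {path!r}")
    if not pids and not names:
        raise ValueError("specify at least one process ID or process name")

    command = ["cscript.exe", ntpath.join(adplus_dir, "ADPlus.vbs"),
               "-c", config_file, "-o", output_dir]
    for pid in pids:          # one -p switch per process ID
        command += ["-p", str(pid)]
    for name in names:        # one -pn switch per process name
        command += ["-pn", name]
    return command


# Hypothetical locations; substitute your own.
command = build_adplus_command(r"C:\Debuggers", r"C:\Configs\DNHANG.xml",
                               r"C:\Dumps", pids=[2184], names=["w3wp.exe"])
print(" ".join(command))
```

On the server itself you could hand the resulting list to subprocess.run, or simply paste the printed line into a batch file.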

In the run's output directory are two key files: the minidump file and the log from the debugger. You could open the minidump and execute all the debugger commands again, but it's best to first open the log file and read the output. That way you save a great deal of time because you can see the results of the common commands, and if there are any anomalies, you can open up the dump and poke at it until your heart is content looking for the problem. By the way, if you want to see the script ADPlus generated for the debugger, look in the CDBScripts directory.
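If you run ADPlus repeatedly, finding the newest run by hand gets old. The Python sketch below picks the most recently modified run directory and lists its dumps and logs. It assumes the run directories start with Hang_Mode__ as described above and that the files end in .dmp and .log; latest_hang_run is a hypothetical helper, so check those assumptions against your own output directory.

```python
from pathlib import Path


def latest_hang_run(output_dir):
    """Return (run_directory, dump_files, log_files) for the newest hang-mode run.

    Returns None when the output directory holds no hang-mode runs.
    """
    runs = [p for p in Path(output_dir).iterdir()
            if p.is_dir() and p.name.startswith("Hang_Mode__")]
    if not runs:
        return None
    # The timestamp is in the name, but modification time is simpler to compare.
    newest = max(runs, key=lambda p: p.stat().st_mtime)
    dumps = sorted(newest.rglob("*.dmp"))   # assumed dump extension
    logs = sorted(newest.rglob("*.log"))    # assumed log extension
    return newest, dumps, logs
```

Read the log files first, and open the dumps only if something in the log looks off.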

The one drawback of the default options in the hang configuration file is that it's writing a full-memory minidump. Although that minidump is invaluable, the noninvasive attach suspends your process while the dump is written, and that can take too long. With a large ASP.NET application, the minidump can easily take five to ten minutes to create. While the worker process is stopped in the debugger, all the connection requests are bouncing off and falling on the floor.

Listing 6-2 shows my DNHANG-Quick.xml file, which does a minimal amount of work to tell you what's going on in a .NET application. By turning off all the default operations that ADPlus wants to do, we avoid the time spent on the full-memory minidump and on the native call stack walks, which require symbol loading, but still get useful .NET information. Since you can execute any commands in the <CustomActions> element, you have complete control of the information you want to see at any time.

Listing 6-2. DNHANG-Quick.xml ADPlus Configuration File


<!-- Quick HANG mode configuration for all process types. -->
<ADPlus>
    <Settings>
        <!-- Set the mode to HANG -->
        <RunMode>HANG</RunMode>
        <!-- Snap the dumps, don't tell me about it -->
        <Option>Quiet</Option>
    </Settings>
    <HangActions>
        <!-- Clear out all the default options that ADPlus wants to run. -->
        <Option> Clear </Option>
        <!-- For custom actions, I want to see all -->
        <!-- the handle info, the managed CLR version, -->
        <!-- managed threads, and managed call stacks. -->
        <CustomActions>
            .loadby sos mscorwks;
            !eeversion;
            !threads;
            ~*e!clrstack;
        </CustomActions>
    </HangActions>
</ADPlus>



Crash Mode

Whereas hang mode simply gathers some information and jumps off the process, crash mode means that ADPlus configures the debugger to attach as a native debugger and perform specific commands on an exception or breakpoint. Although we can't yet set those breakpoints on our C# source code, there's still a tremendous amount of power available in the ADPlus crash mode.

Crash Mode Exceptions

The <Exceptions> element in the configuration file defines the actions you want to perform for all exceptions or a specific exception. There's one issue with the <Exceptions> element that may cause you some grief, and I want to show you how to work around it. Under <Exceptions>, the <Options> element can take a very neat keyword, FullDumpOnFirstChance. The documentation implies that you'll get a full-memory minidump each time your code throws any native SEH exception. Because .NET's exceptions are implemented internally with SEH, getting a minidump on each throw is a wonderful way of seeing what's happening in your application when running under testing scenarios. Listing 6-3 shows the configuration file that looks as if it would work.

Listing 6-3. Incorrect ADPlus configuration for a minidump for each exception


<ADPlus>
    <Settings>
        <!-- Only CRASH mode supports attaching and looking at exceptions. -->
        <RunMode>CRASH</RunMode>
        <!-- Be quiet and don't show any message boxes. -->
        <Option>Quiet</Option>
        <!-- Exception options. -->
    </Settings>
    <Exceptions>
        <!-- *Doesn't work!* *Doesn't work!* *Doesn't work!* *Doesn't work!*-->
        <Option>FullDumpOnFirstChance</Option>
    </Exceptions>
</ADPlus>



Although Listing 6-3's code will produce a minidump for each type of SEH exception, it overwrites the particular exception's minidump file each time one is thrown. In other words, you get a minidump of only the last exception thrown instead of all previous exceptions.

Fortunately, it's not too hard to come up with a workaround. The ADPlus documentation discusses the <Config> element under the <Exceptions> element in which you can configure the action you want for a specific exception or all exceptions. In reality, the <Config> element is where you're specifying the sx operations, which I discussed in the "Exceptions and Events" section earlier in the chapter. The configuration file in Listing 6-4 shows how to get a unique dump on each SEH exception thrown in your application.

Listing 6-4. DNCRASH-DumpOnAllFirstChance.xml ADPlus Configuration File


<!-- Write a minidump on all SEH exceptions ADPlus configuration file. -->
<ADPlus>
    <Settings>
        <!-- Only CRASH mode supports attaching and looking at exceptions. -->
        <RunMode>CRASH</RunMode>
        <!-- Don't pop up any modal dialogs. -->
        <Option>Quiet</Option>
        <!-- Exception options. -->
    </Settings>
    <Exceptions>
        <!-- Configure all exceptions to write a new dump every time
             there's a first chance exception. -->
        <Config>
            <!-- Set the configuration for all exceptions. -->
            <Code>AllExceptions</Code>
            <!-- Write the dump on first chance exceptions.-->
            <Actions1>FullDump</Actions1>
            <!-- Note that you can't use the ReturnAction1 element because
                 it causes an error in ADPlus if you use AllExceptions.
                 The documentation does not make clear that the
                 ReturnAction1 applies only if you are setting specific
                 exceptions values. -->
        </Config>
    </Exceptions>
</ADPlus>



When it comes to .NET, what you really want to get are minidumps only when a specific .NET exception type is thrown. Having a production application writing a minidump on every exception is completely impractical. With SOS giving us the !soe command, we might have something we can use to write that minidump only on a nasty OutOfMemoryException. Toward the bottom of the "!StopOnException (!soe) and !PrintException (!pe) Commands" section earlier in this chapter, I discussed how the !soe command did its work by invoking the sxe command. If you didn't read that section carefully, I would encourage you to read it because that gives us a hint on how to get a minidump only when a specific .NET exception is thrown.

Listing 6-5 shows an ADPlus configuration file that writes a minidump only if there's been an OutOfMemoryException. OutOfMemoryException bugs are extremely hard to track down, but you can now use ADPlus to get the debugger attached, and when the horrible occurs, you'll have the exact state of the application at the instant the problem happened.

Listing 6-5. DNCRASH-DumpOnOutOfMemoryException.xml ADPlus Configuration File


<!-- Write a minidump only when an OutOfMemoryException occurs ADPlus
     configuration file. -->
<ADPlus>
    <Settings>
        <!-- Only CRASH mode supports attaching and looking at exceptions. -->
        <RunMode>CRASH</RunMode>
        <!-- Don't pop up any modal dialogs. -->
        <Option>Quiet</Option>
        <!-- Exception options. -->
    </Settings>
    <Exceptions>
        <!-- Default to not doing any dumps on first chance exceptions. -->
        <Option> NoDumpOnFirstChance </Option>
        <Config>
            <!-- For all exceptions, turn off the stack walking for first
                 chance exceptions. In production environments, you don't
                 want to pay the performance hit for initial symbol loading
                 and stack walking. -->
            <Code> AllExceptions </Code>
            <!-- At least log the message to the log file.-->
            <Actions1> Log </Actions1>
            <!-- If we're falling over on an unhandled exception, log it
                 and write a minidump.-->
            <Actions2> Log; MiniDump; </Actions2>
            <!-- For first chance exceptions, say the debugger didn't
                 handle it so the normal unwinding code gets it. -->
            <ReturnAction1> GN; </ReturnAction1>
            <!-- For unhandled exceptions just quit. -->
            <ReturnAction2> Q; </ReturnAction2>
        </Config>
        <Config>
            <!-- Set the configuration for CLR first chance exceptions. -->
            <Code> clr </Code>
            <!-- Turn off all the defaults from ADPlus.-->
            <Actions1> Void </Actions1>
            <!-- Execute the cool command to do the dump on the specific
                 exception. -->
            <!-- Here's how to read the command:
                 .loadby sos mscorwks
                 // Load SOS based on the MScorwks.dll path.
                 !stoponexception System.OutOfMemoryException 3
                 // Tell SOS to set pseudo register 3 to 1 if the exception thrown is
                 // a System.OutOfMemoryException.
                 .if(@$t3==1){...}
                 // Using the debugger command program, execute the expression in the
                 // curly braces if pseudo register 3 is 1.
                 .dump /ma /u c:\\x\\y\\foo.dmp
                 // Write out a minidump. There's no way to get the full path to
                 // where ADPlus is writing out the rest of the dumps.
                 // Note that the command program code has a bug in it in which it
                 // doesn't properly handle single \ characters. You probably don't
                 // want spaces in the output directories either.
            -->
            <CustomActions1>
                .loadby sos mscorwks;
                !stoponexception System.OutOfMemoryException 3;
                .if @@(@$t3==1){.dump /ma /u C:\\DumpDirectory\\OOM.dmp}
            </CustomActions1>
            <!-- After taking the dump, let the application have it. -->
            <ReturnAction1> GN </ReturnAction1>
        </Config>
    </Exceptions>
</ADPlus>



There are two interesting points about the configuration file in Listing 6-5. The first is that the ADPlus default is to write a call stack out each time you have a first-chance exception. Because the idea of my configuration file is to run on production environments, having the symbol loading and stack walking occurring every time causes a performance hit that you don't need. Consequently, I turn off the stack walking but leave the logging on, which writes to the output log that a specific exception happened.

The second item is how I go about determining the exact CLR exception that occurred. As I mentioned, I take advantage of the !soe command to write a 1 into a specific pseudo register if the exception type matches. If that's the case, I use the cool debugger command language to execute a conditional command that writes the dump on that exception. Unfortunately, there's no way in your custom commands sections in ADPlus to get the current output directory name, so I have to manually declare exactly where the particular dump file will be written. With the /u switch to .dump, I can ensure that the name is unique so as not to overwrite files. In addition, as you can see from the comments in the configuration file itself, you have to use double backslashes (\\) for each path separator because the debugger engine parses the string wrong.
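Because the doubled backslashes and the pseudo register number are easy to get wrong by hand, you might generate the <CustomActions1> text instead. The Python sketch below reproduces the command string from Listing 6-5 for any exception type and dump path. The dump_on_clr_exception helper is hypothetical, and the register range check assumes the debugger's user pseudo registers $t0 through $t19.

```python
def dump_on_clr_exception(exception_type, dump_path, register=3):
    """Build a <CustomActions1> command string in the style of Listing 6-5."""
    if " " in dump_path:
        # Spaces in the output directory are best avoided in the command program.
        raise ValueError("avoid spaces in the dump path")
    if not 0 <= register <= 19:
        raise ValueError("pseudo registers run from $t0 to $t19")
    # The debugger's command parser mishandles single backslashes,
    # so every \ in the path is doubled.
    escaped = dump_path.replace("\\", "\\\\")
    return (
        ".loadby sos mscorwks; "
        f"!stoponexception {exception_type} {register}; "
        f".if @@(@$t{register}==1){{.dump /ma /u {escaped}}}"
    )


# Same exception type and path as Listing 6-5.
print(dump_on_clr_exception("System.OutOfMemoryException",
                            r"C:\DumpDirectory\OOM.dmp"))
```

Paste the printed string between the <CustomActions1> tags; the path you pass in uses ordinary single backslashes, and the helper does the doubling.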

The ADPlus configuration file's exception-handling prowess is an outstanding addition to your debugging tool chest. I showed an example of writing a minidump whenever a specific .NET exception is thrown, but a good exercise would be to extend DNCRASH-DumpOnOutOfMemoryException.xml to write minidumps on several specific .NET exceptions. Take a look at the "!StopOnException (!soe) and !PrintException (!pe) Commands" section earlier in this chapter to get some ideas.

Before I move to ideas for handling breakpoints by using ADPlus configuration files, I do need to mention one last item concerning using ADPlus to configure the debugger in a production environment. If your architecture has you throwing thousands of exceptions in normal operation, you're going to have performance problems with the debugger attached to your application. When running under a native debugger, any SEH exception causes the operating system to suspend all the threads in the debuggee and report the exception into the debugger.

The heavy cross-process communication overhead can make your application unresponsive because of the volume of exceptions. Because many of the bugs I work on for clients occur only in production environments, if you can't run ADPlus because of the volume of exceptions being thrown, you've destroyed the last best hope of finding those problems. If you are throwing exceptions just because you can, you'll want to think long and hard about your architecture and fix it if you can. Finally, if you have an exception type called GoodReturnException in which you report success, trust me: you need to rearchitect your application immediately!

Crash Mode Breakpoints

Configuring a breakpoint is very much like configuring an exception, except that you use the <Breakpoints> element. A very common scenario in which you'd want to get a minidump at a particular breakpoint is if your ASP.NET worker process is mysteriously ending. That means at some point in its operation, something is calling the Windows ExitProcess API from Kernel32.dll to end the application. The configuration file in Listing 6-6 shows setting a breakpoint on ExitProcess and writing a minidump.

Listing 6-6. DNCRASH-BreakOnExitProcess.xml ADPlus Configuration File


<!-- Break on ExitProcess ADPlus configuration file. -->
<ADPlus>
    <Settings>
        <!-- Set the mode to CRASH. -->
        <RunMode>CRASH</RunMode>
        <!-- Do the work, don't tell me about it. -->
        <Option>Quiet</Option>
    </Settings>
    <Exceptions>
        <!-- Don't dump on first chance exceptions. -->
        <Option> NoDumpOnFirstChance </Option>
    </Exceptions>
    <Breakpoints>
        <NewBP>
            <!-- Set the breakpoint on ExitProcess. -->
            <Address> kernel32!ExitProcess </Address>
            <!-- A normal breakpoint. -->
            <Type> BP </Type>
            <!-- When hit, do a full memory minidump and walk the call
                 stacks. -->
            <Actions> FullDump; Stacks; </Actions>
            <!-- After doing the actions, continue on and let the
                 application end. -->
            <ReturnAction> G </ReturnAction>
        </NewBP>
    </Breakpoints>
</ADPlus>



The main interesting point in Listing 6-6 is that because ExitProcess is an exported function from Kernel32.dll, I can use the module!exported function syntax to specify the native address to set. Once you have the minidump created on the call to ExitProcess, you just need to open the minidump and look at the call stack of the current thread to see what led up to the call.

Snapping at the Right Time

As you've seen, getting the minidump written on a specific exception or when executing a native location isn't too difficult. However, it is a different story for other types of issues when you'd want to get the minidump, such as when memory usage grows past a certain point, or you suspect you're throwing an excessive number of exceptions. Fortunately, with all the performance counters available in Windows, there's data available for you to analyze and determine if this is the right time to create the minidump.

You could write your own performance counter monitor program that would spawn ADPlus when your specific condition was met, but the performance alerts built into Windows will do everything you need. You simply have to click a few buttons to tell the operating system what performance counters you want to monitor, the condition when the error occurs, and how to execute the program to write the dump. It's so simple, even a manager could handle it.

For example, if you suspect that there may be a kernel mode handle leak in your ASP.NET application, you could set up a performance alert to write a minidump whenever the handle count was greater than 5,000. To create this performance alert, from Control Panel\Administrative Tools, start the Computer Management console with administrator privileges. Under the System Tools, Performance Logs and Alerts tree control node, right-click Alerts, and from the shortcut menu, select New Alert Settings. After naming your alert in the New Alert Settings dialog box and clicking OK, the property page for the alert will open.

On the General tab of the alert property page, click Add, and the Add Counters dialog box will appear. Set the Performance Object to Process, select Handle Count in the counters list box, and in the instances list box, select your ASP.NET worker process. Clicking the Add button will add the performance counter but be sure to click the Close button to go back to the alert property page where you can set the Alert When Value Is combo box to Over and type 5000 in the Limit edit control. Figure 6-9 shows the filled-out General page.

Figure 6-9. Performance alert General property page


In the Sample Data Every section of the General property page, you'll want to pick an appropriate sampling interval for the performance alert. For the example I'm showing, a sampling value of once per hour is probably good. Since minidumps take several minutes to write out, you want to pick an interval that doesn't cause your performance alert to hang the machine by constantly writing out dumps of the target process. It's also a good idea to set the account to use in the Run As edit control. That way, if your batch file does more than just call ADPlus, you'll have the appropriate rights to network shares and other resources.

On the Action tab, select the Run This Program check box, and in the now-enabled edit control, type the program to run. It's best to use a batch file to start ADPlus from Performance Alerts so you don't have to mess with the long command-line options necessary for ADPlus inside the property page. Additionally, that gives you more processing options, such as restarting the worker process or anything else you would need to do to recover from the issue. In addition, you'll want to include the complete path to the batch file to ensure that it runs correctly. Figure 6-10 shows the completed Action tab.

Figure 6-10. Performance Alert Action tab


The default Performance Alert behavior is to run the scan continually. However, if you want to have more control over the starting and stopping, the Schedule tab lets you control the exact starting and stopping times. You can also choose to start and stop the scans manually from the Management Console.

After you've clicked OK to create your performance alert, you still have one more step to allow it to run correctly. Still staying in Computer Management, you need to go to the Services and Applications, Services node in the tree control. In the Services view, double-click Performance Logs and Alerts to bring up the property pages for the service. Click the Log On tab, and select the Allow Service To Interact With Desktop check box. If the user is not set to the Local System account, you may need to enable that first.

Performance alerts are a wonderful tool, but if you're going to use them on a production server, you'll want to carefully think through what will occur when the performance alert triggers. While you're trying to solve a nasty bug, you could cause very nasty problems for your server if you accidentally leave the sampling interval at 5 seconds, causing the server to end up in a minidump-writing frenzy. However, at least with performance alerts, you'll stand a fighting chance of getting that dump right when you need it.
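To see why the sampling interval matters so much, here's a bit of back-of-the-envelope arithmetic as a Python sketch. It assumes the alert fires on every sample while the counter stays over the limit and that each firing writes one dump; suspended_fraction is a hypothetical helper, and the five-minute dump time is just the low end of the range I mentioned for large ASP.NET applications.

```python
def suspended_fraction(dump_seconds, interval_seconds):
    """Worst-case fraction of each sampling interval spent writing a dump."""
    if dump_seconds < 0 or interval_seconds <= 0:
        raise ValueError("dump time must be >= 0 and interval must be > 0")
    return dump_seconds / interval_seconds


# Compare an accidental 5-second interval with a once-per-hour interval,
# assuming a dump takes 300 seconds (five minutes) to write.
for label, interval in (("5 seconds", 5), ("1 hour", 3600)):
    fraction = suspended_fraction(dump_seconds=300, interval_seconds=interval)
    verdict = "dump frenzy" if fraction >= 1 else "tolerable"
    print(f"{label}: {fraction:.0%} of each interval spent dumping ({verdict})")
```

Any result at or above 100 percent means a new dump is requested before the previous one finishes, which is exactly the frenzy you want to avoid.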




Debugging Microsoft .NET 2.0 Applications
ISBN: 0735622027
EAN: 2147483647
Year: 2006
Pages: 99
Authors: John Robbins
