DeadlockDetection Requirements


As you might have noticed in the preceding tips and tricks section, I didn't provide any suggestions about what to do when an unexpected deadlock paralyzes your code. The recommendations there were preventive measures you can take to avoid deadlocks in the first place rather than prescriptions for fixing them when they do occur. In this section, you'll see that solving deadlocks isn't easy with just the debugger and that you almost always need some additional help. A utility that can come to your rescue when you need that extra assistance is DeadlockDetection.

Debugging War Story
The Deadlock Makes No Sense

The Battle

A team (that I wasn't a member of) was developing an application and ran into a nasty deadlock that made no sense. After struggling with the deadlock for a couple of days—an ordeal that brought development to a standstill—the team asked me to come help them figure out the bug.

The product they were working on had an interesting architecture and was heavily multithreaded. The deadlock they were running into occurred only at a certain time, and it always happened in the middle of a series of dynamic-link library (DLL) loads. The program deadlocked when WaitForSingleObject was called to check whether a thread was able to create some shared objects.

The team was good and had already double-checked and triple-checked their code for potential deadlocks—but they remained completely stumped. I asked if they had walked through the code to check for deadlocks, and they assured me that they had.

The Outcome

I remember this situation fondly because it was one of the few times that I've gotten to look like a hero within 5 minutes of starting the debugger. Once the team duplicated the deadlock, I took a quick look at the Call Stack window and noticed that the program was waiting on a thread handle inside DllMain. As part of their architecture, when a certain DLL loads, that DLL's DllMain starts another thread and then immediately calls WaitForSingleObject on an acknowledge event object to ensure that the spawned thread was able to properly initialize some important shared objects before continuing with the rest of the DllMain processing.

What the team didn't know is that each process has something called a "process critical section" that the operating system uses to synchronize various actions that happen behind the scenes in a process. One situation in which the process critical section is used is to serialize the execution of DllMain for the four cases in which DllMain is called: DLL_PROCESS_ATTACH, DLL_THREAD_ATTACH, DLL_THREAD_DETACH, and DLL_PROCESS_DETACH. The second parameter to DllMain indicates the reason the call to DllMain occurred.

In the team's application, the call to LoadLibrary caused the operating system to grab the process critical section so that the operating system could call the DLL's DllMain for the DLL_PROCESS_ATTACH case. The DLL's DllMain function then spawned a second thread. Whenever a process spawns a new thread, the operating system grabs the process critical section so that it can call the DllMain function of each loaded DLL for the DLL_THREAD_ATTACH case. In this particular program, the second thread blocked because the first thread was holding the process critical section. Unfortunately, the first thread then called WaitForSingleObject to ensure that the second thread was able to properly initialize some shared objects. Because the second thread was blocked on the process critical section, held by the first thread, and the first thread blocked while waiting on the second thread, the result was the usual deadlock.
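The sequence above can be modeled in portable C++. This is only a sketch under stated assumptions: an ordinary std::mutex stands in for the process critical section, a condition variable stands in for the acknowledge event, and the names LoaderModel, SpawnedThread, and SimulatedProcessAttach are hypothetical. It is not the team's code and it does not call the Windows loader. A finite timeout replaces the unbounded wait so that the model terminates instead of hanging.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical model state; none of these names come from Windows.
struct LoaderModel {
    std::mutex processLock;           // stand-in for the process critical section
    std::mutex ackMutex;              // protects ackSignaled
    std::condition_variable ackCv;    // stand-in for the acknowledge event
    bool ackSignaled = false;
};

// Models the spawned thread. Before its own code runs, the DLL_THREAD_ATTACH
// notification must be delivered, and that requires the process lock.
void SpawnedThread(LoaderModel& m)
{
    {
        std::lock_guard<std::mutex> attach(m.processLock); // blocks while "DllMain" runs
    }
    {
        std::lock_guard<std::mutex> lk(m.ackMutex);
        m.ackSignaled = true;         // shared objects are ready
    }
    m.ackCv.notify_one();
}

// Models DllMain for DLL_PROCESS_ATTACH. Returns true only if the
// acknowledgement arrived while the process lock was still held.
bool SimulatedProcessAttach(std::chrono::milliseconds timeout)
{
    LoaderModel m;
    std::thread worker;
    bool acked = false;
    {
        std::lock_guard<std::mutex> loader(m.processLock); // taken on behalf of "LoadLibrary"
        worker = std::thread(SpawnedThread, std::ref(m));
        std::unique_lock<std::mutex> lk(m.ackMutex);
        // An unbounded wait here would never return; the timeout ends the model.
        acked = m.ackCv.wait_for(lk, timeout, [&m] { return m.ackSignaled; });
    } // the process lock is released only after "DllMain" returns
    worker.join();
    return acked;
}
```

Calling SimulatedProcessAttach(std::chrono::milliseconds(200)) returns false: the worker cannot get past the process lock until the simulated DllMain has returned, so the wait can only time out.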

The Lesson

The obvious lesson is to avoid doing any Wait* calls inside DllMain. However, the issues with the process critical section extend beyond the Wait* functions. The operating system acquires the process critical section behind your back in CreateProcess, GetModuleFileName, GetProcAddress, LoadLibrary, and FreeLibrary, so you shouldn't call any of these functions in DllMain. Because the operating system holds the process critical section for the duration of every DllMain call, only one thread at a time is ever executing a DllMain.
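One way to apply the lesson is to move the wait out of DllMain and into a separate initialization call that the host makes after LoadLibrary has returned. The text does not prescribe this particular fix, so treat the following as a sketch of one possible arrangement. As before, a std::mutex stands in for the process critical section, and PendingInit, SimulatedProcessAttach, and FinishInit are hypothetical names.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>
#include <utility>

std::mutex g_processLock;  // stand-in for the process critical section

// Hypothetical handle to initialization that was started but not yet awaited.
struct PendingInit {
    std::thread worker;
    std::future<void> ready;
};

// Models DllMain for DLL_PROCESS_ATTACH: start the worker and return at once.
// Nothing waits while the process lock is held.
PendingInit SimulatedProcessAttach()
{
    std::lock_guard<std::mutex> loader(g_processLock);
    std::promise<void> done;
    PendingInit pending;
    pending.ready = done.get_future();
    pending.worker = std::thread([done = std::move(done)]() mutable {
        {
            // DLL_THREAD_ATTACH needs the process lock before the thread body runs.
            std::lock_guard<std::mutex> attach(g_processLock);
        }
        done.set_value();  // shared objects are ready
    });
    return pending;
}

// Called by the host after "LoadLibrary" has returned, so the process lock
// is free and the worker can make progress while this thread waits.
bool FinishInit(PendingInit& pending, std::chrono::milliseconds timeout)
{
    bool ok = pending.ready.wait_for(timeout) == std::future_status::ready;
    pending.worker.join();
    return ok;
}
```

Because the wait happens in FinishInit rather than under the process lock, the worker gets through its thread-attach step and the acknowledgement arrives.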

As you can see, even experienced developers can get bitten by multithreaded bugs—and as I mentioned earlier, this kind of bug is often in the place you least expect it.

Here's the list of basic requirements I worked with when I developed DeadlockDetection:

  1. Show exactly where the deadlock happens in the user's code. A tool that tells only that EnterCriticalSection is blocked doesn't help much. To be really effective, the tool needs to let you get back to the address, and consequently the source file and line number, where the deadlock occurred so that you can fix it quickly.
  2. Show what synchronization object caused the deadlock.
  3. Show what Windows function is blocked and the parameters passed to the function. It helps to see timeout values and the values passed to the function.
  4. Determine which thread caused the deadlock.
  5. The utility must be lightweight so that it interferes with the user's program as little as possible.
  6. The information output processing must be extensible. The information collected in a deadlock detection system can be processed in many ways, and the utility needs to allow others, not just you, to extend the information as they see fit.
  7. The tool must integrate easily with the user's programs.
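To make the requirements concrete, here is a sketch of the kind of record such a tool might capture for each synchronization call, together with a pluggable output path. It is not the actual DeadlockDetection interface: SyncCallRecord, OutputSink, EventLogger, and FormatRecord are hypothetical names, and the field layout is an assumption chosen to line up with requirements 1 through 4 and 6.

```cpp
#include <cstdint>
#include <functional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One intercepted synchronization call (hypothetical layout).
struct SyncCallRecord {
    std::uintptr_t returnAddress;            // req. 1: where in the user's code
    std::uintptr_t syncObject;               // req. 2: which object was involved
    std::string functionName;                // req. 3: which function is blocked
    std::vector<std::uintptr_t> parameters;  // req. 3: raw parameter values
    std::uint32_t threadId;                  // req. 4: which thread made the call
};

// Req. 6: output is a callback, so others can add their own processing.
using OutputSink = std::function<void(const SyncCallRecord&)>;

class EventLogger {
public:
    void AddSink(OutputSink sink) { sinks_.push_back(std::move(sink)); }

    // Hands the record to every registered sink, in registration order.
    void Report(const SyncCallRecord& rec) const
    {
        for (const auto& sink : sinks_) sink(rec);
    }

private:
    std::vector<OutputSink> sinks_;
};

// A default text rendering; sinks are free to ignore it.
std::string FormatRecord(const SyncCallRecord& rec)
{
    std::ostringstream out;
    out << std::hex << std::showbase;
    out << "thread " << rec.threadId << " called " << rec.functionName
        << " from " << rec.returnAddress << " on object " << rec.syncObject
        << " params [";
    for (std::size_t i = 0; i < rec.parameters.size(); ++i) {
        if (i != 0) out << ", ";
        out << rec.parameters[i];
    }
    out << "]";
    return out.str();
}
```

A sink can write to a file, forward records to another process, or collect them in memory. Keeping the record a plain struct and the sinks simple callbacks also serves requirement 5, because the capture path does almost no work of its own.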

One of the key points to keep in mind with a utility such as DeadlockDetection is that it definitely affects the behavior of the application it's observing. Once again, it's the Heisenberg uncertainty principle in action. DeadlockDetection can produce deadlocks in your programs you might not otherwise see because the work it does to gather information slows down your threads. I almost defined this behavior as a feature because any time you can cause a deadlock in your code, you've identified a bug, which is the first step toward correcting it—and as always, it's better for you to find the bugs than for your customers to find them.



Debugging Applications
Debugging Applications for Microsoft® .NET and Microsoft Windows® (Pro-Developer)
ISBN: 0735615365
EAN: 2147483647
Year: 2000
Pages: 122
Authors: John Robbins
