The Debugging Process | Debugging Microsoft .NET 2.0 Applications

Finally, let's start talking about hands-on debugging by discussing the debugging process. Determining a process that works for all bugs, even "freak" bugs (bugs that come out of the blue and don't make any sense), was a bit challenging. But by drawing on my own experiences and by talking to my colleagues and clients about their experiences, I eventually came up with a debugging approach that all great developers intuitively follow but that less experienced (or just less-skilled) developers often don't find obvious.

As you'll see, this debugging process doesn't take a rocket scientist to implement. The hard part is ensuring that you start with this process every time you debug. Here are the nine steps involved in the debugging approach that I recommend:

Step 1: Duplicate the bug.
Step 2: Describe the bug.
Step 3: Always assume that the bug is yours.
Step 4: Divide and conquer.
Step 5: Think creatively.
Step 6: Utilize tools.
Step 7: Start heavy debugging.
Step 8: Verify that the bug is fixed.
Step 9: Learn and share.

Depending on your bug, you can skip some steps entirely because the problem and the location of the problem are entirely obvious. You must always start with Step 1 and get through Step 2. At any point between Steps 3 and 7, however, you might figure out the solution and be able to fix the bug. In those cases, after you fix the bug, skip to Step 8 to verify and test the fix. Figure 1-1 illustrates the steps of the debugging process.

Figure 1-1. The debugging process. (The figure's "Leverage tools" label should read "Utilize tools.")

Step 1: Duplicate the Bug

The most critical step in the debugging process is the first one: duplicating the bug. This is sometimes difficult, or even impossible, but if you can't duplicate a bug, you probably can't eliminate it. When trying to duplicate a bug, you might need to go to extremes. I had one bug in my code that I couldn't duplicate just by running the program. I had an idea of the data conditions that might cause it, however, so I ran the program under the debugger and entered the data I needed to duplicate the bug directly into memory. It worked. If you're dealing with a synchronization problem, you might need to take steps such as loading the same tasks so that you can duplicate the state in which the bug occurred.

At this point you're probably thinking, "Well, duh! Of course the first thing you do is duplicate the bug. If I could duplicate it all the time, I wouldn't need your book!" It all depends on your definition of "duplicatability." My definition is duplicating the bug on a single machine once in a 24-hour period. That's sufficient for my company to come in to work on it. Why? Simple. If you can get it on one machine, you can throw 30 machines at it and get the bug duplicated 30 times. The big mistake people make with duplicating the bug is to not get as many machines as possible into the mix. If you have 30 people to manually punch keys for you, that's great. However, a valuable effort would be to automate the user interface to drive the bug out into the open. You can use either the excellent testing tools that are part of Visual Studio 2005 or another commercial product if you need user interface automation.
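To make the idea of automating the hunt concrete, here is a minimal sketch in Python, chosen only because it is short; it is not the Visual Studio 2005 testing tools mentioned above. The flaky_operation function is a hypothetical stand-in for whatever scripted steps sometimes trigger your bug, and the simulated 1-in-20 failure rate is purely an assumption for illustration.

```python
import random


def flaky_operation(rng):
    """Hypothetical stand-in for the scripted steps that sometimes hit the bug."""
    # Assumption for illustration: the bug shows up on roughly 1 run in 20.
    if rng.random() < 0.05:
        raise RuntimeError("simulated intermittent failure")


def hunt_for_bug(attempts, seed=0):
    """Run the scripted steps repeatedly and record which attempts failed."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    failures = []
    for attempt in range(attempts):
        try:
            flaky_operation(rng)
        except RuntimeError as exc:
            failures.append((attempt, str(exc)))
    return failures


failures = hunt_for_bug(attempts=1000)
print(f"{len(failures)} failures in 1000 attempts")
```

The point of the sketch is the shape of the loop: drive the same steps many times, capture every failure with enough context to replay it, and let the machine do the key punching.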

Once you've duplicated the bug by using one general set of steps, you should determine whether you can duplicate the bug through a different set of steps. You can get to some bugs via one code path only, but you can get to other bugs through multiple paths. The idea is to try to see the behavior from all possible angles. By duplicating the bug from multiple paths, you have a much better sense of the data and boundary conditions that are causing the problems. Additionally, as we all know, some bugs can mask other bugs. The more ways you can find to duplicate a bug, the better off you'll be.

Even if you can't duplicate the bug, you should still log it into your bug tracking system. If I have a bug that I can't duplicate, I always log it into the system anyway, but I leave a note that says that I couldn't duplicate it. That way, if another engineer is responsible for that section of the code, that engineer at least has an idea that something is amiss. When logging a bug that you can't duplicate, you need to be as descriptive as possible. If the description is good enough, it might be sufficient for you or another engineer to solve the problem eventually. A good description is especially important because you can correlate various non-reproducible bug reports, enabling you to start seeing patterns in the bug's behavior.
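As a rough illustration of correlating non-reproducible reports, the following Python sketch counts which details recur across a handful of reports. Every report, field name, and value here is invented for the example; a real bug-tracking system would have its own schema and query tools.

```python
from collections import Counter

# Hypothetical non-reproducible bug reports; all fields and values are invented.
reports = [
    {"id": 101, "os": "Windows XP SP2", "action": "save", "locale": "en-US"},
    {"id": 117, "os": "Windows 2000", "action": "save", "locale": "de-DE"},
    {"id": 130, "os": "Windows XP SP2", "action": "save", "locale": "de-DE"},
    {"id": 142, "os": "Windows XP SP2", "action": "print", "locale": "de-DE"},
]


def common_factors(reports, fields):
    """For each field, return the most frequent value and how many reports share it."""
    summary = {}
    for field in fields:
        counts = Counter(report[field] for report in reports)
        value, count = counts.most_common(1)[0]
        summary[field] = (value, count)
    return summary


summary = common_factors(reports, ["os", "action", "locale"])
for field, (value, count) in summary.items():
    print(f"{field}: {value!r} appears in {count} of {len(reports)} reports")
```

A tally like this proves nothing by itself, but it only works at all if each report was descriptive enough to record those details in the first place.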

The last thing I want to discuss about duplicating the bug is that it's very rare that you get the bug duplicated on your development machine. That's because you have different tools and processes on the development machine than on the running machine, or worse yet, you're logged in with Administrator rights. With both Microsoft and VMware giving away their virtualization environments, there are no excuses for you to not have a virtualized version of the runtime environment on your machine set up with remote debugging. You should work at duplicating your bugs in the virtual machine so you have a better chance of finding the exact pattern that works.

Step 2: Describe the Bug

If you were a typical engineering student in college, you probably concentrated on your math and engineering classes and barely passed your writing classes. In the real world, your writing skills are almost more important than your engineering skills because you need to be able to describe your bugs, both verbally and in writing. When faced with a tough bug, you should always stop right after you duplicate it and describe it. Ideally, you do this in your bug-tracking system, even if it's your responsibility to debug the bug, but talking it out is also useful. The main reason for describing the bug is that doing so often helps you fix it. I can't remember how many times another engineer's description helped me look at a bug in a different way.

Step 3: Always Assume That the Bug Is Yours

In all the years that I've been in software development, only a minuscule percentage of the bugs I've seen were the result of the compiler or the operating environment. If you have a bug, the odds are excellent that it's your fault, and you should always assume and hope that it is. If the bug is in your code, at least you can fix it; if it's in your compiler or the operating environment, you have bigger problems. You should eliminate any possibility that the bug is in your code before spending time looking for it elsewhere.

To reinforce the need to assume that the bug is yours, we've found a very interesting statistic at Wintellect. Over the last six years, we've worked on thousands of bugs for our clients. Of all of those bugs, only four were not in our clients' code. I can't stress this enough: always assume it's your bug, because it is.

Step 4: Divide and Conquer

If you've duplicated your bug and described it well, you have started a hypothesis about the bug and have an idea of where it's hiding. In this step, you start firming and testing your hypothesis. The important thing to remember here is the paraphrased line from the movie Star Wars: "Use the source, Luke!" Read the source code, and desk-check what you think is happening with what the code really does. Reading the code will force you to take the extra time to look at the problem. Starting with the state of the machine at the time of the crash or problem, work through the various scenarios that could cause you to get to that section of code. If your hypothesis of what went wrong doesn't pan out, stop for a moment and reassess the situation. You've learned a little more about the bug, so now you can reevaluate your hypothesis and try again.

Debugging is like a binary search algorithm. You're trying to find the bug, and with each iteration through your different hypotheses, you are, hopefully, eliminating the sections of the programs where the bug is not. As you continue to look, you eliminate more and more of the program until you can box the bug into a section of code. As you continue to develop your hypothesis and learn more about the bug, you can update your bug description to reflect the new information. When I'm in this step, I generally try out three to five solid hypotheses before moving on to the next step. If you think you're getting close, you can do a little light debugging in this step to do final verification of the hypothesis. By light, I mean double-checking states and variable values, not slogging through looking at everything.
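The binary search analogy can be sketched literally. The Python below is a minimal illustration, not a tool from this book: it assumes you have an ordered series of candidates (here, hypothetical build numbers) in which every good one precedes every bad one, and a check that tells you whether a given candidate shows the bug.

```python
def first_bad(candidates, is_bad):
    """Return (index of the first bad candidate, number of checks made).

    Assumes the candidates are ordered so that every good one precedes every
    bad one, and that the last candidate is already known to be bad.
    """
    low, high = 0, len(candidates) - 1
    checks = 0
    while low < high:
        mid = (low + high) // 2
        checks += 1
        if is_bad(candidates[mid]):
            high = mid       # the bug is at mid or earlier
        else:
            low = mid + 1    # everything up to mid is ruled out
    return low, checks


builds = list(range(1, 101))  # hypothetical build numbers 1..100


def is_bad(build):
    # Assumption for illustration: the bug was introduced in build 73.
    return build >= 73


index, checks = first_bad(builds, is_bad)
print(f"first bad build: {builds[index]} (found in {checks} checks)")
```

Each check is one hypothesis tested against evidence, and each answer removes about half of the remaining territory, which is why a hundred candidates need only a handful of checks.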

Everything you do in this step is strictly evidence-based. Don't go down a path because it feels right. I'm not saying to ignore your hunches, but back up that hunch with actual data that helps confirm or deny that hunch. I've seen many developers, myself included, waste too much time on a wild goose chase with no solid evidence. Errors in software are a cause-and-effect situation, so keep your eyes on finding that cause.

Step 5: Think Creatively

If the bug you're trying to eliminate is one of those nasty ones that happens only on certain machines or is hard to duplicate, start looking at the bug from different perspectives. This is the step in which you should start thinking about version mismatches, operating system differences, problems with your program's binaries or its installation, and other external factors.

When I'm teaching a training class on debugging, I ask how many people have been working on a bug all day long and solve it on the drive home. Nearly all the hands in the room go up. What's even wilder is when I ask how many of us have woken up in the middle of the night with the exact solution for a bug. While it's a sign of our sleep deprivation, at least 40 percent of the hands always go up. It's all about seeing the forest for the trees. Your subconscious knows exactly where to look, so set it free on the bug.

My big hint here is the "two-hour rule." After working on a bug for two hours, you have to do an honest assessment: are you making progress or not? If you're not, just stop working on the particular problem because you're wasting your time. Of course, if this is the one bug holding up shipment, telling your boss that you've worked the two hours and are going to work on something else might be a career-limiting move.

When you are at that two-hour wall and need to get the creative juices going, the secret is "Bug Talk," which, at several companies I've worked at, is the highest-priority interrupt possible. That means that you are totally stumped and need to talk the bug over with someone. The idea is that you can walk into a person's office and present the problem on a whiteboard. I don't know how many times I've walked into someone's office, uncapped the marker, literally just touched the marker on the board, and solved my problem without even saying a word. Just getting your mind prepared to present the problem helps you get past the individual tree you're staring at and lets you see the whole forest. When you choose a person to do a Bug Talk with, you should not pick someone you're working very closely with on the same section of the project. That way, you can increase the likelihood that your Bug Talk partner isn't making the same assumptions you are about the problem.

What's interesting is that the "someone" doesn't even have to be a human. My cats, as it turns out, are excellent debuggers, and they have helped me solve a number of really nasty bugs. After rounding them up, I draw the problem out on my whiteboard and let them work their magic. Of course, one time I was doing this when I hadn't taken a shower and was wearing nothing but shorts, and that was a little difficult to explain to the United Parcel Service (UPS) delivery guy standing at my door.

The one person you should always avoid doing Bug Talks with is your spouse or significant other. For some reason, the fact that you're having a relationship with that person means that there's a built-in problem. Of course, you've probably already seen this when you try to describe that bug and the person's eyes glaze over, and he or she nearly passes out.

Step 6: Utilize Tools

I've never understood why some companies let their engineers spend weeks searching for a bug when spending a thousand dollars for error detection, performance, and code-coverage tools would help them find the current bug (and bugs they will encounter in the future) in minutes.

I've run into several clients now that provide only the Visual Studio 2005 Professional Edition for their developers. Say what you want about the Developer Division's crazy marketing, but there's nothing "professional" about that version. It lacks the real tools you need to do your job. The Team Editions of Visual Studio, specifically Visual Studio 2005 Team Edition for Software Developers or the ultimate, Visual Studio 2005 Team Suite (which includes all the flavors of Visual Studio Team Editions), are the minimum that developers need to do their job.

I realize that what I'm advocating is not cheap. At the time I wrote this, a brand new purchase of Visual Studio 2005 Team Edition for Software Developers cost $5,469.00 USD when paired with an MSDN Subscription. The Visual Studio 2005 Team Suite was an eye-watering and knee-buckling $10,939.00 USD. Personally, I think Microsoft is absolutely insane to overcharge their developer base like this, but it's the cost of making real things happen on the .NET platform. It's not much consolation, but if you are buying multiple copies or upgrading, you can force the price down considerably.

You need at least Visual Studio 2005 Team Edition for Software Developers because of the extra tools. The static analysis portion, which used to be called FxCop, looks over your code to ensure that you're following the rules promulgated by the outstanding book Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries by Krzysztof Cwalina and Brad Abrams (Addison-Wesley, 2004). The code analysis tool will help ensure that everyone is following a consistent usage for all public-facing interfaces and classes.

Also included in the Visual Studio 2005 Team Edition for Software Developers are the performance and coverage tools. As you can guess, the performance tool is an excellent profiler that helps you find the bottlenecks in your code. As you'll see throughout this book, code coverage is a major passion of mine when it comes to debugging. The information that you're looking for from a code coverage tool is which lines you haven't executed. Any time you have more than 15 percent of your code not executed, you're going to experience extreme pain when testers get hold of your code.
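The 15 percent rule of thumb is simple arithmetic, sketched here in Python. The module names and line counts are hypothetical, and a real coverage tool would report these numbers for you; the sketch only shows how the threshold from the text would be applied.

```python
UNEXECUTED_LIMIT = 15.0  # percent of unexecuted code, the threshold named in the text


def unexecuted_percent(total_lines, executed_lines):
    """Return the percentage of lines that were never executed."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return 100.0 * (total_lines - executed_lines) / total_lines


# Hypothetical modules: name -> (total executable lines, lines actually executed).
modules = {
    "Parser": (1200, 1100),
    "Billing": (800, 560),
}

at_risk = []
for name, (total, executed) in modules.items():
    percent = unexecuted_percent(total, executed)
    if percent > UNEXECUTED_LIMIT:
        at_risk.append(name)
    print(f"{name}: {percent:.1f}% of lines never executed")
```

In this made-up data, only the Billing module lands over the limit, so that is where the untested lines deserve attention first.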

If you're at a startup company where money is tight, or you just have a cheap boss, not having Visual Studio 2005 Team Suite doesn't mean the end of your .NET development days. With a combination of open source and commercial tools, some of which are much better than their counterparts in Visual Studio, you can get some great debugging, testing, and tuning done.

For unit testing, NUnit (www.nunit.org) combined with the TestDriven.NET add-in (www.testdriven.net) is an excellent replacement for the unit testing tools in Visual Studio Team Developer Edition. FxCop (www.gotdotnet.com/team/fxcop/) is the standalone version of the integrated Code Analysis tools and offers a few extra checks, such as checking spelling using the Microsoft Office spelling engine. For profiling, Red Gate's ANTS Profiler and SciTech Software's .NET Memory Profiler (www.memprofiler.com), along with the free CLRProfiler (www.microsoft.com/downloads/) from Microsoft, will handle all your profiling and tuning needs. You also have complete suites of .NET developer tools from Rational-IBM and Compuware.

Step 7: Start Heavy Debugging

I differentiate heavy debugging from the light debugging I mentioned in Step 4 by what you're doing in the debugger. When you're doing light debugging, you're just looking at a few states and a couple of variables. In contrast, when you're doing heavy debugging, you're spending a good deal of time exploring your program's operation. It is during the heavy debugging stage that you want to use the debugger's advanced features. Your goal is to let the debugger do as much of the heavy lifting as possible. Chapter 5 discusses the Visual Studio debuggers' advanced features.

Just as when you're doing light debugging, when you're doing heavy debugging, you should have an idea of where you think your bug is before you start using the debugger and then use the debugger to prove or disprove your hypothesis. Never sit in the debugger and just poke around. In fact, I strongly encourage you to actually write out your hypothesis before you ever fire up the debugger. That will help you keep completely focused on exactly what you're trying to accomplish.

Also, when you're doing heavy debugging, remember to regularly review changes you made to fix the bug in the debugger. I like to have two machines set up side-by-side at this stage. That way I can work at fixing the bug on one machine and use the other machine to run the same code with normal condition cases. The idea is to always double-check and triple-check any changes so that you're not destabilizing the normal operation of your product. Here's some career advice: your boss really dislikes it when you check in code to fix a bug and your product handles only weird boundary conditions and no longer handles the normal operation case.

If you set up your project correctly and follow the debugging steps in this chapter and the recommendations in Chapter 2, you hopefully won't have to spend much time doing heavy debugging.

Step 8: Verify That the Bug Is Fixed

When you think that you've finally fixed the bug, the next step in the debugging process is to test, test, and retest the fix. Did I also mention that you need to test the fix? If the bug is in an isolated module on a line of code called once, testing the fix is easy. However, if the fix is in a core module, especially one that handles your data structures and the like, you need to be very careful that your fix doesn't cause problems or have side effects in other parts of the project.

When testing your fix, especially in critical code, you should verify that it works with all data conditions, good and bad. Nothing is worse than a fix for one bug that causes two other bugs. If you do make a change in a critical module, you should let the rest of the team know that you made the change. That way, they can also be on the lookout for any ripple effects.
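One way to picture testing a fix against good and bad data is a small table of cases, sketched below in Python. The parse_quantity function is hypothetical; imagine it is the routine that was just fixed so that blank input no longer crashes it. The specific cases are assumptions chosen for illustration, not an exhaustive list.

```python
def parse_quantity(text):
    """Hypothetical function that was just fixed: blank input used to crash it."""
    stripped = text.strip()
    if not stripped:
        raise ValueError("quantity is required")
    value = int(stripped)  # raises ValueError for non-integer text
    if value < 0:
        raise ValueError("quantity cannot be negative")
    return value


# Good data: input paired with the expected result.
GOOD_CASES = [("1", 1), (" 42 ", 42), ("0", 0)]
# Bad data: every one of these must be rejected cleanly with ValueError.
BAD_CASES = ["", "   ", "-5", "abc", "1.5"]


def run_checks():
    """Return one True/False result per case, good cases first."""
    results = []
    for text, expected in GOOD_CASES:
        results.append(parse_quantity(text) == expected)
    for text in BAD_CASES:
        try:
            parse_quantity(text)
            results.append(False)  # bad input was wrongly accepted
        except ValueError:
            results.append(True)
    return results


results = run_checks()
print(f"{sum(results)} of {len(results)} checks passed")
```

Keeping the normal cases in the same table as the weird boundary cases is the cheap insurance against a fix that handles the bug but breaks normal operation.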

The only way I've found to know that you have the right level of testing is to ensure that you've executed the code. With the excellent code coverage tools built into Visual Studio Team Developer and Team Tester Editions, you have no excuse for not showing that you've thoroughly tested the changed code. Code coverage is not the be-all and end-all of testing, but it's an excellent benchmark to ensure that you're getting your job done.

Debugging War Story: Where did the integration go?

The Battle

One of the developers I worked with at NuMega thought he'd found a great bug in NuMega's Visual C++ Integrated Development Environment (VC IDE) integration because it didn't work on his machine. For those of you who are unfamiliar with NuMega's VC IDE integration, let me provide a little background information. NuMega's software products integrate with the VC IDE, and have for a number of years. This integration allows NuMega's windows, toolbars, and menus to appear seamlessly inside the VC IDE.

The Outcome

This developer spent a couple of hours using SoftICE, a kernel debugger, exploring the bug. After a while, he had set breakpoints all over the operating system. Finally, he found his "bug." He noticed that when he started the VC IDE, CreateProcess was being called with the \\R2D2\VSCommon\MSDev98\Bin\MSDEV.EXE path instead of the C:\VSCommon\MSDev98\Bin\MSDEV.EXE path he thought it should be called with. In other words, instead of running the VC IDE from his local machine (C:\VSCommon\MSDev98\Bin\MSDEV.EXE), he was running it from his old machine (\\R2D2\VSCommon\MSDev98\Bin\MSDEV.EXE). How did this happen?

The developer had just gotten a new machine and had installed the full NuMega VC IDE integration for the products. To get it set up faster, he copied his desktop shortcuts (.lnk files) from his old machine, which was installed without VC IDE integration, to his new machine by dragging them with the mouse. When you drag shortcuts, the internal paths update to reflect the location of the original target. Since he was always starting the VC IDE from his desktop shortcut, which was pointing to his old machine, he'd been running the VC IDE from his old machine all along.

The Lesson

The developer went about debugging the problem in the wrong way by jumping right in with a kernel debugger instead of attempting to duplicate the problem in multiple ways. In Step 1 of the debugging process, "Duplicate the Bug," I recommended that you try to duplicate the bug in multiple ways so that you can be assured that you're looking at the right bug, not just multiple bugs masking and compounding one another. If this developer had followed Step 5, "Think Creatively," he would have been better off because he would have thought about the problem first instead of plunging right in.

As an alternative, the developer could have used Step 6, "Utilize Tools," and seen if there were any tools that could have helped. In this case, a quick run of Sysinternals' excellent FileMon (www.sysinternals.com/Utilities/Filemon.html), which watches all file accesses on your machine, would have shown that the file reads were coming off a network path instead of the local machine.
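The distinction that mattered in this war story, a network path versus a local one, is easy to check mechanically. The Python sketch below is not FileMon and watches nothing; it only classifies a path string, using the standard library's ntpath module, which parses Windows-style paths on any operating system. The helper name is_network_path is my own invention for the example.

```python
import ntpath


def is_network_path(path):
    r"""Return True when a Windows-style path begins with a UNC share (\\server\share)."""
    drive, _rest = ntpath.splitdrive(path)
    # For UNC paths, splitdrive returns the \\server\share prefix as the drive.
    return drive.replace("/", "\\").startswith("\\\\")


# The two paths from the war story.
for path in (r"C:\VSCommon\MSDev98\Bin\MSDEV.EXE",
             r"\\R2D2\VSCommon\MSDev98\Bin\MSDEV.EXE"):
    kind = "network" if is_network_path(path) else "local"
    print(f"{path} -> {kind}")
```

A ten-second check like this on the shortcut's target would have pointed at the old machine long before any kernel breakpoints were needed.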

Step 9: Learn and Share

Each time you fix a nasty bug, you should take the time to quickly summarize what you learned. I like to record my good bugs in a journal so that I can later see what I did right in finding and fixing the problem. More important, I also want to learn what I did wrong so that I can learn to avoid dead ends when debugging and solve bugs faster. You learn the most about development when you're debugging, so you should take every opportunity to learn from it.

One of the most important steps that you can take after fixing a good bug is to share with your colleagues the information you learned while fixing the bug, especially if the bug is project-specific. This information will help your coworkers the next time they need to eliminate a similar bug. For example, the SQL Team at Microsoft has set up an internal e-mail alias called "SQL Dev Debugging Discussion," and that's where fantastic war stories can be shared across the team. Alternatively, you could set up an internal blog or wiki where everyone could post those stories.

If you're worried about no one contributing to these areas, there's a simple solution. Make it part of their performance review. It's an easy matter to make it part of everyone's goals to achieve the company's goals. It will take some thought on the manager's part to work out the details, but if a developer's pay is based on it, you can get the contributions you need.

Final Debugging Process Secret

I'd like to share one final debugging secret with you: the debugger can answer all your debugging questions as long as you ask it the right ones. Again, I'm suggesting that you need to have a hypothesis in mind (something you want to prove or disprove) before the debugger can help you. As I recommended earlier in Step 7, I write out my hypothesis before I ever touch the debugger to ensure that I have a purpose each time I use it.

Remember that the debugger is just a tool, like a screwdriver. It does only what you tell it to do. The real debugger is the software in your hardware cranium.