Debugging Developer Psychology | Comprehensive VB .NET Debugging

Moving to the VB .NET world is going to mean leaving some of your current debugging habits behind and replacing them with new ones. You may be asked to debug a much wider variety of applications, all the way from Windows Forms applications to Web Forms applications to XML Web services, and on to Windows services, SQL Server stored procedures, and even VB.Classic code.

Although the debugging tools available to you are much more grown up than those available with VB.Classic, you are no longer working with an interpreted language. In fact, it's likely that you're no longer working with just a single language. This means that you should take some time to understand the new debugging problems that you'll have to solve.

Living Without Edit and Continue

Perhaps the first and most obvious debugging challenge for the VB.Classic developer is the loss of Edit and Continue (often abbreviated E&C). In the VB.Classic IDE you were able to edit your code while the program was in "break" mode, and then rerun the modified code without stopping or recompiling the program. This facility is quite addictive when you are debugging or testing code, and its loss has come as a nasty shock to the VB community. Even though VB .NET allows you to edit the source of a running program, these code changes will not actually be applied until the program is restarted.

There was some evidence buried deep in the beta documentation to suggest that the Redmondites did actually cajole E&C into working in the VB .NET environment, but then found that it was usually slower than just stopping and restarting the program. It is quite likely that E&C will be brought back in a future version of VB .NET, but until that time developers need to meet the challenge of living without it.

Developers will feel the loss of E&C deeply. It is psychologically easy to work toward a solution by progressive approximations, trying ideas and then modifying those ideas after seeing immediate results. This convenience, however, hides some problems.

E&C can be used well, but also has a dark side. Many developers find that it allows them to indulge in the rather sordid and unsafe habit of "shotgun debugging."

What typically happens is that a developer will start his or her program, put a breakpoint at the point of interest, and then modify the code in ad hoc ways in order to get it working. Because it's possible to try multiple fixes and tests during a single run without having to restart the program, this approach appears to be quite productive in terms of time spent. In fact, making relatively undirected changes to software in the hope that a bug will be perturbed out of existence almost never works, and it often introduces more bugs.

The major problem with this ad hoc "shotgun" approach is its informality. At its heart is a misunderstanding of the tradeoff between the speed of producing a working section of code and the quality of that code. Debugging is about trying to make code work, while testing is about trying to break code. Mixing these two activities together is likely to mean that neither is done well. As debugging guru John Robbins explains in his excellent book Debugging Applications (Microsoft Press, 2000), when you're in the debugger you should be debugging, not editing. Otherwise it's just as easy to add a bug as it is to remove one.

Let's take a look at some programming activities to find out how E&C can be used badly and how it can be used well. After exploring the E&C issues, I show you what workarounds are possible in its absence.

Bad Debugging with E&C

In this first scenario, a developer makes several attempts during a single run to fix a problem. He repeats a cycle of altering code and then resetting the current execution point to run through the altered code. For example, a developer believes that a bug is caused by an off-by-one error in a loop. So he first puts a +1 in the loop and then checks the result. If that doesn't work, he puts a -1 in the loop instead, still hoping to get the right result.

Proponents of this debugging approach say that it is more efficient because there is no need to restart the program after every code change. The problem, though, is the indisputable fact that in this case the developer doesn't actually understand the problem. If he did understand the problem, there would be no need to make multiple attempts at a fix; he would be able to go straight to the problem area and fix it in one attempt.

While E&C makes it very tempting to try many potential solutions until something seems to work, this is akin to poking a jellyfish with a stick to see if it moves; it doesn't actually teach you very much. A quality solution is much more likely to come from some deep thinking about a problem rather than ad hoc attempts at a fix.

More Bad Debugging with E&C

In the second scenario, a developer starts by finding and fixing a single bug, but then attempts to fix more defects that appear. These defects are often related to or hiding behind the first bug. The temptation when using E&C is to find and fix as many bugs as possible during a single run. This can be very satisfying and at first sight seems to be an efficient method of working.

Unfortunately, this approach also has problems. First, the developer once again probably doesn't understand the original defect at a deep level. If she understood the original bug properly, it is unlikely that her original fix would have exposed the multiple new issues. Even if she did understand the bug, it is debatable how well the new issues and subsequent fixes were understood. As before, E&C makes it far too tempting to fix bugs as they arise, rather than taking time to think about the issues away from the heat of the battle.

Another problem with this approach is that if there are multiple issues being exposed within a block of code, it is likely that the code itself is fundamentally flawed. Such code always benefits from taking a step back from the situation rather than fixing the bugs in an ad hoc manner. Using E&C to push multiple fixes is unlikely to improve the code's quality.

The next problem is that several programming experiments have been done that collectively show that the average developer introduces one new bug for every two bugs that he or she fixes. Even the best developers produce one new bug for every four that they fix. Making multiple concurrent fixes, some of which might potentially interfere with each other, is not conducive to producing high-quality solutions.

Perhaps the final nail in the coffin of "shotgun debugging" is that many of the E&C fixes made are small ones, often affecting only a few lines of code. These are made because they are the type of changes that look manageable without having to perform desk-checking or proper review. Unfortunately, the literature shows that the chances of creating a bug significantly increase when making such small changes. Specifically, the chance of making a bad change is already high for a one-line change and keeps rising as the number of lines changed increases to five. With more than five altered lines, the chance of making a bad change decreases, probably because the developer becomes more careful as the fix becomes larger.

Bad Unit Testing with E&C

In this third scenario, a developer edits code several times within a single run in order to run multiple unit tests on a procedure or component. For example, a VB .NET program that executes an embedded SQL script sees the developer correcting, tuning, and reexecuting the SQL script many times until he is happy that everything is working properly. The typical argument for this approach is that it is much faster to run these multiple edit-and-test cycles in one hit rather than having to restart the program every time a change is made.

There are several problems with this type of ad hoc testing. The first is that any tests done in this manner are not going to be as comprehensive as a manual or automated unit test harness that has been implemented beforehand. It's hard to invent several good unit tests on the fly, especially when the psychological mindset is aimed at making the code work rather than proving that the code doesn't work. The temptation is always to fix the code in-flight when each test fails rather than running a test suite, collating the results, and then thinking about the fixes as a single, well-controlled patch to the program. Simultaneous mixing of the unit testing mindset with the quite different mindset of fixing code that fails the unit tests is very dangerous.

A second problem with this approach is that it makes it very difficult to perform unit regression testing on future versions of the code. Even if the developer could manage to remember the entire unit test suite that he had performed, repeating a set of interactive regression tests every time the code changes is very boring and therefore prone to mistakes.

Another trap that many developers fall into with E&C in this context is changing code on both sides of the interface being tested. It is just too easy to make a change to the calling code that makes the interface being tested appear to work. What might actually be happening is that two errors, one on either side of the interface, could compensate for each other and result in the interface appearing to work. Once again, the psychological need to get the code working can interfere with the unit tests being performed. In effect, this makes it easier for the developer to fool himself into believing that everything is working correctly through an artificial test case created by manipulating two sets of code.

Good Debugging with E&C

One appropriate way of debugging with E&C is when you find an obvious coding error, often while looking at something completely different, and change the code for an obvious fix. While it is a fine line between an "obviously correct" fix and a "subtly wrong" fix, most developers know when they've found a glaringly obvious problem. In this case, it can be good practice to fix the problem, test the fix immediately, and then proceed on your way with your original mission.

Another reasonable way of using E&C is when you've found an error and managed to reproduce it successfully. The next step is often to reproduce the error in several different ways in order to learn more about it. E&C can be very useful in allowing multiple code changes and tests during a single run in order to triangulate the bug and learn more about it. The proviso, of course, is that you must put the code back into its original state once you have completed your investigation.

Good Unit Testing with E&C

E&C can prove very useful when you perform graphical user interface (GUI) unit testing. Although it's still better to create a proper test harness if possible, this task can be rather difficult to automate because you have to simulate a very unpredictable component (the end user), and it's also difficult to validate the unit test results. For this reason, many developers prefer to test interactively, so I'll reluctantly grant that this can be a reasonable use of E&C.

Good Prototyping with E&C

In this scenario, a developer makes multiple code changes during a single run in order to prototype, typically when investigating the behavior of a class, a control, or some other component. With E&C, it is much faster to perform this type of testing, and because it is only prototyping, there is much less danger of bugs running rampant. This scenario is one of the genuinely useful ways of using E&C.

Workarounds for "Bad" E&C

Viewed in the light of the numerous problems mentioned previously, losing the ability to use E&C in "shotgun" mode is a blessing in disguise. You can wean yourself away from the E&C addiction and instead produce better bug fixes and unit tests.

If you have previously been tempted to use "shotgun debugging," you can now try a more formal approach that might be called think and restart (T&R). With T&R, you sit back and take the time to understand the problem at a deep level before introducing the fix. After restarting the program and testing the fix, you should be very surprised, even shocked, if the fix didn't work. In this case, stop the program and figure out where you don't understand the problem before making another fix and restarting the program for some tests. The pain of frequent program restarts may even help by tempting you into making the correct fix on the first time around.

To replace ad hoc unit testing, you should think seriously about creating a test harness, otherwise known as a debugging scaffold. Constructing a test harness before building your program is like erecting a scaffold before building your wall. For small projects, it makes life much easier. For large projects, it is essential. A test harness makes it possible to repeat your unit tests and regression tests at will to ensure that fixed bugs stay fixed and that no new bugs have been introduced. It also makes it feasible for you to create a thorough and more thoughtful test suite than is possible with "shotgun testing." So you can write the test harness, start your application, and then apply all of your unit tests in one hit. After analyzing the test results, you can try to understand any resulting problems by setting breakpoints and looking at the flow of data within the code. Finally, you will have to stop the program before you can apply all of the necessary fixes and start a new test cycle.
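As a rough illustration of the kind of harness described above, here is a minimal hand-rolled sketch. The `PriceCalculator` module, its `ApplyDiscount` function, and the test names are all hypothetical and exist only to give the harness something to exercise. The point is the shape: every test runs, failures are collected rather than fixed in-flight, and the results are reported once at the end.

```vbnet
Imports System

' Hypothetical code under test (not from the original text).
Public Module PriceCalculator
    Public Function ApplyDiscount(ByVal price As Decimal, ByVal percent As Integer) As Decimal
        If percent < 0 OrElse percent > 100 Then
            Throw New ArgumentOutOfRangeException("percent")
        End If
        Return price - (price * percent / 100D)
    End Function
End Module

' Minimal hand-rolled harness: run every test, count failures, report once.
Public Module TestHarness
    Private failureCount As Integer

    ' Records a failure when the actual value differs from the expected one.
    Private Sub Check(ByVal testName As String, ByVal expected As Decimal, ByVal actual As Decimal)
        If expected <> actual Then
            failureCount += 1
            Console.WriteLine("FAIL " & testName & ": expected " & expected.ToString() & ", got " & actual.ToString())
        End If
    End Sub

    ' Records a failure when an invalid percentage is NOT rejected.
    Private Sub CheckRejected(ByVal testName As String, ByVal percent As Integer)
        Try
            ApplyDiscount(100D, percent)
            failureCount += 1
            Console.WriteLine("FAIL " & testName & ": no exception was thrown")
        Catch ex As ArgumentOutOfRangeException
            ' Expected: the invalid percentage was rejected.
        End Try
    End Sub

    ' Runs the whole suite and returns the number of failed tests.
    Public Function RunAll() As Integer
        failureCount = 0
        Check("no discount", 200D, ApplyDiscount(200D, 0))
        Check("quarter off", 150D, ApplyDiscount(200D, 25))
        Check("full discount", 0D, ApplyDiscount(200D, 100))
        CheckRejected("negative percent", -1)
        CheckRejected("percent above 100", 101)
        Return failureCount
    End Function

    Public Sub Main()
        Dim failed As Integer = RunAll()
        Console.WriteLine("Tests failed: " & failed.ToString())
    End Sub
End Module
```

Because `RunAll` can be rerun unchanged after every fix, the same suite doubles as a regression test. A dedicated unit testing framework would do this job more thoroughly; the sketch only shows the principle.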

Although the two alternative approaches outlined previously may be more formal and seem to take longer than using E&C in shotgun mode, the payoff is that your fixes and tests will be much higher in quality. Your understanding of the code is likely to be better and it will have been through some comprehensive testing.

At the root of this clash between formal and informal debugging and testing are two different mindsets. The classic VB mindset often values a solution that is produced fast to meet immediate business needs for an application that may not even be needed tomorrow. The resulting solution may not be of the highest quality, but it is produced fast and is considered fit for its purpose. Up until recently, this type of application was the typical domain of many VB.Classic programming shops.

The VB .NET mindset is perhaps more applicable to bigger departmental or enterprise-type solutions where the application is likely to have a longer life, be subject to more maintenance, and where quality is more important than speed of result. In this mindset, the quality required tends to rule out the use of E&C in shotgun mode because it often produces solutions with more bugs that are harder to maintain.

Workarounds for "Good" E&C

That still leaves you with finding an alternative to the reasonable uses of E&C mentioned previously. If you want to modify code within the IDE during program execution, this is still possible. After you select Tools > Options > Debugging > Edit and Continue, you should select the "Allow me to edit VB files while debugging" box. You also should select the "Always ignore changes and continue debugging" option.

The problem is that these changes won't be applied until you restart the program, and unless you identify changes made since the program was last restarted, it's possible to become confused about which code is actually executing. My advice is to put the new code within a region by using the region directive with a suitable heading. You can then collapse this region within the IDE code window so that you don't see the new code, but you can use the region heading to identify what change you made and why you made it. To remember this code change, add a bookmark linked to it in the Task window. When you decide to review your changes, double-clicking the bookmark in the Task window will take you straight to the related region or lines of code.
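A region used this way might look like the following sketch. The class, the method, and the wording of the region heading are hypothetical; only the `#Region` / `#End Region` directive pair is part of the language. The region here wraps the whole procedure rather than a few lines inside it, on the assumption that early VB .NET compilers do not accept region directives within a method body.

```vbnet
Public Class OrderProcessor

#Region "EDITED DURING DEBUG RUN, NOT YET EXECUTING: loop skipped the last element"
    ' Original loop bound was quantities.Length - 2, which stopped one short.
    Public Function TotalQuantity(ByVal quantities() As Integer) As Integer
        Dim total As Integer = 0
        Dim i As Integer
        For i = 0 To quantities.Length - 1
            total += quantities(i)
        Next
        Return total
    End Function
#End Region

End Class
```

When the region is collapsed, only the heading remains visible, so the heading should say both what was changed and that the running program does not yet include it.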

For GUI unit testing, you should create one or more procedures that can throw tests against the GUI. Because you are able to change data at will during a run, and you can also execute your test procedures from the Command window, it is then trivial to modify the data in the Command window and invoke your test procedures. This gives you an E&C unit test framework without having to change code.
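A test procedure of this kind could be sketched as follows. The form, its controls, and the `TestNameEntry` function are hypothetical names invented for the illustration, and the rule being tested (Save is enabled only when a name has been entered) is likewise an assumption.

```vbnet
Imports System
Imports System.Windows.Forms

' Hypothetical form used only for illustration.
Public Class CustomerForm
    Inherits Form

    Private nameBox As New TextBox()
    Private saveButton As New Button()

    Public Sub New()
        saveButton.Text = "Save"
        saveButton.Enabled = False
        Controls.Add(nameBox)
        Controls.Add(saveButton)
        AddHandler nameBox.TextChanged, AddressOf NameBox_TextChanged
    End Sub

    ' UI rule under test: Save is available only when a name has been entered.
    Private Sub NameBox_TextChanged(ByVal sender As Object, ByVal e As EventArgs)
        saveButton.Enabled = nameBox.Text.Trim().Length > 0
    End Sub

    ' Test procedure: feeds one value into the form and reports the resulting state.
    Public Function TestNameEntry(ByVal candidate As String) As Boolean
        nameBox.Text = candidate
        Return saveButton.Enabled
    End Function
End Class
```

With the program stopped at a breakpoint inside the form, something like `? Me.TestNameEntry("Ada")` typed into the Command window in immediate mode would run the test with that value. Changing the argument and reissuing the command gives a quick edit-and-test loop without touching the source.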

This still leaves you trying to find a replacement for E&C during prototyping. While you can prototype code at will from the Command window by just changing data values, this isn't really a complete solution. Until E&C is added to VB .NET, it's the best solution that you have.

Psychological Factors

From the point of view of the CLR or any other runtime engine, the most error-prone component of any computer application is the actual programmer. The problem is that nearly every developer finds it extremely difficult to maintain the level of precision necessary to program a complex software system successfully. It's not just that the tools and plumbing have idiosyncrasies, implementation subtleties, unfamiliar levels of abstraction, and outright defects. The main issue is that human brains are unsuited to handling the level of detail required and to spanning levels of abstraction ranging from bits up to gigabytes. Add into this volatile mix a compiler that has to understand the lovingly crafted code and then translate it into goo that the processor can understand, and you are left with enormous potential for mistakes and misunderstandings when writing a computer application. It's hardly surprising that studies report most projects spend around 50% of their schedule in the debugging phase.

COM is a good example of a reasonable development technology that has proved difficult to implement in the real world because of developer frailties. Dealing with the separation of components from their registry entries and keeping the two in step for every version of every component interface on every user's machine is a nightmare in a world of constantly changing business requirements and code. It's not that COM technology doesn't work, it's just that even clever people find it very hard to cope with the detail involved in orchestrating their components and corresponding registry entries into a coherent whole on every user's machine and then maintaining that orchestration over the lifetime of an application. Countless numbers of VB.Classic developers have struggled with implementing binary compatibility on a daily basis.

Faced with an almost new language and completely new ways of interacting with the core of Windows, developers must ensure that they learn effectively and don't create bugs due to their initial inexperience with the .NET world.

Learning the .NET Framework

The first learning challenge is rather large. There are over 5,600 classes in over 90 namespaces within the .NET base class library, an enormous amount of functionality by anybody's standards. Understanding and implementing these base classes within your applications without creating bugs is not a task for the faint-hearted. The average VB.Classic developer will have to put much more effort into learning the .NET Framework than into learning the VB .NET language itself. As opposed to just learning a relatively familiar language, with the .NET Framework you're faced with that scary "I don't know what I don't know" feeling.

If you're already an expert in a particular part of the Win32 API, relearning the .NET Framework approach to that area may not be so hard. Much of the .NET class library simply wraps the Win32 API in an object-oriented wrapper for your convenience. There are some exceptions to this, such as in the graphics area, but in general the Win32 API is still at the core of the .NET Framework. If, however, you're an expert in VB.Classic's approach to your specialist area, now is the time to start thinking about doing some serious homework.

The first step is probably the most important: Read the framework documentation thoroughly! In the past, we VB.Classic developers have traditionally been spoiled. The language concealed much of the complexity from us; it just worked. We often didn't need to read the documentation, and when we did we found that it often wasn't that great. Sometimes the documentation had been written for C programmers and was unintelligible to your average VB.Classic developer. At other times the documentation looked as though it was thrown together by a Microsoft intern on some very heavy medication. So the end result was that we became accustomed to ignoring the documentation. Because VB.Classic successfully hid many of the nasty details, we were often able to get away with this approach.

Now we're in a different and more complex world. So when you need to use a framework class, the first place to go is its documentation. The problem of dodgy documentation and flaky code samples is still present, but to a much lesser degree than in VB.Classic. VB .NET is a first-class citizen of the new world, and the documentation has in most cases been upgraded to match this new position. Nearly all of the code samples are written in both C# and VB .NET, another reflection of the fact that VB is no longer a second-class citizen. Finally, there is a vast amount of documentation available on other resources such as MSDN, and this extra information is sometimes quite good at showing you the "when" and the "why," in addition to the "how." To avoid bugs in your implementation, it is important to have a broad, as well as a deep, understanding of the specific class that you're using.

I would also advise taking some time away from coding to just read through the class library documentation in general. Even though you're reading casually and often looking at classes in which you have no specific interest, your mind will absorb the information by osmosis; it will just seep in without any great effort on your part. You don't have to be able to keep all of this information at a conscious level; even at an unconscious level it will help you to avoid bugs.

One final tip is to take the time to become an expert about either a single namespace or just a set of classes within that namespace. Learn everything you can about the namespace, and experiment with it until you understand it at a deep and broad level. Twist it into knots so that you understand all of the designer's conventions and where things can go wrong. Answer questions about it from your colleagues and even in public newsgroups. Every question that you answer about that namespace is likely to teach you more about avoiding bugs when you use it. Once you've learned a namespace or even a single class in great detail, this knowledge will also help you in learning other areas of the .NET Framework.

Sharing Knowledge Between Developers

VB .NET looks and feels enough like VB.Classic to cause some confusion. As an example, you can if you want ignite a moderately sized flame war in any VB .NET newsgroup by asking innocently whether <Object> = Nothing is useful or required in VB .NET. Developers moving to VB .NET need to go back to school for a while if they want to create software that is relatively bug-free and easy to modify and debug.

One major cause of bugs is likely to be the "top geek" syndrome. This occurs when a developer who is highly paid for his expertise is faced with a situation where his current knowledge no longer applies. Lacking the knowledge to use his new tools safely, the response is to charge ahead anyway, with unflattering results on the reliability of his new application.

The sensible response would be to admit his ignorance and not produce designs or code without a better understanding of his tools. In reality, this is unlikely to happen because both his ego and his compensation are usually directly linked to his ability to appear as a guru.

One approach to this problem is to adopt a discipline taken from the Extreme Programming (XP) process called Pair Programming (PP). While the XP process has its critics, PP can be very useful when applied to situations where simultaneous learning and programming is taking place.

The idea is for two developers to pair up and program together. The two developers take it in turns either to drive the keyboard or to sit and observe. The idea is that the driver handles the small details by writing and talking about the code while the observer watches for problems either in the code or at a higher, more strategic level. The driver tends to the trees while the observer considers the whole forest.

If done carefully, this pairing technique can be very useful when operating in a new environment such as VB .NET. The driver is likely to make lots of small and mostly silly errors that will be corrected by the observer, while the observer learns from the driver's experience of actually writing the code. The two learn together at a faster rate than either one could do individually, each contributing his or her share of the knowledge.

The trick is to make sure that the pair chemistry is right. For instance, it is not recommended to pair a guru with a novice. The novice is soon likely to be out of his or her depth and the guru will become bored and frustrated, especially when the novice has his or her turn as the driver. It is better to keep the pairing reasonably close in ability.

Apart from the bug prevention and detection benefits, another benefit of PP is through taking a developer who has to use an unfamiliar component and pairing her together with the author (or experienced user) of that component. The author can then guide the developer rapidly through the component's public interface and explain the component's conventions and usage subtleties. This can save a lot of time and effort, and prevent bugs arising from accidental misuse of the component.

A final benefit from PP is that having to discuss your code with another developer often means that you find yourself refactoring your code to make it simpler because it is then easier to explain. Simpler code means less opportunity for bugs, and many of the bugs that do creep in are easier for the observer to spot.

Murphy's Law Is Wrong

There is one final, perhaps rather philosophical, factor that I want to discuss. You've undoubtedly come across Murphy's Law, which states, "What can go wrong, will go wrong." Unfortunately, analysis of real-life bugs and system failures shows that Murphy's Law is completely wrong: what can go wrong usually goes right. In most applications, there are usually many defects hiding within the software when it goes into production. Most of the time, these bugs stay silent and the application works successfully. Only occasionally is a bug actually triggered, whereupon the application goes wrong or crashes.

The implication is that developers are usually wrong when they assume that because their application has been running successfully in a testing or production environment for a while, the software doesn't have many remaining defects. In reality, the bugs are biding their time and waiting for the most opportune moment to strike. This mindset can be compared with the technical culture at NASA before the 1986 explosion of the space shuttle Challenger. O-ring worries were put to one side because there had been so many successful launches with these exact O-rings.

There is another clue here. It becomes clear that some errors in complex distributed software systems will happen, and it is impossible to prevent all errors. So as well as working to prevent errors, you should also place serious emphasis on limiting their negative consequences and on detecting them when they do occur.