Top 10 lists seem to be all the rage nowadays, so I thought I'd make my own contribution. Unfortunately, I've led a relatively clean life and couldn't find ten bugs that were worth discussing. Instead, here's a list of the five most memorable bugs that I've encountered in my 23 years of commercial programming.
I was asked to investigate a performance problem where a trading application written in VB.Classic was taking more than 2 minutes to start. This slow startup time was really annoying the traders using the application, to the point where they were threatening dire retribution if the problem wasn't fixed. So I decided to explore the issue by using a third-party profiler to look in detail at the startup performance.
After constructing call graphs and mapping the most expensive procedures, I found a single statement that was occupying no less than 50% of the startup time! The two grid controls that formed the core of the application's GUI were referenced by code that marked every other grid column in bold. There was one statement inside a loop that changed the font to bold, and this statement was the culprit. Although the line of code only took milliseconds to run, it was executed over 50,000 times. The original developer had used small volumes of data and hadn't bothered to check whether the routine was being called redundantly. Over time, as the volume of data grew, the startup times became slower and slower.
After changing the code so that the grid columns were set to bold only once, the application's startup time dropped by nearly a minute and the day was saved. The moral here is that it's very easy to spend a lot of time tuning the wrong part of your program. It's better to get significant portions of your application to work correctly and then use a good profiler to look at where the real speed bumps are hiding. Finally, when your whole application is up and running correctly, use the profiler again to discover any remaining performance issues caused by your system integration.
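To make the shape of this bug concrete, here is a minimal Python sketch. It is not the original VB.Classic code. The `FakeGrid` class, the function names, and the row and column counts are all hypothetical, and I am assuming that the bold statement sat inside the loop that loaded the rows. The sketch only counts how often the "expensive" statement runs before and after it is moved out of that loop.

```python
class FakeGrid:
    """Hypothetical stand-in for a GUI grid control; it only counts font changes."""

    def __init__(self, column_count):
        self.column_count = column_count
        self.bold_columns = set()
        self.font_changes = 0  # how many times the "expensive" statement ran

    def set_column_bold(self, column):
        self.font_changes += 1
        self.bold_columns.add(column)


def load_rows_redundant(grid, row_count):
    """Assumed shape of the slow code: columns are re-bolded for every row."""
    for _ in range(row_count):
        for column in range(0, grid.column_count, 2):
            grid.set_column_bold(column)


def load_rows_hoisted(grid, row_count):
    """Fixed shape: bold every other column once, then load the rows."""
    for column in range(0, grid.column_count, 2):
        grid.set_column_bold(column)
    for _ in range(row_count):
        pass  # row loading would happen here


slow = FakeGrid(column_count=20)
load_rows_redundant(slow, row_count=5000)

fast = FakeGrid(column_count=20)
load_rows_hoisted(fast, row_count=5000)

print(slow.font_changes, fast.font_changes)  # prints "50000 10"
```

Both versions leave the same columns bold; only the number of calls differs. With small test data the redundant version looks harmless, which is probably why a bug like this survives until the data volume grows.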
This bug lived in the same trading application mentioned in the preceding section. While testing some new functionality that I had recently added, I happened to notice that the code to display the results of a certain type of trade would never work properly. After looking at the source control system, it was obvious that this bug had existed for at least a year, and I was amazed that none of the traders had ever spotted it. After puzzling for a while and checking with a colleague, I fixed the bug and went on testing my new functionality.
About 3 minutes later, my phone rang. On the other end of the line was an irate trader who complained that one of his trades wasn't showing correctly. Upon further investigation, I realized that the trader had been hit with the exact same bug I had noticed in the code 3 minutes earlier. This bug had been lying around for a year, just waiting for a developer to come along and spot it so that it could strike for real.
This is a good example of a type of bug known as a Schroedinbug, which I discuss in Chapter 7's Interlude section. While most of us have heard about these peculiar entities, it is an eerie feeling when you actually encounter one in the wild.
While working for a software house, I was given the task of making a minor amendment to a deal-entry application. I innocently opened the source code to find an extraordinary mess that defied description. The code contained everything from hard-coded value changes (If PageLine = 17 and VarValue = 28.7 Then VarValue = 16.1) to variable gotos (Goto Line(VarX+VarY-5)). After a couple of hours' investigation, I counted something like 30 bugs. After a couple of days, the bug count was over 200. It was like walking into an insane asylum. The original code author was long gone, and I was left with a true crawling horror.
This was perhaps my moment of conversion, when I started to devote some significant hours to the pursuit of software quality. It even led, indirectly, to the writing of this book. There's nothing like a few scarring experiences to teach you about software development.
Possibly the most bizarre "bug" that I have ever encountered happened in the mid-1980s. One of my company's inventory programs, running on a Nixdorf 8870 minicomputer, always crashed when used by a certain administrator. When it was used by anybody else, the program ran without a murmur, but it seemed to have a distinct aversion to this specific woman.
I first thought that she was performing some arcane program operation that wasn't done by the other staff. After quizzing her for a while, I still couldn't understand what was going wrong. I tried several other hunches, but nothing would explain why the program was crashing. I even wondered whether her clothes were generating some static electricity that was upsetting the computer!
In the end, I actually went to the woman's office to do some serious debugging. From the moment that she arrived in the room I watched her like a hawk, expecting her to do something that would lead to the problem. What she actually did was fairly mundane. Because the normal chair in the room didn't ergonomically suit her back, she went to the office next door to retrieve a chair that she liked better. Then she sat down, and about a minute later the program crashed.
In a flash of inspiration, I took the chair apart. After locating a magnet, I tested the inside of the chair. Sure enough, the core of the chair had somehow become magnetized, and this was what was causing the computer to react violently to this woman's presence. Problem solved.
I still maintain that this was my finest debugging hour; everything since has been downhill.
The occasion was the rollout of a new version of my group's in-house trading application to all of the commodity traders, and the time was 6:00 P.M. on a Friday. The estimate for three of us to complete the rollout was 2-3 hours, although the traders wouldn't be back until 9:00 A.M. the following day. We'd already performed a test rollout to a couple of production PCs, and everything had worked perfectly. Full of optimism, we set to work.
At about 8:00 P.M., we realized that we had a problem. On just three of the ten PCs, our application refused to start properly, producing a strange error that suggested a registry problem. By midnight, after increasingly frustrated efforts to locate the problem, we were reduced to adding copious trace statements and then recompiling the application in order to find the exact line where the error was occurring. This line turned out to be a Dim statement, a nonexecutable line that would never normally give an error.
By 3:00 A.M., after numerous experiments and diagnostic attempts, we were still baffled. As far as we could tell, the problem was some sort of registry issue that occurred because the Windows user profile was corrupted; re-creating the user profile from scratch cured the problem. Unfortunately, it appeared that running certain other programs would then corrupt the profile again, and our program would stop working. Why this problem happened on some PCs but not others was also a mystery. The installation scripting process for that particular company was a real witch's brew.
By 6:00 A.M., 12 hours after we had started, we still hadn't found a satisfactory solution or workaround, and desperation was starting to set in.
By 9:00 A.M., as the traders came in to work, we decided that they could cope with the seven PCs that were working, and we would fix the other three PCs on the following Monday. Exhausted and frustrated at being beaten, we retired to our respective homes to lick our wounds.
Several months later, we still weren't able to diagnose the exact problem, even with the help of other teams. In fact, the bug was never found. It finally disappeared when we moved from Windows NT to Windows 2000 as the base operating system, but this still remains the most exasperating bug that I've ever encountered.