Debugging is a fascinating topic no matter what language or platform you're using. It's the only part of software development in which engineers kick, scream at, or even throw their computers. For a normally reticent, introverted group, this degree of emotion is extraordinary. Debugging is also the part of software development that's famous for causing you to pull all-nighters. I've yet to run into an engineer who has called his or her partner to say, "Honey, I can't come home because we're having so much fun doing our UML diagrams that we want to pull an all-nighter!" However, I've run into plenty of engineers who have called their partner with the lament, "Honey, I can't come home because we've run into a whopper of a bug."
Bugs are cool! They help you learn the most about how things work. We all got into this business because we like to learn, and tracking down bugs is the ultimate learning experience. I don't know how many times I've had nearly every programming book I own open and spread out across my office looking for a good bug. It feels just plain great to find and fix those bugs! Of course, the coolest bugs are those that you find before the customer sees your product. That means you have to do your job to find those bugs before your customers do. Having your customers find them is extremely uncool.
Compared with other engineering fields, software engineering is an anomaly in two ways. First, software engineering is a new and somewhat immature branch of engineering compared with other forms of engineering that have been around for a while, such as structural and electrical engineering. Second, users have come to accept bugs in our products, particularly in PC software. Although they grudgingly resign themselves to bugs on PCs, they're still not happy when they find them. Interestingly enough, those same customers would never tolerate a bug in a nuclear reactor design or a piece of medical hardware. With PC software becoming more a part of people's lives, the free ride that the software engineering field has enjoyed is nearly over. I don't doubt that the liability laws that apply to other engineering disciplines will eventually cover software engineering as well.
You need to care about bugs because ultimately they are costly to your business. In the short term, customers contact you for help, forcing you to spend your time and money sustaining the current product while your competitors work on their next versions. In the long term, the invisible hand of economics kicks in and customers just start buying alternatives to your buggy product. Software is now more of a service than a capital investment, so the pressure for higher quality software will increase. With many applications supporting Extensible Markup Language (XML) for input and output, your users are almost able to switch among software products from various vendors just by moving from one Web site to another. This boon for users will mean less job security for you and me if our products are buggy and more incentive to create high-quality products. Let me phrase this another way: the buggier your product, the more likely you are to have to look for a new job. If there's anything that engineers hate, it's going through the job-hunting process.
Before you can start debugging, you need a definition of bugs. My definition of a bug is "anything that causes a user pain." I classify bugs into the following categories:
Inconsistent user interfaces
Unmet expectations
Poor performance
Crashes or data corruption
Inconsistent user interfaces, though not the most serious type of bug, are annoying. One reason for the success of the Microsoft Windows operating system is that all Windows-based applications generally behave the same way. When an application deviates from the Windows standard, it becomes a burden for the user. An excellent example of this nonstandard, irksome behavior is the Find accelerators in Microsoft Outlook. In every other English-language Windows-based application on the planet, Ctrl+F brings up the Find dialog box so that you can find text in the current window. In Outlook, however, Ctrl+F forwards the open message, which I consider a bug. Even after many years of using Outlook, I can never remember to use the F4 key to find text in the currently open message.
With client applications, it's pretty easy to solve problems with inconsistent user interfaces by following the recommendations in the book Microsoft Windows User Experience (Microsoft Press, 1999), which is also available from MSDN Online at http://msdn.microsoft.com/library/en-us/dnwue/html/welcome.asp. If that book doesn't address a particular issue, look for another Microsoft application that does something similar to what you're trying to achieve and follow its model. Microsoft seems to have infinite resources and unlimited time; if you take advantage of their extensive research, solving consistency problems won't cost you an arm and a leg.
If you're working on Web front ends, life is much more difficult because there's no standard for user interface display. As we've all experienced from the user perspective, it's quite difficult to get a good user interface (UI) in a Web browser. For developing strong Web client UIs, I can recommend two books. The first is the standard bible on Web design, Jakob Nielsen's Designing Web Usability: The Practice of Simplicity. The second is an outstanding small book that you should give to any self-styled usability experts on your team who couldn't design their way out of a wet paper bag (such as any executive who wants to do the UI but has never used a computer): Steve Krug's Don't Make Me Think! A Common Sense Approach to Web Usability. Whatever you do for your Web UI, keep in mind that not all your users will have 100-megabit-per-second pipes for their browsers, so keep your UI simple and avoid lots of fluff that takes forever to download. When researching great Web clients, User Interface Engineering (www.uie.com) found that approaches such as CNN.com's worked best for all users: a simple set of clean links, with information grouped under clear sections, lets users find what they're looking for better than anything else.
Not meeting the user's expectations is one of the hardest bugs to solve. This bug usually occurs right at the beginning of a project, when the company doesn't do sufficient research on what the real customer needs. In both types of shops—shrink wrap (those writing software for sale) and Information Technology (or IT, those writing in-house applications)—the cause of this bug comes down to communication problems.
In general, development teams don't communicate directly with their product's customers, so they aren't learning what the users need. Ideally, all members of the engineering team should be visiting customer sites so that they can see how the customers use their product. Watching over a customer's shoulder as your product is being used can be an eye-opening experience. Additionally, this experience will give you the insight you need to properly interpret what customers are asking your product to do. If you do get to talk to customers, make sure you speak with as many as possible so that you can get input from across a wide spectrum. In fact, I would strongly recommend that you stop reading right now and go schedule a customer meeting. I can't say it strongly enough: the more you talk with customers, the better an engineer you'll be.
In addition to customer visits, another good idea is to have the engineering team review the support call summaries and support e-mails. This feedback will allow the engineering team to see the problems that the users are having, without any filtering applied.
Another aspect of this kind of bug is the situation in which the user's level of expectation has been raised higher than the product can deliver. This inflation of user expectations is the classic result of too much hype, and you must resist misrepresenting your product's capabilities at all costs. When users don't get what they anticipated from a product, they tend to feel that the product is even buggier than it really is. The rule for avoiding this situation is to never promise what you can't deliver and to always deliver what you promise.
Users are very frustrated by bugs that cause the application to slow down when it encounters real-world data. Invariably, improper testing is the root of all poor performance bugs—however great the application might have looked in development, the team failed to test it with anything approaching real-world volumes. One project I worked on, NuMega's BoundsChecker 3.0, had this bug with its original FinalCheck technology. That version of FinalCheck inserted additional debugging and contextual information directly into the source code so that BoundsChecker could better report errors. Unfortunately, we failed to sufficiently test the FinalCheck code on larger real-world applications before we released BoundsChecker 3.0. As a result, more users than we cared to admit couldn't use that feature. We completely rewrote the FinalCheck feature in subsequent releases, but because of the performance problems in the original version, many users never tried it again, even though it was one of the product's most powerful and useful features. Interestingly enough, we released BoundsChecker 3.0 in 1995, and seven years later—at least two eons in Internet time—people were still telling me that they hadn't used FinalCheck because of that one bad experience!
You tackle poor performance bugs in two ways. First, make sure you determine your application's performance requirements up front. To know whether you have a performance problem, you need a goal to measure against. An important part of performance planning is keeping baseline performance numbers. If your application starts missing those numbers by 10 percent or more, you need to stop and determine why your performance dropped and take steps to correct the problem. Second, make sure you test your applications against scenarios that are as close to the real world as possible—and do this as early in the development cycle as you can.
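To make this concrete, here's a minimal sketch of what an automated baseline check might look like in C++. Everything in it is hypothetical: ProcessOrders stands in for whatever operation you're measuring, and the 1,200-millisecond baseline is an invented number you'd replace with your own recorded figure. The shape is the point—time a realistic scenario, compare against the recorded baseline, and complain loudly when you miss it by 10 percent or more.

#include <chrono>
#include <cstdio>

// Baseline recorded from a previous known-good build (hypothetical value).
const double kBaselineMs = 1200.0;

// Stand-in for the real work you want to measure.
void ProcessOrders()
{
    volatile long sum = 0;
    for (long i = 0; i < 50000000; ++i)
        sum += i;
}

int main()
{
    auto start = std::chrono::steady_clock::now();
    ProcessOrders();
    auto stop = std::chrono::steady_clock::now();

    double elapsedMs =
        std::chrono::duration<double, std::milli>(stop - start).count();

    // Flag any run that misses the baseline by 10 percent or more.
    if (elapsedMs > kBaselineMs * 1.10)
        std::printf("PERF REGRESSION: %.0f ms (baseline %.0f ms)\n",
                    elapsedMs, kBaselineMs);
    else
        std::printf("OK: %.0f ms (baseline %.0f ms)\n",
                    elapsedMs, kBaselineMs);
    return 0;
}

A check like this belongs in your automated test run, not just on a developer's machine, so a dropped number stops the build instead of surprising you a month later.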
Here's one common question I continually get from developers: "Where can I get those real-world data sets so that I can do performance testing?" The answer is to talk to your customers. It never hurts to ask whether you can get their data sets so that you can do your testing. If a customer is concerned about privacy issues, consider writing a program that changes the sensitive information. You can let the customer run that program and verify that it hides enough sensitive information for them to feel comfortable handing over the data. It also helps to offer free software when the customer needs some motivation to give you their data.
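As a sketch of what such a scrubbing program might look like, the following C++ program reads a hypothetical CSV file whose first two columns are a customer name and an e-mail address and replaces both with stable anonymous tokens. The column layout and the hashing scheme are assumptions made for illustration; a real scrubber has to match the customer's actual schema and privacy requirements.

#include <fstream>
#include <functional>
#include <iostream>
#include <sstream>
#include <string>

// Replace a sensitive value with a stable, anonymous token so that
// relationships in the data survive but the value itself does not.
std::string Scrub(const std::string& value, const char* tag)
{
    std::size_t hash = std::hash<std::string>{}(value);
    std::ostringstream out;
    out << tag << '-' << std::hex << hash;
    return out.str();
}

int main(int argc, char* argv[])
{
    if (argc != 3)
    {
        std::cerr << "Usage: scrub <input.csv> <output.csv>\n";
        return 1;
    }

    std::ifstream in(argv[1]);
    std::ofstream out(argv[2]);
    std::string line;

    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string name, email, rest;
        std::getline(fields, name, ',');   // hypothetical column 0: name
        std::getline(fields, email, ',');  // hypothetical column 1: e-mail
        std::getline(fields, rest);        // everything else passes through

        out << Scrub(name, "NAME") << ','
            << Scrub(email, "MAIL") << ',' << rest << '\n';
    }
    return 0;
}

Because the tokens are deterministic within a run, the same customer maps to the same token throughout the file, so characteristics such as duplicate counts and record relationships survive the scrubbing.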
Crashes and data corruption are what most developers and users think of when they think of a bug. I also put memory leaks into this category. Users might be able to work around the types of bugs just described, but crashes stop them dead, which is why the majority of this book concentrates on solving these extreme problems. In addition, crashes and data corruption are the most common types of bugs. As we all know, some of these bugs are easy to solve, and others are almost impossible. The main point to remember about crashes and data corruption bugs is that you should never ship a product if you know it has one of these bugs in it.
Although shipping software without bugs is possible—provided you give enough attention to detail—I've shipped enough products to know that most teams haven't reached that level of software development maturity. Bugs are a fact of life in this business. However, you can minimize the number of bugs your applications have. That is what teams that ship high-quality products—and there are many out there—do. The reasons for bugs generally fall into the following process categories:
Short or impossible deadlines
The "Code First, Think Later" approach
Misunderstood requirements
Engineer ignorance or improper training
Lack of commitment to quality
We've all been part of development teams for which "management" has set a deadline that was determined by either a tarot card reader or, if that was too expensive, a Magic 8-Ball. Although we'd like to believe that managers are responsible for most unrealistic schedules, more often than not, they aren't to blame. Engineers' work estimates are usually the basis of the schedule, and sometimes engineers underestimate how long it will take them to develop a solid product. Engineers are funny people. They are introverted but almost always very positive thinkers. Given a task, they believe down to their bones that they can make the computer stand up and dance. If their manager comes to them and says that they have to add an XML transform to the application, the average engineer says "Sure, boss! It'll be three days." Of course, that engineer might not even know how to spell "XML," but he'll know it'll take three days. The big problem is that engineers and managers don't take into account the learning time necessary to make a feature happen. In the section "Scheduling Time for Building Debugging Systems" in Chapter 2, I'll cover some of the rules that you should take into account when scheduling. Whether an unrealistic ship date is the fault of management or engineering or both, the bottom line is that a schedule that's impossible to meet leads to cut corners and a lower quality product.
I've been fortunate enough to work on several teams that have shipped software on time. In each case, the development team truly owned the schedule, and we were good at determining realistic ship dates. To figure out realistic ship dates, we based our dates on a feature set. If the company found the proposed ship date unacceptable, we cut features to move up the date. In addition, everyone on the development team agreed to the schedule before we presented it to management. That way, the team's credibility was on the line to finish the product on time. Interestingly, besides shipping on time, these products were some of the highest quality products I've ever worked on.
My friend Peter Ierardi coined the term "Code First, Think Later" to describe the all-too-common situation in which an engineering team starts programming before they start thinking. Every one of us is guilty of this approach to an extent. Playing with compilers, writing code, and debugging is the fun stuff; it's why we got interested in this business in the first place. Very few of us like to sit down and write documents that describe what we're going to do.
If you don't write these documents, however, you'll start to run into bugs. Instead of stopping and thinking about how to avoid bugs in the first place, you'll start tweaking the code as you go along to work around them. As you might imagine, this tactic compounds the problem because you'll introduce more and more bugs into an already unstable code base. The company I work for travels the world helping developers debug their nastiest problems. Unfortunately, many times we're brought in to help solve corruption or performance problems and there's nothing we can do, because the problems are fundamentally architectural. When we bring the problems to the managers who hired us and tell them it's going to take a partial rewrite to fix them, we sometimes hear, "We've got too big an investment in this code base to change it now." That's a sure sign of a company that has fallen into the "Code First, Think Later" trap. In our report on such an engagement, we simply write "CFTL" as the reason we were unable to help.
Fortunately, the solution to this problem is simple: plan your projects. Some very good books have been written about requirements gathering and project planning. I cite them in Appendix B, and I highly recommend that you read them. Although it isn't very sexy and is generally a little painful, up-front planning is vital to eliminating bugs.
One of the big complaints I got about the first version of this book was that I recommended that you plan your projects but didn't tell you how to do it. That complaint is perfectly valid, and I want to address the issue here in the second edition. The only problem is that no single planning method works for every team, so I can't hand you one. Now you're wondering whether I'm doing the bad author thing and leaving planning as an exercise for the reader. Read on, and I'll tell you which planning tactics have worked for me. I hope they'll provide you with some ideas as well.
If you read my bio at the end of the book, you'll notice that I didn't get started in the software business until I was in my late 20s and that it's really my second career. My first career was jumping out of airplanes and hunting down the enemy: I was a paratrooper and Green Beret in the United States Army. If that's not preparation for the software business, I don't know what is! Of course, if you meet me now, you'll see just a short fat guy with a pasty green glow—a result of sitting in front of a monitor too much. However, I really did use to be a man. I really did!
Being a Green Beret taught me how to plan. When you're planning a special operations mission and the odds are fairly high that you could die, you are extremely motivated to do the best planning possible. When planning one of those operations, the Army puts the whole team in what's called "isolation." At Fort Bragg, North Carolina, the home of Special Forces, there are special areas where they literally lock the team away to plan the mission. The key to that planning was what we called "what-if-ing yourself to death." We'd sit around and think up scenarios. What happens if we're supposed to parachute in, we pass the point of no return, and the Air Force can't find the drop zone? What happens if we take casualties before we jump? What happens if we hit the ground and can't find the guerilla commander we're supposed to meet? What happens if the guerilla commander we're supposed to meet has more people with him than he's supposed to? What happens if we're ambushed? We'd spend forever thinking up questions and devising answers before ever leaving isolation. The idea was to have every contingency planned out so that nothing was left to chance. Trust me: when there's a good chance you might die doing your job, you want to know all the variables and account for them.
When I got into the software business, that's the kind of planning I was used to doing. The first time I sat in a meeting and said, "What if Bob dies before we get through the requirements phase?" everyone got quite nervous, so now I phrase questions with a less morbid spin, like "What if Bob wins the lottery and quits before we get through the requirements phase?" However, the idea is still the same. Find all the areas of doubt and confusion in your plans and address them. It's not easy to do and will drive weaker engineers crazy, but the key issues will always pop out if you drill down enough. For example, in the requirements phase, you'll be asking questions such as, "What if our requirements aren't what the user wants?" Such questions will prompt you to budget time and money to find out if those requirements are what you need to be addressing. In the design phase, you'll be asking questions like, "What if our performance isn't good enough?" Such questions will make you remember to sit down and define your performance goals and start planning how you're going to achieve those goals by testing against real-world scenarios. Planning is much easier if you can get all the issues on the table. Just be thankful that your life doesn't depend on shipping software on time!
The Battle
A client called us in because they had a big performance problem and the ship date was fast approaching. One of the first things we ask for when we start on these emergency problems is a 15-minute architectural overview so that we can get up to speed on the terminology as well as get an idea of how the project fits together. The client hustled in one of the architects and he started the explanation on the white board.
Normally, these circle and arrow sessions take 10 to 15 minutes. However, this architect was still going strong 45 minutes later, and I was getting confused because I needed more than a roadmap to keep up. I finally admitted that I was totally lost and asked again for the 10-minute system overview. I didn't need to know everything; I just needed to know the high points. The architect started again and in 15 minutes was only about 25 percent through the system!
The Outcome
This was a large COM system, and at about this point I started to figure out what the performance problem was. Evidently, some architect on the team had become enamored with COM. He didn't just sip from a glass of COM Kool-Aid; he immediately started guzzling from the 55-gallon drum of COM. In what I later guessed was a system that needed 8–10 main objects, this team had over 80! To give you an idea how ridiculous this was, it was like every character in a string was a COM object. This thing was over-engineered and completely under-thought. It was the classic case in which the architects had zero hands-on experience.
After about half a day, I finally got the manager off to the side and said that there wasn't much we could do for performance because the overhead of COM itself was killing them. He was none too happy to hear this and immediately blurted out that infamous phrase: "We've got too big an investment in this code to change now!" Unfortunately, with their existing architecture, we couldn't do much to effect a performance boost.
The Lesson
This project suffered from several major problems right from the beginning. First, the complete design was handed to the team by nonimplementers. Second, the team immediately started coding when the plan came down from on high. There was absolutely no thought other than to code this thing up and code it up now. It was the classic "Code First, Think Later" problem preceded by "No-Thought Design." I can't stress this enough: you have to get realistic technology assessments and plan your development before you ever turn on the computer.
Proper planning also minimizes one of the biggest bug causers in development: feature creep. Feature creep—the tacking on of features not originally planned—is a symptom of poor planning and inadequate requirements gathering. Adding last-minute features, whether in response to competitive pressure, as a developer's pet feature, or on the whim of management, causes more bugs in software than almost anything else.
Software engineering is an extremely detail-oriented business. The more details you hash out and solve before you start coding, the fewer you leave to chance. The only way to achieve proper attention to detail is to plan your milestones and the implementation for your projects. Of course, this doesn't mean that you need to go completely overboard and generate thousands of pages of documentation describing what you're going to do.
One of the best design documents I ever created for a product was simply a series of paper drawings, or paper prototypes, of the user interface. Based on research and on the teachings of Jared Spool and his company, User Interface Engineering, my team drew the user interface and worked through each user scenario completely. In doing so, we had to focus on the requirements for the product and figure out exactly how the users were going to perform their tasks. In the end, we knew exactly what we were going to deliver, and more important, so did everyone else in the company. If a question about what was supposed to happen in a given scenario arose, we pulled out the paper prototypes and worked through the scenario again.
Even though you might do all the planning in the world, you have to really understand your product's requirements to implement them properly. At one company where I worked—mercifully, for less than a year—the requirements for the product seemed very simple and straightforward. As it turned out, however, most of the team members didn't understand the customers' needs well enough to figure out what the product was supposed to do. The company made the classic mistake of drastically increasing engineering head count but failing to train the new engineers sufficiently. Consequently, even though the team planned everything to extremes, the product shipped several years late and the market rejected it.
There were two large mistakes on this project. The first was that the company wasn't willing to take the time to thoroughly explain the customers' needs to the engineers who were new to the problem domain, even though some of us begged for the training. The second mistake was that many of the engineers, both old and new, didn't care to learn more about the problem domain. As a result, the team kept changing direction each time marketing and sales reexplained the requirements. The code base was so unstable that it took months to get even the simplest user scenarios to work without crashing.
Very few companies train their engineers in their problem domain at all. Although many of us have college degrees in engineering, we generally don't know much about how customers will use our products. If companies spent adequate time up front helping their engineers understand the problem domain, they could eliminate many bugs caused by misunderstood requirements.
The fault isn't just with the companies, though. Engineers must make the commitment to learn the problem domain as well. Some engineers like to think they're building tools that enable a solution so that they can maintain their separation from the problem domain. As engineers, we're responsible for solving the problem, not merely enabling a solution!
An example of enabling a solution is a situation in which you design a user interface that technically works but doesn't match the way the user works. Another example of enabling a solution is building your application in such a way that it solves the user's short-term problem but doesn't move forward to accommodate the user's changing business needs.
When solving the user's problem rather than just enabling a solution, you, as the engineer, become as knowledgeable as you can about the problem domain so that your software product becomes an extension of the user. The best engineers are not those who can twiddle bits but those who can solve a user's problem.
Another significant cause of bugs results from developers who don't understand the operating system, the language, or the technology their projects use. Unfortunately, few engineers are willing to admit this deficiency and seek training. Instead, they cover up their lack of knowledge and, unintentionally, introduce avoidable bugs.
In many cases, however, this ignorance isn't a personal failing so much as a fact of life in modern software development. So many layers and interdependencies are involved in developing software these days that no one person can be expected to know the ins and outs of every operating system, language, and technology. There's nothing wrong with admitting that you don't know something. It's not a sign of weakness, and it won't take you out of the running to be the office's alpha geek. In fact, if a team is healthy, acknowledging the strengths and limitations of each member works to the team's advantage. By cataloging the skills their developers have and don't have, the team can get the maximum advantage from their training dollars. By strengthening every developer's weaknesses, the team will be better able to adjust to unforeseen circumstances and, in turn, broaden the whole team's skill set. The team can also schedule development time more accurately when team members are willing to admit what they don't know. You can build in time for learning and create a much more realistic schedule if team members are candid about the gaps in their knowledge.
The best way to learn about a technology is to do something with that technology. Years ago, when NuMega sent me off to learn about Microsoft Visual Basic so that we could write products for Visual Basic developers, I laid out a schedule for what I was going to learn and my boss was thrilled. The idea was to develop an application that insulted you, appropriately called "The Insulter." Version 1 was a simple form with a single button that, when pressed, popped up a random insult from the list of hard-coded insults. The second version read insults from a database and allowed you to add new insults by using a form. The third version connected to the company Microsoft Exchange server and allowed you to e-mail insults to others in the company. My manager was very happy to see how and what I was going to do to learn the technology. All your manager really cares about is being able to tell his boss what you're doing day to day. If you give your manager that information, you'll be his favorite employee. When I had my first encounter with .NET, I simply dusted off the Insulter idea, and it became Insulter .NET!
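For flavor, here's roughly what Insulter version 1 boiled down to, recast as a console sketch in C++ (the original was a Visual Basic form with a single button, and these insults are invented):

#include <cstdlib>
#include <ctime>
#include <iostream>

int main()
{
    // Version 1: a hard-coded list and a random pick—nothing more.
    const char* insults[] =
    {
        "Your code compiles only by accident.",
        "I've seen better error handling in a fortune cookie.",
        "Your variable names read like a ransom note."
    };
    const int count = sizeof(insults) / sizeof(insults[0]);

    std::srand(static_cast<unsigned>(std::time(nullptr)));
    std::cout << insults[std::rand() % count] << '\n';
    return 0;
}

The value of such a toy isn't the program itself; it's that each successive version forces you to touch one new piece of the technology you're learning.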
I'll have more to say about what skills and knowledge are critical for developers to have in the section "Prerequisites to Debugging" later in this chapter.
Common Debugging Question: Are code reviews worth it?
Absolutely! Unfortunately, many companies go about them in completely the wrong way. One company I worked for required formal code reviews that were straight out of one of those only-in-fantasyland software engineering textbooks I had in college. Everything was role-based: there was a Recorder for recording comments, a Secretary for keeping the meeting moving, a Door Keeper to open the door, a Leader to suck oxygen, and so on. All that you really had, however, were 40 people in a room, none of whom had read the code. It was a huge waste of time.
The kind of code reviews I like are the informal, one-on-one kind. You simply sit down with a printout of the code and read it line by line with the developer. As you read, you keep track of all the inputs and outputs so that you can see what's happening in the code. Think about what I just described. If that sounds perilously close to debugging the code, you're exactly right. Focus on what the code does—that's the purpose of a code review.
Another trick for ensuring that your code reviews are worthwhile is to have the junior developers review the senior developers' code. Not only does that show the less experienced developers that their contribution is valued, but it's also a fine way to teach them about the product and expose them to great programming tips and tricks.
The final reason that bugs exist in projects is, in my opinion, the most serious. Every company and every engineer I've ever talked to has told me that they are committed to quality. Unfortunately, some companies and engineers lack the real commitment that quality requires. If you've ever worked at a company that was committed to quality or with an engineer who was, you certainly know it. They both feel a deep pride in what they are producing and are willing to spend the effort on all parts of development, not on just the sexy parts. For example, instead of getting all wrapped up in the minutia of an algorithm, they pick a simpler algorithm and spend their time working on how best to test that algorithm. The customer doesn't buy algorithms, after all; the customer buys high-quality products. Companies and individuals with a real commitment to quality exhibit many of the same characteristics: careful up-front planning, personal accountability, solid quality control, and excellent communication abilities. Many companies and individuals go through the motions of the big software development tasks (that is, scheduling, coding, and so on), but only those who pay attention to the details ship on time with high quality.
A good example of a commitment to quality comes from my first monthly review at NuMega. First off, I was astounded that I was getting a review that quickly, when normally you have to beg your managers for any feedback. One of the key parts of the review was how many bugs I had logged against the product. I was stunned that NuMega evaluated this statistic as part of my performance review; even though tracking bugs is a vital part of maintaining a product's quality, no other company I had worked at had ever checked something so obvious. The developers know where the bugs are, but they must be given an incentive to enter those bugs into the bug tracking system. NuMega found the trick. When I learned about the bug-count part of my review, you'd better believe I logged everything I found, no matter how trivial. With all the technical writers, quality engineers, development engineers, and managers engaged in healthy competition to log the most bugs, few surprise bugs slipped through the cracks. More important, we had a realistic idea of where we stood on a project at any given time.
Another excellent example from the engineering side is the first edition of this book. The book's companion CD held over 2.5 MB of source code—and that wasn't compiled binaries, just the source code. That's quite a bit of code, and I'm happy to say it's many times more than what you get with most books. What many people don't realize is that I spent over 50 percent of the time on that book just testing the code. People get really excited when they find a bug in the Bugslayer code, and the last thing I want is one of those "Gotcha! I found a bug in the Bugslayer!" e-mails. Although I can't say that the CD had zero bugs, it had only five. My commitment to the readers was to give them the absolute best of my ability. My goal for this edition is fewer than five bugs in its more than 6 MB of source code.
When I was a development manager, I followed a ritual that I'm sure fostered a commitment to quality: each team member had to agree that the product was ready to go at every milestone. If any person on the team didn't feel that the product was ready, it didn't ship. I'd rather fix a minor bug and suffer through another complete day of testing than send out something the team wasn't proud of. Not only did this ritual ensure that everyone on the team thought the quality was there, but it also gave everyone on the team a stake in the outcome. An interesting phenomenon I noticed was that team members never got the chance to stop the release for someone else's bug; the bug's owner always beat them to it.
A company's commitment to quality sets the tone for the entire development effort. That commitment starts with the hiring process and extends through the final quality assurance on the release candidate. Every company says that it wants to hire the best people, but few companies are willing to offer salaries and benefits that will draw them. In addition, some companies aren't willing to provide the tools and equipment that engineers need to produce high-quality products. Unfortunately, too many companies resist spending $500 on a tool that will solve a nasty crash bug in minutes but are willing to blow many thousands of dollars to pay their developers to flounder around for weeks trying to solve that same bug.
A company also shows its commitment to quality when it does the hardest thing there is to do in business—firing people who aren't living up to the standards the organization has set. When you've built a great team full of people on the right-hand side of the bell curve, you have to work to keep them there. We've all seen the person whose chief job seems to be stealing oxygen but who keeps getting the same raises and bonuses as you, even though you're killing yourself working late nights and sometimes weekends to make the product happen. The result is that good people quickly realize the effort isn't worth it. They start slacking off or, worse yet, looking for other jobs.
When I was a project manager, I dreaded doing it, but I fired someone two days before Christmas. I knew that people on the team were feeling that this one individual wasn't working up to standards. If they came back from the Christmas holiday with that person still there, I'd start losing the team we had worked so hard to build. I had been documenting the person's poor performance for quite a while, so I had the proper reasons for proceeding. Trust me, I would rather have been shot at again in the Army than fire that person. It would have been much easier to let it ride, but my commitment was to my team and to the company to do the quality job I had been hired to do. In all, I ended up firing a total of three people on my teams. It was better to go through that upheaval than to have anyone turn off and stop performing. I agonized over every firing, but I had to do it. A commitment to quality is extremely difficult and will mean that you'll have to do things that will keep you up at night, but that's what it takes to ship great software and take care of your people.
If you do find yourself in an organization that suffers from a lack of commitment to quality, you'll find that there's no easy way to turn a company into a quality-conscious organization overnight. If you're a manager, you can set the direction and tone for the engineers working for you and work with upper management to lobby for extending a commitment to quality across the organization. If you're an engineer, you can work to make your code the most robust and extensible on the project so that you set an example for others.
Now that we've gone over the types and origins of bugs and you have some ideas about how to avoid or solve them, it's time to start thinking about the process of debugging. Although many people start thinking about debugging only when they crash during the coding phase, you should think about it right from the beginning, in the requirements phase. The more you plan your projects up front, the less time—and money—you'll spend debugging them later.
As I mentioned earlier in the chapter, feature creep can be a bane to your project. More often than not, unplanned features introduce bugs and wreak havoc on a product. This doesn't mean that your plans must be cast in stone, however. Sometimes you must change or add a feature to a product to be competitive or to better meet the user's needs. The key point to remember is that before you change your code, you need to determine—and plan for—exactly what will change. And keep in mind that adding a feature doesn't affect only the code; it also affects testing, documentation, and sometimes even marketing messages. When revising your production schedule, a general rule to follow is that the time it takes to add or remove a feature grows exponentially the further along the production cycle you are.
In his excellent book Code Complete (Microsoft Press, 1993, pp. 25–26), Steve McConnell discusses the cost of fixing a bug. Fixing a bug during the requirements and planning phases costs very little. As the product progresses, however, the cost of fixing a bug rises exponentially, as does the cost of debugging—much the same scenario as adding or removing features along the way.
Planning for debugging goes together with planning for testing. As you plan, you need to look for different ways to speed up and improve both processes. One of the best precautions you can take is to write file data dumpers and validators for internal data structures as well as for binary files, if appropriate. If your project reads and writes data to a binary file, you should automatically schedule someone to write a testing program that dumps the data in a readable format to a text file. The dumper should also validate the data and check all interdependencies in the binary file. This step will make both your testing and your debugging easier.
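Here's a sketch of what such a dumper and validator might look like, written against an invented binary format: a small header carrying a magic number, a version, and a record count, followed by fixed-size records. The format and field names are purely illustrative; the point is that the tool both dumps the data readably and cross-checks the file against its own header.

#include <cstdint>
#include <cstdio>

#pragma pack(push, 1)
struct FileHeader
{
    uint32_t magic;        // expected magic value for this invented format
    uint16_t version;      // expected: 1
    uint32_t recordCount;  // must match what's actually in the file
};

struct Record
{
    uint32_t id;
    double   value;
};
#pragma pack(pop)

int main(int argc, char* argv[])
{
    if (argc != 2)
    {
        std::fprintf(stderr, "Usage: dumper <datafile>\n");
        return 1;
    }

    std::FILE* file = std::fopen(argv[1], "rb");
    if (!file)
    {
        std::fprintf(stderr, "Cannot open %s\n", argv[1]);
        return 1;
    }

    // Validate the header before trusting anything else in the file.
    FileHeader header;
    if (std::fread(&header, sizeof(header), 1, file) != 1 ||
        header.magic != 0x44415441 || header.version != 1)
    {
        std::fprintf(stderr, "Invalid or corrupt header\n");
        std::fclose(file);
        return 1;
    }

    std::printf("Version %u, %u records expected\n",
                (unsigned)header.version, (unsigned)header.recordCount);

    // Dump every record in readable form, counting as we go.
    Record record;
    uint32_t found = 0;
    while (std::fread(&record, sizeof(record), 1, file) == 1)
        std::printf("Record %u: id=%u value=%g\n", (unsigned)found++,
                    (unsigned)record.id, record.value);

    // Cross-check the actual record count against the header's claim.
    if (found != header.recordCount)
        std::fprintf(stderr, "CORRUPT: header says %u records, found %u\n",
                     (unsigned)header.recordCount, (unsigned)found);
    std::fclose(file);
    return found == header.recordCount ? 0 : 1;
}

Run a tool like this over every file your automated tests produce; a dumper that also validates turns silent file corruption into a loud, early test failure.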
By properly planning for debugging, you minimize the time spent in your debugger, and this is your goal. You might think such advice sounds strange coming from a book on debugging, but the idea is to try to avoid bugs in the first place. If you build sufficient debugging code into your applications, that code—not the debugger—should tell you where the bugs are. I'll cover the issues concerning debugging code more in Chapter 3.
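To give a tiny taste of what that debugging code looks like, here's a sketch of an assertion macro that reports the failing expression, source file, and line number in debug builds and compiles away in release builds. The VERIFY_STATE name and the AverageOrderSize function are invented for illustration; Chapter 3 covers far more capable assertion code.

#include <cstdio>
#include <cstdlib>

#ifdef _DEBUG
#define VERIFY_STATE(expr)                                            \
    do                                                                \
    {                                                                 \
        if (!(expr))                                                  \
        {                                                             \
            std::fprintf(stderr, "ASSERT FAILED: %s (%s:%d)\n",       \
                         #expr, __FILE__, __LINE__);                  \
            std::abort();                                             \
        }                                                             \
    } while (0)
#else
#define VERIFY_STATE(expr) ((void)0)
#endif

double AverageOrderSize(const double* orders, int count)
{
    // The debugging code catches a bad call at its source instead of
    // letting a null pointer or divide-by-zero surface somewhere else.
    VERIFY_STATE(orders != nullptr);
    VERIFY_STATE(count > 0);

    double total = 0.0;
    for (int i = 0; i < count; ++i)
        total += orders[i];
    return total / count;
}

When a bad parameter shows up, the failure message points at the exact check that tripped—precisely the "code tells you where the bug is" behavior you're after.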