Section 7.3. Bug Tracking Annoyances | Practical Development Environments

7.3. Bug Tracking Annoyances

There are a number of problems with using bug tracking systems that seem to occur regardless of which system is used. Some of these annoyances are discussed in this section.

7.3.1. Multiplying Products

Most of a bug's information has simple values: a string of text to describe the problem, the name (chosen from a list) of one person assigned to the bug, and so on. If a field holds only one value at a time, things are simple. Everything becomes more complicated when one field can have multiple values at the same time.

A good example of this is the field in a bug that is typically named something like Product, which is used to indicate which of a project's products are affected by the bug. A sensible default value for this field might be All, since that may well be true. However, when the bug affects just a few of the products, the obvious thing to do is to have the field contain multiple values. This could be a string with comma-delimited values, or it could be the input from an HTML form. Regardless of how the information is actually stored, the problem is that the number of choices that can be made in reports and other queries has just increased dramatically.

For instance, if there are versions of the product for Windows, GNU/Linux, and Macintosh, and a bug may affect one, two, or all three of them, then each of your reports has to select some subset of these choices. When a version for the next release of Windows is added, all the reports become out-of-date, because they don't include the new version.

Fields with multiple values always complicate the experience of using the tool, and the more fields that there are with multiple values, the more complex it all becomes. If possible, avoid fields with multiple values since they tend to make writing useful reports much harder.

In case the above seems overstated, imagine a bug tracking tool with just three fields in its bugs: Owner, Description, and Product. Owner is a single-value field (because only one person owns the bug at a time), Description is a text string, and Product is the multivalued field representing the different (but related) products that are affected by a bug. Each bug is actually a point in three dimensions: the Owner is along one axis, the Description along another, and the Product on the third. If the Owner and Description stay the same, and the Product value is All, then the bug has a definite point in space. The number of values the Owner coordinate can take is just the number of users, which only increases by one for each new user. The number of values that the Description can take is huge, being the number of ways to write a few paragraphs, so we ignore it except for keywords when trying to run reports. However, the number of different values that the Product coordinate can take with n products is 2^n (imagine a bitmap with a 1 for each affected product). Adding another product increases the number of choices by 2^n; that is, it doubles them.
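To make the counting argument concrete, here is a small sketch that enumerates every possible value of a multivalued Product field. The product names are only illustrative (the fourth one is made up), and the empty selection is counted as one of the bitmap values.

```python
from itertools import combinations

def product_selections(products):
    """Return every subset of products, from the empty set up to all of them."""
    subsets = []
    for size in range(len(products) + 1):
        subsets.extend(combinations(products, size))
    return subsets

three = product_selections(["Windows", "GNU/Linux", "Macintosh"])
# "Windows-next" is a hypothetical fourth product added later.
four = product_selections(["Windows", "GNU/Linux", "Macintosh", "Windows-next"])

print(len(three), len(four))  # 8 16
```

One extra product takes the field from 8 possible values to 16, and every report that filters on the field has to cope with the new combinations.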

So if multivalued fields can complicate your bug tracking tool, what can make it simpler? Good pattern-matching capabilities and text fields are a start. Suppose you want to select all bugs found in a series of releases such as 3.1.1, 3.1.2, and 3.1.3. Using a pattern such as 3.1.* makes this easy. Having to select each value in turn is the hard way, and when 3.1.4 is released, the report becomes incorrect. A good bug tracking tool lets you use patterns and Boolean operators to create complex reports flexibly.
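As a rough illustration of the difference, the following sketch selects bugs by a release pattern instead of by a fixed list of values. It assumes bugs are available as dictionaries with a hypothetical release_found key, and it uses shell-style wildcards from Python's standard fnmatch module; a real tool will have its own pattern syntax.

```python
from fnmatch import fnmatchcase

def bugs_matching(bugs, release_pattern):
    """Return the bugs whose release matches a shell-style pattern such as 3.1.*"""
    return [bug for bug in bugs if fnmatchcase(bug["release_found"], release_pattern)]

# Hypothetical bug records; 3.1.4 stands for a release added after the report was written.
bugs = [
    {"id": 101, "release_found": "3.1.1"},
    {"id": 102, "release_found": "3.1.4"},
    {"id": 103, "release_found": "3.2.0"},
    {"id": 104, "release_found": "3.10.1"},
]

matched = bugs_matching(bugs, "3.1.*")
print([bug["id"] for bug in matched])  # [101, 102]
```

The report picks up 3.1.4 without being edited, while a report built from an explicit list of 3.1.1, 3.1.2, and 3.1.3 would have missed it.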

7.3.2. Cleaning Up

There are two parts to keeping a bug tracking system clean. The first part is the process as designed of adding new bugs, assigning them, fixing them, testing the changes, and marking the bugs as Closed. There may be some tendency to delay closing bugs while they are being tested more rigorously, but eventually most bugs that are marked as Fixed are marked as Closed, and that can be considered their final state. Some bugs will be marked as Deferred, meaning either that the effort to fix them is too great for the time and people available, or that no one understood the bug and more information needs to be gathered. (Or maybe the bug is actually a protest about some part of the product and closing it would be tantamount to telling someone their opinion is being ignored; see Section 12.4.)

The second kind of cleanup of a bug tracking system has to do with the available choices for each field. This is sometimes known as the bug tracking system's metadata. If there is a field for the build in which the bug was found, then every time a new build is created, the new build has to be added to the choices for that field. Doing this automatically as part of the build process is a good idea. The alternative is to require people to enter the build information by hand, which may well lead to more data-entry errors.

Using a small set of valid choices for a field on a screen makes life easier for people. However, as more builds occur, a small set of choices becomes a large set of choices, which is now not at all convenient for the user. More than a few dozen choices seems tedious to most people searching for an entry in a list. One solution is to keep only the last dozen values visible but still allow people to enter text (which can be validated separately). Maintaining the current list of values is something that needs to be done automatically, either as part of the build process or by a separate task that is run regularly. Reducing the number of choices also needs to be an automatic process, though some care is needed to retain the choices for any builds that have been released to customers.
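One possible way to trim the list automatically is sketched below. It assumes the build labels are available oldest first and that the set of builds released to customers is known; the function name and the label format are invented for the example.

```python
def visible_builds(all_builds, released_builds, keep_recent=12):
    """Pick the build labels to show in a selection list.

    all_builds is ordered oldest first. The most recent keep_recent builds
    are kept, plus any build that was released to customers.
    """
    recent = set(all_builds[-keep_recent:]) if keep_recent > 0 else set()
    keep = recent | set(released_builds)
    # Preserve the original ordering in the result.
    return [build for build in all_builds if build in keep]

# Hypothetical data: thirty nightly builds, one of which was shipped.
builds = [f"build-{number:03d}" for number in range(1, 31)]
choices = visible_builds(builds, released_builds={"build-005"})

print(len(choices), choices[0], choices[-1])  # 13 build-005 build-030
```

A task like this could run at the end of each build, so the list never needs trimming by hand.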

When a developer leaves the project, what happens to the bugs she owned? If the developer is marked as somehow inactive, can you still search for her bugs and reassign them? How does the developer's name show up in historical reports? These questions are all part of cleaning up the data in a bug tracking system.

One last warning is about the effect that attaching large files to bugs can have. Many bug tracking tools allow people to attach files to a bug, which is useful for storing long text logs or screenshots. If no limit is even hinted at by the tool, then people start to treat this capability as some kind of magic filing system. Whole core dumps, CD or DVD images, and other enormous files are attached without a thought about the tool's underlying database or filesystem. The consequence is that the tool's performance becomes unacceptably slow. Regular automated checks of the sizes of attachments are a good idea. Some databases and filesystems may handle large datafiles better than others, but none of them are invulnerable to this particular problem.
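A regular check of attachment sizes might look something like the following sketch. It assumes that the tool stores attachments as ordinary files under a single directory, which is not true of every tool (some keep them inside the database), and the size limit is an arbitrary choice.

```python
import os

def oversized_attachments(root, limit_bytes):
    """Return (path, size) pairs for files under root larger than limit_bytes,
    largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > limit_bytes:
                found.append((path, size))
    return sorted(found, key=lambda item: item[1], reverse=True)

# Example use, with a hypothetical attachment directory and a 50 MB limit:
#   for path, size in oversized_attachments("/var/bugtracker/attachments", 50 * 1024 * 1024):
#       print(f"{size:>12} {path}")
```

Run from a scheduled job, a report like this gives an early warning before performance suffers.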

7.3.3. One Bug, Multiple Releases

One of the more complicated pieces of information associated with a bug is the group of different releases that the bug exists in. For example, a bug might be discovered in Version 2.1 of the product, but Version 2.2 has already shipped and Version 3.0 is about to be released. Does the bug exist in all three versions? How should this be recorded in the bug tracking system? Notice that we're referring to releases here, not branches in an SCM tool, since a bug tracking system should not have to be aware of how a release was created.

The simplest way to handle this is to leave the information about the affected releases out of the bug tracking system. For each release, keep a spreadsheet of bugs that might be in the release and coordinate with the development team to confirm when the bug is fixed. This approach is tedious and is prone to error, but is not uncommon on smaller projects.

Another simple way to handle this problem is to make two extra copies of the original Version 2.1 bug and then change the value for the Release Found field in each of the two copies to 2.2 and 3.0, respectively. Each of the copies will have its own unique bug identifier. This approach has the advantage that the bug count for each release is more likely to be correct, and some bug tracking tools support duplicating bugs very well. One disadvantage is that information has to be added to the original and to each of the copies about their connections with each other; otherwise, information will invariably be added to just one copy and not the others. The main disadvantage is that developers, product managers, and customers find it hard to keep track of which bug was fixed in which release. If a customer has been told that the bug he reported is number 12345, then he expects to see that bug number in the release notes when his bug is fixed.

A different approach that is sometimes taken is to add multiple instances of the fields that are affected, and then use one instance per release. For instance, a bug might have fields named Release Fixed 1, Release Fixed 2, and Release Fixed 3, and then each field would be set to 2.1, 2.2, and 3.0 in our example bug. Other fields such as the status and who the bug is assigned to can also be treated in this way. This approach is equivalent to duplicating just the affected fields of a bug and can record the information correctly. The big drawback is that all the reports now have to use much more complicated queries: "Show me all my Open bugs" becomes "Show me all bugs that are Open in one or more of these fields." From experience, I recommend avoiding this approach.
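The extra complexity shows up even in a toy version of the query. In this sketch the field names status, status_1, status_2, and status_3 are hypothetical stand-ins for a single Status field and for its per-release copies.

```python
# Hypothetical per-release copies of the Status field.
STATUS_SLOTS = ("status_1", "status_2", "status_3")

def is_open(bug):
    """The simple query: one Status field per bug."""
    return bug["status"] == "Open"

def is_open_in_any_release(bug):
    """The same question once Status has been split into one field per release."""
    return any(bug.get(slot) == "Open" for slot in STATUS_SLOTS)

simple_bug = {"id": 12345, "status": "Open"}
split_bug = {"id": 12345, "status_1": "Closed", "status_2": "Open", "status_3": "Closed"}

print(is_open(simple_bug), is_open_in_any_release(split_bug))  # True True
```

Every report that mentions a duplicated field needs the second form, and all of them need editing again when a fourth release appears.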

Some bug tracking tools claim to support adding multiple releases to a bug, but then their support for reports using the multiple release values is often not as robust as might be expected. Generally, keeping track of bugs in multiple releases of a product is hard to do automatically and is not well supported by existing bug tracking tools.

7.3.4. Severity Inflation

Many bugs have fields to indicate how serious the bug is. One common series of values goes like this: Severity 1 means "The bug stops the product, and no workaround is possible"; Severity 2 means "The bug stops the product, but a workaround is possible." Severity 3 and 4 bugs are defined as meaning "The bug breaks a minor part of the product" and "The bug is cosmetic or an irritation," respectively. There may be one Severity field for the customer and a similar field for the engineering organization of a project.

In most companies and projects, limited resources mean that as the ship date for a release approaches, only bugs with Severity 1 and 2 get fixed; the others are closed or deferred. Over time this practice leads to severity inflation. Someone entering a bug knows that this bug won't stop the product, but she remembers that none of her Severity 3 bugs got fixed last time and she really wants this one fixed, so she makes it a Severity 2. In the extreme, by a process of induction, all bugs become Severity 1 bugs and the purpose of the field is lost.

One approach to avoiding such severity inflation is to have a small group of people who understand both the source of the product and the marketing and sales requirements assign the severity value for each bug. Of course, this kind of group has so many different outlooks that it is hard to make it work well. Along with having someone other than the originator of the bug decide on its severity, another approach is to use votes. Each developer or customer has some limited number of votes that can be cast for different bugs. This at least gives a sense of which bugs people care enough to vote for (not that software development is usually run as a democracy). It also goes some way to discouraging duplicate bugs that were entered just to add weight to an issue.
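A minimal sketch of such a voting scheme follows. The class name, the number of votes per person, and the bug numbers are all invented for the illustration; tools that support voting will have their own rules.

```python
from collections import Counter

class VoteBox:
    """Give each person a limited number of votes to spend on bugs."""

    def __init__(self, votes_per_person):
        self.votes_per_person = votes_per_person
        self.used = Counter()   # person -> votes already cast
        self.tally = Counter()  # bug id -> votes received

    def vote(self, person, bug_id):
        """Record a vote; return False if the person has none left."""
        if self.used[person] >= self.votes_per_person:
            return False
        self.used[person] += 1
        self.tally[bug_id] += 1
        return True

box = VoteBox(votes_per_person=2)
box.vote("alice", 12345)
box.vote("bob", 12345)
box.vote("alice", 777)
accepted = box.vote("alice", 888)  # alice has already used both votes

print(accepted, box.tally.most_common(1))  # False [(12345, 2)]
```

Because votes are scarce, spending one is a stronger signal than raising a severity value, which costs the submitter nothing.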

A Severity field for customers is even more prone to distortion than one for internal use. A Severity 1 bug may mean "I need an answer right now," "This bug is critical to our continued use of your product," or "I guess I have to keep making this bug more severe to get your attention!" That's altogether too much information squashed into one field. Having Urgency, Criticality, and Irritation fields might work ("Please indicate on a scale of 1 to 10 just how angry you are right now!"), but these may in effect just be knobs that do nothing but give you something to fiddle with (like the button to close the doors in an elevator).

One last thought: when you change the value of the customer's Severity field, it is always a problem. If you increase the severity, the customer worries whether the problem is part of a bigger issue. If you decrease the severity, you seem to be minimizing his distress. If you provide this kind of field for customers, let them change the values themselves.

7.3.5. Identifying the Right Area

Imagine entering a new bug on your first day at a new job. You have some idea of what the product you are using is called. Then there's a field in the bug named something like Area, Feature, Module, or Section and it has a dozen or more choices to indicate which part of the product the problem is in. If you're lucky, the choice is obvious, but these choices often seem vaguely named. Then it turns out that making a choice is a requirement for entering the bug. So you choose the catchall choice General, hoping that someone else will make a better choice than you for this bug later on. "There has to be a better way," you think.

There are some better ways. One way is simply to not require that the choice be made when submitting a bug. It's often wrong anyway, just a guess from some piece of text printed out in a logfile at around the same time that the bug occurred. Let the assigned engineer work out which part of the product it belongs to.

Another way is to eliminate the dreaded General and Miscellaneous choices, since they usually mean no choice at all. The most helpful thing is to have an easily accessible glossary of all the choices and their intended meanings available from where the bug is entered. An HTML link to such a page, or floating help text, or even just the local location of a glossary file are all better than guessing. Simple, concrete names and a small number of choices will also help everyone concerned.

7.3.6. Customizing the Bug Tracking System

The customizations discussed in this section are changes in the way a bug tracking system is configured. All these changes usually require some administrative privileges, and descriptions of any changes that were made should appear in audit logs.

One common customization is changing the states that a bug can be in, in order to make them better fit the project's existing workflow. The administrator should be able to add and remove states, and the tool should ask you how to deal with bugs in removed states. The transitions from state to state should also be customizable.

Defining Your States

A project's workflow is often hard to define clearly, usually because everyone seems to have different opinions about what should be a state, what should be entered in some other field or fields, and what is simply irrelevant. Some ideas to help when you're defining all the states that a bug can be in are:

Think first about the reports that are most important for the project. Create examples of these reports. Then design the workflow and states so that the reports are easy to generate.
Minimize the number of possible states. Open and Closed are two good ones to begin with. Many projects work well with only a few more states.
Walk through not only the expected workflow, but also cases where bugs are moved to states by mistake. Make sure that users can recover from such mistakes without an administrator's help.
Display a one-page diagram of the workflow's states and transitions somewhere that is easy to find when using the bug tracking system. Make sure that the diagram is updated when changes occur.
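The walk-through of mistaken moves can be partly automated. The sketch below describes a workflow as a table of allowed transitions and reports any state from which a bug cannot get back to Open. The state names come from this chapter, but the particular transitions are assumptions made for the example.

```python
# Allowed transitions: state -> states it may move to (an assumed workflow).
TRANSITIONS = {
    "Open": {"Fixed", "Deferred", "Returned"},
    "Fixed": {"Closed", "Open"},
    "Deferred": {"Open"},
    "Returned": {"Open"},
    "Closed": {"Open"},
}

def can_move(transitions, current, target):
    """Is a direct move from current to target allowed?"""
    return target in transitions.get(current, set())

def reachable(transitions, start):
    """Return every state that can be reached from start, including start."""
    seen = {start}
    pending = [start]
    while pending:
        state = pending.pop()
        for following in transitions.get(state, ()):
            if following not in seen:
                seen.add(following)
                pending.append(following)
    return seen

def states_without_recovery(transitions, home="Open"):
    """List states from which a bug can never return to the home state."""
    return sorted(s for s in transitions if home not in reachable(transitions, s))

print(states_without_recovery(TRANSITIONS))  # []
```

If Closed had no outgoing transitions, it would be reported here, which is a hint that a bug closed by mistake would need an administrator to rescue it.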

The administrator of the system should be able to change the name that users of the system see for each field. Here's a true story: in one company I worked at, we changed the name of the state where more information on a bug was required from Rejected to Returned, because people don't like feeling rejected. This was easy to do only because the text displayed was not the name of the field in the underlying database. An underlying database may have length and character restrictions on field names, and the name of a field (or the names of choices for values in a field) may change over time.
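The separation between stored names and displayed names can be as simple as a lookup table. In this sketch the internal identifiers are hypothetical; only the Rejected-to-Returned rename comes from the story above.

```python
# Internal (stored) state names mapped to the labels that users see.
# The internal names are invented for this example.
STATE_LABELS = {
    "st_open": "Open",
    "st_rejected": "Returned",  # was displayed as "Rejected" before the rename
    "st_closed": "Closed",
}

def label_for(internal_name):
    """Return the display label, falling back to the stored name itself."""
    return STATE_LABELS.get(internal_name, internal_name)

print(label_for("st_rejected"))  # Returned
```

Changing what users see means editing one entry in the table; the stored data and any queries written against the internal names are untouched.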

Adding a new choice for an existing field is a regular activity and should be possible from the command line, so that automated environments can do this as part of nightly builds. Adding a new field happens less frequently, but does occur as a project changes. What existing bugs show as values in the new field depends on the system. Some systems allow default values or have other ways to update the field's value in a bulk operation.

Removing values from a field or removing fields is harder. One approach for values is to make the values read-only and not selectable in a pop-up list. Deleted fields are usually not shown at all, even in historical reports. A more accurate bug tracking system would show how fields were added and removed when displaying the history of a bug. Merging and splitting fields generally requires access to the underlying database and has all the same problems as a combination of adding or removing fields. In some systems, unpredicted things happen if you add a field, delete it, and then want to add it back with the same name. So don't do that.

7.3.7. Overloading Fields

One way in which bug tracking systems commonly grow is through the addition of new kinds of information. Say you want to record the name of the customer who reported a bug, but there is currently no specific field in the bug for this. The information tends to get recorded anyway, usually in an existing text field. If the bug tracking tool supports searching text for keywords, this is often good enough.

When this information becomes more important than having a vague sense that knowing which customer reported a bug might be useful, there is a temptation to keep adding the information just as before, maybe with some little separator characters to show that this information is different from the rest of the field. This is bad for your bug data in the long run. All sorts of data gets merged into a small number of fields. A common example is when severity inflation (see Section 7.3.4, earlier in this chapter) has occurred and someone needs to quickly identify the bugs to be fixed for the next release. He goes in, edits a convenient field such as the bug's description, and adds text such as "MUST!:" or "__VITAL__:" to the description. Very ugly.

Getting rid of this kind of quick workaround later on is often harder than it might seem. Many bug tracking tools support search but not search-and-replace in their text fields. New bugs get added with misspellings of the text, and soon it's not clear to anyone when they should or should not add the cryptic message. The answer to this problem is simple: if your bug's fields don't do what you need them to do, modify the tool so that the fields do the right thing. Add a new field for new types of information, though if it's unlikely to be used by many others, you might want to put it far down on a screen where it won't distract from existing fields. If you are tempted to overload fields, first consider the time that someone will have to spend to remove that ugly workaround when the proper field is created later on.
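To give a sense of the cleanup work involved, here is a sketch that strips a "MUST!:" style marker from a description so that the information can be moved into a proper field. The pattern and the must_fix field are assumptions; real data will contain variants that this pattern does not anticipate.

```python
import re

# Tolerates a missing "!", mixed case, and stray spaces around the colon.
MARKER = re.compile(r"^\s*MUST!?\s*:\s*", re.IGNORECASE)

def split_marker(description):
    """Return (marked, cleaned_description) for one bug description."""
    match = MARKER.match(description)
    if match is None:
        return False, description
    return True, description[match.end():]

def migrate(bug):
    """Copy the marker into a hypothetical must_fix field and clean the text."""
    marked, cleaned = split_marker(bug["description"])
    return {**bug, "description": cleaned, "must_fix": marked}

print(migrate({"id": 42, "description": "must : Crash when saving"}))
# {'id': 42, 'description': 'Crash when saving', 'must_fix': True}
```

Even this small script has to guess at the misspellings people used, which is a fair measure of what the original shortcut ends up costing.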

7.3.8. Bug History

One of the marks of an adequate bug tracking system is the variety of reports that can be easily constructed with it. A good bug tracking system will let you generate such reports with the added dimension of time. Asking, "How many bugs are there?" is sometimes not as useful as asking, "Is the number of bugs increasing or decreasing?" or "How long do most bugs take from being reported to getting fixed?"

Some bug tracking systems do not support such historical reports; they can only recommend that you run your reports regularly, collect the information somewhere else (such as in a spreadsheet), and then create your own historical reports. This approach is particularly vulnerable to errors. Correcting something in the way the report is created can invalidate all the prior results. It's much better to be able to generate historical reports from within the bug tracking system itself.
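When the system records the dates on which each bug was opened and closed, a trend can be recomputed from the bug data at any time instead of being collected by hand. The following sketch assumes hypothetical opened and closed fields and invented dates.

```python
from datetime import date

def open_count_on(bugs, day):
    """Count the bugs that were open at the end of the given day."""
    return sum(
        1
        for bug in bugs
        if bug["opened"] <= day and (bug["closed"] is None or bug["closed"] > day)
    )

# Invented example data; closed is None while a bug is still open.
bugs = [
    {"id": 1, "opened": date(2024, 1, 1), "closed": date(2024, 1, 10)},
    {"id": 2, "opened": date(2024, 1, 5), "closed": None},
    {"id": 3, "opened": date(2024, 1, 8), "closed": date(2024, 1, 9)},
]

days = [date(2024, 1, 4), date(2024, 1, 8), date(2024, 1, 12)]
trend = [open_count_on(bugs, day) for day in days]
print(trend)  # [1, 3, 1]
```

Because the counts are derived from the stored dates, fixing a mistake in the report does not invalidate earlier results; the whole series is simply regenerated.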

Another important feature is the ability to see the history of a particular bug in full detail: what changed, when did it change, and who changed it? Different systems show the addition and removal of fields in different ways.

7.3.9. Bug Statistics

The state of software engineering being what it is, any means to measure progress toward a goal is eagerly seized upon. Bug tracking systems can provide a comforting sense of tangible data about a project. This is true only in a very general sense. There are generally no precise guidelines in a project about when a bug should or should not be created, and there are good reasons why developers aren't paid by the number of bugs they fix! Still, the science of statistics is designed to handle this kind of uncertainty, and statistical analysis can be applied to bug tracking systems.

Some examples of questions with statistical answers that people like to ask about bugs are:

How long is a bug usually active?
How long does it take to get bug fixes retested?
How many bug fixes introduced new bugs?
How many bugs were found in each area of the product?

These are all interesting statistics, and sometimes they may be useful for warning a project when something is going wrong with its development process. However, there are also a few good reasons to be wary of relying too much on statistics from bug tracking systems:

"Bug sweeps" (meetings where many bugs are reassigned or deferred) can cause large discontinuities in statistical trends. There will be no more nice bell-shaped curves after one of these meetings.
Bugs are subjective, so one tester might submit many more bugs than another tester. This may not mean that their respective areas really have any difference in the quality of source code.
As noted in Section 7.3.4, earlier in this chapter, severity and priority values tend to become more urgent over time, even if the product's quality remains the same.
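With those cautions in mind, two of the questions listed earlier can be answered with very little code once the bug data has been exported. The sketch below assumes hypothetical opened, closed, and area fields. It uses the median rather than the mean because a handful of very old bugs would otherwise dominate the result.

```python
from collections import Counter
from datetime import date
from statistics import median

def median_days_active(bugs):
    """Median number of days between opening and closing, for closed bugs only."""
    durations = [
        (bug["closed"] - bug["opened"]).days
        for bug in bugs
        if bug["closed"] is not None
    ]
    return median(durations) if durations else None

def bugs_per_area(bugs):
    """Count how many bugs were filed against each area of the product."""
    return Counter(bug["area"] for bug in bugs)

# Invented example data.
bugs = [
    {"opened": date(2024, 1, 1), "closed": date(2024, 1, 10), "area": "Installer"},
    {"opened": date(2024, 1, 8), "closed": date(2024, 1, 9), "area": "Editor"},
    {"opened": date(2024, 1, 2), "closed": date(2024, 1, 6), "area": "Editor"},
    {"opened": date(2024, 1, 5), "closed": None, "area": "Editor"},
]

print(median_days_active(bugs))           # 4
print(bugs_per_area(bugs).most_common())  # [('Editor', 3), ('Installer', 1)]
```

The numbers are only as meaningful as the data behind them: a count of three bugs in one area may say more about who tests that area than about the quality of its code.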

A still more theoretical approach is to create statistical models of the number of bugs in GNU/Linux and Mozilla, as a couple of Oxford physicists named Challet and Du did in their paper "Closed Source Versus Open Source in a Microscopic Model of Software Dynamics" (http://tiago.com/doc/bug_dynamics.pdf). Note that the appearance of this reference in a book that has the word practical in its title is meant only as a diversion from the business of doing useful things with your own bug tracking system.

7.3.10. Writing an Effective Bug Report

The three key points to bear in mind when creating a bug report should be:

How to reproduce the bug, as precisely as possible, and how often this will make the bug appear
What should have happened, at least in your opinion
What actually happened, or at least as much information as you have recorded

Many applications can generate a textual description of how they have been installed and configured. If such a description is available, you should always add it to the bug. The application may also contain some tests to check that it is still configured correctly. If so, you should run the tests and attach their results too.
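A submission form or script can check for these parts before a bug is accepted. The following sketch is one way to do that; the field names are invented, and whether the configuration description should be required is a choice for each project.

```python
from dataclasses import dataclass

@dataclass
class BugReport:
    """The parts of a bug report discussed above (field names are hypothetical)."""
    steps_to_reproduce: str = ""
    how_often: str = ""
    expected: str = ""
    actual: str = ""
    configuration: str = ""  # optional: output of the application's own description

    def missing_parts(self):
        """Name the key parts that have been left empty."""
        required = {
            "how to reproduce the bug": self.steps_to_reproduce,
            "how often it appears": self.how_often,
            "what should have happened": self.expected,
            "what actually happened": self.actual,
        }
        return [label for label, value in required.items() if not value.strip()]

report = BugReport(
    steps_to_reproduce="Open a file, choose Save As, press Enter with no name.",
    expected="An error message asking for a filename.",
    actual="The application exits without a message.",
)
print(report.missing_parts())  # ['how often it appears']
```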

As well as providing correct and useful information in the bug, it's important to check that you behave as expected for the project. This is especially true for open source projects. Maybe you should ask questions first on a users' mailing list before escalating the issue to a developers' mailing list? Or you may mark a bug as maximum priority, because it's stopping your work, only to see it downgraded because no one else is blocked by that bug. Etiquette is important, and imperious commands to "fix this bug immediately" rarely help anything.

One common cause of frustration with bug tracking systems is related to how their information is added. Bugs are too often added with vague descriptions, missing information, premature conclusions, or a best guess at the real build label (see Section 3.5). A classic situation is when a complex program has a bug deep inside it, but the only error message that is visible is one from some unrelated area. That area often gets bugs from the deeper level assigned to it, much to the frustration of the developer responsible for it. Adding some good local documentation to wherever people add new bugs can go a long way to improving the quality of all the bugs.

Some useful documents with general advice about creating bugs include the Mozilla Project's "Bug Writing Guidelines" (http://www.mozilla.org/quality/bug-writing-guidelines.html) and Simon Tatham's "How to Report Bugs Effectively" (http://www.chiark.greenend.org.uk/~sgtatham/bugs.html).

There are also many examples of more site-specific documents about writing bug reports; these contain product-specific information such as descriptions of what each part of the product does. Two of these are Opera's "Guidelines for Filing Good Bug Reports" (http://www.opera.com/support/bugs/guidelines) and FreeBSD's "Writing FreeBSD Problem Reports" (http://www.freebsd.org/doc/en_US.ISO8859-1/articles/problem-reports/article.html). Slightly off-topic, but also useful, is Eric S. Raymond and Rick Moen's classic article "How to Ask Questions the Smart Way" (http://www.catb.org/~esr/faqs/smart-questions.html). This document has some excellent reminders and strong opinions about how to interact well with groups of technical volunteers.

If you cut and paste text directly from some applications, you may find that non-ASCII characters appear in the bug's text. This can often be seen with text from Microsoft Word documents that use smart quotes. The end result looks ugly and is sometimes confusing, so it's worth checking what a new bug's text fields look like after you have submitted it.
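If you want to check text before pasting it into a bug, a few lines of code can point out the characters in question. This sketch lists non-ASCII characters and replaces the four common smart quote characters with plain ones; it does not try to handle every typographic character that a word processor may insert.

```python
# Curly quotes mapped to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}

def non_ascii_characters(text):
    """Return (position, character) for each non-ASCII character in text."""
    return [(index, char) for index, char in enumerate(text) if ord(char) > 127]

def plain_quotes(text):
    """Replace smart quotes with plain quotes, leaving other text alone."""
    return text.translate(str.maketrans(SMART_QUOTES))

pasted = "The dialog says \u201cFile not found\u201d"
print(non_ascii_characters(pasted))  # [(16, '“'), (31, '”')]
print(plain_quotes(pasted))          # The dialog says "File not found"
```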