Measuring Test Coverage


Recall that coverage is the metric that indicates the level of confidence we can have in our testing.

What Is to Be Covered?

There are a large number of attributes of a system that could be used to measure coverage. Fundamentally, there are two categories of these attributes: inputs and outputs. That is, we can measure how many of the possible inputs have been used in test cases and we can measure how many of the possible outputs that a system can produce have been produced during testing.

Traditionally, coverage has been viewed from the output perspective. Metrics such as the percentage of lines of code covered or the number of alternatives out of a decision that have been exercised are typical. We have worked to cover all uses of the system based on our use case model. This approach works well for establishing that the product does what it is supposed to do: if we can produce all of the required outputs, then the product meets its functional requirements.

As systems have become more integrated into life-critical and mission-critical systems, expectations have increased to include "Does the system do anything it is not supposed to?" Coverage must then be measured in terms of inputs. Are there inputs that can cause catastrophic effects? Our earlier discussions about input decision tables and equivalence classes provide a basis for this. Some possible metrics include the percentage of possible events that can be generated at the interface or the percentage of values within an equivalence class that have been used.
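For instance, input coverage can be tracked by recording which equivalence classes of an input the test cases have actually exercised. The following is a minimal Java sketch of that idea; the velocity input and its equivalence classes are hypothetical, not taken from a specific example in this chapter.

import java.util.EnumSet;
import java.util.Set;

// A minimal sketch (names hypothetical): track which equivalence classes of a
// single input have been exercised by test cases and report input coverage.
public class InputCoverageTracker {

    // Hypothetical equivalence classes for a "velocity" input to an operation.
    enum VelocityClass { NEGATIVE, ZERO, TYPICAL, MAXIMUM, ABOVE_MAXIMUM }

    private final Set<VelocityClass> exercised = EnumSet.noneOf(VelocityClass.class);

    // Classify an actual test input into its equivalence class and record it.
    public void record(int velocity) {
        if (velocity < 0)         exercised.add(VelocityClass.NEGATIVE);
        else if (velocity == 0)   exercised.add(VelocityClass.ZERO);
        else if (velocity < 100)  exercised.add(VelocityClass.TYPICAL);
        else if (velocity == 100) exercised.add(VelocityClass.MAXIMUM);
        else                      exercised.add(VelocityClass.ABOVE_MAXIMUM);
    }

    // Percentage of equivalence classes used by at least one test case.
    public double inputCoverage() {
        return 100.0 * exercised.size() / VelocityClass.values().length;
    }

    public static void main(String[] args) {
        InputCoverageTracker tracker = new InputCoverageTracker();
        tracker.record(-5);   // NEGATIVE
        tracker.record(42);   // TYPICAL
        tracker.record(100);  // MAXIMUM
        System.out.printf("Input coverage: %.0f%%%n", tracker.inputCoverage()); // 60%
    }
}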

When Is Coverage Measured?

Coverage data is collected continuously during testing. In fact, the PACT classes include specific groupings of test cases, the test suites, that are intended to achieve certain types of coverage. Each test case is selected for a reason and can be directly related to covering some aspect of the system. Since development is incremental, coverage figures are updated as additional functionality is added. For example, a use case has a main scenario and then a set of alternative paths, extensions, and exceptions. Usually these various uses are completed gradually over the course of an increment. As test cases are added to cover the various paths through the use case, the degree of coverage increases. The test plan for the use case documents the mapping between test cases and the coverage of the use case they provide.
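One way to keep that mapping current is to record, for each path through the use case, the test cases that cover it, and to recompute the coverage figure as each increment adds paths and test cases. The Java sketch below is illustrative only; the class and path names are hypothetical.

import java.util.LinkedHashMap;
import java.util.Map;

// A minimal sketch (names hypothetical): the test plan's mapping from the
// paths through a use case to the test cases that cover them. Coverage is
// the fraction of known paths with at least one test case.
public class UseCaseCoverage {

    private final Map<String, Integer> testCasesPerPath = new LinkedHashMap<>();

    public void declarePath(String pathName) {
        testCasesPerPath.putIfAbsent(pathName, 0);
    }

    public void addTestCase(String pathName) {
        testCasesPerPath.merge(pathName, 1, Integer::sum);
    }

    public double coverage() {
        long covered = testCasesPerPath.values().stream().filter(n -> n > 0).count();
        return testCasesPerPath.isEmpty() ? 0.0 : 100.0 * covered / testCasesPerPath.size();
    }

    public static void main(String[] args) {
        UseCaseCoverage plan = new UseCaseCoverage();
        plan.declarePath("main scenario");
        plan.declarePath("alternative: invalid move");
        plan.declarePath("exception: connection lost");

        plan.addTestCase("main scenario");
        System.out.printf("After increment 1: %.0f%%%n", plan.coverage()); // 33%

        plan.addTestCase("alternative: invalid move");
        plan.addTestCase("exception: connection lost");
        System.out.printf("After increment 2: %.0f%%%n", plan.coverage()); // 100%
    }
}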

Coverage can change even after a product is released. In particular, it can go down! When a new version of a DLL is released, additional classes may be defined in the DLL and integrated into the application dynamically. The number of possible paths increases and, unless additional test cases are added, the coverage of the application decreases.
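The arithmetic behind this is straightforward: the number of tested paths stays fixed while the dynamically loaded code adds new ones, so the ratio drops. A small illustrative Java sketch, with made-up numbers:

// A minimal sketch of why coverage can drop after release: the number of
// tested paths stays fixed while dynamically loaded code adds new paths.
public class PathCoverage {
    static double coverage(int testedPaths, int totalPaths) {
        return 100.0 * testedPaths / totalPaths;
    }

    public static void main(String[] args) {
        // At release: 90 of the 100 known paths are exercised by test cases.
        System.out.printf("At release: %.1f%%%n", coverage(90, 100));   // 90.0%
        // A new version of a dynamically loaded library adds 20 paths,
        // but no test cases are added for them.
        System.out.printf("After update: %.1f%%%n", coverage(90, 120)); // 75.0%
    }
}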

When Is Coverage Used?

Our answer is "at release time." That is, coverage is part of the decision criteria for a release: we don't release when the promised delivery date arrives; we release when we have achieved our test coverage goals. The system test report relates the level of coverage to the quality of the delivered product as measured by the percentage of defects found after delivery. This document is updated periodically to reflect the latest information.
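In other words, the release decision can be expressed as a simple predicate over the measured coverage figures. The goals in the Java sketch below are purely illustrative; the actual thresholds come from the project's test plan.

// A minimal sketch (thresholds hypothetical): coverage as part of the
// release decision criteria, rather than the promised delivery date.
public class ReleaseGate {
    public static boolean readyToRelease(double useCaseCoverage, double inputCoverage) {
        // Goals would come from the system test plan; these values are illustrative.
        return useCaseCoverage >= 100.0 && inputCoverage >= 85.0;
    }

    public static void main(String[] args) {
        System.out.println(readyToRelease(100.0, 90.0)); // true: goals met, release
        System.out.println(readyToRelease(95.0, 90.0));  // false: keep testing
    }
}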

ODC Defect Impacts

We used the defect triggers defined by ODC earlier in this chapter to select test cases. Now we want to use the defect impacts (see Figure 9.15) to examine how well the important aspects of the system have been tested by the test cases we have selected. The defect impact categories are attributes of the system that are adversely affected by defects. As we analyze the results of a testing session, the question is whether the test set resulted in defects being identified in each of the impact areas. The system test report should list each of these categories and describe how each was evidenced in the test results.

Figure 9.15. ODC defect impacts


This alternative to traditional coverage provides a means of ensuring that tests have covered those attributes that are most important to the success of the application. The difficulty is that in taking this reverse view of the world, we cannot be certain whether the program simply is not going to impact a specific area or whether our testing has not been sufficiently thorough. In Figure 9.15 we provide some additional information for the tester to consider. There is no exact algorithm to follow, but having the list in hand when writing the system test plan and evaluating coverage has proven useful for us.
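One simple way to make that evaluation concrete is to tally, for each impact category, the defects the test set actually surfaced, so that categories with no findings stand out for the "not impacted, or not tested thoroughly enough?" question. The Java sketch below uses the commonly published ODC impact names, which may not match Figure 9.15 exactly.

import java.util.EnumMap;
import java.util.Map;

// A minimal sketch of the bookkeeping behind the system test report: tally
// defects found in each ODC impact area so that areas with no findings
// stand out for review. Category names are the commonly published ODC
// impacts and are assumptions here, not a copy of Figure 9.15.
public class ImpactTally {

    enum Impact { INSTALLABILITY, INTEGRITY_SECURITY, PERFORMANCE, MAINTENANCE,
                  SERVICEABILITY, MIGRATION, DOCUMENTATION, USABILITY,
                  STANDARDS, RELIABILITY, REQUIREMENTS, CAPABILITY }

    private final Map<Impact, Integer> defects = new EnumMap<>(Impact.class);

    public void recordDefect(Impact impact) {
        defects.merge(impact, 1, Integer::sum);
    }

    // List every category, including those with zero findings, so the report
    // prompts the question: not impacted, or not tested thoroughly enough?
    public void printReport() {
        for (Impact impact : Impact.values()) {
            System.out.printf("%-20s %d%n", impact, defects.getOrDefault(impact, 0));
        }
    }

    public static void main(String[] args) {
        ImpactTally tally = new ImpactTally();
        tally.recordDefect(Impact.USABILITY);
        tally.recordDefect(Impact.INSTALLABILITY);
        tally.recordDefect(Impact.INSTALLABILITY);
        tally.printReport();
    }
}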

More Examples

The Java version of Brickles was developed on a PC and then installed on a Unix box. After installation, users reported that the game wasn't working: it went straight to the "All Pucks Gone, You Lose" message. A closer look showed that the game was working properly except that the bitmaps for the paddle and puck were not accessible. The user therefore saw nothing, and each puck dropped straight onto the sticky floor! The developers fixed the problem by making the bitmaps accessible and conducted a deployment test on a clean machine to certify that the fix worked.
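A deployment test for this kind of defect can be as simple as verifying that every image the game needs is resolvable on the installed machine. The resource names in this Java sketch are hypothetical stand-ins for the Brickles bitmaps.

// A minimal sketch (resource names hypothetical) of the kind of deployment
// check that would have caught the missing Brickles bitmaps: verify that the
// images the game needs are actually resolvable on the installed classpath.
public class DeploymentResourceCheck {
    public static void main(String[] args) {
        String[] required = { "/images/paddle.gif", "/images/puck.gif" };
        boolean ok = true;
        for (String name : required) {
            if (DeploymentResourceCheck.class.getResource(name) == null) {
                System.out.println("MISSING resource: " + name);
                ok = false;
            }
        }
        System.out.println(ok ? "All required resources accessible"
                              : "Deployment test FAILED");
    }
}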

The tic-tac-toe distributed program uses the JavaHelp framework. When the program was built on a development machine, the developers did not realize that the Help system does not automatically add itself to the class path. Deployment testing found this defect during tests designed to cover all events, including menu selections and mouse move events.
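A corresponding deployment check, sketched below, simply confirms that a JavaHelp class can be loaded on the installed machine; if it cannot, the help-related menu events will fail just as they did here.

// A minimal sketch of a deployment check for the tic-tac-toe help defect:
// confirm that the JavaHelp classes are on the class path of the installed
// program rather than only on the development machine.
public class HelpSystemCheck {
    public static void main(String[] args) {
        try {
            Class.forName("javax.help.HelpSet");
            System.out.println("JavaHelp is on the class path");
        } catch (ClassNotFoundException e) {
            System.out.println("Deployment test FAILED: JavaHelp not on the class path");
        }
    }
}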


