Applying Risk Analysis to the Inventory

During the planning phase, we want to be able to produce quick estimates of the number and type of tests that we will need to perform. I use a method that yields a quick estimate of the MITs for this purpose. I then use this estimated MITs total in the sizing worksheet to estimate the optimum time frame and the resources required for the test effort.

Part of the payback for a good risk analysis is that you pick the best tests, perform fewer tests, and get a higher return (higher bug find rate) on them. When I begin constructing test cases, I examine the paths and data sets in greater detail. I will use the more rigorous approach to calculate the MITs total.

In my spreadsheet-based inventory, a sizing worksheet begins with the inventory items already discussed in Chapters 8 and 9. In this next step, we will add the inventory risk analysis and the test coverage estimates that it yields. Once you enter an inventory item or a test item into one of the spreadsheet inventories, its name, average rank, and the number of tests associated with it are carried automatically to all the worksheets. So, for example, if you change the rank of an item or the number of tests associated with it, then the number of MITs, the recommended test coverage, and all the estimates for resource requirements are automatically updated for you. You can download this sample spreadsheet, which I developed for the Testers Paradise Application Version 2.0, to help you get started, at www.testersparadise.com.
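The mechanics are easy to reproduce outside a spreadsheet. Below is a minimal Python sketch of the same idea (the class and field names are my own, not part of the worksheet): each inventory row carries its test count and whatever ranking criteria were filled in, and the average rank is derived from them.

```python
# Minimal sketch (hypothetical names): an inventory row that derives its
# average rank from whichever ranking criteria have been filled in.
from dataclasses import dataclass, field

@dataclass
class InventoryItem:
    name: str
    tests_identified: int                      # T: tests counted for this item
    ranks: dict = field(default_factory=dict)  # e.g. {"S": 1, "P": 2, ...}

    @property
    def avg_rank(self) -> float:
        # Blank criteria are simply absent, so they do not skew the average
        return sum(self.ranks.values()) / len(self.ranks)

item = InventoryItem("Method of Payment (Path)", tests_identified=11,
                     ranks={"S": 1, "P": 2, "C": 2, "C/CS": 1, "R": 2})
print(f"{item.avg_rank:.2f}")  # 1.60, matching the AVGR column in Table 10.1
```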

The Test Estimation Process

Consider the information in Table 10.1. This is the first page of the spreadsheet.[1] It is the ranking worksheet, which contains the project information, the preliminary high-level inventory, the number of tests for each item, and the rank of each item. In this worksheet, the ranking criteria are S for severity, P for probability, C for cost, C/CS for cost to customer service, and R for required.

Table 10.1: Sample Preliminary Inventory with Ranking

Tester's Paradise (Release 2.0)

| Inventory Items | T (#Tests Identified) | S | P | C | C/CS | R | AVGR |
|---|---|---|---|---|---|---|---|
| Bug Fix Information | | | | | | | |
| Fix For Error #123 (see req. B477) | 7 | 1 | | | | 1 | 1.00 |
| Fix for Error #124 (see req. B501) | 4 | 3 | 4 | 3 | | | 3.33 |
| New Function | | | | | | | |
| New Menu Option View Mini Clip (see req. D071 & D072) | 4 | | | | | 1.6 | 1.60 |
| Arrange Payment (Path) | 5 | 1 | 2 | 1 | 1 | | 1.25 |
| Method of Payment (Path) | 11 | 1 | 2 | 2 | 1 | 2 | 1.60 |
| Method of Payment (Data) | 12 | 1 | 2 | 1 | 1 | | 1.25 |
| Purchase Option: Not Available in some states (Data) | 50 | | | | | 1 | 1.00 |
| Minimum Order must be $30.00 (Data) | 3 | 1 | | | | 1 | 1.00 |
| Method of Payment limited to 2 credit cards (Data) | 12 | 1 | 1 | 1 | 2 | 1 | 1.20 |
| Structural/Environment Information | | | | | | | |
| Enhancement: automatic detection for 5 modems (Rel. 1 had auto-detect for 3 classes only) | 5 | | | | | 1 | 1.00 |
| Installation is automatic at logon | 1 | | | | | 1 | 1.00 |
| Total New Tests with average rank: | 114 | | | | | | 1.38 |
| Existing Application Base Function | | | | | | | |
| Our Best Simulator (automated suite BSIM01) | 65 | | | | | 1.00 | 1.00 |
| Standard base function tests still apply: | | | | | | | |
| Message Data Flow Checker (automated suite DFCHECK) | 61 | | | | | 1.00 | 1.00 |
| All test suites for Version 1.0 will be run | | | | | | | |
| Screen Comparison - Pixel Viewer (automated suite PIXVIEW) | 76 | | | | | 1.00 | 1.00 |
| MITs Totals - All Tests: Tot tests & average rank = | 316 | | | | | | 1.31 |

start sidebar
Which Comes First? The Analysis or the Plan?

For many years I covered the analytical test analysis topics, path analysis and data analysis, before I discussed risk analysis. But in that time, I have found that in practice more testers can put risk analysis into practice and get real value from it more quickly and easily than test analysis. As a result, I have put the risk analysis topics before the test analysis topics in this book. If you have to go back to testing before you finish the book, you can at least take the risk topics and the worksheet with you.

However, it is not possible to discuss all the great uses for a worksheet without having some tests to count. So for the purposes of finishing the discussion on risk analysis, I am going to pretend that you have already identified some tests under each inventory item. The next three chapters present good methods for counting how many tests actually exist, along with other good tools for you to use in the actual selection process.

In each of the next three chapters, I will continue to add tests to the inventory as I explain path and data analysis.

end sidebar

This project is a typical middleweight project with some Agile programming on the Web front end and some well-planned database and business logic modules in the back.

There are many ways of determining the risk index associated with each ranking criterion, as we discussed in the previous chapter. The worksheet calculates the average rank of each inventory item. This becomes the base rank used in the estimation process. But before we move on to this next step, we need to examine where the tests associated with each inventory item came from.

Where Do the Tests Come From?

The worksheet lists a number of tests associated with each inventory item. The notations (Path) and (Data) next to certain inventory items refer to the type of tests being counted. I will cover both path and data analysis in detail in the next three chapters, where I will show you some very powerful and fast techniques for estimating and counting the number of path tests and data tests you will need in your test effort.

Suffice it to say, you can be as general or as specific as you like when you estimate the number of tests for each of these preliminary inventories. I always estimate certain types of tests; as with everything else in MITs, you are invited to customize this list to suit yourself. The following are my test sources:

  • Most Important Nonanalytical Tests (MINs). These are the tests that come from the Subject Matter Experts. They tend to dig deep into the system and focus on hot spots. They are often ad hoc in nature and do not provide any type of consistent coverage.

  • Most Important Paths (MIPs). These are the logic flow paths through a system. They form the basis for user-based function testing and allow the tester to verify and validate functions end to end. Chapters 11 and 12 deal with path analysis.

  • Most Important Data (MIDs). These are the data sets that must be validated and verified. Chapter 13 deals with techniques for counting and choosing your test data.

  • Most Important Environments (MIEs). There are two types of environmental testing. The first type is illustrated in the real-world shipping example from Chapter 7, where it was important to identify the various parts of the system that were used by the inventory test items (see Figure 7.2 and Figure 8.9 for examples of this type of environment). In this case, we use the test inventory to track what tests use what environments. This type of environment test is incorporated into various test scripts as a matter of course. The second type of environment testing is unique because it involves multiple environments that the software must run on. This is the more common scenario in RAD/Agile development efforts, where the application will be expected to run on several types of platforms, hardware, operating systems, databases, and so on.

  • This type of test is unique in this list, because while the other three types of tests tend to turn into test scripts, testing these environments requires that you run all your scripts on each environment. This is shown in the following equation:

    MITs = (MINs + MIPs + MIDs) × (MIEs)

    For this reason, this type of testing has an enormous impact on the time it will take to complete testing, and that is why the test environments get their own pages in my spreadsheets.
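To make the multiplication effect concrete, here is a small sketch. The split of scripts across MINs, MIPs, and MIDs is hypothetical (though it sums to the 114 new tests in Table 10.1); only the structure of the equation comes from the text.

```python
# Hypothetical split of 114 new scripts across the three script-producing types
mins, mips, mids = 20, 60, 34
mies = 6                              # environments the scripts must run on
mits_runs = (mins + mips + mids) * mies
print(mits_runs)                      # 684 test runs from only 114 scripts
```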

Clearly, the more accurate the number of tests identified at an early stage, the better the quality of the resulting estimate of the resources required to conduct the test effort. Once this worksheet has been filled out, the next step is to update the MITs Totals worksheet with these values.

The MITs Totals Worksheet

Up until now I have talked about risk in general terms, as it relates to a requirement or a feature: the high-level view. If I am analyzing the requirements and feature list, I will identify smaller and smaller testable bits through analysis. Those testable items are ranked and recorded in the inventory. Eventually, I add up the most important tests in each category and put the totals in the sizing worksheet so that I can calculate how much time they will take to accomplish.

In my spreadsheet, the base values are automatically transferred from the Ranking worksheet to the MITs Totals worksheet, where the estimated test coverage and MITs tests are calculated as shown in Table 10.2. In this table, you see a preliminary estimate of the test coverage and the number of MITs that need to be run.

Table 10.2: Data from the MITs Totals Worksheet

Tester's Paradise (Release 2.0)

| Inventory Items | T (#Tests Identified) | Rank | %Cov (100/Rank) | PastPerf (Past Performance) | TI (Tests Identified/Rank = number of tests to run) |
|---|---|---|---|---|---|
| Bug Fix Information | | | | | |
| Fix For Error #123 (see req. B477) | 7 | 1.00 | 100% | 75% | 7.00 |
| Fix for Error #124 (see req. B501) | 4 | 3.33 | 30% | 95% | 1.20 |
| New Function | | | | | |
| New Menu Option #3 View Mini Clip (see req. D071 & D072) | 4 | 1.60 | 63% | | 2.50 |
| Arrange Payment (Path) | 5 | 1.25 | 80% | NA | 4.00 |
| Method of Payment (Path) | 11 | 1.60 | 63% | NA | 6.88 |
| Method of Payment (Data) | 12 | 1.25 | 80% | NA | 9.60 |
| Purchase Option: Not Available in some states (Data) | 50 | 1.00 | 100% | NA | 50.00 |
| Minimum Order must be $30.00 (Data) | 3 | 1.00 | 100% | NA | 3.00 |
| Method of Payment limited to 2 credit cards (Data) | 12 | 1.20 | 83% | NA | 10.00 |
| Structural/Environment Information | | | | | |
| Enhancement: automatic detection for 5 modems (Rel. 1 had auto-detect for 3 classes only) | 5 | 1.00 | 100% | NA | 5.00 |
| Installation is automatic at logon | 1 | 1.00 | 100% | NA | 1.00 |
| Total New Tests, Average Rank, Average Test Coverage, MITs by Summation | 113 | 1.38 | 72% | MITS SUM | 100.18 |
| Total Tests/avg of rank values and MITs by Summation | 82.00 | | | -> | 101.00 |
| Existing Application Base Function | | | | | |
| Our Best Simulator (automated suite BSIM01) | 65 | | 67% | 97% | 43.55 |
| Standard base function tests still apply: | | | | | |
| Message Data Flow Checker (automated suite DFCHECK) | 61 | | 47% | 90% | 28.67 |
| All test suites for Version 1.0 will be run | | | | | |
| Screen Comparison - Pixel Viewer (automated suite PIXVIEW) | 76 | | 77% | 94% | 58.52 |
| Tot New + Old tests = | 315 | | avg. 64% | | 231.74 |
| MITs Totals - All Tests | | | | | |
| Min MITs = T × %Cov = | 201.00 | | | MITs = | 232.00 |
| Minimum % test coverage: min MITs / T × 100 = | 201.00 | | 64% | | |
| Proposed % test coverage: (TI / T) × 100 = | | | 74% | | 232.00 |
The first section of Table 10.2 deals with the new test items in Release 2. The middle section lists the tests from the previous release, and the last section contains the MITs totals. We will talk about this last section first.

The first thing to notice about the last section of Table 10.2, the last four lines, is that a total of 315 tests have been identified. But there are two different totals for the MITs. These are listed in the third from the last line.

MITs by the Rank Average Method

The first MITs value, 201 tests at 64 percent coverage, is the total of all the tests divided by the average rank of all the tests:

Minimum MITs = (MINs + MIPs + MIDs)/Average Rank

This is the absolute minimum number of tests that we could run. It is not the recommended value, and it may be completely hypothetical; however, it does give management a low-end value to use in the resource negotiations.
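Applied to the new-test figures in Table 10.2, the calculation looks like this; the 82.00 row in the worksheet is exactly this division.

```python
# Rank-average method on the new tests of Table 10.2
total_new_tests = 113
average_rank = 1.38
minimum_mits = total_new_tests / average_rank
print(round(minimum_mits))            # 82, the low-end figure in the worksheet
```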

MITs by the Summation Method

The second MITs value, 232 tests, with a test coverage of 74 percent, is the recommended value for the MITs in this test effort. This value is the sum, over all the selected items, of each item's tests divided by that item's rank index. This is MITs by the summation method. The formal equation is:

MITs = Σ (T/R), summed over the selected items, i = 1 to tsi

where:

T = All tests for the item

R = Rank of the item

tsi = Total selected items

The summation method always gives a more accurate picture of how many tests will be needed, since it is more granular by nature. In my sizing worksheet, it provides the high end for resource negotiations.
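Here is the summation computed over the new-test rows of Table 10.2 (a sketch; the tuples are just the T and Rank columns). It reproduces the MITS SUM value of 100.18.

```python
# MITs by summation: sum of (tests identified / rank) for each inventory item
items = [
    (7, 1.00), (4, 3.33),                           # bug fixes
    (4, 1.60), (5, 1.25), (11, 1.60),               # new function, paths
    (12, 1.25), (50, 1.00), (3, 1.00), (12, 1.20),  # new function, data
    (5, 1.00), (1, 1.00),                           # structural/environment
]
mits_sum = sum(t / r for t, r in items)
print(f"{mits_sum:.2f}")                            # 100.18, the MITS SUM
```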

If I am preparing a simple high-level (pie-in-the-sky) estimate, I will use requirements and features as the ranked test items. The work estimates will be very approximate. If I can include the count of actual tests identified for the project and use the MITs by summation method, then I can create a much more accurate estimate of resource requirements.

This means I have a better chance of negotiating for enough resources to actually succeed in my test effort. This is why I perform path and data analysis on the test items before I present my sizing estimates. Don't worry, though. If the development methodology in your shop does not give you time to plan and analyze, the worksheets are still powerful measurement and reporting tools.

Risk-Based Test Coverage and Performance

In the MITs method, the proposed test coverage, shown in the %Cov column of Table 10.2, is based on the number of tests identified for the item and the average rank index of the item. The past performance metric, shown in the PastPerf column, is a measure of how good the MITs test coverage was. It is unfortunate that you have to wait until the product has been in production for some time before you can get the performance measure, but it does help you improve your work in the future.
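For a single row, both derived columns are simple functions of the rank. Taking the Method of Payment (Data) row as an example:

```python
# %Cov and TI for the Method of Payment (Data) row of Table 10.2
tests_identified = 12
rank = 1.25
coverage_pct = 100 / rank                 # %Cov column: 100/rank
tests_to_run = tests_identified / rank    # TI column: tests to actually run
print(coverage_pct, tests_to_run)         # 80.0 9.6, as in the table
```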

The middle section, Existing Application Base Function, comprises the tests that were developed for the previous release. Notice that these tests found from 90 to 97 percent of the bugs before this previous release went into production. Especially noteworthy was the test suite for the Data Flow Checker. It only provided 47 percent test coverage, yet this suite found 90 percent of bugs in the release. Not much had been changed in these existing functions, so there seems to be no reason to make any changes in these test suites.

Test suites from previous releases are normally rerun to make sure that nothing was broken in the process of putting the new code in place. Unfortunately, they are the first thing management likes to cull when time pressure builds.

A good test sweep for Release 1 can then be a starting point for subsequent releases. Microsoft likes to call this a "smoke test"; I like to call it a diagnostic suite. The point is that a test suite of this type can outlive the code it was originally written to test. So don't lose track of where you put these older tests if you are forced to forego them in one test cycle. You may well want them back at some point.

One final note before we move on to examining the sizing worksheet: Keep in mind that the rank of tests can change from one release to the next, as well as during a release. For example, a low-ranking block of tests may find an astounding number of bugs; we would want to raise its ranking the next time we run it. Eventually, when the system stabilizes, you might want to go back to the original ranking for that block of tests. All of this assumes that the assumptions made for Release 1 are still valid for Release 2, which is not always the case.

The Sizing Worksheet

The sizing worksheet, as I use it here, is first and foremost the tool I use to calculate how much the test effort will cover, based on the risk analysis, and what that effort will cost in terms of testers, test time, regression test time, and test environments.

The sizing worksheet will contain relative percentages, like test coverage. It will contain time taken, and if you are lucky and very clever, it will contain the cost of doing and not doing certain tests. It will contain assumptions; these should be clearly noted as such. It will also contain estimates, which should be clearly distinguished from actuals.

start sidebar
Some Thoughts about Test Coverage

How much of what there was to test did you test? Was that "adequate"? I saw a reference to an IEEE guideline that said that minimum test coverage should be 80 percent. I want to know: "80 percent of what?" If there is no immutable method for counting all the tests that exist, what benefit are we gaining by testing some fixed percentage? How do we know that the 80 percent that was tested was the right 80 percent? It is ironic that Pareto analysis suggests that 20 percent of the code generally causes 80 percent of the problems. How does this method ensure that we will test any part of the most-likely-to-fail 20 percent?

Unfortunately, there is no magic number for test coverage. The adequate test effort is made so by selecting the correct tests, not by running a particular number of them. Further, you can only find the magic 20 percent problem center with good risk analysis techniques.

end sidebar

During the test effort, estimates are replaced by the actual effort times. The worksheet becomes the tracking vehicle for project deliverables. As we test, we replace the estimates with real time, so when we estimate the next test effort, we have actual times to compare the estimates with. This improves our estimates.

Once my ranking is complete, the totals on the MITs Totals worksheet populate the sizing worksheet. Table 10.3 shows my sample test sizing worksheet. Most of the numbers in the third column are either linked values from the MITs Totals worksheet or they are calculated in their cells here on the sizing worksheet. I will point out some of the highlights of this sample.

Table 10.3: Tester's Paradise Test Sizing Worksheet

| Item | Tester's Paradise (Release 2.0) | |
|---|---|---|
| 1 | Total Tests for 100% coverage (T), from MITs Totals row on Test Calc. sheet | 315 |
| | MITs recommended number of scripts | 232.00 |
| | MITs minimum number of scripts, from MITs Totals sheet | 208.00 |
| | MITs estimate for recommended coverage, all code | 74% |
| | MITs estimate for minimum required coverage, all code | 66% |
| | Number of existing tests from Version 1 | 130.74 |
| | Total New Tests identified | 113 |
| | Number of tests to be created | 101.00 |
| 2 | Average number of keystrokes in a test script | 50 |
| | Est. script create time (manual script entry) at 20 min. each: (total new tests × 20/60) = person-hours total | 32.58 |
| 3 | Est. automated replay time, total MITs (including validation), at 4/60 hours/script = replay hr./cycle (for each test environment) | 15.47 |
| | Est. manual replay time for MITs tests (including validation), at 20/60 hours/script = hours/cycle (for each test environment) | 77.33 |
| 4 | LOC: approx. 10,000 C++, 2,000 ASP | 12,000 lines |
| | Est. number of errors (3 errors/100 LOC) | 400 errors |
| 5 | Number of code turnovers expected | 4 |
| | Number of complete test cycles est. | 5 |
| 6 | Number of test environments | 6 |
| | Total number of tests that will be run (against each environment): 4 complete automated cycles = Total MITs × 4 | 928 |
| | Total tests, all environments: 5 cycles × Total MITs × 6 environments | 6,960 |
| 7 | Pre-turnover: analysis, planning, and design | 80 hr |
| | Post-turnover: | |
| 8 | Script creation & 1st test cycle (manual build + rerun old suites), hours | 41.30 |
| | 4 automated test cycles (time per cycle × 4), running concurrently on 6 environments, hours | 61.87 |
| | Total: script run time with automation, running concurrently on 6 environments (1 manual + 4 automated) = weeks to run all tests through 5 cycles on 6 environments | 7.22 |
| | Total: script run time, all manual (5 manual cycles) = weeks of serial testing for 6 environments. Best recommendation for automating testing! | 58 |
| 9 | Error logs, status, etc. (est. 1 day in 5 for each environment), weeks | 1.73 |
| | Total: unadjusted effort = total run time + bug reporting, weeks | 8.95 |
| 10 | Factor of safety adjustment = 50%: total adjusted effort, weeks | 13.43 |
| 11 | Minimum completion time: 7 weeks, due to coding constraints (see assumptions below) | 7 weeks |

Assumptions:

  • 100% availability of the test system.

  • 10 test machines preset with the required environments.

  • Multiple testers will be assigned as needed.

  • 3 programmers available for fixes.

  • Standard error density and composition.

Current Status: Analysis, test plan, and test design completed. Awaiting code turnover to begin testing.

Item 1: MITs Tests and Coverage

The total number of tests identified in this application is 315. After MITs analysis, the recommended test coverage is 74 percent. This means that I am going to size the test effort for 232 tests, the high end of the estimate. From this starting point, the rest of the worksheet is dedicated to calculating how many hours it will take to run 232 tests, create automated tests, report and track bugs, and so on.
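The 74 percent is just the MITs selection expressed as a fraction of the full inventory:

```python
# Item 1: coverage implied by sizing the effort for the recommended MITs
total_tests = 315
mits_recommended = 232
print(f"{mits_recommended / total_tests * 100:.0f}%")   # 74%
```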

If I have to, I can fall back on the minimum MITs number, but I really don't like to underestimate the test effort at this time.

Item 2: Test Units and Time to Create Tests

The figure of 50 keystrokes was based on the average number of keystrokes per script in the first release of the application. The estimate of how long it would take to capture a script, 20 minutes, was also based on historical data.

Caution 

Check yourself carefully on this estimate. One of the most interesting and amusing boo-boos that I have ever been asked to find a way out of came from a very mature and professional test organization. This group had performed a full MITs evaluation, complete with factor of safety, only to find when they began creating tests that their time-to-create estimate was low by two-thirds. Happily, we were able to speed the test creation just a bit, shave the number of MITs by 5 percent, and make up the difference from the factor of safety.

You will need to specify your own definition for "test" units. In this example, the units are keystrokes. This is because all of these tests are driven from the user interface. The average number of keystrokes per script was established based on the test scripts developed in the first release of the application. But you may have data sets, or message types, or all sorts of other units. Just add as many rows as you need and keep adding up the totals.

You can add more rows and columns to your own worksheet as you need. For example, these tests are all driven from the user interface and entered via keystrokes. If you have tests that run from batched data files, you will add a row that contains your estimate of the time required to perform the batch testing.

Item 3: Time to Run Tests and Create Automated Tests

Again, this estimate needs to be as accurate as possible. Keying in scripts is hard and thankless. I use a team-oriented approach to ensure that everyone keeps moving and that morale stays high. I believe in providing free soda, pizza, and whatever else it takes to make my people feel good about doing this boring, repetitive work. A tester who is awake creates better scripts than one who is bored to death.

Notice that the estimated replay time for the automated tests was 15.47 hours, or just under two days per cycle. This was the time it would take the entire test suite of 232 tests to run to completion on one machine. The same tests would take 77.33 hours to run manually.
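Both per-cycle figures follow directly from the per-script replay times, 4 minutes automated and 20 minutes manual:

```python
# Item 3: replay hours per cycle on one machine
mits_scripts = 232
automated_hours = mits_scripts * 4 / 60     # 4 min/script
manual_hours = mits_scripts * 20 / 60       # 20 min/script
print(f"{automated_hours:.2f} {manual_hours:.2f}")   # 15.47 77.33
```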

Item 4: Estimate the Number of Errors That Will Be Found

This project had about 12,000 lines of code and script. The company maintained bugs per KLOC (thousands of lines of code) statistics; three errors per KLOC was the magic number, so we used it. It's very hard to estimate how many errors you will actually find, but if you can estimate based on historical data, you at least have a target. We never took it too seriously, but sometimes it was helpful.

Do be careful, though. I was on a project once where the development manager wanted us to stop testing the product after we had found the "estimated" number of bugs. We were not finished testing; we were still finding bugs at the rate of several each day. Nonetheless, I had my hands full convincing her that we were not finished testing.

Item 5: Code Turnovers, Test Cycles

Four code turnovers were expected during the test effort. So the team was expecting to process one or two major deliveries and two or three bug fix deliveries. Because the effort was automated, the testers were expected to run the entire automated test suite every time code was turned over. This meant running every automated script as many as five times.

This was a completely different approach from running a manual test effort. In a manual effort, you can hardly hope to run most tests more than once. In this effort, we found that testers had time on their hands once they started running the automated tests, so instead of just going back and rerunning tests, they were actually refining the existing tests and creating better, higher-quality tests. They would take two or three old tests, pull them out, and replace them with one high-quality test. Over the course of the test effort, the number of tests actually dropped because of this refinement.

Building the new scripts was counted as part of the first test cycle, killing two birds with one stone. The new automated test scripts were built and tested at the same time. This counted as the first test cycle. The plan was to run the following four test cycles in a fully automated mode.

Item 6: Test Environments and Total Tests

Every test needed to be verified and validated in all six environments, so the test room was set up with all six machines. This meant that in two days of automated test runs, 928 test scripts were run on each of the six machines. This was when we found that the administrative overhead was higher than our estimate of having to spend one day out of five doing administrative and bug reporting tasks. (See Item 9.)

This meant that 6,960 tests would be run over the course of the test effort, in five cycles against six test environments. Because the effort was automated, this was doable. It would not have been possible to test to this standard if the tests were being run in a manual mode. The bug reporting effort was larger than expected, however.
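The totals in Item 6 are straightforward multiplication:

```python
# Item 6: test runs per environment and across the whole effort
mits, automated_cycles, total_cycles, environments = 232, 4, 5, 6
runs_per_environment = mits * automated_cycles         # 928 automated runs
runs_all_environments = mits * total_cycles * environments
print(runs_per_environment, runs_all_environments)     # 928 6960
```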

Item 7: Planning Time

I don't usually spend more than two weeks planning my test efforts, but I may spend the entire test cycle designing and refining tests. It just depends on how the code arrives, how many test environments I have, and any other relevant particulars of the effort. Also, I like to have this planning finished before I present the sizing worksheet to management. It simply goes down better. Of course, this is not possible in a really large project, but it usually is possible in a Web application project.

If "plan" is one of those four-letter words that is not politically acceptable in your shop, then you can always absorb your planning time in the estimate of the time required to run your tests, or by adding an extra test cycle.

Item 8: The Case for Test Automation

One of the most noteworthy things about Table 10.3 is the comparison between the hours it takes to run one cycle of the automated test suite, 15.47, and the hours it takes to run the same tests manually, 77.33. You can see that there is a good case here for automating the test scripts for this new release. It is this kind of analysis that allows you to make good recommendations about what to automate and when to do it.

In a manual effort it would have taken 58 weeks to run all five test cycles. The automated tests could be run in under eight weeks by fewer testers.
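The 58-week figure can be reconstructed from the Item 3 numbers if we assume serial execution and a 40-hour test week; the 40-hour conversion is my assumption, since the worksheet does not state one.

```python
# Item 8: the all-manual effort, reconstructed from per-cycle hours
manual_cycle_hr = 77.33                  # one manual cycle, one environment
weeks = manual_cycle_hr * 5 * 6 / 40     # 5 cycles x 6 envs, 40-hr weeks (assumed)
print(round(weeks))                      # 58 weeks of serial manual testing
```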

Item 9: Administration, Documentation, and Logging Time

In our shop we usually spent about one day in five bug-logging. This seems to be pretty normal. However, it turned out to be low in this effort, for the reasons that I stated previously, in Item 6. We needed our factor of safety to help us survive this low estimate during the effort.

Test results have to be analyzed, bugs have to be reported, questions have to be answered about what happened, tests have to be reconstructed to understand just what went wrong, and failures need to be reproduced for developers. These are a few of the administrative tasks that kept the testers busy in between test runs. In this effort, managing the multiple test environments required more time than we anticipated.

Item 10: Factor of Safety

Notice that there is a 50 percent safety factor built into the overall time required to conduct this effort. Now if your management asks, "What's this factor of safety?", you can say it's there to cover the unforeseen and unplanned activities that happen every day: meetings, kids getting sick, telephone and computer systems going down. This is just real life. As I said before, 50 percent is actually low; it assumes everyone stays focused on this test effort. We used ours up making up for our low bug-reporting and administration estimate.
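The arithmetic behind Items 9 and 10 is simply the run time plus the reporting time, scaled by the safety factor:

```python
# Items 9-10: unadjusted effort plus the 50% factor of safety, in weeks
run_time_weeks = 7.22             # all cycles on 6 environments (Item 8)
bug_reporting_weeks = 1.73        # 1 day in 5 per environment (Item 9)
unadjusted = run_time_weeks + bug_reporting_weeks   # 8.95 weeks
adjusted = unadjusted * 1.5       # 8.95 x 1.5 = 13.425, shown as 13.43
print(round(adjusted, 1))         # about 13.4 weeks of total adjusted effort
```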

Item 11: Constraints, Assumptions, and Status

This next box looks pretty innocuous, but it is where everything hits the wall. The minimum completion time is seven weeks, due to coding constraints. This is the quickest we could get it done. We expect to find 400 errors here. It takes time to fix errors. This is the part management loses sight of most often: the time it takes to fix the bugs. See the section coming up, Don't Forget the Developers to Fix the Bugs, for a real-life example of this.

The main assumption in this effort was 100 percent availability of the test system. I am happy to say this wasn't a problem in this particular project, but it normally is a major dependency.

The head tester, Joseph, a quiet, gentle person who couldn't kill a spider, had completed his analysis, test plan, and test design when this proposal was presented to our senior vice president. His test environments were ready, baselined, and fully mirrored. This tester was waiting for his code.

Negotiating the Test Effort

After many years testing, this is still my favorite example of how a good tester can succeed. Joseph had successfully negotiated for 12 weeks to complete the automated test effort in the tables. When the first cycle was automated, his actuals looked so good that he rescaled his sizing worksheet and it looked like everything would still fit into the 12 weeks. Then someone in marketing promised a vice president somewhere that they would deliver the application in four weeks.

I was asked to join the test team that went to tell our vice president what he had to do to get this delivered in four weeks. As always, I was ready to play the sacrificial tester in case things went bad.

"So how can we get this done in four weeks?" the vice president asked the tester, Joseph. "You only have three testers; we can give you 10 good folks from customer service and editorial to help you test. That will do the trick, won't it?"

Joseph wasn't ruffled. "We planned on having six machines running. It takes one machine 16 hours to run the entire suite. If we doubled the number of machines, we could cut that in half." This made sense to the vice president.

Joseph continued, "I wouldn't mind having one other trained tester, but I would decline your kind offer of these guys from editorial because they don't stick to plan and we can't reproduce what they break. We end up spending all our time with them, instead of getting our testing done.

"What we really need is lots of developers to fix the bugs we find right away. If we can run a test cycle in a day, we need the bugs fixed as fast as we can find them. The programmers we have can't possibly give us fixes for 400 bugs in four weeks."

The VP was surprised, but he agreed. The product was delivered in four weeks, and it was very stable. The lesson here is, don't forget the time required to fix the bugs found by testing. This is a common problem in test efforts-mostly because testers don't control this issue.

Don't Forget the Developers to Fix the Bugs

I have encountered this problem several times in my career, enough times to spend a few words on it here. The following is the best example I ever saw of this situation, because it happened to a very mature and conscientious development shop. It just goes to show that we can all get sucked into this mistake.

This was a heavyweight project for a major telecommunications company that came dangerously close to failure because management did not keep sufficient developers ready to fix the bugs that the testers found.

The testers had prepared a full complement of test collateral, including a very good estimate of how many bugs the effort would find, what severity they would be, and so on, but they didn't take the next step and build an estimate of how much time and how many developers it would take to fix all these bugs. So when the project was marked complete by the development managers, they moved all but one of the developers to new projects in different countries.

When I joined the project, I built a worksheet straight away. The lone developer was kind enough to contribute estimates of how much time it would take to fix each logged bug with a severity of 2 or higher. The testers had a good inventory; I simply added a sheet to it. The new sheet was basically the test script worksheet with an environment matrix. Each bug was recorded under the test script that uncovered it. I added two columns to the sheet; one was for the severity of the bug, and the other was the time-to-fix estimate. This worksheet showed management not only how many developer hours would be required to fix these serious to critical bugs, but which requirement, features, and environments the developers should be familiar with. It also showed testers where they could expect to be regression testing the most.
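As a sketch of that added sheet's rollup (the bugs, severities, and hours below are invented for illustration), the idea is just to sum the time-to-fix estimates per inventory item so management can see the developer-hours required:

```python
# Hypothetical rollup of estimated fix hours per inventory item
from collections import defaultdict

bugs = [  # (inventory item that uncovered the bug, severity, est. hours to fix)
    ("Arrange Payment (Path)", 1, 16.0),
    ("Arrange Payment (Path)", 2, 6.0),
    ("Method of Payment (Data)", 2, 4.0),
]
fix_hours = defaultdict(float)
for item, severity, hours in bugs:
    if severity <= 2:             # severity 2 or higher only (1 = most severe)
        fix_hours[item] += hours
for item, hours in sorted(fix_hours.items()):
    print(f"{item}: {hours} developer-hours")
```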

Initially, management supplied developers who were not assigned to any project at that time to fix the bugs. These developers spent precious weeks not fixing bugs, but, rather, trying to understand the system. Management was finally persuaded to bring back key developers specifically to address issues in the code they had written. The developers who had written the code were able to fix the bugs, and the deadlines for system delivery were met, but meanwhile, their new projects fell behind.

It seems absurd that something so important as fixing the bugs found by testing could fall through the cracks. I attribute this problem mostly to a failure of project management techniques in general. When I have seen this failure, it has always been traceable to the fact that there was no task on the project plan for fixing bugs after the code was delivered. This is especially true and problematic in large integration projects where integration issues are not discovered until well after the code has been turned over.

In these cases, the developers who wrote the code have invariably been assigned to their next project, leaving no one who is familiar with the code to fix the bugs. In such circumstances, the time and the cost needed to fix the bug are both significantly higher than when a bug is fixed by the original author.

On a slightly different note, I have to say that not everyone has this problem. There is a new development methodology called eXtreme programming that has gone in quite the opposite direction. Rather than forget that time and developers are required to fix the bugs that testers find, they are counting on it.

The eXtreme Project

Software makers are still trying to off-load the cost of validation and verification onto the user. The latest example of this is eXtreme programming. After close inspection, I have decided it is simply the latest iteration of the I-feel-lucky approach to software development. eXtreme uses a frontal assault on the concept of testing. The RAD approach was to pretend to do formal testing, while what is really going on is a bit of cursory exploration by the testers before letting the customers test it. eXtreme asserts that they are so good that no testing is necessary. This means that no time or effort is wasted before turning it over to the customer, who will begin testing in earnest, whether they mean to or not. What the method does do extremely well is pander to the classic A-type compulsive personalities, giving them full permission to run amuck while keeping the financial auditors at bay until it is too late to turn back.

If eXtreme succeeds at all in fielding a viable product, it is due to the Money for Flexibility (MFF) strategy that goes along with this approach and the full dress rehearsal that they are giving the software while the first sacrificial customers test it. The successful eXtreme projects that I have seen keep a full contingent of developers at the ready when they roll out the software to a select few customers. This gives them the lead-time that they need to fix all the major issues before too many people see the thing.

The Contract to Test

The sizing worksheet is an important tool in establishing the contract to test. When I write a contract, the first thing I do is estimate the test effort. Then I write the test scripts for some of the buggier areas and compare the two. I can then go in to sign the contract feeling better about my estimate. A test sizing worksheet is used to prepare estimates of the resources required to carry out the test effort.

The worksheet is an important communications tool both in the estimation and in the negotiation phase. It shows management why I couldn't do what they wanted. It protects the customer. If management doesn't understand how big the thing is, they don't understand why you can't test it in 24 hours. In the resource negotiations process, the worksheet, along with the rank and test calculations sheets, serves as the basis for the contract to test between parties.

Once the negotiations are concluded and the contract to test is formalized, this part of the process is completed. The next step is to actually identify tests under each inventory item. Now let's imagine that you are going to actually start designing, collecting, and discovering tests.

[1] I simply selected the cells I wanted in the spreadsheet and copied them, then pasted them here in my Word document.


