Due to business considerations, a growing number of organizations rely on external vendors to develop software for their needs. These organizations typically conduct an acceptance test to validate the software. In-process metrics and detailed information to assess the quality of the vendors' software are generally not available to the contracting organizations. Therefore, useful indicators and metrics related to acceptance testing are important for assessing the software. Such metrics differ from the calendar-time-based metrics discussed in previous sections because acceptance testing is normally short, and there may be multiple code drops and, therefore, multiple mini acceptance tests in the validation process.
The IBM 2000 Sydney Olympics project was one such project, in which IBM evaluated vendor-delivered code to ensure that all elements of a highly complex system could be integrated successfully (Bassin, Biyani, and Santhanam, 2002). The summer 2000 Olympic Games was considered the largest sporting event in the world. For example, there were 300 medal events, 28 different sports, 39 competition venues, 30 accreditation venues, 260,000 INFO users, 2,000 INFO terminals, 10,000 news records, 35,000 biographical records, and 1.5 million historical records. There were 6.4 million INFO requests per day on average, and the peak was 874.5 million Internet hits per day. For the Venue Results components of the project, Bassin, Biyani, and Santhanam developed and successfully applied a set of metrics for IBM's testing of the vendor software. The metrics were defined based on test case data and test case execution data; that is, when a test case was attempted for a given incremental code delivery, an execution record was created. Entries for a test case execution record included the date and time of the attempt, the execution status, the test phase, pointers to any defects found during execution, and other ancillary information. There were five categories of test execution status: pass, completed with errors, fail, not implemented, and blocked. A status of "fail" or "completed with errors" would result in the generation of a defect record. A status of "not implemented" indicated that the test case did not succeed because the targeted function had not yet been implemented, which could occur because this was an incremental code delivery environment. The "blocked" status was used when the test case did not succeed because access to the targeted area was blocked by code that was not functioning correctly. Defect records were not recorded for these latter two statuses. The key metrics derived and used include the following:
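The record layout and status rules described above might be sketched as follows; this is an illustrative model only, and the class, field, and status names are assumptions, not the IBM project's actual data definitions:

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List


class Status(Enum):
    """The five test execution status categories described in the text."""
    PASS = "pass"
    COMPLETED_WITH_ERRORS = "completed with errors"
    FAIL = "fail"
    NOT_IMPLEMENTED = "not implemented"  # targeted function not yet delivered
    BLOCKED = "blocked"                  # access blocked by faulty code elsewhere


# Per the text, only these two statuses generate a defect record.
DEFECT_GENERATING = {Status.FAIL, Status.COMPLETED_WITH_ERRORS}


@dataclass
class ExecutionRecord:
    """One record per attempt of a test case against a code delivery."""
    test_case_id: str
    attempted_at: datetime
    status: Status
    test_phase: str
    defect_ids: List[str] = field(default_factory=list)  # pointers to defects found


def requires_defect_record(record: ExecutionRecord) -> bool:
    """A 'fail' or 'completed with errors' status should produce a defect record."""
    return record.status in DEFECT_GENERATING
```

The "not implemented" and "blocked" cases deliberately fall outside `DEFECT_GENERATING`, mirroring the rule that no defect records were created for those statuses.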
Metrics related to test cases
- Percentage of test cases attempted: used as an indicator of progress relative to the completeness of the planned test effort
- Number of defects per executed test case: used as an indicator of code quality as the code progressed through the series of test activities
- Number of failing test cases without defect records: used as an indicator of the completeness of the defect recording process
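These three test case metrics can be computed directly from the execution records. The following is a minimal sketch under assumed inputs (plain dicts with hypothetical field names), not the project's actual tooling:

```python
def compute_test_case_metrics(planned_test_cases, executions):
    """Compute the three test-case metrics from execution records.

    planned_test_cases: set of all planned test case ids
    executions: list of dicts with keys 'test_case_id', 'status', 'defect_ids'
    Returns (percent attempted, defects per executed test case,
             count of failing executions lacking a defect record).
    """
    # Progress: distinct test cases attempted vs. the planned total
    attempted = {e["test_case_id"] for e in executions}
    pct_attempted = 100.0 * len(attempted) / len(planned_test_cases)

    # Code quality: defects found per test case execution
    defects = sum(len(e["defect_ids"]) for e in executions)
    defects_per_executed = defects / len(executions) if executions else 0.0

    # Recording completeness: defect-generating statuses with no defect pointer
    failing_without_defects = sum(
        1 for e in executions
        if e["status"] in ("fail", "completed with errors") and not e["defect_ids"]
    )
    return pct_attempted, defects_per_executed, failing_without_defects
```

A nonzero value for the third metric flags a gap in the defect recording process: an execution failed but no defect record points back to it.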
Metrics related to test execution records
With these metrics and a set of in-depth defect analyses known as orthogonal defect classification, Bassin and associates were able to provide value-added reports, evaluations, and assessments to the project team.
These metrics merit serious consideration for software projects in similar environments. The authors contend that the underlying concepts are useful, beyond vendor-delivered software, for projects that have the following characteristics:
It should be noted that these test case execution metrics require tracking at a very granular level; by definition, the unit of analysis is the execution of each test case. They also require the data to be thorough and complete. Inaccurate or incomplete data will have a much larger impact on the reliability of these metrics than on metrics based on higher-level units of analysis. Planning the implementation of these metrics must therefore address the test and defect tracking system as part of the development process and project management system. Among the most important issues are cost and behavioral compliance with regard to recording accurate data. Finally, these metrics measure the outcome of test executions. When using them to assess the quality of the product to be shipped, the effectiveness of the test plan should be known or assessed a priori, and the effort/outcome model framework should be applied.