If configuration management (CM) provides the framework for the management of all systems engineering activities, then metrics provide the framework for measuring whether or not configuration management has been effective.
That metrics are an absolute requirement is proven by the following dismal statistics:
While configuration management is not a panacea for all problems, practicing sound CM methodologies can assist in effectively controlling the systems engineering effort. This chapter surveys a wide variety of metrics that can be deployed to measure the development process. Not all metrics are applicable to all situations. The organization must carefully consider which of the following metrics best fit its needs.
Why should anyone care about productivity and quality? There are several reasons for this. The first and foremost reason is that our customers and end users require a working, quality product. Measuring the process as well as the product tells us whether we have achieved our goal. However, there are other, more subtle reasons why one needs to measure productivity and quality, including:
One measures productivity and quality to quantify the project's progress as well as to quantify the attributes of the product. A metric enables one to understand and manage the process as well as to measure the impact of changes to the process (that is, new methods, training, etc.). The use of metrics also enables one to know when one has met one's goals (that is, usability, performance, test coverage, etc.).
In measuring software systems, one can create metrics based on the different parts of a system (for example, requirements, specifications, code, documentation, tests, and training). For each of these components, one can measure its attributes, which include usability, maintainability, extendibility, size, defect level, performance, and completeness.
While the majority of organizations will use metrics found in books such as this one, it is possible to generate metrics specific to a particular task. Characteristics of metrics dictate that they should be:
Sample product metrics include:
Sample process metrics include:
Once the organization determines the slate of metrics to be implemented, it must develop a methodology for reviewing the results of the metrics program. Metrics are useless if they do not result in improved quality or productivity. At a minimum, the organization should:
The following metrics are typically used by those measuring the CM process:
Using the IEEE methodology [IEEE 1989], the measurement process can be described in nine stages. These stages may overlap or occur in different sequences, depending on organization needs. Each of these stages in the measurement process influences the production of a delivered product with the potential for high reliability. Other factors influencing the measurement process include:
Product measures include:
Process measures include:
The nine stages consist of the following.
Initiate a planning process. Form a planning group and review reliability constraints and objectives, giving consideration to user needs and requirements. Identify the reliability characteristics of a software product necessary to achieve these objectives. Establish a strategy for measuring and managing software reliability. Document practices for conducting measurements.
Define the reliability goals for the software being developed to optimize reliability in light of realistic assessments of project constraints, including size, scope, cost, and schedule.
Review the requirements for the specific development effort to determine the desired characteristics of the delivered software. For each characteristic, identify specific reliability goals that can be demonstrated by the software or measured against a particular value or condition. Establish an acceptable range of values. Consideration should be given to user needs and requirements.
Establish intermediate reliability goals at various points in the development effort.
Establish a software reliability measurement process that best fits the organization's needs. Review the rest of the process and select those stages that best lead to optimum reliability.
Add to or enhance these stages as needed. Consider the following suggestions:
Identify potential measures that would be helpful in achieving the reliability goals established in Stage 2.
Prepare a data collection and measurement plan for the development and support effort. For each potential measure, determine the primitives needed to perform the measurement. Data should be organized so that information related to events during the development effort can be properly recorded in a database and retained for historical purposes.
For each intermediate reliability goal identified in Stage 2, identify the measures needed to achieve this goal. Identify the points during development when the measurements are to be taken. Establish acceptable values or a range of values to assess whether the intermediate reliability goals are achieved.
Include in the plan an approach for monitoring the measurement effort itself. The responsibility for collecting and reporting data, verifying its accuracy, computing measures, and interpreting the results should be described.
Monitor measurements. Once the data collection and reporting begin, monitor the measurements and the progress made during development, so as to manage the reliability and thereby achieve the goals for the delivered product. The measurements assist in determining whether the intermediate reliability goals are achieved and whether the final goal is achievable. Analyze the measure and determine if the results are sufficient to satisfy the reliability goals. Decide whether a measure's results assist in affirming the reliability of the product or process being measured. Take corrective action.
Analyze measurements to ensure that reliability of the delivered software satisfies the reliability objectives and that the reliability, as measured, is acceptable.
Identify assessment steps that are consistent with the reliability objectives documented in the data collection and measurement plan. Check the consistency of acceptance criteria and the sufficiency of tests to satisfactorily demonstrate that the reliability objectives have been achieved. Identify the organization responsible for determining final acceptance of the reliability of the software. Document the steps in assessing the reliability of the software.
Assess the effectiveness of the measurement effort and perform the necessary corrective action. Conduct a follow-up analysis of the measurement effort to evaluate the reliability assessment and development practices, record lessons learned, and evaluate user satisfaction with the software's reliability.
Retain measurement data on the software throughout the development and operation phases for use in future projects. This data provides a baseline for reliability improvement and an opportunity to compare the same measures across completed projects. This information can assist in developing future guidelines and standards.
The Contel Technology Center's Software Engineering Lab has as one of its prime goals the improvement of software engineering productivity. As a result of work in this area, Pfleeger and McGowan [1990] have suggested a set of metrics for which data is to be collected and analyzed. This set of metrics is based on a process maturity framework developed at the Software Engineering Institute (SEI) at Carnegie Mellon University. The SEI framework divides organizations into five levels based on how mature (i.e., organized, professional, aligned to software tenets) the organization is. The five levels range from initial, or ad hoc, to an optimizing environment. Contel recommends that metrics be divided into five levels as well. Each level is based on the amount of information made available to the development process. As the development process matures and improves, additional metrics can be collected and analyzed.
This level is characterized by an ad hoc approach to software development. Inputs to the process are not well-defined, but the outputs are as expected. Preliminary baseline project metrics should be gathered at this level to form a basis for comparison as improvements are made and maturity increases. This can be accomplished by comparing new project measurements with the baseline ones.
At this level, the process is repeatable in much the same way that a sub-routine is repeatable. The requirements act as input, the code as output, and constraints are such things as budget and schedule. Although proper inputs produce proper outputs, there is no means to easily discern how the outputs are actually produced. Only project-related metrics make sense at this level because the activities within the actual transitions from input to output are not available to be measured. Measures at this level can include:
At this level, the activities of the process are clearly defined. This means that the input to and output from each well-defined functional activity can be examined, which permits a measurement of the intermediate products. Measures include:
At this level, feedback from early project activities is used to set priorities for later project activities. Activities are readily compared and contrasted, and the effects of changes in one activity can be tracked in the others. Measurements can be made across activities and are used to control and stabilize the process so that productivity and quality match expectations. The following types of data are recommended for collection. Metrics at this stage, although derived from the following data, are tailored to the individual organization.
At this level, measures from activities are used to change and improve the process. This process change can affect the organization as well as the project. Studies by the SEI report that 85 percent of organizations are at Level 1, 14 percent at Level 2, and 1 percent at Level 3. None of the firms surveyed had reached Level 4 or 5. Therefore, the authors have not recommended a set of metrics for Level 5.
The IEEE standards [1988] were written with the objective of providing the software community with defined measures currently used as indicators of reliability. By emphasizing early reliability assessment, the standard supports measurement-based methods for improving product reliability.
This section presents a sub-set of the IEEE standard that is easily adaptable by the general IT community.
This measure can be used to predict remaining faults by comparison with expected fault density, determine if sufficient testing has been completed, and establish standard fault densities for comparison and prediction.
Fd = F/KSLOC
Where:
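As a rough illustration of the computation, the sketch below (with assumed field names and example values) treats F as the count of unique faults found and KSLOC as thousands of source lines of code:

    # Illustrative sketch: fault density Fd = F/KSLOC.
    def fault_density(unique_fault_count, source_lines_of_code):
        """Return faults per thousand source lines of code (KSLOC)."""
        ksloc = source_lines_of_code / 1000.0
        return unique_fault_count / ksloc

    # Example: 42 unique faults found in a 68,000-line system.
    print(round(fault_density(42, 68000), 2))   # -> 0.62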
This measure can be used after design and code inspections of new development or large block modifications. If the defect density is outside the norm after several inspections, it is an indication of a problem.
Where:
This is a graphical method used to predict reliability, estimate additional testing time to reach an acceptable reliable system, and identify modules and sub-systems that require additional testing. A plot is drawn of cumulative failures versus a suitable time base.
This measure represents the number of days that faults spend in the system, from their creation to their removal. For each fault detected and removed, during any phase, the number of days from its creation to its removal is determined (fault-days). The fault-days are then summed for all faults detected and removed, to get the fault-days number at system level, including all faults detected and removed up to the delivery date. In those cases where the creation date of the fault is not known, the fault is assumed to have been created at the middle of the phase in which it was introduced.
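A minimal sketch of this bookkeeping, assuming each fault record carries a removal date and either a creation date or the start and end dates of the phase that introduced it (the field names are illustrative):

    from datetime import date

    # Illustrative sketch: fault-days number summed over all detected and removed faults.
    def fault_days(faults):
        total = 0
        for fault in faults:
            created = fault.get("created")
            if created is None:
                # Creation date unknown: assume the middle of the phase that introduced the fault.
                created = fault["phase_start"] + (fault["phase_end"] - fault["phase_start"]) / 2
            total += (fault["removed"] - created).days
        return total

    faults = [
        {"created": date(2023, 1, 10), "removed": date(2023, 3, 1)},
        {"phase_start": date(2023, 2, 1), "phase_end": date(2023, 2, 28), "removed": date(2023, 4, 15)},
    ]
    print(fault_days(faults))   # -> 110 fault-days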
This measure is used to quantify a software test coverage index for a software delivery. From the system's functional requirements, a cross-reference listing of associated modules must first be created.
Functional (Modular) Test Coverage Index = FE/FT
Where:
This measure aids in identifying requirements that are either missing from, or in addition to, the original requirements.
TM = (R1/R2) × 100%
Where:
R1 = number of requirements met by the architecture
R2 = number of original requirements
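For example, if requirements are traced by identifier, the measure can be computed as in the hypothetical sketch below:

    # Illustrative sketch: traceability measure TM = (R1/R2) x 100%.
    def traceability(original_requirements, requirements_met_by_architecture):
        r2 = len(set(original_requirements))
        r1 = len(set(original_requirements) & set(requirements_met_by_architecture))
        return 100.0 * r1 / r2

    original = {"REQ-1", "REQ-2", "REQ-3", "REQ-4"}
    met = {"REQ-1", "REQ-2", "REQ-4"}
    print(traceability(original, met))   # -> 75.0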
This measure is used to quantify the readiness of a software product. Changes from previous baselines to the current baselines are an indication of the current product stability.
SMI = (MT - (Fa + Fc + Fdel))/MT

Where:

SMI = software maturity index
MT = number of software functions (modules) in the current delivery
Fa = number of functions (modules) in the current delivery that are additions to the previous delivery
Fc = number of functions (modules) in the current delivery that include internal changes from the previous delivery
Fdel = number of functions (modules) in the previous delivery that were deleted from the current delivery
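One simple way to derive the primitives is to compare the module lists of two baselines, as in the sketch below (the module names and the set-based derivation are illustrative):

    # Illustrative sketch: software maturity index computed from two baselined module lists.
    def software_maturity_index(previous_modules, current_modules, changed_modules):
        mt = len(current_modules)                                  # modules in the current delivery
        fa = len(set(current_modules) - set(previous_modules))     # added modules
        fdel = len(set(previous_modules) - set(current_modules))   # deleted modules
        fc = len(set(changed_modules) & set(current_modules))      # internally changed modules
        return (mt - (fa + fc + fdel)) / mt

    previous = {"login", "billing", "reports"}
    current = {"login", "billing", "reports", "audit"}
    changed = {"billing"}
    print(software_maturity_index(previous, current, changed))   # -> 0.5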
This measure is used to determine the reliability of a software system resulting from the software architecture under consideration, as represented by a specification based on the entity-relationship-attribute model. What is required is a list of the system's inputs, a list of its outputs, and a list of the functions performed by each program. The mappings from the software architecture to the requirements are identified. Mappings from the same specification item to more than one differing requirement are examined for requirements inconsistency. Additionally, mappings from more than one specification item to a single requirement are examined for specification inconsistency.
This measure is used to determine the structured complexity of a coded module. The use of this measure is designed to limit the complexity of the module, thereby promoting understandability of the module.
C = E - N + 1
Where:
C = complexity
N = number of nodes (sequential groups of program statements)
E = number of edges (program flows between nodes)
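As an illustration, the sketch below counts nodes and edges of a small control-flow graph held as an adjacency list; the E - N + 1 form presumes a strongly connected graph, so an edge from the exit node back to the entry node is included (the representation itself is illustrative):

    # Illustrative sketch: complexity C = E - N + 1 for a strongly connected control-flow graph.
    def complexity(flow_graph):
        """flow_graph: dict mapping each node to the list of nodes it flows to."""
        n = len(flow_graph)
        e = sum(len(successors) for successors in flow_graph.values())
        return e - n + 1

    # A single if/else: entry -> decision -> (then | else) -> exit, plus exit -> entry.
    graph = {
        "entry": ["decision"],
        "decision": ["then", "else"],
        "then": ["exit"],
        "else": ["exit"],
        "exit": ["entry"],   # closing edge that makes the graph strongly connected
    }
    print(complexity(graph))   # -> 2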
This measure is used to determine the simplicity of the detailed design of a software program. The values determined can be used to identify problem areas within the software design.
DSM = Σ Wi Di, for i = 1 to 6
Where:
DSM = design structure measure
P1 = total number of modules in program
P2 = number of modules dependent on input or output
P3 = number of modules dependent on prior processing (state)
P4 = number of database elements
P5 = number of nonunique database elements
P6 = number of database segments
P7 = number of modules not single entrance/single exit
The design structure is the weighted sum of six derivatives determined using the primitives given above.
D1 = design organized top-down
D2 = module dependence (P2/P1)
D3 = modules dependent on prior processing (P3/P1)
D4 = database size (P5/P4)
D5 = database compartmentalization (P6/P4)
D6 = module single entrance/single exit (P7/P1)
The weights (Wi) are assigned by the user based on the priority of each associated derivative. Each Wi has a value between 0 and 1.
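A sketch of the computation with equal weights and made-up primitive counts (the yes/no derivative D1 is encoded here as 0 for a top-down design, which is an assumed convention):

    # Illustrative sketch: DSM as the weighted sum of the six derivatives D1..D6.
    def design_structure_measure(p, weights, top_down=True):
        d = [
            0.0 if top_down else 1.0,   # D1: design organized top-down (assumed 0/1 encoding)
            p["P2"] / p["P1"],          # D2: module dependence
            p["P3"] / p["P1"],          # D3: modules dependent on prior processing
            p["P5"] / p["P4"],          # D4: database size
            p["P6"] / p["P4"],          # D5: database compartmentalization
            p["P7"] / p["P1"],          # D6: module single entrance/single exit (P7/P1)
        ]
        return sum(w * di for w, di in zip(weights, d))

    primitives = {"P1": 40, "P2": 10, "P3": 8, "P4": 100, "P5": 20, "P6": 5, "P7": 4}
    print(design_structure_measure(primitives, weights=[1, 1, 1, 1, 1, 1]))   # ~0.8 with these counts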
This is a measure of the completeness of the testing process, from both a developer's and user's perspective. The measure relates directly to the development, integration, and operational test stages of product development.
Where:
This is a structural (procedural) complexity measure that can be used to evaluate the information flow structure of large-scale systems, the procedure and module information flow structure, the complexity of the interconnections between modules, and the degree of simplicity of relationships between sub-systems. It can also be used to correlate total observed failures and software reliability with data complexity.
Weighted IFC = Length × (Fan-in × Fan-out)²
Where:
The flow of information between modules or sub-systems needs to be determined either through the use of automated techniques or charting mechanisms. A local flow from module A to B exists if one of the following occurs:
This measure is the basic parameter required by most software reliability models. Detailed record keeping of failure occurrences that accurately track time (calendar or execution) at which the faults manifest themselves is essential.
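A minimal sketch of a failure log that preserves both calendar time and execution time (the field names and the use of CPU hours are illustrative):

    from dataclasses import dataclass
    from datetime import datetime

    # Illustrative sketch: one record per observed failure.
    @dataclass
    class FailureRecord:
        failure_id: str
        observed_at: datetime        # calendar time of the failure
        cpu_hours_at_failure: float  # cumulative execution time when the failure occurred
        severity: str

    log = [
        FailureRecord("F-001", datetime(2023, 5, 2, 14, 30), 112.5, "major"),
        FailureRecord("F-002", datetime(2023, 5, 9, 9, 10), 158.0, "minor"),
    ]

    # Inter-failure execution times, the raw input to most reliability growth models.
    intervals = [b.cpu_hours_at_failure - a.cpu_hours_at_failure for a, b in zip(log, log[1:])]
    print(intervals)   # -> [45.5]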
The objective of this measure is to collect information to identify the parts of the software maintenance products that may be inadequate for use in a software maintenance environment. Questionnaires are used to examine the format and content of the documentation and source code attributes from a maintainability perspective.
The questionnaires examine the following product characteristics:
Two questionnaires, the Software Documentation Questionnaire and the Software Source Listing Questionnaire, are used to evaluate the software products in a desk audit.
For the software documentation evaluation, the resource documents should include those that contain the program design specifications, program testing information and procedures, program maintenance information, and guidelines used in the preparation of the documentation. Typical questions from the questionnaire include:
The software source listing evaluation reviews either high-order language or assembler source code. Multiple evaluations using the questionnaire are conducted for the unit level of the program (module). The modules selected should represent a sample size of at least 10 percent of the total source code. Typical questions include:
McCabe's [1976] proposal for a cyclomatic complexity number was the first attempt to objectively quantify the "flow of control" complexity of software.
The metric is computed by decomposing the program into a directed graph that represents its flow of control. The cyclomatic complexity number is then calculated using the following:
V(g) = Edges - Nodes + 2
In its shortened form, the cyclomatic complexity number of a program with a single entry and a single exit is the count of decision points within the program, plus one.
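In that shortened form, the number can be approximated by counting decision keywords in the source text; the crude sketch below does so for Python-like code and is an approximation rather than a full flow-graph analysis:

    import re

    # Crude illustrative sketch: V(g) ~ number of decision points + 1 for single-entry, single-exit code.
    DECISION_KEYWORDS = r"\b(if|elif|for|while|case|and|or)\b"

    def approx_cyclomatic(source_text):
        return len(re.findall(DECISION_KEYWORDS, source_text)) + 1

    sample = """
    if x > 0 and y > 0:
        for item in items:
            process(item)
    """
    print(approx_cyclomatic(sample))   # -> 4 (if, and, for, plus one)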
In the 1970s, Halstead [1977] developed a theory regarding the behavior of software. Some of his findings evolved into software metrics. One of these is referred to as "Effort" or just "E," and is a well-known complexity metric.
The Effort measure is calculated as:
E = Volume/Level
where Volume is a measure of the size of a piece of code and Level is a measure of how "abstract" the program is. The level of abstraction varies from nearly zero (0) for programs with low abstraction to nearly one (1) for programs that are highly abstract.
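Using Halstead's usual primitives (n1 distinct operators, n2 distinct operands, N1 total operators, N2 total operands), Volume and the estimated Level are commonly computed as in the sketch below; the counts in the example are made up:

    import math

    # Illustrative sketch of the standard Halstead computations.
    def halstead_effort(n1, n2, N1, N2):
        vocabulary = n1 + n2
        length = N1 + N2
        volume = length * math.log2(vocabulary)   # program volume
        level = (2 / n1) * (n2 / N2)              # estimated program level, between 0 and 1
        return volume / level                     # Effort E = Volume / Level

    # Hypothetical counts for a small routine.
    print(round(halstead_effort(n1=10, n2=15, N1=40, N2=35)))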
This chapter provides a detailed reference listing for pertinent CM and software engineering metrics.
[Halstead 1977] Halstead, M., Elements of Software Science, Elsevier, New York, 1977.
[IEEE 1989] IEEE Guide for the Use of IEEE Standard Dictionary of Measures to Produce Reliable Software, Standard 982.2-1988, June 12, 1989, IEEE Standards Department, Piscataway, NJ.
[IEEE 1988] IEEE Standard Dictionary of Measures to Produce Reliable Software, Standard 982.1-1988, IEEE Standards Department, Piscataway, NJ.
[McCabe 1976] McCabe, T., "A Complexity Measure," IEEE Transactions on Software Engineering, 308-320, December 1976.
[Pfleeger and McGowan 1990] Pfleeger, S.L. and McGowan, C., "Software Metrics in the Process Maturity Framework," Journal of Systems and Software, 12, 255-261, 1990.
Note: The information contained within the IEEE standards metrics section is copyrighted information of the IEEE, extracted from IEEE Std. 982.1-1988, IEEE Standard Dictionary of Measures to Produce Reliable Software. This information was written within the context of IEEE Std. 982.1-1988 and the IEEE takes no responsibility for or liability for damages resulting from the reader's misinterpretation of said information resulting from the placement and context of this publication. Information is reproduced with the permission of the IEEE.