An Example of Module Design Metrics in Practice | Complexity Metrics and Models

In this section, we describe an analysis of several module design metrics as they relate to defect level, and how such metrics can be used to develop a software quality improvement plan. Special attention is given to the significance of cyclomatic complexity. Data from all program modules of a key component in the AS/400 software system served as the basis of the analysis. The component provides facilities for message control among users, programs, and the operating system. It was written in PL/MI (a PL/1-like language) and has about 70 KLOC. Because the component functions are complex and involve numerous interfaces, the component has consistently experienced high reported error rates from the field. The purpose of the analysis was to produce objective evidence so that data-based plans could be formulated for quality and maintainability improvement.

The metrics in the analysis include:

McCabe's cyclomatic complexity index (CPX).
Fan-in: The number of modules that call a given module (FAN-IN).
Fan-out: The number of modules that are called by a given module. In AS/400 this metric refers to the number of MACRO calls in the module (MAC).
Number of INCLUDES in the module. In AS/400, INCLUDES are used for calls such as subroutines and declarations. The difference between MACRO and INCLUDE is that INCLUDE involves no parameter passing. For this reason, INCLUDES are not counted as fan-out. However, INCLUDES do involve interfaces, especially the common INCLUDES.
Number of design changes and enhancements since the initial release of AS/400 (DCR).
Previous defect history. This metric refers to the number of formal test defects and field defects in the same modules in System/38, the predecessor midrange computer system of AS/400. This component reused most of the modules in System/38. This metric is denoted PTR38 in the analysis.
Defect level in the current system (AS/400). This is the total number of formal test defects and field defects for the latest release when the analysis was done. This metric is denoted DEFS in the analysis.
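The fan-in and fan-out definitions in the list above can be illustrated with a minimal sketch. It uses the general definitions (distinct calling and called modules) rather than the AS/400 convention of counting MACRO calls, and the call graph and module names are hypothetical, invented only for illustration.

```python
from collections import defaultdict


def fan_in_fan_out(calls):
    """Return {module: (fan_in, fan_out)} for a call graph.

    `calls` maps each module name to the collection of modules it calls.
    Fan-out counts distinct callees; fan-in counts distinct callers.
    """
    fan_out = {module: len(set(callees)) for module, callees in calls.items()}
    fan_in = defaultdict(int)
    for caller, callees in calls.items():
        for callee in set(callees):
            fan_in[callee] += 1
    modules = set(calls) | set(fan_in)
    return {m: (fan_in.get(m, 0), fan_out.get(m, 0)) for m in modules}


# Hypothetical call graph: not taken from the AS/400 component.
calls = {
    "SEND": {"FORMAT", "QUEUE"},
    "RECEIVE": {"QUEUE"},
    "FORMAT": {"QUEUE"},
    "QUEUE": set(),
}

metrics = fan_in_fan_out(calls)
for name in sorted(metrics):
    fi, fo = metrics[name]
    print(f"{name:8s} fan-in={fi} fan-out={fo}")
```

In this toy graph the low-level QUEUE module has the largest fan-in and no fan-out, which matches the observation later in the section that modules with large fan-in tend to sit at lower layers of the system structure.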

Our purpose was to explain the variations in defect level among program modules by means of the differences observed in the metrics described earlier. Therefore, DEFS is the dependent variable and the other metrics are the independent variables. The means and standard deviations of all variables in the analysis are shown in Table 11.3. The large mean values of MACRO calls (MAC) and FAN-IN illustrate the complexity of the component. Indeed, as the component provides facilities for message control in the entire operating system, numerous modules in the system have MACRO-call links with many modules of the component. The large standard deviation for FAN-IN also indicates that the chance for significant relationships between fan-in and other variables is slim.

Table 11.4 shows the Pearson correlation coefficients between defect level and other metrics. The high correlations for many factors were beyond expectation. The significant correlations for complexity indexes and MACRO calls support the theory that associates complexity with defect level. McCabe's complexity index measures the complexity within the module. FAN-OUT, or MACRO calls in this case, is an indicator of the complexity between modules.

Table 11.3. Means, Standard Deviations, and Number of Modules

Variable	Mean	Standard Deviation	n
CPX	23.5	23.2	72
FAN-IN	143.5	491.6	74
MAC	61.8	27.4	74
INCLUDES	15.4	9.5	74
DCR	2.7	3.1	75
PTR38	8.7	9.8	63
DEFS	6.5	8.9	75

Table 11.4. Correlation Coefficients Between Defect Level and Other Metrics

Variable	Pearson Correlation	n	Significance (p Value)
CPX	.65	72	.0001
FAN-IN	.02	74	Not significant
MAC	.68	74	.0001
INCLUDES	.65	74	.0001
DCR	.78	75	.0001
PTR38	.87	75	.0001
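To see why a coefficient of .02 is not significant while coefficients above .6 are, one can convert each correlation in Table 11.4 to a t statistic with n - 2 degrees of freedom, using t = r * sqrt(n - 2) / sqrt(1 - r^2). The source does not state which test produced the p values, so the use of this standard test is an assumption; the r and n values are copied from the table.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# (Pearson r, n) pairs as listed in Table 11.4.
table_11_4 = {
    "CPX": (0.65, 72),
    "FAN-IN": (0.02, 74),
    "MAC": (0.68, 74),
    "INCLUDES": (0.65, 74),
    "DCR": (0.78, 75),
    "PTR38": (0.87, 75),
}

t_values = {
    name: correlation_t_statistic(r, n) for name, (r, n) in table_11_4.items()
}
for name, t in t_values.items():
    print(f"{name:9s} t = {t:6.2f}")
```

With roughly 70 degrees of freedom, a t statistic near zero (FAN-IN) gives no evidence of a relationship, whereas values well above 2 correspond to very small p values.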

As expected, the correlation between FAN-IN and DEFS was not significant. Because the standard deviation of FAN-IN is large, this finding is tentative. More focused analysis is needed. Theoretically, modules with a large fan-in are relatively simple and are usually located at lower layers of the system structure. Therefore, fan-in should not positively correlate with defect level. The correlation should either be negative or insignificant, as the present case showed.

The high correlation for module changes and enhancement simply illustrates the fact that the more changes, the more chances for injecting defects. Moreover, small changes are especially error-prone. Because most of the modules in this component were designed and developed for the System/38, changes for AS/400 were generally small.

The correlation between previous defect history and current defect level was the strongest (0.87). This finding confirms the view of the developers that many modules in the component are chronic problem components, and systematic plans and actions are needed for any significant quality improvement.

The calculation of Pearson's correlation coefficient is based on the least-squares method. Because the least-squares method is extremely sensitive to outliers, examination of scatterplots to confirm the correlation is mandatory. Relying on the correlation coefficients alone can sometimes be misleading. The scatter diagram of defect level with McCabe's complexity index is shown in Figure 5.9 in Chapter 5, where we discuss the seven basic quality tools. The diagram appears radiant in shape: low-complexity modules sit at low defect levels, whereas among high-complexity modules more are at high defect levels but some remain at low defect levels. Perhaps the most impressive finding from the diagram is the blank area in the upper left part, confirming the correlation between low complexity and low defect level. As can be seen, there are many modules with a complexity index far beyond McCabe's recommended level of 10, probably due to the high complexity of system programs in general and of the component functions specifically.
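The sensitivity to outliers mentioned above can be demonstrated with a small sketch. The data are hypothetical and chosen only to show the effect: six points with almost no linear relationship, then the same points plus one extreme observation.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data with essentially no linear trend.
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 1, 5, 2]
r_without = pearson_r(x, y)

# Add a single extreme point far from the rest of the data.
r_with = pearson_r(x + [30], y + [40])

print(f"r without outlier: {r_without:.2f}")
print(f"r with outlier:    {r_with:.2f}")
```

A single extreme point is enough to turn a weak correlation into a very strong one, which is why the scatter diagrams need to be inspected before the coefficients are trusted.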

Figure 11.3 shows the scatter diagrams for defect level with MAC, INCLUDE, DCR, and PTR38. The diagrams confirm the correlations. Because the relationships appear linear, the linear regression lines and confidence intervals are also plotted.

Figure 11.3. Scatter Diagrams: DEFS with MAC, INCLUDE, DCR, and PTR38


The extreme data point at the upper right corner of the diagrams represents the best-known module in the component, which formats a display of messages in a queue and sends it to either the screen or the printer. With more than 5,000 lines of source code, it is a highly complex module with a history of many problems.

The next step in our analysis was to look at the combined effect of these metrics on defect level simultaneously. To achieve this task, we used the multiple regression approach. In a multiple regression model, the effect of each independent variable is adjusted for the effects of other variables. In other words, the regression coefficient and the significance level of an independent variable represent the net effect of that variable on the dependent variable, in this case the defect level. We found that in the combined model, MAC and INCLUDE became insignificant. When we excluded them from the model, we obtained the following:

[Equation image: fitted multiple regression model of DEFS on PTR38, DCR, and CPX]

With an R² of 0.83, the model explains 83% of the variation in defect level observed among the program modules and is highly significant. Each of the three independent variables is also significant at the 0.05 level.
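The multiple regression step can be sketched as an ordinary least-squares fit that returns the coefficients and R². This is a minimal pure-Python illustration that solves the normal equations; it is not the tool used in the original analysis. The module data below are hypothetical, so the printed coefficients have no relation to the model reported in the text.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Fit y on the predictors in rows; return (coefficients, r_squared).

    coefficients[0] is the intercept, followed by one slope per predictor.
    """
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical modules: (CPX, DCR, PTR38) per row, with DEFS as the response.
rows = [(5, 0, 1), (12, 1, 3), (20, 2, 4), (35, 1, 10),
        (48, 5, 12), (60, 3, 20), (8, 0, 0), (27, 4, 6)]
defs = [1, 3, 4, 7, 14, 15, 0, 8]

beta, r_squared = fit_ols(rows, defs)
print("intercept, CPX, DCR, PTR38:", [round(b, 3) for b in beta])
print("R^2:", round(r_squared, 3))
```

Each slope is the effect of one metric with the other two held fixed, which is the "net effect" interpretation described above; R² is the share of the variation in DEFS that the fitted values reproduce.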

To verify the findings, we must control for the effect of program size (lines of code). Since LOC is correlated with DEFS and other variables, its effect must be partialled out in order to conclude that there are genuine influences of PTR38, DCR, and CPX on DEFS. To accomplish the task, we did two things: (1) normalized the defect level by LOC and used defects per KLOC (DEFR) as the dependent variable and (2) included LOC as one of the independent variables (control variable) in the multiple regression model. We found that with this control, PTR38, DCR, and CPX were still significant at the 0.1 level. In other words, these factors truly represent something for which the length of the modules cannot account. However, the R² of the model was only 0.20. We contend that this again is due to the wide fluctuation of the dependent variable, the defect rate. The regression coefficients, their standard errors, t values, and the significance levels are shown in Table 11.5.
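The first control, normalizing defects by size, is simple arithmetic: DEFR is the defect count divided by the module size in thousands of lines. The sketch below uses two hypothetical modules to show how the ranking can change once size is taken into account.

```python
def defects_per_kloc(defects, loc):
    """Defect rate (DEFR): defects per thousand lines of code."""
    if loc <= 0:
        raise ValueError("loc must be positive")
    return defects * 1000 / loc


# Hypothetical modules: (name, DEFS, LOC).
modules = [("LARGE", 12, 5000), ("SMALL", 3, 400)]

rates = {name: defects_per_kloc(defs, loc) for name, defs, loc in modules}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} defects per KLOC")
```

Here the large module has four times as many defects but a lower defect rate than the small one, which illustrates why raw defect counts alone cannot separate the influence of size from that of the other metrics.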

This analysis indicates that other than module length, the three most important factors affecting the defect rates of the modules are the number of changes and enhancements, defect history, and complexity level. From the intervention standpoint, since developers have no control over release enhancements, the latter two factors become the best clues for quality improvement actions. The relationships among defect history, complexity, and current defect level are illustrated in Figure 11.4. The best return on investment, then, is to concentrate efforts on modules with high defect history (chronic problem modules) and high complexity.

Figure 11.4. Scatter Diagrams of DEFS, PTR38, and CPX


Table 11.5. Results of Multiple Regression Model of Defect Rate

Variable	Regression Coefficient	Standard Error	t Value	Significance (p Value)
Intercept	4.631	2.813	1.65	.10
CPX	.115	.066	1.73	.09
DCR	1.108	.561	1.98	.05
PTR38	.359	.220	1.63	.10
LOC	-.014	.005	2.99	.004
R²	.20
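The t values in Table 11.5 are, up to rounding, each coefficient divided by its standard error, and this can be checked directly from the table. The sketch below copies the table values, reads the garbled LOC coefficient as -.014 (an assumption), and compares magnitudes only.

```python
# (coefficient, standard error, reported t value) from Table 11.5.
# The LOC coefficient is assumed to be negative; its sign is garbled in the source.
table_11_5 = {
    "Intercept": (4.631, 2.813, 1.65),
    "CPX": (0.115, 0.066, 1.73),
    "DCR": (1.108, 0.561, 1.98),
    "PTR38": (0.359, 0.220, 1.63),
    "LOC": (-0.014, 0.005, 2.99),
}

ratios = {
    name: abs(coef) / std_err for name, (coef, std_err, _) in table_11_5.items()
}
for name, (_, _, reported_t) in table_11_5.items():
    print(f"{name:9s} |coef| / SE = {ratios[name]:.2f}   reported t = {reported_t:.2f}")
```

The first four rows agree with the reported t values to within about 0.01. The LOC row comes out at 2.8 rather than 2.99, which is plausibly because a coefficient and standard error this small lose precision when printed to three decimals.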

Based on the findings from this analysis and other observations, the component team established a quality improvement plan with staged implementation. The following list includes some of the actions related to this analysis:

Scrutinize the several modules with moderate complexity and yet high defect level. Examine module design and code implementation and take proper actions.
Identify high-complexity and chronic problem modules, do intramodule restructuring and cleanup (e.g., better separation of mainline and subroutines, better comments, better documentation in the prologue, removal of dead code, better structure of source statements). The first-stage target is to reduce the complexity of these modules to 35 or lower.
Closely related to the preceding actions, reduce the number of compilation warning messages to zero for all modules.
Include complexity as a key factor in new module design, with the maximum not to exceed 35.
Improve test effectiveness, especially for complex modules. Use test coverage measurement tools to ensure that such modules are adequately covered.
Improve component documentation and education.
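The first two actions in the list amount to sorting modules by complexity, defect history, and current defect level. A minimal sketch of such a triage follows. Only the complexity target of 35 comes from the plan; the other thresholds, the category labels, and the module data are hypothetical, as is the choice to treat a complexity of 35 or lower as "moderate".

```python
from dataclasses import dataclass

COMPLEXITY_TARGET = 35  # first-stage complexity target stated in the plan


@dataclass
class Module:
    name: str
    cpx: int    # McCabe's cyclomatic complexity
    ptr38: int  # defect history in the predecessor system
    defs: int   # current defect level


def plan_action(module, high_history=10, high_defects=10):
    """Classify a module; both threshold defaults are hypothetical."""
    if module.cpx > COMPLEXITY_TARGET and module.ptr38 >= high_history:
        return "restructure"  # high complexity and chronic problems
    if module.cpx <= COMPLEXITY_TARGET and module.defs >= high_defects:
        return "scrutinize"   # moderate complexity yet high defect level
    return "monitor"


# Hypothetical modules, not taken from the component.
modules = [
    Module("A", cpx=60, ptr38=25, defs=30),
    Module("B", cpx=20, ptr38=4, defs=15),
    Module("C", cpx=12, ptr38=1, defs=0),
]

actions = {m.name: plan_action(m) for m in modules}
print(actions)
```

In practice the thresholds would be chosen from the component's own distributions of PTR38 and DEFS rather than fixed in advance.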

Since the preceding analysis was conducted, the component team has been making consistent improvements according to its quality plan. Field data from new releases indicate significant improvement in the component's quality.
