In this section, we describe an analysis of several module design metrics as they relate to defect level, and how such metrics can be used to develop a software quality improvement plan. Special attention is given to the significance of cyclomatic complexity. Data from all program modules of a key component in the AS/400 software system served as the basis of the analysis. The component provides facilities for message control among users, programs, and the operating system. It was written in PL/MI (a PL/1-like language) and contains about 70 KLOC. Because the component's functions are complex and involve numerous interfaces, the component has consistently experienced high reported error rates from the field. The purpose of the analysis was to produce objective evidence so that data-based plans could be formulated for quality and maintainability improvement.
The metrics in the analysis include:

- DEFS: the number of defects found in the module
- CPX: McCabe's cyclomatic complexity index of the module
- FAN-IN: the number of modules that call the module
- MAC: the number of MACRO calls in the module (an indicator of fan-out)
- INCLUDES: the number of includes in the module
- DCR: the number of design changes and enhancements to the module
- PTR38: the module's previous defect history (problem tracking reports from System/38)
Our purpose was to explain the variations in defect level among program modules by means of the differences observed in the metrics described earlier. Therefore, DEFS is the dependent variable and the other metrics are the independent variables. The means and standard deviations of all variables in the analysis are shown in Table 11.3. The large mean values of MACRO calls (MAC) and FAN-IN illustrate the complexity of the component. Indeed, as the component provides facilities for message control in the entire operating system, numerous modules in the system have MACRO-call links with many modules of the component. The large standard deviation for FAN-IN also indicates that the chance for significant relationships between fan-in and other variables is slim.
Table 11.4 shows the Pearson correlation coefficients between defect level and the other metrics. The high correlations for many of the factors exceeded our expectations. The significant correlations for the complexity index and MACRO calls support the theory that associates complexity with defect level. McCabe's complexity index measures the complexity within the module; FAN-OUT, or MACRO calls in this case, is an indicator of the complexity between modules.
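For reference, the Pearson correlation coefficient between a metric $x$ and the defect level $y$, computed over the $n$ modules for which both values are available, is

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

where $\bar{x}$ and $\bar{y}$ are the sample means.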
Table 11.3. Means, Standard Deviations, and Number of Modules

| Variable | Mean | Standard Deviation | n |
|---|---|---|---|
| CPX | 23.5 | 23.2 | 72 |
| FAN-IN | 143.5 | 491.6 | 74 |
| MAC | 61.8 | 27.4 | 74 |
| INCLUDES | 15.4 | 9.5 | 74 |
| DCR | 2.7 | 3.1 | 75 |
| PTR38 | 8.7 | 9.8 | 63 |
| DEFS | 6.5 | 8.9 | 75 |
Table 11.4. Correlation Coefficients Between Defect Level and Other Metrics

| Variable | Pearson Correlation | n | Significance (p Value) |
|---|---|---|---|
| CPX | .65 | 72 | .0001 |
| FAN-IN | .02 | 74 | Not significant |
| MAC | .68 | 74 | .0001 |
| INCLUDES | .65 | 74 | .0001 |
| DCR | .78 | 75 | .0001 |
| PTR38 | .87 | 75 | .0001 |
As expected, the correlation between FAN-IN and DEFS was not significant. Because the standard deviation of FAN-IN is large, however, this finding is tentative and more focused analysis is needed. Theoretically, modules with a large fan-in are relatively simple and are usually located at lower layers of the system structure. Therefore, fan-in should not correlate positively with defect level; the correlation should be either negative or insignificant, as the present case showed.
The high correlation for module changes and enhancements (DCR) simply illustrates the fact that the more changes, the more chances for injecting defects. Moreover, small changes are especially error-prone. Because most of the modules in this component were designed and developed for the System/38, changes for the AS/400 were generally small.
The correlation between previous defect history and current defect level was the strongest (0.87). This finding confirms the view of the developers that many modules in the component are chronic problem modules, and that systematic plans and actions are needed for any significant quality improvement.
The calculation of Pearson's correlation coefficient is based on the least-squares method. Because the least-squares method is extremely sensitive to outliers, examination of scatterplots to confirm the correlation is mandatory; relying on the correlation coefficients alone can be misleading. The scatter diagram of defect level with McCabe's complexity index is shown in Figure 5.9 in Chapter 5, where we discuss the seven basic quality tools. The diagram fans out in a radiant shape: low-complexity modules cluster at low defect levels, whereas high-complexity modules, though more often at high defect levels, also include some with low defect levels. Perhaps the most impressive finding from the diagram is the blank area in the upper left part, confirming the association between low complexity and low defect level. As can be seen, many modules have a complexity index far beyond McCabe's recommended level of 10, probably due to the high complexity of system programs in general and of this component's functions in particular.
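To make this check concrete, the following is a minimal Python sketch (using pandas, SciPy, and matplotlib; the file name module_metrics.csv and its column layout are hypothetical, and this is not the tooling used in the original study) that computes correlations in the style of Table 11.4 and draws the scatterplots that should always accompany them:

```python
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

# Hypothetical input: one row per module, columns named after the metrics.
df = pd.read_csv("module_metrics.csv")

predictors = ["CPX", "FAN-IN", "MAC", "INCLUDES", "DCR", "PTR38"]

for var in predictors:
    # Use pairwise-complete observations, as in Table 11.4 (n varies by metric).
    pair = df[[var, "DEFS"]].dropna()
    r, p = stats.pearsonr(pair[var], pair["DEFS"])
    print(f"{var:<8}  r = {r:+.2f}  n = {len(pair)}  p = {p:.4f}")

    # Least squares is sensitive to outliers, so always inspect the scatterplot.
    plt.scatter(pair[var], pair["DEFS"], s=12)
    plt.xlabel(var)
    plt.ylabel("DEFS")
    plt.title(f"DEFS vs. {var}")
    plt.show()
```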
Figure 11.3 shows the scatter diagrams for defect level with MAC, INCLUDES, DCR, and PTR38. The diagrams confirm the correlations. Because the relationships appear linear, the linear regression lines and confidence intervals are also plotted.
Figure 11.3. Scatter Diagrams: DEFS with MAC, INCLUDES, DCR, and PTR38
The extreme data point at the upper right corner of the diagrams represents the best-known module in the component, which formats a display of messages in a queue and sends it to either the screen or a printer. With more than 5,000 lines of source code, it is a highly complex module with a history of many problems.
The next step in our analysis was to look at the combined effect of these metrics on defect level simultaneously. To achieve this, we used the multiple regression approach. In a multiple regression model, the effect of each independent variable is adjusted for the effects of the other variables. In other words, the regression coefficient and the significance level of an independent variable represent the net effect of that variable on the dependent variable, in this case the defect level. We found that in the combined model, MAC and INCLUDES become insignificant. When we excluded them from the model, we obtained the following:
With an R² of 0.83, the model is highly significant, and each of the three independent variables is significant at the 0.05 level. In other words, the model explains 83% of the variation in defect level observed among the program modules.
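As an illustration of the approach, and not the original computation, a model of this form can be fit with statsmodels in Python; the hypothetical module_metrics.csv from the earlier sketch is assumed:

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical data file; one row per module.
df = pd.read_csv("module_metrics.csv").dropna(subset=["DEFS", "PTR38", "DCR", "CPX"])

# Ordinary least squares: DEFS regressed on PTR38, DCR, and CPX.
X = sm.add_constant(df[["PTR38", "DCR", "CPX"]])  # add_constant supplies the intercept
model = sm.OLS(df["DEFS"], X).fit()

print(model.summary())          # coefficients, standard errors, t values, p values
print("R-squared:", model.rsquared)
```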
To verify the findings, we must control for the effect of program size (lines of code). Since LOC is correlated with DEFS and the other variables, its effect must be partialled out before we can conclude that PTR38, DCR, and CPX have genuine influences on DEFS. To accomplish this, we did two things: (1) normalized the defect level by LOC and used defects per KLOC (DEFR) as the dependent variable, and (2) included LOC as one of the independent variables (a control variable) in the multiple regression model. We found that with this control, PTR38, DCR, and CPX were still significant at the 0.1 level. In other words, these factors truly represent something that the length of the modules cannot account for. However, the R² of the model was only 0.20. We contend that this again is due to the wide fluctuation of the dependent variable, the defect rate. The regression coefficients, their standard errors, t values, and significance levels are shown in Table 11.5.
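The two-step size control can be sketched in the same way; again the data file and column names are assumptions, not the study's actual setup:

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("module_metrics.csv").dropna(
    subset=["DEFS", "LOC", "PTR38", "DCR", "CPX"])
df = df[df["LOC"] > 0]  # guard against division by zero

# Step 1: normalize the defect level by size -- defects per KLOC.
df["DEFR"] = df["DEFS"] / (df["LOC"] / 1000.0)

# Step 2: include LOC itself as a control variable in the model.
X = sm.add_constant(df[["PTR38", "DCR", "CPX", "LOC"]])
rate_model = sm.OLS(df["DEFR"], X).fit()
print(rate_model.summary())     # compare with Table 11.5
```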
This analysis indicates that, other than module length, the three most important factors affecting the defect rates of the modules are the number of changes and enhancements, defect history, and complexity level. From the intervention standpoint, since developers have no control over release enhancements, the latter two factors become the best clues for quality improvement actions. The relationships among defect history, complexity, and current defect level are illustrated in Figure 11.4. The best return on investment, then, is to concentrate efforts on modules with a high defect history (chronic problem modules) and high complexity.
Figure 11.4. Scatter Diagrams of DEFS, PTR38, and CPX
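One simple way to operationalize this targeting is to flag modules that fall in the upper range of both defect history and complexity. The sketch below uses hypothetical upper-quartile cutoffs; in practice the thresholds would be a team decision:

```python
import pandas as pd

df = pd.read_csv("module_metrics.csv")

# Hypothetical cutoffs: the upper quartile of each metric.
ptr_cut = df["PTR38"].quantile(0.75)
cpx_cut = df["CPX"].quantile(0.75)

# Chronic problem modules: high defect history AND high complexity.
targets = df[(df["PTR38"] >= ptr_cut) & (df["CPX"] >= cpx_cut)]
print(targets.sort_values("PTR38", ascending=False)[["PTR38", "CPX", "DEFS"]])
```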
Table 11.5. Results of Multiple Regression Model of Defect Rate

| Variable | Regression Coefficient | Standard Error | t Value | Significance (p Value) |
|---|---|---|---|---|
| Intercept | 4.631 | 2.813 | 1.65 | .10 |
| CPX | .115 | .066 | 1.73 | .09 |
| DCR | 1.108 | .561 | 1.98 | .05 |
| PTR38 | .359 | .220 | 1.63 | .10 |
| LOC | −.014 | .005 | 2.99 | .004 |
| R² | .20 | | | |
Based on the findings from this analysis and other observations, the component team established a quality improvement plan with staged implementation. The following list includes some of the actions related to this analysis:
Since the preceding analysis was conducted, the component team has been making consistent improvements according to its quality plan. Field data from new releases indicate significant improvement in the component's quality.