The lines of code (LOC) count usually covers executable statements only; strictly speaking, it is a count of instruction statements. The interchangeable use of the two terms apparently originated with Assembler programs, in which a line of code and an instruction statement are the same thing. Because the LOC count represents program size and complexity, it is no surprise that the more lines of code a program has, the more defects are expected. More intriguingly, researchers found that defect density (defects per KLOC) is also significantly related to LOC count. Early studies pointed to a negative relationship: the larger the module, the lower the defect rate. For instance, Basili and Perricone (1984) examined FORTRAN modules, most of them with fewer than 200 lines of code, and found higher defect density in the smaller modules. Shen and colleagues (1985) studied software written in Pascal, PL/S, and Assembly language and found that an inverse relationship held up to about 500 lines. Since larger modules are generally more complex, a lower defect rate is somewhat counterintuitive. The usual interpretation of this finding rests on interface errors: the number of interface errors is more or less constant regardless of module size, so smaller modules show higher error density because their denominators are smaller.
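The interface-error explanation can be made concrete with a little arithmetic. The sketch below assumes that each module carries a fixed number of interface defects plus a number of other defects proportional to its size. The function name and both parameter values are hypothetical, chosen only for illustration; they are not figures from the studies cited above.

```python
def defect_density(loc, interface_defects=0.1, per_kloc_rate=0.5):
    """Defects per KLOC under the constant-interface-error assumption.

    interface_defects: assumed fixed number of interface defects per module
                       (hypothetical value).
    per_kloc_rate:     assumed size-proportional defects per KLOC
                       (hypothetical value).
    """
    if loc <= 0:
        raise ValueError("loc must be positive")
    kloc = loc / 1000.0
    total_defects = interface_defects + per_kloc_rate * kloc
    # Dividing the fixed interface term by a small size inflates the density.
    return total_defects / kloc


if __name__ == "__main__":
    for loc in (50, 100, 200, 500):
        print(f"{loc:>4} LOC: {defect_density(loc):.2f} defects/KLOC")
```

Under these assumptions the density equals the fixed term divided by size plus a constant, so it can only fall as modules grow. The model therefore accounts for the inverse relationship reported in the early studies, but not for any increase at very large sizes.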
More recent studies point to a curvilinear relationship between lines of code and defect rate: defect density decreases with size and then curves up again at the tail when the modules become very large. For instance, Withrow (1990) studied modules written in Ada for a large project at Unisys and confirmed the concave relationship between defect density (during formal test and integration phases) and module size (Table 11.1). Specifically, of 362 modules with a wide range in size (from fewer than 63 lines to more than 1,000), Withrow found the lowest defect density in the category of about 250 lines. The rising tail is readily explained: when module size becomes very large, the complexity increases to a level beyond a programmer's immediate span of control and total comprehension. This finding is also consistent with the earlier studies, which did not address the defect density of very large modules.
Experience from AS/400 development also lends support to the curvilinear model. In the example in Figure 11.1, although the concave pattern is not as pronounced as that in Withrow's study, the rising tail is still evident.
Figure 11.1. Curvilinear Relationship Between Defect Rate and Module Size (AS/400 data)
The curvilinear model between size and defect density sheds new light on software quality engineering. It implies that there may be an optimal program size that leads to the lowest defect rate. Such an optimum may depend on language, project, product, and environment; many more empirical investigations are clearly needed. Nonetheless, when an empirical optimum is derived by reasonable methods (e.g., based on the previous release of the same product, or on a similar product by the same development group), it can be used as a guideline for new module development.
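One way to derive such an empirical optimum is to group the modules of a previous release into size categories and compare defect densities across categories. The following is a minimal sketch, not a method prescribed by the studies above. It reuses the category upper bounds of Table 11.1 as bin edges, computes density as total defects over total lines within each category (an assumption about how the averaging is done), and runs on a small set of invented (LOC, defects) pairs. All function names and data values are hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the size categories, taken from Table 11.1 (Withrow, 1990).
BIN_UPPER_BOUNDS = [63, 100, 158, 251, 398, 630, 1000]


def bin_label(loc):
    """Return the size category of a module: its upper bound, or '>1000'."""
    i = bisect_left(BIN_UPPER_BOUNDS, loc)
    return str(BIN_UPPER_BOUNDS[i]) if i < len(BIN_UPPER_BOUNDS) else ">1000"


def density_by_size(modules):
    """Aggregate (loc, defects) pairs into defects per KLOC per category."""
    totals = {}
    for loc, defects in modules:
        label = bin_label(loc)
        loc_sum, defect_sum = totals.get(label, (0, 0))
        totals[label] = (loc_sum + loc, defect_sum + defects)
    return {label: 1000.0 * d / l for label, (l, d) in totals.items()}


def lowest_density_category(modules):
    """Return the size category with the lowest aggregate defect density."""
    densities = density_by_size(modules)
    return min(densities, key=densities.get)


if __name__ == "__main__":
    # Invented data for illustration only: (lines of code, defects found).
    previous_release = [
        (40, 1), (60, 0), (90, 1), (150, 1), (240, 0),
        (250, 1), (380, 1), (600, 2), (900, 2), (1500, 4),
    ]
    for label, density in density_by_size(previous_release).items():
        print(f"up to {label:>5} lines: {density:5.2f} defects/KLOC")
    print("lowest density in category:", lowest_density_category(previous_release))
```

With real data, categories that hold only a handful of modules give unstable densities, so the resulting optimum is better read as a rough guideline than as a precise target.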
Table 11.1. Curvilinear Relationship Between Defect Rate and Module Size (Withrow, 1990)

| Maximum Source Lines of Modules | Average Defects per 1,000 Source Lines |
|---|---|
| 63 | 1.5 |
| 100 | 1.4 |
| 158 | 0.9 |
| 251 | 0.5 |
| 398 | 1.1 |
| 630 | 1.9 |
| 1000 | 1.3 |
| >1000 | 1.4 |