Structure Metrics | Complexity Metrics and Models

Lines of code, Halstead's software science, McCabe's cyclomatic complexity, and other metrics that measure module complexity assume that each program module is a separate entity. Structure metrics try to take into account the interactions between modules in a product or system and quantify such interactions. Many approaches in structure metrics have been proposed. Some good examples include invocation complexity by McClure (1978), system partitioning measures by Belady and Evangelisti (1981), information flow metrics by Henry and Kafura (1981), and stability measures by Yau and Collofello (1980). Many of these metrics and models, however, are yet to be verified by empirical data from software development projects.

Perhaps the most common design structure metrics are the fan-in and fan-out metrics, which are based on the ideas of coupling proposed by Yourdon and Constantine (1979) and Myers (1978):

Fan-in: A count of the modules that call a given module
Fan-out: A count of modules that are called by a given module

In general, modules with a large fan-in are relatively small and simple, and are usually located at the lower layers of the design structure. In contrast, modules that are large and complex are likely to have a small fan-in. Therefore, modules or components that have a large fan-in and large fan-out may indicate a poor design. Such modules have probably not been decomposed correctly and are candidates for re-design. From the complexity and defect point of view, modules with a large fan-in are expected to have negative or insignificant correlation with defect levels, and modules with a large fan-out are expected to have a positive correlation. In the AS/400 experience, we found a positive correlation between fan-out and defect level, and no correlation between fan-in and defects. However, the standard deviations of fan-in and fan-out were quite large in our data. Therefore, our experience was inconclusive.
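The fan-in and fan-out counts defined above can be derived mechanically from a call graph. The following is a minimal sketch, not a tool from the original study: the `fan_metrics` function, the dictionary representation of the call graph, and the module names are all hypothetical, and the choice to count each caller/callee pair once and to ignore self-calls is an assumption.

```python
def fan_metrics(calls):
    """Return (fan_in, fan_out) dictionaries for a call graph.

    `calls` maps each module name to the modules it calls (hypothetical
    input format). Duplicate calls and self-calls are ignored, which is
    an assumption of this sketch rather than part of the definitions.
    """
    # Collect every module that appears as a caller or as a callee.
    modules = set(calls)
    for callees in calls.values():
        modules.update(callees)

    fan_in = {m: 0 for m in modules}
    fan_out = {m: 0 for m in modules}
    for caller, callees in calls.items():
        distinct = set(callees) - {caller}
        fan_out[caller] = len(distinct)   # modules called by `caller`
        for callee in distinct:
            fan_in[callee] += 1           # modules that call `callee`
    return fan_in, fan_out


# Hypothetical design: a small utility module (`log`) called from everywhere.
calls = {
    "main": ["parse", "report", "log"],
    "parse": ["log"],
    "report": ["format", "log"],
    "format": ["log"],
}
fan_in, fan_out = fan_metrics(calls)
for name in sorted(fan_in):
    print(name, "fan-in:", fan_in[name], "fan-out:", fan_out[name])
```

In this made-up example, `log` has the largest fan-in and no fan-out, which matches the pattern of small, simple modules sitting at the lower layers of the design structure.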

Henry and Kafura's structure complexity is defined as:

$$C_p = (\text{fan-in} \times \text{fan-out})^2$$

In an attempt to incorporate the module complexity and structure complexity, Henry and Selig's work (1990) defines a hybrid form of their information-flow metric as

$$HC_p = C_{ip} \times (\text{fan-in} \times \text{fan-out})^2$$

where $C_{ip}$ is the internal complexity of procedure p, which can be measured by any module complexity metric, such as McCabe's cyclomatic complexity.
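As a rough illustration, the two information-flow measures can be computed as follows. This sketch assumes the commonly cited forms, structure complexity = (fan-in × fan-out)² and hybrid complexity = internal complexity × (fan-in × fan-out)²; the function names and the sample values are hypothetical.

```python
def structure_complexity(fan_in, fan_out):
    """Structure complexity of a procedure, assumed to be
    (fan-in * fan-out) squared."""
    return (fan_in * fan_out) ** 2


def hybrid_complexity(internal, fan_in, fan_out):
    """Hybrid information-flow complexity: the procedure's internal
    complexity (for example, its cyclomatic complexity) weighted by
    its structure complexity."""
    return internal * structure_complexity(fan_in, fan_out)


# Hypothetical procedure: called by 3 modules, calls 2 modules,
# with a cyclomatic complexity of 5.
print(structure_complexity(3, 2))   # (3 * 2) ** 2 = 36
print(hybrid_complexity(5, 3, 2))   # 5 * 36 = 180
```

Because the fan-in and fan-out product is squared, a procedure that is both widely called and calls many others dominates the measure even when its internal complexity is modest.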

Based on various approaches to structure complexity and module complexity measures, Card and Glass (1990) developed a system complexity model

$$C_t = S_t + D_t$$

where

$C_t$ = System complexity

$S_t$ = Structural (intermodule) complexity

$D_t$ = Data (intramodule) complexity

They defined relative system complexity as

$$C = \frac{C_t}{n} = \frac{S_t}{n} + \frac{D_t}{n}$$

where n is the number of modules in the system.

Structure complexity is further defined as

$$S = \frac{\sum_{i=1}^{n} f^2(i)}{n}$$

where

$S$ = Structural complexity

$f(i)$ = Fan-out of module i

$n$ = Number of modules in system

and data complexity is further defined as

$$D_i = \frac{V(i)}{f(i) + 1}$$

where

$D_i$ = Data complexity of module i

$V(i)$ = I/O variables in module i

$f(i)$ = Fan-out of module i

Overall data complexity is then defined as

$$D = \frac{\sum_{i=1}^{n} D_i}{n}$$

where

$D$ = Data (intramodule) complexity

$D_i$ = Data complexity of module i

$n$ = Number of new modules in system

Simply put, according to Card and Glass (1990), system complexity is the sum of structural (intermodule) complexity and overall data (intramodule) complexity. Structural complexity is defined as the mean (per module) of the squared fan-out values. This definition is based on findings in the literature that fan-in is not an important complexity indicator and that complexity increases as the square of the connections between programs (fan-out). The data (intramodule) complexity of a module is defined as a function that is directly dependent on the number of I/O variables and inversely dependent on the module's fan-out. The rationale is that the more I/O variables a module has, the more functionality it must accomplish and, therefore, the higher its internal complexity. Conversely, more fan-out means that functionality is deferred to modules at lower levels, so the internal complexity of the module is reduced. Finally, overall data complexity is defined as the average data complexity of all new modules. In Card and Glass's model, only new modules enter the formula because oftentimes the entire system consists of reused modules, which have been designed, used, aged, and stabilized in terms of reliability and quality.
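The verbal description above can be turned into a short calculation. The sketch below assumes structural complexity is the mean of squared fan-out over all modules, the data complexity of a module is its I/O variable count divided by (fan-out + 1), and overall data complexity is averaged over new modules only. The `Module` class, the function names, and the sample figures are hypothetical, and returning 0.0 when a system has no new modules is a choice of this sketch, not something the model specifies.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    fan_out: int         # number of modules this module calls
    io_vars: int         # number of I/O variables in the module
    is_new: bool = True  # reused modules are excluded from data complexity


def structural_complexity(modules):
    """Mean of squared fan-out over all modules (assumes a non-empty list)."""
    return sum(m.fan_out ** 2 for m in modules) / len(modules)


def module_data_complexity(m):
    """I/O variables divided by (fan-out + 1)."""
    return m.io_vars / (m.fan_out + 1)


def data_complexity(modules):
    """Average data complexity over new modules only."""
    new = [m for m in modules if m.is_new]
    if not new:
        return 0.0  # assumption: no new modules contributes nothing
    return sum(module_data_complexity(m) for m in new) / len(new)


def relative_system_complexity(modules):
    """Structural plus overall data complexity."""
    return structural_complexity(modules) + data_complexity(modules)


# Hypothetical four-module system with one reused module.
example = [
    Module("A", fan_out=3, io_vars=8),
    Module("B", fan_out=1, io_vars=4),
    Module("C", fan_out=0, io_vars=6, is_new=False),
    Module("D", fan_out=0, io_vars=2),
]
print(structural_complexity(example))       # (9 + 1 + 0 + 0) / 4 = 2.5
print(data_complexity(example))             # (2.0 + 2.0 + 2.0) / 3 = 2.0
print(relative_system_complexity(example))  # 2.5 + 2.0 = 4.5
```

Note how module A, with the highest fan-out, drives the structural term, while its data complexity is held down because its eight I/O variables are divided by fan-out + 1.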

In a study of eight software projects, Card and Glass found that the system complexity measure was significantly correlated with subjective quality assessment by a senior development manager and with development error rate. Specifically, the correlation between system complexity and development defect rate was 0.83, with complexity accounting for fully 69% of the variation in error rate. The regression formula thus derived was

$$\text{Error rate} = -5.2 + 0.4 \times \text{Complexity}$$

In other words, each unit increase in system complexity increases the error rate by 0.4 (errors per thousand lines of code).
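Two pieces of arithmetic lie behind these statements. The share of variation explained is the square of the correlation coefficient, and the slope of 0.4 scales any difference in complexity into a difference in error rate. The five-unit difference used below is a hypothetical value chosen only for illustration.

```latex
% Variance explained is the squared correlation coefficient
r^2 = 0.83^2 = 0.6889 \approx 69\%

% Effect of a hypothetical 5-unit difference in system complexity
\Delta(\text{error rate}) = 0.4 \times \Delta(\text{complexity})
                          = 0.4 \times 5
                          = 2.0 \ \text{errors per thousand lines of code}
```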

The Card and Glass model appears quite promising and has an appeal to software development practitioners. Card and Glass also provide guidelines on achieving a low-complexity design. When more validation studies become available, the Card and Glass model and related methods may gain greater acceptance in the software development industry.

While Card and Glass's model is for the system level, the system values of the metrics in the model are aggregates (averages) of module-level data. Therefore, it is feasible to correlate these metrics to defect level at the module level. The meanings of the metrics at the module level are as follows:

$D_i$ = data complexity of module i, as defined earlier
$S_i$ = structural complexity of module i, that is, a measure of the module's interaction with other modules
$C_i = S_i + D_i$ = the module's contribution to overall system complexity

In Troster's study (1992) discussed earlier, data at the module level for Card and Glass's metrics are also available. It would be interesting to compare these metrics with McCabe's cyclomatic complexity with regard to their correlation with defect rate. Not unexpectedly, the rank-order correlation coefficients for these metrics are very similar to that for McCabe's metric (0.27). Specifically, the coefficients are 0.28 for $D_i$, 0.19 for $S_i$, and 0.27 for $C_i$. More research in this area will certainly yield more insights into the relationships of various design and module metrics and their predictive power in terms of software quality.
