Statistical Process Control for Software Development Processes

Statistical process control is a mainstay in manufacturing but is not yet common in software development. This is not because software architects and programmers lack a background in mathematics and statistics; most have one. Rather, the software development process does not naturally produce the kind of high-volume time-series data that is a natural side effect of the manufacturing assembly line and of all continuous industrial processes. Here, the software development process means any process or subprocess used by a software project or organization. The term thus applies to any identifiable activity that may be undertaken to produce a software system or product. This includes planning, estimating, designing, coding, testing, inspecting, reviewing, measuring, and controlling, as well as the subtasks that make up any of these undertakings.^[6]

Software process management is about managing the work processes involved in designing, developing, deploying, maintaining, and supporting both software products and today's growing array of software-intensive or "digital" systems. The time-honored concept of a controlled process dates back to Walter Shewhart in 1931. He defined a process as being in control when, by using past experience, you can predict within limits how the process may be expected to perform in the future. Controlled processes are stable processes, and process stability lets you predict results. If a controlled process cannot meet changing customer requirements or industry standards, or if it no longer meets business objectives, it must be changed or improved.

The four key goals of process management are process definition, process measurement, process control, and process improvement. Deming taught that these goals should be pursued iteratively, using his famous Plan, Do, Check, Act (PDCA) approach, which has been so successful in manufacturing process improvement. Figure 15.1 is adapted from Florac and Carleton^[7] but is familiar to every practitioner of manufacturing process improvement.

Our goal here is to apply the rich history of manufacturing process management to software development. The first step in doing so is to discover the measurable product and process characteristics that are important in enterprise business software development. These are presented in Table 15.1, also adapted from Florac and Carleton.^[8] This static table suggests a dynamic framework for measuring process behavior, which involves the following:

Clarifying business goals as they relate to software product development
Identifying and prioritizing critical issues that inhibit goal attainment
Selecting metrics (see Chapter 3) that characterize the product under development
Making process measurements by collecting data
Using the data to analyze process behavior
Evaluating process performance

Figure 15.1. The Process of Process Management

W. A. Florac and A. D. Carleton, Measuring the Software Process, p. 6, Fig. 1.2. © Pearson Education, Inc. Reprinted by permission of Pearson Education, Inc. All rights reserved.

Table 15.1. Measurable Characteristics for Software Development^[a]
Resources	Activities	Consumed	Retained	Products
Products and by-products from other processes; Ideas, concepts; Resources: People, Facilities, Tools, Raw materials, Energy, Money, Time	Processes and controllers: Requirements analysis, Designing, Coding, Testing, Configuration control, Change control, Problem management, Reviewing, Inspecting, Integrating	Resources: Effort, Raw materials, Energy, Money, Time	People, Facilities, Tools, Materials, Work in process, Data, Knowledge, Experience	Products: Requirements, Specifications, Designs, Units, Modules, Test cases, Test results, Tested components, Documentation
Guidelines and directions: Policies, Procedures, Goals, Constraints, Rules, Laws, Regulations, Training, Instructions	Flow paths: Product paths, Resource paths, Data paths, Control paths; Buffers and dampers: Queues, Stacks, Bins			Defects, Defect reports, Change requests, Data, Acquired materials, Other artifacts
				By-products: Knowledge, Experience, Skills, Process improvements, Data, Goodwill, Satisfied customers

^[a] W. A. Florac and A. D. Carleton, Measuring the Software Process, p. 27, © Pearson Education, Inc. Adapted by permission of Pearson Education, Inc. All rights reserved.

Such a framework is familiar to software architects and project managers, but they usually pursue it from a qualitative rather than quantitative perspective. This chapter is intended to give the software developer some new, more quantitative tools for managing the software development process.
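
To make Shewhart's idea of predicting "within limits" concrete, here is a minimal sketch in Python of one common way to compute such limits: the individuals (XmR) chart. The choice of chart, the function names, and the inspection data are all assumptions made for illustration; they are not taken from the source.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower, center, upper) natural process limits for an
    individuals (XmR) chart built from a list of observations."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    # 2.66 is the standard XmR constant 3 / d2, with d2 = 1.128 for ranges of size 2.
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control(values, baseline):
    """Return (index, value) pairs in `values` that fall outside the
    limits computed from `baseline` (past experience)."""
    lower, _, upper = xmr_limits(baseline)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical data: defects found in each weekly code inspection.
baseline = [12, 15, 11, 14, 13, 16, 12, 14]
new_weeks = [13, 15, 29, 12]

lower, center, upper = xmr_limits(baseline)
print(f"expected range: {lower:.2f} to {upper:.2f} (center {center:.2f})")
print("signals:", out_of_control(new_weeks, baseline))
```

With this made-up baseline, the third new week (29 defects) falls above the upper limit and would be flagged for investigation, while the other weeks are consistent with past experience.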

The critical issue in setting up a measurable software process that will be amenable to statistical analysis is identifying measurable characteristics or attributes and selecting the measures. Because the characteristics and their measures differ from project to project, depending on the application and its requirements, we will not attempt to employ all the measures given in Chapter 3 here. Instead, we will define a typical set of measures and use them in an example to give you an overall appreciation of a quantitative framework for software process development. Table 15.2 gives the major criteria for measures to characterize software process performance. Table 15.3 lists a large number of measurable attributes of software development processes as an example only (not all of them are recommended for every project a development group undertakes).

Table 15.2. Criteria for Software Performance Measures^[b]
Criterion	Selection Guidelines and Examples
Measures should relate closely to the issue under study.	Quality, resource requirements, time to deliver.
They should have high information content.	Attributes are sensitive to process results.
They should pass the reality test.	They should reflect the degree to which the process achieves results.
They should permit easy collection of data.	Choose attributes that have numerical measures that are readily available, consistent, and well-defined.
They should show measurable variation.	A numerical measure that never changes provides no information.
They should have diagnostic value.	For example, they should not only identify unusual occurrences but also indicate their causes.

^[b] W. A. Florac and A. D. Carleton, Measuring the Software Process, p. 28, © Pearson Education, Inc. Adapted by permission of Pearson Education, Inc. All rights reserved.

Table 15.3. Measurable Attributes of Software Development Processes^[c]
Givens	Activities	Utilized	Results
Changes: Type, Date, Size, Number received; Requirements: Requirements stability, Number identified, Percentage traced to design, Percentage traced to code	Flow charts: Processing time, Throughput rates, Diversions, Delays, Backlogs; Length, size: Queues, Buffers, Stacks	Effort: Number of development hours, Number of rework hours, Number of support hours, Number of preparation hours; Time: Number of meeting hours	Status of work units: Number designed, Number coded, Number tested; Size of work units: Number of requirements, Number of function points, Number of lines of code, Number of modules, Number of objects, Number of bytes in database
Problem reports: Type, Date, Size, Origin, Severity, Number received; Funds: Money, Budget, Status; People: Years of experience, Type of education, Percentage trained in XYZ, Employment codes; Facilities and environment: Square feet per employee, Noise level, Lighting, Number of staff in office or cubicles, Investments in tools per employee, Hours of computer usage, Percentage of capacity utilized		Start time or date, Ending time or date, Duration of process or task, Wait time; Money: Cost to date, Cost of variance, Cost of rework	Output quantity: Number of action items, Number of approvals, Number of defects found; Test results: Number of test cases passed, Percentage of test coverage; Program architecture: Fan-in, Fan-out; Changes: Type, Date, Size, Effort expended; Problems and defects: Number of reports, Defect density, Type, Origin, Distribution by type, Distribution by origin, Number open, Number closed; Critical resource utilization: Percentage of memory utilized, Percentage of CPU capacity utilized, Percentage of I/O capacity utilized

^[c] W. A. Florac and A. D. Carleton, Measuring the Software Process, p. 28, © Pearson Education, Inc. Adapted by permission of Pearson Education, Inc. All rights reserved.

As soon as a set of measurable attributes or characteristics has been selected for a software development project, and numerical measures and a data collection system have been set up for them, data will begin to flow as work progresses. Figure 15.2, adapted from Florac and Carleton, presents a Venn diagram showing the overlap of problem identification with problem analysis. The methods range from qualitative on the left to quantitative on the right. The "low-tech" methods on the left are already familiar to you. In the center, the Pareto chart is a commonly used form of the histogram; its use is explained in Chapter 6. The cause-and-effect diagram is another name for Ishikawa's fishbone diagram. The right side of the figure reviews statistical and other quantitative methods; their use is described in the next section. Scatter diagrams display empirically observed relationships between two process attribute measures. Any pattern in the plot may indicate a causal relationship in the data and thus is potentially grist for the statistical mill, in particular correlation and regression analysis. Correlation analysis can tell us how closely correlated (co-related) two variables are, and regression analysis can tell us the nature of that relationship. In fact, regression analysis can tell us how much of the variability in one variable can be explained by variation in the other. Multiple regression analysis can tell us, given a large number of measurable influences on an outcome, which are the most significant and how much of the variability in the result is explained by any chosen subset of them. Often the two or three most significant influences explain more than 80% of the variability in the result.

We won't bore you with statistical formulas. Nobody programs these things anymore. Even high school students do them with a few button presses on a $100 graphing calculator. The serious user can purchase and download Minitab 14, a comprehensive statistical package. See the Minitab Release 14 URL, http://www.minitab.com/products/minitab/14/default.aspx, for a functional description and ordering information. For the most part, however, the software process manager can do quite well using a spreadsheet such as Microsoft Excel. So, what do you need to know about statistics to control a software development process? Let's briefly review the software architect's statistical toolbox and consider an example related to quality.
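
Purely to illustrate what a calculator, spreadsheet, or statistical package computes for the correlation and regression analysis described above, here is a minimal sketch in plain Python. The function names and the size-versus-defects data are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from math import sqrt


def _sums(xs, ys):
    """Return the means and the centered sums of squares and cross-products."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a variable that never changes carries no information")
    return mx, my, sxx, syy, sxy


def pearson_r(xs, ys):
    """Correlation: how closely two variables move together (-1 to 1)."""
    _, _, sxx, syy, sxy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Simple least-squares regression: return (slope, intercept) of y on x."""
    mx, my, sxx, _, sxy = _sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: module size (thousands of lines) vs. defects found in test.
size_kloc = [1, 2, 3, 4, 5]
defects = [2, 4, 5, 4, 5]

r = pearson_r(size_kloc, defects)
slope, intercept = fit_line(size_kloc, defects)
print(f"correlation r = {r:.3f}")
print(f"share of variability explained (r squared) = {r * r:.2f}")
print(f"fitted line: defects = {intercept:.2f} + {slope:.2f} * size")
```

For this made-up data set, r squared is 0.60, which is the sense in which regression tells us how much of the variability in one variable is explained by variation in the other: here, about 60% of the variation in defects is associated with module size.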

Figure 15.2. The Analytic Toolkit

W. A. Florac and A. D. Carleton, Measuring the Software Process, p. 55, Fig. 3.4. © Pearson Education, Inc. Reprinted by permission of Pearson Education, Inc. All rights reserved.
