Iterative and Evolutionary Research
Evidence on the question of IID and evolutionary delivery comes from several studies by Alan MacCormack and others at Harvard Business School. In the first study [MacCormack01, MVI01] the question, "Does evolutionary development, rather than the waterfall model, result in better success?" was explored in a two-year in-depth analysis of projects. The report's conclusion?
And specifically on evolutionary feedback-based requirements and design,
The study identified four practices that were statistically correlated with the most successful projects:
Practices 1 and 2 are associated with all modern IID methods. Practice 4 is a key element in the UP.
In a follow-up study [MKCC03], MacCormack and colleagues examined the effect of eight practices on productivity and defects (reported by customers), including IID and releasing a partial system early for evaluation and evolutionary design. The projects ranged from application software to embedded systems, with median values of nine developers and a 14-month duration; 75% used iterative and evolutionary development, 25% the waterfall. A key conclusion of the study:
In contrast, early detailed design specifications were not particularly valuable:
And detailed design specs did not improve productivity. However, design reviews with peers did significantly reduce defect rates.
In the multivariate model of defect factors, the following iterative-related practices and their magnitude of impact were significant:
Similarly, in the model of productivity factors, over 50% of the variation in productivity was related to just two factors, both related to iterative practices:
In a study of productive software organizations [HC96], researchers at Bell Labs found a consistent pattern on highly successful projects:
A study published in 2001 summarized the results of research into over 400 projects spanning 15 years [CLW01]. Less than 5% of the code was actually useful or used. This high "software pollution" rate (reflecting unnecessary requirements and over-engineering within a waterfall lifecycle) was significantly reduced by adopting the short, iterative evolutionary delivery cycles of the Evo method, which cut release cycles from about six months on average to about two weeks.
In a survey of agile method results [Shine03], 88% of organizations cited improved productivity, and 84% improved quality. The most frequently used agile methods were Scrum and XP. Regarding cost of development, 46% reported no change and 49% reported that development was less expensive with agile methods. One of the more interesting results (predictable in terms of agile method claims) was the increase in business satisfaction with the new software: 83% claimed higher satisfaction overall, and 26% claimed "significantly better satisfaction." The most frequently cited positive feature of agile methods (48%) was the ability to "respond to change rather than follow a predefined plan."
Another large study [Standish98] illustrating the value of iterative-related practices is the Standish Group's CHAOS study of project failure and success factors, analyzing 23,000 projects in the 1998 version. In the CHAOS TEN list of the top ten factors for success, at least four of the top five factors are strongly related to IID practices (Table 6.1).
High user involvement is central to IID methods; short iterations with demos, reviews, evolutionary requirements refinement, and client-driven iterations are key practices.
Executive support is promoted by these practices and especially through the demonstration of early, tangible results; people like to be associated with projects that show quick value and progress.
Clear business objectives are supported by adaptive, client-driven iteration planning. By asking each iteration, "What is most valuable?" and then building it, the business objectives are clarified and realized, and the project stays aligned with them.
Of course, small milestones are at the heart of iterative methods.
To quote the study,
There is a significant body of research on project size indicating that smaller (and thus less complex) projects are more successful and productive. This is not direct proof of the value of iterative development, but it is very relevant to the IID practice of decomposing large projects into a series of small, short sub-project iterations.
A large study [Thomas01] of failure and success factors in over 1,000 UK IT projects found that 90% of the successful projects were less than 12 months in duration; indeed, 47% were less than 6 months. To quote,
The trend that the larger the project, the more likely it is to fail has been corroborated in a number of other studies. For example, in one study [Jones96], data from a large sample set show that 48% of 10,000 function point (FP) projects are cancelled, as are 65% of 100,000 FP projects.
Going back to early, fundamental work on size, the exploration of general systems theory in the 1950s by von Bertalanffy, Bateson, and others led to this fundamental conclusion [Bertalanfy68]:
More straightforward evidence that small is beautiful comes from a 23,000-project study [Standish98]; for example, see Figure 6.1 for project success versus duration.
Figure 6.1. success vs. duration
Success was defined as "The project is completed on time and on budget, with all features and functions as originally specified."
This trend was confirmed in a follow-up study spanning 35,000 projects [Standish00], regarding cost (another size measure) versus success (Table 6.2).
And, to reiterate a portion of the Standish conclusion,
Another interesting note on size in the Standish research was the decline in project failure rates, from 31% in the 1994 study to 23% in the 2000 study; this decline was correlated with smaller, shorter projects and smaller teams.
Direct smaller-size and evolutionary delivery evidence was presented in a previously cited study [CLW01]. The percentage of developed code that was ultimately found to be useful increased when the delivery cycle was reduced from around six months to about two weeks, as recommended in Evo.
A study by Boehm and Papaccio showed that a typical software project experienced a 25% change in requirements [BP88]. This trend is corroborated in another large study; as illustrated in Figure 6.2 [Jones97], software development is a domain of inventive high-change projects.
Figure 6.2. rates of change on software projects
Another measure of change is to investigate how much use is actually made of implemented features defined in early specifications. A large study [Johnson02] showed that 45% of features were never used (Figure 6.3).
Figure 6.3. actual use of requested features
The use of evolutionary requirements to address change is becoming more widespread. A study of 107 projects [CM95] showed that only 18% of the projects tried to complete the requirements in a single early step; 32% used two cycles of requirements refinement (with programming in between); and in 50% of the projects the requirements analysis was completed over three or more iterations.
The data in this section demonstrates that software development is a high-change domain. Practices or values that encourage early "complete" specifications or schedules are incongruous with this reality. Iterative and evolutionary practices that emphasize adaptability, and that take steps to provoke early change, are consistent with this research.
Waterfall Failure Research
In a study of failure factors on 1,027 IT projects in the UK [Thomas01] (only 13% did not fail), scope management related to attempting waterfall practices (including detailed up-front requirements) was the single largest contributing factor to failure, cited in 82% of the projects as the number one problem, with an overall weighted failure influence of 25%. To quote the study's conclusion,
Other significant evidence of failure in applying the waterfall comes from one of its most frequent users in the past, the U.S. Department of Defense (DoD). Most DoD projects were required by the standard DOD-STD-2167 to follow a waterfall lifecycle. A report on failure rates in a sample of earlier 2167-era DoD projects concluded that 75% of the projects failed or were never used [Jarzombek99]. Consequently, a task force was convened, chaired by Dr. Frederick Brooks, the well-known software engineering expert. The task force's report recommended replacing the waterfall with IID [DSB87]:
In another study of 6,700 projects, four out of the five key factors contributing to project failure were found to be associated with and aggravated by the waterfall model [Jones95], including the inability to deal with changing requirements and problems with late integration.
In 1996 Barry Boehm published a well-known paper summarizing failures of the waterfall [Boehm96], with advice to use a risk-reducing IID approach combined with three milestone anchor points around which to plan and control; this advice was eventually adopted in the UP.
There are several studies (covering thousands of projects) that shed light on the value of large, up-front specifications in a waterfall-oriented lifecycle.
One study [Jarzombek99] cited a 1995 DoD software project study (covering over $37 billion USD worth of projects) showing that 46% of the systems failed to meet the real needs so egregiously (although they met the specifications) that they were never successfully used, and another 20% required extensive rework to meet the true needs (rather than the specifications) before they could be used.
As mentioned earlier, another study [Johnson02] showed that 45% of features were never used with an additional 19% rarely used.
In the previously cited study of over 400 waterfall-oriented projects [CLW01] averaging six-month cycles, only 10% of the developed code was actually deployed, and of that, only 20% was used. The prime reasons included:
There is a productivity motivation to apply short iterations, even when there are up-front requirements.
A study [Solon02] of a 43,700-project sample set showed the following productivity differences between IID and waterfall:
Interestingly, the same study showed that among the waterfall projects, those that applied the model only "loosely" were significantly more productive than those that applied it "rigorously," indicating the waterfall's negative effect on productivity.
Another relevant study [Jones00] showed that as the size of project decreases (measured in language-independent function points), the monthly productivity of staff increases (Figure 6.4).
Figure 6.4. productivity vs. size
This data illustrates the motivation for organizing a project into small mini-project iterations with low function points per iteration, as the most dramatic productivity drop occurs in the lower function point range (under 1,000).
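To make the motivation concrete, here is a minimal back-of-the-envelope sketch. The productivity figures (12 vs. 6 FP per staff-month) and the 5,000 FP project size are assumed purely for illustration, not taken from [Jones00]; the point is only that when per-delivery productivity falls as delivery size grows, decomposing the same scope into small iterations reduces total effort.

```python
# Illustrative only: assumed productivity figures, shaped like the trend in
# the text (productivity in FP per staff-month falls as FP per delivery grow).
ASSUMED_PRODUCTIVITY = {  # FP per staff-month at a given delivery size
    500: 12.0,    # small mini-project iteration
    5000: 6.0,    # one large monolithic delivery
}

def staff_months(total_fp: int, fp_per_delivery: int) -> float:
    """Total staff-months to deliver total_fp in chunks of fp_per_delivery."""
    deliveries = total_fp / fp_per_delivery
    return deliveries * fp_per_delivery / ASSUMED_PRODUCTIVITY[fp_per_delivery]

monolith = staff_months(5000, 5000)   # one big waterfall-style delivery
iterative = staff_months(5000, 500)   # ten small mini-project iterations

print(monolith)   # about 833 staff-months
print(iterative)  # about 417 staff-months
```

Under these assumed rates the decomposed plan needs half the staff-months; the real curve in Figure 6.4 would give different numbers, but the direction of the effect is the same.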
Timeboxing by itself has been shown to have a productivity effect. DuPont, one of the earliest timebox pioneers, found developer productivity of around 80 function points per month with timeboxed iterations, but only 15 to 25 function points per month with other methods [Martin91].
Note the rate of 80 function points per month at DuPont compared to a high of 12 function points per month in Figure 6.4. This suggests that the combination of a low-complexity step with timeboxing has a higher productivity impact than a small step alone without timeboxing.
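A quick calculation shows the scale of the difference. The 80 and 15 to 25 FP/month rates are the DuPont figures quoted above; the 1,000 FP project size is a hypothetical value chosen only for illustration.

```python
# Hypothetical 1,000 FP project; productivity rates from the DuPont figures
# quoted above (80 FP/month timeboxed vs. 15-25 FP/month with other methods).
project_fp = 1_000

timeboxed_months = project_fp / 80   # 12.5 developer-months
other_low = project_fp / 25          # 40.0 developer-months
other_high = project_fp / 15         # about 67 developer-months

print(timeboxed_months, other_low, other_high)
```

On these figures, the timeboxed approach finishes the same scope in roughly a third to a fifth of the developer-months.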
In another study [Jones00], 47 factors that increase or decrease productivity were identified, including project complexity:
This indicates a productivity advantage by organizing projects in low-complexity mini-project iterations.
To reiterate the results of a study on productivity and iterative development [MKCC03], their conclusion was,
Quality and Defect Research
Broadly, defect reduction comes from avoiding defects before they occur (Deming's Total Quality Management principle) and from feedback (tests, evaluations, and so forth). IID methods can address both. For example, several methods promote a simple per-iteration process assessment or reflection by the team, to encourage regular process improvement and defect avoidance. Feedback is enabled by the emphasis on early development of the riskiest elements, per-iteration demos, and a test-early, test-often approach. The association of lower defect rates with iterative development is consistent with Deming's predictions, as IID illustrates the Deming/Shewhart Plan-Do-Study-Act (PDSA) cycle, and supports a culture of continuous improvement by measuring, reflecting, and adjusting each iteration.
Specifically, the study [MKCC03] showed that IID was correlated with lower defects. In other research [MVI01], it was shown that as the time lag between coding and testing decreased, defect rates likewise decreased. A study by Deck [Deck94] also shows a statistically significant reduction in defects using an iterative method. Large case-study research [Jones00] showed that defect rates increase non-linearly as project size grows (Figure 6.5).
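The phrase "increase non-linearly" can be made concrete with a small sketch. A power law is one common way to model such growth; the constants below are assumed purely for illustration and are not from [Jones00].

```python
# Illustrative only: an assumed power-law model of total defects vs. size,
# defects = A * size**B with B > 1, so defect *density* grows with size.
A, B = 0.05, 1.25   # assumed constants, not taken from [Jones00]

def defect_density(fp: int) -> float:
    """Defects per function point under the assumed model."""
    return (A * fp ** B) / fp

small = defect_density(100)
large = defect_density(10_000)
print(small, large)   # the larger project has the higher defect density
```

With any exponent B greater than 1, total defects grow faster than size, so defects per function point are higher for larger projects, consistent with the trend the study reports.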
Figure 6.5. defects vs. size
Although not statistically reliable, there are several single-case study reports of lower defect densities associated with iterative methods (e.g., [Manzo02]).