Before we discuss the approach taken at Infosys, we describe some concepts relating to estimation and scheduling. Effort estimation usually takes place in the early stages of a project, when the software to be built is being understood. It may be redone later when more information becomes available.
Highly precise estimates are generally not needed. Reasonable estimates in a software project tend to become a self-fulfilling prophecy: people work to meet the schedules (which are derived from effort estimates). Indeed, in software projects, one cannot even precisely answer the question, "Is this estimate accurate?" because the only way to ascertain the accuracy of an estimate is to compare it with the actual effort expended. Because of the general principle of human psychology reflected in the maxim "work expands to fill the available time," one cannot say that just because the actual effort expended matches the estimated effort, the estimates are "accurate." Hence, the goal for a project manager is to obtain reasonable estimates so that the goals are met and the project personnel are not burned out. The range of reasonableness is not very wide, and it depends on human factors, but it is probably wide enough to give sufficient leeway for estimation.
Effort estimation can rely on a hunch or on previous experience, but a more scientific and desirable approach is to use an estimation model.
A software estimation model defines the project characteristics whose values (or their estimates) it needs and the ways these values are used to compute the effort. An estimation model does not and cannot work in a vacuum; it needs inputs to produce the effort estimate as output. At the start of a project, when the details of the software itself are not known, the hope is that the estimation model will require values of characteristics that can be measured at that stage.
The size of the software is the predominant factor in determining how much effort is needed to build it. But the ultimate size is not known when the project is being conceived and the software does not yet exist. Hence, if size is to be an input to the effort estimation model, it must itself be estimated for the initial estimation.
A common approach is to use a simple equation to obtain an estimate of the overall effort from the size estimate. This equation can be determined through regression analysis of past data on effort and size [1, 2]. Then, once the overall effort for the project is known, the effort for various phases or activities can be determined as a percentage of the total effort.
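The top-down calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the past-project data, the power-law form `effort = a * size^b`, and the phase percentages are all hypothetical and are not taken from any organization's data.

```python
import math

# Hypothetical past-project data: (size in KLOC, effort in person-months).
PAST_PROJECTS = [(10, 24), (25, 70), (40, 120), (60, 190), (100, 340)]

# Hypothetical phase distribution (fractions of total effort, summing to 1).
PHASE_SHARE = {"requirements": 0.10, "design": 0.20, "build": 0.45, "test": 0.25}


def fit_power_law(points):
    """Least-squares fit of log(y) = log(a) + b * log(x); returns (a, b)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def estimate_effort(size_kloc, a, b):
    """Overall effort predicted from the (estimated) size."""
    return a * size_kloc ** b


def split_by_phase(total_effort, shares):
    """Distribute the overall effort across phases by fixed percentages."""
    return {phase: total_effort * share for phase, share in shares.items()}


a, b = fit_power_law(PAST_PROJECTS)
total = estimate_effort(50, a, b)  # 50 KLOC is itself an estimate
by_phase = split_by_phase(total, PHASE_SHARE)

print(f"effort = {a:.2f} * size^{b:.2f}; total = {total:.1f} person-months")
for phase, effort in by_phase.items():
    print(f"  {phase}: {effort:.1f}")
```

Fitting in log space is one common way to obtain such an equation; a linear or other form could be fitted in the same manner if it suits the historical data better.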
Many models have been proposed that use this top-down approach to estimation [1, 3], with the COCOMO model being the most famous [1, 4]. Models using function points (instead of LOC) as size units have also been built [5, 6]. In these models, you can accommodate other factors that affect the effort by refining the estimates based on these factors. This is the approach taken in the COCOMO model [1]. Another approach is to adjust the size of the system based on these factors, as is done in function points [7].
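The two ways of accommodating additional factors can be contrasted in a small sketch. The coefficients, factor names, and multiplier values below are hypothetical placeholders chosen for illustration; they are not the published COCOMO or function point tables.

```python
from math import prod


def nominal_effort(size, a, b):
    """Top-down estimate from size alone: a * size^b."""
    return a * size ** b


def adjust_effort(nominal, multipliers):
    """Refine the effort estimate: scale it by one multiplier per factor."""
    return nominal * prod(multipliers.values())


def adjust_size(raw_size, adjustment_factor):
    """Alternative: fold the factors into the size before estimating."""
    return raw_size * adjustment_factor


# Hypothetical coefficients and factor ratings.
A, B = 3.0, 1.1
FACTORS = {"product_complexity": 1.15, "team_experience": 0.90}

# Approach 1: estimate from size, then refine the effort.
effort_refined = adjust_effort(nominal_effort(30, A, B), FACTORS)

# Approach 2: adjust the size first, then estimate from the adjusted size.
effort_from_adjusted_size = nominal_effort(adjust_size(30, 1.05), A, B)

print(f"refined effort: {effort_refined:.1f}")
print(f"effort from adjusted size: {effort_from_adjusted_size:.1f}")
```

The two approaches generally give different numbers for the same inputs, so the choice should match how the underlying model was calibrated.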
In the bottom-up approach, on the other hand, you first estimate the parts of the project and then combine them [1]. That is, the overall estimate of the project is derived from the estimates of its parts. One bottom-up method calls for using some type of activity-based estimation. In this strategy, the major activities are first enumerated, and then the effort for each activity is estimated. From these estimates, the effort for the overall project is obtained.
The bottom-up approach lends itself to direct estimation of effort; once the project is partitioned into smaller tasks, it is possible to directly estimate the effort required for them. Although size does play a role in determining the effort for many activities in a project, a key advantage of this approach is that it does not require explicit size estimates for the software. Instead, it requires a list of project tasks, which might be easier to prepare in some situations. A risk of bottom-up methods is that you may omit some important activities in the list of tasks. When effort is directly estimated for tasks, it may prove difficult to directly estimate the effort required for some overhead tasks, such as project management, that span the project and are not as clearly defined as coding or testing.
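A minimal sketch of activity-based estimation follows. The activity names, effort figures, and the checklist used to catch omitted activities are hypothetical; the checklist is one possible safeguard against the omission risk noted above, not a method prescribed by the text.

```python
# Hypothetical checklist of activities that a project plan should cover.
STANDARD_CHECKLIST = {
    "requirements", "design", "coding", "testing", "project management",
}

# Hypothetical direct estimates, in person-days, for the enumerated activities.
estimates = {"requirements": 15, "design": 30, "coding": 60, "testing": 35}


def total_effort(activity_estimates):
    """Overall effort is the sum of the per-activity estimates."""
    return sum(activity_estimates.values())


def missing_activities(activity_estimates, checklist):
    """Checklist items with no estimate, i.e. candidates for omission."""
    return sorted(checklist - set(activity_estimates))


print("total person-days:", total_effort(estimates))
print("not yet estimated:", missing_activities(estimates, STANDARD_CHECKLIST))
```

Here the overhead activity is flagged as missing, which mirrors the difficulty of directly estimating tasks that span the whole project.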
Both the top-down and the bottom-up approaches require information about the project: size (for top-down approaches) and a list of tasks (for bottom-up approaches). In many ways, these approaches are complementary [1]. Both types of estimates are more accurate if more information about the project is available or as the project proceeds. For example, estimating the size is much more difficult when only very high-level requirements are given but becomes considerably easier when design is finished, and even easier and more accurate when code is developed. Thus, the accuracy of estimates depends on the point at which effort is estimated, with accuracy increasing as more information about the project becomes available [4].
Once the effort is known or fixed, various schedules (or project duration) are possible, depending on the number of resources (people) put on the project. For example, for a project whose effort estimate is 56 person-months, a total schedule of 8 months is possible with 7 people. A schedule of 7 months with 8 people is also possible, as is a schedule of approximately 9 months with 6 people.
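The arithmetic in this example is simply effort divided by team size, as the short sketch below shows. The function name is an illustrative choice, and the calculation treats people and months as interchangeable, which holds only over a limited staffing range.

```python
def schedule_months(effort_person_months, people):
    """Duration implied by a fixed effort and a constant team size."""
    if people <= 0:
        raise ValueError("people must be positive")
    return effort_person_months / people


EFFORT = 56  # person-months, from the example in the text

for team_size in (6, 7, 8):
    months = schedule_months(EFFORT, team_size)
    print(f"{team_size} people -> {months:.1f} months")
```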
As is well known, however, manpower and months are not fully interchangeable in a software project [8]. For instance, in the example here, a schedule of 1 month with 56 people is not possible even though the effort "matches" the requirement. Similarly, no one would execute the project in 28 months with 2 people. In other words, once the effort is fixed, you can gain some flexibility in setting the schedule by appropriately staffing the project. But this flexibility is not unlimited, a fact corroborated by data that shows that no simple equation between effort and schedule fits the empirical data [9].
"Stretching" the schedule is easy; you simply apply fewer people (although the project may not be very valuable if completed over a long duration). Compressing the schedule, however, is not easy. A clear example is given earlier: You cannot compress the schedule of a 56 person-month project to 1 month regardless of the resources you apply. And, generally speaking, compressing the schedule beyond what is "normal" increases the effort; by having more resources than optimally needed, you might end up wasting them, doing more rework, and so on.
Some approaches discuss the effect of schedule compression on total effort. However, to assess that effect you must first define the normal schedule for a project.
One method to determine the normal (or nominal) schedule is to use a suitable function to determine it from the effort. In turn, one method for determining the function is to study the patterns in data from completed projects. For example, you can obtain a scatter plot of effort and schedule for completed projects and then fit a regression curve through this scatter plot. This curve is generally nonlinear because the schedule does not grow linearly with effort. You can then use the equation for the curve to determine the schedule for a project whose effort has been estimated. Many models follow this approach, and Boehm's book summarizes the various models [1].
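As a rough sketch of this curve-fitting step, the code below fits a nonlinear curve of the assumed form `schedule = c * effort^d` to data from completed projects and then reads off a nominal schedule. Both the power-law form and the data points are assumptions made for illustration; real project data may call for a different function.

```python
import math

# Hypothetical completed projects: (effort in person-months, schedule in months).
COMPLETED = [(12, 5.0), (30, 7.0), (56, 9.0), (120, 12.0), (250, 16.0)]


def fit_schedule_curve(points):
    """Fit schedule = c * effort^d by least squares in log-log space."""
    xs = [math.log(effort) for effort, _ in points]
    ys = [math.log(months) for _, months in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    d = sxy / sxx
    c = math.exp(mean_y - d * mean_x)
    return c, d


def nominal_schedule(effort, c, d):
    """Nominal duration for a project with the given estimated effort."""
    return c * effort ** d


c, d = fit_schedule_curve(COMPLETED)
months = nominal_schedule(80, c, d)
print(f"schedule = {c:.2f} * effort^{d:.2f}")
print(f"80 person-months -> {months:.1f} months, "
      f"average team of about {80 / months:.1f} people")
```

An exponent `d` below 1 reflects the observation that the schedule grows more slowly than the effort.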