The Central Limit Theorem Applied to Networks



Notice that the critical path through the network always connects the beginning node or milestone to the ending node or milestone. The ending milestone can be thought of as the output milestone, and all the tasks in between are inputs to it. Furthermore, if the project manager has used three-point estimates for the task durations, then the duration of any single task is a random variable best represented by its expected value. [2] The total duration of the critical path from the beginning milestone to the output milestone (itself a zero-duration event), or equivalently the date assigned to the output milestone, represents the length of the overall schedule. That length is a sum of random variables and is therefore itself a random variable, L:

L = Σ Di = D1 + D2 + ... + Dn

where Di is the duration of the i-th task along the critical path and n is the number of tasks on that path.

We know from our discussion of the Central Limit Theorem that for a "large" number of durations in the sum, the distribution of L will tend to be Normal regardless of the distributions of the individual tasks. The theorem holds most cleanly when the task durations are independent and all share the same distribution, but even if some do not, the approximation is close enough that it matters little to the project manager that L may not be exactly Normal distributed. Figure 7-8 illustrates this point.

Figure 7-8: The Output Milestone Distribution.
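
The tendency toward a Normal output milestone can be checked with a small simulation. The sketch below is illustrative only: it assumes every task is independent and Triangular with the same hypothetical three-point estimate (5, 8, and 20 days), and it uses sample skewness as a rough measure of asymmetry, where a value near 0 indicates a symmetric distribution. The function names are ours, not the text's.

```python
import random
import statistics

def simulate_path_lengths(n_tasks, n_trials, seed=0):
    """Return simulated path lengths, each the sum of n_tasks Triangular durations."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_trials):
        # Hypothetical estimate per task: 5 optimistic, 8 most likely, 20 pessimistic.
        # random.triangular takes (low, high, mode) in that order.
        total = sum(rng.triangular(5, 20, 8) for _ in range(n_tasks))
        lengths.append(total)
    return lengths

def skewness(values):
    """Sample skewness: close to 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

single = simulate_path_lengths(n_tasks=1, n_trials=20000)
path = simulate_path_lengths(n_tasks=20, n_trials=20000)

print(f"skewness of one task:     {skewness(single):.2f}")
print(f"skewness of 20-task path: {skewness(path):.2f}")
```

Under these assumptions the single task is noticeably skewed toward overrun, while the 20-task sum is much closer to symmetric, which is the behavior the theorem predicts.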

Significance of Normal Distributed Output Milestone

The significance of the fact that the output milestone is approximately Normal distributed is not trivial. Here is why. Given the symmetry of the Normal curve, a Normal distributed output milestone means the schedule is just as likely to underrun (complete early) as to overrun (complete late). Confronted with such a conclusion, most project managers would say: "No! The schedule is biased toward overrun." Such a reaction simply means that either the schedule is too short and the Normal output milestone is too aggressive, or the project manager has not thought objectively about the schedule risk.

Consider this conclusion about a Normal output milestone from another point of view. Without even considering what the distributions of the individual tasks on the WBS might be, whether BETA or Triangular or Normal or whatever, the project manager remains confident that the output milestone is Normal in its distribution! That is to say, there is a conclusion for every project, and it is the same conclusion every time: the summation milestone of the critical path is approximately Normal. [3]

Calculating the Statistical Parameters of the Output Milestone

What the project manager does not know is the standard deviation or the variance of the Normal distribution. It is quite proper to ask of what real utility it is to know that the output milestone is Normal with an expected value (mean value) but have no other knowledge of the distribution. The answer is straightforward: either a schedule simulation can be run to determine the distribution parameters or, if there is no opportunity to individually estimate the tasks on the WBS, then the risk estimation effort can be moved to the output milestone as a practical matter.
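
As an illustration of the simulation option, the following minimal sketch estimates the mean and standard deviation of the output milestone by repeated sampling. It is not a full schedule simulator: the task list is hypothetical, each task is assumed to be independent and Triangular, and the path is assumed to be a simple finish-to-start chain.

```python
import random
import statistics

# Hypothetical three-point estimates in days: (optimistic, most likely, pessimistic).
TASKS = [(4, 6, 12), (8, 10, 18), (3, 5, 9), (10, 14, 25), (6, 7, 11)]

def simulate_milestone(tasks, n_trials=10000, seed=1):
    """Estimate the mean and standard deviation of the summed path duration."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        # random.triangular takes (low, high, mode) in that order.
        totals.append(sum(rng.triangular(lo, hi, mode) for lo, mode, hi in tasks))
    return statistics.fmean(totals), statistics.stdev(totals)

mean, sigma = simulate_milestone(TASKS)
print(f"simulated mean:  {mean:.1f} days")
print(f"simulated sigma: {sigma:.1f} days")
```

With only five tasks the sum may not yet look Normal, but the simulated mean and standard deviation are still valid estimates of the milestone's parameters.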

At this point, there really is no choice to make about the distribution since it is known to be Normal; if expected values, or some reasonable semblance of them, have been used to compute the critical path, then the mean of the output milestone is calculable. It then remains to make some risk assessment of the probable underrun. Usually we express the underrun as the distance from the mean to a most optimistic duration. Once that is done, the same distance separates the expected value from the most pessimistic estimate. This follows from the symmetry of the Normal distribution: underrun and overrun must be symmetrically located around the mean.

The last estimate to make is the estimate for the standard deviation. The standard deviation is roughly one-sixth of the distance from the most optimistic duration estimate to the most pessimistic estimate, since nearly all of a Normal distribution lies within three standard deviations on either side of the mean.
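
The arithmetic of these two steps can be sketched as follows. The milestone figures are hypothetical, and the function names are introduced here for illustration only.

```python
def symmetric_pessimistic(mean, optimistic):
    """Mirror the optimistic estimate around the mean, per the symmetry argument."""
    return mean + (mean - optimistic)

def estimate_sigma(optimistic, pessimistic):
    """Rule of thumb: the optimistic-to-pessimistic range spans about six sigma."""
    return (pessimistic - optimistic) / 6.0

mean = 220.0        # hypothetical expected duration of the output milestone, days
optimistic = 190.0  # hypothetical most optimistic duration, days

pessimistic = symmetric_pessimistic(mean, optimistic)  # 250.0
sigma = estimate_sigma(optimistic, pessimistic)        # 10.0
print(pessimistic, sigma)
```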

Next, the Normal distribution for the output milestone is normalized to the standard Normal distribution by subtracting the mean from each duration of interest and dividing the result by the standard deviation. The standard Normal curve has mean = 0 and σ = 1. Once normalized, the project manager can apply the Normal distribution confidence curves to develop confidence intervals for communicating to the project sponsor.
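
A minimal sketch of the normalization, using Python's standard-library NormalDist in place of printed confidence tables. The mean of 220 days and standard deviation of 10 days are hypothetical, and the helper names are ours.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, sigma 1

def z_score(duration, mean, sigma):
    """Normalize a duration to the standard Normal scale."""
    return (duration - mean) / sigma

def confidence_of_finishing_by(duration, mean, sigma):
    """Probability that the milestone is reached within the given duration."""
    return STANDARD_NORMAL.cdf(z_score(duration, mean, sigma))

def duration_for_confidence(confidence, mean, sigma):
    """Duration that will be met with the requested confidence (0 < confidence < 1)."""
    return mean + STANDARD_NORMAL.inv_cdf(confidence) * sigma

mean, sigma = 220.0, 10.0  # hypothetical milestone parameters, days

# Finishing within one sigma past the mean: about 0.84
print(f"{confidence_of_finishing_by(230.0, mean, sigma):.2f}")
# Duration to quote for 90% confidence: about 232.8 days
print(f"{duration_for_confidence(0.90, mean, sigma):.1f}")
```

Quoting a duration together with its confidence level, rather than the mean alone, gives the sponsor a sense of how much schedule risk the date carries.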

Statistical Parameters of Other Program Milestones

The Central Limit Theorem is of almost unlimited use to the project manager. Given that a "large number" of tasks leads up to any program or project milestone, where large is usually taken to be ten or more as a working estimate, the distribution of that milestone is approximately Normal. The discussion of the output milestone applies in all respects, most importantly the statements about using the Normal confidence curves or tables.

Therefore, some good advice for every project manager is to obtain a handbook of numerical tables or learn to use the Normal function in any spreadsheet program that has statistical functions. Of course, every project manager learns the confidence figures for outcomes falling within 1, 2, or 3 standard deviations of the mean: they are, respectively, about 68.27, 95.45, and 99.73 percent.
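
For readers without a handbook at hand, the coverage within any number of standard deviations can be computed directly from the error function. This is a short sketch; the function name is ours.

```python
import math

def coverage_within(k):
    """Probability that a Normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * coverage_within(k):.2f}%")
# within 1 sigma: 68.27%
# within 2 sigma: 95.45%
# within 3 sigma: 99.73%
```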

[2]Actually, whether or not the project manager proactively thinks about the random nature of the task durations and assigns a probability distribution to a task, the majority of tasks in projects (which we define as one-time endeavors never done exactly the same way before) are risky and have some randomness in their duration estimates. The conclusions cited in the text regarding the application of the Central Limit Theorem and the Law of Large Numbers are therefore not invalidated if the project manager has not gone to the effort of estimating the duration randomness.

[3]Strictly speaking, the Law of Large Numbers and the Central Limit Theorem are applicable to linear sums of durations. The critical path usually qualifies as a linear summation of durations. Merge points, fixed dates, and PDM relationships other than finish-to-start do not qualify.