Hypothesis Testing

Another common problem facing project managers is hypothesizing about what will happen in their projects and then evaluating the outcomes analytically. The letter H is commonly used to represent a hypothesis, and hypotheses come in pairs: the null hypothesis and its alternative, denoted H(0) and H(1). For instance, a project manager working in the environmental area often faces the null hypothesis that no additional regulation impacting the project will be passed, against the alternative hypothesis that additional legislation will be passed and the project will be impacted. In this section, we examine some of the quantitative aspects of evaluating hypotheses that might occur in projects.

The Type 1 and Type 2 Error

Right from the outset, we are faced with what the statistical community calls Type 1 and Type 2 errors. The Type 1 error is straightforward: the hypothesis is true, but we reject or ignore the possibility. Grave consequences can follow from a Type 1 error, and the project manager seeks to avoid this mistake. In our environmental example, we reject the possibility of a new regulation that impacts the project, ignoring the possible ramifications, but the hypothesis is in fact true and a new regulation is issued. "Now what?" asks the project sponsor.

The Type 2 error is usually less risky: we falsely believe the alternative hypothesis, H(1), and make investments to protect against an outcome that never happens. It is easy to see how a Type 2 error could be made in the environmental project: money is spent to thwart the impact of a new regulation that is never issued. Though no project manager or project sponsor wants to waste resources, a cost impact to the project is perhaps the only consequence of making a Type 2 error.

Interval of Acceptance

In testing for the outcome of the hypothesis, especially by simulation, we "run" the hypothesis many times. The first few runs may not be representative of the final outcome, since it takes many runs to converge toward the ultimate result. For a number of reasons, we may not have the luxury of waiting for or estimating the convergence. Instead, we may establish an interval around the likely outcome, called the interval of acceptance, and say that any outcome falling within this interval is "good enough." Now, if the objective is to avoid the Type 1 error, then we must be careful about rejecting a hypothesis that really is true; the need to manage the risk of a Type 1 error thus pushes us to widen the interval of acceptance. However, a wide acceptance criterion lets in the Type 2 error! Remember that the Type 2 error is accepting a hypothesis that is really false. There is no absolute rule here; it is all about experience and heuristics. Some say the interval of acceptance should never be greater than 10%, or at most 20%. Each project team will have to decide for itself.
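To make the trade-off concrete, here is a minimal simulation sketch in Python. The trial count, the Normal(100, 15) outcome model, and the 10% band are illustrative assumptions, not figures from the text:

```python
import random

# Hypothetical simulation: each trial draws a project outcome (e.g., a cost).
# The Normal(100, 15) model and the 10% band are illustrative assumptions.
random.seed(42)
TRIALS = 5000
outcomes = [random.gauss(100.0, 15.0) for _ in range(TRIALS)]

expected = 100.0                 # the outcome predicted by H(0)
band = 0.10 * expected           # a 10% interval of acceptance
inside = sum(1 for x in outcomes if abs(x - expected) <= band)

print(f"{inside / TRIALS:.1%} of trials fall within the acceptance interval")
# Widening the band (say, to 20%) cuts the chance of rejecting a true H(0)
# (Type 1 error) but admits more false hypotheses (Type 2 error).
```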

In many practical situations there is no bias toward optimism or pessimism. Our environmental example could be of this type, though regulatory agencies usually lean one way or the other. Nevertheless, if there is no bias, or it is "reasonably" small, then we know the distribution of values of H(0) or H(1) will be symmetrical even though we do not know the exact distribution. We get some help here as well. Recall the Central Limit Theorem: regardless of the actual distribution, over a very large number of trials the distribution of the average outcome will be Normal. Thus, the project manager can refer to the Normal distribution to estimate the confidence that goes along with an acceptance interval and thereby manage the risk of the Type 1 error. For instance, only about 4.5% of all outcomes lie more than 2σ from the mean of a Normal distribution; in other words, we are about 95.5% confident that an outcome will fall within 2σ of the mean. If the mean and variance (and from the variance, the standard deviation) can be estimated from simulation, then the project manager can get a handle on the Type 1 error (rejecting something that is actually true). Figure 8-7 illustrates these points.
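As a quick check on these figures, a short sketch using only the Python standard library computes the two-sided Normal coverage from the error function, since Φ(z) = (1 + erf(z/√2))/2:

```python
import math

def coverage_within(k_sigma: float) -> float:
    """Two-sided probability that a Normal outcome lies within k standard deviations."""
    return math.erf(k_sigma / math.sqrt(2.0))

for k in (1.0, 2.0, 3.0):
    inside = coverage_within(k)
    print(f"within {k:.0f} sigma: {inside:.2%}; outside: {1.0 - inside:.2%}")
# within 2 sigma: ~95.45%, so roughly 4.5% of outcomes fall beyond 2 sigma.
```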

Figure 8-7: Type 1 and 2 Errors.

Testing for the Validity of the Hypothesis

Having constructed a null hypothesis and its alternative, H(0) and H(1), and having assumed the outcomes are Normal because a large number of trials tends toward a Normal distribution, the question remains: can we test whether H(0) is valid? In fact, there are many things we can do.

The common approach to hypothesis testing is to compute a test statistic and evaluate its value. For example, with the Normal distributions of H(0) and H(1) normalized to the standard Normal, where σ = 1 and μ = 0, an outcome with a normalized value greater than about three would make you suspect that the outcome does not belong to H(0), since the probability of a normalized outcome of magnitude three or more is only about a quarter of a percent, 0.27% to be more precise. We get this figure from a table of two-sided Normal cumulative probabilities or from a statistical function in a mathematics and statistics package.
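A minimal sketch of this normalized test; the H(0) parameters (mean 100, σ = 10) and the observed outcome are hypothetical, chosen purely for illustration:

```python
import math

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard Normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical H(0): outcomes distributed Normal(100, 10); observed outcome 130
z = (130.0 - 100.0) / 10.0          # normalized outcome: z = 3
print(f"z = {z:.1f}, p = {two_sided_p(z):.4%}")
# p is about 0.27%, so the outcome very likely does not belong to H(0).
```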

t-Statistic Test

The other common test in hypothesis testing is aimed at the true mean of the distribution for H(0) or H(1). For this task, we need a statistic commonly called the "t statistic" or "Student's t" statistic. [5] The following steps show how to test an estimate of the mean of the hypothesis distribution:

  • From the data observations of the outcomes of "n" trials of the hypothesis, calculate the sample average: Hav = (1/n) * Σ Hi, where Hi is the ith outcome of the hypothesis trials.

  • Calculate the sample variance: VarH = [1/(n−1)] * Σ (Hi − Hav)².

  • Calculate the statistic t = √n * (Hav − μ)/s, where s = √VarH is the sample standard deviation and μ is the estimate of the true mean whose validity we are testing.

  • Look up the value of t in a table of t statistics, where n − 1 = "degrees of freedom." If the calculated value of t is plausible according to the lookup table, then μ is a good estimate of the mean. For example, using a t-statistics lookup table for n − 1 = 100, the probability is 0.99 that the value of t will fall between about −2.63 and +2.63. If the calculated value of t for the observed data is not in this range, then the hypothesis regarding the estimate of the mean is rejected, with very little chance, less than 1%, of committing a Type 1 error. These steps are sketched in code below.
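A minimal sketch of these steps; the trial data are drawn from a hypothetical Normal(100, 15) model, and the sample size and tested mean are illustrative assumptions:

```python
import math
import random

def t_statistic(samples, mu):
    """t = sqrt(n) * (sample mean - mu) / sample standard deviation."""
    n = len(samples)
    h_av = sum(samples) / n                                   # sample average
    var_h = sum((h - h_av) ** 2 for h in samples) / (n - 1)   # sample variance
    return math.sqrt(n) * (h_av - mu) / math.sqrt(var_h)

# Hypothetical data: 101 trials, so n - 1 = 100 degrees of freedom
random.seed(7)
trials = [random.gauss(100.0, 15.0) for _ in range(101)]

t = t_statistic(trials, mu=100.0)   # test the claim that the true mean is 100
print(f"t = {t:.3f}")
# If |t| exceeds the 0.99 two-sided critical value (~2.63 for 100 degrees of
# freedom), reject the estimated mean with less than a 1% chance of a Type 1 error.
```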

[5]The "t statistic" was developed by William Sealy Gosset, a statistician at the Guinness brewery, whose employer's policy kept him from publishing under his own name. He published under the pseudonym "Student," and the statistic became known as "Student's t."