Statistical Details for Analysis of Variance


Definitions

Analysis of variance (ANOVA) is a technique for analyzing experimental data in which one or more response (or dependent or simply Y) variables are measured under various conditions identified by one or more classification variables. The combinations of levels for the classification variables form the cells of the experimental design for the data. For example, an experiment may measure weight change (the dependent variable) for men and women who participated in three different weight-loss programs. The six cells of the design are formed by the six combinations of sex (men, women) and program (A, B, C).
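For concreteness, the following is a minimal PROC GLM sketch of such an analysis. The data set name WtLoss and the variable names Sex, Program, and WtChange are illustrative, not taken from this guide:

   proc glm data=WtLoss;
      class Sex Program;                         /* classification variables           */
      model WtChange = Sex Program Sex*Program;  /* main effects and their interaction */
   run;

The CLASS statement identifies the classification variables whose level combinations form the six cells, and the MODEL statement requests tests for the two main effects and their interaction.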

In an analysis of variance, the variation in the response is separated into variation attributable to differences between the classification variables and variation attributable to random error. An analysis of variance constructs tests to determine the significance of the classification effects. A typical goal in an analysis of variance is to compare means of the response variable for various combinations of the classification variables.

An analysis of variance may be written as a linear model. Analysis of variance procedures in SAS/STAT software use the model to predict the response for each observation. The difference between the actual and predicted response is the residual error. Most of the procedures fit model parameters that minimize the sum of squares of residual errors. Thus, the method is called least squares regression. The variance due to the random error, σ², is estimated by the mean squared error (MSE or s²).
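One standard way to write this estimate, for n observations and a model of rank p, is

\[ s^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n - p} = \frac{1}{n - p} \sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^2 \]

where ŷ_i denotes the predicted response for the ith observation.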

Fixed and Random Effects

The explanatory classification variables in an ANOVA design may represent fixed or random effects. The levels of a classification variable for a fixed effect give all the levels of interest, while the levels of a classification variable for a random effect are typically a subset of levels selected from a population of levels. The following are examples.

  • In a large drug trial, the levels that correspond to types of drugs are usually considered to comprise a fixed effect, but the levels corresponding to the various clinics where the drugs are administered comprise a random effect.

  • In agricultural experiments, it is common to declare locations (or plots) as random because the levels are chosen randomly from a large population of locations and you assume fertility to vary normally across locations.

  • In repeated-measures experiments with people or animals as subjects, subjects are declared random because they are selected from the larger population to which you want to generalize.

A typical assumption is that random effects have values drawn from a normally distributed random process with mean zero and common variance. Effects are declared random when the levels are randomly selected from a large population of possible levels. Inferences are made using only a few levels but can be generalized across the whole population of random effects levels.

The consequence of having random effects in your model is that some observations are no longer uncorrelated but instead have a covariance that depends on the variance of the random effect. In fact, a more general approach to random effect models is to model the covariance between observations.
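As a sketch of this approach, the drug-trial example above might be fit with PROC MIXED, which models the covariance structure directly. The data set name Trial and the variable names Drug, Clinic, and Response are illustrative:

   proc mixed data=Trial;
      class Drug Clinic;
      model Response = Drug;   /* Drug is a fixed effect    */
      random Clinic;           /* Clinic is a random effect */
   run;

The RANDOM statement declares Clinic as a random effect, which induces a common covariance between observations from the same clinic.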

Tests of Effects

Analysis of variance tests are constructed by comparing independent mean squares. To test a particular null hypothesis, you compute the ratio of two mean squares that have the same expected value under that hypothesis; if the ratio is much larger than 1, that constitutes significant evidence against the null. In particular, in an analysis-of-variance model with fixed effects only, the expected value of each mean square has two components: quadratic functions of fixed parameters and random variation. For example, for a fixed effect called A, the expected value of its mean square is

\[ \mathrm{E}[\mathrm{MS}(A)] = Q(\beta) + \sigma^2 \]

Under the null hypothesis of no A effect, the fixed portion Q(β) of the expected mean square is zero. This mean square is then compared to another mean square, say MS(E), that is independent of the first and has expected value σ². The ratio of the two mean squares

\[ F = \frac{\mathrm{MS}(A)}{\mathrm{MS}(E)} \]

has the F distribution under the null hypothesis. When the null hypothesis is false, the numerator term has a larger expected value, but the expected value of the denominator remains the same. Thus, large F values lead to rejection of the null hypothesis. The probability of getting an F value at least as large as the one observed, given that the null hypothesis is true, is called the significance probability value (or the p-value). A p-value of less than 0.05, for example, indicates that data with no real A effect will yield F values as large as the one observed less than 5% of the time. This is usually considered moderate evidence that there is a real A effect. Smaller p-values constitute even stronger evidence. Larger p-values indicate that the observed F value is consistent with random variation alone; in this case, you can conclude either that there is no effect at all or that you do not have enough data to detect the differences being tested.
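As a small illustration of the computation, the following DATA step converts an observed F value into a p-value with the PROBF function (the cumulative distribution function of the F distribution). The F value and degrees of freedom here are hypothetical:

   data _null_;
      F   = 5.2;                      /* hypothetical observed F value          */
      df1 = 2;                        /* numerator degrees of freedom           */
      df2 = 12;                       /* denominator (error) degrees of freedom */
      p   = 1 - probf(F, df1, df2);   /* upper-tail probability = p-value       */
      put 'p-value = ' p;
   run;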

General Linear Models

An analysis-of-variance model can be written as a linear model, which is an equation that predicts the response as a linear function of parameters and design variables. In general,

\[ y_i = \beta_0 x_{i0} + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \epsilon_i \]

where y_i is the response for the ith observation, β_k are unknown parameters to be estimated, x_ij are design variables, and ε_i is the random error. Design variables for analysis of variance are indicator variables; that is, they are always either 0 or 1.

The simplest model is to fit a single mean to all observations. In this case there is only one parameter, β₀, and one design variable, x_{i0}, which always has the value of 1:

\[ y_i = \beta_0 x_{i0} + \epsilon_i = \beta_0 + \epsilon_i \]

The least-squares estimator of β₀ is the mean of the y_i. This simple model underlies all more complex models, and all larger models are compared to this simple mean model. In writing the parameterization of a linear model, β₀ is usually referred to as the intercept.
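To see why the mean is the least-squares estimate, set the derivative of the residual sum of squares with respect to β₀ to zero:

\[ \frac{d}{d\beta_0} \sum_{i=1}^{n} (y_i - \beta_0)^2 = -2 \sum_{i=1}^{n} (y_i - \beta_0) = 0 \quad\Longrightarrow\quad \hat{\beta}_0 = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y} \]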

A one-way model is written by introducing an indicator variable for each level of the classification variable. Suppose that a variable A has four levels, with two observations per level. The indicator variables are created as follows:

   Intercept   A1   A2   A3   A4
       1        1    0    0    0
       1        1    0    0    0
       1        0    1    0    0
       1        0    1    0    0
       1        0    0    1    0
       1        0    0    1    0
       1        0    0    0    1
       1        0    0    0    1

The linear model for this example is

\[ y_i = \beta_0 + \beta_1 A1_i + \beta_2 A2_i + \beta_3 A3_i + \beta_4 A4_i + \epsilon_i \]

To construct crossed and nested effects, you can simply multiply out all combinations of the main-effect columns. This is described in detail in 'Specification of Effects' in Chapter 32, 'The GLM Procedure.'
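For instance, in PROC GLM model syntax, crossed effects are written with an asterisk and nested effects with parentheses. The data set and variable names below are illustrative:

   proc glm data=Yield;
      class Block Variety;
      model Y = Block Variety Block*Variety;   /* Block*Variety is a crossed effect */
   run;

   proc glm data=Yield;
      class Field Plot;
      model Y = Field Plot(Field);             /* Plot is nested within Field */
   run;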

Linear Hypotheses

When models are expressed in the framework of linear models, hypothesis tests are expressed in terms of linear functions of the parameters. For example, you may want to test that β₂ − β₃ = 0. In general, the coefficients for a linear hypothesis are some set of Ls:

\[ H_0\colon\; L_0\beta_0 + L_1\beta_1 + \cdots + L_k\beta_k = 0 \]
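In PROC GLM, a linear hypothesis of this kind can be tested with a CONTRAST statement. For the four-level variable A from the previous section, a sketch (the data set and response names are illustrative):

   proc glm data=Example;
      class A;
      model y = A;
      contrast 'A2 vs A3' A 0 1 -1 0;   /* tests beta2 - beta3 = 0 */
   run;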

Several of these linear functions can be combined to make one joint test. These tests can be expressed in one matrix equation:

\[ H_0\colon\; L\beta = 0 \]

For each linear hypothesis, a sum of squares (SS) due to that hypothesis can be constructed. These sums of squares can be calculated either as a quadratic form of the estimates

\[ \mathrm{SS}(H_0\colon L\beta = 0) = (Lb)' \bigl( L (X'X)^{-} L' \bigr)^{-1} (Lb) \]

(where b is the vector of least-squares parameter estimates and (X'X)⁻ denotes a generalized inverse of X'X)

or, equivalently, as the increase in the sum of squares for error (SSE) for the model constrained by the null hypothesis:

\[ \mathrm{SS}(H_0\colon L\beta = 0) = \mathrm{SSE}_{\text{constrained}} - \mathrm{SSE}_{\text{full}} \]

This SS is then divided by its degrees of freedom and used as the numerator of an F statistic.
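Written out, with r denoting the rank of L (the number of linearly independent rows), the statistic is

\[ F = \frac{\mathrm{SS}(H_0\colon L\beta = 0) \,/\, r}{\mathrm{MSE}} \]

which under the null hypothesis follows an F distribution with r numerator degrees of freedom and the error degrees of freedom in the denominator.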



