Details


Statistical Assumptions for Using PROC GLM

The basic statistical assumption underlying the least-squares approach to general linear modeling is that the observed values of each dependent variable can be written as the sum of two parts: a fixed component x′β, which is a linear function of the independent coefficients, and a random noise, or error, component ε:

  Y = x′β + ε

The independent coefficients x are constructed from the model effects as described in the Parameterization of PROC GLM Models section on page 1787. Further, the errors for different observations are assumed to be uncorrelated with identical variances. Thus, this model can be written

  E(Y) = Xβ,   Var(Y) = σ²I

where Y is the vector of dependent variable values, X is the matrix of independent coefficients, I is the identity matrix, and σ² is the common variance for the errors. For multiple dependent variables, the model is similar except that the errors for different dependent variables within the same observation are not assumed to be uncorrelated. This yields a multivariate linear model of the form

  E(Y) = XB,   Var(vec(Y)) = Σ ⊗ I

where Y and B are now matrices, with one column for each dependent variable, vec(Y) strings Y out by rows, and ⊗ indicates the Kronecker matrix product.

Under the assumptions thus far discussed, the least-squares approach provides estimates of the linear parameters that are unbiased and have minimum variance among linear estimators. Under the further assumption that the errors have a normal (or Gaussian) distribution, the least-squares estimates are the maximum likelihood estimates and their distribution is known. All of the significance levels ( p values) and confidence limits calculated by the GLM procedure require this assumption of normality in order to be exactly valid, although they are good approximations in many other cases.

Specification of Effects

Each term in a model, called an effect, is a variable or combination of variables. Effects are specified with a special notation using variable names and operators. There are two kinds of variables: classification (or class) variables and continuous variables. There are two primary operators: crossing and nesting. A third operator, the bar operator, is used to simplify effect specification.

In an analysis-of-variance model, independent variables must be variables that identify classification levels. In the SAS System, these are called class variables and are declared in the CLASS statement. (They can also be called categorical, qualitative, discrete, or nominal variables.) Class variables can be either numeric or character. The values of a class variable are called levels. For example, the class variable Sex has the levels male and female.

In a model, an independent variable that is not declared in the CLASS statement is assumed to be continuous. Continuous variables, which must be numeric, are used for response variables and covariates. For example, the heights and weights of subjects are continuous variables.
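For example, a minimal sketch along these lines (the data set and variable names are hypothetical) declares Sex as a class variable, while Height enters as a continuous covariate:

  proc glm data=subjects;
     class sex;                    /* classification variable */
     model weight = sex height;   /* height is a continuous covariate */
  run;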

Types of Effects

There are seven different types of effects used in the GLM procedure. In the following list, assume that A, B, C, D, and E are class variables and that X1, X2, and Y are continuous variables:

  • Regressor effects are specified by writing continuous variables by themselves: X1 X2.

  • Polynomial effects are specified by joining two or more continuous variables with asterisks: X1*X1 X1*X2.

  • Main effects are specified by writing class variables by themselves: A B C.

  • Crossed effects (interactions) are specified by joining class variables with asterisks: A*B B*C A*B*C.

  • Nested effects are specified by following a main effect or crossed effect with a class variable or list of class variables enclosed in parentheses. The main effect or crossed effect is nested within the effects listed in parentheses:

    B(A)  C(B*A)  D*E(C*B*A)

    In this example, B(A) is read B nested within A.

  • Continuous-by-class effects are written by joining continuous variables and class variables with asterisks: X1*A.

  • Continuous-nesting-class effects consist of continuous variables followed by a class variable interaction enclosed in parentheses: X1(A) X1*X2(A*B).

One example of the general form of an effect involving several variables is

  X1*X2*A*B*C(D E)

This example contains crossed continuous terms by crossed classification terms nested within more than one class variable. The continuous list comes first, followed by the crossed list, followed by the nesting list in parentheses. Note that asterisks can appear within the nested list but not immediately before the left parenthesis. For details on how the design matrix and parameters are defined with respect to the effects specified in this section, see the section Parameterization of PROC GLM Models on page 1787.

The MODEL statement and several other statements use these effects. Some examples of MODEL statements using various kinds of effects are shown in the following table; a, b, and c represent class variables, and y, y1, y2, x, and z represent continuous variables.

  Specification             Kind of Model

  model y=x;                simple regression
  model y=x z;              multiple regression
  model y=x x*x;            polynomial regression
  model y1 y2=x z;          multivariate regression
  model y=a;                one-way ANOVA
  model y=a b c;            main effects model
  model y=a b a*b;          factorial model (with interaction)
  model y=a b(a) c(b a);    nested model
  model y1 y2=a b;          multivariate analysis of variance (MANOVA)
  model y=a x;              analysis-of-covariance model
  model y=a x(a);           separate-slopes model
  model y=a x x*a;          homogeneity-of-slopes model

The Bar Operator

You can shorten the specification of a large factorial model using the bar operator. For example, two ways of writing the model for a full three-way factorial model are

  proc glm;                            proc glm;
     class A B C;                         class A B C;
     model Y=A B C A*B          and       model Y=A|B|C;
             A*C B*C A*B*C;            run;
  run;

When the bar (|) is used, the right and left sides become effects, and the cross of them becomes an effect. Multiple bars are permitted. The expressions are expanded from left to right, using rules 2-4 given in Searle (1971, p. 390).

  • Multiple bars are evaluated left to right. For instance, A|B|C is evaluated as follows.

    A|B|C  →  {A|B}|C
           →  {A B A*B}|C
           →  A B A*B C A*C B*C A*B*C

  • Crossed and nested groups of variables are combined. For example, A(B)|C(D) generates A*C(B D), among other terms.

  • Duplicate variables are removed. For example, A(C)|B(C) generates A*B(C C), among other terms, and the extra C is removed.

  • Effects are discarded if a variable occurs on both the crossed and nested parts of an effect. For instance, A(B)|B(D E) generates A*B(B D E), but this effect is eliminated immediately.

You can also specify the maximum number of variables involved in any effect that results from bar evaluation by specifying that maximum number, preceded by an @ sign, at the end of the bar effect. For example, the specification A|B|C@2 would result in only those effects that contain 2 or fewer variables: in this case, A B A*B C A*C and B*C.

The following table gives more examples of using the bar and at operators.

  A|C(B)           is equivalent to    A C(B) A*C(B)
  A(B)|C(B)        is equivalent to    A(B) C(B) A*C(B)
  A(B)|B(D E)      is equivalent to    A(B) B(D E)
  A|B(A)|C         is equivalent to    A B(A) C A*C B*C(A)
  A|B(A)|C@2       is equivalent to    A B(A) C A*C
  A|B|C|D@2        is equivalent to    A B A*B C A*C B*C D A*D B*D C*D
  A*B(C*D)         is equivalent to    A*B(C D)

Using PROC GLM Interactively

You can use the GLM procedure interactively. After you specify a model with a MODEL statement and run PROC GLM with a RUN statement, you can execute a variety of statements without reinvoking PROC GLM.

The Syntax section (page 1742) describes which statements can be used interactively. These interactive statements can be executed singly or in groups by following the single statement or group of statements with a RUN statement. Note that the MODEL statement cannot be repeated; PROC GLM allows only one MODEL statement.

If you use PROC GLM interactively, you can end the GLM procedure with a DATA step, another PROC step, an ENDSAS statement, or a QUIT statement.

When you are using PROC GLM interactively, additional RUN statements do not end the procedure but tell PROC GLM to execute additional statements.
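The following sketch illustrates this interactive pattern (the data set and variable names are hypothetical): the model is fit once, and subsequent statements are submitted with additional RUN statements.

  proc glm data=mydata;
     class a;
     model y = a;
  run;                      /* fits the model and displays the ANOVA table */

     means a / tukey;
  run;                      /* executed without reinvoking PROC GLM */

     lsmeans a / pdiff;
  run;
  quit;                     /* ends the procedure */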

When you specify a WHERE statement with PROC GLM, it should appear before the first RUN statement. The WHERE statement enables you to select only certain observations for analysis without using a subsetting DATA step. For example, the statement where group ne 5 omits observations with GROUP=5 from the analysis. Refer to SAS Language Reference: Dictionary for details on this statement.
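For instance, a sketch along these lines (data set and variable names assumed) analyzes only the observations with GROUP not equal to 5:

  proc glm data=mydata;
     class group;
     where group ne 5;      /* must appear before the first RUN statement */
     model score = group;
  run;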

When you specify a BY statement with PROC GLM, interactive processing is not possible; that is, once the first RUN statement is encountered, processing proceeds for each BY group in the data set, and no further statements are accepted by the procedure.

Interactivity is also disabled when there are different patterns of missing values among the dependent variables. For details, see the Missing Values section on page 1836.

Parameterization of PROC GLM Models

The GLM procedure constructs a linear model according to the specifications in the MODEL statement. Each effect generates one or more columns in a design matrix X . This section shows precisely how X is built.

Intercept

All models include a column of 1s by default to estimate an intercept parameter μ. You can use the NOINT option to suppress the intercept.

Regression Effects

Regression effects (covariates) have the values of the variables copied into the design matrix directly. Polynomial terms are multiplied out and then installed in X .

Main Effects

If a class variable has m levels, PROC GLM generates m columns in the design matrix for its main effect. Each column is an indicator variable for one of the levels of the class variable. The default order of the columns is the sort order of the values of their levels; this order can be controlled with the ORDER= option in the PROC GLM statement, as shown in the following table.

  Data             Design Matrix
                         A            B
  A   B      μ     A1   A2     B1   B2   B3

  1   1      1     1    0      1    0    0
  1   2      1     1    0      0    1    0
  1   3      1     1    0      0    0    1
  2   1      1     0    1      1    0    0
  2   2      1     0    1      0    1    0
  2   3      1     0    1      0    0    1

There are more columns for these effects than there are degrees of freedom for them; in other words, PROC GLM is using an over-parameterized model.

Crossed Effects

First, PROC GLM reorders the terms to correspond to the order of the variables in the CLASS statement; thus, B * A becomes A * B if A precedes B in the CLASS statement. Then, PROC GLM generates columns for all combinations of levels that occur in the data. The order of the columns is such that the rightmost variables in the cross index faster than the leftmost variables. No columns are generated corresponding to combinations of levels that do not occur in the data.

  Data           Design Matrix
                      A           B                        A*B
  A   B    μ    A1   A2   B1   B2   B3   A1B1 A1B2 A1B3 A2B1 A2B2 A2B3

  1   1    1    1    0    1    0    0    1    0    0    0    0    0
  1   2    1    1    0    0    1    0    0    1    0    0    0    0
  1   3    1    1    0    0    0    1    0    0    1    0    0    0
  2   1    1    0    1    1    0    0    0    0    0    1    0    0
  2   2    1    0    1    0    1    0    0    0    0    0    1    0
  2   3    1    0    1    0    0    1    0    0    0    0    0    1

In this matrix, main-effects columns are not linearly independent of crossed-effect columns; in fact, the column space for the crossed effects contains the space of the main effect.

Nested Effects

Nested effects are generated in the same manner as crossed effects. Hence, the design columns generated by the following statements are the same (but the ordering of the columns is different):

model y=a b(a);

(B nested within A)

model y=a a*b;

(omitted main effect for B)

The nesting operator in PROC GLM is more a notational convenience than an operation distinct from crossing. Nested effects are characterized by the property that the nested variables never appear as main effects. The order of the variables within nesting parentheses is made to correspond to the order of these variables in the CLASS statement. The order of the columns is such that variables outside the parentheses index faster than those inside the parentheses, and the rightmost nested variables index faster than the leftmost variables.

  Data           Design Matrix
                      A                  B(A)
  A   B    μ    A1   A2   B1A1 B2A1 B3A1 B1A2 B2A2 B3A2

  1   1    1    1    0    1    0    0    0    0    0
  1   2    1    1    0    0    1    0    0    0    0
  1   3    1    1    0    0    0    1    0    0    0
  2   1    1    0    1    0    0    0    1    0    0
  2   2    1    0    1    0    0    0    0    1    0
  2   3    1    0    1    0    0    0    0    0    1

Continuous-Nesting-Class Effects

When a continuous variable nests with a class variable, the design columns are constructed by multiplying the continuous values into the design columns for the class effect.

  Data            Design Matrix
                       A            X(A)
  X    A    μ    A1   A2    X(A1)  X(A2)

  21   1    1    1    0     21     0
  24   1    1    1    0     24     0
  22   1    1    1    0     22     0
  28   2    1    0    1     0      28
  19   2    1    0    1     0      19
  23   2    1    0    1     0      23

This model estimates a separate slope for X within each level of A .
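As a sketch, assuming class variable A and covariate X as in the preceding table, such a separate-slopes model is requested with the X(A) effect:

  proc glm;
     class a;
     model y = a x(a);   /* one intercept and one slope per level of a */
  run;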

Continuous-by-Class Effects

Continuous-by-class effects generate the same design columns as continuous-nesting-class effects. The two models differ by the presence of the continuous variable as a regressor by itself, in addition to being a contributor to X * A .

  Data            Design Matrix
                             A            X*A
  X    A    μ    X     A1   A2    X*A1   X*A2

  21   1    1    21    1    0     21     0
  24   1    1    24    1    0     24     0
  22   1    1    22    1    0     22     0
  28   2    1    28    0    1     0      28
  19   2    1    19    0    1     0      19
  23   2    1    23    0    1     0      23

Continuous-by-class effects are used to test the homogeneity of slopes. If the continuous-by-class effect is nonsignificant, the effect can be removed so that the response with respect to X is the same for all levels of the class variables.
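A minimal sketch of this test sequence (names assumed): fit the model with the continuous-by-class effect, and consider dropping it if the X*A test is nonsignificant.

  proc glm;
     class a;
     model y = a x x*a;   /* the test for x*a assesses equality of slopes */
  run;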

General Effects

An example that combines all the effects is

  X1*X2*A*B*C(D E)

The continuous list comes first, followed by the crossed list, followed by the nested list in parentheses.

The sequencing of parameters is important to learn if you use the CONTRAST or ESTIMATE statement to compute or test some linear function of the parameter estimates.

Effects may be retitled by PROC GLM to correspond to ordering rules. For example, B*A(E D) may be retitled A*B(D E) to satisfy the following:

  • Class variables that occur outside parentheses (crossed effects) are sorted in the order in which they appear in the CLASS statement.

  • Variables within parentheses (nested effects) are sorted in the order in which they appear in a CLASS statement.

The sequencing of the parameters generated by an effect can be described by which variables have their levels indexed faster:

  • Variables in the crossed part index faster than variables in the nested list.

  • Within a crossed or nested list, variables to the right index faster than variables to the left.

For example, suppose a model includes four effects, A, B, C, and D, each having two levels, 1 and 2. If the CLASS statement is

  class A B C D;

then the order of the parameters for the effect B*A(C D), which is retitled A*B(C D), is as follows.

  • A1 B1 C1 D1
  • A1 B2 C1 D1
  • A2 B1 C1 D1
  • A2 B2 C1 D1
  • A1 B1 C1 D2
  • A1 B2 C1 D2
  • A2 B1 C1 D2
  • A2 B2 C1 D2
  • A1 B1 C2 D1
  • A1 B2 C2 D1
  • A2 B1 C2 D1
  • A2 B2 C2 D1
  • A1 B1 C2 D2
  • A1 B2 C2 D2
  • A2 B1 C2 D2
  • A2 B2 C2 D2

Note that first the crossed effects B and A are sorted in the order in which they appear in the CLASS statement so that A precedes B in the parameter list. Then, for each combination of the nested effects in turn, combinations of A and B appear. The B effect changes fastest because it is rightmost in the (renamed) cross list. Then A changes next fastest. The D effect changes next fastest, and C is the slowest since it is leftmost in the nested list.

When numeric class variables are used, their levels are sorted by their character format, which may not correspond to their numeric sort sequence. Therefore, it is advisable to include a format for numeric class variables or to use the ORDER=INTERNAL option in the PROC GLM statement to ensure that levels are sorted by their internal values.

Degrees of Freedom

For models with classification (categorical) effects, there are more design columns constructed than there are degrees of freedom for the effect. Thus, there are linear dependencies among the columns. In this event, the parameters are not jointly estimable; there is an infinite number of least-squares solutions. The GLM procedure uses a generalized (g2) inverse to obtain values for the estimates; see the Computational Method section on page 1840 for more details. The solution values are not produced unless the SOLUTION option is specified in the MODEL statement. The solution has the characteristic that estimates are zero whenever the design column for that parameter is a linear combination of previous columns. (Strictly speaking, the solution values should not be called estimates, since the parameters may not be formally estimable.) With this full parameterization, hypothesis tests are constructed to test linear functions of the parameters that are estimable.

Other procedures (such as the CATMOD procedure) reparameterize models to full rank using certain restrictions on the parameters. PROC GLM does not reparameterize, making the hypotheses that are commonly tested more understandable. See Goodnight (1978a) for additional reasons for not reparameterizing.

PROC GLM does not actually construct the entire design matrix X; rather, a row x_i of X is constructed for each observation in the data set and used to accumulate the crossproduct matrix X′X = Σ_i x_i′ x_i.

Hypothesis Testing in PROC GLM

See Chapter 11, The Four Types of Estimable Functions, for a complete discussion of the four standard types of hypothesis tests.

Example

To illustrate the four types of tests and the principles upon which they are based, consider a two-way design with interaction based on the following data:

 

             B
             1             2

      1   23.5  23.7      28.7

  A   2   8.9             5.6   8.9

      3   10.3  12.5      13.6  14.6

Invoke PROC GLM and specify all the estimable functions options to examine what the GLM procedure can test. The following statements are followed by the summary ANOVA table. See Figure 32.8.

  data example;
     input a b y @@;
     datalines;
  1 1 23.5  1 1 23.7  1 2 28.7  2 1  8.9  2 2  5.6
  2 2  8.9  3 1 10.3  3 1 12.5  3 2 13.6  3 2 14.6
  ;

  proc glm;
     class a b;
     model y=a b a*b / e e1 e2 e3 e4;
  run;
                          The GLM Procedure

  Dependent Variable: y

                                   Sum of
  Source              DF          Squares     Mean Square   F Value   Pr > F
  Model                5      520.4760000     104.0952000     49.66   0.0011
  Error                4        8.3850000       2.0962500
  Corrected Total      9      528.8610000

  R-Square     Coeff Var      Root MSE        y Mean
  0.984145      9.633022      1.447843      15.03000

Figure 32.8: Summary ANOVA Table from PROC GLM

The following sections show the general form of estimable functions and discuss the four standard tests, their properties, and abbreviated output for the two-way crossed example.

Estimability

Figure 32.9 is the general form of estimable functions for the example. In order to be testable, a hypothesis must be able to fit within the framework displayed here.

                          The GLM Procedure

                General Form of Estimable Functions

  Effect            Coefficients

  Intercept         L1

  a         1       L2
  a         2       L3
  a         3       L1-L2-L3

  b         1       L5
  b         2       L1-L5

  a*b       1 1     L7
  a*b       1 2     L2-L7
  a*b       2 1     L9
  a*b       2 2     L3-L9
  a*b       3 1     L5-L7-L9
  a*b       3 2     L1-L2-L3-L5+L7+L9

Figure 32.9: General Form of Estimable Functions

If a hypothesis is estimable, the Ls in the preceding scheme can be set to values that match the hypothesis. All the standard tests in PROC GLM can be shown in the preceding format, with some of the Ls zeroed and some set to functions of other Ls.

The following sections show how many of the hypotheses can be tested by comparing the model sum of squares from one model to that of a submodel. The notation used is

  SS(B effects | A effects) = SS(B effects, A effects) − SS(A effects)

where SS(A effects) denotes the regression model sum of squares for the model consisting of A effects. This notation is equivalent to the reduction notation defined by Searle (1971) and summarized in Chapter 11, The Four Types of Estimable Functions.

Type I Tests

Type I sums of squares (SS), also called sequential sums of squares , are the incremental improvement in error sums of squares as each effect is added to the model. They can be computed by fitting the model in steps and recording the difference in error sum of squares at each step.

  Source      Type I SS

  A           SS(A | μ)
  B           SS(B | μ, A)
  A*B         SS(A*B | μ, A, B)
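As a sketch of this sequential computation for the running example, each Type I SS is the increase in model SS (equivalently, the decrease in error SS) from one of the following fits to the next:

  proc glm data=example; class a b; model y = a;       run;
  proc glm data=example; class a b; model y = a b;     run;
  proc glm data=example; class a b; model y = a b a*b; run;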

Type I sums of squares are displayed by default because they are easy to obtain and can be used in various hand calculations to produce sum of squares values for a series of different models. Nelder (1994) and others have argued that Type I and II sums are essentially the only appropriate ones for testing ANOVA effects; however, refer also to the discussion of Nelder's article, especially Rodriguez et al. (1995) and Searle (1995).

The Type I hypotheses have these properties:

  • Type I sums of squares for all effects add up to the model sum of squares. None of the other sum-of-squares types have this property, except in special cases.

  • Type I hypotheses can be derived from rows of the Forward-Doolittle transformation of X′X (a transformation that reduces X′X to an upper triangular matrix by row operations).

  • Type I sums of squares are statistically independent of each other under the usual assumption that the true residual errors are independent and identically normally distributed (see page 1783).

  • Type I hypotheses depend on the order in which effects are specified in the MODEL statement.

  • Type I hypotheses are uncontaminated by parameters corresponding to effects that precede the effect being tested; however, the hypotheses usually involve parameters for effects following the tested effect in the model. For example, in the model

      Y=A B;  

    the Type I hypothesis for B does not involve A parameters, but the Type I hypothesis for A does involve B parameters.

  • Type I hypotheses are functions of the cell counts for unbalanced data; the hypotheses are not usually the same hypotheses that are tested if the data are balanced.

  • Type I sums of squares are useful for polynomial models where you want to know the contribution of a term as though it had been made orthogonal to preceding effects. Thus, in polynomial models, Type I sums of squares correspond to tests of the orthogonal polynomial effects.

The Type I estimable functions and associated tests for the example are shown in Figure 32.10. (This combines tables from several pages of output.)

                          The GLM Procedure

                    Type I Estimable Functions

                        ----------------Coefficients----------------
  Effect            a                        b               a*b

  Intercept         0                        0               0

  a         1       L2                       0               0
  a         2       L3                       0               0
  a         3       -L2-L3                   0               0

  b         1       0.1667*L2-0.1667*L3      L5              0
  b         2       -0.1667*L2+0.1667*L3     -L5             0

  a*b       1 1     0.6667*L2                0.2857*L5       L7
  a*b       1 2     0.3333*L2                -0.2857*L5      -L7
  a*b       2 1     0.3333*L3                0.2857*L5       L9
  a*b       2 2     0.6667*L3                -0.2857*L5      -L9
  a*b       3 1     -0.5*L2-0.5*L3           0.4286*L5       -L7-L9
  a*b       3 2     -0.5*L2-0.5*L3           -0.4286*L5      L7+L9


                          The GLM Procedure

  Dependent Variable: y

  Source              DF       Type I SS     Mean Square   F Value   Pr > F
  a                    2     494.0310000     247.0155000    117.84   0.0003
  b                    1      10.7142857      10.7142857      5.11   0.0866
  a*b                  2      15.7307143       7.8653571      3.75   0.1209

Figure 32.10: Type I Estimable Functions and Associated Tests

Type II Tests

The Type II tests can also be calculated by comparing the error sums of squares (SS) for subset models. The Type II SS are the reduction in error SS due to adding the term after all other terms have been added to the model except terms that contain the effect being tested. An effect is contained in another effect if it can be derived by deleting variables from the latter effect. For example, A and B are both contained in A * B . For this model

  Source      Type II SS

  A           SS(A | μ, B)
  B           SS(B | μ, A)
  A*B         SS(A*B | μ, A, B)

Type II SS have these properties:

  • Type II SS do not necessarily sum to the model SS.

  • The hypothesis for an effect does not involve parameters of other effects except for containing effects (which it must involve to be estimable).

  • Type II SS are invariant to the ordering of effects in the model.

  • For unbalanced designs, Type II hypotheses for effects that are contained in other effects are not usually the same hypotheses that are tested if the data are balanced. The hypotheses are generally functions of the cell counts.

The Type II estimable functions and associated tests for the example are shown in Figure 32.11. (Again, this combines tables from several pages of output.)

                          The GLM Procedure

                    Type II Estimable Functions

                        ----------------Coefficients----------------
  Effect            a                        b               a*b

  Intercept         0                        0               0

  a         1       L2                       0               0
  a         2       L3                       0               0
  a         3       -L2-L3                   0               0

  b         1       0                        L5              0
  b         2       0                        -L5             0

  a*b       1 1     0.619*L2+0.0476*L3       0.2857*L5       L7
  a*b       1 2     0.381*L2-0.0476*L3       -0.2857*L5      -L7
  a*b       2 1     -0.0476*L2+0.381*L3      0.2857*L5       L9
  a*b       2 2     0.0476*L2+0.619*L3       -0.2857*L5      -L9
  a*b       3 1     -0.5714*L2-0.4286*L3     0.4286*L5       -L7-L9
  a*b       3 2     -0.4286*L2-0.5714*L3     -0.4286*L5      L7+L9


                          The GLM Procedure

  Dependent Variable: y

  Source              DF      Type II SS     Mean Square   F Value   Pr > F
  a                    2     499.1202857     249.5601429    119.05   0.0003
  b                    1      10.7142857      10.7142857      5.11   0.0866
  a*b                  2      15.7307143       7.8653571      3.75   0.1209

Figure 32.11: Type II Estimable Functions and Associated Tests

Type III and Type IV Tests

Type III and Type IV sums of squares (SS), sometimes referred to as partial sums of squares, are considered by many to be the most desirable; see Searle (1987, Section 4.6). These SS cannot, in general, be computed by comparing model SS from several models using PROC GLM's parameterization. However, they can sometimes be computed by reduction for methods that reparameterize to full rank, when such a reparameterization effectively imposes Type III linear constraints on the parameters. In PROC GLM, they are computed by constructing a hypothesis matrix L and then computing the SS associated with the hypothesis Lβ = 0. As long as there are no missing cells in the design, Type III and Type IV SS are the same.

These are properties of Type III and Type IV SS:

  • The hypothesis for an effect does not involve parameters of other effects except for containing effects (which it must involve to be estimable).

  • The hypotheses to be tested are invariant to the ordering of effects in the model.

  • The hypotheses are the same hypotheses that are tested if there are no missing cells. They are not functions of cell counts.

  • The SS do not generally add up to the model SS and, in some cases, can exceed the model SS.

The SS are constructed from the general form of estimable functions. Type III and Type IV tests are different only if the design has missing cells. In this case, the Type III tests have an orthogonality property, while the Type IV tests have a balancing property. These properties are discussed in Chapter 11, The Four Types of Estimable Functions. For this example, since the data contains observations for all pairs of levels of A and B, Type IV tests are identical to the Type III tests that are shown in Figure 32.12. (This combines tables from several pages of output.)

                          The GLM Procedure

                    Type III Estimable Functions

                        -------------Coefficients-------------
  Effect            a                  b               a*b

  Intercept         0                  0               0

  a         1       L2                 0               0
  a         2       L3                 0               0
  a         3       -L2-L3             0               0

  b         1       0                  L5              0
  b         2       0                  -L5             0

  a*b       1 1     0.5*L2             0.3333*L5       L7
  a*b       1 2     0.5*L2             -0.3333*L5      -L7
  a*b       2 1     0.5*L3             0.3333*L5       L9
  a*b       2 2     0.5*L3             -0.3333*L5      -L9
  a*b       3 1     -0.5*L2-0.5*L3     0.3333*L5       -L7-L9
  a*b       3 2     -0.5*L2-0.5*L3     -0.3333*L5      L7+L9


                          The GLM Procedure

  Dependent Variable: y

  Source              DF     Type III SS     Mean Square   F Value   Pr > F
  a                    2     479.1078571     239.5539286    114.28   0.0003
  b                    1       9.4556250       9.4556250      4.51   0.1009
  a*b                  2      15.7307143       7.8653571      3.75   0.1209

Figure 32.12: Type III Estimable Functions and Associated Tests

Absorption

Absorption is a computational technique used to reduce computing resource needs in certain cases. The classic use of absorption occurs when a blocking factor with a large number of levels is a term in the model.

For example, the statements

  proc glm;
     absorb herd;
     class a b;
     model y=a b a*b;
  run;

are equivalent to

  proc glm;
     class herd a b;
     model y=herd a b a*b;
  run;

The exception to the previous statements is that the Type II, Type III, or Type IV SS for HERD are not computed when HERD is absorbed.

The algorithm for absorbing variables is similar to the one used by the NESTED procedure for computing a nested analysis of variance. As each new row of [ X Y ] (corresponding to the nonabsorbed independent effects and the dependent variables) is constructed, it is adjusted for the absorbed effects in a Type I fashion. The efficiency of the absorption technique is due to the fact that this adjustment can be done in one pass of the data and without solving any linear equations, assuming that the data have been sorted by the absorbed variables.
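For example, a sketch of this usage (the data set name is assumed) sorts by the absorbed variable before invoking the procedure:

  proc sort data=mydata;
     by herd;
  run;

  proc glm data=mydata;
     absorb herd;            /* requires mydata to be sorted by herd */
     class a b;
     model y = a b a*b;
  run;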

Several effects can be absorbed at one time. For example, these statements

  proc glm;
     absorb herd cow;
     class a b;
     model y=a b a*b;
  run;

are equivalent to

  proc glm;
     class herd cow a b;
     model y=herd cow(herd) a b a*b;
  run;

When you use absorption, the size of the X′X matrix is a function only of the effects in the MODEL statement. The effects being absorbed do not contribute to the size of the X′X matrix.

For the preceding example, a and b can be absorbed:

  proc glm;
     absorb a b;
     class herd cow;
     model y=herd cow(herd);
  run;

Although the sources of variation in the results are listed as

  a b(a) herd cow(herd)  

all types of estimable functions for herd and cow(herd) are free of a, b, and a*b parameters.

To illustrate the savings in computing using the ABSORB statement, PROC GLM is run on generated data with 1147 degrees of freedom in the model with the following statements:

  data a;
     do herd=1 to 40;
        do cow=1 to 30;
           do treatment=1 to 3;
              do rep=1 to 2;
                 y = herd/5 + cow/10 + treatment + rannor(1);
                 output;
              end;
           end;
        end;
     end;
  run;

  proc glm;
     class herd cow treatment;
     model y=herd cow(herd) treatment;
  run;

This analysis would have required over 6 megabytes of memory for the X′X matrix had PROC GLM solved it directly. However, in the following statements, the GLM procedure needs only a 4 × 4 matrix for the intercept and treatment because the other effects are absorbed.

  proc glm;
     absorb herd cow;
     class treatment;
     model y = treatment;
  run;

These statements produce the results shown in Figure 32.13.

                          The GLM Procedure

                      Class Level Information

  Class          Levels    Values
  treatment           3    1 2 3

  Number of Observations Read        7200
  Number of Observations Used        7200


                          The GLM Procedure

  Dependent Variable: y

                                     Sum of
  Source               DF           Squares    Mean Square   F Value   Pr > F
  Model              1201       49465.40242       41.18685     41.57   <.0001
  Error              5998        5942.23647        0.99070
  Corrected Total    7199       55407.63889

  R-Square     Coeff Var      Root MSE        y Mean
  0.892754      13.04236      0.995341      7.631598

  Source               DF         Type I SS    Mean Square   F Value   Pr > F
  herd                 39       38549.18655      988.44068    997.72   <.0001
  cow(herd)          1160        6320.18141        5.44843      5.50   <.0001
  treatment             2        4596.03446     2298.01723   2319.58   <.0001

  Source               DF       Type III SS    Mean Square   F Value   Pr > F
  treatment             2       4596.034455    2298.017228   2319.58   <.0001

Figure 32.13: Absorption of Effects

Specification of ESTIMATE Expressions

Consider the model

  Y = β_0 + β_1 X1 + β_2 X2 + β_3 X3 + ε

The corresponding MODEL statement for PROC GLM is

  model y=x1 x2 x3;  

To estimate the difference between the parameters for x1 and x2,

  β_1 − β_2

you can use the following ESTIMATE statement:

  estimate 'B1-B2' x1 1 x2 -1;

To predict y at x1 = 1, x2 = 0, and x3 = −2, you can estimate

  β_0 + β_1 − 2β_3

with the following ESTIMATE statement:

  estimate 'B0+B1-2B3' intercept 1 x1 1 x3 -2;

Now consider models involving class variables such as

  model y=A B A*B;  

with the associated parameters

  μ, α_1, α_2, α_3, β_1, β_2, γ_11, γ_12, γ_21, γ_22, γ_31, γ_32

where A has three levels and B has two. The LS-mean for the first level of A is Lβ, where

  L = (1 | 1 0 0 | 0.5 0.5 | 0.5 0.5 0 0 0 0)

You can estimate this with the following ESTIMATE statement:

  estimate 'LS-mean(A1)' intercept 1 A 1 B 0.5 0.5 A*B 0.5 0.5;

Note in this statement that only one element of L is specified following the A effect, even though A has three levels. Whenever the list of constants following an effect name is shorter than the effect's number of levels, zeros are used as the remaining constants. (If the list of constants is longer than the number of levels for the effect, the extra constants are ignored, and a warning message is displayed.)
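For example, under the preceding model, the following two statements (a sketch) construct the same L for the A effect, because the unspecified trailing coefficients default to zero:

  estimate 'A1 vs A2, short' A 1 -1;      /* padded to A 1 -1 0 */
  estimate 'A1 vs A2, full'  A 1 -1 0;    /* fully specified equivalent */

Adding the E option to either statement would display the constructed L for verification.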

To estimate the A linear effect in the preceding model, assuming equally spaced levels for A, you can use the following L:

  L = (0 | −1 0 1 | 0 0 | −0.5 −0.5 0 0 0.5 0.5)

The ESTIMATE statement for this L is written as

  estimate 'A Linear' A -1 0 1;

If you do not specify the elements of L for an effect that contains a specified effect, then the elements of the specified effect are equally distributed over the corresponding levels of the higher-order effect. In addition, if you specify the intercept in an ESTIMATE or CONTRAST statement, it is distributed over all classification effects that are not contained by any other specified effect. The distribution of lower-order coefficients to higher-order effect coefficients follows the same general rules as in the LSMEANS statement, and it is similar to that used to construct Type IV tests. In the previous example, the −1 associated with α_1 is divided by the number n_1 of γ_1j parameters; then each γ_1j coefficient is set to −1/n_1. The 1 associated with α_3 is distributed among the γ_3j parameters in a similar fashion. In the event that an unspecified effect contains several specified effects, only that specified effect with the most factors in common with the unspecified effect is used for distribution of coefficients to the higher-order effect.

Numerous syntactical expressions for the ESTIMATE statement were considered, including many that involved specifying the effect and level information associated with each coefficient. For models involving higher-level effects, the requirement of specifying level information can lead to very bulky specifications. Consequently, the simpler form of the ESTIMATE statement described earlier was implemented. The syntax of this ESTIMATE statement puts a burden on you to know a priori the order of the parameter list associated with each effect. You can use the ORDER= option in the PROC GLM statement to ensure that the levels of the classification effects are sorted appropriately.

Note: If you use the ESTIMATE statement with unspecified effects, use the E option to make sure that the actual L constructed by the preceding rules is the one you intended.

A Check for Estimability

Each L is checked for estimability using the relationship L = LH, where H = (X′X)⁻X′X. The L vector is declared nonestimable if, for any i,

  |L_i − (LH)_i| > ε          if L_i = 0, or
  |L_i − (LH)_i| > ε |L_i|    otherwise

where ε = 10⁻⁴ by default; you can change this with the SINGULAR= option. Continued fractions (like 1/3) should be specified to at least six decimal places, or the DIVISOR parameter should be used.
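For example, a sketch of comparing the first level of a three-level effect a with the average of the other two:

  estimate 'a1 vs avg(a2,a3)' a 2 -1 -1 / divisor=3;
  /* avoids rounded coefficients such as  a 0.666667 -0.333333 -0.333333 */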

Comparing Groups

An important task in analyzing data with classification effects is to estimate the typical response for each level of a given effect; often, you also want to compare these estimates to determine which levels are equivalent in terms of the response. You can perform this task in two ways with the GLM procedure: with direct, arithmetic group means; and with so-called least-squares means (LS-means).

Means Versus LS-Means

Computing and comparing arithmetic means, either simple or weighted within-group averages of the input data, is a familiar and well-studied statistical process. This is the right approach to summarizing and comparing groups for one-way and balanced designs. However, in unbalanced designs with more than one effect, the arithmetic mean for a group may not accurately reflect the typical response for that group, since it does not take other effects into account.

For example, consider the following analysis of an unbalanced two-way design:

  data twoway;
     input Treatment Block y @@;
     datalines;
  1 1 17   1 1 28   1 1 19   1 1 21   1 1 19
  1 2 43   1 2 30   1 2 39   1 2 44   1 2 44
  1 3 16
  2 1 21   2 1 21   2 1 24   2 1 25
  2 2 39   2 2 45   2 2 42   2 2 47
  2 3 19   2 3 22   2 3 16
  3 1 22   3 1 30   3 1 33   3 1 31
  3 2 46
  3 3 26   3 3 31   3 3 26   3 3 33   3 3 29   3 3 25
  ;

  title1 "Unbalanced Two-way Design";
  ods select ModelANOVA Means LSMeans;

  proc glm data=twoway;
     class Treatment Block;
     model y = Treatment|Block;
     means Treatment;
     lsmeans Treatment;
  run;

  ods select all;

The ANOVA results are shown in Figure 32.14.

                      Unbalanced Two-way Design

                          The GLM Procedure

  Dependent Variable: y

  Source                DF       Type I SS    Mean Square   F Value   Pr > F
  Treatment              2        8.060606       4.030303      0.24   0.7888
  Block                  2     2621.864124    1310.932062     77.95   <.0001
  Treatment*Block        4       32.684361       8.171090      0.49   0.7460

  Source                DF     Type III SS    Mean Square   F Value   Pr > F
  Treatment              2      266.130682     133.065341      7.91   0.0023
  Block                  2     1883.729465     941.864732     56.00   <.0001
  Treatment*Block        4       32.684361       8.171090      0.49   0.7460

Figure 32.14: ANOVA Results for Unbalanced Two-Way Design

No matter how you look at it, this data exhibits a strong effect due to the blocks (F-test p < 0.0001) and no significant interaction between treatments and blocks (F-test p > 0.7). But the lack of balance affects how the treatment effect is interpreted: in a main-effects-only model, there are no significant differences between the treatment means themselves (Type I F-test p > 0.7), but there are highly significant differences between the treatment means corrected for the block effects (Type III F-test p < 0.01).

LS-means are, in effect, within-group means appropriately adjusted for the other effects in the model. More precisely, they estimate the marginal means for a balanced population (as opposed to the unbalanced design). For this reason, they are also called estimated population marginal means by Searle et al. (1980). In the same way that the Type I F-test assesses differences between the arithmetic treatment means (when the treatment effect comes first in the model), the Type III F-test assesses differences between the LS-means. Accordingly, for the unbalanced two-way design, the discrepancy between the Type I and Type III tests is reflected in the arithmetic treatment means and treatment LS-means, as shown in Figure 32.15 and Figure 32.16. See the section Construction of Least-Squares Means on page 1820 for more on LS-means.

                      Unbalanced Two-way Design

                          The GLM Procedure

  Level of             --------------y-------------
  Treatment      N             Mean          Std Dev

  1             11       29.0909091       11.5104695
  2             11       29.1818182       11.5569735
  3             11       30.1818182        6.3058414

Figure 32.15: Treatment Means for Unbalanced Two-Way Design
                      Unbalanced Two-way Design

                          The GLM Procedure

                       Least Squares Means

  Treatment        y LSMEAN

  1              25.6000000
  2              28.3333333
  3              34.4444444

Figure 32.16: Treatment LS-means for Unbalanced Two-Way Design

Note that, while the arithmetic means are always uncorrelated (under the usual assumptions for analysis of variance; see page 1783), the LS-means may not be. This fact complicates the problem of multiple comparisons for LS-means; see the following section.

Multiple Comparisons

When comparing more than two means, an ANOVA F-test tells you whether the means are significantly different from each other, but it does not tell you which means differ from which other means. Multiple comparison procedures (MCPs), also called mean separation tests, give you more detailed information about the differences among the means. The goal in multiple comparisons is to compare the average effects of three or more treatments (for example, drugs, groups of subjects) to decide which treatments are better, which ones are worse, and by how much, while controlling the probability of making an incorrect decision. A variety of multiple comparison methods are available with the MEANS and LSMEANS statements in the GLM procedure.

The following classification is due to Hsu (1996). Multiple comparison procedures can be categorized in two ways: by the comparisons they make and by the strength of inference they provide. With respect to which comparisons are made, the GLM procedure offers two types:

  • comparisons between all pairs of means

  • comparisons between a control and all other means

The strength of inference says what can be inferred about the structure of the means when a test is significant; it is related to what type of error rate the MCP controls. MCPs available in the GLM procedure provide one of the following types of inference, in order from weakest to strongest.

  • Individual: differences between means, unadjusted for multiplicity

  • Inhomogeneity: means are different

  • Inequalities: which means are different

  • Intervals: simultaneous confidence intervals for mean differences

Methods that control only individual error rates are not true MCPs at all. Methods that yield the strongest level of inference, simultaneous confidence intervals, are usually preferred, since they enable you not only to say which means are different but also to put confidence bounds on how much they differ, making it easier to assess the practical significance of a difference. They are also less likely to lead nonstatisticians to the invalid conclusion that nonsignificantly different sample means imply equal population means. Interval MCPs are available for both arithmetic means and LS-means via the MEANS and LSMEANS statements, respectively.

Table 32.3 and Table 32.4 display MCPs available in PROC GLM for all pairwise comparisons and comparisons with a control, respectively, along with associated strength of inference and the syntax (when applicable) for both the MEANS and the LSMEANS statements.

Table 32.3: Multiple Comparisons Procedures for All Pairwise Comparisons

                                                       Syntax
  Method                 Strength of Inference   MEANS      LSMEANS

  Student's t            Individual              T          PDIFF ADJUST=T
  Duncan                 Individual              DUNCAN
  Student-Newman-Keuls   Inhomogeneity           SNK
  REGWQ                  Inequalities            REGWQ
  Tukey-Kramer           Intervals               TUKEY      PDIFF ADJUST=TUKEY
  Bonferroni             Intervals               BON        PDIFF ADJUST=BON
  Sidak                  Intervals               SIDAK      PDIFF ADJUST=SIDAK
  Scheffé                Intervals               SCHEFFE    PDIFF ADJUST=SCHEFFE
  SMM                    Intervals               SMM        PDIFF ADJUST=SMM
  Gabriel                Intervals               GABRIEL
  Simulation             Intervals                          PDIFF ADJUST=SIMULATE

Table 32.4: Multiple Comparisons Procedures for Comparisons with a Control

                                                 Syntax
  Method        Strength of Inference   MEANS     LSMEANS

  Student's t   Individual                        PDIFF=CONTROL ADJUST=T
  Dunnett       Intervals               DUNNETT   PDIFF=CONTROL ADJUST=DUNNETT
  Bonferroni    Intervals                         PDIFF=CONTROL ADJUST=BON
  Sidak         Intervals                         PDIFF=CONTROL ADJUST=SIDAK
  Scheffé       Intervals                         PDIFF=CONTROL ADJUST=SCHEFFE
  SMM           Intervals                         PDIFF=CONTROL ADJUST=SMM
  Simulation    Intervals                         PDIFF=CONTROL ADJUST=SIMULATE

Note: One-sided Dunnett's tests are also available from the MEANS statement with the DUNNETTL and DUNNETTU options and from the LSMEANS statement with PDIFF=CONTROLL and PDIFF=CONTROLU.

Details of these multiple comparison methods are given in the following sections.

Pairwise Comparisons

All the methods discussed in this section depend on the standardized pairwise differences t_ij = (ȳ_i − ȳ_j)/σ̂_ij, where

  • i and j are the indices of two groups

  • ȳ_i and ȳ_j are the means or LS-means for groups i and j

  • σ̂_ij is the square root of the estimated variance of ȳ_i − ȳ_j. For simple arithmetic means, σ̂²_ij = s²(1/n_i + 1/n_j), where n_i and n_j are the sizes of groups i and j, respectively, and s² is the mean square for error, with ν degrees of freedom. For weighted arithmetic means, σ̂²_ij = s²(1/w_i + 1/w_j), where w_i and w_j are the sums of the weights in groups i and j, respectively. Finally, for LS-means defined by the linear combinations l_i′b and l_j′b of the parameter estimates, σ̂²_ij = s² (l_i − l_j)′(X′X)⁻(l_i − l_j).

Furthermore, all of the methods are discussed in terms of significance tests of the form

  |t_ij| ≥ c(α)

where c(α) is some constant depending on the significance level. Such tests can be inverted to form confidence intervals of the form

  (ȳ_i − ȳ_j) − σ̂_ij c(α)  ≤  μ_i − μ_j  ≤  (ȳ_i − ȳ_j) + σ̂_ij c(α)

The simplest approach to multiple comparisons is to do a t test on every pair of means (the T option in the MEANS statement, ADJUST=T in the LSMEANS statement). For the ith and jth means, you can reject the null hypothesis that the population means are equal if

  |t_ij| ≥ t(α; ν)

where α is the significance level, ν is the number of error degrees of freedom, and t(α; ν) is the two-tailed critical value from a Student's t distribution. If the cell sizes are all equal to, say, n, the preceding formula can be rearranged to give

  |ȳ_i − ȳ_j| ≥ t(α; ν) s √(2/n)

the value of the right-hand side being Fisher's least significant difference (LSD).

There is a problem with repeated t tests, however. Suppose there are ten means and each t test is performed at the 0.05 level. There are 10(10 − 1)/2 = 45 pairs of means to compare, each with a 0.05 probability of a type 1 error (a false rejection of the null hypothesis). The chance of making at least one type 1 error is much higher than 0.05. It is difficult to calculate the exact probability, but you can derive a pessimistic approximation by assuming that the comparisons are independent, giving an upper bound to the probability of making at least one type 1 error (the experimentwise error rate) of

  1 − (1 − 0.05)⁴⁵ ≈ 0.90

The actual probability is somewhat less than 0.90, but as the number of means increases, the chance of making at least one type 1 error approaches 1.

If you decide to control the individual type 1 error rates for each comparison, you are controlling the individual or comparisonwise error rate. On the other hand, if you want to control the overall type 1 error rate for all the comparisons, you are controlling the experimentwise error rate. It is up to you to decide whether to control the comparisonwise error rate or the experimentwise error rate, but there are many situations in which the experimentwise error rate should be held to a small value. Statistical methods for comparing three or more means while controlling the probability of making at least one type 1 error are called multiple comparisons procedures .

It has been suggested that the experimentwise error rate can be held to the α level by performing the overall ANOVA F-test at the α level and making further comparisons only if the F-test is significant, as in Fisher's protected LSD. This assertion is false if there are more than three means (Einot and Gabriel 1975). Consider again the situation with ten means. Suppose that one population mean differs from the others by such a sufficiently large amount that the power (probability of correctly rejecting the null hypothesis) of the F-test is near 1 but that all the other population means are equal to each other. There will be 9(9 − 1)/2 = 36 t tests of true null hypotheses, with an upper limit of 0.84 on the probability of at least one type 1 error. Thus, you must distinguish between the experimentwise error rate under the complete null hypothesis, in which all population means are equal, and the experimentwise error rate under a partial null hypothesis, in which some means are equal but others differ. The following abbreviations are used in the discussion:

  CER     comparisonwise error rate

  EERC    experimentwise error rate under the complete null hypothesis

  MEER    maximum experimentwise error rate under any complete or partial null hypothesis

These error rates are associated with the different strengths of inference discussed on page 1806: individual tests control the CER; tests for inhomogeneity of means control the EERC; tests that yield confidence inequalities or confidence intervals control the MEER. A preliminary F -test controls the EERC but not the MEER.

You can control the MEER at the α level by setting the CER to a sufficiently small value. The Bonferroni inequality (Miller 1981) has been widely used for this purpose. If

  CER = α / c

where c is the total number of comparisons, then the MEER is less than α. Bonferroni t tests (the BON option in the MEANS statement, ADJUST=BON in the LSMEANS statement) with MEER < α declare two means to be significantly different if

  |t_ij| ≥ t(ε; ν)

where

  ε = 2α / (k(k − 1))

for comparison of k means.

Sidak (1967) has provided a tighter bound, showing that

  CER = 1 − (1 − α)^(1/c)

also ensures that MEER ≤ α for any set of c comparisons. A Sidak t test (Games 1977), provided by the SIDAK option, is thus given by

  |t_ij| ≥ t(ε; ν)

where

  ε = 1 − (1 − α)^(2/(k(k−1)))

for comparison of k means.

You can use the Bonferroni additive inequality and the Sidak multiplicative inequality to control the MEER for any set of contrasts or other hypothesis tests, not just pairwise comparisons. The Bonferroni inequality can provide simultaneous inferences in any statistical application requiring tests of more than one hypothesis. Other methods discussed in this section for pairwise comparisons can also be adapted for general contrasts (Miller 1981).
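As a sketch, pairwise Bonferroni and Sidak comparisons are requested with the options listed in Table 32.3 (a one-way layout with hypothetical names):

  proc glm;
     class a;
     model y = a;
     means a / bon sidak;             /* Bonferroni and Sidak t tests */
     lsmeans a / pdiff adjust=bon;    /* adjusted p-values for LS-mean differences */
  run;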

Scheffé (1953; 1959) proposes another method to control the MEER for any set of contrasts or other linear hypotheses in the analysis of linear models, including pairwise comparisons, obtained with the SCHEFFE option. Two means are declared significantly different if

  |t_ij| ≥ √((k − 1) F(α; k − 1, ν))

where F(α; k − 1, ν) is the α-level critical value of an F distribution with k − 1 numerator degrees of freedom and ν denominator degrees of freedom.

Scheffé's test is compatible with the overall ANOVA F-test in that Scheffé's method never declares a contrast significant if the overall F-test is nonsignificant. Most other multiple comparison methods can find significant contrasts when the overall F-test is nonsignificant and, therefore, suffer a loss of power when used with a preliminary F-test.

Scheffé's method may be more powerful than the Bonferroni or Sidak method if the number of comparisons is large relative to the number of means. For pairwise comparisons, Sidak t tests are generally more powerful.

Tukey (1952; 1953) proposes a test designed specifically for pairwise comparisons based on the studentized range, sometimes called the honestly significant difference test, that controls the MEER when the sample sizes are equal. Tukey (1953) and Kramer (1956) independently propose a modification for unequal cell sizes. The Tukey or Tukey-Kramer method is provided by the TUKEY option in the MEANS statement and the ADJUST=TUKEY option in the LSMEANS statement. This method has fared extremely well in Monte Carlo studies (Dunnett 1980). In addition, Hayter (1984) gives a proof that the Tukey-Kramer procedure controls the MEER for means comparisons, and Hayter (1989) describes the extent to which the Tukey-Kramer procedure has been proven to control the MEER for LS-means comparisons. The Tukey-Kramer method is more powerful than the Bonferroni, Sidak, or Scheffé method for pairwise comparisons. Two means are considered significantly different by the Tukey-Kramer criterion if

  |t_ij| ≥ q(α; k, ν) / √2

where q(α; k, ν) is the α-level critical value of a studentized range distribution of k independent normal random variables with ν degrees of freedom.

Hochberg (1974) devised a method (the GT2 or SMM option) similar to Tukey's, but it uses the studentized maximum modulus instead of the studentized range and employs Sidak's (1967) uncorrelated t inequality. It is proven to hold the MEER at a level not exceeding α with unequal sample sizes. It is generally less powerful than the Tukey-Kramer method and always less powerful than Tukey's test for equal cell sizes. Two means are declared significantly different if

  |t_ij| ≥ m(α; c, ν)

where m(α; c, ν) is the α-level critical value of the studentized maximum modulus distribution of c independent normal random variables with ν degrees of freedom and c = k(k − 1)/2.

Gabriel (1978) proposes another method (the GABRIEL option) based on the studentized maximum modulus. This method is applicable only to arithmetic means. It rejects if

  |ȳ_i − ȳ_j| ≥ m(α; k, ν) (s/√(2n_i) + s/√(2n_j))

For equal cell sizes, Gabriel's test is equivalent to Hochberg's GT2 method. For unequal cell sizes, Gabriel's method is more powerful than GT2 but may become liberal with highly disparate cell sizes (refer also to Dunnett (1980)). Gabriel's test is the only method for unequal sample sizes that lends itself to a graphical representation as intervals around the means. Assuming ȳ_i > ȳ_j, you can rewrite the preceding inequality as

  ȳ_i − m(α; k, ν) s/√(2n_i)  ≥  ȳ_j + m(α; k, ν) s/√(2n_j)

The expression on the left does not depend on j, nor does the expression on the right depend on i. Hence, you can form what Gabriel calls an (l, u)-interval around each sample mean and declare two means to be significantly different if their (l, u)-intervals do not overlap. See Hsu (1996, section 5.2.1.1) for a discussion of other methods of graphically representing all pairwise comparisons.

Comparing All Treatments to a Control

One special case of means comparison is that in which the only comparisons that need to be tested are between a set of new treatments and a single control. In this case, you can achieve better power by using a method that is restricted to test only comparisons to the single control mean. Dunnett (1955) proposes a test for this situation that declares a mean significantly different from the control if

  |t_i0| ≥ d(α; k, ν, ρ_1, …, ρ_{k−1})

where ȳ_0 is the control mean, t_i0 is the standardized difference between the ith mean and the control, and d(α; k, ν, ρ_1, …, ρ_{k−1}) is the critical value of the many-to-one t statistic (Miller 1981; Krishnaiah and Armitage 1966) for k means to be compared to a control, with ν error degrees of freedom and correlations ρ_1, …, ρ_{k−1}, ρ_i = n_i/(n_0 + n_i). The correlation terms arise because each of the treatment means is being compared to the same control. Dunnett's test holds the MEER to a level not exceeding the stated α.

Approximate and Simulation-based Methods

Both Tukey s and Dunnett s tests are based on the same general quantile calculation:

q^t(α, ν, R) = { q  such that  P( max(|t_1|, …, |t_n|) > q ) = α }

where the t_i have a joint multivariate t distribution with ν degrees of freedom and correlation matrix R. In general, evaluating q^t(α, ν, R) requires repeated numerical calculation of an (n + 1)-fold integral. This is usually intractable, but the problem reduces to a feasible 2-fold integral when R has a certain symmetry in the case of Tukey's test, and a factor analytic structure (cf. Hsu 1992) in the case of Dunnett's test. The R matrix has the required symmetry for exact computation of Tukey's test if the t_i's are studentized differences between

  • k(k − 1)/2 pairs of k uncorrelated means with equal variances, that is, equal sample sizes

  • k(k − 1)/2 pairs of k LS-means from a variance-balanced design (for example, a balanced incomplete block design)

Refer to Hsu (1992; 1996) for more information. The R matrix has the factor analytic structure for exact computation of Dunnett's test if the t_i's are studentized differences between

  • k − 1 means and a control mean, all uncorrelated. (Dunnett's one-sided methods depend on a similar probability calculation, without the absolute values.) Note that it is not required that the variances of the means (that is, the sample sizes) be equal.

  • k − 1 LS-means and a control LS-mean from either a variance-balanced design or a design in which the other factors are orthogonal to the treatment factor (for example, a randomized block design with proportional cell frequencies).

However, other important situations that do not result in a correlation matrix R that has the structure for exact computation include

  • all pairwise differences with unequal sample sizes

  • differences between LS-means in many unbalanced designs

In these situations, exact calculation of q^t(α, ν, R) is intractable in general. Most of the preceding methods can be viewed as using various approximations for q^t(α, ν, R). When the sample sizes are unequal, the Tukey-Kramer test is equivalent to another approximation. For comparisons with a control when the correlation R does not have a factor analytic structure, Hsu (1992) suggests approximating R with a matrix R* that does have such a structure and correspondingly approximating q^t(α, ν, R) with q^t(α, ν, R*). When you request Dunnett's test for LS-means (the PDIFF=CONTROL and ADJUST=DUNNETT options), the GLM procedure automatically uses Hsu's approximation when appropriate.

Finally, Edwards and Berry (1987) suggest calculating q^t(α, ν, R) by simulation. Multivariate t vectors are sampled from a distribution with the appropriate ν and R parameters, and Edwards and Berry (1987) suggest estimating q^t(α, ν, R) by q̂, the α percentile of the observed values of max(|t_1|, …, |t_n|). Sufficient samples are generated for the true P(max(|t_1|, …, |t_n|) > q̂) to be within a certain accuracy radius γ of α with accuracy confidence 100(1 − ε)%. You can approximate q^t(α, ν, R) by simulation for comparisons between LS-means by specifying ADJUST=SIM (with either PDIFF=ALL or PDIFF=CONTROL). By default, γ = 0.005 and ε = 0.01, so that the tail area of q̂ is within 0.005 of α with 99% confidence. You can use the ACC= and EPS= options with ADJUST=SIM to reset γ and ε, or you can use the NSAMP= option to set the sample size directly. You can also control the random number sequence with the SEED= option.

Hsu and Nelson (1998) suggest a more accurate simulation method for estimating q^t(α, ν, R), using a control variate adjustment technique. The same independent, standardized normal variates that are used to generate multivariate t vectors from a distribution with the appropriate ν and R parameters are also used to generate multivariate t vectors from a distribution for which the exact value of q^t(α, ν, R) is known. max(|t_1|, …, |t_n|) for the second sample is used as a control variate for adjusting the quantile estimate based on the first sample; refer to Hsu and Nelson (1998) for more details. The control variate adjustment has the drawback that it takes somewhat longer than the crude technique of Edwards and Berry (1987), but it typically yields an estimate that is many times more accurate. In most cases, if you are using ADJUST=SIM, then you should specify ADJUST=SIM(CVADJUST). You can also specify ADJUST=SIM(CVADJUST REPORT) to display a summary of the simulation that includes, among other things, the actual accuracy radius γ, which should be substantially smaller than the target accuracy radius (0.005 by default).
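For instance, continuing the hypothetical one-way example, the following LSMEANS statement requests the simulation-based adjustment with the control variate technique and a report of the simulation's actual accuracy:

  lsmeans trt / pdiff=all adjust=sim(cvadjust report);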

Multiple-Stage Tests

You can use all of the methods discussed so far to obtain simultaneous confidence intervals (Miller 1981). By sacrificing the facility for simultaneous estimation, you can obtain simultaneous tests with greater power using multiple-stage tests (MSTs). MSTs come in both step-up and step-down varieties (Welsch 1977). The step-down methods, which have been more widely used, are available in SAS/STAT software.

Step-down MSTs first test the homogeneity of all of the means at a level γ_k. If the test results in a rejection, then each subset of k − 1 means is tested at level γ_{k−1}; otherwise, the procedure stops. In general, if the hypothesis of homogeneity of a set of p means is rejected at the γ_p level, then each subset of p − 1 means is tested at the γ_{p−1} level; otherwise, the set of p means is considered not to differ significantly and none of its subsets are tested. The many varieties of MSTs that have been proposed differ in the levels γ_p and the statistics on which the subset tests are based. Clearly, the EERC of a step-down MST is not greater than γ_k, and the CER is not greater than γ_2, but the MEER is a complicated function of γ_p, p = 2, …, k.

With unequal cell sizes, PROC GLM uses the harmonic mean of the cell sizes as the common sample size. However, since the resulting operating characteristics can be undesirable, MSTs are recommended only for the balanced case. When the sample sizes are equal and the range statistic is used, you can arrange the means in ascending or descending order and test only contiguous subsets; if you specify the F statistic, this shortcut cannot be taken. For this reason, only range-based MSTs are implemented. It is common practice to report the results of an MST by writing the means in such an order and drawing lines parallel to the list of means spanning the homogeneous subsets. This form of presentation is also convenient for pairwise comparisons with equal cell sizes.

The best-known MSTs are the Duncan (the DUNCAN option) and Student-Newman-Keuls (the SNK option) methods (Miller 1981). Both use the studentized range statistic and, hence, are called multiple range tests. Duncan's method is often called the "new" multiple range test despite the fact that it is one of the oldest MSTs in current use.

The Duncan and SNK methods differ in the γ_p values used. For Duncan's method, they are

γ_p = 1 − (1 − α)^(p−1)

whereas the SNK method uses

γ_p = α

Duncan's method controls the CER at the α level. Its operating characteristics appear similar to those of Fisher's unprotected LSD or repeated t tests at level α (Petrinovich and Hardyck 1969). Since repeated t tests are easier to compute, easier to explain, and applicable to unequal sample sizes, Duncan's method is not recommended. Several published studies (for example, Carmer and Swanson (1973)) have claimed that Duncan's method is superior to Tukey's because of greater power without considering that the greater power of Duncan's method is due to its higher Type 1 error rate (Einot and Gabriel 1975).

The SNK method holds the EERC to the α level but does not control the MEER (Einot and Gabriel 1975). Consider ten population means that occur in five pairs such that means within a pair are equal, but there are large differences between pairs. If you make the usual sampling assumptions and also assume that the sample sizes are very large, all subset homogeneity hypotheses for three or more means are rejected. The SNK method then comes down to five independent tests, one for each pair, each at the α level. Letting α be 0.05, the probability of at least one false rejection is

1 − (1 − 0.05)^5 ≈ 0.23

As the number of means increases, the MEER approaches 1. Therefore, the SNK method cannot be recommended.

A variety of MSTs that control the MEER have been proposed, but these methods are not as well known as those of Duncan and SNK. An approach developed by Ryan (1959; 1960), Einot and Gabriel (1975), and Welsch (1977) sets

γ_p = 1 − (1 − α)^(p/k)   for p < k − 1
γ_p = α                   for p ≥ k − 1

You can use range statistics, leading to what is called the REGWQ method after the authors' initials. If you assume that the sample means have been arranged in descending order from ȳ_1 through ȳ_k, the homogeneity of means ȳ_i, …, ȳ_j, i < j, is rejected by REGWQ if

ȳ_i − ȳ_j ≥ q(γ_p; p, ν) σ̂ √( (1/p) Σ_u (1/n_u) )

where p = j − i + 1 and the summations are over u = i, …, j (Einot and Gabriel 1975). To ensure that the MEER is controlled, the current implementation checks whether q(γ_p; p, ν) is monotonically increasing in p. If not, then a set of critical values that are increasing in p is substituted instead.

REGWQ appears to be the most powerful step-down MST in the current literature (for example, Ramsey 1978). Use of a preliminary F-test decreases the power of all the other multiple comparison methods discussed previously except for Scheffé's test.
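For example, in the hypothetical one-way setting used earlier, you request the REGWQ method with the REGWQ option in the MEANS statement:

  means trt / regwq;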

Bayesian Approach

Waller and Duncan (1969) and Duncan (1975) take an approach to multiple comparisons that differs from all the methods previously discussed in minimizing the Bayes risk under additive loss rather than controlling Type 1 error rates. For each pair of population means μ_i and μ_j, null (H_0^ij) and alternative (H_a^ij) hypotheses are defined:

H_0^ij: μ_i − μ_j ≤ 0        H_a^ij: μ_i − μ_j > 0

For any i, j pair, let d_0 indicate a decision in favor of H_0^ij and d_a indicate a decision in favor of H_a^ij, and let δ = μ_i − μ_j. The loss function for the decision on the i, j pair is

L(d_0 | δ) = 0      if δ ≤ 0
           = δ      if δ > 0

L(d_a | δ) = −kδ    if δ ≤ 0
           = 0      if δ > 0

where k represents a constant that you specify rather than the number of means. The loss for the joint decision involving all pairs of means is the sum of the losses for each individual decision. The population means are assumed to have a normal prior distribution with unknown variance, the logarithm of the variance of the means having a uniform prior distribution. For the i, j pair, the null hypothesis is rejected if

ȳ_i − ȳ_j ≥ t_B σ̂ √(2/n)

where t_B is the Bayesian t value (Waller and Kemp 1976) depending on k, the F statistic for the one-way ANOVA, and the degrees of freedom for F. The value of t_B is a decreasing function of F, so the Waller-Duncan test (specified by the WALLER option) becomes more liberal as F increases.
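A minimal sketch of the corresponding syntax, assuming the same hypothetical one-way model; the KRATIO= option sets the Type 1/Type 2 error seriousness constant k (100 by default):

  means trt / waller kratio=100;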

Recommendations

In summary, if you are interested in several individual comparisons and are not concerned about the effects of multiple inferences, you can use repeated t tests or Fisher's unprotected LSD. If you are interested in all pairwise comparisons or all comparisons with a control, you should use Tukey's or Dunnett's test, respectively, in order to make the strongest possible inferences. If you have weaker inferential requirements and, in particular, if you don't want confidence intervals for the mean differences, you should use the REGWQ method. Finally, if you agree with the Bayesian approach and Waller and Duncan's assumptions, you should use the Waller-Duncan test.

Interpretation of Multiple Comparisons

When you interpret multiple comparisons, remember that failure to reject the hypothesis that two or more means are equal should not lead you to conclude that the population means are, in fact, equal. Failure to reject the null hypothesis implies only that the difference between population means, if any, is not large enough to be detected with the given sample size. A related point is that nonsignificance is nontransitive: that is, given three sample means, the largest and smallest may be significantly different from each other, while neither is significantly different from the middle one. Nontransitive results of this type occur frequently in multiple comparisons.

Multiple comparisons can also lead to counter-intuitive results when the cell sizes are unequal. Consider four cells labeled A, B, C, and D, with sample means in the order A>B>C>D. If A and D each have two observations, and B and C each have 10,000 observations, then the difference between B and C may be significant, while the difference between A and D is not.

Simple Effects

Suppose you use the following statements to fit a full factorial model to a two-way design:

  data twoway;
     input A B Y @@;
     datalines;
  1 1 10.6   1 1 11.0   1 1 10.6   1 1 11.3
  1 2 -0.2   1 2  1.3   1 2 -0.2   1 2  0.2
  1 3  0.1   1 3  0.4   1 3 -0.4   1 3  1.0
  2 1 19.7   2 1 19.3   2 1 18.5   2 1 20.4
  2 2 -0.2   2 2  0.5   2 2  0.8   2 2 -0.4
  2 3 -0.9   2 3 -0.1   2 3 -0.2   2 3 -1.7
  3 1 29.7   3 1 29.6   3 1 29.0   3 1 30.2
  3 2  1.5   3 2  0.2   3 2 -1.5   3 2  1.3
  3 3  0.2   3 3  0.4   3 3 -0.4   3 3 -2.2
  ;
  proc glm data=twoway;
     class A B;
     model Y = A B A*B;
  run;

Partial results for the analysis of variance are shown in Figure 32.17. The Type I and Type III results are the same because this is a balanced design.

  The GLM Procedure
  Dependent Variable: Y

  Source        DF       Type I SS     Mean Square    F Value    Pr > F
  A              2      219.905000      109.952500     165.11    <.0001
  B              2     3206.101667     1603.050833    2407.25    <.0001
  A*B            4      487.103333      121.775833     182.87    <.0001

  Source        DF     Type III SS     Mean Square    F Value    Pr > F
  A              2      219.905000      109.952500     165.11    <.0001
  B              2     3206.101667     1603.050833    2407.25    <.0001
  A*B            4      487.103333      121.775833     182.87    <.0001

Figure 32.17: Two-way Design with Significant Interaction

The interaction A*B is significant, indicating that the effect of A depends on the level of B. In some cases, you may be interested in looking at the differences between predicted values across A for different levels of B. Winer (1971) calls this the simple effects of A. You can compute simple effects with the LSMEANS statement by specifying the SLICE= option. In this case, since the GLM procedure is interactive, you can compute the simple effects of A by submitting the following statements after the preceding statements.

  lsmeans A*B / slice=B;
  run;

The results are shown in Figure 32.18. Note that A has a significant effect for B=1 but not for B=2 and B=3.

  The GLM Procedure
  Least Squares Means

  A    B       Y LSMEAN
  1    1     10.8750000
  1    2      0.2750000
  1    3      0.2750000
  2    1     19.4750000
  2    2      0.1750000
  2    3     -0.7250000
  3    1     29.6250000
  3    2      0.3750000
  3    3     -0.5000000

  A*B Effect Sliced by B for Y

                      Sum of
  B      DF          Squares     Mean Square    F Value    Pr > F
  1       2       704.726667      352.363333     529.13    <.0001
  2       2         0.080000        0.040000       0.06    0.9418
  3       2         2.201667        1.100833       1.65    0.2103

Figure 32.18: Interaction LS-means and Simple Effects

Homogeneity of Variance in One-Way Models

One of the usual assumptions for the GLM procedure is that the underlying errors are all uncorrelated with homogeneous variances (see page 1783). You can test this assumption in PROC GLM by using the HOVTEST option in the MEANS statement, requesting a homogeneity of variance test. This section discusses the computational details behind these tests. Note that the GLM procedure allows homogeneity of variance testing for simple one-way models only. Homogeneity of variance testing for more complex models is a subject of current research.

Bartlett (1937) proposes a test for equal variances that is a modification of the normal-theory likelihood ratio test (the HOVTEST=BARTLETT option). While Bartlett's test has accurate Type I error rates and optimal power when the underlying distribution of the data is normal, it can be very inaccurate if that distribution is even slightly nonnormal (Box 1953). Therefore, Bartlett's test is not recommended for routine use.

An approach that leads to tests that are much more robust to the underlying distribution is to transform the original values of the dependent variable to derive a dispersion variable and then to perform analysis of variance on this variable. The significance level for the test of homogeneity of variance is the p-value for the ANOVA F-test on the dispersion variable. All of the homogeneity of variance tests available in PROC GLM except Bartlett's use this approach.

Levene's test (Levene 1960) is widely considered to be the standard homogeneity of variance test (the HOVTEST=LEVENE option). Levene's test is of the dispersion-variable-ANOVA form discussed previously, where the dispersion variable is either

z_ij² = (y_ij − ȳ_i)²   (TYPE=SQUARE, the default)   or   z_ij = |y_ij − ȳ_i|   (TYPE=ABS)

O'Brien (1979) proposes a test (HOVTEST=OBRIEN) that is basically a modification of Levene's, using the dispersion variable

z_ij^W = [ (W + n_i − 2) n_i (y_ij − ȳ_i)² − W (n_i − 1) σ̂_i² ] / [ (n_i − 1)(n_i − 2) ]

where n_i is the size of the ith group and σ̂_i² is its sample variance. You can use the W= option in parentheses to tune O'Brien's dispersion variable to match the suspected kurtosis of the underlying distribution. The choice of the value of the W= option is rarely critical. By default, W=0.5, as suggested by O'Brien (1979; 1981).

Finally, Brown and Forsythe (1974) suggest using the absolute deviations from the group medians:

z_ij = | y_ij − m_i |

where m_i is the median of the ith group. You can use the HOVTEST=BF option to specify this test.

Simulation results (Conover et al. 1981; Olejnik and Algina 1987) show that, while all of these ANOVA-based tests are reasonably robust to the underlying distribution, the Brown-Forsythe test seems best at providing power to detect variance differences while protecting the Type I error probability. However, since the within-group medians are required for the Brown-Forsythe test, it can be resource intensive if there are very many groups or if some groups are very large.

If one of these tests rejects the assumption of homogeneity of variance, you should use Welch's ANOVA instead of the usual ANOVA to test for differences between group means. However, this conclusion holds only if you use one of the robust homogeneity of variance tests (that is, not for HOVTEST=BARTLETT); even then, any homogeneity of variance test has too little power to be relied upon always to detect when Welch's ANOVA is appropriate. Unless the group variances are extremely different or the number of groups is large, the usual ANOVA test is relatively robust when the groups are all about the same size. As Box (1953) notes, "To make the preliminary test on variances is rather like putting to sea in a rowing boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!"

Example 32.10 on page 1892 illustrates the use of the HOVTEST and WELCH options in the MEANS statement in testing for equal group variances and adjusting for unequal group variances in a one-way ANOVA.
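A minimal sketch of the syntax, assuming a data set mydata with a classification variable group and response y (names hypothetical):

  proc glm data=mydata;
     class group;
     model y = group;
     means group / hovtest=bf welch;   /* Brown-Forsythe test plus Welch's ANOVA */
  run;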

Weighted Means

In previous releases, if you specified a WEIGHT statement and one or more of the multiple comparisons options, PROC GLM estimated the variance of the difference between weighted group means for groups i and j as

MSE ( 1/n_i + 1/n_j )

where MSE is the (weighted) mean square for error and n_i is the size of group i. This variance is involved in all of the multiple comparison methods. Beginning with Release 6.12, the variance of the difference between weighted group means for groups i and j is computed as

MSE ( 1/w_i + 1/w_j )

where w_i is the sum of the weights for the observations in group i.
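For illustration, a hedged sketch with a hypothetical weight variable w; any multiple comparison methods requested in the MEANS statement then use the weighted variance above:

  proc glm data=mydata;
     class trt;
     weight w;
     model y = trt;
     means trt / tukey;
  run;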

Construction of Least-Squares Means

To construct a least-squares mean (LS-mean) for a given level of a given effect, construct a row vector L according to the following rules and use it in an ESTIMATE statement to compute the value of the LS-mean:

  1. Set all L_i corresponding to covariates (continuous variables) to their mean value.

  2. Consider effects contained by the given effect. Set the L_i corresponding to levels associated with the given level equal to 1. Set all other L_i in these effects equal to 0. (See Chapter 11, The Four Types of Estimable Functions, for a definition of containing.)

  3. Consider the given effect. Set the L_i corresponding to the given level equal to 1. Set the L_i corresponding to other levels equal to 0.

  4. Consider the effects that contain the given effect. If these effects are not nested within the given effect, then set the L_i corresponding to the given level to 1/k, where k is the number of such columns. If these effects are nested within the given effect, then set the L_i corresponding to the given level to 1/(k1 k2), where k1 is the number of nested levels within this combination of nested effects, and k2 is the number of such combinations. For L_i corresponding to other levels, use 0.

  5. Consider the other effects not yet considered. If there are no nested factors, then set all L_i corresponding to this effect to 1/j, where j is the number of levels in the effect. If there are nested factors, then set all L_i corresponding to this effect to 1/(j1 j2), where j1 is the number of nested levels within a given combination of nested effects and j2 is the number of such combinations.

The consequence of these rules is that the sum of the coefficients within any classification effect is 1. This set of coefficients forms a linear combination of the parameters that is checked for estimability before it is evaluated.

For example, consider the following model:

  proc glm;
     class A B C;
     model Y = A B A*B C Z;
     lsmeans A B A*B C;
  run;

Assume A has 3 levels, B has 2 levels, and C has 2 levels, and assume that every combination of levels of A and B exists in the data. Assume also that Z is a continuous variable with an average of 12.5. Then the least-squares means are computed by the following linear combinations of the parameter estimates:

   

                          A                 B                        A*B                       C

              μ      1     2     3      1     2     11    12    21    22    31    32      1     2       Z

  LSM()       1     1/3   1/3   1/3    1/2   1/2    1/6   1/6   1/6   1/6   1/6   1/6    1/2   1/2    12.5
  LSM(A1)     1      1     0     0     1/2   1/2    1/2   1/2    0     0     0     0     1/2   1/2    12.5
  LSM(A2)     1      0     1     0     1/2   1/2     0     0    1/2   1/2    0     0     1/2   1/2    12.5
  LSM(A3)     1      0     0     1     1/2   1/2     0     0     0     0    1/2   1/2    1/2   1/2    12.5
  LSM(B1)     1     1/3   1/3   1/3     1     0     1/3    0    1/3    0    1/3    0     1/2   1/2    12.5
  LSM(B2)     1     1/3   1/3   1/3     0     1      0    1/3    0    1/3    0    1/3    1/2   1/2    12.5
  LSM(AB11)   1      1     0     0      1     0      1     0     0     0     0     0     1/2   1/2    12.5
  LSM(AB12)   1      1     0     0      0     1      0     1     0     0     0     0     1/2   1/2    12.5
  LSM(AB21)   1      0     1     0      1     0      0     0     1     0     0     0     1/2   1/2    12.5
  LSM(AB22)   1      0     1     0      0     1      0     0     0     1     0     0     1/2   1/2    12.5
  LSM(AB31)   1      0     0     1      1     0      0     0     0     0     1     0     1/2   1/2    12.5
  LSM(AB32)   1      0     0     1      0     1      0     0     0     0     0     1     1/2   1/2    12.5
  LSM(C1)     1     1/3   1/3   1/3    1/2   1/2    1/6   1/6   1/6   1/6   1/6   1/6     1     0     12.5
  LSM(C2)     1     1/3   1/3   1/3    1/2   1/2    1/6   1/6   1/6   1/6   1/6   1/6     0     1     12.5
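As a check on these rules, you can reproduce any row of this table with an ESTIMATE statement. For example, submitting the following statement interactively after the preceding PROC GLM step computes LSM(A1) from the coefficients in its row (decimal equivalents are used for the fractions):

  estimate 'LSM(A1)' intercept 1 A 1 0 0 B 0.5 0.5
           A*B 0.5 0.5 0 0 0 0 C 0.5 0.5 Z 12.5;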

Setting Covariate Values

By default, all covariate effects are set equal to their mean values for computation of standard LS-means. The AT option in the LSMEANS statement enables you to set the covariates to whatever values you consider interesting.

If there is an effect containing two or more covariates, the AT option sets the effect equal to the product of the individual means rather than the mean of the product (as with standard LS-means calculations). The AT MEANS option leaves covariates equal to their mean values (as with standard LS-means) and incorporates this adjustment to crossproducts of covariates.

As an example, the following is a model with a classification variable A and two continuous variables, x1 and x2:

  class A;
  model y = A x1 x2 x1*x2;

The coefficients for the continuous effects with various AT specifications are shown in the following table.

  Syntax                                     x1      x2      x1*x2

  lsmeans A;                                 x̄1      x̄2      mean of x1*x2
  lsmeans A / at means;                      x̄1      x̄2      x̄1 · x̄2
  lsmeans A / at x1=1.2;                     1.2     x̄2      1.2 · x̄2
  lsmeans A / at (x1 x2)=(1.2 0.3);          1.2     0.3     1.2 · 0.3

For the first two LSMEANS statements, the A LS-mean coefficient for x1 is x̄1 (the mean of x1) and for x2 is x̄2 (the mean of x2). However, for the first LSMEANS statement, the coefficient for x1*x2 is the mean of x1*x2, but for the second LSMEANS statement the coefficient is x̄1 · x̄2. The third LSMEANS statement sets the coefficient for x1 equal to 1.2 and leaves that for x2 at x̄2, and the final LSMEANS statement sets these values to 1.2 and 0.3, respectively.

If you specify a WEIGHT variable, then weighted means are used for the covariate values. Also, observations with missing dependent variables are included in computing the covariate means, unless these observations form a missing cell. You can use the E option in conjunction with the AT option to check that the modified LS-means coefficients are the ones you desire.

The AT option is disabled if you specify the BYLEVEL option, in which case the coefficients for the covariates are set equal to their means within each level of the LS-mean effect in question.

Changing the Weighting Scheme

The standard LS-means have equal coefficients across classification effects; however, the OM option in the LSMEANS statement changes these coefficients to be proportional to those found in the input data set. This adjustment is reasonable when you want your inferences to apply to a population that is not necessarily balanced but has the margins observed in the original data set.

In computing the observed margins, PROC GLM uses all observations for which there are no missing independent variables, including those for which there are missing dependent variables. Also, if there is a WEIGHT variable, PROC GLM uses weighted margins to construct the LS-means coefficients. If the analysis data set is balanced or if you specify a simple one-way model, the LS-means will be unchanged by the OM option.

The BYLEVEL option modifies the observed-margins LS-means. Instead of computing the margins across the entire data set, PROC GLM computes separate margins for each level of the LS-mean effect in question. The resulting LS-means are actually equal to raw means in this case. The BYLEVEL option disables the AT option if it is specified.

Note that the MIXED procedure implements a more versatile form of the OM option, enabling you to specify an alternative data set over which to compute observed margins. If you use the BYLEVEL option as well, then this data set is effectively the population over which the population marginal means are computed. See Chapter 46, The MIXED Procedure, for more information.

You may want to use the E option in conjunction with either the OM or BYLEVEL option to check that the modified LS-means coefficients are the ones you desire. It is possible that the modified LS-means are not estimable when the standard ones are, or vice versa.
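For example, the following statement requests observed-margin LS-means for A and displays their coefficients so that you can verify them:

  lsmeans A / om e;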

Multivariate Analysis of Variance

If you fit several dependent variables to the same effects, you may want to make tests jointly involving parameters of several dependent variables. Suppose you have p dependent variables, k parameters for each dependent variable, and n observations. The models can be collected into one equation:

Y = Xβ + ε

where Y is n × p, X is n × k, β is k × p, and ε is n × p. Each of the p models can be estimated and tested separately. However, you may also want to consider the joint distribution and test the p models simultaneously.

For multivariate tests, you need to make some assumptions about the errors. With p dependent variables, there are n × p errors that are independent across observations but not across dependent variables. Assume

vec(ε) ∼ N(0, Σ ⊗ I_n)

where vec(ε) strings ε out by rows, ⊗ denotes Kronecker product multiplication, and Σ is p × p. Σ can be estimated by

S = e′e / (n − r)

where b = (X′X)⁻X′Y, r is the rank of the X matrix, and e is the matrix of residuals.

If S is scaled to unit diagonals, the values in S are called partial correlations of the Ys adjusting for the Xs . This matrix can be displayed by PROC GLM if PRINTE is specified as a MANOVA option.

The multivariate general linear hypothesis is written

LβM = 0

You can form hypotheses for linear combinations across columns, as well as across rows, of β.

The MANOVA statement of the GLM procedure tests special cases where L corresponds to Type I, Type II, Type III, or Type IV tests, and M is the p × p identity matrix. These tests are joint tests that the given type of hypothesis holds for all dependent variables in the model, and they are often sufficient to test all hypotheses of interest.

Finally, when these special cases are not appropriate, you can specify your own L and M matrices by using the CONTRAST statement before the MANOVA statement and the M= specification in the MANOVA statement, respectively. Another alternative is to use a REPEATED statement, which automatically generates a variety of M matrices useful in repeated measures analysis of variance. See the REPEATED Statement section on page 1777 and the Repeated Measures Analysis of Variance section on page 1825 for more information.

One useful way to think of a MANOVA analysis with an M matrix other than the identity is as an analysis of a set of transformed variables defined by the columns of the M matrix. You should note, however, that PROC GLM always displays the M matrix in such a way that the transformed variables are defined by the rows, not the columns, of the displayed M matrix.
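As a hedged sketch of this usage (data set and variable names hypothetical, with trt assumed to have three levels), the following statements supply a custom L matrix through a CONTRAST statement and test it jointly for the transformed variables defined by the rows of M, with exact p-values where available:

  proc glm data=mydata;
     class trt;
     model y1 y2 y3 = trt;
     contrast 'trt 1 vs 2' trt 1 -1 0;        /* supplies a custom L matrix */
     manova h=trt m=(1 -1 0,
                     0  1 -1) / printe summary mstat=exact;
  run;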

All multivariate tests carried out by the GLM procedure first construct the matrices H and E corresponding to the numerator and denominator, respectively, of a univariate F-test.

H = M′(Lb)′ (L(X′X)⁻L′)⁻¹ (Lb) M
E = M′( Y′Y − b′(X′X)b ) M

The diagonal elements of H and E correspond to the hypothesis and error SS for univariate tests. When the M matrix is the identity matrix (the default), these tests are for the original dependent variables on the left-hand side of the MODEL statement. When an M matrix other than the identity is specified, the tests are for transformed variables defined by the columns of the M matrix. These tests can be studied by requesting the SUMMARY option, which produces univariate analyses for each original or transformed variable.

Four multivariate test statistics, all functions of the eigenvalues of E⁻¹H (or (E + H)⁻¹H), are constructed:

  • Wilks' lambda = det(E)/det(H + E)

  • Pillai's trace = trace(H(H + E)⁻¹)

  • Hotelling-Lawley trace = trace(E⁻¹H)

  • Roy's maximum root = λ, the largest eigenvalue of E⁻¹H

By default, all four are reported with p-values based on F approximations, as discussed in the Multivariate Tests section in Chapter 2, Introduction to Regression Procedures. Alternatively, if you specify MSTAT=EXACT in the associated MANOVA or REPEATED statement, p-values for three of the four tests are computed exactly (Wilks' lambda, the Hotelling-Lawley trace, and Roy's greatest root), and the p-values for the fourth (Pillai's trace) are based on an F approximation that is more accurate than the default. See the Multivariate Tests section in Chapter 2, Introduction to Regression Procedures, for more details on the exact calculations.

Repeated Measures Analysis of Variance

When several measurements are taken on the same experimental unit (person, plant, machine, and so on), the measurements tend to be correlated with each other. When the measurements represent qualitatively different things, such as weight, length, and width, this correlation is best taken into account by use of multivariate methods, such as multivariate analysis of variance. When the measurements can be thought of as responses to levels of an experimental factor of interest, such as time, treatment, or dose, the correlation can be taken into account by performing a repeated measures analysis of variance.

PROC GLM provides both univariate and multivariate tests for repeated measures for one response. For an overall reference on univariate repeated measures, refer to Winer (1971). The multivariate approach is covered in Cole and Grizzle (1966). For a discussion of the relative merits of the two approaches, see LaTour and Miniard (1983).

Another approach to analysis of repeated measures is via general mixed models. This approach can handle balanced as well as unbalanced or missing within-subject data, and it offers more options for modeling the within-subject covariance. The main drawback of the mixed models approach is that it generally requires iteration and, thus, may be less computationally efficient. For further details on this approach, see Chapter 46, The MIXED Procedure, and Wolfinger and Chang (1995).

Organization of Data for Repeated Measures Analysis

In order to deal efficiently with the correlation of repeated measures, the GLM procedure uses the multivariate method of specifying the model, even if only a univariate analysis is desired. In some cases, data may already be entered in the univariate mode, with each repeated measure listed as a separate observation along with a variable that represents the experimental unit (subject) on which measurement is taken. Consider the following data set old :

  SUBJ   GROUP   TIME    Y
    1      1       1    15
    1      1       2    19
    1      1       3    25
    2      1       1    21
    2      1       2    18
    2      1       3    17
    1      2       1    14
    1      2       2    12
    1      2       3    16
    2      2       1    11
    2      2       2    20
    .      .       .     .
    .      .       .     .
    .      .       .     .
   10      3       1    14
   10      3       2    18
   10      3       3    16

There are three observations for each subject, corresponding to measurements taken at times 1, 2, and 3. These data could be analyzed using the following statements:

  proc glm data=old;
     class group subj time;
     model y = group subj(group) time group*time;
     test h=group e=subj(group);
  run;

However, this analysis assumes that subjects' measurements are uncorrelated across time. A repeated measures analysis does not make this assumption. It uses a data set new:

  GROUP    Y1    Y2    Y3
    1      15    19    25
    1      21    18    17
    2      14    12    16
    2      11    20    21
    .       .     .     .
    .       .     .     .
    .       .     .     .
    3      14    18    16

In the data set new , the three measurements for a subject are all in one observation. For example, the measurements for subject 1 for times 1, 2, and 3 are 15, 19, and 25. For these data, the statements for a repeated measures analysis (assuming default options) are

  proc glm data=new;
     class group;
     model y1-y3 = group / nouni;
     repeated time;
  run;

To convert the univariate form of repeated measures data to the multivariate form, you can use a program like the following:

  proc sort data=old;
     by group subj;
  run;

  data new(keep=y1-y3 group);
     array yy(3) y1-y3;
     do time=1 to 3;
        set old;
        by group subj;
        yy(time)=y;
        if last.subj then return;
     end;
  run;

Alternatively, you could use PROC TRANSPOSE to achieve the same results with a program like this one:

  proc sort data=old;
     by group subj;
  run;

  proc transpose out=new(rename=(_1=y1 _2=y2 _3=y3));
     by group subj;
     id time;
  run;

Refer to the discussions in SAS Language Reference: Concepts for more information on rearrangement of data sets.

Hypothesis Testing in Repeated Measures Analysis

In repeated measures analysis of variance, the effects of interest are

  • between-subject effects (such as GROUP in the previous example)

  • within-subject effects (such as TIME in the previous example)

  • interactions between the two types of effects (such as GROUP*TIME in the previous example)

Repeated measures analyses are distinguished from MANOVA because of interest in testing hypotheses about the within-subject effects and the within-subject-by-between-subject interactions.

For tests that involve only between-subjects effects, both the multivariate and univariate approaches give rise to the same tests. These tests are provided for all effects in the MODEL statement, as well as for any CONTRASTs specified. The ANOVA table for these tests is labeled Tests of Hypotheses for Between Subjects Effects on the PROC GLM results. These tests are constructed by first adding together the dependent variables in the model. Then an analysis of variance is performed on the sum divided by the square root of the number of dependent variables. For example, the statements

  model y1-y3 = group;
  repeated time;

give a one-way analysis of variance using (Y1 + Y2 + Y3)/√3 as the dependent variable for performing tests of hypothesis on the between-subject effect GROUP. Tests for between-subject effects are equivalent to tests of the hypothesis LβM = 0, where M is simply a vector of 1s.

For within-subject effects and for within-subject-by-between-subject interaction effects, the univariate and multivariate approaches yield different tests. These tests are provided for the within-subject effects and for the interactions between these effects and the other effects in the MODEL statement, as well as for any CONTRASTs specified. The univariate tests are displayed in a table labeled Univariate Tests of Hypotheses for Within Subject Effects. Results for multivariate tests are displayed in a table labeled Repeated Measures Analysis of Variance.

The multivariate tests provided for within-subjects effects and interactions involving these effects are Wilks' lambda, Pillai's trace, the Hotelling-Lawley trace, and Roy's maximum root. For further details on these four statistics, see the Multivariate Tests section in Chapter 2, Introduction to Regression Procedures. As an example, the statements

  model y1-y3 = group;
  repeated time;

produce multivariate tests for the within-subject effect TIME and the interaction TIME*GROUP.

The multivariate tests for within-subject effects are produced by testing the hypothesis L ² M = 0, where the L matrix is the usual matrix corresponding to Type I, Type II, Type III, or Type IV hypotheses tests, and the M matrix is one of several matrices depending on the transformation that you specify in the REPEATED statement. The only assumption required for valid tests is that the dependent variables in the model have a multivariate normal distribution with a common covariance matrix across the between-subject effects.

The univariate tests for within-subject effects and interactions involving these effects require some assumptions for the probabilities provided by the ordinary F-tests to be correct. Specifically, these tests require certain patterns of covariance matrices, known as Type H covariances (Huynh and Feldt 1970). Data with these patterns in the covariance matrices are said to satisfy the Huynh-Feldt condition. You can test this assumption (and the Huynh-Feldt condition) by applying a sphericity test (Anderson 1958) to any set of variables defined by an orthogonal contrast transformation. Such a set of variables is known as a set of orthogonal components. When you use the PRINTE option in the REPEATED statement, this sphericity test is applied both to the transformed variables defined by the REPEATED statement and to a set of orthogonal components if the specified transformation is not orthogonal. It is the test applied to the orthogonal components that is important in determining whether your data have Type H covariance structure. When there are only two levels of the within-subject effect, there is only one transformed variable, and a sphericity test is not needed. The sphericity test is labeled Test for Sphericity on the output.
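For example, using the data set new from the earlier repeated measures example, the following statements request the sphericity tests (including the test applied to a set of orthogonal components) along with the analysis:

  proc glm data=new;
     class group;
     model y1-y3 = group / nouni;
     repeated time 3 / printe;
  run;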

If your data satisfy the preceding assumptions, use the usual F -tests to test univariate hypotheses for the within-subject effects and associated interactions.

If your data do not satisfy the assumption of Type H covariance, an adjustment to numerator and denominator degrees of freedom can be used. Two such adjustments, based on a degrees of freedom adjustment factor known as ε (epsilon) (Box 1954), are provided in PROC GLM. Both adjustments estimate ε and then multiply the numerator and denominator degrees of freedom by this estimate before determining significance levels for the F-tests. Significance levels associated with the adjusted tests are labeled Adj Pr > F on the output. The first adjustment, initially proposed for use in data analysis by Greenhouse and Geisser (1959), is labeled Greenhouse-Geisser Epsilon and represents the maximum-likelihood estimate of Box's ε factor. Significance levels associated with adjusted F-tests are labeled G-G on the output. Huynh and Feldt (1976) have shown that the G-G estimate tends to be biased downward (that is, too conservative), especially for small samples, and they have proposed an alternative estimator that is constructed using unbiased estimators of the numerator and denominator of Box's ε. Huynh and Feldt's estimator is labeled Huynh-Feldt Epsilon on the PROC GLM output, and the significance levels associated with adjusted F-tests are labeled H-F. Although ε must be in the range of 0 to 1, the H-F estimator can be outside this range. When the H-F estimator is greater than 1, a value of 1 is used in all calculations for probabilities, and the H-F probabilities are not adjusted. In summary, if your data do not meet the assumptions, use adjusted F-tests. However, when you strongly suspect that your data may not have Type H covariance, all these univariate tests should be interpreted cautiously. In such cases, you should consider using the multivariate tests instead.

The univariate sums of squares for hypotheses involving within-subject effects can be easily calculated from the H and E matrices corresponding to the multivariate tests described in the Multivariate Analysis of Variance section on page 1823. If the M matrix is orthogonal, the univariate sums of squares are calculated as the trace (sum of diagonal elements) of the appropriate H matrix; if it is not orthogonal, PROC GLM calculates the trace of the H matrix that results from an orthogonal M matrix transformation. The appropriate error term for the univariate F-tests is constructed in a similar way from the error SSCP matrix and is labeled Error(factorname), where factorname indicates the M matrix that is used in the transformation.

When the design specifies more than one repeated measures factor, PROC GLM computes the M matrix for a given effect as the direct (Kronecker) product of the M matrices defined by the REPEATED statement if the factor is involved in the effect or as a vector of 1s if the factor is not involved. The test for the main effect of a repeated-measures factor is constructed using an L matrix that corresponds to a test that the mean of the observation is zero. Thus, the main effect test for repeated measures is a test that the means of the variables defined by the M matrix are all equal to zero, while interactions involving repeated-measures effects are tests that the between-subjects factors involved in the interaction have no effect on the means of the transformed variables defined by the M matrix. In addition, you can specify other L matrices to test hypotheses of interest by using the CONTRAST statement, since hypotheses defined by CONTRAST statements are also tested in the REPEATED analysis. To see which combinations of the original variables the transformed variables represent, you can specify the PRINTM option in the REPEATED statement. This option displays the transpose of M , which is labeled as M in the PROC GLM results. The tests produced are the same for any choice of transformation ( M ) matrix specified in the REPEATED statement; however, depending on the nature of the repeated measurements being studied, a particular choice of transformation matrix, coupled with the CANONICAL or SUMMARY option, can provide additional insight into the data being studied.

Transformations Used in Repeated Measures Analysis of Variance

As mentioned in the specifications of the REPEATED statement, several different M matrices can be generated automatically, based on the transformation that you specify in the REPEATED statement. Remember that both the univariate and multivariate tests that PROC GLM performs are unaffected by the choice of transformation; the choice of transformation is important only when you are trying to study the nature of a repeated measures effect, particularly with the CANONICAL and SUMMARY options. If one of these matrices does not meet your needs for a particular analysis, you may want to use the M= option in the MANOVA statement to perform the tests of interest.

The following sections describe the transformations available in the REPEATED statement, provide an example of the M matrix that is produced, and give guidelines for the use of the transformation. As in the PROC GLM output, the displayed matrix is labeled M. This is the M′ matrix.

CONTRAST Transformation

This is the default transformation used by the REPEATED statement. It is useful when one level of the repeated measures effect can be thought of as a control level against which the others are compared. For example, if five drugs are administered to each of several animals and the first drug is a control or placebo, the statements

  proc glm;
     model d1-d5 = / nouni;
     repeated drug 5 contrast(1) / summary printm;
  run;

produce the following M matrix:

  M = (  1  -1   0   0   0
         1   0  -1   0   0
         1   0   0  -1   0
         1   0   0   0  -1 )

When you examine the analysis of variance tables produced by the SUMMARY option, you can tell which of the drugs differed significantly from the placebo.

POLYNOMIAL Transformation

This transformation is useful when the levels of the repeated measure represent quantitative values of a treatment, such as dose or time. If the levels are unequally spaced, level values can be specified in parentheses after the number of levels in the REPEATED statement. For example, if five levels of a drug corresponding to 1, 2, 5, 10 and 20 milligrams are administered to different treatment groups, represented by the variable group , the statements

  proc glm;
     class group;
     model r1-r5 = group / nouni;
     repeated dose 5 (1 2 5 10 20) polynomial / summary printm;
  run;

produce an M matrix whose four rows contain the orthogonal polynomial contrast coefficients computed for the unequally spaced levels 1, 2, 5, 10, and 20.

The SUMMARY option in this example provides univariate ANOVAs for the variables defined by the rows of this M matrix. In this case, they represent the linear, quadratic, cubic, and quartic trends for dose and are labeled dose_ 1, dose_ 2, dose_ 3, and dose_ 4, respectively.

HELMERT Transformation

Since the Helmert transformation compares a level of a repeated measure to the mean of subsequent levels, it is useful when interest lies in the point at which responses cease to change. For example, if four levels of a repeated measures factor represent responses to treatments administered over time to males and females, the statements

  proc glm;
     class sex;
     model resp1-resp4 = sex / nouni;
     repeated trtmnt 4 helmert / canon printm;
  run;

produce the following M matrix:

  M = (  1  -1/3  -1/3  -1/3
         0   1    -1/2  -1/2
         0   0     1    -1   )

MEAN Transformation

This transformation can be useful in the same types of situations in which the CONTRAST transformation is useful. If you substitute the following statement for the REPEATED statement shown in the CONTRAST Transformation section,

  repeated drug 5 mean / printm;  

the following M matrix is produced:

  M = (  1     -1/4  -1/4  -1/4  -1/4
        -1/4    1    -1/4  -1/4  -1/4
        -1/4   -1/4   1    -1/4  -1/4
        -1/4   -1/4  -1/4   1    -1/4 )

As with the CONTRAST transformation, if you want to omit a level other than the last, you can specify it in parentheses after the keyword MEAN in the REPEATED statement.

PROFILE Transformation

When a repeated measure represents a series of factors administered over time, but a polynomial response is unreasonable, a profile transformation may prove useful. As an example, consider a training program in which four different methods are employed to teach students at several different schools. The repeated measure is the score on tests administered after each of the methods is completed. The statements

  proc glm;
     class school;
     model t1-t4 = school / nouni;
     repeated method 4 profile / summary nom printm;
  run;

produce the following M matrix:

  M = (  1  -1   0   0
         0   1  -1   0
         0   0   1  -1 )

To determine the point at which an improvement in test scores takes place, you can examine the analyses of variance for the transformed variables representing the differences between adjacent tests. These analyses are requested by the SUMMARY option in the REPEATED statement, and the variables are labeled METHOD.1, METHOD.2, and METHOD.3.

Random Effects Analysis

When some model effects are random (that is, assumed to be sampled from a normal population of effects), you can specify these effects in the RANDOM statement in order to compute the expected values of mean squares for various model effects and contrasts and, optionally, to perform random effects analysis of variance tests.

PROC GLM versus PROC MIXED for Random Effects Analysis

Other SAS procedures that can be used to analyze models with random effects include the MIXED and VARCOMP procedures. Note that, for these procedures, the random effects specification is an integral part of the model, affecting how both random and fixed effects are fit; for PROC GLM, the random effects are treated in a post hoc fashion after the complete fixed effect model is fit. This distinction affects other features in the GLM procedure, such as the results of the LSMEANS and ESTIMATE statements. These features assume that all effects are fixed, so that all tests and estimability checks for these statements are based on a fixed effects model, even when you use a RANDOM statement. Standard errors for estimates and LS-means based on the fixed effects model may be significantly smaller than those based on a true random effects model; in fact, some functions that are estimable under a true random effects model may not even be estimable under the fixed effects model. Therefore, you should use the MIXED procedure to compute tests involving these features that take the random effects into account; see Chapter 46, The MIXED Procedure, for more information.

Note that the test statistics computed when you specify the TEST option in the RANDOM statement have an exact F distribution only when the design is balanced; for unbalanced designs, the p-values for the F-tests are approximate. For balanced data, the values obtained by PROC GLM and PROC MIXED agree; for unbalanced data, they usually do not.

Computation of Expected Mean Squares for Random Effects

The RANDOM statement in PROC GLM declares one or more effects in the model to be random rather than fixed. By default, PROC GLM displays the coefficients of the expected mean squares for all terms in the model. In addition, when you specify the TEST option in the RANDOM statement, the procedure determines what tests are appropriate and provides F ratios and probabilities for these tests.

The expected mean squares are computed as follows. Consider the model

Y = Xβ + X_1β_1 + X_2β_2 + ⋯ + X_kβ_k + ε

where β represents the fixed effects, and β_1, β_2, …, β_k and ε represent the random effects. Random effects are assumed to be normally and independently distributed. For any L in the row space of X = (X  X_1  X_2  ⋯  X_k), the expected value of the sum of squares for Lβ is

E(SS_L) = β′C_0′C_0 β + SSQ(C_1) σ_1² + ⋯ + SSQ(C_k) σ_k² + rank(L) σ_ε²

where C is of the same dimensions as L and is partitioned as the X matrix. In other words,

C = ( C_0  C_1  C_2  ⋯  C_k )

Furthermore, C = ML, where M is the inverse of the lower triangular Cholesky decomposition matrix of L(X′X)⁻L′. SSQ(A) is defined as tr(A′A).

For the model in the following MODEL statement

  model Y = A B(A) C A*C;
  random B(A);

with B(A) declared as random, the expected mean square of each effect is displayed as a linear combination of Var(Error) and Var(B(A)), together with a quadratic form in any fixed effects that appear in its expectation.

If any fixed effects appear in the expected mean square of an effect, the letter Q followed by the list of fixed effects in the expected value is displayed. The actual numeric values of the quadratic form (Q matrix) can be displayed using the Q option.

To determine appropriate means squares for testing the effects in the model, the TEST option in the RANDOM statement performs the following.

  1. First, it forms a matrix of coefficients of the expected mean squares of those effects that were declared to be random.

  2. Next, for each effect in the model, it determines the combination of these expected mean squares that produces an expectation that includes all the terms in the expected mean square of the effect of interest except the one corresponding to the effect of interest. For example, if the expected mean square of an effect is

     Var(Error) + c Var(B(A)) + Q(A, A*C)

     PROC GLM determines the combination of other expected mean squares in the model that has expectation

     Var(Error) + c Var(B(A))
  3. If the preceding criterion is met by the expected mean square of a single effect in the model (as is often the case in balanced designs), the F test is formed directly. In this case, the mean square of the effect of interest is used as the numerator, the mean square of the single effect with an expected mean square that satisfies the criterion is used as the denominator, and the degrees of freedom for the test are simply the usual model degrees of freedom.

  4. When more than one mean square must be combined to achieve the appropriate expectation, an approximation is employed to determine the appropriate degrees of freedom (Satterthwaite 1946). When effects other than the effect of interest are listed after the Q in the output, tests of hypotheses involving the effect of interest are not valid unless all other fixed effects involved in it are assumed to be zero. When tests such as these are performed by using the TEST option in the RANDOM statement, a note is displayed reminding you that further assumptions are necessary for the validity of these tests. Remember that although the tests are not valid unless these assumptions hold, performing the tests provides no basis for believing that the assumptions are true. The particulars of a given experiment must be examined to determine whether the assumption is reasonable.

Refer to Goodnight and Speed (1978), Milliken and Johnson (1984, Chapters 22 and 23), and Hocking (1985) for further theoretical discussion.

Sum-to-Zero Assumptions

The formulation and parameterization of the expected mean squares for random effects in mixed models is an ongoing item of controversy in the statistical literature. Confusion arises over whether or not to assume that terms involving fixed effects sum to zero. Cornfield and Tukey (1956), Winer (1971), and others assume that they do sum to zero; Searle (1971), Hocking (1973), and others (including PROC GLM) do not.

Different assumptions about these sum-to-zero constraints can lead to different expected mean squares for certain terms, and hence to different F and p values.

For arguments in favor of not assuming that terms involving fixed effects sum to zero, see Section 9.7 of Searle (1971) and Sections 1 and 4 of McLean et al. (1991). Other references are Hartley and Searle (1969) and Searle et al. (1992).

Computing Type I, II, and IV Expected Mean Squares

When you use the RANDOM statement, by default the GLM procedure produces the Type III expected mean squares for model effects and for contrasts specified before the RANDOM statement. In order to obtain expected values for other types of mean squares, you need to specify which types of mean squares are of interest in the MODEL statement. For example, in order to obtain the Type IV expected mean squares for effects in the RANDOM and CONTRAST statements, specify the SS4 option in the MODEL statement. If you want both Type III and Type IV expected mean squares, specify both the SS3 and SS4 options in the MODEL statement. Since the estimable function basis is not automatically calculated for Type I and Type II SS, the E1 (for Type I) or E2 (for Type II) option must be specified in the MODEL statement in order for the RANDOM statement to produce the expected mean squares for the Type I or Type II sums of squares. Note that it is important to list the fixed effects first in the MODEL statement when requesting the Type I expected mean squares.

For example, suppose you have a two-way design with factors A and B in which the main effect for B and the interaction are random. In order to compute the Type III expected mean squares (in addition to the fixed-effect analysis), you can use the following statements:

  proc glm;
     class A B;
     model Y = A B A*B;
     random B A*B;
  run;

If you use the SS4 option in the MODEL statement,

  proc glm;
     class A B;
     model Y = A B A*B / ss4;
     random B A*B;
  run;

then only the Type IV expected mean squares are computed (as well as the Type IV fixed-effect tests). For the Type I expected mean squares, you can use the following statements:

  proc glm;
     class A B;
     model Y = A B A*B / e1;
     random B A*B;
  run;

For each of these cases, in order to perform random effect analysis of variance tests for each effect specified in the model, you need to specify the TEST option in the RANDOM statement, as follows:

  proc glm;
     class A B;
     model Y = A B A*B;
     random B A*B / test;
  run;

The GLM procedure automatically determines the appropriate error term for each test, based on the expected mean squares.

Missing Values

For an analysis involving one dependent variable, PROC GLM uses an observation if values are nonmissing for that dependent variable and all the class variables.

For an analysis involving multiple dependent variables without the MANOVA or REPEATED statement, or without the MANOVA option in the PROC GLM statement, a missing value in one dependent variable does not eliminate the observation from the analysis of other nonmissing dependent variables. On the other hand, for an analysis with the MANOVA or REPEATED statement, or with the MANOVA option in the PROC GLM statement, PROC GLM uses an observation if values are nonmissing for all dependent variables and all the variables used in independent effects.
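For example, the following sketch requests a multivariate test (the data set and variable names are hypothetical); because a MANOVA statement is present, an observation that is missing either y1 or y2 is excluded from the analysis of both:

  proc glm data=mydata;
     class A;
     model y1 y2 = A;
     manova h=A;
  run;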

During processing, the GLM procedure groups the dependent variables by their pattern of missing values across observations so that sums and crossproducts can be collected in the most efficient manner.

If your data have different patterns of missing values among the dependent variables, interactivity is disabled. This can occur when some of the variables in your data set have missing values and

  • you do not use the MANOVA option in the PROC GLM statement

  • you do not use a MANOVA or REPEATED statement before the first RUN statement

Note that the REG procedure handles missing values differently in this case; see Chapter 61, The REG Procedure, for more information.

Computational Resources

Memory

For large problems, most of the memory resources are required for holding the X′X matrix of the sums and crossproducts. The section Parameterization of PROC GLM Models on page 1787 describes how columns of the X matrix are allocated for various types of effects. For each level that occurs in the data for a combination of class variables in a given effect, a row and column of X′X is needed.

The following example illustrates the calculation. Suppose A has 20 levels, B has 4 levels, and C has 3 levels. Then consider the model

  proc glm;
     class A B C;
     model Y1 Y2 Y3 = A B A*B C A*C B*C A*B*C X1 X2;
  run;

The X′X matrix (bordered by X′Y and Y′Y) can have as many as 425 rows and columns:

      1   for the intercept term
     20   for A
      4   for B
     80   for A*B
      3   for C
     60   for A*C
     12   for B*C
    240   for A*B*C
      2   for X1 and X2 (continuous variables)
      3   for Y1, Y2, and Y3 (dependent variables)

The matrix has 425 rows and columns only if all combinations of levels occur for each effect in the model. For m rows and columns, 8m² bytes are needed for crossproducts. In this case, 8 × 425² = 1,445,000 bytes, or about 1,445,000 / 1024 ≈ 1411K.
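As a quick check of this arithmetic, here is a DATA step sketch (the variable names are illustrative only):

  data _null_;
     m     = 425;           /* rows/columns of the bordered X'X matrix */
     bytes = 8 * m**2;      /* 8 bytes per double-precision entry      */
     kb    = bytes / 1024;
     put bytes= comma12. kb= comma8.;
  run;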

The required memory grows as the square of the number of columns of X; in this example, most of the memory is for the A*B*C interaction. Without A*B*C, you have 185 columns and need 268K for X′X. Without either A*B*C or A*B, you need 86K. If A is recoded to have ten levels, then the full model has only 220 columns and requires 378K.

The second time that a large amount of memory is needed is when Type III, Type IV, or contrast sums of squares are being calculated. This memory requirement is a function of the number of degrees of freedom of the model being analyzed and the maximum degrees of freedom for any single source. Let Rank equal the sum of the model degrees of freedom, let MaxDF be the maximum number of degrees of freedom for any single source, and let Ny be the number of dependent variables in the model. Then the memory requirement in bytes is

[equation not reproduced in the source]

Unfortunately, these quantities are not available when the X′X matrix is being constructed, so PROC GLM may occasionally request additional memory even after you have increased the memory allocation available to the program.

If you have a large model that exceeds the memory capacity of your computer, these are your options:

  • Eliminate terms, especially high-level interactions.

  • Reduce the number of levels for variables with many levels.

  • Use the ABSORB statement for parts of the model that are large (a sketch follows this list).

  • Use the REPEATED statement for repeated measures variables.

  • Use PROC ANOVA or PROC REG rather than PROC GLM, if your design allows.
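For instance, here is a minimal ABSORB sketch, assuming a high-level blocking variable named block whose effect you must adjust for but do not need estimates of; the data must be sorted by the absorbed variable:

  proc sort data=mydata;
     by block;
  run;

  proc glm data=mydata;
     absorb block;
     class A B;
     model y = A B A*B;
  run;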

A related limitation is that for any model effect involving classification variables (interactions as well as main effects), the number of levels cannot exceed 32,767. This is because PROC GLM internally indexes effect levels using signed short (16-bit) integers, for which the maximum value is 2¹⁵ − 1 = 32,767.

CPU Time

Typically, if the GLM procedure requires a lot of CPU time, it will be for one of several reasons. Suppose that the input data has n rows (observations) and the model has E effects which together produce a design matrix X with m columns. Then if m or n is relatively large, the procedure may spend a lot of time in any of the following areas:

  • collecting the sums of squares and crossproducts

  • solving the normal equations

  • computing the Type III tests

The time required for collecting sums and crossproducts is difficult to calculate because it is a complicated function of the model. The worst case occurs if all columns are continuous variables, involving about nm²/2 multiplications and additions. If the columns are levels of a classification, then only m sums may be needed, but a significant amount of time may be spent in look-up operations. Solving the normal equations requires time for approximately m³/2 multiplications and additions, and the number of operations required to compute the Type III tests is proportional to both E and m³.

Suppose that you know that Type IV sums of squares are appropriate for the model you are analyzing (for example, if your design has no missing cells). You can specify the SS4 option in your MODEL statement, which saves CPU time by requesting the Type IV sums of squares instead of the more computationally burdensome Type III sums of squares. This proves especially useful if you have a factor in your model that has many levels and is involved in several interactions.

If the operating system enables SAS to run parallel computational threads on multiple CPUs, then both the solution of the normal equations and the computation of Type III tests can take advantage of this to reduce the computational time for large models. In solving the normal equations, the fundamental row sweep operations (Goodnight 1979) are performed in parallel. In computing the Type III tests, both the orthogonalization for the estimable functions and the sums of squares calculation have been parallelized.

The reduction in computational time due to parallel processing depends on the size of the model, the number of processors, and the parallel architecture of the operating system. If the model is large enough that the overwhelming proportion of CPU time for the procedure is accounted for in solving the normal equations and/or computing the Type III tests, then you can expect a reduction in computational time approximately inversely proportional to the number of CPUs. However, as you increase the number of processors, the efficiency of this scaling can be reduced by several effects. One mitigating factor is a purely mathematical one known as Amdahl's Law, which is related to the fact that only part of the processing time for the procedure can be parallelized. Even taking Amdahl's Law into account, the parallelization efficiency can be reduced by cache effects related to how fast the multiple processors can access memory. See Cohen (2002) for a discussion of these issues. For additional information on parallel processing in SAS, refer to the chapter on Support for Parallel Processing in SAS Language Reference: Concepts.
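For example, threading can be enabled through SAS system options; the CPUCOUNT value and data set name below are illustrative assumptions:

  options threads cpucount=4;   /* allow parallel threads on up to 4 CPUs */

  proc glm data=mydata;
     class A B;
     model y = A B A*B;
  run;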

Computational Method

Let X represent the n × p design matrix and Y the n × 1 vector of dependent variables. (See the section Parameterization of PROC GLM Models on page 1787 for information on how X is formed from your model specification.)

The normal equations X′Xβ = X′Y are solved using a modified sweep routine that produces a generalized (g2) inverse (X′X)⁻ and a solution b = (X′X)⁻X′Y (Pringle and Rayner 1971).

For each effect in the model, a matrix L is computed such that the rows of L are estimable. Tests of the hypothesis Lβ = 0 are then made by first computing

SS(H₀: Lβ = 0) = (Lb)′ (L(X′X)⁻L′)⁻¹ (Lb)

and then computing the associated F value using the mean squared error.
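To make this computation concrete, here is a small SAS/IML sketch for a hypothetical one-way design with two levels. It uses the Moore-Penrose generalized inverse from the GINV function rather than the g2 inverse that PROC GLM obtains by sweeping, but for an estimable L the resulting sum of squares is the same for any choice of generalized inverse:

  proc iml;
     X = {1 1 0,                     /* intercept plus one column per level  */
          1 1 0,
          1 0 1,
          1 0 1};
     Y = {3, 4, 6, 7};
     XpXg = ginv(X` * X);            /* a generalized inverse of X'X         */
     b    = XpXg * X` * Y;           /* one solution of the normal equations */
     L    = {0 1 -1};                /* estimable: level 1 minus level 2     */
     Lb   = L * b;
     SS   = Lb` * inv(L * XpXg * L`) * Lb;   /* SS for H0: L*beta = 0        */
     print b SS;                     /* SS = 9 for these data                */
  quit;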

Output Data Sets

OUT= Data Set Created by the OUTPUT Statement

The OUTPUT statement produces an output data set that contains the following:

  • all original data from the SAS data set input to PROC GLM

  • the new variables corresponding to the diagnostic measures specified with statistics keywords in the OUTPUT statement (PREDICTED=, RESIDUAL=, and so on).

With multiple dependent variables, a name can be specified for any of the diagnostic measures for each of the dependent variables in the order in which they occur in the MODEL statement.

For example, suppose that the input data set A contains the variables y1 , y2 , y3 , x1 , and x2 . Then you can use the following statements:

  proc glm data=A;
     model y1 y2 y3 = x1;
     output out=out p=y1hat y2hat y3hat
                    r=y1resid lclm=y1lcl uclm=y1ucl;
  run;

The output data set out contains y1 , y2 , y3 , x1 , x2 , y1hat , y2hat , y3hat , y1resid , y1lcl , and y1ucl . The variable x2 is output even though it is not used by PROC GLM. Although predicted values are generated for all three dependent variables, residuals are output for only the first dependent variable.

When any independent variable in the analysis (including all class variables) is missing for an observation, then all new variables that correspond to diagnostic measures are missing for the observation in the output data set.

When a dependent variable in the analysis is missing for an observation, then some new variables that correspond to diagnostic measures are missing for the observation in the output data set, and some are still available. Specifically, in this case, the new variables that correspond to COOKD, COVRATIO, DFFITS, PRESS, R, RSTUDENT, STDR, and STUDENT are missing in the output data set. The variables corresponding to H, LCL, LCLM, P, STDI, STDP, UCL, and UCLM are not missing.

OUT= Data Set Created by the LSMEANS Statement

The OUT= option in the LSMEANS statement produces an output data set that contains

  • the unformatted values of each classification variable specified in any effect in the LSMEANS statement

  • a new variable, LSMEAN, which contains the LS-mean for the specified levels of the classification variables

  • a new variable, STDERR, which contains the standard error of the LS-mean

The variances and covariances among the LS-means are also output when the COV option is specified along with the OUT= option. In this case, only one effect can be specified in the LSMEANS statement, and the following variables are included in the output data set:

  • new variables, COV1, COV2, …, COVn, where n is the number of levels of the effect specified in the LSMEANS statement. These variables contain the covariances of each LS-mean with each other LS-mean.

  • a new variable, NUMBER, which provides an index for each observation to identify the covariances that correspond to that observation. The covariances for the observation with NUMBER equal to n can be found in the variable COVn.
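For example, the following sketch (the data set and variable names are hypothetical) writes the LS-means for a single effect, together with their covariances, to a data set:

  proc glm data=mydata;
     class A;
     model y = A;
     lsmeans A / out=lsm cov;
  run;

  proc print data=lsm;
  run;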

OUTSTAT= Data Set

The OUTSTAT= option in the PROC GLM statement produces an output data set that contains

  • the BY variables, if any

  • _TYPE_, a new character variable. _TYPE_ may take the values 'SS1', 'SS2', 'SS3', 'SS4', or 'CONTRAST', corresponding to the various types of sums of squares generated, or the values 'CANCORR', 'STRUCTUR', or 'SCORE' if a canonical analysis is performed through the MANOVA statement and no M= matrix is specified.

  • _SOURCE_, a new character variable. For each observation in the data set, _SOURCE_ contains the name of the model effect or contrast label from which the corresponding statistics are generated.

  • _NAME_, a new character variable. For each observation in the data set, _NAME_ contains the name of one of the dependent variables in the model or, in the case of canonical statistics, the name of one of the canonical variables (CAN1, CAN2, and so forth).

  • four new numeric variables: SS, DF, F, and PROB, containing sums of squares, degrees of freedom, F values, and probabilities, respectively, for each model or contrast sum of squares generated in the analysis. For observations resulting from canonical analyses, these variables have missing values.

  • if there is more than one dependent variable, then variables with the same names as the dependent variables represent

    • for _TYPE_='SS1', 'SS2', 'SS3', 'SS4', or 'CONTRAST', the crossproducts of the hypothesis matrices

    • for _TYPE_='CANCORR', canonical correlations for each variable

    • for _TYPE_='STRUCTUR', coefficients of the total structure matrix

    • for _TYPE_='SCORE', raw canonical score coefficients

The output data set can be used to perform special hypothesis tests (for example, with the IML procedure in SAS/IML software), to reformat output, to produce canonical variates (through the SCORE procedure), or to rotate structure matrices (through the FACTOR procedure).
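For example, here is a sketch that captures the statistics and then subsets the Type III rows (the data set names are hypothetical):

  proc glm data=mydata outstat=stats;
     class A B;
     model y = A B A*B;
  run;

  proc print data=stats;
     where _type_ = 'SS3';
  run;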

Displayed Output

The GLM procedure produces the following output by default:

  • The overall analysis-of-variance table breaks down the Total Sum of Squares for the dependent variable into the portion attributed to the Model and the portion attributed to Error.

  • The Mean Square term is the Sum of Squares divided by the degrees of freedom (DF).

  • The Mean Square for Error is an estimate of σ², the variance of the true errors.

  • The F Value is the ratio produced by dividing the Mean Square for the Model by the Mean Square for Error. It tests how well the model as a whole (adjusted for the mean) accounts for the dependent variable's behavior. This F test is a joint test that all parameters except the intercept are zero.

  • A small significance probability, Pr > F, indicates that some linear function of the parameters is significantly different from zero.

  • R-Square, R², measures how much variation in the dependent variable can be accounted for by the model. R², which can range from 0 to 1, is the ratio of the sum of squares for the model to the sum of squares for the corrected total. In general, the larger the value of R², the better the model's fit.

  • Coef Var, the coefficient of variation, which describes the amount of variation in the population, is 100 times the standard deviation estimate of the dependent variable, Root MSE (Mean Square for Error), divided by the Mean. The coefficient of variation is often a preferred measure because it is unitless.

  • Root MSE estimates the standard deviation of the dependent variable (or equivalently, the error term) and equals the square root of the Mean Square for Error.

  • Mean is the sample mean of the dependent variable.

These tests are used primarily in analysis-of-variance applications:

  • The Type I SS (sum of squares) measures incremental sums of squares for the model as each variable is added.

  • The Type III SS is the sum of squares for a balanced test of each effect, adjusted for every other effect.

These items are used primarily in regression applications:

  • The Estimates for the model Parameters (the intercept and the coefficients)

  • t Value is the Student's t value for testing the null hypothesis that the parameter (if it is estimable) equals zero.

  • The significance level, Pr > |t|, is the probability of getting a larger absolute value of t if the parameter is truly equal to zero. A very small value for this probability leads to the conclusion that the independent variable contributes significantly to the model.

  • The Standard Error is the square root of the estimated variance of the estimate of the true value of the parameter.

Other portions of output are discussed in the following examples.

ODS Table Names

PROC GLM assigns a name to each table it creates. You can use these names to reference the table when using the Output Delivery System (ODS) to select tables and create output data sets. These names are listed in the following table. For more information on ODS, see Chapter 14, Using the Output Delivery System.

Table 32.5: ODS Tables Produced in PROC GLM

ODS Table Name | Description | Statement / Option
Aliasing | Type 1, 2, 3, 4 aliasing structure | MODEL / (E1, E2, E3, or E4) and ALIASING
AltErrContrasts | ANOVA table for contrasts with alternative error | CONTRAST / E=
AltErrTests | ANOVA table for tests with alternative error | TEST / E=
Bartlett | Bartlett's homogeneity of variance test | MEANS / HOVTEST=BARTLETT
CLDiffs | Multiple comparisons of pairwise differences | MEANS / CLDIFF or DUNNETT or (unequal cells and not LINES)
CLDiffsInfo | Information for multiple comparisons of pairwise differences | MEANS / CLDIFF or DUNNETT or (unequal cells and not LINES)
CLMeans | Multiple comparisons of means with confidence/comparison interval | MEANS / CLM
CLMeansInfo | Information for multiple comparison of means with confidence/comparison interval | MEANS / CLM
CanAnalysis | Canonical analysis | (MANOVA or REPEATED) / CANONICAL
CanCoef | Canonical coefficients | (MANOVA or REPEATED) / CANONICAL
CanStructure | Canonical structure | (MANOVA or REPEATED) / CANONICAL
CharStruct | Characteristic roots and vectors | (MANOVA / not CANONICAL) or (REPEATED / PRINTRV)
ClassLevels | Classification variable levels | CLASS statement
ContrastCoef | L matrix for contrast | CONTRAST / EST
Contrasts | ANOVA table for contrasts | CONTRAST statement
DependentInfo | Simultaneously analyzed dependent variables | default when there are multiple dependent variables with different patterns of missing values
Diff | PDiff matrix of Least-Squares Means | LSMEANS / PDIFF
Epsilons | Greenhouse-Geisser and Huynh-Feldt epsilons | REPEATED statement
ErrorSSCP | Error SSCP matrix | (MANOVA or REPEATED) / PRINTE
EstFunc | Type 1, 2, 3, 4 estimable functions | MODEL / (E1, E2, E3, or E4)
Estimates | ESTIMATE statement results | ESTIMATE statement
ExpectedMeanSquares | Expected mean squares | RANDOM statement
FitStatistics | R-Square, C.V., Root MSE, and dependent mean | default
GAliasing | General form of aliasing structure | MODEL / E and ALIASING
GEstFunc | General form of estimable functions | MODEL / E
HOVFTest | Homogeneity of variance ANOVA | MEANS / HOVTEST
HypothesisSSCP | Hypothesis SSCP matrix | (MANOVA or REPEATED) / PRINTH
InvXPX | inv(X′X) matrix | MODEL / INVERSE
LSMeanCL | Confidence interval for LS-means | LSMEANS / CL
LSMeanCoef | Coefficients of Least-Squares Means | LSMEANS / E
LSMeanDiffCL | Confidence interval for LS-mean differences | LSMEANS / PDIFF and CL
LSMeans | Least-Squares means | LSMEANS statement
MANOVATransform | Multivariate transformation matrix | MANOVA / M=
MCLines | Multiple comparisons LINES output | MEANS / LINES or ((DUNCAN or WALLER or SNK or REGWQ) and not (CLDIFF or CLM)) or (equal cells and not CLDIFF)
MCLinesInfo | Information for multiple comparison LINES output | MEANS / LINES or ((DUNCAN or WALLER or SNK or REGWQ) and not (CLDIFF or CLM)) or (equal cells and not CLDIFF)
MCLinesRange | Ranges for multiple range MC tests | MEANS / LINES or ((DUNCAN or WALLER or SNK or REGWQ) and not (CLDIFF or CLM)) or (equal cells and not CLDIFF)
MatrixRepresentation | X matrix element representation | as needed for other options
Means | Group means | MEANS statement
ModelANOVA | ANOVA for model terms | default
MultStat | Multivariate tests | MANOVA statement
NObs | Number of observations | default
OverallANOVA | Overall ANOVA | default
ParameterEstimates | Estimated linear model coefficients | MODEL / SOLUTION
PartialCorr | Partial correlation matrix | (MANOVA or REPEATED) / PRINTE
PredictedInfo | Predicted values info | MODEL / PREDICTED or CLM or CLI
PredictedValues | Predicted values | MODEL / PREDICTED or CLM or CLI
QForm | Quadratic form for expected mean squares | RANDOM / Q
RandomModelANOVA | Random effect tests | RANDOM / TEST
RepeatedLevelInfo | Correspondence between dependents and repeated measures levels | REPEATED statement
RepeatedTransform | Repeated measures transformation matrix | REPEATED / PRINTM
SimDetails | Details of difference quantile simulation | LSMEANS / ADJUST=SIMULATE(REPORT)
SimResults | Evaluation of difference quantile simulation | LSMEANS / ADJUST=SIMULATE(REPORT)
SlicedANOVA | Sliced effect ANOVA table | LSMEANS / SLICE
Sphericity | Sphericity tests | REPEATED / PRINTE
Tests | Summary ANOVA for specified MANOVA H= effects | MANOVA / H= SUMMARY
Tolerances | X′X tolerances | MODEL / TOLERANCE
Welch | Welch's ANOVA | MEANS / WELCH
XPX | X′X matrix | MODEL / XPX
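For example, here is a sketch that captures two of these tables as data sets (the data set and variable names are hypothetical):

  ods output OverallANOVA=aov FitStatistics=fit;

  proc glm data=mydata;
     class A;
     model y = A;
  run;

  proc print data=aov;
  run;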

ODS Graphics (Experimental)

This section describes the use of ODS for creating statistical graphs with the GLM procedure. These graphics are experimental in this release, meaning that both the graphical results and the syntax for specifying them are subject to change in a future release. To request these graphs you must specify the ODS GRAPHICS statement with an appropriate model, as discussed in the following. For more information on the ODS GRAPHICS statement, see Chapter 15, Statistical Graphics Using ODS.

When ODS Graphics is in effect, the GLM procedure produces default graphics for particular models.

  • If you specify an analysis of covariance model, with one classification variable and one continuous variable, the GLM procedure will produce an analysis of covariance plot of the response values versus the covariate values, with lines representing the fitted relationship within each classification level. For an example of the analysis of covariance plot, see Example 32.4 on page 1860.

  • If you specify a one-way analysis of variance model, with just one independent classification variable, the GLM procedure will produce a grouped box plot of the response values versus the classification levels. For an example of the box plot, see the One-Way Layout with Means Comparisons section on page 424.
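For instance, here is a minimal sketch of the one-way case (the data set and variable names are hypothetical); in this release the graphs are rendered to an open ODS destination such as HTML:

  ods html;
  ods graphics on;

  proc glm data=mydata;
     class group;
     model y = group;   /* one class variable: a grouped box plot is produced */
  run;

  ods graphics off;
  ods html close;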

ODS Graph Names

PROC GLM assigns a name to each graph it creates using ODS. You can use these names to reference the graphs when using ODS. The names are listed in Table 32.6.

Table 32.6: ODS Graphics Produced by PROC GLM

ODS Graph Name | Plot Description
ANCOVAPlot | Analysis of covariance plot
BoxPlot | Box plot


[*] The Duncan-Waller method does not fit into the preceding scheme, since it is based on the Bayes risk rather than any particular error rate.



