To specify one or more scenarios for an analysis parameter (or set of parameters) in the POWER statement, you provide a list of values for the option that corresponds to the parameter(s). To identify the parameter you wish to solve for, you place missing values in the appropriate list.
Scenarios for scalar-valued parameters, such as power, are represented by a number-list.
A number-list can be one of two things: a series of one or more numbers expressed in the form of one or more DOLISTs, or a missing value indicator (.).
The DOLIST format is the same as in the DATA step language. For example, you can specify four scenarios (30, 50, 70, and 100) for a total sample size in any of the following ways.
    NTOTAL = 30 50 70 100
    NTOTAL = 30 to 70 by 20 100
A missing value identifies a parameter as the result parameter; it is valid only with options representing parameters you can solve for in a given analysis. For example, you can request a solution for NTOTAL:
NTOTAL = .
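As a minimal sketch of how number-lists and the missing value indicator combine in one POWER statement, the following statements request the total sample size for each combination of two standard deviations and two target powers. The data set name Exemp and its cell means are hypothetical and serve only as an illustration.

```sas
/* Hypothetical exemplary data set: one row per design profile */
data Exemp;
   input A $ Y;
   datalines;
1 10
2 12
3 15
;
run;

proc glmpower data=Exemp;
   class A;
   model Y = A;
   power
      stddev = 2 3        /* number-list: two scenarios for the error SD */
      power  = 0.8 0.9    /* number-list: two scenarios for target power */
      ntotal = .;         /* missing value: solve for total sample size  */
run;
```

Because each number-list contributes two values, this sketch should produce four scenarios, each with its own computed total sample size.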
By default, PROC GLMPOWER rounds sample sizes conservatively (down in the input, up in the output) so that all total sizes and sample sizes for individual design profiles are integers. This is generally considered conservative because it selects the closest realistic design providing at most the power of the (possibly fractional) input or mathematically optimized design. In addition, all design profile sizes are adjusted to be multiples of their corresponding weights. If a design profile is present more than once in the exemplary data set, then the weights for that design profile are summed. For example, if a particular design profile is present twice in the exemplary data set with weight values 2 and 6, then all sample sizes for this design profile become multiples of 2 + 6 = 8.
With the NFRACTIONAL option, sample size input is not rounded, and sample size output is reported in two versions, a raw fractional version and a ceiling version rounded up to the nearest integer.
Whenever an input sample size is adjusted, both the original (nominal) and adjusted (actual) sample sizes are reported. Whenever computed output sample sizes are adjusted, both the original input (nominal) power and the achieved (actual) power at the adjusted sample size are reported.
The Error column in the main output table explains reasons for missing results and flags numerical results that are bounds rather than exact answers.
The Information column provides further details about Error entries, warnings about any boundary conditions detected, and notes about any adjustments to input. Note that the Information column is hidden by default in the main output. You can view it by using the ODS OUTPUT statement to save the output as a data set and then using the PRINT procedure to display it. For example, the following SAS statements print both the Error and Info columns for a power computation in a one-way ANOVA.
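    /* Exemplary data set: one row per design profile, two cell-means scenarios */
    data MyExemp;
       input A $ Y1 Y2;
       datalines;
    1 10 11
    2 12 11
    3 15 11
    ;
    run;

    proc glmpower data=MyExemp;
       class A;
       model Y1 Y2 = A;
       power
          stddev = 2
          ntotal = 3 10   /* two sample size scenarios */
          power  = .;     /* solve for power */
       ods output output=Power;   /* save the main output table */
    run;

    /* Print the saved table, including the hidden Info column */
    proc print noobs data=Power;
       var NominalNTotal NTotal Dependent Power Error Info;
    run;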
The output is shown in Figure 34.5.
Nominal NTotal | NTotal | Dependent | Power | Error | Info |
---|---|---|---|---|---|
3 | 3 | Y1 | . | Invalid input | Error DF=0 |
10 | 9 | Y1 | 0.557 | | Input N adjusted |
3 | 3 | Y2 | . | Invalid input | Error DF=0 / No effect |
10 | 9 | Y2 | 0.050 | | Input N adjusted / No effect |

The sample size of 3 specified with the NTOTAL= option leads to an Invalid input message in the Error column and an Error DF=0 message in the Info column, because a sample size of 3 is so small that there are no degrees of freedom left for the error term. The sample size of 10 leads to an Input N adjusted message in the Info column, because it is rounded down to 9 to produce integer group sizes of 3 per cell. The cell means scenario represented by the dependent variable Y2 causes a No effect message to appear in the Info column, because the means in this scenario are all equal.
If you use the PLOTONLY option in the PROC GLMPOWER statement, the procedure only displays graphical output. Otherwise, the displayed output of the GLMPOWER procedure includes the following:
the Fixed Scenario Elements table, which shows all applicable single-valued analysis parameters, in the following order: the weight variable, the source of the test, parameters input explicitly, parameters supplied with defaults, and ancillary results
an output table showing the following when applicable (in order): the index of the scenario, the source of the test, all multivalued input, ancillary results, the primary computed result, and error descriptions
plots (if requested)
Ancillary results include the following:
Actual Power, the achieved power, if it differs from the input (Nominal) power value
fractional sample size, if the NFRACTIONAL option is used in the analysis statement
If sample size is the result parameter and the NFRACTIONAL option is used in the analysis statement, then both Fractional and Ceiling sample size results are displayed. Fractional sample sizes correspond to the Nominal values of power or precision probability. Ceiling sample sizes are simply the fractional sample sizes rounded up to the nearest integer; they correspond to Actual values of power or precision probability.
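The following sketch illustrates a request that should produce both Fractional and Ceiling sample size results. The exemplary data set Exemp, its cell means, and the target power of 0.9 are hypothetical values chosen only for illustration.

```sas
/* Hypothetical exemplary data set for a one-way design */
data Exemp;
   input A $ Y;
   datalines;
1 10
2 12
3 15
;
run;

proc glmpower data=Exemp;
   class A;
   model Y = A;
   power
      nfractional      /* report unrounded and ceiling sample sizes */
      stddev = 2
      power  = 0.9     /* nominal power */
      ntotal = .;      /* solve for total sample size */
run;
```

In this setting the nominal power of 0.9 corresponds to the fractional sample size, and the actual power is the one achieved at the ceiling sample size.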
PROC GLMPOWER assigns a name to each table that it creates. You can use these names to reference the table when using the Output Delivery System (ODS) to select tables and create output data sets. These names are listed in Table 34.6. For more information on ODS, see Chapter 14, Using the Output Delivery System.
ODS Table Name | Description | Statement |
---|---|---|
FixedElements | factoid with single-valued analysis parameters | default |
Output | all input and computed analysis parameters, error messages, and information messages for each scenario | default |
PlotContent | data contained in plots, including analysis parameters and indices identifying plot features. (Note: This table is saved as a data set and not displayed in PROC GLMPOWER output.) | PLOT |
The ODS path names are created as follows:
Glmpower.Power<n>.FixedElements
Glmpower.Power<n>.Output
Glmpower.Power<n>.PlotContent
Glmpower.Power<n>.Plot<m>
where
The Plot <m> objects are the graphs.
The <n> indexing the POWER statement is only used if there is more than one instance.
The <m> indexing the plots increases with every panel in every PLOT statement, resetting to 1 only at new analysis statements.
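As a brief sketch of how these table names can be used, the following statements turn on ODS tracing so that the name and path of each output object are written to the log, and save two of the tables as data sets. The data set names Exemp, FixedParms, and PowerResults are hypothetical.

```sas
/* Hypothetical exemplary data set for a one-way design */
data Exemp;
   input A $ Y;
   datalines;
1 10
2 12
3 15
;
run;

ods trace on;    /* list each output object's name and path in the log */
proc glmpower data=Exemp;
   class A;
   model Y = A;
   power stddev = 2 ntotal = 9 30 power = .;
   /* save the FixedElements and Output tables as data sets */
   ods output FixedElements=FixedParms Output=PowerResults;
run;
ods trace off;

proc print data=PowerResults noobs;
run;
```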
This section describes the approaches used in PROC GLMPOWER to compute power and sample size.
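The univariate linear model has the form

$$ y = X\beta + \epsilon $$

where $y$ is the $N \times 1$ vector of responses, $X$ is the $N \times p$ design matrix, $\beta$ is the $p \times 1$ vector of model parameters corresponding to the columns of $X$, and $\epsilon$ is an $N \times 1$ vector of errors with

$$ \epsilon_1, \ldots, \epsilon_N \sim N(0, \sigma^2) \quad \text{(iid)} $$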
In PROC GLMPOWER, the model parameters $\beta$ are not specified directly, but rather indirectly as $y^\star$, which represents either conjectured response means or typical response values for each design profile. The $y^\star$ values are manifested as the dependent variable in the MODEL statement. The vector $\beta$ is obtained from $y^\star$ according to the least squares equation,

$$ \beta = (X'X)^{-} X' y^\star $$

Note that, in general, there is not a one-to-one mapping between $y^\star$ and $\beta$. Many different scenarios for $y^\star$ may lead to the same $\beta$. If you specify $y^\star$ with the intention of representing cell means, keep in mind that PROC GLMPOWER allows scenarios that are not valid cell means according to the model specified in the MODEL statement. For example, if $y^\star$ exhibits an interaction effect but the corresponding interaction term is left out of the model, then the cell means ($X\beta$) derived from $\beta$ differ from $y^\star$. In particular, the cell means thus derived are the projection of $y^\star$ onto the model space.
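A small worked illustration of this projection, using hypothetical values: consider a balanced 2 × 2 design with factors A and B, equal weights, and a main-effects-only model. Suppose the conjectured values for the cells (A1B1, A1B2, A2B1, A2B2) contain an interaction. In a balanced layout, the least squares fitted value for each cell under the main-effects model is the row mean plus the column mean minus the grand mean:

$$
y^\star = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 4 \end{pmatrix}, \qquad
\text{row means} = (0,\ 2), \quad \text{column means} = (0,\ 2), \quad \text{grand mean} = 1,
$$

$$
X\beta = \begin{pmatrix} 0 + 0 - 1 \\ 0 + 2 - 1 \\ 2 + 0 - 1 \\ 2 + 2 - 1 \end{pmatrix}
       = \begin{pmatrix} -1 \\ 1 \\ 1 \\ 3 \end{pmatrix} \neq y^\star .
$$

The difference $y^\star - X\beta = (1, -1, -1, 1)'$ is the interaction component, which is orthogonal to the main-effects model space and therefore does not contribute to the derived cell means.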
It is convenient in power analysis to parameterize the design matrix $X$ in three parts, $\{\ddot{X}, w, N\}$, defined as follows:
The $q \times p$ essence design matrix $\ddot{X}$ is the collection of unique rows of $X$. Its rows are sometimes referred to as design profiles. Here, $q \le N$ is defined simply as the number of unique rows of $X$.
The $q \times 1$ weight vector $w$ reveals the relative proportions of design profiles. Row $i$ of $\ddot{X}$ is to be included in the design $w_i$ times for every $w_j$ times that row $j$ is included. The weights are assumed to be standardized (that is, they sum to 1).
The total sample size is $N$. This is the number of rows in $X$. If you gather $N w_i = n_i$ copies of the $i$th row of $\ddot{X}$, for $i = 1, \ldots, q$, then you end up with $X$.
It is useful to express the crossproduct matrix $X'X$ in terms of these three parts,

$$ X'X = N \, \ddot{X}' \, \mathrm{diag}(w) \, \ddot{X} $$

since this factors out the portion ($N$) depending on sample size and the portion ($\ddot{X}' \mathrm{diag}(w) \ddot{X}$) depending only on the design structure.
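As a quick check of this factorization on hypothetical values, take two design profiles with cell-means coding (so the essence design matrix, written here as $\ddot{X}$, is the $2 \times 2$ identity), standardized weights $w = (1/4,\ 3/4)'$, and total sample size $N = 8$. Then $n_1 = 2$ and $n_2 = 6$, and $X$ consists of two copies of the row $(1, 0)$ and six copies of the row $(0, 1)$:

$$
X'X = \begin{pmatrix} 2 & 0 \\ 0 & 6 \end{pmatrix}
    = 8 \begin{pmatrix} 1/4 & 0 \\ 0 & 3/4 \end{pmatrix}
    = N \, \ddot{X}' \, \mathrm{diag}(w) \, \ddot{X} .
$$

Doubling $N$ to 16 doubles $X'X$ without changing the design-structure factor.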
A general linear hypothesis for the univariate model has the form

$$ H_0\colon L\beta = \theta_0 \qquad H_A\colon L\beta \ne \theta_0 $$

where $L$ is an $r_L \times p$ contrast matrix (assumed to be full rank), and $\theta_0$ is the null value (usually just a vector of zeros). Note that effect tests are just contrasts using special forms of $L$. Thus, this scheme covers both effect tests and custom contrasts.
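The test statistic is

$$ F = \frac{SS_H / r_L}{\hat{\sigma}^2} $$

where

$$ SS_H = (L\hat{\beta} - \theta_0)' \left( L (X'X)^{-} L' \right)^{-1} (L\hat{\beta} - \theta_0) $$

$$ \hat{\beta} = (X'X)^{-} X' y $$

$$ \hat{\sigma}^2 = \frac{1}{DF_E} (y - X\hat{\beta})' (y - X\hat{\beta}) $$

where $DF_E = N - \mathrm{rank}(X)$. Note that $DF_E = N - p$ if $X$ has full rank.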
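Under $H_0$, $F \sim F(r_L, DF_E)$. Under $H_A$, $F$ is distributed as $F(r_L, DF_E, \lambda)$ with noncentrality

$$ \lambda = (L\beta - \theta_0)' \left( L (X'X)^{-} L' \right)^{-1} (L\beta - \theta_0) \, \sigma^{-2} $$

Muller and Peterson (1984) give the exact power of the test as

$$ \mathrm{power} = P\left( F(r_L, DF_E, \lambda) \ge F_{1-\alpha}(r_L, DF_E) \right) $$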
Sample size is computed by inverting the power equation.
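The following DATA step is a minimal sketch of this power computation, namely the probability that a noncentral F variate exceeds the central F critical value. It uses the Y1 scenario from the one-way example earlier in this section (cell means 10, 12, and 15, standard deviation 2, and an adjusted total sample size of 9, or 3 per group). The noncentrality is computed with the standard one-way simplification, the sum of n_i times the squared deviation of each mean from the grand mean, divided by the error variance; this simplification and the variable names are assumptions made for illustration, not a description of the procedure's internal code.

```sas
data _null_;
   alpha = 0.05;            /* default significance level */
   sigma = 2;               /* error standard deviation */
   n     = 3;               /* observations per group */
   rL    = 2;               /* numerator DF: 3 groups - 1 */
   dfe   = 6;               /* error DF: N - rank(X) = 9 - 3 */

   array mu[3] _temporary_ (10 12 15);   /* conjectured cell means */
   mubar = (mu[1] + mu[2] + mu[3]) / 3;

   /* one-way noncentrality: sum of n*(mu_i - mubar)**2 over sigma**2 */
   ss = 0;
   do i = 1 to 3;
      ss = ss + n * (mu[i] - mubar)**2;
   end;
   lambda = ss / sigma**2;

   fcrit = finv(1 - alpha, rL, dfe);            /* central F critical value */
   power = 1 - probf(fcrit, rL, dfe, lambda);   /* noncentral F upper tail */

   put lambda= fcrit= power=;
run;
```

If the simplification holds for this design, the printed power should be comparable to the Power value reported for the Y1 scenario with NTotal = 9 in the example output.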
Refer to Muller et al. (1992) and O'Brien and Shieh (1992) for additional discussion.
If you specify covariates in the model (whether continuous or categorical), then two adjustments are made in order to compute approximate power in the presence of the covariates. Let $n_\nu$ denote the number of covariates (counting dummy variables for categorical covariates individually). In other words, $n_\nu$ is the total degrees of freedom used by the covariates. The adjustments are the following:
The error degrees of freedom decreases by $n_\nu$.
The error standard deviation $\sigma$ shrinks by a factor of $(1 - \rho^2)^{1/2}$ (if the CORRXY= option is used to specify the correlation $\rho$ between covariates and response) or $(1 - r)^{1/2}$ (if the PROPVARREDUCTION= option is used to specify the proportional reduction $r$ in total $R^2$ incurred by the covariates). Let $\sigma^\star$ represent the updated value of $\sigma$.
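As a result of these changes, the power is computed as

$$ \mathrm{power} = P\left( F(r_L, DF_E - n_\nu, \lambda^\star) \ge F_{1-\alpha}(r_L, DF_E - n_\nu) \right) $$

where $\lambda^\star$ is calculated using $\sigma^\star$ rather than $\sigma$:

$$ \lambda^\star = (L\beta - \theta_0)' \left( L (X'X)^{-} L' \right)^{-1} (L\beta - \theta_0) \, (\sigma^\star)^{-2} $$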
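To make the two adjustments concrete, here is a short numerical illustration with hypothetical values: a single covariate ($n_\nu = 1$), a covariate-response correlation of $\rho = 0.5$, an unadjusted error standard deviation of $\sigma = 2$, and unadjusted error degrees of freedom $DF_E = 6$. Writing $\lambda$ for the noncentrality computed with the unadjusted $\sigma$:

$$
\sigma^\star = \sigma (1 - \rho^2)^{1/2} = 2 \,(0.75)^{1/2} \approx 1.732,
\qquad
\lambda^\star = \lambda \, \frac{\sigma^2}{(\sigma^\star)^2} = \frac{\lambda}{1 - \rho^2} = \tfrac{4}{3} \lambda,
\qquad
DF_E - n_\nu = 6 - 1 = 5 .
$$

The smaller error standard deviation inflates the noncentrality by one third, which tends to increase power, while the loss of one error degree of freedom works in the opposite direction.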