In statistical hypothesis testing, you typically express the belief that some effect exists in a population by specifying an alternative hypothesis H1. You state a null hypothesis H0 as the assertion that the effect does not exist and attempt to gather evidence to reject H0 in favor of H1. Evidence is gathered in the form of sample data, and a statistical test is used to assess H0. If H0 is rejected but there really is no effect, this is called a Type 1 error. The probability of a Type 1 error is usually designated alpha or α, and statistical tests are designed to ensure that α is suitably small (for example, less than 0.05).
If there really is an effect in the population but H0 is not rejected in the statistical test, then a Type 2 error has been made. The probability of a Type 2 error is usually designated beta or β. The probability 1 − β of avoiding a Type 2 error, that is, correctly rejecting H0 and achieving statistical significance, is called the power. (Note: Another more general definition of power is the probability of rejecting H0 for any given set of circumstances, even those corresponding to H0 being true. The POWER procedure uses this more general definition.)
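To make these definitions concrete, here is a minimal sketch (in Python, not part of PROC POWER) of the power of an upper one-sided one-sample z test with known standard deviation; the function name and scenario values are ours:

```python
# Illustrative only: alpha, beta, and power for an upper 1-sided one-sample
# z test of H0: mu = mu0 vs. H1: mu > mu0, with sigma known.
from scipy.stats import norm

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power = P(reject H0 | true mean mu1)."""
    z_crit = norm.ppf(1 - alpha)             # rejection cutoff on the z scale
    delta = (mu1 - mu0) / (sigma / n ** 0.5) # shift of the statistic under H1
    return 1 - norm.cdf(z_crit - delta)      # power = 1 - beta

# Under the more general definition, "power" at mu1 = mu0 is just alpha.
power = z_test_power(mu0=0, mu1=0.5, sigma=2, n=50, alpha=0.05)
```

Note that at mu1 = mu0 the function returns exactly alpha, illustrating the more general definition of power used by the procedure.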
An important goal in study planning is to ensure an acceptably high level of power. Sample size plays a prominent role in power computations because the focus is often on determining a sufficient sample size to achieve a certain power, or assessing the power for a range of different sample sizes.
Some of the analyses in the POWER procedure focus on precision rather than power. An analysis of confidence interval precision is analogous to a traditional power analysis, with CI Half-Width taking the place of effect size and Prob(Width) taking the place of power. The CI Half-Width is the margin of error associated with the confidence interval, the distance between the point estimate and an endpoint. The Prob(Width) is the probability of obtaining a confidence interval with at most a target half-width.
Table 57.23 gives a summary of the analyses supported in the POWER procedure. The name of the analysis statement reflects the type of data and design. The TEST=, CI=, and DIST= options specify the focus of the statistical hypothesis (in other words, the criterion on which the research question is based) and the test statistic to be used in data analysis.
Analysis | Statement | Options
---|---|---
Multiple linear regression: Type III F test | MULTREG | 
Correlation: Fisher's z test | ONECORR | DIST=FISHERZ
Correlation: t test | ONECORR | DIST=T
Binomial proportion: Exact test | ONESAMPLEFREQ | TEST=EXACT
Binomial proportion: z test | ONESAMPLEFREQ | TEST=Z
Binomial proportion: z test with continuity adjustment | ONESAMPLEFREQ | TEST=ADJZ
One-sample t test | ONESAMPLEMEANS | TEST=T
One-sample t test with lognormal data | ONESAMPLEMEANS | TEST=T DIST=LOGNORMAL
One-sample equivalence test for mean of normal data | ONESAMPLEMEANS | TEST=EQUIV
One-sample equivalence test for mean of lognormal data | ONESAMPLEMEANS | TEST=EQUIV DIST=LOGNORMAL
Confidence interval for a mean | ONESAMPLEMEANS | CI=T
One-way ANOVA: One-degree-of-freedom contrast | ONEWAYANOVA | TEST=CONTRAST
One-way ANOVA: Overall F test | ONEWAYANOVA | TEST=OVERALL
McNemar exact conditional test | PAIREDFREQ | 
McNemar normal approximation test | PAIREDFREQ | DIST=NORMAL
Paired t test | PAIREDMEANS | TEST=DIFF
Paired t test of mean ratio with lognormal data | PAIREDMEANS | TEST=RATIO
Paired additive equivalence of mean difference with normal data | PAIREDMEANS | TEST=EQUIV_DIFF
Paired multiplicative equivalence of mean ratio with lognormal data | PAIREDMEANS | TEST=EQUIV_RATIO
Confidence interval for mean of paired differences | PAIREDMEANS | CI=DIFF
Pearson chi-square test for two independent proportions | TWOSAMPLEFREQ | TEST=PCHI
Fisher's exact test for two independent proportions | TWOSAMPLEFREQ | TEST=FISHER
Likelihood ratio chi-square test for two independent proportions | TWOSAMPLEFREQ | TEST=LRCHI
Two-sample t test assuming equal variances | TWOSAMPLEMEANS | TEST=DIFF
Two-sample Satterthwaite t test assuming unequal variances | TWOSAMPLEMEANS | TEST=DIFF_SATT
Two-sample pooled t test of mean ratio with lognormal data | TWOSAMPLEMEANS | TEST=RATIO
Two-sample additive equivalence of mean difference with normal data | TWOSAMPLEMEANS | TEST=EQUIV_DIFF
Two-sample multiplicative equivalence of mean ratio with lognormal data | TWOSAMPLEMEANS | TEST=EQUIV_RATIO
Two-sample confidence interval for mean difference | TWOSAMPLEMEANS | CI=DIFF
Log-rank test for comparing two survival curves | TWOSAMPLESURVIVAL | TEST=LOGRANK
Gehan rank test for comparing two survival curves | TWOSAMPLESURVIVAL | TEST=GEHAN
Tarone-Ware rank test for comparing two survival curves | TWOSAMPLESURVIVAL | TEST=TARONEWARE
To specify one or more scenarios for an analysis parameter (or set of parameters), you provide a list of values for the statement option that corresponds to the parameter(s). To identify the parameter you wish to solve for, you place missing values in the appropriate list.
There are five basic types of such lists: keyword-lists, number-lists, grouped-number-lists, name-lists, and grouped-name-lists. Some parameters, such as the direction of a test, have values represented by one or more keywords in a keyword-list. Scenarios for scalar-valued parameters, such as power, are represented by a number-list. Scenarios for groups of scalar-valued parameters, such as group sample sizes in a multigroup design, are represented by a grouped-number-list. Scenarios for named parameters, such as reference survival curves, are represented by a name-list. Scenarios for groups of named parameters, such as group survival curves, are represented by a grouped-name-list.
The following subsections explain these five basic types of lists.
A keyword-list is a list of one or more keywords separated by spaces. For example, you can specify both 2-sided and upper-tailed versions of a one-sample t test:
SIDES = 2 U
A number-list can be one of two things: a series of one or more numbers expressed in the form of one or more DOLISTs, or a missing value indicator (.).
The DOLIST format is the same as in the DATA step language. For example, for the one-sample t test you can specify four scenarios (30, 50, 70, and 100) for a total sample size in any of the following ways:

NTOTAL = 30 50 70 100
NTOTAL = 30 to 70 by 20 100
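The DOLIST semantics can be illustrated with a hypothetical helper (this is an illustration in Python, not SAS code; the function name is ours) that expands such a specification into an explicit list of numbers:

```python
# Hypothetical illustration of DOLIST expansion: "start to stop by step"
# runs, with "by step" defaulting to 1, mixed freely with plain numbers.
def expand_dolist(spec):
    tokens = spec.split()
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i + 1] == "to":
            start, stop = float(tokens[i]), float(tokens[i + 2])
            step = 1.0
            i += 3
            if i + 1 < len(tokens) and tokens[i] == "by":
                step = float(tokens[i + 1])
                i += 2
            x = start
            while x <= stop + 1e-9:   # tolerate float round-off at the stop
                out.append(x)
                x += step
        else:
            out.append(float(tokens[i]))
            i += 1
    return out
```

Both forms above expand to the same four scenarios: `expand_dolist("30 to 70 by 20 100")` yields the same list as `expand_dolist("30 50 70 100")`.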
A missing value identifies a parameter as the result parameter; it is valid only with options representing parameters you can solve for in a given analysis. For example, you can request a solution for NTOTAL:
NTOTAL = .
A grouped-number-list specifies multiple scenarios for numeric values in two or more groups, possibly including missing value indicators to solve for a specific group. The list can assume one of two general forms, a crossed version and a matched version.
The crossed version of a grouped-number-list consists of a series of number-lists (see the Number-lists section on page 3491), one representing each group, each separated by a vertical bar (|). The values for each group represent multiple scenarios for that group, and the scenarios for each individual group are crossed to produce the set of all scenarios for the analysis option. For example, you can specify the following six scenarios for the sizes (n1, n2) of two groups

(20, 30)  (20, 40)  (20, 50)
(25, 30)  (25, 40)  (25, 50)

as follows:

GROUPNS = 20 25 | 30 40 50
If the analysis can solve for a value in one group given the other groups, then one of the number-lists in a crossed grouped-number-list can be a missing value indicator (.). For example, in a two-sample t test you can posit three scenarios for the group 2 sample size while solving for the group 1 sample size:
GROUPNS = . | 30 40 50
Some analyses can involve more than two groups. For example, you can specify 2 × 3 × 1 = 6 scenarios for the means of three groups in a one-way ANOVA as follows:

GROUPMEANS = 10 12 | 10 to 20 by 5 | 24
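The crossing of per-group scenario lists is simply a Cartesian product; a short Python sketch of the example above (the variable names are ours):

```python
# Illustrative only: the crossed form of a grouped-number-list is the
# Cartesian product of the per-group number-lists.
from itertools import product

group1_means = [10, 12]
group2_means = [10, 15, 20]   # expansion of "10 to 20 by 5"
group3_means = [24]

# 2 x 3 x 1 = 6 scenarios, e.g. (10, 10, 24), (10, 15, 24), ...
scenarios = list(product(group1_means, group2_means, group3_means))
```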
The matched version of a grouped number list consists of a series of numeric lists each enclosed in parentheses. Each list consists of a value for each group and represents a single scenario for the analysis option. Multiple scenarios for the analysis option are represented by multiple lists. For example, you can express the crossed grouped-number-list
GROUPNS = 20 25 | 30 40 50
alternatively in a matched format:
GROUPNS = (20 30) (20 40) (20 50) (25 30) (25 40) (25 50)
The matched version is particularly useful when you wish to include only a subset of all combinations of individual group values. For example, you may want to pair 20 only with 50, and 25 only with 30 and 40:
GROUPNS = (20 50) (25 30) (25 40)
If the analysis can solve for a value in one group given the other groups, then you can replace the value for that group with a missing value indicator (.). If used, the missing value indicator must occur in the same group in every scenario. For example, you can solve for the group 1 sample size (as in the Crossed Grouped-number-lists section on page 3491) using a matched format:
GROUPNS = (. 30) (. 40) (. 50)
Some analyses can involve more than two groups. For example, you can specify two scenarios for the means of three groups in a one-way ANOVA:
GROUPMEANS = (15 24 32) (12 25 36)
A name-list is a list of one or more names in single or double quotes separated by spaces. For example, you can specify two scenarios for the reference survival curve in a log-rank test:
REFSURVIVAL = "Curve A" "Curve B"
A grouped-name-list specifies multiple scenarios for names in two or more groups. The list can assume one of two general forms, a crossed version and a matched version.
The crossed version of a grouped-name-list consists of a series of name-lists (see the Name-lists section on page 3492), one representing each group, each separated by a vertical bar (|). The values for each group represent multiple scenarios for that group, and the scenarios for each individual group are crossed to produce the set of all scenarios for the analysis option. For example, you can specify the following six scenarios for the survival curves (c1, c2) of two groups

(Curve A, Curve C)  (Curve A, Curve D)  (Curve A, Curve E)
(Curve B, Curve C)  (Curve B, Curve D)  (Curve B, Curve E)

as follows:

GROUPSURVIVAL = "Curve A" "Curve B" | "Curve C" "Curve D" "Curve E"
The matched version of a grouped name list consists of a series of name lists each enclosed in parentheses. Each list consists of a name for each group and represents a single scenario for the analysis option. Multiple scenarios for the analysis option are represented by multiple lists. For example, you can express the crossed grouped-name-list
GROUPSURVIVAL = "Curve A" "Curve B" | "Curve C" "Curve D" "Curve E"
alternatively in a matched format:
GROUPSURVIVAL = ("Curve A" "Curve C") ("Curve A" "Curve D") ("Curve A" "Curve E") ("Curve B" "Curve C") ("Curve B" "Curve D") ("Curve B" "Curve E")
The matched version is particularly useful when you wish to include only a subset of all combinations of individual group values. For example, you may want to pair Curve A only with Curve C , and Curve B only with Curve D and Curve E :
GROUPSURVIVAL = ("Curve A" "Curve C") ("Curve B" "Curve D") ("Curve B" "Curve E")
By default, PROC POWER rounds sample sizes conservatively (input sizes down, output sizes up) so that all total sizes (and individual group sample sizes, in a multigroup design) are integers. This is generally considered conservative because it selects the closest realistic design providing at most the power of the (possibly fractional) input or mathematically optimized design. In addition, in a multigroup design, all group sizes are adjusted to be multiples of the corresponding group weights. For example, if GROUPWEIGHTS = (2 6), then all group 1 sample sizes become multiples of 2, all group 2 sample sizes become multiples of 6, and all total sample sizes become multiples of 8.
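A rough Python sketch of the weight-multiple rounding just described (our reconstruction of the rule for illustration, not PROC POWER's actual algorithm):

```python
# Illustrative reconstruction: round an input total sample size down to the
# nearest multiple of the sum of the integer group weights, then split it
# proportionally across groups.
def round_total_to_weights(n_total, weights):
    block = sum(weights)                  # e.g. weights (2, 6) -> blocks of 8
    n_adj = (n_total // block) * block    # total rounded down to a multiple
    groups = [(n_adj // block) * w for w in weights]
    return n_adj, groups

# An input total of 27 with GROUPWEIGHTS = (2 6) becomes 24,
# split as group sizes 6 and 18.
n_adj, groups = round_total_to_weights(27, (2, 6))
```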
With the NFRACTIONAL option, sample size input is not rounded, and sample size output (whether total or groupwise) is reported in two versions: a raw fractional version and a ceiling version rounded up to the nearest integer.
Whenever an input sample size is adjusted, both the original (nominal) and adjusted (actual) sample sizes are reported. Whenever computed output sample sizes are adjusted, both the original input (nominal) power and the achieved (actual) power at the adjusted sample size are reported.
The Error column in the main output table explains reasons for missing results and flags numerical results that are bounds rather than exact answers. For example, consider the sample size analysis implemented by the following statements:
proc power;
   twosamplefreq test=pchi
      oddsratio = 1.0001
      refproportion = .4
      nulloddsratio = 1
      power = .9
      ntotal = .;
run;
The output in Figure 57.6 reveals that the sample size to achieve a power of 0.9 could not be computed, but that the sample size 2.15E+09 achieves a power of 0.206.
                     The POWER Procedure
          Pearson Chi-square Test for Two Proportions

                  Fixed Scenario Elements
          Distribution                  Asymptotic normal
          Method                        Normal approximation
          Null Odds Ratio               1
          Reference (Group 1) Proportion 0.4
          Odds Ratio                    1.0001
          Nominal Power                 0.9
          Number of Sides               2
          Alpha                         0.05
          Group 1 Weight                1
          Group 2 Weight                1

                     Computed N Total
          Actual Power    N Total    Error
                 0.206   2.15E+09    Solution is a lower bound
The Information column provides further details about Error entries, warnings about any boundary conditions detected, and notes about any adjustments to input. The Information column is hidden by default in the main output; to view it, save the output as a data set with the ODS OUTPUT statement and print it with the PRINT procedure. For example, the following SAS statements print both the Error and Info columns for a power computation in a two-sample t test.
proc power;
   twosamplemeans
      meandiff = 0 7
      stddev = 2
      ntotal = 2 5
      power = .;
   ods output output=Power;
run;

proc print noobs data=Power;
   var MeanDiff NominalNTotal NTotal Power Error Info;
run;
The output is shown in Figure 57.7.
                Nominal
    Mean Diff    NTotal    NTotal    Power    Error            Info
        0           2         2        .      Invalid input    N too small / No effect
        0           5         4      0.050                     Input N adjusted / No effect
        7           2         2        .      Invalid input    N too small
        7           5         4      0.477                     Input N adjusted
The mean difference of 0 specified with the MEANDIFF= option causes a No effect message to appear in the Info column. The sample size of 2 specified with the NTOTAL= option leads to an Invalid input message in the Error column and an N too small message in the Info column. The sample size of 5 leads to an Input N adjusted message in the Info column because it is rounded down to 4 to produce integer group sizes of 2 per group.
If you use the PLOTONLY option in the PROC POWER statement, the procedure displays only graphical output. Otherwise, the displayed output of the POWER procedure includes the following:
the Fixed Scenario Elements table, which shows all applicable single-valued analysis parameters, in the following order: distribution, method, parameters input explicitly, and parameters supplied with defaults
an output table showing the following when applicable (in order): the index of the scenario, all multivalued input, ancillary results, the primary computed result, and error descriptions
plots (if requested)
For each input parameter, the order of the input values is preserved in the output.
Ancillary results include the following:
Actual Power, the achieved power, if it differs from the input (Nominal) power value
Actual Prob(Width), the achieved precision probability, if it differs from the input (Nominal) probability value
Actual Alpha, the achieved significance level, if it differs from the input (Nominal) alpha value
fractional sample size, if the NFRACTIONAL option is used in the analysis statement
If sample size is the result parameter and the NFRACTIONAL option is used in the analysis statement, then both Fractional and Ceiling sample size results are displayed. Fractional sample sizes correspond to the Nominal values of power or precision probability. Ceiling sample sizes are simply the fractional sample sizes rounded up to the nearest integer; they correspond to Actual values of power or precision probability.
PROC POWER assigns a name to each table that it creates. You can use these names to reference the table when using the Output Delivery System (ODS) to select tables and create output data sets. These names are listed in Table 57.24. For more information on ODS, see Chapter 14, Using the Output Delivery System.
ODS Table Name | Description | Statement |
---|---|---|
FixedElements | factoid with single-valued analysis parameters | default [*] |
Output | all input and computed analysis parameters, error messages, and information messages for each scenario | default |
PlotContent | data contained in plots, including analysis parameters and indices identifying plot features. ( Note: this table is saved as a data set and not displayed in PROC POWER output.) | PLOT |
[*] Depends on input. |
The ODS path names are created as follows:
Power.<analysis statement name><n>.FixedElements
Power.<analysis statement name><n>.Output
Power.<analysis statement name><n>.PlotContent
Power.<analysis statement name><n>.Plot<m>
where
The Plot<m> objects are the graphs.
The <n> indexing the analysis statement name is used only if there is more than one instance.
The <m> indexing the plots increases with every panel in every plot statement, resetting to 1 only at new analysis statements.
In the TWOSAMPLESURVIVAL statement, the amount of required memory is roughly proportional to the product of the number of subintervals (specified by the NSUBINTERVAL= option) and the total time of the study (specified by the ACCRUALTIME=, FOLLOWUPTIME=, and TOTALTIME= options).
In the Satterthwaite t test analysis (TWOSAMPLEMEANS TEST=DIFF_SATT), the required CPU time grows as the mean difference decreases relative to the standard deviations. In the PAIREDFREQ statement, the required CPU time for the exact power computation (METHOD=EXACT) grows with the sample size.
This section describes the approaches used in PROC POWER to compute power for each analysis. The first subsection defines some common notation. The following subsections describe the various power analyses, including discussions of the data, statistical test, and power formula for each analysis. Unless otherwise indicated, computed values for parameters besides power (for example, sample size) are obtained by solving power formulas for the desired parameters.
Table 57.25 displays notation for some of the more common parameters across analyses. The Associated Syntax column shows examples of relevant analysis statement options, where applicable.
Symbol | Description | Associated Syntax
---|---|---
α | significance level | ALPHA=
N | total sample size | NTOTAL=, NPAIRS=
ni | sample size in ith group | NPERGROUP=, GROUPNS=
wi | allocation weight for ith group (standardized to sum to 1) | GROUPWEIGHTS=
μ | (arithmetic) mean | MEAN=
μi | (arithmetic) mean in ith group | GROUPMEANS=, PAIREDMEANS=
μdiff | (arithmetic) mean difference, μ2 − μ1 or μT − μR | MEANDIFF=
μ0 | null mean or mean difference (arithmetic) | NULL=, NULLDIFF=
γ | geometric mean | MEAN=
γi | geometric mean in ith group | GROUPMEANS=, PAIREDMEANS=
γ0 | null mean or mean ratio (geometric) | NULL=, NULLRATIO=
σ | standard deviation (or common standard deviation per group) | STDDEV=
σi | standard deviation in ith group | GROUPSTDDEVS=, PAIREDSTDDEVS=
σdiff | standard deviation of differences | 
CV | coefficient of variation, defined as the ratio of the standard deviation to the (arithmetic) mean | CV=, PAIREDCVS=
ρ | correlation | CORR=
μT, μR | treatment and reference (arithmetic) means for equivalence test | GROUPMEANS=, PAIREDMEANS=
γT, γR | treatment and reference geometric means for equivalence test | GROUPMEANS=, PAIREDMEANS=
θL | lower equivalence bound | LOWER=
θU | upper equivalence bound | UPPER=
t(ν, δ) | t distribution with d.f. ν and noncentrality δ | 
F(ν1, ν2, λ) | F distribution with numerator d.f. ν1, denominator d.f. ν2, and noncentrality λ | 
t p;ν | pth percentile of t distribution with d.f. ν | 
F p;ν1,ν2 | pth percentile of F distribution with numerator d.f. ν1 and denominator d.f. ν2 | 
Bin(N, p) | binomial distribution with sample size N and proportion p | 
A lower 1-sided test is associated with SIDES=L (or SIDES=1 with the effect smaller than the null value), and an upper 1-sided test is associated with SIDES=U (or SIDES=1 with the effect larger than the null value).
Owen (1965) defines a function, known as Owen's Q, that is convenient for representing terms in power formulas for confidence intervals and equivalence tests:
where φ(·) and Φ(·) are the density and cumulative distribution function of the standard normal distribution, respectively.
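The displayed formula for Owen's Q was lost in extraction; following Owen (1965), it can be written (our reconstruction) as an integral of Φ against a scaled chi kernel. A direct numerical version in Python, checked against the noncentral t CDF, which equals Qν(t, δ; 0, ∞):

```python
# Reconstruction of Owen's Q for illustration:
#   Q_nu(t, delta; a, b) = sqrt(2*pi) / (Gamma(nu/2) * 2**((nu-2)/2))
#       * integral_a^b Phi(t*x/sqrt(nu) - delta) * x**(nu-1) * phi(x) dx
# where phi and Phi are the standard normal pdf and cdf.
import math
from scipy.integrate import quad
from scipy.stats import nct, norm

def owens_q(nu, t, delta, a, b):
    const = math.sqrt(2 * math.pi) / (math.gamma(nu / 2) * 2 ** ((nu - 2) / 2))
    integrand = lambda x: (norm.cdf(t * x / math.sqrt(nu) - delta)
                           * x ** (nu - 1) * norm.pdf(x))
    val, _ = quad(integrand, a, b)
    return const * val

# The constant times x**(nu-1)*phi(x) is exactly the chi(nu) density, so
# Q_nu(t, delta; 0, infinity) = P(T <= t) for T ~ noncentral t(nu, delta).
q = owens_q(10, 2.0, 1.0, 0, math.inf)
```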
Maxwell (2000) discusses a number of different ways to represent effect sizes (and to compute exact power based on them) in multiple regression. PROC POWER supports two of these: the multiple partial correlation, and the R² values of full and reduced models.
Let p denote the total number of predictors in the full model (excluding the intercept) and Y the response variable. You are testing that the coefficients of p1 ≥ 1 predictors in a set X1 are 0, controlling for all of the other predictors X−1, a set of p − p1 variables.
The hypotheses can be expressed in two different ways. The first is in terms of the multiple partial correlation between the predictors in X1 and the response Y, adjusting for the predictors in X−1:
The second is in terms of the multiple correlations in full and reduced nested models:
Note that the squares of the full-model and reduced-model multiple correlations are the population R² values for the full and reduced models.
The test statistic can be written in terms of the sample multiple partial correlation
or the sample multiple correlations in full and reduced models,
The test is the usual Type III F test in multiple regression:
Although the test is invariant to whether the predictors are assumed to be random or fixed, the power is affected by this assumption. If the response and predictors are assumed to have a joint multivariate normal distribution, then the exact power is given by the following formula:
The distribution of the sample multiple partial correlation (for any true value) is given in Chapter 32 of Johnson, Kotz, and Balakrishnan (1995). Sample size tables are presented in Gatsonis and Sampson (1989).
If the predictors are assumed to have fixed values, then the exact power is given by the noncentral F distribution. The noncentrality parameter is
or equivalently,
The power is
The minimum acceptable input value of N depends on several factors, as shown in Table 57.26.
Predictor Type | Intercept in Model? | p1 = 1? | Minimum N
---|---|---|---|
Random | Yes | Yes | p + 3 |
Random | Yes | No | p + 2 |
Random | No | Yes | p + 2 |
Random | No | No | p + 1 |
Fixed | Yes | Yes or No | p + 2 |
Fixed | No | Yes or No | p + 1 |
Fisher's z transformation (Fisher 1921) of the sample correlation is defined as
Fisher's z test assumes the approximate normal distribution N(μ, σ²) for z, where
and
where p* is the number of variables partialled out (Anderson 1984, pp. 132–133) and ρ is the partial correlation between Y and X1 adjusting for the set of zero or more variables X−1.
The test statistic
is assumed to have a normal distribution N(δ, ν), where ρ0 is the null partial correlation and δ and ν are derived from Section 16.33 of Stuart and Ord (1994):
The approximate power is computed as
Because the test is biased, the achieved significance level may differ from the nominal significance level. The actual alpha is computed in the same way as the power except with the correlation ρ replaced by the null correlation ρ0.
The 2-sided case is identical to multiple regression with an intercept and p1 = 1, which is discussed in the Analyses in the MULTREG Statement section on page 3500.
Let p* denote the number of variables partialled out. For the 1-sided cases, the test statistic is
which is assumed to have a null distribution of t(N − 2 − p*).
If the X and Y variables are assumed to have a joint multivariate normal distribution, then the exact power is given by the following formula:
The distribution of the test statistic (given the underlying true correlation) is given in Chapter 32 of Johnson, Kotz, and Balakrishnan (1995).
If the X variables are assumed to have fixed values, then the exact power is given by the noncentral t distribution t(N − 2 − p*, δ), where the noncentrality δ is
The power is
Let X be distributed as Bin( N, p ). The hypotheses for the test of the proportion p are as follows:
The exact test assumes binomially distributed data and requires N ≥ 1 and 0 < p < 1. The test statistic is
The significance probability α is split symmetrically for 2-sided tests, in the sense that each tail is filled with as much as possible up to α/2.
Exact power computations are based on the binomial distribution and computing formulas such as the following from Johnson and Kotz (1970, equation 3.20):
where ν1 = 2C and ν2 = 2(N − C + 1)
Let CL and CU denote lower and upper critical values, respectively. Let αa denote the achieved (actual) significance level, which for 2-sided tests is the sum of the favorable major tail (αM) and the opposite minor tail (αm).
For the upper 1-sided case,
For the lower 1-sided case,
For the 2-sided case,
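The displayed critical-value and power formulas above were lost in extraction, but the exact computation can be sketched for the upper 1-sided case (a Python illustration with scipy, not PROC POWER; the critical-value search is our reconstruction):

```python
# Sketch of exact power for the upper 1-sided exact binomial test:
# find the smallest critical value C_U with P(X >= C_U | p0) <= alpha,
# then evaluate the same tail probability under the alternative p1.
from scipy.stats import binom

def exact_binomial_power_upper(n, p0, p1, alpha=0.05):
    for c in range(n + 1):
        upper_tail_null = 1 - binom.cdf(c - 1, n, p0)   # P(X >= c | p0)
        if upper_tail_null <= alpha:
            achieved_alpha = upper_tail_null            # actual alpha
            power = 1 - binom.cdf(c - 1, n, p1)         # P(X >= c | p1)
            return c, achieved_alpha, power
    return n + 1, 0.0, 0.0   # test can never reject

c_u, actual_alpha, power = exact_binomial_power_upper(n=50, p0=0.2, p1=0.35)
```

Because the binomial distribution is discrete, the achieved alpha is typically below the nominal level, which is why PROC POWER reports both nominal and actual alpha.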
For the normal approximation test, the test statistic is
For the METHOD=EXACT option, the computations are the same as described in the Exact Test of a Binomial Proportion (TEST=EXACT) section on page 3504 except for the definitions of the critical values.
For the upper 1-sided case,
For the lower 1-sided case,
For the 2-sided case,
For the METHOD=NORMAL option, the test statistic Z ( X ) is assumed to have the normal distribution
The approximate power is computed as
The approximate sample size is computed in closed form for the 1-sided cases by inverting the power equation,
and by numerical inversion for the 2-sided case.
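The 1-sided closed-form inversion can be sketched as follows (a Python illustration with scipy; the formulas are the standard normal-approximation expressions, reconstructed here because the displays were lost):

```python
# Approximate power of the upper 1-sided one-sample z test for a binomial
# proportion, and the closed-form sample size obtained by inverting it.
import math
from scipy.stats import norm

def z_prop_power(n, p0, p1, alpha=0.05):
    za = norm.ppf(1 - alpha)
    num = math.sqrt(n) * (p1 - p0) - za * math.sqrt(p0 * (1 - p0))
    return norm.cdf(num / math.sqrt(p1 * (1 - p1)))

def z_prop_n(p0, p1, power, alpha=0.05):
    # Solve the power equation for n, then take the ceiling.
    za, zb = norm.ppf(1 - alpha), norm.ppf(power)
    n = ((za * math.sqrt(p0 * (1 - p0)) + zb * math.sqrt(p1 * (1 - p1)))
         / (p1 - p0)) ** 2
    return math.ceil(n)

n = z_prop_n(0.2, 0.3, power=0.9, alpha=0.05)
```

The ceiling guarantees that the approximate power at the returned n is at least the requested value.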
For the normal approximation test with continuity adjustment, the test statistic is (Pagano and Gauvreau 1993, p. 295):
For the METHOD=EXACT option, the computations are the same as described in the Exact Test of a Binomial Proportion (TEST=EXACT) section on page 3504 except for the definitions of the critical values.
For the upper 1-sided case,
For the lower 1-sided case,
For the 2-sided case,
For the METHOD=NORMAL option, the test statistic Zc(X) is assumed to have the normal distribution N(μ, σ²), where μ and σ² are derived as follows.
For convenience of notation, define
Then
and
The probabilities P(X = Np), P(X < Np), and P(X > Np) and the truncated expectations are approximated by assuming the normal-approximate distribution of X, N(Np, Np(1 − p)). Letting φ(·) and Φ(·) denote the standard normal PDF and CDF, respectively, and defining d as
the terms are computed as follows:
The mean and variance of Z c ( X ) are thus approximated by
and
The approximate power is computed as
The hypotheses for the one-sample t test are
The test assumes normally distributed data and requires N ≥ 2. The test statistics are
where x̄ is the sample mean, s is the sample standard deviation, and
The test is
Exact power computations for t tests are discussed in O'Brien and Muller (1993, Section 8.2), although not specifically for the one-sample case. The power is based on the noncentral t and F distributions:
Solutions for N, α, and the effect parameters are obtained by numerically inverting the power equation. Closed-form solutions for other parameters, in terms of the noncentrality δ, are as follows:
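The noncentral t power computation can be sketched as follows (a Python illustration with scipy; the function name and example values are ours, and the noncentrality form is the standard one since the display was lost):

```python
# Exact power of the 2-sided one-sample t test via the noncentral t
# distribution: noncentrality delta = mean_diff / (stddev / sqrt(n)).
from scipy.stats import nct
from scipy.stats import t as t_dist

def one_sample_t_power(mean_diff, stddev, n, alpha=0.05):
    nu = n - 1
    delta = mean_diff / (stddev / n ** 0.5)
    t_crit = t_dist.ppf(1 - alpha / 2, nu)
    # P(reject) = P(T > t_crit) + P(T < -t_crit), T ~ noncentral t(nu, delta)
    return (1 - nct.cdf(t_crit, nu, delta)) + nct.cdf(-t_crit, nu, delta)

power = one_sample_t_power(mean_diff=5, stddev=12, n=50, alpha=0.05)
```

At mean_diff = 0 the noncentral t reduces to the central t, and the function returns exactly alpha.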
The lognormal case is handled by re-expressing the analysis equivalently as a normality-based test on the log-transformed data, using properties of the lognormal distribution as discussed in Johnson and Kotz (1970, Chapter 14). The approaches in the One-sample t Test (TEST=T) section on page 3508 then apply.
In contrast to the usual t test on normal data, the hypotheses with lognormal data are defined in terms of geometric means rather than arithmetic means. This is because the transformation of a null arithmetic mean of lognormal data to the normal scale depends on the unknown coefficient of variation, resulting in an ill-defined hypothesis on the log-transformed data. Geometric means transform cleanly and are more natural for lognormal data.
The hypotheses for the one-sample t test with lognormal data are
Let μ* and σ* be the (arithmetic) mean and standard deviation of the normal distribution of the log-transformed data. The hypotheses can be rewritten as follows:
where μ* = log(γ).
The test assumes lognormally distributed data and requires N ≥ 2.
The power is
where
The hypotheses for the equivalence test are
The analysis is the two one-sided tests (TOST) procedure of Schuirmann (1987). The test assumes normally distributed data and requires N ≥ 2. Phillips (1990) derives an expression for the exact power assuming a two-sample balanced design; the results are easily adapted to a one-sample design:
where Qν(·, ·; ·, ·) is Owen's Q function, defined in the Common Notation section on page 3498.
The lognormal case is handled by re-expressing the analysis equivalently as a normality-based test on the log-transformed data, using properties of the lognormal distribution as discussed in Johnson and Kotz (1970, chapter 14). The approaches in the Equivalence Test for Mean of Normal Data (TEST=EQUIV DIST=NORMAL) section on page 3510 then apply.
In contrast to the additive equivalence test on normal data, the hypotheses with lognormal data are defined in terms of geometric means rather than arithmetic means. This is because the transformation of an arithmetic mean of lognormal data to the normal scale depends on the unknown coefficient of variation, resulting in an ill-defined hypothesis on the log-transformed data. Geometric means transform cleanly and are more natural for lognormal data.
The hypotheses for the equivalence test are
The analysis is the two one-sided tests (TOST) procedure of Schuirmann (1987) on the log-transformed data. The test assumes lognormally distributed data and requires N ≥ 2. Diletti, Hauschke, and Steinijans (1991) derive an expression for the exact power assuming a crossover design; the results are easily adapted to a one-sample design:
where
is the standard deviation of the log-transformed data, and Qν(·, ·; ·, ·) is Owen's Q function, defined in the Common Notation section on page 3498.
This analysis of precision applies to the standard t-based confidence interval:
where x̄ is the sample mean and s is the sample standard deviation. The half-width is defined as the distance from the point estimate x̄ to a finite endpoint,
A valid confidence interval captures the true mean. The exact probability of obtaining at most the target confidence interval half-width h, unconditional or conditional on validity, is given by Beal (1989):
where
and Qν(·, ·; ·, ·) is Owen's Q function, defined in the Common Notation section on page 3498.
A quality confidence interval is both sufficiently narrow (half-width ≤ h) and valid:
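For the unconditional case, Prob(Width) reduces to a chi-square probability, since the observed half-width is monotone in s² and (N − 1)s²/σ² has a chi-square distribution with N − 1 degrees of freedom. A Python sketch of this special case (our reconstruction; the conditional-on-validity version additionally requires Owen's Q):

```python
# Unconditional P(half-width <= h) for the t-based confidence interval:
# the observed half-width t_{1-alpha/2, N-1} * s / sqrt(N) is at most h
# exactly when (N-1)*s^2/sigma^2 <= (N-1)*h^2*N / (t_crit^2 * sigma^2).
from scipy.stats import chi2
from scipy.stats import t as t_dist

def prob_halfwidth(n, sigma, h, alpha=0.05):
    t_crit = t_dist.ppf(1 - alpha / 2, n - 1)
    bound = (n - 1) * h ** 2 * n / (t_crit ** 2 * sigma ** 2)
    return chi2.cdf(bound, n - 1)

p = prob_halfwidth(n=30, sigma=4, h=1.8, alpha=0.05)
```

As expected, the probability increases with both the target half-width h and the sample size n.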
The hypotheses are
where G is the number of groups, c1, …, cG are the contrast coefficients, and c0 is the null contrast value.
The test is the usual F test for a contrast in one-way ANOVA. It assumes normal data with common group variances and requires N ≥ G + 1 and ni ≥ 1.
O'Brien and Muller (1993, Section 8.2.3.2) give the exact power as
where
The hypotheses are
where G is the number of groups.
The test is the usual overall F test for equality of means in one-way ANOVA. It assumes normal data with common group variances and requires N ≥ G + 1 and ni ≥ 1.
O'Brien and Muller (1993, Section 8.2.3.1) give the exact power as
where the noncentrality is
and
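The overall F test power above reduces to evaluating a noncentral F tail probability. A minimal sketch, assuming `scipy` and using the standard noncentrality λ = Σ n_i (μ_i − μ̄)² / σ² (the function name is illustrative):

```python
import numpy as np
from scipy import stats

def overall_f_power(group_means, group_ns, sigma, alpha=0.05):
    """Power of the overall one-way ANOVA F test via the noncentral F
    distribution, a sketch of the O'Brien-Muller computation."""
    mu = np.asarray(group_means, dtype=float)
    n = np.asarray(group_ns, dtype=float)
    G, N = len(mu), n.sum()
    mubar = (n * mu).sum() / N                        # weighted grand mean
    lam = (n * (mu - mubar) ** 2).sum() / sigma ** 2  # noncentrality
    fcrit = stats.f.ppf(1 - alpha, G - 1, N - G)      # critical value
    return stats.ncf.sf(fcrit, G - 1, N - G, lam)     # P(F' > fcrit)
```

Power increases with the spread of the group means, which provides a quick monotonicity check against published tables.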
Notation:
|         |         | Case: Failure | Case: Success | Total |
|---------|---------|---------------|---------------|-------|
| Control | Failure | n_00          | n_01          | n_0·  |
| Control | Success | n_10          | n_11          | n_1·  |
| Total   |         | n_·0          | n_·1          | N     |
n_00 = #{control=failure, case=failure}
n_01 = #{control=failure, case=success}
n_10 = #{control=success, case=failure}
n_11 = #{control=success, case=success}
N = n_00 + n_01 + n_10 + n_11
n_D = n_01 + n_10 ≡ # discordant pairs
π_ij = theoretical population value of the proportion p_ij = n_ij / N
π_1· = π_10 + π_11
π_·1 = π_01 + π_11
OR_0 = null odds ratio
All McNemar tests covered in PROC POWER are conditional, meaning that n_D is assumed fixed at its observed value.
For the usual null OR_0 = 1, the hypotheses are
The test statistic for both tests covered in PROC POWER (DIST=EXACT_COND and DIST=NORMAL) is the McNemar statistic Q_M, which has the following form when OR_0 = 1:
For the conditional McNemar tests, this is equivalent to the square of the Z(X) statistic for the test of a single proportion (normal approximation to the binomial), where the proportion tested is the success proportion among the discordant pairs, the null value is 0.5, and the sample size is n_D (see, e.g., Schork and Williams 1980):
This can be generalized to a custom null value for the proportion, which is equivalent to specifying a custom odds ratio:
So, a conditional McNemar test (asymptotic or exact) with a custom null odds ratio OR_0 is equivalent to the test of a single proportion with null value p_0 = OR_0 / (1 + OR_0) and a sample size of n_D:
which is equivalent to
The general form of the test statistic is thus
The two most common conditional McNemar tests assume either the exact conditional distribution of Q_M (covered by the DIST=EXACT_COND analysis) or a standard normal distribution for Q_M (covered by the DIST=NORMAL analysis).
For DIST=EXACT_COND, the power is calculated assuming that the test is conducted using the exact conditional distribution of Q_M (conditional on n_D). The power is calculated by first computing the conditional power for each possible n_D. The unconditional power is computed as a weighted average over all possible outcomes of n_D:
where n_D ~ Bin(π_01 + π_10, N), and P(Reject | n_D) is calculated using the exact method in the Exact Test of a Binomial Proportion (TEST=EXACT) section on page 3504.
The achieved significance level, reported as Actual Alpha in the analysis, is computed in the same way except using the actual alpha of the one-sample test in place of its power:
where α*(n_D) is the actual alpha calculated using the exact method in the Exact Test of a Binomial Proportion (TEST=EXACT) section on page 3504 with proportion p_1, null p_0, and sample size n_D.
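The weighted-average construction above can be sketched directly: for each possible n_D, compute the power of an exact binomial test and weight by the binomial probability of that n_D. This is an illustrative sketch assuming `scipy`; the exact-test rejection rule here uses the doubled-tail p-value convention, which is one common (conservative) choice and may differ in detail from PROC POWER's exact method:

```python
import numpy as np
from scipy import stats

def mcnemar_exact_cond_power(pi01, pi10, N, alpha=0.05, or0=1.0):
    """Sketch of DIST=EXACT_COND power: average exact binomial test
    power over the Bin(pi01 + pi10, N) distribution of n_D."""
    p_disc = pi01 + pi10            # P(a pair is discordant)
    p1 = pi01 / p_disc              # tested proportion among discordant pairs
    p0 = or0 / (1.0 + or0)          # null value implied by the null odds ratio
    total = 0.0
    for nD in range(N + 1):
        if nD == 0:
            continue                # no discordant pairs: cannot reject
        w = stats.binom.pmf(nD, N, p_disc)
        k = np.arange(nD + 1)
        # two-sided exact p-value by doubling the smaller tail, capped at 1
        pvals = np.minimum(1.0, 2 * np.minimum(stats.binom.cdf(k, nD, p0),
                                               stats.binom.sf(k - 1, nD, p0)))
        reject = pvals <= alpha
        total += w * stats.binom.pmf(k, nD, p1)[reject].sum()
    return total
```

Under the null (π_01 = π_10 with OR_0 = 1) the returned value is the achieved significance level, which the conservative exact test keeps at or below α.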
For DIST=NORMAL, power is calculated assuming the test is conducted using the normal-approximate distribution of Q M (conditional on n D ).
For the METHOD=EXACT option, the power is calculated in the same way as described in the McNemar Exact Conditional Test (TEST=MCNEMAR DIST=EXACT_COND) section on page 3516, except that P(Reject | n_D) is calculated using the exact method in the z Test for Binomial Proportion (TEST=Z) section on page 3505. The achieved significance level is calculated in the same way as described at the end of the McNemar Exact Conditional Test (TEST=MCNEMAR DIST=EXACT_COND) section on page 3516.
For the METHOD=MIETTINEN option, approximate sample size for the 1-sided cases is computed according to equation (5.6) in Miettinen (1968):
Approximate power for the 1-sided cases is computed by solving the sample size equation for power, and approximate power for the 2-sided case follows easily by summing the 1-sided powers each at α/2:
The 2-sided solution for N is obtained by numerically inverting the power equation.
In general, compared to METHOD=CONNOR, the METHOD=MIETTINEN approximation tends to be slightly more accurate but may be slightly anticonservative in the sense of underestimating sample size and overestimating power (Lachin 1992, p. 1250).
For the METHOD=CONNOR option, approximate sample size for the 1-sided cases is computed according to equation (3) in Connor (1987):
Approximate power for the 1-sided cases is computed by solving the sample size equation for power, and approximate power for the 2-sided case follows easily by summing the 1-sided powers each at α/2:
The 2-sided solution for N is obtained by numerically inverting the power equation.
In general, compared to METHOD=MIETTINEN, the METHOD=CONNOR approximation tends to be slightly less accurate but slightly conservative in the sense of overestimating sample size and underestimating power (Lachin 1992, p. 1250).
The hypotheses for the paired t test are
The test assumes normally distributed data and requires N ≥ 2. The test statistics are
where d̄ and s_d are the sample mean and standard deviation of the differences and
and
The test is
Exact power computations for t tests are given in O'Brien and Muller (1993, section 8.2.2):
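The exact paired t power is a noncentral t tail probability with noncentrality δ = μ_diff / (σ_diff / √N). A minimal sketch assuming `scipy` (the function name and argument names are illustrative):

```python
import numpy as np
from scipy import stats

def paired_t_power(mean_diff, std_diff, n, alpha=0.05, sides=2):
    """Exact paired t test power via the noncentral t distribution,
    a sketch of the O'Brien-Muller computation."""
    df = n - 1
    delta = mean_diff / (std_diff / np.sqrt(n))   # noncentrality parameter
    if sides == 2:
        tcrit = stats.t.ppf(1 - alpha / 2, df)
        # probability of landing in either rejection region
        return stats.nct.sf(tcrit, df, delta) + stats.nct.cdf(-tcrit, df, delta)
    tcrit = stats.t.ppf(1 - alpha, df)
    return stats.nct.sf(tcrit, df, delta)         # upper 1-sided case
```

As expected, the upper 1-sided test is more powerful than the 2-sided test for a positive mean difference at the same α.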
The lognormal case is handled by re-expressing the analysis equivalently as a normality-based test on the log-transformed data, using properties of the lognormal distribution as discussed in Johnson and Kotz (1970, chapter 14). The approaches in the Paired t Test (TEST=DIFF) section on page 3518 then apply.
In contrast to the usual t test on normal data, the hypotheses with lognormal data are defined in terms of geometric means rather than arithmetic means.
The hypotheses for the paired t test with lognormal pairs { Y 1 , Y 2 } are
Let μ_1*, μ_2*, σ_1*, σ_2*, and ρ* be the (arithmetic) means, standard deviations, and correlation of the bivariate normal distribution of the log-transformed data {log Y_1, log Y_2}. The hypotheses can be rewritten as follows:
where
and CV_1, CV_2, and ρ are the coefficients of variation and the correlation of the original untransformed pairs {Y_1, Y_2}. The conversion from ρ to ρ* is shown in Jones and Miller (1966).
The test assumes lognormally distributed data and requires N ≥ 2. The power is
where
and
The hypotheses for the equivalence test are
The analysis is the two one-sided tests (TOST) procedure of Schuirmann (1987). The test assumes normally distributed data and requires N ≥ 2. Phillips (1990) derives an expression for the exact power assuming a two-sample balanced design; the results are easily adapted to a paired design:
where
and Q(·, ·; ·, ·) is Owen's Q function, defined in the Common Notation section on page 3498.
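Owen's Q function has no simple closed form, but the TOST decision rule itself (reject both one-sided hypotheses) is easy to state, so the exact power can be checked by simulation. A Monte Carlo sketch for paired differences, assuming `numpy` and `scipy`; this is an illustrative check, not the Phillips formula:

```python
import numpy as np
from scipy import stats

def tost_power_mc(mean_diff, std_diff, n, lower, upper,
                  alpha=0.05, n_sims=20000, seed=7):
    """Monte Carlo sketch of Schuirmann's TOST power for paired data:
    equivalence is declared iff both one-sided t tests reject."""
    rng = np.random.default_rng(seed)
    tcrit = stats.t.ppf(1 - alpha, n - 1)
    diffs = rng.normal(mean_diff, std_diff, size=(n_sims, n))
    dbar = diffs.mean(axis=1)
    se = diffs.std(axis=1, ddof=1) / np.sqrt(n)
    t_lower = (dbar - lower) / se       # test of H0: diff <= lower bound
    t_upper = (dbar - upper) / se       # test of H0: diff >= upper bound
    return np.mean((t_lower > tcrit) & (t_upper < -tcrit))
```

Power is highest when the true difference sits at the center of the equivalence interval and falls off toward the bounds, which is a useful qualitative check against the exact Owen's Q result.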
The lognormal case is handled by re-expressing the analysis equivalently as a normality-based test on the log-transformed data, using properties of the lognormal distribution as discussed in Johnson and Kotz (1970, chapter 14). The approaches in the Additive Equivalence Test for Mean Difference with Normal Data (TEST=EQUIV_DIFF) section on page 3520 then apply.
In contrast to the additive equivalence test on normal data, the hypotheses with lognormal data are defined in terms of geometric means rather than arithmetic means.
The hypotheses for the equivalence test are
where 0 < θ_L < θ_U.
The analysis is the two one-sided tests (TOST) procedure of Schuirmann (1987) on the log-transformed data. The test assumes lognormally distributed data and requires N ≥ 2. Diletti, Hauschke, and Steinijans (1991) derive an expression for the exact power assuming a crossover design; the results are easily adapted to a paired design:
where σ* is the standard deviation of the differences between the log-transformed pairs (in other words, the standard deviation of log(Y_T) − log(Y_R), where Y_T and Y_R are observations from the treatment and reference, respectively), computed as
where CV_R, CV_T, and ρ are the coefficients of variation and the correlation of the original untransformed pairs {Y_T, Y_R}, and Q(·, ·; ·, ·) is Owen's Q function. The conversion from ρ to σ* is shown in Jones and Miller (1966), and Owen's Q function is defined in the Common Notation section on page 3498.
This analysis of precision applies to the standard t-based confidence interval:
where d̄ and s_d are the sample mean and standard deviation of the differences. The half-width is defined as the distance from the point estimate d̄ to a finite endpoint,
A valid confidence interval captures the true mean difference. The exact probability of obtaining at most the target confidence interval half-width h, unconditional or conditional on validity, is given by Beal (1989):
where
and Q(·, ·; ·, ·) is Owen's Q function, defined in the Common Notation section on page 3498.
A quality confidence interval is both sufficiently narrow (half-width ≤ h) and valid:
Notation:
|         | Group 1   | Group 2   | Total |
|---------|-----------|-----------|-------|
| Success | x_1       | x_2       | m     |
| Failure | n_1 − x_1 | n_2 − x_2 | N − m |
| Total   | n_1       | n_2       | N     |
x 1 = # successes in group 1
x 2 = # successes in group 2
m = x 1 + x 2 = total # successes
The hypotheses are
where p_0 is constrained to be 0 for all but the unconditional Pearson chi-square test.
Internal calculations are performed in terms of p_1, p_2, and p_0. An input set consisting of OR, p_1, and OR_0 is transformed as follows:
An input set consisting of RR, p_1, and RR_0 is transformed as follows:
Note that the transformation of either OR_0 or RR_0 to p_0 is not unique. The chosen parameterization fixes the null value p_10 at the input value of p_1.
The usual Pearson chi-square test is unconditional. The test statistic
is assumed to have a null distribution of N (0 , 1).
Sample size for the 1-sided cases is given by equation (4) in Fleiss, Tytun, and Ury (1980). One-sided power is computed as suggested by Diegert and Diegert (1981) by inverting the sample size formula. Power for the 2-sided case is computed by adding the lower-sided and upper-sided powers each with α/2, and sample size for the 2-sided case is obtained by numerically inverting the power formula. A custom null value p_0 for the proportion difference p_2 − p_1 is also supported.
For the 1-sided cases, a closed-form inversion of the power equation yields an approximate total sample size
For the 2-sided case, the solution for N is obtained by numerically inverting the power equation.
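The shape of such a closed-form inversion can be illustrated with the standard normal-approximation sample size formula for two proportions. This is a sketch, not the exact Fleiss-Tytun-Ury equation (which adds a continuity-style refinement); it assumes `scipy` and a balanced design, and the function name is illustrative:

```python
import math
from scipy import stats

def two_prop_n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-group sample size for the 2-sided two-proportion
    z test (Pearson chi-square), standard normal approximation."""
    za = stats.norm.ppf(1 - alpha / 2)     # critical value at alpha/2
    zb = stats.norm.ppf(power)             # quantile for target power
    pbar = (p1 + p2) / 2                   # pooled proportion under H0
    num = (za * math.sqrt(2 * pbar * (1 - pbar))
           + zb * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p2 - p1) ** 2)
```

Larger effect sizes invert to smaller sample sizes, mirroring how the power equation and its closed-form inversion trade off.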
The usual likelihood ratio chi-square test is unconditional. The test statistic
is assumed to have a null distribution of N(0, 1) and an alternative distribution of N(δ, 1), where
The approximate power is
For the 1-sided cases, a closed-form inversion of the power equation yields an approximate total sample size
For the 2-sided case, the solution for N is obtained by numerically inverting the power equation.
Fisher's exact test is conditional on the observed total number of successes m. Power and sample size computations for the METHOD=WALTERS option are based on a test with similar power properties, the continuity-adjusted arcsine test. The test statistic
is assumed to have a null distribution of N(0, 1) and an alternative distribution of N(δ, 1), where
The approximate power for the 1-sided balanced case is given by Walters (1979) and is easily extended to the unbalanced and 2-sided cases:
The hypotheses for the two-sample t test are
The test assumes normally distributed data and common standard deviation per group, and it requires N ≥ 3, n_1 ≥ 1, and n_2 ≥ 1. The test statistics are
where x̄_1 and x̄_2 are the sample means and s_p is the pooled standard deviation, and
The test is
Exact power computations for t tests are given in O'Brien and Muller (1993, section 8.2.1):
Solutions for N, n_1, n_2, α, and δ are obtained by numerically inverting the power equation. Closed-form solutions for other parameters, in terms of δ, are as follows:
Finally, here is a derivation of the solution for w 1 :
Solve the equation for w_1 (which requires the quadratic formula). Then determine the range of δ given w_1:
This implies
The hypotheses for the two-sample Satterthwaite t test are
The test assumes normally distributed data and requires N ≥ 3, n_1 ≥ 1, and n_2 ≥ 1. The test statistics are
where x̄_1 and x̄_2 are the sample means and s_1 and s_2 are the sample standard deviations.
As DiSantostefano and Muller (1995, p. 585) state, the test is based on assuming that under H, F is distributed as F(1, ν), where ν is given by Satterthwaite's approximation (Satterthwaite 1946),
Since ν is unknown, in practice it must be replaced by an estimate
So the test is
Exact solutions for power for the 2-sided and upper 1-sided cases are given in Moser, Stevens, and Watts (1989). The lower 1-sided case follows easily using symmetry. The equations are as follows:
where
The density f(u) is obtained from the fact that
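The Satterthwaite degrees-of-freedom estimate used throughout this analysis has a simple closed form. A minimal sketch (the function name is illustrative):

```python
def satterthwaite_df(s1, n1, s2, n2):
    """Satterthwaite's (1946) approximate degrees of freedom for the
    unequal-variance two-sample t statistic, from sample SDs s1, s2."""
    v1 = s1 ** 2 / n1              # estimated variance of mean 1
    v2 = s2 ** 2 / n2              # estimated variance of mean 2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
```

When the two sample variances and sizes are equal, the estimate reduces to the pooled-test value n_1 + n_2 − 2; otherwise it falls between min(n_1 − 1, n_2 − 1) and n_1 + n_2 − 2.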
The lognormal case is handled by re-expressing the analysis equivalently as a normality-based test on the log-transformed data, using properties of the lognormal distribution as discussed in Johnson and Kotz (1970, chapter 14). The approaches in the Two-sample t Test Assuming Equal Variances (TEST=DIFF) section on page 3526 then apply.
In contrast to the usual t test on normal data, the hypotheses with lognormal data are defined in terms of geometric means rather than arithmetic means. The test assumes equal coefficients of variation in the two groups.
The hypotheses for the two-sample t test with lognormal data are
Let μ_1*, μ_2*, and σ* be the (arithmetic) means and common standard deviation of the corresponding normal distributions of the log-transformed data. The hypotheses can be rewritten as follows:
where
The test assumes lognormally distributed data and requires N ≥ 3, n_1 ≥ 1, and n_2 ≥ 1.
The power is
where
The hypotheses for the equivalence test are
The analysis is the two one-sided tests (TOST) procedure of Schuirmann (1987). The test assumes normally distributed data and requires N ≥ 3, n_1 ≥ 1, and n_2 ≥ 1. Phillips (1990) derives an expression for the exact power assuming a balanced design; the results are easily adapted to an unbalanced design:
where Q(·, ·; ·, ·) is Owen's Q function, defined in the Common Notation section on page 3498.
The lognormal case is handled by re-expressing the analysis equivalently as a normality-based test on the log-transformed data, using properties of the lognormal distribution as discussed in Johnson and Kotz (1970, chapter 14). The approaches in the Additive Equivalence Test for Mean Difference with Normal Data (TEST=EQUIV_DIFF) section on page 3530 then apply.
In contrast to the additive equivalence test on normal data, the hypotheses with lognormal data are defined in terms of geometric means rather than arithmetic means.
The hypotheses for the equivalence test are
where 0 < θ_L < θ_U.
The analysis is the two one-sided tests (TOST) procedure of Schuirmann (1987) on the log-transformed data. The test assumes lognormally distributed data and requires N ≥ 3, n_1 ≥ 1, and n_2 ≥ 1. Diletti, Hauschke, and Steinijans (1991) derive an expression for the exact power assuming a crossover design; the results are easily adapted to an unbalanced two-sample design:
where
is the (assumed common) standard deviation of the normal distribution of the log-transformed data, and Q(·, ·; ·, ·) is Owen's Q function, defined in the Common Notation section on page 3498.
This analysis of precision applies to the standard t-based confidence interval:
where x̄_1 and x̄_2 are the sample means and s_p is the pooled standard deviation. The half-width is defined as the distance from the point estimate x̄_2 − x̄_1 to a finite endpoint,
A valid confidence interval captures the true mean difference. The exact probability of obtaining at most the target confidence interval half-width h, unconditional or conditional on validity, is given by Beal (1989):
where
and Q(·, ·; ·, ·) is Owen's Q function, defined in the Common Notation section on page 3498.
A quality confidence interval is both sufficiently narrow (half-width ≤ h) and valid:
The method is from Lakatos (1988) and Cantor (1997, pp. 83–92).
Define the following notation:
X_j(i) = ith input time point on survival curve for group j
S_j(i) = input survivor function value corresponding to X_j(i)
h_j(t) = hazard rate for group j at time t
δ_j(t) = loss hazard rate for group j at time t
λ_j = exponential hazard rate for group j
R = hazard ratio of group 2 to group 1 ≡ (assumed constant) value of h_2(t)/h_1(t)
m_j = median survival time for group j
b = number of subintervals per time unit
T = accrual time
τ = post-accrual follow-up time
L_j = exponential loss rate for group j
XL_j = input time point on loss curve for group j
SL_j = input survivor function value corresponding to XL_j
mL_j = median loss time for group j
r_i = rank for ith time point
Each survival curve can be specified in one of several ways.
For exponential curves:
a single point ( X j (1) ,S j (1)) on the curve
median survival time
hazard rate
hazard ratio (for curve 2, with respect to curve 1)
For piecewise linear curves with proportional hazards:
a set of points {(X_1(1), S_1(1)), (X_1(2), S_1(2)), …} (for curve 1)
hazard ratio (for curve 2, with respect to curve 1)
For arbitrary piecewise linear curves:
a set of points {(X_j(1), S_j(1)), (X_j(2), S_j(2)), …}
A total of M evenly spaced time points {t_0 = 0, t_1, t_2, …, t_M = T + τ} are used in calculations, where
The hazard function is calculated for each survival curve at each time point. For an exponential curve, the (constant) hazard is given by one of the following, depending on the input parameterization:
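The exponential parameterizations can be sketched as small conversion helpers. This is an illustrative sketch of the standard exponential-survival relationships (S(t) = exp(−λt)); the function names are not from PROC POWER:

```python
import math

def exp_hazard_from_point(x, s):
    """Constant hazard from one input point (x, S(x)) on an
    exponential survival curve: S(x) = exp(-lambda * x)."""
    return -math.log(s) / x

def exp_hazard_from_median(m):
    """Constant hazard from the median survival time m: S(m) = 0.5."""
    return math.log(2.0) / m

def exp_hazard_group2_from_ratio(lambda1, hazard_ratio):
    """Group 2 hazard under the (constant) hazard-ratio parameterization."""
    return hazard_ratio * lambda1
```

For example, a median survival time of m gives the same hazard as the single point (m, 0.5), since both encode S(m) = 0.5.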
For a piecewise linear curve, define the following additional notation:
The hazard is computed using linear interpolation as follows:
With proportional hazards, the hazard rate of group 2's curve in terms of the hazard rate of group 1's curve is
Loss-hazard values at each time point are computed for the loss curves in an analogous way from {L_j, XL_j, SL_j, mL_j}.
The expected number at risk N_j(i) at time i in group j is calculated for each group and time points 0 through M − 1, as follows:
Define θ_i as the ratio of hazards and φ_i as the ratio of expected numbers at risk for time t_i:
The expected number of deaths in each subinterval is calculated as follows:
The rank values are calculated as follows according to which test statistic is used:
The distribution of the test statistic is approximated by N ( E, 1) where
Note that N^(1/2) can be factored out of the mean E, and so it can be expressed equivalently as
where E* is free of N and
The approximate power is
Note that the upper and lower 1-sided cases are expressed differently than in other analyses. This is because E* > 0 corresponds to a higher survival curve in group 1 and thus, by the convention used in PROC POWER for 2-group analyses, the lower side.
For the 1-sided cases, a closed-form inversion of the power equation yields an approximate total sample size
For the 2-sided case, the solution for N is obtained by numerically inverting the power equation.
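Once E* is in hand, the normal-approximation power and the closed-form sample size inversion are straightforward. A sketch assuming `scipy`, using the N(E, 1) distribution of the test statistic with E = E*·√N; the 2-sided sample size inversion below ignores the negligible far tail, as is standard:

```python
import math
from scipy import stats

def logrank_power(e_star, n_total, alpha=0.05):
    """Approximate 2-sided power when the statistic is N(E, 1)
    with E = e_star * sqrt(n_total) (Lakatos-style approximation)."""
    z = stats.norm.ppf(1 - alpha / 2)
    e = e_star * math.sqrt(n_total)
    return stats.norm.sf(z - e) + stats.norm.cdf(-z - e)

def logrank_n(e_star, power=0.8, alpha=0.05):
    """Closed-form inversion for approximate total sample size."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    return math.ceil(((z_a + z_b) / abs(e_star)) ** 2)
```

Plugging the inverted N back into the power equation should return at least the target power, up to the rounding from the ceiling.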