Details | SAS/STAT 9.1 User's Guide (Vol. 5)

Missing Values

If an observation has a missing value for a response variable, PROC NPAR1WAY excludes that observation from the analysis.

By default, PROC NPAR1WAY excludes observations with missing values of the CLASS variable. If you specify the MISSING option, PROC NPAR1WAY treats missing values of the CLASS variable as a valid class level and includes these observations in the analysis.

PROC NPAR1WAY treats missing BY variable values like any other BY variable value. The missing values form a separate BY group. When a value of the FREQ variable is missing, PROC NPAR1WAY excludes the observation from the analysis.

Tied Values

Tied values occur when two or more observations are equal, whether the observations occur in the same sample or in different samples. In theory, nonparametric tests were developed for continuous distributions where the probability of a tie is zero. In practice, however, ties often occur. PROC NPAR1WAY uses the same method to handle ties for all score types. The procedure computes the scores as if there were no ties, averages the scores for tied observations, and assigns this average score to each observation with the same value.

When there are tied values, PROC NPAR1WAY first sorts the observations in ascending order and assigns ranks as if there were no ties. Then the procedure computes the scores based on these ranks, using the formula for the specified score type. The procedure averages the scores for tied observations and assigns this average score to each of the tied observations. Thus, all equal data values have the same score value. PROC NPAR1WAY then computes the test statistic from these scores.
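The tie-handling steps described above can be sketched in Python as follows. This is an illustration only, not PROC NPAR1WAY code: the function name `average_scores_for_ties` and the `score_fn(rank, n)` callback are hypothetical, and the default callback returns the rank itself (Wilcoxon scores).

```python
def average_scores_for_ties(values, score_fn=lambda rank, n: float(rank)):
    """Score observations as if untied, then average scores within tied groups.

    score_fn(rank, n) is a hypothetical callback that maps a rank to a score;
    the default returns the rank itself (Wilcoxon scores).
    """
    n = len(values)
    # Indices of the observations in ascending order of their values.
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Ranks start+1 .. end+1 are the ranks assigned as if there were no ties.
        group = [score_fn(rank, n) for rank in range(start + 1, end + 2)]
        average = sum(group) / len(group)
        for k in range(start, end + 1):
            scores[order[k]] = average
        start = end + 1
    return scores


# Three observations tie at 2.0 (ranks 1-3) and two tie at 3.1 (ranks 4-5).
wilcoxon_scores = average_scores_for_ties([3.1, 2.0, 3.1, 5.4, 2.0, 2.0])
print(wilcoxon_scores)
```

Because the averaging is applied to the scores rather than to the ranks, the same helper works for any score type by passing a different `score_fn`.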

Note that the asymptotic tests may be less accurate when the distribution of the data is heavily tied. For such data, it may be appropriate to use the exact tests provided by PROC NPAR1WAY as described in the section Exact Tests on page 3171.

When computing empirical distribution function statistics for data with ties, PROC NPAR1WAY uses the formulas given in the section Tests Based on the Empirical Distribution Function on page 3168. No special handling of ties is necessary.

Note that PROC NPAR1WAY bases its computations on the internal numeric values of the analysis variables; the procedure does not format or round these values before analysis. When values differ in their internal representation, even slightly, PROC NPAR1WAY does not treat them as tied values. If this is a concern for your data, then round the analysis variables by an appropriate amount before invoking PROC NPAR1WAY. For information on the ROUND function, refer to the discussion in SAS Language Reference: Dictionary.
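The effect of internal representation on ties can be illustrated outside SAS. The following Python sketch is an analogy only: Python's built-in `round` stands in for the SAS ROUND function, and the choice of 8 decimal places is an arbitrary assumption.

```python
# Two values that look equal when printed to a few decimals
# but differ in their internal floating-point representation.
x = 0.1 + 0.2
y = 0.3

tied_before = (x == y)                     # False: the values are not a tie
tied_after = (round(x, 8) == round(y, 8))  # True: the values tie after rounding

print(tied_before, tied_after)
```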

Statistical Computations

Simple Linear Rank Tests for Two-Sample Data

Statistics of the form

$$S = \sum_{j=1}^{n} c_j \, a(R_j)$$

are called simple linear rank statistics, where

R_j	is the rank of observation j
a(R_j)	is the score based on that rank
c_j	is an indicator variable denoting the class to which the jth observation belongs
n	is the total number of observations

For two-sample data (where the observations are classified into two levels), PROC NPAR1WAY calculates simple linear rank statistics for the scores that you specify. The section Scores for Linear Rank and One-Way ANOVA Tests on page 3166 describes the available scores, which you can use to test for differences in location and differences in scale.

To compute S, PROC NPAR1WAY sums the scores of the observations in the smaller of the two samples. If both samples have the same number of observations, PROC NPAR1WAY sums those scores for the sample that appears first in the input data set.

For each score that you specify, PROC NPAR1WAY computes an asymptotic test of the null hypothesis of no difference between the two classification levels. Exact tests are also available for these two-sample linear rank statistics. PROC NPAR1WAY computes exact tests for each score type that you specify in the EXACT statement. See the section Exact Tests on page 3171 for details.

To compute an asymptotic test for a linear rank sum statistic, PROC NPAR1WAY uses a standardized test statistic z, which has an asymptotic standard normal distribution under the null hypothesis. The standardized test statistic is computed as

$$z = \frac{S - E(S)}{\sqrt{\mathrm{Var}(S)}}$$

where E(S) is the expected value of S under the null hypothesis, and Var(S) is the variance under the null hypothesis. As shown in Randles and Wolfe (1979),

$$E(S) = \frac{n_1}{n} \sum_{j=1}^{n} a(R_j)$$

where n₁ is the number of observations in the first (smaller) class level or sample, n₂ is the number of observations in the other class level, and

$$\mathrm{Var}(S) = \frac{n_1 n_2}{n(n-1)} \sum_{j=1}^{n} \left( a(R_j) - \bar{a} \right)^2$$

where ā is the average score,

$$\bar{a} = \frac{1}{n} \sum_{j=1}^{n} a(R_j)$$

PROC NPAR1WAY computes one-sided and two-sided asymptotic p-values for each two-sample linear rank test. When the test statistic z is greater than its null hypothesis expected value of zero, PROC NPAR1WAY computes the right-sided p-value, which is the probability of a larger value of the statistic occurring under the null hypothesis. When the test statistic is less than or equal to zero, PROC NPAR1WAY computes the left-sided p-value, which is the probability of a smaller value of the statistic occurring under the null hypothesis. The one-sided p-value P₁ can be expressed as

$$P_1 = \begin{cases} \mathrm{Prob}(Z > z) & \text{if } z > 0 \\ \mathrm{Prob}(Z < z) & \text{if } z \le 0 \end{cases}$$

where Z has a standard normal distribution. The two-sided p-value P₂ is computed as

$$P_2 = \mathrm{Prob}(|Z| > |z|)$$

For Wilcoxon scores and Siegel-Tukey scores, PROC NPAR1WAY incorporates a continuity correction when computing the standardized test statistic z, unless you specify the CORRECT=NO option. PROC NPAR1WAY applies the continuity correction by subtracting 0.5 from the numerator S − E(S) if it is greater than zero. If the numerator is less than zero, PROC NPAR1WAY adds 0.5. Some sources recommend a continuity correction for nonparametric tests that use a continuous distribution to approximate a discrete distribution. Refer to Sheskin (1997). If you specify CORRECT=NO, PROC NPAR1WAY does not use a continuity correction for any test.
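The asymptotic two-sample test can be sketched in Python as follows. This is a minimal illustration, not the procedure's implementation. It assumes the usual permutation moments E(S) = n₁ā and Var(S) = n₁n₂/(n(n−1)) Σ(a(R_j) − ā)², where ā is the average score. The function name `two_sample_rank_test` and its arguments are hypothetical, and the caller decides through `correct` whether the continuity correction applies (the text applies it to Wilcoxon and Siegel-Tukey scores only).

```python
from math import sqrt
from statistics import NormalDist


def two_sample_rank_test(scores, in_sample, correct=True):
    """Return (S, z, one-sided p, two-sided p) for a simple linear rank test.

    scores    -- score a(R_j) of every observation (already averaged for ties)
    in_sample -- True for observations in the sample whose scores are summed
    correct   -- apply the 0.5 continuity correction to S - E(S)
    """
    n = len(scores)
    n1 = sum(1 for flag in in_sample if flag)
    n2 = n - n1
    s = sum(a for a, flag in zip(scores, in_sample) if flag)
    mean_score = sum(scores) / n
    expected = n1 * mean_score
    variance = n1 * n2 / (n * (n - 1)) * sum((a - mean_score) ** 2 for a in scores)
    numerator = s - expected
    if correct:
        # Move the numerator 0.5 toward zero; leave it alone if it is exactly zero.
        if numerator > 0:
            numerator -= 0.5
        elif numerator < 0:
            numerator += 0.5
    z = numerator / sqrt(variance)
    normal = NormalDist()
    # Right-sided p-value when z > 0, left-sided otherwise.
    p_one = 1 - normal.cdf(z) if z > 0 else normal.cdf(z)
    p_two = 2 * (1 - normal.cdf(abs(z)))
    return s, z, p_one, p_two


# Wilcoxon scores 1..6; the first sample holds the three smallest observations.
s, z, p_one, p_two = two_sample_rank_test(
    [1, 2, 3, 4, 5, 6], [True, True, True, False, False, False]
)
print(s, round(z, 4), round(p_one, 4), round(p_two, 4))
```

In this example S = 6, E(S) = 10.5, and Var(S) = 5.25, so the corrected numerator is −4.0.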

One-Way ANOVA Tests

PROC NPAR1WAY computes a one-way ANOVA test for each score type that you specify. Under the null hypothesis of no difference among class levels (or samples), this test statistic has an asymptotic chi-square distribution with r − 1 degrees of freedom, where r is the number of class levels. For Wilcoxon scores, this test is known as the Kruskal-Wallis test.

Exact one-way ANOVA tests are also available for multisample data (where the data are classified into more than two levels). For two-sample data, exact simple linear rank tests are available. PROC NPAR1WAY computes exact tests for each score type that you specify in the EXACT statement. See the section Exact Tests on page 3171 for details on exact tests.

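PROC NPAR1WAY computes the one-way ANOVA test statistic as

$$C = \frac{1}{S^2} \sum_{i=1}^{r} \frac{\left( T_i - E(T_i) \right)^2}{n_i}$$

where T_i is the total of scores for the class level i, E(T_i) is the expected total for level i under the null hypothesis of no difference among levels, n_i is the number of observations in level i, and S² is the sample variance of the scores.

$$T_i = \sum_{j=1}^{n} c_{ij} \, a(R_j)$$

where a(R_j) is the score for observation j, and c_ij indicates whether observation j is in level i.

$$E(T_i) = \frac{n_i}{n} \sum_{j=1}^{n} a(R_j)$$

$$S^2 = \frac{1}{n-1} \sum_{j=1}^{n} \left( a(R_j) - \bar{a} \right)^2$$

where ā is the average score,

$$\bar{a} = \frac{1}{n} \sum_{j=1}^{n} a(R_j)$$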
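A minimal Python sketch of the one-way statistic follows. It assumes the form C = Σ_i (T_i − E(T_i))² / n_i divided by S², with E(T_i) = n_i ā and S² the sample variance of the scores (divisor n − 1). The function name `one_way_statistic` is hypothetical, and the chi-square p-value is left out because the Python standard library has no chi-square distribution function.

```python
def one_way_statistic(scores, levels):
    """Return (C, degrees of freedom) for the one-way ANOVA test on scores.

    scores -- score a(R_j) of every observation
    levels -- class level of every observation (any hashable label)
    """
    n = len(scores)
    mean_score = sum(scores) / n
    # Sample variance of the scores, with divisor n - 1.
    s2 = sum((a - mean_score) ** 2 for a in scores) / (n - 1)
    totals, counts = {}, {}
    for a, level in zip(scores, levels):
        totals[level] = totals.get(level, 0.0) + a
        counts[level] = counts.get(level, 0) + 1
    # Sum over levels of (T_i - E(T_i))^2 / n_i, where E(T_i) = n_i * mean score.
    numerator = sum(
        (totals[level] - counts[level] * mean_score) ** 2 / counts[level]
        for level in totals
    )
    return numerator / s2, len(totals) - 1


# Wilcoxon scores 1..6 in three levels of two observations each.
c, df = one_way_statistic([1, 2, 3, 4, 5, 6], ["a", "a", "b", "b", "c", "c"])
print(round(c, 4), df)
```

With Wilcoxon scores and no ties, this value agrees with the textbook Kruskal-Wallis form 12/(n(n+1)) Σ T_i²/n_i − 3(n+1), which gives 32/7 for the example.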

Scores for Linear Rank and One-Way ANOVA Tests

For each score type that you specify, PROC NPAR1WAY computes a one-way ANOVA statistic and also a linear rank statistic for two-sample data. The following score types are used primarily to test for differences in location: Wilcoxon, median, Van der Waerden, and Savage. The following score types are used to test for scale differences: Siegel-Tukey, Ansari-Bradley, Klotz, and Mood. This section gives formulas for the score types. For further information on the formulas and the applicability of each score, refer to Randles and Wolfe (1979), Gibbons and Chakraborti (1992), Conover (1999), and Hollander and Wolfe (1999).

In addition to the score types described in this section, you can specify the SCORES=DATA option to use the input data observations as scores. This enables you to produce a very wide variety of tests. You can construct any scores using the DATA step, and then PROC NPAR1WAY computes the corresponding linear rank and one-way ANOVA tests. You can also analyze the raw data with the SCORES=DATA option; for two-sample data, this permutation test is known as Pitman's test.

Wilcoxon Scores

Wilcoxon scores are the ranks of the observations.

$$a(R_j) = R_j$$

Using Wilcoxon scores in the linear rank statistic for two-sample data produces the rank sum statistic of the Mann-Whitney-Wilcoxon test. Using Wilcoxon scores in the one-way ANOVA statistic produces the Kruskal-Wallis test. Wilcoxon scores are locally most powerful for location shifts of a logistic distribution.

When computing the asymptotic Wilcoxon two-sample test, PROC NPAR1WAY uses a continuity correction by default, as described in the section Simple Linear Rank Tests for Two-Sample Data on page 3163. If you specify CORRECT=NO in the PROC NPAR1WAY statement, the procedure does not use a continuity correction.

Median Scores

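Median scores equal 1 for observations greater than the median, and 0 otherwise.

$$a(R_j) = \begin{cases} 1 & \text{if } R_j > (n+1)/2 \\ 0 & \text{if } R_j \le (n+1)/2 \end{cases}$$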

Using median scores in the linear rank statistic for two-sample data produces the two-sample median test. The one-way ANOVA statistic with median scores is equivalent to the Brown-Mood test. Median scores are particularly powerful for distributions that are symmetric and heavy-tailed.

Van der Waerden Scores

Van der Waerden scores are the quantiles of a standard normal distribution. These scores are also known as quantile normal scores.

$$a(R_j) = \Phi^{-1}\left( \frac{R_j}{n+1} \right)$$

where Φ is the cumulative distribution function of a standard normal distribution. These scores are powerful for normal distributions.

Savage Scores

Savage scores are expected values of order statistics from the exponential distribution, with 1 subtracted to center the scores around 0.

$$a(R_j) = \sum_{i=1}^{R_j} \frac{1}{n - i + 1} - 1$$

Savage scores are powerful for comparing scale differences in exponential distributions or location shifts in extreme value distributions (Hajek 1969, p. 83).

Siegel-Tukey Scores

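Siegel-Tukey scores are computed as

$$a(1) = 1, \; a(n) = 2, \; a(n-1) = 3, \; a(2) = 4, \; a(3) = 5, \; a(n-2) = 6, \; a(n-3) = 7, \; a(4) = 8, \; \ldots$$

where the score values continue to increase in this pattern towards the middle ranks until all observations have been assigned a score.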

Ansari-Bradley Scores

Ansari-Bradley scores are similar to Siegel-Tukey scores, but Ansari-Bradley assigns the same scores to corresponding extreme ranks. (Siegel-Tukey scores are just a permutation of the ranks 1, 2, …, n.)

$$a(1) = 1, \; a(n) = 1, \; a(2) = 2, \; a(n-1) = 2, \; \ldots$$

Equivalently, Ansari-Bradley scores are defined as

$$a(R_j) = \frac{n+1}{2} - \left| R_j - \frac{n+1}{2} \right|$$

Klotz Scores

Klotz scores are the squares of the Van der Waerden (or quantile normal) scores.

$$a(R_j) = \left( \Phi^{-1}\left( \frac{R_j}{n+1} \right) \right)^2$$

where Φ is the cumulative distribution function of a standard normal distribution.

Mood Scores

Mood scores are computed as the square of the difference between each rank and the average rank.

$$a(R_j) = \left( R_j - \frac{n+1}{2} \right)^2$$
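The score types in this section can be sketched as small Python functions of a rank `r` and the sample size `n`. This is an illustration under stated assumptions, not SAS code: the function names are hypothetical, the formulas are the commonly published definitions of each score, and the Siegel-Tukey pattern assumed is a(1) = 1, a(n) = 2, a(n−1) = 3, a(2) = 4, a(3) = 5, and so on toward the middle ranks.

```python
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal distribution


def wilcoxon(r, n):
    return float(r)


def median(r, n):
    # 1 for ranks above the median rank (n + 1) / 2, and 0 otherwise.
    return 1.0 if r > (n + 1) / 2 else 0.0


def van_der_waerden(r, n):
    return _NORMAL.inv_cdf(r / (n + 1))


def savage(r, n):
    return sum(1.0 / (n - i + 1) for i in range(1, r + 1)) - 1.0


def ansari_bradley(r, n):
    return (n + 1) / 2 - abs(r - (n + 1) / 2)


def klotz(r, n):
    return van_der_waerden(r, n) ** 2


def mood(r, n):
    return (r - (n + 1) / 2) ** 2


def siegel_tukey_all(n):
    """Return the Siegel-Tukey scores for ranks 1..n as a list."""
    scores = {}
    low, high = 1, n
    next_score = 1
    take_low = True
    count = 1  # the first step assigns one rank; every later step assigns two
    while low <= high:
        for _ in range(count):
            if low > high:
                break
            if take_low:
                scores[low] = next_score
                low += 1
            else:
                scores[high] = next_score
                high -= 1
            next_score += 1
        take_low = not take_low
        count = 2
    return [scores[r] for r in range(1, n + 1)]


print(siegel_tukey_all(8))
print([ansari_bradley(r, 8) for r in range(1, 9)])
```

The Siegel-Tukey scores depend on the whole set of ranks, so that helper returns all n scores at once instead of taking a single rank.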

Tests Based on the Empirical Distribution Function

If you specify the EDF option, PROC NPAR1WAY computes tests based on the empirical distribution function. These include the Kolmogorov-Smirnov and Cramer-von Mises tests, and also the Kuiper test for two-sample data. This section gives formulas for these test statistics. For further information on the formulas and the interpretation of EDF statistics, refer to Hollander and Wolfe (1999) and Gibbons and Chakraborti (1992). For details on the k -sample analogues of the Kolmogorov-Smirnov and Cramer-von Mises statistics used by NPAR1WAY, refer to Kiefer (1959).

The empirical distribution function (EDF) of a sample {x_j}, j = 1, 2, …, n, is defined as the following function:

$$F(x) = \frac{1}{n} \sum_{j=1}^{n} I(x_j \le x)$$

where I(·) is an indicator function. PROC NPAR1WAY uses the subsample of values within the ith class level to generate an EDF for the class, F_i. The EDF for the overall sample, pooled over classes, can also be expressed as

$$F(x) = \frac{1}{n} \sum_{i} n_i F_i(x)$$

where n_i is the number of observations in the ith class level, and n is the total number of observations.

Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov statistic measures the maximum deviation of the EDF within the classes from the pooled EDF. PROC NPAR1WAY computes the Kolmogorov-Smirnov statistic as

$$KS = \max_{j} \sqrt{ \frac{1}{n} \sum_{i} n_i \left( F_i(x_j) - F(x_j) \right)^2 } \quad \text{where } j = 1, 2, \ldots, n$$

The asymptotic Kolmogorov-Smirnov statistic is computed as

$$KS_a = KS \times \sqrt{n}$$

For each class level i and overall, PROC NPAR1WAY displays the value of F_i at the maximum deviation from F and the value (F_i − F) at the maximum deviation from F. PROC NPAR1WAY also gives the observation where the maximum deviation occurs.

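If there are only two class levels, PROC NPAR1WAY computes the two-sample Kolmogorov-Smirnov test statistic D as

$$D = \max_{j} \left| F_1(x_j) - F_2(x_j) \right| \quad \text{where } j = 1, 2, \ldots, n$$

The p-value for this test is the probability that D is greater than the observed value d under the null hypothesis of no difference between class levels or samples. PROC NPAR1WAY computes the asymptotic p-value for D with the approximation

$$\mathrm{Prob}(D > d) = 2 \sum_{i=1}^{\infty} (-1)^{i-1} e^{-2 i^2 z^2}$$

where

$$z = d \sqrt{\frac{n_1 n_2}{n}}$$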

The quality of this approximation has been studied by Hodges (1957).
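The two-sample Kolmogorov-Smirnov computation can be sketched in Python as follows. This is a minimal illustration, not the procedure's implementation. It assumes D = max_j |F₁(x_j) − F₂(x_j)| and the series approximation Prob(D > d) = 2 Σ (−1)^(i−1) exp(−2i²z²) with z = d·sqrt(n₁n₂/n). The function names and the truncation at 100 terms are hypothetical choices.

```python
from math import exp, sqrt


def edf(sample, x):
    """Empirical distribution function of sample evaluated at x."""
    return sum(1 for value in sample if value <= x) / len(sample)


def ks_two_sample(sample1, sample2, terms=100):
    """Return (D, asymptotic p-value) for the two-sample KS test."""
    n1, n2 = len(sample1), len(sample2)
    n = n1 + n2
    # Maximum absolute EDF difference, evaluated at every observed value.
    d = max(abs(edf(sample1, x) - edf(sample2, x)) for x in sample1 + sample2)
    z = d * sqrt(n1 * n2 / n)
    # Truncated alternating series; it converges quickly unless z is near zero.
    p = 2 * sum((-1) ** (i - 1) * exp(-2 * i * i * z * z) for i in range(1, terms + 1))
    # Guard against tiny excursions outside [0, 1] caused by truncation.
    return d, min(max(p, 0.0), 1.0)


d, p = ks_two_sample([1, 2, 3, 4], [5, 6, 7, 8])
print(d, round(p, 4))
```

For two completely separated samples of size four, D = 1 and z = sqrt(2), so the first series term, 2·exp(−4), dominates the p-value.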

If you specify the D option, or if you request exact Kolmogorov-Smirnov p-values with the KS option in the EXACT statement, PROC NPAR1WAY also computes the one-sided Kolmogorov-Smirnov statistics D⁺ and D⁻ for two-sample data.

The asymptotic probability that D⁺ is greater than the observed value d⁺, under the null hypothesis of no difference between the two class levels, is computed as

$$\mathrm{Prob}(D^+ > d^+) = \exp\left( -2 \, (d^+)^2 \, \frac{n_1 n_2}{n} \right)$$

Similarly, the asymptotic probability that D⁻ is greater than the observed value d⁻ is computed as

$$\mathrm{Prob}(D^- > d^-) = \exp\left( -2 \, (d^-)^2 \, \frac{n_1 n_2}{n} \right)$$

To request exact p -values for the Kolmogorov-Smirnov statistics, you can specify the KS option in the EXACT statement. See the section Exact Tests on page 3171 for more information.

Cramer-von Mises Test

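The Cramer-von Mises statistic is defined as

$$CM = \frac{1}{n^2} \sum_{i} \left( n_i \sum_{j=1}^{p} t_j \left( F_i(x_j) - F(x_j) \right)^2 \right)$$

where t_j is the number of ties at the jth distinct value and p is the number of distinct values. The asymptotic value is computed as

$$CM_a = n \times CM$$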

PROC NPAR1WAY displays the contribution of each class level to the sum CM _a .
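A minimal Python sketch of the Cramer-von Mises computation follows. It assumes the form CM = (1/n²) Σ_i n_i Σ_j t_j (F_i(x_j) − F(x_j))² over the distinct values x_j, with the asymptotic value taken as n × CM. The function name `cramer_von_mises` is hypothetical.

```python
from collections import Counter


def cramer_von_mises(samples):
    """Return (CM, CM_a) for a list of samples, one list per class level."""
    pooled = [value for sample in samples for value in sample]
    n = len(pooled)
    ties = Counter(pooled)  # t_j: number of observations at each distinct value

    def edf(sample, x):
        return sum(1 for value in sample if value <= x) / len(sample)

    total = 0.0
    for sample in samples:
        # Contribution of one class level, weighted by its size n_i.
        total += len(sample) * sum(
            t * (edf(sample, x) - edf(pooled, x)) ** 2 for x, t in ties.items()
        )
    cm = total / n ** 2
    return cm, n * cm


cm, cm_a = cramer_von_mises([[1, 2], [3, 4]])
print(cm, cm_a)
```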

Kuiper Test

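For data with two class levels, PROC NPAR1WAY computes the Kuiper statistic, its scaled value for the asymptotic distribution, and the asymptotic p-value. The Kuiper statistic is computed as

$$K = \max_{j} \left( F_1(x_j) - F_2(x_j) \right) - \min_{j} \left( F_1(x_j) - F_2(x_j) \right) \quad \text{where } j = 1, 2, \ldots, n$$

The asymptotic value is

$$K_a = K \sqrt{\frac{n_1 n_2}{n}}$$

PROC NPAR1WAY displays max_j F₁(x_j) − F₂(x_j) for each class level.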

The p -value for the Kuiper test is the probability of observing a larger value of K _a under the null hypothesis of no difference between the two classes. PROC NPAR1WAY computes this p -value according to Owen (1962), p. 441.
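The Kuiper statistic and its scaled value can be sketched in Python as follows. The sketch assumes K = max_j (F₁(x_j) − F₂(x_j)) − min_j (F₁(x_j) − F₂(x_j)) and K_a = K·sqrt(n₁n₂/n). The function name `kuiper_two_sample` is hypothetical, and the p-value computation from Owen (1962) is not reproduced here.

```python
from math import sqrt


def kuiper_two_sample(sample1, sample2):
    """Return (K, K_a) for two samples."""
    n1, n2 = len(sample1), len(sample2)
    n = n1 + n2

    def edf(sample, x):
        return sum(1 for value in sample if value <= x) / len(sample)

    # Signed EDF differences at every observed value.
    diffs = [edf(sample1, x) - edf(sample2, x) for x in sample1 + sample2]
    k = max(diffs) - min(diffs)
    return k, k * sqrt(n1 * n2 / n)


# The signed differences here are 0.5, 0, -0.5, and 0, so K = 1.
k, k_a = kuiper_two_sample([1, 4], [2, 3])
print(k, k_a)
```

In this example the largest absolute EDF difference is only 0.5, while the Kuiper statistic is 1, because it combines the largest deviations in both directions.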

Exact Tests

PROC NPAR1WAY provides exact p -values for tests for location and scale differences based on the following scores: Wilcoxon, median, van der Waerden, Savage, Siegel-Tukey, Ansari-Bradley, Klotz, and Mood scores. Additionally, PROC NPAR1WAY provides exact p -values for tests using the raw data as scores. Exact tests are available for two-sample and multisample data. When the data are classified into two samples, tests are based on simple linear rank statistics. When the data are classified into more than two samples, tests are based on one-way ANOVA statistics.

Exact tests can be useful in situations where the asymptotic assumptions are not met and the asymptotic p -values are not close approximations for the true p -values. Standard asymptotic methods involve the assumption that the test statistic follows a particular distribution when the sample size is sufficiently large. When the sample size is not large, asymptotic results may not be valid, with the asymptotic p -values differing perhaps substantially from the exact p -values. Asymptotic results may also be unreliable when the distribution of the data is sparse, skewed, or heavily tied. Refer to Agresti (1996) and Bishop, Fienberg, and Holland (1975). Exact computations are based on the statistical theory of exact conditional inference for contingency tables, reviewed by Agresti (1992).

In addition to computation of exact p -values, PROC NPAR1WAY provides the option of estimating exact p -values by Monte Carlo simulation. This can be useful for problems that are so large that exact computations require a great amount of time and memory, but for which asymptotic approximations may not be sufficient.

The following sections summarize the exact computational algorithms, define the exact p -values that PROC NPAR1WAY computes, discuss the computational resource requirements, and describe the Monte Carlo estimation option.

Computational Algorithms

PROC NPAR1WAY computes exact p -values using the network algorithm developed by Mehta and Patel (1983). This algorithm provides a substantial advantage over direct enumeration, which can be very time consuming and feasible only for small problems. Refer to Agresti (1992) for a review of algorithms for computation of exact p -values, and refer to Mehta, Patel, and Tsiatis (1984) and Mehta, Patel, and Senchaudhuri (1991) for information on the performance of the network algorithm.

PROC NPAR1WAY constructs a contingency table from the input data, with rows formed by the levels of the classification variable and columns formed by the response variable values. The reference set for a given contingency table is the set of all contingency tables with the observed marginal row and column sums. Corresponding to this reference set, the network algorithm forms a directed acyclic network consisting of nodes in a number of stages. A path through the network corresponds to a distinct table in the reference set. The distances between nodes are defined so that the total distance of a path through the network is the corresponding value of the test statistic. At each node, the algorithm computes the shortest and longest path distances for all the paths that pass through that node. For the two-sample linear rank statistics, which can be expressed as a linear combination of cell frequencies multiplied by increasing row and column scores, PROC NPAR1WAY computes shortest and longest path distances using the algorithm given in Agresti, Mehta, and Patel (1990). For the multisample one-way test statistics, PROC NPAR1WAY computes an upper bound for the longest path and a lower bound for the shortest path, following the approach of Valz and Thompson (1994).

The longest and shortest path distances or bounds for a node are compared to the value of the test statistic to determine whether all paths through the node contribute to the p-value, none of the paths through the node contribute to the p-value, or neither of these situations occurs. If all paths through the node contribute, the p-value is incremented accordingly, and these paths are eliminated from further analysis. If no paths contribute, these paths are eliminated from the analysis. Otherwise, the algorithm continues, still processing this node and the associated paths. The algorithm finishes when all nodes have been accounted for.

In applying the network algorithm, PROC NPAR1WAY uses full precision to represent all statistics, row and column scores, and other quantities involved in the computations. Although it is possible to use rounding to improve the speed and memory requirements of the algorithm, PROC NPAR1WAY does not do this since it can result in reduced accuracy of the p -values.

Definition of p-Values

For two-sample linear rank tests, PROC NPAR1WAY computes exact one-sided and two-sided p-values for each test specified in the EXACT statement. For the one-sided test, PROC NPAR1WAY displays the right-sided p-value when the observed value of the test statistic is greater than its expected value. The right-sided p-value is the sum of probabilities for those tables having a test statistic greater than or equal to the observed test statistic. Otherwise, when the test statistic is less than or equal to its expected value, PROC NPAR1WAY displays the left-sided p-value. The left-sided p-value is the sum of probabilities for those tables having a test statistic less than or equal to the one observed. The one-sided p-value P₁ can be expressed as

$$P_1 = \begin{cases} \mathrm{Prob}(\text{Test Statistic} \ge S) & \text{if } S > \text{Mean} \\ \mathrm{Prob}(\text{Test Statistic} \le S) & \text{if } S \le \text{Mean} \end{cases}$$

where S is the observed value of the test statistic and Mean is the expected value of the test statistic under the null hypothesis. PROC NPAR1WAY computes the two-sided p-value as the sum of the one-sided p-value and the corresponding area in the opposite tail of the distribution of the statistic, equidistant from the expected value. The two-sided p-value P₂ can be expressed as

$$P_2 = \mathrm{Prob}\left( |\text{Test Statistic} - \text{Mean}| \ge |S - \text{Mean}| \right)$$

For multisample data, the tests are based on one-way ANOVA statistics. For a test of this form, large values of the test statistic indicate a departure from the null hypothesis; the test is inherently two-sided. The exact p -value is the sum of probabilities for those tables having a test statistic greater than or equal to the value of the observed test statistic.

If you specify the POINT option in the EXACT statement, PROC NPAR1WAY also displays exact point probabilities for the test statistics. The exact point probability is the exact probability that the test statistic equals the observed value.
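For very small two-sample problems, these exact p-value definitions can be checked by direct enumeration. The Python sketch below is not the network algorithm that PROC NPAR1WAY uses. It simply lists every way of choosing which n₁ observations form the first sample, treats each choice as equally likely under the null hypothesis, and applies the definitions of the one-sided, two-sided, and point probabilities. The function name and the tolerance `tol` are hypothetical.

```python
from itertools import combinations


def exact_two_sample_p_values(scores, n1, observed, tol=1e-9):
    """Return (one-sided p, two-sided p, point probability) by enumeration.

    scores   -- score of every observation in the pooled data
    n1       -- size of the sample whose scores are summed
    observed -- observed value S of the linear rank statistic
    """
    n = len(scores)
    mean = n1 * sum(scores) / n  # expected value of S under the null hypothesis
    # One statistic per equally likely assignment of n1 observations.
    stats = [sum(chosen) for chosen in combinations(scores, n1)]
    total = len(stats)
    if observed > mean:
        # Right-sided: statistics greater than or equal to the observed value.
        p_one = sum(1 for s in stats if s >= observed - tol) / total
    else:
        # Left-sided: statistics less than or equal to the observed value.
        p_one = sum(1 for s in stats if s <= observed + tol) / total
    # Two-sided: at least as far from the mean as the observed value.
    p_two = sum(
        1 for s in stats if abs(s - mean) >= abs(observed - mean) - tol
    ) / total
    point = sum(1 for s in stats if abs(s - observed) <= tol) / total
    return p_one, p_two, point


# Wilcoxon scores 1..6, first sample of size 3, observed rank sum S = 6.
p_one, p_two, point = exact_two_sample_p_values([1, 2, 3, 4, 5, 6], 3, 6)
print(p_one, p_two, point)
```

Only 1 of the 20 possible assignments gives a rank sum as small as 6, and 2 of the 20 lie at least 4.5 from the expected value 10.5. Direct enumeration grows combinatorially, so it is feasible only for small samples.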

Computational Resources

PROC NPAR1WAY uses relatively fast and efficient algorithms for exact computations. These recently developed algorithms, together with improvements in computer power, make it feasible now to perform exact computations for data sets where previously only asymptotic methods could be applied. Nevertheless, there are still large problems that may require a prohibitive amount of time and memory for exact computations, depending on the speed and memory available on your computer. For large problems, consider whether exact methods are really needed or whether asymptotic methods might give results quite close to the exact results while requiring much less computer time and memory. When asymptotic methods may not be sufficient for such large problems, consider using Monte Carlo estimation of exact p -values, as described in the section Monte Carlo Estimation on page 3174.

A formula does not exist that can predict in advance how much time and memory are needed to compute an exact p-value for a certain problem. The time and memory required depend on several factors, including which test is being performed, the total sample size, the number of rows and columns, and the specific arrangement of the observations into table cells. Generally, larger problems (in terms of total sample size, number of rows, and number of columns) tend to require more time and memory. Additionally, for a fixed total sample size, time and memory requirements tend to increase as the number of rows and columns increase, since this corresponds to an increase in the number of tables in the reference set. Also for a fixed sample size, time and memory requirements increase as the marginal row and column totals become more homogeneous. Refer to Agresti, Mehta, and Patel (1990) and Gail and Mantel (1977).

At any time while PROC NPAR1WAY is computing exact p-values, you can terminate the computations by pressing the system interrupt key sequence (refer to the SAS Companion for your system) and choosing to stop computations. After you terminate exact computations, PROC NPAR1WAY completes all other remaining tasks. The procedure produces the requested output and reports missing values for any exact p-values not computed by the time of termination.

You can also use the MAXTIME= option in the EXACT statement to limit the amount of time PROC NPAR1WAY uses for exact computations. You specify a MAXTIME= value that is the maximum amount of time (in seconds) that PROC NPAR1WAY can use to compute an exact p -value. If PROC NPAR1WAY does not finish computing an exact p -value within that time, it terminates the computation and completes all other remaining tasks.

Monte Carlo Estimation

If you specify the MC option in the EXACT statement, PROC NPAR1WAY computes Monte Carlo estimates of the exact p-values instead of directly computing the exact p-values. Monte Carlo estimation can be useful for large problems that require a great amount of time and memory for exact computations but for which asymptotic approximations may not be sufficient. To describe the precision of each Monte Carlo estimate, PROC NPAR1WAY provides the asymptotic standard error and 100(1 − α)% confidence limits. The confidence level α is determined by the ALPHA= option in the EXACT statement, which, by default, equals 0.01, and produces 99% confidence limits. The N= option in the EXACT statement specifies the number of samples PROC NPAR1WAY uses for Monte Carlo estimation; the default is 10,000 samples. You can specify a larger value for n to improve the precision of the Monte Carlo estimates. Because larger values of n generate more samples, the computation time increases. Or you can specify a smaller value of n to reduce the computation time.

To compute a Monte Carlo estimate of an exact p -value, PROC NPAR1WAY generates a random sample of tables with the same total sample size, row totals, and column totals as the observed table. PROC NPAR1WAY uses the algorithm of Agresti, Wackerly, and Boyett (1979), which generates tables in proportion to their hyper-geometric probabilities conditional on the marginal frequencies. For each sample table, PROC NPAR1WAY computes the value of the test statistic and compares it to the value for the observed table. When estimating a right-sided p -value, PROC NPAR1WAY counts all sample tables for which the test statistic is greater than or equal to the observed test statistic. Then the p -value estimate equals the number of these tables divided by the total number of tables sampled.

P̂_MC	=	M / N
M	=	number of samples with (Test Statistic ≥ t)
N	=	total number of samples
t	=	observed Test Statistic

PROC NPAR1WAY computes left-sided and two-sided p -value estimates in a similar manner. For left-sided p -values, PROC NPAR1WAY evaluates whether the test statistic for each sampled table is less than or equal to the observed test statistic. For two-sided p -values, PROC NPAR1WAY examines the sample test statistics according to the expression for P ₂ given in the section Definition of p -Values on page 3172.

The variable M is a binomial variable with N trials and success probability p. It follows that the asymptotic standard error of the Monte Carlo estimate is

$$\mathrm{se}(\hat{P}_{MC}) = \sqrt{ \frac{\hat{P}_{MC} \, (1 - \hat{P}_{MC})}{N - 1} }$$

PROC NPAR1WAY constructs asymptotic confidence limits for the p-values according to

$$\hat{P}_{MC} \pm z_{\alpha/2} \cdot \mathrm{se}(\hat{P}_{MC})$$

where z_{α/2} is the 100(1 − α/2) percentile of the standard normal distribution, and the confidence level α is determined by the ALPHA= option in the EXACT statement.

When the Monte Carlo estimate P̂_MC equals 0, then PROC NPAR1WAY computes the confidence limits for the p-value as

$$\left( 0, \; 1 - \alpha^{1/N} \right)$$

When the Monte Carlo estimate P̂_MC equals 1, then PROC NPAR1WAY computes the confidence limits as

$$\left( \alpha^{1/N}, \; 1 \right)$$
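The Monte Carlo estimate and its precision measures can be sketched in Python for a right-sided two-sample test. This is an illustration under stated assumptions, not the procedure's implementation. Instead of the table-generating algorithm of Agresti, Wackerly, and Boyett (1979), it draws random subsets of the pooled scores without replacement. It assumes the standard error sqrt(P̂(1 − P̂)/(N − 1)), limits P̂ ± z_{α/2}·se, and the boundary limits (0, 1 − α^(1/N)) and (α^(1/N), 1). The function name, arguments, and seed are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def monte_carlo_p_value(scores, n1, observed, n_samples=10000, alpha=0.01, seed=12345):
    """Return (estimate, standard error, (lower limit, upper limit))."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    m = 0
    for _ in range(n_samples):
        # Random assignment of n1 observations to the first sample.
        statistic = sum(rng.sample(scores, n1))
        if statistic >= observed:
            m += 1
    p_hat = m / n_samples
    if m == 0:
        return p_hat, 0.0, (0.0, 1 - alpha ** (1 / n_samples))
    if m == n_samples:
        return p_hat, 0.0, (alpha ** (1 / n_samples), 1.0)
    se = sqrt(p_hat * (1 - p_hat) / (n_samples - 1))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return p_hat, se, (p_hat - z * se, p_hat + z * se)


# Wilcoxon scores 1..6, first sample of size 3, observed rank sum 15.
# Only the subset {4, 5, 6} reaches 15, so the exact right-sided p-value is 1/20.
p_hat, se, (lower, upper) = monte_carlo_p_value([1, 2, 3, 4, 5, 6], 3, 15)
print(p_hat, se, lower, upper)
```

With the default of 10,000 samples the standard error in this example is roughly 0.002, so the 99% limits are about ±0.006 around the estimate.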

Output Data Set

The OUTPUT statement creates a SAS data set that contains statistics computed by PROC NPAR1WAY. You specify which statistics to store in the output data set, using options identical to those used in the PROC NPAR1WAY statement. When you specify one of these options in the OUTPUT statement, PROC NPAR1WAY includes all available statistics from that analysis in the output data set.

The output data set contains one observation for each analysis variable within a BY-group. The OUTPUT data set can include the following variables:

BY variables
_VAR_, which identifies the analysis variable
variables containing the specified statistics

The following table lists the variable names and descriptions for all available statistics. Note that some statistics are available only for the two-sample case (where the classification variable groups the data into two classes). Other statistics are available only for the multisample case.

When you request exact p -values for certain analyses using the EXACT statement, PROC NPAR1WAY also includes those p -values in the output data set if you specify the corresponding analysis options in the OUTPUT statement. If you do not request exact p -values, then they do not appear in the output data set.

Monte Carlo estimates of exact p -values are not available in this output data set, but you can use the Output Delivery System (ODS) to store Monte Carlo estimates in a SAS data set. You can use the Output Delivery System to create a SAS data set from any piece of PROC NPAR1WAY output. For more information, see Table 52.6 on page 3184 and Chapter 14, Using the Output Delivery System.

Table 52.5: Output Data Set Variable Names and Descriptions
Option	Output Variables		Variable Descriptions
ANOVA	_MSA_		ANOVA Effect Mean Square, Among MS
	_MSE_		ANOVA Error Mean Square, Within MS
	_F_		F Statistic for ANOVA
	P_F		p -value, F Statistic for ANOVA
WILCOXON	_WIL_	^{[ *]}	Two-sample Wilcoxon Statistic
	Z_WIL	^{[ *]}	Wilcoxon Statistic, Standardized
	PL_WIL	^{[ *]}	p -value, Wilcoxon Test (Left-sided)
	PR_WIL	^{[ *]}	p -value, Wilcoxon Test (Right-sided)
	P2_WIL	^{[ *]}	p -value, Wilcoxon Test (Two-sided)
	PTL_WIL	^{[ *]}	p -value, Wilcoxon t Approximation (Left-sided)
	PTR_WIL	^{[ *]}	p -value, Wilcoxon t Approximation (Right-sided)
	PT2_WIL	^{[ *]}	p -value, Wilcoxon t Approximation (Two-sided)
	XPL_WIL	^{[ *]}	Exact p -value, Wilcoxon Test (Left-sided)
	XPR_WIL	^{[ *]}	Exact p -value, Wilcoxon Test (Right-sided)
	XPT_WIL	^{[ *]}	Exact Point Probability, Wilcoxon Test
	XP2_WIL	^{[ *]}	Exact p -value, Wilcoxon Test (Two-sided)
	_KW_		Kruskal-Wallis Statistic
	DF_KW		Degrees of Freedom, Kruskal-Wallis Test
	P_KW		p -value, Kruskal-Wallis Test
	XP_KW	^{[ **]}	Exact p -value, Kruskal-Wallis Test
	XPT_KW	^{[ **]}	Exact Point Probability, Kruskal-Wallis Test
MEDIAN	_MED_	^{[ *]}	Two-sample Median Statistic
	Z_MED	^{[ *]}	Median Statistic, Standardized
	PL_MED	^{[ *]}	p -value, Median Test (Left-sided)
	PR_MED	^{[ *]}	p -value, Median Test (Right-sided)
	P2_MED	^{[ *]}	p -value, Median Test (Two-sided)
	XPL_MED	^{[ *]}	Exact p -value, Median Test (Left-sided)
	XPR_MED	^{[ *]}	Exact p -value, Median Test (Right-sided)
	XPT_MED	^{[ *]}	Exact Point Probability, Median Test
	XP2_MED	^{[ *]}	Exact p -value, Median Test (Two-sided)
	_CHMED_		Median Chi-square (Brown-Mood Test)
	DF_CHMED		Degrees of Freedom, Median Chi-square
	P_CHMED		p -value, Median Chi-square Test
	XP_CHMED	^{[ **]}	Exact p -value, Median Chi-square
	XPT_CHME	^{[ **]}	Exact Point Probability, Median Chi-square
VW	_VW_	^{[ *]}	Two-sample Van der Waerden Statistic
	Z_VW	^{[ *]}	Van der Waerden Statistic, Standardized
	PL_VW	^{[ *]}	p -value, Van der Waerden Test (Left-sided)
	PR_VW	^{[ *]}	p -value, Van der Waerden Test (Right-sided)
	P2_VW	^{[ *]}	p -value, Van der Waerden Test (Two-sided)
	XPL_VW	^{[ *]}	Exact p -value, Van der Waerden Test (Left-sided)
	XPR_VW	^{[ *]}	Exact p -value, Van der Waerden Test (Right-sided)
	XPT_VW	^{[ *]}	Exact Point Probability, Van der Waerden Test
	XP2_VW	^{[ *]}	Exact p -value, Van der Waerden Test (Two-sided)
	_CHVW_		Van der Waerden Chi-square
	DF_CHVW		Degrees of Freedom, Van der Waerden Chi-square
	P_CHVW		p -value, Van der Waerden Chi-square Test
	XP_CHVW	^{[ **]}	Exact p -value, Van der Waerden Chi-square
	XPT_CHVW	^{[ **]}	Exact Point Prob, Van der Waerden Chi-square
SAVAGE	_SAV_	^{[ *]}	Two-sample Savage Statistic
	Z_SAV	^{[ *]}	Savage Statistic, Standardized
	PL_SAV	^{[ *]}	p -value, Savage Test (Left-sided)
	PR_SAV	^{[ *]}	p -value, Savage Test (Right-sided)
	P2_SAV	^{[ *]}	p -value, Savage Test (Two-sided)
	XPL_SAV	^{[ *]}	Exact p -value, Savage Test (Left-sided)
	XPR_SAV	^{[ *]}	Exact p -value, Savage Test (Right-sided)
	XPT_SAV	^{[ *]}	Exact Point Probability, Savage Test
	XP2_SAV	^{[ *]}	Exact p -value, Savage Test (Two-sided)
	_CHSAV_		Savage Chi-square
	DF_CHSAV		Degrees of Freedom, Savage Chi-square
	P_CHSAV		p -value, Savage Chi-square Test
	XP_CHSAV	^{[ **]}	Exact p -value, Savage Chi-square
	XPT_CHSA	^{[ **]}	Exact Point Probability, Savage Chi-square
ST	_ST_	^{[ *]}	Two-sample Siegel-Tukey Statistic
	Z_ST	^{[ *]}	Siegel-Tukey Statistic, Standardized
	PL_ST	^{[ *]}	p -value, Siegel-Tukey Test (Left-sided)
	PR_ST	^{[ *]}	p -value, Siegel-Tukey Test (Right-sided)
	P2_ST	^{[ *]}	p -value, Siegel-Tukey Test (Two-sided)
	XPL_ST	^{[ *]}	Exact p -value, Siegel-Tukey Test (Left-sided)
	XPR_ST	^{[ *]}	Exact p -value, Siegel-Tukey Test (Right-sided)
	XPT_ST	^{[ *]}	Exact Point Probability, Siegel-Tukey Test
	XP2_ST	^{[ *]}	Exact p -value, Siegel-Tukey Test (Two-sided)
	_CHST_		Siegel-Tukey Chi-square
	DF_CHST		Degrees of Freedom, Siegel-Tukey Chi-square
	P_CHST		p -value, Siegel-Tukey Chi-square Test
	XP_CHST	^{[ **]}	Exact p -value, Siegel-Tukey Chi-square
	XPT_CHST	^{[ **]}	Exact Point Probability, Siegel-Tukey Chi-square
AB	_AB_	^{[ *]}	Two-sample Ansari-Bradley Statistic
	Z_AB	^{[ *]}	Ansari-Bradley Statistic, Standardized
	PL_AB	^{[ *]}	p -value, Ansari-Bradley Test (Left-sided)
	PR_AB	^{[ *]}	p -value, Ansari-Bradley Test (Right-sided)
	P2_AB	^{[ *]}	p -value, Ansari-Bradley Test (Two-sided)
	XPL_AB	^{[ *]}	Exact p -value, Ansari-Bradley Test (Left-sided)
	XPR_AB	^{[ *]}	Exact p -value, Ansari-Bradley Test (Right-sided)
	XPT_AB	^{[ *]}	Exact Point Probability, Ansari-Bradley Test
	XP2_AB	^{[ *]}	Exact p -value, Ansari-Bradley Test (Two-sided)
	_CHAB_		Ansari-Bradley Chi-square
	DF_CHAB		Degrees of Freedom, Ansari-Bradley Chi-square
	P_CHAB		p -value, Ansari-Bradley Chi-square Test
	XP_CHAB	^{[ **]}	Exact p -value, Ansari-Bradley Chi-square
	XPT_CHAB	^{[ **]}	Exact Point Probability, Ansari-Bradley Chi-square
KLOTZ	_KLOTZ_	^{[ *]}	Two-sample Klotz Statistic
	Z_K	^{[ *]}	Klotz Statistic, Standardized
	PL_K	^{[ *]}	p -value, Klotz Test (Left-sided)
	PR_K	^{[ *]}	p -value, Klotz Test (Right-sided)
	P2_K	^{[ *]}	p -value, Klotz Test (Two-sided)
	XPL_K	^{[ *]}	Exact p -value, Klotz Test (Left-sided)
	XPR_K	^{[ *]}	Exact p -value, Klotz Test (Right-sided)
	XPT_K	^{[ *]}	Exact Point Probability, Klotz Test
	XP2_K	^{[ *]}	Exact p -value, Klotz Test (Two-sided)
	_CHK_		Klotz Chi-square
	DF_CHK		Degrees of Freedom, Klotz Chi-square
	P_CHK		p -value, Klotz Chi-square Test
	XP_CHK	^{[ **]}	Exact p -value, Klotz Chi-square
	XPT_CHK	^{[ **]}	Exact Point Probability, Klotz Chi-square
MOOD	_MOOD_	^{[ *]}	Two-sample Mood Statistic
	Z_MOOD	^{[ *]}	Mood Statistic, Standardized
	PL_MOOD	^{[ *]}	p -value, Mood Test (Left-sided)
	PR_MOOD	^{[ *]}	p -value, Mood Test (Right-sided)
	P2_MOOD	^{[ *]}	p -value, Mood Test (Two-sided)
	XPL_MOOD	^{[ *]}	Exact p -value, Mood Test (Left-sided)
	XPR_MOOD	^{[ *]}	Exact p -value, Mood Test (Right-sided)
	XPT_MOOD	^{[ *]}	Exact Point Probability, Mood Test
	XP2_MOOD	^{[ *]}	Exact p -value, Mood Test (Two-sided)
	_CHMOOD_		Mood Chi-square
	DF_CHMOO		Degrees of Freedom, Mood Chi-square
	P_CHMOOD		p -value, Mood Chi-square Test
	XP_CHMOO	^{[ **]}	Exact p -value, Mood Chi-square
	XPT_CHMO	^{[ **]}	Exact Point Probability, Mood Chi-square
SCORES=DATA	_DATA_	^{[ *]}	Two-sample Data Scores Statistic
	Z_DATA	^{[ *]}	Data Scores Statistic, Standardized
	PL_DATA	^{[ *]}	p -value, Data Scores Test (Left-sided)
	PR_DATA	^{[ *]}	p -value, Data Scores Test (Right-sided)
	P2_DATA	^{[ *]}	p -value, Data Scores Test (Two-sided)
	XPL_DATA	^{[ *]}	Exact p -value, Data Scores Test (Left-sided)
	XPR_DATA	^{[ *]}	Exact p -value, Data Scores Test (Right-sided)
	XPT_DATA	^{[ *]}	Exact Point Probability, Data Scores Test
	XP2_DATA	^{[ *]}	Exact p -value, Data Scores Test (Two-sided)
	_CHDATA_		Data Scores Chi-square
	DF_CHDAT		Degrees of Freedom, Data Scores Chi-square
	P_CHDATA		p -value, Data Scores Chi-square Test
	XP_CHDAT	^{[ **]}	Exact p -value, Data Scores Chi-square
	XPT_CHDA	^{[ **]}	Exact Point Probability, Data Scores Chi-square
EDF	_KS_		Kolmogorov-Smirnov Statistic
	_KSA_		Kolmogorov-Smirnov Statistic (Asymptotic)
	_Dp_	^{[ *]}	Two-sample Kolmogorov-Smirnov D+
	P_Dp	^{[ *]}	p -value, Kolmogorov-Smirnov D+
	_Dm_	^{[ *]}	Two-sample Kolmogorov-Smirnov D-
	P_Dm	^{[ *]}	p -value, Kolmogorov-Smirnov D-
	_D_	^{[ *]}	Two-sample Kolmogorov-Smirnov Statistic
	P_KSA	^{[ *]}	p -value, Two-sample Kolmogorov-Smirnov
	XP_Dp	^{[ *]}	Exact p -value, Kolmogorov-Smirnov D+
	XPT_Dp	^{[ *]}	Exact Point Probability, Kolmogorov-Smirnov D+
	XP_Dm	^{[ *]}	Exact p -value, Kolmogorov-Smirnov D-
	XPT_Dm	^{[ *]}	Exact Point Probability, Kolmogorov-Smirnov D-
	XP_D	^{[ *]}	Exact p -value, Kolmogorov-Smirnov D
	XPT_D	^{[ *]}	Exact Point Probability, Kolmogorov-Smirnov D
	_CM_		Cramer-von Mises Statistic
	_CMA_		Cramer-von Mises Statistic (Asymptotic)
	_K_	^{[ *]}	Kuiper Two-sample Statistic
	_KA_	^{[ *]}	Kuiper Two-sample Statistic (Asymptotic)
	P_KA	^{[ *]}	p -value, Two-sample Kuiper (Asymptotic)
^{[ *]} Statistic included only for two-sample cases
^{[ **]} Statistic included only for multisample cases

Displayed Output

If you specify the ANOVA option, PROC NPAR1WAY displays a Class Means table and an Analysis of Variance table for each response variable. The Class Means table includes the following information for each CLASS variable value, or level:

N, the number of observations
the Mean of the response variable

The Analysis of Variance table includes the following information for each Source of variation (Among classes, and Within classes):

DF, the degrees of freedom associated with the source
the Sum of Squares
the Mean Square, the sum of squares divided by the degrees of freedom

The Analysis of Variance table also includes the following:

the F Value for testing the hypothesis that the group means are equal. This is computed by dividing the Mean Square (Among) by the Mean Square (Within).
Pr > F, the significance probability corresponding to the F Value
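
The arithmetic behind the Analysis of Variance table can be sketched in a few lines of Python. This is only an illustration of the quantities listed above, not PROC NPAR1WAY's implementation; the function name `one_way_anova` and the sample data are hypothetical. Pr > F is omitted because it requires the F distribution, which the Python standard library does not provide.

```python
def one_way_anova(groups):
    """Among/Within rows and F Value for a dict mapping class level -> values."""
    all_values = [x for values in groups.values() for x in values]
    n = len(all_values)
    grand_mean = sum(all_values) / n
    class_means = {level: sum(v) / len(v) for level, v in groups.items()}

    # Among classes: class size times squared distance of the class mean
    # from the grand mean.
    ss_among = sum(
        len(v) * (class_means[level] - grand_mean) ** 2
        for level, v in groups.items()
    )
    # Within classes: squared distance of each value from its class mean.
    ss_within = sum(
        (x - class_means[level]) ** 2
        for level, v in groups.items()
        for x in v
    )
    df_among = len(groups) - 1
    df_within = n - len(groups)
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    return {
        "Among": {"DF": df_among, "Sum of Squares": ss_among, "Mean Square": ms_among},
        "Within": {"DF": df_within, "Sum of Squares": ss_within, "Mean Square": ms_within},
        "F Value": ms_among / ms_within,
    }


# Hypothetical data with two class levels.
table = one_way_anova({"A": [1, 2, 3], "B": [4, 5, 6]})
print(table["F Value"])  # 13.5
```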

For each score type that you specify, PROC NPAR1WAY displays a Class Scores table. The available score types include Wilcoxon, median, Van der Waerden, Savage, Siegel-Tukey, Ansari-Bradley, Klotz, Mood, and raw data scores. PROC NPAR1WAY assigns the specified scores to the response variable values, and classifies them according to the CLASS variable values. The Class Scores table includes the following information for each class:

N, the number of observations
Sum of Scores
Expected Under H0, the expected sum of scores under the null hypothesis of no difference among classes
Std Dev Under H0, the standard deviation under the null hypothesis
Mean Score
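
As a rough illustration of how the Class Scores columns relate to one another, the following Python sketch builds the table for Wilcoxon scores (ranks, with tied values sharing the average rank). It assumes the standard permutation moments of a sum of scores: the expected sum for a class of size n_i is n_i times the overall mean score, and its variance is n_i (n - n_i) / n times the sample variance of the scores. The function names and data are hypothetical, and the procedure's own formulas are given in the Details sections of this chapter.

```python
import math


def average_ranks(values):
    """Wilcoxon scores: ranks 1..n, with tied values sharing the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            scores[order[k]] = average
        start = end + 1
    return scores


def class_scores(values, classes):
    """Class Scores table for Wilcoxon scores, keyed by class level."""
    scores = average_ranks(values)
    n = len(scores)
    mean_score = sum(scores) / n
    variance = sum((a - mean_score) ** 2 for a in scores) / (n - 1)
    table = {}
    for level in dict.fromkeys(classes):  # levels in order of first appearance
        s = [a for a, c in zip(scores, classes) if c == level]
        n_i = len(s)
        table[level] = {
            "N": n_i,
            "Sum of Scores": sum(s),
            "Expected Under H0": n_i * mean_score,
            "Std Dev Under H0": math.sqrt(n_i * (n - n_i) / n * variance),
            "Mean Score": sum(s) / n_i,
        }
    return table


# Hypothetical data; the two values of 2 are tied and share rank 2.5.
table = class_scores([1, 2, 2, 4, 5, 6], ["A", "A", "A", "B", "B", "B"])
print(table["A"])
```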

When there are only two levels of the CLASS variable, PROC NPAR1WAY displays the following Two-Sample Test results for each analysis of scores:

Statistic, which is the sum of scores for the class with the smaller sample size
Z, the standardized test statistic, which has an asymptotic standard normal distribution under the null hypothesis
One-Sided Pr < Z, or One-Sided Pr > Z, the asymptotic one-sided p-value, displayed as Pr < Z or Pr > Z, depending on whether Z is <= 0 or > 0
Two-Sided Pr > |Z|, the asymptotic two-sided p-value
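
The relationship between the statistic, Z, and the asymptotic p-values can be sketched as follows. This is a minimal illustration that standardizes the statistic with its null mean and standard deviation and uses the standard normal distribution; it does not apply any continuity correction that the procedure may use for particular score types, and the function names are hypothetical.

```python
import math


def normal_upper_tail(z):
    """Pr(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_sample_z(statistic, expected, std_dev):
    """Return Z, the labeled one-sided p-value, and the two-sided p-value."""
    z = (statistic - expected) / std_dev
    if z <= 0:
        # Lower tail: Pr(Z < z) equals the upper tail of -z by symmetry.
        one_sided = ("Pr < Z", normal_upper_tail(-z))
    else:
        one_sided = ("Pr > Z", normal_upper_tail(z))
    two_sided = 2 * normal_upper_tail(abs(z))
    return z, one_sided, two_sided


# Hypothetical inputs: sum of scores 6.0, null mean 10.5, null variance 5.1.
z, one_sided, two_sided = two_sample_z(6.0, 10.5, math.sqrt(5.1))
print(z, one_sided, two_sided)
```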

For Wilcoxon scores, PROC NPAR1WAY also displays a t-approximation for the two-sample test.

If you request an exact test by specifying the score type in the EXACT statement, PROC NPAR1WAY displays the following exact p-values for two-sample data:

One-Sided Pr <= S, or One-Sided Pr >= S, the one-sided exact p-value, displayed as Pr <= S or Pr >= S, depending on whether S <= Mean or S > Mean, where S is the test statistic and Mean is its expected value under the null hypothesis
Point Pr = S, the point probability, if you specify the POINT option in the EXACT statement
Two-Sided Pr >= |S - Mean|, the two-sided exact p-value
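
For very small samples, the exact p-values listed above can be illustrated by brute-force enumeration of every way to assign the scores to the first class. The sketch below is an assumption-laden illustration (equally likely assignments, a hypothetical function name, and a small tolerance for floating-point comparisons); it is not the algorithm PROC NPAR1WAY uses, which is described in the section Exact Tests.

```python
from itertools import combinations


def exact_two_sample(scores, n1, observed):
    """Exact p-values for the sum of scores S of a class with n1 observations."""
    mean = n1 * sum(scores) / len(scores)  # expected value of S under H0
    tol = 1e-9
    total = lower = upper = point = two_sided = 0
    for subset in combinations(scores, n1):
        s = sum(subset)
        total += 1
        if s <= observed + tol:
            lower += 1
        if s >= observed - tol:
            upper += 1
        if abs(s - observed) <= tol:
            point += 1
        if abs(s - mean) >= abs(observed - mean) - tol:
            two_sided += 1
    # Report the lower tail when S <= Mean, otherwise the upper tail.
    if observed <= mean:
        one_sided = ("Pr <= S", lower / total)
    else:
        one_sided = ("Pr >= S", upper / total)
    return {
        "One-Sided": one_sided,
        "Point Pr = S": point / total,
        "Two-Sided Pr >= |S - Mean|": two_sided / total,
    }


# Hypothetical example: ranks 1..6, three observations per class, S = 1 + 2 + 3.
print(exact_two_sample([1, 2, 3, 4, 5, 6], 3, 6))
```

Full enumeration grows combinatorially with the sample sizes, which is one reason Monte Carlo estimation is offered as an alternative.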

If you request Monte Carlo estimates for the exact test by specifying the MC option in the EXACT statement, PROC NPAR1WAY displays the following information for two-sample data:

Estimate of One-Sided Pr <= S or One-Sided Pr >= S, the one-sided exact p-value, together with its Lower and Upper Confidence Limits
Estimate of Two-Sided Pr >= |S - Mean|, the two-sided exact p-value, together with its Lower and Upper Confidence Limits
Number of Samples used to compute the Monte Carlo estimates
Initial Seed used to compute the Monte Carlo estimates
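
The idea behind the Monte Carlo estimates can be sketched as follows: draw random assignments of the scores to the first class, count how often the simulated statistic is at least as extreme as the observed one, and attach a confidence interval to the resulting proportion. The sketch assumes a normal-approximation interval for a binomial proportion; the function name, the default number of samples, the seed, and the 99% critical value are illustrative choices, not documented behavior of the procedure.

```python
import math
import random


def monte_carlo_one_sided(scores, n1, observed, n_samples=10000, seed=12345,
                          z_crit=2.5758):
    """Estimate the one-sided exact p-value with confidence limits."""
    rng = random.Random(seed)  # fixed seed makes the estimate reproducible
    mean = n1 * sum(scores) / len(scores)
    lower_tail = observed <= mean
    tol = 1e-9
    hits = 0
    for _ in range(n_samples):
        s = sum(rng.sample(scores, n1))  # random assignment to the first class
        if lower_tail and s <= observed + tol:
            hits += 1
        elif not lower_tail and s >= observed - tol:
            hits += 1
    estimate = hits / n_samples
    half_width = z_crit * math.sqrt(estimate * (1 - estimate) / (n_samples - 1))
    return estimate, max(0.0, estimate - half_width), min(1.0, estimate + half_width)


# Hypothetical example; the exact one-sided p-value for this case is 0.05.
estimate, lower, upper = monte_carlo_one_sided([1, 2, 3, 4, 5, 6], 3, 6)
print(estimate, lower, upper)
```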

For both two-sample and multisample data, PROC NPAR1WAY displays the following One-Way Analysis for each score type:

Chi-Square, the one-way ANOVA statistic for testing the null hypothesis of no difference among classes
DF, the degrees of freedom
Pr > Chi-Square, the asymptotic p-value
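
A one-way statistic of this kind can be sketched as the sum over classes of the squared deviation of each class's sum of scores from its null expectation, divided by the class size, and then scaled by the sample variance of the scores. With Wilcoxon scores this form reduces to the Kruskal-Wallis statistic. The code below is an illustration under that assumption, with a hypothetical function name; the p-value is omitted because it requires the chi-square distribution.

```python
def one_way_chi_square(scores, classes):
    """Return (Chi-Square, DF) for scores grouped by class level."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((a - mean) ** 2 for a in scores) / (n - 1)
    sums, sizes = {}, {}
    for a, c in zip(scores, classes):
        sums[c] = sums.get(c, 0.0) + a
        sizes[c] = sizes.get(c, 0) + 1
    # Squared deviation of each class sum from its expectation n_i * mean,
    # divided by the class size n_i.
    numerator = sum((sums[c] - sizes[c] * mean) ** 2 / sizes[c] for c in sums)
    return numerator / variance, len(sums) - 1


# Hypothetical example: ranks 1..6 in three classes of two observations each.
chi_square, df = one_way_chi_square([1, 2, 3, 4, 5, 6], ["A", "A", "B", "B", "C", "C"])
print(chi_square, df)
```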

For multisample data, if you request an exact test by specifying the score type in the EXACT statement, PROC NPAR1WAY also displays the exact p-value as follows:

Exact Pr >= Chi-Square
Exact Pr = Chi-Square, the point probability, if you specify the POINT option in the EXACT statement

For multisample data, if you request a Monte Carlo estimate for the exact test by specifying the MC option in the EXACT statement, PROC NPAR1WAY displays the following information:

Estimate of Pr >= Chi-Square, together with its Lower and Upper Confidence Limits
Number of Samples used to compute the Monte Carlo estimate
Initial Seed used to compute the Monte Carlo estimate

If you specify the EDF option, PROC NPAR1WAY produces tables for the Kolmogorov-Smirnov Test, the Cramer-von Mises Test, and, for two-sample data only, the Kuiper Test. The Kolmogorov-Smirnov Test table includes the following information for each CLASS variable value, or level:

N, the number of observations
EDF at Maximum, the value of the class EDF (empirical distribution function) at its maximum deviation from the pooled EDF
Deviation from Mean at Maximum, the value of sqrt(n_i) (F_i - F) at its maximum, where n_i is the class sample size, F_i is the class EDF, and F is the pooled EDF

PROC NPAR1WAY displays the following Kolmogorov-Smirnov statistics:

KS, the Kolmogorov-Smirnov statistic
KSa, the asymptotic Kolmogorov-Smirnov statistic, where KSa = KS sqrt(n)

For two-sample data, PROC NPAR1WAY displays the following Kolmogorov-Smirnov statistics:

Pr > KSa, the asymptotic p-value for KSa, which equals Pr > D
D = max |F1 - F2|, the two-sample Kolmogorov-Smirnov statistic

For two-sample data, if you specify the D option, PROC NPAR1WAY also displays the following one-sided Kolmogorov-Smirnov statistics and their asymptotic p-values:

D+ = max(F1 - F2)
Pr > D+
D- = max(F2 - F1)
Pr > D-
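
The Kolmogorov-Smirnov quantities can be illustrated with a short Python sketch that evaluates the empirical distribution functions at every distinct pooled value. It assumes the definitions KS = max over x of sqrt((1/n) * sum of n_i (F_i(x) - F(x))^2) and KSa = KS sqrt(n); see the section Tests Based on the Empirical Distribution Function for the formulas the procedure actually uses. The function names and data are hypothetical.

```python
import math


def edf(sample, x):
    """Empirical distribution function: proportion of the sample <= x."""
    return sum(v <= x for v in sample) / len(sample)


def ks_two_sample(sample1, sample2):
    """Two-sample Kolmogorov-Smirnov statistics D, D+, D-, KS, and KSa."""
    pooled = list(sample1) + list(sample2)
    n1, n2, n = len(sample1), len(sample2), len(pooled)
    points = sorted(set(pooled))
    d_plus = max(edf(sample1, x) - edf(sample2, x) for x in points)
    d_minus = max(edf(sample2, x) - edf(sample1, x) for x in points)
    # Pooled form of the statistic, written so it extends to more classes.
    ks = max(
        math.sqrt(
            (n1 * (edf(sample1, x) - edf(pooled, x)) ** 2
             + n2 * (edf(sample2, x) - edf(pooled, x)) ** 2) / n
        )
        for x in points
    )
    return {
        "D": max(d_plus, d_minus),
        "D+": d_plus,
        "D-": d_minus,
        "KS": ks,
        "KSa": ks * math.sqrt(n),
    }


# Hypothetical, fully separated samples: F1 reaches 1 before F2 leaves 0.
print(ks_two_sample([1, 2, 3], [4, 5, 6]))
```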

For two-sample data, if you request an exact Kolmogorov-Smirnov test by specifying the KS option in the EXACT statement, PROC NPAR1WAY displays the following exact p-values:

Exact Pr >= D
Exact Pr >= D+
Exact Pr >= D-
Exact Point Pr = D, Exact Point Pr = D+, and Exact Point Pr = D-, if you specify the POINT option in the EXACT statement

If you request Monte Carlo estimates for the two-sample exact Kolmogorov-Smirnov test, PROC NPAR1WAY displays the following information for two-sample data:

Estimate of Pr >= D, together with its Lower and Upper Confidence Limits
Estimate of Pr >= D+, together with its Lower and Upper Confidence Limits
Estimate of Pr >= D-, together with its Lower and Upper Confidence Limits
Number of Samples used to compute the Monte Carlo estimates
Initial Seed used to compute the Monte Carlo estimates

The Cramer-von Mises Test table includes the following information for each CLASS variable value, or level:

N, the number of observations
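Summed Deviation from Mean, which is (n_i / n) Σ_j t_j (F_i(x_j) - F(x_j))^2, where t_j is the number of tied observations at the jth distinct value x_j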

PROC NPAR1WAY also displays the following Cramer-von Mises statistics:

CM, the Cramer-von Mises statistic
CMa, the asymptotic Cramer-von Mises statistic, where CMa = n CM
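
A minimal sketch of the Cramer-von Mises computation follows. It assumes the definition CM = (1/n^2) * sum over classes of n_i * sum over distinct values x_j of t_j (F_i(x_j) - F(x_j))^2, where t_j counts the observations tied at x_j, so that the per-class summed deviations add up to CMa = n CM. Consult the section Tests Based on the Empirical Distribution Function for the procedure's own formulas; the function names and data here are hypothetical.

```python
from collections import Counter


def edf(sample, x):
    """Empirical distribution function: proportion of the sample <= x."""
    return sum(v <= x for v in sample) / len(sample)


def cramer_von_mises(samples):
    """Per-class summed deviations plus CM and CMa for a list of samples."""
    pooled = [v for s in samples for v in s]
    n = len(pooled)
    ties = Counter(pooled)  # t_j: number of observations at each distinct value
    summed = []
    for s in samples:
        deviation = sum(
            t * (edf(s, x) - edf(pooled, x)) ** 2 for x, t in ties.items()
        )
        summed.append(len(s) / n * deviation)
    cma = sum(summed)
    return {"Summed Deviation from Mean": summed, "CM": cma / n, "CMa": cma}


# Hypothetical example with two classes of three observations each.
print(cramer_von_mises([[1, 2, 3], [4, 5, 6]]))
```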

For two-sample data, PROC NPAR1WAY displays the Kuiper Test table, which includes the following information for each class:

N, the number of observations
Deviation from Mean, which is max_j |F_1(x_j) - F_2(x_j)|

PROC NPAR1WAY also displays the following Kuiper two-sample test statistics:

K, the Kuiper two-sample test statistic
Ka, the asymptotic Kuiper two-sample test statistic, where Ka = K sqrt(n1 n2 / n), n1 and n2 are the class sample sizes, and n = n1 + n2
Pr > Ka
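
The Kuiper statistics can be illustrated in the same way as the other EDF statistics. The sketch below assumes the usual two-sample definition K = max_j (F1(x_j) - F2(x_j)) - min_j (F1(x_j) - F2(x_j)) and the scaling Ka = K sqrt(n1 n2 / n); the function names and data are hypothetical, and the asymptotic p-value is not computed.

```python
import math


def edf(sample, x):
    """Empirical distribution function: proportion of the sample <= x."""
    return sum(v <= x for v in sample) / len(sample)


def kuiper_two_sample(sample1, sample2):
    """Return the Kuiper statistic K and its asymptotic form Ka."""
    n1, n2 = len(sample1), len(sample2)
    points = sorted(set(sample1) | set(sample2))
    differences = [edf(sample1, x) - edf(sample2, x) for x in points]
    # K adds the largest excess of F1 over F2 to the largest excess of F2 over F1.
    k = max(differences) - min(differences)
    ka = k * math.sqrt(n1 * n2 / (n1 + n2))
    return k, ka


# Hypothetical interleaved samples: F1 - F2 takes the values 0.5, 0, -0.5, 0.
print(kuiper_two_sample([1, 4], [2, 3]))
```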

ODS Table Names

PROC NPAR1WAY assigns a name to each table it creates. You can use these names to reference the table when using the Output Delivery System (ODS) to select tables and create output data sets. These names are listed in the following table. For more information on ODS, see Chapter 14, Using the Output Delivery System.

The WILCOXON, MEDIAN, VW, SAVAGE, and EDF options are the default if you do not specify any analysis options in the PROC NPAR1WAY statement.

Table 52.6: ODS Tables Produced in PROC NPAR1WAY
ODS Table Name	Description	Statement	Option
ANOVA	Analysis of variance	PROC	ANOVA
ABAnalysis	Ansari-Bradley one-way analysis	PROC	AB
ABMC	Monte Carlo estimates for the Ansari-Bradley exact test	EXACT	AB / MC
ABScores	Ansari-Bradley scores	PROC	AB
ABTest	Ansari-Bradley two-sample test	PROC	AB ^{[ *]}
ClassMeans	Class Means	PROC	ANOVA
CVMStats	Cramer-von Mises statistics	PROC	EDF
CVMTest	Cramer-von Mises test	PROC	EDF
DataScores	Data scores	PROC	SCORES=DATA
DataScoresAnalysis	Data scores one-way analysis	PROC	SCORES=DATA
DataScoresMC	Monte Carlo estimates for the exact test based on data scores	EXACT	SCORES=DATA / MC
DataScoresTest	Data scores two-sample test	PROC	SCORES=DATA ^{[ *]}
KlotzAnalysis	Klotz one-way analysis	PROC	KLOTZ
KlotzMC	Monte Carlo estimates for the Klotz exact test	EXACT	KLOTZ / MC
KlotzScores	Klotz scores	PROC	KLOTZ
KlotzTest	Klotz two-sample test	PROC	KLOTZ ^{[ *]}
KolSmirExactTest	Kolmogorov-Smirnov exact test	EXACT	KS ^{[ *]}
KolSmir2Stats	Kolmogorov-Smirnov two-sample statistics	PROC	EDF ^{[ *]}
KolSmirStats	Kolmogorov-Smirnov statistics	PROC	EDF ^{[ **]}
KolSmirTest	Kolmogorov-Smirnov test	PROC	EDF
KruskalWallisMC	Monte Carlo estimates for the Kruskal-Wallis exact test	EXACT	WILCOXON / MC ^{[ **]}
KruskalWallisTest	Kruskal-Wallis test	PROC	WILCOXON
KSMC	Monte Carlo estimates for the Kolmogorov-Smirnov exact test	EXACT	KS / MC ^{[ *]}
KuiperStats	Kuiper two-sample statistics	PROC	EDF ^{[ *]}
KuiperTest	Kuiper test	PROC	EDF ^{[ *]}
MedianAnalysis	Median one-way analysis	PROC	MEDIAN
MedianMC	Monte Carlo estimates for the median exact test	EXACT	MEDIAN / MC
MedianScores	Median scores	PROC	MEDIAN
MedianTest	Median two-sample test	PROC	MEDIAN ^{[ *]}
MoodAnalysis	Mood one-way analysis	PROC	MOOD
MoodMC	Monte Carlo estimates for the Mood exact test	EXACT	MOOD / MC
MoodScores	Mood scores	PROC	MOOD
MoodTest	Mood two-sample test	PROC	MOOD ^{[ *]}
SavageAnalysis	Savage one-way analysis	PROC	SAVAGE
SavageMC	Monte Carlo estimates for the Savage exact test	EXACT	SAVAGE / MC
SavageScores	Savage scores	PROC	SAVAGE
SavageTest	Savage two-sample test	PROC	SAVAGE ^{[ *]}
STAnalysis	Siegel-Tukey one-way analysis	PROC	ST
STMC	Monte Carlo estimates for the Siegel-Tukey exact test	EXACT	ST / MC
STScores	Siegel-Tukey scores	PROC	ST
STTest	Siegel-Tukey two-sample test	PROC	ST ^{[ *]}
VWAnalysis	Van der Waerden one-way analysis	PROC	VW
VWMC	Monte Carlo estimates for the Van der Waerden exact test	EXACT	VW / MC
VWScores	Van der Waerden scores	PROC	VW
VWTest	Van der Waerden two-sample test	PROC	VW ^{[ *]}
WilcoxonMC	Monte Carlo estimates for the Wilcoxon two-sample exact test	EXACT	WILCOXON / MC ^{[ *]}
WilcoxonScores	Wilcoxon scores	PROC	WILCOXON
WilcoxonTest	Wilcoxon two-sample test	PROC	WILCOXON ^{[ *]}
^{[ *]} PROC NPAR1WAY produces this table only for two-sample data.
^{[ **]} PROC NPAR1WAY produces this table only for multisample data.