This example illustrates how you can use PROC NPAR1WAY to perform a one-way nonparametric analysis. The data from Halverson and Sherwood (1932) consist of weight gain measurements for five different levels of gossypol additive. Gossypol is a substance contained in cottonseed shells, and these data were collected to study the effect of gossypol on animal nutrition.
The following DATA step statements create the SAS data set Gossypol:
data Gossypol;
   input Dose n;
   do i=1 to n;
      input Gain @@;
      output;
   end;
   datalines;
0 16
228 229 218 216 224 208 235 229 233 219 224 220 232 200 208 232
.04 11
186 229 220 208 228 198 222 273 216 198 213
.07 12
179 193 183 180 143 204 114 188 178 134 208 196
.10 17
130 87 135 116 118 165 151 59 126 64 78 94 150 160 122 110 178
.13 11
154 130 130 118 118 104 112 134 98 100 104
;

The data set Gossypol contains the variable Dose, which represents the amount of gossypol additive, and the variable Gain, which represents the weight gain.
Researchers are interested in whether there is a difference in weight gain among the different dose levels of gossypol. The following statements invoke the NPAR1WAY procedure to perform a nonparametric analysis of this problem:
proc npar1way data=Gossypol;
   class Dose;
   var Gain;
run;

The CLASS statement names Dose as the classification variable, and the VAR statement names Gain as the response variable. The CLASS statement is required, and it must name exactly one CLASS variable. You can name one or more analysis variables in the VAR statement. If you omit the VAR statement, PROC NPAR1WAY analyzes all numeric variables in the data set except for the CLASS variable, the FREQ variable, and the BY variables.
Since no analysis options are specified in the PROC NPAR1WAY statement, the ANOVA, WILCOXON, MEDIAN, VW, SAVAGE, and EDF options are invoked by default. The following tables show the results of these analyses.
The NPAR1WAY Procedure

Analysis of Variance for Variable Gain
Classified by Variable Dose

Dose      N          Mean
--------------------------------------
0        16    222.187500
0.04     11    217.363636
0.07     12    175.000000
0.1      17    120.176471
0.13     11    118.363636

Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
-------------------------------------------------------------------
Among      4     140082.986077    35020.74652    55.8143    <.0001
Within    62      38901.998997      627.45160

Average scores were used for ties.
Wilcoxon Scores (Rank Sums) for Variable Gain
Classified by Variable Dose

                Sum of    Expected      Std Dev         Mean
Dose      N     Scores    Under H0     Under H0        Score
--------------------------------------------------------------------
0        16     890.50       544.0    67.978966    55.656250
0.04     11     555.00       374.0    59.063588    50.454545
0.07     12     395.50       408.0    61.136622    32.958333
0.1      17     275.50       578.0    69.380741    16.205882
0.13     11     161.50       374.0    59.063588    14.681818

Average scores were used for ties.

Kruskal-Wallis Test

Chi-Square           52.6656
DF                         4
Pr > Chi-Square       <.0001
Next PROC NPAR1WAY displays the one-way ANOVA statistic, which for Wilcoxon scores is known as the Kruskal-Wallis test. The statistic equals 52.6656, with four degrees of freedom, which is the number of class levels minus one. The p-value, or probability of a larger statistic under the null hypothesis, is <.0001. This leads to rejection of the null hypothesis that there is no difference in location for Gain among the levels of Dose. This p-value is asymptotic, computed from the asymptotic chi-square distribution of the test statistic. For certain data sets it may also be useful to compute the exact p-value, for example, for small data sets or for data sets that are sparse, skewed, or heavily tied. You can use the EXACT statement to request exact p-values for any of the location or scale tests available in PROC NPAR1WAY.
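As an illustration of the EXACT statement, the following statements sketch how an exact p-value for the Kruskal-Wallis test might be requested for the Gossypol data. Because exact computation for 67 observations in five groups can be expensive, this sketch assumes that the MC option (Monte Carlo estimation of the exact p-value) and its N= and SEED= suboptions are available in your release; the sample count and seed shown are arbitrary choices, not values from this example.

```sas
/* Sketch only: Monte Carlo estimate of the exact Kruskal-Wallis p-value. */
/* N=10000 and SEED=12345 are arbitrary illustrative values.              */
proc npar1way data=Gossypol wilcoxon;
   class Dose;
   var Gain;
   exact wilcoxon / mc n=10000 seed=12345;
run;
```

Dropping the MC option requests the exact computation itself, which is usually practical only for small samples.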
Figure 52.3 through Figure 52.5 display the analyses produced by the MEDIAN, VW, and SAVAGE options. For each score type, PROC NPAR1WAY provides a summary of scores and the one-way ANOVA statistic, as previously described for Wilcoxon scores. Other score types available in PROC NPAR1WAY are Siegel-Tukey, Ansari-Bradley, Klotz, and Mood, which are used to test for scale differences. Additionally, you can request the SCORES=DATA option, which uses the input data as scores. This option gives you the flexibility to construct any scores for your data with the DATA step and then analyze these scores with PROC NPAR1WAY.
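The SCORES=DATA option can be illustrated with a small sketch. The data set name GossypolScores, the variable Score, and the square-root transformation are hypothetical choices made only for this illustration; any scores computed in a DATA step could be used in the same way.

```sas
/* Sketch only: construct custom scores in a DATA step.          */
/* GossypolScores and Score are hypothetical names, and the      */
/* square-root transformation is an arbitrary example of a score. */
data GossypolScores;
   set Gossypol;
   Score = sqrt(Gain);
run;

/* Analyze the constructed scores directly instead of ranks. */
proc npar1way data=GossypolScores scores=data;
   class Dose;
   var Score;
run;
```

With SCORES=DATA, the values of the analysis variable are used as the scores, so the one-way statistic is computed from Score as given.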
Median Scores (Number of Points Above Median) for Variable Gain
Classified by Variable Dose

                Sum of    Expected     Std Dev     Mean
Dose      N     Scores    Under H0    Under H0    Score
--------------------------------------------------------------------
0        16       16.0    7.880597    1.757902     1.00
0.04     11       11.0    5.417910    1.527355     1.00
0.07     12        6.0    5.910448    1.580963     0.50
0.1      17        0.0    8.373134    1.794152     0.00
0.13     11        0.0    5.417910    1.527355     0.00

Average scores were used for ties.

Median One-Way Analysis

Chi-Square           54.1765
DF                         4
Pr > Chi-Square       <.0001
Van der Waerden Scores (Normal) for Variable Gain
Classified by Variable Dose

                    Sum of    Expected     Std Dev         Mean
Dose      N         Scores    Under H0    Under H0        Score
--------------------------------------------------------------------
0        16      16.116474         0.0    3.325957     1.007280
0.04     11       8.340899         0.0    2.889761     0.758264
0.07     12      -0.576674         0.0    2.991186    -0.048056
0.1      17     -14.688921         0.0    3.394540    -0.864054
0.13     11      -9.191777         0.0    2.889761    -0.835616

Average scores were used for ties.

Van der Waerden One-Way Analysis

Chi-Square           47.2972
DF                         4
Pr > Chi-Square       <.0001
Savage Scores (Exponential) for Variable Gain
Classified by Variable Dose

                    Sum of    Expected     Std Dev         Mean
Dose      N         Scores    Under H0    Under H0        Score
--------------------------------------------------------------------
0        16      16.074391         0.0    3.385275     1.004649
0.04     11       7.693099         0.0    2.941300     0.699373
0.07     12      -3.584958         0.0    3.044534    -0.298746
0.1      17     -11.979488         0.0    3.455082    -0.704676
0.13     11      -8.203044         0.0    2.941300    -0.745731

Average scores were used for ties.

Savage One-Way Analysis

Chi-Square           39.4908
DF                         4
Pr > Chi-Square       <.0001
Kolmogorov-Smirnov Test for Variable Gain
Classified by Variable Dose

                    EDF at    Deviation from Mean
Dose       N       Maximum             at Maximum
-------------------------------------------------
0         16      0.000000              -1.910448
0.04      11      0.000000              -1.584060
0.07      12      0.333333              -0.499796
0.1       17      1.000000               2.153861
0.13      11      1.000000               1.732565
Total     67      0.477612

Maximum Deviation Occurred at Observation 36
Value of Gain at Maximum = 178.0

Kolmogorov-Smirnov Statistics (Asymptotic)
KS   0.457928      KSa   3.748300

Cramer-von Mises Test for Variable Gain
Classified by Variable Dose

                Summed Deviation
Dose      N            from Mean
---------------------------------------
0        16             2.165210
0.04     11             0.918280
0.07     12             0.348227
0.1      17             1.497542
0.13     11             1.335745

Cramer-von Mises Statistics (Asymptotic)
CM   0.093508      CMa   6.265003
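The full set of tables appears because no analysis option was named in the PROC NPAR1WAY statement. When only one analysis is of interest, naming its option should restrict the output to that analysis. The following statements are a minimal sketch, reusing the Gossypol data set, that requests only the Wilcoxon scores and the Kruskal-Wallis test:

```sas
/* Sketch only: request a single analysis instead of the defaults. */
/* WILCOXON is one of the analysis options listed in this example. */
proc npar1way data=Gossypol wilcoxon;
   class Dose;
   var Gain;
run;
```

Any of the other options named in this example (ANOVA, MEDIAN, VW, SAVAGE, or EDF) can be substituted or combined in the same position.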
In the preceding example, the CLASS variable Dose has five levels, and the analyses examine possible differences among these five levels, or samples. The following statements invoke the NPAR1WAY procedure to perform a nonparametric analysis of the two lowest levels of Dose:

proc npar1way data=Gossypol;
   where Dose <= .04;
   class Dose;
   var Gain;
run;
The following tables show the results of this two-sample analysis. The tables in Figure 52.7 are produced by the ANOVA option.
The NPAR1WAY Procedure

Analysis of Variance for Variable Gain
Classified by Variable Dose

Dose      N          Mean
--------------------------------------
0        16    222.187500
0.04     11    217.363636

Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
-------------------------------------------------------------------
Among      1        151.683712     151.683712     0.5587    0.4617
Within    25       6786.982955     271.479318

Average scores were used for ties.
Wilcoxon Scores (Rank Sums) for Variable Gain
Classified by Variable Dose

                Sum of    Expected      Std Dev         Mean
Dose      N     Scores    Under H0     Under H0        Score
--------------------------------------------------------------------
0        16     253.50       224.0    20.221565    15.843750
0.04     11     124.50       154.0    20.221565    11.318182

Average scores were used for ties.

Wilcoxon Two-Sample Test

Statistic               124.5000

Normal Approximation
Z                        -1.4341
One-Sided Pr <  Z         0.0758
Two-Sided Pr > |Z|        0.1515

t Approximation
One-Sided Pr <  Z         0.0817
Two-Sided Pr > |Z|        0.1635

Z includes a continuity correction of 0.5.

Kruskal-Wallis Test

Chi-Square            2.1282
DF                         1
Pr > Chi-Square       0.1446
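The normal approximation for the Wilcoxon two-sample test can be checked by hand from the score summary: the rank sum for the smaller sample (124.5) is compared with its expected value (154.0), a continuity correction of 0.5 is applied toward the expected value, and the result is divided by the standard deviation under H0 (20.221565). The following DATA step is only an illustrative sketch of that arithmetic; the variable names are hypothetical, and the numbers are copied from the table above. Because the rank sum falls below its expectation, the computed Z is negative, and its magnitude should agree with the reported value.

```sas
/* Sketch only: reproduce the normal approximation by hand.       */
/* S, ExpS, and SD are hypothetical names for values taken from   */
/* the Wilcoxon score summary for Dose = 0.04.                    */
data _null_;
   S    = 124.5;       /* sum of scores for the smaller sample    */
   ExpS = 154.0;       /* expected sum of scores under H0         */
   SD   = 20.221565;   /* standard deviation under H0             */

   /* The continuity correction moves S toward ExpS by 0.5. */
   if S < ExpS then Z = (S - ExpS + 0.5) / SD;
   else if S > ExpS then Z = (S - ExpS - 0.5) / SD;
   else Z = 0;

   OneSided = probnorm(-abs(Z));   /* lower-tail normal probability */
   TwoSided = 2 * OneSided;
   put Z= 8.4 OneSided= 6.4 TwoSided= 6.4;
run;
```

The PUT statement writes the three values to the SAS log, where they can be compared with the Normal Approximation rows of the table.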
Figure 52.9 through Figure 52.11 display the two-sample analyses produced by the MEDIAN, VW, and SAVAGE options.
Median Scores (Number of Points Above Median) for Variable Gain
Classified by Variable Dose

                Sum of    Expected     Std Dev        Mean
Dose      N     Scores    Under H0    Under H0       Score
--------------------------------------------------------------------
0        16        9.0    7.703704    1.299995    0.562500
0.04     11        4.0    5.296296    1.299995    0.363636

Average scores were used for ties.

Median Two-Sample Test

Statistic                 4.0000
Z                        -0.9972
One-Sided Pr <  Z         0.1593
Two-Sided Pr > |Z|        0.3187

Median One-Way Analysis

Chi-Square            0.9943
DF                         1
Pr > Chi-Square       0.3187
Van der Waerden Scores (Normal) for Variable Gain
Classified by Variable Dose

                   Sum of    Expected     Std Dev         Mean
Dose      N        Scores    Under H0    Under H0        Score
--------------------------------------------------------------------
0        16      3.346520         0.0    2.320336     0.209157
0.04     11     -3.346520         0.0    2.320336    -0.304229

Average scores were used for ties.

Van der Waerden Two-Sample Test

Statistic                -3.3465
Z                        -1.4423
One-Sided Pr <  Z         0.0746
Two-Sided Pr > |Z|        0.1492

Van der Waerden One-Way Analysis

Chi-Square            2.0801
DF                         1
Pr > Chi-Square       0.1492
Savage Scores (Exponential) for Variable Gain
Classified by Variable Dose

                   Sum of    Expected     Std Dev         Mean
Dose      N        Scores    Under H0    Under H0        Score
--------------------------------------------------------------------
0        16      1.834554         0.0    2.401839     0.114660
0.04     11     -1.834554         0.0    2.401839    -0.166778

Average scores were used for ties.

Savage Two-Sample Test

Statistic                -1.8346
Z                        -0.7638
One-Sided Pr <  Z         0.2225
Two-Sided Pr > |Z|        0.4450

Savage One-Way Analysis

Chi-Square            0.5834
DF                         1
Pr > Chi-Square       0.4450
Kolmogorov-Smirnov Test for Variable Gain
Classified by Variable Dose

                    EDF at    Deviation from Mean
Dose       N       Maximum             at Maximum
--------------------------------------------------
0         16      0.250000              -0.481481
0.04      11      0.545455               0.580689
Total     27      0.370370

Maximum Deviation Occurred at Observation 4
Value of Gain at Maximum = 216.0

Kolmogorov-Smirnov Two-Sample Test (Asymptotic)
KS   0.145172      D          0.295455
KSa  0.754337      Pr > KSa   0.6199

Cramer-von Mises Test for Variable Gain
Classified by Variable Dose

                Summed Deviation
Dose      N            from Mean
----------------------------------------
0        16             0.098638
0.04     11             0.143474

Cramer-von Mises Statistics (Asymptotic)
CM   0.008967      CMa   0.242112

Kuiper Test for Variable Gain
Classified by Variable Dose

                Deviation
Dose      N     from Mean
------------------------------
0        16      0.090909
0.04     11      0.295455

Kuiper Two-Sample Test (Asymptotic)
K    0.386364      Ka    0.986440      Pr > Ka    0.8383