Example 52.1. Two-Sample Location Tests and EDF Statistics
Fifty-nine female patients with rheumatoid arthritis who participated in a clinical trial were assigned to two groups, active and placebo. The response status (excellent=5, good=4, moderate=3, fair=2, poor=1) of each patient was recorded.
The following SAS statements create the data set Arthritis, which contains the observed status values for all the patients. The variable Treatment denotes the treatment received by a patient, and the variable Response contains the response status of the patient. The variable Freq contains the frequency of the observation, which is the number of patients with that combination of Treatment and Response.
data Arthritis;
   input Treatment $ Response Freq @@;
   datalines;
Active 5 5 Active 4 11 Active 3 5 Active 2 1 Active 1 5
Placebo 5 2 Placebo 4 4 Placebo 3 7 Placebo 2 7 Placebo 1 12
;
PROC NPAR1WAY tests the null hypothesis that there is no difference in the patient response status against an alternative hypothesis that the patient response status differs in the two treatment groups. The WILCOXON option requests the Wilcoxon test for difference in location, and the MEDIAN option requests the median test for difference in location. The EDF option requests empirical distribution function statistics. The variable Treatment is the CLASS variable, and the VAR statement specifies that the variable Response is the response variable. The FREQ statement names Freq as the variable that holds the frequency of each observation.
proc npar1way wilcoxon median edf data=Arthritis;
   class Treatment;
   var Response;
   freq Freq;
run;
Output 52.1.1 shows the results of the Wilcoxon analysis. The Wilcoxon two-sample test statistic equals 999.0, which is the sum of the Wilcoxon scores for the smaller sample (Active). This sum is greater than 810.0, its expected value under the null hypothesis of no difference between the two samples Active and Placebo. The one-sided p-value is 0.0016, which shows that the patient response for the Active treatment is significantly higher than for the Placebo group.
Output 52.1.1: Wilcoxon Two-Sample Test

The NPAR1WAY Procedure

Wilcoxon Scores (Rank Sums) for Variable Response
Classified by Variable Treatment

                      Sum of     Expected      Std Dev         Mean
Treatment      N      Scores     Under H0     Under H0        Score
-------------------------------------------------------------------
Active        27       999.0        810.0    63.972744    37.000000
Placebo       32       771.0        960.0    63.972744    24.093750

Average scores were used for ties.

Wilcoxon Two-Sample Test

Statistic                999.0000

Normal Approximation
Z                          2.9466
One-Sided Pr >  Z          0.0016
Two-Sided Pr > |Z|         0.0032

t Approximation
One-Sided Pr >  Z          0.0023
Two-Sided Pr > |Z|         0.0046

Z includes a continuity correction of 0.5.

Kruskal-Wallis Test

Chi-Square                 8.7284
DF                              1
Pr > Chi-Square            0.0031
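The rank-sum figures in Output 52.1.1 can be checked by hand from the frequency table. The following Python sketch is not part of the SAS example; the function name `wilcoxon_from_counts` and its return layout are illustrative assumptions. It assigns average ranks to tied responses, sums them for the Active group, and applies a tie-adjusted normal approximation with the 0.5 continuity correction that the output mentions.

```python
import math

# Frequency tables from the Arthritis data set: response value -> patients.
active = {5: 5, 4: 11, 3: 5, 2: 1, 1: 5}
placebo = {5: 2, 4: 4, 3: 7, 2: 7, 1: 12}


def wilcoxon_from_counts(sample, other):
    """Rank-sum statistic for `sample`, using average ranks for ties."""
    n1 = sum(sample.values())
    n2 = sum(other.values())
    n = n1 + n2

    # Average rank (midrank) shared by all observations with the same value.
    midrank = {}
    ranked_so_far = 0
    for value in sorted(set(sample) | set(other)):
        tied = sample.get(value, 0) + other.get(value, 0)
        midrank[value] = ranked_so_far + (tied + 1) / 2
        ranked_so_far += tied

    rank_sum = sum(count * midrank[v] for v, count in sample.items())
    expected = n1 * (n + 1) / 2

    # Variance under H0 computed from the actual scores, so ties are handled.
    mean_score = (n + 1) / 2
    ss = sum(
        (sample.get(v, 0) + other.get(v, 0)) * (r - mean_score) ** 2
        for v, r in midrank.items()
    )
    std_dev = math.sqrt(n1 * n2 / (n * (n - 1)) * ss)

    # Continuity correction of 0.5, applied toward the expected value.
    diff = rank_sum - expected
    correction = math.copysign(0.5, diff) if diff else 0.0
    z = (diff - correction) / std_dev
    one_sided_p = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return rank_sum, expected, std_dev, z, one_sided_p


rank_sum, expected, std_dev, z, one_sided_p = wilcoxon_from_counts(active, placebo)
print(rank_sum, expected, round(std_dev, 6), round(z, 4), round(one_sided_p, 4))
```

Under these assumptions the sketch should agree with the Active row of the output (sum of scores, expected value, and standard deviation under H0) and with the normal-approximation Z and one-sided p-value. It does not attempt the t approximation.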
Output 52.1.2 shows the results of the median two-sample test. The statistic equals 18.9167, with a one-sided p-value of 0.0005. This shows that the response for the Active treatment is significantly higher than for the Placebo group.
Output 52.1.2: Median Two-Sample Test

Median Scores (Number of Points Above Median) for Variable Response
Classified by Variable Treatment

                      Sum of     Expected      Std Dev         Mean
Treatment      N      Scores     Under H0     Under H0        Score
-------------------------------------------------------------------
Active        27   18.916667    13.271186     1.728195     0.700617
Placebo       32   10.083333    15.728814     1.728195     0.315104

Average scores were used for ties.

Median Two-Sample Test

Statistic                 18.9167
Z                          3.2667
One-Sided Pr >  Z          0.0005
Two-Sided Pr > |Z|         0.0011

Median One-Way Analysis

Chi-Square                10.6713
DF                              1
Pr > Chi-Square            0.0011
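The fractional statistic 18.9167 arises because the pooled median falls inside a group of tied responses. The Python sketch below (an outside illustration, not SAS code; the name `median_test_from_counts` is hypothetical) assumes a score of 1 for any rank above (n + 1)/2 and 0 otherwise, with tied observations sharing the average score of the ranks they occupy.

```python
import math

# Frequency tables from the Arthritis data set: response value -> patients.
active = {5: 5, 4: 11, 3: 5, 2: 1, 1: 5}
placebo = {5: 2, 4: 4, 3: 7, 2: 7, 1: 12}


def median_test_from_counts(sample, other):
    """Median-score statistic for `sample`, averaging scores over ties."""
    n1 = sum(sample.values())
    n2 = sum(other.values())
    n = n1 + n2
    cutoff = (n + 1) / 2  # ranks above this value lie above the median

    # Score per distinct value: fraction of its ranks that exceed the cutoff.
    score = {}
    tied_count = {}
    ranked_so_far = 0
    for value in sorted(set(sample) | set(other)):
        tied = sample.get(value, 0) + other.get(value, 0)
        ranks = range(ranked_so_far + 1, ranked_so_far + tied + 1)
        score[value] = sum(1 for r in ranks if r > cutoff) / tied
        tied_count[value] = tied
        ranked_so_far += tied

    statistic = sum(count * score[v] for v, count in sample.items())
    mean_score = sum(tied_count[v] * score[v] for v in score) / n
    expected = n1 * mean_score

    # Variance under H0 from the scores themselves (no continuity correction).
    ss = sum(tied_count[v] * (score[v] - mean_score) ** 2 for v in score)
    std_dev = math.sqrt(n1 * n2 / (n * (n - 1)) * ss)
    z = (statistic - expected) / std_dev
    one_sided_p = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return statistic, expected, std_dev, z, one_sided_p


statistic, expected, std_dev, z, one_sided_p = median_test_from_counts(active, placebo)
print(round(statistic, 6), round(expected, 6), round(std_dev, 6), round(z, 4))
```

Here the twelve patients with response 3 occupy ranks 26 through 37, of which seven exceed the cutoff of 30, so each receives a score of 7/12. That is where the non-integer sum for the Active group comes from.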
Output 52.1.3 shows empirical distribution function statistics comparing these two samples. The asymptotic p-value for the Kolmogorov-Smirnov test is 0.0164. This indicates rejection of the null hypothesis that the distributions are identical for the two groups.
Output 52.1.3: Empirical Distribution Function Statistics

Kolmogorov-Smirnov Test for Variable Response
Classified by Variable Treatment

                      EDF at    Deviation from Mean
Treatment      N     Maximum             at Maximum
---------------------------------------------------
Active        27    0.407407               1.141653
Placebo       32    0.812500               1.048675
Total         59    0.627119

Maximum Deviation Occurred at Observation 3
Value of Response at Maximum = 3.0

Kolmogorov-Smirnov Two-Sample Test (Asymptotic)

KS     0.201818     D           0.405093
KSa    1.550191     Pr > KSa      0.0164

Cramer-von Mises Test for Variable Response
Classified by Variable Treatment

                    Summed Deviation
Treatment      N           from Mean
------------------------------------
Active        27            0.526596
Placebo       32            0.444316

Cramer-von Mises Statistics (Asymptotic)

CM     0.016456     CMa    0.970912

Kuiper Test for Variable Response
Classified by Variable Treatment

                     Deviation
Treatment      N     from Mean
------------------------------
Active        27      0.000000
Placebo       32      0.405093

Kuiper Two-Sample Test (Asymptotic)

K      0.405093     Ka     1.550191     Pr > Ka    0.1409
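The Kolmogorov-Smirnov quantities in Output 52.1.3 follow from the empirical distribution functions of the two groups. The Python sketch below is an outside illustration rather than SAS code, and the function name `ks_two_sample` is hypothetical. It assumes that D is the largest absolute gap between the two EDFs, that KS is the largest root-mean-square deviation of the group EDFs from the pooled EDF (weighted by group size), that KSa = KS * sqrt(n), and that the asymptotic p-value comes from the usual alternating Kolmogorov series.

```python
import math

# Frequency tables from the Arthritis data set: response value -> patients.
active = {5: 5, 4: 11, 3: 5, 2: 1, 1: 5}
placebo = {5: 2, 4: 4, 3: 7, 2: 7, 1: 12}


def ks_two_sample(first, second):
    """Two-sample Kolmogorov-Smirnov statistics from frequency tables."""
    n1 = sum(first.values())
    n2 = sum(second.values())
    n = n1 + n2

    d = 0.0
    ks_squared = 0.0
    cum1 = cum2 = 0
    for value in sorted(set(first) | set(second)):
        cum1 += first.get(value, 0)
        cum2 += second.get(value, 0)
        f1, f2 = cum1 / n1, cum2 / n2      # group EDFs at this value
        pooled = (cum1 + cum2) / n         # pooled EDF at this value
        d = max(d, abs(f1 - f2))
        spread = (n1 * (f1 - pooled) ** 2 + n2 * (f2 - pooled) ** 2) / n
        ks_squared = max(ks_squared, spread)

    ks = math.sqrt(ks_squared)
    ksa = ks * math.sqrt(n)
    # Alternating series for the asymptotic tail probability, truncated
    # at 100 terms (later terms are negligible for statistics this large).
    p_value = 2 * sum(
        (-1) ** (k - 1) * math.exp(-2 * k * k * ksa * ksa)
        for k in range(1, 101)
    )
    return d, ks, ksa, p_value


d, ks, ksa, p_value = ks_two_sample(active, placebo)
print(round(d, 6), round(ks, 6), round(ksa, 6), round(p_value, 4))
```

The largest gap occurs at Response = 3, where the EDFs are 11/27 for Active and 26/32 for Placebo, matching the "EDF at Maximum" column.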
Example 52.3. The Exact Savage Multisample Test
A researcher conducting a laboratory experiment randomly assigned 15 mice to receive one of three drugs. The survival time (in days) was then recorded.
The following SAS statements create the data set Mice, which contains the observed survival times for all the mice. The variable Trt denotes the treatment received by a mouse. The variable Days contains the number of days the mouse survived.
data Mice;
   input Trt $ Days @@;
   datalines;
1 1 1 1 1 3 1 3 1 4
2 3 2 4 2 4 2 4 2 15
3 4 3 4 3 10 3 10 3 26
;
PROC NPAR1WAY tests the null hypothesis that there is no difference in the survival times among the three drugs against an alternative hypothesis of difference among the drugs. The SAVAGE option specifies that Savage scores are to be used. The variable Trt is the CLASS variable, and the VAR statement specifies that the variable Days is the response variable. The EXACT statement requests the exact test.
proc npar1way savage data=Mice;
   class Trt;
   var Days;
   exact;
run;
Output 52.3.1 shows the results of the Savage test. The exact p-value is 0.0445, which is significant at the 0.05 level. However, the p-value based on the chi-square approximation is 0.0638, which results in nonrejection of the null hypothesis at the 0.05 level.
Output 52.3.1: Savage Multisample Test

The NPAR1WAY Procedure

Savage Scores (Exponential) for Variable Days
Classified by Variable Trt

                Sum of     Expected      Std Dev         Mean
Trt      N      Scores     Under H0     Under H0        Score
-------------------------------------------------------------
1        5   -3.367980          0.0     1.634555    -0.673596
2        5    0.095618          0.0     1.634555     0.019124
3        5    3.272362          0.0     1.634555     0.654472

Average scores were used for ties.

Savage One-Way Analysis

Chi-Square                     5.5047
DF                                  2
Asymptotic Pr >  Chi-Square    0.0638
Exact      Pr >= Chi-Square    0.0445
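The score sums and the asymptotic chi-square in Output 52.3.1 can be cross-checked outside SAS. The Python sketch below is illustrative only, and the function name `savage_analysis` is hypothetical. It assumes the Savage score for rank j among n observations is the sum of 1/(n - i + 1) for i = 1, ..., j, minus 1, with tied observations receiving the average of their scores. Because these scores sum to zero over the pooled sample, the sum for Trt 1, which has the shortest survival times, comes out negative (about -3.368). The sketch computes the asymptotic p-value for 2 degrees of freedom only and does not attempt the exact test.

```python
import math

# Survival times (days) from the Mice data set, keyed by treatment.
mice = {
    "1": [1, 1, 3, 3, 4],
    "2": [3, 4, 4, 4, 15],
    "3": [4, 4, 10, 10, 26],
}


def savage_analysis(groups):
    """Savage score sums and one-way chi-square, averaging scores over ties."""
    pooled = sorted(v for values in groups.values() for v in values)
    n = len(pooled)

    # Savage score for each rank j = 1..n.
    rank_scores = []
    partial = 0.0
    for j in range(1, n + 1):
        partial += 1.0 / (n - j + 1)
        rank_scores.append(partial - 1.0)

    # Average the scores within each group of tied values.
    by_value = {}
    for value, s in zip(pooled, rank_scores):
        by_value.setdefault(value, []).append(s)
    score = {value: sum(s) / len(s) for value, s in by_value.items()}

    sums = {name: sum(score[v] for v in values) for name, values in groups.items()}

    # Score variance with an (n - 1) divisor, then the one-way statistic.
    mean_score = sum(score[v] for v in pooled) / n
    s2 = sum((score[v] - mean_score) ** 2 for v in pooled) / (n - 1)
    chi_square = sum(
        (sums[name] - len(values) * mean_score) ** 2 / len(values)
        for name, values in groups.items()
    ) / s2
    return sums, chi_square


sums, chi_square = savage_analysis(mice)
# For 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
asymptotic_p = math.exp(-chi_square / 2)
print({k: round(v, 6) for k, v in sums.items()}, round(chi_square, 4), round(asymptotic_p, 4))
```

An exact p-value would require evaluating the same statistic over every assignment of the 15 observations to three groups of five, which this sketch leaves out.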