Two-Sample Tests | SAS/STAT 9.1 User's Guide, Volumes 1-7

This section describes tests appropriate for two independent samples (for example, two groups of subjects given different treatments) and for two related samples (for example, before-and-after measurements on a single group of subjects). Related samples are also referred to as paired samples or matched pairs.

Comparing Two Independent Samples

SAS/STAT software provides several nonparametric tests for location and scale differences.

When you perform these tests, your data should consist of a random sample of observations from two different populations. Your goal is either to compare the location parameters (medians) or the scale parameters of the two populations. For example, suppose your data consist of the number of days in the hospital for two groups of patients: those who received a standard surgical procedure and those who received a new, experimental surgical procedure. These patients are a random sample from the population of patients who have received the two types of surgery. Your goal is to decide whether the median hospital stays differ for the two populations.

Tests in the NPAR1WAY Procedure

The NPAR1WAY procedure provides the following location tests: Wilcoxon rank sum test (Mann-Whitney U test), Median test, Savage test, and Van der Waerden test. The Wilcoxon rank sum test can also be obtained from the FREQ procedure. In addition, PROC NPAR1WAY produces the following tests for scale differences: Siegel-Tukey test, Ansari-Bradley test, Klotz test, and Mood test. PROC NPAR1WAY also provides tests that use the input data observations as scores, which enables you to produce a wide variety of tests. You can construct any scores with the DATA step, and PROC NPAR1WAY then computes the corresponding linear rank test. You can also analyze the raw data directly in this way, which produces the permutation test known as Pitman's test.
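As a minimal sketch of how these tests might be requested, the following program uses a hypothetical data set, stay, with invented hospital-stay values and an assumed grouping variable, surgery. The option names (WILCOXON, MEDIAN, SAVAGE, VW, ST, AB, KLOTZ, MOOD, and SCORES=DATA) are PROC NPAR1WAY options; the data set, variable names, and values are illustrative assumptions only.

```sas
/* Hypothetical data: days in hospital for two surgery groups. */
/* The values are invented for illustration.                   */
data stay;
   input surgery $ days @@;
   datalines;
standard 7 standard 9 standard 12 standard 8 standard 10 standard 14
new 5 new 6 new 9 new 4 new 7 new 8
;

/* Location tests: Wilcoxon, median, Savage, Van der Waerden */
proc npar1way data=stay wilcoxon median savage vw;
   class surgery;
   var days;
run;

/* Scale tests: Siegel-Tukey, Ansari-Bradley, Klotz, Mood */
proc npar1way data=stay st ab klotz mood;
   class surgery;
   var days;
run;

/* Raw data values used as scores */
proc npar1way data=stay scores=data;
   class surgery;
   var days;
run;
```

To use custom scores, you could compute a score variable in a DATA step and name it in the VAR statement together with SCORES=DATA.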

When data are sparse, skewed, or heavily tied, the usual asymptotic tests may not be appropriate. In these situations, exact tests may be suitable for analyzing your data. The NPAR1WAY procedure can produce exact p-values for all of the two-sample tests for location and scale differences.
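Exact p-values are requested with the EXACT statement. The sketch below assumes the same kind of small, hypothetical data set (stay, with invented values); it asks for an exact Wilcoxon test and an exact test on the raw data scores, which corresponds to Pitman's permutation test.

```sas
/* Hypothetical small sample with ties; values are invented. */
data stay;
   input surgery $ days @@;
   datalines;
standard 7 standard 9 standard 12 standard 8 standard 10 standard 14
new 5 new 6 new 9 new 4 new 7 new 8
;

/* Exact Wilcoxon rank sum test */
proc npar1way data=stay wilcoxon;
   class surgery;
   var days;
   exact wilcoxon;
run;

/* Exact test with the raw data as scores (Pitman's test) */
proc npar1way data=stay scores=data;
   class surgery;
   var days;
   exact scores=data;
run;
```

Exact computations can be slow for larger samples, so this approach is most practical for small data sets such as the one assumed here.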

Chapter 52, 'The NPAR1WAY Procedure,' provides detailed statistical formulas for these statistics, as well as examples of their use.

Tests in the FREQ Procedure

The FREQ procedure provides tests for comparing the location of two groups and for testing independence between two variables.

The situation in which you want to compare the location of two groups of observations corresponds to a table with two rows. In this case, the asymptotic Wilcoxon rank sum test can be obtained by using SCORES=RANK in the TABLES statement and by looking at either of the following:

the Mantel-Haenszel statistic in the list of tests for no association. This is labeled as 'Mantel-Haenszel Chi-Square' and PROC FREQ displays the statistic, the degrees of freedom, and the p-value. To obtain this statistic, specify the CHISQ option in the TABLES statement.
the CMH statistic 2 in the section on Cochran-Mantel-Haenszel statistics. PROC FREQ displays the statistic, the degrees of freedom, and the p-value. To obtain this statistic, specify the CMH2 option in the TABLES statement.
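The two routes above can be sketched in a single PROC FREQ step. The data set stay, the variables surgery and days, and the values are hypothetical; the grouping variable is listed first in the table request so that the two groups form the rows.

```sas
/* Hypothetical data: days in hospital for two surgery groups. */
data stay;
   input surgery $ days @@;
   datalines;
standard 7 standard 9 standard 12 standard 8 standard 10 standard 14
new 5 new 6 new 9 new 4 new 7 new 8
;

/* SCORES=RANK requests rank scores.                          */
/* CHISQ adds the Mantel-Haenszel chi-square;                 */
/* CMH2 adds CMH statistics 1 and 2.                          */
/* NOPRINT suppresses the (wide) crosstabulation table only.  */
proc freq data=stay;
   tables surgery*days / scores=rank chisq cmh2 noprint;
run;
```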

When you test for independence, the question being answered is whether the two variables of interest are related in some way. For example, you might want to know if student scores on a standard test are related to whether students attended a public or private school. One way to think of this situation is to consider the data as a two-way table; the hypothesis of interest is whether the rows and columns are independent. In the preceding example, the groups of students would form the two rows, and the scores would form the columns. The special case of a two-category response (Pass/Fail) leads to a 2 × 2 table; the case of more than two categories for the response (A/B/C/D/F) leads to a 2 × c table, where c is the number of response categories.

For testing whether two variables are independent, PROC FREQ provides Fisher's exact test. For a 2 × 2 table, PROC FREQ automatically provides Fisher's exact test when you specify the CHISQ option in the TABLES statement. For a 2 × c table, use the FISHER option in the EXACT statement to obtain the test.
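Both cases might look like the following sketch. The data sets (passfail, grades), variable names, and cell counts are hypothetical and were invented to mirror the public/private school example; the WEIGHT statement is used because the data are entered as cell counts rather than one row per student.

```sas
/* Hypothetical 2 x 2 table: school type by Pass/Fail result. */
data passfail;
   input type $ result $ count;
   datalines;
public  pass 12
public  fail  8
private pass 15
private fail  3
;

/* For a 2 x 2 table, CHISQ also produces Fisher's exact test. */
proc freq data=passfail;
   weight count;
   tables type*result / chisq;
run;

/* Hypothetical 2 x c table: school type by letter grade. */
data grades;
   input type $ grade $ count;
   datalines;
public  A 4
public  B 6
public  C 5
public  D 3
public  F 2
private A 6
private B 5
private C 4
private D 2
private F 1
;

/* For a 2 x c table, request Fisher's exact test explicitly. */
proc freq data=grades;
   weight count;
   tables type*grade;
   exact fisher;
run;
```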

See Chapter 29, 'The FREQ Procedure,' for details, formulas, and examples of these tests.

Comparing Two Related Samples

SAS/STAT software provides the following nonparametric tests for comparing the locations of two related samples:

Wilcoxon signed rank test
sign test
McNemar's test

The first two tests are available in the UNIVARIATE procedure, and the last test is available in the FREQ procedure. When you perform these tests, your data should consist of pairs of measurements for a random sample from a single population. For example, suppose your data consist of SAT scores for students before and after attending a course on how to prepare for the SAT. The pairs of measurements are the scores before and after the course, and the students should be a random sample of students who attended the course. Your goal in analysis is to decide if the median change in scores is significantly different from zero.

Tests in the UNIVARIATE Procedure

By default, PROC UNIVARIATE performs a Wilcoxon signed rank test and a sign test. To use these tests on two related samples, perform the following steps:

In the DATA step, create a new variable that contains the differences between the two related variables.
Run PROC UNIVARIATE, using the new variable in the VAR statement.
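The two steps above can be sketched as follows. The data set sat, the variables before, after, and diff, and the score values are hypothetical and serve only to illustrate the pattern.

```sas
/* Step 1: hypothetical before/after SAT scores, one pair per student. */
/* diff holds the within-student change.                               */
data sat;
   input before after @@;
   diff = after - before;
   datalines;
1010 1060  980  990 1120 1150 1050 1040
 900  960 1200 1230 1080 1130  970 1010
;

/* Step 2: the default output includes the sign test and the */
/* Wilcoxon signed rank test for the location of diff.       */
proc univariate data=sat;
   var diff;
run;
```

By default the tests compare the location of diff to zero, which matches the goal of deciding whether the median change differs from zero.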

For discussion of the tests, formulas, and examples, refer to the chapter on the UNIVARIATE procedure in the Base SAS 9.1 Procedures Guide.

Tests in the FREQ Procedure

The FREQ procedure can be used to obtain McNemar's test, which is simply another special case of a Cochran-Mantel-Haenszel statistic (and also of the sign test). The AGREE option in the TABLES statement produces this test for 2 × 2 tables, and exact p-values are also available for this test. See Chapter 29, 'The FREQ Procedure,' for more information.
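A minimal sketch of McNemar's test, assuming a hypothetical paired Pass/Fail outcome recorded before and after a course. The data set paired, its variables, and the cell counts are invented for illustration.

```sas
/* Hypothetical paired outcomes: each row is a (before, after) */
/* combination with the number of students in that cell.       */
data paired;
   input before $ after $ count;
   datalines;
pass pass 20
pass fail  5
fail pass 15
fail fail 10
;

/* AGREE produces McNemar's test for the 2 x 2 table;      */
/* EXACT MCNEM additionally requests the exact p-value.    */
proc freq data=paired;
   weight count;
   tables before*after / agree;
   exact mcnem;
run;
```

Only the discordant cells (pass/fail and fail/pass) drive the test, which is why it can also be viewed as a sign test on the paired changes.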