Computing Correlations


You can use the Correlations task to compute pairwise correlation coefficients for the variables in your data set. A correlation coefficient measures the strength of the linear relationship between two variables. This task can compute the standard Pearson product-moment correlations, nonparametric measures of association, partial correlations, and Cronbach's coefficient alpha. The task can also produce scatter plots with confidence ellipses.
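As a minimal sketch of what the Pearson product-moment coefficient measures, the following Python function computes it from two equal-length lists. The function name `pearson` and the sample values are illustrative assumptions; they are not part of the Analyst Application, which performs this computation for you.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

# Made-up values for illustration only (not taken from the Fitness data set).
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear, increasing
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))  # perfectly linear, decreasing
```

The coefficient lies between -1 and 1. Values near either extreme indicate a strong linear relationship, and values near 0 indicate a weak one.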

The following example computes correlation coefficients for four variables in the Fitness data set. This data set contains measurements made on groups of men taking a physical fitness course at North Carolina State University. The variables are as follows:

  age: age, in years
  weight: weight, in kilograms
  oxygen: oxygen intake rate, in milliliters per kilogram of body weight per minute
  runtime: time taken to run 1.5 miles, in minutes
  rstpulse: heart rate while resting
  runpulse: heart rate while running
  maxpulse: maximum heart rate recorded while running
  group: group number

This example examines the correlations among the variables runtime, runpulse, maxpulse, and oxygen and produces the corresponding scatter plots with confidence ellipses.

Open the Fitness Data Set

To open the Fitness data set, follow these steps:

  1. Select Tools → Sample Data...

  2. Select Fitness.

  3. Click OK to create the sample data set in your Sasuser directory.

  4. Select File → Open By SAS Name...

  5. Select Sasuser from the list of Libraries.

  6. Select Fitness from the list of members.

  7. Click OK to bring the Fitness data set into the data table.

Request Correlations

To compute correlations for variables in the Fitness data set, follow these steps:

  1. Select Statistics → Descriptive → Correlations...

  2. Select the variables runtime, runpulse, maxpulse, and oxygen to correlate.

Figure 7.18 displays the resulting Correlations dialog.

Figure 7.18: Correlations Dialog

If you click OK in the Correlations main dialog, the task produces the default output, which includes Pearson correlations. Alternatively, you can request specific types of correlations by using the Options dialog.

Request a Scatter Plot

To request a scatter plot with a confidence ellipse, follow these steps:

  1. Click on the Plots button.

  2. Select Scatter plots.

  3. Select Add confidence ellipses.

By default, the confidence level used in calculating the confidence ellipse is 0.95. To use a different level, type that value in the Probability value: field, as displayed in Figure 7.19.

Figure 7.19: Correlations: Plots Dialog

  4. Click OK.

Click OK in the main dialog to perform the analysis.

Review the Results

The results are presented in the project tree, as displayed in Figure 7.20.

Figure 7.20: Correlations: Project Tree

You can double-click on any of the resulting nodes in the project tree to view the information in a separate window.

Figure 7.21 displays univariate statistics for each of the analysis variables. The table provides the number of observations, the mean, the standard deviation, the sum, and the minimum and maximum values for each variable.

Figure 7.21: Correlations: Univariate Statistics
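The statistics listed in the univariate table can be sketched in Python as follows. The function name `univariate_summary` and the run times are made up for illustration, and the use of the sample standard deviation (divisor n - 1) is an assumption about how the table is computed.

```python
import statistics

def univariate_summary(values):
    """Return the statistics listed in the univariate table for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (n - 1 divisor)
        "sum": sum(values),
        "minimum": min(values),
        "maximum": max(values),
    }

# Made-up run times in minutes, for illustration only.
summary = univariate_summary([10.0, 11.5, 9.0, 12.5, 10.5])
for name, value in summary.items():
    print(f"{name}: {value}")
```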

Figure 7.22 displays the table of correlations. The p-value, which is the significance probability of the correlation, is displayed under each of the correlation coefficients. For example, the correlation between the variables maxpulse and runtime is 0.22610, with an associated p-value of 0.2213, and the correlation between the variables oxygen and runpulse is -0.39797, with an associated p-value of 0.0266.

Figure 7.22: Correlations: Table of Correlations
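To illustrate how a p-value of this kind can be obtained, the sketch below uses the usual test of zero correlation: the statistic t = r * sqrt((n - 2) / (1 - r^2)) is compared with a Student's t distribution with n - 2 degrees of freedom. Two things here are assumptions rather than statements from the text: that the task uses this test, and that the data set has n = 31 observations. The function names are hypothetical, and the t distribution is integrated numerically so that only the Python standard library is needed.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))

def correlation_p_value(r, n, steps=2000):
    """Two-sided p-value for H0: correlation = 0, given r from n observations."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    if steps < 2 or steps % 2:
        raise ValueError("steps must be a positive even number")
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the integral of the density from 0 to |t|.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    # The two tails together hold whatever lies outside [-|t|, |t|].
    return 1 - 2 * central_area

# n = 31 is an assumption; the text does not state the number of observations.
print(round(correlation_p_value(0.22610, 31), 4))
print(round(correlation_p_value(-0.39797, 31), 4))
```

If the assumptions hold, the printed values should be close to the p-values quoted from Figure 7.22. A small p-value suggests that the observed correlation is unlikely to arise by chance when the true correlation is zero.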

Six scatter plots, each of which includes a 95% confidence ellipse, are produced in this analysis. Each plot displays the relationship between one pair of the analysis variables. The scatter plot of runtime versus oxygen is displayed in Figure 7.23.

Figure 7.23: Correlations: Scatter Plot with Confidence Ellipse
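The count of six plots follows from pairing the four analysis variables: each unordered pair gets one plot. A short Python sketch of that enumeration (the order in which the task actually draws the plots is not specified here):

```python
from itertools import combinations

variables = ["runtime", "runpulse", "maxpulse", "oxygen"]

# Every unordered pair of distinct variables gets one scatter plot.
pairs = list(combinations(variables, 2))
for x_var, y_var in pairs:
    print(f"{x_var} versus {y_var}")
print(len(pairs))  # 4 * 3 / 2 = 6
```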

Confidence ellipses serve as a graphical indicator of correlation. When two variables are uncorrelated and plotted on comparable scales, the confidence ellipse is circular in shape. The stronger the correlation between two variables, the more elongated the ellipse becomes.
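The link between correlation and ellipse shape can be sketched for two standardized variables (mean 0, variance 1), where the ellipse axes follow from the eigenvalues 1 + |r| and 1 - |r| of the correlation matrix. The function name `ellipse_semi_axes` is hypothetical, and the chi-square scaling used for the confidence level is one common construction, assumed here for illustration; the Analyst Application may scale its ellipses differently.

```python
import math

def ellipse_semi_axes(r, level=0.95):
    """Semi-axes (major, minor) of a confidence ellipse for two standardized
    variables with correlation r, using a chi-square scaling (an assumption)."""
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    # The chi-square quantile with 2 degrees of freedom has a closed form.
    scale = -2 * math.log(1 - level)
    # Eigenvalues of the correlation matrix [[1, r], [r, 1]] are 1 + |r| and 1 - |r|.
    major = math.sqrt(scale * (1 + abs(r)))
    minor = math.sqrt(scale * (1 - abs(r)))
    return major, minor

for r in (0.0, 0.4, 0.8):
    major, minor = ellipse_semi_axes(r)
    print(f"r = {r}: major/minor axis ratio = {major / minor:.2f}")
```

Under these assumptions the axis ratio is sqrt((1 + |r|) / (1 - |r|)), which equals 1 (a circle) when r is 0 and grows without bound as |r| approaches 1. The confidence level changes the size of the ellipse but not its shape.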






SAS Institute - The Analyst Application
The Analyst Application, Second Edition
ISBN: 158025991X
Year: 2003
Pages: 116
