Syntax


The following statements are available in PROC CORR.

  • PROC CORR < options > ;

    • BY variables ;

    • FREQ variable ;

    • PARTIAL variables ;

    • VAR variables ;

    • WEIGHT variable ;

    • WITH variables ;

The BY statement specifies groups in which separate correlation analyses are performed.

The FREQ statement specifies the variable that represents the frequency of occurrence for other values in the observation.

The PARTIAL statement identifies controlling variables to compute Pearson, Spearman, or Kendall partial-correlation coefficients.

The VAR statement lists the numeric variables to be analyzed and their order in the correlation matrix. If you omit the VAR statement, all numeric variables not listed in other statements are used.

The WEIGHT statement identifies the variable whose values weight each observation to compute Pearson product-moment correlation.

The WITH statement lists the numeric variables with which correlations are to be computed.

The PROC CORR statement is the only required statement for the CORR procedure. The rest of this section provides detailed syntax information for each of these statements, beginning with the PROC CORR statement. The remaining statements are in alphabetical order.
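As a minimal sketch of how these statements fit together, the following step requests Pearson correlations for three numeric variables. It assumes the SASHELP.CLASS sample data set that ships with SAS; the choice of data set and variables is illustrative and not part of the original text.

```sas
/* Minimal sketch: Pearson correlations among three numeric variables. */
/* SASHELP.CLASS is assumed to be available (standard SAS sample data). */
proc corr data=sashelp.class;
   var age height weight;   /* variables to analyze, in matrix order */
run;
```

If the VAR statement were omitted, all numeric variables in the data set would be analyzed.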

PROC CORR Statement

  • PROC CORR < options > ;

The following table summarizes the options available in the PROC CORR statement.

Table 1.1: Summary of PROC CORR Options

Task                                                                                  Option

Specify data sets
  Input data set                                                                      DATA=
  Output data set with Hoeffding's D statistics                                       OUTH=
  Output data set with Kendall correlation statistics                                 OUTK=
  Output data set with Pearson correlation statistics                                 OUTP=
  Output data set with Spearman correlation statistics                                OUTS=

Control statistical analysis
  Exclude observations with nonpositive weight values from the analysis               EXCLNPWGT
  Exclude observations with missing analysis values from the analysis                 NOMISS
  Request Hoeffding's measure of dependence, D                                        HOEFFDING
  Request Kendall's tau-b                                                             KENDALL
  Request Pearson product-moment correlation                                          PEARSON
  Request Spearman rank-order correlation                                             SPEARMAN
  Request Pearson correlation statistics using Fisher's z transformation              FISHER PEARSON
  Request Spearman rank-order correlation statistics using Fisher's z transformation  FISHER SPEARMAN

Control Pearson correlation statistics
  Compute Cronbach's coefficient alpha                                                ALPHA
  Compute covariances                                                                 COV
  Compute corrected sums of squares and crossproducts                                 CSSCP
  Compute correlation statistics based on Fisher's z transformation                   FISHER
  Exclude missing values                                                              NOMISS
  Specify singularity criterion                                                       SINGULAR=
  Compute sums of squares and crossproducts                                           SSCP
  Specify the divisor for variance calculations                                       VARDEF=

Control printed output
  Display a specified number of ordered correlation coefficients                      BEST=
  Suppress Pearson correlations                                                       NOCORR
  Suppress all printed output                                                         NOPRINT
  Suppress p-values                                                                   NOPROB
  Suppress descriptive statistics                                                     NOSIMPLE
  Display ordered correlation coefficients                                            RANK

The following options (listed in alphabetical order) can be used in the PROC CORR statement:

ALPHA

  • calculates and prints Cronbach's coefficient alpha. PROC CORR computes separate coefficients using raw and standardized values (scaling the variables to a unit variance of 1). For each VAR statement variable, PROC CORR computes the correlation between the variable and the total of the remaining variables. It also computes Cronbach's coefficient alpha using only the remaining variables.

    If a WITH statement is specified, the ALPHA option is invalid. When you specify the ALPHA option, the Pearson correlations will also be displayed. If you specify the OUTP= option, the output data set also contains observations with Cronbach's coefficient alpha. If you use the PARTIAL statement, PROC CORR calculates Cronbach's coefficient alpha for partialled variables. See the section "Partial Correlation."
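    The ALPHA option might be used along the following lines. This is only a sketch: the data set WORK.SURVEY and the item variables Q1 through Q5 are hypothetical names introduced for illustration.

```sas
/* Sketch: Cronbach's coefficient alpha for a set of scale items.        */
/* WORK.SURVEY and q1-q5 are hypothetical; substitute your own names.    */
proc corr data=work.survey alpha nomiss;
   var q1-q5;               /* ALPHA is invalid if a WITH statement is used */
run;
```

    NOMISS is added here so that every coefficient is computed from the same set of complete observations.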

BEST= n

  • prints the n highest correlation coefficients for each variable, n ≥ 1. Correlations are ordered from highest to lowest in absolute value. If you omit the BEST= option, PROC CORR prints correlations in a rectangular table using the variable names as row and column labels.

  • If you specify the HOEFFDING option, PROC CORR displays the D statistics in order from highest to lowest.

COV

  • displays the variance and covariance matrix. When you specify the COV option, the Pearson correlations will also be displayed. If you specify the OUTP= option, the output data set also contains the covariance matrix with the corresponding _TYPE_ variable value 'COV'. If you use the PARTIAL statement, PROC CORR computes a partial covariance matrix.

CSSCP

  • displays a table of the corrected sums of squares and crossproducts. When you specify the CSSCP option, the Pearson correlations will also be displayed. If you specify the OUTP= option, the output data set also contains a CSSCP matrix with the corresponding _TYPE_ variable value 'CSSCP'. If you use a PARTIAL statement, PROC CORR prints both an unpartial and a partial CSSCP matrix, and the output data set contains a partial CSSCP matrix.

DATA= SAS-data-set

  • names the SAS data set to be analyzed by PROC CORR. By default, the procedure uses the most recently created SAS data set.

EXCLNPWGT

  • excludes observations with nonpositive weight values from the analysis. By default, PROC CORR treats observations with negative weights like those with zero weights and counts them in the total number of observations.

FISHER < (fisher-options) >

  • requests confidence limits and p-values under a specified null hypothesis, H0: ρ = ρ0, for correlation coefficients using Fisher's z transformation. These correlations include the Pearson correlations and Spearman correlations.

  • The following fisher-options are available:

  • ALPHA= α

    • specifies the level of the confidence limits for the correlation, 100(1 − α)%. The value of the ALPHA= option must be between 0 and 1, and the default is ALPHA=0.05.

  • BIASADJ=YES | NO

    • specifies whether or not the bias adjustment is used in constructing confidence limits. The BIASADJ=YES option also produces a new correlation estimate using the bias adjustment. By default, BIASADJ=YES.

  • RHO0= ρ0

    • specifies the value ρ0 in the null hypothesis H0: ρ = ρ0, where −1 < ρ0 < 1. By default, RHO0=0.

  • TYPE= LOWER | UPPER | TWOSIDED

    • specifies the type of confidence limits. The TYPE=LOWER option requests a lower confidence limit from the lower alternative H1: ρ < ρ0, the TYPE=UPPER option requests an upper confidence limit from the upper alternative H1: ρ > ρ0, and the default TYPE=TWOSIDED option requests two-sided confidence limits from the two-sided alternative H1: ρ ≠ ρ0.
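  The fisher-options above can be combined in a single FISHER specification. The following sketch assumes the SASHELP.CLASS sample data set; the particular option values are arbitrary choices for illustration.

```sas
/* Sketch: 90% two-sided confidence limits and a test of H0: rho = 0.5  */
/* based on Fisher's z transformation, without the bias adjustment.     */
/* SASHELP.CLASS and the option values are illustrative assumptions.    */
proc corr data=sashelp.class nosimple
          fisher(alpha=0.10 biasadj=no rho0=0.5 type=twosided);
   var height weight;
run;
```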

HOEFFDING

  • requests a table of Hoeffding's D statistics. This D statistic is 30 times larger than the usual definition and scales the range between −0.5 and 1 so that large positive values indicate dependence. The HOEFFDING option is invalid if a WEIGHT or PARTIAL statement is used.

KENDALL

  • requests a table of Kendall's tau-b coefficients based on the number of concordant and discordant pairs of observations. Kendall's tau-b ranges from −1 to 1.

  • The KENDALL option is invalid if a WEIGHT statement is used. If you use a PARTIAL statement, probability values for Kendall's partial tau-b are not available.

NOCORR

  • suppresses the display of Pearson correlations. If you specify the OUTP= option, the data set type remains CORR. To change the data set type to COV, CSSCP, or SSCP, use the TYPE= data set option.

NOMISS

  • excludes observations with missing values from the analysis. Otherwise, PROC CORR computes correlation statistics using all of the nonmissing pairs of variables. Using the NOMISS option is computationally more efficient.

NOPRINT

  • suppresses all displayed output. Use NOPRINT if you want to create an output data set only.

NOPROB

  • suppresses displaying the probabilities associated with each correlation coefficient.

NOSIMPLE

  • suppresses printing simple descriptive statistics for each variable. However, if you request an output data set, the output data set still contains simple descriptive statistics for the variables.

OUTH= output-data-set

  • creates an output data set containing Hoeffding's D statistics. The contents of the output data set are similar to those of the OUTP= data set. When you specify the OUTH= option, Hoeffding's D statistics will be displayed, and the Pearson correlations will be displayed only if the PEARSON, ALPHA, COV, CSSCP, SSCP, or OUT= option is also specified.

OUTK= output-data-set

  • creates an output data set containing Kendall correlation statistics. The contents of the output data set are similar to those of the OUTP= data set. When you specify the OUTK= option, the Kendall correlation statistics will be displayed, and the Pearson correlations will be displayed only if the PEARSON, ALPHA, COV, CSSCP, SSCP, or OUT= option is also specified.

OUTP= output-data-set

OUT= output-data-set

  • creates an output data set containing Pearson correlation statistics. This data set also includes means, standard deviations, and the number of observations. The value of the _TYPE_ variable is 'CORR'. When you specify the OUTP= option, the Pearson correlations will also be displayed. If you specify the ALPHA option, the output data set also contains six observations with Cronbach's coefficient alpha.
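  A sketch of creating and inspecting an OUTP= data set follows. It assumes the SASHELP.CLASS sample data set; WORK.CORR_OUT is a hypothetical output name chosen for illustration.

```sas
/* Sketch: write Pearson statistics to a data set without printed output. */
/* WORK.CORR_OUT is a hypothetical output data set name.                  */
proc corr data=sashelp.class outp=work.corr_out noprint;
   var age height weight;
run;

/* Inspect the result; the _TYPE_ variable identifies each row's statistic */
proc print data=work.corr_out;
run;
```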

OUTS= SAS-data-set

  • creates an output data set containing Spearman correlation coefficients. The contents of the output data set are similar to the OUTP= data set. When you specify the OUTS= option, the Spearman correlation coefficients will be displayed, and the Pearson correlations will be displayed only if the PEARSON, ALPHA, COV, CSSCP, SSCP, or OUT= option is also specified.

PEARSON

  • requests a table of Pearson product-moment correlations. If you do not specify the HOEFFDING, KENDALL, SPEARMAN, OUTH=, OUTK=, or OUTS= option, the CORR procedure produces Pearson product-moment correlations by default. Otherwise, you must specify the PEARSON, ALPHA, COV, CSSCP, SSCP, or OUT= option for Pearson correlations. The correlations range from −1 to 1.

RANK

  • displays the ordered correlation coefficients for each variable. Correlations are ordered from highest to lowest in absolute value. If you specify the HOEFFDING option, the D statistics are displayed in order from highest to lowest.

SINGULAR= p

  • specifies the criterion for determining the singularity of a variable if you use a PARTIAL statement. A variable is considered singular if its corresponding diagonal element after Cholesky decomposition has a value less than p times the original unpartialled value of that variable. The default value is 1E−8. The range of p is between 0 and 1.

SPEARMAN

  • requests a table of Spearman correlation coefficients based on the ranks of the variables. The correlations range from −1 to 1. If you specify a WEIGHT statement, the SPEARMAN option is invalid.

SSCP

  • displays a table of the sums of squares and crossproducts. When you specify the SSCP option, the Pearson correlations will also be displayed. If you specify the OUTP= option, the output data set contains an SSCP matrix and the corresponding _TYPE_ variable value is 'SSCP'. If you use a PARTIAL statement, the unpartial SSCP matrix is displayed, and the output data set does not contain an SSCP matrix.

VARDEF= d

  • specifies the variance divisor in the calculation of variances and covariances. The following table shows the possible values for the value d and associated divisors, where k is the number of PARTIAL statement variables. The default is VARDEF=DF.

    Table 1.2: Possible Values for VARDEF=

    Value          Divisor                     Formula
    DF             degrees of freedom          n − k − 1
    N              number of observations      n
    WDF            sum of weights minus one    (Σ w_i) − k − 1
    WEIGHT | WGT   sum of weights              Σ w_i

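  • The variance is computed as

    \[ \frac{1}{d} \sum_{i} (x_i - \bar{x})^2 \]

  • where $\bar{x}$ is the sample mean and $d$ is the divisor selected by the VARDEF= option.

  • If a WEIGHT statement is used, the variance is computed as

    \[ \frac{1}{d} \sum_{i} w_i (x_i - \bar{x}_w)^2 \]

  • where $w_i$ is the weight for the $i$th observation and $\bar{x}_w$ is the weighted mean.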

  • If you use the WEIGHT statement and VARDEF=DF, the variance is an estimate of $\sigma^2$, where the variance of the $i$th observation is $V(x_i) = \sigma^2 / w_i$. This yields an estimate of the variance of an observation with unit weight.

  • If you use the WEIGHT statement and VARDEF=WGT, the computed variance is asymptotically an estimate of $\sigma^2 / \bar{w}$, where $\bar{w}$ is the average weight (for large $n$). This yields an asymptotic estimate of the variance of an observation with average weight.

BY Statement

  • BY variables ;

You can specify a BY statement with PROC CORR to obtain separate analyses on observations in groups defined by the BY variables. If a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.

If your input data set is not sorted in ascending order, use one of the following alternatives:

  • Sort the data using the SORT procedure with a similar BY statement.

  • Specify the BY statement option NOTSORTED or DESCENDING in the BY statement for the CORR procedure. The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.

  • Create an index on the BY variables using the DATASETS procedure.

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.
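The first alternative (sorting before the analysis) can be sketched as follows. The example assumes the SASHELP.CLASS sample data set; WORK.CLASS_SORTED is a hypothetical name for the sorted copy.

```sas
/* Sketch: sort by the BY variable, then run one analysis per group.   */
/* WORK.CLASS_SORTED is a hypothetical name for the sorted copy.       */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;
run;

proc corr data=work.class_sorted;
   by sex;                  /* separate correlation analysis for each value of sex */
   var height weight;
run;
```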

FREQ Statement

  • FREQ variable ;

The FREQ statement lists a numeric variable whose value represents the frequency of the observation. If you use the FREQ statement, the procedure assumes that each observation represents n observations, where n is the value of the FREQ variable. If n is not an integer, SAS truncates it. If n is less than 1 or is missing, the observation is excluded from the analysis. The sum of the frequency variable represents the total number of observations.

The effects of the FREQ and WEIGHT statements are similar except when calculating degrees of freedom.
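The following sketch shows a FREQ variable in use. The data set WORK.GROUPED, its variables, and its values are made up purely for illustration.

```sas
/* Sketch: each row stands for COUNT identical observations.           */
/* WORK.GROUPED and its values are hypothetical illustration data.     */
data work.grouped;
   input x y count;
   datalines;
1 2 3
2 3 1
3 5 4
4 4 2
;
run;

proc corr data=work.grouped;
   var x y;
   freq count;              /* the counts sum to 10, the total number of observations */
run;
```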

PARTIAL Statement

  • PARTIAL variables ;

The PARTIAL statement lists variables to use in the calculation of partial correlation statistics. Only the Pearson partial correlation, Spearman partial rank-order correlation, and Kendall's partial tau-b can be computed. The PARTIAL statement is not valid with the HOEFFDING option. When you use the PARTIAL statement, observations with missing values are excluded.

With a PARTIAL statement, PROC CORR also displays the partial variance and standard deviation for each analysis variable if the PEARSON option is specified.
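A partial correlation request might look like the following sketch, which assumes the SASHELP.CLASS sample data set and treats Age as the controlling variable. The variable roles are an illustrative choice.

```sas
/* Sketch: Pearson and Spearman correlations of height and weight,     */
/* controlling for age. SASHELP.CLASS is assumed to be available.      */
proc corr data=sashelp.class pearson spearman;
   var height weight;
   partial age;             /* controlling variable */
run;
```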

VAR Statement

  • VAR variables ;

The VAR statement lists variables for which to compute correlation coefficients. If the VAR statement is not specified, PROC CORR computes correlations for all numeric variables not listed in other statements.

WEIGHT Statement

  • WEIGHT variable ;

The WEIGHT statement lists weights to use in the calculation of Pearson weighted product-moment correlation. The HOEFFDING, KENDALL, and SPEARMAN options are not valid with the WEIGHT statement.

The observations with missing weights are excluded from the analysis. By default, for observations with nonpositive weights, weights are set to zero and the observations are included in the analysis. You can use the EXCLNPWGT option to exclude observations with negative or zero weights from the analysis.

Note that most SAS/STAT procedures, such as PROC GLM, exclude negative and zero weights by default. If you use the WEIGHT statement, consider which value of the VARDEF= option is appropriate. See the discussion of the VARDEF= option for more information.
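The interaction of the WEIGHT statement with the EXCLNPWGT and VARDEF= options can be sketched as follows. The data set WORK.WEIGHTED and its values are hypothetical, and VARDEF=WDF is just one of the possible divisor choices.

```sas
/* Sketch: weighted Pearson correlation with nonpositive weights dropped. */
/* WORK.WEIGHTED and its values are hypothetical illustration data.       */
data work.weighted;
   input x y w;
   datalines;
1.0 2.1  1
2.0 2.9  2
3.0 4.2  0
4.0 4.8 -1
5.0 6.1  3
;
run;

/* EXCLNPWGT excludes the rows with w = 0 and w = -1;                     */
/* VARDEF=WDF uses the sum of weights minus one as the variance divisor.  */
proc corr data=work.weighted exclnpwgt vardef=wdf;
   var x y;
   weight w;
run;
```

Without EXCLNPWGT, the two rows with nonpositive weights would be kept with their weights set to zero and would still count toward the number of observations.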

WITH Statement

  • WITH variables ;

The WITH statement lists variables with which correlations of the VAR statement variables are to be computed. The WITH statement requests correlations of the form r(X_i, Y_j), where X_1, ..., X_m are analysis variables specified in the VAR statement, and Y_1, ..., Y_n are variables specified in the WITH statement. The correlation matrix has a rectangular structure, with the WITH variables down the side and the VAR variables across the top, of the form

\[
\begin{bmatrix}
r(Y_1, X_1) & \cdots & r(Y_1, X_m) \\
\vdots & \ddots & \vdots \\
r(Y_n, X_1) & \cdots & r(Y_n, X_m)
\end{bmatrix}
\]

For example, the statements

  proc corr;
     var x1 x2;
     with y1 y2 y3;
  run;

produce correlations for the following combinations:

  r(Y1, X1)   r(Y1, X2)
  r(Y2, X1)   r(Y2, X2)
  r(Y3, X1)   r(Y3, X2)



Base SAS 9.1.3 Procedures Guide (Vol. 3)
Base SAS 9.1 Procedures Guide, Volumes 1, 2, 3 and 4
ISBN: 1590472047
EAN: 2147483647
Year: 2004
Pages: 74
