Syntax


The following statements are available in PROC CANDISC.

  • PROC CANDISC < options > ;

    • CLASS variable ;

    • BY variables ;

    • FREQ variable ;

    • VAR variables ;

    • WEIGHT variable ;

The BY, CLASS, FREQ, VAR, and WEIGHT statements are described after the PROC CANDISC statement.
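As a rough illustration of how these statements fit together, the following sketch runs a basic analysis. It assumes that the Sashelp.Iris sample data set (with the variables Species, SepalLength, SepalWidth, PetalLength, and PetalWidth) is available. That data set ships with later SAS releases and is not part of this chapter, so substitute your own data set and variable names as needed.

```sas
/* Minimal sketch: Sashelp.Iris and its variable names are assumptions */
proc candisc data=sashelp.iris;
   class Species;                                      /* required: defines the groups */
   var SepalLength SepalWidth PetalLength PetalWidth;  /* quantitative variables       */
run;
```

The BY, FREQ, and WEIGHT statements are optional and are added only when the data call for them.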

PROC CANDISC Statement

  • PROC CANDISC < options > ;

This statement invokes the CANDISC procedure. The options listed in the following table can appear in the PROC CANDISC statement.

Table 21.1: CANDISC Procedure Options

Task                              Options

Specify Data Sets                 DATA=, OUT=, OUTSTAT=
Control Canonical Variables       NCAN=, PREFIX=
Determine Singularity             SINGULAR=
Control Displayed Correlations    BCORR, PCORR, TCORR, WCORR
Control Displayed Covariances     BCOV, PCOV, TCOV, WCOV
Control Displayed SSCP Matrices   BSSCP, PSSCP, TSSCP, WSSCP
Suppress Output                   NOPRINT, SHORT
Miscellaneous                     ALL, ANOVA, DISTANCE, SIMPLE, STDMEAN

ALL

  • activates all of the display options.

ANOVA

  • displays univariate statistics for testing the hypothesis that the class means are equal in the population for each variable.

BCORR

  • displays between-class correlations.

BCOV

  • displays between-class covariances. The between-class covariance matrix equals the between-class SSCP matrix divided by n(c - 1)/c, where n is the number of observations and c is the number of classes. The between-class covariances should be interpreted in comparison with the total-sample and within-class covariances, not as formal estimates of population parameters.

BSSCP

  • displays the between-class SSCP matrix.

DATA= SAS-data-set

  • specifies the data set to be analyzed. The data set can be an ordinary SAS data set or one of several specially structured data sets created by SAS statistical procedures. These specially structured data sets include TYPE=CORR, COV, CSSCP, and SSCP. If you omit the DATA= option, the procedure uses the most recently created SAS data set.

DISTANCE

  • displays squared Mahalanobis distances between the group means, F statistics, and the corresponding probabilities of greater squared Mahalanobis distances between the group means.

NCAN= n

  • specifies the number of canonical variables to be computed. The value of n must be less than or equal to the number of variables. If you specify NCAN=0, the procedure displays the canonical correlations but not the canonical coefficients, structures, or means. A negative value suppresses the canonical analysis entirely. Let v be the number of variables in the VAR statement and c be the number of classes. If you omit the NCAN= option, only min(v, c - 1) canonical variables are generated; if you also specify an OUT= output data set, v canonical variables are generated, and the last v - (c - 1) canonical variables have missing values.
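The following sketch requests a single canonical variable. It assumes the Sashelp.Iris sample data set from later SAS releases, which has v = 4 analysis variables and c = 3 classes, so omitting NCAN= would give min(4, 3 - 1) = 2 canonical variables. Work.Scores1 is a hypothetical output data set name.

```sas
/* Sketch only: Sashelp.Iris and Work.Scores1 are assumptions */
proc candisc data=sashelp.iris ncan=1 out=work.scores1;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* List the variables written to the OUT= data set */
proc contents data=work.scores1 short;
run;
```

Rerunning the step without NCAN= is a simple way to check how many canonical variables the OUT= data set contains by default.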

NOPRINT

  • suppresses the normal display of results. Note that this option temporarily disables the Output Delivery System (ODS); see Chapter 14, Using the Output Delivery System, for more information.

OUT= SAS-data-set

  • creates an output SAS data set containing the original data and the canonical variable scores. To create a permanent SAS data set, specify a two-level name (refer to SAS Language Reference: Concepts for more information on permanent SAS data sets).

OUTSTAT= SAS-data-set

  • creates a TYPE=CORR output SAS data set that contains various statistics, including class means, standard deviations, correlations, canonical correlations, canonical structures, canonical coefficients, and means of canonical variables for each class. To create a permanent SAS data set, specify a two-level name (refer to SAS Language Reference: Concepts for more information on permanent SAS data sets).
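One way to see what an OUTSTAT= data set holds is to tabulate its record types. The sketch below assumes the Sashelp.Iris sample data set and uses Work.CStat as a hypothetical output name. The _TYPE_ variable is the usual record label in TYPE=CORR data sets; that is a general SAS convention and is not stated in the text above.

```sas
/* Sketch only: Sashelp.Iris and Work.CStat are assumptions */
proc candisc data=sashelp.iris outstat=work.cstat noprint;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Count the records of each statistic type stored in the OUTSTAT= data set */
proc freq data=work.cstat;
   tables _TYPE_;
run;
```

NOPRINT is used here because only the output data set is of interest.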

PCORR

  • displays pooled within-class correlations (partial correlations based on the pooled within-class covariances).

PCOV

  • displays pooled within-class covariances.

PREFIX= name

  • specifies a prefix for naming the canonical variables. By default, the names are Can1, Can2, Can3, and so forth. If you specify PREFIX=Abc, the components are named Abc1, Abc2, and so on. The number of characters in the prefix, plus the number of digits required to designate the canonical variables, should not exceed 32. The prefix is truncated if the combined length exceeds 32.
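As a small illustration of the naming rule, the following sketch renames the canonical variables with PREFIX=Disc and then summarizes them by class. Sashelp.Iris, Work.Scores2, and the prefix Disc are all assumptions chosen for the example.

```sas
/* Sketch only: data set names and the prefix are assumptions */
proc candisc data=sashelp.iris ncan=2 prefix=Disc out=work.scores2 short;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Class means of the renamed canonical variables Disc1 and Disc2 */
proc means data=work.scores2 mean;
   class Species;
   var Disc1 Disc2;
run;
```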

PSSCP

  • displays the pooled within-class corrected SSCP matrix.

SHORT

  • suppresses the display of canonical structures, canonical coefficients, and class means on canonical variables; only tables of canonical correlations and multivariate test statistics are displayed.

SIMPLE

  • displays simple descriptive statistics for the total sample and within each class.

SINGULAR= p

  • specifies the criterion for determining the singularity of the total-sample correlation matrix and the pooled within-class covariance matrix, where 0 < p < 1. The default is SINGULAR=1E-8.

    Let S be the total-sample correlation matrix. If the R-square for predicting a quantitative variable in the VAR statement from the variables preceding it exceeds 1 - p, S is considered singular. If S is singular, the probability levels for the multivariate test statistics and canonical correlations are adjusted for the number of variables with R-square exceeding 1 - p.

    If S is considered singular and the inverse of S (Squared Mahalanobis Distances) is required, a quasi-inverse is used instead. For details, see the Quasi-Inverse section in Chapter 25, The DISCRIM Procedure.

STDMEAN

  • displays total-sample and pooled within-class standardized class means.

TCORR

  • displays total-sample correlations.

TCOV

  • displays total-sample covariances.

TSSCP

  • displays the total-sample corrected SSCP matrix.

WCORR

  • displays within-class correlations for each class level.

WCOV

  • displays within-class covariances for each class level.

WSSCP

  • displays the within-class corrected SSCP matrix for each class level.

BY Statement

  • BY variables ;

You can specify a BY statement with PROC CANDISC to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.

If your input data set is not sorted in ascending order, use one of the following alternatives:

  • Sort the data using the SORT procedure with a similar BY statement.

  • Specify the BY statement option NOTSORTED or DESCENDING in the BY statement for the CANDISC procedure. The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.

  • Create an index on the BY variables using the DATASETS procedure (in base SAS software).

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.
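The first alternative above (sorting with PROC SORT and a matching BY statement) might look like the following sketch. It assumes the Sashelp.Iris sample data set and adds an artificial two-level variable, Batch, purely so that there is something to group by. Batch and Work.Iris_By are hypothetical names.

```sas
/* Add an artificial BY variable (Batch is hypothetical) */
data work.iris_by;
   set sashelp.iris;
   Batch = mod(_n_, 2);
run;

/* Sort so the observations are in ascending order of the BY variable */
proc sort data=work.iris_by;
   by Batch;
run;

/* One canonical discriminant analysis per value of Batch */
proc candisc data=work.iris_by short;
   by Batch;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```

Listing the analysis variables explicitly in the VAR statement keeps the numeric BY variable out of the analysis regardless of how the default variable list is built.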

CLASS Statement

  • CLASS variable ;

The values of the CLASS variable define the groups for analysis. Class levels are determined by the formatted values of the CLASS variable. The CLASS variable can be numeric or character. A CLASS statement is required.
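Because class levels come from formatted values, a format can be used to merge raw values into fewer groups without changing the data. The sketch below assumes the Sashelp.Iris sample data set and assumes that its Species values are spelled 'Setosa', 'Versicolor', and 'Virginica'. The format name $spgrp is hypothetical.

```sas
/* Map three raw Species values onto two formatted class levels */
proc format;
   value $spgrp 'Setosa'                  = 'Setosa'
                'Versicolor', 'Virginica' = 'Other';
run;

/* The FORMAT statement makes PROC CANDISC see two classes instead of three */
proc candisc data=sashelp.iris short;
   class Species;
   format Species $spgrp.;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```

If the raw values are spelled differently, they do not match the format ranges and are not merged, so check the class level information in the output.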

FREQ Statement

  • FREQ variable ;

If a variable in the data set represents the frequency of occurrence for the other values in the observation, include the name of the variable in a FREQ statement. The procedure then treats the data set as if each observation appears n times, where n is the value of the FREQ variable for the observation. The total number of observations is considered to be equal to the sum of the FREQ variable when the procedure determines degrees of freedom for significance probabilities.

If the value of the FREQ variable is missing or is less than one, the observation is not used in the analysis. If the value is not an integer, the value is truncated to an integer.
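The rules above can be tried on a small made-up data set. In this sketch the data values, the data set name Work.Grouped, and the variable names are all invented for illustration. Following the rules just described, the Count value 1.7 is truncated to 1, and the observations with Count equal to 0 or missing are excluded, so the analysis treats the data as 18 observations.

```sas
/* Toy data: values are invented purely to illustrate the FREQ rules */
data work.grouped;
   input Group $ x1 x2 Count;
   datalines;
A 1.0 2.1 3
A 1.4 1.8 2
A 0.9 2.5 1
B 2.2 3.0 4
B 2.6 2.7 2
B 2.0 3.4 1.7
C 3.1 1.2 2
C 3.3 1.4 3
C 3.5 1.0 0
C 2.9 1.6 .
;

/* Each observation is counted Count times (after truncation) */
proc candisc data=work.grouped short;
   class Group;
   freq Count;
   var x1 x2;
run;
```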

VAR Statement

  • VAR variables ;

You specify the quantitative variables to include in the analysis using a VAR statement. If you do not use a VAR statement, the analysis includes all numeric variables not listed in other statements.

WEIGHT Statement

  • WEIGHT variable ;

To use relative weights for each observation in the input data set, place the weights in a variable in the data set and specify the name in a WEIGHT statement. This is often done when the variance associated with each observation is different and the values of the WEIGHT variable are proportional to the reciprocals of the variances. If the value of the WEIGHT variable is missing or is less than zero, then a value of zero for the weight is assumed.

The WEIGHT and FREQ statements have a similar effect except that the WEIGHT statement does not alter the degrees of freedom.
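A common setup, sketched below under stated assumptions, is to derive the weight as the reciprocal of a known error variance. The data values, the data set name Work.Measured, and the variables ErrVar and w are invented for illustration.

```sas
/* Toy data: ErrVar is a hypothetical per-observation error variance */
data work.measured;
   input Group $ x1 x2 ErrVar;
   w = 1 / ErrVar;   /* weight proportional to the reciprocal of the variance */
   datalines;
A 1.0 2.1 0.5
A 1.4 1.8 1.0
A 0.9 2.5 2.0
B 2.2 3.0 0.5
B 2.6 2.7 1.0
B 2.0 3.4 2.0
C 3.1 1.2 0.5
C 3.3 1.4 1.0
C 3.5 1.0 2.0
;

/* Observations with larger w contribute more to the analysis */
proc candisc data=work.measured short;
   class Group;
   weight w;
   var x1 x2;
run;
```

The explicit VAR statement matters here: without it, the numeric variable ErrVar would be picked up as an analysis variable.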




SAS/STAT 9.1 User's Guide (Vol. 1)
SAS/STAT 9.1 User's Guide, Volumes 1-7
ISBN: 1590472438
EAN: 2147483647
Year: 2004
Pages: 156
