Syntax


  • PROC CALIS < options > ;

    • COSAN matrix model ;

      • MATRIX matrix elements ;

      • VARNAMES variables ;

    • LINEQS model equations ;

      • STD variance pattern ;

      • COV covariance pattern ;

    • RAM model list ;

      • VARNAMES variables ;

    • FACTOR < options > ;

      • MATRIX matrix elements ;

      • VARNAMES variables ;

    • BOUNDS boundary constraints ;

    • BY variables ;

    • FREQ variable ;

    • LINCON linear constraints ;

    • NLINCON nonlinear constraints ;

    • NLOPTIONS optimization options ;

    • PARAMETERS parameters ;

    • PARTIAL variables ;

    • STRUCTEQ variables ;

    • VAR variables ;

    • WEIGHT variable ;

    • program statements

  • If no INRAM= data set is specified, one of the four statements that defines the input form of the analysis model, COSAN, LINEQS, RAM, or FACTOR, must be used.

  • The MATRIX statement can be used multiple times for the same or different matrices along with a COSAN or FACTOR statement. If the MATRIX statement is used multiple times for the same matrix, later definitions override earlier ones.

  • The STD and COV statements can be used only with the LINEQS model statement.

  • You can formulate a generalized COSAN model using a COSAN statement. MATRIX statements can be used to define the elements of a matrix used in the COSAN statement. The input notation resembles the COSAN program of R. McDonald and C. Fraser (McDonald 1978, 1980).

  • The RAM statement uses a simple list input that is especially suitable for describing J. McArdle's RAM analysis model (McArdle 1980; McArdle and McDonald 1984) for causal and path analysis problems.

  • The LINEQS statement formulates the analysis model by means of a system of linear equations similar to P. Bentler's (1989) EQS program notation. The STD and COV statements can be used to define the variances and covariances corresponding to elements of the matrix Φ in the LINEQS model.

  • A FACTOR statement can be used to compute a first-order exploratory or confirmatory factor (or component) analysis. The analysis of a simple exploratory factor analysis model performed by PROC CALIS is not as efficient as one performed by the FACTOR procedure. The CALIS procedure is designed for more general structural problems, and it needs significantly more computation time for a simple unrestricted factor or component analysis than does PROC FACTOR.

  • You can add program statements to impose linear or nonlinear constraints on the parameters if you specify the model by means of a COSAN, LINEQS, or RAM statement. The PARAMETERS statement defines additional parameters that are needed as independent variables in your program code and that belong to the set of parameters to be estimated. Variable names used in the program code should differ from the preceding statement names. The code should respect the syntax rules of SAS statements usually used in the DATA step. See the SAS Program Statements section on page 628 for more information.

  • The BOUNDS statement can be used to specify simple lower and upper boundary constraints for the parameters.

  • You can specify general linear equality and inequality constraints with the LINCON statement (or via an INEST= data set). The NLINCON statement can be used to specify general nonlinear equality and inequality constraints by referring to nonlinear functions defined by program statements.

  • The VAR, PARTIAL, WEIGHT, FREQ, and BY statements can be used in the same way as in other procedures, for example, the FACTOR or PRINCOMP procedure. You can select a subset of the input variables to analyze with the VAR statement. The PARTIAL statement defines a set of input variables that are chosen as partial variables for the analysis of a matrix of partial correlations or covariances. The BY statement specifies groups in which separate covariance structure analyses are performed.
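The statement layout above can be illustrated with a minimal sketch of a one-factor model written with LINEQS, STD, and BOUNDS. The data set WORK.SCORES, the manifest variables V1-V4, and all parameter names (lam2, phi, the1, and so on) are hypothetical; only the statement forms are taken from the syntax summary.

```sas
/* Hypothetical raw data set WORK.SCORES with manifest variables V1-V4. */
proc calis data=work.scores cov;
   lineqs
      V1 =      F1 + E1,   /* loading fixed at 1 to set the factor scale */
      V2 = lam2 F1 + E2,
      V3 = lam3 F1 + E3,
      V4 = lam4 F1 + E4;
   std
      F1    = phi,         /* factor variance   */
      E1-E4 = the1-the4;   /* error variances   */
   bounds
      the1-the4 >= 0.;     /* keep error variances nonnegative */
run;
```

In this notation the latent factor name starts with F and the error terms start with E, while V1-V4 are read from the input data set. The STD statement is valid here because the model is specified with LINEQS.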

PROC CALIS Statement

  • PROC CALIS < options > ;

This statement invokes the procedure. The options available with the PROC CALIS statement are summarized in Table 19.1 and discussed in the following six sections.

Table 19.1: PROC CALIS Statement Options

Data Set Options            Short Description

DATA=                       input data set
INEST=                      input initial values, constraints
INRAM=                      input model
INWGT=                      input weight matrix
OUTEST=                     covariance matrix of estimates
OUTJAC                      Jacobian into OUTEST= data set
OUTRAM=                     output model
OUTSTAT=                    output statistic
OUTWGT=                     output weight matrix

Data Processing             Short Description

AUGMENT                     analyzes augmented moment matrix
COVARIANCE                  analyzes covariance matrix
EDF=                        defines nobs by number error df
NOBS=                       defines number of observations nobs
NOINT                       analyzes uncorrected moments
RDF=                        defines nobs by number regression df
RIDGE                       specifies ridge factor for moment matrix
UCORR                       analyzes uncorrected CORR matrix
UCOV                        analyzes uncorrected COV matrix
VARDEF=                     specifies variance divisor

Estimation Methods          Short Description

METHOD=                     estimation method
ASYCOV=                     formula of asymptotic covariances
DFREDUCE=                   reduces degrees of freedom
G4=                         algorithm for STDERR
NODIAG                      excludes diagonal elements from fit
WPENALTY=                   penalty weight to fit correlations
WRIDGE=                     ridge factor for weight matrix

Optimization Techniques     Short Description

TECHNIQUE=                  minimization method
UPDATE=                     update technique
LINESEARCH=                 line-search method
FCONV=                      function convergence criterion
GCONV=                      gradient convergence criterion
INSTEP=                     initial step length (RADIUS=, SALPHA=)
LSPRECISION=                line-search precision (SPRECISION=)
MAXFUNC=                    max number function calls
MAXITER=                    max number iterations

Displayed Output Options    Short Description

KURTOSIS                    compute and display kurtosis
MODIFICATION                modification indices
NOMOD                       no modification indices
NOPRINT                     suppresses the displayed output
PALL                        all displayed output (ALL)
PCORR                       analyzed and estimated moment matrix
PCOVES                      covariance matrix of estimates
PDETERM                     determination coefficients
PESTIM                      parameter estimates
PINITIAL                    pattern and initial values
PJACPAT                     displays structure of variable and constant elements of the Jacobian matrix
PLATCOV                     latent variable covariances, scores
PREDET                      displays predetermined moment matrix
PRIMAT                      displays output in matrix form
PRINT                       adds default displayed output
PRIVEC                      displays output in vector form
PSHORT                      reduces default output (SHORT)
PSUMMARY                    displays only fit summary (SUMMARY)
PWEIGHT                     weight matrix
RESIDUAL=                   residual matrix and distribution
SIMPLE                      univariate statistics
STDERR                      standard errors
NOSTDERR                    computes no standard errors
TOTEFF                      displays total and indirect effects

Miscellaneous Options       Short Description

ALPHAECV=                   probability Browne & Cudeck ECV
ALPHARMS=                   probability Steiger & Lind RMSEA
BIASKUR                     biased skewness and kurtosis
DEMPHAS=                    emphasizes diagonal entries
FDCODE                      uses numeric derivatives for code
HESSALG=                    algorithm for Hessian
NOADJDF                     no adjustment of df for active constraints
RANDOM=                     randomly generated initial values
SINGULAR=                   singularity criterion
ASINGULAR=                  absolute singularity information matrix
COVSING=                    singularity tolerance of information matrix
MSINGULAR=                  relative M singularity of information matrix
VSINGULAR=                  relative V singularity of information matrix
SLMW=                       probability limit for Wald test
START=                      constant initial values

Data Set Options

DATA= SAS-data-set

  • specifies an input data set that can be an ordinary SAS data set or a specially structured TYPE=CORR, TYPE=COV, TYPE=UCORR, TYPE=UCOV, TYPE=SSCP, or TYPE=FACTOR SAS data set, as described in the section Input Data Sets on page 630. If the DATA= option is omitted, the most recently created SAS data set is used.

INEST | INVAR | ESTDATA= SAS-data-set

  • specifies an input data set that contains initial estimates for the parameters used in the optimization process and can also contain boundary and general linear constraints on the parameters. If the model did not change too much, you can specify an OUTEST= data set from a previous PROC CALIS analysis. The initial estimates are taken from the values of the PARMS observation.

INRAM= SAS-data-set

  • specifies an input data set that contains in RAM list form all information needed to specify an analysis model. The INRAM= data set is described in the section Input Data Sets on page 630. Typically, this input data set is an OUTRAM= data set (possibly modified) from a previous PROC CALIS analysis. If you use an INRAM= data set to specify the analysis model, you cannot use the model specification statements COSAN, MATRIX, RAM, LINEQS, STD, COV, FACTOR, or VARNAMES, but you can use the BOUNDS and PARAMETERS statements and program statements. If the INRAM= option is omitted, you must define the analysis model with a COSAN, RAM, LINEQS, or FACTOR statement.

INWGT= SAS-data-set

  • specifies an input data set that contains the weight matrix W used in generalized least-squares (GLS), weighted least-squares (WLS, ADF), or diagonally weighted least-squares (DWLS) estimation. If the weight matrix W defined by an INWGT= data set is not positive definite, it can be ridged using the WRIDGE= option. See the section Estimation Criteria on page 644 for more information. If no INWGT= data set is specified, default settings for the weight matrices are used in the estimation process. The INWGT= data set is described in the section Input Data Sets on page 630. Typically, this input data set is an OUTWGT= data set from a previous PROC CALIS analysis.

OUTEST | OUTVAR= SAS-data-set

  • creates an output data set containing the parameter estimates, their gradient, Hessian matrix, and boundary and linear constraints. For METHOD=ML, METHOD=GLS, and METHOD=WLS, the OUTEST= data set also contains the information matrix, the approximate covariance matrix of the parameter estimates ((generalized) inverse of information matrix), and approximate standard errors. If linear or nonlinear equality or active inequality constraints are present, the Lagrange multiplier estimates of the active constraints, the projected Hessian, and the Hessian of the Lagrange function are written to the data set. The OUTEST= data set also contains the Jacobian if the OUTJAC option is used.

    The OUTEST= data set is described in the section OUTEST= SAS-data-set on page 634. If you want to create a permanent SAS data set, you must specify a two-level name. Refer to the chapter titled SAS Data Files in SAS Language Reference: Concepts for more information on permanent data sets.

OUTJAC

  • writes the Jacobian matrix, if it has been computed, to the OUTEST= data set. This is useful when the information and Jacobian matrices need to be computed for other analyses.

OUTSTAT= SAS-data-set

  • creates an output data set containing the BY group variables, the analyzed covariance or correlation matrices, and the predicted and residual covariance or correlation matrices of the analysis. You can specify the correlation or covariance matrix in an OUTSTAT= data set as an input DATA= data set in a subsequent analysis by PROC CALIS. The OUTSTAT= data set is described in the section OUTSTAT= SAS-data-set on page 641. If the model contains latent variables, this data set also contains the predicted covariances between latent and manifest variables and the latent variable score regression coefficients (see the PLATCOV option on page 586). If the FACTOR statement is used, the OUTSTAT= data set also contains the rotated and unrotated factor loadings, the unique variances, the matrix of factor correlations, the transformation matrix of the rotation, and the matrix of standardized factor loadings.

    You can specify the latent variable score regression coefficients with PROC SCORE to compute factor scores.

    If you want to create a permanent SAS data set, you must specify a two-level name. Refer to the chapter titled SAS Data Files in SAS Language Reference: Concepts for more information on permanent data sets.

OUTRAM= SAS-data-set

  • creates an output data set containing the model information for the analysis, the parameter estimates, and their standard errors. An OUTRAM= data set can be used as an input INRAM= data set in a subsequent analysis by PROC CALIS. The OUTRAM= data set also contains a set of fit indices; it is described in more detail in the section OUTRAM= SAS-data-set on page 638. If you want to create a permanent SAS data set, you must specify a two-level name. Refer to the chapter titled SAS Data Files in SAS Language Reference: Concepts for more information on permanent data sets.
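As a minimal sketch of the OUTRAM=/INRAM= round trip described above, the first step below stores an unweighted least-squares solution and the second step reads it back as the model specification and starting point for maximum likelihood. The data set names WORK.SCORES and WORK.RAM_ULS, the variables V1-V4, and the parameter names are hypothetical.

```sas
/* Step 1: fit by unweighted least squares and save the model. */
proc calis data=work.scores cov method=uls outram=work.ram_uls;
   lineqs
      V1 =      F1 + E1,
      V2 = lam2 F1 + E2,
      V3 = lam3 F1 + E3,
      V4 = lam4 F1 + E4;
   std
      F1    = phi,
      E1-E4 = the1-the4;
run;

/* Step 2: reuse the saved model; no LINEQS, STD, or COV statements */
/* are given because the INRAM= data set supplies the model.        */
proc calis data=work.scores cov method=ml inram=work.ram_uls;
run;
```

Both data sets here are temporary (one-level WORK names); a two-level name would be needed to keep the OUTRAM= data set across sessions.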

OUTWGT= SAS-data-set

  • creates an output data set containing the weight matrix W used in the estimation process. You cannot create an OUTWGT= data set with an unweighted least-squares or maximum likelihood estimation. The fit function in GLS, WLS (ADF), and DWLS estimation contains the inverse of the (Cholesky factor of the) weight matrix W written in the OUTWGT= data set. The OUTWGT= data set contains the weight matrix on which the WRIDGE= and the WPENALTY= options are applied. An OUTWGT= data set can be used as an input INWGT= data set in a subsequent analysis by PROC CALIS. The OUTWGT= data set is described in the section OUTWGT= SAS-data-set on page 643. If you want to create a permanent SAS data set, you must specify a two-level name. Refer to the chapter titled SAS Data Files in SAS Language Reference: Concepts for more information on permanent data sets.

Data Processing Options

AUGMENT | AUG

  • analyzes the augmented correlation or covariance matrix. Using the AUG option is equivalent to specifying UCORR (NOINT but not COV) or UCOV (NOINT and COV) for a data set that is augmented by an intercept variable INTERCEPT that has constant values equal to 1. The variable INTERCEP can be used instead of the default INTERCEPT only if you specify the SAS option OPTIONS VALIDVARNAME=V6. The dimension of an augmented matrix is one higher than that of the corresponding correlation or covariance matrix. The AUGMENT option is effective only if the data set does not contain a variable called INTERCEPT and if you specify the UCOV, UCORR, or NOINT option.

    Caution: The INTERCEPT variable is included in the moment matrix as the variable with number n + 1. Using the RAM model statement assumes that the first n variable numbers correspond to the n manifest variables in the input data set. Therefore, specifying the AUGMENT option assumes that the numbers of the latent variables used in the RAM or path model have to start with number n + 2.

COVARIANCE | COV

  • analyzes the covariance matrix instead of the correlation matrix. By default, PROC CALIS (like the FACTOR procedure) analyzes a correlation matrix. If the DATA= input data set is a valid TYPE=CORR data set (containing a correlation matrix and standard deviations), using the COV option means that the covariance matrix is computed and analyzed.

DFE | EDF= n

  • makes the effective number of observations n + i, where i is 0 if the NOINT, UCORR, or UCOV option is specified without the AUGMENT option and i is 1 otherwise. You can also use the NOBS= option to specify the number of observations.

DFR | RDF= n

  • makes the effective number of observations the actual number of observations minus the RDF= value. The degree of freedom for the intercept should not be included in the RDF= option. If you use PROC CALIS to compute a regression model, you can specify RDF= number-of-regressor-variables to get approximate standard errors equal to those computed by PROC REG.
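The regression use of RDF= can be sketched as follows. The data set WORK.REG, the variables Y, X1, and X2, and the parameter names are hypothetical; RDF=2 reflects the two regressors and excludes the intercept, as the text requires.

```sas
/* Hypothetical raw data set WORK.REG with response Y and regressors X1, X2. */
proc calis data=work.reg cov rdf=2;
   lineqs
      Y = b1 X1 + b2 X2 + E1;   /* one linear equation, no latent factors */
   std
      E1 = ve;                  /* residual variance */
run;
```

This is a sketch under the assumption that the variances and covariances of the exogenous manifest variables X1 and X2 do not need to be listed in the STD or COV statements.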

NOBS= nobs

  • specifies the number of observations. If the DATA= input data set is a raw data set, nobs is defined by default to be the number of observations in the raw data set. The NOBS= and EDF= options override this default definition. You can use the RDF= option to modify the nobs specification. If the DATA= input data set contains a covariance, correlation, or scalar product matrix, you can specify the number of observations either by using the NOBS=, EDF=, and RDF= options in the PROC CALIS statement or by including a _TYPE_ = N observation in the DATA= input data set.
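When the input is a correlation matrix rather than raw data, the number of observations can travel with the data set. The following sketch builds a small TYPE=CORR data set with a _TYPE_='N' observation and analyzes it; the data set name WORK.CORRMAT, the variables V1-V3, all numeric values, and the parameter names are made up for illustration.

```sas
/* Hypothetical TYPE=CORR input: N, standard deviations, and correlations. */
data work.corrmat(type=corr);
   input _type_ $ _name_ $ V1 V2 V3;
   datalines;
N     .    100  100  100
STD   .    1.2  0.9  1.5
CORR  V1   1.0  0.4  0.3
CORR  V2   0.4  1.0  0.5
CORR  V3   0.3  0.5  1.0
;
run;

/* COV requests analysis of the covariance matrix computed from the      */
/* correlations and standard deviations; N is taken from the _TYPE_='N'  */
/* observation, so NOBS= is not specified here.                          */
proc calis data=work.corrmat cov;
   lineqs
      V1 =      F1 + E1,
      V2 = lam2 F1 + E2,
      V3 = lam3 F1 + E3;
   std
      F1    = phi,
      E1-E3 = the1-the3;
run;
```

Specifying NOBS=100 in the PROC CALIS statement would be the alternative if the data set carried no _TYPE_='N' observation.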

NOINT

  • specifies that no intercept be used in computing covariances and correlations; that is, covariances or correlations are not corrected for the mean. You can specify this option (or UCOV or UCORR) to analyze mean structures in an uncorrected moment matrix, that is, to compute intercepts in systems of structured linear equations (see Example 19.2). The term NOINT is misleading in this case because an uncorrected covariance or correlation matrix is analyzed containing a constant (intercept) variable that is used in the analysis model. The degrees of freedom used in the variance divisor (specified by the VARDEF= option) and some of the assessment of the fit function (see the section Assessment of Fit on page 649) depend on whether an intercept variable is included in the model (the intercept is used in computing the corrected covariance or correlation matrix or is used as a variable in the uncorrected covariance or correlation matrix to estimate mean structures) or not included (an uncorrected covariance or correlation matrix is used that does not contain a constant variable).

RIDGE < = r >

  • defines a ridge factor r for the diagonal of the moment matrix S that is analyzed. The matrix S is transformed to

    $$\tilde{S} = S + r\,\mathrm{diag}(S)$$

    If you do not specify r in the RIDGE option, PROC CALIS tries to ridge the moment matrix S so that the smallest eigenvalue is about $10^{-3}$.

    Caution: The moment matrix in the OUTSTAT= output data set does not contain the ridged diagonal.

UCORR

  • analyzes the uncorrected correlation matrix instead of the correlation matrix corrected for the mean. Using the UCORR option is equivalent to specifying the NOINT option but not the COV option.

UCOV

  • analyzes the uncorrected covariance matrix instead of the covariance matrix corrected for the mean. Using the UCOV option is equivalent to specifying both the COV and NOINT options. You can specify this option to analyze mean structures in an uncorrected covariance matrix, that is, to compute intercepts in systems of linear structural equations (see Example 19.2).
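A mean structure analysis of the kind mentioned above can be sketched by combining UCOV with AUGMENT and using the INTERCEPT variable as a predictor. The data set WORK.REG, the variables Y and X1, and the parameter names a0, b1, and ve are hypothetical.

```sas
/* UCOV analyzes the uncorrected covariance matrix; AUG adds the     */
/* constant variable INTERCEPT so that a0 is estimated as intercept. */
proc calis data=work.reg ucov aug;
   lineqs
      Y = a0 Intercept + b1 X1 + E1;
   std
      E1 = ve;
run;
```

Because AUGMENT takes effect only if the data set has no variable named INTERCEPT, the hypothetical WORK.REG is assumed not to contain one.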

VARDEF= DF | N | WDF | WEIGHT | WGT

  • specifies the divisor used in the calculation of covariances and standard deviations. The default value is VARDEF=DF. The values and associated divisors are displayed in the following table, where i = 0 if the NOINT option is used and i = 1 otherwise and where k is the number of partial variables specified in the PARTIAL statement. Using an intercept variable in a mean structure analysis, by specifying the AUGMENT option, includes the intercept variable in the analysis. In this case, i = 1. When a WEIGHT statement is used, w_j is the value of the WEIGHT variable in the jth observation, and the summation is performed only over observations with positive weight.

    Value           Description               Divisor

    DF              degrees of freedom        N − k − i
    N               number of observations    N
    WDF             sum of weights DF         Σ_j w_j − k − i
    WEIGHT | WGT    sum of weights            Σ_j w_j

Estimation Methods

The default estimation method is maximum likelihood (METHOD=ML), assuming a multivariate normal distribution of the observed variables. The two-stage estimation methods METHOD=LSML, METHOD=LSGLS, METHOD=LSWLS, and METHOD=LSDWLS first compute unweighted least-squares estimates of the model parameters and their residuals. Afterward, these estimates are used as initial values for the optimization process to compute maximum likelihood, generalized least-squares, weighted least-squares, or diagonally weighted least-squares parameter estimates. You can do the same thing by using an OUTRAM= data set with least-squares estimates as an INRAM= data set for a further analysis to obtain the second set of parameter estimates. This strategy is also discussed in the section Use of Optimization Techniques on page 664. For more details, see the Estimation Criteria section on page 644.

METHOD | MET= name

  • specifies the method of parameter estimation. The default is METHOD=ML. Valid values for name are as follows:

    ML | M | MAX

    performs normal-theory maximum likelihood parameter estimation. The ML method requires a nonsingular covariance or correlation matrix.

    GLS | G

    performs generalized least-squares parameter estimation. If no INWGT= data set is specified, the GLS method uses the inverse sample covariance or correlation matrix as weight matrix W. Therefore, METHOD=GLS requires a nonsingular covariance or correlation matrix.

    WLS | W | ADF

    performs weighted least-squares parameter estimation. If no INWGT= data set is specified, the WLS method uses the inverse matrix of estimated asymptotic covariances of the sample covariance or correlation matrix as the weight matrix W. In this case, the WLS estimation method is equivalent to Browne's (1982, 1984) asymptotically distribution-free estimation. The WLS method requires a nonsingular weight matrix.

    DWLS | D

    performs diagonally weighted least-squares parameter estimation. If no INWGT= data set is specified, the DWLS method uses the inverse diagonal matrix of asymptotic variances of the input sample covariance or correlation matrix as the weight matrix W. The DWLS method requires a nonsingular diagonal weight matrix.

    ULS | LS | U

    performs unweighted least-squares parameter estimation.

    LSML | LSM | LSMAX

    performs unweighted least-squares followed by normal-theory maximum likelihood parameter estimation.

    LSGLS | LSG

    performs unweighted least-squares followed by generalized least-squares parameter estimation.

    LSWLS | LSW | LSADF

    performs unweighted least-squares followed by weighted least-squares parameter estimation.

    LSDWLS | LSD

    performs unweighted least-squares followed by diagonally weighted least-squares parameter estimation.

    NONE | NO

    uses no estimation method. This option is suitable for checking the validity of the input information and for displaying the model matrices and initial values.
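A quick way to use METHOD=NONE for the check described in the last entry is sketched below: the model is parsed and the pattern and initial values are displayed with PINITIAL, but no estimation is run. The data set WORK.SCORES, the variables V1-V4, and the parameter names are hypothetical.

```sas
/* Check the model specification without estimating anything. */
proc calis data=work.scores cov method=none pinitial;
   lineqs
      V1 =      F1 + E1,
      V2 = lam2 F1 + E2,
      V3 = lam3 F1 + E3,
      V4 = lam4 F1 + E4;
   std
      F1    = phi,
      E1-E4 = the1-the4;
run;
```

Replacing METHOD=NONE with, for example, METHOD=LSML would request the two-stage estimation (unweighted least squares followed by maximum likelihood) on the same specification.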

ASYCOV | ASC= name

  • specifies the formula for asymptotic covariances used in the weight matrix W for WLS and DWLS estimation. The ASYCOV option is effective only if METHOD=WLS or METHOD=DWLS and no INWGT= input data set is specified. The following formulas are implemented:

    BIASED:

    Browne's (1984) formula (3.4)

    biased asymptotic covariance estimates; the resulting weight matrix is at least positive semidefinite. This is the default for analyzing a covariance matrix.

    UNBIASED:

    Browne's (1984) formula (3.8)

    asymptotic covariance estimates corrected for bias; the resulting weight matrix can be indefinite (that is, can have negative eigenvalues), especially for small N.

    CORR:

    Browne and Shapiro's (1986) formula (3.2)

    (identical to DeLeeuw's (1983) formulas (2,3,4)); the asymptotic variances of the diagonal elements are set to the reciprocal of the value r specified by the WPENALTY= option (default: r = 100). This formula is the default for analyzing a correlation matrix.

    Caution: Using the WLS and DWLS methods with the ASYCOV=CORR option means that you are fitting a correlation (rather than a covariance) structure. Since the fixed diagonal of a correlation matrix for some models does not contribute to the model's degrees of freedom, you can specify the DFREDUCE= i option to reduce the degrees of freedom by the number of manifest variables used in the model. See the section Counting the Degrees of Freedom on page 676 for more information.

DFREDUCE | DFRED= i

  • reduces the degrees of freedom of the χ² test by i. In general, the number of degrees of freedom is the number of elements of the lower triangle of the predicted model matrix C, n(n + 1)/2, minus the number of parameters, t. If the NODIAG option is used, the number of degrees of freedom is additionally reduced by n. Because negative values of i are allowed, you can also increase the number of degrees of freedom by using this option. If the DFREDUCE= or NODIAG option is used in a correlation structure analysis, PROC CALIS does not additionally reduce the degrees of freedom by the number of constant elements in the diagonal of the predicted model matrix, which is otherwise done automatically. See the section Counting the Degrees of Freedom on page 676 for more information.
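The counting rule above can be worked through in a short DATA step. This is only arithmetic on the stated formula, not a reproduction of what PROC CALIS does internally; the values of n, t, and i and the variable names are hypothetical.

```sas
/* Degrees of freedom: n(n+1)/2 - t - i, minus n more if NODIAG is used. */
data _null_;
   n      = 6;    /* number of manifest variables (hypothetical) */
   t      = 13;   /* number of free parameters (hypothetical)    */
   i      = 0;    /* value given in DFREDUCE=                    */
   nodiag = 0;    /* 1 if the NODIAG option is specified         */

   df = n*(n + 1)/2 - t - i - nodiag*n;   /* 21 - 13 - 0 - 0 */
   put 'Degrees of freedom: ' df;
run;
```

With these inputs the lower triangle has 21 elements, so 8 degrees of freedom remain. Setting i to a negative value raises the count, as the text notes.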

G4= i

  • specifies the algorithm to compute the approximate covariance matrix of parameter estimates used for computing the approximate standard errors and modification indices when the information matrix is singular. If the number of parameters t used in the model you analyze is smaller than the value of i, the time-expensive Moore-Penrose (G4) inverse of the singular information matrix is computed by eigenvalue decomposition. Otherwise, an inexpensive pseudo (G1) inverse is computed by sweeping. By default, i = 60. For more details, see the section Estimation Criteria on page 644.

NODIAG | NODI

  • omits the diagonal elements of the analyzed correlation or covariance matrix from the fit function. This option is useful only for special models with constant error variables. The NODIAG option does not allow fitting those parameters that contribute to the diagonal of the estimated moment matrix. The degrees of freedom are automatically reduced by n. A simple example of the usefulness of the NODIAG option is the fit of the first-order factor model, S = FF′ + U². In this case, you do not have to estimate the diagonal matrix of unique variances U², which are fully determined by diag(S − FF′).

WPENALTY | WPEN= r

  • specifies the penalty weight r ≥ 0 for the WLS and DWLS fit of the diagonal elements of a correlation matrix (constant 1s). The criterion for weighted least-squares estimation of a correlation structure is

    $$F_{WLS} = \sum_{i=2}^{n} \sum_{j=1}^{i-1} \sum_{k=2}^{n} \sum_{l=1}^{k-1} w^{ij,kl} (s_{ij} - c_{ij})(s_{kl} - c_{kl}) + r \sum_{i=1}^{n} (s_{ii} - c_{ii})^2$$

    where r is the penalty weight specified by the WPENALTY= r option and the $w^{ij,kl}$ are the elements of the inverse of the reduced (n(n − 1)/2) × (n(n − 1)/2) weight matrix that contains only the nonzero rows and columns of the full weight matrix W. The second term is a penalty term to fit the diagonal elements of the correlation matrix. The default value is 100. The reciprocal of this value replaces the asymptotic variance corresponding to the diagonal elements of a correlation matrix in the weight matrix W, and it is effective only with the ASYCOV=CORR option. The often used value r = 1 seems to be too small in many cases to fit the diagonal elements of a correlation matrix properly. The default WPENALTY= value emphasizes the importance of the fit of the diagonal elements in the correlation matrix. You can decrease or increase the value of r if you want to decrease or increase the importance of the diagonal elements fit. This option is effective only with the WLS or DWLS estimation method and the analysis of a correlation matrix. See the section Estimation Criteria on page 644 for more details.
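A hedged sketch of a correlation structure fit with an explicit penalty weight follows. The COV option is omitted so that the correlation matrix is analyzed, and ASYCOV=CORR and WPENALTY=100 are written out even though the text gives them as defaults for this case. The raw data set WORK.SCORES, the variables V1-V4, and the parameter names are hypothetical.

```sas
/* WLS fit of a correlation structure; raw data are assumed so that */
/* the asymptotic covariances for the weight matrix can be computed. */
proc calis data=work.scores method=wls asycov=corr wpenalty=100;
   lineqs
      V1 = lam1 F1 + E1,
      V2 = lam2 F1 + E2,
      V3 = lam3 F1 + E3,
      V4 = lam4 F1 + E4;
   std
      F1    = 1.,          /* factor variance fixed at 1 */
      E1-E4 = the1-the4;
run;
```

Raising the WPENALTY= value would weight the fit of the unit diagonal more heavily; lowering it would relax that fit.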

WRIDGE= r

  • defines a ridge factor r for the diagonal of the weight matrix W used in GLS, WLS, or DWLS estimation. The weight matrix W is transformed to

    $$\tilde{W} = W + r\,\mathrm{diag}(W)$$

    The WRIDGE= option is applied on the weight matrix

    • before the WPENALTY= option is applied on it

    • before the weight matrix is written to the OUTWGT= data set

    • before the weight matrix is displayed

Optimization Techniques

Since there is no single nonlinear optimization algorithm available that is clearly superior (in terms of stability, speed, and memory) for all applications, different types of optimization techniques are provided in the CALIS procedure. Each technique can be modified in various ways. The default optimization technique for fewer than 40 parameters (t < 40) is TECHNIQUE=LEVMAR. For 40 ≤ t < 400, TECHNIQUE=QUANEW is the default method, and for t ≥ 400, TECHNIQUE=CONGRA is the default method. For more details, see the section Use of Optimization Techniques on page 664. You can specify the following set of options in the PROC CALIS statement or in the NLOPTIONS statement.
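The default selection rule stated above can be written out as a small DATA step. This only restates the thresholds from the text for a hypothetical parameter count t; it does not query PROC CALIS, and it ignores the case of nonlinear constraints, where QUANEW is the default.

```sas
/* Default TECHNIQUE= as a function of the number of parameters t. */
data _null_;
   length tech $ 8;
   t = 55;                                 /* hypothetical parameter count */
   if t < 40 then tech = 'LEVMAR';         /* t < 40         */
   else if t < 400 then tech = 'QUANEW';   /* 40 <= t < 400  */
   else tech = 'CONGRA';                   /* t >= 400       */
   put t= tech=;
run;
```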

TECHNIQUE | TECH= name

OMETHOD | OM= name

  • specifies the optimization technique. Valid values for name are as follows:

    CONGRA | CG

    chooses one of four different conjugate-gradient optimization algorithms, which can be more precisely defined with the UPDATE= option and modified with the LINESEARCH= option. The conjugate-gradient techniques need only O(t) memory compared to the O(t²) memory for the other three techniques, where t is the number of parameters. On the other hand, the conjugate-gradient techniques are significantly slower than other optimization techniques and should be used only when memory is insufficient for more efficient techniques. When you choose this option, UPDATE=PB by default. This is the default optimization technique if there are 400 or more parameters to estimate.

    DBLDOG | DD

    performs a version of double dogleg optimization, which uses the gradient to update an approximation of the Cholesky factor of the Hessian. This technique is, in many aspects, very similar to the dual quasi-Newton method, but it does not use line search. The implementation is based on Dennis and Mei (1979) and Gay (1983).

    LEVMAR | LM | MARQUARDT

    performs a highly stable but, for large problems, memory- and time-consuming Levenberg-Marquardt optimization technique, a slightly improved variant of the Moré (1978) implementation. This is the default optimization technique if there are fewer than 40 parameters to estimate.

    NEWRAP | NR | NEWTON

    performs a usually stable but, for large problems, memory- and time-consuming Newton-Raphson optimization technique. The algorithm combines a line-search algorithm with ridging, and it can be modified with the LINESEARCH= option. In releases prior to Release 6.11, this option invokes the NRRIDG option.

    NRRIDG | NRR | NR

    performs a usually stable but, for large problems, memory- and time-consuming Newton-Raphson optimization technique. This algorithm does not perform a line search. Since TECH=NRRIDG uses an orthogonal decomposition of the approximate Hessian, each iteration of TECH=NRRIDG can be slower than that of TECH=NEWRAP, which works with Cholesky decomposition. However, TECH=NRRIDG usually needs fewer iterations than TECH=NEWRAP.

    QUANEW | QN

    chooses one of four different quasi-Newton optimization algorithms that can be more precisely defined with the UPDATE= option and modified with the LINESEARCH= option. If boundary constraints are used, these techniques sometimes converge slowly. When you choose this option, UPDATE=DBFGS by default. If nonlinear constraints are specified in the NLINCON statement, a modification of Powell's (1982a, 1982b) VMCWD algorithm is used, which is a sequential quadratic programming (SQP) method. This algorithm can be modified by specifying VERSION=1, which replaces the update of the Lagrange multiplier estimate vector µ with the original update of Powell (1978a, 1978b) that is used in the VF02AD algorithm. This can be helpful for applications with linearly dependent active constraints. The QUANEW technique is the default optimization technique if there are nonlinear constraints specified or if there are at least 40 and fewer than 400 parameters to estimate. The QUANEW algorithm uses only first-order derivatives of the objective function and, if available, of the nonlinear constraint functions.

    TRUREG | TR

    performs a usually very stable but, for large problems, memory- and time-consuming trust region optimization technique. The algorithm is implemented similar to Gay (1983) and Moré and Sorensen (1983).

    NONE | NO

    does not perform any optimization. This option is similar to METHOD=NONE, but TECH=NONE also computes and displays residuals and goodness-of-fit statistics. If you specify METHOD=ML, METHOD=LSML, METHOD=GLS, METHOD=LSGLS, METHOD=WLS, or METHOD=LSWLS, this option allows computing and displaying (if the display options are specified) of the standard error estimates and modification indices corresponding to the input parameter estimates.

UPDATE UPD= name

  • specifies the update method for the quasi-Newton or conjugate-gradient optimization technique.

    For TECHNIQUE=CONGRA, the following updates can be used:

    PB

    performs the automatic restart update method of Powell (1977) and Beale (1972). This is the default.

    FR

    performs the Fletcher-Reeves update (Fletcher 1980, p. 63).

    PR

    performs the Polak-Ribiere update (Fletcher 1980, p. 66).

    CD

    performs a conjugate-descent update of Fletcher (1987).

    For TECHNIQUE=DBLDOG, the following updates (Fletcher 1987) can be used:

    DBFGS

    performs the dual Broyden, Fletcher, Goldfarb, and Shanno (BFGS) update of the Cholesky factor of the Hessian matrix. This is the default.

    DDFP

    performs the dual Davidon, Fletcher, and Powell (DFP) update of the Cholesky factor of the Hessian matrix.

    For TECHNIQUE=QUANEW, the following updates (Fletcher 1987) can be used:

    BFGS

    performs the original BFGS update of the inverse Hessian matrix. This is the default for earlier releases.

    DFP

    performs the original DFP update of the inverse Hessian matrix.

    DBFGS

    performs the dual BFGS update of the Cholesky factor of the Hessian matrix. This is the default.

    DDFP

    performs the dual DFP update of the Cholesky factor of the Hessian matrix.

LINESEARCH LIS SMETHOD SM= i

  • specifies the line-search method for the CONGRA, QUANEW, and NEWRAP optimization techniques. Refer to Fletcher (1980) for an introduction to line-search techniques. The value of i can be 1 , ..., 8; the default is i = 2.

    LIS=1

    specifies a line-search method that needs the same number of function and gradient calls for cubic interpolation and cubic extrapolation; this method is similar to one used by the Harwell subroutine library.

    LIS=2

    specifies a line-search method that needs more function calls than gradient calls for quadratic and cubic interpolation and cubic extrapolation; this method is implemented as shown in Fletcher (1987) and can be modified to an exact line search by using the LSPRECISION= option.

    LIS=3

    specifies a line-search method that needs the same number of function and gradient calls for cubic interpolation and cubic extrapolation; this method is implemented as shown in Fletcher (1987) and can be modified to an exact line search by using the LSPRECISION= option.

    LIS=4

    specifies a line-search method that needs the same number of function and gradient calls for stepwise extrapolation and cubic interpolation.

    LIS=5

    specifies a line-search method that is a modified version of LIS=4.

    LIS=6

    specifies golden section line search (Polak 1971), which uses only function values for linear approximation.

    LIS=7

    specifies bisection line search (Polak 1971), which uses only function values for linear approximation.

    LIS=8

    specifies Armijo line-search technique (Polak 1971), which uses only function values for linear approximation.

FCONV FTOL= r

  • specifies the relative function convergence criterion. The optimization process is terminated when the relative difference of the function values of two consecutive iterations is smaller than the specified value of r , that is

    $$\frac{|f(x^{(k)}) - f(x^{(k-1)})|}{\max(|f(x^{(k-1)})|, \mathrm{FSIZE})} \le r$$

    where f(x^{(k)}) is the objective function value at iteration k and FSIZE can be defined by the FSIZE= option in the NLOPTIONS statement. The default value is r = 10^(-FDIGITS), where FDIGITS either can be specified in the NLOPTIONS statement or is set by default to -log10(ε), where ε is the machine precision.
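    The relative function criterion can be checked by hand with a few lines of Python. This is only an illustrative sketch, not SAS code: the function names `fconv_satisfied` and `default_fconv` are hypothetical, the criterion is assumed to be |f(k) - f(k-1)| / max(|f(k-1)|, FSIZE) <= r, and IEEE double precision is assumed for the machine precision.

```python
import math
import sys


def fconv_satisfied(f_prev, f_curr, r, fsize=0.0):
    """Return True when the assumed relative function criterion is met."""
    scale = max(abs(f_prev), fsize)
    if scale == 0.0:
        # Degenerate case not covered by the text: only an exact match passes.
        return f_curr == f_prev
    return abs(f_curr - f_prev) / scale <= r


def default_fconv(eps=sys.float_info.epsilon):
    """Default r = 10**(-FDIGITS), with FDIGITS = -log10(eps)."""
    fdigits = -math.log10(eps)
    return 10.0 ** (-fdigits)
```

    With the default FDIGITS, the default r is therefore essentially the machine precision itself, which is why an explicit, looser FCONV= value is often the practical choice.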

GCONV GTOL= r

  • specifies the relative gradient convergence criterion (see the ABSGCONV= option on page 617 for the absolute gradient convergence criterion).

    Termination of all techniques (except the CONGRA technique) requires the normalized predicted function reduction to be small,

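    $$\frac{g(x^{(k)})^{T} [G^{(k)}]^{-1} g(x^{(k)})}{\max(|f(x^{(k)})|, \mathrm{FSIZE})} \le r$$

    where g is the gradient, G is the Hessian estimate, and FSIZE can be defined by the FSIZE= option in the NLOPTIONS statement. For the CONGRA technique (where a reliable Hessian estimate G is not available),

    $$\frac{\| g(x^{(k)}) \|_2^2 \; \| s(x^{(k)}) \|_2}{\| g(x^{(k)}) - g(x^{(k-1)}) \|_2 \; \max(|f(x^{(k)})|, \mathrm{FSIZE})} \le r$$

    is used, where s is the search direction. The default value is r = 10^(-8).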

    Note that for releases prior to Release 6.11, the GCONV= option specified the absolute gradient convergence criterion.

INSTEP= r

  • For highly nonlinear objective functions, such as the EXP function, the default initial radius of the trust-region algorithms TRUREG, DBLDOG, and LEVMAR or the default step length of the line-search algorithms can produce arithmetic overflows. If this occurs, specify decreasing values of 0 < r < 1 such as INSTEP=1E-1, INSTEP=1E-2, INSTEP=1E-4, ... , until the iteration starts successfully.

    • For trust-region algorithms (TRUREG, DBLDOG, and LEVMAR), the INSTEP option specifies a positive factor for the initial radius of the trust region. The default initial trust-region radius is the length of the scaled gradient, and it corresponds to the default radius factor of r = 1.

    • For line-search algorithms (NEWRAP, CONGRA, and QUANEW), INSTEP specifies an upper bound for the initial step length for the line search during the first five iterations. The default initial step length is r = 1.

  • For releases prior to Release 6.11, specify the SALPHA= and RADIUS= options. For more details, see the section Computational Problems on page 678.

LSPRECISION LSP= r

SPRECISION SP= r

  • specifies the degree of accuracy that should be obtained by the line-search algorithms LIS=2 and LIS=3. Usually an imprecise line search is inexpensive and successful. For more difficult optimization problems, a more precise and more expensive line search may be necessary (Fletcher 1980, p. 22). The second (default for NEWRAP, QUANEW, and CONGRA) and third line-search methods approach an exact line search for small LSPRECISION= values. If you have numerical problems, you should decrease the LSPRECISION= value to obtain a more precise line search. The default LSPRECISION= values are displayed in the following table.

    TECH=     UPDATE=        LSP default
    QUANEW    DBFGS, BFGS    r = 0.4
    QUANEW    DDFP, DFP      r = 0.06
    CONGRA    all            r = 0.1
    NEWRAP    no update      r = 0.9

    For more details, refer to Fletcher (1980, pp. 25-29).

MAXFUNC MAXFU= i

  • specifies the maximum number i of function calls in the optimization process. The default values are displayed in the following table.

    TECH=                             MAXFUNC default
    LEVMAR, NEWRAP, NRRIDG, TRUREG    i = 125
    DBLDOG, QUANEW                    i = 500
    CONGRA                            i = 1000

    The default is used if you specify MAXFUNC=0. The optimization can be terminated only after completing a full iteration. Therefore, the number of function calls that is actually performed can exceed the number that is specified by the MAXFUNC= option.

MAXITER MAXIT= i <n>

  • specifies the maximum number i of iterations in the optimization process. The default values are displayed in the following table.

    TECH=                             MAXITER default
    LEVMAR, NEWRAP, NRRIDG, TRUREG    i = 50
    DBLDOG, QUANEW                    i = 200
    CONGRA                            i = 400

    The default is used if you specify MAXITER=0 or if you omit the MAXITER option.

    The optional second value n is valid only for TECH=QUANEW with nonlinear constraints. It specifies an upper bound n for the number of iterations of an algorithm that reduces the violation of nonlinear constraints at a starting point. The default is n = 20. For example, specifying

      maxiter= . 0  

    means that you do not want to exceed the default number of iterations during the main optimization process and that you want to suppress the feasible point algorithm for nonlinear constraints.
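    The default limits in the MAXFUNC= and MAXITER= tables can be summarized as a small lookup. The Python below is an illustrative sketch only, not SAS code; the names `MAXFUNC_DEFAULTS`, `MAXITER_DEFAULTS`, and `effective_limit` are hypothetical, and treating an omitted MAXFUNC= value like MAXFUNC=0 is an assumption (the text states this explicitly only for MAXITER=).

```python
# Default limits by optimization technique, as listed in the tables above.
MAXFUNC_DEFAULTS = {
    "LEVMAR": 125, "NEWRAP": 125, "NRRIDG": 125, "TRUREG": 125,
    "DBLDOG": 500, "QUANEW": 500,
    "CONGRA": 1000,
}
MAXITER_DEFAULTS = {
    "LEVMAR": 50, "NEWRAP": 50, "NRRIDG": 50, "TRUREG": 50,
    "DBLDOG": 200, "QUANEW": 200,
    "CONGRA": 400,
}


def effective_limit(tech, requested, defaults):
    """Resolve a requested limit; 0 or None falls back to the technique default."""
    if requested is None or requested == 0:
        return defaults[tech.upper()]
    return requested
```

    For example, `effective_limit("quanew", 0, MAXITER_DEFAULTS)` resolves to the QUANEW default of 200 iterations.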

RADIUS= r

  • is an alias for the INSTEP= option for Levenberg-Marquardt minimization.

SALPHA= r

  • is an alias for the INSTEP= option for line-search algorithms.

SPRECISION SP= r

  • is an alias for the LSPRECISION= option.

Displayed Output Options

There are three kinds of options to control the displayed output:

  • The PCORR, KURTOSIS, MODIFICATION, NOMOD, PCOVES, PDETERM, PESTIM, PINITIAL, PJACPAT, PLATCOV, PREDET, PWEIGHT, RESIDUAL, SIMPLE, STDERR, and TOTEFF options refer to specific parts of displayed output.

  • The PALL, PRINT, PSHORT, PSUMMARY, and NOPRINT options refer to special subsets of the displayed output options mentioned in the first item. If the NOPRINT option is not specified, a default set of output is displayed. The PRINT and PALL options add other output options to the default output, and the PSHORT and PSUMMARY options reduce the default displayed output.

  • The PRIMAT and PRIVEC options describe the form in which some of the output is displayed (the only nonredundant information displayed by PRIVEC is the gradient).

    Output Options         PALL   PRINT   default   PSHORT   PSUMMARY
    fit indices             *      *        *         *         *
    linear dependencies     *      *        *         *         *
    PREDET                  *     (*)      (*)       (*)
    model matrices          *      *        *         *
    PESTIM                  *      *        *         *
    iteration history       *      *        *         *
    PINITIAL                *      *        *
    SIMPLE                  *      *        *
    STDERR                  *      *        *
    RESIDUAL                *      *
    KURTOSIS                *      *
    PLATCOV                 *      *
    TOTEFF                  *      *
    PCORR                   *
    MODIFICATION            *
    PWEIGHT                 *
    PCOVES
    PDETERM
    PJACPAT
    PRIMAT
    PRIVEC
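    The nesting of the five output levels in the table can be expressed with sets. This Python fragment is an illustrative sketch only, not SAS code; the set names are hypothetical, and the conditional "(*)" marking of PREDET is simplified to plain membership.

```python
# Each level is built from the next smaller one, following the table columns.
PSUMMARY = {"fit indices", "linear dependencies"}
PSHORT = PSUMMARY | {"PREDET", "model matrices", "PESTIM", "iteration history"}
DEFAULT = PSHORT | {"PINITIAL", "SIMPLE", "STDERR"}
PRINT = DEFAULT | {"RESIDUAL", "KURTOSIS", "PLATCOV", "TOTEFF"}
PALL = PRINT | {"PCORR", "MODIFICATION", "PWEIGHT"}


def added_by(larger, smaller):
    """Output parts that the larger level displays beyond the smaller one."""
    return sorted(larger - smaller)
```

    For instance, `added_by(PRINT, DEFAULT)` lists the four parts that the PRINT option adds to the default output, and `added_by(DEFAULT, PSHORT)` lists the three parts that PSHORT removes from it.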

KURTOSIS KU

  • computes and displays univariate kurtosis and skewness, various coefficients of multivariate kurtosis, and the numbers of observations that contribute most to the normalized multivariate kurtosis. See the section Measures of Multivariate Kurtosis on page 658 for more information. Using the KURTOSIS option implies the SIMPLE display option. This information is computed only if the DATA= data set is a raw data set, and it is displayed by default if the PRINT option is specified. The multivariate LS kappa and the multivariate mean kappa are displayed only if you specify METHOD=WLS and the weight matrix is computed from an input raw data set. All measures of skewness and kurtosis are corrected for the mean. If an intercept variable is included in the analysis, the measures of multivariate kurtosis do not include the intercept variable in the corrected covariance matrix, as indicated by a displayed message. Using the BIASKUR option displays the biased values of univariate skewness and kurtosis.

MODIFICATION MOD

  • computes and displays Lagrange multiplier test indices for constant parameter constraints, equality parameter constraints, and active boundary constraints, as well as univariate and multivariate Wald test indices. The modification indices are not computed in the case of unweighted or diagonally weighted least-squares estimation.

    The Lagrange multiplier test (Bentler 1986; Lee 1985; Buse 1982) provides an estimate of the χ² reduction that results from dropping the constraint. For constant parameter constraints and active boundary constraints, the approximate change of the parameter value is also displayed. You can use this value to obtain an initial value if the parameter is allowed to vary in a modified model. For more information, see the section Modification Indices on page 673.

NOMOD

  • does not compute modification indices. The NOMOD option is useful in connection with the PALL option because it saves computing time.

NOPRINT NOP

  • suppresses the displayed output. Note that this option temporarily disables the Output Delivery System (ODS). For more information, see Chapter 14, Using the Output Delivery System.

PALL ALL

  • displays all optional output except the output generated by the PCOVES, PDETERM, PJACPAT, and PRIVEC options.

    Caution: The PALL option includes the very expensive computation of the modification indices. If you do not really need modification indices, you can save computing time by specifying the NOMOD option in addition to the PALL option.

PCORR CORR

  • displays the (corrected or uncorrected) covariance or correlation matrix that is analyzed and the predicted model covariance or correlation matrix.

PCOVES PCE

  • displays the following:

    • the information matrix (crossproduct Jacobian)

    • the approximate covariance matrix of the parameter estimates (generalized inverse of the information matrix)

    • the approximate correlation matrix of the parameter estimates

  • The covariance matrix of the parameter estimates is not computed for estimation methods ULS and DWLS. This displayed output is not included in the output generated by the PALL option.

PDETERM PDE

  • displays three coefficients of determination: the determination of all equations (DETAE), the determination of the structural equations (DETSE), and the determination of the manifest variable equations (DETMV). These determination coefficients are intended to be global means of the squared multiple correlations for different subsets of model equations and variables. The coefficients are displayed only when you specify a RAM or LINEQS model, but they are displayed for all five estimation methods: ULS, GLS, ML, WLS, and DWLS.

    You can use the STRUCTEQ statement to define which equations are structural equations. If you do not use the STRUCTEQ statement, PROC CALIS uses its own default definition to identify structural equations.

    The term structural equation is not defined in a unique way. The LISREL program defines the structural equations by the user-defined BETA matrix. In PROC CALIS, the default definition of a structural equation is an equation that has a dependent left-side variable that appears at least once on the right side of another equation, or an equation that has at least one right-side variable that is the left-side variable of another equation. Therefore, PROC CALIS sometimes identifies more equations as structural equations than the LISREL program does.

    If the model contains structural equations, PROC CALIS also displays the Stability Coefficient of Reciprocal Causation, that is, the largest eigenvalue of the BB′ matrix, where B is the causal coefficient matrix of the structural equations. These coefficients are computed as in the LISREL VI program of Jöreskog and Sörbom (1985). This displayed output is not included in the output generated by the PALL option.
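    The default definition of a structural equation given above can be traced on a small model. The following Python is an illustrative sketch only, not SAS code: the function name `structural_equations`, the dictionary representation, and the variable names in the usage note are all hypothetical.

```python
def structural_equations(equations):
    """Identify structural equations under the default definition.

    `equations` maps each dependent (left-side) variable to the set of
    variables on the right side of its equation.
    """
    structural = set()
    for lhs, rhs in equations.items():
        # Clause 1: the dependent variable appears on the right side
        # of another equation.
        used_elsewhere = any(
            lhs in other_rhs
            for other, other_rhs in equations.items()
            if other != lhs
        )
        # Clause 2: some right-side variable is the dependent variable
        # of another equation.
        has_dependent_predictor = any(
            var in equations and var != lhs for var in rhs
        )
        if used_elsewhere or has_dependent_predictor:
            structural.add(lhs)
    return structural
```

    For a model with equations for y1 and y2 (each depending on a factor f1 and an error term), for f1 (depending on f2 and a disturbance), and for y3 (depending only on the exogenous f2 and an error term), the function marks y1, y2, and f1 as structural but not y3. The equations for y1 and y2 qualify only through the second clause, which illustrates why this definition can flag more equations than a BETA-matrix definition would.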

PESTIM PES

  • displays the parameter estimates. In some cases, this includes displaying the standard errors and t values.

PINITIAL PIN

  • displays the input model matrices and the vector of initial values.

PJACPAT PJP

  • displays the structure of variable and constant elements of the Jacobian matrix. This displayed output is not included in the output generated by the PALL option.

PLATCOV PLC

  • displays the following:

    • the estimates of the covariances among the latent variables

    • the estimates of the covariances between latent and manifest variables

    • the latent variable score regression coefficients

  • The estimated covariances between latent and manifest variables and the latent variable score regression coefficients are written to the OUTSTAT= data set. You can use the score coefficients with PROC SCORE to compute factor scores.

PREDET PRE

  • displays the pattern of variable and constant elements of the predicted moment matrix that is predetermined by the analysis model. It is especially helpful in finding manifest variables that are not used or that are used as exogenous variables in a complex model specified in the COSAN statement. Those entries of the predicted moment matrix for which the model generates variable (rather than constant) elements are displayed as missing values. This output is displayed even without specifying the PREDET option if the model generates constant elements in the predicted model matrix different from those in the analysis moment matrix and if you specify at least the PSHORT amount of displayed output.

  • If the analyzed matrix is a correlation matrix (containing constant elements of 1s in the diagonal) and the model generates a predicted model matrix with q constant (rather than variable) elements in the diagonal, the degrees of freedom are automatically reduced by q. The output generated by the PREDET option displays those constant diagonal positions. If you specify the DFREDUCE= or NODIAG option, this automatic reduction of the degrees of freedom is suppressed. See the section Counting the Degrees of Freedom on page 676 for more information.

PRIMAT PMAT

  • displays parameter estimates, approximate standard errors, and t values in matrix form if you specify the analysis model in the RAM or LINEQS statement. When a COSAN statement is used, this occurs by default.

PRINT PRI

  • adds the options KURTOSIS, RESIDUAL, PLATCOV, and TOTEFF to the default output.

PRIVEC PVEC

  • displays parameter estimates, approximate standard errors, the gradient, and t values in vector form. The values are displayed with more decimal places. This displayed output is not included in the output generated by the PALL option.

PSHORT SHORT PSH

  • excludes the output produced by the PINITIAL, SIMPLE, and STDERR options from the default output.

PSUMMARY SUMMARY PSUM

  • displays the fit assessment table and the ERROR, WARNING, and NOTE messages.

PWEIGHT PW

  • displays the weight matrix W used in the estimation. The weight matrix is displayed after the WRIDGE= and the WPENALTY= options are applied to it.

RESIDUAL RES < = NORM | VARSTAND | ASYSTAND >

  • displays the absolute and normalized residual covariance matrix, the rank order of the largest residuals, and a bar chart of the residuals. This information is displayed by default when you specify the PRINT option.

    Three types of normalized or standardized residual matrices can be chosen with the RESIDUAL= specification.

    RESIDUAL= NORM        Normalized Residuals
    RESIDUAL= VARSTAND    Variance Standardized Residuals
    RESIDUAL= ASYSTAND    Asymptotically Standardized Residuals

  • For more details, see the section Assessment of Fit on page 649.

SIMPLE S

  • displays means, standard deviations, skewness, and univariate kurtosis if available. This information is displayed when you specify the PRINT option. If you specify the UCOV, UCORR, or NOINT option, the standard deviations are not corrected for the mean. If the KURTOSIS option is specified, the SIMPLE option is set by default.

STDERR SE

  • displays approximate standard errors if estimation methods other than unweighted least squares (ULS) or diagonally weighted least squares (DWLS) are used (and the NOSTDERR option is not specified). If you specify neither the STDERR nor the NOSTDERR option, the standard errors are computed for the OUTRAM= data set. This information is displayed by default when you specify the PRINT option.

NOSTDERR NOSE

  • specifies that standard errors should not be computed. Standard errors are not computed for unweighted least-squares (ULS) or diagonally weighted least-squares (DWLS) estimation. In general, standard errors are computed even if the STDERR display option is not used (for file output).

TOTEFF TE

  • computes and displays total effects and indirect effects.

Miscellaneous Options

ALPHAECV= α

  • specifies the significance level for a 1 - α confidence interval, 0 ≤ α ≤ 1, for the Browne & Cudeck (1993) expected cross validation index (ECVI). The default value is α = 0.1, which corresponds to a 90% confidence interval for the ECVI.

ALPHARMS= α

  • specifies the significance level for a 1 - α confidence interval, 0 ≤ α ≤ 1, for the Steiger & Lind (1980) root mean squared error of approximation (RMSEA) coefficient (refer to Browne and Du Toit 1992). The default value is α = 0.1, which corresponds to a 90% confidence interval for the RMSEA.

ASINGULAR ASING= r

  • specifies an absolute singularity criterion r , r > 0, for the inversion of the information matrix, which is needed to compute the covariance matrix. The following singularity criterion is used:

    $$|d_{j,j}| \le \max(\mathrm{ASING},\; \mathrm{VSING} \cdot |H_{j,j}|,\; \mathrm{MSING} \cdot \max(|H_{1,1}|, \ldots, |H_{n,n}|))$$

    In the preceding criterion, d_{j,j} is the diagonal pivot of the matrix, H_{j,j} is the jth diagonal element of the matrix H that is inverted, and VSING and MSING are the specified values of the VSINGULAR= and MSINGULAR= options. The default value for ASING is the square root of the smallest positive double precision value. Note that, in many cases, a normalized matrix D^(-1) H D^(-1) is decomposed, and the singularity criteria are modified correspondingly.
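    The way the three singularity options combine can be sketched in Python. This is an illustration only, not SAS code: the function names are hypothetical, and the criterion is assumed to be |d_jj| <= max(ASING, VSING * |H_jj|, MSING * max_i |H_ii|), so a pivot is treated as singular when it falls at or below the largest of the absolute and the two relative thresholds.

```python
def singular_threshold(h_diag, j, asing, vsing, msing):
    """Largest of the absolute (ASING) and relative (VSING, MSING) thresholds."""
    largest_diag = max(abs(h) for h in h_diag)
    return max(asing, vsing * abs(h_diag[j]), msing * largest_diag)


def is_singular_pivot(d_jj, h_diag, j, asing, vsing, msing):
    """Return True when pivot d_jj is judged singular under the assumed rule."""
    return abs(d_jj) <= singular_threshold(h_diag, j, asing, vsing, msing)
```

    With diagonal elements (4, 1, 9), VSING=1E-8, and MSING=1E-12, the threshold for the second pivot is 1E-8, so a pivot of 1E-9 is flagged as singular while a pivot of 1E-3 is not.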

BIASKUR

  • computes univariate skewness and kurtosis by formulas uncorrected for bias. See the section Measures of Multivariate Kurtosis on page 658 for more information.

COVSING= r

  • specifies a nonnegative threshold r , which determines whether the eigenvalues of the information matrix are considered to be zero. If the inverse of the information matrix is found to be singular (depending on the VSINGULAR=, MSINGULAR=, ASINGULAR=, or SINGULAR= option), a generalized inverse is computed using the eigenvalue decomposition of the singular matrix. Those eigenvalues smaller than r are considered to be zero. If a generalized inverse is computed and you do not specify the NOPRINT option, the distribution of eigenvalues is displayed.

DEMPHAS DE= r

  • changes the initial values of all parameters that are located on the diagonals of the central model matrices by the relationship

    $$\mathrm{diag}_{new} = r \, (|\mathrm{diag}_{old}| + 1)$$

    The initial values of the diagonal elements of the central matrices should always be nonnegative to generate positive definite predicted model matrices in the first iteration. By using values of r > 1, for example, r = 2, r = 10, ... , you can increase these initial values to produce predicted model matrices with high positive eigenvalues in the first iteration. The DEMPHAS= option is effective independent of the way the initial values are set; that is, it changes the initial values set in the model specification as well as those set by an INRAM= data set and those automatically generated for RAM, LINEQS, or FACTOR model statements. It also affects the initial values set by the START= option, which uses, by default, DEMPHAS=100 if a covariance matrix is analyzed and DEMPHAS=10 for a correlation matrix.

FDCODE

  • replaces the analytic derivatives of the program statements by numeric derivatives (finite difference approximations). In general, this option is needed only when you have program statements that are too difficult for the built-in function compiler to differentiate analytically. For example, if the program code for the nonlinear constraints contains many arrays and many DO loops with array processing, the built-in function compiler can require too much time and memory to compute derivatives of the constraints with respect to the parameters. In this case, the Jacobian matrix of constraints is computed numerically by using finite difference approximations. The FDCODE option does not modify the kind of derivatives specified with the HESSALG= option.

HESSALG HA= 1 | 2 | 3 | 4 | 5 | 6 | 11

  • specifies the algorithm used to compute the (approximate) Hessian matrix for TECHNIQUE=LEVMAR and TECHNIQUE=NEWRAP, to compute approximate standard errors of the parameter estimates, and to compute Lagrange multipliers. There are different groups of algorithms available.

    • analytic formulas: HA= 1,2,3,4,11

    • finite difference approximation: HA= 5,6

    • dense storage: HA= 1,2,3,4,5,6

    • sparse storage: HA= 11

  • If the Jacobian is more than 25% dense, the dense analytic algorithm, HA= 1, is used by default. The HA= 1 algorithm is faster than the other dense algorithms, but it needs considerably more memory for large problems than HA= 2,3,4. If the Jacobian is more than 75% sparse, the sparse analytic algorithm, HA= 11, is used by default. The dense analytic algorithm HA= 4 corresponds to the original COSAN algorithm; you are advised not to specify HA= 4 due to its very slow performance. If there is not enough memory available for the dense analytic algorithm HA= 1 and you must specify HA= 2 or HA= 3, it may be more efficient to use one of the quasi-Newton or conjugate-gradient optimization techniques since Levenberg-Marquardt and Newton-Raphson optimization techniques need to compute the Hessian matrix in each iteration. For approximate standard errors and modification indices, the Hessian matrix has to be computed at least once, regardless of the optimization technique.

    The algorithms HA= 5 and HA= 6 compute approximate derivatives by using forward difference formulas. The HA= 5 algorithm corresponds to the analytic HA= 1: it is faster than HA= 6, but it needs much more memory. The HA= 6 algorithm corresponds to the analytic HA= 2: it is slower than HA= 5, but it needs much less memory.

    Test computations of large sparse problems show that the sparse algorithm HA= 11 can be up to ten times faster than HA= 1 (and needs much less memory).

MSINGULAR MSING= r

  • specifies a relative singularity criterion r , r> 0, for the inversion of the information matrix, which is needed to compute the covariance matrix. The following singularity criterion is used:

    $$|d_{j,j}| \le \max(\mathrm{ASING},\; \mathrm{VSING} \cdot |H_{j,j}|,\; \mathrm{MSING} \cdot \max(|H_{1,1}|, \ldots, |H_{n,n}|))$$

    where d_{j,j} is the diagonal pivot of the matrix, and ASING and VSING are the specified values of the ASINGULAR= and VSINGULAR= options. If you do not specify the SINGULAR= option, the default value for MSING is 1E-12; otherwise, the default value is 1E-4 * SINGULAR. Note that, in many cases, a normalized matrix D^(-1) H D^(-1) is decomposed, and the singularity criteria are modified correspondingly.

NOADJDF

  • turns off the automatic adjustment of degrees of freedom when there are active constraints in the analysis. When the adjustment is in effect, most fit statistics and the associated probability levels will be affected. This option should be used when the researcher believes that the active constraints observed in the current sample will have little chance to occur in repeated sampling.

RANDOM = i

  • specifies a positive integer as a seed value for the pseudo-random number generator to generate initial values for the parameter estimates for which no other initial value assignments in the model definitions are made. Except for the parameters in the diagonal locations of the central matrices in the model, the initial values are set to random numbers in the range 0 ≤ r ≤ 1. The values for parameters in the diagonals of the central matrices are random numbers multiplied by 10 or 100. For more information, see the section Initial Estimates on page 661.

SINGULAR SING = r

  • specifies the singularity criterion r , 0 <r< 1, used, for example, for matrix inversion. The default value is the square root of the relative machine precision or, equivalently, the square root of the largest double precision value that, when added to 1, results in 1.

SLMW= r

  • specifies the probability limit used for computing the stepwise multivariate Wald test. The process stops when the univariate probability is smaller than r. The default value is r = 0.05.

START = r

  • In general, this option is needed only in connection with the COSAN model statement, and it specifies a constant r as an initial value for all the parameter estimates for which no other initial value assignments in the pattern definitions are made. Start values in the diagonal locations of the central matrices are set to 100 r if a COV or UCOV matrix is analyzed and 10 r if a CORR or UCORR matrix is analyzed. The default value is r = 0.5. Unspecified initial values in a FACTOR, RAM, or LINEQS model are usually computed by PROC CALIS. If none of the initialization methods is able to compute all starting values for a model specified by a FACTOR, RAM, or LINEQS statement, then the start values of parameters that could not be computed are set to r, 10 r, or 100 r. If the DEMPHAS= option is used, the initial values of the diagonal elements of the central model matrices are multiplied by the value specified in the DEMPHAS= option. For more information, see the section Initial Estimates on page 661.

VSINGULAR VSING= r

  • specifies a relative singularity criterion r , r > 0, for the inversion of the information matrix, which is needed to compute the covariance matrix. The following singularity criterion is used:

    $$|d_{j,j}| \le \max(\mathrm{ASING},\; \mathrm{VSING} \cdot |H_{j,j}|,\; \mathrm{MSING} \cdot \max(|H_{1,1}|, \ldots, |H_{n,n}|))$$

    where d_{j,j} is the diagonal pivot of the matrix, and ASING and MSING are the specified values of the ASINGULAR= and MSINGULAR= options. If you do not specify the SINGULAR= option, the default value for VSING is 1E-8; otherwise, the default value is SINGULAR. Note that in many cases a normalized matrix D^(-1) H D^(-1) is decomposed, and the singularity criteria are modified correspondingly.

COSAN Model Statement

  • COSAN matrix_term < + matrix_term ...> ;

  • where matrix_term represents matrix_definition < * matrix_definition ... >

  • and matrix_definition represents matrix_name (column_number < ,general_form < ,transformation >> )

The COSAN statement constructs the symmetric matrix model for the covariance analysis mentioned earlier (see the section The Generalized COSAN Model on page 552):

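$$C = F_1 P_1 F_1^{\prime} + \cdots + F_m P_m F_m^{\prime}, \qquad F_k = F_{k_1} \cdots F_{k_{n(k)}}, \quad P_k = P_k^{\prime}, \quad k = 1, \ldots, m$$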

You can specify only one COSAN statement with each PROC CALIS statement. The COSAN statement contains m matrix_term s corresponding to the generalized COSAN formula. The matrix_term s are separated by plus signs (+) according to the addition of the terms within the model.

Each matrix_term of the COSAN statement contains the definitions of the first n(k) + 1 matrices, F_{k_j} and P_k, separated by asterisks (*) according to the multiplication of the matrices within the term. The matrices of the right-hand-side product are redundant and are not specified within the COSAN statement.

Each matrix_definition consists of the name of the matrix ( matrix_name ), followed in parentheses by the number of columns of the matrix ( column_number ) and, optionally, one or two matrix properties, separated by commas, describing the form of the matrix.

The number of rows of the first matrix in each term is defined by the input correlation or covariance matrix. You can reorder and reduce the variables in the input moment matrix using the VAR statement. The number of rows of the other matrices within the term is defined by the number of columns of the preceding matrix.

The first matrix property describes the general form of the matrix in the model. You can choose one of the following specifications of the first matrix property. The default first matrix property is GEN.

Code

Description

IDE

specifies an identity matrix; if the matrix is not square, this specification describes an identity submatrix followed by a rectangular zero submatrix.

ZID

specifies an identity matrix; if the matrix is not square, this specification describes a rectangular zero submatrix followed by an identity submatrix.

DIA

specifies a diagonal matrix; if the matrix is not square, this specification describes a diagonal submatrix followed by a rectangular zero submatrix.

ZDI

specifies a diagonal matrix; if the matrix is not square, this specification describes a rectangular zero submatrix followed by a diagonal submatrix.

LOW

specifies a lower triangular matrix; the matrix can be rectangular.

UPP

specifies an upper triangular matrix; the matrix can be rectangular.

SYM

specifies a symmetric matrix; the matrix cannot be rectangular.

GEN

specifies a general rectangular matrix (default).

The second matrix property describes the kind of inverse matrix transformation. If the second matrix property is omitted, no transformation is applied to the matrix.

Code

Description

INV

uses the inverse of the matrix.

IMI

uses the inverse of the difference between the identity and the matrix.

You cannot specify a nonsquare parameter matrix as an INV or IMI model matrix. Specifying a matrix of type DIA, ZDI, UPP, LOW, or GEN is not necessary if you do not use the unspecified location list in the corresponding MATRIX statements. After PROC CALIS processes the corresponding MATRIX statements, the matrix type DIA, ZDI, UPP, LOW, or GEN is recognized from the pattern of possibly nonzero elements. If you do not specify the first matrix property and you use the unspecified location list in a corresponding MATRIX statement, the matrix is recognized as a GEN matrix. You can also generate an IDE or ZID matrix by specifying a DIA, ZDI, or IMI matrix and by using MATRIX statements that define the pattern structure. However, PROC CALIS would be unable to take advantage of the fast algorithms that are available for IDE and ZID matrices in this case.

For example, to specify a second-order factor analysis model

$$C = F_1 F_2 P_2 F_2' F_1' + F_1 U_2^2 F_1' + U_1^2$$

with m1 = 3 first-order factors and m2 = 2 second-order factors and with n = 9 variables, you can use the following COSAN statement:

  cosan F1(3) * F2(2) * P2(2,SYM) + F1(3) * U2(3,DIA) * I1(3,IDE)
        + U1(9,DIA) * I2(9,IDE);

MATRIX Statement

  • MATRIX matrix-name < location > = list < , location = list ...> ;

You can specify one or more MATRIX statements with a COSAN or FACTOR statement. A MATRIX statement specifies which elements of the matrix are constant and which are parameters. You can also assign values to the constant elements and initial values for the parameters. The input notation resembles that used in the COSAN program of R. McDonald and C. Fraser (personal communication), except that in PROC CALIS, parameters are distinguished from constants by giving parameters names instead of by using positive and negative integers.

A MATRIX statement cannot be used for an IDE or ZID matrix. For all other types of matrices, each element is assumed to be a constant of 0 unless a MATRIX statement specifies otherwise. Hence, there must be at least one MATRIX statement for each matrix mentioned in the COSAN statement except for IDE and ZID matrices. There can be more than one MATRIX statement for a given matrix. If the same matrix element is given different definitions, later definitions override earlier definitions.

At the start, all elements of each model matrix, except IDE or ZID matrices, are set equal to 0.

Description of location:

There are several ways to specify the starting location and continuation direction of a list with n + 1, n ≥ 0, elements within the parameter matrix.

[ i,j ]

The list elements correspond to the diagonally continued matrix elements [ i,j ],[ i +1, j +1] , ... , [ i+n,j+n ]. The number of elements is defined by the length of the list and eventually terminated by the matrix boundaries. If the list contains just one element (constant or variable), then it is assigned to the matrix element [ i,j ].

[ i, ]

The list elements correspond to the horizontally continued matrix elements [ i,j ], [ i,j +1] , ... , [ i,j+n ], where the starting column j is the diagonal position for a DIA, ZDI, or UPP matrix and is the first column for all other matrix types. For a SYM matrix, the list elements refer only to the matrix elements in the lower triangle. For a DIA or ZDI matrix, only one list element is accepted.

[ ,j ]

The list elements correspond to the vertically continued matrix elements [ i,j ], [ i +1, j ] , ... , [ i+n,j ], where the starting row i is equal to the diagonal position for a DIA, ZDI, SYM, or LOW matrix and is the first row for each other matrix type. For a SYM matrix, the list elements refer only to the matrix elements in the lower triangle. For a DIA or ZDI matrix, only one list element is accepted.

[ , ]

unspecified location: The list is allocated to all valid matrix positions (except for a ZDI matrix) starting at the element [1,1] and continuing rowwise. The only valid matrix positions for a DIA or ZDI matrix are the diagonal elements; for an UPP or LOW matrix, the valid positions are the elements above or below the diagonal; and for a symmetric matrix, the valid positions are the elements in the lower triangle since the other triangle receives the symmetric allocation automatically. This location definition differs from the definitions with specified pattern locations in one important respect: if the number of elements in the list is smaller than the number of valid matrix elements, the list is repeated in the allocation process until all valid matrix elements are filled.

Omitting the left-hand-side term is equivalent to using [ , ] for an unspecified location.
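The location forms described above can be compared side by side in a short sketch. It assumes a hypothetical COSAN model with four manifest variables, so that B is a 4 x 3 general matrix and S is a 3 x 3 symmetric matrix. All matrix and parameter names are illustrative, and the fragment shows only where list elements land, not a recommended model.

```sas
/* Fragment of a PROC CALIS step; B is 4 x 3 (GEN), S is 3 x 3 (SYM). */
cosan b(3,gen) * s(3,sym);
matrix b
   [1,1] = a1-a3,      /* diagonal run: [1,1], [2,2], [3,3]           */
   [4, ] = r1-r3,      /* row 4 from column 1: [4,1], [4,2], [4,3]    */
   [ ,3] = 2 * 0.5;    /* column 3 from row 1: [1,3], [2,3] fixed 0.5 */
/* No location given: the list fills the lower triangle of S rowwise, */
/* [1,1], [2,1], [2,2], [3,1], [3,2], [3,3], which yields an identity. */
matrix s = 1. 0. 1. 0. 0. 1.;
```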

Description of list:

The list contains numeric values or parameter names, or both, that are assigned to a list of matrix elements starting at a specified position and proceeding in a specified direction. A real number r in the list defines the corresponding matrix element as a constant element with this value. The notation n * r generates n values of r in the list. A name in the list defines the corresponding matrix element as a parameter to be estimated. You can use numbered name lists (X1-X10) or the asterisk notation (5 * X means five occurrences of the parameter X). If a sublist of n1 names inside a list is followed by a list of n2 ≤ n1 real values inside parentheses, the last n2 parameters in the name sublist are given the initial values listed inside the parentheses. For example, the following list

  0. 1. A2-A5 (1.4 1.9 2.5) 5.  

specifies that the first two matrix elements (specified by the location to the left of the equal sign) are constants with values 0 and 1. The next element is parameter A2 with no specified initial value. The next three matrix elements are the variable parameters A3 , A4 , and A5 with initial values 1.4, 1.9, and 2.5, respectively. The next matrix element is specified by the seventh list element to be the constant 5.

If your model contains many unconstrained parameters and it is too cumbersome to find different parameter names, you can specify all those parameters by the same prefix name. A prefix is a short name followed by a colon. The CALIS procedure generates a parameter name by appending an integer suffix to this prefix name. The prefix name should have no more than five or six characters so that the generated parameter name is not longer than eight characters. For example, if the prefix A (the parameter A1) is already used once in a list, the previous example would be identical to

  0. 1. 4 * A: (1.4 1.9 2.5) 5.  

To avoid unintentional equality constraints, the prefix names should not coincide with explicitly defined parameter names.

If you do not assign initial values to the parameters (listed in parentheses following a name sublist within the pattern list), PROC CALIS assigns initial values as follows:

  • If the PROC CALIS statement contains a START= r option, each uninitialized parameter is given the initial value r. The uninitialized parameters in the diagonals of the central model matrices are given the initial value 10|r|, 100|r|, or |r| multiplied by the value specified in the DEMPHAS= option.

  • If the PROC CALIS statement contains a RANDOM= i option, each uninitialized parameter is given a random initial value 0 ≤ r ≤ 1. The uninitialized parameters in the diagonals of the central model matrices are given the random values multiplied by 10, 100, or the value specified in the DEMPHAS= option.

  • Otherwise, the initial value is set corresponding to START=0.5.

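For example, to specify a confirmatory second-order factor analysis model

$$C = F_1 F_2 P_2 F_2' F_1' + F_1 U_2^2 F_1' + U_1^2$$

with m1 = 3 first-order factors, m2 = 2 second-order factors, and n = 9 variables and the following matrix pattern,

$$F_1 = \begin{pmatrix} x_1 & 0 & 0 \\ x_2 & 0 & 0 \\ x_3 & 0 & 0 \\ 0 & x_4 & 0 \\ 0 & x_5 & 0 \\ 0 & x_6 & 0 \\ 0 & 0 & x_7 \\ 0 & 0 & x_8 \\ 0 & 0 & x_9 \end{pmatrix}, \quad U_1 = \operatorname{diag}(u_1, \ldots, u_9)$$

$$F_2 = \begin{pmatrix} y_1 & 0 \\ y_1 & y_2 \\ 0 & y_2 \end{pmatrix}, \quad P_2 = \begin{pmatrix} p & 0 \\ 0 & p \end{pmatrix}, \quad U_2 = \operatorname{diag}(v_1, v_2, v_3)$$

you can specify the following COSAN and MATRIX statements: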

  cosan f1(3) * f2(2) * p2(2,dia) + f1(3) * u2(3,dia) * i1(3,ide)
        + u1(9,dia) * i2(9,ide);
  matrix f1
     [ ,1]= x1-x3,
     [ ,2]= 3 * 0. x4-x6,
     [ ,3]= 6 * 0. x7-x9;
  matrix u1
     [1,1]= u1-u9;
  matrix f2
     [ ,1]= 2 * y1,
     [ ,2]= 0. 2 * y2;
  matrix u2 = 3 * v:;
  matrix p2 = 2 * p;
  run;

The matrix pattern includes several equality constraints. Two loadings in the first and second factor of F2 (parameter names Y1 and Y2) and the two factor correlations in the diagonal of matrix P2 (parameter name P) are constrained to be equal. There are many other ways to specify the same model. See Figure 19.2 for the path diagram of this model.

Figure 19.2: Path Diagram of Second-Order Factor Analysis Model

The MATRIX statement can also be used with the FACTOR model statement. See the section "Using the FACTOR and MATRIX Statements" on page 608 for the usage.

RAM Model Statement

  • RAM list-entry < , list-entry ...> ;

    where list-entry represents matrix-number row-number column-number <value><parameter-name>

The RAM statement defines the elements of the symmetric RAM matrix model

$$v = A v + u$$

in the form of a list type input (McArdle and McDonald 1984).

The covariance structure is given by

$$C = J (I - A)^{-1} P \left( (I - A)^{-1} \right)' J'$$

with selection matrix J and

$$C = E\{ J v v' J' \}, \quad P = E\{ u u' \}$$

You can specify only one RAM statement with each PROC CALIS statement. Using the RAM statement requires that the first n variable numbers in the path diagram and in the vector v correspond to the numbers of the n manifest variables of the given covariance or correlation matrix. If you are not sure what the order of the manifest variables in the DATA= data set is, use a VAR statement to specify the order of these observed variables. Using the AUGMENT option includes the INTERCEPT variable as a manifest variable with number n + 1 in the RAM model. In this case, latent variables have to start with n + 2. The box of each manifest variable in the path diagram is assigned the number of the variable in the covariance or correlation matrix.

The selection matrix J is always a rectangular identity (IDE) matrix, and it does not have to be specified in the RAM statement. A constant matrix element is defined in a RAM statement by a list-entry with four numbers. You define a parameter element by three or four numbers followed by a name for the parameter. Separate the list entries with a comma. Each list-entry in the RAM statement corresponds to a path in the diagram, as follows:

  • The first number in each list entry ( matrix-number ) is the number of arrow heads of the path, which is the same as the number of the matrix in the RAM model (1 := A , 2 := P ).

  • The second number in each list entry ( row-number ) is the number of the node in the diagram to which the path points, which is the same as the row number of the matrix element.

  • The third number in each list entry ( column-number ) is the number of the node in the diagram from which the path originates, which is the same as the column number of the matrix element.

  • The fourth number ( value ) gives the (initial) value of the path coefficient. If you do not specify a fifth item ( parameter-name ), this number specifies a constant coefficient; otherwise, this number specifies the initial value of this parameter. It is not necessary to specify the fourth item. If you specify neither the fourth nor the fifth item, the constant is set to 1 by default. If the fourth item ( value ) is not specified for a parameter, PROC CALIS tries to compute an initial value for this parameter.

  • If the path coefficient is a parameter rather than a constant, then a fifth item in the list entry ( parameter-name ) is required to assign a name to the parameter. Using the same name for different paths constrains the corresponding coefficients to be equal.
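The list-entry rules above can be sketched for a small one-factor model. The node numbering (manifest variables 1 to 3, latent factor 4) and the parameter names lam2, lam3, the1-the3, and phi are assumptions chosen for this illustration.

```sas
/* Fragment of a PROC CALIS step: nodes 1-3 are manifest, node 4 is latent. */
ram
   1 1 4 1.,          /* one-headed path 4 -> 1, constant coefficient 1    */
   1 2 4 lam2,        /* parameter lam2, no initial value given            */
   1 3 4 0.8 lam3,    /* parameter lam3 with initial value 0.8             */
   2 1 1 the1,        /* two-headed path: element [1,1] of P (variance)    */
   2 2 2 the2,
   2 3 3 the3,
   2 4 4 phi;         /* element [4,4] of P: variance of the latent factor */
```

Giving two entries the same name (for example, lam2 in both loading entries) would constrain those coefficients to be equal.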

If the initial value of a parameter is not specified in the list, the initial value is chosen in one of the following ways:

  • If the PROC CALIS statement contains a RANDOM= i option, then the parameter obtains a randomly generated initial value r, such that 0 ≤ r ≤ 1. The uninitialized parameters in the diagonals of the central model matrices are given the random values r multiplied by 10, 100, or the value specified in the DEMPHAS= option.

  • If the RANDOM= option is not used, PROC CALIS tries to estimate the initial values.

  • If the initial values cannot be estimated, the value of the START= option is used as an initial value.

If your model contains many unconstrained parameters and it is too cumbersome to find different parameter names, you can specify all those parameters by the same prefix name. A prefix is a short name followed by a colon. The CALIS procedure then generates a parameter name by appending an integer suffix to this prefix name. The prefix name should have no more than five or six characters so that the generated parameter name is not longer than eight characters. To avoid unintentional equality constraints, the prefix names should not coincide with explicitly defined parameter names.

For example, you can specify the confirmatory second-order factor analysis model (mentioned on page 595)

$$C = F_1 F_2 P_2 F_2' F_1' + F_1 U_2^2 F_1' + U_1^2$$

using the following RAM model statement.

  ram
     1  1 10    x1,
     1  2 10    x2,
     1  3 10    x3,
     1  4 11    x4,
     1  5 11    x5,
     1  6 11    x6,
     1  7 12    x7,
     1  8 12    x8,
     1  9 12    x9,
     1 10 13    y1,
     1 11 13    y1,
     1 11 14    y2,
     1 12 14    y2,
     2  1  1    u:,
     2  2  2    u:,
     2  3  3    u:,
     2  4  4    u:,
     2  5  5    u:,
     2  6  6    u:,
     2  7  7    u:,
     2  8  8    u:,
     2  9  9    u:,
     2 10 10    v:,
     2 11 11    v:,
     2 12 12    v:,
     2 13 13    p,
     2 14 14    p;
  run;

The confirmatory second-order factor analysis model corresponds to the path diagram displayed in Figure 19.2.

There is a very close relationship between the RAM model algebra and the specification of structural linear models by path diagrams. See Figure 19.3 for an example.

Figure 19.3: Examples of RAM Nomography

Refer to McArdle (1980) for the interpretation of the models displayed in Figure 19.3.

LINEQS Model Statement

  • LINEQS equation < , equation ...> ;

    where equation represents dependent = term < + term... > and where term represents one of the following:

    • coefficient-name < (number) > variable-name

    • prefix-name < (number) > variable-name

    • < number > variable-name

The LINEQS statement defines the LINEQS model

$$\eta = \beta \eta + \gamma \xi$$

You can specify only one LINEQS statement with each PROC CALIS statement. There are some differences from Bentler's notation in choosing the variable names. The length of each variable name is restricted to eight characters. The names of the manifest variables are defined in the DATA= input data set. The VAR statement can be used to select a subset of manifest variables in the DATA= input data set to analyze. You do not need to use a V prefix for manifest variables in the LINEQS statement nor do you need to use a numerical suffix in any variable name. The names of the latent variables must start with the prefix letter F (for Factor); the names of the residuals must start with the prefix letters E (for Error) or D (for Disturbance). The trailing part of the variable name can contain letters or digits. The prefix letter E is used for the errors of the manifest variables, and the prefix letter D is used for the disturbances of the latent variables. The names of the manifest variables in the DATA= input data set can start with F, E, or D, but these names should not coincide with the names of latent or error variables used in the model. The left-hand side (that is, endogenous dependent variable) of each equation should be either a manifest variable of the data set or a latent variable with prefix letter F. The left-hand-side variable should not appear on the right-hand side of the same equation; this means that matrix β should not have a nonzero diagonal element. Each equation should contain, at most, one E or D variable.

The equations must be separated by a comma. The order of the equations is arbitrary. The displayed output generally contains equations and terms in an order different from the input.

Coefficients to estimate are indicated in the equations by a name preceding the independent variable's name. The coefficient's name can be followed by a number inside parentheses indicating the initial value for this coefficient. A number preceding the independent variable's name indicates a constant coefficient. If neither a coefficient name nor a number precedes the independent variable's name, a constant coefficient of 1 is assumed.
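The three kinds of terms can be seen together in a minimal sketch. The manifest variable names y1-y3 and the parameter names lam2, the1-the3, and phi are hypothetical and serve only to illustrate the notation.

```sas
/* Fragment of a PROC CALIS step; y1-y3 are hypothetical manifest variables. */
lineqs
   y1 =            f1 + e1,    /* no name or number: constant coefficient 1 */
   y2 = lam2 (0.8) f1 + e2,    /* parameter lam2 with initial value 0.8     */
   y3 = 0.5        f1 + e3;    /* number only: constant coefficient 0.5     */
std
   e1-e3 = the1-the3,          /* error variances as free parameters        */
   f1    = phi;                /* factor variance as a free parameter       */
```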

If the initial value of a parameter is not specified in the equation, the initial value is chosen in one of the following ways:

  • If you specify the RANDOM= option in the PROC CALIS statement, the parameter obtains a randomly generated initial value r, such that 0 ≤ r ≤ 1. The uninitialized parameters in the diagonals of the central model matrices are given the nonnegative random values r multiplied by 10, 100, or the value specified in the DEMPHAS= option.

  • If the RANDOM= option is not used, PROC CALIS tries to estimate the initial values.

  • If the initial values cannot be estimated, the value of the START= option is used as an initial value.

In Bentler's notation, estimated coefficients are indicated by asterisks. Referring to a parameter in Bentler's notation requires the specification of two variable names that correspond to the row and column of the position of the parameter in the matrix. Specifying the estimated coefficients by parameter names makes it easier to impose additional constraints with code. You do not need any additional statements to express equality constraints. Simply specify the same name for parameters that should have equal values.

If your model contains many unconstrained parameters and it is too cumbersome to find different parameter names, you can specify all those parameters by the same prefix name. A prefix is a short name followed by a colon. The CALIS procedure then generates a parameter name by appending an integer suffix to this prefix name. The prefix name should have no more than five or six characters so that the generated parameter name is not longer than eight characters. To avoid unintentional equality constraints, the prefix names should not coincide with explicitly defined parameter names.

For example, you can specify the confirmatory second-order factor analysis model (mentioned on page 595)

$$C = F_1 F_2 P_2 F_2' F_1' + F_1 U_2^2 F_1' + U_1^2$$

by using the LINEQS and STD statements:

  lineqs
     V1 = X1 F1 + E1,
     V2 = X2 F1 + E2,
     V3 = X3 F1 + E3,
     V4 = X4 F2 + E4,
     V5 = X5 F2 + E5,
     V6 = X6 F2 + E6,
     V7 = X7 F3 + E7,
     V8 = X8 F3 + E8,
     V9 = X9 F3 + E9,
     F1 = Y1 F4 + D1,
     F2 = Y1 F4 + Y2 F5 + D2,
     F3 = Y2 F5 + D3;
  std
     E1-E9 = 9 * U:,
     D1-D3 = 3 * V:,
     F4 F5 = 2 * P;
  run;

STD Statement

  • STD assignment < , assignment ...> ;

    where assignment represents variables = pattern-definition

The STD statement tells which variances are parameters to estimate and which are fixed. The STD statement can be used only with the LINEQS statement. You can specify only one STD statement with each LINEQS model statement. The STD statement defines the diagonal elements of the central model matrix Φ. These elements correspond to the variances of the exogenous variables and to the error variances of the endogenous variables. Elements that are not defined are assumed to be 0.

Each assignment consists of a variable list ( variables ) on the left-hand side and a pattern list ( pattern-definition ) on the right-hand side of an equal sign. The assignments in the STD statement must be separated by commas. The variables list on the left-hand side of the equal sign should contain only names of variables that do not appear on the left-hand side of an equation in the LINEQS statement, that is, exogenous, error, and disturbance variables.

The pattern-definition on the right-hand side is similar to that used in the MATRIX statement. Each list element on the right-hand side defines the variance of the variable on the left-hand side in the same list position. A name on the right-hand side means that the corresponding variance is a parameter to estimate. A name on the right-hand side can be followed by a number inside parentheses that gives the initial value. A number on the right-hand side means that the corresponding variance of the variable on the left-hand side is fixed. If the right-hand-side list is longer than the left-hand-side variable list, the right-hand-side list is shortened to the length of the variable list. If the right-hand-side list is shorter than the variable list, the right-hand-side list is filled with repetitions of the last item in the list.
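The two list-length rules can be sketched as follows, assuming a preceding LINEQS statement that uses the error variables e1-e4 and the factor f1. The parameter names the1 and the2 are hypothetical.

```sas
/* Fragment following a LINEQS statement with errors e1-e4 and factor f1. */
std
   e1-e4 = the1 the2,    /* short list: the last item is repeated, so   */
                         /* e2, e3, and e4 all share parameter the2     */
   f1    = 1. 2.;        /* long list: shortened to one item, so the    */
                         /* variance of f1 is fixed at 1                */
```

Because repeating a name imposes an equality constraint, a short right-hand-side list is a compact way to request equal variances, but it is easy to do so by accident.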

The right-hand side can also contain prefixes. A prefix is a short name followed by a colon. The CALIS procedure then generates a parameter name by appending an integer suffix to this prefix name. The prefix name should have no more than five or six characters so that the generated parameter name is not longer than eight characters. To avoid unintentional equality constraints, the prefix names should not coincide with explicitly defined parameter names. For example, if the prefix A is not used in any previous statement, this STD statement

  std E1-E6=6 * A: (6 * 3.) ;  

defines the six error variances as free parameters A1, ..., A6, all with starting values of 3.

COV Statement

  • COV assignment < , assignment ...> ;

    where assignment represents variables < * variables2 > = pattern-definition

The COV statement tells which covariances are parameters to estimate and which are fixed. The COV statement can be used only with the LINEQS statement. The COV statement differs from the STD statement only in the meaning of the left-hand-side variables list. You can specify only one COV statement with each LINEQS statement. The COV statement defines the off-diagonal elements of the central model matrix Φ. These elements correspond to the covariances of the exogenous variables and to the error covariances of the endogenous variables. Elements that are not defined are assumed to be 0. The assignments in the COV statement must be separated by commas.

The variables list on the left-hand side of the equal sign should contain only names of variables that do not appear on the left-hand side of an equation in the LINEQS statement, that is, exogenous, error, and disturbance variables.

The pattern-definition on the right-hand side is similar to that used in the MATRIX statement. Each list element on the right-hand side defines the covariance of a pair of variables in the list on the left-hand side. A name on the right-hand side can be followed by a number inside parentheses that gives the initial value. A number on the right-hand side means that the corresponding covariance of the variable on the left-hand side is fixed. If the right-hand-side list is longer than the left-hand-side variable list, the right-hand-side list is shortened to the length of the variable list. If the right-hand-side list is shorter than the variable list, the right-hand-side list is filled with repetitions of the last item in the list.

You can use one of two alternatives to refer to parts of Φ. The first alternative uses only one variable list and refers to all distinct pairs of variables within the list. The second alternative uses two variable lists separated by an asterisk and refers to all pairs of variables among the two lists.

Within-List Covariances

Using k variable names in the variables list on the left-hand side of an equal sign in a COV statement means that the parameter list ( pattern-definition ) on the right-hand side refers to all k(k − 1)/2 distinct variable pairs in the below-diagonal part of the Φ matrix. Order is very important. The order relation between the left-hand-side variable pairs and the right-hand-side parameter list is illustrated by the following example:

  COV E1-E4 = PHI1-PHI6 ;  

This is equivalent to the following specification:

  COV E2 E1 = PHI1,
      E3 E1 = PHI2, E3 E2 = PHI3,
      E4 E1 = PHI4, E4 E2 = PHI5, E4 E3 = PHI6;

The symmetric elements are generated automatically. When you use prefix names on the right-hand sides, you do not have to count the exact number of parameters. For example,

  COV E1-E4 = PHI: ;  

generates the same list of parameter names if the prefix PHI is not used in a previous statement.

Figure 19.4: Within-List and Between-List Covariances

Between-List Covariances

Using k1 and k2 variable names in the two lists (separated by an asterisk) on the left-hand side of an equal sign in a COV statement means that the parameter list on the right-hand side refers to all k1 k2 distinct variable pairs in the Φ matrix. Order is very important. The order relation between the left-hand-side variable pairs and the right-hand-side parameter list is illustrated by the following example:

  COV E1 E2 * E3 E4 = PHI1-PHI4 ;  

This is equivalent to the following specification:

  COV E1 E3 = PHI1, E1 E4 = PHI2,
      E2 E3 = PHI3, E2 E4 = PHI4;

The symmetric elements are generated automatically.

Using prefix names on the right-hand sides lets you achieve the same purpose without counting the number of parameters. That is,

  COV E1 E2 * E3 E4 = PHI: ;  

FACTOR Model Statement

  • FACTOR < options > ;

You can use the FACTOR statement to specify an exploratory or confirmatory first-order factor analysis of the given covariance or correlation matrix C,

$$C = F F' + U, \quad U = \text{diag}$$

or

$$C = F P F' + U, \quad P = P'$$

where U is a diagonal matrix and P is symmetric. Within this section, n denotes the number of manifest variables corresponding to the rows and columns of matrix C, and m denotes the number of latent variables (factors or components) corresponding to the columns of the loading matrix F.

You can specify only one FACTOR statement with each PROC CALIS statement. You can specify higher-order factor analysis problems using a COSAN model specification. PROC CALIS requires more computing time and memory than PROC FACTOR because it is designed for more general structural estimation problems and is unable to exploit the special properties of the unconstrained factor analysis model.

For default (exploratory) factor analysis, PROC CALIS computes initial estimates for factor loadings and unique variances by an algebraic method of approximate factor analysis. If you use a MATRIX statement together with a FACTOR model specification, initial values are computed by McDonald's (McDonald and Hartmann 1992) method (if possible). For details, see "Using the FACTOR and MATRIX Statements" on page 608. If neither of the two methods is appropriate, the initial values are set by the START= option.

The unrestricted factor analysis model is not identified because any orthogonally rotated factor loading matrix is equivalent to the result F,

$$C = \tilde{F} \tilde{F}' + U, \quad \tilde{F} = F \Theta, \quad \text{where } \Theta' \Theta = \Theta \Theta' = I$$

To obtain an identified factor solution, the FACTOR statement imposes zero constraints on the m(m − 1)/2 elements in the upper triangle of F by default.

The following options are available in the FACTOR statement.

COMPONENT | COMP

  • computes a component analysis instead of a factor analysis (the diagonal matrix U in the model is set to 0). Note that the rank of FF′ is equal to the number m of components in F. If m is smaller than the number of variables in the moment matrix C, the matrix of predicted model values is singular and maximum likelihood estimates for F cannot be computed. You should compute ULS estimates in this case.

HEYWOOD | HEY

  • constrains the diagonal elements of U to be nonnegative; in other words, the model is replaced by

    $$C = F F' + U^2, \quad U = \text{diag}$$

N = m

  • specifies the number of first-order factors or components. The number m of factors should not exceed the number n of variables in the covariance or correlation matrix analyzed. For the saturated model, m = n , the COMP option should generally be specified for U = 0; otherwise, df < 0. For m = 0 no factor loadings are estimated, and the model is C = U , with U = diag . By default, m = 1.

NORM

  • normalizes the rows of the factor pattern for rotation using Kaiser's normalization.

RCONVERGE= p

RCONV= p

  • specifies the convergence criterion for rotation cycles. The option is applicable to rotation using either the QUARTIMAX, VARIMAX, EQUAMAX, or PARSIMAX method in the ROTATE= option. Rotation stops when the scaled change of the simplicity function value is less than the RCONVERGE= value. The default convergence criterion is

    $$|f_{new} - f_{old}| / K < \epsilon$$

    where f_new and f_old are simplicity function values of the current cycle and the previous cycle, respectively, K = max(1, |f_old|) is a scaling factor, and ε is 1E-9 by default and is modified by the RCONVERGE= value.

RITER= n

  • specifies the maximum number of cycles n for factor rotation using either the QUARTIMAX, VARIMAX, EQUAMAX, or PARSIMAX method in the ROTATE= option. The default n is the larger of 100 and 10 times the number of variables.

ROTATE= name

R= name

  • specifies an orthogonal rotation. By default, ROTATE=NONE. The possible values for name are as follows:

    PRINCIPAL PC

    specifies a principal axis rotation. If ROTATE=PRINCIPAL is used with a factor rather than a component model, the following rotation is performed:

    $$F_{new} = F_{old} T, \quad \text{with } F_{old}' F_{old} = T \Lambda T'$$

    where the columns of matrix T contain the eigenvectors of $F_{old}' F_{old}$.

    QUARTIMAX Q

    specifies quartimax rotation.

    VARIMAX V

    specifies varimax rotation.

    EQUAMAX E

    specifies equamax rotation.

    PARSIMAX P

    specifies parsimax rotation.

    NONE

    performs no rotation (default).
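As a brief sketch of how these options combine, the following step requests an exploratory two-factor analysis with nonnegative unique variances and a Kaiser-normalized varimax rotation. The data set name scores is an assumption, and METHOD=ML is a PROC CALIS statement option rather than a FACTOR option.

```sas
/* Hypothetical data set SCORES; exploratory two-factor analysis. */
proc calis data=scores method=ml;
   factor n=2 heywood rotate=varimax norm;
run;
```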

Using the FACTOR and MATRIX Statements

You can specify the MATRIX statement and the FACTOR statement to compute a confirmatory first-order factor or component analysis. You can define the elements of the matrices F, P, and U of the oblique model,

$$C = F P F' + U^2, \quad P = P', \quad U = \text{diag}$$

To specify the structure for matrix F, P, or U, you have to refer to the matrix _F_, _P_, or _U_ in the MATRIX statement. Matrix names automatically set by PROC CALIS always start with an underscore. As you name your own matrices or variables, you should avoid leading underscores.

The default matrix forms are as follows.

_F_ lower triangular matrix (0 upper triangle for problem identification, removing rotational invariance)

_P_ identity matrix (constant)

_U_ diagonal matrix

For details about specifying the elements in matrices, see the section "MATRIX Statement" on page 593. If you are using at least one MATRIX statement in connection with a FACTOR model statement, you can also use the BOUNDS or PARAMETERS statement and program statements to constrain the parameters named in the MATRIX statement. Initial estimates are computed by McDonald's (McDonald and Hartmann 1992) method. McDonald's method of computing initial values works better if you scale the factors by setting the factor variances to 1 rather than by setting the loadings of the reference variables equal to 1.
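A confirmatory two-factor specification along these lines might look like the following sketch. The data set scores, the variables y1-y6, and all parameter names are assumptions. Single-element locations are used so that the placement of each loading does not depend on the matrix type, and the factor variances are fixed at 1 as the paragraph above suggests.

```sas
/* Hypothetical data set SCORES with six manifest variables y1-y6. */
proc calis data=scores;
   var y1-y6;
   factor n=2;
   matrix _f_
      [1,1] = a1, [2,1] = a2, [3,1] = a3,   /* loadings on factor 1        */
      [4,2] = b1, [5,2] = b2, [6,2] = b3;   /* loadings on factor 2        */
   matrix _p_
      [1,1] = 2 * 1.,                       /* factor variances fixed at 1 */
      [2,1] = rho;                          /* free factor correlation     */
   matrix _u_
      [1,1] = u1-u6;                        /* free diagonal elements of U */
   bounds -1. <= rho <= 1.;
run;
```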

BOUNDS Statement

  • BOUNDS constraint < , constraint ...> ;

    where constraint represents < number operator > parameter-list < operator number >

You can use the BOUNDS statement to define boundary constraints for any parameter that has its name specified in a MATRIX, LINEQS, STD, COV, or RAM statement or that is used in the model of an INRAM= data set. Valid operators are <=, <, >=, >, and = or, equivalently, LE, LT, GE, GT, and EQ. The following is an example of the BOUNDS statement:

  bounds   0.  <= a1-a9 x   <= 1. ,
          -1.  <= c2-c5           ,
                  b1-b10 y  >= 0. ;

You must separate boundary constraints with a comma, and you can specify more than one BOUNDS statement. The feasible region for a parameter is the intersection of all boundary constraints specified for that parameter; if a parameter has a maximum lower boundary constraint larger than its minimum upper bound, the parameter is set equal to the minimum of the upper bounds.

If you need to compute the values of the upper or lower bounds, create a TYPE=EST data set containing _TYPE_ = UPPERBD or _TYPE_ = LOWERBD observations and use it as an INEST= or INVAR= input data set in a later PROC CALIS run.

The BOUNDS statement can contain only parameter names and numerical constants. You cannot use the names of variables created in program statements.

The active set strategies made available in PROC CALIS cannot realize the strict inequality constraints < or > . For example, you cannot specify BOUNDS x > 0; to prevent infinite values for y = log ( x ). Use BOUNDS x > 1E-8; instead.

If the CALIS procedure encounters negative diagonal elements in the central model matrices during the minimization process, serious convergence problems can occur. You can use the BOUNDS statement to constrain these parameters to nonnegative values. Using negative values in these locations can lead to a smaller χ² value but uninterpretable estimates.
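For instance, if the variance parameters of a model were named the1-the3 and phi in an STD or MATRIX statement (hypothetical names used only for this sketch), a single BOUNDS statement could keep them nonnegative:

```sas
/* Fragment; the1-the3 and phi are hypothetical variance parameter names. */
bounds the1-the3 phi >= 0.;
```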

LINCON Statement

  • LINCON constraint < , constraint ...> ;

    where constraint represents number operator linear-term or linear-term operator number ,

    and linear-term is <+-><coefficient * > parameter <<+-><coefficient * > parameter...>

The LINCON statement specifies a set of linear equality or inequality constraints of the form

\[ \sum_{j=1}^{n} a_{ij} x_j \;\{\le,\ =,\ \ge\}\; b_i , \qquad i = 1, \ldots, m \]

The constraints must be separated by commas. Each linear constraint i in the statement consists of a linear combination $\sum_j a_{ij} x_j$ of a subset of the n parameters $x_j$, j = 1, ..., n, and a constant value $b_i$ separated by a comparison operator. Valid operators are <=, <, >=, >, and = or, equivalently, LE, LT, GE, GT, and EQ. PROC CALIS cannot enforce the strict inequalities < or >. Note that the coefficients $a_{ij}$ in the linear combination must be constant numbers and must be followed by an asterisk and the name of a parameter (for example, one listed in the PARMS, STD, or COV statement). The following is an example of the LINCON statement that sets a linear constraint on parameters x1 and x2:

  lincon       x1 + 3 * x2 <= 1;  

Although you can easily express boundary constraints in LINCON statements, for many applications it is much more convenient to specify both the BOUNDS and the LINCON statements in the same PROC CALIS call.

The LINCON statement can contain only parameter names, operators, and numerical constants. If you need to compute the values of the coefficients $a_{ij}$ or right-hand sides $b_i$, you can run a preliminary DATA step and create a TYPE=EST data set containing _TYPE_=LE, _TYPE_=GE, or _TYPE_=EQ observations, then specify this data set as an INEST= or INVAR= data set in a following PROC CALIS run.
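The following sketch shows BOUNDS and LINCON used together in one PROC CALIS call. It assumes a hypothetical data set work.demo with numeric variables v1-v4; the parameter names and the particular constraints are illustrative, not taken from the text.

```sas
/* Assumes a hypothetical data set work.demo with numeric variables v1-v4. */
proc calis data=work.demo cov;
   lineqs
      v1 = b1 f1 + e1,
      v2 = b2 f1 + e2,
      v3 = b3 f1 + e3,
      v4 = b4 f1 + e4;
   std
      f1    = 1.,
      e1-e4 = ve1-ve4;
   bounds 0. <= ve1-ve4;          /* simple bounds are easiest in BOUNDS   */
   lincon b1 - b2 = 0.,           /* equality: the first two loadings match */
          ve1 + ve2 <= 2.;        /* inequality on a sum of two parameters  */
run;
```

Keeping the one-parameter limits in BOUNDS and reserving LINCON for constraints that involve several parameters tends to make the specification easier to read.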

NLINCON Statement

  • NLINCON | NLC constraint < , constraint ...> ;

    where constraint represents

    • number operator variable-list operator number or

      variable-list operator number or

      number operator variable-list

You can specify nonlinear equality and inequality constraints with the NLINCON or NLC statement. The QUANEW optimization subroutine is used when you specify nonlinear constraints using the NLINCON statement.

The syntax of the NLINCON statement is similar to that of the BOUNDS statement, except that the NLINCON statement must contain the names of variables that are defined in the program statements and are defined as continuous functions of parameters in the model. They must not be confused with the variables in the data set.

As with the BOUNDS statement, one- or two-sided constraints are allowed in the NLINCON statement; equality constraints must be one sided. Valid operators are <=, <, >=, >, and = or, equivalently, LE, LT, GE, GT, and EQ.

PROC CALIS cannot enforce the strict inequalities < or > but instead treats them as <= and >=, respectively. The listed nonlinear constraints must be separated by commas. The following is an example of the NLINCON statement that constrains the nonlinear parametric function x1 * x1 + u1, which is defined below in a program statement, to a fixed value of 1:

  nlincon xx = 1;
  xx = x1 * x1 + u1;

Note that x1 and u1 are parameters defined in the model. The following three NLINCON statements, which require xx1 , xx2 , and xx3 to be between zero and ten, are equivalent:

  nlincon 0. <= xx1-xx3,
          xx1-xx3 <= 10;

  nlincon 0. <= xx1-xx3 <= 10.;

  nlincon 10. >= xx1-xx3 >= 0.;
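To show where the NLINCON statement and its program statement sit inside a complete run, here is a sketch that fixes the model-implied variance of the first indicator at 1. The data set work.demo, the variables v1-v4, and all parameter names are hypothetical; the constraint itself is only an illustration of the syntax.

```sas
/* Hypothetical data: four indicators of one latent factor (simulated). */
data work.demo;
   do i = 1 to 200;
      f  = rannor(4711);
      v1 = 0.8*f + 0.6*rannor(4711);
      v2 = 0.7*f + 0.7*rannor(4711);
      v3 = 0.6*f + 0.8*rannor(4711);
      v4 = 0.5*f + 0.9*rannor(4711);
      output;
   end;
   keep v1-v4;
run;

proc calis data=work.demo cov;
   lineqs
      v1 = b1 f1 + e1,
      v2 = b2 f1 + e2,
      v3 = b3 f1 + e3,
      v4 = b4 f1 + e4;
   std
      f1    = 1.,
      e1-e4 = ve1-ve4;
   nlincon xx1 = 1.;        /* xx1 is a program variable, not a data set variable */
   xx1 = b1 * b1 + ve1;     /* program statement: implied variance of v1          */
run;
```

Because a nonlinear constraint is present, the text above implies that the QUANEW technique is used for this run.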

NLOPTIONS Statement

  • NLOPTIONS option(s) ;

Many options that are available in PROC NLP can now be specified for the optimization subroutines in PROC CALIS using the NLOPTIONS statement. The NLOPTIONS statement provides more displayed and file output on the results of the optimization process, and it permits the same set of termination criteria as in PROC NLP. These are more technical options that you may not need to specify in most cases. The available options are summarized in Table 19.2 through Table 19.4, and the options are described in detail in the following three sections.
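A short sketch of an NLOPTIONS statement in context follows. It assumes a hypothetical data set work.demo with numeric variables v1-v4; the option values are arbitrary illustrations, and every option used appears in the tables below.

```sas
/* Assumes a hypothetical data set work.demo with numeric variables v1-v4. */
proc calis data=work.demo cov;
   lineqs
      v1 = b1 f1 + e1,
      v2 = b2 f1 + e2,
      v3 = b3 f1 + e3,
      v4 = b4 f1 + e4;
   std
      f1    = 1.,
      e1-e4 = ve1-ve4;
   nloptions tech=quanew      /* quasi-Newton minimization                */
             maxiter=500      /* allow more iterations than the default   */
             maxfunc=2000     /* allow more function calls                */
             absgconv=1e-4    /* looser absolute gradient criterion       */
             phistory;        /* display the optimization history         */
run;
```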

Table 19.2: Options Documented in the PROC CALIS Statement

  Option              Short Description

  Estimation Methods
  G4=i                algorithm for computing STDERR

  Optimization Techniques
  TECHNIQUE=name      minimization method
  UPDATE=name         update technique
  LINESEARCH=i        line-search method
  FCONV=r             relative change function convergence criterion
  GCONV=r             relative gradient convergence criterion
  INSTEP=r            initial step length (SALPHA=, RADIUS=)
  LSPRECISION=r       line-search precision
  MAXFUNC=i           maximum number of function calls
  MAXITER=i <n>       maximum number of iterations

  Miscellaneous Options
  ASINGULAR=r         absolute singularity criterion for inversion of the information matrix
  COVSING=r           singularity tolerance of the information matrix
  MSINGULAR=r         relative M singularity criterion for inversion of the information matrix
  SINGULAR=r          singularity criterion for inversion of the Hessian
  VSINGULAR=r         relative V singularity criterion for inversion of the information matrix

Table 19.3: Termination Criteria Options

  Option              Short Description

  Options Used by All Techniques
  ABSCONV=r           absolute function convergence criterion
  MAXFUNC=i           maximum number of function calls
  MAXITER=i <n>       maximum number of iterations
  MAXTIME=r           maximum CPU time
  MINITER=i           minimum number of iterations

  Options for Unconstrained and Linearly Constrained Techniques
  ABSFCONV=r <n>      absolute change function convergence criterion
  ABSGCONV=r <n>      absolute gradient convergence criterion
  ABSXCONV=r <n>      absolute change parameter convergence criterion
  FCONV=r <n>         relative change function convergence criterion
  FCONV2=r <n>        function convergence criterion
  FDIGITS=r           precision in computation of the objective function
  FSIZE=r             parameter for FCONV= and GCONV=
  GCONV=r <n>         relative gradient convergence criterion
  GCONV2=r <n>        relative gradient convergence criterion
  XCONV=r <n>         relative change parameter convergence criterion
  XSIZE=r             parameter for XCONV=

  Options for Nonlinearly Constrained Techniques
  ABSGCONV=r <n>      maximum absolute gradient of Lagrange function criterion
  FCONV2=r <n>        predicted objective function reduction criterion
  GCONV=r <n>         normalized predicted objective function reduction criterion

Table 19.4: Miscellaneous Options

  Option              Short Description

  Options for the Approximate Covariance Matrix of Parameter Estimates
  CFACTOR=r           scalar factor for STDERR
  NOHLF               use Hessian of the objective function for STDERR

  Options for Additional Displayed Output
  PALL                display initial and final optimization values
  PCRPJAC             display approximate Hessian matrix
  PHESSIAN            display Hessian matrix
  PHISTORY            display optimization history
  PINIT               display initial values and derivatives (PALL)
  PNLCJAC             display Jacobian matrix of nonlinear constraints (PALL)
  PRINT               display results of the optimization process

  Additional Options for Optimization Techniques
  DAMPSTEP <=r>       controls initial line-search step size
  HESCAL=n            scaling version of Hessian or Jacobian
  LCDEACT=r           Lagrange multiplier threshold of constraint
  LCEPSILON=r         range for boundary and linear constraints
  LCSINGULAR=r        QR decomposition linear dependence criterion
  NOEIGNUM            suppress computation of matrices
  RESTART=i           restart algorithm with a steepest descent direction
  VERSION=1 | 2       quasi-Newton optimization technique version

Options Documented in the PROC CALIS Statement

The following options are the same as in the PROC CALIS statement and are documented in the section PROC CALIS Statement on page 568.

Estimation Method Option

G4=i

  • specifies the method for computing the generalized (G2 or G4) inverse of a singular matrix needed for the approximate covariance matrix of parameter estimates. This option is valid only for applications where the approximate covariance matrix of parameter estimates is found to be singular.

Optimization Technique Options

TECHNIQUE | TECH=name

OMETHOD | OM=name

  • specifies the optimization technique.

UPDATE | UPD=name

  • specifies the update method for the quasi-Newton or conjugate-gradient optimization technique.

LINESEARCH | LIS=i

  • specifies the line-search method for the CONGRA, QUANEW, and NEWRAP optimization techniques.

FCONV | FTOL=r

  • specifies the relative function convergence criterion. For more details, see the section Termination Criteria Options on page 615.

GCONV | GTOL=r

  • specifies the relative gradient convergence criterion. For more details, see the section Termination Criteria Options on page 615.

INSTEP | SALPHA | RADIUS=r

  • restricts the step length of an optimization algorithm during the first iterations.

LSPRECISION | LSP=r

  • specifies the degree of accuracy that should be obtained by the line-search algorithms LIS=2 and LIS=3.

MAXFUNC | MAXFU=i

  • specifies the maximum number i of function calls in the optimization process. For more details, see the section Termination Criteria Options on page 615.

MAXITER | MAXIT=i <n>

  • specifies the maximum number i of iterations in the optimization process. For more details, see the section Termination Criteria Options on page 615.

Miscellaneous Options

ASINGULAR | ASING=r

  • specifies an absolute singularity criterion r, r > 0, for the inversion of the information matrix, which is needed to compute the approximate covariance matrix of parameter estimates.

COVSING=r

  • specifies a nonnegative threshold r, r > 0, that decides whether the eigenvalues of the information matrix are considered to be zero. This option is valid only for applications where the approximate covariance matrix of parameter estimates is found to be singular.

MSINGULAR | MSING=r

  • specifies a relative singularity criterion r, r > 0, for the inversion of the information matrix, which is needed to compute the approximate covariance matrix of parameter estimates.

SINGULAR | SING=r

  • specifies the singularity criterion r, 0 ≤ r ≤ 1, that is used for the inversion of the Hessian matrix. The default value is 1E-8.

VSINGULAR | VSING=r

  • specifies a relative singularity criterion r, r > 0, for the inversion of the information matrix, which is needed to compute the approximate covariance matrix of parameter estimates.

Termination Criteria Options

Let $x^*$ be the point at which the objective function $f(\cdot)$ is optimized, and let $x^{(k)}$ be the parameter values attained at the kth iteration. All optimization techniques stop at the kth iteration if at least one of a set of termination criteria is satisfied. The specified termination criteria should allow termination in an area of sufficient size around $x^*$. You can avoid termination with respect to any of the following function, gradient, or parameter criteria by setting the corresponding option to zero. There is a default set of termination criteria for each optimization technique; most of these default settings make the criteria ineffective for termination. PROC CALIS may have problems due to rounding errors (especially in derivative evaluations) that prevent an optimizer from satisfying strong termination criteria.

Note that PROC CALIS also terminates if the point $x^{(k)}$ is fully constrained by linearly independent active linear or boundary constraints, and all Lagrange multiplier estimates of active inequality constraints are greater than a small negative tolerance.

The following options are available only in the NLOPTIONS statement (except for FCONV, GCONV, MAXFUNC, and MAXITER), and they affect the termination criteria.

Options Used by All Techniques

The following five criteria are used by all optimization techniques.

ABSCONV | ABSTOL=r

  • specifies an absolute function convergence criterion.

    • For minimization, termination requires

      \[ f(x^{(k)}) \le r \]
    • For maximization, termination requires

      \[ f(x^{(k)}) \ge r \]
  • The default value of ABSCONV is

    • for minimization, the negative square root of the largest double precision value

    • for maximization, the positive square root of the largest double precision value

MAXFUNC | MAXFU=i

  • requires the number of function calls to be no larger than i. The default values are listed in the following table.

    TECH=                             MAXFUNC default
    LEVMAR, NEWRAP, NRRIDG, TRUREG    i = 125
    DBLDOG, QUANEW                    i = 500
    CONGRA                            i = 1000

  • The default is used if you specify MAXFUNC=0. The optimization can be terminated only after completing a full iteration. Therefore, the number of function calls that is actually performed can exceed the number that is specified by the MAXFUNC= option.

MAXITER | MAXIT=i <n>

  • requires the number of iterations to be no larger than i. The default values are listed in the following table.

    TECH=                             MAXITER default
    LEVMAR, NEWRAP, NRRIDG, TRUREG    i = 50
    DBLDOG, QUANEW                    i = 200
    CONGRA                            i = 400

  • The default is used if you specify MAXITER=0 or you omit the MAXITER option.

  • The optional second value n is valid only for TECH=QUANEW with nonlinear constraints. It specifies an upper bound n for the number of iterations of an algorithm that reduces the violation of nonlinear constraints at a starting point. The default value is n = 20. For example, specifying MAXITER=. 0 means that you do not want to exceed the default number of iterations during the main optimization process and that you want to suppress the feasible point algorithm for nonlinear constraints.

MAXTIME= r

  • requires the CPU time to be no larger than r . The default value of the MAXTIME= option is the largest double floating point number on your computer.

MINITER | MINIT=i

  • specifies the minimum number of iterations. The default value is i = 0.

    The ABSCONV=, MAXITER=, MAXFUNC=, and MAXTIME= options are useful for dividing a time-consuming optimization problem into a series of smaller problems by using the OUTEST= and INEST= data sets.
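The staged approach described above might look like the following sketch: a first run stops early because of a small MAXITER= value and saves its estimates, and a second run starts from them. The data set work.demo, the macro name onefactor, and the iteration limits are all hypothetical, and it is an assumption here that the OUTEST= data set of an interrupted run holds estimates that are usable as starting values.

```sas
/* Hypothetical data: four indicators of one latent factor (simulated). */
data work.demo;
   do i = 1 to 200;
      f  = rannor(98765);
      v1 = 0.8*f + 0.6*rannor(98765);
      v2 = 0.7*f + 0.7*rannor(98765);
      v3 = 0.6*f + 0.8*rannor(98765);
      v4 = 0.5*f + 0.9*rannor(98765);
      output;
   end;
   keep v1-v4;
run;

/* Hypothetical helper macro so both runs use exactly the same model. */
%macro onefactor;
   lineqs
      v1 = b1 f1 + e1,
      v2 = b2 f1 + e2,
      v3 = b3 f1 + e3,
      v4 = b4 f1 + e4;
   std
      f1    = 1.,
      e1-e4 = ve1-ve4;
%mend onefactor;

/* Stage 1: stop after a few iterations and save the current estimates. */
proc calis data=work.demo cov outest=work.est1;
   %onefactor
   nloptions maxiter=5;
run;

/* Stage 2: continue from the saved estimates with a larger budget. */
proc calis data=work.demo cov inest=work.est1 outest=work.est2;
   %onefactor
   nloptions maxiter=50;
run;
```

For a model this small the split is unnecessary; the pattern matters when a single run would exceed an available time window.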

Options for Unconstrained and Linearly Constrained Techniques

This section contains additional termination criteria for all unconstrained, boundary, or linearly constrained optimization techniques.

ABSFCONV | ABSFTOL=r <n>

  • specifies the absolute function convergence criterion. Termination requires a small change of the function value in successive iterations,

    \[ |f(x^{(k-1)}) - f(x^{(k)})| \le r \]

    The default value is r = 0. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

ABSGCONV | ABSGTOL=r <n>

  • specifies the absolute gradient convergence criterion. Termination requires the maximum absolute gradient element to be small,

    \[ \max_j |g_j(x^{(k)})| \le r \]

    The default value is r = 1E-5. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

    Note: In some applications, the small default value of the ABSGCONV= criterion is too difficult to satisfy for some of the optimization techniques.

ABSXCONV | ABSXTOL=r <n>

  • specifies the absolute parameter convergence criterion. Termination requires a small Euclidean distance between successive parameter vectors,

    \[ \| x^{(k)} - x^{(k-1)} \|_2 \le r \]

    The default value is r = 0. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

FCONV | FTOL=r <n>

  • specifies the relative function convergence criterion. Termination requires a small relative change of the function value in successive iterations,

    \[ \frac{|f(x^{(k)}) - f(x^{(k-1)})|}{\max(|f(x^{(k-1)})|,\ \mathrm{FSIZE})} \le r \]

    where FSIZE is defined by the FSIZE= option. The default value is $r = 10^{-\mathrm{FDIGITS}}$, where FDIGITS either is specified or is set by default to $-\log_{10}(\epsilon)$, where $\epsilon$ is the machine precision. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

FCONV2 | FTOL2=r <n>

  • specifies another function convergence criterion. For least-squares problems, termination requires a small predicted reduction

    \[ df^{(k)} \approx f(x^{(k)}) - f(x^{(k)} + s^{(k)}) \]

    of the objective function.

    The predicted reduction

    \[ df^{(k)} = -g^{(k)\prime} s^{(k)} - \tfrac{1}{2}\, s^{(k)\prime} G^{(k)} s^{(k)} = -\tfrac{1}{2}\, s^{(k)\prime} g^{(k)} \le r \]

    is computed by approximating the objective function f by the first two terms of the Taylor series and substituting the Newton step

    \[ s^{(k)} = -[G^{(k)}]^{-1} g^{(k)} \]

    The FCONV2 criterion is the unscaled version of the GCONV criterion. The default value is r = 0. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

FDIGITS= r

  • specifies the number of accurate digits in evaluations of the objective function. Fractional values such as FDIGITS=4.7 are allowed. The default value is $r = -\log_{10} \epsilon$, where $\epsilon$ is the machine precision. The value of r is used for the specification of the default value of the FCONV= option.
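As a rough worked example of how the two defaults are linked, assume IEEE double precision arithmetic with machine precision $\epsilon \approx 2.22 \times 10^{-16}$ (an assumption about the host, not something stated in this section). The default FDIGITS value and the FCONV= default derived from it would then be approximately:

```latex
\[
\mathrm{FDIGITS} = -\log_{10}\epsilon \approx -\log_{10}\!\left(2.22\times10^{-16}\right) \approx 15.65,
\qquad
\mathrm{FCONV} = 10^{-\mathrm{FDIGITS}} \approx 2.22\times10^{-16}.
\]
\[
\text{With FDIGITS}=4.7:\qquad \mathrm{FCONV} = 10^{-4.7} \approx 2.0\times10^{-5}.
\]
```

Lowering FDIGITS therefore loosens the default relative function criterion, which can be appropriate when the objective function itself is only computed to a few accurate digits.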

FSIZE= r

  • specifies the FSIZE parameter of the relative function and relative gradient termination criteria. The default value is r = 0. See the FCONV= and GCONV= options.

GCONV | GTOL=r <n>

  • specifies the relative gradient convergence criterion. For all techniques except the CONGRA technique, termination requires that the normalized predicted function reduction is small,

    \[ \frac{g(x^{(k)})^{\prime}\, [G^{(k)}]^{-1}\, g(x^{(k)})}{\max(|f(x^{(k)})|,\ \mathrm{FSIZE})} \le r \]

    where FSIZE is defined by the FSIZE= option. For the CONGRA technique (where a reliable Hessian estimate G is not available),

    \[ \frac{\| g(x^{(k)}) \|_2^2 \; \| s(x^{(k)}) \|_2}{\| g(x^{(k)}) - g(x^{(k-1)}) \|_2 \; \max(|f(x^{(k)})|,\ \mathrm{FSIZE})} \le r \]

    is used. The default value is r = 1E-8. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

    Note: The default setting for the GCONV= option sometimes leads to early termination far from the location of the optimum. This is especially true for the special form of this criterion used in the CONGRA optimization.

GCONV2 | GTOL2=r <n>

  • specifies another relative gradient convergence criterion. For least-squares problems and the TRUREG, LEVMAR, NRRIDG, and NEWRAP techniques, the criterion of Browne (1982) is used,

    \[ \max_j \frac{|g_j(x^{(k)})|}{\sqrt{f(x^{(k)})\, G^{(k)}_{j,j}}} \le r \]

    This criterion is not used by the other techniques. The default value is r = 0. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

XCONV | XTOL=r <n>

  • specifies the relative parameter convergence criterion. Termination requires a small relative parameter change in subsequent iterations,

    \[ \frac{\max_j |x_j^{(k)} - x_j^{(k-1)}|}{\max(|x_j^{(k)}|,\ |x_j^{(k-1)}|,\ \mathrm{XSIZE})} \le r \]

    The default value is r = 0. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

XSIZE= r

  • specifies the XSIZE parameter of the relative parameter termination criterion. The default value is r = 0. See the XCONV= option.

Options for Nonlinearly Constrained Techniques

The non-NMSIMP algorithms available for nonlinearly constrained optimization (currently only TECH=QUANEW) do not monotonically reduce either the value of the objective function or some kind of merit function that combines objective and constraint functions. Furthermore, the algorithm uses the watchdog technique with backtracking (Chamberlain et al., 1982). Therefore, no termination criteria are implemented that are based on the values ( x or f ) of successive iterations. In addition to the criteria used by all optimization techniques, only three more termination criteria are currently available, and they are based on the Lagrange function

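\[ L(x, \lambda) = f(x) - \sum_{i=1}^{m} \lambda_i c_i(x) \]

and its gradient

\[ \nabla_x L(x, \lambda) = g(x) - \sum_{i=1}^{m} \lambda_i \nabla_x c_i(x) \]

Here, m denotes the total number of constraints, $c_i(x)$ denotes the ith constraint function, $g = g(x)$ denotes the gradient of the objective function, and $\lambda$ denotes the m vector of Lagrange multipliers. The Kuhn-Tucker conditions require that the gradient of the Lagrange function is zero at the optimal point $(x^*, \lambda^*)$:

\[ \nabla_x L(x^*, \lambda^*) = 0 \]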

The termination criteria available for nonlinearly constrained optimization follow.

ABSGCONV | ABSGTOL=r <n>

  • specifies that termination requires the maximum absolute gradient element of the Lagrange function to be small,

    \[ \max_j |\{\nabla_x L(x^{(k)}, \lambda^{(k)})\}_j| \le r \]

    The default value is r = 1E-5. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

FCONV2 | FTOL2=r <n>

  • specifies that termination requires the predicted objective function reduction to be small:

    \[ |g(x^{(k)})\, s(x^{(k)})| + \sum_{i=1}^{m} |\lambda_i c_i(x^{(k)})| \le r \]

    The default value is r = 1E-6. This is the criterion used by the programs VMCWD and VF02AD (Powell 1982b). The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

GCONV | GTOL=r <n>

  • specifies that termination requires the normalized predicted objective function reduction to be small:

    \[ \frac{|g(x^{(k)})\, s(x^{(k)})|}{\max(|f(x^{(k)})|,\ \mathrm{FSIZE})} \le r \]

    where FSIZE is defined by the FSIZE= option. The default value is r = 1E-8. The optional integer value n determines the number of successive iterations for which the criterion must be satisfied before the process can be terminated.

Miscellaneous Options

Options for the Approximate Covariance Matrix of Parameter Estimates

You can specify the following options to modify the approximate covariance matrix of parameter estimates.

CFACTOR= r

  • specifies the scalar factor for the covariance matrix of parameter estimates. The scalar r ≥ 0 replaces the default value c/NM. For more details, see the section Approximate Standard Errors on page 648.

NOHLF

  • specifies that the Hessian matrix of the objective function (rather than the Hessian matrix of the Lagrange function) is used for computing the approximate covariance matrix of parameter estimates and, therefore, the approximate standard errors.

    It is theoretically not correct to use the NOHLF option. However, since most implementations use the Hessian matrix of the objective function and not the Hessian matrix of the Lagrange function for computing approximate standard errors, the NOHLF option can be used to compare the results.

Options for Additional Displayed Output

You can specify the following options to obtain additional displayed output.

PALL | ALL

  • displays information on the starting values and final values of the optimization process.

PCRPJAC | PJTJ

  • displays the approximate Hessian matrix. If general linear or nonlinear constraints are active at the solution, the projected approximate Hessian matrix is also displayed.

PHESSIAN | PHES

  • displays the Hessian matrix. If general linear or nonlinear constraints are active at the solution, the projected Hessian matrix is also displayed.

PHISTORY | PHIS

  • displays the optimization history. The PHISTORY option is set automatically if the PALL or PRINT option is set.

PINIT | PIN

  • displays the initial values and derivatives (if available). The PINIT option is set automatically if the PALL option is set.

PNLCJAC

  • displays the Jacobian matrix of nonlinear constraints specified by the NLINCON statement. The PNLCJAC option is set automatically if the PALL option is set.

PRINT | PRI

  • displays the results of the optimization process, such as parameter estimates and constraints.

More Options for Optimization Techniques

You can specify the following options, in addition to the options already listed, to fine-tune the optimization process. These options should not be necessary in most applications of PROC CALIS.

DAMPSTEP | DS <=r>

  • specifies that the initial step-size value $\alpha^{(0)}$ for each line search (used by the QUANEW, CONGRA, or NEWRAP techniques) cannot be larger than r times the step-size value used in the former iteration. If the factor r is not specified, the default value is r = 2. The DAMPSTEP option can prevent the line-search algorithm from repeatedly stepping into regions where some objective functions are difficult to compute or where they can lead to floating point overflows during the computation of objective functions and their derivatives. The DAMPSTEP <=r> option can prevent time-costly function calls during line searches with very small step sizes $\alpha$ of objective functions. For more information on setting the start values of each line search, see the section Restricting the Step Length on page 672.

HESCAL | HS = 0 | 1 | 2 | 3

  • specifies the scaling version of the Hessian or crossproduct Jacobian matrix used in NRRIDG, TRUREG, LEVMAR, NEWRAP, or DBLDOG optimization. If HS is not equal to zero, the first iteration and each restart iteration sets the diagonal scaling matrix $D^{(0)} = \mathrm{diag}(d_i^{(0)})$:

    \[ d_i^{(0)} = \sqrt{\max(|G_{i,i}^{(0)}|,\ \epsilon)} \]

    where $G_{i,i}^{(0)}$ are the diagonal elements of the Hessian or crossproduct Jacobian matrix. In every other iteration, the diagonal scaling matrix $D^{(0)} = \mathrm{diag}(d_i^{(0)})$ is updated depending on the HS option:

    HS=0

    specifies that no scaling is done.

    HS=1

    specifies the Moré (1978) scaling update:

    \[ d_i^{(k+1)} = \max\!\left( d_i^{(k)},\ \sqrt{\max(|G_{i,i}^{(k)}|,\ \epsilon)} \right) \]

    HS=2

    specifies the Dennis, Gay, and Welsch (1981) scaling update:

    \[ d_i^{(k+1)} = \max\!\left( 0.6\, d_i^{(k)},\ \sqrt{\max(|G_{i,i}^{(k)}|,\ \epsilon)} \right) \]

    HS=3

    specifies that $d_i$ is reset in each iteration:

    \[ d_i^{(k+1)} = \sqrt{\max(|G_{i,i}^{(k)}|,\ \epsilon)} \]

    In the preceding equations, $\epsilon$ is the relative machine precision. The default is HS=1 for LEVMAR minimization and HS=0 otherwise. Scaling of the Hessian or crossproduct Jacobian can be time-consuming in the case where general linear constraints are active.

LCDEACT | LCD=r

  • specifies a threshold r for the Lagrange multiplier that decides whether an active inequality constraint remains active or can be deactivated. For maximization, r must be greater than zero; for minimization, r must be smaller than zero. The default is

    \[ r = \pm \min\!\left( 0.01,\ \max\!\left( 0.1 \times \mathrm{ABSGCONV},\ 0.001 \times gmax^{(k)} \right) \right) \]

    where + stands for maximization, − stands for minimization, ABSGCONV is the value of the absolute gradient criterion, and $gmax^{(k)}$ is the maximum absolute element of the (projected) gradient $g^{(k)}$ or $Z^{\prime} g^{(k)}$.

LCEPSILON | LCEPS | LCE=r

  • specifies the range r, r ≥ 0, for active and violated boundary and linear constraints. If the point $x^{(k)}$ satisfies the condition

    \[ \left| \sum_{j=1}^{n} a_{ij} x_j^{(k)} - b_i \right| \le r \times (|b_i| + 1) \]

    the constraint i is recognized as an active constraint. Otherwise, the constraint i is either an inactive inequality or a violated inequality or equality constraint. The default value is r = 1E-8. During the optimization process, the introduction of rounding errors can force PROC CALIS to increase the value of r by factors of 10. If this happens, it is indicated by a message displayed in the log.

LCSINGULAR | LCSING | LCS=r

  • specifies a criterion r, r ≥ 0, used in the update of the QR decomposition that decides whether an active constraint is linearly dependent on a set of other active constraints. The default is r = 1E-8. The larger r becomes, the more the active constraints are recognized as being linearly dependent.

NOEIGNUM

  • suppresses the computation and displayed output of the determinant and the inertia of the Hessian, crossproduct Jacobian, and covariance matrices. The inertia of a symmetric matrix consists of the numbers of its negative, positive, and zero eigenvalues. For large applications, the NOEIGNUM option can save computer time.

RESTART | REST=i

  • specifies that the QUANEW or CONGRA algorithm is restarted with a steepest descent/ascent search direction after at most i iterations, i > 0. Default values are as follows:

    • CONGRA: UPDATE=PB: restart is done automatically, so specification of i is not used.

    • CONGRA: UPDATE ≠ PB: i = min(10n, 80), where n is the number of parameters.

    • QUANEW: i is the largest integer available.

VERSION | VS = 1 | 2

  • specifies the version of the quasi-Newton optimization technique with nonlinear constraints.

    VS=1

    specifies the update of the µ vector as in Powell (1978a, 1978b) (update like VF02AD).

    VS=2

    specifies the update of the µ vector as in Powell (1982a, 1982b) (update like VMCWD).

    The default is VS=2.

PARAMETERS Statement

  • PARAMETERS | PARMS parameter(s) << = > number(s) > << , > parameter(s) << = > number(s) > ... > ;

The PARAMETERS statement defines additional parameters that are not elements of a model matrix to use in your own program statements. You can specify more than one PARAMETERS statement with each PROC CALIS statement. The parameters can be followed by an equal sign and a number list. The values of the number list are assigned as initial values to the preceding parameters in the parameter list. For example, each of the following statements assigns the initial values ALFA=.5 and BETA=-.5 for the parameters used in program statements:

  parameters alfa beta=.5 -.5;
  parameters alfa beta (.5 -.5);
  parameters alfa beta .5 -.5;
  parameters alfa=.5 beta (-.5);

The number of parameters and the number of values do not have to match. When there are fewer values than parameter names, either the RANDOM= or START= option is used. When there are more values than parameter names, the extra values are dropped. Parameters listed in the PARAMETERS statement can be assigned initial values by program statements or by the START= or RANDOM= option in the PROC CALIS statement.
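As a sketch of how PARAMETERS and program statements can work together, the following run expresses three loadings as linear functions of two additional parameters. The data set work.demo, its variables v1-v4, the names alfa and beta, and the linear pattern itself are all hypothetical choices for illustration.

```sas
/* Assumes a hypothetical data set work.demo with numeric variables v1-v4. */
proc calis data=work.demo cov;
   lineqs
      v1 = b1 f1 + e1,
      v2 = b2 f1 + e2,
      v3 = b3 f1 + e3,
      v4 = b4 f1 + e4;
   std
      f1    = 1.,
      e1-e4 = ve1-ve4;
   parameters alfa beta = .5 .1;   /* extra parameters with initial values   */
   b2 = alfa + beta;               /* program statements: b2-b4 now depend   */
   b3 = alfa + 2 * beta;           /* on alfa and beta instead of being      */
   b4 = alfa + 3 * beta;           /* estimated freely                       */
run;
```

Here b1 remains a free parameter, while b2 through b4 are determined by the two parameters named in the PARAMETERS statement.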

Caution: The OUTRAM= and INRAM= data sets do not contain any information about the PARAMETERS statement or additional program statements.

STRUCTEQ Statement

  • STRUCTEQ variable < variable ...> ;

The STRUCTEQ statement is used to list the dependent variables of the structural equations. This statement is ignored if you omit the PDETERM option. This statement is useful because the term structural equation is not defined in a unique way, and PROC CALIS has difficulty identifying the structural equations.

If LINEQS statements are used, the names of the left-hand-side (dependent) variables of those equations to be treated as structural equations should be listed in the STRUCTEQ statement.

If the RAM statement is used, variable names in the STRUCTEQ statements depend on the VARNAMES statement:

  • If the VARNAMES statement is used, variable names must correspond to those in the VARNAMES statement.

  • If the VARNAMES statement is not used, variable names must correspond to the names of manifest variables or latent (F) variables.

The STRUCTEQ statement also defines the names of variables used in the causal coefficient matrix of the structural equations, B, for computing the Stability Coefficient of Reciprocal Causation (the largest eigenvalue of the BB′ matrix). If the PROC CALIS option PDETERM is used without the STRUCTEQ statement, the structural equations are defined as described in the PDETERM option. See the PROC CALIS option PDETERM on page 585 for more details.
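A minimal sketch of STRUCTEQ with a LINEQS model follows. It assumes a hypothetical data set work.panel with numeric variables v1-v4, two latent factors with two indicators each, and one structural equation in which F2 depends on F1; all parameter names are illustrative.

```sas
/* Assumes a hypothetical data set work.panel with numeric variables v1-v4. */
proc calis data=work.panel cov pdeterm;   /* PDETERM is required for STRUCTEQ to matter */
   lineqs
      v1 =      f1 + e1,          /* measurement equations */
      v2 = lam2 f1 + e2,
      v3 =      f2 + e3,
      v4 = lam4 f2 + e4,
      f2 = beta f1 + d2;          /* the one structural equation */
   std
      f1    = phi,
      e1-e4 = the1-the4,
      d2    = psi;
   structeq f2;                   /* list only the structural dependent variable */
run;
```

Listing F2 alone tells the procedure to treat only the last equation as structural, rather than leaving that decision to the PDETERM defaults.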

VARNAMES Statement

  • VARNAMES | VNAMES assignment < , assignment ...> ;

    where assignment represents

    • matrix-id variable-names or matrix-name = matrix-name

Use the VARNAMES statement in connection with the RAM, COSAN, or FACTOR model statement to allocate names to latent variables including error and disturbance terms. This statement is not needed if you are using the LINEQS statement.

In connection with the RAM model statement, the matrix-id must be specified by the integer number as it is used in the RAM list input (1 for matrix A , 2 for matrix P ). Because the first variables of matrix A correspond to the manifest variables in the input data set, you can specify names only for the latent variables following the manifest variables in the rows of A . For example, in the RAM notation of the alienation example, you can specify the latent variables by names F1, F2, F3 and the error variables by names E1, ... , E6, D1, D2, D3 with the following statement:

  vnames 1 F1-F3,
         2 E1-E6 D1-D3;

If the RAM model statement is not accompanied by a VNAMES statement, default variable names are assigned using the prefixes F, E, and D with numerical suffixes: latent variables are F1, F2, ... , and error variables are E1, E2, ... .

The matrix-id must be specified by its name when used with the COSAN or FACTOR statement. The variable-names following the matrix name correspond to the columns of this matrix. The variable names corresponding to the rows of this matrix are set automatically by

  • the names of the manifest variables for the first matrix in each term

  • the column variable names of the same matrix for the central symmetric matrix in each term

  • the column variable names of the preceding matrix for each other matrix

You can also use the second kind of name assignment in connection with a COSAN statement. Two matrix names separated by an equal sign allocate the column names of one matrix to the column names of the other matrix. This assignment assumes that the column names of at least one of the two matrices are already allocated. For example, in the COSAN notation of the alienation example, you can specify the variable names by using the following statement to allocate names to the columns of J, A, and P:

  vnames J  V1-V6 F1-F3 ,   A =J ,   P  E1-E6 D1-D3 ;  

BY Statement

  • BY variables ;

You can specify a BY statement with PROC CALIS to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.

If your input data set is not sorted in ascending order, use one of the following alternatives:

  • Sort the data using the SORT procedure with a similar BY statement.

  • Specify the BY statement option NOTSORTED or DESCENDING in the BY statement for the CALIS procedure. The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.

  • Create an index on the BY variables using the DATASETS procedure.

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.
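The first two alternatives can be sketched as follows. This is only an illustration under stated assumptions: the data set SCORES, the BY variable REGION, and the manifest variables Y1-Y3 are hypothetical names, and the one-factor LINEQS model is a placeholder so that the step is complete.

```sas
/* Hypothetical data set SCORES with BY variable REGION and    */
/* manifest variables Y1-Y3; none of these names come from the */
/* guide.                                                      */
proc sort data=scores out=scores_sorted;
   by region;
run;

proc calis data=scores_sorted cov;
   by region;                 /* one separate analysis per value of REGION */
   lineqs
      y1 =      F1 + E1,      /* loading fixed at 1 to set the factor scale */
      y2 = lam2 F1 + E2,
      y3 = lam3 F1 + E3;
   std
      F1    = phi,
      E1-E3 = the1-the3;
run;
```

If the observations are already arranged in groups but the groups are not in sorted order, the PROC SORT step can be skipped and the BY statement written as `by region notsorted;` instead.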

VAR Statement

  • VAR variables ;

The VAR statement lists the numeric variables to be analyzed. If the VAR statement is omitted, all numeric variables not mentioned in other statements are analyzed. You can use the VAR statement to ensure that the manifest variables appear in correct order for use in the RAM statement. Only one VAR statement can be used with each PROC CALIS statement. If you do not use all manifest variables when you specify the model with a RAM or LINEQS statement, PROC CALIS does automatic variable selection. For more information, see the section Automatic Variable Selection on page 662.

PARTIAL Statement

  • PARTIAL variables ;

If you want the analysis to be based on a partial correlation or covariance matrix, use the PARTIAL statement to list the variables used to partial out the variables in the analysis. You can specify only one PARTIAL statement with each PROC CALIS statement.

FREQ Statement

  • FREQ variable ;

If one variable in your data set represents the frequency of occurrence for the other values in the observation, specify the variable's name in a FREQ statement. PROC CALIS then treats the data set as if each observation appears n_i times, where n_i is the value of the FREQ variable for observation i. Only the integer portion of the value is used. If the value of the FREQ variable is less than 1 or is missing, that observation is not included in the analysis. The total number of observations is considered to be the sum of the FREQ values when the procedure computes significance probabilities. You can use only one FREQ statement with each PROC CALIS statement.
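A minimal sketch of the FREQ statement follows. It assumes a hypothetical data set GROUPED in which the variable COUNT records how many times each row occurred; the variable names and the one-factor model are placeholders, not part of the guide.

```sas
/* GROUPED, COUNT, and Y1-Y3 are assumed names for illustration. */
proc calis data=grouped cov;
   freq count;     /* each row is treated as COUNT identical observations; */
                   /* a value of 2.7 is used as 2, and rows with COUNT     */
                   /* below 1 or missing are excluded                      */
   var y1-y3;      /* COUNT itself is not analyzed                         */
   lineqs
      y1 =      F1 + E1,
      y2 = lam2 F1 + E2,
      y3 = lam3 F1 + E3;
   std
      F1    = phi,
      E1-E3 = the1-the3;
run;
```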

WEIGHT Statement

  • WEIGHT variable ;

To compute weighted covariances or correlations, specify the name of the weighting variable in a WEIGHT statement. This is often done when the error variance associated with each observation is different and the values of the weight variable are proportional to the reciprocals of the variances. You can use only one WEIGHT statement with each PROC CALIS statement. The WEIGHT and FREQ statements have a similar effect, except the WEIGHT statement does not alter the number of observations unless VARDEF=WGT or VARDEF=WDF. An observation is used in the analysis only if the WEIGHT variable is greater than 0 and is not missing.

SAS Program Statements

This section lists the program statements used to express the linear and nonlinear constraints on the parameters and documents the differences between program statements in PROC CALIS and program statements in the DATA step. The very different use of the ARRAY statement by PROC CALIS is also discussed. Most of the program statements that can be used in the SAS DATA step also can be used in PROC CALIS. Refer to SAS Language Reference: Dictionary for a description of the SAS program statements. You can specify the following SAS program statements to compute parameter constraints with the CALIS procedure:

  • ABORT ;

  • CALL name < ( expression < , expression ...>)> ;

  • DELETE;

  • DO < variable = expression < TO expression> < BY expression>

    • < , expression < TO expression> < BY expression> ...>>

    • < WHILE expression>

    • < UNTIL expression> ;

  • END;

  • GOTO statement-label ;

  • IF expression ;

  • IF expression THEN program-statement ;

    • ELSE program-statement ;

  • variable = expression ;

  • variable+expression ;

  • LINK statement-label ;

  • PUT <variable> <=> < ...> ;

  • RETURN ;

  • SELECT < ( expression ) > ;

  • STOP;

  • SUBSTR ( variable, index, length ) = expression ;

  • WHEN (expression) program-statement ;

    • OTHERWISE program-statement ;

For the most part, the SAS program statements work the same as they do in the SAS DATA step as documented in SAS Language Reference: Concepts. However, there are several differences that should be noted.

  • The ABORT statement does not allow any arguments.

  • The DO statement does not allow a character index variable. Thus,

      do I=1,2,3;  

    is supported; however,

      do I='A','B','C';  

    is not valid in PROC CALIS, although it is supported in the DATA step.

  • The PUT statement, used mostly for program debugging in PROC CALIS, supports only some of the features of the DATA step PUT statement, and it has some new features that the DATA step PUT statement does not have:

    • The CALIS procedure PUT statement does not support line pointers, factored lists, iteration factors, overprinting, _INFILE_, the colon (:) format modifier, or $.

    • The CALIS procedure PUT statement does support expressions enclosed in parentheses. For example, the following statement displays the square root of x:

        put (sqrt(x));  
    • The CALIS procedure PUT statement supports the print item _PDV_ to display a formatted listing of all variables in the program. For example, the following statement displays a much more readable listing of the variables than the _ALL_ print item:

        put _pdv_ ;  
  • The WHEN and OTHERWISE statements allow more than one target statement. That is, DO/END groups are not necessary for multiple WHEN statements. For example, the following syntax is valid:

      select;
        when ( expression1 ) statement1;
                             statement2;
        when ( expression2 ) statement3;
                             statement4;
      end;

You can specify one or more PARMS statements to define parameters used in the program statements that are not defined in the model matrices (MATRIX, RAM, LINEQS, STD, or COV statement).

Parameters that are used only on the right-hand side of your program statements are called independent, and parameters that are used at least once on the left-hand side of an equation in the program code are called dependent parameters. The dependent parameters are used only indirectly in the minimization process. They should be fully defined as functions of the independent parameters. The independent parameters are included in the set X of parameters used in the minimization. Be sure that all independent parameters used in your program statements are somehow connected to elements of the model matrices. Otherwise the minimization function does not depend on those independent parameters, and the parameters vary without control (since the corresponding derivative is the constant 0). You also can specify the PARMS statement to set the initial values of all independent parameters used in the program statements that are not defined as elements of model matrices.
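The distinction between independent and dependent parameters can be sketched as follows. The data set SCORES, the variables Y1-Y3, and the parameter names are assumptions made for illustration; the constraint itself is arbitrary.

```sas
/* Hypothetical data set SCORES with manifest variables Y1-Y3. */
proc calis data=scores cov;
   lineqs
      y1 =      F1 + E1,
      y2 = lam2 F1 + E2,
      y3 = lam3 F1 + E3;
   std
      F1    = phi,
      E1-E3 = the1-the3;
   parameters alpha = 0.5;   /* independent parameter, initial value 0.5  */
   lam3 = alpha * lam2;      /* lam3 is on the left-hand side, so it is a */
                             /* dependent parameter                       */
run;
```

In this sketch ALPHA and LAM2 appear only on the right-hand side, so they belong to the set of parameters used in the minimization. ALPHA is not an element of any model matrix, but it is connected to the model through LAM3, so the fit function depends on it.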

ARRAY Statement

  • ARRAY arrayname <(dimensions)>< $ ><variables and constants> ;

The ARRAY statement is similar to, but not the same as, the ARRAY statement in the DATA step. The ARRAY statement is used to associate a name with a list of variables and constants. The array name can then be used with subscripts in the program to refer to the items in the list.

The ARRAY statement supported by PROC CALIS does not support all the features of the DATA step ARRAY statement. With PROC CALIS, the ARRAY statement cannot be used to give initial values to array elements. Implicit indexing variables cannot be used; all array references must have explicit subscript expressions. Only exact array dimensions are allowed; lower-bound specifications are not supported. A maximum of six dimensions is allowed.

On the other hand, the ARRAY statement supported by PROC CALIS does allow both variables and constants to be used as array elements. Constant array elements cannot be changed. Both the dimension specification and the list of elements are optional, but at least one must be given. When the list of elements is not given or fewer elements than the size of the array are listed, array variables are created by suffixing element numbers to the array name to complete the element list.
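As a rough sketch of explicit subscripting, the following step groups two loadings and two auxiliary parameters into arrays and fills the loadings in a DO loop. All names are hypothetical, the parenthesized dimension follows the syntax line above, and the squared form (which keeps the loadings nonnegative) is just an example constraint.

```sas
/* Hypothetical data set SCORES with manifest variables Y1-Y3. */
proc calis data=scores cov;
   lineqs
      y1 =      F1 + E1,
      y2 = lam2 F1 + E2,
      y3 = lam3 F1 + E3;
   std
      F1    = phi,
      E1-E3 = the1-the3;
   parameters a1 = 0.5, a2 = 0.5;   /* independent parameters            */
   array lam(2) lam2 lam3;          /* exact dimension, explicit list    */
   array a(2)   a1 a2;
   do i = 1 to 2;                   /* numeric index variable only       */
      lam(i) = a(i) * a(i);         /* every reference has a subscript   */
   end;
run;
```

Initial values are supplied through the PARAMETERS statement rather than the ARRAY statement, since the ARRAY statement in PROC CALIS cannot initialize elements.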




SAS/STAT 9.1 User's Guide, Volumes 1-7
ISBN: 1590472438
Year: 2004