The following statements can be used with the NLMIXED procedure:
PROC NLMIXED options ;
ARRAY array specification ;
BOUNDS boundary constraints ;
BY variables ;
CONTRAST label expression <,expression> ;
ESTIMATE label expression ;
ID names ;
MODEL model specification ;
PARMS parameters and starting values ;
PREDICT expression ;
RANDOM random effects specification ;
REPLICATE variable ;
Program statements ;
The following sections provide a detailed description of each of these statements.
PROC NLMIXED options ;
This statement invokes the NLMIXED procedure. A large number of options are available in the PROC NLMIXED statement, and the following table categorizes them according to function.
Option | Description |
---|---|
Basic Options | |
DATA= | input data set |
METHOD= | integration method |
Displayed Output Specifications | |
START | gradient at starting values |
HESS | Hessian matrix |
ITDETAILS | iteration details |
CORR | correlation matrix |
COV | covariance matrix |
ECORR | corr matrix of additional estimates |
ECOV | cov matrix of additional estimates |
EDER | derivatives of additional estimates |
ALPHA= | alpha for confidence limits |
DF= | degrees of freedom for p values and confidence limits |
Debugging Output | |
LIST | model program, variables |
LISTCODE | compiled model program |
LISTDEP | model dependency listing |
LISTDER | model derivative |
XREF | model cross reference |
FLOW | model execution messages |
TRACE | detailed model execution messages |
Quadrature Options | |
NOAD | no adaptive centering |
NOADSCALE | no adaptive scaling |
OUTQ= | output data set |
QFAC= | search factor |
QMAX= | maximum points |
QPOINTS= | number of points |
QSCALEFAC= | scale factor |
QTOL= | tolerance |
Empirical Bayes Options | |
EBSTEPS= | number of Newton steps |
EBSUBSTEPS= | number of substeps |
EBSSFRAC= | step-shortening fraction |
EBSSTOL= | step-shortening tolerance |
EBTOL= | convergence tolerance |
EBOPT | comprehensive optimization |
EBZSTART | zero starting values |
Optimization Specifications | |
TECHNIQUE= | minimization technique |
UPDATE= | update technique |
LINESEARCH= | line-search method |
LSPRECISION= | line-search precision |
HESCAL= | type of Hessian scaling |
INHESSIAN<=> | start for approximated Hessian |
RESTART= | iteration number for update restart |
OPTCHECK<=> | check optimality in neighborhood |
Derivatives Specifications | |
FD<=> | finite-difference derivatives |
FDHESSIAN<=> | finite-difference second derivatives |
DIAHES | use only diagonal of Hessian |
Constraint Specifications | |
LCEPSILON= | range for active constraints |
LCDEACT= | LM tolerance for deactivating |
LCSINGULAR= | tolerance for dependent constraints |
Termination Criteria Specifications | |
MAXFUNC= | maximum number of function calls |
MAXITER= | maximum number of iterations |
MINITER= | minimum number of iterations |
MAXTIME= | upper limit seconds of CPU time |
ABSCONV= | absolute function convergence criterion |
ABSFCONV= | absolute function convergence criterion |
ABSGCONV= | absolute gradient convergence criterion |
ABSXCONV= | absolute parameter convergence criterion |
FCONV= | relative function convergence criterion |
FCONV2= | relative function convergence criterion |
GCONV= | relative gradient convergence criterion |
XCONV= | relative parameter convergence criterion |
FDIGITS= | number accurate digits in objective function |
FSIZE= | used in FCONV, GCONV criterion |
XSIZE= | used in XCONV criterion |
Step Length Specifications | |
DAMPSTEP<=> | damped steps in line search |
MAXSTEP= | maximum trust-region radius |
INSTEP= | initial trust-region radius |
Singularity Tolerances | |
SINGCHOL= | tolerance for Cholesky roots |
SINGHESS= | tolerance for Hessian |
SINGSWEEP= | tolerance for sweep |
SINGVAR= | tolerance for variances |
Covariance Matrix Tolerances | |
ASINGULAR= | absolute singularity for inertia |
MSINGULAR= | relative M singularity for inertia |
VSINGULAR= | relative V singularity for inertia |
G4= | threshold for Moore-Penrose inverse |
COVSING= | tolerance for singular COV matrix |
CFACTOR= | multiplication factor for COV matrix |
These options are described in alphabetical order. For a description of the mathematical notation used in the following sections, see the section Modeling Assumptions and Notation.
ABSCONV= r
ABSTOL= r
specifies an absolute function convergence criterion. For minimization, termination requires f ( ( k ) ) ‰ r. The default value of r is the negative square root of the largest double precision value, which serves only as a protection against overflows.
ABSFCONV= r < [ n ] >
ABSFTOL= r < [ n ] >
specifies an absolute function convergence criterion. For all techniques except NMSIMP, termination requires a small change of the function value in successive iterations:
The same formula is used for the NMSIMP technique, but ( k ) is defined as the vertex with the lowest function value, and ( k ˆ’ 1) is defined as the vertex with the highest function value in the simplex. The default value is r = 0. The optional integer value n specifies the number of successive iterations for which the criterion must be satisfied before the process can be terminated .
ABSGCONV= r < [ n ] >
ABSGTOL= r < [ n ] >
specifies an absolute gradient convergence criterion. Termination requires the maximum absolute gradient element to be small:
This criterion is not used by the NMSIMP technique. The default value is r = 1E ˆ’ 5. The optional integer value n specifies the number of successive iterations for which the criterion must be satisfied before the process can be terminated.
ABSXCONV= r < [ n ] >
ABSXTOL= r < [ n ] >
specifies an absolute parameter convergence criterion. For all techniques except NMSIMP, termination requires a small Euclidean distance between successive parameter vectors,
For the NMSIMP technique, termination requires either a small length ± ( k ) of the vertices of a restart simplex,
or a small simplex size ,
where the simplex size ( k ) is defined as the L1 distance from the simplex vertex ¾ ( k ) with the smallest function value to the other n simplex points :
The default is r = 1E ˆ’ 8 for the NMSIMP technique and r = 0 otherwise . The optional integer value n specifies the number of successive iterations for which the criterion must be satisfied before the process can terminate.
ALPHA= ±
specifies the alpha level to be used in computing confidence limits. The default value is 0.05.
ASINGULAR= r
ASING= r
specifies an absolute singularity criterion for the computation of the inertia (number of positive, negative, and zero eigenvalues) of the Hessian and its projected forms. The default value is the square root of the smallest positive double precision value.
CFACTOR= f
specifies a multiplication factor f for the estimated covariance matrix of the parameter estimates.
COV
requests the approximate covariance matrix for the parameter estimates.
CORR
requests the approximate correlation matrix for the parameter estimates.
COVSING= r >
specifies a nonnegative threshold that determines whether the eigenvalues of a singular Hessian matrix are considered to be zero.
DAMPSTEP < = r >
DS < = r >
specifies that the initial step-size value ± (0) for each line search (used by the QUANEW, CONGRA, or NEWRAP technique) cannot be larger than r times the step-size value used in the former iteration. If you specify the DAMPSTEP option without factor r , the default value is r = 2. The DAMPSTEP= r option can prevent the line-search algorithm from repeatedly stepping into regions where some objective functions are difficult to compute or where they could lead to floating point overflows during the computation of objective functions and their derivatives. The DAMPSTEP= r option can save time-costly function calls that result in very small step sizes ± . For more details on setting the start values of each line search, see the section Restricting the Step Length beginning on page 3096.
DATA = SAS-data-set
specifies the input data set. Observations in this data set are used to compute the log likelihood function that you specify with PROC NLMIXED statements.
NOTE: If you are using a RANDOM statement, the input data set must be clustered according to the SUBJECT= variable. One easy way to accomplish this is to sort your data by the SUBJECT= variable prior to calling PROC NLMIXED. PROC NLMIXED does not sort the input data set for you.
DF= d
specifies the degrees of freedom to be used in computing p values and confidence limits. The default value is the number of subjects minus the number of random effects for random effects models, and the number of observations otherwise.
DIAHES
specifies that only the diagonal of the Hessian is used.
EBOPT
requests that a more comprehensive optimization be carried out if the default empirical Bayes optimization fails to converge.
EBSSFRAC= r >
specifies the step-shortening fraction to be used while computing empirical Bayes estimates of the random effects. The default value is 0.8.
EBSSTOL= r ‰
specifies the objective function tolerance for determining the cessation of step-shortening while computing empirical Bayes estimates of the random effects. The default value is r = 1E ˆ’ 8.
EBSTEPS= n ‰
specifies the maximum number of Newton steps for computing empirical Bayes estimates of random effects. The default value is n = 50.
EBSUBSTEPS= n ‰
specifies the maximum number of step-shortenings for computing empirical Bayes estimates of random effects. The default value is n = 20.
EBTOL= r ‰
specifies the convergence tolerance for empirical Bayes estimation. The default value is r = ˆˆ E4, where ˆˆ is the machine precision. This default value equals approximately 1E ˆ’ 12 on most machines.
EBZSTART
requests that a zero be used as starting values during empirical Bayes estimation. By default, the starting values are set equal to the estimates from the previous iteration (or zero for the first iteration).
ECOV
requests the approximate covariance matrix for all expressions specified in ESTIMATE statements.
ECORR
requests the approximate correlation matrix for all expressions specified in ESTIMATE statements.
EDER
requests the derivatives of all expressions specified in ESTIMATE statements with respect to each of the model parameters.
FCONV= r < [ n ] >
FTOL= r < [ n ] >
specifies a relative function convergence criterion. For all techniques except NMSIMP, termination requires a small relative change of the function value in successive iterations,
where FSIZE is defined by the FSIZE= option. The same formula is used for the NMSIMP technique, but ( k ) is defined as the vertex with the lowest function value, and ( k ˆ’ 1) is defined as the vertex with the highest function value in the simplex. The default is r= 10 ˆ’ FDIGITS where FDIGITS is the value of the FDIGITS= option. The optional integer value n specifies the number of successive iterations for which the criterion must be satisfied before the process can terminate.
FCONV2= r < [ n ] >
FTOL2= r < [ n ] >
specifies another function convergence criterion. For all techniques except NMSIMP, termination requires a small predicted reduction
of the objective function. The predicted reduction
is computed by approximating the objective function f by the first two terms of the Taylor series and substituting the Newton step.
For the NMSIMP technique, termination requires a small standard deviation of the function values of the n + 1 simplex vertices , l = 0, , n ,
where . If there are n act boundary constraints active at ( k ) , the mean and standard deviation are computed only for the n + 1 ˆ’ n act unconstrained vertices. The default value is r = 1E ˆ’ 6 for the NMSIMP technique and r = 0 otherwise. The optional integer value n specifies the number of successive iterations for which the criterion must be satisfied before the process can terminate.
FD < = FORWARD CENTRAL r >
specifies that all derivatives be computed using finite difference approximations. The following specifications are permitted:
FD | is equivalent to FD=100. |
FD=CENTRAL | uses central differences. |
FD=FORWARD | uses forward differences. |
FD= r | uses central differences for the initial and final evaluations of the gradient, and Hessian. During iteration, start with forward differences and switch to a corresponding central-difference formula during the iteration process when one of the following two criteria is satisfied:
|
Note that the FD and FDHESSIAN options cannot apply at the same time. The FDHESSIAN option is ignored when only first-order derivatives are used. See the section Finite Difference Approximations of Derivatives beginning on page 3091 for more information.
FDHESSIAN < = FORWARD CENTRAL >
FDHES < = FORWARD CENTRAL >
FDH < = FORWARD CENTRAL >
specifies that second-order derivatives be computed using finite difference approximations based on evaluations of the gradients.
FDHESSIAN=FORWARD | uses forward differences. |
FDHESSIAN=CENTRAL | uses central differences. |
FDHESSIAN | uses forward differences for the Hessian except for the initial and final output. |
Note that the FD and FDHESSIAN options cannot apply at the same time. See the section Finite Difference Approximations of Derivatives beginning on page 3091 for more information.
FDIGITS= r
specifies the number of accurate digits in evaluations of the objective function. Fractional values such as FDIGITS=4.7 are allowed. The default value is r = ˆ’ log 10 ˆˆ , where ˆˆ is the machine precision. The value of r is used to compute the interval size h for the computation of finite-difference approximations of the derivatives of the objective function and for the default value of the FCONV= option.
FLOW
displays a message for each statement in the model program as it is executed. This debugging option is very rarely needed and produces voluminous output.
FSIZE= r
specifies the FSIZE parameter of the relative function and relative gradient termination criteria. The default value is r = 0. For more details, see the FCONV= and GCONV= options.
G4= n >
specifies a dimension to determine the type of generalized inverse to use when the approximate covariance matrix of the parameter estimates is singular. The default value of n is 60. See the section Covariance Matrix beginning on page 3101 for more information.
GCONV= r < [ n ] >
GTOL= r < [ n ] >
specifies a relative gradient convergence criterion. For all techniques except CONGRA and NMSIMP, termination requires that the normalized predicted function reduction is small,
where FSIZE is defined by the FSIZE= option. For the CONGRA technique (where a reliable Hessian estimate H is not available), the following criterion is used:
This criterion is not used by the NMSIMP technique. The default value is r = 1 E ˆ’ 8. The optional integer value n specifies the number of successive iterations for which the criterion must be satisfied before the process can terminate.
HESCAL= 0123
HS= 0123
specifies the scaling version of the Hessian matrix used in NRRIDG, TRUREG, NEWRAP, or DBLDOG optimization. If HS is not equal to 0, the first iteration and each restart iteration sets the diagonal scaling matrix :
where are the diagonal elements of the Hessian. In every other iteration, the diagonal scaling matrix is updated depending on the HS option:
HS=0 | specifies that no scaling is done. |
HS=1 | specifies the Mor (1978) scaling update:
|
HS=2 | specifies the Dennis, Gay, & Welsch (1981) scaling update:
|
HS=3 | specifies that d i is reset in each iteration:
|
In each scaling update, ˆˆ is the relative machine precision. The default value is HS=0. Scaling of the Hessian can be time consuming in the case where general linear constraints are active.
HESS
requests the display of the final Hessian matrix after optimization. If you also specify the START option, then the Hessian at the starting values is also printed.
INHESSIAN < = r >
INHESS < = r >
specifies how the initial estimate of the approximate Hessian is defined for the quasi-Newton techniques QUANEW and DBLDOG. There are two alternatives:
If you do not use the r specification, the initial estimate of the approximate Hessian is set to the Hessian at (0) .
If you do use the r specification, the initial estimate of the approximate Hessian is set to the multiple of the identity matrix rI .
By default, if you do not specify the option INHESSIAN= r , the initial estimate of the approximate Hessian is set to the multiple of the identity matrix rI , where the scalar r is computed from the magnitude of the initial gradient.
INSTEP= r
reduces the length of the first trial step during the line search of the first iterations. For highly nonlinear objective functions, such as the EXP function, the default initial radius of the trust-region algorithm TRUREG or DBLDOG or the default step length of the line-search algorithms can result in arithmetic overflows. If this occurs, you should specify decreasing values of 0 < r < 1 such as INSTEP=1E ˆ’ 1, INSTEP=1E ˆ’ 2, INSTEP=1E ˆ’ 4, and so on, until the iteration starts successfully.
For trust-region algorithms (TRUREG, DBLDOG), the INSTEP= option specifies a factor r > 0 for the initial radius ” (0) of the trust region. The default initial trust-region radius is the length of the scaled gradient. This step corresponds to the default radius factor of r = 1.
For line-search algorithms (NEWRAP, CONGRA, QUANEW), the INSTEP= option specifies an upper bound for the initial step length for the line search during the first five iterations. The default initial step length is r = 1.
For the Nelder-Mead simplex algorithm, using TECH=NMSIMP, the INSTEP= r option defines the size of the start simplex.
For more details, see the section Computational Problems beginning on page 3098.
ITDETAILS
requests a more complete iteration history, including the current values of the parameter estimates, their gradients, and additional optimization statistics. For further details, see the section Iterations beginning on page 3104.
LCDEACT= r
LCD= r
specifies a threshold r for the Lagrange multiplier that determines whether an active inequality constraint remains active or can be deactivated. During minimization, an active inequality constraint can be deactivated only if its Lagrange multiplier is less than the threshold value r < 0. The default value is
where ABSGCONV is the value of the absolute gradient criterion, and gmax ( k ) is the maximum absolute element of the (projected) gradient g ( k ) or Z T g ( k ) . (See the section Active Set Methods beginning on page 3093 for a definition of Z .)
LCEPSILON= r >
LCEPS= r >
LCE= r >
specifies the range for active and violated boundary constraints. The default value is r = 1E ˆ’ 8. During the optimization process, the introduction of rounding errors can force PROC NLMIXED to increase the value of r by a factor of 10 , 100 , . If this happens, it is indicated by a message displayed in the log.
LCSINGULAR= r >
LCSING= r >
LCS= r >
specifies a criterion r , used in the update of the QR decomposition, that determines whether an active constraint is linearly dependent on a set of other active constraints. The default value is r = 1E ˆ’ 8. The larger r becomes, the more the active constraints are recognized as being linearly dependent. If the value of r is larger than 0 . 1, it is reset to 0 . 1.
LINESEARCH= i
LIS= i
specifies the line-search method for the CONGRA, QUANEW, and NEWRAP optimization techniques. Refer to Fletcher (1987) for an introduction to line-search techniques. The value of i can be 1 ,..., 8. For CONGRA, QUANEW and NEWRAP, the default value is i = 2.
LIS=1 | specifies a line-search method that needs the same number of function and gradient calls for cubic interpolation and cubic extrapolation; this method is similar to one used by the Harwell subroutine library. |
LIS=2 | specifies a line-search method that needs more function than gradient calls for quadratic and cubic interpolation and cubic extrapolation; this method is implemented as shown in Fletcher (1987) and can be modified to an exact line search by using the LSPRECISION= option. |
LIS=3 | specifies a line-search method that needs the same number of function and gradient calls for cubic interpolation and cubic extrapolation; this method is implemented as shown in Fletcher (1987) and can be modified to an exact line search by using the LSPRECISION= option. |
LIS=4 | specifies a line-search method that needs the same number of function and gradient calls for stepwise extrapolation and cubic interpolation. |
LIS=5 | specifies a line-search method that is a modified version of LIS=4. |
LIS=6 | specifies golden section line search (Polak 1971), which uses only function values for linear approximation . |
LIS=7 | specifies bisection line search (Polak 1971), which uses only function values for linear approximation. |
LIS=8 | specifies the Armijo line-search technique (Polak 1971), which uses only function values for linear approximation. |
LIST
displays the model program and variable lists. The LIST option is a debugging feature and is not normally needed.
LISTCODE
displays the derivative tables and the compiled program code. The LISTCODE option is a debugging feature and is not normally needed.
LOGNOTE < = n >
writes periodic notes to the log describing the current status of computations . It is designed for use with analyses requiring extensive CPU resources. The optional integer value n specifies the desired level of reporting detail. The default is n = 1. Choosing n = 2 adds information about the objective function values at the end of each iteration. The most detail is obtained with n = 3, which also reports the results of function evaluations within iterations.
LSPRECISION= r
LSP= r
specifies the degree of accuracy that should be obtained by the line-search algorithms LIS=2 and LIS=3. Usually an imprecise line search is inexpensive and successful. For more difficult optimization problems, a more precise and expensive line search may be necessary (Fletcher 1987). The second line-search method (which is the default for the NEWRAP, QUANEW, and CONGRA techniques) and the third line-search method approach exact line search for small LSPRECISION= values. If you have numerical problems, you should try to decrease the LSPRECISION= value to obtain a more precise line search. The default values are shown in the following table.
TECH= | UPDATE= | LSP default |
---|---|---|
QUANEW | DBFGS, BFGS | r = 0.4 |
QUANEW | DDFP, DFP | r = 0.06 |
CONGRA | all | r = 0.1 |
NEWRAP | no update | r = 0.9 |
For more details, refer to Fletcher (1987).
MAXFUNC= i
MAXFU= i
specifies the maximum number i of function calls in the optimization process. The default values are
TRUREG, NRRIDG, NEWRAP: 125
QUANEW, DBLDOG: 500
CONGRA: 1000
NMSIMP: 3000
Note that the optimization can terminate only after completing a full iteration. Therefore, the number of function calls that is actually performed can exceed the number that is specified by the MAXFUNC= option.
MAXITER= i
MAXIT= i
specifies the maximum number i of iterations in the optimization process. The default values are
TRUREG, NRRIDG, NEWRAP: 50
QUANEW, DBLDOG: 200
CONGRA: 400
NMSIMP: 1000
These default values are also valid when i is specified as a missing value.
MAXSTEP= r < [ n ] >
specifies an upper bound for the step length of the line-search algorithms during the first n iterations. By default, r is the largest double precision value and n is the largest integer available. Setting this option can improve the speed of convergence for the CONGRA, QUANEW, and NEWRAP techniques.
MAXTIME= r
specifies an upper limit of r seconds of CPU time for the optimization process. The default value is the largest floating point double representation of your computer. Note that the time specified by the MAXTIME= option is checked only once at the end of each iteration. Therefore, the actual running time can be much longer than that specified by the MAXTIME= option. The actual running time includes the rest of the time needed to finish the iteration and the time needed to generate the output of the results.
METHOD= value
specifies the method for approximating the integral of the likelihood over the random effects. Valid values are as follows .
FIRO
specifies the first-order method of Beal and Sheiner (1982). When using METHOD=FIRO, you must specify the NORMAL distribution in the MODEL statement and you must also specify a RANDOM statement.
GAUSS
specifies adaptive Gauss-Hermite quadrature (Pinheiro and Bates 1995). You can prevent the adaptation with the NOAD option or prevent adaptive scaling with the NOADSCALE option. This is the default integration method.
HARDY
specifies Hardy quadrature based on an adaptive trapezoidal rule. This method is available only for one-dimensional integrals; that is, you must specify only one random effect.
ISAMP
specifies adaptive importance sampling (Pinheiro and Bates 1995) . You can prevent the adaptation with the NOAD option or prevent adaptive scaling with the NOADSCALE option. You can use the SEED= option to specify a starting seed for the random number generation used in the importance sampling. If you do not specify a seed, or specify a value less than or equal to zero, the seed is generated from reading the time of day from the computer clock.
MINITER= i
MINIT= i
specifies the minimum number of iterations. The default value is 0. If you request more iterations than are actually needed for convergence to a stationary point, the optimization algorithms can behave strangely. For example, the effect of rounding errors can prevent the algorithm from continuing for the required number of iterations.
MSINGULAR= r >
MSING= r >
specifies a relative singularity criterion for the computation of the inertia (number of positive, negative, and zero eigenvalues) of the Hessian and its projected forms. The default value is 1E ˆ’ 12 if you do not specify the SINGHESS= option; otherwise, the default value is max(10 ˆˆ , (1E ˆ’ 4) * SINGHESS). See the section Covariance Matrix beginning on page 3101 for more information.
NOAD
requests that the Gaussian quadrature be nonadaptive; that is, the quadrature points are centered at zero for each of the random effects and the current random-effects variance matrix is used as the scale matrix.
NOADSCALE
requests nonadaptive scaling for adaptive Gaussian quadrature; that is, the quadrature points are centered at the empirical Bayes estimates for the random effects, but the current random-effects variance matrix is used as the scale matrix. By default, the observed Hessian from the current empirical Bayes estimates is used as the scale matrix.
OPTCHECK < = r > >
computes the function values f ( l ) of a grid of points l in a ball of radius of r about *. If you specify the OPTCHECK option without factor r , the default value is r = 0.1 at the starting point and r = 0.01 at the terminating point. If a point is found with a better function value than f ( *), then optimization is restarted at .
OUTQ= SAS-data-set
specifies an output data set containing the quadrature points used for numerical integration.
QFAC = r >
specifies the additive factor used to adaptively search for the number of quadrature points. For METHOD=GAUSS, the search sequence is 1, 3, 5, 7, 9, 11, 11 + r , 11 + 2 r , ... , where the default value of r is 10. For METHOD=ISAMP, the search sequence is 10, 10 + r , 10 + 2 r , ... , where the default value of r is 50.
QMAX= r >
specifies the maximum number of quadrature points permitted before the adaptive search is aborted. The default values are 31 for adaptive Gaussian quadrature, 61 for non-adaptive Gaussian quadrature, 160 for adaptive importance sampling, and 310 for non-adaptive importance sampling.
QPOINTS= n >
specifies the number of quadrature points to be used during evaluation of integrals. For METHOD=GAUSS, n equals the number of points used in each dimension of the random effects, resulting in a total of n r points, where r is the number of dimensions. For METHOD=ISAMP, n specifies the total number of quadrature points regardless of the dimension of the random effects. By default, the number of quadrature points is selected adaptively, and this option disables the adaptive search.
QSCALEFAC= r >
specifies a multiplier for the scale matrix used during quadrature calculations. The default value is 1.0.
QTOL= r >
specifies the tolerance used to adaptively select the number of quadrature points. When the relative difference between two successive likelihood calculations is less than r , then the search terminates and the lesser number of quadrature points is used during the subsequent optimization process. The default value is 1E ˆ’ 4.
RESTART= i >
REST= i >
specifies that the QUANEW or CONGRA algorithm is restarted with a steepest de-scent/ascent search direction after, at most, i iterations. Default values are
CONGRA: UPDATE=PB: restart is performed automatically, i is not used.
CONGRA: UPDATE ‰ PB: i = min(10 n, 80), where n is the number of parameters.
QUANEW: i is the largest integer available.
SEED= i
specifies the random number seed for METHOD=ISAMP. If you do not specify a seed, or specify a value less than or equal to zero, the seed is generated from reading the time of day from the computer clock. The value must be less than 2 31 ˆ’ 1.
SINGCHOL= r >
specifies the singularity criterion r for Cholesky roots of the random-effects variance matrix and scale matrix for adaptive Gaussian quadrature. The default value is 1E4 times the machine epsilon ; this product is approximately 1E ˆ’ 12 on most computers.
SINGHESS= r >
specifies the singularity criterion r for the inversion of the Hessian matrix. The default value is 1E ˆ’ 8. See the ASINGULAR, MSINGULAR=, and VSINGULAR= options for more information.
SINGSWEEP= r >
specifies the singularity criterion r for inverting the variance matrix in the first-order method and the empirical Bayes Hessian matrix. The default value is 1E4 times the machine epsilon; this product is approximately 1E ˆ’ 12 on most computers.
SINGVAR= r >
specifies the singularity criterion r below which statistical variances are considered to equal zero. The default value is 1E4 times the machine epsilon; this product is approximately 1E ˆ’ 12 on most computers.
START
requests that the gradient of the log likelihood at the starting values be displayed. If you also specify the HESS option, then the starting Hessian is displayed as well.
TECHNIQUE= value
TECH= value
specifies the optimization technique. Valid values are
CONGRA
performs a conjugate-gradient optimization, which can be more precisely specified with the UPDATE= option and modified with the LINESEARCH= option. When you specify this option, UPDATE=PB by default.
DBLDOG
performs a version of double dogleg optimization, which can be more precisely specified with the UPDATE= option. When you specify this option, UPDATE=DBFGS by default.
NMSIMP
performs a Nelder-Mead simplex optimization.
NONE
does not perform any optimization. This option can be used
to perform a grid search without optimization
to compute estimates and predictions that cannot be obtained efficiently with any of the optimization techniques
NEWRAP
performs a Newton-Raphson optimization combining a line-search algorithm with ridging. The line-search algorithm LIS=2 is the default method.
NRRIDG
performs a Newton-Raphson optimization with ridging.
QUANEW
performs a quasi-Newton optimization, which can be defined more precisely with the UPDATE= option and modified with the LINESEARCH= option. This is the default estimation method.
TRUREG
performs a trust region optimization.
TRACE
displays the result of each operation in each statement in the model program as it is executed. This debugging option is very rarely needed, and it produces voluminous output.
UPDATE= method
UPD= method
specifies the update method for the quasi-Newton, double dogleg, or conjugate-gradient optimization technique. Not every update method can be used with each optimizer. See the section Optimization Algorithms beginning on page 3086 for more information.
Valid methods are
BFGS
performs the original Broyden, Fletcher, Goldfarb, and Shanno (BFGS) update of the inverse Hessian matrix.
DBFGS
performs the dual BFGS update of the Cholesky factor of the Hessian matrix. This is the default update method.
DDFP
performs the dual Davidon, Fletcher, and Powell (DFP) update of the Cholesky factor of the Hessian matrix.
DFP
performs the original DFP update of the inverse Hessian matrix.
PB
performs the automatic restart update method of Powell (1977) and Beale (1972).
FR
performs the Fletcher-Reeves update (Fletcher 1987).
PR
performs the Polak-Ribiere update (Fletcher 1987).
CD
performs a conjugate-descent update of Fletcher (1987).
VSINGULAR= r >
VSING= r >
specifies a relative singularity criterion for the computation of the inertia (number of positive, negative, and zero eigenvalues) of the Hessian and its projected forms. The default value is r = 1E ˆ’ 8 if the SINGHESS= option is not specified, and it is the value of SINGHESS= option otherwise. See the section Covariance Matrix beginning on page 3101 for more information.
XCONV= r < [ n ] > R
XTOL= r [ n ]
specifies the relative parameter convergence criterion. For all techniques except NMSIMP, termination requires a small relative parameter change in subsequent iterations.
For the NMSIMP technique, the same formula is used, but is defined as the vertex with the lowest function value and is defined as the vertex with the highest function value in the simplex. The default value is r = 1E ˆ’ 8 for the NMSIMP technique and r = 0 otherwise. The optional integer value n specifies the number of successive iterations for which the criterion must be satisfied before the process can be terminated.
XSIZE= r >
specifies the XSIZE parameter of the relative parameter termination criterion. The default value is r = 0. For more detail, see the XCONV= option.
ARRAY arrayname [{ dimensions }] [$] [variables and constants] ;
The ARRAY statement is similar to, but not the same as, the ARRAY statement in the SAS DATA step, and it is the same as the ARRAY statements in the NLIN, NLP, and MODEL procedures. The ARRAY statement is used to associate a name (of no more than eight characters ) with a list of variables and constants. The array name is used with subscripts in the program to refer to the array elements. The following statements illustrate this.
array r[8] r1-r8; do i = 1 to 8; r[i] = 0; end;
The ARRAY statement does not support all the features of the ARRAY statement in the DATA step. It cannot be used to assign initial values to array elements. Implicit indexing of variables cannot be used; all array references must have explicit subscript expressions. Only exact array dimensions are allowed; lower-bound specifications are not supported. A maximum of six dimensions is allowed.
On the other hand, the ARRAY statement does allow both variables and constants to be used as array elements. (Constant array elements cannot have values assigned to them.) Both dimension specification and the list of elements are optional, but at least one must be specified. When the list of elements is not specified or fewer elements than the size of the array are listed, array variables are created by suffixing element numbers to the array name to complete the element list.
BOUNDS b_ con[,b_ con... ] ;
where | b_ con := | number operator parameter_ list operator number |
or | b_ con := | number operator parameter_ list |
or | b_ con := | parameter_ list operator number |
and | operator := | <=, <, >=,or> |
Boundary constraints are specified with a BOUNDS statement. One- or two-sided boundary constraints are allowed. The list of boundary constraints are separated by commas. For example,
bounds 0 <= a1-a9 X <= 1, -1 <= c2-c5; bounds b1-b10 y >= 0;
You can specify more than one BOUNDS statement. If you specify more than one lower (upper) bound for the same parameter, the maximum (minimum) of these is taken.
If the maximum l j of all lower bounds is larger than the minimum of all upper bounds u j for the same variable j , the boundary constraint is replaced by j := l j := min( u j ) defined by the minimum of all upper bounds specified for j .
BY variables ;
You can use a BY statement with PROC NLMIXED to obtain separate analyses on DATA= data set observations in groups defined by the BY variables. This means that, unless TECH=NONE, an optimization problem is solved for each BY group separately. When a BY statement appears, the procedure expects the input DATA= data set to be sorted in order of the BY variables. If your input data set is not sorted in ascending order, use one of the following alternatives:
Use the SORT procedure with a similar BY statement to sort the data.
Use the BY statement option NOTSORTED or DESCENDING in the BY statement for the NLMIXED procedure. As a cautionary note, the NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.
Use the DATASETS procedure (in Base SAS software) to create an index on the BY variables.
For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts . For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide .
CONTRAST label expression <, expression> <options> ;
The CONTRAST statement enables you to conduct a statistical test that several expressions simultaneously equal zero. The expressions are typically contrasts, that is, differences whose expected values equal zero under the hypothesis of interest.
In the CONTRAST statement you must provide a quoted string to identify the contrast and then a list of valid SAS expressions separated by commas. Multiple CONTRAST statements are permitted, and results from all statements are listed in a common table. PROC NLMIXED constructs approximate F tests for each statement using the delta method (Cox 1998) to approximate the variance-covariance matrix of the constituent expressions.
The following option is available in the CONTRAST statement.
DF= d
specifies the denominator degrees of freedom to be used in computing p values for the F statistics. The default value corresponds to the DF= option in the PROC NLMIXED statement.
ESTIMATE label expression <options> ;
The ESTIMATE statement enables you to compute an additional estimate that is a function of the parameter values. You must provide a quoted string to identify the estimate and then a valid SAS expression. Multiple ESTIMATE statements are permitted, and results from all statements are listed in a common table. PROC NLMIXED computes approximate standard errors for the estimates using the delta method (Billingsley 1986). It uses these standard errors to compute corresponding t statistics, p -values, and confidence limits.
The ECOV option in the PROC NLMIXED statement produces a table containing the approximate covariance matrix of all of the additional estimates you specify. The ECORR option produces the corresponding correlation matrix. The EDER option produces a table of the derivatives of the additional estimates with respect to each of the model parameters.
The following options are available in the ESTIMATE statement:
ALPHA= ±
specifies the alpha level to be used in computing confidence limits. The default value corresponds to the ALPHA= option in the PROC NLMIXED statement.
DF= d
specifies the degrees of freedom to be used in computing p -values and confidence limits. The default value corresponds to the DF= option in the PROC NLMIXED statement.
ID names ;
The ID statement identifies additional quantities to be included in the OUT= data set of the PREDICT statement. These can be any symbols you have defined with SAS programming statements.
MODEL dependent-variable ~ distribution ;
The MODEL statement is the mechanism for specifying the conditional distribution of the data given the random effects. You must specify a single dependent variable from the input data set, a tilde (~), and then a distribution with its parameters. Valid distributions are as follows.
normal(m,v) specifies a normal (Gaussian) distribution with mean m and variance v .
binary(p) specifies a binary (Bernoulli) distribution with probability p .
binomial(n,p) specifies a binomial distribution with count n and probability p .
gamma(a,b) specifies a gamma distribution with shape a and scale b .
negbin(n,p) specifies a negative binomial distribution with count n and probability p .
poisson(m) specifies a Poisson distribution with mean m .
general(ll) specifies a general log likelihood function that you construct using SAS programming statements.
The MODEL statement must follow any SAS programming statements you specify for computing parameters of the preceding distributions.
PARMS <name_ list [ =numbers ][ , name_ list [ =numbers ] ... ] > </ options> ;
The PARMS statement lists names of parameters and specifies initial values, possibly over a grid. You can specify the parameters and values either directly in a list or provide the name of a SAS data set that contains them using the DATA= option.
While the PARMS statement is not required, you are encouraged to use it to provide PROC NLMIXED with accurate starting values. Parameters not listed in the PARMS statement are assigned an initial value of 1. PROC NLMIXED considers all symbols not assigned values to be parameters, so you should specify your modeling statements carefully and check the output from the Parameters table to make sure the proper parameters are identified.
A list of parameter names in the PARMS statement is not separated by commas and is followed by an equal sign and a list of numbers. If the number list consists of only one number, this number defines the initial value for all the parameters listed to the left of the equal sign.
If the number list consists of more than one number, these numbers specify the grid locations for each of the parameters listed to the left of the equal sign. You can use the TO and BY keywords to specify a number list for a grid search. If you specify a grid of points in a PARMS statement, PROC NLMIXED computes the objective function value at each grid point and chooses the best ( feasible ) grid point as an initial point for the optimization process. You can use the BEST= option to save memory for the storing and sorting of all grid point information.
The following options are available in the PARMS statement after a slash (/):
BEST= i>
specifies the maximum number of points displayed in the Parameters table, selected as the points with the maximum likelihood values. By default, all grid values are displayed.
DATA = SAS-data-set
specifies a SAS data set containing parameter names and starting values. The data set should be in one of two forms: narrow or wide. The narrow-form data set contains the variables PARAMETER and ESTIMATE, with parameters and values listed as distinct observations. The wide-form data set has the parameters themselves as variables, and each observation provides a different set of starting values. BY groups are ignored in this data set, so the same starting grid is evaluated for each BY group.
PREDICT expression OUT=SAS-data-set <options> ;
The PREDICT statement enables you to construct predictions of an expression across all of the observations in the input data set. Any valid SAS programming expression involving the input data set variables, parameters, and random effects is valid. Predicted values are computed using the parameter estimates and empirical Bayes estimates of the random effects. Standard errors of prediction are computed using the delta method (Billingsley 1986, Cox 1998). Results are placed in an output data set that you specify with the OUT= option. Besides all variables from the input data set, the OUT= data set contains the following variables: Pred, StdErrPred, DF, tValue, Probt, Alpha, Lower, Upper. You can also add other computed quantities to this data set with the ID statement.
The following options are available in the PREDICT statement:
ALPHA= ±
specifies the alpha level to be used in computing t statistics and intervals. The default value corresponds to the ALPHA= option in the PROC NLMIXED statement.
DER
requests that derivatives of the predicted expression with respect to all parameters be included in the OUT= data set. The variable names for the derivatives are the same as the parameter names with the prefix Der_ appended. All of the derivatives are evaluated at the final estimates of the parameters and the empirical Bayes estimates of the random effects.
DF= d
specifies the degrees of freedom to be used in computing t statistics and intervals in the OUT= data set. The default value corresponds to the DF= option in the PROC NLMIXED statement.
RANDOM random-effects ~ distribution SUBJECT=variable <options> ;
The RANDOM statement defines the random effects and their distribution. The random effects must be represented by symbols that appear in your SAS programming statements. They typically influence the mean value of the distribution specified in the MODEL statement. The RANDOM statement consists of a list of the random effects (usually just one or two symbols), a tilde (~), the distribution for the random effects, and then a SUBJECT= variable.
NOTE: The input data set must be clustered according to the SUBJECT= variable. One easy way to accomplish this is to sort your data by the SUBJECT= variable prior to calling PROC NLMIXED. PROC NLMIXED does not sort the input data set for you; rather, it processes the data sequentially and considers an observation to be from a new subject whenever the value of its SUBJECT= changes from the previous observation.
The only distribution currently available for the random effects is normal( m , v ) with mean m and variance v . This syntax is illustrated as follows for one effect:
random u ~ normal(0,s2u) subject=clinic;
For multiple effects, you should specify bracketed vectors for m and v , the latter consisting of the lower triangle of the random-effects variance matrix listed in row order. This is illustrated for two and three random effects as follows.
random b1 b2 ~ normal([0,0],[g11,g21,g22]) subject=person; random b1 b2 b3 ~ normal([0,0,0],[g11,g21,g22,g31,g32,g33]) subject=person;
The SUBJECT= variable determines when new realizations of the random effects are assumed to occur. PROC NLMIXED assumes that a new realization occurs whenever the SUBJECT= variable changes from the previous observation, so your input data set should be clustered according to this variable. One easy way to accomplish this is to run PROC SORT prior to calling PROC NLMIXED using the SUBJECT= variable as the BY variable.
Only one RANDOM statement is permitted, so multilevel nonlinear mixed models are not currently accommodated.
The following options are available in the RANDOM statement:
ALPHA= ±
specifies the alpha level to be used in computing t statistics and intervals. The default value corresponds to the ALPHA= option in the PROC NLMIXED statement.
DF= d
specifies the degrees of freedom to be used in computing t statistics and intervals in the OUT= data set. The default value corresponds to the DF= option in the PROC NLMIXED statement.
OUT= SAS-data-set
requests an output data set containing empirical Bayes estimates of the random effects and their approximate standard errors of prediction.
REPLICATE variable ;
The REPLICATE statement provides a way to accommodate models in which different subjects have identical data. This occurs most commonly when the dependent variable is binary. When you specify a REPLICATE variable, PROC NLMIXED assumes that its value indicates the number of subjects having data identical to those for the current value of the SUBJECT= variable (specified in the RANDOM statement). Only the last observation of the REPLICATE variable for each subject is used, and the replicate variable must have only positive integer values.
This section lists the programming statements used to code the log likelihood function in PROC NLMIXED. It also documents the differences between programming statements in PROC NLMIXED and programming statements in the DATA step. The syntax of programming statements used in PROC NLMIXED is identical to that used in the CALIS and GENMOD procedures (see Chapter 19 and Chapter 31, respectively), and the MODEL procedure (refer to the SAS/ETS User s Guide ). Most of the programming statements that can be used in the SAS DATA step can also be used in the NLMIXED procedure. Refer to SAS Language Reference: Dictionary for a description of SAS programming statements. The following are valid statements:
ABORT;
CALL name [ (expression [, expression ... ]) ] ;
DELETE;
DO [ variable = expression
[ TO expression ] [ BY expression ]
[, expression [ TO expression ] [ BY expression ] ... ]
]
[ WHILE expression ] [ UNTIL expression ] ;
END;
GOTO statement_label ;
IF expression ;
IF expression THEN program_statement ;
ELSE program_statement ;
variable = expression ;
variable + expression ;
LINK statement_label ;
PUT [ variable] [ = ] [...] ;
RETURN ;
SELECT [( expression )] ;
STOP ;
SUBSTR( variable, index, length )= expression ;
WHEN ( expression ) program_statement ;
OTHERWISE program_statement ;
For the most part, the SAS programming statements work the same as they do in the SAS DATA step, as documented in SAS Language Reference: Concepts ; however, there are several differences.
The ABORT statement does not allow any arguments.
The DO statement does not allow a character index variable. Thus
do i = 1,2,3;
is supported; however, the following statement is not supported.
do i = A,B,C;
The LAG function does work appropriately with PROC NLMIXED, but you can use the ZLAG function instead.
The PUT statement, used mostly for program debugging in PROC NLMIXED, supports only some of the features of the DATA step PUT statement, and it has some new features that the DATA step PUT statement does not.
The PROC NLMIXED PUT statement does not support line pointers, factored lists, iteration factors, overprinting, “ INFILE “ , the colon (:) format modifier,or $ .
The PROC NLMIXED PUT statement does support expressions, but the expression must be enclosed in parentheses. For example, the following statement displays the square root of x:
put (sqrt(x));
The PROC NLMIXED PUT statement supports the item _ PDV_ to display a formatted listing of all variables in the program. For example, the following statement displays a much more readable listing of the variables than the _ ALL_ print item:
put _pdv_;
The WHEN and OTHERWISE statements enable you to specify more than one target statement. That is, DO/END groups are not necessary for multiple statement WHENs. For example, the following syntax is valid.
select; when ( exp1 ) stmt1 ; stmt2 ; when ( exp2 ) stmt3 ; stmt4 ; end;
When coding your programming statements, you should avoid defining variables that begin with an underscore (_), as they may conflict with internal variables created by PROC NLMIXED. The MODEL statement must come after any SAS programming statements you specify for computing parameters of the modeling distribution.