A popular application of nonlinear mixed models is in the field of pharmacokinetics, which studies how a drug disperses through a living individual. This example considers the theophylline data from Pinheiro and Bates (1995). Serum concentrations of the drug theophylline are measured in 12 subjects over a 25-hour period after oral administration. The data are as follows.
data theoph;
   input subject time conc dose wt @@;
   datalines;
 1  0.00  0.74 4.02 79.6   1  0.25  2.84 4.02 79.6   1  0.57  6.57 4.02 79.6
 1  1.12 10.50 4.02 79.6   1  2.02  9.66 4.02 79.6   1  3.82  8.58 4.02 79.6
 1  5.10  8.36 4.02 79.6   1  7.03  7.47 4.02 79.6   1  9.05  6.89 4.02 79.6
 1 12.12  5.94 4.02 79.6   1 24.37  3.28 4.02 79.6
 2  0.00  0.00 4.40 72.4   2  0.27  1.72 4.40 72.4   2  0.52  7.91 4.40 72.4
 2  1.00  8.31 4.40 72.4   2  1.92  8.33 4.40 72.4   2  3.50  6.85 4.40 72.4
 2  5.02  6.08 4.40 72.4   2  7.03  5.40 4.40 72.4   2  9.00  4.55 4.40 72.4
 2 12.00  3.01 4.40 72.4   2 24.30  0.90 4.40 72.4
 3  0.00  0.00 4.53 70.5   3  0.27  4.40 4.53 70.5   3  0.58  6.90 4.53 70.5
 3  1.02  8.20 4.53 70.5   3  2.02  7.80 4.53 70.5   3  3.62  7.50 4.53 70.5
 3  5.08  6.20 4.53 70.5   3  7.07  5.30 4.53 70.5   3  9.00  4.90 4.53 70.5
 3 12.15  3.70 4.53 70.5   3 24.17  1.05 4.53 70.5
 4  0.00  0.00 4.40 72.7   4  0.35  1.89 4.40 72.7   4  0.60  4.60 4.40 72.7
 4  1.07  8.60 4.40 72.7   4  2.13  8.38 4.40 72.7   4  3.50  7.54 4.40 72.7
 4  5.02  6.88 4.40 72.7   4  7.02  5.78 4.40 72.7   4  9.02  5.33 4.40 72.7
 4 11.98  4.19 4.40 72.7   4 24.65  1.15 4.40 72.7
 5  0.00  0.00 5.86 54.6   5  0.30  2.02 5.86 54.6   5  0.52  5.63 5.86 54.6
 5  1.00 11.40 5.86 54.6   5  2.02  9.33 5.86 54.6   5  3.50  8.74 5.86 54.6
 5  5.02  7.56 5.86 54.6   5  7.02  7.09 5.86 54.6   5  9.10  5.90 5.86 54.6
 5 12.00  4.37 5.86 54.6   5 24.35  1.57 5.86 54.6
 6  0.00  0.00 4.00 80.0   6  0.27  1.29 4.00 80.0   6  0.58  3.08 4.00 80.0
 6  1.15  6.44 4.00 80.0   6  2.03  6.32 4.00 80.0   6  3.57  5.53 4.00 80.0
 6  5.00  4.94 4.00 80.0   6  7.00  4.02 4.00 80.0   6  9.22  3.46 4.00 80.0
 6 12.10  2.78 4.00 80.0   6 23.85  0.92 4.00 80.0
 7  0.00  0.15 4.95 64.6   7  0.25  0.85 4.95 64.6   7  0.50  2.35 4.95 64.6
 7  1.02  5.02 4.95 64.6   7  2.02  6.58 4.95 64.6   7  3.48  7.09 4.95 64.6
 7  5.00  6.66 4.95 64.6   7  6.98  5.25 4.95 64.6   7  9.00  4.39 4.95 64.6
 7 12.05  3.53 4.95 64.6   7 24.22  1.15 4.95 64.6
 8  0.00  0.00 4.53 70.5   8  0.25  3.05 4.53 70.5   8  0.52  3.05 4.53 70.5
 8  0.98  7.31 4.53 70.5   8  2.02  7.56 4.53 70.5   8  3.53  6.59 4.53 70.5
 8  5.05  5.88 4.53 70.5   8  7.15  4.73 4.53 70.5   8  9.07  4.57 4.53 70.5
 8 12.10  3.00 4.53 70.5   8 24.12  1.25 4.53 70.5
 9  0.00  0.00 3.10 86.4   9  0.30  7.37 3.10 86.4   9  0.63  9.03 3.10 86.4
 9  1.05  7.14 3.10 86.4   9  2.02  6.33 3.10 86.4   9  3.53  5.66 3.10 86.4
 9  5.02  5.67 3.10 86.4   9  7.17  4.24 3.10 86.4   9  8.80  4.11 3.10 86.4
 9 11.60  3.16 3.10 86.4   9 24.43  1.12 3.10 86.4
10  0.00  0.24 5.50 58.2  10  0.37  2.89 5.50 58.2  10  0.77  5.22 5.50 58.2
10  1.02  6.41 5.50 58.2  10  2.05  7.83 5.50 58.2  10  3.55 10.21 5.50 58.2
10  5.05  9.18 5.50 58.2  10  7.08  8.02 5.50 58.2  10  9.38  7.14 5.50 58.2
10 12.10  5.68 5.50 58.2  10 23.70  2.42 5.50 58.2
11  0.00  0.00 4.92 65.0  11  0.25  4.86 4.92 65.0  11  0.50  7.24 4.92 65.0
11  0.98  8.00 4.92 65.0  11  1.98  6.81 4.92 65.0  11  3.60  5.87 4.92 65.0
11  5.02  5.22 4.92 65.0  11  7.03  4.45 4.92 65.0  11  9.03  3.62 4.92 65.0
11 12.12  2.69 4.92 65.0  11 24.08  0.86 4.92 65.0
12  0.00  0.00 5.30 60.5  12  0.25  1.25 5.30 60.5  12  0.50  3.96 5.30 60.5
12  1.00  7.82 5.30 60.5  12  2.00  9.72 5.30 60.5  12  3.52  9.75 5.30 60.5
12  5.07  8.57 5.30 60.5  12  7.07  6.59 5.30 60.5  12  9.03  6.11 5.30 60.5
12 12.05  4.57 5.30 60.5  12 24.15  1.17 5.30 60.5
;
run;
Pinheiro and Bates (1995) consider the following first-order compartment model for these data:
   C_it = [D k_e_i k_a_i / Cl_i (k_a_i - k_e_i)] [exp(-k_e_i t) - exp(-k_a_i t)] + e_it

where C_it is the observed concentration of the ith subject at time t, D is the dose of theophylline, k_e_i is the elimination rate constant for subject i, k_a_i is the absorption rate constant for subject i, Cl_i is the clearance for subject i, and the e_it are normal errors. To allow for random variability between subjects, they assume

   Cl_i  = exp(β1 + b_i1)
   k_a_i = exp(β2 + b_i2)
   k_e_i = exp(β3)

where the βs denote fixed-effects parameters and the b_i's denote random-effects parameters with an unknown covariance matrix.
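As a quick outside-SAS check, the one-compartment mean function can be evaluated directly; the sketch below mirrors the expression assigned to PRED in the SAS program, with parameter values taken from the PARMS starting values (the random effects are set to zero here):

```python
import math

def conc_mean(time, dose, beta1, beta2, beta3, b1=0.0, b2=0.0):
    """First-order one-compartment mean concentration (same form as PRED)."""
    cl = math.exp(beta1 + b1)   # clearance
    ka = math.exp(beta2 + b2)   # absorption rate constant
    ke = math.exp(beta3)        # elimination rate constant
    return dose * ke * ka * (math.exp(-ke * time) - math.exp(-ka * time)) / cl / (ka - ke)

# starting values from the PARMS statement; dose for subject 1
print(round(conc_mean(time=2.0, dose=4.02, beta1=-3.22, beta2=0.47, beta3=-2.45), 3))
```

At t = 0 the two exponentials cancel, so the mean concentration is zero, as the data for most subjects show.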
The PROC NLMIXED statements to fit this model are as follows.
proc nlmixed data=theoph;
   parms beta1=-3.22 beta2=0.47 beta3=-2.45
         s2b1=0.03 cb12=0 s2b2=0.4 s2=0.5;
   cl = exp(beta1 + b1);
   ka = exp(beta2 + b2);
   ke = exp(beta3);
   pred = dose*ke*ka*(exp(-ke*time)-exp(-ka*time))/cl/(ka-ke);
   model conc ~ normal(pred,s2);
   random b1 b2 ~ normal([0,0],[s2b1,cb12,s2b2]) subject=subject;
run;
The PARMS statement specifies starting values for the three βs and the four variance-covariance parameters. The clearance and rate constants are defined using SAS programming statements, and the conditional model for the data is defined to be normal with mean PRED and variance S2. The two random effects are B1 and B2, and their joint distribution is defined in the RANDOM statement. Brackets are used in defining their mean vector (two zeros) and the lower triangle of their variance-covariance matrix (a general 2×2 matrix). The SUBJECT= variable is SUBJECT.
The results from this analysis are as follows.
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.THEOPH
 Dependent Variable                    conc
 Distribution for Dependent Variable   Normal
 Random Effects                        b1 b2
 Distribution for Random Effects       Normal
 Subject Variable                      subject
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    Adaptive Gaussian Quadrature
The Specifications table summarizes the model setup.
The NLMIXED Procedure

            Dimensions
 Observations Used            132
 Observations Not Used          0
 Total Observations           132
 Subjects                      12
 Max Obs Per Subject           11
 Parameters                     7
 Quadrature Points              5
The Dimensions table indicates that there are 132 observations, 12 subjects, and 7 parameters. PROC NLMIXED selects 5 quadrature points for each random effect, producing a total grid of 25 points over which quadrature is performed.
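The idea behind the quadrature can be sketched outside SAS with a plain (non-adaptive) 5-point Gauss-Hermite rule; the nodes and weights below are the standard tabulated values, and the check uses the known lognormal mean E[exp(b)] = exp(σ²/2) for b ~ N(0, σ²). With two random effects, the rule is applied on the 5 × 5 tensor-product grid, giving the 25 points noted above:

```python
import math

# 5-point Gauss-Hermite nodes and weights (physicists' convention)
nodes = [-2.020182870456086, -0.958572464613819, 0.0,
         0.958572464613819, 2.020182870456086]
weights = [0.019953242059046, 0.393619323152241, 0.945308720482942,
           0.393619323152241, 0.019953242059046]

def gh_expectation(f, sigma2):
    """Approximate E[f(b)] for b ~ N(0, sigma2) by 5-point Gauss-Hermite quadrature."""
    s = math.sqrt(2.0 * sigma2)
    return sum(w * f(s * x) for w, x in zip(weights, nodes)) / math.sqrt(math.pi)

# compare against the exact value E[exp(b)] = exp(sigma2/2)
approx = gh_expectation(math.exp, 0.4)
print(round(approx, 5), round(math.exp(0.2), 5))
```

PROC NLMIXED's adaptive version additionally recenters and rescales the grid around each subject's empirical Bayes mode, which is why few points per dimension suffice.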
The NLMIXED Procedure

                           Parameters
 beta1   beta2   beta3   s2b1   cb12   s2b2    s2    NegLogLike
 -3.22    0.47   -2.45   0.03      0    0.4   0.5    177.789945
The Parameters table lists the 7 parameters, their starting values, and the initial evaluation of the negative log likelihood using adaptive Gaussian quadrature.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      5  177.776248    0.013697  2.873367  -63.0744
    2      8  177.7643      0.011948  1.698144  -4.75239
    3     10  177.757264    0.007036  1.297439  -1.97311
    4     12  177.755688    0.001576  1.441408  -0.49772
    5     14  177.7467      0.008988  1.132279  -0.8223
    6     17  177.746401    0.000299  0.831293  -0.00244
    7     19  177.746318    0.000083  0.724198  -0.00789
    8     21  177.74574     0.000578  0.180018  -0.00583
    9     23  177.745736    3.88E-6   0.017958  -8.25E-6
   10     25  177.745736    3.222E-8  0.000143  -6.51E-8

NOTE: GCONV convergence criterion satisfied.
The Iterations table indicates that 10 steps are required for the dual quasi-Newton algorithm to achieve convergence.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood            355.5
 AIC (smaller is better)      369.5
 AICC (smaller is better)     370.4
 BIC (smaller is better)      372.9
The Fitting Information table lists the final optimized value of the -2 log likelihood and the information criteria AIC, AICC, and BIC.
The NLMIXED Procedure

                                Parameter Estimates
                  Standard
 Parameter  Estimate    Error  DF  t Value  Pr > |t|  Alpha     Lower     Upper  Gradient
 beta1       -3.2268  0.05950  10   -54.23    <.0001   0.05   -3.3594   -3.0942   0.00009
 beta2        0.4806   0.1989  10     2.42    0.0363   0.05   0.03745    0.9238  3.645E-7
 beta3       -2.4592  0.05126  10   -47.97    <.0001   0.05   -2.5734   -2.3449  0.000039
 s2b1        0.02803  0.01221  10     2.30    0.0445   0.05  0.000833   0.05523   0.00014
 cb12       -0.00127  0.03404  10    -0.04    0.9710   0.05  -0.07712   0.07458   0.00007
 s2b2         0.4331   0.2005  10     2.16    0.0560   0.05   0.01353    0.8798   6.98E-6
 s2           0.5016  0.06837  10     7.34    <.0001   0.05    0.3493    0.6540  6.133E-6
The Parameter Estimates table contains the maximum likelihood estimates of the parameters. Both S2B1 and S2B2 are marginally significant, indicating between-subject variability in the clearances and absorption rate constants, respectively. There does not appear to be a significant covariance between them, as seen by the estimate of CB12.
The estimates of β1, β2, and β3 are close to the adaptive quadrature estimates listed in Table 3 of Pinheiro and Bates (1995). However, Pinheiro and Bates use a Cholesky-root parameterization for the random-effects variance matrix and a logarithmic parameterization for the residual variance. The PROC NLMIXED statements using their parameterization are as follows, and the results are similar.
proc nlmixed data=theoph;
   parms ll1=-1.5 l2=0 ll3=-0.1 beta1=-3 beta2=0.5 beta3=-2.5 ls2=-0.7;
   s2 = exp(ls2);
   l1 = exp(ll1);
   l3 = exp(ll3);
   s2b1 = l1*l1*s2;
   cb12 = l2*l1*s2;
   s2b2 = (l2*l2 + l3*l3)*s2;
   cl = exp(beta1 + b1);
   ka = exp(beta2 + b2);
   ke = exp(beta3);
   pred = dose*ke*ka*(exp(-ke*time)-exp(-ka*time))/cl/(ka-ke);
   model conc ~ normal(pred,s2);
   random b1 b2 ~ normal([0,0],[s2b1,cb12,s2b2]) subject=subject;
run;
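The mapping between the two parameterizations can be checked numerically. Starting from the covariance-scale estimates above (s2b1 = 0.02803, cb12 = -0.00127, s2b2 = 0.4331, s2 = 0.5016), the implied Cholesky-scale values are recovered by inverting the three defining equations; the variable names below mirror the SAS program:

```python
import math

# covariance-scale estimates from the first parameterization
s2b1, cb12, s2b2, s2 = 0.02803, -0.00127, 0.4331, 0.5016

# invert the mapping  s2b1 = l1^2*s2,  cb12 = l2*l1*s2,  s2b2 = (l2^2 + l3^2)*s2
l1 = math.sqrt(s2b1 / s2)
l2 = cb12 / (l1 * s2)
l3 = math.sqrt(s2b2 / s2 - l2**2)

# round trip: rebuilding the covariance parameters recovers the inputs exactly
assert abs(l1 * l1 * s2 - s2b1) < 1e-12
assert abs(l2 * l1 * s2 - cb12) < 1e-12
assert abs((l2 * l2 + l3 * l3) * s2 - s2b2) < 1e-12
print(round(l1, 4), round(l2, 4), round(l3, 4))
```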
For this example, consider the data from Weil (1970), also studied by Williams (1975), Ochi and Prentice (1984), and McCulloch (1994). In this experiment 16 pregnant rats receive a control diet and 16 receive a chemically treated diet, and the litter size for each rat is recorded after 4 and 21 days. The SAS data set is as follows.
data rats;
   input trt$ m x @@;
   if (trt='c') then do;
      x1 = 1;
      x2 = 0;
   end;
   else do;
      x1 = 0;
      x2 = 1;
   end;
   litter = _n_;
   datalines;
c 13 13   c 12 12   c  9  9   c  9  9
c  8  8   c  8  8   c 13 12   c 12 11
c 10  9   c 10  9   c  9  8   c 13 11
c  5  4   c  7  5   c 10  7   c 10  7
t 12 12   t 11 11   t 10 10   t  9  9
t 11 10   t 10  9   t 10  9   t  9  8
t  9  8   t  5  4   t  9  7   t  7  4
t 10  5   t  6  3   t 10  3   t  7  0
;
run;
Here, M represents the size of the litter after 4 days, and X represents the size of the litter after 21 days. Also, indicator variables X1 and X2 are constructed for the two treatment levels.
Following McCulloch (1994), assume a latent survival model of the form

   y_ijk = t_i + α_ij + e_ijk

where i indexes treatment, j indexes litter, and k indexes newborn rats within a litter. The t_i represent treatment means, the α_ij represent random litter effects assumed to be iid N(0, s_i²), and the e_ijk represent iid N(0, 1) residual errors, all on the latent scale.
Instead of observing the survival times y_ijk, assume that only the binary variable indicating whether y_ijk exceeds 0 is observed. If x_ij denotes the sum of these binary variables for the ith treatment and the jth litter, then the preceding assumptions lead to the following generalized linear mixed model:

   x_ij | α_ij ~ Binomial(m_ij, p_ij)

where m_ij is the size of each litter after 4 days and

   p_ij = Φ(t_i + α_ij)
The PROC NLMIXED statements to fit this model are as follows.
proc nlmixed data=rats;
   parms t1=1 t2=1 s1=.05 s2=1;
   eta = x1*t1 + x2*t2 + alpha;
   p = probnorm(eta);
   model x ~ binomial(m,p);
   random alpha ~ normal(0,x1*s1*s1+x2*s2*s2) subject=litter;
   estimate 'gamma2' t2/sqrt(1+s2*s2);
   predict p out=p;
run;
As in the previous example, the PROC NLMIXED statement invokes the procedure and the PARMS statement defines the parameters. The parameters for this example are the two treatment means, T1 and T2, and the two random-effect standard deviations, S1 and S2.
The indicator variables X1 and X2 are used in the program to assign the proper mean to each observation in the input data set as well as the proper variance to the random effects. Note that programming expressions are permitted inside the distributional specifications, as illustrated by the random-effects variance specified here.
The ESTIMATE statement requests an estimate of γ2 = t2/sqrt(1 + s2²), which is a location-scale parameter from Ochi and Prentice (1984).
The PREDICT statement constructs predictions for each observation in the input data set. For this example, predictions of P and approximate standard errors of prediction are output to a SAS data set named P. These predictions are functions of the parameter estimates and the empirical Bayes estimates of the random effects α_ij.
The output for this model is as follows.
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.RATS
 Dependent Variable                    x
 Distribution for Dependent Variable   Binomial
 Random Effects                        alpha
 Distribution for Random Effects       Normal
 Subject Variable                      litter
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    Adaptive Gaussian Quadrature
The Specifications table provides basic information about this nonlinear mixed model.
The NLMIXED Procedure

            Dimensions
 Observations Used             32
 Observations Not Used          0
 Total Observations            32
 Subjects                      32
 Max Obs Per Subject            1
 Parameters                     4
 Quadrature Points              7
The Dimensions table provides counts of the observations, subjects, parameters, and quadrature points.
The NLMIXED Procedure

              Parameters
 t1   t2     s1   s2    NegLogLike
  1    1   0.05    1    54.9362323
The Parameters table lists the starting point of the optimization.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      2  53.9933934    0.942839  11.03261  -81.9428
    2      3  52.875353     1.11804   2.148952  -2.86277
    3      5  52.6350386    0.240314  0.329957  -1.05049
    4      6  52.6319939    0.003045  0.122926  -0.00672
    5      8  52.6313583    0.000636  0.028246  -0.00352
    6     11  52.6313174    0.000041  0.013551  -0.00023
    7     13  52.6313115    5.839E-6  0.000603  -0.00001
    8     15  52.6313115    9.45E-9   0.000022  -1.68E-8

NOTE: GCONV convergence criterion satisfied.
The Iterations table indicates successful convergence in 8 iterations.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood            105.3
 AIC (smaller is better)      113.3
 AICC (smaller is better)     114.7
 BIC (smaller is better)      119.1
The Fitting Information table lists some useful statistics based on the maximized value of the log likelihood.
The NLMIXED Procedure

                               Parameter Estimates
                  Standard
 Parameter  Estimate    Error  DF  t Value  Pr > |t|  Alpha    Lower    Upper  Gradient
 t1           1.3063   0.1685  31     7.75    <.0001   0.05   0.9626   1.6499   0.00002
 t2           0.9475   0.3055  31     3.10    0.0041   0.05   0.3244   1.5705  9.283E-6
 s1           0.2403   0.3015  31     0.80    0.4315   0.05  -0.3746   0.8552  0.000014
 s2           1.0292   0.2988  31     3.44    0.0017   0.05   0.4198   1.6385   3.16E-6
The Parameter Estimates table indicates significance of all of the parameters except S1.
The NLMIXED Procedure

                          Additional Estimates
             Standard
 Label   Estimate   Error  DF  t Value  Pr > |t|  Alpha   Lower   Upper
 gamma2    0.6603  0.2165  31     3.05    0.0047   0.05  0.2186  1.1019
The Additional Estimates table displays results from the ESTIMATE statement. The estimate of γ2 equals 0.66, agreeing with that obtained by McCulloch (1994). The standard error 0.22 is computed using the delta method (Billingsley 1986; Cox 1998).
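The point estimate can be reproduced by hand from the Parameter Estimates table (t2 = 0.9475, s2 = 1.0292); the sketch below evaluates the same expression as the ESTIMATE statement:

```python
import math

t2, s2 = 0.9475, 1.0292                 # estimates from the Parameter Estimates table
gamma2 = t2 / math.sqrt(1 + s2 * s2)    # same expression as ESTIMATE 'gamma2'
print(round(gamma2, 4))
```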
Not shown is the P data set, which contains the original 32 observations and predictions of the p_ij.
The data for this example are from Ezzet and Whitehead (1991), who describe a crossover experiment on two groups of patients using two different inhaler devices (A and B). Patients from group 1 used device A for one week and then device B for another week. Patients from group 2 used the devices in reverse order. The data entered as a SAS data set are as follows.
data inhaler;
   input clarity group time freq @@;
   gt = group*time;
   sub = floor((_n_+1)/2);
   datalines;
1 0 0 59   1 0 1 59
1 0 0 35   2 0 1 35
1 0 0  3   3 0 1  3
1 0 0  2   4 0 1  2
2 0 0 11   1 0 1 11
2 0 0 27   2 0 1 27
2 0 0  2   3 0 1  2
2 0 0  1   4 0 1  1
4 0 0  1   1 0 1  1
4 0 0  1   2 0 1  1
1 1 0 63   1 1 1 63
1 1 0 13   2 1 1 13
2 1 0 40   1 1 1 40
2 1 0 15   2 1 1 15
3 1 0  7   1 1 1  7
3 1 0  2   2 1 1  2
3 1 0  1   3 1 1  1
4 1 0  2   1 1 1  2
4 1 0  1   3 1 1  1
;
run;
The response measurement, CLARITY, is the patient's assessment of the clarity of the leaflet instructions for the devices. The CLARITY variable is on an ordinal scale, with 1=easy, 2=only clear after rereading, 3=not very clear, and 4=confusing. The GROUP variable indicates the treatment group, and the TIME variable indicates the time of measurement. The FREQ variable indicates the number of patients with exactly the same responses. A variable GT is created to indicate a group-by-time interaction, and a variable SUB is created to identify patients.
As in the previous example and in Hedeker and Gibbons (1994), assume an underlying latent continuous variable, here with the form

   y_ij = β0 + β1 g_i + β2 t_j + β3 g_i t_j + u_i + e_ij

where i indexes patient and j indexes the time period, g_i indicates group, t_j indicates time, u_i is a patient-level normal random effect, and the e_ij are iid normal errors. The βs are unknown coefficients to be estimated.
Instead of observing y_ij, though, you observe only whether it falls in one of the four intervals: (-∞, 0), (0, I1), (I1, I1 + I2), or (I1 + I2, ∞), where I1 and I2 are both positive. The resulting category is the value assigned to the CLARITY variable.
The following code sets up and fits this ordinal probit model:
proc nlmixed data=inhaler corr ecorr;
   parms b0=0 b1=0 b2=0 b3=0 sd=1 i1=1 i2=1;
   bounds i1 > 0, i2 > 0;
   eta = b0 + b1*group + b2*time + b3*gt + u;
   if (clarity=1) then
      p = probnorm(-eta);
   else if (clarity=2) then
      p = probnorm(i1-eta) - probnorm(-eta);
   else if (clarity=3) then
      p = probnorm(i1+i2-eta) - probnorm(i1-eta);
   else
      p = 1 - probnorm(i1+i2-eta);
   if (p > 1e-8) then
      ll = log(p);
   else
      ll = -1e100;
   model clarity ~ general(ll);
   random u ~ normal(0,sd*sd) subject=sub;
   replicate freq;
   estimate 'thresh2' i1;
   estimate 'thresh3' i1 + i2;
   estimate 'icc' sd*sd/(1+sd*sd);
run;
The PROC statement specifies the input data set and requests correlations both for the parameter estimates (CORR option) and the additional estimates specified with ESTIMATE statements (ECORR option).
The parameters as defined in the PARMS statement are as follows. B0 (overall intercept), B1 (group main effect), B2 (time main effect), B3 (group by time interaction), SD (standard deviation of the random effect), I1 (increment between first and second thresholds), and I2 (increment between second and third thresholds). The BOUNDS statement restricts I1 and I2 to be positive.
The SAS programming statements begin by defining the linear predictor ETA, which is a linear combination of the B parameters and a single random effect U. The next statements define the ordinal likelihood according to the CLARITY variable, ETA, and the increment variables. An error trap is included in case the likelihood becomes too small.
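Because the four intervals partition the real line, the four category probabilities must sum to one for any value of the linear predictor. The sketch below reproduces the same likelihood pieces outside SAS (Φ written via the error function; the eta value and the i1, i2 values near the fitted thresholds are illustrative):

```python
import math

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probs(eta, i1, i2):
    """Interval probabilities for the four CLARITY categories."""
    return [
        Phi(-eta),                           # clarity = 1: y in (-inf, 0)
        Phi(i1 - eta) - Phi(-eta),           # clarity = 2: y in (0, I1)
        Phi(i1 + i2 - eta) - Phi(i1 - eta),  # clarity = 3: y in (I1, I1+I2)
        1.0 - Phi(i1 + i2 - eta),            # clarity = 4: y in (I1+I2, inf)
    ]

probs = category_probs(eta=-0.5, i1=1.7450, i2=0.5985)
print([round(p, 4) for p in probs])
assert abs(sum(probs) - 1.0) < 1e-12
```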
A general log likelihood specification is used in the MODEL statement, and the RANDOM statement defines the random effect U to have standard deviation SD and subject variable SUB. The REPLICATE statement indicates that data for each subject should be replicated according to the FREQ variable.
The ESTIMATE statements specify the second and third thresholds in terms of the increment variables (the first threshold is assumed to equal zero for model identifiability). Also computed is the intraclass correlation.
The output is as follows.
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.INHALER
 Dependent Variable                    clarity
 Distribution for Dependent Variable   General
 Random Effects                        u
 Distribution for Random Effects       Normal
 Subject Variable                      sub
 Replicate Variable                    freq
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    Adaptive Gaussian Quadrature
The Specifications table echoes some primary information specified for this nonlinear mixed model.
The NLMIXED Procedure

            Dimensions
 Observations Used             38
 Observations Not Used          0
 Total Observations            38
 Subjects                     286
 Max Obs Per Subject            2
 Parameters                     7
 Quadrature Points              5
The Dimensions table reveals a total of 286 subjects, which is the sum of the values of the FREQ variable. Five quadrature points are selected for log likelihood evaluation.
The NLMIXED Procedure

                   Parameters
 b0   b1   b2   b3   sd   i1   i2    NegLogLike
  0    0    0    0    1    1    1    538.484276
The Parameters table lists the simple starting values for this problem.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      2  476.382511    62.10176  43.75062   -1431.4
    2      4  463.228197    13.15431  14.24648  -106.753
    3      5  458.528118    4.70008   48.31316  -33.0389
    4      6  450.975735    7.552383  22.60098  -40.9954
    5      8  448.012701    2.963033  14.86877  -16.7453
    6     10  447.245153    0.767549  7.774189  -2.26743
    7     11  446.72767     0.517483  3.793533  -1.59278
    8     13  446.518273    0.209396  0.868638  -0.37801
    9     16  446.514528    0.003745  0.328568  -0.02356
   10     18  446.513341    0.001187  0.056778  -0.00183
   11     20  446.513314    0.000027  0.010785  -0.00004
   12     22  446.51331     3.956E-6  0.004922  -5.41E-6
   13     24  446.51331     1.989E-7  0.00047   -4E-7

NOTE: GCONV convergence criterion satisfied.
The Iterations table indicates successful convergence in 13 iterations.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood            893.0
 AIC (smaller is better)      907.0
 AICC (smaller is better)     910.8
 BIC (smaller is better)      932.6
The Fitting Information table lists the log likelihood and information criteria.
The NLMIXED Procedure

                                Parameter Estimates
                  Standard
 Parameter  Estimate   Error   DF  t Value  Pr > |t|  Alpha    Lower     Upper  Gradient
 b0          -0.6364  0.1342  285    -4.74    <.0001   0.05  -0.9006   -0.3722   0.00047
 b1           0.6007  0.1770  285     3.39    0.0008   0.05   0.2523    0.9491  0.000265
 b2           0.6015  0.1582  285     3.80    0.0002   0.05   0.2900    0.9129   0.00008
 b3          -1.4817  0.2385  285    -6.21    <.0001   0.05  -1.9512   -1.0122  0.000102
 sd           0.6599  0.1312  285     5.03    <.0001   0.05   0.4017    0.9181   0.00009
 i1           1.7450  0.1474  285    11.84    <.0001   0.05   1.4548    2.0352  0.000202
 i2           0.5985  0.1427  285     4.19    <.0001   0.05   0.3177    0.8794  0.000087
The Parameter Estimates table indicates significance of all of the parameters.
The NLMIXED Procedure

                           Additional Estimates
              Standard
 Label    Estimate    Error   DF  t Value  Pr > |t|  Alpha   Lower   Upper
 thresh2    1.7450   0.1474  285    11.84    <.0001   0.05  1.4548  2.0352
 thresh3    2.3435   0.2073  285    11.31    <.0001   0.05  1.9355  2.7515
 icc        0.3034  0.08402  285     3.61    0.0004   0.05  0.1380  0.4687
The Additional Estimates table displays results from the ESTIMATE statements.
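Two of the additional estimates can be reproduced directly from the Parameter Estimates table (sd = 0.6599, i1 = 1.7450, i2 = 0.5985); a quick sketch:

```python
sd, i1, i2 = 0.6599, 1.7450, 0.5985   # estimates from the Parameter Estimates table

thresh3 = i1 + i2                 # third threshold, as in ESTIMATE 'thresh3'
icc = sd * sd / (1 + sd * sd)     # latent-scale intraclass correlation, ESTIMATE 'icc'

print(round(thresh3, 4), round(icc, 4))
```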
This example uses the pump failure data of Gaver and O'Muircheartaigh (1987). The number of failures and the time of operation are recorded for 10 pumps. Each of the pumps is classified into one of two groups corresponding to either continuous or intermittent operation. The data are as follows.
data pump;
   input y t group;
   pump = _n_;
   logtstd = log(t) - 2.4564900;
   datalines;
 5  94.320 1
 1  15.720 2
 5  62.880 1
14 125.760 1
 3   5.240 2
19  31.440 1
 1   1.048 2
 1   1.048 2
 4   2.096 2
22  10.480 2
;
run;
Each row denotes data for a single pump, and the variable LOGTSTD contains the centered operation times.
Letting y_ij denote the number of failures for the jth pump in the ith group, Draper (1996) considers the following hierarchical model for these data:

   y_ij | λ_ij ~ Poisson(λ_ij)
   log λ_ij = α_i + β_i x_ij + e_ij
   e_ij ~ Normal(0, σ²)

where x_ij = log t_ij - 2.4564900 is the centered log operation time (the variable LOGTSTD).
The model specifies different intercepts and slopes for each group, and the random effect is a mechanism for accounting for overdispersion.
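To see why the lognormal random effect produces overdispersion, note that if log λ = η + e with e ~ N(0, σ²), then marginally E[y] = exp(η + σ²/2) and Var[y] = E[y] + (exp(σ²) - 1) E[y]², which always exceeds the Poisson variance E[y]. A sketch with illustrative values (the η and σ below are hypothetical, not fitted quantities):

```python
import math

eta, sigma = 1.0, 0.5   # hypothetical linear predictor and random-effect SD

mean = math.exp(eta + sigma**2 / 2)                  # marginal mean of y
var = mean + (math.exp(sigma**2) - 1) * mean**2      # marginal variance of y

print(round(mean, 4), round(var, 4))
assert var > mean   # overdispersion relative to a plain Poisson model
```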
The corresponding PROC NLMIXED statements are as follows.
proc nlmixed data=pump;
   parms logsig 0 beta1 1 beta2 1 alpha1 1 alpha2 1;
   if (group = 1) then
      eta = alpha1 + beta1*logtstd + e;
   else
      eta = alpha2 + beta2*logtstd + e;
   lambda = exp(eta);
   model y ~ poisson(lambda);
   random e ~ normal(0,exp(2*logsig)) subject=pump;
   estimate 'alpha1-alpha2' alpha1-alpha2;
   estimate 'beta1-beta2' beta1-beta2;
run;
The output is as follows.
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.PUMP
 Dependent Variable                    y
 Distribution for Dependent Variable   Poisson
 Random Effects                        e
 Distribution for Random Effects       Normal
 Subject Variable                      pump
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    Adaptive Gaussian Quadrature
The Specifications table displays some details for this Poisson-Normal model.
The NLMIXED Procedure

            Dimensions
 Observations Used             10
 Observations Not Used          0
 Total Observations            10
 Subjects                      10
 Max Obs Per Subject            1
 Parameters                     5
 Quadrature Points              5
The Dimensions table indicates that data for 10 pumps are used with one observation for each.
The NLMIXED Procedure

                       Parameters
 logsig   beta1   beta2   alpha1   alpha2    NegLogLike
      0       1       1        1        1    32.8614614
The Parameters table lists the simple starting values for this problem and the initial evaluation of the negative log likelihood.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      2  30.6986932    2.162768  5.107253   -91.602
    2      5  30.0255468    0.673146  2.761738  -11.0489
    3      7  29.726325     0.299222  2.990401  -2.36048
    4      9  28.7390263    0.987299  2.074431  -3.93678
    5     10  28.3161933    0.422833  0.612531  -0.63084
    6     12  28.09564      0.220553  0.462162  -0.52684
    7     14  28.0438024    0.051838  0.405047  -0.10018
    8     16  28.0357134    0.008089  0.135059  -0.01875
    9     18  28.033925     0.001788  0.026279  -0.00514
   10     20  28.0338744    0.000051  0.00402   -0.00012
   11     22  28.0338727    1.681E-6  0.002864  -5.09E-6
   12     24  28.0338724    3.199E-7  0.000147  -6.87E-7
   13     26  28.0338724    2.532E-9  0.000017  -5.75E-9

NOTE: GCONV convergence criterion satisfied.
The Iterations table indicates successful convergence in 13 iterations.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood             56.1
 AIC (smaller is better)       66.1
 AICC (smaller is better)      81.1
 BIC (smaller is better)       67.6
The Fitting Information table lists the final log likelihood and associated information criteria.
The NLMIXED Procedure

                               Parameter Estimates
                  Standard
 Parameter  Estimate   Error  DF  t Value  Pr > |t|  Alpha    Lower    Upper  Gradient
 logsig      -0.3161  0.3213   9    -0.98    0.3508   0.05  -1.0429   0.4107   0.00002
 beta1       -0.4256  0.7473   9    -0.57    0.5829   0.05  -2.1162   1.2649   0.00002
 beta2        0.6097  0.3814   9     1.60    0.1443   0.05  -0.2530   1.4724   1.61E-6
 alpha1       2.9644  1.3826   9     2.14    0.0606   0.05  -0.1632   6.0921   5.25E-6
 alpha2       1.7992  0.5492   9     3.28    0.0096   0.05   0.5568   3.0415   5.73E-6
The NLMIXED Procedure

                               Additional Estimates
                     Standard
 Label          Estimate    Error  DF  t Value  Pr > |t|  Alpha    Lower   Upper
 alpha1-alpha2    1.1653   1.4855   9     0.78    0.4529   0.05  -2.1952  4.5257
 beta1-beta2     -1.0354   0.8389   9    -1.23    0.2484   0.05  -2.9331  0.8623
The Parameter Estimates and Additional Estimates tables list the maximum likelihood estimates for each of the parameters and for the two differences. The point estimates for the mean parameters agree fairly closely with the Bayesian posterior means reported by Draper (1996); however, the likelihood-based standard errors are roughly half the Bayesian posterior standard deviations. This is most likely because the Bayesian standard deviations account for the uncertainty in estimating σ², whereas the likelihood analysis plugs in its estimated value. This downward bias can be corrected somewhat by using the t distribution with 9 degrees of freedom shown here.
In this example an accelerated failure time model with proportional hazard is fitted with and without random effects. The data are from the Getting Started example of PROC LIFEREG, see Chapter 39, The LIFEREG Procedure. Thirty-eight patients are divided into two groups of equal size, and different pain relievers are assigned to each group. The outcome reported is the time in minutes until headache relief. The variable censor indicates whether relief was observed during the course of the observation period ( censor = 0) or whether the observation is censored ( censor = 1).
data headache;
   input minutes group censor @@;
   patient = _n_;
   datalines;
11 1 0   12 1 0   19 1 0   19 1 0
19 1 0   19 1 0   21 1 0   20 1 0
21 1 0   21 1 0   20 1 0   21 1 0
20 1 0   21 1 0   25 1 0   27 1 0
30 1 0   21 1 1   24 1 1   14 2 0
16 2 0   16 2 0   21 2 0   21 2 0
23 2 0   23 2 0   23 2 0   23 2 0
25 2 1   23 2 0   24 2 0   24 2 0
26 2 1   32 2 1   30 2 1   30 2 0
32 2 1   20 2 1
;
In modeling survival data, censoring of observations must be taken into account carefully. In this example, only right censoring occurs. If g(t, β), h(t, β), and G(t, β) denote the density of failure, the hazard function, and the survival distribution function at time t, respectively, the log likelihood can be written as

   l(β) = Σ_{i ∈ U_u} log g(t_i, β) + Σ_{i ∈ U_c} log G(t_i, β)
Refer to Cox and Oakes (1984, ch. 3). In these expressions U u is the set of uncensored observations, U c is the set of censored observations, and n denotes the total sample size.
The proportional hazards specification expresses the hazard in terms of a baseline hazard, multiplied by a constant. In this example the hazard is that of a Weibull model and is parameterized as h(t, β) = γα(αt)^(γ-1) with α = exp{-x'β}.
The linear predictor is set equal to the intercept in the reference group (group = 2); this defines the baseline hazard. The corresponding distribution of survival past time t is G(t, β) = exp{-(αt)^γ}. Refer to Cox and Oakes (1984, Table 2.1) and the section "Supported Distributions" in Chapter 39, "The LIFEREG Procedure," for this and other survival distribution models and various parameterizations.
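The density, hazard, and survival function are tied together by g = h·G, which gives a quick numerical check of the parameterization: the density must agree with -dG/dt. The γ and α values below are illustrative only, not the fitted estimates:

```python
import math

gamma, alpha = 4.7, 0.044   # illustrative Weibull shape and alpha = exp(-x'b) scale

def G(t):
    """Survival function."""
    return math.exp(-(alpha * t) ** gamma)

def h(t):
    """Hazard function."""
    return gamma * alpha * (alpha * t) ** (gamma - 1)

def g(t):
    """Density of failure: g = h * G."""
    return h(t) * G(t)

t = 15.0
eps = 1e-6
fd = -(G(t + eps) - G(t - eps)) / (2 * eps)   # central finite difference of -dG/dt
print(round(g(t), 6), round(fd, 6))
```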
The following NLMIXED statements fit this accelerated failure time model and estimate the cumulative distribution function of time to headache relief.
proc nlmixed data=headache;
   bounds gamma > 0;
   linp = b0 - b1*(group-2);
   alpha = exp(-linp);
   G_t = exp(-(alpha*minutes)**gamma);
   g = gamma*alpha*((alpha*minutes)**(gamma-1))*G_t;
   ll = (censor=0)*log(g) + (censor=1)*log(G_t);
   model minutes ~ general(ll);
   predict 1-G_t out=cdf;
run;

proc print data=cdf;
   var group censor patient minutes pred;
run;
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.HEADACHE
 Dependent Variable                    minutes
 Distribution for Dependent Variable   General
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    None
The Specifications table shows that no integration is required, since the model does not contain random effects.
The NLMIXED Procedure

            Dimensions
 Observations Used             38
 Observations Not Used          0
 Total Observations            38
 Parameters                     3
The NLMIXED Procedure

           Parameters
 gamma   b0   b1    NegLogLike
     1    1    1    263.990327
No starting values were given for the three parameters. The NLMIXED procedure assigns the default values in this case.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      2  169.244311    94.74602  22.5599   -2230.83
    2      4  142.873508    26.3708   14.88631  -3.64643
    3      6  140.633695    2.239814  11.25234  -9.49454
    4      8  122.890659    17.74304  19.44959  -2.50807
    5      9  121.396959    1.493699  13.85584  -4.55427
    6     11  120.623843    0.773116  13.67062  -1.38064
    7     12  119.278196    1.345647  15.78014  -1.69072
    8     14  116.271325    3.006871  26.94029  -3.2529
    9     16  109.427401    6.843925  19.88382  -6.9289
   10     19  103.298102    6.129298  12.15647  -4.96054
   11     22  101.686239    1.611863  14.24868  -4.34059
   12     23  100.027875    1.658364  11.69853  -13.2049
   13     26  99.9189048    0.108971  3.602552  -0.55176
   14     28  99.8738836    0.045021  0.170712  -0.16645
   15     30  99.8736392    0.000244  0.050822  -0.00041
   16     32  99.8736351    4.071E-6  0.000705  -6.9E-6
   17     34  99.8736351    6.1E-10   4.768E-6  -1.23E-9

NOTE: GCONV convergence criterion satisfied.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood            199.7
 AIC (smaller is better)      205.7
 AICC (smaller is better)     206.5
 BIC (smaller is better)      210.7
After 17 iterations and 34 evaluations of the objective function, the procedure converges.
The NLMIXED Procedure

                               Parameter Estimates
                  Standard
 Parameter  Estimate    Error  DF  t Value  Pr > |t|  Alpha    Lower     Upper  Gradient
 gamma        4.7128   0.6742  38     6.99    <.0001   0.05   3.3479    6.0777  5.327E-8
 b0           3.3091  0.05885  38    56.23    <.0001   0.05   3.1900    3.4283   4.77E-6
 b1          -0.1933  0.07856  38    -2.46    0.0185   0.05  -0.3523  -0.03426   1.22E-6
The parameter estimates and their standard errors are identical to those obtained with the LIFEREG procedure and the statements
proc lifereg data=headache;
   class group;
   model minutes*censor(1) = group / dist=weibull;
run;
The t statistic and confidence limits are based on 38 degrees of freedom. The LIFEREG procedure computes z -intervals for the parameter estimates.
For the two groups you obtain

   α̂_1 = exp{-(3.3091 - 0.1933)} = 0.04434
   α̂_2 = exp{-3.3091} = 0.03655

The probabilities of headache relief by t minutes are estimated as

   1 - G(t) = 1 - exp{-(α̂_i t)^4.7128}
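These estimates can be reproduced outside SAS. Using the fitted values γ̂ = 4.7128, b̂0 = 3.3091, b̂1 = -0.1933 from the Parameter Estimates table, the sketch below agrees with the PRED column of the output data set up to rounding of the printed estimates:

```python
import math

gamma_hat, b0, b1 = 4.7128, 3.3091, -0.1933   # fitted values from the table above

def relief_prob(minutes, group):
    """Estimated P(relief by t minutes) = 1 - G(t) for the given group."""
    linp = b0 - b1 * (group - 2)
    alpha = math.exp(-linp)
    return 1.0 - math.exp(-(alpha * minutes) ** gamma_hat)

print(round(relief_prob(11, 1), 5))   # patient 1 (group 1, 11 minutes)
print(round(relief_prob(14, 2), 5))   # patient 20 (group 2, 14 minutes)
```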
These probabilities, calculated at the observed times, are shown for the two groups in Output 51.5.2.
 Obs  group  censor  patient  minutes     Pred
   1      1       0        1       11  0.03336
   2      1       0        2       12  0.04985
   3      1       0        3       19  0.35975
   4      1       0        4       19  0.35975
   5      1       0        5       19  0.35975
   6      1       0        6       19  0.35975
   7      1       0        7       21  0.51063
   8      1       0        8       20  0.43325
   9      1       0        9       21  0.51063
  10      1       0       10       21  0.51063
  11      1       0       11       20  0.43325
  12      1       0       12       21  0.51063
  13      1       0       13       20  0.43325
  14      1       0       14       21  0.51063
  15      1       0       15       25  0.80315
  16      1       0       16       27  0.90328
  17      1       0       17       30  0.97846
  18      1       1       18       21  0.51063
  19      1       1       19       24  0.73838
  20      2       0       20       14  0.04163
  21      2       0       21       16  0.07667
  22      2       0       22       16  0.07667
  23      2       0       23       21  0.24976
  24      2       0       24       21  0.24976
  25      2       0       25       23  0.35674
  26      2       0       26       23  0.35674
  27      2       0       27       23  0.35674
  28      2       0       28       23  0.35674
  29      2       1       29       25  0.47982
  30      2       0       30       23  0.35674
  31      2       0       31       24  0.41678
  32      2       0       32       24  0.41678
  33      2       1       33       26  0.54446
  34      2       1       34       32  0.87656
  35      2       1       35       30  0.78633
  36      2       0       36       30  0.78633
  37      2       1       37       32  0.87656
  38      2       1       38       20  0.20414
Since the slope estimate is negative with a p-value of 0.0185, you can infer that pain reliever 1 leads to significantly faster relief overall, but the estimated probabilities give no information about patient-to-patient variation within and between groups. For example, while pain reliever 1 provides faster relief overall, some patients in group 2 may respond more quickly than some patients in group 1. A frailty model enables you to accommodate and estimate patient-to-patient variation in health status by introducing random effects into a subject's hazard function.
The following statements model the hazard for patient i in terms of alpha_i = exp{-(b0 - b1(group_i - 2) + z_i)}, where z_i is a (normal) random patient effect. Notice that the only difference from the previous NLMIXED statements is the RANDOM statement and the addition of z in the linear predictor. The empirical Bayes estimates of the random effect (RANDOM statement), the parameter estimates (ODS OUTPUT statement), and the estimated cumulative distribution function (PREDICT statement) are saved so that the patient-specific distribution functions can be graphed subsequently.
ods output ParameterEstimates=est;
proc nlmixed data=headache;
   bounds gamma > 0;
   linp  = b0 - b1*(group-2) + z;
   alpha = exp(-linp);
   G_t   = exp(-(alpha*minutes)**gamma);
   g     = gamma*alpha*((alpha*minutes)**(gamma-1))*G_t;
   ll    = (censor=0)*log(g) + (censor=1)*log(G_t);
   model minutes ~ general(ll);
   random z ~ normal(0,exp(2*logsig)) subject=patient out=EB;
   predict 1-G_t out=cdf;
run;
proc print data=eb;
   var patient effect estimate stderrpred;
run;
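The general(ll) construction codes the Weibull likelihood by hand: the log density g for observed relief times and the log survivor function G_t for censored ones. A sketch of that logic (Python, for illustration only; the parameter values are taken from the frailty-model estimates reported below) with a numerical check that g is indeed minus the derivative of the survivor function:

```python
import math

def weibull_ll(t, censor, linp, gamma):
    """Log-likelihood contribution mirroring the NLMIXED statements:
    log density g(t) for an observed time, log survivor G(t) if censored."""
    alpha = math.exp(-linp)
    G = math.exp(-(alpha * t) ** gamma)                  # survivor function G_t
    g = gamma * alpha * (alpha * t) ** (gamma - 1) * G   # Weibull density
    return math.log(G) if censor == 1 else math.log(g)

# sanity check: the density must equal minus the derivative of the survivor
gamma, linp, t, h = 6.2867, 3.2786, 20.0, 1e-5
G = lambda u: math.exp(-(math.exp(-linp) * u) ** gamma)
num_density = -(G(t + h) - G(t - h)) / (2 * h)
assert abs(math.exp(weibull_ll(t, 0, linp, gamma)) - num_density) < 1e-6
```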
The NLMIXED Procedure

Specifications
Data Set                              WORK.HEADACHE
Dependent Variable                    minutes
Distribution for Dependent Variable   General
Random Effects                        z
Distribution for Random Effects       Normal
Subject Variable                      patient
Optimization Technique                Dual Quasi-Newton
Integration Method                    Adaptive Gaussian Quadrature
The Specifications table shows that the objective function is computed by adaptive Gaussian quadrature. The Dimensions table reports that nine quadrature points are being used to integrate over the random effects.
The NLMIXED Procedure

Dimensions
Observations Used       38
Observations Not Used    0
Total Observations      38
Subjects                38
Max Obs Per Subject      1
Parameters               4
Quadrature Points        9
The NLMIXED Procedure

Parameters
gamma    b0    b1    logsig    NegLogLike
    1     1     1         1     170.94366
The NLMIXED Procedure

Iteration History
Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
   1      5  142.121411    28.82225  12.14484  -88.8664
   2      7  136.440369    5.681042  25.93096  -65.7217
   3      9  122.972041    13.46833  46.56546  -146.887
   4     11  120.904825    2.067216  23.77936  -94.2862
   5     13  109.224144    11.68068  57.65493  -92.4075
   6     15  105.064733    4.159411  4.824649  -19.5879
   7     16  101.902207    3.162526   14.1287  -6.33767
   8     18  99.6907395    2.211468  7.676822  -3.42364
   9     20  99.3654033    0.325336  5.689204  -0.93978
  10     22  99.2602178    0.105185  0.317643  -0.23408
  11     24   99.254434    0.005784   1.17351  -0.00556
  12     25  99.2456973    0.008737  0.247412  -0.00871
  13     27  99.2445445    0.001153  0.104942  -0.00218
  14     29  99.2444958    0.000049  0.005646   -0.0001
  15     31  99.2444957    9.147E-8  0.000271  -1.84E-7

NOTE: GCONV convergence criterion satisfied.
The procedure converges after 15 iterations. The achieved -2 log likelihood is only 1.2 less than that in the model without random effects. Compared to a chi-square distribution with one degree of freedom, the addition of the random effect appears not to improve the model significantly. Care must be exercised, however, in the interpretation of likelihood ratio tests when the value under the null hypothesis falls on the boundary of the parameter space (refer to, for example, Self and Liang 1987).
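Because the null value of the variance lies on the boundary, the Self and Liang (1987) result suggests referring the likelihood ratio statistic to a 50:50 mixture of chi-square(0) and chi-square(1) rather than to chi-square(1). A sketch of both calculations (Python; the difference of 1.2 is taken from the text above):

```python
import math

def chi2_1_sf(x):
    """Survival function of a chi-square(1) variable: P(X > x) = erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))

lr = 1.2                          # -2 log-likelihood difference reported above
p_naive = chi2_1_sf(lr)           # ~0.273, treating the statistic as chi-square(1)
p_mixture = 0.5 * chi2_1_sf(lr)   # ~0.137, using the boundary-adjusted mixture
```

Either way, the random effect does not significantly improve the fit here, but the naive reference distribution overstates the p-value.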
The NLMIXED Procedure

Fit Statistics
-2 Log Likelihood            198.5
AIC (smaller is better)      206.5
AICC (smaller is better)     207.7
BIC (smaller is better)      213.0
The NLMIXED Procedure

                              Parameter Estimates

                     Standard
Parameter  Estimate     Error  DF  t Value  Pr > |t|  Alpha    Lower     Upper  Gradient
gamma        6.2867    2.1334  37     2.95    0.0055   0.05   1.9641   10.6093   1.89E-7
b0           3.2786   0.06576  37    49.86    <.0001   0.05   3.1453    3.4118  0.000271
b1          -0.1761   0.08264  37    -2.13    0.0398   0.05  -0.3436  -0.00868  0.000111
logsig      -1.9027    0.5273  37    -3.61    0.0009   0.05  -2.9711   -0.8343  0.000027
The estimate of the Weibull parameter has changed drastically from the model without random effects. The variance of the patient random effect is exp{-2 x 1.9027} = 0.02225. The next listing shows the empirical Bayes estimates of the random effects. These are the adjustments made to the linear predictor in order to obtain a patient's survival distribution.
                                  StdErr
Obs  patient  Effect  Estimate      Pred
  1        1       z  -0.13597   0.23249
  2        2       z  -0.13323   0.22793
  3        3       z  -0.06294   0.13813
  4        4       z  -0.06294   0.13813
  5        5       z  -0.06294   0.13813
  6        6       z  -0.06294   0.13813
  7        7       z  -0.02568   0.11759
  8        8       z  -0.04499   0.12618
  9        9       z  -0.02568   0.11759
 10       10       z  -0.02568   0.11759
 11       11       z  -0.04499   0.12618
 12       12       z  -0.02568   0.11759
 13       13       z  -0.04499   0.12618
 14       14       z  -0.02568   0.11759
 15       15       z   0.05980   0.11618
 16       16       z   0.10458   0.12684
 17       17       z   0.17147   0.14550
 18       18       z   0.06471   0.13807
 19       19       z   0.11157   0.14604
 20       20       z  -0.13406   0.22899
 21       21       z  -0.12698   0.21667
 22       22       z  -0.12698   0.21667
 23       23       z  -0.08506   0.15701
 24       24       z  -0.08506   0.15701
 25       25       z  -0.05797   0.13294
 26       26       z  -0.05797   0.13294
 27       27       z  -0.05797   0.13294
 28       28       z  -0.05797   0.13294
 29       29       z   0.06420   0.13956
 30       30       z  -0.05797   0.13294
 31       31       z  -0.04266   0.12390
 32       32       z  -0.04266   0.12390
 33       33       z   0.07618   0.14132
 34       34       z   0.16292   0.16460
 35       35       z   0.13193   0.15528
 36       36       z   0.06327   0.12124
 37       37       z   0.16292   0.16460
 38       38       z   0.02074   0.14160
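Each empirical Bayes estimate shifts that patient's linear predictor, moving the whole relief-time distribution left (negative z, faster relief) or right (positive z). The following sketch (Python, for illustration; the parameter values come from the frailty-model tables, with logsig negative as implied by the variance calculation exp{-2 x 1.9027} in the text, and patient 1's estimate taken as negative because that patient obtained relief fastest) shows the variance computation and a patient-specific probability:

```python
import math

# Frailty-model estimates; signs follow the text's variance formula
gamma, b0, b1, logsig = 6.2867, 3.2786, -0.1761, -1.9027

var_z = math.exp(2 * logsig)  # patient-to-patient variance, ~0.0222

def patient_cdf(minutes, group, z):
    """Patient-specific P(relief by t) with EB adjustment z in the linear predictor."""
    linp = b0 - b1 * (group - 2) + z
    alpha = math.exp(-linp)
    return 1.0 - math.exp(-(alpha * minutes) ** gamma)

# patient 1 (group 1, z = -0.13597) versus the group-average patient (z = 0):
p1 = patient_cdf(21, 1, -0.13597)
p_avg = patient_cdf(21, 1, 0.0)   # p1 > p_avg: patient 1's curve sits to the left
```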
The predicted values and patient-specific survival distributions can be plotted with the SAS code that follows.
data cdf;
   set cdf;
   symbolid = int((patient-1)/19);
run;
proc transpose data=est(keep=estimate) out=trest;
run;
data trest;
   set trest;
   rename col1=gamma col2=b0 col3=b1;
run;
data pred;
   merge eb(keep=estimate) headache(keep=patient group);
   if _n_ = 1 then merge trest(keep=gamma b0 b1);
   do minutes=11 to 32;
      linp = b0 - b1*(group-2) + estimate;
      pred = 1-exp(-(exp(-linp)*minutes)**gamma);
      symbolid = patient+1;
      output;
   end;
   keep pred minutes symbolid;
run;
data pred;
   set cdf(keep=pred minutes symbolid) pred;
run;
axis1 label=(angle=90 rotate=0 'Estimated Patient-specific CDF') minor=none;
axis2 label=('Minutes to Headache Relief') minor=none order=(10 to 35 by 5);
symbol1 value=dot    c=black h=0.15in i=none r=1;
symbol2 value=circle c=black h=0.15in i=none r=1;
symbol3 value=none   c=black l=1 i=join r=19;
symbol4 value=none   c=black l=2 i=join r=19;
proc gplot data=pred;
   plot pred*minutes=symbolid / frame cframe=ligr nolegend
                                vaxis=axis1 haxis=axis2;
run;
The separation of the distribution functions by groups is evident in Output 51.5.4. Most of the distributions of patients in the first group are to the left of the distributions in the second group. The separation is not complete, however. Several patients assigned the second pain reliever experience headache relief more quickly than patients assigned to the first group.