A popular application of nonlinear mixed models is in the field of pharmacokinetics, which studies how a drug disperses through a living individual. This example considers the theophylline data from Pinheiro and Bates (1995). Serum concentrations of the drug theophylline are measured in 12 subjects over a 25-hour period after oral administration. The data are as follows.
data theoph;
   input subject time conc dose wt @@;
   datalines;
 1  0.00  0.74 4.02 79.6   1  0.25  2.84 4.02 79.6   1  0.57  6.57 4.02 79.6
 1  1.12 10.50 4.02 79.6   1  2.02  9.66 4.02 79.6   1  3.82  8.58 4.02 79.6
 1  5.10  8.36 4.02 79.6   1  7.03  7.47 4.02 79.6   1  9.05  6.89 4.02 79.6
 1 12.12  5.94 4.02 79.6   1 24.37  3.28 4.02 79.6
 2  0.00  0.00 4.40 72.4   2  0.27  1.72 4.40 72.4   2  0.52  7.91 4.40 72.4
 2  1.00  8.31 4.40 72.4   2  1.92  8.33 4.40 72.4   2  3.50  6.85 4.40 72.4
 2  5.02  6.08 4.40 72.4   2  7.03  5.40 4.40 72.4   2  9.00  4.55 4.40 72.4
 2 12.00  3.01 4.40 72.4   2 24.30  0.90 4.40 72.4
 3  0.00  0.00 4.53 70.5   3  0.27  4.40 4.53 70.5   3  0.58  6.90 4.53 70.5
 3  1.02  8.20 4.53 70.5   3  2.02  7.80 4.53 70.5   3  3.62  7.50 4.53 70.5
 3  5.08  6.20 4.53 70.5   3  7.07  5.30 4.53 70.5   3  9.00  4.90 4.53 70.5
 3 12.15  3.70 4.53 70.5   3 24.17  1.05 4.53 70.5
 4  0.00  0.00 4.40 72.7   4  0.35  1.89 4.40 72.7   4  0.60  4.60 4.40 72.7
 4  1.07  8.60 4.40 72.7   4  2.13  8.38 4.40 72.7   4  3.50  7.54 4.40 72.7
 4  5.02  6.88 4.40 72.7   4  7.02  5.78 4.40 72.7   4  9.02  5.33 4.40 72.7
 4 11.98  4.19 4.40 72.7   4 24.65  1.15 4.40 72.7
 5  0.00  0.00 5.86 54.6   5  0.30  2.02 5.86 54.6   5  0.52  5.63 5.86 54.6
 5  1.00 11.40 5.86 54.6   5  2.02  9.33 5.86 54.6   5  3.50  8.74 5.86 54.6
 5  5.02  7.56 5.86 54.6   5  7.02  7.09 5.86 54.6   5  9.10  5.90 5.86 54.6
 5 12.00  4.37 5.86 54.6   5 24.35  1.57 5.86 54.6
 6  0.00  0.00 4.00 80.0   6  0.27  1.29 4.00 80.0   6  0.58  3.08 4.00 80.0
 6  1.15  6.44 4.00 80.0   6  2.03  6.32 4.00 80.0   6  3.57  5.53 4.00 80.0
 6  5.00  4.94 4.00 80.0   6  7.00  4.02 4.00 80.0   6  9.22  3.46 4.00 80.0
 6 12.10  2.78 4.00 80.0   6 23.85  0.92 4.00 80.0
 7  0.00  0.15 4.95 64.6   7  0.25  0.85 4.95 64.6   7  0.50  2.35 4.95 64.6
 7  1.02  5.02 4.95 64.6   7  2.02  6.58 4.95 64.6   7  3.48  7.09 4.95 64.6
 7  5.00  6.66 4.95 64.6   7  6.98  5.25 4.95 64.6   7  9.00  4.39 4.95 64.6
 7 12.05  3.53 4.95 64.6   7 24.22  1.15 4.95 64.6
 8  0.00  0.00 4.53 70.5   8  0.25  3.05 4.53 70.5   8  0.52  3.05 4.53 70.5
 8  0.98  7.31 4.53 70.5   8  2.02  7.56 4.53 70.5   8  3.53  6.59 4.53 70.5
 8  5.05  5.88 4.53 70.5   8  7.15  4.73 4.53 70.5   8  9.07  4.57 4.53 70.5
 8 12.10  3.00 4.53 70.5   8 24.12  1.25 4.53 70.5
 9  0.00  0.00 3.10 86.4   9  0.30  7.37 3.10 86.4   9  0.63  9.03 3.10 86.4
 9  1.05  7.14 3.10 86.4   9  2.02  6.33 3.10 86.4   9  3.53  5.66 3.10 86.4
 9  5.02  5.67 3.10 86.4   9  7.17  4.24 3.10 86.4   9  8.80  4.11 3.10 86.4
 9 11.60  3.16 3.10 86.4   9 24.43  1.12 3.10 86.4
10  0.00  0.24 5.50 58.2  10  0.37  2.89 5.50 58.2  10  0.77  5.22 5.50 58.2
10  1.02  6.41 5.50 58.2  10  2.05  7.83 5.50 58.2  10  3.55 10.21 5.50 58.2
10  5.05  9.18 5.50 58.2  10  7.08  8.02 5.50 58.2  10  9.38  7.14 5.50 58.2
10 12.10  5.68 5.50 58.2  10 23.70  2.42 5.50 58.2
11  0.00  0.00 4.92 65.0  11  0.25  4.86 4.92 65.0  11  0.50  7.24 4.92 65.0
11  0.98  8.00 4.92 65.0  11  1.98  6.81 4.92 65.0  11  3.60  5.87 4.92 65.0
11  5.02  5.22 4.92 65.0  11  7.03  4.45 4.92 65.0  11  9.03  3.62 4.92 65.0
11 12.12  2.69 4.92 65.0  11 24.08  0.86 4.92 65.0
12  0.00  0.00 5.30 60.5  12  0.25  1.25 5.30 60.5  12  0.50  3.96 5.30 60.5
12  1.00  7.82 5.30 60.5  12  2.00  9.72 5.30 60.5  12  3.52  9.75 5.30 60.5
12  5.07  8.57 5.30 60.5  12  7.07  6.59 5.30 60.5  12  9.03  6.11 5.30 60.5
12 12.05  4.57 5.30 60.5  12 24.15  1.17 5.30 60.5
;
run;
Pinheiro and Bates (1995) consider the following first-order compartment model for these data:
   C_it = [D k_e_i k_a_i / Cl_i (k_a_i - k_e_i)] [exp(-k_e_i t) - exp(-k_a_i t)] + e_it

where C_it is the observed concentration of the ith subject at time t, D is the dose of theophylline, k_e_i is the elimination rate constant for subject i, k_a_i is the absorption rate constant for subject i, Cl_i is the clearance for subject i, and the e_it are normal errors. To allow for random variability between subjects, they assume

   Cl_i  = exp(β1 + b_i1)
   k_a_i = exp(β2 + b_i2)
   k_e_i = exp(β3)

where the βs denote fixed-effects parameters and the b_i's denote random-effects parameters with an unknown covariance matrix.
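As a quick outside-SAS check, the one-compartment mean function can be evaluated directly; the sketch below mirrors the expression assigned to PRED in the SAS program, with parameter values taken from the PARMS starting values (the random effects are set to zero here):

```python
import math

def conc_mean(time, dose, beta1, beta2, beta3, b1=0.0, b2=0.0):
    """First-order one-compartment mean concentration (same form as PRED)."""
    cl = math.exp(beta1 + b1)   # clearance
    ka = math.exp(beta2 + b2)   # absorption rate constant
    ke = math.exp(beta3)        # elimination rate constant
    return dose * ke * ka * (math.exp(-ke * time) - math.exp(-ka * time)) / cl / (ka - ke)

# starting values from the PARMS statement; dose for subject 1
print(round(conc_mean(time=2.0, dose=4.02, beta1=-3.22, beta2=0.47, beta3=-2.45), 3))
```

At t = 0 the two exponentials cancel, so the mean concentration is zero, as the data for most subjects show.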
The PROC NLMIXED statements to fit this model are as follows.
proc nlmixed data=theoph;
   parms beta1=-3.22 beta2=0.47 beta3=-2.45
         s2b1=0.03 cb12=0 s2b2=0.4 s2=0.5;
   cl = exp(beta1 + b1);
   ka = exp(beta2 + b2);
   ke = exp(beta3);
   pred = dose*ke*ka*(exp(-ke*time)-exp(-ka*time))/cl/(ka-ke);
   model conc ~ normal(pred,s2);
   random b1 b2 ~ normal([0,0],[s2b1,cb12,s2b2]) subject=subject;
run;
The PARMS statement specifies starting values for the three βs and the four variance-covariance parameters. The clearance and rate constants are defined using SAS programming statements, and the conditional model for the data is defined to be normal with mean PRED and variance S2. The two random effects are B1 and B2, and their joint distribution is defined in the RANDOM statement. Brackets are used in defining their mean vector (two zeros) and the lower triangle of their variance-covariance matrix (a general 2×2 matrix). The SUBJECT= variable is SUBJECT.
The results from this analysis are as follows.
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.THEOPH
 Dependent Variable                    conc
 Distribution for Dependent Variable   Normal
 Random Effects                        b1 b2
 Distribution for Random Effects       Normal
 Subject Variable                      subject
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    Adaptive Gaussian Quadrature
The Specifications table summarizes the model setup.
The NLMIXED Procedure

            Dimensions
 Observations Used            132
 Observations Not Used          0
 Total Observations           132
 Subjects                      12
 Max Obs Per Subject           11
 Parameters                     7
 Quadrature Points              5
The Dimensions table indicates that there are 132 observations, 12 subjects, and 7 parameters. PROC NLMIXED selects 5 quadrature points for each random effect, producing a total grid of 25 points over which quadrature is performed.
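The idea behind the quadrature can be sketched outside SAS with a plain (non-adaptive) 5-point Gauss-Hermite rule; the nodes and weights below are the standard tabulated values, and the check uses the known lognormal mean E[exp(b)] = exp(σ²/2) for b ~ N(0, σ²). With two random effects, the rule is applied on the 5 × 5 tensor-product grid, giving the 25 points noted above:

```python
import math

# 5-point Gauss-Hermite nodes and weights (physicists' convention)
nodes = [-2.020182870456086, -0.958572464613819, 0.0,
         0.958572464613819, 2.020182870456086]
weights = [0.019953242059046, 0.393619323152241, 0.945308720482942,
           0.393619323152241, 0.019953242059046]

def gh_expectation(f, sigma2):
    """Approximate E[f(b)] for b ~ N(0, sigma2) by 5-point Gauss-Hermite quadrature."""
    s = math.sqrt(2.0 * sigma2)
    return sum(w * f(s * x) for w, x in zip(weights, nodes)) / math.sqrt(math.pi)

# compare against the exact value E[exp(b)] = exp(sigma2/2)
approx = gh_expectation(math.exp, 0.4)
print(round(approx, 5), round(math.exp(0.2), 5))
```

PROC NLMIXED's adaptive version additionally recenters and rescales the grid around each subject's empirical Bayes mode, which is why few points per dimension suffice.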
The NLMIXED Procedure

                           Parameters
 beta1   beta2   beta3   s2b1   cb12   s2b2    s2    NegLogLike
 -3.22    0.47   -2.45   0.03      0    0.4   0.5    177.789945
The Parameters table lists the 7 parameters, their starting values, and the initial evaluation of the negative log likelihood using adaptive Gaussian quadrature.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      5  177.776248    0.013697  2.873367  -63.0744
    2      8  177.7643      0.011948  1.698144  -4.75239
    3     10  177.757264    0.007036  1.297439  -1.97311
    4     12  177.755688    0.001576  1.441408  -0.49772
    5     14  177.7467      0.008988  1.132279  -0.8223
    6     17  177.746401    0.000299  0.831293  -0.00244
    7     19  177.746318    0.000083  0.724198  -0.00789
    8     21  177.74574     0.000578  0.180018  -0.00583
    9     23  177.745736    3.88E-6   0.017958  -8.25E-6
   10     25  177.745736    3.222E-8  0.000143  -6.51E-8

NOTE: GCONV convergence criterion satisfied.
The Iterations table indicates that 10 steps are required for the dual quasi-Newton algorithm to achieve convergence.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood            355.5
 AIC (smaller is better)      369.5
 AICC (smaller is better)     370.4
 BIC (smaller is better)      372.9
The Fitting Information table lists the final optimized value of the -2 log likelihood and the information criteria AIC, AICC, and BIC.
The NLMIXED Procedure

                                Parameter Estimates
                  Standard
 Parameter  Estimate    Error  DF  t Value  Pr > |t|  Alpha     Lower     Upper  Gradient
 beta1       -3.2268  0.05950  10   -54.23    <.0001   0.05   -3.3594   -3.0942   0.00009
 beta2        0.4806   0.1989  10     2.42    0.0363   0.05   0.03745    0.9238  3.645E-7
 beta3       -2.4592  0.05126  10   -47.97    <.0001   0.05   -2.5734   -2.3449  0.000039
 s2b1        0.02803  0.01221  10     2.30    0.0445   0.05  0.000833   0.05523   0.00014
 cb12       -0.00127  0.03404  10    -0.04    0.9710   0.05  -0.07712   0.07458   0.00007
 s2b2         0.4331   0.2005  10     2.16    0.0560   0.05   0.01353    0.8798   6.98E-6
 s2           0.5016  0.06837  10     7.34    <.0001   0.05    0.3493    0.6540  6.133E-6
The Parameter Estimates table contains the maximum likelihood estimates of the parameters. Both S2B1 and S2B2 are marginally significant, indicating between-subject variability in the clearances and absorption rate constants, respectively. There does not appear to be a significant covariance between them, as seen by the estimate of CB12.
The estimates of β1, β2, and β3 are close to the adaptive quadrature estimates listed in Table 3 of Pinheiro and Bates (1995). However, Pinheiro and Bates use a Cholesky-root parameterization for the random-effects variance matrix and a logarithmic parameterization for the residual variance. The PROC NLMIXED statements using their parameterization are as follows, and the results are similar.
proc nlmixed data=theoph;
   parms ll1=-1.5 l2=0 ll3=-0.1 beta1=-3 beta2=0.5 beta3=-2.5 ls2=-0.7;
   s2 = exp(ls2);
   l1 = exp(ll1);
   l3 = exp(ll3);
   s2b1 = l1*l1*s2;
   cb12 = l2*l1*s2;
   s2b2 = (l2*l2 + l3*l3)*s2;
   cl = exp(beta1 + b1);
   ka = exp(beta2 + b2);
   ke = exp(beta3);
   pred = dose*ke*ka*(exp(-ke*time)-exp(-ka*time))/cl/(ka-ke);
   model conc ~ normal(pred,s2);
   random b1 b2 ~ normal([0,0],[s2b1,cb12,s2b2]) subject=subject;
run;
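The mapping between the two parameterizations can be checked numerically. Starting from the covariance-scale estimates above (s2b1 = 0.02803, cb12 = -0.00127, s2b2 = 0.4331, s2 = 0.5016), the implied Cholesky-scale values are recovered by inverting the three defining equations; the variable names below mirror the SAS program:

```python
import math

# covariance-scale estimates from the first parameterization
s2b1, cb12, s2b2, s2 = 0.02803, -0.00127, 0.4331, 0.5016

# invert the mapping  s2b1 = l1^2*s2,  cb12 = l2*l1*s2,  s2b2 = (l2^2 + l3^2)*s2
l1 = math.sqrt(s2b1 / s2)
l2 = cb12 / (l1 * s2)
l3 = math.sqrt(s2b2 / s2 - l2**2)

# round trip: rebuilding the covariance parameters recovers the inputs exactly
assert abs(l1 * l1 * s2 - s2b1) < 1e-12
assert abs(l2 * l1 * s2 - cb12) < 1e-12
assert abs((l2 * l2 + l3 * l3) * s2 - s2b2) < 1e-12
print(round(l1, 4), round(l2, 4), round(l3, 4))
```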
For this example, consider the data from Weil (1970), also studied by Williams (1975), Ochi and Prentice (1984), and McCulloch (1994). In this experiment 16 pregnant rats receive a control diet and 16 receive a chemically treated diet, and the litter size for each rat is recorded after 4 and 21 days. The SAS data set is as follows.
data rats;
   input trt$ m x @@;
   if (trt='c') then do;
      x1 = 1;
      x2 = 0;
   end;
   else do;
      x1 = 0;
      x2 = 1;
   end;
   litter = _n_;
   datalines;
c 13 13   c 12 12   c  9  9   c  9  9
c  8  8   c  8  8   c 13 12   c 12 11
c 10  9   c 10  9   c  9  8   c 13 11
c  5  4   c  7  5   c 10  7   c 10  7
t 12 12   t 11 11   t 10 10   t  9  9
t 11 10   t 10  9   t 10  9   t  9  8
t  9  8   t  5  4   t  9  7   t  7  4
t 10  5   t  6  3   t 10  3   t  7  0
;
run;
Here, M represents the size of the litter after 4 days, and X represents the size of the litter after 21 days. Also, indicator variables X1 and X2 are constructed for the two treatment levels.
Following McCulloch (1994), assume a latent survival model of the form

   y_ijk = t_i + α_ij + e_ijk

where i indexes treatment, j indexes litter, and k indexes newborn rats within a litter. The t_i represent treatment means, the α_ij represent random litter effects assumed to be iid N(0, s_i²), and the e_ijk represent iid N(0, 1) residual errors, all on the latent scale.
Instead of observing the survival times y_ijk, assume that only the binary variable indicating whether y_ijk exceeds 0 is observed. If x_ij denotes the sum of these binary variables for the ith treatment and the jth litter, then the preceding assumptions lead to the following generalized linear mixed model:

   x_ij | α_ij ~ Binomial(m_ij, p_ij)

where m_ij is the size of each litter after 4 days and

   p_ij = Φ(t_i + α_ij)
The PROC NLMIXED statements to fit this model are as follows.
proc nlmixed data=rats;
   parms t1=1 t2=1 s1=.05 s2=1;
   eta = x1*t1 + x2*t2 + alpha;
   p = probnorm(eta);
   model x ~ binomial(m,p);
   random alpha ~ normal(0,x1*s1*s1+x2*s2*s2) subject=litter;
   estimate 'gamma2' t2/sqrt(1+s2*s2);
   predict p out=p;
run;
As in the previous example, the PROC NLMIXED statement invokes the procedure and the PARMS statement defines the parameters. The parameters for this example are the two treatment means, T1 and T2, and the two random-effect standard deviations, S1 and S2.
The indicator variables X1 and X2 are used in the program to assign the proper mean to each observation in the input data set as well as the proper variance to the random effects. Note that programming expressions are permitted inside the distributional specifications, as illustrated by the random-effects variance specified here.
The ESTIMATE statement requests an estimate of γ2 = t2/sqrt(1 + s2²), which is a location-scale parameter from Ochi and Prentice (1984).
The PREDICT statement constructs predictions for each observation in the input data set. For this example, predictions of P and approximate standard errors of prediction are output to a SAS data set named P. These predictions are functions of the parameter estimates and the empirical Bayes estimates of the random effects α_ij.
The output for this model is as follows.
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.RATS
 Dependent Variable                    x
 Distribution for Dependent Variable   Binomial
 Random Effects                        alpha
 Distribution for Random Effects       Normal
 Subject Variable                      litter
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    Adaptive Gaussian Quadrature
The Specifications table provides basic information about this nonlinear mixed model.
The NLMIXED Procedure

            Dimensions
 Observations Used             32
 Observations Not Used          0
 Total Observations            32
 Subjects                      32
 Max Obs Per Subject            1
 Parameters                     4
 Quadrature Points              7
The Dimensions table provides counts of the observations, subjects, parameters, and quadrature points.
The NLMIXED Procedure

              Parameters
 t1   t2     s1   s2    NegLogLike
  1    1   0.05    1    54.9362323
The Parameters table lists the starting point of the optimization.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      2  53.9933934    0.942839  11.03261  -81.9428
    2      3  52.875353     1.11804   2.148952  -2.86277
    3      5  52.6350386    0.240314  0.329957  -1.05049
    4      6  52.6319939    0.003045  0.122926  -0.00672
    5      8  52.6313583    0.000636  0.028246  -0.00352
    6     11  52.6313174    0.000041  0.013551  -0.00023
    7     13  52.6313115    5.839E-6  0.000603  -0.00001
    8     15  52.6313115    9.45E-9   0.000022  -1.68E-8

NOTE: GCONV convergence criterion satisfied.
The Iterations table indicates successful convergence in 8 iterations.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood            105.3
 AIC (smaller is better)      113.3
 AICC (smaller is better)     114.7
 BIC (smaller is better)      119.1
The Fitting Information table lists some useful statistics based on the maximized value of the log likelihood.
The NLMIXED Procedure

                               Parameter Estimates
                  Standard
 Parameter  Estimate    Error  DF  t Value  Pr > |t|  Alpha    Lower    Upper  Gradient
 t1           1.3063   0.1685  31     7.75    <.0001   0.05   0.9626   1.6499   0.00002
 t2           0.9475   0.3055  31     3.10    0.0041   0.05   0.3244   1.5705  9.283E-6
 s1           0.2403   0.3015  31     0.80    0.4315   0.05  -0.3746   0.8552  0.000014
 s2           1.0292   0.2988  31     3.44    0.0017   0.05   0.4198   1.6385   3.16E-6
The Parameter Estimates table indicates significance of all of the parameters except S1.
The NLMIXED Procedure

                          Additional Estimates
             Standard
 Label   Estimate   Error  DF  t Value  Pr > |t|  Alpha   Lower   Upper
 gamma2    0.6603  0.2165  31     3.05    0.0047   0.05  0.2186  1.1019
The Additional Estimates table displays results from the ESTIMATE statement. The estimate of γ2 equals 0.66, agreeing with that obtained by McCulloch (1994). The standard error 0.22 is computed using the delta method (Billingsley 1986; Cox 1998).
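The point estimate can be reproduced by hand from the Parameter Estimates table (t2 = 0.9475, s2 = 1.0292); the sketch below evaluates the same expression as the ESTIMATE statement:

```python
import math

t2, s2 = 0.9475, 1.0292                 # estimates from the Parameter Estimates table
gamma2 = t2 / math.sqrt(1 + s2 * s2)    # same expression as ESTIMATE 'gamma2'
print(round(gamma2, 4))
```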
Not shown is the P data set, which contains the original 32 observations and predictions of the p_ij.
The data for this example are from Ezzet and Whitehead (1991), who describe a crossover experiment on two groups of patients using two different inhaler devices (A and B). Patients from group 1 used device A for one week and then device B for another week. Patients from group 2 used the devices in reverse order. The data entered as a SAS data set are as follows.
data inhaler;
   input clarity group time freq @@;
   gt = group*time;
   sub = floor((_n_+1)/2);
   datalines;
1 0 0 59   1 0 1 59
1 0 0 35   2 0 1 35
1 0 0  3   3 0 1  3
1 0 0  2   4 0 1  2
2 0 0 11   1 0 1 11
2 0 0 27   2 0 1 27
2 0 0  2   3 0 1  2
2 0 0  1   4 0 1  1
4 0 0  1   1 0 1  1
4 0 0  1   2 0 1  1
1 1 0 63   1 1 1 63
1 1 0 13   2 1 1 13
2 1 0 40   1 1 1 40
2 1 0 15   2 1 1 15
3 1 0  7   1 1 1  7
3 1 0  2   2 1 1  2
3 1 0  1   3 1 1  1
4 1 0  2   1 1 1  2
4 1 0  1   3 1 1  1
;
run;
The response measurement, CLARITY, is the patient's assessment of the clarity of the leaflet instructions for the devices. The CLARITY variable is on an ordinal scale, with 1=easy, 2=only clear after rereading, 3=not very clear, and 4=confusing. The GROUP variable indicates the treatment group, and the TIME variable indicates the time of measurement. The FREQ variable indicates the number of patients with exactly the same responses. A variable GT is created to indicate a group-by-time interaction, and a variable SUB is created to identify patients.
As in the previous example and in Hedeker and Gibbons (1994), assume an underlying latent continuous variable, here with the form

   y_ij = β0 + β1 g_i + β2 t_j + β3 g_i t_j + u_i + e_ij

where i indexes patient and j indexes the time period, g_i indicates group, t_j indicates time, u_i is a patient-level normal random effect, and the e_ij are iid normal errors. The βs are unknown coefficients to be estimated.
Instead of observing y_ij, though, you observe only whether it falls in one of the four intervals: (-∞, 0), (0, I1), (I1, I1 + I2), or (I1 + I2, ∞), where I1 and I2 are both positive. The resulting category is the value assigned to the CLARITY variable.
The following code sets up and fits this ordinal probit model:
proc nlmixed data=inhaler corr ecorr;
   parms b0=0 b1=0 b2=0 b3=0 sd=1 i1=1 i2=1;
   bounds i1 > 0, i2 > 0;
   eta = b0 + b1*group + b2*time + b3*gt + u;
   if (clarity=1) then
      p = probnorm(-eta);
   else if (clarity=2) then
      p = probnorm(i1-eta) - probnorm(-eta);
   else if (clarity=3) then
      p = probnorm(i1+i2-eta) - probnorm(i1-eta);
   else
      p = 1 - probnorm(i1+i2-eta);
   if (p > 1e-8) then
      ll = log(p);
   else
      ll = -1e100;
   model clarity ~ general(ll);
   random u ~ normal(0,sd*sd) subject=sub;
   replicate freq;
   estimate 'thresh2' i1;
   estimate 'thresh3' i1 + i2;
   estimate 'icc' sd*sd/(1+sd*sd);
run;
The PROC statement specifies the input data set and requests correlations both for the parameter estimates (CORR option) and the additional estimates specified with ESTIMATE statements (ECORR option).
The parameters as defined in the PARMS statement are as follows. B0 (overall intercept), B1 (group main effect), B2 (time main effect), B3 (group by time interaction), SD (standard deviation of the random effect), I1 (increment between first and second thresholds), and I2 (increment between second and third thresholds). The BOUNDS statement restricts I1 and I2 to be positive.
The SAS programming statements begin by defining the linear predictor ETA, which is a linear combination of the B parameters and a single random effect U. The next statements define the ordinal likelihood according to the CLARITY variable, ETA, and the increment variables. An error trap is included in case the likelihood becomes too small.
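Because the four intervals partition the real line, the four category probabilities must sum to one for any value of the linear predictor. The sketch below reproduces the same likelihood pieces outside SAS (Φ written via the error function; the eta value and the i1, i2 values near the fitted thresholds are illustrative):

```python
import math

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probs(eta, i1, i2):
    """Interval probabilities for the four CLARITY categories."""
    return [
        Phi(-eta),                           # clarity = 1: y in (-inf, 0)
        Phi(i1 - eta) - Phi(-eta),           # clarity = 2: y in (0, I1)
        Phi(i1 + i2 - eta) - Phi(i1 - eta),  # clarity = 3: y in (I1, I1+I2)
        1.0 - Phi(i1 + i2 - eta),            # clarity = 4: y in (I1+I2, inf)
    ]

probs = category_probs(eta=-0.5, i1=1.7450, i2=0.5985)
print([round(p, 4) for p in probs])
assert abs(sum(probs) - 1.0) < 1e-12
```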
A general log likelihood specification is used in the MODEL statement, and the RANDOM statement defines the random effect U to have standard deviation SD and subject variable SUB. The REPLICATE statement indicates that data for each subject should be replicated according to the FREQ variable.
The ESTIMATE statements specify the second and third thresholds in terms of the increment variables (the first threshold is assumed to equal zero for model identifiability). Also computed is the intraclass correlation.
The output is as follows.
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.INHALER
 Dependent Variable                    clarity
 Distribution for Dependent Variable   General
 Random Effects                        u
 Distribution for Random Effects       Normal
 Subject Variable                      sub
 Replicate Variable                    freq
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    Adaptive Gaussian Quadrature
The Specifications table echoes some primary information specified for this nonlinear mixed model.
The NLMIXED Procedure

            Dimensions
 Observations Used             38
 Observations Not Used          0
 Total Observations            38
 Subjects                     286
 Max Obs Per Subject            2
 Parameters                     7
 Quadrature Points              5
The Dimensions table reveals a total of 286 subjects, which is the sum of the values of the FREQ variable. Five quadrature points are selected for log likelihood evaluation.
The NLMIXED Procedure

                   Parameters
 b0   b1   b2   b3   sd   i1   i2    NegLogLike
  0    0    0    0    1    1    1    538.484276
The Parameters table lists the simple starting values for this problem.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      2  476.382511    62.10176  43.75062   -1431.4
    2      4  463.228197    13.15431  14.24648  -106.753
    3      5  458.528118    4.70008   48.31316  -33.0389
    4      6  450.975735    7.552383  22.60098  -40.9954
    5      8  448.012701    2.963033  14.86877  -16.7453
    6     10  447.245153    0.767549  7.774189  -2.26743
    7     11  446.72767     0.517483  3.793533  -1.59278
    8     13  446.518273    0.209396  0.868638  -0.37801
    9     16  446.514528    0.003745  0.328568  -0.02356
   10     18  446.513341    0.001187  0.056778  -0.00183
   11     20  446.513314    0.000027  0.010785  -0.00004
   12     22  446.51331     3.956E-6  0.004922  -5.41E-6
   13     24  446.51331     1.989E-7  0.00047   -4E-7

NOTE: GCONV convergence criterion satisfied.
The Iterations table indicates successful convergence in 13 iterations.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood            893.0
 AIC (smaller is better)      907.0
 AICC (smaller is better)     910.8
 BIC (smaller is better)      932.6
The Fitting Information table lists the log likelihood and information criteria.
The NLMIXED Procedure

                                Parameter Estimates
                  Standard
 Parameter  Estimate   Error   DF  t Value  Pr > |t|  Alpha    Lower     Upper  Gradient
 b0          -0.6364  0.1342  285    -4.74    <.0001   0.05  -0.9006   -0.3722   0.00047
 b1           0.6007  0.1770  285     3.39    0.0008   0.05   0.2523    0.9491  0.000265
 b2           0.6015  0.1582  285     3.80    0.0002   0.05   0.2900    0.9129   0.00008
 b3          -1.4817  0.2385  285    -6.21    <.0001   0.05  -1.9512   -1.0122  0.000102
 sd           0.6599  0.1312  285     5.03    <.0001   0.05   0.4017    0.9181   0.00009
 i1           1.7450  0.1474  285    11.84    <.0001   0.05   1.4548    2.0352  0.000202
 i2           0.5985  0.1427  285     4.19    <.0001   0.05   0.3177    0.8794  0.000087
The Parameter Estimates table indicates significance of all of the parameters.
The NLMIXED Procedure

                           Additional Estimates
              Standard
 Label    Estimate    Error   DF  t Value  Pr > |t|  Alpha   Lower   Upper
 thresh2    1.7450   0.1474  285    11.84    <.0001   0.05  1.4548  2.0352
 thresh3    2.3435   0.2073  285    11.31    <.0001   0.05  1.9355  2.7515
 icc        0.3034  0.08402  285     3.61    0.0004   0.05  0.1380  0.4687
The Additional Estimates table displays results from the ESTIMATE statements.
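Two of the additional estimates can be reproduced directly from the Parameter Estimates table (sd = 0.6599, i1 = 1.7450, i2 = 0.5985); a quick sketch:

```python
sd, i1, i2 = 0.6599, 1.7450, 0.5985   # estimates from the Parameter Estimates table

thresh3 = i1 + i2                 # third threshold, as in ESTIMATE 'thresh3'
icc = sd * sd / (1 + sd * sd)     # latent-scale intraclass correlation, ESTIMATE 'icc'

print(round(thresh3, 4), round(icc, 4))
```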
This example uses the pump failure data of Gaver and O'Muircheartaigh (1987). The number of failures and the time of operation are recorded for 10 pumps. Each of the pumps is classified into one of two groups corresponding to either continuous or intermittent operation. The data are as follows.
data pump;
   input y t group;
   pump = _n_;
   logtstd = log(t) - 2.4564900;
   datalines;
 5  94.320 1
 1  15.720 2
 5  62.880 1
14 125.760 1
 3   5.240 2
19  31.440 1
 1   1.048 2
 1   1.048 2
 4   2.096 2
22  10.480 2
;
run;
Each row denotes data for a single pump, and the variable LOGTSTD contains the centered operation times.
Letting y_ij denote the number of failures for the jth pump in the ith group, Draper (1996) considers the following hierarchical model for these data:

   y_ij | λ_ij ~ Poisson(λ_ij)
   log λ_ij = α_i + β_i x_ij + e_ij
   e_ij ~ Normal(0, σ²)

where x_ij = log t_ij - 2.4564900 is the centered log operation time (the variable LOGTSTD).
The model specifies different intercepts and slopes for each group, and the random effect is a mechanism for accounting for overdispersion.
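To see why the lognormal random effect produces overdispersion, note that if log λ = η + e with e ~ N(0, σ²), then marginally E[y] = exp(η + σ²/2) and Var[y] = E[y] + (exp(σ²) - 1) E[y]², which always exceeds the Poisson variance E[y]. A sketch with illustrative values (the η and σ below are hypothetical, not fitted quantities):

```python
import math

eta, sigma = 1.0, 0.5   # hypothetical linear predictor and random-effect SD

mean = math.exp(eta + sigma**2 / 2)                  # marginal mean of y
var = mean + (math.exp(sigma**2) - 1) * mean**2      # marginal variance of y

print(round(mean, 4), round(var, 4))
assert var > mean   # overdispersion relative to a plain Poisson model
```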
The corresponding PROC NLMIXED statements are as follows.
proc nlmixed data=pump;
   parms logsig 0 beta1 1 beta2 1 alpha1 1 alpha2 1;
   if (group = 1) then
      eta = alpha1 + beta1*logtstd + e;
   else
      eta = alpha2 + beta2*logtstd + e;
   lambda = exp(eta);
   model y ~ poisson(lambda);
   random e ~ normal(0,exp(2*logsig)) subject=pump;
   estimate 'alpha1-alpha2' alpha1-alpha2;
   estimate 'beta1-beta2' beta1-beta2;
run;
The output is as follows.
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.PUMP
 Dependent Variable                    y
 Distribution for Dependent Variable   Poisson
 Random Effects                        e
 Distribution for Random Effects       Normal
 Subject Variable                      pump
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    Adaptive Gaussian Quadrature
The Specifications table displays some details for this Poisson-Normal model.
The NLMIXED Procedure

            Dimensions
 Observations Used             10
 Observations Not Used          0
 Total Observations            10
 Subjects                      10
 Max Obs Per Subject            1
 Parameters                     5
 Quadrature Points              5
The Dimensions table indicates that data for 10 pumps are used with one observation for each.
The NLMIXED Procedure

                       Parameters
 logsig   beta1   beta2   alpha1   alpha2    NegLogLike
      0       1       1        1        1    32.8614614
The Parameters table lists the simple starting values for this problem and the initial evaluation of the negative log likelihood.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      2  30.6986932    2.162768  5.107253   -91.602
    2      5  30.0255468    0.673146  2.761738  -11.0489
    3      7  29.726325     0.299222  2.990401  -2.36048
    4      9  28.7390263    0.987299  2.074431  -3.93678
    5     10  28.3161933    0.422833  0.612531  -0.63084
    6     12  28.09564      0.220553  0.462162  -0.52684
    7     14  28.0438024    0.051838  0.405047  -0.10018
    8     16  28.0357134    0.008089  0.135059  -0.01875
    9     18  28.033925     0.001788  0.026279  -0.00514
   10     20  28.0338744    0.000051  0.00402   -0.00012
   11     22  28.0338727    1.681E-6  0.002864  -5.09E-6
   12     24  28.0338724    3.199E-7  0.000147  -6.87E-7
   13     26  28.0338724    2.532E-9  0.000017  -5.75E-9

NOTE: GCONV convergence criterion satisfied.
The Iterations table indicates successful convergence in 13 iterations.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood             56.1
 AIC (smaller is better)       66.1
 AICC (smaller is better)      81.1
 BIC (smaller is better)       67.6
The Fitting Information table lists the final log likelihood and associated information criteria.
The NLMIXED Procedure

                               Parameter Estimates
                  Standard
 Parameter  Estimate   Error  DF  t Value  Pr > |t|  Alpha    Lower    Upper  Gradient
 logsig      -0.3161  0.3213   9    -0.98    0.3508   0.05  -1.0429   0.4107   0.00002
 beta1       -0.4256  0.7473   9    -0.57    0.5829   0.05  -2.1162   1.2649   0.00002
 beta2        0.6097  0.3814   9     1.60    0.1443   0.05  -0.2530   1.4724   1.61E-6
 alpha1       2.9644  1.3826   9     2.14    0.0606   0.05  -0.1632   6.0921   5.25E-6
 alpha2       1.7992  0.5492   9     3.28    0.0096   0.05   0.5568   3.0415   5.73E-6
The NLMIXED Procedure

                               Additional Estimates
                     Standard
 Label          Estimate    Error  DF  t Value  Pr > |t|  Alpha    Lower   Upper
 alpha1-alpha2    1.1653   1.4855   9     0.78    0.4529   0.05  -2.1952  4.5257
 beta1-beta2     -1.0354   0.8389   9    -1.23    0.2484   0.05  -2.9331  0.8623
The Parameter Estimates and Additional Estimates tables list the maximum likelihood estimates for each of the parameters and for the two differences. The point estimates for the mean parameters agree fairly closely with the Bayesian posterior means reported by Draper (1996); however, the likelihood-based standard errors are roughly half the Bayesian posterior standard deviations. This is most likely because the Bayesian standard deviations account for the uncertainty in estimating σ², whereas the likelihood analysis plugs in its estimated value. This downward bias can be corrected somewhat by using the t distribution with 9 degrees of freedom shown here.
In this example an accelerated failure time model with proportional hazard is fitted with and without random effects. The data are from the Getting Started example of PROC LIFEREG, see Chapter 39, The LIFEREG Procedure. Thirty-eight patients are divided into two groups of equal size, and different pain relievers are assigned to each group. The outcome reported is the time in minutes until headache relief. The variable censor indicates whether relief was observed during the course of the observation period ( censor = 0) or whether the observation is censored ( censor = 1).
data headache;
   input minutes group censor @@;
   patient = _n_;
   datalines;
11 1 0   12 1 0   19 1 0   19 1 0
19 1 0   19 1 0   21 1 0   20 1 0
21 1 0   21 1 0   20 1 0   21 1 0
20 1 0   21 1 0   25 1 0   27 1 0
30 1 0   21 1 1   24 1 1   14 2 0
16 2 0   16 2 0   21 2 0   21 2 0
23 2 0   23 2 0   23 2 0   23 2 0
25 2 1   23 2 0   24 2 0   24 2 0
26 2 1   32 2 1   30 2 1   30 2 0
32 2 1   20 2 1
;
In modeling survival data, censoring of observations must be taken into account carefully. In this example, only right censoring occurs. If g(t, β), h(t, β), and G(t, β) denote the density of failure, the hazard function, and the survival distribution function at time t, respectively, the log likelihood can be written as

   l(β) = Σ_{i ∈ U_u} log g(t_i, β) + Σ_{i ∈ U_c} log G(t_i, β)
Refer to Cox and Oakes (1984, ch. 3). In these expressions U u is the set of uncensored observations, U c is the set of censored observations, and n denotes the total sample size.
The proportional hazards specification expresses the hazard in terms of a baseline hazard, multiplied by a constant. In this example the hazard is that of a Weibull model and is parameterized as h(t, β) = γα(αt)^(γ-1) with α = exp{-x'β}.
The linear predictor is set equal to the intercept in the reference group (group = 2); this defines the baseline hazard. The corresponding distribution of survival past time t is G(t, β) = exp{-(αt)^γ}. Refer to Cox and Oakes (1984, Table 2.1) and the section "Supported Distributions" in Chapter 39, "The LIFEREG Procedure," for this and other survival distribution models and various parameterizations.
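The density, hazard, and survival function are tied together by g = h·G, which gives a quick numerical check of the parameterization: the density must agree with -dG/dt. The γ and α values below are illustrative only, not the fitted estimates:

```python
import math

gamma, alpha = 4.7, 0.044   # illustrative Weibull shape and alpha = exp(-x'b) scale

def G(t):
    """Survival function."""
    return math.exp(-(alpha * t) ** gamma)

def h(t):
    """Hazard function."""
    return gamma * alpha * (alpha * t) ** (gamma - 1)

def g(t):
    """Density of failure: g = h * G."""
    return h(t) * G(t)

t = 15.0
eps = 1e-6
fd = -(G(t + eps) - G(t - eps)) / (2 * eps)   # central finite difference of -dG/dt
print(round(g(t), 6), round(fd, 6))
```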
The following NLMIXED statements fit this accelerated failure time model and estimate the cumulative distribution function of time to headache relief.
proc nlmixed data=headache;
   bounds gamma > 0;
   linp = b0 - b1*(group-2);
   alpha = exp(-linp);
   G_t = exp(-(alpha*minutes)**gamma);
   g = gamma*alpha*((alpha*minutes)**(gamma-1))*G_t;
   ll = (censor=0)*log(g) + (censor=1)*log(G_t);
   model minutes ~ general(ll);
   predict 1-G_t out=cdf;
run;

proc print data=cdf;
   var group censor patient minutes pred;
run;
The NLMIXED Procedure

                 Specifications
 Data Set                              WORK.HEADACHE
 Dependent Variable                    minutes
 Distribution for Dependent Variable   General
 Optimization Technique                Dual Quasi-Newton
 Integration Method                    None
The Specifications table shows that no integration is required, since the model does not contain random effects.
The NLMIXED Procedure

            Dimensions
 Observations Used             38
 Observations Not Used          0
 Total Observations            38
 Parameters                     3
The NLMIXED Procedure

           Parameters
 gamma   b0   b1    NegLogLike
     1    1    1    263.990327
No starting values were given for the three parameters. The NLMIXED procedure assigns the default values in this case.
The NLMIXED Procedure

                      Iteration History
 Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
    1      2  169.244311    94.74602  22.5599   -2230.83
    2      4  142.873508    26.3708   14.88631  -3.64643
    3      6  140.633695    2.239814  11.25234  -9.49454
    4      8  122.890659    17.74304  19.44959  -2.50807
    5      9  121.396959    1.493699  13.85584  -4.55427
    6     11  120.623843    0.773116  13.67062  -1.38064
    7     12  119.278196    1.345647  15.78014  -1.69072
    8     14  116.271325    3.006871  26.94029  -3.2529
    9     16  109.427401    6.843925  19.88382  -6.9289
   10     19  103.298102    6.129298  12.15647  -4.96054
   11     22  101.686239    1.611863  14.24868  -4.34059
   12     23  100.027875    1.658364  11.69853  -13.2049
   13     26  99.9189048    0.108971  3.602552  -0.55176
   14     28  99.8738836    0.045021  0.170712  -0.16645
   15     30  99.8736392    0.000244  0.050822  -0.00041
   16     32  99.8736351    4.071E-6  0.000705  -6.9E-6
   17     34  99.8736351    6.1E-10   4.768E-6  -1.23E-9

NOTE: GCONV convergence criterion satisfied.
The NLMIXED Procedure

           Fit Statistics
 -2 Log Likelihood            199.7
 AIC (smaller is better)      205.7
 AICC (smaller is better)     206.5
 BIC (smaller is better)      210.7
After 17 iterations and 34 evaluations of the objective function, the procedure converges.
The NLMIXED Procedure

                               Parameter Estimates
                  Standard
 Parameter  Estimate    Error  DF  t Value  Pr > |t|  Alpha    Lower     Upper  Gradient
 gamma        4.7128   0.6742  38     6.99    <.0001   0.05   3.3479    6.0777  5.327E-8
 b0           3.3091  0.05885  38    56.23    <.0001   0.05   3.1900    3.4283   4.77E-6
 b1          -0.1933  0.07856  38    -2.46    0.0185   0.05  -0.3523  -0.03426   1.22E-6
The parameter estimates and their standard errors are identical to those obtained with the LIFEREG procedure and the statements
proc lifereg data=headache;
   class group;
   model minutes*censor(1) = group / dist=weibull;
run;
The t statistic and confidence limits are based on 38 degrees of freedom. The LIFEREG procedure computes z -intervals for the parameter estimates.
For the two groups you obtain

   α̂_1 = exp{-(3.3091 - 0.1933)} = 0.04434
   α̂_2 = exp{-3.3091} = 0.03655

The probabilities of headache relief by t minutes are estimated as

   1 - G(t) = 1 - exp{-(α̂_i t)^4.7128}
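These estimates can be reproduced outside SAS. Using the fitted values γ̂ = 4.7128, b̂0 = 3.3091, b̂1 = -0.1933 from the Parameter Estimates table, the sketch below agrees with the PRED column of the output data set up to rounding of the printed estimates:

```python
import math

gamma_hat, b0, b1 = 4.7128, 3.3091, -0.1933   # fitted values from the table above

def relief_prob(minutes, group):
    """Estimated P(relief by t minutes) = 1 - G(t) for the given group."""
    linp = b0 - b1 * (group - 2)
    alpha = math.exp(-linp)
    return 1.0 - math.exp(-(alpha * minutes) ** gamma_hat)

print(round(relief_prob(11, 1), 5))   # patient 1 (group 1, 11 minutes)
print(round(relief_prob(14, 2), 5))   # patient 20 (group 2, 14 minutes)
```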
These probabilities, calculated at the observed times, are shown for the two groups in Output 51.5.2.
 Obs  group  censor  patient  minutes     Pred
   1      1       0        1       11  0.03336
   2      1       0        2       12  0.04985
   3      1       0        3       19  0.35975
   4      1       0        4       19  0.35975
   5      1       0        5       19  0.35975
   6      1       0        6       19  0.35975
   7      1       0        7       21  0.51063
   8      1       0        8       20  0.43325
   9      1       0        9       21  0.51063
  10      1       0       10       21  0.51063
  11      1       0       11       20  0.43325
  12      1       0       12       21  0.51063
  13      1       0       13       20  0.43325
  14      1       0       14       21  0.51063
  15      1       0       15       25  0.80315
  16      1       0       16       27  0.90328
  17      1       0       17       30  0.97846
  18      1       1       18       21  0.51063
  19      1       1       19       24  0.73838
  20      2       0       20       14  0.04163
  21      2       0       21       16  0.07667
  22      2       0       22       16  0.07667
  23      2       0       23       21  0.24976
  24      2       0       24       21  0.24976
  25      2       0       25       23  0.35674
  26      2       0       26       23  0.35674
  27      2       0       27       23  0.35674
  28      2       0       28       23  0.35674
  29      2       1       29       25  0.47982
  30      2       0       30       23  0.35674
  31      2       0       31       24  0.41678
  32      2       0       32       24  0.41678
  33      2       1       33       26  0.54446
  34      2       1       34       32  0.87656
  35      2       1       35       30  0.78633
  36      2       0       36       30  0.78633
  37      2       1       37       32  0.87656
  38      2       1       38       20  0.20414
Since the slope estimate is negative with a p-value of 0.0185, you can infer that pain reliever 1 leads to significantly faster relief overall, but the estimated probabilities give no information about patient-to-patient variation within and between groups. For example, while pain reliever 1 provides faster relief overall, some patients in group 2 may respond more quickly than some patients in group 1. A frailty model enables you to accommodate and estimate patient-to-patient variation in health status by introducing random effects into a subject's hazard function.
The following statements model the hazard for patient i in terms of alpha_i = exp{-(b0 - b1(group_i - 2) + z_i)}, where z_i is a (normal) random patient effect. Notice that the only difference from the previous NLMIXED statements is the RANDOM statement and the addition of z in the linear predictor. The empirical Bayes estimates of the random effect (RANDOM statement), the parameter estimates (ODS OUTPUT statement), and the estimated cumulative distribution function (PREDICT statement) are saved so that the patient-specific distribution functions can be graphed subsequently.
ods output ParameterEstimates=est;
proc nlmixed data=headache;
   bounds gamma > 0;
   linp  = b0 - b1*(group-2) + z;
   alpha = exp(-linp);
   G_t   = exp(-(alpha*minutes)**gamma);
   g     = gamma*alpha*((alpha*minutes)**(gamma-1))*G_t;
   ll    = (censor=0)*log(g) + (censor=1)*log(G_t);
   model minutes ~ general(ll);
   random z ~ normal(0,exp(2*logsig)) subject=patient out=EB;
   predict 1-G_t out=cdf;
run;
proc print data=eb;
   var patient effect estimate stderrpred;
run;
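The general(ll) construction codes the Weibull likelihood by hand: the log density g for observed relief times and the log survivor function G_t for censored ones. A sketch of that logic (Python, for illustration only; the parameter values are taken from the frailty-model estimates reported below) with a numerical check that g is indeed minus the derivative of the survivor function:

```python
import math

def weibull_ll(t, censor, linp, gamma):
    """Log-likelihood contribution mirroring the NLMIXED statements:
    log density g(t) for an observed time, log survivor G(t) if censored."""
    alpha = math.exp(-linp)
    G = math.exp(-(alpha * t) ** gamma)                  # survivor function G_t
    g = gamma * alpha * (alpha * t) ** (gamma - 1) * G   # Weibull density
    return math.log(G) if censor == 1 else math.log(g)

# sanity check: the density must equal minus the derivative of the survivor
gamma, linp, t, h = 6.2867, 3.2786, 20.0, 1e-5
G = lambda u: math.exp(-(math.exp(-linp) * u) ** gamma)
num_density = -(G(t + h) - G(t - h)) / (2 * h)
assert abs(math.exp(weibull_ll(t, 0, linp, gamma)) - num_density) < 1e-6
```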
The NLMIXED Procedure

Specifications
Data Set                              WORK.HEADACHE
Dependent Variable                    minutes
Distribution for Dependent Variable   General
Random Effects                        z
Distribution for Random Effects       Normal
Subject Variable                      patient
Optimization Technique                Dual Quasi-Newton
Integration Method                    Adaptive Gaussian Quadrature
The Specifications table shows that the objective function is computed by adaptive Gaussian quadrature. The Dimensions table reports that nine quadrature points are being used to integrate over the random effects.
The NLMIXED Procedure

Dimensions
Observations Used       38
Observations Not Used    0
Total Observations      38
Subjects                38
Max Obs Per Subject      1
Parameters               4
Quadrature Points        9
The NLMIXED Procedure

Parameters
gamma    b0    b1    logsig    NegLogLike
    1     1     1         1     170.94366
The NLMIXED Procedure

Iteration History
Iter  Calls  NegLogLike        Diff   MaxGrad     Slope
   1      5  142.121411    28.82225  12.14484  -88.8664
   2      7  136.440369    5.681042  25.93096  -65.7217
   3      9  122.972041    13.46833  46.56546  -146.887
   4     11  120.904825    2.067216  23.77936  -94.2862
   5     13  109.224144    11.68068  57.65493  -92.4075
   6     15  105.064733    4.159411  4.824649  -19.5879
   7     16  101.902207    3.162526   14.1287  -6.33767
   8     18  99.6907395    2.211468  7.676822  -3.42364
   9     20  99.3654033    0.325336  5.689204  -0.93978
  10     22  99.2602178    0.105185  0.317643  -0.23408
  11     24   99.254434    0.005784   1.17351  -0.00556
  12     25  99.2456973    0.008737  0.247412  -0.00871
  13     27  99.2445445    0.001153  0.104942  -0.00218
  14     29  99.2444958    0.000049  0.005646   -0.0001
  15     31  99.2444957    9.147E-8  0.000271  -1.84E-7

NOTE: GCONV convergence criterion satisfied.
The procedure converges after 15 iterations. The achieved -2 log likelihood is only 1.2 less than that in the model without random effects. Compared to a chi-square distribution with one degree of freedom, the addition of the random effect appears not to improve the model significantly. Care must be exercised, however, in the interpretation of likelihood ratio tests when the value under the null hypothesis falls on the boundary of the parameter space (refer to, for example, Self and Liang 1987).
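Because the null value of the variance lies on the boundary, the Self and Liang (1987) result suggests referring the likelihood ratio statistic to a 50:50 mixture of chi-square(0) and chi-square(1) rather than to chi-square(1). A sketch of both calculations (Python; the difference of 1.2 is taken from the text above):

```python
import math

def chi2_1_sf(x):
    """Survival function of a chi-square(1) variable: P(X > x) = erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))

lr = 1.2                          # -2 log-likelihood difference reported above
p_naive = chi2_1_sf(lr)           # ~0.273, treating the statistic as chi-square(1)
p_mixture = 0.5 * chi2_1_sf(lr)   # ~0.137, using the boundary-adjusted mixture
```

Either way, the random effect does not significantly improve the fit here, but the naive reference distribution overstates the p-value.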
The NLMIXED Procedure

Fit Statistics
-2 Log Likelihood            198.5
AIC (smaller is better)      206.5
AICC (smaller is better)     207.7
BIC (smaller is better)      213.0
The NLMIXED Procedure

                              Parameter Estimates

                     Standard
Parameter  Estimate     Error  DF  t Value  Pr > |t|  Alpha    Lower     Upper  Gradient
gamma        6.2867    2.1334  37     2.95    0.0055   0.05   1.9641   10.6093   1.89E-7
b0           3.2786   0.06576  37    49.86    <.0001   0.05   3.1453    3.4118  0.000271
b1          -0.1761   0.08264  37    -2.13    0.0398   0.05  -0.3436  -0.00868  0.000111
logsig      -1.9027    0.5273  37    -3.61    0.0009   0.05  -2.9711   -0.8343  0.000027
The estimate of the Weibull parameter has changed drastically from the model without random effects. The variance of the patient random effect is exp{-2 x 1.9027} = 0.02225. The next listing shows the empirical Bayes estimates of the random effects. These are the adjustments made to the linear predictor in order to obtain a patient's survival distribution.
                                  StdErr
Obs  patient  Effect  Estimate      Pred
  1        1       z  -0.13597   0.23249
  2        2       z  -0.13323   0.22793
  3        3       z  -0.06294   0.13813
  4        4       z  -0.06294   0.13813
  5        5       z  -0.06294   0.13813
  6        6       z  -0.06294   0.13813
  7        7       z  -0.02568   0.11759
  8        8       z  -0.04499   0.12618
  9        9       z  -0.02568   0.11759
 10       10       z  -0.02568   0.11759
 11       11       z  -0.04499   0.12618
 12       12       z  -0.02568   0.11759
 13       13       z  -0.04499   0.12618
 14       14       z  -0.02568   0.11759
 15       15       z   0.05980   0.11618
 16       16       z   0.10458   0.12684
 17       17       z   0.17147   0.14550
 18       18       z   0.06471   0.13807
 19       19       z   0.11157   0.14604
 20       20       z  -0.13406   0.22899
 21       21       z  -0.12698   0.21667
 22       22       z  -0.12698   0.21667
 23       23       z  -0.08506   0.15701
 24       24       z  -0.08506   0.15701
 25       25       z  -0.05797   0.13294
 26       26       z  -0.05797   0.13294
 27       27       z  -0.05797   0.13294
 28       28       z  -0.05797   0.13294
 29       29       z   0.06420   0.13956
 30       30       z  -0.05797   0.13294
 31       31       z  -0.04266   0.12390
 32       32       z  -0.04266   0.12390
 33       33       z   0.07618   0.14132
 34       34       z   0.16292   0.16460
 35       35       z   0.13193   0.15528
 36       36       z   0.06327   0.12124
 37       37       z   0.16292   0.16460
 38       38       z   0.02074   0.14160
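Each empirical Bayes estimate shifts that patient's linear predictor, moving the whole relief-time distribution left (negative z, faster relief) or right (positive z). The following sketch (Python, for illustration; the parameter values come from the frailty-model tables, with logsig negative as implied by the variance calculation exp{-2 x 1.9027} in the text, and patient 1's estimate taken as negative because that patient obtained relief fastest) shows the variance computation and a patient-specific probability:

```python
import math

# Frailty-model estimates; signs follow the text's variance formula
gamma, b0, b1, logsig = 6.2867, 3.2786, -0.1761, -1.9027

var_z = math.exp(2 * logsig)  # patient-to-patient variance, ~0.0222

def patient_cdf(minutes, group, z):
    """Patient-specific P(relief by t) with EB adjustment z in the linear predictor."""
    linp = b0 - b1 * (group - 2) + z
    alpha = math.exp(-linp)
    return 1.0 - math.exp(-(alpha * minutes) ** gamma)

# patient 1 (group 1, z = -0.13597) versus the group-average patient (z = 0):
p1 = patient_cdf(21, 1, -0.13597)
p_avg = patient_cdf(21, 1, 0.0)   # p1 > p_avg: patient 1's curve sits to the left
```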
The predicted values and patient-specific survival distributions can be plotted with the SAS code that follows.
data cdf;
   set cdf;
   symbolid = int((patient-1)/19);
run;
proc transpose data=est(keep=estimate) out=trest;
run;
data trest;
   set trest;
   rename col1=gamma col2=b0 col3=b1;
run;
data pred;
   merge eb(keep=estimate) headache(keep=patient group);
   if _n_ = 1 then merge trest(keep=gamma b0 b1);
   do minutes=11 to 32;
      linp = b0 - b1*(group-2) + estimate;
      pred = 1-exp(-(exp(-linp)*minutes)**gamma);
      symbolid = patient+1;
      output;
   end;
   keep pred minutes symbolid;
run;
data pred;
   set cdf(keep=pred minutes symbolid) pred;
run;
axis1 label=(angle=90 rotate=0 'Estimated Patient-specific CDF') minor=none;
axis2 label=('Minutes to Headache Relief') minor=none order=(10 to 35 by 5);
symbol1 value=dot    c=black h=0.15in i=none r=1;
symbol2 value=circle c=black h=0.15in i=none r=1;
symbol3 value=none   c=black l=1 i=join r=19;
symbol4 value=none   c=black l=2 i=join r=19;
proc gplot data=pred;
   plot pred*minutes=symbolid / frame cframe=ligr nolegend
                                vaxis=axis1 haxis=axis2;
run;
The separation of the distribution functions by groups is evident in Output 51.5.4. Most of the distributions of patients in the first group are to the left of the distributions in the second group. The separation is not complete, however. Several patients assigned the second pain reliever experience headache relief more quickly than patients assigned to the first group.