Getting Started | SAS/STAT 9.1, Users Guide, Volume 3 (volume 3 ONLY)

The following examples demonstrate how you can use the LIFEREG procedure to fit a parametric model to failure time data.

Suppose you have a response variable y that represents failure time, a binary variable censor for which censor=0 indicates a censored value, and two linearly independent variables x1 and x2. The following statements perform a typical accelerated failure time model analysis. Higher-order effects such as interactions and nested effects are allowed in the list of independent variables, but they are not shown in this example.

  proc lifereg;
     model y*censor(0) = x1 x2;
  run;

PROC LIFEREG can operate on interval-censored data. The model syntax for specifying the censored interval is

  proc lifereg;
     model (begin, end) = x1 x2;
  run;
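As a rough illustration of how such data might be laid out, the following sketch builds a small data set with the hypothetical variables lower and upper (standing in for begin and end) and invented values. It assumes the LIFEREG convention that a missing lower bound marks a left-censored observation, a missing upper bound marks a right-censored observation, and equal bounds mark an exact failure time. The data set is far too small for a meaningful fit and is meant only to show the layout.

```sas
/* Hypothetical data, for illustration only.            */
/* Row 1: left-censored at 5   (lower bound missing)    */
/* Row 2: right-censored at 10 (upper bound missing)    */
/* Row 3: exact failure time 7 (bounds equal)           */
/* Row 4: interval-censored between 3 and 6             */
data intervals;
   input lower upper x1 x2;
   datalines;
 .   5   1  0.2
10   .   0  1.4
 7   7   1  0.9
 3   6   0  0.5
;

proc lifereg data=intervals;
   model (lower, upper) = x1 x2;
run;
```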

You can also model binomial data using the events/trials syntax for the response, as illustrated in the following statements:

  proc lifereg;
     model r/n = x1 x2;
  run;

The variable n represents the number of trials and the variable r represents the number of events.

Modeling Right-Censored Failure Time Data

The following example demonstrates how you can use the LIFEREG procedure to fit a model to right-censored failure time data.

Suppose you conduct a study of two headache pain relievers. You divide patients into two groups, with each group receiving a different type of pain reliever. You record the time taken (in minutes) for each patient to report headache relief. Because some of the patients never report relief for the entire study, some of the observations are censored.

The following DATA step creates the SAS data set headache:

  data headache;
     input minutes group censor @@;
     datalines;
  11  1  0   12  1  0   19  1  0   19  1  0
  19  1  0   19  1  0   21  1  0   20  1  0
  21  1  0   21  1  0   20  1  0   21  1  0
  20  1  0   21  1  0   25  1  0   27  1  0
  30  1  0   21  1  1   24  1  1   14  2  0
  16  2  0   16  2  0   21  2  0   21  2  0
  23  2  0   23  2  0   23  2  0   23  2  0
  25  2  1   23  2  0   24  2  0   24  2  0
  26  2  1   32  2  1   30  2  1   30  2  0
  32  2  1   20  2  1
  ;

The data set headache contains the variable minutes, which represents the reported time to headache relief; the variable group, which identifies the group to which the patient is assigned; and the variable censor, a binary variable indicating whether the observation is censored. Valid values of the variable censor are 0 (no) and 1 (yes). The following figure shows the first five records of the data set headache.

  Obs    minutes    group    censor

    1       11        1         0
    2       12        1         0
    3       19        1         0
    4       19        1         0
    5       19        1         0

Figure 39.1: Headache Data

The following statements invoke the LIFEREG procedure:

  proc lifereg;
     class group;
     model minutes*censor(1)=group;
     output out=new cdf=prob;
  run;

The CLASS statement specifies the variable group as the classification variable. The MODEL statement syntax indicates that the response variable minutes is right-censored when the variable censor takes the value 1. The MODEL statement specifies the variable group as the single explanatory variable. Because the MODEL statement does not specify the DISTRIBUTION= option, the LIFEREG procedure fits the default type 1 extreme value distribution using log(minutes) as the response. This is equivalent to fitting the Weibull distribution.
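The default can also be written out explicitly. The sketch below assumes the data set headache created earlier, names it with the DATA= option, and requests the Weibull distribution through the DISTRIBUTION= option (abbreviated here as DIST=). It is expected to give the same fit as the statements above.

```sas
proc lifereg data=headache;
   class group;
   /* dist=weibull is the default; it is stated here only for clarity */
   model minutes*censor(1) = group / dist=weibull;
run;
```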

The OUTPUT statement creates the output data set new. In addition to the variables in the original data set headache, the SAS data set new also contains the variable prob. This new variable is created by the CDF= option to contain the estimates of the cumulative distribution function evaluated at the observed response.
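For reference, the quantity stored in prob can be written out under the model's assumptions. The notation below (t for the observed time, x for the covariate vector, \beta for the regression coefficients, \sigma for the extreme value scale) is introduced here for illustration and does not appear in the original text.

```latex
\begin{align*}
\log T &= x'\beta + \sigma\,\varepsilon,
  \qquad \Pr(\varepsilon > w) = \exp\!\left(-e^{w}\right) \\
F(t \mid x) &= \Pr(T \le t \mid x)
  = 1 - \exp\!\left[-\exp\!\left(\frac{\log t - x'\beta}{\sigma}\right)\right] \\
 &= 1 - \exp\!\left[-\left(\frac{t}{e^{x'\beta}}\right)^{1/\sigma}\right]
\end{align*}
```

The last line has the form of a Weibull distribution function with scale exp(x'\beta) and shape 1/\sigma, which is one way to see why the extreme value fit on log(minutes) corresponds to a Weibull fit on minutes.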

The results of this analysis are displayed in the following figures.

Figure 39.2 displays the class level information and model fitting information. There are 30 noncensored observations and 8 right-censored observations. The log likelihood for the Weibull distribution is -9.3793. The log-likelihood value can be used to compare the goodness of fit for different models.

                  The LIFEREG Procedure

                    Model Information

  Data Set                         WORK.HEADACHE
  Dependent Variable                Log(minutes)
  Censoring Variable                      censor
  Censoring Value(s)                           1
  Number of Observations                      38
  Noncensored Values                          30
  Right Censored Values                        8
  Left Censored Values                         0
  Interval Censored Values                     0
  Name of Distribution                   Weibull
  Log Likelihood                     -9.37930239

                Class Level Information

  Name       Levels    Values

  group           2    1 2

Figure 39.2: Model Fitting Information from the LIFEREG Procedure
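One way to make such a comparison, sketched here on the assumption that the same headache data are used, is to refit the model with other values of the DISTRIBUTION= option, such as LNORMAL and LLOGISTIC, and compare the reported log likelihoods with the Weibull value. These distributions, like the Weibull, are fitted to log(minutes), so their log likelihoods should be on a common scale.

```sas
/* Lognormal fit for comparison with the default Weibull fit */
proc lifereg data=headache;
   class group;
   model minutes*censor(1) = group / dist=lnormal;
run;

/* Log-logistic fit for comparison */
proc lifereg data=headache;
   class group;
   model minutes*censor(1) = group / dist=llogistic;
run;
```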

The table of parameter estimates is displayed in Figure 39.3. Both the intercept and the slope parameter for the variable group are significantly different from 0 at the 0.05 level. Because the variable group has only one degree of freedom, parameter estimates are given for only one level of the variable group (group=1). However, the estimate for the intercept parameter provides a baseline for group=2. The resulting model is

  log(minutes) = 3.3091 - 0.1933     for group=1
  log(minutes) = 3.3091              for group=2

                         The LIFEREG Procedure

                    Analysis of Parameter Estimates

                               Standard   95% Confidence       Chi-
  Parameter        DF Estimate    Error       Limits         Square  Pr > ChiSq

  Intercept         1   3.3091   0.0589   3.1938   3.4245   3161.70      <.0001
  group         1   1  -0.1933   0.0786  -0.3473  -0.0393      6.05      0.0139
  group         2   0   0.0000   0.0000   0.0000   0.0000       .         .
  Scale             1   0.2122   0.0304   0.1603   0.2809
  Weibull Shape     1   4.7128   0.6742   3.5604   6.2381

Figure 39.3: Model Parameter Estimates from the LIFEREG Procedure

Note that the Weibull shape parameter for this model is the reciprocal of the extreme value scale parameter estimate shown in Figure 39.3 (1/0.21219 = 4.7128).
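As a worked illustration of this conversion, the estimates in Figure 39.3 can be turned into the usual Weibull parameters for the baseline group (group=2). The symbols \alpha (Weibull scale), \gamma (Weibull shape), and S (survival function) are notation assumed here, and the numbers are rounded.

```latex
\begin{align*}
\gamma &= \frac{1}{\sigma} = \frac{1}{0.2122} \approx 4.71 \\
\alpha_{2} &= \exp(3.3091) \approx 27.4 \ \text{minutes} \\
S_{2}(t) &= \exp\!\left[-\left(\frac{t}{\alpha_{2}}\right)^{\gamma}\right]
  \approx \exp\!\left[-\left(\frac{t}{27.4}\right)^{4.71}\right]
\end{align*}
```

Under this parameterization, \alpha_{2} is the time by which about 63% of the patients in group 2 are estimated to have reported relief, since 1 - e^{-1} \approx 0.632.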

The following statements produce a graph of the cumulative distribution values versus the variable minutes. The LEGEND1 statement defines the appearance of the legend that is displayed on the plot. The two AXIS statements define the appearance of the plot axes. The SYMBOL statements control the plotting symbol, color, and method of smoothing.

  legend1 frame cframe=ligr cborder=black
          position=center value=(justify=center);
  axis1 label=(angle=90 rotate=0 'Estimated CDF') minor=none;
  axis2 minor=none;
  symbol1 c=white i=spline;
  symbol2 c=yellow i=spline;

  proc sort data=new;
     by prob;

  proc gplot data=new;
     plot prob*minutes=group / frame cframe=ligr
                               legend=legend1 vaxis=axis1 haxis=axis2;
  run;

The SORT procedure sorts the data set new by the variable prob. Then the GPLOT procedure plots the variable prob versus the variable minutes, using the grouping variable as the identification variable. The LEGEND=, VAXIS=, and HAXIS= options specify the previously defined legend and axis statements.

Figure 39.4 displays the estimated cumulative distribution function for each group.

Figure 39.4: Plot of the Estimated Cumulative Distribution Function