This example fits a Weibull model and a lognormal model to the data given in Kalbfleisch and Prentice (1980, p. 5). An output data set called models is specified to contain the parameter estimates. By default, the procedure uses the natural log of the variable time as the response. After this log transformation, the Weibull model is fit using the extreme value baseline distribution, and the lognormal model is fit using the normal baseline distribution.
Because the extreme value and normal distributions do not contain any shape parameters, the variable SHAPE1 is missing in the models data set. An additional output data set, out, is created that contains the predicted quantiles and their standard errors for values of the covariate corresponding to temp=130 and temp=150. This is done with the control variable, which is set to 1 for only these two observations.
Using the standard error estimates obtained from the output data set, approximate 90% confidence limits for the predicted quantiles are then created in a subsequent DATA step for the log response. The logs of the predicted values are computed because the values of the P= variable in the OUT= data set are in the same units as the original response variable, time. The standard errors of the quantiles of log(time) are approximated (using a first-order Taylor series approximation) by the standard error of each predicted quantile of time divided by the predicted quantile itself, which is what std/predtime computes in the DATA step. These confidence limits are then converted back to the original scale by the exponential function. The following statements produce Output 39.1.1 through Output 39.1.5.
    title 'Motorette Failures With Operating Temperature as a Covariate';

    data motors;
       input time censor temp @@;
       /* On the first pass, write two rows with missing time so that */
       /* quantiles are predicted at temp=130 and temp=150.           */
       if _N_=1 then do;
          temp=130; time=.; control=1; z=1000/(273.2+temp); output;
          temp=150; time=.; control=1; z=1000/(273.2+temp); output;
       end;
       /* Keep only the observed rows with temp above 150. */
       if temp>150;
       control=0;
       z=1000/(273.2+temp);
       output;
       datalines;
    8064 0 150 8064 0 150 8064 0 150 8064 0 150
    8064 0 150 8064 0 150 8064 0 150 8064 0 150
    8064 0 150 8064 0 150 1764 1 170 2772 1 170
    3444 1 170 3542 1 170 3780 1 170 4860 1 170
    5196 1 170 5448 0 170 5448 0 170 5448 0 170
    408 1 190 408 1 190 1344 1 190 1344 1 190
    1440 1 190 1680 0 190 1680 0 190 1680 0 190
    1680 0 190 1680 0 190 408 1 220 408 1 220
    504 1 220 504 1 220 504 1 220 528 0 220
    528 0 220 528 0 220 528 0 220 528 0 220
    ;

    proc print data=motors;
    run;

    /* Model A: Weibull (the default distribution) */
    proc lifereg data=motors outest=modela covout;
       a: model time*censor(0)=z;
       output out=outa quantiles=.1 .5 .9 std=std p=predtime
              control=control;
    run;

    /* Model B: lognormal */
    proc lifereg data=motors outest=modelb covout;
       b: model time*censor(0)=z / dist=lnormal;
       output out=outb quantiles=.1 .5 .9 std=std p=predtime
              control=control;
    run;

    data models;
       set modela modelb;
    run;

    proc print data=models;
       id _model_;
       title 'fitted models';
    run;

    data out;
       set outa outb;
    run;

    /* Approximate 90% limits computed on the log scale, then exponentiated */
    data out1;
       set out;
       ltime=log(predtime);
       stde=std/predtime;
       upper=exp(ltime+1.64*stde);
       lower=exp(ltime-1.64*stde);
    run;

    proc print;
       id temp;
       title 'quantile estimates and confidence limits';
    run;
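Motorette Failures With Operating Temperature as a Covariate

Obs | time | censor | temp | control | z |
---|---|---|---|---|---|
1 | . | 0 | 130 | 1 | 2.48016 |
2 | . | 0 | 150 | 1 | 2.36295 |
3 | 1764 | 1 | 170 | 0 | 2.25632 |
4 | 2772 | 1 | 170 | 0 | 2.25632 |
5 | 3444 | 1 | 170 | 0 | 2.25632 |
6 | 3542 | 1 | 170 | 0 | 2.25632 |
7 | 3780 | 1 | 170 | 0 | 2.25632 |
8 | 4860 | 1 | 170 | 0 | 2.25632 |
9 | 5196 | 1 | 170 | 0 | 2.25632 |
10 | 5448 | 0 | 170 | 0 | 2.25632 |
11 | 5448 | 0 | 170 | 0 | 2.25632 |
12 | 5448 | 0 | 170 | 0 | 2.25632 |
13 | 408 | 1 | 190 | 0 | 2.15889 |
14 | 408 | 1 | 190 | 0 | 2.15889 |
15 | 1344 | 1 | 190 | 0 | 2.15889 |
16 | 1344 | 1 | 190 | 0 | 2.15889 |
17 | 1440 | 1 | 190 | 0 | 2.15889 |
18 | 1680 | 0 | 190 | 0 | 2.15889 |
19 | 1680 | 0 | 190 | 0 | 2.15889 |
20 | 1680 | 0 | 190 | 0 | 2.15889 |
21 | 1680 | 0 | 190 | 0 | 2.15889 |
22 | 1680 | 0 | 190 | 0 | 2.15889 |
23 | 408 | 1 | 220 | 0 | 2.02758 |
24 | 408 | 1 | 220 | 0 | 2.02758 |
25 | 504 | 1 | 220 | 0 | 2.02758 |
26 | 504 | 1 | 220 | 0 | 2.02758 |
27 | 504 | 1 | 220 | 0 | 2.02758 |
28 | 528 | 0 | 220 | 0 | 2.02758 |
29 | 528 | 0 | 220 | 0 | 2.02758 |
30 | 528 | 0 | 220 | 0 | 2.02758 |
31 | 528 | 0 | 220 | 0 | 2.02758 |
32 | 528 | 0 | 220 | 0 | 2.02758 |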
The LIFEREG Procedure

Model Information

    Data Set                    WORK.MOTORS
    Dependent Variable          Log(time)
    Censoring Variable          censor
    Censoring Value(s)          0
    Number of Observations      30
    Noncensored Values          17
    Right Censored Values       13
    Left Censored Values        0
    Interval Censored Values    0
    Missing Values              2
    Name of Distribution        Weibull
    Log Likelihood              -22.95148315

Type III Analysis of Effects

Effect | DF | Wald Chi-Square | Pr > ChiSq |
---|---|---|---|
z | 1 | 99.5239 | <.0001 |

Analysis of Parameter Estimates

Parameter | DF | Estimate | Standard Error | 95% CL Lower | 95% CL Upper | Chi-Square | Pr > ChiSq |
---|---|---|---|---|---|---|---|
Intercept | 1 | -11.8912 | 1.9655 | -15.7435 | -8.0389 | 36.60 | <.0001 |
z | 1 | 9.0383 | 0.9060 | 7.2626 | 10.8141 | 99.52 | <.0001 |
Scale | 1 | 0.3613 | 0.0795 | 0.2347 | 0.5561 | | |
Weibull Shape | 1 | 2.7679 | 0.6091 | 1.7982 | 4.2605 | | |
The LIFEREG Procedure

Model Information

    Data Set                    WORK.MOTORS
    Dependent Variable          Log(time)
    Censoring Variable          censor
    Censoring Value(s)          0
    Number of Observations      30
    Noncensored Values          17
    Right Censored Values       13
    Left Censored Values        0
    Interval Censored Values    0
    Missing Values              2
    Name of Distribution        Lognormal
    Log Likelihood              -24.47381031

Type III Analysis of Effects

Effect | DF | Wald Chi-Square | Pr > ChiSq |
---|---|---|---|
z | 1 | 42.0001 | <.0001 |

Analysis of Parameter Estimates

Parameter | DF | Estimate | Standard Error | 95% CL Lower | 95% CL Upper | Chi-Square | Pr > ChiSq |
---|---|---|---|---|---|---|---|
Intercept | 1 | -10.4706 | 2.7719 | -15.9034 | -5.0377 | 14.27 | 0.0002 |
z | 1 | 8.3221 | 1.2841 | 5.8052 | 10.8389 | 42.00 | <.0001 |
Scale | 1 | 0.6040 | 0.1107 | 0.4217 | 0.8652 | | |
fitted models

_MODEL_ | _NAME_ | _TYPE_ | _DIST_ | _STATUS_ | _LNLIKE_ | time | Intercept | z | _SCALE_ |
---|---|---|---|---|---|---|---|---|---|
A | time | PARMS | Weibull | 0 Converged | -22.9515 | -1.0000 | -11.8912 | 9.03834 | 0.36128 |
A | Intercept | COV | Weibull | 0 Converged | -22.9515 | -11.8912 | 3.8632 | -1.77878 | 0.03448 |
A | z | COV | Weibull | 0 Converged | -22.9515 | 9.0383 | -1.7788 | 0.82082 | -0.01488 |
A | Scale | COV | Weibull | 0 Converged | -22.9515 | 0.3613 | 0.0345 | -0.01488 | 0.00632 |
B | time | PARMS | Lognormal | 0 Converged | -24.4738 | -1.0000 | -10.4706 | 8.32208 | 0.60403 |
B | Intercept | COV | Lognormal | 0 Converged | -24.4738 | -10.4706 | 7.6835 | -3.55566 | 0.03267 |
B | z | COV | Lognormal | 0 Converged | -24.4738 | 8.3221 | -3.5557 | 1.64897 | -0.01285 |
B | Scale | COV | Lognormal | 0 Converged | -24.4738 | 0.6040 | 0.0327 | -0.01285 | 0.01226 |
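As a rough cross-check of how the estimates map to predicted quantiles, the following sketch recomputes the log quantiles at temp=130 by hand. It assumes the usual accelerated failure time form, in which the p-th quantile of log(time) is Intercept + z*beta + Scale*w, where w is log(-log(1-p)) for the extreme value baseline and the standard normal quantile for the normal baseline. The data set name check, the variable names, and the hard-coded (rounded) estimates are illustrative only.

```sas
/* Sketch: recompute log quantiles at temp=130 from the printed estimates. */
/* The data set CHECK and its variable names are hypothetical.            */
data check;
   z = 1000 / (273.2 + 130);          /* same covariate transform as in MOTORS */
   do p = 0.1, 0.5, 0.9;
      /* Weibull fit: extreme value baseline quantile is log(-log(1-p)) */
      ltime_weibull = -11.8912 + 9.0383*z + 0.3613*log(-log(1 - p));
      /* Lognormal fit: normal baseline quantile is PROBIT(p) */
      ltime_lnormal = -10.4706 + 8.3221*z + 0.6040*probit(p);
      output;
   end;
run;

proc print data=check;
run;
```

Up to rounding of the hard-coded estimates, these values should be close to the ltime column of the quantile listing (for example, roughly 10.39 for the Weibull median and 10.17 for the lognormal median at temp=130).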
quantile estimates and confidence limits

temp | time | censor | control | z | _PROB_ | predtime | std | ltime | stde | upper | lower |
---|---|---|---|---|---|---|---|---|---|---|---|
130 | . | 0 | 1 | 2.48016 | 0.1 | 16519.27 | 5999.85 | 9.7123 | 0.36320 | 29969.51 | 9105.47 |
130 | . | 0 | 1 | 2.48016 | 0.5 | 32626.65 | 9874.33 | 10.3929 | 0.30265 | 53595.71 | 19861.63 |
130 | . | 0 | 1 | 2.48016 | 0.9 | 50343.22 | 15044.35 | 10.8266 | 0.29884 | 82183.49 | 30838.80 |
150 | . | 0 | 1 | 2.36295 | 0.1 | 5726.74 | 1569.34 | 8.6529 | 0.27404 | 8976.12 | 3653.64 |
150 | . | 0 | 1 | 2.36295 | 0.5 | 11310.68 | 2299.92 | 9.3335 | 0.20334 | 15787.62 | 8103.28 |
150 | . | 0 | 1 | 2.36295 | 0.9 | 17452.49 | 3629.28 | 9.7672 | 0.20795 | 24545.37 | 12409.24 |
130 | . | 0 | 1 | 2.48016 | 0.1 | 12033.19 | 5482.34 | 9.3954 | 0.45560 | 25402.68 | 5700.09 |
130 | . | 0 | 1 | 2.48016 | 0.5 | 26095.68 | 11359.45 | 10.1695 | 0.43530 | 53285.36 | 12779.95 |
130 | . | 0 | 1 | 2.48016 | 0.9 | 56592.19 | 26036.90 | 10.9436 | 0.46008 | 120349.65 | 26611.42 |
150 | . | 0 | 1 | 2.36295 | 0.1 | 4536.88 | 1443.07 | 8.4200 | 0.31808 | 7643.71 | 2692.83 |
150 | . | 0 | 1 | 2.36295 | 0.5 | 9838.86 | 2901.15 | 9.1941 | 0.29487 | 15957.38 | 6066.36 |
150 | . | 0 | 1 | 2.36295 | 0.9 | 21336.97 | 7172.34 | 9.9682 | 0.33615 | 37029.72 | 12294.62 |
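The approximation behind the stde, upper, and lower columns can be written out as a first-order Taylor (delta method) argument. The block below is a sketch of that reasoning; the symbols for a predicted quantile (predtime) and its standard error (std) are introduced here for illustration. The multiplier 1.64 is close to the 95th percentile of the standard normal distribution (about 1.645), which gives two-sided 90% limits. The worked numbers use the first row of the listing.

```latex
% \hat t_p = predicted quantile (predtime), se(\hat t_p) = its standard error (std)
\operatorname{se}\bigl(\log \hat t_p\bigr)
  \approx \left|\frac{d \log t}{dt}\right|_{t=\hat t_p} \operatorname{se}(\hat t_p)
  = \frac{\operatorname{se}(\hat t_p)}{\hat t_p},
\qquad
\text{limits} = \exp\!\left(\log \hat t_p \pm 1.64\,\frac{\operatorname{se}(\hat t_p)}{\hat t_p}\right)

% First row: temp = 130, p = 0.1, predtime = 16519.27, std = 5999.85
\frac{5999.85}{16519.27} \approx 0.3632,
\qquad
16519.27 \times e^{\pm 1.64 \times 0.3632} \approx (9105,\ 29969)
```

Because the data set out stacks outa on top of outb, the first six rows of the listing come from the Weibull fit and the last six from the lognormal fit.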
The LIFEREG procedure can be used to perform a Tobit analysis. The Tobit model, described by Tobin (1958), is a regression model for left-censored data assuming a normally distributed error term. The model parameters are estimated by maximum likelihood. PROC LIFEREG provides estimates of the parameters of the distribution of the uncensored data. Refer to Greene (1993) and Maddala (1983) for a more complete discussion of censored normal data and related distributions. This example shows how you can use PROC LIFEREG and the DATA step to compute two of the three types of predicted values discussed there.
Consider a continuous random variable Y and a constant C. If you were to sample from the distribution of Y but discard values less than (greater than) C, the distribution of the remaining observations would be truncated on the left (right). If you were to sample from the distribution of Y and report values less than (greater than) C as C, the distribution of the sample would be left (right) censored.
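The distinction between truncation and censoring can be illustrated with a small DATA step. This is only a sketch: the five values of y are made up, C is taken to be 0, and the data set names censored and truncated are hypothetical.

```sas
/* Sketch: left truncation versus left censoring at C = 0. */
/* The values of Y below are invented for illustration.    */
data censored truncated;
   c = 0;
   input y @@;
   /* Truncation: values below C are never observed, so they are dropped. */
   if y >= c then output truncated;
   /* Censoring: values below C are kept but reported as C. */
   y = max(y, c);
   output censored;
   datalines;
-2.1 0.4 -0.3 1.7 3.2
;
run;
```

Here truncated keeps only the three values at or above zero, whereas censored keeps all five rows with the two negative values replaced by zero.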
The probability density function of the left-truncated random variable $Y'$ is given by

$$ f_{Y'}(y) = \frac{f_Y(y)}{\Pr(Y > C)} \quad \text{for } y > C $$

where $f_Y(y)$ is the probability density function of $Y$. PROC LIFEREG cannot compute the proper likelihood function to estimate parameters or predicted values for a truncated distribution.
Suppose the model being fit is specified as follows:

$$ Y_i^* = x_i'\beta + \epsilon_i $$

where $\epsilon_i$ is a normal error term with zero mean and standard deviation $\sigma$.

Define the censored random variable $Y_i$ as

$$ Y_i = \begin{cases} Y_i^* & \text{if } Y_i^* > 0 \\ 0 & \text{if } Y_i^* \le 0 \end{cases} $$

This is the Tobit model for left-censored normal data. $Y_i^*$ is sometimes called the latent variable. PROC LIFEREG estimates parameters of the distribution of $Y_i^*$ by maximum likelihood.
You can use the LIFEREG procedure to compute predicted values based on the mean functions of the latent and observed variables. The mean of the latent variable $Y_i^*$ is $x_i'\beta$, and you can compute values of the mean for different settings of $x_i$ by specifying XBETA=variable-name in an OUTPUT statement. Estimates of $x_i'\beta$ for each observation are written to the OUT= data set. Predicted values of the observed variable $Y_i$ can be computed based on the mean

$$ E(Y_i) = \Phi\!\left(\frac{x_i'\beta}{\sigma}\right)\left(x_i'\beta + \sigma\lambda_i\right) $$

where

$$ \lambda_i = \frac{\phi(x_i'\beta/\sigma)}{\Phi(x_i'\beta/\sigma)} $$

and $\phi$ and $\Phi$ represent the normal probability density and cumulative distribution functions.
Although the distribution of $\epsilon_i$ in the Tobit model is often assumed normal, you can use other distributions for the Tobit model in the LIFEREG procedure by specifying a distribution with the DISTRIBUTION= option in the MODEL statement. One distribution that should be mentioned is the logistic distribution. For this distribution, the MLE has a bounded influence function with respect to the response variable, but not with respect to the design variables. If you believe your data have outliers in the response direction, you might try this distribution for a more robust estimation of the Tobit model.
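With the logistic distribution, the predicted values of the observed variable $Y_i$ can be computed from the mean $x_i'\beta$ of $Y_i^*$ as

$$ E(Y_i) = \sigma \ln\!\left(1 + \exp\!\left(\frac{x_i'\beta}{\sigma}\right)\right) $$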
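For readers who want to see where a closed form for the logistic case can come from, the following is a sketch of one derivation. It assumes the latent variable is its mean plus the scale times a standard logistic error and that the observed variable is the latent variable censored at zero; the shorthand $\mu$ for $x_i'\beta$ and the symbol $L$ are introduced here only for brevity.

```latex
% Assumptions: Y_i^* = \mu + \sigma L with L standard logistic, Y_i = \max(0, Y_i^*),
% and \mu = x_i'\beta (shorthand introduced for this sketch).
E(Y_i) = \int_0^\infty \Pr(Y_i^* > t)\,dt
       = \int_0^\infty \frac{dt}{1 + \exp\{(t-\mu)/\sigma\}}
       = \sigma \Bigl[-\ln\bigl(1 + e^{-u}\bigr)\Bigr]_{u=-\mu/\sigma}^{\infty}
       = \sigma \ln\!\left(1 + \exp\!\left(\frac{\mu}{\sigma}\right)\right)
```

The third equality uses the substitution $u = (t-\mu)/\sigma$ together with the antiderivative $-\ln(1+e^{-u})$ of $1/(1+e^{u})$.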
The following table shows a subset of the Mroz (1987) data set. In these data, Hours is the number of hours the wife worked outside the household in a given year, Yrs_Ed is the years of education, and Yrs_Exp is the years of work experience. A Tobit model is fit to the hours worked, with years of education and experience as covariates.
Hours | Yrs_Ed | Yrs_Exp |
---|---|---|
0 | 8 | 9 |
0 | 8 | 12 |
0 | 9 | 10 |
0 | 10 | 15 |
0 | 11 | 4 |
0 | 11 | 6 |
1000 | 12 | 1 |
1960 | 12 | 29 |
0 | 13 | 3 |
2100 | 13 | 36 |
3686 | 14 | 11 |
1920 | 14 | 38 |
0 | 15 | 14 |
1728 | 16 | 3 |
1568 | 16 | 19 |
1316 | 17 | 7 |
0 | 17 | 15 |
If the wife was not employed (worked 0 hours), her hours worked are left-censored at zero. To accommodate left censoring in PROC LIFEREG, you need two variables to indicate the censoring status of each observation. You can think of these variables as the lower and upper endpoints of interval censoring. If there is no censoring, set both variables to the observed value of Hours. To indicate left censoring, set the lower endpoint to missing and the upper endpoint to the censored value, zero in this case.
The following statements create a SAS data set with the variables Hours, Yrs_Ed, and Yrs_Exp from the preceding data. A new variable, Lower, is created such that Lower=. if Hours=0 and Lower=Hours if Hours>0.
    data subset;
       input Hours Yrs_Ed Yrs_Exp @@;
       if Hours eq 0 then Lower=.;
       else Lower=Hours;
       datalines;
    0 8 9 0 8 12 0 9 10 0 10 15 0 11 4 0 11 6
    1000 12 1 1960 12 29 0 13 3 2100 13 36
    3686 14 11 1920 14 38 0 15 14 1728 16 3
    1568 16 19 1316 17 7 0 17 15
    ;
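Before fitting the model, it can be worth confirming that the censoring was coded as intended. The following sketch, which assumes the subset data set has just been created as above, counts the rows and the rows with a missing Lower value; the column aliases n_total and n_left_censored are hypothetical.

```sas
/* Sketch: count all rows and the left-censored rows (Lower is missing). */
proc sql;
   select count(*)     as n_total,
          nmiss(Lower) as n_left_censored
   from subset;
quit;
```

For the data shown, this should report 17 rows, 9 of them left-censored, which is what the Model Information table from PROC LIFEREG reports for this data set.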
The following statements fit a normal regression model to the left-censored Hours data using Yrs_Ed and Yrs_Exp as covariates. You need the estimated standard deviation of the normal distribution to compute the predicted values of the censored distribution from the preceding formulas. The data set OUTEST contains the standard deviation estimate in a variable named _SCALE_. You also need estimates of $x_i'\beta$. These are contained in the data set OUT as the variable Xbeta.
    proc lifereg data=subset outest=OUTEST(keep=_scale_);
       model (lower, hours) = yrs_ed yrs_exp / d=normal;
       output out=OUT xbeta=Xbeta;
    run;
The LIFEREG Procedure

Model Information

    Data Set                    WORK.SUBSET
    Dependent Variable          Lower
    Dependent Variable          Hours
    Number of Observations      17
    Noncensored Values          8
    Right Censored Values       0
    Left Censored Values        9
    Interval Censored Values    0
    Name of Distribution        Normal
    Log Likelihood              -74.9369977

Analysis of Parameter Estimates

Parameter | DF | Estimate | Standard Error | 95% CL Lower | 95% CL Upper | Chi-Square | Pr > ChiSq |
---|---|---|---|---|---|---|---|
Intercept | 1 | -5598.64 | 2850.248 | -11185.0 | -12.2553 | 3.86 | 0.0495 |
Yrs_Ed | 1 | 373.1477 | 191.8872 | -2.9442 | 749.2397 | 3.78 | 0.0518 |
Yrs_Exp | 1 | 63.3371 | 38.3632 | -11.8533 | 138.5276 | 2.73 | 0.0987 |
Scale | 1 | 1582.870 | 442.6732 | 914.9433 | 2738.397 | | |
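As a worked illustration of how these estimates feed the censored-mean formula, the following sketch evaluates it by hand for the first observation (Yrs_Ed = 8, Yrs_Exp = 9). The estimates are rounded as printed, and the normal density and distribution values are approximations, so the result agrees with a full-precision computation only to about two significant digits.

```latex
% First observation: Yrs_Ed = 8, Yrs_Exp = 9
x_1'\hat\beta = -5598.64 + 373.1477(8) + 63.3371(9) \approx -2043.4,
\qquad
\frac{x_1'\hat\beta}{\hat\sigma} \approx \frac{-2043.4}{1582.87} \approx -1.291

\Phi(-1.291) \approx 0.0984, \qquad \phi(-1.291) \approx 0.1734

% E(Y) = \Phi \cdot (x'\beta + \sigma\lambda) = \Phi \cdot x'\beta + \sigma\phi
E(Y_1) \approx 0.0984 \times (-2043.4) + 1582.87 \times 0.1734 \approx 73
```

So even though the latent mean is strongly negative, the mean of the censored variable is a small positive number of hours, because the censored variable can never fall below zero.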
The following statements combine the two data sets created by PROC LIFEREG to compute predicted values for the censored distribution. The OUTEST= data set contains the estimate of the standard deviation from the uncensored distribution, and the OUT= data set contains estimates of $x_i'\beta$.
    data predict;
       drop lambda _scale_ _prob_;
       set out;
       if _n_ eq 1 then set outest;
       lambda = pdf('NORMAL',Xbeta/_scale_)
                / cdf('NORMAL',Xbeta/_scale_);
       Predict = cdf('NORMAL', Xbeta/_scale_)
                 * (Xbeta + _scale_*lambda);
       label Xbeta='MEAN OF UNCENSORED VARIABLE'
             Predict = 'MEAN OF CENSORED VARIABLE';
    run;

    proc print data=predict noobs label;
       var hours lower yrs: xbeta predict;
    run;
Hours | Lower | Yrs_Ed | Yrs_Exp | MEAN OF UNCENSORED VARIABLE | MEAN OF CENSORED VARIABLE |
---|---|---|---|---|---|
0 | . | 8 | 9 | -2043.42 | 73.46 |
0 | . | 8 | 12 | -1853.41 | 94.23 |
0 | . | 9 | 10 | -1606.94 | 128.10 |
0 | . | 10 | 15 | -917.10 | 276.04 |
0 | . | 11 | 4 | -1240.67 | 195.76 |
0 | . | 11 | 6 | -1113.99 | 224.72 |
1000 | 1000 | 12 | 1 | -1057.53 | 238.63 |
1960 | 1960 | 12 | 29 | 715.91 | 1052.94 |
0 | . | 13 | 3 | -557.71 | 391.42 |
2100 | 2100 | 13 | 36 | 1532.42 | 1672.50 |
3686 | 3686 | 14 | 11 | 322.14 | 805.58 |
1920 | 1920 | 14 | 38 | 2032.24 | 2106.81 |
0 | . | 15 | 14 | 885.30 | 1170.39 |
1728 | 1728 | 16 | 3 | 561.74 | 951.69 |
1568 | 1568 | 16 | 19 | 1575.13 | 1708.24 |
1316 | 1316 | 17 | 7 | 1188.23 | 1395.61 |
0 | . | 17 | 15 | 1694.93 | 1809.97 |
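The same workflow could be repeated with a logistic error distribution, as discussed earlier in this example. The following is a minimal sketch, not part of the original example: the data set names OUTEST_L, OUT_L, and predict_l and the variable Predict_L are hypothetical, and the mean is computed with the closed form $\sigma \ln(1 + \exp(x_i'\beta/\sigma))$ for a logistic latent variable censored at zero. Whether the logistic fit converges on a subset this small has not been checked here.

```sas
/* Sketch: Tobit-type fit with a logistic error distribution.         */
/* OUTEST_L, OUT_L, PREDICT_L, and Predict_L are hypothetical names.  */
proc lifereg data=subset outest=OUTEST_L(keep=_scale_);
   model (lower, hours) = yrs_ed yrs_exp / d=logistic;
   output out=OUT_L xbeta=Xbeta;
run;

data predict_l;
   set out_l;
   if _n_ eq 1 then set outest_l;   /* carry the scale estimate onto every row */
   /* Mean of the censored variable under the logistic assumption */
   Predict_L = _scale_ * log(1 + exp(Xbeta / _scale_));
run;

proc print data=predict_l noobs;
   var hours lower yrs: xbeta predict_l;
run;
```

Comparing Predict_L with the normal-based predictions gives a rough sense of how sensitive the censored means are to the choice of error distribution.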
This example illustrates the use of parameter initial value specification to help overcome convergence difficulties.
The following statements create a data set and request that a Weibull regression model be fit to the data.
    data raw;
       input censor x c1 @@;
       datalines;
    0 16 0.00 0 17 0.00 0 18 0.00
    0 17 0.04 0 18 0.04 0 18 0.04
    0 23 0.40 0 22 0.40 0 22 0.40
    0 33 4.00 0 34 4.00 0 35 4.00
    1 54 40.00 1 54 40.00 1 54 40.00
    1 54 400.00 1 54 400.00 1 54 400.00
    ;
    run;

    proc print;
    run;

    title 'OLS (default) initial values';
    proc lifereg data=raw;
       model x*censor(1) = c1 / distribution = weibull itprint;
    run;
Output 39.3.1 shows the data set contents.
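Obs | censor | x | c1 |
---|---|---|---|
1 | 0 | 16 | 0.00 |
2 | 0 | 17 | 0.00 |
3 | 0 | 18 | 0.00 |
4 | 0 | 17 | 0.04 |
5 | 0 | 18 | 0.04 |
6 | 0 | 18 | 0.04 |
7 | 0 | 23 | 0.40 |
8 | 0 | 22 | 0.40 |
9 | 0 | 22 | 0.40 |
10 | 0 | 33 | 4.00 |
11 | 0 | 34 | 4.00 |
12 | 0 | 35 | 4.00 |
13 | 1 | 54 | 40.00 |
14 | 1 | 54 | 40.00 |
15 | 1 | 54 | 40.00 |
16 | 1 | 54 | 400.00 |
17 | 1 | 54 | 400.00 |
18 | 1 | 54 | 400.00 |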
Convergence was not attained in 50 iterations for this model, as the messages to the log indicate:
    WARNING: Convergence was not attained in 50 iterations. You may want to
             increase the maximum number of iterations (MAXITER= option) or
             change the convergence criteria (CONVERGE = value) in the MODEL
             statement.
    WARNING: The procedure is continuing in spite of the above warning.
             Results shown are based on the last maximum likelihood iteration.
             Validity of the model fit is questionable.
The first line (iter=0) of the iteration history table, in Output 39.3.2, shows the default initial ordinary least squares (OLS) estimates of the parameters.
OLS (default) initial values

Iter | Ridge | Loglike | Intercept | c1 | Scale |
---|---|---|---|---|---|
0 | 0 | -22.891088 | 3.2324769714 | 0.0020664542 | 0.3995754195 |
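To see where the iteration-0 regression values come from, you can run an ordinary least squares regression of log(x) on c1 that ignores the censoring indicator. The following is a sketch under that assumption; the data set raw_log and the variable logx are hypothetical names.

```sas
/* Sketch: OLS of log(x) on c1, ignoring the censoring indicator. */
/* RAW_LOG and LOGX are hypothetical names.                       */
data raw_log;
   set raw;
   logx = log(x);
run;

proc reg data=raw_log;
   model logx = c1;
run;
quit;
```

A hand calculation on these data gives an intercept of about 3.2325 and a slope of about 0.00207, in line with the Intercept and c1 values in the iter=0 row. The sketch makes no claim about how the initial Scale value is derived.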
The log-logistic distribution is more robust to large values of the response than the Weibull distribution. One approach to improving convergence is therefore to fit a log-logistic distribution first and, if this fit converges, to use the resulting parameter estimates as initial values in a subsequent fit of a model with the Weibull distribution.
The following statements fit a log logistic distribution to the data.
    proc lifereg data=raw;
       model x*censor(1) = c1 / distribution = llogistic;
    run;
The algorithm converges, and the maximum likelihood estimates for the log-logistic distribution are shown in Output 39.3.3.
The LIFEREG Procedure

Model Information

    Data Set                    WORK.RAW
    Dependent Variable          Log(x)
    Censoring Variable          censor
    Censoring Value(s)          1
    Number of Observations      18
    Noncensored Values          12
    Right Censored Values       6
    Left Censored Values        0
    Interval Censored Values    0
    Name of Distribution        LLogistic
    Log Likelihood              12.093136846

Analysis of Parameter Estimates

Parameter | DF | Estimate | Standard Error | 95% CL Lower | 95% CL Upper | Chi-Square | Pr > ChiSq |
---|---|---|---|---|---|---|---|
Intercept | 1 | 2.8983 | 0.0318 | 2.8360 | 2.9606 | 8309.43 | <.0001 |
c1 | 1 | 0.1592 | 0.0133 | 0.1332 | 0.1852 | 143.85 | <.0001 |
Scale | 1 | 0.0498 | 0.0122 | 0.0308 | 0.0804 | | |
The following statements re-fit the Weibull model using the maximum likelihood estimates from the log logistic fit as initial values.
    proc lifereg data=raw outest=outest;
       model x*censor(1) = c1 / itprint distribution = weibull
             intercept=2.898 initial=0.16 scale=0.05;
       output out=out xbeta=xbeta;
    run;
Examination of the resulting output in Output 39.3.4 shows that the convergence problem has been solved by specifying different initial values.
Alternatively, you can supply the same starting values for the three parameters in an INEST= data set. The following invocation of PROC LIFEREG is equivalent to the previous invocation.
    data in;
       input intercept c1 scale;
       datalines;
    2.898 0.16 0.05
    ;

    proc lifereg data=raw inest=in outest=outest;
       model x*censor(1) = c1 / itprint distribution = weibull;
       output out=out xbeta=xbeta;
    run;
The LIFEREG Procedure

Model Information

    Data Set                    WORK.RAW
    Dependent Variable          Log(x)
    Censoring Variable          censor
    Censoring Value(s)          1
    Number of Observations      18
    Noncensored Values          12
    Right Censored Values       6
    Left Censored Values        0
    Interval Censored Values    0
    Name of Distribution        Weibull
    Log Likelihood              11.232023272

Algorithm converged.

Analysis of Parameter Estimates

Parameter | DF | Estimate | Standard Error | 95% CL Lower | 95% CL Upper | Chi-Square | Pr > ChiSq |
---|---|---|---|---|---|---|---|
Intercept | 1 | 2.9699 | 0.0326 | 2.9059 | 3.0338 | 8278.86 | <.0001 |
c1 | 1 | 0.1435 | 0.0165 | 0.1111 | 0.1758 | 75.43 | <.0001 |
Scale | 1 | 0.0844 | 0.0189 | 0.0544 | 0.1308 | | |
Weibull Shape | 1 | 11.8526 | 2.6514 | 7.6455 | 18.3749 | | |
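The Weibull Shape row in this table is the reciprocal of the Scale row, and its confidence limits are the reciprocals of the Scale limits in reverse order. The arithmetic below is a quick check using the rounded values as printed, so the last digits differ slightly from the table.

```latex
\text{Weibull Shape} = \frac{1}{\text{Scale}}:
\qquad \frac{1}{0.0844} \approx 11.85

\text{limits:} \qquad \frac{1}{0.1308} \approx 7.65,
\qquad \frac{1}{0.0544} \approx 18.38
```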
The following artificial data are for a study of the natural recovery time of mice after injection of a certain toxin. Twenty mice were grouped by sex (sex: 1 = Male, 2 = Female), with ten mice in each group. Their ages (in days) were recorded at the injection. Their recovery times (in minutes) were also recorded. Toxin density in blood was used to decide whether a mouse had recovered. Mice were checked at two times for recovery. If a mouse had recovered at the first time, the observation is left-censored, and no further measurement is made. The variable time1 is set to missing and time2 is set to the measurement time to indicate left censoring. If a mouse had not recovered at the first time, it was checked later at a second time. If it had recovered by the second measurement time, the observation is interval-censored, and the variable time1 is set to the first measurement time and time2 is set to the second measurement time. If there was no recovery at the second measurement, the observation is right-censored, and time1 is set to the second measurement time and time2 is set to missing to indicate right censoring.
The following statements create a SAS data set containing the data from the experiment and fit a Weibull model with age, sex, and the age-by-sex interaction as covariates.
    title 'Natural Recovery Time';

    data mice;
       input sex age time1 time2;
       datalines;
    1 57  631  631
    1 45    .  170
    1 54  227  227
    1 43  143  143
    1 64  916    .
    1 67  691  705
    1 44  100  100
    1 59  730    .
    1 47  365  365
    1 74 1916 1916
    2 79 1326    .
    2 75  837  837
    2 84 1200 1235
    2 54    .  365
    2 74 1255 1255
    2 71 1823    .
    2 65  537  637
    2 33  583  683
    2 77  955    .
    2 46  577  577
    ;

    data xrow1;
       input sex age time1 time2;
       datalines;
    1 50 . .
    ;

    data xrow2;
       input sex age time1 time2;
       datalines;
    2 60.6 . .
    ;

    proc lifereg data=mice xdata=xrow1;
       class sex;
       model (time1, time2) = age sex age*sex / dist=Weibull;
       probplot / nodata
                  font = swiss
                  plower=.5
                  vref(intersect) = 75
                  vreflab = '75 Percent'
                  vreflabpos = 2
                  cfit=blue
                  cframe=ligr;
       inset / cfill = white
               ctext = blue;
    run;
Standard output is shown in Output 39.4.1. Tables containing general model information, Type III tests for the main effects and interaction terms, and parameter estimates are created.
Natural Recovery Time

The LIFEREG Procedure

Model Information

    Data Set                    WORK.MICE
    Dependent Variable          Log(time1)
    Dependent Variable          Log(time2)
    Number of Observations      20
    Noncensored Values          9
    Right Censored Values       5
    Left Censored Values        2
    Interval Censored Values    4
    Name of Distribution        Weibull
    Log Likelihood              -25.91033295

Type III Analysis of Effects

Effect | DF | Wald Chi-Square | Pr > ChiSq |
---|---|---|---|
age | 1 | 33.8496 | <.0001 |
sex | 1 | 14.0245 | 0.0002 |
age*sex | 1 | 10.7196 | 0.0011 |

Analysis of Parameter Estimates

Parameter | Level | DF | Estimate | Standard Error | 95% CL Lower | 95% CL Upper | Chi-Square | Pr > ChiSq |
---|---|---|---|---|---|---|---|---|
Intercept | | 1 | 5.4110 | 0.5549 | 4.3234 | 6.4986 | 95.08 | <.0001 |
age | | 1 | 0.0250 | 0.0086 | 0.0081 | 0.0419 | 8.42 | 0.0037 |
sex | 1 | 1 | -3.9808 | 1.0630 | -6.0643 | -1.8974 | 14.02 | 0.0002 |
sex | 2 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
age*sex | 1 | 1 | 0.0613 | 0.0187 | 0.0246 | 0.0980 | 10.72 | 0.0011 |
age*sex | 2 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
Scale | | 1 | 0.4087 | 0.0900 | 0.2654 | 0.6294 | | |
Weibull Shape | | 1 | 2.4468 | 0.5391 | 1.5887 | 3.7682 | | |
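The censoring counts in the Model Information table follow directly from the pattern of missing values in time1 and time2. The following sketch classifies each mouse accordingly and tabulates the result; the data set mice_type and the variable censtype are hypothetical names introduced for illustration.

```sas
/* Sketch: derive the censoring type from (time1, time2).  */
/* MICE_TYPE and CENSTYPE are hypothetical names.          */
data mice_type;
   set mice;
   length censtype $ 8;
   if missing(time1) and missing(time2) then censtype = 'unusable';
   else if missing(time1) then censtype = 'left';      /* recovered before first check  */
   else if missing(time2) then censtype = 'right';     /* not recovered at second check */
   else if time1 = time2  then censtype = 'exact';     /* noncensored                   */
   else                        censtype = 'interval';  /* recovered between checks      */
run;

proc freq data=mice_type;
   tables censtype;
run;
```

For the data shown, the frequencies should be 9 exact, 5 right, 2 left, and 4 interval, matching the counts that PROC LIFEREG reports.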
The following two plots display the predicted probability against the recovery time for two different populations. Output 39.4.2 is created with the PROBPLOT statement with the option XDATA=xrow1, which specifies the population with sex = 1, age = 50. Although the SAS statements are not shown, Output 39.4.3 is created with the PROBPLOT statement with the option XDATA=xrow2, which specifies the population with sex = 2, age = 60.6. The xrow2 values are the default values that the LIFEREG procedure would use for the probability plot if the XDATA= option had not been specified (60.6 is the mean of age in the mice data). Reference lines are used to display specified predicted probability points and their relative locations on the plot.
The following statements create a SAS data set containing observed and right-censored lifetimes of 70 diesel engine fans (Nelson 1982, p. 318).
    title 'Engine Fan Lifetime Study';

    data fan;
       input lifetime censor @@;
       lifetime = lifetime / 1000;   /* convert hours to thousands of hours */
       label lifetime = Lifetime;
       datalines;
    450 0 460 1 1150 0 1150 0 1560 1
    1600 0 1660 1 1850 1 1850 1 1850 1
    1850 1 1850 1 2030 1 2030 1 2030 1
    2070 0 2070 0 2080 0 2200 1 3000 1
    3000 1 3000 1 3000 1 3100 0 3200 1
    3450 0 3750 1 3750 1 4150 1 4150 1
    4150 1 4150 1 4300 1 4300 1 4300 1
    4300 1 4600 0 4850 1 4850 1 4850 1
    4850 1 5000 1 5000 1 5000 1 6100 1
    6100 0 6100 1 6100 1 6300 1 6450 1
    6450 1 6700 1 7450 1 7800 1 7800 1
    8100 1 8100 1 8200 1 8500 1 8500 1
    8500 1 8750 1 8750 0 8750 1 9400 1
    9900 1 10100 1 10100 1 10100 1 11500 1
    ;
    run;
Some of the fans had not failed at the time the data were collected, and the unfailed units have right-censored lifetimes. The variable LIFETIME represents either a failure time or a censoring time in thousands of hours. The variable CENSOR is equal to 0 if the value of LIFETIME is a failure time, and it is equal to 1 if the value is a censoring time. The following statements use the LIFEREG procedure to produce the probability plot with an inset for the engine lifetimes.
    symbol v=dot c=white;

    proc lifereg;
       model lifetime*censor(1) = / d = weibull;
       probplot cencolor = red
                cframe = ligr
                cfit = blue
                ppout
                npintervals=simul;
       inset / cfill = white
               ctext = blue;
    run;
The resulting graphical output is shown in Output 39.5.1. The estimated CDF, a line representing the maximum likelihood fit, and pointwise parametric confidence bands are plotted in the body of Output 39.5.1. The values of right-censored observations are plotted along the top of the graph. The Cumulative Probability Estimates table is also created in Output 39.5.2.
The LIFEREG Procedure

Cumulative Probability Estimates

Lifetime | Cumulative Probability | Simultaneous 95% CL Lower | Simultaneous 95% CL Upper | Kaplan-Meier Estimate | Kaplan-Meier Standard Error |
---|---|---|---|---|---|
0.45 | 0.0071 | 0.0007 | 0.2114 | 0.0143 | 0.0142 |
1.15 | 0.0215 | 0.0033 | 0.2114 | 0.0288 | 0.0201 |
1.15 | 0.0360 | 0.0073 | 0.2168 | 0.0433 | 0.0244 |
1.6 | 0.0506 | 0.0125 | 0.2304 | 0.0580 | 0.0282 |
2.07 | 0.0666 | 0.0190 | 0.2539 | 0.0751 | 0.0324 |
2.07 | 0.0837 | 0.0264 | 0.2760 | 0.0923 | 0.0361 |
2.08 | 0.1008 | 0.0344 | 0.2972 | 0.1094 | 0.0392 |
3.1 | 0.1189 | 0.0436 | 0.3223 | 0.1283 | 0.0427 |
3.45 | 0.1380 | 0.0535 | 0.3471 | 0.1477 | 0.0460 |
4.6 | 0.1602 | 0.0653 | 0.3844 | 0.1728 | 0.0510 |
6.1 | 0.1887 | 0.0791 | 0.4349 | 0.2046 | 0.0581 |
8.75 | 0.2488 | 0.0884 | 0.6391 | 0.2930 | 0.0980 |
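The first few Kaplan-Meier values can be reproduced by hand from the fan data, which helps in reading the table. The sketch below assumes the standard product-limit calculation, with tied failures listed one row per failure; the remark about the Cumulative Probability column is an observation from the printed numbers rather than a statement of the procedure's documented rule.

```latex
% 70 fans. One failure at 0.45 (70 at risk); one unit censored at 0.46;
% two failures at 1.15 (68 at risk), listed one row per failure.
\hat F(0.45) = 1 - \tfrac{69}{70} \approx 0.0143

\hat F(1.15) = 1 - \tfrac{69}{70}\cdot\tfrac{67}{68} \approx 0.0288
\quad\text{then}\quad
1 - \tfrac{69}{70}\cdot\tfrac{67}{68}\cdot\tfrac{66}{67} \approx 0.0433

% The Cumulative Probability column appears to lie midway between
% successive Kaplan-Meier values:
\tfrac{0 + 0.0143}{2} \approx 0.0071,
\qquad
\tfrac{0.0143 + 0.0288}{2} \approx 0.0215
```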
Lower Endpoint | Upper Endpoint | Number Failed |
---|---|---|
. | 6 | 6 |
6 | 12 | 2 |
24 | 48 | 2 |
24 | . | 1 |
48 | 168 | 1 |
48 | . | 839 |
168 | 500 | 1 |
168 | . | 150 |
500 | 1000 | 2 |
500 | . | 149 |
1000 | 2000 | 1 |
1000 | . | 147 |
2000 | . | 122 |
The following SAS program will compute the Turnbull estimate and create a lognormal probability plot.
    data micro;
       input t1 t2 f;
       datalines;
    .    6    6
    6    12   2
    12   24   0
    24   48   2
    24   .    1
    48   168  1
    48   .    839
    168  500  1
    168  .    150
    500  1000 2
    500  .    149
    1000 2000 1
    1000 .    147
    2000 .    122
    ;

    symbol v=dot c=white;

    proc lifereg data=micro;
       model (t1, t2) = / d=lognormal intercept=25 scale=5;
       weight f;
       probplot cframe = ligr
                cfit = blue
                pupper = 10
                itprintem
                printprobs
                maxitem = (1000,25)
                ppout;
       inset / cfill = white;
    run;
The two initial values INTERCEPT= 25 and SCALE= 5 in the MODEL statement are used to aid convergence in the model-fitting algorithm.
The following tables are created by the PROBPLOT statement in addition to the standard tabular output from the MODEL statement. Output 39.6.1 shows the iteration history for the Turnbull estimate of the CDF for the microprocessor data. With both options ITPRINTEM and PRINTPROBS specified in the PROBPLOT statement, this table contains the log likelihoods and interval probabilities for every 25th iteration and the last iteration. It would only contain the log likelihoods if the option PRINTPROBS were not specified.
The LIFEREG Procedure

Iteration History for the Turnbull Estimate of the CDF

Iteration | Loglikelihood | (., 6) | (6, 12) | (24, 48) | (48, 168) | (168, 500) | (500, 1000) | (1000, 2000) | (2000, .) |
---|---|---|---|---|---|---|---|---|---|
0 | -1133.4051 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 |
25 | -104.16622 | 0.00421644 | 0.00140548 | 0.00140648 | 0.00173338 | 0.00237846 | 0.00846094 | 0.04565407 | 0.93474475 |
50 | -101.15151 | 0.00421644 | 0.00140548 | 0.00140648 | 0.00173293 | 0.00234891 | 0.00727679 | 0.01174486 | 0.96986811 |
75 | -101.06641 | 0.00421644 | 0.00140548 | 0.00140648 | 0.00173293 | 0.00234891 | 0.00727127 | 0.00835638 | 0.9732621 |
100 | -101.06534 | 0.00421644 | 0.00140548 | 0.00140648 | 0.00173293 | 0.00234891 | 0.00727125 | 0.00801814 | 0.97360037 |
125 | -101.06533 | 0.00421644 | 0.00140548 | 0.00140648 | 0.00173293 | 0.00234891 | 0.00727125 | 0.00798438 | 0.97363413 |
130 | -101.06533 | 0.00421644 | 0.00140548 | 0.00140648 | 0.00173293 | 0.00234891 | 0.00727125 | 0.007983 | 0.97363551 |
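The starting log likelihood in the iteration history can be reproduced from the micro data set. The sketch below assumes that the Turnbull log likelihood is the weighted sum, over observations, of the log of the total probability of the eight intervals contained in each observed interval, with every interval starting at probability 1/8. The array names and the helper variables k, inside_lo, and inside_hi are hypothetical.

```sas
/* Sketch: Turnbull log likelihood at the equal-probability start. */
/* LO and HI hold the eight interval endpoints; . means unbounded. */
data _null_;
   set micro end=last;
   array lo[8] _temporary_ (.  6  24  48 168  500 1000 2000);
   array hi[8] _temporary_ (6 12  48 168 500 1000 2000    .);
   if f > 0 then do;
      k = 0;                          /* intervals inside (t1, t2) */
      do j = 1 to 8;
         inside_lo = missing(t1) or (not missing(lo[j]) and lo[j] >= t1);
         inside_hi = missing(t2) or (not missing(hi[j]) and hi[j] <= t2);
         if inside_lo and inside_hi then k + 1;
      end;
      loglik + f * log(k / 8);        /* weighted contribution, accumulated */
   end;
   if last then put loglik=;
run;
```

A hand calculation along the same lines gives about -1133.4, which can be compared with the magnitude reported in the first row of the iteration history.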
The LIFEREG Procedure

Lower Lifetime | Upper Lifetime | Probability | Reduced Gradient | Lagrange Multiplier |
---|---|---|---|---|
. | 6 | 0.0042 | 0 | 0 |
6 | 12 | 0.0014 | 0 | 0 |
24 | 48 | 0.0014 | 0 | 0 |
48 | 168 | 0.0017 | 0 | 0 |
168 | 500 | 0.0023 | 0 | 0 |
500 | 1000 | 0.0073 | -7.219342E-9 | 0 |
1000 | 2000 | 0.0080 | -0.037063236 | 0 |
2000 | . | 0.9736 | 0.0003038877 | 0 |
The LIFEREG Procedure

Cumulative Probability Estimates

Lower Lifetime | Upper Lifetime | Cumulative Probability | Pointwise 95% CL Lower | Pointwise 95% CL Upper | Standard Error |
---|---|---|---|---|---|
6 | 6 | 0.0042 | 0.0019 | 0.0094 | 0.0017 |
12 | 24 | 0.0056 | 0.0028 | 0.0112 | 0.0020 |
48 | 48 | 0.0070 | 0.0038 | 0.0130 | 0.0022 |
168 | 168 | 0.0088 | 0.0047 | 0.0164 | 0.0028 |
500 | 500 | 0.0111 | 0.0058 | 0.0211 | 0.0037 |
1000 | 1000 | 0.0184 | 0.0094 | 0.0357 | 0.0063 |
2000 | 2000 | 0.0264 | 0.0124 | 0.0553 | 0.0101 |
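The Cumulative Probability column is the running sum of the interval probabilities from the final iteration of the Turnbull algorithm. The following sketch accumulates those probabilities, typed in by hand from the last row of the iteration history; the data set cdf_check and its variables are hypothetical names.

```sas
/* Sketch: running total of the final Turnbull interval probabilities. */
/* CDF_CHECK, UPPER, PROB, and CUMPROB are hypothetical names.         */
data cdf_check;
   input upper prob;
   cumprob + prob;          /* sum statement: retained running total */
   format cumprob 6.4;
   datalines;
6    0.00421644
12   0.00140548
48   0.00140648
168  0.00173293
500  0.00234891
1000 0.00727125
2000 0.007983
;
run;

proc print data=cdf_check noobs;
run;
```

Rounded to four decimals, the running totals are 0.0042, 0.0056, 0.0070, 0.0088, 0.0111, 0.0184, and 0.0264, which agree with the table.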