This example uses the data presented in Appendix I of Kalbfleisch and Prentice (1980). The response variable, SurvTime, is the survival time in days of a lung cancer patient. Negative values of SurvTime are censored values. The covariates are Cell (type of cancer cell), Therapy (type of therapy: standard or test), Prior (prior therapy: 0=no, 10=yes), Age (age in years), DiagTime (time in months from diagnosis to entry into the trial), and Kps (performance status). A censoring indicator variable, Censor, is created from the data, with value 1 indicating a censored time and value 0 an event time. Since there are only two types of therapy, an indicator variable, Treatment, is constructed for therapy type, with value 0 for standard therapy and value 1 for test therapy.
   data VALung;
      drop check m;
      retain Therapy Cell;
      infile cards column=column;
      length Check $ 1;
      label SurvTime='failure or censoring time'
            Kps='karnofsky index'
            DiagTime='months till randomization'
            Age='age in years'
            Prior='prior treatment?'
            Cell='cell type'
            Therapy='type of treatment'
            Treatment='treatment indicator';
      M=Column;
      input Check $ @@;
      if M>Column then M=1;
      /* a line that starts with s(tandard) or t(est) names the therapy and cell type */
      if Check='s'|Check='t' then input @M Therapy $ Cell $ ;
      else input @M SurvTime Kps DiagTime Age Prior @@;
      if SurvTime > .;
      /* negative survival times are censored times */
      censor=(SurvTime<0);
      SurvTime=abs(SurvTime);
      Treatment=(Therapy='test');
      cards;
   standard squamous
   72 60 7 69 0   411 70 5 64 10   228 60 3 38 0   126 60 9 63 10
   118 70 11 65 10   10 20 5 49 0   82 40 10 69 10   110 80 29 68 0
   314 50 18 43 0   -100 70 6 70 0   42 60 4 81 0   8 40 58 63 10
   144 30 4 63 0   -25 80 9 52 10   11 70 11 48 10
   standard small
   30 60 3 61 0   384 60 9 42 0   4 40 2 35 0   54 80 4 63 10
   13 60 4 56 0   -123 40 3 55 0   -97 60 5 67 0   153 60 14 63 10
   59 30 2 65 0   117 80 3 46 0   16 30 4 53 10   151 50 12 69 0
   22 60 4 68 0   56 80 12 43 10   21 40 2 55 10   18 20 15 42 0
   139 80 2 64 0   20 30 5 65 0   31 75 3 65 0   52 70 2 55 0
   287 60 25 66 10   18 30 4 60 0   51 60 1 67 0   122 80 28 53 0
   27 60 8 62 0   54 70 1 67 0   7 50 7 72 0   63 50 11 48 0
   392 40 4 68 0   10 40 23 67 10
   standard adeno
   8 20 19 61 10   92 70 10 60 0   35 40 6 62 0   117 80 2 38 0
   132 80 5 50 0   12 50 4 63 10   162 80 5 64 0   3 30 3 43 0
   95 80 4 34 0
   standard large
   177 50 16 66 10   162 80 5 62 0   216 50 15 52 0   553 70 2 47 0
   278 60 12 63 0   12 40 12 68 10   260 80 5 45 0   200 80 12 41 10
   156 70 2 66 0   -182 90 2 62 0   143 90 8 60 0   105 80 11 66 0
   103 80 5 38 0   250 70 8 53 10   100 60 13 37 10
   test squamous
   999 90 12 54 10   112 80 6 60 0   -87 80 3 48 0   -231 50 8 52 10
   242 50 1 70 0   991 70 7 50 10   111 70 3 62 0   1 20 21 65 10
   587 60 3 58 0   389 90 2 62 0   33 30 6 64 0   25 20 36 63 0
   357 70 13 58 0   467 90 2 64 0   201 80 28 52 10   1 50 7 35 0
   30 70 11 63 0   44 60 13 70 10   283 90 2 51 0   15 50 13 40 10
   test small
   25 30 2 69 0   -103 70 22 36 10   21 20 4 71 0   13 30 2 62 0
   87 60 2 60 0   2 40 36 44 10   20 30 9 54 10   7 20 11 66 0
   24 60 8 49 0   99 70 3 72 0   8 80 2 68 0   99 85 4 62 0
   61 70 2 71 0   25 70 2 70 0   95 70 1 61 0   80 50 17 71 0
   51 30 87 59 10   29 40 8 67 0
   test adeno
   24 40 2 60 0   18 40 5 69 10   -83 99 3 57 0   31 80 3 39 0
   51 60 5 62 0   90 60 22 50 10   52 60 3 43 0   73 60 3 70 0
   8 50 5 66 0   36 70 8 61 0   48 10 4 81 0   7 40 4 58 0
   140 70 3 63 0   186 90 3 60 0   84 80 4 62 10   19 50 10 42 0
   45 40 3 69 0   80 40 4 63 0
   test large
   52 60 4 45 0   164 70 15 68 10   19 30 4 39 10   53 60 12 66 0
   15 30 5 63 0   43 60 11 49 10   340 80 10 64 10   133 75 1 65 0
   111 60 5 64 0   231 70 18 67 10   378 80 4 65 0   49 30 3 37 0
   ;
PROC LIFETEST is invoked to compute the product-limit estimate of the survivor function for each type of cancer cell and to analyze the effects of the variables Age, Prior, DiagTime, Kps, and Treatment on the survival of the patients. These prognostic factors are specified in the TEST statement, and the variable Cell is specified in the STRATA statement. Traditional high-resolution graphs of the product-limit estimates, the log estimates, and the negative log-log estimates are requested through the PLOTS= option in the PROC LIFETEST statement. Because of a few large survival times, MAXTIME=600 is used to set the scale of the time axis; that is, the time scale extends from 0 to a maximum of 600 days in the plots. The variable Therapy is specified in the ID statement to identify the type of therapy for each observation in the product-limit estimates. The OUTTEST= option creates an output data set named Test that contains the rank test matrices for the covariates.
   symbol1 c=blue; symbol2 c=orange; symbol3 c=green;
   symbol4 c=red; symbol5 c=cyan; symbol6 c=black;
   title 'VA Lung Cancer Data';
   proc lifetest data=VALung plots=(s,ls,lls) outtest=Test maxtime=600;
      time SurvTime*Censor(1);
      id Therapy;
      strata Cell;
      test Age Prior DiagTime Kps Treatment;
   run;
Output 40.1.1 through Output 40.1.4 display the product-limit estimates of the survivor functions for the four cell types. Summary statistics of the survival times are also shown. The median survival times are 51 days, 156 days, 51 days, and 118 days for patients with adeno cells, large cells, small cells, and squamous cells, respectively.
Stratum 1: Cell = adeno

Product-Limit Survival Estimates

                               Survival
                               Standard  Number  Number
SurvTime   Survival   Failure     Error  Failed    Left    Therapy
   0.000     1.0000         0         0       0      27
   3.000     0.9630    0.0370    0.0363       1      26    standard
   7.000     0.9259    0.0741    0.0504       2      25    test
   8.000          .         .         .       3      24    standard
   8.000     0.8519    0.1481    0.0684       4      23    test
  12.000     0.8148    0.1852    0.0748       5      22    standard
  18.000     0.7778    0.2222    0.0800       6      21    test
  19.000     0.7407    0.2593    0.0843       7      20    test
  24.000     0.7037    0.2963    0.0879       8      19    test
  31.000     0.6667    0.3333    0.0907       9      18    test
  35.000     0.6296    0.3704    0.0929      10      17    standard
  36.000     0.5926    0.4074    0.0946      11      16    test
  45.000     0.5556    0.4444    0.0956      12      15    test
  48.000     0.5185    0.4815    0.0962      13      14    test
  51.000     0.4815    0.5185    0.0962      14      13    test
  52.000     0.4444    0.5556    0.0956      15      12    test
  73.000     0.4074    0.5926    0.0946      16      11    test
  80.000     0.3704    0.6296    0.0929      17      10    test
  83.000*         .         .         .      17       9    test
  84.000     0.3292    0.6708    0.0913      18       8    test
  90.000     0.2881    0.7119    0.0887      19       7    test
  92.000     0.2469    0.7531    0.0850      20       6    standard
  95.000     0.2058    0.7942    0.0802      21       5    standard
 117.000     0.1646    0.8354    0.0740      22       4    standard
 132.000     0.1235    0.8765    0.0659      23       3    standard
 140.000     0.0823    0.9177    0.0553      24       2    test
 162.000     0.0412    0.9588    0.0401      25       1    standard
 186.000          0    1.0000         0      26       0    test

NOTE: The marked survival times are censored observations.

Quartile Estimates

              Point     95% Confidence Interval
Percent    Estimate     [Lower        Upper)
     75      92.000     73.000       140.000
     50      51.000     31.000        90.000
     25      19.000      8.000        45.000

   Mean    Standard Error
 65.556            10.127
Stratum 2: Cell = large

Product-Limit Survival Estimates

                               Survival
                               Standard  Number  Number
SurvTime   Survival   Failure     Error  Failed    Left    Therapy
   0.000     1.0000         0         0       0      27
  12.000     0.9630    0.0370    0.0363       1      26    standard
  15.000     0.9259    0.0741    0.0504       2      25    test
  19.000     0.8889    0.1111    0.0605       3      24    test
  43.000     0.8519    0.1481    0.0684       4      23    test
  49.000     0.8148    0.1852    0.0748       5      22    test
  52.000     0.7778    0.2222    0.0800       6      21    test
  53.000     0.7407    0.2593    0.0843       7      20    test
 100.000     0.7037    0.2963    0.0879       8      19    standard
 103.000     0.6667    0.3333    0.0907       9      18    standard
 105.000     0.6296    0.3704    0.0929      10      17    standard
 111.000     0.5926    0.4074    0.0946      11      16    test
 133.000     0.5556    0.4444    0.0956      12      15    test
 143.000     0.5185    0.4815    0.0962      13      14    standard
 156.000     0.4815    0.5185    0.0962      14      13    standard
 162.000     0.4444    0.5556    0.0956      15      12    standard
 164.000     0.4074    0.5926    0.0946      16      11    test
 177.000     0.3704    0.6296    0.0929      17      10    standard
 182.000*         .         .         .      17       9    standard
 200.000     0.3292    0.6708    0.0913      18       8    standard
 216.000     0.2881    0.7119    0.0887      19       7    standard
 231.000     0.2469    0.7531    0.0850      20       6    test
 250.000     0.2058    0.7942    0.0802      21       5    standard
 260.000     0.1646    0.8354    0.0740      22       4    standard
 278.000     0.1235    0.8765    0.0659      23       3    standard
 340.000     0.0823    0.9177    0.0553      24       2    test
 378.000     0.0412    0.9588    0.0401      25       1    test
 553.000          0    1.0000         0      26       0    standard

NOTE: The marked survival times are censored observations.

Quartile Estimates

              Point     95% Confidence Interval
Percent    Estimate     [Lower        Upper)
     75     231.000    164.000       340.000
     50     156.000    103.000       216.000
     25      53.000     43.000       133.000

   Mean    Standard Error
170.506            25.098
Stratum 3: Cell = small

Product-Limit Survival Estimates

                               Survival
                               Standard  Number  Number
SurvTime   Survival   Failure     Error  Failed    Left    Therapy
   0.000     1.0000         0         0       0      48
   2.000     0.9792    0.0208    0.0206       1      47    test
   4.000     0.9583    0.0417    0.0288       2      46    standard
   7.000          .         .         .       3      45    standard
   7.000     0.9167    0.0833    0.0399       4      44    test
   8.000     0.8958    0.1042    0.0441       5      43    test
  10.000     0.8750    0.1250    0.0477       6      42    standard
  13.000          .         .         .       7      41    standard
  13.000     0.8333    0.1667    0.0538       8      40    test
  16.000     0.8125    0.1875    0.0563       9      39    standard
  18.000          .         .         .      10      38    standard
  18.000     0.7708    0.2292    0.0607      11      37    standard
  20.000          .         .         .      12      36    standard
  20.000     0.7292    0.2708    0.0641      13      35    test
  21.000          .         .         .      14      34    standard
  21.000     0.6875    0.3125    0.0669      15      33    test
  22.000     0.6667    0.3333    0.0680      16      32    standard
  24.000     0.6458    0.3542    0.0690      17      31    test
  25.000          .         .         .      18      30    test
  25.000     0.6042    0.3958    0.0706      19      29    test
  27.000     0.5833    0.4167    0.0712      20      28    standard
  29.000     0.5625    0.4375    0.0716      21      27    test
  30.000     0.5417    0.4583    0.0719      22      26    standard
  31.000     0.5208    0.4792    0.0721      23      25    standard
  51.000          .         .         .      24      24    standard
  51.000     0.4792    0.5208    0.0721      25      23    test
  52.000     0.4583    0.5417    0.0719      26      22    standard
  54.000          .         .         .      27      21    standard
  54.000     0.4167    0.5833    0.0712      28      20    standard
  56.000     0.3958    0.6042    0.0706      29      19    standard
  59.000     0.3750    0.6250    0.0699      30      18    standard
  61.000     0.3542    0.6458    0.0690      31      17    test
  63.000     0.3333    0.6667    0.0680      32      16    standard
  80.000     0.3125    0.6875    0.0669      33      15    test
  87.000     0.2917    0.7083    0.0656      34      14    test
  95.000     0.2708    0.7292    0.0641      35      13    test
  97.000*         .         .         .      35      12    standard
  99.000          .         .         .      36      11    test
  99.000     0.2257    0.7743    0.0609      37      10    test
 103.000*         .         .         .      37       9    test
 117.000     0.2006    0.7994    0.0591      38       8    standard
 122.000     0.1755    0.8245    0.0567      39       7    standard
 123.000*         .         .         .      39       6    standard
 139.000     0.1463    0.8537    0.0543      40       5    standard
 151.000     0.1170    0.8830    0.0507      41       4    standard
 153.000     0.0878    0.9122    0.0457      42       3    standard
 287.000     0.0585    0.9415    0.0387      43       2    standard
 384.000     0.0293    0.9707    0.0283      44       1    standard
 392.000          0    1.0000         0      45       0    standard

NOTE: The marked survival times are censored observations.

Quartile Estimates

              Point     95% Confidence Interval
Percent    Estimate     [Lower        Upper)
     75      99.000     59.000       151.000
     50      51.000     25.000        61.000
     25      20.000     13.000        25.000

   Mean    Standard Error
 78.981            14.837
Stratum 4: Cell = squamous

Product-Limit Survival Estimates

                               Survival
                               Standard  Number  Number
SurvTime   Survival   Failure     Error  Failed    Left    Therapy
   0.000     1.0000         0         0       0      35
   1.000          .         .         .       1      34    test
   1.000     0.9429    0.0571    0.0392       2      33    test
   8.000     0.9143    0.0857    0.0473       3      32    standard
  10.000     0.8857    0.1143    0.0538       4      31    standard
  11.000     0.8571    0.1429    0.0591       5      30    standard
  15.000     0.8286    0.1714    0.0637       6      29    test
  25.000     0.8000    0.2000    0.0676       7      28    test
  25.000*         .         .         .       7      27    standard
  30.000     0.7704    0.2296    0.0713       8      26    test
  33.000     0.7407    0.2593    0.0745       9      25    test
  42.000     0.7111    0.2889    0.0772      10      24    standard
  44.000     0.6815    0.3185    0.0794      11      23    test
  72.000     0.6519    0.3481    0.0813      12      22    standard
  82.000     0.6222    0.3778    0.0828      13      21    standard
  87.000*         .         .         .      13      20    test
 100.000*         .         .         .      13      19    standard
 110.000     0.5895    0.4105    0.0847      14      18    standard
 111.000     0.5567    0.4433    0.0861      15      17    test
 112.000     0.5240    0.4760    0.0870      16      16    test
 118.000     0.4912    0.5088    0.0875      17      15    standard
 126.000     0.4585    0.5415    0.0876      18      14    standard
 144.000     0.4257    0.5743    0.0873      19      13    standard
 201.000     0.3930    0.6070    0.0865      20      12    test
 228.000     0.3602    0.6398    0.0852      21      11    standard
 231.000*         .         .         .      21      10    test
 242.000     0.3242    0.6758    0.0840      22       9    test
 283.000     0.2882    0.7118    0.0820      23       8    test
 314.000     0.2522    0.7478    0.0793      24       7    standard
 357.000     0.2161    0.7839    0.0757      25       6    test
 389.000     0.1801    0.8199    0.0711      26       5    test
 411.000     0.1441    0.8559    0.0654      27       4    standard
 467.000     0.1081    0.8919    0.0581      28       3    test
 587.000     0.0720    0.9280    0.0487      29       2    test
 991.000     0.0360    0.9640    0.0352      30       1    test
 999.000          0    1.0000         0      31       0    test

NOTE: The marked survival times are censored observations.

Quartile Estimates

              Point     95% Confidence Interval
Percent    Estimate     [Lower        Upper)
     75     357.000    201.000       467.000
     50     118.000     72.000       242.000
     25      33.000     11.000       111.000

   Mean    Standard Error
230.225            48.475
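As a check on how the entries in these tables arise, the following worked step recomputes two values in the adeno stratum. It assumes only the standard product-limit recursion; the symbols d_i (number failing at time t_i) and n_i (number at risk just before t_i) are notation introduced here for illustration and do not appear in the procedure output.

```latex
% Product-limit recursion: each event time scales the previous estimate
\hat S(t_i) = \hat S(t_{i-1}) \left( 1 - \frac{d_i}{n_i} \right)

% Two failures at t = 8 with n_i = 25 at risk (Number Left after t = 7)
\hat S(8) = 0.9259 \times \left( 1 - \tfrac{2}{25} \right) \approx 0.8519

% The censored time 83* removes one patient from the risk set without
% changing the estimate, so n_i = 9 and d_i = 1 at t = 84
\hat S(84) = 0.3704 \times \left( 1 - \tfrac{1}{9} \right) \approx 0.3292
```

The same recursion explains why rows for tied or censored times show missing values: the estimate is reported only on the last row for each distinct event time.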
The distribution of event and censored observations among the four cell types is summarized in Output 40.1.5.
Summary of the Number of Censored and Uncensored Values

                                                  Percent
Stratum    Cell         Total   Failed  Censored  Censored
      1    adeno           27       26         1      3.70
      2    large           27       26         1      3.70
      3    small           48       45         3      6.25
      4    squamous        35       31         4     11.43
---------------------------------------------------------------
  Total                   137      128         9      6.57
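If you want to cross-check this summary against the input data, a short PROC FREQ step is one option. The sketch below is not part of the original example; it assumes the VALung data set and the Censor variable created in the DATA step above, and that the censored times were read as negative values as described.

```sas
/* Cross-tabulate cell type by the censoring indicator.            */
/* Censor=1 marks a censored time and Censor=0 an event time, so   */
/* the row percentages for Censor=1 correspond to the              */
/* "Percent Censored" column of the summary table.                 */
proc freq data=VALung;
   tables Cell*Censor / nocol nopercent;
run;
```

The NOCOL and NOPERCENT options suppress the column and overall percentages, leaving only the counts and row percentages for each cell type.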
The graph of the negative log of the estimated survivor functions is displayed in Output 40.1.7. Output 40.1.8 displays the log of the negative log of the estimated survivor functions against the log of time.
Results of the homogeneity tests across cell types are given in Output 40.1.9. The log-rank and Wilcoxon statistics and their corresponding covariance matrices are displayed. Also given is a table that consists of the approximate chi-square statistics, degrees of freedom, and p-values for the log-rank, Wilcoxon, and likelihood ratio tests. All three tests give strong evidence of a difference among the survival curves for the four types of cancer cells (p < 0.001).
Rank Statistics

Cell          Log-Rank    Wilcoxon
adeno           10.306       697.0
large           -8.549     -1085.0
small           14.898      1278.0
squamous       -16.655      -890.0

Covariance Matrix for the Log-Rank Statistics

Cell            adeno       large       small    squamous
adeno         12.9662     -4.0701     -4.4087     -4.4873
large         -4.0701     24.1990     -7.8117    -12.3172
small         -4.4087     -7.8117     21.7543     -9.5339
squamous      -4.4873    -12.3172     -9.5339     26.3384

Covariance Matrix for the Wilcoxon Statistics

Cell            adeno       large       small    squamous
adeno          121188      -34718      -46639      -39831
large          -34718      151241      -59948      -56576
small          -46639      -59948      175590      -69002
squamous       -39831      -56576      -69002      165410

Test of Equality over Strata

                                     Pr >
Test         Chi-Square    DF    Chi-Square
Log-Rank        25.4037     3        <.0001
Wilcoxon        19.4331     3        0.0002
-2Log(LR)       33.9343     3        <.0001
Results of the log-rank test of the prognostic variables are shown in Output 40.1.10. The univariate test results correspond to testing each prognostic factor marginally. The joint covariance matrix of these univariate test statistics is also displayed. In computing the overall chi-square statistic, the partial chi-square statistics following a forward stepwise entry approach are tabulated.
Univariate Chi-Squares for the Log-Rank Test

                 Test    Standard                    Pr >
Variable    Statistic   Deviation   Chi-Square   Chi-Square   Label
Age           40.7383       105.7       0.1485       0.7000   age in years
Prior         19.9435     46.9836       0.1802       0.6712   prior treatment?
DiagTime        115.9     97.8708       1.4013       0.2365   months till randomization
Kps            1123.1       170.3      43.4747       <.0001   karnofsky index
Treatment      4.2076      5.0407       0.6967       0.4039   treatment indicator

Covariance Matrix for the Log-Rank Statistics

Variable         Age      Prior   DiagTime        Kps   Treatment
Age          11175.4      301.2      892.2     2948.4       119.3
Prior          301.2     2207.5     2010.9       78.6        13.9
DiagTime       892.2     2010.9     9578.7     2295.3        21.9
Kps           2948.4       78.6     2295.3    29015.6        61.9
Treatment      119.3       13.9       21.9       61.9        25.4

Forward Stepwise Sequence of Chi-Squares for the Log-Rank Test

                                       Pr >    Chi-Square        Pr >
Variable     DF   Chi-Square     Chi-Square     Increment   Increment   Label
Kps           1      43.4747         <.0001       43.4747      <.0001   karnofsky index
Treatment     2      45.2008         <.0001        1.7261      0.1889   treatment indicator
Age           3      46.3012         <.0001        1.1004      0.2942   age in years
Prior         4      46.4134         <.0001        0.1122      0.7377   prior treatment?
DiagTime      5      46.4200         <.0001       0.00665      0.9350   months till randomization
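The columns of these tables are linked by simple arithmetic, which the following worked lines illustrate. This is a sketch that assumes the usual form of a one-degree-of-freedom rank test (squared statistic divided by its variance); the symbols U and V are notation introduced here, and small discrepancies reflect rounding in the printed values.

```latex
% Univariate chi-square: squared statistic over its variance
% (the variance is the matching diagonal entry of the covariance matrix)
\chi^2_{\text{Kps}} = \frac{U^2}{V} = \frac{1123.1^2}{29015.6} \approx 43.47
\qquad
\chi^2_{\text{Treatment}} = \frac{4.2076^2}{25.4} \approx 0.70

% Forward stepwise increment: difference between successive joint chi-squares
45.2008 - 43.4747 = 1.7261
\qquad
46.3012 - 45.2008 = 1.1004
```

Note that Treatment contributes an increment of 1.7261 after Kps has entered, which is larger than its univariate chi-square of 0.6967; the increment measures the contribution of a variable adjusted for those already in the sequence.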
You can establish this forward stepwise entry of prognostic factors by passing the matrix corresponding to the log-rank test to the RSQUARE method in the REG procedure. PROC REG finds the sets of variables that yield the largest chi-square statistics.
   /* keep the log-rank rows of the OUTTEST= data set and relabel them as a covariance matrix */
   data RSq;
      set Test;
      if _type_='LOG RANK';
      _type_='cov';
   proc print data=RSq;
   proc reg data=RSq(type=COV);
      model SurvTime=Age Prior DiagTime Kps Treatment
            / selection=rsquare;
      title 'All Possible Subsets of Covariates for the log-rank Test';
   run;
Output 40.1.11 displays the univariate statistics and their covariance matrix for the log-rank test.
Obs   _TYPE_   _NAME_      SurvTime        Age      Prior   DiagTime        Kps   Treatment
  1   cov      SurvTime       46.42      40.74      19.94     115.86    1123.14       4.208
  2   cov      Age            40.74   11175.44     301.23     892.24    2948.45     119.297
  3   cov      Prior          19.94     301.23    2207.46    2010.85      78.64      13.875
  4   cov      DiagTime      115.86     892.24    2010.85    9578.69    2295.32      21.859
  5   cov      Kps          1123.14    2948.45      78.64   -2295.32   29015.62      61.945
  6   cov      Treatment       4.21     119.30      13.87      21.86      61.95      25.409
Results of the best subset regression are shown in Output 40.1.12. The variable Kps generates the largest univariate test statistic among all the covariates, the pair Kps and Treatment generates the largest test statistic among all pairs of covariates, and so on. The entry order of covariates is identical to that of PROC LIFETEST.
All Possible Subsets of Covariates for the log-rank Test

The REG Procedure
Model: MODEL1
Dependent Variable: SurvTime

R-Square Selection Method

Number in
    Model    R-Square    Variables in Model
        1      0.9366    Kps
        1      0.0302    DiagTime
        1      0.0150    Treatment
        1      0.0039    Prior
        1      0.0032    Age
----------------------------------------------------------
        2      0.9737    Kps Treatment
        2      0.9472    Age Kps
        2      0.9417    Prior Kps
        2      0.9382    DiagTime Kps
        2      0.0434    DiagTime Treatment
        2      0.0353    Age DiagTime
        2      0.0304    Prior DiagTime
        2      0.0181    Prior Treatment
        2      0.0159    Age Treatment
        2      0.0075    Age Prior
----------------------------------------------------------
        3      0.9974    Age Kps Treatment
        3      0.9774    Prior Kps Treatment
        3      0.9747    DiagTime Kps Treatment
        3      0.9515    Age Prior Kps
        3      0.9481    Age DiagTime Kps
        3      0.9418    Prior DiagTime Kps
        3      0.0456    Age DiagTime Treatment
        3      0.0438    Prior DiagTime Treatment
        3      0.0355    Age Prior DiagTime
        3      0.0192    Age Prior Treatment
----------------------------------------------------------
        4      0.9999    Age Prior Kps Treatment
        4      0.9976    Age DiagTime Kps Treatment
        4      0.9774    Prior DiagTime Kps Treatment
        4      0.9515    Age Prior DiagTime Kps
        4      0.0459    Age Prior DiagTime Treatment
----------------------------------------------------------
        5      1.0000    Age Prior DiagTime Kps Treatment
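One way to read the R-Square column is as the fraction of the overall chi-square (46.42, the SurvTime-by-SurvTime entry of the RSq data set) that a subset of covariates attains. The following DATA step is a minimal sketch of that conversion and is not part of the original example; the variable names total, r2, and chisq are hypothetical, and the R-square values are copied by hand from the best subset of each size.

```sas
/* Convert the best R-square of each model size back to a chi-square. */
/* total is the overall 5-DF log-rank chi-square from PROC LIFETEST.  */
data _null_;
   total = 46.42;
   do r2 = 0.9366, 0.9737, 0.9974, 0.9999, 1.0000;
      chisq = r2 * total;          /* subset chi-square, up to rounding of r2 */
      put r2= 6.4 chisq= 8.4;
   end;
run;
```

The values written to the log should agree, within the rounding of the four-digit R-square values, with the forward stepwise chi-squares reported by PROC LIFETEST (43.4747, 45.2008, 46.3012, 46.4134, and 46.4200).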
This example uses the data of 137 bone marrow transplant patients extracted from Klein and Moeschberger (1997). At the time of transplant, each patient is classified into one of three risk categories: ALL (Acute Lymphoblastic Leukemia), low-risk AML (Acute Myelocytic Leukemia), and high-risk AML. The endpoint of interest is the disease-free survival, which is the time in days to death, relapse, or the end of the study. The data are saved in the SAS data set BMT. In this data set, the variable Group represents the patient's risk category, the variable T represents the disease-free survival time, and the variable Status is the censoring indicator, with value 1 indicating an event time and value 0 a censored time.
   proc format;
      value risk 1='ALL' 2='low-risk AML' 3='high-risk AML';

   data BMT;
      input Group T Status @@;
      format Group risk.;
      label T='Time to Relapse';
      datalines;
   1 2081 0   1 1602 0   1 1496 0   1 1462 0   1 1433 0
   1 1377 0   1 1330 0   1  996 0   1  226 0   1 1199 0
   1 1111 0   1  530 0   1 1182 0   1 1167 0   1  418 1
   1  383 1   1  276 1   1  104 1   1  609 1   1  172 1
   1  487 1   1  662 1   1  194 1   1  230 1   1  526 1
   1  122 1   1  129 1   1   74 1   1  122 1   1   86 1
   1  466 1   1  192 1   1  109 1   1   55 1   1    1 1
   1  107 1   1  110 1   1  332 1
   2 2569 0   2 2506 0   2 2409 0   2 2218 0   2 1857 0
   2 1829 0   2 1562 0   2 1470 0   2 1363 0   2 1030 0
   2  860 0   2 1258 0   2 2246 0   2 1870 0   2 1799 0
   2 1709 0   2 1674 0   2 1568 0   2 1527 0   2 1324 0
   2  957 0   2  932 0   2  847 0   2  848 0   2 1850 0
   2 1843 0   2 1535 0   2 1447 0   2 1384 0   2  414 1
   2 2204 1   2 1063 1   2  481 1   2  105 1   2  641 1
   2  390 1   2  288 1   2  421 1   2   79 1   2  748 1
   2  486 1   2   48 1   2  272 1   2 1074 1   2  381 1
   2   10 1   2   53 1   2   80 1   2   35 1   2  248 1
   2  704 1   2  211 1   2  219 1   2  606 1
   3 2640 0   3 2430 0   3 2252 0   3 2140 0   3 2133 0
   3 1238 0   3 1631 0   3 2024 0   3 1345 0   3 1136 0
   3  845 0   3  422 1   3  162 1   3   84 1   3  100 1
   3    2 1   3   47 1   3  242 1   3  456 1   3  268 1
   3  318 1   3   32 1   3  467 1   3   47 1   3  390 1
   3  183 1   3  105 1   3  115 1   3  164 1   3   93 1
   3  120 1   3   80 1   3  677 1   3   64 1   3  168 1
   3   74 1   3   16 1   3  157 1   3  625 1   3   48 1
   3  273 1   3   63 1   3   76 1   3  113 1   3  363 1
   ;
Klein and Moeschberger (1997, Section 4.4) describe in detail how to compute the Hall-Wellner (HW) and equal precision (EP) confidence bands. You can use the SURVIVAL statement in PROC LIFETEST to obtain these confidence bands. In the following code, PROC LIFETEST is invoked to compute the product-limit estimates of the disease-free survival. The SURVIVAL statement creates an output SAS data set (named Out1) that contains the survival function estimates and plots them with the experimental ODS graphics. To obtain both the HW and EP confidence bands in the OUT= data set, you specify the CONFBAND=ALL option. The BANDMIN=100 and BANDMAX=600 options restrict the confidence bands for the survivor function S(t) to the range 100 ≤ t ≤ 600. The CONFTYPE=ASINSQRT option applies the arcsine-square root transform to the survivor function in computing the pointwise confidence intervals and the confidence bands. The experimental ODS GRAPHICS statement is specified to display the graphics using ODS. The plots to be displayed are specified by the PLOTS=(STRATUM, SURVIVAL, HWB) option, which requests a panel of plots for each stratum, a plot of the survivor function estimates for all strata, and a plot of the Hall-Wellner bands for all strata. Since most of the events occur within 800 days, MAXTIME=800 is specified to restrict the display to that time range.
   ods html;
   ods graphics on;

   proc lifetest data=BMT noprint;
      time T * Status(0);
      survival out=Out1 confband=all bandmin=100 bandmax=600 maxtime=800
               conftype=asinsqrt plots=(stratum, survival, hwb);
      strata Group;
   run;

   ods graphics off;
   ods html close;

   proc contents data=Out1;
   run;
The HW confidence bands for disease-free survival are represented by the variables HW_LCL and HW_UCL in the Out1 data set, and the EP confidence bands are represented by the variables EP_LCL and EP_UCL. Other variables in the Out1 data set are shown in the printed output of PROC CONTENTS in Output 40.2.1.
The CONTENTS Procedure

Alphabetic List of Variables and Attributes

 #    Variable    Type    Len    Format    Label
 5    CONFTYPE    Char      8              Transform for Survival Confidence Interval
10    EP_LCL      Num       8              Equal Precision Band Lower 95.00% Limit
11    EP_UCL      Num       8              Equal Precision Band Upper 95.00% Limit
 1    Group       Num       8    RISK.
 8    HW_LCL      Num       8              Hall-Wellner Band Lower 95.00% Limit
 9    HW_UCL      Num       8              Hall-Wellner Band Upper 95.00% Limit
 6    SDF_LCL     Num       8              SDF Lower 95.00% Confidence Limit
 7    SDF_UCL     Num       8              SDF Upper 95.00% Confidence Limit
12    STRATUM     Num       8              Stratum Number
 4    SURVIVAL    Num       8              Survival Distribution Function Estimate
 2    T           Num       8              Time to Relapse
 3    _CENSOR_    Num       8              Censoring Flag: 0=Failed 1=Censored
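To inspect the band limits themselves rather than only the variable list, you could print the relevant columns of Out1. The step below is a sketch that is not part of the original example; it uses only the variable names shown in the PROC CONTENTS listing, and the WHERE range simply mirrors the BANDMIN= and BANDMAX= values.

```sas
/* List the survival estimates with the pointwise limits and both   */
/* sets of band limits for times inside the requested band range.   */
proc print data=Out1 noobs;
   where 100 <= T <= 600;
   var Group T SURVIVAL SDF_LCL SDF_UCL HW_LCL HW_UCL EP_LCL EP_UCL;
run;
```

Comparing SDF_LCL and SDF_UCL with the HW and EP limits in the same row shows how much wider the simultaneous bands are than the pointwise intervals.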
Output 40.2.3 shows a plot of the estimated survival curves for the three leukemia groups. Censored observations are plotted as plus signs. It appears that the low-risk AML patients have the best prognosis and the high-risk AML patients have the worst prognosis, with the ALL patients in between. Output 40.2.4 shows a plot of the Hall-Wellner bands for the three leukemia groups. The band for the ALL patients overlaps with those of the low-risk and high-risk AML patients, but there is very little overlap between the band for the low-risk AML patients and the band for the high-risk AML patients. One would expect the low-risk AML patients to live much longer than the high-risk AML patients.
The graphical displays in Output 40.2.2, Output 40.2.3, and Output 40.2.4 are requested by specifying the experimental ODS GRAPHICS statement and the experimental PLOTS= option in the SURVIVAL statement. For general information about ODS graphics, see Chapter 15, "Statistical Graphics Using ODS." For specific information about the graphics available in the LIFETEST procedure, see the section "ODS Graphics" on page 2190.
The data in this example come from Lee (1992, p. 91) and represent the survival rate of males with angina pectoris. Survival time is measured as years from the time of diagnosis. The data are read as number of events and number of withdrawals in each one-year time interval for 16 intervals. Three variables are constructed from the data: Years (an artificial time variable with values that are the midpoints of the time intervals), Censored (a censoring indicator variable with value 1 indicating censored observations and value 0 indicating event observations), and Freq (the frequency variable). Two observations are created for each interval, one representing the event observations and the other representing the censored observations.
   title 'Survival of Males with Angina Pectoris';

   data males;
      keep Freq Years Censored;
      /* start at -0.5 so that the first value of Years is the midpoint 0.5 */
      retain Years -.5;
      input fail withdraw @@;
      Years + 1;
      Censored=0;
      Freq=fail;
      output;
      Censored=1;
      Freq=withdraw;
      output;
      datalines;
   456   0   226  39   152  22   171  23
   135  24   125 107    83 133    74 102
    51  68    42  64    43  45    34  53
    18  33     9  27     6  23     0  30
   ;
PROC LIFETEST is invoked to compute the various life-table survival estimates, the median residual time, and their standard errors. The life-table method of computing estimates is requested by specifying METHOD=LT. The intervals are specified by the INTERVALS= option. Traditional high-resolution graphs of the life-table estimate, negative log of the estimate, negative log-log of the estimate, estimated density function, and estimated hazard function are requested by the PLOTS= option. No tests for homogeneity are carried out because the data are not stratified.
   symbol1 c=blue;
   proc lifetest data=males method=lt intervals=(0 to 15 by 1)
                 plots=(s,ls,lls,h,p);
      time Years*Censored(1);
      freq Freq;
   run;
Results of the life-table estimation are shown in Output 40.3.1. The five-year survival rate is 0.5193 with a standard error of 0.0103. The estimated median residual lifetime, which is 5.33 years initially, reaches a maximum of 6.34 years two years after diagnosis (the interval beginning at 2) and then decreases gradually, falling below the initial 5.33 years seven years after diagnosis (the interval beginning at 7).
Survival of Males with Angina Pectoris

Life Table Survival Estimates

                                                       Conditional
                              Effective  Conditional   Probability                       Survival    Median
   Interval    Number   Number   Sample  Probability      Standard                       Standard  Residual
[Lower, Upper) Failed Censored     Size   of Failure         Error  Survival   Failure      Error  Lifetime
    0      1      456        0   2418.0       0.1886       0.00796    1.0000         0          0    5.3313
    1      2      226       39   1942.5       0.1163       0.00728    0.8114    0.1886    0.00796    6.2499
    2      3      152       22   1686.0       0.0902       0.00698    0.7170    0.2830    0.00918    6.3432
    3      4      171       23   1511.5       0.1131       0.00815    0.6524    0.3476    0.00973    6.2262
    4      5      135       24   1317.0       0.1025       0.00836    0.5786    0.4214     0.0101    6.2185
    5      6      125      107   1116.5       0.1120       0.00944    0.5193    0.4807     0.0103    5.9077
    6      7       83      133    871.5       0.0952       0.00994    0.4611    0.5389     0.0104    5.5962
    7      8       74      102    671.0       0.1103        0.0121    0.4172    0.5828     0.0105    5.1671
    8      9       51       68    512.0       0.0996        0.0132    0.3712    0.6288     0.0106    4.9421
    9     10       42       64    395.0       0.1063        0.0155    0.3342    0.6658     0.0107    4.8258
   10     11       43       45    298.5       0.1441        0.0203    0.2987    0.7013     0.0109    4.6888
   11     12       34       53    206.5       0.1646        0.0258    0.2557    0.7443     0.0111         .
   12     13       18       33    129.5       0.1390        0.0304    0.2136    0.7864     0.0114         .
   13     14        9       27     81.5       0.1104        0.0347    0.1839    0.8161     0.0118         .
   14     15        6       23     47.5       0.1263        0.0482    0.1636    0.8364     0.0123         .
   15      .        0       30     15.0            0             0    0.1429    0.8571     0.0133         .

Evaluated at the Midpoint of the Interval

                  Median                 PDF                  Hazard
   Interval     Standard            Standard                Standard
[Lower, Upper)     Error      PDF      Error      Hazard       Error
    0      1      0.1749   0.1886    0.00796    0.208219    0.009698
    1      2      0.2001   0.0944    0.00598    0.123531    0.008201
    2      3      0.2361   0.0646    0.00507     0.09441    0.007649
    3      4      0.2361   0.0738    0.00543    0.119916    0.009154
    4      5      0.1853   0.0593    0.00495    0.108043    0.009285
    5      6      0.1806   0.0581    0.00503    0.118596    0.010589
    6      7      0.1855   0.0439    0.00469         0.1    0.010963
    7      8      0.2713   0.0460    0.00518    0.116719    0.013545
    8      9      0.2763   0.0370    0.00502     0.10483    0.014659
    9     10      0.4141   0.0355    0.00531    0.112299    0.017301
   10     11      0.4183   0.0430    0.00627    0.155235    0.023602
   11     12           .   0.0421    0.00685     0.17942    0.030646
   12     13           .   0.0297    0.00668    0.149378     0.03511
   13     14           .   0.0203    0.00651    0.116883    0.038894
   14     15           .   0.0207    0.00804    0.134831    0.054919
   15      .           .        .          .           .           .
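The entries in one row of the life table can be reproduced by hand, which may help in reading the output. The lines below work through the interval [1, 2) under the standard actuarial assumptions (withdrawals count as half an observation; density and hazard are evaluated at the interval midpoint). The symbols n_i, w_i, d_i, q_i, and b_i are notation introduced here for illustration.

```latex
% Interval [1,2): n_i = 2418 - 456 - 0 = 1962 enter the interval,
% w_i = 39 withdraw, d_i = 226 fail, interval width b_i = 1
n_i' = n_i - \tfrac{w_i}{2} = 1962 - 19.5 = 1942.5
\hat q_i = \frac{d_i}{n_i'} = \frac{226}{1942.5} \approx 0.1163

% Survival at the start of the next interval
\hat S(2) = \hat S(1)\,(1 - \hat q_i) = 0.8114 \times 0.8837 \approx 0.7170

% Density and hazard at the interval midpoint
\hat f = \frac{\hat S(1)\,\hat q_i}{b_i} = 0.8114 \times 0.1163 \approx 0.0944
\qquad
\hat h = \frac{2\,\hat q_i}{b_i\,(2 - \hat q_i)} = \frac{0.2327}{1.8837} \approx 0.1235
```

Each result matches the corresponding printed entry (Effective Sample Size 1942.5, Conditional Probability of Failure 0.1163, Survival 0.7170 for the next interval, PDF 0.0944, Hazard 0.123531) up to rounding.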
The breakdown of event and censored observations in the data is shown in Output 40.3.2. Note that 32.8% of the patients have withdrawn from the study.
Survival of Males with Angina Pectoris

Summary of the Number of Censored and Uncensored Values

                                   Percent
 Total    Failed    Censored      Censored
  2418      1625         793         32.80

NOTE: There were 2 observations with missing values, negative time values or frequency values less than 1.
An exponential model may be appropriate for the survival of these male patients with angina pectoris since the curve of the negative log of the survivor function estimate versus the survival time ( Output 40.3.4) approximates a straight line through the origin. Note that the graph of the log of the negative log of the survivor function estimate versus the log of time ( Output 40.3.5) is practically a straight line.
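The reasoning behind these two graphical checks follows from the form of the exponential survivor function. The short derivation below is a sketch in which the rate parameter λ is notation introduced here.

```latex
% Exponential survivor function with constant hazard \lambda
S(t) = e^{-\lambda t}

% Negative log: a straight line through the origin with slope \lambda
-\log S(t) = \lambda t

% Log of the negative log against log time: a straight line with slope 1
\log\{-\log S(t)\} = \log \lambda + \log t
```

A straight line in the second plot with a slope other than 1 would instead point toward a Weibull model, of which the exponential is the unit-slope special case.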
As discussed in Lee (1992), the graph of the estimated hazard function ( Output 40.3.6) shows that the death rate is highest in the first year of diagnosis. From the end of the first year to the end of the tenth year, the death rate remains relatively constant, fluctuating between 0.09 and 0.12. The death rate is generally higher after the tenth year. This could indicate that a patient who has survived the first year has a better chance than a patient who has just been diagnosed. The profile of the median residual lifetimes also supports this interpretation.
The density estimate is shown in Output 40.3.7. Visually, it resembles the density of an exponential distribution.