Getting Started


The LOGISTIC procedure is similar in use to the other regression procedures in the SAS System. To demonstrate the similarity, suppose the response variable y is binary or ordinal, and x1 and x2 are two explanatory variables of interest. To fit a logistic regression model, you can use a MODEL statement similar to that used in the REG procedure:

   proc logistic;
      model y=x1 x2;
   run;

The response variable y can be either character or numeric. PROC LOGISTIC enumerates the total number of response categories and orders the response levels according to the response variable option ORDER= in the MODEL statement. The procedure also allows the input of binary response data that are grouped:

   proc logistic;
      model r/n=x1 x2;
   run;

Here, n represents the number of trials and r represents the number of events.

The following example illustrates the use of PROC LOGISTIC. The data, taken from Cox and Snell (1989, pp. 10–11), consist of the number, r, of ingots not ready for rolling, out of n tested, for a number of combinations of heating time and soaking time. The following invocation of PROC LOGISTIC fits the binary logit model to the grouped data:

   data ingots;
      input Heat Soak r n @@;
      datalines;
    7 1.0 0 10  14 1.0 0 31  27 1.0 1 56  51 1.0 3 13
    7 1.7 0 17  14 1.7 0 43  27 1.7 4 44  51 1.7 0  1
    7 2.2 0  7  14 2.2 2 33  27 2.2 0 21  51 2.2 0  1
    7 2.8 0 12  14 2.8 0 31  27 2.8 1 22  51 4.0 0  1
    7 4.0 0  9  14 4.0 0 19  27 4.0 1 16
   ;

   proc logistic data=ingots;
      model r/n=Heat Soak;
   run;

The results of this analysis are shown in the following tables.

PROC LOGISTIC first lists background information in Figure 42.1 about the fitting of the model. Included are the name of the input data set, the response variable(s) used, the number of observations used, and the link function used.

start figure
                   The LOGISTIC Procedure

                      Model Information

   Data Set                       WORK.INGOTS
   Response Variable (Events)     r
   Response Variable (Trials)     n
   Model                          binary logit
   Optimization Technique         Fisher's scoring

   Number of Observations Read          19
   Number of Observations Used          19
   Sum of Frequencies Read             387
   Sum of Frequencies Used             387
end figure

Figure 42.1: Binary Logit Model

The Response Profile table (Figure 42.2) lists the response categories (which are Event and Nonevent when grouped data are input), their ordered values, and their total frequencies for the given data.

start figure
                 Response Profile

   Ordered     Binary            Total
     Value     Outcome       Frequency

         1     Event                12
         2     Nonevent            375

             Model Convergence Status

   Convergence criterion (GCONV=1E-8) satisfied.
end figure

Figure 42.2: Response Profile with Events/Trials Syntax

The Model Fit Statistics table (Figure 42.3) contains the Akaike Information Criterion (AIC), the Schwarz Criterion (SC), and the negative of twice the log likelihood (-2 Log L) for the intercept-only model and the fitted model. AIC and SC can be used to compare different models, and the ones with smaller values are preferred. Results of the likelihood ratio test and the efficient score test for testing the joint significance of the explanatory variables (Soak and Heat) are included in the Testing Global Null Hypothesis: BETA=0 table (Figure 42.3).

start figure
               Model Fit Statistics

                                Intercept
                 Intercept            and
   Criterion          Only     Covariates

   AIC             108.988        101.346
   SC              112.947        113.221
   -2 Log L        106.988         95.346

        Testing Global Null Hypothesis: BETA=0

   Test                 Chi-Square       DF     Pr > ChiSq

   Likelihood Ratio        11.6428        2         0.0030
   Score                   15.1091        2         0.0005
   Wald                    13.0315        2         0.0015
end figure

Figure 42.3: Fit Statistics and Hypothesis Tests
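
As a rough check on Figure 42.3, both information criteria can be recomputed from the -2 Log L row. The sketch below assumes the usual penalized-likelihood definitions; the symbols k (number of estimated parameters: 1 for the intercept-only model, 3 for the fitted model) and N (the sum of frequencies used, 387) are introduced here for illustration and are not part of the displayed output.

\[
\begin{aligned}
\mathrm{AIC} &= -2\log L + 2k, &\qquad \mathrm{SC} &= -2\log L + k\log N,\\
\mathrm{AIC}_{\text{fitted}} &= 95.346 + 2(3) = 101.346, &\qquad \mathrm{SC}_{\text{fitted}} &= 95.346 + 3\log 387 \approx 113.221,\\
\mathrm{AIC}_{\text{intercept only}} &= 106.988 + 2(1) = 108.988, &\qquad \chi^2_{\mathrm{LR}} &= 106.988 - 95.346 \approx 11.64 .
\end{aligned}
\]

The last quantity is the likelihood ratio chi-square, the difference between the two -2 Log L values; it matches the table up to rounding of the displayed inputs.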

The Analysis of Maximum Likelihood Estimates table in Figure 42.4 lists the parameter estimates, their standard errors, and the results of the Wald test for individual parameters. The odds ratio for each effect parameter, estimated by exponentiating the corresponding parameter estimate, is shown in the Odds Ratio Estimates table (Figure 42.4), along with 95% Wald confidence intervals.

start figure
            Analysis of Maximum Likelihood Estimates

                               Standard          Wald
   Parameter   DF    Estimate       Error    Chi-Square   Pr > ChiSq

   Intercept    1     -5.5592      1.1197       24.6503       <.0001
   Heat         1      0.0820      0.0237       11.9454       0.0005
   Soak         1      0.0568      0.3312        0.0294       0.8639

                Odds Ratio Estimates

               Point          95% Wald
   Effect    Estimate      Confidence Limits

   Heat         1.085       1.036       1.137
   Soak         1.058       0.553       2.026
end figure

Figure 42.4: Parameter Estimates and Odds Ratios
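
The Heat row of the odds ratio table can be reproduced by hand from the estimate and standard error in Figure 42.4. This is a sketch that assumes the Wald limits are formed by exponentiating the estimate plus or minus 1.96 standard errors (the 97.5th percentile of the standard normal distribution, rounded).

\[
\widehat{\mathrm{OR}}_{\mathrm{Heat}} = e^{0.0820} \approx 1.085,
\qquad
e^{\,0.0820 \pm 1.96 \times 0.0237} \approx (1.036,\ 1.137).
\]

The same calculation for Soak, with estimate 0.0568 and standard error 0.3312, gives limits of roughly 0.553 and 2.026. That interval contains 1, which is consistent with the nonsignificant Wald test for Soak.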

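Using the parameter estimates, you can calculate the estimated logit of p as

\[
\operatorname{logit}(\hat{p}) = -5.5592 + 0.0820 \times \mathrm{Heat} + 0.0568 \times \mathrm{Soak}
\]

If Heat=7 and Soak=1, then logit(\hat{p}) = -4.9284. Using this logit estimate, you can calculate \hat{p} as follows:

\[
\hat{p} = \frac{1}{1 + e^{4.9284}} = 0.0072
\]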

This gives the predicted probability of the event (ingot not ready for rolling) for Heat=7 and Soak=1. Note that PROC LOGISTIC can calculate these statistics for you; use the OUTPUT statement with the PREDICTED= option.
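
A minimal sketch of that approach follows. It assumes the events/trials version of the INGOTS data set created earlier; the output data set name pred and the variable name phat are arbitrary names chosen for illustration.

```sas
proc logistic data=ingots;
   model r/n=Heat Soak;
   /* PREDICTED= names the variable that receives the      */
   /* estimated event probability for each observation.    */
   /* "pred" and "phat" are illustrative names only.       */
   output out=pred predicted=phat;
run;

/* Show the prediction for the combination worked by hand */
proc print data=pred;
   where Heat=7 and Soak=1.0;
   var Heat Soak r n phat;
run;
```

The value of phat printed for Heat=7 and Soak=1 should agree with the hand calculation, up to rounding of the displayed parameter estimates.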

Finally, the Association of Predicted Probabilities and Observed Responses table (Figure 42.5) contains four measures of association for assessing the predictive ability of a model. They are based on the number of pairs of observations with different response values, the number of concordant pairs, and the number of discordant pairs, which are also displayed. Formulas for these statistics are given in the Rank Correlation of Observed Responses and Predicted Probabilities section on page 2350.

start figure
   Association of Predicted Probabilities and Observed Responses

   Percent Concordant     64.4    Somers' D    0.460
   Percent Discordant     18.4    Gamma        0.555
   Percent Tied           17.2    Tau-a        0.028
   Pairs                  4500    c            0.730
end figure

Figure 42.5: Association Table
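
The four measures in Figure 42.5 can be approximately recovered from the displayed percentages. The sketch below assumes the standard rank-correlation definitions, writing n_c and n_d for the numbers of concordant and discordant pairs, t = 4500 for the number of pairs with different responses (12 events times 375 nonevents), and N = 387 for the total frequency. These symbols are introduced here for illustration, and the inputs are the rounded percentages, so the last digit can differ from the table.

\[
\begin{aligned}
\text{Somers' } D &= \frac{n_c - n_d}{t} = 0.644 - 0.184 = 0.460,\\
\text{Gamma} &= \frac{n_c - n_d}{n_c + n_d} = \frac{0.460}{0.828} \approx 0.556,\\
\text{Tau-}a &= \frac{n_c - n_d}{\tfrac{1}{2}N(N-1)} = \frac{0.460 \times 4500}{\tfrac{1}{2}(387)(386)} \approx 0.028,\\
c &= \frac{n_c + \tfrac{1}{2}(t - n_c - n_d)}{t} = 0.644 + \tfrac{1}{2}(0.172) = 0.730 .
\end{aligned}
\]

The table reports Gamma as 0.555; the small gap from the hand value arises because the percentages used here are rounded to one decimal place.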

To illustrate the use of an alternative form of input data, the following program creates the INGOTS data set with new variables NotReady and Freq instead of n and r. The variable NotReady represents the response of individual units; it has a value of 1 for units not ready for rolling (event) and a value of 0 for units ready for rolling (nonevent). The variable Freq represents the frequency of occurrence of each combination of Heat, Soak, and NotReady. Note that, compared to the previous data set, NotReady=1 implies Freq=r, and NotReady=0 implies Freq=n − r.

   data ingots;
      input Heat Soak NotReady Freq @@;
      datalines;
    7 1.0 0 10  14 1.0 0 31  14 4.0 0 19  27 2.2 0 21  51 1.0 1  3
    7 1.7 0 17  14 1.7 0 43  27 1.0 1  1  27 2.8 1  1  51 1.0 0 10
    7 2.2 0  7  14 2.2 1  2  27 1.0 0 55  27 2.8 0 21  51 1.7 0  1
    7 2.8 0 12  14 2.2 0 31  27 1.7 1  4  27 4.0 1  1  51 2.2 0  1
    7 4.0 0  9  14 2.8 0 31  27 1.7 0 40  27 4.0 0 15  51 4.0 0  1
   ;
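
The same frequency-form data could also be derived from the events/trials data rather than typed in again. The following DATA step is a sketch of the rule NotReady=1 implies Freq=r and NotReady=0 implies Freq=n − r. It assumes the original grouped data were saved under the hypothetical name ingots_grouped, and it writes to a hypothetical data set ingots_freq.

```sas
data ingots_freq;              /* hypothetical output name           */
   set ingots_grouped;         /* hypothetical copy of the r/n data  */

   /* Events: units not ready for rolling */
   NotReady = 1;
   Freq = r;
   if Freq > 0 then output;

   /* Nonevents: units ready for rolling */
   NotReady = 0;
   Freq = n - r;
   if Freq > 0 then output;

   keep Heat Soak NotReady Freq;
run;
```

Rows with a zero count are skipped, so each of the 19 grouped observations contributes one or two rows, which is how the 25 entries in the listing above arise.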

The following SAS statements invoke PROC LOGISTIC to fit the same model using the alternative form of the input data set.

   proc logistic data=ingots;
      model NotReady(event='1') = Soak Heat;
      freq Freq;
   run;

Results of this analysis are the same as those of the previous one. The displayed output for the two runs is identical except for the background information of the model fit and the Response Profile table shown in Figure 42.6.

start figure
             The LOGISTIC Procedure

                Response Profile

   Ordered                      Total
     Value     NotReady     Frequency

         1            0           375
         2            1            12

   Probability modeled is NotReady=1.
end figure

Figure 42.6: Response Profile with Single-Trial Syntax

By default, Ordered Values are assigned to the sorted response values in ascending order, and PROC LOGISTIC models the probability of the response level that corresponds to the Ordered Value 1. There are several methods to change these defaults; the preceding statements specify the response variable option EVENT= to model the probability of NotReady=1 as displayed in Figure 42.6. See the Response Level Ordering section on page 2329 for more details.
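
As an illustration of one of the other methods, the sketch below uses the DESCENDING option in the PROC LOGISTIC statement instead of EVENT=. It assumes the frequency-form INGOTS data set shown earlier. Reversing the sort order assigns Ordered Value 1 to NotReady=1, so the probability of NotReady=1 is modeled and the fit should match the EVENT='1' run.

```sas
/* DESCENDING reverses the sort order of the response levels, */
/* so NotReady=1 receives Ordered Value 1 and is the level    */
/* whose probability is modeled.                              */
proc logistic data=ingots descending;
   model NotReady = Soak Heat;
   freq Freq;
run;
```

With this form the Response Profile table would list NotReady=1 first, unlike Figure 42.6, where EVENT= changes the modeled level without changing the ordering.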




SAS.STAT 9.1 Users Guide (Vol. 4)
Year: 2004