Syntax


The following statements are available in PROC PROBIT.

  • PROC PROBIT < options > ;

    • MODEL response=independents < / options > ;

    • BY variables ;

    • CLASS variables ;

    • OUTPUT < OUT= SAS-data-set >< options > ;

    • WEIGHT variable ;

    • CDFPLOT < VAR = variable >< options > ;

    • INSET < keyword-list >< / options > ;

    • IPPPLOT < VAR = variable >< options > ;

    • LPREDPLOT < VAR = variable >< options > ;

    • PREDPPLOT < VAR = variable >< options > ;

A MODEL statement is required. Only a single MODEL statement can be used with one invocation of the PROBIT procedure. If multiple MODEL statements are present, only the last one is used. Main effects and higher-order terms can be specified in the MODEL statement, similar to the GLM procedure. If a CLASS statement is used, it must precede the MODEL statement.

The CDFPLOT, INSET, IPPPLOT, LPREDPLOT, and PREDPPLOT statements are used to produce graphical output. You can use any appropriate combination of the graphical statements after the MODEL statement.
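As a minimal sketch of how these statements fit together, the following program fits a binomial model to events/trials data and requests one plot. The data set ASSAY and its variables (dose, n, r) are hypothetical and are used only for illustration.

```sas
/* Hypothetical bioassay data: r of n subjects respond at each dose. */
data assay;
   input dose n r;
   datalines;
1 50 4
2 50 11
4 50 24
8 50 38
16 50 47
;
run;

proc probit data=assay;
   model r/n = dose;     /* the single required MODEL statement */
   predpplot var=dose;   /* graphical statements come after MODEL */
run;
```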

PROC PROBIT Statement

  • PROC PROBIT < options > ;

The PROC PROBIT statement starts the procedure. You can specify the following options in the PROC PROBIT statement.

COVOUT

  • writes the parameter estimate covariance matrix to the OUTEST= data set.

C= rate

OPTC

  • controls how the natural response is handled. Specify the OPTC option to request that the natural response rate C be estimated. Specify the C= rate option to set the natural response rate or to provide the initial estimate of the natural response rate. The natural response rate value must be a number between 0 and 1.

    • If you specify neither the OPTC nor the C= option, a natural response rate of zero is assumed.

    • If you specify both the OPTC and the C= option, the C= option should be a reasonable initial estimate of the natural response rate. For example, you could use the ratio of the number of responses to the number of subjects in a control group.

    • If you specify the C= option but not the OPTC option, the natural response rate is set to the specified value and not estimated.

    • If you specify the OPTC option but not the C= option, PROC PROBIT's action depends on the response variable, as follows:

      • If you specify either the LN or LOG10 option and some subjects have the first independent variable (dose) values less than or equal to zero, these subjects are treated as a control group. The initial estimate of C is then the ratio of the number of responses to the number of subjects in this group.

      • If you do not specify the LN or LOG10 option or if there is no control group, then one of the following occurs:

        • If all responses are greater than zero, the initial estimate of the natural response rate is the minimal response rate (the ratio of the number of responses to the number of subjects in a dose group) across all dose levels.

        • If one or more of the responses is zero (making the response rate zero in that dose group), the initial estimate of the natural rate is the reciprocal of twice the largest number of subjects in any dose group in the experiment.
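The case in which both OPTC and C= are specified can be sketched as follows. The data set ASSAY_CTL is hypothetical; its first row is a zero-dose control group with 3 responses among 50 subjects, so 3/50 = 0.06 is supplied as the initial estimate of the natural response rate.

```sas
/* Hypothetical data: the dose=0 row acts as the control group. */
data assay_ctl;
   input dose n r;
   datalines;
0 50 3
1 50 6
2 50 12
4 50 25
8 50 39
16 50 47
;
run;

/* OPTC requests estimation of C; C=0.06 is only the starting value. */
/* With LOG10 and OPTC, the dose=0 row is used as a control group.   */
proc probit data=assay_ctl log10 optc c=0.06;
   model r/n = dose;
run;
```

Dropping OPTC from this sketch would instead fix the natural response rate at 0.06 rather than estimate it.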

DATA= SAS-data-set

  • specifies the SAS data set to be used by PROC PROBIT. By default, the procedure uses the most recently created SAS data set.

GOUT= graphics-catalog

  • specifies a graphics catalog in which to save graphics output.

HPROB= p

  • specifies a minimum probability level for the Pearson chi-square to indicate a good fit. The default value is 0.10. The LACKFIT option must also be specified for this option to have any effect. For Pearson goodness-of-fit chi-square values with probability greater than the HPROB= value, the fiducial limits, if requested with the INVERSECL option, are computed using a critical value of 1.96. For chi-square values with probability less than the value of the HPROB= option, the critical value is a 0.95 two-sided quantile value taken from the t distribution with degrees of freedom equal to (k - 1) × m - q, where k is the number of levels for the response variable, m is the number of different sets of independent variable values, and q is the number of parameters fit in the model. Note that the HPROB= option can also appear in the MODEL statement.

INEST= SAS-data-set

  • specifies an input SAS data set that contains initial estimates for all the parameters in the model. See the section "INEST= SAS-data-set" on page 3757 for a detailed description of the contents of the INEST= data set.

INVERSECL

  • computes confidence limits for the values of the first continuous independent variable (such as dose) that yield selected response rates. If the algorithm fails to converge (this can happen when C is nonzero), missing values are reported for the confidence limits. See the section "Inverse Confidence Limits" on page 3761 for details. Note that the INVERSECL option can also appear in the MODEL statement.

LACKFIT

  • performs two goodness-of-fit tests (a Pearson chi-square test and a log-likelihood ratio chi-square test) for the fitted model.

  • To compute the test statistics, the observations must be properly grouped into subpopulations. You can use the AGGREGATE or AGGREGATE= option for this purpose. See the entry for the AGGREGATE and AGGREGATE= options under the MODEL statement. If neither AGGREGATE nor AGGREGATE= is specified, PROC PROBIT assumes each observation is from a separate subpopulation and computes the goodness-of-fit test statistics only for the events/trials syntax.

  • Note: This test is not appropriate if the data are very sparse, with only a few values at each set of the independent variable values.

    If the Pearson chi-square test statistic is significant, then the covariance estimates and standard error estimates are adjusted. See the section "Lack of Fit Tests" on page 3759 for a description of the tests. Note that the LACKFIT option can also appear in the MODEL statement.
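The following sketch requests the goodness-of-fit tests together with inverse confidence limits, using a hypothetical events/trials data set ASSAY. With two response levels, five distinct dose values, and two fitted parameters (intercept and slope), the degrees-of-freedom formula given for the HPROB= option yields (2 - 1) × 5 - 2 = 3.

```sas
/* Hypothetical bioassay data: one row per dose group. */
data assay;
   input dose n r;
   datalines;
1 50 4
2 50 11
4 50 24
8 50 38
16 50 47
;
run;

/* LACKFIT requests the Pearson and likelihood ratio chi-square tests. */
/* HPROB=0.2 raises the cutoff from its default of 0.10.               */
proc probit data=assay lackfit inversecl hprob=0.2;
   model r/n = dose;
run;
```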

LOG

LN

  • analyzes the data by replacing the first continuous independent variable by its natural logarithm. This variable is usually the level of some treatment such as dosage. In addition to the usual output given by the INVERSECL option, the estimated dose values and 95% fiducial limits for dose are also displayed. If you specify the OPTC option, any observations with a dose value less than or equal to zero are used in the estimation as a control group. If you do not specify the OPTC option with the LOG or LN option, then any observations with the first continuous independent variable values less than or equal to zero are ignored.

LOG10

  • specifies an analysis like that of the LN or LOG option except that the common logarithm (log to the base 10) of the dose value is used rather than the natural logarithm.

NAMELEN= n

  • specifies the length of effect names in tables and output data sets to be n characters, where n is a value between 20 and 200. The default length is 20 characters.

NOPRINT

  • suppresses the display of all output.

OPTC

  • controls how the natural response is handled. See the description of the C= option on page 3711 for details.

ORDER=DATA | FORMATTED | FREQ | INTERNAL

  • specifies the sorting order for the levels of the classification variables specified in the CLASS statement, including the levels of the response variable. Response level ordering is important since PROC PROBIT always models the probability of response levels at the beginning of the ordering. See the section "Response Level Ordering" on page 3754 for further details. This ordering also determines which parameters in the model correspond to each level in the data. The following table shows how PROC PROBIT interprets values of the ORDER= option.

    Value of ORDER=    Levels Sorted By
    DATA               order of appearance in the input data set
    FORMATTED          formatted value
    FREQ               descending frequency count; levels with the most observations come first in the order
    INTERNAL           unformatted value

  • By default, ORDER=FORMATTED. For the values FORMATTED and INTERNAL, the sort order is machine dependent. For more information on sorting order, see the chapter on the SORT procedure in the SAS Procedures Guide.
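A sketch of ORDER=DATA with an ordinal response follows. The data set SYMPTOMS, its variables, and the use of the WEIGHT statement to carry cell counts are assumptions made for illustration. Because "None" appears first in the input data, it becomes the first response level, followed by "Mild" and "Severe".

```sas
/* Hypothetical counts of symptom severity at each dose. */
data symptoms;
   input dose severity $ count;
   datalines;
1 None 30
1 Mild 15
1 Severe 5
2 None 20
2 Mild 20
2 Severe 10
4 None 10
4 Mild 20
4 Severe 20
;
run;

/* ORDER=DATA keeps levels in order of appearance: None, Mild, Severe. */
proc probit data=symptoms order=data;
   class severity;          /* CLASS must precede MODEL */
   model severity = dose;
   weight count;            /* assumed: cell counts supplied as weights */
run;
```

With the default ORDER=FORMATTED, the same levels would instead be sorted by formatted value, which changes the level treated as first.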

OUTEST= SAS-data-set

  • specifies a SAS data set to contain the parameter estimates and, if the COVOUT option is specified, their estimated covariances. If you omit this option, the output data set is not created. The contents of the data set are described in the section "OUTEST= SAS-data-set" on page 3762.
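For instance, the following sketch saves the parameter estimates and their covariance matrix and then prints the result. The data set names ASSAY and EST are hypothetical.

```sas
/* Hypothetical bioassay data. */
data assay;
   input dose n r;
   datalines;
1 50 4
2 50 11
4 50 24
8 50 38
16 50 47
;
run;

/* OUTEST= creates the data set; COVOUT adds the covariance rows. */
proc probit data=assay outest=est covout;
   model r/n = dose;
run;

proc print data=est;
run;
```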

XDATA= SAS-data-set

  • specifies an input SAS data set that contains values for all the independent variables in the MODEL statement and variables in the CLASS statement. If covariates are specified in the MODEL statement, use the XDATA= data set to specify fixed values for the effects in the MODEL statement when predicted values and/or fiducial limits for a single continuous variable (dose variable) are required. These specified values for the effects in the MODEL statement are also used for generating plots. See the section "XDATA= SAS-data-set" on page 3763 for a detailed description of the contents of the XDATA= data set.
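A minimal sketch of the XDATA= option, assuming a hypothetical data set TRIAL with a dose variable and one extra covariate, age. The one-row data set FIXED supplies a value for every independent variable in the MODEL statement, so that the plot against dose is drawn with age held at 40.

```sas
/* Hypothetical data: response counts by dose and age. */
data trial;
   input dose age n r;
   datalines;
1 30 25 2
1 50 25 4
2 30 25 6
2 50 25 9
4 30 25 13
4 50 25 17
8 30 25 20
8 50 25 23
;
run;

/* One row holding a value for each independent variable in MODEL. */
data fixed;
   dose = 1;
   age  = 40;
run;

proc probit data=trial xdata=fixed;
   model r/n = dose age;
   predpplot var=dose;   /* age is taken from the XDATA= data set */
run;
```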

BY Statement

  • BY variables ;

You can specify a BY statement with PROC PROBIT to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.

If your input data set is not sorted in ascending order on each of the BY variables, use one of the following alternatives:

  • Sort the data using the SORT procedure with a similar BY statement.

  • Specify the BY statement option NOTSORTED or DESCENDING in the BY statement for the PROBIT procedure. The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.

  • Create an index on the BY variables using the DATASETS procedure.

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.
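The first alternative above can be sketched as follows, using a hypothetical data set BATCHES in which the grouping variable batch is not in sorted order.

```sas
/* Hypothetical data: two batches, deliberately out of order. */
data batches;
   input batch dose n r;
   datalines;
2 1 40 5
2 2 40 14
2 4 40 27
2 8 40 35
1 1 40 3
1 2 40 10
1 4 40 22
1 8 40 33
;
run;

/* Sort with the same BY variables before running the analysis. */
proc sort data=batches;
   by batch;
run;

/* One separate analysis is produced for each value of batch. */
proc probit data=batches;
   model r/n = dose;
   by batch;
run;
```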

CDFPLOT Statement

  • CDFPLOT < VAR= variable >< options > ;

The CDFPLOT statement plots the predicted cumulative distribution function (CDF) of the multinomial response variable as a function of a single continuous independent variable (dose variable). You can only use this statement after a multinomial model statement.

VAR= (variable)

  • specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.

  • The predicted cumulative distribution function is defined as

    \[ \hat{F}_j(\mathbf{x}) = C + (1 - C)\, F(\hat{a}_j + \mathbf{x}'\hat{\mathbf{b}}) \]

  • where $j = 1, \ldots, k$ are the indexes of the k levels of the multinomial response variable, F is the CDF of the distribution used to model the cumulative probabilities, $\hat{\mathbf{b}}$ is the vector of estimated parameters, $\mathbf{x}$ is the covariate vector, $\hat{a}_j$ are estimated ordinal intercepts with $\hat{a}_1 = 0$, and C is the threshold parameter, either known or estimated from the model. Let $x_1$ be the covariate corresponding to the dose variable and $\mathbf{x}_{-1}$ be the vector of the rest of the covariates. Let the corresponding estimated parameters be $\hat{b}_1$ and $\hat{\mathbf{b}}_{-1}$. Then

    \[ \hat{F}_j(\mathbf{x}) = C + (1 - C)\, F(\hat{a}_j + x_1 \hat{b}_1 + \mathbf{x}_{-1}'\hat{\mathbf{b}}_{-1}) \]

  • To plot $\hat{F}_j$ as a function of $x_1$, $\mathbf{x}_{-1}$ must be specified. You can use the XDATA= option to provide the values of $\mathbf{x}_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow the rules:

    • If the effect contains a continuous variable (or variables), the overall mean of this effect is used.

    • If the effect is a single classification variable, the highest level of the variable is used.

options

  • specify the levels of the multinomial response variable for which the cdf curves are requested, and add features to the plot. There are k - 1 curves for a k-level multinomial response variable (for the highest level, it is the constant line 1). You can specify any of them to be plotted by the LEVEL= option in the CDFPLOT statement. See the LEVEL= option for how to specify the levels.

  • An attached box on the right side of the plot is used to label these curves with the names of their levels. You can specify the color of this box using the CLABBOX= option.

  • You can use options in the CDFPLOT statement to

    • superimpose specification limits

    • specify the levels for which the cdf curves are requested

    • specify graphical enhancements (such as color or text height)
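As a sketch, the following program fits a multinomial model and requests cdf curves for two of the three response levels. The data set SYMPTOMS, its variables, and the use of the WEIGHT statement to carry cell counts are hypothetical choices made for illustration.

```sas
/* Hypothetical counts of symptom severity at each dose. */
data symptoms;
   input dose severity $ count;
   datalines;
1 None 30
1 Mild 15
1 Severe 5
2 None 20
2 Mild 20
2 Severe 10
4 None 10
4 Mild 20
4 Severe 20
;
run;

proc probit data=symptoms order=data;
   class severity;
   model severity = dose;   /* multinomial (ordinal) response */
   weight count;            /* assumed: cell counts supplied as weights */
   /* Quoted level names, separated by spaces, select the curves. */
   cdfplot var=dose level=("None" "Mild") inborder;
run;
```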

Summary of Options

The following tables list all options by function. The "Dictionary of Options" section on page 3718 describes each option in detail.

CDF Options
Table 60.1: Options for CDFPLOT

    Option                   Description
    LEVEL= character-list    specifies the names of the levels for which the cdf curves are requested
    NOTHRESH                 suppresses the threshold line
    THRESHLABPOS= value      specifies the position for the label of the threshold line

General Options
Table 60.2: Color Options

    Option            Description
    CAXIS= color      specifies color for axis
    CFIT= color       specifies color for fitted curves
    CFRAME= color     specifies color for frame
    CGRID= color      specifies color for grid lines
    CHREF= color      specifies color for HREF= lines
    CLABBOX= color    specifies color for label box
    CTEXT= color      specifies color for text
    CVREF= color      specifies color for VREF= lines

Table 60.3: Options to Enhance Plots Produced on Graphics Devices

    Option                    Description
    ANNOTATE= SAS-data-set    specifies an ANNOTATE data set
    INBORDER                  requests a border around plot
    LFIT= linetype            specifies line style for fitted curves
    LGRID= linetype           specifies line style for grid lines
    NOFRAME                   suppresses the frame around plotting areas
    NOGRID                    suppresses grid lines
    NOFIT                     suppresses cdf curves
    NOHLABEL                  suppresses horizontal labels
    NOHTICK                   suppresses horizontal ticks
    NOVTICK                   suppresses vertical ticks
    TURNVLABELS               vertically strings out characters in vertical labels
    WFIT= n                   specifies thickness for fitted curves
    WGRID= n                  specifies thickness for grids
    WREFL= n                  specifies thickness for reference lines

Table 60.4: Axis Options

    Option                                    Description
    HAXIS= value1 to value2 < by value3 >     specifies tick mark values for horizontal axis
    HOFFSET= value                            specifies offset for horizontal axis
    HLOWER= value                             specifies lower limit on horizontal axis scale
    HUPPER= value                             specifies upper limit on horizontal axis scale
    NHTICK= n                                 specifies number of ticks for horizontal axis
    NVTICK= n                                 specifies number of ticks for vertical axis
    VAXIS= value1 to value2 < by value3 >     specifies tick mark values for vertical axis
    VAXISLABEL= label                         specifies label for vertical axis
    VOFFSET= value                            specifies offset for vertical axis
    VLOWER= value                             specifies lower limit on vertical axis scale
    VUPPER= value                             specifies upper limit on vertical axis scale
    WAXIS= n                                  specifies thickness for axis

Table 60.5: Graphics Catalog Options

    Option                 Description
    DESCRIPTION= string    specifies description for graphics catalog member
    NAME= string           specifies name for plot in graphics catalog

Table 60.6: Options for Text Enhancement

    Option             Description
    FONT= font         specifies software font for text
    HEIGHT= value      specifies height of text used outside framed areas
    INFONT= font       specifies software font for text inside framed areas
    INHEIGHT= value    specifies height of text inside framed areas

Table 60.7: Options for Reference Lines

    Option                                 Description
    HREF<(INTERSECT)>= value-list          requests horizontal reference line
    HREFLABELS= (label1, ..., labeln)      specifies labels for HREF= lines
    HREFLABPOS= n                          specifies vertical position of labels for HREF= lines
    LHREF= linetype                        specifies line style for HREF= lines
    LVREF= linetype                        specifies line style for VREF= lines
    VREF<(INTERSECT)>= value-list          requests vertical reference line
    VREFLABELS= (label1, ..., labeln)      specifies labels for VREF= lines
    VREFLABPOS= n                          specifies horizontal position of labels for VREF= lines

Dictionary of Options

The following entries provide detailed descriptions of the options in the CDFPLOT statement.

ANNOTATE= SAS-data-set

ANNO= SAS-data-set

  • specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the cdf plot. The ANNOTATE= data set you specify in the CDFPLOT statement is used for all plots created by the statement.

CAXIS= color

CAXES= color

  • specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CFIT= color

  • specifies the color for the fitted cdf curves. The default is the first color in the device color list.

CFRAME= color

CFR= color

  • specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.

CGRID= color

  • specifies the color for grid lines. The default is the first color in the device color list.

CLABBOX= color

  • specifies the color for the area enclosed by the label box for cdf curves. This area is not shaded by default.

CHREF= color

CH= color

  • specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.

CTEXT= color

  • specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.

CVREF= color

CV= color

  • specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.

DESCRIPTION= string

DES= string

  • specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

FONT= font

  • specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.

HAXIS= value1 to value2 < by value3 >

  • specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.

  • Examples of HAXIS= lists are:

      haxis = 0 to 10
      haxis = 2 to 10 by 2
      haxis = 0 to 200 by 10

HEIGHT= value

  • specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).

HLOWER= value

  • specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HOFFSET= value

  • specifies offset for horizontal axis. The default value is 1.

HUPPER= value

  • specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HREF < (INTERSECT) > = value-list

  • requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.

HREFLABELS= label1, ..., labeln

HREFLABEL= label1, ..., labeln

HREFLAB= label1, ..., labeln

  • specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

HREFLABPOS= n

  • specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    top
    2    staggered from top
    3    bottom
    4    staggered from bottom
    5    alternating from top
    6    alternating from bottom

INBORDER

  • requests a border around cdf plots.

LEVEL= ( character-list )

ORDINAL= ( character-list )

  • specifies the names of the levels for which cdf curves are requested. Names should be quoted and separated by spaces. If none of the names matches a level of the response variable, no cdf curve is plotted.

LFIT= linetype

  • specifies a line style for fitted curves. By default, fitted curves are drawn with solid lines (linetype = 1).

LGRID= linetype

  • specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.

LHREF= linetype

LH= linetype

  • specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.

LVREF= linetype

LV= linetype

  • specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.

NAME= string

  • specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is 'PROBIT'.

NOFIT

  • suppresses the fitted cdf curves.

NOFRAME

  • suppresses the frame around plotting areas.

NOGRID

  • suppresses grid lines.

NOHLABEL

  • suppresses horizontal labels.

NOHTICK

  • suppresses horizontal tick marks.

NOTHRESH

  • suppresses the threshold line.

NOVLABEL

  • suppresses vertical labels.

NOVTICK

  • suppresses vertical tick marks.

THRESHLABPOS= n

  • specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    left
    2    right

VAXIS= value1 to value2 < by value3 >

  • specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.

    Examples of VAXIS= lists are:

      vaxis = 0 to 10
      vaxis = 0 to 2 by .1

VAXISLABEL= string

  • specifies a label for the vertical axis.

VLOWER= value

  • specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

VREF < (INTERSECT) > = value-list

  • requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.

VREFLABELS= label1, ..., labeln

VREFLABEL= label1, ..., labeln

VREFLAB= label1, ..., labeln

  • specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

VREFLABPOS= n

  • specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    left
    2    right

VUPPER= value

  • specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

WAXIS= n

  • specifies line thickness for axes and frame. The default value is 1.

WFIT= n

  • specifies line thickness for fitted curves. The default value is 1.

WGRID= n

  • specifies line thickness for grids. The default value is 1.

WREFL= n

  • specifies line thickness for reference lines. The default value is 1.
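Several of the dictionary options can be combined in one CDFPLOT statement. The following sketch sets explicit tick marks on both axes, labels the vertical axis, removes the grid, and thickens the fitted curves. The data set SYMPTOMS, the label text, the use of quotes around the label, and the use of the WEIGHT statement for cell counts are assumptions made for illustration.

```sas
/* Hypothetical counts of symptom severity at each dose. */
data symptoms;
   input dose severity $ count;
   datalines;
1 None 30
1 Mild 15
1 Severe 5
2 None 20
2 Mild 20
2 Severe 10
4 None 10
4 Mild 20
4 Severe 20
;
run;

proc probit data=symptoms order=data;
   class severity;
   model severity = dose;
   weight count;            /* assumed: cell counts supplied as weights */
   /* Ticks at 0,1,...,5 horizontally and 0,0.2,...,1 vertically. */
   cdfplot var=dose
           haxis=0 to 5 by 1
           vaxis=0 to 1 by 0.2
           vaxislabel='Cumulative probability'
           nogrid
           wfit=2;
run;
```

Because HAXIS= and VAXIS= are given, the HLOWER=, HUPPER=, VLOWER=, and VUPPER= options would have no effect here.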

CLASS Statement

  • CLASS variables ;

The CLASS statement names the classification variables to be used in the analysis. Classification variables can be either character or numeric. If a single response variable is specified in the MODEL statement, it must also be specified in a CLASS statement.

Class levels are determined from the formatted values of the CLASS variables. Thus, you can use formats to group values into levels. See the discussion of the FORMAT procedure in SAS Language Reference: Dictionary .

If the CLASS statement is used, it must appear before any of the MODEL statements.
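Because class levels come from formatted values, a format can collapse a numeric variable into a few levels. The following sketch assumes a hypothetical data set TRIAL and a hypothetical format AGEGRP that splits age at 40.

```sas
/* Hypothetical format that groups age into two levels. */
proc format;
   value agegrp low-<40  = 'Under 40'
                40-high  = '40 and over';
run;

/* Hypothetical data: response counts by dose and age. */
data trial;
   input dose age n r;
   datalines;
1 30 25 2
1 50 25 4
2 35 25 6
2 55 25 9
4 30 25 13
4 50 25 17
8 35 25 20
8 55 25 23
;
run;

proc probit data=trial;
   format age agegrp.;      /* levels are the formatted values */
   class age;               /* CLASS appears before MODEL */
   model r/n = dose age;
run;
```

Here age contributes two class levels rather than four, since ages 30 and 35 share one formatted value and ages 50 and 55 share the other.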

INSET Statement

  • INSET < keyword-list >< / options > ;

The box or table of summary information produced on plots made with the CDFPLOT, IPPPLOT, LPREDPLOT, and PREDPPLOT statements is called an inset. You can use the INSET statement to customize both the information that is printed in the inset box and the appearance of the inset box. To supply the information that is displayed in the inset box, you specify keywords corresponding to the information you want shown. For example, the following statements produce a predicted probability plot with the number of observations, the number of trials, the number of events, the name of the distribution, and the estimated optimum natural threshold in the inset.

  proc probit data=epidemic;
     model r/n = dose;
     predpplot;
     inset nobs ntrials nevents dist optc;
  run;

By default, inset entries are identified with appropriate labels. However, you can provide a customized label by specifying the keyword for that entry followed by the equal sign (=) and the label in quotes. For example, the following INSET statement produces an inset containing the number of observations and the name of the distribution, labeled "Sample Size" and "Distribution" in the inset.

  inset nobs='Sample Size' dist='Distribution';

If you specify a keyword that does not apply to the plot you are creating, then the keyword is ignored.

The options control the appearance of the box.

If you specify more than one INSET statement, only the first one is used.

Keywords Used in the INSET Statement

The following tables list keywords available in the INSET statement to display summary statistics, distribution parameters, and distribution fitting information.

Table 60.8: Summary Statistics

    Keyword     Description
    NOBS        number of observations
    NTRIALS     number of trials
    NEVENTS     number of events
    C           the user-specified threshold
    OPTC        the estimated natural threshold
    NRESPLEV    number of levels of the response variable

Table 60.9: General Information

    Keyword       Description
    CONFIDENCE    confidence coefficient for all confidence intervals or for the Weibayes fit
    DIST          name of the distribution

Options Used in the INSET Statement

The following tables list the options available in the INSET statement.

Table 60.10: General Appearance Options

    Option                           Description
    FONT= font                       specifies software font for text
    HEIGHT= value                    specifies height of text
    HEADER= quoted string            specifies text for header or box title
    NOFRAME                          omits frame around box
    POS= value <DATA | PERCENT>      determines the position of the inset. The value can be a compass point (N, NE, E, SE, S, SW, W, NW) or a pair of coordinates (x, y) enclosed in parentheses. The coordinates can be specified in axis percent units or axis data units.
    REFPOINT= name                   specifies the reference point for an inset that is positioned by a pair of coordinates with the POS= option. You use the REFPOINT= option in conjunction with the POS= coordinates. The REFPOINT= option specifies which corner of the inset frame you have specified with coordinates (x, y), and it can take the value of BR (bottom right), BL (bottom left), TR (top right), or TL (top left). The default is REFPOINT=BL. If the inset position is specified as a compass point, then the REFPOINT= option is ignored.

Table 60.11: Color and Pattern Options

    Option            Description
    CFILL= color      specifies color for filling box
    CFILLH= color     specifies color for filling box header
    CFRAME= color     specifies color for frame
    CHEADER= color    specifies color for text in header
    CTEXT= color      specifies color for text
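The keywords and appearance options can be combined as in the following sketch, which places a titled inset in the upper right corner of a predicted probability plot. The data set ASSAY, the header text, and the color choice are hypothetical. The slash before the options follows the INSET syntax shown in the statement list at the start of this section.

```sas
/* Hypothetical bioassay data. */
data assay;
   input dose n r;
   datalines;
1 50 4
2 50 11
4 50 24
8 50 38
16 50 47
;
run;

proc probit data=assay;
   model r/n = dose;
   predpplot var=dose;
   /* Keywords choose the content; options after the slash style the box. */
   inset nobs='Sample Size' nevents dist
         / pos=ne header='Fit Summary' cfill=white;
run;
```

Since POS= is given here as a compass point, a REFPOINT= value would be ignored.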

IPPPLOT Statement

  • IPPPLOT < VAR= variable >< options > ;

The IPPPLOT statement plots the inverse of the predicted probability against a single continuous variable (dose variable) in the MODEL statement for the binomial model. You can only use this statement after a binomial model statement. The confidence limits for the predicted values of the dose variable are the computed fiducial limits, not the inverse of the confidence limits of the predicted probabilities. Refer to the section "Inverse Confidence Limits" on page 3761 for more details.

VAR= ( variable )

  • specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.

  • For the binomial model, the response variable is a probability. An estimate of the dose level $\hat{x}_1$ needed for a response of p is given by

    \[ \hat{x}_1 = \frac{F^{-1}(p) - \mathbf{x}_{-1}'\hat{\mathbf{b}}_{-1}}{\hat{b}_1} \]

  • where F is the cumulative distribution function used to model the probability, $\mathbf{x}_{-1}$ is the vector of the rest of the covariates, $\hat{\mathbf{b}}_{-1}$ is the vector of the estimated parameters corresponding to $\mathbf{x}_{-1}$, and $\hat{b}_1$ is the estimated parameter for the dose variable of interest.

  • To plot $\hat{x}_1$ as a function of p, $\mathbf{x}_{-1}$ must be specified. You can use the XDATA= option to provide the values of $\mathbf{x}_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow the rules:

    • If the effect contains a continuous variable (or variables), the overall mean of this effect is used.

    • If the effect is a single classification variable, the highest level of the variable is used.

options

  • add features to the plot.

  • You can use options in the IPPPLOT statement to

    • superimpose specification limits

    • suppress or add the observed data points on the plot

    • suppress or add the fiducial limits on the plot

    • specify graphical enhancements (such as color or text height)
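A sketch of an inverse predicted probability plot for a binomial model follows. The data set ASSAY and the color choice are hypothetical. INVERSECL is given as a MODEL statement option so that fiducial limits are computed, and NODATA removes the observed points from the plot.

```sas
/* Hypothetical bioassay data. */
data assay;
   input dose n r;
   datalines;
1 50 4
2 50 11
4 50 24
8 50 38
16 50 47
;
run;

proc probit data=assay log10;
   model r/n = dose / inversecl;       /* binomial (events/trials) model */
   ippplot var=dose nodata cfit=blue;  /* fitted curve with fiducial limits */
run;
```

Adding NOCONF to the IPPPLOT statement would suppress the fiducial limits and leave only the fitted curve.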

Summary of Options

The following tables list all options by function. The "Dictionary of Options" section on page 3728 describes each option in detail.

IPP Options
Table 60.12: Plot Layout Options for IPPPLOT

    Option                 Description
    NOCONF                 suppresses fiducial limits
    NODATA                 suppresses observed data points on the plot
    NOTHRESH               suppresses the threshold line
    THRESHLABPOS= value    specifies the position for the label of the threshold line

General Options
Table 60.13: Color Options

    Option           Description
    CAXIS= color     specifies color for axis
    CFIT= color      specifies color for fitted curves
    CFRAME= color    specifies color for frame
    CGRID= color     specifies color for grid lines
    CHREF= color     specifies color for HREF= lines
    CTEXT= color     specifies color for text
    CVREF= color     specifies color for VREF= lines

Table 60.14: Options to Enhance Plots Produced on Graphics Devices

    Option                    Description
    ANNOTATE= SAS-data-set    specifies an ANNOTATE data set
    INBORDER                  requests a border around plot
    LFIT= linetype            specifies line style for fitted curves and confidence limits
    LGRID= linetype           specifies line style for grid lines
    NOFRAME                   suppresses the frame around plotting areas
    NOGRID                    suppresses grid lines
    NOFIT                     suppresses fitted curves
    NOHLABEL                  suppresses horizontal labels
    NOHTICK                   suppresses horizontal ticks
    NOVTICK                   suppresses vertical ticks
    TURNVLABELS               vertically strings out characters in vertical labels
    WFIT= n                   specifies thickness for fitted curves
    WGRID= n                  specifies thickness for grids
    WREFL= n                  specifies thickness for reference lines

Table 60.15: Axis Options

    Option                                   Description
    HAXIS= value1 to value2 < by value3 >    specifies tick mark values for horizontal axis
    HOFFSET= value                           specifies offset for horizontal axis
    HLOWER= value                            specifies lower limit on horizontal axis scale
    HUPPER= value                            specifies upper limit on horizontal axis scale
    NHTICK= n                                specifies number of ticks for horizontal axis
    NVTICK= n                                specifies number of ticks for vertical axis
    VAXIS= value1 to value2 < by value3 >    specifies tick mark values for vertical axis
    VAXISLABEL= label                        specifies label for vertical axis
    VOFFSET= value                           specifies offset for vertical axis
    VLOWER= value                            specifies lower limit on vertical axis scale
    VUPPER= value                            specifies upper limit on vertical axis scale
    WAXIS= n                                 specifies thickness for axis

Table 60.16: Options for Reference Lines

    Option                              Description
    HREF<(INTERSECT)>=value-list        requests reference lines perpendicular to the horizontal axis
    HREFLABELS=(label1, ..., labeln)    specifies labels for HREF= lines
    HREFLABPOS= n                       specifies vertical position of labels for HREF= lines
    LHREF= linetype                     specifies line style for HREF= lines
    LVREF= linetype                     specifies line style for VREF= lines
    VREF<(INTERSECT)>=value-list        requests reference lines perpendicular to the vertical axis
    VREFLABELS=(label1, ..., labeln)    specifies labels for VREF= lines
    VREFLABPOS= n                       specifies horizontal position of labels for VREF= lines

Table 60.17: Graphics Catalog Options

    Option                 Description
    DESCRIPTION= string    specifies description for graphics catalog member
    NAME= string           specifies name for plot in graphics catalog

Table 60.18: Options for Text Enhancement

    Option             Description
    FONT= font         specifies software font for text
    HEIGHT= value      specifies height of text used outside framed areas
    INFONT= font       specifies software font for text inside framed areas
    INHEIGHT= value    specifies height of text inside framed areas

Dictionary of Options

The following entries provide detailed descriptions of the options in the IPPPLOT statement.

ANNOTATE= SAS-data-set

ANNO= SAS-data-set

  • specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference , that enables you to add features to the ipp plot. The ANNOTATE= data set you specify in the IPPPLOT statement is used for all plots created by the statement.

CAXIS= color

CAXES= color

  • specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CFIT= color

  • specifies the color for the fitted ipp curves. The default is the first color in the device color list.

CFRAME= color

CFR= color

  • specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.

CGRID= color

  • specifies the color for grid lines. The default is the first color in the device color list.

CHREF= color

CH= color

  • specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.

CTEXT= color

  • specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.

CVREF= color

CV= color

  • specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.

DESCRIPTION= string

DES= string

  • specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

FONT= font

  • specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.

HAXIS= value1 to value2 < by value3 >

  • specifies tick mark values for the horizontal axis. value1 , value2 , and value3 must be numeric, and value1 must be less than value2 . The lower tick mark is value1 . Tick marks are drawn at increments of value3 . The last tick mark is the greatest value that does not exceed value2 . If value3 is omitted, a value of 1 is used.

  • Examples of HAXIS= lists are:

      haxis = 0 to 10
      haxis = 2 to 10 by 2
      haxis = 0 to 200 by 10
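The tick rule described for HAXIS= behaves like an iterative DO loop in the DATA step. The short step below is only an illustration of that rule, not a description of how the procedure computes ticks; it lists the tick values implied by a hypothetical specification `haxis = 0 to 10 by 3`, where value2 is not itself a tick.

```sas
/* Illustration only: ticks implied by HAXIS= 0 to 10 by 3 */
data _null_;
   do tick = 0 to 10 by 3;
      put tick=;   /* writes 0, 3, 6, and 9 to the log; 10 is never reached */
   end;
run;
```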

HEIGHT= value

  • specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).

HLOWER= value

  • specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HOFFSET= value

  • specifies offset for horizontal axis. The default value is 1.

HUPPER= value

  • specifies the upper limit on the horizontal axis scale. The HUPPER= option specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HREF < (INTERSECT) > = value-list

  • requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.

HREFLABELS= label1, ..., labeln

HREFLABEL= label1, ..., labeln

HREFLAB= label1, ..., labeln

  • specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

HREFLABPOS= n

  • specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    top
    2    staggered from top
    3    bottom
    4    staggered from bottom
    5    alternating from top
    6    alternating from bottom
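A sketch of how the HREF=, HREFLABELS=, and HREFLABPOS= options might be used together follows. The data set `assay`, the reference value 0.5, and the label text are hypothetical; the example also assumes that the horizontal axis of the plot is on the probability scale, which you should confirm for your own plot.

```sas
/* Hypothetical dose-response data: one row per dose group */
data assay;
   input dose events trials;
   datalines;
1  1 10
2  2 12
3  4 10
4  5 10
5  8 12
6  8 10
7 10 10
;
run;

proc probit data=assay log10;
   model events/trials = dose / inversecl;
   ippplot var=dose
           href(intersect)=0.5     /* reference line plus its intersecting line */
           hreflabels='Median'     /* one quoted label per HREF= line           */
           hreflabpos=3            /* place the label at the bottom             */
           lhref=2;                /* dashed line, the documented default       */
run;
```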

INBORDER

  • requests a border around ipp plots.

LFIT= linetype

  • specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn by connecting solid lines ( linetype = 1 ) and confidence limits are drawn by connecting dashed lines ( linetype = 3 ).

LGRID= linetype

  • specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.

LHREF= linetype

LH= linetype

  • specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.

LVREF= linetype

LV= linetype

  • specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.

NAME= string

  • specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT .

NOCONF

  • suppresses fiducial limits from the plot.

NODATA

  • suppresses observed data points from the plot.

NOFIT

  • suppresses the fitted ipp curves.

NOFRAME

  • suppresses the frame around plotting areas.

NOGRID

  • suppresses grid lines.

NOHLABEL

  • suppresses horizontal labels.

NOHTICK

  • suppresses horizontal tick marks.

NOTHRESH

  • suppresses the threshold line.

NOVLABEL

  • suppresses vertical labels.

NOVTICK

  • suppresses vertical tick marks.

THRESHLABPOS= n

  • specifies the vertical position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    top
    2    bottom

VAXIS= value1 to value2 < by value3 >

  • specifies tick mark values for the vertical axis. value1 , value2 , and value3 must be numeric, and value1 must be less than value2 . The lower tick mark is value1 . Tick marks are drawn at increments of value3 . The last tick mark is the greatest value that does not exceed value2 . This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.

  • Examples of VAXIS= lists are:

      vaxis = 0 to 10
      vaxis = 0 to 2 by .1

VAXISLABEL= string

  • specifies a label for the vertical axis.

VLOWER= value

  • specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

VREF < (INTERSECT) > = value-list

  • requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.

VREFLABELS= label1, ..., labeln

VREFLABEL= label1, ..., labeln

VREFLAB= label1, ..., labeln

  • specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

VREFLABPOS= n

  • specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    left
    2    right

VUPPER= value

  • specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

WAXIS= n

  • specifies line thickness for axes and frame. The default value is 1.

WFIT= n

  • specifies line thickness for fitted curves. The default value is 1.

WGRID= n

  • specifies line thickness for grids. The default value is 1.

WREFL= n

  • specifies line thickness for reference lines. The default value is 1.

LPREDPLOT Statement

  • LPREDPLOT < VAR = variable >< options > ;

The LPREDPLOT statement plots the linear predictor $x'b$ against a single continuous variable (dose variable) in the MODEL statement for either the binomial model or the multinomial model. The confidence limits for the predicted values are available only for the binomial model.

VAR= ( variable )

  • specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement for which the linear predictor plot is plotted. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.

  • Let $x_1$ be the covariate of the dose variable, $x_{-1}$ be the vector of the rest of the covariates, $\hat{b}_{-1}$ be the vector of estimated parameters corresponding to $x_{-1}$, and $\hat{b}_1$ be the estimated parameter for the dose variable of interest.

  • To plot $x'\hat{b} = x_1 \hat{b}_1 + x_{-1}'\hat{b}_{-1}$ as a function of $x_1$, $x_{-1}$ must be specified. You can use the XDATA= option to provide the values of $x_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values, which follow these rules:

    • If the effect contains a continuous variable (or variables), the overall mean of this effect is used.

    • If the effect is a single classification variable, the highest level of the variable is used.

options

  • add features to the plot.

  • For the multinomial model, you can use the LEVEL= option to specify the levels for which the linear predictor lines are plotted. Each line is labeled at its midpoint with the name of its level.

  • You can use options in the LPREDPLOT statement to

    • superimpose specification limits

    • suppress or add the observed data points on the plot for the binomial model

    • suppress or add the confidence limits for the binomial model

    • specify the levels for which the linear predictor lines are requested for the multinomial model

    • specify graphical enhancements (such as color or text height)
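For the binomial model, a linear predictor plot could be requested along the following lines. This is a sketch under assumptions: the data set `assay` and its variables are hypothetical, and the color and thickness values are arbitrary choices rather than recommendations.

```sas
/* Hypothetical dose-response data: one row per dose group */
data assay;
   input dose events trials;
   datalines;
1  1 10
2  2 12
3  4 10
4  5 10
5  8 12
6  8 10
7 10 10
;
run;

proc probit data=assay log10;
   model events/trials = dose;
   /* NOCONF drops the confidence limits; CFIT= and WFIT= style the fitted line */
   lpredplot var=dose noconf cfit=blue wfit=2;
run;
```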

Summary of Options

The following tables list all options by function. The "Dictionary of Options" section describes each option in detail.

LPRED Options
Table 60.19: Plot Layout Options for LPREDPLOT

    Option                   Description
    LEVEL= character-list    specifies the names of the levels for which the linear predictor lines are requested (only for the multinomial model)
    NOCONF                   suppresses confidence limits (only for the binomial model)
    NODATA                   suppresses observed data points on the plot (only for the binomial model)
    NOTHRESH                 suppresses the threshold line
    THRESHLABPOS= value      specifies the position for the label of the threshold line

General Options
Table 60.20: Color Options

    Option           Description
    CAXIS= color     specifies color for axis
    CFIT= color      specifies color for fitted curves
    CFRAME= color    specifies color for frame
    CGRID= color     specifies color for grid lines
    CHREF= color     specifies color for HREF= lines
    CTEXT= color     specifies color for text
    CVREF= color     specifies color for VREF= lines

Table 60.21: Options to Enhance Plots Produced on Graphics Devices

    Option                    Description
    ANNOTATE= SAS-data-set    specifies an ANNOTATE data set
    INBORDER                  requests a border around plot
    LFIT= linetype            specifies line style for fitted curves and confidence limits
    LGRID= linetype           specifies line style for grid lines
    NOFRAME                   suppresses the frame around plotting areas
    NOGRID                    suppresses grid lines
    NOFIT                     suppresses fitted curves
    NOHLABEL                  suppresses horizontal labels
    NOHTICK                   suppresses horizontal ticks
    NOVTICK                   suppresses vertical ticks
    TURNVLABELS               vertically strings out characters in vertical labels
    WFIT= n                   specifies thickness for fitted curves
    WGRID= n                  specifies thickness for grids
    WREFL= n                  specifies thickness for reference lines

Table 60.22: Axis Options

    Option                                   Description
    HAXIS= value1 to value2 < by value3 >    specifies tick mark values for horizontal axis
    HOFFSET= value                           specifies offset for horizontal axis
    HLOWER= value                            specifies lower limit on horizontal axis scale
    HUPPER= value                            specifies upper limit on horizontal axis scale
    NHTICK= n                                specifies number of ticks for horizontal axis
    NVTICK= n                                specifies number of ticks for vertical axis
    VAXIS= value1 to value2 < by value3 >    specifies tick mark values for vertical axis
    VAXISLABEL= label                        specifies label for vertical axis
    VOFFSET= value                           specifies offset for vertical axis
    VLOWER= value                            specifies lower limit on vertical axis scale
    VUPPER= value                            specifies upper limit on vertical axis scale
    WAXIS= n                                 specifies thickness for axis

Table 60.23: Graphics Catalog Options

    Option                 Description
    DESCRIPTION= string    specifies description for graphics catalog member
    NAME= string           specifies name for plot in graphics catalog

Table 60.24: Options for Text Enhancement

    Option             Description
    FONT= font         specifies software font for text
    HEIGHT= value      specifies height of text used outside framed areas
    INFONT= font       specifies software font for text inside framed areas
    INHEIGHT= value    specifies height of text inside framed areas

 
Table 60.25: Options for Reference Lines

    Option                              Description
    HREF<(INTERSECT)>=value-list        requests reference lines perpendicular to the horizontal axis
    HREFLABELS=(label1, ..., labeln)    specifies labels for HREF= lines
    HREFLABPOS= n                       specifies vertical position of labels for HREF= lines
    LHREF= linetype                     specifies line style for HREF= lines
    LVREF= linetype                     specifies line style for VREF= lines
    VREF<(INTERSECT)>=value-list        requests reference lines perpendicular to the vertical axis
    VREFLABELS=(label1, ..., labeln)    specifies labels for VREF= lines
    VREFLABPOS= n                       specifies horizontal position of labels for VREF= lines

Dictionary of Options

The following entries provide detailed descriptions of the options in the LPREDPLOT statement.

ANNOTATE= SAS-data-set

ANNO= SAS-data-set

  • specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference , that enables you to add features to the lpred plot. The ANNOTATE= data set you specify in the LPREDPLOT statement is used for all plots created by the statement.

CAXIS= color

CAXES= color

  • specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CFIT= color

  • specifies the color for the fitted lpred lines. The default is the first color in the device color list.

CFRAME= color

CFR= color

  • specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.

CGRID= color

  • specifies the color for grid lines. The default is the first color in the device color list.

CHREF= color

CH= color

  • specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.

CTEXT= color

  • specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.

CVREF= color

CV= color

  • specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.

DESCRIPTION= string

DES= string

  • specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

FONT= font

  • specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.

HAXIS= value1 to value2 < by value3 >

  • specifies tick mark values for the horizontal axis. value1 , value2 , and value3 must be numeric, and value1 must be less than value2 . The lower tick mark is value1 . Tick marks are drawn at increments of value3 . The last tick mark is the greatest value that does not exceed value2 . If value3 is omitted, a value of 1 is used.

  • Examples of HAXIS= lists are:

      haxis = 0 to 10
      haxis = 2 to 10 by 2
      haxis = 0 to 200 by 10

HEIGHT= value

  • specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).

HLOWER= value

  • specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HOFFSET= value

  • specifies offset for horizontal axis. The default value is 1.

HUPPER= value

  • specifies the upper limit on the horizontal axis scale. The HUPPER= option specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HREF < (INTERSECT) > = value-list

  • requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.

HREFLABELS= label1, ..., labeln

HREFLABEL= label1, ..., labeln

HREFLAB= label1, ..., labeln

  • specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

HREFLABPOS= n

  • specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    top
    2    staggered from top
    3    bottom
    4    staggered from bottom
    5    alternating from top
    6    alternating from bottom

INBORDER

  • requests a border around lpred plots.

LEVEL= ( character-list )

ORDINAL= ( character-list )

  • specifies the names of the levels for which linear predictor lines are requested. Enclose each name in quotes and separate the names with spaces. If none of the names you provide matches a response level, no lpred line is plotted.
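The following sketch shows one way the LEVEL= option might be written for a three-level response. The data set `multi`, its variables, and the counts are hypothetical, and which level names the procedure accepts depends on the response levels of the fitted model, so treat the names in the list as an assumption.

```sas
/* Hypothetical multinomial data: n subjects per dose and symptom level */
data multi;
   input ldose symptoms $ n;
   datalines;
1 None   20
1 Mild    5
1 Severe  1
2 None   12
2 Mild    9
2 Severe  4
3 None    4
3 Mild   10
3 Severe 12
;
run;

/* ORDER=DATA keeps the levels in the order None, Mild, Severe */
proc probit data=multi order=data;
   class symptoms;
   model symptoms = ldose;
   weight n;
   lpredplot var=ldose level=('None' 'Mild');   /* quoted names separated by spaces */
run;
```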

LFIT= linetype

  • specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn by connecting solid lines ( linetype = 1 ) and confidence limits are drawn by connecting dashed lines ( linetype = 3 ).

LGRID= linetype

  • specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.

LHREF= linetype

LH= linetype

  • specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.

LVREF= linetype

LV= linetype

  • specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.

NAME= string

  • specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT .

NOCONF

  • suppresses confidence limits from the plot. This only works for the binomial model. Confidence limits are not plotted for the multinomial model.

NODATA

  • suppresses observed data points from the plot. This only works for the binomial model. Data points are not plotted for the multinomial model.

NOFIT

  • suppresses the fitted lpred lines.

NOFRAME

  • suppresses the frame around plotting areas.

NOGRID

  • suppresses grid lines.

NOHLABEL

  • suppresses horizontal labels.

NOHTICK

  • suppresses horizontal tick marks.

NOTHRESH

  • suppresses the threshold line.

NOVLABEL

  • suppresses vertical labels.

NOVTICK

  • suppresses vertical tick marks.

THRESHLABPOS= n

  • specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    left
    2    right

VAXIS= value1 to value2 < by value3 >

  • specifies tick mark values for the vertical axis. value1 , value2 , and value3 must be numeric, and value1 must be less than value2 . The lower tick mark is value1 . Tick marks are drawn at increments of value3 . The last tick mark is the greatest value that does not exceed value2 . This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.

  • Examples of VAXIS= lists are:

      vaxis = 0 to 10
      vaxis = 0 to 2 by .1

VAXISLABEL= string

  • specifies a label for the vertical axis.

VLOWER= value

  • specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

VREF < (INTERSECT) > = value-list

  • requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.

VREFLABELS= label1, ..., labeln

VREFLABEL= label1, ..., labeln

VREFLAB= label1, ..., labeln

  • specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

VREFLABPOS= n

  • specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    left
    2    right

VUPPER= number

  • specifies the upper limit on the vertical axis scale. The VUPPER= option specifies number as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

WAXIS= n

  • specifies line thickness for axes and frame. The default value is 1.

WFIT= n

  • specifies line thickness for fitted lines. The default value is 1.

WGRID= n

  • specifies line thickness for grids. The default value is 1.

WREFL= n

  • specifies line thickness for reference lines. The default value is 1.

MODEL Statement

  • < label: > MODEL response=effects < / options > ;

  • < label: > MODEL events/trials=effects < / options > ;

The MODEL statement names the variables used as the response and the independent variables. Additionally, you can specify the distribution used to model the response, as well as other options. Only a single MODEL statement can be used with one invocation of the PROBIT procedure. If multiple MODEL statements are present, only the last is used. Main effects and interaction terms can be specified in the MODEL statement, similar to the GLM procedure.

The optional label is used to label output from the matching MODEL statement.

The response can be a single variable with a value that is used to indicate the level of the observed response. Such a response variable must be listed in the CLASS statement. For example, the response might be a variable called Symptoms that takes on the values 'None', 'Mild', or 'Severe'. Note that, for dichotomous response variables, the probability of the lower sorted value is modeled by default (see the "Details" section). Because the model fit by the PROBIT procedure requires ordered response levels, you might need to use either the ORDER=DATA option in the PROC PROBIT statement or a numeric coding of the response to get the desired ordering of levels.
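The numeric-coding approach can be sketched as follows. The data sets `symptoms_raw` and `symptoms_coded`, the variable `severity`, and the counts are all hypothetical; the point is only that coding the levels as 1, 2, 3 makes the sorted order match the intended order without relying on ORDER=DATA.

```sas
/* Hypothetical data: n subjects per dose and symptom level */
data symptoms_raw;
   input ldose symptoms $ n;
   datalines;
1 None   20
1 Mild    5
1 Severe  1
2 None   12
2 Mild    9
2 Severe  4
3 None    4
3 Mild   10
3 Severe 12
;
run;

/* Recode the character levels so that the sorted order is None < Mild < Severe */
data symptoms_coded;
   set symptoms_raw;
   select (symptoms);
      when ('None')   severity = 1;
      when ('Mild')   severity = 2;
      when ('Severe') severity = 3;
      otherwise       severity = .;
   end;
run;

proc probit data=symptoms_coded;
   class severity;            /* the response variable must be in the CLASS statement */
   model severity = ldose;
   weight n;
run;
```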

Alternatively, the response can be specified as a pair of variable names separated by a slash (/). The value of the first variable, events , is the number of positive responses (or events). The value of the second variable, trials , is the number of trials. Both variables must be numeric and non-negative, and the ratio of the first variable value to the second variable value must be between 0 and 1, inclusive. For example, the variables might be hits , a variable containing the number of hits for a baseball player, and AtBats , a variable containing the number of times at bat. A model for hitting proportion (batting average) as a function of age could be specified as

  model hits/AtBats=age;  

The effects following the equal sign are the covariates in the model. Higher-order effects, such as interactions and nested terms, are allowed in the list, similar to the GLM procedure. Variable names and combinations of variable names representing higher-order terms are allowed to appear in this list. Class variables can be used as effects, and indicator variables are generated for the class levels. If you do not specify any covariates following the equal sign, an intercept-only model is fit.
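To make the events/trials syntax concrete, here is a minimal sketch that adds a classification effect to the batting example. The data set `batting`, the variable `league`, and all values are invented for illustration.

```sas
/* Hypothetical batting data: hits out of at-bats by age and league */
data batting;
   input age league $ hits AtBats;
   datalines;
22 A 31 120
25 A 45 150
28 A 52 160
31 A 40 140
23 B 28 115
26 B 47 155
29 B 55 165
33 B 36 130
;
run;

proc probit data=batting;
   class league;                       /* indicator variables are generated for league */
   model hits/AtBats = age league;     /* events/trials response with two effects      */
run;
```

Omitting everything after the equal sign (`model hits/AtBats = ;`) would request the intercept-only model described above.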

The following options are available in the MODEL statement.

AGGREGATE

AGGREGATE= (variable-list)

  • specifies the subpopulations on which the Pearson chi-square test statistic and the log-likelihood ratio chi-square test statistic (deviance) are calculated if the LACKFIT option is specified. See the section "Rescaling the Covariance Matrix" for details of Pearson's chi-square and deviance calculations.

  • Observations with common values in the given list of variables are regarded as coming from the same subpopulation. Variables in the list can be any variables in the input data set. Specifying the AGGREGATE option is equivalent to specifying the AGGREGATE= option with a variable list that includes all independent variables in the MODEL statement. The PROBIT procedure sorts the input data set according to the variables specified in this list. Information for the sorted data set is reported in the Response-Covariate Profile table.

  • The deviance and Pearson goodness-of-fit statistics are calculated if the LACKFIT option is specified in the MODEL statement. The calculated results are reported in the Goodness-of-Fit table. If the Pearson chi-square test is significant with the test level specified by the HPROB= option, the fiducial limits, if requested with the INVERSECL option in the MODEL statement, are modified (see the section "Inverse Confidence Limits" for details). Also, the covariance matrix is rescaled by the dispersion parameter when the SCALE= option is specified.
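As an illustration of the AGGREGATE= option, the sketch below pools rows that share the same dose into one subpopulation before the goodness-of-fit statistics are computed. The data set `assay2`, the variable `batch`, and the counts are hypothetical.

```sas
/* Hypothetical data: two batches tested at each dose */
data assay2;
   input dose batch $ events trials;
   datalines;
1 A 1 10
1 B 2 10
2 A 3 10
2 B 2 10
4 A 5 10
4 B 6 10
8 A 8 10
8 B 9 10
;
run;

proc probit data=assay2 log10;
   /* Rows with the same dose form one subpopulation, so four   */
   /* subpopulations enter the Pearson and deviance statistics  */
   model events/trials = dose / lackfit aggregate=(dose);
run;
```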

ALPHA= value

  • sets the significance level for the confidence intervals for regression parameters, fiducial limits for the predicted values, and confidence intervals for the predicted probabilities. The value must be between 0 and 1. The default value is ALPHA=0.05.

CONVERGE= value

  • specifies the convergence criterion. Convergence is declared when the maximum change in the parameter estimates between Newton-Raphson steps is less than the value specified. The change is a relative change if the parameter is greater than 0.01 in absolute value; otherwise, it is an absolute change.

  • By default, CONVERGE=1.0E-8.
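The rule can be illustrated with a small DATA step. This is a sketch of the stated criterion, not the procedure's internal code: the parameter values are invented, and dividing by the previous estimate to form the relative change is an assumption, because the text does not say which estimate is used as the denominator.

```sas
/* Illustration only: apply the stated convergence rule to made-up estimates */
data _null_;
   converge = 1e-8;
   array prev[3] _temporary_ (1.25  -0.004  0.50);
   array curr[3] _temporary_ (1.2500000001  -0.0040000000001  0.5000001);
   maxchange = 0;
   do i = 1 to 3;
      change = abs(curr[i] - prev[i]);
      /* relative change for parameters larger than 0.01 in absolute value */
      if abs(prev[i]) > 0.01 then change = change / abs(prev[i]);
      maxchange = max(maxchange, change);
   end;
   converged = (maxchange < converge);   /* 1 if the criterion is met, else 0 */
   put maxchange= converged=;
run;
```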

CORRB

  • displays the estimated correlation matrix of the parameter estimates.

COVB

  • displays the estimated covariance matrix of the parameter estimates.

DISTRIBUTION= distribution-type

DIST= distribution-type

D= distribution-type

  • specifies the cumulative distribution function used to model the response probabilities. The distributions are described in the "Details" section. Valid values for distribution-type are

    NORMAL                               the normal distribution for the probit model
    LOGISTIC                             the logistic distribution for the logit model
    EXTREMEVALUE | EXTREME | GOMPERTZ    the extreme value, or Gompertz, distribution for the gompit model

  • By default, DISTRIBUTION=NORMAL.
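For example, a logit model could be requested with the DIST= alias as sketched below; the data set `assay` and its variables are hypothetical.

```sas
/* Hypothetical dose-response data: one row per dose group */
data assay;
   input dose events trials;
   datalines;
1  1 10
2  2 12
3  4 10
4  5 10
5  8 12
6  8 10
7 10 10
;
run;

proc probit data=assay log10;
   model events/trials = dose / dist=logistic;   /* logistic CDF instead of the normal default */
run;
```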

HPROB= p

  • specifies a minimum probability level for the Pearson chi-square to indicate a good fit. The default value is 0.10. The LACKFIT option must also be specified for this option to have any effect. For Pearson goodness-of-fit chi-square values with probability greater than the HPROB= value, the fiducial limits, if requested with the INVERSECL option, are computed using a critical value of 1.96. For chi-square values with probability less than the value of the HPROB= option, the critical value is a 0.95 two-sided quantile value taken from the t distribution with degrees of freedom equal to $(k - 1) \times m - q$, where $k$ is the number of levels for the response variable, $m$ is the number of different sets of independent variable values, and $q$ is the number of parameters fit in the model. If you specify the HPROB= option in both the PROC PROBIT and MODEL statements, the MODEL statement option takes precedence.
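A worked example of the degrees-of-freedom formula $(k - 1) \times m - q$ may help. The values of k, m, and q below are hypothetical (a binary response, seven distinct dose values, and two fitted parameters), and reading the "0.95 two-sided quantile" as the 0.975 quantile of the t distribution is an interpretation, chosen because the same reading of the normal distribution gives the 1.96 quoted above.

```sas
/* Illustration only: critical values for hypothetical k, m, and q */
data _null_;
   k = 2;                                  /* response levels                  */
   m = 7;                                  /* distinct covariate settings      */
   q = 2;                                  /* parameters fit in the model      */
   df = (k - 1) * m - q;                   /* (2 - 1) * 7 - 2 = 5              */
   t_crit = quantile('T', 0.975, df);      /* about 2.57 for 5 df              */
   z_crit = quantile('NORMAL', 0.975);     /* about 1.96                       */
   put df= t_crit= z_crit=;
run;
```

With these values the t-based critical value is noticeably larger than 1.96, so the fiducial limits widen when the Pearson test indicates lack of fit.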

INITIAL= values

  • sets initial values for the parameters in the model other than the intercept. The values must be given in the order in which the variables are listed in the MODEL statement. If some of the independent variables listed in the MODEL statement are classification variables, then there must be as many values given for that variable as there are classification levels minus 1. The INITIAL= option can be specified as follows.

    Type of List                Specification
    list separated by blanks    initial=3 4 5
    list separated by commas    initial=3, 4, 5

  • By default, all parameters have initial estimates of zero.

  • Note: The INITIAL= option is overridden by the INEST= option in the PROC PROBIT statement.
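The counting rule for classification variables can be sketched as follows. The data set `assay3`, the three-level variable `batch`, and the starting values themselves are hypothetical; the example only shows that one value is supplied for `dose` and two (three levels minus 1) for `batch`, in MODEL statement order.

```sas
/* Hypothetical data: three batches tested at three doses */
data assay3;
   input dose batch $ events trials;
   datalines;
1 A 1 10
2 A 3 10
4 A 6 10
1 B 2 10
2 B 4 10
4 B 7 10
1 C 1 10
2 C 2 10
4 C 5 10
;
run;

proc probit data=assay3;
   class batch;
   /* initial= : 0.5 for dose, then 0.1 and 0.2 for the two batch parameters */
   /* intercept= : starting value for the intercept                          */
   model events/trials = dose batch / initial=0.5 0.1 0.2 intercept=-1.5;
run;
```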

INTERCEPT= value

  • initializes the intercept parameter to value . By default, INTERCEPT=0.

INVERSECL

  • computes confidence limits for the values of the first continuous independent variable (such as dose) that yield selected response rates. If the algorithm fails to converge (this can happen when C is nonzero), missing values are reported for the confidence limits. See the section "Inverse Confidence Limits" for details.

ITPRINT

  • displays the iteration history, the final evaluation of the gradient, and the second derivative matrix (Hessian).

LACKFIT

  • performs two goodness-of-fit tests (a Pearson chi-square test and a log-likelihood ratio chi-square test) for the fitted model.

  • To compute the test statistics, proper grouping of the observations into subpopulations is needed. You can use the AGGREGATE or AGGREGATE= option to this end. See the entry for the AGGREGATE and AGGREGATE= options under the MODEL statement. If neither AGGREGATE nor AGGREGATE= is specified, PROC PROBIT assumes each observation is from a separate subpopulation and computes the goodness-of-fit test statistics only for the events/trials syntax.

  • Note: This test is not appropriate if the data are very sparse, with only a few values at each set of the independent variable values.

  • If the Pearson chi-square test statistic is significant, then the covariance estimates and standard error estimates are adjusted. See the section Lack of Fit Tests on page 3759 for a description of the tests. Note that the LACKFIT option can also appear in the PROC PROBIT statement. See the section PROC PROBIT Statement on page 3711 for details.
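As an illustration of the grouping rules described above, the sketch below requests the goodness-of-fit tests in two ways. The data sets ASSAY (events/trials layout) and SUBJECTS (one row per subject with a binary response Y) are hypothetical, and the HPROB= value is arbitrary.

```sas
/* Events/trials syntax: each observation is treated as its own
   subpopulation, so no AGGREGATE option is needed. */
proc probit data=assay;
   model r/n = dose / lackfit hprob=0.2 inversecl;
run;

/* Single-variable response syntax: AGGREGATE groups the observations
   into subpopulations so that the test statistics can be computed. */
proc probit data=subjects;
   model y = dose / lackfit aggregate;
run;
```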

MAXITER= value

MAXIT= value

  • specifies the maximum number of iterations to be performed in estimating the parameters. By default, MAXITER=50.

NOINT

  • fits a model with no intercept parameter. If the INTERCEPT= option is also specified, the intercept is fixed at the specified value; otherwise, it is set to zero. This is most useful when the response is binary. When the response has $k$ levels, $k - 1$ intercept parameters are fit. The NOINT option sets the intercept parameter corresponding to the lowest response level equal to zero. A Lagrange multiplier, or score, test for the restricted model is computed when the NOINT option is specified.

SCALE= scale

  • enables you to specify the method for estimating the dispersion parameter. To correct for overdispersion or underdispersion, the covariance matrix is multiplied by the estimate of the dispersion parameter. Valid values for scale are as follows:

    D | DEVIANCE

    specifies that the dispersion parameter be estimated by the deviance divided by its degrees of freedom.

    P | PEARSON

    specifies that the dispersion parameter be estimated by the Pearson chi-square statistic divided by its degrees of freedom. This is the default.

  • You can use the AGGREGATE= option to define the subpopulations for calculating the Pearson chi-square statistic and the deviance.

  • The Goodness-of-Fit table includes the Pearson chi-square statistic, the deviance, their degrees of freedom, the ratio of each statistic divided by its degrees of freedom, and the corresponding p-value.
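A short sketch of requesting a dispersion adjustment follows. The data set ASSAY and its variables are hypothetical.

```sas
/* Scale the covariance matrix by deviance / degrees of freedom.
   With events/trials syntax each observation is one subpopulation. */
proc probit data=assay;
   model r/n = dose / scale=deviance;
run;
```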

SINGULAR= value

  • specifies the singularity criterion for determining linear dependencies in the set of independent variables. The sum of squares and cross-products matrix of the independent variables is formed and swept. If the relative size of a pivot becomes less than the value specified, then the variable corresponding to the pivot is considered to be linearly dependent on the previous set of variables considered. By default, SINGULAR=1E-12.

OUTPUT Statement

  • OUTPUT < OUT=SAS-data-set > < keyword=name ... keyword=name > ;

The OUTPUT statement creates a new SAS data set containing all variables in the input data set and, optionally, the fitted probabilities, the estimate of $x'\beta$, and the estimate of its standard error. Estimates of the probabilities, $x'\beta$, and the standard errors are computed for observations with missing response values as long as the values of all the explanatory variables are nonmissing. This enables you to compute these statistics for additional settings of the explanatory variables that are of interest but for which responses are not observed.

You can specify multiple OUTPUT statements. Each OUTPUT statement creates a new data set and applies only to the preceding MODEL statement. If you want to create a permanent SAS data set, you must specify a two-level name (refer to SAS Language Reference: Concepts for more information on permanent SAS data sets).

Details on the specifications in the OUTPUT statement are as follows:

keyword=name

specifies the statistics to include in the output data set and assigns names to the new variables that contain the statistics. Specify a keyword for each desired statistic (see the following list of keywords), an equal sign, and the variable to contain the statistic.

 

The keywords allowed and the statistics they represent are as follows:

 

PROB | P

cumulative probability estimates

STD

standard error estimates of $a_j + x'b$

XBETA

estimates of $a_j + x'\beta$

OUT= SAS-data-set

names the output data set. By default, the new data set is named using the DATA n convention.

When the single variable response syntax is used, the _LEVEL_ variable is added to the output data set, and there are $k - 1$ output observations for each input observation, where $k$ is the number of response levels. There is no observation output corresponding to the highest response level. For each of the $k - 1$ observations, the PROB variable contains the fitted probability of obtaining a response level up to the level indicated by the _LEVEL_ variable, the XBETA variable contains $a_j + x'b$, where $j$ references the levels ($a_1 = 0$), and the STD variable contains the standard error estimate of the XBETA variable. See the Details section, which follows, for the formulas for the parameterizations.
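The sketch below shows one way the OUTPUT statement keywords might be used together. The data set ASSAY and the output variable names (PHAT, LINPRED, SE_LINPRED) are hypothetical choices for illustration.

```sas
proc probit data=assay;
   model r/n = dose;
   /* WORK.FITTED is a temporary data set; a two-level name that points
      to a permanent library would create a permanent data set instead. */
   output out=work.fitted p=phat xbeta=linpred std=se_linpred;
run;

/* Inspect the input variables alongside the new statistics. */
proc print data=work.fitted;
run;
```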

PREDPPLOT Statement

  • PREDPPLOT < var = variable >< options > ;

The PREDPPLOT statement plots the predicted probability against a single continuous variable (dose variable) in the MODEL statement for both the binomial model and the multinomial model. Confidence limits are only available for the binomial model. An attached box on the right side of the plot is used to label predicted probability curves with the names of their levels for the multinomial model. You can specify the color of this box using the CLABBOX= option.

VAR= (variable)

  • specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.

  • The predicted probability is

    $$\hat{p} = C + (1 - C)\,F(x'\hat{b})$$

    for the binomial model and

    $$\hat{p}_1 = C + (1 - C)\,F(x'\hat{b})$$
    $$\hat{p}_j = (1 - C)\left(F(\hat{a}_j + x'\hat{b}) - F(\hat{a}_{j-1} + x'\hat{b})\right), \quad j = 2, \ldots, k - 1$$
    $$\hat{p}_k = (1 - C)\left(1 - F(\hat{a}_{k-1} + x'\hat{b})\right)$$

    for the multinomial model with $k$ response levels, where $F$ is the cumulative distribution function used to model the probability, $x'$ is the vector of the covariates, $\hat{a}_j$ are the estimated ordinal intercepts with $\hat{a}_1 = 0$, $C$ is the threshold parameter, either known or estimated from the model, and $\hat{b}$ is the vector of estimated parameters.

  • To plot $\hat{p}$ (or $\hat{p}_j$) as a function of a continuous variable $x_1$, the remaining covariates $x_{-1}$ must be specified. You can use the XDATA= option to provide the values of $x_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow these rules:

  • If the effect contains a continuous variable (or variables), the overall mean of this effect is used.

  • If the effect is a single classification variable, the highest level of the variable is used.
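As a quick numerical check, assume the usual natural-response form of the binomial predicted probability, $\hat{p} = C + (1 - C)F(x'\hat{b})$, with a standard normal $F = \Phi$. The values $C = 0.1$ and $x'\hat{b} \in \{0, 1\}$ below are made up for illustration only.

```latex
\begin{align*}
x'\hat{b} = 0:\quad \hat{p} &= 0.1 + (1 - 0.1)\,\Phi(0) = 0.1 + 0.9 \times 0.5 = 0.55 \\
x'\hat{b} = 1:\quad \hat{p} &= 0.1 + (1 - 0.1)\,\Phi(1) \approx 0.1 + 0.9 \times 0.8413 \approx 0.857
\end{align*}
```

Under this form the curve starts at $C$ rather than 0 as $x'\hat{b} \to -\infty$, which is why a nonzero threshold shifts the whole plot upward.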

options

  • enable you to plot the observed data and add features to the plot.

  • You can use options in the PREDPPLOT statement to

    • superimpose specification limits

    • suppress or add observed data points for the binomial model

    • suppress or add confidence limits for the binomial model

    • specify the levels for which predicted probability curves are requested for the multinomial model

    • specify graphical enhancements (such as color or text height)
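A minimal sketch of a PREDPPLOT request for a binomial model follows. The data set ASSAY and its variables are hypothetical, and the particular options chosen are only one possible combination.

```sas
proc probit data=assay;
   model r/n = dose;
   /* Plot predicted probability against DOSE, keep the observed data
      points, drop the confidence limits, and draw a border. */
   predpplot var=dose noconf cfit=blue inborder;
run;
```

Because DOSE is the only continuous variable in this model, omitting VAR= would select the same variable.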

Summary of Options

The following tables list all options by function. The Dictionary of Options on page 3749 describes each option in detail.

PREDPPLOT Options
Table 60.26: Plot Layout Options for PREDPPLOT

    LEVEL= character-list    specifies the names of the levels for which the predicted probability curves are requested (only for the multinomial model)
    NOCONF                   suppresses confidence limits
    NODATA                   suppresses observed data points on the plot
    NOTHRESH                 suppresses the threshold line
    THRESHLABPOS= value      specifies the position for the label of the threshold line

General Options
Table 60.27: Color Options

    CAXIS= color      specifies color for the axes
    CFIT= color       specifies color for fitted curves
    CFRAME= color     specifies color for frame
    CGRID= color      specifies color for grid lines
    CHREF= color      specifies color for HREF= lines
    CLABBOX= color    specifies color for label box
    CTEXT= color      specifies color for text
    CVREF= color      specifies color for VREF= lines

Table 60.28: Options to Enhance Plots Produced on Graphics Devices

    ANNOTATE= SAS-data-set    specifies an ANNOTATE data set
    INBORDER                  requests a border around plot
    LFIT= linetype            specifies line style for fitted curves and confidence limits
    LGRID= linetype           specifies line style for grid lines
    NOFRAME                   suppresses the frame around plotting areas
    NOGRID                    suppresses grid lines
    NOFIT                     suppresses fitted curves
    NOHLABEL                  suppresses horizontal labels
    NOHTICK                   suppresses horizontal ticks
    NOVTICK                   suppresses vertical ticks
    TURNVLABELS               vertically strings out characters in vertical labels
    WFIT= n                   specifies thickness for fitted curves
    WGRID= n                  specifies thickness for grids
    WREFL= n                  specifies thickness for reference lines

Table 60.29: Axis Options

    HAXIS= value1 to value2 < by value3 >    specifies tick mark values for horizontal axis
    HOFFSET= value                           specifies offset for horizontal axis
    HLOWER= value                            specifies lower limit on horizontal axis scale
    HUPPER= value                            specifies upper limit on horizontal axis scale
    NHTICK= n                                specifies number of ticks for horizontal axis
    NVTICK= n                                specifies number of ticks for vertical axis
    VAXIS= value1 to value2 < by value3 >    specifies tick mark values for vertical axis
    VAXISLABEL= label                        specifies label for vertical axis
    VOFFSET= value                           specifies offset for vertical axis
    VLOWER= value                            specifies lower limit on vertical axis scale
    VUPPER= value                            specifies upper limit on vertical axis scale
    WAXIS= n                                 specifies thickness for axis

Table 60.30: Graphics Catalog Options

    DESCRIPTION= string    specifies description for graphics catalog member
    NAME= string           specifies name for plot in graphics catalog

Table 60.31: Options for Text Enhancement

    FONT= font         specifies software font for text
    HEIGHT= value      specifies height of text used outside framed areas
    INFONT= font       specifies software font for text inside framed areas
    INHEIGHT= value    specifies height of text inside framed areas

Table 60.32: Options for Reference Lines

    HREF<(INTERSECT)>= value-list         requests horizontal reference line
    HREFLABELS= (label1, ..., labeln)     specifies labels for HREF= lines
    HREFLABPOS= n                         specifies vertical position of labels for HREF= lines
    LHREF= linetype                       specifies line style for HREF= lines
    LVREF= linetype                       specifies line style for VREF= lines
    VREF<(INTERSECT)>= value-list         requests vertical reference line
    VREFLABELS= (label1, ..., labeln)     specifies labels for VREF= lines
    VREFLABPOS= n                         specifies horizontal position of labels for VREF= lines

Dictionary of Options

The following entries provide detailed descriptions of the options in the PREDPPLOT statement.

ANNOTATE= SAS-data-set

ANNO= SAS-data-set

  • specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the predicted probability plot. The ANNOTATE= data set you specify in the PREDPPLOT statement is used for all plots created by the statement.

CAXIS= color

CAXES= color

  • specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CFIT= color

  • specifies the color for the fitted predicted probability curves. The default is the first color in the device color list.

CFRAME= color

CFR= color

  • specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.

CGRID= color

  • specifies the color for grid lines. The default is the first color in the device color list.

CHREF= color

CH= color

  • specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.

CTEXT= color

  • specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.

CVREF= color

CV= color

  • specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.

DESCRIPTION= string

DES= string

  • specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

FONT= font

  • specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.

HAXIS= value1 to value2 < by value3 >

  • specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.

    Examples of HAXIS= lists are:

    haxis = 0 to 10
    haxis = 2 to 10 by 2
    haxis = 0 to 200 by 10

HEIGHT= value

  • specifies the height of text used outside framed areas.

HLOWER= value

  • specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HOFFSET= value

  • specifies the offset for the horizontal axis. The default value is 1.

HUPPER= value

  • specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HREF < (INTERSECT) > = value-list

  • requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.

HREFLABELS= label1, ..., labeln

HREFLABEL= label1, ..., labeln

HREFLAB= label1, ..., labeln

  • specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

HREFLABPOS= n

  • specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    top
    2    staggered from top
    3    bottom
    4    staggered from bottom
    5    alternating from top
    6    alternating from bottom
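The reference line options above might be combined as in the following sketch. The data set ASSAY, the reference value 2, and the label text are hypothetical.

```sas
proc probit data=assay;
   model r/n = dose;
   predpplot var=dose
             href(intersect)=2     /* line at DOSE = 2 plus the intersecting line */
             hreflabels='Dose 2'   /* one quoted label per HREF= line */
             hreflabpos=1          /* place the label at the top */
             lhref=2               /* dashed line type */
             chref=red;
run;
```

Because a label is given for the HREF= line, the intersecting line perpendicular to the vertical axis would be labeled with the corresponding vertical axis value.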

INBORDER

  • requests a border around predicted probability plots.

LEVEL= ( character-list )

ORDINAL= ( character-list )

  • specifies the names of the levels for which predicted probability curves are requested. Names should be quoted and separated by spaces. If none of the names matches a response level, no fitted probability curve is plotted.
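A sketch of the LEVEL= option for a multinomial model follows. The data set TRIAL, the ordinal response SYMPTOMS with levels none, mild, and severe, and the variable DOSE are all hypothetical. Listing the response in the CLASS statement is an assumption made here for a character-valued response.

```sas
/* Hypothetical ordinal response SYMPTOMS: 'none', 'mild', 'severe' */
proc probit data=trial;
   class symptoms;   /* assumption: the multilevel response is declared as a classification variable */
   model symptoms = dose;
   /* Request curves for two of the three levels only. */
   predpplot var=dose level=("mild" "severe");
run;
```

Confidence limits and observed data points are not drawn here, since those are available only for the binomial model.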

LFIT= linetype

  • specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn as solid lines (linetype = 1) and confidence limits are drawn as dashed lines (linetype = 3).

LGRID= linetype

  • specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.

LHREF= linetype

LH= linetype

  • specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.

LVREF= linetype

LV= linetype

  • specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.

NAME= string

  • specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT.

NOCONF

  • suppresses confidence limits from the plot. This option applies only to the binomial model; confidence limits are not plotted for the multinomial model.

NODATA

  • suppresses observed data points from the plot. This option applies only to the binomial model; data points are not plotted for the multinomial model.

NOFIT

  • suppresses the fitted predicted probability curves.

NOFRAME

  • suppresses the frame around plotting areas.

NOGRID

  • suppresses grid lines.

NOHLABEL

  • suppresses horizontal labels.

NOHTICK

  • suppresses horizontal tick marks.

NOTHRESH

  • suppresses the threshold line.

NOVLABEL

  • suppresses vertical labels.

NOVTICK

  • suppresses vertical tick marks.

THRESHLABPOS= n

  • specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    left
    2    right

VAXIS= value1 to value2 < by value3 >

  • specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.

    Examples of VAXIS= lists are:

    vaxis = 0 to 10
    vaxis = 0 to 2 by .1

VAXISLABEL= string

  • specifies a label for the vertical axis.

VLOWER= value

  • specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

VREF < (INTERSECT) > = value-list

  • requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.

VREFLABELS= label1, ..., labeln

VREFLABEL= label1, ..., labeln

VREFLAB= label1, ..., labeln

  • specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

VREFLABPOS= n

  • specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.

    n    label placement
    1    left
    2    right

VUPPER= value

  • specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

WAXIS= n

  • specifies line thickness for axes and frame. The default value is 1.

WFIT= n

  • specifies line thickness for fitted curves. The default value is 1.

WGRID= n

  • specifies line thickness for grids. The default value is 1.

WREFL= n

  • specifies line thickness for reference lines. The default value is 1.

WEIGHT Statement

  • WEIGHT variable ;

A WEIGHT statement can be used with PROC PROBIT to weight each observation by the value of the variable specified. The contribution of each observation to the likelihood function is multiplied by the value of the weight variable. Observations with zero, negative, or missing weights are not used in model estimation.




SAS.STAT 9.1 Users Guide (Vol. 5)
Year: 2004
Pages: 98
