Syntax | SAS/STAT 9.1 User's Guide (Vol. 5)

The following statements are available in PROC PROBIT.

PROC PROBIT < options > ;
- MODEL response=independents < / options > ;
- BY variables ;
- CLASS variables ;
- OUTPUT < OUT= SAS-data-set >< options > ;
- WEIGHT variable ;
- CDFPLOT < VAR = variable >< options > ;
- INSET < keyword-list >< / options > ;
- IPPPLOT < VAR = variable >< options > ;
- LPREDPLOT < VAR = variable >< options > ;
- PREDPPLOT < VAR = variable >< options > ;

A MODEL statement is required. Only a single MODEL statement can be used with one invocation of the PROBIT procedure. If multiple MODEL statements are present, only the last one is used. Main effects and higher-order terms can be specified in the MODEL statement, similar to the GLM procedure. If a CLASS statement is used, it must precede the MODEL statement.

The CDFPLOT, INSET, IPPPLOT, LPREDPLOT, and PREDPPLOT statements are used to produce graphical output. You can use any appropriate combination of the graphical statements after the MODEL statement.
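
As a minimal sketch of how these statements fit together, the following program fits a binomial dose-response model and requests one plot. The data set name assay, its variables (dose, n, r), and the data values are hypothetical and serve only to make the example self-contained.

```sas
/* Hypothetical data: n subjects tested and r responders at each dose */
data assay;
   input dose n r;
   datalines;
1 20 2
2 20 5
4 20 9
8 20 14
16 20 18
;
run;

proc probit data=assay log10;
   /* Required MODEL statement, events/trials syntax */
   model r/n = dose / lackfit inversecl;
   /* Graphical statements follow the MODEL statement */
   predpplot var=dose;
run;
```

A CLASS statement, if one were needed, would go before the MODEL statement.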

PROC PROBIT Statement

PROC PROBIT < options > ;

The PROC PROBIT statement starts the procedure. You can specify the following options in the PROC PROBIT statement.

COVOUT

writes the parameter estimate covariance matrix to the OUTEST= data set.

C= rate

OPTC

controls how the natural response is handled. Specify the OPTC option to request that the natural response rate C be estimated. Specify the C= rate option to set the natural response rate or to provide the initial estimate of the natural response rate. The natural response rate value must be a number between 0 and 1.
- If you specify neither the OPTC nor the C= option, a natural response rate of zero is assumed.
- If you specify both the OPTC and the C= option, the C= option should be a reasonable initial estimate of the natural response rate. For example, you could use the ratio of the number of responses to the number of subjects in a control group.
- If you specify the C= option but not the OPTC option, the natural response rate is set to the specified value and not estimated.
- If you specify the OPTC option but not the C= option, PROC PROBIT's action depends on the response variable, as follows:
  - If you specify either the LN or LOG10 option and some subjects have the first independent variable (dose) values less than or equal to zero, these subjects are treated as a control group. The initial estimate of C is then the ratio of the number of responses to the number of subjects in this group.
  - If you do not specify the LN or LOG10 option or if there is no control group, then one of the following occurs:
    - If all responses are greater than zero, the initial estimate of the natural response rate is the minimal response rate (the ratio of the number of responses to the number of subjects in a dose group) across all dose levels.
    - If one or more of the responses is zero (making the response rate zero in that dose group), the initial estimate of the natural rate is the reciprocal of twice the largest number of subjects in any dose group in the experiment.
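
The following sketch contrasts fixing the natural response rate with estimating it. The data set assay, its values, and the rate 0.05 are illustrative assumptions; a zero-dose group is included on purpose so that the control-group rule applies.

```sas
/* Hypothetical data with a zero-dose control group (1 response in 20) */
data assay;
   input dose n r;
   datalines;
0 20 1
1 20 3
2 20 6
4 20 10
8 20 15
;
run;

/* C= alone: the natural response rate is fixed at 0.05, not estimated */
proc probit data=assay log10 c=0.05;
   model r/n = dose;
run;

/* OPTC with C=: the rate is estimated, starting from 0.05 */
proc probit data=assay log10 optc c=0.05;
   model r/n = dose;
run;

/* OPTC alone with LOG10: the dose=0 group is treated as a control group,
   so by the rule above the initial estimate would be 1/20 = 0.05 */
proc probit data=assay log10 optc;
   model r/n = dose;
run;
```

Under the last rule in the list, if there were no control group, some dose group had zero responses, and the largest dose group had 20 subjects, the initial estimate would be 1/(2 × 20) = 0.025.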

DATA= SAS-data-set

specifies the SAS data set to be used by PROC PROBIT. By default, the procedure uses the most recently created SAS data set.

GOUT= graphics-catalog

specifies a graphics catalog in which to save graphics output.

HPROB= p

specifies a minimum probability level for the Pearson chi-square to indicate a good fit. The default value is 0.10. The LACKFIT option must also be specified for this option to have any effect. For Pearson goodness-of-fit chi-square values with probability greater than the HPROB= value, the fiducial limits, if requested with the INVERSECL option, are computed using a critical value of 1.96. For chi-square values with probability less than the value of the HPROB= option, the critical value is a 0.95 two-sided quantile value taken from the t distribution with degrees of freedom equal to (k - 1) × m - q, where k is the number of levels for the response variable, m is the number of different sets of independent variable values, and q is the number of parameters fit in the model. Note that the HPROB= option can also appear in the MODEL statement.
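
As a worked illustration of the degrees-of-freedom formula, with values chosen purely for the example: a binary response (k = 2) observed at m = 6 distinct settings of the independent variables, with q = 2 fitted parameters (say, an intercept and a slope), gives

```latex
\[
(k - 1) \times m - q = (2 - 1) \times 6 - 2 = 4 .
\]
```

In this hypothetical case, when the Pearson chi-square probability falls below the HPROB= value, the fiducial limits would use a t critical value with 4 degrees of freedom in place of 1.96.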

INEST= SAS-data-set

specifies an input SAS data set that contains initial estimates for all the parameters in the model. See the section INEST= SAS-data-set on page 3757 for a detailed description of the contents of the INEST= data set.

INVERSECL

computes confidence limits for the values of the first continuous independent variable (such as dose) that yield selected response rates. If the algorithm fails to converge (this can happen when C is nonzero), missing values are reported for the confidence limits. See the section Inverse Confidence Limits on page 3761 for details. Note that the INVERSECL option can also appear in the MODEL statement.

LACKFIT

performs two goodness-of-fit tests (a Pearson chi-square test and a log-likelihood ratio chi-square test) for the fitted model.
To compute the test statistics, the observations must be properly grouped into subpopulations. You can use the AGGREGATE or AGGREGATE= option for this purpose. See the entry for the AGGREGATE and AGGREGATE= options under the MODEL statement. If neither AGGREGATE nor AGGREGATE= is specified, PROC PROBIT assumes each observation is from a separate subpopulation and computes the goodness-of-fit test statistics only for the events/trials syntax.
Note: This test is not appropriate if the data are very sparse, with only a few values at each set of the independent variable values.

If the Pearson chi-square test statistic is significant, then the covariance estimates and standard error estimates are adjusted. See the Lack of Fit Tests section on page 3759 for a description of the tests. Note that the LACKFIT option can also appear in the MODEL statement.
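
A sketch of requesting the tests for subject-level data follows. The data set indiv and its values are hypothetical, and the toy data are far too small for a meaningful test; the point is only the placement of the options. The example assumes that AGGREGATE without a variable list groups observations by the distinct values of the independent variables.

```sas
/* Hypothetical subject-level data: y = 1 (response) or 0 (no response) */
data indiv;
   input dose y @@;
   datalines;
1 0  1 0  1 0  1 1
2 0  2 0  2 1  2 1
4 0  4 1  4 1  4 1
8 1  8 1  8 0  8 1
;
run;

proc probit data=indiv;
   /* A single response variable must also appear in CLASS */
   class y;
   /* LACKFIT requests the tests; AGGREGATE defines the subpopulations */
   model y = dose / lackfit aggregate;
run;
```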

LOG

analyzes the data by replacing the first continuous independent variable by its natural logarithm. This variable is usually the level of some treatment such as dosage. In addition to the usual output given by the INVERSECL option, the estimated dose values and 95% fiducial limits for dose are also displayed. If you specify the OPTC option, any observations with a dose value less than or equal to zero are used in the estimation as a control group. If you do not specify the OPTC option with the LOG or LN option, then any observations with the first continuous independent variable values less than or equal to zero are ignored.

LOG10

specifies an analysis like that of the LN or LOG option except that the common logarithm (log to the base 10) of the dose value is used rather than the natural logarithm.

NAMELEN= n

specifies the length of effect names in tables and output data sets to be n characters, where n is a value between 20 and 200. The default length is 20 characters.

NOPRINT

suppresses the display of all output. Note that this option temporarily disables the Output Delivery System (ODS). For more information, see Chapter 14, Using the Output Delivery System.

OPTC

controls how the natural response is handled. See the description of the C= option on page 3711 for details.

ORDER=DATA | FORMATTED | FREQ | INTERNAL

specifies the sorting order for the levels of the classification variables specified in the CLASS statement, including the levels of the response variable. Response level ordering is important since PROC PROBIT always models the probability of response levels at the beginning of the ordering. See the section Response Level Ordering on page 3754 for further details. This ordering also determines which parameters in the model correspond to each level in the data. The following table shows how PROC PROBIT interprets values of the ORDER= option.

Value of ORDER=	Levels Sorted By
DATA	order of appearance in the input data set
FORMATTED	formatted value
FREQ	descending frequency count; levels with the most observations come first in the order
INTERNAL	unformatted value

By default, ORDER=FORMATTED. For the values FORMATTED and INTERNAL, the sort order is machine dependent. For more information on sorting order, see the chapter on the SORT procedure in the SAS Procedures Guide.
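
Because the modeled probability depends on which response level comes first, the ORDER= option can change the sign and interpretation of the fit. The sketch below uses a hypothetical data set trial with a character response outcome to show the difference between the default ordering and ORDER=DATA.

```sas
/* Hypothetical data: the first observation has outcome 'yes' */
data trial;
   input outcome $ dose @@;
   datalines;
yes 8  yes 4  no 1  no 2  yes 8  no 1  yes 4  no 2  no 4  yes 2
;
run;

/* Default ORDER=FORMATTED: 'no' sorts before 'yes',
   so the probability of outcome='no' is modeled */
proc probit data=trial;
   class outcome;
   model outcome = dose;
run;

/* ORDER=DATA: 'yes' appears first in the data,
   so the probability of outcome='yes' is modeled */
proc probit data=trial order=data;
   class outcome;
   model outcome = dose;
run;
```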

OUTEST= SAS-data-set

specifies a SAS data set to contain the parameter estimates and, if the COVOUT option is specified, their estimated covariances. If you omit this option, the output data set is not created. The contents of the data set are described in the section OUTEST= SAS-data-set on page 3762.
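
A brief sketch of saving the estimates and their covariances follows; the data set names assay and est and the data values are hypothetical.

```sas
/* Hypothetical data: n subjects tested and r responders at each dose */
data assay;
   input dose n r;
   datalines;
1 20 2
2 20 5
4 20 9
8 20 14
;
run;

/* OUTEST= creates the data set; COVOUT adds the covariance rows */
proc probit data=assay log10 outest=est covout;
   model r/n = dose;
run;

/* Inspect what was written */
proc print data=est;
run;
```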

XDATA= SAS-data-set

specifies an input SAS data set that contains values for all the independent variables in the MODEL statement and variables in the CLASS statement. If covariates are specified in the MODEL statement, use the XDATA= data set to supply fixed values for the effects in the MODEL statement when predicted values or fiducial limits for a single continuous variable (dose variable) are required. These specified values for the effects in the MODEL statement are also used for generating plots. See the section XDATA= SAS-data-set on page 3763 for a detailed description of the contents of the XDATA= data set.
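
The following sketch shows one plausible way to fix a covariate when requesting fiducial limits for dose. The data sets study and xvals, the covariate age, and all values are hypothetical; the exact layout expected for the XDATA= data set is described in the section referenced above, so treat the one-row layout here as an assumption.

```sas
/* Hypothetical data: dose-response counts at two ages */
data study;
   input dose age n r;
   datalines;
1 30 20 2
2 30 20 6
4 30 20 11
1 50 20 4
2 50 20 9
4 50 20 15
;
run;

/* Assumed layout: one row holding a value for every independent variable */
data xvals;
   dose = 1;
   age  = 40;
run;

/* Fiducial limits for dose are computed with age held at 40 */
proc probit data=study xdata=xvals inversecl;
   model r/n = dose age;
run;
```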

BY Statement

BY variables ;

You can specify a BY statement with PROC PROBIT to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.

If your input data set is not sorted in ascending order on each of the BY variables, use one of the following alternatives:

- Sort the data using the SORT procedure with a similar BY statement.
- Specify the BY statement option NOTSORTED or DESCENDING in the BY statement for the PROBIT procedure. The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.
- Create an index on the BY variables using the DATASETS procedure.

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.
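
The first alternative can be sketched as follows, using a hypothetical data set assay2 in which the BY variable prep is not in sorted order.

```sas
/* Hypothetical data: two preparations, rows not sorted by prep */
data assay2;
   input prep $ dose n r;
   datalines;
B 1 20 3
A 1 20 2
B 4 20 12
A 4 20 9
A 2 20 5
B 2 20 7
A 8 20 15
B 8 20 17
;
run;

/* Sort with the same BY variable that PROC PROBIT uses */
proc sort data=assay2;
   by prep;
run;

/* One analysis is produced for each value of prep */
proc probit data=assay2 log10;
   by prep;
   model r/n = dose;
run;
```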

CDFPLOT Statement

CDFPLOT < VAR= variable >< options > ;

The CDFPLOT statement plots the predicted cumulative distribution function (CDF) of the multinomial response variable as a function of a single continuous independent variable (dose variable). You can only use this statement after a multinomial model statement.

VAR= (variable)

specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.
The predicted cumulative distribution function is defined as
$$\hat{F}_j(x) = C + (1 - C)\, F(\hat{a}_j + x'\hat{b})$$
where $j = 1, \ldots, k$ are the indexes of the k levels of the multinomial response variable, F is the CDF of the distribution used to model the cumulative probabilities, $\hat{b}$ is the vector of estimated parameters, x is the covariate vector, $\hat{a}_j$ are estimated ordinal intercepts with $\hat{a}_1 = 0$, and C is the threshold parameter, either known or estimated from the model. Let $x_1$ be the covariate corresponding to the dose variable and $x_{-1}$ be the vector of the rest of the covariates. Let the corresponding estimated parameters be $\hat{b}_1$ and $\hat{b}_{-1}$. Then
$$\hat{F}_j(x) = C + (1 - C)\, F(\hat{a}_j + x_1\hat{b}_1 + x_{-1}'\hat{b}_{-1})$$
To plot $\hat{F}_j$ as a function of $x_1$, $x_{-1}$ must be specified. You can use the XDATA= option to provide the values of $x_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow the rules:
- If the effect contains a continuous variable (or variables), the overall mean of this effect is used.
- If the effect is a single classification variable, the highest level of the variable is used.

options

specify the levels of the multinomial response variable for which the cdf curves are requested, and add features to the plot. There are k - 1 curves for a k-level multinomial response variable (for the highest level, it is the constant line 1). You can specify any of them to be plotted by the LEVEL= option in the CDFPLOT statement. See the LEVEL= option for how to specify the levels.
An attached box on the right side of the plot is used to label these curves with the names of their levels. You can specify the color of this box using the CLABBOX= option.
You can use options in the CDFPLOT statement to
- superimpose specification limits
- specify the levels for which the cdf curves are requested
- specify graphical enhancements (such as color or text height)
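
A minimal sketch of a CDFPLOT request after a multinomial model follows. The data set multi, its three-level response symptoms, the count variable n, and all values are hypothetical; counts are supplied through the WEIGHT statement.

```sas
/* Hypothetical data: counts of subjects by dose and symptom level */
data multi;
   input dose symptoms $ n;
   datalines;
1 None 14
1 Mild 5
1 Severe 1
2 None 9
2 Mild 8
2 Severe 3
4 None 4
4 Mild 9
4 Severe 7
;
run;

/* ORDER=DATA keeps the levels in the order None, Mild, Severe */
proc probit data=multi order=data;
   class symptoms;
   model symptoms = dose;
   weight n;
   /* Plot the cdf curves for two of the levels, drawn in blue */
   cdfplot var=dose level=('None' 'Mild') cfit=blue;
run;
```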

Summary of Options

The following tables list all options by function. The Dictionary of Options on page 3718 describes each option in detail.

CDF Options

Table 60.1: Options for CDFPLOT
LEVEL= character-list	specifies the names of the levels for which the cdf curves are requested
NOTHRESH	suppresses the threshold line
THRESHLABPOS= value	specifies the position for the label of the threshold line

General Options

Table 60.2: Color Options
CAXIS= color	specifies color for axis
CFIT= color	specifies color for fitted curves
CFRAME= color	specifies color for frame
CGRID= color	specifies color for grid lines
CHREF= color	specifies color for HREF= lines
CLABBOX= color	specifies color for label box
CTEXT= color	specifies color for text
CVREF= color	specifies color for VREF= lines

Table 60.3: Options to Enhance Plots Produced on Graphics Devices
ANNOTATE= SAS-data-set	specifies an ANNOTATE data set
INBORDER	requests a border around plot
LFIT= linetype	specifies line style for fitted curves
LGRID= linetype	specifies line style for grid lines
NOFRAME	suppresses the frame around plotting areas
NOGRID	suppresses grid lines
NOFIT	suppresses cdf curves
NOHLABEL	suppresses horizontal labels
NOHTICK	suppresses horizontal ticks
NOVTICK	suppresses vertical ticks
TURNVLABELS	vertically strings out characters in vertical labels
WFIT= n	specifies thickness for fitted curves
WGRID= n	specifies thickness for grids
WREFL= n	specifies thickness for reference lines

Table 60.4: Axis Options
HAXIS= value1 to value2 < by value3 >	specifies tick mark values for horizontal axis
HOFFSET= value	specifies offset for horizontal axis
HLOWER= value	specifies lower limit on horizontal axis scale
HUPPER= value	specifies upper limit on horizontal axis scale
NHTICK= n	specifies number of ticks for horizontal axis
NVTICK= n	specifies number of ticks for vertical axis
VAXIS= value1 to value2 < by value3 >	specifies tick mark values for vertical axis
VAXISLABEL= label	specifies label for vertical axis
VOFFSET= value	specifies offset for vertical axis
VLOWER= value	specifies lower limit on vertical axis scale
VUPPER= value	specifies upper limit on vertical axis scale
WAXIS= n	specifies thickness for axis

Table 60.5: Graphics Catalog Options
DESCRIPTION= string	specifies description for graphics catalog member
NAME= string	specifies name for plot in graphics catalog

Table 60.6: Options for Text Enhancement
FONT= font	specifies software font for text
HEIGHT= value	specifies height of text used outside framed areas
INFONT= font	specifies software font for text inside framed areas
INHEIGHT= value	specifies height of text inside framed areas

Table 60.7: Options for Reference Lines
HREF<(INTERSECT)>=value-list	requests horizontal reference line
HREFLABELS=(label1, ..., labeln)	specifies labels for HREF= lines
HREFLABPOS= n	specifies vertical position of labels for HREF= lines
LHREF= linetype	specifies line style for HREF= lines
LVREF= linetype	specifies line style for VREF= lines
VREF<(INTERSECT)>=value-list	requests vertical reference line
VREFLABELS=(label1, ..., labeln)	specifies labels for VREF= lines
VREFLABPOS= n	specifies horizontal position of labels for VREF= lines

Dictionary of Options

The following entries provide detailed descriptions of the options in the CDFPLOT statement.

ANNOTATE= SAS-data-set

ANNO= SAS-data-set

specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the cdf plot. The ANNOTATE= data set you specify in the CDFPLOT statement is used for all plots created by the statement.

CAXIS= color

CAXES= color

specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CFIT= color

specifies the color for the fitted cdf curves. The default is the first color in the device color list.

CFRAME= color

CFR= color

specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.

CGRID= color

specifies the color for grid lines. The default is the first color in the device color list.

CLABBOX= color

specifies the color for the area enclosed by the label box for cdf curves. This area is not shaded by default.

CHREF= color

CH= color

specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.

CTEXT= color

specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.

CVREF= color

CV= color

specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.

DESCRIPTION= string

DES= string

specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

FONT= font

specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.

HAXIS= value1 to value2 < by value3 >

specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.

Examples of HAXIS= lists are:

  haxis = 0 to 10
  haxis = 2 to 10 by 2
  haxis = 0 to 200 by 10

HEIGHT= value

specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).

HLOWER= value

specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HOFFSET= value

specifies offset for horizontal axis. The default value is 1.

HUPPER= value

specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HREF < (INTERSECT) > = value-list

requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.

HREFLABELS= label1, ..., labeln

HREFLABEL= label1, ..., labeln

HREFLAB= label1, ..., labeln

specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

HREFLABPOS= n

specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	top
2	staggered from top
3	bottom
4	staggered from bottom
5	alternating from top
6	alternating from bottom

INBORDER

requests a border around cdf plots.

LEVEL= ( character-list )

ORDINAL= ( character-list )

specifies the names of the levels for which cdf curves are requested. Names should be quoted and separated by spaces. If no valid level name is provided, no cdf curve is plotted.

LFIT= linetype

specifies a line style for fitted curves. By default, fitted curves are drawn with solid lines (linetype = 1).

LGRID= linetype

specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.

LHREF= linetype

LH= linetype

specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.

LVREF= linetype

LV= linetype

specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.

NAME= string

specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT.

NOFIT

suppresses the fitted cdf curves.

NOFRAME

suppresses the frame around plotting areas.

NOGRID

suppresses grid lines.

NOHLABEL

suppresses horizontal labels.

NOHTICK

suppresses horizontal tick marks.

NOTHRESH

suppresses the threshold line.

NOVLABEL

suppresses vertical labels.

NOVTICK

suppresses vertical tick marks.

THRESHLABPOS= n

specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	left
2	right

VAXIS= value1 to value2 < by value3 >

specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.

Examples of VAXIS= lists are:
```
  vaxis = 0 to 10
  vaxis = 0 to 2 by .1
```

VAXISLABEL= string

specifies a label for the vertical axis.

VLOWER= value

specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

VREF < (INTERSECT) > = value-list

requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.

VREFLABELS= label1, ..., labeln

VREFLABEL= label1, ..., labeln

VREFLAB= label1, ..., labeln

specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

VREFLABPOS= n

specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	left
2	right

VUPPER= value

specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

WAXIS= n

specifies line thickness for axes and frame. The default value is 1.

WFIT= n

specifies line thickness for fitted curves. The default value is 1.

WGRID= n

specifies line thickness for grids. The default value is 1.

WREFL= n

specifies line thickness for reference lines. The default value is 1.

CLASS Statement

CLASS variables ;

The CLASS statement names the classification variables to be used in the analysis. Classification variables can be either character or numeric. If a single response variable is specified in the MODEL statement, it must also be specified in a CLASS statement.

Class levels are determined from the formatted values of the CLASS variables. Thus, you can use formats to group values into levels. See the discussion of the FORMAT procedure in SAS Language Reference: Dictionary.

If the CLASS statement is used, it must appear before any of the MODEL statements.

INSET Statement

INSET < keyword-list >< / options > ;

The box or table of summary information produced on plots made with the CDFPLOT, IPPPLOT, LPREDPLOT, and PREDPPLOT statements is called an inset. You can use the INSET statement to customize both the information that is printed in the inset box and the appearance of the inset box. To supply the information that is displayed in the inset box, you specify keywords corresponding to the information you want shown. For example, the following statements produce a predicted probability plot with the number of observations, the number of trials, the number of events, the name of the distribution, and the estimated optimum natural threshold in the inset.

  proc probit data=epidemic;
     model r/n = dose;
     predpplot;
     inset nobs ntrials nevents dist optc;
  run;

By default, inset entries are identified with appropriate labels. However, you can provide a customized label by specifying the keyword for that entry followed by the equal sign (=) and the label in quotes. For example, the following INSET statement produces an inset containing the number of observations and the name of the distribution, labeled "Sample Size" and "Distribution" in the inset.

  inset nobs='Sample Size' dist='Distribution';

If you specify a keyword that does not apply to the plot you are creating, then the keyword is ignored.

The options control the appearance of the box.

If you specify more than one INSET statement, only the first one is used.

Keywords Used in the INSET Statement

The following tables list keywords available in the INSET statement to display summary statistics, distribution parameters, and distribution fitting information.

Table 60.8: Summary Statistics
NOBS	number of observations
NTRIALS	number of trials
NEVENTS	number of events
C	the user-specified threshold
OPTC	the estimated natural threshold
NRESPLEV	number of levels of the response variable

Table 60.9: General Information
CONFIDENCE	confidence coefficient for all confidence intervals or for the Weibayes fit
DIST	name of the distribution

Options Used in the INSET Statement

The following tables list the options available in the INSET statement.

Table 60.10: General Appearance Options
FONT= font	specifies software font for text
HEIGHT= value	specifies height of text
HEADER= quoted string	specifies text for header or box title
NOFRAME	omits frame around box
POS= value < DATA | PERCENT >	determines the position of the inset. The value can be a compass point (N, NE, E, SE, S, SW, W, NW) or a pair of coordinates (x, y) enclosed in parentheses. The coordinates can be specified in axis percent units or axis data units.
REFPOINT= name	specifies the reference point for an inset that is positioned by a pair of coordinates with the POS= option. You use the REFPOINT= option in conjunction with the POS= coordinates. The REFPOINT= option specifies which corner of the inset frame you have specified with coordinates (x, y) and it can take the value of BR (bottom right), BL (bottom left), TR (top right), or TL (top left). The default is REFPOINT=BL. If the inset position is specified as a compass point, then the REFPOINT= option is ignored.

Table 60.11: Color and Pattern Options
CFILL= color	specifies color for filling box
CFILLH= color	specifies color for filling box header
CFRAME= color	specifies color for frame
CHEADER= color	specifies color for text in header
CTEXT= color	specifies color for text
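
As a hedged sketch of combining keywords with appearance options, the following INSET statement places a framed summary box in the upper right corner of a predicted probability plot. The data set assay, the header text, and the color choices are hypothetical, and the slash that separates keywords from options is assumed from the statement summary at the start of this section.

```sas
/* Hypothetical data: n subjects tested and r responders at each dose */
data assay;
   input dose n r;
   datalines;
1 20 2
2 20 5
4 20 9
8 20 14
;
run;

proc probit data=assay log10;
   model r/n = dose;
   predpplot var=dose;
   /* Keywords before the slash, appearance options after it */
   inset nobs ntrials nevents dist /
         pos=ne header='Model Summary' cfill=white ctext=black;
run;
```

Because POS= is given as a compass point here, a REFPOINT= value would be ignored.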

IPPPLOT Statement

IPPPLOT < VAR= variable >< options > ;

The IPPPLOT statement plots the inverse of the predicted probability against a single continuous variable (dose variable) in the MODEL statement for the binomial model. You can only use this statement after a binomial model statement. The confidence limits for the predicted values of the dose variable are the computed fiducial limits, not the inverse of the confidence limits of the predicted probabilities. Refer to the section Inverse Confidence Limits on page 3761 for more details.

VAR= ( variable )

specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.
For the binomial model, the response variable is a probability. An estimate of the dose level $\hat{x}_1$ needed for a response of p is given by
$$\hat{x}_1 = \frac{F^{-1}(p) - x_{-1}'\hat{b}_{-1}}{\hat{b}_1}$$
where F is the cumulative distribution function used to model the probability, $x_{-1}$ is the vector of the rest of the covariates, $\hat{b}_{-1}$ is the vector of the estimated parameters corresponding to $x_{-1}$, and $\hat{b}_1$ is the estimated parameter for the dose variable of interest.
To plot $\hat{x}_1$ as a function of p, $x_{-1}$ must be specified. You can use the XDATA= option to provide the values of $x_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow the rules:
- If the effect contains a continuous variable (or variables), the overall mean of this effect is used.
- If the effect is a single classification variable, the highest level of the variable is used.
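
A worked instance of the dose estimate, with numbers invented for illustration: assume the estimate takes the form shown in the first equality below, F is the standard normal CDF $\Phi$, the only other model term is an intercept (so $x_{-1}'\hat{b}_{-1}$ reduces to the intercept estimate, taken here as $-2$), and the estimated dose coefficient is 4. For a response of p = 0.5, this gives

```latex
\[
\hat{x}_1 = \frac{F^{-1}(p) - x_{-1}'\hat{b}_{-1}}{\hat{b}_1}
          = \frac{\Phi^{-1}(0.5) - (-2)}{4}
          = \frac{0 + 2}{4}
          = 0.5 .
\]
```

If the dose variable had been transformed with the LOG10 option, this value would correspond to a dose of $10^{0.5} \approx 3.16$ on the original scale.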

options

add features to the plot.
You can use options in the IPPPLOT statement to
- superimpose specification limits
- suppress or add the observed data points on the plot
- suppress or add the fiducial limits on the plot
- specify graphical enhancements (such as color or text height)
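
A minimal sketch of an IPPPLOT request after a binomial model follows; the data set assay and its values are hypothetical.

```sas
/* Hypothetical data: n subjects tested and r responders at each dose */
data assay;
   input dose n r;
   datalines;
1 20 2
2 20 5
4 20 9
8 20 14
16 20 18
;
run;

proc probit data=assay log10;
   model r/n = dose / inversecl;
   /* Plot estimated dose against probability; omit observed data points */
   ippplot var=dose nodata;
run;
```

Replacing NODATA with NOCONF would instead keep the observed points and drop the fiducial limits.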

Summary of Options

The following tables list all options by function. The Dictionary of Options on page 3728 describes each option in detail.

IPP Options

Table 60.12: Plot Layout Options for IPPPLOT
NOCONF	suppresses fiducial limits
NODATA	suppresses observed data points on the plot
NOTHRESH	suppresses the threshold line
THRESHLABPOS= value	specifies the position for the label of the threshold line

General Options

Table 60.13: Color Options
CAXIS= color	specifies color for axis
CFIT= color	specifies color for fitted curves
CFRAME= color	specifies color for frame
CGRID= color	specifies color for grid lines
CHREF= color	specifies color for HREF= lines
CTEXT= color	specifies color for text
CVREF= color	specifies color for VREF= lines

Table 60.14: Options to Enhance Plots Produced on Graphics Devices
ANNOTATE= SAS-data-set	specifies an ANNOTATE data set
INBORDER	requests a border around plot
LFIT= linetype	specifies line style for fitted curves and confidence limits
LGRID= linetype	specifies line style for grid lines
NOFRAME	suppresses the frame around plotting areas
NOGRID	suppresses grid lines
NOFIT	suppresses fitted curves
NOHLABEL	suppresses horizontal labels
NOHTICK	suppresses horizontal ticks
NOVTICK	suppresses vertical ticks
TURNVLABELS	vertically strings out characters in vertical labels
WFIT= n	specifies thickness for fitted curves
WGRID= n	specifies thickness for grids
WREFL= n	specifies thickness for reference lines

Table 60.15: Axis Options
HAXIS= value1 to value2 < by value3 >	specifies tick mark values for horizontal axis
HOFFSET= value	specifies offset for horizontal axis
HLOWER= value	specifies lower limit on horizontal axis scale
HUPPER= value	specifies upper limit on horizontal axis scale
NHTICK= n	specifies number of ticks for horizontal axis
NVTICK= n	specifies number of ticks for vertical axis
VAXIS= value1 to value2 < by value3 >	specifies tick mark values for vertical axis
VAXISLABEL= label	specifies label for vertical axis
VOFFSET= value	specifies offset for vertical axis
VLOWER= value	specifies lower limit on vertical axis scale
VUPPER= value	specifies upper limit on vertical axis scale
WAXIS= n	specifies thickness for axis

Table 60.16: Options for Reference Lines
HREF<(INTERSECT)>=value-list	requests reference lines perpendicular to the horizontal axis
HREFLABELS=(label1, ..., labeln)	specifies labels for HREF= lines
HREFLABPOS= n	specifies vertical position of labels for HREF= lines
LHREF= linetype	specifies line style for HREF= lines
LVREF= linetype	specifies line style for VREF= lines
VREF<(INTERSECT)>=value-list	requests reference lines perpendicular to the vertical axis
VREFLABELS=(label1, ..., labeln)	specifies labels for VREF= lines
VREFLABPOS= n	specifies horizontal position of labels for VREF= lines

Table 60.17: Graphics Catalog Options
DESCRIPTION= string	specifies description for graphics catalog member
NAME= string	specifies name for plot in graphics catalog

Table 60.18: Options for Text Enhancement
FONT= font	specifies software font for text
HEIGHT= value	specifies height of text used outside framed areas
INFONT= font	specifies software font for text inside framed areas
INHEIGHT= value	specifies height of text inside framed areas

Dictionary of Options

The following entries provide detailed descriptions of the options in the IPPPLOT statement.

ANNOTATE= SAS-data-set

ANNO= SAS-data-set

specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the ipp plot. The ANNOTATE= data set you specify in the IPPPLOT statement is used for all plots created by the statement.

CAXIS= color

CAXES= color

specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CFIT= color

specifies the color for the fitted ipp curves. The default is the first color in the device color list.

CFRAME= color

CFR= color

specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.

CGRID= color

specifies the color for grid lines. The default is the first color in the device color list.

CHREF= color

CH= color

specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.

CTEXT= color

specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.

CVREF= color

CV= color

specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.

DESCRIPTION= string

DES= string

specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

FONT= font

specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.

HAXIS= value1 to value2 < by value3 >

specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.

Examples of HAXIS= lists are:

  haxis = 0 to 10
  haxis = 2 to 10 by 2
  haxis = 0 to 200 by 10
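
The tick mark rule can be checked with an iterative DO loop, which steps through values in the same value1 TO value2 BY value3 form. This is only a sketch: the data set name ticks and the variable tick are hypothetical, and the loop mirrors the stated rule rather than the procedure's internal code.

```sas
/* Sketch: list the tick marks implied by HAXIS= 0 to 10 by 3.   */
/* The names TICKS and TICK are hypothetical.                    */
data ticks;
   do tick = 0 to 10 by 3;   /* value1 TO value2 BY value3 */
      output;
   end;
run;

proc print data=ticks noobs;
run;
```

For 0 to 10 by 3, the rule gives tick marks at 0, 3, 6, and 9. The last tick mark is 9 because it is the greatest value that does not exceed 10.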

HEIGHT= value

specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).

HLOWER= value

specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HOFFSET= value

specifies offset for horizontal axis. The default value is 1.

HUPPER= value

specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HREF < (INTERSECT) > = value-list

requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.

HREFLABELS= label1, ..., labeln

HREFLABEL= label1, ..., labeln

HREFLAB= label1, ..., labeln

specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

HREFLABPOS= n

specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	top
2	staggered from top
3	bottom
4	staggered from bottom
5	alternating from top
6	alternating from bottom
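
The reference line options are typically used together. The following sketch assumes a hypothetical data set study with variables respond, number, and dose, and assumes that the horizontal axis of the ipp plot is the response probability; only options described in this section are used.

```sas
/* Hypothetical data set STUDY with variables RESPOND, NUMBER, DOSE. */
proc probit data=study;
   model respond/number = dose;
   ippplot var=dose
           href(intersect)=0.75          /* line perpendicular to the horizontal axis */
           hreflabels='three quarters'   /* one quoted label per HREF= line           */
           hreflabpos=1                  /* 1 = label at the top                      */
           lhref=2;                      /* dashed line, the documented default       */
run;
```

Because (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn where the first line meets the fitted curve.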

INBORDER

requests a border around ipp plots.

LFIT= linetype

specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn as solid lines (linetype=1) and confidence limits are drawn as dashed lines (linetype=3).

LGRID= linetype

specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.

LHREF= linetype

LH= linetype

specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.

LVREF= linetype

LV= linetype

specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.

NAME= string

specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT.

NOCONF

suppresses fiducial limits from the plot.

NODATA

suppresses observed data points from the plot.

NOFIT

suppresses the fitted ipp curves.

NOFRAME

suppresses the frame around plotting areas.

NOGRID

suppresses grid lines.

NOHLABEL

suppresses horizontal labels.

NOHTICK

suppresses horizontal tick marks.

NOTHRESH

suppresses the threshold line.

NOVLABEL

suppresses vertical labels.

NOVTICK

suppresses vertical tick marks.

THRESHLABPOS= n

specifies the vertical position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	top
2	bottom

VAXIS= value1 to value2 < by value3 >

specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.

Examples of VAXIS= lists are:

  vaxis = 0 to 10
  vaxis = 0 to 2 by .1

VAXISLABEL= string

specifies a label for the vertical axis.

VLOWER= value

specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

VREF < (INTERSECT) > = value-list

requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.

VREFLABELS= label1, ..., labeln

VREFLABEL= label1, ..., labeln

VREFLAB= label1, ..., labeln

specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

VREFLABPOS= n

specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	left
2	right

VUPPER= value

specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

WAXIS= n

specifies line thickness for axes and frame. The default value is 1.

WFIT= n

specifies line thickness for fitted curves. The default value is 1.

WGRID= n

specifies line thickness for grids. The default value is 1.

WREFL= n

specifies line thickness for reference lines. The default value is 1.

LPREDPLOT Statement

LPREDPLOT < VAR= variable > < options > ;

The LPREDPLOT statement plots the linear predictor $x'b$ against a single continuous variable (dose variable) in the MODEL statement for either the binomial model or the multinomial model. The confidence limits for the predicted values are available only for the binomial model.

VAR= ( variable )

specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement against which the linear predictor is plotted. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If no such variable exists in the independent variable list of the MODEL statement, an error is reported.
Let $x_1$ be the covariate of the dose variable, $x_{-1}$ be the vector of the rest of the covariates, $b_{-1}$ be the vector of estimated parameters corresponding to $x_{-1}$, and $b_1$ be the estimated parameter for the dose variable of interest.
To plot $x'b$ as a function of $x_1$, $x_{-1}$ must be specified. You can use the XDATA= option to provide the values of $x_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values, which follow these rules:
- If the effect contains a continuous variable (or variables), the overall mean of this effect is used.
- If the effect is a single classification variable, the highest level of the variable is used.
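
The default rules above can be seen in a model that mixes a dose variable with a classification variable. This is a minimal sketch under assumed names: the data set study and the variables respond, number, dose, and group are hypothetical.

```sas
/* Hypothetical data set STUDY; GROUP is a classification variable. */
proc probit data=study;
   class group;                        /* CLASS must precede MODEL           */
   model respond/number = dose group;
   lpredplot var=dose;                 /* no XDATA=, so the defaults apply:  */
                                       /* GROUP is held at its highest level */
run;
```

Supplying the XDATA= option in the PROC PROBIT statement would replace the default setting of group with values of your choice.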

options

add features to the plot.
For the multinomial model, you can use the LEVEL= option to specify the levels for which the linear predictor lines are plotted. Each line is labeled at its midpoint with the name of its level.
You can use options in the LPREDPLOT statement to
- superimpose specification limits
- suppress or add the observed data points on the plot for the binomial model
- suppress or add the confidence limits for the binomial model
- specify the levels for which the linear predictor lines are requested for the multinomial model
- specify graphical enhancements (such as color or text height)

Summary of Options

The following tables list all options by function. The Dictionary of Options on page 3736 describes each option in detail.

LPRED Options

Table 60.19: Plot Layout Options for LPREDPLOT
LEVEL= character-list	specifies the names of the levels for which the linear predictor lines are requested (only for the multinomial model)
NOCONF	suppresses confidence limits (only for the binomial model)
NODATA	suppresses observed data points on the plot (only for the binomial model)
NOTHRESH	suppresses the threshold line
THRESHLABPOS= value	specifies the position for the label of the threshold line

General Options

Table 60.20: Color Options
CAXIS= color	specifies color for axis
CFIT= color	specifies color for fitted curves
CFRAME= color	specifies color for frame
CGRID= color	specifies color for grid lines
CHREF= color	specifies color for HREF= lines
CTEXT= color	specifies color for text
CVREF= color	specifies color for VREF= lines

Table 60.21: Options to Enhance Plots Produced on Graphics Devices
ANNOTATE= SAS-data-set	specifies an ANNOTATE data set
INBORDER	requests a border around plot
LFIT= linetype	specifies line style for fitted curves and confidence limits
LGRID= linetype	specifies line style for grid lines
NOFRAME	suppresses the frame around plotting areas
NOGRID	suppresses grid lines
NOFIT	suppresses fitted curves
NOHLABEL	suppresses horizontal labels
NOHTICK	suppresses horizontal ticks
NOVLABEL	suppresses vertical labels
NOVTICK	suppresses vertical ticks
TURNVLABELS	vertically strings out characters in vertical labels
WFIT= n	specifies thickness for fitted curves
WGRID= n	specifies thickness for grids
WREFL= n	specifies thickness for reference lines

Table 60.22: Axis Options
HAXIS= value1 to value2 < by value3 >	specifies tick mark values for horizontal axis
HOFFSET= value	specifies offset for horizontal axis
HLOWER= value	specifies lower limit on horizontal axis scale
HUPPER= value	specifies upper limit on horizontal axis scale
NHTICK= n	specifies number of ticks for horizontal axis
NVTICK= n	specifies number of ticks for vertical axis
VAXIS= value1 to value2 < by value3 >	specifies tick mark values for vertical axis
VAXISLABEL= label	specifies label for vertical axis
VOFFSET= value	specifies offset for vertical axis
VLOWER= value	specifies lower limit on vertical axis scale
VUPPER= value	specifies upper limit on vertical axis scale
WAXIS= n	specifies thickness for axis

Table 60.23: Graphics Catalog Options
DESCRIPTION= string	specifies description for graphics catalog member
NAME= string	specifies name for plot in graphics catalog

Table 60.24: Options for Text Enhancement
FONT= font	specifies software font for text
HEIGHT= value	specifies height of text used outside framed areas
INFONT= font	specifies software font for text inside framed areas
INHEIGHT= value	specifies height of text inside framed areas

Table 60.25: Options for Reference Lines
HREF<(INTERSECT)>=value-list	requests reference lines perpendicular to the horizontal axis
HREFLABELS=(label1, ..., labeln)	specifies labels for HREF= lines
HREFLABPOS= n	specifies vertical position of labels for HREF= lines
LHREF= linetype	specifies line style for HREF= lines
LVREF= linetype	specifies line style for VREF= lines
VREF<(INTERSECT)>=value-list	requests reference lines perpendicular to the vertical axis
VREFLABELS=(label1, ..., labeln)	specifies labels for VREF= lines
VREFLABPOS= n	specifies horizontal position of labels for VREF= lines

Dictionary of Options

The following entries provide detailed descriptions of the options in the LPREDPLOT statement.

ANNOTATE= SAS-data-set

ANNO= SAS-data-set

specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the lpred plot. The ANNOTATE= data set you specify in the LPREDPLOT statement is used for all plots created by the statement.

CAXIS= color

CAXES= color

specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CFIT= color

specifies the color for the fitted lpred lines. The default is the first color in the device color list.

CFRAME= color

CFR= color

specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.

CGRID= color

specifies the color for grid lines. The default is the first color in the device color list.

CHREF= color

CH= color

specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.

CTEXT= color

specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.

CVREF= color

CV= color

specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.

DESCRIPTION= string

DES= string

specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

FONT= font

specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.

HAXIS= value1 to value2 < by value3 >

specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.

Examples of HAXIS= lists are:

  haxis = 0 to 10
  haxis = 2 to 10 by 2
  haxis = 0 to 200 by 10

HEIGHT= value

specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).

HLOWER= value

specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HOFFSET= value

specifies offset for horizontal axis. The default value is 1.

HUPPER= value

specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HREF < (INTERSECT) > = value-list

requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.

HREFLABELS= label1, ..., labeln

HREFLABEL= label1, ..., labeln

HREFLAB= label1, ..., labeln

specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

HREFLABPOS= n

specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	top
2	staggered from top
3	bottom
4	staggered from bottom
5	alternating from top
6	alternating from bottom

INBORDER

requests a border around lpred plots.

LEVEL= ( character-list )

ORDINAL= ( character-list )

specifies the names of the levels for which linear predictor lines are requested. Names should be quoted and separated by spaces. If none of the names provided matches a response level, no lpred line is plotted.
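
As an illustration of the LEVEL= syntax, the following sketch requests lines for two levels of an ordinal response. The data set trial, the response Symptoms with levels 'None', 'Mild', and 'Severe', and the choice of levels are all assumptions made for the example.

```sas
/* Hypothetical data set TRIAL with an ordinal response SYMPTOMS. */
proc probit data=trial order=data;
   class Symptoms;                     /* response must be a CLASS variable */
   model Symptoms = dose;
   lpredplot var=dose
             level=('None' 'Mild');    /* quoted names separated by spaces  */
run;
```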

LFIT= linetype

specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn as solid lines (linetype=1) and confidence limits are drawn as dashed lines (linetype=3).

LGRID= linetype

specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.

LHREF= linetype

LH= linetype

specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.

LVREF= linetype

LV= linetype

specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.

NAME= string

specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT.

NOCONF

suppresses confidence limits from the plot. This option applies only to the binomial model; confidence limits are not plotted for the multinomial model.

NODATA

suppresses observed data points from the plot. This option applies only to the binomial model; data points are not plotted for the multinomial model.

NOFIT

suppresses the fitted lpred lines.

NOFRAME

suppresses the frame around plotting areas.

NOGRID

suppresses grid lines.

NOHLABEL

suppresses horizontal labels.

NOHTICK

suppresses horizontal tick marks.

NOTHRESH

suppresses the threshold line.

NOVLABEL

suppresses vertical labels.

NOVTICK

suppresses vertical tick marks.

THRESHLABPOS= n

specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	left
2	right

VAXIS= value1 to value2 < by value3 >

specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.

Examples of VAXIS= lists are:
```
  vaxis = 0 to 10
  vaxis = 0 to 2 by .1
```

VAXISLABEL= string

specifies a label for the vertical axis.

VLOWER= value

specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

VREF < (INTERSECT) > = value-list

requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.

VREFLABELS= label1, ..., labeln

VREFLABEL= label1, ..., labeln

VREFLAB= label1, ..., labeln

specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

VREFLABPOS= n

specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	left
2	right

VUPPER= number

specifies the upper limit on the vertical axis scale. The VUPPER= option specifies number as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

WAXIS= n

specifies line thickness for axes and frame. The default value is 1.

WFIT= n

specifies line thickness for fitted lines. The default value is 1.

WGRID= n

specifies line thickness for grids. The default value is 1.

WREFL= n

specifies line thickness for reference lines. The default value is 1.

MODEL Statement

< label: > MODEL response=effects < / options > ;
< label: > MODEL events/trials=effects < / options > ;

The MODEL statement names the variables used as the response and the independent variables. Additionally, you can specify the distribution used to model the response, as well as other options. Only a single MODEL statement can be used with one invocation of the PROBIT procedure. If multiple MODEL statements are present, only the last is used. Main effects and interaction terms can be specified in the MODEL statement, similar to the GLM procedure.

The optional label is used to label output from the matching MODEL statement.

The response can be a single variable with a value that is used to indicate the level of the observed response. Such a response variable must be listed in the CLASS statement. For example, the response might be a variable called Symptoms that takes on the values 'None', 'Mild', or 'Severe'. Note that, for dichotomous response variables, the probability of the lower sorted value is modeled by default (see the Details section beginning on page 3754). Because the model fit by the PROBIT procedure requires ordered response levels, you may need to use either the ORDER=DATA option in the PROC PROBIT statement or a numeric coding of the response to get the desired ordering of levels.
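
One way to apply a numeric coding is to derive a new response variable in a DATA step. The sketch below is illustrative only: the data sets trial and trial2, the variable Severity, and the covariate dose are hypothetical names, and the coding 1, 2, 3 is simply one choice that sorts in the intended order.

```sas
/* Recode the character response so that its sorted order is None < Mild < Severe. */
data trial2;
   set trial;                                     /* hypothetical input data set */
   if      Symptoms = 'None'   then Severity = 1;
   else if Symptoms = 'Mild'   then Severity = 2;
   else if Symptoms = 'Severe' then Severity = 3;
run;

proc probit data=trial2;
   class Severity;              /* a single-variable response must be a CLASS variable */
   model Severity = dose;
run;
```

Without the recoding, the character values would sort as 'Mild', 'None', 'Severe', which is not the intended severity order.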

Alternatively, the response can be specified as a pair of variable names separated by a slash (/). The value of the first variable, events, is the number of positive responses (or events). The value of the second variable, trials, is the number of trials. Both variables must be numeric and non-negative, and the ratio of the first variable value to the second variable value must be between 0 and 1, inclusive. For example, the variables might be hits, a variable containing the number of hits for a baseball player, and AtBats, a variable containing the number of times at bat. A model for hitting proportion (batting average) as a function of age could be specified as

  model hits/AtBats=age;

The effects following the equal sign are the covariates in the model. Higher-order effects, such as interactions and nested terms, are allowed in the list, similar to the GLM procedure. Variable names and combinations of variable names representing higher-order terms are allowed to appear in this list. Class variables can be used as effects, and indicator variables are generated for the class levels. If you do not specify any covariates following the equal sign, an intercept-only model is fit.

The following options are available in the MODEL statement.

AGGREGATE

AGGREGATE= (variable-list)

specifies the subpopulations on which the Pearson chi-square test statistic and the log-likelihood ratio chi-square test statistic (deviance) are calculated if the LACKFIT option is specified. See the section "Rescaling the Covariance Matrix" on page 3760 for details of Pearson's chi-square and deviance calculations.
Observations with common values in the given list of variables are regarded as coming from the same subpopulation. Variables in the list can be any variables in the input data set. Specifying the AGGREGATE option is equivalent to specifying the AGGREGATE= option with a variable list that includes all independent variables in the MODEL statement. The PROBIT procedure sorts the input data set according to the variables specified in this list. Information for the sorted data set is reported in the Response-Covariate Profile table.
The deviance and Pearson goodness-of-fit statistics are calculated if the LACKFIT option is specified in the MODEL statement. The calculated results are reported in the Goodness-of-Fit table. If the Pearson chi-square test is significant at the test level specified by the HPROB= option, the fiducial limits, if requested with the INVERSECL option in the MODEL statement, are modified (see the section "Inverse Confidence Limits" on page 3761 for details). Also, the covariance matrix is rescaled by the dispersion parameter when the SCALE= option is specified.

ALPHA= value

sets the significance level for the confidence intervals for regression parameters, fiducial limits for the predicted values, and confidence intervals for the predicted probabilities. The value must be between 0 and 1. The default value is ALPHA=0.05.

CONVERGE= value

specifies the convergence criterion. Convergence is declared when the maximum change in the parameter estimates between Newton-Raphson steps is less than the value specified. The change is a relative change if the parameter is greater than 0.01 in absolute value; otherwise, it is an absolute change.
By default, CONVERGE=1.0E-8.

CORRB

displays the estimated correlation matrix of the parameter estimates.

COVB

displays the estimated covariance matrix of the parameter estimates.

DISTRIBUTION= distribution-type

DIST= distribution-type

D= distribution-type

specifies the cumulative distribution function used to model the response probabilities. The distributions are described in the Details section beginning on page 3754. Valid values for distribution-type are

NORMAL	the normal distribution for the probit model
LOGISTIC	the logistic distribution for the logit model
EXTREMEVALUE | EXTREME | GOMPERTZ	the extreme value, or Gompertz, distribution for the gompit model

By default, DISTRIBUTION=NORMAL.
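
For example, the batting-average model shown earlier could be fit as a logit model instead of the default probit model. The data set name batting is hypothetical.

```sas
/* Hypothetical data set BATTING with variables HITS, ATBATS, AGE. */
proc probit data=batting;
   model hits/AtBats = age / dist=logistic;   /* logit model instead of the default probit */
run;
```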

HPROB= p

specifies a minimum probability level for the Pearson chi-square to indicate a good fit. The default value is 0.10. The LACKFIT option must also be specified for this option to have any effect. For Pearson goodness-of-fit chi-square values with probability greater than the HPROB= value, the fiducial limits, if requested with the INVERSECL option, are computed using a critical value of 1.96. For chi-square values with probability less than the value of the HPROB= option, the critical value is a 0.95 two-sided quantile value taken from the t distribution with degrees of freedom equal to $(k - 1) \times m - q$, where k is the number of levels for the response variable, m is the number of different sets of independent variable values, and q is the number of parameters fit in the model. If you specify the HPROB= option in both the PROC PROBIT and MODEL statements, the MODEL statement option takes precedence.
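
As a worked instance of the degrees-of-freedom formula, with k the number of response levels, m the number of distinct sets of independent variable values, and q the number of fitted parameters, consider a hypothetical binary-response study with 12 distinct dose settings and a model with an intercept and one slope. The numbers are invented for illustration.

```latex
% Hypothetical study: k = 2 response levels, m = 12 covariate settings,
% q = 2 fitted parameters (intercept and slope).
\[
  \mathrm{df} = (k - 1) \times m - q = (2 - 1) \times 12 - 2 = 10
\]
```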

INITIAL= values

sets initial values for the parameters in the model other than the intercept. The values must be given in the order in which the variables are listed in the MODEL statement. If some of the independent variables listed in the MODEL statement are classification variables, then there must be as many values given for that variable as there are classification levels minus 1. The INITIAL= option can be specified as follows.

Type of List	Specification
list separated by blanks	initial=3 4 5
list separated by commas	initial=3, 4, 5

By default, all parameters have initial estimates of zero.
Note: The INITIAL= option is overridden by the INEST= option in the PROC PROBIT statement.

INTERCEPT= value

initializes the intercept parameter to value. By default, INTERCEPT=0.
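
The INITIAL= and INTERCEPT= options can be combined to supply starting values for every parameter. In this sketch the data set study and its variables are hypothetical, group is assumed to have three levels, and the starting values themselves are arbitrary.

```sas
/* Hypothetical data set STUDY; GROUP is assumed to have three levels. */
proc probit data=study;
   class group;
   model respond/number = dose group
         / intercept=-1        /* starting value for the intercept          */
           initial=0.5 0 0;    /* one value for DOSE, two (3 - 1) for GROUP */
run;
```

The values follow the order of the effects in the MODEL statement, so the first value belongs to dose and the remaining two belong to group.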

INVERSECL

computes confidence limits for the values of the first continuous independent variable (such as dose) that yield selected response rates. If the algorithm fails to converge (this can happen when C is nonzero), missing values are reported for the confidence limits. See the section Inverse Confidence Limits on page 3761 for details.

ITPRINT

displays the iteration history, the final evaluation of the gradient, and the second derivative matrix (Hessian).

LACKFIT

performs two goodness-of-fit tests (a Pearson chi-square test and a log-likelihood ratio chi-square test) for the fitted model.
To compute the test statistics, proper grouping of the observations into subpopulations is needed. You can use the AGGREGATE or AGGREGATE= option to this end. See the entry for the AGGREGATE and AGGREGATE= options under the MODEL statement. If neither AGGREGATE nor AGGREGATE= is specified, PROC PROBIT assumes each observation is from a separate subpopulation and computes the goodness-of-fit test statistics only for the events/trials syntax.
Note: This test is not appropriate if the data are very sparse, with only a few values at each set of the independent variable values.
If the Pearson chi-square test statistic is significant, then the covariance estimates and standard error estimates are adjusted. See the section Lack of Fit Tests on page 3759 for a description of the tests. Note that the LACKFIT option can also appear in the PROC PROBIT statement. See the section PROC PROBIT Statement on page 3711 for details.

MAXITER= value

MAXIT= value

specifies the maximum number of iterations to be performed in estimating the parameters. By default, MAXITER=50.

NOINT

fits a model with no intercept parameter. If the INTERCEPT= option is also specified, the intercept is fixed at the specified value; otherwise, it is set to zero. This is most useful when the response is binary. When the response has $k$ levels, $k - 1$ intercept parameters are fit. The NOINT option sets the intercept parameter corresponding to the lowest response level equal to zero. A Lagrange multiplier, or score, test for the restricted model is computed when the NOINT option is specified.

SCALE= scale

enables you to specify the method for estimating the dispersion parameter. To correct for overdispersion or underdispersion, the covariance matrix is multiplied by the estimate of the dispersion parameter. Valid values for scale are as follows:

D | DEVIANCE	specifies that the dispersion parameter be estimated by the deviance divided by its degrees of freedom.
P | PEARSON	specifies that the dispersion parameter be estimated by the Pearson chi-square statistic divided by its degrees of freedom. This is the default.

You can use the AGGREGATE= option to define the subpopulations for calculating the Pearson chi-square statistic and the deviance.
The Goodness-of-Fit table includes the Pearson chi-square statistic, the deviance, their degrees of freedom, the ratio of each statistic divided by its degrees of freedom, and the corresponding p-value.
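A short sketch of a dispersion adjustment follows. The data set overdisp and its counts are invented for illustration; the only options used are SCALE= and AGGREGATE as described above.

```sas
/* Hypothetical grouped data with replicate rows at each dose */
data overdisp;
   input dose r n;
   datalines;
1 1 25
1 6 25
2 4 25
2 11 25
4 9 25
4 19 25
8 15 25
8 24 25
;
run;

proc probit data=overdisp;
   /* Estimate the dispersion parameter as deviance / degrees of freedom */
   /* and scale the covariance matrix by that estimate.                  */
   model r/n = dose / scale=deviance aggregate;
run;
```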

SINGULAR= value

specifies the singularity criterion for determining linear dependencies in the set of independent variables. The sum of squares and cross-products matrix of the independent variables is formed and swept. If the relative size of a pivot becomes less than the value specified, then the variable corresponding to the pivot is considered to be linearly dependent on the previous set of variables considered. By default, SINGULAR=1E-12.

OUTPUT Statement

OUTPUT < OUT=SAS-data-set >< keyword=name ... keyword=name > ;

The OUTPUT statement creates a new SAS data set containing all variables in the input data set and, optionally, the fitted probabilities, the estimate of $x'\beta$, and the estimate of its standard error. Estimates of the probabilities, $x'\beta$, and the standard errors are computed for observations with missing response values as long as the values of all the explanatory variables are nonmissing. This enables you to compute these statistics for additional settings of the explanatory variables that are of interest but for which responses are not observed.

You can specify multiple OUTPUT statements. Each OUTPUT statement creates a new data set and applies only to the preceding MODEL statement. If you want to create a permanent SAS data set, you must specify a two-level name (refer to SAS Language Reference: Concepts for more information on permanent SAS data sets).

Details on the specifications in the OUTPUT statement are as follows:

keyword=name	specifies the statistics to include in the output data set and assigns names to the new variables that contain the statistics. Specify a keyword for each desired statistic (see the following list of keywords), an equal sign, and the variable to contain the statistic.
	The keywords allowed and the statistics they represent are as follows:
	PROB | P	cumulative probability estimates
	STD	standard error estimates of $a_j + x'b$
	XBETA	estimates of $a_j + x'\beta$
OUT= SAS-data-set	names the output data set. By default, the new data set is named using the DATAn convention.

When the single variable response syntax is used, the _LEVEL_ variable is added to the output data set, and there are $k - 1$ output observations for each input observation, where $k$ is the number of response levels. There is no observation output corresponding to the highest response level. For each of the $k - 1$ observations, the PROB variable contains the fitted probability of obtaining a response level up to the level indicated by the _LEVEL_ variable, the XBETA variable contains $a_j + x'b$, where $j$ references the levels ($a_1 = 0$), and the STD variable contains the standard error estimate of the XBETA variable. See the Details section, which follows, for the formulas for the parameterizations.
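The sketch below shows one way the OUTPUT statement might be used to save fitted values, including values for a dose at which no response was observed. The data set names (assay, pred), the new variable names (phat, se, xb), and the data values are all hypothetical.

```sas
/* Hypothetical data; the last row has a missing response and is */
/* included only to obtain predictions at dose 3.                */
data assay;
   input dose r n;
   datalines;
1 2 20
2 6 20
4 11 20
8 17 20
3 . 20
;
run;

proc probit data=assay;
   model r/n = dose;
   /* P= fitted probability, STD= standard error of the linear */
   /* predictor, XBETA= the linear predictor itself            */
   output out=pred p=phat std=se xbeta=xb;
run;

proc print data=pred;
run;
```

Because pred is a one-level name, the data set is temporary; a two-level name such as mylib.pred (with mylib a libref you have assigned) would make it permanent.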

PREDPPLOT Statement

PREDPPLOT < VAR = variable >< options > ;

The PREDPPLOT statement plots the predicted probability against a single continuous variable (dose variable) in the MODEL statement for both the binomial model and the multinomial model. Confidence limits are only available for the binomial model. An attached box on the right side of the plot is used to label predicted probability curves with the names of their levels for the multinomial model. You can specify the color of this box using the CLABBOX= option.

VAR= (variable)

specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.
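The predicted probability is
$$\hat{p} = C + (1 - C)\,F(x'\hat{\beta})$$
for the binomial model and
$$\hat{p}_1 = C + (1 - C)\,F(x'\hat{\beta})$$
$$\hat{p}_j = (1 - C)\left(F(\hat{a}_j + x'\hat{\beta}) - F(\hat{a}_{j-1} + x'\hat{\beta})\right), \quad j = 2, \ldots, k - 1$$
$$\hat{p}_k = (1 - C)\left(1 - F(\hat{a}_{k-1} + x'\hat{\beta})\right)$$
for the multinomial model with $k$ response levels, where $F$ is the cumulative distribution function used to model the probability, $x'$ is the vector of the covariates, $\hat{a}_j$ are the estimated ordinal intercepts with $\hat{a}_1 = 0$, $C$ is the threshold parameter, either known or estimated from the model, and $\hat{\beta}$ is the vector of estimated parameters.
To plot $\hat{p}$ (or $\hat{p}_j$) as a function of a continuous variable $x_1$, the remaining covariates $x_{-1}$ must be specified. You can use the XDATA= option to provide the values of $x_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow the rules: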

- If the effect contains a continuous variable (or variables), the overall mean of this effect is used.
- If the effect is a single classification variable, the highest level of the variable is used.

options

enable you to plot the observed data and add features to the plot.
You can use options in the PREDPPLOT statement to
- superimpose specification limits
- suppress or add observed data points for the binomial model
- suppress or add confidence limits for the binomial model
- specify the levels for which predicted probability curves are requested for the multinomial model
- specify graphical enhancements (such as color or text height)
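A minimal sketch of a predicted probability plot for a binomial model follows. It assumes that SAS/GRAPH is available; the data set assay and its values are hypothetical.

```sas
/* Hypothetical dose-response data: r responders out of n subjects per dose */
data assay;
   input dose r n;
   datalines;
1 2 20
2 6 20
4 11 20
8 17 20
;
run;

proc probit data=assay;
   model r/n = dose;
   /* Plot predicted probability against dose, drop the confidence */
   /* limits, and draw the fitted curve in blue.                   */
   predpplot var=dose noconf cfit=blue;
run;
```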

Summary of Options

The following tables list all options by function. The Dictionary of Options on page 3749 describes each option in detail.

PREDPPLOT Options

Table 60.26: Plot Layout Options for PREDPPLOT
LEVEL= character-list	specifies the names of the levels for which the predicted probability curves are requested (only for the multinomial model)
NOCONF	suppresses confidence limits
NODATA	suppresses observed data points on the plot
NOTHRESH	suppresses the threshold line
THRESHLABPOS= value	specifies the position for the label of the threshold line

General Options

Table 60.27: Color Options
CAXIS= color	specifies color for the axes
CFIT= color	specifies color for fitted curves
CFRAME= color	specifies color for frame
CGRID= color	specifies color for grid lines
CHREF= color	specifies color for HREF= lines
CLABBOX= color	specifies color for label box
CTEXT= color	specifies color for text
CVREF= color	specifies color for VREF= lines

Table 60.28: Options to Enhance Plots Produced on Graphics Devices
ANNOTATE= SAS-data-set	specifies an ANNOTATE data set
INBORDER	requests a border around plot
LFIT= linetype	specifies line style for fitted curves and confidence limits
LGRID= linetype	specifies line style for grid lines
NOFRAME	suppresses the frame around plotting areas
NOGRID	suppresses grid lines
NOFIT	suppresses fitted curves
NOHLABEL	suppresses horizontal labels
NOHTICK	suppresses horizontal ticks
NOVTICK	suppresses vertical ticks
TURNVLABELS	vertically strings out characters in vertical labels
WFIT= n	specifies thickness for fitted curves
WGRID= n	specifies thickness for grids
WREFL= n	specifies thickness for reference lines

Table 60.29: Axis Options
HAXIS= value1 to value2 < by value3 >	specifies tick mark values for horizontal axis
HOFFSET= value	specifies offset for horizontal axis
HLOWER= value	specifies lower limit on horizontal axis scale
HUPPER= value	specifies upper limit on horizontal axis scale
NHTICK= n	specifies number of ticks for horizontal axis
NVTICK= n	specifies number of ticks for vertical axis
VAXIS= value1 to value2 < by value3 >	specifies tick mark values for vertical axis
VAXISLABEL= label	specifies label for vertical axis
VOFFSET= value	specifies offset for vertical axis
VLOWER= value	specifies lower limit on vertical axis scale
VUPPER= value	specifies upper limit on vertical axis scale
WAXIS= n	specifies thickness for axis

Table 60.30: Graphics Catalog Options
DESCRIPTION= string	specifies description for graphics catalog member
NAME= string	specifies name for plot in graphics catalog

Table 60.31: Options for Text Enhancement
FONT= font	specifies software font for text
HEIGHT= value	specifies height of text used outside framed areas
INFONT= font	specifies software font for text inside framed areas
INHEIGHT= value	specifies height of text inside framed areas

Table 60.32: Options for Reference Lines
HREF<(INTERSECT)>= value-list	requests reference lines perpendicular to the horizontal axis
HREFLABELS= (label1, ..., labeln)	specifies labels for HREF= lines
HREFLABPOS= n	specifies vertical position of labels for HREF= lines
LHREF= linetype	specifies line style for HREF= lines
LVREF= linetype	specifies line style for VREF= lines
VREF<(INTERSECT)>= value-list	requests reference lines perpendicular to the vertical axis
VREFLABELS= (label1, ..., labeln)	specifies labels for VREF= lines
VREFLABPOS= n	specifies horizontal position of labels for VREF= lines

Dictionary of Options

The following entries provide detailed descriptions of the options in the PREDPPLOT statement.

ANNOTATE= SAS-data-set

ANNO= SAS-data-set

specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the predicted probability plot. The ANNOTATE= data set you specify in the PREDPPLOT statement is used for all plots created by the statement.

CAXIS= color

CAXES= color

specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.

CFIT= color

specifies the color for the fitted predicted probability curves. The default is the first color in the device color list.

CFRAME= color

CFR= color

specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.

CGRID= color

specifies the color for grid lines. The default is the first color in the device color list.

CHREF= color

CH= color

specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.

CTEXT= color

specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.

CVREF= color

CV= color

specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.

DESCRIPTION= string

DES= string

specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.

FONT= font

specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.

HAXIS= value1 to value2 < by value3 >

specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.

Examples of HAXIS= lists are:

  haxis = 0 to 10
  haxis = 2 to 10 by 2
  haxis = 0 to 200 by 10
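In context, an HAXIS= list of this form might be placed in a PREDPPLOT statement as sketched below. The data set assay and its values are hypothetical, and SAS/GRAPH is assumed to be available.

```sas
/* Hypothetical dose-response data: r responders out of n subjects per dose */
data assay;
   input dose r n;
   datalines;
1 2 20
2 6 20
4 11 20
8 17 20
;
run;

proc probit data=assay;
   model r/n = dose;
   /* Tick marks at 0, 2, 4, 6, 8, 10 on the dose axis; because HAXIS= */
   /* is given, HLOWER= and HUPPER= would have no effect here.         */
   predpplot var=dose haxis=0 to 10 by 2;
run;
```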

HEIGHT= value

specifies the height of text used outside framed areas.

HLOWER= value

specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HOFFSET= value

specifies the offset for the horizontal axis. The default value is 1.

HUPPER= value

specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.

HREF < (INTERSECT) > = value-list

requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.
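The sketch below requests a reference line at one dose value and, through (INTERSECT), a second line at the fitted probability for that dose. The data set assay, its values, and the choice of dose 4 are illustrative assumptions, and SAS/GRAPH is assumed to be available.

```sas
/* Hypothetical dose-response data: r responders out of n subjects per dose */
data assay;
   input dose r n;
   datalines;
1 2 20
2 6 20
4 11 20
8 17 20
;
run;

proc probit data=assay;
   model r/n = dose;
   /* Reference line at dose = 4, plus an intersecting line at the */
   /* fitted probability; CHREF= and LHREF= set color and style.   */
   predpplot var=dose href(intersect)=4 chref=red lhref=2;
run;
```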

HREFLABELS= label1, ..., labeln

HREFLABEL= label1, ..., labeln

HREFLAB= label1, ..., labeln

specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

HREFLABPOS= n

specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	top
2	staggered from top
3	bottom
4	staggered from bottom
5	alternating from top
6	alternating from bottom

INBORDER

requests a border around predicted probability plots.

LEVEL= ( character-list )

ORDINAL= ( character-list )

specifies the names of the levels for which predicted probability curves are requested. Enclose each name in quotes and separate the names with spaces. If none of the names matches a response level, no fitted probability curve is plotted.
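For a multinomial model, a LEVEL= request might look like the following sketch. The data set multi, the level names, and the counts are hypothetical; the response is listed in the CLASS statement and the counts enter through a WEIGHT statement, which is one common way to code grouped ordinal data. SAS/GRAPH is assumed to be available.

```sas
/* Hypothetical grouped ordinal data: n subjects per dose and symptom level */
data multi;
   input dose symptoms $ n;
   datalines;
1 None 14
1 Mild 5
1 Severe 1
2 None 9
2 Mild 7
2 Severe 4
4 None 4
4 Mild 8
4 Severe 8
8 None 1
8 Mild 6
8 Severe 13
;
run;

proc probit data=multi order=data;
   class symptoms;
   model symptoms = dose;
   weight n;
   /* Request curves for two of the three response levels only */
   predpplot var=dose level=("None" "Severe");
run;
```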

LFIT= linetype

specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn with solid lines (linetype = 1) and confidence limits are drawn with dashed lines (linetype = 3).

LGRID= linetype

specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.

LHREF= linetype

LH= linetype

specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.

LVREF= linetype

LV= linetype

specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.

NAME= string

specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT.

NOCONF

suppresses confidence limits from the plot. This only works for the binomial model. Confidence limits are not plotted for the multinomial model.

NODATA

suppresses observed data points from the plot. This only works for the binomial model. The data points are not plotted for the multinomial model.

NOFIT

suppresses the fitted predicted probability curves.

NOFRAME

suppresses the frame around plotting areas.

NOGRID

suppresses grid lines.

NOHLABEL

suppresses horizontal labels.

NOHTICK

suppresses horizontal tick marks.

NOTHRESH

suppresses the threshold line.

NOVLABEL

suppresses vertical labels.

NOVTICK

suppresses vertical tick marks.

THRESHLABPOS= n

specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	left
2	right

VAXIS= value1 to value2 < by value3 >

specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.

Examples of VAXIS= lists are:

  vaxis = 0 to 10
  vaxis = 0 to 2 by .1

VAXISLABEL= string

specifies a label for the vertical axis.

VLOWER= value

specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

VREF < (INTERSECT) > = value-list

requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.

VREFLABELS= label1, ..., labeln

VREFLABEL= label1, ..., labeln

VREFLAB= label1, ..., labeln

specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.

VREFLABPOS= n

specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.

n	label placement
1	left
2	right

VUPPER= value

specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.

WAXIS= n

specifies line thickness for axes and frame. The default value is 1.

WFIT= n

specifies line thickness for fitted curves. The default value is 1.

WGRID= n

specifies line thickness for grids. The default value is 1.

WREFL= n

specifies line thickness for reference lines. The default value is 1.

WEIGHT Statement

WEIGHT variable ;

A WEIGHT statement can be used with PROC PROBIT to weight each observation by the value of the variable specified. The contribution of each observation to the likelihood function is multiplied by the value of the weight variable. Observations with zero, negative, or missing weights are not used in model estimation.
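As a sketch of the WEIGHT statement, the following step fits a binary model to grouped data in which each row carries a count. The data set grouped, the level names, and the counts are hypothetical; the character response is listed in the CLASS statement.

```sas
/* Hypothetical grouped binary data: count subjects per dose and outcome. */
/* The two rows at dose 16 have zero weight and are therefore not used.   */
data grouped;
   input dose outcome $ count;
   datalines;
1 dead 2
1 alive 18
2 dead 6
2 alive 14
4 dead 11
4 alive 9
8 dead 17
8 alive 3
16 dead 0
16 alive 0
;
run;

proc probit data=grouped;
   class outcome;
   model outcome = dose;
   /* Each row's likelihood contribution is multiplied by count */
   weight count;
run;
```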