The following statements are available in PROC PROBIT.
PROC PROBIT < options > ;
MODEL response=independents < / options > ;
BY variables ;
CLASS variables ;
OUTPUT < OUT= SAS-data-set >< options > ;
WEIGHT variable ;
CDFPLOT < VAR = variable >< options > ;
INSET < keyword-list >< / options > ;
IPPPLOT < VAR = variable >< options > ;
LPREDPLOT < VAR = variable >< options > ;
PREDPPLOT < VAR = variable >< options > ;
A MODEL statement is required. Only a single MODEL statement can be used with one invocation of the PROBIT procedure. If multiple MODEL statements are present, only the last one is used. Main effects and higher-order terms can be specified in the MODEL statement, similar to the GLM procedure. If a CLASS statement is used, it must precede the MODEL statement.
The CDFPLOT, INSET, IPPPLOT, LPREDPLOT, and PREDPPLOT statements are used to produce graphical output. You can use any appropriate combination of the graphical statements after the MODEL statement.
PROC PROBIT < options > ;
The PROC PROBIT statement starts the procedure. You can specify the following options in the PROC PROBIT statement.
COVOUT
writes the parameter estimate covariance matrix to the OUTEST= data set.
C= rate
OPTC
controls how the natural response is handled. Specify the OPTC option to request that the natural response rate C be estimated. Specify the C= rate option to set the natural response rate or to provide the initial estimate of the natural response rate. The natural response rate value must be a number between 0 and 1.
If you specify neither the OPTC nor the C= option, a natural response rate of zero is assumed.
If you specify both the OPTC and the C= options, the C= value should be a reasonable initial estimate of the natural response rate. For example, you could use the ratio of the number of responses to the number of subjects in a control group.
If you specify the C= option but not the OPTC option, the natural response rate is set to the specified value and not estimated.
If you specify the OPTC option but not the C= option, PROC PROBIT's action depends on the response variable, as follows:
If you specify either the LN or LOG10 option and some subjects have the first independent variable (dose) values less than or equal to zero, these subjects are treated as a control group. The initial estimate of C is then the ratio of the number of responses to the number of subjects in this group.
If you do not specify the LN or LOG10 option or if there is no control group, then one of the following occurs:
If all responses are greater than zero, the initial estimate of the natural response rate is the minimal response rate (the ratio of the number of responses to the number of subjects in a dose group) across all dose levels.
If one or more of the responses is zero (making the response rate zero in that dose group), the initial estimate of the natural rate is the reciprocal of twice the largest number of subjects in any dose group in the experiment.
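As a minimal sketch of how the C= and OPTC options combine, the following step supplies a starting value for the natural response rate and requests that it be estimated. The data set name `assay` and the variables `r`, `n`, and `dose` are hypothetical, introduced here only for illustration.

```sas
/* Hypothetical data set ASSAY: r responders out of n subjects per dose */
proc probit data=assay log10 optc c=0.05;
   /* LOG10: analyze log10(dose); with OPTC, doses <= 0 act as a control group */
   /* OPTC together with C=: estimate C, starting from the value 0.05         */
   model r/n = dose;
run;
```

Omitting OPTC while keeping C=0.05 would instead hold the natural response rate fixed at 0.05, and omitting both would assume a rate of zero.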
DATA= SAS-data-set
specifies the SAS data set to be used by PROC PROBIT. By default, the procedure uses the most recently created SAS data set.
GOUT= graphics-catalog
specifies a graphics catalog in which to save graphics output.
HPROB= p
specifies a minimum probability level for the Pearson chi-square to indicate a good fit. The default value is 0.10. The LACKFIT option must also be specified for this option to have any effect. For Pearson goodness-of-fit chi-square values with probability greater than the HPROB= value, the fiducial limits, if requested with the INVERSECL option, are computed using a critical value of 1.96. For chi-square values with probability less than the value of the HPROB= option, the critical value is a 0.95 two-sided quantile value taken from the t distribution with degrees of freedom equal to (k − 1) × m − q, where k is the number of levels for the response variable, m is the number of different sets of independent variable values, and q is the number of parameters fit in the model. Note that the HPROB= option can also appear in the MODEL statement.
INEST= SAS-data-set
specifies an input SAS data set that contains initial estimates for all the parameters in the model. See the section "INEST= SAS-data-set" on page 3757 for a detailed description of the contents of the INEST= data set.
INVERSECL
computes confidence limits for the values of the first continuous independent variable (such as dose) that yield selected response rates. If the algorithm fails to converge (this can happen when C is nonzero), missing values are reported for the confidence limits. See the section "Inverse Confidence Limits" on page 3761 for details. Note that the INVERSECL option can also appear in the MODEL statement.
LACKFIT
performs two goodness-of-fit tests (a Pearson chi-square test and a log-likelihood ratio chi-square test) for the fitted model.
To compute the test statistics, the observations must be properly grouped into subpopulations. You can use the AGGREGATE or AGGREGATE= option for this purpose. See the entry for the AGGREGATE and AGGREGATE= options under the MODEL statement. If neither AGGREGATE nor AGGREGATE= is specified, PROC PROBIT assumes each observation is from a separate subpopulation and computes the goodness-of-fit test statistics only for the events/trials syntax.
Note: This test is not appropriate if the data are very sparse, with only a few values at each set of the independent variable values.
If the Pearson chi-square test statistic is significant, then the covariance estimates and standard error estimates are adjusted. See the section "Lack of Fit Tests" on page 3759 for a description of the tests. Note that the LACKFIT option can also appear in the MODEL statement.
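The following sketch shows one way LACKFIT, INVERSECL, and HPROB= might be requested together. The data set `assay` and its variables are hypothetical; the events/trials syntax is used so that each observation is treated as its own subpopulation, as described above.

```sas
/* Hypothetical data set ASSAY: r responders out of n subjects per dose */
proc probit data=assay lackfit inversecl hprob=0.05;
   /* LACKFIT: Pearson and likelihood ratio goodness-of-fit tests      */
   /* INVERSECL: fiducial limits for the dose at selected response rates */
   /* HPROB=0.05: threshold that selects 1.96 or a t quantile            */
   model r/n = dose;
run;
```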
LOG
LN
analyzes the data by replacing the first continuous independent variable by its natural logarithm. This variable is usually the level of some treatment such as dosage. In addition to the usual output given by the INVERSECL option, the estimated dose values and 95% fiducial limits for dose are also displayed. If you specify the OPTC option, any observations with a dose value less than or equal to zero are used in the estimation as a control group. If you do not specify the OPTC option with the LOG or LN option, then any observations with the first continuous independent variable values less than or equal to zero are ignored.
LOG10
specifies an analysis like that of the LN or LOG option except that the common logarithm (log to the base 10) of the dose value is used rather than the natural logarithm.
NAMELEN= n
specifies the length of effect names in tables and output data sets to be n characters, where n is a value between 20 and 200. The default length is 20 characters.
NOPRINT
suppresses the display of all output. Note that this option temporarily disables the Output Delivery System (ODS). For more information, see Chapter 14, "Using the Output Delivery System."
OPTC
controls how the natural response is handled. See the description of the C= option on page 3711 for details.
ORDER=DATA | FORMATTED | FREQ | INTERNAL
specifies the sorting order for the levels of the classification variables specified in the CLASS statement, including the levels of the response variable. Response level ordering is important since PROC PROBIT always models the probability of response levels at the beginning of the ordering. See the section "Response Level Ordering" on page 3754 for further details. This ordering also determines which parameters in the model correspond to each level in the data. The following table shows how PROC PROBIT interprets values of the ORDER= option.
Value of ORDER= | Levels Sorted By |
---|---|
DATA | order of appearance in the input data set |
FORMATTED | formatted value |
FREQ | descending frequency count; levels with the most observations come first in the order |
INTERNAL | unformatted value |
By default, ORDER=FORMATTED. For the values FORMATTED and INTERNAL, the sort order is machine dependent. For more information on sorting order, see the chapter on the SORT procedure in the SAS Procedures Guide.
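A brief sketch of the ORDER= option with a single response variable follows. The data set `trial` and the variables `outcome` and `dose` are hypothetical; the example assumes the response levels appear in the input data in the order you want modeled.

```sas
/* Hypothetical data set TRIAL with one response value per observation */
proc probit data=trial order=data;
   class outcome;            /* a single response variable must be a CLASS variable */
   model outcome = dose;     /* levels are ordered by first appearance in TRIAL     */
run;
```

With ORDER=DATA, the response level that appears first in the data set is placed at the beginning of the ordering.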
OUTEST= SAS-data-set
specifies a SAS data set to contain the parameter estimates and, if the COVOUT option is specified, their estimated covariances. If you omit this option, the output data set is not created. The contents of the data set are described in the section "OUTEST= SAS-data-set" on page 3762.
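As an illustration, the following sketch saves the estimates and their covariance matrix and then lists them. The data set names `assay` and `est` are hypothetical.

```sas
/* Save parameter estimates (OUTEST=) and their covariances (COVOUT) */
proc probit data=assay outest=est covout;
   model r/n = dose;
run;

/* Inspect the saved estimates */
proc print data=est;
run;
```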
XDATA= SAS-data-set
specifies an input SAS data set that contains values for all the independent variables in the MODEL statement and variables in the CLASS statement. If covariates are specified in the MODEL statement, you can use the XDATA= data set to supply fixed values for the effects in the MODEL statement when predicted values or fiducial limits for a single continuous variable (dose variable) are required. These specified values for the effects in the MODEL statement are also used for generating plots. See the section "XDATA= SAS-data-set" on page 3763 for a detailed description of the contents of the XDATA= data set.
BY variables ;
You can specify a BY statement with PROC PROBIT to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.
If your input data set is not sorted in ascending order on each of the BY variables, use one of the following alternatives:
Sort the data using the SORT procedure with a similar BY statement.
Specify the BY statement option NOTSORTED or DESCENDING in the BY statement for the PROBIT procedure. The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.
Create an index on the BY variables using the DATASETS procedure.
For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.
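The first alternative above can be sketched as follows. The data set `assay` and the BY variable `lab` are hypothetical names used only for illustration.

```sas
/* Sort a copy of the data by the BY variable */
proc sort data=assay out=assay_sorted;
   by lab;
run;

/* One separate probit analysis per value of LAB */
proc probit data=assay_sorted;
   by lab;
   model r/n = dose;
run;
```

If the observations are already grouped by `lab` but the groups are not in ascending order, `by lab notsorted;` in the PROC PROBIT step avoids the sort.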
CDFPLOT < var = variable >< options > ;
The CDFPLOT statement plots the predicted cumulative distribution function (CDF) of the multinomial response variable as a function of a single continuous independent variable (dose variable). You can only use this statement after a multinomial model statement.
VAR= (variable)
specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.
The predicted cumulative distribution function is defined as
$$\hat{F}_j(\mathbf{x}) = C + (1 - C)\,F(\hat{a}_j + \mathbf{x}'\hat{\mathbf{b}})$$
where $j = 1, \ldots, k$ are the indexes of the k levels of the multinomial response variable, $F$ is the CDF of the distribution used to model the cumulative probabilities, $\hat{\mathbf{b}}$ is the vector of estimated parameters, $\mathbf{x}$ is the covariate vector, $\hat{a}_j$ are estimated ordinal intercepts with $\hat{a}_1 = 0$, and $C$ is the threshold parameter, either known or estimated from the model. Let $x_1$ be the covariate corresponding to the dose variable and $\mathbf{x}_{-1}$ be the vector of the rest of the covariates. Let the corresponding estimated parameters be $\hat{b}_1$ and $\hat{\mathbf{b}}_{-1}$. Then
$$\hat{F}_j(\mathbf{x}) = C + (1 - C)\,F(\hat{a}_j + x_1\hat{b}_1 + \mathbf{x}_{-1}'\hat{\mathbf{b}}_{-1})$$
To plot $\hat{F}_j$ as a function of $x_1$, $\mathbf{x}_{-1}$ must be specified. You can use the XDATA= option to provide the values of $\mathbf{x}_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow the rules:
If the effect contains a continuous variable (or variables), the overall mean of this effect is used.
If the effect is a single classification variable, the highest level of the variable is used.
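To make the role of the fixed covariates concrete, the following DATA step sketch evaluates two predicted curves over a grid of dose values. It assumes a normal distribution for F and the natural-response form C + (1 − C)F(intercept + dose × slope + contribution of the remaining covariates). Every numeric value is a hypothetical estimate chosen for illustration, not output from PROC PROBIT.

```sas
/* Hypothetical estimates, chosen only for illustration */
data cdf_curve;
   c     = 0.05;     /* threshold (natural response rate) C              */
   a1    = 0;        /* first ordinal intercept, fixed at 0              */
   a2    = 1.2;      /* second ordinal intercept                         */
   b1    = 0.8;      /* slope for the dose variable x1                   */
   other = -2.0;     /* fixed contribution of the remaining covariates   */
   do x1 = 0 to 6 by 0.5;
      f1 = c + (1 - c) * probnorm(a1 + x1*b1 + other);  /* curve for level 1 */
      f2 = c + (1 - c) * probnorm(a2 + x1*b1 + other);  /* curve for level 2 */
      output;
   end;
run;

proc print data=cdf_curve;
   var x1 f1 f2;
run;
```

Changing `other` shifts both curves along the dose axis, which is why the remaining covariates must be fixed before the curves can be drawn.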
options
specify the levels of the multinomial response variable for which the cdf curves are requested, and add features to the plot. There are k − 1 curves for a k-level multinomial response variable (for the highest level, it is the constant line 1). You can specify any of them to be plotted by the LEVEL= option in the CDFPLOT statement. See the LEVEL= option for how to specify the levels.
An attached box on the right side of the plot is used to label these curves with the names of their levels. You can specify the color of this box using the CLABBOX= option.
You can use options in the CDFPLOT statement to
superimpose specification limits
specify the levels for which the cdf curves are requested
specify graphical enhancements (such as color or text height)
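A minimal sketch of a CDFPLOT request follows. The data set `multi`, the response `symptoms` with levels 'None' and 'Mild', and the variable `dose` are hypothetical; the options shown are all described in this section.

```sas
/* Hypothetical multinomial data: SYMPTOMS has ordered levels such as None, Mild, Severe */
proc probit data=multi order=data;
   class symptoms;
   model symptoms = dose;
   cdfplot var=dose
           level=('None' 'Mild')   /* curves to draw, quoted and separated by spaces */
           haxis=0 to 10 by 2      /* horizontal tick marks at 0, 2, ..., 10         */
           cfit=blue inborder;     /* curve color and a border around the plot       */
run;
```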
The following tables list all options by function. The "Dictionary of Options" on page 3718 describes each option in detail.
Option | Description |
---|---|
LEVEL= character-list | specifies the names of the levels for which the cdf curves are requested |
NOTHRESH | suppresses the threshold line |
THRESHLABPOS= value | specifies the position for the label of the threshold line |
CAXIS= color | specifies color for axis |
CFIT= color | specifies color for fitted curves |
CFRAME= color | specifies color for frame |
CGRID= color | specifies color for grid lines |
CHREF= color | specifies color for HREF= lines |
CLABBOX= color | specifies color for label box |
CTEXT= color | specifies color for text |
CVREF= color | specifies color for VREF= lines |
ANNOTATE= SAS-data-set | specifies an ANNOTATE data set |
INBORDER | requests a border around plot |
LFIT= linetype | specifies line style for fitted curves |
LGRID= linetype | specifies line style for grid lines |
NOFRAME | suppresses the frame around plotting areas |
NOGRID | suppresses grid lines |
NOFIT | suppresses cdf curves |
NOHLABEL | suppresses horizontal labels |
NOHTICK | suppresses horizontal ticks |
NOVTICK | suppresses vertical ticks |
TURNVLABELS | vertically strings out characters in vertical labels |
WFIT= n | specifies thickness for fitted curves |
WGRID= n | specifies thickness for grids |
WREFL= n | specifies thickness for reference lines |
HAXIS= value1 to value2 < by value3 > | specifies tick mark values for horizontal axis |
HOFFSET= value | specifies offset for horizontal axis |
HLOWER= value | specifies lower limit on horizontal axis scale |
HUPPER= value | specifies upper limit on horizontal axis scale |
NHTICK= n | specifies number of ticks for horizontal axis |
NVTICK= n | specifies number of ticks for vertical axis |
VAXIS= value1 to value2 < by value3 > | specifies tick mark values for vertical axis |
VAXISLABEL= label | specifies label for vertical axis |
VOFFSET= value | specifies offset for vertical axis |
VLOWER= value | specifies lower limit on vertical axis scale |
VUPPER= value | specifies upper limit on vertical axis scale |
WAXIS= n | specifies thickness for axis |
DESCRIPTION= string | specifies description for graphics catalog member |
NAME= string | specifies name for plot in graphics catalog |
FONT= font | specifies software font for text |
HEIGHT= value | specifies height of text used outside framed areas |
INFONT= font | specifies software font for text inside framed areas |
INHEIGHT= value | specifies height of text inside framed areas |
HREF<(INTERSECT)>= value-list | requests horizontal reference line |
HREFLABELS= (label1, ..., labeln) | specifies labels for HREF= lines |
HREFLABPOS= n | specifies vertical position of labels for HREF= lines |
LHREF= linetype | specifies line style for HREF= lines |
LVREF= linetype | specifies line style for VREF= lines |
VREF<(INTERSECT)>= value-list | requests vertical reference line |
VREFLABELS= (label1, ..., labeln) | specifies labels for VREF= lines |
VREFLABPOS= n | specifies horizontal position of labels for VREF= lines |
The following entries provide detailed descriptions of the options in the CDFPLOT statement.
ANNOTATE= SAS-data-set
ANNO= SAS-data-set
specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the cdf plot. The ANNOTATE= data set you specify in the CDFPLOT statement is used for all plots created by the statement.
CAXIS= color
CAXES= color
specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.
CFIT= color
specifies the color for the fitted cdf curves. The default is the first color in the device color list.
CFRAME= color
CFR= color
specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.
CGRID= color
specifies the color for grid lines. The default is the first color in the device color list.
CLABBOX= color
specifies the color for the area enclosed by the label box for cdf curves. This area is not shaded by default.
CHREF= color
CH= color
specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.
CTEXT= color
specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.
CVREF= color
CV= color
specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.
DESCRIPTION= string
DES= string
specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.
FONT= font
specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.
HAXIS= value1 to value2 < by value3 >
specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.
Examples of HAXIS= lists are:
   haxis = 0 to 10
   haxis = 2 to 10 by 2
   haxis = 0 to 200 by 10
HEIGHT= value
specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).
HLOWER= value
specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.
HOFFSET= value
specifies offset for horizontal axis. The default value is 1.
HUPPER= value
specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.
HREF < (INTERSECT) > = value-list
requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.
HREFLABELS= label1, ..., labeln
HREFLABEL= label1, ..., labeln
HREFLAB= label1, ..., labeln
specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.
HREFLABPOS= n
specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | top |
2 | staggered from top |
3 | bottom |
4 | staggered from bottom |
5 | alternating from top |
6 | alternating from bottom |
INBORDER
requests a border around cdf plots.
LEVEL= ( character-list )
ORDINAL= ( character-list )
specifies the names of the levels for which cdf curves are requested. Names should be quoted and separated by spaces. If none of the names matches a level of the response variable, no cdf curve is plotted.
LFIT= linetype
specifies a line style for fitted curves. By default, fitted curves are drawn with solid lines (linetype = 1).
LGRID= linetype
specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.
LHREF= linetype
LH= linetype
specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.
LVREF= linetype
LV= linetype
specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.
NAME= string
specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is 'PROBIT'.
NOFIT
suppresses the fitted cdf curves.
NOFRAME
suppresses the frame around plotting areas.
NOGRID
suppresses grid lines.
NOHLABEL
suppresses horizontal labels.
NOHTICK
suppresses horizontal tick marks.
NOTHRESH
suppresses the threshold line.
NOVLABEL
suppresses vertical labels.
NOVTICK
suppresses vertical tick marks.
THRESHLABPOS= n
specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | left |
2 | right |
VAXIS= value1 to value2 < by value3 >
specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.
Examples of VAXIS= lists are:
   vaxis = 0 to 10
   vaxis = 0 to 2 by .1
VAXISLABEL= string
specifies a label for the vertical axis.
VLOWER= value
specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.
VREF < (INTERSECT) > = value-list
requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.
VREFLABELS= label1, ..., labeln
VREFLABEL= label1, ..., labeln
VREFLAB= label1, ..., labeln
specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.
VREFLABPOS= n
specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | left |
2 | right |
VUPPER= value
specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.
WAXIS= n
specifies line thickness for axes and frame. The default value is 1.
WFIT= n
specifies line thickness for fitted curves. The default value is 1.
WGRID= n
specifies line thickness for grids. The default value is 1.
WREFL= n
specifies line thickness for reference lines. The default value is 1.
CLASS variables ;
The CLASS statement names the classification variables to be used in the analysis. Classification variables can be either character or numeric. If a single response variable is specified in the MODEL statement, it must also be specified in a CLASS statement.
Class levels are determined from the formatted values of the CLASS variables. Thus, you can use formats to group values into levels. See the discussion of the FORMAT procedure in SAS Language Reference: Dictionary.
If the CLASS statement is used, it must appear before any of the MODEL statements.
INSET < keyword-list >< / options > ;
The box or table of summary information produced on plots made with the CDFPLOT, IPPPLOT, LPREDPLOT, and PREDPPLOT statements is called an inset. You can use the INSET statement to customize both the information that is printed in the inset box and the appearance of the inset box. To supply the information that is displayed in the inset box, you specify keywords corresponding to the information you want shown. For example, the following statements produce a predicted probability plot with the number of observations, the number of trials, the number of events, the name of the distribution, and the estimated optimum natural threshold in the inset.
   proc probit data=epidemic;
      model r/n = dose;
      predpplot;
      inset nobs ntrials nevents dist optc;
   run;
By default, inset entries are identified with appropriate labels. However, you can provide a customized label by specifying the keyword for that entry followed by the equal sign (=) and the label in quotes. For example, the following INSET statement produces an inset containing the number of observations and the name of the distribution, labeled "Sample Size" and "Distribution" in the inset.
   inset nobs='Sample Size' dist='Distribution';
If you specify a keyword that does not apply to the plot you are creating, then the keyword is ignored.
The options control the appearance of the box.
If you specify more than one INSET statement, only the first one is used.
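The following sketch combines custom labels with appearance options, following the `/ options` form shown in the statement summary at the start of this section. The header text and fill color are illustrative choices; the data set `epidemic` is the one used in the example above.

```sas
proc probit data=epidemic;
   model r/n = dose;
   predpplot;
   /* keywords (with custom labels) come first; appearance options follow the slash */
   inset nobs='Sample Size' dist='Distribution'
         / pos=nw header='Fit Summary' cfill=white;
run;
```

Because the inset is positioned with a compass point (NW), a REFPOINT= setting would be ignored here.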
The following tables list keywords available in the INSET statement to display summary statistics, distribution parameters, and distribution fitting information.
Keyword | Description |
---|---|
NOBS | number of observations |
NTRIALS | number of trials |
NEVENTS | number of events |
C | the user-specified threshold |
OPTC | the estimated natural threshold |
NRESPLEV | number of levels of the response variable |
CONFIDENCE | confidence coefficient for all confidence intervals or for the Weibayes fit |
DIST | name of the distribution |
The following tables list the options available in the INSET statement.
Option | Description |
---|---|
FONT= font | specifies software font for text |
HEIGHT= value | specifies height of text |
HEADER= quoted string | specifies text for header or box title |
NOFRAME | omits frame around box |
POS= value <DATA \| PERCENT> | determines the position of the inset. The value can be a compass point (N, NE, E, SE, S, SW, W, NW) or a pair of coordinates (x, y) enclosed in parentheses. The coordinates can be specified in axis percent units or axis data units. |
REFPOINT= name | specifies the reference point for an inset that is positioned by a pair of coordinates with the POS= option. You use the REFPOINT= option in conjunction with the POS= coordinates. The REFPOINT= option specifies which corner of the inset frame you have specified with coordinates (x, y) and it can take the value of BR (bottom right), BL (bottom left), TR (top right), or TL (top left). The default is REFPOINT=BL. If the inset position is specified as a compass point, then the REFPOINT= option is ignored. |
CFILL= color | specifies color for filling box |
CFILLH= color | specifies color for filling box header |
CFRAME= color | specifies color for frame |
CHEADER= color | specifies color for text in header |
CTEXT= color | specifies color for text |
IPPPLOT < var = variable >< options > ;
The IPPPLOT statement plots the inverse of the predicted probability against a single continuous variable (dose variable) in the MODEL statement for the binomial model. You can only use this statement after a binomial model statement. The confidence limits for the predicted values of the dose variable are the computed fiducial limits, not the inverse of the confidence limits of the predicted probabilities. Refer to the section "Inverse Confidence Limits" on page 3761 for more details.
VAR= ( variable )
specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.
For the binomial model, the response variable is a probability. An estimate of the dose level $\hat{x}_1$ needed for a response of $p$ is given by
$$\hat{x}_1 = \frac{F^{-1}(p) - \mathbf{x}_{-1}'\hat{\mathbf{b}}_{-1}}{\hat{b}_1}$$
where $F$ is the cumulative distribution function used to model the probability, $\mathbf{x}_{-1}$ is the vector of the rest of the covariates, $\hat{\mathbf{b}}_{-1}$ is the vector of the estimated parameters corresponding to $\mathbf{x}_{-1}$, and $\hat{b}_1$ is the estimated parameter for the dose variable of interest.
To plot $\hat{x}_1$ as a function of $p$, $\mathbf{x}_{-1}$ must be specified. You can use the XDATA= option to provide the values of $\mathbf{x}_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow the rules:
If the effect contains a continuous variable (or variables), the overall mean of this effect is used.
If the effect is a single classification variable, the highest level of the variable is used.
options
add features to the plot.
You can use options in the IPPPLOT statement to
superimpose specification limits
suppress or add the observed data points on the plot
suppress or add the fiducial limits on the plot
specify graphical enhancements (such as color or text height)
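A minimal sketch of an IPPPLOT request for a binomial model follows. The data set `assay` and its variables `r`, `n`, and `dose` are hypothetical; the options shown are among those listed for this statement.

```sas
/* Hypothetical data set ASSAY: r responders out of n subjects per dose */
proc probit data=assay log10;
   model r/n = dose;
   ippplot var=dose
           nodata               /* leave the observed data points off the plot */
           cfit=blue inborder;  /* curve color and a border around the plot    */
run;
```

Adding NOCONF would also suppress the fiducial limits, leaving only the fitted curve.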
The following tables list all options by function. The "Dictionary of Options" on page 3728 describes each option in detail.
Option | Description |
---|---|
NOCONF | suppresses fiducial limits |
NODATA | suppresses observed data points on the plot |
NOTHRESH | suppresses the threshold line |
THRESHLABPOS= value | specifies the position for the label of the threshold line |
CAXIS= color | specifies color for axis |
CFIT= color | specifies color for fitted curves |
CFRAME= color | specifies color for frame |
CGRID= color | specifies color for grid lines |
CHREF= color | specifies color for HREF= lines |
CTEXT= color | specifies color for text |
CVREF= color | specifies color for VREF= lines |
ANNOTATE= SAS-data-set | specifies an ANNOTATE data set |
INBORDER | requests a border around plot |
LFIT= linetype | specifies line style for fitted curves and confidence limits |
LGRID= linetype | specifies line style for grid lines |
NOFRAME | suppresses the frame around plotting areas |
NOGRID | suppresses grid lines |
NOFIT | suppresses fitted curves |
NOHLABEL | suppresses horizontal labels |
NOHTICK | suppresses horizontal ticks |
NOVTICK | suppresses vertical ticks |
TURNVLABELS | vertically strings out characters in vertical labels |
WFIT= n | specifies thickness for fitted curves |
WGRID= n | specifies thickness for grids |
WREFL= n | specifies thickness for reference lines |
HAXIS= value1 to value2 < by value3 > | specifies tick mark values for horizontal axis |
HOFFSET= value | specifies offset for horizontal axis |
HLOWER= value | specifies lower limit on horizontal axis scale |
HUPPER= value | specifies upper limit on horizontal axis scale |
NHTICK= n | specifies number of ticks for horizontal axis |
NVTICK= n | specifies number of ticks for vertical axis |
VAXIS= value1 to value2 < by value3 > | specifies tick mark values for vertical axis |
VAXISLABEL= label | specifies label for vertical axis |
VOFFSET= value | specifies offset for vertical axis |
VLOWER= value | specifies lower limit on vertical axis scale |
VUPPER= value | specifies upper limit on vertical axis scale |
WAXIS= n | specifies thickness for axis |
HREF<(INTERSECT)>=value-list | requests horizontal reference line |
HREFLABELS= (label1, ..., labeln) | specifies labels for HREF= lines |
HREFLABPOS= n | specifies vertical position of labels for HREF= lines |
LHREF= linetype | specifies line style for HREF= lines |
LVREF= linetype | specifies line style for VREF= lines |
VREF<(INTERSECT)>=value-list | requests vertical reference line |
VREFLABELS= (label1, ..., labeln) | specifies labels for VREF= lines |
VREFLABPOS= n | specifies horizontal position of labels for VREF= lines |
DESCRIPTION= string | specifies description for graphics catalog member |
NAME= string | specifies name for plot in graphics catalog |
FONT= font | specifies software font for text |
HEIGHT= value | specifies height of text used outside framed areas |
INFONT= font | specifies software font for text inside framed areas |
INHEIGHT= value | specifies height of text inside framed areas |
The following entries provide detailed descriptions of the options in the IPPPLOT statement.
ANNOTATE= SAS-data-set
ANNO= SAS-data-set
specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the ipp plot. The ANNOTATE= data set you specify in the IPPPLOT statement is used for all plots created by the statement.
CAXIS= color
CAXES= color
specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.
CFIT= color
specifies the color for the fitted ipp curves. The default is the first color in the device color list.
CFRAME= color
CFR= color
specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.
CGRID= color
specifies the color for grid lines. The default is the first color in the device color list.
CHREF= color
CH= color
specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.
CTEXT= color
specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.
CVREF= color
CV= color
specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.
DESCRIPTION= string
DES= string
specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.
FONT= font
specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.
HAXIS= value1 to value2 < by value3 >
specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.
Examples of HAXIS= lists are:
haxis = 0 to 10
haxis = 2 to 10 by 2
haxis = 0 to 200 by 10
HEIGHT= value
specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).
HLOWER= value
specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.
HOFFSET= value
specifies offset for horizontal axis. The default value is 1.
HUPPER= value
specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.
HREF < (INTERSECT) > = value-list
requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.
HREFLABELS= label1, ..., labeln
HREFLABEL= label1, ..., labeln
HREFLAB= label1, ..., labeln
specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.
HREFLABPOS= n
specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | top |
2 | staggered from top |
3 | bottom |
4 | staggered from bottom |
5 | alternating from top |
6 | alternating from bottom |
INBORDER
requests a border around ipp plots.
LFIT= linetype
specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn by connecting solid lines ( linetype = 1 ) and confidence limits are drawn by connecting dashed lines ( linetype = 3 ).
LGRID= linetype
specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.
LHREF= linetype
LH= linetype
specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.
LVREF= linetype
LV= linetype
specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.
NAME= string
specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT.
NOCONF
suppresses fiducial limits from the plot.
NODATA
suppresses observed data points from the plot.
NOFIT
suppresses the fitted ipp curves.
NOFRAME
suppresses the frame around plotting areas.
NOGRID
suppresses grid lines.
NOHLABEL
suppresses horizontal labels.
NOHTICK
suppresses horizontal tick marks.
NOTHRESH
suppresses the threshold line.
NOVLABEL
suppresses vertical labels.
NOVTICK
suppresses vertical tick marks.
THRESHLABPOS= n
specifies the vertical position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | top |
2 | bottom |
VAXIS= value1 to value2 < by value3 >
specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specifying tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.
Examples of VAXIS= lists are:
vaxis = 0 to 10
vaxis = 0 to 2 by .1
VAXISLABEL= string
specifies a label for the vertical axis.
VLOWER= value
specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.
VREF < (INTERSECT) > = value-list
requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.
VREFLABELS= label1, ..., labeln
VREFLABEL= label1, ..., labeln
VREFLAB= label1, ..., labeln
specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.
VREFLABPOS= n
specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | left |
2 | right |
VUPPER= value
specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.
WAXIS= n
specifies line thickness for axes and frame. The default value is 1.
WFIT= n
specifies line thickness for fitted curves. The default value is 1.
WGRID= n
specifies line thickness for grids. The default value is 1.
WREFL= n
specifies line thickness for reference lines. The default value is 1.
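As a minimal sketch of how several of these options might be combined, the following step requests an ipp plot with explicit horizontal tick marks and one reference line. The data set name (assay) and the variable names (response, n, dose) are hypothetical, and the options are written directly after VAR= on the assumption that they follow the statement syntax shown earlier.

```sas
/* Hypothetical data set and variables: assay, response, n, dose */
proc probit data=assay;
   model response/n = dose;
   /* Ticks at 0, 2, ..., 10 on the horizontal axis;              */
   /* one reference line at dose = 4, drawn with line type 2;     */
   /* grid lines suppressed and the fitted curve drawn thicker.   */
   ippplot var=dose
           haxis=0 to 10 by 2
           href=4
           lhref=2
           nogrid
           wfit=2;
run;
```

Because HAXIS= is specified here, the HLOWER= and HUPPER= options would have no effect if they were added to the same statement.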
LPREDPLOT < var = variable >< options > ;
The LPREDPLOT statement plots the linear predictor $x'b$ against a single continuous variable (dose variable) in the MODEL statement for either the binomial model or the multinomial model. The confidence limits for the predicted values are only available for the binomial model.
VAR= ( variable )
specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement for which the linear predictor plot is plotted. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.
Let $x_1$ be the covariate of the dose variable, $x_{-1}$ be the vector of the rest of the covariates, $\hat{b}_{-1}$ be the vector of estimated parameters corresponding to $x_{-1}$, and $\hat{b}_1$ be the estimated parameter for the dose variable of interest.
To plot $x'b$ as a function of $x_1$, $x_{-1}$ must be specified. You can use the XDATA= option to provide the values of $x_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values, which follow these rules:
If the effect contains a continuous variable (or variables), the overall mean of this effect is used.
If the effect is a single classification variable, the highest level of the variable is used.
options
add features to the plot.
For the multinomial model, you can use the LEVEL= option to specify the levels for which the linear predictor lines are plotted. The lines are labeled by the names of their levels in the middle.
You can use options in the LPREDPLOT statement to
superimpose specification limits
suppress or add the observed data points on the plot for the binomial model
suppress or add the confidence limits for the binomial model
specify the levels for which the linear predictor lines are requested for the multinomial model
specify graphical enhancements (such as color or text height)
The following tables list all options by function. The Dictionary of Options section describes each option in detail.
LEVEL= character-list | specifies the names of the levels for which the linear predictor lines are requested (only for the multinomial model) |
NOCONF | suppresses fiducial limits (only for the binomial model) |
NODATA | suppresses observed data points on the plot (only for the binomial model) |
NOTHRESH | suppresses the threshold line |
THRESHLABPOS= value | specifies the position for the label of the threshold line |
CAXIS= color | specifies color for axis |
CFIT= color | specifies color for fitted curves |
CFRAME= color | specifies color for frame |
CGRID= color | specifies color for grid lines |
CHREF= color | specifies color for HREF= lines |
CTEXT= color | specifies color for text |
CVREF= color | specifies color for VREF= lines |
ANNOTATE= SAS-data-set | specifies an ANNOTATE data set |
INBORDER | requests a border around plot |
LFIT= linetype | specifies line style for fitted curves and confidence limits |
LGRID= linetype | specifies line style for grid lines |
NOFRAME | suppresses the frame around plotting areas |
NOGRID | suppresses grid lines |
NOFIT | suppresses fitted curves |
NOHLABEL | suppresses horizontal labels |
NOHTICK | suppresses horizontal ticks |
NOVTICK | suppresses vertical ticks |
TURNVLABELS | vertically strings out characters in vertical labels |
WFIT= n | specifies thickness for fitted curves |
WGRID= n | specifies thickness for grids |
WREFL= n | specifies thickness for reference lines |
HAXIS= value1 to value2 < by value3 > | specifies tick mark values for horizontal axis |
HOFFSET= value | specifies offset for horizontal axis |
HLOWER= value | specifies lower limit on horizontal axis scale |
HUPPER= value | specifies upper limit on horizontal axis scale |
NHTICK= n | specifies number of ticks for horizontal axis |
NVTICK= n | specifies number of ticks for vertical axis |
VAXIS= value1 to value2 < by value3 > | specifies tick mark values for vertical axis |
VAXISLABEL= label | specifies label for vertical axis |
VOFFSET= value | specifies offset for vertical axis |
VLOWER= value | specifies lower limit on vertical axis scale |
VUPPER= value | specifies upper limit on vertical axis scale |
WAXIS= n | specifies thickness for axis |
DESCRIPTION= string | specifies description for graphics catalog member |
NAME= string | specifies name for plot in graphics catalog |
FONT= font | specifies software font for text |
HEIGHT= value | specifies height of text used outside framed areas |
INFONT= font | specifies software font for text inside framed areas |
INHEIGHT= value | specifies height of text inside framed areas |
HREF<(INTERSECT)>=value-list | requests horizontal reference line |
HREFLABELS=(label1, ..., labeln) | specifies labels for HREF= lines |
HREFLABPOS= n | specifies vertical position of labels for HREF= lines |
LHREF= linetype | specifies line style for HREF= lines |
LVREF= linetype | specifies line style for VREF= lines |
VREF<(INTERSECT)>=value-list | requests vertical reference line |
VREFLABELS=(label1, ..., labeln) | specifies labels for VREF= lines |
VREFLABPOS= n | specifies horizontal position of labels for VREF= lines |
The following entries provide detailed descriptions of the options in the LPREDPLOT statement.
ANNOTATE= SAS-data-set
ANNO= SAS-data-set
specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the lpred plot. The ANNOTATE= data set you specify in the LPREDPLOT statement is used for all plots created by the statement.
CAXIS= color
CAXES= color
specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.
CFIT= color
specifies the color for the fitted lpred lines. The default is the first color in the device color list.
CFRAME= color
CFR= color
specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.
CGRID= color
specifies the color for grid lines. The default is the first color in the device color list.
CHREF= color
CH= color
specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.
CTEXT= color
specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.
CVREF= color
CV= color
specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.
DESCRIPTION= string
DES= string
specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.
FONT= font
specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.
HAXIS= value1 to value2 < by value3 >
specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.
Examples of HAXIS= lists are:
haxis = 0 to 10
haxis = 2 to 10 by 2
haxis = 0 to 200 by 10
HEIGHT= value
specifies the height of text used outside framed areas. The default value is 3.846 (in percentage).
HLOWER= value
specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.
HOFFSET= value
specifies offset for horizontal axis. The default value is 1.
HUPPER= value
specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.
HREF < (INTERSECT) > = value-list
requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.
HREFLABELS= label1, ..., labeln
HREFLABEL= label1, ..., labeln
HREFLAB= label1, ..., labeln
specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.
HREFLABPOS= n
specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | top |
2 | staggered from top |
3 | bottom |
4 | staggered from bottom |
5 | alternating from top |
6 | alternating from bottom |
INBORDER
requests a border around lpred plots.
LEVEL= ( character-list )
ORDINAL= ( character-list )
specifies the names of the levels for which linear predictor lines are requested. Names must be quoted and separated by spaces. If none of the specified names is a valid level name, no lpred line is plotted.
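The following sketch illustrates the quoted, space-separated form of the LEVEL= list for a multinomial model. The data set (trial), the response variable (Symptoms, with levels such as 'None', 'Mild', and 'Severe'), and the covariate (dose) are hypothetical names chosen for illustration.

```sas
/* Hypothetical data set and variables: trial, Symptoms, dose */
proc probit data=trial order=data;
   class Symptoms;               /* single-variable response must be a CLASS variable */
   model Symptoms = dose;
   /* Request linear predictor lines only for the two named levels */
   lpredplot var=dose level=("None" "Mild");
run;
```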
LFIT= linetype
specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn by connecting solid lines ( linetype = 1 ) and confidence limits are drawn by connecting dashed lines ( linetype = 3 ).
LGRID= linetype
specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.
LHREF= linetype
LH= linetype
specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.
LVREF= linetype
LV= linetype
specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.
NAME= string
specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT.
NOCONF
suppresses confidence limits from the plot. This only works for the binomial model. Confidence limits are not plotted for the multinomial model.
NODATA
suppresses observed data points from the plot. This only works for the binomial model. Data points are not plotted for the multinomial model.
NOFIT
suppresses the fitted lpred lines.
NOFRAME
suppresses the frame around plotting areas.
NOGRID
suppresses grid lines.
NOHLABEL
suppresses horizontal labels.
NOHTICK
suppresses horizontal tick marks.
NOTHRESH
suppresses the threshold line.
NOVLABEL
suppresses vertical labels.
NOVTICK
suppresses vertical tick marks.
THRESHLABPOS= n
specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | left |
2 | right |
VAXIS= value1 to value2 < by value3 >
specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specifying tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.
Examples of VAXIS= lists are:
vaxis = 0 to 10
vaxis = 0 to 2 by .1
VAXISLABEL= string
specifies a label for the vertical axis.
VLOWER= value
specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.
VREF < (INTERSECT) > = value-list
requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.
VREFLABELS= label1, ..., labeln
VREFLABEL= label1, ..., labeln
VREFLAB= label1, ..., labeln
specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.
VREFLABPOS= n
specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | left |
2 | right |
VUPPER= number
specifies the upper limit on the vertical axis scale. The VUPPER= option specifies number as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.
WAXIS= n
specifies line thickness for axes and frame. The default value is 1.
WFIT= n
specifies line thickness for fitted lines. The default value is 1.
WGRID= n
specifies line thickness for grids. The default value is 1.
WREFL= n
specifies line thickness for reference lines. The default value is 1.
< label: > MODEL response=effects < / options > ;
< label: > MODEL events/trials=effects < / options > ;
The MODEL statement names the variables used as the response and the independent variables. Additionally, you can specify the distribution used to model the response, as well as other options. Only a single MODEL statement can be used with one invocation of the PROBIT procedure. If multiple MODEL statements are present, only the last is used. Main effects and interaction terms can be specified in the MODEL statement, similar to the GLM procedure.
The optional label is used to label output from the matching MODEL statement.
The response can be a single variable with a value that is used to indicate the level of the observed response. Such a response variable must be listed in the CLASS statement. For example, the response might be a variable called Symptoms that takes on the values 'None', 'Mild', or 'Severe'. Note that, for dichotomous response variables, the probability of the lower sorted value is modeled by default (see the Details section). Because the model fit by the PROBIT procedure requires ordered response levels, you might need to use either the ORDER=DATA option in the PROC PROBIT statement or a numeric coding of the response to get the desired ordering of levels.
Alternatively, the response can be specified as a pair of variable names separated by a slash (/). The value of the first variable, events , is the number of positive responses (or events). The value of the second variable, trials , is the number of trials. Both variables must be numeric and non-negative, and the ratio of the first variable value to the second variable value must be between 0 and 1, inclusive. For example, the variables might be hits , a variable containing the number of hits for a baseball player, and AtBats , a variable containing the number of times at bat. A model for hitting proportion (batting average) as a function of age could be specified as
model hits/AtBats=age;
The effects following the equal sign are the covariates in the model. Higher-order effects, such as interactions and nested terms, are allowed in the list, similar to the GLM procedure. Variable names and combinations of variable names representing higher-order terms are allowed to appear in this list. Class variables can be used as effects, and indicator variables are generated for the class levels. If you do not specify any covariates following the equal sign, an intercept-only model is fit.
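The two response forms described above can be sketched side by side. The data set names (batting, trial) and the variable dose are hypothetical; hits, AtBats, age, and Symptoms come from the examples in the text. The label avg_model is likewise only an illustration of the optional label.

```sas
/* events/trials syntax: both variables numeric, 0 <= hits/AtBats <= 1 */
proc probit data=batting;
   avg_model: model hits/AtBats = age;
run;

/* single-variable response: the response must appear in the CLASS statement. */
/* ORDER=DATA is used here on the assumption that the data set lists the      */
/* levels in the intended order (None, Mild, Severe).                         */
proc probit data=trial order=data;
   class Symptoms;
   model Symptoms = dose;
run;
```

Omitting everything after the equal sign in either MODEL statement would fit an intercept-only model.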
The following options are available in the MODEL statement.
AGGREGATE
AGGREGATE= (variable-list)
specifies the subpopulations on which the Pearson chi-square test statistic and the log-likelihood ratio chi-square test statistic (deviance) are calculated if the LACKFIT option is specified. See the section Rescaling the Covariance Matrix for details of Pearson's chi-square and deviance calculations.
Observations with common values in the given list of variables are regarded as coming from the same subpopulation. Variables in the list can be any variables in the input data set. Specifying the AGGREGATE option is equivalent to specifying the AGGREGATE= option with a variable list that includes all independent variables in the MODEL statement. The PROBIT procedure sorts the input data set according to the variables specified in this list. Information for the sorted data set is reported in the Response-Covariate Profile table.
The deviance and Pearson goodness-of-fit statistics are calculated if the LACKFIT option is specified in the MODEL statement. The calculated results are reported in the Goodness-of-Fit table. If the Pearson chi-square test is significant at the test level specified by the HPROB= option, the fiducial limits, if requested with the INVERSECL option in the MODEL statement, are modified (see the section Inverse Confidence Limits for details). Also, the covariance matrix is rescaled by the dispersion parameter when the SCALE= option is specified.
ALPHA= value
sets the significance level for the confidence intervals for regression parameters, fiducial limits for the predicted values, and confidence intervals for the predicted probabilities. The value must be between 0 and 1. The default value is ALPHA=0.05.
CONVERGE= value
specifies the convergence criterion. Convergence is declared when the maximum change in the parameter estimates between Newton-Raphson steps is less than the value specified. The change is a relative change if the parameter is greater than 0.01 in absolute value; otherwise, it is an absolute change.
By default, CONVERGE=1.0E-8.
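The criterion can be written out as follows. The notation is introduced here for illustration only: $\theta_j^{(i)}$ denotes the $j$th parameter estimate after step $i$, and $\varepsilon$ is the CONVERGE= value. Using the previous iterate as the reference value for the relative change is an assumption; the text above does not say which iterate is used.

```latex
\Delta_j^{(i)} =
\begin{cases}
\dfrac{\left|\theta_j^{(i)} - \theta_j^{(i-1)}\right|}{\left|\theta_j^{(i-1)}\right|},
   & \left|\theta_j^{(i-1)}\right| > 0.01, \\[2ex]
\left|\theta_j^{(i)} - \theta_j^{(i-1)}\right|,
   & \text{otherwise},
\end{cases}
\qquad
\text{convergence is declared when } \max_j \Delta_j^{(i)} < \varepsilon .
```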
CORRB
displays the estimated correlation matrix of the parameter estimates.
COVB
displays the estimated covariance matrix of the parameter estimates.
DISTRIBUTION= distribution-type
DIST= distribution-type
D= distribution-type
specifies the cumulative distribution function used to model the response probabilities. The distributions are described in the Details section. Valid values for distribution-type are
distribution-type | description |
---|---|
NORMAL | the normal distribution for the probit model |
LOGISTIC | the logistic distribution for the logit model |
EXTREMEVALUE, EXTREME, or GOMPERTZ | the extreme value, or Gompertz, distribution for the gompit model |
By default, DISTRIBUTION=NORMAL.
HPROB= p
specifies a minimum probability level for the Pearson chi-square to indicate a good fit. The default value is 0.10. The LACKFIT option must also be specified for this option to have any effect. For Pearson goodness-of-fit chi-square values with probability greater than the HPROB= value, the fiducial limits, if requested with the INVERSECL option, are computed using a critical value of 1.96. For chi-square values with probability less than the value of the HPROB= option, the critical value is a 0.95 two-sided quantile value taken from the t distribution with degrees of freedom equal to $(k - 1) \times m - q$, where $k$ is the number of levels for the response variable, $m$ is the number of different sets of independent variable values, and $q$ is the number of parameters fit in the model. If you specify the HPROB= option in both the PROC PROBIT and MODEL statements, the MODEL statement option takes precedence.
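To make the degrees-of-freedom rule concrete, the following DATA step works through one hypothetical case: a binary response ($k = 2$), ten distinct covariate settings ($m = 10$), and two fitted parameters ($q = 2$). These numbers are invented for illustration. The step uses the TINV function with probability 0.975, which corresponds to the 0.95 two-sided quantile.

```sas
/* Hypothetical values: k = 2 response levels, m = 10 covariate settings, q = 2 parameters */
data _null_;
   k = 2;
   m = 10;
   q = 2;
   df    = (k - 1) * m - q;      /* (2 - 1) * 10 - 2 = 8 */
   tcrit = tinv(0.975, df);      /* 0.95 two-sided quantile of t with df degrees of freedom */
   put df= tcrit=;
run;
```

With 8 degrees of freedom the critical value is roughly 2.31, so the fiducial limits are wider than those based on 1.96.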
INITIAL= values
sets initial values for the parameters in the model other than the intercept. The values must be given in the order in which the variables are listed in the MODEL statement. If some of the independent variables listed in the MODEL statement are classification variables, then there must be as many values given for each such variable as there are classification levels minus 1. The INITIAL= option can be specified as follows.
Type of List | Specification |
---|---|
list separated by blanks | initial=3 4 5 |
list separated by commas | initial=3, 4, 5 |
By default, all parameters have initial estimates of zero.
Note: The INITIAL= option is overridden by the INEST= option in the PROC PROBIT statement.
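The counting rule for classification variables can be sketched as follows. The data set (assay) and the variables (response, n, prep, dose) are hypothetical, prep is assumed to have three levels, and the starting values themselves are arbitrary.

```sas
/* Hypothetical: prep is a CLASS variable with 3 levels, dose is continuous */
proc probit data=assay;
   class prep;
   /* Values are listed in MODEL order:                  */
   /*   prep -> 3 - 1 = 2 values (0.5 and 0.2)           */
   /*   dose -> 1 value (1)                              */
   /* The intercept is initialized separately.           */
   model response/n = prep dose / initial=0.5 0.2 1 intercept=1;
run;
```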
INTERCEPT= value
initializes the intercept parameter to value . By default, INTERCEPT=0.
INVERSECL
computes confidence limits for the values of the first continuous independent variable (such as dose) that yield selected response rates. If the algorithm fails to converge (this can happen when C is nonzero), missing values are reported for the confidence limits. See the section Inverse Confidence Limits for details.
ITPRINT
displays the iteration history, the final evaluation of the gradient, and the second derivative matrix (Hessian).
LACKFIT
performs two goodness-of-fit tests (a Pearson chi-square test and a log-likelihood ratio chi-square test) for the fitted model.
To compute the test statistics, proper grouping of the observations into subpopulations is needed. You can use the AGGREGATE or AGGREGATE= option to this end. See the entry for the AGGREGATE and AGGREGATE= options under the MODEL statement. If neither AGGREGATE nor AGGREGATE= is specified, PROC PROBIT assumes each observation is from a separate subpopulation and computes the goodness-of-fit test statistics only for the events/trials syntax.
Note: This test is not appropriate if the data are very sparse, with only a few values at each set of the independent variable values.
If the Pearson chi-square test statistic is significant, then the covariance estimates and standard error estimates are adjusted. See the section Lack of Fit Tests for a description of the tests. Note that the LACKFIT option can also appear in the PROC PROBIT statement. See the section PROC PROBIT Statement for details.
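A minimal sketch of requesting the goodness-of-fit tests with explicit subpopulations follows. The data set names (assay, trial) and variable names (response, n, dose, Symptoms) are hypothetical.

```sas
/* events/trials syntax: group observations that share the same dose value */
proc probit data=assay;
   model response/n = dose / lackfit aggregate=(dose) inversecl hprob=0.05;
run;

/* single-variable response: AGGREGATE without a list groups on all */
/* independent variables in the MODEL statement                     */
proc probit data=trial order=data;
   class Symptoms;
   model Symptoms = dose / lackfit aggregate;
run;
```

In the first step, HPROB=0.05 replaces the default level of 0.10 used to decide whether the fiducial limits requested by INVERSECL are modified.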
MAXITER= value
MAXIT= value
specifies the maximum number of iterations to be performed in estimating the parameters. By default, MAXITER=50.
NOINT
fits a model with no intercept parameter. If the INTERCEPT= option is also specified, the intercept is fixed at the specified value; otherwise, it is set to zero. This is most useful when the response is binary. When the response has $k$ levels, then $k - 1$ intercept parameters are fit. The NOINT option sets the intercept parameter corresponding to the lowest response level equal to zero. A Lagrange multiplier, or score, test for the restricted model is computed when the NOINT option is specified.
SCALE= scale
enables you to specify the method for estimating the dispersion parameter. To correct for overdispersion or underdispersion, the covariance matrix is multiplied by the estimate of the dispersion parameter. Valid values for scale are as follows:
scale | description |
---|---|
D or DEVIANCE | specifies that the dispersion parameter be estimated by the deviance divided by its degrees of freedom. |
P or PEARSON | specifies that the dispersion parameter be estimated by the Pearson chi-square statistic divided by its degrees of freedom. This is the default. |
You can use the AGGREGATE= option to define the subpopulations for calculating the Pearson chi-square statistic and the deviance.
The Goodness-of-Fit table includes the Pearson chi-square statistic, the deviance, their degrees of freedom, the ratio of each statistic divided by its degrees of freedom, and the corresponding p-value.
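The two dispersion estimates and the covariance adjustment described in this entry can be summarized in symbols. The notation is introduced here for illustration: $\chi^2_P$ is the Pearson chi-square statistic, $D$ is the deviance, $\mathrm{df}$ is the corresponding degrees of freedom, and $\hat{\phi}$ is whichever estimate the SCALE= option selects.

```latex
\hat{\phi}_{P} = \frac{\chi^2_P}{\mathrm{df}}, \qquad
\hat{\phi}_{D} = \frac{D}{\mathrm{df}}, \qquad
\widehat{\mathrm{Cov}}_{\mathrm{adj}}(\hat{b}) = \hat{\phi}\,\widehat{\mathrm{Cov}}(\hat{b})
```

For instance, a hypothetical Pearson chi-square of 24 on 12 degrees of freedom gives $\hat{\phi}_P = 2$, so every standard error is inflated by a factor of $\sqrt{2} \approx 1.41$.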
SINGULAR= value
specifies the singularity criterion for determining linear dependencies in the set of independent variables. The sum of squares and cross-products matrix of the independent variables is formed and swept. If the relative size of a pivot becomes less than the value specified, then the variable corresponding to the pivot is considered to be linearly dependent on the previous set of variables considered. By default, SINGULAR=1E-12.
OUTPUT < OUT=SAS-data-set >< keyword=name keyword=name > ;
The OUTPUT statement creates a new SAS data set containing all variables in the input data set and, optionally, the fitted probabilities, the estimate of $x'\beta$, and the estimate of its standard error. Estimates of the probabilities, $x'\beta$, and the standard errors are computed for observations with missing response values as long as the values of all the explanatory variables are nonmissing. This enables you to compute these statistics for additional settings of the explanatory variables that are of interest but for which responses are not observed.
You can specify multiple OUTPUT statements. Each OUTPUT statement creates a new data set and applies only to the preceding MODEL statement. If you want to create a permanent SAS data set, you must specify a two-level name (refer to SAS Language Reference: Concepts for more information on permanent SAS data sets).
Details on the specifications in the OUTPUT statement are as follows:
keyword=name | specifies the statistics to include in the output data set and assigns names to the new variables that contain the statistics. Specify a keyword for each desired statistic (see the following list of keywords), an equal sign, and the variable to contain the statistic. |
The keywords allowed and the statistics they represent are as follows:
PROB or P | cumulative probability estimates |
STD | standard error estimates of $a_j + x'b$ |
XBETA | estimates of $a_j + x'\beta$ |
OUT= SAS-data-set | names the output data set. By default, the new data set is named using the DATAn convention. |
When the single variable response syntax is used, the _LEVEL_ variable is added to the output data set, and there are k − 1 output observations for each input observation, where k is the number of response levels. There is no observation output corresponding to the highest response level. For each of the k − 1 observations, the PROB variable contains the fitted probability of obtaining a response level up to the level indicated by the _LEVEL_ variable, the XBETA variable contains $a_j + x'b$, where j references the levels ($a_1 = 0$), and the STD variable contains the standard error estimate of the XBETA variable. See the Details section, which follows, for the formulas for the parameterizations.
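As a sketch of the OUTPUT statement in use, the following example requests all three statistics and includes one observation with a missing response so that predictions are produced for an unobserved dose. The data set names assay and pred, the variable names, and the data values are hypothetical.

```sas
/* Hypothetical data; the last row has a missing response (respond = .) */
/* so that only predicted values are computed for dose = 3              */
data assay;
   input dose respond number;
   datalines;
1 2 20
2 6 20
4 13 20
8 18 20
3 . 20
;
run;

proc probit data=assay;
   model respond/number = dose;
   /* p_hat, se_xb, and xb are illustrative names for the new variables */
   output out=pred p=p_hat std=se_xb xbeta=xb;
run;

proc print data=pred;
run;
```

To keep the output data set permanently, a two-level name would be used in the OUT= option in place of pred.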
PREDPPLOT < VAR = variable >< options > ;
The PREDPPLOT statement plots the predicted probability against a single continuous variable (dose variable) in the MODEL statement for both the binomial model and the multinomial model. Confidence limits are only available for the binomial model. An attached box on the right side of the plot is used to label predicted probability curves with the names of their levels for the multinomial model. You can specify the color of this box using the CLABBOX= option.
VAR= (variable)
specifies a single continuous variable (dose variable) in the independent variable list of the MODEL statement. If a VAR= variable is not specified, the first single continuous variable in the independent variable list of the MODEL statement is used. If such a variable does not exist in the independent variable list of the MODEL statement, an error is reported.
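The predicted probability is

$$\hat{p} = C + (1 - C)\,F(x'\hat{b})$$

for the binomial model and

$$\hat{p}_1 = C + (1 - C)\,F(x'\hat{b})$$
$$\hat{p}_j = (1 - C)\left(F(\hat{a}_j + x'\hat{b}) - F(\hat{a}_{j-1} + x'\hat{b})\right), \quad j = 2, \ldots, k-1$$
$$\hat{p}_k = (1 - C)\left(1 - F(\hat{a}_{k-1} + x'\hat{b})\right)$$

for the multinomial model with k response levels, where F is the cumulative distribution function used to model the probability, $x'$ is the vector of the covariates, $\hat{a}_j$ are the estimated ordinal intercepts with $\hat{a}_1 = 0$, C is the threshold parameter, either known or estimated from the model, and $\hat{b}$ is the vector of estimated parameters.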
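As a quick arithmetic check of the binomial case, take the predicted probability to have the form $\hat{p} = C + (1 - C)F(x'\hat{b})$ with F the standard normal distribution function $\Phi$. The values $C = 0.1$ and $x'\hat{b} = 0$ below are hypothetical.

```latex
\begin{align*}
\hat{p} &= C + (1 - C)\,\Phi(x'\hat{b}) \\
        &= 0.1 + (1 - 0.1)\,\Phi(0) \\
        &= 0.1 + 0.9 \times 0.5 = 0.55
\end{align*}
```

With $C = 0$ the same linear predictor would give 0.5, so the threshold parameter lifts the lower end of the curve from 0 to C.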
To plot the predicted probability $\hat{p}$ (or $\hat{p}_j$) as a function of a continuous variable $x_1$, the remaining covariates $x_{-1}$ must be specified. You can use the XDATA= option to provide the values of $x_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values, which follow these rules:
If the effect contains a continuous variable (or variables), the overall mean of this effect is used.
If the effect is a single classification variable, the highest level of the variable is used.
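The following sketch shows a PREDPPLOT request for a model with one continuous variable and one classification variable. The data set trial, its variables, and the data values are hypothetical, and the example assumes a SAS release and graphics setup in which the PREDPPLOT statement is available.

```sas
/* Hypothetical data: two preparations (prep) tested at three doses */
data trial;
   input prep $ dose respond number;
   datalines;
A 1 1 20
A 2 5 20
A 4 12 20
B 1 3 20
B 2 9 20
B 4 16 20
;
run;

proc probit data=trial;
   class prep;
   model respond/number = dose prep;
   /* Plot predicted probability against dose. No XDATA= data set is  */
   /* supplied, so the default rules described above determine the    */
   /* value at which the classification effect prep is held           */
   predpplot var=dose;
run;
```

If the default setting is not the one of interest, the XDATA= option in the PROC PROBIT statement is the documented way to supply other covariate values.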
options
enable you to plot the observed data and add features to the plot.
You can use options in the PREDPPLOT statement to
superimpose specification limits
suppress or add observed data points for the binomial model
suppress or add confidence limits for the binomial model
specify the levels for which predicted probability curves are requested for the multinomial model
specify graphical enhancements (such as color or text height)
The following tables list all options by function. The Dictionary of Options describes each option in detail.
LEVEL= character-list | specifies the names of the levels for which the predicted probability curves are requested (only for the multinomial model) |
NOCONF | suppresses confidence limits |
NODATA | suppresses observed data points on the plot |
NOTHRESH | suppresses the threshold line |
THRESHLABPOS= value | specifies the position for the label of the threshold line |
CAXIS= color | specifies color for the axes |
CFIT= color | specifies color for fitted curves |
CFRAME= color | specifies color for frame |
CGRID= color | specifies color for grid lines |
CHREF= color | specifies color for HREF= lines |
CLABBOX= color | specifies color for label box |
CTEXT= color | specifies color for text |
CVREF= color | specifies color for VREF= lines |
ANNOTATE= SAS-data-set | specifies an ANNOTATE data set |
INBORDER | requests a border around plot |
LFIT= linetype | specifies line style for fitted curves and confidence limits |
LGRID= linetype | specifies line style for grid lines |
NOFRAME | suppresses the frame around plotting areas |
NOGRID | suppresses grid lines |
NOFIT | suppresses fitted curves |
NOHLABEL | suppresses horizontal labels |
NOHTICK | suppresses horizontal ticks |
NOVLABEL | suppresses vertical labels |
NOVTICK | suppresses vertical ticks |
TURNVLABELS | vertically strings out characters in vertical labels |
WFIT= n | specifies thickness for fitted curves |
WGRID= n | specifies thickness for grids |
WREFL= n | specifies thickness for reference lines |
HAXIS= value1 to value2 < by value3 > | specifies tick mark values for horizontal axis |
HOFFSET= value | specifies offset for horizontal axis |
HLOWER= value | specifies lower limit on horizontal axis scale |
HUPPER= value | specifies upper limit on horizontal axis scale |
NHTICK= n | specifies number of ticks for horizontal axis |
NVTICK= n | specifies number of ticks for vertical axis |
VAXIS= value1 to value2 < by value3 > | specifies tick mark values for vertical axis |
VAXISLABEL= label | specifies label for vertical axis |
VOFFSET= value | specifies offset for vertical axis |
VLOWER= value | specifies lower limit on vertical axis scale |
VUPPER= value | specifies upper limit on vertical axis scale |
WAXIS= n | specifies thickness for axis |
DESCRIPTION= string | specifies description for graphics catalog member |
NAME= string | specifies name for plot in graphics catalog |
FONT= font | specifies software font for text |
HEIGHT= value | specifies height of text used outside framed areas |
INFONT= font | specifies software font for text inside framed areas |
INHEIGHT= value | specifies height of text inside framed areas |
HREF<(INTERSECT)>= value-list | requests reference lines perpendicular to the horizontal axis |
HREFLABELS= (label1, ..., labeln) | specifies labels for HREF= lines |
HREFLABPOS= n | specifies vertical position of labels for HREF= lines |
LHREF= linetype | specifies line style for HREF= lines |
LVREF= linetype | specifies line style for VREF= lines |
VREF<(INTERSECT)>= value-list | requests reference lines perpendicular to the vertical axis |
VREFLABELS= (label1, ..., labeln) | specifies labels for VREF= lines |
VREFLABPOS= n | specifies horizontal position of labels for VREF= lines |
The following entries provide detailed descriptions of the options in the PREDPPLOT statement.
ANNOTATE= SAS-data-set
ANNO= SAS-data-set
specifies an ANNOTATE data set, as described in SAS/GRAPH Software: Reference, that enables you to add features to the predicted probability plot. The ANNOTATE= data set you specify in the PREDPPLOT statement is used for all plots created by the statement.
CAXIS= color
CAXES= color
specifies the color used for the axes and tick marks. This option overrides any COLOR= specifications in an AXIS statement. The default is the first color in the device color list.
CFIT= color
specifies the color for the fitted predicted probability curves. The default is the first color in the device color list.
CFRAME= color
CFR= color
specifies the color for the area enclosed by the axes and frame. This area is not shaded by default.
CGRID= color
specifies the color for grid lines. The default is the first color in the device color list.
CHREF= color
CH= color
specifies the color for lines requested by the HREF= option. The default is the first color in the device color list.
CTEXT= color
specifies the color for tick mark values and axis labels. The default is the color specified for the CTEXT= option in the most recent GOPTIONS statement.
CVREF= color
CV= color
specifies the color for lines requested by the VREF= option. The default is the first color in the device color list.
DESCRIPTION= string
DES= string
specifies a description, up to 40 characters, that appears in the PROC GREPLAY master menu. The default is the variable name.
FONT= font
specifies a software font for reference line and axis labels. You can also specify fonts for axis labels in an AXIS statement. The FONT= font takes precedence over the FTEXT= font specified in the most recent GOPTIONS statement. Hardware characters are used by default.
HAXIS= value1 to value2 < by value3 >
specifies tick mark values for the horizontal axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. If value3 is omitted, a value of 1 is used.
Examples of HAXIS= lists are:
haxis = 0 to 10
haxis = 2 to 10 by 2
haxis = 0 to 200 by 10
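In context, an HAXIS= list appears among the options of the PREDPPLOT statement, as in this sketch. The data set assay and its variables are hypothetical; with the list shown, tick marks fall at 0, 2, 4, 6, 8, and 10, following the rules above.

```sas
/* Hypothetical dose-response data with doses inside the 0-10 axis range */
data assay;
   input dose respond number;
   datalines;
1 2 20
2 6 20
4 13 20
8 18 20
;
run;

proc probit data=assay;
   model respond/number = dose;
   /* Horizontal axis runs from 0 to 10 with a tick mark every 2 units */
   predpplot var=dose haxis=0 to 10 by 2;
run;
```

Because HAXIS= is specified, the HLOWER= and HUPPER= options would have no effect on this plot.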
HEIGHT= value
specifies the height of text used outside framed areas.
HLOWER= value
specifies the lower limit on the horizontal axis scale. The HLOWER= option specifies value as the lower horizontal axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the HAXIS= option is used.
HOFFSET= value
specifies the offset for the horizontal axis. The default value is 1.
HUPPER= value
specifies value as the upper horizontal axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the HAXIS= option is used.
HREF < (INTERSECT) > = value-list
requests reference lines perpendicular to the horizontal axis. If (INTERSECT) is specified, a second reference line perpendicular to the vertical axis is drawn that intersects the fit line at the same point as the horizontal axis reference line. If a horizontal axis reference line label is specified, the intersecting vertical axis reference line is labeled with the vertical axis value. See also the CHREF=, HREFLABELS=, and LHREF= options.
HREFLABELS= label1, ..., labeln
HREFLABEL= label1, ..., labeln
HREFLAB= label1, ..., labeln
specifies labels for the lines requested by the HREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.
HREFLABPOS= n
specifies the vertical position of labels for HREF= lines. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | top |
2 | staggered from top |
3 | bottom |
4 | staggered from bottom |
5 | alternating from top |
6 | alternating from bottom |
INBORDER
requests a border around predicted probability plots.
LEVEL= ( character-list )
ORDINAL= ( character-list )
specifies the names of the levels for which predicted probability curves are requested. Names should be quoted and separated by spaces. If none of the names matches a response level, no fitted probability curve is plotted.
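For a multinomial model, the LEVEL= option might be used as in the following sketch. The data set symptoms, the response levels None, Mild, and Severe, and the counts are hypothetical; the example also assumes that the ordinal response is listed in the CLASS statement and that ORDER=DATA in the PROC PROBIT statement keeps the levels in the order in which they first appear.

```sas
/* Hypothetical ordinal response data; n is the count for each row */
data symptoms;
   input dose severity $ n;
   datalines;
1 None 14
1 Mild 5
1 Severe 1
2 None 9
2 Mild 7
2 Severe 4
4 None 3
4 Mild 8
4 Severe 9
;
run;

proc probit data=symptoms order=data;
   class severity;
   model severity = dose;
   weight n;
   /* Request curves only for the two named levels */
   predpplot var=dose level=("None" "Mild");
run;
```

Confidence limits and observed data points are not drawn here, since those features apply only to the binomial model.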
LFIT= linetype
specifies a line style for fitted curves and confidence limits. By default, fitted curves are drawn with solid lines ( linetype = 1 ) and confidence limits are drawn with dashed lines ( linetype = 3 ).
LGRID= linetype
specifies a line style for all grid lines. linetype is between 1 and 46. The default is 35.
LHREF= linetype
LH= linetype
specifies the line type for lines requested by the HREF= option. The default is 2, which produces a dashed line.
LVREF= linetype
LV= linetype
specifies the line type for lines requested by the VREF= option. The default is 2, which produces a dashed line.
NAME= string
specifies a name for the plot, up to eight characters, that appears in the PROC GREPLAY master menu. The default is PROBIT.
NOCONF
suppresses confidence limits from the plot. This option applies only to the binomial model; confidence limits are never plotted for the multinomial model.
NODATA
suppresses observed data points from the plot. This option applies only to the binomial model; data points are never plotted for the multinomial model.
NOFIT
suppresses the fitted predicted probability curves.
NOFRAME
suppresses the frame around plotting areas.
NOGRID
suppresses grid lines.
NOHLABEL
suppresses horizontal labels.
NOHTICK
suppresses horizontal tick marks.
NOTHRESH
suppresses the threshold line.
NOVLABEL
suppresses vertical labels.
NOVTICK
suppresses vertical tick marks.
THRESHLABPOS= n
specifies the horizontal position of labels for the threshold line. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | left |
2 | right |
VAXIS= value1 to value2 < by value3 >
specifies tick mark values for the vertical axis. value1, value2, and value3 must be numeric, and value1 must be less than value2. The lower tick mark is value1. Tick marks are drawn at increments of value3. The last tick mark is the greatest value that does not exceed value2. This method of specification of tick marks is not valid for logarithmic axes. If value3 is omitted, a value of 1 is used.
Examples of VAXIS= lists are:
vaxis = 0 to 10
vaxis = 0 to 2 by .1
VAXISLABEL= string
specifies a label for the vertical axis.
VLOWER= value
specifies the lower limit on the vertical axis scale. The VLOWER= option specifies value as the lower vertical axis tick mark. The tick mark interval and the upper axis limit are determined automatically. This option has no effect if the VAXIS= option is used.
VREF < (INTERSECT) > = value-list
requests reference lines perpendicular to the vertical axis. If (INTERSECT) is specified, a second reference line perpendicular to the horizontal axis is drawn that intersects the fit line at the same point as the vertical axis reference line. If a vertical axis reference line label is specified, the intersecting horizontal axis reference line is labeled with the horizontal axis value. See also the CVREF=, LVREF=, and VREFLABELS= options.
VREFLABELS= label1, ..., labeln
VREFLABEL= label1, ..., labeln
VREFLAB= label1, ..., labeln
specifies labels for the lines requested by the VREF= option. The number of labels must equal the number of lines. Enclose each label in quotes. Labels can be up to 16 characters.
VREFLABPOS= n
specifies the horizontal position of labels for VREF= lines. The following table shows valid values for n and the corresponding label placements.
n | label placement |
---|---|
1 | left |
2 | right |
VUPPER= value
specifies the upper limit on the vertical axis scale. The VUPPER= option specifies value as the upper vertical axis tick mark. The tick mark interval and the lower axis limit are determined automatically. This option has no effect if the VAXIS= option is used.
WAXIS= n
specifies line thickness for axes and frame. The default value is 1.
WFIT= n
specifies line thickness for fitted curves. The default value is 1.
WGRID= n
specifies line thickness for grids. The default value is 1.
WREFL= n
specifies line thickness for reference lines. The default value is 1.
WEIGHT variable ;
A WEIGHT statement can be used with PROC PROBIT to weight each observation by the value of the variable specified. The contribution of each observation to the likelihood function is multiplied by the value of the weight variable. Observations with zero, negative, or missing weights are not used in model estimation.
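A common use of the WEIGHT statement is with data stored as one row per dose and outcome together with a count. The sketch below assumes that layout; the data set grouped, its variables, and the counts are hypothetical, and the example assumes that a single-variable response is listed in the CLASS statement.

```sas
/* Hypothetical grouped data: n subjects with the given outcome at each dose */
data grouped;
   input dose outcome $ n;
   datalines;
1 dead 2
1 alive 18
2 dead 9
2 alive 11
4 dead 16
4 alive 4
;
run;

proc probit data=grouped;
   class outcome;
   model outcome = dose;
   /* Each row's likelihood contribution is multiplied by n; rows with */
   /* zero, negative, or missing n would be excluded from estimation   */
   weight n;
run;
```

Which outcome level is treated as the lower response level depends on the ordering of the response values, so the direction of the fitted coefficients should be checked against the procedure output.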