Details


This section describes the statistical and computational aspects of the ROBUSTREG procedure. The following notation is used throughout this section.

Let X = (x_ij) denote an n × p matrix, y = (y_1, …, y_n)^T a given n-vector of responses, and θ = (θ_1, …, θ_p)^T an unknown p-vector of parameters or coefficients whose components are to be estimated. The matrix X is called the design matrix. Consider the usual linear model

y = X θ + e

where e = (e_1, …, e_n)^T is an n-vector of unknown errors. It is assumed that (for a given X) the components e_i of e are independent and identically distributed according to a distribution L(·/σ), where σ is a scale parameter (usually unknown). The vector of residuals for a given value of θ is denoted by r = (r_1, …, r_n)^T, and the i-th row of the matrix X is denoted by x_i^T.

M Estimation

M estimation in the context of regression was first introduced by Huber (1973) as a result of making the least squares approach robust. Although M estimators are not robust with respect to leverage points, they are popular in applications where leverage points are not an issue.

Instead of minimizing a sum of squares of the residuals, a Huber-type M estimator θ_M of θ minimizes a sum of less rapidly increasing functions of the residuals:

Q(θ) = Σ_{i=1}^n ρ( r_i / σ )

where r = y − Xθ. For ordinary least squares estimation, ρ is the quadratic function ρ(x) = x²/2.

If σ is known, then by taking derivatives with respect to θ, θ_M is also a solution of the system of p equations

Σ_{i=1}^n ψ( r_i / σ ) x_ij = 0,   j = 1, …, p

where ψ = ρ′. If ρ is convex, θ_M is the unique solution.

The ROBUSTREG procedure solves this system by using iteratively reweighted least squares (IRLS). The weight function w(x) is defined as

w(x) = ψ(x) / x

The ROBUSTREG procedure provides ten kinds of weight functions (corresponding to ten ρ functions) through the WEIGHTFUNCTION= option in the MODEL statement. See the section 'Weight Functions' on page 3995 for a complete discussion. You can specify the scale parameter σ with the SCALE= option in the PROC statement.

If σ is unknown, both θ and σ are estimated by minimizing the function

Q(θ, σ) = Σ_{i=1}^n [ ρ( r_i / σ ) + a ] σ

where a is a constant.

The algorithm proceeds by alternately improving θ in a location step and σ in a scale step.

For the scale step, three methods are available to estimate σ, which you can select with the SCALE= option.

  1. (SCALE=HUBER<(D=d)>) Compute σ̂ by the iteration

    (σ̂^(m+1))² = (1 / (n h)) Σ_{i=1}^n χ_d( r_i / σ̂^(m) ) (σ̂^(m))²

    where

    χ_d(x) = x²/2 if |x| < d, and χ_d(x) = d²/2 otherwise

    is the Huber function, and h = (n − p)/n · [ d²(1 − Φ(d)) + Φ(d) − 1/2 − d φ(d) ] is the Huber constant (refer to Huber 1981, p. 179). You can specify d with the D= option. By default, d = 2.5.

  2. (SCALE=TUKEY<(D=d)>) Compute σ̂ by solving the supplementary equation

    (1 / (n − p)) Σ_{i=1}^n χ_d( r_i / σ̂ ) = β

    where

    χ_d(x) = 3(x/d)² − 3(x/d)⁴ + (x/d)⁶ if |x| < d, and χ_d(x) = 1 otherwise

    Here χ_d is Tukey's biweight function, and β = ∫ χ_d(s) dΦ(s) is the constant such that the solution σ̂ is asymptotically consistent when L(·/σ) = Φ(·) (refer to Hampel et al. 1986, p. 149). You can specify d with the D= option. By default, d = 2.5.

  3. (SCALE=MED) Compute σ̂ by the iteration

    σ̂^(m+1) = median{ |r_i(θ̂^(m))| / β, i = 1, …, n }

    where β = Φ⁻¹(0.75) is the constant such that the solution σ̂ is asymptotically consistent when L(·/σ) = Φ(·) (refer to Hampel et al. 1986, p. 312).

Note that SCALE = MED is the default.

Algorithm

The basic algorithm for computing M estimates for regression is iteratively reweighted least squares (IRLS). As the name suggests, a weighted least squares fit is carried out inside an iteration loop. For each iteration, a set of weights for the observations is used in the least squares fit. The weights are constructed by applying a weight function to the current residuals. Initial weights are based on residuals from an initial fit. The ROBUSTREG procedure uses the unweighted least squares fit as a default initial fit. The iteration terminates when a convergence criterion is satisfied. The maximum number of iterations is set to 1000. You can specify the weight function and the convergence criteria.
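The IRLS loop described above can be sketched in Python. This is a minimal illustrative sketch, not PROC ROBUSTREG's implementation: the Huber weight function and the MED-style scale step are used for concreteness, and the tuning constant c = 1.345 is an assumed 95%-efficiency default.

```python
import numpy as np

def huber_weight(x, c=1.345):
    # Huber weight w(x) = psi(x)/x: 1 inside [-c, c], c/|x| outside.
    ax = np.abs(x)
    return np.where(ax < c, 1.0, c / np.maximum(ax, 1e-12))

def irls_m_estimate(X, y, c=1.345, tol=1e-8, max_iter=1000):
    """Huber-type M estimate via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # unweighted LS initial fit
    for _ in range(max_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745          # MED scale step; 0.6745 = Phi^-1(0.75)
        if scale <= 0:
            break                                      # exact fit: nothing left to reweight
        w = huber_weight(r / scale, c)
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)  # weighted normal equations
        done = np.max(np.abs(beta_new - beta)) <= tol * (1.0 + np.max(np.abs(beta)))
        beta = beta_new
        if done:
            break
    return beta
```

With one gross outlier in otherwise clean data, the reweighting drives the outlier's weight toward zero and the fit toward the clean-data least squares line.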

Weight Functions

You can specify the weight function for M estimation with the WEIGHTFUNCTION= option. The ROBUSTREG procedure provides ten weight functions. By default, the procedure uses the bisquare weight function. In most cases, M estimates are more sensitive to the parameters of these weight functions than to the type of the weight function. The median weight function is not stable and is seldom recommended in data analysis; it is included in the procedure for completeness. You can specify the parameters for these weight functions. Except for the hampel and median weight functions, default values for these parameters are defined such that the corresponding M estimates have 95% asymptotic efficiency in the location model with the Gaussian distribution (see Holland and Welsch (1977)).

The following list shows the weight functions available.

andrews

    w(x) = sin(x/c) / (x/c) if |x| ≤ π c, and w(x) = 0 otherwise

bisquare

    w(x) = ( 1 − (x/c)² )² if |x| < c, and w(x) = 0 otherwise

cauchy

    w(x) = 1 / ( 1 + (x/c)² )

fair

    w(x) = 1 / ( 1 + |x|/c )

hampel

    w(x) = 1 if |x| < a; a/|x| if a ≤ |x| < b; (a/|x|)(c − |x|)/(c − b) if b ≤ |x| < c; and 0 otherwise

huber

    w(x) = 1 if |x| < c, and w(x) = c/|x| otherwise

logistic

    w(x) = tanh(x/c) / (x/c)

median

    w(x) = 1/|x| for x ≠ 0

talworth

    w(x) = 1 if |x| < c, and w(x) = 0 otherwise

welsch

    w(x) = exp( −(x/c)² / 2 )
See Table 62.4 on page 3985 for the default values of the constants in these weight functions.
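Two of the weight functions above, written out in Python for illustration. The tuning constants shown (c = 1.345 for huber, c = 4.685 for bisquare) are assumed 95%-efficiency defaults; see Table 62.4 for the values the procedure actually uses.

```python
import numpy as np

def huber_w(x, c=1.345):
    # huber: w(x) = 1 for |x| < c, c/|x| otherwise
    ax = np.abs(x)
    return np.where(ax < c, 1.0, c / np.maximum(ax, 1e-12))

def bisquare_w(x, c=4.685):
    # bisquare: w(x) = (1 - (x/c)^2)^2 for |x| < c, 0 otherwise
    z = np.asarray(x, dtype=float) / c
    return np.where(np.abs(z) < 1.0, (1.0 - z**2) ** 2, 0.0)
```

Note that the bisquare weight redescends to exactly zero beyond c, whereas the huber weight only decays like c/|x|; this is why bisquare-type M estimates can reject gross outliers completely.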

Convergence Criteria

The following convergence criteria are available in PROC ROBUSTREG:

  1. Relative change in the scaled residuals (CONVERGENCE= RESID)

  2. Relative change in the coefficients (CONVERGENCE= COEF)

  3. Relative change in weights (CONVERGENCE= W)

You can specify the criteria with the CONVERGENCE= option in the PROC statement. The default is CONVERGENCE= COEF.

You can specify the precision of the convergence criterion with the EPS= sub-option.

In addition to these convergence criteria, a convergence criterion based on a scale-independent measure of the gradient is always checked; see Coleman et al. (1980). A warning is issued if this criterion is not satisfied.
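The relative-change idea behind these criteria can be sketched as follows. This is an illustrative helper, not the procedure's exact implementation; eps plays the role of the EPS= sub-option, and the same check can be applied to the coefficient vector, the scaled residuals, or the weights.

```python
import numpy as np

def rel_change_converged(old, new, eps=1e-8):
    # Relative change of a vector between successive iterations
    # (coefficients, scaled residuals, or weights).
    old = np.asarray(old, dtype=float)
    new = np.asarray(new, dtype=float)
    denom = max(float(np.max(np.abs(old))), 1.0)
    return float(np.max(np.abs(new - old))) <= eps * denom
```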

Asymptotic Covariance and Confidence Intervals

The following three estimators of the asymptotic covariance of the robust estimator are available in PROC ROBUSTREG:

H1:  K² [ (1/(n−p)) Σ_i ψ²(r_i) ] / [ (1/n) Σ_i ψ′(r_i) ]²  (X^T X)⁻¹

H2:  K [ (1/(n−p)) Σ_i ψ²(r_i) ] / [ (1/n) Σ_i ψ′(r_i) ]  W⁻¹

H3:  K⁻¹ (1/(n−p)) Σ_i ψ²(r_i)  W⁻¹ (X^T X) W⁻¹

where K is a correction factor and W_jk = Σ_i ψ′(r_i) x_ij x_ik. Refer to Huber (1981, p. 173) for more details.

You can specify the asymptotic covariance estimate with the ASYMPCOV= option. The ROBUSTREG procedure uses H1 as the default because of its simplicity and stability. Confidence intervals are computed from the diagonal elements of the estimated asymptotic covariance matrix.

R² and Deviance

The robust version of R² is defined as

R² = [ Σ_i ρ( (y_i − μ̂)/ŝ ) − Σ_i ρ( r_i/ŝ ) ] / Σ_i ρ( (y_i − μ̂)/ŝ )

and the robust deviance is defined as the optimal value of the objective function on the σ²-scale:

D = 2 ŝ² Σ_i ρ( r_i / ŝ )

where ρ′ = ψ, θ̂ is the M estimator of θ, μ̂ is the M estimator of location, and ŝ is the M estimator of the scale parameter in the full model.

Linear Tests

Two tests are available in PROC ROBUSTREG for the canonical linear hypothesis

H₀: θ_j = 0,   j = q + 1, …, p

The first test is a robust version of the F test, which is referred to as the ρ-test. Denote the M estimators in the full and reduced models by θ̂ ∈ Ω and θ̂₁ ∈ Ω₁, respectively. Let

Q = Σ_{i=1}^n ρ( r_i(θ̂) / σ̂ )  and  Q₁ = Σ_{i=1}^n ρ( r_i(θ̂₁) / σ̂ )

be the optimal values of the objective function in the full and reduced models. The robust F test is based on the test statistic

F_ρ = [ 2 / (p − q) ] ( Q₁ − Q )

Asymptotically F_ρ ∼ λ χ²_{p−q} / (p − q) under H₀, where the standardization factor is λ = ∫ ψ²(s) dΦ(s) / ∫ ψ′(s) dΦ(s) and Φ is the cumulative distribution function of the standard normal distribution. Large values of F_ρ are significant. This test is a special case of the general τ-test of Hampel et al. (1986, Section 7.2).

The second test is a robust version of the Wald test, which is referred to as the R_n²-test. The test uses the test statistic

R_n² = θ̂₂^T H₂₂⁻¹ θ̂₂

where H₂₂ is the (p − q) × (p − q) lower right block of the asymptotic covariance matrix of the M estimate θ̂_M of θ in a p-parameter linear model, and θ̂₂ denotes the last p − q components of θ̂_M.

Under H₀, the statistic R_n² has an asymptotic χ² distribution with p − q degrees of freedom. Large absolute values of R_n² are significant. Refer to Hampel et al. (1986, Chapter 7).

Model Selection

When M estimation is used, two criteria are available in PROC ROBUSTREG for model selection. The first criterion is a counterpart of the Akaike (1974) AIC criterion for robust regression, and it is defined as

AICR = 2 Σ_{i=1}^n ρ( r_i / σ̂ ) + α p

where σ̂ is a robust estimate of σ and θ̂ is the M estimator with the p-dimensional design matrix.

As with AIC, α is the weight of the penalty for dimensions. The ROBUSTREG procedure uses α = 2 E[ψ²] / E[ψ′] (Ronchetti (1985)) and estimates it by using the final robust residuals.

The second criterion is a robust version of the Schwarz information criterion (BIC), and it is defined as

BICR = 2 Σ_{i=1}^n ρ( r_i / σ̂ ) + p log(n)

High Breakdown Value Estimation

The breakdown value of an estimator is defined as the smallest fraction of contamination that can cause the estimator to take on values arbitrarily far from its value on the uncontaminated data. The breakdown value of an estimator can be used as a measure of the robustness of the estimator. Rousseeuw and Leroy (1987) and others introduced the following high breakdown value estimators for linear regression.

LTS Estimate

The least trimmed squares (LTS) estimate proposed by Rousseeuw (1984) is defined as the p-vector

θ̂_LTS = argmin_θ Q_LTS(θ)

where

Q_LTS(θ) = Σ_{i=1}^h r²_(i)

r²_(1) ≤ r²_(2) ≤ … ≤ r²_(n) are the ordered squared residuals r²_i = (y_i − x_i^T θ)², i = 1, …, n, and h is defined in the range n/2 + 1 ≤ h ≤ (3n + p + 1)/4.

You can specify the parameter h with the H= option in the PROC statement. By default, h = [(3n + p + 1)/4]. The breakdown value is (n − h)/n for the LTS estimate.

The least median of squares (LMS) estimate is defined as the p-vector

θ̂_LMS = argmin_θ Q_LMS(θ)

where

Q_LMS(θ) = r²_(h)

r²_(1) ≤ r²_(2) ≤ … ≤ r²_(n) are the ordered squared residuals r²_i = (y_i − x_i^T θ)², i = 1, …, n, and h is defined in the range n/2 + 1 ≤ h ≤ (3n + p + 1)/4.

The breakdown value for the LMS estimate is also (n − h)/n. However, the LTS estimate has several advantages over the LMS estimate. Its objective function is smoother, making the LTS estimate less 'jumpy' (i.e., sensitive to local effects) than the LMS estimate. Its statistical efficiency is better, because the LTS estimate is asymptotically normal, whereas the LMS estimate has a lower convergence rate (Rousseeuw and Leroy (1987)). Another important advantage is that, using the FAST-LTS algorithm of Rousseeuw and Van Driessen (2000), the LTS estimate takes less computing time and is more accurate.

The ROBUSTREG procedure computes LTS estimates using the FAST-LTS algorithm. The estimates are often used to detect outliers in the data, which are then downweighted in the resulting weighted LS regression.

Algorithm

  • Least trimmed squares (LTS) regression is based on the subset of h observations (out of a total of n observations) whose least squares fit possesses the smallest sum of squared residuals. The coverage h may be set between n/2 and n. The LTS method was proposed by Rousseeuw (1984, p. 876) as a highly robust regression estimator with breakdown value (n − h)/n. The ROBUSTREG procedure uses the FAST-LTS algorithm given by Rousseeuw and Van Driessen (1998). The intercept adjustment technique is also used in this implementation. However, because this adjustment is expensive to compute, it is optional. You can use the IADJUST option in the PROC statement to request or suppress the intercept adjustment. By default, PROC ROBUSTREG does intercept adjustment for data sets with fewer than 10000 observations. The algorithm is described briefly as follows. Refer to Rousseeuw and Van Driessen (2000) for details.

    1. The default h is [(3n + p + 1)/4], where p is the number of independent variables. You can specify any integer h in the range n/2 ≤ h ≤ n with the H= option in the MODEL statement. The breakdown value for LTS, (n − h)/n, is reported. The default h is a good compromise between breakdown value and statistical efficiency.

    2. If p = 1 (single regressor), the procedure uses the exact algorithm of Rousseeuw and Leroy (1987, p. 172).

    3. If p ≥ 2, the procedure uses the following algorithm. If n < 2 ssubs, where ssubs is the size of the subgroups (you can specify ssubs with the SUBGROUPSIZE= option in the PROC statement; by default, ssubs = 300), draw a random p-subset and compute the regression coefficients using these p points (if the regression is degenerate, draw another p-subset). Compute the absolute residuals for all observations in the data set, and select the first h points with the smallest absolute residuals. From this selected h-subset, carry out nsteps C-steps (concentration steps; see Rousseeuw and Van Driessen (2000) for details). You can specify nsteps with the CSTEP= option in the PROC statement; by default, nsteps = 2. Redraw p-subsets and repeat the preceding computing procedure nrep times, and find the nbsol (at most) solutions with the lowest sums of h squared residuals. You can specify nrep with the NREP= option in the PROC statement; by default, NREP = min{500, …}. For small n and p, all subsets are used and the NREP= option is ignored (Rousseeuw and Hubert (1996)). You can specify nbsol with the NBEST= option in the PROC statement; by default, NBEST=10. For each of these nbsol best solutions, take C-steps until convergence and find the best final solution.

    4. If n ≥ 5 ssubs, construct 5 disjoint random subgroups with size ssubs. If 2 ssubs ≤ n < 5 ssubs, the data are split into at most four subgroups with ssubs or more observations in each subgroup, so that each observation belongs to a subgroup and the subgroups have roughly the same size. Let nsubs denote the number of subgroups. Inside each subgroup, repeat the procedure in Step 3 and keep the nbsol best solutions. Pool the subgroups, yielding the merged set of size n_merged. In the merged set, for each of the nsubs × nbsol best solutions, carry out nsteps C-steps using n_merged and the corresponding coverage, and keep the nbsol best solutions. In the full data set, for each of these nbsol best solutions, take C-steps using n and h until convergence and find the best final solution.
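The C-step (concentration step) at the heart of FAST-LTS can be sketched as follows. Rousseeuw and Van Driessen show that each such step can only decrease the LTS criterion evaluated at the current h-subset. This is a minimal sketch, not the procedure's implementation.

```python
import numpy as np

def c_step(X, y, subset_idx, h):
    # Fit LS on the current h-subset, then keep the h observations with the
    # smallest squared residuals under that fit.
    beta = np.linalg.lstsq(X[subset_idx], y[subset_idx], rcond=None)[0]
    r2 = (y - X @ beta) ** 2
    return np.argsort(r2)[:h]

def subset_objective(X, y, subset_idx):
    # Sum of squared residuals of the subset's own LS fit: the LTS criterion
    # evaluated at this h-subset.
    beta = np.linalg.lstsq(X[subset_idx], y[subset_idx], rcond=None)[0]
    return float(((y[subset_idx] - X[subset_idx] @ beta) ** 2).sum())
```

Starting from any h-subset, iterating c_step yields a non-increasing sequence of objective values, which is why a small number of C-steps per candidate already separates promising subsets from poor ones.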

R²

The robust version of R² for the LTS estimate is defined as

[equation omitted]

for models with the intercept term and as

[equation omitted]

for models without the intercept term, where

s_LTS = d_{h,n} sqrt( (1/h) Σ_{i=1}^h r²_(i) )

is a preliminary estimate of the parameter σ in the distribution function L(·/σ).

Here d_{h,n} is chosen to make s_LTS consistent, assuming a Gaussian model. Specifically,

d_{h,n} = 1 / sqrt( 1 − (2n / (h c_{h,n})) φ(1/c_{h,n}) ),   c_{h,n} = 1 / Φ⁻¹( (h + n) / (2n) )

with Φ and φ being the distribution function and the density function of the standard normal distribution, respectively.

Final Weighted Scale Estimator

  • The ROBUSTREG procedure displays two scale estimators, s_LTS and Wscale. The estimator Wscale is a more efficient scale estimator based on the preliminary estimate s_LTS; it is defined as

    Wscale² = Σ_i w_i r_i² / ( Σ_i w_i − p )

    where

    w_i = 1 if |r_i / s_LTS| ≤ k, and w_i = 0 otherwise
  • You can specify k with the CUTOFF= option in the MODEL statement. By default, k = 3.
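A hedged sketch of this weighted scale computation in Python. The hard-rejection weights and the degrees-of-freedom correction shown here follow the definition above, but the exact correction used by the procedure is an assumption.

```python
import numpy as np

def weighted_scale(residuals, s_lts, p, k=3.0):
    # Weight 0 for observations with |r_i / s_LTS| > k, weight 1 otherwise,
    # then a weighted scale with a degrees-of-freedom correction.
    r = np.asarray(residuals, dtype=float)
    w = (np.abs(r / s_lts) <= k).astype(float)
    return float(np.sqrt(np.sum(w * r**2) / (np.sum(w) - p)))
```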

S Estimate

The S estimate proposed by Rousseeuw and Yohai (1984) is defined as the p-vector

θ̂_S = argmin_θ S(θ)

where the dispersion S(θ) is the solution of

(1 / (n − p)) Σ_{i=1}^n χ( (y_i − x_i^T θ) / S ) = β

Here β is set to ∫ χ(s) dΦ(s) such that θ̂_S and S(θ̂_S) are asymptotically consistent estimates of θ and σ for the Gaussian regression model. The breakdown value of the S estimate is β / max_s χ(s).

The ROBUSTREG procedure provides two choices for : the Tukey function and the Yohai function.

The Tukey function, which you can specify with the option CHIF=TUKEY, is

χ_k(x) = 3(x/k)² − 3(x/k)⁴ + (x/k)⁶ if |x| ≤ k, and χ_k(x) = 1 otherwise

The constant k controls the breakdown value and efficiency of the S estimate. By specifying the efficiency using the EFF= option, you can determine the corresponding k. The default k is 2.9366, such that the breakdown value of the S estimate is 0.25 with a corresponding asymptotic efficiency for the Gaussian model of 75.9%.

The Yohai function, which you can specify with the option CHIF=YOHAI, is

χ_k(x) = x²/2 if |x| ≤ 2k; k² [ b₀ + b₁(x/k)² + b₂(x/k)⁴ + b₃(x/k)⁶ + b₄(x/k)⁸ ] if 2k < |x| ≤ 3k; and 3.25 k² otherwise

where b₀ = 1.792, b₁ = −0.972, b₂ = 0.432, b₃ = −0.052, and b₄ = 0.002. By specifying the efficiency using the EFF= option, you can determine the corresponding k. By default, k is set to 0.7405, such that the breakdown value of the S estimate is 0.25 with a corresponding asymptotic efficiency for the Gaussian model of 72.7%.

Algorithm

  • The ROBUSTREG procedure implements the algorithm by Marazzi (1993) for the S estimate, which is a refined version of the algorithm proposed by Ruppert (1992). The refined algorithm is briefly described as follows.

  • Initialize iter = 1.

    1. Draw a random q-subset of the total n observations and compute the regression coefficients using these q observations (if the regression is degenerate, draw another q-subset), where q ≥ p can be specified with the SUBSIZE= option. By default, q = p.

    2. Compute the residuals r_i for i = 1, …, n. If iter = 1, set s* = 2 median{ |r_i|, i = 1, …, n }; if s* = 0, set s* = min{ |r_i|, i = 1, …, n }, and go to Step 3. If iter > 1 and (1/(n − p)) Σ_i χ( r_i / s* ) < β, go to Step 3; else go to Step 5.

    3. Solve for s the equation

      (1 / (n − p)) Σ_{i=1}^n χ( r_i / s ) = β

      using an iterative algorithm.

    4. If iter > 1 and s > s*, go to Step 5. Otherwise, set s* = s and θ* = θ. If s* < TOLS, return s* and θ*; else go to Step 5.

    5. If iter < NREP, set iter = iter + 1 and return to Step 1; else return s* and θ*.

  • The ROBUSTREG procedure performs the following refinement step by default. You can request that this refinement not be done by using the NOREFINE option in the PROC statement.

    1. Let ψ = χ′. Using the values s* and θ* from the previous steps, compute M estimates θ_M and σ_M of θ and σ with the setup for M estimation in the section 'M Estimation' on page 3993. If σ_M > s*, give a warning and return s* and θ*; otherwise, return σ_M and θ_M.

  • You can specify TOLS with the TOLERANCE= option; by default, TOLS = .001. You can specify NREP with the NREP= option. You can also use the options NREP=NREP0 or NREP=NREP1 to determine NREP according to the following table. NREP=NREP0 is the default.

Table 62.7: Default NREP

    p     NREP0   NREP1
    1     150     500
    2     300     1000
    3     400     1500
    4     500     2000
    5     600     2500
    6     700     3000
    7     850     3000
    8     1250    3000
    9     1500    3000
    >9    1500    3000

R² and Deviance

  • The robust version of R² for the S estimate is defined as

    [equation omitted]
  • for the model with the intercept term and

    [equation omitted]
  • for the model without the intercept term, where S_p is the S estimate of the scale in the full model, S_μ is the S estimate of the scale in the regression model with only the intercept term, and S₀ is the S estimate of the scale without any regressor. The deviance D is defined as the optimal value of the objective function on the σ²-scale:

    [equation omitted]

Asymptotic Covariance and Confidence Intervals

  • Since the S estimate satisfies the same first-order necessary conditions as the M estimate, it has the same asymptotic covariance as that of the M estimate. All three estimators of the asymptotic covariance for the M estimate in the section 'Asymptotic Covariance and Confidence Intervals' on page 3997 can be used for the S estimate. In addition, the weighted covariance estimator H4 described in the section 'Asymptotic Covariance and Confidence Intervals' on page 4008 is also available and is the default. Confidence intervals for estimated parameters are computed from the diagonal elements of the estimated asymptotic covariance matrix.

MM Estimation

MM estimation is a combination of high breakdown value estimation and efficient estimation, which was introduced by Yohai (1987). It has three steps:

  1. Compute an initial (consistent) high breakdown value estimate θ′. The ROBUSTREG procedure provides two kinds of estimates as the initial estimate: the LTS estimate and the S estimate. By default, the LTS estimate is used because of its speed, efficiency, and high breakdown value. The breakdown value of the final MM estimate is determined by the breakdown value of the initial LTS estimate and the constant k₀ in the χ function. To use the S estimate as the initial estimate, you specify the INITEST=S option in the PROC statement. In this case, the breakdown value of the final MM estimate is determined only by the constant k₀. Instead of computing the LTS estimate or the S estimate as the initial estimate, you can also specify the initial estimate explicitly by using the INEST= option in the PROC statement. See the section 'INEST= Data Set' on page 4011 for details.

  2. Find σ′ such that

    (1 / (n − p)) Σ_{i=1}^n χ( (y_i − x_i^T θ′) / σ′ ) = β

    where β = ∫ χ(s) dΦ(s).

    The ROBUSTREG procedure provides two choices for χ: the Tukey function and the Yohai function.

    The Tukey function, which you can specify with the option CHIF=TUKEY, is

    χ_{k₀}(x) = 3(x/k₀)² − 3(x/k₀)⁴ + (x/k₀)⁶ if |x| ≤ k₀, and χ_{k₀}(x) = 1 otherwise

    where k₀ can be specified with the K0= option. The default k₀ = 2.9366 is such that the asymptotically consistent scale estimate σ′ has a breakdown value of 25%.

    The Yohai function, which you can specify with the option CHIF=YOHAI, is

    χ_{k₀}(x) = x²/2 if |x| ≤ 2k₀; k₀² [ b₀ + b₁(x/k₀)² + b₂(x/k₀)⁴ + b₃(x/k₀)⁶ + b₄(x/k₀)⁸ ] if 2k₀ < |x| ≤ 3k₀; and 3.25 k₀² otherwise

    where b₀ = 1.792, b₁ = −0.972, b₂ = 0.432, b₃ = −0.052, and b₄ = 0.002. You can specify k₀ with the K0= option. The default k₀ is 0.7405, such that the asymptotically consistent scale estimate σ′ has a breakdown value of 25%.

  3. Find a local minimum θ_MM of

    Q_MM(θ) = Σ_{i=1}^n ρ( (y_i − x_i^T θ) / σ′ )

    such that Q_MM(θ_MM) ≤ Q_MM(θ′). The algorithm for M estimation is used here.

    The ROBUSTREG procedure provides two choices for ρ: the Tukey function and the Yohai function.

    The Tukey function, which you can specify with the option CHIF=TUKEY, is

    ρ(x) = 3(x/k₁)² − 3(x/k₁)⁴ + (x/k₁)⁶ if |x| ≤ k₁, and ρ(x) = 1 otherwise

    where k₁ can be specified with the K1= option. The default k₁ is 3.440, such that the MM estimate has 85% asymptotic efficiency with the Gaussian distribution.

    The Yohai function, which you can specify with the option CHIF=YOHAI, is

    ρ(x) = x²/2 if |x| ≤ 2k₁; k₁² [ b₀ + b₁(x/k₁)² + b₂(x/k₁)⁴ + b₃(x/k₁)⁶ + b₄(x/k₁)⁸ ] if 2k₁ < |x| ≤ 3k₁; and 3.25 k₁² otherwise

    with b₀ = 1.792, b₁ = −0.972, b₂ = 0.432, b₃ = −0.052, and b₄ = 0.002, where k₁ can be specified with the K1= option. The default k₁ is 0.868, such that the MM estimate has 85% asymptotic efficiency with the Gaussian distribution.

Algorithm

The initial LTS estimate is computed by using the algorithm described in the section 'LTS Estimate' on page 4000. You can control the quantile of the LTS estimate with the option INITH=h, where h is an integer in the allowed coverage range for LTS. By default, h = [(3n + p + 1)/4], which corresponds to a breakdown value of around 25%.

The initial S estimate is computed by using the algorithm described in the section 'S Estimate' on page 4003. You can control the breakdown value and efficiency of this initial S estimate by the constant k₀, which you can specify with the K0= option.

The scale parameter σ is solved by the iterative algorithm

(σ^(m+1))² = (σ^(m))² · [ 1 / ((n − p) β) ] Σ_{i=1}^n χ_{k₀}( r_i / σ^(m) )

where β = ∫ χ_{k₀}(s) dΦ(s).

Once the scale parameter is computed, the iteratively reweighted least squares (IRLS) algorithm with fixed scale parameter is used to compute the final MM estimator.

Convergence Criteria

In the iterative algorithm for the scale parameter, the relative change of the scale parameter controls the convergence.

In the iteratively reweighted least squares algorithm, the same convergence criteria used for the M estimate are used here.

Bias Test

Although the final MM estimate inherits the high breakdown value property, its bias due to the distortion caused by outliers can be high. Yohai, Stahel, and Zamar (1991) introduced a bias test. The ROBUSTREG procedure implements this test when you specify the BIASTEST option in the PROC statement. This test is based on the initial scale estimate σ′ and the final scale estimate σ₁, which is the solution of

[equation omitted]

Let ψ_{k₀}(·) = χ′_{k₀}(·) and ψ_{k₁}(·) = ρ′_{k₁}(·), where ′ denotes the derivative with respect to the argument. Compute

[equation omitted]

Standard asymptotic theory shows that T approximately follows a χ² distribution with p degrees of freedom. If T exceeds the α quantile of the χ² distribution with p degrees of freedom, then the ROBUSTREG procedure gives a warning and recommends the use of other methods. Otherwise, the final MM estimate and the initial scale estimate are reported. You can specify α with the ALPHA= option following the BIASTEST option. By default, ALPHA=0.99.

Asymptotic Covariance and Confidence Intervals

Since the MM estimate is computed as an M estimate with a fixed scale in the last step, the asymptotic covariance for the M estimate can be used for the asymptotic covariance of the MM estimate. Besides the three estimators H1, H2, and H3 described in the section 'Asymptotic Covariance and Confidence Intervals' on page 3997, a weighted covariance estimator H4 is available:

[equation omitted]

where K is the correction factor.

You can specify these estimators with the ASYMPCOV= option (H1, H2, H3, or H4). The ROBUSTREG procedure uses H4 as the default. Confidence intervals for estimated parameters are computed from the diagonal elements of the estimated asymptotic covariance matrix.

R² and Deviance

The robust version of R² for the MM estimate is defined as

R² = [ Σ_i ρ( (y_i − μ̂)/ŝ ) − Σ_i ρ( r_i/ŝ ) ] / Σ_i ρ( (y_i − μ̂)/ŝ )

and the robust deviance is defined as the optimal value of the objective function on the σ²-scale:

D = 2 ŝ² Σ_i ρ( r_i / ŝ )

where ρ′ = ψ, θ̂ is the MM estimator of θ, μ̂ is the MM estimator of location, and ŝ is the MM estimator of the scale parameter in the full model.

Linear Tests

For MM estimation, the same ρ-test and R_n²-test used for M estimation can be used. See the section 'Linear Tests' on page 3998 for details.

Model Selection

For MM estimation, the same two model selection methods used for M estimation can be used. See the section 'Model Selection' on page 3999 for details.

Robust Multivariate Location and Scale Estimates

The ROBUSTREG procedure uses robust multivariate location and scale estimates for leverage-point detection. The procedure provides the minimum covariance determinant (MCD) method, which was introduced by Rousseeuw (1984).

Algorithm

PROC ROBUSTREG implements the algorithm given by Rousseeuw and Van Driessen (1999) for MCD, which is similar to the algorithm for LTS.

Robust Distance

The Mahalanobis distance is defined as

MD(x_i) = sqrt( (x_i − x̄)^T C⁻¹ (x_i − x̄) )

where x̄ = (1/n) Σ_i x_i and C = (1/(n − 1)) Σ_i (x_i − x̄)(x_i − x̄)^T. Here x_i = (x_i1, …, x_i(p−1))^T does not include the constant variable. The relation between the Mahalanobis distance MD(x_i) and the hat matrix H = (h_ij) = X(X^T X)⁻¹X^T is

h_ii = MD²(x_i) / (n − 1) + 1/n

The robust distance is defined as

RD(x_i) = sqrt( (x_i − T(X))^T C(X)⁻¹ (x_i − T(X)) )

where T(X) and C(X) are the robust multivariate location and scale obtained by MCD.

These distances are used to detect leverage points.
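The relation between the Mahalanobis distance and the hat diagonal can be checked numerically for a design with an intercept. A small illustrative sketch (not the procedure's code):

```python
import numpy as np

def mahalanobis_sq(Z):
    # Classical squared Mahalanobis distances of the rows of Z
    # (Z excludes the constant column).
    mean = Z.mean(axis=0)
    C = np.cov(Z, rowvar=False)            # covariance with n-1 denominator
    Ci = np.linalg.inv(np.atleast_2d(C))
    D = Z - mean
    return np.einsum("ij,jk,ik->i", D, Ci, D)

def hat_diagonal(Z):
    # Diagonal of H = X (X^T X)^{-1} X^T for the design X = [1, Z].
    n = Z.shape[0]
    X = np.column_stack([np.ones(n), Z])
    return np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
```

For any full-rank Z, hat_diagonal(Z) equals mahalanobis_sq(Z)/(n − 1) + 1/n elementwise, which is the identity quoted in the text.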

Leverage Point and Outlier Detection

Let C be the cutoff value. The variable LEVERAGE is defined as

LEVERAGE = 0 if RD(x_i) ≤ C, and LEVERAGE = 1 otherwise

You can specify a cutoff value with the LEVERAGE option in the MODEL statement.

Residuals r_i, i = 1, …, n, based on robust regression estimates are used to detect vertical outliers. The variable OUTLIER is defined as

OUTLIER = 0 if |r_i| ≤ k σ̂, and OUTLIER = 1 otherwise

You can specify the multiplier k of the cutoff value by using the CUTOFF= option in the MODEL statement.

An ODS table called DIAGNOSTICS contains these two variables.
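Putting the two indicators together, the diagnostics can be sketched as follows. The function name and cutoff arguments here are hypothetical, not ROBUSTREG options; they only illustrate how the flags are derived from robust distances and robust residuals.

```python
import numpy as np

def diagnostics(robust_dist, residuals, scale, leverage_cutoff, k=3.0):
    # LEVERAGE: robust distance beyond the cutoff.
    # OUTLIER: robust standardized residual beyond the multiplier k.
    leverage = (np.asarray(robust_dist, dtype=float) > leverage_cutoff).astype(int)
    outlier = (np.abs(np.asarray(residuals, dtype=float)) > k * scale).astype(int)
    return leverage, outlier
```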

INEST= Data Set

When you use M or MM estimation, you can use the INEST= data set to specify initial estimates for all the parameters in the model. The INEST= option is ignored if you specify LTS or S estimation with the METHOD=LTS or METHOD=S option, or if you specify the INITEST= option after the METHOD=MM option in the PROC statement. The INEST= data set must contain the intercept variable (named Intercept) and all independent variables in the MODEL statement.

If BY processing is used, the INEST= data set should also include the BY variables, and there must be at least one observation for each BY group. If there is more than one observation in a BY group, the first one read is used for that BY group.

If the INEST= data set also contains the _TYPE_ variable, only observations with _TYPE_ value 'PARMS' are used as starting values.

You can specify starting values for the iteratively reweighted least squares algorithm in the INEST= data set. The INEST= data set has the same structure as the OUTEST= data set but is not required to have all the variables or observations that appear in the OUTEST= data set. One simple use of the INEST= option is passing the previous OUTEST= data set directly to the next model as an INEST= data set, assuming that the two models have the same parameterization.

OUTEST= Data Set

The OUTEST= data set contains parameter estimates for the model. You can specify a label in the MODEL statement to distinguish between the estimates for different models fit by the ROBUSTREG procedure. If the COVOUT option is specified, the OUTEST= data set also contains the estimated covariance matrix of the parameter estimates. Note that, if the ROBUSTREG procedure does not converge, the parameter estimates are set to missing in the OUTEST= data set.

The OUTEST= data set contains all variables specified in the MODEL statement and the BY statement. One observation consists of parameter values for the model with the dependent variable having the value −1. If the COVOUT option is specified, there are additional observations containing the rows of the estimated covariance matrix. For these observations, the dependent variable contains the parameter estimate for the corresponding row variable. The following variables are also added to the data set:

_MODEL_

a character variable containing the label of the MODEL statement, if present. Otherwise, the variable's value is blank

_NAME_

a character variable containing the name of the dependent variable for the parameter estimates observations or the name of the row for the covariance matrix estimates

_TYPE_

a character variable containing the type of the observation, either PARMS for parameter estimates or COV for covariance estimates

_METHOD_

a character variable containing the type of estimation method: M estimation, LTS estimation, S estimation, or MM estimation

_STATUS_

a character variable containing the status of model fitting: Converged, Warning, or Failed

INTERCEPT

a numeric variable containing the intercept parameter estimates and covariances

_SCALE_

a numeric variable containing the scale parameter estimates

Any BY variables specified are also added to the OUTEST= data set.

Computational Resources

The algorithms for the various estimation methods need different amounts of memory for working space. Let p be the number of parameters estimated and n be the number of observations used in the model estimation.

For M estimation, the minimum working space (in bytes) needed is

[equation omitted]

If sufficient space is available, the input data set is also kept in memory; otherwise, the input data set is reread for computing the iteratively reweighted least squares estimates, and the execution time of the procedure increases substantially. For each reweighted least squares step, O(np² + p³) multiplications and additions are required for computing the cross-product matrix and its inverse. The O(v) notation means that, for large values of the argument, v, O(v) is approximately a constant times v.

Since the iteratively reweighted least squares algorithm converges very quickly (normally within less than 20 iterations), the computation of M estimates is fast.

LTS estimation is more expensive in computation. The minimum working space (in bytes) needed is

[equation omitted]

The memory is mainly used to store the current data used by LTS for modeling. The LTS algorithm uses subsampling and spends much of its computing time on resampling and computing estimates for subsamples. Since it resamples if singularity is detected, it may take more time if the data set has serious singularities.

The MCD algorithm for high leverage point diagnostics is similar to the LTS algorithm.

ODS Table Names

The ROBUSTREG procedure assigns a name to each table it creates. You can specify these names when using the Output Delivery System (ODS) to select tables and create output data sets. These names are listed in the following table.

Table 62.8: ODS Tables Produced in PROC ROBUSTREG

ODS Table Name        Description                               Statement   Option
-------------------   ---------------------------------------   ---------   ---------------
BestEstimates         Best final estimates for LTS              PROC        SUBANALYSIS
BestSubEstimates      Best estimates for each subgroup          PROC        SUBANALYSIS [*]
BiasTest              Bias test for MM estimation               PROC        BIASTEST
ClassLevels           Class variable levels                     CLASS       default [*]
CorrB                 Parameter estimate correlation matrix     MODEL       CORRB
CovB                  Parameter estimate covariance matrix      MODEL       COVB
CStep                 C-step for LTS fitting                    PROC        SUBANALYSIS
Diagnostics           Outlier diagnostics                       MODEL       DIAGNOSTICS
DiagSummary           Summary of the outlier diagnostics        MODEL       default
GoodFit               R2, deviance, AIC, and BIC                MODEL       default
InitLTSProfile        Profile for initial LTS estimate          PROC        METHOD
InitSProfile          Profile for initial S estimate            PROC        METHOD
IterHistory           Iteration history                         PROC        ITPRINT
LTSEstimates          LTS parameter estimates                   PROC        METHOD
LTSLocationScale      Location and scale for LTS                PROC        METHOD
LTSProfile            Profile for LTS estimate                  PROC        METHOD
LTSRsquare            R2 for LTS estimate                       PROC        METHOD
MMProfile             Profile for MM estimate                   PROC        METHOD
ModelInfo             Model information                         MODEL       default
NObs                  Observations summary                      PROC        default
ParameterEstimates    Parameter estimates                       MODEL       default
ParameterEstimatesF   Final weighted LS estimates               PROC        FWLS
ParameterEstimatesR   Reduced parameter estimates               TEST        default
ParmInfo              Parameter indices                         MODEL       default
SProfile              Profile for S estimate                    PROC        METHOD
Groups                Groups for LTS fitting                    PROC        SUBANALYSIS [*]
SummaryStatistics     Summary statistics for model variables    MODEL       default
TestsProfile          Results for tests                         TEST        default

[*] Depends on data.
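For example, an ODS OUTPUT statement can save any of the tables above to a data set. The data set name `stack` and the variable names `y`, `x1`-`x3` below are hypothetical, chosen only to make the sketch concrete:

```sas
ods output DiagSummary=diagsum ParameterEstimates=parms;

proc robustreg data=stack;
   model y = x1 x2 x3 / diagnostics;
run;
```

After the step runs, the work data sets `diagsum` and `parms` contain the outlier diagnostics summary and the parameter estimates, respectively.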

ODS Graphics (Experimental)

Graphical displays are important in robust regression and outlier detection. Two plots are particularly useful for revealing outliers and leverage points. The first is a scatter plot of the standardized robust residuals against the robust distances (RDPLOT). The second is a scatter plot of the robust distances against the classical Mahalanobis distances (DDPLOT). See Figure 62.4 on page 3975 and Figure 62.5 on page 3975 for examples. In addition to these two plots, a histogram and a quantile-quantile plot of the standardized robust residuals are also helpful. See Figure 62.6 on page 3976 and Figure 62.7 on page 3976 for examples.

This section describes the use of ODS for creating these four plots with the ROBUSTREG procedure. These graphics are experimental in this release, meaning that both the graphical results and the syntax for specifying them are subject to change in a future release.

To request these plots you must specify the ODS GRAPHICS statement in addition to the PLOT= (or PLOTS=) option, which is described as follows. For more information on the ODS GRAPHICS statement, see Chapter 15, 'Statistical Graphics Using ODS.'

You can specify the PLOT= or PLOTS= option in the PROC statement to request one or more plots:

PLOT=keyword

PLOTS=(keyword-list)

  • requests plots for robust regression. You can specify one or more of the keywords listed in Table 62.9.

    With the RDPLOT and DDPLOT options, you can label the points on the plots by specifying the LABEL= suboption immediately after the keyword:

PLOT=DDPLOT <(LABEL=label-method)>

PLOT=RDPLOT <(LABEL=label-method)>

  • You can specify one of the following label methods:

    Table 62.10: Label Methods

    Value of LABEL=   Label Method
    ---------------   ---------------------
    ALL               label all points
    OUTLIER           label outliers
    LEVERAGE          label leverage points
    NONE              no labels

  • By default, the ROBUSTREG procedure labels both outliers and leverage points.

  • If you specify ID variables in the ID statement, the values of the first ID variable are used as labels; otherwise, observation numbers are used as labels.

  • The histogram is superimposed with a normal density curve and a kernel density curve.
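Putting these options together, a request for all four plots, with outliers labeled on the RDPLOT by the values of an ID variable, might look as follows. The data set `stack`, the variables `y`, `x1`-`x3`, and the ID variable `company` are hypothetical names used only for illustration:

```sas
ods graphics on;

proc robustreg data=stack plots=(rdplot(label=outlier) ddplot reshistogram resqqplot);
   model y = x1 x2 x3 / diagnostics leverage;
   id company;
run;

ods graphics off;
```

The DIAGNOSTICS and LEVERAGE options in the MODEL statement request the outlier and leverage-point diagnostics that the plots display.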

ODS Graph Names

PROC ROBUSTREG assigns a name to each graph it creates using ODS. You can use these names to reference the graphs when using ODS. The names are listed in Table 62.11 on page 4015.

Table 62.11: ODS Graphics Produced by PROC ROBUSTREG

ODS Graph Name      Plot Description                                   Statement   PLOTS= Option
-----------------   ------------------------------------------------   ---------   -------------
DDPlot              Robust distance vs. Mahalanobis distance           PROC        DDPLOT
RDPlot              Standardized robust residual vs. robust distance   PROC        RDPLOT
ResidualHistogram   Histogram of standardized robust residuals         PROC        RESHISTOGRAM
ResidualQQPlot      Q-Q plot of standardized robust residuals          PROC        RESQQPLOT

To request these graphs you must specify the ODS GRAPHICS statement in addition to the PLOT= (or PLOTS=) option described in Table 62.9 on page 4014. For more information on the ODS GRAPHICS statement, see Chapter 15, 'Statistical Graphics Using ODS.'

Table 62.9: Options for Plots

Keyword        Plot
------------   ------------------------------------------------
DDPLOT         Robust distance vs. Mahalanobis distance
RDPLOT         Standardized robust residual vs. robust distance
RESHISTOGRAM   Histogram of standardized robust residuals
RESQQPLOT      Q-Q plot of standardized robust residuals
ALL            All plots




SAS/STAT 9.1 User's Guide (Vol. 6)
Year: 2004
Pages: 127