Frankel (1961) reports an experiment aimed at maximizing the yield of mercaptobenzothiazole (MBT) by varying processing time and temperature. Myers (1976) uses a two-factor model in which the estimated surface does not have a unique optimum. A ridge analysis is used to determine the region in which the optimum lies. The objective is to find the settings of time and temperature in the processing of a chemical that maximize the yield. The following statements read the data and invoke PROC RSREG. These statements produce Output 63.1.1 through Output 63.1.4:
data d;
   input Time Temp MBT;
   label Time = "Reaction Time (Hours)"
         Temp = "Temperature (Degrees Centigrade)"
         MBT  = "Percent Yield Mercaptobenzothiazole";
   datalines;
 4.0  250  83.8
20.0  250  81.7
12.0  250  82.4
12.0  250  82.9
12.0  220  84.7
12.0  280  57.9
12.0  250  81.2
 6.3  229  81.3
 6.3  271  83.1
17.7  229  85.3
17.7  271  72.7
 4.0  250  82.0
;

proc sort;
   by Time Temp;
run;

proc rsreg;
   model MBT=Time Temp / lackfit;
   ridge max;
run;
The RSREG Procedure

Coding Coefficients for the Independent Variables

Factor    Subtracted off    Divided by
Time           12.000000      8.000000
Temp          250.000000     30.000000

Response Surface for Variable MBT: Percent Yield Mercaptobenzothiazole

Response Mean                 79.916667
Root MSE                       4.615964
R-Square                         0.8003
Coefficient of Variation         5.7760
The RSREG Procedure

                         Type I Sum
Regression        DF     of Squares    R-Square    F Value    Pr > F
Linear             2     313.585803      0.4899       7.36    0.0243
Quadratic          2     146.768144      0.2293       3.44    0.1009
Crossproduct       1      51.840000      0.0810       2.43    0.1698
Total Model        5     512.193947      0.8003       4.81    0.0410

                             Sum of
Residual          DF        Squares    Mean Square    F Value    Pr > F
Lack of Fit        3     124.696053      41.565351      39.63    0.0065
Pure Error         3       3.146667       1.048889
Total Error        6     127.842720      21.307120

                                                                       Parameter
                                                                        Estimate
                                       Standard                       from Coded
Parameter     DF        Estimate          Error    t Value    Pr > |t|       Data
Intercept      1     -545.867976     277.145373      -1.97     0.0964   82.173110
Time           1        6.872863       5.004928       1.37     0.2188   -1.014287
Temp           1        4.989743       2.165839       2.30     0.0608   -8.676768
Time*Time      1        0.021631       0.056784       0.38     0.7164    1.384394
Temp*Time      1       -0.030075       0.019281      -1.56     0.1698   -7.218045
Temp*Temp      1       -0.009836       0.004304      -2.29     0.0623   -8.852519

                    Sum of
Factor    DF       Squares    Mean Square    F Value    Pr > F    Label
Time       3     61.290957      20.430319       0.96    0.4704    Reaction Time (Hours)
Temp       3    461.250925     153.750308       7.22    0.0205    Temperature (Degrees Centigrade)
The RSREG Procedure

Canonical Analysis of Response Surface Based on Coded Data

               Critical Value
Factor         Coded        Uncoded    Label
Time       -0.441758       8.465935    Reaction Time (Hours)
Temp       -0.309976     240.700718    Temperature (Degrees Centigrade)

Predicted value at stationary point: 83.741940

                       Eigenvectors
Eigenvalues         Time         Temp
   2.528816     0.953223    -0.302267
  -9.996940     0.302267     0.953223

Stationary point is a saddle point.
The RSREG Procedure

Estimated Ridge of Maximum Response for Variable MBT: Percent Yield Mercaptobenzothiazole

 Coded     Estimated     Standard      Uncoded Factor Values
Radius      Response        Error         Time          Temp
   0.0     82.173110     2.665023    12.000000    250.000000
   0.1     82.952909     2.648671    11.964493    247.002956
   0.2     83.558260     2.602270    12.142790    244.023941
   0.3     84.037098     2.533296    12.704153    241.396084
   0.4     84.470454     2.457836    13.517555    239.435227
   0.5     84.914099     2.404616    14.370977    237.919138
   0.6     85.390012     2.410981    15.212247    236.624811
   0.7     85.906767     2.516619    16.037822    235.449230
   0.8     86.468277     2.752355    16.850813    234.344204
   0.9     87.076587     3.130961    17.654321    233.284652
   1.0     87.732874     3.648568    18.450682    232.256238
Output 63.1.2 shows that the lack of fit for the model is highly significant (F = 39.63, p = 0.0065). Because the quadratic model does not fit the data very well, firm statements about the underlying process should not be based on the current analysis alone. Note from the analysis of variance for the model that the test for the Time factor is not significant (p = 0.4704). If further experimentation is undertaken, it might be best to fix Time at a moderate to high value and to concentrate on the effect of temperature. In the actual experiment discussed here, extra runs were made that confirmed the results of the following analysis.
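As a check on the lack-of-fit test, the F statistic can be reproduced by hand from the sums of squares in Output 63.1.2. The pure-error sum of squares comes from the replicated design points in the data: three runs at (Time, Temp) = (12, 250) and two runs at (4, 250), which supply 2 + 1 = 3 degrees of freedom. The symbols below are notation introduced for this illustration and do not appear in the PROC RSREG output.

```latex
\begin{align*}
\bar{y}_{(12,250)} &= \tfrac{1}{3}(82.4 + 82.9 + 81.2) \approx 82.1667, \qquad
\bar{y}_{(4,250)} = \tfrac{1}{2}(83.8 + 82.0) = 82.9, \\
SS_{\mathrm{PE}} &\approx \left(0.2333^2 + 0.7333^2 + 0.9667^2\right) + \left(0.9^2 + 0.9^2\right)
                  \approx 1.5267 + 1.6200 = 3.1467, \\
MS_{\mathrm{LOF}} &= 124.696053 / 3 = 41.565351, \qquad
MS_{\mathrm{PE}} = 3.146667 / 3 = 1.048889, \\
F &= MS_{\mathrm{LOF}} / MS_{\mathrm{PE}} = 41.565351 / 1.048889 \approx 39.63 .
\end{align*}
```

The lack-of-fit mean square is roughly 40 times the pure-error mean square, which is what drives the small p-value reported for the test.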
The canonical analysis (Output 63.1.3) indicates that the predicted response surface is shaped like a saddle. The eigenvalue of 2.5 shows that the valley orientation of the saddle is less curved than the hill orientation, with eigenvalue of -9.99. The coefficients of the associated eigenvectors show that the valley is more aligned with Time and the hill with Temp. Because the canonical analysis resulted in a saddle point, the estimated surface does not have a unique optimum.
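To make the saddle concrete, the fitted surface can be written in canonical form around the stationary point, using the eigenvalues from Output 63.1.3. This is the standard canonical representation of a fitted quadratic surface. The symbols $w_1$ and $w_2$ (coordinates along the two eigenvectors, in coded units measured from the stationary point) and $x_{\mathrm{Time}}$, $x_{\mathrm{Temp}}$ (coded factors) are notation assumed here. The last line checks the coded and uncoded critical values against the coding coefficients in Output 63.1.1; the coded values are negative because both uncoded values lie below the coding centers of 12 and 250.

```latex
\begin{align*}
\hat{y} &= 83.741940 + 2.528816\,w_1^2 - 9.996940\,w_2^2, \\
x_{\mathrm{Time}} &= \frac{\mathrm{Time} - 12}{8}, \qquad
x_{\mathrm{Temp}} = \frac{\mathrm{Temp} - 250}{30}, \\
\mathrm{Time}_s &= 12 + 8\,(-0.441758) \approx 8.4659, \qquad
\mathrm{Temp}_s = 250 + 30\,(-0.309976) \approx 240.7007 .
\end{align*}
```

Moving away from the stationary point along $w_1$ (mostly Time) raises the predicted yield, while moving along $w_2$ (mostly Temp) lowers it, which is why the surface has no single maximum.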
However, the ridge analysis in Output 63.1.4 indicates that maximum yields will result from relatively high reaction times and low temperatures. A contour plot of the predicted response surface, shown in Output 63.1.5, confirms this conclusion.
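The ridge of maximum response can be read as a constrained optimization: for each radius $R$, find the point at coded distance $R$ from the design center that has the largest predicted response. The short check below, in notation assumed here ($\mathbf{x}$ for the vector of coded factors), confirms that the last row of Output 63.1.4 lies on the circle of radius 1.0 in coded units.

```latex
\begin{align*}
&\max_{\mathbf{x}} \; \hat{y}(\mathbf{x}) \quad \text{subject to} \quad \mathbf{x}^{\top}\mathbf{x} = R^2, \\
&x_{\mathrm{Time}} = \frac{18.450682 - 12}{8} \approx 0.8063, \qquad
 x_{\mathrm{Temp}} = \frac{232.256238 - 250}{30} \approx -0.5915, \\
&x_{\mathrm{Time}}^2 + x_{\mathrm{Temp}}^2 \approx 0.6502 + 0.3498 = 1.0000 = R^2 .
\end{align*}
```

The positive coded Time and negative coded Temp at every radius beyond 0.1 match the conclusion that yield improves with longer reaction times and lower temperatures.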
The statements that produce this plot follow. Note that contour and three-dimensional plots can be created interactively using SAS/INSIGHT software or the ADX Interface in SAS/QC software. Initial DATA steps create a grid over Time and Temp and combine this grid with the original data, using a variable flag to indicate the grid. Then, PROC RSREG is used to create predictions for the combined data. Finally, PROC GCONTOUR displays a contour plot of the predictions over just the grid.
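/* Grid of Time and Temp values. flag=1 marks grid rows, and   */
/* MBT is missing, so these rows do not contribute to the fit. */
/* Because of SET d, the grid is written once for each         */
/* observation in d.                                           */
data b;
   set d;
   flag=1;
   MBT=.;
   do Time=0 to 20 by 1;
      do Temp=220 to 280 by 5;
         output;
      end;
   end;
run;

/* Combine the original data with the grid */
data c;
   set d b;
run;

/* Fit the surface and write predicted values to data set e */
proc rsreg data=c out=e noprint;
   model MBT=Time Temp / predict;
   id flag;
run;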
axis1 label=(angle=90) minor=none;
axis2 order=(220 to 280 by 20) minor=none;

proc gcontour data=e(where=(flag=1));
   plot Time*Temp=MBT / nlevels=12
                        vaxis=axis1
                        haxis=axis2
                        nolegend
                        autolabel
                        llevels=2 2 2 1 1 1 1 1 1 1 1 1;
run;
One way of viewing covariates is as extra sources of variation in the dependent variable that may mask the variation due to primary factors. This example demonstrates the use of the COVAR= option in PROC RSREG to fit a response surface model to the dependent variable values corrected for the covariates.
You have a chemical process with a yield that you hypothesize to be dependent on three factors: reaction time, reaction temperature, and reaction pressure. You perform an experiment to measure this dependence. You are willing to include up to 20 runs in your experiment, but you can perform no more than 8 runs on the same day, so the design for the experiment is composed of three blocks. Additionally, you know that the grade of raw material for the reaction has a significant impact on the yield. You have no control over this, but you keep track of it. The following statements create a SAS data set containing the results of the experiment:
data Experiment;
   input Day Grade Time Temp Pressure Yield;
   datalines;
1 67 1     1     1      32.98
1 68 1     1     1      47.04
1 70 1     1     1      67.11
1 66 1     1     1      26.94
1 74 0     0     0     103.22
1 68 0     0     0      42.94
2 75 1     1     1     122.93
2 69 1     1     1      62.97
2 70 1     1     1      72.96
2 71 1     1     1      94.93
2 72 0     0     0      93.11
2 74 0     0     0     112.97
3 69 1.633 0     0      78.88
3 67 1.633 0     0      52.53
3 68 0     1.633 0      68.96
3 71 0     1.633 0      92.56
3 70 0     0     1.633  88.99
3 72 0     0     1.633 102.50
3 70 0     0     0      82.84
3 72 0     0     0     103.12
;
Your first analysis neglects to take the covariates into account. The following statements use PROC RSREG to fit a response surface to the observed yield, but note that Day and Grade are omitted.
proc rsreg data=Experiment;
   model Yield = Time Temp Pressure;
run;
The ANOVA results (shown in Output 63.2.1) indicate that no process variable effects are significantly larger than the background noise.
The RSREG Procedure

                         Type I Sum
Regression        DF     of Squares    R-Square    F Value    Pr > F
Linear             3    1880.842426      0.1353       0.67    0.5915
Quadratic          3    2370.438681      0.1706       0.84    0.5023
Crossproduct       3     241.873250      0.0174       0.09    0.9663
Total Model        9    4493.154356      0.3233       0.53    0.8226

                             Sum of
Residual          DF        Squares    Mean Square
Total Error       10    9405.129724     940.512972
However, when the yields are adjusted for covariate effects of day and grade of raw material, very strong process variable effects are revealed. The following statements produce the ANOVA results in Output 63.2.2. Note that in order to include the effects of the classification factor Day as covariates, you need to create dummy variables indicating each day separately.
data Experiment;
   set Experiment;
   d1 = (Day = 1);
   d2 = (Day = 2);
   d3 = (Day = 3);
run;

proc rsreg data=Experiment;
   model Yield = d1-d3 Grade Time Temp Pressure / covar=4;
run;

The RSREG Procedure

                         Type I Sum
Regression        DF     of Squares    R-Square    F Value    Pr > F
Covariates         3          13695      0.9854     316957    <.0001
Linear             3     156.524497      0.0113    3622.53    <.0001
Quadratic          3      22.989775      0.0017     532.06    <.0001
Crossproduct       3      23.403614      0.0017     541.64    <.0001
Total Model       12          13898      1.0000    80413.2    <.0001

                             Sum of
Residual          DF        Squares    Mean Square
Total Error        7       0.100820       0.014403
The results show very strong effects due to both the covariates and the process variables.
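Written out, the model fit with COVAR=4 treats the first four regressors as covariates that enter linearly, while the three process variables enter through the full quadratic response surface. The coefficient symbols below, and the labels $x_1, x_2, x_3$ for Time, Temp, and Pressure, are notation assumed for this sketch. The last two lines check the F statistics for the linear terms in the two ANOVA tables against their error mean squares.

```latex
\begin{align*}
\mathrm{Yield} &= \beta_0 + \gamma_1 d_1 + \gamma_2 d_2 + \gamma_3 d_3 + \gamma_4\,\mathrm{Grade}
   + \sum_{i=1}^{3} \beta_i x_i + \sum_{i=1}^{3} \beta_{ii} x_i^2
   + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon, \\
F_{\text{Linear, no covariates}} &= \frac{1880.842426 / 3}{940.512972} \approx 0.67, \\
F_{\text{Linear, with covariates}} &= \frac{156.524497 / 3}{0.014403} \approx 3622.5 .
\end{align*}
```

Because $d_1 + d_2 + d_3 = 1$ for every run, one dummy variable is redundant with the intercept, which is why the covariates account for 3 rather than 4 degrees of freedom. Adjusting for day and grade shrinks the error mean square from about 940.5 to about 0.0144, and it is this reduction, not a larger process sum of squares, that makes the process effects detectable.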