The data for the following example are from Powell et al. (1982). To calibrate an instrument for measuring atomic weight, 24 replicate measurements of the atomic weight of silver (chemical symbol Ag) are made with the new instrument and with a reference instrument.
Note: The results from this example vary from machine to machine depending on floating-point configuration.
The following statements read the measurements for the two instruments into the SAS data set AgWeight.

title 'Atomic Weight of Silver by Two Different Instruments';
data AgWeight;
   input Instrument AgWeight @@;
   datalines;
1 107.8681568   1 107.8681465   1 107.8681572   1 107.8681785
1 107.8681446   1 107.8681903   1 107.8681526   1 107.8681494
1 107.8681616   1 107.8681587   1 107.8681519   1 107.8681486
1 107.8681419   1 107.8681569   1 107.8681508   1 107.8681672
1 107.8681385   1 107.8681518   1 107.8681662   1 107.8681424
1 107.8681360   1 107.8681333   1 107.8681610   1 107.8681477
2 107.8681079   2 107.8681344   2 107.8681513   2 107.8681197
2 107.8681604   2 107.8681385   2 107.8681642   2 107.8681365
2 107.8681151   2 107.8681082   2 107.8681517   2 107.8681448
2 107.8681198   2 107.8681482   2 107.8681334   2 107.8681609
2 107.8681101   2 107.8681512   2 107.8681469   2 107.8681360
2 107.8681254   2 107.8681261   2 107.8681450   2 107.8681368
;
Notice that the variation in the atomic weight measurements is several orders of magnitude less than their mean. This is a situation that can be difficult for standard, regression-based analysis-of-variance procedures to handle correctly.
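To make the difficulty concrete, the following Python sketch (not part of the SAS example) computes the corrected total sum of squares of the 48 measurements three ways: exactly with rational arithmetic, with the one-pass "textbook" formula in double precision, and with a two-pass centered formula in double precision. The variable names and the choice of these three formulas are illustrative assumptions; the sketch does not claim to show what PROC GLM or PROC ORTHOREG does internally.

```python
from fractions import Fraction

# Last four digits of each measurement: value = 107.868 + k * 1e-7
# (for example, 1568 stands for 107.8681568).
INSTRUMENT_1 = [1568, 1465, 1572, 1785, 1446, 1903, 1526, 1494,
                1616, 1587, 1519, 1486, 1419, 1569, 1508, 1672,
                1385, 1518, 1662, 1424, 1360, 1333, 1610, 1477]
INSTRUMENT_2 = [1079, 1344, 1513, 1197, 1604, 1385, 1642, 1365,
                1151, 1082, 1517, 1448, 1198, 1482, 1334, 1609,
                1101, 1512, 1469, 1360, 1254, 1261, 1450, 1368]
BASE = 1078680000  # 107.868 expressed in units of 1e-7

exact = [Fraction(BASE + k, 10**7) for k in INSTRUMENT_1 + INSTRUMENT_2]
values = [float(v) for v in exact]
n = len(values)

# Reference: corrected total sum of squares in exact rational arithmetic.
mean_exact = sum(exact) / n
css_exact = sum((v - mean_exact) ** 2 for v in exact)

# One-pass formula in double precision: sum(x^2) - (sum x)^2 / n.
sum_x = sum_x2 = 0.0
for v in values:
    sum_x += v
    sum_x2 += v * v
css_one_pass = sum_x2 - sum_x * sum_x / n

# Two-pass formula in double precision: subtract the mean, then square.
mean_fp = sum(values) / n
css_two_pass = sum((v - mean_fp) ** 2 for v in values)


def relative_error(approx):
    """Relative error of a double-precision result against the exact value."""
    return float(abs(Fraction(approx) - css_exact) / css_exact)


print("exact    :", float(css_exact))
print("one-pass :", css_one_pass, "rel. error", relative_error(css_one_pass))
print("two-pass :", css_two_pass, "rel. error", relative_error(css_two_pass))
```

The one-pass formula subtracts two numbers near 558,506 to obtain a result near 1.4E-8, so almost all significant digits cancel. Centering the data first avoids that cancellation.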
The following statements invoke the ORTHOREG procedure to perform a simple one-way analysis of variance, testing for differences between the two instruments.
proc orthoreg data=AgWeight;
   class Instrument;
   model AgWeight = Instrument;
run;
Output 53.1.1 shows the resulting analysis.
Atomic Weight of Silver by Two Different Instruments

The ORTHOREG Procedure

Class Level Information

Factor | Levels | Values |
---|---|---|
Instrument | 2 | 1 2 |

Dependent Variable: AgWeight

Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
Model | 1 | 3.6383419E-9 | 3.6383419E-9 | 15.95 | 0.0002 |
Error | 46 | 1.0495173E-8 | 2.281559E-10 | | |
Corrected Total | 47 | 1.4133515E-8 | | | |

Root MSE: 0.0000151048

R-Square: 0.2574265445

Parameter | DF | Parameter Estimate | Standard Error | t Value | Pr > \|t\| |
---|---|---|---|---|---|
Intercept | 1 | 107.868136354166 | 3.0832608E-6 | 3.499E7 | <.0001 |
(Instrument='1') | 1 | 0.00001741249999 | 4.3603893E-6 | 3.99 | 0.0002 |
(Instrument='2') | 0 | 0 | . | . | . |

The mean difference between instruments is about $1.74 \times 10^{-5}$ (the value of the (Instrument='1') parameter in the parameter estimates table), whereas the level of background variation in the measurements is about $1.51 \times 10^{-5}$ (the value of the root mean squared error). The difference is significant, with a p-value of 0.0002.
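The reported standard error, t value, and F value can be checked by hand from the other numbers in the output. The short Python sketch below does that arithmetic, assuming the usual pooled-variance formula for the difference of two group means with 24 replicates each. The variable names are illustrative, and the p-value is not recomputed because that requires a t distribution function.

```python
import math

# Values copied from the ORTHOREG output.
diff = 0.00001741249999   # estimate for (Instrument='1')
root_mse = 0.0000151048   # Root MSE
n1 = n2 = 24              # replicates per instrument

# Standard error of the difference between two group means.
se_diff = root_mse * math.sqrt(1 / n1 + 1 / n2)

# t statistic for the difference; with one model degree of freedom, F = t^2.
t_value = diff / se_diff
f_value = t_value ** 2

print(f"standard error: {se_diff:.7e}")
print(f"t value:        {t_value:.2f}")
print(f"F value:        {f_value:.2f}")
```

The results match the printed standard error (4.3603893E-6), t value (3.99), and F value (15.95) to the precision shown in the output.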
The National Institute of Standards and Technology (1998) has provided certified ANOVA values for this data set. The following statements use ODS to examine the ANOVA values produced by both the ORTHOREG and GLM procedures more precisely for comparison with the NIST-certified values:
ods listing close;

ods output ANOVA         = OrthoregANOVA
           FitStatistics = OrthoregFitStat;
proc orthoreg data=AgWeight;
   class Instrument;
   model AgWeight = Instrument;
run;

ods output OverallANOVA  = GLMANOVA
           FitStatistics = GLMFitStat;
proc glm data=AgWeight;
   class Instrument;
   model AgWeight = Instrument;
run;

ods listing;

data _null_;
   set OrthoregANOVA  (in=inANOVA)
       OrthoregFitStat(in=inFitStat);
   if (inANOVA) then do;
      if (Source = 'Model') then put "Model SS: " ss e20.;
      if (Source = 'Error') then put "Error SS: " ss e20.;
   end;
   if (inFitStat) then do;
      if (Statistic = 'Root MSE') then put "Root MSE: " nValue1 e20.;
      if (Statistic = 'R-Square') then put "R-Square: " nValue1 best20.;
   end;

data _null_;
   set GLMANOVA  (in=inANOVA)
       GLMFitStat(in=inFitStat);
   if (inANOVA) then do;
      if (Source = 'Model') then put "Model SS: " ss e20.;
      if (Source = 'Error') then put "Error SS: " ss e20.;
   end;
   if (inFitStat) then put "Root MSE: " RootMSE e20.;
   if (inFitStat) then put "R-Square: " RSquare best20.;
run;
In releases of SAS/STAT software prior to Version 8, PROC GLM gave much less accurate results than PROC ORTHOREG, as shown in the following tables, which compare the ANOVA values certified by NIST with those produced by the two procedures.
Source | Model SS | Error SS |
---|---|---|
NIST-certified | 3.6383418750000E-09 | 1.0495172916667E-08 |
ORTHOREG | 3.6383418747907E-09 | 1.0495172916797E-08 |
GLM, Version 8 | 3.6383418747907E-09 | 1.0495172916797E-08 |
GLM, Previous releases | 0 | 1.0331496763990E-08 |

Source | Root MSE | R-Square |
---|---|---|
NIST-certified | 1.5104831444641E-05 | 0.25742654453832 |
ORTHOREG | 1.5104831444735E-05 | 0.25742654452494 |
GLM, Version 8 | 1.5104831444735E-05 | 0.25742654452494 |
GLM, Previous releases | 1.4986585859992E-05 | 0 |
While the ORTHOREG values and the GLM values for Version 8 are quite close to the certified ones, the GLM values for prior releases are not. In fact, since the model sum of squares is so small, in prior releases the GLM procedure set it (and consequently $R^2$) to zero.
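One way to quantify "quite close" is the log relative error (LRE), which is roughly the number of significant digits on which a computed value agrees with the certified one. The measure is not used in the text above; it is introduced here as an assumption because it is commonly applied to NIST reference data sets. The Python sketch below applies it to the tabulated values. The function name `lre` and the dictionary layout are illustrative.

```python
import math


def lre(computed, certified):
    """Log relative error: about how many significant digits agree."""
    if computed == certified:
        return math.inf
    return -math.log10(abs(computed - certified) / abs(certified))


certified = {
    "Model SS": 3.6383418750000e-09,
    "Error SS": 1.0495172916667e-08,
    "Root MSE": 1.5104831444641e-05,
    "R-Square": 0.25742654453832,
}
orthoreg = {
    "Model SS": 3.6383418747907e-09,
    "Error SS": 1.0495172916797e-08,
    "Root MSE": 1.5104831444735e-05,
    "R-Square": 0.25742654452494,
}
# Only the nonzero values tabulated for the earlier GLM releases.
glm_previous = {
    "Error SS": 1.0331496763990e-08,
    "Root MSE": 1.4986585859992e-05,
}

for name, value in orthoreg.items():
    print(f"ORTHOREG      {name}: LRE = {lre(value, certified[name]):.1f}")
for name, value in glm_previous.items():
    print(f"GLM, previous {name}: LRE = {lre(value, certified[name]):.1f}")
```

By this measure the ORTHOREG values (and the identical GLM Version 8 values) agree with the certified ones to roughly 10 to 11 digits, whereas the values from earlier GLM releases agree to only about 2.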
This example applies the ORTHOREG procedure to a collection of data sets noted for being ill conditioned. The OUTEST= data set is used to collect the results for comparison with values certified to be correct by the National Institute of Standards and Technology (1998).
Note: The results from this example vary from machine to machine depending on floating-point configuration.
The data are from Wampler (1970). The independent variates for all five data sets are $x^i$, $i = 1, \ldots, 5$, for $x = 0, 1, \ldots, 20$. Two of the five dependent variables are exact linear functions of the independent terms:

$$y_1 = 1 + x + x^2 + x^3 + x^4 + x^5$$

$$y_2 = 1 + 0.1x + 0.01x^2 + 0.001x^3 + 0.0001x^4 + 0.00001x^5$$

The other three dependent variables have the same mean value as $y_1$, but with nonzero errors:

$$y_3 = y_1 + e, \qquad y_4 = y_1 + 100e, \qquad y_5 = y_1 + 10000e$$

where $e$ is a vector of values with standard deviation 2044, chosen to be orthogonal to the mean model for $y_1$.
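The two stated properties of the error vector can be verified directly with integer arithmetic. The Python sketch below builds the data and checks that e is orthogonal to every column of the mean model and that its standard deviation is about 2044. It assumes the signed error values in which e is -2048 at every odd x; the variable names are illustrative.

```python
import math

x = list(range(21))

# Error vector e (assumption: the entry is -2048 at every odd x).
e = [759, -2048, 2048, -2048, 2523, -2048, 2048, -2048, 1838, -2048, 2048,
     -2048, 1838, -2048, 2048, -2048, 2523, -2048, 2048, -2048, 759]

# Mean model y1 = 1 + x + x^2 + ... + x^5 and the noisy responses.
y1 = [sum(xi**k for k in range(6)) for xi in x]
y3 = [yi + ei for yi, ei in zip(y1, e)]
y4 = [yi + 100 * ei for yi, ei in zip(y1, e)]
y5 = [yi + 10000 * ei for yi, ei in zip(y1, e)]

# Inner products of e with the columns 1, x, ..., x^5 (exact integers).
inner_products = [sum(ei * xi**k for ei, xi in zip(e, x)) for k in range(6)]

# Sample standard deviation of e, and the residual standard deviation of
# the degree-5 fit to y3 (21 observations minus 6 parameters).
sum_sq = sum(ei**2 for ei in e)
sample_sd = math.sqrt(sum_sq / (len(e) - 1))
residual_sd_y3 = math.sqrt(sum_sq / (len(e) - 6))

print("inner products:", inner_products)
print("sample SD of e:", sample_sd)
print("residual SD for y3:", residual_sd_y3)
```

Because e is orthogonal to the model columns, the least-squares coefficients for y3, y4, and y5 are the same as for y1, and the residuals are exactly e, 100e, and 10000e. That is why the certified root mean squared errors for those three fits are 2360.145..., 236014.5..., and 23601450.2....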
The following statements create a SAS data set Wampler containing the Wampler data, run a SAS macro program using PROC ORTHOREG to fit a fifth-order polynomial in x to each of the Wampler dependent variables, and collect the results in a data set named ParmEst.

data Wampler;
   do x=0 to 20;
      input e @@;
      y1 = 1 +     x +      x**2 +       x**3 +        x**4 +         x**5;
      y2 = 1 + .1 *x + .01 *x**2 + .001 *x**3 + .0001 *x**4 + .00001 *x**5;
      y3 = y1 +       e;
      y4 = y1 +   100*e;
      y5 = y1 + 10000*e;
      output;
   end;
   datalines;
759 -2048 2048 -2048 2523 -2048 2048 -2048 1838 -2048 2048
-2048 1838 -2048 2048 -2048 2523 -2048 2048 -2048 759
;

%macro WTest;
   data ParmEst; if (0); run;
   %do i = 1 %to 5;
      proc orthoreg data=Wampler outest=ParmEst&i noprint;
         model y&i = x x*x x*x*x x*x*x*x x*x*x*x*x;
      data ParmEst&i; set ParmEst&i; Dep = "y&i";
      data ParmEst; set ParmEst ParmEst&i;
         label Col1='x'    Col2='x**2' Col3='x**3'
               Col4='x**4' Col5='x**5';
      run;
   %end;
%mend;
%WTest;
Instead of displaying the raw values of the RMSE and parameter estimates, use a further DATA step to compute the deviations from the values certified to be correct by the National Institute of Standards and Technology (1998).
data ParmEst; set ParmEst;
   if      (Dep = 'y1') then
      _RMSE_ = _RMSE_ - 0.00000000000000;
   else if (Dep = 'y2') then
      _RMSE_ = _RMSE_ - 0.00000000000000;
   else if (Dep = 'y3') then
      _RMSE_ = _RMSE_ - 2360.14502379268;
   else if (Dep = 'y4') then
      _RMSE_ = _RMSE_ - 236014.502379268;
   else if (Dep = 'y5') then
      _RMSE_ = _RMSE_ - 23601450.2379268;
   if (Dep ^= 'y2') then do;
      Intercept = Intercept - 1.00000000000000;
      Col1      = Col1      - 1.00000000000000;
      Col2      = Col2      - 1.00000000000000;
      Col3      = Col3      - 1.00000000000000;
      Col4      = Col4      - 1.00000000000000;
      Col5      = Col5      - 1.00000000000000;
   end;
   else do;
      Intercept = Intercept - 1.00000000000000;
      Col1      = Col1      - 0.100000000000000;
      Col2      = Col2      - 0.100000000000000e-1;
      Col3      = Col3      - 0.100000000000000e-2;
      Col4      = Col4      - 0.100000000000000e-3;
      Col5      = Col5      - 0.100000000000000e-4;
   end;

proc print data=ParmEst label noobs;
   title 'Wampler data: Deviations from Certified Values';
   format _RMSE_ Intercept Col1-Col5 e9.;
   var Dep _RMSE_ Intercept Col1-Col5;
run;
The results, shown in Output 53.2.1, indicate that the values computed by PROC ORTHOREG are quite close to the NIST-certified values.
Wampler data: Deviations from Certified Values

Dep | _RMSE_ | Intercept | x | x**2 | x**3 | x**4 | x**5 |
---|---|---|---|---|---|---|---|
y1 | 0.00E+00 | 1.49E-10 | 9.08E-12 | 5.99E-12 | 1.26E-12 | 9.68E-14 | 2.00E-15 |
y2 | 0.00E+00 | 6.33E-15 | 5.55E-16 | 1.37E-16 | 1.13E-17 | 5.56E-19 | 1.52E-20 |
y3 | 1.09E-11 | 3.02E-10 | 1.70E-10 | 4.88E-11 | 5.75E-12 | 3.18E-13 | 6.88E-15 |
y4 | 3.20E-10 | 2.74E-09 | 5.60E-09 | 2.12E-09 | 2.89E-10 | 1.63E-11 | 3.24E-13 |
y5 | 2.98E-08 | 2.46E-07 | 5.54E-07 | 2.12E-07 | 2.90E-08 | 1.64E-09 | 3.27E-11 |
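As an independent cross-check of the certified coefficients, the same fifth-order polynomial fits can be solved in exact rational arithmetic, where ill conditioning cannot cause rounding error. The Python sketch below solves the normal equations with Gauss-Jordan elimination on fractions. This is an illustrative technique chosen for exactness, not the algorithm PROC ORTHOREG uses; the helper names `solve` and `polyfit_exact` are hypothetical, and the error vector again assumes -2048 at every odd x.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring a row with a nonzero entry in this column to the pivot spot.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[-1] for row in m]


def polyfit_exact(x, y, degree=5):
    """Least-squares polynomial coefficients via exact normal equations."""
    cols = [[Fraction(xi) ** k for xi in x] for k in range(degree + 1)]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * Fraction(yi) for p, yi in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


x = list(range(21))
e = [759, -2048, 2048, -2048, 2523, -2048, 2048, -2048, 1838, -2048, 2048,
     -2048, 1838, -2048, 2048, -2048, 2523, -2048, 2048, -2048, 759]
y1 = [sum(xi**k for k in range(6)) for xi in x]
y2 = [sum(Fraction(xi**k, 10**k) for k in range(6)) for xi in x]
y5 = [yi + 10000 * ei for yi, ei in zip(y1, e)]

coef_y1 = polyfit_exact(x, y1)
coef_y2 = polyfit_exact(x, y2)
coef_y5 = polyfit_exact(x, y5)

print("y1:", [str(c) for c in coef_y1])
print("y2:", [str(c) for c in coef_y2])
print("y5:", [str(c) for c in coef_y5])
```

The exact solutions are all ones for y1 and y5 and successive powers of 1/10 for y2. These are the values that the DATA step above subtracts from the ORTHOREG estimates, so the printed deviations measure floating-point error only.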