This example shows how to use the TRANSREG procedure to code and fit a main-effects ANOVA model. The input data set contains the dependent variable Y, the factors X1 and X2, and 12 observations. The following statements perform a main-effects ANOVA:
   title 'Introductory Main-Effects ANOVA Example';

   data A;
      input Y X1 $ X2 $;
      datalines;
   8 a a
   7 a a
   4 a b
   3 a b
   5 b a
   4 b a
   2 b b
   1 b b
   8 c a
   7 c a
   5 c b
   2 c b
   ;

   *---Fit a Main-Effects ANOVA model with 1, 0, -1 coding. ---;
   proc transreg ss2;
      model identity(Y) = class(X1 X2 / effects);
      output coefficients replace;
   run;

   *---Print TRANSREG output data set---;
   proc print label;
      format Intercept -- X2a 5.2;
   run;
The iteration history in Figure 75.1 shows that the final R-Square of 0.88144 is reached on the first iteration.
   Introductory Main-Effects ANOVA Example

   The TRANSREG Procedure

   Dependent Variable Identity(Y)

   Class Level Information

   Class    Levels    Values
   X1            3    a b c
   X2            2    a b

   Number of Observations Read    12
   Number of Observations Used    12

   TRANSREG Univariate Algorithm Iteration History for Identity(Y)

   Iteration    Average    Maximum                Criterion
      Number     Change     Change    R-Square       Change    Note
   -------------------------------------------------------------------------
           1    0.00000    0.00000     0.88144                 Converged

   Algorithm converged.

   The TRANSREG Procedure Hypothesis Tests for Identity(Y)

   Univariate ANOVA Table Based on the Usual Degrees of Freedom

                                 Sum of        Mean
   Source             DF        Squares      Square    F Value    Pr > F
   Model               3       57.00000    19.00000      19.83    0.0005
   Error               8        7.66667     0.95833
   Corrected Total    11       64.66667

   Root MSE           0.97895    R-Square    0.8814
   Dependent Mean     4.66667    Adj R-Sq    0.8370
   Coeff Var         20.97739

   Univariate Regression Table Based on the Usual Degrees of Freedom

                                      Type II
                                       Sum of       Mean
   Variable     DF    Coefficient     Squares     Square    F Value    Pr > F    Label
   Intercept     1      4.6666667     261.333    261.333     272.70    <.0001    Intercept
   Class.X1a     1      0.8333333       4.167      4.167       4.35    0.0705    X1 a
   Class.X1b     1     -1.6666667      16.667     16.667      17.39    0.0031    X1 b
   Class.X2a     1      1.8333333      40.333     40.333      42.09    0.0002    X2 a
The iteration history is followed by the ANOVA, fit statistics, and regression tables. PROC TRANSREG uses an effects (also called deviations from means or 0, 1, -1) coding in this example. For more information on using PROC TRANSREG for ANOVA and other codings, see the 'ANOVA Codings' section on page 4662.
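As a sketch of what the effects coding means for this model, the fitted equation can be written as follows. The symbols $\beta_0, \ldots, \beta_3$, $x_{1a}$, $x_{1b}$, $x_{2a}$, and $e$ are notation introduced here for illustration; they are not names that appear in the procedure output.

$$y = \beta_0 + \beta_1 x_{1a} + \beta_2 x_{1b} + \beta_3 x_{2a} + e$$

$$x_{1a} = \begin{cases} 1 & \text{X1} = a \\ 0 & \text{X1} = b \\ -1 & \text{X1} = c \end{cases} \qquad x_{1b} = \begin{cases} 0 & \text{X1} = a \\ 1 & \text{X1} = b \\ -1 & \text{X1} = c \end{cases} \qquad x_{2a} = \begin{cases} 1 & \text{X2} = a \\ -1 & \text{X2} = b \end{cases}$$

Under this coding the effects within each factor sum to zero, so the effect of the last level is implied rather than estimated directly: $-(\beta_1 + \beta_2)$ for X1 = c and $-\beta_3$ for X2 = b.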
The TRANSREG procedure produces the data set displayed in Figure 75.2.
   Introductory Main-Effects ANOVA Example

   Obs  _TYPE_    _NAME_  Y  Intercept   X1 a   X1 b   X2 a  X1  X2
     1  SCORE     ROW1    8       1.00   1.00   0.00   1.00  a   a
     2  SCORE     ROW2    7       1.00   1.00   0.00   1.00  a   a
     3  SCORE     ROW3    4       1.00   1.00   0.00  -1.00  a   b
     4  SCORE     ROW4    3       1.00   1.00   0.00  -1.00  a   b
     5  SCORE     ROW5    5       1.00   0.00   1.00   1.00  b   a
     6  SCORE     ROW6    4       1.00   0.00   1.00   1.00  b   a
     7  SCORE     ROW7    2       1.00   0.00   1.00  -1.00  b   b
     8  SCORE     ROW8    1       1.00   0.00   1.00  -1.00  b   b
     9  SCORE     ROW9    8       1.00  -1.00  -1.00   1.00  c   a
    10  SCORE     ROW10   7       1.00  -1.00  -1.00   1.00  c   a
    11  SCORE     ROW11   5       1.00  -1.00  -1.00  -1.00  c   b
    12  SCORE     ROW12   2       1.00  -1.00  -1.00  -1.00  c   b
    13  M COEFFI  Y       .       4.67   0.83  -1.67   1.83
    14  MEAN      Y       .          .   5.50   3.00   6.50
The output data set has three kinds of observations, identified by values of _TYPE_ .
When _TYPE_ ='SCORE', the observation contains information on the dependent and independent variables as follows:
Y is the original dependent variable.
X1 and X2 are the independent classification variables, and the Intercept through X2 a columns contain the main effects design matrix that PROC TRANSREG creates. The variable names are Intercept , X1a , X1b , and X2a . Their labels are shown in the listing.
When _TYPE_ ='M COEFFI', the observation contains coefficients of the final linear model.
When _TYPE_ ='MEAN', the observation contains the marginal means.
The observations with _TYPE_ ='SCORE' form the score partition of the data set, and the observations with _TYPE_ ='M COEFFI' and _TYPE_ ='MEAN' form the coefficient partition of the data set.
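The two rows of the coefficient partition can be checked against each other by hand. Because this design is balanced (each X1 level has four observations and each X2 level has six), each effects-coded coefficient equals the corresponding marginal mean minus the grand mean. The following is a quick check under that assumption, with the grand mean $\bar{y}$ computed from the 12 values of Y, which sum to 56:

$$\begin{aligned}
\text{Intercept} &= \bar{y} = 56/12 \approx 4.67 \\
\text{X1 a} &= 5.50 - 4.67 = 0.83 \\
\text{X1 b} &= 3.00 - 4.67 = -1.67 \\
\text{X2 a} &= 6.50 - 4.67 = 1.83
\end{aligned}$$

The X1 b coefficient is negative because level b has the lowest marginal mean. This shortcut relies on the balance of the design; with unequal cell counts the coefficients generally do not reduce to simple differences of means.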
The TRANSREG procedure can detect nonlinear relationships among variables. For example, suppose 400 observations are generated from the following function:

$$t = \frac{x}{4} + \sin(x)$$

and data are created as follows:

$$y = t + \epsilon$$

where $\epsilon$ is random normal error.
The following statements find a cubic spline transformation of X with four knots. For information on using splines and knots, see the 'Smoothing Splines' section on page 4596, the 'Solving Standard Least-Squares Problems' section on page 4628, Example 75.1, and Example 75.4.
The following statements produce Figure 75.3 and Figure 75.4:
   title 'Curve Fitting Example';

   *---Create An Artificial Nonlinear Scatter Plot---;
   data Curve;
      Pi=constant('pi');
      Pi4=4*Pi;
      Increment=Pi4/400;
      do X=Increment to Pi4 by Increment;
         T=X/4 + sin(X);
         Y=T + normal(7);
         output;
      end;
   run;

   *---Request a Spline Transformation of X---;
   proc transreg data=Curve dummy;
      model identity(Y)=spline(X / nknots=4);
      output predicted;
      id T;
   run;

   *---Plot the Results---;
   goptions goutmode=replace nodisplay;
   %let opts = haxis=axis2 vaxis=axis1 frame cframe=ligr;
   * Depending on your goptions, these plot options may work better:
   * %let opts = haxis=axis2 vaxis=axis1 frame;

   proc gplot;
      title;
      axis1 minor=none label=(angle=90 rotate=0);
      axis2 minor=none;
      plot T*X=2 / &opts name='tregin1';
      plot Y*X=1 / &opts name='tregin2';
      plot Y*X=1 T*X=2 PY*X=3 / &opts name='tregin3' overlay;
      symbol1 color=blue v=star i=none;
      symbol2 color=yellow v=none i=join line=1;
      symbol3 color=red v=none i=join line=2;
   run; quit;

   goptions display;
   proc greplay nofs tc=sashelp.templt template=l2r2;
      igout gseg;
      treplay 1:tregin1 2:tregin3 3:tregin2;
   run; quit;
   Curve Fitting Example

   The TRANSREG Procedure

   TRANSREG MORALS Algorithm Iteration History for Identity(Y)

   Iteration    Average    Maximum                Criterion
      Number     Change     Change    R-Square       Change    Note
   -------------------------------------------------------------------------
           0    0.74855    1.29047     0.19945
           1    0.00000    0.00000     0.47062      0.27117    Converged

   Algorithm converged.
PROC TRANSREG increases the squared multiple correlation from the original value of 0.19945 to 0.47062. The plot of T by X shows the original function, the plot of Y by X shows the error-perturbed data, and the third plot shows the data, the true function as a solid curve, and the regression function as the dashed curve. The regression function closely approximates the true function.
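The size of the improvement can be read directly from the iteration history. As a quick arithmetic check on the two R-Square values reported there:

$$0.47062 - 0.19945 = 0.27117$$

This difference matches the Criterion Change reported for iteration 1, so the spline transformation of X accounts for all of the gain in squared multiple correlation over the initial, untransformed fit.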