RESIDUALS


When you begin studying the relationship between two variables, you usually do not know whether the assumptions needed for regression analysis are satisfied. You do not know whether a linear relationship exists between the two variables, much less whether the distribution of the dependent variable is normal and has the same variance for all values of the independent variable. One of the goals of regression analysis is to check whether the required assumptions of linearity, normality, and constant variance are met. To do this, you analyze the residuals.

A quantity called the residual plays a very important role when you are fitting models to data. You can think of a residual as what is left over after a model is fit. In a linear regression, the residual is the difference between the observed and predicted values of the dependent variable. If a person has 12 years of education and your model predicts nine, the residual for the case is 12 - 9 = 3. You have three years of education left over (not explained by the model).
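The arithmetic in the education example can be written as a short sketch. The function name `residual` is chosen here for illustration and does not come from any particular statistics package.

```python
def residual(observed, predicted):
    """Return the residual: observed value minus the value the model predicts."""
    return observed - predicted


# The education example from the text: 12 years observed, 9 years predicted.
education_residual = residual(12, 9)
print(education_residual)  # 3
```

A positive residual means the model predicted too low a value for the case; a negative residual means it predicted too high a value.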

By looking at the residual for each case, you can see how well a model fits. If a model fits the data perfectly, all of the residuals are zero. Cases for which the model does not fit well have residuals that are large in absolute value. You can use the REGRESSION procedure to calculate the residuals for all of the cases. A typical scatterplot with a linear fit is shown in Figure 9.2 and a plot of fitted values and residuals in Figure 9.3.

Figure 9.2: Scatterplot with possible linear fit superimposed.
Figure 9.3: Fitted values and residuals.
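
To make the idea of fitted values and residuals concrete, here is a minimal sketch in plain Python, offered as an illustration rather than as the REGRESSION procedure itself. It fits a straight line by ordinary least squares and computes one residual per case. The helper names `fit_line` and `residuals` and the data values are hypothetical and are not taken from the book.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Return observed minus fitted value for every case."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: roughly linear, with some scatter around the line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
res = residuals(x, y, intercept, slope)

for xi, yi, ri in zip(x, y, res):
    fitted = intercept + slope * xi
    print(f"x={xi}  observed={yi:.2f}  fitted={fitted:.2f}  residual={ri:+.3f}")

# Data that lie exactly on a line: the fit is perfect and every residual is zero.
perfect_y = [1 + 2 * xi for xi in x]
b0, b1 = fit_line(x, perfect_y)
perfect_res = residuals(x, perfect_y, b0, b1)
print(perfect_res)
```

In the scattered data set the residuals are small but nonzero, and for a least-squares line with an intercept they sum to zero apart from rounding error. In the second data set the points fall exactly on a line, so the residuals vanish, which matches the statement that a perfect fit leaves nothing over.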



Six Sigma and Beyond: Statistics and Probability, Volume III
ISBN: 1574443127
Year: 2003
Pages: 252
