This example shows how to use PROC SCORE with factor scoring coefficients. First, the FACTOR procedure produces an output data set that contains scoring coefficients in observations identified by _TYPE_='SCORE'. This data set, together with the original data set Fitness, is supplied to PROC SCORE, which produces a data set containing the scores Factor1 and Factor2. The following statements produce Output 64.1.1 through Output 64.1.3:
   /* This data set contains only the first 12 observations   */
   /* from the full data set used in the chapter on PROC REG. */
   data Fitness;
      input Age Weight Oxygen RunTime RestPulse RunPulse @@;
      datalines;
   44 89.47 44.609 11.37 62 178     40 75.07 45.313 10.07 62 185
   44 85.84 54.297  8.65 45 156     42 68.15 59.571  8.17 40 166
   38 89.02 49.874  9.22 55 178     47 77.45 44.811 11.63 58 176
   40 75.98 45.681 11.95 70 176     43 81.19 49.091 10.85 64 162
   44 81.42 39.442 13.08 63 174     38 81.87 60.055  8.63 48 170
   44 73.03 50.541 10.13 45 168     45 87.66 37.388 14.03 56 186
   ;

   proc factor data=Fitness outstat=FactOut
               method=prin rotate=varimax score;
      var Age Weight RunTime RunPulse RestPulse;
      title 'FACTOR SCORING EXAMPLE';
   run;

   proc print data=FactOut;
      title2 'Data Set from PROC FACTOR';
   run;

   proc score data=Fitness score=FactOut out=FScore;
      var Age Weight RunTime RunPulse RestPulse;
   run;

   proc print data=FScore;
      title2 'Data Set from PROC SCORE';
   run;
Output 64.1.1 shows the PROC FACTOR output. The scoring coefficients for the two factors are shown at the end of the PROC FACTOR output.
   FACTOR SCORING EXAMPLE

   The FACTOR Procedure
   Initial Factor Method: Principal Components

   Eigenvalues of the Correlation Matrix: Total = 5  Average = 1

        Eigenvalue    Difference    Proportion    Cumulative
   1    2.30930638    1.11710686        0.4619        0.4619
   2    1.19219952    0.30997249        0.2384        0.7003
   3    0.88222702    0.37965990        0.1764        0.8767
   4    0.50256713    0.38886717        0.1005        0.9773
   5    0.11369996                      0.0227        1.0000

   Factor Pattern

                Factor1    Factor2
   Age          0.29795    0.93675
   Weight       0.43282    0.17750
   RunTime      0.91983    0.28782
   RunPulse     0.72671    0.38191
   RestPulse    0.81179    0.23344
   Variance Explained by Each Factor

     Factor1      Factor2
   2.3093064    1.1921995

   Final Communality Estimates: Total = 3.501506

          Age        Weight       RunTime      RunPulse     RestPulse
   0.96628351    0.21883401    0.92893333    0.67396207    0.71349297

   The FACTOR Procedure
   Rotation Method: Varimax

   Orthogonal Transformation Matrix

              1          2
   1    0.92536    0.37908
   2    0.37908    0.92536

   Rotated Factor Pattern

                Factor1    Factor2
   Age          0.07939    0.97979
   Weight       0.46780    0.00018
   RunTime      0.74207    0.61503
   RunPulse     0.81725    0.07792
   RestPulse    0.83969    0.09172
   Variance Explained by Each Factor

     Factor1      Factor2
   2.1487753    1.3527306

   Final Communality Estimates: Total = 3.501506

          Age        Weight       RunTime      RunPulse     RestPulse
   0.96628351    0.21883401    0.92893333    0.67396207    0.71349297

   Squared Multiple Correlations of the Variables with Each Factor

     Factor1      Factor2
   1.0000000    1.0000000

   Standardized Scoring Coefficients

                Factor1    Factor2
   Age          0.17846    0.77600
   Weight       0.22987    0.06672
   RunTime      0.27707    0.37440
   RunPulse     0.41263    0.17714
   RestPulse    0.39952    0.04793
Output 64.1.2 lists the OUTSTAT= data set from PROC FACTOR. Note that observations 18 and 19 have _TYPE_='SCORE'. Observations 1 and 2 have _TYPE_='MEAN' and _TYPE_='STD', respectively. These four observations are used by PROC SCORE.
   FACTOR SCORING EXAMPLE
   Data Set from PROC FACTOR

   Obs  _TYPE_    _NAME_         Age   Weight  RunTime  RunPulse  RestPulse
     1  MEAN                 42.4167  80.5125  10.6483   172.917    55.6667
     2  STD                   2.8431   6.7660   1.8444     8.918     9.2769
     3  N                    12.0000  12.0000  12.0000    12.000    12.0000
     4  CORR      Age         1.0000   0.0128   0.5005     0.095     0.0080
     5  CORR      Weight      0.0128   1.0000   0.2637     0.173     0.2396
     6  CORR      RunTime     0.5005   0.2637   1.0000     0.556     0.6620
     7  CORR      RunPulse    0.0953   0.1731   0.5555     1.000     0.4853
     8  CORR      RestPulse   0.0080   0.2396   0.6620     0.485     1.0000
     9  COMMUNAL              0.9663   0.2188   0.9289     0.674     0.7135
    10  PRIORS                1.0000   1.0000   1.0000     1.000     1.0000
    11  EIGENVAL              2.3093   1.1922   0.8822     0.503     0.1137
    12  UNROTATE  Factor1     0.2980   0.4328   0.9198     0.727     0.8118
    13  UNROTATE  Factor2     0.9368   0.1775   0.2878     0.382     0.2334
    14  TRANSFOR  Factor1     0.9254   0.3791    .          .         .
    15  TRANSFOR  Factor2     0.3791   0.9254    .          .         .
    16  PATTERN   Factor1     0.0794   0.4678   0.7421     0.817     0.8397
    17  PATTERN   Factor2     0.9798   0.0002   0.6150     0.078     0.0917
    18  SCORE     Factor1     0.1785   0.2299   0.2771     0.413     0.3995
    19  SCORE     Factor2     0.7760   0.0667   0.3744     0.177     0.0479
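To look at only the observations that PROC SCORE reads in this example, the OUTSTAT= data set can be subset with a WHERE statement. The following is a minimal sketch that assumes the FactOut data set created by the statements at the start of this example; the title text is illustrative and not part of the original example.

```sas
/* Print only the MEAN, STD, and SCORE observations of FactOut. */
/* Assumes FactOut was created by the PROC FACTOR step above.   */
proc print data=FactOut;
   where _TYPE_ in ('MEAN', 'STD', 'SCORE');
   title2 'Observations Read by PROC SCORE';   /* illustrative title */
run;
```

The listing should contain four rows: one MEAN row, one STD row, and the two SCORE rows named Factor1 and Factor2.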
Since the PROC SCORE statement does not contain the NOSTD option, the data in the Fitness data set are standardized before scoring. For each variable specified in the VAR statement, the mean and standard deviation are obtained from the FactOut data set. For each observation in the Fitness data set, the variables are then standardized. For example, for observation 1 in the Fitness data set, the variable Age is standardized to 0.5569 = (44 - 42.4167) / 2.8431.
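The arithmetic for Age can be checked with a short DATA step. This is only an illustrative sketch: the data set name AgeCheck and the variable name AgeStd are hypothetical, and the constants are the rounded MEAN and STD values for Age printed in Output 64.1.2, so the result agrees with the value PROC SCORE uses internally only to about four decimal places.

```sas
/* Standardize Age by hand with the rounded MEAN (42.4167) and   */
/* STD (2.8431) values listed for Age in the OUTSTAT= data set.  */
/* AgeCheck and AgeStd are hypothetical names for illustration.  */
data AgeCheck;
   set Fitness;
   AgeStd = (Age - 42.4167) / 2.8431;
run;

proc print data=AgeCheck;
   var Age AgeStd;
   title2 'Age Standardized by Hand';   /* illustrative title */
run;
```

For the first observation (Age = 44), AgeStd should be approximately 0.5569.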
After the data in the Fitness data set are standardized, the standardized values of the variables in the VAR statement are multiplied by the matching coefficients in the FactOut data set, and the resulting products are summed. This sum is output as a value of the new score variable.
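One way to see the two steps separately is to standardize the data first and then ask PROC SCORE to skip its own standardization with the NOSTD option. The sketch below assumes that the Fitness and FactOut data sets from the beginning of this example exist, and that both PROC FACTOR and PROC STANDARD use their default variance divisor (VARDEF=DF), so that the means and standard deviations agree. The data set names FitStd and FScoreCheck are hypothetical.

```sas
/* Step 1: standardize the VAR variables to mean 0 and standard  */
/* deviation 1. FitStd is a hypothetical output data set name.   */
proc standard data=Fitness mean=0 std=1 out=FitStd;
   var Age Weight RunTime RunPulse RestPulse;
run;

/* Step 2: multiply the already standardized values by the       */
/* _TYPE_='SCORE' coefficients in FactOut and sum the products.  */
/* NOSTD prevents PROC SCORE from standardizing a second time.   */
proc score data=FitStd score=FactOut nostd out=FScoreCheck;
   var Age Weight RunTime RunPulse RestPulse;
run;

proc print data=FScoreCheck;
   var Factor1 Factor2;
   title2 'Scores from Pre-Standardized Data';   /* illustrative title */
run;
```

Under these assumptions, Factor1 and Factor2 in FScoreCheck should match the scores in the FScore data set up to rounding. Unlike FScore, the FScoreCheck data set carries the standardized values of the VAR variables rather than the original measurements.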
Output 64.1.3 displays the FScore data set produced by PROC SCORE. This data set contains the variables Age, Weight, Oxygen, RunTime, RestPulse, and RunPulse from the Fitness data set. It also contains Factor1 and Factor2, the two new score variables.
   FACTOR SCORING EXAMPLE
   Data Set from PROC SCORE

   Obs  Age  Weight  Oxygen  RunTime  RestPulse  RunPulse  Factor1  Factor2
     1   44   89.47  44.609    11.37         62       178  0.82129  0.35663
     2   40   75.07  45.313    10.07         62       185  0.71173  0.99605
     3   44   85.84  54.297     8.65         45       156  1.46064  0.36508
     4   42   68.15  59.571     8.17         40       166  1.76087  0.27657
     5   38   89.02  49.874     9.22         55       178  0.55819  1.67684
     6   47   77.45  44.811    11.63         58       176  0.00113  1.40715
     7   40   75.98  45.681    11.95         70       176  0.95318  0.48598
     8   43   81.19  49.091    10.85         64       162  0.12951  0.36724
     9   44   81.42  39.442    13.08         63       174  0.66267  0.85740
    10   38   81.87  60.055     8.63         48       170  0.44496  1.53103
    11   44   73.03  50.541    10.13         45       168  1.11832  0.55349
    12   45   87.66  37.388    14.03         56       186  1.20836  1.05948