Examples


Example 27.1. Principal Component Analysis

The following example analyzes socioeconomic data provided by Harman (1976). The five variables represent total population, median school years, total employment, miscellaneous professional services, and median house value. Each observation represents one of twelve census tracts in the Los Angeles Standard Metropolitan Statistical Area.

The first analysis is a principal component analysis. Simple descriptive statistics and correlations are also displayed. This example produces Output 27.1.1:

data SocioEconomics;
   title 'Five Socioeconomic Variables';
   title2 'See Page 14 of Harman: Modern Factor Analysis, 3rd Ed';
   input Population School Employment Services HouseValue;
   datalines;
5700     12.8      2500      270       25000
1000     10.9      600       10        10000
3400     8.8       1000      10        9000
3800     13.6      1700      140       25000
4000     12.8      1600      140       25000
8200     8.3       2600      60        12000
1200     11.4      400       10        16000
9100     11.5      3300      60        14000
9900     12.5      3400      180       18000
9600     13.7      3600      390       25000
9600     9.6       3300      80        12000
9400     11.4      4000      100       13000
;

proc factor data=SocioEconomics simple corr;
   title3 'Principal Component Analysis';
run;

Output 27.1.1: Principal Component Analysis
Five Socioeconomic Variables
See Page 14 of Harman: Modern Factor Analysis, 3rd Ed
Principal Component Analysis

The FACTOR Procedure

Means and Standard Deviations from 12 Observations

Variable            Mean       Std Dev
Population      6241.667     3439.9943
School            11.442        1.7865
Employment      2333.333     1241.2115
Services         120.833      114.9275
HouseValue     17000.000     6367.5313

Correlations

              Population        School    Employment      Services    HouseValue
Population       1.00000       0.00975       0.97245       0.43887       0.02241
School           0.00975       1.00000       0.15428       0.69141       0.86307
Employment       0.97245       0.15428       1.00000       0.51472       0.12193
Services         0.43887       0.69141       0.51472       1.00000       0.77765
HouseValue       0.02241       0.86307       0.12193       0.77765       1.00000

Initial Factor Method: Principal Components

Eigenvalues of the Correlation Matrix: Total = 5  Average = 1

     Eigenvalue    Difference    Proportion    Cumulative
1    2.87331359    1.07665350        0.5747        0.5747
2    1.79666009    1.58182321        0.3593        0.9340
3    0.21483689    0.11490283        0.0430        0.9770
4    0.09993405    0.08467868        0.0200        0.9969
5    0.01525537                      0.0031        1.0000

Factor Pattern

                 Factor1       Factor2
Population       0.58096       0.80642
School           0.76704      -0.54476
Employment       0.67243       0.72605
Services         0.93239      -0.10431
HouseValue       0.79116      -0.55818

Variance Explained by Each Factor

  Factor1      Factor2
2.8733136    1.7966601

Final Communality Estimates: Total = 4.669974

Population        School    Employment      Services    HouseValue
0.98782629    0.88510555    0.97930583    0.88023562    0.93750041
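The descriptive statistics in Output 27.1.1 can be cross-checked outside SAS. The following Python sketch is not part of the SAS example; it assumes the usual definitions (sample standard deviation with divisor n - 1, Pearson correlation) and uses only the Population and Employment columns of the data step. The helper names are illustrative.

```python
import math

# Population and Employment columns copied from the SocioEconomics data step.
population = [5700, 1000, 3400, 3800, 4000, 8200,
              1200, 9100, 9900, 9600, 9600, 9400]
employment = [2500, 600, 1000, 1700, 1600, 2600,
              400, 3300, 3400, 3600, 3300, 4000]


def mean(xs):
    return sum(xs) / len(xs)


def std_dev(xs):
    """Sample standard deviation (divisor n - 1)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def correlation(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cross / ((len(xs) - 1) * std_dev(xs) * std_dev(ys))


print(round(mean(population), 3))
print(round(std_dev(population), 4))
print(round(correlation(population, employment), 5))
```

The three printed values should agree with the Population mean, the Population standard deviation, and the Population-Employment correlation in the listing, up to rounding.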
 

There are two large eigenvalues, 2.8733 and 1.7967, which together account for 93.4% of the standardized variance. Thus, the first two principal components provide an adequate summary of the data for most purposes. Three components, explaining 97.7% of the variation, should be sufficient for almost any application. PROC FACTOR retains two components on the basis of the eigenvalues-greater-than-one rule since the third eigenvalue is only 0.2148.
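The percentages quoted above follow directly from the eigenvalues in Output 27.1.1. The short Python sketch below (an illustration, not SAS code) recomputes the proportions, the cumulative proportions, and the number of components kept by the eigenvalues-greater-than-one rule.

```python
# Eigenvalues of the correlation matrix, copied from Output 27.1.1.
eigenvalues = [2.87331359, 1.79666009, 0.21483689, 0.09993405, 0.01525537]

total = sum(eigenvalues)  # equals the number of variables for a correlation matrix
proportions = [ev / total for ev in eigenvalues]

# Running sum of the proportions.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

# Eigenvalues-greater-than-one rule: keep components whose eigenvalue exceeds 1.
n_retained = sum(1 for ev in eigenvalues if ev > 1.0)

for k, (p, c) in enumerate(zip(proportions, cumulative), start=1):
    print(k, round(p, 4), round(c, 4))
print("components retained:", n_retained)
```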

The first component has large positive loadings for all five variables. The correlation with Services (0.93239) is especially high. The second component is a contrast of Population (0.80642) and Employment (0.72605) against School (-0.54476) and HouseValue (-0.55818), with a very small loading on Services (-0.10431).

The final communality estimates show that all the variables are well accounted for by two components, with final communality estimates ranging from 0.880236 for Services to 0.987826 for Population.
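For an orthogonal solution, each final communality is the sum of the squared loadings of that variable on the retained components, and the variance explained by a component is the sum of the squared loadings in its column. The following Python sketch (illustrative only) checks this against the factor pattern of Output 27.1.1, using the loadings and signs quoted in the text above.

```python
# Loadings on the two retained components, (Factor1, Factor2).
pattern = {
    "Population": (0.58096, 0.80642),
    "School":     (0.76704, -0.54476),
    "Employment": (0.67243, 0.72605),
    "Services":   (0.93239, -0.10431),
    "HouseValue": (0.79116, -0.55818),
}

# Communality of a variable: row sum of squared loadings.
communality = {name: f1 ** 2 + f2 ** 2 for name, (f1, f2) in pattern.items()}

# Variance explained by a component: column sum of squared loadings.
variance_factor1 = sum(f1 ** 2 for f1, _ in pattern.values())
variance_factor2 = sum(f2 ** 2 for _, f2 in pattern.values())

for name, h2 in communality.items():
    print(f"{name:<12}{h2:.5f}")
print(round(variance_factor1, 4), round(variance_factor2, 4))
```

The column sums reproduce the first two eigenvalues, which is why the total of the communalities equals the sum of those eigenvalues.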

Example 27.2. Principal Factor Analysis

The following example uses the data presented in Example 27.1, and performs a principal factor analysis with squared multiple correlations for the prior communality estimates (PRIORS=SMC).

To help determine if the common factor model is appropriate, Kaiser's measure of sampling adequacy (MSA) is requested, and the residual correlations and partial correlations are computed (RESIDUAL). To help determine the number of factors, a scree plot (SCREE) of the eigenvalues is displayed, and the PREPLOT option plots the unrotated factor pattern.

The ROTATE= and REORDER options are specified to enhance factor interpretability. The ROTATE=PROMAX option produces an orthogonal varimax prerotation (default) followed by an oblique Procrustean rotation, and the REORDER option re-orders the variables according to their largest factor loadings. An OUTSTAT= data set is created by PROC FACTOR and displayed in Output 27.2.16.

proc factor data=SocioEconomics
     priors=smc msa scree residual preplot
     rotate=promax reorder plot
     outstat=fact_all;
   title3 'Principal Factor Analysis with Promax Rotation';
run;

proc print;
   title3 'Factor Output Data Set';
run;

Output 27.2.16: Output Data Set
Factor Output Data Set

Obs  _TYPE_    _NAME_      Population    School  Employment  Services  HouseValue
  1  MEAN                     6241.67   11.4417     2333.33   120.833    17000.00
  2  STD                      3439.99    1.7865     1241.21   114.928     6367.53
  3  N                          12.00   12.0000       12.00    12.000       12.00
  4  CORR      Population        1.00    0.0098        0.97     0.439        0.02
  5  CORR      School            0.01    1.0000        0.15     0.691        0.86
  6  CORR      Employment        0.97    0.1543        1.00     0.515        0.12
  7  CORR      Services          0.44    0.6914        0.51     1.000        0.78
  8  CORR      HouseValue        0.02    0.8631        0.12     0.778        1.00
  9  COMMUNAL                    0.98    0.8176        0.97     0.798        0.88
 10  PRIORS                      0.97    0.8223        0.97     0.786        0.85
 11  EIGENVAL                    2.73    1.7161        0.04    -0.025       -0.07
 12  UNROTATE  Factor1           0.63    0.7137        0.71     0.879        0.74
 13  UNROTATE  Factor2           0.77   -0.5552        0.68    -0.158       -0.58
 14  RESIDUAL  Population        0.02   -0.0112        0.01     0.011        0.00
 15  RESIDUAL  School           -0.01    0.1824        0.02    -0.024        0.01
 16  RESIDUAL  Employment        0.01    0.0215        0.03    -0.006       -0.02
 17  RESIDUAL  Services          0.01   -0.0239       -0.01     0.202        0.03
 18  RESIDUAL  HouseValue        0.00    0.0125       -0.02     0.034        0.12
 19  PRETRANS  Factor1           0.79   -0.6145         .         .           .
 20  PRETRANS  Factor2           0.61    0.7889         .         .           .
 21  PREROTAT  Factor1           0.02    0.9042        0.15     0.791        0.94
 22  PREROTAT  Factor2           0.99    0.0006        0.97     0.415       -0.00
 23  TRANSFOR  Factor1           0.74   -0.7055         .         .           .
 24  TRANSFOR  Factor2           0.54    0.8653         .         .           .
 25  FCORR     Factor1           1.00    0.2019         .         .           .
 26  FCORR     Factor2           0.20    1.0000         .         .           .
 27  PATTERN   Factor1          -0.08    0.9184        0.05     0.761        0.96
 28  PATTERN   Factor2           1.00   -0.0935        0.98     0.339       -0.10
 29  RCORR     Factor1           1.00   -0.2019         .         .           .
 30  RCORR     Factor2          -0.20    1.0000         .         .           .
 31  REFERENC  Factor1          -0.08    0.8995        0.05     0.745        0.94
 32  REFERENC  Factor2           0.98   -0.0916        0.96     0.332       -0.10
 33  STRUCTUR  Factor1           0.12    0.8995        0.24     0.829        0.94
 34  STRUCTUR  Factor2           0.99    0.0919        0.98     0.493        0.09
 

Output 27.2.1 displays the results of the principal factor extraction.

Output 27.2.1: Principal Factor Analysis
Principal Factor Analysis with Promax Rotation

The FACTOR Procedure
Initial Factor Method: Principal Factors

Partial Correlations Controlling all other Variables

              Population        School    Employment      Services    HouseValue
Population       1.00000      -0.54465       0.97083       0.09612       0.15871
School          -0.54465       1.00000       0.54373       0.04996       0.64717
Employment       0.97083       0.54373       1.00000       0.06689      -0.25572
Services         0.09612       0.04996       0.06689       1.00000       0.59415
HouseValue       0.15871       0.64717      -0.25572       0.59415       1.00000

Kaiser's Measure of Sampling Adequacy: Overall MSA = 0.57536759

Population        School    Employment      Services    HouseValue
0.47207897    0.55158839    0.48851137    0.80664365    0.61281377

Prior Communality Estimates: SMC

Population        School    Employment      Services    HouseValue
0.96859160    0.82228514    0.96918082    0.78572440    0.84701921

Eigenvalues of the Reduced Correlation Matrix:
Total = 4.39280116  Average = 0.87856023

     Eigenvalue    Difference    Proportion    Cumulative
1    2.73430084    1.01823217        0.6225        0.6225
2    1.71606867    1.67650586        0.3907        1.0131
3    0.03956281    0.06408626        0.0090        1.0221
4    -.02452345    0.04808427       -0.0056        1.0165
5    -.07260772                     -0.0165        1.0000
 

If the data are appropriate for the common factor model, the partial correlations controlling the other variables should be small compared to the original correlations. The partial correlation between the variables School and HouseValue, for example, is 0.65, somewhat less than the original correlation of 0.86. The partial correlation between Population and School is -0.54, which is much larger in absolute value than the original correlation of 0.01; this is an indication of trouble. Kaiser's MSA is a summary, for each variable and for all variables together, of how much smaller the partial correlations are than the original correlations. Values of 0.8 or 0.9 are considered good, while MSAs below 0.5 are unacceptable. The variables Population, School, and Employment have very poor MSAs. Only the Services variable has a good MSA. The overall MSA of 0.58 is sufficiently poor that additional variables should be included in the analysis to better define the common factors. A commonly used rule is that there should be at least three variables per factor. In the following analysis, there seem to be two common factors in these data, so more variables are needed for a reliable analysis.
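The MSA values can be reproduced from the two matrices already printed. The Python sketch below assumes the usual form of Kaiser's measure: the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations. Only squared values enter the formula, so the signs of the partial correlations do not affect the result. The function names are illustrative.

```python
names = ["Population", "School", "Employment", "Services", "HouseValue"]

# Simple correlations (Output 27.1.1).
corr = [
    [1.00000, 0.00975, 0.97245, 0.43887, 0.02241],
    [0.00975, 1.00000, 0.15428, 0.69141, 0.86307],
    [0.97245, 0.15428, 1.00000, 0.51472, 0.12193],
    [0.43887, 0.69141, 0.51472, 1.00000, 0.77765],
    [0.02241, 0.86307, 0.12193, 0.77765, 1.00000],
]

# Partial correlations controlling all other variables (Output 27.2.1).
partial = [
    [1.00000, -0.54465, 0.97083, 0.09612, 0.15871],
    [-0.54465, 1.00000, 0.54373, 0.04996, 0.64717],
    [0.97083, 0.54373, 1.00000, 0.06689, -0.25572],
    [0.09612, 0.04996, 0.06689, 1.00000, 0.59415],
    [0.15871, 0.64717, -0.25572, 0.59415, 1.00000],
]


def off_diag_squares(matrix, i):
    """Sum of squared off-diagonal entries in row i."""
    return sum(matrix[i][j] ** 2 for j in range(len(matrix)) if j != i)


def msa_variable(i):
    """MSA for the variable in row i."""
    r2 = off_diag_squares(corr, i)
    q2 = off_diag_squares(partial, i)
    return r2 / (r2 + q2)


def msa_overall():
    """MSA over all off-diagonal entries."""
    r2 = sum(off_diag_squares(corr, i) for i in range(len(corr)))
    q2 = sum(off_diag_squares(partial, i) for i in range(len(partial)))
    return r2 / (r2 + q2)


for i, name in enumerate(names):
    print(f"{name:<12}{msa_variable(i):.4f}")
print(f"{'Overall':<12}{msa_overall():.4f}")
```

A variable whose partial correlations are nearly as large as its simple correlations, such as Population, ends up with an MSA near or below 0.5.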

The SMCs are all fairly large; hence, the factor loadings do not differ greatly from the principal component analysis.

The eigenvalues show clearly that two common factors are present. The two largest eigenvalues account for 101.31% of the common variance. This is possible because the reduced correlation matrix, in general, need not be positive definite, so some of its eigenvalues can be negative. The scree plot displays a sharp bend at the third eigenvalue, reinforcing the preceding conclusion.
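The arithmetic behind a cumulative proportion above 100% can be checked with a few lines of Python (an illustration, not SAS code). The last two eigenvalues are entered as negative here; that sign is an assumption consistent with the printed total of 4.39280116 and with the cumulative proportions falling back to 1.0000. The total also equals the sum of the prior communalities, which form the diagonal of the reduced correlation matrix.

```python
# Eigenvalues of the reduced correlation matrix (Output 27.2.1).
eigenvalues = [2.73430084, 1.71606867, 0.03956281, -0.02452345, -0.07260772]

# Prior communality estimates (SMCs) placed on the diagonal of the reduced matrix.
priors = [0.96859160, 0.82228514, 0.96918082, 0.78572440, 0.84701921]

total = sum(eigenvalues)  # common variance; matches the trace of the reduced matrix

cumulative = []
running = 0.0
for ev in eigenvalues:
    running += ev / total
    cumulative.append(running)

print(round(total, 6), round(sum(priors), 6))
print([round(c, 4) for c in cumulative])
```

Because the negative eigenvalues reduce the total, the positive eigenvalues together sum to more than the total, and the cumulative proportion passes 1 before returning to it.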

Output 27.2.2: Scree Plot
Principal Factor Analysis with Promax Rotation
Initial Factor Method: Principal Factors

[Line-printer plot: Scree Plot of Eigenvalues. Vertical axis: Eigenvalues; horizontal axis: Number. Points 1 and 2 lie near 2.7 and 1.7; points 3, 4, and 5 lie near 0.]
 

As displayed in Output 27.2.3, the principal factor pattern is similar to the principal component pattern seen in Example 27.1. For example, the variable Services has the largest loading on the first factor, and the Population variable has the smallest. The variables Population and Employment have large positive loadings on the second factor, and the HouseValue and School variables have large negative loadings.

Output 27.2.3: Factor Pattern Matrix and Communalities
Principal Factor Analysis with Promax Rotation
Initial Factor Method: Principal Factors

Factor Pattern

                 Factor1       Factor2
Services         0.87899      -0.15847
HouseValue       0.74215      -0.57806
Employment       0.71447       0.67936
School           0.71370      -0.55515
Population       0.62533       0.76621

Variance Explained by Each Factor

  Factor1      Factor2
2.7343008    1.7160687

Final Communality Estimates: Total = 4.450370

Population        School    Employment      Services    HouseValue
0.97811334    0.81756387    0.97199928    0.79774304    0.88494998
 

The final communality estimates are all fairly close to the priors. Only the communality for the variable HouseValue increased appreciably, from 0.847019 to 0.884950. Nearly 100% of the common variance is accounted for. The residual correlations (off-diagonal elements) are low, the largest being 0.03 (Output 27.2.4). The partial correlations are not quite as impressive, since the uniqueness values are also rather small. These results indicate that the SMCs are good but not quite optimal communality estimates.
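The link between small uniqueness values and larger partial correlations can be made concrete. The Python sketch below assumes that the partial correlation controlling the factors is the residual correlation divided by the square root of the product of the two uniquenesses; with the values printed in Output 27.2.4 this reproduces the listed partial correlations to about three decimals. The dictionary and function names are illustrative.

```python
import math

# Uniqueness values (diagonal of the residual matrix in Output 27.2.4).
uniqueness = {
    "Population": 0.02189,
    "School": 0.18244,
    "Employment": 0.02800,
    "Services": 0.20226,
    "HouseValue": 0.11505,
}

# A few off-diagonal residual correlations from the same output.
residual = {
    ("Population", "Employment"): 0.00514,
    ("School", "Employment"): 0.02151,
    ("Services", "HouseValue"): 0.03370,
}


def partial_given_factors(a, b):
    """Residual correlation rescaled by the two uniquenesses."""
    return residual[(a, b)] / math.sqrt(uniqueness[a] * uniqueness[b])


for pair in residual:
    print(pair, round(partial_given_factors(*pair), 4))
```

A residual of only 0.005 between Population and Employment becomes a partial correlation of about 0.21 because both uniquenesses are below 0.03.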

Output 27.2.4: Residual and Partial Correlations
Principal Factor Analysis with Promax Rotation
Initial Factor Method: Principal Factors

Residual Correlations With Uniqueness on the Diagonal

              Population        School    Employment      Services    HouseValue
Population       0.02189      -0.01118       0.00514       0.01063       0.00124
School          -0.01118       0.18244       0.02151      -0.02390       0.01248
Employment       0.00514       0.02151       0.02800      -0.00565      -0.01561
Services         0.01063      -0.02390      -0.00565       0.20226       0.03370
HouseValue       0.00124       0.01248      -0.01561       0.03370       0.11505

Root Mean Square Off-Diagonal Residuals: Overall = 0.01693282

Population        School    Employment      Services    HouseValue
0.00815307    0.01813027    0.01382764    0.02151737    0.01960158

Partial Correlations Controlling Factors

              Population        School    Employment      Services    HouseValue
Population       1.00000      -0.17693       0.20752       0.15975       0.02471
School          -0.17693       1.00000       0.30097      -0.12443       0.08614
Employment       0.20752       0.30097       1.00000      -0.07504      -0.27509
Services         0.15975      -0.12443      -0.07504       1.00000       0.22093
HouseValue       0.02471       0.08614      -0.27509       0.22093       1.00000
 
Output 27.2.5: Root Mean Square Off-Diagonal Partials
Principal Factor Analysis with Promax Rotation
Initial Factor Method: Principal Factors

Root Mean Square Off-Diagonal Partials: Overall = 0.18550132

Population        School    Employment      Services    HouseValue
0.15850824    0.19025867    0.23181838    0.15447043    0.18201538
 

As displayed in Output 27.2.6, the unrotated factor pattern reveals two tight clusters of variables, with the variables HouseValue and School at the negative end of the Factor2 axis and the variables Employment and Population at the positive end. The Services variable is in between but closer to the HouseValue and School variables. A good rotation would put the reference axes through the two clusters.

Output 27.2.6: Unrotated Factor Pattern Plot
Principal Factor Analysis with Promax Rotation
Initial Factor Method: Principal Factors

[Line-printer plot: Plot of Factor Pattern for Factor1 and Factor2. Vertical axis: Factor1; horizontal axis: Factor2. Legend: Population=A, School=B, Employment=C, Services=D, HouseValue=E.]
 

Output 27.2.7, Output 27.2.8, and Output 27.2.9 display the results of the varimax rotation. This rotation puts one axis through the variables HouseValue and School but misses the Population and Employment variables slightly.
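The varimax-rotated pattern is the unrotated pattern multiplied by the orthogonal transformation matrix. The Python sketch below is an illustration under one stated assumption: the negative signs (second-factor loadings of Services, HouseValue, and School, and the lower-left element of the transformation matrix) are taken as shown here because that arrangement reproduces the printed rotated pattern.

```python
# Unrotated principal factor pattern (Output 27.2.3), (Factor1, Factor2).
unrotated = {
    "Services":   (0.87899, -0.15847),
    "HouseValue": (0.74215, -0.57806),
    "Employment": (0.71447, 0.67936),
    "School":     (0.71370, -0.55515),
    "Population": (0.62533, 0.76621),
}

# Orthogonal transformation matrix from the varimax prerotation (Output 27.2.7).
transform = (
    (0.78895, 0.61446),
    (-0.61446, 0.78895),
)


def rotate(loadings, t):
    """Multiply one row of the pattern matrix by the 2 x 2 matrix t."""
    f1, f2 = loadings
    return (f1 * t[0][0] + f2 * t[1][0], f1 * t[0][1] + f2 * t[1][1])


rotated = {name: rotate(row, transform) for name, row in unrotated.items()}

for name, (f1, f2) in rotated.items():
    print(f"{name:<12}{f1:9.5f}{f2:9.5f}")
```

Because the transformation is orthogonal, each row keeps the same sum of squared loadings, so the communalities are unchanged by the rotation.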

Output 27.2.7: Varimax Rotation: Transform Matrix and Rotated Pattern

Principal Factor Analysis with Promax Rotation
Prerotation Method: Varimax

Orthogonal Transformation Matrix

            1             2
1     0.78895       0.61446
2    -0.61446       0.78895

Rotated Factor Pattern

                 Factor1       Factor2
HouseValue       0.94072      -0.00004
School           0.90419       0.00055
Services         0.79085       0.41509
Population       0.02255       0.98874
Employment       0.14625       0.97499
 
Output 27.2.8: Varimax Rotation: Variance Explained and Communalities

Principal Factor Analysis with Promax Rotation
Prerotation Method: Varimax

Variance Explained by Each Factor

  Factor1      Factor2
2.3498567    2.1005128

Final Communality Estimates: Total = 4.450370

Population        School    Employment      Services    HouseValue
0.97811334    0.81756387    0.97199928    0.79774304    0.88494998
 
Output 27.2.9: Varimax Rotated Factor Pattern Plot
Principal Factor Analysis with Promax Rotation
Prerotation Method: Varimax

[Line-printer plot: Plot of Factor Pattern for Factor1 and Factor2. Vertical axis: Factor1; horizontal axis: Factor2. Legend: Population=A, School=B, Employment=C, Services=D, HouseValue=E.]
 

The oblique promax rotation (Output 27.2.10 through Output 27.2.15) places an axis through the variables Population and Employment but misses the HouseValue and School variables. Since an independent-cluster solution would be possible if it were not for the variable Services, a Harris-Kaiser rotation weighted by the Cureton-Mulaik technique should be used.
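In an oblique solution the factor pattern (regression coefficients) and the factor structure (correlations) differ, and they are linked through the inter-factor correlation. The Python sketch below assumes the standard relation, structure = pattern times the inter-factor correlation matrix, and checks it against the promax values in Output 27.2.12 and Output 27.2.13. The negative signs on the small pattern loadings are an assumption chosen to agree with the printed structure matrix.

```python
# Promax rotated factor pattern (Output 27.2.12), (Factor1, Factor2).
pattern = {
    "HouseValue": (0.95558485, -0.0979201),
    "School":     (0.91842142, -0.0935214),
    "Services":   (0.76053238, 0.33931804),
    "Population": (-0.0790832, 1.00192402),
    "Employment": (0.04799, 0.97509085),
}

# Inter-factor correlation (Output 27.2.11).
phi = 0.20188


def structure_row(loadings, r):
    """Pattern row times the 2 x 2 correlation matrix [[1, r], [r, 1]]."""
    p1, p2 = loadings
    return (p1 + p2 * r, p1 * r + p2)


structure = {name: structure_row(row, phi) for name, row in pattern.items()}

for name, (s1, s2) in structure.items():
    print(f"{name:<12}{s1:9.5f}{s2:9.5f}")
```

With a factor correlation of about 0.2, the two matrices differ most for Services, the one variable with sizable loadings on both factors.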

Output 27.2.10: Promax Rotation: Procrustean Target and Transform Matrix

Principal Factor Analysis with Promax Rotation
Rotation Method: Promax (power = 3)

Target Matrix for Procrustean Transformation

                 Factor1       Factor2
HouseValue       1.00000      -0.00000
School           1.00000       0.00000
Services         0.69421       0.10045
Population       0.00001       1.00000
Employment       0.00326       0.96793

Procrustean Transformation Matrix

              1             2
1    1.04116598    -0.0986534
2    -0.1057226    0.96303019
 
Output 27.2.11: Promax Rotation: Oblique Transform Matrix and Correlation

Principal Factor Analysis with Promax Rotation
Rotation Method: Promax (power = 3)

Normalized Oblique Transformation Matrix

            1             2
1     0.73803       0.54202
2    -0.70555       0.86528

Inter-Factor Correlations

              Factor1       Factor2
Factor1       1.00000       0.20188
Factor2       0.20188       1.00000
 
Output 27.2.12: Promax Rotation: Rotated Factor Pattern and Correlations

Principal Factor Analysis with Promax Rotation
Rotation Method: Promax (power = 3)

Rotated Factor Pattern (Standardized Regression Coefficients)

                  Factor1       Factor2
HouseValue     0.95558485    -0.0979201
School         0.91842142    -0.0935214
Services       0.76053238    0.33931804
Population     -0.0790832    1.00192402
Employment        0.04799    0.97509085

Reference Axis Correlations

              Factor1       Factor2
Factor1       1.00000      -0.20188
Factor2      -0.20188       1.00000
 
Output 27.2.13: Promax Rotation: Variance Explained and Factor Structure

Principal Factor Analysis with Promax Rotation
Rotation Method: Promax (power = 3)

Reference Structure (Semipartial Correlations)

                 Factor1       Factor2
HouseValue       0.93591      -0.09590
School           0.89951      -0.09160
Services         0.74487       0.33233
Population      -0.07745       0.98129
Employment       0.04700       0.95501

Variance Explained by Each Factor Eliminating Other Factors

  Factor1      Factor2
2.2480892    2.0030200

Factor Structure (Correlations)

                 Factor1       Factor2
HouseValue       0.93582       0.09500
School           0.89954       0.09189
Services         0.82903       0.49286
Population       0.12319       0.98596
Employment       0.24484       0.98478
 
Output 27.2.14: Promax Rotation: Variance Explained and Final Communalities

Principal Factor Analysis with Promax Rotation
Rotation Method: Promax (power = 3)

Variance Explained by Each Factor Ignoring Other Factors

  Factor1      Factor2
2.4473495    2.2022803

Final Communality Estimates: Total = 4.450370

Population        School    Employment      Services    HouseValue
0.97811334    0.81756387    0.97199928    0.79774304    0.88494998
 
Output 27.2.15: Promax Rotated Factor Pattern Plot
Principal Factor Analysis with Promax Rotation
Rotation Method: Promax (power = 3)

[Line-printer plot: Plot of Reference Structure for Factor1 and Factor2. Reference Axis Correlation = -0.2019, Angle = 101.6471. Vertical axis: Factor1; horizontal axis: Factor2. Legend: Population=A, School=B, Employment=C, Services=D, HouseValue=E.]
 

The output data set displayed in Output 27.2.16 can be used for Harris-Kaiser rotation by deleting observations with _TYPE_ = PATTERN and _TYPE_ = FCORR, which are for the promax-rotated factors, and changing _TYPE_ = UNROTATE to _TYPE_ = PATTERN. In this way, the initial orthogonal factor pattern matrix is saved in the observations with _TYPE_ = PATTERN. The following factor analysis will then read in the factor pattern in the fact2 data set as an initial factor solution, which will then be rotated by the Harris-Kaiser rotation with Cureton-Mulaik weights.

The following statements produce Output 27.2.17:

Output 27.2.17: Harris-Kaiser Rotation
start example
            Harris-Kaiser Rotation with Cureton-Mulaik Weights                             The FACTOR Procedure                 Rotation Method: Harris-Kaiser (hkpower = 0)                         Variable Weights for Rotation  Population          School      Employment        Services      HouseValue  0.95982747      0.93945424      0.99746396      0.12194766      0.94007263                        Oblique Transformation Matrix                                          1               2                          1         0.73537         0.61899                          2        0.68283         0.78987                          Inter-Factor Correlations                                   Factor1         Factor2                   Factor1         1.00000         0.08358                   Factor2         0.08358         1.00000        Harris-Kaiser Rotation with Cureton-Mulaik Weights          Rotation Method: Harris-Kaiser (hkpower = 0)  Rotated Factor Pattern (Standardized Regression Coefficients)                              Factor1         Factor2           HouseValue         0.94048         0.00279           School             0.90391         0.00327           Services           0.75459         0.41892           Population        0.06335         0.99227           Employment         0.06152         0.97885                  Reference Axis Correlations                            Factor1         Factor2            Factor1         1.00000        0.08358            Factor2        0.08358         1.00000         Reference Structure (Semipartial Correlations)                              Factor1         Factor2           HouseValue         0.93719         0.00278           School             0.90075         0.00326           Services           0.75195         0.41745           Population        0.06312         0.98880           Employment         0.06130         0.97543   Variance Explained by Each Factor Eliminating Other Factors                      Factor1         Factor2 
                          2.2628537       2.1034731

          Harris-Kaiser Rotation with Cureton-Mulaik Weights
             Rotation Method: Harris-Kaiser (hkpower = 0)

                   Factor Structure (Correlations)

                              Factor1         Factor2
          HouseValue          0.94071         0.08139
          School              0.90419         0.07882
          Services            0.78960         0.48198
          Population          0.01958         0.98698
          Employment          0.14332         0.98399

      Variance Explained by Each Factor Ignoring Other Factors

                              Factor1         Factor2
                            2.3468965       2.1875158

            Final Communality Estimates: Total = 4.450370

  Population        School    Employment      Services    HouseValue
  0.97811334    0.81756387    0.97199928    0.79774304    0.88494998

          Harris-Kaiser Rotation with Cureton-Mulaik Weights
             Rotation Method: Harris-Kaiser (hkpower = 0)

[Plot of Reference Structure for Factor1 and Factor2. Reference Axis Correlation = -0.0836, Angle = 94.7941. Plotting symbols: Population=A, School=B, Employment=C, Services=D, HouseValue=E.]
end example
 
data fact2(type=factor);
   set fact_all;
   if _TYPE_ in('PATTERN' 'FCORR') then delete;
   if _TYPE_='UNROTATE' then _TYPE_='PATTERN';

proc factor rotate=hk norm=weight reorder plot;
   title3 'Harris-Kaiser Rotation with Cureton-Mulaik Weights';
run;

The results of the Harris-Kaiser rotation are displayed in Output 27.2.17.

In the results of the Harris-Kaiser rotation, the variable Services receives a small weight, and the axes are placed as desired.

Example 27.3. Maximum Likelihood Factor Analysis

This example uses maximum likelihood factor analyses for one, two, and three factors. It is already apparent from the principal factor analysis that the best number of common factors is almost certainly two. The one- and three-factor ML solutions reinforce this conclusion and illustrate some of the numerical problems that can occur. The following statements produce Output 27.3.1:

proc factor data=SocioEconomics method=ml heywood n=1;
   title3 'Maximum Likelihood Factor Analysis with One Factor';
run;

proc factor data=SocioEconomics method=ml heywood n=2;
   title3 'Maximum Likelihood Factor Analysis with Two Factors';
run;

proc factor data=SocioEconomics method=ml heywood n=3;
   title3 'Maximum Likelihood Factor Analysis with Three Factors';
run;
Output 27.3.1: Maximum Likelihood Factor Analysis
start example
          Maximum Likelihood Factor Analysis with One Factor

                         The FACTOR Procedure
              Initial Factor Method: Maximum Likelihood

                   Prior Communality Estimates: SMC

  Population        School    Employment      Services    HouseValue
  0.96859160    0.82228514    0.96918082    0.78572440    0.84701921

   Preliminary Eigenvalues: Total = 76.1165859  Average = 15.2233172

           Eigenvalue    Difference    Proportion    Cumulative
      1    63.7010086    50.6462895        0.8369        0.8369
      2    13.0547191    12.7270798        0.1715        1.0084
      3     0.3276393     0.6749199        0.0043        1.0127
      4    -0.3472805     0.2722202       -0.0046        1.0081
      5    -0.6195007                     -0.0081        1.0000

  Iteration   Criterion    Ridge   Change   Communalities
      1       6.5429218   0.0000   0.1033   0.93828  0.72227  1.00000  0.71940  0.74371
      2       3.1232699   0.0000   0.7288   0.94566  0.02380  1.00000  0.26493  0.01487
      3       3.1232699   0.0313   0.0000   0.94566  0.02380  1.00000  0.26493  0.01487

                   Convergence criterion satisfied.

          Maximum Likelihood Factor Analysis with One Factor
              Initial Factor Method: Maximum Likelihood

             Significance Tests Based on 12 Observations

                                                         Pr >
  Test                              DF    Chi-Square    ChiSq
  H0: No common factors             10       54.2517   <.0001
  HA: At least one common factor
  H0: 1 Factor is sufficient         5       24.4656   0.0002
  HA: More factors are needed

  Chi-Square without Bartlett's Correction       34.355969
  Akaike's Information Criterion                 24.355969
  Schwarz's Bayesian Criterion                   21.931436
  Tucker and Lewis's Reliability Coefficient      0.120231

                    Squared Canonical Correlations

                                Factor1
                              1.0000000

           Eigenvalues of the Weighted Reduced Correlation
            Matrix: Total = 8.66E-15  Average = 2.165E-15

                      Eigenvalue    Difference
                 1         Infty         Infty
                 2    1.92716032    2.15547340
                 3   -0.22831308    0.56464322
                 4   -0.79295630    0.11293464
                 5   -0.90589094

          Maximum Likelihood Factor Analysis with One Factor
              Initial Factor Method: Maximum Likelihood

                            Factor Pattern

                                   Factor1
                 Population     0.97244826
                 School         0.15428378
                 Employment              1
                 Services       0.51471836
                 HouseValue     0.12192599

                  Variance Explained by Each Factor

                 Factor        Weighted    Unweighted
                 Factor1     17.8010629    2.24926004

           Final Communality Estimates and Variable Weights
     Total Communality: Weighted = 17.801063   Unweighted = 2.249260

              Variable      Communality        Weight
              Population     0.94565561    18.4011648
              School         0.02380349     1.0243839
              Employment     1.00000000         Infty
              Services       0.26493499     1.3604239
              HouseValue     0.01486595     1.0150903
end example
 

Output 27.3.1 displays the results of the analysis with one factor. The solution on the second iteration is so close to the optimum that PROC FACTOR cannot find a better solution, hence you receive this message:

Convergence criterion satisfied. 

When this message appears, you should try rerunning PROC FACTOR with different prior communality estimates to make sure that the solution is correct. In this case, other prior estimates lead to the same solution or possibly to worse local optima, as indicated by the information criteria or the Chi-square values.
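A minimal sketch of such a rerun follows. It assumes the SocioEconomics data set created in Example 27.1 and uses the PRIORS= option of the PROC FACTOR statement to change the starting communalities; the particular choices (MAX, ASMC, RANDOM) and the titles are illustrative only.

```sas
/* Sketch: refit the one-factor ML model from different prior communalities */
/* and compare the final criterion, chi-square, and information criteria.   */
proc factor data=SocioEconomics method=ml heywood n=1 priors=max;
   title3 'One-Factor ML with PRIORS=MAX';
run;

proc factor data=SocioEconomics method=ml heywood n=1 priors=asmc;
   title3 'One-Factor ML with PRIORS=ASMC';
run;

proc factor data=SocioEconomics method=ml heywood n=1 priors=random;
   title3 'One-Factor ML with PRIORS=RANDOM';
run;
```

If every run ends at the same criterion value, or at a larger one, that suggests the original solution is not a poor local optimum.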

The variable Employment has a communality of 1.0 and, therefore, an infinite weight that is displayed next to the final communality estimate as a missing/infinite value. The first eigenvalue is also infinite. Infinite values are ignored in computing the total of the eigenvalues and the total final communality.

Output 27.3.2 displays the results of the analysis using two factors. The analysis converges without incident. This time, however, the Population variable is a Heywood case.

Output 27.3.2: Maximum Likelihood Factor Analysis: Two Factors
start example
          Maximum Likelihood Factor Analysis with Two Factors

                         The FACTOR Procedure
              Initial Factor Method: Maximum Likelihood

                   Prior Communality Estimates: SMC

  Population        School    Employment      Services    HouseValue
  0.96859160    0.82228514    0.96918082    0.78572440    0.84701921

   Preliminary Eigenvalues: Total = 76.1165859  Average = 15.2233172

           Eigenvalue    Difference    Proportion    Cumulative
      1    63.7010086    50.6462895        0.8369        0.8369
      2    13.0547191    12.7270798        0.1715        1.0084
      3     0.3276393     0.6749199        0.0043        1.0127
      4    -0.3472805     0.2722202       -0.0046        1.0081
      5    -0.6195007                     -0.0081        1.0000

  Iteration   Criterion    Ridge   Change   Communalities
      1       0.3431221   0.0000   0.0471   1.00000  0.80672  0.95058  0.79348  0.89412
      2       0.3072178   0.0000   0.0307   1.00000  0.80821  0.96023  0.81048  0.92480
      3       0.3067860   0.0000   0.0063   1.00000  0.81149  0.95948  0.81677  0.92023
      4       0.3067373   0.0000   0.0022   1.00000  0.80985  0.95963  0.81498  0.92241
      5       0.3067321   0.0000   0.0007   1.00000  0.81019  0.95955  0.81569  0.92187

                   Convergence criterion satisfied.

          Maximum Likelihood Factor Analysis with Two Factors
              Initial Factor Method: Maximum Likelihood

             Significance Tests Based on 12 Observations

                                                         Pr >
  Test                              DF    Chi-Square    ChiSq
  H0: No common factors             10       54.2517   <.0001
  HA: At least one common factor
  H0: 2 Factors are sufficient       1        2.1982   0.1382
  HA: More factors are needed

  Chi-Square without Bartlett's Correction       3.3740530
  Akaike's Information Criterion                 1.3740530
  Schwarz's Bayesian Criterion                   0.8891463
  Tucker and Lewis's Reliability Coefficient     0.7292200

                    Squared Canonical Correlations

                        Factor1         Factor2
                      1.0000000       0.9518891

           Eigenvalues of the Weighted Reduced Correlation
           Matrix: Total = 19.7853157  Average = 4.94632893

           Eigenvalue    Difference    Proportion    Cumulative
      1         Infty         Infty
      2    19.7853143    19.2421292        1.0000        1.0000
      3     0.5431851     0.5829564        0.0275        1.0275
      4    -0.0397713     0.4636411       -0.0020        1.0254
      5    -0.5034124                     -0.0254        1.0000

          Maximum Likelihood Factor Analysis with Two Factors
              Initial Factor Method: Maximum Likelihood

                            Factor Pattern

                               Factor1         Factor2
            Population         1.00000         0.00000
            School             0.00975         0.90003
            Employment         0.97245         0.11797
            Services           0.43887         0.78930
            HouseValue         0.02241         0.95989

                  Variance Explained by Each Factor

                 Factor        Weighted    Unweighted
                 Factor1     24.4329707    2.13886057
                 Factor2     19.7853143    2.36835294

           Final Communality Estimates and Variable Weights
     Total Communality: Weighted = 44.218285   Unweighted = 4.507214

              Variable      Communality        Weight
              Population     1.00000000         Infty
              School         0.81014489     5.2682940
              Employment     0.95957142    24.7246669
              Services       0.81560348     5.4256462
              HouseValue     0.92189372    12.7996793
end example
 

The three-factor analysis displayed in Output 27.3.3 generates this message:

Output 27.3.3: Maximum Likelihood Factor Analysis: Three Factors
start example
         Maximum Likelihood Factor Analysis with Three Factors

                         The FACTOR Procedure
              Initial Factor Method: Maximum Likelihood

                   Prior Communality Estimates: SMC

  Population        School    Employment      Services    HouseValue
  0.96859160    0.82228514    0.96918082    0.78572440    0.84701921

   Preliminary Eigenvalues: Total = 76.1165859  Average = 15.2233172

           Eigenvalue    Difference    Proportion    Cumulative
      1    63.7010086    50.6462895        0.8369        0.8369
      2    13.0547191    12.7270798        0.1715        1.0084
      3     0.3276393     0.6749199        0.0043        1.0127
      4    -0.3472805     0.2722202       -0.0046        1.0081
      5    -0.6195007                     -0.0081        1.0000

  Iteration   Criterion    Ridge   Change   Communalities
      1       0.1798029   0.0313   0.0501   0.96081  0.84184  1.00000  0.80175  0.89716
      2       0.0016405   0.0313   0.0678   0.98081  0.88713  1.00000  0.79559  0.96500
      3       0.0000041   0.0313   0.0094   0.98195  0.88603  1.00000  0.80498  0.96751
      4       0.0000000   0.0313   0.0006   0.98202  0.88585  1.00000  0.80561  0.96735

            ERROR: Converged, but not to a proper optimum.

         Maximum Likelihood Factor Analysis with Three Factors
              Initial Factor Method: Maximum Likelihood

             Significance Tests Based on 12 Observations

                                                         Pr >
  Test                              DF    Chi-Square    ChiSq
  H0: No common factors             10       54.2517   <.0001
  HA: At least one common factor
  H0: 3 Factors are sufficient      -2        0.0000    .
  HA: More factors are needed

  Chi-Square without Bartlett's Correction       0.0000003
  Akaike's Information Criterion                 4.0000003
  Schwarz's Bayesian Criterion                   4.9698136
  Tucker and Lewis's Reliability Coefficient     0.0000000

                    Squared Canonical Correlations

                Factor1         Factor2         Factor3
              1.0000000       0.9751895       0.6894465

           Eigenvalues of the Weighted Reduced Correlation
           Matrix: Total = 41.5254193  Average = 10.3813548

           Eigenvalue    Difference    Proportion    Cumulative
      1         Infty         Infty
      2    39.3054826    37.0854258        0.9465        0.9465
      3     2.2200568     2.2199693        0.0535        1.0000
      4     0.0000875     0.0002949        0.0000        1.0000
      5    -0.0002075                     -0.0000        1.0000

         Maximum Likelihood Factor Analysis with Three Factors
              Initial Factor Method: Maximum Likelihood

                            Factor Pattern

                       Factor1         Factor2         Factor3
    Population         0.97245        -0.11233        -0.15409
    School             0.15428         0.89108         0.26083
    Employment         1.00000         0.00000         0.00000
    Services           0.51472         0.72416        -0.12766
    HouseValue         0.12193         0.97227        -0.08473

                  Variance Explained by Each Factor

                 Factor        Weighted    Unweighted
                 Factor1     54.6115241    2.24926004
                 Factor2     39.3054826    2.27634375
                 Factor3      2.2200568    0.11525433

           Final Communality Estimates and Variable Weights
     Total Communality: Weighted = 96.137063   Unweighted = 4.640858

              Variable      Communality        Weight
              Population     0.98201660    55.6066901
              School         0.88585165     8.7607194
              Employment     1.00000000         Infty
              Services       0.80564301     5.1444261
              HouseValue     0.96734687    30.6251078
end example
 
WARNING: Too many factors for a unique solution. 

The number of parameters in the model exceeds the number of elements in the correlation matrix from which they can be estimated, so an infinite number of different perfect solutions can be obtained. The Criterion approaches zero at an improper optimum, as indicated by this message:

Converged, but not to a proper optimum. 

The degrees of freedom for the chi-square test are -2, so a probability level cannot be computed for three factors. Note also that the variable Employment is a Heywood case again.
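The negative degrees of freedom follow from the usual parameter count for the common factor model. As a sketch, assuming the standard formula for p variables and q factors (the symbols p and q are introduced here for illustration and do not appear in the output):

```latex
\[
\mathrm{df} = \tfrac{1}{2}\left[(p-q)^2 - (p+q)\right]
\]
\[
p = 5:\qquad
q=1 \;\Rightarrow\; \tfrac{1}{2}(16-6) = 5,\qquad
q=2 \;\Rightarrow\; \tfrac{1}{2}(9-7) = 1,\qquad
q=3 \;\Rightarrow\; \tfrac{1}{2}(4-8) = -2
\]
```

The values 5 and 1 agree with the DF column of the one- and two-factor significance tests, and the value for q = 3 is the -2 cited above.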

The probability levels for the chi-square test are less than 0.0001 for the hypothesis of no common factors, 0.0002 for one common factor, and 0.1382 for two common factors. Therefore, the two-factor model seems to be an adequate representation. Akaike's information criterion and Schwarz's Bayesian criterion attain their minimum values at two common factors, so there is little doubt that two factors are appropriate for these data.
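The reported criteria can be checked by hand. The forms below are inferred from the printed values rather than quoted from the procedure documentation, so treat them as a consistency check; here the chi-square is the one printed without Bartlett's correction, df is the degrees of freedom of the sufficiency test, and N = 12 is the number of observations.

```latex
\[
\mathrm{AIC} = \chi^2 - 2\,\mathrm{df}, \qquad
\mathrm{SBC} = \chi^2 - \mathrm{df}\,\ln N, \qquad N = 12
\]
\begin{align*}
q=1:&\quad \mathrm{AIC} = 34.355969 - 2(5) = 24.355969, & \mathrm{SBC} &= 34.355969 - 5\ln 12 \approx 21.9314 \\
q=2:&\quad \mathrm{AIC} = 3.374053 - 2(1) = 1.374053,   & \mathrm{SBC} &= 3.374053 - \ln 12 \approx 0.8891 \\
q=3:&\quad \mathrm{AIC} = 0.0000003 - 2(-2) = 4.0000003, & \mathrm{SBC} &= 0.0000003 + 2\ln 12 \approx 4.9698
\end{align*}
```

Both criteria are smallest for q = 2, which is the comparison the paragraph above relies on.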

Example 27.4. Using Confidence Intervals to Locate Salient Factor Loadings

This example illustrates how you can utilize the standard errors and confidence intervals to understand the pattern of factor loadings under the maximum likelihood estimation. There are nine tests and you want a three-factor solution for a correlation matrix based on 200 observations. You apply quartimin rotation with (default) Kaiser normalization. You define loadings with magnitudes greater than 0.45 to be salient and use 90% confidence intervals to judge the salience.

data test(type=corr);
   title 'Quartimin-Rotated Factor Solution with Standard Errors';
   input _name_ $ test1-test9;
   _type_ = 'corr';
   datalines;
Test1      1  .561  .602  .290  .404  .328  .367  .179 -.268
Test2   .561     1  .743  .414  .526  .442  .523  .289 -.399
Test3   .602  .743     1  .286  .343  .361  .679  .456 -.532
Test4   .290  .414  .286     1  .677  .446  .412  .400 -.491
Test5   .404  .526  .343  .677     1  .584  .408  .299 -.466
Test6   .328  .442  .361  .446  .584     1  .333  .178 -.306
Test7   .367  .523  .679  .412  .408  .333     1  .711 -.760
Test8   .179  .289  .456  .400  .299  .178  .711     1 -.725
Test9  -.268 -.399 -.532 -.491 -.466 -.306 -.760 -.725     1
;

proc factor data=test method=ml reorder rotate=quartimin
   nobs=200 n=3 se cover=.45 alpha=.1;
   title2 'A nine-variable-three-factor example';
run;

After the quartimin rotation, the correlation matrix for factors is shown in Output 27.4.1. The factors are moderately to highly correlated. The confidence intervals seem to be very wide, suggesting that the estimation of factor correlations may not be very accurate for this sample size. For example, the 90% confidence interval for the correlation between Factor1 and Factor2 is (0.30, 0.51), a range of 0.21. You might need a larger sample to obtain a narrower interval and a more precise estimate.

Output 27.4.1: Quartimin-Rotated Factor Solution with Standard Errors
start example
        Quartimin-Rotated Factor Solution with Standard Errors
                 A nine-variable-three-factor example

                         The FACTOR Procedure
                      Rotation Method: Quartimin

                      Inter-Factor Correlations
                      With 90% confidence limits
                   Estimate/StdErr/LowerCL/UpperCL

                     Factor1         Factor2         Factor3
  Factor1            1.00000         0.41283         0.38304
                     0.00000         0.06267         0.06060
                      .              0.30475         0.27919
                      .              0.51041         0.47804
  Factor2            0.41283         1.00000         0.47006
                     0.06267         0.00000         0.05116
                     0.30475          .              0.38177
                     0.51041          .              0.54986
  Factor3            0.38304         0.47006         1.00000
                     0.06060         0.05116         0.00000
                     0.27919         0.38177          .
                     0.47804         0.54986          .
end example
 

The coverage displays in Output 27.4.2 show that Test8, Test7, and Test9 have salient relationships with Factor1. The coverage displays are either '0*[ ]' or '[ ]*0', indicating that the entire 90% confidence intervals for the corresponding loadings lie beyond the salience value of 0.45 in magnitude. On the other hand, the coverage display for Test3 on Factor1 is '0[ ]*'. This indicates that even though the loading estimate is significantly larger than zero, it is not large enough to be salient. Similarly, Test3, Test2, and Test1 have salient relationships with Factor2, while Test5 and Test4 have salient relationships with Factor3. For Test6, the relationship with Factor3 is somewhat ambiguous; the 90% confidence interval covers values between approximately 0.40 and 0.64. This means that the population value might be smaller or larger than 0.45, so the evidence for a salient relationship is only marginal.
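One way to read the coverage displays, offered as an interpretation consistent with the limits in Output 27.4.2 rather than a formal definition, is as an ordering on the number line: the brackets mark the lower and upper 90% confidence limits, 0 marks zero, and * marks the salience value carrying the sign of the loading. The symbols s, L, and U are introduced here only for illustration, with s = 0.45.

```latex
\begin{align*}
\texttt{0*[ ]}: &\quad 0 < s < L < U   && \text{Test8 on Factor1: } 0.45 < 0.80271 < 0.91286 \\
\texttt{[ ]*0}: &\quad L < U < -s < 0  && \text{Test9 on Factor1: } -0.85291 < -0.72180 < -0.45 \\
\texttt{0[ ]*}: &\quad 0 < L < U < s   && \text{Test3 on Factor1: } 0.18464 < 0.36478 < 0.45 \\
\texttt{0[*]}:  &\quad 0 < L < s < U   && \text{Test6 on Factor3: } 0.40893 < 0.45 < 0.63578
\end{align*}
```

Under this reading, a loading is clearly salient only when the asterisk falls between zero and the whole interval, as in the first two rows.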

Output 27.4.2: Interpretations of Factors Using Rotated Factor Pattern
start example
                 A nine-variable-three-factor example

                      Rotation Method: Quartimin

    Rotated Factor Pattern (Standardized Regression Coefficients)
            With 90% confidence limits; Cover |*| = 0.45?
           Estimate/StdErr/LowerCL/UpperCL/Coverage Display

                     Factor1         Factor2         Factor3
  test8              0.86810        -0.05045         0.00114
                     0.03282         0.03185         0.03087
                     0.80271        -0.10265        -0.04959
                     0.91286         0.00204         0.05187
                        0*[]            *[0]            [0]*
  test7              0.73204         0.27296         0.01098
                     0.04434         0.05292         0.03838
                     0.65040         0.18390        -0.05211
                     0.79697         0.35758         0.07399
                        0*[]            0[]*            [0]*
  test9             -0.79654        -0.01230        -0.17307
                     0.03948         0.04225         0.04420
                    -0.85291        -0.08163        -0.24472
                    -0.72180         0.05715        -0.09955
                        []*0            *[0]            *[]0
  test3              0.27715         0.91156        -0.19727
                     0.05489         0.04877         0.02981
                     0.18464         0.78650        -0.24577
                     0.36478         0.96481        -0.14778
                        0[]*            0*[]            *[]0
  test2              0.01063         0.71540         0.20500
                     0.05060         0.05148         0.05496
                    -0.07248         0.61982         0.11310
                     0.09359         0.79007         0.29342
                        [0]*            0*[]            0[]*
  test1             -0.07356         0.63815         0.13983
                     0.04245         0.05380         0.05597
                    -0.14292         0.54114         0.04682
                    -0.00348         0.71839         0.23044
                        *[]0            0*[]            0[]*
  test5              0.00863         0.03234         0.91282
                     0.04394         0.04387         0.04509
                    -0.06356        -0.03986         0.80030
                     0.08073         0.10421         0.96323
                        [0]*            [0]*            0*[]
  test4              0.22357        -0.07576         0.67925
                     0.05956         0.03640         0.05434
                     0.12366        -0.13528         0.57955
                     0.31900        -0.01569         0.75891
                        0[]*            *[]0            0*[]
  test6             -0.04295         0.21911         0.53183
                     0.05114         0.07481         0.06905
                    -0.12656         0.09319         0.40893
                     0.04127         0.33813         0.63578
                        *[0]            0[]*            0[*]
end example
 

For oblique factor solutions, some researchers prefer to examine the factor structure loadings, which represent correlations, for determining salient relationships. In Output 27.4.3, the factor structure loadings and the associated standard error estimates and coverage displays are shown. The interpretations based on the factor structure matrix do not change much except for Test3 and Test9. Test9 now has a salient correlation with Factor3, and Test3 has salient correlations with both Factor1 and Factor2. Fortunately, there are still tests that have salient correlations with only one of Factor1 and Factor2 (but not both), which makes the interpretation of the factors less problematic.

Output 27.4.3: Interpretations of Factors Using Factor Structure
start example
                 A nine-variable-three-factor example

                      Rotation Method: Quartimin

                   Factor Structure (Correlations)
            With 90% confidence limits; Cover |*| = 0.45?
           Estimate/StdErr/LowerCL/UpperCL/Coverage Display

                     Factor1         Factor2         Factor3
  test8              0.84771         0.30847         0.30994
                     0.02871         0.06593         0.06263
                     0.79324         0.19641         0.20363
                     0.88872         0.41257         0.40904
                        0*[]            0[]*            0[]*
  test7              0.84894         0.58033         0.41970
                     0.02688         0.05265         0.06060
                     0.79834         0.48721         0.31523
                     0.88764         0.66041         0.51412
                        0*[]            0*[]            0[*]
  test9             -0.86791        -0.42248        -0.48396
                     0.02522         0.06187         0.05504
                    -0.90381        -0.51873        -0.56921
                    -0.81987        -0.31567        -0.38841
                        []*0            [*]0            [*]0
  test3              0.57790         0.93325         0.33738
                     0.05069         0.02953         0.06779
                     0.48853         0.86340         0.22157
                     0.65528         0.96799         0.44380
                        0*[]            0*[]            0[]*
  test2              0.38449         0.81615         0.54535
                     0.06143         0.03106         0.05456
                     0.27914         0.75829         0.44946
                     0.48070         0.86126         0.62883
                        0[*]            0*[]            0[*]
  test1              0.24345         0.67351         0.41162
                     0.06864         0.04284         0.05995
                     0.12771         0.59680         0.30846
                     0.35264         0.73802         0.50522
                        0[]*            0*[]            0[*]
  test5              0.37163         0.46498         0.93132
                     0.06092         0.04979         0.03277
                     0.26739         0.37923         0.85159
                     0.46727         0.54282         0.96894
                        0[*]            0[*]            0*[]
  test4              0.45248         0.33583         0.72927
                     0.05876         0.06289         0.04061
                     0.35072         0.22867         0.65527
                     0.54367         0.43494         0.78941
                        0[*]            0[]*            0*[]
  test6              0.25122         0.45137         0.61837
                     0.07140         0.05858         0.05051
                     0.13061         0.34997         0.52833
                     0.36450         0.54232         0.69465
                        0[]*            0[*]            0*[]
end example
 



SAS/STAT 9.1 User's Guide, Volume 2 (2004)