Once it has been determined that factor analysis is suitable for analyzing the data, an appropriate method must be selected. The approach used to derive the weights or factor score coefficients differentiates the various methods of factor analysis. The two basic approaches are principal components analysis and common factor analysis. In principal components analysis, the total variance in the data is considered. The diagonal of the correlation matrix consists of unities, and full variance is brought into the factor matrix. Principal components analysis is recommended when the primary concern is to determine the minimum number of factors that will account for maximum variance in the data for use in subsequent multivariate analysis. The factors are called principal components.
In common factor analysis, the factors are estimated based only on the common variance. Communalities are inserted in the diagonal of the correlation matrix. This method is appropriate when the primary concern is to identify the underlying dimensions and the common variance is of interest. This method is also known as principal axis factoring.
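The difference between the two approaches comes down to what sits on the diagonal of the matrix being factored. The following is a minimal Python sketch of that one step. The three-variable correlation matrix and the communality estimates are hypothetical values chosen for illustration (they are not from the toothpaste example), and the function name `reduced_correlation_matrix` is an assumption of this sketch.

```python
def reduced_correlation_matrix(corr, communalities):
    """Return a copy of corr with the diagonal replaced by communalities."""
    n = len(corr)
    if len(communalities) != n:
        raise ValueError("need one communality per variable")
    return [
        [communalities[i] if i == j else corr[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical correlation matrix with unities on the diagonal.
# Principal components analysis factors this matrix directly.
corr = [
    [1.00, 0.60, 0.30],
    [0.60, 1.00, 0.40],
    [0.30, 0.40, 1.00],
]

# Hypothetical communality estimates, assumed for illustration only.
communalities = [0.45, 0.55, 0.20]

# Common factor analysis factors this reduced matrix instead.
reduced = reduced_correlation_matrix(corr, communalities)
```

The off-diagonal correlations are unchanged; only the diagonal differs, so the total variance being analyzed drops from the number of variables to the sum of the communalities.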
Other approaches for estimating the common factors are also available. These include the methods of unweighted least squares, generalized least squares, maximum likelihood, alpha method, and image factoring. These methods are complex and are not recommended for inexperienced users.
Table 19.3 shows the application of principal components analysis to the toothpaste example. Under “Communalities,” “Initial” column, it can be seen that the communality for each variable, V1 to V6, is 1.0 as unities were inserted in the diagonal of the correlation matrix. The table labeled “Initial Eigenvalues” gives the eigenvalues. The eigenvalues for the factors are, as expected, in decreasing order of magnitude as we go from factor 1 to factor 6. The eigenvalue for a factor indicates the total variance attributed to that factor. The total variance accounted for by all six factors is 6.00, which is equal to the number of variables. Factor 1 accounts for a variance of 2.731, which is (2.731/6) or 45.52 percent of the total variance. Likewise, the second factor accounts for (2.218/6) or 36.97 percent of the total variance, and the first two factors combined account for 82.49 percent of the total variance. Several considerations are involved in determining the number of factors that should be used in the analysis.
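The percentage calculation described above can be sketched in a few lines of Python. Only the two eigenvalues quoted in the text are used; the function name `percent_of_variance` is an assumption of this sketch.

```python
# With standardized variables, total variance equals the number of variables.
n_variables = 6

# The first two eigenvalues quoted in the text.
eigenvalues = [2.731, 2.218]


def percent_of_variance(eigenvalue, n_variables):
    """Share of total variance attributed to one factor, in percent."""
    return eigenvalue / n_variables * 100


percents = [percent_of_variance(ev, n_variables) for ev in eigenvalues]
for k, pct in enumerate(percents, start=1):
    print(f"Factor {k}: {pct:.2f} percent of total variance")
```

Because the eigenvalues are themselves rounded to three decimals, figures computed this way can differ from tabled values in the last decimal place.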
Determine the Number of Factors
It is possible to compute as many principal components as there are variables, but in doing so, no parsimony is gained. In order to summarize the information contained in the original variables, a smaller number of factors should be extracted. The question is, how many? Several procedures have been suggested for determining the number of factors. These include a priori determination and approaches based on eigenvalues, scree plot, percentage of variance accounted for, split-half reliability, and significance tests.
A PRIORI DETERMINATION Sometimes, because of prior knowledge, the researcher knows how many factors to expect and thus can specify the number of factors to be extracted beforehand. The extraction of factors ceases when the desired number of factors has been extracted. Most computer programs allow the user to specify the number of factors, allowing for an easy implementation of this approach.
DETERMINATION BASED ON EIGENVALUES In this approach, only factors with eigenvalues greater than 1.0 are retained; the other factors are not included in the model. An eigenvalue represents the amount of variance associated with the factor. Hence, only factors with a variance greater than 1.0 are included. Factors with variance less than 1.0 are no better than a single variable, because, due to standardization, each individual variable has a variance of 1.0. If the number of variables is less than 20, this approach will result in a conservative number of factors.
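The eigenvalue criterion is simple enough to sketch in Python. In the list below, the first two eigenvalues are those quoted in the text; the remaining four are illustrative assumptions, chosen only so that the six values sum to 6. The function name `retain_by_eigenvalue` is likewise an assumption of this sketch.

```python
def retain_by_eigenvalue(eigenvalues, cutoff=1.0):
    """Return the 1-based positions of factors whose eigenvalue exceeds cutoff."""
    return [k for k, ev in enumerate(eigenvalues, start=1) if ev > cutoff]


# First two values are from the text; the last four are illustrative
# assumptions, chosen only so that the six eigenvalues sum to 6.
eigenvalues = [2.731, 2.218, 0.442, 0.341, 0.183, 0.085]

retained = retain_by_eigenvalue(eigenvalues)
print("Factors retained:", retained)
```

With these values, only the first two factors exceed the cutoff of 1.0.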
DETERMINATION BASED ON SCREE PLOT A scree plot is a plot of the eigenvalues against the number of factors in order of extraction. The shape of the plot is used to determine the number of factors. Typically, the plot has a distinct break between the steep slope of factors with large eigenvalues and a gradual trailing off associated with the rest of the factors. This gradual trailing off is referred to as the scree. Experimental evidence indicates that the point at which the scree begins denotes the true number of factors. Generally, the number of factors determined by a scree plot will be one or a few more than that determined by the eigenvalue criterion.
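A rough text version of a scree plot can be produced without any plotting library. The Python sketch below is illustrative only: the first two eigenvalues are from the text, the other four are assumed values chosen so that the total is 6, and both function names are assumptions of this sketch. Reading the break in the plot remains a visual judgment; the code only lays out the eigenvalues and the drop from each factor to the next.

```python
def text_scree_plot(eigenvalues, width=40):
    """Return one text bar per factor, scaled to the largest eigenvalue."""
    largest = max(eigenvalues)
    lines = []
    for k, ev in enumerate(eigenvalues, start=1):
        bar = "#" * round(ev / largest * width)
        lines.append(f"Factor {k}: {ev:5.3f} {bar}")
    return lines


def successive_drops(eigenvalues):
    """Drop in eigenvalue from each factor to the next."""
    return [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]


# First two values are from the text; the last four are illustrative
# assumptions, chosen only so that the six eigenvalues sum to 6.
eigenvalues = [2.731, 2.218, 0.442, 0.341, 0.183, 0.085]

lines = text_scree_plot(eigenvalues)
drops = successive_drops(eigenvalues)
print("\n".join(lines))
```

For these values the steepest drop falls between the second and third factors, after which the bars trail off gradually.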
DETERMINATION BASED ON PERCENTAGE OF VARIANCE In this approach, the number of factors extracted is determined so that the cumulative percentage of variance extracted by the factors reaches a satisfactory level. What level of variance is satisfactory depends upon the problem. However, it is recommended that the factors extracted should account for at least 60 percent of the variance.
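As a minimal Python sketch of this rule, the function below counts how many leading factors are needed for the cumulative percentage of variance to reach a target. It uses only the two eigenvalues quoted in the text for the toothpaste example; the function name `factors_needed` and its parameters are assumptions of this sketch.

```python
from itertools import accumulate


def factors_needed(eigenvalues, n_variables, target_percent=60.0):
    """Smallest number of leading factors whose cumulative percentage of
    variance reaches target_percent, or None if the listed factors fall short."""
    cumulative = accumulate(ev / n_variables * 100 for ev in eigenvalues)
    for k, pct in enumerate(cumulative, start=1):
        if pct >= target_percent:
            return k
    return None


# The first two eigenvalues quoted in the text, for six variables.
eigenvalues = [2.731, 2.218]
n_factors = factors_needed(eigenvalues, n_variables=6)
print("Factors needed to reach 60 percent:", n_factors)
```

The first factor alone falls short of 60 percent, so two factors are needed under this rule.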
DETERMINATION BASED ON SPLIT-HALF RELIABILITY The sample is split in half and factor analysis is performed on each half. Only factors with high correspondence of factor loadings across the two subsamples are retained.
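The text does not name a measure of correspondence. One possibility, used here purely as an assumption, is the congruence coefficient between the loadings a factor receives in each half: values near 1 indicate nearly proportional loadings. The Python sketch below uses hypothetical loadings for six variables on a single factor; what counts as "high" is left to the researcher.

```python
from math import sqrt


def congruence(loadings_a, loadings_b):
    """Congruence coefficient between two vectors of factor loadings."""
    num = sum(a * b for a, b in zip(loadings_a, loadings_b))
    den = sqrt(sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b))
    return num / den


# Hypothetical loadings of six variables on one factor in each half-sample.
half_1 = [0.93, -0.30, 0.94, -0.34, -0.87, -0.18]
half_2 = [0.90, -0.25, 0.91, -0.30, -0.85, -0.22]

phi = congruence(half_1, half_2)
print(f"Congruence between halves: {phi:.3f}")
```

In practice the factors from the two halves must first be matched, since their order and sign can differ between subsamples.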
DETERMINATION BASED ON SIGNIFICANCE TESTS It is possible to determine the statistical significance of the separate eigenvalues and retain only those factors that are statistically significant. A drawback is that with large samples (size greater than 200), many factors are likely to be statistically significant, although from a practical viewpoint many of these account for only a small proportion of the total variance.
The second column under “Communalities” in Table 19.3 gives relevant information after the desired number of factors has been extracted. The communalities for the variables under “Extraction” are different from those under “Initial” because not all of the variance associated with the variables is explained unless all the factors are retained. The “Extraction Sums of Squared Loadings” give the variances associated with the factors that are retained. Note that these are the same as under “Initial Eigenvalues.” This is always the case in principal components analysis. The percentage variance accounted for by a factor is determined by dividing the associated eigenvalue by the total number of factors (or variables) and multiplying by 100. Thus, the first factor accounts for (2.731/6) × 100 or 45.52 percent of the variance of the six variables. Likewise, the second factor accounts for (2.218/6) × 100 or 36.97 percent of the variance. Interpretation of the solution is often enhanced by a rotation of the factors.
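The relationship between loadings, extraction communalities, and sums of squared loadings can be sketched in Python. The loading matrix below is hypothetical (four variables, two retained factors) and is not taken from Table 19.3. Squared loadings summed across a row give a variable's communality; summed down a column they give the variance associated with a retained factor.

```python
# Hypothetical loadings: one row per variable, one column per retained factor.
loadings = [
    [0.90, 0.10],
    [0.85, -0.20],
    [0.15, 0.88],
    [-0.05, 0.80],
]
n_variables = len(loadings)
n_factors = len(loadings[0])

# Communality of each variable: sum of its squared loadings across factors.
communalities = [sum(l * l for l in row) for row in loadings]

# Sum of squared loadings for each factor: the variance it accounts for.
sums_of_squares = [
    sum(row[j] ** 2 for row in loadings) for j in range(n_factors)
]

# Percentage of total variance accounted for by each retained factor.
percents = [ss / n_variables * 100 for ss in sums_of_squares]

for i, c in enumerate(communalities, start=1):
    print(f"V{i} communality: {c:.4f}")
```

Each communality is below 1.0 because only two factors are retained, and the communalities add up to the same total as the sums of squared loadings.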