Class 10: Factor Analysis I


Upload: kenny-reyes

Post on 26-May-2017


Page 1: Class 10 Factor Analysis I

What is Factor Analysis?
- Factor analysis examines the interrelationships among a large number of variables and then attempts to explain them in terms of their common underlying dimensions.
  - Common underlying dimensions are referred to as factors.
- Interdependence technique
  - No I.V.s or D.V.s
  - All variables are considered simultaneously

Page 2: Class 10 Factor Analysis I

Why do Factor Analysis?
- Data summarization
  - Research question is to better understand the interrelationships among the variables
  - Identify latent dimensions within the data set
  - Identification and understanding of these underlying dimensions is the goal
- Data reduction
  - Discover underlying dimensions to reduce the data to fewer variables so all dimensions are represented in subsequent analyses
  - Surrogate variables, aggregated scales, factor scores
- Precursor to subsequent MV techniques
  - Data summarization: latent dimensions, a research question answered with other MV techniques
  - Data reduction: avoid multicollinearity problems; improve reliability of aggregated scales

Page 3: Class 10 Factor Analysis I

Assumptions
- Variables must be interrelated
  - 20 unrelated variables = 20 factors
  - Matrix must have a sufficient number of correlations
- Some underlying factor structure
- Sample must be homogeneous
- Metric variables assumed
- MV normality not required
- Sample size
  - Minimum 50, prefer 100
  - Minimum 5 observations per item, prefer 10 observations per item

Page 4: Class 10 Factor Analysis I

Types of Factor Analysis
- Exploratory Factor Analysis (EFA)
  - Used to discover underlying structure
  - Principal components analysis (PCA) (Thurstone): treats individual items or measures as though they have no unique error
  - Factor analysis (common factors analysis) (Spearman): treats individual items or measures as having unique error
  - Both PCA and FA give similar answers most of the time
- Confirmatory Factor Analysis (CFA)
  - Used to test whether data fit a priori expectations for data structure
  - Structural equation modeling

Page 5: Class 10 Factor Analysis I

Purpose of EFA
- EFA is a data reduction technique
  - Scientific parsimony
  - Which items are virtually the same thing?
- Objective: simplification of items into a subset of concepts or measures
- Part of construct validation (what are the underlying patterns in the data?)
- EFA assesses dimensionality or homogeneity
- Issues:
  - Use principal components analysis (PCA) or factor analysis (FA)?
  - How many factors?
  - What type of rotation?
  - How to interpret? Loadings and cross-loadings

Page 6: Class 10 Factor Analysis I

Types of EFA
- Principal components analysis
  - A composite of the observed variables as a summary of those variables
  - Assumes no error in items
  - No assumption of an underlying construct
  - Often used in physical science
  - Precise mathematical solutions possible
  - Unity inserted on the diagonal of the matrix
- Factor (or common factors) analysis
  - In SPSS known as principal axis factoring
  - Explains relationships between observed variables in terms of latent variables or factors
  - A factor is a hypothesized construct
  - Assumes error in items
  - Precise math not possible; solved by iteration
  - Communalities (shared variance) on the diagonal

Page 7: Class 10 Factor Analysis I

Basic Logic of EFA
- Start with the items you want to reduce.
- Create a mathematical combination of variables that maximizes the variance you can predict in all variables: the first principal component or factor.
- Form a new combination of items from the residual variance that maximizes the variance you can predict in what is left: the second principal component or factor.
- Continue until all variance is accounted for. Select the minimal number of factors that captures the most variance.
- Interpret the factors. The rotated matrix and loadings are more interpretable.
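The deflation logic above can be sketched with an eigendecomposition: removing the first component from R leaves a residual matrix whose best component is exactly the second component of R. This is an illustrative numpy sketch with a made-up 3-variable correlation matrix, not part of the original slides.

```python
import numpy as np

# Hypothetical correlation matrix for three interrelated measures.
R = np.array([[1.00, 0.60, 0.50],
              [0.60, 1.00, 0.40],
              [0.50, 0.40, 1.00]])

# Eigendecomposition of R: eigenvalues are the variances the components explain.
evals, evecs = np.linalg.eigh(R)
order = np.argsort(evals)[::-1]              # sort descending
evals, evecs = evals[order], evecs[:, order]

# "Remove" the first component from R; what remains is the residual variance.
v1 = evecs[:, [0]]
R_residual = R - evals[0] * (v1 @ v1.T)

# The best component of the residual is exactly the second component of R.
resid_evals = np.sort(np.linalg.eigvalsh(R_residual))[::-1]
print(np.allclose(resid_evals[0], evals[1]))  # → True
```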

Page 8: Class 10 Factor Analysis I

Concepts and Terms
- PCA starts with a data matrix of N persons arranged in rows and k measures (A, B, C, D, ..., k) arranged in columns.
- The objective is to explain the data in less than the total number of items.
- PCA is a method to transform the original set of variables into a new set of principal components that are unrelated to each other.
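A quick numpy sketch (with simulated data, not from the slides) shows the transformation: projecting standardized measures onto the eigenvectors of their correlation matrix yields component scores that are mutually uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: N = 500 persons by k = 4 correlated measures (hypothetical).
N, k = 500, 4
shared = rng.normal(size=(N, 1))
X = shared + 0.8 * rng.normal(size=(N, k))   # a shared source induces correlation
Z = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize each measure

R = np.corrcoef(Z, rowvar=False)             # k x k correlation matrix
evals, evecs = np.linalg.eigh(R)

# Principal component scores: project the standardized data onto the eigenvectors.
scores = Z @ evecs

# The components are unrelated to each other: their correlation matrix is identity.
C = np.corrcoef(scores, rowvar=False)
print(np.allclose(C, np.eye(k), atol=1e-8))  # → True
```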

Page 9: Class 10 Factor Analysis I

Concepts and Terms
- Factor: a linear composite; a way of turning multiple measures into one thing.
- Factor score: a measure of one person's score on a given factor.
- Factor loadings: the correlation of a factor score with an item. Variables with high loadings are the distinguishing features of the factor.

Page 10: Class 10 Factor Analysis I

Concepts and Terms
- Communality (h²): the variance in a given item accounted for by all factors; the sum of squared factor loadings in a row of the factor analysis results. These are placed on the diagonal in common factor analysis.
- Factorially pure: a test that loads on only one factor.
- Scale score: a score for an individual obtained by adding together the items making up a factor.

Page 11: Class 10 Factor Analysis I

The Process
- Because we are trying to reduce the data, we don't want as many factors as items.
- Because each new component or factor is the best linear combination of residual variance, the data can be explained relatively well with far fewer factors than the original number of items.
- When to stop taking additional factors is a difficult decision. Primary methods:
  - Scree rule
  - Kaiser criterion (eigenvalues > 1)

Page 12: Class 10 Factor Analysis I

How Many Factors?
- Scree plot (Cattell): not a test
  - Look for the bend in the plot
  - Include the factor located right at the bend point
- Kaiser (or latent root) criterion
  - Eigenvalues greater than 1
  - Also, 1 is the amount of variance accounted for by a single item (r² = 1.00). If an eigenvalue < 1.00, the factor accounts for less variance than a single item.
  - Tinsley & Tinsley: the Kaiser criterion can underestimate the number of factors
- A priori hypothesized number of factors
- Percent of variance criterion
- Parallel analysis: eigenvalues higher than expected by chance
- Use both, plus theory, to make the determination
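The Kaiser criterion and the parallel-analysis baseline can both be sketched in a few lines; this is an illustrative numpy version (the simulation sizes are arbitrary, not from the slides).

```python
import numpy as np

rng = np.random.default_rng(1)

def retained_kaiser(R):
    """Kaiser criterion: retain components with eigenvalue > 1."""
    return int((np.linalg.eigvalsh(R) > 1.0).sum())

def parallel_means(n, k, reps=200):
    """Parallel analysis baseline: mean eigenvalues of correlation
    matrices computed from pure-noise data of the same size."""
    sims = np.empty((reps, k))
    for i in range(reps):
        noise = rng.normal(size=(n, k))
        ev = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))
        sims[i] = np.sort(ev)[::-1]
    return sims.mean(axis=0)

# Random data carries no structure: its mean eigenvalues straddle 1 and sum
# to k (the trace of a correlation matrix). Real eigenvalues above these
# baselines exceed what chance alone would produce.
baseline = parallel_means(n=100, k=7)
print(round(baseline.sum(), 3))  # → 7.0
```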

Page 13: Class 10 Factor Analysis I

Example
R matrix (correlation matrix):

       BlPr   LSat   Chol   LStr   BdWt   JSat   JStr
BlPr   1.00
LSat   -.18   1.00
Chol    .65   -.17   1.00
LStr    .15   -.45    .22   1.00
BdWt    .45   -.11    .52    .16   1.00
JSat   -.21    .85   -.12   -.35   -.05   1.00
JStr    .19   -.21    .02    .79    .19   -.35   1.00

Principal Components Analysis (PCA)
Initial Statistics:
Variable  Communality  *  Factor  Eigenvalue   %Var   Cum%
BLPR      1.00000      *    1      2.85034     40.7   40.7
LSAT      1.00000      *    2      1.74438     24.9   65.6
CHOL      1.00000      *    3      1.16388     16.6   82.3
LSTR      1.00000      *    4       .56098      8.0   90.3
BDWT      1.00000      *    5       .44201      6.3   96.6
JSAT      1.00000      *    6       .20235      2.9   99.5
JSTR      1.00000      *    7       .03607       .5  100.0
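Entering the slide's R matrix into numpy reproduces the extraction step: the eigenvalues sum to the number of variables, and three exceed the Kaiser cutoff, lining up (to rounding) with the slide's 2.850, 1.744, and 1.164.

```python
import numpy as np

# The slide's correlation matrix, mirrored into full symmetric form.
# Order: BlPr, LSat, Chol, LStr, BdWt, JSat, JStr.
R = np.array([
    [1.00, -.18,  .65,  .15,  .45, -.21,  .19],
    [-.18, 1.00, -.17, -.45, -.11,  .85, -.21],
    [ .65, -.17, 1.00,  .22,  .52, -.12,  .02],
    [ .15, -.45,  .22, 1.00,  .16, -.35,  .79],
    [ .45, -.11,  .52,  .16, 1.00, -.05,  .19],
    [-.21,  .85, -.12, -.35, -.05, 1.00, -.35],
    [ .19, -.21,  .02,  .79,  .19, -.35, 1.00],
])

evals = np.sort(np.linalg.eigvalsh(R))[::-1]   # eigenvalues, largest first

# Sum equals the trace (k = 7); three components pass the Kaiser criterion.
print(round(evals.sum(), 2), int((evals > 1).sum()))  # → 7.0 3
```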

Page 14: Class 10 Factor Analysis I

Example
Initial Statistics:
Variable  Communality  *  Factor  Eigenvalue   %Var   Cum%
BLPR      1.00000      *    1      2.85034     40.7   40.7
LSAT      1.00000      *    2      1.74438     24.9   65.6
CHOL      1.00000      *    3      1.16388     16.6   82.3
LSTR      1.00000      *    4       .56098      8.0   90.3
BDWT      1.00000      *    5       .44201      6.3   96.6
JSAT      1.00000      *    6       .20235      2.9   99.5
JSTR      1.00000      *    7       .03607       .5  100.0

Factor Matrix (Unrotated):
        Factor 1   Factor 2   Factor 3   ...  Fac7
LSTR     .73738    -.32677     .47575
LSAT    -.71287     .38426     .52039
JSAT    -.70452     .42559     .48553
JSTR     .64541    -.32867     .62912
CHOL     .54945     .68694    -.10453
BDWT     .48867     .60471     .13043
BLPR     .58722     .60269    -.08534

Eigenvalue  2.85034   1.74438   1.16388

Page 15: Class 10 Factor Analysis I

Example
Final Statistics:
Variable  Communality  *  Factor  Eigenvalue   %Var   Cum%
BLPR      .71533       *    1      2.85034     40.7   40.7
LSAT      .92665       *    2      1.74438     24.9   65.6
CHOL      .78470       *    3      1.16388     16.6   82.3
LSTR      .87684       *
BDWT      .62149       *
JSAT      .91321       *
JSTR      .92037       *

VARIMAX Rotated Factor Matrix:
        Factor 1   Factor 2   Factor 3     h²
CHOL     .87987    -.10246    -.00574    .78470
BLPR     .83043    -.14875     .05988    .71533
BDWT     .76940     .05630     .16234    .62149
LSAT    -.09806     .94430    -.15917    .92665
JSAT    -.05790     .93376    -.19479    .91321
JSTR     .06542    -.10717     .95110    .92036
LSTR     .12381    -.26465     .88965    .87684

Eigenvalue   2.0883    1.8809    1.7893

Page 16: Class 10 Factor Analysis I

Scree Plot

[Figure: scree plot for the example data; eigenvalues (y-axis, 0.0 to 3.5) plotted against factor number (x-axis, 1 to 7).]

"Scree" comes from a word for the loose rock and debris at the base of a cliff!

Page 17: Class 10 Factor Analysis I

Information from EFA
              FACTOR
Msr     F1     F2     F3     h²
a      .60   -.06    .02    .36
b      .81    .12   -.03    .67
c      .77    .03    .08    .60
d      .01    .65   -.04    .42
e      .03    .80    .07    .65
f      .12    .67   -.05    .47
g      .19   -.02    .68    .50
h      .08   -.10    .53    .30
i      .26   -.13    .47    .31
Sum Sq Ldng   1.76   1.56    .98   Total
% Variance    .195   .173   .109   47.7%
            (1.76/9) (1.56/9) (.98/9)

A factor loading is the correlation between a factor and an item.
When factors are orthogonal, squared factor loadings are the amount of variance in one variable explained by that factor (F1 explains 36% of the variance in Msr a; F3 explains 46% of the variance in Msr g).

Page 18: Class 10 Factor Analysis I

Information from EFA
Msr     F1     F2     F3     h²
a      .60   -.06    .02    .36
b      .81    .12   -.03    .67
...    ...    ...    ...    ...
i      .26   -.13    .47    .31
Sum Sq Ldng   1.76   1.56    .98   Total
% Variance    .195   .173   .109   47.7%
            (1.76/9) (1.56/9) (.98/9)

Eigenvalue: the sum of squared loadings down a column (associated with a factor); the total variance in all variables explained by one factor. Factors with eigenvalues less than 1 predict less than the variance of 1 item.

Communality (h²): the variance in a given item accounted for by all factors; the sum of squared loadings across a row. Will equal 1 if you retain all possible factors.
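These row and column sums are easy to verify from the slide's loading matrix. A numpy sketch follows; small differences from the slide's totals are expected because the printed loadings are rounded to two decimals.

```python
import numpy as np

# Loading matrix from the slide: 9 measures (a..i) by 3 factors.
L = np.array([
    [ .60, -.06,  .02],   # a
    [ .81,  .12, -.03],   # b
    [ .77,  .03,  .08],   # c
    [ .01,  .65, -.04],   # d
    [ .03,  .80,  .07],   # e
    [ .12,  .67, -.05],   # f
    [ .19, -.02,  .68],   # g
    [ .08, -.10,  .53],   # h
    [ .26, -.13,  .47],   # i
])

h2 = (L**2).sum(axis=1)     # communality: squared loadings across each row
ssl = (L**2).sum(axis=0)    # sum of squared loadings down each column
total = h2.mean()           # average communality = total variance explained

print(np.round(h2[:2], 2))  # → [0.36 0.67], the slide's h² for a and b
print(np.round(ssl, 2))     # close to the slide's 1.76, 1.56, .98
print(round(total, 3))      # about .475, i.e. the slide's 47.7%
```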

Page 19: Class 10 Factor Analysis I

Information from EFA
              FACTOR
Msr     F1     F2     F3     h²
a      .60   -.06    .02    .36
b      .81    .12   -.03    .67
...    ...    ...    ...    ...
i      .26   -.13    .47    .31
Sum Sq Ldng   1.76   1.56    .98   Total
% Variance    .195   .173   .109   47.7%
            (1.76/9) (1.56/9) (.98/9)

Average of all communalities (Σh²/k) = proportion of variance in all variables explained by all factors.

If all variables were reproduced perfectly by the factors, the correlation between two original variables would equal the sum of the products of their factor loadings. When reproduction is not perfect, this gives an estimate of the correlation.

e.g., r_ab = (.60)(.81) + (-.06)(.12) + (.02)(-.03) ≈ .48
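With orthogonal factors, the whole reproduced correlation matrix is just the loading matrix times its transpose; a numpy check of the slide's r_ab example:

```python
import numpy as np

# Rotated loading matrix from the slide (rows a..i, columns F1..F3).
L = np.array([
    [ .60, -.06,  .02],
    [ .81,  .12, -.03],
    [ .77,  .03,  .08],
    [ .01,  .65, -.04],
    [ .03,  .80,  .07],
    [ .12,  .67, -.05],
    [ .19, -.02,  .68],
    [ .08, -.10,  .53],
    [ .26, -.13,  .47],
])

# Reproduced correlations: entry (a, b) is the sum of the products of
# a's and b's loadings across the factors.
R_reproduced = L @ L.T

r_ab = R_reproduced[0, 1]   # (.60)(.81) + (-.06)(.12) + (.02)(-.03)
print(round(r_ab, 2))       # → 0.48
```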

Page 20: Class 10 Factor Analysis I

Information from EFA
Msr     F1     F2     F3     h²
a      .60   -.06    .02    .36
b      .81    .12   -.03    .67
...    ...    ...    ...    ...
i      .26   -.13    .47    .31
Sum Sq Ldng   1.76   1.56    .98   Total
% Variance    .195   .173   .109   47.7%
            (1.76/9) (1.56/9) (.98/9)

1 - h² is the uniqueness: the variance of an item not shared with other items. Unique variance could be random error or systematic.

The factor matrix above is after rotation. Eigenvalues are computed on the unrotated and unreduced factor loading matrix because we are interested in the total variance accounted for in the data. The eigenvalues and % variance accounted for reported by SPSS are not reordered after rotation.

Page 21: Class 10 Factor Analysis I

Important Properties of PCA
- Each factor in turn maximizes the variance explained from an R matrix.
- For any number of factors obtained, PCs maximize the variance explained.
- The amount of variance explained by each PC equals the corresponding characteristic root (eigenvalue).
- All characteristic roots of PCs are positive.
- The number of PCs derived equals the number of factors needed to explain all the variance in R.
- The sum of the characteristic roots equals the sum of the diagonal elements of R.

Page 22: Class 10 Factor Analysis I

Rotations
- All original PC and PF solutions are orthogonal.
- Once you obtain the minimal number of factors, you have to interpret them.
- Interpreting original solutions is difficult; rotation aids interpretation.
- You are looking for simple structure:
  - Component loadings should be very high for a few variables and near 0 for the remaining variables
  - Each variable should load highly on only 1 component

Unrotated Matrix      Rotated Matrix
Var    F1     F2       F1     F2
a     .75    .63      .14    .95
b     .69    .57      .14    .90
c     .80    .49      .18    .92
d     .85   -.42      .94    .09
e     .76   -.42      .92    .07

Page 23: Class 10 Factor Analysis I

Rotation
- After rotation, the variance accounted for by a factor is spread out. The first factor no longer accounts for the maximum variance possible; the others get more variance. The total variance accounted for is the same.
- Two types of rotation:
  - Orthogonal (factors uncorrelated)
  - Oblique (factors correlated)

Page 24: Class 10 Factor Analysis I

Rotation
- Orthogonal rotation (rigid, 90 degrees): PCs or PFs remain uncorrelated after transformation.
  - Varimax: simplifies column weights toward 1s and 0s, so a factor has some items loading highly while the others don't load. Not appropriate if you expect a single factor.
  - Quartimax: simplifies toward 1s and 0s within a row, so an item loads high on 1 factor and almost 0 on the others. Appropriate if you expect a single general factor.
  - Equimax: a compromise between the varimax and quartimax rotations.
  - In practice, the choice of rotation makes little difference.
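The varimax idea can be sketched with the standard Kaiser algorithm (a minimal implementation, not SPSS's exact routine), applied to the unrotated two-factor matrix shown on the earlier "Rotations" slide:

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-8):
    """Minimal Kaiser varimax: find an orthogonal T that maximizes the
    variance of the squared loadings within each column of L @ T."""
    p, k = L.shape
    T = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        # Orthogonal Procrustes step toward the varimax criterion.
        B = L.T @ (Lr**3 - (gamma / p) * Lr @ np.diag((Lr**2).sum(axis=0)))
        u, s, vt = np.linalg.svd(B)
        T = u @ vt
        if s.sum() < d * (1 + tol):
            break
        d = s.sum()
    return L @ T

# Unrotated two-factor solution from the slide (variables a..e).
L = np.array([[.75, .63], [.69, .57], [.80, .49], [.85, -.42], [.76, -.42]])
rotated = varimax(L)

# After rotation each variable loads strongly on one factor and weakly on
# the other (the signs of whole columns are arbitrary).
print(np.round(np.abs(rotated), 2))
```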

Page 25: Class 10 Factor Analysis I

Rotation
- Oblique or correlated components (less or more than 90 degrees): accounts for the same % of variance, but the factors are correlated.
  - Some say it is not meaningful with PCA.
  - Many factors are theoretically related, so the rotation method should not "force" orthogonality.
  - Allows the loadings to more closely match simple structure.
  - Correlated solutions will get you closer to simple structure.
  - Oblimin (Kaiser) and promax are good.
  - Provides a structure matrix of loadings and a pattern matrix of partial weights: which to interpret?

Page 26: Class 10 Factor Analysis I

Orthogonal Rotation
Unrotated Matrix      Rotated Matrix
Var    F1     F2       F1     F2
a     .75    .63      .14    .95
b     .69    .57      .14    .90
c     .80    .49      .18    .92
d     .85   -.42      .94    .09
e     .76   -.42      .92    .07

[Figure: the five variables plotted in the F1-F2 plane (axes from -1.00 to 1.00), with orthogonally rotated axes RF1 and RF2 positioned near the two clusters of points a, b, c and d, e.]

Page 27: Class 10 Factor Analysis I

Simple Structure (Thurstone)
(1) Each row of the factor matrix should have at least one 0 loading.
(2) The number of items with 0 loadings equals the number of factors; each column has 1 or more 0 loadings.
(3) Items load highly on one factor or the other.
(4) If there are more than 4 factors, a large portion of the items should have zero loadings.
(5) For every pair of columns, there should be few cross-loadings.
(6) Few if any negative loadings.

Page 28: Class 10 Factor Analysis I

Simple Structure
       Factor
Msr    1    2    3
a      x    0    0
b      x    0    0
c      x    0    0
d      0    x    0
e      0    x    0
f      0    x    0
g      0    0    x
h      0    0    x
i      0    0    x
j      0    0    x

Page 29: Class 10 Factor Analysis I

Oblique Rotation
Example:
Unrotated Matrix      Rotated Matrix
Var    F1     F2       F1     F2
a     .75    .63      .04    .98
b     .69    .57      .02    .99
c     .80    .49      .01    .97
d     .85   -.42      .99    .01
e     .76   -.42      .98    .02

[Figure: the same five variables in the F1-F2 plane (axes from -1.00 to 1.00), with oblique rotated axes RF1 and RF2 drawn directly through each cluster.]

Page 30: Class 10 Factor Analysis I

Orthogonal or Oblique Rotation?
- Nunnally suggests using orthogonal as opposed to oblique rotations:
  - Orthogonal is simpler
  - Leads to the same conclusions
  - Oblique can be misleading
- Ford et al. suggest using oblique unless the orthogonality assumption is tenable.

Page 31: Class 10 Factor Analysis I

Interpretation
- Factors are usually interpreted by observing which variables load highest on each factor.
  - Set a priori criteria for loadings (minimum .3 or higher).
- Name the factor. Always provide the factor loading matrix in the study.
- Cross-loadings are problematic.
  - Set a priori criteria for a "large" cross-loading.
  - Decide a priori what you will do about them.
- Factor loadings or summated scales are used to define the new scale. You can go back to the correlation matrix rather than relying only on the factor loadings; loadings can be inflated.

Page 32: Class 10 Factor Analysis I

PCA and FA
- PCA: no constructs of theoretical meaning assumed; a simple mechanical linear combination (1s on the diagonal of R).
- FA: assumes underlying latent constructs and allows for measurement error (communalities on the diagonal of R).
  - Also called PAF or common factors analysis.
- PCA uses all the variance; FA uses ONLY shared variance.
- In FA you can have indeterminate (unsolvable) solutions. You have to iterate (the computer makes its best "guess") to get the solutions.

Page 33: Class 10 Factor Analysis I

FA
- Also known as principal axis factoring or common factor analysis.
- Steps:
  - Estimate the communalities of the variables (shared variance).
  - Substitute the communalities in place of the 1s on the diagonal of R.
  - Perform a principal components analysis on the reduced matrix.
  - Iterated FA:
    - Estimate h²
    - Solve for the factor model
    - Calculate new communalities
    - Substitute the new estimates of h² into the matrix and redo
    - Iterate until the communalities don't change much
  - Rotate for interpretation.

Page 34: Class 10 Factor Analysis I

Estimating Communalities
- Highest correlation of a given variable with the other variables in the data set.
- Squared multiple correlations (SMCs) of each variable predicted by all other variables in the data set.
- Reliability of the variable.
- Because you are estimating, and the factors are no longer combinations of actual variables, you can get funny results:
  - Communalities > 1.00
  - Negative eigenvalues
  - Negative uniqueness

Page 35: Class 10 Factor Analysis I

Example FA
R matrix (correlation matrix with h² on the diagonal):

       BlPr   LSat   Chol   LStr   BdWt   JSat   JStr
BlPr    .54
LSat   -.18    .89
Chol    .65   -.17    .67
LStr    .15   -.45    .22    .87
BdWt    .45   -.11    .52    .16    .41
JSat   -.21    .85   -.12   -.35   -.05    .86
JStr    .19   -.21    .02    .79    .19   -.35    .87

Principal Axis Factoring (PAF)
Initial Statistics:
Variable  Communality  *  Factor  Eigenvalue   %Var   Cum%
BLPR      .53859       *    1      2.85034     40.7   40.7
LSAT      .88573       *    2      1.74438     24.9   65.6
CHOL      .66685       *    3      1.16388     16.6   82.3
LSTR      .87187       *    4       .56098      8.0   90.3
BDWT      .41804       *    5       .44201      6.3   96.6
JSAT      .86448       *    6       .20235      2.9   99.5
JSTR      .86966       *    7       .03607       .5  100.0

Page 36: Class 10 Factor Analysis I

FA
Principal Axis Factoring (PAF)
Initial Statistics:
Variable  Communality  *  Factor  Eigenvalue  %Var   Cum%
BLPR        .53859     *    1       2.85034   40.7   40.7
LSAT        .88573     *    2       1.74438   24.9   65.6
CHOL        .66685     *    3       1.16388   16.6   82.3
LSTR        .87187     *    4        .56098    8.0   90.3
BDWT        .41804     *    5        .44201    6.3   96.6
JSAT        .86448     *    6        .20235    2.9   99.5
JSTR        .86966     *    7        .03607     .5  100.0

Factor Matrix (Unrotated):
        Factor 1  Factor 2  Factor 3
LSAT    -.75885    .31104    .54455
LSTR     .70084   -.20961    .36388
JSAT    -.70038    .31502    .39982
JSTR     .68459   -.29044    .66213
CHOL     .48158    .74399   -.07267
BLPR     .48010    .56066   -.02253
BDWT     .36699    .47668    .08381

Page 37: Class 10 Factor Analysis I

FA
Principal Axis Factoring (PAF)
Final Statistics:
Variable  Communality  *  Factor  Eigenvalue  %Var   Cum%
BLPR        .54535     *    1       2.62331   37.5   37.5
LSAT        .96913     *    2       1.41936   20.3   57.8
CHOL        .79071     *    3       1.04004   14.9   72.6
LSTR        .66752     *
BDWT        .36893     *
JSAT        .74962     *
JSTR        .99144     *
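The final communalities come from iteration: put the current communality estimates on the diagonal of R, extract k factors from that reduced matrix, recompute communalities as row sums of squared loadings, and repeat until they stabilize. A simplified sketch of that loop (not the exact SPSS routine, so values may differ slightly from the output above):

```python
import numpy as np

# Correlation matrix from the lecture example (BlPr, LSat, Chol, LStr, BdWt, JSat, JStr)
R = np.array([
    [1.00, -.18,  .65,  .15,  .45, -.21,  .19],
    [-.18, 1.00, -.17, -.45, -.11,  .85, -.21],
    [ .65, -.17, 1.00,  .22,  .52, -.12,  .02],
    [ .15, -.45,  .22, 1.00,  .16, -.35,  .79],
    [ .45, -.11,  .52,  .16, 1.00, -.05,  .19],
    [-.21,  .85, -.12, -.35, -.05, 1.00, -.35],
    [ .19, -.21,  .02,  .79,  .19, -.35, 1.00],
])

k = 3                                        # number of factors retained
h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))   # start from SMCs (initial communalities)
for _ in range(200):
    Rr = R.copy()
    np.fill_diagonal(Rr, h2)                 # reduced correlation matrix
    vals, vecs = np.linalg.eigh(Rr)
    idx = np.argsort(vals)[::-1][:k]         # k largest eigenvalues
    load = vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))
    h2_new = np.sum(load**2, axis=1)         # updated communalities
    converged = np.max(np.abs(h2_new - h2)) < 1e-6
    h2 = h2_new
    if converged:
        break
print(np.round(h2, 3))
```

This is also where the "funny results" mentioned earlier can appear: nothing in the iteration forces the updated communalities to stay below 1.00.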

Rotated Factor Matrix (VARIMAX):
        Factor 1  Factor 2  Factor 3
LSAT     .96846   -.10483   -.14223
JSAT     .83532   -.07092   -.21643
CHOL    -.08425    .88520   -.00547
BLPR    -.11739    .72364    .08898
BDWT    -.00430    .59379    .12778
JSTR    -.10474    .07011    .98770
LSTR    -.28514    .15273    .75026
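VARIMAX is an orthogonal rotation that pushes each factor's squared loadings toward a few large values and many near-zero ones, which is what makes the rotated matrix above interpretable. Here is a common SVD-based sketch of Kaiser's varimax applied to the unrotated loadings (an illustrative implementation, not the SPSS routine, so the result may differ in column order and sign):

```python
import numpy as np

def varimax(A, tol=1e-8, max_iter=500):
    """Kaiser's varimax rotation (orthogonal), standard SVD-based update."""
    p, k = A.shape
    T = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = A @ T
        u, s, vt = np.linalg.svd(
            A.T @ (L**3 - L @ np.diag(np.sum(L**2, axis=0)) / p)
        )
        T = u @ vt                      # T stays orthogonal
        d_new = s.sum()
        if d != 0.0 and d_new < d * (1 + tol):
            break
        d = d_new
    return A @ T

# Unrotated loadings from the slide above (rows: LSat, LStr, JSat, JStr, Chol, BlPr, BdWt)
A = np.array([
    [-.75885,  .31104,  .54455],
    [ .70084, -.20961,  .36388],
    [-.70038,  .31502,  .39982],
    [ .68459, -.29044,  .66213],
    [ .48158,  .74399, -.07267],
    [ .48010,  .56066, -.02253],
    [ .36699,  .47668,  .08381],
])
B = varimax(A)
print(np.round(B, 3))
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) is unchanged; only the distribution of loading across factors moves.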

Page 38: Class 10 Factor Analysis I

Logic of FA

BlPr  LSat  Chol  LStr  BdWt  JSat  JStr
How many? What are the factors?

What we found:

BlPr  LSat  Chol  LStr  BdWt  JSat  JStr
(grouped into three factors: BlPr, Chol, BdWt; LSat, JSat; LStr, JStr)

Page 39: Class 10 Factor Analysis I

PCA vs. FA
Pros & Cons:
– Pro PCA: has solvable equations. “Math is right.”
– Con PCA: lumping garbage together. Also, no underlying concepts.
– Pro FA: considers the role of measurement error, gets at concepts.
– Con FA: doing mathematical gymnastics.
Practically: usually not much difference
– PCA will tend to converge more consistently
– FA is more meaningful conceptually

Page 40: Class 10 Factor Analysis I

PCA vs. FA
Situations where you might want to use FA:
– Where there are 12 or fewer variables (the diagonal will have a large impact)
– Where the correlations between the variables are small, so the diagonals will have a large impact
If you have clear factor structure, it won’t make much difference
Otherwise:
– PCA will tend to overfactor
– If doing exploratory analysis, you may not mind overfactoring

Page 41: Class 10 Factor Analysis I

Using FA Results
Single surrogate measure – choose a single item with a high loading to represent the factor
Summated Scale*
– Form a composite from items loading on the same factor
– Average all items that load on a factor (unit weighting)
– Calculate the alpha for the reliability
– Name the scale/construct
Factor Scores
– Composite measures for each factor are computed for each subject
– Based on all factor loadings for all items
– Not easily replicated
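The summated-scale steps can be sketched directly: average the items that load on a factor (unit weighting), then compute Cronbach's alpha as k/(k−1) × (1 − Σ item variances / variance of the total). The two items and responses below are simulated for illustration, not real scale data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Simulated responses: two hypothetical items driven by one latent trait
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
items = np.column_stack([latent + rng.normal(scale=0.5, size=200) for _ in range(2)])

scale = items.mean(axis=1)     # unit-weighted summated scale, one score per subject
alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

Unlike factor scores, this composite uses only the item assignments, so anyone with the same items can reproduce it exactly.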

Page 42: Class 10 Factor Analysis I

Reporting
If you create a factor-based scale, describe the process
For a factor analytic study, report:
– Theoretical rationale for EFA
– Detailed description of subjects and items, including descriptive stats
– Correlation matrix
– Methods used (PCA/FA, communality estimates, factor extraction, rotation)
– Criteria employed for number of factors and meaningful loadings
– Factor matrix (aka pattern matrix)

Page 43: Class 10 Factor Analysis I

Confirmatory Factor Analysis
Part of the construct validation process (do the data conform to expectations regarding the underlying patterns?)
Use SEM packages to perform CFA
EFA with a specified number of factors as the criterion is NOT a CFA
Basically, start with a correlation matrix and expected relationships
Look at whether the expected relationships can reproduce the correlation matrix well
Tested with chi-square goodness of fit. If significant, the data don’t fit the expected structure. No confirmation.
Alternative measures of fit are available.

Page 44: Class 10 Factor Analysis I

Logic of CFA
Let’s say I believe:

BlPr  Chol  BdWt → Phys Hlth    LSat  LStr → Life Happ    JSat  JStr → Job Happ

But the reality is:

BlPr  Chol  BdWt → Phys Hlth    LStr  JStr → Stress    LSat  JSat → Satisfact

Data won’t confirm the expected structure

Page 45: Class 10 Factor Analysis I

Example
R matrix (correlation matrix)

       BlPr  LSat  Chol  LStr  BdWt  JSat  JStr
BlPr   1.00
LSat   -.18  1.00
Chol    .65  -.17  1.00
LStr    .15  -.45   .22  1.00
BdWt    .45  -.11   .52   .16  1.00
JSat   -.21   .85  -.12  -.35  -.05  1.00
JStr    .19  -.21   .02   .79   .19  -.35  1.00

Do the data fit?

BlPr LSat Chol LStr BdWt JSat JStr

Phys Hlth Life Happ Job Happ
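The "do the data fit?" question can be illustrated by building the model-implied correlation matrix Σ = ΛΦΛ′ (plus uniquenesses on the diagonal) from the believed structure and comparing it with R. The ±.8 loadings and .2 factor correlations below are made-up illustration values, not fitted estimates; a real CFA would estimate them and test fit with chi-square:

```python
import numpy as np

# Observed correlations (order: BlPr, LSat, Chol, LStr, BdWt, JSat, JStr)
R = np.array([
    [1.00, -.18,  .65,  .15,  .45, -.21,  .19],
    [-.18, 1.00, -.17, -.45, -.11,  .85, -.21],
    [ .65, -.17, 1.00,  .22,  .52, -.12,  .02],
    [ .15, -.45,  .22, 1.00,  .16, -.35,  .79],
    [ .45, -.11,  .52,  .16, 1.00, -.05,  .19],
    [-.21,  .85, -.12, -.35, -.05, 1.00, -.35],
    [ .19, -.21,  .02,  .79,  .19, -.35, 1.00],
])

# Believed pattern: Phys Hlth = {BlPr, Chol, BdWt}; Life Happ = {LSat, LStr};
# Job Happ = {JSat, JStr}. Loadings and factor correlations are illustrative guesses.
Lam = np.zeros((7, 3))
Lam[[0, 2, 4], 0] = .8            # Phys Hlth indicators
Lam[1, 1], Lam[3, 1] = .8, -.8    # Life Happ: satisfaction up, stress down
Lam[5, 2], Lam[6, 2] = .8, -.8    # Job Happ: satisfaction up, stress down
Phi = np.full((3, 3), .2)         # assumed modest correlations between factors
np.fill_diagonal(Phi, 1.0)

Sigma = Lam @ Phi @ Lam.T         # model-implied correlations
np.fill_diagonal(Sigma, 1.0)      # uniqueness brings the diagonal back to 1

resid = np.abs(R - Sigma)         # where the believed model misses
np.fill_diagonal(resid, 0.0)
print(round(float(resid.max()), 2))
```

The largest residuals sit on pairs like LSat–JSat (.85 observed) and LStr–JStr (.79 observed): exactly the cross-factor correlations the believed three-factor structure cannot reproduce, which is why the data won't confirm it.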