![Page 1: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/1.jpg)
Factor analysis
Caroline van Baal
March 3rd 2004, Boulder
![Page 2: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/2.jpg)
Phenotypic Factor Analysis
• (Approximate) description of the relations between different variables
  – Compare to Cholesky decomposition
• Testing of hypotheses on relations between different variables by comparing different (nested) models
  – How many underlying factors?
![Page 3: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/3.jpg)
Factor analysis and related methods
• Data reduction
  – Consider 6 variables: height, weight, arm length, leg length, verbal IQ, performance IQ
  – You expect the first 4 to be correlated, and the last 2 to be correlated, but do you expect high correlations between the first 4 and the last 2?
![Page 4: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/4.jpg)
Data analysis in non-experimental designs using latent constructs
• Principal Components Analysis
• Triangular Decomposition (Cholesky)
• Exploratory Factor Analysis
• Confirmatory Factor Analysis
• Structural Equation Models
![Page 5: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/5.jpg)
Exploratory Factor Analysis
• Account for covariances among observed variables in terms of a smaller number of latent, common factors
• Includes error components for each variable
• x = P * f + u
  – x = observed variables
  – f = latent factors
  – u = unique factors
  – P = matrix of factor loadings
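The step from this model to a covariance structure can be written out as a short derivation. It is a sketch that assumes the common factors f and the unique factors u are uncorrelated, and it borrows the symbols D (factor correlations) and U (diagonal matrix of unique influences) from the EFA equations slide.

```latex
\operatorname{Cov}(x)
  = P\,\operatorname{Cov}(f)\,P' + \operatorname{Cov}(u)
  = P\,D\,P' + U\,U'
```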
![Page 6: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/6.jpg)
[Path diagram: Factor 1 (IQ, “g”), marked 1, with the subtests SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]
![Page 7: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/7.jpg)
[Path diagram: Factor 1 (verbal) and Factor 2 (performance), each marked 1, with the subtests SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]
![Page 8: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/8.jpg)
EFA equations
• C = P * D * P’ + U * U’
• C = observed covariance matrix
  – nvar by nvar, symmetric
• P = factor loadings
  – nvar by nfac, full
• D = correlations between factors
  – nfac by nfac, standardized
• U = specific influences, errors
  – nvar by nvar, diagonal
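The matrix equation above can be tried out with a few lines of plain Python. This is a minimal sketch, not the Mx implementation: the helper names (`matmul`, `transpose`, `implied_covariance`) and all numeric values are hypothetical and chosen only for illustration (4 variables, 2 correlated factors).

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def implied_covariance(P, D, U):
    """Return C = P * D * P' + U * U'."""
    common = matmul(matmul(P, D), transpose(P))   # part explained by the factors
    unique = matmul(U, transpose(U))              # variable-specific part
    n = len(P)
    return [[common[i][j] + unique[i][j] for j in range(n)] for i in range(n)]


# Hypothetical values: nvar = 4, nfac = 2
P = [[0.8, 0.0],
     [0.7, 0.0],
     [0.0, 0.6],
     [0.0, 0.9]]                 # nvar by nfac, full
D = [[1.0, 0.5],
     [0.5, 1.0]]                 # nfac by nfac, standardized
U = [[0.6, 0.0, 0.0, 0.0],
     [0.0, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.8, 0.0],
     [0.0, 0.0, 0.0, 0.3]]       # nvar by nvar, diagonal

C = implied_covariance(P, D, U)
for row in C:
    print([round(value, 3) for value in row])
```

In this example the covariance between variables 1 and 3 comes only from the factor correlation (0.8 * 0.5 * 0.6), because they load on different factors.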
![Page 9: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/9.jpg)
Exploratory factor analysis
• No prior assumption on number of factors
• All variables load on all latent factors
• Factors are either all correlated or all uncorrelated
• Unique factors are uncorrelated
• Underidentification
![Page 10: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/10.jpg)
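[Path diagram: Factor 1 (verbal) and Factor 2 (performance), each marked 1, with the subtests SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA; one factor loading is marked “Fix to 0”]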
![Page 11: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/11.jpg)
Confirmatory factor analysis
• An initial model is constructed, because:
  – its elements are described by a theoretical process
  – its elements have been obtained from a previous analysis in another sample
• The model has a specific number of factors
• Variables do not have to load on all factors
• Measurement errors may correlate
• Some latent factors may be correlated, while others are not
![Page 12: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/12.jpg)
[Path diagram: Factor 1 (verbal) and Factor 2 (performance), each marked 1, with the subtests SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]
![Page 13: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/13.jpg)
[Path diagram: Factor 1 (verbal) and Factor 2 (performance), each marked 1, with the subtests SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]
![Page 14: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/14.jpg)
[Path diagram: factors VC, FD and PO with the subtests SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]
![Page 15: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/15.jpg)
[Path diagram: factors VC, FD and PO with the subtests SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]
![Page 16: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/16.jpg)
CFA equations
• x = P * f + u
  – x = observed variables, f = latent factors
  – u = unique factors, P = factor loadings
• C = P * D * P’ + U * U’
  – C = observed covariance matrix
  – P = factor loadings
  – D = correlations between factors
  – U = diagonal matrix of errors
![Page 17: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/17.jpg)
Structural equations models
• The factor model x = P * f + u is sometimes referred to as the measurement model
• The relations between latent factors can also be modeled
• This is done in the covariance structure model, or the structural equations model
• Higher order factor models
![Page 18: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/18.jpg)
[Path diagram: subtests SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA; first-order factors VC, FD and PO (labelled F1 to F3); 2nd order factor “g”]
• Second order factor model: C = P*(A*I*A’+B*B')*P' + U*U’
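One way to read this formula is to compare it with the CFA equation C = P * D * P’ + U * U’: the covariance matrix of the first-order factors is itself given a factor structure. The interpretation of A, I and B below is an assumption, since the slide does not define them: A holds the loadings of the first-order factors on “g”, I is the (unit) variance of “g”, and B is a diagonal matrix of first-order residuals.

```latex
D_{\text{first order}} = A\,I\,A' + B\,B'
\qquad\Longrightarrow\qquad
C = P\,\bigl(A\,I\,A' + B\,B'\bigr)\,P' + U\,U'
```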
![Page 19: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/19.jpg)
Five steps characterize structural equation models
• Model specification
• Identification
  – E.g., if a factor loads on 2 variables only, multiple solutions are possible, and the factor loadings have to be equated
• Estimation of parameters
• Testing of goodness of fit
• Respecification

• K.A. Bollen & J. Scott Long: Testing Structural Equation Models, 1993, Sage Publications
![Page 20: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/20.jpg)
Practice!
• IQ and brain volumes (MRI)
• 3 brain volumes
  – Total cerebellum, Grey matter, White matter
• 2 IQ subtests
  – Calculation, Letters / numbers
• Brain and IQ factors are correlated
• Datafile: mri-IQ-all-twinA-5.dat
![Page 21: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/21.jpg)
Script: phenofact.mx
    BEGIN MATRICES ;
    P FULL NVAR NFACT free ;       ! factor loadings
    D STAND NFACT NFACT !free ;    ! correlations between factors
    U DIAG NVAR NVAR free ;        ! subtest specific influences
    M FULL 1 NVAR free ;           ! means
    END MATRICES ;

    BEGIN ALGEBRA ;
    C = P*D*P' + U*U' ;            ! variance covariance matrix
    END ALGEBRA ;

    Means M /
    Covariances C /
![Page 22: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/22.jpg)
• In exploratory factor analysis, if nfact = 2, one of the factor loadings has to be fixed to 0 to make it an identified model

    fix P 1 2

• In confirmatory factor analysis, specify a brain and an IQ factor

    SPECIFY P
    101 0
    102 0
    103 0
    0 204
    0 205
    0 206

• (If a factor loads on 2 variables only, it is not possible to estimate both factor loadings. Equate them, or fix one of them to 1)
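The identification rule for the exploratory model can be checked with a simple counting sketch. It assumes uncorrelated factors and the usual convention that nfact * (nfact - 1) / 2 loadings are fixed to 0 (which gives the single fixed loading above when nfact = 2). The function name is hypothetical, and a non-negative count is a necessary condition for identification, not a sufficient one.

```python
def efa_degrees_of_freedom(nvar, nfact):
    """Degrees of freedom of an exploratory model with uncorrelated factors.

    Observed variances and covariances are counted against the free factor
    loadings and unique influences. Means are left out because they are
    estimated freely (one mean per variable).
    """
    n_statistics = nvar * (nvar + 1) // 2       # unique elements of C
    n_fixed = nfact * (nfact - 1) // 2          # loadings fixed to 0
    n_free = nvar * nfact - n_fixed + nvar      # free loadings + unique influences
    return n_statistics - n_free


# The practice example has 5 variables (3 brain volumes, 2 IQ subtests)
for nfact in (1, 2):
    print(nfact, "factor(s): df =", efa_degrees_of_freedom(5, nfact))
```

With 5 variables this gives 5 degrees of freedom for one factor and 1 for two factors, so a third exploratory factor could not be tested with these data.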
![Page 23: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/23.jpg)
Phenotypic Correlations: MRI-IQ, Dutch twins (A), n=111/296 pairs
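| | brain cereb | brain grey | brain white | IQ calc | IQ L/n |
|---|---|---|---|---|---|
| Cerebellum | 1 | | | | |
| Grey | .63 | 1 | | | |
| White | .61 | .55 | 1 | | |
| Calculation | .23 | .25 | .26 | 1 | |
| Letter/numb. | .30 | .19 | .19 | .46 | 1 |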
![Page 24: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/24.jpg)
• What is the fit of a 1 factor model?
  – C = P * P’ + U * U’, P = 5x1 full, U = 5x5 diagonal
• What is the fit of a 2 factor model?
  – Same, P = 5x2 full with 1 factor loading fixed to 0
  – (Reduction: fix first 3 factor loadings of factor 2 to 0)
• Data suggest 2 latent factors: a brain (first 3) and an IQ factor (last 2): what is the evidence for this model?
  – Same, P = 5x2 full with 5 factor loadings fixed to 0
• Can the 2 factor model be improved by allowing a correlation between these 2 factors?
  – C = P * D * P’ + U * U’, P = 5x2 full matrix (5 fixed), D = stand 2x2 matrix, U = 5x5 diagonal matrix
![Page 25: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/25.jpg)
Principal Components Analysis
• SPSS, SAS, Mx (functions \eval, \evec)
• Transformation of the data, not a model
• Is used to reduce a large set of correlated observed variables (xi) to (a smaller number of) uncorrelated (orthogonal) components (ci)
• xi is a linear function of ci
![Page 26: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/26.jpg)
PCA path diagram
[Path diagram: components c1 to c5 (matrix D) with paths (matrix P) to the observed variables x1 to x5]
• S = observed covariances = P * D * P’
![Page 27: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/27.jpg)
PCA equations
• Covariance matrix S = P * D * P’ (S, P and D are all q by q)
• P = full q by q matrix of eigenvectors
• D = diagonal matrix of eigenvalues
• P is orthogonal: P * P’ = I (identity)

Criteria for number of factors
• Kaiser criterion, scree plot, %var
• Important: models not identified!

[Path diagram: components c1 to c5 and observed variables x1 to x5]
![Page 28: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/28.jpg)
Correlations: satisfaction, n=100
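| | Var 1 (work) | Var 2 (work) | Var 3 (work) | Var 4 (home) | Var 5 (home) | Var 6 (home) |
|---|---|---|---|---|---|---|
| Var 1 | 1 | | | | | |
| Var 2 | .65 | 1 | | | | |
| Var 3 | .65 | .73 | 1 | | | |
| Var 4 | .14 | .14 | .16 | 1 | | |
| Var 5 | .15 | .18 | .24 | .66 | 1 | |
| Var 6 | .14 | .24 | .25 | .59 | .73 | 1 |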
![Page 29: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/29.jpg)
[Path diagram: factors “work” and “home” with the indicators Var 1 to Var 6; paths are marked + or 0]
![Page 30: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/30.jpg)
PCA: Factor loadings (eigenvalues 2.89 & 1.79)

| | Factor 1 | Factor 2 |
|---|---|---|
| Var 1 (work) | .65 | .56 |
| Var 2 (work) | .72 | .54 |
| Var 3 (work) | .74 | .51 |
| Var 4 (home) | .63 | -.56 |
| Var 5 (home) | .71 | -.57 |
| Var 6 (home) | .71 | -.53 |
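The first two components of the satisfaction correlations can be approximated with a small Python sketch. It uses power iteration with deflation instead of a full eigen decomposition, which is a substitute technique and not what SPSS, SAS or Mx do internally. The function names are hypothetical. Loadings are taken as eigenvector times the square root of the eigenvalue, and the printed values can be compared with the table above.

```python
import math

# Correlations from the satisfaction example (Var 1-3 work, Var 4-6 home)
R = [
    [1.00, 0.65, 0.65, 0.14, 0.15, 0.14],
    [0.65, 1.00, 0.73, 0.14, 0.18, 0.24],
    [0.65, 0.73, 1.00, 0.16, 0.24, 0.25],
    [0.14, 0.14, 0.16, 1.00, 0.66, 0.59],
    [0.15, 0.18, 0.24, 0.66, 1.00, 0.73],
    [0.14, 0.24, 0.25, 0.59, 0.73, 1.00],
]


def matvec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def leading_eigenpair(a, start, iterations=500):
    """Power iteration: largest eigenvalue of a symmetric matrix and its unit eigenvector."""
    length = norm(start)
    v = [x / length for x in start]
    for _ in range(iterations):
        w = matvec(a, v)
        length = norm(w)
        v = [x / length for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, matvec(a, v)))  # Rayleigh quotient
    if v[0] < 0:  # the sign is arbitrary; make Var 1 load positively
        v = [-x for x in v]
    return eigenvalue, v


def deflate(a, eigenvalue, v):
    """Remove a component that was already found: a - eigenvalue * v * v'."""
    n = len(a)
    return [[a[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]


lam1, v1 = leading_eigenpair(R, [1.0] * 6)
lam2, v2 = leading_eigenpair(deflate(R, lam1, v1), [1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

print(round(lam1, 2), [round(math.sqrt(lam1) * x, 2) for x in v1])
print(round(lam2, 2), [round(math.sqrt(lam2) * x, 2) for x in v2])
```

The Kaiser criterion mentioned earlier would keep components with an eigenvalue above 1, which can be checked directly on `lam1` and `lam2`.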
![Page 31: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/31.jpg)
Triangular decomposition (Cholesky)
[Path diagram: observed variables x1 to x5 and latent factors y1 to y5, each factor marked 1]
• 1 operationalization of all PCA outcomes
• Model is just identified! Model is saturated (df=0)
![Page 32: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/32.jpg)
Triangular decomposition
• S = Q * Q’ ( = P# * P#’, where P# is P * √D, i.e., the eigenvectors scaled by the square roots of the eigenvalues)
• Q (5 by 5) =

    f11  0    0    0    0
    f21  f22  0    0    0
    f31  f32  f33  0    0
    f41  f42  f43  f44  0
    f51  f52  f53  f54  f55

• Q is a lower triangular matrix
• This is not a model! This is a transformation of the observed matrix S. Fully determinate!
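The transformation can be illustrated with a plain Python sketch of the Cholesky algorithm, applied to the MRI-IQ correlations from the practice example. The function name `cholesky` is a hypothetical helper, not the Mx routine, and it assumes S is symmetric and positive definite.

```python
import math


def cholesky(s):
    """Return the lower triangular Q with Q * Q' = s."""
    n = len(s)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # part of s[i][j] already explained by earlier columns
            partial = sum(q[i][k] * q[j][k] for k in range(j))
            if i == j:
                q[i][j] = math.sqrt(s[i][i] - partial)
            else:
                q[i][j] = (s[i][j] - partial) / q[j][j]
    return q


# MRI-IQ correlations: cerebellum, grey, white, calculation, letters/numbers
S = [
    [1.00, 0.63, 0.61, 0.23, 0.30],
    [0.63, 1.00, 0.55, 0.25, 0.19],
    [0.61, 0.55, 1.00, 0.26, 0.19],
    [0.23, 0.25, 0.26, 1.00, 0.46],
    [0.30, 0.19, 0.19, 0.46, 1.00],
]

Q = cholesky(S)
rebuilt = [[sum(Q[i][k] * Q[j][k] for k in range(5)) for j in range(5)]
           for i in range(5)]

for row in Q:
    print([round(value, 3) for value in row])
```

Because Q has as many free elements as S has unique elements (15 each for 5 variables), `rebuilt` reproduces S exactly up to rounding, which is why no fit can be tested.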
![Page 33: Factor analysis Caroline van Baal March 3 rd 2004, Boulder](https://reader035.vdocument.in/reader035/viewer/2022062421/56649d355503460f94a0d17c/html5/thumbnails/33.jpg)
Saturated model, # latent factors
Script: phenochol.mx

    BEGIN MATRICES ;
    Q LOWER NVAR NVAR free ;   ! factor loadings (triangular)
    M FULL 1 NVAR free ;       ! means
    END MATRICES ;

    BEGIN ALGEBRA ;
    C = Q*Q' ;                 ! variance covariance matrix
    K = \stnd(C) ;             ! correlation matrix
    X = \eval(K) ;             ! eigenvalues (i.e., variance of latent factors)
    Y = \evec(K) ;             ! eigenvectors (i.e., regression coefficients)
    END ALGEBRA ;

    Means M /
    Covariances C /