21/08/2016
An Introduction to Confirmatory Factor Analysis (CFA) and Structural Equation Modeling (SEM): Investigating linear relationships among latent constructs
Gavin T L Brown, PhD
Summer School Workshop, EARLI SIG 1, Munich, DE
August 20-21, 2016
Contact: gt.brown@auckland.ac.nz
What is a CFA or SEM model?
• A theoretically informed simplification of the complexities of reality created to test or generate hypotheses about linear relationships among various constructs
Theory
• We have theories that explain the way things are (not just descriptions)
• Theory and data are inter-twined
  – We see phenomena and seek to explain them with theories
  – We have theories and seek to test them with phenomena
  – Theories ≠ Knowledge
    • but theories that do not explain phenomena are certainly false [Knowledge--Popper]
• CFA/SEM is situated in hypothetico-deductive or abductive approaches to meaning
Theory: Context factors cause beliefs & behaviours
• Policies, cultures, histories, and societies differ
• We assume influences are linear and testable
• Cyclical processes require longitudinal processes to test causal and linear paths
Theory generates complex models
• Theoretical framework:
  – Icek Ajzen: Reasoned or Planned Behaviour
  – Beliefs & Intentions influence Behaviour & Outcomes
  – Predictor Beliefs are inter-correlated
  – Actual control is a moderator/mediator
  – Mathematical model supposedly fits the data
[Figure: diagram of the theoretical model; recoverable labels: Outcomes, Criterion of effectiveness]
Developing a Model
• Evidence from theory
• Evidence from previous studies
• Evidence from data
  – Exploratory Factor Analysis
  – Correlational analysis
  – Regression Analysis
Models
• Everything is connected to everything in the real world
  – It’s messy and hard to make sense of
• BUT
  – in a model we select for theoretical reasons the important connections that we THINK explain most of what is going on in the phenomenon of interest
  – It is not the real thing, but a simplification
• The arrangement of the connections between and among variables of interest constitutes testable expressions of our theories about how things go together
EFA to CFA
Statement | 1 | 2 | 3 | 4 | 5 | 6 | 7
29. Assessment fosters students' character. | 0.556 | 0.023 | -0.11 | -0.154 | -0.097 | 0.047 | -0.072
22. Assessment cultivates students' positive attitudes towards life. | 0.685 | -0.049 | -0.02 | -0.074 | -0.065 | 0.059 | -0.008
20. Assessment is used to provoke students to be interested in learning. | 0.591 | 0.04 | 0.084 | -0.066 | -0.059 | -0.02 | 0.048
14. Assessment helps students succeed in authentic/real-world experiences. | 0.446 | 0.085 | 0.105 | -0.216 | 0.092 | -0.14 | -0.124
13. Assessment ensures students pay attention during class. | 0.533 | 0.066 | 0.131 | -0.012 | 0.007 | -0.22 | -0.224
34. Assessment measures students' higher order thinking skills. | 0.509 | -0.167 | 0.007 | -0.03 | -0.176 | -0.11 | 0.077
27. Assessment allows different students to get different instruction. | 0.487 | 0.017 | 0.102 | -0.128 | 0.011 | 0.15 | 0.213
24. Assessment stimulates students to think. | 0.678 | -0.061 | 0.074 | 0.008 | 0.001 | -0.12 | 0.105
49. Assessment forces teachers to teach in a way against their beliefs. | -0.083 | 0.458 | -0.03 | 0.121 | -0.071 | -0.19 | 0.106
31. Assessment interferes with teaching. | -0.102 | 0.54 | -0.08 | -0.06 | 0.086 | -0.13 | 0.066
10. Assessment has little impact on teaching. | 0.134 | 0.384 | -0.19 | -0.034 | 0.062 | -0.01 | -0.067
26. Assessment is an imprecise process. | -0.004 | 0.629 | 0.034 | 0.008 | 0.021 | 0.057 | 0.094
23. Assessment results are filed & ignored. | -0.017 | 0.646 | -0.01 | -0.057 | -0.02 | 0.022 | -0.056
45. Teachers conduct assessments but make little use of the results. | -0.019 | 0.493 | 0.045 | -0.003 | -0.193 | 0.008 | 0.012
EFA steps
1. Run maximum likelihood estimation (MLE) with oblimin rotation, allowing factors with eigenvalues >1.00
2. Remove items with cross-loadings >.30
3. Remove items with no loading >.30
4. Remove items which did not logically fit their factor
5. Remove items that seem literally repetitive in content
6. Remove factors that are repetitive in meaning to earlier factors
RESULT
Items kept fit conceptually and have strong unique loadings on 1 factor; CFA tests whether the simplified model still fits the data.
EFA. Non-zero values on other factors, but all weak.
CFA. Forces these to ZERO
NB. This is the SPSS pattern matrix of regressions
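Steps 2 and 3 of the EFA procedure are mechanical enough to sketch in code. The following is a minimal illustration only, not the workflow used in the workshop: the function name, the dictionary layout, and the two rows labelled "hypothetical" are assumptions made for the example; the other three rows are copied from the pattern matrix above.

```python
# Sketch of EFA steps 2 and 3: screen items by their pattern-matrix loadings.
# The .30 cut-off comes from the slide; everything else here is illustrative.

def screen_items(loadings, cutoff=0.30):
    """Return (kept, dropped) item labels.

    loadings maps an item label to its list of loadings, one per factor.
    An item is kept only if exactly one absolute loading exceeds the cutoff:
    zero salient loadings fails step 3, two or more fails step 2.
    """
    kept, dropped = [], []
    for item, row in loadings.items():
        salient = [abs(value) for value in row if abs(value) > cutoff]
        if len(salient) == 1:
            kept.append(item)
        else:
            dropped.append(item)
    return kept, dropped


pattern = {
    # rows 29, 26 and 10 are taken from the pattern matrix in the slide
    "29": [0.556, 0.023, -0.11, -0.154, -0.097, 0.047, -0.072],
    "26": [-0.004, 0.629, 0.034, 0.008, 0.021, 0.057, 0.094],
    "10": [0.134, 0.384, -0.19, -0.034, 0.062, -0.01, -0.067],
    # invented rows, included only to show items that would be removed
    "hypothetical_cross": [0.45, 0.41, 0.0, 0.0, 0.0, 0.0, 0.0],
    "hypothetical_weak": [0.12, 0.21, 0.05, 0.0, 0.0, 0.0, 0.0],
}

kept, dropped = screen_items(pattern)
print("kept:", kept)
print("dropped:", dropped)
```

Steps 4 to 6 depend on reading the item content and cannot be automated this way.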
Prediction, Causation, Association
• CFA/SEM models assume linear (i.e., correlations and regressions) relationships (paths) exist among constructs.
• For example:
  – (A ↔ B) [2 things are correlated]
  – (A ↔ B) → C [2 correlated things jointly influence a 3rd thing]
  – (A + B) → C [2 things separately and/or jointly influence a 3rd thing]
  – A → B → C [1 thing influences a 2nd which influences a 3rd]
  – And so on…. [moderation, mediation, complex inter-relationships]
CFA/SEM Involves Mathematical Testing of Models
• A sophisticated correlational-causal mathematical testing of a model against a data set
• Does the model even solve properly?
• How close are the model and the data? Does the model fit the data?
  – Models are rejected if they do NOT have close fit to the data
    • the data can’t be wrong, esp. if it is a representative and large sample; it’s the reality we are trying to model
  – Models are NOT accepted just because they have close fit to the data
    • They are NOT YET DISCONFIRMED (Popper)
    • Multiple models can fit the same data equally well
    • Fit could be attributable to chance factors in the data we collected
Latent trait theory: invisible things influence observed behaviours
• Invisible traits explain responses & behaviours
  – But other things do too (random and systematic), which we might not have data on, so these residuals influence responses
  – Example:
    • Intelligence (latent) explains how many answers (manifest) you get right on a test but there is influence from other things (e.g., breakfast, happiness, study effort, quality of teaching, etc.) which are not in the model but exist
Latent trait theory
• Items of similar construct will be highly inter-correlated (Factor) and have low correlation with other factors (simple structure)
• Factors explain a sufficiently high proportion of variance in observed responses to warrant their usefulness
• Multiple models will fit the same data, so selection is theoretically driven, in light of statistical insights
[Figure: Latent → Observed behaviour ← Residual (everything else in the universe)]
Latent Trait Theory
• Multiple manifest indicators are required to generate stable estimation of the latent trait’s existence, strength, and direction; hence,
  – factor analysis expects 3 to 6 items per factor
  – test scores rely on 5 to 30 test questions
• WHY?
  – CHANCE….ERROR….DEFICIENCIES IN STIMULI
  – Observed behaviour is not perfectly controlled or reflective of our TRUE intelligence, attitude, etc.
    • I chose B but I meant A; I chose response 3 but I meant 4
  – Our response mechanism interferes
    • I want 3.4 but I had to choose 3 or 4
• Hence, all values are ESTIMATES
  – A range of most likely values exists
  – Multiple indicators reduce error/chance effects
Getting to good factor structures
• More items per factor
  – If N=50, each factor must have 12 items
  – If N=100, each factor must have 6 items
  – When N=500, factors can have 3 items
• Stronger loadings per item
  – (if >.80 then fewer items & people; if <.40 many more items and people needed)
• Large samples
  – Ideally 10 people per manifest item
  – NB. If N=400, by chance 2% of models will be inadmissible
The linear relationship
• Changes in XXX cause a linear change (increase or decrease) in YYY
• Formula: Y = m*X + b
  – m = slope [standardised beta = a proportion of standard deviation]
  – b = intercept [starting point of equation; represents tendency to respond]
• Multiple predictors
  – y = b0 + b1X1 + b2X2….. (just keep adding an X for each new variable)
[Figure: line plot of the Y variable against the X variable, with b marking the intercept]
Interpretations:
1. For every 1 SD change in X, you will get m*SD change in Y.
2. This relationship explains x% of variance in Y
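The two interpretations can be checked numerically. The sketch below fits Y = m*X + b by ordinary least squares on a small invented data set and then converts the raw slope into a standardised beta; with a single predictor, the squared standardised beta equals the proportion of variance explained. The data and function name are assumptions made for illustration.

```python
# Least-squares fit of Y = m*X + b with one predictor, plus the standardised slope.
import math


def fit_line(xs, ys):
    """Return (slope, intercept, standardised_beta, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # m: raw change in Y per unit of X
    intercept = mean_y - slope * mean_x    # b: predicted Y when X is zero
    beta = slope * math.sqrt(sxx / syy)    # m rescaled to SD units: m * SD(X) / SD(Y)
    r_squared = beta ** 2                  # single predictor: variance explained
    return slope, intercept, beta, r_squared


# invented data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, beta, r_squared = fit_line(xs, ys)
print(f"m = {slope:.2f}, b = {intercept:.2f}")
print(f"standardised beta = {beta:.2f}, variance explained = {r_squared:.0%}")
```

With several predictors the same idea holds, but the coefficients must be estimated jointly rather than one at a time.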
Looking Under the Hood: Components of CFA and SEM models
• Variables
  – Manifest [observed behaviours, usually dependent, rectangles]
  – Latent [unobserved, explanatory, ovals]
  – Residual [unobserved, unexplained, ovals]
• Manifest variables are predicted by both Latent traits and residuals
  – Goal to have large proportion of variance in manifest explained by latent rather than residual disturbances
[Figure: Trait explains Observed responses; Everything else also explains Observed responses]
Looking Under the Hood: Components of CFA and SEM models
• Paths
  – Fixed: equations require SEED values to solve; 1 is the conventional seed. All latent traits must have one path to their predicted manifest variables with a fixed value. All other values are estimated relative to the seed value.
  – Free: All other paths are allowed to be estimated freely based on the data provided to the model; they may be stronger than the fixed path, but it is better to make the strongest path in a factor the fixed path.
  – Zero: Paths not required by the model are forced to be non-existent. This contrasts with EFA, where all paths have some freely estimated value.
Example of Path Values
• EFA indicated Grades was the strongest value
  – Thus, seed value on path
• Residual terms exist and have seed value of 1 because they are equal to each other
• Note: manifest variables ONLY have paths from the conceptual LATENT trait
  – Zero between each other
  – If 2 or more factors, items should have ZERO paths to other factors
[Figure: path diagram of the latent factor "Well-being Evaluative" with manifest items Grades (e12), Ticks (e13), Praise (e14), Stickers (e15), and Answers (e16); seed paths are fixed to 1]
Estimation
• Maximum likelihood (most common)
  – The parameter values in the data set (a sample) are the most likely values in the population (not present, but to which we wish to generalise)
  – Hence, the procedure attempts to maximise the likelihood of the input values when estimating the solution
    • means, standard deviations, covariances
  – Hence, it matters that the sample reflects the population and is sufficiently large that parameters are likely to apply to the population
  – Valid if response categories are defensibly continuous (i.e., ≥5 ordinal categories)
Model Evaluation: Fit to Data
• Because of MLE, it is possible to evaluate the fit of the model relative to the data by comparing the distributions
  – The chi-squared (χ2) test is the foundation of model evaluation
  – χ2 test: difference between Observed (data) and Expected (model) values, adjusted by number of parameters and cases (degrees of freedom)
  – However, χ2 falsely penalises large N (i.e., >100) and large numbers of manifest variables
  – So it is a poor test, notwithstanding vehement objections by some researchers
Evaluating Results: Which Fit indices & What Values?
Decision | p of χ2/df (goodness of fit) | CFI, gamma hat (goodness of fit) | RMSEA (badness of fit) | SRMR* (badness of fit)
Good | >.05 | >.95 | <.05 | ≈.06
Acceptable | >.05 | >.90 | <.08 | <.08
Marginal | >.01 | .85-.89 | <.10 |
Reject | <.01 | <.85 | >.10 | >.08
Note. Report multiple indices but beware…..
• CFI punishes falsely complex models (i.e., >3 factors)
• RMSEA rewards falsely complex models with mis-specification
See Fan & Sivo, 2007
*AMOS only generates SRMR if NO missing data; thus, important to clean up missing values prior to any analysis. Recommend expectation maximization (EM) procedure
More on the RMSEA Statistic
• RMSEA is a point estimate in the middle of a range.
  – The 90% confidence interval should be reported.
  – The PCLOSE statistic shows whether it is probable that RMSEA is <.05; accuracy affected by sample size
  – Comparison to independence model not terribly interesting. The real question should be: Is there a better model to explain these responses than the model I have used?
RMSEA
Model | RMSEA | LO 90 | HI 90 | PCLOSE
Default model | .048 | .045 | .051 | .899
Independence | .127 | .124 | .129 | .000
Would you accept this model?
• fit statistics
  – χ2 = 9.31, df = 8, p = .32,
  – χ2/df = 1.16, p = .28;
  – Markov estimated p = .39 ± .01;
  – CFI = .96;
  – gamma hat = .98;
  – RMSEA = .093, 90% CI = .000–.295, pclose = .35;
  – SRMR = .088
  – Brown & Marshall, 2012
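One way to work through the question is to rate each reported index against the cut-offs tabulated earlier in this deck. The sketch below does that; how values exactly on a boundary are treated, and reading "≈.06" as "≤.06" for SRMR, are assumptions of this example. It also shows one common formulation of the RMSEA point estimate, sqrt(max(χ2 − df, 0) / (df × (N − 1))). The sample size is not stated on the slide, so N = 20 is an assumed value chosen only to illustrate the formula.

```python
# Rate the reported fit statistics against the deck's cut-off table.
# Boundary handling and the assumed N are illustrative choices, not from the slides.
import math


def rate_p(p):
    """p of chi-square/df: the table lists >.05 for both 'good' and 'acceptable'."""
    if p > 0.05:
        return "good/acceptable"
    if p > 0.01:
        return "marginal"
    return "reject"


def rate_goodness(value):
    """CFI or gamma hat: higher is better."""
    if value > 0.95:
        return "good"
    if value >= 0.90:
        return "acceptable"
    if value >= 0.85:
        return "marginal"
    return "reject"


def rate_rmsea(value):
    """RMSEA: lower is better."""
    if value < 0.05:
        return "good"
    if value < 0.08:
        return "acceptable"
    if value < 0.10:
        return "marginal"
    return "reject"


def rate_srmr(value):
    """SRMR: lower is better; the table has no 'marginal' band for it."""
    if value <= 0.06:
        return "good"
    if value < 0.08:
        return "acceptable"
    return "reject"


def rmsea_point_estimate(chi2, df, n):
    """A common RMSEA formulation; software may differ slightly in the N term."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


ratings = {
    "p of chi2/df": rate_p(0.28),
    "CFI": rate_goodness(0.96),
    "gamma hat": rate_goodness(0.98),
    "RMSEA": rate_rmsea(0.093),
    "SRMR": rate_srmr(0.088),
}
ratio = round(9.31 / 8, 2)   # chi-square per degree of freedom
assumed_n = 20               # ASSUMPTION: N is not given on the slide
rmsea = rmsea_point_estimate(9.31, 8, assumed_n)

for index, rating in ratings.items():
    print(f"{index}: {rating}")
print(f"chi2/df = {ratio}, RMSEA at assumed N = {rmsea:.3f}")
```

The indices disagree with one another here, which is the reason the deck recommends reporting several of them rather than relying on one.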
Distinguishing CFA from SEM
• CFA = measurement model of a construct
  – CFA models can have multiple dimensions and complex structures
    • An achievement score can be hierarchical
      – total consists of surface AND deep cognitive processes
    • An attitude or opinion can be multi-correlated
      – Total consists of correlations between 3 or more related dimensions
• SEM = structural model of paths between constructs
  – SEM models arrange predictive paths
    • Attitudes towards X influence performance on Y
    • Attitude towards X is related to attitude towards Y
Example: CFA + SEM (Brown & Hirschfeld, 2008)
CFA: Measurement Model - 4 correlated factors
Note. Accurate measurement models are also needed for reading score, year, sex, & ethnicity
Structural model: multiple predictors of performance
Note. If measurements of each construct are NOT robust, do NOT use them for anything!!!
Linear Models are Recursive (Brown et al., 2009)
• CFA/SEM assume models are recursive
  – Beginnings and end are different
  – NOT circular
[Figure: model diagram running from origins to endings]
How to Test Reciprocal Models?
• Make it longitudinal
  – Time 1 → Time 2
  – A1 → B1 → C1 → A2 → B2 → C2
• Use 2 different methods of measuring construct A
  – AM1 → B → C → AM2
• These approaches honour the reciprocal effects in theory without invalidating the linear regression equations AND the linear nature of existence
• Longitudinal analysis is an advanced topic in SEM and beyond today’s talk
Interpreting a Model
• Statistical significance of paths
• The weights & directions of each path
• The proportion of variance explained (the effect size)
Strategies for Evaluating a Model (Brown, Harris, & Harnett, 2012)
• Check that the model is admissible
  – The model is recursive
[Figure: AMOS path diagram of the hierarchical TCoF model with first-order factors Student Involvement, Well-being, Growth, Irrelevance, Timeliness, and Evaluation, their MQQ items and residual terms; an inset repeats the Evaluation factor (items MQQ23, MQQ8, MQQ25, MQQ5, MQQ63, MQQ44)]
• Check that it is IDENTIFIED. OOOPS! Seed value omitted
Evaluating Results
• Statistically significant paths
  – The strength of the path should exceed what might occur by chance
  – option to remove non-significant paths or indicate them as ns
If p>.05 path not stat sig
Note. Fixed paths have no probability.
Evaluating Results
• Variance explained (SMC)
  – Equivalent to R2
  – effect size f2 = R2 / (1 - R2)
    • Small: .02 to .14
    • Medium: .15 to .34
    • Large: >.35
    • (Cohen, 1992)
[Figure: Evaluation factor (SMC = .08, incoming path = -.28) with standardised loadings and SMCs for its items: MQQ23 (β = .58, SMC = .34), MQQ8 (β = -.54, SMC = .29), MQQ25 (β = .41, SMC = .17), MQQ5 (β = .48, SMC = .23), MQQ63 (β = .37, SMC = .14), MQQ44 (β = .44, SMC = .19)]
Note. SMC = Beta squared
The balance not explained is in the residual (goal small residuals, so target β>.50)
f2 = .19/.81 = .23 (medium)
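The arithmetic on this slide can be reproduced in a few lines. The sketch below squares a standardised loading to get the SMC, converts it to Cohen's f2, and labels the result using the bands listed above. How values falling between the listed bands (e.g., between .14 and .15) are treated is an assumption of this example, as are the function names.

```python
# SMC = beta squared; f2 = R2 / (1 - R2); bands follow the slide (Cohen, 1992).

def smc_from_loading(beta):
    """Squared multiple correlation for an item with one standardised loading."""
    return beta ** 2


def cohen_f2(r_squared):
    """Effect size: explained variance relative to unexplained variance."""
    return r_squared / (1 - r_squared)


def label_f2(f2):
    # ASSUMPTION: lower bound of each band is inclusive
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"


loading = 0.44                               # MQQ44 in the diagram
smc = round(smc_from_loading(loading), 2)    # rounded as on the slide
f2 = cohen_f2(smc)

print(f"SMC = {smc:.2f}, f2 = {f2:.2f} ({label_f2(f2)})")
```

Using the slide's own target of β>.50 gives SMC > .25 and therefore f2 > .33, which sits at the top of the medium band.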
Testing Multiple Models
• Analyst’s job is to identify which model fits best and makes sense in terms of what we already know and believe about reality
• Instrument: Teachers’ Conceptions of Feedback
  – Theoretically expected 10 factors
• Data: independent samples from Louisiana and New Zealand
• Analysis: independent EFA and CFA for both samples, comparison of 2 groups, re-analysis of NZ sample
• Results: multiple structures and many possible valid models could fit; better model found in a series of studies
Testing Alternate Models
• Multiple models will fit the same data
• To eliminate competing alternative hypotheses we must test alternate models even if we don’t believe in them and don’t expect them to be right
• We inspect the pattern of fit indices to identify the model most likely to fit the data
• We judge that model by our theoretical understanding: if we can’t explain it, then it’s just a model that fits the data which we don’t understand….
Recommended Alternatives to Test
• How many factors are needed to explain the data?
  – None? 1, 2, 3, 4, etc…?
• Are the factors independent of each other?
  – Uncorrelated, no hierarchy
• Are the factors correlated or hierarchical?
  – Correlated will always be better fitting but more complex to explain
• Do the factors have a linear path from one or many to one or many?
  – Look at correlations and ask if these suggest causal relations
[Figure: three versions of a 7-factor model (1. Personal Development, 2. Irrelevant, 3. Improve Learning, 4. School Quality, 5. Teacher Quality, 6. Examination, 7. Error), labelled Independent, Correlated, and Hierarchical]
[Figure: hierarchical model with seven first-order factors (Strategy Development, Irrelevance, Encourage Improvement, Make Ss Feel Good, Organised Planned, Required Expected, Independence) under a second-order factor, Conceptions of Feedback, with standardised loadings for the MQQ items]
Louisiana: 7 Hierarchical factors, marginal fit
But this is not what we really expected: sample or model?
[Figure: hierarchical model with ten first-order factors (Student Involvement, Well-being, Growth, Irrelevance, Timeliness, Evaluation, Tchr-Only Valid, Student Response, Expected, Interactive) and higher-order factors Learning and Grading, with standardised loadings and SMCs for the MQQ items]
NZ: 10 Hierarchical factors, good fit
Imposes meaning on the mess of 10 factors. But was this just chance?
NZ: Went back to theoretical framework of 10 factors. Recovered 9 factors and regressed onto feedback practices. (Brown et al. 2015)
Test Multiple Competing Alternatives
Model | Factors | Items | χ2 | df | χ2/df, p | CFI | Gamma hat | RMSEA | SRMR
1. LA hierarchical | 7 | 40 | 1758.12 | 733 | 2.40, .12 | .78 | .86 | .067 | .080
2. NZ inter-correlated, bifactor hierarchical | 10 | 46 | 2378.58 | 1019 | 2.33, .13 | .79 | .90 | .051 | .063
3. NZ inter-correlated (theory) | 9 | 38 | 1626.22 | 656 | 2.48, .12 | .81 | .91 | .053 | .062
• Reduction in χ2 given reduction in df was statistically significant between NZ Models so model 3 preferred
• Model 3 is better fitting with acceptable values for RMSEA, SRMR, and χ2/df.
• But note that NZ model still NOT fitted to Louisiana sample. Populations matter: a powerful way to test socio-cultural variation
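The comparison between the two NZ models can be reproduced from the table. The sketch below computes the change in χ2 and df and then an approximate p-value. To stay within the Python standard library it uses the Wilson-Hilferty normal approximation to the chi-square distribution, which is a swapped-in technique and not necessarily what the SEM software reports. Note also that a χ2 difference test presupposes nested models; the code simply follows the slide's comparison.

```python
# Chi-square difference between NZ models 2 and 3, with an APPROXIMATE p-value.
from statistics import NormalDist


def chi2_upper_tail_approx(x, df):
    """Approximate P(chi-square with df degrees of freedom > x), Wilson-Hilferty."""
    if x <= 0:
        return 1.0
    mean = 1 - 2 / (9 * df)
    sd = (2 / (9 * df)) ** 0.5
    z = ((x / df) ** (1 / 3) - mean) / sd
    return 1 - NormalDist().cdf(z)


# values from the table: model 2 (bifactor hierarchical) vs model 3 (theory)
delta_chi2 = 2378.58 - 1626.22
delta_df = 1019 - 656
p_difference = chi2_upper_tail_approx(delta_chi2, delta_df)

print(f"delta chi2 = {delta_chi2:.2f}, delta df = {delta_df}")
print("difference significant at .05:", p_difference < 0.05)

# sanity check of the approximation on a small case (chi2 = 9.31, df = 8)
p_small = chi2_upper_tail_approx(9.31, 8)
print(f"approximate p for chi2 = 9.31, df = 8: {p_small:.2f}")
```

An exact p-value needs the regularised incomplete gamma function, which statistical packages provide directly.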
What is Confirmation in CFA?
• Most studies follow this process
  – An inventory is developed using theory
  – The validity of the questionnaire may be explored
  – EFA identifies a plausible model within a data set
  – CFA tests the fit of the EFA model to the data
  – CFA refines the EFA model with the same data
  – This process is better considered Restrictive analysis, not CFA
• True confirmation comes when an existing model is TESTED with an independent sample
  – Requires that 2nd sample is drawn from the same population
  – No EFA needed
  – Just run the model, does it fit?
  – If NOT, then EFA must begin again…
Cross-Validation with a New Sample
• Multi-Group Confirmatory Factor Analysis
  – If we have robust evidence that the model works with our own data, we should find that it also fits a new sample drawn from the same population
  – If it does not, then the model was created taking advantage of chance artefacts within the initial sample and the model is less valid than we want
  – If it does, then the samples are from the same population and do not react differently to the items
  – However, equivalent models do not mean equivalent means for the groups; it means their behaviour is modelled in the same way, not that the absolute values of their responses are the same
MGCFA
• Systematic, sequential comparisons [if 1 is not true, then do not proceed to 2, etc.]
  1. Is the unconstrained model admissible for both groups? → configural invariance
  2. Are the regression weights equivalent for both groups? → metric invariance
  3. Are the factor intercepts equivalent for both groups? → scalar invariance
  4. Are the residuals equivalent for both groups? → strict invariance
• Only conditions 1 to 3 must be true to claim that the model elicits the same behaviour
Determining Invariance
• Configural invariance: RMSEA ≤.05
• Invariance assumes that parameters will NOT be identical, but will differ by no more than chance
  – Two statistics
    • Is the change in chi-square (∆χ2), given the change in degrees of freedom (∆df), statistically non-significant (p>.05)?
      – But remember χ2 is very sensitive and may falsely signal non-invariance
    • Is the change in the comparative fit index (∆CFI) very small (i.e., <.01)?
  – This test is applied progressively
    • Model 2 is compared to the unconstrained model
    • Model 3 is compared to Model 2
    • Model 4 is compared to Model 3
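The progressive logic can be sketched as a loop that stops at the first level where the ∆CFI criterion fails. This is a minimal illustration using only the ∆CFI rule (<.01) from the slide; the function name and the ∆CFI values in the example are invented, and in practice the ∆χ2 test would be inspected alongside it.

```python
# Progressive invariance check using the delta-CFI rule (< .01) from the slide.

def first_non_invariant_step(steps, cfi_cutoff=0.01):
    """Walk the constraint levels in order and return the first that fails.

    steps is an ordered list of (level_name, delta_cfi) pairs, each delta_cfi
    being the change relative to the previous, less constrained model.
    Returns None when every level meets the criterion.
    """
    for level_name, delta_cfi in steps:
        if abs(delta_cfi) >= cfi_cutoff:
            return level_name
    return None


# hypothetical delta-CFI values, for illustration only
example_steps = [
    ("metric (regression weights)", 0.004),
    ("scalar (intercepts)", 0.011),
    ("strict (residuals)", 0.002),
]

failed_at = first_non_invariant_step(example_steps)
print("first non-invariant level:", failed_at)
```

Because the comparisons are sequential, levels after the first failure are not interpreted even if their own ∆CFI looks small.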
[Figure: AMOS multi-group path diagrams (one per group) of a model with two factors under Assessment Definitions: Interactive-Informal (T Checklist, T Conference, Portfolio, T observation, Questions in class, Class mates score, I score) and Test-Like (Exam, T grades made up test, T grades on test by someone else, Essay, T grade written work); parameter labels carry the suffix _1 or _2 by group]
MGCFA: 2 Groups, 2 Factors
NB. AMOS numbers all parameters and suffixes them to show group
DETERMINING INVARIANCE
• RMSEA<.05, Configurally invariant!
• ∆CFI=.011, Thus NOT invariant
• p<.05, Thus NOT invariant
Thus, the 2 samples are not from 1 population
True Confirmatory Study
• TCoA: 9 factors in 4 factor structure developed in NZ with Primary teachers; tested with 400 secondary NZ teachers
  – configural invariance RMSEA = .041; regression weights (∆CFI = .006); second-order to first-order factors (∆CFI = .001); covariances among the four factors (∆CFI = .002); equivalent residuals for the second-order factors (∆CFI = .000).
MGCFA in SEM
• It is possible that a measurement model will be invariant across groups of interest, but is NOT invariant in structural relations
  – This does not mean that the model is inapplicable
  – If theory and empirical research can explain the different relations in the SEM, then the model is detecting real world differences
  – Hence, invariance might NOT be expected in how a construct relates to another measure
Excluding Māori group; SEM is equivalent
Meaningful non-invariance
Māori students experience secondary schooling and assessment quite differently to non-Māori students. Culture Counts!
NB. ∆CFI says equivalent, but ∆χ2 says reject. Interesting issues to be resolved in the field
Developing a structural model (SEM)
• If theory suggests causal or non-causal relations test regressions or correlations
• Identify possible structural paths between important variables in measurement models
  – Correlation analysis
  – Regression analysis
• Test plausible, logical options
  – A causes B; B causes A; A and B are correlated, etc.
Why Use SEM instead of Multiple Regressions?
• Limitations of multiple regressions
  – only 1 construct can be predicted at a time; it’s not simultaneous
  – The joint correlations among predictor constructs are not taken into account
  – The paths from origin to terminus cannot be accounted for
  – The latent trait has to be reduced to a manifest item
• Thus, SEM is better able to test for statistical significance of regressions under multiple conditions
  – Provided N is large enough
Summary
• Theories are used to devise models that attempt to explain how changes occur in various constructs and in how various constructs are related to each other
• CFA/SEM mathematical equations are based on linear regressions to identify the strength of relationships among Latent, Manifest, and Unexplained variables
• CFA/SEM models are used to establish validity of measurements and answer substantive questions
• CFA/SEM are powerful because of simultaneous properties and tighter specification of model
Automations in AMOS
• Resize manifest variables
• Name latent variables
• Copying drawings
• Standardised root mean residual (SRMR)
Resizing Variables
• Typical Model
  – Variables are bigger than the boxes you drew
  – Solutions
    • Change label so it fits box
    • Change boxes so they are all the same size & fit the text
      – AMOS
      – >>Plugins
      – >>Resize Observed Variables
• Remember to create space FIRST
• Recommend 10 pt font
[Figure: Before and After views of an AMOS diagram in which the observed-variable boxes are resized so that the full item labels fit]
Name latent variables
• Draw the structure of the model you wish to test
• Every variable must be named UNIQUELY before analysis
  – >>Plugins
  – >>Name Unobserved Variables
  – Hey presto, work solved
[Figure: Before and After views of a two-factor diagram; after the plugin runs, the factors are named F1 and F2 and the residuals e1 to e8]
Copy Drawings
• To save work
  – Draw first factor
    • Usually the biggest factor
  – Select what you need
  – Photocopy next one
[Figure: a factor drawing and its copies]
Hint: Do all the copying before naming unobserved variables
Standardised root mean residual (SRMR)
• Powerful fit index
• Obtained in AMOS if and only if NO missing data in file
  – >>Plugins
  – >>Standardized RMR
  – >>Analyse (leave popup window open)
Ideally close to or less than .06
Missing Values
• Missing values cannot be used.
  – All SEM software will estimate these values using Full Information Maximum Likelihood procedures.
  – However, you can’t see what values have been used. So you need to check before you let the software do its thing
• Too much missing
  – >10% → delete case/variable
• A little missing
  – <10% within tolerance
  – Use Expectation Maximisation (EM) procedure (Dempster, Laird, & Rubin, 1977)
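The 10% screening rule can be applied before any imputation is run. The sketch below counts missing responses per case and per variable and flags those above the threshold; the data layout (a list of rows with None for a missing response), the function names, and the generated data are assumptions made for illustration.

```python
# Flag cases (rows) and variables (columns) with more than 10% missing responses.

def missing_rate(values):
    """Proportion of entries that are missing (None)."""
    return sum(value is None for value in values) / len(values)


def flag_missing(rows, threshold=0.10):
    """Return (case_indices, variable_indices) whose missing rate exceeds threshold."""
    case_flags = [i for i, row in enumerate(rows) if missing_rate(row) > threshold]
    columns = list(zip(*rows))
    variable_flags = [j for j, col in enumerate(columns) if missing_rate(col) > threshold]
    return case_flags, variable_flags


# invented data: 20 cases by 10 items, all answered "3" unless set to None below
rows = [[3] * 10 for _ in range(20)]
rows[0][3] = None                 # case 0: 1 of 10 missing (10%, within tolerance)
for j in range(3):
    rows[1][j] = None             # case 1: 3 of 10 missing (30%)
for i in (5, 6, 7):
    rows[i][9] = None             # item 9: 3 of 20 missing (15%)

cases_to_review, variables_to_review = flag_missing(rows)
print("cases over 10% missing:", cases_to_review)
print("variables over 10% missing:", variables_to_review)
```

Flagged cases or variables are candidates for deletion; what remains would then go through the EM procedure.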
EM Missing Values Analysis
• EM uses MLE to check that M, SD, correlation & covariance matrices are not disturbed by imputation
  – Matters that sample represents population since sample values are assumed to be best estimate of population values
  – Use SPSS Missing Values Analysis EM
  – Post analysis:
    • check descriptives to ensure min and max are not violated; correct them if they are
    • Check MCAR test to see if distribution of missing is TRULY random
• If you use full information, then AMOS will generate the SRMR index
PS. What’s missing?
Things that go Wrong!
• Inadmissible solutions: multiple causes of inadmissibility (Gerbing & Anderson, 1987)
• Causes:
  – Specifying a sub-factor of items when participants do not actually make such fine-grained distinctions (Chen, Bollen, Paxton, Curran, & Kirby, 2001)
  – Not having enough items for the factor (Marsh, Hau, Balla, & Grayson, 1998)
  – Not having enough people to generate stable estimates (ideally n>400)
  – Too much missing data that has been imputed
  – Factors that are too highly inter-correlated
NEGATIVE ERROR VARIANCE
• Explaining more than 100% of variance causes the error to be less than zero (negative). NOT logical or acceptable.
• Solutions:
  1. Remove offending factor and have items predicted by higher-order factor
  2. Subordinate the factor to the other factor
  3. Fix value to small value >0.00 (e.g., .005) if the estimate ± 2*se includes 0.00
Fixing a negative error variance
• If 1 standard error (se) is greater than the size of the observed negative value, the 68% CI includes 0.00, so it is plausible that the TRUE value is not negative. Hence, it can be fixed to .005. If 2*se is greater than the size of the estimate, then the 95% CI includes TRUE values >0.00.
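The decision rule can be written as a small check. The sketch below treats a negative error variance as fixable when the estimate plus 2 standard errors reaches zero or above, and returns the slide's suggested value of .005 in that case. The function names and the example estimates are hypothetical; the rule addresses only the statistical condition, not the substantive conditions (previous studies, structural causes) that the deck also requires.

```python
# Decide whether a negative error variance may be fixed to a small positive value.

def can_fix_negative_variance(estimate, se, multiplier=2.0):
    """True when the estimate is negative but estimate + multiplier*se covers 0.00."""
    return estimate < 0 and estimate + multiplier * se >= 0


def resolve_error_variance(estimate, se, fixed_value=0.005):
    """Return the value to use, or None when the model itself needs revising."""
    if estimate >= 0:
        return estimate                  # nothing to fix
    if can_fix_negative_variance(estimate, se):
        return fixed_value               # small positive seed, as on the slide
    return None                          # interval excludes zero: respecify the model


# hypothetical estimates, for illustration only
print(resolve_error_variance(-0.02, 0.03))   # interval reaches zero: fix to .005
print(resolve_error_variance(-0.30, 0.05))   # clearly negative: respecify
print(resolve_error_variance(0.12, 0.04))    # already admissible: keep
```

When the function returns None, the structural options (removing or subordinating the offending factor) are the ones to consider.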
Fixing a negative error variance
• When can you do this?
  – If previous studies have shown that the value is normally >0.00
  – If structural causes can explain why variance is negative
    • E.g., small sample size
  – If 2*se is greater than the variance estimate
• Essential that you inspect the Notes for Model before proceeding: your model might be wrong and you can’t tell by looking at the diagram!!!!
[Figure: two panels, "Error Variances Fixed @.005" and "Admissible Solution", showing a model with factors School, Student, Learning Improvement, Teaching Improvement, Describe, Valid, Nurturance, Control, Exam-oriented, Irrelevance, and Accountability; selected residual variances (e.g., e65, e68) are fixed at 0.005]
NB. With only 1 predictor, the variance explained will be .995 (≈1.00)
Not Positive Definite
• When the eigenvalues of a matrix (the diagonal entries once the matrix is diagonalised) are NOT all >0
  – At least 1 factor is linearly dependent on another (collinearity)
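Whether a covariance or correlation matrix is positive definite can be tested without computing eigenvalues: a symmetric matrix is positive definite exactly when a Cholesky factorisation succeeds with strictly positive pivots. The sketch below uses that equivalent test in plain Python; the function name, tolerance, and example matrices are assumptions made for illustration.

```python
# Positive-definiteness check for a symmetric matrix via Cholesky factorisation.
import math


def is_positive_definite(matrix, tol=1e-12):
    """Return True if the symmetric matrix has a Cholesky factor with positive pivots."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= tol:
                    return False          # zero or negative pivot: not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


ordinary = [[1.0, 0.5], [0.5, 1.0]]            # two moderately correlated factors
collinear = [[1.0, 1.0], [1.0, 1.0]]           # perfectly dependent factors
out_of_range = [[1.0, 1.45], [1.45, 1.0]]      # "correlation" above 1.00

print(is_positive_definite(ordinary))
print(is_positive_definite(collinear))
print(is_positive_definite(out_of_range))
```

A failed check identifies that a problem exists; locating which factors are linearly dependent still requires inspecting the estimated correlations.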
Responses to Nonpositive Definite Covariance Matrices
• Goal:
  – Keep the same theoretical framework so that the analysis tests your theory
  – Not just dredging through data to discover relationships
    • Too much chance in that process
Resolving Inadmissible Solutions
• Reduce the number of factors by joining the factors which are linearly dependent
1. Destroy one factor and join the items to the first factor
• The meaning remains the same (just lose some precision)
Responses to Nonpositive Definite Covariance Matrices
Another approach
2. Make one factor a sub-factor of the first
[Figure: before and after path diagrams for the factors Describe, Valid, Nurturance, and Improvement. Before: very high correlation. After: a dependent regression.]
[Figure: two path diagrams of the same data, N=82. First panel, "Inadmissible": labels include Irrelevance, Bad, Accountability Schools, Accountability Students, Describe, Improvement, Teaching, inaccuracy, and F1 to F3. Second panel, "Admissible": labels include Irrelevance, Accountability Schools, and Improvement. Loadings and error terms are not recoverable from the transcript.]
Admissible (N=82)
• Removed sub-factors
• Joined highly correlated factors
Interpretation similar
This approach is necessary especially when N is low.
OPTIONS FOR INADMISSIBLE SOLUTIONS
error variances <0.00 + correlations >1.00
Options For Inadmissible Solutions
Base Model (did not fit a new group)
Alternative Model 1: Remove problematic item
Alternative Model 2: Add paths
PS. Both revisions worked
Options For Inadmissible Solutions
• Parcel factor into a single variable and pretend that it was measured directly
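[Figure: before and after path diagrams. Before: two correlated factors, Structure/Content and Style, measured by the items Moves, Focus, Material, Authority (e1 to e4) and Mechanics, Tone, Flow, Originality (e15 to e18), with PreTraining. After: Structure/Content and Style as two correlated observed variables, with PreTraining. Path values are not recoverable from the transcript.]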
8 items on 2 factors become 2 correlated variables when n reduces from 86 to 20. Note the similar correlation.
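Parcelling can be sketched as averaging each factor's items into one scale score per respondent. This is a minimal illustration: the assignment of the first four items to Structure/Content and the last four to Style is an assumption read from the diagram labels, and the ratings are hypothetical.

```python
from statistics import mean

# Assumed item-to-factor assignment (read from the diagram labels).
STRUCTURE_CONTENT = ["Moves", "Focus", "Material", "Authority"]
STYLE = ["Mechanics", "Tone", "Flow", "Originality"]


def parcel(ratings, items):
    """Collapse a factor's items into a single scale score (the item mean)."""
    return mean(ratings[item] for item in items)


# Hypothetical ratings for one essay (not taken from the workshop data).
essay = {
    "Moves": 4, "Focus": 3, "Material": 5, "Authority": 4,
    "Mechanics": 2, "Tone": 3, "Flow": 3, "Originality": 4,
}

print(parcel(essay, STRUCTURE_CONTENT), parcel(essay, STYLE))
```

The two scale scores are then treated as if they were measured directly, which removes the measurement model and its parameters from the estimation.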
Improving Fit
• If solution is recursive and admissible, but does not fit well, what to do?
– Remove paths and items that are not statistically significant
– Remove items that have high attraction to logically inappropriate factors (use the modification indices to identify those)
– Remove items with weak loadings on their respective factors
– Correlate all the factors with each other
– Collapse factors
– Parcel factors into scale scores
Using Modification Indices
• Look at Regression Weights: these are the paths we might consider changing
– Interpretation: If I add the recommended path I will get an additional improvement in fit
– Values >20 are strong; values >4 are statistically significant
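The cut-offs above can be related to the chi-square distribution: a modification index approximates the drop in chi-square from freeing one parameter, so it can be compared with a chi-square distribution with 1 degree of freedom (critical value about 3.84 at p = .05). A minimal sketch using only the standard library; the function name is illustrative.

```python
import math


def mi_p_value(mi):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(mi / 2.0))


for mi in (4.0, 20.0):
    print(f"MI = {mi:>4}: p = {mi_p_value(mi):.6f}")
```

Statistical significance alone is not a reason to add a path; the change still has to make theoretical sense.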
Using Modification Indices
• What to look for?
– Items that have strong sum of MI
• These are items that are most mis-specified
• Your theory does not match participant responses
– Items that are highly attracted to multiple factors
• Remove these IF and ONLY IF there are sufficient items in a factor to keep a valid factor
– Factors that want to be joined to other factors
• Consider this IF and ONLY IF the path makes some sort of theoretical sense: can you explain it?
• Make it dependent or correlated
• Remember
– Change only if you can explain it with your theory
– Don’t delete items that would destroy a factor
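Finding items with a strong sum of MI, or with attraction to several factors, can be sketched as a simple tally. The rows below are hypothetical and do not mirror the AMOS output format; the item and factor names are placeholders.

```python
from collections import defaultdict

# Hypothetical MI rows: (item, factor it is attracted to, MI).
MI_ROWS = [
    ("item_a", "Factor 2", 24.0),
    ("item_a", "Factor 5", 12.5),
    ("item_b", "Factor 3", 6.0),
    ("item_c", "Factor 1", 2.0),  # below the cut-off, so ignored
]


def summarise_mi(rows, cutoff=4.0):
    """Sum MI per item and count the distinct factors attracting it."""
    totals = defaultdict(float)
    factors = defaultdict(set)
    for item, factor, mi in rows:
        if mi > cutoff:
            totals[item] += mi
            factors[item].add(factor)
    ranked = sorted(totals, key=totals.get, reverse=True)
    return [(item, totals[item], len(factors[item])) for item in ranked]


for item, total, n_factors in summarise_mi(MI_ROWS):
    print(item, total, n_factors)
```

Items at the top of this list are candidates for removal, subject to the conditions above: the factor must keep enough items and the change must be explainable.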
[Figure: seven-factor measurement model before item removal. Factors: 1. Personal Development, 2. Irrelevant, 3. Help Students, 4. School Quality, 5. Teacher Quality, 6. Examination, 7. Error. Item loadings and inter-factor correlations are not recoverable from the transcript.]
Note. This item loads well on its factor but it is attracted to 2 other factors with some strength. Will removing it improve fit and retain meaning?
USING MODIFICATION INDICES
[Figure: the same seven-factor measurement model after 3 items were removed. Factors: 1. Personal Development, 2. Irrelevant, 3. Help Students, 4. School Quality, 5. Teacher Quality, 6. Examination, 7. Error. Item loadings and inter-factor correlations are not recoverable from the transcript.]
Model                    # manifest variables   df    RMSEA   Gamma hat
Before 3 items removed   33                     474   0.066   0.889
After 3 items removed    30                     384   0.061   0.913
This change reduced the df considerably while lowering (improving) the RMSEA somewhat. The consequence is that gamma hat (0.913) now indicates clearly acceptable fit. Fit is improved.
But does it still mean the same? What did I lose in gaining better fit?
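The table values can be reproduced with two short formulas. This is a sketch under stated assumptions: the df count assumes a simple-structure CFA with 7 correlated factors and one loading per factor fixed to 1, and gamma hat is computed from RMSEA using gamma hat = p / (p + 2 * df * RMSEA²), which follows from the usual definitions of both indices. The function names are illustrative.

```python
def cfa_df(n_items, n_factors):
    """Degrees of freedom: observed moments minus freely estimated parameters."""
    moments = n_items * (n_items + 1) // 2
    free_loadings = n_items - n_factors  # one marker loading fixed per factor
    error_variances = n_items
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    free = free_loadings + error_variances + factor_variances + factor_covariances
    return moments - free


def gamma_hat(n_items, df, rmsea):
    """Gamma hat expressed through RMSEA (assumes both use the same N term)."""
    return n_items / (n_items + 2 * df * rmsea ** 2)


for label, p, rmsea in (("Before", 33, 0.066), ("After", 30, 0.061)):
    df = cfa_df(p, 7)
    print(label, df, round(gamma_hat(p, df, rmsea), 3))
```

Under these assumptions the sketch returns df = 474 and 384 and gamma hat = 0.889 and 0.913, matching the table. Both the smaller number of items and the lower RMSEA feed into the gamma hat result, so part of the gain comes simply from the model being smaller.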
Improving Fit
• What NOT to do
– Correlate residuals
• It will improve fit but what does it mean?
• Residuals are meant to be random in their patterns but when correlated…
• Everything I can’t explain is systematically related to everything I can’t explain and I can’t explain how or why it is related…
• Do you believe this? If not, don’t do it!
• Plausible in longitudinal studies though
• If these do not produce acceptable fit, rethink your model and theory
– Use EFA to see how the items in this sample really do aggregate
Summary
• Same techniques used to validate measurement models and explore relations between constructs
• Requires large N and sophisticated mathematical formulae
• Is powerful to test and generate hypotheses
• Logically depends on the notion of causation and prediction
• Can be done relatively easily with modern software but many things can go wrong; see 2nd part of this lecture in 2 weeks
Summary
• Estimation problems are quite common
– Small sample size
– Poor model specification
– Over-factored constructs
• Solutions must fit and be theoretically sound
– Remove factors, items
– Use modification indices
• Solutions need to be tested
– Alternative structures
– Invariance to other groups
• And then we can address the substantive “So What?” question, with good measurements
References: Authorities
• Boomsma, A., & Hoogland, J. J. (2001). The robustness of LISREL modeling revisited. In R. Cudeck, S. Du Toit & D. Sorbom (Eds.), Structural equation modeling: Present and future (pp. 139-168). Lincolnwood, IL: Scientific Software International.
• Chen, F., Bollen, K. A., Paxton, P., Curran, P. J., & Kirby, J. B. (2001). Improper solutions in structural equation models: Causes, consequences, and strategies. Sociological Methods & Research, 29(4), 468-508.
• Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9(2), 233-255.
• Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood estimation from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39(1), 1-38.
• Gerbing, D. W., & Anderson, J. C. (1987). Improper solutions in the analysis of covariance structures: Their interpretability and a comparison of alternate respecifications. Psychometrika, 52(1), 99-111.
• Hoyle, R. H., & Duvall, J. L. (2004). Determining the number of factors in exploratory and confirmatory factor analysis. In D. Kaplan (Ed.), The SAGE Handbook of Quantitative Methodology for Social Sciences (pp. 301-315). Thousand Oaks, CA: Sage.
• Marsh, H. W., Hau, K.-T., Balla, J. R., & Grayson, D. (1998). Is more ever too much? The number of indicators per factor in confirmatory factor analysis. Multivariate Behavioral Research, 33(2), 181-220.
• McClelland, G. H. (2000). Nasty data: Unruly, ill-mannered observations can ruin your analysis. In H. T. Reis & C. M. Judd (Eds.). Handbook of research methods in social and personality psychology (pp. 393-411). Cambridge: Cambridge University Press.
• Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(4), 4-70.
• Wu, A. D., Li, Z., & Zumbo, B. D. (2007). Decoding the meaning of factorial invariance and updating the practice of multi-group confirmatory factor analysis: A demonstration with TIMSS data. Practical Assessment, Research & Evaluation, 12(3), Available online: http://pareonline.net/getvn.asp?v=12&n=13.
References: Authorities
• Ajzen, I. (2005). Attitudes, personality and behavior (2nd ed.). New York: Open University Press.
• Byrne, B. M. (2001). Structural Equation Modeling with AMOS: Basic Concepts, Applications, and Programming. Mahwah, NJ: LEA.
• Fan, X., & Sivo, S. A. (2007). Sensitivity of fit indices to model misspecification and model types. Multivariate Behavioral Research, 42(3), 509–529.
• Marsh, H. W., Hau, K.-T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler's (1999) findings. Structural Equation Modeling, 11(3), 320-341.
Basic Readings on CFA/AMOS
• Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment Research & Evaluation, 10(7), Available online: http://www.pareonline.net/pdf/v10n17.pdf.
• Courtney, M. G. R. (2013). Determining the number of factors to retain in EFA: Using the SPSS R-Menu v2.0 to make more judicious estimations. Practical Assessment Research & Evaluation, 18(8), Available online: http://pareonline.net/getvn.asp?v=18&n=18.
• Klem, L. (2000). Structural equation modeling. In L. G. Grimm & P. R. Yarnold (Eds.), Reading and Understanding More Multivariate Statistics (pp. 227-260). Washington, DC: APA.
• Kline, P. (1994). An easy guide to factor analysis. London: Routledge.
• Kim, J.-O., & Mueller, C. W. (1978). Factor Analysis: Statistical methods and practical issues (Vol. 14). Thousand Oaks, CA: Sage.
• Lei, P.-W., & Wu, W. (2007). Introduction to structural equation modeling: Issues and practical considerations. Educational Measurement: Issues and Practice, 26(3), 33–43. doi: 10.1111/j.1745-3992.2007.00099.x
• McDonald, R. P. (2010). Structural Models and the Art of Approximation. Perspectives on Psychological Science, 5(6), 675-686. doi: 10.1177/1745691610388766
• Thompson, B. (2000). Ten commandments of structural equation modeling. In L. G. Grimm & P. R. Yarnold (Eds.), Reading and Understanding More Multivariate Statistics (pp. 261-283). Washington, DC: APA.
References: Studies used
• Brown, G. T. L. (2009). Teachers’ self-reported assessment practices and conceptions: Using structural equation modelling to examine measurement and structural models. In T. Teo & M. S. Khine (Eds.), Structural equation modeling in educational research: Concepts and applications (pp. 243-266). Rotterdam, NL: Sense Publishers.
• Brown, G. T. L., Harris, L. R., & Harnett, J. (2012). Teacher beliefs about feedback within an Assessment for Learning environment: Endorsement of improved learning over student well-being. Teaching and Teacher Education, 28(7), 968-978. doi: 10.1016/j.tate.2012.05.003.
• Brown, G. T. L., Harris, L. R., O’Quin, C. R., & Lane, K. (2015). Using multi-group confirmatory factor analysis to evaluate cross-cultural research: Identifying and understanding non-invariance. International Journal of Research and Method in Education. Advance online publication. doi: 10.1080/1743727X.2015.1070823
• Brown, G. T. L., & Hirschfeld, G. H. F. (2008). Students’ conceptions of assessment: Links to outcomes. Assessment in Education: Principles, Policy and Practice, 15(1), 3-17.
• Brown, G. T. L., Irving, S. E., Peterson, E. R., & Hirschfeld, G. H. F. (2009). Use of interactive-informal assessment practices: New Zealand secondary students’ conceptions of assessment. Learning & Instruction, 19(2), 97-111.
• Brown, G. T. L., Peterson, E. R., & Irving, S. E. (2009). Self-regulatory beliefs about assessment predict mathematics achievement. In D. M. McInerney, G. T. L. Brown, & G. A. D. Liem (Eds.), Student perspectives on assessment: What students can tell us about assessment for learning (pp. 159-186). Charlotte, NC: Information Age Publishing.
References: Studies used
• Brown, G. T. L. (2007). Teachers' conceptions of assessment: Multi-group confirmatory factor analyses across sectors and countries. Unpublished manuscript, University of Auckland, New Zealand.
• Brown, G. T. L. (2008). Reanalysis of Li & Hui (2007) data set. Confidential Report. Auckland, NZ: University of Auckland, School of Teaching, Learning & Development.
• Brown, G. T. L. (2009, October). Preliminary Analysis of the Chinese Teachers’ Conceptions of Assessment (C-TCoA) Inventory. Hong Kong: Hong Kong Institute of Education, Faculty of Education Studies.
• Brown, G. T. L., Kennedy, K. J., Fok, P. K., Chan, J. K. S., & Yu, W. M. (2009). Assessment for improvement: Understanding Hong Kong teachers’ conceptions and practices of assessment. Assessment in Education: Principles, Policy and Practice, 16(3), 347-363.
• Brown, G. T. L., Lake, R., & Matters, G. (2011). Queensland teachers’ conceptions of assessment: The impact of policy priorities on teacher attitudes. Teaching and Teacher Education, 27(1), 210-220. doi: 10.1016/j.tate.2010.08.003
• Brown, G. T. L., & Marshall, J. C. (2012). The impact of training students how to write introductions for academic essays: An exploratory, longitudinal study. Assessment and Evaluation in Higher Education, 37(6), 653-670. doi: 10.1080/02602938.2011.563277.
• Hirschfeld, G. H. F., & Brown, G. T. L. (2009). Students’ conceptions of assessment: Factorial and structural invariance of the SCoA across sex, age, and ethnicity. European Journal of Psychological Assessment, 25(1), 30-38.