Structural Equation Modeling Using Mplus
Chongming Yang
Research Support Center
FHSS College
Structural?
  Structuralism
  Components
  Relations
Objectives
  Introduction to SEM
    The model
    Parameters
    Estimation
    Model evaluation
    Applications
  Estimate simple models with Mplus
Continuous Dependent Variables
Session I
Information of Variable
  Mean
  Variance
  Skewness
  Kurtosis
Variance & Covariance
  $V = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
  $Cov = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
Covariance Matrix (S)
        x1      x2      x3
  x1    V1
  x2    Cov21   V2
  x3    Cov31   Cov32   V3
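A minimal sketch of how a covariance matrix such as S is computed from raw scores. It assumes the sample (n - 1) denominator, and the three data columns are made-up illustration values, not data from the slides.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance with an n - 1 denominator.

    covariance(x, x) is the sample variance of x.
    """
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def covariance_matrix(columns):
    """Full symmetric matrix: variances on the diagonal, covariances elsewhere."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]


# Hypothetical scores on three variables (illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 6.0, 8.0, 10.0]
x3 = [5.0, 3.0, 4.0, 1.0, 2.0]

S = covariance_matrix([x1, x2, x3])
for name, row in zip(["x1", "x2", "x3"], S):
    print(name, row)
```

Only the lower triangle of S carries distinct information, which is why the slide shows the matrix in triangular form.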
Statistical Model
  Probabilistic statement about relations of variables
  Imperfect but useful representation of reality
Structural Equation Modeling
  A system of regression equations for latent variables to estimate and test direct and indirect effects without the influence of measurement errors.
  To estimate and test theories about interrelations among observed and latent variables.
Latent Variable (Construct / Factor / Trait)
  A hypothetical variable
    cannot be measured directly
    no objective measurement unit
    inferred from observable manifestations
  Multiple manifestations (indicators)
  Normally distributed interval dimension
How is Depression Distributed in?
  BYU students
  Patients for Therapy
Normal Distributions
Levels of Analyses
  Observed
  Latent
Test Theories
  Classical True Score Theory: Observed Score = True Score + Error
  Item Response Theory
  Generalizability (Raykov & Marcoulides, 2006)
Graphic Symbols of SEM
  Rectangle: observed variable
  Oval: latent variable or error
  Single-headed arrow: causal relation
  Double-headed arrow: correlation
Graphic Measurement Model of Latent ξ
[Figure: path diagram of one latent variable measured by indicators X1, X2, and X3, each with a loading (λ1 to λ3) and an error term (δ1 to δ3)]
Equations
  Specific equations
    $X_1 = \lambda_1 \xi + \delta_1$
    $X_2 = \lambda_2 \xi + \delta_2$
    $X_3 = \lambda_3 \xi + \delta_3$
  Matrix symbols
    $X = \Lambda \xi + \delta$
True Score Theory?
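Relations of Variances
  $V_{X_1} = \lambda_1^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_1)$
  $V_{X_2} = \lambda_2^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_2)$
  $V_{X_3} = \lambda_3^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_3)$
  δ = measurement error / uniqueness
Unknown Parameters
  $V_{X_1} = \lambda_1^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_1)$
  $V_{X_2} = \lambda_2^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_2)$
  $V_{X_3} = \lambda_3^2 \, \mathrm{Var}(\xi) + \mathrm{Var}(\delta_3)$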
Sample Covariance Matrix (S)
        x1      x2      x3
  x1    V1
  x2    Cov21   V2
  x3    Cov31   Cov32   V3
Variance of ξ
  Variance of ξ = common covariance of X1, X2, and X3
Variance of δ
  δ1
  0     δ2
  0     0     δ3
Unstandardized Parameterization (scaling)
  λ1 = 1 (fixes the loading of X1 at 1 so that ξ takes the metric of X1; X1 is called the reference indicator)
  Variance of ξ = common variance of X1, X2, and X3
  Squared λ = explained variance of X (R²)
  Variance of δ = unexplained variance (error)
  Total variance = squared λ + variance of δ
Just Identified Model
[Figure: one-factor path diagram with indicators X1, X2, and X3, loadings λ1 to λ3, and errors δ1 to δ3]
Reference Indicator (marker)
  Choose conceptually the best
  Small variance leads to non-convergence
  Different markers lead to different parameter estimates and standard errors
  Affects measurement invariance tests
  Does not affect standardized estimates
Standardized Parameterization (scaling)
  Variance of ξ = 1 = common variance of X1, X2, and X3
  Squared λ = explained variance of X (R²)
  Variance of δ = 1 - λ²
  Mean of ξ = 0
  Mean of δ = 0
Two Kinds of Parameters
  Fixed at 0, 1, or other values
  Freely estimated
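[Figure: full structural model path diagram. Exogenous factors: General Intelligence (Verbal, Reasoning, Analytic; errors d1 to d3), Emotional Intelligence (Recognize/Assess, Self Control; errors d4, d5), and Personality (Openness, Agreeableness; errors d6, d7). Endogenous factors: Job Satisfaction (Being Appreciated, Social Relations; errors e1, e2) and Marital Satisfaction (Perceived Benefit, Perceived Cost; errors e3, e4), with disturbances z1 and z2]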
Structural Equation Model in Matrix Symbols
  $X = \Lambda_x \xi + \delta$ (exogenous)
  $Y = \Lambda_y \eta + \varepsilon$ (endogenous)
  $\eta = B\eta + \Gamma\xi + \zeta$ (structural model)
  Note: Measurement model reflects the true score theory
Structural Equation Model in Matrix Symbols
  $X = \tau_x + \Lambda_x \xi + \delta$ (measurement)
  $Y = \tau_y + \Lambda_y \eta + \varepsilon$ (measurement)
  $\eta = \alpha + B\eta + \Gamma\xi + \zeta$ (structural)
  Note: SEM with mean structure.
Model Implied Covariance Matrix (Σ)
  Note: This covariance matrix contains unknown parameters in the equations.
  (I - B) = non-singular
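The formula on this slide did not survive extraction. As a hedged reconstruction, the standard LISREL-style expression for the implied covariance matrix of (Y, X) is given here. The symbols Φ (covariance of ξ), Ψ (covariance of ζ), Θε and Θδ (error covariances) are conventional notation assumed for this sketch, not taken from the slides.

```latex
\Sigma(\theta) =
\begin{pmatrix}
\Lambda_y (I - B)^{-1} \left( \Gamma \Phi \Gamma' + \Psi \right) \left[ (I - B)^{-1} \right]' \Lambda_y' + \Theta_\varepsilon
  & \Lambda_y (I - B)^{-1} \Gamma \Phi \Lambda_x' \\
\Lambda_x \Phi \Gamma' \left[ (I - B)^{-1} \right]' \Lambda_y'
  & \Lambda_x \Phi \Lambda_x' + \Theta_\delta
\end{pmatrix}
```

The inverse of (I - B) appears in every block that involves η, which is why (I - B) must be non-singular.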
Estimations / Fit Functions
  Hypothesis: Σ = S or Σ - S = 0
  Maximum Likelihood
    $F = \log|\Sigma| + \mathrm{trace}(S\Sigma^{-1}) - \log|S| - (p + q)$
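A small sketch of the maximum likelihood fit function for the simplest possible case, two observed variables, so that determinants and inverses can be written by hand. The matrices are hypothetical illustration values, and the helper names are not part of Mplus.

```python
import math


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(a, b):
    """trace(A B) for 2 x 2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def ml_fit(S, Sigma):
    """F = log|Sigma| + trace(S Sigma^-1) - log|S| - (number of observed variables)."""
    n_vars = 2
    return (math.log(det2(Sigma)) + trace_of_product(S, inv2(Sigma))
            - math.log(det2(S)) - n_vars)


# Hypothetical sample covariance matrix.
S = [[2.5, 1.0],
     [1.0, 2.0]]

perfect = ml_fit(S, S)                                # implied matrix reproduces S
independence = ml_fit(S, [[2.5, 0.0], [0.0, 2.0]])    # covariance forced to 0

print(perfect, independence)
```

F is 0 when the implied matrix reproduces S exactly and grows as the implied matrix departs from S, which is what the iterative estimation minimizes.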
Convergence: Reaching Limit
  Minimize F by adjusting the unknown parameters through an iterative process
  Convergence value: difference in F between the last two iterations
  Default convergence = .0001
  Increase to help convergence (0.001 or 0.01)
  e.g. Analysis: convergence = .01;
No Convergence
  No unique parameter estimates
  Lack of degrees of freedom leads to under-identification
  Variance of reference indicator too small
  Fixed parameters are left to be freely estimated
  Misspecified model
Absolute Fit Index
  $\chi^2 = F(N - 1)$ (N = sample size)
  df = p(p+1)/2 - q
  p = number of observed variables, so p(p+1)/2 counts the variances and covariances (the p means are added when a mean structure is modeled)
  q = number of unknown parameters to be estimated
  prob = ? (Nonsignificant χ² indicates good fit. Why?)
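The counting rule above can be sketched in a few lines. The function names are illustrative, and the worked case is the one-factor, three-indicator model with the first loading fixed at 1 (two free loadings, one factor variance, three error variances).

```python
def sample_information(p, with_means=False):
    """Number of distinct variances and covariances among p observed variables,
    plus the p means when a mean structure is modeled."""
    info = p * (p + 1) // 2
    return info + p if with_means else info


def degrees_of_freedom(p, q, with_means=False):
    """df = pieces of sample information minus freely estimated parameters."""
    return sample_information(p, with_means) - q


def chi_square(f_min, n):
    """Chi-square from the minimized fit function value and the sample size."""
    return f_min * (n - 1)


# One factor, three indicators, reference loading fixed at 1:
# 2 loadings + 1 factor variance + 3 error variances = 6 free parameters.
df_one_factor = degrees_of_freedom(p=3, q=6)
print(df_one_factor)               # 0, a just-identified model

print(chi_square(f_min=0.05, n=201))
```

With df = 0 the model reproduces S exactly and cannot be tested, which is what "just identified" means.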
Sample Information
        x1      x2      x3      x4      …
  x1    v1
  x2    cov21   v2
  x3    cov31   cov32   v3
  x4    cov41   cov42   cov43   v4
  …
        Mean1   Mean2   Mean3   Mean4   …
  Total info = P(P+1)/2 + Means
Absolute Fit: SRMR
  Standardized Root Mean Square Residual
  SRMR = difference between observed and implied covariances in standardized metric
  Desirable when < .08, but no consensus
Relative Fit: Relative to Baseline (Null) Model
  All unknown parameters are fixed at 0
  Variables not related (all relation parameters = 0)
  Model implied covariance Σ = 0
  Fit to sample covariance matrix S
  Obtain χ², df, prob < .0000
Relative Fit Indices
  $CFI = 1 - \frac{\chi^2 - df}{\chi^2_b - df_b}$
    b = baseline model
    Comparative Fit Index, desirable >= .95; 95% better than the baseline model
  $TLI = \frac{\chi^2_b / df_b - \chi^2 / df}{\chi^2_b / df_b - 1}$
    Tucker-Lewis Index, desirable >= .90
  $RMSEA = \sqrt{\frac{\chi^2 - df}{n \cdot df}}$
    Root Mean Square Error of Approximation, desirable <= .06
    Penalizes a large model with more unknown parameters
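The three indices can be computed directly from the chi-square values of the target and baseline models. This is a sketch with hypothetical input values. Truncating RMSEA at 0 when χ² < df is a common convention assumed here, not something stated on the slide.

```python
import math


def cfi(chi2, df, chi2_b, df_b):
    """Comparative Fit Index relative to the baseline (b) model."""
    return 1 - (chi2 - df) / (chi2_b - df_b)


def tli(chi2, df, chi2_b, df_b):
    """Tucker-Lewis Index relative to the baseline (b) model."""
    ratio_b = chi2_b / df_b
    return (ratio_b - chi2 / df) / (ratio_b - 1)


def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation.

    Assumption: negative values of (chi2 - df) are truncated at 0.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (n * df))


# Hypothetical results: target model and baseline model, n = 200.
print(round(cfi(30.0, 20, 520.0, 28), 3))
print(round(tli(30.0, 20, 520.0, 28), 3))
print(round(rmsea(30.0, 20, 200), 3))
```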
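Special Case A
[Figure: path diagram with observed Sex, latent Verbal Aggression (indicators t4a3, t4a93, t4a94; errors e1 to e3), and latent Physical Aggression (indicators t4a37, t4a57, t4a90; errors e4 to e6), with disturbances d1 and d2]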
Special Case A
  Assumption: $x = \xi$
  $y = \tau_y + \Lambda_y \eta + \varepsilon$
  $\eta = B\eta + \Gamma x + \zeta$
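Special Case B
[Figure: path diagram with latent Verbal Aggression (indicators x1 to x3; errors e1 to e3), latent Physical Aggression (indicators x4 to x6; errors e4 to e6), and observed Peer Status with disturbance d]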
Special Case B
  Assumption: $y = \eta$
  $x = \tau_x + \Lambda_x \xi + \delta$
  $y = By + \Gamma\xi + \zeta$
Other Special Cases of SEM
  Confirmatory Factor Analysis (measurement model only)
  Multiple & Multivariate Regression
  ANOVA / MANOVA (multigroup CFA)
  ANCOVA
  Path Analysis Model (no latent variables)
  Simultaneous Econometric Equations
  Growth Curve Modeling
  …
EFA vs. CFA
[Figure: two path diagrams labeled Exploratory Factor Analysis and Confirmatory Factor Analysis, each with Factor 1 and Factor 2, indicators x1 to x6, and errors e1 to e6]
Multiple Regression
[Figure: path diagram with predictors x1, x2, and x3, outcome Y, and error e1]
ANCOVA
[Figure: path diagram with Group, Pretest1, Pretest2, Posttest1, and Posttest2, with errors e1 and e2]
Multivariate Normality Assumption
  Observed data are summed up perfectly by the covariance matrix S (+ means M); S thus is an estimator of the population covariance
Consequences of Violation
  Inflated χ² & deflated CFI and TLI lead to rejecting plausible models
  Inflated standard errors attenuate factor loadings and relations of latent variables (structural parameters)
  (Cause: Sample covariances were underestimated)
Accommodating Strategies
  Correcting fit
    Satorra-Bentler scaled χ² & standard errors (estimator = mlm; in Mplus)
  Correcting standard errors
    Bootstrapping
  Transforming nonnormal variables
    Transforming into new normal indicators (undesirable)
  SEM with categorical variables
Satorra-Bentler Scaled χ² & SE
  S-B χ² = d⁻¹ (ML-based χ²) (d = scaling factor that incorporates kurtosis)
  Effect: performs well with continuous data in terms of χ², CFI, TLI, RMSEA, parameter estimates and standard errors.
  Also works with certain categorical variables (see next slide)
  Analysis: estimator = MLM;
Workable Categorical Data
[Figure: bar chart of a variable with five categories (1 to 5)]
Nonworkable Categorical Data
[Figure: bar chart of a variable with three categories (1 to 3)]
Bootstrapping (resampling of data)
  Original      btstrp1       btstrp2       …
  x     y       x     y       x     y
  1     5       5     3       1     3
  2     4       1     1       5     4
  3     3       3     2       4     1
  4     2       4     5       2     2
  5     1       2     4       3     5
  .     .       .     .       .     .
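A minimal sketch of the resampling idea, using the five original (x, y) cases from the table. It assumes whole cases are drawn with replacement, and it uses the mean of x as the statistic purely for illustration. The function names and the seed are hypothetical, and this is not how Mplus implements its bootstrap internally.

```python
import random
import statistics


def bootstrap_se(rows, statistic, n_boot=500, seed=1):
    """Bootstrap standard error: the standard deviation of a statistic
    computed on n_boot resamples drawn with replacement from the sample."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(rows, k=len(rows))   # same size as the original
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


def mean_of_x(rows):
    return statistics.mean(x for x, _ in rows)


original = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (4.0, 2.0), (5.0, 1.0)]
se = bootstrap_se(original, mean_of_x, n_boot=500, seed=1)
print(se)
```

Because every resample is drawn from the observed cases only, the procedure treats the sample as if it were the population.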
Limitation of Bootstrapping
  Assumption: Sample = Population
  Useful diagnostic tool
  Does not compensate for
    small or unrepresentative samples
    severely non-normal data
    absence of independent samples for cross-validation
  Analysis: Bootstrap = 500 (standard/residual);
  Output: stand cinterval;
Multiple Programs Integrated
  SEM of both continuous and categorical variables
  Multilevel modeling
  Mixture modeling (identify hidden groups)
  Complex survey data modeling (stratification, clustering, weights)
  Modern missing data treatment
  Monte Carlo simulations
Types of Mplus Files
  Data (*.dat, *.txt)
  Input (specify a model, <= 80 columns/line)
  Output (automatically produced)
  Plot (automatically produced)
Data File Format
  Free
    Delimited by tab, space, or comma
    All missing values must be flagged with special numbers / symbols
    Default in Mplus
    Computationally slow with large data sets
  Fixed
    Format = 3F3, 5F3.2, F5.1;
Mplus Input
  DATA: File = ?
  VARIABLE: Names = ?; Usevar = ?; Categ = ?;
  ANALYSIS: Type = ?
  MODEL: (BY, ON, WITH)
  OUTPUT: Stand;
Model Specification in Mplus
  BY      Measured by       (F by x1 x2 x3 x4)
  ON      Regressed on      (y on x)
  WITH    Correlated with   (x with y)
  XWITH   Interact with     (inter | F1 xwith F2)
  PON     Pair ON           (y1 y2 pon x1 x2 = y1 on x1; y2 on x2)
  PWITH   Pair WITH         (x1 x2 pwith y1 y2 = x1 with y1; x2 with y2)
Default Specification
  Error or residual (disturbance)
  Covariance of exogenous variables in CFA
  Certain covariances of residuals (z1, z2)
Graphic Model
[Figure: path diagram with five latent factors: F1 (y1 to y3), F2 (y4 to y6), F3 (y7 to y9), F4 (y10 to y12), and F5 (y13 to y15), with disturbances d3, d4, and d5]
Model Specification
  Model:
    f1 by y1-y3;
    f2 by y4-y6;
    f3 by y7-y9;
    f4 by y10-y12;
    f5 by y13-y15;
    f3 on f1 f2;
    f4 on f2;
    f5 on f2 f3 f4;
  MeaErrors are au
Practice
  Prepare two data files for Mplus
    Mediation.sav
    Aggress.sav
  Model specification
    Single group CFA
    Examine mediation effects in a full SEM
    Run a MIMIC model of aggressions
    Multigroup CFA to examine measurement invariance
SPSS Data
  Missing values?
    Leave as blank to use fixed format
    Recode into a special number to use free format
  Save as & choose file type
    Fixed ASCII
    Free *.dat (with or without variable names?)
  Copy & paste variable names into the Mplus input file
Mplus Interface
  Activate Mplus program
  Language generator
  Manually create an input file
Four Separate Files (Mplus)
  Data: best prepared with other programs
  Input: need to manually specify a model
  Output: automatic output window
  Graph: automatic graph file
Data File
  Individual case data (*.dat or *.txt)
    Free format (default)
      Variables separated by tab, comma, or space
      All missing values must be flagged with special symbols or numbers
    Fixed format
      Each variable takes fixed space, e.g. 2F2, 4F6, 5F6.3
      Missing values can be left blank
  Summary data
    Variance-covariance matrix, means
    Correlation matrix, standard deviations, means
SPSS to Mplus
  Open "Antisocial.sav" with SPSS
  Work in Variable Window
  Option 1: Fixed format
    Change format to simplify
    Save as ? (Type = Fixed ASCII)
  Option 2: Free format
    Recode missing values
    Save as ? (Tab-delimited)
Fixed Format
  F3 4F3.2 25F1
  F3: one variable that takes 3 columns
  4F3.2: 4 variables, each taking 3 columns with 2 decimals
  25F1: 25 variables, each using one column
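To check that a fixed-format statement matches the width of a data line, the statement can be expanded into one column width per variable. This is a sketch only: `field_widths` is a hypothetical helper, not an Mplus or SPSS function, and it handles only the simple `nFw.d` pattern shown on the slides.

```python
import re


def field_widths(format_spec):
    """Expand a format such as 'F3 4F3.2 25F1' into a list of column widths,
    one entry per variable."""
    widths = []
    for token in format_spec.replace(",", " ").split():
        match = re.fullmatch(r"(\d*)F(\d+)(?:\.(\d+))?", token, flags=re.IGNORECASE)
        if match is None:
            raise ValueError(f"unsupported format token: {token}")
        repeat = int(match.group(1) or 1)      # leading count, default 1
        width = int(match.group(2))            # columns per variable
        widths.extend([width] * repeat)
    return widths


widths = field_widths("F3 4F3.2 25F1")
print(len(widths), sum(widths))   # number of variables, total columns per line
```

For the format on the slide this gives 30 variables occupying 40 columns, so every data line should be 40 characters wide.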
Copy SPSS Variable Names into Mplus
  Menu: Utilities, then Variables
  Highlight to select variables
  Paste
  Go to Syntax Window
  Select & copy
  Paste under Names Are in the Mplus input file
  Practice now
SAS to Mplus
  Assign flags to missing values (use Array code for many variables)
  Proc Export Data = Data File
    Outfile = "Mplus input file folder\*.dat"
    DBMS = dlm Replace;
  Run;
  Practice
Fixed Format Out of SAS
  Open with SPSS
  Save as fixed format
  Practice
Stata2mplus
  Converting a Stata data file to *.dat
  Find out: http://www.ats.ucla.edu/stat/stata/faq/stata2mplus.htm
Modification Indices
  Lower bound estimate of the expected chi-square decrease from freely estimating a parameter fixed at 0
  Mplus Output: stand Mod(10);
  Start with least important parameters (covariance of errors)
  Caution: justification?
Indirect (Mediation) Effect
  A*B
  Mplus specification:
    Model Indirect: DV IND Mediator IV;
Model Comparison
  Model:
    Probabilistic statement about the relations of variables
    Imperfect but useful
  Models differ:
    Different variables and different relations
    Same variables but different relations
Nested Model
  A nested model (b) comes from the general model (a) by
    Removing a parameter (e.g. a path)
    Fixing a parameter at a value (e.g. 0)
    Constraining a parameter to be equal to another
  Both models have the same variables
Test If A=B
[Figure: the five-factor path diagram (F1 to F5, indicators y1 to y15, disturbances d3 to d5) with two structural paths labeled A and B]
Model Comparison via χ² Difference
  χ² =        df =        (Nested model)
  χ² =        df =        (Default model)
  ___________________________________
  χ²_dif =    df_dif =    p = ?   (a single tail)
  Find p value at the following website: http://www.tutor-homework.com/statistics_tables/statistics_tables.html
  Conclusion: If p > .05, there is no difference between the default model and the nested model; that is, the hypothesis that the constrained parameters are equal is not rejected.
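The p value can also be computed directly instead of looked up. The sketch below uses only the Python standard library and evaluates the chi-square upper tail through the series expansion of the regularized incomplete gamma function. The model chi-square values are hypothetical, and the function names are illustrative.

```python
import math


def chi_square_p_value(chi2, df):
    """Upper-tail probability of a chi-square statistic with df degrees of freedom.

    Uses the series P(a, x) = x^a e^(-x) / Gamma(a) * sum_n x^n / (a (a+1) ... (a+n))
    with a = df / 2 and x = chi2 / 2, then returns 1 - P(a, x).
    """
    if chi2 <= 0:
        return 1.0
    a, x = df / 2.0, chi2 / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_difference(chi2_nested, df_nested, chi2_default, df_default):
    """Difference test: the nested (constrained) model has the larger chi-square and df."""
    chi2_dif = chi2_nested - chi2_default
    df_dif = df_nested - df_default
    return chi2_dif, df_dif, chi_square_p_value(chi2_dif, df_dif)


# Hypothetical results: constraining A = B adds one degree of freedom.
chi2_dif, df_dif, p = chi_square_difference(105.2, 81, 102.9, 80)
print(round(chi2_dif, 2), df_dif, round(p, 3))
```

With these illustrative numbers p is above .05, so the equality constraint would not be rejected.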
Practice
  Test if effect A=B
Equality Constraints in Mplus
  Parameter labels:
    Numbers
    Letters
    Combination of numbers and letters
  Constraint (B=A)
    F3 on F1 (A);
    F3 on F2 (A);
Run CFA with Real Data
[Figure: two-factor CFA path diagram. Verbal Aggression (indicators a3, a93, a94; errors e1 to e3) and Physical Aggression (indicators a37, a57, a90; errors e4 to e6)]
Multigroup Analysis
  VARIABLE: USEVAR = X1 X2 X3 X4;
            Grouping IS sex (0=F 1=M);
  ANALYSIS: TYPE = MISSING H1;
  MODEL:    F1 BY X1 - X4;
  MODEL M:  F1 BY X2 - X4;
  Note: sex is grouping variable and is not used in the model.
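Why Measurement Invariance Matters
  $X_{g1} = \tau_{g1} + \Lambda_{g1}\xi_{g1} + \delta_{g1}$
  $X_{g2} = \tau_{g2} + \Lambda_{g2}\xi_{g2} + \delta_{g2}$
  $X_{g1} - X_{g2} = (\tau_{g1} - \tau_{g2}) + (\Lambda_{g1}\xi_{g1} - \Lambda_{g2}\xi_{g2}) + (\delta_{g1} - \delta_{g2})$
  With invariant intercepts and loadings ($\tau_{g1} = \tau_{g2}$, $\Lambda_{g1} = \Lambda_{g2} = \Lambda$):
  $X_{g1} - X_{g2} = \Lambda(\xi_{g1} - \xi_{g2}) + (\delta_{g1} - \delta_{g2})$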
Test Measurement Invariance: Default Model
  Model:
    F1 By a3
          a93 (1)
          a94 (2);
    F2 By a37
          a57 (3)
          a90 (4);
  Model M:
    F1 By a93
          a94;
    F2 By a57
          a90;
  Output: stand;
  Note: Reference indicators in the second group are omitted.
Test Measurement Invariance: Constrained Model
  Model:
    F1 By a3
          a93 (1)
          a94 (2);
    F2 By a37
          a57 (3)
          a90 (4);
  Model M:
    F1 By a93 (1)
          a94 (2);
    F2 By a57 (3)
          a90 (4);
  Output: stand;
  Note: Reference indicators in the second group are omitted.
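Estimate with Real Data
[Figure: path diagram with observed Sex, Race1, and Race2, latent Verbal Aggression (indicators a3, a93, a94; errors e1 to e3), and latent Physical Aggression (indicators a37, a57, a90; errors e4 to e6), with disturbances d1 and d2]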
SEM with Categorical Indicators
Session II
Problems of Ordinal Scales
  Not truly interval measure of a latent dimension, having measurement errors
  Limited range, biased against extreme scores
  Items are equally weighted (implicitly by 1) when summed up or averaged, losing item sensitivity
Criticisms on Using Ordinal Scales as Measures of Latent Constructs
  Stevens (1951): …means should be avoided because its meaning could be easily interpreted beyond ranks.
  Merbitz (1989): Ordinal scales and foundations of misinference
  Muthen (1983): Pearson product moment correlations of ordinal scales will produce distorted results in structural equation modeling.
  Wright (1998): "…misuses nonlinear raw scores or Likert scales as though they were linear measures will produce systematically distorted results. …It's not only unfair, it is immoral."
Assumption of Categorical Indicators
  A categorical indicator is a coarse categorization of a normally distributed underlying dimension
Latent (Polychoric) Correlation
Categorization of Latent Dimension & Threshold
[Figure: a latent dimension Y cut by thresholds into ordered response categories (No / Yes; Never / Sometimes / Often; 1 to 5), with a threshold separating adjacent categories m-1 and m]
Threshold
  The value of a latent dimension at which respondents have a 50% probability of responding in either of two adjacent categories
  Number of thresholds = response categories - 1, e.g. a binary variable has one threshold.
  Mplus specification: [x$1] [y$2];
Normal Cumulative Distributions
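Under the assumption that a categorical indicator cuts a standard normal latent dimension, sample thresholds can be sketched as inverse normal values of the cumulative category proportions. The function name and the counts are illustrative. Note that `NormalDist.inv_cdf` rejects proportions of exactly 0 or 1, so an empty first or last category would raise an error.

```python
from statistics import NormalDist


def thresholds_from_counts(counts):
    """One threshold per boundary between adjacent categories
    (number of categories - 1), on a standard normal latent scale."""
    total = sum(counts)
    cumulative = 0
    result = []
    for count in counts[:-1]:
        cumulative += count
        result.append(NormalDist().inv_cdf(cumulative / total))
    return result


# A binary item answered "No" by half the sample has one threshold at 0.
binary = thresholds_from_counts([50, 50])

# A four-category item (hypothetical counts) has three increasing thresholds.
four_category = thresholds_from_counts([60, 80, 43, 4])
print(binary, four_category)
```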
Measurement Models of Categorical Indicators (2P IRT)
  Probit: $P(u = 1 \mid \eta) = \Phi[(-\tau + \lambda\eta)\,\theta^{-1/2}]$
    (Estimation = Weighted Least Squares with df adjusted for Means and Variances)
  Logistic: $P(u = 1 \mid \eta) = 1 / (1 + e^{-(-\tau + \lambda\eta)})$
    (Maximum Likelihood Estimation)
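The two response functions can be evaluated directly. This sketch assumes the conventional notation τ (threshold), λ (loading), θ (residual variance), and η (latent score). The parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def probit_probability(eta, tau, lam, theta=1.0):
    """P(u = 1 | eta) under the probit model: Phi[(-tau + lam * eta) / sqrt(theta)]."""
    return NormalDist().cdf((-tau + lam * eta) / math.sqrt(theta))


def logistic_probability(eta, tau, lam):
    """P(u = 1 | eta) under the logistic model: 1 / (1 + exp(-(-tau + lam * eta)))."""
    return 1.0 / (1.0 + math.exp(-(-tau + lam * eta)))


# At eta = tau / lam the probability of endorsing the item is exactly .50.
tau, lam = 1.0, 2.0
print(probit_probability(0.5, tau, lam), logistic_probability(0.5, tau, lam))

# The probability rises with the latent score.
print(probit_probability(2.0, tau, lam), logistic_probability(2.0, tau, lam))
```

The point where both curves cross .50 is τ/λ, which is the item difficulty in IRT terms.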
Converting CFA to IRT Parameters
  Probit conversion
    $a = \lambda\,\theta^{-1/2}$
    $b = \tau / \lambda$
  Logit conversion
    $a = \lambda / D$ (D = 1.7)
    $b = \tau / \lambda$
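The conversions amount to two small functions. This is a sketch assuming a standardized factor (mean 0, variance 1). The function names and the input estimates are hypothetical.

```python
import math


def probit_to_irt(lam, tau, theta):
    """Discrimination a and difficulty b from probit CFA estimates:
    a = lam / sqrt(theta), b = tau / lam."""
    return lam / math.sqrt(theta), tau / lam


def logit_to_irt(lam, tau, d=1.7):
    """Discrimination a and difficulty b from logit CFA estimates:
    a = lam / D, b = tau / lam."""
    return lam / d, tau / lam


a_probit, b_probit = probit_to_irt(lam=0.8, tau=0.4, theta=0.36)
a_logit, b_logit = logit_to_irt(lam=1.7, tau=0.85)
print(a_probit, b_probit)
print(a_logit, b_logit)
```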
One Parameter Item Response Theory Model
  Analysis: Estimator = ML;
  Model:
    F by [email protected]
         [email protected]
         …
Sample Information
  Latent correlation matrix: equivalent to covariance matrix of continuous indicators
  Threshold matrix Δ: equivalent to means of continuous indicators
Stages of Estimation
  Sample information: correlations / thresholds / intercepts (Maximum Likelihood)
  Correlation structure (Weighted Least Squares)
  $F = \sum_{g=1}^{G} (s^{(g)} - \sigma^{(g)})' \, W^{(g)-1} \, (s^{(g)} - \sigma^{(g)})$
W⁻¹ Matrix
  Elements:
    S1: intercepts and/or thresholds
    S2: slopes
    S3: residual variances and correlations
  W⁻¹: divided by sample size
Estimation
  WLSMV: Weighted Least Squares estimation; χ² with degrees of freedom adjusted for Means and Variances of latent and observed variables
Baseline Model
  Estimated thresholds of all the categorical indicators
  df = p² - 3p (p = number of polychoric correlations)
Data Preparation Tip
  Categorical indicators are required to have consistent response categories across groups
  Run Crosstab to identify zero cells
  Recode variables to collapse certain categories to eliminate zero cells
Inconsistent Categories
            1     2     3     4     5
  Male      60    80    43    4     0
  Female    57    86    32    16    2

            1     2     3     4
  Male      60    80    43    4
  Female    57    86    32    18
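The recoding behind the second table can be sketched as follows: category 5 is merged into category 4, so that both groups use the same four categories. The raw responses are rebuilt here from the female counts in the first table, and the function names are illustrative.

```python
def collapse_top_category(values, top):
    """Recode every response above `top` into `top`."""
    return [min(value, top) for value in values]


def frequency_table(values, categories):
    """Count how many responses fall in each category."""
    return [sum(1 for value in values if value == category) for category in categories]


# Female responses reconstructed from the counts 57, 86, 32, 16, 2.
female = [1] * 57 + [2] * 86 + [3] * 32 + [4] * 16 + [5] * 2

collapsed = collapse_top_category(female, top=4)
print(frequency_table(collapsed, [1, 2, 3, 4]))   # [57, 86, 32, 18]
```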
Specify Dependent Variables as Categorical
  Variable:
    Categ = x1-x3;
    Categ = all;
Reporting Results
  Guidelines:
    Conceptual model
    Software + version
    Data (continuous or categorical?)
    Treatment of missing values
    Estimation method
    Model fit indices (χ²(df), p, CFI, TLI, RMSEA)
    Measurement properties (factor loadings + reliability)
    Structural parameter estimates (estimate, significance, 95% confidence intervals), e.g. (estimate = .23*, CI = .18~.28)
Reliability of Categorical Indicators (variance approach)
  Reliability = $(\sum \lambda_i)^2 / [(\sum \lambda_i)^2 + \sum \sigma_i^2]$, where
    $(\sum \lambda_i)^2$ = square of the sum of standardized factor loadings
    $\sum \sigma_i^2$ = sum of residual variances
    i = items or indicators
    $\sigma_i^2 = 1 - \lambda_i^2$
  McDonald, R. P. (1999). Test theory: A unified treatment (p. 89). Mahwah, New Jersey: Lawrence Erlbaum Associates.
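The reliability formula is easy to apply once the standardized loadings are available. A minimal sketch with hypothetical loadings (the function name is illustrative):

```python
def reliability_from_loadings(loadings):
    """Reliability from standardized loadings:
    (sum of loadings)^2 / [(sum of loadings)^2 + sum of residual variances],
    where each residual variance is 1 - loading^2."""
    squared_sum = sum(loadings) ** 2
    residual_sum = sum(1 - lam ** 2 for lam in loadings)
    return squared_sum / (squared_sum + residual_sum)


# Hypothetical standardized loadings for three indicators.
loadings = [0.7, 0.8, 0.6]
reliability = reliability_from_loadings(loadings)
print(round(reliability, 3))
```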
Calculator of Reliability (Categorical Indicators)
  SPSS reliability data
  SPSS reliability syntax
Trouble Shooting Strategy
  Start with one part of a big model
  Ensure every part works
  Estimate all parts simultaneously
Important Resources
  Mplus website: www.statmodel.com
  Papers: http://www.statmodel.com/papers.shtml
  Mplus discussions: http://www.statmodel.com/cgi-bin/discus/discus.cgi