Structural Equation Modeling Using Mplus

Chongming Yang
Research Support Center
FHSS College

Structural?

Structuralism
Components
Relations

Objectives

Introduction to SEM
  The model
  Parameters
  Estimation
  Model evaluation
  Applications
Estimate simple models with Mplus

Continuous Dependent Variables

Session I

Information of Variable

Mean
Variance
Skewness
Kurtosis

Variance & Covariance

$V = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$

$Cov = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$

Covariance Matrix (S)

      x1      x2      x3
x1    V1
x2    Cov21   V2
x3    Cov31   Cov32   V3
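As a minimal sketch of how the entries of S are obtained, the following Python snippet computes variances and covariances for three variables. The data values are hypothetical, and the denominator n is an assumption about the intended formula (some programs divide by n - 1 instead).

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def covariance(x, y):
    """Average cross-product of deviations from the means (denominator n)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def covariance_matrix(columns):
    """Full symmetric matrix S; the diagonal holds the variances."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]


# Hypothetical scores on three variables for five cases.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 6.0, 8.0, 10.0]
x3 = [5.0, 4.0, 3.0, 2.0, 1.0]

S = covariance_matrix([x1, x2, x3])
for name, row in zip(["x1", "x2", "x3"], S):
    print(name, row)
```

The covariance of a variable with itself is its variance, which is why one function fills both the diagonal and the off-diagonal cells.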

Statistical Model

Probabilistic statement about relations of variables
Imperfect but useful representation of reality

Structural Equation Modeling

A system of regression equations for latent variables to estimate and test direct and indirect effects without the influence of measurement errors.
To estimate and test theories about interrelations among observed and latent variables.

Latent Variable (Construct / Factor / Trait)

A hypothetical variable
  cannot be measured directly
  no objective measurement unit
  inferred from observable manifestations
Multiple manifestations (indicators)
Normally distributed interval dimension

How is Depression Distributed in?

BYU students
Patients for Therapy
Normal Distributions

Levels of Analyses

Observed
Latent

Test Theories

Classical True Score Theory: Observed Score = True Score + Error
Item Response Theory
Generalizability (Raykov & Marcoulides, 2006)

Graphic Symbols of SEM

Rectangle -- observed variable
Oval -- latent variable or error
Single-headed arrow -- causal relation
Double-headed arrow -- correlation

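Graphic Measurement Model of Latent $\xi$

[Figure: latent variable $\xi$ measured by indicators X1, X2, X3, with loadings $\lambda_1$, $\lambda_2$, $\lambda_3$ and errors $\delta_1$, $\delta_2$, $\delta_3$]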

Equations

Specific equations
$X_1 = \lambda_1 \xi + \delta_1$
$X_2 = \lambda_2 \xi + \delta_2$
$X_3 = \lambda_3 \xi + \delta_3$

Matrix Symbols
$X = \Lambda \xi + \delta$

True Score Theory?

Relations of Variances

$V_{X_1} = \lambda_1^2 \phi + \theta_1$
$V_{X_2} = \lambda_2^2 \phi + \theta_2$
$V_{X_3} = \lambda_3^2 \phi + \theta_3$

$\phi$ = variance of $\xi$
$\theta$ (variance of $\delta$) = measurement error / uniqueness

Unknown Parameters

$V_{X_1} = \lambda_1^2 \phi + \theta_1$
$V_{X_2} = \lambda_2^2 \phi + \theta_2$
$V_{X_3} = \lambda_3^2 \phi + \theta_3$

Sample Covariance Matrix (S)

      x1      x2      x3
x1    V1
x2    Cov21   V2
x3    Cov31   Cov32   V3

Variance of $\xi$ = common covariance of X1, X2, and X3

[Figure: variances of $\delta_1$, $\delta_2$, $\delta_3$; three values shown as 0]

Unstandardized Parameterization (scaling)

$\lambda_1 = 1$ (gives $\xi$ the scale of X1; X1 is called the reference indicator)
Variance of $\xi$ = common variance of X1, X2, and X3
Squared $\lambda$ = explained variance of X ($R^2$)
Variance of $\delta$ = unexplained variance -- error
Total Variance = Squared $\lambda$ + Variance of $\delta$

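Just Identified Model

[Figure: latent variable $\xi$ measured by X1, X2, X3, with loadings $\lambda_1$, $\lambda_2$, $\lambda_3$ and errors $\delta_1$, $\delta_2$, $\delta_3$]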

Reference Indicator (marker)

Choose conceptually the best
Small variance leads to non-convergence
Different markers give different parameter estimates and standard errors
Affects measurement invariance tests
Does not affect standardized estimates

Standardized Parameterization (scaling)

Variance of $\xi$ = 1 = common variance of X1, X2, and X3
Squared $\lambda$ = explained variance of X ($R^2$)
Variance of $\delta$ = $1 - \lambda^2$
Mean of $\xi$ = 0
Mean of $\delta$ = 0
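The standardized decomposition can be illustrated with a short Python sketch. The loadings below are hypothetical values chosen for illustration. With the factor variance fixed at 1, the correlation the model implies between two indicators of the same factor is the product of their loadings.

```python
def explained_variance(loading):
    """R-squared of an indicator: the squared standardized loading."""
    return loading ** 2


def error_variance(loading):
    """Unexplained (error) variance of a standardized indicator."""
    return 1.0 - loading ** 2


def implied_correlation(loading_i, loading_j):
    """Model-implied correlation of two indicators of one factor (factor variance = 1)."""
    return loading_i * loading_j


# Hypothetical standardized loadings for X1, X2, X3.
loadings = {"X1": 0.8, "X2": 0.7, "X3": 0.6}

for name, lam in loadings.items():
    print(name, round(explained_variance(lam), 2), round(error_variance(lam), 2))
```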

Two Kinds of Parameters

Fixed at 0, 1, or other values
Freely estimated

[Figure: example structural equation model
  General Intelligence: Analytic (d1), Reasoning (d2), Verbal (d3)
  Emotional Intelligence: Self Control (d4), Recognize/Assess (d5)
  Personality: Agreeableness (d6), Openness (d7)
  Job Satisfaction: Being Appreciated (e1), Social Relations (e2)
  Marital Satisfaction: Perceived Benefit (e3), Perceived Cost (e4)
  Disturbances: z1, z2]

Structural Equation Model in Matrix Symbols

$X = \Lambda_x \xi + \delta$ (exogenous)
$Y = \Lambda_y \eta + \varepsilon$ (endogenous)
$\eta = B\eta + \Gamma\xi + \zeta$ (structural model)

Note: The measurement model reflects the true score theory.

Structural Equation Model in Matrix Symbols

$X = \tau_x + \Lambda_x \xi + \delta$ (measurement)
$Y = \tau_y + \Lambda_y \eta + \varepsilon$ (measurement)
$\eta = \alpha + B\eta + \Gamma\xi + \zeta$ (structural)

Note: SEM with mean structure.

Model Implied Covariance Matrix ($\Sigma$)

Note: This covariance matrix contains unknown parameters in the equations.

(I - B) is non-singular

Estimations/Fit Functions

Hypothesis: $\Sigma = S$ or $\Sigma - S = 0$

Maximum Likelihood

$F = \log|\Sigma| + \mathrm{trace}(S\Sigma^{-1}) - \log|S| - (p + q)$
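To make the fit function concrete, here is a small Python sketch that evaluates F for two observed variables. The matrices are hypothetical, and the 2 x 2 determinant and inverse are written out by hand only to keep the example dependency-free. When the implied matrix equals S, F is 0; any discrepancy makes F positive.

```python
import math


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(a, b):
    """trace(a b) for 2 x 2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def ml_fit(S, Sigma):
    """F = log|Sigma| + trace(S Sigma^-1) - log|S| - (number of observed variables)."""
    n_vars = len(S)  # p + q in the slide's notation
    return (math.log(det2(Sigma)) + trace_of_product(S, inv2(Sigma))
            - math.log(det2(S)) - n_vars)


# Hypothetical sample covariance matrix and an implied matrix with no covariance.
S = [[2.0, 0.8], [0.8, 1.5]]
Sigma_independent = [[2.0, 0.0], [0.0, 1.5]]

print(ml_fit(S, S))                  # perfect reproduction of S
print(ml_fit(S, Sigma_independent))  # misfit from ignoring the covariance
```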

Convergence -- Reaching Limit

Minimize F while adjusting unknown parameters through an iterative process
Convergence value: F difference between last two iterations
Default convergence = .0001
Increase to help convergence (0.001 or 0.01)
e.g. Analysis: convergence = .01;

No Convergence

No unique parameter estimates
Lack of degrees of freedom (under-identification)
Variance of reference indicator too small
Fixed parameters are left to be freely estimated
Misspecified model

Absolute Fit Index

$\chi^2 = F(N - 1)$ (N = sample size)
$df = p(p+1)/2 - q$
  p = number of observed variables, so p(p+1)/2 = number of variances and covariances (add p means when a mean structure is modeled)
  q = number of unknown parameters to be estimated
prob = ? (A nonsignificant $\chi^2$ indicates good fit. Why?)
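The counting rule can be sketched in Python as follows. The parameter counts in the examples are assumptions about standard one-factor and two-factor CFA specifications with one reference indicator per factor, and the F and N values are hypothetical.

```python
def degrees_of_freedom(p, q, with_means=False):
    """Pieces of sample information minus the number of free parameters.

    p: number of observed variables
    q: number of freely estimated parameters
    with_means: add p means to the information when a mean structure is modeled
    """
    information = p * (p + 1) // 2 + (p if with_means else 0)
    return information - q


def chi_square(F, N):
    """Model chi-square from the minimized fit function value F and sample size N."""
    return F * (N - 1)


# One factor, three indicators: 2 free loadings + 1 factor variance + 3 error variances.
print(degrees_of_freedom(p=3, q=6))    # just identified

# Two correlated factors, six indicators:
# 4 free loadings + 2 factor variances + 1 factor covariance + 6 error variances.
print(degrees_of_freedom(p=6, q=13))

print(chi_square(F=0.05, N=201))
```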

Sample Information

        x1      x2      x3      x4     ...
x1      v1
x2      cov21   v2
x3      cov31   cov32   v3
x4      cov41   cov42   cov43   v4
...
Means   Mean1   Mean2   Mean3   Mean4  ...

Total info = P(P+1)/2 + Means

Absolute Fit -- SRMR

Standardized Root Mean Square Residual
SRMR = difference between observed and implied covariances in standardized metric
Desirable when < .08 by a commonly cited guideline, but no consensus

Relative Fit: Relative to Baseline (Null) Model

All unknown parameters are fixed at 0
Variables not related (all relations = 0)
Model-implied covariances = 0
Fit to sample covariance matrix S
Obtain $\chi^2$, df, prob < .0001

Relative Fit Indices

$CFI = 1 - (\chi^2 - df) / (\chi^2_b - df_b)$
  b = baseline model
  Comparative Fit Index, desirable >= .95; 95% better than the baseline model

$TLI = (\chi^2_b / df_b - \chi^2 / df) / (\chi^2_b / df_b - 1)$
  Tucker-Lewis Index, desirable >= .90

$RMSEA = \sqrt{(\chi^2 - df) / (n \cdot df)}$
  Root Mean Square Error of Approximation, desirable <= .06
  Penalizes a large model with more unknown parameters
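The three indices can be computed directly from the model and baseline chi-square values, as in this Python sketch. All input numbers are hypothetical. Setting RMSEA to 0 when the chi-square falls below its df is a common convention added here as an assumption, not something stated on the slide.

```python
import math


def cfi(chi2, df, chi2_b, df_b):
    """Comparative Fit Index relative to the baseline model (subscript b)."""
    return 1.0 - (chi2 - df) / (chi2_b - df_b)


def tli(chi2, df, chi2_b, df_b):
    """Tucker-Lewis Index."""
    return (chi2_b / df_b - chi2 / df) / (chi2_b / df_b - 1.0)


def rmsea(chi2, df, n):
    """RMSEA; a negative (chi2 - df) is treated as 0 (assumed convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (n * df))


# Hypothetical results: tested model, baseline model, and sample size.
chi2, df = 20.0, 8
chi2_b, df_b = 500.0, 15
n = 300

print(round(cfi(chi2, df, chi2_b, df_b), 3))
print(round(tli(chi2, df, chi2_b, df_b), 3))
print(round(rmsea(chi2, df, n), 3))
```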

Special Case A

[Figure: Verbal Aggression (indicators t4a3, t4a93, t4a94; errors e3, e2, e1) and Physical Aggression (indicators t4a37, t4a57, t4a90; errors e6, e5, e4), with Sex; disturbances d1 and d2]

Special Cases A

Assumption: $x = \xi$
$y = \tau_y + \Lambda_y \eta + \varepsilon$
$\eta = \alpha + \Gamma x + \zeta$

Special Case B

[Figure: Verbal Aggression (indicators x1, x2, x3; errors e1, e2, e3) and Physical Aggression (indicators x4, x5, x6; errors e4, e5, e6), with Peer Status; disturbance d]

Special Cases B

Assumption: $y = \eta$
$x = \tau_x + \Lambda_x \xi + \delta$
$y = \alpha + \Gamma \xi + \zeta$

Other Special Cases of SEM

Confirmatory Factor Analysis (measurement model only)
Multiple & Multivariate Regression
ANOVA / MANOVA (multigroup CFA)
ANCOVA
Path Analysis Model (no latent variables)
Simultaneous Econometric Equations
Growth Curve Modeling
...

EFA vs. CFA

[Figure: two panels, Exploratory Factor Analysis and Confirmatory Factor Analysis; each shows Factor 1 and Factor 2 with indicators x1-x6 and errors e1-e6]

Multiple Regression

[Figure: x1, x2, and x3 predicting Y, with error e1]

ANCOVA

[Figure: Group, Pretest1, and Pretest2 with Posttest1 (error e1) and Posttest2 (error e2)]

Multivariate Normality Assumption

Observed data summed up perfectly by covariance matrix S (+ means M); S thus is an estimator of the population covariance

Consequences of Violation

Inflated $\chi^2$ & deflated CFI and TLI: reject plausible models
Inflated standard errors: attenuate factor loadings and relations of latent variables (structural parameters)
(Cause: Sample covariances were underestimated)

Accommodating Strategies

Correcting fit
  Satorra-Bentler scaled $\chi^2$ & standard errors (estimator = mlm; in Mplus)
Correcting standard errors
  Bootstrapping
Transforming nonnormal variables
  Transforming into new normal indicators (undesirable)
SEM with Categorical Variables

Satorra-Bentler Scaled $\chi^2$ & SE

S-B $\chi^2$ = $d^{-1}$ (ML-based $\chi^2$) (d = scaling factor that incorporates kurtosis)
Effect: performs well with continuous data in terms of $\chi^2$, CFI, TLI, RMSEA, parameter estimates, and standard errors.
Also works with certain categorical variables (see next slide)
Analysis: estimator = MLM;

Workable Categorical Data

[Figure: histogram of a variable with five response categories (1-5)]

Nonworkable Categorical Data

[Figure: histogram of a variable with three response categories (1-3)]

Bootstrapping (resampling of data)

Original    btstrp1    btstrp2    ...
x   y       x   y      x   y
1   5       5   3      1   3
2   4       1   1      5   4
3   3       3   2      4   1
4   2       4   5      2   2
5   1       2   4      3   5
.   .       .   .      .   .
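A minimal Python sketch of case resampling follows, using the five original (x, y) cases from the table. It assumes whole cases are drawn with replacement so that each x stays paired with its y. The statistic (the mean of x), the seed, and the helper names are illustrative choices only.

```python
import random
import statistics


def bootstrap_sample(rows, rng):
    """Draw len(rows) cases with replacement; each (x, y) pair stays together."""
    return [rng.choice(rows) for _ in rows]


def bootstrap_mean_x(rows, n_boot=500, seed=1):
    """Mean of x in each bootstrap sample, plus the bootstrap standard error."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = bootstrap_sample(rows, rng)
        estimates.append(statistics.mean(x for x, y in sample))
    return estimates, statistics.stdev(estimates)


# The original five cases from the table above.
original = [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]

estimates, standard_error = bootstrap_mean_x(original)
print(len(estimates), round(standard_error, 3))
```

The spread of the statistic across the resamples serves as its standard error, which is the idea behind requesting bootstrap standard errors in the analysis.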

Limitation of Bootstrapping

Assumption: Sample = Population
Useful diagnostic tool
Does not compensate for
  small or unrepresentative samples
  severely non-normal data
  absence of independent samples for cross-validation
Analysis: Bootstrap = 500 (standard/residual);
Output: stand cinterval;

Mplus

www.statmodel.com

Multiple Programs Integrated

SEM of both continuous and categorical variables
Multilevel modeling
Mixture modeling (identify hidden groups)
Complex survey data modeling (stratification, clustering, weights)
Modern missing data treatment
Monte Carlo simulations

Types of Mplus Files

Data (*.dat, *.txt)
Input (specify a model, <= 80 columns/line)
Output (automatically produced)
Plot (automatically produced)

Data File Format

Free
  Delimited by tab, space, or comma
  All missing values must be flagged with special numbers / symbols
  Default in Mplus
  Computationally slow with large data sets
Fixed
  Format = 3F3, 5F3.2, F5.1;

Mplus Input

DATA:     File = ?
VARIABLE: Names = ?; Usevar = ?; Categ = ?;
ANALYSIS: Type = ?
MODEL:    (BY, ON, WITH)
OUTPUT:   Stand;

Model Specification in Mplus

BY      Measured by       (F by x1 x2 x3 x4)
ON      Regressed on      (y on x)
WITH    Correlated with   (x with y)
XWITH   Interact with     (inter | F1 xwith F2)
PON     Pair ON           (y1 y2 pon x1 x2 = y1 on x1; y2 on x2)
PWITH   Pair WITH         (x1 x2 pwith y1 y2 = x1 with y1; x2 with y2)

Default Specification

Error or residual (disturbance)
Covariance of exogenous variables in CFA
Certain covariances of residuals (z2)

Graphic Model

[Figure: five factors, each with three indicators (F1: y1-y3; F2: y4-y6; F3: y7-y9; F4: y10-y12; F5: y13-y15); disturbances d3, d4, d5]

Model Specification

Model: f1 by y1-y3;
       f2 by y4-y6;
       f3 by y7-y9;
       f4 by y10-y12;
       f5 by y13-y15;
       f3 on f1 f2;
       f4 on f2;
       f5 on f2 f3 f4;

MeaErrors are au

Practice

Prepare two data files for Mplus
  Mediation.sav
  Aggress.sav
Model Specification
Single Group CFA
Examine Mediation Effects in a Full SEM
Run a MIMIC model of aggressions
Multigroup CFA to examine measurement invariance

SPSS Data

Missing Values?
  Leave as blank to use fixed format
  Recode into special number to use free format
Save as & choose file type
  Fixed ASCII
  Free *.dat (with or without variable names?)
Copy & paste variable names into Mplus input file

Mplus Interface

Activate Mplus Program
Language Generator
Manually Create an Input File

Four Separate Files (Mplus)

Data: best prepared with other programs
Input: need to specify a model manually
Output: automatic output window
Graph: automatic graph file

Data File

Individual Case Data (*.dat or *.txt)
  Free Format (default)
    Variables separated by tab, comma, or space
    All missing values must be flagged with special symbols or numbers
  Fixed Format
    Each variable takes fixed space, e.g. 2F2, 4F6, 5F6.3
    Missing values can be left blank
Summary Data
  Variance-covariance matrix, means
  Correlation matrix, standard deviations, means

SPSS to Mplus

Open "Antisocial.sav" with SPSS
Work in Variable Window
Option 1: Fixed Format
  Change Format to Simplify
  Save as ? (Type = Fixed ASCII)
Option 2: Free Format
  Recode missing values
  Save as ? (Tab-delimited)

Fixed Format

F3 4F3.2 25F1

F3      One variable that takes 3 columns
4F3.2   4 variables, each taking 3 columns with 2 decimals (the decimal point takes a column)
25F1    25 variables, each using one column
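To check that a data file really matches a format statement, it can help to slice one record by hand. The Python sketch below is an illustration only: it expands a format string such as the one above into column widths and cuts a record accordingly. The helper names and the sample record are hypothetical, blank fields are treated as missing, and values are assumed to carry an explicit decimal point (implied decimals are not handled).

```python
import re


def field_widths(format_spec):
    """Expand e.g. 'F3 4F3.2 25F1' into a list of column widths."""
    widths = []
    for token in format_spec.replace(",", " ").split():
        match = re.fullmatch(r"(\d*)F(\d+)(?:\.\d+)?", token, flags=re.IGNORECASE)
        if match is None:
            raise ValueError(f"unrecognized format token: {token}")
        repeat = int(match.group(1) or 1)
        widths.extend([int(match.group(2))] * repeat)
    return widths


def parse_record(line, widths):
    """Slice one fixed-format record; blank fields become None (missing)."""
    values, position = [], 0
    for width in widths:
        field = line[position:position + width]
        position += width
        values.append(float(field) if field.strip() else None)
    return values


widths = field_widths("F3 4F3.2 25F1")
record = "101" + ".25" + ".50" + "   " + ".75" + "1" * 25  # hypothetical case
values = parse_record(record, widths)
print(sum(widths), values[:5])
```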

Copy SPSS Variable Names into Mplus

Menu: Utilities > Variables
Highlight to select variables
Paste
Go to Syntax Window
Select & Copy
Paste under Names Are in Mplus input file
Practice now

SAS to Mplus

Assign flags to missing values (use Array code for many variables)
Proc Export Data = Data File
  Outfile = "Mplus input file folder\*.dat"
  DBMS = dlm Replace;
Run;
Practice

Fixed Format Out of SAS

Open with SPSS
Save as Fixed Format
Practice

Stata2mplus

Converting a Stata data file to *.dat
Find out: http://www.ats.ucla.edu/stat/stata/faq/stata2mplus.htm

Modification Indices

Lower bound estimate of the expected chi-square decrease from freely estimating a parameter fixed at 0
Mplus Output: stand Mod(10);
Start with least important parameters (covariance of errors)
Caution: justification?

Indirect (Mediation) Effect

A*B
Mplus specification:
Model Indirect: DV IND Mediator IV;
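The product A*B can be illustrated numerically. In the Python sketch below, the path estimates and standard errors are hypothetical. The first-order (Sobel) standard error is a technique brought in here for illustration; it is not taken from the slides, and the software may compute standard errors differently.

```python
import math


def indirect_effect(a, b):
    """Indirect effect: path IV -> Mediator (a) times path Mediator -> DV (b)."""
    return a * b


def sobel_se(a, se_a, b, se_b):
    """First-order (Sobel) standard error of a*b; an assumed illustration."""
    return math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


# Hypothetical path estimates and standard errors.
a, se_a = 0.5, 0.1
b, se_b = 0.4, 0.1

effect = indirect_effect(a, b)
se = sobel_se(a, se_a, b, se_b)
print(round(effect, 3), round(se, 3), round(effect / se, 2))
```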

Model Comparison

Model:
  Probabilistic statement about the relations of variables
  Imperfect but useful
Models Differ:
  Different variables and different relations
  Same variables but different relations

Nested Model

A nested model (b) comes from a general model (a) by
  Removing a parameter (e.g. a path)
  Fixing a parameter at a value (e.g. 0)
  Constraining a parameter to be equal to another
Both models have the same variables

Test If A=B

[Figure: the five-factor model (F1-F5 with indicators y1-y15 and disturbances d3, d4, d5), with two paths labeled A and B]

Model Comparison via $\chi^2$ Difference

$\chi^2$ =          df =          (Nested model)
$\chi^2$ =          df =          (Default model)
___________________________________
$\chi^2_{dif}$ =          $df_{dif}$ =          p = ?  (a single tail)

Find p value at the following website:
http://www.tutor-homework.com/statistics_tables/statistics_tables.html

Conclusion: If p > .05, there is no significant difference in fit between the default model and the nested model; that is, the hypothesis that the constrained parameters are equal is not rejected.
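As an alternative to a lookup table, the p value of the difference test can be computed directly. The Python sketch below uses only the standard library and the closed-form upper-tail probability of the chi-square distribution for integer df. The chi-square and df values in the example are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with integer df > x)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(chi / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * total)


def chi2_difference(chi2_nested, df_nested, chi2_default, df_default):
    """Difference in chi-square and df between nested and default models, with p."""
    chi2_dif = chi2_nested - chi2_default
    df_dif = df_nested - df_default
    return chi2_dif, df_dif, chi2_sf(chi2_dif, df_dif)


# Hypothetical results: the nested model has one extra equality constraint.
chi2_dif, df_dif, p = chi2_difference(25.0, 9, 20.0, 8)
print(chi2_dif, df_dif, round(p, 4))
```

In this hypothetical case p falls below .05, so the constrained (nested) model fits significantly worse and the equality constraint would be rejected.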

Practice

Test if effect A=B

Equality Constraints in Mplus

Parameter Labels:
  Numbers
  Letters
  Combination of numbers and letters
Constraint (B=A)
  F3 on F1 (A);
  F3 on F2 (A);

Run CFA with Real Data

[Figure: Verbal Aggression measured by a3 (e1), a93 (e2), a94 (e3); Physical Aggression measured by a37 (e4), a57 (e5), a90 (e6)]

Multigroup Analysis

VARIABLE: USEVAR = X1 X2 X3 X4;
          Grouping IS sex (0=F 1=M);
ANALYSIS: TYPE = MISSING H1;
MODEL:    F1 BY X1 - X4;
MODEL M:  F1 BY X2 - X4;

Note: sex is the grouping variable and is not used in the model.

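Why Does Measurement Invariance Matter?

$X_{g1} = \tau_{g1} + \Lambda_{g1}\xi_{g1} + \delta_{g1}$
$X_{g2} = \tau_{g2} + \Lambda_{g2}\xi_{g2} + \delta_{g2}$
$X_{g1} - X_{g2} = (\tau_{g1} - \tau_{g2}) + (\Lambda_{g1}\xi_{g1} - \Lambda_{g2}\xi_{g2}) + (\delta_{g1} - \delta_{g2})$

With invariant intercepts and loadings ($\tau_{g1} = \tau_{g2}$, $\Lambda_{g1} = \Lambda_{g2} = \Lambda$):
$X_{g1} - X_{g2} = \Lambda(\xi_{g1} - \xi_{g2}) + (\delta_{g1} - \delta_{g2})$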

Test Measurement Invariance: Default Model

Model:   F1 By a3
               a93 (1)
               a94 (2);
         F2 By a37
               a57 (3)
               a90 (4);
Model M: F1 By a93 ()
               a94 ();
         F2 By a57 ()
               a90 ();
Output:  stand;

Note: Reference indicators in the second group are omitted.

Test Measurement Invariance: Constrained Model

Model:   F1 By a3
               a93 (1)
               a94 (2);
         F2 By a37
               a57 (3)
               a90 (4);
Model M: F1 By a93 (1)
               a94 (2);
         F2 By a57 (3)
               a90 (4);
Output:  stand;

Note: Reference indicators in the second group are omitted.

Estimate with Real Data

[Figure: Verbal Aggression measured by a3 (e1), a93 (e2), a94 (e3); Physical Aggression measured by a37 (e4), a57 (e5), a90 (e6); observed variables Sex, Race1, Race2; disturbances d1, d2]

SEM with Categorical Indicators

Session II

Problems of Ordinal Scales

Not truly interval measure of a latent dimension, having measurement errors
Limited range, biased against extreme scores
Items are equally weighted (implicitly by 1) when summed up or averaged, losing item sensitivity

Criticisms on Using Ordinal Scales as Measures of Latent Constructs

Stevens (1951): ...means should be avoided because its meaning could be easily interpreted beyond ranks.
Merbitz (1989): Ordinal scales and foundations of misinference
Muthén (1983): Pearson product moment correlations of ordinal scales will produce distorted results in structural equation modeling.
Wright (1998): "...misuses nonlinear raw scores or Likert scales as though they were linear measures will produce systematically distorted results. ...It's not only unfair, it is immoral."

Assumption of Categorical Indicators

A categorical indicator is a coarse categorization of a normally distributed underlying dimension

Latent (Polychoric) Correlation

Categorization of Latent Dimension & Threshold

[Figure: a latent dimension Y cut by thresholds into response categories, e.g. No / Yes; Never / Sometimes / Often; 1 2 3 4 5; categories m-1 and m]

Threshold

The values of a latent dimension at which respondents have 50% probability of responding to two adjacent categories
Number of thresholds = response categories - 1, e.g. a binary variable has one threshold.
Mplus specification: [x$1] [y$2];
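One simple way to see where thresholds come from is to cut a standard normal latent dimension at the observed cumulative proportions. The Python sketch below does this for a single item. The response counts are hypothetical, and this marginal calculation is only a rough illustration, not necessarily how the software estimates thresholds within a full model.

```python
from statistics import NormalDist


def thresholds(category_counts):
    """Thresholds on a standard normal latent dimension for one ordinal item.

    With k response categories there are k - 1 thresholds, each the normal
    quantile of the cumulative proportion below that cut. A cumulative
    proportion of exactly 0 or 1 (an empty end category) has no finite
    threshold and raises an error.
    """
    total = sum(category_counts)
    cumulative, cuts = 0, []
    for count in category_counts[:-1]:
        cumulative += count
        cuts.append(NormalDist().inv_cdf(cumulative / total))
    return cuts


# Hypothetical counts for a binary item and a four-category item.
binary_cuts = thresholds([50, 50])
ordinal_cuts = thresholds([16, 34, 34, 16])
print(binary_cuts)
print([round(t, 3) for t in ordinal_cuts])
```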

Normal Cumulative Distributions

http://upload.wikimedia.org/wikipedia/commons/1/19/Normal_distribution_cdf.png

Measurement Models of Categorical Indicators (2P IRT)

Probit: $P(u = 1 \mid \eta) = \Phi[(-\tau + \lambda\eta)\,\theta^{-1/2}]$
  (Estimation = Weighted Least Squares with df adjusted for Means and Variances)
Logistic: $P(u = 1 \mid \eta) = 1 / (1 + e^{-(-\tau + \lambda\eta)})$
  (Maximum Likelihood Estimation)

Converting CFA to IRT Parameters

Probit Conversion
  $a = \lambda\,\theta^{-1/2}$
  $b = \tau / \lambda$
Logit Conversion
  $a = \lambda / D$ (D = 1.7)
  $b = \tau / \lambda$
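The conversions can be written as two small functions, sketched here in Python. The symbol assignments (loading, threshold, residual variance) are assumptions made in reconstructing the formulas, and the numeric inputs are hypothetical.

```python
import math

D = 1.7  # scaling constant from the slide


def probit_to_irt(loading, threshold, residual_variance):
    """Discrimination a and difficulty b from probit CFA estimates."""
    a = loading / math.sqrt(residual_variance)
    b = threshold / loading
    return a, b


def logit_to_irt(loading, threshold):
    """Discrimination a and difficulty b from logit CFA estimates."""
    a = loading / D
    b = threshold / loading
    return a, b


# Hypothetical estimates for one binary item.
print(probit_to_irt(loading=0.8, threshold=0.4, residual_variance=0.36))
print(logit_to_irt(loading=1.7, threshold=0.85))
```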

One Parameter Item Response Theory Model

Analysis: Estimator = ML;
Model:
  F by [email protected]
       [email protected]
       ...
       [email protected];

Sample Information

Latent correlation matrix: equivalent to covariance matrix of continuous indicators
Threshold matrix $\Delta$: equivalent to means of continuous indicators

Stages of Estimation

Sample information: correlations / thresholds / intercepts (Maximum Likelihood)
Correlation structure (Weighted Least Squares)

$F = \sum_{g=1}^{G} (s^{(g)} - \sigma^{(g)})' \, W^{(g)-1} (s^{(g)} - \sigma^{(g)})$

$W^{-1}$ matrix

Elements:
  S1 intercepts or/and thresholds
  S2 slopes
  S3 residual variances and correlations
$W^{-1}$: divided by sample size

Estimation

WLSMV: Weighted Least Squares estimation, $\chi^2$ with degrees of freedom adjusted for Means and Variances of latent and observed variables

Baseline Model

Estimated thresholds of all the categorical indicators
df = $p^2$ - 3p (p = 3 of polychoric correlations)

Data Preparation Tip

Categorical indicators are required to have consistent response categories across groups
Run Crosstab to identify zero cells
Recode variables to collapse certain categories to eliminate zero cells

Inconsistent Categories

          1     2     3     4     5
Male      60    80    43    4     0
Female    57    86    32    16    2

          1     2     3     4
Male      60    80    43    4
Female    57    86    32    18
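The recoding shown in the two tables can be sketched in Python: detect groups with empty cells, then merge the top categories until every group uses the same set. The counts are the ones from the tables above. The function names are hypothetical, and collapsing into the highest retained category is one possible choice.

```python
def has_zero_cells(counts_by_group):
    """True if any group has an empty response category."""
    return any(0 in counts for counts in counts_by_group.values())


def collapse_top(counts_by_group, keep):
    """Merge every category above `keep` into category `keep`."""
    return {group: counts[:keep - 1] + [sum(counts[keep - 1:])]
            for group, counts in counts_by_group.items()}


# Crosstab counts from the table above (categories 1-5).
crosstab = {
    "Male": [60, 80, 43, 4, 0],
    "Female": [57, 86, 32, 16, 2],
}

print(has_zero_cells(crosstab))
collapsed = collapse_top(crosstab, keep=4)
print(collapsed)
print(has_zero_cells(collapsed))
```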

Specify Dependent Variables as Categorical

Variable:
  Categ = x1-x3;
  Categ = all;

Reporting Results

Guidelines:
  Conceptual model
  Software + version
  Data (continuous or categorical?)
  Treatment of missing values
  Estimation method
  Model fit indices ($\chi^2$ (df), p, CFI, TLI, RMSEA)
  Measurement properties (factor loadings + reliability)
  Structural parameter estimates (estimate, significance, 95% confidence intervals) ($\beta$ = .23*, CI = .18~.28)

Reliability of Categorical Indicators (variance approach)

$\omega = (\sum \lambda_i)^2 / [(\sum \lambda_i)^2 + \sum \psi_i^2]$, where

$(\sum \lambda_i)^2$ = square of the sum of standardized factor loadings
$\sum \psi_i^2$ = sum of residual variances
i = items or indicators
$\psi_i^2 = 1 - \lambda_i^2$

McDonald, R. P. (1999). Test theory: A unified treatment (p. 89). Mahwah, New Jersey: Lawrence Erlbaum Associates.
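The reliability formula is simple to evaluate by hand or in code. The Python sketch below assumes standardized loadings, so each residual variance is 1 minus the squared loading. The loadings are hypothetical.

```python
def reliability(standardized_loadings):
    """(sum of loadings)^2 / [(sum of loadings)^2 + sum of residual variances]."""
    squared_sum = sum(standardized_loadings) ** 2
    residual_sum = sum(1.0 - lam ** 2 for lam in standardized_loadings)
    return squared_sum / (squared_sum + residual_sum)


# Hypothetical standardized loadings of three categorical indicators.
loadings = [0.8, 0.7, 0.6]
print(round(reliability(loadings), 3))
```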

Calculator of Reliability (Categorical Indicators)

SPSS reliability data
SPSS reliability syntax

Trouble Shooting Strategy

Start with one part of a big model
Ensure every part works
Estimate all parts simultaneously

Important Resources

Mplus Website: http://www.statmodel.com/
Papers: http://www.statmodel.com/papers.shtml
Mplus discussions: http://www.statmodel.com/cgi-bin/discus/discus.cgi


