tests of structural equation models do not work: what to do ?

Page 1: Tests of structural equation models do not work: What to do ?

1

Tests of structural equation models do not work: What to do ?

Willem E.Saris

ESADE

Universitat Ramon Llull

Page 2: Tests of structural equation models do not work: What to do ?

2

Concern about testing

I have been worried about the testing procedures in SEM from my first contacts

More than 25 years ago Albert Satorra and I wrote our first paper on the power of the test.

Our worries were not shared by the SEM community until recently (publication in SEM)

I am very pleased that today I have the opportunity to convince you of our point of view

Page 3: Tests of structural equation models do not work: What to do ?

3

Importance of testing

The purpose of SEM is to estimate the strength of relationships between variables correcting for measurement error

All estimates are conditional on the specified model

Therefore testing the models is essential for SEM

Page 4: Tests of structural equation models do not work: What to do ?

4

Content of my lecture

•Brief introduction to SEM and the standard test

•Our criticism

•The alternative direction of the SEM community: fit indices

•The special case of RMSEA

•Why fit indices are not the solution

•Back to the basics

•An illustration

Page 5: Tests of structural equation models do not work: What to do ?

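5

Introduction to SEM by example

A frequently discussed issue nowadays is whether Social Trust is related to Political Trust.

Both latent variables are normally measured by three indicators

•Path analysis suggests:

• $\sigma_{ij} = \lambda_{ik}\,\lambda_{jm}$ if $k = m$

• $\sigma_{ij} = \lambda_{ik}\,\phi_{km}\,\lambda_{jm}$ if $k \neq m$

where $\sigma_{ij}$ is the correlation between indicators i and j implied by the model, $\lambda_{ik}$ is the loading of indicator i on factor k, and $\phi_{km}$ is the correlation between factors k and m.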

Page 6: Tests of structural equation models do not work: What to do ?

6

Estimation of effects

The parameters are estimated by minimizing the following quadratic form:

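$f = \sum_{i,j} w_{ij}\,(s_{ij} - \sigma_{ij})^2$

where $s_{ij}$ is the observed correlation, $\sigma_{ij}$ the correlation implied by the model and $w_{ij}$ a weight.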

The estimates are the values which minimize this function

The value of this function at its minimum is denoted by f0

Page 7: Tests of structural equation models do not work: What to do ?

7

Imagine that this is the observed correlation matrix

Correlation Matrix

y1 y2 y3 y4 y5 y6

-------- -------- -------- -------- -------- --------

y1 1.00

y2 0.64 1.00

y3 0.64 0.64 1.00

y4 0.32 0.32 0.32 1.00

y5 0.32 0.32 0.32 0.64 1.00

y6 0.32 0.32 0.32 0.64 0.64 1.00

Page 8: Tests of structural equation models do not work: What to do ?

8

The estimates

LAMBDA-Y

F 1 F 2

-------- --------

y1 0.80 - -

y2 0.80 - -

y3 0.80 - -

y4 - - 0.80

y5 - - 0.80

y6 - - 0.80

Correlation of F1 with F2 = 0.50

We can estimate the relationship between latent variables and observed variables but also between latent variables

Page 9: Tests of structural equation models do not work: What to do ?

9

The residuals = differences between observed and expected correlations

Residuals

y1 y2 y3 y4 y5 y6

-------- -------- -------- -------- -------- --------

y1 0.00

y2 0.00 0.00

y3 0.00 0.00 0.00

y4 0.00 0.00 0.00 0.00

y5 0.00 0.00 0.00 0.00 0.00

y6 0.00 0.00 0.00 0.00 0.00 0.00
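As a worked check of why every residual is zero here: applying the path-analysis rules to the estimates above (loadings of 0.80 and a factor correlation of 0.50) reproduces the observed correlations exactly. The symbols $\sigma$, $\lambda$ and $\phi$ are notation assumed for this sketch, since the Greek letters did not survive on the slide.

```latex
\[
\sigma_{12} = \lambda_{11}\,\lambda_{21} = 0.80 \times 0.80 = 0.64
\]
\[
\sigma_{14} = \lambda_{11}\,\phi_{12}\,\lambda_{42} = 0.80 \times 0.50 \times 0.80 = 0.32
\]
```

Both values match the observed correlation matrix, so observed minus expected is 0.00 within and between the two sets of indicators.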

Page 10: Tests of structural equation models do not work: What to do ?

10

Imagine that the model in the population is different

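[Figure: path diagram of the population model. F1 and F2 correlate .5, the loadings are .8, and there is one additional path of .2. The indicators Y1 to Y6 have error terms e1 to e6.]

Population correlation matrix

       y1     y2     y3     y4     y5     y6
y1   1.00
y2   0.64   1.00
y3   0.64   0.64   1.00
y4   0.48   0.48   0.48   1.00
y5   0.32   0.32   0.32   0.72   1.00
y6   0.32   0.32   0.32   0.72   0.64   1.00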
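The population correlations can be recomputed from the factor model as a check. The following is a minimal sketch, not the author's code. It assumes that the extra path of .2 is a loading of y4 on F1; that reading is not stated in the transcript, but it reproduces the .48 and .72 entries. The function name `implied_correlations` is hypothetical.

```python
def implied_correlations(loadings, phi):
    """Correlations implied by a factor model: Lambda * Phi * Lambda',
    with the diagonal set to 1 because the indicators are standardized."""
    n, k = len(loadings), len(phi)
    sigma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                sigma[i][j] = 1.0
            else:
                sigma[i][j] = sum(
                    loadings[i][a] * phi[a][b] * loadings[j][b]
                    for a in range(k)
                    for b in range(k)
                )
    return sigma


# Factor correlation of .5 between F1 and F2, as in the talk.
phi = [[1.0, 0.5], [0.5, 1.0]]

# ASSUMPTION: the extra path of .2 is a cross-loading of y4 on F1.
population_loadings = [
    [0.8, 0.0], [0.8, 0.0], [0.8, 0.0],   # y1..y3 load on F1
    [0.2, 0.8],                           # y4 loads on F2 and (assumed) on F1
    [0.0, 0.8], [0.0, 0.8],               # y5, y6 load on F2
]

sigma = implied_correlations(population_loadings, phi)
for row_index, row in enumerate(sigma):
    # Print the lower triangle, one row per indicator.
    print(" ".join(f"{value:.2f}" for value in row[: row_index + 1]))
```

Setting the cross-loading to 0.0 gives the correlations of the hypothesized model instead, so the two matrices differ only in the row of y4.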

Page 11: Tests of structural equation models do not work: What to do ?

11

Now the estimates are also different

These estimates deviate somewhat from the values in the population

The deviations are due to the misspecification

Can we detect that the hypothesized model is wrong ?

LAMBDA-Y (each entry: estimate, standard error in parentheses, t-value)

           F 1                   F 2
VAR 1    0.80 (0.04) 19.96       - -
VAR 2    0.80 (0.04) 19.96       - -
VAR 3    0.80 (0.04) 19.96       - -
VAR 4      - -                 0.95 (0.04) 26.14
VAR 5      - -                 0.77 (0.04) 19.42
VAR 6      - -                 0.77 (0.04) 19.42

Page 12: Tests of structural equation models do not work: What to do ?

12

The fitted residuals

Based on these estimates the expected correlations can be calculated.

The residuals (observed-expected correlations) can indicate that the model is misspecified

In this case the residuals are:

         VAR 1   VAR 2   VAR 3   VAR 4   VAR 5   VAR 6
VAR 1     0.00
VAR 2     0.00    0.00
VAR 3     0.00    0.00    0.00
VAR 4     0.02    0.02    0.02    0.00
VAR 5    -0.05   -0.05   -0.05   -0.01    0.00
VAR 6    -0.05   -0.05   -0.05   -0.01    0.05    0.00

Page 13: Tests of structural equation models do not work: What to do ?

13

When should the model be rejected ?

Residuals can differ from zero due to misspecification of the model

But also due to sampling fluctuations.

So when should the model be rejected ?

Page 14: Tests of structural equation models do not work: What to do ?

14

The quality the test should have

MacCallum, Browne and Sugawara (1996: 131)

“if the model is truly a good model in terms of its fit in the population, we wish to avoid concluding that the model is a bad one.

Alternatively, if the model is truly a bad one, we wish to avoid concluding that it is a good one.”

Page 15: Tests of structural equation models do not work: What to do ?

15

In statistical terms

Required is:

A small probability of a type I error, i.e. the probability of rejection of a good model

A small probability of a type II error, i.e. the probability of acceptance of a bad model

Page 16: Tests of structural equation models do not work: What to do ?

16

Bad models are misspecified models

Hu and Bentler (1998: 427):

“a model is said to be misspecified when

(a) one or more parameters are estimated whose population values are zeros (i.e. an over-parameterised misspecified model)

(b) one or more parameters are fixed to zeros whose population values are non-zeros (i.e. an under-parameterised misspecified model)

(c) or both.”

Page 17: Tests of structural equation models do not work: What to do ?

17

Definition of the size of a misspecification

The size of the misspecification is the absolute difference between

the true value of the parameter and

the value specified in the analysis

In the above example the size of the misspecification was .2

Page 18: Tests of structural equation models do not work: What to do ?

18

The standard chi2 test

It can be shown that under very general conditions:

the test statistic $T = n f_0$ has a $\chi^2(df)$ distribution if the model is correct

The model is rejected if $T > C_\alpha$

where $C_\alpha$ is the value for which

$\Pr(\chi^2(df) > C_\alpha) = \alpha$
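The rejection rule can be sketched in a few lines. This is only an illustration under stated assumptions: it covers df = 1 and df = 2, where the critical value has a closed form that needs nothing beyond the Python standard library, and the function names `critical_value` and `reject` are hypothetical.

```python
import math
from statistics import NormalDist


def critical_value(df, alpha=0.05):
    """Critical value C of the central chi-square test, for df = 1 or df = 2 only."""
    if df == 1:
        # A chi-square variable with 1 df is the square of a standard normal one.
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return z * z
    if df == 2:
        # With 2 df the survival function is exp(-x / 2), which inverts directly.
        return -2.0 * math.log(alpha)
    raise ValueError("this sketch only covers df = 1 and df = 2")


def reject(t_statistic, df, alpha=0.05):
    """Standard decision rule: reject the model if T exceeds the critical value."""
    return t_statistic > critical_value(df, alpha)


print(round(critical_value(1), 2), round(critical_value(2), 2))
```

For other degrees of freedom a statistics library would normally supply the quantile. Note that the rule fixes only the probability of a type I error.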

Page 19: Tests of structural equation models do not work: What to do ?

19

Criticism

The specified test does not test directly for misspecifications in the model

The test checks possible consequences of misspecifications present in the residuals

The specified test only controls the type I errors and not the type II errors

Page 20: Tests of structural equation models do not work: What to do ?

20

Can we evaluate type II errors ?

It is well known that

T has a noncentral $\chi^2(df, ncp)$ distribution if the model is incorrect

Due to a misspecification in the model, the mean of the distribution of T increases by what is called the noncentrality parameter (NCP)
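In symbols, the statements above can be summarized as follows. This is a sketch using standard results for the central and noncentral chi-square distributions; $C_\alpha$ denotes the critical value of the central test, a notation assumed here.

```latex
\[
E[T] = df \ \text{(model correct)}, \qquad E[T] = df + \mathrm{ncp} \ \text{(model incorrect)}
\]
\[
\Pr\bigl(\chi^2(df) > C_\alpha\bigr) = \alpha, \qquad
\text{power} = \Pr\bigl(\chi^2(df, \mathrm{ncp}) > C_\alpha\bigr)
\]
```

The probability of a type II error is then one minus the power, so it can be evaluated once the NCP is known.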

Page 21: Tests of structural equation models do not work: What to do ?

21

The Central and noncentral chi2 distribution and the power

Page 22: Tests of structural equation models do not work: What to do ?

22

The non-centrality parameter NCP

The NCP can be computed as shown by Satorra and Saris (1985) by generating population data and estimating the parameters with an incorrect model.

The difference between the two models is the misspecification in the model

In that case the value of the test statistic T is equal to the NCP for this misspecification given that the rest of the model is correct.

Page 23: Tests of structural equation models do not work: What to do ?

23

An illustration

Page 24: Tests of structural equation models do not work: What to do ?

24

High Power (left) and low Power (right)

•High power is good for big errors not for small errors.

•Low power is good for small errors not for big errors

•With loading .8 the left side applies. With loadings .5 the right side applies for the same error.

Page 25: Tests of structural equation models do not work: What to do ?

25

The standard test is not good enough

The standard test can only detect misspecifications to which the test is sensitive (high power).

Rejection of the model can be due to very small misspecifications to which the test is very sensitive

Non-rejection does not mean that the model is correct: the test can be insensitive to the misspecifications

Page 26: Tests of structural equation models do not work: What to do ?

26

The reasons for the problems

Only type I errors are taken into account

It is not a direct test of misspecifications but of consequences of misspecifications.

These consequences (residuals) are also affected by other characteristics of the model

Page 27: Tests of structural equation models do not work: What to do ?

27

This was not the mainstream problem

Hu and Bentler say:

“the decision for accepting or rejecting a particular model may vary as a function of sample size, which is certainly not desirable.”

This problem with the chi2 test has led to the development of a plethora of Fit indices.

Page 28: Tests of structural equation models do not work: What to do ?

28

Fit indices with cut-off criteria

Page 29: Tests of structural equation models do not work: What to do ?

29

Model evaluation with Fit indices

The traditional model evaluation method has been replaced by a similar procedure using fit indices.

For fit indices that have a theoretical upper value of 1 for good-fitting models (such as AGFI and GFI), the model is rejected if:

FI < Cfi

There are, however, also FIs for which a theoretical lower value of 0 indicates a good fit; for them the model is rejected if:

FI > Cfi

where Cfi is a fixed cut-off value developed specifically for each FI.

Page 30: Tests of structural equation models do not work: What to do ?

30

Criticism

For most indices the distribution is unknown. The arguments for critical values rest only on Monte Carlo experiments for specific cases

Only consequences for the residuals are evaluated and not the misspecifications themselves.

Page 31: Tests of structural equation models do not work: What to do ?

31

Goodness of fit by approximation

Steiger (1990), Browne & Cudeck (1993) and MacCallum et al. (1996) have argued: models are always simplifications of reality and are therefore always misspecified.

This has led to the most popular fit index nowadays: the Root Mean Square Error of Approximation, or RMSEA

Although there is truth in this argument, this is not a good reason to completely change the approach to model testing.

Page 32: Tests of structural equation models do not work: What to do ?

32

This is not necessary

One has to design tests which take into account type I and type II errors so that:

Models with substantially relevant misspecifications should be rejected and

Models with substantially irrelevant misspecifications should be accepted.

Page 33: Tests of structural equation models do not work: What to do ?

33

Serious problems

The fit indices are functions of the fitting function

So they have the same serious problems as the standard test

Let us show that by very simple but fundamental models.
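To make the dependence on the fitting function concrete, here is a minimal sketch of one such index. It uses the commonly cited point estimate of RMSEA, which is not given on the slides; some programs divide by n rather than n - 1. The function name `rmsea` and all input values are hypothetical.

```python
import math


def rmsea(t_statistic, df, n):
    """Commonly cited point estimate of RMSEA, computed from T = n * f0.

    The index is a function of the same test statistic as the chi-square test,
    rescaled by the degrees of freedom and the sample size.
    """
    return math.sqrt(max(t_statistic - df, 0.0) / (df * (n - 1)))


# Hypothetical values: the same df and sample size, two different values of T.
print(rmsea(30.0, 10, 201))   # T well above df
print(rmsea(8.0, 10, 201))    # T below df, so the index is truncated at zero
```

Anything that changes T without changing the size of the misspecification therefore changes the index as well.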

Page 34: Tests of structural equation models do not work: What to do ?

34

A model M0 with a substantively relevant misspecification

Population model M1 Hypothesized model M0

The misspecification is in the correlated disturbance terms

The size of the misspecification is .2

If the misspecification goes undetected, b21 is estimated as .2 instead of .0 !

This model should be rejected

Page 35: Tests of structural equation models do not work: What to do ?

35

A model M0 with a substantively irrelevant misspecification

Population model M1 Hypothesized model M0

The misspecification is in the correlated factors

The size of the misspecification is .05

For all practical purposes this model should be accepted

Page 36: Tests of structural equation models do not work: What to do ?

36

Population data

y1 y2 x1 x2

y1 1.00

y2 0.20 1.00

x1 0.40 0.00 1.00

x2 0.00 0.10 0.00 1.00

Page 37: Tests of structural equation models do not work: What to do ?

37

Population study with different values of γ22

γ22 CHI2 power RMSEA CFI AGFI SRMR MI of ψ21

0.1 3.20 0.34 0.00 1.00 0.99 0.025 3.20

0.2 3.30 0.35 0.00 1.00 0.99 0.025 3.30

0.3 3.49 0.37 0.00 1.00 0.99 0.025 3.49

0.4 3.80 0.38 0.00 1.00 0.99 0.025 3.80

0.5 4.20 0.43 0.01 1.00 0.99 0.025 4.20

0.6 5.07 0.50 0.03 1.00 0.98 0.025 5.07

0.7 6.47 0.62 0.04 0.99 0.98 0.025 6.47

0.8 9.50 0.79 0.06 0.98 0.97 0.025 9.50

0.9 20.27 0.99 0.10 0.96 0.94 0.025 20.27
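The power column can be approximately recomputed from the CHI2 column, because in a population study the value of T equals the NCP, as stated earlier in the talk. The sketch below assumes a 5% test with df = 2. Neither is stated on this slide, and df = 2 is only a guess that reproduces the power in the first row. The function names are hypothetical.

```python
import math


def chi2_sf_even(x, df):
    """Pr(chi-square(df) > x) for even df, using the closed-form finite sum."""
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term
        term *= half / (i + 1)
    return math.exp(-half) * total


def noncentral_power(ncp, df, critical, terms=200):
    """Pr(noncentral chi-square(df, ncp) > critical) for even df.

    The noncentral distribution is a Poisson(ncp / 2) mixture of central
    chi-square distributions with df, df + 2, df + 4, ... degrees of freedom.
    """
    lam = ncp / 2.0
    weight = math.exp(-lam)          # Poisson probability of j = 0
    power = 0.0
    for j in range(terms):
        power += weight * chi2_sf_even(critical, df + 2 * j)
        weight *= lam / (j + 1)      # Poisson probability of j + 1
    return power


ALPHA = 0.05
DF = 2                               # ASSUMPTION: not stated on the slide
critical = -2.0 * math.log(ALPHA)    # exact critical value when df = 2

# CHI2 values taken from the table above.
for chi2_value in (3.20, 5.07, 9.50, 20.27):
    print(chi2_value, round(noncentral_power(chi2_value, DF, critical), 2))
```

Under these assumptions the first row comes out at about 0.34, in line with the table. The other fit indices barely move across the rows, which is the point of the slide.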

Page 38: Tests of structural equation models do not work: What to do ?

38

A model M0 with a substantively irrelevant misspecification

Population model M1 Hypothesized model M0

The misspecification is in the correlated factors

The size of the misspecification is .05

For all practical purposes this model should be accepted

Page 39: Tests of structural equation models do not work: What to do ?

39

Population study of the factor model

The better the measures are, the more likely it is that the model is rejected

This is not a very attractive test

[Figure: plot whose legend lists RMSEA and SRMR]

Page 40: Tests of structural equation models do not work: What to do ?

40

These examples show

The model with a substantively relevant misspecification will most likely not be rejected

The model with a substantively irrelevant misspecification will most likely be rejected

This is the opposite of what all of us would like

Page 41: Tests of structural equation models do not work: What to do ?

41

We see what should not happen

In contrast to what MacCallum, Browne and Sugawara (1996: 131) required:

A bad model will not be rejected

A good model will be rejected

Page 42: Tests of structural equation models do not work: What to do ?

42

Conclusion

Paraphrasing Hu and Bentler, we could say:

“the decision for accepting or rejecting a particular model may vary as a function of irrelevant parameters, which is certainly not desirable.”

So there are reasons enough to consider alternative procedures for testing these models.

Page 43: Tests of structural equation models do not work: What to do ?

43

Can information about the power help?

We have thought that information about the power of the test can help to test hypotheses about single parameters or small sets of parameters

Let me illustrate this by the last example

Page 44: Tests of structural equation models do not work: What to do ?

44

We want to test whether the factors measure the same thing, i.e. correlate perfectly

Population model M1 Hypothesized model M0

The misspecification is in the correlated factors

What is the power of the chi2 test if the size of the misspecification is .10 ?

Page 45: Tests of structural equation models do not work: What to do ?

45

The power of the test

Page 46: Tests of structural equation models do not work: What to do ?

46

Now we can design the test

Given that the loadings are around .8

And we accept a type I error of .05 (α)

And we want to have a high power (.8) to detect a deviation of .1 or more

Then we should have a sample size of at least 300 cases

In this case the model should be rejected if T > 3.84
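The critical value of 3.84 corresponds to a test with 1 degree of freedom at α = .05. For that case the power can be computed with the standard library alone, because a noncentral chi-square variable with 1 df is the square of a normal variable with mean sqrt(NCP). The sketch below finds the NCP needed for a power of .8. The function names are hypothetical, and the sample size of 300 is not reproduced here, since it depends on the model behind the power figure.

```python
import math
from statistics import NormalDist


def power_1df(ncp, alpha=0.05):
    """Power of a 1-df chi-square test: Pr(noncentral chi2(1, ncp) > C)."""
    z = NormalDist()
    root_c = z.inv_cdf(1 - alpha / 2)    # square root of the critical value
    shift = math.sqrt(ncp)
    # Both tails of a normal variable with mean `shift` and unit variance.
    return (1 - z.cdf(root_c - shift)) + z.cdf(-root_c - shift)


def required_ncp(target_power=0.8, alpha=0.05):
    """Smallest NCP that reaches the target power, found by bisection."""
    low, high = 0.0, 100.0
    for _ in range(100):
        mid = (low + high) / 2
        if power_1df(mid, alpha) < target_power:
            low = mid
        else:
            high = mid
    return high


print(round(required_ncp(), 2))
```

With NCP = 0 the function returns α itself, which is a quick sanity check on the formula.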

Page 47: Tests of structural equation models do not work: What to do ?

47

Criticism

The problem with this test is that we have to suppose that there are no other misspecifications in the model

If there are other misspecifications they can be the cause of the rejection of the model

Page 48: Tests of structural equation models do not work: What to do ?

48

There are many other possible errors

Page 49: Tests of structural equation models do not work: What to do ?

49

The situation is even worse

The model test requires a test for all parameters

But the tests are unequally sensitive to misspecifications in different parameters

We can only expect that the test detects misspecifications to which the test is sensitive

This sensitivity depends on characteristics of the model that have nothing to do with the size of the misspecification.

Page 50: Tests of structural equation models do not work: What to do ?

50

For example

NCP

Page 51: Tests of structural equation models do not work: What to do ?

51

Model test is impossible

Given these differences in power between the different parameters one can never formulate a test for all parameters of the model

If one increases the power, minimal misspecifications in some parameters will lead to rejection of the model.

If one does not increase the power, some misspecifications will never be detected.

Page 52: Tests of structural equation models do not work: What to do ?

52

Our proposal: back to the basics

We have to test for misspecifications in the models

In this test, type I and type II errors (or power) have to be taken into account

A single serious misspecification is enough to reject the model

Page 53: Tests of structural equation models do not work: What to do ?

53

A halfway solution from 1987: Estimation of the EPC and MI

MI (modification index)

EPC (expected parameter change)

Page 54: Tests of structural equation models do not work: What to do ?

54

What do we get for each constrained parameter ?

For each constrained parameter we can get the EPC and the MI.

This means that we get an estimate of the misspecification (EPC) and the test statistic (MI) for this misspecification.

What we still miss is an indication of the power of the test for each EPC.

Page 55: Tests of structural equation models do not work: What to do ?

55

The power of the test

What counts as a relevant misspecification depends on the progress in a discipline

In the social sciences the following sizes of misspecifications are certainly relevant

.1 for a causal effect or correlated error

.4 for a loading

Page 56: Tests of structural equation models do not work: What to do ?

56

The power of the test

We call the critical value for a misspecification δ. So a misspecification larger than δ should be detected with high likelihood

So models with a misspecification of δ should be rejected with high likelihood = high power

It can be shown that

$\mathrm{NCP} = (\mathrm{MI}/\mathrm{EPC}^2)\,\delta^2$

Given the NCP, one can determine the power
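One way to see where this expression comes from, as a sketch under the usual large-sample approximation: the MI behaves like the squared ratio of the EPC to its standard error. The symbol $\sigma$ for that standard error is introduced here and does not appear on the slide.

```latex
\[
\mathrm{MI} \approx \left(\frac{\mathrm{EPC}}{\sigma}\right)^{2}
\quad\Longrightarrow\quad
\sigma^{2} \approx \frac{\mathrm{EPC}^{2}}{\mathrm{MI}}
\]
\[
\mathrm{NCP} = \left(\frac{\delta}{\sigma}\right)^{2}
\approx \frac{\mathrm{MI}}{\mathrm{EPC}^{2}}\,\delta^{2}
\]
```

The NCP is thus the value the test statistic would take if the misspecification were exactly δ, so it can be obtained from quantities that are already reported for every constrained parameter.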

Page 57: Tests of structural equation models do not work: What to do ?

57

Decision table for detecting misspecifications in single parameters

                                      Low power                            High power
Modification index not significant    Not informative. Inconclusive (I)    No misspecification (nm)
Modification index significant        Misspecification present (m)         Inspect EPC (EPC)
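The decision table, combined with NCP = (MI/EPC²)δ², can be sketched as a small function. This is an illustration under stated assumptions, not the JRule implementation: it treats the MI as a 1-df chi-square test at α = .05 and calls a power of .8 or more "high", the value used in the earlier sample-size example. The function name and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def judge_parameter(mi, epc, delta, alpha=0.05, high_power=0.8):
    """Classify one constrained parameter as 'I', 'nm', 'm' or 'EPC'."""
    z = NormalDist()
    root_c = z.inv_cdf(1 - alpha / 2)          # sqrt of the 1-df critical value
    ncp = (mi / epc ** 2) * delta ** 2         # NCP for a misspecification of size delta
    shift = math.sqrt(ncp)
    power = (1 - z.cdf(root_c - shift)) + z.cdf(-root_c - shift)
    significant = mi > root_c ** 2             # MI exceeds the critical value

    if significant and power >= high_power:
        verdict = "EPC"   # the test may be oversensitive: inspect the size of the EPC
    elif significant:
        verdict = "m"     # significant despite low power: misspecification present
    elif power >= high_power:
        verdict = "nm"    # not significant despite high power: no misspecification
    else:
        verdict = "I"     # not significant and low power: inconclusive
    return verdict, ncp, power


# Hypothetical MI and EPC values, with delta = .1 as for a causal effect.
for mi, epc in ((1.0, 0.20), (2.0, 0.05), (10.0, 0.30), (20.0, 0.05)):
    verdict, ncp, power = judge_parameter(mi, epc, delta=0.1)
    print(mi, epc, verdict, round(ncp, 2), round(power, 2))
```

The threshold for high power is a choice the researcher has to make. Changing `high_power` moves cases between the columns of the table but not between its rows.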

Page 58: Tests of structural equation models do not work: What to do ?

58

An illustration : Model Blok and Saris

Page 59: Tests of structural equation models do not work: What to do ?

59

What the traditional tests tell

chi2 = 161 with df = 9, SRMR = .073, RMSEA = .21, CFI = .95 and AGFI = .67.

According to the suggested cut-off values, all fit indices, with the only exception of CFI, would reject the model.

How can we be sure of this conclusion?

It is also possible that there are only very small misspecification(s) for which these test statistics and fit indices are very sensitive.

Page 60: Tests of structural equation models do not work: What to do ?

60

Testing all parameters with JRule (William van der Veld et al.)

Page 61: Tests of structural equation models do not work: What to do ?

61

Test for variances and covariances

Page 62: Tests of structural equation models do not work: What to do ?

62

The corrected model

Page 63: Tests of structural equation models do not work: What to do ?

63

What the traditional tests tell

chi2 = 3.88, with df=5 (p-value =.57), SRMR =.0076, RMSEA = .0, CFI= 1.0 AGFI = .98.

Now all indices suggest that the model fits the data.

However, this decision is also doubtful

It is possible that the power of the tests is so low for this model that the misspecifications are not detected.

Page 64: Tests of structural equation models do not work: What to do ?

64

Testing all parameters with JRule

Page 65: Tests of structural equation models do not work: What to do ?

65

Conclusions

The traditional chi2 test does not provide the information that is needed

It does not test for misspecifications and

It ignores the power of the test

The fit indices have the same problems

Our new approach directly tests for misspecifications and takes the power of the test into account.

Page 66: Tests of structural equation models do not work: What to do ?

66

Conclusions

For any parameter, our new approach can determine whether it is misspecified, whether it is not misspecified, or whether there is not enough information to decide.

If a misspecification is detected, the model should be rejected

If the power is too low to make a decision, further research is needed to test the quality of the model

The latter option is completely ignored in the traditional model tests