Random effects structure for confirmatory analysis: Why it’s important to

Christoph Scheepers
University of Glasgow

Introduction

• Barr, D.J., Levy, R., Scheepers, C., & Tily, H.J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255-278.

• Why did we do this?
  – Linear Mixed Effects Models (LMEMs) are becoming increasingly popular
  – However, there are a lot of malpractices in the present literature (especially concerning random effect structures)
  – Approaches vary greatly and/or are insufficiently reported
  – Researchers moving from ANOVA to LMEM do not seem to grasp the continued applicability (and relevance!) of their previous knowledge about how to account for dependencies created by their experimental designs
  – Goal: demystify LMEMs and establish standards for confirmatory analysis


Goals of statistical analysis

• Q. What combination of factors characterises my current lexical decision data best?
  – A. Structure of a model
    • data-driven model selection
    • exploratory analysis

• Q. Do factors A and B contribute independently or interactively to lexical decision?
  – A. Test of an interaction hypothesis
    • generalizability and replicability
    • confirmatory analysis

Confirmatory hypothesis testing

• Present focus: LMEMs as ANOVA replacement for confirmatory analysis of experimental data.

Dependent (i.e. correlated) observations

• Observations A and B are dependent if knowing A reduces uncertainty about B
  – E.g. within-subjects RT experiment: knowing that a subject responded slowly in condition A, we can expect that the same subject also responds slowly in condition B
  – Observations are consistent (correlated) over time

• Observations are conditionally independent if the source of the dependency is removed (by estimating a parameter)
  – E.g. remove the individual subject mean from the original RT scores (factoring out individual differences in baseline response rate across subjects)
  – Within-subjects t-test, ANOVA…
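A minimal sketch of why removing the subject-level dependency matters. The RT values below are made up for illustration (they are not the talk's data): four subjects differ widely in baseline speed but all show a similar A-to-B difference. The independent-measures statistic leaves the baseline differences in the error term, whereas the repeated-measures statistic works on within-subject differences, so each subject's baseline cancels out.

```python
import math
import statistics

# Hypothetical mean RTs (ms) for four subjects in conditions A and B.
rt_a = [600.0, 700.0, 800.0, 900.0]
rt_b = [650.0, 760.0, 845.0, 955.0]


def independent_t(x, y):
    """Two-sample t statistic that ignores the subject dependency."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(y) - statistics.mean(x)) / se


def paired_t(x, y):
    """Within-subjects t statistic: each subject's baseline cancels in the difference."""
    diffs = [b - a for a, b in zip(x, y)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


t_independent = independent_t(rt_a, rt_b)
t_paired = paired_t(rt_a, rt_b)
print(f"independent-measures t = {t_independent:.2f}")
print(f"repeated-measures t    = {t_paired:.2f}")
```

With these made-up numbers the same mean difference yields a far larger statistic once the subject baselines are factored out.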

The importance of conditional independence

• Failing to meet conditional independence assumptions (as dictated by the design) is one of the worst statistical errors one can make in confirmatory analysis
  – More important than normality, outliers, homogeneity of variance etc.
• … but it is also one of the most poorly understood aspects of the use of LMEMs
• … even though we usually teach students thoroughly about when to apply, e.g., a repeated- or an independent-measures test
  – General rule: It’s the design that determines what test to use, not whether repeated measurements from the same subjects (or items) are actually correlated or not!

Observations repeated over sampling units

• Between-subjects/between-items designs with cell replications
• Within-subjects/within-items design
  – With or without replications per cell
• Multiple observations over time per trial (time series data)
• Sampling from multiple populations
  – “Items-as-fixed-effects fallacy” (Clark, 1973)
  – Participants as “measuring devices”

A critique of current LMEM practices

• Myth: Random intercepts (RIs) are more important than random slopes (RSs)
  – “Crossing” of random intercepts performs some kind of magic in terms of generalization
  – RI-only models are most common (ca. 60%)
  – Model selection: RSs are tested for inclusion; RIs are included by default
  – Some tutorial articles promote this idea (Janssen, 2011; Locker, Hoffman, & Bovaird, 2007)

• Fact: (Crossed) random slopes are more important than random intercepts for generalization
  – Most studies have repeated observations over sampling units (subjects, items)
  – Random slopes capture variability of effects over sampling units
  – RI-only is essentially just improperly conducted ANOVA
  – RI-only is about the least generalizable analysis, even worse than F1 alone in some circumstances

Mixed Model ANOVA in SPSS

• The parallels between LMEMs and ANOVA become apparent when one uses the Mixed Model ANOVA approach (cf. Clark, 1973)
• Alternative to the common “recipe” of first aggregating data up to the subject (or item) level and then performing a repeated-measures ANOVA
• Let’s assume a within/within design with 24 subjects, 24 items, and 2 conditions (A and B)
• Data remain in their unaggregated (trial-by-trial) format!

Mixed Model ANOVA in SPSS

* Design including the Cond*SubjID interaction, which serves as the error term for Cond.
UNIANOVA Resp BY Cond SubjID
  /RANDOM=SubjID
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /CRITERIA=ALPHA(0.05)
  /DESIGN=Cond SubjID Cond*SubjID.

Mixed Model ANOVA in SPSS

* Same analysis without Cond*SubjID in the design: Cond is tested against the residual.
UNIANOVA Resp BY Cond SubjID
  /RANDOM=SubjID
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /CRITERIA=ALPHA(0.05)
  /DESIGN=Cond SubjID.
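The two DESIGN subcommands differ only in the error term used to test Cond. As a sketch in standard ANOVA notation (the symbols are not from the slides, and a balanced design is assumed): with SubjID declared random and the interaction in the design, Cond is tested against the Cond-by-subject interaction; when the interaction is left out, its sum of squares is pooled into the residual, which then serves as the error term.

```latex
\begin{align*}
\text{with } \mathit{Cond}\times\mathit{SubjID}:\quad
  & F_{\mathit{Cond}} = \frac{MS_{\mathit{Cond}}}{MS_{\mathit{Cond}\times\mathit{SubjID}}} \\[4pt]
\text{without } \mathit{Cond}\times\mathit{SubjID}:\quad
  & F_{\mathit{Cond}} = \frac{MS_{\mathit{Cond}}}{MS_{\text{residual}}},
  \qquad
  MS_{\text{residual}} =
  \frac{SS_{\mathit{Cond}\times\mathit{SubjID}} + SS_{\text{within cells}}}
       {df_{\mathit{Cond}\times\mathit{SubjID}} + df_{\text{within cells}}}
\end{align*}
```

Because the within-cell degrees of freedom usually dominate, the pooled residual largely ignores how much the Cond effect varies from subject to subject.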

Understanding random effect structure

• Toy data set
• Hypothetical lexical decision experiment examining the effect of “word type” (A vs. B) on RTs
• Within-subject/between-item manipulation
• 4 subjects × 4 items = 16 data points

The data

[Figure: RTs (axis from 600 to 1000) for items I1 to I4 within each of subjects S1 to S4; legend: A, B]

Model 1 (all observations independent)

[Figure: same data plot with model predictions overlaid; legend: A, B, A (model), B (model)]

Not a mixed effects model!

Model 2 (subject random intercepts, a.k.a. “bad” MM ANOVA)

[Figure: same data plot with model predictions overlaid; legend: A, B, A (model), B (model)]

lmer: [formula not preserved in the transcript]

Model 3 (subject random intercepts and slopes; MM ANOVA)

[Figure: same data plot with model predictions overlaid; legend: A, B, A (model), B (model)]

lmer: [formula not preserved in the transcript]

Without intercept-slope correlation: Y~X+(1|Subject)+(0+X|Subject)

Model 4 (subject intercepts and slopes + item intercepts; “MAXIMAL”)

[Figure: same data plot with model predictions overlaid; legend: A, B, A (model), B (model)]

lmer: [formula not preserved in the transcript]
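The four toy models can be written out as equations. This is a sketch in assumed notation, not taken from the slides: Y_{si} is the RT of subject s to item i, X_i codes word type, S_{0s} and S_{1s} are the by-subject random intercept and slope, I_{0i} is the by-item random intercept, and e_{si} is the residual. Following the Y~X+(1|Subject)+(0+X|Subject) pattern that the slides give for the uncorrelated variant, the lmer formulas for Models 2 to 4 would presumably be Y~X+(1|Subject), Y~X+(1+X|Subject), and Y~X+(1+X|Subject)+(1|Item).

```latex
\begin{align*}
\text{Model 1:}\quad Y_{si} &= \beta_0 + \beta_1 X_i + e_{si} \\
\text{Model 2:}\quad Y_{si} &= \beta_0 + S_{0s} + \beta_1 X_i + e_{si} \\
\text{Model 3:}\quad Y_{si} &= \beta_0 + S_{0s} + (\beta_1 + S_{1s})\, X_i + e_{si} \\
\text{Model 4:}\quad Y_{si} &= \beta_0 + S_{0s} + I_{0i} + (\beta_1 + S_{1s})\, X_i + e_{si}
\end{align*}
```

There is no by-item slope in Model 4 because word type is manipulated between items, so each item appears in only one condition.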

By-subject/item RI-only models?

• Actually have no equivalent in ANOVA for confirmatory analysis
  – Perhaps equivalent to “badly computed” min-F′?
• Random intercepts mainly increase power by reducing “baseline” error variance
  – Increased power with no additional Type-I error protection: A recipe for disaster! (… or a recipe for guaranteed publication! ;-) )
• Random slopes are essential to protect against Type-I errors for fixed effects of interest
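The Type-I error point can be illustrated with a small Monte Carlo sketch. It uses a by-subject ANOVA analogue rather than an actual lmer fit, and every number in it is an assumption chosen for illustration: 24 subjects, 12 observations per condition, no population effect, but subject-specific effects with a standard deviation of 40 ms. Cond is tested once against the Cond-by-subject interaction (which accounts for slope variability) and once against the pooled residual (as when only subject intercepts are modelled). The critical values are approximate table values for the default sample sizes only.

```python
import random

# Approximate 5% critical values (assumed, from standard F tables);
# they match the default n_subj=24 and n_rep=12 only.
F_CRIT_1_23 = 4.28    # F(1, 23): Cond tested against Cond x Subject
F_CRIT_1_551 = 3.86   # F(1, 551): Cond tested against the pooled residual


def mean(values):
    return sum(values) / len(values)


def simulate_f_ratios(rng, n_subj=24, n_rep=12, slope_sd=40.0, resid_sd=50.0):
    """Simulate one experiment with NO population effect; return both F ratios."""
    cells = []  # one (A observations, B observations) pair per subject
    for _ in range(n_subj):
        intercept = rng.gauss(800.0, 100.0)  # subject baseline
        slope = rng.gauss(0.0, slope_sd)     # subject-specific effect, mean zero
        a = [intercept - slope / 2 + rng.gauss(0.0, resid_sd) for _ in range(n_rep)]
        b = [intercept + slope / 2 + rng.gauss(0.0, resid_sd) for _ in range(n_rep)]
        cells.append((a, b))

    cell_means = [(mean(a), mean(b)) for a, b in cells]
    subj_means = [(ma + mb) / 2 for ma, mb in cell_means]
    cond_means = (mean([m[0] for m in cell_means]), mean([m[1] for m in cell_means]))
    grand = mean(cond_means)

    ss_cond = n_subj * n_rep * sum((cm - grand) ** 2 for cm in cond_means)
    ss_inter = n_rep * sum(
        (cell_means[s][c] - subj_means[s] - cond_means[c] + grand) ** 2
        for s in range(n_subj) for c in range(2)
    )
    ss_within = sum(
        (y - cell_means[s][c]) ** 2
        for s in range(n_subj) for c in range(2) for y in cells[s][c]
    )
    df_inter = n_subj - 1
    df_within = n_subj * 2 * (n_rep - 1)

    # Cond has 1 df, so MS_cond equals ss_cond.
    f_slopes = ss_cond / (ss_inter / df_inter)
    f_ri_only = ss_cond / ((ss_inter + ss_within) / (df_inter + df_within))
    return f_slopes, f_ri_only


def type1_rates(n_sims=500, seed=1):
    """Proportion of null experiments declared significant under each error term."""
    rng = random.Random(seed)
    results = [simulate_f_ratios(rng) for _ in range(n_sims)]
    rate_slopes = mean([f > F_CRIT_1_23 for f, _ in results])
    rate_ri_only = mean([f > F_CRIT_1_551 for _, f in results])
    return rate_slopes, rate_ri_only


rate_slopes, rate_ri_only = type1_rates()
print(f"Type-I error, Cond x Subject error term:  {rate_slopes:.3f}")
print(f"Type-I error, pooled residual error term: {rate_ri_only:.3f}")
```

Under these assumptions the interaction-based test should stay near the nominal 5% level, while the pooled-residual test rejects the true null far more often.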

Keeping it maximal

• Choice of random effect structure for confirmatory analysis should be design-driven
  – Just as it used to be in the past
• A maximal model encodes the dependencies that are likely to be created by the sampling method itself (e.g., repeated measures over sampling units)
• Assumption: Variability is the norm
  – Different subjects (items) are differentially sensitive to (differentially able to create) the experimental manipulation of interest
  – No effect is constant across subjects or items!
  – Random slopes capture precisely that variability

Data-driven approaches?

• Let the data “speak for themselves” whether random slopes should be included or not (model selection)
• Danger: Designs are typically not optimised for the detection of random effects (Type-II errors)
• The assumption of unsystematic effect-variability in the population makes sense even if not confirmed by the present sample of data
• Why test random slopes but not random intercepts?
• Why worry about “overfitting the data” but not about “underfitting the design”?
• Maximal models are not necessarily conservative

Generalisation performance of different approaches

Method

Data-driven approaches

Performance: WSBI (within-subjects/between-items) design

[Figure: two panels, Type-I error and Power]

Performance: WSWI (within-subjects/within-items) design

[Figure: two panels, Type-I error and Power]

Data-driven approaches

• Never outperform maximal models!
• Asymptote towards maximal models using
  – Backward/“best path” algorithm combined with very lenient criteria for inclusion of random slopes

Type-I error as a function of critical variances: WSBI design (“reasonable” approaches)

Type-I error as a function of critical variances: WSBI design (“bad” approaches)

Type-I error as a function of critical variances: WSWI design (“reasonable” approaches)

Type-I error as a function of critical variances: WSWI design (“bad” approaches)

The worst-case scenario for maximal models in terms of power

• Even in this rather unrealistic scenario (no by-subject/item variability of effects in the population!)
  – maximal models are not far off the power of the “correct” RI-only model
  – and far more powerful than even

Summary

• Choices about random effects already existed in “traditional” approaches – not only in LMEMs
• Maximal LMEMs perform best across the entire parameter space
• RI-only LMEMs are about the worst thing one can do for confirmatory analysis (even worse than F1 alone)
• Performance of data-driven approaches
  – strongly depends on the algorithm and criteria used
  – does not exceed that of a design-driven maximal LMEM
• Maximal LMEMs are powerful even if they “overfit” the generative process
• Model comparison (LR) better than “z as t”
• Maximal models are best for confirmatory analysis (of continuous data at least)

Back to using ANOVA?

• There is certainly no reason to ‘reject’ or ‘look down upon’ results that are based on an ANOVA approach (ANOVA is still far more trustworthy than a badly performed LMEM!!), but…
  – min-F′ is more conservative than maximal LMEMs
  – F1×F2 becomes anticonservative as population slope variances become very large, and conservative as population slope variances become very small
  – Plus:
    • LMEMs can better handle unbalanced data*
    • LMEMs can model non-normally distributed data (e.g. binary data)*
    • LMEMs allow for testing of continuous predictor interactions*

*GEE as well, btw. (which is what I’m typically using)

Specifying the maximal model

• All sampling units get a random intercept
• Any fixed factor should be accompanied by a by-unit random slope if it
  – is within-unit (repeated measures) and
  – has multiple observations per condition per unit [with only one observation per condition per unit, RI is sufficient]
• Any fixed factor interaction should be accompanied by a by-unit random slope if
  – all constituent factors are within-unit
  – there are multiple observations per unit per cell
    • Where ‘cell’ is a combination of factor levels
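The rules above can be sketched as a small helper that assembles an lmer-style formula string. The function name, its arguments, and the way the design is described are all hypothetical, introduced here only for illustration; the helper produces text and does not fit any model. It assumes a fully crossed set of within-unit factors, so that the number of observations per unit for a given term is the per-cell count multiplied by the levels of the within-unit factors the term collapses over.

```python
from itertools import combinations
from math import prod


def maximal_formula(dv, levels, units):
    """Build an lmer-style formula string for the maximal model.

    levels: number of levels of each fixed factor, e.g. {"A": 2, "B": 2}
    units:  per sampling unit, the factors manipulated within that unit and
            the number of observations per unit per design cell, e.g.
            {"Subject": {"within": ["A", "B"], "obs_per_cell": 1}}
    """
    factors = list(levels)
    random_parts = []
    for unit, spec in units.items():
        within = [f for f in factors if f in spec["within"]]
        terms = ["1"]  # every sampling unit gets a random intercept
        for order in range(1, len(within) + 1):
            for combo in combinations(within, order):
                # Observations per unit for each cell of this term: the other
                # within-unit factors are collapsed, so their levels multiply.
                others = [f for f in within if f not in combo]
                n_obs = spec["obs_per_cell"] * prod(levels[f] for f in others)
                if n_obs > 1:
                    terms.append(":".join(combo))
        random_parts.append(f"({' + '.join(terms)} | {unit})")
    fixed = " * ".join(factors)
    return f"{dv} ~ {fixed} + " + " + ".join(random_parts)


# Toy design from the talk: X within subjects (2 items per word type per
# subject), between items (each item belongs to one word type only).
toy = maximal_formula(
    "Y",
    {"X": 2},
    {"Subject": {"within": ["X"], "obs_per_cell": 2},
     "Item": {"within": [], "obs_per_cell": 4}},
)
print(toy)
```

For the toy design this yields by-subject intercepts and slopes plus by-item intercepts, which is the structure the talk labels “MAXIMAL”.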

Maximal model doesn’t converge – what now?

• Definitely needs further investigation (especially with categorical data), but we recommend:
  – Check for potential extreme values, coding errors etc.
  – Examine the random effects estimates in the non-converged model
    • Remove the highest-order random effect closest to zero, then refit
  – Consider dropping random intercept/slope correlations
    • lmer(Y~X+(1|Subject)+(0+X|Subject))
  – Even dropping close-to-zero random intercepts is worth considering
    • Detrimental to power but not to Type-I error
  – If all of this doesn’t work, give up the idea of “crossing” random effects and perform separate by-subjects and by-items analyses, each with appropriate maximal random effect structure
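The "remove the highest-order random effect closest to zero" step can be sketched as a selection rule. The function below is hypothetical and rests on one reading of that advice: among the random slopes of the highest interaction order, pick the one with the smallest estimated variance. The variance estimates would come from the non-converged fit; the numbers in the usage example are invented.

```python
def term_to_drop(slope_variances):
    """Pick the random slope to remove before refitting.

    slope_variances maps random-slope terms (e.g. "A", "B", "A:B") to their
    estimated variances from the non-converged model.
    Returns None when no slopes are left to remove.
    """
    if not slope_variances:
        return None

    def order(term):
        # "A" is order 1, "A:B" is order 2, and so on.
        return term.count(":") + 1

    highest = max(order(term) for term in slope_variances)
    candidates = [term for term in slope_variances if order(term) == highest]
    return min(candidates, key=lambda term: slope_variances[term])


# Invented estimates for a 2 x 2 within-subjects design.
estimates = {"A": 120.0, "B": 0.4, "A:B": 35.0}
first = term_to_drop(estimates)
print(first)
```

Note that the interaction slope is chosen first even though a main-effect slope has a smaller variance, because lower-order terms are only considered once the higher-order ones are gone.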

Reporting

• Many (published!!) articles just state
  – “… we used a linear mixed effects model with random effects for subjects and items”
  – Not sufficient!
• Report the final random effect structure and how you got there
  – If a random slope required by the design has been dropped for whatever reason, explain to non-experts what that means, e.g.
    • “by dropping the subject random slope for effect X, the model assumes that there is absolutely no subject-specific variation in the size and direction of effect X in the population…”

Conclusion

• Understand the observational dependencies created by your design
• Effects (if they exist) inevitably vary across sampling units
  – Random slopes account for that variability in repeated-measures designs with more than one observation per condition

• Be suspicious of any method (parametric or non-parametric) that uses the same algorithm for different sampling designs

• For confirmatory analysis: