Causal Models for Regression Modeling Strategies Drew Griffin Levy Regression Modeling Strategies Short Course May, 2020


Page 1: Causal Models for Regression Modeling Strategies (hbiostat.org/doc/rms/causalModels.pdf)

Causal Models for Regression Modeling Strategies

Drew Griffin Levy

Regression Modeling Strategies Short Course, May 2020

Page 2

Takeaways: Reasons to consider causal models for regression modeling in observational studies

1. Alternative approaches to variable selection

2. Deeper insight re. how causal inferences from associational models can be questionable

3. Identifying the minimal set (or alternative sets) of adjustments necessary for unbiased estimation of effects

4. Risk of inducing bias with statistical adjustment (collider stratification bias)

5. Clearly and explicitly communicating assumptions about justifications for model specification

Page 3

Resources

• DAGitty: drawing and analyzing causal diagrams (DAGs) (www.dagitty.net/)

• Judea Pearl
  1. Causal Inference in Statistics: A Primer, 2016
  2. Causality: Models, Reasoning and Inference, 2009
  3. The Book of Why: The New Science of Cause and Effect, 2018

• Miguel Hernan
  1. The Causal Inference Book
  2. edX MOOC: Causal Diagrams: Draw Your Assumptions Before Your Conclusions

• Modern Epidemiology, 3rd Ed. Rothman, Greenland, Lash: Chapter 12, Causal Diagrams

• Causal Diagrams for Epidemiologic Research. S. Greenland, J. Pearl, J. Robins. Epidemiology 1999;10:37-48.

• Developing a Protocol for Observational Comparative Effectiveness Research: A User's Guide: Supplement 2, Use of Directed Acyclic Graphs

Page 4

The Epistemological Arc

[Diagram. Labeled stages: NATURE, POPULATION, SAMPLE, DATA, ANALYSIS, INFERENCE, DECISIONS & ACTION. The annotations attached to the diagram are listed below.]

Analytic bias
• Model selection
  – E(β̂ | β̂ “significant”) ≠ β_true
• Model misspecification
• Over-fitting
• Residual confounding
• Arbitrary categorization
• Collider bias

• Risk of selection bias; confounding by indication
• Importance of study / experimental design
• Omitted variables
• Missing data
• Measurement issues
• Information bias

Conventional statistical methods
Likelihood: P(data | 𝚹)

Uncertainties
• Model specification
• Model selection
• Assumptions re. distributions

• Cognition/psychology
• Intentions
• Motivations

Association vs. Causation

Belief ~ Evidence
P(𝚹 | data)

“What we observe is not nature itself, but nature exposed to our method of questioning.”
- Werner Heisenberg
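The model-selection item on this slide, that the expected estimate among "significant" results differs from the true coefficient, can be illustrated with a small simulation. This is only a sketch under assumed values: the true effect (0.1), the standard error (0.1), the 1.96 cutoff, and the function name `significance_filter` are illustrative choices, not taken from the slides.

```python
import random
import statistics

def significance_filter(beta_true=0.1, se=0.1, n_studies=20000, seed=0):
    """Simulate unbiased estimates, then average only the 'significant' ones."""
    rng = random.Random(seed)
    # Each hypothetical study yields an unbiased, normally distributed estimate.
    estimates = [rng.gauss(beta_true, se) for _ in range(n_studies)]
    # Keep the estimates whose z-statistic exceeds the two-sided 5% cutoff.
    significant = [b for b in estimates if abs(b / se) > 1.96]
    return statistics.mean(estimates), statistics.mean(significant)

mean_all, mean_significant = significance_filter()
print(f"mean of all estimates:         {mean_all:.3f}")
print(f"mean of significant estimates: {mean_significant:.3f}")
```

Under these assumptions the unfiltered mean stays close to the true value, while the mean of the estimates that passed the significance filter is noticeably larger.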

Page 5
Page 6
Page 7

We can & will be fooled by data!

“The data are profoundly dumb!”
- Judea Pearl, The Book of Why

• Data help to describe reality, albeit imperfectly

• It is a prevalent mistake to believe that “all the answers [information] are in the data”

• Observations are not objective; Nature is indifferent to furnishing noise vs. signal; the computer cannot divine causes; good faith science requires humility

• Relying on statistical approaches to identifying variables for adjustment and control of confounding can be problematic

Page 8

Alternative point of view: how to identify variables for unbiased estimation

1. How to estimate a primary effect (e.g., treatment) without bias
   • Confounding is a causal phenomenon
   • Confounding: P(Y|X) ≠ P(Y|do(X))

2. Identifying the set(s) of adjustments necessary for unbiased estimation of specific effects

3. Causal models also elucidate
   • Adjustments that induce bias!
   • Selection bias
   • Much else
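To make the inequality P(Y|X) ≠ P(Y|do(X)) concrete, the sketch below computes both quantities exactly for a toy model with one binary common cause Z of X and Y. All probabilities and names (`P_Z`, `p_y1`, `observational`, `interventional`) are hypothetical values chosen for illustration; the interventional quantity uses the standard back-door adjustment formula, which assumes Z is the only confounder.

```python
# Hypothetical structural model: Z -> X, Z -> Y, X -> Y (all variables binary).
P_Z = {0: 0.5, 1: 0.5}              # P(Z=z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}     # P(X=1 | Z=z)

def p_y1(x, z):
    """P(Y=1 | X=x, Z=z); X adds 0.2 to the risk in every stratum of Z."""
    return 0.1 + 0.2 * x + 0.5 * z

def p_x(x, z):
    """P(X=x | Z=z)."""
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1

def observational(x):
    """P(Y=1 | X=x): strata of Z are weighted by P(Z=z | X=x)."""
    joint = {z: P_Z[z] * p_x(x, z) for z in P_Z}
    total = sum(joint.values())
    return sum(p_y1(x, z) * joint[z] / total for z in P_Z)

def interventional(x):
    """P(Y=1 | do(X=x)) by back-door adjustment: strata weighted by P(Z=z)."""
    return sum(p_y1(x, z) * P_Z[z] for z in P_Z)

assoc_rd = observational(1) - observational(0)
causal_rd = interventional(1) - interventional(0)
print(f"associational risk difference: {assoc_rd:.2f}")
print(f"causal risk difference:        {causal_rd:.2f}")
```

In this toy model the associational risk difference (0.50) is much larger than the causal one (0.20), because X=1 is more common in the high-risk stratum Z=1.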

Page 9

“What causes say about data”

• Causal diagrams show how causal relations are expected to translate into associations & independencies

1. Initially, associations & independencies derived from subject matter knowledge are posited in a DAG

2. Then, given the posited model, the associations & independencies observed in the data are computed

• A credible causal model will reconcile the associations & independencies observed with the constraints implied by the posited causal model

• The model remains subject to further criticism: revision, qualification, elaboration, updating, refinement

Page 10

Basic structures in causal models

1. Causal relationship
2. Chains
3. Mediation
4. Confounder
5. Collider

Page 11

Cause-effect

DAGs are both causal models and statistical models (i.e., models that represent associations and independencies)

Causal effects imply associations: e.g., P(Y|X) ≠ P(Y)

Lack of causal effects implies independencies: e.g., P(Y|X) = P(Y)

*Figures, examples and propositions appropriated from Hernan’s Causal Diagrams: Draw Your Assumptions Before Your Conclusions

Page 12

Causal structures: Chains, Junctions and Paths

• Mediation
  • Direct vs. indirect effects
  • Total effect

• Conditional independence:
  • In general, independence means Pr(Y=y|X=x) = Pr(Y=y)
  • Conditional on B (e.g., the mediator in a chain A → B → Y): Pr(Y=y|A=a, B=b) = Pr(Y=y|B=b)
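A chain and its conditional independence can be sketched with a simulation. The linear model below (A → B → Y with coefficients 0.8 and 0.5 and no direct A → Y arrow) is an assumption made for illustration, as are the helper names `slope` and `adjusted_slope`; the adjusted slope uses the residualization (Frisch-Waugh-Lovell) identity rather than a regression library.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def adjusted_slope(x, y, z):
    """Slope of y on x adjusting for z: residualize x on z, then regress y on the residual."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    b_xz = slope(z, x)
    x_resid = [xi - mean_x - b_xz * (zi - mean_z) for xi, zi in zip(x, z)]
    return slope(x_resid, y)

rng = random.Random(1)
n = 20000
A = [rng.gauss(0, 1) for _ in range(n)]
B = [0.8 * a + rng.gauss(0, 1) for a in A]   # A -> B
Y = [0.5 * b + rng.gauss(0, 1) for b in B]   # B -> Y, no direct A -> Y arrow

total_effect = slope(A, Y)               # expected near 0.8 * 0.5 = 0.4
direct_effect = adjusted_slope(A, Y, B)  # expected near 0 once B is held fixed
print(f"total effect of A on Y:       {total_effect:.3f}")
print(f"effect of A on Y given B:     {direct_effect:.3f}")
```

Here the whole effect of A is indirect, so adjusting for the mediator B removes the association; an analysis aiming at the total effect would therefore not adjust for B.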


Page 13

Confounders

• Causal structure with common causes

• Bias: A and Y are not expected to be independent

• Bias: estimation of magnitude of association of A and Y
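The bias in the magnitude of the A-Y association can be sketched with a simulated common cause. The linear model (L → A, L → Y, and a true A → Y effect of 0.3) and the helper names are assumptions for illustration only.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def adjusted_slope(x, y, z):
    """Slope of y on x adjusting for z: residualize x on z, then regress y on the residual."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    b_xz = slope(z, x)
    x_resid = [xi - mean_x - b_xz * (zi - mean_z) for xi, zi in zip(x, z)]
    return slope(x_resid, y)

rng = random.Random(2)
n = 20000
L = [rng.gauss(0, 1) for _ in range(n)]                       # common cause
A = [0.8 * l + rng.gauss(0, 1) for l in L]                    # L -> A
Y = [0.3 * a + 0.7 * l + rng.gauss(0, 1) for a, l in zip(A, L)]  # A -> Y and L -> Y

crude = slope(A, Y)                 # mixes the causal effect with the path through L
adjusted = adjusted_slope(A, Y, L)  # expected near the true effect, 0.3
print(f"crude slope of Y on A:       {crude:.3f}")
print(f"slope adjusted for L:        {adjusted:.3f}")
```

With these assumed coefficients the crude slope is roughly double the true effect, and adjusting for the common cause L recovers it.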


Page 14

Colliders & Collider-stratification bias

• Paths with convergent arrows

• When colliders are not conditioned on, they block pathways

• When colliders are conditioned on, they open pathways

• Thus adjustment can inadvertently induce bias!

• The prevalence of these collider structures is likely underappreciated.
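How adjustment for a collider induces bias can be sketched with a simulation in which A and Y are generated independently and C is their common effect. The model (C = A + Y + noise) and the helper names are illustrative assumptions, not taken from the slides.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def adjusted_slope(x, y, z):
    """Slope of y on x adjusting for z: residualize x on z, then regress y on the residual."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    b_xz = slope(z, x)
    x_resid = [xi - mean_x - b_xz * (zi - mean_z) for xi, zi in zip(x, z)]
    return slope(x_resid, y)

rng = random.Random(3)
n = 20000
A = [rng.gauss(0, 1) for _ in range(n)]
Y = [rng.gauss(0, 1) for _ in range(n)]                 # independent of A by construction
C = [a + y + rng.gauss(0, 1) for a, y in zip(A, Y)]     # collider: A -> C <- Y

unadjusted = slope(A, Y)             # expected near 0: the collider blocks the path
collider_adjusted = adjusted_slope(A, Y, C)  # conditioning on C opens the path
print(f"slope of Y on A, unadjusted:     {unadjusted:.3f}")
print(f"slope of Y on A, adjusted for C: {collider_adjusted:.3f}")
```

Under these assumptions the adjusted slope is close to -0.5 even though A has no effect on Y, which is the collider-stratification bias described above.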

Page 15

Stratifying on a collider is a major culprit in systematic bias

Page 16

Selection Bias and collider-stratification bias

• Common effects do not create an association, unless conditioned on.

• When there is a component of the association due to selecting a subset of the population, we say that there is selection bias.
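Selection bias from conditioning on a common effect can be computed exactly in a toy example. In the sketch below A and Y are independent in the full population, and inclusion in the study (S=1) depends on both; the selection probabilities and the function names are hypothetical.

```python
P_Y1 = 0.5  # P(Y=1); Y is independent of A in the full population

def p_selected(a, y):
    """P(S=1 | A=a, Y=y): both A and Y raise the chance of being selected."""
    return 0.1 + 0.4 * a + 0.4 * y

def risk(a, selected_only):
    """P(Y=1 | A=a), optionally restricted to the selected subset (S=1).

    The marginal distribution of A cancels out because we condition on A=a.
    """
    weights = {}
    for y in (0, 1):
        p_y = P_Y1 if y == 1 else 1.0 - P_Y1
        weights[y] = p_y * (p_selected(a, y) if selected_only else 1.0)
    return weights[1] / (weights[0] + weights[1])

rd_population = risk(1, False) - risk(0, False)
rd_selected = risk(1, True) - risk(0, True)
print(f"risk difference, full population: {rd_population:.3f}")
print(f"risk difference, selected subset: {rd_selected:.3f}")
```

The risk difference is exactly zero in the population but about -0.19 among the selected, so the association seen in the sample is entirely due to selection.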


Page 17

Deconfounding → P(Y|do(X))

• Distinguish concepts: confounding, confounder, and “deconfounding”

• “d-separation”: for any given pattern of paths in the causal model, what pattern of dependencies and independencies we should expect in the data

• “Back-door criterion” for bias evaluation indicates possible sets of variables for unbiased estimation

• Identify the set of adjustments necessary for unbiased estimation of effects
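The d-separation and back-door checks described above can be sketched in a few lines. This is a minimal illustration, not DAGitty's implementation: it tests d-separation with the moralized ancestral graph method, represents a DAG as a dict from each node to its set of children, and uses a hypothetical example graph (L → A, L → Y, A → Y, A → C ← Y).

```python
from itertools import combinations

def ancestors(dag, nodes):
    """Return `nodes` together with all of their ancestors."""
    parents = {}
    for parent, children in dag.items():
        for child in children:
            parents.setdefault(child, set()).add(parent)
    found, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in found:
                found.add(p)
                stack.append(p)
    return found

def d_separated(dag, x, y, given):
    """True if x and y are d-separated by `given` (x and y must not be in `given`)."""
    keep = ancestors(dag, {x, y} | set(given))
    # Undirected version of the ancestral subgraph, with co-parents "married".
    neighbors = {v: set() for v in keep}
    for child in keep:
        pars = [p for p in keep if child in dag.get(p, ())]
        for p in pars:
            neighbors[p].add(child)
            neighbors[child].add(p)
        for p, q in combinations(pars, 2):
            neighbors[p].add(q)
            neighbors[q].add(p)
    # Search from x without passing through the conditioning set.
    seen, stack = {x}, [x]
    while stack:
        for w in neighbors[stack.pop()]:
            if w not in seen and w not in given:
                seen.add(w)
                stack.append(w)
    return y not in seen

def descendants(dag, node):
    """All nodes reachable from `node` along directed edges."""
    found, stack = set(), [node]
    while stack:
        for c in dag.get(stack.pop(), ()):
            if c not in found:
                found.add(c)
                stack.append(c)
    return found

def satisfies_backdoor(dag, x, y, adjust):
    """Back-door criterion: no descendants of x, and all back-door paths blocked."""
    adjust = set(adjust)
    if adjust & descendants(dag, x):
        return False
    # Deleting the arrows out of x leaves only the back-door paths between x and y.
    backdoor_graph = {v: (set() if v == x else set(ch)) for v, ch in dag.items()}
    return d_separated(backdoor_graph, x, y, adjust)

# Hypothetical DAG: L confounds A and Y; C is a common effect of A and Y.
dag = {"L": {"A", "Y"}, "A": {"Y", "C"}, "Y": {"C"}}
results = {name: satisfies_backdoor(dag, "A", "Y", s)
           for name, s in [("none", set()), ("L", {"L"}), ("L,C", {"L", "C"})]}
print(results)
```

For this example graph only the set {L} passes: adjusting for nothing leaves the path through L open, and adding C fails because C is a descendant of the exposure.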

Page 18

DAGitty: drawing and analyzing causal diagrams (DAGs) (www.dagitty.net/)

Staplin N, Herrington WG, Judge PK, Reith CA, Haynes R, Landray MJ, Baigent C, Emberson J. Use of Causal Diagrams to Inform the Design and Interpretation of Observational Studies: An Example from the Study of Heart and Renal Protection (SHARP). Clin J Am

Page 19
Page 20

“Draw your assumptions before your conclusions.” —M. Hernan

• Causal diagrams help us summarize what we know about a problem and communicate our assumptions about its causal structure.

• Causal diagrams help us diagnose biases in causal inference

• Causal diagrams help you organize your expert knowledge visually; and therefore, they help you draw your assumptions before your conclusions.


Page 22

Proposed process for using SCMs and DAGs

1. Think hard about the research question and problem of effect identification

2. Develop DAGs based on subject matter knowledge without looking at data: do not contort the DAG based on data availability

3. Do the causal calculus in DAGitty to identify the minimal set of adjustments necessary for unbiased effect estimation

4. Do analysis and reconcile observations with causal model (this is science)

5. Publish the DAG with the research report.
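One way to support steps 3 and 5 is to keep the DAG as a plain edge list and render it as text that can be pasted into DAGitty or attached to a report. The sketch below is an assumption-laden illustration: the function name `to_dagitty` is hypothetical, and the `dag { ... }` layout with `[exposure]` and `[outcome]` tags reflects my understanding of DAGitty's model-code syntax, which should be checked against the DAGitty manual before use.

```python
def to_dagitty(edges, exposure, outcome):
    """Render (cause, effect) pairs as DAGitty-style model text (assumed syntax)."""
    nodes = sorted({v for edge in edges for v in edge})
    lines = ["dag {"]
    for v in nodes:
        if v == exposure:
            lines.append(f"{v} [exposure]")
        elif v == outcome:
            lines.append(f"{v} [outcome]")
        else:
            lines.append(v)
    # One arrow per line, in the order the edges were supplied.
    lines += [f"{cause} -> {effect}" for cause, effect in edges]
    lines.append("}")
    return "\n".join(lines)

# Hypothetical DAG drawn from subject-matter knowledge, before looking at data.
edges = [("L", "A"), ("L", "Y"), ("A", "Y")]
model_text = to_dagitty(edges, exposure="A", outcome="Y")
print(model_text)
```

Keeping the edge list under version control alongside the analysis code makes it straightforward to publish the same DAG that was used to choose the adjustment set.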

Page 23

Takeaways: Reasons to consider causal models for regression modeling in non-randomized studies

1. Better approaches to variable selection

2. Deeper insight re. how causal inferences from associational models can be questionable

3. Identifying the minimum set of adjustments necessary for unbiased (unconfounded) estimation of effects

4. Risk of collider stratification bias

5. Clearly and explicitly communicating assumptions about justifications for model specification.
