Week 10: Causality with Measured Confounding

Brandon Stewart1

Princeton

November 26 and 28, 2018

1 These slides are heavily influenced by Matt Blackwell, Jens Hainmueller, Erin Hartman, Kosuke Imai and Gary King.

Stewart (Princeton) Week 10: Measured Confounding November 26 and 28, 2018 1 / 89


Where We’ve Been and Where We’re Going...

Last Week
- intro to causal inference

This Week
- Monday:
  - experimental ideal
  - identification with measured confounding
- Wednesday:
  - regression estimation

Next Week
- identification with unmeasured confounding
- instrumental variables

Long Run
- probability → inference → regression → causal inference

Questions?


1 The Experimental Ideal

2 Assumption of No Unmeasured Confounding

3 Estimation Under No Unmeasured Confounding

4 Regression Estimators

5 Regression and Causality

6 Regression Under Heterogeneous Effects

7 Fun with Visualization, Replication and the NYT


Lancet 2001: negative correlation between coronary heart disease mortality and level of vitamin C in bloodstream (controlling for age, gender, blood pressure, diabetes, and smoking)


Lancet 2002: no effect of vitamin C on mortality in controlled placebo trial (controlling for nothing)


Lancet 2003: comparing among individuals with the same age, gender, blood pressure, diabetes, and smoking, those with higher vitamin C levels have lower levels of obesity, lower levels of alcohol consumption, are less likely to have grown up working class, etc.


Why So Much Variation?

Confounders

[Figure: causal diagram relating confounders X, treatment T and outcome Y]


Observational Studies and Experimental Ideal

Randomization forms the gold standard for causal inference, because it balances observed and unobserved confounders in expectation

Cannot always randomize so we do observational studies, where we adjust for the observed covariates and hope that unobservables are balanced

Better than hoping: design observational study to approximate an experiment

- “The planner of an observational study should always ask himself: How would the study be conducted if it were possible to do it by controlled experimentation” (Cochran 1965)
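The contrast between adjusting-and-hoping and randomizing can be seen in a small simulation. The sketch below is only an illustration under made-up assumptions: a single binary confounder `x`, a treatment with a true effect of exactly zero, and hypothetical assignment probabilities of 0.8 and 0.2. None of these numbers come from the Lancet studies.

```python
import random
from statistics import mean


def difference_in_means(n, randomized, seed=0):
    """Simulate n units and return mean(Y | treated) - mean(Y | control).

    The treatment has NO effect on Y in this hypothetical world, so any
    nonzero gap is produced by the confounder x alone.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0       # confounder
        if randomized:
            p = 0.5                              # researcher's coin flip
        else:
            p = 0.8 if x == 1 else 0.2           # x drives treatment uptake
        d = 1 if rng.random() < p else 0
        y = 2.0 * x + rng.gauss(0, 1)            # outcome depends on x only
        (treated if d == 1 else control).append(y)
    return mean(treated) - mean(control)


obs_gap = difference_in_means(20_000, randomized=False)
rct_gap = difference_in_means(20_000, randomized=True)
print(f"observational gap: {obs_gap:.2f}")
print(f"randomized gap:    {rct_gap:.2f}")
```

Under these assumptions the observational comparison shows a sizeable gap even though the treatment does nothing, while the randomized comparison sits near zero.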


Angrist and Pischke’s Frequently Asked Questions

What is the causal relationship of interest?

What is the experiment that could ideally be used to capture the causal effect of interest?

What is your identification strategy?

What is your mode of statistical inference?


Experiment review

An experiment is a study where assignment to treatment is controlled by the researcher.

- Let $p_i = P[D_i = 1]$ be the probability of treatment assignment.
- $p_i$ is controlled and known by the researcher in an experiment.

A randomized experiment is an experiment with the following properties:

1. Positivity: assignment is probabilistic: $0 < p_i < 1$
   - No deterministic assignment.

2. Unconfoundedness: $P[D_i = 1 \mid Y(1), Y(0)] = P[D_i = 1]$
   - Treatment assignment does not depend on any potential outcomes.
   - Sometimes written as $D_i \perp\!\!\!\perp (Y(1), Y(0))$
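The two properties can be checked in a toy setting where, unlike in real data, both potential outcomes are visible. The following is a minimal sketch: the constant effect of 1.0, the choice $p_i = 0.3$ and all variable names are assumptions made for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)
n = 10_000

# Hypothetical potential outcomes; in real data only one is ever observed.
y0 = [rng.gauss(0, 1) for _ in range(n)]
y1 = [v + 1.0 for v in y0]

# The researcher chooses, and therefore knows, every p_i.
p = [0.3] * n
positivity_holds = all(0 < p_i < 1 for p_i in p)

# Assignment uses only p_i and a coin flip, never Y(1) or Y(0).
d = [1 if rng.random() < p_i else 0 for p_i in p]

# Under unconfoundedness, Y(0) should look alike in both arms.
y0_gap = (mean(v for v, d_i in zip(y0, d) if d_i == 1)
          - mean(v for v, d_i in zip(y0, d) if d_i == 0))

# Observed data reveal Y(1) for treated units and Y(0) for controls.
diff_in_means = (mean(v for v, d_i in zip(y1, d) if d_i == 1)
                 - mean(v for v, d_i in zip(y0, d) if d_i == 0))

print(f"positivity holds: {positivity_holds}")
print(f"gap in Y(0) across arms: {y0_gap:.3f}")
print(f"difference in means: {diff_in_means:.3f}")
```

Because assignment ignores the potential outcomes, the gap in Y(0) across arms is close to zero and the difference in means lands near the assumed effect.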


Why do Experiments Help?

Remember selection bias?

$$
\begin{aligned}
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
&= E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0] \\
&= E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 1] + E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0] \\
&= \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{Average Treatment Effect on Treated}}
 + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}
\end{aligned}
$$

In an experiment we know that treatment is randomly assigned. Thus we can do the following:

$$
\begin{aligned}
E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]
&= E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 1] \\
&= E[Y_i(1)] - E[Y_i(0)]
\end{aligned}
$$

When all goes well, an experiment eliminates selection bias.
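The decomposition into ATT plus selection bias is an algebraic identity, so it also holds exactly for sample averages. The sketch below checks this on simulated data in which units self-select into treatment. The data-generating process (effect size, selection rule, seed) is entirely hypothetical.

```python
import random
from statistics import mean

rng = random.Random(2)

units = []
for _ in range(5_000):
    y0 = rng.gauss(0, 1)
    y1 = y0 + 1.0 + 0.5 * rng.gauss(0, 1)
    # Self-selection: units with high Y(0) take treatment more often.
    p = 0.7 if y0 > 0 else 0.3
    d = 1 if rng.random() < p else 0
    units.append((y0, y1, d))

# What the analyst can observe: Y(1) for treated, Y(0) for controls.
mean_y_treated = mean(y1 for y0, y1, d in units if d == 1)
mean_y_control = mean(y0 for y0, y1, d in units if d == 0)
naive = mean_y_treated - mean_y_control

# What only the simulation can see: both potential outcomes.
att = mean(y1 - y0 for y0, y1, d in units if d == 1)
selection_bias = mean(y0 for y0, y1, d in units if d == 1) - mean_y_control

print(f"naive difference:      {naive:.3f}")
print(f"ATT + selection bias:  {att + selection_bias:.3f}")
print(f"selection bias alone:  {selection_bias:.3f}")
```

Here the naive difference overstates the effect on the treated because the treated would have had higher outcomes even without treatment.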


Observational studies

Many different sets of identification assumptions that we’ll cover.

To start, focus on studies that are similar to experiments, just without a known and controlled treatment assignment.

- No guarantee that the treatment and control groups are comparable.

1. Positivity (Common Support): assignment is probabilistic: $0 < P[D_i = 1 \mid X, Y(1), Y(0)] < 1$

2. No unmeasured confounding: $P[D_i = 1 \mid X, Y(1), Y(0)] = P[D_i = 1 \mid X]$
   - For some observed X
   - Also called: unconfoundedness, ignorability, selection on observables, no omitted variables, exogenous, conditionally exchangeable, etc.
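When both assumptions hold for a discrete X, one simple way to use them is to compare treated and control units within each value of X and then average those comparisons over the distribution of X. The sketch below illustrates that stratification idea on simulated data. It is not presented as the estimator used in these slides, and the three-level covariate, the assignment probabilities and the true effect of 1.0 are all assumptions.

```python
import random
from statistics import mean

rng = random.Random(3)
p_given_x = {0: 0.2, 1: 0.5, 2: 0.8}   # hypothetical P[D = 1 | X]

rows = []
for _ in range(40_000):
    x = rng.choice([0, 1, 2])
    d = 1 if rng.random() < p_given_x[x] else 0
    y = 1.0 * d + 2.0 * x + rng.gauss(0, 1)   # true effect is 1.0
    rows.append((x, d, y))


def stratified_ate(rows):
    """Average within-stratum differences in means, weighted by P(X = x)."""
    n = len(rows)
    total = 0.0
    for x in sorted({r[0] for r in rows}):
        treated = [y for x_i, d, y in rows if x_i == x and d == 1]
        control = [y for x_i, d, y in rows if x_i == x and d == 0]
        if not treated or not control:
            # Empirical positivity failure: no comparison is possible here.
            raise ValueError(f"no overlap in stratum X={x}")
        weight = (len(treated) + len(control)) / n
        total += weight * (mean(treated) - mean(control))
    return total


naive = (mean(y for _, d, y in rows if d == 1)
         - mean(y for _, d, y in rows if d == 0))
adjusted = stratified_ate(rows)
print(f"naive: {naive:.2f}, stratified: {adjusted:.2f}")
```

The `ValueError` branch is where a positivity violation would surface: a stratum with only treated or only control units offers nothing to compare.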


Designing observational studies

Rubin (2008) argues that we should still “design” our observational studies:

- Pick the ideal experiment that corresponds to this observational study.
- Hide the outcome data.
- Try to estimate the randomization procedure.
- Analyze this as an experiment with this estimated procedure.

This tries to minimize “snooping” by picking the best modeling strategy before seeing the outcome.


Discrete covariates

Suppose that we knew that \(D_i\) was unconfounded within levels of a binary \(X_i\).

Then we could always estimate the causal effect using iterated expectations as in a stratified randomized experiment:

\[
\begin{aligned}
E_X\bigl\{E[Y_i \mid D_i = 1, X_i] - E[Y_i \mid D_i = 0, X_i]\bigr\}
&= \underbrace{\bigl(E[Y_i \mid D_i = 1, X_i = 1] - E[Y_i \mid D_i = 0, X_i = 1]\bigr)}_{\text{diff-in-means for } X_i = 1}\;\underbrace{P[X_i = 1]}_{\text{share of } X_i = 1} \\
&\quad + \underbrace{\bigl(E[Y_i \mid D_i = 1, X_i = 0] - E[Y_i \mid D_i = 0, X_i = 0]\bigr)}_{\text{diff-in-means for } X_i = 0}\;\underbrace{P[X_i = 0]}_{\text{share of } X_i = 0}
\end{aligned}
\]

Never used our knowledge of the randomization for this quantity.
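As a rough illustration of the weighted sum above, the sketch below computes the within-stratum differences in means and weights them by the stratum shares. The function name `stratified_effect` and the toy data are hypothetical and chosen only so the arithmetic is easy to follow. It assumes a binary treatment and a binary covariate.

```python
from collections import defaultdict

def stratified_effect(rows):
    """Stratified estimate from rows of (y, d, x), with d and x in {0, 1}."""
    sums = defaultdict(float)   # (d, x) -> sum of outcomes
    counts = defaultdict(int)   # (d, x) -> number of units
    for y, d, x in rows:
        sums[(d, x)] += y
        counts[(d, x)] += 1
    n = sum(counts.values())
    effect = 0.0
    for x in (0, 1):
        # Positivity: each stratum needs both treated and control units.
        if counts[(1, x)] == 0 or counts[(0, x)] == 0:
            raise ValueError(f"no overlap in stratum X={x}")
        diff = sums[(1, x)] / counts[(1, x)] - sums[(0, x)] / counts[(0, x)]
        share = (counts[(1, x)] + counts[(0, x)]) / n
        effect += diff * share
    return effect

# Hypothetical toy data: (outcome, treatment, covariate).
toy = [
    (3.0, 1, 0), (1.0, 0, 0), (1.0, 0, 0), (1.0, 0, 0),
    (6.0, 1, 1), (6.0, 1, 1), (6.0, 1, 1), (5.0, 0, 1),
]
treated = [y for y, d, _ in toy if d == 1]
control = [y for y, d, _ in toy if d == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)

print(stratified_effect(toy))  # weighted within-stratum differences
print(naive)                   # pooled difference in means
```

In this toy data the pooled difference in means is larger than the stratified estimate because treated units are concentrated in the stratum with higher outcomes.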


Stratification Example: Smoking and Mortality (Cochran, 1968)

Table 1: Death Rates per 1,000 Person-Years

| Smoking group | Canada | U.K. | U.S. |
|---|---|---|---|
| Non-smokers | 20.2 | 11.3 | 13.5 |
| Cigarettes | 20.5 | 14.1 | 13.5 |
| Cigars/pipes | 35.5 | 20.7 | 17.4 |


Stratification Example: Smoking and Mortality (Cochran, 1968)

Table 2: Mean Ages, Years

| Smoking group | Canada | U.K. | U.S. |
|---|---|---|---|
| Non-smokers | 54.9 | 49.1 | 57.0 |
| Cigarettes | 50.5 | 49.8 | 53.2 |
| Cigars/pipes | 65.9 | 55.7 | 59.7 |


Stratification

To control for differences in age, we would like to compare different smoking-habit groups with the same age distribution.

One possibility is to use stratification:

- for each country, divide each group into different age subgroups
- calculate death rates within age subgroups
- average the within-age-subgroup death rates using fixed weights (e.g. number of cigarette smokers)


Stratification: Example

| Age group | Death rate, pipe smokers | # Pipe smokers | # Non-smokers |
|---|---|---|---|
| Age 20 - 50 | 15 | 11 | 29 |
| Age 50 - 70 | 35 | 13 | 9 |
| Age 70+ | 50 | 16 | 2 |
| Total | | 40 | 40 |

What is the average death rate for Pipe Smokers?

\[
15 \cdot (11/40) + 35 \cdot (13/40) + 50 \cdot (16/40) = 35.5
\]


What is the average death rate for Pipe Smokers if they had the same age distribution as Non-Smokers?

\[
15 \cdot (29/40) + 35 \cdot (9/40) + 50 \cdot (2/40) = 21.25 \approx 21.2
\]
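The two weighted averages can be reproduced with a few lines of code. This is a minimal sketch using the death rates and group sizes from the table; the helper name `standardized_rate` is hypothetical.

```python
# Death rates and group sizes copied from the stratification table.
death_rate_pipe = [15, 35, 50]   # pipe smokers' death rate by age group
n_pipe = [11, 13, 16]            # number of pipe smokers per age group
n_non = [29, 9, 2]               # number of non-smokers per age group

def standardized_rate(rates, weights):
    """Average the stratum rates using the given stratum counts as weights."""
    total = sum(weights)
    return sum(r * w / total for r, w in zip(rates, weights))

# Pipe smokers' death rate under their own age distribution.
own = standardized_rate(death_rate_pipe, n_pipe)
# Pipe smokers' death rate under the non-smokers' age distribution.
adjusted = standardized_rate(death_rate_pipe, n_non)

print(own, adjusted)
```

Only the weights change between the two calls. The within-age death rates stay fixed, which is what makes the adjusted rate comparable to the non-smokers' rate.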


Smoking and Mortality (Cochran, 1968)

Table 3: Adjusted Death Rates using 3 Age groups

| Smoking group | Canada | U.K. | U.S. |
|---|---|---|---|
| Non-smokers | 20.2 | 11.3 | 13.5 |
| Cigarettes | 28.3 | 12.8 | 17.7 |
| Cigars/pipes | 21.2 | 12.0 | 14.2 |


Continuous covariates

So, great, we can stratify. Why not do this all the time?

What if \(X_i\) = income for unit \(i\)?

- Each unit has its own value of \(X_i\): $54,134, $123,043, $23,842.
- If \(X_i = 54134\) is unique, we will only observe one of the two terms in
  \[
  E[Y_i \mid D_i = 1, X_i = 54134] - E[Y_i \mid D_i = 0, X_i = 54134]
  \]
- So we cannot stratify on each unique value of \(X_i\).

Practically, this is massively important: we almost always have data with unique values.

One option is to discretize, as we discussed with age. We will discuss more options later this week!
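A small sketch of why exact stratification fails with a continuous covariate, and how coarsening can help. The units, the $50,000 bracket width, and the function name `comparable_strata` are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical units: (income, treated, outcome); every income is unique.
units = [
    (23842, 0, 10.0), (31500, 1, 13.0), (38210, 0, 11.0),
    (54134, 1, 16.0), (61020, 0, 14.0), (75900, 1, 17.0),
    (98400, 0, 18.0), (123043, 1, 21.0),
]

def comparable_strata(units, key):
    """Return the stratum labels that contain both treated and control units."""
    seen = defaultdict(set)
    for income, d, _ in units:
        seen[key(income)].add(d)
    return sorted(label for label, ds in seen.items() if ds == {0, 1})

# Stratifying on the exact income: no stratum has both groups.
exact = comparable_strata(units, key=lambda inc: inc)
# Stratifying on $50,000 income brackets: some strata become comparable.
binned = comparable_strata(units, key=lambda inc: inc // 50000)

print(exact)
print(binned)
```

In this toy data the top bracket still contains only a treated unit, so coarsening does not guarantee common support in every stratum.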


Identification Under Selection on Observables

Identification Assumption

1. \((Y_1, Y_0) \perp\!\!\!\perp D \mid X\) (selection on observables)

2. \(0 < \Pr(D = 1 \mid X) < 1\) with probability one (common support)

Identification Result

Given selection on observables we have

\[
\begin{aligned}
E[Y_1 - Y_0 \mid X] &= E[Y_1 - Y_0 \mid X, D = 1] \\
&= E[Y \mid X, D = 1] - E[Y \mid X, D = 0]
\end{aligned}
\]

Therefore, under the common support condition:

\[
\begin{aligned}
\tau_{ATE} = E[Y_1 - Y_0] &= \int E[Y_1 - Y_0 \mid X] \, dP(X) \\
&= \int \bigl(E[Y \mid X, D = 1] - E[Y \mid X, D = 0]\bigr) \, dP(X)
\end{aligned}
\]
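One way to spell out the step from potential outcomes to observed outcomes is sketched below. It assumes the usual consistency relation \(Y = D Y_1 + (1 - D) Y_0\), which is not restated on this slide.

```latex
\begin{align*}
E[Y_1 - Y_0 \mid X]
  &= E[Y_1 \mid X] - E[Y_0 \mid X] \\
  &= E[Y_1 \mid X, D = 1] - E[Y_0 \mid X, D = 0]
     && \text{(selection on observables)} \\
  &= E[Y \mid X, D = 1] - E[Y \mid X, D = 0]
     && \text{(since } Y = D Y_1 + (1 - D) Y_0 \text{)}
\end{align*}
```

Common support is what guarantees that both conditional expectations in the last line are defined for every value of \(X\) that we integrate over.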


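Identification Result

Similarly,

\[
\begin{aligned}
\tau_{ATT} &= E[Y_1 - Y_0 \mid D = 1] \\
&= \int \bigl(E[Y \mid X, D = 1] - E[Y \mid X, D = 0]\bigr) \, dP(X \mid D = 1)
\end{aligned}
\]

To identify \(\tau_{ATT}\) the selection on observables and common support conditions can be relaxed to:

- \(Y_0 \perp\!\!\!\perp D \mid X\) (SOO for Controls)
- \(\Pr(D = 1 \mid X) < 1\) (Weak Overlap)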
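With a discrete \(X\), the two integrals differ only in the weights placed on each stratum: the ATE uses the overall distribution of \(X\), while the ATT uses the distribution of \(X\) among the treated. The sketch below illustrates this with hypothetical stratum summaries; the numbers and the name `weighted_effect` are made up for the example.

```python
# Hypothetical strata summaries: for each value of X, the within-stratum
# difference in means and the counts of treated and control units.
strata = {
    0: {"diff": 2.0, "n_treated": 10, "n_control": 30},
    1: {"diff": 4.0, "n_treated": 30, "n_control": 10},
}

def weighted_effect(strata, weight):
    """Average the stratum differences using weights proportional to weight(s)."""
    total = sum(weight(s) for s in strata.values())
    return sum(s["diff"] * weight(s) / total for s in strata.values())

# ATE: weight each stratum by its share of all units, i.e. P(X).
ate = weighted_effect(strata, lambda s: s["n_treated"] + s["n_control"])
# ATT: weight each stratum by its share of treated units, i.e. P(X | D = 1).
att = weighted_effect(strata, lambda s: s["n_treated"])

print(ate, att)
```

Here the ATT exceeds the ATE because treated units are concentrated in the stratum with the larger effect.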


Identification Under Selection on Observables

| Unit \(i\) | Potential outcome under treatment \(Y_{1i}\) | Potential outcome under control \(Y_{0i}\) | \(D_i\) | \(X_i\) |
|---|---|---|---|---|
| 1, 2 | \(E[Y_1 \mid X = 0, D = 1]\) | \(E[Y_0 \mid X = 0, D = 1] = E[Y_0 \mid X = 0, D = 0]\) | 1 | 0 |
| 3, 4 | \(E[Y_1 \mid X = 0, D = 0] = E[Y_1 \mid X = 0, D = 1]\) | \(E[Y_0 \mid X = 0, D = 0]\) | 0 | 0 |
| 5, 6 | \(E[Y_1 \mid X = 1, D = 1]\) | \(E[Y_0 \mid X = 1, D = 1] = E[Y_0 \mid X = 1, D = 0]\) | 1 | 1 |
| 7, 8 | \(E[Y_1 \mid X = 1, D = 0] = E[Y_1 \mid X = 1, D = 1]\) | \(E[Y_0 \mid X = 1, D = 0]\) | 0 | 1 |

\((Y_1, Y_0) \perp\!\!\!\perp D \mid X\) implies that we conditioned on all confounders. The treatment is randomly assigned within each stratum of \(X\):

\[
E[Y_0 \mid X = 0, D = 1] = E[Y_0 \mid X = 0, D = 0] \quad \text{and} \quad E[Y_0 \mid X = 1, D = 1] = E[Y_0 \mid X = 1, D = 0]
\]

\((Y_1, Y_0) \perp\!\!\!\perp D \mid X\) also implies

\[
E[Y_1 \mid X = 0, D = 1] = E[Y_1 \mid X = 0, D = 0] \quad \text{and} \quad E[Y_1 \mid X = 1, D = 1] = E[Y_1 \mid X = 1, D = 0]
\]


1 The Experimental Ideal

2 Assumption of No Unmeasured Confounding

3 Estimation Under No Unmeasured Confounding

4 Regression Estimators

5 Regression and Causality

6 Regression Under Heterogeneous Effects

7 Fun with Visualization, Replication and the NYT


What is confounding?

Confounding is the bias caused by common causes of the treatment and outcome.

- Leads to “spurious correlation.”

In observational studies, the goal is to avoid confounding inherent in the data.

Pervasive in the social sciences:

- effect of income on voting (confounding: age)
- effect of job training program on employment (confounding: motivation)
- effect of political institutions on economic development (confounding: previous economic development)

No unmeasured confounding assumes that we’ve measured all sources of confounding.
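To make the job-training example concrete, here is a small sketch of a hypothetical population in which motivation drives both training and the outcome, while training itself has no effect. All numbers are invented for illustration.

```python
# Hypothetical population: (motivated, trained, employment_score).
# The outcome depends only on motivation, so the true effect of training is 0.
population = (
    [(0, 0, 0.0)] * 8 + [(0, 1, 0.0)] * 2 +   # unmotivated: mostly untrained
    [(1, 0, 2.0)] * 2 + [(1, 1, 2.0)] * 8     # motivated: mostly trained
)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def diff_in_means(rows):
    """Treated-minus-control difference in mean outcomes."""
    treated = mean(y for _, d, y in rows if d == 1)
    control = mean(y for _, d, y in rows if d == 0)
    return treated - control

# Pooled comparison: picks up the spurious correlation through motivation.
naive = diff_in_means(population)
# Comparison within each level of the confounder.
within = {m: diff_in_means([r for r in population if r[0] == m]) for m in (0, 1)}

print(naive)
print(within)
```

The pooled difference is positive even though training does nothing here, and it vanishes once we compare units with the same motivation. In real data motivation is typically unmeasured, which is exactly when this adjustment is unavailable.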


Big problem

How can we determine whether no unmeasured confounding holds if we didn’t assign the treatment?

Put differently:

- What covariates do we need to condition on?
- What covariates do we need to include in our regressions?

One way, from the assumption itself:

- $P[D_i = 1 \mid X, Y(1), Y(0)] = P[D_i = 1 \mid X]$
- Include covariates such that, conditional on them, the treatment assignment does not depend on the potential outcomes.

Another way: use DAGs and look at backdoor paths.
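
For reference, the assumption can be restated in conditional-independence notation, together with the adjustment formula it licenses. This is a sketch of the standard result, not something stated on this slide. The second line additionally assumes consistency of the potential outcomes and positivity, $0 < P[D_i = 1 \mid X] < 1$.

```latex
\begin{align*}
P[D_i = 1 \mid X, Y(1), Y(0)] = P[D_i = 1 \mid X]
  &\iff \big(Y_i(1), Y_i(0)\big) \perp\!\!\!\perp D_i \mid X_i \\
E[Y_i(1) - Y_i(0)]
  &= E_X\Big[\, E[Y_i \mid D_i = 1, X_i] - E[Y_i \mid D_i = 0, X_i] \,\Big]
\end{align*}
```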

Backdoor paths and blocking paths

Backdoor path: a non-causal path from D to Y.

- Would remain if we removed any arrows pointing out of D.

Backdoor paths between D and Y typically arise from common causes of D and Y:

[Figure: DAG over D, X, and Y in which X points into both D and Y.]

Here there is a backdoor path D ← X → Y, where X is a common cause of the treatment and the outcome.

Other types of confounding

[Figure: DAG over D, U, X, and Y.]

D is enrolling in a job training program.

Y is getting a job.

U is being motivated.

X is the number of job applications sent out.

Big assumption here: no arrow from U to Y.

Other types of confounding

[Figure: DAG over D, U, X, and Y.]

D is exercise.

Y is having a disease.

U is lifestyle.

X is smoking.

Big assumption here: no arrow from U to Y.

What’s the problem with backdoor paths?

[Figure: DAG over D, U, X, and Y.]

A path is blocked if:

1 we control for or stratify on a non-collider on that path, OR
2 we do not control for a collider on that path (or a descendant of one).

Unblocked backdoor paths lead to confounding.

In the DAG here, if we condition on X, then the backdoor path is blocked.
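
A small simulation can illustrate blocking by stratification. This is a sketch under stated assumptions: the figure did not survive extraction, so the code assumes the arrows U → D, U → X, X → Y, and D → Y (no arrow from U to Y), and all probabilities are made up. By construction the true effect of D on Y is 0.1. The naive difference in means is contaminated by the backdoor path D ← U → X → Y, while stratifying on the measured X recovers the effect without ever observing U.

```python
import random


def simulate(n=40000, seed=1):
    """Binary data from the ASSUMED DAG U -> D, U -> X, X -> Y, D -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.random() < 0.5                  # unmeasured: motivated
        d = rng.random() < (0.8 if u else 0.2)  # enrolls in job training
        x = rng.random() < (0.8 if u else 0.2)  # sends many applications
        p_job = 0.2 + 0.1 * d + 0.4 * x         # no direct U -> Y arrow
        y = rng.random() < p_job
        rows.append((int(d), int(x), int(y)))   # U is deliberately not recorded
    return rows


def mean_y(rows, d, x=None):
    """Mean outcome among units with treatment d (and covariate x, if given)."""
    ys = [yy for dd, xx, yy in rows if dd == d and (x is None or xx == x)]
    return sum(ys) / len(ys)


def naive_difference(rows):
    """Treated-minus-control difference in means, ignoring X."""
    return mean_y(rows, 1) - mean_y(rows, 0)


def stratified_difference(rows):
    """Average the within-stratum differences, weighting by the share of X = x."""
    total = 0.0
    for x in (0, 1):
        share = sum(1 for _, xx, _ in rows if xx == x) / len(rows)
        total += share * (mean_y(rows, 1, x) - mean_y(rows, 0, x))
    return total


rows = simulate()
naive = naive_difference(rows)
stratified = stratified_difference(rows)
print(f"naive difference:      {naive:.3f}")
print(f"stratified difference: {stratified:.3f}  (true effect is 0.1)")
```

If an arrow from U to Y were added to the data-generating process, stratifying on X alone would no longer suffice, which is why the slide flags that arrow's absence as a big assumption.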

Not all backdoor paths

[Figure: DAG over D, U1, X, and Y.]

Conditioning on post-treatment covariates opens the non-causal path.

- This is selection bias.
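
The selection-bias mechanism can be sketched in a simulation. The arrows are an assumption, since the figure did not survive extraction: the code takes D → X ← U1 → Y, randomizes D, and gives D no effect on Y. All numbers are illustrative. In the full sample the difference in means is near zero, as it should be. Restricting to units with X = 1 (that is, conditioning on the post-treatment collider) produces a clearly negative difference.

```python
import random


def simulate(n=40000, seed=2):
    """ASSUMED DAG D -> X <- U1 -> Y, with NO effect of D on Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        d = int(rng.random() < 0.5)              # randomized treatment
        u1 = rng.gauss(0, 1)                     # unmeasured cause of X and Y
        x = int(d + u1 + rng.gauss(0, 1) > 0.5)  # post-treatment covariate
        y = u1 + rng.gauss(0, 1)                 # outcome ignores D entirely
        rows.append((d, x, y))
    return rows


def diff_in_means(rows):
    """Mean of Y among treated minus mean of Y among controls."""
    treated = [y for d, _, y in rows if d == 1]
    control = [y for d, _, y in rows if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


rows = simulate()
full_sample = diff_in_means(rows)
selected = diff_in_means([r for r in rows if r[1] == 1])  # condition on X = 1
print(f"full sample:        {full_sample:.3f}")
print(f"only units with X=1: {selected:.3f}")
```

The intuition: control units need a higher U1 than treated units to reach X = 1, so among the selected units the controls have systematically higher U1, and therefore higher Y.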

Don’t condition on post-treatment variables

Every time you do, a puppy cries.

M-bias

[Figure: DAG over D, U1, U2, X, and Y, with X a collider on the backdoor path.]

Not all backdoor paths induce confounding.

This backdoor path is blocked by the collider X that we don’t control for.

Controlling for X opens the path and induces confounding.

- Sometimes called M-bias.

Controversial because of differing views on what to control for:

- Rubin thinks that M-bias is a “mathematical curiosity” and we should control for all pretreatment variables.
- Pearl and others think M-bias is a real threat.
- See the Elwert and Winship piece for more!
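
A simulation sketch of M-bias, under assumptions that are not in the slides: a linear model with made-up unit coefficients and the M-shaped arrows U1 → D, U1 → X, U2 → X, U2 → Y, plus D → Y with a true effect of 1. The regression of Y on D alone is unbiased because the only backdoor path is blocked by the collider X. Adding the pretreatment variable X as a control opens that path and shifts the coefficient.

```python
import random


def simulate(n=40000, seed=3, effect=1.0):
    """ASSUMED M-shaped DAG: U1 -> D, U1 -> X, U2 -> X, U2 -> Y, D -> Y."""
    rng = random.Random(seed)
    ds, xs, ys = [], [], []
    for _ in range(n):
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)  # both unmeasured
        d = u1 + rng.gauss(0, 1)
        x = u1 + u2 + rng.gauss(0, 1)              # pretreatment collider
        y = effect * d + u2 + rng.gauss(0, 1)
        ds.append(d)
        xs.append(x)
        ys.append(y)
    return ds, xs, ys


def cov(a, b):
    """Sample covariance (dividing by n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def coef_on_d(ds, ys, xs=None):
    """OLS coefficient on D from Y ~ D, or from Y ~ D + X when xs is given."""
    if xs is None:
        return cov(ds, ys) / cov(ds, ds)
    num = cov(ds, ys) * cov(xs, xs) - cov(xs, ys) * cov(ds, xs)
    den = cov(ds, ds) * cov(xs, xs) - cov(ds, xs) ** 2
    return num / den


ds, xs, ys = simulate()
without_x = coef_on_d(ds, ys)
with_x = coef_on_d(ds, ys, xs)
print(f"Y ~ D:     {without_x:.3f}")
print(f"Y ~ D + X: {with_x:.3f}")
```

With these particular coefficients the population value of the adjusted coefficient is 4/5 rather than 1. How large the bias is in practice depends entirely on the strengths of the arrows, which is part of what the Rubin and Pearl disagreement is about.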

Backdoor criterion

Can we use a DAG to evaluate no unmeasured confounders?

Pearl answered yes, with the backdoor criterion, which states that the effect of D on Y is identified if:

1 No backdoor paths from D to Y, OR
2 Measured covariates are sufficient to block all backdoor paths from D to Y.

The first condition really only holds for randomized experiments.

The backdoor criterion is fairly powerful. It tells us:

- if there is confounding given this DAG,
- if it is possible to remove the confounding, and
- what variables to condition on to eliminate the confounding.
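
The criterion is mechanical enough to check by program. The following is a minimal sketch, not a reference implementation: the DAG encoding (a dict from each node to its children) and all function names are hypothetical choices made for illustration. It enumerates the paths from D to Y that start with an arrow into D and applies the blocking rule to each. It also enforces the requirement, part of Pearl's formal statement though not spelled out on the slide, that the conditioning set contain no descendants of D.

```python
def descendants(dag, node):
    """All nodes reachable from `node` by following arrows."""
    seen, stack = set(), [node]
    while stack:
        for child in dag.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def backdoor_paths(dag, d, y):
    """All simple paths from d to y whose first edge points INTO d."""
    nbrs = {}  # undirected adjacency built from the directed edges
    for parent, children in dag.items():
        for child in children:
            nbrs.setdefault(parent, set()).add(child)
            nbrs.setdefault(child, set()).add(parent)
    paths = []

    def walk(path):
        last = path[-1]
        if last == y:
            paths.append(path)
            return
        for nxt in sorted(nbrs.get(last, ())):
            if nxt in path:
                continue  # keep paths simple (no repeated nodes)
            if len(path) == 1 and d not in dag.get(nxt, ()):
                continue  # first edge must be nxt -> d
            walk(path + [nxt])

    walk([d])
    return paths


def is_blocked(dag, path, conditioned):
    """Blocking rule for one path given a conditioning set."""
    for i in range(1, len(path) - 1):
        prev, node, nxt = path[i - 1], path[i], path[i + 1]
        collider = node in dag.get(prev, ()) and node in dag.get(nxt, ())
        if collider:
            # A collider blocks unless it or one of its descendants is conditioned on.
            if node not in conditioned and not (descendants(dag, node) & conditioned):
                return True
        elif node in conditioned:
            return True  # conditioned non-collider blocks the path
    return False


def satisfies_backdoor(dag, d, y, conditioned):
    """True if `conditioned` meets the backdoor criterion for (d, y)."""
    conditioned = set(conditioned)
    if conditioned & descendants(dag, d):
        return False  # no descendants of the treatment allowed
    return all(is_blocked(dag, p, conditioned) for p in backdoor_paths(dag, d, y))


# Hypothetical encodings of two DAG shapes discussed in this section.
confounder = {"X": ["D", "Y"], "D": ["Y"]}
m_shape = {"U1": ["D", "X"], "U2": ["X", "Y"], "D": ["Y"]}

print(satisfies_backdoor(confounder, "D", "Y", set()))   # X left unblocked
print(satisfies_backdoor(confounder, "D", "Y", {"X"}))   # X blocks the path
print(satisfies_backdoor(m_shape, "D", "Y", set()))      # collider X blocks
print(satisfies_backdoor(m_shape, "D", "Y", {"X"}))      # conditioning opens it
```

Path enumeration is exponential in the worst case, so this approach only suits small hand-drawn DAGs like the ones in these slides.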

Example: Sufficient Conditioning Sets

[Figure: DAG with nodes X, Y, Z1 to Z5, and U1 to U11.]

Remove arrows out of X.

Recall that paths are blocked by “unconditioned colliders” or conditioned non-colliders.

Page 137: Week 10: Causality with Measured Confounding...Anexperimentis a study where assignment to treatment is controlled by the researcher. I p i = P[D i = 1] be the probability of treatment

Example: Sufficient Conditioning Sets

U1●

U3

● Z1 ●Z2 ●Z3

X●

Z5●

Y

U9

U11

Z4

U2

U5

U4

U6

●U7

U10

U8

●● ●●

●● ●●

●●

●●

●●

●●

●●

●●

●●

●●

Recall that paths are blocked by “unconditioned colliders” or conditioned non-colliders

Stewart (Princeton) Week 10: Measured Confounding November 26 and 28, 2018 35 / 89

Page 138: Week 10: Causality with Measured Confounding...Anexperimentis a study where assignment to treatment is controlled by the researcher. I p i = P[D i = 1] be the probability of treatment

Example: Sufficient Conditioning Sets

U1●

U3

● Z1 ●Z2 ●Z3

X●

Z5●

Y

U9

U11

Z4

U2

U5

U4

U6

●U7

U10

U8

●● ●●

●● ●●

●●

●●

●●

●●

●●

●●

●●

●●

Recall that paths are blocked by “unconditioned colliders” or conditioned non-colliders

Stewart (Princeton) Week 10: Measured Confounding November 26 and 28, 2018 36 / 89

Page 139: Week 10: Causality with Measured Confounding...Anexperimentis a study where assignment to treatment is controlled by the researcher. I p i = P[D i = 1] be the probability of treatment

Example: Sufficient Conditioning Sets

U1●

U3

● Z1 ●Z2 ●Z3

X●

Z5●

Y

U9

U11

Z4

U2

U5

U4

U6

●U7

U10

U8

●● ●●

●● ●●

●●

●●

●●

●●

●●

●●

●●

●●

Recall that paths are blocked by “unconditioned colliders” or conditioned non-colliders

Stewart (Princeton) Week 10: Measured Confounding November 26 and 28, 2018 37 / 89

Page 140: Week 10: Causality with Measured Confounding...Anexperimentis a study where assignment to treatment is controlled by the researcher. I p i = P[D i = 1] be the probability of treatment

Example: Non-sufficient Conditioning Sets

[Figure: DAG over the nodes X, Y, Z1–Z5, and U1–U11; the edges are not recoverable from the extracted text.]

Recall that paths are blocked by “unconditioned colliders” or conditioned non-colliders.


Implications (via VanderWeele and Shpitser 2011)

Two common criteria fail here:

1 Choose all pre-treatment covariates
  (would condition on C2, inducing M-bias)

2 Choose all covariates which directly cause the treatment and the outcome
  (would leave open a backdoor path A ← C3 ← U3 → Y)
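A small simulation can show why "condition on everything pre-treatment" can backfire. This is a sketch under assumed values, not the DAG from the slide: the variables `u1`, `u2`, `c`, `a`, `y` and all coefficients are hypothetical, the treatment is continuous for simplicity, and its true effect on the outcome is set to zero. The covariate `c` is pre-treatment but is a collider between the two unobserved causes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

u1 = rng.normal(size=n)            # unobserved cause of treatment and c
u2 = rng.normal(size=n)            # unobserved cause of outcome and c
c = u1 + u2 + rng.normal(size=n)   # pre-treatment collider
a = u1 + rng.normal(size=n)        # treatment
y = u2 + rng.normal(size=n)        # outcome; true effect of a on y is 0


def ols_coef(outcome, *regressors):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(outcome)), *regressors])
    return np.linalg.lstsq(X, outcome, rcond=None)[0]


unadjusted = ols_coef(y, a)[1]     # close to the true effect of 0
adjusted = ols_coef(y, a, c)[1]    # biased away from 0 by conditioning on c

print(f"coefficient on a, no controls:      {unadjusted:.3f}")
print(f"coefficient on a, controlling for c: {adjusted:.3f}")
```

Under these assumed variances the adjusted coefficient converges to -0.2 rather than 0, so the bias comes entirely from adding the control.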


No unmeasured confounders is not testable

No unmeasured confounding places no restrictions on the observed data.

\[
\underbrace{\big(Y_i(0) \mid D_i = 1, X_i\big)}_{\text{unobserved}}
\;\overset{d}{=}\;
\underbrace{\big(Y_i(0) \mid D_i = 0, X_i\big)}_{\text{observed}}
\]

Here, $\overset{d}{=}$ means equal in distribution.

No way to directly test this assumption without the counterfactual data, which is missing by definition!

With backdoor criterion, you must have the correct DAG.
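One way to see the untestability is to build two hypothetical worlds that generate exactly the same observed data. The sketch below is illustrative only: the sample size, the distributions, and the shift of 2.0 are assumptions, and covariates are omitted (think of everything as happening within one stratum of $X_i$). The worlds differ only in the treated units' $Y_i(0)$, which is never observed.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000

d = rng.integers(0, 2, size=n)            # treatment indicator D_i
y1 = rng.normal(loc=1.0, size=n)          # potential outcome Y_i(1)
y0_base = rng.normal(loc=0.0, size=n)     # baseline draw for Y_i(0)

# World A: treated and control units share the same Y_i(0) distribution.
y0_world_a = y0_base.copy()
# World B: treated units would have done better even without treatment.
y0_world_b = np.where(d == 1, y0_base + 2.0, y0_base)


def observed_outcome(d, y1, y0):
    """We only ever see Y_i(1) for treated and Y_i(0) for controls."""
    return np.where(d == 1, y1, y0)


obs_a = observed_outcome(d, y1, y0_world_a)
obs_b = observed_outcome(d, y1, y0_world_b)

att_a = np.mean((y1 - y0_world_a)[d == 1])
att_b = np.mean((y1 - y0_world_b)[d == 1])

print("identical observed data:", np.array_equal(obs_a, obs_b))
print(f"ATT in world A: {att_a:.2f}, ATT in world B: {att_b:.2f}")
```

No test computed from the observed outcomes and treatments can tell the two worlds apart, even though the effect on the treated differs by 2.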


Assessing no unmeasured confounders

Can do “placebo” tests, where $D_i$ cannot have an effect (lagged outcomes, etc.)

DellaVigna and Kaplan (2007, QJE): effect of Fox News availability on Republican vote share

- Availability in 2000/2003 can’t affect past vote shares.

Unconfoundedness could still be violated even if you pass this test!
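A placebo test with a lagged outcome might look like the following sketch. The data are simulated and every name and coefficient is an assumption made for illustration (this is not the DellaVigna and Kaplan analysis): an unmeasured variable `u` drives both treatment and outcomes, and the treatment has no effect on the lagged outcome by construction.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

u = rng.normal(size=n)                           # unmeasured confounder
d = (u + rng.normal(size=n) > 0).astype(float)   # treatment depends on u
y_lag = u + rng.normal(size=n)                   # past outcome; d cannot affect it
y_now = 0.5 * d + u + rng.normal(size=n)         # current outcome; true effect 0.5


def slope(outcome, treatment):
    """Coefficient on treatment from a bivariate least-squares fit."""
    X = np.column_stack([np.ones(len(outcome)), treatment])
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1]


placebo_estimate = slope(y_lag, d)
main_estimate = slope(y_now, d)

print(f"placebo 'effect' on lagged outcome: {placebo_estimate:.2f}")
print(f"estimated effect on current outcome: {main_estimate:.2f}")
```

A placebo estimate far from zero signals confounding here. The reverse does not hold: a confounder that affects only the current outcome would leave the placebo estimate near zero while still biasing the main estimate.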


Alternatives to no unmeasured confounding

Without explicit randomization, we need some way of identifying causal effects.

No unmeasured confounders ≈ randomized experiment.

- Identification results very similar to experiments.

With unmeasured confounding are we doomed? Maybe not!

Other approaches rely on finding plausibly exogenous variation in assignment of $D_i$:

- Instrumental variables (randomization + exclusion restriction)
- Over-time variation (diff-in-diff, fixed effects)
- Arbitrary thresholds for treatment assignment (RDD)
- All discussed in the next couple of weeks!


Where We’ve Been and Where We’re Going...

Last Week
- intro to causal inference

This Week
- Monday:
  - experimental ideal
  - identification with measured confounding
- Wednesday:
  - regression estimation

Next Week
- identification with unmeasured confounding
- instrumental variables

Long Run
- probability → inference → regression → causal inference

Questions?


1 The Experimental Ideal

2 Assumption of No Unmeasured Confounding

3 Estimation Under No Unmeasured Confounding

4 Regression Estimators

5 Regression and Causality

6 Regression Under Heterogeneous Effects

7 Fun with Visualization, Replication and the NYT


Identification vs. Estimation

An approximately ordered causal workflow:

1) Question ← the thing we care about
2) Ideal Experiment ← what’s the counterfactual we care about
3) Estimand ← the causal quantity of interest
4) Identification Strategy ← how we connect features of a probability distribution of observed data to the causal estimand
5) Estimation ← how we estimate a feature of a probability distribution from observed data
6) Inference/Uncertainty ← what would have happened if we observed a different treatment assignment? (and possibly sampled a different population)

‘What’s your identification strategy?’ means ‘what are the assumptions that allow you to claim that the association you’ve estimated has a causal interpretation?’

Selection on observables is an identification strategy.

Identification depends on assumptions, not statistical models.
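As a worked illustration of step 4, here is the standard identification argument under selection on observables. The notation ($\tau_{ATE}$, the outer expectation over $X_i$) is assumed rather than taken from this slide, and the argument also needs overlap, $0 < P[D_i = 1 \mid X_i] < 1$, so that both conditional expectations are defined.

```latex
\begin{align*}
\tau_{ATE} &= E[Y_i(1) - Y_i(0)] \\
  &= E_X\big\{ E[Y_i(1) \mid X_i] - E[Y_i(0) \mid X_i] \big\}
     && \text{(iterated expectations)} \\
  &= E_X\big\{ E[Y_i(1) \mid D_i = 1, X_i] - E[Y_i(0) \mid D_i = 0, X_i] \big\}
     && \text{(no unmeasured confounding)} \\
  &= E_X\big\{ E[Y_i \mid D_i = 1, X_i] - E[Y_i \mid D_i = 0, X_i] \big\}
     && \text{(consistency)}
\end{align*}
```

The last line involves only observed quantities. Estimating the two conditional expectations from data is step 5, and no statistical model was needed to reach it.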


Estimation

Estimation is secondary to identification.

Selection on observables generally requires estimating at least one conditional expectation function and there are many ways to do that.

An incomplete list of strategies:
- matching
- weighting
- regression
- combinations of the above

Today we will talk about regression because that’s the subject of the class.

A big topic I’m skipping over as outside the scope of class is the propensity score (conditional expectation of the treatment given the covariates).
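To make the list of strategies concrete, here is a minimal sketch on a made-up dataset with one binary covariate. The cell counts, outcome values, and function names are all hypothetical. Stratification stands in for exact matching on a discrete covariate, and the regression coefficient is computed by residualizing D on X (the Frisch-Waugh-Lovell result) rather than with a regression library.

```python
from collections import defaultdict

# Hypothetical toy cells: (x, d, y, count). Within each stratum of X the
# treatment effect is exactly 2, but treatment is more common when X = 1.
CELLS = [(0, 0, 1.0, 60), (0, 1, 3.0, 20), (1, 0, 5.0, 20), (1, 1, 7.0, 60)]
ROWS = [(x, d, y) for x, d, y, n in CELLS for _ in range(n)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Difference in means that ignores X (confounded in this toy data)."""
    treated = mean(y for _, d, y in rows if d == 1)
    control = mean(y for _, d, y in rows if d == 0)
    return treated - control


def propensity_by_stratum(rows):
    """Sample estimate of P[D = 1 | X = x] for each observed x."""
    by_x = defaultdict(list)
    for x, d, _ in rows:
        by_x[x].append(d)
    return {x: mean(ds) for x, ds in by_x.items()}


def stratification(rows):
    """Within-stratum differences in means, averaged over the distribution of X."""
    estimate = 0.0
    for x in sorted({row[0] for row in rows}):
        stratum = [row for row in rows if row[0] == x]
        estimate += naive_difference(stratum) * len(stratum) / len(rows)
    return estimate


def ipw(rows):
    """Inverse-probability-weighted (Horvitz-Thompson style) estimate of the ATE."""
    e = propensity_by_stratum(rows)
    return mean(d * y / e[x] - (1 - d) * y / (1 - e[x]) for x, d, y in rows)


def regression(rows):
    """OLS coefficient on D from regressing Y on D and a dummy for X.

    Residualize D on X, then regress Y on that residual. With a binary X
    the residual is simply D - P[D = 1 | X].
    """
    e = propensity_by_stratum(rows)
    numerator = sum((d - e[x]) * y for x, d, y in rows)
    denominator = sum((d - e[x]) ** 2 for x, d, _ in rows)
    return numerator / denominator


for name, estimator in [("naive", naive_difference), ("stratification", stratification),
                        ("weighting", ipw), ("regression", regression)]:
    print(f"{name}: {estimator(ROWS):.3f}")
```

In this toy data the effect is constant across strata, so the three adjusted estimators agree with each other while the naive comparison does not. With heterogeneous effects they need not agree.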


Regression

David Freedman:

I sometimes have a nightmare about Kepler. Suppose a few of us were transported back in time to the year 1600, and were invited by the Emperor Rudolph II to set up an Imperial Department of Statistics in the court at Prague. Despairing of those circular orbits, Kepler enrolls in our department. We teach him the general linear model, least squares, dummy variables, everything. He goes back to work, fits the best circular orbit for Mars by least squares, puts in a dummy variable for the exceptional observation - and publishes. And that’s the end, right there in Prague at the beginning of the 17th century.


Regression and Causality

Regression is an estimation strategy that can be used with an identification strategy to estimate a causal effect.

When is regression causal? When the CEF is causal.

This means that the question of whether regression has a causal interpretation is a question about identification.


Identification under Selection on Observables: Regression

Consider the linear regression $Y_i = \beta_0 + \tau D_i + X_i'\beta + \varepsilon_i$.

Given selection on observables, there are mainly three identification scenarios:

1 Constant treatment effects and outcomes are linear in $X$
  - $\hat{\tau}$ will provide unbiased and consistent estimates of the ATE.

2 Constant treatment effects and unknown functional form
  - $\hat{\tau}$ will provide a well-defined linear approximation to the average causal response function $E[Y \mid D = 1, X] - E[Y \mid D = 0, X]$. The approximation may be very poor if $E[Y \mid D, X]$ is misspecified, and then $\hat{\tau}$ may be biased for the ATE.

3 Heterogeneous treatment effects ($\tau$ differs for different values of $X$)
  - If outcomes are linear in $X$, $\hat{\tau}$ is an unbiased and consistent estimator for a conditional-variance-weighted average of the underlying causal effects. This average can be different from the ATE.
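Scenario 3 can be checked by hand on a tiny made-up population. This is a sketch with hypothetical cell counts and outcomes: two equally sized strata of a binary X, a stratum effect of 1 where half the units are treated, and a stratum effect of 5 where only 10% are treated. The regression coefficient on D is computed by residualizing D on X, and the weights are taken to be proportional to $P[X = x]\,\mathrm{Var}[D \mid X = x]$.

```python
# Hypothetical population cells: (x, d, y, count).
CELLS = [(0, 0, 0.0, 50), (0, 1, 1.0, 50), (1, 0, 2.0, 90), (1, 1, 7.0, 10)]
ROWS = [(x, d, y) for x, d, y, n in CELLS for _ in range(n)]


def stratum_summary(rows, x):
    """Return (share of units, P[D = 1 | X = x], treatment effect) for stratum x."""
    stratum = [(d, y) for xx, d, y in rows if xx == x]
    treated = [y for d, y in stratum if d == 1]
    control = [y for d, y in stratum if d == 0]
    share = len(stratum) / len(rows)
    e = len(treated) / len(stratum)
    effect = sum(treated) / len(treated) - sum(control) / len(control)
    return share, e, effect


summaries = {x: stratum_summary(ROWS, x) for x in (0, 1)}

# ATE: weight each stratum effect by P[X = x].
ate = sum(share * effect for share, e, effect in summaries.values())

# Regression estimand: weight each stratum effect by P[X = x] * Var[D | X = x].
weights = {x: share * e * (1 - e) for x, (share, e, effect) in summaries.items()}
tau_weighted = sum(weights[x] * summaries[x][2] for x in summaries) / sum(weights.values())

# OLS coefficient on D from Y ~ D + X, via residualizing D on X.
e_of = {x: summaries[x][1] for x in summaries}
tau_ols = (sum((d - e_of[x]) * y for x, d, y in ROWS)
           / sum((d - e_of[x]) ** 2 for x, d, y in ROWS))

print(f"ATE: {ate:.3f}")
print(f"variance-weighted average: {tau_weighted:.3f}")
print(f"OLS coefficient on D: {tau_ols:.3f}")
```

Here the ATE is 3 while the regression coefficient is 70/34, roughly 2.06: the stratum with a 50/50 treatment split has the largest conditional variance of D and so dominates the regression estimate.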


Identification under Selection on Observables: Regression

Identification Assumption

1 Constant treatment effect: $\tau = Y_{1i} - Y_{0i}$ for all $i$

2 Control outcome is linear in $X$: $Y_{0i} = \beta_0 + X_i'\beta + \varepsilon_i$ with $\varepsilon_i \perp\!\!\!\perp X_i$ (no omitted variables and linearly separable confounding)

Identification Result

Then $\tau_{ATE} = E[Y_1 - Y_0]$ is identified by a regression of the observed outcome on the covariates and the treatment indicator: $Y_i = \beta_0 + \tau D_i + X_i'\beta + \varepsilon_i$


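Ideal Case: Linear Constant Effects Model

Assume constant linear effects and linearly separable confounding:

$$Y_i(d) = Y_i = \beta_0 + \tau D_i + \eta_i$$

Linearly separable confounding: assume that $E[\eta_i \mid X_i] = X_i'\beta$, which means that $\eta_i = X_i'\beta + \varepsilon_i$ where $E[\varepsilon_i \mid X_i] = 0$.

Under this model, $(Y_1, Y_0) \perp\!\!\!\perp D \mid X$ implies $\varepsilon_i \mid X \perp\!\!\!\perp D$.

As a result, taking expectations conditional on $D_i$ and $X_i$,

$$
\begin{aligned}
E[Y_i \mid D_i, X_i] &= \beta_0 + \tau D_i + E[\eta_i \mid D_i, X_i] \\
&= \beta_0 + \tau D_i + X_i'\beta + E[\varepsilon_i \mid X_i] \\
&= \beta_0 + \tau D_i + X_i'\beta
\end{aligned}
$$

Thus, a regression where $D_i$ and $X_i$ are entered linearly can recover the ATE.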
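A small simulation can illustrate the ideal case. This is only a sketch: the data-generating process (coefficients 1, 2, and 3, standard normal noise, and a threshold rule for treatment) is invented for illustration, and the OLS coefficient on D is computed by partialling X out of both D and Y rather than with a regression library.

```python
import random


def simulate(n=5000, tau=2.0, seed=0):
    """Draw (x, d, y) with a constant effect tau and confounding that is linear in x."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        d = 1 if x + rng.gauss(0, 1) > 0 else 0      # treatment depends on x
        y = 1.0 + tau * d + 3.0 * x + rng.gauss(0, 1)
        data.append((x, d, y))
    return data


def residualize(v, x):
    """Residuals from the simple OLS regression of v on an intercept and x."""
    n = len(v)
    vbar, xbar = sum(v) / n, sum(x) / n
    slope = (sum((xi - xbar) * (vi - vbar) for xi, vi in zip(x, v))
             / sum((xi - xbar) ** 2 for xi in x))
    return [vi - vbar - slope * (xi - xbar) for xi, vi in zip(x, v)]


def ols_tau(data):
    """Coefficient on d in the regression of y on an intercept, d, and x."""
    x = [row[0] for row in data]
    d = [row[1] for row in data]
    y = [row[2] for row in data]
    d_res, y_res = residualize(d, x), residualize(y, x)
    return sum(dr * yr for dr, yr in zip(d_res, y_res)) / sum(dr ** 2 for dr in d_res)


def naive_difference(data):
    """Difference in mean outcomes between treated and control, ignoring x."""
    treated = [y for _, d, y in data if d == 1]
    control = [y for _, d, y in data if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


data = simulate()
print(f"naive difference in means: {naive_difference(data):.2f}")
print(f"OLS coefficient on D, controlling for X: {ols_tau(data):.2f}")
```

With these made-up settings the regression estimate should land near the true effect of 2, while the naive comparison also absorbs the difference in X between treated and control units.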


Implausible Plausible

Constant effects and linearly separable confounding aren’t very appealing or plausible assumptions.

To understand what happens when they don’t hold, we need to understand the properties of regression with minimal assumptions: this is often called an agnostic view of regression.

The Aronow and Miller book (and lecture 7) provide some context but essentially as long as we have iid sampling, we will asymptotically obtain the best linear approximation to the CEF.
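The best-linear-approximation property can be seen exactly on a tiny made-up example. In this sketch X takes the values 0, 1, 2, 3 with equal probability, the CEF is assumed to be $E[Y \mid X] = X^2$ (so it is not linear), and each x has two equally likely outcomes placed symmetrically around the CEF. All names and numbers are hypothetical.

```python
XS = [0, 1, 2, 3]


def cef(x):
    """Assumed (nonlinear) conditional expectation function E[Y | X = x]."""
    return x ** 2


# Two equally likely outcomes per x, symmetric around the CEF.
DATA = [(x, cef(x) + noise) for x in XS for noise in (-1.0, 1.0)]


def ols_line(data):
    """Intercept and slope from the simple OLS regression of y on x."""
    n = len(data)
    xbar = sum(x for x, _ in data) / n
    ybar = sum(y for _, y in data) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in data)
             / sum((x - xbar) ** 2 for x, _ in data))
    return ybar - slope * xbar, slope


def cef_mse(intercept, slope):
    """Mean squared distance between the line and the CEF over the distribution of X."""
    return sum((cef(x) - (intercept + slope * x)) ** 2 for x in XS) / len(XS)


intercept, slope = ols_line(DATA)
print(f"OLS line: {intercept:.2f} + {slope:.2f} x")
print(f"MSE to the CEF at the OLS line: {cef_mse(intercept, slope):.3f}")
print(f"MSE after nudging the slope:    {cef_mse(intercept, slope + 0.1):.3f}")
print(f"MSE after nudging the intercept: {cef_mse(intercept + 0.1, slope):.3f}")
```

The fitted line does not equal the CEF anywhere in this example, yet any nudge to its intercept or slope moves it further from the CEF in mean squared error.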


1 The Experimental Ideal

2 Assumption of No Unmeasured Confounding

3 Estimation Under No Unmeasured Confounding

4 Regression Estimators

5 Regression and Causality

6 Regression Under Heterogeneous Effects

7 Fun with Visualization, Replication and the NYT


Regression and causality

Most econometrics textbooks: regression defined without respect to causality.

But then when is β “biased”? What does this even mean?

The question, then, is when does knowing the CEF tell us something about causality?

Angrist and Pischke argue that a regression is causal when the CEF it approximates is causal. Identification is king.

We will show that under certain conditions, a regression of the outcome on the treatment and the covariates can recover a causal parameter, but perhaps not the one in which we are interested.


Linear constant effects model, binary treatment

Now with the benefit of covering agnostic regression, let’s review again the simple case.

Experiment: with a simple experiment, we can rewrite the consistency assumption to be a regression formula:

$$
\begin{aligned}
Y_i &= D_i Y_i(1) + (1 - D_i) Y_i(0) \\
&= Y_i(0) + (Y_i(1) - Y_i(0)) D_i \\
&= E[Y_i(0)] + \tau D_i + (Y_i(0) - E[Y_i(0)]) \\
&= \mu_0 + \tau D_i + v_i^0
\end{aligned}
$$

Note that if ignorability holds (as in an experiment) for $Y_i(0)$, then it will also hold for $v_i^0$, since $E[Y_i(0)]$ is constant. Thus, this satisfies the usual assumptions for regression.
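The experimental case can be checked by brute force on a toy example. This sketch assumes six hypothetical units with a constant effect of 2 and complete randomization of exactly three units to treatment. It shows that the OLS slope on D is the difference in means for any given assignment, and that averaging the estimate over every possible assignment returns the true effect.

```python
from itertools import combinations

TAU = 2.0
Y0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # hypothetical control potential outcomes
Y1 = [y0 + TAU for y0 in Y0]             # constant treatment effect


def observed(d):
    """Consistency: Y_i = D_i * Y_i(1) + (1 - D_i) * Y_i(0)."""
    return [y1 if di == 1 else y0 for di, y0, y1 in zip(d, Y0, Y1)]


def difference_in_means(d, y):
    treated = [yi for di, yi in zip(d, y) if di == 1]
    control = [yi for di, yi in zip(d, y) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ols(d, y):
    """Intercept and slope from the simple OLS regression of y on d."""
    n = len(y)
    dbar, ybar = sum(d) / n, sum(y) / n
    slope = (sum((di - dbar) * (yi - ybar) for di, yi in zip(d, y))
             / sum((di - dbar) ** 2 for di in d))
    return ybar - slope * dbar, slope


# One particular assignment: the slope equals the difference in means.
D_ONE = [1, 0, 0, 1, 1, 0]
Y_ONE = observed(D_ONE)
intercept_one, slope_one = ols(D_ONE, Y_ONE)

# Every way of treating exactly 3 of the 6 units, each equally likely.
estimates = []
for treated_units in combinations(range(len(Y0)), 3):
    d = [1 if i in treated_units else 0 for i in range(len(Y0))]
    estimates.append(ols(d, observed(d))[1])
average_estimate = sum(estimates) / len(estimates)

print(f"slope for one assignment: {slope_one:.3f}")
print(f"difference in means for the same assignment: {difference_in_means(D_ONE, Y_ONE):.3f}")
print(f"average slope over all {len(estimates)} assignments: {average_estimate:.3f}")
```

Any single assignment can miss the true effect, but the average over the randomization distribution does not, which is the sense in which ignorability holds by design.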


Now with covariates

Now assume no unmeasured confounders: $Y_i(d) \perp\!\!\!\perp D_i \mid X_i$.

We will assume a linear model for the potential outcomes:

$$Y_i(d) = \alpha + \tau \cdot d + \eta_i$$

Remember that linearity isn’t an assumption if $D_i$ is binary.

Effect of $D_i$ is constant here, the $\eta_i$ are the only source of individual variation and we have $E[\eta_i] = 0$.

Consistency assumption allows us to write this as:

$$Y_i = \alpha + \tau D_i + \eta_i.$$


Now with covariates

Now assume no unmeasured confounders: Yi (d)⊥⊥Di |Xi .

We will assume a linear model for the potential outcomes:

Yi (d) = α + τ · d + ηi

Remember that linearity isn’t an assumption if Di is binary

Effect of Di is constant here, the ηi are the only source of individualvariation and we have E [ηi ] = 0.

Consistency assumption allows us to write this as:

Yi = α + τDi + ηi .

Stewart (Princeton) Week 10: Measured Confounding November 26 and 28, 2018 57 / 89

Page 221: Week 10: Causality with Measured Confounding...Anexperimentis a study where assignment to treatment is controlled by the researcher. I p i = P[D i = 1] be the probability of treatment

Covariates in the error

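Let’s assume that $\eta_i$ is linear in $X_i$: $\eta_i = X_i'\gamma + \nu_i$

New error is uncorrelated with $X_i$: $E[\nu_i \mid X_i] = 0$.

This is an assumption! Might be false!

Plug into the above:

$$
\begin{aligned}
E[Y_i(d) \mid X_i] = E[Y_i \mid D_i, X_i] &= \alpha + \tau D_i + E[\eta_i \mid X_i] \\
&= \alpha + \tau D_i + X_i'\gamma + E[\nu_i \mid X_i] \\
&= \alpha + \tau D_i + X_i'\gamma
\end{aligned}
$$

The first equality uses no unmeasured confounding, which also lets us replace $E[\eta_i \mid D_i, X_i]$ with $E[\eta_i \mid X_i]$.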


Summing up regression with constant effects

Reviewing the assumptions we’ve used:

- no unmeasured confounders
- constant treatment effects
- linearity of the treatment/covariates

Under these, we can run the following regression to estimate the ATE, $\tau$:

$$Y_i = \alpha + \tau D_i + X_i'\gamma + \nu_i$$

Works with continuous or ordinal $D_i$ if effect of these variables is truly linear.
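To see the three assumptions at work, here is a minimal simulation sketch in which treatment uptake depends on a single covariate. The logistic treatment model, the parameter values and the helper name `ols` are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
alpha, tau, gamma = 0.5, 1.0, 2.0      # assumed true parameters

x = rng.normal(size=n)                 # single measured confounder X_i
p = 1 / (1 + np.exp(-x))               # P[D_i = 1 | X_i] rises with X_i
d = rng.binomial(1, p)                 # treatment is ignorable given X_i
nu = rng.normal(size=n)                # error with E[nu | X] = 0
y = alpha + tau * d + gamma * x + nu   # constant effect, linear in X


def ols(columns, outcome):
    """Return OLS coefficients (intercept first) for the given regressors."""
    design = np.column_stack([np.ones(len(outcome)), *columns])
    return np.linalg.lstsq(design, outcome, rcond=None)[0]


tau_naive = ols([d], y)[1]             # omits X_i, so it absorbs confounding
tau_adj = ols([d, x], y)[1]            # the regression from the slide
print(tau_naive, tau_adj)
```

In this setup the regression that omits $X_i$ overstates the effect, while the adjusted regression should sit near the assumed $\tau$.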


1 The Experimental Ideal

2 Assumption of No Unmeasured Confounding

3 Estimation Under No Unmeasured Confounding

4 Regression Estimators

5 Regression and Causality

6 Regression Under Heterogeneous Effects

7 Fun with Visualization, Replication and the NYT


Heterogeneous effects, binary treatment

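Completely randomized experiment:

$$
\begin{aligned}
Y_i &= D_i Y_i(1) + (1 - D_i) Y_i(0) \\
    &= Y_i(0) + (Y_i(1) - Y_i(0)) D_i \\
    &= \mu_0 + \tau_i D_i + (Y_i(0) - \mu_0) \\
    &= \mu_0 + \tau D_i + (Y_i(0) - \mu_0) + (\tau_i - \tau) \cdot D_i \\
    &= \mu_0 + \tau D_i + \varepsilon_i
\end{aligned}
$$

Here $\mu_0 = E[Y_i(0)]$, $\tau_i = Y_i(1) - Y_i(0)$ is the unit-level effect, and $\tau = E[\tau_i]$ is the ATE.

Error term now includes two components:

1 “Baseline” variation in the outcome: $(Y_i(0) - \mu_0)$

2 Variation in the treatment effect, $(\tau_i - \tau)$

We can verify that under experiment, $E[\varepsilon_i \mid D_i] = 0$

Thus, OLS estimates the ATE with no covariates.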
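The verification of the zero conditional mean can be sketched in a few lines. It assumes that randomization makes both potential outcomes independent of $D_i$, and it uses $\mu_0 = E[Y_i(0)]$ and $\tau = E[Y_i(1) - Y_i(0)]$.

```latex
\begin{aligned}
E[\varepsilon_i \mid D_i = d]
  &= E[Y_i(0) - \mu_0 \mid D_i = d] + d \cdot E[\tau_i - \tau \mid D_i = d] \\
  &= \left(E[Y_i(0)] - \mu_0\right) + d \cdot \left(E[\tau_i] - \tau\right)
     && \text{(randomization)} \\
  &= 0 + d \cdot 0 = 0 .
\end{aligned}
```

The same steps go through for $d = 0$ and $d = 1$, so the error is mean-independent of treatment even though its variance may differ across the two arms.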


Adding covariates

What happens with no unmeasured confounders? Need to condition on $X_i$ now.

Remember identification of the ATE/ATT using iterated expectations.

ATE is the weighted sum of Conditional Average Treatment Effects (CATEs):

$$\tau = \sum_x \tau(x) \Pr[X_i = x]$$

ATE/ATT are weighted averages of CATEs.

What about the regression estimand, $\tau_R$? How does it relate to the ATE/ATT?
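A small worked example may help fix the weighting idea. The three strata, their CATEs, shares and treatment probabilities below are made-up numbers; the ATT line uses the standard weights $\Pr[X_i = x \mid D_i = 1]$.

```python
import numpy as np

# Hypothetical three-stratum population.
cate = np.array([1.0, 3.0, 5.0])       # tau(x) in each stratum
p_x = np.array([0.5, 0.3, 0.2])        # Pr[X_i = x]
e_x = np.array([0.1, 0.5, 0.9])        # Pr[D_i = 1 | X_i = x]

# ATE: weight each CATE by the stratum's share of the population.
ate = float(np.sum(cate * p_x))

# ATT: weight each CATE by the stratum's share among the treated.
p_x_given_treated = e_x * p_x / np.sum(e_x * p_x)
att = float(np.sum(cate * p_x_given_treated))

print(ate, att)
```

Because treated units are concentrated in the high-effect strata in this example, the ATT comes out larger than the ATE.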


Heterogeneous effects and regression

Let’s investigate this under a saturated regression model:

$$Y_i = \sum_x B_{xi} \alpha_x + \tau_R D_i + e_i.$$

Use a dummy variable for each unique combination of $X_i$: $B_{xi} = \mathbb{I}(X_i = x)$

Linear in $X_i$ by construction!
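One way to build the stratum dummies in practice is sketched below. The six-unit covariate matrix is hypothetical, and the variable names `strata`, `stratum` and `B` are illustrative.

```python
import numpy as np

# Hypothetical covariates: two discrete variables observed for six units.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0, 1], [1, 1]])

# Each distinct row of X is one stratum x; `stratum` maps units to strata.
strata, stratum = np.unique(X, axis=0, return_inverse=True)
stratum = np.asarray(stratum).reshape(-1)

# B[i, k] = 1 if unit i falls in stratum k, i.e. B_xi = I(X_i = x).
B = np.eye(len(strata), dtype=int)[stratum]

print(strata)
print(B)
```

Every row of `B` contains exactly one 1, so the dummies span the intercept and the regression needs no separate constant term.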


Investigating the regression coefficient

How can we investigate $\tau_R$? Well, we can rely on the regression anatomy:

$$\tau_R = \frac{\mathrm{Cov}(Y_i, D_i - E[D_i \mid X_i])}{\mathrm{Var}(D_i - E[D_i \mid X_i])}$$

$D_i - E[D_i \mid X_i]$ is the residual from a regression of $D_i$ on the full set of dummies.

With a little work we can show:

$$\tau_R = \frac{E[\tau(X_i)(D_i - E[D_i \mid X_i])^2]}{E[(D_i - E[D_i \mid X_i])^2]} = \frac{E[\tau(X_i)\sigma_d^2(X_i)]}{E[\sigma_d^2(X_i)]}$$

$\sigma_d^2(x) = \mathrm{Var}[D_i \mid X_i = x]$ is the conditional variance of treatment assignment.
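The "little work" can be sketched as follows for binary $D_i$, writing $\tilde{D}_i = D_i - E[D_i \mid X_i]$ as shorthand (a symbol introduced here, not on the slides) and using no unmeasured confounding to write $E[Y_i \mid D_i, X_i] = E[Y_i(0) \mid X_i] + \tau(X_i) D_i$.

```latex
\begin{aligned}
\mathrm{Cov}(Y_i, \tilde{D}_i)
  &= E[Y_i \tilde{D}_i]
     && \text{since } E[\tilde{D}_i] = 0 \\
  &= E\big[ E[Y_i \mid D_i, X_i] \, \tilde{D}_i \big]
     && \text{iterated expectations} \\
  &= E\big[ E[Y_i(0) \mid X_i] \, \tilde{D}_i \big]
     + E\big[ \tau(X_i) D_i \tilde{D}_i \big] \\
  &= 0 + E\big[ \tau(X_i) \big( \tilde{D}_i + E[D_i \mid X_i] \big) \tilde{D}_i \big]
     && \text{since } E[\tilde{D}_i \mid X_i] = 0 \\
  &= E\big[ \tau(X_i) \tilde{D}_i^2 \big]
   = E\big[ \tau(X_i) \, \sigma_d^2(X_i) \big], \\
\mathrm{Var}(\tilde{D}_i)
  &= E[\tilde{D}_i^2] = E\big[ \sigma_d^2(X_i) \big].
\end{aligned}
```

Taking the ratio of the two lines gives the variance-weighted expression for $\tau_R$.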


ATE versus OLS

$$\tau_R = E[\tau(X_i) W_i] = \sum_x \tau(x) \frac{\sigma_d^2(x)}{E[\sigma_d^2(X_i)]} P[X_i = x]$$

Compare to the ATE:

$$\tau = E[\tau(X_i)] = \sum_x \tau(x) P[X_i = x]$$

Both weight strata relative to their size ($P[X_i = x]$).

OLS weights strata higher if the treatment variance in those strata ($\sigma_d^2(x)$) is higher relative to the average variance across strata ($E[\sigma_d^2(X_i)]$).

The ATE weights only by their size.
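The gap between the two estimands can be checked numerically. The sketch below assumes three strata with made-up sizes, treatment probabilities and CATEs, fits the saturated regression by least squares, and compares the coefficient on $D_i$ with the variance-weighted formula and with the ATE.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300_000

p_x = np.array([0.5, 0.3, 0.2])        # P[X_i = x] (assumed)
e_x = np.array([0.1, 0.5, 0.9])        # P[D_i = 1 | X_i = x] (assumed)
tau_x = np.array([1.0, 3.0, 5.0])      # CATE tau(x) (assumed)

x = rng.choice(3, size=n, p=p_x)       # stratum label for each unit
d = rng.binomial(1, e_x[x])            # treatment is ignorable given X_i
y0 = x + rng.normal(size=n)            # baseline outcome varies by stratum
y = y0 + tau_x[x] * d                  # effect varies across strata

var_x = e_x * (1 - e_x)                # sigma_d^2(x) for binary D_i
tau_ate = np.sum(tau_x * p_x)
tau_r_formula = np.sum(tau_x * var_x * p_x) / np.sum(var_x * p_x)

# Saturated regression: one dummy per stratum plus the treatment indicator.
B = np.eye(3)[x]
design = np.column_stack([B, d])
tau_r_ols = np.linalg.lstsq(design, y, rcond=None)[0][-1]

print(tau_ate, tau_r_formula, tau_r_ols)
```

In this example the middle stratum has the most balanced treatment assignment, so OLS pulls the estimate toward that stratum's CATE and away from the ATE.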


Regression weighting

$$W_i = \frac{\sigma_d^2(X_i)}{E[\sigma_d^2(X_i)]}$$

Why does OLS weight like this?

OLS is a minimum-variance estimator, so it gives more weight to more precise within-strata estimates.

Within-strata estimates are most precise when the treatment is evenly spread and thus has the highest variance.

If $D_i$ is binary, then we know the conditional variance will be:

$$\sigma_d^2(x) = P[D_i = 1 \mid X_i = x] \, (1 - P[D_i = 1 \mid X_i = x])$$

Maximum variance with $P[D_i = 1 \mid X_i = x] = 1/2$.
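A quick numerical look at the binary-treatment variance, evaluated on an illustrative grid of treatment probabilities:

```python
import numpy as np

# Grid of within-stratum treatment probabilities P[D_i = 1 | X_i = x].
propensity = np.linspace(0.0, 1.0, 11)
variance = propensity * (1 - propensity)   # sigma_d^2(x) for binary D_i

best = propensity[np.argmax(variance)]     # probability with the largest variance

# Per-unit weight of a stratum relative to a perfectly balanced one.
relative_weight = variance / variance.max()

for p, w in zip(propensity, relative_weight):
    print(f"p = {p:.1f}  relative weight = {w:.2f}")
```

Strata where almost everyone or almost no one is treated receive weights near zero, which is why they contribute little to the regression coefficient.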

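OLS weighting example

Binary covariate:

Quantity                 Group 1 ($X_i = 1$)                Group 2 ($X_i = 0$)
Stratum share            $P[X_i = 1] = 0.75$                $P[X_i = 0] = 0.25$
Treatment probability    $P[D_i = 1 \mid X_i = 1] = 0.9$    $P[D_i = 1 \mid X_i = 0] = 0.5$
Treatment variance       $\sigma^2_d(1) = 0.09$             $\sigma^2_d(0) = 0.25$
Treatment effect         $\tau(1) = 1$                      $\tau(0) = -1$

This implies the ATE is $\tau = 0.5$.

Average conditional variance: $E[\sigma^2_d(X_i)] = 0.13$

The weights are $0.09/0.13 = 0.692$ for $X_i = 1$ and $0.25/0.13 = 1.92$ for $X_i = 0$.

$$\begin{aligned}
\tau_R &= E[\tau(X_i) W_i] \\
       &= \tau(1) W(1) P[X_i = 1] + \tau(0) W(0) P[X_i = 0] \\
       &= 1 \times 0.692 \times 0.75 + (-1) \times 1.92 \times 0.25 \\
       &= 0.039
\end{aligned}$$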
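
The arithmetic in this example can be checked with a short script. This is a minimal sketch in Python (the slides themselves use R); the dictionary layout and variable names are illustrative assumptions, not part of the lecture. Carrying the unrounded weights through gives roughly 0.038, while the 0.039 above comes from rounding the weights to three digits first.

```python
# Strata from the example: stratum share P[X = x], propensity P[D = 1 | X = x],
# and conditional effect tau(x). Names and layout are illustrative.
strata = {
    1: {"share": 0.75, "propensity": 0.9, "effect": 1.0},
    0: {"share": 0.25, "propensity": 0.5, "effect": -1.0},
}

# Conditional variance of a binary treatment: p * (1 - p).
for s in strata.values():
    s["variance"] = s["propensity"] * (1 - s["propensity"])

# ATE: effects weighted by stratum share only.
ate = sum(s["share"] * s["effect"] for s in strata.values())

# Average conditional variance E[sigma^2_d(X_i)].
avg_var = sum(s["share"] * s["variance"] for s in strata.values())

# OLS estimand: effects weighted by share times W(x) = variance / avg_var.
weights = {x: s["variance"] / avg_var for x, s in strata.items()}
tau_r = sum(s["share"] * weights[x] * s["effect"] for x, s in strata.items())

print(f"ATE    = {ate:.3f}")
print(f"E[var] = {avg_var:.3f}")
print(f"W(1)   = {weights[1]:.3f}, W(0) = {weights[0]:.3f}")
print(f"tau_R  = {tau_r:.3f}")
```

The small group with evenly split treatment dominates the OLS estimand even though it is only a quarter of the population, which is why the result sits far below the ATE of 0.5.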

When will OLS estimate the ATE?

When does $\tau = \tau_R$?

Constant treatment effects: $\tau(x) = \tau = \tau_R$

Constant probability of treatment: $e(x) = P[D_i = 1 \mid X_i = x] = e$.

  - Implies that the OLS weights are 1.

An incorrect linearity assumption in $X_i$ will lead to more bias.
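
Both conditions can be checked numerically. The following is a minimal Python sketch under made-up strata; the function names `ate` and `ols_estimand` and all numeric values are hypothetical, chosen only to illustrate the two special cases and one case where neither holds.

```python
def ate(strata):
    """Share-weighted average of stratum effects."""
    return sum(share * effect for share, _, effect in strata)


def ols_estimand(strata):
    """Variance-weighted average of stratum effects, tau_R = E[tau(X) W]."""
    avg_var = sum(share * p * (1 - p) for share, p, _ in strata)
    return sum(
        share * (p * (1 - p) / avg_var) * effect for share, p, effect in strata
    )


# Each stratum is (share, propensity, effect); all values are illustrative.
constant_propensity = [(0.75, 0.3, 1.0), (0.25, 0.3, -1.0)]
constant_effect = [(0.75, 0.9, 2.0), (0.25, 0.5, 2.0)]
neither = [(0.75, 0.9, 1.0), (0.25, 0.5, -1.0)]

for name, strata in [
    ("constant propensity", constant_propensity),
    ("constant effect", constant_effect),
    ("neither", neither),
]:
    print(f"{name}: ATE = {ate(strata):.3f}, tau_R = {ols_estimand(strata):.3f}")
```

In the first two cases the two estimands agree; in the third, which reuses the numbers from the weighting example, they do not.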

Other ways to use regression

What’s the path forward?

  - Accept the bias (it might be relatively small with saturated models)
  - Use a different regression approach

Let $\mu_d(x) = E[Y_i(d) \mid X_i = x]$ be the CEF for the potential outcome under $D_i = d$.

By consistency and no unmeasured confounding (n.u.c.), we have $\mu_d(x) = E[Y_i \mid D_i = d, X_i = x]$.

Estimate a regression of $Y_i$ on $X_i$ among the $D_i = d$ group.

Then $\hat{\mu}_d(x)$ is just a predicted value from that regression at $X_i = x$.

How can we use this?
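
Before answering, the identification step stated above can be unpacked in two lines. This is a sketch of the standard argument; it also implicitly needs positivity, $0 < P[D_i = d \mid X_i = x] < 1$, so that the conditioning event is well defined.

```latex
\begin{aligned}
\mu_d(x) = E[Y_i(d) \mid X_i = x]
  &= E[Y_i(d) \mid D_i = d,\, X_i = x] && \text{(no unmeasured confounding)} \\
  &= E[Y_i \mid D_i = d,\, X_i = x]    && \text{(consistency: } Y_i = Y_i(d) \text{ when } D_i = d\text{)}
\end{aligned}
```

The last line involves only observed quantities, which is what allows a regression fit within the $D_i = d$ group to estimate it.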

Imputation estimators

Impute the treated potential outcomes with $\hat{Y}_i(1) = \hat{\mu}_1(X_i)$!

Impute the control potential outcomes with $\hat{Y}_i(0) = \hat{\mu}_0(X_i)$!

Procedure:

  - Regress $Y_i$ on $X_i$ in the treated group and get predicted values for all units (treated or control).
  - Regress $Y_i$ on $X_i$ in the control group and get predicted values for all units (treated or control).
  - Take the average difference between these predicted values.

More formally, the estimator looks like this:

$$\hat{\tau}_{imp} = \frac{1}{N} \sum_i \left[ \hat{\mu}_1(X_i) - \hat{\mu}_0(X_i) \right]$$

Sometimes called an imputation estimator.

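Simple imputation estimator

Use predict() from the within-group models on the data from the entire sample.

Useful trick: use a model on the entire data and model.frame() to get the right design matrix:

## heterogeneous effects
y.het <- ifelse(d == 1, y + rnorm(n, 0, 5), y)
## full-sample model, used only to recover the data for every unit
mod <- lm(y.het ~ d + X)
## separate outcome models for the treated and control groups
mod1 <- lm(y.het ~ X, subset = d == 1)
mod0 <- lm(y.het ~ X, subset = d == 0)
## imputed outcomes for all units under treatment and under control
y1.imps <- predict(mod1, model.frame(mod))
y0.imps <- predict(mod0, model.frame(mod))
## average difference between the imputed outcomes
mean(y1.imps - y0.imps)
## [1] 0.61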
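
The R snippet above relies on objects (y, d, X, n) created on earlier slides. For a version that runs on its own, here is a minimal Python sketch of the same three-step procedure. Everything in it is an assumption made for illustration: a single covariate, noiseless outcomes that are exactly linear within each arm, and the helper names `fit_line` and `imputation_ate`. With these made-up data the true sample ATE is 10.

```python
from statistics import mean


def fit_line(xs, ys):
    """Closed-form simple OLS of ys on xs; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def imputation_ate(x, d, y):
    """Fit one line per treatment arm, predict for all units, average the gap."""
    treated = [(xi, yi) for xi, di, yi in zip(x, d, y) if di == 1]
    control = [(xi, yi) for xi, di, yi in zip(x, d, y) if di == 0]
    a1, b1 = fit_line(*zip(*treated))
    a0, b0 = fit_line(*zip(*control))
    return mean((a1 + b1 * xi) - (a0 + b0 * xi) for xi in x)


# Hypothetical data: treatment is more common at high x, and the effect
# grows with x, since Y(1) = 2 + 3x and Y(0) = 1 + x.
x = list(range(10))
d = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
y = [(2 + 3 * xi) if di == 1 else (1 + xi) for xi, di in zip(x, d)]

tau_imp = imputation_ate(x, d, y)
naive = mean(yi for yi, di in zip(y, d) if di == 1) - mean(
    yi for yi, di in zip(y, d) if di == 0
)

print(f"imputation estimate: {tau_imp:.2f}")
print(f"naive difference in means: {naive:.2f}")
```

Because the within-arm models are correctly specified here, the imputation estimate recovers the sample ATE, while the naive difference in means is inflated by the confounding through x.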

Notes on imputation estimators

If µd(x) are consistent estimators, then τimp is consistent for the ATE.

Why don’t people use this?

I Most people don’t know the results we’ve been talking about.I Harder to implement than vanilla OLS.

Can use linear regression to estimate µd(x) = x ′βdRecent trend is to estimate µd(x) via non-parametric methods suchas:

I Kernel regression, local linear regression, regression trees, etcI Easiest is generalized additive models (GAMs)

Stewart (Princeton) Week 10: Measured Confounding November 26 and 28, 2018 72 / 89

Page 313: Week 10: Causality with Measured Confounding...Anexperimentis a study where assignment to treatment is controlled by the researcher. I p i = P[D i = 1] be the probability of treatment

Notes on imputation estimators

If µd(x) are consistent estimators, then τimp is consistent for the ATE.

Why don’t people use this?

I Most people don’t know the results we’ve been talking about.I Harder to implement than vanilla OLS.

Can use linear regression to estimate µd(x) = x ′βdRecent trend is to estimate µd(x) via non-parametric methods suchas:

I Kernel regression, local linear regression, regression trees, etc

I Easiest is generalized additive models (GAMs)

Stewart (Princeton) Week 10: Measured Confounding November 26 and 28, 2018 72 / 89

Page 314: Week 10: Causality with Measured Confounding...Anexperimentis a study where assignment to treatment is controlled by the researcher. I p i = P[D i = 1] be the probability of treatment

Notes on imputation estimators

If µd(x) are consistent estimators, then τimp is consistent for the ATE.

Why don’t people use this?

I Most people don’t know the results we’ve been talking about.I Harder to implement than vanilla OLS.

Can use linear regression to estimate µd(x) = x ′βdRecent trend is to estimate µd(x) via non-parametric methods suchas:

I Kernel regression, local linear regression, regression trees, etcI Easiest is generalized additive models (GAMs)

Stewart (Princeton) Week 10: Measured Confounding November 26 and 28, 2018 72 / 89

Page 315: Week 10: Causality with Measured Confounding...Anexperimentis a study where assignment to treatment is controlled by the researcher. I p i = P[D i = 1] be the probability of treatment

Imputation estimator visualization

[Figure: plot of y against x, with treated and control units distinguished.]


Nonlinear relationships

Same idea, but with a nonlinear relationship between Yi and Xi:

[Figure: plot of y against x, with treated and control units distinguished.]


Using semiparametric regression

Here, CEFs are nonlinear, but we don’t know their form.

We can use GAMs from the mgcv package for flexible estimation:

library(mgcv)
# Fit a smooth of y on x using control units only (d == 0)
mod0 <- gam(y ~ s(x), subset = d == 0)
summary(mod0)

##
## Family: gaussian
## Link function: identity
##
## Formula:
## y ~ s(x)
##
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -0.0225     0.0154   -1.46     0.16
##
## Approximate significance of smooth terms:
##       edf Ref.df    F p-value
## s(x) 6.03   7.08 41.3  <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.909 Deviance explained = 92.8%
## GCV = 0.0093204 Scale est. = 0.0071351 n = 30
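
To show how the control-group fit feeds into the imputation estimator, here is a hedged sketch that fits one GAM per treatment arm, predicts both potential outcomes for every unit, and averages the difference. The data are simulated, and the data-generating process, the names dat, mod1, mu0_hat, mu1_hat, and tau_imp, and the true ATE of 0.5 are assumptions for illustration rather than the data behind the output above.

```r
library(mgcv)

# Hypothetical simulated data with nonlinear CEFs; the true ATE is 0.5.
set.seed(42)
n <- 1000
x <- runif(n, -2, 2)
d <- rbinom(n, size = 1, prob = plogis(0.5 * x))
y <- sin(2 * x) + 0.5 * d + rnorm(n, sd = 0.2)
dat <- data.frame(y = y, d = d, x = x)

# Fit a separate smooth in each arm, mirroring the mod0 fit above
mod0 <- gam(y ~ s(x), data = dat, subset = d == 0)
mod1 <- gam(y ~ s(x), data = dat, subset = d == 1)

# Impute mu_0(x) and mu_1(x) for every unit and average the difference
mu0_hat <- as.numeric(predict(mod0, newdata = dat))
mu1_hat <- as.numeric(predict(mod1, newdata = dat))
tau_imp <- mean(mu1_hat - mu0_hat)
tau_imp
```

Fitting the two arms separately lets the shape of the smooth differ between treated and control units, which is what allows the estimator to accommodate heterogeneous effects; a single model with d entered additively would force the two curves to be parallel.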


Using GAMs

[Figure: plot of y against x, with treated and control units distinguished.]


‘Wait...so what are we actually doing most of the time?’

A Discussion


Conclusions

Regression is mechanically very simple, but philosophically somewhat complicated.

It is a useful descriptive tool for approximating a conditional expectation function.

Once again though, the estimand of interest isn’t necessarily the regression coefficient.

There are many other approaches to estimation, but identification is key.


Next Week

Causality with Unmeasured Confounding

Reading:
- Angrist and Pischke, Chapter 4 on Instrumental Variables and Chapter 6 on Regression Discontinuity Designs
- Morgan and Winship, Chapter 9, Instrumental Variable Estimators of Causal Effects
- Optional: Hernan and Robins, Chapter 16, Instrumental Variable Estimation


1 The Experimental Ideal

2 Assumption of No Unmeasured Confounding

3 Estimation Under No Unmeasured Confounding

4 Regression Estimators

5 Regression and Causality

6 Regression Under Heterogeneous Effects

7 Fun with Visualization, Replication and the NYT


Visualization in the New York Times


Alternate Graphs


Thoughts

Two stories here:

1 Visualization and data coding choices are important

2 The internet is amazing (especially with replication data being available!)
