The Strength of Instrumental Variables and Their Sensitivity to ...


Post on 11-Aug-2020


Page 1: The Strength of Instrumental Variables and Their Sensitivity to ...

Dylan Small
Department of Statistics, Wharton School, University of Pennsylvania

Based on joint work with Paul Rosenbaum

Instrumental variables and their sensitivity to unobserved biases

Page 2

Overview

• Instrumental variable (IV) method: a method of controlling for unmeasured confounding.
• Example: effect of World War II military service on future earnings.
• Sensitivity to unobserved biases for the IV method.
• How the strength of IVs affects sensitivity to unobserved biases.
• Implications for designing studies with IVs.

Page 3

Example: WWII Veteran Status and Earnings

• Does military service raise or lower earnings?
• Angrist and Krueger (1994) studied this in the context of WWII military service and 1980 earnings (using the 5% public use sample of the US Census).
• Lower earnings? Military service in WWII interrupts education or career.
• Higher earnings? The labor market might favor veterans, and the GI Bill increases education.

Page 4

WWII vets (76% of men) earned on average $4,500 more in 1980 than non-vets.

This is association, not causation: WWII vets might not be comparable to non-vets in terms of health, criminal behavior, etc.

Page 5

Addressing Confounding

• Confounding variable: a variable that (i) is not comparable between the treatment and control groups and (ii) affects the outcome, e.g., health, criminal behavior.
• If all confounders are measured, they can be adjusted for by regression, propensity scores, matching methods, etc.
• But health and criminal behavior are not measured in the Census.

Page 6

Unmeasured Confounding

[Diagram nodes: Veteran Status; Earnings; Unobserved Confounders (health, criminal behavior, etc.)]

Graph is conditional on measured confounders (race, education up to 8 years, location of birth).

Page 7

Instrumental Variables Strategy

Extract variation in W from Z that is free of unobserved confounders, and use this variation to estimate the causal effect of W on Y.

Key IV assumptions:
(1) Z is independent of unobserved variables;
(2) Z does not have a direct effect on the outcome.

[Diagram nodes: Z: Year of Birth; W: Veteran Status; Y: Earnings; Unobserved Confounders (health, etc.)]

Y = outcome, W = treatment, Z = IV.

Graph is conditional on measured confounders (race, education up to 8 years, location of birth).
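The strategy can be illustrated with a small simulation. This is only a sketch: the data-generating model, its coefficients, and the function names (`simulate`, `naive_difference`, `wald_estimate`) are assumptions made for illustration, not taken from the talk or from the Census data. It compares the naive treated-minus-untreated difference with the simplest IV estimate for a binary instrument, the Wald ratio (effect of Z on Y divided by effect of Z on W), under a constant treatment effect.

```python
import random


def simulate(n=20000, true_effect=-1.0, seed=1):
    """Draw (z, w, y) triples from a toy model with an unobserved confounder u."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0   # instrument: randomly assigned
        u = rng.gauss(0.0, 1.0)              # unobserved confounder
        # Treatment depends on the instrument and on the confounder.
        w = 1 if 1.5 * z + u + rng.gauss(0.0, 1.0) > 0.75 else 0
        # Outcome depends on treatment and confounder, never directly on z.
        y = true_effect * w + 2.0 * u + rng.gauss(0.0, 1.0)
        rows.append((z, w, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Mean outcome of treated minus untreated units; confounded by u."""
    treated = mean(y for _, w, y in rows if w == 1)
    untreated = mean(y for _, w, y in rows if w == 0)
    return treated - untreated


def wald_estimate(rows):
    """(Effect of z on y) divided by (effect of z on w)."""
    dy = mean(y for z, _, y in rows if z == 1) - mean(y for z, _, y in rows if z == 0)
    dw = mean(w for z, w, _ in rows if z == 1) - mean(w for z, w, _ in rows if z == 0)
    return dy / dw


rows = simulate()
print("naive difference:", round(naive_difference(rows), 2))
print("Wald IV estimate:", round(wald_estimate(rows), 2))
```

In this toy setup the naive comparison should come out with the wrong sign, because u raises both treatment and outcome, while the Wald ratio should land near the assumed effect of -1.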

Page 9

IV Applications in Health Research

Outcome (Y) | Treatment (W) | IV (Z) | Reference
Mortality | More intensive vs. less intensive treatment for heart attack patients | Distance lived from cardiac care center | McClellan, McNeil and Newhouse (1994)
Mortality | Conventional vs. atypical antipsychotics | Prescribing physician’s preference | Wang, Schneeweiss et al. (2005)
Mortality | Premature baby delivered at high level NICU vs. local hospital | Mother’s differential distance between high level NICU and local hospital | Lorch, Baiocchi, Ahlberg and Small (2012)
Birth weight | Maternal smoking | State cigarette taxes | Evans and Ringel (1999)
Birth weight | Maternal smoking | Random assignment of free smoker’s counseling | Permutt and Hebel (1989)
Heart attack | HDL cholesterol | Genes that affect HDL | Voight et al. (2012)

Page 10

Prototype IV Design: Matched Pair Encouragement Design

Consider a matched pair design with I matched pairs (say, matched for measured confounders), in which one unit j in each pair i is encouraged to receive treatment (\(Z_{ij} = 1\)) and the other unit j' is not encouraged to receive treatment (\(Z_{ij'} = 0\)). In this context, the encouragement variable Z is said to be a valid instrumental variable (IV) if Z is effectively randomly assigned:

\[ P(Z_{i1} = 1, Z_{i2} = 0) = \frac{1}{2}, \qquad P(Z_{i1} = 0, Z_{i2} = 1) = \frac{1}{2} \]

(i.e., Z is not related to any unmeasured confounders).

Inference can be based on two-stage least squares or permutation inference.

95% CI for the effect of military service on earnings using 1926 vs. 1928 as the IV: (-$1,445, -$500).
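Permutation inference in this design can be sketched as follows. Under H0: β = β0, with a constant effect β of treatment W on outcome Y and no direct effect of Z, the adjusted response Y - β0·W does not depend on encouragement, so within each pair the encouraged-minus-control difference is equally likely to carry either sign. The function name, the sum statistic, the Monte Carlo approximation, and the toy data are all assumptions for illustration; the slide does not say which test statistic was used.

```python
import random


def sign_flip_pvalue(pairs, beta0, n_draws=2000, seed=0):
    """One-sided Monte Carlo p-value for H0: beta = beta0 vs. beta > beta0.

    pairs: list of (y_enc, w_enc, y_ctl, w_ctl), one tuple per matched pair,
    where "enc" is the encouraged unit and "ctl" the unencouraged unit.
    """
    # Encouraged-minus-control differences in adjusted responses y - beta0 * w.
    diffs = [(ye - beta0 * we) - (yc - beta0 * wc) for ye, we, yc, wc in pairs]
    observed = sum(diffs)
    rng = random.Random(seed)
    at_least = 0
    for _ in range(n_draws):
        # Random assignment within a pair flips each sign with probability 1/2.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if flipped >= observed:
            at_least += 1
    return (at_least + 1) / (n_draws + 1)


# Hypothetical toy data: true effect 2; 60% of encouraged units take treatment,
# and no unencouraged unit does.
data_rng = random.Random(42)
pairs = []
for _ in range(200):
    w_enc = 1 if data_rng.random() < 0.6 else 0
    y_enc = 2.0 * w_enc + data_rng.gauss(0.0, 1.0)
    y_ctl = data_rng.gauss(0.0, 1.0)
    pairs.append((y_enc, w_enc, y_ctl, 0))

print("p-value for H0: beta = 0:", sign_flip_pvalue(pairs, beta0=0.0))
print("p-value for H0: beta = 2:", sign_flip_pvalue(pairs, beta0=2.0))
```

A confidence interval can be built the same way, by collecting the values of β0 that such a test does not reject.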

Page 11

A Picture of the IV Argument

-- We created matched triples: men matched on quarter of birth, race, age, education up to 8 years and location of birth.
-- This figure provides reason to doubt that military service increases earnings by $4,500.
-- From 1924 to 1926, the proportion of veterans stayed about constant and earnings stayed about the same. From 1926 to 1928, the proportion of veterans decreased by 50% but earnings increased, suggesting that military service decreases earnings.

Page 12

Sensitivity Analysis

The IV method assumes that the IV (encouragement) is effectively randomly assigned:

\[ P(Z_{i1} = 1, Z_{i2} = 0) = \frac{1}{2}, \qquad P(Z_{i1} = 0, Z_{i2} = 1) = \frac{1}{2} \]

There is often concern about whether this is true. In the WWII study, there are gradual long-term trends in apprenticeship, education, employment and nutrition that might bias comparisons of workers born two years apart.

A sensitivity analysis asks how departures from random assignment of the IV of various magnitudes might alter a study’s conclusion.

Page 13

Model for Sensitivity Analysis

For subject ij, let \(\pi_{ij}\) denote the probability that ij is encouraged, \(\pi_{ij} = P(Z_{ij} = 1)\). Suppose that two subjects ij and ik may differ in their odds of being encouraged by at most a factor of \(\Gamma \ge 1\) because they differ in terms of an unobserved covariate, \(u_{ij} \ne u_{ik}\):

\[ \frac{1}{\Gamma} \le \frac{\pi_{ij}(1 - \pi_{ik})}{\pi_{ik}(1 - \pi_{ij})} \le \Gamma \quad \forall\, i, j, k. \]

If \(\Gamma = 1\), the IV is randomly assigned. If \(\Gamma > 1\), then the distribution of treatment assignments is unknown, but the magnitude of the departure from random assignment is controlled by \(\Gamma\).
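For a matched pair in which exactly one of the two units is encouraged, the odds bound translates into a bound on which unit it is. The derivation below is a sketch that additionally assumes the two units' encouragements are independent before conditioning on exactly one being encouraged; \(\rho_i\) is shorthand introduced here.

```latex
\begin{align*}
\rho_i &= \frac{\pi_{i1}(1-\pi_{i2})}{\pi_{i2}(1-\pi_{i1})},
  \qquad \frac{1}{\Gamma} \le \rho_i \le \Gamma, \\
P(Z_{i1}=1, Z_{i2}=0 \mid Z_{i1}+Z_{i2}=1)
  &= \frac{\pi_{i1}(1-\pi_{i2})}{\pi_{i1}(1-\pi_{i2}) + \pi_{i2}(1-\pi_{i1})}
   = \frac{\rho_i}{1+\rho_i}, \\
\frac{1}{1+\Gamma}
  &\le P(Z_{i1}=1, Z_{i2}=0 \mid Z_{i1}+Z_{i2}=1)
  \le \frac{\Gamma}{1+\Gamma}.
\end{align*}
```

With \(\Gamma = 1\) both bounds equal 1/2, which recovers random assignment; with \(\Gamma = 1.5\) the probability may lie anywhere between 0.4 and 0.6.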

Page 14

Sensitivity Analysis for WWII Study

Upper bound on one-sided significance level for the 1926 vs. 1928 IV (\(\beta\) = causal effect of military service on earnings):

Γ | \(H_0: \beta \ge 0\) | \(H_0: \beta \ge 4{,}500\)
1 | 0.001 | 0.001
1.2 | 1.000 | 0.001
1.5 | 1.000 | 0.027
1.6 | 1.000 | 0.904
2.2 | 1.000 | 1.000
2.3 | 1.000 | 1.000
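How an upper bound on a significance level arises can be sketched with the simplest pair statistic, the sign test. Under the sensitivity model, each nonzero pair difference is positive with probability at most Γ/(1+Γ), so the worst-case p-value is a binomial tail at that probability. The sign test is swapped in here for simplicity, and the function names and toy data are assumptions; the numbers in the table above were not necessarily produced by this statistic.

```python
import math


def binom_upper_tail(n, t, p):
    """P(X >= t) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    if t <= 0:
        return 1.0
    if t > n:
        return 0.0
    total = 0.0
    for k in range(t, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(1.0, total)


def sign_test_upper_pvalue(diffs, gamma):
    """Largest one-sided sign-test p-value compatible with bias at most gamma.

    diffs: encouraged-minus-control differences in adjusted responses; the
    alternative is that they tend to be positive. Zero differences are dropped.
    """
    nonzero = [d for d in diffs if d != 0]
    n_positive = sum(1 for d in nonzero if d > 0)
    worst_case_prob = gamma / (1.0 + gamma)   # equals 1/2 when gamma = 1
    return binom_upper_tail(len(nonzero), n_positive, worst_case_prob)


# Hypothetical example: 70 positive and 30 negative pair differences.
diffs = [1.0] * 70 + [-1.0] * 30
for gamma in (1.0, 1.5, 3.0):
    print("gamma =", gamma, "upper bound on p-value:", sign_test_upper_pvalue(diffs, gamma))
```

For a null hypothesis of the form β ≥ β0, where the alternative makes the differences tend negative, the same function can be applied to the negated differences.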

Page 15

Strength of IV

• We’d like our study to be as insensitive to bias as possible, i.e., the finding is significant for as large a Γ as possible.
• How does the strength of an IV affect sensitivity to bias?
• An IV is strong if encouragement has a strong effect on treatment received; an IV is weak if encouragement has only a weak effect on treatment received.
• Effects of weak IVs:
  1. Increased variance
  2. Increased sensitivity to bias

Study | Strong IV | Weak IV
World War II study | 1926 vs. 1928 | 1924 vs. 1926
Maternal smoking study | Random assignment of free counseling | State cigarette taxes

Page 17

Effect of Weak IVs I: Increased Variance

If Z is a weak IV, then the variance of the IV estimate will be higher because less variation in W can be extracted from Z.

[Diagram nodes: Z|X; W|X; Y; Unobserved Variables]

95% CI for the effect of military service using the 1926 vs. 1928 IV: (-$1,445, -$500).
95% CI for the effect of military service using the 1924 vs. 1926 IV: (-$10,130, $10,750).
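The variance penalty can be seen in a small simulation. This sketch assumes a valid binary IV, a constant effect, unit-variance noise, and no treatment among unencouraged units; the strengths 0.5 and 0.1, the sample sizes, and the function names are illustrative choices, not the WWII data. In this setup the Wald estimate equals the true effect plus a noise term divided by the observed strength, so its spread grows roughly in proportion to 1/strength.

```python
import random
import statistics


def wald_once(rng, n_per_arm, strength, effect=-1.0):
    """One Wald estimate from a simulated study with a valid binary IV."""
    y_enc, treated = [], 0
    for _ in range(n_per_arm):
        # An encouraged unit takes the treatment with probability `strength`.
        w = 1 if rng.random() < strength else 0
        treated += w
        y_enc.append(effect * w + rng.gauss(0.0, 1.0))
    # Unencouraged units are never treated in this toy setup.
    y_ctl = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    effect_on_y = statistics.fmean(y_enc) - statistics.fmean(y_ctl)
    effect_on_w = treated / n_per_arm
    return effect_on_y / effect_on_w


def sd_of_estimates(strength, n_per_arm=1000, reps=100, seed=0):
    """Standard deviation of the Wald estimate across simulated studies."""
    rng = random.Random(seed)
    estimates = [wald_once(rng, n_per_arm, strength) for _ in range(reps)]
    return statistics.stdev(estimates)


print("SD with strength 0.5:", round(sd_of_estimates(0.5), 3))
print("SD with strength 0.1:", round(sd_of_estimates(0.1), 3))
```

The much wider interval reported above for the 1924 vs. 1926 comparison reflects the same mechanism.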

Page 18

Effect of Weak IVs II: Increased Sensitivity to Bias

Power of a Sensitivity Analysis (Rosenbaum, 2004)

Suppose Z were in fact a valid IV, so that

\[ P(Z_{i1} = 1, Z_{i2} = 0) = P(Z_{i1} = 0, Z_{i2} = 1) = \frac{1}{2}, \]

but we didn’t know this and wanted to allow for some sensitivity to bias measured by \(\Gamma\). Suppose also that \(\beta - \beta_0\) (true causal effect minus null hypothesis causal effect) was large, so that \(H_0: \beta = \beta_0\) was substantially in error. We would like to be able to reject \(H_0: \beta = \beta_0\) when the bias could be up to some \(\Gamma\) (e.g., \(\Gamma = 1.5\)).

Power of a sensitivity analysis at \(\Gamma\): the probability that we will reject \(H_0: \beta = \beta_0\) for \(\Gamma\), assuming that Z is a valid IV and a given value of \(\beta - \beta_0\).

Page 19

Effect size: \((\beta - \beta_0)/\sigma = 1\). Power is shown by number of pairs I. Strength of IV: P(Treat | IV = 1) - P(Treat | IV = 0).

Strength of IV | Γ | I = 100 | I = 1000 | I = 10,000 | I = 100,000 | lim I→∞
1 | 1 | 1.00 | 1.00 | 1.00 | 1.00 | 1
0.5 | 1 | 0.99 | 1.00 | 1.00 | 1.00 | 1
0.1 | 1 | 0.12 | 0.73 | 1.00 | 1.00 | 1
1 | 1.2 | 1.00 | 1.00 | 1.00 | 1.00 | 1
0.5 | 1.2 | 0.92 | 1.00 | 1.00 | 1.00 | 1
0.1 | 1.2 | 0.03 | 0.03 | 0.04 | 0.10 | 1
1 | 2 | 1.00 | 1.00 | 1.00 | 1.00 | 1
0.5 | 2 | 0.18 | 0.97 | 1.00 | 1.00 | 1
0.1 | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0

When the IV is valid (\(\Gamma = 1\)), the power is of course greater for stronger IVs, but there is good power in all cases with a sample size of 10,000 pairs. Valid but weak IVs eventually get it right.

But when \(\Gamma > 1\), the power can tend to 1 or 0 depending on the strength of the IV. Weak IVs are quite sensitive to small biases.
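The qualitative pattern can be reproduced with a rough simulation. This sketch makes several assumptions that are not from the talk: each pair difference is modeled as the effect (1, in units of σ) times a Bernoulli(strength) indicator plus standard normal noise, and the test is a sign test with a normal approximation to its worst-case null distribution at Γ. Its numbers will therefore not match the table, whose test statistic the slide does not name.

```python
import math
import random
from statistics import NormalDist


def upper_pvalue(diffs, gamma):
    """Worst-case one-sided sign-test p-value at bias gamma (normal approximation)."""
    n = len(diffs)
    n_positive = sum(1 for d in diffs if d > 0)
    p_plus = gamma / (1.0 + gamma)            # largest chance of a positive sign
    z = (n_positive - 0.5 - n * p_plus) / math.sqrt(n * p_plus * (1.0 - p_plus))
    return 1.0 - NormalDist().cdf(z)


def power(strength, gamma, n_pairs, effect=1.0, reps=50, alpha=0.05, seed=0):
    """Share of simulated valid-IV studies that still reject H0 at bias gamma."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        diffs = [(effect if rng.random() < strength else 0.0) + rng.gauss(0.0, 1.0)
                 for _ in range(n_pairs)]
        if upper_pvalue(diffs, gamma) <= alpha:
            rejections += 1
    return rejections / reps


for strength in (1.0, 0.5, 0.1):
    for gamma in (1.0, 1.2, 2.0):
        print("strength", strength, "gamma", gamma,
              "power", power(strength, gamma, n_pairs=1000))
```

Under these assumptions a pair difference is positive with probability 0.5 + 0.34 × strength (approximately), so power eventually tends to 1 only if that probability exceeds Γ/(1+Γ); otherwise adding pairs does not help.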

Page 20

Practical Consequences

1. Weak IVs that might have small bias are dangerous to use. Weak IVs are sensitive to quite small biases (\(\Gamma > 1\) yet \(\Gamma\) close to 1), even when the effect size \((\beta - \beta_0)/\sigma\) is quite large. Unless one is confident that a weak IV is perfectly valid (\(\Gamma = 1\)), its extreme sensitivity to small biases is likely to limit its usefulness to the study of enormous effects, \((\beta - \beta_0)/\sigma \gg 1\).

2. Strong IVs that might be moderately biased are useful. A strong IV may provide useful information even if moderate biases are plausible. Consider two studies, a small study with a strong IV and a large study with a weak IV, which would have the same power if both IVs are unbiased. When there is concern that the IVs might be biased, the small study with a strong IV has considerable advantages.

Page 21

Potential IVs in Health Outcomes Research

Potential IV | Strength
Differential distance to nearest provider of treatment A vs. treatment B | Weak or strong
Geographic or hospital preference for treatment A vs. B | Weak or strong
Physician preference for treatment A vs. B | Weak or strong
Calendar time (one treatment may become more common over time) | Weak or strong
Genetic variants | Usually weak
Timing of admission to hospital | Weak or strong
Insurance plan coverage for treatment A vs. B | Weak or strong
Randomized encouragement at point of care for treatment A vs. B when there is no clear-cut choice | Potentially strong

Reference for this talk: Small, D. and Rosenbaum, P. (2008). War and wages: the strength of instrumental variables and their sensitivity to unobserved biases. Journal of the American Statistical Association, 103, 924-933.