
Estimating Treatment Effects with Observational Data using Instrumental Variable Estimation: The Extent of Inference

John M. Brooks, Ph.D.

Health Effectiveness Research Center (HERCe)

Colleges of Pharmacy and Public Health

University of Iowa

June 26, 2005


Research Goal:

• Estimate causal relationships between “treatment” and “outcome” in healthcare...

→ treatment on outcome;

→ behavior on outcome;

→ system change on behavior (e.g. guideline implementation);

→ system change on outcome.


• Written as a linear relationship:

Y = a0 + a1• T

our goal is to obtain estimate(s) of “a1”.

• To estimate “a1” T must move or vary.

• To make inferences about “a1” the source of the variation in T must be scrutinized relative to your research goal.


• Key research design issues for isolating and using “T” variation:

1. the manner in which the researcher collects data; and

2. the approach to deal with “confounding factors”

confounding factors: factors that vary both with T and Y.


Research Environments and Estimation Methods

[Figure: estimation methods arranged by data source (secondary databases vs. researcher-collected databases) and by how confounding factors are handled (design control vs. statistical control). Methods shown: entirely controlled experiments (2 tests, randomized controlled trials); quasi-experimental designs; instrumental variables (“ex post design”); statistical “matching” techniques (propensity scores); ANOVA, logistic regression, and multiple regression (“risk adjustment”); weighted regression techniques for survey databases (NMES, MEPS).]


Sources of Treatment Variation in Health Care

1. Randomized Controlled Trials: study of patients with a given medical condition in which treatment is randomly assigned.

• Why randomly assign treatment to patients?

To help ensure that estimated treatment effects result from the treatment variation and not unmeasured confounders.

The Gold Standard


• Why not more Randomized Controlled Trials?

→ ethical problems once treatment is approved

→ expensive and time-consuming

→ little motivation

→ patient sampling problems when comparing existing treatments (so who wants to be randomized?)


2. Observational Healthcare Databases Containing Healthcare Treatment Choices:

• Secondary:

→ Claims: medical service treatment claims from individuals with health insurance

→ Provider-Specific: databases describing the utilization of a set of providers.

• Primary:

→ Health Care Surveys: surveys of patients or providers detailing health care utilization.


• Strengths:

→ plenty of variation in treatment choice;

→ ability to study effects of treatment across a variety of clinical scenarios;

→ can assess treatments in practice – estimate “effectiveness”;

→ often unobtrusively collected;

→ the power of large numbers and time.


• Weaknesses:

→ data often not collected for the researcher’s purpose (secondary);

→ patient enrollment variation;

→ confounding information may be unobserved:

- care not covered is not observed

- care not claimed is not observed

- claim form limitations

- nuances of illness, treatment, and patient that can’t be recorded on claims forms


Is the Main Weakness with Observational Data Unmeasured Confounders or Treatment Selection Bias?

1. Unmeasured Confounders

• Unmeasured Confounders argument:

→ homogeneous treatment effect (a1 same for all patients); and

→ unmeasured factors related to both treatment and outcome are the source of bias.


• Assume true outcome relationship is:

Y = a0 + a1•T + a2•L + e

where:

Y = measure of outcome (e.g. 1 if survive to a certain time period, 0 otherwise);

T = 1 if receive treatment, 0 otherwise; and

L = additional factor (e.g. severity, other treatments).

Goal is to estimate a1 – the effect of treatment on outcome.


• For Estimation Suppose:

→ L is not measured and the estimation model is:

Y = a0 + a1•T + u, where:

u = (a2•L + e)

→ L is related to Y (a2 ≠ 0); and

→ T and L are related (Cov(T,L) ≠ 0).

Cov(T,L) – covariance of T & L. Cov(T,L) ≠ 0 essentially means that T & L move together.


• Define the ordinary least squares (ANOVA) estimate of a1 as $\hat{a}_1$.

→ It can be shown that under these assumptions $\hat{a}_1$ is a biased estimate of a1 through its expected value:

$E[\hat{a}_1] = a_1 + \mathrm{Cov}(T,L) \cdot a_2$

→ Also note that $E[\hat{a}_1]$ will equal a1 if either:

-- Cov(T,L) = 0; or

-- a2 = 0.


• Suppose theory about the unmeasured variable “L” suggests:

→ “a2 < 0” (patients with higher severity are less likely to survive).

→ Cov(T,L) > 0 (treated patients are generally more severe).

• Plug the “signs” into our expected value formula to find:

$E[\hat{a}_1] = a_1 + (+)(-)$

→ $E[\hat{a}_1] < a_1$.
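A small simulation can make the sign argument concrete. The sketch below is illustrative only: the parameter values, the normal distribution for L, the selection rule, and the continuous outcome are all assumptions, not part of the original example. With a binary T, the OLS slope is the difference in mean outcomes between treated and untreated patients, and the textbook bias term is a2•Cov(T,L)/Var(T), which has the same sign as the Cov(T,L)•a2 term above.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed "true" parameters (hypothetical values); A2 < 0 as in the slide.
A0, A1, A2 = 0.5, 0.10, -0.20
N = 50_000

treated, untreated = [], []
for _ in range(N):
    severity = random.gauss(0, 1)  # unmeasured factor L
    # Sicker patients are more likely to be treated, so Cov(T, L) > 0.
    t = 1 if severity + random.gauss(0, 1) > 0 else 0
    y = A0 + A1 * t + A2 * severity + random.gauss(0, 0.1)
    (treated if t else untreated).append(y)

# With a binary T, OLS of Y on T equals the difference in group means.
naive_estimate = statistics.mean(treated) - statistics.mean(untreated)
print(f"true a1 = {A1:.3f}, naive estimate = {naive_estimate:.3f}")
```

Under these assumptions the naive estimate falls below the true a1, matching the direction predicted by the sign argument.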


• Problem with the Unmeasured Confounders argument to describe bias in observational data:

→ No theoretical foundation linking treatments to unmeasured factors....

Why is Cov(T,L) ≠ 0?

→ In the example above, if treatment effect (a1) is the same for all patients, why would Cov(T,L) > 0?

Perhaps patients getting treated:

-- live in areas with high/low poverty;

-- live in areas with more pollution; or

-- also tend to get other unmeasured treatments.


2. Treatment Selection Bias (the gestalt underlying most negative reviewers’ comments)

• Treatment Selection Bias argument:

→ Heterogeneous treatment effect -- Cov(T,L) is a reflection of the decision-maker’s beliefs about treatment effectiveness across patients related to unmeasured factors “L”.

→ “Bias” comes from unmeasured factors (L) being related to the treatment choice and outcome.

→ Researcher must address both bias and ability to generalize (to whom do the results apply?).


• Assume true outcome relationship is:

Y = b0 + (b1•L)•T + b2•L + e, where:

Y = measure of outcome (e.g. 1 if survive to a certain time period, 0 otherwise);

T = 1 if receive treatment, 0 otherwise;

L = unmeasured factor (e.g. severity, other treatment);

b2 = the direct effect of L on Y; and

(b1•L) = effect of T on Y that depends on L.


→ L is now related to T through theory linking "treatment choice" to the decision-maker’s expectations of treatment benefits across patients with different “L”.

T = c0 + c1•L + c2•W + v, where:

T = 1 if receive treatment, 0 otherwise;

L = unmeasured factor (e.g. severity, other treatment) affecting treatment choice through expected treatment effectiveness; and

W = other factors affecting treatment choice.

If decision makers use L in treatment decisions, c1 ≠ 0 and Cov(T,L) ≠ 0.


• Ultimate goal should be to estimate (b1•L) – the effect of treatment T on outcome Y across levels of L.

• For estimation suppose:

→ L is not measured, the researcher wrongly assumes that the effect of T is homogeneous, and the estimation model is:

Y = a0 + a1•T + u, where:

u = f(L, T, e, b1, b2)


• Define the ordinary least squares (ANOVA) estimate of a1 as $\hat{a}_1$.

→ It can be shown that the expected value of $\hat{a}_1$ is:

$E[\hat{a}_1] = b_1 \cdot E[L \mid T=1] + c_1 \cdot b_2$

→ If b2 = 0 (L has no direct effect on Y) or c1 = 0 (no selection based on L), then $E[\hat{a}_1]$ becomes:

$E[\hat{a}_1] = b_1 \cdot E[L \mid T=1]$

Yields an average estimate of the treatment effect for “the treated” in the sample. Result can be generalized only to those with L similar to those treated.
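One way to see where an expression of this shape comes from is to take conditional means of the outcome equation. This is a sketch, assuming the linear model Y = b0 + (b1•L)•T + b2•L + e with e unrelated to T, and using the fact that with a binary T the OLS estimate is the difference in mean outcomes between treated and untreated patients:

```latex
\begin{aligned}
E[Y \mid T=1] &= b_0 + (b_1 + b_2)\,E[L \mid T=1] \\
E[Y \mid T=0] &= b_0 + b_2\,E[L \mid T=0] \\
E[\hat{a}_1] &= E[Y \mid T=1] - E[Y \mid T=0] \\
             &= b_1\,E[L \mid T=1] + b_2\,\bigl(E[L \mid T=1] - E[L \mid T=0]\bigr)
\end{aligned}
```

The second term disappears when b2 = 0 or when treated and untreated patients have the same average L (no selection on L). Its sign is the sign of c1•b2, which is the role the c1•b2 term plays in the expression above.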


• How does c1 • b2 affect this estimate?

→ Assume that L is unmeasured illness severity and that higher L means more severe illness.

→ Higher L lowers survival which implies b2 < 0.

→ If treatment benefit is less for more severe cases (e.g. surgery for heart attacks) then:

$b_1 < 0 \;\rightarrow\; c_1 < 0 \;\rightarrow\; c_1 \cdot b_2 > 0$

(b1 < 0: benefit falls with higher severity; c1 < 0: less treatment in more severe cases)

Estimate of the effect of the treatment on the treated will be biased high.


→ If treatment benefit is greater for more severe cases (e.g. antibiotics for otitis media) then:

$b_1 > 0 \;\rightarrow\; c_1 > 0 \;\rightarrow\; c_1 \cdot b_2 < 0$

(b1 > 0: benefit increases with higher severity; c1 > 0: more treatment in more severe cases)

Estimate of the effect of the treatment on the treated will be biased low.


• So what do we have here?

→ Observational data contain treatment variation.

→ If treatment benefits are heterogeneous the best you can get is an estimate of the treatment effect on the treated (Does this address the benefits from expanding treatments?).

→ Treatment selection may be based on unmeasured factors related to both treatment effectiveness and outcomes.

→ If unmeasured factors affecting selection also affect outcomes directly, the estimate will be biased.

Do we have any alternatives?


Instrumental Variables (IV) Estimation and “Subset B”

• IV estimation offers consistent estimates for a subset of patients (McClellan, Newhouse 1993):

Marginal Patients: patients whose treatment choices vary with measured factors called instruments that do not directly affect outcomes.

• McClellan and Newhouse argued that estimates of treatment effects for Marginal Patients are useful.

→ Estimates may be more suitable than RCT estimates to address the question of whether existing treatment rates should change.


• Where do Marginal Patients come from?

[Figure: Distribution of Patients by Prior Assessment of the Certainty of Treatment Benefit. A horizontal bar spanning 0% to 100% of patients (50% marked), with ends labeled “More certainty about treatment benefits” and “Less certainty about treatment benefits”, divided into Subsets A, B, and C.]

A = subset of patients all providers agree to treat.

B = subset of patients whose treatment choice is situation/provider dependent.

C = subset of patients all providers agree not to treat.


• Patients in Subset B are interesting because:

→ the “best” treatment choice (treat or don’t treat) is least certain;

→ treatment or no-treatment for a patient in this subset is not considered bad medicine – the “art” of medicine;

→ the possibility of gaining new RCT evidence for patients in this subset is remote (ethics, motivation);

→ McClellan et al. 1994 argue that (1) policy interventions and (2) non-clinical factors (e.g. provider access, market pressures) affect mainly the treatment choices of patients in this subset.


• Size and location of Subset B vary with clinical scenario.

[Figure: two horizontal bars, each spanning 0% to 100% of patients (50% marked) with ends labeled “More Certainty” and “Less Certainty”. One bar is divided into Subsets A, B, and C; the other shows only Subsets B and C. Panel captions:]

-- off-label use for new treatment (e.g. new anti-cancer drugs used in non-tested cancer populations);

-- treatment with little consensus (e.g. aggressive treatment for early-stage prostate cancer).


• Changes in the underlying population definition will affect the location of Subset B.

[Figure: two horizontal bars, each spanning 0% to 100% of patients (50% marked) with ends labeled “More Certainty” and “Less Certainty” and divided into Subsets A, B, and C, with Subset B placed differently in each bar. Panel captions:]

-- aggressive treatment for early-stage prostate cancer for 70-80 year-olds with one comorbidity;

-- aggressive treatment for early-stage prostate cancer for 50-60 year-olds with no comorbidities.


• IV estimation involves:

1. Finding measured variables or “instruments” (Z) that:

a. are related to the possibility of a patient receiving treatment (cov(T,Z) ≠ 0); and

b. are assumed (through theory) unrelated directly to Y or to unmeasured confounding variables (cov(Z,L) = 0).

The theoretical basis for “Z” variables should come from a model of treatment choice – the “W” variables in:

T = c0 + c1•L + c2•W + v, where:

W = other factors affecting treatment choice.


• IV estimation involves (cont.):

2. Grouping patients using values of the “instrument”.

3. Estimating treatment effects for marginal patients by exploiting treatment rate differences across patient groups.

Local Average Treatment Effect -- (Imbens & Angrist 1994)


• For example, if an instrument divides patients into two groups, a simple IV estimate can be found by calculating:

1. the overall treatment rate in each group (ti = treatment rate in group “i”); and

2. the overall outcome rate in each group (yi = outcome rate in group “i”); and estimate:

$\hat{a}_1^{IV} = \dfrac{y_1 - y_2}{t_1 - t_2} = \dfrac{\text{difference in outcome rate}}{\text{difference in treatment rate}}$

where:

$\hat{a}_1^{IV}$ = average treatment effect for the “marginal patients” specific to the instrument used in the analysis – only those patients whose treatment choices were affected by the instrument, who must have come from Subset B.


• Hypothetical Treatment Choices Across Patients Grouped by Access to Providers Required for Treatment

[Figure: two horizontal bars, each spanning 0% to 100% of patients with ends labeled “More Certainty” and “Less Certainty” and divided into Subsets A, B, and C, with a segment M marked within Subset B. In the patient group closer to providers required for treatment, 60% are treated; in the patient group further from providers required for treatment, 50% are treated.]

M = patients within Subset B whose treatment choices are affected by the instrument – the Marginal Patients for that instrument.


• We have treatment rates for each group:

Closer Group Treatment Rate: .60

Further Group Treatment Rate: .50

Suppose we also measured “cure” rates in both groups:

Closer Group Cure Rate: .40

Further Group Cure Rate: .38

• Four numbers lead to the following IV estimate:

$\hat{a}_1^{IV} = \dfrac{.40 - .38}{.6 - .5} = \dfrac{.02}{.1} = .2$
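The four-number calculation can be written as a short function. This is a minimal sketch; the function name `wald_iv_estimate` and its argument names are illustrative choices, not taken from the slides.

```python
def wald_iv_estimate(y1: float, y2: float, t1: float, t2: float) -> float:
    """Two-group IV estimate: outcome-rate difference over treatment-rate difference."""
    treatment_gap = t1 - t2
    if treatment_gap == 0:
        # Without a treatment-rate difference the instrument gives no leverage.
        raise ValueError("groups have the same treatment rate")
    return (y1 - y2) / treatment_gap


# Closer group: cure rate .40, treatment rate .60.
# Further group: cure rate .38, treatment rate .50.
estimate = wald_iv_estimate(y1=0.40, y2=0.38, t1=0.60, t2=0.50)
print(round(estimate, 3))  # 0.2
```

Raising an error when the treatment rates match is a design choice for the sketch: a zero denominator means the grouping fails the first instrument criterion.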


• Strict Interpretation:

→ If the treatment rate in the Further Group were increased by .01 (one percentage point, e.g. .50 to .51) by increasing treatment for the M patients in the Further Group, the cure rate in the Further Group would increase by .002 (.01 • .2) – from .38 to .382.

• Stretched “Policy-Relevant” Interpretation (McClellan et al. 1994)

→ A behavioral intervention that increases the overall treatment rate by .01 (one percentage point, e.g. .55 to .56) would lead to an increase in the cure rate of .002 (.01 • .2).
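Both interpretations use the same arithmetic: the change in the treatment rate times the IV estimate. The sketch below shows the strict case; the function name `projected_cure_rate` is a hypothetical label, and the projection holds only if the patients whose treatment changes resemble the marginal patients behind the estimate.

```python
def projected_cure_rate(current_cure_rate: float,
                        treatment_rate_change: float,
                        iv_estimate: float) -> float:
    """Cure rate after a treatment-rate shift, assuming the IV estimate applies to those shifted."""
    return current_cure_rate + treatment_rate_change * iv_estimate


# Further Group: cure rate .38, treatment rate raised from .50 to .51, IV estimate .2.
new_rate = projected_cure_rate(current_cure_rate=0.38,
                               treatment_rate_change=0.01,
                               iv_estimate=0.2)
print(round(new_rate, 3))  # 0.382
```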


• Stretched interpretation assumes that the treatment effect for patients in Subset B is fairly homogeneous and that an IV estimate from a single instrument can be generalized to all patients in Subset B.

• Stretched interpretation may not be accurate if treatment effects are heterogeneous within Subset B and different instruments affect treatment choices for different patients within Subset B.

→ Results from a single instrument may still be more appropriate than assuming RCT results apply to Subset B.

→ Ability to generalize results may increase if more than one instrument is used in an IV analysis.


• IV qualifiers to remember:

→ second property of IV variables (cov(Z,L) = 0) is forever an assumption (unless more data are obtained);

→ unmeasured but correlated treatments may still bias estimated treatment benefits; and

→ ability to generalize is limited.

Researchers should fully qualify their IV estimates – don't oversell.


Hypothetical Example to Demonstrate “4-Number” Result

Suppose:

• 2100 children with Otitis Media (OM) in a population.

• Two treatment possibilities:

1. antibiotics;

2. watchful waiting.

• The patients in our sample are in one of three severity types: “low”, “medium”, and “high”.

• Severity type is observed by the provider/patient but is not observed by the researcher.


• The 2100 patients are distributed across severity type in the following manner:

                      severity type
                      High    Medium   Low
number of patients    800     800      500

• The actual underlying cure rates for each severity type by treatment are:

                      severity type
treatment             High    Medium   Low
antibiotics           .95     .97      .98
watchful waiting      .80     .90      .98


→ Higher severity means a lower cure rate in general (b2 < 0).

→ Treatment effects are heterogeneous: antibiotics have a higher curative effect in more severe patients and offer no advantage to the less severe (b1 > 0).

→ All providers are inclined to believe that antibiotics work well in the "high" severity patients and have little effect on the "low" severity patients, but the effect in the "medium" type is unknown.

→ Leads to treatment selection bias...the more severe kids are treated (c1 > 0) and more severe kids are less likely cured (b2 < 0).


Potential Methods to Get Treatment Variation for Analysis:

1. Randomize Patients Into Treatments -- ANOVA

2. Providers Assign Treatments -- ANOVA

3. Instrumental Variable Grouping


1. Randomize Patients Across Population – ANOVA.

Patient Treatment Assignments After Randomization by Severity Type

                      severity type
patient groups        High    Medium   Low
antibiotics           400     400      250
watchful waiting      400     400      250


Expected average cure rates for each group:

$\text{Antibiotic Cure Rate} = \dfrac{400}{1050}(.95) + \dfrac{400}{1050}(.97) + \dfrac{250}{1050}(.98) = .965$

$\text{W.W. Cure Rate} = \dfrac{400}{1050}(.80) + \dfrac{400}{1050}(.90) + \dfrac{250}{1050}(.98) = .881$

• Unbiased average antibiotic treatment effect for the entire population (.965 - .881 = .084), but

• Estimate will vary with the average severity in the population...E[L|T=1].

• To whom does it apply? A patient randomly chosen from an urn? Are patients chosen from urns?
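The randomized-trial arithmetic can be reproduced directly from the two tables of the hypothetical example. The sketch below uses the patient counts and cure rates given in the slides; the variable and function names are illustrative.

```python
# Hypothetical otitis media example: patients and true cure rates by severity type.
patients = {"high": 800, "medium": 800, "low": 500}
cure_rate = {
    "antibiotics":      {"high": 0.95, "medium": 0.97, "low": 0.98},
    "watchful_waiting": {"high": 0.80, "medium": 0.90, "low": 0.98},
}


def expected_cure_rate(counts, rates):
    """Severity-weighted average cure rate for one treatment arm."""
    total = sum(counts.values())
    return sum(counts[s] * rates[s] for s in counts) / total


# Randomization splits every severity type evenly across the two arms.
arm_counts = {s: n // 2 for s, n in patients.items()}

antibiotic = expected_cure_rate(arm_counts, cure_rate["antibiotics"])
waiting = expected_cure_rate(arm_counts, cure_rate["watchful_waiting"])
print(round(antibiotic, 3), round(waiting, 3), round(antibiotic - waiting, 3))
# 0.965 0.881 0.084
```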


2. Providers Assign Treatments -- ANOVA

If providers follow “inclinations”, we may end up with something like:

Number of Patients Assigned by Providers to Each Treatment Group by Severity Type

                      severity type
patient group         High    Medium   Low
antibiotics           800     400      0
watchful waiting      0       400      500


Expected average cure rates for each group:

$\text{Antibiotic Cure Rate} = \dfrac{800}{1200}(.95) + \dfrac{400}{1200}(.97) + \dfrac{0}{1200}(.98) = .957$

$\text{W.W. Cure Rate} = \dfrac{0}{900}(.80) + \dfrac{400}{900}(.90) + \dfrac{500}{900}(.98) = .944$

• For this population the average treatment effect on the treated is (800/1200*.15 + 400/1200*.07 = .123).

• We find a biased low estimate of the antibiotic treatment effect for the average treated patient (.957 - .944 = .013 < .123).

• “Biased low” follows our theory as...
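The same calculation for provider-assigned treatment shows how far the naive comparison falls below the true effect on the treated. This is a sketch using the counts and cure rates from the hypothetical example; names are illustrative. Note that the slide subtracts the rounded rates (.957 - .944 = .013), while unrounded arithmetic gives about .012.

```python
cure_rate = {
    "antibiotics":      {"high": 0.95, "medium": 0.97, "low": 0.98},
    "watchful_waiting": {"high": 0.80, "medium": 0.90, "low": 0.98},
}
# Providers follow their inclinations: all "high" treated, no "low" treated.
assigned = {
    "antibiotics":      {"high": 800, "medium": 400, "low": 0},
    "watchful_waiting": {"high": 0,   "medium": 400, "low": 500},
}


def arm_cure_rate(arm):
    """Expected cure rate among the patients assigned to one arm."""
    counts = assigned[arm]
    return sum(n * cure_rate[arm][s] for s, n in counts.items()) / sum(counts.values())


# Naive comparison of treated and untreated patients.
naive = arm_cure_rate("antibiotics") - arm_cure_rate("watchful_waiting")

# True average effect among the treated, using the underlying cure rates.
treated = assigned["antibiotics"]
effect_on_treated = sum(
    n * (cure_rate["antibiotics"][s] - cure_rate["watchful_waiting"][s])
    for s, n in treated.items()
) / sum(treated.values())

print(round(naive, 3), round(effect_on_treated, 3))
```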


3. Instrumental Variable Grouping – Further assume:

a. Information is available to the researcher to approximate distances from patients to providers

• address of patient

• supply of providers in area around patients

b. Evidence suggests that patients in areas with more physicians per capita have a higher probability of being treated with antibiotics for their OM than patients in areas with fewer physicians per capita.


If “b” is true, divide 2100 patients into two groups based on the physicians per capita in the area around their home:

Group 1: the group of patients living in areas with a higher number of physicians per capita.

Group 2: the group of patients living in areas with a lower number of physicians per capita.


Using our assumptions, does this grouping qualify as an instrument?

1. Doc supply related to treatment? Yes, if patients tend to go to the closest provider for treatment.

If true, and providers follow inclinations we may see treatment patterns something like:

Patient Treatment Assignments by Severity Type

                 severity type
patient group    High               Medium                       Low
Group 1          100% antibiotics   80% antibiotics, 20% W.W.    100% W.W.
Group 2          100% antibiotics   30% antibiotics, 70% W.W.    100% W.W.


2. Is grouping related to unmeasured confounding variables (e.g. severity)? Related to severity only if parents chose residences in expectation of the severity of a future acute condition.

If not related to severity, we assume equivalent severity distributions across groups:

Number of Patients in Each Group by Severity Type

                 severity type
patient group    High    Medium   Low
Group 1          400     400      250
Group 2          400     400      250


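Expected average estimated cure rates for these groups:

$\text{Group 1 Cure Rate} = \dfrac{400}{1050}(.95) + \dfrac{320}{1050}(.97) + \dfrac{80}{1050}(.90) + \dfrac{250}{1050}(.98) = .959428$

$\text{Group 2 Cure Rate} = \dfrac{400}{1050}(.95) + \dfrac{120}{1050}(.97) + \dfrac{280}{1050}(.90) + \dfrac{250}{1050}(.98) = .946092$

Well, (.959428 - .946092) = .013336 doesn't appear to reveal much of anything…!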


Now look at the antibiotic treatment rate in each group:

720/1050 = .68571 in Group 1

520/1050 = .4952381 in Group 2

These differences also don't look very informative….

The IV change in the cure rates resulting from a one unit increase in the drug treatment rate equals:

$\hat{a}_1^{IV} = \dfrac{.959428 - .946092}{.68571 - .4952381} = \dfrac{.013336}{.1904719} = .07$

• This estimate is the average difference in the cure rate due to antibiotics for the marginal patients, who in this example are the “Medium” severity patients.
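The whole IV calculation for the hypothetical example can be reproduced from the treatment shares and cure rates. This is a sketch with illustrative names; the researcher would observe only the group-level treatment and cure rates, not the severity-specific inputs used here to generate them. Exact arithmetic gives a Group 2 cure rate of about .946095, marginally different from the .946092 shown above, and the ratio comes out at .07 either way.

```python
cure_rate = {
    "antibiotics":      {"high": 0.95, "medium": 0.97, "low": 0.98},
    "watchful_waiting": {"high": 0.80, "medium": 0.90, "low": 0.98},
}
# Share of each severity type given antibiotics in each physician-supply group.
antibiotic_share = {
    "group_1": {"high": 1.0, "medium": 0.8, "low": 0.0},
    "group_2": {"high": 1.0, "medium": 0.3, "low": 0.0},
}
# The instrument assumption: both groups have the same severity mix.
group_size = {"high": 400, "medium": 400, "low": 250}


def group_rates(share):
    """Return (treatment rate, cure rate) for one instrument group."""
    n = sum(group_size.values())
    treated = sum(group_size[s] * share[s] for s in group_size)
    cured = sum(
        group_size[s] * (share[s] * cure_rate["antibiotics"][s]
                         + (1 - share[s]) * cure_rate["watchful_waiting"][s])
        for s in group_size
    )
    return treated / n, cured / n


t1, y1 = group_rates(antibiotic_share["group_1"])
t2, y2 = group_rates(antibiotic_share["group_2"])
iv_estimate = (y1 - y2) / (t1 - t2)
print(round(t1, 4), round(t2, 4), round(iv_estimate, 4))
```

The result matches the true antibiotic advantage for “Medium” severity patients (.97 - .90), because only their treatment choices differ between the two groups.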


• Remember the actual “unknown” cure rates for each group by treatment are:

                      severity type
treatment             High    Medium   Low
antibiotics           .95     .97      .98
watchful waiting      .80     .90      .98

Antibiotic advantage for “Medium” severity patients: .97 - .90 = .07

• This estimate was found using only measured treatment rates and outcome rates across “groups” that are defined by the instruments.

• Which of the estimates above is the most important for policy-makers wondering about over/underutilization of a treatment?


IV Brass Tacks

• Where do instruments come from?

→ Theory on what motivated choices, not theory on how choices can be motivated.

→ Observed differences in:

-- guideline implementation (timing/interpretation)

-- product approval rules across payers

-- reimbursement differences across payers/geography

-- area provider “treatment signatures”

-- geographic access to relevant providers

-- provider market structure/competition

→ Generally, “Natural Experiments” (Angrist and Krueger, 2001)


• General IV Estimation Model

Treatment Choice Equation (1st stage):

$T_i = c_0 + c_2 X_i + c_3 Z_i + v_i + c_1 L_i$

Outcome Equation (2nd stage):

$Y_i = a_0 + a_1 \hat{T}_i + a_2 X_i + e_i + a_3 L_i$

where:

Yi = 1 if health outcome occurs, 0 otherwise;

Xi = measured patient clinical characteristics;

Ti = 1 if patient received treatment, 0 otherwise;

$\hat{T}_i$ = predicted treatment from 1st stage;

Zi = a set of binary variables grouping patients based on values of instrumental variables (from W); and

Li = unmeasured confounding variables assumed related to both Y and T but not Z.

The only variation in T used to estimate a1 comes from Z.


• Define the IV estimate of a1 as $\hat{a}_1^{IV}$.

→ It can be shown that the expected value of $\hat{a}_1^{IV}$ is:

$E[\hat{a}_1^{IV}] = b_1 \cdot E[L \mid T(Z)]$

Yields an average estimate of the treatment effect for the set of patients whose treatment choices were dependent on their value of Z.


→ The estimate of a1 can only be definitively generalized to the patients whose treatment choices were affected by Z (Angrist, Imbens, Rubin 1996).

→ F-test of whether the parameters within c3 are simultaneously equal to zero provides a test of the first instrumental variable criterion:

Finding measured variables or “instruments” (Z) that:

a. are related to the possibility of a patient receiving treatment (cov(T,Z) ≠ 0)
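For a first stage containing only an intercept and group indicators, the F statistic can be computed by hand from group treatment counts. The sketch below is a simplification under that assumption (no X covariates); the function name `first_stage_f` is illustrative, and the inputs are the treatment counts from the hypothetical otitis media groups (720 and 520 treated out of 1050 each).

```python
def first_stage_f(treated_by_group, size_by_group):
    """F statistic for H0: the treatment rate is the same in every instrument group.

    Assumes a first stage with an intercept and group dummies only.
    """
    n = sum(size_by_group)
    q = len(size_by_group) - 1  # number of dummy coefficients being tested
    overall = sum(treated_by_group) / n
    # For a binary T, squared residuals around a mean p over m patients sum to m*p*(1-p).
    ssr_restricted = n * overall * (1 - overall)
    ssr_unrestricted = sum(
        m * (k / m) * (1 - k / m) for k, m in zip(treated_by_group, size_by_group)
    )
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - q - 1))


f_stat = first_stage_f(treated_by_group=[720, 520], size_by_group=[1050, 1050])
print(round(f_stat, 1))
```

A large F statistic speaks only to the first criterion. The second criterion (cov(Z,L) = 0) is not tested by this calculation.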


→ Model can be estimated via:

-- Two-Stage Least Squares (2SLS) – PROC SYSLIN in SAS.
-- Bivariate Probit – biprobit command in Stata.
-- Two-Stage Replacement (e.g. Beenstock & Rahav, 2002).

→ 2SLS offers consistent estimates that are asymptotically normal with the fewest assumptions (Angrist 2001).

-- essentially regressing group-level outcome rate changes on group-level treatment rate changes.
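The group-level interpretation of 2SLS can be made concrete with a small sketch. It assumes Z is a set of group indicators and that there are no covariates X; under those assumptions the first-stage prediction is each group's treatment rate, and the second-stage slope is a size-weighted regression of group outcome rates on group treatment rates. The function name `grouped_iv_estimate` and the toy data are hypothetical.

```python
from collections import defaultdict


def grouped_iv_estimate(outcome, treatment, group):
    """2SLS estimate of a1 when Z is a set of group dummies and there
    are no covariates X (an assumption made for this sketch)."""
    sums = defaultdict(lambda: [0, 0.0, 0.0])  # n, sum of T, sum of Y
    for y, t, z in zip(outcome, treatment, group):
        s = sums[z]
        s[0] += 1
        s[1] += t
        s[2] += y
    n = len(outcome)
    t_bar = sum(treatment) / n
    y_bar = sum(outcome) / n
    num = 0.0
    den = 0.0
    for n_g, t_sum, y_sum in sums.values():
        t_g = t_sum / n_g  # group treatment rate (first-stage prediction)
        y_g = y_sum / n_g  # group outcome rate
        num += n_g * (t_g - t_bar) * (y_g - y_bar)
        den += n_g * (t_g - t_bar) ** 2
    return num / den


# Toy data: group 0 has a 20% treatment rate and 50% outcome rate;
# group 1 has an 80% treatment rate and 80% outcome rate.
group = [0] * 10 + [1] * 10
treatment = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2
outcome = [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2

# With two groups this is the ratio (0.8 - 0.5) / (0.8 - 0.2).
a1_iv = grouped_iv_estimate(outcome, treatment, group)
print(a1_iv)
```

With only two groups the estimate is the change in the outcome rate divided by the change in the treatment rate; with more groups it becomes a weighted combination of such comparisons.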


• How many groups?

→ Z can be specified as continuous variables, but results are then conditional on this assumption and are less interpretable.

→ Creating many groups from an instrument (more binary variables in Z) uses more information and yields a weighted average of many two-group comparisons, e.g.

-- low/high groups using the median of the instrument, versus
-- low/med-low/med-high/high groups using the quartiles of the instrument.

→ Too many groups may introduce bias.

→ Best to report estimates for several grouping strategies.
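One way to build several grouping strategies from the same continuous instrument is sketched below. The helper names `quantile_groups` and `to_dummies`, and the instrument values, are hypothetical; the cut points use the default method of Python's `statistics.quantiles`, which is only one of several reasonable conventions.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_groups(instrument, n_groups):
    """Assign each value to one of n_groups quantile-based groups,
    numbered 0 (lowest) to n_groups - 1 (highest)."""
    cuts = quantiles(instrument, n=n_groups)  # n_groups - 1 cut points
    return [bisect_right(cuts, v) for v in instrument]


def to_dummies(groups, n_groups):
    """Binary variables for Z, omitting group 0 as the reference."""
    return [[1 if g == k else 0 for k in range(1, n_groups)] for g in groups]


# Hypothetical instrument values.
instrument = [-40, -12, -5, -1, 2, 6, 15, 60]
two_groups = quantile_groups(instrument, 2)    # split at the median
four_groups = quantile_groups(instrument, 4)   # split at the quartiles
z_four = to_dummies(four_groups, 4)
print(two_groups)
print(four_groups)
```

Re-estimating the model with each grouping and reporting the estimates side by side is a simple check on whether the result depends on the number of groups chosen.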


Effect of Dialysis Center Profit-Status on Patient Survival: An Instrumental Variables Approach

Brooks, Irwin, Pendergast, Chrischilles, Flanigan, Hunsicker


Introduction

In a meta-analysis of observational studies, Devereaux et al. (2002) found that patient survival at for-profit dialysis centers was poorer than at non-profit centers.

Objective

Compare estimates of the effect of dialysis center profit status on patient survival using risk-adjustment and IV estimation.


Sample

• N = 101,669 incident ESRD patients from the United States Renal Data System (USRDS), 1996-1999, who:

-- were between 67 and 100 years old at dialysis initiation;
-- had hemodialysis as initial modality;
-- obtained dialysis in a non-government dialysis facility;
-- had complete information on all model variables;
-- had zip codes linked to 1990 census data.

Key Variable Definitions

• Outcome: one-year survival after dialysis initiation = 1, 0 otherwise.

• Treatment Setting: patient initiated dialysis in a for-profit dialysis center = 1, 0 otherwise.


Instrumental Variable Strategy

• Followed McClellan et al. (1994) and grouped patients based on Differential Distance (DD) to various hospital classifications:

DD = (DFP - DNP) where

DFP = distance from patient residence to the nearest for-profit dialysis center; and
DNP = distance from patient residence to the nearest non-profit dialysis center.

• Assessed whether IV estimates were robust to the number of patient groups defined using differential distance.
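The differential distance definition translates directly into code. The sketch below is illustrative only: the `Patient` record, its field names, and the rule that ties count as "not closer to for-profit" are assumptions, and how the distances themselves were measured is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    dist_for_profit: float  # DFP: miles to nearest for-profit center
    dist_non_profit: float  # DNP: miles to nearest non-profit center


def differential_distance(p: Patient) -> float:
    """DD = DFP - DNP; negative values mean a for-profit center is closer."""
    return p.dist_for_profit - p.dist_non_profit


def closer_to_for_profit(p: Patient) -> bool:
    """Two-group version of the instrument (ties treated as not closer)."""
    return differential_distance(p) < 0


# Hypothetical patients.
patients = [Patient(3.0, 10.0), Patient(25.0, 4.0), Patient(8.0, 8.0)]
dd_values = [differential_distance(p) for p in patients]
flags = [closer_to_for_profit(p) for p in patients]
print(dd_values, flags)
```

Finer groupings follow by cutting the DD values at quantiles instead of at zero.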


Figure: Percent Initial For-Profit and Number of Comorbidities by Patient Differential Distance

[Plot not reproduced. X-axis: miles a not-for-profit is closer (-100 to 100), with regions labeled “for-profit closer” and “not for-profit closer”. Y-axes: % for-profit (0.0 to 1.0) and average number of comorbidities (1 to 5). Series: % for-profit; comorbidities.]


Table 1: Attributes of Dialysis Patient Groups, 1996-1999

| Characteristics | Treatment Setting: For-Profit | Treatment Setting: Non-Profit | DD, Patient Closer to: For-Profit | DD, Patient Closer to: Non-Profit |
| --- | --- | --- | --- | --- |
| For Profit % | 100 | 0 | 92.8 | 48.1** |
| White % | 70.6 | 73.3** | 74.9 | 67.9** |
| Black % | 23.5 | 20.0** | 19.8 | 25.1** |
| Cardiac Failure % | 42.6 | 44.2** | 43.1 | 43.1 |
| Diabetes % | 45.1 | 41.3** | 44.8 | 43.1** |
| CerebroVasc Dis % | 11.9 | 12.5** | 12.0 | 12.1 |
| Isch Heart Dis % | 32.1 | 36.1** | 33.5 | 33.1 |
| AMI | 11.6 | 13.2** | 12.2 | 12.0 |
| Reside in [a]: | | | | |
| High Hlth State % | 61.2 | 47.4** | 61.4 | 52.9** |
| Med Hlth State % | 16.4 | 40.3** | 13.7 | 33.3 |
| Low Hlth State % | 22.4 | 12.3** | 24.9 | 13.9** |
| Number | 73,480 | 30,678 | 52,443 | 51,715 |

[a] Subramanian S, Kawachi I, et al. (2001). “Does the state you live in make a difference? A multilevel analysis of self-rated health in the US.” Social Science & Medicine 53(1): 9-19.

**, * statistically significant at the .01 and .05 levels, respectively.


“Marginal” End Stage Renal Disease Patients, 1996-1999

[Diagram not reproduced. Patients are arrayed on a 0% to 100% scale running between “More Likely to For-Profit” and “Less Likely to For-Profit”, with 48.1%, 50%, and 92.8% marked and a region labeled M.]

M = patients whose dialysis center choice is dependent on the relative distance to for-profit and non-profit dialysis centers – Marginal Patients.
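As a rough worked illustration, the size of group M can be approximated from the for-profit shares in Table 1 (92.8% for patients closer to a for-profit center, 48.1% for patients closer to a non-profit center). This assumes a two-group version of the instrument, no covariate adjustment, and that being closer to a for-profit center never makes a patient less likely to use one (the monotonicity condition of Imbens and Angrist 1994). The symbol $Z^{FP}$, equal to 1 when a for-profit center is closer, is introduced here only for illustration.

$$\Pr(M) \approx \Pr(T = 1 \mid Z^{FP} = 1) - \Pr(T = 1 \mid Z^{FP} = 0) = 0.928 - 0.481 = 0.447$$

Under those assumptions, roughly 45% of patients had a center choice that depended on relative distance, and the IV estimates generalize most directly to that group.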


Table 2: F-Statistics Testing Whether Factors in the Center Choice Model Are Related to the Use of For-Profit Dialysis Facilities, 1996-1999

| Factor [a] | Partial F-Statistic |
| --- | --- |
| Differential Distance (instrument) | 2150.53** |
| Year | 59.53** |
| Gender | 5.81** |
| Age | 7.29** |
| Race | 6.26** |
| Comorbidity | 8.55** |
| Previous Healthcare Use | 2.63* |
| State of Residence | 212.00** |
| Distance to Nearest Center | 75.41** |
| Area Socioeconomic Status | 21.83** |

[a] Specified using binary variables reflecting differences in the respective characteristic. Differential distance was used to group patients into 20 separate groups.

**, * statistically significant at the .01 and .05 levels, respectively.


Table 3: 2SLS/IV and Ordinary Least Squares (OLS) Estimates of the Effect of an Initial For-Profit Dialysis Provider Relative to a Non-Profit Provider on 1-Year Patient Survival

| Estimation Model and Specification | Number of Instrument Groups Specified | Estimate (P-value) |
| --- | --- | --- |
| OLS, no covariates | na | -0.0031 [c] (0.3450) |
| OLS, Devereaux covariates [a] | na | -0.0122 [c] (<.0001) |
| OLS, Devereaux covariates plus [b] | na | -0.0071 [c] (0.0511) |
| 2SLS/IV [b] | 2 | 0.0009 (0.9264) |
| 2SLS/IV [b] | 5 | 0.0025 (0.7373) |
| 2SLS/IV [b] | 10 | -0.00004 (0.9953) |
| 2SLS/IV [b] | 20 | -0.0002 (0.9823) |
| 2SLS/IV [b] | 40 | 0.0006 (0.9349) |

[a] Factors consistently controlled for in the studies within the Devereaux meta-analysis: age, gender, race, comorbidities.

[b] Factors consistently controlled for in the studies within the Devereaux meta-analysis (age, gender, race, comorbidities), plus dialysis year, state of residence, previous healthcare utilization, provider access (distance to nearest dialysis center), and socioeconomic status (patient zip percent rural, percent poverty, and per capita income).

[c] Logistic regression estimates were consistent in both magnitude and statistical significance. OLS estimates were reported because their interpretation is more consistent with IV estimates.


Summary

• The foundation of IV estimation is theory that suggests instruments – what factors motivated treatment choices.

• Ability to generalize is limited, but IV estimates offer a more natural estimate of the effects of rate changes than RCT estimates.

• Estimates can vary by sample and instrument used.

• Estimates are conditional on the truth (and acceptance) of a known identification restriction. The source of the treatment variation is known. The relationship between this variation source and unmeasured confounders can be debated.


References

Angrist JD. 2001. Estimation of Limited Dependent Variable Models with Dummy Endogenous Regressors: Simple Strategies for Empirical Practice. Journal of Business & Economic Statistics. 19(1):2-16.

Angrist JD, Imbens GW, Rubin DB. 1996. Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association. 91:444-454.

Angrist JD, Krueger AB. 2001. Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Journal of Economic Perspectives. 15(4):69-85.

Beenstock M, Rahav G. 2002. Testing Gateway Theory: Do Cigarette Prices Affect Illicit Drug Use? Journal of Health Economics. 21:679-698.

Brooks JM, Chrischilles E, Scott S, Chen-Hardee S. 2003. Was Lumpectomy Underutilized for Early Stage Breast Cancer? – Instrumental Variables Evidence for Stage II Patients from Iowa. Health Services Research. 38(6):1385-1402.

Brooks JM, McClellan M, Wong H. 2000. The Marginal Benefits of Invasive Treatment for Acute Myocardial Infarction: Does Insurance Coverage Matter? Inquiry. 37(1):75-90.

Devereaux P, Schunemann H, et al. 2002. Comparison of Mortality Between Private For-Profit and Private Not-For-Profit Hemodialysis Centers: A Systematic Review and Meta-analysis. JAMA. 288(19):2449-2457.

Imbens GW, Angrist JD. 1994. Identification and Estimation of Local Average Treatment Effects. Econometrica. 62(2):467-475.

McClellan M, McNeil BJ, Newhouse JP. 1994. Does More Intensive Treatment of Acute Myocardial Infarction in the Elderly Reduce Mortality? Analysis Using Instrumental Variables. Journal of the American Medical Association. 272:859-866.

McClellan M, Newhouse JP. 1993. The Marginal Benefits of Medical Treatment Intensity. Cambridge, Mass: National Bureau of Economic Research. Working Paper.

McClellan M, Newhouse JP. 1997. The Marginal Cost-Effectiveness of Medical Technology: A Panel Instrumental Variables Approach. Journal of Econometrics. 77:39-64.

Subramanian S, Kawachi I, et al. 2001. Does the State You Live in Make a Difference? A Multilevel Analysis of Self-Rated Health in the US. Social Science & Medicine. 53(1):9-19.