Instrumental Variables and Control Functions

Day 3, Lecture 1
By Caroline Krafft
Training on Applied Micro-Econometrics and Public Policy Evaluation
July 25-27, 2016
Economic Research Forum



Readings
• Primary source:
  • Angrist, J. D., & Pischke, J.-S. (2009). Chapter 4, “Instrumental variables in action: Sometimes you get what you need.” In Mostly Harmless Econometrics. Princeton, NJ: Princeton University Press.
• Additional material from:
  • Wooldridge, J. M. (2015). Control Function Methods in Applied Econometrics. Journal of Human Resources, 50(2), 421–445.
  • Terza, J. V., Basu, A., & Rathouz, P. J. (2008). Two-Stage Residual Inclusion Estimation: Addressing Endogeneity in Health Econometric Modeling. Journal of Health Economics, 27(3), 531–543.

Type II solutions
• Have been examining Type I (conditional exogeneity of placement) quasi-experimental solutions
  • Propensity score matching
  • Difference-in-differences
  • Panel data (fixed and random effects)
• Now moving to Type II solutions
  • Rules or instruments (instrumental variables) determine placement
  • Instrumental variables or regression discontinuity design

Instrumental variables
• An instrument is a variable that is correlated with the causal variable of interest but uncorrelated with any other determinants of the dependent variable
• An instrumental variable can be used to address problems of missing or unknown control variables (omitted variables bias)
• Instrumental variables can be used in different techniques:
  • Two-stage least squares
  • Control functions
• IVs also have an important role in:
  • Correcting for random measurement error in continuous variables
    • Not categorical or binary variables
  • Solving simultaneous equations models

Case study: schooling and wages
• Interested in the link between schooling ($s$) and wages ($Y$). Let’s assume that the causal relationship is a different function $f_i$ for each person $i$: $Y_{si} = f_i(s)$
  • Tells us what individual $i$ would earn for any level of schooling $s$
• A common simplifying assumption is that the functional form is a linear, constant-effects (causal) model:
  • $Y_{si} = \alpha + \rho s + \eta_i$
  • $\eta_i$ captures the other factors that determine potential earnings
• Let’s say that one observable factor is ability, $A_i$
• Selection on observables would mean:
  • $\eta_i = A_i'\gamma + \upsilon_i$
• For the moment, let us assume that ability is the only reason $\eta_i$ and $s_i$ are correlated, so that $E[s_i \upsilon_i] = 0$
• So if ability is observed, then $Y_{si} = \alpha + \rho s_i + A_i'\gamma + \upsilon_i$

What if ability is not observed?
• Want to know $\rho$ in $Y_{si} = \alpha + \rho s_i + A_i'\gamma + \upsilon_i$
• Do not observe $A_i$, which is likely to be correlated with $s_i$, creating omitted variables bias if we simply tried to estimate $Y_{si} = \alpha + \rho s_i + \eta_i$
• Potential solution: use an instrument ($z_i$) that is correlated with the causal variable of interest ($s_i$) but uncorrelated with any other determinants of the dependent variable
  • $\mathrm{Cov}(z_i, \eta_i) = 0$
  • Referred to as the exclusion restriction: $z_i$ could have been excluded from our model of interest
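
The omitted-variable problem and the instrument-based fix can be illustrated with a small simulation. This is only a sketch under assumed values: the data-generating process, the coefficients (a true $\rho$ of 0.10, an ability effect of 0.5), and all variable names are hypothetical and do not come from the lecture. With one instrument and no covariates, the IV estimate is the ratio $\mathrm{Cov}(y, z) / \mathrm{Cov}(s, z)$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (all values are illustrative assumptions).
rho, gamma = 0.10, 0.5            # true return to schooling; effect of ability
ability = rng.normal(size=n)      # A_i: unobserved by the analyst
z = rng.binomial(1, 0.5, size=n)  # instrument: independent of ability by construction
s = 12 + 1.0 * z + 1.5 * ability + rng.normal(size=n)     # schooling
y = 1.0 + rho * s + gamma * ability + rng.normal(size=n)  # log wages


def cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return np.cov(a, b)[0, 1]


# OLS of y on s omits ability, so the slope absorbs part of the ability effect.
ols_slope = cov(y, s) / np.var(s, ddof=1)

# IV ratio: reduced form (y on z) divided by first stage (s on z).
iv_slope = cov(y, z) / cov(s, z)

print(f"true rho = {rho:.3f}, OLS = {ols_slope:.3f}, IV = {iv_slope:.3f}")
```

In this setup the OLS slope should sit well above the assumed $\rho$, while the IV ratio should land close to it, because $z$ was constructed to be unrelated to ability.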

Two stage least squares
• Want to know $\rho$ in $y_i = \alpha'X_i + \rho s_i + A_i'\gamma + \upsilon_i$ (structural equation)
  • Both $y_i$ and $s_i$ are endogenous variables; $X_i$ is exogenous
• Have some $z_i$ (exogenous variable) meeting the exclusion restriction
• Can estimate a regression for the first stage ($s_i$ on $z_i$) and a regression for the reduced form ($y_i$ on $z_i$)
• First stage, controlling for covariates:
  • $s_i = X_i'\pi_{10} + \pi_{11} z_i + \xi_{1i}$
  • Fitted values: $\hat{s}_i = X_i'\hat{\pi}_{10} + \hat{\pi}_{11} z_i$
• Substituting the predicted value of $s_i$ into the structural equation generates two-stage least squares estimates:
  • $y_i = \alpha'X_i + \rho \hat{s}_i + [\eta_i + \rho(s_i - \hat{s}_i)]$
  • Here $\eta_i = A_i'\gamma + \upsilon_i$ is the composite error when ability is unobserved
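
The two stages can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not the lecture's own code: the helper `ols`, the data-generating process, and every coefficient value are assumptions made for the example. It only recovers the point estimate; the standard errors from the second regression would not be valid.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)                    # exogenous covariate X_i
z = rng.normal(size=n)                    # instrument z_i
ability = rng.normal(size=n)              # unobserved A_i
s = 0.5 * x + 0.8 * z + ability + rng.normal(size=n)
y = 2.0 + 0.3 * x + 0.10 * s + 0.5 * ability + rng.normal(size=n)

const = np.ones(n)

# First stage: regress s on the covariates and the instrument.
W = np.column_stack([const, x, z])
pi_hat = ols(W, s)
s_hat = W @ pi_hat                        # fitted values of s

# Second stage: replace s with its fitted values, keeping the same covariates.
X2 = np.column_stack([const, x, s_hat])
beta_2sls = ols(X2, y)

# For comparison: OLS using observed s (biased by the omitted ability term).
beta_ols = ols(np.column_stack([const, x, s]), y)

print("2SLS rho:", round(beta_2sls[2], 3), " OLS rho:", round(beta_ols[2], 3))
```

The same covariate `x` appears in both stages on purpose; dropping it from one stage would change what the second-stage coefficient estimates.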

Two Stage Least Squares: A summary
• Two stage least squares uses an instrument and the concept of a first stage and a second stage (the two steps on the previous slide) to get a consistent estimate of $\rho$
• Actually running the two stages as separate regressions leads to incorrect standard errors
• Typically implemented in a single command in software

Finding an instrument
• Things you’ll need to check in implementing this two-stage approach:
  • First stage is statistically significant (no weak instruments)
    • Test this and present your test statistics (F-test)
  • Exclusion restriction ($z_i$ only affects $y_i$ through $s_i$)
• To find a good instrument, need to understand the processes determining the variable of interest ($s_i$)
  • Institutional knowledge is particularly helpful
  • There may be institutional constraints that can serve as good instruments
• Need a very strong case for why the instrument only affects the outcome through the variable of interest
  • If there are any other Xs that are affected by the instrument, would need to control for them to prevent a violation of the exclusion restriction
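
As a rough sketch of the first-stage F-test, the statistic can be computed by comparing the first stage with and without the excluded instrument. Everything here is an assumption for illustration: the function names `rss` and `first_stage_F`, the simulated data, and the use of the classical (homoskedastic) F formula rather than a robust variant.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return float(resid @ resid)


def first_stage_F(s, X_controls, Z_instruments):
    """Classical F-statistic for excluding the instruments from the first stage."""
    n = len(s)
    X_full = np.column_stack([X_controls, Z_instruments])
    q = X_full.shape[1] - X_controls.shape[1]   # number of excluded instruments
    k = X_full.shape[1]                         # parameters in the unrestricted model
    rss_r = rss(X_controls, s)                  # restricted: instruments left out
    rss_u = rss(X_full, s)                      # unrestricted: instruments included
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))


rng = np.random.default_rng(2)
n = 2_000
x = rng.normal(size=n)
z = rng.normal(size=n)
noise = rng.normal(size=n)
controls = np.column_stack([np.ones(n), x])

s_strong = 0.5 * x + 0.50 * z + noise   # instrument clearly moves schooling
s_weak = 0.5 * x + 0.02 * z + noise     # instrument barely moves schooling

F_strong = first_stage_F(s_strong, controls, z)
F_weak = first_stage_F(s_weak, controls, z)
print(f"F (strong) = {F_strong:.1f}, F (weak) = {F_weak:.1f}")
```

With a single excluded instrument this F-statistic is the square of the t-statistic on the instrument in the first stage.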

Example: Quarter of Birth Instrument
• Example (Angrist and Krueger 1991): Children enter school in the calendar year in which they turn 6
  • School start age is a function of date of birth
  • Have to stay in school until age 16
  • Children are in different grades when they reach the dropout age
  • Compulsory schooling laws and age of entry create a natural experiment where children are compelled to attend school for a varying number of years based on when they are born
• Can use quarter of birth as an instrument for schooling
• Conceptually excludable: quarter of birth shouldn’t affect ability (or motivation, or family connections, or anything else that affects wages)

Average Education by Quarter of Birth (First Stage)
• Men born earlier in the year tend to have less schooling (quit “earlier”)

Earnings and quarter of birth
• Earnings are lower for those who are born in earlier quarters (those with less schooling)

Example: OLS and 2SLS results

Multiple instruments
• Can use multiple instruments ($z_{1i}$, $z_{2i}$, $z_{3i}$, etc.) in the first stage
• If there are multiple endogenous variables in the second stage, will need multiple instruments
• Models are just identified when # instruments = # endogenous variables
• Models are over identified when # instruments > # endogenous variables
  • This allows for additional testing of the assumptions underlying the instruments

IV with Heterogeneous Potential Outcomes
• Our basic two stage least squares (2SLS) model assumed that the causal effect of interest is constant.
  • For a dummy variable (example: college (1) or no college (0)) this means $y_{1i} - y_{0i} = \rho$ for all $i$
    • Homogeneous treatment effect
  • For a multivalued variable (example: years of schooling), this means $y_{si} - y_{s-1,i} = \rho$ for all $i$ and all $s$
    • Linearity and homogeneous treatment effect

Validity and Heterogeneity
• Treatment effects are likely to be heterogeneous
  • A distribution of effects across individuals
  • Example: Individuals who choose to take up a training program may be those who particularly benefit from it. Expanding the training program to the general population might have different (weaker) effects
• Internal validity holds when the analysis identifies causal effects for the population being studied
  • Will hold for a good IV study or RCT
  • Regardless of heterogeneity
• External validity holds when a study’s results can predict effects in different contexts
  • Can be better assessed when allowing for heterogeneous treatment effects

Heterogeneity with a dummy treatment variable
• Interested in the effect of some program on the outcome $y_i$, where we capture participation as a dummy, $D_i$
• Denote as $y_i(d, z)$ the potential outcome of $i$ with $D_i = d$ and $z_i = z$
• Denote as $D_{1i}$ $i$’s treatment status when $z_i = 1$ and denote as $D_{0i}$ $i$’s treatment status when $z_i = 0$
  • Only one is observed
• Observed treatment status is therefore:
  • $D_i = D_{0i} + (D_{1i} - D_{0i}) z_i = \pi_0 + \pi_{1i} z_i + \xi_i$
  • Here $\pi_0 = E[D_{0i}]$, $\pi_{1i} = D_{1i} - D_{0i}$, and $\xi_i = D_{0i} - E[D_{0i}]$
• Average causal effect of $z_i$ on $D_i$ is $E[\pi_{1i}]$

Monotonicity assumption
• Have model for treatment of:
  • $D_i = \pi_0 + \pi_{1i} z_i + \xi_i$
• For this model to be useful, it has to be the case that monotonicity holds, meaning:
  • $\pi_{1i} \geq 0$ for all $i$, or $\pi_{1i} \leq 0$ for all $i$
• The instrument has to either:
  • increase participation or have no effect for every individual
  • or decrease participation or have no effect for every individual
• There cannot be some people who are more likely and some who are less likely to participate because of the instrument

Independence Assumption
• Have to assume that the instrument is as good as randomly assigned (independent of potential outcomes and treatment assignments):
  • $[y_i(D_{1i}, 1),\, y_i(D_{0i}, 0),\, D_{1i},\, D_{0i}] \perp z_i$
• This means then that the first stage is causal

Exclusion restriction
• For the exclusion restriction to hold in the heterogeneous treatment effects and dummy treatment framework, it must be the case that $y_i(d, z)$ is a function only of $d$
  • $y_i(d, 0) = y_i(d, 1)$ for $d = 0, 1$
• Exclusion restriction fails if the outcome of interest is affected by the instrument in some way other than through the treatment (program) of interest
  • Need to have a unique channel for causal effects of the instrument
• The exclusion restriction can fail even when the instrument is randomly assigned
  • Random assignment could lead to other changes in behavior
  • Example: Those more likely to be drafted into the military stayed in school longer. Draft numbers were assigned by random lottery, but the behavior of remaining in college confounds the estimated impact of military service on wages

The LATE Theorem
• Given:
  • A1-Independence: $[y_i(D_{1i}, 1),\, y_i(D_{0i}, 0),\, D_{1i},\, D_{0i}] \perp z_i$
  • A2-Exclusion: $y_i(d, 0) = y_i(d, 1)$ for $d = 0, 1$
  • A3-First stage (no weak instruments): $E[D_{1i} - D_{0i}] \neq 0$
  • A4-Monotonicity: $D_{1i} - D_{0i} \geq 0 \;\forall i$, or $D_{1i} - D_{0i} \leq 0 \;\forall i$
  • In any study with IVs, you need to consider and discuss all of these conditions in your work
• Then you can estimate the local average treatment effect (LATE) (for the case where the instrument increases treatment):
  $$\frac{E[y_i \mid z_i = 1] - E[y_i \mid z_i = 0]}{E[D_i \mid z_i = 1] - E[D_i \mid z_i = 0]} = E[y_{1i} - y_{0i} \mid D_{1i} > D_{0i}] = E[\rho_i \mid \pi_{1i} > 0]$$
  • Here $\rho_i = y_{1i} - y_{0i}$ is individual $i$’s treatment effect
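
The ratio in the LATE theorem can be checked on simulated data with known complier, always-taker, and never-taker groups. This is a sketch under made-up assumptions: the group shares (50/20/30 percent), the treatment effects by group (2, 5, and 1), and all names are hypothetical. Defiers are excluded by construction, and the instrument enters the outcome only through treatment.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400_000

# Hypothetical population shares: 50% compliers, 20% always takers, 30% never takers.
types = rng.choice(["complier", "always", "never"], size=n, p=[0.5, 0.2, 0.3])
z = rng.binomial(1, 0.5, size=n)          # randomly assigned instrument

D0 = (types == "always").astype(int)      # treatment status if z = 0
D1 = (types != "never").astype(int)       # treatment status if z = 1 (no defiers)
D = D0 + (D1 - D0) * z                    # observed treatment

# Heterogeneous effects: compliers gain 2, always takers 5, never takers 1.
effect = np.where(types == "complier", 2.0,
                  np.where(types == "always", 5.0, 1.0))
y0 = rng.normal(size=n) + 1.0 * (types == "always")   # baseline differs by type
y = y0 + effect * D                       # exclusion: z enters only through D

# Wald estimator: reduced form divided by first stage.
reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = D[z == 1].mean() - D[z == 0].mean()
late = reduced_form / first_stage

ate = effect.mean()                       # average effect over the whole population
print(f"first stage = {first_stage:.3f}, LATE = {late:.3f}, ATE = {ate:.3f}")
```

The ratio should recover the compliers' effect (2 in this setup) rather than the population average effect, which is pulled up here by the always takers.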

Dividing the sample for LATE
• There could be four groups (assume IV increases treatment)
  • Defiers (ruled out by monotonicity): $D_{0i} = 1$, $D_{1i} = 0$
  • Compliers: $D_{1i} = 1$, $D_{0i} = 0$
    • Affected by the instrument
  • Always takers: $D_{1i} = 1$, $D_{0i} = 1$
  • Never takers: $D_{1i} = 0$, $D_{0i} = 0$
• With LATE, identify the effect of the treatment based on the population of compliers
  • Not informative about effects on never takers or always takers
  • ATT (average treatment effect on the treated) is based on always-takers and compliers
  • ATU (average treatment effect on the untreated) is based on never-takers and compliers
• Compliers will be different (and therefore LATE different) for different instruments

IVs in RCTs
• Often end up using IVs and LATE in RCTs when:
  • RCT is a randomly assigned offer of treatment
  • One-sided non-compliance: only some of those offered treatment take it up
    • All controls remain untreated
• Comparing those who take up treatment with those who did not would be misleading (selection bias, typically positive)
• Can use offer of treatment as an IV for treatment received
• Then the LATE is the effect of treatment on compliers, which here is the effect of treatment on the treated (TOT)
• Distinct from intent-to-treat (ITT) estimates, which show the causal effect of the offer of treatment on those assigned to treatment
  • Whether or not they took it up
• ITT / compliance rate = TOT
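
The last relation is simple arithmetic, sketched here with made-up numbers. The function name `tot_from_itt` and the figures are hypothetical and are not the JTPA results; the `takeup_control` argument defaults to zero to reflect one-sided non-compliance.

```python
def tot_from_itt(mean_y_offered, mean_y_control, takeup_offered, takeup_control=0.0):
    """Rescale the ITT by the difference in take-up rates to obtain the TOT."""
    itt = mean_y_offered - mean_y_control
    compliance = takeup_offered - takeup_control
    if compliance == 0:
        raise ValueError("No difference in take-up: the offer has no first stage.")
    return itt, itt / compliance


# Hypothetical numbers, for illustration only.
itt, tot = tot_from_itt(mean_y_offered=16_200.0, mean_y_control=15_000.0,
                        takeup_offered=0.60)
print(f"ITT = {itt:.0f}, TOT = {tot:.0f}")
```

Because only 60 percent of the offered group takes up treatment in this example, the per-participant effect is larger than the ITT by a factor of 1/0.6.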

JTPA (Job Training Partnership Act) experiment: Program effects on earnings of disadvantaged

Complicating LATE
• Can add covariates
  • Independence assumption becomes a conditional independence assumption:
    • $[y_{1i},\, y_{0i},\, D_{1i},\, D_{0i}] \perp z_i \mid X_i$
    • As good as randomly assigned conditional on covariates
  • May be necessary for instrument to be valid
  • Can improve precision
  • With linear modeling, 2SLS results are a (very) close approximation of the causal relationship of interest
• Can use multiple instruments
  • Keeping in mind that different instruments generate different compliers

Extending to an Average Causal Response Model
• Consider now the case where treatment is not just a dummy
  • Example: years of schooling: $Y_{si} = f_i(s)$
  • There are $s$ different unit causal effects: $y_{si} - y_{s-1,i}$
  • Linear causal model assumes these are all the same
• Assuming independence, exclusion, first stage, and monotonicity, 2SLS generates a weighted average of unit causal effects
  • Based on compliers over the range of $s_i$ (those driven by the instrument $z$ from a treatment intensity less than $s$ to at least $s$)

Common 2SLS mistakes: Manual 2SLS
• Software packages have 2SLS built in, so it is best to use the built-in functions as they help avoid some errors
• If you do “manually” compute two stages, need to make sure to:
  • Adjust the standard errors for the two-stage nature of the estimates
    • The second-stage OLS residual variance is wrong: it includes the difference between the observed and predicted endogenous variable times its coefficient, $\rho(s_i - \hat{s}_i)$
  • Use the same covariates ($X$) in the first and second stages
    • Failing to do so can create inconsistency in the second stage
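
The standard-error adjustment can be sketched as follows: keep the manual second-stage coefficients, but rebuild the residuals with the observed endogenous variable instead of its fitted value. This is an illustrative example on simulated data with assumed coefficients, and it uses the classical homoskedastic variance formula; it is not a replacement for a packaged 2SLS routine.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
rho = 1.0                                   # illustrative true coefficient
z = rng.normal(size=n)
ability = rng.normal(size=n)                # unobserved confounder
s = 0.5 * z + ability + rng.normal(size=n)
y = 1.0 + rho * s + 0.5 * ability + rng.normal(size=n)

const = np.ones(n)
Z = np.column_stack([const, z])             # instrument matrix (with constant)
X = np.column_stack([const, s])             # structural regressors

# Two stages by hand.
s_hat = Z @ np.linalg.lstsq(Z, s, rcond=None)[0]
X_hat = np.column_stack([const, s_hat])
beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]

XtX_inv = np.linalg.inv(X_hat.T @ X_hat)
k = X.shape[1]

# Naive: residuals from the second-stage regression, built with s_hat.
resid_naive = y - X_hat @ beta
se_naive = np.sqrt(resid_naive @ resid_naive / (n - k) * np.diag(XtX_inv))

# Corrected: same coefficients, but residuals built with the observed s.
resid_2sls = y - X @ beta
se_2sls = np.sqrt(resid_2sls @ resid_2sls / (n - k) * np.diag(XtX_inv))

print("rho_hat:", round(beta[1], 3))
print("naive SE:", round(se_naive[1], 4), " corrected SE:", round(se_2sls[1], 4))
```

In this particular setup the naive standard error overstates the uncertainty; in other designs it can understate it, which is why the correction matters in both directions.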

2SLS with small samples
• 2SLS is consistent (as the sample becomes large) but is biased in small samples
  • 2SLS estimates may be systematically wrong
• 2SLS is most biased when instruments are weak and when there are many instruments
  • Biased towards OLS
  • Essentially because the first stage is estimated, and it is noisier the weaker the instruments
• Most concerned about this with small samples, weak instruments, many instruments
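
A small Monte Carlo sketch can make the bias toward OLS visible. All design choices here are assumptions for illustration: 100 observations, 20 weak instruments, a true coefficient of zero, and errors with correlation 0.8 between the first stage and the structural equation. Variables are mean zero by construction, so no constant is included.

```python
import numpy as np


def tsls_slope(y, s, Z):
    """2SLS slope for one endogenous regressor s and instrument matrix Z (no constant)."""
    s_hat = Z @ np.linalg.lstsq(Z, s, rcond=None)[0]
    return float(s_hat @ y) / float(s_hat @ s)


rng = np.random.default_rng(5)
n, n_instruments, reps = 100, 20, 500
true_rho = 0.0
pi = np.full(n_instruments, 0.1)            # weak first-stage coefficients

ols_draws, tsls_draws = [], []
for _ in range(reps):
    Z = rng.normal(size=(n, n_instruments))
    v = rng.normal(size=n)                        # first-stage error
    u = 0.8 * v + 0.6 * rng.normal(size=n)        # structural error, correlated with v
    s = Z @ pi + v
    y = true_rho * s + u
    ols_draws.append(float(s @ y) / float(s @ s))
    tsls_draws.append(tsls_slope(y, s, Z))

median_ols = float(np.median(ols_draws))
median_tsls = float(np.median(tsls_draws))
print(f"true rho = {true_rho}, median OLS = {median_ols:.2f}, median 2SLS = {median_tsls:.2f}")
```

In this design the median 2SLS estimate should fall between the true value and the OLS estimate, which is the pattern the slide describes.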

Making the case for your instrument
• 1. Always report the first stage
  • Argue for why it makes sense (signs, magnitude)
• 2. Report your F-statistic on the instrument
  • Bigger is better; need above 10 as a rule of thumb
• 3. If you have multiple instruments, use the best one for just-identified estimates and present those
• 4. Use limited information maximum likelihood (LIML) for over-identified models
  • Less precise but less biased
  • Compare results
• 5. Check model in the reduced-form regression of y on instruments
  • Unbiased since OLS
  • Want to see causal relation of some size in reduced form

Common 2SLS mistakes: Forbidden Regression
• The 2SLS models we’ve been talking about have been using linear functional forms
  • Using OLS even when the endogenous variable calls for a nonlinear model (for example, a binary variable)
• Should we use a nonlinear first stage instead?
  • NO! Forbidden regression
• The forbidden regression plugs fitted values from a nonlinear first stage (predicted values of the endogenous regressor) into the second stage
  • Only OLS creates first-stage residuals that are uncorrelated with predicted values and covariates
  • Using nonlinear fitted values as instruments means identifying off the first stage nonlinearities
• Nonlinear second stage can also be problematic

Dealing with limited dependent variables
• Limited dependent variables (LDVs: binary, categorical, etc.) typically assume a latent linear index
  • Require functional form assumptions
• Often 2SLS is still the best way to go (or at least should be shown as one model)
• May be able to make the case for a nonlinear form like bivariate probit. Example: Have a third child? Depends on $z_i$ (sex of first two kids)
  • First stage: $D_i = 1[X_i'\gamma_0^* + \gamma_1^* z_i > \upsilon_i]$
  • Second stage (employment status): $Y_i = 1[X_i'\beta_0^* + \beta_1^* D_i > \varepsilon_i]$
  • Problem would be correlation between the error terms
  • Estimate bivariate probit with maximum likelihood
  • Generate average causal effects

Control Function Approaches

Control functions
• Although historically the term has been used in a variety of ways, most modern applications of the term control function are instrumental variable approaches (Wooldridge 2015)
  • There is an endogenous explanatory variable
• Control function approaches use the exogenous variation from an excluded instrument to generate variation in the residuals from the reduced form (first stage)
  • The residuals are the control functions, included in the second stage along with the endogenous variables
• Advantage is primarily in dealing with nonlinear models (Terza, Basu & Rathouz 2008)
• Also allows for tests of the nature of selection

Control function equations
• Assume we are interested in the effect of endogenous variable $y_2$ on outcome $y_1$ and have a vector $z$ of exogenous variables, including some instrument $z_2$
  • Structural equation: $y_1 = z_1\delta_1 + \gamma_1 y_2 + u_1$, with $E[z_j' u_1] = 0$
  • Reduced form (first stage): $y_2 = z_1\pi_{21} + z_2\pi_{22} + \nu_2$, with $E[z_j' \nu_2] = 0$
• Essential problem is same as for 2SLS. Solution is different, and based on the linear relationship between the structural and reduced form errors:
  • $u_1 = \rho_1\nu_2 + e_1$, with $E[\nu_2 e_1] = 0$
  • $e_1$ uncorrelated with $y_2$
• Can then estimate:
  • $y_1 = z_1\delta_1 + \gamma_1 y_2 + \rho_1\nu_2 + e_1$
  • Essentially “control for” endogeneity of $y_2$
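
The step from the error decomposition to the estimating equation can be written out as a short derivation. It uses only the symbols defined on this slide; the one added step is that $e_1 = u_1 - \rho_1\nu_2$ is uncorrelated with $z$ because both $u_1$ and $\nu_2$ are.

```latex
\begin{aligned}
y_1 &= z_1\delta_1 + \gamma_1 y_2 + u_1 \\
    &= z_1\delta_1 + \gamma_1 y_2 + \rho_1\nu_2 + e_1, \\[4pt]
\operatorname{Cov}(y_2, e_1)
    &= \operatorname{Cov}(z_1\pi_{21} + z_2\pi_{22} + \nu_2,\; e_1) \\
    &= \operatorname{Cov}(z_1\pi_{21} + z_2\pi_{22},\; u_1 - \rho_1\nu_2)
       + \operatorname{Cov}(\nu_2, e_1) = 0 .
\end{aligned}
```

Once $\nu_2$ is included as a regressor, the remaining error $e_1$ is uncorrelated with $z_1$, $y_2$, and $\nu_2$, so OLS on the augmented equation is consistent for $\gamma_1$.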

Implementing two-step control function
• Regress $y_{i2}$ on $z_i$
  • Predict OLS residuals $\hat{\nu}_{i2}$
• Run OLS regression of $y_{i1}$ on $z_{i1}$, $y_{i2}$, $\hat{\nu}_{i2}$
  • Generates $\hat{\delta}_1$, $\hat{\gamma}_1$, $\hat{\rho}_1$
  • Essentially keeps residuals and actual endogenous variable, whereas 2SLS uses predicted value of endogenous variable
  • Bootstrap standard errors
• Can undertake heteroscedasticity robust Hausman testing of endogeneity, $H_0: \rho_1 = 0$
• Still relies on (strong) instrument for identification
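
The two steps can be sketched with NumPy on simulated data. The data-generating process, the coefficient values (a true $\gamma_1$ of 0.25 and $\rho_1$ of 0.7), and the helper `ols` are assumptions for illustration only. In this linear case the control function coefficient on $y_2$ coincides numerically with the 2SLS coefficient, which the example checks; standard errors are not computed and would still need to account for the estimated residual (for example by bootstrapping both steps).

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


rng = np.random.default_rng(6)
n = 50_000
z1 = rng.normal(size=n)                     # included exogenous variable
z2 = rng.normal(size=n)                     # excluded instrument
nu2 = rng.normal(size=n)                    # reduced-form error
e1 = rng.normal(size=n)
u1 = 0.7 * nu2 + e1                         # rho_1 = 0.7 makes y2 endogenous
y2 = 0.4 * z1 + 0.6 * z2 + nu2
y1 = 1.0 + 0.5 * z1 + 0.25 * y2 + u1        # gamma_1 = 0.25

const = np.ones(n)
Zmat = np.column_stack([const, z1, z2])

# Step 1: reduced form for y2; keep the OLS residuals.
nu2_hat = y2 - Zmat @ ols(Zmat, y2)

# Step 2: regress y1 on z1, y2, and the residuals (the control function).
X_cf = np.column_stack([const, z1, y2, nu2_hat])
b_cf = ols(X_cf, y1)
gamma1_cf, rho1_cf = b_cf[2], b_cf[3]

# 2SLS for comparison: replace y2 with its fitted value.
y2_hat = y2 - nu2_hat
gamma1_2sls = ols(np.column_stack([const, z1, y2_hat]), y1)[2]

print(f"gamma_1: CF = {gamma1_cf:.4f}, 2SLS = {gamma1_2sls:.4f}; rho_1 = {rho1_cf:.3f}")
```

A coefficient on the residual that is far from zero, as in this setup, is the signal of endogeneity that the Hausman-type test of $\rho_1 = 0$ looks for.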