SUSAN W. PARKER CIDE LONGITUDINAL SURVEYS AND THEIR ROLE IN IMPLEMENTING QUASI-EXPERIMENTAL AND (FIELD) EXPERIMENTAL DESIGNS FOR POLICY EVALUATION.

Post on 22-Jun-2022


Page 1: Longitudinal surveys and their role in implementing quasi

SUSAN W. PARKER

CIDE

LONGITUDINAL SURVEYS AND THEIR ROLE IN IMPLEMENTING QUASI-EXPERIMENTAL AND (FIELD) EXPERIMENTAL DESIGNS FOR POLICY EVALUATION.

Page 2: Longitudinal surveys and their role in implementing quasi

PLAN OF THE PRESENTATION

• Overview: Program evaluation.

• Experimental and quasi/non experimental evaluations

• Advantages and disadvantages of distinctive empirical approaches.

• What data do we need for program evaluation?

• Experimental and non-experimental

• Can experimental estimates be replicated with non-experimental methods?

• Evaluations of long term impacts.

• Conclusions

Page 3: Longitudinal surveys and their role in implementing quasi

OBJECTIVE OF PROGRAM EVALUATION: MEASURING IMPACTS.

What do we mean by “impact”?

D= 1 for participants, =0 for those who do not participate.

Y1= result under treatment, Y0= result without treatment

Evaluation problem: we do not observe the counterfactual, i.e. what would have happened to participants in the absence of the program.

Impact on participants: $I = E(Y_1 - Y_0 \mid D = 1)$

Unobserved counterfactual: $E(Y_0 \mid D = 1)$

Page 4: Longitudinal surveys and their role in implementing quasi

DEFINING IMPACT

[Figure: primary outcome plotted over time, marking the intervention and the resulting impact.]

Page 5: Longitudinal surveys and their role in implementing quasi

HOW DO WE ESTIMATE A COUNTERFACTUAL?

• Usually, by selecting a group which is not affected by the program.

• Experimental evaluation:

• Use of random assignment of eligible beneficiaries to treatment or control.

• Non-experimental evaluation:

• Argue that a particular group under certain conditions can replicate the counterfactual.


Page 6: Longitudinal surveys and their role in implementing quasi

AVOIDING SELECTION BIAS

Individuals who participate in programs are frequently different from those who do not participate:

• Programs are only located in certain areas.

• Programs are means tested, e.g. have to be poor.

• Participation is voluntary.

Selection bias occurs when:

$E(Y_0 \mid D = 1) \neq E(Y_0 \mid D = 0)$

→ Comparison group not a good representation of the counterfactual
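A short derivation, using the D, Y1 and Y0 notation defined earlier, may help show why this matters. Comparing the mean outcome of participants with that of non-participants recovers the impact on participants plus a selection bias term, and that term vanishes only when the comparison group reproduces the counterfactual:

```latex
\begin{align*}
E(Y_1 \mid D = 1) - E(Y_0 \mid D = 0)
  &= \underbrace{E(Y_1 \mid D = 1) - E(Y_0 \mid D = 1)}_{I = E(Y_1 - Y_0 \mid D = 1)} \\
  &\quad + \underbrace{E(Y_0 \mid D = 1) - E(Y_0 \mid D = 0)}_{\text{selection bias}}
\end{align*}
```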

Page 7: Longitudinal surveys and their role in implementing quasi

EXPERIMENTAL DESIGNS

• Long history in medical studies.

• Since the 1960s used in developed countries to evaluate the impact of government programs.

• During the last 20 years, huge growth of pilot interventions using experimental designs to evaluate programs which might promote development in developing countries.

• Prospera (Progresa/Oportunidades) an early example.

• Poverty Action Lab (MIT)

• Example: Miguel and Kremer (2004, Econometrica): impacts of deworming on the health and schooling of children in Kenya.

• 75 schools, 2 treatment groups and 1 control.

Page 8: Longitudinal surveys and their role in implementing quasi

EXPERIMENTAL DESIGNS

D= 1 for participants, =0 for those who do not participate.

Y1= result under treatment, Y0= result without treatment

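Assigning treatment and control groups randomly implies that the control group gives a good approximation to the counterfactual.

Impact on participants: $I = E(Y_1 - Y_0 \mid D = 1)$

Counterfactual approximated by the control group: $E(Y_0 \mid D = 1)$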
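The contrast between random and non-random assignment can be illustrated with a small simulation. This is only a sketch under invented assumptions: the income distribution (mean 1000, standard deviation 200), the constant true impact of 100, and the means-test cutoff are all hypothetical and are not taken from any program discussed here.

```python
import random
import statistics


def simulate(n=20000, true_impact=100.0, seed=0):
    """Compare the naive treated-vs-untreated gap under two assignment rules."""
    rng = random.Random(seed)
    # Hypothetical potential outcomes: Y0 is income without the program,
    # Y1 = Y0 + true_impact is income with it (constant effect for simplicity).
    y0 = [rng.gauss(1000.0, 200.0) for _ in range(n)]

    def naive_gap(d):
        treated = [y + true_impact for y, di in zip(y0, d) if di == 1]
        control = [y for y, di in zip(y0, d) if di == 0]
        return statistics.mean(treated) - statistics.mean(control)

    # Random assignment: a coin flip, independent of Y0.
    d_random = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    # Means-tested program: only people with low Y0 participate.
    d_selected = [1 if y < 1000.0 else 0 for y in y0]
    return naive_gap(d_random), naive_gap(d_selected)


gap_random, gap_selected = simulate()
print(f"gap under random assignment: {gap_random:.1f}")
print(f"gap under means testing:     {gap_selected:.1f}")
```

Under the coin flip the gap should land close to the true impact, because the control group mean stands in for the counterfactual. Under means testing the same comparison even gets the sign wrong, since participants would have had lower incomes without the program anyway.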

Page 9: Longitudinal surveys and their role in implementing quasi

RANDOM ASSIGNMENT

[Figure: random assignment by the program office; income per capita axis (0, 500, 1000); Treatment and Control groups labeled 1257 and 1242.]

Page 10: Longitudinal surveys and their role in implementing quasi

NON-EXPERIMENTAL METHODS FOR CONSTRUCTING THE COUNTERFACTUAL

• Regression

• Matching

• Regression discontinuity.

• Instrumental variables

• Before-and-after and difference-in-difference methods.

• Useful for all types of estimators.

Page 11: Longitudinal surveys and their role in implementing quasi

NON RANDOM ASSIGNMENT

[Figure: non-random assignment by the program office; income per capita axis (0, 500, 1000); Treatment and Control groups labeled 1457 and 947.]

Page 12: Longitudinal surveys and their role in implementing quasi

GENERAL CHARACTERISTICS OF FIELD EXPERIMENTS

• The experiment or program tends not to last long, e.g. a year or two.

• Program may end…

• Or continue with the control group also becoming treatment…

• Normally focused on a local geographic area, e.g. 50 schools in one or two states randomly assigned to treatment or control.

• Most experimental evaluations focus on short term impacts.

• Baseline data pre program (check quality of randomization).

• Longitudinal data: 1 or 2 follow up rounds.

• Data focuses on impact indicators of interest.

Page 13: Longitudinal surveys and their role in implementing quasi

ADVANTAGES AND DISADVANTAGES OF EXPERIMENTS

• Main potential advantage is providing a plausible estimate of the counterfactual and thus credible impacts.

• But an important disadvantage:

• Impacts only valid for the population studied.

• Do impacts of Prospera in rural Oaxaca tell us anything about impacts in Guadalajara?

• Angus Deaton: Critique of experiments

• Likely to be more informative and less biased to use survey data with larger and more representative samples with non-experimental methods.

• But how can we judge?

• A tough question….

Page 14: Longitudinal surveys and their role in implementing quasi

CAN NON-EXPERIMENTAL EVALUATIONS REPLICATE THE RESULTS FROM EXPERIMENTAL EVALUATIONS?

• Let’s assume that experimental methods provide the best approximation to the “true impacts”:

• Under what conditions (with what data and methods) can non-experimental evaluations replicate experimental evaluations?

The counterfactual is never observable.

In settings with experimental data, non-experimental data can be used to try to approximate the experimental impacts.

Heckman, Ichimura, and Todd (1997), Dehejia and Wahba (1998, 1999), and Smith and Todd (2004)

Page 15: Longitudinal surveys and their role in implementing quasi

REPLICATING RESULTS FROM EXPERIMENTAL EVALUATIONS.

Heckman, Ichimura and Todd. 1997. “Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme.” Review of Economic Studies.

Shows how the choice of the set of observable variables used to construct the counterfactual affects the ability of non-experimental methods to replicate experimental results.

Some findings:

Biases are higher when using a cruder set of conditioning variables.

→ Implies the need for a detailed set of control variables.

Matching estimators perform best when treatment and comparison groups are located in the same geographical area and the same survey instrument is applied to both.
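To make the role of conditioning variables concrete, here is a minimal sketch of one-to-one nearest-neighbor matching (with replacement) on a single observed covariate. It is not the kernel-based estimator studied by Heckman, Ichimura and Todd; the data-generating process, the covariate `x`, and the true impact of 50 are all hypothetical.

```python
import bisect
import random
import statistics


def simulate(n=4000, true_impact=50.0, seed=1):
    """Hypothetical data: x raises both participation and the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)  # observed covariate
        d = 1 if rng.random() < 0.2 + 0.6 * x else 0
        y = 500.0 + 300.0 * x + true_impact * d + rng.gauss(0.0, 20.0)
        rows.append((x, d, y))
    return rows


def naive_difference(rows):
    """Raw gap in mean outcomes, ignoring the covariate."""
    treated = [y for _, d, y in rows if d == 1]
    control = [y for _, d, y in rows if d == 0]
    return statistics.mean(treated) - statistics.mean(control)


def matching_att(rows):
    """Match each participant to the non-participant with the closest x."""
    controls = sorted((x, y) for x, d, y in rows if d == 0)
    xs = [x for x, _ in controls]
    gaps = []
    for x, d, y in rows:
        if d != 1:
            continue
        i = bisect.bisect_left(xs, x)
        # The nearest control is the neighbor on one side or the other.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(xs)]
        j = min(candidates, key=lambda k: abs(xs[k] - x))
        gaps.append(y - controls[j][1])
    return statistics.mean(gaps)


rows = simulate()
naive = naive_difference(rows)
att = matching_att(rows)
print(f"naive difference:  {naive:.1f}")
print(f"matching estimate: {att:.1f}")
```

Matching works here only because participation depends on nothing except the observed `x`. If an unobserved variable also drove participation, which is the cruder-conditioning-set case described above, the matched estimate would remain biased.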

Page 16: Longitudinal surveys and their role in implementing quasi

REPLICATING (CONTINUED).

• Smith and Todd (2004): difference-in-difference matching estimators perform much better than cross-sectional methods in cases where participants and nonparticipants were drawn from different regional labor markets and/or were given different survey questionnaires.

• Baseline survey (and hence longitudinal data) is critical.

• Pretty challenging to generate credible impacts with non-experimental evaluations using only after-program data.

→ Need longitudinal data.
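The value of a baseline round can be sketched with a simple difference-in-differences calculation on a two-period panel. This illustrates the basic estimator only, not the difference-in-difference matching estimator of Smith and Todd; the panel, the fixed gap of 80 between groups, the common trend of 25, and the true impact of 30 are invented for the example.

```python
import random
import statistics


def simulate_panel(n=5000, true_impact=30.0, seed=2):
    """Hypothetical panel: participants start from a permanently lower level."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        d = 1 if rng.random() < 0.5 else 0
        level = rng.gauss(400.0, 50.0) - 80.0 * d  # time-invariant difference
        trend = 25.0                               # change common to both groups
        y_before = level + rng.gauss(0.0, 10.0)
        y_after = level + trend + true_impact * d + rng.gauss(0.0, 10.0)
        panel.append((d, y_before, y_after))
    return panel


def cross_section_after(panel):
    """Estimate using only the post-program round."""
    treated = [after for d, _, after in panel if d == 1]
    control = [after for d, _, after in panel if d == 0]
    return statistics.mean(treated) - statistics.mean(control)


def diff_in_diff(panel):
    """Change over time for participants minus change for non-participants."""
    treated = [after - before for d, before, after in panel if d == 1]
    control = [after - before for d, before, after in panel if d == 0]
    return statistics.mean(treated) - statistics.mean(control)


panel = simulate_panel()
cs = cross_section_after(panel)
did = diff_in_diff(panel)
print(f"post-program cross-section: {cs:.1f}")
print(f"difference-in-differences:  {did:.1f}")
```

Differencing removes the time-invariant gap between the groups, which the post-program cross-section wrongly attributes to the program. The estimate still relies on the assumption, built into this simulation, that both groups share the same trend in the absence of the program.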

Page 17: Longitudinal surveys and their role in implementing quasi

WHAT CAN GO WRONG WITH EXPERIMENTS AND DATA COLLECTION?

• Selective attrition:

• Program affects the probability of remaining in the sample.

• Common and a big problem.

• Selective attrition puts us back in the non-experimental world of impact evaluation.

• Even attrition which is not selective is a problem:

• Impacts will only reflect those who are interviewed.

• Can be the least affected or least interesting population.

• Evaluation of youth beneficiaries of Prospera in rural areas

• Ten years on, most youth who were children in Prospera no longer lived in their parents' household.

Who are the children who most benefited from increased education?

• Probably those who migrated to urban areas are those who most benefitted.
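A stylized simulation suggests how this kind of attrition can distort even a randomized evaluation. Every number below is an assumption made for illustration (the schooling gains, the migration rule, and the re-interview rates); none comes from the Prospera evaluation data.

```python
import random
import statistics


def simulate(n=20000, seed=3):
    """Hypothetical randomized program where high-gain youth tend to migrate."""
    rng = random.Random(seed)
    youths = []
    for _ in range(n):
        d = 1 if rng.random() < 0.5 else 0
        gain = rng.uniform(0.0, 2.0)  # individual gain in years of schooling
        schooling = rng.gauss(8.0, 1.0) + gain * d
        # Assumed rule: treated youth with large gains move away, and movers
        # are re-interviewed far less often than those who stay.
        migrated = d == 1 and gain > 1.2
        found = rng.random() < (0.2 if migrated else 0.9)
        youths.append((d, schooling, found))
    return youths


def impact(youths):
    """Difference in mean schooling between treatment and control."""
    treated = [s for d, s, _ in youths if d == 1]
    control = [s for d, s, _ in youths if d == 0]
    return statistics.mean(treated) - statistics.mean(control)


youths = simulate()
full_sample = impact(youths)
found_only = impact([y for y in youths if y[2]])
print(f"impact, everyone originally randomized: {full_sample:.2f}")
print(f"impact, only those re-interviewed:      {found_only:.2f}")
```

Because the youth who gained most are the least likely to be found, the estimate based on re-interviewed respondents understates the average impact, even though assignment was random.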

Page 18: Longitudinal surveys and their role in implementing quasi

EVALUATIONS OF LONGER TERM PROGRAM IMPACTS.

• Arguably as important as short-term or initial impact evaluations. Example: Prospera (Progresa/Oportunidades).

• Do initial program impacts hold up over time?

• Is the increase in growth of treatment children relative to control children before age 2 maintained, or do control kids catch up?

• Are there longer term impacts of the program on other variables?

• Does the higher education received by Prospera (Progresa/Oportunidades) beneficiaries lead to higher wages when the kids become adults?

• Surprisingly few studies of long term impacts (almost none in Mexico), but the literature is growing.

• What data do you need to carry out long term evaluations?

Page 19: Longitudinal surveys and their role in implementing quasi

ALTERNATIVES FOR LONGER TERM IMPACT EVALUATIONS

1. Experimental evaluations:

a. Follow up of original evaluation samples.

b. Issues of attrition and mortality.

Some recent examples:

Behrman, Parker and Todd, 2011. “Do School Subsidy Programs provide Lasting Benefits”. Journal of Human Resources.

Baird, Hicks, Kremer and Miguel 2015. “Worms at Work: Long-run Impacts of a Child Health Investment”.

Page 20: Longitudinal surveys and their role in implementing quasi

FOLLOW UP OF PARTICIPANTS: PERRY PRESCHOOL PROJECT

“Understanding the Mechanisms Through Which an Influential Early Childhood Program Boosted Adult Outcomes.” Heckman, Pinto and Savelyev, 2013. American Economic Review.

• Perry Preschool program: targeted poor African American children with low IQ.

• 123 participants randomly assigned in the 1960s

• Continuous followup.

• Only 11 had left the sample by age 40!

• Very large impacts on reductions in arrests, and being on welfare.

• Small sample size but impacts observable because of their size.

Page 21: Longitudinal surveys and their role in implementing quasi

ALTERNATIVES FOR LONGER TERM IMPACT EVALUATIONS

2. Can longitudinal nationally representative data be used to evaluate government programs?

a. For large programs like Seguro Popular and Prospera, yes.

MHAS and MxFLS both have sufficient observations of beneficiaries.

b. Small programs will likely not have enough program participants.

Even for large programs it may be hard to do subgroup or heterogeneity analysis.

Oversampling disadvantaged or poor populations can increase the likelihood that the data will be useful for evaluating future programs.

Page 22: Longitudinal surveys and their role in implementing quasi

MHAS AND SEGURO POPULAR

“Health Insurance and the Aging: Evidence from the Seguro Popular program in Mexico.” (Parker, Saenz, and Wong).

• Historically low coverage of workers in the formal sector (<50%).

• Health insurance program for the informal sector, Seguro Popular, introduced in 2002.

• Expansion by 2015 to more than 50 million individuals, about 40% of all Mexicans.

• MHAS baseline in 2001 serves as the baseline for Seguro Popular.

• 2657 beneficiaries in 2013

• Combined with administrative data on available health services.

• Difference in difference estimation methods.

Page 23: Longitudinal surveys and their role in implementing quasi

SEGURO POPULAR COVERAGE

Table 3. Health insurance for MHAS individuals interviewed in 2001 and 2012.

Insurance status in 2001 and 2012           Urban (>100,000 inhabitants)   Rural (<100,000 inhabitants)
With insurance in 2001 and 2012             71.5%                          39.1%
With insurance in 2001, without in 2012     2.6%                           1.9%
With insurance in 2012, without in 2001     19.9%                          46.4%
Without insurance in 2001 and 2012          6.0%                           12.6%
Total                                       100.0%                         100.0%

Source: Author’s calculations using the Mexican Health and Aging Study.

Page 24: Longitudinal surveys and their role in implementing quasi

EXAMPLES OF LONG TERM EVALUATIONS: CURRIE AND THOMAS, 1995

• Study of long term impacts of Head Start, an early childhood program for poor children in the United States.

• Early study of long term impacts of a public program.

• National Longitudinal Survey of Youth

• Oversample of poor population.

• Sample of about 5000 children, 927 participated in Head Start.

• Compares siblings who participate with siblings who don’t participate.

• Key feature enabling evaluation: oversampling of the poor population.

Page 25: Longitudinal surveys and their role in implementing quasi

ALTERNATIVES FOR LONGER TERM IMPACT EVALUATIONS

3. Linking administrative data to evaluation sample data.

“How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project STAR” (Raj Chetty, John Friedman, Nathaniel Hilger, Emmanuel Saez, Diane Schanzenbach, and Danny Yagan), Quarterly Journal of Economics, 2011.

Page 26: Longitudinal surveys and their role in implementing quasi

USE OF ADMINISTRATIVE RECORDS: MERGING SCHOOL RECORDS AND TAX RETURNS.

• STAR program: randomly assigned kindergarten children to smaller (15 students) or larger (22 students) classrooms.

• Findings: Students who were randomly assigned to higher quality classrooms in grades K-3 earn more, are more likely to attend college, save more for retirement, and live in better neighborhoods.

• HOW?

• Linked kindergarten school records to tax records.

• (Using Social Security number, date of birth, gender, and names)
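The linkage step can be sketched as a deterministic match on normalized identifying fields. This is an illustration only, not the procedure used by Chetty et al.: the records, the field names (`student_id`, `class_type`, `earnings`), and the rule of accepting only unique matches are all hypothetical, and the sketch leaves out the Social Security number, which real linkage would rely on first.

```python
import unicodedata


def normalize_name(name):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c if c.isalnum() or c.isspace() else " " for c in no_accents)
    return " ".join(kept.lower().split())


def link_key(record):
    """Composite key: normalized name, date of birth, gender."""
    return (normalize_name(record["name"]), record["date_of_birth"], record["gender"])


def link(school_records, tax_records):
    """Attach earnings to school records, accepting unique matches only."""
    by_key = {}
    for rec in tax_records:
        by_key.setdefault(link_key(rec), []).append(rec)
    linked, unmatched = [], []
    for rec in school_records:
        candidates = by_key.get(link_key(rec), [])
        if len(candidates) == 1:
            linked.append({
                "student_id": rec["student_id"],
                "class_type": rec["class_type"],
                "earnings": candidates[0]["earnings"],
            })
        else:
            unmatched.append(rec)  # no match, or an ambiguous one
    return linked, unmatched


# Hypothetical records, invented for illustration.
school = [
    {"student_id": "S1", "name": "María  Pérez", "date_of_birth": "1980-03-14",
     "gender": "F", "class_type": "small"},
    {"student_id": "S2", "name": "John O'Neil", "date_of_birth": "1980-07-02",
     "gender": "M", "class_type": "regular"},
    {"student_id": "S3", "name": "Ana Lopez", "date_of_birth": "1980-11-30",
     "gender": "F", "class_type": "small"},
]
tax = [
    {"name": "MARIA PEREZ", "date_of_birth": "1980-03-14", "gender": "F",
     "earnings": 31000},
    {"name": "John O Neil", "date_of_birth": "1980-07-02", "gender": "M",
     "earnings": 27000},
]

linked, unmatched = link(school, tax)
print(len(linked), "linked;", len(unmatched), "unmatched")
```

Records that fail to link drop out of the analysis much as attriters drop out of a survey, so it is worth checking whether the match rate differs between treatment and control groups.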

Page 27: Longitudinal surveys and their role in implementing quasi

ADMINISTRATIVE DATA EVALUATION OF SEGURO POPULAR

• Seguro Popular is only for workers in the informal sector.

• Does this create incentives for workers to choose informal sector jobs over formal sector jobs?

Mariano Bosch and Raymundo M. Campos-Vazquez, 2014. “The Trade-Offs of Welfare Policies in Labor Markets with Informal Jobs: The Case of the ‘Seguro Popular’ Program in Mexico.” American Economic Journal: Economic Policy.

Administrative social security data at the municipality level are merged with administrative data on the number of Seguro Popular beneficiaries to study the change in the creation of formal jobs under Seguro Popular.

Page 28: Longitudinal surveys and their role in implementing quasi

SOME TENTATIVE CONCLUSIONS AND SUGGESTIONS ON DATA.

• Longitudinal data (before and after) are a must for obtaining credible program impacts, for both experimental and non-experimental methods.

• Household level surveys

• Oversampling the poor population increases the probability of having sufficient beneficiaries from disadvantaged populations.

• Or why not a longitudinal panel focusing only on the poor population?

• Linking administrative data on tax returns to household surveys or school data.

• Through CURP, IMSS social security, RFC.

• CURP can be looked up on the internet.

• Large sample size, unless you are pretty sure your program will have very large impacts….
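The link between sample size and detectable impacts can be made concrete with the standard minimum detectable effect formula for a two-arm comparison of means. This is a rough sketch: it assumes individual-level random assignment, a two-sided test, and no clustering or attrition, all of which would enlarge the detectable effect in practice. The function name and default values are illustrative choices.

```python
import math
from statistics import NormalDist


def minimum_detectable_effect(n, sd, share_treated=0.5, alpha=0.05, power=0.80):
    """Smallest true impact detectable with the given power (two-sided test).

    n is the total sample size and sd the outcome's standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Standard error of the difference in means between the two arms.
    se = sd * math.sqrt(1.0 / (share_treated * (1.0 - share_treated) * n))
    return (z_alpha + z_power) * se


# Effects in standard-deviation units (sd=1) for two sample sizes:
# 123 matches the Perry Preschool sample; 5000 is an arbitrary larger survey.
mde_small = minimum_detectable_effect(n=123, sd=1.0)
mde_large = minimum_detectable_effect(n=5000, sd=1.0)
print(f"n=123:  {mde_small:.2f} sd")
print(f"n=5000: {mde_large:.2f} sd")
```

With a sample the size of Perry Preschool, only impacts of roughly half a standard deviation are reliably detectable under these assumptions, which is consistent with the point that its effects were observable because they were so large.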