Impact Evaluation
Impact Evaluation for Evidence-Based Policy Making
Arianna Legovini
Lead, Africa Impact Evaluation Initiative
AFTRL
2
Answer Three Questions
Why is evaluation valuable?
What makes a good impact evaluation?
How to implement evaluation?
3
IE Answers: How do we turn this teacher…
4
…into this teacher?
5
Why Evaluate? Need evidence on what works
Allocate a limited budget
Fiscal accountability
Improve the program/policy over time: operational research, managing by results
Information is key to sustainability: negotiating budgets, informing constituents and managing the press, informing donors
6
What is different between traditional M&E and Impact Evaluation?
Monitoring tracks implementation efficiency (input to output)
Impact evaluation measures effectiveness (output to outcome)
[Diagram: $$$ inputs → outputs → outcomes; monitor efficiency along the input-output chain, evaluate effectiveness (behavior change) along the output-outcome chain]
7
Question types and methods
Process Evaluation / Monitoring: descriptive analysis
Is the program being implemented efficiently?
Is the program targeting the right population?
Are outcomes moving in the right direction?
Impact Evaluation: causal analysis
What was the effect of the program on outcomes?
How would outcomes change under alternative program designs?
Does the program impact people differently (e.g. females, the poor, minorities)?
Is the program cost-effective?
8
Which can be answered by traditional M&E and which by IE?
Are ITNs being delivered as planned? (M&E)
Does school-based delivery of malaria treatment increase school attendance? (IE)
What is the correlation between health coverage and under-fives receiving treatment within 24 hours of fever onset? (M&E)
Does the house-to-house approach lead to an increase in under-fives sleeping under ITNs relative to the level in communities with other community-based approaches? (IE)
9
Types of Impact Evaluation
Efficacy: proof of concept, piloted under ideal conditions
Effectiveness: at scale, under normal circumstances and capabilities; lower or higher impact? higher or lower costs?
10
So, use impact evaluation to….
Test innovations
Scale up what works (e.g. de-worming)
Cut or change what does not (e.g. HIV counseling)
Measure the effectiveness of programs (e.g. JTPA)
Find the best tactics to change people’s behavior (e.g. come to the clinic)
Manage expectations
Example: PROGRESA/OPORTUNIDADES (Mexico): transition across presidential terms, expansion to 5 million households, change in benefits, battle with the press
11
Next question please
Why is evaluation valuable?
What makes a good impact evaluation?
How to implement evaluation?
12
Assessing impact: examples
How much does an anti-malaria program lower under-five mortality?
What is the beneficiary’s health status with the program compared to without the program?
Ideally, compare the same individual with and without the program at the same point in time
But we never observe the same individual with and without the program at the same point in time
13
Solving the evaluation problem
Counterfactual: what would have happened without the program
Need to estimate counterfactual i.e. find a control or comparison group
Counterfactual criteria:
The treated and counterfactual groups have identical initial characteristics, on average
The only reason for the difference in outcomes is the intervention
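To make the counterfactual idea concrete, here is a minimal Python sketch with invented numbers: each person has two potential outcomes, one with and one without the program, but only one of the two is ever observed, so the true impact can never be computed person by person and must instead be estimated from a comparison group that satisfies the criteria above.

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical population: each person has two potential outcomes,
# y0 (outcome without the program) and y1 (outcome with the program).
population = []
for _ in range(1000):
    y0 = random.gauss(50, 10)   # health score without the program
    y1 = y0 + 5                 # the program raises the score by 5 points
    population.append((y0, y1))

# The true impact uses both potential outcomes for the same person...
true_impact = mean(y1 - y0 for y0, y1 in population)

# ...but in reality each person is either treated or not, so only one of
# the two outcomes is ever observed per person; the other one (the
# counterfactual) has to be estimated from a comparison group.
print(f"true average impact (never directly observable): {true_impact:.2f}")
```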
14
2 “Counterfeit” Counterfactuals
Before and after: the same individual before the treatment
Non-participants: those who chose not to enroll in the program, or those who were not offered the program
15
Before and After Example
Food aid: compare mortality before and after; find an increase in mortality. Did the program fail?
“Before” was a normal year, but “after” was a famine year
Cannot separate (identify) the effect of the food aid from the effect of the drought or epidemic
16
Before and After
Compare Y before and after the intervention
B: before-after counterfactual; A-B: estimated impact
Controlling for time-varying factors, C: true counterfactual; A-C: true impact
A-B under-estimates the true impact
[Figure: outcome Y over time, before (t-1) and after (t) the treatment, showing the observed post-treatment outcome A, the before-after counterfactual B, and the true counterfactual C]
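A small simulation can reproduce the point of this figure. In the sketch below (hypothetical numbers only), a food-aid program lowers under-five mortality, but a drought raises mortality between the baseline year (t-1) and the follow-up year (t); the before-after comparison (A-B) then under-estimates the program’s true impact (A-C).

```python
import random
from statistics import mean

random.seed(2)
n = 2000

# Hypothetical mortality rates per 1,000 (illustrative numbers only).
baseline = [random.gauss(100, 15) for _ in range(n)]    # B: outcome in year t-1

drought_effect = +30    # famine pushes mortality up between t-1 and t
program_effect = -20    # food aid pushes mortality down

# A: observed outcome after the program, in the drought year.
after_with_program = [y + drought_effect + program_effect for y in baseline]
# C: true counterfactual - the drought year without the program.
after_without_program = [y + drought_effect for y in baseline]

A = mean(after_with_program)
B = mean(baseline)
C = mean(after_without_program)

print(f"before-after estimate (A - B): {A - B:+.1f}")  # mortality appears to rise
print(f"true impact           (A - C): {A - C:+.1f}")  # the program actually helped
```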
17
Non-Participants….
Compare non-participants to participants
Counterfactual: non-participant outcomes
Problem: why did they not participate?
18
Exercise: Why might participants and non-participants differ?
Mothers who came to the health unit for ORT and mothers who did not: child had diarrhea; access to the clinic
Communities that applied for funds for IRT and communities that did not: coastal vs. mountain; epidemic vs. non-epidemic
Children who received ACT and children who did not: child had fever; access to the clinic
19
Health program example
Treatment is offered. Who signs up? Those who are sick; areas with epidemics
They have lower health status than those who do not sign up
Healthy people/communities are a poor estimate of the counterfactual
20
Health insurance example
Health insurance offered
Who buys health insurance?
Who does not buy?
Compare health care utilization of those who got insurance to those who did not
Cannot separately identify the impact of insurance on utilization from the reasons people chose to buy it
21
What's wrong?
Selection bias: People choose to participate for specific reasons
Often these reasons are directly related to the outcome of interest (health insurance: health status and medical expenditures)
Cannot separately identify the impact of the program from these other factors/reasons
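A toy simulation (all numbers invented) makes the selection problem visible: when sicker people are the ones who enroll, the naive participant versus non-participant comparison mixes the program’s effect with the pre-existing health gap.

```python
import random
from statistics import mean

random.seed(4)

true_effect = 5          # the program really improves health scores by 5 points
participants, non_participants = [], []

for _ in range(5000):
    health = random.gauss(50, 10)   # pre-program health status
    enrolls = health < 45           # sicker people are the ones who sign up
    if enrolls:
        participants.append(health + true_effect)
    else:
        non_participants.append(health)

naive_estimate = mean(participants) - mean(non_participants)
print(f"true effect:    {true_effect:+.1f}")
print(f"naive estimate: {naive_estimate:+.1f}")  # badly biased by selection
```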
22
Program placement example
Government offers family planning program to villages with high fertility
Compare fertility in villages offered program to fertility in villages not offered
The program is targeted based on fertility, so treatment villages have high fertility and counterfactual villages have low fertility
Cannot separately identify program impact from geographic targeting criteria
23
Need to know…
Why some get the program and others do not; how some get into the treatment group and others into the control group
If the reasons are correlated with the outcome, we cannot identify/separate the program impact from other explanations of the differences in outcomes
In short: the process by which the data are generated
24
Possible Solutions…
Guarantee comparability of treatment and control groups
ONLY remaining difference is intervention
In this workshop we will consider:
Experimental design/randomization
Quasi-experiments: regression discontinuity, double differences, instrumental variables
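As one example of the quasi-experimental methods listed above, here is a minimal double-difference (difference-in-differences) sketch with invented data: the common time trend that contaminates a simple before-after comparison is removed by subtracting the change observed in a comparison group.

```python
import random
from statistics import mean

random.seed(3)
n = 1000

# Hypothetical outcomes (illustrative numbers only).
trend = 8    # change that would happen even without the program
effect = 5   # true program effect

treat_before   = [random.gauss(40, 6) for _ in range(n)]
control_before = [random.gauss(40, 6) for _ in range(n)]
treat_after    = [y + trend + effect for y in treat_before]
control_after  = [y + trend for y in control_before]

# Double difference: (change in the treated group) minus (change in the controls).
did = (mean(treat_after) - mean(treat_before)) - (mean(control_after) - mean(control_before))

print(f"before-after only: {mean(treat_after) - mean(treat_before):+.1f}")  # trend + effect
print(f"double difference: {did:+.1f}")                                     # ~ true effect
```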
25
These solutions all involve…
Randomization: give everyone an equal chance of being in the control or treatment group
This guarantees that all factors/characteristics will be equal, on average, between the groups
The only difference is the intervention
If randomization is not possible, we need transparent and observable criteria for who is offered the program
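The sketch below (hypothetical numbers again) illustrates why randomization works: a coin flip decides who is treated, so baseline characteristics are balanced on average, and a simple difference in mean outcomes recovers the true effect.

```python
import random
from statistics import mean

random.seed(5)

true_effect = 5
treated, control = [], []
treated_baseline, control_baseline = [], []

for _ in range(5000):
    health = random.gauss(50, 10)             # baseline characteristic
    if random.random() < 0.5:                 # random assignment: equal chance for all
        treated_baseline.append(health)
        treated.append(health + true_effect)  # outcome with the program
    else:
        control_baseline.append(health)
        control.append(health)                # outcome without the program

print(f"baseline difference: {mean(treated_baseline) - mean(control_baseline):+.2f}")  # ~0
print(f"estimated impact:    {mean(treated) - mean(control):+.2f}")                    # ~ true effect
```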
26
The Last Question
Why is evaluation valuable?
What makes a good impact evaluation?
How to implement evaluation?
27
Implementation Issues
Political economy
Policy context
Finding a good control
Retrospective versus prospective designs
Making the design compatible with operations
Ethical issues
Relationship to “results” monitoring
28
Political Economy
What is the policy purpose?
In the USA: derail from national policy, defend the budget
In RSA: answer to the electorate
In Mexico: allocate budget to poverty programs
In an IDA country: pressure to demonstrate aid effectiveness and scale up
In a poor country: hard constraints and ambitious targets
29
Political Economy
Cultural shift
From retrospective evaluation: look back and judge
To prospective evaluation: decide what we need to learn, experiment with alternatives, measure and inform, adopt better alternatives over time
Change in incentives: rewards for changing programs that do not work, rewards for generating knowledge, separating job performance from knowledge generation
30
The Policy Context
Address policy-relevant questions:
What policy questions need to be answered?
What outcomes answer those questions?
What indicators measure those outcomes?
How much of a change in the outcomes would determine success?
Example: teacher performance-based pay. Scale up the pilot? Criterion: at least a 10% increase in test scores with no change in unit costs
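For the scale-up decision just described, the check itself is simple arithmetic once the evaluation has produced estimates. A minimal sketch with invented pilot numbers:

```python
# Hypothetical pilot results (illustrative numbers only).
test_score_control = 50.0   # mean test score without performance-based pay
test_score_treated = 56.0   # mean test score with performance-based pay
unit_cost_control  = 100.0  # cost per student, status quo
unit_cost_treated  = 101.0  # cost per student, pilot

score_gain  = (test_score_treated - test_score_control) / test_score_control
cost_change = (unit_cost_treated - unit_cost_control) / unit_cost_control

# Decision rule from the slide: scale up only if test scores rise by at
# least 10% with no increase in unit costs.
scale_up = score_gain >= 0.10 and cost_change <= 0.0
print(f"score gain: {score_gain:.1%}, cost change: {cost_change:.1%}, scale up: {scale_up}")
```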
31
Prospective designs
Use opportunities to generate good control groups
Most programs cannot deliver benefits to all those eligible
Budgetary limitations: eligible people who get the program are potential treatments; eligible people who do not are potential controls
Logistical limitations: those who go first are potential treatments; those who go later are potential controls
32
Who gets the program?
Eligibility criteria: Are benefits targeted? How are they targeted? Can we rank eligibles by priority? Are the measures good enough for fine rankings?
Who goes first? Roll-out: equal chance to go first, second, third?
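When roll-out order is what creates the comparison group, the assignment can be a simple public lottery. A minimal sketch (the village names are placeholders) giving every eligible unit an equal chance of going in phase 1, 2, or 3, so that later phases can serve as controls for earlier ones:

```python
import random

random.seed(6)

# Hypothetical list of eligible villages (names are placeholders).
eligible = [f"village_{i:03d}" for i in range(1, 31)]

# Shuffle so every village has an equal chance of going first, second, or
# third, then split into three roll-out phases of equal size.
random.shuffle(eligible)
phase_size = len(eligible) // 3
phases = {
    "phase_1 (year 1 treatment)": eligible[:phase_size],
    "phase_2 (year 2 treatment)": eligible[phase_size:2 * phase_size],
    "phase_3 (year 3 treatment)": eligible[2 * phase_size:],
}

for phase, villages in phases.items():
    print(phase, villages[:3], "...")
```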
33
Ethical Considerations
Do not delay benefits: Rollout based on budget/administrative constraints
Equity: equally deserving beneficiaries deserve an equal chance of going first
Transparent & accountable method
Give everyone eligible an equal chance
If ranking is based on some criteria, the criteria should be quantitative and public
34
Retrospective Designs
Hard to find good control groups; must live with arbitrary or unobservable allocation rules
Administrative data must be good enough to show that the program was implemented as described
Need a pre-intervention baseline survey, on both controls and treatments, with covariates to control for initial differences
Without a baseline it is difficult to use quasi-experimental methods
35
Manage for results
Retrospective evaluation cannot be used to manage for results
Use resources wisely: do a prospective evaluation design
Better methods
More tailored policy questions
Precise estimates
Timely feedback and program changes
Improve results on the ground
36
Monitoring Systems
Projects/programs regularly collect data for management purposes
Typical content: lists of beneficiaries, distribution of benefits, expenditures, outcomes, ongoing process evaluation
This information is needed for impact evaluation
37
Evaluation uses information to:
Verify who is a beneficiary, when they started, and what benefits were actually delivered
Necessary condition for the program to have an impact: benefits need to get to the targeted beneficiaries
38
Improve the use of monitoring data for IE
Program monitoring data are usually collected only in areas where the program is active
Collect a baseline for the control areas as well
Very cost-effective, since there is little need for additional special surveys
Add a couple of outcome indicators
Most IEs use only monitoring data
39
Overall Messages
Impact evaluation is useful for validating program design, adjusting program structure, and communicating to the finance ministry and civil society
A good evaluation design requires estimating the counterfactual: what would have happened to beneficiaries if they had not received the program
Need to know all the reasons why beneficiaries got the program and others did not
40
Design Messages
Address policy questions: what is interesting is what the government needs and will use
Stakeholder buy-in
Easiest to use prospective designs
Good monitoring systems and administrative data can improve IE and lower costs