AFRICA IMPACT EVALUATION INITIATIVE, AFTRL
Africa Program for Education Impact Evaluation
David Evans
Impact Evaluation Cluster, AFTRL
Slides by Paul J. Gertler & Sebastian Martinez
Impact Evaluation Methods: Difference in Difference & Matching
Measuring Impact
► Randomized Experiments
► Quasi-experiments
    Randomized Promotion – Instrumental Variables
    Regression Discontinuity
    Double differences (Diff in diff)
    Matching
Case 5: Diff in diff
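► Compare change in outcomes between the treatment and non-treatment groups
    Impact is the difference in the change in outcomes
► Impact = (Y_t1 - Y_t0) - (Y_c1 - Y_c0)
    where t = treatment group, c = control group, 0 = before the program, 1 = after the program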
[Figure: Outcome plotted against Time for the Treatment Group and the Control Group, with the start of Treatment marked and the Average Treatment Effect labeled.]
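As a minimal sketch of the double-difference formula above, the impact can be computed from four group means. The function name and the numbers are hypothetical and only illustrate the arithmetic.

```python
def diff_in_diff(y_t0, y_t1, y_c0, y_c1):
    """Impact = (Y_t1 - Y_t0) - (Y_c1 - Y_c0)."""
    change_treatment = y_t1 - y_t0   # change in the treatment group
    change_control = y_c1 - y_c0     # change in the control group
    return change_treatment - change_control


# Hypothetical mean outcomes before (0) and after (1) the program.
impact = diff_in_diff(y_t0=50.0, y_t1=70.0, y_c0=40.0, y_c1=45.0)
print(impact)  # (70 - 50) - (45 - 40) = 15.0
```

The control group's change stands in for what would have happened to the treatment group without the program, which is why it is subtracted.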
[Figure: Outcome plotted against Time for the Treatment Group and the Control Group, with the start of Treatment marked; labeled "Measured effect without pre-measurement".]
[Figure: Outcome plotted against Time for the Treatment Group and the Control Group, with the start of Treatment marked; the Estimated Average Treatment Effect is shown alongside the Average Treatment Effect.]
Diff in diff
► What is the key difference between these two cases?
► Fundamental assumption that trends (slopes) are the same in treatments and controls (sometimes true, sometimes not)
► Need a minimum of three points in time to verify this and estimate treatment (two pre-intervention)
[Figure: Outcome plotted against Time for the Treatment Group and the Control Group at the first, second and third observation, with the start of Treatment marked and the Average Treatment Effect labeled.]
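The three-observation check can be sketched roughly as follows, assuming one mean outcome per group at each observation (two before the intervention, one after). All names and numbers are hypothetical.

```python
def pre_trend(y_first, y_second):
    """Change between the two pre-intervention observations."""
    return y_second - y_first


def diff_in_diff(y_t0, y_t1, y_c0, y_c1):
    """Change in the treatment group minus change in the control group."""
    return (y_t1 - y_t0) - (y_c1 - y_c0)


# Hypothetical group means at the first, second and third observation.
treatment = [30.0, 35.0, 50.0]
control = [20.0, 25.0, 30.0]

# Compare the pre-intervention trends (slopes) of the two groups.
trend_gap = pre_trend(treatment[0], treatment[1]) - pre_trend(control[0], control[1])
print("pre-trend gap:", trend_gap)  # 0.0 means identical pre-trends

# Estimate the impact from the second and third observations.
impact = diff_in_diff(treatment[1], treatment[2], control[1], control[2])
print("diff-in-diff estimate:", impact)  # (50 - 35) - (30 - 25) = 10.0
```

A pre-trend gap far from zero suggests the equal-trends assumption is doubtful. A gap near zero supports the assumption but does not prove it holds after the intervention.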
Examples
► Two neighboring school districts
    School enrollment or test scores are improving at same rate before the program (even if at different levels)
    One receives program, one does not
► Neighboring _______
Case 5: Diff in Diff
                    Not Enrolled   Enrolled   t-stat
Mean change CPC     8.26           35.92      10.31
Case 5 - Diff in Diff
                          Linear Regression   Multivariate Linear Regression
Estimated Impact on CPC   27.66**             25.53**
                          (2.68)              (2.77)
** Significant at 1% level
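As a quick check on the two tables above, the simple diff-in-diff estimate is the difference between the mean changes of the two groups. The sketch uses only the reported numbers. Reading the parenthesized value as a standard error is an assumption, since the tables do not label it.

```python
# Mean change in CPC, from the table above.
mean_change_not_enrolled = 8.26
mean_change_enrolled = 35.92

# Diff-in-diff: change among the enrolled minus change among the not enrolled.
impact = mean_change_enrolled - mean_change_not_enrolled
print(round(impact, 2))  # 27.66, the linear regression estimate

# ASSUMPTION: (2.68) is the standard error of that estimate.
t_stat = impact / 2.68
print(round(t_stat, 2))  # close to the reported t-stat of 10.31
```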
Case 5 - Diff in Diff
Impact Evaluation Example – Summary of Results

Case                                Method                           Estimated Impact on CPC
Case 1 - Before and After           Multivariate Linear Regression   34.28** (2.11)
Case 2 - Enrolled/Not Enrolled      Multivariate Linear Regression   -4.15   (4.05)
Case 3 - Randomization              Multivariate Linear Regression   29.79** (3.00)
Case 4 - Regression Discontinuity   Multivariate Linear Regression   30.58** (5.93)
Case 5 - Diff in Diff               Multivariate Linear Regression   25.53** (2.77)
** Significant at 1% level
Example
► Old-age pensions and schooling in South Africa
    Eligible if household member over 60
    Not eligible if under 60
        • Used households with a member aged 55-60
    Pensions for women and girls’ education
Measuring Impact
► Randomized Experiments
► Quasi-experiments
    Randomized Promotion – Instrumental Variables
    Regression Discontinuity
    Double differences (Diff in diff)
    Matching
Matching
► Pick the ideal comparison group that matches the treatment group from a larger survey.
► The matches are selected on the basis of similarities in observed characteristics. For example?
► This assumes no selection bias based on unobserved characteristics.
    Example: income
    Example: entrepreneurship
Source: Martin Ravallion
Propensity-Score Matching (PSM)
► Controls: non-participants with same characteristics as participants
    In practice, it is very hard. The entire vector of X observed characteristics could be huge.
► Match on the basis of the propensity score
    P(X_i) = Pr(participation_i = 1 | X_i)
    Instead of aiming to ensure that the matched control for each participant has exactly the same value of X, the same result can be achieved by matching on the probability of participation.
    This assumes that participation is independent of outcomes given X (not true if important unobserved characteristics are affecting participation)
Steps in Score Matching
1. Representative & highly comparable survey of non-participants and participants.
2. Pool the two samples and estimate a logit (or probit) model of program participation. This gives the probability of participating for a person with characteristics X.
3. Restrict samples to assure common support (lack of common support is an important source of bias in observational studies).
4. For each participant, find a sample of non-participants that have similar propensity scores.
5. Compare the outcome indicators. The difference is the estimate of the gain due to the program for that observation.
6. Calculate the mean of these individual gains to obtain the average overall gain.
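The last four steps can be sketched as below, assuming the propensity scores have already been estimated in step 2. The function names, the choice of k nearest neighbors, and the data are hypothetical. The slides do not say which matching rule was used.

```python
def common_support(scores_t, scores_c):
    """Overlap of the two score ranges: [max of minima, min of maxima]."""
    low = max(min(scores_t), min(scores_c))
    high = min(max(scores_t), max(scores_c))
    return low, high


def matched_gain(participants, non_participants, k=2):
    """Average gain from k-nearest-neighbor matching on the propensity score.

    Each unit is a (propensity_score, outcome) pair.
    """
    low, high = common_support([p for p, _ in participants],
                               [p for p, _ in non_participants])
    # Step 3: keep only units inside the region of common support.
    treated = [u for u in participants if low <= u[0] <= high]
    pool = [u for u in non_participants if low <= u[0] <= high]

    gains = []
    for score, outcome in treated:
        # Step 4: the k non-participants with the closest scores.
        nearest = sorted(pool, key=lambda u: abs(u[0] - score))[:k]
        # Step 5: participant outcome minus mean outcome of its matches.
        gains.append(outcome - sum(y for _, y in nearest) / len(nearest))
    # Step 6: mean of the individual gains.
    return sum(gains) / len(gains)


# Hypothetical (score, outcome) pairs, not taken from the slides.
participants = [(0.9, 30.0), (0.6, 24.0), (0.5, 20.0)]
non_participants = [(0.1, 8.0), (0.45, 15.0), (0.55, 17.0), (0.65, 19.0)]
print(matched_gain(participants, non_participants))  # 4.0
```

In this toy data the participant with score 0.9 has no comparable non-participant and is dropped, so the estimate applies only to participants inside the common support.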
[Figure: Density plotted against the propensity score (0 to 1), showing the density of scores for participants, the region of common support, and the label "High probability of participating given X".]
PSM vs an experiment
► Pure experiment does not require the untestable assumption of independence conditional on observables
► PSM requires large samples and good data
Lessons on Matching Methods
► Typically used for IE when neither randomization, RD, nor other quasi-experimental options are possible (e.g. no baseline)
    Be cautious of ex-post matching:
        • Matching on variables that change due to participation (i.e., endogenous)
        • What are some variables that won’t change?
► Matching helps control for OBSERVABLE differences
More Lessons on Matching Methods
► Matching at baseline can be very useful:
    Estimation:
        • Combine with other techniques (e.g. diff in diff)
        • Know the assignment rule (match on this rule)
    Sampling:
        • Selecting non-randomized control sample
► Need good quality data
    Common support can be a problem
Case 7: Matching
Case 7 - PROPENSITY SCORE: Pr(treatment=1)
Variable       Coef.   Std. Err.
Age Head       -0.03   0.00
Educ Head      -0.05   0.01
Age Spouse     -0.02   0.00
Educ Spouse    -0.06   0.01
Ethnicity       0.42   0.04
Female Head    -0.23   0.07
Constant        1.6    0.10
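To show how such coefficients turn into a propensity score, the sketch below plugs them into a logit link. The logit link is an assumption: the slides mention "logit (or probit)" without saying which one produced this table. The dictionary keys and the example household are hypothetical.

```python
import math

# Coefficients from the table above (key names are hypothetical).
COEF = {
    "age_head": -0.03,
    "educ_head": -0.05,
    "age_spouse": -0.02,
    "educ_spouse": -0.06,
    "ethnicity": 0.42,
    "female_head": -0.23,
}
CONSTANT = 1.6


def propensity_score(household):
    """Pr(treatment=1 | X), ASSUMING a logit link (the source does not say)."""
    index = CONSTANT + sum(COEF[name] * household[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical household, not taken from the data.
household = {"age_head": 40, "educ_head": 3, "age_spouse": 35,
             "educ_spouse": 3, "ethnicity": 0, "female_head": 0}
print(round(propensity_score(household), 3))
```

With negative coefficients on age and education, older and more educated households get lower predicted probabilities of treatment under this assumed model.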
P-score Quintiles

              Quintile 1               Quintile 2               Quintile 3               Quintile 4               Quintile 5
Xi            T      C      t-score    T      C      t-score    T      C      t-score    T      C      t-score    T      C      t-score
Age Head      68.04  67.45  -1.2       53.61  53.38  -0.51      44.16  44.68  1.34       37.67  38.2   1.72       32.48  32.14  -1.18
Educ Head     1.54   1.97   3.13       2.39   2.69   1.67       3.25   3.26   -0.04      3.53   3.43   -0.98      2.98   3.12   1.96
Age Spouse    55.95  55.05  -1.43      46.5   46.41  0.66       39.54  40.01  1.86       34.2   34.8   1.84       29.6   29.19  -1.44
Educ Spouse   1.89   2.19   2.47       2.61   2.64   0.31       3.17   3.19   0.23       3.34   3.26   -0.78      2.37   2.72   1.99
Ethnicity     0.16   0.11   -2.81      0.24   0.27   -1.73      0.3    0.32   1.04       0.14   0.13   -0.11      0.7    0.66   -2.3
Female Head   0.19   0.21   0.92       0.42   0.16   -1.4       0.092  0.088  -0.35      0.35   0.32   -0.34      0.008  0.008  0.83
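A balance check of this kind compares the T and C means of each characteristic within a quintile. The sketch below computes a Welch t-statistic for one variable. Which test produced the t-scores in the table is not stated, so the formula, the sign convention (C minus T) and the data are all assumptions.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def balance_t(treated, comparison):
    """Welch t-statistic for the difference in means (comparison minus treated)."""
    se = math.sqrt(sample_variance(treated) / len(treated)
                   + sample_variance(comparison) / len(comparison))
    return (mean(comparison) - mean(treated)) / se


# Hypothetical ages of household heads within one quintile.
age_t = [66.0, 70.0, 68.0, 69.0]
age_c = [67.0, 65.0, 69.0, 68.0]
print(round(balance_t(age_t, age_c), 2))
```

A t-score that is small in absolute value suggests the two groups are balanced on that characteristic within the quintile.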
Case 7: Matching
                          Linear Regression   Multivariate Linear Regression
Estimated Impact on CPC   1.16                7.06+
                          (3.59)              (3.65)
** Significant at 1% level, + Significant at 10% level
Case 7 - Matching
Impact Evaluation Example – Summary of Results

Case                                Method                           Estimated Impact on CPC
Case 1 - Before and After           Multivariate Linear Regression   34.28** (2.11)
Case 2 - Enrolled/Not Enrolled      Multivariate Linear Regression   -4.15   (4.05)
Case 3 - Randomization              Multivariate Linear Regression   29.79** (3.00)
Case 4 - Regression Discontinuity   Multivariate Linear Regression   30.58** (5.93)
Case 5 - Diff in Diff               Multivariate Linear Regression   25.53** (2.77)
Case 6 - IV (TOT)                   2SLS                             30.44** (3.07)
Case 7 - Matching                   Multivariate Linear Regression   7.06+   (3.65)
** Significant at 1% level, + Significant at 10% level
Measuring Impact
► Experimental design/randomization
► Quasi-experiments
    Regression Discontinuity
    Double differences (Diff in diff)
    Other options
        • Instrumental Variables
        • Matching
    Combinations of the above
Remember…..
► Objective of impact evaluation is to estimate the CAUSAL effect of a program on outcomes of interest
► In designing the program we must understand the data generation process
    behavioral process that generates the data
    how benefits are assigned
► Fit the best evaluation design to the operational context
Designs: when to use, advantages, disadvantages

Randomization
    When to use: Whenever possible; when an intervention will not be universally implemented
    Advantages: Gold standard; most powerful
    Disadvantages: Not always feasible; not always ethical
Random Promotion
    When to use: When an intervention is universally implemented
    Advantages: Learn and intervention
    Disadvantages: Only looks at sub-group of sample
Regression Discontinuity
    When to use: If an intervention is assigned based on rank
    Advantages: Assignment based on rank is common
    Disadvantages: Only looks at sub-group of sample
Double differences
    When to use: If two groups are growing at similar rates
    Advantages: Eliminates fixed differences not related to treatment
    Disadvantages: Can be biased if trends change
Matching
    When to use: When other methods are not possible
    Advantages: Overcomes observed differences between treatment and comparison
    Disadvantages: Assumes no unobserved differences (often implausible)