![Page 1: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/1.jpg)
Evaluation methods for measuring the impact of social protection programs
Joost de Laat, Menahem Prywes, Shafique Jamal
The World Bank
![Page 2: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/2.jpg)
Objectives:
Understand:
o Principles of the difference-in-differences method of project evaluation, and the weaknesses of the method.
o Principles of the randomized controlled trial (RCT) method.
o Limits and weaknesses of the randomized controlled trial method.
o Principles of the regression discontinuity design (RDD) method.
![Page 3: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/3.jpg)
How donors evaluated development projects
Donors often couldn’t evaluate development projects, and especially health projects, convincingly because:
o No one bothered to collect baseline data.
o No one tracked the treatment (beneficiary) group over time.
Sometimes donors collected this information and then measured changes in the treatment group over time. However, it remained unclear whether this performance was better or worse than that of comparator (control) groups.
Sometimes donors applied the difference-in-differences method. This compares the change in results for the treatment group with the change in results for a control group. But it’s often unclear whether the comparator group really is comparable to the treatment group.
Parliaments and donors increasingly demand credible evaluations!
![Page 4: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/4.jpg)
Source: Prashant Bharadwaj
Difference in differences methodology-1
![Page 5: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/5.jpg)
Difference in differences methodology-2
Source: Prashant Bharadwaj
![Page 6: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/6.jpg)
Difference in differences methodology-3
Source: Prashant Bharadwaj
![Page 7: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/7.jpg)
Difference in the differences methodology: simple numerical example
Source: Prashant Bharadwaj
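The difference-in-differences calculation can be sketched in a few lines. The figures below are hypothetical and chosen only for illustration; they are not the numbers from the slide's example, and the function name is an assumption.

```python
# Difference-in-differences with hypothetical group means (illustrative only).

def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Return the change in the treatment group minus the change in the control group."""
    treatment_change = treat_after - treat_before
    control_change = control_after - control_before
    return treatment_change - control_change

# Hypothetical mean outcomes (for example, school enrollment in percent).
impact = diff_in_diff(treat_before=60.0, treat_after=75.0,
                      control_before=58.0, control_after=65.0)
print(impact)  # (75 - 60) - (65 - 58) = 8.0
```

The estimate is only credible if the two groups would have followed the same trend without the program, which is the weakness of the method noted earlier.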
![Page 8: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/8.jpg)
Unlike the difference-in-differences method, randomized controlled trials construct the comparison group by design, so that comparisons between outcomes for treatment and control groups are valid.
Randomization establishes a control group that is, on average, statistically identical to the intervention group. This produces unbiased estimates.
Randomization reduces selection bias, for example:
o Undercoverage: some parts of the population are under-represented in the sample.
o Self-selection: people who agree to participate in the trial have special characteristics, e.g. strong opinions on an issue.
o Nonresponse bias: participants who do not respond may have particular views or other characteristics.
![Page 9: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/9.jpg)
Unit of Randomization
Choose according to type of program:
o Individual/Household
o School/Health Clinic/catchment area
o Block/Village/Community
o Ward/District/Region
Keep in mind:
o Need “sufficiently large” number of units to detect minimum desired impact: Power.
o Spillovers/contamination
o Operational and survey costs
As a rule of thumb, randomize at the smallest viable unit of implementation.
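To make "sufficiently large" concrete, here is a rough sketch of a textbook power calculation for a two-arm trial randomized at the cluster level. It assumes a normal approximation, equal-sized clusters, and a two-sided test; the function name and every input value are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def clusters_per_arm(min_effect, sd, cluster_size, icc, alpha=0.05, power=0.80):
    """Approximate number of clusters needed in each arm of a two-arm trial."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    # Individuals per arm if randomization were at the individual level.
    n_individual = 2 * (z_alpha + z_power) ** 2 * sd ** 2 / min_effect ** 2
    # The design effect inflates the sample when outcomes are correlated within clusters.
    design_effect = 1 + (cluster_size - 1) * icc
    return ceil(n_individual * design_effect / cluster_size)

# Hypothetical inputs: detect a 10-unit effect, outcome SD of 50,
# 30 households per village, intra-cluster correlation of 0.05.
print(clusters_per_arm(min_effect=10, sd=50, cluster_size=30, icc=0.05))
```

Setting `icc=0.0` in the same call shows how much smaller the requirement would be without within-cluster correlation, which is one reason the rule of thumb favors the smallest viable unit.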
![Page 10: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/10.jpg)
Example: Randomized Assignment
Mexico Progresa Conditional Cash Transfer program
Unit of randomization: Community
o 320 treatment communities (14,446 households): first transfers in April 1998.
o 186 comparison communities (9,630 households): first transfers in November 1999.
506 communities in the evaluation sample
Randomized phase-in
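A community-level random draw of this kind can be sketched as follows. This is an illustration only, not a description of how the Progresa assignment was actually carried out; the community IDs, the seed, and the function name are hypothetical.

```python
import random

def assign_communities(community_ids, n_treatment, seed):
    """Randomly split communities into treatment and comparison groups."""
    rng = random.Random(seed)           # a fixed seed makes the draw reproducible
    shuffled = list(community_ids)
    rng.shuffle(shuffled)
    return set(shuffled[:n_treatment]), set(shuffled[n_treatment:])

communities = range(1, 507)             # 506 communities; the IDs are hypothetical
treatment, comparison = assign_communities(communities, n_treatment=320, seed=1998)
print(len(treatment), len(comparison))  # 320 186
```

In a phase-in design, the comparison communities simply enter the program at a later date.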
![Page 11: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/11.jpg)
Example: Randomized Assignment
[Figure: timeline from T=0 to T=1 showing the 320 treatment communities and the 186 comparison communities over the comparison period.]
![Page 12: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/12.jpg)
Example: Randomized Assignment
How do we know we have good clones?
In the absence of Progresa, treatment and comparison communities should be identical on average
Let’s compare their characteristics at baseline (T=0)
![Page 13: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/13.jpg)
Example: Balance at Baseline
Case 3: Randomized Assignment

| Characteristic | Treatment | Comparison | T-stat |
|---|---|---|---|
| Consumption ($ monthly per capita) | 233.4 | 233.47 | -0.39 |
| Head’s age (years) | 41.6 | 42.3 | -1.2 |
| Spouse’s age (years) | 36.8 | 36.8 | -0.38 |
| Head’s education (years) | 2.9 | 2.8 | 2.16** |
| Spouse’s education (years) | 2.7 | 2.6 | 0.006 |

Note: If the effect is statistically significant at the 5% significance level, we label the estimated impact with 2 stars (**).
![Page 14: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/14.jpg)
Example: Balance at Baseline
Case 3: Randomized Assignment

| Characteristic | Treatment | Comparison | T-stat |
|---|---|---|---|
| Head is female=1 | 0.07 | 0.07 | -0.66 |
| Indigenous=1 | 0.42 | 0.42 | -0.21 |
| Number of household members | 5.7 | 5.7 | 1.21 |
| Bathroom=1 | 0.57 | 0.56 | 1.04 |
| Hectares of Land | 1.67 | 1.71 | -1.35 |
| Distance to Hospital (km) | 109 | 106 | 1.02 |

Note: If the effect is statistically significant at the 5% significance level, we label the estimated impact with 2 stars (**).
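A balance check of this kind compares group means with a t-statistic. The sketch below uses Welch's formula on made-up data; it ignores the clustering of households within communities, which a real analysis of a community-randomized trial would need to account for. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def balance_t_stat(treatment_values, comparison_values):
    """Welch t-statistic for the difference in means between two groups."""
    n_t, n_c = len(treatment_values), len(comparison_values)
    std_error = sqrt(variance(treatment_values) / n_t
                     + variance(comparison_values) / n_c)
    return (mean(treatment_values) - mean(comparison_values)) / std_error

# Hypothetical baseline ages of household heads (not Progresa data).
treatment_ages = [38, 45, 41, 39, 47, 42, 40, 44]
comparison_ages = [40, 43, 42, 41, 46, 39, 44, 45]

t = balance_t_stat(treatment_ages, comparison_ages)
# 1.96 is the large-sample critical value for a two-sided test at the 5% level.
print(abs(t) < 1.96)  # True: no significant difference at baseline
```

With many characteristics tested, an occasional significant difference (such as one row in a table) is expected by chance even when randomization worked.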
![Page 15: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/15.jpg)
Example: Randomized Assignment
Impact = (Y | P=1) - (Y | P=0)

| | Treatment Group (Randomized to treatment) | Counterfactual (Randomized to Comparison) | Impact |
|---|---|---|---|
| Baseline (T=0) Consumption (Y) | 233.47 | 233.40 | 0.07 |
| Follow-up (T=1) Consumption (Y) | 268.75 | 239.5 | 29.25** |

| Estimated Impact on Consumption (Y) | |
|---|---|
| Linear Regression | 29.25** |
| Multivariate Linear Regression | 29.75** |

Note: If the effect is statistically significant at the 5% significance level, we label the estimated impact with 2 stars (**).
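The impact column is a simple difference in group means, and a one-variable linear regression of Y on the treatment dummy P returns the same number. The sketch below first reproduces the two differences from the reported means, then checks the regression equivalence on a small hypothetical household-level dataset (the six observations and the helper name are assumptions for illustration).

```python
# Mean consumption as reported in the slide.
baseline = {"treatment": 233.47, "comparison": 233.40}
follow_up = {"treatment": 268.75, "comparison": 239.5}

baseline_gap = round(baseline["treatment"] - baseline["comparison"], 2)
impact = round(follow_up["treatment"] - follow_up["comparison"], 2)
print(baseline_gap, impact)  # 0.07 29.25

def ols_slope(x, y):
    """Slope from a one-variable least-squares regression of y on x."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    return cov / var

# Hypothetical household-level data: P = 1 if treated, Y = consumption.
P = [1, 1, 1, 0, 0, 0]
Y = [270.0, 265.0, 271.0, 240.0, 238.0, 241.0]
mean_gap = sum(Y[:3]) / 3 - sum(Y[3:]) / 3

# The regression slope on a binary treatment dummy equals the difference in means.
print(abs(ols_slope(P, Y) - mean_gap) < 1e-9)  # True
```

A multivariate regression adds baseline covariates to the same equation, which can change the point estimate slightly, as in the second regression row.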
![Page 16: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/16.jpg)
Keep in Mind: Randomized Assignment
With large enough samples, randomized assignment produces two statistically equivalent groups: the randomized beneficiary group and the randomized comparison group.
We have identified the perfect clone.
Feasible for prospective evaluations with over-subscription/excess demand.
Most pilots and new programs fall into this category.
![Page 17: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/17.jpg)
Limits on randomized controlled trials
Out of sample generalization: Results from these trials are internally valid but cannot be generalized (extrapolated) out of sample. An inference of general validity of a result would require an internally consistent theory of causation and repeated randomized controlled trials in different countries, demographic rules, and natural environments.
Results are comparisons of averages. Therefore the results of a randomized controlled trial may not be valid for making policies for sub-groups or for individual households and people, especially if the policymaker has additional information about them.
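A small arithmetic sketch shows how an average can hide differences between sub-groups. The sub-groups, population shares, and effects below are entirely hypothetical.

```python
# Hypothetical sub-group results: (share of population, treatment effect).
subgroups = {
    "landless households": (0.4, 10.0),
    "small farmers": (0.6, 0.0),
}

# The trial reports the population-weighted average effect.
average_effect = sum(share * effect for share, effect in subgroups.values())
print(average_effect)  # 4.0
```

A policy based on the average of 4.0 would overstate the benefit for one group and understate it for the other.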
![Page 18: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/18.jpg)
Risks of bias in the randomized controlled trial methodology
Self-selection out of the control group. Randomized controlled trials in the social sciences are not double blind, unlike pharmaceutical trials. People who are not receiving the treatment (for example, tutoring or nutritional supplements) may decide to obtain it on their own, biasing the results.
Replacement of drop-outs may lead to bias.
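The direction of the bias from control-group self-treatment can be sketched with simple arithmetic. The sketch assumes that everyone in the treatment group is treated, that the effect is the same for everyone, and that self-obtained treatment works as well as the program; all numbers and the function name are hypothetical.

```python
def observed_gap(true_effect, control_share_treated):
    """Treatment-minus-control gap when part of the control group self-treats."""
    # The treatment group gains the full effect; the control group gains it
    # only for the share that obtains the treatment on its own.
    return true_effect - control_share_treated * true_effect

print(observed_gap(true_effect=20.0, control_share_treated=0.25))  # 15.0
```

Under these assumptions the measured impact understates the true effect, and the understatement grows with the share of the control group that self-treats.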
![Page 19: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/19.jpg)
Limits to use of randomized controlled trials
Randomized controlled trials are expensive. They can cost anywhere from $150,000 to several million dollars. A $500,000 cost is typical. This means the method cannot be applied to all development projects.
Many development projects do not address units that can be randomized. For instance states or provinces/oblasts cannot be meaningfully randomized.
Ethical rules are unclear. In medical research, participation in a randomized controlled trial requires informed consent. There are no general rules for economic development projects. US universities and some developing countries have ethical rules.
![Page 20: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/20.jpg)
Subtle conflicts of interest and biases can prejudice all evaluation studies, whatever the methodology.
Sponsors’ conflict of interest. Donors, governments, project units, and NGOs prefer to report positive findings because this helps to sustain their business and jobs. Sometimes, project units resist or even refuse payment to contractors who deliver negative evaluation reports.
Contractors’ conflict of interest. The contractors who carry out the evaluation studies may be influenced by their clients’ preferences.
Confirmation bias. Donors, governments, project units, and NGOs often believe that outcomes are positive and tend to perceive positive outcomes. Also, officials, managers, and development experts come to identify personally with the projects. Their psychological bias is, ‘I mean well, therefore the project is successful.’
Publication bias. Scholarly journals prefer to publish positive results and generally neglect negative results (‘no effect’ is not newsworthy). This may induce bias in academic work.
![Page 21: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/21.jpg)
Economic and ethical questions
When should donors and governments insist on application of the randomized controlled trial methodology and when is this inappropriate?
When is it unethical to use the randomized controlled trial methodology in a development context?
![Page 22: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/22.jpg)
Regression Discontinuity Design
Many social programs select beneficiaries using an index or score:
o Anti-poverty programs: targeted to households below a given poverty index/income
o Pensions: targeted to population above a certain age
o Education: scholarships targeted to students with high scores on standardized tests
o Agriculture: fertilizer program targeted to small farms (less than a given number of hectares)
![Page 23: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/23.jpg)
Regression Discontinuity Design
Example: Effect of social assistance program on nutrition
Goal: Reduce vulnerability and improve nutrition of poor families
Method:
o Households with a poverty score ≤50 are poor
o Households with a poverty score >50 are not poor
Intervention: Poor households receive social assistance transfers
![Page 24: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/24.jpg)
Regression Discontinuity Design-Baseline
[Figure: baseline plot with regions labeled “Not eligible” and “Eligible”.]
![Page 25: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/25.jpg)
Regression Discontinuity Design-Post Intervention
[Figure: post-intervention plot with the label “IMPACT”.]
![Page 26: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/26.jpg)
Regression Discontinuity Design
We have a continuous eligibility index with a defined cut-off:
o Households with a score ≤ cutoff are eligible
o Households with a score > cutoff are not eligible
o Or vice-versa
Intuitive explanation of the method:
o Units just above the cut-off point are very similar to units just below it – good comparison.
o Compare outcomes Y for units just above and below the cut-off point.
For a discontinuity design, you need:
1) Continuous eligibility index
2) Clearly defined eligibility cut-off.
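The comparison of units just above and just below the cut-off can be sketched as a local linear fit on each side. This is one common way to estimate the jump, not necessarily the one used in the presentation; the data are synthetic, and the cut-off of 50, the bandwidth of 10, the true effect of 5, and the function names are assumptions for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept and slope of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

def rdd_estimate(scores, outcomes, cutoff, bandwidth):
    """Jump in the outcome at the cutoff: eligible side (score <= cutoff) minus the other side."""
    below = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff - bandwidth <= s <= cutoff]
    above = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff < s <= cutoff + bandwidth]
    # Scores are centered on the cutoff, so each intercept is the
    # predicted outcome exactly at the cutoff.
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    return intercept_below - intercept_above

# Synthetic data: the outcome rises with the score, and eligible
# households (score <= 50) gain TRUE_EFFECT.
rng = random.Random(0)
TRUE_EFFECT = 5.0
scores = [rng.uniform(0, 100) for _ in range(2000)]
outcomes = [20 + 0.3 * s + (TRUE_EFFECT if s <= 50 else 0.0) + rng.gauss(0, 1)
            for s in scores]

estimate = rdd_estimate(scores, outcomes, cutoff=50, bandwidth=10)
print(round(estimate, 1))  # should be close to TRUE_EFFECT
```

The bandwidth is a judgment call: a narrow window keeps the two sides comparable but leaves fewer observations, while a wide window does the reverse.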
![Page 27: Evaluation methods for measuring the impact of social protection programs](https://reader035.vdocument.in/reader035/viewer/2022081520/5681665d550346895dd9e18d/html5/thumbnails/27.jpg)
THANK YOU!
Questions?
Next: Tajikistan Example