Visualizing the Quality of Impact Evaluation Data

Josef L. Loening, World Bank

The views expressed in this presentation are entirely those of the author; they do not necessarily represent the views of the World Bank and its affiliated Organizations, or those of the Executive Directors of the World Bank or the governments they represent.

OECD Seminar on Innovative Approaches to Turn Statistics into Knowledge, 8-10 December 2010, Cape Town



Outline

• Impact evaluation and community-driven development (CDD) projects

• Detecting unusual data with Benford’s law using Stata and Wordle

• Project example from Africa

– Household production of eggs

– Household production of maize

– Beneficiary financial contribution

• Conclusions


Impact Evaluation and CDD Projects

• “Seeking the truth from facts”

• Many practical and logistical difficulties in monitoring and evaluation

• Analytical and data demands for valid inferences can be quite daunting

• Distortions in the market for knowledge about development effectiveness

• CDD projects: according to the Bank’s independent evaluations, rigorous monitoring and evaluation is often lacking

• But: rising interest in and support for evaluations

We look at three issues

1. Data quality underpinning monitoring and evaluation: rarely analyzed

2. Asymmetric information: development practitioners cannot easily assess the quality of information

3. Project’s own monitoring databases: important, but typically underutilized


Distribution Anomalies and Benford’s Law

$$P(D = d) = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, 2, \ldots, 9$$

About 30% of numbers begin with 1

About 5% of numbers begin with 9
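The two percentages quoted on this slide follow directly from the first-digit probabilities log10(1 + 1/d). A minimal Python sketch that tabulates them is given here for illustration; the function name `benford_first_digit` is an assumption of this sketch and not part of the presentation, which used Stata.

```python
import math


def benford_first_digit(d: int) -> float:
    """Expected share of numbers whose first significant digit is d (1 to 9)."""
    if not 1 <= d <= 9:
        raise ValueError("first digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Expected first-digit distribution under Benford's law.
expected = {d: benford_first_digit(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities sum to one because the terms log10((d + 1) / d) telescope to log10(10).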


Short Review

• Newcomb (1881) observed that the first pages of logarithmic tables were more worn out than the last pages

• Benford (1938) rediscovered the first digit phenomenon, using 20 different data sets

• Mathematical proof of the “random samples from random distributions theorem” and finding that Benford’s law is base and scale invariant

• The literature formulates a number of rules and statistical tests for deciding which data can be expected to follow Benford’s law, and under which conditions

• Nigrini (1996) was very influential in establishing Benford’s law as an indicator of fraud in finance and taxation

• Judge and Schechter (2009) find that Benford’s distribution can be used to detect unusual household survey data in developed and developing countries
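One commonly used statistical test of this kind is a chi-square goodness-of-fit test of observed first digits against Benford’s expected shares. The sketch below is an illustration under stated assumptions, not the procedure used in the presentation: the helper names are hypothetical, zeros are skipped, and the 5% critical value for 8 degrees of freedom (15.507) is hard-coded so that only the standard library is needed.

```python
import math
from collections import Counter

# Chi-square critical value at the 5% level with 8 degrees of freedom.
CRITICAL_5PCT_DF8 = 15.507


def first_digit(x) -> int:
    """First significant digit of a nonzero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values) -> float:
    """Chi-square statistic of observed first digits against Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    n = len(digits)
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat


# Powers of two are a standard example of a sequence that conforms to Benford's law.
conforming = [2 ** n for n in range(1000)]
# Values spread evenly over the leading digits 1 to 9 do not conform.
uniform = [d * 10 for d in range(1, 10)] * 100

for name, sample in [("powers of two", conforming), ("uniform digits", uniform)]:
    stat = benford_chi_square(sample)
    verdict = "reject" if stat > CRITICAL_5PCT_DF8 else "do not reject"
    print(f"{name}: chi-square = {stat:.2f} ({verdict} Benford at 5%)")
```

With large samples the chi-square test flags even small deviations, so a rejection is best read as a prompt to inspect the data rather than as proof of a problem.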


Project Example

• Objective is to raise the production of food, incomes, and assets of participating households

• Implementation of small sub-projects, planned and managed by communities themselves

• Very large number of subprojects, covering the entire range of rural productive activities

• Unique database, which captures information for about 49.9 percent of sub-projects

“We had a very long but productive meeting, chaired by the Permanent Secretary. I mentioned about the cleaning of the monitoring and evaluation database, which generated a long discussion. In the end, it was agreed that the Project Coordination Unit completes the gap-filling data-entry exercise.”

“I came to realize that it required a lot of consultation with district officers as well as referring to district reports. In cases where there were different crop measurements, I had to agree with district officers on the equivalent weights to get the correct figures.”

“When data clerks were entering raw data they had to choose from a list of pre-coded information like ‘1’ or ‘2’ and when done in a hurry, they made some mistakes. This was more common in livestock, which is why in enterprises like ‘cattle’ products appeared to be ‘eggs’.”


Digits for Egg Production

[Figure: bar charts of digit frequencies. Vertical axis: percent; horizontal axis: digits 1 to 9. Panels: egg production at baseline; egg production after CDD project.]
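A digit chart of this kind can be tabulated by extracting the leading digit of each reported quantity and comparing the observed percentages with Benford’s expected ones. The following is a minimal sketch; the `reported` values are hypothetical and chosen only for illustration (they are not project data), and the helper names are assumptions of this sketch.

```python
import math
from collections import Counter


def first_digit(x) -> int:
    """First significant digit of a nonzero number."""
    return int(f"{abs(x):.15e}"[0])


def digit_percentages(values) -> dict:
    """Percent of nonzero values that start with each digit 1 to 9."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return {d: 100 * counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Hypothetical reported quantities, for illustration only (not project data).
reported = [12, 15, 30, 30, 30, 45, 60, 100, 120, 150, 200, 300]

observed = digit_percentages(reported)
for d in range(1, 10):
    benford = 100 * math.log10(1 + 1 / d)
    bar = "#" * round(observed[d] / 2)  # crude text bar, one '#' per 2 percent
    print(f"{d} {observed[d]:5.1f}% (Benford {benford:4.1f}%) {bar}")
```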


Number Clouds for Egg Production (per laying cycle)

[Figure: number clouds. Panels: at baseline; after CDD project.]
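A number cloud draws each distinct reported value at a size that grows with how often it occurs, so heaping on a few values stands out at a glance. The sketch below only computes such size weights; it does not reproduce Wordle’s layout, and the function name, the size range, and the sample values are all assumptions made for illustration.

```python
from collections import Counter


def number_cloud_weights(values, min_size=10, max_size=60) -> dict:
    """Map each distinct value to a font size scaled linearly by its frequency."""
    counts = Counter(values)
    if not counts:
        raise ValueError("need at least one value")
    top = max(counts.values())
    return {
        value: min_size + (max_size - min_size) * n / top
        for value, n in counts.items()
    }


# Hypothetical reported quantities, for illustration only (not project data).
reported = [30, 30, 30, 30, 25, 25, 12, 100]

weights = number_cloud_weights(reported)
for value, size in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{value}: font size {size:.1f}")
```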


Revised Digits for Egg Production

[Figure: bar charts of digit frequencies after revision. Vertical axis: percent; horizontal axis: digits 1 to 9. Panels: egg production at baseline; egg production after CDD project.]


Number Clouds for Household Maize Production (in kg)

[Figure: number clouds. Panels: at baseline; after CDD project.]


Digits for Maize Production

[Figure: bar charts of digit frequencies. Vertical axis: percent; horizontal axis: digits 1 to 9. Panels: maize production at baseline; maize production after CDD.]


Number Clouds for Beneficiary’s Own Financial Contribution

[Figure: number clouds. Panels: revised; original.]


Digits of Beneficiary’s Own Financial Contribution

[Figure: bar charts of digit frequencies. Vertical axis: percent. Panels: first digits (1 to 9); second digits (0 to 9).]
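For the second-digit panel, the Benford benchmark is obtained by summing the joint first-two-digit probabilities over all possible first digits: P(D2 = d) is the sum over k = 1, ..., 9 of log10(1 + 1/(10k + d)). A minimal sketch of that benchmark follows; the function name is an assumption of this sketch.

```python
import math


def benford_second_digit(d: int) -> float:
    """Expected share of numbers whose second significant digit is d (0 to 9)."""
    if not 0 <= d <= 9:
        raise ValueError("second digit must be between 0 and 9")
    # Sum the probability of the leading pair "kd" over every first digit k.
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


second = {d: benford_second_digit(d) for d in range(10)}

for d, p in second.items():
    print(f"{d}: {p:.2%}")
```

The second-digit distribution is much flatter than the first-digit one, ranging from roughly 12% for 0 down to 8.5% for 9, which is why second-digit charts use a narrower percent scale.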


Quantiles of CDD Project Financial Contribution to Beneficiaries

Note that thresholds and assigned numbers make Benford’s law not applicable!

[Figure: quantile plot. Vertical axis: CDD financial contribution per household (local currency); horizontal axis: fraction of the data, from 0 to 1.]
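Thresholds and assigned amounts show up in a quantile plot as flat steps, because a large share of observations sits on a few fixed values. One simple way to quantify this before attempting a digit test is to check how much of the data falls on the most common value and whether neighbouring quantiles coincide. The sketch below is an illustration only: the amounts are hypothetical, and the helper names are assumptions rather than anything used in the presentation.

```python
from collections import Counter


def modal_share(values):
    """Return the most frequent value and the fraction of observations equal to it."""
    counts = Counter(values)
    value, n = counts.most_common(1)[0]
    return value, n / len(values)


def empirical_quantile(values, q):
    """Value found a fraction q of the way through the sorted data (0 <= q <= 1)."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


# Hypothetical contributions in local currency, for illustration only.
amounts = [50000, 100000, 100000, 100000, 100000, 150000, 200000, 200000]

value, share = modal_share(amounts)
print(f"most common amount: {value} ({share:.0%} of observations)")
for q in (0.25, 0.5, 0.75):
    print(f"quantile {q}: {empirical_quantile(amounts, q)}")
```

When several quantiles return the same amount, as the 0.25 and 0.5 quantiles do here, the values are set administratively rather than measured, and leading digits cannot be expected to follow Benford’s law.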


Conclusions

• Simple, objective, and effective tool to screen the quality of monitoring and evaluation survey data:

– Livestock production is usable after outlier detection

– Crop production is not reliable due to poor measurement and/or other problems

– In our case, method not applicable to project’s financial data because of heavy censoring

• Graphical distribution analysis of continuously measured project performance indicators can enhance quality of results-based M&E systems

• Wide range of other applications with significant forward-looking potential to enhance quality of statistics and contribute to informed decision-making