EVALUATION
Some Tools, Methods & Approaches

U.S. Department of State
Washington, D.C.

ACKNOWLEDGMENTS

This booklet was prepared by Rolf Sartorius with support from Taryn Anderson, Michael Bamberger, Danielle de García, Mateusz Pucilowski and Mike Duthie. The authors wish to thank Krishna Kumar and Peter Davis for their useful comments and suggestions.

Copyright 2013

Social Impact
2300 Clarendon Boulevard, Suite 1000
Arlington, VA 22201
http://www.socialimpact.com
Version 2.0


TABLE OF CONTENTS

Part I: Purpose and Overview
  1. Purpose
  2. Evaluation types
     Performance and process
     Summative/ex-post
     Impact evaluations
     Global/regional program evaluations
     Experience reviews and meta-analysis
     Special studies

Part II: Evaluation Designs
     Quantitative designs
     Qualitative designs
     Mixed method designs

Part III: Data Collection Methods
     Mining Project Records and Secondary Data
     Formal Surveys
     Rapid Appraisal Methods
     Participatory Methods

Part IV: Evaluation Tools and Approaches
     Results Frameworks
     Performance Management Plans
     Gender Analysis
     Cost-Benefit and Cost-Effectiveness Analysis
     Information Communication Technology



PART I: Purpose and Overview


PURPOSE

The State Department promotes evaluation to achieve the most effective foreign policy outcomes and greater accountability to the American people.1 The findings, conclusions and recommendations generated from evaluations should be used to improve the effectiveness of State Department programming in the areas of foreign assistance, diplomacy, security and other operations and to prevent mistakes from being repeated. The Department's evaluation policy is based on the premise that evaluations must enable evidence-based decision making and reward candor more than superficial success stories.

The purpose of this Evaluation Overview is to strengthen awareness of evaluation tools, methods and approaches, in order to assist the Department and its partners in their planning and implementation of useful, high-quality evaluations. The booklet is designed to present some of the most useful and promising evaluation tools and methods to facilitate planning, managing and using evaluations. For each tool and approach, you will find a summary of purpose and use, advantages and disadvantages, relative costs, and the skillsets and time required for their undertaking. Key references for additional information are also provided.

The tools, methods and approaches in this overview are intended to further serve the Department's primary evaluation purposes:

Accountability – Well-designed and timely evaluations help to ensure accountability for the USG resources spent on foreign affairs activities. Evaluations enable program managers and leadership to determine the cost-effectiveness of programs, projects, initiatives, activities, interventions, etc., and, in the case of a program or project, the quality of its planning and implementation. Consequently, evaluation findings can provide empirical data for reports to various stakeholders in foreign assistance planning and in larger diplomatic activities.

Learning – Evaluations document the results, impact, or effectiveness of organizational and programming activities, thereby facilitating learning from experience. The Department can apply such learning to the development of new projects, programs, strategies and policies. Empirically-grounded evaluations also aid informed decision making when considering new programs or projects, interventions, activities, etc.

The Department distinguishes among several types of evaluation: performance/process evaluations, summative/ex-post evaluations, impact evaluations, global/regional program evaluations, experience reviews and special evaluation studies. The Department anticipates that most of its evaluations will be performance evaluations due to their ability to generate rapid and cost-effective learning.

1 Department of State Evaluation Policy, February 2012.


TYPES OF EVALUATIONS

Depending upon their information needs, resources and priorities, bureaus may utilize a wide range of evaluation types, which may include the following:

Performance/Process Evaluations: Such evaluations focus on the performance of an effort and examine its implementation, inputs, outputs and likely outcomes. They are undertaken to answer a wide range of important questions: Did the effort operate as planned? What problems and challenges, if any, did it face? Was it or is it being effectively managed? Did it provide planned goods and services in a timely fashion? If not, why not? Were the original cost estimates realistic? Did it meet its targets or is it likely to meet them? What are its expected effects and impacts? Is it sustainable?

There is no hard and fast rule regarding when bureaus should conduct performance/process evaluations. However, for an effort with a life span of five years, bureaus should consider undertaking an evaluation at mid-cycle so that managers have an objective assessment of its implementation progress, problems and challenges, enabling them to make mid-course corrections if necessary. On the other hand, for a two- or three-year effort, it might be preferable to conduct the evaluation at its end, because it takes eight to twelve months before an intervention becomes operational and its outputs are measurable.

Summative/Ex-post Evaluations: These differ from performance evaluations in that their focus is primarily on outcomes and impacts, though they often also examine effectiveness, and they are conducted when an effort has ended or is soon to end. Summative/ex-post evaluations answer questions such as: What changes were observed in targeted populations, organizations or policies during and at the end of the effort? To what extent can the observed changes be attributed to it? Were there unintended effects which were not anticipated at the planning stage? Were they positive or negative? What factors explain the intended and unintended impacts? The essence of summative/ex-post evaluations lies in establishing that the changes occurred as a result of the intervention, or at least that it substantially contributed to them.

Impact Evaluations: Impact evaluations are a sub-category of summative/ex-post evaluations. For the purposes of this Guidance, they refer to those evaluations which use control groups to measure the precise impacts of an effort. In such evaluations, two groups – treatment and control – are established at the launch of an effort. The treatment group receives the goods and services provided by the effort – technical assistance, training, advice and/or financial support – while the control group does not. The overall impact of the effort is measured by comparing the performance, conditions or status of the two groups.

This is often referred to as the “counterfactual” which is a comparison of what actually happened to what would have happened in the absence of the effort.


Global/Regional Program Evaluations: These may be designed to examine the performance and outcomes of a major sector or sub-sector of foreign affairs programs in order to draw general findings, conclusions and lessons. The purpose of a global evaluation of electoral assistance, for example, is not to evaluate the success or failure of individual projects but to determine the efficacy and outcomes of electoral assistance programs per se.

Experience Reviews: These involve systematic analysis of past experience, based mostly on the review of past documents, reports, evaluations and studies. Evaluators can supplement this information with key informant interviews or workshops. Experience reviews focus on a limited range of questions and can be completed within two or three weeks.

Special Evaluation Studies: Bureaus may also undertake special evaluation studies to meet their specific information needs. Such studies may be undertaken when (a) a key decision has to be made and available information is inadequate; (b) there are major implementation problems that should be addressed; or (c) the Secretary of State, OMB, the White House or Congress needs empirically grounded information that is not available from routine sources. For example, if the White House needs information about the implementation of human rights programs in the Middle East, DRL or the Bureau of Near Eastern Affairs might commission a study on the subject.

This booklet provides an overview of various tools and methods for conducting these different types of evaluations. The list of topics included here is not intended to be comprehensive. Some of these tools and approaches are complementary, while others are substitutes. Although some have broad applicability, others are quite narrow in their uses. The choice of which tools or approaches are appropriate for a given context will depend on a range of considerations, including the intended uses of the evaluation, the evaluation stakeholders, the speed with which the information is needed, and the resources available.

Mixing and matching approaches
Many of the tools and methods in this booklet work best when they are creatively mixed or blended. A mixed methods approach combines elements of qualitative and quantitative evaluation approaches and data collection methods, allowing evaluators to leverage the strengths of each tool while mitigating its weaknesses. Various approaches can be used sequentially or simultaneously, with one method dominating the design or each method contributing equally. The Department's Evaluation Policy does not value any single method over others. A mixed method approach provides evaluations with a number of advantages:

• Triangulation – increases validity and credibility of findings and conclusions
• Comprehensiveness – allows for collection of enough data to sufficiently answer all evaluation questions
• Clarity – data collected from one method can clarify or supplement findings collected via another method
• Development – information collected through a given method may inform the use or revision of subsequent methods
• Initiation – data collected through multiple methods that provide divergent findings can trigger further analysis

KEY RESOURCES
• Department of State (2012). Department of State Program Evaluation Policy. http://www.state.gov/s/d/rm/rls/evaluation/2012/184556.htm
• Department of State (2012). Evaluation Guidance for the Department of State.
• Department of State (2012). Department of State Evaluation Policy: Frequently Asked Questions.
• Department of State Diplopedia: http://diplopedia.state.gov/index.php?title=State_Program_Evaluations
• Suggested Evaluation Resources: http://diplopedia.state.gov/images/Suggested_Evaluation_Resources.docx
• Department of State Evaluation Community of Practice: http://cas.state.gov/evaluation


PART II: Evaluation Designs


OVERVIEW OF EVALUATION DESIGNS

Evaluation designs describe the logic and the conceptual framework for answering the evaluation's key questions. DoS recognizes two main kinds of evaluation designs, each with particular strengths and limitations in addressing specific kinds of evaluation questions: performance evaluation designs and impact evaluation designs.

Performance evaluation (PE) designs are best at answering descriptive and normative questions about programs. Illustrative descriptive questions can include: who benefited from the program and who did not? What were strengths and weaknesses in program implementation? How has the program made a difference? Normative questions are those which gauge program performance against certain agreed norms or standards: to what extent did the program achieve its target of training 400 election supervisors? To what extent were “do no harm” principles adhered to during implementation of the peace building program? To what extent were DoS Green Building standards followed in the construction of the new embassies? There are many PE designs and to simplify we identify three main types: 1) PE designs using primarily quantitative methods; 2) PE designs using primarily qualitative methods; and 3) mixed methods PE design. Due to their practicality in terms of lower costs, faster implementation and greater ability to describe the interaction of the program and the program context, PE designs are much more widely used in evaluating DoS programs compared to IE designs.

Impact evaluation (IE) designs are used to answer cause-and-effect questions about DoS programs: to what extent did the program cause outcomes to occur? To what extent can the program benefits be attributed to the program? For example, an IE design might be used to answer the question: to what extent did the youth training program in Tunisia lead to greater employment of youth? Under the DoS evaluation policy, IEs use experimental and quasi-experimental designs with a counterfactual or control group. DoS intends to use these designs only very selectively due to their substantial cost, time and technical requirements.

These main designs are summarized in the following chart and then the advantages and challenges in using each design type are outlined below.


Chart 1: Continuum of Evaluation Designs

Performance Evaluations:
• Quantitative designs without control or comparison groups (designs that assess outcomes for the same group using quantitative methods): snapshot design; before-and-after; cross-sectional; time series
• Qualitative designs (designs that assess outcomes for the same group using qualitative methods): snapshot design; before-and-after; cross-sectional; appreciative inquiry; most significant change; case study design
• Mixed methods designs (designs that systematically integrate quantitative and qualitative methods): case study design; qualitative dominant design; quantitative dominant design; balanced mixed methods design

Impact Evaluations:
• Experimental and quasi-experimental designs: 1) designs with randomized assignment (RCTs); 2) designs with comparison groups but not randomized assignment (regression discontinuity; difference-in-difference; matching)

Along this continuum, feasibility in the field increases and statistical rigor for causal questions decreases as one moves from impact evaluation designs toward performance evaluation designs.


QUANTITATIVE PERFORMANCE EVALUATION DESIGNS

What are they?
Quantitative performance evaluation (PE) designs rely predominantly on the use of standardized measures and standardized data collection procedures throughout the evaluation to ensure comparability or to measure results. There are many types of quantitative PE designs. Several that are potentially most useful for DoS are: 1) quantitative snapshot designs; 2) before-and-after designs; 3) cross-sectional designs; and 4) time series designs. Quantitative PE designs measure program effects at a single point in time or through repeated measures without a counterfactual group, and so they do not answer cause-and-effect questions with certainty. These designs are often very practical and can be used widely for evaluating DoS programs.

ADVANTAGES:
• Evaluation findings can be generalized to the population about which information is required
• Samples of individuals, communities or organizations can be selected to ensure that the results will be representative of the population being studied
• Estimates can be obtained of the magnitude and distribution of program results
• Standardized approaches permit the evaluation to be replicated in different areas over time with comparable findings
• Increases the credibility of findings for many (but not all) evaluation users

CHALLENGES:
• Many kinds of information are difficult to obtain through structured data collection instruments, particularly on sensitive topics such as domestic violence and income
• Many groups, especially ethnic minorities, sex workers and victims of trafficking, may be more difficult to reach through quantitative evaluations
• There is often little contextual data to help interpret results or explain variations across groups
• Difficulty in studying the program implementation process
• If capable local survey firms are not on the ground in-country, quantitative PEs can be expensive and time consuming, depending on their size and scale


COSTS:
Many quantitative PEs are in the middle cost range, with snapshot PEs being less expensive and time series designs being more expensive, assuming DoS will pay for the repeated data collection costs.

SKILLS REQUIRED:
A high degree of skill in survey design, sampling, survey management and analysis, especially for larger-scale surveys. Quantitative PE designs using simple surveys will be less technically demanding.

TIME REQUIRED:
For snapshot and cross-sectional PE designs, 3-5 months is typical. Before-and-after designs typically require a few months at program inception (baseline) and another few months at the end of the program to assess performance.

KEY RESOURCES
• Bamberger, M. (2012). RealWorld Evaluation, Chapter 12.
• Gertler, P. Impact Evaluation in Practice. The World Bank.
• Rist, R. and Morra, L. (2009). The Road to Results: Designing and Conducting Effective Development Evaluations. The World Bank. www.


QUANTITATIVE PE DESIGNS

Quantitative Snapshot Designs look at a group of program participants at one point in time during or after the intervention. These designs can be used to answer descriptive or normative questions. For example, the main design might use a single simple survey of several dozen to a few hundred program participants to answer descriptive questions about how the program has benefited participants, how much they liked the program, or how they rate the quality of program services. The design can also be used to answer normative questions against specific targets or criteria, such as: did the program achieve its targeted 75% satisfaction rating among program participants? These designs produce rapid access to evaluative information and are relatively cheap.
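As a rough illustration only, the short Python sketch below shows how a normative snapshot question of this kind might be checked once survey data are in hand; the respondent counts, the 75% target and the field names are hypothetical assumptions, not figures from any DoS program.

# Hypothetical snapshot-survey check against a 75% satisfaction target
import math

n_respondents = 180        # participants surveyed once, during or after the program
n_satisfied = 142          # respondents who reported being satisfied

p_hat = n_satisfied / n_respondents                      # observed satisfaction rate
se = math.sqrt(p_hat * (1 - p_hat) / n_respondents)      # standard error of a proportion
ci_low, ci_high = p_hat - 1.96 * se, p_hat + 1.96 * se   # approximate 95% interval

target = 0.75
print(f"Observed satisfaction: {p_hat:.1%} (95% CI {ci_low:.1%} to {ci_high:.1%})")
print("Target met" if ci_low >= target else "Cannot confirm the 75% target was met")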

Quantitative Before-and-after Designs can be used to answer descriptive and normative questions such as: how much have participants learned during the program, how much has income increased, or how have morbidity or mortality rates, or incidents of violent conflict, decreased during the program? Evaluators measure group characteristics before and after the program, and there is no comparison group. For example, in a conflict management program, evaluators could administer a pre-test (before the program) and a post-test (after the program) to see how much participants learned about mediation techniques. Or in an income generation project, evaluators could measure participant income at program inception and again at program completion. These designs require a baseline, or the time and resources to recreate one.
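A minimal sketch of this pre-test/post-test logic, assuming paired scores for the same participants (the numbers are invented for illustration), might look like this:

# Hypothetical before-and-after comparison: mediation-knowledge scores for the
# same ten participants at baseline and at program completion (no comparison group).
from scipy import stats

pre  = [52, 61, 48, 70, 55, 63, 58, 66, 49, 60]   # baseline test scores
post = [68, 72, 63, 81, 70, 74, 69, 77, 62, 73]   # end-of-program test scores

gains = [b - a for a, b in zip(pre, post)]
mean_gain = sum(gains) / len(gains)

# A paired t-test asks whether the average gain differs from zero.
t_stat, p_value = stats.ttest_rel(post, pre)
print(f"Average gain: {mean_gain:.1f} points (t = {t_stat:.2f}, p = {p_value:.3f})")
# Without a comparison group, a gain still cannot be attributed to the program alone.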

Cross-Sectional Designs show a snapshot of program performance at one point in time. The quantitative version of this design uses a survey and answers questions about subgroup responses to the program. It can also be used to address descriptive or normative questions. For example, a descriptive question might be: how did women versus men respond differently to the program? A normative question could be: how did ethnic minorities versus the majority group compare in meeting the targeted 10% increase in income for program participants? The advantage of this design is that it systematically disaggregates the subgroups and examines how the program has affected them differently.
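The disaggregation itself is straightforward once the survey data are tabulated. The sketch below is illustrative only; the subgroup labels, the 10% target and the figures are assumptions, not real program data.

# Hypothetical cross-sectional disaggregation of a single survey round by subgroup.
import pandas as pd

df = pd.DataFrame({
    "group": ["minority", "majority", "minority", "majority",
              "minority", "majority", "minority", "majority"],
    "income_increase_pct": [12, 8, 15, 11, 7, 14, 9, 13],
})

# Share of each subgroup meeting the (assumed) 10% income-increase target.
summary = (df.assign(met_target=df["income_increase_pct"] >= 10)
             .groupby("group")["met_target"]
             .agg(n="count", share_meeting_target="mean"))
print(summary)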

Time Series Designs take repeated measures to explore and describe changes over time and to identify trends. For example, in health programs a time series design might draw on government or other donor statistics to measure HIV prevalence rates over time in a particular district that is receiving HIV/AIDS programming. In a longitudinal design, repeated measures are taken from the same group. In an interrupted time series design, measures are taken before the program, to examine pre-program trends, and then again during and after the program to examine the expected interruption in the trend. The design is most useful where existing data sets are available to examine trends; otherwise it is very costly and time consuming to implement.
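Where an existing data series is available, the interrupted time series logic can be sketched in a few lines. The example below is hypothetical: the prevalence figures and the assumed 2009 program start are invented, and a real analysis would use a proper segmented regression with uncertainty estimates.

# Hypothetical interrupted time series: fit the pre-program trend and compare
# post-program observations with the trend projected forward.
import numpy as np

years      = np.array([2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012])
prevalence = np.array([4.8, 4.9, 5.1, 5.2, 4.9, 4.6, 4.3, 4.1])   # percent
program_start = 2009                                               # assumed

pre = years < program_start
slope, intercept = np.polyfit(years[pre], prevalence[pre], 1)      # pre-program trend

for yr, obs in zip(years[~pre], prevalence[~pre]):
    expected = slope * yr + intercept                               # trend projection
    print(f"{yr}: expected {expected:.2f}%, observed {obs:.2f}%, gap {obs - expected:+.2f}")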

Although each of the above designs is quantitative, each could have a qualitative counterpart. For example, qualitative designs can use the same basic logic of snapshot, before-and-after, cross-sectional and even time series designs.


QUALITATIVE PE DESIGNS

What are they?
Qualitative PE designs draw on a range of primarily qualitative methods to address evaluation questions where understanding the interplay between program performance and the program context is central; where evaluations need to be conducted quickly, flexibly and at low cost; or where complex or rapidly changing programs require fast, flexible and on-going learning. These PE designs often combine document review, key informant interviews, focus group interviews and participatory evaluation methods such as mapping exercises, or other qualitative approaches such as Appreciative Inquiry or the Most Significant Change methodology. Qualitative PE designs are very practical for evaluating many kinds of DoS programs, especially those where outcomes such as poverty, vulnerability, security and empowerment combine a number of different dimensions that can be difficult to observe and measure.

STRENGTHS:
• Flexibility to evolve
• Sampling focuses on high-value subjects
• Holistic focus
• Examines the broader context of the program
• Multiple qualitative sources provide understanding of complex phenomena
• Narrative reports are more accessible to non-specialists
• The use of participatory approaches makes it more likely that vulnerable and voiceless groups are heard

CHALLENGES:
• A flexible, evolving design may frustrate key users of the evaluation and be less practicable, especially for some short-duration evaluations
• Lack of generalizability of findings to other programs
• Multiple perspectives – hard to reach consensus on some major themes
• Methodological challenges in boiling down large quantities of qualitative data
• Interpretivist methods appear too subjective

COSTS, TIME, SKILLS REQUIRED, RESOURCES:
Cost, time and skills required will depend on the specific blend of qualitative methods selected to support the PE design. See Part III for specifics.


MIXED METHODS EVALUATION DESIGNS

What are they?
Mixed methods evaluation designs involve the systematic integration of different methodologies at all stages of an evaluation. The mixed method approach normally refers to evaluation designs that combine quantitative and qualitative methods. It is important to distinguish the systematic integration of quantitative and qualitative methods from the many evaluations that combine methods in an ad hoc manner.

The benefits of the systematic and integrated use of mixed methods evaluation designs are widely recognized in all spheres of domestic and international evaluation work. They are generally the preferred evaluation design for DoS programs for the following reasons:
• DoS programs operate in complex and changing social, economic, ecological and political contexts, and no single evaluation methodology can adequately describe the interactions among all these different factors.
• DoS program implementation and outcomes are affected by a wide range of historical, economic, political, cultural, organizational, demographic and natural environmental factors, all of which require different methodologies for their assessment.
• DoS programs also produce a range of different outcomes and impacts, many of which require different methodologies for measurement and evaluation.
• DoS programs change in response to how they are perceived by different segments of the target (and non-target) population, and observing these processes of behavioral change requires the application of different methods.

ADVANTAGES:
Mixed methods designs, when used systematically, offer the potential to combine the benefits of both qualitative and quantitative approaches while compensating for the limitations of each approach when used separately. A well-designed mixed method approach can offer a range of potential benefits:
• A well-designed mixed methods evaluation is able to draw on a much broader range of qualitative and quantitative tools, techniques and conceptual frameworks at all stages of the evaluation
• Normally, the design will also incorporate professionals from different disciplines into the core evaluation team
• Mixed methods designs assist DoS in understanding how local contextual factors can explain variations in program implementation and outcomes in different locations


• Mixed methods designs combine the representativeness of quantitative methods, which allow for a generalization of findings from a sample to a larger population, with the ability of qualitative methods to assess the effect of intervening variables (for example, ethnicity, community leadership, etc.) on outcomes

COSTS, TIME, SKILLS REQUIRED:
Cost, time and skills required will depend on the specific blend of mixed methods selected to support the PE or IE design. See Part III for specifics.

RESOURCES
• Bamberger, M. (2013). The Mixed Methods Approach to Evaluation. SI Concept Note Series, No. 1, April 2013. http://www.socialimpact.com/evaluationresources


CASE STUDY DESIGN

What is it?
A case study design is typically a qualitative or mixed methods evaluation. It is a non-experimental design and does not use random selection or control and comparison groups. The design is frequently used when DoS wants to gain an in-depth understanding of a program process, event or situation and explain why results occurred. It is useful when answering descriptive evaluation questions about why and how the intervention works, and it can be especially useful in portraying complex program processes and how the program context interacts with targeted individuals, communities or institutions to produce results or behavior changes. A key attribute of a case study design is that it highlights why decisions were made, how decisions were implemented and, finally, with what results. Case study designs can be used to examine program extremes (high-performing or low-performing examples) or a typical intervention.

Case studies can use qualitative, quantitative or mixed methods to collect data. They can consist of a single case or multiple cases across multiple sites or countries. For example, a case study design could be used to describe how a program to reintegrate former child soldiers in Sierra Leone affected children participating in the program. The case study would describe the program context, the history and background of some key individual(s), how they participated in the program, and how the program affected the lives of selected children and their communities upon reintegration. Rich learning could be gained from cases that portrayed typical children in the program, or ones who had experienced particular successes or failures due to their involvement.

ADVANTAGES:
• Allows for in-depth analysis of the interplay between program context and results
• Helps to establish plausible causal relationships between interventions and outcomes
• Frequently involves a mixed-methods approach, strengthening credibility of results
• Provides a story line that may be compelling for readers

LIMITATIONS:
• Focuses on only one causal relationship, sometimes leaving out other potential relationships
• Difficult to generalize to other situations


• Cannot establish statistical causality or significance as in quantitative impact evaluation designs

COST:
Medium, depending on the number of cases selected and depth of data collection.

SKILLS REQUIRED:
Familiarity with various qualitative research methods such as key informant interviews, focus groups and direct observation. Depending on its scope and focus, the case study design may also draw on survey and other quantitative skills.

TIME REQUIRED:
Generally a minimum of 2-3 days is required to collect case study data for a single case such as an individual or a small-scale organization, plus an additional 2-3 days for analysis and report writing. More time is needed if surveys are required, and substantially more if the unit of analysis is a large group (e.g., the Liberian armed forces), a large organization (e.g., the Liberian Ministry of Defense), a community or a region.

KEY RESOURCES:
• USAID (2013). Evaluative Case Studies. Technical Note. USAID Monitoring and Evaluation Series. No. x. Version x. (draft)
• Yin, Robert K. (2009). Case Study Research: Design and Methods. Fourth Edition. Thousand Oaks, CA: Sage.
• Social Impact (2006). Monitoring, Evaluation and Learning for Fragile State and Peacebuilding Programs: Practical Tools for Improving Program Performance and Results, pp. 40-47.


IMPACT EVALUATION

What is it?
Impact evaluations utilize experimental or quasi-experimental methods to assess the changes in development outcomes that are directly attributable to a given intervention. To be able to isolate the impact of an intervention, evaluators must first identify a credible and rigorously defined counterfactual – a theoretical state that predicts what would have happened to beneficiaries in the absence of the intervention. The counterfactual is estimated by identifying a comparison group that is as similar to the beneficiary (treatment) group as possible. Impact is then measured by comparing the changes over time between the treatment group and this comparison group. While comparison groups can be selected using a variety of methodologies, randomized selection of potential beneficiaries into treatment and control groups provides the strongest evidence of a relationship between the intervention under study and the outcome measured. Given the complexities of development work, however, experimental designs entailing randomized selection are not always possible or desired. In such cases, evaluators should use the most rigorous quasi-experimental methods available.

What can we use it for?
• Measuring outcomes and impacts of an activity and distinguishing these from the influence of other, external factors
• Strengthening accountability for results
• Informing decisions on whether to expand, modify or eliminate projects, programs or policies
• Testing the relative effectiveness and efficiency of alternative interventions
• Drawing lessons for improving the design and management of future activities

ADVANTAGES:
• More rigorous than performance evaluations
• Can attribute changes in development outcomes to discrete interventions
• Can compare competing interventions or alternative intervention designs
• Provide estimates of the magnitude of outcomes and impacts for different demographic groups and regions over time
• Statistical analysis and rigor can give managers and policy-makers added confidence in decision-making

Findings should be available in time to inform the project itself or future strategies, designs and procurements.


CHALLENGES:
• Considerably more costly, time-consuming and management intensive than performance evaluations
• Substantive changes in project design can threaten evaluation validity
• Require highly specialized evaluators to design and implement
• Selection of comparison groups could be politically or logistically difficult (especially in the case of randomization)

COST:
Impact evaluations generally cost more than performance evaluations. For reference, see the TIPS note on Impact Evaluation Costing.

SKILLS REQUIRED:
Strong technical skills in social science research design, management, analysis and reporting. These generally include sampling and power calculations; survey instrument design; enumeration and enumerator training; interviewing; data entry, warehousing, cleaning and management; data analysis using statistical software; and qualitative research skills to triangulate results.

TIME REQUIRED:
Varies according to the design and scope of the evaluation but could take multiple years.

Treatment group – exposed to a given intervention (beneficiaries).

Control group – identified using randomized selection and not exposed to a given intervention (counterfactual).

Comparison group – identified by non-randomized selection and not exposed to a given intervention (counterfactual).


A FEW APPROACHES USED FOR IMPACT EVALUATIONS

Experimental Design – members of a population are randomly assigned to treatment and control groups, and questionnaires or other data collection instruments (anthropometric measures, school performance tests, etc.) are administered to both groups. This is done both before and after the project intervention. Randomization maximizes the probability that the two groups will be statistically similar, controlling for selection bias and producing the most rigorous counterfactual estimate. Also called Randomized Control Trials (RCTs).
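The core comparison behind an experimental design can be sketched very simply. The example below uses invented post-program scores for small treatment and control groups; a real RCT analysis would also draw on the baseline round and much larger samples.

# Hypothetical RCT comparison: post-program outcomes for randomly assigned groups.
from scipy import stats

treatment = [7.1, 6.8, 7.9, 8.2, 6.5, 7.4, 8.0, 7.7]
control   = [6.2, 6.6, 5.9, 6.8, 6.4, 6.1, 6.9, 6.3]

effect = sum(treatment) / len(treatment) - sum(control) / len(control)
t_stat, p_value = stats.ttest_ind(treatment, control)   # two-sample t-test
print(f"Estimated impact: {effect:.2f} (t = {t_stat:.2f}, p = {p_value:.3f})")
# Randomized assignment is what justifies reading this difference as program impact
# rather than as pre-existing differences between the two groups.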

Quasi-Experimental Design (QED) – when randomization is not possible, a comparison group is purposively selected to be as similar to the treatment group as possible. There are a number of different methodologies for selection, each of which entails its own assumptions about the ‘goodness-of-fit’ between the comparison group and the counterfactual.

Regression Discontinuity (RD) – this QED utilizes a clear, numerical cutoff (threshold) score for participation in a given intervention to create a counterfactual. Evaluators compare proximate observations from both sides of the cutoff to estimate the impact of a given intervention. Those just above the cutoff are the treatment group, whereas those just below are the comparison. This method assumes that the local impacts (just around the cutoff) are generalizable to the broader population.
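A stripped-down sketch of the RD logic, with an invented eligibility score, cutoff and outcomes, is shown below; an actual RD analysis would fit regressions on each side of the cutoff rather than comparing raw means.

# Hypothetical regression discontinuity: units at or above the cutoff receive the
# program; compare observations just above and just below the cutoff.
import pandas as pd

df = pd.DataFrame({
    "eligibility_score": [38, 41, 44, 47, 49, 51, 53, 56, 59, 62],
    "outcome":           [2.1, 2.3, 2.2, 2.6, 2.5, 3.4, 3.3, 3.6, 3.5, 3.9],
})
cutoff, bandwidth = 50, 5                        # assumed threshold and window

near = df[(df["eligibility_score"] >= cutoff - bandwidth) &
          (df["eligibility_score"] < cutoff + bandwidth)]
treated    = near[near["eligibility_score"] >= cutoff]["outcome"]
comparison = near[near["eligibility_score"] < cutoff]["outcome"]

print(f"Local impact estimate near the cutoff: {treated.mean() - comparison.mean():.2f}")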

Difference in Differences (DD) – this QED compares the changes in development outcomes over a given period of time between the beneficiaries (treatment) and a purposively selected group of non-beneficiaries (comparison). This method allows evaluators to control for differences between the treatment and comparison groups that are constant over time but forces us to assume that the two groups would have experienced the same changes (parallel trends). DD can be paired with either experimental designs or matching techniques to increase analytical rigor.
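The arithmetic of the DD estimate is simple enough to show directly. The group means below are invented; the point is only the double difference itself.

# Hypothetical difference-in-differences: change among beneficiaries minus change
# among the comparison group over the same period.
treat_before, treat_after = 42.0, 55.0   # mean outcome, beneficiaries
comp_before,  comp_after  = 40.0, 47.0   # mean outcome, non-beneficiaries

did_estimate = (treat_after - treat_before) - (comp_after - comp_before)
print(f"Difference-in-differences impact estimate: {did_estimate:.1f}")
# Valid only under the parallel-trends assumption noted above.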

Matching – this QED relies on large data sets to construct the best possible comparison group on the basis of observed characteristics. Matching can be conducted using a variety of methods including Propensity Score Matching, where each unit is assigned a probability (0-1) that it will participate in a given program. The score is expressed through a summary index of relevant characteristics. Because matching is based on observed characteristics, this method necessitates the assumption that there are no differences in unobservable characteristics (motivation, etc.).
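A minimal propensity score matching sketch, using scikit-learn for the participation model and nearest-neighbor pairing on the score, is shown below. The data, covariates and outcome are all hypothetical, and a real application would check covariate balance and use far larger samples.

# Hypothetical propensity score matching on two observed characteristics.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({
    "treated": [1, 1, 1, 0, 0, 0, 0, 0],
    "age":     [24, 31, 28, 23, 35, 29, 41, 26],
    "income":  [310, 420, 380, 300, 500, 390, 610, 330],
    "outcome": [6.4, 7.1, 6.8, 5.9, 6.6, 6.2, 6.9, 6.0],
})

model = LogisticRegression().fit(df[["age", "income"]], df["treated"])
df["pscore"] = model.predict_proba(df[["age", "income"]])[:, 1]   # P(participation)

controls = df[df["treated"] == 0]
effects = []
for _, unit in df[df["treated"] == 1].iterrows():
    match = controls.iloc[(controls["pscore"] - unit["pscore"]).abs().argmin()]
    effects.append(unit["outcome"] - match["outcome"])

print(f"Matched impact estimate: {sum(effects) / len(effects):.2f}")
# Matching adjusts only for the observed characteristics included in the model.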

KEY RESOURCES
• Gertler, P. et al. (2011). Impact Evaluation in Practice. The World Bank, Washington, D.C.
• International Initiative for Impact Evaluation (3ie). http://www.3ieimpact.org/
• Khandker, S. R., Koolwal, G. B., & Samad, H. A. (2010). Handbook on Impact Evaluation: Quantitative Methods and Practices. Washington, DC: The International Bank for Reconstruction and Development / The World Bank. http://go.worldbank.org/9H20R7VMP0
• World Bank. Development Impact Evaluation Initiative. http://go.worldbank.org/1F1W42VYV0
• Duflo, E. (2007). Using Randomization in Development Economics Research: A Toolkit. Centre for Economic Policy Research, London. econ-www.mit.edu/files/806


PART III: Data Collection Methods


MINING PROJECT RECORDS AND SECONDARY DATA

What is it?
Data mining uses project documents and records or secondary sources such as published reports, censuses, surveys and comparative international data during the evaluation. Project documents that can be mined include periodic project reports (monthly, biannual, annual), baseline data, needs assessments, grant databases, internal and external evaluations, technical advisor input reports, field reports, and project logs and diaries kept by project personnel or beneficiaries. Mining secondary data can include the use of qualitative and ethnographic data such as posters, graffiti, mass media reports (newspapers, TV, etc.), e-mail and social media (Facebook, YouTube, etc.). Examples of widely used comparative international data sets include: MDG statistics, the UN Human Development Index, Demographic and Health Surveys, World Bank World Development Indicators and Transparency International.
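In practice much of this work is simple data handling. The sketch below, with entirely hypothetical district figures and column names, shows one common step: joining project monitoring records to a secondary data set so that project data can be checked against an independent source.

# Hypothetical join of project records with a secondary data source.
import pandas as pd

project = pd.DataFrame({                       # e.g., mined from quarterly reports
    "district": ["North", "South", "East"],
    "participants_trained": [240, 310, 150],
})
secondary = pd.DataFrame({                     # e.g., extract from a national survey
    "district": ["North", "South", "East"],
    "adult_population": [12000, 18500, 9000],
})

merged = project.merge(secondary, on="district", how="left")
merged["coverage_rate"] = merged["participants_trained"] / merged["adult_population"]
print(merged)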

What can we use it for?
• Complement or check data collected during the evaluation
• Reconstruct baseline data when the evaluation is commissioned late in the project cycle
• Develop a sampling frame for the selection of the project or comparison groups
• Define the counterfactual when it is not possible to use a pre/post-test comparison group evaluation design

ADVANTAGES:
• Produces significant cost and time savings
• Strengthens sample design by ensuring coverage of the total target population
• Improves matching of the project and comparison groups through techniques such as propensity score matching and instrumental variables
• Provides an independent check of data validity
• Enriches quality and interpretation of evaluation findings
• Provides an alternative way to define the counterfactual – particularly useful for complex evaluations


CHALLENGES:
• Voluminous project records may be difficult and time-consuming to review and analyze
• Often does not cover the correct time period, level of analysis, or correct sample
• May not provide all of the required information
• Difficult and time-consuming to check the validity of secondary data

COST:
The cost of using published data or well-documented surveys is very low. However, using project records or data from other organizations may require significant costs to put the data in a form that can be used for the evaluation.

SKILLS REQUIRED:
Experience with statistical analysis of survey data.

TIME REQUIRED:
Relatively short compared to primary data collection.

KEY RESOURCES
• http://betterevaluation.org/plan/describe/existing_documents
• Bamberger, M. (Nov. 2010). Reconstructing baseline data for impact evaluation and results measurement. Special Series on the Nuts and Bolts of M&E Systems, No. 4. In Poverty Reduction and Economic Management Notes. The World Bank. http://siteresources.worldbank.org/INTPOVERTY/Resources/335642-1276521901256/premnoteME4.pdf
• Boslaugh, S. (2007). An Introduction to Secondary Analysis. Excerpt from Secondary Data Sources for Public Health: A Practical Guide. New York: Cambridge University Press. http://assets.cambridge.org/97805218/70016/excerpt/9780521870016_excerpt.pdf


FORMAL SURVEYS

What are they?
Formal surveys are used to collect standardized information from a carefully selected sample of individuals or aggregated units (households, schools, etc.). Surveys often collect comparable information for a relatively large number of people in particular project groups.
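One early design step is estimating how many completed interviews a survey needs. The sketch below shows a standard back-of-the-envelope calculation for estimating a proportion; the margin of error, confidence level and design effect are illustrative assumptions, not DoS requirements.

# Rough sample size for estimating a proportion from a formal survey.
import math

margin_of_error = 0.05   # desired precision: +/- 5 percentage points
z = 1.96                 # 95% confidence
p = 0.5                  # most conservative assumption about the true proportion

n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
design_effect = 1.5      # assumed inflation for cluster sampling
print(f"Required completed interviews: about {math.ceil(n * design_effect)}")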

What can we use them for?
• Providing baseline data against which to compare strategy and performance
• Comparing different groups at a given point in time
• Comparing changes over time in the same group
• Providing a key input to a formal evaluation of the impact of a program or project
• Assessing the level of need in a particular target group or in a particular sector as the basis for preparing a project or program design

ADVANTAGES:
• Findings from the right sample of respondents can be applied to the wider target group or the population as a whole
• Quantitative estimates can be made for the size and distribution of impacts
• With the proliferation of donor surveys and national statistics agencies, there may be good survey data to draw or build on

CHALLENGES:
• With the exception of Core Welfare Indicators Questionnaires (CWIQ), results are often not available for a long period of time
• The processing of data and quality assurance can be a major bottleneck for the larger surveys, even with software tools such as the Statistical Package for the Social Sciences (SPSS)
• Demographic and Health Surveys (DHS) and household surveys are expensive and time consuming
• Used without qualitative methods, surveys may give an incomplete picture of results and underlying causes of change


COST:
Ranges from $30-60 per household for the CWIQ to $1-3 million for a full DHS. Costs may be significantly higher if there is no master sampling frame for the country.

SKILLS REQUIRED:
Sound technical and analytical skills for sample and questionnaire design, data analysis, processing and reporting.

TIME REQUIRED:
Depends on the sample size. The CWIQ can be completed in two months. Standard DHS fieldwork generally requires between three and seven months to complete, while data collection through the final report takes one year to 18 months.

KEY RESOURCES
• Measure DHS. Demographic and Health Surveys. http://www.measuredhs.com/start.cfm
• Grootaert, C. and van Bastelaer, T. (2002). Understanding and Measuring Social Capital: A Multidisciplinary Tool for Practitioners. Washington, DC: World Bank.
• Sapsford, R. (2011). Survey Research (2nd ed.). Newbury Park, CA: Sage Publications.
• World Bank. Core Welfare Indicator Questionnaire (CWIQ). http://go.worldbank.org/66ELZUGJ30
• World Bank. Living Standards Measurement Survey (LSMS). http://www.worldbank.org/lsms/
• World Bank. Quantitative Service Delivery Surveys (QSDS). http://go.worldbank.org/MB54FMT3E0
• World Bank. Citizen Report Card and Community Scorecard. http://go.worldbank.org/QFAVL64790


SOME TYPES OF SURVEYS

Public Opinion Surveys are designed to represent the opinions of a population by asking a series of questions of a sample and then extrapolating generalities within confidence intervals. For example, The Asia Foundation's Survey of the Afghan People provides insights into Afghans' views on security, national reconciliation, the economy, governance, corruption, justice, development, provision of services and gender equality. The survey has been conducted annually since 2006 and tracks public opinion trends on these issues.

Demographic and Health Surveys (DHS) are nationally-representative household surveys that provide data for a wide range of monitoring and impact evaluation indicators in the areas of population, health, and nutrition. Standard DHS Surveys have large sample sizes and typically are conducted every five years, to allow comparisons over time. Interim DHS Surveys focus on the collection of information on key performance monitoring indicators but may not include data for all impact evaluation measures (such as mortality rates). These surveys are conducted between rounds of DHS surveys and have shorter questionnaires than DHS surveys.

Core Welfare Indicators Questionnaire (CWIQ) is a household survey that measures changes in social indicators for different population groups—specifically indicators of access, utilization, and satisfaction with social and economic services. It is a quick and effective tool for improving activity design, targeting services to the poor and, when repeated annually, for monitoring activity performance. Preliminary results can be obtained within 30 days of the CWIQ survey.

Multi-Topic Household Survey (also known as the Living Standards Measurement Survey, or LSMS) is a multi-subject integrated survey that provides a means to gather data on a number of aspects of living standards to inform policy. These surveys cover spending, household composition, education, health, employment, fertility, nutrition, savings, agricultural activities and other sources of income. Single-topic household surveys cover a narrower range of issues in more depth.

Client Satisfaction (or Service Delivery) Survey is used to assess the performance of donor or government services based on client experience. The surveys shed light on the constraints clients face in accessing services, their views about the quality and adequacy of services, and the responsiveness of government officials. These surveys are usually conducted by a government ministry or agency.

Citizen Report Cards have been conducted by NGOs and think-tanks in many countries. Similar to service delivery surveys, they have also investigated the extent of corruption encountered by ordinary citizens. A notable feature has been the widespread publication of the findings.

Social Capital Surveys measure people’s perceptions of the trustworthiness of other people and key institutions that shape their lives, as well as the norms of cooperation and reciprocity that underlie attempts to work together to solve problems. These surveys have been used for monitoring and evaluating peacebuilding and transition programs.


RAPID APPRAISAL METHODS

What are they?
Data collection methods that can be employed quickly and at a low cost to obtain a narrow but in-depth understanding of the conditions and needs of the targeted group. These methods elevate the importance and relevance of local knowledge. Less structured than classic evaluation methods (i.e., surveys, experiments), they tend to use a smaller sample size and may therefore have less statistical accuracy.

What can we use them for?
• Accommodate resource constraints
• Investigate motivations and attitudes behind behaviors
• Assess the development hypothesis and facilitate the development of a more comprehensive, formal survey tool
• Identify the universe of stakeholders and opinion leaders/decision-makers

ADVANTAGES:
• Quick and cost-effective
• Highly adaptable
• Focus on qualitative information produces detailed data
• On-the-spot analysis allows for verification of conclusions by local people

CHALLENGES:
• Limited generalizability/reliability and lack of clear validation procedures lessen credibility
• Susceptible to agendas of participants
• Externally-driven process (not inherently participatory)
• Quality of results dependent on skill of evaluators

COST:
Low to medium, depending on the scope of the evaluation and the methods selected.

SKILLS REQUIRED:
Data collection (administration of individual and group interviews, group/meeting facilitation, field observation), cultural sensitivity, qualitative data analysis and basic statistical skills.


TIME REQUIRED:
Two to six weeks, depending on the scope of the evaluation (number of units to be evaluated); should be scheduled according to the lifestyle of the community being evaluated.

RAPID APPRAISAL METHODS

Key informant interviews – a series of individual interviews with a small, select group of people with vast knowledge of a particular subject. These are frequently semi-structured, following a prepared interview guide with predetermined topics or loosely-worded questions. They may be easier and less expensive than focus groups, given a lesser demand for coordination or incentives.

Focus groups – interviews facilitated by an impartial moderator with several homogenous groups of stakeholders. Typically comprised of seven to 12 participants and lasting one to two hours; anything longer should be broken into multiple sessions. Focus groups allow participants to build upon one another's comments, but are potentially at risk of producing data biased by the most vocal participants. Best for generating, testing or exploring ideas, or as a method of triangulation.

Community interviews – differ from focus groups because they occur in a public setting and are open to all community members. Interview protocol is usually more structured and there is less discussion amongst participants. Most interaction is between interviewer and participants.

Before-and-after photos (or drawings) – provide visual evidence of change (though not necessarily evidence of the reason for the change) that is easily understood by most audiences. Can be augmented with captions written by community members.

Direct observation – multiple evaluators consciously record what they see, hear and smell of their physical surroundings, activities, processes or discussions. In structured observation, evaluators look for a specific behavior, object or event and use a common form to record scores/comments. Direct observation makes evaluators aware of aspects either purposefully or inadvertently omitted in data collection from participants but provides only a snapshot of the situation.

Mini-surveys – much smaller than a formal questionnaire, mini-surveys focus on a narrowly-defined issue, question or problem, include 15-30 questions and are designed to take no more than 30 minutes to complete. They are administered to only 25-75 people, who are most often selected through nonprobability sampling. Mini-surveys are attractive to evaluators because they can generate quantitative, easily analyzable data fairly quickly (a simple tallying sketch follows these method descriptions). Web-based tools such as Survey Monkey are very useful where respondents have good connectivity.

Document review – review of project or external materials can provide information about the context and events that occurred prior to evaluation.

Scoring/ranking – assesses the relative importance of different items. Ranking requires ordering by priority; scoring requires assigning a value.
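To make the mini-survey and scoring/ranking entries concrete, the sketch below tallies a hypothetical scoring exercise; the items, respondents and scores are invented for illustration and are not drawn from any real evaluation.

```python
# Hypothetical tally of a mini-survey scoring exercise: respondents assign
# each item a score from 1 (least important) to 5 (most important).
from statistics import mean

responses = {
    "water access":  [5, 4, 5, 3, 4],
    "road repair":   [2, 3, 2, 4, 3],
    "health clinic": [4, 5, 5, 5, 4],
}

# Score each item by its average, then rank items from highest to lowest.
scores = {item: mean(vals) for item, vals in responses.items()}
for rank, (item, score) in enumerate(
        sorted(scores.items(), key=lambda kv: kv[1], reverse=True), start=1):
    print(f"{rank}. {item}: average score {score:.1f}")
```

In practice the scores would come from the mini-survey responses themselves; the point is only that such data can be tallied and ranked with very little machinery.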


KEY RESOURCES
McNall, M. and Foster-Fishman, P. (2007). Methods of Rapid Evaluation, Assessment, and Appraisal. American Journal of Evaluation, 28:151. http://www.pol.ulaval.ca/perfeval/upload/publication_194.pdf
USAID. (2007). Using Rapid Appraisal Methods. Performance Monitoring & Evaluation TIPS, 2nd Ed., No. 5. http://www.usaid.gov/policy/evalweb/documents/TIPS-UsingRapidAppraisalMethods.pdf


PARTICIPATORY METHODS

What are they? A collection of methods designed to facilitate ownership of M&E findings and recommendations among the local population. Project beneficiaries play the primary role in evaluation planning, data collection, analysis, and reporting. Follow-up actions are decided upon and implemented locally. The methods are flexible, visual (sometimes oral) and group-oriented; a small evaluation team facilitates but does not dictate the process.

What can we use them for?
• Stakeholder analysis, problem analysis, community assessment
• Project design and implementation
• Monitoring and evaluation

ADVANTAGES:

• Empowers and has potential to build local capacity
• Ownership by local population can boost quality (comprehensiveness, accuracy) of findings
• Facilitates the exploration of sensitive subjects
• Monitoring allows corrective action to be taken sooner
• Increases the chance that beneficiaries will be supportive/actively involved in implementation of evaluation recommendations, thereby increasing likelihood of sustainability
• Increases information transparency

CHALLENGES:

• Less objectivity than other methods, potentially reducing credibility of results
• Evidence is primarily anecdotal
• Requires substantial time commitment from locals
• May raise participants’ expectations of project results
• Potential for domination and misuse by some stakeholders to further their own interests

COST: Low to medium, depending on scope and depth of application.


SKILLS REQUIRED: Familiarity with, or a minimum of several days’ training in, participatory approaches; group activity facilitation; ability to create a safe, enabling environment; listening; respect; analysis of qualitative data.

TIME REQUIRED: Varies widely according to the scope of the evaluation and the methods selected; typically a few days to a week per community for some of the rapid appraisal methods, but much longer for other methods (e.g., participant observation) and for follow-up.

COMMONLY USED PARTICIPATORY TOOLS
Participant observer – full immersion, to the extent possible, in the local culture. This method allows the researcher to draw conclusions from first-hand experience.

Participatory Rural Appraisal (PRA) – can be used in both rural and urban areas. Some evaluators also categorize the methods listed below as rapid appraisal methods; the difference lies in whether the method is applied in a way that allows the community to “own” the data – a key tenet of participatory methods.

Participatory mapping – collectively creating a map of the community

Participatory calendars – collectively recalling a history or projecting an anticipated schedule of events (e.g., the timing of the rainy season)

Transect walks – a small group of locals walks the evaluators/facilitators through the community and discusses what they observe

Creative expression – drawing, storytelling, drama, role-playing, music, collage making

Participatory video – using community-made videos to assess change

KEY RESOURCES
Harvey, E. (2005). Guide for Participatory Appraisal, Monitoring and Evaluation (PAME). Braamfontein, South Africa: The MVULA Trust. http://www2.gtz.de/Dokumente/oe44/ecosan/en-guide-participatory-monitoring-evaluation.pdf
Taylor-Powell, E., Rossing, B. & Geran, J. (July 1998). Evaluating Collaboratives: Reaching the Potential. University of Wisconsin-Extension. http://learningstore.uwex.edu/assets/pdfs/G3658-8.PDF [Creative Expression]
UNDP (1997). Who are the Question Makers: A Guide to Participatory Evaluation. http://www.undp.org/evaluation/documents/who.htm



PART IV: Evaluation Tools and Approaches


RESULTS FRAMEWORKS

What are they? A results framework (RF) is a graphical representation of a development hypothesis. It demonstrates the causal linkages between all levels of results necessary and sufficient to achieve a specific bureau or mission goal. These results must be realistic and achievable, one-dimensional, measurable and within the manageable interest of the implementing Operating Unit. RFs are based on problem analysis and information produced by technical analysis and other related assessments.
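As an illustrative aid (not a prescribed format), the sketch below captures a simple RF as a nested structure so the causal chain can be reviewed or printed; the goal and result statements are hypothetical placeholders.

```python
# A hypothetical results framework captured as a simple nested structure:
# the goal sits at the top, with intermediate results (IRs) and sub-IRs
# representing the causal chain assumed to lead to it.
results_framework = {
    "goal": "Citizens have greater confidence in local governance",
    "intermediate_results": [
        {
            "result": "Local councils manage budgets more transparently",
            "sub_results": [
                "Council staff trained in public financial management",
                "Budget documents published and publicly accessible",
            ],
        },
        {
            "result": "Civil society monitors service delivery",
            "sub_results": [
                "Citizen scorecards piloted in target districts",
            ],
        },
    ],
}

def print_framework(rf):
    """Print the causal chain from the goal down to the sub-results."""
    print(f"GOAL: {rf['goal']}")
    for ir in rf["intermediate_results"]:
        print(f"  IR: {ir['result']}")
        for sub in ir["sub_results"]:
            print(f"    Sub-IR: {sub}")

print_framework(results_framework)
```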

What can we use them for?
• Planning – help determine the cause-and-effect pathway of objectives needed to reach a bureau or mission goal
• Assessment – help the evaluator understand the logic behind the design of a particular intervention
• Evaluation design – the causal chain and anticipated results can help shape evaluation questions and methods

ADVANTAGES:

• Brings the big picture to light through a focus on intervention effects and outcomes rather than outputs
• Solidifies linkages between program outcomes and national-level goals and strategies
• Facilitates agreement and understanding on the design and anticipated goals of interventions, and generates ownership amongst all mission or bureau team members

CHALLENGES:

• Bias toward quantifiable indicators
• Potential to oversimplify complex interventions

COST: Very little for development of the actual RF; however, the cost of preceding analyses and assessments will vary.


SKILLS REQUIRED: Several days of training are recommended to gain an understanding of RF development. For actual development, team members will need an understanding of the local context in which results are being sought, technical knowledge, data analysis and interpretation, problem recognition and logical reasoning.

TIME REQUIRED: Actual development may take only a week or two but depends on the results of assessments and analyses (e.g., environmental, gender, economic) that may require up to several months.

KEY RESOURCES
Department of State (2012). Managing for Results: Department of State Project Design Guidebook
Department of State (2012). Functional Bureau Strategy Guidance and Instructions
Department of State (2012). Integrated Country Strategy Guidance and Instructions
Department of State (2012). Joint Regional Strategy Guidance and Instructions
USAID. (2010). Building a Results Framework. Performance Monitoring & Evaluation TIPS, 2nd Ed., No. 13. http://pdf.usaid.gov/pdf_docs/PNADW113.pdf


PERFORMANCE MANAGEMENT PLANS

What are they? Performance management plans (PMPs) measure and track progress toward achieving results by identifying and defining a list of project-related indicators. The plans typically include an overview of the bureau or mission's management systems, how the PMP was developed, the relevant results framework, a narrative on the development hypothesis, indicator reference sheets, an indicator table, and an M&E task schedule.
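As a hedged illustration of the kind of information an indicator table tracks, the sketch below stores a hypothetical indicator reference entry and compares actuals against targets; the indicator, fields and figures are invented examples, not Department templates.

```python
# Hypothetical indicator reference entries of the kind a PMP indicator
# table might track; all fields and values are illustrative only.
indicators = [
    {
        "indicator": "Number of officials completing budget training",
        "definition": "Count of unique officials finishing all modules",
        "data_source": "Training attendance records",
        "frequency": "Quarterly",
        "baseline": 0,
        "targets": {2023: 50, 2024: 120},
        "actuals": {2023: 45},
    },
]

# Compare reported actuals against targets for each year.
for ind in indicators:
    for year, target in ind["targets"].items():
        actual = ind["actuals"].get(year)
        if actual is None:
            status = "no data yet"
        else:
            status = f"{actual}/{target} ({actual / target:.0%} of target)"
        print(f"{ind['indicator']} - {year}: {status}")
```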

What can we use them for?
• Establish monitoring and evaluation systems for reporting results, including how information is collected, reviewed and analyzed
• Document definitions, assumptions and decisions
• Use data collection to make informed decisions
• Better communicate facts and figures on program achievements and progress to Department stakeholders (host country counterparts, partners, Congress, OMB and American taxpayers)

ADVANTAGES:

• Puts bureau or mission teams on the same page at early stages of project development
• Forces measurement of change for critical indicators
• Helps keep M&E activities on schedule (e.g., data collection, data quality assessment, evaluations)
• Improves knowledge, transparency, and accountability

CHALLENGES:

• Easy to develop an unwieldy PMP with too many indicators
• Time and cost implications may make ideal indicators unreasonable to collect

COST: Minimal for PMP development; baseline data collection costs will vary.

SKILLS REQUIRED: Training on the overall Department results-based management approach and PMP development; technical knowledge (in the relevant sector); in-depth knowledge of country context; knowledge of performance management methodologies.


TIME REQUIRED: Developing the PMP document, soliciting and integrating input, and having it reviewed by management may take 2-4 weeks. Depending on timing, a team may decide to wait to collect baseline data and establish targets. Prep time may be needed to develop SOWs for any aspects of PMP development that will be contracted out.

KEY RESOURCES
Department of State (2012). Performance Management Guidebook


GENDER ANALYSIS

What is it? An approach to program planning, monitoring and evaluation that assesses the different ways in which interventions affect men and women, boys and girls, and people of differing civil status (single, married, divorced, widowed, etc.). This analysis can be conducted at the project, sector and national levels and ensures that differences are addressed in the evaluation design, sample selection, data collection and analysis. It recognizes the limitations of conventional quantitative data collection methods for discussing sensitive topics such as domestic violence, control of household resources, sexual harassment, social and political participation, and gender differences in the labor market.
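A minimal sketch of what sex-disaggregated analysis can look like in practice is shown below, assuming a small, invented set of survey records; the variables and scores are hypothetical.

```python
# Hypothetical sex-disaggregated comparison of a program outcome score.
from statistics import mean

respondents = [
    {"sex": "female", "participated": True,  "score": 72},
    {"sex": "female", "participated": False, "score": 58},
    {"sex": "male",   "participated": True,  "score": 70},
    {"sex": "male",   "participated": False, "score": 64},
    {"sex": "female", "participated": True,  "score": 80},
    {"sex": "male",   "participated": True,  "score": 66},
]

# Report the mean outcome separately by sex and participation status.
for sex in ("female", "male"):
    for participated in (True, False):
        group = [r["score"] for r in respondents
                 if r["sex"] == sex and r["participated"] == participated]
        if group:
            label = "participants" if participated else "non-participants"
            print(f"{sex} {label}: mean score {mean(group):.1f} (n={len(group)})")
```

Printing participant and non-participant means separately for each sex is the simplest form of disaggregation; a real analysis would also examine whether the differences are statistically meaningful.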

What can we use it for?
• Identify and address design, implementation and outcome issues with differential consequences for men and women, boys and girls
• Promote equity and human rights
• Promote economic efficiency of programs by ensuring full participation of both sexes and maximizing the different resources that men and women contribute

ADVANTAGES:

• Ensures that all sectors of the target population benefit from interventions and that resources of all sectors of the target community are mobilized
• Ensures efficiency and equity of program impact
• Addresses social, cultural, legal and political factors that limit women’s participation at the household, community, local and national levels
• Addresses sensitive human rights issues such as human trafficking, sex trade and exposure to HIV/AIDS

CHALLENGES:

• May raise sensitive issues that governments may not wish to address and that donors may not wish to push
• Uses frameworks and data collection techniques with which many quantitatively trained researchers may not be familiar, sometimes causing reluctance to use these techniques
• May require additional resources to contract staff with specialist skills


COST: Depending on the methods used, gender analysis may increase the cost of the evaluation by 10-20%. Where a stand-alone gender analysis is required, the cost will be similar to that of comparable conventional evaluation studies.

SKILLS REQUIRED: Sound knowledge of both quantitative and qualitative research and evaluation techniques, combined with familiarity with current thinking on gender and development as well as the special issues involved in collecting data on sensitive gender-related topics.

TIME REQUIRED: When used to complement a conventional evaluation, gender analysis may increase the required staff weeks by 10-20%. However, if carefully coordinated, most of the follow-up surveys or in-depth interviews can be conducted at the same time as the regular survey.

KEY RESOURCES
Department of State (2012). Department of State Policy Guidance: Promoting Gender Equality to Achieve Our National Security and Foreign Policy Objectives. http://www.state.gov/s/gwi/rls/other/2012/187001.htm
Bamberger, M. (2005). Handbook for Evaluating the Impacts of Development Policies and Programs. Developed for the International Program for Development Evaluation Training Workshop. Carleton University, Ottawa. http://bambergerdevelopmentevaluation.org [click on “gender”]


COST-BENEFIT AND COST-EFFECTIVENESS ANALYSIS

What are they? Cost-benefit and cost-effectiveness analyses are tools for assessing whether or not the costs of an activity can be justified by its outcomes and impacts. Cost-benefit analysis measures efficiency by monetizing all inputs, outputs and outcomes. Cost-effectiveness analysis estimates inputs in monetary terms and outcomes in non-monetary quantitative terms (such as improvements in student reading scores). Whereas cost-effectiveness analysis focuses on a particular outcome, cost-benefit analysis seeks to include all outcomes, each converted to a monetary benefit.
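The underlying arithmetic can be illustrated with a short sketch; the cash flows, the 10% discount rate and the outcome figure below are hypothetical placeholders, not values from any actual project.

```python
# Illustrative cost-benefit and cost-effectiveness arithmetic.
# All figures, including the 10% discount rate, are hypothetical.

def npv(flows, rate):
    """Net present value of a series of annual flows (year 0 first)."""
    return sum(f / (1 + rate) ** t for t, f in enumerate(flows))

discount_rate = 0.10
costs    = [100_000, 20_000, 20_000, 20_000]   # project costs by year
benefits = [0, 60_000, 70_000, 80_000]         # monetized benefits by year

pv_costs = npv(costs, discount_rate)
pv_benefits = npv(benefits, discount_rate)

print(f"NPV of net benefits: {pv_benefits - pv_costs:,.0f}")
print(f"Benefit-cost ratio:  {pv_benefits / pv_costs:.2f}")

# Cost-effectiveness: cost per unit of a single, non-monetized outcome,
# e.g. cost per additional student reaching a reading benchmark.
students_reaching_benchmark = 1_200
print(f"Cost per student:    {pv_costs / students_reaching_benchmark:,.0f}")
```

Run as-is, the sketch prints the NPV of net benefits, the benefit-cost ratio and a cost-per-outcome figure; a real analysis would draw these inputs from project financial records and the evaluation’s outcome data.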

What can we use them for?
• Identifying projects or approaches that offer the most efficient allocation of resources
• Assessing a project’s outcomes relative to its costs, facilitating comparison with other projects

ADVANTAGES:

• A high-quality approach for estimating and comparing the efficiency of programs and projects
• Makes explicit project assumptions that might otherwise remain implicit or overlooked at the design stage
• Useful for convincing policy-makers and funders that an activity’s benefits justify its costs

CHALLENGES:

• Fairly technical, requiring specialized financial and human resources
• Converting benefits (such as increased life expectancy) into monetary terms often requires assumptions that strongly influence results
• Cost-effectiveness may be difficult to interpret in projects with multiple types of outcomes
• Requisite data for cost-benefit calculations may not be available
• Results must be interpreted with care, particularly in projects where results are difficult to quantify or are poorly measured


COST: Varies greatly, depending on scope of analysis and availability of data.

SKILLS REQUIRED: The procedures used in both types of analyses are often highly technical. They require skill in the economic analysis of programs in the sector, as well as the availability of relevant economic and cost data.

TIME REQUIRED: Varies greatly depending on scope of analysis and availability of data.

KEY RESOURCES
Belli, P., et al. (2000). Economic Analysis of Investment Operations: Analytical Tools and Practical Applications. The World Bank, Washington, D.C.
Millennium Challenge Corporation. (April 2009). Guidelines for Economic and Beneficiary Analysis. http://www.mcc.gov/documents/guidance/guidance-economicandbeneficiaryanalysis.pdf


INFORMATION COMMUNICATION TECHNOLOGY (ICT) FOR EVALUATION

What is it? ICT for Evaluation encompasses a broad and growing range of tools to increase the effectiveness and efficiency of international evaluations. These tools include Personal Digital Assistants (PDAs), smartphones, netbooks, iPads, email and web-based surveys, digital photos, audio and video, and online focus groups for collecting data. In addition, there are technologies such as WebEx or Skype for enhancing the communication of geographically dispersed evaluation teams. Finally, there are tools for analyzing quantitative data, such as the Statistical Package for the Social Sciences (SPSS) and Statistical Analysis Software (SAS), and tools for analyzing qualitative data, such as the Centers for Disease Control and Prevention’s EZ-Text (free) plus commercial products such as NVivo and Atlas-ti, which store, code, manage, analyze and retrieve qualitative data.
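Many of these tools support programmable logic such as skip patterns (noted under Advantages below). As a hedged illustration only, the sketch below shows how a skip pattern might be expressed; the question names and logic are invented and do not reflect any particular survey platform.

```python
# Hypothetical skip-pattern logic of the kind electronic survey tools allow:
# a follow-up question is asked only when a screening answer makes it relevant.
def administer(respondent_answers):
    record = {}
    record["attended_training"] = respondent_answers["attended_training"]
    if record["attended_training"] == "yes":
        # Only asked if the respondent attended the training.
        record["training_useful"] = respondent_answers.get("training_useful")
    else:
        record["training_useful"] = None  # automatically skipped
    return record

print(administer({"attended_training": "yes", "training_useful": "very"}))
print(administer({"attended_training": "no"}))
```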

ADVANTAGES:

• Electronic surveys: substantial cost savings, immediate aggregation of data, reduced transcription errors, programmable skip patterns and, with GIS-enabled devices, geo-positioning of survey respondents
• Virtual communication tools: the ability for teams to better plan and coordinate their activities in the field and to collaborate on findings and report writing when not collocated
• Qualitative data analysis tools: the ability to handle large data sets, effective storage systems and, once coding has been established, quick and easy access for analyzing data and establishing relational networks among data categories; these tools also establish an audit trail as coding of qualitative data proceeds

CHALLENGES:

• High upfront costs for acquiring handhelds for survey teams
• Special skills required for programming and operating handheld devices and specialized software packages
• Operating environments in remote locations may not include access to internet, cell phones or electricity


• Tools like NVivo and Atlas-ti take time to learn; inputting and coding the data can be time-consuming, and once categories, codes, etc. have been established it may be hard to change them
• In field settings, capturing qualitative data on digital recorders may not be practical, and producing quality transcripts for qualitative data analysis may add large and impractical amounts of time to the data collection level of effort and budget

COST: Varies greatly depending on technologies used.

SKILLS REQUIRED: ICT experts to program PDAs and handheld devices; team members who are skilled in using specific quantitative and qualitative software packages.

TIME REQUIRED: Highly variable depending on technologies used.

KEY RESOURCES
Sue, V. M., & Ritter, L. A. (2012). Conducting Online Surveys (2nd ed.). Thousand Oaks, CA: SAGE Publications.
http://www.cdc-eztext.com/
http://betterevaluation.org/blog/analyzing_data_using_common_software
The Evaluation Exchange, Volume X, Number 3, Fall 2004. Taking the Next Step: Harnessing the Power of Technology for Evaluation. Harvard Family Research Project.
