An Introduction to Evaluation

“Evaluation” is a term used in applied science that encompasses a diversity of philosophies and methodologies. The word itself implies that the goal is to determine the “value” of some thing. Judgments concerning value can be made using subjective, qualitative methods as well as objective, quantitative methods. In other words, evaluation data is often both written/linguistic (transcripts of interviews) and numeric (data from surveys). Statistics guide our interpretations of the numeric data; analysis of qualitative data can be systematized, but will still remain subjective. Therefore, evaluation science is not an “exact” science – and being an exact science is not its goal. In assessing value, evaluation science seeks to take both subjective and objective data into account in order to provide the most holistic picture possible. This is referred to as the “Mixed-Methods Approach.”

Approaches to Evaluation

In general, evaluations can be said to be process focused (examining and describing a set of processes), outcome focused (assessing a set of desired outcomes), or both. Process evaluations are often referred to as formative, in that they provide good data with which to generate hypotheses for an outcome evaluation. Outcome evaluations are often referred to as summative, due to their focus on drawing conclusions. Different overall approaches to evaluation are best understood by examining the underlying philosophy of each approach:

1. Accountability models

These are audit-type models that examine the correspondence between a program’s original goals and whether those goals were attained. Goals are generally quantifiable. These models also place emphasis on tracking the money – they examine how the funds provided for a program were used. Examples of accountability evaluations include efficiency evaluations (evaluations that document inefficient practices), cost analyses (cost-benefit analysis, cost-per-participant, cost-budgeting analysis), and quality control studies.

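Cost-per-participant and benefit-cost figures of the kind just listed reduce to simple arithmetic. Below is a minimal sketch of the calculations; the budget, participation, and benefit figures are hypothetical, invented purely for illustration.

```python
# A minimal cost-analysis sketch. All figures are hypothetical.

# Total funds spent by the (hypothetical) program over the evaluation period
total_cost = 250_000.00          # dollars

# Number of participants served over the same period
participants_served = 1_250

# Estimated dollar value of measured benefits (e.g., costs avoided).
# Estimating this number is itself a major methodological task.
estimated_benefit = 410_000.00   # dollars

cost_per_participant = total_cost / participants_served
benefit_cost_ratio = estimated_benefit / total_cost

print(f"Cost per participant: ${cost_per_participant:,.2f}")
print(f"Benefit-cost ratio:   {benefit_cost_ratio:.2f}")
# A ratio above 1.0 suggests that measured benefits exceeded costs.
```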

2. Due diligence models

The central goal of due diligence evaluation is simply to maintain follow-up reporting. It is most commonly employed when the information desired from the program is not extensive and minimum standards for that reporting are well defined, but the real value obtained is in the reporting process itself. It is the “reporting back” that is most essential, and this generally helps both parties by keeping the organizations in contact through a standardized set of procedures. The strength of due diligence evaluation is its emphasis on standards and relationship maintenance.

3. Participatory Action models

These models were developed to make the process of evaluation more participatory and egalitarian, and to focus the results of evaluation on producing actionable recommendations. The agency or program being evaluated collaborates with the evaluator on everything from the questions the evaluation should answer, to the methods that will be used, to the collection, analysis, and interpretation of data, to the formation of actionable recommendations for the future.

Participatory models have a number of strengths. They help to “teach” the value of engaging in evaluation practices by creating a fully collaborative partnership. Programs that undergo such evaluations are given the information, methods, tools, and training necessary to conduct evaluations independently in the future. Evaluation itself is demystified. And because the program being evaluated helps to define the research questions asked, buy-in to the recommendations that emerge is more likely. Evaluating organizations also benefit: by providing consultative and technical assistance that the program would normally have to contract for, they form strong partnerships with the programs they evaluate.


Methods Used in Evaluation Research

As discussed earlier, both quantitative (numeric) and qualitative (linguistic) methods are used in evaluation research.

Examples of quantitative methods

1. Surveys and questionnaires

Surveys tend to ask questions of a more global nature, while questionnaires tend to be more focused. Public opinion polls and other such large-group, global instruments are surveys. Surveys are often conducted to examine policy-level questions, public attitudes, and people’s perceptions of larger issues. Their advantage is that they are easy and quick to administer to large numbers of people. Therefore, one of their implied goals is to form conclusions that are applicable to large groups of people (i.e., that are “representative”). The primary disadvantages of surveys are that they tend to ask very global questions which each individual might interpret differently, they force the quantification of concepts that may not be best studied through quantification, and they lack context – it is not always easy to interpret what the results of surveys actually mean. Surveys are analyzed using statistical software, and the analysis of a survey or questionnaire’s statistical properties (its reliability and validity) is referred to as psychometrics.

Questionnaires are like surveys in that they can reach many people, and they suffer from the same disadvantages that surveys do, but their topics tend to be much more focused and clearly defined, so their questions tend to be more specific and less global in nature. Questionnaires are used in a variety of settings. In general, questionnaires are better contextualized than surveys because of their more discrete focus. Thus, interpreting what data from questionnaires mean is generally a little more straightforward than it is for surveys. Questionnaires are also analyzed using statistical software.

Surveys and questionnaires ask closed-ended questions, which are tied to a particular scale. Sometimes questions require simple “yes” or “no” answers, or provide a choice between several categories (e.g., which of these three statements best describes your feelings?). These yes/no and categorical scales are called forced-choice scales – they force the respondent to choose the best-fitting response. Other times, respondents are asked to rate the question or statement using a numeric scale (e.g., how strongly did that affect you, on a scale of 1 = no effect to 5 = very strongly). Numeric scales can be embedded into what look like categories at times (e.g., how strongly do you agree or disagree with each statement? strongly disagree, disagree, agree, strongly agree).

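Psychometric analysis is usually done in statistical software, as noted above, but the core of one common reliability statistic, Cronbach’s alpha, is simple enough to sketch directly. The Likert-scale responses below are hypothetical, and the code uses only Python’s standard library.

```python
from statistics import pvariance

# Hypothetical responses: each row is one respondent, each column one
# Likert item scored 1 (strongly disagree) to 4 (strongly agree).
responses = [
    [4, 3, 4, 4],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
    [4, 4, 3, 4],
]

k = len(responses[0])                     # number of items
items = list(zip(*responses))             # one tuple of scores per item
totals = [sum(row) for row in responses]  # each respondent's total score

# Cronbach's alpha = (k / (k - 1)) * (1 - sum of item variances / variance of totals)
item_variances = sum(pvariance(item) for item in items)
alpha = (k / (k - 1)) * (1 - item_variances / pvariance(totals))

print(f"Cronbach's alpha: {alpha:.2f}")   # values near 1.0 indicate high internal consistency
```

For these invented responses alpha works out to about 0.92, which would indicate that the four items are measuring something in common.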

2. Deductive coding

Deductive coding is a process that can be used to extract quantitative information from qualitative data (such as interview transcripts, archival materials, etc.). For example, if I ask each interviewee how they feel about a new public policy, I can rate the intensity of their response using a numeric scale. I can also perform counts of key words, or count the number of sentences spent discussing a particular topic. I can apply a rubric meant to assign a “score” to an interviewee’s level of knowledge, awareness, etc. I can also assign scores to designate levels of importance or relevancy, which is often done with documents and archival materials. The numbers are then analyzed using statistical software. The major strength of deductive coding, and the reason it is used so frequently in evaluation research, is that it summarizes large amounts of qualitative data – which can be unwieldy – in quantitative, easy-to-communicate terms (percentages, proportions, etc.).

Examples of qualitative methods

1. Abstracting

The goal of abstracting is to choose, ahead of time, a set of topics of interest. Abstracting techniques can be applied to interview data, document reviews, and archival data. With interviews, transcripts are reviewed for participant commentary on the chosen topics, and the sections of each transcript that contain references to those topics are lifted out and placed in a new file – one file per topic. This brings together all interviewee opinions on each chosen topic; these topically organized files can then be summarized, using actual quotes from respondents to illustrate important points. (A minimal sketch of this technique appears below, following the discussion of coding.)

2. Inductive (Grounded Theory) coding

The difference between inductive and deductive coding is a simple one. In deductive coding, as discussed previously, topics of interest are chosen before the data is consulted. The opposite is true in inductive coding: codes are developed by reading through the data. This is why this type of coding is referred to as “Grounded Theory” coding – the main idea is to let the data tell the story from the ground up, rather than to check the data against an existing story (top-down, deductive coding). Both types of coding can reach a level of scientific reliability that rivals any survey or questionnaire method. Establishing coding reliability involves at least two coders working independently (in essence, checking each other for consistency). In inductive coding, when a final set of codes is developed, a third coder is often brought in to apply the codes to the data again, to ensure reliability.

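One common way to quantify the consistency of two independent coders is Cohen’s kappa, which adjusts raw agreement for the agreement expected by chance alone. Below is a minimal from-scratch sketch; the two coders’ code assignments are hypothetical, invented for illustration.

```python
from collections import Counter

# Hypothetical codes assigned independently by two coders to the same
# ten transcript segments.
coder_a = ["access", "cost", "cost", "trust", "access", "trust", "cost", "access", "trust", "cost"]
coder_b = ["access", "cost", "trust", "trust", "access", "trust", "cost", "access", "cost", "cost"]

n = len(coder_a)

# Observed agreement: proportion of segments where the coders match.
observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

# Chance agreement: probability both coders pick the same code at random,
# given how often each coder used each code.
freq_a, freq_b = Counter(coder_a), Counter(coder_b)
chance = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a.keys() & freq_b.keys())

kappa = (observed - chance) / (1 - chance)
print(f"Observed agreement: {observed:.2f}")
print(f"Cohen's kappa:      {kappa:.2f}")  # 1.0 = perfect agreement beyond chance
```

Here the coders agree on 8 of 10 segments (0.80), and kappa comes out to about 0.70 once chance agreement is removed.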

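And here is the promised sketch of abstracting: a pass that lifts topic-relevant passages out of transcripts and gathers them into one file per topic. The topics, keywords, and transcript excerpts are hypothetical stand-ins; in real use a human reader, not simple keyword matching, would typically judge relevance, but the file-per-topic organization is the same.

```python
from pathlib import Path

# Topics of interest, chosen before reading the transcripts, with a few
# keywords per topic. All names and quotes here are hypothetical.
topics = {
    "funding":  ["budget", "funding", "grant"],
    "staffing": ["staff", "volunteer", "turnover"],
}

transcripts = {
    "interview_01": [
        "The grant cycle makes long-term planning difficult.",
        "We rely heavily on volunteer coordinators.",
    ],
    "interview_02": [
        "Staff turnover has slowed our intake process.",
        "Our budget covers outreach but not follow-up.",
    ],
}

# Lift every passage that mentions a topic into that topic's file.
for topic, keywords in topics.items():
    excerpts = [
        f"[{source}] {line}"
        for source, lines in transcripts.items()
        for line in lines
        if any(kw in line.lower() for kw in keywords)
    ]
    Path(f"topic_{topic}.txt").write_text("\n".join(excerpts) + "\n")
```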

3. Ethnography

Ethnographic methods are most frequently used in anthropology, but they have become very common in evaluation research. The main method of ethnography is observation. In ethnography, participant observation is used most frequently: the evaluator immerses him/herself in the culture being studied and participates in that culture’s practices. Participant observation is most often recorded in the form of field notes. Field notes contain subjective impressions as well as objective observations, and can be analyzed using any of the methods described previously for the analysis of qualitative data.

Observation is not always participatory. More formalized methods of behavioral observation require advanced training in a particular coding system, which is applied to either live observations or video recordings. Some coding systems are very complex; some are very simple. When observation is conducted this way, there is the opportunity to produce quantifiable results rather than just qualitative data (a sketch of such interval-based tallying appears after the Case Studies section).

4. Case Studies

A case study is an in-depth, contextualized look at a particular program or agency that provides a meaningful and holistic view of that program or agency. Case studies generally employ mostly qualitative methods, including observation, interviewing, document reviews, etc. One important aspect of a good case study is a thorough exploration and description of the context in which the program lives, and a description of how this context shapes the program (in this way, case studies borrow somewhat from the emphasis on understanding local “culture” seen in ethnographies). Case studies are especially helpful when examining how different communities adapt the same program to meet local circumstances. Some evaluations call for multiple cases to be examined and compared, which then allows for even wider applicability of findings.

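Here is the interval-coding sketch promised in the Ethnography section above. It tallies behavior codes recorded at fixed observation intervals and converts the tallies into proportions; the three-code scheme and the observation sequence are hypothetical, invented for illustration.

```python
from collections import Counter

# Hypothetical coding scheme: at each 30-second interval the observer
# records one code: "ON" (on-task), "OFF" (off-task), "INT" (peer interaction).
observations = ["ON", "ON", "INT", "OFF", "ON", "INT", "ON", "ON", "OFF", "ON"]

counts = Counter(observations)
total = len(observations)

# Convert raw tallies into proportions of observed intervals.
for code in ("ON", "OFF", "INT"):
    print(f"{code:>3}: {counts[code]:2d} intervals ({counts[code] / total:.0%})")
```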

5. Interviews

I’ve mentioned the use of interviews several times, and there are several different approaches to interviewing. In general, interviews can be classified as one of three types:

a. Structured

Structured interviews use exactly the same set of questions and exactly the same set of probes (probes are more specific questions asked after a more general question) for all participants. No divergence from the interview script is allowed. Structured interviews are excellent when a high level of standardization is required. They are less appropriate in circumstances where the interviewer is uncertain about what types of information will emerge during the interviews, because they do not allow for divergence or tangents.

b. Semi-structured

Semi-structured interviews are the most popular type of interview used in evaluation research. These interviews have an overall structure, or a set of questions that will be asked of each interviewee, but then allow flexibility to add questions relevant to particular interviewees. The interview script is therefore allowed to wander a bit based on the interviewee’s responses, and oftentimes each interviewee will respond both to a core set of questions asked of all interviewees and to a set of questions designed specifically for them.

c. Informal

Informal interviews are most often utilized in ethnographies and case studies. The interviewer may have a topic in mind, or even an introductory question, but in general the interviewee guides the interview process.

6. Focus Groups

Focus groups are an excellent way to study both how people perceive issues of interest and how they interact with one another regarding those issues. Oftentimes, when people interact on a topic, new understandings become available that no single individual would likely have reached alone. One can think of focus groups as the evaluation researcher’s way of studying the intersection of personal knowledge and social interaction. In their structure, focus groups can follow any of the formats interviews follow. They are most often semi-structured, and are led by a moderator who both helps to elicit a great depth of information and helps to facilitate the social interaction between participants.


Participants are often also asked to fill out surveys, questionnaires, or sets of open-ended questions. This is done because there are differences between what people will say in public, in front of others, and what they will write privately and anonymously.