
CRJS 4466 PROGRAM & POLICY EVALUATION

LECTURE #4

• Test #1 results
• Evaluation projects
• Questions?

Measurement in Program Evaluation:

• Test (measurement) theory: an observed score on a measure is made up of a true score plus error

observed score on measure = true score + error
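Written out as a math block (the variance decomposition and the reliability ratio are standard classical test theory background, added here for context rather than taken from the slide):

```latex
% Classical test theory: observed score X = true score T + error E
X = T + E
% Assuming the error is uncorrelated with the true score, the variances add:
\operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E)
% Reliability is the share of observed-score variance that is true-score variance:
\rho_{XX'} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)}
```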

Deductive/Inductive Model (figure):
• Theory: concepts linked by a proposition
• Concepts are operationalized as variables, and the proposition becomes a hypothesis linking them
• Operationalization: each variable is measured by one or more indicators
• Indicators sit at the empirical level

Conceptual Framework (figure): Intended Inputs → Program Components → Intended Outputs → Intended Outcomes, with measures attached to each stage.

An Example

• Let's begin with an example
– The photo radar program in BC is intended to reduce the number of speed-related motor vehicle collisions on BC roadways
• We can model it:

Photo Radar Program → Fewer speed-related motor vehicle collisions

An Example

• If we want to measure the performance of the program, we need to translate the intended outcome into observables

• Our conceptual framework for measurement outlines the process

Table 4-2: Program Logic of the Vancouver Radar Camera Intervention

Measurement process: Construct
– Criteria/issues for measurement: Is the construct clearly stated?
– Our example: Speed-related motor vehicle collisions

Measurement process: Measurement procedures (the actual steps we use to gather the data)
– Criteria/issues for measurement: Are the measurement procedures valid and reliable?
– Our example: Attending police officer's assessment of whether speed was a contributing factor; recorded in an accident report; entered into a database

Figure 4-2: Measuring Constructs in Evaluations

Measuring Mental Constructs

• We ask survey questions and try to control how the questions are asked; the intended survey questions (survey items) are the stimuli
• While we are asking the questions, uncontrolled things happen: interviewer characteristics, setting characteristics, interviewee characteristics, instrument characteristics
• These stimuli act on the person's knowledge, attitudes, and experience to produce responses
• Valid and reliable responses to the survey items are the useful data; responses to the uncontrolled stimuli are noise, producing invalid or unreliable data
• The challenge is to separate useful data from noise (a simulation sketch follows)
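To make the useful-data-versus-noise idea concrete, here is a minimal, hypothetical simulation (not from the lecture): each recorded answer is modelled as the respondent's true attitude plus a systematic interviewer effect (bias) plus random noise. All names and numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500  # hypothetical respondents

true_attitude = rng.normal(loc=3.0, scale=0.8, size=n)  # the construct we want to measure
interviewer_effect = 0.5                                # systematic distortion (a validity problem)
random_noise = rng.normal(loc=0.0, scale=0.6, size=n)   # unsystematic distortion (a reliability problem)

observed = true_attitude + interviewer_effect + random_noise

# Systematic bias shifts the mean away from the true value ...
print("true mean:", round(true_attitude.mean(), 2), "observed mean:", round(observed.mean(), 2))

# ... while random noise weakens the link between observed and true scores.
print("correlation(observed, true):", round(float(np.corrcoef(observed, true_attitude)[0, 1]), 2))
```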

Validity and Reliability of Measures

• Validity: does the variable actually measure the corresponding construct?

• In our example of the photo radar program, do we believe that police officers can actually tell whether speed was a contributing factor in a motor vehicle accident?

• Reliability: if we repeat the measurement process for a construct in a given situation, do we get the same result?

• In a given accident situation, would independent observers reach the same conclusions about speed being a contributing factor?

Types of Validity

• There are different ways of assessing validity – several are relevant here

• Face validity: do we judge the measurement process/variable to validly represent the construct?

• Content validity: would experts in the field say that the measure captures the meaning of the construct?

• Concurrent validity: does the measure correlate with another measure that is valid?
– Measuring crime levels (police reports and victim surveys); a sketch follows below
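A rough sketch of the crime-levels example (hypothetical counts, not real data): concurrent validity can be checked by correlating the two measures of the same construct.

```python
import numpy as np

# Hypothetical yearly counts for ten jurisdictions: police-recorded incidents
# and victimization-survey estimates of the same construct (crime level).
police_reports = np.array([410, 520, 300, 610, 450, 380, 700, 290, 530, 480])
victim_survey  = np.array([560, 690, 420, 800, 600, 500, 910, 400, 700, 640])

# A strong positive correlation is evidence of concurrent validity:
# the measure moves with another measure already treated as valid.
r = np.corrcoef(police_reports, victim_survey)[0, 1]
print(f"concurrent validity (Pearson r): {r:.2f}")
```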

Types of Reliability

• We can also assess reliability in different ways
• Having two or more independent observers take measurements in a given situation
– Two police officers completing accident report forms (an agreement sketch follows this list)
• Having the same observer repeat the measurement process in a given situation
– Police officer repeats the assessment of possible contributing causes of the accident
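Inter-observer reliability for the two-officers example could be summarized with percent agreement and Cohen's kappa. This is a minimal sketch with invented judgments; the lecture does not prescribe a particular statistic.

```python
from collections import Counter

# Hypothetical judgments by two officers on whether speed was a contributing
# factor in the same ten accidents (1 = speed was a factor, 0 = it was not).
officer_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
officer_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

n = len(officer_a)
observed_agreement = sum(a == b for a, b in zip(officer_a, officer_b)) / n

# Chance agreement: probability both say "factor" plus probability both say "not a factor".
count_a, count_b = Counter(officer_a), Counter(officer_b)
chance_agreement = sum((count_a[c] / n) * (count_b[c] / n) for c in (0, 1))

kappa = (observed_agreement - chance_agreement) / (1 - chance_agreement)
print(f"percent agreement: {observed_agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```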

Tests for Checking Reliability

• Test-retest method - take the same measurement more than once.

• Split-half method - split the items measuring a social concept (e.g., prejudice) into two halves and compare the half-scores.
• Use established measures.
• Check the reliability of research workers.
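A minimal sketch of the first two checks with invented scores: test-retest reliability as the correlation between two administrations, and split-half reliability as the correlation between half-scores stepped up with the Spearman-Brown correction (the correction is standard background, not something stated on the slide).

```python
import numpy as np

# --- Test-retest: the same eight respondents measured twice (hypothetical scores) ---
time_1 = np.array([12, 15, 9, 20, 14, 17, 11, 18])
time_2 = np.array([13, 14, 10, 19, 15, 16, 12, 17])
test_retest_r = np.corrcoef(time_1, time_2)[0, 1]

# --- Split-half: a six-item prejudice scale split into odd- and even-numbered items ---
items = np.array([            # rows = respondents, columns = items (hypothetical 1-5 ratings)
    [4, 3, 4, 5, 3, 4],
    [2, 2, 1, 2, 3, 2],
    [5, 4, 5, 4, 5, 5],
    [3, 3, 2, 3, 3, 2],
    [1, 2, 1, 1, 2, 1],
    [4, 5, 4, 4, 4, 5],
])
half_a = items[:, ::2].sum(axis=1)   # items 1, 3, 5
half_b = items[:, 1::2].sum(axis=1)  # items 2, 4, 6
r_half = np.corrcoef(half_a, half_b)[0, 1]

# Spearman-Brown correction: estimate the reliability of the full-length scale.
split_half_reliability = 2 * r_half / (1 + r_half)

print(f"test-retest r: {test_retest_r:.2f}")
print(f"split-half reliability (Spearman-Brown): {split_half_reliability:.2f}")
```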

Characteristics of Variables

• Variables can categorize (nominal variables)
– Categories must be mutually exclusive and jointly exhaustive
– In a job training program, clients could be categorized as being on social assistance or not
• Variables can rank (ordinal variables)
– Categories are ranked from less to more
– In a job training program, clients could be asked to rate the program: not beneficial, somewhat beneficial, very beneficial
• Variables can count (interval and ratio variables)
– There is a unit of measurement
– Number of weeks of job training (a coding sketch follows this list)
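A small sketch of how these three job-training variables might be coded. pandas is used here for illustration; the column names are invented, not taken from the lecture.

```python
import pandas as pd

# Hypothetical client records from a job training program.
clients = pd.DataFrame({
    # Nominal: mutually exclusive, jointly exhaustive categories with no order.
    "on_social_assistance": ["yes", "no", "no", "yes", "no"],
    # Ordinal: categories ranked from less to more beneficial.
    "program_rating": ["somewhat beneficial", "very beneficial", "not beneficial",
                       "very beneficial", "somewhat beneficial"],
    # Ratio: a count with a unit of measurement (weeks).
    "weeks_of_training": [8, 12, 4, 16, 10],
})

# Declaring the order of the ordinal categories allows ranking without assuming equal spacing.
rating_order = ["not beneficial", "somewhat beneficial", "very beneficial"]
clients["program_rating"] = pd.Categorical(clients["program_rating"],
                                           categories=rating_order, ordered=True)

print(clients["on_social_assistance"].value_counts())                         # counting is fine for nominal data
print(clients["program_rating"].min(), "-", clients["program_rating"].max())  # ranking is fine for ordinal data
print(clients["weeks_of_training"].mean())                                    # a mean needs an interval/ratio unit
```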

Likert Item and Response Categories

Improved pre-harvest planning, quicker reforestation, and better planting maintenance would reduce the need for chemical or mechanical treatments.

Strongly Agree (1)   Agree (2)   Neither (3)   Disagree (4)   Strongly Disagree (5)

Please circle the appropriate response
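If responses to an item like this were entered into a dataset, the circled categories could be mapped to their numeric codes. This is a hypothetical coding sketch; the responses below are invented.

```python
# Hypothetical responses to the Likert item above, recorded as the circled labels.
responses = ["Agree", "Strongly Agree", "Neither", "Agree", "Disagree", "Agree"]

# Map each response category to its numeric code, matching the 1-5 codes under the item
# (1 = Strongly Agree ... 5 = Strongly Disagree).
codes = {"Strongly Agree": 1, "Agree": 2, "Neither": 3, "Disagree": 4, "Strongly Disagree": 5}
scores = [codes[r] for r in responses]

print("scores:", scores)
print("mean response code:", sum(scores) / len(scores))  # lower values mean more agreement under this coding
```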

Example Questions

Question 8: Do you think that your police services would improve if your police department and all other police departments (emphasis in the original) in the West Shore area combined into one department?
_____ Yes _____ No _____ Undecided

Question 9: Have you discussed this question of police consolidation with friends or neighbors?
_____ Yes _____ No _____ Undecided

Question 10: Are you for or against combining your police department with police departments in surrounding municipalities?
_____ Yes _____ No _____ Undecided

Examples of Validity and Reliability Issues Applicable to Surveys

Source of the problem: Interviewer
– Validity (bias): race, gender, appearance, interjections, reactions to responses
– Reliability (random error): inconsistency in the way questions are worded/spoken

Source of the problem: Respondent
– Validity (bias): old age, handicaps, suspicion
– Reliability (random error): wandering attention

Source of the problem: Instrument
– Validity (bias): biased questions, response set, question order
– Reliability (random error): single measures used to measure client perceptions of the program

Source of the problem: Interviewing situation/interviewing method
– Validity (bias): privacy, confidentiality, anonymity
– Reliability (random error): noise, interruptions

Source of the problem: Data processing
– Validity (bias): biased coding, biased categories (particularly for qualitative data)
– Reliability (random error): coding errors, intercoder reliability problems

Four Levels of Measurement

1. Nominal - offer names or labels for characteristics (gender, birthplace).

2. Ordinal - variables with attributes we can logically rank and order.

Four Levels of Measurement

3. Interval - the distances separating attributes are meaningful, but there is no true zero point (temperature scale).

4. Ratio - attributes composing a variable are based on a true zero point (age).