The Criterion Choices - Dr. Steve - Training & Development (INP6325)
TRANSCRIPT
Why Evaluate Training?
What can be gained from evaluating training?
1. Determine effectiveness of program
2. Demonstrate benefits of training to top management and stakeholders
3. Demonstrate job-relatedness of training (legal implications)
4. Research value in aiding future training development
5. Ability to make personnel decisions (promotion, retention, etc.)
Why not evaluate training?
In a survey of 611 organizations, 92% claimed to evaluate their training programs; however, the vast majority of these evaluations measured only trainee reactions, rather than learning or transfer
What’s preventing them?
1. Evaluation not often emphasized by management
2. Training directors do not know how
3. HR may not understand importance
4. View that evaluation is expensive and risky
Determining Criteria
To determine criteria, must know purpose of evaluation:
To predict job success, must evaluate training for relationship between training performance and job performance
Must determine whether one training program is better than another or no formal training at all.
Pessimistic view of criteria selection
Guion (1961)
1. I/O psychologist has a hunch about a problem
2. Read a vague, ambiguous job description
3. Form a fuzzy concept of the ultimate criterion
4. Develop a set of measures that can be combined to approximate ultimate criterion
5. Judge relevance of measure: deficiency & contamination
6. Data required for the measure are not available in company personnel files (and never will be)
7. Select best available criterion
Evaluating training effectiveness
To assess effectiveness of training, must:
1. Develop criteria that assess trainees’ learning of the KSAs necessary for the job
2. Assess performance at the end of training (progress)
3. Assess performance after period of time on job (transfer)
Criterion Selection
[Figure: overlap between Needs Assessment and Criterion, with regions labeled A-D]
A – KSAs not in Needs Assessment or Criterion
B – Criterion Deficiency - KSAs in Needs Assessment, but not Criterion
C – Criterion Contamination (error + bias) - KSAs in Criterion, but not Needs Assessment
D – Criterion Relevance - KSAs in both Needs Assessment & Criterion
Ultimate Criterion – better time management
Actual Criterion:
- Files reports on time
- Avoids overtime
- Meets deadlines
Criterion Deficiency
Criterion Deficiency – training program intended to teach certain KSAs required for the job, but criteria used to evaluate training are missing KSAs
Example: Postal Clerk - mail sorting skill not part of the training
[Figure labels: Sort mail; Weigh packages; Knowledge of pricing]
Criterion Contamination
Criterion Contamination – extraneous variables included in the criteria that were not part of the training program
Opportunity bias – some individuals might have a greater opportunity for successful job performance, which had nothing to do with training
[Figure labels: Geographic location; Salesmanship; KSAs]
Example: Salesperson - performance affected by location (opportunity) as much as by training
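The deficiency, contamination, and relevance regions can be pictured as set operations on two lists of KSAs. The following is a minimal sketch, not part of the original slides: the function name `classify_ksas` and the KSA labels (including the contaminating "branch foot traffic" item) are hypothetical, loosely modeled on the postal clerk example.

```python
def classify_ksas(needs_assessment, criterion):
    """Split KSAs into the relevance, deficiency, and contamination regions."""
    needs = set(needs_assessment)
    measured = set(criterion)
    return {
        "relevance": needs & measured,      # in both needs assessment and criterion
        "deficiency": needs - measured,     # needed for the job, but not measured
        "contamination": measured - needs,  # measured, but not a needed KSA
    }


# Hypothetical KSA lists, for illustration only.
needs = ["sort mail", "weigh packages", "knowledge of pricing"]
criterion = ["weigh packages", "knowledge of pricing", "branch foot traffic"]

regions = classify_ksas(needs, criterion)
for name, ksas in regions.items():
    print(name, sorted(ksas))
```

In this made-up case the criterion is deficient on mail sorting and contaminated by foot traffic, which reflects opportunity rather than anything taught in training.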
Criterion Reliability
Criterion Reliability – consistency of criterion measure.
Example: inter-rater reliability of supervisory ratings
Negatively impacted by:
- Competence of judges
- Simplicity of behaviors
- Overtness of behaviors
- Operational definition of behaviors
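One common way to index inter-rater reliability is the correlation between two raters' scores for the same trainees. The sketch below assumes that approach (other indices, such as intraclass correlations, also exist); the function name `pearson_r` and the ratings are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 ratings")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("ratings must vary for a correlation to be defined")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 1-5 ratings of six trainees by two supervisors.
rater_1 = [4, 3, 5, 2, 4, 1]
rater_2 = [5, 3, 4, 2, 4, 2]
print(round(pearson_r(rater_1, rater_2), 2))
```

Values near 1 suggest the supervisors rank trainees consistently; low values suggest the criterion measure itself is unreliable.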
Two Views of Criterion Development
Composite Criteria
Mastery of needs assessment
A + B + C = X
Grade: A if X > 90%
Problem: not very diagnostic. May not know where one was successful or failed

Multiple Criteria
Mastery of needs assessment
A: Skill at writing task statements
B: Knowledge of task analysis techniques
C: Skill at presenting results
Problem: may meet some criteria, but not others. What constitutes success?
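The contrast between the two views can be sketched in a few lines. This is an illustration under stated assumptions: each component (A, B, C) is scored out of the same maximum, the composite X is the percent of total points, and the 80% per-criterion cutoff is hypothetical (only the 90% composite cutoff comes from the example above).

```python
def composite_percent(scores, max_points):
    """Composite view: combine all components into one percentage X."""
    return 100.0 * sum(scores.values()) / (max_points * len(scores))


def earns_a(scores, max_points, cutoff=90.0):
    """Grade is A if the composite X exceeds the cutoff."""
    return composite_percent(scores, max_points) > cutoff


def multiple_criteria(scores, max_points, cutoff=80.0):
    """Multiple view: judge each component separately (hypothetical cutoff)."""
    return {name: 100.0 * pts / max_points >= cutoff for name, pts in scores.items()}


# Hypothetical trainee scores, each out of 100 points.
scores = {"A": 100, "B": 100, "C": 75}

print(round(composite_percent(scores, 100), 1), earns_a(scores, 100))
print(multiple_criteria(scores, 100))
```

Here the composite earns an A while hiding the weak score on C, and the multiple-criteria view exposes C but leaves open whether two out of three counts as success.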
Proximal vs. Distal Criteria
Proximal – short-term criteria
Distal – long-term criteria
Example: Political training
Proximal: performance during campaign
Distal: performance in office
USA TODAY/CNN/Gallup Poll of National Adults
Question: Do you approve or disapprove of the way George W. Bush is handling his job as president?

Date         Approve (%)   Disapprove (%)   No Opinion (%)
9/22/2001    90            8                2
9/22/2002    66            30               4
9/22/2004    48            49               2
9/22/2006    42            55               3
Levels of Criteria
1. Reaction – opinion of trainees (survey)
- Affective reactions
- Utility judgments
2. Learning – mastery of training material (test)
- Immediate knowledge
- Knowledge retention
- Behavior/Skill demonstration
3. Behavioral – trainee job performance (ratings)
- Transfer
4. Results – org profit by training ($$$$)
[Figure labels: Proximal; Distal]
Reaction Criteria
Guidelines for Reaction Criteria Development
a. Questions based on information from needs assessment
b. Q’naire includes quantifiable data (do not use ONLY open-ended questions)
c. Q’naire should be anonymous
d. Should include SOME open-ended questions
e. Pilot test Q’naire for length and comprehension
Benefit: provides info from all trainees, not just those with extreme opinions
Disadvantage: may have nothing to do w/ eventual performance, but if training is perceived as poor it is less likely to be taken seriously or skills retained
Reaction Criteria Example
Scale: 1 = Strongly Disagree; 2 = Disagree; 3 = Neither Agree nor Disagree; 4 = Agree; 5 = Strongly Agree
1. The objectives of this program were clear
1 2 3 4 5
2. The instructor was helpful and contributed to the learning experience
1 2 3 4 5
3. There was an appropriate balance between lecture, participant involvement, and exercises in the program
1 2 3 4 5
4. The topics covered in this program were relevant to the things I do on my job
1 2 3 4 5
5. I can see myself performing more effectively after attending this program
1 2 3 4 5
6. The logistics for this program (e.g., arrangements, food, equipment) were satisfactory
1 2 3 4 5
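Quantifiable reaction data like the six items above can be summarized as a mean rating per item. A minimal sketch, assuming each trainee's anonymous responses arrive as a list of six 1-5 ratings; the function name `item_means` and the response data are hypothetical.

```python
from statistics import mean


def item_means(responses):
    """Mean 1-5 rating per questionnaire item across all trainees."""
    if not responses:
        raise ValueError("no responses to summarize")
    n_items = len(responses[0])
    for trainee in responses:
        if len(trainee) != n_items:
            raise ValueError("every trainee must answer every item")
        if any(rating not in (1, 2, 3, 4, 5) for rating in trainee):
            raise ValueError("ratings must be integers from 1 to 5")
    return [mean(trainee[i] for trainee in responses) for i in range(n_items)]


# Hypothetical responses from three trainees to the six items.
responses = [
    [5, 4, 4, 3, 4, 2],
    [4, 4, 3, 3, 5, 3],
    [3, 5, 5, 3, 3, 4],
]

for number, avg in enumerate(item_means(responses), start=1):
    print(f"Item {number}: {avg:.2f}")
```

Reporting per-item means rather than one overall score keeps the summary diagnostic, for instance separating job relevance (item 4) from logistics (item 6).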
Learning Criteria
Test at end of training addressing material covered during training
Example: Can receptionist trainee recall the steps to transfer a phone call on the company’s phone system?
Pre-test/Post-test comparison
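A pre-test/post-test comparison can be summarized as the average gain per trainee. The sketch below uses hypothetical test scores and a hypothetical function name, `mean_gain`; a gain on its own does not show that training caused the improvement unless it is compared against an untrained group.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average post-test minus pre-test score, paired by trainee."""
    if len(pre) != len(post) or not pre:
        raise ValueError("need one pre and one post score per trainee")
    return mean(after - before for before, after in zip(pre, post))


# Hypothetical percent-correct scores for four trainees.
pre_scores = [40, 55, 60, 35]
post_scores = [70, 75, 80, 65]

print(mean_gain(pre_scores, post_scores))
```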
Behavior Criteria
Criteria should come directly from Task and KSA analyses.
Use experimental methods to demonstrate improvements due to training.
Assess whether performance goals are met.
Example: Bus Driver performance:
- Stops and restarts without rolling back
- Tests brakes at tops of hills
- Uses mirrors to check traffic
- Signals following traffic
- Stops before crossing sidewalk when coming out of driveway
- Stops clear of pedestrian crosswalks
Results Criteria
Measure of training program in terms of meeting organizational goals
Money saved = lower turnover, lower absenteeism, improved morale, improved productivity, etc.
Utility – Whether training saves more money than it costs
Ex: if no formal training in place, senior workers lose time showing junior workers what to do.
Utility Analysis
U = (T x N x dt x SDy) – (N x C)
    (Value) – (Cost)
U = $ value of training program
T = # years duration of training effect on performance
N = # trainees
dt = effect size or true difference in pre-post performance = (Xc – Xe) / ryy
SDy = std deviation of performance in $ of untrained group
C = cost per trainee
Example: U = (2 yrs x 100 x .5 x $5,000) – (100 x $200) = $500,000 – $20,000 = $480,000 over 2 yrs
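The utility formula is simple enough to check by hand, but a small function makes it easy to try other inputs. This is a sketch of the formula as given above; the function and parameter names are chosen here for illustration.

```python
def training_utility(years, n_trainees, d_t, sd_y, cost_per_trainee):
    """U = (T x N x dt x SDy) - (N x C), in dollars."""
    value = years * n_trainees * d_t * sd_y  # dollar value of the performance gain
    cost = n_trainees * cost_per_trainee     # total cost of training
    return value - cost


# Inputs from the worked example: 2 years, 100 trainees, dt = .5,
# SDy = $5,000, and $200 per trainee.
u = training_utility(2, 100, 0.5, 5000, 200)
print(u)  # 480000.0
```

With the same inputs but an effect that lasts only one year, the value term halves to $250,000 while the cost stays at $20,000, so the estimate is quite sensitive to the assumed duration T.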
Utility Analysis
Reasons for NOT using utility analysis
- Data may not be readily available or may be unreliable
- Seeking non-monetary benefits of training
- Other variables confound results
Choosing Criteria
Conclusions:
- Reaction information is important for knowing whether trainees will accept the training program, but it does not translate into effectiveness in terms of learning, transfer, or monetary savings
- Learning criteria are a good predictor of results
- Learning criteria are a modest predictor of transfer
- Behavior criteria are a modest predictor of results
Other Criteria Concerns
Can’t judge effectiveness of training strictly on outcome (summative); must also look at process (formative)
- Process measures tell us source of outcome changes
- Outcome alone is not diagnostic
- Important variables affecting process include: differences between trainers, settings, student samples, motivation of groups, etc.
Subjective vs. Objective Measures
Subjective – ratings, opinions
- Problem of rater biases (halo, central tendency, leniency)
- Easy to use
- Was rater well-trained to make ratings?
Subjective ratings may be improved by training the rater
Objective – countable, observable measures such as production, absences, defects missed, etc.
- Problem of opportunity bias
Objective measures should be used when possible, but must be aware of potential contamination and account for it