
The variability of assessment centre validities: Subject to purpose?

Kim Dowdeswell, Senior Research Consultant & Industrial Psychologist

30th Assessment Centre Study Group Conference, 18th March 2010

©2010 SHL Group Limited

Presentation Outline: Signposting the discussion

• Introduction

• Assessment centre validation research over the years

• Assessment vs. development centre usage trends

• Comparing validities for overall assessment ratings vs. dimension ratings

• Potential differences in validities between assessment centre purposes and approach

• Questions?

Introduction: On assessment centres

• Assessment centre methods incorporate three features:
– The use of multiple assessment techniques
– Standardised methods of making inferences from such techniques
– Pooled judgements of multiple assessors in rating each candidate’s behaviour

• ACs are used for three major purposes (International Task Force on Assessment Center Guidelines, 2009):
– To predict future behaviour for decision-making
– To diagnose development needs
– To develop candidates on dimensions of interest

AC validation research over the years

An overview of meta-analytic research evidence

AC Validation Research: The ‘gold standard’

Gaugler et al. (1987):

• Corrected mean validity coefficient of 0.37
– Meta-analysis of 50 assessment centre studies
– Relation between overall assessment ratings and various criteria
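For orientation, a ‘corrected mean validity coefficient’ is a meta-analytic estimate. A minimal sketch of the basic validity-generalisation arithmetic, in its simplest form correcting for criterion unreliability only (the published meta-analyses apply fuller artifact corrections):

\[
\bar{r} \;=\; \frac{\sum_i N_i\, r_i}{\sum_i N_i},
\qquad
\hat{\rho} \;=\; \frac{\bar{r}}{\sqrt{\overline{r_{yy}}}}
\]

where \(N_i\) and \(r_i\) are the sample size and observed validity of study \(i\), and \(\overline{r_{yy}}\) is the mean reliability of the criterion measure.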

The Validity Ladder (Schmidt & Hunter, 1998)
Criterion: Overall Job Performance

+1.00   PERFECT PREDICTION
 .63    Ability and Structured Interview
 .60    Ability and Work Sample
 .54    Work Sample Tests
 .51    Structured Interviews
 .51    Ability Tests
 .40    Personality Tests
 .37    Assessment Centres
 .35    Biodata
 .26    References
 .18    Years Job Experience
 .02    Graphology
 .00    RANDOM PREDICTION
-.01    Age

AC Validation Research: The ‘gold standard’ (continued)

Gaugler et al. (1987):

• Differences observed in validity coefficients in terms of the criterion used and the AC’s purpose:

By criterion:
Job performance        0.36
Potential ratings      0.53
Dimension ratings      0.33
Training performance   0.35
Career advancement     0.36

By AC purpose:
Promotion              0.30
Early identification   0.46
Selection              0.41
Research               0.48

AC Validation Research: Over the years

• Validity coefficients of assessment centres seem to be dropping; the 1987 -> 2007 comparison shows a statistically significant drop:

Study             Year   Validity coefficient   95% CI band
Gaugler et al.    1987   0.37                   0.30 ≤ ρ ≤ 0.42
Hermelin et al.   2007   0.28                   0.24 ≤ ρ ≤ 0.32
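To make the ‘statistically significant drop’ concrete, here is a rough back-of-the-envelope check one could run against the reported point estimates and CI bands. A sketch under simplifying assumptions only: it treats the two intervals as symmetric, normal, and independent, which corrected-validity CIs are not exactly.

```python
# Approximate each meta-analytic estimate as normal, recovering a
# standard error from its reported 95% confidence interval.
import math

def se_from_ci(lower, upper):
    """Approximate SE from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * 1.96)

rho_1987, se_1987 = 0.37, se_from_ci(0.30, 0.42)
rho_2007, se_2007 = 0.28, se_from_ci(0.24, 0.32)

# Two-sample z test for the difference between independent estimates.
z = (rho_1987 - rho_2007) / math.sqrt(se_1987**2 + se_2007**2)
p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p-value
print(f"z = {z:.2f}, p = {p:.3f}")    # roughly z = 2.45, p = 0.014
```

Under those assumptions the difference does clear the conventional 0.05 threshold, consistent with the claim on the slide.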

AC Validation Research: Challenges to conducting AC validation

• Sampling error: AC time and cost = small samples

• Moderate to severe levels of range restriction
– Starting with small samples -> even fewer appointments
» Hermelin et al. (2007) put indirect range restriction forward as an explanation for the lower results observed; with cost considerations, ‘modern’ AC participants are subject to more pre-selection than previously

• Reliability of supervisor ratings of performance / potential
– A notoriously common problem in validation research
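Since range restriction is the headline explanation here, a minimal sketch of the classic direct (Thorndike Case II) correction may help fix ideas. Note that Hermelin et al. (2007) actually invoke indirect range restriction, whose correction is more involved; the function and example values below are illustrative only.

```python
# Thorndike Case II: correct an observed validity r for direct range
# restriction on the predictor. u is the ratio of the restricted
# (incumbent) SD to the unrestricted (applicant) SD, so u < 1 under
# restriction. Direct case only; the indirect case needs more machinery.
import math

def correct_direct_range_restriction(r: float, u: float) -> float:
    """Return the estimated unrestricted correlation (Case II)."""
    return r / math.sqrt(u**2 + r**2 * (1 - u**2))

# Example: an observed validity of .20 in a heavily pre-selected group
# (restricted SD at 60% of the applicant SD) corrects to about .32.
print(correct_direct_range_restriction(0.20, 0.60))
```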

Assessment vs. development centre usage trends

The influence of purpose

AC vs. DC Usage Trends: The influence of purpose

• The use of ACs for development has been increasing over the years

• Popular purposes for utilising ACs in surveyed US organisations (Spychalski et al., 1997):
– Selection (50.0%) / Promotion (45.8%)
– Development planning (39.2%)

• Popular purposes for utilising ACs in surveyed SA organisations (Krause et al., 2010):
– Selection & Development (65%)
– Selection alone (22%)
– Development alone (13%)

AC vs. DC Usage Trends: The influence of purpose: Selection ACs

• Selection ACs are designed to identify the best candidate for a job

• Features in ‘Selection Centres’ (Spychalski et al., 1997):
– Assessors serve many times annually; ample opportunity to keep skills current
– Assessors are almost always asked to compile overall performance ratings for selection purposes
– ‘Selection Centre’ data validated more frequently than in centres used for other purposes

Might validation be more frequent in order to be prepared for possible legal challenges to selection decisions based on AC data?

AC vs. DC Usage Trends: The influence of purpose: Development ACs

• Goals of developmental assessment centres vary:
– Identifying training needs
– Formulating personalised developmental needs & action plans
– Developing skills on the basis of immediate feedback and practice

• Features in ‘Development Planning Centres’ (Spychalski et al., 1997):
– Fewer candidate selection mechanisms used, with heavy reliance on supervisor data
– Assessors conduct lengthy discussion sessions, with other assessors and with candidates in feedback sessions
– ‘Development Planning Centres’ infrequently validated

AC vs. DC Usage Trends: The influence of purpose: Assessor evaluations

• Possible implications of AC focus shifting from selection to development?

• “Assessors may evaluate candidates differently, depending on whether their ratings will serve a selection purpose (i.e., ‘yes/no’ decision) or a developmental purpose (i.e., identification of strengths and weaknesses).”

(Lievens & Klimoski, 2001)

The authors noted (in 2001) that they knew of no assessment centre research manipulating such variables

Comparing validities for overall assessment ratings vs. dimension ratings

Which yield higher validities?

Comparing Validities: Overall assessment vs. dimension ratings

• ‘Selection Centres’ typically use an overall assessment rating (OAR) to inform selection / promotion decisions
– OAR-based validity evidence: e.g. Gaugler et al. (1987); Hermelin et al. (2007)

• ‘Development Centres’ typically use dimension ratings to facilitate detailed feedback with participants about their strengths and weaknesses
– Dimension-based validity evidence: e.g. Arthur et al. (2003)

Comparing Validities: Overall assessment vs. dimension ratings

Arthur et al. (2003)

• Criterion-related validity of AC dimensions compared to OARs:

Dimension                            Validity coefficient
Problem Solving                      0.39
Influencing Others                   0.38
Organizing & Planning                0.37
Communication                        0.33
Drive                                0.31
Consideration/awareness of others    0.25

OAR validity coefficient: 0.37 (Gaugler et al., 1987)

Comparing Validities: Overall assessment vs. dimension ratings

Arthur et al. (2003)

• Criterion-related validity of a regression-based composite of AC dimensions compared to OARs (cumulative multiple R as each dimension is added):

Dimensions entered (cumulative)       R
Problem Solving                       0.39
+ Influencing Others                  0.43
+ Organizing & Planning               0.44
+ Communication                       0.45
+ Drive                               0.45
+ Consideration/awareness of others   0.45

OAR validity coefficient: 0.37 (Gaugler et al., 1987)

Comparing Validities: Overall assessment vs. dimension ratings

Arthur et al. (2003)

• The use of OARs may result in an underestimate of the criterion-related validity of assessment ratings

• The predictive validity of AC composite scores derived from dimension weights can be enhanced if dimension intercorrelations are reduced

• ACs may not need as many dimensions as have typically been used

• AC dimensions can be combined into a single composite score in the same manner as is typically done for multipredictor test batteries (see the sketch below)
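To make the composite-scoring idea concrete, here is a minimal sketch contrasting a unit-weighted overall rating with a regression-weighted composite. All data and weights below are fabricated for illustration; only the scoring logic mirrors the multipredictor-battery approach.

```python
# Contrast a unit-weighted OAR stand-in with a regression-weighted
# composite of dimension ratings. Numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings for 200 candidates on six AC dimensions.
n, k = 200, 6
dimensions = rng.normal(size=(n, k))
# Hypothetical job-performance criterion loosely related to the ratings.
true_weights = np.array([0.5, 0.4, 0.3, 0.2, 0.2, 0.1])
performance = dimensions @ true_weights + rng.normal(scale=1.0, size=n)

# Approach 1: unit-weighted composite (a stand-in for an OAR).
oar = dimensions.mean(axis=1)

# Approach 2: regression-based composite (least-squares weights).
X = np.column_stack([np.ones(n), dimensions])
beta, *_ = np.linalg.lstsq(X, performance, rcond=None)
composite = X @ beta

print("unit-weight validity:", np.corrcoef(oar, performance)[0, 1])
print("regression validity: ", np.corrcoef(composite, performance)[0, 1])
```

One caveat worth keeping in mind: regression weights capitalise on chance in the small samples typical of AC validation, so an apparent gain over unit weights needs cross-validation or shrinkage correction before it can be trusted.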

Potential differences in validities between AC purposes & approach

What can we learn about increasing AC validity?

Differences in Approach & Purpose: What can we learn about increasing validity?

• We have seen validity findings are influenced by:
– The criteria used
» E.g. Dimension ratings (0.33) vs. Potential ratings (0.53)
– The purpose of the AC
» E.g. Promotion (0.30) vs. Early Identification of Potential (0.46)
– The approach used in compiling candidate scores
» E.g. OARs (0.37) vs. individual dimensions (0.25-0.39) vs. a regression-based composite of dimensions (0.45)

• Good research design & methodology is key

Differences in Approach & Purpose: What can we learn about increasing validity?

• The literature is fairly consistent in terms of other characteristics of ‘more highly valid’ ACs:
– Using psychologists as assessors rather than managers
» Concern: Spychalski et al. (1997) found that only 5.7% of cases utilised psychologists as assessors
– Limiting the number of dimensions evaluated
» Concern: Krause et al. (2010) found only 20% of SA organisations evaluated 5 dimensions or fewer in ACs
– Evaluating the ‘right’ dimensions: those critical for success as identified by role analysis
» Further: Arthur et al. (2003) found some dimensions are more valid than others

Differences in Approach & Purpose: A final point on ACs vs. DCs & validities…

• When considering several studies reporting on developmental assessment centres, Lievens & Klimoski (2001) noted as a limitation that the majority of studies did not relate DC ratings to external criteria

• “However, in assessment centers conducted for developmental purposes other constructs might serve as more relevant criteria”

• “When validating or otherwise evaluating DACs, the appropriate criterion is change in participants’ understanding, behavior, and proficiency on targeted dimensions”

(International Task Force on Assessment Center Guidelines, 2009)

• We should not lose sight of the purpose of the Assessment Centre (Howard, 2009):
– Selection/Promotion:
» Help find the right person for the job
– Diagnosis/Development:
» Better choice of development activities?
– Succession/Placement:
» Help find the right job for the person?

Questions…


References

Arthur Jr, W., Day, E.A., McNelly, T.L. & Edens, P.S. (2003). A meta-analysis of the criterion-related validity of assessment center dimensions. Personnel Psychology, 56, 125-154.

Gaugler, B.B., Rosenthal, D.B., Thornton, G.C. & Bentson, C. (1987). Meta-analysis of assessment center validity. Journal of Applied Psychology, 72, 493-511.

Hermelin, E., Lievens, F. & Robertson, I.T. (2007). The validity of assessment centres for the prediction of supervisory performance ratings: A meta-analysis. International Journal of Selection & Assessment, 15(4), 405-411.

Howard, A. (2009). Making assessment centers work the way they are supposed to. Keynote address at the 29th Assessment Centre Study Group Conference, Stellenbosch, South Africa, March 2009.

International Task Force on Assessment Center Guidelines. (2009). Guidelines and ethical considerations for assessment center operations. International Journal of Selection & Assessment, 17(3), 243-253.


Krause et al. (2010). State of the art assessment centre practices in South Africa: Survey results, challenges, and suggestions for improvement. Keynote address at the 30th Assessment Centre Study Group Conference, Stellenbosch, South Africa, 18-19 March 2010.

Lievens, F. & Conway, J.M. (2001). Dimension and exercise variance in assessment center scores: A large-scale evaluation of multitrait-multimethod studies. Journal of Applied Psychology, 86(6), 1202-1222.

Lievens, F. & Klimoski, R.J. (2001). Understanding the assessment center process: Where are we now? In C.L. Cooper & I.T. Robertson (Eds.), International Review of Industrial and Organizational Psychology, vol. 16 (pp. 245-286). Chichester: John Wiley & Sons, Ltd.
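Schmidt, F.L. & Hunter, J.E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262-274.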

Spychalski, A.C., Quinones, M.A., Gaugler, B.B. & Pohley, K. (1997). A survey of assessment center practices in organizations in the United States. Personnel Psychology, 50, 71-90.