
Relationship of Individual Competency and Overall Global Ratings in Practice Readiness OSCEs

Saad Chahine, Mount Saint Vincent University

[email protected]

Bruce Holmes, Division of Medical Education, Dalhousie University

[email protected]


Purpose of CAPP OSCE

• Clinical Assessment for Practice Program

• A program of the College of Physicians and Surgeons of Nova Scotia (CPSNS)

1. To assess the clinical competence of IMG candidates for readiness to practice.

2. To provide feedback on candidates' performance for their continuing professional development.

The CAPP Program

Part A: Initial assessment
• Assessment of competence via OSCE & therapeutics exam (Practice Ready)

Part B: 1-year mentorship with a family physician
• Defined license for 1 year with performance assessment

Part C: Additional 3 years of defined license until certified by The College of Family Physicians (CCFP)

Overarching Research Question:

To understand rater cognition in the assessment of candidates in the OSCE: What goes on in the minds of examiners when they assess candidates in the OSCE?

As part of this research agenda, this study answers the question:

Which competencies are most predictive of a satisfactory overall global rating at each station?

Assessed Competencies

• History Taking
• Physical Exam (in half of OSCE stations)
• Communication Skills
• Quality of Spoken English
• Counselling (in half of OSCE stations)
• Professional Behaviour
• Problem Definition & Diagnosis
• Investigation & Management
• Overall Global

Example: History Taking

Overall Global Rating

Data Set

2010 - 14 Station OSCE - 31 Candidates - 434 Observations (stations x candidates)

2011 - 12 Station OSCE - 36 Candidates - 432 Observations (stations x candidates)

2012 - 12 Station OSCE - 36 Candidates - 432 Observations (stations x candidates)

OSCE stations:

14 minutes spent at each station

examiner questions at 10 minutes

3 minutes between candidates

Design

• Goal: What constitutes a pass/fail in an examiner’s mind?

• Overall Global rating was recoded as pass/fail

– Fail (0) = Inferior, Poor or Borderline
– Pass (1) = Satisfactory, Very Good or Excellent

• Competencies were rated on a 1-6 scale

– 1 = Inferior
– 6 = Excellent
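
The recoding above can be sketched in Python. This is a minimal illustration, not the authors' code. The slide only anchors 1 = Inferior and 6 = Excellent, so the intermediate codes (2 = Poor through 5 = Very Good) are an assumption based on the order in which the labels are listed, and the name `recode_global` is hypothetical.

```python
# Assumed label order for the 1-6 scale. Only the endpoints (1 and 6) are
# stated on the slide; codes 2-5 follow the order the labels are listed in.
SCALE = {
    1: "Inferior",
    2: "Poor",
    3: "Borderline",
    4: "Satisfactory",
    5: "Very Good",
    6: "Excellent",
}

# Labels that the slide groups under Fail (0).
FAIL_LABELS = {"Inferior", "Poor", "Borderline"}


def recode_global(rating: int) -> int:
    """Return 0 (fail) or 1 (pass) for a 1-6 overall global rating."""
    label = SCALE[rating]  # raises KeyError for values outside 1-6
    return 0 if label in FAIL_LABELS else 1


ratings = [1, 3, 4, 6]
recoded = [recode_global(r) for r in ratings]
print(recoded)
```

Under the assumed coding, Borderline (3) is the highest failing rating and Satisfactory (4) is the lowest passing one.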

Descriptive Analysis

Year  Number of Observations  Overall Global Pass  Overall Global Fail
2010  434 (2 missing)         86 (20%)             346 (80%)
2011  432 (4 missing)         93 (22%)             335 (78%)
2012  432 (10 missing)        114 (26%)            308 (74%)

Year  Number of Observations  Investigation Management Pass  Investigation Management Fail
2010  434 (3 missing)         317 (74%)                      114 (26%)
2011  432 (0 missing)         272 (63%)                      160 (37%)
2012  432 (1 missing)         271 (63%)                      160 (37%)

*Note: INVMAN was recoded as 0/1
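
As a quick arithmetic check, the Investigation Management percentages can be recomputed from the counts. The sketch below assumes the percentages are taken over non-missing observations (the slide does not say so explicitly); the function name `pass_fail_percent` is hypothetical.

```python
# Counts copied from the Investigation Management table:
# year -> (observations, missing, pass count, fail count)
invman = {
    2010: (434, 3, 317, 114),
    2011: (432, 0, 272, 160),
    2012: (432, 1, 271, 160),
}


def pass_fail_percent(observations, missing, n_pass, n_fail):
    """Return (pass %, fail %) rounded to whole numbers, over valid cases."""
    valid = observations - missing
    if n_pass + n_fail != valid:
        raise ValueError("pass + fail must equal the non-missing observations")
    return round(100 * n_pass / valid), round(100 * n_fail / valid)


percentages = {year: pass_fail_percent(*row) for year, row in invman.items()}
for year, (p, f) in percentages.items():
    print(year, f"pass {p}%", f"fail {f}%")
```

With these counts, pass and fail sum to the non-missing observations in every year.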

Multivariate Analysis

• Hierarchical Generalized Linear Model (HGLM: a logistic regression for nested data)

• Nested structure to the data: Candidates are nested within Year and Stations are nested within Candidates

• High consistency across examiners (previous study)

• The analysis was conducted in steps to find the best model

• The goal is to determine which competencies are most predictive of a pass/fail at OSCE stations

Model

Station Level

Prob (OverGlob=1|B)=P

log[P/(1-P)]= P0+P1*(HIST)+P2*(PHYS)+P3*(BEHAV)+P4*(QSE)+P5*(COMM)+P6*(COUNS)+P7*(PDD)+P8*(INVMAN)

Candidate Level

P0=B00+ B01*(TRACK)+R0

P1=B10

P2=B20

P3=B30

P4=B40

P5=B50

P6=B60

P7=B70

P8=B80

Year Level

B00=G000+U00

B01=G010

B10=G100

B20=G200

B30=G300

B40=G400

B50=G500

B60=G600

B70=G700

B80=G800
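
Substituting the candidate-level and year-level equations into the station-level equation gives a single combined form, which may be easier to read. This is a sketch of that substitution, assuming the TRACK slope is fixed across years (B01 = G010, the label used for TRACK in the results); R0 is the candidate-level random effect and U00 the year-level random effect.

```latex
\log\!\left[\frac{P}{1-P}\right]
  = G_{000} + G_{010}\,\mathrm{TRACK}
  + G_{100}\,\mathrm{HIST} + G_{200}\,\mathrm{PHYS} + G_{300}\,\mathrm{BEHAV}
  + G_{400}\,\mathrm{QSE} + G_{500}\,\mathrm{COMM} + G_{600}\,\mathrm{COUNS}
  + G_{700}\,\mathrm{PDD} + G_{800}\,\mathrm{INVMAN}
  + R_{0} + U_{00}
```

Only the intercept varies randomly (by candidate and by year); every competency slope is treated as fixed.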

Results

Fixed Effects

Variable      Estimate  SE    T-ratio  Df    P
G000 Overall  -2.17     0.28  -7.78    2     0.00
G010 TRACK     0.02     0.33   0.07    101   0.94
G100 HIST      0.51     0.13   4.04    1272  0.00
G200 PHYS     -0.02     0.07  -0.27    1272  0.77
G300 BEHAV     0.49     0.16   3.11    1272  0.00
G400 QSE       0.17     0.17   0.99    1272  0.32
G500 COMM      0.64     0.17   3.72    1272  0.00
G600 COUNS     0.06     0.06   0.95    1272  0.34
G700 PDD       0.64     0.11   5.84    1272  0.00
G800 INVMAN    1.36     0.14   9.83    1272  0.00

Results: Best Model

Fixed Effects

Variable  Estimate  SE    T-ratio  Df    P     Odds Ratio
Overall   -2.16     0.18  -12.22   2     0.00  0.12
HIST       0.53     0.12    4.29   1272  0.00  1.71
BEHAV      0.52     0.15    3.43   1272  0.00  1.68
COMM       0.73     0.16    4.78   1272  0.00  2.07
PDD        0.63     0.11    6.02   1272  0.00  1.88
INVMAN     1.37     0.14   10.06   1272  0.00  3.93

*Note: Variation significant at Candidate level, Variation NOT significant at Year level
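
The odds ratios in the best-model table are the exponentiated estimates. The short Python sketch below recomputes them from the two-decimal estimates shown on the slide, so small differences from the reported values (for example HIST) are expected from rounding; the variable names are illustrative.

```python
import math

# Estimates (log-odds) copied from the best-model table.
best_model = {
    "Overall": -2.16,
    "HIST": 0.53,
    "BEHAV": 0.52,
    "COMM": 0.73,
    "PDD": 0.63,
    "INVMAN": 1.37,
}

# An odds ratio is exp(estimate): the multiplicative change in the odds of
# an overall pass for a one-unit increase in that competency rating.
odds_ratios = {name: math.exp(b) for name, b in best_model.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:8s} {ratio:.2f}")
```

Read this way, a one-point higher Investigation Management rating multiplies the odds of an overall pass by roughly four, about twice the effect of any other competency.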

Example of Borderline/Satisfactory at the Station

• If you are rated borderline on all competencies…
– 6% probability that you receive an overall pass at the station

• If you are rated satisfactory on all competencies…
– 81% probability that you receive an overall pass at the station

• If you are rated satisfactory on all competencies but borderline on Investigation Management…
– 52% probability that you receive an overall pass at the station

• If you are rated borderline on all competencies but satisfactory on Investigation Management…
– 21% probability that you receive an overall pass at the station
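
The relationship between these scenarios can be illustrated with the INVMAN estimate alone. The sketch below assumes that borderline and satisfactory are one scale point apart, so that changing only Investigation Management shifts the log-odds by its best-model estimate (1.37). It does not try to rebuild the probabilities from the intercept, because the slides do not state how the predictors were coded or centred. The function name `shift_probability` is hypothetical.

```python
import math

INVMAN_ESTIMATE = 1.37  # best-model estimate for Investigation Management


def shift_probability(p: float, logit_change: float) -> float:
    """Apply a change on the log-odds scale to a probability p."""
    odds = p / (1 - p)
    new_odds = odds * math.exp(logit_change)
    return new_odds / (1 + new_odds)


# Satisfactory on all but INVMAN (52%) -> raise INVMAN by one point.
all_satisfactory = shift_probability(0.52, +INVMAN_ESTIMATE)

# Satisfactory on all (81%) -> lower INVMAN by one point.
invman_borderline = shift_probability(0.81, -INVMAN_ESTIMATE)

# Borderline on all (6%) -> raise INVMAN by one point.
invman_satisfactory = shift_probability(0.06, +INVMAN_ESTIMATE)

print(round(all_satisfactory, 2),
      round(invman_borderline, 2),
      round(invman_satisfactory, 2))
```

Under that assumption the 52% and 81% scenarios map onto each other, and the 6% scenario moves to about 20%, close to the 21% on the slide given that the inputs are rounded.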

Rater Cognition

• Examiners do not weigh each competency equally: Investigation and Management is a key component in determining overall pass/fail on a station.

• There is little variation in ratings for Quality of Spoken English: all candidates do well on this competency, and we keep it in the exam as a check.

• Physical Exam and Counselling are not significant predictors. We suspect this is due to insufficient data (only half of the stations assess these competencies).

• Track (1 vs 2) is not a predictor.

• There is no significant variation from year to year.

Take Home Message

• Examiners intuitively deem some competencies as more important
– Therefore, should they be weighted?

• For practice ready OSCE…
– Consider more emphasis on complex competencies in case development and blueprint

• Results support a qualitative study
– Follow up study to understand how examiners conceptualise competencies through cognitive interviews

Thank You!

www.capprogram.ca

Saad Chahine

[email protected]

Bruce Holmes

[email protected]