
ETHICS/EDUCATION

Work-place based assessments

Kevin Hayes

Abstract

Work-place based assessments (WBAs) are now being used by nearly all Medical Royal Colleges as the principal tools of assessment of ongoing clinical training. Their implementation has been problematic and has presented huge logistical challenges for local, regional and national organizations. There is a large body of evidence to back up the use of WBAs in undergraduate and postgraduate training, though most of the latter is from the USA. The individual tools of WBAs are considered to be educationally valid, and also reliable if enough are performed over time, but feedback and studies in the UK have repeatedly shown that delivery of the tools in the work-place conflicts with the pressures of clinical service delivery. This review examines the purpose and practical considerations of WBAs and the pros and cons of different WBA methods, explores possible problems and solutions in their use, and outlines their possible future direction.

Keywords CBDs; mini-CEX; OSATS; reliability; team observations (TOs); validity; work-place based assessments (WBAs)

Introduction

Despite more than a decade since the advent of Calman-style postgraduate training, formal documented assessment of actual day-to-day working has existed for only the last 2–3 years, in the form of work-place based assessments (WBAs). Prior assessment was informal, locally derived at regional level and of highly variable quality, leading to highly subjective end-of-year assessments (usually Records of In-Training Assessment, RITAs). WBAs are arguably the single biggest change to the assessment of clinical training in the last 20 years and aim to provide more valid, reliable and objective measurement of a trainee's clinical ability. They have been implemented nationally and supported by both the Postgraduate Medical Education and Training Board (PMETB) and the Academy of Medical Royal Colleges (AMRC) in an attempt to standardize and quality assure their use. They are designed to form an integral part of all trainees' portfolios and therefore help to inform the newer Annual Review of Competence Progression (ARCP).

Why WBAs?

With the advent of the European Working Time Directive (EWTD) and structured run-through training, there has been a significant reduction in both the number of years of specialist training and the number of hours worked per year. This reduction in trainees' experience is meant to be offset by more structured, directed training and assessment. Particular concerns have arisen in surgical specialities, including Obstetrics and Gynaecology, where the previous lack of formal assessment was made up for by performing large numbers of procedures over a long period, observed by multiple assessors.

Miller's pyramid of clinical skill acquisition ("knows", "knows how", "shows how", "does") highlights the fact that in clinical medicine there has been an almost total lack of formal assessment of what a trainee actually "does". The OSCEs used for the Part 2 MRCOG only assess "shows how", and it is well recognized that performance in "mock" situations does not necessarily predict "real life" performance. WBAs are therefore designed to test the "does" in terms of real patient clinical assessment and communication, clinical knowledge, procedural/operative skills and professional behaviours.

Kevin Hayes MRCOG is a Senior Lecturer and Consultant in Obstetrics and Gynaecology and Medical Education at St George's University of London, London, UK. Conflicts of interest: none.

OBSTETRICS, GYNAECOLOGY AND REPRODUCTIVE MEDICINE 21:2 52

Validity and reliability

The validity of an assessment measures whether the assessment tool truly tests the domain or domains of interest. As all the tools directly observe daily practice, they have an intrinsic construct validity (e.g. an objective structured assessment of technical skills (OSATS) truly assesses the ability of a trainee to perform a laparoscopy on a real patient in a real theatre, as opposed to in a simulation). Most studies on WBAs report high construct validity, and this has also been reported in Obstetrics and Gynaecology by Bodle et al., in particular in relation to OSATS. High face validity (how much the assessment is "respected" by trainees and trainers) is also a consistent finding.

Reliability is a measure of the "reproducibility" or "generalizability" of an assessment, and it can be psychometrically measured. Reliability will always improve with increased sampling of a trainee, i.e. the more WBAs they do, the more reliable the assessments become. A survey of 200 Obstetrics and Gynaecology trainees and 82 trainers by the RCOG assessment sub-committee in 2008 (unpublished) revealed that at least 50% of all respondents reported difficulty in obtaining "sufficient" numbers of WBAs due to time pressures. Informal feedback suggests that deliverability is still a major issue. While this remains the case, reliability will continue to be highly variable.
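The link between sampling and reliability can be sketched with the Spearman–Brown prophecy formula, a standard psychometric result for predicting the reliability of the average of n observations. This is an illustration only: the single-assessment reliability of 0.30 used below is an assumed value for demonstration, not a figure from the RCOG survey.

```python
def spearman_brown(r_single: float, n: int) -> float:
    """Predicted reliability of the mean of n assessments, given the
    reliability r_single of one assessment (Spearman-Brown formula)."""
    return n * r_single / (1 + (n - 1) * r_single)

# Assumed, illustrative value: a single WBA observation with reliability 0.30.
for n in (1, 4, 8, 12):
    print(f"{n:2d} WBAs -> predicted reliability {spearman_brown(0.30, n):.2f}")
# Rises from 0.30 (one WBA) to about 0.84 (twelve WBAs).
```

Whatever the true single-observation figure, the shape of this curve is why multiple WBAs, ideally with multiple assessors, are needed before the aggregate picture of a trainee can be trusted.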

The purpose of WBAs

The purpose of WBAs is quite simply to provide a formal, contemporaneous, continual feedback mechanism for trainees (and trainers) to improve clinical performance. As a formative assessment, they are not designed to be used as a "pass–fail" or "high stakes" assessment. Improvement will be documented as moving from "working towards competence" to "competent" in individual areas, as well as through the use of more complex cases and procedures over time. No trainee should "fail" his or her end-of-year assessment (ARCP) based on a WBA, as the ARCP is a summative process. WBAs do, however, form an important part of a trainee's portfolio, as they add to the overall "richness" of information about an individual trainee's progress. Sadly, a lack of clarity about their purpose and a rushed implementation have led to an initial degree of distrust and cynicism amongst trainees and trainers alike.

© 2010 Elsevier Ltd. All rights reserved.


Individual assessment tools

Mini-CEX

The mini-CEX assesses real clinical encounters (history taking, examination, communication and management) observed at first hand by an assessor, and provides immediate feedback to the trainee. There is good evidence to support its use for assessing trainees in the USA. Wilkinson et al. evaluated the use of the mini-CEX for the Royal College of Physicians (RCP) and found it to be the tool most prone to assessor variation, potentially necessitating larger numbers to make it reliable.

Case based discussions (CBDs)

CBDs test the clinical application of knowledge and serve to highlight both good knowledge and areas where knowledge may be lacking or in need of further study. Feedback is directed at identifying learning needs; setting a specific and realistic topic as "homework", with further feedback to follow, is considered good practice.

Objective structured assessment of technical skills (OSATS)

OSATS assess technical/operative competence and are the most utilized WBA in Obstetrics and Gynaecology, probably because they are the least time consuming and because trainees enjoy performing surgical procedures. The RCOG has defined core operative procedures to be assessed over time, and has set a minimum number of OSATS per procedure per year. Increasing technical ability is meant to be measured by the assessment of increasingly complex operative procedures over time.

Team observations (TOs)

TOs (a modified form of 360° appraisal) assess many attributes relating to overall professional behaviour, including team-working, diligence, time-keeping and inter-professional communication, as well as clinical performance. They are the form of WBA with which we have most experience in the UK, and there is a wealth of literature to support their use. They are highly informative about a trainee's overall performance and provide extensive feedback, which in itself influences behaviour.

Advantages and disadvantages of WBA tools

Mini-CEX
Construct tested: clinical encounter (history, examination, explanation)
Pros: construct and face validity; broad spectrum of cases; multiple skills tested
Cons: most time consuming; least utilized; least reliable; most prone to case, setting and trainer biases

CBDs
Construct tested: clinical knowledge
Pros: construct and face validity; identifies knowledge gaps
Cons: case specific; prone to case, setting and trainer biases

OSATS
Construct tested: technical/operative skills
Pros: construct and face validity; most time efficient; most utilized; probably reliable; broad spectrum of cases
Cons: emergency cases under-represented; peri-operative management not the focus of assessment; procedure specific

TOs
Construct tested: professional behaviours
Pros: good experience of use in the UK; negative indicators highly predictive
Cons: time consuming; confidentiality and therefore honesty issues; dealing with feedback can be difficult

Table 1

Table 1 summarizes the construct tested and the pros and cons of the individual WBA tools.

Important variables

There are many variables that can alter the outcome of all WBAs. These should all be taken into account by trainees and trainers prior to the assessment, and they also emphasize the importance of using multiple assessors. These variables include: stage of training; level of trainer (consultants are harder judges than more junior staff); sex of trainer (males have been shown to be harder judges in OSATS); familiarity with the trainee; in-patient/out-patient setting; and high/low case complexity.

Conflict of interests and rating problems

There will always be a conflict between a trainer's role as a trainer and as an assessor (judge). The use of multiple trainers helps to reduce this problem, but in clinical medicine it commonly leads to "halo" effects for the most familiar trainees, score inflation and range restriction to the upper end. The assessment may therefore not truly reflect the observed performance. This effect is likely to be exacerbated when numerical or "Likert-type" scales are used, as they currently are in Obstetrics and Gynaecology: more complex forms may in themselves make trainers less able or willing to use the forms as intended, and assessors develop a mid-to-upper range tendency. Numerical scales, while seeming a more familiar format for rating trainees, also pose the problem of a "pass mark" that needs to be achieved (these are formative assessments and do not have a pass–fail element) and can deflect attention away from the most important element of the assessment: direct verbal and written narrative feedback.

Problems in delivery

In the UK, several pressures have coincided on senior clinical staff over the same period of time: financial pressures on NHS trusts to deliver service targets, and an explosion of assessment (including WBAs) for medical students and all grades of junior doctors.




Practice points

• WBAs are a formative assessment principally designed to give trainees feedback.
• WBAs have good construct and face validity.
• WBAs are probably reliable if enough are performed and different assessors are used.
• WBAs can be time consuming, and feedback indicates there are real difficulties delivering them due to service pressures.
• Narrative feedback is more important than any rating scales that are used.
• WBAs can be used as one of a number of components in a trainee's portfolio to inform the summative end-of-year assessment (ARCP).
• Ongoing research is still needed to ensure that these tools remain fit for purpose in the UK training system.


Not surprisingly, time pressures mean that the single biggest barrier to the implementation and maintenance of WBAs is delivering sufficient numbers to make them meaningful. Lack of clarity about the purpose and utility of WBAs also meant they "got off on the wrong foot", but with national guidance and information the situation has become clearer for trainees and trainers alike. There is also a well-recognized cultural problem: significant numbers of trainers, and indeed some trainees, have been resistant to change. This is slowly changing, as younger trainees have had WBAs embedded in their training for years (it is the norm) and investment in local and regional "training the trainers" activity has increased.

Summary

WBAs have become an integral part of modern postgraduate training in Obstetrics and Gynaecology in the UK and are here to stay. A cultural change in thinking is happening, and hopefully WBA use will become a day-to-day activity, as it was always intended to be. The roles of the clinical and educational supervisors and college tutors have now been defined, and appointment to these posts is now determined by due process rather than informally, as before. These supervisors and tutors have a pivotal role in ensuring local delivery and quality assurance, while trainees must take responsibility for driving their own learning. In time, numerical rating scales are likely to be removed, to make the feedback element of the assessment the focus of any discussion rather than "gaining a certain score". Continued investment is needed from the national and regional bodies (good assessment is quite simply not cheap), and educational guidance is required from the RCOG and PMETB to ensure a level playing field for all. Good communication between NHS trusts and educational bodies will help to balance the needs of service delivery and training provision. The tools are valid and probably reliable, but delivery of them remains the single biggest problem.

FURTHER READING

Bodle J, Kaufmann S, Bisson D, et al. Value and face validity of objective structured assessment of technical skills (OSATS) for work based assessment of surgical skills in obstetrics and gynaecology. Med Teach 2008; 30: 212–6.

Chikwe J, De Souza A, Pepper J. No time to train surgeons. Br Med J 2004; 328: 418–9.

Davis M, Ponnamperuma G. Work-based assessment. In: Dent J, Harden R, eds. A practical guide for medical teachers. London: Elsevier, 2005: 336–45.

Norcini J, Blank L, Duffy F, et al. The mini-CEX: a method for assessing clinical skills. Ann Intern Med 2003; 138: 476–81.

Pangaro L, Holmboe E. Evaluation forms and global rating scales. In: Holmboe E, Hawkins R, eds. Practical guide to the evaluation of clinical competence. USA: Elsevier, 2008: 24–41.

Rethans J, Norcini J, Baron-Maldonado M, et al. The relationship between competence and performance: implications for assessing practical performance. Med Educ 2002; 36: 901–9.

Schuwirth L, van der Vleuten C. How to design a useful test: the principles of assessment. Edinburgh: Association for the Study of Medical Education, 2006.

Wragg A, Wade W, Fuller G, et al. Assessing the performance of specialist registrars. Clin Med 2003; 3: 131–4.

Work-place based assessment – a guide for implementation. London, UK: Postgraduate Medical Education and Training Board and the Academy of Medical Royal Colleges, 2009.

Wilkinson J, Crossley J, Wragg A, et al. Implementing workplace-based assessment across the medical specialties in the United Kingdom. Med Educ 2008; 42: 364–73.
