a subjective rating scale for assessing - dtic.mil subjective rating scale for assessing pilot...

18
copy UNLIMITED "--) TR 90019 0Y) AD-A227 864 ROYAL AEROSPACE ESTABLISHMENT DTIC IELECTE fl 0 D U Technical Report TR 90019 March 1990 A Subjective Rating Scale for Assessing Pilot Workload in Flight: A Decade of Practical Use by A. H. Roscoe G. A. Ellis Procurement Executive, Ministry of Defence Farnborough, Hampshire UNLIMITED 90 10 .22 164

Upload: lamdien

Post on 08-Mar-2018

218 views

Category:

Documents


1 download

TRANSCRIPT

copy UNLIMITED"--) TR 90019

0Y)

AD-A227 864

ROYAL AEROSPACE ESTABLISHMENT

DTICIELECTE fl

0 D U Technical Report TR 90019

March 1990

A Subjective Rating Scale for Assessing

Pilot Workload in Flight:A Decade of Practical Use

by

A. H. RoscoeG. A. Ellis

Procurement Executive, Ministry of Defence

Farnborough, Hampshire

UNLIMITED

90 10 .22 164

UNLIMITED

ROYAL AEROSPACE ESTABLISHMENT

Technical Report 90019

Received for printing 5 March 1990

A SUBJECTIVE RATING SCALE FOR ASSESSING PILOT WORKLOAD IN rLIGIET:

A DECADE oF PRACTICAL USE

by

A. Hi. Roscoe*

G. A. Ellis**

SUMAARY

Llo~pite the many techniques deveýloped for tvaluati~ig pilot workload in

fliqht siubjectivv aa e.,wient by excporience-4 p;.lots is atill, the rtoost relitable

mtehod kby far. This report doscribe4 the deoign and developmert --with tha nelp

of practising test pilots - of a ten-pe..nt catini qca.'e. The s'cilo uses a

dociaic~n tree similar to that aosd by the Cooper-Harptoz handling Qualities seale,

dnQ1, bda~ed on tne concept of space cap~acity. ExawapIes are given of its use by

a larqu numbor of pilots in variou.a flight triala and vykload atudie3..I t

DoparttwntalReene F1 6

c~opyrivhm

C'anitroJ2@r UNSO Londonb1990

Foermly ur RAE, Wedfod#apd now with Pkrr~!anna Airway**Fortmetty of M~E, flodford, and now with British Aerospace~

MUtA ZHL=

'Ar

2

LIST OF CONTENTS

Page

1 INTRODUCTION 3

2 THE 'BEDFORD' WORKLOAD SCALE 4

2.1 Design and development of the 'Bedford' scale 4

2.2 Description of the scale (Fig 1) 6

3 PRACTICAL EXPERIENCE WITH THE 'BEDFORD' SCALE 7

'. DISCUSSION 11

5 CONCLUSIONS 13

Refr~re ives 14

I.~lustration .Figure 1 (in text)

Report documentation page inside Lak cover

j-,

A-1k

tI-

3

1 INTRODUCTION

In 1969 the increasing importance to flight safety of changes in levels of

pilot workload generated by new operating techniques such as vertical take-off

and landing (VTOL), low visibility landings, and reduced noise approaches, then

being evaluated at the Royal Aircraft Establishment at Bedford, resulted in a

greater interest in assessing workload during flight testing. Unstructured pilot

opinion recorded during flight, or, more often, after flight was the accepted

method of evaluating workload at this time. The possibility of obtaining

misleading information because of bias or of pre-conceived notions about workload

levels - a recognised problem associated with subjective techniques - resulted in

a programme aimed at developing a complementary but more independent measure.

Following a detailed survey of available techniques, monitoring of pilot's heart

rates was selected for further evaluation.

After some 5 years, during which considerable experience was gained of

using test pilot's heart rates to support their opinions of workload, it was

decided to improve the method of obtaining subjective assessments by employing a

specially designed rating scale. A search for a suitable scale for use in flight

was unsuccessful: most research on subjective ossessment by pilots had been

concerned with aircraft handling qualities'1 3 . Although some scales such as that

designed by Cooper and Harper3 have somotimos been used to rate workload they are

not ideal for this purposo. As Gerathowhol4 pointed out, ". . . subjectivo pilot

ratings of handling qualities, as accurate as they may be in rgoard to control

desiability or difficulty, do not contribute to workload determinations, aineo

they are only loosely connected to task dmatnds and pilot rosponseO. The

decision was made, therefore, to design and develop a workload rating acale at

RAE Bedford 'on the back' of current flight teatinq and with the help of

praetiainq tost pilots.

As well a3 dosigning a pilot workload rating scale it esoomd 3ensible to

defino workload. A toview of the literature revealed A plathord of doe(nltina

based mostly en workload as a 4et of flight usk domands, a* the oftott roquird

to satisfy those demands, ir as th4 rosult* of that effort - porotman. . M#4y

of the d4flnitioen apdafw espvlicatod And/or unrealistie in thd 2onteox of ral

Utlttg a Is and fto a*o& obtained the Vlows of sum* 3O

tilitaty and kAirlino pilot3 and conciud*J thaL mtr~r thian 001 of Pro easioual

pAl*tn think of workload in totrm of effort. this is also an inr4protatioa that

agreod Woll wiuth the 1nfluee of sueh individual facotos as natural ability.

. ., ..... , . .

4

training, and experience on the piloting task. There is evidence that the

failure of pilots to perceive the demands of the flight task correctly has been a

causative factor in several accidents; it also seems likely that workload levels

tend to be determined by how a pilot assesses the flight task. With these

findings in mind Ellis and Roscoe proposed that a slight modification to the

definition of workload used by Cooper and Harper in the introduction to their

Handling Qualities Rating Scale3 would be appropriate, namely: "Pilot workload is

the integrated mental and physical effort required to satisfy the perceived

demands of a specified flight task".

2 THE '5KD10RD' WORKLOAD SCALX

2.1 Design and development of the 'Bedford' scale

The initial ob jective was to design an interval scale that could be used to'

give ratings in flight during - or immediately following - highly demanding

piloting tasks, and that would result in absolute values of workload.

As Ellisa' observed. "The use of rating scales results in the allocation (:4

a numerical value to the quantity that is being muasured. Not unnaturally,

researchers Wish to use statistical and mathematical proce.sses on the nube.ýS 1-

obtained, and so moot of the rating scales that have been devised hAve b~een

intended to be linear". However, McDonnell'. in discu!2sinq the rating of

aircraft handling qualities. referred to the difficulty of ac:hiiwinq linoarity

with ordinal and adjectival scales. A~nd 11e55g later cq=-i tod: "The rmajority

rating scales in existence have two thinqs in coqr=ncn tho'y a~eiio boh or~ia

adjectival~ in natuWo. Futhormero, the results of vaticmý *thior a*rty

3tedI31'ý' aiAed at developing non-adjectival lineat rating vsleoo woto not

encouraging.

It beestee obvious that an ordinal, atijgctival tratingq Ocalf) wjull 12.0 r-4r

41PProPriate for dovvlotwant at, a flight teot cintro 14cknq dtciies? !ot

laboratory exptients. An ordinal scalo tot Aircraft handlinq qqhltiog tnp

Cooper-Harpor scale'. wag allready dovolopod and Ostahblsioed; it. was e~aay *t4 ;A'k-

andi vas widqlv accoptod awangst tout pjl.i~ta and onqlnootm. at h t tad

goo6d adense to tt~y to design thO watkiad scalet usiti a gichla. di~siV. t.,de~

tree vith appropriate deoeriptott;.

The first daeaign, a nineo-point fieale. used doriptott baged- *tn litfott

such as: Pilot effort not a tactor tot doý!re4 pezort*Anca - rating 14l n~-&

partottaaneo requiros ~eoaopilot effort - ratinq I., adequate mac

requiros exrenaive 010t at fott -rating Sý intousivo pilot eflert is rV4,Ltred t"5

L~

5

retain control - rating 8; and finally, control will be lost during some portions

of required operation - rating 9.

At first it was not obvious whether the scale should bc absolute or

comparative. In other words, should it try to cover all possible workload levels

in all flying tasks? Or should it have the more limited aim of acting as a

comparative measure between the workload experienced and that which could be

considered normal or reasonable for the task in hand? It was therefore decided to

construct both types of scale, and then to decide which of them would be the more

appropriate in various circumstances.

An interesting finding from the questionnaire study by Ellis and Roscoe 5

was that pilots find it convenient to think in terms of 'spare capacity' when

considering their levels of workload. What other relevant but secondary tasks

can be taken on in addition to the primary flight task? For example, when the

primary task is an instrument approach in bad weather how much spare capacity

does one consider is available for monicoring the actions of the other pilot,

looking outside the cockpit, listening to the radio etC? The higher the workload

generated by the primary task the less capacity there is for these secondary

tasks. Pilots seen to find it a relatively simple matter to judge how much more

they could do even it there is no requirement to do so,

These findings Auggested that descriptora incorporating the concept of

spare capacity wo'uld be of greater value than roeerence to effort. The scales

were alao extended from nine to ten ratings.

In addition to the concept of spare capacity the 'absolute' scale also

referred to arousal, time, and fatigrue. Those now descriptors Included;

Low workload, plenty of tioe and capacity to cotwleto all tasks At a

moderate arousal satet, level of of fort could be Maintaine~d for severial hours.,-

voting Z,

* Modertetc workload, all pritmary And secondAry taksks within pilot capacity.

* but fairly high arousal state nee~ded tiring and fatigue likely after

1 -2 houra; - vating 3. Very hig;h work-load, only more irvortant secondary tasks

co:Pploted and then only infrequently; - rating 6.

In view of the present concernt about uindoratdusal and uftddrload it is worth

noting that the lowest wborkload de3riptor In this scalA read: Very low workload,

few taoks for the time available, dooe risk of boredom: rating 1. ThIS WAD

* shortly atended to., Workload teo low, too much spAre capacity# danger of

r,.

. A! Vt

6

During construction of the scales it quickly became clear that any

'absolute' scale that attempted to include the whole spectrum of workload levels

experienced by pilots would be too coarse to be practical. Certain tasks such as

gun aiming or landing in adverse weather were always concentrated within a few

seconds and were always high workload. Others, such as en-route flying, could be

sustained for many howirs. To place them on the same scale was of very limited

value. Also, pilots comments and ratings of workload made during various flight

tests in different types of aircraft, showed the difficulty of obtaining absolute

values. It was found that pilots liked to compare their workload to some form of

baseline, usually to previous experience.

Effort was therefore concentrated upon developing a comparative scale that

would help to answer the important and practical question of whether the workload

is appropriate for the primary task under consideration. Subsequent experience

of using the scale h4s proved this decision to have been correct (see later).

The need for concise descriptors in an adjectival rating scale was high-

lighted during the evaluation of the absolute scale in fight. it soon became

apparent that the introduction of the additional factots of aranal and ratgue

complicated the scale unduly. After same development in flilht and furthe-r

discussiOn3 with pilots the present descriptors were introduted (Fig 1). these

were readily acoepted and in a short time considered by Ovdford test pilots tQ be

quite adequate for the purpoae of rating worklead.

a.2 a ecrpto of the scale (Fig 2)

The pilot starts his dei~-al@process4 at tho bettýs. ý5f

the decision tree, which consists of tncee quo-,stions roquirinq Yk*A VOi .1in order to proeend to the dQcaiptionz of different lavejz 0 w .d, ThR

d*soriptora or* of increasing levels of workload aesoiaiated with rotiftqt of to,

10. Half ratings Art alloved thereby incret;ng the gsonitivity cd tho eug,

this bee"*e particu~larly dasivablo at tho lower voitkhvAd lov,*11. et in~.4!y t~l

ratings between the 'decision' groups voto not sought but aa mAny pilotsý sgoqt--

to find it difficult to decida betwevile # 'y anid *tbol o tar t~ voWlo5 Aakigad t -

tocy without t•ductlon?* A rating of V. bacat accptablo.

It is most imtortant that the flight t~ak to bo Pdtod 4hould tv uell.

defined And/or the period of time oveir which the Isseatoent Imad@ be nade

with reasotuable precision. The workload Weing assaesed is that involved In the

esecution of the prieury task; any additional ta~ks - such as tsanitozit other

aet *boee s - must ba Includad as part of the pitot's spare cacity.

7

Workload Description RatingDecision Morklood Insignificant.

Treeg 1 zz:::capcity for all desirable saditional 3yes tss

Insufficient spare capacity for easy attentionwa orla to additional tasks. 4L.wokodNo irReduced spare capacityv. Additional tests cannotsatisfactory ~I be given the desired &mus~t of attention.

without i ___________________ ___

reduction,? sp: are capacity. ravel of effort allows

yes

Very itutlesire a pain t y.in but Satatesanc, of7

effrtto hepr n~r asklo to quelpstf~ien. efot.1

toleabl fo Veryt Atuworload wansat Iit als mpefe spire~n capact)

ma doubts as to ttio ta to Ra1@ van ?#Vol ofneffort.a9

, V41 it ,,,cib, N ?o III rest abocadow. pilo attc U4410 ts @y fcituanta

Th e noa noa .. wraesio o thertadtin Scal va3ft oirt~ again.412daPi~

ThO n -theHS11rror acao i.-jumo tao-af triat" handin q~trial an~dsrod

tho AVrdqm .aof Imin a att itiltoad rooamp to vtv he atebit ake-ff otsratm" of

idtw itto piat t $hoaltu tyiatle01.*t~a-Vt atta oaedtaQT

At th o ftdtCP h f h on fbcfii ibce h odto tr"41iý, donw-Ads o apro-et.anq4. s tovooionl flingspod. i aptdah~ath onaA*gaulyrtto otoatPsto gii

Th s~o d ud t sss hndi% uliit ~d h

Eleven pilots rated their workload and had their heart rates recorded

during ramp take-offs, the ramp angle was increased in steps from 60 to 150 over

the period of the trial. Workload ratings and heart rates showed good agreement;

and both ratings and heart rates confirmed that workload levels were not

increased for greater ramp ar.gles nor for night take-off s. These workload

indicators also demonstrated that levels of workload are higher during the more

conventional short take-of fs (for this aircraft) from a runway.

The workload rating scale was used extensively during a trial to evaluate

Economic Category 3 approach and landings 12. Pilots' heart rates were again

recorded to complement their ratings of workload. The technique involved an

autopilot coupled approach to a decision height of 50 ft for the HiS 748 and 60 ft

for the SAC 1-11 aircraft at which height the autopilot was discoimectod for a

manual landing if the runway lights were seen. If the lights were not seen by

decinion height a go-around Was made.

In addition to rating the final approach with autopilot. and tr.e manual

landing, ratings were given for the very short term workload a aouiatoc with

making the docision.

It, late 1902 the Bedford scale and pilots' heart rate reapon~oe, wore u-e'd

tO a330ss workload in flight during crew eomplomat ce ~~nof thf&MAe 1413 Post flight quoastionnaixeaoa3~ne tho in-fligit. klatak ,ýnw4m

1rom the three toami (of twio pilots) who *eAch flew t~hro etqý,L at ~ e l~'

achqdeeao around a teireuit of threet hiqh initenoity pr tin-

Paris Charles do Gaulle, and Amserdam -W I).

Wrkload tatingsA vot's 2bohind ifvt Itth pilokp An4 t ar,

flight ob*Qrvor, an vtirbaol roquesz tiad llight xlqt~.a4 re kttý ý r~

biy 0eants of steall keybv!4rit fitted to tho rottrýl "lu". afS to tito

clip board. Mkiinga, wh!ich vere plKt-44 auot alyo~c hq-ari tftý

at tho tio* of the roquoet. were ated ttinj kio a o"1aktL'_1r40-

tequiests were woro froquotit dehnqi high rgikloki fgeat9 4>5f fli~hit 9Aýh RA "

takq-of I and Iinlt-loi Olv* the prcoeah a#4i lakndiul4. and vhen iie d !§*1rcltý

flight iallutos and eteoe aur4

pilots Vore DnsttuctO4 int tne ugo of tho relined zgal' beoloe 1'hp - r~s

atirted And, in partittalar, were askiod to eohi'i'sr theit utkowdfit the

pitevioul )0 A. All six pilott And mott Ot the !1lqht ýU10tvet!A fUnýUt thoesr;1

#6asy to irtot thero was no ovitiepee that qiving riatlvq;a ih fli-hI iitua r i4ed

t4e piloting task and only rarely was A iratitil ddlayed tý.-Y the fiqb dqemah4S.

Wi.t kationge uosa not used durlhq this trial aud thai~h t.ho seur.1tivity o~the

7 9

scale was consequently reduced it did not appear to influence its value for

certification purposes. In fact, the ratings obtained during the trial were

considered, overall, tc be of considerable value; and there was also surprisingly

good agreement between ratings given by pilots and those given by most of the

flight observers.

There was a reasonably good relationship between pilot's ratings and their

heart rate responses; disagreements seemed to be due mostly to the failure if the

pilot to rate the entire period under review.

Lidderdale'4 used the Bedford scale and heart rate recordings to assess

crew workload (pilot and navigator) during the evaluation of low-level high-speed

flight in a supersonic tactical fighter aircraft. He reported that "... the

aircrew understood the scale readily and whilst it was sufficiently comprehensive

to cover 411 circumstances it was easy to remember and small enough to be carried

on the flying suit knee-pad"t At pro-planned times the navigator, having

recorded his own rating of workload, would request a r4tir4 trowa the pilot. Beth

pilots And navigators veported little. difficulty in giving ratings3 in flight. A

pos,,t-flight.esat technique. basedof on the Analytical llierarchy PrnCess3'

was U40d to analyys paired compaisons, Liddardalo. in reporting a high

correot.'ion twetvon in-flight ratinls and the Post-flight s snt.oevd

that the farmert ttchnilque was eaieri to useo And mere p;4ctieal for u4O in Aft

;VPert4iOnAl ttial tpertrnA~l etahcto)

Thet Pedfttir'4l(cal w43 uNe" tth succes by Ha Mir an4 RluWelV t

r12l0Kad in arty hlelicpkte pilmts 01n94ge4 in vAtWr4nttflight taskalt 4n4 by

durinr f-.0iAvo rkgatiii i intot tho level a trkoa andI Operstnhq

n.by n4lte.ptoc pilott iValv an Nrth sa 4l1pltfre

W&Aggos atin revq ugv/d the t-9izJVQ -aslo 1c§t~hqt with heart tqr'-

"§~ ýtvo 'ce ho 110voe Mi o~Pilot workloamd 40onerated by toW adflhtC;*d

tecnolgy oetvc~ 141 with th4Oe qgenerlatd by the earlier lkeln" W)t Thit.

liritatolie Airways6 gtasy Vgv~afW -ariout duritng routine pastooener flights in%

rop drig hich lltftepiltsu fOUnd no !diffic-uity in qivintq tatings. at the Oh4

Of 14WR pe 't~iUlat figIht Oik-0 Or sgbs-Pha*e Orfnees. & IWsprienxed fih

ak*,±etver, Usingq 11W Ledfod sesleq. galgo atod the 4iffe'lent flight taskS. Viye

piats rtiipsedin tho *tu4y. three where etftitoted' on tho 0131StAa then,

afto e to type. on the #'G1 tea Pio ve Mnitered of both aircr:taf,

""a they auattlthtd every 4kM nrthiz botween 97) and (061, Aoth workload ringsa~d v~t tay, -etp~lw*SWO44tat~. tlw:Wbftttft ''' levels .at wvrkl'td jto

VII

* 4

A-,

10

lower on the 767 than on the 737. These measures have also distinguished quite

clearly between the different levels of workload associated with flying the 767

in different modes: hand-flying with raw information, hand-flying with flight

director integrated with the flight management system (Ft4S), and with autopilot

and autothrottle.

An extension to this study is presently under way in which workload levels

generated by flight failures, emergencies, and abnormal operating conditions in

the 767 are being assessed in a flight simulator. Both ratings and heart rates

for normal flight in the aircraft and simulator were used successfully to

demonstrate the value of the simulator for this type of investigatiort '.

Practical experience of using the workload rating scale in several t'.ght

trials showed it to be markedly better than previous methods of obtaining pilot

opinion, and though it lack'd some sensitivity - otipecially at the lower leveig

it was becoming well accepted by pilota and by tesearch scientists 414ke, never-

theless, in 19B3 it wa-i decided to carry out a series of *.lights to domn~tstrýt'

the ability of thit rating scale and of heart rate tespons to d~~is

between four short flight tasks having. theckreticdaly. throo dittornt lv-vtýýZ

difficulty. A N81S twin 'businass-3etl was use0d for 'ne! tza ittntv

*#tporioncod pilots flow a total of is 3equoncvs. 94ch voquomonee f:5 -e1tv

ta A 3 6 0 " turn in 2 min at cvngtant altitudo, :A$. and Ygtb oq tuu?%.

(bl A 360" turh in ý min with 4 a ixutno~u4 t~ao v* '00ft.n

tt a constant XAS and Cato 0f tutu.

(C) A 36V~ tufn io : twi Vith 4 04Litoatt4 O t' 011twao.l'4

folowed by 4 tvfae 40 turrn In Z nib Vitih a f

2000 ft at A eofistaft IMS and tato of turn.

id 1%30 tiurn in wi with A ziwk~lahltun 'W Iue 9c~

tach Soqeene was fliswn 4t -A 4ato tigeht in til*at A'tvpte m~ fktsl

pefeimne ws anteedby tmefý -in $' "'Ut ut a§ it 9'1," ~ ei'~

that @"dh pilot was deterMined to "Cet tioll im, the eoea

this vat discootinu~d. It vas tfhe iotwention to. Vaty th ore it% 6hi0% tthe k46'Aik

"ter flOW6 - e~t Oatatiotally, it %oat ouch poreQ ctnvouient Io fl-Y oath ~$Lit the bcdar 1-4 with tht? I itt task beIA4 reV@Ate4 ikk t49 000~ 4-f tMe t~h

6"ert MOMw wore ooeotdad throughout Oat~h 044UPftV 4 an wserk1nKad sativ4-1 vOte

t#4u~ead Atter oath tlilqht task. fthre pilot* Clow the .s1~uthro# tvito. 'thOtSr

Pilots not curtear e the 126 veoe give" at. boat 30 ai f~aillativatien0 beIO-t

being asked to rate the tasks; similarly, those pilots unfamiliar with the rating

scale were given a full briefing bef.'jrehand.

Results were highly encouraging and demonstrated that for 110 of the

12 pilots both workload ratings and heart rate responses were able reliably to

distinguish between the different tasks. In addition there wasa very good

agreement between ratings and heart rates for six of the 10 pilots and reasonably

good agreement for the other four.

The meAn ratings and range for each flight task were as follows:

Task Mean Rating Range

1 4.8 3-6

2 6.1 4-7

3 S. -0

47,0 4-a

1 5.03-6

0n thezortical grounds it wa.s ques4tiotiabt~ whether t~alk -31, which lasted

twiceo as long as the othter throe, would be rated lower Than taýýk - 4C In the

QvQ5It, task - 3 w4&. -4ctad higher thgn teak - 4 on thtoea occasions. lower on' fivo.

An-4 the agrw cst sevqn occasions. AlIheugh piloti were askted to rate the enttse

tastk, half of the pilots gave two. ratings tat task - 3.

Ovoza~l, kthere w4As no- ovide@n-tv of le~faq igoad roduced wothlee4d twe

t4ask - I flowft 4t tho #t4!.t end 4t the end "t the aeqtwsiee. Similerly, thtv*t

pia§ w04 fle6, the aewee oa n:f two separ~at ottvttna 'lid sn.j ow " on vidonce n3f

vorkto-gtl r-kd!Cti-o.n with iiOet 4txrailiarity.

4 DISCUSSION

th# lkd--ferd s'±atg haN now bieen int up 9#1 "toe tha%0y~k n t & h

4etiilt-a prinnaly !fo uge by ten11 PIl&t hj bQeen Vdsontll r lti

Pilts cvi hlloptot plota,. atd alirline0 r110411. The idea 04 epto

capaity apals tgg tst ilotsh eto eprt nlat it, helps thgo ttO arrive at at

fl~ltptiae r~in~jwithrelaive siath'is i stoaryvla~ewe fr

r~io th s*e~PttracttIte. stints a4.Th-it to bkecI0 Isalliaf sit\ý theb scalg

resattbiyq~ely to'. ilos'then s4ers tN think *ftly In tvtft Of ntU#6ers

witoutse~enc tothe Aactal taflsion Itre.

the a~ttia~e t fwng C t* use a rtto itdjSale diarfing or sbortty 4fter,

a Tk**a#d0t-j flight task hAt been decssssotratod 04.4%V tines durift the W0 yeata

especiilly durin g log flight sectors tgq~uliti# tony ratings.4 P4st (11gMt

tatiq4 -t ws even s th*m assistiktce ot video rocrtdibqS - Oe.Utb lossb tvllablve.

r' Rs;

4'. . .. ..

12

The authors are unaware of any other workload rating scale that has been

used as extensively for assessing workload in the 'real world'; nevertheless,

there are a number of possible shortcomings.

Several authors have underlined the importance of sensitivity and diagnos-

ticity, others have stressed the multidimensional nature of workload, and rating

scales having varying degrees of sophistication have been designed with these

issues in nirnd 2 0 - 2 2 . A subjective workload assessment technique (SWAT) developed

by Reid and his co-workers 2 3 , 2 4 considers workload in three dimensional terms -

time load, mental effort, and psychological stress. Hart and her colleagues25, 2 6

have developed and refined a multi-dimensional rating scale consisting finally of

six subscales, namely: physical demands, mental demands, time pressure, own

performance, effort and frustration.

Certainly, in many situations it may be quite important to be able to

analyse the reasons why workload has changed. The Bedford scale does not have

the power of diagnosis but in practice, on the rare occasions when diagnostic

information is required, post-flight discussions with the pilots - especially

when their beat-to-beat heart rate plots are used as an aide memoire - are

proving to be of considerable value. And, in most cases, assessment of overall

or global workload is all that is required - for example, during workload

assessments for the purpose of crew complement certification.

The lack of sensitivity at the lower end of the scale was at first thought

to be a disadvantage, but experience now suggests that it is unrealistic to

. strive for a high level of sensitivity. This is particularly so in view of the

variations in subjective evaluations between pilots and, occasionally, within the

same pilot from time to time. In addition, the cost effectiveness of modifying

systems or procedures to correct for small differences'in satisfactory 'otkload

-". .is questionable.

The nonlinear nature of the scale, although making it difficult to cnar:y

out statistical treatments on data, does not seem to cause any problem in

practice.

The decision to develop a scale giving relative - rather 1,han absolute

valu;es o.f workload has not appeared to pose any problem from the practical point

'o view, And it may be questionable whether absolute rating scale, are really

recessary. Lidderdale14 , in comparing in-flight with post-flight worlload

assessments, w!'*Le: "It is possible that all assessments of workload are made

trom a baseline of comparison with other elements in the flight and, if this is

the caoa, all ra~ing methods may be relative."

I

. . ' " .. . • • .

13

Examination of several hundred workload ratings given -luring a variety of

flight tasks or flight phases together with attempts to correlat~e katings with

heart rate responses have resulted in three important findings. Firstly, as

suggested by Ellis 6 in 1979, it was not possible to compare ratings for different

flight tasks; for instance, ratings given during the take-off could not be

related to those given during the approach and landing.

Secondly, although reasonably consistent, ratings were highly individual

for each pilot; unless some normalisation procedure was applied to the data - if

sufficient data were available -each subject pilot had to be considered as his

own control; a sirni±ar but more marked idiosyncrancy has already been identified

for heart rate responscs25.

Finally, a goo~d correlation be,.o;en worklead ratings and heart rate

responses for the name flight task tqý.s apparert in about S0% of pilots bk~t one in

five did no, show any agzeement. The veasot- for this lack of agreement has not

always been identified, s.c'meLimej it was bec-ause a pilot i4ss not integrated his

workload over the entire period of interest, but there is mnoie evidence that it

may be related t V~.- nature of that pa-licular indivietual's heart rate response.

5 COZICLUSXQNS

The B~edford workload rating scale was designed for uae in the 'real world'

of practical flight testing some 10 years ago. anrl pilots have, without

axception. found it easy to use in flight. During tho past decadde, despite the

shortcomings rotortted to above, its value has b~een demonstrated in several flight

triala - especially when ratings have been augmented by recording the pilots

heart rate.

Rlowovat, unlike the studies errtiod out an some other rating

scals ~the Vvdford scale hasi not boon subjected to a critical evalu-

otion in contrtilled labor~tory expoariinenta. Nevertholonot. the authors beliovo

tho #cale in ita present form is quito suitablo iot assessing workload in n~st

3rs~o ituAtionj.

41.

14

1EFZRZRNCES

No. Author Title, etc

1 R.P. Harper Pilot evaluation of handling qualities.

Ninth Annual Human Factors Society Meeting

(1965)

2 J.D. McDonnell An application of measurement methods to imrrovv

the quantitative nature of pilot rating scales.

IEEE Transactions on Man-Machine Systems

10, 81-92 (1969)

3 G.E. Cooper The use of pilot rating in the evaluation :,f

R.P. Harper air~craft handling qualities.

NASA Technical N6te TN-D-5153 Washington DC

(1969,1

4 S.J. Gerat~hewohl Definition and measurement of perceptual and mental

workload in aircrew and operators of Air Force

veapon 3ystemi.

In AGARD CF-181 Higner mental funetioning Ln

operational aystem5 AGAP.D, Paris (1976)

5 G.A. Ellis The airline pilot's view of flight dock werklal1:

A.R. Roscoe A preliminary study using a quostionnaire.

RAE Technical~ Memorandum~ FS(IBý 465 i)

6 G.A. Elisa Subjuctivo assa~sment. pilot apinion momiuroi,

In: AGARL'ograph No 233 Ao.5coe A 11(d

Aasfsosinq Pilot Wokload, A"AKD. V~rls ti9Th1

1 0. McDonnell Pilot rating tochniquots for thQ edtititaion Andi

ovi~luation of hvidingq i~gtito.4.s

AVFF)J - TA-60-"6 W;ri4Vatwraor1 Arlh, Oh;D tl(

8 I.A.A )I08s Ron adjaotiva4 ratinq ticalo.s In humtAn roapwi

MUwtAu Factors. 15, 275-280 (1973)

9 D.A. SPYkOr DQ~alopmot~n of tachniquot tot #*a urifiq piot

UK"A elk-k8$iF, wahingto" DC (19711

7 .>

15

REFERENCES (continued)

No. Author Title, etc

10 J.M. Rolf Evaluating measures of workload in a flight simulator.

et al In AGARD CP-146 Simulation and Study of high workload

operations.

AGARD, Paris (1974)

11 A.H. Roscoe Assessing pilot workload in flight.

In: Conference Proceedings No.373.

Flight Test Techniques, AGARD, Parin (1984)

12 A.H. Roscoe Pilot workload and economic Category 3 landings.

In: Conference Pre-prints Aerospace Medical

Association Annual Scientific Meeting,

Washington DC (1980)

1i W.A. Wai .right Flight test evaluation of crew workload.

In AGARDograph No 282 Roscoe A H (Ed)

The practical assessment of pilot workload.

AQARD. Paris (1987)

14 I.G. Lidderdale measurement tt aircrew workload during low-level

flIght.

!n AC'RDoqrs.n No 282 Roscoe A H $Ed)

The practical aass3mont oZ pilot workload,

AGAHD, Pa-ii (198')

"i5 T.L. $Sdty Th-i atialytical hierarchy prcess.

Heiraw-Hl1X (1980)

16 H.C. Muir The asts,=ýont of wuorkloari in helicopters.

R&. El-woll In., Procae4.ýngs of R~AE/AGARNDCIT Sympo~itz -

The prwical maeo-imnt of pi}o. workload.

Cranflic (198f)

17 P. |Harn'0 Pligqht d.ck environmont and lorkih.g in North Sea

A.C. reobor helicopter Opqcataona.

"La tonforenco Nbstracts Aerospace ModicAl

Agaoeiation Annual Scientific Hooting,

W4Ajhington DC (1188)

%%

: 19 A.fl. RoSceN The i.ct of nw technotoqy en pilot vo~k~oad.• U. •S, Or~eve ,,AE "Te.hnatcal Pap~er C.&•• (1986)

-Ii,•-••::.

16

REFERENCES (concluded)

No. Author Title, etc

19 A H. Roscoe Assessment of pilot workload during Boeing 767

B.S. Grieve normal and abnormal operating conditions.

SAE Technical Paper 881382 (1988)

20. J.H. Skipper Evaluation of decision-tree rating scales for

C.A. Rieger mental workload estimation.

W.W. Wierwille Ergonomics, 29, 585-599 (1986)

21 G.B. Reid Development of multi-dimensional subjective

et al measures of workload.

Proceedings of the International Conference on

Cybenetics and Society, 403-406, New York (1981)

22 S.G. Hart Development of NASA-TLX (NASA-Task Load Index).

L.E. Staveland Results of empirical and theoretical research

In: Human Mental Workload, Eds P.A. Hancock and

N. Meshkati, North Holland Press, Amsterdam

(1988)

23 G.B. Reid Application of conjoint measurement to wo-kload

C.A. Shinglodocker scale development.

F.T. Eggomeir In: Proceedings of the Human Factor3 Society -

25th Annual Meeting (1981)

24 F.V. Schick The use of subjective workload o3sessmnt to? nquv_

R.L. Hann in a complex flight task.

In AGARDograph No 282 The practical asseisimot iZ

pilot workload. Ed A.H. RoSCoe, AGAIAD, Vari3 (097)

25 H.R. Bortolussi Measuring pilot workload in a motion base traintr.

U.N. Kantowitz A Co0Parison of four tochniquos.

3.G. Hart Proceedings of the Third Biannual Symposlum on

Aviation Psychology, Columbua, Ohio (1988)

26 W.W. Wierwill* Deciaion-trao rating dcalo. for workload ostlmntid:

J.H. Skipper Thema and variations.

C.A. Riegat Proceedings of 20th Annual Contfoonco on Manual

Control, Sunnyvale (1984)

27 P. Tsang Cognitive demands in aUtonation Aviat,

W.W. Johnson Space &avi.on Hod, 60, 130-5 (.1989)

..I• ,. .4

REPORT DOCUMENTATION PAGEOverall security classification of this page

E - UINLIMITED

A\,: fir a, possible this page should contain only unclassified information. If it is necessary to entcr classified information, the boxaii' xc ni'st be ma.ked to indicate the classification, e.g. Restricted, Confidential or Secret.

I DRIC Refet~eicc 2. Originator's Reference 3. Agency 4. Report Security ClassificationiMarking(o be added by DRIC) Reference

RAE TR 90019 UNLIMITED

I S DRI( Code for Originator 6. Originator (Corporate Author) Name and Location7672000H Royal Aerospace Establishment, Bedford, UK

i '. Sponsoring, Agency's Code 6a. Sponsoring Agency (Contract Authority) Name and Location

fitle A subjective rating scale for assessing pilot workload in flight:

a decade of practical use

For. (ltI 1 ranslations) Title in Foreign Language

7b. ilFr (Conference Papers) Title, Place and Date of Conference

MA.\,,ix1. I Surname. Initials 9a. A'jthoa 2 9b. Authors3,4 .... 10. Date Pages Refs.March I IRoscoe, A.H. Ellis, G.A. 1990 16 27

S(,,trac{ .,mbei 12. Period 13. Project 14. Other Reference Nos.

Flight Management 6

I:1) Controlled by -

(b) Special limitations (if any) -It it is intended that a copy of this document shall be released overseas refer to RAE Leaflet No.3 to Supplement 6 ofMOD Manual 4.

o. Ih'w-uptor (Keywords) (Descriptors marked 0 are selected from TEST)

Workload. Flight. Pilot. Rating scale.

i 7 .Xbq•'trj

Despite the many techniques developed for evaluating pilot workload inflight subjective assessment by experienced pilots is still the most reliablemethod by far. This report describes the design and development - with the helpof practising tqst pilots - of a ten-point ratin.g scale. The scale uses adecision tree similar to that used by the Cooper-Harper Handling Qualities scale,and i8 based on the concept of spare capacity. Examples are given of its use bya large number of pilots in various flight trials and workload studies.

Best Available CopyRAI, AIi (tledQthe Y~