ELSEVIER

BIOLOGICAL PSYCHOLOGY

Biological Psychology 40 (1995) 197-208

Assessing the impact of stressors on performance: observations on levels of analyses

John Caldwell

U.S. Army Aeromedical Research Laboratory, Fort Rucker, AL 36362, USA

Accepted 3 November 1994

Abstract

Applied researchers often are required to rely on limited laboratory studies to estimate the effects of various stressors on actual job performance. It can be difficult to select measures which lend themselves to implementation in laboratory settings while also providing sufficient capability to predict complex ‘real-world’ performance. Studies which employ simulations of operationally-relevant tasks and those which include the administration of basic cognitive tests are favored by many applications-oriented researchers. This is despite the fact that such a testing approach may limit sensitivity due to the requirements for extensive practice on these tasks in order to obtain stable results. Studies which use physiological assessments appear to be less readily accepted by applied researchers because of the difficulties in drawing a direct link between physiological indexes and operational performance. However, there are arguments to be made for the inclusion of physiological evaluations with the more traditional, performance-based measures. Data from three studies are cited here to support the value of using a multifaceted approach to the study of operationally-relevant stressors. Although these studies were not conducted to systematically investigate the relative merits of performance, cognitive, and physiological assessments, they do serve to highlight the fact that inclusion of all three types of tests tends to maximize the validity, interpretability, and sensitivity of applied research.

Keywords: Physiological assessments; Performance assessments; Cognitive assessments; Flight evaluation; Atropine sulfate; Antihistamines; Sleep deprivation

Elsevier Science Ireland Ltd. SSDI 0301-0511(95)05115-Q

198 J. Caldwell et al. / Biological Psychology 40 (1995) 197-208

1. Background

Scientists who work in applied research are required to answer operationally-relevant questions about the performance impact of a variety of stressors. Examples of such questions in military aviation research include those concerning the impact of chemical defense medications, an array of pharmacological compounds (e.g., antihistamines, stimulants, hypnotics), and stresses imposed by sleep deprivation, mission demands, and clothing or equipment.

To study these and other factors, a number of categories of measures are available to the researcher. These range from the basic physiological measures such as electroencephalograms and evoked potentials, through various types of cognitive skills tests, to the applied performance measures such as flight simulation and actual in-flight evaluations. The data derived from each category are essential to making reasonably accurate judgements about the effects of operationally-relevant stressors on ‘real-world’ performance.

Flight evaluation, in an aircraft or simulator, is the most face-valid technique available for determining the manner in which an aviator’s flight skills will be affected by some factor in the real world. Here, the objective is to set up a realistic mission that can be safely and precisely executed under controlled conditions. The results then can be generalized to performance in the operational environment. However, as Lees and Ellingstad (1990) point out, it is difficult to select from the myriad of available flight-performance variables those which will adequately discriminate among differently-skilled pilots or various experimental conditions.

Cognitive tests offer an alternative that also possesses a moderate degree of face-validity without many of the complexities and expenses of flight evaluations. AGARD (1989) points out that, based on models derived from information-processing theories, it is reasonable to study potential operational problems using tests of basic cognitive skills. Basically, the goal is to determine the subtasks which make up the more complex operational tasks and test these subtasks under the influence of a given intervention to estimate the impact of that intervention on ultimate operational performance. This approach tends to overcome the problem of choosing from a profusion of different variables available in complex performance scenarios by limiting the selection to a few basic skills presumed to be common to several aspects of performance.

Physiological indexes constitute another level of assessment. Although the collection and analysis of physiological data represents the least face-valid method for assessing the status of pilots, Caldwell et al. (1994) provide a variety of reasons for their inclusion in studies of operational stressors. Among these is that they offer a direct indication of the effects of interventions on the basic physiological substrate which is thought to underlie both cognition and performance. Also, physiological indexes are not susceptible to the practice effects which tend to diminish the sensitivity of many performance measures. In other words, resting EEGs can be collected repeatedly without affecting the overall results or diminishing the sensitivity of the assessment. However, repeated administrations of cognitive tests typically produce performance improvements (due to practice) as well as increased test stability (often associated with decreased sensitivity).

Each level of analysis (i.e., behavioral, cognitive, and physiological) is associated with advantages and disadvantages, but a strong argument can be made for the inclusion of all three in studies of operationally-relevant stressors. Converging evidence from simulated-job performance, basic cognitive skills, and physiological status contributes to a more complete understanding of the effects of an intervention than that which is available from any single measure studied in isolation. Furthermore, there is evidence (which will be presented later in this report) that a reliance only on cognitive or simulated performance testing to the exclusion of physiological examinations may lead to overly-optimistic estimates about operational performance. In other words, a stressor may not impact selected performance measures (from cognitive or simulation tasks) in the laboratory despite the fact that a subject’s basic functioning has been compromised. Physiological manifestations of this compromise often exist, but if the analysis approach excludes evaluation of physiological status, these manifestations go undetected. Thus, in cases where general cognitive skills or simulated performance is the sole focus, it might be concluded that the stressor exerts no adverse effects at all when, in fact, the effects are present but simply elude detection at one level of analysis (the behavioral or cognitive level).

The absence of observable decrements in the types of basic cognitive skills which are frequently tested in the laboratory does not guarantee a similar lack of impairment on more complex performance tasks (such as flying a real aircraft), regardless of whether or not those skills are components of the ultimate performance task. This point must be kept in mind when evaluating operationally-relevant problems. As will be shown below, in cases where the intervention (stressor) is powerful, it may produce a wide range of effects that include decrements (or enhancements) in both uni-dimensional and multi-dimensional performance as well as ultimate operational performance changes. However, in cases where the stressor is mild, it is more likely that the basic simplistic skills tests will not be impaired even though the more complex operational tasks may suffer. This is because (1) subjects are trained (and overtrained) on these basic laboratory tests to the point where their performance on cognitive and simulator tasks becomes somewhat automatic, and (2) the tests are typically short, relatively simple, and uni-dimensional, so that individuals can effectively allocate their diminished resources in a way that often will not be possible under complex, concurrent, and sustained demands. While it is true that many operational tasks are also over-practiced to the point of near-automation, there remains sufficient novelty and unpredictability in the real world to prevent the reliance on rote-memory and static resource allocation which is often present in more controlled situations.

Ignoring the problems associated with response complexity, stimulus novelty, and resource allocation can lead to overly-optimistic estimates of the effects of a given stressor on operational performance. This is particularly true in cases where the investigational methodologies do not include dimensions beyond basic skills testing and/or performance evaluations. In order to make more accurate estimates about the potential impact of interventions, it is desirable to employ a multidimensional assessment strategy whenever possible. The strategy should include evaluations of an array of indexes ranging from the basic (i.e., physiological) to the complex (i.e., simulated or actual job performance). Without taking such an approach, it is difficult to make valid estimates of ‘real-world’ performance effects from limited laboratory studies. This is because the most face-valid performance measures may not in fact possess the highest degree of actual validity or reliability in terms of identifying operationally-significant decrements. Conversely, the physiological measures which are often discarded in applied research may offer the most sensitivity in terms of identifying functional changes which may have real-world consequences.

To illustrate this point, three recently completed research studies will be summarized briefly below. Two of the studies consist of a three-dimensional approach to the assessment of an operationally-relevant intervention which includes (1) electrophysiological measurements, (2) basic cognitive or psychomotor skills testing, and (3) actual or simulated flight performance. The third study includes only the EEG and cognitive assessments. However, all three clearly point out the utility of including physiological (EEG) evaluations in assessment strategies where the objective is to make determinations about whether specific interventions may compromise operational performance.

2. Studies

2.1. Effects of atropine sulfate

The first study is one in which the effects of atropine sulfate were examined in army helicopter pilots (Caldwell et al., 1992). This study was of interest to the army because atropine is provided to aviators in sets of three 2-mg auto-injectors for self-administration as an antidote to nerve agent. There was concern over what would happen in situations where the drug was injected after an aviator mistakenly perceived that he had been exposed to nerve agent. Would this individual be able to continue with his flight mission or should he be required to land his aircraft immediately and wait for the drug’s effects to dissipate?

2.1.1. Methods

To answer this question, 12 aviators in good health were tested after each had passed a rigorous physical examination. These pilots flew standardized flight profiles in a specially-instrumented UH-1H utility helicopter while measures of altitude, airspeed, heading, pitch, roll, vertical speed and instrument-landing system (ILS) localizer and glide slope were collected (Mitchell et al., 1988). Psychomotor-tracking ability was assessed in close proximity to the flights with a Zero-Input Tracking Analyzer (ZITA) which presented a laterally moving target on a dot-matrix display. A joystick attached to the console of this device was provided for subjects to control the movement of the target. Electroencephalographic (EEG) data were collected in the laboratory with a Cadwell Spectrum 32 which provided for acquisition, storage, and analysis of 21 channels of EEG data.

Subjects remained in the laboratory for periods lasting up to 11 consecutive days. During this time, they began by being trained on the proper execution of the flight profile and laboratory tests prior to initiation of the counterbalanced, double-blind dose and control-day sequence. On the evening of a subject’s first training day, 25 EEG electrodes were attached for EEG recording, and these electrodes remained on the subject’s scalp continuously for the duration of the study.

There were 2-3 training days (depending on when asymptotic performance was reached), and 3 dose days in the study. Each dose day was followed by a control day which allowed time for the last dose to be eliminated before the next dose was administered. Extra days were sometimes included when inclement weather prevented the conduct of research flights. On each dose administration day, only one dose (placebo, 2 mg or 4 mg I.M.) was given. Each dose day consisted of a morning session of laboratory tests, a single injection, a morning flight, a noon laboratory session (beginning about 2.5 h post-dose), an afternoon flight, and an evening laboratory session (beginning about 8 h post-dose). Control days were similar but there was no injection, no flying, and no evening session.
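The report states that the dose sequence was counterbalanced but does not describe the scheme. As a minimal sketch only, assuming full counterbalancing in which each of the six possible orders of the three doses is given to two of the 12 aviators, an assignment could be generated as follows. The subject labels and the function name are hypothetical and are not taken from the study.

```python
from itertools import permutations

DOSES = ("placebo", "2 mg", "4 mg")  # the three I.M. doses named in the text


def counterbalanced_orders(n_subjects, doses=DOSES):
    """Assign dose orders so that every possible order is used equally often.

    Assumption: full counterbalancing. The report says only that the
    sequence was counterbalanced, not how it was constructed.
    """
    orders = list(permutations(doses))  # 3! = 6 possible orders
    if n_subjects % len(orders) != 0:
        raise ValueError("n_subjects must be a multiple of the number of orders")
    # Cycle through the orders so each one is used n_subjects / 6 times.
    return {f"S{i + 1:02d}": orders[i % len(orders)] for i in range(n_subjects)}


schedule = counterbalanced_orders(12)
```

With 12 subjects, each order appears exactly twice, so any carry-over from one dose day to the next is balanced across the three doses.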

The laboratory test sessions included a 2-min resting eyes-open/eyes-closed EEG which was collected in a sound-attenuated chamber during the baseline and the two post-dose sessions. The EEG testing was followed closely by three increasingly difficult tracking tests in which the delays between control movements and cursor responses and the acceleration rates for the cursor were manipulated (in the easiest test there was no delay and the acceleration rate was constant). Subjects were required to hold the cursor over a centered tracking target using a joystick control.

The flights consisted of a set of standardized maneuvers in which performance was evaluated via the computerized aircraft-monitoring system during higher altitude work. The maneuvers conducted under visual flight rules consisted of five straight-and-level segments, two standard-rate level turns, a climbing turn, a descending turn, a straight climb and descent, and two steep turns. The maneuvers performed under simulated instrument conditions consisted of one straight-and-level segment and an instrument-landing system (ILS) approach.

2.1.2. Results

Electrophysiological data showed atropine-related increases in relative delta activity at Fz, Cz, and Pz (p < 0.05) where there was more 1.5-3.0 Hz activity under the 4-mg dose than under placebo. Theta (3.0-7.5 Hz) activity was increased under the 4-mg dose at Pz and Oz (p < 0.05) and the relative power of alpha (7.5-13 Hz) activity was substantially reduced by 4 mg of atropine at Fz, Cz, Pz, and Oz (p < 0.05).
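To make the term "relative power" concrete, the following is a minimal sketch of how relative band power might be computed for one EEG channel, using the band limits given above. It is not the Cadwell Spectrum 32 algorithm, which the report does not describe. Two assumptions are made here: the denominator is the summed power of the three listed bands (the report does not say what total was used), and a plain discrete Fourier transform on a synthetic signal stands in for real data. All function names are hypothetical.

```python
import cmath
import math

# Band edges (Hz) as stated in the Results. Beta is omitted because the
# report does not give its limits.
BANDS = {"delta": (1.5, 3.0), "theta": (3.0, 7.5), "alpha": (7.5, 13.0)}


def power_spectrum(samples, fs):
    """Return (frequencies, power) for the non-negative DFT bins."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def relative_band_power(samples, fs, bands=BANDS):
    """Power in each band divided by the summed power of all listed bands.

    Assumption: the denominator covers only the listed bands.
    """
    freqs, power = power_spectrum(samples, fs)
    absolute = {
        name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
        for name, (lo, hi) in bands.items()
    }
    total = sum(absolute.values())
    return {name: value / total for name, value in absolute.items()}


# Synthetic 2-s epoch sampled at 64 Hz: a 10 Hz (alpha) rhythm plus a
# weaker 2 Hz (delta) component at half the amplitude.
fs = 64
signal = [math.sin(2 * math.pi * 10 * t / fs)
          + 0.5 * math.sin(2 * math.pi * 2 * t / fs)
          for t in range(2 * fs)]
rel = relative_band_power(signal, fs)
```

In this synthetic case about 0.8 of the relative power falls in alpha and about 0.2 in delta. A shift of the kind reported under 4 mg of atropine would appear as the alpha share falling while the delta and theta shares rise.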

Tracking scores on the easiest tracking task revealed the poorest performance across sessions under the 4-mg dose, and on the intermediate task there was a dose-by-session effect which appeared to result from increased errors under the 4-mg dose at noon. On the most difficult task, there was a similar effect in that the 4-mg dose impaired performance significantly at the noon session (p < 0.05).

Flight data showed an atropine-related reduction in the accuracy of heading control under the 4-mg dose for the visually-referenced and instrument-referenced straight-and-levels (SLs). In addition, there was an atropine-related drop in airspeed control during the instrument SL (p < 0.05). Heading control suffered under 4 mg of atropine in the straight climb and the straight descent, and in the climbing turn, vertical speed and pitch control were found to be reduced by the 4-mg dose. The instrument-landing system approach also showed effects due to atropine in that airspeed control suffered under both atropine doses (p < 0.05).

2.1.3. Discussion

The 4-mg dose of atropine sulfate was shown to be a powerful pharmacological stressor which manifested effects across physiological indexes and both simple and complex performance tasks (to include actual helicopter flight performance). The effects from all three types of tests (EEGs, psychomotor tracking, and actual flights) provided consistent evidence of impairment which included reduced cortical alertness, impaired psychomotor reactions, and degraded pilot performance. This research indicated that the choice of any of the dependent measures used here would have led to the conclusion that aviators were substantially impaired under 4 mg of atropine whereas the 2-mg dose did not typically exert marked effects. Thus, in this case, the research conclusions would have been essentially the same whether the authors had chosen a uni-dimensional or a multi-dimensional assessment strategy. However, as discussed next, such agreement between the strategies may not occur when a more subtle stressor is examined.

2.2. Effects of two antihistamines

In contrast to the atropine study, somewhat different results were found when the less severe effects of mild antihistaminic compounds were examined on many of the same basic dimensions (Verona and Stephens, 1991; Stephens et al., 1992). The purpose of this investigation was to evaluate the effects of therapeutic levels of terfenadine and diphenhydramine on army helicopter pilots. The reason this was of interest to the army is that aviators suffering from allergic reactions have in the past been barred from taking antihistamines because of the sedative effects which they typically produce. Terfenadine is a new antihistamine which purportedly produces a therapeutic effect without the drowsiness usually associated with these drugs. The question was whether aviator status would be adversely affected by terfenadine in comparison to a known sedative antihistamine (diphenhydramine).

2.2.1. Methods

To answer this question, 12 UH-60 pilots in good health, possessing normal vision, were tested on simulator flight performance, cognitive skills, and electroencephalographic activity. Simulator flight performance evaluations were conducted using the USAARL UH-60 flight simulator system which includes an operational crew station, computer-generated visual display, six-degree motion system, specially constructed environmental conditioning equipment, and a complete data acquisition system. Several aspects of simulator control were monitored during each session - including heading, airspeed, vertical speed, altitude, and other measures - in order to evaluate how well subjects maintained control according to instructions. Cognitive performance was evaluated with the Complex Cognitive Assessment Battery (CCAB) which was developed for the Army Research Institute (Samet et al., 1987) and the Walter Reed Performance Assessment Battery (WRPAB) developed at the Walter Reed Army Institute of Research (Thorne et al., 1985). Subtests included the tower puzzle, Stanford Sleepiness Scale, mood scale II, two-column addition, serial reaction-time, time wall, interval production, code substitution, logical reasoning, and manikin (Verona and Stephens, 1991). Electrophysiological data were collected with a Cadwell Spectrum 32. Both resting EEG and auditory and visual evoked-response data (P300s) were collected on 21 monopolar (mastoid referenced) leads.

Subjects participated for 2-week periods beginning with four practice sessions on the simulator and eight practice sessions on the cognitive tests during the first week (Monday through Thursday). On the fifth day (Friday), the actual drug testing began, with each subject ultimately receiving all three drug conditions (terfenadine, diphenhydramine, and placebo) on different drug-administration days. The drug days each were separated by 2 control days. To equate the dosages of the two antihistamines, the following procedure was used: for the diphenhydramine condition, subjects were given 100 mg diphenhydramine at 4 h prior to the test session and then 50 mg more immediately prior to the session. For the terfenadine condition, subjects were given 120 mg terfenadine 12 h prior to the session, a 60-mg dose 4 h before the session, and a placebo immediately preceding the session.

Each test session began with a simulator flight consisting of a series of precision upper-air maneuvers, an instrument approach, a series of emergency procedures, and two instrument takeoffs. Next, each subject completed the subtests from the CCAB and the WRPAB. The session ended with a resting eyes-open/eyes-closed EEG and a visual and auditory P300 task.

2.2.2. Results

Electrophysiological data revealed that diphenhydramine administration was associated with reduced alpha power at Fz, Cz, Pz, and Oz, and reduced beta at Fz, Pz, and Oz (p < 0.05). Diphenhydramine also suppressed visual P300 amplitude, but it had no effect on the latency of this component or on the amplitude or latency of the auditory P300. Terfenadine did not produce any changes in the EEG or evoked responses.

The cognitive performance tests, which included a variety of measures from the CCAB tower puzzle and several WRPAB tests, indicated that these tests were largely insensitive to the sedative effects of diphenhydramine. Of the 10 subtests analyzed, only one objective test and two subjective tests evidenced changes due to diphenhydramine. The drug was associated with a reduction in percent correct on mental rotation and increases in self-reports of sleepiness and fatigue (p < 0.05).

Analysis of the flight performance data collected in the UH-60 likewise indicated few effects attributable to diphenhydramine. In fact, there were no drug effects at all on three accelerations and climbs, a deceleration, an instrument approach, the left standard-rate turns, an S-turn, seven straight-and-levels, and a takeoff maneuver. Diphenhydramine did increase slip control errors during the first of three descents, but it also apparently improved airspeed control in the fourth of six right turns. Based on the total number of maneuvers and variables examined, this number of significant findings could have been due to chance alone. Thus, the overall picture is one of virtually no impact of diphenhydramine on flight performance.
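The remark that a handful of significant findings could be due to chance alone follows from simple arithmetic on the number of tests run. The sketch below illustrates that reasoning under two assumptions that are not taken from the report: the tests are treated as independent, and the number of maneuver-by-variable comparisons (which the report does not state) is set to a placeholder value. The function names are hypothetical.

```python
from math import comb


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of significant results if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha=0.05):
    """Probability that at least k of n_tests independent tests reach
    significance by chance (binomial model)."""
    p_fewer = sum(comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
                  for i in range(k))
    return 1.0 - p_fewer


# Placeholder only: the report does not give the number of comparisons.
n_tests = 40
expected = expected_false_positives(n_tests)
p_two_or_more = prob_at_least(2, n_tests)
```

Under these assumptions, two significant effects among several dozen comparisons is an unremarkable outcome, which is consistent with the authors' reading that diphenhydramine had virtually no impact on simulator flight performance.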

2.2.3. Discussion

In contrast to the multidimensional and pervasive effects of atropine, diphenhydramine did not produce changes of sufficient magnitude to manifest themselves across all three sets of measures. In fact, the most face-valid measures of the effects of diphenhydramine on overall aviator performance failed to clearly detect decrements in functioning despite the fact that the EEG results suggested CNS sedation. Both the simulations, which contained routine precision-instrument flight maneuvers, and the majority of PAB tests, which sampled many of the basic cognitive skills necessary for mission planning and flight execution, appeared to lack the necessary sensitivity for detecting the known sedative effects of diphenhydramine. These results are similar to those of Eddy, Dalrymple, and Schiflett (1992) who found that subjects involved in command, control, and communications simulations did not evidence performance decrements attributable to diphenhydramine despite the fact that a subject-matter expert rated the subjects as being impaired.

Had Verona and Stephens (1991) and Stephens et al. (1992) used only simulator testing or cognitive testing in their study, the authors might have concluded that diphenhydramine exerted no adverse effects on helicopter pilots. The inclusion of a direct physiological measure, which was sensitive to the drug effects of interest, permitted a more realistic assessment of the potential sedative effects of diphenhydramine in an aviation context.

2.3. Effects of sleep deprivation

Another study which serves to illustrate the relative sensitivity of EEG measures in comparison to some types of performance measures is one by Comperatore et al. (1993) in which the effects of sleep loss rather than drug interventions were examined. The purpose of this study was to examine the effects of 60 h of sleep deprivation on cortical arousal and cognitive skill on two tasks. The objective was to establish the magnitude and time course of effects on each of the selected dependent variables.

2.3.1. Method

Eight subjects were tested after having been screened for the presence of illness or medication use which may have confounded the results. Subjects participated in evaluations of spontaneous EEG activity with eyes-open, middle-latency evoked responses (MLR) to auditory clicks, and cognitive tests. MLRs are thought to reflect the activation of brain structures such as the thalamus, temporal cortex, and reticular formation (Comperatore et al., 1989). Cognitive tests offer an evaluation of basic cognitive skills thought to underlie many flight-related tasks. The EEG and MLR data were collected with a Cadwell Spectrum 32. In both cases, 21 monopolar channels of data were collected, but only a subset of these data were analyzed. The cognitive data were collected on a standard desktop computer using software from the Walter Reed Performance Assessment Battery (WRPAB). Two subtests were completed during each test session - the logical reasoning and mental rotation (manikin) tasks.

Subjects participated for a period of 5 days which began with training and familiarization on the first day, after which subjects were allowed to get a full night’s sleep. Following the training day, subjects were tested in sessions starting at 07:00 h, 13:00 h, 17:30 h and 22:00 h on Day 1 (baseline) after which they were not permitted to sleep. Testing continued at 03:00 h, 07:00 h, 13:00 h, 17:30 h and 22:00 h on Day 2, and at the same times on Day 3 (with the exception of the 22:00 h session). Subjects then were allowed to gain recovery sleep on the night of Day 3 prior to one recovery-day test session (at 13:00 h) on Day 4. Only the sessions which were consistently present during baseline and the 2 deprivation days were used in the statistical analysis. Thus, for the present purposes, the 03:00-h and the 22:00-h sessions were dropped.
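The session-selection rule can be checked with a short worked example. The sketch below simply encodes the session times listed for Days 1 to 3 and keeps those present on all three days. It illustrates the rule as stated and is not the authors' analysis code.

```python
# Session start times (24-h clock) as listed for each day of the protocol.
sessions = {
    "Day 1": {"07:00", "13:00", "17:30", "22:00"},
    "Day 2": {"03:00", "07:00", "13:00", "17:30", "22:00"},
    "Day 3": {"03:00", "07:00", "13:00", "17:30"},
}

# Sessions present on every one of the three days.
common = set.intersection(*sessions.values())
# Sessions that occur on at least one day but not on all of them.
dropped = set.union(*sessions.values()) - common

analyzed = sorted(common)
excluded = sorted(dropped)
```

The sessions retained are 07:00, 13:00 and 17:30, and the sessions excluded are 03:00 and 22:00, which matches the description above.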

During each one of the test sessions, subjects were evaluated twice in terms of spontaneous EEG activity (immediately prior to each WRPAB test), and MLR data were collected after the WRPAB and EEG tests were completed. Thus, there were two separate EEG tests separated by 5-6 min in each session.

2.3.2. Results

Analyses of the cognitive performance data revealed that mean reaction-time for correct responses on the mental rotation task was not increased after only one night of sleep deprivation, but was longer after two nights of sleep deprivation (p < 0.05). A tendency toward a similar effect was seen with the logical reasoning task, but the results were not significant. The percentage of correct responses was unaffected on either logical reasoning or mental rotation.

Analysis of the EEG data showed that significant changes in cortical activation were evident in the theta band as early as Day 2 in the second EEG. In the EEG collected prior to the logical reasoning task, there was a significant elevation in Fz theta between Days 1 and 3, and reductions in Pz alpha from Day 2 to Day 3. In the EEG collected prior to the mental rotation task, there was increased Fz delta from Day 1 to Day 3, and there were increases in both Cz and Pz theta from Day 1 to Day 2. Additionally, there were increases from Day 2 to Day 3 at Fz, Cz, and Pz (p < 0.05).

The morphology of the MLR was quite sensitive to the effects of even mild sleep deprivation as was revealed in the analyses of the area-under-the-curve data from 40-100 ms post-stimulus (results from RMS and 10 Hz FFT calculations were similar). At both Fz and Cz, the area of the MLR was increased significantly between Day 1 (undeprived) and Days 2 and 3 (p < 0.05). When the MLRs were examined visually, it could be seen that this effect resulted from an elevation in a slow positive potential across the MLR time history which suggested a slower recovery cycle for the P1 component (at 55-75 ms).
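As an illustration of the area-under-the-curve measure, the sketch below integrates an averaged waveform over the 40-100 ms window with the trapezoidal rule and also computes the RMS amplitude over the same window. It assumes a signed area and a 1-ms sampling interval, neither of which is specified in the report, and it uses synthetic flat traces rather than real MLR data. The function names are hypothetical.

```python
import math


def window(times_ms, values, start_ms=40.0, end_ms=100.0):
    """Keep only the samples whose latency lies inside [start_ms, end_ms]."""
    return [(t, v) for t, v in zip(times_ms, values) if start_ms <= t <= end_ms]


def area_under_curve(times_ms, values, start_ms=40.0, end_ms=100.0):
    """Signed trapezoidal integral of the waveform over the window (uV * ms)."""
    pts = window(times_ms, values, start_ms, end_ms)
    return sum((t1 - t0) * (v0 + v1) / 2.0
               for (t0, v0), (t1, v1) in zip(pts, pts[1:]))


def rms(times_ms, values, start_ms=40.0, end_ms=100.0):
    """Root-mean-square amplitude over the same window."""
    pts = window(times_ms, values, start_ms, end_ms)
    return math.sqrt(sum(v * v for _, v in pts) / len(pts))


# Synthetic averaged responses sampled every 1 ms from 0 to 120 ms:
# a flat 1 uV trace versus the same trace raised by 0.5 uV, standing in
# for an elevated slow positive potential.
times = [float(t) for t in range(0, 121)]
baseline = [1.0] * len(times)
shifted = [1.5] * len(times)

auc_baseline = area_under_curve(times, baseline)
auc_shifted = area_under_curve(times, shifted)
```

A sustained positive shift across the window raises the area in proportion to its amplitude, which is the pattern described above for Days 2 and 3 relative to Day 1.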

2.3.3. Discussion

In a manner similar to the data from the antihistamine study, the results of this investigation indicated that the EEG/MLR data were more sensitive to the effects of the independent variable than were the performance tests. The WRPAB did not detect a substantial change in subject capabilities until after 55 h of sleep deprivation whereas both the EEG and the MLR revealed changes earlier. Thus, a sole reliance upon performance data in this study would have resulted in an overly-optimistic estimate of the subjects’ abilities to cope with the effects of sleep deprivation. Here, the performance data suggested that subjects were not significantly worse off after one night’s sleep deprivation than they were on an undeprived day, whereas the electrophysiological data revealed results to the contrary.

3. Discussion

While this is only a quasi-experimental way of examining the relative sensitivity of different types of measures, the results of the three studies support the need for multi-dimensional testing strategies. Particularly in the case where small decrements occurred, it appeared that although performance measures may have been the most face-valid determinations of whether or not personnel would be affected by a stressor, they were not the most valid or reliable indicators of decrements which could have been operationally significant. Conversely, the physiological measures, often regarded as being of limited utility in applied research, appeared to be sensitive indicators of changes in functional status which could have real-world consequences. These results highlight the importance of examining the functional status of personnel from multiple vantage points - including the physiological.

While the physiological measures have the disadvantages of being more difficult and expensive to implement and interpret, they possess the advantage of providing a direct indication of brain and/or bodily functioning which should impact performance. In addition, physiological measures are (1) objective, (2) largely non-susceptible to ‘practice effects,’ and (3) immune to intentional or unintentional deception on the part of the subject.

Basic performance tests (i.e., cognitive tests or simulator tests) have the advantages of being inexpensive and relatively easy to implement, and the results appear to be easier to translate into operationally-relevant terms than results from physiological evaluations. Also, many performance tests can be administered, scored, and (at some level) interpreted by novice users - advantages which are not shared by most physiological measures. However, performance tests, particularly as they are used in scientific experiments, present problems in that: (1) it is difficult to select those which accurately measure individual subcomponents of the ultimate operational task; (2) once the right subtests are found, results are difficult to translate into terms which have operational relevance because the tests are usually administered in a way that minimizes sensitivity (i.e., testing one skill at a time rather than a combination); and (3) even with the best of tasks, rigorous training to stabilize performance (and reduce error variance) at the beginning of the experiment often results in the subject’s ability to complete the task with a certain amount of automaticity. Thus, the subject can rely on well-learned, sometimes almost reflexive responses rather than having to diligently work through each test item as he/she did when first exposed to the test - another problem that reduces test sensitivity.

All of this is not to say that performance tests have no value. In fact, they are essential to conduct comprehensive investigations of operationally-relevant independent variables. Performance measures taken in conjunction with physiological data render a more complete picture of the subject’s functional state than either measure can provide in isolation. In many cases, the results from the two types of measures will converge in such a way that a complete understanding of an intervention’s effects can be arrived at easily. In other cases, where such convergence does not occur, the differing results can pinpoint valuable information which may be relevant in the operational environment. For example, if performance is being degraded but there is no evidence of concurrent physiological decrements, perhaps environmental or task-related factors can be modified to correct the problem. If physiological changes are evident in the absence of performance decrements, this can serve as a warning that further intensification of the performance requirements may induce problems in an already impaired subject. In either case, the information obtained in both realms (physiological and performance) is valuable from research and operational standpoints.
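The four possible patterns of convergence and divergence described above can be laid out as a small lookup. This is only an illustrative Python sketch: the function name, the boolean inputs, and the wording of the returned readings are hypothetical, and in practice each input would itself rest on a statistical judgment rather than a simple yes/no flag.

```python
def interpret_findings(performance_degraded, physiology_changed):
    """Return a short reading of the combined performance/physiological result.

    Both arguments are booleans supplied by the analyst (an assumption of
    this sketch); the readings paraphrase the reasoning in the text.
    """
    if performance_degraded and physiology_changed:
        return "convergent: stressor effect evident in both domains"
    if performance_degraded:
        # Performance decrement without physiological decrement.
        return "performance only: examine environmental or task-related factors"
    if physiology_changed:
        # Physiological change without a performance decrement.
        return "physiology only: warning that heavier task demands may induce problems"
    return "convergent: no effect detected in either domain"
```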

4. Conclusions

Conducting scientific research under controlled conditions for the purpose of estimating the potential impact of operational stressors is essential. However, the very factors that make such investigations analyzable and interpretable may reduce the investigator’s capability to generalize findings to actual real-world conditions. One of these factors is the reliance upon uni-dimensional, standardized, well-practiced performance tests to estimate whether or not a stressor will ultimately impact operational performance. Although it is true that impairments in basic skills (evaluated by these tests) can be expected to degrade the operational performance composed of these skills, it is unfortunately the case that testing such skills in the laboratory may not translate easily to the real world. In the real world, people must perform complex combinations of lower-level skills simultaneously and continuously in often unpredictable situations, whereas in the laboratory, people can focus on exercising one skill at a time for relatively brief periods in highly controlled circumstances. Also, the laboratory typically presents a very predictable task environment which may or may not exist in operational scenarios.

Thus, laboratory performance tests are not maximally sensitive to stress-induced decrements which in turn may create safety or efficiency problems in an operational environment. In fact, a sole reliance on performance indexes may lead to the conclusion that a given stressor has no adverse effect when in fact the effect is present - but simply not powerful enough to be detected using a relatively insensitive tool.
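The point about an insensitive tool can be made concrete with a rough statistical-power calculation. The Python sketch below uses a normal (z-test) approximation for a within-subject comparison; the effect sizes and sample size in the usage note are hypothetical numbers chosen only for illustration and are not taken from the three studies.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_power(effect_size_d, n_subjects, z_crit=1.96):
    """Approximate power of a two-sided z-test on a within-subject difference.

    effect_size_d is the mean difference divided by the standard deviation
    of the differences; z_crit=1.96 corresponds to alpha = 0.05. This is a
    large-sample approximation and is optimistic for very small samples.
    """
    noncentrality = effect_size_d * math.sqrt(n_subjects)
    return (normal_cdf(noncentrality - z_crit)
            + normal_cdf(-noncentrality - z_crit))
```

Under these assumptions, with 12 subjects a measure on which the stressor produces a standardized effect of 0.3 would detect it less than a quarter of the time, whereas a measure on which the same stressor produces a standardized effect of 0.9 would detect it more than four times out of five. A null result on the first measure would therefore say little about whether the effect is present.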

The implementation of physiological measures in combination with performance measures will help to alleviate this sensitivity problem. Physiological measures are able to provide more direct assessments of basic functional status than is possible with performance measures. Furthermore, physiological evaluations are not susceptible to practice effects or subjective interpretations.

It is recommended that both types of measures be combined into a multi-dimensional assessment strategy whenever possible to maximize both the sensitivity of the assessments and the interpretability of findings.

References

AGARD (1989). Human performance assessment methods. AGARDograph No. 308. Neuilly sur Seine: Advisory Group for Aerospace Research and Development.

Caldwell, J.A., Stephens, R.L., Carter, D.J., & Jones, H.D. (1992). Effects of 2 mg and 4 mg atropine sulfate on the performance of U.S. Army helicopter pilots. Aviation, Space, and Environmental Medicine, 63(10), 857-864.

Caldwell, J.A., Wilson, G.F., Cetinguc, M., Gaillard, A.W.K., Gundel, A., Lagarde, D., Makeig, S., Myhre, G., & Wright, N.A. (1994). Psychophysiological assessment methods. AGARDograph No. 324. Neuilly sur Seine, France: Advisory Group for Aerospace Research and Development.

Comperatore, C.A., Caldwell, J.A., Stephens, R.L., Chiaramonte, J.A., Pearson, J.Y., Trast, S.T., & Mattingly, A.D. (1993). The use of electrophysiological and cognitive variables in the assessment of degradation during periods of sustained wakefulness. USAARL Technical Report No. 93-5. Fort Rucker: U.S. Army Aeromedical Research Laboratory.

Comperatore, C.A., Kraus, N., Stark, C., & McGee, T. (1989). Contributions of subcortical nuclei to the generation of middle latency components of the auditory evoked response. Association for Research in Otolaryngology, abstracts of the 12th midwinter meeting.

Lees, M.A., & Ellingstad, V.S. (1990). Assessing human performance during the operation of dynamic vehicles: A literature review and examination of methods. USAARL Contractor Report No. 90-2. Fort Rucker: U.S. Army Aeromedical Research Laboratory.

Mitchell, A., Lewis, A., Jones, H., Higdon, A., & Baer, D. (1988). Aircraft in-flight monitoring system (AIMS). USAARL Letter Report No. Lr 88-12-S-2. Fort Rucker, AL: U.S. Army Aeromedical Research Laboratory.

Samet, M.G., Geiselman, R.E., Zajaczkowski, F., & Marshall-Mies, J. (1987). Complex cognitive assessment battery (CCAB). Los Angeles: Analytical Assessments Center.

Stephens, R.L., Caldwell, J.A., Comperatore, C.A., Pearson, J.Y., & Delrie, D.M. (1992). Effects of terfenadine and diphenhydramine on brain activity and performance in a UH-60 flight simulator. USAARL Technical Report No. 92-33. Fort Rucker: U.S. Army Aeromedical Research Laboratory.

Thorne, D.R., Genser, S.G., Sing, H.C., & Hegge, F.W. (1985). The Walter Reed performance assessment battery. Neurobehavioral Toxicology and Teratology, 7, 415-418.

Verona, R.W., & Stephens, R.L. (1991). The effects of terfenadine and diphenhydramine on cognitive performance as measured with performance assessment batteries. USAARL Contractor Report No. 91-3. Fort Rucker: U.S. Army Aeromedical Research Laboratory.