

Measuring Situation Awareness in Virtual Environment-Based Training

David B. Kaber
North Carolina State University

Jennifer M. Riley and Mica R. Endsley
SA Technologies, Inc., Marietta, Georgia

Mohamed Sheik-Nainar
Synaptics, Inc., Santa Clara, California

Tao Zhang
Purdue University

Donald R. Lampton
U.S. Army Research Institute for the Behavioral and Social Sciences, Orlando, Florida

We evaluated the efficacy of a computer-based situation awareness (SA) measurement system for training dismounted infantry SA in an urban terrain virtual reality (VR) simulation. Based on past research, we hypothesized that the SA measures would be sensitive to individual (squad leader) differences, and that the frequency of specific probes would reveal differences in critical SA requirements among scenarios. Three infantry squads performed multiple trials across two different scenarios. A confederate platoon leader posed probes to squad leaders during trials and experts made ratings afterward. Results revealed squad leaders had similar responses to probes, despite differences in combat experience. Analysis of probe frequency revealed different high priority SA elements and decisions for each scenario. The SA behavior and communication ratings revealed differences among squads, which trended with experience. Measures of SA were also consistent across the test scenarios as a result of similar mission types and task difficulties. We discuss the implications of our findings for future research and theory within this area.

Keywords: situation awareness measurement, military operations, virtual reality, simulation and training

Situation awareness (SA) refers to the level of awareness and dynamic understanding an individual has of a situation. Endsley (1995b) describes SA as a state of knowledge resulting from “the processing of elements in the environment within a volume of time and space (Level 1), the comprehension of their meaning (Level 2), and the projection of their status in the near future (Level 3)” (p. 36). Soldier SA includes knowledge of events in the battlefield, the meaning of specific tactical actions to a squad’s mission, and projections of future combat actions (Endsley et al., 2000). Achieving SA depends on perceptual, spatial, and cognitive abilities (Endsley & Bolstad, 1994; Gugerty & Tirre, 1997; O’Hare, 1997); task and environmental factors (Hockey, 1986; Sharit & Salvendy, 1982); doctrine or accepted tactics (Taylor, Endsley, & Henderson, 1996); and available sources of SA, including team members and information technologies (Kaber & Endsley, 1998; Endsley et al., 2000).

David B. Kaber, Department of Industrial and Systems Engineering, North Carolina State University; Jennifer M. Riley and Mica R. Endsley, SA Technologies Inc., Marietta, Georgia; Mohamed Sheik-Nainar, Synaptics Inc., Santa Clara, California; Tao Zhang, Libraries, Purdue University; Donald R. Lampton, U.S. Army Research Institute for the Behavioral and Social Sciences, Orlando, Florida.

The views and opinions expressed in this article are solely those of the authors and do not reflect an endorsement by the U.S. Government or any of its agencies. Work was supported with grants from Army/OSD Small Business Innovative Research (DASW01-04-C-004) as well as from The Office of Naval Research’s Virtual Technologies & Environments Program. Don Lampton and Bruce Knerr served as project technical monitors. We acknowledge John Hyatt, Paul Blankenbeckler, and Justin Reynolds for their contributions to this work.

Correspondence concerning this article should be addressed to David B. Kaber, Department of Industrial and Systems Engineering, North Carolina State University, 111 Lampe Dr., 400 Daniels Hall, Raleigh, NC 27695-7906. E-mail: [email protected]

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Military Psychology, 2013, Vol. 25, No. 4, 330–344. © 2013 American Psychological Association. 0899-5605/13/$12.00 DOI: 10.1037/h0095998

Research shows that relevant experience plays a significant role in a soldier’s ability to quickly form and maintain SA in time-critical operations (Lampton, Riley, Kaber, & Endsley, 2006; Pleban, Eakin, Salter, & Matthews, 2001). Experience related to achieving and maintaining SA can be gained through effective training programs with appropriate feedback on trainee performance. Custom scenarios can be designed to facilitate soldier practice in perception, comprehension, and projection of dynamic situations. Feedback on SA through AARs (After-Action Reviews) is expected to promote soldier understanding of why different types of battlefield errors may occur, including missing information, failing to relate states of the environment (terrain) to action, and failing to predict the need for specific tactics. Such feedback can also provide important indicators of soldier readiness for actual combat.

The U.S. Army currently uses virtual reality (VR)-based simulation systems to train infantry units in dismounted operations, including combat missions, and to facilitate advanced mission rehearsal (Hamilton & Holmquist, 2005). Training scenarios typically involve squads or rifle teams in military operations in urban terrain (MOUT; e.g., roving [virtual] patrols, crowd control, attacks on buildings, suppressing enemy insurgents, dealing with hostage situations). The majority of these scenarios employ human simulation operators to role-play opposition forces. The high-level tasks of all virtual missions are to fight and overcome hostile threats. Past research has shown that training in VR simulations can be an effective way to improve soldier decision-making and situation-assessment abilities (Lampton et al., 2006; Pleban et al., 2001) with reduced cost and resources as compared to traditional field training. However, measuring soldier SA in virtual environments remains a challenge.

Although there are various approaches to measuring soldier SA (Kaber, Riley, Lampton, & Endsley, 2005), they are not without their limitations. Matthews, Pleban, Endsley, and Strater (2000) noted that it may be difficult to freeze a large-scale simulation or field exercise as required by the Situation Awareness Global Assessment Technique (SAGAT; Endsley, 1995a) without compromising trainee performance. Similarly, the Situation-Present Assessment Method (SPAM; Durso et al., 1998) employs freezes for SA queries, and this procedure may still affect operator task performance. Real-time SA probe measures (Jones & Endsley, 2000) question trainees on SA during ongoing tasks and provide some advantages over SAGAT and SPAM because they can be developed to be part of normal training task communications. However, it is often difficult to achieve the necessary sample size on SA queries for reliable assessments because of simulation events and operator workload levels. Other techniques involving expert observer rating of specific soldier behaviors (e.g., the Situation Awareness Behaviorally Anchored Rating Scale [SABARS]; Strater, Endsley, Pleban, & Matthews, 2001) and communications for achieving SA (e.g., Wright & Kaber, 2003) allow for assessment of components of SA and can be conducted in real time during soldier training without intrusion on performance. The Coordinated Awareness of Situation by Teams (CAST; Gorman, Cooke, & Winner, 2006) focuses on coordinated processes involved in a team adapting to nonroutine situational constraints like a “roadblock.” However, introducing artificial roadblocks or unlikely events into military training simulations may interfere with primary task performance and consequently affect the validity of the training.

This research begins to address shortcomings of past studies on the assessment of soldier SA by examining the efficacy of an SA measurement system designed to support enhanced feedback for soldiers during AARs following Army VR-based training exercises. The measurement system takes advantage of the customizability and flexibility of virtual environments while avoiding some of the common limitations of SA measures discussed previously. The SA measurement system integrated real-time probes as a direct, objective measure of squad leader SA, complemented with expert observer ratings of SA behaviors and squad communications as subjective measures. Each of the selected measures (probes and expert ratings) was expected to provide different insights into squad leader SA. The SA probe measure was intended to evaluate battlefield awareness in the mind of the squad leaders. The SA behavior ratings were to measure squad leader ability to act based on SA. The communication ratings were to measure squad leader ability to distribute SA to squad members in order to facilitate shared understanding of combat situations and knowledge of mission requirements.

In this article, we examined the sensitivity of the specific measures to individual differences in soldier ability to achieve SA and the reliability of the measures for assessing soldier SA across similar simulated combat conditions in VR. We also assessed soldier perceptions of combat task workload in order to determine if there were any differences among training scenarios and to identify any relation with SA. Before presenting the results of our study, we describe the development of the SA measurement tool. Subsequently, we present the methodology and results and conclude with a discussion that outlines the theoretical, research, and practical implications of our work for future research in this area.

Development of SA Measurement Tools

We developed three software applications to implement probing and expert observer ratings. The automated probe delivery application (APDA) was used to present potential probes to platoon leaders based on training scenario events (e.g., squad falls under sniper attack), squad communications (e.g., leader situation report), and/or the specific locations of squad members in the virtual training environment (VTE) at any given time. The electronic SA behavior measurement (ESABM) application included 27 target squad leader behaviors considered to represent consistency with acquisition and dissemination of SA, each with a 10-point scale to be rated by expert analysts. The situation awareness measure of team communication and coordination (SAMTC) application supported expert analysts’ ratings of soldier team communications in terms of SA. Both the ESABM and SAMTC were presented on tablet PCs connected wirelessly to a database on the APDA workstation.

Real-Time Probe Measure

SA probes. Following Jones and Endsley (2000), we used cognitive task analysis to develop a database of SA requirements for squad leaders in MOUT missions. Structured interviews were conducted with reservist or retired senior military officers. Goal-Directed Task Analysis (GDTA; Endsley, 1993) was used to identify soldier goal states for MOUT, the operational tasks for achieving those goals, the specific questions that arise as part of decision making in task performance, and the SA requirements for answering those questions. SA requirements resulting from the analysis were categorized according to the levels of SA defined by Endsley (1995b). The requirements were then translated into natural language probes to be posed to squad leaders during training trials by a confederate platoon leader (also directing squad actions) over a communications network.

We integrated probes into communications between soldier trainees and experienced military personnel to guide training trials. This method differs from verbal protocol analysis (Ericsson & Simon, 1998), as subjects are not merely asked to verbalize what they are doing and why, but they are asked about specific SA requirements for a task. The probe measurement technique also requires experimenters to determine in real time whether subject responses are correct relative to the ground truth of a simulation. This scoring information is stored in a database through the APDA to support AARs as part of a training exercise.

SA probe tool. The APDA interface presented an overview map of the VTE on which virtual markers were overlaid to identify locations for administration of SA probes to squad leaders. Moving icons were used to represent squad member avatar positions in the VTE, as an exercise played out. When the icon for a squad leader avatar contacted a probe marker, the APDA presented a list of appropriate probes for a platoon leader to choose from. The application interface (see Figure 1) included dialogs for selecting probes and responses from trainees, as well as recording “ground truth” information on the simulation. At the close of a training trial, the APDA automatically calculated the percent of correct soldier responses to probes at each level of SA.
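The end-of-trial scoring step described above can be illustrated with a minimal sketch. It is not the APDA implementation: the function name, the record layout (an SA level paired with a correct/incorrect flag), and the sample log are all hypothetical, chosen only to show a percent-correct calculation per SA level.

```python
from collections import defaultdict


def percent_correct_by_level(responses):
    """Percent of correct probe responses at each SA level.

    `responses` is an iterable of (sa_level, is_correct) pairs, where
    sa_level is 1 (perception), 2 (comprehension), or 3 (projection).
    This record layout is an assumption made for illustration.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for level, is_correct in responses:
        totals[level] += 1
        if is_correct:
            correct[level] += 1
    return {level: 100.0 * correct[level] / totals[level] for level in sorted(totals)}


# Hypothetical trial log; these values are invented, not study data.
trial_log = [(1, True), (1, True), (1, False), (2, True), (2, True), (3, False), (3, True)]
scores = percent_correct_by_level(trial_log)
print(scores)
```

Levels with no delivered probes are simply absent from the result, which avoids a division by zero; how the actual tool reported such levels is not stated in the text.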


Depending upon the training scenario, there were between 60 and 70 probes for platoon leaders to choose from. Jones and Endsley (2000) recommended that high numbers of probes be used in order to achieve sufficient reliability and sensitivity in assessing SA. They probed air defense system operators, on average, once every 2 minutes and found no evidence of interference with performance. They also suggested that probes could be administered more often without posing problems. In our use of the APDA, we did not require confederate platoon leaders to deliver specific probes a certain number of times during trials. They were, however, asked to attempt to deliver at least 15 probes per trial and to distribute probes across levels of SA. This procedure gave the platoon leaders freedom to choose probes based on training objectives and their knowledge of what was important for the simulated combat situation and why. This type of flexibility in real-time probe delivery was not possible in previous studies. After each trial, we analyzed the frequency with which platoon leaders delivered probes from the database. We considered the applicability of probes throughout the course of a training trial as another measure for revealing SA elements the experts considered to be of high priority for different MOUT missions. This form of real-time probe data analysis represents a new approach to verifying SA requirements for specific missions based on expert knowledge.
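The probe frequency analysis described above amounts to counting how often each probe was delivered and pooling the counts over the trials of a scenario. The sketch below assumes each trial is available as a list of delivered probe identifiers; the identifiers (`P01`, `P07`, `P12`) and the sample data are hypothetical.

```python
from collections import Counter


def probe_frequency(trials):
    """Total delivery count per probe across the trials of one scenario.

    `trials` is a list in which each element lists the probe IDs
    delivered during one trial (an assumed layout for illustration).
    """
    counts = Counter()
    for delivered in trials:
        counts.update(delivered)
    return counts


# Hypothetical delivery logs for three trials of one scenario.
sp_trials = [["P01", "P07", "P07"], ["P07", "P12"], ["P01", "P07"]]
sp_counts = probe_frequency(sp_trials)

# Probes that the platoon leader returned to most often.
top_probes = sp_counts.most_common(2)
print(top_probes)
```

Repeating the count for each scenario and comparing the most frequent probes is one simple way to see which SA elements the platoon leaders treated as high priority in each mission.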

Behavior Measures

The ESABM application was developed based on the situation awareness behaviorally anchored rating scale (SABARS) created by Strater et al. (2001). The SABARS requires an expert to evaluate soldier behaviors for consistency with SA subsequent to observing an exercise (Matthews et al., 2000). The original SABARS consisted of 28 questions (or behaviors) with relevance to SA in combat missions. The behaviors were specific to assessment of platoon leader SA. Instead of creating a post-trial measure for squad leader SA assessment, we worked with a military expert to identify squad leader behaviors having relevance to SA in MOUT missions that could be rated during a training exercise. This resulted in 27 target squad leader behaviors for SA analyst observation and rating in real time. The behaviors are shown in Figure 2, as displayed in the ESABM application interface. The real-time ratings represent an enhancement over the original SABARS and serve to eliminate analyst recall bias in behavior ratings.

Figure 1. APDA interface.

To apply the ESABM, an SA analyst watched a squad leader’s motion behavior during a trial and monitored leader communications with other squad members. Analysts were able to record multiple ratings on each behavior during a single trial and the ESABM automatically averaged the ratings. At the close of a trial, the application calculated an overall SA score by summing the rating data across the target behaviors.
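The two-step scoring described above (average the repeated ratings for each behavior, then sum across behaviors) can be sketched as follows. The behavior names and ratings are hypothetical, and skipping behaviors that received no rating in a trial is an assumption; the text does not say how the ESABM treated unrated behaviors.

```python
from statistics import mean


def esabm_overall(ratings_by_behavior):
    """Average repeated ratings per behavior, then sum across behaviors.

    `ratings_by_behavior` maps a behavior label to the list of ratings
    recorded for it during one trial. Behaviors with an empty list are
    skipped (an assumption made for this sketch).
    """
    per_behavior = {b: mean(r) for b, r in ratings_by_behavior.items() if r}
    return per_behavior, sum(per_behavior.values())


# Hypothetical ratings on the 10-point scale for three behaviors.
ratings = {"behavior_a": [8, 6], "behavior_b": [9], "behavior_c": []}
per_behavior, overall = esabm_overall(ratings)
print(per_behavior, overall)
```

Dividing the summed score by the number of rated behaviors would express it on the 10-point rating scale.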

Communication Measures

The SAMTC facilitated expert observer ratings of specific soldier team communications in terms of SA. We adapted the approach used by Wright and Kaber (2003) and Brannick, Prince, Prince, and Salas (1995) (for team communication and coordination analyses) to the analysis of squad leader achievement of SA through requests for, and dissemination of, information within a squad. The SAMTC represents a novel approach to soldier SA assessment based on real-time communication ratings. The measurement technique required experts to monitor all verbalizations via the squad communications network during an exercise. Raters classified target verbalizations in terms of the elements of MOUT (e.g., enemies, friendlies, terrain, civilians). Soldier statement quality ratings (“good” or “bad”) were also made in real time during exercises. An overall rating of the quality of squad leader SA (on a scale from 1 [no skill] to 5 [complete skill]) was made at the end of trials, based on individual communication counts and ratings.

Figure 2. ESABM interface.
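The communication counts that fed the SAMTC overall rating can be pictured as a tally of verbalizations by MOUT element and quality. The sketch below is illustrative only: the element list contains just the four examples named in the text (the full category set is not given), the event layout is assumed, and the final 1-to-5 rating is left to the expert rather than computed.

```python
from collections import Counter

# Only the examples named in the text; the full SAMTC category set is not given.
ELEMENTS = ("enemies", "friendlies", "terrain", "civilians")
QUALITIES = ("good", "bad")


def tally_communications(events):
    """Count verbalizations by (MOUT element, quality rating).

    `events` is an iterable of (element, quality) pairs as an observer
    might log them in real time (an assumed layout for illustration).
    """
    tally = Counter()
    for element, quality in events:
        if element not in ELEMENTS or quality not in QUALITIES:
            raise ValueError(f"unexpected entry: {(element, quality)!r}")
        tally[(element, quality)] += 1
    return tally


# Hypothetical observer log for one trial.
log = [("enemies", "good"), ("terrain", "bad"), ("enemies", "good"), ("civilians", "good")]
tally = tally_communications(log)
good_total = sum(n for (_, q), n in tally.items() if q == "good")
print(tally, good_total)
```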

Hypotheses

In general, it was hypothesized that the SA measures, behavior, and communication ratings would reveal individual differences in SA related to soldier combat experience and intact versus unfamiliar squad configurations in training exercises (Hypothesis 1). As a result of training scenario feature manipulations, different conditions were expected to yield comparable levels of SA (Hypothesis 2); however, the measurement system (in particular, the analysis of SA probe frequency) was expected to reveal different aspects of SA most critical to each mission (Hypothesis 3). (This analysis also allowed us to relate general mission features back to SA requirements.) The SA measures were also expected to be related to squad perceptions of task workload, if they fluctuated among scenarios or trials. In particular, SA was expected to decrease as combat workload increased (Hypothesis 4).

Method

Participants

Three Army infantry squads, with nine members each, were recruited for participation in a “training research exercise” to evaluate the SA measurement tools. Two of the squads were intact; the soldiers comprised actual combat units. The third squad was composed of soldiers who had not previously worked together. All soldiers ranged in age from 19 to 32 years. The majority (23 of 27) indicated prior participation in live MOUT site training (e.g., at Fort Polk). Only one soldier had previously participated in VE-based training through the Army. Twelve of the soldiers reported actual combat experience in the wars in Iraq and/or Afghanistan.

Measures

SA. Dependent variables included squad leader answers to the real-time probes in terms of the percentage of correct responses, and expert ratings of SA behavior and team communications. With respect to the post-hoc probing frequency analysis, we counted the number of times probes were delivered during trials and then totaled all counts across trials for each of the two scenarios. Two expert SA analysts applied the rating techniques in each trial with one rater evaluating soldier behaviors in all SP (Security Patrol) scenarios and soldier communications in all AC (Arms Cache Discovery) scenarios, and the other doing the exact opposite. Overall scores for consistency of soldier actions with SA, and the quality of team communications, were recorded for each trial for each squad leader.

Task workload. Each squad member (including leaders) was required to rate the level of task workload for each trial (scenario type) using a visual analog scale with anchors of low and high. A univariate scale was chosen instead of a multivariate measure (e.g., NASA-TLX questionnaire) because the task workload was intended to characterize the overall demand on soldier information processing and physical actions (cf. Hendy, Hamilton, & Landry, 1993).

Feedback ratings. Soldiers were asked to rate different types of feedback they received during AARs, following the training trials. They rated the utility of platoon leader feedback on mission knowledge, their behaviors, and communications as a basis for improving combat skills. These ratings were also made on a scale with anchors of low and high. Individual ratings were averaged across a squad for a single score on each type of feedback.

Apparatus, Design, and Procedures

Nine immersive VR booths were used for the study, each with a single 12.5′ × 7′ rear-projection screen displaying virtual battlefield imagery to soldiers. The booths also included integrated mock-ups of infantry weapons (e.g., M-16/M4 rifles), which were instrumented with interface controls (a thumbstick and two pushbuttons) to facilitate environment navigation and selection of equipment for avatar use (e.g., virtual night-vision goggles, compass, grenade launcher, flares, flash bangs, etc.). Soldiers were required to don a helmet and rucksack when entering a booth. Motion tracking system receivers were attached to the weapon system and helmet. The receivers provided data on soldier posture position and orientation of viewpoint, which were used to drive basic motions of the avatars in the VTE (e.g., standing, kneeling, and prone). Each VR booth also included a radio headset with hand control for soldier use and a “listen-only” headset for the expert observers. The radios allowed members of the training squad to communicate with each other and the squad leader to contact a platoon leader. In one of the two training scenarios, soldiers also used interfaces on the weapon mock-up to mark rooms of buildings (with virtual paint) in order to prevent duplication of squad search efforts. In the other scenario, they used the weapon interfaces to deploy smoke to conceal squad movement across open streets.

The type of training scenario was manipulated as a within-subjects variable (i.e., all squad leaders were exposed to both an SP and an AC scenario). Each squad completed three trials under each scenario with changes in the time of day of the simulation (daylight vs. night), the weather (clear vs. fog/rain), and the number of squad casualties occurring during trials (0, 1, 2). Combinations of these scenario features were selected to balance the level of difficulty from one trial to the next. Squad leaders were exposed to the scenarios in the same order based on a predefined training approach. The sequence of trials was also determined based on the goals of the Army training exercise.

The mission of the roving SP was to secure a food convoy route in a small town. En route to a warehouse, the squad was directed to watch for and impede any insurgent activity. They ultimately encountered an armed civilian, who was to be disarmed. While detaining the civilian, a sniper engaged the squad from a water tower. An RPG (Rocket-propelled Grenade) team subsequently destroyed a Red Cross vehicle on a street near the squad. Civilian and friendly force casualties occurred as a result of these activities. The squad was required to deal with the casualties and continue on to the warehouse for the delivery. In the AC discovery scenario, a squad was assigned several (virtual) buildings in the small town for clearing and search for arms and contraband. During the search, a civilian approached the squad claiming knowledge of a cache site and asked that the squad follow him. In so doing, the squad encountered a car bomb and small arms fire from a three-man insurgent element in a nearby building. In some trials, the squad took casualties during the firefight and was required to defuse the situation and secure the AC. The objective of both scenarios was to train the squads on tactics in dealing with small unit insurgent threats and noncombatant issues typical to urban environments, as well as the importance of achieving accurate battlefield awareness.

The experimental procedure for each squad spanned 2 days. The first day included: (1) an introduction to the training facility and identification of the goals for the event (about 45 min); (2) familiarization with the VR equipment and simulation (about 1 hour); (3) trainee assignment to virtual fire-teams (about 15 min); and (4) performance in three training trials (about 1 hour and 30 min each). Each training trial involved a mission briefing by a confederate platoon leader and an operations order by the squad leader (about 25 min). During a training scenario, real-time probes were posed to squad leaders by platoon leaders via the communication network. Expert observers also made ratings of squad team SA behaviors and communications. Once a squad completed a virtual mission, or was rendered “combat ineffective,” the trial was terminated. The soldiers then completed the combat workload ratings and were provided with an AAR (about 30 min). The platoon leader guided a discussion of critical events in the training scenario; soldier behaviors and communications; and tactical decision making. Recommendations were provided for improving SA and performance in subsequent trials (this approach was consistent among squads and trials). At the close of an AAR, soldiers completed the feedback ratings. The second day of the experiment included: (1) a reorientation to the goals and equipment (about 45 min); (2) three more training trials (about 1 hour and 30 min each); and (3) an exercise debriefing (about 45 min with ARI researchers). Soldiers were asked to provide comments on the feedback they received through the AARs.

Results

With each squad completing six training trials, there were a total of 18 observations per response measure. Data were limited in part due to the availability of the Army training facility and actual infantry squads. Related to this, variations in response measures from one training trial to the next may have been correlated due to the use of the AARs. Consequently, parametric statistical tests (assuming independence of observations) were not used for data analyses. A nonparametric counterpart to the analysis of variance (ANOVA), the Kruskal-Wallis test, was applied.
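As a rough illustration of the test named above, the following Python sketch computes the Kruskal-Wallis H statistic from pooled ranks for three groups of six observations, mirroring three squad leaders with six trials each. The scores are invented for illustration and are not the study data; the function names are hypothetical, and the closed-form p-value assumes the usual chi-square approximation with two degrees of freedom.

```python
from collections import Counter
from math import exp


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, df) for a list of samples, with the standard tie correction.

    Assumes at least two distinct values overall (otherwise H is undefined).
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n)), len(groups) - 1


# Hypothetical scores for three leaders over six trials each (NOT study data).
scores = [
    [8.0, 8.5, 9.0, 7.5, 8.0, 8.5],
    [4.5, 5.0, 5.5, 4.0, 5.0, 5.5],
    [7.0, 7.5, 6.5, 7.0, 7.5, 7.0],
]
h, df = kruskal_wallis(scores)
p = exp(-h / 2)  # chi-square upper tail; this closed form holds for df = 2 only
print(f"H = {h:.3f}, df = {df}, p = {p:.4f}")
```

Because the statistic depends only on ranks, it does not require normally distributed scores, which suits small samples of ratings such as these.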

Analyses on Hypotheses 1 and 2

With respect to the real-time probe measure, Kruskal-Wallis tests revealed no significant main effect of squad leader (subject; χ²(2) = 1.632, p = .4422) or scenario type (χ²(1) = 0.3516, p = .5532) on percent correct responses to probes. In general, squad leader overall SA was high, ranging from approximately 83% to 96% accurate. Variations in SA across trials were substantial, and this may have led to a lack of sensitivity of the statistical test for revealing average differences, which was counter to Hypothesis 1. In support of Hypothesis 2, the manipulation of the various scenario features appeared to yield comparable levels of soldier SA. (It is important to note that the probe measure did reveal internal validity in that significant positive correlations occurred between total SA and measures on each level of SA: perception, comprehension, and projection.)

With respect to the expert observer ratings of soldier behaviors for consistency with SA, Kruskal-Wallis tests revealed a significant effect of squad leader, χ²(2) = 13.1147, p = .0014, on the overall ESABM scores. This was in agreement with Hypothesis 1. As posited in Hypothesis 2, the pattern of SA-related behaviors was consistent across scenario types, χ²(1) = 1.4629, p = .2262. Pairwise comparisons of the ESABM scores for the various squad leaders revealed significant differences (p < .05) among all leaders, which appeared to correspond with actual experience. Squad Leader #1 had the most experience and achieved a mean score of 8.25/10. Squad Leader #2 was new to the post and achieved the worst score (4.875/10) in leading an unfamiliar squad. Squad Leader #3 achieved a mean score of 7.125/10. Although Squad Leader #3 appeared to have more relevant MOUT experience than Squad Leader #1, he had five fewer years of service.

In regard to the team communications measure of SA, Kruskal-Wallis tests also revealed a significant effect of the squad leader, χ²(2) = 10.8348, p = .0044, on the overall communication quality ratings by experts. This was also in agreement with Hypothesis 1. In support of Hypothesis 2, there was no significant effect of the MOUT scenario type, χ²(1) = 0.6689, p = .4135. Pairwise comparisons of the mean SAMTC scores revealed significant differences (p < .05) among all leaders, which also appeared to correspond with military experience. Leader #2 was rated worse than Leaders #1 and #3, and Leader #1 was rated superior to #3.

A correlation analysis revealed that the expert behavior ratings were significantly correlated with communication ratings (r = .90, p < .0001). This suggested that communication is a critical part of team behavior, and that there was some consistency across measures within the SA measurement system. The two ratings were, however, not correlated with squad leader SA assessed using the real-time probes.
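The p-values reported for the Kruskal-Wallis tests in this subsection can be checked against the chi-square approximation commonly used for that statistic. The sketch below is a consistency check only. It assumes the reported statistics are chi-square values with 2 degrees of freedom for the squad leader effect (three leaders) and 1 degree of freedom for the scenario effect (two scenarios). The helper name `chi2_sf` is hypothetical and covers just those two cases.

```python
from math import erfc, exp, sqrt


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if df == 1:
        # A chi-square(1) variable is a squared standard normal.
        return erfc(sqrt(stat / 2.0))
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return exp(-stat / 2.0)
    raise ValueError("closed form implemented only for df = 1 or 2")


# (measure, effect, statistic, df, reported p) as given in the text.
reported = [
    ("probe accuracy", "squad leader", 1.632, 2, 0.4422),
    ("probe accuracy", "scenario", 0.3516, 1, 0.5532),
    ("ESABM", "squad leader", 13.1147, 2, 0.0014),
    ("ESABM", "scenario", 1.4629, 1, 0.2262),
    ("SAMTC", "squad leader", 10.8348, 2, 0.0044),
    ("SAMTC", "scenario", 0.6689, 1, 0.4135),
]

for measure, effect, stat, df, p_reported in reported:
    p = chi2_sf(stat, df)
    print(f"{measure:15s} {effect:13s} p = {p:.4f} (reported {p_reported:.4f})")
```

Under these assumptions the recomputed values should match the reported ones to roughly three decimal places, which is consistent with a chi-square reading of the test statistics.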

Analysis on Hypothesis 3

The analysis of the frequency of platoon leader probe use revealed high-priority SA requirements for each scenario (Hypothesis 3). In the SP scenario, 16 out of 65 queries (approximately 25% of the database) were posed to squad leaders on two or more occasions. In the AC scenario, 21 out of 63 probes (33%) were posed to squad leaders on two or more occasions. Table 1 presents the “top” six probes for the SP scenario in terms of frequency of use. Two probes were posed to squad leaders 10 or more times, and four others were posed on five or more occasions. It can be inferred that if a particular probe was used frequently, the SA requirement underlying the probe was critical to the target mission. The table reveals platoon leaders focused on squad leader awareness of the position of their unit, movement of the enemy and projection of where the enemy might be, and civilian activity during the scenario. These SA requirements were associated with various mission features, including the need to secure a defined route and prevent insurgents and civilians from interfering with the convoy. The requirements were also related to mission subgoals, including evaluating routes, avoiding fire fights, projecting enemy action, maneuvering against the enemy and preventing collateral damage.

Table 2 presents the “top” eight probes for the AC scenario. Four probes were posed five or more times and four others were posed on four occasions. Another 13 probes were posed two or more times. Again, it was inferred that high-frequency probes indicated high-priority SA requirements for the training scenario. The table reveals platoon leaders were concerned with squad leader knowledge of enemy location and capability to detect the squad, as well as whether the squad could take cover. They also appeared to be concerned with specific threats to the unit and if they could eliminate the enemy, as well as the squad remaining mission worthy and being able to clear buildings. Furthermore, they were frequently concerned with endangering civilians during the scenario. These knowledge requirements were associated with scenario features, including dealing with a car bomb and insurgent team attack, and protecting people evacuating a school. The SA requirements were also related to mission subgoals, including assessing and avoiding threats, maneuvering against the enemy and occupying terrain, and ensuring effective squad and civilian security. Other squad leader decisions related to these subgoals are presented in the table. Due to differences in the high-priority SA requirements identified for each scenario, statistical comparisons of probing frequency among scenarios were not conducted as a further basis for assessing Hypothesis 3.

Table 1
Frequency of Specific Probes in SP Scenario

| Probe # | Probe | Frequency of use | Related scenario features (F), goals (G), and decisions (D) |
|---|---|---|---|
| 6 | Are you maintaining the planned route? | 11 | F: Need to secure defined convoy route. G: Evaluate feasibility of route in movement. D: Is squad movement meeting mission objectives? D: Would an alt. route improve speed/security? |
| 30 | Where are you? What is your present location? | 10 | (Same as above.) |
| 31 | Where is the fire coming from? | 6 | F: Need to clear insurgent forces that would compromise mission. G: Avoid enemy fires. D: How can fires be applied to achieve mission? D: Will fires be effective? D: Where is cover or concealment? |
| 12 | Does it look like the enemy is defending or moving against you? | 5 | F: Need to prevent convoy from attack and secure objective. G: Project enemy behavior. D: What is the likely enemy course of action? D: What is the most dangerous COA? |
| 47 | Where are likely rooftop or window enemy sniper positions with observation of your movement? | 5 | F: Need to deal with enemy sniper activity. G: Maneuver against enemy positions. D: Where can I operate w/o limitations? D: What positions are best to engage the enemy? D: What is the field-of-fire on the objective? D: What do I need to do to control squad movement and coordination? |
| 18 | What is the civilian activity? | 5 | F: Need to ensure civilians do not interfere with convoy on route. G: Prevent violence/secure areas from threats/control collateral damage. D: Is squad security appropriate for the mission? D: What is the best position to control civilians? D: Are reinforcements needed? D: Are arrests needed? D: What is the min. force needed to clear the area? D: What limits do the ROE put on my return fire? D: What is the degree of threat? D: Does my squad have appropriate equipment? |
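The frequency summary for the SP scenario can be reproduced with a few lines of Python. This is a minimal sketch that uses only the counts reported in Table 1 and the repeat-use totals given in the text. The variable and function names are hypothetical, and the study's full probe log is not reproduced here.

```python
# Frequency of use for the six most-used SP probes, keyed by probe number
# (values copied from Table 1).
sp_top_probes = {6: 11, 30: 10, 31: 6, 12: 5, 47: 5, 18: 5}


def count_at_least(freqs, threshold):
    """Number of probes posed at least `threshold` times."""
    return sum(1 for uses in freqs.values() if uses >= threshold)


def percent_repeated(n_repeated, n_in_database):
    """Share of the probe database that was posed two or more times."""
    return 100.0 * n_repeated / n_in_database


def rank_probes(freqs):
    """Probe numbers ordered from most to least frequently used."""
    return sorted(freqs, key=lambda probe: freqs[probe], reverse=True)


print(count_at_least(sp_top_probes, 10))   # probes posed 10 or more times
print(count_at_least(sp_top_probes, 5))    # probes posed 5 or more times
print(round(percent_repeated(16, 65), 1))  # SP scenario
print(round(percent_repeated(21, 63), 1))  # AC scenario
print(rank_probes(sp_top_probes)[:2])      # two most-used SP probes
```

The same tallies could be applied to a per-trial probe log to see whether the ranking is stable across squads, although that breakdown is not reported in the text.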


Table 2
Frequency of Specific Probes in AC Scenario

| Probe # | Probe | Frequency of use | Related scenario features (F), goals (G), and decisions (D) |
|---|---|---|---|
| 30 | Is the enemy within range? | 9 | F: Need to react to insurgent team activity. G: Determine level of threat. D: What are the strengths/weaknesses of the squad? D: What is the immediacy of the threat? D: What damage can the threat do? D: Can I eliminate the threat? |
| 65 | Will the enemy be able to detect your squad? | 8 | F: Need to use cover in entering and clearing bldgs. G: Avoid danger areas. D: What is the least exposed position? D: Can I avoid danger? D: Do I have time to avoid danger? D: How do I minimize impact of danger? |
| 13 | Can you take cover? | 6 | F: (Same as above.) G: Avoid enemy observation. D: Is cover available? D: How can I take cover? D: What movement will minimize exposure? |
| 6 | Are the [building] rooms cleared? | 5 | F: Need to search/secure buildings along route. G: Use terrain to gain advantage over enemy. D: Where can I maneuver to engage the enemy? D: Will occupying the bldg. give me an advantage? D: How will the enemy respond to my position? |
| 31 | Is there a car bomb threat? | 4 | F: Need to deal with car bomb attack. G: Defend against attack. D: Where can obstacles or booby traps be placed? D: What positions allow for defense against the threat? D: Where can fires be used against the enemy? |
| 18 | Do you have enough firepower to suppress enemy position? | 4 | F: Need to engage insurgent team protecting cache. G [4.3]: Maneuver against enemy positions. D: Where can I operate w/o limitations? D: What positions are best to engage the enemy? D: What is the field-of-fire on the objective? D: What do I need to do to control squad movement? |
| 1 | Are civilians in danger if you have to engage the enemy? | 4 | F: Need to protect civilians evacuating bldgs. during search for contraband. G: Prevent hostile action in a secured area. D: Can enemy forces be isolated? D: How do civilians impact my security task? D: What reinforcements are available to assist? |
| 10 | Can you continue the mission? | 4 | F: Need to consolidate squad after locating cache. G: Evaluate capability to continue mission. D: Are my weapons adequate to continue? D: Are my forces adequate to continue? D: Is my squad combat effective? |

Analysis on Hypothesis 4

Statistical analyses on squad leader ratings of mental workload during training trials revealed almost identical perceptions across leaders, χ²(2) = 0.0356, p = .9823. On average, the percent load determined for Leader #2 was slightly lower than for Leaders #1 and #3; however, any differences were not statistically significant. There was also no evidence of an effect of scenario type on leader workload ratings, χ²(1) = 2.2888, p = .1303. This was in line with expectation, based on the manipulation of scenario features. In relation to Hypothesis 4, correlation analyses on perceived combat workload and squad leader SA did not reveal significant linear associations in either the SP or AC scenarios. More specifically, leader SA did not appear to degrade with increasing workload. This was the case for total SA, and Level 1, 2 and 3 SA measured using the real-time probe technique.

Result of Soldier Feedback

With respect to the soldier evaluations of the quality of various types of SA-based feedback through AARs, we found ratings of mission knowledge, behavior, and communications feedback to range from approximately 62% to 91% (on a 100-point scale). In general, results suggested additional feedback, as part of AARs, based on the outcomes of the SA measures would be perceived as useful by training squads in regular exercises to develop SA and combat decision skills.

Discussion

In this section, we provide explanations of why certain results might have occurred with the new SA measurement tools during the training study. On this basis, we identify any shortcomings of our techniques and lessons learned for further development of such tools as well as applications. We also identify the types of knowledge that can be gained with each method and improvements in SA measurement over other existing approaches.

Real-Time Probe Measure

In the present study, the automated probe delivery tool was used to pose, on average, 19 SA probes to soldiers per training trial. This amounts to one probe every 1.6 minutes during the course of a 30-min trial. These numbers are important because results on the real-time SA probes suggested some lack of sensitivity to individual differences among squad leaders. As previously mentioned, Jones and Endsley (2000) probed air defense system operators, on average, once every 2 minutes. According to Jones and Endsley’s (2000) recommendation for probing at least every 30 seconds, we would have needed to probe soldiers 60 or more times in each training trial to promote sensitivity of our analyses on measures of the various levels of SA. The lesson learned here is that high numbers of probes are needed as a basis for statistical analysis of SA data but such probes must be effectively integrated into scenario communications so as not to detract from soldier task performance. In our case, the low numbers of probes were primarily due to squads executing tactical maneuvers not anticipated by the research team and limiting the number of virtual probe markers encountered by a squad leader. That is, the number of opportunities for a platoon leader to pose queries was limited by creative squad operations. From a tactical perspective, squad leader behavior demonstrated critical training and experience. From the perspective of assessing SA, their behaviors reduced the potential utility of the measurement approach for supporting feedback during AARs. In general, in designing and applying such an SA probing tool, researchers need to be able to foresee potential trainee behaviors in order to maximize opportunities for delivering probes. (We discuss potential modifications to the probing software below.)
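The probe-rate arithmetic in this paragraph can be written out as a short Python sketch. The function names are hypothetical, and the 30-min trial length and probe counts are the figures quoted in the text.

```python
from math import ceil


def minutes_per_probe(n_probes, trial_minutes):
    """Average spacing between probes, in minutes."""
    return trial_minutes / n_probes


def probes_for_interval(trial_minutes, interval_seconds):
    """Probes needed to cover a trial at one probe per interval."""
    return ceil(trial_minutes * 60 / interval_seconds)


print(round(minutes_per_probe(19, 30), 1))  # observed average spacing
print(probes_for_interval(30, 30))          # one probe every 30 seconds
print(probes_for_interval(30, 120))         # one probe every 2 minutes
```

Comparing the last two calls shows how sharply the required probe count grows as the probing interval shrinks, which is the trade-off against obtrusiveness raised above.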

Another potential factor in the sensitivity of the automated real-time probe measure (or lack thereof) to squad leader individual differences might have been the use of AARs between training trials. Although the subject squad leaders entered the experiment with different levels of experience in terms of Army deployments, feedback provided during the AARs might have helped put leaders “on an equal footing” in terms of knowledge of SA requirements for the training scenarios (SP and AC). With respect to our data analysis, the level of accuracy of SA across squad leaders might have become more consistent with additional training trials. Related to this, it is also possible that the AARs affected squad leader ratings of cognitive workload. That is, as leader SA improved and they were able to more effectively manage the training scenarios, their perceptions of workload might have decreased and become more consistent across leaders. Of course, it is important from a training perspective that there are improvements in soldier combat SA and decision skills; however, if the objective of application of a probing tool is to identify differences in soldier SA based on background, training, experience, and so forth, then the use of AARs in combination with such a tool is not recommended. This also applies to assessment of soldier perceived workload in virtual training exercises.

With respect to the results of the training scenario manipulation on the SA probe and workload measures, we intentionally controlled scenario features (time of day, weather, number of squad casualties), in order to ensure comparable levels of “combat” difficulty during training trials. Although the two scenarios posed different goals for the squads, many of the tasks and information processing requirements were comparable. This may also explain why the SA probe measure and workload scores were similar among the scenarios (tests of Hypotheses 2 and 4). In general, the approach we took to the design of scenarios is appropriate for comparing squad leader SA across trials, and for the purpose of evaluating the reliability of SA measures for different training missions. However, if one is interested in identifying how SA might vary from one MOUT to another, scenarios should be designed with different goals, such as maintaining squad safety (like our scenarios) versus deliberate attacks on an enemy in an urban environment. It is likely that in more offensive situations, SA communications and behaviors of leaders would be different than in situations emphasizing civilian safety, as their mission objectives are different.

From an experiment control perspective, the automated real-time probe measure was successful in unobtrusively querying soldier SA, unlike other prior SA measurement methodologies (e.g., SAGAT) that require simulation freezes to test trainees (Endsley, 1995a). In addition, no trainees suspected questions posed by a platoon leader during trials to be a basis for evaluation of their virtual battlefield knowledge, although the specific probes might have altered a squad leader’s pattern of attention.

Frequency Analysis of SA Probes

The development and application of the automated probe delivery system also served as a basis for “sifting” the range of SA requirements listed in our GDTA (on MOUT operations) to identify those commonly considered by experts (the confederate platoon leaders) throughout mission performance. In line with our hypothesis, differences were identified in the high-priority SA requirements between the two training scenarios. In one scenario, experts focused on relative locations of the squad and enemy forces as well as leader ability to predict fires and snipers. In the other scenario, experts focused on SA on specific insurgent threats and squad combat readiness. A common high-priority SA requirement between the scenarios was potential endangerment of civilians and whether squad leaders were keeping track of this.

In general, post-hoc probe frequency analysis provides a basis for identifying specific high-priority SA elements that should be trained in MOUT operations. The analysis can also provide a basis for refining a probe database formulated through cognitive task analysis. This could reduce confederate observer workload in using tools like the APDA and selecting probes during training exercises. This type of analysis and the outcomes identified here have not been generated in previous real-time probe studies (e.g., Jones & Endsley, 2000).

Behavior and Communication Measures

In contrast to the data collection results for the probe measure, the electronic SA behavior measure (ESABM) and the SA measure of team communications (SAMTC) both provided many opportunities for observations on soldier SA. On average, analysts made 40 behavior ratings per trial across the 27 critical MOUT behaviors. With respect to ratings of SA communications on various elements of MOUT, analysts provided an average of 20 ratings per trial. These results suggest that both measures can provide substantial samples for statistical analyses on the influence of training mission characteristics on SA, even with relatively short training trial time (30 min).

Considering the behavior and communication ratings together, both revealed differences among squad leaders but not between scenarios. We expected squad leaders to have similar challenges demonstrating and communicating SA across the scenarios (Hypothesis 2) because of the specific feature manipulations. It is also important to note that many of the ESABM items are general SA behaviors that may be applicable across MOUT scenarios. This is, in part, based on its origins in the SABARS method (Strater et al., 2001). In the ESABM adaptation of the SABARS, there were a few MOUT-specific items that could be considered more or less relevant to one of the scenarios under study. Consequently, some ESABM items may not promote sensitivity of the measurement approach for distinguishing among stability and support operations, deliberate attacks, and so forth. Since the communication ratings were highly correlated with behavior ratings, both ratings showed the same pattern for individual and scenario differences. Although we did not find significant differences in squad leader SA based on the probe measure, this does not invalidate the SA behavior and communication measures. Rather, it suggests squad leaders might have used different strategies, based on their experience and training, to achieve the same level of SA.

Conclusion

In this section, we return to assess the objectives of our study. We also identify limitations of the research, in general, as well as important future research directions for SA measurement system development. The objective of this study was to evaluate the efficacy of a computer-based SA measurement system for training dismounted infantry in MOUT in a VR simulation. This included investigating sensitivity and reliability of several measures to differences in squad leader ability to achieve SA and to various scenario conditions. We prototyped three measurement tools that involved real-time probing of soldiers in exercises and expert observer ratings of communications and behaviors. Results of an experiment with actual infantry squads revealed sensitivity of the ratings to individual differences and showed some promise for capturing SA on critical training events. Soldier subjective ratings of the potential for feedback on such measures to enhance AARs (e.g., understanding when decision or performance problems may have occurred based on SA) and support decision skill development were positive (62% to 91% on a 100% scale).

The three types of SA measures used in the present study provided different insight into soldier SA. For infantry squads participating in training exercises, achieving SA may be seen as a three-step progression requiring them to address the dimensions of the measures we developed. The three steps include developing mission knowledge, demonstrating that knowledge through SA, and communicating SA to squad members; however, failure in one step affects the subsequent steps. A squad leader might have “good” mission knowledge but may be incapable of executing appropriate behaviors that reflect his or her SA and, subsequently, (s)he may commit errors in communicating with team members. As an example, the probe measure indicated the worst performing squad leader knew the battlefield situation because he moved to decent observation points in the VTE on his own. However, the behavior ratings from expert analysts indicated that his decisions concerning squad actions were not consistent with “good” SA. In this way, the suite of SA measures presented here can be used to evaluate the stages of squad leadership and facilitate training to improve one or all levels of SA.

By comparison with established SA questionnaire or freeze techniques, such as SAGAT, the probe methodology is more flexible for application to high-fidelity training systems, like the VR-based trainers used by the Army, or field exercises. Such applications would be difficult with SAGAT, requiring freezes of live training events and having analysts interview soldiers (in place) for SA assessment. The expert ratings of SA behaviors and communications developed in this research may also provide assessments of soldier SA as reliable and accurate as those of the SAGAT methodology, without obtrusiveness to training performance. In general, all three measurement tools (APDA, ESABM, and SAMTC) have been demonstrated to work with a VTE. The tools can also be further adapted for use in analyzing soldier SA in other virtual training scenarios. In addition, the tools simplify data analysis for researchers by performing calculations of preliminary results and automated database storage.

The main limitation of this study was the small sample of squad leaders. The limited data set made it difficult to reliably assess the sensitivity of the SA measurement tool to individual differences and scenario conditions. Because SA probes were integrated into normal communications between the confederate platoon leader and squad leader, the number of probes delivered in a trial was limited by both the scenario and the discretion of the platoon leader. However, allowing platoon leaders flexibility in posing various probes during trials supported identification of high-priority SA requirements for specific scenarios. It is also possible that this approach promoted the value of the training exercise for soldiers by ignoring probes that were less relevant, based on the current status of a squad. The alternative would be to require squad leaders to respond to all probes in a database for the purpose of testing the SA probes. Such an approach might lead to artificial communications by a platoon leader and high cognitive workload for squad leaders. Future studies should focus on achieving a balance between unobtrusiveness and reliability of the real-time probe measure.

Other potential limitations of the study were related to experiment control. For example, only one expert was used for behavior and communication ratings for each trial, and different platoon leaders were used for the two scenarios. These steps might have contributed to the degree of sensitivity of the analyses on the SA probes.

In order to improve sensitivity of the real-time SA probe measure in future exercises, the APDA tool is to be made more flexible to allow platoon leaders to pose any probe (in the database of MOUT SA requirements) at any time in a training scenario. Platoon leaders will not be constrained to subsets of probes associated with virtual probe markers in the VTE that a squad might encounter. The software will also be enhanced to allow platoon leaders to select (in real time) key elements of MOUT (e.g., weapons, friendlies, enemies, etc.) as bases for probing. They will then select among squad leader goal states (related to a MOUT element), specific task decisions, and specific probes for the goal state. This approach needs to be tested through additional research to determine whether higher numbers of probes can be collected in trials to further support parametric statistical analyses of trainee responses.
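One way to picture the planned selection path (MOUT element, then goal state, then task decisions and probes) is as a nested lookup. The Python sketch below is purely illustrative. The class, field, and function names are hypothetical, the grouping of goals under elements is an assumption, and the sample entries are borrowed from Table 2 rather than from the actual APDA database.

```python
from dataclasses import dataclass, field


@dataclass
class GoalState:
    """A squad leader goal with its related decisions and candidate probes."""
    goal: str
    decisions: list = field(default_factory=list)
    probes: list = field(default_factory=list)


# Hypothetical slice of a probe database keyed by MOUT element.
PROBE_DB = {
    "enemies": [
        GoalState(
            goal="Determine level of threat",
            decisions=["What is the immediacy of the threat?",
                       "Can I eliminate the threat?"],
            probes=["Is the enemy within range?"],
        ),
        GoalState(
            goal="Avoid enemy observation",
            decisions=["Is cover available?"],
            probes=["Can you take cover?"],
        ),
    ],
    "friendlies": [
        GoalState(
            goal="Evaluate capability to continue mission",
            decisions=["Is my squad combat effective?"],
            probes=["Can you continue the mission?"],
        ),
    ],
}


def goals_for(db, element):
    """Goal states a platoon leader could pick after choosing an element."""
    return [state.goal for state in db.get(element, [])]


def probes_for(db, element, goal):
    """Probes offered once an element and one of its goal states are chosen."""
    for state in db.get(element, []):
        if state.goal == goal:
            return list(state.probes)
    return []


print(goals_for(PROBE_DB, "enemies"))
print(probes_for(PROBE_DB, "friendlies", "Evaluate capability to continue mission"))
```

Keying the lookup by element rather than by virtual probe marker reflects the stated aim of letting a platoon leader reach any probe at any time, regardless of where the squad is in the VTE.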

Further research is needed to define administration approaches that lead to sensitivity of all measures to training scenario manipulations for supporting AAR feedback on soldier SA skills. Beyond this, research should be conducted to assess the potential applicability of the real-time SA probe measurement system (APDA tool) for use in field training exercises. Such testing could be accomplished through technologies similar to those used in the present study, including: (1) a wireless squad communication network; (2) portable tablet PCs for use by confederate platoon leaders to select SA probes and for use by training analysts to rate squad leader behaviors and communications; and (3) a networked secure server providing platoon leader access to the database of probes and for storing data collected during the field exercise.

References

Brannick, M. T., Prince, A., Prince, C., & Salas, E. (1995). The measurement of team process. Human Factors, 37, 641–651.

Durso, F. T., Hackworth, C. A., Truitt, T. R., Crutchfield, J., Nikolic, D., & Manning, C. A. (1998). Situation awareness as a predictor of performance in en route air traffic controllers. Air Traffic Control Quarterly, 5, 1–20.

Endsley, M. R. (1993). A survey of situation awareness requirements in air-to-air combat fighters. International Journal of Aviation Psychology, 3, 157–168.

Endsley, M. R. (1995a). Measurement of situation awareness in dynamic systems. Human Factors, 37, 65–84.

Endsley, M. R. (1995b). Toward a theory of situation awareness in dynamic systems. Human Factors, 37, 32–64.

Endsley, M. R., & Bolstad, C. A. (1994). Individual differences in pilot situation awareness. International Journal of Aviation Psychology, 4, 241–264.

Endsley, M. R., Holder, L. D., Leibrecht, B. C., Garland, D. J., Matthews, M. D., & Graham, S. E. (2000). Modeling and measuring situation awareness in the infantry operational environment (Research Report No. 1753). Alexandria, VA: U.S. Army Research Institute for Behavioral and Social Sciences.

Ericsson, K. A., & Simon, H. A. (1998). How to study thinking in everyday life: Contrasting think-aloud protocols with descriptions and explanations of thinking. Mind, Culture, & Activity, 5, 178–186.

Gorman, J. C., Cooke, N. J., & Winner, J. L. (2006). Measuring team situation awareness in decentralized command and control systems. Ergonomics, 49, 1312–1325.

Gugerty, L., & Tirre, W. (1997). Situation awareness: A validation study and investigation of individual differences. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 40, 564–568. doi: 10.1177/154193129604001202

Hamilton, R. M., & Holmquist, J. P. (2005). Training in virtual and augmented realities: An interview with Bruce Knerr. Ergonomics in Design, 13, 18–22.

Hendy, K. C., Hamilton, K. M., & Landry, L. N. (1993). Measuring subjective workload: When is one scale better than many? Human Factors, 35, 579–601.

Hockey, G. R. J. (1986). Changes in operator efficiency as a function of environmental stress, fatigue and circadian rhythms. In K. Boff, L. Kaufman, & J. Thomas (Eds.), Handbook of perception and performance (pp. 44/1–44/49). New York, NY: Wiley.

Jones, D. G., & Endsley, M. R. (2000). Can real-time probes provide a valid measure of situation awareness? In D. B. Kaber & M. R. Endsley (Eds.), Human performance, situation awareness and automation: User-centered design for the new millennium (pp. 245–250). Atlanta, GA: SA Technologies.

Kaber, D. B., & Endsley, M. R. (1998). Team situation awareness for process control safety and performance. Process Safety Progress, 17, 43–48.

Kaber, D. B., Riley, J. M., Lampton, D., & Endsley, M. R. (2005). Measuring situation awareness in a virtual urban environment for dismounted infantry training. In Proceedings of the 11th International Conference on Human-Computer Interaction (Vol. 9). Mahwah, NJ: Erlbaum & Assoc.

Lampton, D. R., Riley, J. M., Kaber, D. B., & Endsley, M. R. (2006, November). Use of immersive virtual environments for measuring and training situation awareness. Paper presented at the U.S. Army Science Conference, Orlando, FL.

Matthews, M. D., Pleban, R. J., Endsley, M. R., & Strater, L. D. (2000). Measures of infantry situation awareness for a virtual MOUT environment. In D. B. Kaber & M. R. Endsley (Eds.), Human performance, situation awareness and automation: User-centered design for the new millennium (pp. 262–267). Atlanta, GA: SA Technologies.

O’Hare, D. (1997). Cognitive ability determinants of elite pilot performance. Human Factors, 39, 540–552.

Pleban, R. J., Eakin, D. E., Salter, M. S., & Matthews, M. D. (2001). Training and assessment of decision-making skills in virtual environments (Research Report 1767). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Sharit, J., & Salvendy, G. (1982). Occupational stress: Review and reappraisal. Human Factors, 24, 129–162.

Strater, L. D., Endsley, M. R., Pleban, R. J., & Matthews, M. D. (2001). Measures of platoon leader situation awareness in virtual decision-making exercises (Research Report No. 1770). Alexandria, VA: U.S. Army Research Institute for Behavioral and Social Sciences.

Taylor, R. M., Endsley, M. R., & Henderson, S. (1996). Situational awareness workshop report. In B. J. Hayward & A. R. Lowe (Eds.), Applied aviation psychology: Achievement, change and challenge (pp. 447–454). Aldershot, UK: Ashgate Publishing Ltd.

Wright, M. C., & Kaber, D. B. (2003). Team coordination and strategies under automation. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 47, 553–557. doi: 10.1177/154193120304700363
