

Performance Improvement, vol. 46, no. 10, November/December 2007
©2007 International Society for Performance Improvement
Published online in Wiley InterScience (www.interscience.wiley.com) • DOI: 10.1002/pfi.168

AUTOMATED PERFORMANCE ASSESSMENT AND FEEDBACK FOR FREE-PLAY SIMULATION-BASED TRAINING

James Ong

Practice and experience, whether simulated or on the job, are not enough to ensure effective learning. Learners must be able to make sense of those experiences to identify poor decisions and actions, missing knowledge, and weak skills that deserve attention. Using instructors to provide one-on-one instruction is effective but also expensive. This article describes ways of using intelligent software to assess student performance and provide feedback automatically in free-play simulations. Case studies describe applications of these methods.

COMPUTER-BASED training systems today tend to be used primarily as electronic textbooks. They present facts and concepts using text and multimedia, and they test the learner’s understanding by asking multiple-choice or fill-in-the-blank questions. Because these systems focus on factual knowledge rather than on performance, they can instill “inert knowledge” (Whitehead, 1929) that people can recognize or recall but cannot apply when the situation clearly calls for it. These methods often produce trained novices who are familiar with the subject area but lack the in-depth expertise needed for high performance.

By contrast, simulation-based learning enables students to apply their knowledge and practice their skills in a variety of hypothetical situations. In particular, software simulations can present scenarios to many students and challenge them to apply their knowledge and skills to analyze situations, make appropriate decisions, and see the effects of their actions. One common type of training simulation is the branching scenario, in which the software presents the current situation at each decision point to the student using a combination of text, graphics, audio, or video. The software then presents a set of possible actions and prompts the student to select one. The student’s choice determines what happens next, and the software advances to the next decision point by presenting the new situation, offering a new set of choices, and so on.

Branching scenarios are popular because they provide useful experiential learning and are relatively easy to create compared to more sophisticated simulations. However, because they usually offer only a few choices at each decision point, they have several disadvantages. First, branching scenarios do not challenge students to generate options on their own as they would have to in real life. Second, the course of action the student prefers might not be offered as a choice, which can be frustrating. By contrast, free-play simulations let students select from many possible actions and use some type of simulation engine to determine how the simulated world should respond to each action.

ASSESSMENT AND FEEDBACK AFTER SIMULATION EXERCISES

Practice and experience, whether simulated or on the job, are not enough to ensure effective learning. Learners must be able to make sense of those experiences to identify poor decisions and actions, missing knowledge, and weak skills that deserve attention. In complex situations with interconnected cause-and-effect relationships, it is often hard for students to figure out how their actions led to the various outcomes. Even if the desired goals were achieved, it is unlikely that everything the student did was correct or optimal. In addition, outcomes are often affected by factors unrelated to the student’s actions, such as favorable or unfavorable simulated conditions and events. Thus, feedback based solely on outcomes can be misleading and hinder training. For example, Johnston, Cannon-Bowers, and Smith-Jentsch (1995) reported that feedback provided to the learner based on a lucky positive outcome may actually reinforce poor processes.

Instructors can help students learn much more from their experiences by observing the exercises, assessing student performance, providing feedback, and guiding reflection. Although exercise debriefings are now a common practice, the U.S. Army has been a leader in formalizing and institutionalizing these reflection and review processes. First used in the 1970s to capture lessons from the simulated battles at the National Training Centers, after-action reviews (AARs) are now a standard, carefully designed procedure. Led by instructors after live and simulated military exercises, AARs help students review, discuss, and reflect on four key questions (Garvin, 2000):

• What did we set out to do?

• What actually happened?

• Why did it happen?

• What are we going to do next time?

INTELLIGENT TUTORING SYSTEMS AUTOMATE ASSESSMENT

Studies show that individualized instruction is far more effective than traditional classroom instruction. In 1984, Benjamin Bloom, an educational psychologist, reported that students who received individualized tutoring performed on average two standard deviations better than students who received traditional classroom instruction. This means that the performance of the average tutored student exceeded the performance of almost 98% of the students receiving traditional instruction. Individualized training is effective because a skilled tutor can select learning topics, set the pace, and adapt the instructional style to the needs and preferences of each student. The challenge, then, is figuring out how to provide the benefits of one-on-one instruction to many students in an affordable way.

Intelligent tutoring systems (ITSs) (Ong & Ramachandran, 2000) are software programs that encode and apply the subject matter and teaching expertise of instructors to provide the benefits of one-on-one tutoring in an automated way. During each scenario, the ITS evaluates the student’s actions to assess knowledge and skills. The ITS can also provide hints during the exercise, either proactively or on demand. After each scenario, the ITS can present detailed feedback that identifies the student’s strengths and weaknesses and select appropriate next exercises that address the student’s specific learning needs. Socratic tutoring dialogues (Rose, Moore, VanLehn, & Allbritton, 2000; Domeshek, Holman, & Ross, 2002) can be used to ask students questions, probe for their rationale, diagnose their knowledge and skills, and help them realize lessons on their own. ITSs can help ensure that training simulators are effective in distance learning situations when human instructors are not present. Even when instructors are available, ITSs can serve as force multipliers that provide first-pass coaching and performance assessment, so instructors can focus their attention on students with the most challenging learning issues, either face-to-face in a classroom or over the Web by synchronous or asynchronous communication.
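To make this concrete, here is a minimal sketch, in Python, of how such an assess-and-coach loop could be organized. Everything in it is an assumption for illustration (the Verdict and StudentModel structures, the evaluator callables, and the 0.3 update rate); it is not the architecture of any system described in this article.

from dataclasses import dataclass, field

@dataclass
class Verdict:
    principle: str        # the principle an evaluator judged
    correct: bool
    hint: str

@dataclass
class StudentModel:
    mastery: dict = field(default_factory=dict)   # principle -> estimate in [0, 1]

    def update(self, principle, correct):
        old = self.mastery.get(principle, 0.5)    # 0.5 = no evidence yet
        # Nudge the estimate toward 1.0 on correct actions and 0.0 on errors.
        self.mastery[principle] = old + 0.3 * ((1.0 if correct else 0.0) - old)

def run_exercise(actions, evaluators, model, hint_threshold=0.3):
    """Assess each student action in context; coach when mastery looks weak."""
    hints = []
    for action, state in actions:                 # stream from the simulation
        for evaluator in evaluators:
            verdict = evaluator(action, state)
            if verdict is None:
                continue                          # evaluator is silent here
            model.update(verdict.principle, verdict.correct)
            if model.mastery[verdict.principle] < hint_threshold:
                hints.append(verdict.hint)        # proactive coaching
    # After the scenario: report principles from weakest to strongest.
    return sorted(model.mastery.items(), key=lambda item: item[1]), hints

# Example with a single, hypothetical evaluator:
def verify_before_engage(action, state):
    if action != "engage":
        return None
    return Verdict("Verify the target before engaging",
                   state.get("target_verified", False),
                   "Verify the target's identity before engaging it.")

report, hints = run_exercise([("engage", {"target_verified": False})],
                             [verify_before_engage], StudentModel())
# report -> [("Verify the target before engaging", 0.35)]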

Studies show that students taught using intelligent tutoring systems generally performed better and learned faster compared to students receiving traditional classroom instruction. For example, at Carnegie Mellon University, researchers developed an intelligent tutoring system, the LISP Tutor, that taught computer programming skills to college students. In one controlled experiment, students who received computer tutoring scored 43% higher on the final exam than the control group who received traditional instruction (Anderson, Boyle, & Reiser, 1985). The Sherlock intelligent tutoring system taught U.S. Air Force personnel troubleshooting procedures for problems associated with an F-15 manual avionics test station. Students taught using Sherlock performed significantly better than the control group, and after 20 hours of instruction, they performed as well as experienced technicians with four years of on-the-job experience (Nichols, Pokorny, Jones, Gott, & Alley, 1992). Efficient training was achieved by presenting students with a mixture of problems that represent both common and unusual situations. By contrast, on-the-job experience is often filled with steady repetition of routine situations, so it can take years to encounter the less common, and often more difficult, situations that enable practitioners to become true experts.

A meta-analysis of 11 intelligent tutoring system evaluations reported by Fletcher (1999) showed that students using ITSs performed on average 0.84 standard deviations better than students taught using conventional classroom instruction. By contrast, an evaluation of 47 interactive multimedia instruction systems showed just a 0.5 standard deviation improvement (Fletcher, 1990).

IMPLEMENTING AUTOMATED ASSESSMENT

Although ITS technology offers many benefits, successful implementation poses several challenges. First, a free-play scenario can play out in many ways, depending on what the student does. This variability means that student performance cannot be accurately assessed simply by recognizing prespecified student actions at prespecified times. Instead, the software needs to evaluate the student’s actions in a flexible manner that can handle the various ways in which the scenario might unfold. Second, the software needs to evaluate student actions in context, not in isolation. For example, whether a particular set of actions is appropriate depends on which events had already taken place, as well as the state of the simulation when the actions were carried out. Thus, subject matter expertise, encoded within software, is needed to estimate the student’s knowledge and skills based on the actions carried out.
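As a small illustration of this context dependence, the sketch below judges the same action differently depending on simulation state and prior events. The action and variable names are hypothetical; the point is only that correctness cannot be determined by matching prespecified actions at prespecified times.

def judge_issue_warning(state, events):
    """'issue_warning' is appropriate only for an unidentified, inbound contact."""
    return ("contact_detected" in events
            and state.get("contact_identity") == "unknown"
            and state.get("contact_inbound", False))

# The same action, two contexts, two different judgments:
assert judge_issue_warning({"contact_identity": "unknown", "contact_inbound": True},
                           ["contact_detected"])
assert not judge_issue_warning({"contact_identity": "friendly", "contact_inbound": True},
                               ["contact_detected"])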

ITS authoring tools simplify the process of specifying how the tutoring software should interpret student actions, assess performance, and provide effective instructional feedback. The challenge in creating an intelligent tutoring system is making it smart enough to recognize many complex sequences of actions, events, and simulation state to infer student strengths and weaknesses while making it easy for an instructor to enter this knowledge without requiring programming skills or excessive effort.

CASE STUDIES

This section describes case studies of intelligent tutoring system applications and the assessment and feedback technologies that make them possible.

Tactical Action Officer

A U.S. Navy tactical action officer (TAO) is a senior officer, usually second in command to the captain, who commands watch standers who operate a ship’s weapons, sensors (sonar and radar), navigation, and supporting aircraft and vessels. Under extreme stress and time pressure, TAOs must make rapid tactical decisions in hostile, uncertain situations that affect mission success, the safety of a multibillion-dollar ship, and the lives of the ship’s crew. For example, a TAO must be able to determine whether a radar blip is an incoming missile or a passenger airliner and respond quickly and appropriately. Failure to act correctly can have dire consequences.

Becoming a great TAO requires extensive experience in tactical situations. However, it used to be very expensive to provide this experience. At the U.S. Navy Surface Warfare Officers School (SWOS) in Rhode Island, eight students and five instructors were needed to play other roles and evaluate a single TAO’s performance using a $6 million simulator that cost $600,000 per year for software enhancements. This high cost limited the amount of practice each student could receive. To provide lower-cost training, Stottler Henke, an artificial intelligence software development company, developed a simulation-based intelligent tutoring system, the tactical action officer intelligent tutoring system (TAO ITS), comprising three software applications: a student interface, a scenario authoring tool, and an instructor interface.

The student interface provides a free-play simulation that enables students to act as TAOs in realistically complex tactical situations involving friendly, enemy, and neutral aircraft, helicopters, missiles, ships, and submarines. The simulation user interface displays a geographical map of the region and provides rapid access to the ship’s sensor, weapon, and communication functions, so students can receive tactical information and control the ship’s weapons, sensors, and supporting vessels and aircraft (Figure 1).

At the end of the scenario, the system presents a report card that identifies student actions that demonstrated correct (green) and incorrect (red) application of tactical principles and rules of engagement (Figure 2). For each principle, the student can select and review relevant multimedia material or see a replay of the relevant part of the scenario.
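Behind a report card like this is, in essence, a list of judged actions linked to principles, replay times, and remedial material. The sketch below shows one plausible shape for that data; the field names are assumptions, since the article does not describe TAO ITS internals.

from dataclasses import dataclass

@dataclass
class ReportCardEntry:
    principle: str        # tactical principle or rule of engagement
    action: str           # the student action that was judged
    correct: bool         # rendered green (True) or red (False)
    replay_start: float   # scenario time, for replaying the relevant segment
    media_ref: str        # link to related multimedia material

def render(entries):
    for e in entries:
        color = "green" if e.correct else "red"
        print(f"[{color}] {e.principle}: {e.action} (replay at t={e.replay_start}s)")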

The scenario authoring tool enables instructors to create scenarios that specify the type, attributes, and intelligent behaviors of friendly, enemy, and neutral ships, planes, helicopters, missiles, and submarines. For example, a plane might fly patterns to search for enemy vessels and then attack the vessels when found.

TAO ITS is unique in its use of artificial intelligence to ensure that the simulated forces behave realistically. TAO ITS provides an early version of the SimBionic intelligent agent tool kit, which enables tactical experts to define these behaviors without extensive support from software programmers by drawing and configuring flowcharts. These flowcharts represent hierarchical finite state machines (FSMs) that recognize and dynamically respond to specific patterns or sequences of student actions, simulated events, and state conditions. There can be one or more FSMs for each simulated entity, running in parallel, that detect different situations and respond appropriately.

Using the scenario authoring tool, instructors can also create FSMs to implement automated evaluators that assess students’ performance by analyzing their actions in relation to other simulation events and state conditions. Many evaluators can run in parallel, so each evaluator infers different strengths or weaknesses when it detects a particular sequence of actions, events, and states.
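SimBionic evaluators are authored graphically as flowcharts, so the Python sketch below is only an approximation of the idea: a small state machine that watches the stream of actions and events and records a finding when it recognizes a complete pattern. The event names and the tactical rule are hypothetical.

class WarnBeforeEngageEvaluator:
    """Infers a weakness if the student engages a contact without warning it first."""

    def __init__(self):
        self.state = "WAITING"      # WAITING -> IDENTIFIED -> WARNED
        self.findings = []

    def on_event(self, event, contact):
        if self.state == "WAITING" and event == "contact_identified_hostile":
            self.state = "IDENTIFIED"
        elif self.state == "IDENTIFIED" and event == "warning_issued":
            self.state = "WARNED"
        elif event == "weapons_engaged":
            correct = (self.state == "WARNED")
            self.findings.append((contact, "warn before engaging", correct))
            self.state = "WAITING"  # reset to watch for the next engagement

# Many such evaluators can run in parallel over the same stream:
evaluators = [WarnBeforeEngageEvaluator()]
for event, contact in [("contact_identified_hostile", "track-042"),
                       ("weapons_engaged", "track-042")]:
    for evaluator in evaluators:
        evaluator.on_event(event, contact)
# evaluators[0].findings -> [("track-042", "warn before engaging", False)]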

Figure 3 shows the SimBionic graphical editor that scenario authors use to define simulation behaviors and evaluators as hierarchical finite state machines.

The instructor interface enables the instructor to review students’ performance and assess their progress in detail. It allows instructors to review student simulation runs and tailor scenarios to help overcome any observed deficiencies in their understanding to ensure that classroom training is adequate.

FIGURE 1. TAO ITS SIMULATION USER INTERFACE

TAO ITS has been a useful training game. The Naval Air Warfare Center Training Systems Division in Orlando, Florida, surveyed 12 students at SWOS to elicit their reactions to TAO ITS. Nine students had extremely favorable reactions to TAO ITS as a classroom aid, two had favorable reactions, and one was neutral. According to Lieutenant Commander Gene Black, lead AEGIS instructor at SWOS, “TAO ITS gives student tactical action officers 10 times the tactical decision-making opportunity [compared with that provided by] existing training systems.” The U.S. Navy designated TAO ITS as a Small Business Innovation Research success story, a distinction awarded to only a small fraction of these research projects.

Encouraged by the success of this system, the navy contracted Northrop Grumman and Stottler Henke to develop an enhanced version of TAO ITS to make the training experience even more realistic, engaging, and effective. The second-generation TAO ITS, currently being tested, uses speech recognition, so students can converse with simulated crew members controlled by intelligent software agents. In addition, it evaluates the student’s actions in real time to provide coaching during scenarios.

FIGURE 2. TAO ITS REPORT CARD

FIGURE 3. SIMBIONIC GRAPHICAL EDITOR FOR PERFORMANCE ASSESSMENT ALGORITHMS


Sonar Data Analysis

Intelligent tutoring systems can also be used to train analytical skills. For example, U.S. Navy sonar technicians interpret undersea sonar data to detect, classify, and track submarines. Becoming an expert acoustic analyst requires extensive practice, analyzing many acoustic data sets under the guidance of expert analysts. To provide more practice and instruction to students, Stottler Henke developed the acoustic analysis intelligent tutoring system (AAITS). Using this system, students review and annotate graphical presentations of the sonar data to enter their observations and analyses. Students click on images of the sonar data to note significant features, link features that indicate significant relationships, and select and justify their final classification of the sound source.

Acoustic analysis experts use a scenario authoring tool to enter their own analyses of the sonar data, using a similar graphical annotation tool. At the end of each scenario, AAITS compares each student’s annotations with the expert’s. Differences are reported as errors, which are passed along to the student and the student’s instructor, along with associated acoustic analysis principles.
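At its core, this comparison is a difference between two sets of annotations. The sketch below illustrates the idea with hypothetical structures; the article does not specify how AAITS actually represents annotations.

from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    feature: str      # e.g., a spectral line the analyst marked on the gram
    principle: str    # the acoustic analysis principle it reflects

def compare(student, expert):
    """Report differences between student and expert annotation sets as errors."""
    missed = expert - student     # the expert marked these; the student did not
    spurious = student - expert   # the student marked these incorrectly
    errors = [("missed", a.feature, a.principle) for a in missed]
    errors += [("spurious", a.feature, a.principle) for a in spurious]
    return errors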

By storing low frequency analysis recording grams (lofargrams) annotated by experts, AAITS also serves as a knowledge repository to disseminate the most current acoustic analysis expertise to sonar technicians on land or at sea. AAITS has been deployed at 10 sites within the navy. The U.S. Navy designated AAITS as a Small Business Innovation Research success story.

Task Tutor Toolkit

Like many other organizations, the National Aeronautics and Space Administration (NASA) relies on procedures, guidelines, and strategies to carry out tasks that support complex missions. To the extent possible, NASA plans every activity, anticipates contingencies, and carefully designs, tests, and trains detailed, step-by-step procedures for nominal and off-nominal situations. Because it is not possible to anticipate every situation in detail, guidelines and mission rules specify appropriate actions in a more general way that constrains, but might not fully specify, the specific steps to be carried out.

Astronauts and ground-based crew members prepare for missions by practicing simulation exercises that cover diverse situations. Instructors help students operate the simulator, monitor and evaluate each student’s performance, and provide help and instructional feedback. Although this training approach is very effective, it requires many instructors to provide students with individual attention.

To lower the cost of providing individualized instruction, Stottler Henke developed for NASA the Task Tutor Toolkit, an intelligent tutoring system shell that can integrate with diverse training simulations. The Task Tutor Toolkit provides step-by-step hinting by answering questions such as “Give me a hint,” “What do I do?” “How do I do that?” and “Why should I do that?” After each action, the Task Tutor Toolkit provides feedback by telling the student whether the action was correct.

Typical coaching systems assume that there is a single sequence of correct actions for each exercise. However, this assumption is usually appropriate only for narrow tasks that can be done in a single way. The Task Tutor Toolkit evaluates student actions more flexibly by recognizing alternate, acceptable sequences of steps. For example, there may be more than one way of achieving a goal within the scenario, and there may be more than one allowable order in which some steps are carried out. By recognizing the range of acceptable solutions, the Task Tutor Toolkit can coach students through richer scenarios, not just narrow tasks, that challenge them to assess situations, identify relevant procedures and guidelines, and carry out appropriate actions.

Compared to SimBionic, the Task Tutor Toolkit can assess performance in scenarios that are more constrained, with only a moderate number of possible solutions. However, for these types of scenarios, the Task Tutor Toolkit enables easier, more rapid authoring. Specifically, the tool kit relies on a solution template that specifies the range of acceptable steps and their allowed ordering. The scenario author creates an initial solution template by using the simulator to demonstrate one possible correct sequence of actions for the scenario. The author then uses the authoring tool to generalize the solution template so that it accepts other valid sequences of actions. For example, the author could specify alternate sets of actions for part of the scenario or specify that certain actions can be carried out in any order. The author can specify conditions on simulation state variable values that must be true for an action to be correct. For example, an instructor could specify that an action should be carried out only when the simulation variable named Temperature is more than 500. Finally, the author annotates the solution template by linking principles with actions or groups of actions. This information supports assessment by letting the tutoring system identify principles the student appears to know when he or she carries out a particular action or group of actions.
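The following sketch illustrates how such a solution template might be represented and matched. All names in it (Step, AnyOrderGroup, TemplateMatcher, and the valve-and-pump procedure) are invented for illustration; the article describes the Task Tutor Toolkit’s template only at the level of behavior shown here.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Step:
    action: str
    condition: Optional[Callable[[dict], bool]] = None  # state -> bool
    principle: str = ""
    hint: str = ""

@dataclass
class AnyOrderGroup:
    steps: list           # steps that may be carried out in any order

# One step, then two steps in any order, then a state-conditioned step.
template = [
    Step("open_valve", hint="Start by opening the inlet valve."),
    AnyOrderGroup([Step("check_pressure"), Step("check_flow_rate")]),
    Step("start_pump",
         condition=lambda state: state["Temperature"] > 500,
         principle="Reach operating temperature before starting the pump"),
]

class TemplateMatcher:
    """Checks student actions against the template and serves hints."""

    def __init__(self, template):
        self.template = template
        self.position = 0
        self.pending = None   # actions remaining in the current any-order group

    def next_hint(self):
        if self.position >= len(self.template):
            return "All steps are complete."
        element = self.template[self.position]
        if isinstance(element, AnyOrderGroup):
            remaining = self.pending or {s.action for s in element.steps}
            return "Next, in any order: " + ", ".join(sorted(remaining))
        return element.hint or "Next: " + element.action

    def check(self, action, state):
        if self.position >= len(self.template):
            return False
        element = self.template[self.position]
        if isinstance(element, AnyOrderGroup):
            if self.pending is None:
                self.pending = {s.action for s in element.steps}
            if action in self.pending:
                self.pending.discard(action)
                if not self.pending:          # group complete; advance
                    self.position += 1
                    self.pending = None
                return True
            return False
        if action == element.action:
            if element.condition and not element.condition(state):
                return False                  # right step, but wrong state
            self.position += 1
            return True
        return False

# The two checks may occur in either order, and "start_pump" is rejected
# until the Temperature state variable exceeds 500:
m = TemplateMatcher(template)
assert m.check("open_valve", {"Temperature": 20})
assert m.check("check_flow_rate", {"Temperature": 200})
assert m.check("check_pressure", {"Temperature": 400})
assert not m.check("start_pump", {"Temperature": 400})
assert m.check("start_pump", {"Temperature": 600})

Because the matcher tracks only its position and the unfinished any-order group, checking each action is cheap, which suits giving feedback after every step, as the Task Tutor Toolkit does.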

The Task Tutor Toolkit was included in Spinoff 2003, NASA’s catalogue of successful spin-off technologies developed with research funding from NASA.


The Task Tutor Toolkit’s authoring tool lets scenario authors create solution templates quickly by using the simulator to demonstrate a correct sequence of actions for the scenario, generalize the solution template to accept other valid sequences of actions, and associate steps and groups of steps with hints and principles. A hierarchical overview of the solution template is shown in the left pane of Figure 4, and the details of the selected step or group, including hints and principles, are shown in the right pane.

FIGURE 4. TASK TUTOR TOOLKIT INTELLIGENT TUTORING SCENARIO AUTHORING TOOL

CONCLUSION

Intelligent tutoring systems technology has progressed steadily over the past several decades, and it has been used to develop highly successful, automated training systems. Authoring tools and instructional design methods simplify the creation of simulation-based intelligent tutoring systems, making them a practical way to enable effective, application-level learning.

References

Anderson, J.R., Boyle, C., & Reiser, B. (1985). Intelligent tutoring systems. Science, 228, 456–462.

Bloom, B. (1984). The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6), 4–16.

Domeshek, E., Holman, E., & Ross, K. (2002, December). Automated Socratic tutors for high-level command skills. In Proceedings of the 2002 Interservice/Industry Training, Simulation and Education Conference. Arlington, VA: National Training and Simulation Association.

Fletcher, J.D. (1990). The effectiveness of interactive videodisc instruction in defense training and education. Alexandria, VA: Institute for Defense Analyses.

Fletcher, J.D. (1999). Intelligent tutoring systems: Then and now. Workshop on Advanced Training Technologies and Learning Environments (NASA/CP-1999-209339). Hampton, VA: NASA Langley Research Center.

Garvin, D. (2000). Learning in action. Boston: Harvard Business School Press.

Johnston, J.H., Cannon-Bowers, J.A., & Smith-Jentsch, K.A. (1995). Event-based performance measurement system for shipboard command teams. In Proceedings of the First International Symposium on Command and Control Research and Technology (pp. 274–276). Washington, DC: Institute for National Strategic Studies.


Nichols, P., Pokorny, R., Jones, G., Gott, S., & Alley, W. (1992). Evaluation of an avionics troubleshooting tutoring system (Tech. Rep.). Brooks AFB, TX: Human Resources Directorate.

Ong, J., & Ramachandran, S. (2000, February). Intelligent tutoring systems: The what and the how. Learning Circuits. Retrieved August 31, 2007, from http://www.learningcircuits.org/2000/feb2000/ong.htm

Rose, C.P., Moore, J.D., VanLehn, K., & Allbritton, D. (2000). A comparative evaluation of Socratic versus didactic tutoring (LRDC Tech. Rep. LRDC-BEE-1). Pittsburgh: University of Pittsburgh.

Whitehead, A.N. (1929). The aims of education. New York: Macmillan.

JAMES ONG, MBA, MS, leads the development of advanced training and performance support systems at Stottler Henke. For the U.S. Navy, he led the development of the acoustic analysis intelligent tutoring system, and for NASA, he led the development of the Task Tutor Toolkit. He has served in product development, applied research, consulting, and systems engineering roles at Belmont Research, BBN, and AT&T Bell Laboratories. He received an MBA in marketing from Boston University, an MS in computer science from Yale, an MS from the University of California at Berkeley, and a BS from the Massachusetts Institute of Technology. He may be reached at [email protected].