Tactical Language Training System
Natalie MacConnell
April 21, 2005
Organization of Talk
What is the Tactical Language Training System?
Objectives and Quick Facts
System Architecture
Mission Skill Builder
Mission Practice Environment
Speech Recognition and Error Modeling
Demonstration Video
Summary
What is the Tactical Language Training System (TLTS)?
Intelligent tutoring system designed to help military personnel rapidly acquire the language and cultural skills needed for peaceful and effective communication in foreign countries
Focuses on “tactical languages”: subsets of linguistic, gestural, and cultural knowledge and skills necessary to accomplish the task at hand
Currently developed for Levantine and Iraqi Arabic
Virtual tutor coaches learners in pronunciation, assesses their mastery, and provides assistance
Learners then apply their language skills to perform missions in an interactive story environment, where they communicate with autonomous, animated, Arabic-speaking characters
Quick Facts about TLTS
Center for Advanced Research in Technology for Education (CARTE) at the University of Southern California
Dr. Lewis Johnson, director of CARTE, linguist and A.I. expert
$7.4 million project funded by DARPA
Being developed as part of the Training Superiority Program (DARWARS)
“DARWARS seeks to transform military training by providing continuously-available, on-demand mission-level training for all forces at all echelons”
To be deployed late this year
Full program to include about 80 hours of instruction with a vocabulary of around 500 carefully chosen words
Objectives of the TLTS
Help military and civilian personnel gain an understanding of a foreign language and culture so they can learn to communicate peacefully and effectively with foreigners in their native language
Eliminate heavy reliance on language experts
Deemphasize written language -- focus on spoken communication skills for immediate application
Learn the role of nonverbal communication
Develop a more engaging and motivating learning environment compared to traditional language instruction
Provide training in less commonly taught, difficult to learn languages
Yield rapid acquisition of foreign language skills
System Architecture
Three main components
Mission Skill Builder (MSB): interactive exercises that introduce learner to the vocabulary and pronunciation of the language
Mission Practice Environment (MPE): story-based, interactive video game environment where learners advance through game levels by using their newly acquired linguistic and cultural skills to accomplish particular tasks and missions
Medina Authoring Tool: used to develop curriculum and game content
Common set of services and content databases: Curriculum Database, Pedagogical Agent, Learner Model, and Language Model
Language Model consists of:
Speech Recognizer: used by MSB and MPE
Natural Language Parser: annotates phrases with structural information and refers to relevant grammatical explanations
Error Model: finds and analyzes syntactic and phonological mistakes in the learner’s speech
System Architecture Diagram
[Diagram components: Language Model (NLP Parser, Speech Recognizer, Error Model), Mission Skill Builder (MSB), Mission Practice Environment (MPE), Pedagogical Agent, Learner Model, MEDINA Authoring Tool, Curriculum Material]
Source: Johnson, W. L., S. Marsella, N. Mote, H. Vilhjalmsson, S. Narayanan, and S. Choi, Tactical Language Training System: Supporting the Rapid Acquisition of Foreign Language and Cultural Skills
Mission Skill Builder
Intensive and “intelligent” version of traditional language lab programs where students are exposed to words and phrases pronounced by native speakers, which they imitate and practice
Important innovations:
Speech Recognizer is tailored for learner speech so it is able to evaluate learner’s pronunciation and detect common errors
Pedagogical Agent provides the learner with tailored performance feedback
Learner Model tracks what the learner has mastered and what areas the learner needs to improve
Learning process involves the following steps:
Learner hears Pedagogical Agent pronounce phrase
Learner records himself speaking the phrase
Speech Recognizer analyzes the recording and passes it to the Pedagogical Agent, which provides appropriate feedback based on pronunciation errors and the Learner Model’s learner history
Also instructs students in non-verbal communication
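The learning steps listed for the Mission Skill Builder can be sketched as a small loop. This is only an illustration under stated assumptions, not the TLTS implementation: `LearnerModel`, `analyze_recording`, `give_feedback`, and `practice` are hypothetical names, the Speech Recognizer is replaced by a stub that compares phoneme lists position by position, and the phoneme spelling of the greeting in the usage note is an informal transcription chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LearnerModel:
    """Hypothetical learner history: error counts per sound and mastered phrases."""
    error_counts: Counter = field(default_factory=Counter)
    mastered: set = field(default_factory=set)

    def update(self, phrase, errors):
        if errors:
            self.error_counts.update(errors)
            self.mastered.discard(phrase)
        else:
            self.mastered.add(phrase)


def analyze_recording(target_phonemes, spoken_phonemes):
    """Stub for the Speech Recognizer: return the target sounds that were
    mispronounced (assumes both lists have the same length)."""
    return [t for t, s in zip(target_phonemes, spoken_phonemes) if t != s]


def give_feedback(phrase, errors, model):
    """Stub for the Pedagogical Agent: feedback uses current errors and history."""
    if not errors:
        return f"Good: '{phrase}' was pronounced correctly."
    repeated = [e for e in errors if model.error_counts[e] > 0]
    if repeated:
        return f"Keep working on {', '.join(repeated)}; it has been difficult before."
    return f"Listen again and watch the sound(s): {', '.join(errors)}."


def practice(phrase, target, spoken, model):
    """One pass through the loop: analyze, give feedback, update the history."""
    errors = analyze_recording(target, spoken)
    feedback = give_feedback(phrase, errors, model)  # history before this attempt
    model.update(phrase, errors)
    return feedback
```

For example, calling `practice("marHaba", list("marHaba"), list("marhaba"), LearnerModel())` reports the mispronounced "H", and a second identical attempt with the same model produces the history-based message instead.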
Mission Practice Environment
Story-based, interactive video game environment designed to give students an unscripted, unpredictable, and challenging test of their mastery of the skills learned in the MSB
Learner moves a uniformed figure through a videogame-like Lebanese village
Learner speaks into a microphone to control the speech of his character and selects from gestures for nonverbal communication
Can carry on free-form conversation with AI-animated, Arabic-speaking characters, who understand and respond to what is said as long as it is intelligible Arabic
Learner must be careful to use appropriate phrases and gestures
Tests the learner’s ability to carry on two-way communication
Mission Practice Environment
Initial Game Scenario:
“In a scene in a café, Sergeant Smith must try to find out who the village headman is. If he doesn’t act properly, one of the café patrons will jump up and demand to know who he really is. If tensions escalate, the patron will eventually accuse the sergeant of being a CIA agent. Standing in the background is the pedagogical agent, here in the role of aide, who can assist the learner by translating phrases or offering suggestions of what to say.”
Mission Practice Environment
Implemented as a Total Conversion Mod to Unreal Tournament 2003
Removed all the combat elements
Added a speech recognition engine
Added intelligent agents that react to the learner’s speech and pronunciation
UnrealWorld: renders the game on the screen and provides a user interface
The Learner Model maintained by the Pedagogical Agent controls the aide’s behavior in the game
Adapts to each individual, noting consistent errors or difficulties, which can be targeted for remedial practice in the MSB
Based on the graphics capabilities of Unreal Tournament
Mission Practice Environment
MissionEngine
Controls what happens in the game, while the UnrealWorld renders it on the screen
Represents each character in the story as an agent with its own goals, relationships, and private beliefs
High-level director agent influences the character agents
Controls how the story unfolds
Ensures the pedagogical and dramatic goals are met
Backend written in Python
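The idea of character agents with their own goals and beliefs, nudged by a director agent, can be sketched as follows. This is a toy model for illustration only and does not reflect the actual MissionEngine or PsychSim code: the class names, the single `trust` value standing in for a relationship, and all thresholds are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterAgent:
    """Hypothetical story character with goals, private beliefs, and a
    relationship (trust) toward the learner's character."""
    name: str
    goals: list
    beliefs: dict = field(default_factory=dict)
    trust: float = 0.5

    def react(self, polite):
        """Update trust from the learner's last action and pick a response."""
        change = 0.2 if polite else -0.3
        self.trust = min(1.0, max(0.0, self.trust + change))
        if self.trust < 0.3:
            return "challenge"   # e.g. demand to know who the learner is
        if self.trust > 0.6:
            return "cooperate"
        return "neutral"


@dataclass
class DirectorAgent:
    """Hypothetical director: keeps the scene recoverable so the learner
    can still reach the lesson's goal."""
    min_trust: float = 0.1

    def influence(self, agents):
        for agent in agents:
            if agent.trust < self.min_trust:
                agent.trust = self.min_trust
```

In this sketch a patron who is treated rudely several times ends up challenging the learner, while the director prevents trust from collapsing entirely so the story can continue.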
MissionEngine System Architecture
Pedagogical Agent: intelligent agent that provides feedback and encouragement to the learner based on pronunciation correctness and learner history; implemented in Python
Automatic Speech Recognizer: speech recognition system built on top of the Cambridge Hidden Markov Model Toolkit (HTK); implemented as a C++ library
PsychSim: decision-making framework of the virtual characters; models the goals, motivations, and world beliefs of the characters; implemented in Python
SocialPuppets: module that controls physical character behavior in the environment given a description of the character's intent from PsychSim; implemented in Python
Gamebots: interface that allows Unreal Tournament bots to be controlled
DataManager: storage module used for all data in the system; implemented in C++ as an XML database
MissionEngine Architecture
http://www.python.org/pycon/2005/papers/4/MissionEngine.WhitePaper.pdf
Speech Recognition
Hidden Markov Model Automatic Speech Recognizer bootstrapped from English and Modern Standard Arabic speech and enhanced with data from native and learner Lebanese Arabic speech
Implemented using the Cambridge HTK
Trained on a Modern Standard Arabic dataset with around 10 hours of native speech, as well as approximately one hour of non-native speech samples
Learner speech data is being collected to train the ASR
Non-native pronunciation variants were generated for every utterance in the system and loaded into the Arabic ASR
Hypothesis Rejection Module compares HMM likelihoods from an Arabic recognizer, English recognizer, and pronunciation variants to detect whether the user has spoken the right utterance and provide correct feedback
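The comparison made by the Hypothesis Rejection Module can be illustrated with a short sketch. It assumes that each recognizer returns a log-likelihood for the recording and that the highest score wins; the function name, the return values, and the tie-breaking rule are hypothetical and are not taken from the TLTS code.

```python
def classify_utterance(arabic_ll, english_ll, variant_lls):
    """Pick the best-scoring hypothesis for one recording.

    arabic_ll   -- log-likelihood of the target utterance (Arabic recognizer)
    english_ll  -- log-likelihood under the English recognizer
    variant_lls -- dict mapping a mispronunciation label to its log-likelihood

    Returns (decision, detail), where decision is "accept", "reject", or "error".
    """
    candidates = [("accept", None, arabic_ll), ("reject", "english", english_ll)]
    candidates += [("error", label, ll) for label, ll in variant_lls.items()]
    # max() keeps the first maximum, so ties favor accepting the utterance.
    decision, detail, _ = max(candidates, key=lambda c: c[2])
    return decision, detail
```

With made-up scores, `classify_utterance(-130.0, -140.0, {"H->h": -120.0})` selects the mispronunciation variant, which would let the feedback point at the specific sound rather than simply rejecting the attempt.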
Speech Recognition
Dynamic switching of recognition grammars allows the recognizer to focus on recognizing the words and phrases that are likely to occur in a given learning context
For the MSB, recognizer is constrained to recognize only the pronunciation variants of the utterances being taught
For the MPE, the recognizer uses a finite state graph, which has all the utterances in the MSB as parallel paths
Focuses on recognizing the most likely utterance from among a set of utterances that are appropriate for a given scene
Enables the system to simulate dialogue with other characters
If the recognizer recognizes a phrase that doesn’t fit into the current context, the character indicates that he does not understand
If the recognizer fails to recognize an utterance, the aide makes a suggestion to the learner
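The MPE behavior described above can be sketched as a lookup over parallel paths followed by a context check. This is a simplified illustration, not the HTK grammar used by the system: the utterance identifiers, the scene table, the score threshold, and the function names are all assumptions.

```python
# Hypothetical utterance IDs taught in the MSB (the parallel paths).
MSB_UTTERANCES = {"greeting", "ask_name", "ask_headman", "ask_directions"}

# Hypothetical table of utterances that make sense in each scene.
SCENE_CONTEXT = {
    "cafe": {"greeting", "ask_name", "ask_headman"},
    "street": {"greeting", "ask_directions"},
}


def recognize(scores, threshold=-200.0):
    """Return the most likely known utterance, or None if nothing scores
    above the (assumed) rejection threshold."""
    known = {u: s for u, s in scores.items() if u in MSB_UTTERANCES}
    if not known:
        return None
    best = max(known, key=known.get)
    return best if known[best] >= threshold else None


def respond(scene, scores):
    """Decide who reacts: the aide, or the character the learner addressed."""
    utterance = recognize(scores)
    if utterance is None:
        return ("aide", "suggests a phrase")
    if utterance not in SCENE_CONTEXT[scene]:
        return ("character", "does not understand")
    return ("character", f"responds to {utterance}")
```

Switching scenes only changes which entry of `SCENE_CONTEXT` is consulted, which mirrors the idea of switching recognition grammars by learning context.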
Error Detection and Modeling
The ASR detects learner errors and passes them to the Pedagogical Agent which provides feedback to the learner
Aims to recognize (1) what the learner intended to say and (2) how the learner’s utterance deviated from it
For each lesson or exercise, a recognition grammar is loaded that detects correct responses for that context as well as likely learner errors
Speech Recognizer must recognize both true Arabic words and mispronounced Arabic words since it is dealing with learner speech
The variability of learner language makes robustness difficult to achieve
Inaccuracies in the speech analysis algorithms caused utterances that were pronounced correctly but slowly to be rejected; the scoring has been modified to give these utterances higher scores so they are not rejected as errors
Can reduce recognition vocabulary size because the learner is taught a small subset of the language
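A recognition grammar that covers both correct responses and likely learner errors can be sketched by expanding each utterance into pronunciation variants. The example below is a hedged illustration: the substitution table is an assumed set of confusions for English speakers, the phoneme spelling is informal, and `build_grammar` and `diagnose` are hypothetical helpers rather than part of the TLTS error model.

```python
from itertools import product

# Assumed confusions (target sound -> likely learner substitutions).
SUBSTITUTIONS = {"H": ["h"], "q": ["k"]}


def pronunciation_variants(phonemes):
    """All combinations of correct sounds and assumed substitutions.
    The first variant produced is always the correct pronunciation."""
    options = [[p] + SUBSTITUTIONS.get(p, []) for p in phonemes]
    return [list(v) for v in product(*options)]


def build_grammar(utterances):
    """Map each variant to (utterance id, list of deviations)."""
    grammar = {}
    for uid, phonemes in utterances.items():
        for variant in pronunciation_variants(phonemes):
            deviations = [
                (i, target, spoken)
                for i, (target, spoken) in enumerate(zip(phonemes, variant))
                if target != spoken
            ]
            # Keep the first entry if two utterances share a variant.
            grammar.setdefault(tuple(variant), (uid, deviations))
    return grammar


def diagnose(grammar, recognized):
    """Return the intended utterance and deviations, or None if out of grammar."""
    return grammar.get(tuple(recognized))
```

Because the learner is taught only a small subset of the language, the expanded grammar stays small: one utterance with a single confusable sound yields just two entries.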
Video Clip
http://www.isi.edu/~jmoore/Mankin/TLMankin256.wmv
Summary
Help people gain an understanding of a foreign language and culture so they can communicate peacefully and effectively with foreigners in their native language
Focus on “tactical languages” to accomplish specific missions
Focus on spoken communication skills
Rapid acquisition of foreign language skills saves time
Removing the need for interpreters saves money
Model learner speech and common errors, including English language utterances
More engaging learning experience (and video games are fun!)
References
“DARPA Tactical Language Training Project”: http://www.isi.edu/isd/carte/proj_tactlang/tactical_lang_overview.pdf
“Experts Use AI to Help GIs Learn Arabic”: http://www.usc.edu/uscnews/stories/10321.html
HTK Speech Recognition Toolkit: http://htk.eng.cam.ac.uk/
Johnson, W. L., C. Beal, A. Fowles-Winkler, U. Lauper, S. Marsella, S. Narayanan, D. Papachristou, and H. Vilhjalmsson, Tactical Language Training System: An Interim Report
Johnson, W. L., S. Marsella, N. Mote, H. Vilhjalmsson, S. Narayanan, and S. Choi, Tactical Language Training System: Supporting the Rapid Acquisition of Foreign Language and Cultural Skills
“Mission to Arabic: It's Not Your Father’s Language Lab”: http://www.isi.edu/stories/print/78.html
“MissionEngine: Multi-system integration using Python in the Tactical Language Project”: http://www.python.org/pycon/2005/papers/4/MissionEngine.WhitePaper.pdf
Mote, N., W. L. Johnson, A. Sethy, J. Silva, and S. Narayanan, Tactical Language Detection and Modeling of Learner Speech Errors: The case of Arabic tactical language training for American English speakers
“The Tactical Language Project at CARTE”: http://www.isi.edu/isd/carte/proj_tactlang/
Additional Resources
DARPA Training Superiority Program (DARWARS): http://www.darpa.mil/dso/thrust/biosci/training_super.htm
Mission Rehearsal Exercise Project: http://www.ict.usc.edu/disp.php?bd=proj_mre
NPR, “A Virtual Course in Iraqi Arabic”: http://www.npr.org/templates/story/story.php?storyId=4503426
Newsweek, “Arabic: High-Tech Tutor”: http://www.msnbc.msn.com/id/5146254/site/newsweek/
The Pulse Journal, “Researchers tame violent video game to keep troops safe in Iraq”: http://www.pulsejournal.com/news/content/shared/news/nation/stories/0222_TRAINING_GAME.html
Wired Magazine, “The War Room”: http://www.wired.com/wired/archive/12.09/warroom.html