tactical language training system natalie macconnell april 21, 2005

Tactical Language Training System

Natalie MacConnell

April 21, 2005

Organization of Talk

What is the Tactical Language Training System?

Objectives and Quick Facts

System Architecture

Mission Skill Builder

Mission Practice Environment

Speech Recognition and Error Modeling

Demonstration Video

Summary

What is the Tactical Language Training System (TLTS)?

Intelligent tutoring system designed to aid military personnel in rapidly acquiring language and cultural skills in order to carry out peaceful and effective communication in foreign countries

Focuses on “tactical languages”: subsets of linguistic, gestural, and cultural knowledge and skills necessary to accomplish the task at hand

Currently developed for Levantine and Iraqi Arabic

Virtual tutor coaches learners in pronunciation, assesses their mastery, and provides assistance

Learners then apply their language skills to perform missions in an interactive story environment, where they communicate with autonomous, animated, Arabic-speaking characters

Quick Facts about TLTS

Center for Advanced Research in Technology for Education (CARTE) at the University of Southern California

Dr. Lewis Johnson, director of CARTE, linguist and A.I. expert

$7.4 million project funded by DARPA

Being developed as part of the Training Superiority Program (DARWARS)

“DARWARS seeks to transform military training by providing continuously-available, on-demand mission-level training for all forces at all echelons”

To be deployed late this year

Full program to include about 80 hours of instruction with a vocabulary of around 500 carefully chosen words

Objectives of the TLTS

Help military and civilian personnel gain an understanding of a foreign language and culture so they can learn to communicate peacefully and effectively with foreigners in their native language

Eliminate heavy reliance on language experts

Deemphasize written language -- focus on spoken communication skills for immediate application

Learn the role of nonverbal communication

Develop a more engaging and motivating learning environment compared to traditional language instruction

Provide training in less commonly taught, difficult to learn languages

Yield rapid acquisition of foreign language skills

System Architecture

Three main components

Mission Skill Builder (MSB): interactive exercises that introduce learner to the vocabulary and pronunciation of the language

Mission Practice Environment (MPE): story-based, interactive video game environment where learners advance through game levels by using their newly acquired linguistic and cultural skills to accomplish particular tasks and missions

Medina Authoring Tool: used to develop curriculum and game content

Common set of services and content databases: Curriculum Database, Pedagogical Agent, Learner Model, and Language Model

Language Model consists of:

Speech Recognizer: used by MSB and MPE

Natural Language Parser: annotates phrases with structural information and refers to relevant grammatical explanations

Error Model: finds and analyzes syntactic and phonological mistakes in the learner’s speech

System Architecture Diagram

[Figure: system architecture diagram. Components shown: Mission Skill Builder (MSB), Mission Practice Environment (MPE), MEDINA Authoring Tool, Pedagogical Agent, Learner Model, Curriculum Material, and Language Model (NLP Parser, Speech Recognizer, Error Model). Source: Johnson, W. L., S. Marsella, N. Mote, H. Vilhjalmsson, S. Narayanan, and S. Choi, Tactical Language Training System: Supporting the Rapid Acquisition of Foreign Language and Cultural Skills]

Mission Skill Builder

Intensive and “intelligent” version of traditional language lab programs where students are exposed to words and phrases pronounced by native speakers, which they imitate and practice

Important innovations:

Speech Recognizer is tailored for learner speech so it is able to evaluate learner’s pronunciation and detect common errors

Pedagogical Agent provides the learner with tailored performance feedback

Learner Model tracks what the learner has mastered and what areas the learner needs to improve

Learning process involves the following steps:

Learner hears Pedagogical Agent pronounce phrase

Learner records himself speaking the phrase

Speech Recognizer analyzes the recording and passes it to the Pedagogical Agent, which provides appropriate feedback based on pronunciation errors and the Learner Model’s learner history

Also instructs students in non-verbal communication
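The learning steps above can be sketched as a small loop. This is only an illustration under stated assumptions: the names `LearnerModel`, `recognize`, `give_feedback`, and `TARGET_PHONEMES` are hypothetical, the "recording" is a list of phoneme symbols rather than audio, and the transliterated phrase is a toy example, not taken from the TLTS curriculum.

```python
from dataclasses import dataclass, field

# Toy pronunciation table (hypothetical); "H" stands for a sound that
# English speakers tend to replace with a plain "h".
TARGET_PHONEMES = {"marHaba": ["m", "a", "r", "H", "a", "b", "a"]}


@dataclass
class LearnerModel:
    """Hypothetical learner history: attempts and failed attempts per phrase."""
    attempts: dict = field(default_factory=dict)
    errors: dict = field(default_factory=dict)

    def record(self, phrase, detected_errors):
        self.attempts[phrase] = self.attempts.get(phrase, 0) + 1
        if detected_errors:
            self.errors[phrase] = self.errors.get(phrase, 0) + 1


def recognize(phrase, recording):
    """Stand-in for the Speech Recognizer: list the pronunciation errors."""
    target = TARGET_PHONEMES[phrase]
    errors = [f"expected '{t}', heard '{h}'"
              for t, h in zip(target, recording) if t != h]
    if len(recording) != len(target):
        errors.append("wrong number of sounds")
    return errors


def give_feedback(phrase, errors, learner):
    """Stand-in for the Pedagogical Agent: feedback depends on the current
    errors and on the learner's history with this phrase."""
    learner.record(phrase, errors)
    if not errors:
        return "Good."
    if learner.errors[phrase] > 1:
        return "Still not right; listen again: " + "; ".join(errors)
    return "Almost: " + "; ".join(errors)
```

A session would call `recognize` on each recording and pass the result to `give_feedback`, so that a repeated mistake produces different feedback than a first one.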

Mission Practice Environment

Story-based, interactive video game environment designed to give students an unscripted, unpredictable, and challenging test of their mastery of the skills learned in the MSB

Learner moves a uniformed figure through a videogame-like Lebanese village

Learner speaks into a microphone to control the speech of his character and selects from gestures for nonverbal communication

Can carry on free-form conversation with AI-animated Arabic-speaking characters, who understand what is said, provided it is intelligible Arabic, and then respond

Learner must be careful to use appropriate phrases and gestures

Tests the learner’s ability to carry on two-way communication


Initial Game Scenario:

“In a scene in a café, Sergeant Smith must try to find out who the village headman is. If he doesn’t act properly, one of the café patrons will jump up and demand to know who he really is. If tensions escalate, the patron will eventually accuse the sergeant of being a CIA agent. Standing in the background is the pedagogical agent, here in the role of aide, who can assist the learner by translating phrases or offering suggestions of what to say.”


Implemented as a Total Conversion Mod to Unreal Tournament 2003

Removed all the combat elements

Added a speech recognition engine

Added intelligent agents that react to the learner’s speech and pronunciation

UnrealWorld: renders the game on the screen and provides a user interface

The Learner Model maintained by the Pedagogical Agent controls the aide’s behavior in the game

Adapts to each individual, noting consistent errors or difficulties, which can be targeted for remedial practice in the MSB

Based on the graphics capabilities of Unreal Tournament
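One way to picture how a learner model could note consistent errors and pick targets for remedial practice is a simple counter with a threshold. This is a sketch, not the TLTS Learner Model: the class name `ErrorTracker`, the threshold, and the error labels are all assumptions made for illustration.

```python
from collections import Counter


class ErrorTracker:
    """Hypothetical tracker: an error seen `threshold` times or more is
    treated as consistent and flagged for remedial practice."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = Counter()

    def note(self, error_type):
        """Record one occurrence of an error type (any hashable label)."""
        self.counts[error_type] += 1

    def remedial_targets(self):
        """Return the consistent errors, sorted for a stable order."""
        return sorted(e for e, n in self.counts.items() if n >= self.threshold)
```

Errors that occur only once are ignored, so a single slip does not send the learner back to the Mission Skill Builder.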


MissionEngine

Controls what happens in the game, while the UnrealWorld renders it on the screen

Represents each character in the story as an agent with its own goals, relationships, and private beliefs

High-level director agent influences the character agents

Controls how the story unfolds

Ensures the pedagogical and dramatic goals are met

Backend written in Python
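The agent structure described above can be illustrated with a toy model. It is a sketch under loud assumptions: `CharacterAgent`, `Director`, the integer trust scale, and the reaction labels are invented here and do not reflect the actual MissionEngine or PsychSim code.

```python
from dataclasses import dataclass


@dataclass
class CharacterAgent:
    """Hypothetical character with goals and one private belief:
    how much it trusts the learner, on an integer scale 0..4."""
    name: str
    goals: list
    trust: int = 2

    def react(self, polite):
        # A polite action raises trust, an improper one lowers it.
        self.trust = min(4, self.trust + 1) if polite else max(0, self.trust - 1)
        if self.trust >= 3:
            return "cooperates"
        if self.trust <= 1:
            return "suspicious"
        return "neutral"


class Director:
    """Hypothetical director: lets every character react, then steers the
    story by sending in the aide when all characters have turned suspicious."""

    def __init__(self, agents):
        self.agents = agents

    def step(self, learner_polite):
        reactions = {a.name: a.react(learner_polite) for a in self.agents}
        if all(r == "suspicious" for r in reactions.values()):
            reactions["aide"] = "offers a suggestion"
        return reactions
```

In this sketch the director never overrides a character; it only adds help when tension is high, which is one simple way to balance dramatic and pedagogical goals.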

MissionEngine System Architecture

Pedagogical Agent: intelligent agent that provides feedback and encouragement to the learner based on pronunciation correctness and learner history; implemented in Python

Automatic Speech Recognizer: speech recognition system built on top of the Cambridge Hidden Markov Model Toolkit (HTK); implemented as a C++ library

PsychSim: decision-making framework of the virtual characters; models the goals, motivations, and world beliefs of the characters; implemented in Python

SocialPuppets: module that controls physical character behavior in the environment given a description of the character's intent from PsychSim; implemented in Python

Gamebots: interface that allows Unreal Tournament bots to be controlled

DataManager: storage module used for all data in the system; implemented in C++ as an XML database

MissionEngine Architecture

http://www.python.org/pycon/2005/papers/4/MissionEngine.WhitePaper.pdf

Speech Recognition

Hidden Markov Model Automatic Speech Recognizer bootstrapped from English and Modern Standard Arabic speech and enhanced with data from native and learner Lebanese Arabic speech

Implemented using the Cambridge HTK

Trained on a Modern Standard Arabic dataset with around 10 hours of native speech, as well as approximately one hour of non-native speech samples

Learner speech data is being collected to train the ASR

Generated non-native pronunciation variations for every utterance in the system and loaded them into the Arabic ASR

Hypothesis Rejection Module compares HMM likelihoods from an Arabic recognizer, English recognizer, and pronunciation variants to detect whether the user has spoken the right utterance and provide correct feedback
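The comparison performed by the Hypothesis Rejection Module might look roughly like the following. The slides only say that likelihoods from the three sources are compared; the decision rule here (pick the highest log-likelihood), the function name `classify_utterance`, and the labels are assumptions for illustration.

```python
def classify_utterance(canonical_ll, english_ll, variant_lls):
    """Compare log-likelihoods and decide how to treat the utterance.

    canonical_ll: score of the correct Arabic pronunciation
    english_ll:   score from the English recognizer
    variant_lls:  {variant_name: score} for non-native pronunciation variants
    Returns (decision, variant_name_or_None).
    """
    scores = {"canonical": canonical_ll, "english": english_ll}
    scores.update({f"variant:{name}": ll for name, ll in variant_lls.items()})
    best = max(scores, key=scores.get)
    if best == "canonical":
        return ("accept", None)
    if best == "english":
        # The English model explains the audio best: wrong utterance.
        return ("reject", None)
    return ("mispronounced", best.split(":", 1)[1])
```

For example, `classify_utterance(-130.0, -120.0, {"h_for_H": -110.0})` returns `("mispronounced", "h_for_H")`, which gives the Pedagogical Agent a specific error to comment on.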

Speech Recognition

Dynamic switching of recognition grammars allows the recognizer to focus on recognizing the words and phrases that are likely to occur in a given learning context

For the MSB, recognizer is constrained to recognize only the pronunciation variants of the utterances being taught

For the MPE, the recognizer uses a finite state graph, which has all the utterances in the MSB as parallel paths

Focuses on recognizing the most likely utterance from among a set of utterances that are appropriate for a given scene

Enables the system to simulate dialogue with other characters

If the recognizer recognizes a phrase that doesn’t fit into the current context, the character indicates that he does not understand

If the recognizer fails to recognize an utterance, the aide makes a suggestion to the learner
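The MPE behavior described above can be sketched as two steps: pick the best-scoring path among the parallel utterances, then check it against the current scene. Everything named here is hypothetical (`SCENE_UTTERANCES`, `recognize`, `respond`, the score floor, and the transliterated phrases); the real system works on audio with HTK grammars.

```python
# Hypothetical per-scene sets of appropriate utterances.
SCENE_UTTERANCES = {
    "cafe": {"marHaba", "shukran"},
    "farewell": {"ma'a salaama", "shukran"},
}


def recognize(path_scores, floor=-200.0):
    """Pick the best of the parallel paths; None means recognition failed.

    path_scores: {utterance: log-likelihood}; `floor` is an assumed cutoff.
    """
    if not path_scores:
        return None
    best = max(path_scores, key=path_scores.get)
    return best if path_scores[best] >= floor else None


def respond(scene, recognized):
    """Choose the reaction for the current scene (the dynamic context)."""
    if recognized is None:
        return "aide makes a suggestion"
    if recognized in SCENE_UTTERANCES[scene]:
        return "character replies"
    return "character does not understand"
```

Switching scenes only changes which set `respond` consults, which mirrors the idea of switching recognition grammars by context.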

Error Detection and Modeling

The ASR detects learner errors and passes them to the Pedagogical Agent which provides feedback to the learner

Aims to recognize (1) what the learner intended to say and (2) the deviations the learner made from what he intended to say

For each lesson or exercise, a recognition grammar is loaded that detects correct responses for that context as well as likely learner errors

Speech Recognizer must recognize both true Arabic words and mispronounced Arabic words since it is dealing with learner speech

The variability of learner language makes robustness difficult to achieve

Inaccuracies in the speech analysis algorithms caused utterances that were pronounced correctly but slowly to be rejected -- the recognizer has been modified to give higher scores to these utterances so they are not rejected as errors

Can reduce recognition vocabulary size because the learner is taught a small subset of the language
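A per-lesson recognition grammar that covers both correct responses and likely learner errors can be pictured as a lookup from each recognizable form to the intended phrase and the deviation. This is a minimal sketch: `build_grammar`, `diagnose`, the lesson format, and the example variant are assumptions, and the strings stand in for recognizer output.

```python
def build_grammar(lesson):
    """Build a lookup for one lesson.

    lesson: {phrase: {variant_form: error_label}}
    Returns {recognized_form: (intended_phrase, error_label_or_None)}.
    """
    grammar = {}
    for phrase, variants in lesson.items():
        grammar[phrase] = (phrase, None)           # correct response
        for variant, label in variants.items():
            grammar[variant] = (phrase, label)     # likely learner error
    return grammar


def diagnose(grammar, recognized):
    """Return (what the learner intended, how the utterance deviated)."""
    return grammar.get(recognized, (None, "out of grammar"))
```

Because only the phrases of the current lesson and their variants are loaded, the lookup stays small, which matches the point about a reduced recognition vocabulary.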

Video Clip

http://www.isi.edu/~jmoore/Mankin/TLMankin256.wmv

Summary

Help people gain an understanding of a foreign language and culture so they can communicate peacefully and effectively with foreigners in their native language

Focus on “tactical languages” to accomplish specific missions

Focus on spoken communication skills

Rapid acquisition of foreign language skills saves time

Removing the need for interpreters saves money

Model learner speech and common errors, including English language utterances

More engaging learning experience (and video games are fun!)

References

“DARPA Tactical Language Training Project”: http://www.isi.edu/isd/carte/proj_tactlang/tactical_lang_overview.pdf

“Experts Use AI to Help GIs Learn Arabic”: http://www.usc.edu/uscnews/stories/10321.html

HTK Speech Recognition Toolkit: http://htk.eng.cam.ac.uk/

Johnson, W. L., C. Beal, A. Fowles-Winkler, U. Lauper, S. Marsella, S. Narayanan, D. Papachristou, and H. Vilhjalmsson, Tactical Language Training System: An Interim Report

Johnson, W. L., S. Marsella, N. Mote, H. Vilhjalmsson, S. Narayanan, and S. Choi, Tactical Language Training System: Supporting the Rapid Acquisition of Foreign Language and Cultural Skills

“Mission to Arabic: It's Not Your Father’s Language Lab”: http://www.isi.edu/stories/print/78.html

“MissionEngine: Multi-system integration using Python in the Tactical Language Project”: http://www.python.org/pycon/2005/papers/4/MissionEngine.WhitePaper.pdf

Mote, N., W. L. Johnson, A. Sethy, J. Silva, and S. Narayanan, Tactical Language Detection and Modeling of Learner Speech Errors: The case of Arabic tactical language training for American English speakers

“The Tactical Language Project at CARTE”: http://www.isi.edu/isd/carte/proj_tactlang/

Additional Resources

DARPA Training Superiority Program (DARWARS): http://www.darpa.mil/dso/thrust/biosci/training_super.htm

Mission Rehearsal Exercise Project: http://www.ict.usc.edu/disp.php?bd=proj_mre

NPR, “A Virtual Course in Iraqi Arabic”: http://www.npr.org/templates/story/story.php?storyId=4503426

Newsweek, “Arabic: High-Tech Tutor”: http://www.msnbc.msn.com/id/5146254/site/newsweek/

The Pulse Journal, “Researchers tame violent video game to keep troops safe in Iraq”: http://www.pulsejournal.com/news/content/shared/news/nation/stories/0222_TRAINING_GAME.html

Wired Magazine, “The War Room”: http://www.wired.com/wired/archive/12.09/warroom.html
