Mokusei: Multilingual Conversational Interfaces

Research Objectives

• Explore language-independent approaches to speech understanding and generation

• Port human-language technologies for English conversational interfaces to Japanese

• Use existing Jupiter domain as test case
  – A telephone-only conversational interface for weather information
  – More than 500 cities worldwide (~350 in US)
  – On-line information from four Web sites
  – Use the Galaxy client-server architecture

• Speech Recognition (SUMMIT: Glass et al., ICSLP '96)
  – Lexicon: >2,000 words with phonemic pronunciations
  – Phonological modeling:
    * Japanese-specific phonological rules, e.g., desu ka /d e s k a/
    * Japanese phonetic units mapped into English ones
  – Acoustic modeling:
    * Used English models to generate forced transcriptions of utterances
    * Retrained acoustic models to create hybrid models
  – Language modeling:
    * Class n-gram using 60 word classes, trained on ~3,500 read and spontaneous sentences
    * Also exploring a class n-gram derived automatically from TINA
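The class n-gram idea in the language modeling item can be illustrated with a minimal sketch of a class bigram, where P(w_i | w_{i-1}) is approximated as P(c_i | c_{i-1}) · P(w_i | c_i). The word classes, the romanized toy sentences, and all function names below are assumptions made for illustration; they are not the 60 classes or the training data used in Mokusei, and no smoothing is applied.

```python
from collections import Counter

# Hypothetical word-to-class map; any word not listed forms its own class.
WORD_CLASS = {
    "tokyo": "CITY", "boston": "CITY",
    "tenki": "WEATHER", "kion": "WEATHER",
}

def cls(word):
    """Return the class of a word (the word itself if it has no class)."""
    return WORD_CLASS.get(word, word)

def train(sentences):
    """Collect the counts needed for an unsmoothed class bigram."""
    class_bigrams = Counter()   # (previous class, current class)
    class_context = Counter()   # previous class
    word_counts = Counter()     # current word
    class_counts = Counter()    # current class
    for sent in sentences:
        words = ["<s>"] + sent.split() + ["</s>"]
        for prev, cur in zip(words, words[1:]):
            class_bigrams[(cls(prev), cls(cur))] += 1
            class_context[cls(prev)] += 1
            word_counts[cur] += 1
            class_counts[cls(cur)] += 1
    return class_bigrams, class_context, word_counts, class_counts

def prob(model, prev, cur):
    """P(cur | prev) = P(class(cur) | class(prev)) * P(cur | class(cur))."""
    class_bigrams, class_context, word_counts, class_counts = model
    if class_context[cls(prev)] == 0 or class_counts[cls(cur)] == 0:
        return 0.0
    p_class = class_bigrams[(cls(prev), cls(cur))] / class_context[cls(prev)]
    p_word = word_counts[cur] / class_counts[cls(cur)]
    return p_class * p_word

# Toy training data (romanized, illustrative only).
model = train(["tokyo no tenki wa", "boston no kion wa"])
print(prob(model, "<s>", "tokyo"))
```

Because counts are pooled per class, both city names share the same sentence-initial statistics even though each appears only once in the toy data.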

• Speech Synthesis

– NTT Fluet text-to-speech system

Note: Sample sentences from Japanese speakers can be played from PC
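The Japanese-specific phonological rule listed under Speech Recognition (desu ka realized as /d e s k a/) can be pictured as a rewrite over a phoneme sequence. The sketch below is a simplified, hypothetical rule that deletes /u/ between /s/ and a voiceless consonant; the rule's exact conditions, the consonant set, and the function name are assumptions, not the rules used in SUMMIT.

```python
# Assumed set of voiceless consonants, for illustration only.
VOICELESS = {"k", "s", "t", "h", "p"}

def apply_u_deletion(phonemes):
    """Drop /u/ when it sits between /s/ and a voiceless consonant."""
    out = []
    for i, ph in enumerate(phonemes):
        prev_ph = phonemes[i - 1] if i > 0 else None
        next_ph = phonemes[i + 1] if i + 1 < len(phonemes) else None
        if ph == "u" and prev_ph == "s" and next_ph in VOICELESS:
            continue  # the rule fires: the vowel is not pronounced
        out.append(ph)
    return out

baseform = "d e s u k a".split()
surface = apply_u_deletion(baseform)
print(" ".join(surface))  # d e s k a
```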

S. Seneff, J. Glass, T.J. Hazen, J. Polifroni, and V. Zue
MIT Laboratory for Computer Science

Y. Minami
NTT Cyberspace Laboratories

[Figure: system diagram with blocks labeled Speech Understanding and Speech Generation (each appearing twice), Common Semantic Frame, and DATABASE]

Language as Interface

• Language Understanding (TINA: Seneff, Comp Ling, '92)
  – Japanese grammar contains >900 unique non-terminals
  – Translation file maps Japanese words to English equivalents
  – Produces same semantic frame as for English inputs
  – Left-recursive structure of Japanese requires look-ahead to resolve role of content words
    * Parse each content word into structure labeled “object”
    * Drop off “object” after next particle, which defines role and position in hierarchy
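The "object"-then-particle strategy described above can be sketched as follows: each content word is held as a pending "object" until the next particle arrives and names its role. This is a flat, single-pass simplification, not TINA's top-down parser, and it does not model position in the hierarchy. The particle-to-role table, the role labels, and the romanized input are all hypothetical.

```python
# Hypothetical particle-to-role table, for illustration only.
PARTICLE_ROLE = {
    "wa": "topic",
    "ga": "subject",
    "o": "direct_object",
    "no": "modifier",
}

def assign_roles(tokens):
    """Hold each content word as a pending 'object' until a particle
    arrives, then relabel it with the role that particle defines."""
    frame = []
    pending = None  # content word whose role is not yet known
    for tok in tokens:
        if tok in PARTICLE_ROLE:
            if pending is not None:
                frame.append((PARTICLE_ROLE[tok], pending))
                pending = None
        else:
            if pending is not None:
                # No particle followed the previous word: role unresolved.
                frame.append(("object", pending))
            pending = tok
    if pending is not None:
        frame.append(("object", pending))
    return frame

roles = assign_roles("tokyo no tenki wa".split())
print(roles)  # [('modifier', 'tokyo'), ('topic', 'tenki')]
```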

• Language Generation (GENESIS: Glass et al., ICSLP '94)
  – Used English language generation tables as template
  – Modified ordering of constituents
  – Provided translation lexicon for words
  – Many language-specific challenges, including constituent ordering, quantifier translation, and multiple meanings
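A minimal sketch of the table-driven idea above: a single frame is rendered in two languages by swapping the constituent order, the translation lexicon, and the function words. The frame, the table layout, and every name below are assumptions made for illustration; they are not the GENESIS table format.

```python
# Hypothetical language-independent frame.
FRAME = {"topic": "weather", "city": "tokyo"}

# Per-language constituent order.
ORDER = {
    "english": ["topic", "city"],
    "japanese": ["city", "topic"],
}

# Per-language translation lexicon (Japanese is romanized here).
LEXICON = {
    "english": {"weather": "weather", "tokyo": "Tokyo"},
    "japanese": {"weather": "tenki", "tokyo": "tokyo"},
}

# Function word emitted right after the named constituent.
FUNCTION_WORD = {
    "english": {"topic": "in"},
    "japanese": {"city": "no"},
}

def generate(frame, language):
    """Render a frame as a word string using the language's tables."""
    words = []
    for key in ORDER[language]:
        if key not in frame:
            continue
        words.append(LEXICON[language][frame[key]])
        marker = FUNCTION_WORD[language].get(key)
        if marker:
            words.append(marker)
    return " ".join(words)

print(generate(FRAME, "english"))   # weather in Tokyo
print(generate(FRAME, "japanese"))  # tokyo no tenki
```

Keeping order, lexicon, and function words in separate tables means a new language only needs new table entries, not new generation code.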

Language as Content

• Use the same internal representation for Japanese and English

• Update from Web sites and satellite feeds at frequent intervals

• Parse all data into semantic frames to capture meaning

• Scan frames for semantic content and prepare new relational database table entries

English: Some thunderstorms may be accompanied by gusty winds and hail

Japanese:

clause: weather_event
  topic: precip_act, name: thunderstorm, num: pl
  quantifier: some
  pred: accompanied_by
  adverb: possibly
  topic: wind, num: pl, pred: gusty
  and: precip_act, name: hail

[Figure: index categories weather, wind, hail, and rain/storm]

Frame indexed under weather, wind, rain, storm, and hail
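The frame-scanning step can be sketched as a walk over the frame that collects index keys and emits one table row per key. The nested layout of the frame, the name-to-key mapping, and the (key, frame id) row format are assumptions for illustration; the poster does not give the actual schema.

```python
# The example frame, with an assumed nesting.
FRAME = {
    "clause": "weather_event",
    "topic": {"type": "precip_act", "name": "thunderstorm",
              "num": "pl", "quantifier": "some"},
    "pred": {
        "name": "accompanied_by",
        "adverb": "possibly",
        "topic": {"name": "wind", "num": "pl", "pred": "gusty"},
        "and": {"type": "precip_act", "name": "hail"},
    },
}

# Hypothetical mapping from frame values to database index keys.
INDEX_KEYS = {
    "weather_event": ["weather"],
    "thunderstorm": ["rain", "storm"],
    "wind": ["wind"],
    "hail": ["hail"],
}

def index_keys(node):
    """Recursively collect index keys for every value in the frame."""
    keys = []
    if isinstance(node, dict):
        for value in node.values():
            keys.extend(index_keys(value))
    elif isinstance(node, str):
        keys.extend(INDEX_KEYS.get(node, []))
    return keys

def table_rows(frame_id, frame):
    """Return one (index key, frame id) row per distinct key."""
    seen = []
    for key in index_keys(frame):
        if key not in seen:
            seen.append(key)
    return [(key, frame_id) for key in seen]

rows = table_rows(1, FRAME)
print(rows)
```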

Lessons Learned

• Our approach to developing multilingual interfaces appears feasible

• A top-down approach to parsing can be made effective for left-recursive languages

• Word order divergence between English and Japanese motivated a redesign of our language generation component

• Novel technique of generating a class n-gram language model using the NL component appears promising

• Involvement of a Japanese researcher is essential

Future Plans

• Additional data collection from native Japanese speakers

– Nearly 1000 sentences were collected in December

• Improvement of individual components
  – Vocabulary coverage, acoustic and language models
  – Parse coverage
  – Continued development of a more sophisticated language generation component

• Expansion of weather content for Japan

