Commercial Applications of Natural Language Technology
April 14, 2005
Deborah A. Dahl
Principal, Conversational Technologies
Chair, World Wide Web Consortium Multimodal Interaction Working Group
[email protected]@conversational-technologies.com

Business Motivations
• save money (operators, phone costs)
• improve user satisfaction
• provide new revenue-generating services
• do something that couldn’t otherwise be done
• legal requirements

Technical Motivations for Natural Language Processing
• indirection
• complexity of intention
• complexity of language
Examples:
I’ve been having a lot of problems with my inkjet printer the last few weeks
system: When do you want to depart?
user: I need to be downtown for an 8:00 meeting
  • travel from San Francisco to Philadelphia
  • aisle seat
  • vegetarian meal
  • no more than one stop
  • red-eye ok if arrives after 6:00 a.m. and doesn’t stop in Chicago
  • wheelchair needed
  • should be on one of my preferred airlines unless fare is much higher
system: Where do you want to go?
user: Philadelphia

Natural Language Understanding in a Spoken Dialog System
[Diagram. Processing components: Recognizer, Meaning extraction, Dialog manager, Back-end application, Generate prompt, Speech generation. Supporting resources: Language model, Acoustic models, NLU rules, DM rules, NLG rules, Templates, TTS, Recordings.]
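The components named in the diagram can be read as a processing chain for one user turn. The following is a minimal Python sketch under that reading, not a description of any deployed system: every function is a hypothetical stand-in, the "audio" is plain text, and the city list and the "Going to ..." prompt are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DialogState:
    """Slots collected so far in a (hypothetical) travel dialog."""
    slots: dict = field(default_factory=dict)


def recognize(audio: str) -> str:
    # Stand-in for the recognizer: a real one would use acoustic and
    # language models; here the "audio" is already text.
    return audio.lower().strip()


def extract_meaning(words: str) -> dict:
    # Stand-in for NLU rules: pick out a known city name, if any.
    cities = {"philadelphia", "denver", "boston"}
    found = [w for w in words.split() if w in cities]
    return {"destination": found[0].title()} if found else {}


def dialog_manager(state: DialogState, meaning: dict) -> str:
    # Stand-in for DM rules: merge the new slots, then choose the next act.
    state.slots.update(meaning)
    return "confirm" if "destination" in state.slots else "ask_destination"


def generate_prompt(act: str, state: DialogState) -> str:
    # Stand-in for NLG templates; TTS or recordings would speak the result.
    templates = {
        "ask_destination": "Where do you want to go?",
        "confirm": "Going to {destination}.",
    }
    return templates[act].format(**state.slots)


def turn(state: DialogState, audio: str) -> str:
    """Run one user turn through the whole pipeline."""
    meaning = extract_meaning(recognize(audio))
    return generate_prompt(dialog_manager(state, meaning), state)
```

Calling `turn` repeatedly with the same `DialogState` carries the collected slots from one turn to the next, which is the role the dialog manager plays in the diagram.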

Natural Language Processing in Commercial Spoken Dialog Systems
• Form-filling applications
• Classification of free-form spoken inputs
• Standards

Form-filling Spoken Dialog Systems
• retail banking
• voice portals
• access to email, voice mail
• travel reservations (Amtrak Julie)
• package tracking

Multimodal Form-Filling using XHTML+Voice
• IBM Chinese food demo

Government Applications -- NASA
• Clarissa -- International Space Station's new speech-powered virtual assistant
• Space station checklists are very long and complex with many branches, which often require 'fill-in-the-blank' answers.
• General purpose 'procedure reader'
• helps astronauts check out space suits and analyze drinking water quality
• Scheduled to begin working with astronauts in May as part of International Space Station Expedition 11.

Statistical Classification
• Sort user’s statement into bins of predefined topics (for example, place order, find out status of order, return item)
• given examples of statements that go in different bins (training data)
• sort new examples into the right bins
• example of applying these kinds of techniques to text – spam filters
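The idea of sorting statements into bins from labeled examples can be sketched with a small naive Bayes classifier. This is one common technique chosen for illustration; the slides do not say which algorithm the commercial systems use, and the training sentences below are invented.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (text, topic). Returns word counts and topic counts."""
    word_counts = defaultdict(Counter)
    topic_counts = Counter()
    for text, topic in examples:
        topic_counts[topic] += 1
        word_counts[topic].update(text.lower().split())
    return word_counts, topic_counts


def classify(text, word_counts, topic_counts):
    """Return the topic with the highest naive Bayes log score."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(topic_counts.values())
    best_topic, best_score = None, -math.inf
    for topic, n in topic_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[topic].values()) + len(vocab)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out a topic.
            score += math.log((word_counts[topic][word] + 1) / denom)
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


# Invented training data for the three example bins named on the slide.
TRAINING = [
    ("i want to buy a new phone", "place order"),
    ("i would like to order two shirts", "place order"),
    ("where is my package", "order status"),
    ("has my order shipped yet", "order status"),
    ("i need to send this item back", "return item"),
    ("how do i return these shoes", "return item"),
]
```

With this toy data, `classify("how do i return this item", *train(TRAINING))` selects "return item". A production system would train on far more examples per bin and would classify recognizer output rather than typed text.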

Example of Classification: Spam Filters
FOR YOUR ATTENTION; Dear Sir, I am pleased to write you in view of the circumstances in which I now found myself. This rescuable situation, though with its attendant mutual benefit needs urgent action hence this letter,and I do hope you will not hesitate to come to my rescue….
[Figure: percentage callouts attached to the message text: 90%, 84%, 93%, 98%, 84%]

Some Current Statistical Classification Systems
• Nuance “Say Anything”
• Scansoft “SpeakFreely”
• ATT “VoiceTone”
• BBN “Call Director”
• TuVox

Customer Service Calls
Problem:
• Touchtone menu is complex with many layers
• Prompts are confusing
• Customer wants just to say what they need.
“I’m closing up my summer home and want to turn off the phone.”
[Diagram: touchtone menu tree leading from "Entry Point" to "Customer Service Destination?". Node labels: Billing, Payment, Make arrangements, Repair, Cancel Service, Orders, Copy of Bill, Unauthorized call, Balance, Past due notice, New Service, Order Status, Seasonal, Order Move, Order Change, Pay now.]

BBN Call Director™
System prompt: “Please tell me briefly the reason for your call today.”
Caller: “I’m calling to check whether there is any better rate plans than the one I currently have.”
[Diagram. Components: Speech Recognizer, Topic Classifier, Statistical Grammars & Topic Models, IVR Router. Data labels: Speech, Text, Topic. Destinations: Automated Services, Sales, Billing, Technical Support.]

Standards
• Extremely important for commercial applications

W3C Natural Language Standards
• Aimed at form-filling dialogs
• VoiceXML – defines dialogs
• Speech Recognition Grammar Specification (SRGS): describes allowable sequences of words
• Semantic Interpretation (SI): describes how sequences of words are to be interpreted
• Extensible MultiModal Annotation (EMMA): represents final interpretation of user’s input

Form-filling Dialog
System: Welcome to the weather information service. What state?
User: help
System: Please speak the state for which you want the weather.
User: Pennsylvania
System: Please speak the city for which you want the weather.
User: Philadelphia
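The control flow behind this dialog can be sketched in Python as a loop over form fields. This is an illustrative sketch, not how a VoiceXML interpreter is implemented: the `run_form` function, the lists of allowed values, and the policy of repeating the prompt after an unrecognized answer are all assumptions.

```python
def run_form(fields, user_inputs):
    """Fill each field in order; returns (filled values, system prompts).

    fields: list of (name, prompt, help_text, allowed_values).
    user_inputs: iterable of user utterances, consumed one per turn.
    """
    inputs = iter(user_inputs)
    filled, transcript = {}, []
    for name, prompt, help_text, allowed in fields:
        transcript.append(prompt)
        while name not in filled:
            answer = next(inputs).strip().lower()
            if answer == "help":
                # Mirrors a help event: respond with the help text.
                transcript.append(help_text)
            elif answer in allowed:
                filled[name] = answer
            else:
                # No match: repeat the original prompt (an assumed policy).
                transcript.append(prompt)
    return filled, transcript


# Prompts follow the dialog on the slide; the allowed values are invented.
WEATHER_FORM = [
    ("state",
     "Welcome to the weather information service. What state?",
     "Please speak the state for which you want the weather.",
     {"pennsylvania", "new jersey"}),
    ("city",
     "Please speak the city for which you want the weather.",
     "Please speak the city for which you want the weather.",
     {"philadelphia", "pittsburgh"}),
]
```

Running `run_form(WEATHER_FORM, ["help", "Pennsylvania", "Philadelphia"])` replays the three system turns of the dialog and returns both slots filled. The sketch raises `StopIteration` if the inputs run out before the form is complete.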

VoiceXML Example
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2001/vxml http://www.w3.org/TR/voicexml20/vxml.xsd">
  <form id="weather_info">
    <block>Welcome to the weather information service.</block>
    <field name="state">
      <prompt>What state?</prompt>
      <grammar src="state.grxml" type="application/srgs+xml"/>
      <catch event="help">Please speak the state for which you want the weather.</catch>
    </field>
    <field name="city">
      <prompt>What city?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <catch event="help">Please speak the city for which you want the weather.</catch>
    </field>
    <block>
      <submit next="/servlet/weather" namelist="city state"/>
    </block>
  </form>
</vxml>

SRGS Examples
• Context-free grammar
• XML and ABNF formats are provided
<rule id="yes">
  <one-of>
    <item>yes</item>
    <item>yeah</item>
    <item>uh huh</item>
  </one-of>
</rule>
<rule id="yes-no">
  <one-of>
    <item><ruleref uri="#yes"/></item>
    <item><ruleref uri="#no"/></item>
  </one-of>
</rule>
Other Features
• optionality
• language declaration
• weighted alternatives
• pronunciations
• special rules
• external rules
• character encoding
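To show what "allowable sequences of words" means in practice, here is a small Python sketch that checks a word sequence against rules shaped like the yes and yes-no rules above. It is not an SRGS processor: the dictionary encoding and the `match` function are invented for illustration, and the contents of the "no" rule are assumed because the slide does not show that rule.

```python
GRAMMAR = {
    # Each rule maps to a list of alternatives; each alternative is a
    # sequence of words or "#rule" references (like <ruleref uri="#...">).
    "yes": [["yes"], ["yeah"], ["uh", "huh"]],
    "no": [["no"], ["nope"]],  # assumed; the slide omits this rule
    "yes-no": [["#yes"], ["#no"]],
}


def match(rule, words, grammar=GRAMMAR):
    """Return True if the whole word list is derivable from the rule."""
    return len(words) in _ends(rule, words, 0, grammar)


def _ends(rule, words, start, grammar):
    """Positions where a parse of `rule` beginning at `start` can end."""
    results = set()
    for alternative in grammar[rule]:
        positions = {start}
        for symbol in alternative:
            next_positions = set()
            for pos in positions:
                if symbol.startswith("#"):
                    # Rule reference: expand the referenced rule from here.
                    next_positions |= _ends(symbol[1:], words, pos, grammar)
                elif pos < len(words) and words[pos] == symbol:
                    next_positions.add(pos + 1)
            positions = next_positions
        results |= positions
    return results
```

For example, `match("yes-no", "uh huh".split())` succeeds through the reference to the yes rule, while `match("yes-no", ["maybe"])` fails. The recursion does not terminate on left-recursive rules, so this sketch only suits simple grammars like the one shown.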

SRGS Specification
• http://www.w3.org/TR/speech-grammar/
• Status: W3C Candidate Recommendation
• Quick Guide to the SRGS Specification: http://www.conversational-technologies.com/pages/5/index.htm

Semantic Interpretation
• Tags are added to the grammar to describe the semantics of the user’s input
• Format uses ECMAScript compact profile (ECMA-327)

Semantic Interpretation Example
Utterance: Three large pizzas with onions
<rule id="pizza">
  <ruleref uri="#number"/>
  <ruleref uri="#foodsize"/>
  <tag> $.pizzasize=$foodsize; $.number=$number </tag>
  pizzas with
  <ruleref uri="#tops"/>
  <tag> $.toppings=$tops </tag>
</rule>
Result:
pizza.number = 3
pizza.pizzasize = “large”
pizza.toppings = [“onions”]
XML Result:
<pizza>
  <number> 3 </number>
  <size> large </size>
  <toppings>
    <item> onions </item>
  </toppings>
</pizza>
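The mapping from words to a structured result can be imitated in Python. This is a rough sketch of the idea, not the Semantic Interpretation tag mechanism: `interpret_pizza` and the word lists are hypothetical, and the function hard-codes the single pattern from the pizza rule.

```python
NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4}
SIZES = {"small", "medium", "large"}
TOPPINGS = {"onions", "mushrooms", "pepperoni", "olives"}


def interpret_pizza(utterance):
    """Map '<number> <size> pizzas with <topping> ...' to a result dict.

    Returns None when the words do not fit the pattern.
    """
    words = utterance.lower().split()
    if len(words) < 5:
        return None
    number, size, noun, with_word, *tops = words
    if (number not in NUMBERS or size not in SIZES
            or noun not in {"pizza", "pizzas"} or with_word != "with"):
        return None
    toppings = [w for w in tops if w != "and"]
    if not toppings or any(t not in TOPPINGS for t in toppings):
        return None
    # These assignments play the role of the <tag> scripts in the grammar.
    return {"number": NUMBERS[number], "pizzasize": size, "toppings": toppings}
```

`interpret_pizza("Three large pizzas with onions")` yields the same three properties as the Result above. In the standard, the grammar and the tags live together, so recognition and interpretation happen in one pass instead of in a separate function like this one.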

Semantic Interpretation Specification
• http://www.w3.org/TR/semantic-interpretation/
• Status: W3C Working Draft

EMMA
• Developed by the W3C Multimodal Interaction Working Group
• An XML-based approach to representing natural language meanings
• Applicable to multimodal applications, but originally developed for speech

EMMA
• Represents user input
• Vehicle for transmitting user’s intention throughout application
• Focus on language input (text, handwriting, speech)
• Three components
  • data model
  • interpretation
  • annotation (main focus of standard)

Interpretation
Example: I want to go from Denver to Pittsburgh
<instance>
  <air_travel>
    <from_city>Denver</from_city>
    <to_city>Pittsburgh</to_city>
  </air_travel>
</instance>

EMMA Example
“I want to go from Boston to Denver on March 11, 2003”
<emma:emma emma:version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma#">
  <!-- time stamp for result -->
  <emma:absolute-timestamp emma:start="2003-03-26T0:00:00.15" emma:end="2003-03-26T0:00:00.2"/>
  <!-- confidence score -->
  <rdf:Description rdf:about="#int1" emma:confidence="0.75"/>
  <rdf:Description rdf:about="#int1" emma:model="http://myserver/models/city.xml"/>
  <emma:interpretation emma:id="int1">
    <origin>Boston</origin>
    <destination>Denver</destination>
    <date>03112003</date>
  </emma:interpretation>
</emma:emma>
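A consumer of EMMA output, such as a dialog manager, mainly needs the application slots and the annotations. The Python sketch below reads a simplified document using the standard library's `xml.etree.ElementTree`. It rests on two assumptions: the namespace URI is copied from the slide, and the confidence is placed directly on the interpretation element for brevity, where the slide attaches it through `rdf:Description`.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma#"  # namespace URI used on the slide

# A simplified document; placing emma:confidence directly on the
# interpretation element is an assumption made for brevity.
EMMA_DOC = f"""
<emma:emma emma:version="1.0" xmlns:emma="{EMMA_NS}">
  <emma:interpretation emma:id="int1" emma:confidence="0.75">
    <origin>Boston</origin>
    <destination>Denver</destination>
    <date>03112003</date>
  </emma:interpretation>
</emma:emma>
"""


def read_interpretations(xml_text):
    """Return a list of (id, confidence, slots) for each interpretation."""
    root = ET.fromstring(xml_text)
    results = []
    for interp in root.iter(f"{{{EMMA_NS}}}interpretation"):
        interp_id = interp.get(f"{{{EMMA_NS}}}id")
        confidence = float(interp.get(f"{{{EMMA_NS}}}confidence", "nan"))
        # Child elements carry the application-specific slots.
        slots = {child.tag: child.text for child in interp}
        results.append((interp_id, confidence, slots))
    return results
```

A caller could compare the confidence value against a threshold of its own choosing to decide whether to confirm the origin, destination, and date with the user before acting on them.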

EMMA Specification
• http://www.w3.org/TR/emma
• Status: W3C Working Draft

Natural Language Understanding
W3C Standards Summary
• VoiceXML: defines spoken dialogs
• SRGS: describes allowable sequences of words
• Semantic Interpretation: describes what intentions are represented by sequences of words
• EMMA: represents an interpretation of user’s input

Summary of Deployed Spoken Dialog Systems
• Form filling applications are by far the most common
• Statistical classification systems are becoming more common and are popular with users
• Standards are accelerating commercial adoption of technology

Resources
• Practical Spoken Dialog Systems, Springer, 2005. (D. Dahl, editor)
• VB website http://www.w3.org/Voice/
  • VoiceXML
  • SRGS
  • SISR
• MMI website http://www.w3.org/2002/mmi/
  • EMMA
• BeVocal website http://cafe.bevocal.com/
• VoiceXML deployments (some with phone numbers you can try): http://www.kenrehor.com/voicexml/#deployments
• Guide to speech standards -- http://www.speechtechmag.com/issues/9_8/cover/11619-1.html