Listener-Control Navigation of VoiceXML

Upload: timothy-walsh

Post on 19-Jan-2016

TRANSCRIPT

Page 1: Listener-Control Navigation of VoiceXML. Nuance Speech Analysis 92% of customer service is through phone. 84% of industrialists believe speech better

Listener-Control Navigation of VoiceXML

Nuance Speech Analysis

92% of customer service is through phone.

84% of industrialists believe speech is better than the web.

[Pie charts: 92% vs. 8%; 84% vs. 16%]

History of VoiceXML

AT&T (’95): PML

Bell/Lucent (’98): PML

IBM (’98): SpeechML

HP (’98): TalkML

Motorola (’98): VoxML

VoiceXML Forum (’00): VoiceXML 1.0

W3C (’02): VoiceXML 2.0

VoiceXML

Open standard language for serving voice/audio documents.

VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations.

VoiceXML (Cont’d)

VoiceXML allows scripts, CGIs, etc.

Can take input from the listener via speech (fill out forms, as in HTML).

Used extensively for automated call handling.

Makes information accessible over (cell) phones.

The next revolution on the Web.

Architectural Model

Goals of VoiceXML

Bring web development and content delivery to voice response applications.

Minimize client/server interactions.

Separate user interaction code from service logic.

Shield application authors from platform-specific details.

Voice Browser

Software platform running on a network server.

It supports the following features: ASR, DTMF, recognition grammars, mixed-initiative dialog, TTS.

Voice browser : VoiceXML :: Web browser : HTML

Voice Enabling

Sample VoiceXML Code

<vxml version="2.0">
  <form>
    <field name="rich">
      <!-- Inline Nuance GSL grammar: accepts "yes" or "no" -->
      <grammar type="application/x-gsl" mode="voice">
        <![CDATA[
          [
            [(yes)] {<option "yes">}
            [(no)] {<option "no">}
          ]
        ]]>
      </grammar>
      <prompt>Would you like to get rich quick?</prompt>
      <filled>
        Gotcha.
        <!-- Branch to the next document based on the recognized answer -->
        <if cond="rich == 'yes'">
          You want to be rich!
          <goto next="rich.vxml" />
        <else />
          You don't want to be rich.
          <goto next="poor.vxml" />
        </if>
      </filled>
    </field>
  </form>
</vxml>

Problem with VoiceXML

Navigation of the voice document.

Author has to ask where the listener would like to go next.

Listener has absolutely no control over navigation.

Tedious; advanced applications are not possible.

Analogy: scroll vs. book.

Solution

Allow users to control navigation interactively.

Using voice anchors.

Voice Anchors

Permit speech labels that listeners can place on a dialog.

Listener can return to that dialog later by uttering that label.

Hard to implement, as free-form speech recognition is not possible.

Need to incorporate in the voice browser.

Voice Anchors

We developed a number of methods for attaching voice anchors.

Most practical method: Spelling.

Anchor as a whole word.

Default anchors

Default navigation strategies
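The place-and-recall behavior with spelled labels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `AnchorStore`, `label_from_spelling`, and the dialog id `dialog_7` are hypothetical and do not come from the slides, and the recognizer is assumed to deliver the label one letter at a time.

```python
from typing import Dict, List, Optional


def label_from_spelling(letters: List[str]) -> str:
    """Join letters recognized one at a time into a single anchor label."""
    return "".join(letter.strip().lower() for letter in letters)


class AnchorStore:
    """Hypothetical store mapping a spelled label to the dialog it marks."""

    def __init__(self) -> None:
        self._anchors: Dict[str, str] = {}

    def place(self, letters: List[str], dialog_id: str) -> str:
        """Attach the spelled label to the dialog the listener is in."""
        label = label_from_spelling(letters)
        self._anchors[label] = dialog_id
        return label

    def recall(self, letters: List[str]) -> Optional[str]:
        """Return the dialog marked with this label, or None if unknown."""
        return self._anchors.get(label_from_spelling(letters))


store = AnchorStore()
store.place(["S", "a", "l", "t"], "dialog_7")
print(store.recall(["s", "a", "l", "t"]))  # dialog_7
print(store.recall(["x"]))                 # None
```

Spelling keeps the recognition task to a small letter vocabulary, which is why it sidesteps the free-form recognition problem mentioned earlier.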

[Diagram: Initial VXML -> Converter -> Augmented VXML -> Voice browser. Other labels: Creates a DB file; Place Anchors; Recall Anchor; New VXML, DB file.]

Our Architecture
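One way to picture the converter step is a small program that reads the initial VXML and adds a navigation grammar to every form. The Python sketch below is an assumption about how such a step could work, not the authors' converter: the function name `augment`, the constant `NAV_GSL`, and the choice of a form-level grammar are all hypothetical, and the input is assumed to use no XML namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical navigation grammar in Nuance GSL (illustrative only).
NAV_GSL = (
    '[ [(place anchor)] {<option "mark">} '
    '[(recall anchor)] {<option "recall">} ]'
)


def augment(vxml_text: str) -> str:
    """Return the VXML with a navigation grammar added to each form."""
    root = ET.fromstring(vxml_text)
    # Collect the forms first so the tree is not modified while iterating.
    for form in list(root.iter("form")):
        grammar = ET.Element(
            "grammar", {"type": "application/x-gsl", "mode": "voice"}
        )
        grammar.text = NAV_GSL
        form.insert(0, grammar)  # form-level grammar goes first
    return ET.tostring(root, encoding="unicode")


initial = (
    '<vxml version="2.0"><form id="intro">'
    "<block>Welcome.</block></form></vxml>"
)
augmented = augment(initial)
print(augmented)
```

ElementTree escapes the grammar text instead of wrapping it in a CDATA section; the two forms are equivalent to an XML parser.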

Cumulative Anchors

Different dialogs can be marked with the same label.

Recalling the label reads out the corresponding dialogs.

Multiple cumulative anchors in a single document.

Allows creation of sub documents.

Hierarchy of sub documents can be created.
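A cumulative anchor can be pictured as a label that maps to an ordered list of dialogs rather than to a single one. The Python sketch below illustrates that idea under assumptions: the class name `CumulativeAnchors` and the dialog ids are hypothetical, and hierarchies of sub documents are not modeled.

```python
from collections import defaultdict
from typing import Dict, List


class CumulativeAnchors:
    """Hypothetical store where one label can mark several dialogs."""

    def __init__(self) -> None:
        self._marked: Dict[str, List[str]] = defaultdict(list)

    def mark(self, label: str, dialog_id: str) -> None:
        """Add the dialog to the label's sub document, keeping order."""
        if dialog_id not in self._marked[label]:
            self._marked[label].append(dialog_id)

    def recall(self, label: str) -> List[str]:
        """Return the dialogs to read out for this label, in marking order."""
        return list(self._marked.get(label, []))


anchors = CumulativeAnchors()
anchors.mark("metals", "dialog_2")
anchors.mark("halogens", "dialog_3")
anchors.mark("metals", "dialog_5")
print(anchors.recall("metals"))    # ['dialog_2', 'dialog_5']
print(anchors.recall("halogens"))  # ['dialog_3']
```

Each label's list acts as a sub document: recalling the label reads out only the dialogs marked with it.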

Grammar

Set of valid expressions.

Each dialog references one or more grammars.

Nuance Grammar Specification Language (GSL).

Inline grammar and offline grammar. Offline grammar provides the following advantages:

Can be generated dynamically (via CGIs, ASPs).

Reused by multiple dialogs or applications.

Updated and modified without change in source code.

Subgrammars and form-level grammar.
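Dynamic generation of an offline grammar can be sketched as a function that turns a table of options and spoken phrases into GSL text, which a server-side script could then return. The Python below is an illustration only: `build_gsl_grammar` is a hypothetical helper, and the `option` slot name simply follows the GSL style of the earlier sample.

```python
from typing import Dict, List


def build_gsl_grammar(options: Dict[str, List[str]]) -> str:
    """Build a GSL <grammar> element from {returned value: spoken phrases}."""
    alternatives = []
    for value, phrases in options.items():
        # Each phrase becomes one parenthesized alternative.
        spoken = " ".join(f"({phrase})" for phrase in phrases)
        alternatives.append(f'[{spoken}] {{<option "{value}">}}')
    body = "\n  ".join(alternatives)
    return (
        '<grammar type="application/x-gsl" mode="voice"><![CDATA[\n'
        f"[\n  {body}\n]\n"
        "]]></grammar>"
    )


grammar = build_gsl_grammar({
    "skip": ["skip"],
    "recall": ["recall mark", "recall anchor", "recall"],
})
print(grammar)
```

Because the phrases live in a data table, they can be updated or reused across dialogs without touching the VoiceXML source.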

Sample Grammar Code

<grammar type="application/x-gsl" mode="voice">
  <![CDATA[
    [
      [(skip)] {<option "skip">}
      [(previous)] {<option "previous">}
      [(place anchor) (call mark) (begin mark)] {<option "mark">}
      [(recall mark) (recall anchor) (recall)] {<option "recall">}
    ]
  ]]>
</grammar>

[Diagram: Initial HTML -> Translator -> Initial VXML -> Converter -> Augmented VXML -> Voice browser. Other labels: Reference to another link in Augmented VXML; Get the HTML page.]

Applications

Web access through voice.

This involves the following sequence of steps:

HTML -> VXML. A translator written in Java had already been developed.

Navigation of VXML

Applications

Mathematics for visually impaired.

This involves the following steps:

MathML -> VXML. A translator was developed to convert MathML documents to VXML documents using XSLT.

Navigation of VXML.
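The MathML -> VXML step can be illustrated with a very small translator. Note the substitution: the slides describe an XSLT-based translator, while the sketch below walks the tree with Python's `xml.etree.ElementTree` instead. The function names, the spoken wording ("over", "to the power"), and the handful of supported elements are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical spoken forms for a few operators.
SPOKEN_OPERATORS = {"+": "plus", "-": "minus", "=": "equals"}


def speak(node: ET.Element) -> str:
    """Turn a small subset of presentation MathML into spoken text."""
    tag = node.tag.split("}")[-1]  # drop a namespace prefix if present
    children = [speak(child) for child in node]
    if tag in ("mi", "mn"):
        return (node.text or "").strip()
    if tag == "mo":
        symbol = (node.text or "").strip()
        return SPOKEN_OPERATORS.get(symbol, symbol)
    if tag == "mfrac" and len(children) == 2:
        return f"{children[0]} over {children[1]}"
    if tag == "msup" and len(children) == 2:
        return f"{children[0]} to the power {children[1]}"
    # Containers such as <math> and <mrow>: read children in order.
    return " ".join(part for part in children if part)


def mathml_to_vxml(mathml: str) -> str:
    """Wrap the spoken form of a MathML expression in a one-block VXML form."""
    vxml = ET.Element("vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, "form")
    block = ET.SubElement(form, "block")
    block.text = speak(ET.fromstring(mathml))
    return ET.tostring(vxml, encoding="unicode")


sample = (
    "<math><mfrac><mi>a</mi>"
    "<mrow><mi>b</mi><mo>+</mo><mn>1</mn></mrow>"
    "</mfrac></math>"
)
print(mathml_to_vxml(sample))
# <vxml version="2.0"><form><block>a over b plus 1</block></form></vxml>
```

A flat reading such as "a over b plus 1" is ambiguous to a listener, which is one reason listener-controlled navigation of the resulting VXML matters for this application.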

Conclusion & Future Work

Designing default navigation strategies.

Unit of division for navigation.

Voice Scripting Languages. Example: “repeat chlorine until exit”.
