Introduction to VXML

Post on 21-Dec-2015

What is VXML?

Voice Extensible Markup Language

Used in

telephone-based speech applications

voice browsing of the web

How does VXML work?

Two main components

tags -- control what the program does

grammars -- control what speech is recognized; what the user can say

Basic VXML tags

<vxml> -- defines a VXML application

<form> -- basic component of a dialog

<prompt> -- says something to the user

<field> -- holds information supplied by user

<grammar> -- defines what user can say

<filled> -- what to do once user says something

VXML Hierarchy

<vxml> contains one or more <form>

<form> may contain one or more <prompt> and does contain one or more <field>

<field> contains one or more <prompt> but one and only one <grammar> and an optional <filled>

Execution is linear unless you tell it otherwise
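
The hierarchy above can be sketched as one small document. This is a minimal sketch, not the original Example 1 (which is not reproduced in this transcript). The form id, field name, and prompt wording are made up for illustration. It assumes VoiceXML 2.0 and an inline grammar in the Nuance GSL format, with the grammar type string being an assumption about the platform. Standard VoiceXML places a form-level <prompt> inside a <block>, so the sketch does that.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- One form = one dialog; execution runs top to bottom -->
  <form id="greeting">
    <!-- Form-level prompt, wrapped in a block -->
    <block>
      <prompt>welcome to the news line</prompt>
    </block>
    <!-- The field stores what the caller says (hypothetical name) -->
    <field name="section">
      <prompt>say news sports or weather</prompt>
      <!-- The grammar lists what the caller is allowed to say -->
      <grammar type="application/x-nuance-gsl">
        <![CDATA[ [news sports weather] ]]>
      </grammar>
      <!-- Runs once the field has a recognized value -->
      <filled>
        <prompt>you chose <value expr="section"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Saved with a .vxml extension, this plays the welcome prompt, asks the question, and echoes the recognized word.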

Example 1

Tony’s Tips #1

All files must end with the extension .vxml

Use a simple editor such as Notepad or the editor provided by BeVocal. Do not use Word -- it tends to add extra hidden formatting

All tags occur in pairs (e.g., <form> and </form>)

When using <prompt> avoid punctuation and capitalization

VXML Grammars

Two types of grammars

Programmer specified -- you decide what can be said

Built-in or pre-defined

Two tags related to grammars

<grammar> -- what can be said

<filled> -- what to do once you recognize something

Built-in Grammars
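
A built-in grammar is selected with the type attribute of <field>, so no <grammar> tag is needed. The sketch below assumes VoiceXML 2.0, whose built-in types are boolean, date, digits, currency, number, phone, and time. The form id, field names, and prompts are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <!-- type="boolean" uses the built-in yes/no grammar -->
    <field name="confirm" type="boolean">
      <prompt>would you like to place an order</prompt>
    </field>
    <!-- type="digits" recognizes a spoken string of digits -->
    <field name="account" type="digits">
      <prompt>what is your account number</prompt>
      <filled>
        <prompt>your account number is <value expr="account"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```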

Example 2a

Example 2b

Recognizing phrases

Single words

[news sports weather exit]

Phrases

(i want the news)

Combined

((i want the) [news sports])

((i want the) [news sports (home section)])

Saying something optional

(?(i want the) [news sports] ?please)
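
In the grammar notation above, square brackets mean "any one of these", parentheses mean "these words in this order", and a leading ? marks the next item as optional. The sketch below drops the last grammar into a field. It is an illustration under assumptions: VoiceXML 2.0, an inline Nuance GSL grammar, and a made-up form id and field name.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick_section">
    <field name="request">
      <prompt>what would you like to hear</prompt>
      <grammar type="application/x-nuance-gsl">
        <![CDATA[
          (?(i want the) [news sports (home section)] ?please)
        ]]>
      </grammar>
      <filled>
        <!-- With no slots, the field holds the recognized words -->
        <prompt>you said <value expr="request"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```

This grammar accepts "news", "i want the sports please", and "home section please". It does not accept "i want news", because (i want the) is optional only as a whole phrase.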

Using slots

Problem: <field> holds the entire phrase that is recognized

You would have to write a separate rule for each possible phrase (i.e., the vocabulary problem)

Slots are a useful shortcut when recognizing phrases

((i want the) [news sports (home section)])

((i want the)
  [news { <section news> }
   sports { <section sports> }
   (home section) { <section home> }])
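
A slot lets the <filled> code test one short value instead of every possible phrase. The sketch below is an illustration under assumptions (VoiceXML 2.0, inline Nuance GSL, made-up prompts). It relies on the slot name matching the field name, so the slot value becomes the field value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick_section">
    <!-- The field name matches the slot name used in the grammar -->
    <field name="section">
      <prompt>which section would you like</prompt>
      <grammar type="application/x-nuance-gsl">
        <![CDATA[
          (?(i want the)
            [news           {<section news>}
             sports         {<section sports>}
             (home section) {<section home>}]
           ?please)
        ]]>
      </grammar>
      <filled>
        <!-- Test the slot value, not the full phrase that was spoken -->
        <if cond="section == 'news'">
          <prompt>here is the news</prompt>
        <elseif cond="section == 'sports'"/>
          <prompt>here are the sports scores</prompt>
        <else/>
          <prompt>here is the home section</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Whether the caller says "sports" or "i want the sports please", the field ends up holding sports.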

Example 2c

These are the slots

This goes to another field unless paying by check

More on grammars

Grammars do not have to be defined directly in the VXML file

They can be defined in a separate file with a .gsl extension (see Example 3)
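
A sketch of how a field might point at a separate grammar file. The file name sections.gsl and the rule name Section are hypothetical, and the src form (file name, then # and the rule name) is an assumption about the platform, so check your platform's documentation. The GSL file contents appear only as a comment because they live in their own file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!--
    Hypothetical contents of sections.gsl (GSL, not VXML).
    The rule name starts with a capital letter:

      Section [
        news           {<section news>}
        sports         {<section sports>}
        (home section) {<section home>}
      ]
  -->
  <form id="pick_section">
    <field name="section">
      <prompt>which section would you like</prompt>
      <!-- Assumed reference form: file name, then # and the rule name -->
      <grammar src="sections.gsl#Section" type="application/x-nuance-gsl"/>
      <filled>
        <prompt>you chose <value expr="section"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Any number of .vxml files can reference the same rule, which keeps a repeated vocabulary in one place.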

Tony’s Tips #2

Avoid capitalizing words or using punctuation when defining a grammar

Use built-in grammars whenever possible

Use slots when trying to recognize phrases

Use a separate grammar file if certain things will be recognized repeatedly (e.g. example 3)

So far...

The metaphor is VXML as a form

Dialog is overdetermined (i.e., the system is in control)

What about...

Design considerations

[Diagram: Main menu, People menu, Undergrad menu, Grad menu]

Example 3

Example 4

[Diagram: Main menu, People menu, Undergrad menu, Grad menu. Files: ex4_psych.vxml, ex4_undergr.vxml, ex4_grad.vxml, ex4_people.vxml. Grammar: psychology.gsl]
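
The original Example 4 code is not reproduced in this transcript. As a rough sketch of how one menu file might hand off to the others, the document below listens for a menu choice and uses <goto> to load the matching file. The three target file names come from the diagram. Everything else is an assumption: the form id, field name, prompts, and an inline grammar used in place of psychology.gsl so that the sketch stands alone.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main_menu">
    <field name="choice">
      <prompt>say people undergrad or grad</prompt>
      <grammar type="application/x-nuance-gsl">
        <![CDATA[
          [people                    {<choice people>}
           [undergrad undergraduate] {<choice undergrad>}
           [grad graduate]           {<choice grad>}]
        ]]>
      </grammar>
      <filled>
        <!-- Each branch leaves this file and starts another one -->
        <if cond="choice == 'people'">
          <goto next="ex4_people.vxml"/>
        <elseif cond="choice == 'undergrad'"/>
          <goto next="ex4_undergr.vxml"/>
        <else/>
          <goto next="ex4_grad.vxml"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Keeping each menu in its own file lets you test every piece on its own before linking them together.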

Tony’s Tips #3

When using external grammars the rule name must start with a capital letter and is case sensitive (e.g., Fish is different from FISH)

Plan first, then write code

Break the project into pieces

Test each small piece individually then combine

Adding flexibility

Handling speech recognition errors

Handling long periods of silence

Help

Repeating

Exiting/Hanging up gracefully
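
Most of the items above map onto standard VoiceXML 2.0 event handlers: <nomatch> for recognition errors, <noinput> for silence, <help> for help requests, and a handler for the exit event. The sketch below is one possible arrangement, with made-up prompts and an assumed inline Nuance GSL grammar. Repeating is covered only by <reprompt/>, which replays the field's prompt.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Turn on the built-in "help" and "exit" spoken commands -->
  <property name="universals" value="help exit"/>

  <!-- Caller said "exit": say goodbye, then end the application -->
  <catch event="exit">
    <prompt>goodbye</prompt>
    <exit/>
  </catch>

  <form id="pick_section">
    <field name="section">
      <prompt>say news sports or weather</prompt>
      <grammar type="application/x-nuance-gsl">
        <![CDATA[ [news sports weather] ]]>
      </grammar>
      <!-- Silence: the caller said nothing before the timeout -->
      <noinput>
        <prompt>sorry i did not hear anything</prompt>
        <reprompt/>
      </noinput>
      <!-- Recognition error: the speech did not match the grammar -->
      <nomatch>
        <prompt>sorry i did not understand</prompt>
        <reprompt/>
      </nomatch>
      <!-- Caller said "help" -->
      <help>
        <prompt>you can say news sports or weather</prompt>
      </help>
      <filled>
        <prompt>you chose <value expr="section"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```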

Adding flexibility

Given/new information (e.g., emphasis)

This implementation of VXML does not allow you to change emphasis despite having tags <emphasis> and <prosody>

You can get this correct

Additional options

Change the timeout length

Say-as feature

E.g., the system can say a series of numbers in “telephone style”

Use pre-recorded sounds (i.e., .wav files)

Change the voice
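
A sketch of the four options together, assuming VoiceXML 2.0. The file welcome.wav, the phone number, and the form id are placeholders. The say-as attribute is an assumption: interpret-as is the SSML 1.0 name, some older platforms used type instead, and the set of supported values (such as telephone) is platform-specific. The same goes for which voices are available.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Wait up to 5 seconds for the caller to start speaking -->
  <property name="timeout" value="5s"/>
  <form id="options_demo">
    <block>
      <prompt>
        <!-- Pre-recorded sound; the inner text is the fallback -->
        <audio src="welcome.wav">welcome</audio>
        the number to call is
        <!-- Ask for the digits to be read out as a phone number -->
        <say-as interpret-as="telephone">5551234567</say-as>
        <!-- Ask for a different voice for this phrase -->
        <voice gender="female">thank you for calling</voice>
      </prompt>
    </block>
  </form>
</vxml>
```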

Tony’s Tips #4

Start early

Reuse code -- don’t reinvent the wheel

Put comments in your code

If you get stuck ask someone for help

See Tony’s Tips 1-3

Full flow chart

Example 3/4