speech in, speech out

24 November 2006 WS0607 – elevator 2/15

Nuance server

compiled recognition grammar, master language package, licence manager

Nuance client

speech-in components


anticipate user’s responses

what pieces of information are needed to complete the dialog?

in what order will they be requested?

one piece of information at a time, in a particular order (directed dialog), or several pieces at once, in any order, with prompts for missing items (mixed initiative)?
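The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration: the slot names, the prompt wording and both function names are hypothetical and not part of Nuance or Diamant.

```python
from typing import Dict, List, Optional

# Hypothetical slots for illustration; the elevator task itself needs only "floor".
REQUIRED_SLOTS: List[str] = ["floor", "building"]


def directed_next_prompt(filled: Dict[str, str]) -> Optional[str]:
    """Directed dialog: request the first missing slot, in a fixed order."""
    for slot in REQUIRED_SLOTS:
        if slot not in filled:
            return f"Which {slot}?"
    return None  # every slot is filled, the dialog is complete


def mixed_next_prompt(filled: Dict[str, str]) -> Optional[str]:
    """Mixed initiative: the user may fill any subset; prompt for whatever is missing."""
    missing = [slot for slot in REQUIRED_SLOTS if slot not in filled]
    if not missing:
        return None
    return "Please tell me the " + " and the ".join(missing) + "."
```

With a directed dialog the grammar for each turn only has to cover one slot; with mixed initiative the grammar has to accept any combination of slots in one utterance.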

recognition grammar


syntax

Nuance: Grammar Specification Language (GSL)

Diamant: Speech Recognition Grammar Format (SRGF)


GSL grammar: a document in a file with the .grammar extension, e.g. mygram.grammar (mygram will be the resulting package name)

contents: GrammarRuleName   GrammarDescription

GrammarRuleName: contains at least one uppercase character

GrammarDescription: a sequence of words, grammar names, and operators that defines a set of recognizable word sequences

words (terminals): in lower case

operators:

operator   name               example          meaning
( )        concatenation      (A B C ... Y)    A and B and ... Y, in that order
[ ]        disjunction        [A B C ... Y]    either A or B or ... Y
?          optional           ?Y               Y is optional
+          positive closure   +Y               at least one Y
*          Kleene star        *Y               zero or more Y


GSL grammar: example expressions

[morning afternoon evening]

“morning”, “afternoon”, “evening”

(good [morning afternoon evening])

“good morning”, “good afternoon”, “good evening”

(?good [morning afternoon evening])

“good morning”, “good afternoon”, “good evening”, “morning”, “afternoon”, “evening”

(thanks +very much)

“thanks very much”, “thanks very very much”, ...

(thanks *very much)

“thanks much”, “thanks very much”, “thanks very very much”, ...
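To check which phrases an expression covers, the five operators can be emulated with a small expander. The sketch below is not the Nuance compiler: it handles only words and the operators listed above (no grammar references, slots or probabilities), and it caps + and * at a hypothetical max_repeat so that the output stays finite.

```python
import re
from itertools import product
from typing import List, Set, Tuple

Phrase = Tuple[str, ...]
# An operator character, or a run of characters that is neither space nor operator.
TOKEN = re.compile(r"[()\[\]?+*]|[^\s()\[\]?+*]+")


def concat(a: Set[Phrase], b: Set[Phrase]) -> Set[Phrase]:
    """Every phrase of a followed by every phrase of b."""
    return {x + y for x, y in product(a, b)}


def repeat(unit: Set[Phrase], low: int, high: int) -> Set[Phrase]:
    """All concatenations of low..high copies of unit."""
    result: Set[Phrase] = set()
    current: Set[Phrase] = {()}
    for n in range(high + 1):
        if n >= low:
            result |= current
        current = concat(current, unit)
    return result


def parse(tokens: List[str], pos: int, max_repeat: int) -> Tuple[Set[Phrase], int]:
    """Parse one expression starting at pos; return its phrases and the next position."""
    tok = tokens[pos]
    if tok in "?+*":  # unary operators apply to the expression that follows
        unit, pos = parse(tokens, pos + 1, max_repeat)
        if tok == "?":
            return unit | {()}, pos
        return repeat(unit, 1 if tok == "+" else 0, max_repeat), pos
    if tok == "(":  # concatenation
        result: Set[Phrase] = {()}
        pos += 1
        while tokens[pos] != ")":
            unit, pos = parse(tokens, pos, max_repeat)
            result = concat(result, unit)
        return result, pos + 1
    if tok == "[":  # disjunction
        result = set()
        pos += 1
        while tokens[pos] != "]":
            unit, pos = parse(tokens, pos, max_repeat)
            result |= unit
        return result, pos + 1
    return {(tok,)}, pos + 1  # a plain word


def expand(expr: str, max_repeat: int = 2) -> List[str]:
    """Sorted list of phrases covered by expr, with repetition capped at max_repeat."""
    tokens = TOKEN.findall(expr)
    phrases, pos = parse(tokens, 0, max_repeat)
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the expression")
    return sorted(" ".join(p) for p in phrases)
```

For example, expand("(thanks *very much)") lists the phrases with zero, one and two occurrences of "very"; raising max_repeat lists more.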


example GSL grammar

.grammar file:

    .GO_FLOOR [
        FLOOR:f
        (?the FLOOR:f floor)
        (?the FLOOR:f please)
        (?Filler ?the FLOOR:f floor ?please)
    ] {<floor $f>}

    Filler [
        (i would like to go to)
        (i want to go to)
        (uh)
    ]

    FLOOR [
        first  {return("1")}
        second {return("2")}
        third  {return("3")}
        fourth {return("4")}
    ]

.slot_definitions file:

    floor
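What the example grammar accepts, and which value ends up in the floor slot, can be emulated on plain text. The following Python sketch mirrors the four alternatives of .GO_FLOOR with regular expressions. It is an illustration only, not how the Nuance recognizer or nl-tool work, and the function name interpret is hypothetical.

```python
import re
from typing import Dict, Optional

# FLOOR subgrammar: spoken word -> returned value
FLOOR = {"first": "1", "second": "2", "third": "3", "fourth": "4"}
ORD = "(?P<f>" + "|".join(FLOOR) + ")"
# Filler subgrammar
FILLER = r"(?:i would like to go to|i want to go to|uh)"

# One pattern per alternative of .GO_FLOOR, in the same order as in the grammar.
ALTERNATIVES = [
    re.compile(pattern)
    for pattern in (
        rf"{ORD}",
        rf"(?:the )?{ORD} floor",
        rf"(?:the )?{ORD} please",
        rf"(?:{FILLER} )?(?:the )?{ORD} floor(?: please)?",
    )
]


def interpret(utterance: str) -> Optional[Dict[str, str]]:
    """Return {"floor": value} if the utterance is covered by the grammar, else None."""
    text = " ".join(utterance.lower().split())  # lower-case, single spaces
    for pattern in ALTERNATIVES:
        match = pattern.fullmatch(text)
        if match:
            return {"floor": FLOOR[match.group("f")]}
    return None
```

Note that the grammar is stricter than it may look: "the first" on its own is not covered, because "the" is only allowed when "floor" or "please" follows.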


another option: write the grammar in SRGF and export it as Nuance GSL

GrammarTest.bat


compiling the package (compile-package.bat)

set PKGHOME=path_to_your_gsl_file   (without the .grammar extension)

nuance-compile %PKGHOME% English.America.1.3.0

arguments: the recognition grammar (%PKGHOME%), then the master recognition package (English.America.1.3.0)


testing the grammar (text)

parse-tool -package path_to_your_model

nl-tool -package path_to_your_model -grammar grammar_in_your_model


running Nuance:

licence manager: lm.bat

recognition server: rs.bat

set PKGHOME=path_to_your_compiled_model

recserver -package %PKGHOME% lm.Addresses=localhost config. ...

testing the grammar (speech)

xapp -package path_to_your_compiled_model lm.Addresses=localhost

speech recognition


running the Nuance client

edit Diamant config file: Clients.ini

NuanceClient.bat

(btw, have the licence manager and the server running too... duh!...)

Diamant with speech-in


adding speech-in

add device as usual

activate recognition: output <string> "start" (the start command) to the Nuance client

read (speech) input from the Nuance client into a variable as usual

access the recognition confidence (of type Real) like this: var#confidence


Mary server

online at DFKI...

Mary client

MaryClient.bat

speech-out components


Diamant with speech-out

adding speech-out

add device as usual

optionally, set the format: {format = <string>} (default: plain text) and the voice: {voice = <string>}

in output node, output <string> to Mary client as usual


speech-enabled dialogs

recognition tends to be imperfect...

if recognition confidence low, then, for example (btw, think: grounding):

repeat question

ask for confirmation ("did you say blah?")

inform the user what they can say ("you can say blah, bloo, and blee, please try again")

but... don’t let user get stuck in endless clarification dialog either!
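One way to combine these strategies is a small decision function over the recognition confidence and the number of attempts so far. The sketch below is an assumption-laden illustration: the thresholds, the 0.0 to 1.0 confidence scale, the attempt limit and the action names are all hypothetical and would have to be adapted to the values that var#confidence actually delivers.

```python
from typing import Optional

# Hypothetical values; tune them against real recognition results.
ACCEPT_AT = 0.7     # at or above: trust the hypothesis
CONFIRM_AT = 0.4    # at or above (but below ACCEPT_AT): ask "did you say ...?"
MAX_ATTEMPTS = 3    # after this many tries, stop clarifying


def next_move(hypothesis: Optional[str], confidence: float, attempt: int) -> str:
    """Choose the next dialog move for the given recognition result.

    attempt counts from 1. Returns one of:
    "accept", "confirm", "repeat", "help", "fallback".
    """
    if hypothesis is not None and confidence >= ACCEPT_AT:
        return "accept"
    if attempt >= MAX_ATTEMPTS:
        return "fallback"  # leave the clarification loop, e.g. hand over or use a default
    if hypothesis is not None and confidence >= CONFIRM_AT:
        return "confirm"
    if attempt == 1:
        return "repeat"    # first failure: simply ask again
    return "help"          # later failures: tell the user what they can say
```

The attempt limit is what keeps the user from getting stuck: whatever the recognizer returns, the third low-confidence result ends the clarification loop.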
