speech in, speech out
24 November 2006 WS0607 – elevator 2/15
Nuance server
compiled recognition grammar, master language package, licence manager
Nuance client
speech-in components
anticipate user’s responses
what pieces of information are needed to complete the dialog?
in what order will they be requested?
one piece of information at a time, in a particular order (directed dialog), or several pieces at once, in any order, with prompts for missing items (mixed initiative)?
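The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration: the slot names (`floor`, `building`) and the prompt wordings are hypothetical and do not come from the elevator application.

```python
from typing import Dict, List, Optional

# Hypothetical slots; the names are illustrative only.
SLOTS: List[str] = ["floor", "building"]


def next_prompt(filled: Dict[str, str], mixed_initiative: bool) -> Optional[str]:
    """Return the next system prompt, or None when every slot is filled."""
    missing = [s for s in SLOTS if s not in filled]
    if not missing:
        return None
    if mixed_initiative:
        # Open question first, then prompt only for what is still missing.
        if not filled:
            return "How can I help you?"
        return "Please tell me the " + " and the ".join(missing) + "."
    # Directed dialog: one slot at a time, in a fixed order.
    return "Which " + missing[0] + "?"
```

In the directed case the grammar for each prompt only has to cover one slot; in the mixed-initiative case the grammar has to accept any combination of slots in any order.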
recognition grammar
syntax
Nuance: Grammar Specification Language (GSL)
Diamant: Speech Recognition Grammar Format (SRGF)
GSL grammar: doc in a file with .grammar extension; e.g. mygram.grammar (mygram will be the resulting package name)
contents: GrammarRuleName GrammarDescription
GrammarRuleName: at least one uppercase character
GrammarDescription: sequence of words, grammar names, and operators that define a set of recognizable word sequences
words (terminals): in lower-case
operators:
()   concatenation      (A B C ... Y)   A and B and ... Y
[]   disjunction        [A B C ... Y]   A or B or ... Y
?    optional           ?Y              Y is optional
+    positive closure   +Y              one or more Y
*    Kleene star        *Y              zero or more Y
GSL grammar: example expressions
[morning afternoon evening]
“morning”, “afternoon”, “evening”
(good [morning afternoon evening])
“good morning”, “good afternoon”, “good evening”
(?good [morning afternoon evening])
“good morning”, “good afternoon”, “good evening”, “morning”, “afternoon”, “evening”
(thanks +very much)
“thanks very much”, “thanks very very much”, ...
(thanks *very much)
“thanks much”, “thanks very much”, “thanks very very much”, ...
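To check which word sequences an expression accepts, a small expander can be sketched in Python. It is not part of the Nuance tools. It handles only the five operators listed above and plain lower-case words, and it caps + and * at `MAX_REPEAT` repetitions (an assumption made here so that the output stays finite).

```python
from itertools import product
from typing import List

MAX_REPEAT = 2  # assumption: cap + and * so the expansion stays finite


def tokenize(expr: str) -> List[str]:
    """Split an expression into operators and words."""
    for ch in "()[]?+*":
        expr = expr.replace(ch, f" {ch} ")
    return expr.split()


def concat(parts: List[List[str]]) -> List[str]:
    """All ways to pick one phrase per part, joined by spaces."""
    return [" ".join(w for w in combo if w) for combo in product(*parts)]


def parse(tokens: List[str]) -> List[str]:
    """Consume one expression from the token list; return the phrases it accepts."""
    tok = tokens.pop(0)
    if tok == "(":  # concatenation
        parts = []
        while tokens[0] != ")":
            parts.append(parse(tokens))
        tokens.pop(0)
        return concat(parts)
    if tok == "[":  # disjunction
        alts: List[str] = []
        while tokens[0] != "]":
            alts.extend(parse(tokens))
        tokens.pop(0)
        return alts
    if tok == "?":  # optional: the empty phrase or the operand
        return [""] + parse(tokens)
    if tok in ("+", "*"):  # bounded repetition of the operand
        body = parse(tokens)
        start = 1 if tok == "+" else 0
        out: List[str] = []
        for n in range(start, MAX_REPEAT + 1):
            out.extend(concat([body] * n))  # n == 0 yields the empty phrase
        return out
    return [tok]  # plain word (terminal)


def expand(expr: str) -> List[str]:
    return parse(tokenize(expr))
```

For instance, `expand("(?good [morning afternoon evening])")` yields the six phrases listed above, and `expand("(thanks *very much)")` yields "thanks much", "thanks very much" and "thanks very very much".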
example GSL grammar
.grammar file:
.GO_FLOOR [ FLOOR:f
            (?the FLOOR:f floor)
            (?the FLOOR:f please)
            (?Filler ?the FLOOR:f floor ?please)
          ] {<floor $f>}
Filler [ (i would like to go to)
         (i want to go to)
         (uh)
       ]
FLOOR [ first {return("1")}
        second {return("2")}
        third {return("3")}
        fourth {return("4")}
      ]
.slot_definitions file:
floor
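What the grammar does to an utterance (match one of the alternatives, then fill the slot floor with the value returned by FLOOR) can be imitated in Python with regular expressions. This is a stand-in for illustration only, not how Nuance interprets GSL; the function name `interpret` and the pattern list are assumptions made for this sketch.

```python
import re
from typing import Dict, Optional

# Return values of the FLOOR sub-grammar.
FLOOR_VALUES = {"first": "1", "second": "2", "third": "3", "fourth": "4"}
FLOOR = "(?P<f>" + "|".join(FLOOR_VALUES) + ")"
FILLER = "(?:i would like to go to|i want to go to|uh)"

# One pattern per alternative of .GO_FLOOR, in the same order.
ALTERNATIVES = [
    "{floor}",
    "(?:the )?{floor} floor",
    "(?:the )?{floor} please",
    "(?:{filler} )?(?:the )?{floor} floor(?: please)?",
]
PATTERNS = [re.compile(a.format(floor=FLOOR, filler=FILLER)) for a in ALTERNATIVES]


def interpret(utterance: str) -> Optional[Dict[str, str]]:
    """Return the filled slot, or None if the utterance is out of grammar."""
    for pattern in PATTERNS:
        match = pattern.fullmatch(utterance)
        if match:
            return {"floor": FLOOR_VALUES[match.group("f")]}
    return None
```

With this sketch, "i want to go to the third floor please" fills floor with "3", while "the fifth floor" is rejected because fifth is not in FLOOR.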
another option: SRGF and export as Nuance GSL
GrammarTest.bat
compiling the package (compile-package.bat)
set PKGHOME=<path to your GSL file, without extension>
nuance-compile %PKGHOME% English.America.1.3.0
(English.America.1.3.0 is the master recognition package)
testing the grammar (text)
parse-tool -package path_to_your_model
nl-tool -package path_to_your_model -grammar grammar_in_your_model
running Nuance:
licence manager: lm.bat
recognition server: rs.bat
set PKGHOME=<path to your compiled model>
recserver -package %PKGHOME% lm.Addresses=localhost config. ...
testing the grammar (speech)
xapp -package path_to_your_compiled_model lm.Addresses=localhost
speech recognition
running Nuance client
edit Diamant config file: Clients.ini
NuanceClient.bat
(btw, have the licence manager and the server running too... duh!...)
Diamant with speech-in
adding speech-in
add device as usual
activate recognition: output <string> “start” (start command) to Nuance client
read (speech) input from Nuance client into variable as usual
access recognition confidence (of type Real) like this: var#confidence
Mary server
online at DFKI...
Mary client
MaryClient.bat
speech-out components
Diamant with speech-out
adding speech-out
add device as usual
optionally, set format: {format = <string>} (default: plain text) and voice: {voice = <string>}
in output node, output <string> to Mary client as usual
speech-enabled dialogs
recognition tends to be imperfect...
if recognition confidence low, then, for example (btw, think: grounding):
repeat question
ask for confirmation (“did you say blah?”)
inform user what they can say (“you can say blah, bloo, and blee, please try again”)
but... don’t let user get stuck in endless clarification dialog either!
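One way to combine these strategies is sketched below in Python. Everything in it is an assumption for illustration: the thresholds, the 0 to 1 confidence scale, the retry limit, and the `recognize` callback, which stands in for "output a prompt, then read the recognized text and its confidence"; none of these are Nuance or Diamant defaults.

```python
from typing import Callable, Optional, Tuple

# Assumptions: illustrative values on a 0..1 confidence scale.
CONFIRM_BELOW = 0.7  # below this, ask for confirmation
REJECT_BELOW = 0.4   # below this, do not trust the result at all
MAX_ATTEMPTS = 3     # give up after this many tries


def ask_floor(recognize: Callable[[str], Tuple[str, float]]) -> Optional[str]:
    """Ask for a floor; `recognize(prompt)` returns (text, confidence)."""
    prompt = "Which floor?"
    for _ in range(MAX_ATTEMPTS):
        text, confidence = recognize(prompt)
        if confidence >= CONFIRM_BELOW:
            return text
        if confidence >= REJECT_BELOW:
            # Medium confidence: ground the result with a yes/no question.
            answer, answer_conf = recognize(f"Did you say {text}?")
            if answer == "yes" and answer_conf >= REJECT_BELOW:
                return text
        # Low confidence or rejected confirmation: tell the user what they can say.
        prompt = "You can say first, second, third, or fourth. Which floor?"
    return None  # bounded number of attempts, so the user cannot get stuck
```

Returning `None` after `MAX_ATTEMPTS` leaves it to the caller to choose a fallback, which keeps the clarification loop from running forever.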