Using Asterisk to create "Her"
DESCRIPTION
In the film “Her” the protagonist falls in love with his computer, an artificial intelligence operating system. While most of us already love Asterisk, things really get interesting when we give Asterisk a voice, and the ability to listen to our instructions. Fortunately for us, Asterisk has impressive capabilities for adding speech recognition and text-to-speech to our calls. This talk will cover many facets of speech applications with Asterisk. We will look at the various commercial and open source speech engines available, as well as how to integrate them into Asterisk. We will look at ways prompts and grammars can be designed to give the caller the best possible experience. We will hear samples of the right and hilariously wrong ways speech can be used. We will cover the various types of speech recognition that exist today (grammar-driven, transcription, hotword and voice biometrics) and how each should be applied. Finally, we’ll show how these pieces come together to make it possible to build something that (for a brief moment) passes as intelligent. Maybe.
TRANSCRIPT
Using Asterisk to create “Her”
CAN YOU SPEAK MAGIC?
Ben Klang
Allison Smith as “Her”
ALL ABOUT “HER”
Allison
HOW DOES THIS WORK IN ASTERISK?
• We have access to the same core tech
  • ASR: Automatic Speech Recognition
  • NLU: Natural Language Understanding
  • TTS: Text-to-Speech
  • API: Application Programming Interfaces
• But it’s not just about the tech
  • It has to be useful
  • It has to be usable
USABILITY: “HER” PERSONALITY
CREATING “HER” PERSONALITY
• What kind of assistant is she?
  • Straight, no-nonsense
  • Bubbly, friendly
  • Sassy, smart-mouthed
  • Relaxed, laid back
  • Energetic, excited
  • Sultry, provocative
WHY PERSONALITY MATTERS
HOW DOES “SHE” WORK?
INSIDE “HER”
[Diagram: Voice; Input/Output Channel; ASR (Recognizing); NLU (Understanding); API (Researching); TTS (Responding)]
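The stages named in the diagram can be sketched as a chain of functions. This is only an illustration of the flow, not how Asterisk or any speech engine is actually wired: `recognize`, `understand`, `research`, and `respond` are hypothetical stand-ins, the answers are made up, and the "audio" is already text so the example runs without a speech engine.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str
    slots: dict = field(default_factory=dict)


def recognize(audio: str) -> str:
    """ASR stand-in: a real engine would turn caller audio into text."""
    return audio.strip().lower()


def understand(text: str) -> Intent:
    """NLU stand-in: map recognized text to an intent by keyword."""
    if "weather" in text:
        return Intent("weather")
    if "queue" in text:
        return Intent("queue_size")
    return Intent("unknown")


def research(intent: Intent) -> str:
    """API stand-in: look up an answer (hard-coded, made-up values)."""
    answers = {
        "weather": "It is sunny.",
        "queue_size": "There are 3 callers in the queue.",
    }
    return answers.get(intent.name, "Sorry, I did not understand that.")


def respond(answer: str) -> str:
    """TTS stand-in: a real engine would synthesize audio for the caller."""
    return f"[spoken] {answer}"


def handle_turn(audio: str) -> str:
    # Recognizing -> Understanding -> Researching -> Responding
    return respond(research(understand(recognize(audio))))
```

Calling `handle_turn("What is the weather today?")` walks one caller turn through all four stages; each stand-in could be swapped for a real engine without changing the others.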
RECOGNIZING
• Different kinds of ASR
  • Dictation / Transcription
  • Grammar-based
  • Hotword
  • Biometrics / Identity
• DTMF has its place
• The Media Connection
  • MRCP
  • HTTP APIs
RECOGNIZING INTERFACES
• MRCP
  + Streaming recognition = fastest response
  + MRCPv2 is SIP-based
  – Somewhat more complex
  – Mobile-app unfriendly
• HTTP API
  + Mobile-friendly
  + Simple API
  – Record-and-upload = slower response
ASR Vendors
MRCP HTTP Grammar Dictation Hotword
Nuance ✓ ✓ ✓ ✓ ✓
LumenVox ✓ ✓
Vestec ✓ ✓ ✓
AT&T Watson ✓ ✓ ✓
Google ✓ ✓
GRAMMAR-BASED RECOG
Prompt: What can I help you with?
Caller: Book a flight
Prompt: Where are you flying from?
Caller: Atlanta
Prompt: Where would you like to go?
Caller: Chicago
Prompt: Tell me the month and day you want to leave.
Caller: August fifth
Prompt: Tell me the month and day you want to return.
Caller: August eighth
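A directed dialog like this one pairs each prompt with a small grammar that accepts only the answers expected at that point. The sketch below is a toy illustration of that idea in plain Python, starting from the first slot-filling prompt; it does not use a real grammar format or speech engine, and the city list, day words, and function names are all hypothetical.

```python
import re

MONTHS = ("january february march april may june july august "
          "september october november december").split()
DAYS = {"first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5,
        "sixth": 6, "seventh": 7, "eighth": 8, "ninth": 9, "tenth": 10}
CITIES = {"atlanta", "chicago", "las vegas"}


def match_city(utterance):
    """Grammar for city prompts: accept only a known city name."""
    text = utterance.strip().lower()
    return text.title() if text in CITIES else None


def match_date(utterance):
    """Grammar for date prompts: accept '<month> <ordinal day>' only."""
    m = re.fullmatch(r"(\w+) (\w+)", utterance.strip().lower())
    if m and m.group(1) in MONTHS and m.group(2) in DAYS:
        return (m.group(1).title(), DAYS[m.group(2)])
    return None


# One (slot, prompt, grammar) entry per question in the dialog.
DIALOG = [
    ("origin", "Where are you flying from?", match_city),
    ("destination", "Where would you like to go?", match_city),
    ("depart", "Tell me the month and day you want to leave.", match_date),
    ("return", "Tell me the month and day you want to return.", match_date),
]


def run_dialog(answers):
    """Fill one slot per prompt; out-of-grammar answers leave the slot None."""
    slots = {}
    for (slot, _prompt, grammar), answer in zip(DIALOG, answers):
        # A real application would play _prompt before listening.
        slots[slot] = grammar(answer)
    return slots
```

Because each grammar is tiny, an answer such as "somewhere warm" simply fails to match, which is the point at which a real application would re-prompt.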
NATURAL LANGUAGE
“Hm, I want to go to AstriCon in Las Vegas on October 21st for three days, and I want the last flight out.”
✓ Destination
✓ Departing Date
✓ Returning Date
+ Extra Constraint
? Origin
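To make the contrast with the directed dialog concrete, here is a rough sketch of pulling several slots out of one sentence and noticing which one is still missing. It uses a few regular expressions rather than a real NLU engine, so it only handles phrasing close to the example; the function names, the `year` parameter, and the rule that "for N days" means returning N days after departure are assumptions made for illustration.

```python
import re
from datetime import date, timedelta

MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7}


def extract_slots(utterance, year):
    """Pull flight-booking slots out of a single free-form sentence."""
    text = utterance.lower()
    slots = dict.fromkeys(
        ["origin", "destination", "depart", "return", "constraint"])

    if m := re.search(r"\bfrom ([a-z ]+?)(?= to\b| on\b|[,.]|$)", text):
        slots["origin"] = m.group(1)
    if m := re.search(r"\bin ([a-z ]+?) on\b", text):
        slots["destination"] = m.group(1)
    if m := re.search(r"\bon ([a-z]+) (\d{1,2})(?:st|nd|rd|th)?\b", text):
        if m.group(1) in MONTHS:
            try:
                month = MONTHS.index(m.group(1)) + 1
                slots["depart"] = date(year, month, int(m.group(2)))
            except ValueError:
                pass  # impossible date such as "october 99"
    if m := re.search(r"\bfor ([a-z]+) days?\b", text):
        days = NUMBER_WORDS.get(m.group(1))
        if days and slots["depart"]:
            # Assumption: "for N days" means returning N days after leaving.
            slots["return"] = slots["depart"] + timedelta(days=days)
    if m := re.search(r"\b(first|last) flight\b", text):
        slots["constraint"] = m.group(0)
    return slots


def missing(slots, required=("origin", "destination", "depart", "return")):
    """List required slots the caller has not supplied yet."""
    return [name for name in required if slots[name] is None]
```

Run on the sentence above, `missing(...)` reports only `origin`, so the application needs a single follow-up question instead of five prompts.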
Send a tweet…
Check in at…
What is the weather today?
Get me a table for two…
Who won the game last night?
What is Google trading at?
When is my next appointment?
ZZZZZZzzzzzz……
How many sales reps are still in homes?
How much have we sold so far this month?
How many callers are in the queue right now?
Add my manager to this call
When is my next open appointment slot?
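Questions like these are answered by routing the recognized request to whichever business API holds the data. One possible shape for that routing is a small registry of handlers keyed on keywords, sketched below. Everything here is hypothetical: the `intent` decorator, the handler names, and the returned figures are invented for illustration, and a real handler would query the phone system, CRM, or calendar instead of returning a fixed string.

```python
HANDLERS = []


def intent(*keywords):
    """Register a handler for utterances containing all of the keywords."""
    def register(func):
        HANDLERS.append((keywords, func))
        return func
    return register


@intent("callers", "queue")
def queue_size():
    # A real handler would ask the phone system; this value is made up.
    return "There are 4 callers in the queue."


@intent("sold", "month")
def monthly_sales():
    # A real handler would query a sales system; this value is made up.
    return "Sales so far this month total 12 units."


@intent("next", "appointment")
def next_appointment():
    # A real handler would query a calendar; this value is made up.
    return "Your next open slot is at 3 PM."


def answer(utterance):
    """Dispatch to the first handler whose keywords all appear."""
    text = utterance.lower()
    for keywords, handler in HANDLERS:
        if all(word in text for word in keywords):
            return handler()
    return "Sorry, I can't help with that yet."
```

Adding a new capability is then one decorated function, which keeps the speech layers unchanged as the set of back-end APIs grows.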
TEXT-TO-SPEECH
• Choose your voice carefully
• Voice DBs’ quality varies widely
• Tone of voice imparts as much as content
• Mix TTS with recorded audio
• Consider context of user
• Check prosody (rate, pitch, volume)
• Structure answers similarly to questions
• Give option to repeat
• Speech Synthesis Markup
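The prosody settings listed above (rate, pitch, volume) can be expressed in speech synthesis markup. As a small sketch, the helper below builds an SSML string with a `<prosody>` element around the answer and a `<break>` before an optional offer to repeat. The `build_ssml` function and its parameters are hypothetical, and which attribute values a given TTS engine honors varies, so treat the output as a starting point to test against your engine.

```python
import xml.etree.ElementTree as ET


def build_ssml(answer, follow_up=None, rate="medium", pitch="medium",
               volume="medium", pause_ms=300):
    """Wrap an answer in SSML prosody, optionally followed by a pause
    and a follow-up sentence such as an offer to repeat."""
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": "en-US"})
    prosody = ET.SubElement(
        speak, "prosody", {"rate": rate, "pitch": pitch, "volume": volume})
    prosody.text = answer
    if follow_up:
        pause = ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
        pause.tail = follow_up  # text spoken after the pause
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Your flight leaves at noon.",
                  "Would you like me to repeat that?", rate="slow")
```

Building the markup with an XML library rather than string concatenation means characters such as `&` in an answer are escaped automatically.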
BEYOND VOICE: GETTING VISUAL
MULTI-MODE APPS
• Request information by voice
• Receive information via screen
  • SMS
  • Web browser (WebRTC!)
• Allow continued input from alternate source
  • Respond via mouse click *or* voice
QUESTIONS?
PS: ALLISON WANTS TO BE THE NEXT SIRI!