CREATING VOICE EXPERIENCES WITH AMAZON ALEXA
VOICE WILL BE EVERYWHERE
“The age of touch could soon come to an end. From smartphones and smartwatches, to home devices, to in-car infotainment systems, touch is no longer the primary user interface.”
Source: Design News
Although voice technology is still in its relative infancy, the future is not as far off as you think.
RAPID ADVANCEMENT
ASR: Automated Speech Recognition
What is the user actually saying?
NLP: Natural Language Processing
What is the intent of the user?
AI: Machine Learning and Intelligence
Provide extraordinary and advanced customer experiences
CLOUD: Scalable, Reliable & Secure
Ability to introduce features at scale that continuously add value over time
ACCESS: Democratizing Voice Technology
Better ASR and NLP have led to better access and adoption
[Chart: machine ASR accuracy compared with human accuracy, 1970 to 2020. Data labels: 50%, 55%, 60%, 62%, 70%, 95%. Source: MindMeld]
ASR accuracy has dramatically increased in the last 4-5 years.
This inflection point has created sustained momentum in consumer adoption of voice technology.
“As speech recognition accuracy goes from 95% to 99%, all of us in the room will go from barely using it today to using it all the time. Most people underestimate the difference between 95% and 99% accuracy. 99% is a game changer.”
Andrew Ng, Chief Scientist at Baidu
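One way to see why the jump from 95% to 99% in the quote above matters is a small back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that the accuracy figure applies per word and that word errors are independent; neither assumption comes from the slides, and the helper name `utterance_success_rate` is hypothetical.

```python
def utterance_success_rate(word_accuracy: float, words: int) -> float:
    """Chance that every word in an utterance is recognized correctly.

    Assumes independent per-word errors, which is a simplification.
    """
    return word_accuracy ** words


# Compare the two accuracy levels from the quote on a 10-word request.
for accuracy in (0.95, 0.99):
    rate = utterance_success_rate(accuracy, 10)
    print(f"{accuracy:.0%} per word -> {rate:.0%} of 10-word requests fully correct")
```

Under these assumptions, a 10-word request comes through without any error roughly 60% of the time at 95% accuracy and roughly 90% of the time at 99% accuracy, so the apparently small gain cuts failed requests by about a factor of four.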
MACHINE LEARNING & INTELLIGENCE
[Chart: performance over time. Curves: computer performance & machine learning; human performance. Marker: WE ARE HERE]
“Soon it will seem almost quaint there was a time we looked at voice assistants as virtual friends who lived in our pockets and answered our questions.”
Source: The Drum News, “How Voice Tech Will Change Our Lives Forever”
INTERFACE EVOLUTION EVENTS
It took generations and several major technological advancements for touchscreens, GUI and VUI to achieve critical adoption.
Following non-commercial GUI milestones, the advances of the early 80s and 90s (Windows 95, Apple’s OS, the Internet) changed the trajectory of the GUI.
It wasn’t until the Palm Pilot of the late ‘90s and the smartphones of the mid-2000s that Touch emerged as a key interaction modality.
Human-To-Human VUI was brought in with the dawn of the telephone, but Human-To-Machine VUIs have only recently become viable.
TOUCH vs. VOICE
TOUCH
• Can handle more info
• More familiar
• Harder to get lost
• Provides flexibility
VOICE
• Faster
• Less cumbersome
• Universal
• Removes noise
VOICE AS A KEY MODALITY
Where it’s going is limited only by one’s imagination, but voice WILL play a key role in how we control our homes and outdoor spaces and how we access information… Why?
ACCESS
Billions of devices, from phones, watches and cars to Alexa-powered devices such as the Amazon Echo, Echo Dot, Amazon Tap and Amazon Fire TV, are making access to voice ubiquitous.
ACCURACY
Advances in the ability to understand user intention are changing the game in adoption as interactions become faster and more reliable.
EFFICIENCY
Ease of use will make it a powerful choice for quick access to anything inside and outside our environment.
SECURITY
Voice is a highly distinctive signature. As advances in biometrics are integrated with voice, our individuality will become a key to further personalization and security.
DEVELOPING FOR VOICE
THE ALEXA SERVICE
Supported by two powerful SDKs
ALEXA SKILLS KIT
Create Great Content: ASK is how you connect to your consumer
ALEXA VOICE SERVICE
Unparalleled Distribution: AVS allows your content to be everywhere
• Lives In The Cloud
• Automated Speech Recognition (ASR)
• Natural Language Understanding (NLU)
• Always Learning
ALEXA SKILLS KIT
EUROPEAN ALEXA SKILLS PARTNERS
ALEXA SKILLS KIT ARCHITECTURE
A closer look at how the Alexa Skills Kit processes a request and returns an appropriate response
• User Makes a Request
• Audio Stream Is Sent Up to Alexa
• Alexa Identifies Skill & Recognizes Intent Through ASR & NLU
• Alexa Sends Customer Intent to Your Service
• Your Service Processes Request
• You Pass Back a Textual or Audio Response
• You Pass Back a Graphical Response
• Alexa Converts Text-to-Speech (TTS) & Renders Graphical Component
• Respond to Intent through Text & Visual
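Seen from the skill's side, the request flow above comes down to receiving a JSON request that carries the recognized intent and returning a JSON response with speech text and an optional card. The sketch below is a minimal illustration and not the deck's own code: the intent name `GetWeatherIntent`, the slot name `City`, and the helper names are hypothetical, and the dictionaries follow the general shape of the Alexa Skills Kit JSON interface.

```python
def build_response(speech_text: str, card_title: str) -> dict:
    """Build a skill response: text for Alexa to speak plus a simple card."""
    return {
        "version": "1.0",
        "response": {
            # Textual response; Alexa converts it to speech (TTS).
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            # Graphical response rendered in the companion app.
            "card": {"type": "Simple", "title": card_title, "content": speech_text},
            "shouldEndSession": True,
        },
    }


def handle_request(event: dict) -> dict:
    """Route an incoming request to a reply based on the recognized intent."""
    request = event["request"]
    if request["type"] == "IntentRequest":
        intent = request["intent"]
        # "GetWeatherIntent" and the "City" slot are hypothetical names.
        if intent["name"] == "GetWeatherIntent":
            city = intent.get("slots", {}).get("City", {}).get("value", "your area")
            return build_response(f"Here is the weather for {city}.", "Weather")
    return build_response("Sorry, I did not understand that.", "Weather")
```

A function like `handle_request` would typically sit behind whatever web service or cloud function hosts the skill; the hosting choice does not change the request and response shapes.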
[Diagram: handling of the request “Alexa, what’s the weather?”. Device: wake word detection, speech capture, text to speech output. Speech Platform: ASR, NLU, TTS. Skills: Weather. Flow labels: user’s utterance, recognize, recognition result, intent, text/SSML, “speak” directive, Alexa’s voice]
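The text/SSML label in the diagram means a skill can hand back either plain text or SSML markup for the TTS stage to render. The sketch below shows one way a skill might assemble an SSML speech object; the helper name `ssml_output_speech` and the default pause length are assumptions made for illustration, while `<speak>` and `<break>` are standard SSML elements.

```python
from xml.sax.saxutils import escape


def ssml_output_speech(sentences: list, pause_ms: int = 300) -> dict:
    """Join sentences into one SSML document with a short pause between them."""
    # Escape characters such as & and < so the markup stays well-formed.
    pause = f'<break time="{pause_ms}ms"/>'
    body = pause.join(escape(sentence) for sentence in sentences)
    return {"type": "SSML", "ssml": f"<speak>{body}</speak>"}


speech = ssml_output_speech(["It is 20 degrees & sunny.", "Enjoy your day."])
print(speech["ssml"])
```

The returned dictionary has the same role as the plain-text speech object in a skill response, with the markup giving the skill some control over pacing.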
ALEXA VOICE SERVICE
ALEXA FOR MANY KINDS OF DEVICES
ALEXA VOICE SERVICE ARCHITECTURE
A closer look at how the Alexa Voice Service streams and receives audio from the AVS API
[Diagram: YOUR DEVICE (audio capture, audio playback) runs YOUR CODE, which talks to the ALEXA VOICE SERVICE through a REST API. REST request: “What time is it?” REST response: “It’s 8 PM”]
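On the device side, the REST request in this architecture amounts to packaging captured audio together with some metadata and posting it with an access token. The sketch below only builds such a request and never sends it. The endpoint URL, the part names, and the metadata fields are placeholders invented for illustration, not the actual AVS wire format, which should be taken from the AVS API documentation.

```python
import json
import uuid
import urllib.request

# Hypothetical endpoint for illustration only; not the real AVS URL.
ENDPOINT = "https://avs.example.invalid/recognize"


def build_recognize_request(audio: bytes, access_token: str, boundary: str = "") -> urllib.request.Request:
    """Package captured audio and JSON metadata as a multipart POST request."""
    boundary = boundary or uuid.uuid4().hex
    # Assumed metadata describing the captured audio (16 kHz mono PCM).
    metadata = json.dumps({"format": "audio/L16; rate=16000; channels=1"}).encode("utf-8")
    parts = [
        (b"metadata", b"application/json; charset=UTF-8", metadata),
        (b"audio", b"application/octet-stream", audio),
    ]
    body = b""
    for name, content_type, payload in parts:
        body += b"--" + boundary.encode("ascii") + b"\r\n"
        body += b'Content-Disposition: form-data; name="' + name + b'"\r\n'
        body += b"Content-Type: " + content_type + b"\r\n\r\n"
        body += payload + b"\r\n"
    # Closing boundary marks the end of the multipart body.
    body += b"--" + boundary.encode("ascii") + b"--\r\n"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")
```

Sending the request and playing back the audio in the response are left out on purpose, since both depend on the real endpoint, authentication flow, and audio hardware of the device.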
WHAT NEXT?
SOME USEFUL RESOURCES
http://developer.amazon.com/ask
http://developer.amazon.com/blog
http://developer.amazon.com/alexa-fund
http://bit.ly/alexadevchat
http://bit.ly/alexaforums
http://bit.ly/alexacerthelp
http://bit.ly/alexadevevents
THANK YOU
QUESTIONS?