using location-aware multimodal speech interfaces to survive a zombie apocalypse

Upload: deciti

Post on 06-Apr-2018


  • 8/3/2019 Using Location- Aware Multimodal Speech Interfaces to Survive a Zombie Apocalypse

    1/12

    Using Location-Aware Multimodal Speech Interfaces to Survive a Zombie Apocalypse

    [email protected]

    1. Introduction

    In the event of a large-scale apocalyptic assault on human civilization by recently
    deceased reanimated corpses, or zombies, a preexisting infrastructure allowing the
    real-time monitoring of assailants and guidance of armed personnel will greatly
    improve the likelihood of marginalized humans surviving. I propose the implementation
    of a system improving the conventional EU 112 emergency telephone number service with
    a multimodal interactive voice response (IVR) system capable of being deployed at the
    onset of a zombie apocalypse. The system's input and output modalities include speech
    recognition, speech synthesis, large visual displays, and global positioning
    information transmitted via common cellular telephones.

    Current emergency telephone services connect callers with response centers staffed
    with live operators who receive input, assess situations, and recommend actions based
    on their training. The result is often the dispatching of emergency personnel to the
    caller's location or the providing of instructions for the caller to follow.
    Integrated GPS functionality in modern cellular phones allows the operator to
    determine the caller's location with reasonable accuracy. Non-GPS capable phones can
    be pinpointed with a lesser degree of accuracy using antenna tower triangulation.

    Should the undead rise from their graves and actively seek consumption of human flesh
    (and/or brains), traditional emergency call centers will be unable to handle the
    sudden influx of incoming requests for assistance. Increasing the number of staff to
    accommodate the escalated demand will prove difficult due to decreased personnel
    reliability as a result of death or exodus, and dangers imposed by having large
    numbers of humans in a fixed environment; namely, the attraction of zombies to both
    noise and concentrations of human flesh. A zombie outbreak in the vicinity of a call
    center would force evacuation, causing service outages.

    A decentralized and redundant IP telephony network would allow favorable uptime in
    situations where it would be dangerous for humans to remain stationary for extended
    periods of time. The location of safe houses and gun caches can be provided to
    callers. Information received from a caller about the quantity of zombies in his or
    her presence is added to a database alongside GPS coordinates taken from the user's
    device. Large public displays strategically located in safe houses and central
    squares show maps with the locations of recently spotted zombies to help affected
    citizens avoid dangerous areas and track movement of zombie hordes.
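As a minimal sketch of how the nearest safe house or gun cache might be chosen from a caller's coordinates, the Python snippet below uses the haversine great-circle formula. The function names, the site list, and the coordinates are illustrative assumptions and do not come from the proposal; a deployed system would more likely measure walking distance along the street network.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_site(caller, sites):
    """Return the (name, lat, lon) entry in sites closest to the caller's (lat, lon)."""
    return min(sites, key=lambda s: haversine_km(caller[0], caller[1], s[1], s[2]))


# Hypothetical sites and coordinates, for illustration only.
SAFE_HOUSES = [
    ("Safe house A", 61.50, 23.77),
    ("Safe house B", 61.45, 23.85),
]
```

Calling `nearest_site((61.499, 23.76), SAFE_HOUSES)` would return the entry for "Safe house A", whose address could then be read out by the speech synthesizer.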

    2. Speech Recognition Technology Requirements

    Speech recognition is an imperfect process and it is important to address specific
    properties of the system's recognizer before design and development processes begin.

    Attribute | Required value | Rationale

    Vocabulary and language
    Vocabulary size | Small | A small dictionary of important action words is required by the speech recognizer.
    Grammar | Phrase-based | System recognizes predefined keywords. Grammar and system interaction model consider possible situations and user base.
    Extensibility | Changeable | Safety of citizens requires the system to support some degree of extensibility. Additional vocabulary can be added should new features be desired.

    Communication style
    Speaker | Independent | This is a public application, so speaker- and gender-independent models must be applied.
    Speaking style | Continuous | Word-spotting is emphasized.
    Overlap | Barge-in | The ability to interrupt system outputs and offer new commands is important in life-threatening situations. This will also allow longer and more informative outputs to be offered.

    Usage conditions
    Environment | Hostile | With the majority of phone calls taking place in hectic situations, the system must be prepared to handle hostile environments.
    Channel quality | Low-quality | System operates over cellular phone networks and a particular user's reception may be poor.

    2. Draft of the System's Speech Interface

    When a zombie or zombie horde is spotted, citizens dial a three-digit emergency
    number on their mobile phone and use voice input to inform a virtual agent of the
    situation. Speech input is recognized and speech output is provided in the form of
    directions to safe houses or gun caches. While supported by the application alongside
    speech input, traditional touchpad input is impractical because callers will often be
    fleeing on foot, unable to devote visual attention to a handheld device. While modern
    smartphones are capable of providing new interaction experiences, including tactile
    and haptic interfaces, the sensitivity of this application necessitates broad
    compatibility across older generations of mobile phones with a low word error rate
    (WER) threshold of less than 5%.


    The system supports both system-initiative and mixed-initiative dialogue strategies.
    Due to the safety-critical domain of the application, most users will be trained how
    to use the system before calling for the first time. Expert users will prefer the
    mixed-initiative approach, inputting known commands without waiting for guidance,
    while novice and regular users will require the system to take initiative and guide
    the conversation.

    Basic commands include SAFEHOUSE, when the location of the nearest safe house is
    desired; GUN, when the location of the nearest gun cache is sought; and HELP, when
    immediate support is requested in the form of medical or defensive assistance. Since
    all paramedics and safety personnel will be heavily armed, there is no need to
    distinguish between requests for medical and firepower support.
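The word-spotting behavior behind these commands can be sketched in a few lines of Python. This is only an illustration operating on text that a recognizer has already produced; the function name, the intent labels, and the decision to return the first keyword found are assumptions rather than part of the proposal.

```python
import re

# Trigger word -> intent. The trigger words follow the commands named in the text;
# the mapping itself is a hypothetical illustration.
KEYWORDS = {
    "safehouse": "SAFEHOUSE",
    "gun": "GUN",
    "guns": "GUN",
    "help": "HELP",
}


def spot_intent(transcript):
    """Return the intent of the first known keyword in the transcript, or None."""
    # Treat the two-word form "safe house" as the single keyword "safehouse".
    text = transcript.lower().replace("safe house", "safehouse")
    for word in re.findall(r"[a-z]+", text):
        if word in KEYWORDS:
            return KEYWORDS[word]
    return None
```

For instance, `spot_intent("I want a gun so I can kill some zombies.")` yields `"GUN"` even though the keyword is buried in a longer sentence, which is the point of spotting keywords rather than parsing full utterances.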

    Word-spotting is an important trait of the speech recognizer, but supplementary
    speech inputs, namely screams of various volumes, lengths and tones, should also be
    spotted alongside standard words. It is possible that callers will be communicating
    over low-quality channels in hostile environments. Contributing to the
    less-than-ideal situation of the call is the likelihood of a user being assaulted by
    a zombie during the application dialogue. The system should be able to recognize when
    a caller is screaming in pain or fear, and immediately dispatch assistance.
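The proposal does not say how screams would be recognized. Purely as a crude stand-in, the sketch below flags a scream when the signal level stays high for several consecutive audio frames. The function name, the normalized 0-to-1 level scale, and both default parameters are assumptions; a real recognizer would more plausibly rely on a trained acoustic model.

```python
def is_scream(frame_levels, threshold=0.8, min_frames=5):
    """Return True if at least min_frames consecutive frames are at or above threshold.

    frame_levels is a sequence of per-frame loudness values normalized to 0..1
    (a hypothetical input format chosen for this sketch).
    """
    run = 0
    for level in frame_levels:
        # Extend the current loud run, or reset it when the level drops.
        run = run + 1 if level >= threshold else 0
        if run >= min_frames:
            return True
    return False
```

Requiring a sustained run rather than a single loud frame is meant to keep short bursts of background noise from triggering a dispatch.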

    If the same emergency number service is adapted to support additional situations,
    such as traditional fire, police, and ambulance scenarios, it is important the system
    confirms the motivation of the call. If a caller engages the mixed-initiative
    approach without using any of the above keywords, the keyword ZOMBIE can be spoken at
    the beginning of the call to indicate the context of the situation. This application
    description assumes it is a dedicated zombie-only hotline.

    At the end of each transaction, callers who are not in time-sensitive situations are
    asked for specific details of their situation, including the number of zombies
    present at their location. This information is combined with the user's location
    coordinates to update maps on large public displays in central areas of town.


    Below are some example dialogues of the system in action.

    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | THERE ARE ZOMBIES EVERYWHERE! | Panicking novice user engages the system's initiative and answers the question.
    System | Are you able to safely flee the situation? | System prepares to suggest assistance by clarifying the situation.
    Caller | NO. | Caller responds to question.
    System | I am dispatching assistance to your location. | System acknowledges the caller's situation and forwards his or her location coordinates and a recording of the conversation to applicable personnel.

    Dialogue Example 1

    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | I need a safe house! | Caller ignores the system's initiative and states a known keyword.
    System | Nearest safe house is at Koskikatu 9. Would you like directions? | System recognizes safe house keyword and provides nearest location. Takes initiative by offering directions.
    Caller | Yes. | Caller responds to question.
    System | Turn right at Rongankatu. Say "next" when you are ready for the next direction. | System uses the phone's GPS functionality to locate the caller. First direction is provided with a prompt for the next.
    Caller | I am there. Next. | Caller follows direction and uses provided keyword to indicate he or she is ready for the next response.

    Dialogue Example 2


    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | AAAAAAAAAAAHHHHHH! | Caller ignores the system's initiative and screams as a zombie bites into his or her arm.
    System | I am dispatching assistance to your location. What is your emergency? | System acknowledges the caller's situation and forwards his or her location coordinates and a recording of the conversation to applicable personnel. Question is repeated in case additional details can be provided.
    Caller | [Sounds of screams and zombie grunts in the background.] | Caller is being mauled by zombies.
    System | What is your emergency? | System repeats the question every five seconds while continuing to record the conversation for later manual analysis.

    Dialogue Example 3

    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | I NEED HELP! | Caller responds to the application's inquiry.
    System | I am dispatching assistance to your location. How many zombies are present? | System recognizes the keyword "help", acknowledges the caller's inexperience and forwards his or her location coordinates and a recording of the conversation to applicable personnel. A question is asked for further detail.
    Caller | A LOT! | Numbers along with general quantitative phrases such as "a lot" must be recognized by the system for later use.

    Dialogue Example 4


    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | Guns! | Caller ignores the system's initiative and states a known keyword.
    System | Nearest gun cache is at Hämeenpuisto 28. Would you like directions? | System recognizes gun keyword and provides nearest location. Takes initiative by offering directions.
    Caller | Yes. | Caller accepts offer for directions.
    System | [Directions dialogue initiated] |

    Dialogue Example 5

    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | Two zombies. | Caller ignores the system's initiative and states a known keyword. A number is provided to aid analysis and update public maps.
    System | Two zombies noted. Do you require assistance? | System confirms the input and initiates further dialogue.
    Caller | No. [Caller hangs up] | Caller was not in danger and simply wanted to report a zombie sighting.

    Dialogue Example 6


    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | Where is the nearest safe house? | Caller ignores the system's initiative and states a known keyword.
    System | Nearest safe house is at-- | System queries database and begins providing information but is interrupted by caller's scream.
    Caller | AAaGaURAAAaaHHHHH | Caller barges in when a zombie jumps him or her from behind.
    System | I am dispatching assistance to your location. What is your emergency? | System recognizes scream and stops giving directions. Personnel are notified and initial question is repeated every five seconds while continuing to record the conversation for later manual analysis.

    Dialogue Example 7

    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | I am surrounded by zombies! | Panicking novice user engages the system's initiative and answers the question.
    System | Are you able to safely flee the situation? | System prepares to suggest assistance by clarifying the situation.
    Caller | Yes! | Caller responds to question.
    System | Nearest safe house is at Satamakatu 17. Would you like directions? | System acknowledges the caller's ability to run and automatically recommends the location of the nearest safe house. Direction dialogue ensues.

    Dialogue Example 8


    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | I want a gun so I can kill some zombies. | Caller doesn't have an emergency, but knows he can call to receive the location of gun caches.
    System | Nearest gun cache is at Hämeenpuisto 28. Would you like directions? | System recognizes gun keyword and provides nearest location. Takes initiative by offering directions.
    Caller | Nope. I know where that is. | Caller declines offer for directions and hangs up.

    Dialogue Example 9

    Party | Speech in/out | Explanation
    System | What is your emergency? | Call is connected and dialogue begins.
    Caller | HELP HELP HELP HELP! | Caller frantically yells for help.
    System | I am dispatching assistance to your location. How many zombies are present? | System recognizes the keyword "help", acknowledges the caller's inexperience and forwards his or her location coordinates and a recording of the conversation to applicable personnel. A question is asked for further detail.
    Caller | FIVE OR SIX. | The system chooses the last number stated to record in the database. This allows callers to correct themselves if need be.

    Dialogue Example 10
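The quantity handling in the dialogues above (the last number stated wins, and vague phrases such as "a lot" are still recorded) can be sketched as follows. The function name, the number-word table, the list of vague phrases, and the `"many"` return value are illustrative assumptions, not details from the proposal.

```python
import re

NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
VAGUE_PHRASES = ("a lot", "lots", "many", "everywhere")  # hypothetical phrase list


def parse_zombie_count(utterance):
    """Return the last number stated as an int, "many" for a vague phrase, else None."""
    text = utterance.lower()
    numbers = []
    for token in re.findall(r"[a-z]+|\d+", text):
        if token.isdigit():
            numbers.append(int(token))
        elif token in NUMBER_WORDS:
            numbers.append(NUMBER_WORDS[token])
    if numbers:
        return numbers[-1]  # last number wins, so callers can correct themselves
    if any(phrase in text for phrase in VAGUE_PHRASES):
        return "many"
    return None
```

With this sketch, "FIVE OR SIX." is stored as 6 and "A LOT!" as `"many"`, mirroring Dialogue Examples 10 and 4.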


    3. Multimodal Interaction Techniques

    Along with the aforementioned speech input/output interface, location coordinates are
    taken from callers' mobile phones using GPS technology (or cell tower triangulation
    when GPS data is unavailable). The speech interface will operate at the same time but
    independently of the location management module, making the modalities concurrent.
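A minimal sketch of the GPS-first, triangulation-second fallback might look like the following. The `LocationFix` type, the function name, and the accuracy figures are placeholders chosen for illustration; the proposal only states that triangulation is less accurate than GPS.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LocationFix:
    lat: float
    lon: float
    source: str        # "gps" or "cell"
    accuracy_m: float  # estimated error radius in metres (placeholder values below)


def resolve_location(gps: Optional[Tuple[float, float]],
                     cell: Optional[Tuple[float, float]]) -> Optional[LocationFix]:
    """Prefer the GPS fix; fall back to the coarser cell-tower estimate."""
    if gps is not None:
        return LocationFix(gps[0], gps[1], "gps", 10.0)
    if cell is not None:
        return LocationFix(cell[0], cell[1], "cell", 500.0)
    return None  # no location available; the dialogue would have to ask the caller
```

Because the function only consumes whatever fixes are available, it can run alongside the speech dialogue without blocking it, which is in the spirit of the concurrent modalities described above.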

    The caller's coordinates are plotted onto a map with a picture of a zombie indicating
    that there was a zombie sighting at that given location. The time at which the caller
    contacted the emergency system is placed alongside the icon. If the caller provided a
    quantitative number of zombies located at his or her position at time of call, this
    number is also situated next to the location icon. Sightings of ten or more zombies
    in one place constitute a horde, and an icon of multiple zombies alongside each other
    is used in place of the standard zombie icon. If a qualitative figure was given, such
    as "a lot", it is considered a horde.
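The icon and label rules in this paragraph translate directly into code. In the sketch below, the `Sighting` record, the icon names, the label format, and the use of the string `"many"` for a qualitative figure are assumptions made for illustration; only the threshold of ten comes from the text.

```python
from dataclasses import dataclass
from typing import Optional, Union

HORDE_THRESHOLD = 10  # ten or more zombies in one place constitute a horde


@dataclass
class Sighting:
    lat: float
    lon: float
    time: str                         # time of the call, e.g. "14:32"
    count: Optional[Union[int, str]]  # number stated, "many" for a vague phrase, or None


def icon_for(sighting: Sighting) -> str:
    """Pick the map icon: horde for ten or more or a vague quantity, else single zombie."""
    if sighting.count == "many":
        return "horde"
    if isinstance(sighting.count, int) and sighting.count >= HORDE_THRESHOLD:
        return "horde"
    return "zombie"


def label_for(sighting: Sighting) -> str:
    """Text placed beside the icon: the call time, plus the count if one was given."""
    if isinstance(sighting.count, int):
        return f"{sighting.time} ({sighting.count})"
    return sighting.time
```

A sighting with `count=12` would be drawn with the horde icon and labeled "14:32 (12)", while a sighting with no count falls back to the standard zombie icon and the time alone.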

    These zombie-sighting maps are shown on large LCD displays in safe houses and busy
    public areas. Because zombies have poor vision, they will take little interest in the
    displays, which suits situations where noise would disturb and attract hordes.

    4. Implementation Plan

    A frame-based dialogue system designed using the VoiceXML markup language will manage
    dialogue flow. The open-endedness afforded by frame-based systems, as opposed to the
    rigid structure of finite state machines, will allow missing pieces of information to
    be provided to the application at the will of the caller. The system can be developed
    and deployed in a comprehensive voice environment such as Nuance. The ability to
    rapidly prototype, test, deploy, and iterate VoiceXML applications is an important
    feature in constantly changing safety and rescue environments.
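To illustrate what "frame-based" means here, the Python sketch below models a frame as a set of slots that the caller may fill in any order, with the system prompting only for whatever is still missing. It is a language-neutral illustration rather than VoiceXML, and the slot names and helper functions are hypothetical; the prompt wording is borrowed from the example dialogues.

```python
# Slots the frame tries to fill, each with the prompt used when it is missing.
# Slot names are hypothetical; the prompts are taken from the example dialogues.
PROMPTS = {
    "intent": "What is your emergency?",
    "can_flee": "Are you able to safely flee the situation?",
    "zombie_count": "How many zombies are present?",
}


def next_prompt(frame):
    """Return the prompt for the first unfilled slot, or None when the frame is complete."""
    for slot, prompt in PROMPTS.items():
        if frame.get(slot) is None:
            return prompt
    return None


def update_frame(frame, **filled):
    """Merge newly recognized slot values into a copy of the frame, in any order."""
    merged = dict(frame)
    merged.update({k: v for k, v in filled.items() if k in PROMPTS})
    return merged
```

A caller who opens with "Two zombies." fills `zombie_count` before `intent`; the frame simply records it and the system still asks for the missing slots afterwards, whereas a finite state machine would force the caller through a fixed sequence of questions.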

    The system will be installed and configured on several servers located in
    geographically different places to decrease the likelihood of server outages
    occurring simultaneously. Following the setup of traditional IVR systems, the
    architecture should involve speech recognition and synthesis modules along with
    dialogue, presentation, database, and location management modules. The diagram below
    demonstrates the components of the server system architecture and their relationship
    to one another.

    Figure 1: Server system architecture

    5. Evaluation Plan

    The ultimate evaluation plan will be witnessing whether the system is capable of
    preventing or postponing the demise of human civilization during a zombie apocalypse.
    Until the day comes when the undead rise from their graves, traditional speech
    interface evaluation methodologies can be used to quantify the quality of the system.

    Word error rate (WER) is an important metric to control in this safety-critical
    application, especially considering the word-spotting techniques mentioned in
    previous sections. A WER as low as possible, no higher than 5%, will ensure callers
    are able to receive the attention they need. Concept error rate should also be kept
    as low as possible because incorrectly recognized concepts will result in the system
    providing irrelevant information. Perplexity should be monitored to verify that
    longer strings of words can be properly identified.
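For reference, WER is conventionally computed as the word-level edit distance between the recognizer's output and a reference transcript, divided by the number of reference words. The sketch below shows that standard calculation; the function name and the example sentences used with it are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, the 5% target corresponds to at most one misrecognized word in every twenty reference words. Recognizing "i need a safe house" as "i need the safe house" is one substitution in five words, a WER of 0.2.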

    Task completion rates and times are critical application-level metrics. Completion
    rates must be kept as high as possible and completion times to a minimum because,
    as with word and concept error rates, the more transactions that fail or run long,
    the more likely it is that lives will be lost. While money may not be of concern
    during a zombie apocalypse, significant cost savings will be introduced by the
    implementation of this system through the reduction of live operator expenses.

    Significant human-computer studies should be performed before the system is released.
    Test subjects can be asked to call the number and perform a variety of simulated
    tasks. Speech recognition can be compared against other input modes, such as
    touch-tone input or SMS messaging. Test participants' abilities to quickly obtain
    information on fictitious safe houses and gun caches will emulate the real-world
    experiences that someday will be had by citizens in the not-so-distant future.