ITCS 6010: Natural Language Systems. Overview: Welcome to ITCS 6010, Syllabus, Introduction
Post on 14-Dec-2015
Good Design (our goal!)
“Every designer wants to build a high-quality interactive system that is admired by colleagues, celebrated by users, circulated widely, and imitated frequently.” (Shneiderman, 1992, p.7)
…and anything goes!…
What is an interface?
An interface refers to the part of technology that people interact with
Interactions include information transfer:
  From user to computer
  From computer to user
Interaction Components
Interaction hardware includes: keyboard, mouse, stylus, keypad, microphone
Interaction software includes: window, page, sound, talking voice
What is a ‘well-designed’ interface?
Depends on your perspective… Examples:
For a programmer – works within technical constraints of project
For a usability engineer – designed with particular user group in mind
For a user – works the way expected
Types of Interfaces
Character-based user interface (CHUI)
Graphical user interface (GUI)
Web user interface (WUI)
Speech user interface (SUI)
Auditory user interface (AUI)
Graphical user interface with speech (S/GUI)
Voice user interface (VUI)
Auditory User Interfaces
An Auditory user interface (AUI) is an interface which relies primarily or exclusively on audio for interaction, including speech and sound. (Weinschenk & Barker 2000)
Examples:
  Hands-free automobile navigational system
  Interactive voice response (IVR) system, like an automated payment center
  Products for the visually impaired
Auditory User Interfaces
Natural Language/Speech User Interfaces
  Conversation is natural
Multimodal User Interfaces
  Combine voice, text, graphics, gestures, keypad, stylus, etc. into one interface
Graphical User Interface with Speech (S/GUI)
Multimodal interface that involves speech and a GUI
Examples:
  Voice-activated calling on a cell phone
  Dictation software that allows text to be entered by typing, speech, or both
Graphical User Interface with Non-Speech Audio
Interface that includes non-verbal audio
Earcons: auditory icons/sounds that communicate information
Examples:
  System beeps when the user makes an error
  System knocks when someone wants to chat
Multimodal User Interfaces
Simultaneous Multimodality
  Multiple modes at the same time (voice-visual)
Sequential Multimodality
  Uses multiple modes sequentially and seamlessly
Voice User Interface
A voice user interface (or VUI) is what a person interacts with when communicating with a spoken language application. (Cohen et al., 2004)
Why a VUI?
Characteristics that favor a VUI:
  Hands-busy situation
  No keyboard, mouse, or stylus available
  Disabilities
  Context-specific, command-driven application
But What Makes a Good VUI?
Functionality
Speed & efficiency
Reliability, security, data integrity
Standardization, consistency
USABILITY!
Closer to Fine: A Philosophy
…The human user of any system is the focus of the design process. Planning and implementation is done with the user in mind, and the system is made to fit the user, not the other way around….
Bruce Walker, Georgia Institute of Technology
Human Factors in Speech
High Error Rates
  Speech recognition
  Background noise, intonation, pitch, volume
  Grammars (missing words, size limitations)
“When speech recognition becomes genuinely reliable, this will cause another big change in operating systems.” (Bill Gates, The Road Ahead 1995)
Human Factors in Speech
Unpredictable Errors
  Grammars
    Sound-alike words (Austin-Boston)
    Missing words
    Grammar size limitations
Note: We do not like using unpredictable machines.
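The Austin-Boston problem can be illustrated with a rough check that flags grammar entries likely to be confused with one another. This is only a sketch: it compares spellings with Python's standard `difflib`, which is a crude stand-in for acoustic similarity, and the function name `confusable_pairs` and the 0.45 threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def confusable_pairs(vocabulary, threshold=0.45):
    """Return pairs of entries whose spellings are similar enough to review.

    Spelling similarity is only a proxy for sounding alike (an assumption);
    a real check would compare phoneme sequences instead.
    """
    flagged = []
    for first, second in combinations(vocabulary, 2):
        # ratio() is 1.0 for identical strings and 0.0 when nothing matches.
        similarity = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if similarity >= threshold:
            flagged.append((first, second))
    return flagged


if __name__ == "__main__":
    cities = ["Austin", "Boston", "Dallas"]
    for pair in confusable_pairs(cities):
        print("Review for confusion:", pair)
```

Pairs flagged this way are candidates for a confirmation prompt ("Did you say Austin?"), which makes the error predictable for the user.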
Human Factors in Speech
User Expectations
  Novice users have high expectations of computers and speech
Natural language
  Novices expect to say “anything” to the machine, e.g. Star Trek
  Spoken language differs from written language, e.g. ums or uhs appear in spoken language
Human Factors in Speech
Memory
  Speech-only systems can be taxing on human memory, e.g. large telephone menu systems.
  Miller: 7 plus or minus 2 (the approximate number of items short-term memory holds)
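One way to act on the memory limit is to split a long telephone menu into short prompts. The sketch below is illustrative only: the function names, the prompt wording, and the default of five options per prompt (the low end of Miller's range) are assumptions, not part of the lecture.

```python
def chunk_menu(options, max_per_prompt=5):
    """Split a long option list into chunks of at most max_per_prompt items."""
    if max_per_prompt < 1:
        raise ValueError("max_per_prompt must be at least 1")
    return [options[i:i + max_per_prompt]
            for i in range(0, len(options), max_per_prompt)]


def render_prompts(options, max_per_prompt=5):
    """Build the spoken text for each chunk, with a cue to hear more."""
    chunks = chunk_menu(options, max_per_prompt)
    prompts = []
    for index, chunk in enumerate(chunks):
        text = "You can say: " + ", ".join(chunk) + "."
        # Every chunk except the last offers a way to reach the next one.
        if index < len(chunks) - 1:
            text += " Or say 'more options'."
        prompts.append(text)
    return prompts


if __name__ == "__main__":
    menu = ["balance", "payments", "transfers", "statements",
            "cards", "loans", "agent"]
    for prompt in render_prompts(menu):
        print(prompt)
```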
Speech Recognition
Refers to the technologies that enable computing devices to identify the sound of human voice.
List all the U-N-C Charlotte orders.
Speech Recognition
Continuous Recognition
  Allows a user to speak to the system in an everyday manner without using specific, learned commands.
Discrete Recognition
  Recognizes a limited vocabulary of individual words and phrases spoken by a person.
Speech Recognition
Word Spotting
  Recognizes predefined words or phrases.
  Used by discrete recognition applications.
“Computer, I want to surf the Web”
“Hey, I would like to surf the Web”
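The idea behind word spotting can be sketched on text. This is a simplification: a real word spotter works on audio, whereas the example below assumes the utterance has already been transcribed. The function name `spot_keywords` and the keyword list are hypothetical.

```python
import re


def spot_keywords(utterance, keywords):
    """Return the predefined phrases that occur anywhere in the utterance."""
    # Normalise case and drop punctuation so "Web" and "web," both match.
    words = re.findall(r"[a-z']+", utterance.lower())
    normalised = " " + " ".join(words) + " "
    # Padding with spaces ensures only whole words and phrases match.
    return [kw for kw in keywords if " " + kw.lower() + " " in normalised]


if __name__ == "__main__":
    keywords = ["surf the web", "check email"]
    for said in ("Computer I want to surf the Web",
                 "Hey, I would like to surf the Web"):
        print(said, "->", spot_keywords(said, keywords))
```

Both utterances trigger the same command because everything outside the spotted phrase is ignored.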
Speech Recognition
Voice Verification or Speaker Identification
  Voice verification is the science of verifying a person's identity on the basis of their voice characteristics.
  Unique features of a person's voice are digitized and compared with the individual's pre-recorded "voiceprint" sample stored in the database for identity verification.
  It is different from speech recognition because the technology does not recognize the spoken word itself.
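The comparison step can be sketched as follows. This is a minimal illustration, assuming the voice has already been reduced to a numeric feature vector. The feature values, the cosine-similarity measure, the 0.95 threshold, and the names `verify_speaker` and `voiceprints` are all assumptions chosen for the example, not a description of any particular product.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_speaker(sample_features, claimed_id, voiceprints, threshold=0.95):
    """Accept the claimed identity only if the sample matches the stored voiceprint."""
    enrolled = voiceprints.get(claimed_id)
    if enrolled is None:
        return False  # nobody enrolled under this identity
    return cosine_similarity(sample_features, enrolled) >= threshold


if __name__ == "__main__":
    # Hypothetical enrolled voiceprint; real systems store many more features.
    voiceprints = {"james": [0.9, 0.1, 0.4]}
    print(verify_speaker([0.88, 0.12, 0.41], "james", voiceprints))
    print(verify_speaker([0.10, 0.90, 0.20], "james", voiceprints))
```

Note that nothing in the check looks at which words were spoken, only at how closely the voice features match.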
Speech Synthesis
Refers to the technologies that enable computing devices to output simulated human speech.
James, here are the U-N-C Charlotte orders.
Speech Synthesis
Formant Synthesis
  Uses a set of phonological rules to control an audio waveform that simulates human speech.
  Sounds like a robot, very synthetic, but getting better.
Speech Synthesis
Concatenated Synthesis
  Uses computer assembly of recorded voice sounds to create meaningful speech output.
  Sounds very human; most people can’t tell the difference.
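The assembly step can be sketched at the word level. This is a toy illustration under stated assumptions: each recording is represented as a short list of integer samples, the units are whole words, and the name `concatenate_units` is hypothetical. Real systems use smaller units and smooth the joins between them.

```python
def concatenate_units(text, recordings):
    """Join the recorded clip for each word of text into one sample list."""
    samples = []
    for word in text.lower().split():
        unit = word.strip(".,!?")
        if unit not in recordings:
            # A missing recording means this phrase cannot be produced.
            raise KeyError(f"no recording for unit: {unit!r}")
        samples.extend(recordings[unit])
    return samples


if __name__ == "__main__":
    # Hypothetical pre-recorded clips, shown as tiny lists of audio samples.
    recordings = {"here": [1, 2], "are": [3], "the": [4, 5], "orders": [6]}
    print(concatenate_units("Here are the orders.", recordings))
```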
Uses of Speech Technologies
Interactive Voice Response Systems
Call centers
Medical, Legal, Business, Commercial, Warehouse
Handheld Devices
Toys and Education
Automobile Industry
Universal Access (visually/physically impaired)
Environments
What are the environments?
  Physical Places of Operation
  Operating Environments/Systems
Interactions
What are the interactions?
  Between humans
  Between machines
  Between humans and machines