Post on 01-Jan-2016

A Few of Speech Recognition's Greatest Blunders

David Thomson

CTO, SpeechPhone

(VoiceXML Tools Committee chair)

david@speechphone.com

Over 22 years in the field: some breakthroughs, some disasters.

Field Problem Examples

1. Germs and money
2. User training
3. Echo cancellation
4. Inexperienced management
5. Last-minute "improvements"
6. User interface testing
7. Half-duplex speakerphones
8. Ventilation
9. Fire safety
10. Leading the market
11. Offering too much
12. Component "upgrade"
13. Tuning

Chapter: Analog Echo

Germs and Money

ATM Speaker Verification

Pick up the phone and say the following digit string: 3594.

3594

• Two levels of security: PIN and voiceprint.

• Random digit strings protect from recordings.
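The random digit string idea can be sketched as follows. This is a minimal illustration, not the deployed ATM system: the function names `make_challenge` and `prompt_for` and the four-digit default are assumptions chosen to match the example prompt on the slide. The point is that the digits are drawn fresh for every attempt, so a recording of an earlier session is unlikely to match the new prompt.

```python
import secrets


def make_challenge(num_digits: int = 4) -> str:
    """Return a fresh random digit string for the caller to speak.

    Hypothetical helper: uses the standard-library `secrets` module so the
    digits are unpredictable from one attempt to the next.
    """
    return "".join(secrets.choice("0123456789") for _ in range(num_digits))


def prompt_for(challenge: str) -> str:
    """Build the spoken prompt shown on the slide for a given challenge."""
    return f"Pick up the phone and say the following digit string: {challenge}."


if __name__ == "__main__":
    challenge = make_challenge()
    print(prompt_for(challenge))
```

Verifying the spoken digits against both the challenge and the stored voiceprint is outside this sketch.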

User Training

MovieFone (777-FILM)

MovieFone w/ASR

• MovieFone was the dominant U.S. movie information service, taking over 80,000,000 calls/year.

• ASR overwhelmingly preferred over touch-tone in caller survey.

• Users favored menu-based over spontaneous input.

Hello and welcome to MovieFone...

Example MovieLocator Transaction

Caller: What science fiction movies are playing?

System: Near what city?

Caller: Wheaton.

System: Near Wheaton, Pirates of the Caribbean is playing at the Ogden 6 theater.

Caller: What time is it showing?

System: At the Ogden 6 theater, Pirates of the Caribbean shows at 7:30.

Movie information conversation. The recognizer is designed to understand any reasonable movie information request from the caller.

Would You Use This To Find Movies?

[Chart: responses (never, sometimes, often, always) for Newspaper, Phone the Theater, MovieFone, MovieLocator, and Menu-based. Total = 22 subjects.]

ASR vs. Human Attendants

ASR:

- 96.2% calls routed correctly

Receptionists:

- 87% calls routed correctly

Conditions: Callers were greeted with “How may I direct your call?” and were routed to one of over 30 departments. Accuracy was scored by the customer.

Echo cancellation

Echo in an Analog System

[Diagram components: Speech Recognizer, Prompt Generator, Echo Canceller, Tip/Ring Card (Hybrid), Telephone Network.]

[Diagram level labels: -11 dBm signal; -6 dB; -15 dB; -25 dBm signal; -7 dB; Line: -9 dB. Speech: -40 dBm; Echo: -33 dBm; SNR: -7 dB.]

Low speech signal strength and strong echoes generated by the local network card conspire to make speech recognition difficult. Speech is up to 9 dB quieter and echoes are about 31 dB louder than in a digital system, for a total signal-to-noise ratio loss of 40 dB.
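The arithmetic behind these figures can be checked with a short sketch. The numbers are the ones on the slide; the variable names and the `db_to_power_ratio` helper are illustrative assumptions, and the echo is treated as the "noise" term, as the slide's SNR label does.

```python
# Levels from the slide.
speech_dbm = -40.0
echo_dbm = -33.0

# SNR in dB is the difference of the two levels in dBm.
snr_db = speech_dbm - echo_dbm

# Degradation relative to a digital system, as stated on the slide.
speech_loss_db = 9.0   # speech is up to 9 dB quieter
echo_gain_db = 31.0    # echoes are about 31 dB louder
total_snr_loss_db = speech_loss_db + echo_gain_db


def db_to_power_ratio(db: float) -> float:
    """Convert a decibel difference to a linear power ratio."""
    return 10 ** (db / 10)


if __name__ == "__main__":
    print(f"SNR: {snr_db} dB")
    print(f"Total SNR loss: {total_snr_loss_db} dB")
    print(f"Power ratio for that loss: {db_to_power_ratio(total_snr_loss_db)}")
```

A 40 dB loss corresponds to a factor of 10,000 in power, which is why the echo ends up louder than the caller's speech.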

Inexperienced Management

Voice Verification and Dialing

• Panic response to competitor.

• No initial business case.

• Used unproven SV platform.

• Heavy use of inexperienced contractors.

• Poor budgeting.

• Distributed development organization.

• Turf battles, technical disagreements, egos.

• Changing feature requirements.

• Staff of 60, 4 years, $70M.

Last-Minute “Improvements”

Heat Sink Failure

Epoxy Beads

User Interface Testing

Multilingual Digit Dialer

Vier drei fünf vier zwei null sechs drei sieben. (German: four three five four two zero six three seven.)

• Complex user interface

• Language dependencies ignored

• No testing on naïve users

• User errors exceeded ASR errors

• System was deployed, then removed

Half-Duplex Speakerphones

Name Dialing - Placing a Call

(Dial tone)

Call home

Calling “home”

VoiceDialer

Telephone Network

Half-Duplex Speakerphones

Speech Recognition System

Response

Prompt

Speaker

Microphone

Half-Duplex Speakerphone

Unless user speech can force the handsfree phone to switch off the prompt, the recognition system hears nothing.
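The failure mode can be illustrated with a toy model. Everything in it is an assumption made for illustration: the `HalfDuplexPhone` class, the 6 dB switching margin, and the level values are hypothetical, and real speakerphones use more elaborate switching logic. The sketch only shows the principle that the microphone path is muted while the prompt is playing unless the user is loud enough to seize the channel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HalfDuplexPhone:
    """Toy half-duplex speakerphone: only one direction passes at a time."""

    # Hypothetical margin by which user speech must exceed the prompt
    # level before the phone switches from "listen" to "talk".
    switch_margin_db: float = 6.0

    def mic_output(self, prompt_db: Optional[float],
                   user_db: Optional[float]) -> Optional[float]:
        """Return the speech level sent to the recognizer, or None if muted.

        prompt_db: level of the prompt on the loudspeaker (None if silent).
        user_db: level of the user's speech at the microphone (None if silent).
        """
        if user_db is None:
            return None
        if prompt_db is not None and user_db < prompt_db + self.switch_margin_db:
            # The prompt holds the channel, so the recognizer hears nothing.
            return None
        return user_db


if __name__ == "__main__":
    phone = HalfDuplexPhone()
    print(phone.mic_output(prompt_db=-20.0, user_db=-25.0))  # muted during prompt
    print(phone.mic_output(prompt_db=None, user_db=-25.0))   # passes after prompt ends
    print(phone.mic_output(prompt_db=-20.0, user_db=-10.0))  # loud speech seizes the channel
```

Under this model, barge-in works only when the caller out-shouts the prompt, which matches the behavior described on the slide.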

Call messages.

What can I do for you now?

Lesson: Record or die

Unmasking Half-Duplex Equipment

Ready?

OK

Go.

Speakerphone user: 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10.

Handset user: 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10.

Ventilation

Extreme Temperature Environment

Frame 2

Frame 1

Door Vent

Fan

Hall Window (20 yards)

Airflow

120 degrees

A/C Frame cooling example - side view

[Diagram: three side-view panels, each showing a frame containing a Monitor, a Master PC, and an A/C Unit.]

A. Ideal airflow

B. Air leaks

C. Ducted frame

Improved Airflow

Fire Safety - 1

Example of Flammability Failure

IR View

Fire Safety - 2

Central Office Grade Speech Server

[Photo: CDSUs in a frame]

48V Power

LAN Card

Backplane Current Sense Resistors

Sense Resistors

Leading the Market

Wi-Fi Voice Dialing

SoftPhone

VoiceDial

SDK

TTS

Mobile Device

Data Network

VoIP Gateway

Telco

Wi-Fi Network

Call David Thomson

ASR

Offering too Much

Connecting

630-555-1212

A service that does everything

Businesses may subscribe to be listed in this service.

Movie Locator

Weather Line

Messages

Shopping

Voice E-mail

Business Directory

Voice Dialing

Business Directory.

Welcome to Lucent Technologies Automated Business Call Dialer. Please say the name of the business to call. For information, say ‘help.’

United Airlines.

Calling United. To cancel, say ‘cancel.’

VoiceXML

Privacy Manager

• Now, you can HEAR who's behind the call waiting beep.

• First, you hear the Call Waiting "beep" and then you hear the name of the second caller.

• Once you've heard the name, you decide if you want to "click over" and take the call. It's that simple!

• Talking Call Waiting is only $2.50 a month if you currently have Call Waiting on your phone line.

• Talking Call Waiting is currently available in our Major Market areas of: Chicago, IL; Indianapolis, IN; Detroit, MI; Akron, OH; Cleveland, OH; Columbus, OH; Dayton, OH; Milwaukee, WI.

$2.50/mo.

Talking Call Waiting Instructions

or Call to Order Today 1-888-635-5050

http://www.ameritech.com/navigation/site/1,1935,150,00.html

Talking Call Waiting

Component “Upgrade”

Processor (before die shrink)

Tuning

Field Accuracy Improves Over Time

[Chart: Wireless Digit Dialing Trial. Y-axis: Error Rate. X-axis: Lab, 1st Iteration, 2nd Iteration, Final. Annotations: Land-Line Models, New Models from Field Data, Final Tuning.]

Other Assorted Field Problems

• ASR works, forces touch-tone failures

• Late beep causes people to speak early

• Voice enhancement wrecked spectrum

• Failure to record left developers blind

• Speech takes the heat for unrelated bugs

For Slides or More Information

David Thomson

david@speechphone.com

Phone 949-655-1693
