Wideband Speech Communications: the Good, the Bad, and the Ugly

Scott Pennock, Sr. Hands-Free Standards Specialist
QNX Software Systems (Wavemakers)
The Fully Networked Car, Geneva, 4-5 March 2009



Page 2: Wideband Speech Communications:  the Good, the Bad, and the Ugly

Outline

o Introduction
o The Good
o The Bad
o The Ugly
o Conclusions

Page 3: Wideband Speech Communications:  the Good, the Bad, and the Ugly

Introduction

o What is wideband (WB) speech?
  • Speech has energy from around 50-10000 Hz
  • Traditional narrowband (NB) terminals and networks bandlimit speech to around 300-3400 Hz
  • WB speech in this presentation refers to a bandwidth of 50-7000 Hz
o Why is WB speech important to automotive?
  • More robust to vehicle noise
  • Reduces driver distraction
  • Helps enable spatial auditory displays
o This presentation will review the benefits, challenges, and unresolved issues with WB speech in an automotive environment
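
To make the bandwidth figures above concrete, the following minimal sketch computes two derived quantities for each band: the minimum sampling rate implied by the Nyquist criterion and the width of the band in octaves. The band edges come from the slide; the function names are illustrative only.

```python
import math

# Band edges in Hz, taken from the slide above.
NB_BAND = (300.0, 3400.0)
WB_BAND = (50.0, 7000.0)

def min_sample_rate(band):
    """Nyquist criterion: the sampling rate must exceed twice the top frequency."""
    return 2.0 * band[1]

def octaves(band):
    """Width of the band in octaves (doublings of frequency)."""
    return math.log2(band[1] / band[0])

for name, band in (("NB", NB_BAND), ("WB", WB_BAND)):
    print(f"{name}: {band[0]:.0f}-{band[1]:.0f} Hz, "
          f"needs fs > {min_sample_rate(band):.0f} Hz, "
          f"spans {octaves(band):.2f} octaves")
```

These minimum rates are consistent with the 8 kHz and 16 kHz sampling rates commonly used for NB and WB speech, respectively. In octave terms the WB band is roughly twice as wide as the NB band, with the extra width split between the low and high ends.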

Page 4: Wideband Speech Communications:  the Good, the Bad, and the Ugly

The “Good”

o Improves task performance
  • Better speech comprehension
  • Reduced driver distraction
  • Improved talker identification
  • Better speech localization
  • Other potential task improvements
o Preferred by users
  • Higher quality
  • Less listening effort
  • More comfortable loudness level
  • Other factors influencing preference
o Task performance benefits alone make a compelling argument for deploying WB speech in the vehicle

Page 5: Wideband Speech Communications:  the Good, the Bad, and the Ugly

WB speech provides extra frequency and temporal information

This “difference spectrogram” was calculated by subtracting the NB spectrogram from the WB spectrogram of someone saying “the juice of lemons makes fine punch”.
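
The slide does not say how the spectrograms were computed, so the following is only a sketch of the general idea under stated assumptions: a 16 kHz sampling rate, non-overlapping 10 ms frames, a naive DFT, and synthetic tones standing in for speech. The "NB" signal simply omits the high-frequency component, as an idealized 300-3400 Hz channel would.

```python
import cmath
import math

FS = 16000      # sampling rate in Hz (assumed)
FRAME = 160     # 10 ms frames, giving 100 Hz bins (assumed)

def magnitude_db(frame):
    """Magnitude spectrum of one frame in dB, via a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        # Small floor keeps log10 defined for empty bins.
        spectrum.append(20.0 * math.log10(abs(acc) / n + 1e-6))
    return spectrum

def spectrogram(signal):
    """List of per-frame dB spectra over non-overlapping frames."""
    return [magnitude_db(signal[s:s + FRAME])
            for s in range(0, len(signal) - FRAME + 1, FRAME)]

def difference_spectrogram(wb, nb):
    """Bin-by-bin WB minus NB, in dB."""
    return [[w - v for w, v in zip(wf, nf)]
            for wf, nf in zip(spectrogram(wb), spectrogram(nb))]

def tone(freq, n):
    """Unit-amplitude sine at the given frequency, n samples long."""
    return [math.sin(2 * math.pi * freq * i / FS) for i in range(n)]

# Toy signals: "WB" has a 1 kHz and a 5 kHz component, "NB" only the 1 kHz one.
low, high = tone(1000, 2 * FRAME), tone(5000, 2 * FRAME)
wb = [a + 0.5 * b for a, b in zip(low, high)]
nb = low

diff = difference_spectrogram(wb, nb)
bin_hz = FS / FRAME
print("1 kHz difference (dB):", round(diff[0][int(1000 / bin_hz)], 1))
print("5 kHz difference (dB):", round(diff[0][int(5000 / bin_hz)], 1))
```

In this toy case the difference is near zero where both signals carry the same content and large where only the WB signal has energy, which is the kind of contrast a difference spectrogram is meant to highlight.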

Page 6: Wideband Speech Communications:  the Good, the Bad, and the Ugly

WB speech increases intelligibility and is more robust to vehicle noise

[Figure: Probability of correct response by bandwidth and SNR. Horizontal axis: cutoff frequency (3.3 kHz, 5 kHz, 7 kHz). Vertical axis: Prob(correct), 0.5 to 1.0. Curves for SNR = 24 dB, 12 dB, and 0 dB, with 90% confidence intervals. Confusion shown: s (e.g., "six") mistaken for f (e.g., "fix").]

Page 7: Wideband Speech Communications:  the Good, the Bad, and the Ugly

WB speech improves speech comprehension and reduces driver distraction

This figure illustrates auditory streaming of speech. Shapes represent phonetic units that have been recognized. Dotted lines show information that would be missing without wideband speech.

Page 8: Wideband Speech Communications:  the Good, the Bad, and the Ugly

The “Bad”

o Users are more sensitive to WB echo and noise due to perceptual effects
  • Ear is most sensitive to high frequency region of WB speech
  • Loudness of echo and noise in new frequency regions will add to loudness in narrowband region
  • High frequency echo is not masked as effectively by one’s own voice
o Acoustic Echo Cancellers (AEC) have a more difficult time removing high frequency echo
  • Poor excitation signal makes it harder to drive echo canceller to convergence
  • High frequency distortion is falsely classified as driver’s speech and can prevent AEC from training
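
The point about poor excitation can be illustrated with a small adaptive-filter experiment. The sketch below is not the AEC discussed in the presentation; it is a textbook NLMS (normalized least-mean-squares) filter, with a made-up 4-tap echo path and no noise, driven once by white noise and once by a single tone as an extreme case of a spectrally poor excitation. Misalignment is the normalized distance between the true echo path and the filter's estimate.

```python
import math
import random

def nlms_misalignment_db(excitation, echo_path, mu=0.5, eps=1e-8):
    """Run NLMS on a noise-free echo and return the final misalignment in dB."""
    taps = len(echo_path)
    w = [0.0] * taps            # adaptive filter estimate of the echo path
    x = [0.0] * taps            # most recent loudspeaker samples, newest first
    for sample in excitation:
        x = [sample] + x[:-1]
        echo = sum(h * xi for h, xi in zip(echo_path, x))     # microphone signal
        err = echo - sum(wi * xi for wi, xi in zip(w, x))     # residual echo
        norm = sum(xi * xi for xi in x) + eps                 # input power
        w = [wi + mu * err * xi / norm for wi, xi in zip(w, x)]
    num = sum((h - wi) ** 2 for h, wi in zip(echo_path, w))
    den = sum(h * h for h in echo_path)
    return 10.0 * math.log10(max(num, 1e-30) / den)

ECHO_PATH = [0.5, -0.3, 0.2, 0.1]   # hypothetical echo path, for illustration only
N = 2000
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(N)]               # rich excitation
tone = [math.sin(2 * math.pi * 0.1 * n) for n in range(N)]    # poor excitation

print("white noise:", round(nlms_misalignment_db(white, ECHO_PATH), 1), "dB")
print("single tone:", round(nlms_misalignment_db(tone, ECHO_PATH), 1), "dB")
```

With white noise the estimate converges to the true path. With a single tone the filter can only learn the echo path's response at that one frequency, so the misalignment stays high however long it adapts. Real speech lies between these extremes, with comparatively little energy at high frequencies, which is one reason the added WB band is harder to cancel.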

Page 9: Wideband Speech Communications:  the Good, the Bad, and the Ugly

The challenges presented by WB speech can be addressed

o Good electro-acoustic design of vehicle platforms
  • Careful acoustic design of vehicle cabin
  • Proper selection, placement, orientation, and mounting of microphones and loudspeakers
  • High quality signal transport (e.g., optical, differential)
o High performance speech enhancement algorithms
  • AEC
  • Noise Reduction (NR)
  • Low-complexity compression for devices with limited resources

Page 10: Wideband Speech Communications:  the Good, the Bad, and the Ugly

The “Ugly”

o Interoperability issues
  • WB terminal users will experience inconsistent loudness and quality
  • NB terminal users will become less satisfied with quality because of exposure to WB speech
o Long transition period

Page 11: Wideband Speech Communications:  the Good, the Bad, and the Ugly

Users of WB terminals will experience inconsistent loudness and quality

o The solution for inconsistent loudness is to use receive Automatic Gain Control (AGC) based on perceived loudness instead of RMS or peak levels
o Differences in quality can be reduced by using BandWidth Extension (BWE) and High Frequency Encoding (HFE) techniques
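
The difference between a level-based and a loudness-based AGC can be sketched with a deliberately crude stand-in for perceived loudness. The slide does not name a loudness model, so the example below assumes the standard A-weighting curve as a rough proxy for the ear's frequency-dependent sensitivity; a real loudness-based AGC would use a more complete model. The target level and function names are hypothetical.

```python
import math

def a_weighting_db(freq_hz):
    """Standard A-weighting curve in dB (approximately 0 dB at 1 kHz)."""
    f2 = freq_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.0

def rms_db(amplitude):
    """RMS level in dB of a sine wave with the given peak amplitude."""
    return 20.0 * math.log10(amplitude / math.sqrt(2.0))

def agc_gain_db(level_db, target_db=-20.0):
    """Gain that moves the measured level to the target (hypothetical target)."""
    return target_db - level_db

# Three tones with identical RMS level but different frequencies.
for freq in (200.0, 1000.0, 3000.0):
    level = rms_db(0.1)
    weighted = level + a_weighting_db(freq)
    print(f"{freq:.0f} Hz: RMS-based gain {agc_gain_db(level):+.1f} dB, "
          f"weighted gain {agc_gain_db(weighted):+.1f} dB")
```

An RMS-based AGC applies the same gain to all three tones because their levels are identical, while the weighted version applies more gain to the low-frequency tone, to which the ear is less sensitive. The same reasoning suggests why NB and WB signals with equal RMS levels need not sound equally loud.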

Page 12: Wideband Speech Communications:  the Good, the Bad, and the Ugly

There will be a long transition period

o Deployment has already started
o Not clear when WB speech will take off, but automotive is already well positioned
  • Vehicle audio systems are currently wideband capable
  • WB microphones are available and easy to drop in
  • Several WB speech coders are already standardized
o Even after WB speech takes hold, hybrid WB/NB connections will be around for a long time
  • NB network equipment and terminals are built to last
  • Continued use in certain areas

Page 13: Wideband Speech Communications:  the Good, the Bad, and the Ugly

Conclusions

o WB speech improves task performance
o Users prefer WB speech
o WB speech is important to automotive
  • More robust to vehicle noise
  • Reduces driver distraction
  • Helps enable spatial auditory displays
o WB speech will be a key differentiator for automotive OEMs and service providers

Page 14: Wideband Speech Communications:  the Good, the Bad, and the Ugly

Conclusions (continued)

o Successful automotive deployment depends on:
  • Attention to the design of vehicle platforms
  • High performance speech enhancement algorithms (e.g., AEC, NR)
o Interoperability issues will eventually get worked out
o NB network equipment/terminals will be in use in certain areas for a long time