p. n. kulkarni, p. c. pandey, and d. s. jangamashetti / dsp 2009, santorini, 5-7 july 2009 1 dsp...

12
P . N . K u l k a r n i , P . C . P a n d e y , a n d D . S . J a n g a m a s h e t t i / D S P 2 0 0 9 , S a n t o r i n i , 5 - 7 J u l y 2 0 0 9 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P, Paper: S4P.1 Multi-Band Frequency Compression for Sensorineural Hearing Impairment P. N. Kulkarni 1 P. C. Pandey 2 D. S. Jangamashetti 3 1, 2 IIT Bombay, India 3 Basaveshwar Engg. College, Bagalkot , Kar., India

Upload: loreen-bryant

Post on 14-Jan-2016

215 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

1

DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P, Paper: S4P.1

Multi-Band Frequency Compression for Sensorineural Hearing Impairment

P. N. Kulkarni1

P. C. Pandey2

D. S. Jangamashetti3

1, 2 IIT Bombay, India3Basaveshwar Engg. College, Bagalkot , Kar., India

<{pnkulkarni,pcpandey}@ee.iitb.ac.in, [email protected]>

Page 2: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

2

ABSTRACT

Sensorineural hearing loss is associated with widening of the auditory filters, leading to increased spectral masking and degraded speech perception. Multi-band frequency compression can be used for reducing the effect of spectral masking. The speech spectrum is divided into a number of bands and spectral samples in each of these bands are compressed towards the band center, by a constant compression factor. In the present study, we have investigated the effectiveness of the scheme for different compression factors, in improving the speech perception. Evaluation of the scheme using the modified rhyme test showed maximum improvement in recognition scores for compression factor of 0.6: about 17 % for the normal-hearing subjects under simulated hearing loss, and 6 – 21 % for the subjects with moderate to severe sensorineural hearing loss.

Page 3: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

3

1. INTRODUCTION

Sensorineural hearing loss

Widening of auditory filters, resulting in increased spectral masking and degradation in speech perception.

Multi-band frequency compression for reducing the effect of increased spectral masking

Spectrum divided into a number of bands and spectral components in each band compressed towards the center, for enhancing the spectral contrasts.

Page 4: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

4

Earlier work

Critical band based compression [Yasu et al., 2002, 2004] ▪ Magnitude spectrum compressed towards center of each critical band and associated with unaltered phase spectrum (segmentation with Hamming window, STFT, spectral modification, and overlap-add synthesis)▪ Moderate improvement in the VCV recognition score for hearing-impaired subjects (unproc. 35.4%, proc. 38.3%).

Objective of the investigation

To select the most appropriate combination of segmentation, bandwidth, frequency mapping, and compression factor for analysis-synthesis.

Page 5: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

5

2. SIGNAL PROCESSING

Segmentation▪ Fixed-frame : 20 ms frames with 50% overlap.▪ Pitch-synch.: two local pitch period frames aligned to glottal closure instants (GCIs), with one pitch period overlap.

Spectral analysis and modification▪ Zero padding, 1024-point FFT▪ Compression of complex spectral samples in a set of predefined bands towards the center by a fixed CF

Resynthesis: IFFT and overlap-add

Input speech

Proc. speech

Page 6: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

6

Factors affecting quality & intelligibility

▪ Segmentation ▪ Bandwidth ▪ Frequency mapping ▪ Comp. factor

Bandwidth

▪ Constant bandwidth (no.of bands : 2 – 18)▪ 1/3 octave bandwidth▪ Auditory criticalbandwidth (ACB)

0

1

2

3

4

5

0 1 2 3 4 5Input frequency (kHz)

Ou

tpu

t fr

eq

uen

cy (

kH

z)

BW = ACB,Comp. factor = 0.6

Page 7: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

7

Frequency mapping ▪ Sample-to-sample ▪ Superimposition of spectral samples▪ Spectral segment

Spectral segment mapping

1/b a

1

1'

n

j mY k m a X m X j b n X n

m, n : first and last FFT indices in the segment [a,b].

k

k'

Unpr. Sp. Index

Proc. Sp. Index

k'k'+0.5k'-0.5

nma b kic

kic

' 0.5 /ic ica k k k

Output spectral sample = weighted sum of complex spectral samples in the input frequency segment [a,b] corresponding to the output sample k'.

Page 8: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

8

Processing example

▪ /aka/: (a) unpr. (b) proc. (fixed-frame seg., spectral segment mapping. ACB, CF = 0.6).

▪ Harmonic structure in voiced segments & randomness in unvoiced segments approximately preserved

(a)

(b)

MOS tests (normal hearing subjects) Highest scores for pitch-synch. segmentation, ACB, spectral seg. mapping [Kulkarni et al, Int. J. Speech Tech., vol. 10, pp. 219 - 227, 2007]

Page 9: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

9

3. LISTENING TESTSModified Rhyme Test (MRT) for quantitative evaluation of speech intelligibility▪ 300 CVC words, presented in six test lists with a carrier phrase.

▪ Automated test procedure for randomized presentation & recording of response and response time.

Experiment I▪ 6 normal-hearing subjects with simulated hearing loss: Broadband masking noise added to the processed speech with SNR constant on a short time (10 ms) basis

▪ SNR: ∞, 6, 3, 0, -3, -6, -9, -12, and -15 dB.▪ 10,800 presentations per subject (300 words × 4 comp. factors × 9 SNR)

Experiment II▪ 11 subjects with moderate to severe sensorineural hearing loss (without using their hearing aids).▪ 1,200 presentations per subject (300 words × 4 comp. factors)

Page 10: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

10

40

50

60

70

80

90

100

Inf 6 3 0 -3 -6 -9 -12 -15SNR (dB)

Rec

og

nit

ion

sco

re (

%)

Unp.

CF = 0.8

CF = 0.6

CF = 0.4

Exp. I: Recognition scores (avg. across 6 normal hearing subjects)

▪ Processing improved recognition scores for SNR < 0 dB▪ Best improvements observed for C.F. = 0.6 (p < 0.001).▪ Avg. Improvement of 17 % in recognition score for SNR < -6 dB ▪ SNR advantage of 6 dB at about 60% recognition score.

4. RESULTS

Page 11: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

11

0

20

40

60

80

100

KN

R

MN

R

PK

R

PA

L

PA

R

PP

R

PE

L

RG

R

RJL

SS

L

SS

R

Subjects

Rec

og

. sco

re (

%)

Unproc. CF=0.8 CF=0.6 CF=0.4Exp. II : Recognition scores (for 11 hearing-impaired subjects)

Compression factor 0.8 0.6 0.4

Improvement in % R. S. 2 – 8 6 – 21 3 – 16

Page 12: P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009, Santorini, 5-7 July 2009 1 DSP 2009 (Santorini, Greece. 5-7 July 2009), Session: S4P,

P. N

. Ku

lka r

ni ,

P. C

. Pan

de y

, an

d D

. S. J

ang

a mas

he t

ti /

DS

P 2

009 ,

San

tori

ni ,

5-7

J uly

20 0

9

12

5. CONCLUSION

For both the group of subjects, max. improvement in recognition scores for CF = 0.6

● Normal-hearing subjects ▪ Avg. Improvement of 17 % in recognition score for SNR < -6

dB. ▪ SNR advantage of 6 dB.

● Hearing-impaired subjects Improvement in % recognition score: 6 – 21.