detection of acoustic landmark

Post on 15-Feb-2017

Category:

Engineering

1

Detection of Acoustic Landmarks for Speech Processing with High Resolution

M.Tech Credit Seminar

Pushpa Gothwal (09307054)

Supervisor: Prof. P. C. Pandey

Electrical Engineering Department

November 2009

2

Outline

Introduction

Landmarks and their categorization

Landmark detection methods

1. Manual labeling of landmarks

2. Detection of abrupt-consonantal and abrupt landmarks

3. Stop consonant landmark detection method

Summary and future work

3

Introduction

Speech perception under adverse listening conditions can be improved by processing the speech signal.

This processing requires the detection of landmarks in the signal.

Aim: To study three methods of landmark detection and compare their temporal resolution.

5

What is a Landmark?

A landmark is a region of the speech signal where a spectral discontinuity occurs.

Landmarks can be categorized as:

– Abrupt-consonantal: marks the closure or release of a constriction. Example: /able/

– Abrupt: marks a change in the sound due to glottal activity. Example: /paint/

– Nonabrupt: marks the transition from a semivowel to a vowel or vice versa. Example: /away/

– Vocalic: occurs when the vocal tract is maximally open for a vowel. Example: /bat/

6

An illustration of landmarks. AC = abrupt-consonantal, A = abrupt, N = nonabrupt, V = vocalic (Liu 1996)

8

Manual labeling of landmarks

Spectrogram of /aba/ (Praat)

10

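Detection of abrupt-consonantal and abrupt landmarks

The method detects two types of landmarks: abrupt-consonantal and abrupt.

The spectrum is divided into 6 bands:

Band   Frequency (kHz)   Purpose
1      0.0-0.4           Monitors glottal activity
2      0.8-1.5           Monitors closure and release of sonorants
3      1.2-2.0           Monitors closure and release of sonorants
4      2.0-3.5           Monitors closure and release of sonorants
5      3.5-5.0           Monitors closure and release of sonorants
6      5.0-8.0           Monitors stops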
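The six-band split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say how the energy in each band is measured, so the sketch simply sums the power in each band. The 16 kHz sampling rate used in the usage note is assumed because the bands extend to 8 kHz. The names `band_bins` and `band_energies_db` are hypothetical.

```python
import math

# Band edges from the table above, converted from kHz to Hz.
BANDS_HZ = [(0, 400), (800, 1500), (1200, 2000),
            (2000, 3500), (3500, 5000), (5000, 8000)]

def band_bins(lo_hz, hi_hz, fs_hz, n_fft):
    """Return (k1, k2), the inclusive range of DFT bins inside [lo_hz, hi_hz]."""
    k1 = (lo_hz * n_fft + fs_hz - 1) // fs_hz      # ceiling of lo_hz * N / fs
    k2 = min(hi_hz * n_fft // fs_hz, n_fft // 2)   # floor, capped at Nyquist
    return k1, k2

def band_energies_db(power_spectrum, fs_hz, n_fft, bands_hz=BANDS_HZ):
    """Summed power per band in dB (summing is an assumption of this sketch)."""
    energies = []
    for lo_hz, hi_hz in bands_hz:
        k1, k2 = band_bins(lo_hz, hi_hz, fs_hz, n_fft)
        total = sum(power_spectrum[k1:k2 + 1])
        # A small floor avoids log10(0) for silent bands.
        energies.append(10.0 * math.log10(max(total, 1e-12)))
    return energies
```

With a 512-point DFT at 16 kHz, `band_bins(0, 400, 16000, 512)` covers bins 0 to 12, and band 6 runs up to the Nyquist bin.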

11

Detection of abrupt-consonantal and abrupt landmarks (cont.)

Landmark detection algorithm (Liu 1996)

12

Detection of abrupt-consonantal and abrupt landmarks (cont.)

Spectrogram of "the money is coming today". The middle figure shows the energy of band 1, and the bottom figure shows the rate of rise (ROR) of the band energy (Liu 1996).

14

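Stop consonant landmark detection method

Pass 1

Step 1: The spectrum is divided into 5 bands

Band   Frequency (kHz)   Purpose
1      0.0-0.4           Monitors glottal vibration
2      0.4-1.2           Monitors consonant closure and release
3      1.2-2.0           Monitors consonant closure and release
4      2.0-3.5           Monitors consonant closure and release
5      3.5-5.0           Monitors consonant closure and release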

15

Processing stages for landmark detection (Arjun et al., 2008)

Input: speech

Pass 1:

1. Short-time spectral analysis

2. Computation of energy peaks and centroids

3. Computation of RORs of energy and centroid

4. Computation of spectral transition index

5. Landmark localization

Output: landmarks (Pass 1)

Pass 2:

1. Wavelet decomposition around landmarks

2. Computation of short-time energy and ZCR

3. Computation of energy and ZCR RORs

4. Landmark localization

Output: landmarks (Pass 2)
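The first Pass 1 stage, short-time spectral analysis, can be sketched as follows. The slides give no frame length, window, or DFT size, so the Hamming window and zero-padding here are assumptions, and `hamming` and `power_spectrum` are hypothetical helper names. A direct DFT is used to keep the sketch dependency-free; a practical implementation would use an FFT routine instead.

```python
import cmath
import math

def hamming(length):
    """Symmetric Hamming window of the given length (length >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]

def power_spectrum(frame, n_fft):
    """Return |X_n(k)|^2 for k = 0..n_fft/2 of one windowed, zero-padded frame."""
    window = hamming(len(frame))
    x = [s * w for s, w in zip(frame, window)]
    x += [0.0] * (n_fft - len(x))                  # zero-pad up to the DFT size
    spectrum = []
    for k in range(n_fft // 2 + 1):
        # Direct O(N^2) DFT, fine for a demonstration on short frames.
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n_fft)
                  for t in range(n_fft))
        spectrum.append(abs(acc) ** 2)
    return spectrum
```

Each frame's power spectrum then feeds the band-wise peak and centroid computations of the later steps.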

16

Stop consonant landmark detection method (cont.)

Step 2: Computation of energy peaks and centroids in frequency bands

The peak energy in band b of frame n is

$$E_p(b,n) = 10 \log_{10}\left( \max_{k_1 \le k \le k_2} |X_n(k)|^2 \right) \tag{1}$$

The centroid frequency is

$$f_c(b,n) = \frac{\sum_{k=k_1}^{k_2} k\,|X_n(k)|^2}{\sum_{k=k_1}^{k_2} |X_n(k)|^2} \cdot \frac{f_s}{N} \tag{2}$$

where $k_1$ and $k_2$ are the lower and upper frequency indices of band b, and n is the frame index.
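Equations (1) and (2) translate almost directly into code. The sketch below assumes the power spectrum |X_n(k)|^2 of one frame is already available as a list, and that f_s and N are the sampling frequency and DFT size. The function names, the log floor, and the fallback for a silent band are additions of this sketch, not part of the slides.

```python
import math

def peak_energy_db(power, k1, k2):
    """Eq. (1): 10*log10 of the largest |X_n(k)|^2 for k1 <= k <= k2."""
    peak = max(power[k1:k2 + 1])
    return 10.0 * math.log10(max(peak, 1e-12))     # floor avoids log10(0)

def centroid_hz(power, k1, k2, fs_hz, n_fft):
    """Eq. (2): power-weighted mean bin index in the band, scaled by fs/N."""
    num = sum(k * power[k] for k in range(k1, k2 + 1))
    den = sum(power[k] for k in range(k1, k2 + 1))
    if den == 0:
        # Assumption: report the band centre when the band holds no power.
        return 0.5 * (k1 + k2) * fs_hz / n_fft
    return (num / den) * fs_hz / n_fft
```

For example, a band with equal power at bins 10 and 20 has its centroid at bin 15, which is 468.75 Hz for a 512-point DFT at 16 kHz.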

17

Stop consonant landmark detection method (cont.)

Step 3: Computation of energy and centroid RORs

$$E'_p(b,n) = \left| E_p(b,n+K) - E_p(b,n-K) \right| \tag{3}$$

$$f'_c(b,n) = \left| f_c(b,n+K) - f_c(b,n-K) \right| \tag{4}$$
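Both RORs in (3) and (4) are the same operation applied to different tracks, so one helper covers both. This is a minimal sketch: `rate_of_rise` is a hypothetical name, and the handling of frames closer than K to either end (clamping to the first or last frame) is an assumption, since the slides do not cover the edges.

```python
def rate_of_rise(track, K):
    """Return |x(n+K) - x(n-K)| for every frame n of one band's track.

    `track` holds E_p(b, n) or f_c(b, n) for a fixed band b.
    Indices outside the track are clamped to the nearest valid frame.
    """
    last = len(track) - 1
    return [abs(track[min(n + K, last)] - track[max(n - K, 0)])
            for n in range(len(track))]
```

A step in the track, such as `[0, 0, 0, 10, 10, 10]` with `K = 1`, produces a ROR of 10 at the two frames that straddle the step and 0 elsewhere.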

18

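Stop consonant landmark detection method (cont.)

Step 4: Computation of the transition index for energy and centroid frequency

$$T_{ec}(n) = \frac{1}{5} \sum_{b=1}^{5} E'_{pn}(b,n)\, f'_{cn}(b,n) \tag{5}$$

Here $E'_{pn}$ and $f'_{cn}$ carry an extra subscript n, which presumably denotes normalized versions of the RORs in (3) and (4).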
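The transition index in (5) averages, over the bands, the product of the energy ROR and the centroid ROR. The sketch below assumes that the extra subscript n in E'pn and f'cn means each ROR track is normalized, and it normalizes by the track's maximum. That choice is an assumption of this sketch, as are the names `normalize` and `transition_index`. The band count is taken from the input rather than fixed at 5.

```python
def normalize(track):
    """Scale a ROR track so its maximum is 1 (assumed normalization)."""
    peak = max(track)
    if peak <= 0:
        return [0.0] * len(track)
    return [v / peak for v in track]

def transition_index(energy_rors, centroid_rors):
    """Eq. (5): mean over bands of normalized energy ROR times centroid ROR.

    energy_rors[b][n] and centroid_rors[b][n] are indexed by band, then frame.
    """
    n_bands = len(energy_rors)
    e_norm = [normalize(t) for t in energy_rors]
    f_norm = [normalize(t) for t in centroid_rors]
    n_frames = len(e_norm[0])
    return [sum(e_norm[b][n] * f_norm[b][n] for b in range(n_bands)) / n_bands
            for n in range(n_frames)]
```

Because the two RORs are multiplied, the index is large only at frames where energy and centroid frequency change together.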

19

Stop consonant landmark detection method (cont.)

Waveform for /uka/ and RORs for band 1 (b), band 2 (c), and band 3 (d) (Arjun et al., 2008)

20

Stop consonant landmark detection method (cont.)

Processing results for /uka/ (Arjun et al., 2008)

21

Stop consonant landmark detection method (cont.)

(a) Windowed segment used in the second pass, (b) energy and ZCR RORs of level 1, (c) RORs of level 2, and (d) transition index Tez computed from the RORs in (b) and (c) (Arjun et al., 2008)

22

Stop consonant landmark detection method (cont.)

Pass 2

Step 1: Compute the wavelet decomposition of the speech segment around each landmark

Step 2: Compute the short-time energy and zero-crossing rate (ZCR)

Step 3: Compute the RORs of the energy and the ZCR
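Step 2 of Pass 2 can be sketched as below. The wavelet decomposition itself is left out, because the slides do not name the wavelet. The functions would be applied to the signal at each decomposition level. Frame length and hop size are free parameters here, and all function names are hypothetical.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def frame_features(signal, frame_len, hop):
    """Return a list of (energy, ZCR) pairs, one per frame."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        features.append((short_time_energy(frame), zero_crossing_rate(frame)))
    return features
```

The resulting energy and ZCR tracks can then be differenced over a frame offset, as in Step 3, to obtain their RORs.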

24

Summary

The first method, manual labeling, is time-consuming and tedious, and its temporal resolution is poor.

The second method, detection of abrupt-consonantal and abrupt landmarks, is faster but also gives poor temporal resolution.

The third method, two-pass stop consonant landmark detection, is fast and gives very high temporal resolution.

25

Future Work

To study the algorithms for landmark detection in speech further and to improve them for use in a phone-based recognition system.

26

REFERENCES

[Liu 1996] S. A. Liu, "Landmark detection for distinctive feature-based speech recognition," J. Acoust. Soc. Am., vol. 100, no. 5, pp. 3417-3430.

[Arjun et al., 2008] A. R. Jayan, P. C. Pandey, et al., "Detection of acoustic landmarks with high resolution for speech processing," Proc. 14th National Conference on Communications.

[Alani et al., 1999] A. Alani and M. Deriche, "A novel approach to speech segmentation using the wavelet transform," in Proc. 5th Int. Symp. Signal Processing and its Applications (ISSPA'99), pp. 127-129.

[OS 2001] D. O'Shaughnessy, Speech Communications: Human and Machine, Universities Press (India).

[L.R., 2008] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Pearson Education Inc. and Dorling Kindersley Publishing Inc., India.
