Download - Ms. R. Rani1
-
8/2/2019 Ms. R. Rani1
1/32
Speech analysis
Analysis of speech sounds taking into consideration their method of production
acoustic feature vectors.
The extraction of interesting information as an acoustic vector
Resonances and Formants
Resonances are vibratory characteristics of a resonating body.
In the case of an air-filled tube, the resonance characteristics exist even when there is no sound being produced.
The resonances selectively enhance sound vibrations close to the resonance frequencies and selectively attenuate sound vibrations remote from them.
This results in peaks in the acoustic spectrum of the resulting speech sound.
These acoustic spectral peaks are called formants, particularly when they occur in vowels and vowel-like consonants.
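As a rough illustration of how tube resonances produce formant-like peaks, the sketch below uses a common textbook idealisation that is not stated in these slides: a uniform tube closed at one end and open at the other, whose resonances fall at odd multiples of c / (4L). The function name, the 17.5 cm tube length and the 350 m/s speed of sound are assumptions chosen for illustration only.

```python
def tube_resonances(length_m, count=3, speed_of_sound=350.0):
    """Resonance frequencies (Hz) of an idealised uniform tube,
    closed at one end and open at the other (quarter-wave resonator).

    Assumption: F_n = (2n - 1) * c / (4 * L), for n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]


# Illustrative values only: a 17.5 cm tube and c = 350 m/s.
resonances = tube_resonances(0.175)
print([round(f) for f in resonances])  # [500, 1500, 2500]
```

Real vocal tracts are not uniform tubes, so measured formants (such as those tabulated later in these slides) differ from these idealised values.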
Spectrograms
Spectrograms permit the examination of the dynamic changes in a speech spectrum.
This is especially useful for consonants (e.g. stop bursts) and also for vowel transitions (between vowels and consonants and between the targets in diphthongs).
Spectrograms, usually in conjunction with waveforms, are essential during the segmenting and labeling of speech.
Spectrograms usually provide the clearest visual cues to the boundaries between phonemes.
Spectrograms do not, however, provide accurate measurements of vowel formants, as broadband spectrograms have a poor frequency resolution (about 300 Hz) and so there is a high degree of intrinsic error.
That is why we tend to use FFTs and LPCs for the accurate measurement of formant frequencies.
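The frequency-resolution limit of a broadband spectrogram can be sketched with a short-time Fourier transform. The example below is a minimal illustration, not the tool used for these slides: it assumes NumPy, a 16 kHz sampling rate, a test signal of two tones 150 Hz apart, and window lengths of 3 ms (broadband) and 30 ms (narrowband). All names and values are hypothetical.

```python
import numpy as np


def spectrogram_power(signal, fs, win_ms, hop_ms=1.0, nfft=1024):
    """Power spectrogram (frames x frequency bins) using a Hann window."""
    win_len = int(round(fs * win_ms / 1000.0))
    hop = max(1, int(round(fs * hop_ms / 1000.0)))
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop: i * hop + win_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=nfft, axis=1)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, power


def count_strong_peaks(power, drop_db=6.0):
    """Count local maxima lying within drop_db of the strongest bin."""
    level = 10 * np.log10(power + 1e-20)
    inner = level[1:-1]
    is_peak = ((inner > level[:-2]) & (inner > level[2:])
               & (inner >= level.max() - drop_db))
    return int(is_peak.sum())


fs = 16000                      # assumed sampling rate (Hz)
t = np.arange(int(0.5 * fs)) / fs
# Two steady tones 150 Hz apart stand in for closely spaced spectral peaks.
signal = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 1150 * t)

_, broad = spectrogram_power(signal, fs, win_ms=3.0)    # short window
_, narrow = spectrogram_power(signal, fs, win_ms=30.0)  # long window

broad_peaks = count_strong_peaks(broad.mean(axis=0))
narrow_peaks = count_strong_peaks(narrow.mean(axis=0))
print(broad_peaks, narrow_peaks)
```

With the short window the two tones merge into one broad peak, while the long window separates them. The short window gives better time resolution, which is why it suits stop bursts and transitions.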
Fig: waveform and broadband spectrogram of the word "heard"
Fig: 1_aam, segments g1 g2 aag aa1 aa2 aam m1 m2; Time (s) 0 to 0.491 (marked times 0.0143017892 and 0.490396511)
Fig: aayvu, vertical axis -1 to 1, segments g aa ay y yv v vu u; Time (s) 0 to 0.8455; values printed with the figure: 0.2 0.1 0.07 0.04 0.07 0.07 0.19
Words   Duration   Intensity   Pitch    Formants in Hz
        in secs    in dB       in Hz    F1       F2       F3       F4
aayvu   0.77       80.4        160.2    540.7    1484.6   3750.3   3750.2
.       .          .           .        .        .        .        .
aa      0.2        81.3        137.1    810.4    1181.6   2865.5   3792.2
ay      0.1        84.0        171.1    654.07   1755.3   2599.9   3753.5
y       0.07       80.5        179      362.1    2275.9   2570.3   3878.4
yv      0.04       78.7        174.5    349.3    1928.6   2365.0   3876.5
.       .          .           .        .        .        .        .
vu      0.06       78.2        166.5    3636.0   1147.2   2570.8   3568.2
u       0.2        77.8        167.2    387.36   1488.5   2611.5   3693.2
Fig: LPC of aa in aayvu. Sound pressure level (dB/Hz) against Frequency (Hz), 0 to 5500. Labelled peaks (Hz): 886.4, 1212.5, 2916.7, 3754.0, 4813.6
Fig: LPC of v in aayvu. Sound pressure level (dB/Hz) against Frequency (Hz), 0 to 5500. Labelled peaks (Hz): 323.3, 1190.2, 2346.2, 3613.2
Fig: LPC of u in aayvu. Sound pressure level (dB/Hz) against Frequency (Hz), 0 to 5500. Labelled peaks (Hz): 397.4, 1486.3, 2590.7, 3583.6
Linear Prediction Coefficient (LPC)
LPC analysis estimates the poles (related to resonances or formants) that, when combined with the speech source spectrum (the "residual" in LPC analysis), would result in the original waveform.
An LPC analysis separates the analysis of the resonant characteristics of a speech sound from the source characteristics of that sound.
The resulting LPC spectrum is a smoothed spectrum with the peaks representing the formants (resulting from the vocal tract resonances).
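One common way to carry out such an analysis is the autocorrelation method with the Levinson-Durbin recursion, followed by reading resonance frequencies from the angles of the poles. The slides do not say which algorithm or software produced their LPC plots, so the sketch below is an assumption, written with NumPy. The function names, the synthetic one-resonance test signal, and all parameter values are hypothetical.

```python
import numpy as np


def lpc(frame, order):
    """LPC polynomial a (a[0] == 1) and residual energy,
    autocorrelation method with the Levinson-Durbin recursion."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    n = len(x)
    r = np.correlate(x, x, mode="full")[n - 1: n + order]  # lags 0..order
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def pole_frequencies(a, fs):
    """Frequencies and bandwidths (Hz) of the complex poles of 1/A(z)."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]        # one of each conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bandwidths[idx]


# Synthetic test: white noise (the "source") passed through a single
# resonance at 1000 Hz (the "filter"). Values are illustrative only.
fs = 10000
rng = np.random.default_rng(0)
radius, f0 = 0.97, 1000.0
c1 = -2 * radius * np.cos(2 * np.pi * f0 / fs)
c2 = radius ** 2
source = rng.standard_normal(4000)
x = np.zeros_like(source)
for n in range(len(source)):
    x[n] = source[n]
    if n >= 1:
        x[n] -= c1 * x[n - 1]
    if n >= 2:
        x[n] -= c2 * x[n - 2]

a, residual_energy = lpc(x, order=2)
freqs, bandwidths = pole_frequencies(a, fs)
print(freqs, bandwidths)
```

For real speech a higher order is normally needed (roughly two poles per expected formant plus a few extra is a frequent rule of thumb), and poles with very wide bandwidths are usually discarded rather than reported as formants.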
Figure: White noise used as a simplified model of a fricative sound source.
Note the random pattern of both the waveform (bottom) and the spectrum (top). Also note that the spectral envelope (LPC spectrum in red) is approximately flat.
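The flat envelope of white noise can be checked numerically. The sketch below is an illustration under assumptions: it uses NumPy, and it estimates the envelope by averaging the power spectra of many short frames rather than by the LPC analysis shown in the figure. The function name, frame length and number of frames are hypothetical choices.

```python
import numpy as np


def averaged_spectrum_db(signal, frame_len=256):
    """Average the power spectra of consecutive Hann-windowed frames (dB)."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    return 10 * np.log10(power.mean(axis=0))


rng = np.random.default_rng(0)
noise = rng.standard_normal(256 * 400)   # 400 frames of white noise
envelope_db = averaged_spectrum_db(noise)

# A single frame looks random; the average varies little across frequency.
spread_db = envelope_db.max() - envelope_db.min()
print(round(float(spread_db), 2))
```

The small spread between the highest and lowest bins of the averaged spectrum is what "approximately flat" means here: no frequency region is systematically favoured by the source.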
Identification of Speech Waveforms
Figure: Three long vowels in an /h_d/ context.
Figure: Three English voiceless oral stops in CV context