Post on 19-Dec-2015
A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING
CS 525 : Project Presentation
PALDEN LAMA and MOUNIKA NAMBURU
GOALS
Learn how it works! Focus: pre-processing and Dynamic Time Warping / Dynamic Programming.
Verify using MATLAB.
Build a simple voice-to-text converter application.
HOW DOES IT WORK?
Record a voice → digitized speech signal (.wav file) → acoustic preprocessing (DFT + MFCC) → extract feature vectors → speech recognizer (Dynamic Time Warping)
SPEECH SIGNAL
Voiced excitation → fundamental frequency (speaker dependent)
Loudness → signal amplitude
Vocal tract shape → spectral shaping (most important for recognizing words)
Figure: time signal of the vowel /a:/ (fs = 11 kHz, length = 100 ms); horizontal axis: time.
ACOUSTIC PRE-PROCESSING
DFT (Discrete Fourier Transform) → spectral coefficients
Inverse DFT on the log power spectrum → cepstral coefficients
This makes it easier to extract the spectral shaping of the speech signal.
Figure: log power spectrum of the vowel /a:/ (fs = 11 kHz, N = 512); horizontal axis: frequency.
Figure: power spectrum of the vowel /a:/ after cepstral smoothing.
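The cepstral smoothing step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's MATLAB code: the function names, the synthetic test frame, the frame length of 64 samples, and the choice of keeping 8 cepstral coefficients are all hypothetical, and a naive O(N^2) DFT is used so that the block needs nothing beyond the standard library.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT (includes the 1/N scaling)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def log_power_spectrum(frame):
    """Log of the squared DFT magnitude; a small floor avoids log(0)."""
    return [math.log(abs(c) ** 2 + 1e-12) for c in dft(frame)]

def real_cepstrum(frame):
    """Cepstral coefficients: inverse DFT of the log power spectrum."""
    return [c.real for c in idft(log_power_spectrum(frame))]

def smoothed_log_spectrum(frame, keep):
    """Zero all but the lowest `keep` cepstral coefficients (and their
    mirror images), then transform back to get a smoothed log spectrum."""
    n = len(frame)
    ceps = real_cepstrum(frame)
    liftered = [c if (q < keep or q > n - keep) else 0.0
                for q, c in enumerate(ceps)]
    return [c.real for c in dft(liftered)]

# Hypothetical test frame: two sinusoids plus a small deterministic ripple.
n = 64
frame = [math.sin(2 * math.pi * 5 * t / n)
         + 0.5 * math.sin(2 * math.pi * 12 * t / n)
         + 0.05 * ((t * 37) % 11 - 5)
         for t in range(n)]

ceps = real_cepstrum(frame)
smooth = smoothed_log_spectrum(frame, keep=8)
print(len(ceps), len(smooth))
```

Keeping only the low-order coefficients retains the slowly varying envelope of the log spectrum (the spectral shaping) and discards the fine structure caused by the excitation.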
MFCC (MEL FREQUENCY CEPSTRAL COEFFICIENTS)
The mel frequency scale reflects the frequency resolution of the human ear.
Coefficients of the power spectrum → mel spectral coefficients (FEATURE VECTOR)
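A minimal Python sketch of the mel mapping follows. It assumes the widely used formula mel = 2595 · log10(1 + f / 700) and triangular filters spaced evenly on the mel scale; the slides do not say which formula or filter shape the project's Melfiltermatrix uses, so the function names, the 20 filters, and the flat test spectrum are illustrative assumptions. The sampling rate (11 kHz) and DFT length (N = 512, giving 257 one-sided bins) are taken from the figure captions.

```python
import math

def hz_to_mel(f_hz):
    """Common mel-scale formula (an assumption; other variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(num_filters, f_max_hz):
    """Filter centre frequencies in Hz, equally spaced on the mel scale."""
    step = hz_to_mel(f_max_hz) / (num_filters + 1)
    return [mel_to_hz(step * (i + 1)) for i in range(num_filters)]

def mel_spectrum(power_spectrum, fs_hz, num_filters):
    """Weight a one-sided power spectrum (bins 0..N/2) with triangular
    mel filters and return one coefficient per filter."""
    f_max = fs_hz / 2.0
    edges = [0.0] + mel_center_frequencies(num_filters, f_max) + [f_max]
    bin_hz = f_max / (len(power_spectrum) - 1)
    coeffs = []
    for i in range(1, num_filters + 1):
        lo, mid, hi = edges[i - 1], edges[i], edges[i + 1]
        total = 0.0
        for k, p in enumerate(power_spectrum):
            f = k * bin_hz
            if lo < f <= mid:        # rising edge of the triangle
                total += p * (f - lo) / (mid - lo)
            elif mid < f < hi:       # falling edge of the triangle
                total += p * (hi - f) / (hi - mid)
        coeffs.append(total)
    return coeffs

centers = mel_center_frequencies(20, 5500.0)
feature_vector = mel_spectrum([1.0] * 257, fs_hz=11000.0, num_filters=20)
print([round(c) for c in centers[:5]])
print(len(feature_vector))
```

The centre frequencies are close together at low frequencies and spread out at high frequencies, which is how the filterbank mimics the ear's decreasing resolution.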
RECOGNIZER
One spoken word yields dozens of feature vectors (preprocessing is applied to every 10 ms of signal).
Compute a "distance" between the unknown sequence of vectors (the unknown word) and each known sequence of vectors (prototypes of the words to recognize).
PROBLEM: the vector sequences have unequal lengths.
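The length mismatch is easy to reproduce. The sketch below cuts two recordings into 10 ms frames and shows that a frame-by-frame distance is not directly usable. It assumes non-overlapping frames and the 11 kHz sampling rate from the earlier figure captions; the helper names, the recording lengths, and the zero-valued placeholder signals are hypothetical.

```python
import math

def split_into_frames(samples, fs_hz, frame_ms=10):
    """Cut a signal into consecutive, non-overlapping frames of frame_ms
    milliseconds; a trailing partial frame is dropped."""
    frame_len = int(fs_hz * frame_ms / 1000)
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

fs = 11000                      # samples per second
prototype = [0.0] * 4400        # placeholder for a 0.40 s recording
unknown = [0.0] * 6050          # placeholder for a 0.55 s recording

frames_p = split_into_frames(prototype, fs)
frames_u = split_into_frames(unknown, fs)
print(len(frames_p), len(frames_u))   # 40 55
```

Each frame would be turned into one feature vector, so the same word spoken at different speeds gives sequences of 40 and 55 vectors here. Pairing vector i with vector i would leave 15 vectors unmatched, which is the problem Dynamic Time Warping addresses.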
DYNAMIC TIME WARPING : FIND OPTIMAL ASSIGNMENT PATH
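A minimal Python sketch of finding the optimal assignment path with dynamic programming follows. It uses the textbook symmetric step pattern (diagonal, vertical, horizontal) with a Euclidean local distance, which is an assumption: the project's dp_asym function suggests an asymmetric variant whose details are not given in the slides. The function names and the toy one-dimensional sequences are illustrative only.

```python
import math

def euclidean(u, v):
    """Local distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw(seq_a, seq_b, dist=euclidean):
    """Return (accumulated distance, assignment path) for two sequences of
    feature vectors. The path is a list of (index_a, index_b) pairs."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j]: cheapest accumulated distance aligning the first i vectors
    # of seq_a with the first j vectors of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],   # diagonal step
                                 cost[i - 1][j],       # advance in seq_a only
                                 cost[i][j - 1])       # advance in seq_b only
    # Backtrack from the end to recover the optimal assignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

# Toy data: `slow` is a time-stretched copy of `fast`; `other` is different.
fast = [[0.0], [1.0], [2.0], [1.0], [0.0]]
slow = [[0.0], [1.0], [1.0], [2.0], [2.0], [1.0], [0.0]]
other = [[0.0], [2.0], [0.0], [2.0], [0.0]]

d_same, path_same = dtw(fast, slow)
d_other, _ = dtw(fast, other)
print(d_same)    # 0.0
print(d_other)
```

The stretched copy aligns with zero accumulated distance even though the sequences have different lengths, while the unrelated sequence does not. A recognizer built on this idea would pick the prototype with the smallest accumulated distance.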
DTW : RECOGNIZING CONNECTED WORDS
MATLAB FUNCTIONS
PRE-PROCESSING: recordMelMatrix(3)
S = wavread('speech.wav'); C = Melfiltermatrix(S, N, K); computeMelSpectrum(C, S);
DISPLAY FEATURES: Featuredisp.m
WORD RECOGNITION: dp_asym(vector1, vector2)
RESULTS
hello hello1
library
hello
computer hello
3.0304e+003
3.5820e+003
3.4499e+003
Welcome home (male)
Welcome home (female)
Welcome home Welcome back
Welcome home Computer Science
Welcome back Computer Science
2.6418e+003
2.9468e+003
3.8109e+003
4.6701e+003
THANKS! ANY QUESTIONS?