different techniques for speech recognition

Different techniques for speech recognition From: Yashi Saxena

Upload: yashi-saxena

Post on 14-Apr-2017




Page 1: Different  techniques for speech recognition

Different techniques for speech recognition

From: Yashi Saxena


Index
• Introduction
• Standard DTW
• Stochastic DTW
• Hidden Markov model
• Conclusion


INTRODUCTION

Non-linear sequence alignment has a vast range of applications in DNA matching, string matching, speech recognition, etc. It seeks an optimal mapping from the test signal to the template signal while allowing non-linear warping of the test signal. Its strength is that it cuts the complexity of this search to O(nm) for signals of lengths n and m. The algorithm (DTW) was proposed in 1978, and an updated version appeared in 1988 that improved the recognition rate from 89.3% to 92.9% in a word recognition experiment.


STANDARD DYNAMIC TIME WARPING

Dynamic time warping (DTW) is an algorithm for measuring similarity between two sequences which may vary in time or speed. For instance, similarities in walking patterns would be detected even if the person walks slowly in one video and more quickly in another, or even if there are accelerations and decelerations during the course of one observation. DTW has been applied to video, audio, and graphics; indeed, any data which can be turned into a linear representation can be analyzed with DTW. It was introduced in 1978.


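Algorithm

Given two time series X and Y, of lengths |X| and |Y|:

$X = x_1, x_2, \ldots, x_i, \ldots, x_{|X|}$

$Y = y_1, y_2, \ldots, y_j, \ldots, y_{|Y|}$

construct a warp path W:

$W = w_1, w_2, \ldots, w_K, \qquad \max(|X|, |Y|) \le K < |X| + |Y|$

K is the length of the warp path, and the kth element of the warp path is $w_k = (i, j)$, where i is an index into X and j is an index into Y. The optimal warp path is the minimum-distance warp path, where the distance of a warp path W is

$\mathrm{Dist}(W) = \sum_{k=1}^{K} \mathrm{dist}(w_{ki}, w_{kj})$

Dist(W) is the distance of the warp path W, and $\mathrm{dist}(w_{ki}, w_{kj})$ is the distance between the two data point indices (one from X and one from Y) in the kth element of the warp path.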
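The search for the minimum-distance warp path can be sketched with dynamic programming. The block below is a minimal illustration, assuming the commonly used recurrence g(i,j) = d(i,j) + min{ g(i-1,j-1), g(i-1,j), g(i,j-1) } and an absolute-difference local distance; the function name `dtw` and its `dist` parameter are hypothetical and chosen only for this example.

```python
import math


def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Return (Dist(W), W) for a minimum-distance warp path between x and y.

    Assumption: the local distance defaults to |a - b|; any non-negative
    function of two samples could be passed instead.
    """
    n, m = len(x), len(y)
    # g[i][j] = cost of the best warp path aligning x[:i] with y[:j].
    g = [[math.inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            g[i][j] = dist(x[i - 1], y[j - 1]) + min(
                g[i - 1][j - 1],  # diagonal step
                g[i - 1][j],      # advance in x only
                g[i][j - 1],      # advance in y only
            )
    # Backtrack from (n, m) to (1, 1) to recover the warp path (1-based).
    i, j = n, m
    path = [(i, j)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (g[i - 1][j - 1], i - 1, j - 1),
            (g[i - 1][j], i - 1, j),
            (g[i][j - 1], i, j - 1),
        )
        path.append((i, j))
    path.reverse()
    return g[n][m], path


if __name__ == "__main__":
    total, path = dtw([1, 2, 3, 4], [1, 2, 2, 3, 4])
    print(total, path)
```

The two nested loops visit each of the |X|·|Y| cells once, which is where the O(nm) cost comes from. The length of the returned path plays the role of K.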


Problem

• Windowing (Berndt & Clifford 1994): Allowable elements of the matrix can be restricted to those that fall into a warping window, |i-(n/(m/j))| < R, where n and m are the lengths of the two sequences and R is a positive integer window width. This effectively means that the corners of the matrix are pruned from consideration.

• Slope Weighting (Kruskal & Liberman 1983; Sakoe & Chiba 1978): If the standard recurrence g(i,j) = d(i,j) + min{ g(i-1,j-1), g(i-1,j), g(i,j-1) } is replaced with g(i,j) = d(i,j) + min{ g(i-1,j-1), X g(i-1,j), X g(i,j-1) }, where X is a positive real number, we can constrain the warping by changing the value of X. As X gets larger, the warping path is increasingly biased toward the diagonal.


• Step Patterns (Slope constraints) (Itakura 1975; Myers et al. 1980): We can visualize the recurrence equation as a diagram of admissible step patterns. The arrows illustrate the permissible steps the warping path may take at each stage. We could replace the equation with g(i,j) = d(i,j) + min{ g(i-1,j-1), g(i-1,j-2), g(i-2,j-1) }, which corresponds to the step pattern shown in Figure 4.B. Using this equation, the warping path is forced to move one diagonal step for each step parallel to an axis.
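The three constraints above can be sketched as small changes to the basic dynamic-programming loop. This is an illustrative sketch only: the function names `constrained_dtw` and `step_pattern_dtw`, the parameter names `R` and `X` (taken from the text), and the absolute-difference local distance are assumptions made for the example.

```python
import math


def constrained_dtw(x, y, R=None, X=1.0, dist=lambda a, b: abs(a - b)):
    """DTW cost with an optional warping window R and slope weight X."""
    n, m = len(x), len(y)
    g = [[math.inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Windowing: keep only cells with |i - (n/(m/j))| < R.
            # n/(m/j) is the same quantity as n*j/m.
            if R is not None and abs(i - n * j / m) >= R:
                continue
            # Slope weighting: off-diagonal predecessors are scaled by X.
            g[i][j] = dist(x[i - 1], y[j - 1]) + min(
                g[i - 1][j - 1], X * g[i - 1][j], X * g[i][j - 1])
    return g[n][m]


def step_pattern_dtw(x, y, dist=lambda a, b: abs(a - b)):
    """DTW cost with the step pattern
    g(i,j) = d(i,j) + min{ g(i-1,j-1), g(i-1,j-2), g(i-2,j-1) }."""
    n, m = len(x), len(y)
    g = [[math.inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = g[i - 1][j - 1]
            if j >= 2:
                best = min(best, g[i - 1][j - 2])
            if i >= 2:
                best = min(best, g[i - 2][j - 1])
            g[i][j] = dist(x[i - 1], y[j - 1]) + best
    return g[n][m]


if __name__ == "__main__":
    a, b = [0, 0, 0, 0, 1], [0, 1, 1, 1, 1]
    print(constrained_dtw(a, b))        # unconstrained
    print(constrained_dtw(a, b, R=1))   # only the diagonal is allowed
    print(step_pattern_dtw([1, 2, 3], [1, 2, 3]))
```

With a narrow window, cells far from the diagonal are never filled in, so the warp path cannot stretch one sequence arbitrarily against the other. If the window excludes every path, the returned cost is infinite.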


STOCHASTIC DTW

Many real signals are stochastic processes, such as speech signals and video signals. Therefore, in 1988 a new algorithm called stochastic DTW was proposed. In this method, conditional probabilities are used instead of the local distances of standard DTW, and transition probabilities are used instead of path costs. The stochastic DTW method was proposed to cope with spectral variations from speaker to speaker.


Algorithm

1. Replace the deterministic costs with probabilities:


2. Then replace the right-hand side with the maximum probability and take the logarithm of P:

The general equation of stochastic DTW is:
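The following is only an illustrative sketch of the idea described above, not the general equation itself. It assumes a Gaussian conditional probability with a hypothetical spread `sigma` in place of the local distance, and hypothetical fixed transition probabilities for the diagonal, vertical and horizontal moves in place of path costs. Working with logarithms turns the product of probabilities into a sum, and the minimisation of standard DTW becomes a maximisation.

```python
import math


def stochastic_dtw(x, y, sigma=1.0, trans=None):
    """Return the log-probability of the most probable alignment of x and y.

    Assumptions (not from the source): Gaussian conditional probability
    P(x_i | y_j) with standard deviation sigma, and fixed transition
    probabilities for the three allowed moves.
    """
    if trans is None:
        trans = {"diag": 0.5, "vert": 0.25, "horiz": 0.25}
    log_t = {move: math.log(p) for move, p in trans.items()}

    def log_cond(a, b):
        # log of the Gaussian density N(a; mean=b, std=sigma)
        return (-0.5 * ((a - b) / sigma) ** 2
                - math.log(sigma * math.sqrt(2 * math.pi)))

    n, m = len(x), len(y)
    # L[i][j] = best log-probability of aligning x[:i] with y[:j].
    L = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    L[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = max(
                L[i - 1][j - 1] + log_t["diag"],
                L[i - 1][j] + log_t["vert"],
                L[i][j - 1] + log_t["horiz"],
            )
            L[i][j] = log_cond(x[i - 1], y[j - 1]) + best
    return L[n][m]


if __name__ == "__main__":
    print(stochastic_dtw([1, 2, 3], [1, 2, 3]))
    print(stochastic_dtw([1, 2, 3], [3, 2, 1]))
```

A higher (less negative) value indicates a better match between the test signal and the template.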


HIDDEN MARKOV MODEL

• An HMM is defined by a set of N states, K observation symbols and three probabilistic matrices: $M = \{\Pi, A, B\}$

where

$\Pi = \{\pi_i\}$: initial state probabilities
$A = \{a_{i,j}\}$: state transition probabilities
$B = \{b_{i,j,k}\}$: symbol emission probabilities


The observation symbol generation procedure for this topology is as follows:

1. Start in state i with probability $\pi_i$.
2. t = 1
3. Move from state i to state j with probability $a_{i,j}$ and emit observation symbol $o_t = k$ with probability $b_{i,j,k}$.
4. t = t + 1
5. Go to 3.
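The generation procedure can be sketched as a short sampling loop. This is a minimal sketch under stated assumptions: the function name `generate` is hypothetical, the probabilities are stored as nested Python lists `pi[i]`, `a[i][j]` and `b[i][j][k]`, and a length `T` is added as a stopping condition because the procedure as listed loops indefinitely.

```python
import random


def generate(pi, a, b, T, rng=random):
    """Sample T observation symbols from an HMM M = {pi, a, b}.

    pi[i]      : initial state probabilities
    a[i][j]    : probability of moving from state i to state j
    b[i][j][k] : probability of emitting symbol k on the move i -> j
    T          : number of symbols to emit (assumed stopping condition)
    """
    # Step 1: choose the start state i with probability pi[i].
    state = rng.choices(range(len(pi)), weights=pi)[0]
    observations = []
    # Steps 2, 4 and 5: t runs from 1 to T.
    for _ in range(T):
        # Step 3: move from state i to state j with probability a[i][j] ...
        nxt = rng.choices(range(len(a[state])), weights=a[state])[0]
        # ... and emit symbol k with probability b[i][j][k].
        emission = b[state][nxt]
        symbol = rng.choices(range(len(emission)), weights=emission)[0]
        observations.append(symbol)
        state = nxt
    return observations


if __name__ == "__main__":
    pi = [0.6, 0.4]
    a = [[0.7, 0.3], [0.4, 0.6]]
    b = [[[0.9, 0.1], [0.5, 0.5]],
         [[0.2, 0.8], [0.3, 0.7]]]
    print(generate(pi, a, b, T=10, rng=random.Random(0)))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.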


VITERBI ALGORITHM FOR HMM

The Viterbi algorithm was conceived by Andrew Viterbi in 1967 as an error-correction scheme for noisy digital communication links. The Viterbi algorithm is a dynamic programming algorithm for finding the most likely sequence of hidden states – called the Viterbi path – that results in a sequence of observed events, especially in the context of Markov information sources, and more generally, hidden Markov models.
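The dynamic-programming idea can be sketched for an HMM M = {Π, A, B} in which symbols are emitted on transitions, with probabilities b[i][j][k]. This is a minimal sketch, not a definitive implementation: the function name `viterbi`, the nested-list layout of the matrices and the use of log-probabilities (to avoid numerical underflow on long sequences) are assumptions made for the example.

```python
import math


def viterbi(pi, a, b, obs):
    """Return (path, log_prob) for the most likely hidden state sequence.

    pi[i], a[i][j], b[i][j][k] are assumed to be nested lists of
    probabilities; obs is a list of observed symbol indices.
    The returned path has len(obs) + 1 states: the start state followed
    by the state reached after each emitted symbol.
    """
    def log(p):
        return math.log(p) if p > 0 else -math.inf

    N = len(pi)
    # delta[i] = best log-probability of any state sequence ending in i.
    delta = [log(p) for p in pi]
    back = []  # back[t][j] = best predecessor of state j at step t
    for o in obs:
        new_delta, pointers = [], []
        for j in range(N):
            scores = [delta[i] + log(a[i][j]) + log(b[i][j][o])
                      for i in range(N)]
            best_i = max(range(N), key=scores.__getitem__)
            new_delta.append(scores[best_i])
            pointers.append(best_i)
        delta = new_delta
        back.append(pointers)
    # Pick the best final state, then follow the back-pointers.
    state = max(range(N), key=delta.__getitem__)
    best_log_prob = delta[state]
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_log_prob


if __name__ == "__main__":
    pi = [0.6, 0.4]
    a = [[0.7, 0.3], [0.4, 0.6]]
    b = [[[0.9, 0.1], [0.5, 0.5]],
         [[0.2, 0.8], [0.3, 0.7]]]
    print(viterbi(pi, a, b, [0, 0, 1, 1]))
```

At each step only the best score per state is kept, so the cost grows as the number of observations times N squared rather than with the number of possible state sequences.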


Thank you