Different techniques for speech recognition


From: Yashi Saxena

Index
• Introduction
• Standard DTW
• Stochastic DTW
• Hidden Markov model
• Conclusion

INTRODUCTION

Non-linear sequence alignment has a wide range of applications in DNA matching, string matching, speech recognition, etc. It seeks an optimal mapping from the test signal to the template signal while allowing non-linear warping of the test signal. Its strength is that dynamic programming cuts the complexity of the search to O(nm). The algorithm (DTW) was proposed in 1978, and an updated version that came in 1988 improved the recognition rate from 89.3% to 92.9% in a word recognition experiment.

STANDARD DYNAMIC TIME WARPING

Dynamic time warping (DTW) is an algorithm for measuring similarity between two sequences which may vary in time or speed. For instance, similarities in walking patterns would be detected, even if in one video the person was walking slowly and if in another he or she were walking more quickly, or even if there were accelerations and decelerations during the course of one observation. DTW has been applied to video, audio, and graphics — indeed, any data which can be turned into a linear representation can be analyzed with DTW. It was introduced in 1978.

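Algorithm

Given two time series X and Y, of lengths |X| and |Y|:

$X = x_1, x_2, \ldots, x_i, \ldots, x_{|X|}$

$Y = y_1, y_2, \ldots, y_j, \ldots, y_{|Y|}$

construct a warp path W:

$W = w_1, w_2, \ldots, w_K, \qquad \max(|X|, |Y|) \le K < |X| + |Y|$

K is the length of the warp path, and the kth element of the warp path is $w_k = (i, j)$, where i is an index into X and j is an index into Y.

The optimal warp path is the minimum-distance warp path, where the distance of a warp path is

$\mathrm{Dist}(W) = \sum_{k=1}^{K} \mathrm{dist}(w_{ki}, w_{kj})$

Dist(W) is the distance of the warp path W, and $\mathrm{dist}(w_{ki}, w_{kj})$ is the distance between the two data point indexes (one from X and one from Y) in the kth element of the warp path.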
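The definitions above can be sketched in Python as a small dynamic program. This is a minimal illustration, not a reference implementation: the function name `dtw`, the absolute-difference local distance, and the cumulative-cost table `g` are assumptions made for the example. The table is filled with the usual recurrence g(i,j) = d(i,j) + min{ g(i-1,j-1), g(i-1,j), g(i,j-1) }, and the warp path is recovered by backtracking.

```python
def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Return (Dist(W), W) for a minimum-distance warp path between x and y.

    The local distance `dist` defaults to absolute difference (an assumption).
    Both sequences must be non-empty.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # g[i][j] = cost of the best warp path aligning x[:i] with y[:j]
    g = [[inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(x[i - 1], y[j - 1])
            g[i][j] = d + min(g[i - 1][j - 1], g[i - 1][j], g[i][j - 1])

    # Backtrack from (n, m) to (1, 1); path elements are 1-based (i, j) pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i, j))
        steps = [(g[i - 1][j - 1], i - 1, j - 1),
                 (g[i - 1][j], i - 1, j),
                 (g[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return g[n][m], path


cost, path = dtw([1, 2, 3], [1, 2, 2, 3])
print(cost, path)
```

Filling the table visits each of the |X| x |Y| cells once, which is where the O(nm) cost comes from.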

Problem

• Windowing: (Berndt & Clifford 1994) Allowable elements of the matrix can be restricted to those that fall into a warping window, |i-(n/(m/j))| < R, where R is a positive integer window width. This effectively means that the corners of the matrix are pruned from consideration.

• Slope Weighting: (Kruskal & Liberman 1983, Sakoe & Chiba 1978) If the basic DTW recurrence g(i,j) = d(i,j) + min{ g(i-1,j-1), g(i-1,j), g(i,j-1) } is replaced with g(i,j) = d(i,j) + min{ g(i-1,j-1), X g(i-1,j), X g(i,j-1) }, where X is a positive real number, we can constrain the warping by changing the value of X. As X gets larger, the warping path is increasingly biased toward the diagonal.

• Step Patterns (Slope constraints): (Itakura 1975, Myers et al. 1980) We can visualize the DTW recurrence as a diagram of admissible step patterns. The arrows illustrate the permissible steps the warping path may take at each stage. We could replace the recurrence with g(i,j) = d(i,j) + min{ g(i-1,j-1), g(i-1,j-2), g(i-2,j-1) }, which corresponds to the step pattern shown in Figure 4.B. Using this equation, the warping path is forced to move one diagonal step for each step parallel to an axis.
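The windowing and slope-weighting constraints listed above can be sketched as two optional parameters on the DTW recurrence. This is an illustrative sketch only: the function name `dtw_constrained`, the parameter names `window` (R) and `slope_weight` (X), and the absolute-difference local distance are assumptions made for the example. Alternative step patterns are left out to keep it short.

```python
def dtw_constrained(x, y, window=None, slope_weight=1.0):
    """Cumulative DTW cost with an optional warping window and slope weight.

    window:       R in |i - n*j/m| < R; None disables windowing.
    slope_weight: X in the slope-weighted recurrence; must be positive.
    Returns float('inf') if the window leaves no admissible path.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    g = [[inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Windowing: cells outside the band stay at infinity (pruned).
            # n/(m/j) is written as n*j/m, which is the same quantity.
            if window is not None and abs(i - n * j / m) >= window:
                continue
            d = abs(x[i - 1] - y[j - 1])
            # Slope weighting: non-diagonal predecessors are scaled by X.
            g[i][j] = d + min(g[i - 1][j - 1],
                              slope_weight * g[i - 1][j],
                              slope_weight * g[i][j - 1])
    return g[n][m]


x, y = [0, 0, 1, 2], [0, 1, 2, 2]
print(dtw_constrained(x, y))            # unconstrained
print(dtw_constrained(x, y, window=1))  # only the diagonal is allowed
```

With `window=1` and equal-length inputs only the diagonal cells survive, so the result reduces to a point-by-point comparison. Because the slope weight multiplies the accumulated cost, it has no effect on paths whose accumulated cost is zero.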

STOCHASTIC DTW

Many real signals are stochastic processes, such as speech signals, video signals, etc. Therefore, in 1988 a new algorithm called stochastic DTW was proposed. In this method, conditional probabilities are used instead of the local distances of standard DTW, and transition probabilities instead of path costs. The stochastic DTW method was proposed to cope with spectral variations from speaker to speaker.
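As a rough illustration of this idea, the sketch below swaps the local distance for a log conditional probability and the path costs for log transition probabilities, then maximizes instead of minimizing. It is not the formulation of the 1988 method: the Gaussian emission model, the `sigma` parameter, the fixed step probabilities, and the function name `stochastic_dtw` are all assumptions chosen only to make the example runnable.

```python
import math


def stochastic_dtw(x, y, sigma=1.0, log_trans=None):
    """Best log-probability of aligning test sequence x to template y.

    Assumptions (not from the source): p(x_i | y_j) is Gaussian with mean y_j
    and standard deviation sigma, and each step type has a fixed probability.
    """
    if log_trans is None:
        # Hypothetical probabilities for diagonal, vertical, horizontal steps.
        log_trans = {(1, 1): math.log(0.5),
                     (1, 0): math.log(0.25),
                     (0, 1): math.log(0.25)}
    n, m = len(x), len(y)
    neg_inf = float("-inf")

    def log_emission(a, b):
        # Log of the Gaussian density N(a; b, sigma^2).
        return (-0.5 * ((a - b) / sigma) ** 2
                - math.log(sigma * math.sqrt(2 * math.pi)))

    score = [[neg_inf] * m for _ in range(n)]
    score[0][0] = log_emission(x[0], y[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = neg_inf
            for (di, dj), lt in log_trans.items():
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0:
                    # Transition log-probability plays the role of path cost.
                    best = max(best, score[pi][pj] + lt)
            score[i][j] = log_emission(x[i], y[j]) + best
    return score[n - 1][m - 1]


print(stochastic_dtw([1, 2, 3], [1, 2, 3]))
print(stochastic_dtw([1, 2, 3], [3, 2, 1]))
```

A template that matches the test sequence should receive a higher (less negative) log-probability than one that does not, which is how the score would be used for recognition.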

Algorithm

1. Replace the deterministic cost with probabilities:

2. Then replace the right-hand side with the maximum probability and take the logarithm of P:

General equation of stochastic DTW is:-

HIDDEN MARKOV MODEL

• An HMM is defined by a set of N states, K observation symbols and three probabilistic matrices: $M = \{\Pi, A, B\}$

where

$\Pi = \{\Pi_i\}$: initial state probabilities
$A = \{a_{i,j}\}$: state transition probabilities
$B = \{b_{i,j,k}\}$: symbol emission probabilities

The observation symbol generation procedure for this topology is as follows:

1. Start in state i with probability $\Pi_i$.
2. t = 1
3. Move from state i to j with probability $a_{i,j}$ and emit observation symbol $o_t = k$ with probability $b_{i,j,k}$.
4. t = t + 1
5. Go to 3.
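The generation procedure above can be sketched as a short sampling loop. The names `generate`, `pi`, `a`, `b` and `length` are assumptions for the example. Since the procedure as written loops forever, the sketch stops after a fixed number of symbols, which is also an assumption. Emissions are attached to transitions, matching the three-index form b[i][j][k].

```python
import random


def generate(pi, a, b, length, rng=None):
    """Sample `length` observation symbols from an HMM M = {pi, a, b}.

    pi[i]      : initial state probabilities
    a[i][j]    : state transition probabilities
    b[i][j][k] : probability of emitting symbol k on the transition i -> j
    """
    rng = rng or random.Random()
    n_states = len(pi)
    # Step 1: start in state i with probability pi[i].
    state = rng.choices(range(n_states), weights=pi)[0]
    symbols = []
    for _ in range(length):  # steps 2, 4, 5: t = 1, 2, ..., length
        # Step 3: move from i to j with probability a[i][j] ...
        nxt = rng.choices(range(n_states), weights=a[state])[0]
        # ... and emit symbol k with probability b[i][j][k].
        emission = b[state][nxt]
        k = rng.choices(range(len(emission)), weights=emission)[0]
        symbols.append(k)
        state = nxt
    return symbols


# A toy two-state model that alternates between states 0 and 1.
pi = [1.0, 0.0]
a = [[0.0, 1.0], [1.0, 0.0]]
b = [[[0.5, 0.5], [1.0, 0.0]],
     [[0.0, 1.0], [0.5, 0.5]]]
print(generate(pi, a, b, 4))
```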

VITERBI ALGORITHM FOR HMM

The Viterbi algorithm was conceived by Andrew Viterbi in 1967 as an error-correction scheme for noisy digital communication links. The Viterbi algorithm is a dynamic programming algorithm for finding the most likely sequence of hidden states – called the Viterbi path – that results in a sequence of observed events, especially in the context of Markov information sources, and more generally, hidden Markov models.
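A minimal sketch of the Viterbi recursion follows, assuming the same HMM representation M = {pi, a, b} with emissions attached to transitions (b[i][j][k]). The function name `viterbi` and the use of log-probabilities to avoid numerical underflow are choices made for this example, not details from the source.

```python
import math


def viterbi(pi, a, b, observations):
    """Return (most likely state path, its log-probability).

    The path has len(observations) + 1 states: the initial state followed by
    the state reached after each emitted symbol.
    """
    n = len(pi)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # delta[s] = best log-probability of any path currently ending in state s
    delta = [log(p) for p in pi]
    back = []  # back[t][j] = best predecessor of state j at step t
    for o in observations:
        new_delta = [float("-inf")] * n
        pointers = [0] * n
        for j in range(n):
            for i in range(n):
                s = delta[i] + log(a[i][j]) + log(b[i][j][o])
                if s > new_delta[j]:
                    new_delta[j], pointers[j] = s, i
        delta = new_delta
        back.append(pointers)

    # Backtrack from the best final state to recover the Viterbi path.
    last = max(range(n), key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


# Toy two-state model that alternates between states 0 and 1.
pi = [1.0, 0.0]
a = [[0.0, 1.0], [1.0, 0.0]]
b = [[[0.5, 0.5], [1.0, 0.0]],
     [[0.0, 1.0], [0.5, 0.5]]]
print(viterbi(pi, a, b, [0, 1, 0]))
```

Like DTW, this is a dynamic program: each observation updates one score per state, so the cost grows with the number of observations times the square of the number of states.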

Thank you
