USE OF IMPROVED FEATURE VECTORS IN
SPECTRAL SUBTRACTION METHOD
Emrah Besci, Semih Ergin, M.Bilginer Gülmezoğlu, Atalay Barkana
Osmangazi University, Electrical and Electronics Engineering Department
Batı Meşelik, Eskişehir, Turkey
PURPOSE To improve the recognition rates of isolated digits in noisy environments.
INTRODUCTION The database used in this study is the TI-Digit Database, which contains pronunciations of 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'zero', and 'ow'.
In the study 426 feature vectors are used for the training set and 20 feature vectors are used for the test set.
INTRODUCTION Gaussian White Noise is added to the clean speech signals and noisy speech signals with 0, 5, 10 and 20 dB SNR are obtained.
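The noise-mixing step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `add_white_noise` and `measured_snr_db`, the synthetic sine standing in for a speech signal, and the choice to define SNR over the whole utterance are all assumptions.

```python
import numpy as np

def add_white_noise(clean, snr_db, rng=None):
    """Return clean + Gaussian white noise scaled to the requested SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    clean = np.asarray(clean, dtype=float)
    signal_power = np.mean(clean ** 2)
    noise = rng.standard_normal(clean.shape)
    # Scale the noise so that 10*log10(signal_power / noise_power) == snr_db.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise *= np.sqrt(target_noise_power / np.mean(noise ** 2))
    return clean + noise

def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy signal, given the clean reference."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

# Hypothetical stand-in for a clean utterance: one second of a 440 Hz tone.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000.0)

# One noisy copy per SNR level used in the study.
noisy = {snr: add_white_noise(clean, snr, rng) for snr in (0, 5, 10, 20)}
```

Because the noise is rescaled after it is drawn, each noisy copy has exactly the requested SNR relative to the clean signal.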
The Root-Melcep parameters used in this study are obtained from the speech signals by applying the variable frame length algorithm, and the feature vectors are constructed from these parameters.
INTRODUCTION Spectral Subtraction Method is used in the cleaning phase and Common Vector Approach (CVA) is used in the recognition phase.
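The slides do not say which variant of spectral subtraction is used, so the following is only a sketch of the basic magnitude-subtraction idea. The function name `spectral_subtraction`, the non-overlapping frames, the frame length of 256 samples, and the assumption that the first five frames contain noise only are all illustrative choices, not details from the study.

```python
import numpy as np

def spectral_subtraction(noisy, frame_len=256, noise_frames=5):
    """Basic magnitude spectral subtraction on non-overlapping frames.

    Assumes the first `noise_frames` frames contain noise only and uses
    their average magnitude spectrum as the noise estimate.
    Only whole frames are processed; trailing samples are dropped.
    """
    noisy = np.asarray(noisy, dtype=float)
    n_frames = len(noisy) // frame_len
    frames = noisy[: n_frames * frame_len].reshape(n_frames, frame_len)

    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)

    # Noise estimate: mean magnitude spectrum of the leading frames.
    noise_mag = magnitude[:noise_frames].mean(axis=0)

    # Subtract the noise estimate and floor negative values at zero.
    cleaned_mag = np.maximum(magnitude - noise_mag, 0.0)

    # Rebuild the time signal using the phase of the noisy signal.
    cleaned = np.fft.irfft(cleaned_mag * np.exp(1j * phase), n=frame_len, axis=1)
    return cleaned.reshape(-1)

# Hypothetical test signal: five noise-only frames, then a 300 Hz tone in noise.
rng = np.random.default_rng(0)
n = 8192
t = np.arange(n) / 8000.0
tone = np.where(np.arange(n) >= 5 * 256, np.sin(2 * np.pi * 300 * t), 0.0)
noisy_signal = tone + 0.1 * rng.standard_normal(n)
cleaned_signal = spectral_subtraction(noisy_signal)
```

Flooring at zero keeps every magnitude bin at or below its noisy value, so the cleaned signal never has more energy than the input.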
PROPOSED STUDY The Variable Frame Length (VFL) Algorithm is the key concept of this study: each word is divided into a fixed number of frames, which is 10 in this study, so the frame length varies with the length of the word.
Thus parameters obtained from the speech signal itself, instead of random values, are used at the end of the feature vectors.
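A minimal sketch of the framing idea, assuming non-overlapping frames of near-equal length (the slides do not state whether frames overlap). The helper name `split_into_frames` is hypothetical, and the Root-Melcep computation on each frame is not shown.

```python
import numpy as np

def split_into_frames(signal, n_frames=10):
    """Split a word into a fixed number of frames whose length depends on the word.

    Frame lengths differ by at most one sample, and every sample of the
    word is used, so no padding values are needed.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < n_frames:
        raise ValueError("signal is shorter than the number of frames")
    return np.array_split(signal, n_frames)

# Two hypothetical words of different duration give the same number of frames.
short_word = np.arange(1234.0)
long_word = np.arange(4321.0)
short_frames = split_into_frames(short_word)
long_frames = split_into_frames(long_word)
```

With a fixed number of parameters per frame, every word then yields a feature vector of the same dimension regardless of its duration.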
PROPOSED STUDY Another important point is that a bandpass filter is applied to the noisy speech signals.
PROPOSED STUDY Common Vector Approach
The feature vectors for a word-class:
a_1, a_2, …, a_m : feature vectors, a_i ∈ R^n (i = 1, 2, …, m)
m : the number of speakers
n : the dimension of each feature vector
m > n in this study.
This n-dimensional feature space can be divided into
a (k-1)-dimensional orthogonal difference subspace B and
an (n-(k-1))-dimensional orthogonal indifference subspace B⊥.
PROPOSED STUDY Common Vector Approach (continued)
B is spanned by the orthonormal basis vectors u_j ∈ R^n for j = 1, 2, …, k-1 (k-1 < n).
B⊥ is spanned by the orthonormal basis vectors u_j ∈ R^n for j = k, k+1, …, n.
PROPOSED STUDY Common Vector Approach (continued)
The orthogonal projection matrix P onto the difference subspace B:
$$P = \sum_{j=1}^{k-1} u_j u_j^T \qquad (4)$$
The orthogonal projection matrix P⊥ onto the indifference subspace B⊥:
$$P^{\perp} = \sum_{j=k}^{n} u_j u_j^T \qquad (5)$$
From here, the common vector a_com:
$$a_{com} = P^{\perp} a_{ave} = \frac{1}{m} \sum_{i=1}^{m} P^{\perp} a_i \qquad (6)$$
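Equations (4) to (6) can be sketched numerically as below. The slides do not say how the basis vectors u_j are obtained. This sketch assumes they are the eigenvectors of the class scatter (covariance) matrix, ordered by decreasing eigenvalue, so that the first k-1 span B. The function name `common_vector` and the random data are hypothetical.

```python
import numpy as np

def common_vector(A, k):
    """Compute the common vector of one word-class.

    A : (m, n) array whose rows are the feature vectors a_i, with m > n.
    k : k - 1 is the dimension of the difference subspace B.
    Returns (a_com, P, P_perp).
    """
    A = np.asarray(A, dtype=float)
    a_ave = A.mean(axis=0)
    centered = A - a_ave
    scatter = centered.T @ centered            # n x n, symmetric

    # Assumption: u_j are eigenvectors of the scatter matrix.
    # eigh returns ascending eigenvalues, so reverse to descending order.
    _, eigvecs = np.linalg.eigh(scatter)
    U = eigvecs[:, ::-1]

    U_diff = U[:, : k - 1]                     # u_1 ... u_{k-1}, spans B
    U_indiff = U[:, k - 1 :]                   # u_k ... u_n, spans B_perp

    P = U_diff @ U_diff.T                      # equation (4)
    P_perp = U_indiff @ U_indiff.T             # equation (5)
    a_com = P_perp @ a_ave                     # equation (6)
    return a_com, P, P_perp

# Hypothetical class: m = 30 speakers, n = 6 dimensions, k = 4.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 6)) + np.arange(6.0)
a_com, P, P_perp = common_vector(A, k=4)
```

Since the u_j form an orthonormal basis of R^n, P and P⊥ sum to the identity, and projecting the average vector gives the same result as averaging the projected vectors, as equation (6) states.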
RESULTS Recognition Rates (%) For Cleaned Speech Signals (Spectral Subtraction with 250 Parameters and VFL Algorithm)
Table 1.
SNR      Training Set   Test Set
0 dB     54.8229        45.4545
5 dB     81.5194        71.3636
10 dB    96.4789        91.3636
20 dB    99.6586        97.7273
RESULTS Recognition Rates (%) For Noisy Speech Signals (Spectral Subtraction with 250 Parameters and VFL Algorithm)
Table 2.
SNR      Training Set   Test Set
0 dB     30.0469        29.5455
5 dB     49.1677        44.0909
10 dB    71.2761        60.4545
20 dB    92.1041        85.4545
RESULTS Spectral Subtraction Recognition Rates (%) For Cleaned Speech Signals
Table 1. 250 Parameters with VFL Algorithm
SNR      Training Set   Test Set
0 dB     54.8           45.5
5 dB     81.5           71.4
10 dB    96.5           91.4
20 dB    99.7           97.7
Table 2. 407 Parameters
SNR      Training Set   Test Set
0 dB     25             28
5 dB     50             43
10 dB    73             66
20 dB    94             80
CONCLUSION The recognition rates obtained by using Spectral Subtraction with the Variable Frame Length (VFL) Algorithm and the Common Vector Approach are higher than those obtained by the Spectral Subtraction method with 407 parameters.
The results are better because all of the data used belongs to the speech sample and the bandpass filter decreases the effect of the noise.
THANK YOU