Signal Processing and Analysis Methods for Speech Recognition, by Sarita Jondhale
![Page 1: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/1.jpg)
By Sarita Jondhale 1
Signal Processing And Analysis Methods For Speech
Recognition
![Page 2: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/2.jpg)
Introduction
• Spectral analysis is the process of describing the speech signal by a set of parameters for further processing
• E.g. short-term energy, zero-crossing rate, level-crossing rate, and so on
• Methods for spectral analysis are therefore considered the core of the signal processing front end in a speech recognition system
![Page 3: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/3.jpg)
Spectral Analysis models
• Pattern recognition model
• Acoustic phonetic model
![Page 4: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/4.jpg)
Spectral Analysis Model
Parameter measurement is common to both models
![Page 5: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/5.jpg)
Pattern recognition Model
• The three basic steps in the pattern recognition model are
– 1. parameter measurement
– 2. pattern comparison
– 3. decision making
![Page 6: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/6.jpg)
1. Parameter measurement
• To represent the relevant acoustic events in the speech signal in terms of a compact, efficient set of speech parameters
• The choice of which parameters to use is dictated by other considerations, e.g.
– computational efficiency
– type of implementation
– available memory
• The way in which the representation is computed is based on signal processing considerations
![Page 7: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/7.jpg)
Acoustic phonetic Model
![Page 8: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/8.jpg)
Spectral Analysis
• Two methods:
– The Filter Bank spectrum
– The Linear Predictive coding (LPC)
![Page 9: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/9.jpg)
The Filter Bank spectrum
[Figure: filter bank taking the digital input and producing the spectral representation]
The coverage of the bandpass filters spans the frequency range of interest in the signal
![Page 10: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/10.jpg)
1. The Bank of Filters Front end Processor
• One of the most common approaches for processing the speech signal is the bank-of-filters model
• This method takes a speech signal as input and passes it through a set of filters in order to obtain the spectral representation of each frequency band of interest.
![Page 11: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/11.jpg)
• E.g.
– 100-3000 Hz for a telephone-quality signal
– 100-8000 Hz for a broadband signal
• The individual filters generally do overlap in frequency
• The output of the ith bandpass filter is $X_n(e^{j\omega_i})$, where $\omega_i$ is the normalized frequency
![Page 12: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/12.jpg)
• Each bandpass filter processes the speech signal independently to produce the spectral representation Xn
![Page 13: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/13.jpg)
The Bank of Filters Front end Processor
![Page 14: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/14.jpg)
The Bank of Filters Front end Processor
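The sampled speech signal, s(n), is passed through a bank of Q bandpass filters, giving the signals

$$s_i(n) = s(n) * h_i(n) = \sum_{m=0}^{M_i - 1} h_i(m)\, s(n-m), \qquad 1 \le i \le Q$$

where $h_i(m)$ is the impulse response of the ith bandpass filter, with a duration of $M_i$ samples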
![Page 15: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/15.jpg)
The Bank of Filters Front end Processor
The bank-of-filters approach obtains the energy value of the speech signal through the following steps:
• Signal enhancement and noise elimination: to make the speech signal more evident to the bank of filters.
• Set of bandpass filters: separates the signal into frequency bands (uniform or nonuniform filters).
![Page 16: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/16.jpg)
• Nonlinearity: the filtered signal in every band is passed through a nonlinear function (for example a full-wave or half-wave rectifier), which shifts the bandpass spectrum to the low-frequency band.
![Page 17: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/17.jpg)
The Bank of Filters Front end Processor
• Lowpass filter: eliminates the high-frequency components generated by the nonlinear function.
• Sampling rate reduction and amplitude compression: the resulting signals are represented more economically by resampling at a reduced rate and compressing the signal dynamic range.
The role of the final lowpass filter is to eliminate the undesired spectral peaks
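The processing chain described on these slides (bandpass filter, nonlinearity, lowpass filter, sampling rate reduction, amplitude compression) can be sketched roughly as follows. This is only an illustrative sketch: the windowed-sinc filter design, the 50 Hz smoothing cutoff, the decimation factor, the band edges, and all function names are assumptions, not values taken from the slides.

```python
import numpy as np

def lowpass_taps(fc, fs, num_taps=101):
    """Ideal lowpass impulse response (cutoff fc in Hz), Hamming windowed."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    return 2 * fc / fs * np.sinc(2 * fc / fs * n) * np.hamming(num_taps)

def bandpass_taps(f_lo, f_hi, fs, num_taps=101):
    """Bandpass FIR built as the difference of two windowed lowpass filters."""
    return lowpass_taps(f_hi, fs, num_taps) - lowpass_taps(f_lo, fs, num_taps)

def filter_bank_front_end(s, fs, edges, lp_cutoff=50.0, decim=80):
    """Return a (frames, Q) array of log channel envelopes."""
    smoother = lowpass_taps(lp_cutoff, fs)
    smoother = smoother / smoother.sum()            # unity gain at DC
    channels = []
    for f_lo, f_hi in zip(edges[:-1], edges[1:]):
        band = np.convolve(s, bandpass_taps(f_lo, f_hi, fs), mode="same")
        rectified = np.abs(band)                    # full-wave rectifier
        envelope = np.convolve(rectified, smoother, mode="same")
        envelope = envelope[::decim]                # sampling rate reduction
        channels.append(np.log(np.maximum(envelope, 1e-8)))  # compression
    return np.stack(channels, axis=1)

fs = 8000                                           # assumed sampling rate (Hz)
t = np.arange(fs) / fs
s = np.sin(2 * np.pi * 1100 * t)                    # test tone at 1100 Hz
edges = [100, 500, 900, 1300, 1700, 2100]           # assumed band edges (Hz)
features = filter_bank_front_end(s, fs, edges)
```

With these assumed settings the 1100 Hz tone falls in the third band (900 to 1300 Hz), so that column of `features` carries the largest values.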
![Page 18: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/18.jpg)
The Bank of Filters Front end Processor
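Assume that the output of the ith bandpass filter is a pure sinusoid at frequency $\omega_i$:

$$s_i(n) = \alpha_i \sin(\omega_i n)$$

If a full-wave rectifier is used as the nonlinearity:

$$f(s_i(n)) = \begin{cases} s_i(n) & \text{for } s_i(n) \ge 0 \\ -s_i(n) & \text{for } s_i(n) < 0 \end{cases}$$

The nonlinearity output:

$$v_i(n) = f(s_i(n)) = s_i(n) \cdot w(n)$$

where

$$w(n) = \begin{cases} +1 & \text{if } s_i(n) \ge 0 \\ -1 & \text{if } s_i(n) < 0 \end{cases}$$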
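A short numerical sketch can show what the full-wave rectifier does to a sinusoidal channel output: multiplying by the +1/-1 sequence w(n) gives the absolute value, whose spectrum has a strong DC term plus components at even multiples of the channel frequency, which is what the following lowpass filter then removes. The sampling rate and the 500 Hz channel frequency are assumed values chosen for illustration.

```python
import numpy as np

fs = 8000                          # assumed sampling rate (Hz)
f_i = 500                          # assumed channel frequency (Hz)
n = np.arange(fs)                  # one second of samples, so 1 Hz per FFT bin
s_i = np.sin(2 * np.pi * f_i * n / fs)

w = np.where(s_i >= 0, 1.0, -1.0)  # w(n) = +1 or -1 depending on the sign of s_i(n)
v_i = s_i * w                      # rectifier output, equal to |s_i(n)|

# One-sided magnitude spectrum; bin index equals frequency in Hz here
spectrum = np.abs(np.fft.rfft(v_i)) / len(v_i)
strongest = np.argsort(spectrum)[-3:][::-1]   # three largest bins, descending
```

Here `strongest` lists the bins at 0 Hz, 1000 Hz, and 2000 Hz: the rectified signal no longer has energy at 500 Hz itself, only at DC and at even multiples of the original frequency.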
![Page 19: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/19.jpg)
Types of Filter Bank Used For Speech Recognition
• Uniform filter bank
• Nonuniform filter bank
![Page 20: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/20.jpg)
uniform filter bank
• The most common filter bank is the uniform filter bank
• The center frequency, $f_i$, of the ith bandpass filter is defined as

$$f_i = \frac{F_s}{N}\, i, \qquad 1 \le i \le Q$$

where $F_s$ is the sampling rate of the speech signal and N is the number of uniformly spaced filters required to span the frequency range of the speech.
• Q is the number of filters used in the bank of filters
![Page 21: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/21.jpg)
uniform filter bank
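• The actual number of filters used in the filter bank, Q, satisfies

$$Q \le N/2$$

• $b_i$ is the bandwidth of the ith filter, which satisfies

$$b_i \ge \frac{F_s}{N}$$

• With equality ($b_i = F_s/N$) there is no frequency overlap between adjacent filter channels; with inequality, adjacent channels overlap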
![Page 22: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/22.jpg)
uniform filter bank
If $b_i < F_s/N$, then certain portions of the speech spectrum would be missing from the analysis, and the resulting speech spectrum would not be very meaningful
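As a small worked example of the uniform spacing rule (center frequencies at multiples of Fs/N, at most N/2 filters, bandwidth at least Fs/N), the sketch below lists the channels for one choice of parameters. The function name and the values Fs = 8000 Hz and N = 32 are assumptions for illustration only.

```python
def uniform_filter_bank(fs, n_bands):
    """Center frequencies f_i = (fs / n_bands) * i for i = 1..Q, with Q = n_bands // 2."""
    q = n_bands // 2                      # largest Q allowed by Q <= N/2
    spacing = fs / n_bands
    centers = [spacing * i for i in range(1, q + 1)]
    bandwidths = [spacing] * q            # minimum bandwidth: channels just touch
    return centers, bandwidths

centers, bandwidths = uniform_filter_bank(fs=8000, n_bands=32)
```

With these values the bank has 16 channels spaced 250 Hz apart, from 250 Hz up to 4000 Hz. Any bandwidth smaller than 250 Hz would leave gaps between adjacent channels.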
![Page 23: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/23.jpg)
nonuniform filter bank
• An alternative to the uniform filter bank is the nonuniform filter bank
• The criterion is to space the filters uniformly along a logarithmic frequency scale.
• For a set of Q bandpass filters with center frequencies fi and bandwidths bi, 1≤i≤Q, we set
![Page 24: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/24.jpg)
nonuniform filter bank

$$b_1 = C$$

$$b_i = \alpha\, b_{i-1}, \qquad 2 \le i \le Q$$

$$f_i = f_1 + \sum_{j=1}^{i-1} b_j + \frac{(b_i - b_1)}{2}$$

where C and $f_1$ are the arbitrary bandwidth and center frequency of the first filter, and $\alpha$ is the logarithmic growth factor
![Page 25: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/25.jpg)
• The most commonly used value is α = 2
• This gives an octave-band spacing of adjacent filters
• α = 4/3 gives 1/3-octave filter spacing
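The logarithmic spacing can be made concrete with a few lines of code. The sketch assumes the recursion implied by the slides: the first bandwidth is C, each later bandwidth is α times the previous one, and each center frequency sits so that adjacent bands just touch. The function name and the example values (C = 200 Hz, f1 = 300 Hz, α = 2, Q = 4) are illustrative assumptions.

```python
def log_spaced_filter_bank(c, f1, alpha, q):
    """Bandwidths b_1 = c, b_i = alpha * b_(i-1); centers placed so bands are contiguous."""
    bandwidths = [c]
    for _ in range(1, q):
        bandwidths.append(alpha * bandwidths[-1])
    centers = []
    for i in range(q):                    # i is zero-based; the filter number is i + 1
        lower_bands = sum(bandwidths[:i]) # sum of b_j over all earlier filters
        centers.append(f1 + lower_bands + (bandwidths[i] - bandwidths[0]) / 2)
    return centers, bandwidths

centers, bandwidths = log_spaced_filter_bank(c=200.0, f1=300.0, alpha=2.0, q=4)
```

For these values the bandwidths are 200, 400, 800, and 1600 Hz and the center frequencies are 300, 600, 1200, and 2400 Hz, so each center is one octave above the previous one.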
![Page 26: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/26.jpg)
Implementations of Filter Banks
• Depending on the design method, the filter bank can be implemented in various ways.
• Design methods for digital filters fall into two classes:
– Infinite impulse response (IIR) (recursive filters)
– Finite impulse response (FIR) (nonrecursive filters)
![Page 27: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/27.jpg)
The FIR filter: (finite impulse response) or nonrecursive filter
• The present output depends on the present input sample and previous input samples
• The impulse response is restricted to a finite number of samples
![Page 28: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/28.jpg)
• Advantages:
– Stable; less severely affected by noise
– Excellent design methods are available for various kinds of FIR filters
– Phase response can be designed to be exactly linear
• Disadvantages:
– Costly to implement
– Memory requirement and execution time are high
– Require powerful computational facilities
![Page 29: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/29.jpg)
The IIR filter: (infinite impulse response) or recursive filter
• The present output sample depends on the present input, past input samples, and past output samples
• The impulse response extends over an infinite duration
![Page 30: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/30.jpg)
• Advantages:
– Simple to design
– Efficient
• Disadvantages:
– Phase response is nonlinear
– More strongly affected by noise
– Stability is not guaranteed
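The difference between the two filter classes shows up directly in their difference equations. The sketch below uses a three-tap FIR filter and a one-pole IIR filter, both chosen arbitrarily for illustration; the function names and coefficients are assumptions, not filters from the slides.

```python
import numpy as np

def fir_filter(x, h):
    """y(n) = sum_m h(m) x(n-m): uses the present and past inputs only."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        for m in range(len(h)):
            if n - m >= 0:
                y[n] += h[m] * x[n - m]
    return y

def one_pole_iir(x, a):
    """y(n) = x(n) + a * y(n-1): feeds the previous output back in."""
    y = np.zeros(len(x))
    previous = 0.0
    for n in range(len(x)):
        previous = x[n] + a * previous
        y[n] = previous
    return y

impulse = np.zeros(20)
impulse[0] = 1.0

fir_response = fir_filter(impulse, [0.25, 0.5, 0.25])  # dies out after 3 samples
iir_stable = one_pole_iir(impulse, 0.9)                # decays but never reaches zero
iir_unstable = one_pole_iir(impulse, 1.1)              # grows without bound
```

The FIR impulse response is exactly zero after its three taps, while the recursive filter keeps responding forever, and with a feedback coefficient larger than 1 in magnitude its output grows, which is why stability has to be checked for IIR designs.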
![Page 31: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/31.jpg)
FIR Filters

$$x_i(n) = s(n) * h_i(n) = \sum_{m=0}^{L-1} h_i(m)\, s(n-m), \qquad i = 1, 2, \ldots, Q$$

where
• $h_i(n)$ is the impulse response of the ith channel, $0 \le n \le L-1$, where L is the number of samples
• $x_i(n)$ is the output of the ith channel
• $s(n)$ is the input signal
![Page 32: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/32.jpg)
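FIR Filters
• A less expensive implementation can be derived by representing each bandpass filter by a fixed lowpass window $w(n)$ modulated by the complex exponential

$$h_i(n) = w(n)\, e^{j\omega_i n}$$

$$x_i(n) = \sum_m w(m)\, e^{j\omega_i m}\, s(n-m)$$

$$= \sum_m s(m)\, w(n-m)\, e^{j\omega_i (n-m)}$$

$$= e^{j\omega_i n} \sum_m s(m)\, w(n-m)\, e^{-j\omega_i m}$$

$$= e^{j\omega_i n}\, S_n(e^{j\omega_i})$$

where $S_n(e^{j\omega_i})$ is the short-time Fourier transform of $s(n)$ at the normalized frequency $\omega_i = 2\pi f_i / F_s$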
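The equivalence between the modulated-window filter and the short-time Fourier transform can be checked numerically. The sketch below filters a random test signal directly with h_i(n) = w(n) exp(jω_i n) and compares one output sample with exp(jω_i n) times the short-time transform at the same instant. The Hamming window, its length, the center frequency, and the helper name `short_time_ft` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.standard_normal(400)           # arbitrary test signal
L = 64                                 # assumed window length
w = np.hamming(L)                      # assumed lowpass window w(n)
omega_i = 2 * np.pi * 0.1              # assumed normalized center frequency (rad/sample)

# Direct form: convolve s(n) with h_i(n) = w(n) exp(j * omega_i * n)
h_i = w * np.exp(1j * omega_i * np.arange(L))
x_direct = np.convolve(s, h_i)[: len(s)]

def short_time_ft(s, w, omega, n):
    """S_n(e^{j omega}) = sum_m s(m) w(n - m) exp(-j omega m)."""
    m = np.arange(max(0, n - len(w) + 1), n + 1)   # samples under the window
    return np.sum(s[m] * w[n - m] * np.exp(-1j * omega * m))

n0 = 200
x_from_stft = np.exp(1j * omega_i * n0) * short_time_ft(s, w, omega_i, n0)
```

The two values `x_direct[n0]` and `x_from_stft` agree to floating-point precision, which is the identity the derivation arrives at.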
![Page 33: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/33.jpg)
Frequency Domain Interpretation For Short Term Fourier Transform

$$S_n(e^{j\omega_i}) = \sum_m s(m)\, w(n-m)\, e^{-j\omega_i m} \qquad \text{(A)}$$

At $n = n_0$:

$$S_{n_0}(e^{j\omega_i}) = FT[\, s(m)\, w(n_0 - m)\, ]\,\big|_{\omega = \omega_i}$$

where FT[.] denotes the Fourier transform. $S_{n_0}(e^{j\omega_i})$ is the conventional Fourier transform of the windowed signal, $s(m)\, w(n_0 - m)$, evaluated at the frequency $\omega = \omega_i$
![Page 34: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/34.jpg)
Frequency Domain Interpretation For Short Term Fourier Transform
[Figure] Shows which parts of s(m) are used in the computation of the short-time Fourier transform
![Page 35: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/35.jpg)
Frequency Domain Interpretation For Short Term Fourier Transform
• Since w(m) is an FIR filter of length L, from the definition of $S_n(e^{j\omega_i})$ we can state that
– If L is large relative to the signal periodicity, then $S_n(e^{j\omega_i})$ gives good frequency resolution
– If L is small relative to the signal periodicity, then $S_n(e^{j\omega_i})$ gives poor frequency resolution
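The two cases can be illustrated with a synthetic periodic signal. The sketch below builds a crude "voiced" signal from 20 harmonics of a 100 Hz pitch and compares the spectrum at a harmonic (1000 Hz) with the spectrum halfway between two harmonics (1050 Hz), once with a 500-point and once with a 50-point Hamming window. The signal model, the frame positions, and the helper names are assumptions for illustration.

```python
import numpy as np

fs = 8000                                  # assumed sampling rate (Hz)
pitch = 100                                # assumed pitch (Hz), period of 80 samples
n = np.arange(2000)
# Crude periodic test signal: sum of the first 20 harmonics of the pitch
s = sum(np.cos(2 * np.pi * pitch * k * n / fs) for k in range(1, 21))

def windowed_spectrum(s, start, L, nfft=8000):
    """Magnitude spectrum of an L-point Hamming-windowed frame (1 Hz per bin)."""
    frame = s[start:start + L] * np.hamming(L)
    return np.abs(np.fft.rfft(frame, nfft))

def peak_to_valley(spec, harmonic_hz=1000, valley_hz=1050):
    return spec[harmonic_hz] / spec[valley_hz]

# Long window: spans more than six pitch periods
long_ratio = peak_to_valley(windowed_spectrum(s, 500, 500))
# Short window: shorter than one pitch period, centered near a pitch pulse
short_ratio = peak_to_valley(windowed_spectrum(s, 535, 50))
```

With the long window the harmonic stands far above the valley between harmonics, so the individual harmonics are resolved. With the short window the two values are nearly equal, so only the smooth envelope remains.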
![Page 36: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/36.jpg)
Frequency Domain Interpretation For Short Term Fourier Transform
An L = 500 point Hamming window is applied to a section of voiced speech.
The periodicity of the signal is seen in the windowed time waveform as well as in the short-time spectrum, in which the fundamental frequency and its harmonics show up as narrow peaks at equally spaced frequencies.
![Page 37: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/37.jpg)
Frequency Domain Interpretation For Short Term Fourier Transform
For short windows, the time sequence s(m)w(n-m) doesn't show the signal periodicity, nor does the signal spectrum. The spectrum instead shows the broad spectral envelope very well.
![Page 38: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/38.jpg)
Frequency Domain Interpretation For Short Term Fourier Transform
[Figure] Shows an irregular series of local peaks and valleys due to the random nature of unvoiced speech
![Page 39: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/39.jpg)
Frequency Domain Interpretation For Short Term Fourier Transform
Using the shorter window smoothes out the random fluctuations in the short time spectral magnitude and shows the broad spectral envelope very well
![Page 40: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/40.jpg)
Linear Filtering Interpretation of the Short-Time Fourier Transform
• The linear filtering interpretation of the short-time Fourier transform follows from (A):

$$S_n(e^{j\omega_i}) = \left( s(n)\, e^{-j\omega_i n} \right) * w(n)$$

• i.e. $S_n(e^{j\omega_i})$ is a convolution of the lowpass window, w(n), with the speech signal, s(n), modulated to the center frequency $\omega_i$
![Page 41: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/41.jpg)
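FFT Implementation of Uniform Filter Bank Based on the Short-Time FT

Assume uniformly spaced center frequencies

$$f_i = i\,(F_s/N), \qquad i = 0, 1, 2, \ldots, N-1$$

We know that

$$x_i(n) = e^{j\omega_i n} \sum_m s(m)\, w(n-m)\, e^{-j\omega_i m}, \qquad \text{where } \omega_i = 2\pi f_i / F_s$$

$$= e^{j\left(\frac{2\pi}{N}\right) i n} \sum_m s(m)\, w(n-m)\, e^{-j\left(\frac{2\pi}{N}\right) i m}$$

Now assume $m = Nr + k$, $0 \le k \le N-1$, $-\infty < r < \infty$

Let $s_n(m) = s(m)\, w(n-m)$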
![Page 42: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/42.jpg)
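FFT Implementation of Uniform Filter Bank Based on the Short-Time FT

$$x_i(n) = e^{j\left(\frac{2\pi}{N}\right) i n} \sum_{r} \sum_{k=0}^{N-1} s_n(Nr + k)\, e^{-j\left(\frac{2\pi}{N}\right) i (Nr + k)}$$

Since $e^{-j 2\pi i r} = 1$ for all i, r, then

$$x_i(n) = e^{j\left(\frac{2\pi}{N}\right) i n} \sum_{k=0}^{N-1} \left[ \sum_{r} s_n(Nr + k) \right] e^{-j\left(\frac{2\pi}{N}\right) i k}$$

We define

$$u_n(k) = \sum_{r} s_n(Nr + k), \qquad 0 \le k \le N-1$$

so that

$$x_i(n) = e^{j\left(\frac{2\pi}{N}\right) i n} \sum_{k=0}^{N-1} u_n(k)\, e^{-j\left(\frac{2\pi}{N}\right) i k}$$

is the desired result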
![Page 43: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/43.jpg)
FFT Implementation of Uniform Filter Bank Based on The Short Time FT
The FFT implementation is more efficient than the direct form structure
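The FFT structure can be sketched as follows: the windowed segment s_n(m) = s(m) w(n-m) is folded modulo N into an N-point sequence u_n(k), a single N-point FFT is taken, and each channel is multiplied by its modulation term. The sketch compares this with the direct per-channel sum. The signal, window, N = 16, L = 64, and the analysis time n = 300 are assumed values.

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.standard_normal(512)       # arbitrary test signal
N = 16                             # number of uniformly spaced channels
L = 64                             # window length, deliberately longer than N
w = np.hamming(L)                  # assumed analysis window
n = 300                            # analysis time index

# s_n(m) = s(m) w(n - m), non-zero only for n-L+1 <= m <= n
m = np.arange(n - L + 1, n + 1)
s_n = s[m] * w[n - m]

# u_n(k) = sum over r of s_n(N r + k): fold s_n modulo N (time aliasing)
u_n = np.zeros(N)
np.add.at(u_n, m % N, s_n)

# All N channel outputs from one N-point FFT plus the modulation term
i = np.arange(N)
x_fft = np.exp(2j * np.pi * i * n / N) * np.fft.fft(u_n)

# Direct form for comparison: one length-L sum per channel
x_direct = np.array([
    np.exp(2j * np.pi * c * n / N) * np.sum(s_n * np.exp(-2j * np.pi * c * m / N))
    for c in range(N)
])
```

Both arrays agree to floating-point precision. The direct form costs on the order of N·L operations per time index, while the folded version needs L additions plus one N-point FFT, which is where the efficiency gain comes from.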
![Page 44: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/44.jpg)
Nonuniform FIR Filter Bank Implementations
The most general form of a nonuniform FIR filter bank
![Page 45: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/45.jpg)
Nonuniform FIR Filter Bank Implementations
• The kth bandpass filter impulse response, $h_k(n)$, represents a filter with center frequency $\omega_k$ and bandwidth $\Delta\omega_k$.
• The set of Q bandpass filters covers the frequency range of interest for the intended speech recognition application
![Page 46: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/46.jpg)
Nonuniform FIR Filter Bank Implementations
• Each band pass filter is implemented via a direct convolution
• Each band pass filter is designed via the windowing design method
• The composite frequency response of the Q-channel filter bank is independent of the number and distribution of the individual filters
![Page 47: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/47.jpg)
Nonuniform FIR Filter Bank Implementations
A filter bank with the three filters has exactly the same composite frequency response as the filter bank with the seven filters shown in the figure above
![Page 48: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/48.jpg)
Nonuniform FIR Filter Bank Implementations
• The impulse response of the kth bandpass filter

$$h_k(n) = w(n)\, \tilde{h}_k(n)$$

where $w(n)$ is the FIR window and $\tilde{h}_k(n)$ is the impulse response of the ideal bandpass filter
• The frequency response of the kth bandpass filter

$$H_k(e^{j\omega}) = W(e^{j\omega}) * \tilde{H}_k(e^{j\omega})$$
![Page 49: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/49.jpg)
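Nonuniform FIR Filter Bank Implementations
Thus the frequency response of the composite filter bank

$$H(e^{j\omega}) = \sum_{k=1}^{Q} H_k(e^{j\omega}) = \sum_{k=1}^{Q} W(e^{j\omega}) * \tilde{H}_k(e^{j\omega})$$

By interchanging the summation and the convolution, we get

$$H(e^{j\omega}) = W(e^{j\omega}) * \sum_{k=1}^{Q} \tilde{H}_k(e^{j\omega}) \qquad (1)$$

The summation is the summation of ideal frequency responses

$$\hat{H}(e^{j\omega}) = \sum_{k=1}^{Q} \tilde{H}_k(e^{j\omega}) = \begin{cases} 1, & \omega_{\min} \le \omega \le \omega_{\max} \\ 0, & \text{otherwise} \end{cases}$$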
![Page 50: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/50.jpg)
Nonuniform FIR Filter Bank Implementations
• Where $\omega_{\min}$ is the lowest frequency in the filter bank and $\omega_{\max}$ is the highest frequency
• Equation 1 can be written as

$$H(e^{j\omega}) = W(e^{j\omega}) * \hat{H}(e^{j\omega})$$

• Which is independent of the number of ideal filters, Q, and their distribution in frequency
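This independence can be verified numerically under the assumptions the argument relies on: every channel is an ideal bandpass response multiplied by the same window, and the bands are contiguous between ω_min and ω_max. The sketch below sums the channel impulse responses for a three-band and a seven-band split of the same range; the band edges, the 101-tap Hamming window, and the function names are illustrative assumptions.

```python
import numpy as np

def ideal_bandpass(w_lo, w_hi, num_taps):
    """Ideal bandpass impulse response (edges in rad/sample), truncated to num_taps."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    # Difference of two ideal lowpass responses sin(wc n) / (pi n)
    return (w_hi * np.sinc(w_hi * n / np.pi) - w_lo * np.sinc(w_lo * n / np.pi)) / np.pi

def composite_impulse_response(edges, num_taps=101):
    """Sum of the windowed channel responses h_k(n) = w(n) * ideal_k(n)."""
    window = np.hamming(num_taps)
    return sum(window * ideal_bandpass(lo, hi, num_taps)
               for lo, hi in zip(edges[:-1], edges[1:]))

w_min, w_max = 0.1 * np.pi, 0.8 * np.pi
# Three nonuniform bands versus seven uniform bands over the same range
three = composite_impulse_response(np.array([w_min, 0.2 * np.pi, 0.4 * np.pi, w_max]))
seven = composite_impulse_response(np.linspace(w_min, w_max, 8))

# Composite gain in the middle of the covered range
n = np.arange(101)
mid_gain = abs(np.sum(three * np.exp(-1j * 0.5 * np.pi * n)))
```

The two composite impulse responses are identical to floating-point precision, and the composite gain inside the covered range is close to 1, as the windowed version of the ideal response would suggest.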
![Page 51: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/51.jpg)
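FFT-Based Nonuniform Filter Banks
• Nonuniformity can be created by combining two or more uniform channels
• Consider taking an N-point DFT of the sequence x(n)

$$X_k = \sum_{n=0}^{N-1} x(n)\, e^{-j\frac{2\pi}{N} n k}, \qquad 0 \le k \le N-1$$

Add DFT outputs $X_k$ and $X_{k+1}$:

$$X_k + X_{k+1} = \sum_{n=0}^{N-1} x(n) \left[ e^{-j\frac{2\pi}{N} n k} + e^{-j\frac{2\pi}{N} n (k+1)} \right]$$

$$X'_k = X_k + X_{k+1} = \sum_{n=0}^{N-1} \left[ x(n)\, 2\, e^{-j\frac{\pi n}{N}} \cos\!\left(\frac{\pi n}{N}\right) \right] e^{-j\frac{2\pi}{N} n k}$$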
![Page 52: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/52.jpg)
FFT-Based Nonuniform Filter Banks
• The equivalent kth channel value, $X'_k$, can be obtained by weighting the sequence x(n) by the complex sequence $2 \exp(-j\pi n/N) \cos(\pi n/N)$.
• If more than two channels are combined, then a different equivalent weighting sequence results
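The two-channel case can be confirmed with a few lines of NumPy: adding adjacent DFT outputs gives the same value as taking the DFT of the input after multiplying it by the complex weighting sequence. The random input, N = 32, and the channel index k = 5 are arbitrary choices for the check.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 32
x = rng.standard_normal(N)             # arbitrary input sequence x(n)
X = np.fft.fft(x)                      # uniform channels X_k

# Equivalent weighting sequence for merging channels k and k+1
n = np.arange(N)
weight = 2 * np.exp(-1j * np.pi * n / N) * np.cos(np.pi * n / N)
X_weighted = np.fft.fft(x * weight)

k = 5
combined = X[k] + X[k + 1]             # sum of two adjacent uniform channels
```

`combined` equals `X_weighted[k]`, and the same holds for every k from 0 to N-2, so one weighted DFT gives all pairwise-merged channels at once.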
![Page 53: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/53.jpg)
Tree Structure Realizations of Nonuniform Filter Banks
In this method the speech signal is filtered in stages, and the sampling rate is successively reduced at each stage
![Page 54: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/54.jpg)
Tree Structure Realizations of Nonuniform Filter Banks
![Page 55: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/55.jpg)
Tree Structure Realizations of Nonuniform Filter Banks
• The original speech signal, s(n), is filtered initially into two bands, a low band and a high band
• The high band is downsampled by 2 and represents the highest octave band ($\pi/2 \le \omega \le \pi$) of the filter bank.
• The low band is similarly down sampled by 2 and fed into second filtering stage in which the signal is again split into two equal bands.
• Again the high band of the stage 2 is down sampled by 2 and is used as a next highest filter bank output.
![Page 56: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/56.jpg)
Tree Structure Realizations of Nonuniform Filter Banks
• The low band is also down sampled by 2 and fed into a third stage of filters
• The third-stage outputs, after downsampling by a factor of 2, are used as the two lowest filter bands
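A three-stage tree of this kind can be sketched as below. Each stage splits its input into a low and a high half band, keeps the downsampled high band as an output, and passes the downsampled low band on to the next stage. The half-band filters here are simple Hamming-windowed sinc designs chosen for illustration; the filter length, the function names, and the test tone are assumptions, and practical systems may use other filter pairs.

```python
import numpy as np

def half_band_pair(num_taps=63):
    """Lowpass/highpass FIR pair splitting 0..pi at pi/2 (windowed sinc)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    low = 0.5 * np.sinc(0.5 * n) * np.hamming(num_taps)
    high = low * np.cos(np.pi * n)         # shift the lowpass response by pi
    return low, high

def octave_tree(s, stages=3):
    """Return the bands of an octave-spaced tree, highest octave first."""
    low_f, high_f = half_band_pair()
    bands = []
    current = s
    for _ in range(stages):
        high = np.convolve(current, high_f, mode="same")[::2]    # keep as output
        current = np.convolve(current, low_f, mode="same")[::2]  # split again
        bands.append(high)
    bands.append(current)                  # lowest band from the last stage
    return bands

s = np.cos(0.75 * np.pi * np.arange(1024)) # tone inside the highest octave
bands = octave_tree(s)
energies = [float(np.sum(b ** 2)) for b in bands]
```

For a 1024-sample input the four outputs have 512, 256, 128, and 128 samples, and the test tone at 0.75π puts nearly all of its energy into the first (highest-octave) band.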
![Page 57: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/57.jpg)
Summary of considerations for speech recognition filter banks
1st. Type of digital filter used (IIR (recursive) or FIR (nonrecursive))
• IIR: Advantage: simple to implement and efficient. Disadvantage: phase response is nonlinear
• FIR: Advantage: phase response is linear. Disadvantage: expensive to implement
![Page 58: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/58.jpg)
Summary of considerations for speech recognition filter banks
2nd. The number of filters to be used in the filter bank.
1. For uniform filter banks the number of filters, Q, cannot be too small, or else the ability of the filter bank to resolve the speech spectrum is greatly impaired. Values of Q less than 8 are generally avoided
2. The value of Q cannot be too large, because the filter bandwidths would eventually be too narrow for some talkers (e.g. high-pitched female voices), i.e. no prominent harmonics would fall within the band (in practical systems, Q ≤ 32).
![Page 59: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/59.jpg)
Summary of considerations for speech recognition filter banks
In order to reduce overall computation, many practical systems have used nonuniform spaced filter banks
![Page 60: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/60.jpg)
Summary of considerations for speech recognition filter banks
3rd. The choice of nonlinearity and LPF used at the output of each channel
• Nonlinearity: full-wave or half-wave rectifier
• LPF: varies from a simple integrator to a good-quality IIR lowpass filter.
![Page 61: By Sarita Jondhale1 Signal Processing And Analysis Methods For Speech Recognition](https://reader035.vdocument.in/reader035/viewer/2022062308/56649ec65503460f94bd11b6/html5/thumbnails/61.jpg)