8 cepstral analysis
TRANSCRIPT
-
8/13/2019 8 Cepstral Analysis
1/24
8. Cepstral Analysis
(most slides taken from MIT course by Glass and Zue)
-
8/13/2019 8 Cepstral Analysis
2/24
Homomorphic filtering
Homomorphic filtering/transformation is a nonlineartransformation usually applied to image and speech
processing used to convert a signal obtained from a convolutionof two original signals into the sum of two signals.
In speech processing it can be applied to separate the filter fromthe excitation in the source-filter model
The cepstrum is one such homomorphic transformation that allowsus to perform such separation.
It is an alternativeoption to linear prediction analysis seen before
x[n]= e[n]!h[n]" x[n]= e[n]+ h[n]
x[n]=D(x[n])
s[n]= u[n]!h[n]
-
8/13/2019 8 Cepstral Analysis
3/24
TDP: Cepstral Analysis 3
Cepstral analysis is based on the observation that
by taking the log of X(z)
If the complex log is unique and the z transform is valid then, by applying Z-1
the two convolved signals are now additive.
Basics of cepstral analysis
-
8/13/2019 8 Cepstral Analysis
4/24
Basics of cepstral analysis (II)
Consider now that we restrict our signal x[n] to have poles and zerosonly in the unit circle, i.e.:
then if
This is the complex logarithm of X(w)
-
8/13/2019 8 Cepstral Analysis
5/24
Definition of Cepstrum
The real cepstrum is defined as:
Its magnitude is real and non-negative
And the complex cepstrum:
Where arg() represents the phase. We call it complex becauseit uses the complex logarithm, not due to the sequence, whichcan also ne real. In fact, the complex cepstrum of a realsequence is also real
TDP: Cepstral Analysis 5
[ ] ( ) !"!
"
"
!
#$
= log2
1
!"=
#
#
$$
$#
)](log[2
1
][
!"
+=
#
#
$$$
$
#
))](arg()(log[
2
1
-
8/13/2019 8 Cepstral Analysis
6/24
Definition of cesptrum(II)
It can be shown that the the real cepstrum is the even part of thecomplex cepstrum:
In speech processing we generally use the real cepstrum, which isobtained by applying an inverse Fourier Transform of the log-spectrum of the signal.
In fact, the name cepstrum comes from inverting the firstsyllable of the word spectrum. Similarly, the variable n incx[n] is called quefrency, which is the inversion offrequency
2
][][][
*
"+
=
-
8/13/2019 8 Cepstral Analysis
7/24
Properties of cepstrums
From this we can derive the following generalproperties:
1) the complex cepstrum decays at least as
fast as 1/|n|2) it has infinite duration, even if x[n] has
finite duration
3) it is real if x[n] is real (poles and zeros arein complex conjugate pairs)
NOTE: from 2) and 3) we see why usually a finite number of cepstrums is used in speech
processing (12-20 is sufficient), as very high order cepstrums have very small values.
-
8/13/2019 8 Cepstral Analysis
8/24
TDP: Cepstral Analysis 8
An example
(Using Taylor seriesexpansion and several tricks)
Given the r in the denominator,it is an infinite train of deltas thatconverges to 0
-
8/13/2019 8 Cepstral Analysis
9/24
TDP: Cepstral Analysis 9
(...)
-
8/13/2019 8 Cepstral Analysis
10/24
TDP: Cepstral Analysis 10
Computational considerations: using DFT
In digital signals we replace the Fourier Transform by the Discrete Fourier Transform
Aliasing by repetition of thecepstrums with period N
When N> the number of used cepstrums
we do not have a problem (which isusually the case)
-
8/13/2019 8 Cepstral Analysis
11/24
Cepstral analysis of speech
As pointed out at the beginning, we would like to separate theexcitation from the vocal tract filter h(n) by using a
homomorphic transformation.
We can do so easily as the filter parameters usually reside in thelower quefrencies, while the excitation parameters havehigher quefrencies
Consider the problem of recovering a filter's response from a periodic signal (such as a voiced excitation):
The filter response can be recovered if we can separate the output of the homomorphic transformation using a simple filter:
-
8/13/2019 8 Cepstral Analysis
12/24
TDP: Cepstral Analysis 12
Cepstral analysis of speech
-
8/13/2019 8 Cepstral Analysis
13/24
13
Cepstrum of a generic voiced signal
magnitude spectrum
cepstrum
Contributions to the cepstrum due to periodic excitation will occur at integermultiples of the fundamental period. NOTE that for children and high-pitchwomen we might have a problem
Contributions due to parameters usually modeled by the filter will concentrate inthe low quefrency region and will decay quickly with n
-
8/13/2019 8 Cepstral Analysis
14/24
Cepstral analysis of speech (voicedsignals)
-
8/13/2019 8 Cepstral Analysis
15/24
Cepstral analysis of speech (unvoicedsignals)
-
8/13/2019 8 Cepstral Analysis
16/24
TDP: Cepstral Analysis 16
Cepstral analysis of vowel (rectangularwindow)
-
8/13/2019 8 Cepstral Analysis
17/24
TDP: Cepstral Analysis 17
Cepstral analysis of vowel (taperingwindow)
-
8/13/2019 8 Cepstral Analysis
18/24
TDP: Cepstral Analysis 18
Cepstral analysis of fricative (rectangularwindow)
-
8/13/2019 8 Cepstral Analysis
19/24
TDP: Cepstral Analysis 19
Cepstral analysis of fricative (taperingwindow)
-
8/13/2019 8 Cepstral Analysis
20/24
TDP: Cepstral Analysis 20
Use in Speech Recognition
-
8/13/2019 8 Cepstral Analysis
21/24
TDP: Cepstral Analysis 21
Statistical properties of cepstral coefficients
-
8/13/2019 8 Cepstral Analysis
22/24
TDP: Cepstral Analysis 22
Mel-frequency cepstral representation
-
8/13/2019 8 Cepstral Analysis
23/24
TDP: Cepstral Analysis 23
MFCC computation diagram
-
8/13/2019 8 Cepstral Analysis
24/24
TDP: Cepstral Analysis 24
Mel-filter bank processing