Demystifying the Digital Adaptive Filters
Conducts in Acoustic Echo Cancellation
Md. Anamul Haque Department of Computer Science and Engineering, Jahangirnagar University
Savar, Dhaka, Bangladesh
Email: arifcse12@gmail.com
A.K.M Kamrul Islam Department of Computer Science and Engineering, Jahangirnagar University
Savar, Dhaka, Bangladesh
Email: kamrul@iubat.edu
Md. Imdadul Islam Department of Computer Science and Engineering, Jahangirnagar University
Savar, Dhaka, Bangladesh
Email: imdad@juniv.edu
Abstract— A digital sound system resembles the wireless
network link transmission system. A wireless network link is
affected by several disturbing factors: fading, attenuation,
non-linear distortion and noise. These factors are also
unavoidable in acoustic echo cancellation. Several
digital adaptive filter algorithms have been proposed to detect
and cancel the noise in such a system. Among these algorithms,
the Least Mean Square filter, the Block Frequency Domain Adaptive
Filter and the Kalman filter are widely used to predict and
remove noise or unwanted signals. The performance of these
filters varies and depends on the application, so a detailed
performance analysis is needed. The goal of this paper is to
analyze these filters and identify the best-performing one
through performance evaluation in the context of acoustic echo
cancellation.
Index Terms— Predictor-Corrector algorithm, near end
speech signal, Reverberation time, Step size.
I. INTRODUCTION
In wireless communication, the signal transmitted from
a source may travel along multiple paths to the receiver.
This happens because of obstruction or reflection by
barriers such as the ground, buildings, vehicles and
hills at different atmospheric levels. As a result, the
receiver receives multiple copies of the same signal, each
having travelled a path of different physical length. Each
copy experiences different noise, attenuation, phase shift
and delay [1]. The power of the original signal is therefore
altered, and the copies combine at the receiver either
constructively, amplifying the signal, or destructively,
attenuating it. However, destructive interference occurs
more frequently than constructive interference. The effect
of destructive interference on the key signal is called deep
fading [2]-[3]. Overlapping of different signals causes
shadowing. Such complicated phenomena led researchers to
develop statistical models of fading channels [4] to handle
these situations. Depending on the coherence time (Tc) of the
channel, a fading channel is further classified as slow or
fast fading [5]. Fading degrades the bit error rate (BER) and
the signal to noise ratio (SNR), which results in lost data
and thus lowers the quality of the communication link.
Analogously, the sound system in a conference room
behaves like a wireless communication link. A typical sound
system in a conference room involves the loudspeaker, the
microphone and the transmission path. A voice signal sent by
one participant comes out of the loudspeaker and propagates
along multiple paths. After reflecting from the conference
room walls, these signals arrive from multiple directions and
distort the original voice signal generated by the participant
in front of the microphone. Different feedback signals possess
different delay times and thus contaminate the key signal
indiscriminately. The only difference between the two
analogous systems is that the feedback signal in acoustic
echo cancellation travels a shorter path and has lower
strength. Similarly, the noise and echo in the echo
cancellation system are less severe than in the wireless
network link.
To combat this scenario, an efficient digital filtering
algorithm has to be applied before the sound signal is
sent to the loudspeaker. Among digital filters, three
algorithms are well known for canceling noise from a
signal: Least Mean Square (LMS), Block Frequency Domain
Adaptive Filter (BFDAF) and the Kalman filter. An LMS filter
is a simple adaptive algorithm that, like other adaptive
algorithms, adapts its parameters iteratively. A Block
Frequency Domain Adaptive Filter (BFDAF) uses a block
implementation in which the coefficients are adjusted once
for each block of the signal. BFDAF is basically an
enhancement of the LMS filter that calculates its parameters
in the frequency domain using the Fast Fourier Transform (FFT).
568 JOURNAL OF MULTIMEDIA, VOL. 5, NO. 6, DECEMBER 2010
© 2010 ACADEMY PUBLISHER doi:10.4304/jmm.5.6.568-579
The FFT speeds up the computation of the convolution. At
every time step the filter estimates the signal, calculates
the error signal and refines the coefficients used for the
estimate. The Kalman filter is widely used because of its
strong adaptive characteristics. It contains a recursive
procedure that estimates the state of a process in a way
that minimizes the estimated mean square error. It estimates
the present state and error covariance from past
observations, corrects them using the measurement output,
and sets the estimate for the next calculation. The
technique is very effective even when the system to be
modeled is unknown and depends on the progressive results.
In this paper we compare and analyze the performance of
these filters in several respects, in order to choose the
best-performing filter for acoustic echo cancellation. The
paper is organized as follows. Section II reviews related
work on digital adaptive filters. Section III presents the
basic operation and equations of the filters we deal with.
The basic experimental setup and the terms of Acoustic Echo
Cancellation (AEC) are described in Section IV. Section V
contains our core effort: the performance of the filters and
its variation under different circumstances. The final part
(Section VI) is a discussion based on our analysis and
observations.
II. RELATED WORK
This section briefly describes related research on digital
adaptive filters, especially in acoustic echo cancellation.
There has been extensive research on new applications and
algorithmic enhancements of the LMS, FDAF and Kalman filters
over the past two decades. Researchers' interest shifted to
the frequency domain when the FFT (Fast Fourier Transform)
was introduced. An unconstrained frequency domain adaptive
filter was introduced by Mansour D. and Gray A.H. in 1982.
A block implementation of BFDAF with varying delay was
studied in the context of speech processing [J. S. Soo and
K. K. Pang, 1990]. The impulse response of a noisy speech
signal can have diverse delay times, so keeping the delay
time static is not a good solution. Some studies were based
on a specific application; for example, FDAF was enhanced
for stereo echo cancellation. Since its publication by
R. E. Kalman, the Kalman filter has become a topic of
extensive research and application. In 1985 H. W. Sorenson
published a paper detailing its theory and application. As
one of the varied applications of the Kalman filter, some
researchers studied the approximation of non-linear
transformations of probability distributions in the context
of robotics [Julier, 96]. When the statistics of an
observation do not change linearly, the Kalman filter has
been found to provide a good estimate of the non-linear
transformation while keeping the estimation error small.
There are publications on the implementation and testing of
adaptive filters that are very helpful for visualizing the
algorithms' output. Many of them concern the performance and
enhancement of a specific algorithm [21]. Few of them work
with multiple filtering techniques. For instance, the paper
titled "Adaptive filtering algorithms for stereophonic
acoustic echo cancellation" [20] is concerned with two
approaches and categories of filters rather than with
specific widely used filters. However, we found no paper
that provides a sufficient performance evaluation of
adaptive filters in the context of AEC. This gap motivates
the experiments and demonstrations in this paper.
III. ADAPTIVE FILTER THEORY
A. Least Mean Square (LMS) Filter Concept
LMS is based on the steepest descent algorithm. It
uses an instantaneous estimate and updates the filter
coefficients sample by sample so as to minimize the
MSE [6]-[8]. The significant feature of the LMS algorithm is
its simplicity: it requires neither measurements of the
autocorrelation and cross-correlation functions nor matrix
inversion. Hence it is faster than the basic Wiener filter
algorithm [5]-[7]. Two basic processes work behind the LMS
filtering algorithm. The filtering process calculates the
output of the filter in response to the input signal and
generates an estimation error by comparing this output with
the desired response. The adaptive process adjusts the
filter parameters automatically according to the estimation
error.
As LMS is based on the steepest descent algorithm, the
weight vector at time k+1 is updated as follows:
$$W_{k+1} = W_k - \mu \nabla_k \tag{1}$$
where $W_k$ is the k-th weight vector, $\nabla_k$ is the gradient
vector in (1) and $\mu$ controls the rate of convergence.
Replacing the value of $\nabla_k$,
$$W_{k+1} = W_k + 2\mu\,(P - R W_k) \tag{2}$$
But as LMS uses instantaneous estimates, P and R of
(2) are substituted by the values
$$P = y_k X_k, \qquad R = X_k X_k^T$$
$$\begin{aligned}
W_{k+1} &= W_k + 2\mu\,(y_k X_k - X_k X_k^T W_k) \\
        &= W_k + 2\mu\,(y_k - W_k^T X_k)\,X_k \\
        &= W_k + 2\mu\,e_k X_k
\end{aligned} \tag{3}$$
where $e_k = y_k - W_k^T X_k$.
The LMS algorithm uses the weight update $W_{k+1}$ given in
(3). The flowchart of the LMS algorithm in Fig.1
summarizes the procedure:
Figure.1: Flowchart of LMS adaptive filter
In summary, LMS does not require any knowledge of the
correlation matrix; instead it uses instantaneous
estimates [9]. At first the weights may deviate from their
expected values, but they gradually move towards a good
adjustment. In this way, the filter adapts by learning the
signal characteristics.
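The LMS recursion of (3) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used in this paper: the function name `lms_filter`, the zero initial weights and the 3-tap test system `true_h` are hypothetical choices made only for the example.

```python
import random

def lms_filter(x, y, num_taps, mu):
    """Sample-by-sample LMS, W_{k+1} = W_k + 2*mu*e_k*X_k (hypothetical helper)."""
    w = [0.0] * num_taps                      # assumed initial weights: all zero
    errors = []
    for k in range(len(x)):
        # X_k = [x_k, x_{k-1}, ...]; samples before the start are taken as zero
        X = [x[k - i] if k - i >= 0 else 0.0 for i in range(num_taps)]
        n_k = sum(wi * xi for wi, xi in zip(w, X))   # filter output W_k^T X_k
        e_k = y[k] - n_k                             # error e_k = y_k - W_k^T X_k
        w = [wi + 2.0 * mu * e_k * xi for wi, xi in zip(w, X)]
        errors.append(e_k)
    return w, errors

# Illustrative use: identify an assumed 3-tap system from noise-free data.
random.seed(1)
true_h = [0.6, -0.4, 0.2]
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
y = [sum(true_h[j] * x[k - j] for j in range(3) if k - j >= 0)
     for k in range(len(x))]
w_est, err = lms_filter(x, y, num_taps=3, mu=0.01)
```

With noise-free data and a small step size, `w_est` should approach `true_h` and the error should shrink towards zero, which mirrors the gradual adjustment described above.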
B. Block Frequency Domain Adaptive Filter (BFDAF)
Theory
The LMS algorithm is easily applied in the time domain.
However, in some important applications, such as acoustic
echo cancellation in teleconferencing, the echo path has a
long impulse response. Modeling it requires a filter with
long memory, which increases the computational complexity
[10]. Transforming the system of interest to the frequency
domain by Fourier transform mapping reduces the
computational complexity [12]. Secondly, the orthogonality
properties of the DFT and other discrete transforms provide
a higher convergence rate. An Infinite Impulse Response
(IIR) filter is a possible alternative, but the algorithm
may become unstable [13]. To enhance the performance, a
block implementation of the FIR filter allows the input
points of each block to be processed in parallel. In
essence, with an FIR block implementation the fast Fourier
transform provides fast convolution [12].
Let L be the length of the block and M the length of the
tap-weight vector. The data matrix [10] is
$$A(k) = \begin{bmatrix}
u(kL) & u(kL-1) & u(kL-2) & \cdots & u(kL-M+1) \\
u(kL+1) & u(kL) & u(kL-1) & \cdots & u(kL-M+2) \\
\vdots & \vdots & \vdots & & \vdots \\
u(kL+L-1) & u(kL+L-2) & u(kL+L-3) & \cdots & u(kL+L-M)
\end{bmatrix}
= \begin{bmatrix} G(kM) \\ G(kM+1) \\ \vdots \\ G(kM+L-1) \end{bmatrix} \tag{4}$$
A(k) in (4) is an L x M matrix [10] and the length of each
row vector G(kM+i) is M. Let the weight vector be
$$\hat{W}(k) = [w_0(k), w_1(k), w_2(k), \ldots, w_{M-1}(k)]^T \tag{5}$$
The output vector of the filter is the product of A(k) in
(4) and the weight vector $\hat{W}(k)$ in (5):
$$[y(kL), y(kL+1), y(kL+2), \ldots, y(kL+L-1)]^T = A(k)\,\hat{W}(k) \tag{6}$$
For an individual element,
$$y(kL) = G(kM)\,\hat{W}(k)$$
$$y(kL+1) = G(kM+1)\,\hat{W}(k)$$
$$\vdots$$
$$\begin{aligned}
y(kL+i) &= G(kM+i)\,\hat{W}(k) \\
        &= [w_0(k), w_1(k), w_2(k), \ldots, w_{M-1}(k)]\,[u(kL+i), u(kL+i-1), u(kL+i-2), \ldots, u(kL+i-M+1)]^T \\
        &= \sum_{j=0}^{M-1} w_j(k)\,u(kL+i-j)
\end{aligned} \tag{7}$$
y(kL+i) is the i-th element of the output vector, described by (7). Let
the desired response for the (kL+i)-th element be d(kL+i). The error signal is therefore e(kL+i) = d(kL+i) - y(kL+i). In matrix form,
$$e(k) = \begin{bmatrix} e(kL) \\ e(kL+1) \\ e(kL+2) \\ \vdots \\ e(kL+L-1) \end{bmatrix}
= \begin{bmatrix} d(kL) \\ d(kL+1) \\ d(kL+2) \\ \vdots \\ d(kL+L-1) \end{bmatrix}
- \begin{bmatrix} y(kL) \\ y(kL+1) \\ y(kL+2) \\ \vdots \\ y(kL+L-1) \end{bmatrix} \tag{8}$$
The cross-correlation vector $\phi(k)$ in (9) is the
product of the transpose of the data matrix, $A^T(k)$, and
the error vector e(k) of (8):
$$\phi(k) = A^T(k)\,e(k) \tag{9}$$
The update equation (10) of the weight vector is
obtained by adding the correlation vector multiplied by a
constant $\mu$:
$$\hat{W}(k+1) = \hat{W}(k) + \mu\,\phi(k) \tag{10}$$
[Flowchart steps of Fig.1: initialize w_k(i) and x_{k-i}; read x_k and y_k from the ADC; filter x_k: n_k = Σ_i w_k(i) x_{k-i}; compute error e_k = y_k - n_k; compute factor 2μe_k; update coefficients w_{k+1}(i) = w_k(i) + 2μ e_k x_{k-i}.]
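Now, using $\hat{W}(k+1)$, the updated output vector is
$$[y(kL+L), y(kL+L+1), \ldots, y(kL+2L-1)]^T = A(k+1)\,\hat{W}(k+1) \tag{11}$$
The updated error vector [12]-[13] is
$$e(k+1) = \begin{bmatrix} e(kL+L) \\ e(kL+L+1) \\ e(kL+L+2) \\ \vdots \\ e(kL+2L-1) \end{bmatrix}
= \begin{bmatrix} d(kL+L) \\ d(kL+L+1) \\ d(kL+L+2) \\ \vdots \\ d(kL+2L-1) \end{bmatrix}
- \begin{bmatrix} y(kL+L) \\ y(kL+L+1) \\ y(kL+L+2) \\ \vdots \\ y(kL+2L-1) \end{bmatrix} \tag{12}$$
Using (12), the updated cross-correlation vector is
obtained as in the previous operation [12]-[13]:
$$\phi(k+1) = A^T(k+1)\,e(k+1) \tag{13}$$
Iteratively, the update equation of the weight vector
[12]-[13] is
$$\hat{W}(k+2) = \hat{W}(k+1) + \mu\,\phi(k+1) \tag{14}$$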
Continuing the above procedure for the (k+i)-th term gives
the output vector shown in (15), where 'i' is the delay
time to process the next input block [11]:
$$[y(kL+iL), y(kL+iL+1), y(kL+iL+2), \ldots, y((k+i+1)L-1)]^T = A(k+i)\,\hat{W}(k+i) \tag{15}$$
$$e(k+i) = \begin{bmatrix} e(kL+iL) \\ e(kL+iL+1) \\ e(kL+iL+2) \\ \vdots \\ e((k+i+1)L-1) \end{bmatrix}
= \begin{bmatrix} d(kL+iL) \\ d(kL+iL+1) \\ d(kL+iL+2) \\ \vdots \\ d((k+i+1)L-1) \end{bmatrix}
- \begin{bmatrix} y(kL+iL) \\ y(kL+iL+1) \\ y(kL+iL+2) \\ \vdots \\ y((k+i+1)L-1) \end{bmatrix} \tag{16}$$
In generalized form, the cross-correlation for the (k+i)-th term is
$$\phi(k+i) = A^T(k+i)\,e(k+i) \tag{17}$$
The updated weight vector (using eq.16 and 17) is
$$\hat{W}(k+i+1) = \hat{W}(k+i) + \mu\,\phi(k+i) \tag{18}$$
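The block recursion of (6), (8), (9) and (10) can be sketched directly in the time domain. This is a minimal sketch under stated assumptions: `block_lms`, the zero initial weights, the choice L = M = 4 and the test system `true_h` are hypothetical and serve only to illustrate one weight update per block.

```python
import random

def block_lms(u, d, M, L, mu):
    """Time-domain block LMS following (6)-(10): one weight update per block."""
    w = [0.0] * M                              # assumed zero initial weights
    for k in range(len(u) // L):
        # Row i of A(k): [u(kL+i), u(kL+i-1), ..., u(kL+i-M+1)], zeros before start
        A = [[u[k * L + i - j] if k * L + i - j >= 0 else 0.0 for j in range(M)]
             for i in range(L)]
        y = [sum(A[i][j] * w[j] for j in range(M)) for i in range(L)]     # (6)
        e = [d[k * L + i] - y[i] for i in range(L)]                       # (8)
        phi = [sum(A[i][j] * e[i] for i in range(L)) for j in range(M)]   # (9)
        w = [w[j] + mu * phi[j] for j in range(M)]                        # (10)
    return w

# Illustrative use with an assumed 4-tap system and noise-free data.
random.seed(2)
true_h = [0.5, -0.3, 0.2, 0.1]
u = [random.gauss(0.0, 1.0) for _ in range(4000)]
d = [sum(true_h[j] * u[n - j] for j in range(4) if n - j >= 0)
     for n in range(len(u))]
w_block = block_lms(u, d, M=4, L=4, mu=0.01)
```

Because the gradient is summed over a whole block before the weights change, the step size here plays the role of the constant multiplying the correlation vector in (10).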
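In the overlap-save method an N = 2M point FFT is used,
where M is the number of tap weights of the filter. The
input signal in the 'n' domain is expressed as
[(k-1)th block, k-th block], where the (k-1)th block is
$$\begin{bmatrix}
u(kL-L) & u(kL-L-1) & u(kL-L-2) & \cdots & u(kL-L-2M+1) \\
u(kL-L+1) & u(kL-L) & u(kL-L-1) & \cdots & u(kL-L-2M+2) \\
\vdots & \vdots & \vdots & & \vdots \\
u(kL-1) & u(kL-2) & u(kL-3) & \cdots & u(kL-2M)
\end{bmatrix}$$
and the k-th block is
$$\begin{bmatrix}
u(kL) & u(kL-1) & u(kL-2) & \cdots & u(kL-2M+1) \\
u(kL+1) & u(kL) & u(kL-1) & \cdots & u(kL-2M+2) \\
\vdots & \vdots & \vdots & & \vdots \\
u(kL+M-1) & u(kL+M-2) & u(kL+M-3) & \cdots & u(kL-M)
\end{bmatrix}$$
The input signal in the 'k' domain is the N x N matrix
$$U(k) = \mathrm{diag}\{\mathrm{FFT}[(k-1)\text{th block},\; k\text{-th block}]\}$$
The initial weight in the 'k' domain is an N x 1 vector,
$$\hat{W}(k) = \mathrm{FFT}\begin{bmatrix} \hat{w}(k) \\ O \end{bmatrix} \tag{19}$$
Here $\hat{w}(k)$ is the tap-weight vector (in the 'n' domain) of
length M and O is a null vector of length M. The output signal
of the filter for the k-th block is
$$\hat{y}(k) = \text{last } M \text{ elements of } \mathrm{IFFT}[U(k)\,\hat{W}(k)]$$
The desired response vector is
$$d(k) = [d(kM), d(kM+1), \ldots, d(kM+M-1)]^T \tag{20}$$
The error signal vector in the 'n' domain is
$$e(k) = [e(kM), e(kM+1), \ldots, e(kM+M-1)]^T = d(k) - \hat{y}(k) \tag{21}$$
The error signal in the frequency domain is
$$E(k) = \mathrm{FFT}\begin{bmatrix} O \\ e(k) \end{bmatrix} \tag{22}$$
The cross-correlation vector is
$$\phi(k) = \text{first } M \text{ elements of } \mathrm{IFFT}[U^H(k)\,E(k)]$$
The updated tap weights are
$$\hat{W}(k+1) = \hat{W}(k) + \mu\,\mathrm{FFT}\begin{bmatrix} \phi(k) \\ O \end{bmatrix} \tag{23}$$
Continuing this technique, all the vector coefficients are updated for each (k+i)-th term.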
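The overlap-save steps of (19) to (23) can be sketched as follows. This is an illustrative sketch, not the implementation used for the experiments: a naive O(N^2) DFT (`dft`) stands in for an FFT routine so that the example needs only the standard library, the weights are kept in the 'n' domain between blocks (equivalent to updating $\hat{W}(k)$ as in (23)), and the names `fdaf_overlap_save` and `true_h` and the block length M = 4 are hypothetical.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, used here in place of an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n)
        out.append(acc / n if inverse else acc)
    return out

def fdaf_overlap_save(u, d, M, mu):
    """Overlap-save block adaptive filter with N = 2M point transforms."""
    w = [0.0] * M                       # tap weights in the 'n' domain
    prev = [0.0] * M                    # (k-1)th input block, zero before start
    for k in range(len(u) // M):
        new = u[k * M:(k + 1) * M]      # k-th input block
        U = dft(prev + new)             # diagonal of U(k)
        W = dft(w + [0.0] * M)          # (19): weights padded with M zeros
        y = [v.real for v in dft([a * b for a, b in zip(U, W)], inverse=True)][M:]
        e = [d[k * M + i] - y[i] for i in range(M)]            # (21)
        E = dft([0.0] * M + e)                                 # (22)
        grad = dft([a.conjugate() * b for a, b in zip(U, E)], inverse=True)
        phi = [v.real for v in grad][:M]                       # first M elements
        w = [wj + mu * pj for wj, pj in zip(w, phi)]           # time-domain form of (23)
        prev = new
    return w

# Illustrative use with an assumed 4-tap system and noise-free data.
random.seed(3)
true_h = [0.5, -0.3, 0.2, 0.1]
u = [random.gauss(0.0, 1.0) for _ in range(3200)]
d = [sum(true_h[j] * u[n - j] for j in range(4) if n - j >= 0)
     for n in range(len(u))]
w_fd = fdaf_overlap_save(u, d, M=4, mu=0.01)
```

Keeping only the last M outputs and the first M gradient elements discards the wrapped-around part of the circular convolution, which is what makes the frequency-domain update match a linear (time-domain) block update.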
C. Theory and Concept of Kalman Filter
The Kalman filter is an adaptive least square error filter that is distinct from other adaptive filters because of
its state-space formulation and recursive structure [15]. It is
an efficient algorithm for estimating a signal in the presence of Gaussian noise and for continuously updating the best
estimate of the current system state. In particular, each
updated estimate is computed from the previous best estimate and the current input data, regardless of whether the
environment is stationary or non-stationary. The Kalman filter has a wide
range of practical application areas such as aerospace, marine navigation, inertial navigation, global
positioning, nuclear power plant implementation and
many others [17]-[19].
One of the great features of the Kalman filter is that it
is recursive. Predicting the system state from the entire
set of previously observed data would require a large
amount of storage. The Kalman filter, however, stores only
the previous best estimate and uses it to update the
current estimate for future use [16]. This requires little
memory and preserves the recursive property. The whole
process is divided into two parts:
first the state is predicted with the dynamic model, and this estimate is in turn corrected by the second part,
the corrector. Hence, the error covariance is minimized
[15]-[16]. Let us assume the observed random variables
y(1), y(2), y(3), ..., y(n-1), starting at 1 and ending at n-1.
$\hat{x}(n-1|y_{n-1})$ is the mean-square estimate of a related
zero-mean random variable x(n-1). Suppose a new observation
y(n) arrives and we are to compute the updated estimate
$\hat{x}(n|y_n)$. This computation could be done from all the
past observations y(1), y(2), y(3), ..., y(n-1) together with
the new observation y(n). The Kalman filter is efficient
because it provides a recursive estimation procedure in which we store only the previous best estimate [15]-
[16]-[18] $\hat{x}(n-1|y_{n-1})$ and reuse it, together with the
current observation y(n), for the current estimate $\hat{x}(n|y_n)$. Hence
the Kalman filter needs less storage and is more effective.
The Kalman filter uses feedback control to estimate the process state: at any time the filter estimates the process
state and then obtains feedback from noisy
measurements. In this way, the basic Kalman filter operation is divided into two parts: the time
update and the measurement update [18]. The time
update projects the current state and error covariance ahead in time to obtain the a priori estimate
for the next time step. This operation is defined by the
process equation:
$$x(n+1) = F(n+1, n)\,x(n) + v_1(n) \tag{24}$$
$$E[v_1(n)\,v_1^H(n)] = Q_1(n) \tag{25}$$
x(n+1) in (24) is the M-dimensional state vector. F(n+1, n)
is the transition matrix. v1(n) is the M x 1 process noise
vector and Q1(n) is its M x M covariance matrix. The measurement
equation describes the N-dimensional observation
vector y(n) introduced at the input. The equation is
written as follows:
$$y(n) = C(n)\,x(n) + v_2(n) \tag{26}$$
$$E[v_2(n)\,v_2^H(n)] = Q_2(n) \tag{27}$$
Here C(n) in (26) is the N x M measurement matrix and v2(n)
is the N x 1 measurement noise vector.
Q2(n) is the N x N covariance of v2(n). The resulting
measurement update equations are also called the corrector
equations, because they correct the a priori estimate to produce the a posteriori estimate for the next step [15].
Taking into account both the predictor and the corrector, the Kalman filter algorithm is also called the
predictor-corrector algorithm [15], shown in Fig.2. The
predictor-corrector works recursively.
Figure.2: The Predictor-Corrector algorithm for
recursive minimum mean-square estimation.
The key target of the Kalman algorithm is to determine
the mean square estimate $\hat{x}(n|y_n)$. Mathematically,
$\hat{x}(n|y_n)$ = minimum mean-square estimate of x(n),
given the observed data y(1), y(2), y(3), ..., y(n).
The estimate $\hat{x}(n|y_n)$ is a linear
combination (28) of the forward prediction errors, known
as innovations, $\alpha(1), \alpha(2), \ldots, \alpha(n)$:
$$\hat{x}(n|y_n) = \sum_{k=1}^{n} b_k\,\alpha(k) \tag{28}$$
where $b_k$ is chosen to minimize the mean square of the
estimation error $x(n) - \hat{x}(n|y_n)$, as given in (29):
$$b_k = \frac{E[x(n)\,\alpha^*(k)]}{E[\alpha(k)\,\alpha^*(k)]} \tag{29}$$
At time k = n,
$$\hat{x}(n|y_n) = \sum_{k=1}^{n-1} b_k\,\alpha(k) + b_n\,\alpha(n) \tag{30}$$
The summation on the right-hand side of (30) is
functionally the previous estimate $\hat{x}(n-1|y_{n-1})$ and
$b_n\,\alpha(n)$ is the correction term. So the equation of the
current estimate can be expressed as:
$$\hat{x}(n|y_n) = \hat{x}(n-1|y_{n-1}) + b_n\,\alpha(n) \tag{31}$$
D. Mathematical Formulation of Kalman Filter
The one-step predictor is the key to the mathematical
formulation of the Kalman filter. It predicts the state
after a new observation y(n) is introduced. Afterwards it
calculates the error covariance of the prediction and
updates the mean square estimate for the next
operation. Meanwhile the Kalman gain G(n), together with
the prediction error, provides the correction to the
estimate. The essential Kalman variables, along with their
definitions, are shown in Table 1.
TABLE.1
DEFINITION OF KALMAN FILTER VARIABLES
Variable                  Definition
x(n)                      State at time n
y(n)                      Observation at time n
F(n+1, n)                 Transition matrix from time n to time n+1
C(n)                      Measurement matrix at time n
Q1(n)                     Correlation matrix of process noise v1(n)
Q2(n)                     Correlation matrix of measurement noise v2(n)
$\hat{x}(n|y_{n-1})$      Predicted estimate of the state at time n given the observations y(1), y(2), y(3), ..., y(n-1)
$\hat{x}(n|y_n)$          Filtered estimate of the state at time n given the observations y(1), y(2), y(3), ..., y(n)
G(n)                      Kalman gain at time n
$\alpha(n)$               Innovation vector at time n
R(n)                      Correlation matrix of the innovation vector $\alpha(n)$
K(n, n-1)                 Correlation matrix of the error in $\hat{x}(n|y_{n-1})$
K(n)                      Correlation matrix of the error in $\hat{x}(n|y_n)$
Using the system variables listed above, the one-step
prediction operation [18] can be performed
algorithmically as follows:
For any given time n+1:
Input vector process: given observations = {y(1), y(2), y(3), ..., y(n)}
Known parameters:
Transition matrix = F(n+1, n)
Measurement matrix = C(n)
Correlation matrix of process noise = Q1(n)
Correlation matrix of measurement noise = Q2(n)
Initial conditions: set the initial state vector at time n = 1, where only the
observation y0 exists:
$$\hat{x}(1|y_0) = E[x(1)]$$
Initialize the correlation matrix of the error at time n = 1:
$$K(1, 0) = E\big[(x(1) - E[x(1)])\,(x(1) - E[x(1)])^H\big] = \Pi_0$$
For n = 1, 2, 3, ... the following series of operations is
performed sequentially.
The Kalman gain equation is:
$$G(n) = F(n+1, n)\,K(n, n-1)\,C^H(n)\,R^{-1}(n) \tag{32}$$
where the error correlation matrix K(n, n-1) is defined
as the expectation of the outer product of the predicted
state-error vector:
$$K(n, n-1) = E[\epsilon(n, n-1)\,\epsilon^H(n, n-1)] \tag{33}$$
$\epsilon(n, n-1)$ is the predicted state-error vector at time n.
The inverse of the correlation matrix of the innovations,
$R^{-1}(n)$, is obtained from the error correlation matrix
K(n, n-1) of (33), the measurement matrix C(n) and the
correlation matrix of the Gaussian measurement noise
(white noise with a Gaussian distribution). Explicitly it is:
$$R^{-1}(n) = [C(n)\,K(n, n-1)\,C^H(n) + Q_2(n)]^{-1} \tag{34}$$
The innovation, i.e. the new information in the
observation y(n), is found as follows:
$$\alpha(n) = y(n) - C(n)\,\hat{x}(n|y_{n-1}) \tag{35}$$
The state estimate for the next time step is obtained by
multiplying the transition matrix F(n+1, n) with the
previous estimate $\hat{x}(n|y_{n-1})$ and adding the product of
the gain G(n) and the innovation $\alpha(n)$. This is the basic
prediction equation:
$$\hat{x}(n+1|y_n) = F(n+1, n)\,\hat{x}(n|y_{n-1}) + G(n)\,\alpha(n) \tag{36}$$
The corrected error correlation matrix for the current
time is obtained by (37):
$$K(n) = K(n, n-1) - F(n, n+1)\,G(n)\,C(n)\,K(n, n-1) \tag{37}$$
In the final part of the calculation, we update the
predicted error correlation matrix K(n+1, n). As an
integral part of this, the correlation matrix Q1(n) of the
white Gaussian process noise is added. The equation is
shown in (38):
$$K(n+1, n) = F(n+1, n)\,K(n)\,F^H(n+1, n) + Q_1(n) \tag{38}$$
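The recursion (32) to (38) is easiest to follow in the scalar case. The sketch below assumes M = N = 1 and time-invariant F, C, Q1 and Q2, so every matrix reduces to a real number and F(n, n+1) is simply 1/F; the function name `kalman_one_step_predictor`, the constant-state test model and the large initial error correlation are hypothetical choices made for illustration, not the settings used in the experiments.

```python
import random

def kalman_one_step_predictor(y, F, C, Q1, Q2, x_init, K_init):
    """Scalar (M = N = 1) version of (32)-(38). Returns (x_pred, K_pred)."""
    x_pred = x_init                      # initial predicted state, E[x(1)]
    K_pred = K_init                      # initial error correlation K(1, 0)
    for y_n in y:
        R = C * K_pred * C + Q2                   # (34): innovation correlation
        G = F * K_pred * C / R                    # (32): Kalman gain
        alpha = y_n - C * x_pred                  # (35): innovation
        x_pred = F * x_pred + G * alpha           # (36): predicted state for n+1
        K = K_pred - (1.0 / F) * G * C * K_pred   # (37): F(n, n+1) = 1/F here
        K_pred = F * K * F + Q1                   # (38): predicted error correlation
    return x_pred, K_pred

# Illustrative use: a constant state observed in unit-variance Gaussian noise.
random.seed(4)
y_obs = [5.0 + random.gauss(0.0, 1.0) for _ in range(200)]
x_hat, K_final = kalman_one_step_predictor(
    y_obs, F=1.0, C=1.0, Q1=0.0, Q2=1.0, x_init=0.0, K_init=1e6)
```

With a constant state, no process noise and a very uncertain initial estimate, the recursion reduces to a running average of the observations, so `x_hat` ends up close to the sample mean while only the previous estimate and its error correlation are stored.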
IV. ACOUSTIC ECHO CANCELLATION AND EXPERIMENTAL SETUP
Acoustic echo cancellation is noise cancellation applied to
a recorded speech signal. A speech signal played by the
loudspeaker returns to the microphone as an echo after
reflecting from the room, and mixes with the original
speech signal. The signal at the microphone input is
therefore not uniform and is distorted. The adaptive filter
works with this distorted signal, which is the signal of our
interest. While processing the distorted signal it
produces the best estimate of the noise component.
Subtracting this estimate from the microphone signal
solves the problem. Concurrently an error signal is
generated that reflects the difference between the actual
signal and our approximation. The filter coefficients are
updated according to the error signal before the next input
block is processed. AEC is indispensable when the
loudspeaker is close to the microphone. For instance,
devices like hands-free
mobile phones and videoconferencing software like Skype,
Marratech, NFESIS and iVisit use their own AEC.
Experimentally, the signals of our interest can be
subdivided into three parts:
A. Near End Speech Signal (NES)
The signal that travels from the user participating in the teleconference to the microphone is the near end speech
signal. Its amplitude is higher at the start but
decreases gradually.
B. Far End Speech Signal (FES)
The far end speech signal comes out of the loudspeaker,
bounces around the room and returns to the microphone, adding noise to the microphone input signal.
C. Microphone Signal (MS)
The microphone picks up both the near end and far end speech signals. The target of our echo canceller is to
remove the far end signal from the microphone signal and
transmit the near end speech signal to the distant user participating in the teleconference.
D. Experimental Setup
The signal entering the microphone of a hall room consists
of the direct speech signal $d(n)$ of the speaker and the echo
signal $\hat{d}(n)$ that arises from reflection off the walls, as
shown in Fig.3. The direct speech signal is called the
near-end signal and the echo signal is called the far-end
signal. The combined input signal of the microphone is
$d(n) + \hat{d}(n)$. The objective of an echo canceller is to
remove the far-end signal so that only the near-end signal
is sent to the loudspeaker. The path or channel between the
loudspeaker and the microphone is represented by a long
finite impulse response.
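The microphone signal described above can be modeled in a few lines. This is a sketch under an explicit assumption: the echo $\hat{d}(n)$ is taken to be the loudspeaker signal convolved with the finite impulse response of the room, which is one reading of the last sentence of the paragraph. The helper names `convolve` and `microphone_signal` and the toy signals are hypothetical.

```python
def convolve(signal, impulse_response):
    """Causal FIR convolution, truncated to the length of `signal`."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for j, h_j in enumerate(impulse_response):
            if n - j >= 0:
                acc += h_j * signal[n - j]
        out.append(acc)
    return out

def microphone_signal(direct, loudspeaker, room_ir):
    """Direct speech d(n) plus echo, with the echo assumed to be room_ir * loudspeaker."""
    echo = convolve(loudspeaker, room_ir)
    return [d_n + e_n for d_n, e_n in zip(direct, echo)]

# Toy example: a unit pulse of direct speech and a delayed, attenuated echo.
direct = [1.0, 0.0, 0.0, 0.0]
loudspeaker = [0.0, 2.0, 0.0, 0.0]
room_ir = [0.0, 0.5, 0.25]          # assumed short impulse response
mic = microphone_signal(direct, loudspeaker, room_ir)
```

An echo canceller would estimate `room_ir` adaptively and subtract the estimated echo from `mic`, leaving the direct speech.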
Figure.3: Room environment of Acoustic Echo Canceller
V. RESULT AND SIMULATION
Our basic room setup incorporates a microphone located at the middle of the room and loudspeakers
around the room, each 6.70 meters from the
microphone. After the basic room environment is set up, the system is ready to simulate. The room impulse response
must be calculated first, as this response is the major source of noise in
our experimental signal. For this purpose, we used
a Chebyshev filter of order four
with pass-band frequency range 0.1 < Wn < 0.7.
In addition, the sampling frequency was assumed to be
fs = 8000 samples/s and the number of time samples
M = 4001. The stop-band ripple of the room impulse response
was taken as 20 dB. The frequency domain and time domain
illustrations of the room impulse response (Fig.4 and Fig.5)
are:
Figure. 4: Room impulse response (Frequency Domain)
Figure.5: Room impulse response (Time Domain)
An AEC system depends on several interdependent
factors. To analyze and compare the performance of the
filters, four factors of AEC must be initialized:
reverberation time, filter length, step size (Mu)
and signal to noise ratio (SNR). These factors are
discussed below along with the parameters of our
designed system:
A. Reverberation Time
Suppose we have to cancel 20 dB from
the noisy signal. Reverberation time is the time it takes
the sound to decay by 60 dB from its initial value after the sound source stops. In our AEC
system, the loudspeaker is placed at a distance of 670 cm and
sound travels through air at about 340 m/s. So, it takes (6.70 m / 340 m/s ≈ 19.7 ms) about 20 ms to travel from the
loudspeaker to the microphone. Suppose that the reverberation
time is 6000 ms in our room. First we have to initialize
the filter length based upon the room setup and impulse
response.
B. Filter Length
The most elementary factor is the filter length. Since
estimating the echo is the most basic task in canceling it from the microphone signal, the
filter length must be chosen accordingly. Since the reverberation
time is 6000 ms, for the signal to decay by 20 dB the room impulse response has to be at least 2000 ms long. Our
sampling frequency is 8 kHz. So, the filter length
should be at least (2 s * 8000 samples/s = 16000
coefficients). For the speech signal that we used, the filter
length is 25000.
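The arithmetic behind the travel time and the minimum filter length can be checked with a short script. It is a sketch that assumes the sound level falls linearly in dB over the reverberation time, which is how the 2000 ms figure above follows from 6000 ms; the function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_PER_S = 340.0      # value used in the text

def travel_time_ms(distance_m, speed=SPEED_OF_SOUND_M_PER_S):
    """Time for sound to cover distance_m, in milliseconds."""
    return 1000.0 * distance_m / speed

def min_filter_length(rt60_s, decay_db, fs_hz):
    """Coefficients needed to cover a decay of decay_db.

    Assumes a linear decay in dB, so decay_db takes rt60_s * decay_db / 60 seconds.
    """
    decay_time_s = rt60_s * decay_db / 60.0
    return math.ceil(decay_time_s * fs_hz)

delay_ms = travel_time_ms(6.70)                                   # about 19.7 ms
taps = min_filter_length(rt60_s=6.0, decay_db=20.0, fs_hz=8000)   # 16000
delay_samples = round(0.02 * 8000)                                # 160 samples for 20 ms
```

The chosen filter length of 25000 coefficients therefore leaves a margin above the 16000-coefficient minimum.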
C. Step Size (Mu)
Step size (Mu) is the factor that scales each coefficient
update, as in (3) and (10), and therefore determines how quickly
the filter adapts to each sample or block of input. A reasonable Mu is chosen according to the room impulse response. For this reason it is also said to set the
convergence rate. A smaller step size gives a more
accurate output, at the cost of slower adaptation. It should be in the range 0 < Mu < 1.
D. Signal to Noise Ratio (SNR)
SNR indicates the level of the signal of interest relative
to the noise present in it. The lower the SNR, the harder it
is for the filters to converge.
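As a small illustration of how a test signal with a prescribed SNR can be prepared, the sketch below measures the SNR as a power ratio in decibels and rescales a noise sequence to hit a target value. Treating the SNR values in this paper as decibels is an assumption, and the helper names `snr_db` and `scale_noise_to_snr` are hypothetical.

```python
import math
import random

def snr_db(signal, noise):
    """SNR as 10*log10 of the ratio of average signal power to average noise power."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)

def scale_noise_to_snr(signal, noise, target_db):
    """Rescale `noise` so that snr_db(signal, scaled_noise) equals target_db."""
    gain = math.sqrt(10.0 ** ((snr_db(signal, noise) - target_db) / 10.0))
    return [gain * v for v in noise]

# Illustrative use: a sine tone and Gaussian noise scaled to an assumed 45 dB SNR.
random.seed(5)
tone = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(1000)]
raw_noise = [random.gauss(0.0, 1.0) for _ in range(1000)]
noise_45 = scale_noise_to_snr(tone, raw_noise, 45.0)
noisy_tone = [s + v for s, v in zip(tone, noise_45)]
```

Lowering `target_db` from 45 to 35 raises the noise power by a factor of ten, which corresponds to the harder of the two test cases considered below.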
E. Test Case
Here we examine the simulation results of the three
filters for varying Mu and SNR. The ultimate goal of the
three filters is to keep the mean square error (MSE) and
the misadjustment (Madj) at a minimum. For each case, the
time domain depiction and the spectrogram are shown. A
spectrogram illustrates a signal along three axes: the
X-axis shows time, the Y-axis shows frequency and the
Z-axis shows amplitude. Hence, a spectrogram provides a
visualization of sound.
For all three filters several test cases were
considered with varying Mu and SNR. Here we highlight
two test cases that are most relevant to the analysis.
The first test case uses Mu = 0.025 and SNR = 45, and
the second uses Mu = 0.25 and SNR = 35.
F. LMS Case
The results for the LMS filter with the first test case
are shown below (Fig.6 and Fig.7). In the time domain
illustration, the first block shows the clean speech
signal, the second shows the same signal in the presence
of noise and the last block shows the filtered signal.
Fig.7 shows the same in spectrogram view. In the third
block we observe that the filtered signal closely
resembles the original waveform.
Figure.6: Time domain illustration of original signal,
microphone signal and filtered signal
Figure.7: Spectrogram illustration of original signal,
microphone signal and filtered signal
Increasing Mu and decreasing the SNR increases the
minimum mean square error (MMSE).
Although increasing the step size reduces the
execution time (ET), the misadjustment (Madj) moves
upward. ET is listed in Table 2. By increasing
Mu and decreasing the SNR we obtain the following
output (Fig.8 and Fig.9). The two figures show that the filtered signal is much noisier than in the previous
observation.
Figure.8: Time domain illustration of original signal,
microphone signal and filtered signal.
Figure.9: Spectrogram illustration of original signal,
microphone signal and filtered signal
G. BFDAF Case
As BFDAF processes the signal block by block, it waits a while to accumulate a block of the predefined
size before processing. Setting this waiting time is therefore a
sensitive choice for the filter. The speech signal takes 20 ms to travel through the air and reach the microphone.
Hence, BFDAF should skip (0.02 s * 8000 samples/s) =
160 samples. This means that after 160 samples BFDAF takes the next block of input. The noise-cancelled
signal along with the noisy signal is shown in
Fig.10 and Fig.11. As with LMS, the third block in the two diagrams depicts the filtered signal. Though the execution
time is not particularly small, the result is good. The ET and
Madj are listed in Table 3.
Figure.10: Time domain illustration of original signal,
microphone signal and filtered signal.
Figure.11: Spectrogram illustration of original signal,
microphone signal and filtered signal.
After increasing the noise (decreasing the SNR by 10) and
increasing the step size (to 0.25), we observed that the
larger step size gives a shorter execution time but
poorer convergence. In addition, the added noise accelerates
divergence. Thus, the filtered signal appears noisier.
Fig.12 and Fig.13 depict the effects of this change.
Figure.12: Time domain illustration of original signal,
microphone signal and filtered signal.
Figure.13: Spectrogram illustration of original signal,
microphone signal and filtered signal.
H. Kalman Case
The Kalman filter converges well but requires a long
execution time. The simulation results of the Kalman filter
at Mu = 0.025 and SNR = 45 are shown in Fig.14 and Fig.15.
Both figures show very good convergence, although the
execution time is high. The corresponding ET and Madj are
recorded in Table 4.
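To make the predictor-corrector idea concrete, the following is a minimal sketch of one common way to cast echo-path identification as a Kalman problem: the tap weights are treated as a static state and each microphone sample as a noisy scalar observation. It is not the implementation used for the simulations here. The function name `kalman_identify`, the parameters `meas_var` and `init_var`, the three-tap echo path and the noise-free test signal are all assumptions made for illustration.

```python
import random


def kalman_identify(far_end, mic, num_taps, meas_var=1e-3, init_var=1.0):
    """Estimate echo-path taps; state = tap weights, assumed static."""
    n = num_taps
    w = [0.0] * n                                   # state estimate (tap weights)
    P = [[init_var if i == j else 0.0 for j in range(n)] for i in range(n)]
    u = [0.0] * n                                   # recent far-end samples, newest first
    for x_k, d_k in zip(far_end, mic):
        u = [x_k] + u[:-1]
        # Predictor: with a static state the predicted weights are w itself.
        Pu = [sum(P[i][j] * u[j] for j in range(n)) for i in range(n)]
        s = sum(u[i] * Pu[i] for i in range(n)) + meas_var   # innovation variance
        gain = [p / s for p in Pu]                            # Kalman gain
        innovation = d_k - sum(w[i] * u[i] for i in range(n))
        # Corrector: update the weights and the error covariance.
        w = [w[i] + gain[i] * innovation for i in range(n)]
        P = [[P[i][j] - gain[i] * Pu[j] for j in range(n)] for i in range(n)]
    return w


random.seed(1)
echo_path = [0.5, -0.3, 0.1]                        # hypothetical echo path
far_end = [random.uniform(-1.0, 1.0) for _ in range(500)]
mic, hist = [], [0.0] * len(echo_path)
for sample in far_end:
    hist = [sample] + hist[:-1]
    mic.append(sum(h * v for h, v in zip(echo_path, hist)))

weights = kalman_identify(far_end, mic, num_taps=3)
print([round(v, 4) for v in weights])
```

In this formulation the covariance update touches every entry of an n-by-n matrix at each sample, so the cost per sample grows with the square of the filter length. That is a plausible, though here unverified, contributor to the large execution times reported for the Kalman filter.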
Figure.14: Time domain illustration of original signal,
microphone signal and filtered signal.
Figure.15: Spectrogram illustration of original signal,
microphone signal and filtered signal.
The performance of the Kalman filter remains satisfactory
even when we decrease the SNR and increase Mu. However, ET
is still large in comparison with that of LMS
and BFDAF.
Figure.16: Time domain illustration of original signal,
microphone signal and filtered signal.
Figure.17: Spectrogram illustration of original signal,
microphone signal and filtered signal.
I. Analysis
After the pictorial representation, we turn to
the tabular data, where the execution time (ET), minimum
mean square error (MMSE) and misadjustment (Madj)
are recorded for different values of SNR and Mu. The
data for each filter are tabulated below:
TABLE.2
ET, MMSE and Madj with varying Mu and SNR (LMS)
SNR   Mu      ET (s)   MMSE        Madj
45    0.025   1.228    -0.00566    -1
45    0.25    1.106    -0.00562    -1
35    0.025   1.012    -0.00258    -1
35    0.25    1.011    -0.00253    -1
Table 2 shows that, for both SNR = 45 and SNR =
35, ET becomes smaller as the step size is increased.
MMSE increases with the noise (decreasing SNR) and with
Mu. Despite this, there is no significant change in Madj
for LMS. This is because the LMS filter works on each input
sample in isolation and relies on instantaneous estimates.
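The sample-by-sample behaviour described above can be sketched as follows. This is a minimal illustration of the standard LMS update, not the code behind Table 2. The function name `lms_identify`, the three-tap echo path and the noise-free synthetic signals are assumptions. The step size 0.025 matches the smaller Mu used in the tables.

```python
import random


def lms_identify(x, d, num_taps, mu):
    """Sample-by-sample LMS: returns the final weights and the error sequence."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps                  # recent input samples, newest first
    errors = []
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]
        y_n = sum(wi * bi for wi, bi in zip(w, buf))    # echo estimate
        e_n = d_n - y_n                                 # instantaneous error
        # Instantaneous gradient estimate: only the current sample pair is used.
        w = [wi + mu * e_n * bi for wi, bi in zip(w, buf)]
        errors.append(e_n)
    return w, errors


random.seed(0)
echo_path = [0.5, -0.3, 0.1]                # hypothetical echo path
far_end = [random.uniform(-1.0, 1.0) for _ in range(5000)]
mic, hist = [], [0.0] * len(echo_path)
for s in far_end:
    hist = [s] + hist[:-1]
    mic.append(sum(h * v for h, v in zip(echo_path, hist)))

weights, err = lms_identify(far_end, mic, num_taps=3, mu=0.025)
print([round(v, 4) for v in weights])
```

Each update costs a number of operations proportional to the filter length and uses no averaging over past errors, which is consistent with the low execution time reported for LMS.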
TABLE.3
ET, MMSE and Madj with varying Mu and SNR (BFDAF)
SNR   Mu      ET (s)   MMSE        Madj
45    0.025   3.586    -0.000412   -1.0006
45    0.25    1.079    -0.0198     -1.0003
35    0.025   1.313    -0.0037     -0.9999
35    0.25    1.125    -0.8876     -0.9989
According to Table 3, when SNR was 45 and Mu
was 0.025, ET was 3.586 s. As the step size was increased,
ET fell and MMSE grew in magnitude; Madj
also increased. A similar pattern was found at SNR = 35,
but owing to the good adaptability of BFDAF it took less ET.
TABLE.4
ET, MMSE and Madj with varying Mu and SNR (Kalman filter)
SNR   Mu      ET (s)   MMSE        Madj
45    0.025   50.12    -0.000385   -1.2503
45    0.25    35       -0.000315   -1.012
35    0.025   32.38    -0.0049     -0.8897
35    0.25    30.46    -1.2701     -0.8101
Table 4 illustrates the behaviour of these
factors for the Kalman filter. Like BFDAF, the Kalman
filter shows higher MMSE and Madj with
decreasing SNR and increasing Mu. Unlike BFDAF, the
Kalman filter requires a significant amount of execution
time. At SNR = 45 and Mu = 0.025, ET was 50.12
seconds, MMSE = -0.000385 and Madj = -1.2503. A larger
step size should, in principle, reduce the execution
time. Accordingly, when Mu was raised to 0.25, ET went
down to 35 seconds, while MMSE and Madj changed to
-0.000315 and -1.012 respectively.
From these three observations, we can add that, at
SNR = 45, the MMSE of smallest magnitude (-0.000385)
belongs to the Kalman filter, although its ET of 50.12 s
diminishes its acceptability. At this stage, BFDAF
performs better than LMS except in terms of ET:
LMS takes an ET of 1.228 s, which is well under
the 3.586 s of BFDAF. Madj is almost alike: -1 for LMS and
-1.0006 for BFDAF. BFDAF delivers a better MMSE than
LMS. As we move down the tables with decreasing
SNR and increasing Mu, the Kalman filter behaves well in
all factors except ET. The performance of BFDAF is
satisfactory and acceptable, as it delivers lower MMSE
and Madj as long as the signal quality is not much degraded
and Mu is not much increased. LMS also showed good
performance, but there is no change in its Madj. At SNR = 35
and Mu = 0.25, ET = 1.125 s, MMSE = -0.8876 and Madj =
-0.9989 for BFDAF. For LMS, ET = 1.011 s, MMSE =
-0.00253 and Madj = -1. These results tell us that
LMS requires less ET than BFDAF in all but one of the
tabulated cases, though it does not show any change in Madj,
whereas BFDAF takes a little more ET at the start of the
process but adapts itself well even with more noise and
a larger step size.
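The comparison across Tables 2 to 4 can be reproduced mechanically. The sketch below copies the tabulated values and reports, for each (SNR, Mu) pair, the filter with the smallest ET and the filter whose MMSE is smallest in magnitude. Comparing MMSE by magnitude is an assumption about how the negative entries should be read, and the names `RESULTS`, `fastest` and `smallest_mmse_magnitude` are hypothetical.

```python
# (SNR, Mu) -> (ET in seconds, MMSE, Madj), copied from Tables 2, 3 and 4.
RESULTS = {
    "LMS": {
        (45, 0.025): (1.228, -0.00566, -1.0),
        (45, 0.25): (1.106, -0.00562, -1.0),
        (35, 0.025): (1.012, -0.00258, -1.0),
        (35, 0.25): (1.011, -0.00253, -1.0),
    },
    "BFDAF": {
        (45, 0.025): (3.586, -0.000412, -1.0006),
        (45, 0.25): (1.079, -0.0198, -1.0003),
        (35, 0.025): (1.313, -0.0037, -0.9999),
        (35, 0.25): (1.125, -0.8876, -0.9989),
    },
    "Kalman": {
        (45, 0.025): (50.12, -0.000385, -1.2503),
        (45, 0.25): (35.0, -0.000315, -1.012),
        (35, 0.025): (32.38, -0.0049, -0.8897),
        (35, 0.25): (30.46, -1.2701, -0.8101),
    },
}

CONDITIONS = [(45, 0.025), (45, 0.25), (35, 0.025), (35, 0.25)]


def fastest(condition):
    """Filter with the smallest execution time for one (SNR, Mu) pair."""
    return min(RESULTS, key=lambda name: RESULTS[name][condition][0])


def smallest_mmse_magnitude(condition):
    """Filter whose MMSE entry is closest to zero (assumed reading of the tables)."""
    return min(RESULTS, key=lambda name: abs(RESULTS[name][condition][1]))


for cond in CONDITIONS:
    print(cond, "fastest:", fastest(cond),
          "| smallest |MMSE|:", smallest_mmse_magnitude(cond))
```

Keeping the data in one structure makes it easy to check claims such as which filter is fastest under each condition before drawing conclusions.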
VI. DISCUSSION
Finally, our experiment leads to a sound conclusion on
the choice of digital adaptive filters, which we draw
from the results of the previous section. Although it
contributes greatly to many heavily noisy real-life systems
where prediction is inherent, the Kalman filter requires
a long execution time, which is not conducive to an AEC
system. A participant in a conference will not wait that
long to hear the sender’s voice. The choice between the
LMS filter and BFDAF depends upon several
considerations. As the LMS filter uses instantaneous
estimates, its misadjustment is almost unchanging.
BFDAF takes a little more time to adapt at first, but in
the end it shows good convergence. With a good choice of
step size, BFDAF demonstrates lower misadjustment
and MMSE. If it is known in advance that the AEC
environment will be less noisy and little execution time is
available, it is desirable to use LMS. With a good selection
of step size and a considerable execution time budget,
however, BFDAF is recommended. Besides, block processing
requires less memory for storing the signal. Therefore,
BFDAF is very efficient for very long impulse
responses. In the end, we can deduce that both BFDAF and
LMS can be applied in AEC, depending on the application
area. This paper thus provides a transparent understanding
of how to choose an adaptive filter in the context of AEC.
However, DTD (Double Talk Detection), which might be a
performance factor in AEC when two participants try to
speak at the same time, is not considered in this paper.
Future work is to analyze the practical implementation of
these algorithms in the presence of double talk and to
enhance the algorithms if the desired results are not
achieved.
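The remark that a block frequency-domain filter pays off for long impulse responses can be illustrated with a rough operation count. The figures below are an assumed textbook-style estimate, not a measurement from this paper: a time-domain LMS filter with N taps is taken to need about 2N^2 real multiplications per block of N samples, and an overlap-save frequency-domain block filter with five 2N-point FFTs about 10N log2(N) + 26N. The function names are hypothetical.

```python
import math


def lms_mults_per_block(n):
    """Assumed cost of time-domain LMS over a block of n samples with n taps."""
    return 2 * n * n


def block_fd_mults_per_block(n):
    """Assumed cost of a frequency-domain block filter (five 2n-point FFTs)."""
    return 10 * n * math.log2(n) + 26 * n


def complexity_ratio(n):
    """Frequency-domain cost divided by time-domain cost; below 1 favours blocks."""
    return block_fd_mults_per_block(n) / lms_mults_per_block(n)


for taps in (16, 64, 256, 1024):
    print(taps, round(complexity_ratio(taps), 4))
```

Under these assumptions the ratio simplifies to (5 log2(N) + 13) / N, so the block approach is more expensive for very short filters and becomes progressively cheaper as the impulse response grows.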
REFERENCES
[1] D. Molkdar, “Review on radio propagation into and within
buildings,” IEEE Proc. H, vol. 138, pp. 61–73, February
1991.
[2] Zhiwei Zeng, “Digital Communication via Multipath
Fading Channel”, Cpre537x Final Project, pp.17-22,
November 2000.
[3] J. K. Cavers and P. Ho, “Analysis of the error performance
of trellis coded modulations in Rayleigh fading channels,”
IEEE Trans. Commun., vol. 40, pp. 74–80, January 1992.
[4] P. Yegani and C. McGillem, “A statistical model for the
factory radio channel,” IEEE Trans. Commun., vol. COM-39,
pp. 1445–1454, October 1991.
[5] N. J. Bershad and P. L. Feintuch, “Non-Wiener solutions
for the LMS algorithm—A time domain approach,” IEEE
Trans. Signal Processing, vol. 43, pp. 1273–1275, May 1995.
[6] E. R. Ferrara, Jr., “Fast Implementation of LMS Adaptive
Filter,” IEEE Trans. Acoust., Speech, Signal Processing,
vol. ASSP-28, pp. 474-475, 1980.
[7] M. J. Shensa, “Non-Wiener solutions for the adaptive
canceller with a noisy reference,” IEEE Trans. Acoust.,
Speech, Signal Processing, vol. ASSP-28, pp. 468–473,
August 2000.
[8] P. Clarkson and P. White, “Simplified analysis of the LMS
adaptive filter using a transfer function approximation,”
IEEE Trans. Acoust., Speech, Signal Processing, vol.
ASSP-35, pp. 987–993, July 1987.
[9] Emmanuel C. Ifeachor, Barrie W. Jervis, “Digital Signal
Processing – A practical approach”, Addison-Wesley,
ISBN 0 201 54413 X, pp.104-5, pp.541 – 552, September
2001.
[10] D. Mansour and A. H. Gray, “Unconstrained Frequency
Domain Adaptive Filter,” IEEE Trans. Acoust., Speech,
Signal Processing, vol. ASSP-30, pp. 726-734, 1982.
[11] J. S. Soo and K. K. Pang, “Multidelay Block Frequency
Domain Adaptive Filter,” IEEE Trans. Acoust., Speech,
Signal Processing, vol. ASSP-38, pp. 373-376, 1990.
[12] Satoru Emura, Yoichi Haneda, and Shoji Makino,
“Enhanced Frequency-Domain Adaptive Algorithm for
Stereo Echo Cancellation”, IEEE Transc. Vol-II,
pp. 1901-03, July 2001.
[13] N. K. Jablon, “On the complexity of Frequency Domain
Adaptive Filtering,” IEEE Trans. on Signal Processing,
vol. 39, pp. 2331-2334, October 1991.
[14] E. W. Harris and C.D.M.a.B.F.A., "A variable step (VS)
adaptive filter algorithm," IEEE Transactions on
Biomedical Engineering, vol. 34, pp. 309-316, 1986.
[15] Greg Welch and Gary Bishop, “An Introduction to the
Kalman Filter”, TR 95-041, Department of Computer
Science, University of North Carolina at Chapel Hill,
pp. 3-7, July 2006.
[16] H. W. Sorenson, Kalman Filtering: Theory and Application,
IEEE Press, pp. 115-118, New York (1985).
[17] A. H. Mohamed, K. P. Schwarz, “Adaptive Kalman
Filtering for INS/GPS”, pp. 7-11, December 1998.
[18] Simon Haykin, “Adaptive Filter Theory”, Fourth Edition,
Pearson Education, ISBN 81-7808-565-8, pp. 231-237,
pp. 345-48, pp. 466-485, September 2001.
[19] Averil B. Chatfield, “Fundamentals of high accuracy
inertial navigation. Progress in astronautics and
aeronautics”, AIAA No. V-174: (800), pp. 189-192,
September 1997.
[20] J. Benesty, F. Amand, A. Gilloire, Y. Grenier, "Adaptive
filtering algorithms for stereophonic acoustic echo
cancellation," icassp, vol-5, pp. 3099-3102, Acoustics,
Speech, and Signal Processing, ICASSP-95, International
Conference on, 1995.
[21] Harsha I. K. Rao, Behrouz Farhang-Boroujeny, “Fast
LMS/Newton algorithms for stereophonic acoustic echo
cancellation”, Vol-8, pp. 2919-2930, August 2009.