
Demystifying the Digital Adaptive Filters

Conducts in Acoustic Echo Cancellation

Md. Anamul Haque Department of Computer Science and Engineering, Jahangirnagar University

Savar, Dhaka, Bangladesh

Email: [email protected]

A.K.M Kamrul Islam Department of Computer Science and Engineering, Jahangirnagar University

Savar, Dhaka, Bangladesh

Email: [email protected]

Md. Imdadul Islam Department of Computer Science and Engineering, Jahangirnagar University

Savar, Dhaka, Bangladesh

Email: [email protected]

Abstract— A digital sound system resembles a wireless transmission link. A wireless link is affected by several disturbing factors: fading, attenuation, non-linear distortion and noise. These factors are also unavoidable in acoustic echo cancellation. A number of digital adaptive filter algorithms have been proposed to detect and cancel the noise in a system. Among them, the Least Mean Square (LMS) filter, the Block Frequency Domain Adaptive Filter (BFDAF) and the Kalman filter are widely used to predict and remove noise and other unwanted signals. The performance of these filters varies and depends on the system to which they are applied, so a detailed performance analysis is needed. The goal of this paper is to analyze these filters and identify the best-performing one, through performance evaluation, in the context of acoustic echo cancellation.

Index Terms— Predictor-corrector algorithm, near-end speech signal, reverberation time, step size.

I. INTRODUCTION

In wireless communication, the signal transmitted from a source may traverse multiple paths towards the receiver. This happens through obstruction or reflection by natural barriers such as the ground, buildings, vehicles and hills at different atmospheric levels. As a result of such propagation, the receiver receives multiple copies of the same signal, each having travelled a different physical path length. Each copy experiences different noise, attenuation, phase shift and delay [1]. The original signal power is therefore altered, and the receiver sees either a constructive or a destructive combination of the copies, which amplifies or attenuates the signal. Destructive interference occurs more frequently than constructive interference; its effect on the key signal is called deep fading [2]-[3]. Overlapping of different signals causes shadowing. Such complicated phenomena led researchers to develop statistical models of fading channels [4] to handle these situations. Relative to the coherence time (Tc) of the channel, fading is further classified as slow or fast [5]. Fading degrades the bit error rate (BER) and signal-to-noise ratio (SNR), resulting in lost data and reduced quality of the communication link.

Analogously, the sound system in a conference room reflects the behaviour of a wireless communication link. A typical conference-room sound system involves a loudspeaker, a microphone and a transmission path. A voice signal sent by one participant comes out of the loudspeaker and propagates through multiple paths. After echoing back from the conference-room walls and from multiple directions, these signals distort the original voice signal produced by the participant in front of the microphone. Different feedback signals possess different delay times and thus contaminate the key signal indiscriminately. The main difference between the two analogous systems is that the feedback signal in acoustic echo cancellation travels a shorter path and has lower strength. Likewise, the noise and echo in the echo cancellation system are not as severe as in the wireless transmission link.

To combat this scenario, an efficient digital filtering algorithm has to be applied before the sound signal is sent to the loudspeaker. In the arena of digital filtering there are several algorithms, among which three are well known for cancelling noise from a signal: the Least Mean Square (LMS) filter, the Block Frequency Domain Adaptive Filter (BFDAF) and the Kalman filter. The LMS filter is a simple adaptive algorithm that, like other adaptive algorithms, adapts its parameters iteratively. The Block Frequency Domain Adaptive Filter (BFDAF) uses a block implementation in which the coefficients are continually adjusted for each block of the signal. BFDAF is basically an enhancement of the LMS filter, but it calculates its parameters in the frequency domain using the Fast Fourier Transform (FFT).


The FFT facilitates the computation of the convolution. At every time step the input signal is estimated, an error signal is calculated, and the coefficients of the estimate are refined. The Kalman filter is widely used due to its strong adaptive characteristics. It contains a recursive procedure for estimating the state of the process in a way that minimizes the estimated mean square error. It estimates the present state and error covariance from past observations, corrects the error covariance using the measurement output, and stores the estimate for future calculation. Its operating technique is very effective even when the system to be modelled is unknown, since it depends only on the progressive results.

In this paper we compare and analyze the performance of these filters from several aspects to obtain a robust basis for choosing the best-performing filter for acoustic echo cancellation. The paper is organized into several sections: Section II reviews related work on digital adaptive filters. Section III presents the basic operation and equations of the filters we deal with. The basic experimental setup and terminology of Acoustic Echo Cancellation (AEC) are described in Section IV. Section V is where our core effort lies; there we evaluate the performance of the filters and how it varies under different circumstances. The final part (Section VI) is a detailed discussion based on our analysis and observations.

II. RELATED WORK

This section briefly describes related research and studies of digital adaptive filters, especially in acoustic echo cancellation. There has been extensive research on new applications and algorithmic enhancements of the LMS, FDAF and Kalman filters over the past two decades. The interest of researchers shifted to the frequency domain when the FFT (Fast Fourier Transform) was introduced. An unconstrained frequency domain adaptive filter was introduced by Mansour D. and Gray A. H. in 1982. Block implementation of BFDAF with varying delay was investigated within the context of speech processing [J. S. Soo and K. K. Pang, 1990]. The impulse response of a noisy speech signal can have a widely varying delay time, so keeping the delay static is not a good solution. Some studies were based on specific applications: for instance, FDAF was enhanced for stereo echo cancellation. Since its introduction and publication by R. E. Kalman, the Kalman filter has become a topic of extensive research and application. In 1985 H. W. Sorenson published a detailed account of its theory and applications. As part of the varied applications of the Kalman filter, some researchers studied the approximation of non-linear transformations of probability distributions in the context of robotics [Julier, 96]. It has been found that, even when a statistical observation does not change linearly, the Kalman filter can supply a good estimate of the non-linear transformation while keeping the estimation error small. There are publications on the implementation and testing of adaptive filters that are very helpful for visualizing the algorithms' output. Many of them relate to the performance and enhancement of a specific algorithm [21]. Few of them work with multiple filtering techniques. For instance, the paper entitled "Adaptive filtering algorithms for stereophonic acoustic echo cancellation" [20] is concerned with two broad approaches and categories of filters rather than with specific widely used filters. However, no existing paper supplies a sufficient performance evaluation of adaptive filters in the context of AEC. This gap leads us to the experiments and demonstrations of this paper.

III. ADAPTIVE FILTER THEORY

A. Least Mean Square (LMS) Filter Concept

LMS is based on the steepest-descent algorithm. It makes instantaneous estimates and updates the filter coefficients sample by sample so as to minimize the MSE [6]-[8]. The significant feature of the LMS algorithm is its simplicity: it requires neither measurements of the auto-correlation or cross-correlation functions nor a matrix inversion. Hence it is faster than the basic Wiener filter algorithm [5]-[7]. Two basic processes work behind the LMS filtering algorithm: the filtering process, which calculates the output response of the filter to the input signal and generates an estimation error with respect to the desired response, and the adaptive process, which adjusts the parameters automatically according to the estimation error.

As LMS is based on the steepest-descent algorithm, the weight-update vector at time k+1 is

W_{k+1} = W_k - \mu \nabla_k   (1)

where W_k is the k-th weight vector, \nabla_k is the gradient vector appearing in (1), and \mu controls the rate of convergence. Replacing the value of \nabla_k,

W_{k+1} = W_k + 2\mu (P - R W_k)   (2)

Since LMS uses instantaneous estimates, P and R in (2) are substituted by P = y_k X_k and R = X_k X_k^T, giving

W_{k+1} = W_k + 2\mu (y_k X_k - X_k X_k^T W_k)
        = W_k + 2\mu X_k (y_k - X_k^T W_k)
        = W_k + 2\mu e_k X_k   (3)

where e_k = y_k - X_k^T W_k.

The LMS algorithm possesses the weight update W_{k+1} noted in (3). The flowchart of the LMS algorithm depicted in Fig.1 provides a lucid understanding:


Figure.1: Flowchart of LMS adaptive filter

From the above theory we can see that LMS does not require any knowledge of the correlation matrix; instead it uses instantaneous estimates [9]. At the first stage the weights may deviate from their expected values, but gradually they incline towards a good adjustment. In this way, the filter performs its adaptation by learning the signal characteristics.
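As a minimal sketch of the two processes described above (filtering and adaptation), the following Python/NumPy function implements the sample-by-sample update of Eq. (3); the tap count M = 64 and step size mu = 0.025 are illustrative values, not the paper's settings.

```python
import numpy as np

def lms_filter(x, d, M=64, mu=0.025):
    """One pass of the LMS recursion of Eq. (3).
    x: reference (input) signal, d: desired signal, M: taps, mu: step size."""
    w = np.zeros(M)                      # tap-weight vector W_k
    y = np.zeros(len(x))                 # filter output
    e = np.zeros(len(x))                 # error signal e_k
    for k in range(M, len(x)):
        x_k = x[k - M + 1 : k + 1][::-1]     # M most recent samples, newest first
        y[k] = np.dot(w, x_k)                # filtering process
        e[k] = d[k] - y[k]                   # estimation error
        w = w + 2 * mu * e[k] * x_k          # adaptive process: W_{k+1} = W_k + 2*mu*e_k*X_k
    return y, e, w
```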

B. Block Frequency Domain Adaptive Filter (BFDAF) Theory

The LMS algorithm adapts easily in the time domain of the signal. However, in applications such as acoustic echo cancellation in teleconferencing, a long impulse response is associated with the echo duration. This demands a long memory and increases the computational complexity [10]. Transforming the system of interest to the frequency domain simply by a Fourier-transform mapping reduces the computational complexity [12]. Secondly, the orthogonality properties of the DFT and other discrete transforms provide a higher convergence rate. An Infinite Impulse Response (IIR) filter could be a possible solution, but the algorithm may then encounter filter instability [13]. To enhance performance, block implementation of the FIR filter allows parallel processing of the input points of each block. In essence, with a block FIR implementation the fast Fourier transform serves for fast convolution [12].

Let L be the length of the block and M the length of the tapped weight vector. The data matrix [10] is

A(k) = [ u(kL)      u(kL-1)    u(kL-2)    ...  u(kL-M+1)
         u(kL+1)    u(kL)      u(kL-1)    ...  u(kL-M+2)
         ...
         u(kL+L-1)  u(kL+L-2)  u(kL+L-3)  ...  u(kL+L-M) ]

     = [ G^T(kM)  G^T(kM+1)  ...  G^T(kM+L-1) ]^T   (4)

A(k) is an L \times M matrix [10], as shown in (4), and the length of the vector G^T(kM) is M. Let the weight vector be

\hat{W}(k) = [ w_0(k)  w_1(k)  w_2(k)  ...  w_{M-1}(k) ]^T   (5)

The output vector of the filter is the product of A(k) and the weight vector \hat{W}(k), written in (4) and (5) respectively,

[ y(kL)  y(kL+1)  y(kL+2)  ...  y(kL+L-1) ]^T = A(k) \cdot \hat{W}(k)   (6)

For the individual elements,

y(kL)   = G^T(kM) \cdot \hat{W}(k)
y(kL+1) = G^T(kM+1) \cdot \hat{W}(k)
...
y(kL+i) = G^T(kM+i) \cdot \hat{W}(k)
        = [ w_0(k)  w_1(k)  w_2(k)  ...  w_{M-1}(k) ] [ u(kL+i)  u(kL+i-1)  u(kL+i-2)  ...  u(kL+i-M+1) ]^T
        = \sum_{j=0}^{M-1} w_j(k) u(kL+i-j)   (7)

y(kL+i) is the i-th output element described by (7). Let the desired response of the (kL+i)-th element be d(kL+i); the error signal is then e(kL+i) = d(kL+i) - y(kL+i). In matrix form,

e(k) = [ d(kL)  d(kL+1)  d(kL+2)  ...  d(kL+L-1) ]^T
       - [ y(kL)  y(kL+1)  y(kL+2)  ...  y(kL+L-1) ]^T
       = [ e(kL)  e(kL+1)  e(kL+2)  ...  e(kL+L-1) ]^T   (8)

The cross-correlation vector \nabla(k) of (9) is the product of the transpose of the data matrix, A^T(k), with the error vector e(k) of (8):

\nabla(k) = A^T(k) e(k)   (9)

The update equation (10) of the weight vector is obtained by adding a constant multiple of the cross-correlation vector:

\hat{W}(k+1) = \hat{W}(k) + \mu \nabla(k)   (10)
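A compact sketch of one time-domain block update, Eqs. (6)-(10), might look as follows; the function and its argument names are hypothetical choices for illustration, and it assumes kL >= M-1 so that the required past samples exist.

```python
import numpy as np

def block_lms_update(u, d, w, k, L, mu):
    """One block update following Eqs. (6)-(10).
    u: input sequence, d: desired sequence, w: current M-tap weight vector,
    k: block index, L: block length, mu: step size. Assumes k*L >= M-1."""
    M = len(w)
    # Data matrix A(k), Eq. (4): row i holds u(kL+i), u(kL+i-1), ..., u(kL+i-M+1)
    A = np.array([u[k*L + i - M + 1 : k*L + i + 1][::-1] for i in range(L)])
    y = A @ w                              # block output, Eq. (6)
    e = d[k*L : k*L + L] - y               # block error, Eq. (8)
    grad = A.T @ e                         # cross-correlation vector, Eq. (9)
    return w + mu * grad                   # updated weight vector, Eq. (10)
```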



Now, using \hat{W}(k+1), the updated output vector is

[ y(kL+L)  y(kL+L+1)  ...  y(kL+2L-1) ]^T = A(k+1) \cdot \hat{W}(k+1)   (11)

The updated error matrix [12]-[13] is

e(k+1) = [ d(kL+L)  d(kL+L+1)  ...  d(kL+2L-1) ]^T
         - [ y(kL+L)  y(kL+L+1)  ...  y(kL+2L-1) ]^T
         = [ e(kL+L)  e(kL+L+1)  ...  e(kL+2L-1) ]^T   (12)

Using (12), the updated cross-correlation vector follows exactly as before [12]-[13],

\nabla(k+1) = A^T(k+1) e(k+1)   (13)

and, iteratively, the update equation of the weight vector [12]-[13] is

\hat{W}(k+2) = \hat{W}(k+1) + \mu \nabla(k+1)   (14)

Similarly, continuing the above procedure for the (k+i)-th term, the output vector is given by (15), where i is the delay in processing the next input block [11]:

[ y(kL+iL)  y(kL+iL+1)  ...  y(kL+(i+1)L-1) ]^T = A(k+i) \cdot \hat{W}(k+i)   (15)

e(k+i) = [ d(kL+iL)  d(kL+iL+1)  ...  d(kL+(i+1)L-1) ]^T
         - [ y(kL+iL)  y(kL+iL+1)  ...  y(kL+(i+1)L-1) ]^T
         = [ e(kL+iL)  e(kL+iL+1)  ...  e(kL+(i+1)L-1) ]^T   (16)

In generalized form, the cross-correlation for the (k+i)-th term is

\nabla(k+i) = A^T(k+i) e(k+i)   (17)

and the updated weight vector (using (16) and (17)) is

\hat{W}(k+i+1) = \hat{W}(k+i) + \mu \nabla(k+i)   (18)

In the overlap-save method an N = 2M point FFT is used, where M is the number of filter tap weights. The input signal in the time ('n') domain is arranged as [(k-1)-th block, k-th block], i.e. the previous block of input samples followed by the current block. The input signal in the frequency ('k') domain is then

U(k) = diag{ FFT[ (k-1)-th block, k-th block ] },  an N \times N matrix.

The initial weight vector in the 'k' domain is an N \times 1 vector,

\hat{W}(k) = FFT[ \hat{w}(k) ; O ]   (19)

Here \hat{w}(k) is the tap-weight vector (in the 'n' domain) of length M and O is a null vector of length M. The output signal of the filter in the 'k' domain is

\hat{y}(k) = last M elements of IFFT[ U(k) \hat{W}(k) ]

The desired response vector is

d(k) = [ d(kM)  d(kM+1)  ...  d(kM+M-1) ]^T   (20)

and the error signal vector in the 'n' domain is

e(k) = [ e(kM)  e(kM+1)  ...  e(kM+M-1) ]^T = d(k) - \hat{y}(k)   (21)

The error signal in the frequency domain is

E(k) = FFT[ O ; e(k) ]   (22)

The cross-correlation vector is

\nabla(k) = first M elements of IFFT[ U^H(k) E(k) ]

and the updated tap weights are

\hat{W}(k+1) = \hat{W}(k) + \mu FFT[ \nabla(k) ; O ]   (23)

Continuing this technique, all the vector coefficients are updated for each (k+i)-th term.
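The overlap-save recursion of Eqs. (19)-(23) can be sketched in Python/NumPy roughly as below; here the block length equals the tap count M, and the signal names, the default M = 256 and mu = 0.1, and the function name are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np

def fdaf_overlap_save(u, d, M=256, mu=0.1):
    """Overlap-save frequency-domain adaptive filter sketch, Eqs. (19)-(23).
    u: reference (far-end) signal, d: desired (microphone) signal, M: taps."""
    N = 2 * M                                   # FFT length, N = 2M
    W = np.zeros(N, dtype=complex)              # W(k) = FFT[w_hat(k); O], starts at zero
    n_blocks = len(u) // M
    e_out = np.zeros(len(u))
    for k in range(1, n_blocks):
        seg = u[(k - 1) * M : (k + 1) * M]                  # [(k-1)-th block, k-th block]
        U = np.fft.fft(seg)                                  # diagonal of U(k)
        y = np.real(np.fft.ifft(U * W))[M:]                  # last M elements of IFFT[U(k)W(k)]
        e = d[k * M : (k + 1) * M] - y                       # error block, Eq. (21)
        E = np.fft.fft(np.concatenate([np.zeros(M), e]))     # E(k) = FFT[O; e(k)], Eq. (22)
        grad = np.real(np.fft.ifft(np.conj(U) * E))[:M]      # first M elements of IFFT[U^H E]
        W += mu * np.fft.fft(np.concatenate([grad, np.zeros(M)]))  # Eq. (23)
        e_out[k * M : (k + 1) * M] = e
    return e_out, W
```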

C. Theory and Concept of Kalman Filter

The Kalman filter is an adaptive least-squares error filter which is distinct from other adaptive filters due to its state-space concepts and recursive features [15]. It is an efficient algorithm for estimating a signal in the presence of Gaussian noise and for continuously updating the best estimate with the current state of the system. In particular, each updated estimate is computed from the previous best estimate and the current input data, whether the environment is stationary or non-stationary. The Kalman filter has a wide range of practical application areas such as aerospace, marine navigation, inertial navigation, global positioning, nuclear power plants and many others [17]-[19].

One of the great features of the Kalman filter is that it is recursive. Predicting the system state from all previously observed data would require a large amount of storage; the Kalman filter, however, stores only the best estimate calculated from the previous estimate and uses it to update the current estimate for future use [16]. This requires a small amount of memory and upholds the recursive property. The whole process is divided into two parts:


first, the state is predicted with the dynamic model; this estimate is in turn corrected by the second part, the corrector. Hence the error covariance is minimized [15]-[16]. Let us assume observed random variables y(1), y(2), y(3), ..., y(n-1), starting at 1 and ending at n-1. \hat{x}(n-1 | y_{n-1}) is the minimum mean-square estimate of a related zero-mean random variable x(n-1). If a new observation y(n) arrives, we wish to compute the updated estimate \hat{x}(n | y_n). This computation could be done from all the past observations y(1), y(2), y(3), ..., y(n-1) together with the new observation y(n). The Kalman filter, however, provides a recursive estimation procedure in which we store only the previous best estimate \hat{x}(n-1 | y_{n-1}) [15]-[16], [18] and reuse it for the current estimate \hat{x}(n | y_n) given the current observation y(n). Hence the Kalman filter requires less storage and is more effective.

The Kalman filter uses feedback control to estimate the process state: at any time the filter estimates the process state and then obtains feedback from noisy measurements. The basic Kalman filter operation is thus segmented into two parts: the time update and the measurement update [18]. The time-update section predicts the current state ahead in time, together with the error covariance, to obtain the a priori estimate for the next time step. This operation is defined by the process equation

x(n+1) = F(n+1, n) x(n) + v_1(n)   (24)

E[ v_1(n) v_1^H(n) ] = Q_1(n)   (25)

x(n+1) in (24) is the M-dimensional state. F(n+1, n) is the transition matrix. v_1(n) is the M x 1 process-noise vector and Q_1(n) is its M x M covariance. The measurement-update equation describes the N-dimensional observation vector y(n) introduced at the input. The equation is written as follows:

y(n) = C(n) x(n) + v_2(n)   (26)

E[ v_2(n) v_2^H(n) ] = Q_2(n)   (27)

Here C(n) in (26) is the N x M measurement matrix and v_2(n), the measurement noise, is an N x 1 vector; Q_2(n) is the N x N covariance of v_2(n). The resulting measurement-update equations are also called the corrector equations: they correct the a priori estimate to produce the a posteriori estimate for the next step [15].

Taking into account both the predictor and corrector operators, the Kalman filter algorithm is also denoted the predictor-corrector algorithm [15], shown in Fig.2. The predictor-corrector works recursively.

Figure.2: The Predictor-Corrector algorithm for

recursive minimum mean-square estimation.

The key target of the Kalman algorithm is to determine the mean-square estimate \hat{x}(n | y_n). Mathematically,

\hat{x}(n | y_n) = minimum mean-square estimate of x(n) given the observed data y(1), y(2), y(3), ..., y(n)

In fact, the estimate \hat{x}(n | y_n) is a linear combination (28) of the forward prediction errors, known as innovations, \alpha(1), \alpha(2), ..., \alpha(n):

\hat{x}(n | y_n) = \sum_{k=1}^{n} b_k \alpha(k)   (28)

where b_k is chosen to minimize the mean-square estimation error x(n) - \hat{x}(n | y_n), giving (29):

b_k = E[ x(n) \alpha^*(k) ] / E[ \alpha(k) \alpha^*(k) ]   (29)

At time k = n,

\hat{x}(n | y_n) = \sum_{k=1}^{n-1} b_k \alpha(k) + b_n \alpha(n)   (30)

The summation on the right-hand side of (30) is functionally the previous estimate \hat{x}(n-1 | y_{n-1}) and b_n \alpha(n) is the correction term. So the current estimate can be expressed as

\hat{x}(n | y_n) = \hat{x}(n-1 | y_{n-1}) + b_n \alpha(n)   (31)

D. Mathematical Formulation of Kalman Filter

The one-step predictor is the key to the mathematical formulation of the Kalman filter. It predicts the state after each new observation y(n) is introduced. It then calculates the error covariance of the prediction and updates the mean-square estimate for the next operation. Meanwhile, the Kalman gain G(n) provides the correction to the estimate based on the prediction error. The essential Kalman variables involved in the operation, along with their definitions, are shown in Table 1.


TABLE 1
DEFINITION OF KALMAN FILTER VARIABLES

Variable               Definition
x(n)                   State at time n
y(n)                   Observation at time n
F(n+1, n)              Transition matrix from time n to time n+1
C(n)                   Measurement matrix at time n
Q_1(n)                 Correlation matrix of the process noise v_1(n)
Q_2(n)                 Correlation matrix of the measurement noise v_2(n)
\hat{x}(n | y_{n-1})   Predicted estimate of the state at time n given the observations y(1), y(2), y(3), ..., y(n-1)
\hat{x}(n | y_n)       Filtered estimate of the state at time n given the observations y(1), y(2), y(3), ..., y(n)
G(n)                   Kalman gain at time n
\alpha(n)              Innovation vector at time n
R(n)                   Correlation matrix of the innovation vector \alpha(n)
K(n, n-1)              Correlation matrix of the error in \hat{x}(n | y_{n-1})
K(n)                   Correlation matrix of the error in \hat{x}(n | y_n)

Using the system variables listed above, the one-step prediction [18] can be performed algorithmically as follows. For any given time n+1:

Input vector process: given observations = {y(1), y(2), y(3), ..., y(n)}.
Known parameters: transition matrix F(n+1, n), measurement matrix C(n), correlation matrix of the process noise Q_1(n), correlation matrix of the measurement noise Q_2(n).

Initial conditions: set the initial state vector at time n = 1, where only the observation y_0 exists,

\hat{x}(1 | y_0) = E[ x(1) ]

and initialize the correlation matrix of the error at time n = 1,

K(1, 0) = E[ (x(1) - E[x(1)]) (x(1) - E[x(1)])^H ] = \Pi_0

For n = 1, 2, 3, ... the following series of operations is performed sequentially.

The Kalman gain equation is

G(n) = F(n+1, n) K(n, n-1) C^H(n) R^{-1}(n)   (32)

where the error correlation matrix K(n, n-1) is defined as the expectation of the outer product of the predicted state-error vector of the previous time unit,

K(n, n-1) = E[ \epsilon(n, n-1) \epsilon^H(n, n-1) ]   (33)

\epsilon(n, n-1) being the predicted state-error vector at time n. The inverse of the correlation matrix of the innovations, R^{-1}(n), is obtained from the previous error correlation matrix K(n, n-1) of (33) and the measurement matrix C(n), in the presence of the Gaussian measurement noise (white noise with a Gaussian distribution). Explicitly,

R^{-1}(n) = [ C(n) K(n, n-1) C^H(n) + Q_2(n) ]^{-1}   (34)

The innovation, i.e. the new information carried by the observation y(n), is

\alpha(n) = y(n) - C(n) \hat{x}(n | y_{n-1})   (35)

The time-state update for one unit ahead is the product of the transition matrix F(n+1, n) and the previous state estimate \hat{x}(n | y_{n-1}), plus the gain G(n) applied to the innovation \alpha(n). This is the basic prediction equation:

\hat{x}(n+1 | y_n) = F(n+1, n) \hat{x}(n | y_{n-1}) + G(n) \alpha(n)   (36)

The corrected error correlation matrix of the current time is obtained by (37):

K(n) = K(n, n-1) - F(n, n+1) G(n) C(n) K(n, n-1)   (37)

In the final part of the calculation, we update the error correlation matrix for the next step, K(n+1, n); as an integral part of this, the white Gaussian process-noise covariance Q_1(n) is added to it, as shown in (38):

K(n+1, n) = F(n+1, n) K(n) F^H(n+1, n) + Q_1(n)   (38)
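A direct transcription of one recursion of Eqs. (32)-(38) is sketched below; the function name, the array shapes and the use of a matrix inverse for F(n, n+1) (the reverse transition) are our assumptions for illustration, with F, C, Q_1 and Q_2 taken as constant.

```python
import numpy as np

def kalman_one_step(x_pred, K_pred, y, F, C, Q1, Q2):
    """One recursion of the one-step predictor, Eqs. (32)-(38).
    x_pred = x_hat(n | y_{n-1}), K_pred = K(n, n-1); returns the pair for time n+1."""
    R = C @ K_pred @ C.conj().T + Q2                     # innovation covariance, Eq. (34)
    G = F @ K_pred @ C.conj().T @ np.linalg.inv(R)       # Kalman gain, Eq. (32)
    alpha = y - C @ x_pred                               # innovation, Eq. (35)
    x_next = F @ x_pred + G @ alpha                      # predicted state for n+1, Eq. (36)
    # Corrected error covariance, Eq. (37); F(n, n+1) taken as the inverse of F(n+1, n)
    K_filt = K_pred - np.linalg.inv(F) @ G @ C @ K_pred
    K_next = F @ K_filt @ F.conj().T + Q1                # predicted covariance, Eq. (38)
    return x_next, K_next
```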

IV. ACOUSTIC ECHO CANCELLATION AND EXPERIMENTAL SETUP

Acoustic echo cancellation is noise cancellation of a recorded speech signal. Speech played from the loudspeaker returns to the microphone as an echo, reflected around the room and mingled with the original speech signal. The speaker's signal at the microphone input is therefore non-uniform and distorted. The adaptive filter works on this distorted signal, which is the signal of our interest. While processing it, the filter produces the best estimate of the echo component; subtracting this component from the microphone signal solves the problem. Concurrently, an error signal is generated reflecting the difference between the actual signal and our approximation, and the filter coefficients are updated according to the change in the error signal before the next block of input is processed. AEC is indispensable when the loudspeaker is close to the microphone. For instance, devices such as hands-free mobile phones, and videoconferencing software such as Skype, Marratech, NFESIS and iVisit, use their own AEC.

Experimentally, the signals of our interest can be subdivided into three parts:

A. Near End Speech Signal (NES)

The signal that originates from the user participating in the teleconference and reaches the microphone directly is the near-end speech signal. Its amplitude is high at the start but degrades gradually.

B. Far End Speech Signal (FES)

The signal that travels out from the loudspeaker, bounces around the room and returns to the microphone, adding noise to the microphone input signal.

C. Microphone Signal (MS)

The microphone captures both the near-end and far-end speech signals. The target of our echo canceller is to remove the far-end signal from the microphone signal and transmit only the near-end speech to the distant user participating in the teleconference.

D. Experimental Setup

The signal entering the microphone of a hall room is the direct speech signal d(n) of the speaker plus the echo signal \hat{d}(n) that arises from reflections off the walls, as shown in Fig.3. The direct speech signal is called the near-end signal and the echo signal is called the far-end signal. The combined input signal of the microphone is d(n) + \hat{d}(n). The objective of the echo canceller is to remove the far-end signal so that only the near-end signal is transmitted. The path or channel between the loudspeaker and the microphone is represented by a long finite impulse response.

Figure.3: Room environment of Acoustic Echo Canceller
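As a rough sketch of the Fig.3 signal model (the function name and the use of scipy.signal.lfilter are illustrative assumptions), the echo is obtained by passing the loudspeaker signal through the room's FIR echo path and adding it to the near-end speech at the microphone:

```python
import numpy as np
from scipy.signal import lfilter

def microphone_signal(near_end, far_end, room_ir):
    """Microphone input for the Fig. 3 setup: d(n) plus the echo d_hat(n)."""
    echo = lfilter(room_ir, 1.0, far_end)   # d_hat(n): far-end signal through the echo path
    return near_end + echo, echo            # combined microphone signal and the echo alone
```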

V. RESULT AND SIMULATION

Our basic room setup incorporates a microphone system located at the middle of the room and loudspeakers around the room, each 6.70 meters from the microphone. After the basic room environment is set up, our system is ready to simulate. The basic room impulse response must be calculated first, since this response is the major source of noise in our experimental signal. For this purpose we used a Chebyshev filter, assumed to be of fourth order with pass-band frequency range 0.1 < Wn < 0.7. In addition, as an inherent part of our designed filter, the sampling frequency was assumed to be fs = 8000 samples/s and the number of time samples M = 4001. The stop-band ripple of the room impulse response was taken as 20 dB. The time domain and frequency domain illustrations (Fig.4 and Fig.5) of the room impulse response are:

Figure. 4: Room impulse response (Frequency Domain)

Figure.5: Room impulse response (Time Domain)
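A hedged sketch of how such a room impulse response could be generated with the stated parameters (4th-order Chebyshev band-pass, 0.1 < Wn < 0.7, 20 dB stop-band ripple, fs = 8000 samples/s, M = 4001) is shown below; driving scipy.signal.cheby2 with a unit impulse is our assumption of the construction, not necessarily the authors' exact procedure.

```python
import numpy as np
from scipy.signal import cheby2, lfilter

fs = 8000                                             # sampling frequency, samples/s
M = 4001                                              # number of time samples
b, a = cheby2(4, 20, [0.1, 0.7], btype='bandpass')    # 4th order, 20 dB stop-band ripple
impulse = np.zeros(M)
impulse[0] = 1.0
room_ir = lfilter(b, a, impulse)                      # truncated room impulse response
```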

The AEC system depends upon several interdependent factors. To analyze the performance and compare the filters, it is necessary to initialize the four factors of AEC: reverberation time, filter length, step size (Mu) and signal-to-noise ratio (SNR). These factors are discussed below along with the parameters of our designed system:

A. Reverberation Time

Suppose we have to cancel an amount of 20 dB from the noisy signal. Reverberation time is the time it takes sound to decay by 60 dB from its initial value after the sound source stops. In our AEC system the loudspeaker is placed at a distance of 670 cm, and sound travels at about 340 m/s in air, so it takes roughly 6.7 m / 340 m/s = 0.0197 s, i.e. about 20 ms, to travel from the speaker to the microphone. Suppose the reverberation time of our room is 6000 ms. We first have to initialize the filter length based on the room setup and impulse response.

B. Filter Length

The most elementary factor is the filter length. Since estimating the echo is the essential step in cancelling it from the microphone signal, choosing the filter length accordingly is inevitable. Since the reverberation time is 6000 ms, for the signal to decay by 20 dB the room impulse response has to be at least 2000 ms long. Our sampling frequency is 8 kHz, so the filter length should be at least 2 s * 8000 samples/s = 16000 coefficients. The speech signal that we used has a filter length of 25000.
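The arithmetic behind these choices can be written out directly (values as stated above):

```python
distance_m = 6.7                         # loudspeaker-to-microphone distance (670 cm)
speed_of_sound = 340.0                   # m/s
delay_s = distance_m / speed_of_sound    # ~0.0197 s, i.e. about 20 ms propagation delay

t60_s = 6.0                              # reverberation time: 60 dB decay in 6000 ms
t20_s = t60_s * 20.0 / 60.0              # time to decay 20 dB -> 2 s
fs = 8000                                # sampling frequency, samples/s
min_filter_length = int(t20_s * fs)      # 2 s * 8000 samples/s = 16000 coefficients
```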

C. Step Size (Mu)

Step size is a variable that controls how large a correction the filter applies when it reads and processes a block of input samples. A reasonable Mu is chosen according to the room impulse response; it governs the convergence rate. The smaller the step size, the more accurate the filtered output, although adaptation is slower. It should lie in the range 0 < Mu < 1.

D. Signal to Noise Ratio (SNR)

SNR indicates how much noise exists in the signal of interest. The lower the SNR, the more difficulty the filters have in converging.
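For the test cases below, noise at a given SNR can be added to a clean signal with a small helper like the following (an illustrative sketch, not the paper's code):

```python
import numpy as np

def add_noise(signal, snr_db):
    """Add white Gaussian noise so the result has approximately the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = np.sqrt(noise_power) * np.random.randn(len(signal))
    return signal + noise
```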

E. Test Case

Here we examine and observe the simulation results of the three filters while varying Mu and SNR. The ultimate goal of all three filters is to keep the mean square error (MSE) and misadjustment (Madj) rates to a minimum. For each case, the time-domain depiction and the spectrogram are shown. A spectrogram is an illustration of a signal as points on three axes: the X-axis shows time, the Y-axis the frequency and the Z-axis the amplitude. Hence the spectrogram provides a visualization of the sound.
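Such a spectrogram view can be produced, for instance, with scipy.signal.spectrogram and matplotlib (an illustrative sketch; the dB scaling of the colour axis is our choice):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import spectrogram

def plot_spectrogram(x, fs=8000, title="signal"):
    """Time on the x-axis, frequency on the y-axis, amplitude (in dB) as the colour axis."""
    f, t, Sxx = spectrogram(x, fs=fs)
    plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12))
    plt.xlabel("Time [s]")
    plt.ylabel("Frequency [Hz]")
    plt.title(title)
    plt.colorbar(label="Power [dB]")
    plt.show()
```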

For all three filters, several test cases were considered with varying Mu and SNR. Here we portray and highlight two test cases that are precise and appropriate for the analysis. The first test case used Mu = 0.025 and SNR = 45, and the second was undertaken with Mu = 0.25 and SNR = 35.

F. LMS Case

The resulting observations and illustrations for the LMS filter with the first test case are shown below (Fig.6 and Fig.7). In the time domain illustration, the first block shows the clean speech signal, the second shows the same signal in the presence of noise, and the last block shows the filtered signal. Fig.7 shows the same phenomenon in the spectrogram view. In the third block we observe that the filtered signal is a good reproduction of the original waveform.

Figure.6: Time domain illustration of original signal,

microphone signal and filtered signal

Figure.7: Spectrogram illustration of original signal,

microphone signal and filtered signal

Increasing Mu and decreasing the SNR increases the minimum mean square error (MMSE). Although increasing the step size markedly reduces the execution time (ET), the misadjustment rate (Madj) moves upward. The ET values are listed in Table 2. By increasing Mu and decreasing the SNR we obtain the following output (Fig.8 and Fig.9). The two figures show that the filtered signal is much noisier than in the previous observation.


Figure.8: Time domain illustration of original signal,

microphone signal and filtered signal.

Figure.9: Spectrogram illustration of original signal,

microphone signal and filtered signal

G. BFDAF Case

As BFDAF processes the signal block by block, it waits a while to accumulate a block of predefined size before processing, so setting this amount of time is sensitive for the filter. It takes about 20 ms for the speech signal to traverse the air and reach the microphone; hence BFDAF should skip 0.02 s * 8000 samples/s = 160 samples. This means that after 160 coefficients BFDAF takes the next block of input. The noise-cancelled signal along with the noisy signal is shown in Fig.10 and Fig.11. As with LMS, the third block of the two diagrams depicts the filtered signal. Although the execution time is not very small, the result is good. The ET and Madj are given in Table 3.

Figure.10: Time domain illustration of original signal,

microphone signal and filtered signal.

Figure.11: Spectrogram illustration of original signal,

microphone signal and filtered signal.

After increasing the noise (decreasing the SNR by 10) and increasing the step size (to 0.25), we observed that the larger step size gives a smaller execution time but poorer convergence. Besides, the increased noise accelerates the divergence, so the filtered signal appears noisier. Fig.12 and Fig.13 depict the effects of this change.


Figure.12: Time domain illustration of original signal,

microphone signal and filtered signal.

Figure.13: Spectrogram illustration of original signal, microphone signal and filtered signal.

H. Kalman Case

The Kalman filter delivers good convergence but takes a long execution time. The simulation results of the Kalman filter at Mu = 0.025 and SNR = 45 are shown in Fig.14 and Fig.15 respectively. The two figures emphasize its very good convergence, although the execution time is high. The relevant ET and Madj are recorded in Table 4.

Figure.14: Time domain illustration of original signal,

microphone signal and filtered signal.

Figure.15: Spectrogram illustration of original signal,

microphone signal and filtered signal.

The performance of the Kalman filter is satisfactory even when we decrease the SNR and increase Mu. The ET, however, remains large in comparison with that of LMS and BFDAF.


Figure.16: Time domain illustration of original signal,

microphone signal and filtered signal.

Figure.17: Spectrogram illustration of original signal,

microphone signal and filtered signal.

I. Analysis

After the pictorial representation we turn to the tabular data, where the execution time (ET), minimum mean square error (MMSE) and misadjustment (Madj) are recorded against the changes of SNR and Mu. The tabular data for each filter are given below:

TABLE 2
ET, MMSE AND Madj WITH VARYING Mu AND SNR (LMS)

SNR   Mu     ET (s)   MMSE       Madj
45    0.025  1.228    -0.00566   -1
45    0.25   1.106    -0.00562   -1
35    0.025  1.012    -0.00258   -1
35    0.25   1.011    -0.00253   -1

Table 2 shows that, for both SNR = 45 and SNR = 35, ET becomes smaller as the step size is increased. It is noticeable that MMSE increases with increasing noise (decreasing SNR) and with increasing Mu. Despite all this, there is no significant change in Madj for LMS. This is because the LMS filter works with each input sample in isolation and takes instantaneous estimates.

TABLE 3
ET, MMSE AND Madj WITH VARYING Mu AND SNR (BFDAF)

SNR   Mu     ET (s)   MMSE        Madj
45    0.025  3.586    -0.000412   -1.0006
45    0.25   1.079    -0.0198     -1.0003
35    0.025  1.313    -0.0037     -0.9999
35    0.25   1.125    -0.8876     -0.9989

According to Table 3, when SNR was 45 and Mu was 0.025, the ET was 3.586 s. With the increase of the step size, ET falls while MMSE and Madj rise. Similar behaviour is found at SNR = 35, but due to the good adaptability of BFDAF it took less ET.

TABLE 4
ET, MMSE AND Madj WITH VARYING Mu AND SNR (KALMAN FILTER)

SNR   Mu     ET (s)   MMSE        Madj
45    0.025  50.12    -0.000385   -1.2503
45    0.25   35       -0.000315   -1.012
35    0.025  32.38    -0.0049     -0.8897
35    0.25   30.46    -1.2701     -0.8101

Table 4 illustrates the behaviour of these factors in the case of the Kalman filter. Like BFDAF, the Kalman filter also shows higher MMSE and Madj with decreasing SNR and increasing Mu. Unlike BFDAF, however, the Kalman filter incurs a significant execution time. At SNR = 45 and Mu = 0.025, ET was 50.12 seconds, MMSE = -0.000385 and Madj = -1.2503. As the step size of the filter is increased it should naturally take less execution time; indeed, ET went down to 35 seconds while MMSE and Madj changed to -0.000315 and -1.012 respectively.

From these three observations we can state that at SNR = 45 the lowest MMSE, -0.000385, belongs to the Kalman filter, although its ET of 50.12 s diminishes its acceptability. At this stage BFDAF performs better than LMS except in the case of ET: LMS takes an execution time of 1.228 s, which is well under the 3.586 s of BFDAF. Madj is almost alike, -1 for LMS and -1.0006 for BFDAF, while MMSE is better for BFDAF than for LMS. As we descend the tables with decreasing SNR and increasing Mu, the Kalman filter behaves well in all factors except ET. The performance of BFDAF is satisfactory and acceptable, as it delivers lower MMSE and Madj as long as the signal quality is not degraded too much and Mu is not raised too much. LMS also shows good performance, but there is no change in Madj. At SNR = 35 and Mu = 0.25, ET = 1.125 s, MMSE = -0.8876 and Madj = -0.9989 for BFDAF; for LMS, ET = 1.011 s, MMSE = -0.00253 and Madj = -1. These results tell us that LMS incurs less ET in all cases, although it provides no change in Madj, whereas BFDAF takes a little more ET at the inception of the process but, even with more noise and a larger step size, adapts itself intelligently.

VI. DISCUSSION


Finally, our experiment leads to a sound conclusion about the choice of digital adaptive filters. From several aspects we can conclude with a detailed discussion based on the findings of the results section. Although it contributes greatly to many heavily noisy real-life systems where prediction is inherent, the Kalman filter incurs a large execution time, which is not conducive to an AEC system: a participant in a conference will not wait that long to hear the sender's voice. Whether to choose the LMS filter or the BFDAF filter depends on several considerations. As the LMS filter uses instantaneous estimation, its misadjustment rate is almost unchanging. BFDAF takes a little more time at first to adapt, but in the end it shows good convergence; with a good choice of step size, BFDAF demonstrates a lower misadjustment rate and MMSE. If it is known in advance that the AEC environment will be less noisy and only a little execution time is available, it is desirable to use LMS; with a good selection of step size and a reasonable execution time budget, BFDAF is recommended. Besides, block processing requires less memory storage for the signal, so for very long impulse responses BFDAF is very efficient. In the end, we can deduce that both BFDAF and LMS can be applied to AEC, depending on the application area. This paper gives a transparent understanding of how to choose an adaptive filter in the context of AEC. However, Double Talk Detection (DTD), which might be a performance factor in AEC when two participants try to speak at the same time, is not considered in this paper. A future direction of this work is to analyze the practical implementation of these algorithms in the case of DTD and to enhance the algorithms if the desired results are not achieved.

REFERENCES

[1] D. Molkdar, “Review on radio propagation into and within

buildings,” IEEE Proc. H, vol. 138, pp. 61–73, February

1991.

[2] Zhiwei Zeng, “Digital Communication via Multipath

Fading Channel”, Cpre537x Final Project, pp.17-22,

November 2000.

[3] J. K. Cavers and P. Ho, "Analysis of the error performance of trellis coded modulations in Rayleigh fading channels," IEEE Trans. Commun., vol. 40, pp. 74-80, January 1992.

[4] P. Yegani and C. McGillem, "A statistical model for the factory radio channel," IEEE Trans. Commun., vol. COM-39, pp. 1445-1454, October 1991.

[5] N. J. Bershad and P. L. Feintuch, "Non-Wiener solutions for the LMS algorithm—A time domain approach," IEEE Trans. Signal Processing, vol. 43, pp. 1273-1275, May 1995.

[6] E. R. Ferrara, Jr., “Fast Implementation of LMS Adaptive

Filter,” IEEE Trans. Acoust., Speech, Signal Processing,

vol. ASSP-28, pp. 474-475, 1980.

[7] M. J. Shensa, "Non-Wiener solutions for the adaptive canceller with a noisy reference," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 468-473, August 2000.

[8] P. Clarkson and P. White, “Simplified analysis of the LMS

adaptive filter using a transfer function approximation,”

IEEE Trans. Acoust., Speech, Signal Processing, vol.

ASSP-35, pp. 987–993, July 1987.

[9] Emmanuel C. Ifeachor, Barrie W. Jervis, “Digital Signal

Processing – A practical approach”, Addison-Wesley,

ISBN 0 201 54413 X, pp.104-5, pp.541 – 552, September

2001.

[10] D. Mansour and A. H. Gray, "Unconstrained Frequency Domain Adaptive Filter," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 726-734, 1982.

[11] J. S. Soo and K. K. Pang, "Multidelay Block Frequency Domain Adaptive Filter," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-38, pp. 373-376, 1990.

[12] Satoru Emura, Yoichi Haneda, and Shoji Makino, "Enhanced Frequency-Domain Adaptive Algorithm for Stereo Echo Cancellation," IEEE Trans., vol. II, pp. 1901-03, July 2001.

[13] N. K. Jablon, "On the complexity of Frequency Domain Adaptive Filtering," IEEE Trans. on Signal Processing, vol. 39, pp. 2331-2334, October 1991.

[14] E. W. Harris and C.D.M.a.B.F.A., "A variable step (VS) adaptive filter algorithm," IEEE Transactions on Biomedical Engineering, vol. 34, pp. 309-316, 1986.

[15] Greg Welch and Gary Bishop, "An Introduction to the Kalman Filter," TR 95-041, Department of Computer Science, University of North Carolina at Chapel Hill, pp. 3-7, July 2006.

[16] H. W. Sorenson, Kalman Filtering: Theory and Application, IEEE Press, New York, pp. 115-118, 1985.

[17] A. H. Mohamed, K. P. Schwarz, "Adaptive Kalman Filtering for INS/GPS," pp. 7-11, December 1998.

[18] Simon Haykin, "Adaptive Filter Theory," Fourth Edition, Pearson Education, ISBN 81-7808-565-8, pp. 231-237, pp. 345-48, pp. 466-485, September 2001.

[19] Averil B. Chatfield, "Fundamentals of high accuracy inertial navigation. Progress in astronautics and aeronautics," AIAA, No. V-174: (800), pp. 189-192, September 1997.

[20] J. Benesty, F. Amand, A. Gilloire, Y. Grenier, "Adaptive filtering algorithms for stereophonic acoustic echo cancellation," ICASSP-95, International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 3099-3102, 1995.

[21] Harsha I. K. Rao, Behrouz Farhang-Boroujeny, "Fast LMS/Newton algorithms for stereophonic acoustic echo cancellation," vol. 8, pp. 2919-2930, August 2009.
