Adaptive Filter Theory
Sung Ho Cho
Hanyang University, Seoul, Korea
(Office) +82-2-2220-0390  (Mobile) +82-10-5412-5178
[email protected]
Table of Contents
Wiener Filters
Gradient Search by Steepest Descent Method
Stochastic Gradient Adaptive Algorithms
Recursive Least Squares (RLS) Algorithm
Wiener Filters
Filter-Optimization Problem
Wiener Filtering: A priori knowledge of the signal statistics, or at least estimates of them, is required.
Complex and expensive hardware systems are necessary (particularly in nonstationary environments).
Adaptive Filtering: Complete knowledge of the signal statistics is not required.
Filter weights eventually converge to the optimum Wiener solutions for stationary processes.
Filter weights show tracking capability in slowly time-varying nonstationary environments.
Complex and expensive hardware systems are not, in general, necessary.
Wiener Filters (1/7)
Objectives: We want to design a filter that minimizes the mean-squared estimation error $E\{e^2(n)\}$ so that the estimated signal $\hat d(n)$ best approximates the desired signal $d(n)$.

[Block diagram: the reference signal $x(n)$ drives a filter with coefficients $h_i$, $0 \le i \le N-1$; the estimated signal is $\hat d(n) = \sum_{i=0}^{N-1} h_i\, x(n-i)$, and the estimation error is $e(n) = d(n) - \hat d(n)$.]
Wiener Filters (2/7)
Basic Structure:
$e(n) = d(n) - \sum_{i=0}^{N-1} h_i\, x(n-i) = d(n) - H^T X(n)$

[Block diagram: a tapped delay line ($z^{-1}$ elements) produces $x(n), x(n-1), \dots, x(n-N+1)$, which are weighted by $h_0, h_1, \dots, h_{N-1}$ and summed to form $\hat d(n)$, a linear combination of the current and past input signals; $\hat d(n)$ is subtracted from $d(n)$ to form $e(n)$.]
Wiener Filters (3/7)
Basic Assumptions: d(n) and x(n) are zero-mean and jointly wide-sense stationary.
Notations:
Filter Coefficient Vector: $H = [h_0, h_1, \dots, h_{N-1}]^T$
Reference Input Vector: $X(n) = [x(n), x(n-1), \dots, x(n-N+1)]^T$
Estimation Error Signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i\, x(n-i) = d(n) - H^T X(n)$
Autocorrelation Matrix: $R_{XX} = E\{X(n) X^T(n)\}$
Cross-correlation Vector: $R_{dX} = E\{d(n)\, X(n)\}$
Optimum Filter Coefficient Vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \dots, h_{N-1,opt}]^T$
Wiener Filters (4/7)
Performance Measure (Cost Function):
$\xi = E\{e^2(n)\} = E\left\{\left(d(n) - H^T X(n)\right)^2\right\} = E\{d^2(n)\} - 2 H^T R_{dX} + H^T R_{XX} H$

We now want to minimize $\xi$ with respect to $H$:
$\frac{\partial \xi}{\partial H} = -2 R_{dX} + 2 R_{XX} H = 0$

Wiener-Hopf Solution (1931): $R_{XX} H_{opt} = R_{dX}$, i.e., $H_{opt} = R_{XX}^{-1} R_{dX}$
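To connect the formula to data, here is a minimal NumPy sketch (not from the slides; the signal names, data length, and noise level are illustrative assumptions) that estimates $R_{XX}$ and $R_{dX}$ from samples and solves the Wiener-Hopf equation:

```python
import numpy as np

def wiener_filter(x, d, N):
    """Minimal sketch: estimate R_XX and R_dX from sample data and
    solve the Wiener-Hopf equation H_opt = R_XX^{-1} R_dX."""
    n_samples = len(x)
    # Stack of reference input vectors X(n) = [x(n), ..., x(n-N+1)]^T
    X = np.array([x[n - np.arange(N)] for n in range(N - 1, n_samples)])
    d_vec = d[N - 1:n_samples]
    R_xx = X.T @ X / len(d_vec)          # sample autocorrelation matrix
    R_dx = X.T @ d_vec / len(d_vec)      # sample cross-correlation vector
    return np.linalg.solve(R_xx, R_dx)   # H_opt (solve, rather than an explicit inverse)

# Hypothetical usage: identify a 4-tap system from noisy observations
rng = np.random.default_rng(0)
x = rng.standard_normal(10000)
h_true = np.array([0.1, 0.3, 0.5, 0.2])
d = np.convolve(x, h_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))
print(wiener_filter(x, d, N=4))          # should be close to h_true
```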
Wiener Filters (5/7)
Autocorrelation Matrix: $R_{XX} = E\{X(n) X^T(n)\}$

$R_{XX} = \begin{bmatrix} r_{xx}(0) & r_{xx}(1) & \cdots & r_{xx}(N-1) \\ r_{xx}(1) & r_{xx}(0) & \cdots & r_{xx}(N-2) \\ \vdots & \vdots & \ddots & \vdots \\ r_{xx}(N-1) & r_{xx}(N-2) & \cdots & r_{xx}(0) \end{bmatrix}$

$R_{XX}$ is symmetric and Toeplitz.
Is $R_{XX}$ invertible? Yes, almost always.
$R_{XX}$ is almost always a positive definite matrix.
A symmetric matrix A is called positive definite if $x^T A x > 0$ for every nonzero x. All the eigenvalues of A are positive.
The determinant of every principal submatrix of A is positive.
Since the determinant of A is not zero, A is invertible.
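As a small illustration of this invertibility argument (the matrix values below are made up for the example, not taken from the slides):

```python
import numpy as np

# A sample autocorrelation matrix is positive definite when all of its
# eigenvalues are positive, hence its determinant is nonzero and it is invertible.
R_xx = np.array([[1.0, 0.9, 0.7],
                 [0.9, 1.0, 0.9],
                 [0.7, 0.9, 1.0]])
eigvals = np.linalg.eigvalsh(R_xx)           # eigvalsh: eigenvalues of a symmetric matrix
print("positive definite:", np.all(eigvals > 0))
print("determinant:", np.linalg.det(R_xx))   # nonzero, so R_xx is invertible
```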
Wiener Filters (6/7)
Let $X_B(n)$ denote the vector obtained by rearranging the elements of $X(n)$ backward, i.e.,
$X_B(n) = [x(n-N+1), x(n-N+2), \dots, x(n)]^T$.
Then $E\{X_B(n)\, X_B^T(n)\} = R_{XX}$.

Cross-correlation Vector: $R_{dX} = E\{d(n)\, X(n)\} = [r_{dx}(0), r_{dx}(1), \dots, r_{dx}(N-1)]^T$

Minimum Estimation Error: $e_{min}(n) = d(n) - H_{opt}^T X(n)$
Wiener Filters (7/7)
Minimum Mean-Squared Estimation Error:
$\xi_{min} = E\{e_{min}^2(n)\} = E\left\{\left(d(n) - H_{opt}^T X(n)\right)^2\right\} = E\{d^2(n)\} - H_{opt}^T R_{dX} = E\{d^2(n)\} - H_{opt}^T R_{XX} H_{opt}$
Example:
[Figure: error surfaces for N = 1 ($\xi$ versus $h_0$, minimum $\xi_{min}$ at $h_{0,opt}$) and N = 2 ($\xi$ over $(h_0, h_1)$, minimum $\xi_{min}$ at $(h_{0,opt}, h_{1,opt})$).]
Orthogonality Principle:
[Figure: the desired signal $d(n)$ viewed as a vector; its projection onto the plane M is the estimate $\hat d(n)$, the minimum error $e_{min}(n)$ is orthogonal to M, and θ is the angle between $d(n)$ and the plane.]

The plane M is spanned by $X(n) = [x(n), x(n-1), \dots, x(n-N+1)]^T$, and $\hat d(n) = \sum_{i=0}^{N-1} h_i\, x(n-i)$ lies in M.

$e_{min}(n) \perp M$, i.e., $E\{e_{min}(n)\, X(n)\} = 0_N$

Perfect estimation is possible if θ = 0, and the estimation fails if θ = π/2.
Some Drawbacks of the Wiener Filter:
Signal statistics must be known a priori: we must know $R_{XX}$ and $R_{dX}$, or at least their estimates.
A matrix inversion operation is required, which imposes a heavy computational load.
Not suitable for real-time applications.
Situations get worse in nonstationary environments: we have to compute $R_{XX}(n)$ and $R_{dX}(n)$ at every time n, and we must repeat the matrix inversion at every time n.
Gradient Search by Steepest Descent Method
Steepest Descent Method (1/5)

Objectives: We want to design a filter in a recursive form in order to avoid the matrix inversion operation required in the Wiener solution.

[Block diagram: the same configuration as before, but with time-varying coefficients $h_i(n)$, $0 \le i \le N-1$; the estimate is $\hat d(n) = \sum_{i=0}^{N-1} h_i(n)\, x(n-i)$ and the error is $e(n) = d(n) - \hat d(n)$.]
Steepest Descent Method (2/5)

Basic Structure:
$e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$

[Block diagram: the tapped delay line as before, but with time-varying coefficients $h_0(n), h_1(n), \dots, h_{N-1}(n)$ forming $\hat d(n)$.]
Steepest Descent Method (3/5)

Basic Assumptions: d(n) and x(n) are zero-mean and jointly wide-sense stationary.

Notations:
Filter Coefficient Vector: $H(n) = [h_0(n), h_1(n), \dots, h_{N-1}(n)]^T$
Reference Input Vector: $X(n) = [x(n), x(n-1), \dots, x(n-N+1)]^T$
Estimation Error Signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$
Autocorrelation Matrix: $R_{XX} = E\{X(n) X^T(n)\}$
Cross-correlation Vector: $R_{dX} = E\{d(n)\, X(n)\}$
Optimum Filter Coefficient Vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \dots, h_{N-1,opt}]^T$
Steepest Descent Method (4/5)

The filter coefficient vector at time n+1 is equal to the coefficient vector at time n plus a change proportional to the negative gradient of the mean-squared error, i.e.,
$H(n+1) = H(n) - \tfrac{1}{2}\mu\, \nabla_{H(n)}\xi(n)$
where $H(n) = [h_0(n), h_1(n), \dots, h_{N-1}(n)]^T$ and μ = adaptation step-size.

Performance Measure (Cost Function):
$\xi(n) = E\{e^2(n)\} = E\{d^2(n)\} - 2 H^T(n) R_{dX} + H^T(n) R_{XX} H(n)$
Steepest Descent Method (5/5)

The Gradient of the Mean-Squared Error:
$\nabla_{H(n)}\xi(n) = \frac{\partial \xi(n)}{\partial H(n)} = -2 R_{dX} + 2 R_{XX} H(n)$

Therefore, the recursive update equation for the coefficient vector becomes
$H(n+1) = \left[I_N - \mu R_{XX}\right] H(n) + \mu R_{dX}$

Misalignment Vector: $V(n) = H(n) - H_{opt}$
$V(n+1) = \left[I_N - \mu R_{XX}\right] V(n)$
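To make the recursion concrete, a minimal NumPy sketch of the steepest-descent iteration (the function name and the numeric values of $R_{XX}$, $R_{dX}$, and μ are illustrative assumptions, not from the slides):

```python
import numpy as np

def steepest_descent(R_xx, R_dx, mu, n_iter=500):
    """Minimal sketch of H(n+1) = [I_N - mu*R_xx] H(n) + mu*R_dx,
    assuming R_xx and R_dx are known a priori."""
    H = np.zeros(len(R_dx))
    for _ in range(n_iter):
        # -(1/2)*gradient = R_dx - R_xx @ H, so each step adds mu*(R_dx - R_xx H)
        H = H + mu * (R_dx - R_xx @ H)
    return H

# Illustrative values: the recursion converges toward the Wiener solution R_xx^{-1} R_dx
R_xx = np.array([[1.0, 0.5], [0.5, 1.0]])
R_dx = np.array([0.8, 0.3])
print(steepest_descent(R_xx, R_dx, mu=0.1))
print(np.linalg.solve(R_xx, R_dx))   # Wiener solution for comparison
```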
Convergence of Steepest Descent Method (1/2)

Convergence (or Stability) Condition:
$\left|1 - \mu\lambda_i\right| < 1 \;\Rightarrow\; 0 < \mu < \frac{2}{\lambda_i},\ \forall i \;\Rightarrow\; 0 < \mu < \frac{2}{\lambda_{max}}$
($\lambda_i$ = the $i$-th eigenvalue of $R_{XX}$)

Slow convergence if the eigenvalue spread $\lambda_{max}/\lambda_{min}$ is large.
Convergence of Steepest Descent Method (2/2)

Time Constant: The convergence behavior of the $i$-th element of the misalignment vector is
$v_i(n+1) = (1 - \mu\lambda_i)\, v_i(n) \;\Rightarrow\; v_i(n) = (1 - \mu\lambda_i)^n\, v_i(0)$

Time constant for the $i$-th element of the misalignment vector:
$1 - \mu\lambda_i = \exp\!\left(-\frac{1}{\tau_i}\right) \;\Rightarrow\; \tau_i = \frac{-1}{\ln(1 - \mu\lambda_i)} \approx \frac{1}{\mu\lambda_i}\ \text{(samples)} \quad \text{for } \mu\lambda_i \ll 1$
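As a quick numeric illustration (the numbers are chosen here for illustration, not taken from the slides): if $\mu\lambda_i = 0.01$, then $\tau_i \approx 1/(\mu\lambda_i) = 100$ samples, so the $i$-th mode of the misalignment decays by a factor of $e$ roughly every 100 iterations.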
Steady-State Value: $H(\infty) = H_{opt}$, or equivalently $V(\infty) = 0_N$
We still need a priori knowledge of signal statistics.
Stochastic Gradient Adaptive Algorithms
Stochastic Gradient Adaptive Filters

Motivations:
No a priori information about signal statistics
No matrix inversion
Tracking capability
Self-designing (recursive method)
The filter gradually learns the required correlation of the input signals and adjusts its coefficient vector recursively according to some suitably chosen instantaneous error criterion.

Evaluation Criteria:
Rate of convergence
Misadjustment (deviation from the optimum solution)
Robustness to ill-conditioned data
Computational costs
Hardware implementation costs
Numerical problems
Applications of Stochastic Gradient Adaptive Filters (1/2)
System Identification:
[Block diagram: the input x(n) drives both an unknown system and the adaptive filter; the unknown-system output plus noise ξ(n) forms d(n), and e(n) is the difference between d(n) and the adaptive-filter output.]

Adaptive Prediction:
[Block diagram: the adaptive filter is fed the delayed signal x(n) = d(n-Δ) (delay $z^{-\Delta}$) and tries to predict d(n); e(n) is the prediction error.]
Applications of Stochastic Gradient Adaptive Filters (2/2)

Noise Cancellation:
[Block diagram: the primary input d(n) = y(n) + ξ(n) contains the signal plus noise; a reference input x(n), correlated with the noise, drives the adaptive filter to produce the noise estimate ξ̂(n), which is subtracted to give e(n).]

Inverse Filtering:
[Block diagram: a training signal (TX) passes through an unknown channel and is corrupted by noise ξ(n); the adaptive filter processes the received signal x(n), and its output is compared with a delayed ($z^{-\Delta}$) version of the training signal (RX) to form e(n).]
Classification of Adaptive Filters

System Identification:
Layered Earth Modeling

Adaptive Prediction:
Linear Predictive Coding
Autoregressive Spectral Analysis
ADPCM

Noise Cancellation:
Adaptive Noise Cancellation
Adaptive Echo Cancellation
Active Noise Control
Adaptive Beamforming

Inverse Filtering:
Adaptive Equalization
Deconvolution
Blind Equalization
Stochastic Gradient Adaptive Algorithms (1/6)

[Block diagram: the adaptive filter with coefficients $h_i(n)$, $0 \le i \le N-1$, forms $\hat d(n) = \sum_{i=0}^{N-1} h_i(n)\, x(n-i)$; the estimation error $e(n) = d(n) - \hat d(n) = d(n) - H^T(n)\, X(n)$ drives the adaptive algorithm.]

General coefficient update:
$H(n+1) = H(n) - \frac{\mu}{\alpha}\, \nabla_{H(n)}(n), \qquad \nabla_{H(n)}(n) = \frac{\partial\, |e(n)|^{\alpha}}{\partial H(n)}$

Various forms result according to the choice of the performance measure $|e(n)|^{\alpha}$.

If there is no correlation between d(n) and x(n), then no estimation can be made.
Stochastic Gradient Adaptive Algorithms (2/6)

Notations:
Filter Coefficient Vector: $H(n) = [h_0(n), h_1(n), \dots, h_{N-1}(n)]^T$
Reference Input Vector: $X(n) = [x(n), x(n-1), \dots, x(n-N+1)]^T$
Estimation Error Signal: $e(n) = d(n) - \sum_{i=0}^{N-1} h_i(n)\, x(n-i) = d(n) - H^T(n)\, X(n)$
Autocorrelation Matrix: $R_{XX} = E\{X(n) X^T(n)\}$
Cross-correlation Vector: $R_{dX} = E\{d(n)\, X(n)\}$
Optimum Filter Coefficient Vector: $H_{opt} = [h_{0,opt}, h_{1,opt}, \dots, h_{N-1,opt}]^T$
Misalignment Vector: $V(n) = H(n) - H_{opt}$
Covariance Matrix of the Misalignment Vector: $K(n) = E\{V(n)\, V^T(n)\}$
Stochastic Gradient Adaptive Algorithms (3/6)

Sign Algorithm: α = 1
The sign algorithm tries to minimize the instantaneous absolute error value at each iteration.

$\nabla_{H(n)}(n) = \frac{\partial\, |e(n)|}{\partial H(n)}, \qquad e(n) = d(n) - H^T(n)\, X(n)$

Filter Coefficient Updates:
$H(n+1) = H(n) + \mu\, X(n)\, \mathrm{sign}\{e(n)\}$

$\mathrm{sign}\{e(n)\} = \begin{cases} 1, & e(n) \ge 0 \\ -1, & e(n) < 0 \end{cases}$
Stochastic Gradient Adaptive Algorithms (4/6)

Least Mean Square (LMS) Algorithm: α = 2
The LMS algorithm tries to minimize the instantaneous squared error value at each iteration.

$\nabla_{H(n)}(n) = \frac{\partial\, e^2(n)}{\partial H(n)}, \qquad e(n) = d(n) - H^T(n)\, X(n)$

Filter Coefficient Updates:
$H(n+1) = H(n) + \mu\, X(n)\, e(n)$
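To make the LMS recursion concrete, a minimal NumPy sketch (the function name, signal names, step size, and filter length are illustrative assumptions, not from the slides):

```python
import numpy as np

def lms(x, d, N, mu):
    """Minimal LMS sketch: H(n+1) = H(n) + mu * X(n) * e(n),
    with e(n) = d(n) - H(n)^T X(n); x and d are NumPy arrays."""
    H = np.zeros(N)                      # filter coefficient vector H(n)
    e = np.zeros(len(x))
    for n in range(N - 1, len(x)):
        X = x[n - np.arange(N)]          # X(n) = [x(n), ..., x(n-N+1)]^T
        e[n] = d[n] - H @ X              # estimation error e(n)
        H = H + mu * X * e[n]            # coefficient update
    return H, e
```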
Stochastic Gradient Adaptive Algorithms (5/6)
Least Mean Absolute Third (LMAT) Algorithm: α = 3
The LMAT algorithm tries to minimize the instantaneous absolute error value to the third power at each iteration.
$\nabla_{H(n)}(n) = \frac{\partial\, |e(n)|^3}{\partial H(n)}, \qquad e(n) = d(n) - H^T(n)\, X(n)$

Filter Coefficient Updates:
$H(n+1) = H(n) + \mu\, X(n)\, e^2(n)\, \mathrm{sign}\{e(n)\}$
Stochastic Gradient Adaptive Algorithms (6/6)
Least Mean Fourth (LMF) Algorithm: α = 4
The LMF algorithm tries to minimize the instantaneous error value to the fourth power at each iteration.
$\nabla_{H(n)}(n) = \frac{\partial\, e^4(n)}{\partial H(n)}, \qquad e(n) = d(n) - H^T(n)\, X(n)$

Filter Coefficient Updates:
$H(n+1) = H(n) + \mu\, X(n)\, e^3(n)$
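The four stochastic-gradient algorithms above differ only in the error nonlinearity applied inside the same update loop. A minimal sketch of just the per-sample update (the function and parameter names are illustrative assumptions; it follows the structure of the hypothetical lms() loop shown earlier):

```python
import numpy as np

def sg_update(H, X, e, mu, variant="lms"):
    """One coefficient update H(n+1) = H(n) + mu * X(n) * f(e(n)),
    where f depends on the chosen error criterion |e|^alpha."""
    f = {
        "sign": np.sign(e),              # alpha = 1 (Sign algorithm)
        "lms":  e,                       # alpha = 2 (LMS)
        "lmat": e**2 * np.sign(e),       # alpha = 3 (LMAT)
        "lmf":  e**3,                    # alpha = 4 (LMF)
    }[variant]
    return H + mu * X * f
```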
Convergence of the Adaptive Algorithms (1/2)
Basically, we need to know the mean and mean-squared behavior of the algorithms.
For the analysis of the statistical mean behavior:We want to know a set of statistical difference equations that characterizes E{H(n)} or E{V(n)}.
We also need to check
Stability conditions
Convergence speed
Unbiased estimation capability
For the analysis of the statistical mean-squared behavior: We want to know a set of statistical difference equations that characterizes $\sigma_e^2(n) = E\{e^2(n)\}$ and $K(n) = E\{V(n)\, V^T(n)\}$.
We also need to check
Stability conditions
Convergence speed
Estimation precision
Convergence of the Adaptive Algorithms (2/2)
Basic Assumptions for the Convergence Analysis:
The input signals d(n) and x(n) are zero-mean, jointly wide-sense stationary, and jointly Gaussian with finite variances.
A consequence of this assumption is that the estimation error $e(n) = d(n) - H^T(n)X(n)$ is also zero-mean and Gaussian when conditioned on the coefficient vector H(n).

Independence Assumption:
"The input pair {d(n), X(n)} at time n is independent of {d(k), X(k)} at time k, if n is not equal to k."
This assumption is seldom true in practice, but is valid when the step-size μ is chosen to be sufficiently small.
One direct consequence of the independence assumption is that the coefficient vector H(n) is uncorrelated with the input pair {d(n), X(n)}, since H(n) depends only on inputs at time n-1 and before.
Sign Algorithm (1/2)

Mean Behavior:
$E\{H(n+1)\} = \left[I_N - \sqrt{\tfrac{2}{\pi}}\,\frac{\mu}{\sigma_e(n)}\, R_{XX}\right] E\{H(n)\} + \sqrt{\tfrac{2}{\pi}}\,\frac{\mu}{\sigma_e(n)}\, R_{dX}$
$E\{V(n+1)\} = \left[I_N - \sqrt{\tfrac{2}{\pi}}\,\frac{\mu}{\sigma_e(n)}\, R_{XX}\right] E\{V(n)\}$

Mean-Squared Behavior:
$\sigma_e^2(n) = \xi_{min} + tr\{K(n) R_{XX}\}$
$K(n+1) = K(n) - \sqrt{\tfrac{2}{\pi}}\,\frac{\mu}{\sigma_e(n)}\left[K(n) R_{XX} + R_{XX} K(n)\right] + \mu^2 R_{XX}$
Sign Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:
$\sigma_e^2(\infty) \approx \xi_{min} + \frac{\mu}{2}\sqrt{\frac{\pi\,\xi_{min}}{2}}\; tr\{R_{XX}\}$

Convergence Condition (Weak Convergence):
"The long-term time-average of the MAE is bounded for any positive value of μ."

Very robust, but slow.
LMS Algorithm (1/2)

Mean Behavior:
$E\{H(n+1)\} = \left[I_N - \mu R_{XX}\right] E\{H(n)\} + \mu R_{dX}$
$E\{V(n+1)\} = \left[I_N - \mu R_{XX}\right] E\{V(n)\}$

Mean-Squared Behavior:
$\sigma_e^2(n) = \xi_{min} + tr\{K(n) R_{XX}\}$
$K(n+1) = K(n) - \mu\left[K(n) R_{XX} + R_{XX} K(n)\right] + \mu^2\left[\sigma_e^2(n)\, I_N + 2\, R_{XX} K(n)\right] R_{XX}$
LMS Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:
$\sigma_e^2(\infty) \approx \xi_{min} + \frac{\mu}{2}\,\xi_{min}\, tr\{R_{XX}\}$

Mean Convergence: $0 < \mu < \dfrac{2}{\lambda_{max}}$

Mean-Squared Convergence: $0 < \mu < \dfrac{2}{3\, tr\{R_{XX}\}}$

If $\mu_{LMS} = \sqrt{\dfrac{\pi}{2\,\xi_{min}}}\,\mu_{sign}$, then $\sigma_{e,LMS}^2(\infty) \approx \sigma_{e,sign}^2(\infty)$.
The convergence of the algorithm strongly depends on the input signal statistics.
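For orientation, a small illustrative check of these step-size bounds (the $R_{XX}$ values below are made-up numbers, not from the slides):

```python
import numpy as np

# Illustrative LMS step-size bounds for an assumed input autocorrelation matrix
R_xx = np.array([[1.0, 0.5, 0.2],
                 [0.5, 1.0, 0.5],
                 [0.2, 0.5, 1.0]])
lam_max = np.linalg.eigvalsh(R_xx).max()
print("mean convergence bound:         mu <", 2.0 / lam_max)
print("mean-squared convergence bound: mu <", 2.0 / (3.0 * np.trace(R_xx)))
```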
LMAT Algorithm (1/2)

Mean Behavior:
$E\{H(n+1)\} = \left[I_N - 2\sqrt{\tfrac{2}{\pi}}\,\mu\,\sigma_e(n)\, R_{XX}\right] E\{H(n)\} + 2\sqrt{\tfrac{2}{\pi}}\,\mu\,\sigma_e(n)\, R_{dX}$
$E\{V(n+1)\} = \left[I_N - 2\sqrt{\tfrac{2}{\pi}}\,\mu\,\sigma_e(n)\, R_{XX}\right] E\{V(n)\}$

Mean-Squared Behavior:
$\sigma_e^2(n) = \xi_{min} + tr\{K(n) R_{XX}\}$
$K(n+1) = K(n) - 2\sqrt{\tfrac{2}{\pi}}\,\mu\,\sigma_e(n)\left[K(n) R_{XX} + R_{XX} K(n)\right] + 3\mu^2\sigma_e^2(n)\left[\sigma_e^2(n)\, R_{XX} + 3\, R_{XX} K(n) R_{XX}\right]$
LMAT Algorithm (2/2)

Steady-State Mean-Squared Estimation Error:
$\sigma_e^2(\infty) \approx \xi_{min} + \frac{3\mu}{4}\sqrt{\frac{\pi\,\xi_{min}}{2}}\;\xi_{min}\, tr\{R_{XX}\}$

Mean Convergence: $0 < \mu < \dfrac{1}{\lambda_{max}\,\sigma_e(n)}\sqrt{\dfrac{\pi}{2}},\ \forall n$
Very fast, but must be careful.
The convergence of the LMAT algorithm depends on the initial choice of the coefficient vector.
If $\mu_{LMAT} = \dfrac{2}{3}\sqrt{\dfrac{2}{\pi\,\xi_{min}}}\,\mu_{LMS}$, then $\sigma_{e,LMAT}^2(\infty) \approx \sigma_{e,LMS}^2(\infty)$.
LMF Algorithm (1/2)

Mean Behavior:
$E\{H(n+1)\} = \left[I_N - 3\mu\,\sigma_e^2(n)\, R_{XX}\right] E\{H(n)\} + 3\mu\,\sigma_e^2(n)\, R_{dX}$
$E\{V(n+1)\} = \left[I_N - 3\mu\,\sigma_e^2(n)\, R_{XX}\right] E\{V(n)\}$

Mean-Squared Behavior:
$\sigma_e^2(n) = \xi_{min} + tr\{K(n) R_{XX}\}$
$K(n+1) = K(n) - 3\mu\,\sigma_e^2(n)\left[K(n) R_{XX} + R_{XX} K(n)\right] + 15\mu^2\sigma_e^4(n)\left[\sigma_e^2(n)\, I_N + 6\, R_{XX} K(n)\right] R_{XX}$
LMF Algorithm (2/2)

Steady-State Mean-Squared Estimation Error: ?

Mean Convergence: $0 < \mu < \dfrac{2}{3\,\lambda_{max}\,\sigma_e^2(n)},\ \forall n$
Very fast, but must be careful also.
The convergence of the LMF algorithm also depends on the initial choice of the coefficient vector.
Further Observations (1/2)
Misadjustment: $M = \dfrac{\xi_{ex}(\infty)}{\xi_{min}}$

Sign Algorithm: $M \approx \dfrac{\mu}{2}\sqrt{\dfrac{\pi}{2\,\xi_{min}}}\; tr\{R_{XX}\}$

LMS Algorithm: $M \approx \dfrac{\mu}{2}\; tr\{R_{XX}\}$

LMAT Algorithm: $M \approx \dfrac{3\mu}{4}\sqrt{\dfrac{\pi\,\xi_{min}}{2}}\; tr\{R_{XX}\}$

LMF Algorithm: ?
Further Observations (2/2)
The misadjustment M increases with the filter order N.
The misadjustment M is directly proportional to μ.
The convergence speed is inversely proportional to μ.
Convergence Speed: (Fast) LMAT – LMF ≈ LMS – Sign (Slow)
Robustness (or Stability): (Good) Sign – LMS – LMAT – LMF (Bad)
Example: System Identification Mode (1/6)

[Block diagram: system identification configuration; the input x(n) drives both the unknown system and the adaptive filter, the unknown-system output plus measurement noise ξ(n) forms d(n), and e(n) is the difference between d(n) and the adaptive-filter output.]

$H_{opt} = [0.1,\ 0.3,\ 0.5,\ 0.7,\ 0.5,\ 0.3,\ 0.1]^T$
Example: System Identification Mode (2/6)

Two Sets of Reference Inputs:

CASE 1: Eigenvalue Spread Ratio = 25.3
$x_1(n) = \zeta_1(n) + 0.9\,x_1(n-1) - 0.1\,x_1(n-2) - 0.2\,x_1(n-3)$

CASE 2: Eigenvalue Spread Ratio = 185.8
$x_2(n) = \zeta_2(n) + 1.5\,x_2(n-1) - x_2(n-2) - 0.25\,x_2(n-3)$

Measurement Noise ζ(n): White Gaussian Process

Convergence Parameter μ: Sign = 0.00016, LMS = 0.002, LMAT = 0.011, LMF = 0.002
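A minimal simulation sketch of this CASE 1 setup (the AR coefficients and $H_{opt}$ come from the slides; the data length, noise level, and the reuse of the hypothetical lms() sketch defined earlier are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 20000
zeta = rng.standard_normal(n_samples)

# CASE 1 reference input: x(n) = zeta(n) + 0.9 x(n-1) - 0.1 x(n-2) - 0.2 x(n-3)
x = np.zeros(n_samples)
for n in range(n_samples):
    x[n] = zeta[n] + 0.9 * x[n-1] - 0.1 * x[n-2] - 0.2 * x[n-3]

# Unknown system from the slides; d(n) is its output plus a small measurement noise
H_opt = np.array([0.1, 0.3, 0.5, 0.7, 0.5, 0.3, 0.1])
d = np.convolve(x, H_opt)[:n_samples] + 0.01 * rng.standard_normal(n_samples)

H_hat, e = lms(x, d, N=7, mu=0.002)   # reuse the LMS sketch shown earlier
print(H_hat)                           # should approach H_opt
```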
Example: System Identification Mode (3/6)

CASE 1: Eigenvalue Spread Ratio = 25.3
$x_1(n) = \zeta_1(n) + 0.9\,x_1(n-1) - 0.1\,x_1(n-2) - 0.2\,x_1(n-3)$

[Figure: Mean-squared behavior of the coefficients for CASE 1; MSE in dB versus number of iterations (0 to 20000) for the LMAT, LMS, LMF, and Sign algorithms.]
Example: System Identification Mode (4/6)

[Figure: Mean behavior of the coefficients for CASE 1; $E\{h_1(n)\}$ versus number of iterations (0 to 20000) for the LMAT, LMS, LMF, and Sign algorithms.]
Example: System Identification Mode (5/6)

CASE 2: Eigenvalue Spread Ratio = 185.8
$x_2(n) = \zeta_2(n) + 1.5\,x_2(n-1) - x_2(n-2) - 0.25\,x_2(n-3)$

[Figure: Mean-squared behavior of the coefficients for CASE 2; MSE in dB versus number of iterations (0 to 20000) for the LMAT, LMS, LMF, and Sign algorithms.]
Example: System Identification Mode (6/6)

[Figure: Mean behavior of the coefficients for CASE 2; $E\{h_1(n)\}$ versus number of iterations (0 to 20000) for the LMAT, LMS, LMF, and Sign algorithms.]
Other Algorithms (1/2)

Signed Regressor Algorithm: $H(n+1) = H(n) + \mu\,\mathrm{sign}\{X(n)\}\, e(n)$

Sign-Sign Algorithm: $H(n+1) = H(n) + \mu\,\mathrm{sign}\{X(n)\}\,\mathrm{sign}\{e(n)\}$

Normalized LMS Algorithm: $H(n+1) = H(n) + \dfrac{\mu}{X^T(n)\, X(n)}\, X(n)\, e(n)$

Complex LMS Algorithm: $H(n+1) = H(n) + \mu\, X^*(n)\, e(n)$
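A minimal NumPy sketch of the Normalized LMS recursion (the function name and the small regularization constant eps are assumptions added here for numerical safety, not from the slides):

```python
import numpy as np

def nlms(x, d, N, mu, eps=1e-8):
    """Minimal Normalized LMS sketch:
    H(n+1) = H(n) + (mu / (X(n)^T X(n) + eps)) * X(n) * e(n)."""
    H = np.zeros(N)
    e = np.zeros(len(x))
    for n in range(N - 1, len(x)):
        X = x[n - np.arange(N)]              # X(n) = [x(n), ..., x(n-N+1)]^T
        e[n] = d[n] - H @ X                  # estimation error e(n)
        H = H + (mu / (X @ X + eps)) * X * e[n]   # normalized update
    return H, e
```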
Other Algorithms (2/2)

Hybrid Algorithm #1: LMS + LMF
$\nabla_{H(n)}(n) = \frac{\partial\left\{\phi\, e^2(n) + (1-\phi)\, e^4(n)\right\}}{\partial H(n)}, \qquad 0 \le \phi \le 1$
$H(n+1) = H(n) + \mu\left\{\phi\, X(n)\, e(n) + 2(1-\phi)\, X(n)\, e^3(n)\right\}$

Hybrid Algorithm #2: Sign + LMAT
$\nabla_{H(n)}(n) = \frac{\partial\left\{\phi\, |e(n)| + (1-\phi)\, |e(n)|^3\right\}}{\partial H(n)}, \qquad 0 \le \phi \le 1$
$H(n+1) = H(n) + \mu\left\{\phi\, X(n) + 3(1-\phi)\, X(n)\, e^2(n)\right\}\mathrm{sign}\{e(n)\}$
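A short sketch of the two hybrid per-sample updates above (function and parameter names are illustrative assumptions):

```python
import numpy as np

def hybrid_update(H, X, e, mu, phi=0.5, which=1):
    """One hybrid coefficient update; phi weights the two error criteria."""
    if which == 1:   # LMS + LMF
        f = phi * e + 2.0 * (1.0 - phi) * e**3
    else:            # Sign + LMAT
        f = (phi + 3.0 * (1.0 - phi) * e**2) * np.sign(e)
    return H + mu * X * f
```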
Recursive Least Squares (RLS) Algorithm
RLS Algorithm (1/5)

Cost Function: $\varepsilon(n) = \sum_{i=1}^{n} \beta(n,i)\, e^2(i)$, where n = length of the observable data.

Error signal at time instant i: $e(i) = d(i) - H^T(n)\, X(i)$
The coefficient vector H(n) remains fixed during the observation interval $1 \le i \le n$.

Weighting Factor: $0 < \beta(n,i) \le 1$ (normally $\beta(n,i) = \lambda^{n-i}$, λ = forgetting factor)

By the method of exponentially weighted least squares, we want to minimize
$\varepsilon(n) = \sum_{i=1}^{n} \lambda^{n-i}\, e^2(i)$

Very fast, but computationally very complex.
The algorithm is useful when the number of taps required is small.
RLS Algorithm (2/5)

Normal Equation: $\Phi(n)\, H(n) = \Theta(n)$

where $\Phi(n) = \sum_{i=1}^{n} \lambda^{n-i}\, X(i)\, X^T(i)$ and $\Theta(n) = \sum_{i=1}^{n} \lambda^{n-i}\, d(i)\, X(i)$

We write
$\Phi(n) = \lambda\left[\sum_{i=1}^{n-1} \lambda^{n-1-i}\, X(i)\, X^T(i)\right] + X(n)\, X^T(n) = \lambda\,\Phi(n-1) + X(n)\, X^T(n)$
$\Theta(n) = \lambda\,\Theta(n-1) + d(n)\, X(n)$

Do we need a matrix inversion? ⇒ No!
RLS Algorithm (3/5)

Matrix Inversion Lemma:
If $A = B^{-1} + C D^{-1} C^T$, then $A^{-1} = B - B C\left(D + C^T B C\right)^{-1} C^T B$,
where A and B are N × N positive definite, C is N × M, and D is M × M positive definite.

Letting $A = \Phi(n)$, $B^{-1} = \lambda\,\Phi(n-1)$, $C = X(n)$, $D = 1$, we express $\Phi^{-1}(n)$ in a recursive form:
$\Phi^{-1}(n) = \lambda^{-1}\Phi^{-1}(n-1) - \frac{\lambda^{-2}\,\Phi^{-1}(n-1)\, X(n)\, X^T(n)\, \Phi^{-1}(n-1)}{1 + \lambda^{-1}\, X^T(n)\, \Phi^{-1}(n-1)\, X(n)}$

(The bracketed factor corresponds to the gain vector K(n) introduced on the next slide.)
RLS Algorithm (4/5)

Define $P(n) = \Phi^{-1}(n)$  (N × N)

$K(n) = \dfrac{\lambda^{-1}\, P(n-1)\, X(n)}{1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n)}$  (N × 1)

$\Rightarrow K(n)\left[1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n)\right] = \lambda^{-1}\, P(n-1)\, X(n)$
$\Rightarrow K(n) = \left\{\lambda^{-1}\, P(n-1) - \lambda^{-1}\, K(n)\, X^T(n)\, P(n-1)\right\} X(n)$
$\Rightarrow K(n) = P(n)\, X(n) = \Phi^{-1}(n)\, X(n)$

Therefore, $P(n) = \lambda^{-1}\, P(n-1) - \lambda^{-1}\, K(n)\, X^T(n)\, P(n-1)$
RLS Algorithm (5/5)

Time Update for H(n):
$H(n) = \Phi^{-1}(n)\,\Theta(n) = P(n)\,\Theta(n) = \lambda\, P(n)\,\Theta(n-1) + P(n)\, d(n)\, X(n)$
$\quad = P(n-1)\,\Theta(n-1) - K(n)\, X^T(n)\, P(n-1)\,\Theta(n-1) + K(n)\, d(n)$
$\quad = \Phi^{-1}(n-1)\,\Theta(n-1) - K(n)\, X^T(n)\,\Phi^{-1}(n-1)\,\Theta(n-1) + K(n)\, d(n)$
$\Rightarrow H(n) = H(n-1) + K(n)\left[d(n) - X^T(n)\, H(n-1)\right]$

Innovation ("a priori" estimation error): $\alpha(n) = d(n) - X^T(n)\, H(n-1)$
$H(n) = H(n-1) + K(n)\,\alpha(n)$

A posteriori estimation error: $e(n) = d(n) - X^T(n)\, H(n)$
Summary of the RLS Algorithm

Initialization:
Determine the forgetting factor λ (normally 0.9 ≤ λ < 1)
$P(0) = \delta^{-1} I_N$  (N × N), where δ is a small positive number
$H(0) = 0_N$  (N × 1)

Main Iteration (for each n):
$K(n) = \dfrac{\lambda^{-1}\, P(n-1)\, X(n)}{1 + \lambda^{-1}\, X^T(n)\, P(n-1)\, X(n)}$  (N × 1)
$\alpha(n) = d(n) - X^T(n)\, H(n-1)$  (1 × 1)
$H(n) = H(n-1) + K(n)\,\alpha(n)$  (N × 1)
$P(n) = \lambda^{-1}\, P(n-1) - \lambda^{-1}\, K(n)\, X^T(n)\, P(n-1)$  (N × N)
$e(n) = d(n) - X^T(n)\, H(n)$  (1 × 1, if necessary)
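Closing with a minimal NumPy sketch of this summary (the function name, default forgetting factor, and initialization constant are illustrative assumptions, not prescribed by the slides):

```python
import numpy as np

def rls(x, d, N, lam=0.99, delta=0.01):
    """Minimal RLS sketch following the summary above:
    per sample, compute K(n), alpha(n), then update H(n) and P(n)."""
    P = np.eye(N) / delta                    # P(0) = delta^{-1} I_N
    H = np.zeros(N)                          # H(0) = 0_N
    e = np.zeros(len(x))
    for n in range(N - 1, len(x)):
        X = x[n - np.arange(N)]              # X(n) = [x(n), ..., x(n-N+1)]^T
        PX = P @ X
        K = PX / (lam + X @ PX)              # gain vector K(n)
        alpha = d[n] - X @ H                 # a priori error (innovation)
        H = H + K * alpha                    # coefficient update
        P = (P - np.outer(K, X @ P)) / lam   # P(n) = lam^{-1}[P(n-1) - K(n) X^T(n) P(n-1)]
        e[n] = d[n] - X @ H                  # a posteriori error (if needed)
    return H, e
```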