Widrow's Least Mean Square (LMS) Algorithm
Chapter 4
Adaptive Filter Theory and Applications
References:
B. Widrow and M. E. Hoff, "Adaptive switching circuits," Proc. of WESCON Conv. Rec., part 4, pp. 96-140, 1960
B. Widrow and S. D. Stearns, Adaptive Signal Processing, Prentice-Hall, 1985
O. Macchi, Adaptive Processing: The Least Mean Squares Approach with Applications in Transmission, Wiley, 1995
P. M. Clarkson, Optimal and Adaptive Signal Processing, CRC Press, 1993
S. Haykin, Adaptive Filter Theory, Prentice-Hall, 2002
D. F. Marshall, W. K. Jenkins and J. J. Murphy, "The use of orthogonal transforms for improving performance of adaptive filters," IEEE Trans. Circuits & Systems, vol. 36, April 1989, pp. 474-483
Adaptive signal processing is concerned with the design, analysis, and implementation of systems whose structure changes in response to the incoming data. Application areas are similar to those of optimal signal processing, but now the environment is changing, the signals are nonstationary and/or the parameters to be estimated are time-varying. For example:

- Echo cancellation for hands-free telephones (the speech echo is a nonstationary signal)
- Equalization of data communication channels (the channel impulse response is changing, particularly in mobile communications)
- Time-varying system identification (the system transfer function to be estimated is nonstationary in some control applications)
Adaptive Filter Development

| Year | Application | Developer(s) |
|------|-------------|--------------|
| 1959 | Adaptive pattern recognition system | Widrow et al. |
| 1960 | Adaptive waveform recognition | Jacowatz |
| 1965 | Adaptive equalizer for telephone channel | Lucky |
| 1967 | Adaptive antenna system | Widrow et al. |
| 1970 | Linear prediction for speech analysis | Atal |
| Present | Numerous applications, structures, algorithms | |
Adaptive Filter Definition
An adaptive filter is a time-variant filter whose coefficients are adjusted in a way to optimize a cost function or to satisfy some predetermined optimization criterion.
Characteristics of adaptive filters:

- They can automatically adapt (self-optimize) in the face of changing environments and changing system requirements
- They can be trained to perform specific filtering and decision-making tasks according to some updating equations (training rules)

Why adaptive? An adaptive filter can automatically operate in:

- changing environments (e.g., signal detection in a wireless channel)
- nonstationary signal/noise conditions (e.g., LPC of a speech signal)
- time-varying parameter estimation (e.g., position tracking of a moving source)
The block diagram of a typical adaptive filter is shown below:

[Block diagram: the input x(k) passes through the adaptive filter {h(k)} to give y(k); the error e(k) = d(k) − y(k) between the desired response d(k) and y(k) drives the adaptive algorithm, which adjusts {h(k)}.]

- x(k): input signal
- y(k): filtered output
- d(k): desired response
- h(k): impulse response of the adaptive filter

The cost function may be $E\{e^2(k)\}$ or $\sum_{k=0}^{N-1} e^2(k)$.

- the FIR or IIR adaptive filter can be realized in various structures
- the adaptive algorithm depends on the optimization criterion
Basic Classes of Adaptive Filtering Applications

1. Prediction: signal encoding, linear prediction coding, spectral analysis
2. Identification: adaptive control, layered earth modeling, vibration studies of mechanical systems
3. Inverse Filtering: adaptive equalization for communication channels, deconvolution
4. Interference Canceling: adaptive noise canceling, echo cancellation
Design Considerations

1. Cost Function

- the choice of cost function depends on the approach used and the application of interest
- two commonly used cost functions are:

mean square error (MSE) criterion: minimizes $E\{e^2(k)\}$, where $E$ denotes the expectation operation, $e(k) = d(k) - y(k)$ is the estimation error, $d(k)$ is the desired response and $y(k)$ is the actual filter output

exponentially weighted least squares criterion: minimizes $\sum_{k=0}^{N-1}\lambda^{N-1-k}e^2(k)$, where $N$ is the total number of samples and $\lambda$ denotes the exponential weighting factor whose value is positive and close to 1
2. Algorithm

- depends on the cost function used
- convergence of the algorithm: Will the coefficients of the adaptive filter converge to the desired values? Is the algorithm stable? Global convergence or local convergence?
- rate of convergence: this corresponds to the time required for the algorithm to converge to the optimum least squares/Wiener solution
- misadjustment: the excess mean square error (MSE) over the minimum MSE produced by the Wiener filter; mathematically it is defined as

$$M = \frac{\lim_{k\to\infty}E\{e^2(k)\} - \varepsilon_{\min}}{\varepsilon_{\min}} \qquad (4.1)$$

(this is a performance measure for algorithms that use the minimum MSE criterion)
- tracking capability: this refers to the ability of the algorithm to track statistical variations in a nonstationary environment
- computational requirement: number of operations, memory size, and the investment required to program the algorithm on a computer
- robustness: this refers to the ability of the algorithm to operate satisfactorily with ill-conditioned data, e.g., a very noisy environment or changes in the signal and/or noise models

3. Structure

- structure and algorithm are inter-related; the choice of structure is based on quantization errors, ease of implementation, computational complexity, etc.
- four commonly used structures are the direct form, cascade form, parallel form, and lattice structure. Advantages of lattice structures include a simple test for filter stability, a modular structure and low sensitivity to quantization effects.
e.g., a second-order transfer function can be realized in direct, cascade, or parallel form:

$$H(z) = \frac{B_2 z^2}{z^2 + A_1 z + A_0} = \frac{C_1 z}{z - p_1}\cdot\frac{C_2 z}{z - p_2} = \frac{D_1 z + E_1}{z - p_1} + \frac{D_2 z + E_2}{z - p_2}$$

Q. Can you see an advantage of using the cascade or parallel form?
Commonly Used Methods for Minimizing MSE

For simplicity, it is assumed that the adaptive filter is of causal FIR type and is implemented in direct form. Its system block diagram is:

[Block diagram: a tapped delay line x(n), x(n−1), …, x(n−L+1) weighted by w₀(n), w₁(n), …, w_{L−1}(n); the weighted taps are summed to give y(n), and the error is e(n) = d(n) − y(n).]
The error signal at time $n$ is given by

$$e(n) = d(n) - y(n) \qquad (4.2)$$

where

$$y(n) = \sum_{i=0}^{L-1}w_i(n)x(n-i) = W^T(n)X(n),$$
$$W(n) = [w_0(n)\ w_1(n)\ \cdots\ w_{L-1}(n)]^T,$$
$$X(n) = [x(n)\ x(n-1)\ \cdots\ x(n-L+1)]^T.$$

Recall that minimizing $E\{e^2(n)\}$ gives the Wiener solution in optimal filtering; it is thus desired that

$$\lim_{n\to\infty}W(n) = W_{MMSE} = R_{xx}^{-1}R_{dx} \qquad (4.3)$$

In adaptive filtering, the Wiener solution is found through an iterative procedure,

$$W(n+1) = W(n) + \Delta W(n) \qquad (4.4)$$

where $\Delta W(n)$ is an incrementing vector.
Two common gradient searching approaches for obtaining the Wiener filter are

1. Newton Method

$$\Delta W(n) = -\mu R_{xx}^{-1}\frac{\partial E\{e^2(n)\}}{\partial W(n)} \qquad (4.5)$$

where $\mu$ is called the step size. It is a positive number that controls the convergence rate and stability of the algorithm. The adaptive algorithm becomes

$$\begin{aligned}
W(n+1) &= W(n) - \mu R_{xx}^{-1}\frac{\partial E\{e^2(n)\}}{\partial W(n)}\\
&= W(n) - \mu R_{xx}^{-1}\left(2R_{xx}W(n) - 2R_{dx}\right)\\
&= (1-2\mu)W(n) + 2\mu R_{xx}^{-1}R_{dx}\\
&= (1-2\mu)W(n) + 2\mu W_{MMSE}
\end{aligned} \qquad (4.6)$$
Solving the equation, we have

$$W(n) = W_{MMSE} + (1-2\mu)^n\left(W(0) - W_{MMSE}\right) \qquad (4.7)$$

where $W(0)$ is the initial value of $W(n)$. To ensure

$$\lim_{n\to\infty}W(n) = W_{MMSE} \qquad (4.8)$$

the choice of $\mu$ should be

$$-1 < 1-2\mu < 1 \Rightarrow 0 < \mu < 1 \qquad (4.9)$$

In particular, when $\mu = 0.5$, we have

$$W(1) = W_{MMSE} + (1 - 2\cdot 0.5)^1\left(W(0) - W_{MMSE}\right) = W_{MMSE} \qquad (4.10)$$

The weights jump from any initial $W(0)$ to the optimum setting $W_{MMSE}$ in a single step.
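As a quick numerical check of (4.6) and (4.10), the following sketch (with an assumed $R_{xx}$ and $R_{dx}$) shows that a single Newton step with $\mu = 0.5$ lands exactly on the Wiener solution:

```matlab
% Minimal sketch (synthetic values): one Newton step with mu = 0.5
% reaches W_MMSE from an arbitrary initial weight vector.
Rxx = [1 0.5; 0.5 1];          % assumed input autocorrelation matrix
Rdx = [1; 0.2];                % assumed cross-correlation vector
Wmmse = Rxx \ Rdx;             % Wiener solution (4.3)
mu = 0.5;
W = [10; -7];                  % arbitrary initial weights W(0)
g = 2*Rxx*W - 2*Rdx;           % gradient of E{e^2(n)}
W = W - mu*(Rxx \ g);          % Newton update (4.6)
disp([W Wmmse])                % the two columns coincide
```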
An example of the Newton method with $\mu = 0.5$ and 2 weights is illustrated below.
2. Steepest Descent Method

$$\Delta W(n) = -\mu\frac{\partial E\{e^2(n)\}}{\partial W(n)} \qquad (4.11)$$

Thus

$$\begin{aligned}
W(n+1) &= W(n) - \mu\frac{\partial E\{e^2(n)\}}{\partial W(n)}\\
&= W(n) - \mu\left(2R_{xx}W(n) - 2R_{dx}\right)\\
&= (I - 2\mu R_{xx})W(n) + 2\mu R_{xx}W_{MMSE}
\end{aligned} \qquad (4.12)$$

where $I$ is the $L\times L$ identity matrix. Denote

$$V(n) = W(n) - W_{MMSE} \qquad (4.13)$$

We have

$$V(n+1) = (I - 2\mu R_{xx})V(n) \qquad (4.14)$$
Using the fact that $R_{xx}$ is symmetric and real, it can be shown that

$$R_{xx} = Q\cdot\Lambda\cdot Q^{-1} = Q\cdot\Lambda\cdot Q^T \qquad (4.15)$$

where the modal matrix $Q$ is orthonormal. The columns of $Q$, which are the $L$ eigenvectors of $R_{xx}$, are mutually orthogonal and normalized; notice that $Q^{-1} = Q^T$. $\Lambda$ is the so-called spectral matrix: all its elements are zero except for the main diagonal, whose elements are the set of eigenvalues of $R_{xx}$, $\lambda_1, \lambda_2, \ldots, \lambda_L$. It has the form

$$\Lambda = \begin{bmatrix}\lambda_1 & 0 & \cdots & 0\\ 0 & \lambda_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda_L\end{bmatrix} \qquad (4.16)$$
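The decomposition (4.15)-(4.16) is easy to verify numerically; a minimal sketch with an assumed $R_{xx}$, using MATLAB's eig:

```matlab
% Minimal sketch: eigendecomposition of a symmetric real Rxx (assumed values).
Rxx = [1 0.8; 0.8 1];
[Q, Lambda] = eig(Rxx);     % columns of Q are orthonormal eigenvectors
disp(Q*Lambda*Q' - Rxx)     % ~ zero matrix, confirming (4.15)
disp(Q'*Q)                  % ~ identity, confirming Q^{-1} = Q^T
```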
It can be proved that the eigenvalues of $R_{xx}$ are all real and greater than or equal to zero. Using these results and letting

$$V(n) = Q\cdot U(n) \qquad (4.17)$$

we have

$$\begin{aligned}
Q\cdot U(n+1) &= (I - 2\mu R_{xx})\,Q\cdot U(n)\\
\Rightarrow U(n+1) &= Q^{-1}(I - 2\mu R_{xx})\,Q\cdot U(n)\\
&= \left(Q^{-1}Q - 2\mu Q^{-1}R_{xx}Q\right)U(n)\\
&= (I - 2\mu\Lambda)\,U(n)
\end{aligned} \qquad (4.18)$$

The solution is

$$U(n) = (I - 2\mu\Lambda)^n\,U(0) \qquad (4.19)$$

where $U(0)$ is the initial value of $U(n)$. Thus the steepest descent algorithm is stable and convergent if

$$\lim_{n\to\infty}(I - 2\mu\Lambda)^n = 0$$
or

$$\lim_{n\to\infty}\begin{bmatrix}(1-2\mu\lambda_1)^n & 0 & \cdots & 0\\ 0 & (1-2\mu\lambda_2)^n & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & (1-2\mu\lambda_L)^n\end{bmatrix} = 0 \qquad (4.20)$$

which implies

$$|1-2\mu\lambda_{\max}| < 1 \Rightarrow 0 < \mu < \frac{1}{\lambda_{\max}} \qquad (4.21)$$

where $\lambda_{\max}$ is the largest eigenvalue of $R_{xx}$.

If this condition is satisfied, it follows that

$$\lim_{n\to\infty}U(n) = 0 \Rightarrow \lim_{n\to\infty}Q^{-1}V(n) = 0 \Rightarrow \lim_{n\to\infty}Q^{-1}\left(W(n) - W_{MMSE}\right) = 0 \Rightarrow \lim_{n\to\infty}W(n) = W_{MMSE} \qquad (4.22)$$
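A minimal sketch of the weight-error recursion (4.14), with an assumed $R_{xx}$, confirms that any $\mu$ inside the bound (4.21) drives $V(n)$ to zero:

```matlab
% Minimal sketch: steepest descent weight-error recursion (4.14).
Rxx = [1 0.8; 0.8 1];
mu = 0.9/max(eig(Rxx));          % inside the stability bound (4.21)
V = [1; -1];                      % initial weight-error vector V(0)
for n = 1:200
    V = (eye(2) - 2*mu*Rxx)*V;    % recursion (4.14)
end
disp(norm(V))                     % ~ 0, so W(n) -> W_MMSE by (4.13)
```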
An illustration of the steepest descent method with two weights and $\mu = 0.3$ is given below.
Remarks:

- The steepest descent method is simpler than the Newton method since no matrix inversion is required.
- The convergence rate of the Newton method is much faster than that of the steepest descent method.
- When the performance surface is unimodal, $W(0)$ can be arbitrarily chosen. If it is multimodal, a good initial value of $W(0)$ is necessary for global minimization.
- However, both methods require exact values of $R_{xx}$ and $R_{dx}$, which are not commonly available in practical applications.
Widrow's Least Mean Square (LMS) Algorithm

A. Optimization Criterion

To minimize the mean square error $E\{e^2(n)\}$.

B. Adaptation Procedure

It is an approximation of the steepest descent method where the expectation operator is ignored, i.e.,

$$\frac{\partial E\{e^2(n)\}}{\partial W(n)} \quad\text{is replaced by}\quad \frac{\partial e^2(n)}{\partial W(n)}$$
The LMS algorithm is therefore:

$$\begin{aligned}
W(n+1) &= W(n) - \mu\frac{\partial e^2(n)}{\partial W(n)}\\
&= W(n) - \mu\cdot 2e(n)\cdot\frac{\partial e(n)}{\partial W(n)}\\
&= W(n) - 2\mu e(n)\cdot\frac{\partial\left(d(n) - W^T(n)X(n)\right)}{\partial W(n)} \qquad\left(\text{using }\frac{\partial A^T B}{\partial A} = B\right)\\
&= W(n) + 2\mu e(n)X(n)
\end{aligned}$$

or

$$w_i(n+1) = w_i(n) + 2\mu e(n)x(n-i),\quad i = 0, 1, \ldots, L-1 \qquad (4.23)$$
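A minimal sketch of one LMS iteration (4.23); the function name and calling convention are illustrative (not from the notes), with W and X as length-L column vectors:

```matlab
% Minimal sketch of one LMS iteration (4.23); save as lms_update.m.
function [W, e] = lms_update(W, X, d, mu)
    y = W.' * X;          % filter output y(n) = W'(n)X(n)
    e = d - y;            % error signal e(n) = d(n) - y(n)
    W = W + 2*mu*e*X;     % weight update (4.23)
end
```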
C. Advantages

- low computational complexity
- simple to implement
- allows real-time operation
- does not need statistics of signals, i.e., $R_{xx}$ and $R_{dx}$

D. Performance Surface

The mean square error function or performance surface is identical to that in Wiener filtering:

$$E\{e^2(n)\} = \varepsilon_{\min} + \left(W(n) - W_{MMSE}\right)^T R_{xx}\left(W(n) - W_{MMSE}\right) \qquad (4.24)$$

where $W(n)$ is the adaptive filter coefficient vector at time $n$.
E. Performance Analysis

Two important performance measures in LMS algorithms are the rate of convergence and the misadjustment (which relates to the steady-state filter weight variance).

1. Convergence Analysis

For ease of analysis, it is assumed that $W(n)$ is independent of $X(n)$. Taking expectation on both sides of the LMS algorithm, we have

$$\begin{aligned}
E\{W(n+1)\} &= E\{W(n)\} + 2\mu E\{e(n)X(n)\}\\
&= E\{W(n)\} + 2\mu E\left\{\left(d(n) - X^T(n)W(n)\right)X(n)\right\}\\
&= E\{W(n)\} + 2\mu R_{dx} - 2\mu R_{xx}E\{W(n)\}\\
&= (I - 2\mu R_{xx})E\{W(n)\} + 2\mu R_{xx}W_{MMSE}
\end{aligned} \qquad (4.25)$$

which is very similar to the adaptive equation (4.12) in the steepest descent method.
Following the previous derivation, $W(n)$ will converge to the Wiener filter weights in the mean sense if

$$\lim_{n\to\infty}\begin{bmatrix}(1-2\mu\lambda_1)^n & 0 & \cdots & 0\\ 0 & (1-2\mu\lambda_2)^n & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & (1-2\mu\lambda_L)^n\end{bmatrix} = 0$$

$$\Rightarrow |1-2\mu\lambda_i| < 1,\quad i = 1, 2, \ldots, L \quad\Rightarrow\quad 0 < \mu < \frac{1}{\lambda_{\max}} \qquad (4.26)$$

Define the geometric ratio of the $p$th term as

$$r_p = 1 - 2\mu\lambda_p,\quad p = 1, 2, \ldots, L \qquad (4.27)$$

It is observed that each term in the main diagonal forms a geometric series $\{1, r_p, r_p^2, \ldots, r_p^{n-1}, r_p^n, r_p^{n+1}, \ldots\}$.
An exponential function can be fitted to approximate each geometric series:

$$r_p^n \approx \exp\left(\frac{-n}{\tau_p}\right) \Rightarrow r_p \approx \exp\left(\frac{-1}{\tau_p}\right) \qquad (4.28)$$

where $\tau_p$ is called the $p$th time constant.

For slow adaptation, i.e., $2\mu\lambda_p \ll 1$, $\tau_p$ is approximated as

$$\tau_p = \frac{-1}{\ln(1-2\mu\lambda_p)} = \frac{-1}{-2\mu\lambda_p - \frac{(2\mu\lambda_p)^2}{2} - \frac{(2\mu\lambda_p)^3}{3} - \cdots} \approx \frac{1}{2\mu\lambda_p} \qquad (4.29)$$

Notice that the smaller the time constant, the faster the convergence rate. Moreover, the overall convergence is limited by the slowest mode of convergence, which in turn stems from the smallest eigenvalue of $R_{xx}$, $\lambda_{\min}$.
That is,

$$\tau_{\max} \approx \frac{1}{2\mu\lambda_{\min}} \qquad (4.30)$$

In general, the rate of convergence depends on two factors:

- the step size $\mu$: the larger the $\mu$, the faster the convergence rate
- the eigenvalue spread of $R_{xx}$, $\chi(R_{xx})$: the smaller $\chi(R_{xx})$, the faster the convergence rate. $\chi(R_{xx})$ is defined as

$$\chi(R_{xx}) = \frac{\lambda_{\max}}{\lambda_{\min}} \qquad (4.31)$$

Notice that $1 \le \chi(R_{xx}) < \infty$. It is worth noting that although $\chi(R_{xx})$ cannot be changed, the rate of convergence will be increased if we transform $x(n)$ to another sequence, say $y(n)$, such that $\chi(R_{yy})$ is close to 1.
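The quantities (4.29)-(4.31) are easy to evaluate numerically; a minimal sketch with an assumed $R_{xx}$ and step size:

```matlab
% Minimal sketch: convergence-speed indicators for an assumed Rxx.
Rxx = [1 0.8; 0.8 1];
mu = 0.005;
lam = eig(Rxx);
chi = max(lam)/min(lam)           % eigenvalue spread (4.31)
tau = 1./(2*mu*lam)               % per-mode time constants (4.29)
tau_max = 1/(2*mu*min(lam))       % slowest mode (4.30)
```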
Example 4.1

An illustration of eigenvalue spread for the LMS algorithm is shown as follows.

[Block diagram: x(n) drives both the unknown system $\sum_{i=0}^{1}h_i z^{-i}$, whose output plus the noise q(n) gives d(n), and the two-weight adaptive filter $\sum_{i=0}^{1}w_i z^{-i}$ with output y(n); e(n) = d(n) − y(n).]

$$d(n) = h_0x(n) + h_1x(n-1) + q(n)$$
$$y(n) = w_0(n)x(n) + w_1(n)x(n-1)$$
$$e(n) = d(n) - y(n) = d(n) - w_0(n)x(n) - w_1(n)x(n-1)$$
$$w_0(n+1) = w_0(n) + 2\mu e(n)x(n)$$
$$w_1(n+1) = w_1(n) + 2\mu e(n)x(n-1)$$
```matlab
% file name is es.m
clear all
N = 1000;            % number of samples is 1000
np = 0.01;           % noise power is 0.01
sp = 1;              % signal power is 1, which implies SNR = 20 dB
h = [1 2];           % unknown impulse response
x = sqrt(sp).*randn(1,N);
d = conv(x,h);
d = d(1:N) + sqrt(np).*randn(1,N);
w0(1) = 0;           % initial filter weights are 0
w1(1) = 0;
mu = 0.005;          % step size is fixed at 0.005
y(1) = w0(1)*x(1);   % iteration at "n=0" is written separately
e(1) = d(1) - y(1);  % because "x(0)" is not defined
w0(2) = w0(1) + 2*mu*e(1)*x(1);
w1(2) = w1(1);
```
```matlab
for n = 2:N          % the LMS algorithm
    y(n) = w0(n)*x(n) + w1(n)*x(n-1);
    e(n) = d(n) - y(n);
    w0(n+1) = w0(n) + 2*mu*e(n)*x(n);
    w1(n+1) = w1(n) + 2*mu*e(n)*x(n-1);
end
n = 1:N+1;
subplot(2,1,1)
plot(n,w0)           % plot filter weight estimate versus time
axis([1 1000 0 1.2])
subplot(2,1,2)
plot(n,w1)
axis([1 1000 0 2.2])
figure(2)
subplot(1,1,1)
n = 1:N;
semilogy(n,e.*e);    % plot square error versus time
```
Note that both filter weights converge at a similar speed because the eigenvalues of $R_{xx}$ are identical. Recall

$$R_{xx} = \begin{bmatrix}R_{xx}(0) & R_{xx}(1)\\ R_{xx}(1) & R_{xx}(0)\end{bmatrix}$$

For a white process with unity power, we have

$$R_{xx}(0) = E\{x(n)\cdot x(n)\} = 1,\qquad R_{xx}(1) = E\{x(n)\cdot x(n-1)\} = 0$$

As a result,

$$R_{xx} = \begin{bmatrix}R_{xx}(0) & R_{xx}(1)\\ R_{xx}(1) & R_{xx}(0)\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} \Rightarrow \chi(R_{xx}) = 1$$
```matlab
% file name is es1.m
clear all
N = 1000;
np = 0.01;
sp = 1;
h = [1 2];
u = sqrt(sp/2).*randn(1,N+1);
x = u(1:N) + u(2:N+1);   % x(n) is now an MA process with power 1
d = conv(x,h);
d = d(1:N) + sqrt(np).*randn(1,N);
w0(1) = 0;
w1(1) = 0;
mu = 0.005;
y(1) = w0(1)*x(1);
e(1) = d(1) - y(1);
w0(2) = w0(1) + 2*mu*e(1)*x(1);
w1(2) = w1(1);
```
```matlab
for n = 2:N
    y(n) = w0(n)*x(n) + w1(n)*x(n-1);
    e(n) = d(n) - y(n);
    w0(n+1) = w0(n) + 2*mu*e(n)*x(n);
    w1(n+1) = w1(n) + 2*mu*e(n)*x(n-1);
end
n = 1:N+1;
subplot(2,1,1)
plot(n,w0)
axis([1 1000 0 1.2])
subplot(2,1,2)
plot(n,w1)
axis([1 1000 0 2.2])
figure(2)
subplot(1,1,1)
n = 1:N;
semilogy(n,e.*e);
```
Note that the convergence speed of $w_0(n)$ is slower than that of $w_1(n)$. Investigating $R_{xx}$:

$$R_{xx}(0) = E\{x(n)\cdot x(n)\} = E\{(u(n)+u(n-1))\cdot(u(n)+u(n-1))\} = E\{u^2(n)\} + E\{u^2(n-1)\} = 0.5 + 0.5 = 1$$

$$R_{xx}(1) = E\{x(n)\cdot x(n-1)\} = E\{(u(n)+u(n-1))\cdot(u(n-1)+u(n-2))\} = E\{u^2(n-1)\} = 0.5$$

As a result,

$$R_{xx} = \begin{bmatrix}R_{xx}(0) & R_{xx}(1)\\ R_{xx}(1) & R_{xx}(0)\end{bmatrix} = \begin{bmatrix}1 & 0.5\\ 0.5 & 1\end{bmatrix} \Rightarrow \lambda_{\min} = 0.5 \text{ and } \lambda_{\max} = 1.5 \Rightarrow \chi(R_{xx}) = 3$$

(MATLAB command: eig)
```matlab
% file name is es2.m
clear all
N = 1000;
np = 0.01;
sp = 1;
h = [1 2];
u = sqrt(sp/5).*randn(1,N+4);
x = u(1:N) + u(2:N+1) + u(3:N+2) + u(4:N+3) + u(5:N+4);
                         % x(n) is now an MA process with five taps
d = conv(x,h);
d = d(1:N) + sqrt(np).*randn(1,N);
w0(1) = 0;
w1(1) = 0;
mu = 0.005;
y(1) = w0(1)*x(1);
e(1) = d(1) - y(1);
w0(2) = w0(1) + 2*mu*e(1)*x(1);
w1(2) = w1(1);
```
```matlab
for n = 2:N
    y(n) = w0(n)*x(n) + w1(n)*x(n-1);
    e(n) = d(n) - y(n);
    w0(n+1) = w0(n) + 2*mu*e(n)*x(n);
    w1(n+1) = w1(n) + 2*mu*e(n)*x(n-1);
end
n = 1:N+1;
subplot(2,1,1)
plot(n,w0)
axis([1 1000 0 1.5])
subplot(2,1,2)
plot(n,w1)
axis([1 1000 0 2.5])
figure(2)
subplot(1,1,1)
n = 1:N;
semilogy(n,e.*e);
```
We see that the convergence speeds of both weights are very slow, although that of $w_1(n)$ is faster. Investigating $R_{xx}$:

$$R_{xx}(0) = E\{x(n)\cdot x(n)\} = E\left\{(u(n)+u(n-1)+u(n-2)+u(n-3)+u(n-4))^2\right\} = 0.2+0.2+0.2+0.2+0.2 = 1$$

$$R_{xx}(1) = E\{x(n)\cdot x(n-1)\} = E\{u^2(n-1)\} + E\{u^2(n-2)\} + E\{u^2(n-3)\} + E\{u^2(n-4)\} = 0.8$$

As a result,

$$R_{xx} = \begin{bmatrix}R_{xx}(0) & R_{xx}(1)\\ R_{xx}(1) & R_{xx}(0)\end{bmatrix} = \begin{bmatrix}1 & 0.8\\ 0.8 & 1\end{bmatrix} \Rightarrow \lambda_{\min} = 0.2 \text{ and } \lambda_{\max} = 1.8 \Rightarrow \chi(R_{xx}) = 9$$
2. Misadjustment

Upon convergence, if $\lim_{n\to\infty}W(n) = W_{MMSE}$, then the minimum MSE will be equal to

$$\varepsilon_{\min} = E\{d^2(n)\} - R_{dx}^T W_{MMSE} \qquad (4.32)$$

However, this will not occur in practice due to random noise in the weight vector $W(n)$. Notice that we have $\lim_{n\to\infty}E\{W(n)\} = W_{MMSE}$ but not $\lim_{n\to\infty}W(n) = W_{MMSE}$. The MSE of the LMS algorithm is computed as

$$\begin{aligned}
E\{e^2(n)\} &= E\left\{\left(d(n) - W^T(n)X(n)\right)^2\right\}\\
&= \varepsilon_{\min} + E\left\{\left(W(n)-W_{MMSE}\right)^T X(n)X^T(n)\left(W(n)-W_{MMSE}\right)\right\}\\
&= \varepsilon_{\min} + E\left\{\left(W(n)-W_{MMSE}\right)^T R_{xx}\left(W(n)-W_{MMSE}\right)\right\}
\end{aligned} \qquad (4.33)$$
The second term of the right hand side as $n\to\infty$ is known as the excess MSE and it is given by

$$\begin{aligned}
\text{excess MSE} &= \lim_{n\to\infty}E\left\{\left(W(n)-W_{MMSE}\right)^T R_{xx}\left(W(n)-W_{MMSE}\right)\right\}\\
&= \lim_{n\to\infty}E\left\{V^T(n)\cdot R_{xx}\cdot V(n)\right\}\\
&= \lim_{n\to\infty}E\left\{U^T(n)\cdot\Lambda\cdot U(n)\right\}\\
&= \mu\varepsilon_{\min}\sum_{i=0}^{L-1}\lambda_i = \mu\varepsilon_{\min}\,\mathrm{tr}[R_{xx}]
\end{aligned} \qquad (4.34)$$

where $\mathrm{tr}[R_{xx}]$ is the trace of $R_{xx}$, which is equal to the sum of all elements of the principal diagonal:

$$\mathrm{tr}[R_{xx}] = L\cdot R_{xx}(0) = L\cdot E\{x^2(n)\} \qquad (4.35)$$
As a result, the misadjustment $M$ is given by

$$M = \frac{\lim_{k\to\infty}E\{e^2(k)\} - \varepsilon_{\min}}{\varepsilon_{\min}} = \frac{\mu\varepsilon_{\min}\cdot L\cdot E\{x^2(n)\}}{\varepsilon_{\min}} = \mu\cdot L\cdot E\{x^2(n)\} \qquad (4.36)$$

which is proportional to the step size, filter length and signal power.
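A minimal numerical sketch of (4.36), with assumed values for the step size, filter length and signal power:

```matlab
% Minimal sketch: misadjustment prediction (4.36) for assumed values.
mu = 0.005; L = 2; Px = 1;   % step size, filter length, E{x^2(n)}
M = mu * L * Px               % fractional excess MSE, here 1 percent
```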
Remarks:
1. There is a tradeoff between a fast convergence rate and a small mean square error or misadjustment. When $\mu$ increases, both the convergence rate and $M$ increase; when $\mu$ decreases, both the convergence rate and $M$ decrease.
2. The bound for $\mu$ is

$$0 < \mu < \frac{1}{\lambda_{\max}} \qquad (4.37)$$

In practice, the signal power of $x(n)$ can generally be estimated more easily than the eigenvalues of $R_{xx}$. We also note that

$$\lambda_{\max} \le \sum_{i=1}^{L}\lambda_i = \mathrm{tr}[R_{xx}] = L\cdot E\{x^2(n)\} \qquad (4.38)$$

A more restrictive bound for $\mu$, which is much easier to apply, is thus

$$0 < \mu < \frac{1}{L\cdot E\{x^2(n)\}} \qquad (4.39)$$

Moreover, instead of a fixed value of $\mu$, we can make it time-varying as $\mu(n)$. A design idea for a good $\mu(n)$ is

$$\mu(n) = \begin{cases}\text{large value initially, to ensure a fast initial convergence rate}\\ \text{small value finally, to ensure a small misadjustment upon convergence}\end{cases}$$
LMS Variants
1. Normalized LMS (NLMS) algorithm

- the product vector $e(n)X(n)$ is modified with respect to the squared Euclidean norm of the tap-input vector $X(n)$:

$$W(n+1) = W(n) + \frac{\mu}{c + X^T(n)X(n)}\,e(n)X(n) \qquad (4.40)$$

where $c$ is a small positive constant to avoid division by zero

- it can also be considered as an LMS algorithm with a time-varying step size:

$$\mu(n) = \frac{\mu}{c + X^T(n)X(n)} \qquad (4.41)$$

- substituting $c = 0$, it can be shown that the NLMS algorithm converges if $0 < \mu < 0.5$ ⇒ selection of the step size in the NLMS is much easier than that of the LMS algorithm
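A minimal sketch of one NLMS iteration (4.40); the function name is illustrative:

```matlab
% Minimal sketch of one NLMS iteration (4.40); c is a small positive
% constant guarding against division by zero.
function [W, e] = nlms_update(W, X, d, mu, c)
    e = d - W.' * X;                      % a priori error
    W = W + (mu / (c + X.' * X)) * e * X; % normalized update (4.40)
end
```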
2. Sign algorithms

- pilot LMS or signed error or sign algorithm:

$$W(n+1) = W(n) + 2\mu\,\mathrm{sgn}[e(n)]X(n) \qquad (4.42)$$

- clipped LMS or signed regressor:

$$W(n+1) = W(n) + 2\mu e(n)\,\mathrm{sgn}[X(n)] \qquad (4.43)$$

- zero-forcing LMS or sign-sign:

$$W(n+1) = W(n) + 2\mu\,\mathrm{sgn}[e(n)]\,\mathrm{sgn}[X(n)] \qquad (4.44)$$

- their computational complexity is lower than that of the LMS algorithm, but they are relatively difficult to analyze
3. Leaky LMS algorithm

- the LMS update is modified by the presence of a constant leakage factor $\gamma$:

$$W(n+1) = \gamma\cdot W(n) + 2\mu e(n)X(n) \qquad (4.45)$$

where $0 < \gamma < 1$

- it operates when $R_{xx}$ has zero eigenvalues

4. Least mean fourth algorithm

- instead of minimizing $E\{e^2(n)\}$, $E\{e^4(n)\}$ is minimized based on the LMS approach:

$$W(n+1) = W(n) + 4\mu e^3(n)X(n) \qquad (4.46)$$

- it can outperform the LMS algorithm in non-Gaussian signal and noise conditions
Application Examples

Example 4.2

1. Linear Prediction

Suppose a signal $x(n)$ is a second-order autoregressive (AR) process that satisfies the following difference equation:

$$x(n) = 1.558x(n-1) - 0.81x(n-2) + v(n)$$

where $v(n)$ is a white noise process such that

$$R_{vv}(m) = E\{v(n)v(n+m)\} = \begin{cases}\sigma_v^2, & m = 0\\ 0, & \text{otherwise}\end{cases}$$

We want to use a two-coefficient LMS filter to predict $x(n)$ by

$$\hat{x}(n) = \sum_{i=1}^{2}w_i(n)x(n-i) = w_1(n)x(n-1) + w_2(n)x(n-2)$$
Upon convergence, we desire

$$E\{w_1(n)\} \to 1.558 \quad\text{and}\quad E\{w_2(n)\} \to -0.81$$

[Block diagram: the delayed samples x(n−1) and x(n−2) are weighted by w₁(n) and w₂(n) and summed to form the prediction d(n) of x(n); the prediction error is e(n) = x(n) − d(n).]
The error function or prediction error $e(n)$ is given by

$$e(n) = d(n) - \sum_{i=1}^{2}w_i(n)x(n-i) = x(n) - w_1(n)x(n-1) - w_2(n)x(n-2)$$

Thus the LMS algorithm for this problem is

$$w_1(n+1) = w_1(n) - \mu\frac{\partial e^2(n)}{\partial w_1(n)} = w_1(n) - \mu\cdot 2e(n)\cdot\frac{\partial e(n)}{\partial w_1(n)} = w_1(n) + 2\mu e(n)x(n-1)$$

and

$$w_2(n+1) = w_2(n) - \mu\frac{\partial e^2(n)}{\partial w_2(n)} = w_2(n) + 2\mu e(n)x(n-2)$$

The computational requirement for each sampling interval is

- multiplications: 5
- additions/subtractions: 4
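The following sketch simulates this predictor in the style of es.m (assuming $\sigma_v^2 = 1$ and zero initial conditions; variable names are illustrative):

```matlab
% Minimal sketch of the two-coefficient LMS predictor of Example 4.2.
N = 1000; mu = 0.02;
v = randn(1,N);                  % white driving noise, power 1
x = zeros(1,N);
x(1) = v(1);
x(2) = 1.558*x(1) + v(2);        % start-up, x(0) taken as 0
for n = 3:N
    x(n) = 1.558*x(n-1) - 0.81*x(n-2) + v(n);   % AR(2) signal
end
w1 = 0; w2 = 0;                  % initial predictor weights
for n = 3:N
    e = x(n) - w1*x(n-1) - w2*x(n-2);   % prediction error
    w1 = w1 + 2*mu*e*x(n-1);            % LMS updates
    w2 = w2 + 2*mu*e*x(n-2);
end
disp([w1 w2])                    % should approach [1.558 -0.81]
```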
Two values of $\mu$, 0.02 and 0.004, are investigated:

[Figure: convergence characteristics for the LMS predictor with $\mu = 0.02$]
[Figure: convergence characteristics for the LMS predictor with $\mu = 0.004$]
Observations:

1. When $\mu = 0.02$, we had a fast convergence rate (the parameters converged to the desired values in approximately 200 iterations), but large fluctuations existed in $w_1(n)$ and $w_2(n)$.

2. When $\mu = 0.004$, there were small fluctuations in $w_1(n)$ and $w_2(n)$, but the filter coefficients did not converge to the desired values of 1.558 and -0.81, respectively, after the 300th iteration.

3. The learning behaviours of $w_1(n)$ and $w_2(n)$ agreed with those of $E\{w_1(n)\}$ and $E\{w_2(n)\}$. Notice that $E\{w_1(n)\}$ and $E\{w_2(n)\}$ can be derived by taking expectation on the LMS algorithm.
Example 4.3

2. System Identification

Given the input signal $x(n)$ and output signal $d(n)$, we can estimate the impulse response of the system or plant using the LMS algorithm.

Suppose the transfer function of the plant is $\sum_{i=0}^{2}h_i z^{-i}$, which is a causal FIR unknown system; then $d(n)$ can be represented as

$$d(n) = \sum_{i=0}^{2}h_i x(n-i)$$
Assuming that the order of the transfer function is unknown, we use a 2-coefficient LMS filter to model the system function as follows.

[Block diagram: x(n) drives both the plant $\sum_{i=0}^{2}h_i z^{-i}$, which produces d(n), and the adaptive filter $\sum_{i=0}^{1}w_i z^{-i}$, which produces y(n); e(n) = d(n) − y(n).]

The error function is computed as

$$e(n) = d(n) - y(n) = d(n) - \sum_{i=0}^{1}w_i(n)x(n-i) = d(n) - w_0(n)x(n) - w_1(n)x(n-1)$$
Thus the LMS algorithm for this problem is

$$w_0(n+1) = w_0(n) - \frac{\mu}{2}\cdot\frac{\partial e^2(n)}{\partial w_0(n)} = w_0(n) - \frac{\mu}{2}\cdot 2e(n)\cdot\frac{\partial e(n)}{\partial w_0(n)} = w_0(n) + \mu e(n)x(n)$$

and

$$w_1(n+1) = w_1(n) - \frac{\mu}{2}\cdot\frac{\partial e^2(n)}{\partial w_1(n)} = w_1(n) + \mu e(n)x(n-1)$$

The learning behaviours of the filter weights $w_0(n)$ and $w_1(n)$ can be obtained by taking expectation on the LMS algorithm. To simplify the analysis, we assume that $x(n)$ is a stationary white noise process such that

$$R_{xx}(m) = E\{x(n)x(n+m)\} = \begin{cases}\sigma_x^2, & m = 0\\ 0, & \text{otherwise}\end{cases}$$
Assume the filter weights are independent of $x(n)$; applying expectation to the first updating rule gives

$$\begin{aligned}
E\{w_0(n+1)\} &= E\{w_0(n)\} + \mu E\{e(n)x(n)\}\\
&= E\{w_0(n)\} + \mu E\{(d(n) - y(n))\,x(n)\}\\
&= E\{w_0(n)\} + \mu E\{(h_0x(n) + h_1x(n-1) + h_2x(n-2) - w_0(n)x(n) - w_1(n)x(n-1))\,x(n)\}\\
&= E\{w_0(n)\} + \mu h_0E\{x^2(n)\} + \mu h_1E\{x(n)x(n-1)\} + \mu h_2E\{x(n)x(n-2)\}\\
&\qquad - \mu E\{w_0(n)\}E\{x^2(n)\} - \mu E\{w_1(n)\}E\{x(n)x(n-1)\}\\
&= E\{w_0(n)\} + \mu\sigma_x^2h_0 - \mu\sigma_x^2E\{w_0(n)\}
\end{aligned}$$
$$\Rightarrow E\{w_0(n+1)\} = (1-\mu\sigma_x^2)E\{w_0(n)\} + \mu\sigma_x^2h_0$$
$$\Rightarrow E\{w_0(n)\} = (1-\mu\sigma_x^2)E\{w_0(n-1)\} + \mu\sigma_x^2h_0$$
$$\vdots$$
$$\Rightarrow E\{w_0(1)\} = (1-\mu\sigma_x^2)E\{w_0(0)\} + \mu\sigma_x^2h_0$$

Multiplying the second equation by $(1-\mu\sigma_x^2)$ on both sides, the third equation by $(1-\mu\sigma_x^2)^2$, etc., and summing all the resultant equations, we have

$$\begin{aligned}
E\{w_0(n+1)\} &= (1-\mu\sigma_x^2)^{n+1}E\{w_0(0)\} + \mu\sigma_x^2h_0\left(1 + (1-\mu\sigma_x^2) + \cdots + (1-\mu\sigma_x^2)^n\right)\\
&= (1-\mu\sigma_x^2)^{n+1}E\{w_0(0)\} + \mu\sigma_x^2h_0\cdot\frac{1 - (1-\mu\sigma_x^2)^{n+1}}{1 - (1-\mu\sigma_x^2)}\\
&= (1-\mu\sigma_x^2)^{n+1}E\{w_0(0)\} + h_0\left(1 - (1-\mu\sigma_x^2)^{n+1}\right)\\
&= \left(E\{w_0(0)\} - h_0\right)(1-\mu\sigma_x^2)^{n+1} + h_0
\end{aligned}$$
Hence

$$\lim_{n\to\infty}E\{w_0(n)\} = h_0$$

provided that

$$|1-\mu\sigma_x^2| < 1 \Rightarrow -1 < 1-\mu\sigma_x^2 < 1 \Rightarrow 0 < \mu < \frac{2}{\sigma_x^2}$$

Similarly, we can show that the expected value of $w_1(n)$ is

$$E\{w_1(n)\} = \left(E\{w_1(0)\} - h_1\right)(1-\mu\sigma_x^2)^n + h_1$$

provided that the same condition $0 < \mu < 2/\sigma_x^2$ holds.

It is worth noting that the choice of the initial filter weights $E\{w_0(0)\}$ and $E\{w_1(0)\}$ does not affect the convergence of the LMS algorithm because the performance surface is unimodal.
Discussion:

Since the LMS filter consists of two weights while the actual transfer function comprises three coefficients, the plant cannot be exactly modeled in this case. This is referred to as under-modeling. If we use a 3-weight LMS filter with transfer function $\sum_{i=0}^{2}w_i z^{-i}$, then the plant can be modeled exactly. If we use more than 3 coefficients in the LMS filter, we can still estimate the transfer function accurately; however, in this case, the misadjustment increases with the filter length used.

Notice that we could also use the Wiener filter to find the impulse response of the plant if the signal statistics $R_{xx}(0)$, $R_{xx}(1)$, $R_{dx}(0)$ and $R_{dx}(-1)$ were available. However, we do not have $R_{dx}(0)$ and $R_{dx}(-1)$, although $R_{xx}(0) = \sigma_x^2$ and $R_{xx}(1) = 0$ are known. Therefore, the LMS adaptive filter can be considered as an adaptive realization of the Wiener filter, and it is used when the signal statistics are not (completely) known.
Example 4.4

3. Interference Cancellation

Given a received signal $r(k)$ which consists of a source signal $s(k)$ and a sinusoidal interference with known frequency, the task is to extract $s(k)$ from $r(k)$. Notice that the amplitude and phase of the sinusoid are unknown. A well-known application is the removal of 50/60 Hz power line interference in the recording of the electrocardiogram (ECG).

[Block diagram: the primary input is r(k) = s(k) + A cos(ω₀k + φ); the reference sin(ω₀k) is weighted by b₀ directly and, after a 90° phase shift giving cos(ω₀k), weighted by b₁; both weighted references are subtracted from r(k) to form e(k).]
The interference cancellation system consists of a 90° phase-shifter and a two-weight adaptive filter. By properly adjusting the weights, the reference waveform can be changed in magnitude and phase in any way to model the interfering sinusoid. The filtered output is of the form

$$e(k) = r(k) - b_0(k)\sin(\omega_0 k) - b_1(k)\cos(\omega_0 k)$$

The LMS algorithm is

$$b_0(k+1) = b_0(k) - \frac{\mu}{2}\cdot\frac{\partial e^2(k)}{\partial b_0(k)} = b_0(k) - \frac{\mu}{2}\cdot 2e(k)\cdot\frac{\partial e(k)}{\partial b_0(k)} = b_0(k) + \mu e(k)\sin(\omega_0 k)$$

and

$$b_1(k+1) = b_1(k) - \frac{\mu}{2}\cdot\frac{\partial e^2(k)}{\partial b_1(k)} = b_1(k) + \mu e(k)\cos(\omega_0 k)$$
Taking the expected value of $b_0(k+1)$, we have

$$\begin{aligned}
E\{b_0(k+1)\} &= E\{b_0(k)\} + \mu E\{e(k)\sin(\omega_0 k)\}\\
&= E\{b_0(k)\} + \mu E\{[s(k) + A\cos(\omega_0 k)\cos(\phi) - A\sin(\omega_0 k)\sin(\phi)\\
&\qquad\qquad - b_0(k)\sin(\omega_0 k) - b_1(k)\cos(\omega_0 k)]\sin(\omega_0 k)\}\\
&= E\{b_0(k)\} + \mu E\left\{\frac{A\cos(\phi)}{2}\sin(2\omega_0 k) - A\sin(\phi)\sin^2(\omega_0 k)\right\}\\
&\qquad - \mu E\left\{b_0(k)\cdot\frac{1-\cos(2\omega_0 k)}{2}\right\} - \mu E\left\{b_1(k)\cdot\frac{\sin(2\omega_0 k)}{2}\right\}\\
&= \left(1 - \frac{\mu}{2}\right)E\{b_0(k)\} - \frac{\mu A\sin(\phi)}{2}
\end{aligned}$$

(the $E\{s(k)\sin(\omega_0 k)\}$ term and the double-frequency terms average to zero)
Following the derivation in Example 4.3, provided that $0 < \mu < 4$, the learning curve of $E\{b_0(k)\}$ can be obtained as

$$E\{b_0(k)\} = -A\sin(\phi) + \left(E\{b_0(0)\} + A\sin(\phi)\right)\cdot\left(1-\frac{\mu}{2}\right)^k$$

Similarly, $E\{b_1(k)\}$ is calculated as

$$E\{b_1(k)\} = A\cos(\phi) + \left(E\{b_1(0)\} - A\cos(\phi)\right)\cdot\left(1-\frac{\mu}{2}\right)^k$$

When $k\to\infty$, we have

$$\lim_{k\to\infty}E\{b_0(k)\} = -A\sin(\phi) \quad\text{and}\quad \lim_{k\to\infty}E\{b_1(k)\} = A\cos(\phi)$$
The filtered output is then approximated as

$$e(k) \approx r(k) + A\sin(\phi)\sin(\omega_0 k) - A\cos(\phi)\cos(\omega_0 k) = r(k) - A\cos(\omega_0 k + \phi) = s(k)$$

which means that $s(k)$ can be recovered accurately upon convergence.

Suppose $E\{b_0(0)\} = E\{b_1(0)\} = 0$ and $\mu = 0.02$, and we want to find the number of iterations required for $E\{b_1(k)\}$ to reach 90% of its steady state value. Let the required number of iterations be $k_0$; it can be calculated from

$$E\{b_1(k_0)\} = 0.9A\cos(\phi) = A\cos(\phi) - A\cos(\phi)\left(1 - \frac{0.02}{2}\right)^{k_0}$$
$$\Rightarrow (0.99)^{k_0} = 0.1 \Rightarrow k_0 = \frac{\log(0.1)}{\log(0.99)} = 229.1$$

Hence 230 iterations are required.
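The value of $k_0$ can be checked directly with a two-line sketch:

```matlab
% Minimal sketch: iterations needed for E{b1(k)} to reach 90% of its
% steady state value (Example 4.4, mu = 0.02).
mu = 0.02;
k0 = log(0.1)/log(1 - mu/2)   % = 229.1, so 230 iterations
```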
If we use the Wiener filter with filter weights $b_0$ and $b_1$, the mean square error function can be computed as

$$E\{e^2(k)\} = \frac{b_0^2 + b_1^2}{2} + A\sin(\phi)\cdot b_0 - A\cos(\phi)\cdot b_1 + E\{s^2(k)\} + \frac{A^2}{2}$$

The Wiener coefficients are found by differentiating $E\{e^2(k)\}$ with respect to $b_0$ and $b_1$ and then setting the resultant expressions to zero. We have

$$\frac{\partial E\{e^2(k)\}}{\partial b_0} = b_0 + A\sin(\phi) = 0 \Rightarrow \tilde{b}_0 = -A\sin(\phi)$$

and

$$\frac{\partial E\{e^2(k)\}}{\partial b_1} = b_1 - A\cos(\phi) = 0 \Rightarrow \tilde{b}_1 = A\cos(\phi)$$
Example 4.5

4. Time Delay Estimation

Estimation of the time delay between two measured signals is a problem which occurs in a diverse range of applications including radar, sonar, geophysics and biomedical signal analysis. A simple model for the received signals is

$$r_1(k) = s(k) + n_1(k)$$
$$r_2(k) = \alpha s(k-D) + n_2(k)$$

where $s(k)$ is the signal of interest while $n_1(k)$ and $n_2(k)$ are additive noises. The $\alpha$ is the attenuation and $D$ is the time delay to be determined. In general, $D$ is not an integral multiple of the sampling period. Suppose the sampling period is 1 second and $s(k)$ is bandlimited between -0.5 Hz and 0.5 Hz ($-\pi$ rad/s and $\pi$ rad/s). We can derive the system which can produce a delay of $D$ as follows.
Taking the Fourier transform of $s_D(k) = s(k-D)$ yields

$$\hat{S}(\omega) = e^{-j\omega D}\cdot S(\omega)$$

This means that a system with transfer function $e^{-j\omega D}$ can generate a delay of $D$ for $s(k)$. Using the inverse DTFT formula of (I.9), the impulse response of $e^{-j\omega D}$ is calculated as

$$h(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}e^{-j\omega D}\cdot e^{j\omega n}\,d\omega = \frac{1}{2\pi}\int_{-\pi}^{\pi}e^{j\omega(n-D)}\,d\omega = \mathrm{sinc}(n-D)$$

where

$$\mathrm{sinc}(v) = \frac{\sin(\pi v)}{\pi v}$$
As a result, $s(k-D)$ can be represented as

$$s(k-D) = s(k)\otimes h(k) = \sum_{i=-\infty}^{\infty}s(k-i)\,\mathrm{sinc}(i-D) \approx \sum_{i=-P}^{P}s(k-i)\,\mathrm{sinc}(i-D)$$

for sufficiently large $P$.

This means that we can use a non-causal FIR filter to model the time delay, and it has the form

$$W(z) = \sum_{i=-P}^{P}w_i z^{-i}$$

It can be shown that $w_i \to \beta\cdot\mathrm{sinc}(i-D)$ for $i = -P, -P+1, \ldots, P$ using the minimum mean square error approach. The time delay can be estimated from $\{w_i\}$ using the following interpolation:

$$\hat{D} = \arg\max_t\sum_{i=-P}^{P}w_i\,\mathrm{sinc}(i-t)$$
[Block diagram: r₁(k) is filtered by $W(z) = \sum_{i=-P}^{P}w_i z^{-i}$ and the result is subtracted from r₂(k) to form e(k).]

$$e(k) = r_2(k) - \sum_{i=-P}^{P}r_1(k-i)w_i(k)$$
The LMS algorithm for the time delay estimation problem is thus

$$w_j(k+1) = w_j(k) - \mu\frac{\partial e^2(k)}{\partial w_j(k)} = w_j(k) - \mu\cdot 2e(k)\cdot\frac{\partial e(k)}{\partial w_j(k)} = w_j(k) + 2\mu e(k)r_1(k-j),\quad j = -P, \ldots, -P+1, \ldots, P$$

The time delay estimate at time $k$ is

$$\hat{D}(k) = \arg\max_t\sum_{i=-P}^{P}w_i(k)\,\mathrm{sinc}(i-t)$$
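As an illustration of the interpolation step, the sketch below assumes ideal converged weights $w_i = \mathrm{sinc}(i-D)$ (i.e., $\beta = 1$) with a hypothetical delay $D = 2.3$, and recovers $\hat{D}$ by a grid search; sinc here is the Signal Processing Toolbox function, which matches the definition above:

```matlab
% Minimal sketch: delay recovery from converged weights (assumed values).
P = 10; D = 2.3;
i = -P:P;
w = sinc(i - D);                     % idealized converged weights
t = -P:0.01:P;                       % grid of candidate delays
J = zeros(size(t));
for m = 1:length(t)
    J(m) = sum(w .* sinc(i - t(m))); % interpolation cost
end
[~, m] = max(J);
Dhat = t(m)                          % ~ 2.3
```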
Exponentially Weighted Recursive Least-Squares

A. Optimization Criterion

To minimize the weighted sum of squares

$$J(n) = \sum_{l=0}^{n}\lambda^{n-l}e^2(l)$$

for each time $n$, where $\lambda$ is a weighting factor such that $0 < \lambda \le 1$.

- When $\lambda = 1$, the optimization criterion is identical to that of least squares filtering. This value of $\lambda$ should not be used in a changing environment because all squared errors (current value and past values) have the same weighting factor of 1.
- To smooth out the effect of the old samples, $\lambda$ should be chosen less than 1 for operating in nonstationary conditions.

B. Derivation

Assume an FIR filter for simplicity. Following the derivation of the least squares filter, we differentiate $J(n)$ with respect to the filter weight vector at time $n$, i.e., $W(n)$, and then set the $L$ resultant equations to zero.
By so doing, we have

$$R(n)\cdot W(n) = G(n) \qquad (4.47)$$

where

$$R(n) = \sum_{l=0}^{n}\lambda^{n-l}X(l)X^T(l),$$
$$G(n) = \sum_{l=0}^{n}\lambda^{n-l}d(l)X(l),$$
$$X(l) = [x(l)\ x(l-1)\ \cdots\ x(l-L+1)]^T.$$

Notice that $R(n)$ and $G(n)$ can be computed recursively from

$$R(n) = \lambda\sum_{l=0}^{n-1}\lambda^{n-1-l}X(l)X^T(l) + X(n)X^T(n) = \lambda R(n-1) + X(n)X^T(n) \qquad (4.48)$$

$$G(n) = \lambda G(n-1) + d(n)X(n) \qquad (4.49)$$
Using the well-known matrix inversion lemma: if

$$A = B + CC^T \qquad (4.50)$$

where $A$ and $B$ are $N\times N$ matrices and $C$ is a vector of length $N$, then

$$A^{-1} = B^{-1} - \frac{B^{-1}CC^TB^{-1}}{1 + C^TB^{-1}C} \qquad (4.51)$$

Thus $R^{-1}(n)$ can be written as

$$R^{-1}(n) = \frac{1}{\lambda}\left[R^{-1}(n-1) - \frac{R^{-1}(n-1)X(n)X^T(n)R^{-1}(n-1)}{\lambda + X^T(n)R^{-1}(n-1)X(n)}\right] \qquad (4.52)$$
The filter weight $W(n)$ is calculated as

$$\begin{aligned}
W(n) &= R^{-1}(n)G(n)\\
&= R^{-1}(n)\left[\lambda G(n-1) + d(n)X(n)\right]\\
&= \frac{1}{\lambda}\left[R^{-1}(n-1) - \frac{R^{-1}(n-1)X(n)X^T(n)R^{-1}(n-1)}{\lambda + X^T(n)R^{-1}(n-1)X(n)}\right]\left[\lambda G(n-1) + d(n)X(n)\right]\\
&= W(n-1) + \frac{d(n) - X^T(n)W(n-1)}{\lambda + X^T(n)R^{-1}(n-1)X(n)}\,R^{-1}(n-1)X(n)
\end{aligned}$$

where the last line follows after expanding the product, using $W(n-1) = R^{-1}(n-1)G(n-1)$, and collecting the terms in $R^{-1}(n-1)X(n)$.
As a result, the exponentially weighted recursive least squares (RLS) algorithm is summarized as follows:

1. Initialize $W(0)$ and $R^{-1}(0)$.
2. For $n = 1, 2, \ldots$, compute

$$e(n) = d(n) - X^T(n)W(n-1) \qquad (4.53)$$

$$\alpha(n) = \frac{1}{\lambda + X^T(n)R^{-1}(n-1)X(n)} \qquad (4.54)$$

$$W(n) = W(n-1) + \alpha(n)e(n)R^{-1}(n-1)X(n) \qquad (4.55)$$

$$R^{-1}(n) = \frac{1}{\lambda}\left[R^{-1}(n-1) - \alpha(n)R^{-1}(n-1)X(n)X^T(n)R^{-1}(n-1)\right] \qquad (4.56)$$
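A minimal sketch of one RLS iteration (4.53)-(4.56); P stands for $R^{-1}(n-1)$ and the function name is illustrative:

```matlab
% Minimal sketch of the exponentially weighted RLS recursion.
function [W, P] = rls_update(W, P, X, d, lambda)
    e = d - X.' * W;                       % a priori error (4.53)
    alpha = 1 / (lambda + X.' * P * X);    % gain denominator (4.54)
    W = W + alpha * e * P * X;             % weight update (4.55)
    P = (P - alpha * (P*X) * (X.'*P)) / lambda;   % inverse update (4.56)
end
```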
Remarks:

1. When $\lambda = 1$, the algorithm reduces to the standard RLS algorithm that minimizes $\sum_{l=0}^{n}e^2(l)$.

2. For nonstationary data, $0.95 < \lambda < 0.9995$ has been suggested.

3. Simple choices of $W(0)$ and $R^{-1}(0)$ are $0$ and $\sigma^2I$, respectively, where $\sigma^2$ is a small positive constant.

C. Comparison with the LMS Algorithm

1. Computational Complexity

RLS is more computationally expensive than the LMS. Assuming there are $L$ filter taps, LMS requires $(4L+1)$ additions and $(4L+3)$ multiplications per update, while the exponentially weighted RLS needs a total of $(3L^2+L-1)$ additions/subtractions and $(4L^2+4L)$ multiplications/divisions.
2. Rate of Convergence

RLS provides a faster convergence speed than the LMS because

- RLS is an approximation of the Newton method, while LMS is an approximation of the steepest descent method
- the pre-multiplication by $R^{-1}(n)$ in the RLS algorithm makes the resultant eigenvalue spread become unity
Improvement of the LMS Algorithm with the Use of an Orthogonal Transform

A. Motivation

When the input signal is white, the eigenvalue spread has a minimum value of 1. In this case, the LMS algorithm provides the optimum rate of convergence. However, many practical signals are nonwhite: how can we improve the rate of convergence while still using the LMS algorithm?
B. Idea

Transform the input $x(n)$ to another signal $v(n)$ so that the modified eigenvalue spread is 1. Two steps are involved:

1. Transform $x(n)$ to $v(n)$ using an $N\times N$ orthogonal transform $T$ so that

$$R_{vv} = T\cdot R_{xx}\cdot T^T = \begin{bmatrix}\sigma_1^2 & 0 & \cdots & 0\\ 0 & \sigma_2^2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \sigma_N^2\end{bmatrix}$$

where

$$V(n) = T\cdot X(n),$$
$$V(n) = [v_1(n)\ v_2(n)\ \cdots\ v_N(n)]^T,$$
$$X(n) = [x(n)\ x(n-1)\ \cdots\ x(n-N+1)]^T,$$
$$R_{xx} = E\{X(n)X^T(n)\},\qquad R_{vv} = E\{V(n)V^T(n)\}$$

2. Modify the eigenvalues of $R_{vv}$ so that the resultant matrix has identical eigenvalues:

$$R_{vv} \xrightarrow{\text{power normalization}} R'_{vv} = \begin{bmatrix}\sigma^2 & 0 & \cdots & 0\\ 0 & \sigma^2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \sigma^2\end{bmatrix}$$
Block diagram of the transform domain adaptive filter
C. Algorithm

The modified LMS algorithm is given by

$$W(n+1) = W(n) + 2\mu\Lambda^{-2}e(n)V(n)$$

where

$$\Lambda^{-2} = \begin{bmatrix}1/\sigma_1^2 & 0 & \cdots & 0\\ 0 & 1/\sigma_2^2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1/\sigma_N^2\end{bmatrix}$$

$$e(n) = d(n) - y(n)$$
$$y(n) = W^T(n)V(n) = W^T(n)\cdot T\cdot X(n)$$
$$W(n) = [w_1(n)\ w_2(n)\ \cdots\ w_N(n)]^T$$
Writing in scalar form, we have

$$w_i(n+1) = w_i(n) + \frac{2\mu e(n)v_i(n)}{\sigma_i^2},\quad i = 1, 2, \ldots, N$$

Since $\sigma_i^2$ is the power of $v_i(n)$, i.e., $E\{v_i^2(n)\}$, it is not known a priori and should be estimated. A common estimation procedure for $\sigma_i^2$ is

$$\sigma_i^2(n) = \alpha\sigma_i^2(n-1) + |v_i(n)|^2$$

where $0 < \alpha < 1$. In practice, $\alpha$ should be chosen close to 1, say, $\alpha = 0.9$.
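A minimal sketch of one transform-domain LMS iteration with power normalization, assuming the DCT as $T$ (MATLAB's dct, from the Signal Processing Toolbox; all names are illustrative):

```matlab
% Minimal sketch: transform-domain LMS with running power normalization.
% W, sig2 and X are length-N column vectors.
function [W, sig2] = tdlms_update(W, sig2, X, d, mu, alpha)
    V = dct(X);                      % V(n) = T*X(n), DCT assumed as T
    e = d - W.' * V;
    sig2 = alpha*sig2 + abs(V).^2;   % running power estimate per bin
    W = W + 2*mu*e*(V ./ sig2);      % normalized scalar-form update
end
```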
Using a 2-coefficient adaptive filter as an example:

[Figure: a 2-D error surface without transform]
[Figure: error surface with discrete cosine transform (DCT)]
[Figure: error surface with transform and power normalization]
Remarks:

1. The lengths of the principal axes of the hyperellipses are proportional to the eigenvalues of $R$.

2. Without power normalization, no convergence rate improvement is achieved by using the transform.

3. The best choice for $T$ is the Karhunen-Loeve (KL) transform, which is signal dependent. This transform can make $R_{vv}$ a diagonal matrix, but the signal statistics are required for its computation.

4. Considerations in choosing a transform: does a fast algorithm exist? complex or real transform? are the elements of the transform all powers of 2?

5. Examples of orthogonal transforms are the discrete sine transform (DST), discrete Fourier transform (DFT), discrete cosine transform (DCT), Walsh-Hadamard transform (WHT), discrete Hartley transform (DHT) and power-of-2 (PO2) transform.
Improvement of the LMS Algorithm Using Newton's Method

Since the eigenvalue spread of the Newton based approach is 1, we can combine the LMS algorithm and Newton's method to form the "LMS/Newton" algorithm as follows:

$$W(n+1) = W(n) - \frac{\mu}{2}R_{xx}^{-1}\frac{\partial e^2(n)}{\partial W(n)} = W(n) + \mu R_{xx}^{-1}e(n)X(n)$$

Remarks:

1. The computational complexity of the LMS/Newton algorithm is smaller than that of the RLS algorithm but greater than that of the LMS algorithm.

2. When $R_{xx}$ is not available, it can be estimated as follows:

$$\hat{R}_{xx}(l,n) = \alpha\hat{R}_{xx}(l,n-1) + x(n)\cdot x(n-l),\quad l = 0, 1, \ldots, L-1$$

where $\hat{R}_{xx}(l,n)$ represents the estimate of $R_{xx}(l)$ at time $n$ and $0 < \alpha < 1$.
Possible Research Directions for Adaptive Signal Processing
1. Adaptive modeling of non-linear systems
For example, the second-order Volterra system is a simple non-linear system. The output $y(n)$ is related to the input $x(n)$ by

$$y(n) = \sum_{j=0}^{L_1-1}w^{(1)}(j)x(n-j) + \sum_{j_1=0}^{L_2-1}\sum_{j_2=0}^{L_2-1}w^{(2)}(j_1,j_2)x(n-j_1)x(n-j_2)$$

Another related research direction is to analyze non-linear adaptive filters, for example, neural networks, whose performance is generally more difficult to analyze.
2. New optimization criteria for non-Gaussian signals/noises

For example, the LMF algorithm minimizes $E\{e^4(n)\}$.

In fact, a class of steepest descent algorithms can be generalized by the least-mean-$p$ (LMP) norm. The cost function to be minimized is given by

$$J = E\{|e(k)|^p\}$$
Some remarks:

- When $p = 1$, it becomes least-mean-deviation (LMD); when $p = 2$, it is least-mean-square (LMS); and if $p = 4$, it becomes the least-mean-fourth (LMF).
- The LMS is optimum for Gaussian noises, but this may not be true for noises of other probability density functions (PDFs). For example, if the noise is impulsive, such as an $\alpha$-stable process with $1 \le \alpha < 2$, LMD performs better than LMS; if the noise is of uniform distribution or if it is a sinusoidal signal, then LMF outperforms LMS. Therefore, the optimum $p$ depends on the signal/noise models.
- The parameter $p$ can be any real number, but the resulting algorithm will be difficult to analyze, particularly for non-integer $p$.
- Combinations of different norms can be used to achieve better performance. Some suggest a mixed norm criterion, e.g., $a\cdot E\{e^2(n)\} + b\cdot E\{e^4(n)\}$.
- The median operation can be employed in the LMP algorithm for operating in the presence of impulsive noise. For example, the median LMS belongs to the family of order-statistics-least-mean-square (OSLMS) adaptive filter algorithms.

3. Adaptive algorithms with fast convergence rate and small computation

For example, design of an optimal step size in LMS algorithms.

4. Adaptive IIR filters

Adaptive IIR filters have two advantages over adaptive FIR filters:

- they generalize FIR filters and can model IIR systems more accurately
- fewer filter coefficients are generally required

However, the development of adaptive IIR filters is generally more difficult than that of FIR filters because:

- the performance surface is multimodal ⇒ the algorithm may lock at an undesired local minimum
- it may lead to biased solutions
- it can be unstable
5. Unsupervised adaptive signal processing (blind signal processing)

What we have discussed previously refers to supervised adaptive signal processing, where there is always a desired signal, reference signal or training signal. In some applications, such signals are not available. Two important application areas of unsupervised adaptive signal processing are:

- Blind source separation, e.g., speaker identification in the noisy environment of a cocktail party, or separation of signals overlapped in time and frequency in wireless communications
- Blind deconvolution (the inverse of convolution), e.g., restoration of a source signal after propagating through an unknown wireless channel

6. New applications

For example, echo cancellation for hands-free telephone systems and signal estimation in wireless channels using space-time processing.
Questions for Discussion

1. The LMS algorithm is given by (4.23):

$$w_i(n+1) = w_i(n) + 2\mu e(n)x(n-i),\quad i = 0, 1, \ldots, L-1$$

where

$$e(n) = d(n) - y(n),$$
$$y(n) = \sum_{i=0}^{L-1}w_i(n)x(n-i) = W^T(n)X(n).$$

Based on the idea of the LMS algorithm, derive the adaptive algorithm that minimizes $E\{|e(n)|\}$.

(Hint: $\frac{\partial|v|}{\partial v} = \mathrm{sgn}(v)$, where $\mathrm{sgn}(v) = 1$ if $v > 1$'s sense $v > 0$ and $\mathrm{sgn}(v) = -1$ otherwise.)
2. For adaptive IIR filtering, there are basically two approaches, namely output-error and equation-error. Let the unknown IIR system be

$$H(z) = \frac{B(z)}{A(z)} = \frac{\sum_{j=0}^{N-1}b_j z^{-j}}{1 + \sum_{i=1}^{M}a_i z^{-i}}$$

Using the minimization of the mean square error as the performance criterion, the output-error scheme is a direct approach which minimizes $E\{e^2(n)\}$ where

$$e(n) = d(n) - y(n)$$

with
$$\frac{Y(z)}{X(z)} = \frac{\hat{B}(z)}{\hat{A}(z)} = \frac{\sum_{j=0}^{N-1}\hat{b}_j z^{-j}}{1 + \sum_{i=1}^{M}\hat{a}_i z^{-i}} \Rightarrow y(n) = \sum_{j=0}^{N-1}\hat{b}_j x(n-j) - \sum_{i=1}^{M}\hat{a}_i y(n-i)$$

However, as in Q.2 of Chapter 3, this approach has two problems, namely stability and a multimodal performance surface. On the other hand, the equation-error approach is always stable and has a unimodal surface. Its system block diagram is shown below. Can you see the main problem of it and suggest a solution? (Hint: assume $n(k)$ is white and examine $E\{e^2(k)\}$.)
[Block diagram (equation-error adaptive IIR filtering): the input s(k) drives the unknown system H(z) and the adaptive filter B(z); the noise n(k) is added to the output d(k) of H(z) to give the observed signal r(k); r(k) drives the adaptive filter A(z), and the error e(k) is formed as the difference between the outputs of B(z) and A(z).]