
ELEG-636: Statistical Signal Processing

Gonzalo R. Arce

Department of Electrical and Computer Engineering, University of Delaware

Spring 2010


Course Objectives & Structure

Objective: Given a discrete time sequence x(n), develop

Statistical and spectral signal representations

Filtering, prediction, and system identification algorithms

Optimization methods that are statistical and adaptive

Course Structure:

Weekly lectures [notes: www.ece.udel.edu/∼arce]

Periodic homework (theory & Matlab implementations) [15%]

Midterm & Final examinations [85%]

Textbook:

Haykin, Adaptive Filter Theory.


Broad applications in communications, imaging, and sensors. Emerging applications in:

Brain-imaging techniques

Brain-machine interfaces

Implantable devices

Neurofeedback presents real-time physiological signals from MRIs in a visual or auditory form to provide information about brain activity. These signals are used to train the patient to alter neural activity in a desired direction.

Traditionally, feedback using EEGs or other mechanisms has not focused on the brain because the resolution is not good enough.


Wiener (MSE) Filtering Theory: Wiener Filtering

Problem Statement

Produce an estimate of a desired process statistically related to a set of observations.

Historical Notes: The linear filtering problem was solved by

Andrey Kolmogorov for discrete time – his 1938 paper "established the basic theorems for smoothing and predicting stationary stochastic processes"

Norbert Wiener in 1941 for continuous time – not published until the 1949 paper Extrapolation, Interpolation, and Smoothing of Stationary Time Series


System restrictions and considerations:

Filter is linear

Filter is discrete time

Filter is finite impulse response (FIR)

The process is WSS

Statistical optimization is employed


For the discrete time case

[Block diagram: x(n) is filtered by the weights w_k to produce the estimate d̂(n); the error is e(n) = d(n) − d̂(n)]

The filter impulse response is finite and given by

h_k = w_k^*  for k = 0, 1, ..., M − 1;  h_k = 0 otherwise

The output d̂(n) is an estimate of the desired signal d(n). Since x(n) and d(n) are statistically related, d̂(n) and d(n) are statistically related.


In convolution and vector form,

d̂(n) = Σ_{k=0}^{M−1} w_k^* x(n − k) = w^H x(n)

where

w = [w_0, w_1, ..., w_{M−1}]^T  [filter coefficient vector]

x(n) = [x(n), x(n − 1), ..., x(n − M + 1)]^T  [observation vector]

The error can now be written as

e(n) = d(n) − d̂(n) = d(n) − w^H x(n)
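For concreteness, a short NumPy sketch of this estimator; the filter length, weights, and signals below are illustrative placeholders, not quantities from the course:

import numpy as np

rng = np.random.default_rng(0)
N, M = 200, 4                                     # signal length and filter taps (assumed)
w = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # placeholder weight vector
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # observations
d = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # desired signal

d_hat = np.zeros(N, dtype=complex)                # first M-1 outputs left at 0 (no full history)
for n in range(M - 1, N):
    xn = x[n - M + 1:n + 1][::-1]                 # observation vector [x(n), ..., x(n-M+1)]
    d_hat[n] = np.vdot(w, xn)                     # w^H x(n); np.vdot conjugates its first argument
e = d - d_hat                                     # estimation error e(n) = d(n) - d_hat(n)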

Question: Under what criterion should the error be minimized?

Selected Criterion: Mean squared error (MSE)

J(w) = E{e(n) e^*(n)}   (∗)

Result: The w that minimizes J(w) is the optimal (Wiener) filter


Utilizing e(n) = d(n) − w^H x(n) in (∗) and expanding,

J(w) = E{e(n) e^*(n)}
     = E{(d(n) − w^H x(n))(d^*(n) − x^H(n) w)}
     = E{|d(n)|² − d(n) x^H(n) w − w^H x(n) d^*(n) + w^H x(n) x^H(n) w}
     = E{|d(n)|²} − E{d(n) x^H(n)} w − w^H E{x(n) d^*(n)} + w^H E{x(n) x^H(n)} w   (∗∗)

Let

R = E{x(n) x^H(n)}   [autocorrelation matrix of x(n)]

p = E{x(n) d^*(n)}   [cross-correlation between x(n) and d(n)]

Then (∗∗) can be compactly expressed as

J(w) = σ_d² − p^H w − w^H p + w^H R w

where we have assumed x(n) and d(n) are zero mean and WSS.
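In practice R and p are rarely known in advance and are estimated by sample averages over observed data. A minimal NumPy sketch under the zero mean, WSS assumption; the synthetic signals here are placeholders:

import numpy as np

rng = np.random.default_rng(1)
N, M = 5000, 2
d = rng.standard_normal(N)                        # synthetic desired signal
x = d + 0.5 * rng.standard_normal(N)              # observation statistically related to d

R = np.zeros((M, M))
p = np.zeros(M)
for n in range(M - 1, N):
    xn = x[n - M + 1:n + 1][::-1]                 # observation vector [x(n), x(n-1)]
    R += np.outer(xn, xn)                         # accumulate x(n) x^H(n) (real-valued case)
    p += xn * d[n]                                # accumulate x(n) d*(n)
R /= N - M + 1
p /= N - M + 1

sigma_d2 = d.var()
def J(w):
    # quadratic MSE; for real data, -p^H w - w^H p collapses to -2 p^T w
    return sigma_d2 - 2 * p @ w + w @ R @ w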


The MSE criterion as a function of the filter weight vector w:

J(w) = σ_d² − p^H w − w^H p + w^H R w

Observation: The error is a quadratic function of w

Consequences: The error is an M-dimensional bowl-shaped function of w with a unique minimum

Result: The optimal weight vector, w_0, is determined by differentiating J(w) and setting the result to zero:

∇_w J(w) |_{w=w_0} = 0

A closed form solution exists


Example

Consider a two dimensional case, i.e., an M = 2 tap filter. Plot the error surface and error contours.

[Figure: error surface (left) and error contours (right)]
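The figure can be reproduced with a short matplotlib sketch. The R and p values below are taken from the communications example at the end of these notes; σ_d² is an assumed placeholder, since it only shifts the bowl vertically:

import numpy as np
import matplotlib.pyplot as plt

R = np.array([[1.1, 0.5], [0.5, 1.1]])            # from the communications example below
p = np.array([0.5272, -0.4458])
sigma_d2 = 1.0                                    # assumed value; changes height, not shape

w0g, w1g = np.meshgrid(np.linspace(-1, 3, 200), np.linspace(-3, 1, 200))
Jg = (sigma_d2 - 2 * (p[0] * w0g + p[1] * w1g)
      + R[0, 0] * w0g**2 + 2 * R[0, 1] * w0g * w1g + R[1, 1] * w1g**2)

plt.contour(w0g, w1g, Jg, levels=25)              # elliptical contours of the quadratic bowl
plt.xlabel('w_0')
plt.ylabel('w_1')
plt.show()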


Aside (Matrix Differentiation): For complex data,

w_k = a_k + j b_k,   k = 0, 1, ..., M − 1

the gradient, with respect to w_k, is

∇_k(J) = ∂J/∂a_k + j ∂J/∂b_k,   k = 0, 1, ..., M − 1

The complete gradient is thus given by

∇_w(J) = [∇_0(J), ∇_1(J), ..., ∇_{M−1}(J)]^T = [∂J/∂a_0 + j ∂J/∂b_0, ∂J/∂a_1 + j ∂J/∂b_1, ..., ∂J/∂a_{M−1} + j ∂J/∂b_{M−1}]^T


Example

Let c and w be M × 1 complex vectors. For g = c^H w, find ∇_w(g).

Note

g = c^H w = Σ_{k=0}^{M−1} c_k^* w_k = Σ_{k=0}^{M−1} c_k^* (a_k + j b_k)

Thus

∇_k(g) = ∂g/∂a_k + j ∂g/∂b_k = c_k^* + j(j c_k^*) = 0,   k = 0, 1, ..., M − 1

Result: For g = c^H w,

∇_w(g) = [∇_0(g), ∇_1(g), ..., ∇_{M−1}(g)]^T = [0, 0, ..., 0]^T = 0


Example

Now suppose g = w^H c. Find ∇_w(g).

In this case,

g = w^H c = Σ_{k=0}^{M−1} w_k^* c_k = Σ_{k=0}^{M−1} c_k (a_k − j b_k)

and

∇_k(g) = ∂g/∂a_k + j ∂g/∂b_k = c_k + j(−j c_k) = 2c_k,   k = 0, 1, ..., M − 1

Result: For g = w^H c,

∇_w(g) = [∇_0(g), ∇_1(g), ..., ∇_{M−1}(g)]^T = [2c_0, 2c_1, ..., 2c_{M−1}]^T = 2c


Example

Lastly, suppose g = w^H Q w. Find ∇_w(g).

In this case,

g = Σ_{i=0}^{M−1} Σ_{j=0}^{M−1} w_i^* w_j q_{i,j} = Σ_{i=0}^{M−1} Σ_{j=0}^{M−1} (a_i − j b_i)(a_j + j b_j) q_{i,j}

⇒ ∇_k(g) = ∂g/∂a_k + j ∂g/∂b_k = 2 Σ_{j=0}^{M−1} (a_j + j b_j) q_{k,j} = 2 Σ_{j=0}^{M−1} w_j q_{k,j}


Result: For g = w^H Q w,

∇_w(g) = [∇_0(g), ∇_1(g), ..., ∇_{M−1}(g)]^T = 2 [Σ_{i=0}^{M−1} q_{0,i} w_i, Σ_{i=0}^{M−1} q_{1,i} w_i, ..., Σ_{i=0}^{M−1} q_{M−1,i} w_i]^T = 2Qw

Observation: Differentiation result depends on matrix ordering
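All three gradient results are easy to sanity-check numerically by applying the definition ∇_k = ∂/∂a_k + j ∂/∂b_k with central differences. A sketch with random placeholder vectors and matrix:

import numpy as np

rng = np.random.default_rng(2)
M, h = 3, 1e-6
w = rng.standard_normal(M) + 1j * rng.standard_normal(M)
c = rng.standard_normal(M) + 1j * rng.standard_normal(M)
Q = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))

def grad(g, w):
    # grad_k(g) = dg/da_k + j dg/db_k, approximated by central differences
    out = np.zeros(M, dtype=complex)
    for k in range(M):
        da = np.zeros(M, dtype=complex); da[k] = h        # step in the real part a_k
        db = np.zeros(M, dtype=complex); db[k] = 1j * h   # step in the imaginary part b_k
        out[k] = ((g(w + da) - g(w - da)) + 1j * (g(w + db) - g(w - db))) / (2 * h)
    return out

print(np.allclose(grad(lambda v: np.vdot(c, v), w), 0))              # c^H w   -> 0
print(np.allclose(grad(lambda v: np.vdot(v, c), w), 2 * c))          # w^H c   -> 2c
print(np.allclose(grad(lambda v: np.vdot(v, Q @ v), w), 2 * Q @ w))  # w^H Q w -> 2Qw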


Returning to the MSE performance criterion

J(w) = σ_d² − p^H w − w^H p + w^H R w

Approach: Minimize the error by differentiating with respect to w and setting the result to 0:

∇_w(J) = 0 − 0 − 2p + 2Rw = 0

⇒ R w_0 = p   [normal equation]

Result: The Wiener filter coefficients are defined by

w_0 = R^{−1} p

Question: Does R^{−1} always exist? Recall R is positive semi-definite, and usually positive definite.
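Numerically, one solves R w_0 = p with a linear solver rather than forming R^{−1} explicitly; a Cholesky factorization doubles as a positive-definiteness check, since it fails when R is singular or indefinite. A sketch using the R and p derived in the communications example below:

import numpy as np

R = np.array([[1.1, 0.5], [0.5, 1.1]])
p = np.array([0.5272, -0.4458])

L = np.linalg.cholesky(R)          # succeeds only if R is positive definite
w0 = np.linalg.solve(R, p)         # normal equation R w0 = p
print(w0)                          # approx [0.836, -0.785]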


Wiener (MSE) Filtering Theory: Orthogonality Principle

Consider again the normal equation that defines the optimal solution:

R w_0 = p
⇒ E{x(n) x^H(n)} w_0 = E{x(n) d^*(n)}

Rearranging,

E{x(n) d^*(n)} − E{x(n) x^H(n)} w_0 = 0
E{x(n) [d^*(n) − x^H(n) w_0]} = 0
E{x(n) e_0^*(n)} = 0

Note: e_0(n) is the error when the optimal weights are used, i.e.,

e_0^*(n) = d^*(n) − x^H(n) w_0


Thus

E{x(n) e_0^*(n)} = E{ [x(n) e_0^*(n), x(n − 1) e_0^*(n), ..., x(n − M + 1) e_0^*(n)]^T } = [0, 0, ..., 0]^T

Orthogonality Principle

A necessary and sufficient condition for a filter to be optimal is that the estimation error, e_0(n), be orthogonal to each input sample in x(n).

Interpretation: The observation samples and the error are orthogonal and contain no mutual "information".
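The principle can be checked on data: with the (sample) Wiener weights, the sample correlation between each input tap and the error is numerically zero. A sketch with synthetic real-valued placeholder signals and M = 2 taps:

import numpy as np

rng = np.random.default_rng(3)
N = 20000
d = rng.standard_normal(N)
x = d + 0.5 * rng.standard_normal(N)              # observation related to d

X = np.column_stack([x[1:], x[:-1]])              # rows are [x(n), x(n-1)]
R = X.T @ X / len(X)                              # sample autocorrelation matrix
p = X.T @ d[1:] / len(X)                          # sample cross-correlation vector

w0 = np.linalg.solve(R, p)                        # sample Wiener weights
e0 = d[1:] - X @ w0                               # optimal error sequence
print(X.T @ e0 / len(X))                          # approx [0, 0]: E{x(n) e0(n)} = 0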


Wiener (MSE) Filtering Theory: Minimum MSE

Objective: Determine the minimum MSE.

Approach: Use the optimal weights w_0 = R^{−1} p in the MSE expression

J(w) = σ_d² − p^H w − w^H p + w^H R w

⇒ J_min = σ_d² − p^H w_0 − w_0^H p + w_0^H R (R^{−1} p)
        = σ_d² − p^H w_0 − w_0^H p + w_0^H p
        = σ_d² − p^H w_0

Result:

J_min = σ_d² − p^H R^{−1} p

where the substitution w_0 = R^{−1} p has been employed.


Wiener (MSE) Filtering Theory: Excess MSE

Objective: Consider the excess MSE introduced by using a weight vector that is not optimal.

J(w) − J_min = (σ_d² − p^H w − w^H p + w^H R w) − (σ_d² − p^H w_0 − w_0^H p + w_0^H R w_0)

Using the fact that

p = R w_0   and   p^H = w_0^H R

yields

J(w) − J_min = −p^H w − w^H p + w^H R w + p^H w_0 + w_0^H p − w_0^H R w_0
             = −w_0^H R w − w^H R w_0 + w^H R w + w_0^H R w_0 + w_0^H R w_0 − w_0^H R w_0
             = −w_0^H R w − w^H R w_0 + w^H R w + w_0^H R w_0
             = (w − w_0)^H R (w − w_0)

⇒ J(w) = J_min + (w − w_0)^H R (w − w_0)


Finally, using the eigenvalue/eigenvector representation R = Q Ω Q^H,

J(w) = J_min + (w − w_0)^H Q Ω Q^H (w − w_0)

or, defining the eigenvector-transformed difference

v = Q^H (w − w_0)   (∗)

⇒ J(w) = J_min + v^H Ω v = J_min + Σ_{k=1}^{M} λ_k v_k v_k^*

Result:

J(w) = J_min + Σ_{k=1}^{M} λ_k |v_k|²

Note: (∗) shows that v_k is the difference (w − w_0) projected onto eigenvector q_k.
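A quick numerical confirmation of this decomposition, using numpy.linalg.eigh; R and p are the communications-example values, while σ_d² and the test weights w are placeholders:

import numpy as np

R = np.array([[1.1, 0.5], [0.5, 1.1]])
p = np.array([0.5272, -0.4458])
sigma_d2 = 1.0                                    # assumed value
w0 = np.linalg.solve(R, p)
Jmin = sigma_d2 - p @ w0                          # Jmin = sigma_d^2 - p^H w0

w = np.array([0.3, -0.2])                         # arbitrary non-optimal weights
J = sigma_d2 - 2 * p @ w + w @ R @ w              # quadratic MSE (real case)

lam, Q = np.linalg.eigh(R)                        # R = Q diag(lam) Q^H
v = Q.T @ (w - w0)                                # transformed difference v = Q^H (w - w0)
print(np.isclose(J, Jmin + np.sum(lam * v**2)))   # J = Jmin + sum_k lam_k |v_k|^2 -> True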


Wiener (MSE) Filtering Theory: Communications Example

Example

Consider the following system:

[Block diagram: white noise v1(n) drives H1(z) to produce the desired signal d(n); d(n) passes through the communication channel H2(z) to give x(n); additive white noise v2(n) gives the received signal u(n) = x(n) + v2(n), which is filtered to produce the estimate d̂(n)]

Objective: Determine the optimal filter for a given system and channel


Specific Objective

Determine the optimal order two filter weights, w_0, for

H1(z) = 1/(1 + 0.8458 z^{−1})   [AR process]

H2(z) = 1/(1 − 0.9458 z^{−1})   [communication channel]

where v1(n) and v2(n) are zero mean white noise processes with σ1² = 0.27 and σ2² = 0.1.

Note: To determine w_0, we need:

R_u, the autocorrelation of the received signal u(n)

p, the cross-correlation between the received signal u(n) and the desired signal d(n)


Procedure: Consider R_u first.

Since u(n) = x(n) + v2(n), where v2(n) is white with σ2² = 0.1 and is uncorrelated with x(n),

R_u = R_x + R_v2 = R_x + [0.1 0; 0 0.1]


Next, consider the relation between x(n) and v1(n):

X(z) = H1(z) H2(z) V1(z)

where for the given systems

H1(z) H2(z) = 1 / [(1 + 0.8458 z^{−1})(1 − 0.9458 z^{−1})]

Converting to the time domain, we see x(n) is an order 2 AR process:

x(n) − 0.1 x(n − 1) − 0.8 x(n − 2) = v1(n)

x(n) + a1 x(n − 1) + a2 x(n − 2) = v1(n)
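The AR(2) model can be simulated directly to check the correlation values derived below (r(0) = 1, r(1) = 0.5). A sketch; the sample size and discarded transient are arbitrary choices:

import numpy as np

rng = np.random.default_rng(4)
N = 200000
v1 = np.sqrt(0.27) * rng.standard_normal(N)       # white noise with variance 0.27
x = np.zeros(N)
for n in range(2, N):
    x[n] = 0.1 * x[n - 1] + 0.8 * x[n - 2] + v1[n]    # x(n) - 0.1 x(n-1) - 0.8 x(n-2) = v1(n)

x = x[1000:]                                      # drop the start-up transient
print(np.mean(x * x))                             # r(0), approx 1
print(np.mean(x[1:] * x[:-1]))                    # r(1), approx 0.5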


Since x(n) is a real valued order two AR process, the Yule-Walker equations are given by

[ r(0) r(1); r^*(1) r(0) ] [ −a1; −a2 ] = [ r^*(1); r^*(2) ]

or, for a real process,

[ r(0) r(1); r(1) r(0) ] [ −a1; −a2 ] = [ r(1); r(2) ]

Solving for the coefficients:

−a1 = r(1)[r(0) − r(2)] / [r²(0) − r²(1)]

−a2 = [r(0) r(2) − r²(1)] / [r²(0) − r²(1)]

Question: What are the known and unknown terms in this system?


Note: We must solve the system to obtain the unknown r(·) values.

Noting r(0) = σ_x² and rearranging to solve for r(1) and r(2):

r(1) = [−a1/(1 + a2)] σ_x²   (∗)

r(2) = [−a2 + a1²/(1 + a2)] σ_x²   (∗∗)

The Yule-Walker equations also stipulate

σ_v1² = r(0) + a1 r(1) + a2 r(2)   (∗∗∗)

Next, utilize the determined r(1), r(2) and the given a1, a2, σ_v1² values to determine r(0) = σ_x².


Substituting (∗) and (∗∗) into (∗∗∗), using r(0) = σ_x², and rearranging:

σ_v1² = r(0) + a1 r(1) + a2 r(2)
      = σ_x² + a1 [−a1/(1 + a2)] σ_x² + a2 [−a2 + a1²/(1 + a2)] σ_x²
      = [1 − a1²/(1 + a2) − a2² + a1² a2/(1 + a2)] σ_x²

⇒ σ_x² = [(1 + a2)/(1 − a2)] · σ_v1² / [(1 + a2)² − a1²]

Note: All correlation terms have been determined since r(0) = σ_x².
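These closed-form expressions translate directly into a few lines of plain Python; the printed values anticipate the numbers computed below:

a1, a2 = -0.1, -0.8                               # AR coefficients from the example
sigma_v1_sq = 0.27                                # driving-noise variance

sigma_x_sq = ((1 + a2) / (1 - a2)) * sigma_v1_sq / ((1 + a2)**2 - a1**2)
r0 = sigma_x_sq                                   # r(0) = sigma_x^2 = 1
r1 = (-a1 / (1 + a2)) * sigma_x_sq                # (*)   r(1) = 0.5
r2 = (-a2 + a1**2 / (1 + a2)) * sigma_x_sq        # (**)  r(2) = 0.85
print(r0, r1, r2)
print(r0 + a1 * r1 + a2 * r2)                     # (***) recovers sigma_v1^2 = 0.27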


Using a1 = −0.1, a2 = −0.8, and σ_v1² = 0.27,

σ_x² = [(1 + a2)/(1 − a2)] · σ_v1² / [(1 + a2)² − a1²] = 1

Thus r(0) = 1. Similarly,

r(1) = [−a1/(1 + a2)] σ_x² = 0.5

and finally, R_x is

R_x = [1 0.5; 0.5 1]


Recall the overall system block diagram [desired signal d(n), channel H2(z), additive noise v2(n), received signal u(n)].

Putting the pieces of R_u together:

R_u = R_x + R_v2 = [1 0.5; 0.5 1] + [0.1 0; 0 0.1] = [1.1 0.5; 0.5 1.1]


Recall: The Wiener solution is given by w_0 = R^{−1} p.

Note: R is known, but p is still to be determined:

p = E{ [u(n) d(n); u(n − 1) d(n)] }

Recall

X(z) = H2(z) D(z) = D(z) / (1 − 0.9458 z^{−1})

or in the time domain

x(n) − 0.9458 x(n − 1) = d(n)

Lastly, the observation is corrupted by additive noise:

u(n) = x(n) + v2(n)


Thus

E{u(n) d(n)} = E{[x(n) + v2(n)][x(n) − 0.9458 x(n − 1)]}
             = E{x²(n)} + E{x(n) v2(n)} − 0.9458 E{x(n) x(n − 1)} − 0.9458 E{v2(n) x(n − 1)}
             = σ_x² + 0 − 0.9458 r(1) − 0
             = 1 − 0.9458 (1/2) = 0.5272

Similarly,

E{u(n − 1) d(n)} = E{[x(n − 1) + v2(n − 1)][x(n) − 0.9458 x(n − 1)]}
                 = r(1) − 0.9458 r(0)
                 = −0.4458


Thus

p = [0.5272; −0.4458]

Final Solution:

w_0 = R^{−1} p = [1.1 0.5; 0.5 1.1]^{−1} [0.5272; −0.4458] = [0.8360; −0.7853]

The optimal filter weights are a function of the source signal statistics and the communications channel.
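As a closing check, the whole example can be simulated end to end: generate d(n) and u(n) from the given systems, estimate R_u and p by sample averages, and solve the normal equation. A sketch; the sample size and discarded transient are arbitrary choices, and the estimates converge to roughly [0.836, −0.785]:

import numpy as np

rng = np.random.default_rng(5)
N = 500000
v1 = np.sqrt(0.27) * rng.standard_normal(N)
v2 = np.sqrt(0.1) * rng.standard_normal(N)

d = np.zeros(N)
x = np.zeros(N)
for n in range(1, N):
    d[n] = -0.8458 * d[n - 1] + v1[n]             # H1: d(n) + 0.8458 d(n-1) = v1(n)
    x[n] = 0.9458 * x[n - 1] + d[n]               # H2: x(n) - 0.9458 x(n-1) = d(n)
u = x + v2                                        # received signal

U = np.column_stack([u[1000:], u[999:-1]])        # rows [u(n), u(n-1)], transient dropped
Ru = U.T @ U / len(U)                             # sample estimate of R_u
p = U.T @ d[1000:] / len(U)                       # sample estimate of p

w0 = np.linalg.solve(Ru, p)
print(w0)                                         # approx [0.836, -0.785]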

Question: How do we optimize a filter when the statistics are not known in closed form or a priori?
