
Adaptive Signal Processing and Machine Intelligence

Lecture 4: Modern Spectral Estimation

Danilo P. Mandic
Room: 813, Ext: 46271
Department of Electrical and Electronic Engineering
Imperial College London, UK
[email protected], URL: www.commsp.ee.ic.ac.uk/∼mandic

Overview of Spectral Estimation Methods

Introduction to spectrum estimation, and the relationship between the methods:

Non-parametric methods:
• Periodogram (using the data), with modifications: 1. windowing the data (modified periodogram); 2. Bartlett's method; 3. Welch's method
• Blackman-Tukey method (using the ACF)

Parametric methods:
• Maximum entropy method
• ARMA spectrum estimation
• Subspace methods: Pisarenko, MUSIC, eigenvector method, minimum norm

Periodogram Based Methods

The starting point is the periodogram, expressed either in terms of the data or of the autocorrelation estimate:

$$P_{per}(\omega) = \frac{1}{N}\left|\sum_{n=0}^{N-1} x(n)\,e^{-\jmath n\omega}\right|^2 = \sum_{k=-N+1}^{N-1} \hat r(k)\,e^{-\jmath k\omega}$$

Windowing → modified periodogram:

$$P_{mod}(\omega) = \frac{1}{NU}\left|\sum_{n=0}^{N-1} w(n)\,x(n)\,e^{-\jmath n\omega}\right|^2$$

Averaging → Bartlett's method:

$$P_B(\omega) = \frac{1}{N}\sum_{i=0}^{K-1}\left|\sum_{n=0}^{L-1} x(n+iL)\,e^{-\jmath n\omega}\right|^2$$

+ Overlapping windows → Welch's method:

$$P_W(\omega) = \frac{1}{KLU}\sum_{i=0}^{K-1}\left|\sum_{n=0}^{L-1} w(n)\,x(n+iD)\,e^{-\jmath n\omega}\right|^2$$
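To make these estimators concrete, here is a minimal NumPy sketch (our own illustration, not from the slides): the $N$ data points are split into $K$ segments of length $L$ advanced by $D$ samples, and $U$ normalises the window power. Setting $D = L$ with a rectangular window recovers Bartlett's method; scipy.signal.welch offers a library implementation of the same idea.

```python
import numpy as np

def periodogram(x, nfft=1024):
    """P_per(w) = (1/N) |sum_n x(n) e^{-jnw}|^2 on an nfft-point grid."""
    return np.abs(np.fft.fft(x, nfft))**2 / len(x)

def modified_periodogram(x, window=np.hanning, nfft=1024):
    """Windowed periodogram; U = (1/N) sum_n w(n)^2 normalises the window power."""
    N = len(x)
    w = window(N)
    U = np.sum(w**2) / N
    return np.abs(np.fft.fft(w * x, nfft))**2 / (N * U)

def welch(x, L=128, D=64, window=np.hanning, nfft=1024):
    """Average the modified periodograms of length-L segments advanced by D samples.
    With window=np.ones and D=L (no overlap) this reduces to Bartlett's method."""
    w = window(L)
    U = np.sum(w**2) / L
    segs = [np.abs(np.fft.fft(w * x[i:i+L], nfft))**2
            for i in range(0, len(x) - L + 1, D)]
    return np.sum(segs, axis=0) / (len(segs) * L * U)

# Two close sinusoids in noise: Welch trades resolution for reduced variance.
n = np.arange(1024)
x = np.sin(0.4*np.pi*n) + 0.5*np.sin(0.45*np.pi*n) + np.random.randn(1024)
P_w = welch(x)
```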

Modified Periodogram

$$P_{per}(\omega) = \frac{1}{N}\left|\sum_{n=0}^{N-1} x(n)\,e^{-\jmath n\omega}\right|^2 = \sum_{k=-N+1}^{N-1} \hat r(k)\,e^{-\jmath k\omega}$$

Windowing the data gives the modified periodogram $P_{mod}(\omega)$ above. Windowing mitigates the problem of spurious high-frequency components in the spectrum, i.e. it reduces the "edge effects".

Bartlett's Method

The periodogram

$$P_{per}(\omega) = \frac{1}{N}\left|\sum_{n=0}^{N-1} x(n)\,e^{-\jmath n\omega}\right|^2 = \sum_{k=-N+1}^{N-1} \hat r(k)\,e^{-\jmath k\omega}$$

is computed for each of the $K$ non-overlapping data segments, and the $K$ periodograms are then averaged.

Averaging ⇒ reduction in variance.

Trade-off: frequency resolution vs. variance reduction.

Welch's Method

Now the periodogram

$$P_{per}(\omega) = \frac{1}{N}\left|\sum_{n=0}^{N-1} x(n)\,e^{-\jmath n\omega}\right|^2 = \sum_{k=-N+1}^{N-1} \hat r(k)\,e^{-\jmath k\omega}$$

is computed over overlapping, windowed segments, and the results are averaged (cf. $P_W(\omega)$ above).

Overlapping windows + averaging ⇒ achieves a good balance between resolution and variance.

Blackman-Tukey Method

The periodogram can also be expressed in terms of the autocorrelation estimate:

$$P_{per}(\omega) = \sum_{k=-N+1}^{N-1} \hat r(k)\,e^{-\jmath k\omega}$$

However, the autocorrelation estimates at large lags are unreliable. Blackman-Tukey therefore windows the autocorrelation sequence, keeping only the lags $|k| \le M$, with $M < N-1$:

$$P_{BT}(\omega) = \sum_{k=-M}^{M} w(k)\,\hat r(k)\,e^{-\jmath k\omega}$$

Next: can we extrapolate the autocorrelation estimates for lags beyond $M$?
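A minimal sketch of the Blackman-Tukey estimator under these definitions (the biased ACF estimator and the triangular lag window are our illustrative choices):

```python
import numpy as np

def blackman_tukey(x, M, nfft=1024):
    """P_BT(w) = sum_{k=-M}^{M} w(k) r(k) e^{-jkw}, with M < N - 1."""
    N = len(x)
    # Biased ACF estimate r(k) = (1/N) sum_n x(n+k) x*(n), k = 0..M
    r = np.array([np.sum(x[k:] * np.conj(x[:N-k])) / N for k in range(M + 1)])
    rw = np.concatenate([r[:0:-1].conj(), r])        # lags -M..M, r(-k) = r*(k)
    rw *= np.bartlett(2*M + 1)                       # triangular lag window
    # DTFT of the windowed ACF on an nfft-point grid
    wgrid = 2*np.pi*np.arange(nfft)/nfft
    k = np.arange(-M, M + 1)
    return np.real(np.exp(-1j*np.outer(wgrid, k)) @ rw)
```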

Maximum Entropy Method

How can we extrapolate the autocorrelation estimates while imposing the least amount of structure on the data?

Maximize the randomness ⇒ maximize the entropy.

[Figure: three candidate extrapolations of the autocorrelation sequence $r_x(k)$ and their corresponding power spectral densities (PSDs) $P_x(e^{\jmath\omega})$, $\omega \in (-\pi, \pi)$]

Which one has the "flattest" PSD?

Maximum Entropy Method (MEM)

Entropy of a Gaussian random process with PSD $P_x(e^{\jmath\omega})$:

$$H = \frac{1}{4\pi}\int_{-\pi}^{\pi} \ln P_x(e^{\jmath\omega})\,d\omega$$

Goal: find the extrapolated autocorrelation values that maximize this entropy, subject to matching the known lags. The solution* is the all-pole spectrum

$$P_{MEM}(e^{\jmath\omega}) = \frac{\sigma^2}{\left|1 + \sum_{k=1}^{p} a_k\,e^{-\jmath k\omega}\right|^2}$$

with the coefficients $\{a_k\}$ estimated using the Yule-Walker method.

*Refer to the handout for the full derivation.

MEM is identical to the all-pole AR(p) spectrum, although no assumptions were made about the model of the data (except Gaussianity).
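Since the slide identifies MEM with the all-pole AR($p$) spectrum estimated via Yule-Walker, a minimal sketch is (biased ACF estimate; the order $p$ and the test signal are our choices):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def mem_spectrum(x, p, nfft=1024):
    """MEM = AR(p) spectrum: solve the Yule-Walker equations for a_1..a_p,
    then P(w) = sigma^2 / |1 + sum_k a_k e^{-jkw}|^2."""
    N = len(x)
    r = np.array([np.sum(x[k:] * np.conj(x[:N-k])) / N for k in range(p + 1)])
    a = solve_toeplitz((r[:p], np.conj(r[:p])), -r[1:])   # Yule-Walker
    sigma2 = np.real(r[0] + np.conj(r[1:]) @ a)           # prediction error power
    A = np.fft.fft(np.concatenate(([1.0 + 0j], a)), nfft)
    return sigma2 / np.abs(A)**2

# Example: two complex sinusoids in complex white noise
n = np.arange(256)
x = (np.exp(1j*0.4*np.pi*n) + np.exp(1j*0.5*np.pi*n)
     + 0.5*(np.random.randn(256) + 1j*np.random.randn(256)))
P = mem_spectrum(x, p=8)
```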

MEM, derivation

MEM, spectrum

Power spectra of ARMA processes

Driving white noise $w(n)$, with flat (white) spectrum $P_w(\omega) = \sigma_w^2$, through the filter $B_q(\omega)/A_p(\omega)$ produces an output $y(n)$ with a shaped (e.g. bandpass) spectrum:

$$y(n) = -\sum_{k=1}^{p} a_k\,y(n-k) + \sum_{k=0}^{q} b_k\,w(n-k)$$

$$P_y(\omega) = \left|\frac{B_q(\omega)}{A_p(\omega)}\right|^2 P_w(\omega)$$

$$P_{ARMA}(\omega) = \sigma_w^2\,\frac{\left|\sum_{k=0}^{q} b_k\,e^{-\jmath k\omega}\right|^2}{\left|1 + \sum_{k=1}^{p} a_k\,e^{-\jmath k\omega}\right|^2}$$

The numerator is the moving average MA(q) part; the denominator is the autoregressive AR(p) part.
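Evaluating $P_{ARMA}(\omega)$ on an FFT grid for given coefficients; the resonant AR(2) example (pole radius and angle are our choices) illustrates the white-to-bandpass shaping in the block diagram:

```python
import numpy as np

def arma_psd(b, a, sigma2=1.0, nfft=1024):
    """P(w) = sigma_w^2 |B(e^{jw})|^2 / |A(e^{jw})|^2 with a = [1, a_1, ..., a_p]."""
    B = np.fft.fft(np.asarray(b, dtype=complex), nfft)
    A = np.fft.fft(np.asarray(a, dtype=complex), nfft)
    return sigma2 * np.abs(B)**2 / np.abs(A)**2

# White noise through a resonant AR(2) filter -> bandpass spectrum
rho, theta = 0.95, 0.3*np.pi
P = arma_psd(b=[1.0], a=[1.0, -2*rho*np.cos(theta), rho**2])
```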

Recap: ARMA processes

Recap: Yule-Walker equations

Recap: ARMA processes

ARMA spectrum estimation

Complex AR modelling (from Lecture 2)

Complex AR modelling, simulations

Subspace Methods: Introduction

Consider a single complex sinusoid in white noise:

$$x(n) = A_1 e^{\jmath n\omega_1} + w(n), \qquad A_1 = |A_1|e^{\jmath\phi}, \qquad w(n) \sim \mathcal{N}(0, \sigma_w^2)$$

In vector form, with $\mathbf{x} = [x(0), x(1), \ldots, x(M-1)]^T$ and $\mathbf{e}_1 = [1, e^{\jmath\omega_1}, \ldots, e^{\jmath\omega_1(M-1)}]^T$:

$$\mathbf{x} = A_1\mathbf{e}_1 + \mathbf{w}$$

Autocorrelation matrix:

$$\mathbf{R}_{xx} = E(\mathbf{x}\mathbf{x}^H) = \underbrace{|A_1|^2\,\mathbf{e}_1\mathbf{e}_1^H}_{\mathbf{R}_s} + \underbrace{\sigma_w^2\,\mathbf{I}}_{\mathbf{R}_n}$$

Signal autocorrelation $\mathbf{R}_s$: rank 1, single non-zero eigenvalue $= M|A_1|^2$.
Noise autocorrelation $\mathbf{R}_n$: rank $M$, all eigenvalues $= \sigma_w^2$.
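A quick numerical check of this eigenvalue structure ($M$, $\omega_1$, $|A_1|$ and $\sigma_w^2$ are arbitrary test values):

```python
import numpy as np

M, w1, A1, sw2 = 8, 0.3*np.pi, 2.0, 0.5
e1 = np.exp(1j * w1 * np.arange(M))                 # e_1 = [1, e^{jw1}, ...]^T
Rxx = A1**2 * np.outer(e1, e1.conj()) + sw2 * np.eye(M)
lam = np.linalg.eigvalsh(Rxx)                       # eigenvalues, ascending
print(lam[-1])    # M|A1|^2 + sigma_w^2 = 8*4 + 0.5 = 32.5
print(lam[:-1])   # the remaining M-1 eigenvalues all equal sigma_w^2 = 0.5
```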

Decomposing the Autocorrelation Matrix

$\mathbf{R}_{xx}$ is Hermitian, so its eigenvectors form an orthonormal basis. Since $\mathbf{R}_n = \sigma_w^2\mathbf{I}$, the eigenvectors of $\mathbf{R}_{xx}$ = the eigenvectors of $\mathbf{R}_s$: the principal eigenvector is proportional to $\mathbf{e}_1$, and the remaining $M-1$ eigenvectors are orthogonal to $\mathbf{e}_1$.

Can we use the idea that $\mathbf{e}_1^H\mathbf{v}_i = 0$ for the noise eigenvectors $\mathbf{v}_i$ to somehow estimate the power spectrum?

Multiple Sinusoids

Consider two complex sinusoids in white noise:

$$x(n) = A_1 e^{\jmath n\omega_1} + A_2 e^{\jmath n\omega_2} + w(n)$$

The signal part of $\mathbf{R}_{xx}$ now has rank 2: the first 2 eigenvalues of $\mathbf{R}_{xx}$ are the signal(-plus-noise) eigenvalues, and the remaining $M-2$ are noise eigenvalues.

[Figure: $\mathbf{e}_1$, $\mathbf{e}_2$ and the eigenvectors $\mathbf{v}_1$, $\mathbf{v}_2$ span the signal subspace; $\mathbf{v}_3$ lies in the noise subspace]

$\mathbf{e}_1$ and $\mathbf{e}_2$ are not the signal eigenvectors in this case, but they span the same subspace:
o Signal eigenvectors span a 2-D subspace.
o Noise eigenvectors span an $(M-2)$-dimensional subspace.

The signal and noise subspaces are orthogonal.

Subspace Methods

Extending to $p$ sinusoids: the $M-p$ noise eigenvectors $\mathbf{v}_i$, $i = p+1, \ldots, M$, are orthogonal to every signal vector $\mathbf{e}(\omega_k)$, where $\mathbf{e}(\omega) = [1, e^{\jmath\omega}, \ldots, e^{\jmath\omega(M-1)}]^T$. Using $\mathbf{e}^H(\omega)\mathbf{v}_i \approx 0$ at the sinusoid frequencies, PSD estimation can be performed as:

$$\hat P(e^{\jmath\omega}) = \frac{1}{\sum_{i=p+1}^{M} \alpha_i\left|\mathbf{e}^H(\omega)\mathbf{v}_i\right|^2}$$

• Pisarenko harmonic decomposition: $M = p+1$, so only a single noise eigenvector is used.
• MUltiple SIgnal Classification (MUSIC): $\alpha_i = 1$ for all noise eigenvectors (see the sketch below).
• Eigenvector method: $\alpha_i = 1/\lambda_i$.
• Minimum norm method: uses a single vector $\mathbf{a}$ that lies in the noise subspace & has minimum norm.
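A minimal MUSIC sketch under these definitions (the function, grid size and test correlation matrix are ours); replacing the unit weights by $1/\lambda_i$ would give the eigenvector method:

```python
import numpy as np

def music(Rxx, p, nfft=1024):
    """MUSIC pseudospectrum 1 / sum_{i>p} |e(w)^H v_i|^2,
    built from the M - p noise eigenvectors of Rxx."""
    M = Rxx.shape[0]
    lam, V = np.linalg.eigh(Rxx)                     # eigenvalues ascending
    Vn = V[:, :M - p]                                # noise subspace
    w = 2*np.pi*np.arange(nfft)/nfft
    E = np.exp(1j*np.outer(np.arange(M), w))         # columns are e(w)
    return 1.0 / np.sum(np.abs(Vn.conj().T @ E)**2, axis=0)

# Example: p = 2 complex sinusoids, M = 8 lags; peaks near 0.3*pi and 0.5*pi
M, p, n = 8, 2, np.arange(8)
e = lambda w: np.exp(1j*w*n)
R = (np.outer(e(0.3*np.pi), e(0.3*np.pi).conj())
     + np.outer(e(0.5*np.pi), e(0.5*np.pi).conj()) + 0.1*np.eye(M))
P = music(R, p)
```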

Comparison of the 4 Subspace Methods

[Figure: four panels (Pisarenko, MUSIC, EigenVector, Minimum Norm), each showing magnitude (dB) vs. frequency (units of $\pi$); overlay of 10 different realizations of 4 complex sinusoids in white noise]

Pisarenko only needs a 5 × 5 correlation matrix; a 64 × 64 correlation matrix was used for the other methods. Except for Pisarenko's method, all the estimates are correct!

Principal Components Spectral Estimation

Splitting the eigendecomposition into its signal and noise parts:

$$\mathbf{R}_{xx} = \underbrace{\sum_{i=1}^{p} \lambda_i\,\mathbf{v}_i\mathbf{v}_i^H}_{\text{Signal}} + \underbrace{\sum_{i=p+1}^{M} \lambda_i\,\mathbf{v}_i\mathbf{v}_i^H}_{\text{Noise}}$$

Can we de-noise the signal by discarding the noise eigenvectors? In linear algebra terms: we impose a rank-$p$ constraint on $\mathbf{R}_{xx}$.

Principal component analysis (PCA) can be used with the Blackman-Tukey, maximum entropy, and AR spectrum estimation methods.

Summary of the Different Methods

• "Numerator" methods: Periodogram, Blackman-Tukey, Moving Average (MA)
• ARMA
• "Denominator" methods: Autoregressive (AR), Max. Entropy
• Subspace methods (signal subspace / noise subspace): Pisarenko, MUSIC, Min. Norm, Eigenvector

Adaptive Signal Processing & Machine Intelligence

Principal Component Analysis

Danilo Mandic
Room 813, ext: 46271
Department of Electrical and Electronic Engineering
Imperial College London, UK
[email protected], URL: www.commsp.ee.ic.ac.uk/∼mandic

What is it that a matrix does to a vector?

"Ampli-twist": a matrix A which multiplies a vector x (i) stretches or shortens the vector, and (ii) rotates it.

A: any general matrix
R: a rotation matrix ($R^T = R^{-1}$ and $\det R = 1$)
$E\mathbf{x} = \lambda\mathbf{x}$: eigenanalysis
P: projection matrix

An example of a rotation matrix:

$$\mathbf{R} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

What can we say about the properties of the matrix A, the matrix E, and the projection matrix P (rank, invertibility, ...)? Is the projection matrix invertible?

The meaning of eigenanalysis

Let A be an $n \times n$ matrix, where A is a linear operator on vectors in $\mathbb{R}^n$, such that $A\mathbf{x} = \mathbf{b}$.

An eigenvector of A is a vector $\mathbf{v} \in \mathbb{R}^n$ such that $A\mathbf{v} = \lambda\mathbf{v}$, where $\lambda$ is called the corresponding eigenvalue.

Matrix A only changes the length of $\mathbf{v}$, not its direction!

Compare the eigenvalue equation $A\mathbf{v} = \lambda\mathbf{v}$ with the general linear equation $A\mathbf{x} = \mathbf{b}$.
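To make this concrete, a short NumPy check (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, V = np.linalg.eig(A)
v = V[:, 0]
# A only scales its eigenvector: A v and lambda v point the same way
print(A @ v)        # equals lam[0] * v up to floating-point error
print(lam[0] * v)
```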

Motivation for Principal Component Analysis (PCA)
(from material by Rasmus Bro)

And now, principal directions in data
(from material by Rasmus Bro)

Clearly, a 2D representation provides perfect separability between humans and monkeys, together with the principal directions in the data.

Principal Component Analysis (PCA)
Also known as the Karhunen-Loeve transform

• Many signal processing, control, and machine learning tasks employ multivariate data which often exhibit dependencies and redundancies.

• For example, it is often useful to reduce the dimensionality of a signal while maintaining the useful information.

• This reduces the computational complexity of any algorithm while preserving the physical meaning of the data.

• Besides dimensionality reduction, we often would like to transform the multi-channel data such that the channels are mutually orthogonal (the data covariance matrix is diagonal).

• We use the PCA to accomplish this goal → PCA has been called one of the most valuable results from applied linear algebra.

Principal Component Analysis (PCA)
Geometric View

[Figure: geometric view of PCA; the data projected onto the principal directions (projected data)]

Principal Component Analysis (PCA)
Derivation

• Consider a general data vector, $\mathbf{x}_k \in \mathbb{C}^{M\times 1}$, with the empirical (sample) covariance matrix defined as

$$\mathrm{cov}(\mathbf{x}_k) \stackrel{def}{=} \mathbf{R}_x = \frac{1}{N}\sum_{k=1}^{N} \mathbf{x}_k\mathbf{x}_k^H$$

• Also, if we define a matrix $\mathbf{X} \in \mathbb{C}^{N\times M}$: $\mathbf{X}^T = \left[\mathbf{x}_1\ \mathbf{x}_2\ \ldots\ \mathbf{x}_N\right] \Longrightarrow \mathbf{R}_x = \frac{1}{N}\mathbf{X}^H\mathbf{X}$

• The Hermitian covariance matrix $\mathbf{R}_x$ admits the eigenvalue decomposition $\mathbf{Q}^H\mathbf{R}_x\mathbf{Q} = \mathbf{\Lambda}$.

• The diagonal eigenvalue matrix, $\mathbf{\Lambda} = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_M\}$, indicates the power of each component of $\mathbf{x}_k$.

• The matrix of eigenvectors, $\mathbf{Q} = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_M]$, designates the principal directions of the data.

Principal Component Analysis (PCA)
Derivation

• Suppose $\mathbf{x}_k$ is to be transformed into a vector, $\mathbf{U}_k \in \mathbb{C}^{M\times 1}$, using a linear transformation matrix $\mathbf{W}$, so that

$$\mathbf{U}_k = \mathbf{W}\mathbf{x}_k, \qquad \text{where } \mathrm{cov}(\mathbf{U}_k) = \mathbf{I}$$

• The PCA states that $\mathbf{W} = \mathbf{\Lambda}^{-\frac{1}{2}}\mathbf{Q}^H$ can be obtained from the eigenvector and eigenvalue matrices.

• Proof:

$$\mathrm{cov}(\mathbf{U}_k) = \frac{1}{N}\sum_{k=1}^{N}\mathbf{U}_k\mathbf{U}_k^H = \mathbf{\Lambda}^{-\frac{1}{2}}\mathbf{Q}^H\left(\frac{1}{N}\sum_{k=1}^{N}\mathbf{x}_k\mathbf{x}_k^H\right)\mathbf{Q}\mathbf{\Lambda}^{-\frac{1}{2}} = \mathbf{\Lambda}^{-\frac{1}{2}}\mathbf{Q}^H\mathbf{R}_x\mathbf{Q}\mathbf{\Lambda}^{-\frac{1}{2}} = \mathbf{\Lambda}^{-\frac{1}{2}}\mathbf{\Lambda}\mathbf{\Lambda}^{-\frac{1}{2}} = \mathbf{I}$$
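A numerical check of this proof on synthetic correlated data (the mixing construction and seed are our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 3, 10000
X = rng.standard_normal((M, M)) @ rng.standard_normal((M, N))   # columns x_k
Rx = X @ X.conj().T / N                                          # sample covariance
lam, Q = np.linalg.eigh(Rx)                                      # Rx = Q diag(lam) Q^H
W = np.diag(lam**-0.5) @ Q.conj().T                              # W = Lambda^{-1/2} Q^H
U = W @ X                                                        # U_k = W x_k
print(np.round(U @ U.conj().T / N, 3))                           # cov(U_k) ≈ I
```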

Principal Component Analysis (PCA)
Dimensionality Reduction

• To perform dimensionality reduction, the PCA can be applied to obtain a transformed data vector $\mathbf{U}_{r,k} \in \mathbb{C}^{r\times 1}$ of dimension $r < M$, as

$$\mathbf{U}_{r,k} = \mathbf{W}_r\mathbf{x}_k = \mathbf{\Lambda}_{1:r}^{-\frac{1}{2}}\mathbf{Q}_{1:r}^H\mathbf{x}_k$$

• $\mathbf{\Lambda}_{1:r} = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_r\}$ and $\mathbf{Q}_{1:r} = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_r]$.

• Note: $r$ corresponds to the $r$ largest eigenvalues in $\mathbf{\Lambda}$.

• The PCA selects the directions in which the data expresses maximal variance, that is, the directions of the principal eigenvectors of the data matrix.

• The PCA matrix $\mathbf{W}_r$ can be interpreted as a projection matrix, as we are unable to recover $\mathbf{x}_k$ from the "reduced" data vector $\mathbf{U}_{r,k}$.

• The PCA projects the data onto the axes which exhibit the $r$ largest variances (see the sketch below).
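A standalone sketch of the reduction step under the same conventions (shapes, seed and $r$ are our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, r = 5, 10000, 2
X = rng.standard_normal((M, M)) @ rng.standard_normal((M, N))    # columns x_k
lam, Q = np.linalg.eigh(X @ X.T / N)             # eigenvalues, ascending
Wr = np.diag(lam[-r:]**-0.5) @ Q[:, -r:].T       # Lambda_{1:r}^{-1/2} Q_{1:r}^H
Ur = Wr @ X                                      # reduced data, r x N
# Wr keeps only the r largest-variance directions; x_k cannot be exactly
# recovered from Ur, so Wr acts as a projection.
```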

Principal Component Analysis (PCA)
Connections with Singular Value Decomposition (SVD)

• Consider the SVD of the data matrix $\mathbf{X} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$.

• The matrices $\mathbf{U}$ and $\mathbf{V}$ are unitary, i.e. $\mathbf{U}^H\mathbf{U} = \mathbf{I}$ and $\mathbf{V}^H\mathbf{V} = \mathbf{I}$.

• The covariance matrix $\mathbf{R}_x = \frac{1}{N}\mathbf{X}\mathbf{X}^H = \frac{1}{N}\mathbf{U}\mathbf{\Sigma}^2\mathbf{U}^H$ (here the samples form the columns of $\mathbf{X}$).

• Related to the eigenvalue decomposition: $\mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^H = \frac{1}{N}\mathbf{U}\mathbf{\Sigma}^2\mathbf{U}^H$, so that $\mathbf{Q} = \mathbf{U}$ and $\mathbf{\Lambda} = \mathbf{\Sigma}^2/N$.

• So, the PCA matrix $\mathbf{W} = \mathbf{\Lambda}^{-\frac{1}{2}}\mathbf{Q}^H$ can be obtained from the SVD of $\mathbf{X}$ as $\mathbf{W} = \sqrt{N}\,\mathbf{\Sigma}^{-1}\mathbf{U}^H$ → no need to compute $\mathbf{R}_x$ (verified numerically below).
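A numerical check of the two routes; with $\mathbf{R}_x = \frac{1}{N}\mathbf{X}\mathbf{X}^H$ the $\sqrt{N}$ scaling appears explicitly (real test data with samples as columns, our construction):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 4, 2000
X = rng.standard_normal((M, M)) @ rng.standard_normal((M, N))    # columns x_k

# Route 1: eigendecomposition of Rx
lam, Q = np.linalg.eigh(X @ X.T / N)
W_evd = np.diag(lam**-0.5) @ Q.T

# Route 2: SVD of X directly -- no Rx needed
U, s, Vt = np.linalg.svd(X, full_matrices=False)
W_svd = np.sqrt(N) * np.diag(1.0/s) @ U.T

# Both whiten the data (rows may differ in order and sign):
print(np.round(W_evd @ X @ (W_evd @ X).T / N, 3))    # ≈ I
print(np.round(W_svd @ X @ (W_svd @ X).T / N, 3))    # ≈ I
```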

Principal components in data and noncircularity

Noncircularity and PCA

PCA examples, uncorrelated data

PCA examples, correlated data

Matrix Decomposition for Regression
Principal Component Regression (PCR)

Regression: predicting dependent variables, $\mathbf{Y} \in \mathbb{C}^{N\times L}$, from a linear combination of independent variables, $\mathbf{X} \in \mathbb{C}^{N\times M}$, through a set of coefficients $\mathbf{B} \in \mathbb{C}^{M\times L}$:

$$\mathbf{Y} = \mathbf{X}\mathbf{B}$$

The ordinary least squares (OLS) solution is given by

$$\mathbf{B}_{OLS} = \underbrace{(\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H}_{\text{Pseudo-inverse of }\mathbf{X}}\mathbf{Y}$$

Problem: if $\mathbf{X}$ is sub-rank, then the matrix $\mathbf{X}^H\mathbf{X}$ is singular, and so the calculation of $(\mathbf{X}^H\mathbf{X})^{-1}$ becomes intractable.

This occurs in big-data settings, where the number of variables being measured is large and may contain redundant information.

Matrix Decomposition for Regression
Principal Component Regression (PCR)

• The PCA representation allows the separation of data into independent components.

• For sub-rank data, the number of important components will be less than the number of variables, and this "subspace" can be identified using PCA.

• Furthermore, the SVD representation allows a straightforward calculation of a generalised inverse as (see the sketch below)

$$\mathbf{B}^{+}_{PCR} = \underbrace{\mathbf{V}\mathbf{\Sigma}^{-1}\mathbf{U}^H}_{\text{Generalised inverse of }\mathbf{X}}\mathbf{Y}$$

• $\mathbf{\Sigma}^{-1}$ is a diagonal matrix with the reciprocals of the (non-zero) singular values of $\mathbf{X}$.
Hint: use the properties $\mathbf{U}^H = \mathbf{U}^{-1}$ and $\mathbf{V}^H = \mathbf{V}^{-1}$ to derive $\mathbf{B}^{+}_{PCR}$.
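A sketch of the PCR solution via a truncated SVD (the rank threshold, tolerance and test data are our choices):

```python
import numpy as np

def pcr_solve(X, Y, r=None, tol=1e-10):
    """B = V Sigma^{-1} U^H Y, keeping only the r largest singular values
    (or all above tol), so that sub-rank X remains tractable."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    if r is None:
        r = int(np.sum(s > tol * s[0]))              # effective rank of X
    return Vh[:r].conj().T @ np.diag(1.0/s[:r]) @ U[:, :r].conj().T @ Y

# Sub-rank example: 6 measured variables carrying only rank-3 information
rng = np.random.default_rng(2)
X = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 6))
Y = rng.standard_normal((100, 2))
B = pcr_solve(X, Y)          # (X^H X)^{-1} would be singular here
```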
