


Linear Algebra for Wireless Communications

Lecture 8: Singular Value Decomposition

SVD

Ove Edfors, Department of Electrical and Information Technology

Lund University

2010-04-06 Ove Edfors 1

A bit of repetition

A very useful matrix decomposition comes from the Spectral Theorem:

REAL CASE: A real symmetric matrix A can be factored into A = QΛQ^T. The orthonormal eigenvectors of A are in the orthogonal matrix Q and the corresponding eigenvalues in the diagonal matrix Λ.

COMPLEX CASE: A Hermitian matrix A can be factored into A = UΛU^H. The orthonormal eigenvectors of A are in the unitary matrix U and the corresponding eigenvalues in the diagonal matrix Λ.

We have seen that we, e.g., can reduce the complexity of channel estimation in OFDM systems and decouple otherwise coupled linear systems.

What a pity that it only works with symmetric or Hermitian matrices, i.e., very specific types of square matrices!

Is there a generalization?

Would it be possible to replace (the restricted)

A = QΛQ^T

where Q is orthogonal and Λ is diagonal, with something (more general) like

A = UΣV^T

where U and V are still orthogonal (but not necessarily the same) and Σ is still diagonal?

This would give us essentially the same "orthogonal transform" property as we have used when we had symmetric or Hermitian matrices.

... so, does this work?

Let’s try the factorization

Let’s assume that a factorization

A = UΣV^T

exists. What would that imply?

First look at AA^T:

AA^T = UΣV^T (UΣV^T)^T = UΣV^T VΣ^T U^T = UΣΣ^T U^T = Q1 Λ1 Q1^T   (spectral theorem!)

... then at A^T A:

A^T A = (UΣV^T)^T UΣV^T = VΣ^T U^T UΣV^T = VΣ^T Σ V^T = Q2 Λ2 Q2^T   (spectral theorem!)

This seems to work only if AA^T and A^T A have some very specific relations between their eigenvalues.

Eigenvalues of AA^T and A^T A

Let's find out the properties of the eigenvalues of AA^T and A^T A.

Do they have the same eigenvalues? If x is an eigenvector of AA^T and λ the corresponding eigenvalue,

AA^T x = λx

then, multiplying by A^T from the left,

A^T A (A^T x) = λ (A^T x)

shows us that A^T x is an eigenvector of A^T A and the same λ is its eigenvalue. YES!

Any other useful properties?

Since AA^T and A^T A are symmetric, the eigenvalues are real.

Since the quadratic form x^T A^T A x = ||Ax||² ≥ 0, all eigenvalues of A^T A (and AA^T) are non-negative.

Note that there are no restrictions on A. It doesn't even have to be square.

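These properties are easy to check numerically. A small sketch (numpy; the 3 x 5 test matrix is an arbitrary choice of mine, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # no restrictions on A: rectangular is fine

eig_AAT = np.sort(np.linalg.eigvalsh(A @ A.T))   # 3 eigenvalues
eig_ATA = np.sort(np.linalg.eigvalsh(A.T @ A))   # 5 eigenvalues

# All eigenvalues are real (eigvalsh guarantees this) and non-negative ...
assert np.all(eig_AAT >= -1e-12) and np.all(eig_ATA >= -1e-12)

# ... and the nonzero ones coincide: A^T A just has 5 - 3 = 2 extra zeros.
assert np.allclose(eig_ATA[:2], 0, atol=1e-10)
assert np.allclose(eig_ATA[2:], eig_AAT)
```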

The singular value decomposition

The above leads us to the following theorem:

Singular Value Decomposition: Any M x N matrix A can be factored into

A = UΣV^T

The columns of U (M x M) are eigenvectors of AA^T, and the columns of V (N x N) are eigenvectors of A^T A. The r singular values on the diagonal of Σ (M x N) are the square roots of the nonzero eigenvalues of both AA^T and A^T A.

This is a very powerful decomposition ... it works on everything and gives us the practical (orthogonal)(diagonal)(orthogonal) form that simplifies many problems!

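The theorem can be verified directly with any library SVD routine. A minimal numpy sketch (the 4 x 6 matrix is an arbitrary example of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 4, 6
A = rng.standard_normal((M, N))

U, s, Vt = np.linalg.svd(A)              # full SVD: A = U Sigma V^T
Sigma = np.zeros((M, N))
Sigma[:len(s), :len(s)] = np.diag(s)     # min(M, N) singular values on the diagonal

assert np.allclose(U @ Sigma @ Vt, A)    # the factorization holds
assert np.allclose(U.T @ U, np.eye(M))   # U (M x M) is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(N)) # V (N x N) is orthogonal

# The singular values are square roots of the eigenvalues of AA^T (and A^T A):
eig = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]   # descending, like s
assert np.allclose(s**2, eig)
```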

The singular value decomposition

About the structure: The singular values are often sorted σ1 ≥ σ2 ≥ ... ≥ σr.

A = U Σ V^T
(M x N) = (M x M)(M x N)(N x N)

with columns u1, ..., uM in U, the singular values σ1, ..., σr on the diagonal of Σ, and rows v1^T, ..., vN^T in V^T.

Since the singular values are the square roots of the non-zero eigenvalues of both AA^T and A^T A, there can be at most min(M,N) of them, i.e., r ≤ min(M,N). This implies that r = rank(A).

u1 to ur span C(A); ur+1 to uM span N(A^T).
v1 to vr span C(A^T); vr+1 to vN span N(A).

In certain applications, e.g. linear estimation with strong correlation, we have the property that r << min(M,N) and we can use this to simplify calculations.
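The rank and subspace structure can be illustrated numerically. A quick sketch (numpy; the rank-2 6 x 5 example matrix is my own construction):

```python
import numpy as np

rng = np.random.default_rng(2)
# Build a 6 x 5 matrix of rank r = 2 (product of 6x2 and 2x5 factors):
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))     # number of nonzero singular values
assert r == 2 == np.linalg.matrix_rank(A)

# u_{r+1} to u_M span N(A^T): A^T u_k = 0 for k > r
assert np.allclose(A.T @ U[:, r:], 0, atol=1e-10)
# v_{r+1} to v_N span N(A):   A v_k = 0 for k > r
assert np.allclose(A @ Vt[r:, :].T, 0, atol=1e-10)
```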

SVD of symmetric matrices?

What about

A = UΣV^T

when A = A^T?

We know that there is a spectral factorization A = QΛQ^T of such a symmetric matrix, and we have

AA^T = QΛQ^T QΛQ^T = QΛ²Q^T

Here Λ contains the real eigenvalues, and the square roots of the diagonal elements of Λ² are the singular values.

Conclusion: The singular values of a symmetric matrix are the absolute values of its nonzero eigenvalues.

Can you find a complete SVD from the spectral factorization?

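This conclusion is easy to check numerically (numpy; the symmetric test matrix is an arbitrary choice of mine, built so its eigenvalues have both signs):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
A = B + B.T                          # symmetric, eigenvalues of both signs

eigvals = np.linalg.eigvalsh(A)      # real, possibly negative
singvals = np.linalg.svd(A, compute_uv=False)

# Singular values = absolute values of the eigenvalues:
assert np.allclose(np.sort(singvals), np.sort(np.abs(eigvals)))
```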

Capacity of MIMO systems


A MIMO system

[Figure: TX with MT transmit antennas → channel → RX with MR receive antennas]


A general (narrow-band) model

The "general" case with MT TX antennas and MR RX antennas:

y = Hx + n

written out,

[ y1  ]   [ h11    h12    ...  h1,MT  ] [ x1  ]   [ n1  ]
[ y2  ] = [ h21    h22    ...  h2,MT  ] [ x2  ] + [ n2  ]
[ ... ]   [ ...                       ] [ ... ]   [ ... ]
[ yMR ]   [ hMR,1  hMR,2  ...  hMR,MT ] [ xMT ]   [ nMR ]

with the received vector y (MR x 1), the channel matrix H (MR x MT), the transmitted vector x (MT x 1), and the receiver noise vector n (MR x 1).

It is not trivial to figure out the capacity of this MIMO system, but the rectangular channel matrix calls for trying a singular value decomposition to obtain a somewhat simpler system to analyze.

Capacity – No fading & AWGN [1]

Singular value decomposition of the (fixed) channel H:

y = Hx + n = UΣV^H x + n

where U (MR x MR) and V (MT x MT) are unitary matrices and Σ (MR x MT) is a matrix containing the singular values on its diagonal (all-zero, except the diagonal).

Multiply by U^H from the left (only "rotations" of y, x and n):

U^H y = ΣV^H x + U^H n   ⇔   ỹ = Σx̃ + ñ
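This rotation step can be verified numerically. A small sketch (numpy; the antenna counts, channel realization and noise level are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(4)
MR, MT = 3, 4                                  # receive / transmit antennas
H = (rng.standard_normal((MR, MT)) + 1j * rng.standard_normal((MR, MT))) / np.sqrt(2)
x = rng.standard_normal(MT) + 1j * rng.standard_normal(MT)
n = 0.1 * (rng.standard_normal(MR) + 1j * rng.standard_normal(MR))

U, s, Vh = np.linalg.svd(H)                    # H = U Sigma V^H
Sigma = np.zeros((MR, MT))
Sigma[:len(s), :len(s)] = np.diag(s)

y = H @ x + n
y_t = U.conj().T @ y                           # "rotate" the received vector
x_t = Vh @ x                                   # rotated transmit vector
n_t = U.conj().T @ n                           # rotated noise (still white)

assert np.allclose(y_t, Sigma @ x_t + n_t)     # parallel channels: y~ = Sigma x~ + n~
```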

Capacity – No fading & AWGN [2]

What have we obtained? Parallel independent channels, Shannon's "standard case":

ỹ = diag(σ1, ..., σr, 0, ..., 0) x̃ + ñ

i.e., ỹk = σk x̃k + ñk for k = 1, ..., r (plus channels with σk = 0). The number of singular values is r = rank(H).

Capacity – No fading & AWGN [3]

Shannon: The total capacity of parallel independent channels is the sum of their individual capacities.

Ck = log2(1 + SNRk)

C = Σk Ck = Σk log2(1 + SNRk)

Equal power distribution (channel not known at TX), with a constant α depending on e.g. TX power and noise:

C = Σ_{k=1}^{r} log2(1 + α σk²) = log2 Π_{k=1}^{r} (1 + α σk²)

This doesn't look like what we usually see about MIMO capacity.

Capacity – No fading & AWGN [4]

MIMO capacity is often seen on the form:

C = log2 det(I_MR + α HH^H)

... so, let's see if they are the same! Starting from what we derived:

C = log2 Π_{k=1}^{r} (1 + α σk²)
  = log2 det diag(1 + ασ1², ..., 1 + ασr², 1, ..., 1)
  = log2 det(I_MR + α ΣΣ^H)
  = log2 det( U (I_MR + α ΣΣ^H) U^H )
  = log2 det( UU^H + α UΣV^H VΣ^H U^H )
  = log2 det(I_MR + α HH^H)

YES!

Capacity – No fading & AWGN [5]

CONCLUSION: The singular values of H alone determine the MIMO channel capacity (at a given SNR)!

C = Σ_{k=1}^{r} log2(1 + α σk²) = log2 det(I_MR + α HH^H)   [bit/sec/Hz]

Normalization, with ρ the SNR at each receiver branch:

C = log2 det(I_MR + (ρ/MT) HH^H)

This relation is also derived (in a different way) in e.g. G.J. Foschini and M.J. Gans, "On Limits of Wireless Communications in a Fading Environment when Using Multiple Antennas," Wireless Personal Communications, no. 6, pp. 311-335, 1998.
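The equality of the two capacity expressions is easy to confirm numerically. A sketch (numpy; the 4 x 3 Rayleigh-like channel and the SNR value are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)
MR, MT = 4, 3
H = (rng.standard_normal((MR, MT)) + 1j * rng.standard_normal((MR, MT))) / np.sqrt(2)
rho = 10.0                                    # SNR at each receiver branch
alpha = rho / MT

# Capacity from the singular values of H ...
s = np.linalg.svd(H, compute_uv=False)
C_svd = np.sum(np.log2(1 + alpha * s**2))

# ... and from the usual log-det form:
C_det = np.log2(np.real(np.linalg.det(np.eye(MR) + alpha * H @ H.conj().T)))

assert np.isclose(C_svd, C_det)               # the two expressions agree
```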

Channel estimation in OFDM with pilot-symbol assisted modulation


OFDM: System model

Simplified model under ideal conditions (slow enough fading and sufficient CP): on each subcarrier,

y_{n,k} = H_{n,k} x_{n,k} + n_{n,k},   n = 0, ..., N-1

Total filter in the signal path:

h_signal(t) = h_TX(t) * h_ch(t) * h_RX(t)   ⇔   H_signal(f) = H_TX(f) H_ch(f) H_RX(f)

Given that subcarrier n is transmitted at frequency fn, the attenuations become:

H_{n,k} = H_signal(fn)

OFDM systems have N between 64 (WLAN) and 8192 (DigTV).

OFDM: System model [cont.]

We have ended up with an OFDM matrix model:

y = Xh + n

where y is the received vector, X a diagonal matrix with the transmitted constellation points on its diagonal, h a vector of channel attenuations, and n a vector of receiver noise.

For the purpose of channel estimation, assume that all ones are transmitted, i.e., that X = I. We now have a simplified model:

y = h + n

All matrices in the calculations are square, of equal size (and symmetric). Same number of measurements as parameters to estimate!

Further assume that the channel is zero-mean and has autocorrelation R_hh, while the noise is i.i.d. zero-mean complex Gaussian with autocorrelation R_nn = σ²I. We also assume that h and n are independent.

OFDM: Channel estimation

What have we obtained? The LMMSE channel estimator

ĥ = R_hh (R_hh + σ²I)^{-1} y = U Λ(Λ + σ²I)^{-1} U^H y

can be implemented either as one full N x N matrix multiplication, or (when U is the unitary DFT matrix) as the chain

y → N-point IFFT → diagonal N x N matrix multiplication → N-point FFT → ĥ

COMPLEXITY (operations):

Full matrix multiplication: N²
FFT structure: (N/2) log2 N + N + (N/2) log2 N = N log2 N + N
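The equivalence between the full matrix multiplication and the FFT structure can be sketched numerically (numpy; I construct R_hh so that its eigenvectors are exactly the DFT vectors, with an invented eigenvalue profile — this is the situation the FFT implementation assumes):

```python
import numpy as np

rng = np.random.default_rng(6)
N = 8
sigma2 = 0.1
lam = np.linspace(2.0, 0.1, N)                 # invented eigenvalue profile of R_hh

F = np.fft.fft(np.eye(N)) / np.sqrt(N)         # unitary DFT matrix
U = F.conj().T                                 # eigenvectors of R_hh (by construction)
R_hh = U @ np.diag(lam) @ U.conj().T           # circulant channel autocorrelation

y = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Full N x N matrix multiplication (N^2 operations):
h_full = R_hh @ np.linalg.solve(R_hh + sigma2 * np.eye(N), y)

# FFT -> diagonal weighting -> IFFT structure (about N log2 N + N operations):
d = lam / (lam + sigma2)                       # diagonal of Lambda (Lambda + sigma^2 I)^{-1}
h_fft = np.fft.ifft(d * np.fft.fft(y))

assert np.allclose(h_full, h_fft)              # same estimate, far fewer operations
```

Depending on which convention is chosen for U, the FFT and IFFT swap places; this matches the slide's IFFT-diagonal-FFT chain up to that convention.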

OFDM: Channel estimation (scattered pilots)

From an earlier lecture: pilots are scattered in unknown data.

[Figure: an OFDM system (TX, channel, RX) with data on all N subcarriers. Unknown data on a subcarrier gives "no" channel measurement; pilots on a subcarrier give a measurement.]

OFDM: Channel estimation (scattered pilots)

During Lecture 7 we modelled the case with "all pilots" as a completely known diagonal data matrix X = I.

We now have

y = Xh + n

where X is diagonal with unknown data symbols x1, x2, x3, x4, ... on the data subcarriers and ones (pilots) on the pilot subcarriers.

If the unknown data (the xk's) are zero-mean and independent, we cannot obtain any information about the channel (h) from measurements (y) on those subcarriers. We have to rely on the subcarriers where we have known data (pilots).

OFDM: Channel estimation (scattered pilots)

To simplify the situation, assume a new (reduced size) model where we only collect measurements from the P pilot subcarriers:

y = Th + n

where T denotes the P x N pilot matrix (the identity matrix with its data rows removed), y holds the measurements on the P pilot subcarriers (P x 1), h is the full channel vector (N x 1), and n is the receiver noise vector (P x 1).

We have a rectangular matrix, and the nice calculations from Lecture 7 have to be done in a different way! Smells like an SVD!

OFDM: LMMSE scattered pilot estimation

The linear estimator we are looking for is

ĥ_LMMSE = Ay

where A is an N x P matrix on the form

A = R_hy R_yy^{-1}

using the cross-correlation R_hy between the N channel coefficients h and the P measurements y, and the autocorrelation R_yy of the P measurements.

Number of operations required to perform one estimation: PN.

If we want to reduce the complexity of this estimator, we can use the theory of low-rank estimators. [Details can be found in books about estimation theory.]

OFDM: Channel est. low-rank approximation

By performing an SVD on the following matrix:

R_hy R_yy^{-1/2} = UΣV^H

... the "best rank-q estimator" is given by a rank-q approximation of the A matrix:

A ≈ A_q = U diag(σ1, ..., σq, 0, ..., 0) V^H R_yy^{-1/2}

OFDM: Channel est. low-rank approximation

ĥ_LMMSE = U diag(σ1, ..., σq, 0, ..., 0) V^H R_yy^{-1/2} y
        = U_q diag(σ1, ..., σq) V_q^H R_yy^{-1/2} y

[... keep only the q used columns of U and V ...]

Dimensions: U_q is N x q, diag(σ1, ..., σq) is q x q, V_q^H is q x P and R_yy^{-1/2} is P x P. Combining everything to the right of U_q into one q x P matrix, the number of operations required to perform one estimation is qP + qN = q(P + N). Compare to the NP required without the SVD.

If the channel has strong correlation, the rank q can be made small.
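The rank-q construction can be sketched numerically (numpy; all correlation matrices below are invented placeholders just to exercise the algebra, with R_hy deliberately built to have exactly rank q, the "strong correlation" case):

```python
import numpy as np

rng = np.random.default_rng(7)
N, P, q = 16, 8, 3

# Invented correlation matrices, only to exercise the algebra:
B = rng.standard_normal((P, P))
R_yy = B @ B.T + P * np.eye(P)                 # symmetric positive definite
R_hy = rng.standard_normal((N, q)) @ rng.standard_normal((q, P))  # rank q

# R_yy^{-1/2} from the eigendecomposition of the symmetric PD matrix R_yy:
w, Q = np.linalg.eigh(R_yy)
R_yy_isqrt = Q @ np.diag(w**-0.5) @ Q.T

# Full LMMSE matrix A = R_hy R_yy^{-1}: applying it costs NP operations.
A = R_hy @ np.linalg.inv(R_yy)

# SVD of R_hy R_yy^{-1/2}, truncated to rank q:
U, s, Vt = np.linalg.svd(R_hy @ R_yy_isqrt)
W = np.diag(s[:q]) @ Vt[:q] @ R_yy_isqrt       # combined q x P matrix (precomputed once)
A_q = U[:, :q] @ W                             # rank-q approximation of A

assert np.allclose(A_q, A)                     # exact here, since rank(R_hy) = q

y = rng.standard_normal(P)
h_hat = U[:, :q] @ (W @ y)                     # qP + qN = q(P + N) operations
assert np.allclose(h_hat, A @ y)
```

With a full-rank R_hy the truncation would instead be an approximation whose quality depends on how fast the singular values decay, which is exactly why strong channel correlation makes a small q work.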