TRANSCRIPT
Linear Algebra for Wireless Communications
Lecture 8: Singular Value Decomposition (SVD)
Ove Edfors, Department of Electrical and Information Technology
Lund University
2010-04-06 Ove Edfors 1
A bit of repetition
A very useful matrix decomposition comes from the Spectral Theorem:

REAL CASE: A real symmetric matrix A can be factored into QΛQ^T. The orthonormal eigenvectors of A are the columns of the orthogonal matrix Q and the corresponding eigenvalues are in the diagonal matrix Λ.

COMPLEX CASE: A Hermitian matrix A can be factored into UΛU^H. The orthonormal eigenvectors of A are the columns of the unitary matrix U and the corresponding eigenvalues are in the diagonal matrix Λ.

We have seen that we can, e.g., reduce the complexity of channel estimation in OFDM systems and decouple otherwise coupled linear systems.

What a pity that it only works with symmetric or Hermitian matrices, i.e., very specific types of square matrices!
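As a quick numerical sanity check (not on the original slides; the matrix below is an arbitrary example), NumPy's symmetric eigensolver reproduces the real-case factorization:

```python
import numpy as np

# Arbitrary real symmetric matrix (illustration only)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# eigh is NumPy's routine for symmetric/Hermitian matrices:
# real eigenvalues and orthonormal eigenvectors
lam, Q = np.linalg.eigh(A)

assert np.allclose(Q @ np.diag(lam) @ Q.T, A)   # A = Q Lambda Q^T
assert np.allclose(Q.T @ Q, np.eye(3))          # Q is orthogonal
```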
Is there a generalization?
Would it be possible to replace (the restricted)

A = QΛQ^T

where Q is orthogonal and Λ is diagonal, with something (more general) like

A = UΣV^T

where U and V are still orthogonal (but not necessarily the same) and Σ is still diagonal?

This would give us essentially the same ”orthogonal transform” property as we have used when we had symmetric or Hermitian matrices.
... so, does this work?
Let’s try the factorization
Let’s assume that a factorization

A = UΣV^T

exists. What would that imply?

First look at AA^T:

AA^T = UΣV^T (UΣV^T)^T = UΣV^T VΣ^T U^T = UΣΣ^T U^T = Q1Λ1Q1^T   (spectral theorem!)

... then at A^T A:

A^T A = VΣ^T U^T UΣV^T = VΣ^T ΣV^T = Q2Λ2Q2^T   (spectral theorem!)

This seems to work only if AA^T and A^T A have some very specific relations between their eigenvalues.
Eigenvalues of AA^T and A^T A
Let’s find out the properties of the eigenvalues of AA^T and A^T A.

Do they have the same eigenvalues?

If x is an eigenvector of AA^T and λ the corresponding eigenvalue,

AA^T x = λx

then, multiplying by A^T from the left,

A^T AA^T x = λA^T x

shows us that A^T x is an eigenvector of A^T A and the same λ is its eigenvalue. YES!

Any other useful properties?

Since AA^T and A^T A are symmetric, the eigenvalues are real.

Since the quadratic form x^T A^T Ax = ||Ax||² ≥ 0, all eigenvalues of A^T A (and AA^T) are non-negative.

Note that there are no restrictions on A. It doesn’t even have to be square.
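These three properties can be verified numerically. A sketch (the 4 x 3 matrix below is an arbitrary example, deliberately non-square):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))    # non-square on purpose; no restrictions on A

ev_AAT = np.sort(np.linalg.eigvalsh(A @ A.T))  # 4 eigenvalues, ascending
ev_ATA = np.sort(np.linalg.eigvalsh(A.T @ A))  # 3 eigenvalues, ascending

# Real and non-negative (symmetric matrices, quadratic form >= 0)
assert np.all(ev_AAT >= -1e-10) and np.all(ev_ATA >= -1e-10)
# The nonzero eigenvalues coincide; AA^T just has one extra zero here
assert np.allclose(ev_AAT[1:], ev_ATA)
```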
The singular value decomposition
The above leads us to the following theorem:

Singular Value Decomposition

Any M x N matrix A can be factored into

A = UΣV^T

The columns of U (M x M) are eigenvectors of AA^T, and the columns of V (N x N) are eigenvectors of A^T A. The r singular values on the diagonal of Σ (M x N) are the square roots of the nonzero eigenvalues of both AA^T and A^T A.

This is a very powerful decomposition ... it works on everything and gives us the practical (orthogonal)(diagonal)(orthogonal) form that simplifies many problems!
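A short NumPy check of the theorem (not on the original slide; the matrix is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))    # arbitrary M x N matrix

U, s, Vt = np.linalg.svd(A)        # full SVD: U is 4x4, Vt is 3x3
Sigma = np.zeros((4, 3))           # rectangular Sigma with s on the diagonal
Sigma[:3, :3] = np.diag(s)

assert np.allclose(U @ Sigma @ Vt, A)
# Singular values are the square roots of the eigenvalues of A^T A
ev = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(s, np.sqrt(np.clip(ev, 0.0, None)))
```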
The singular value decomposition
About the structure (the singular values are often sorted σ1 ≥ σ2 ≥ ... ≥ σr):

A = UΣV^T = [ u1 ... um ] · diag(σ1, ..., σr, 0, ..., 0) · [ v1 ... vn ]^T
   M x N       M x M            M x N                        N x N

Since the singular values are the square roots of the nonzero eigenvalues of both AA^T and A^T A, there can be at most min(m,n) of them, i.e., r ≤ min(m,n). This implies that r = rank(A).

u1 to ur span C(A), and ur+1 to uM span N(A^T).
v1 to vr span C(A^T), and vr+1 to vN span N(A).

In certain applications, e.g. linear estimation with strong correlation, we have the property that r << min(m,n) and we can use this to simplify calculations.
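The rank and null-space statements can be checked numerically. A sketch (the rank-2 example matrix is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
# A 5x4 matrix of rank 2, built as a product of thin random factors
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))                  # number of nonzero singular values
assert r == 2 == np.linalg.matrix_rank(A)   # r = rank(A)

# u_{r+1}..u_M span N(A^T) and v_{r+1}..v_N span N(A)
assert np.allclose(A.T @ U[:, r:], 0)
assert np.allclose(A @ Vt[r:, :].T, 0)
```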
SVD of symmetric matrices?
What about

A = UΣV^T

when A = A^T?

We know that there is a spectral factorization A = QΛQ^T of such a symmetric matrix, and we have

AA^T = QΛQ^T QΛ^T Q^T = QΛ²Q^T

The eigenvalues in Λ are real, and the square roots of the diagonal elements of Λ² are the singular values.

Conclusion: The singular values of a symmetric matrix are the absolute values of its nonzero eigenvalues.
Can you find a complete SVD from the spectral factorization?
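The conclusion is easy to test numerically (not on the original slide; the symmetric matrix below is an arbitrary example with eigenvalues of mixed sign):

```python
import numpy as np

# Symmetric matrix with eigenvalues of mixed sign (illustration only)
A = np.array([[0.0, 2.0],
              [2.0, 3.0]])

lam = np.linalg.eigvalsh(A)                 # eigenvalues: -1 and 4
s = np.linalg.svd(A, compute_uv=False)      # singular values: 4 and 1

# Singular values are the absolute values of the (nonzero) eigenvalues
assert np.allclose(np.sort(s), np.sort(np.abs(lam)))
```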
Capacity of MIMO-systems
A MIMO system
[Figure: TX with MT transmit antennas, Channel, RX with MR receive antennas]
A general (narrow-band) model
The ”general” case with MT TX antennas and MR RX antennas:

y = Hx + n

with entries

y_i = h_{i,1} x_1 + h_{i,2} x_2 + ... + h_{i,MT} x_MT + n_i,   i = 1, ..., MR

where
y is the received vector (MR x 1),
H is the channel matrix (MR x MT) with entries h_{i,j},
x is the transmitted vector (MT x 1), and
n is the receiver noise vector (MR x 1).

It is not trivial to figure out the capacity of this MIMO system, but the rectangular channel matrix calls for trying a singular value decomposition to obtain a somewhat simpler system to analyze.
Capacity – No fading & AWGN [1]
Singular value decomposition of the (fixed) channel H:

y = Hx + n = UΣV^H x + n

where U (MR x MR) and V (MT x MT) are unitary matrices and Σ (MR x MT) is all-zero except for the singular values on its diagonal.

Multiply by U^H from the left (only ”rotations” of y, x and n):

U^H y = ΣV^H x + U^H n,   i.e.,   ỹ = Σx̃ + ñ
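This decoupling step can be sketched in a few NumPy lines (the channel, transmit vector and noise below are arbitrary complex examples, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
MR, MT = 3, 2
# Arbitrary fixed complex channel, transmit vector and noise (illustration only)
H = rng.standard_normal((MR, MT)) + 1j * rng.standard_normal((MR, MT))
x = rng.standard_normal(MT) + 1j * rng.standard_normal(MT)
n = rng.standard_normal(MR) + 1j * rng.standard_normal(MR)

U, s, Vh = np.linalg.svd(H)                # H = U Sigma V^H
Sigma = np.zeros((MR, MT))
Sigma[:MT, :MT] = np.diag(s)

y = H @ x + n
# Rotations: y~ = U^H y, x~ = V^H x, n~ = U^H n decouple the channel
y_t = U.conj().T @ y
x_t = Vh @ x
n_t = U.conj().T @ n
assert np.allclose(y_t, Sigma @ x_t + n_t)  # y~ = Sigma x~ + n~
```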
Capacity – No fading & AWGN [2]
What have we obtained? Parallel independent channels, i.e., Shannon’s ”standard case”:

ỹ = Σx̃ + ñ,   i.e.,   ỹk = σk x̃k + ñk,   k = 1, ..., r

The number of singular values is r = rank(H) (plus channels with σk = 0).
Capacity – No fading & AWGN [3]
Shannon: The total capacity of parallel independent channels is the sum of their individual capacities:

Ck = log2(1 + SNRk)

C = Σk Ck = Σk log2(1 + SNRk)

Equal power distribution (channel not known at TX), with α a constant depending on e.g. TX power and noise:

C = Σ_{k=1..r} log2(1 + ασk²) = log2 Π_{k=1..r} (1 + ασk²)

This doesn’t look like what we usually see about MIMO capacity.
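The sum/product rewriting is just the additivity of logarithms; a minimal check (the singular values and α are arbitrary illustrative numbers):

```python
import numpy as np

# Arbitrary singular values and power/noise constant alpha (illustration only)
sigma = np.array([2.0, 1.0, 0.5])
alpha = 1.5

# Sum of per-channel capacities equals the log of the product
C_sum = np.sum(np.log2(1 + alpha * sigma**2))
C_prod = np.log2(np.prod(1 + alpha * sigma**2))
assert np.allclose(C_sum, C_prod)
```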
Capacity – No fading & AWGN [4]
MIMO capacity is often seen on the form:

C = log2 det(I_MR + αHH^H)

... so, let’s see if they are the same! Starting from what we derived:

C = log2 Π_{k=1..r} (1 + ασk²)
  = log2 det( diag(1 + ασ1², ..., 1 + ασr², 1, ..., 1) )
  = log2 det(I_MR + αΣΣ^H)
  = log2 det( U (I_MR + αΣΣ^H) U^H )
  = log2 det( UU^H + αUΣV^H VΣ^H U^H )
  = log2 det(I_MR + αHH^H)

YES!
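The equality of the two capacity expressions can be verified numerically. A sketch (the 3 x 2 complex channel and α are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
# Arbitrary complex 3x2 channel and SNR constant (illustration only)
H = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
alpha = 0.8

# Capacity from the singular values vs. the determinant form
s = np.linalg.svd(H, compute_uv=False)
C_svd = np.sum(np.log2(1 + alpha * s**2))
C_det = np.log2(np.real(np.linalg.det(np.eye(3) + alpha * H @ H.conj().T)))
assert np.allclose(C_svd, C_det)
```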
Capacity – No fading & AWGN [5]
CONCLUSION: The singular values of H alone determine the MIMO channel capacity (at a given SNR)!

C = Σ_{k=1..r} log2(1 + ασk²) = log2 det(I_MR + αHH^H)   [bit/sec/Hz]

With the normalization ρ = SNR at each receiver branch:

C = log2 det(I_MR + (ρ/MT) HH^H)

This relation is also derived (in a different way) in e.g.
G.J. Foschini and M.J. Gans, ”On Limits of Wireless Communications in a Fading Environment when Using Multiple Antennas”, Wireless Personal Communications, no. 6, pp. 311-335, 1998.
Channel estimation in OFDM with pilot-symbol assisted modulation
OFDM System model

Simplified model under ideal conditions (slow enough fading and sufficient CP):

y_{n,k} = H_{n,k} x_{n,k} + n_{n,k},   n = 0, ..., N-1

Total filter in the signal path:

h_signal(t) = h_TX(t) * h_channel(t) * h_RX(t)
H_signal(f) = H_TX(f) H_channel(f) H_RX(f)

Given that subcarrier n is transmitted at frequency fn, the attenuations become:

H_{n,k} = H_signal(fn)

OFDM systems have N between 64 (WLAN) and 8192 (DigTV).
OFDM System model [cont.]

We have ended up with an OFDM matrix model:

y = Xh + n

where y is the received vector, X a diagonal matrix with the transmitted constellation points on its diagonal, h a vector of channel attenuations, and n a vector of receiver noise.

For the purpose of channel estimation, assume that all ones are transmitted, i.e., that X = I. We now have a simplified model:

y = h + n

Same number of measurements as parameters to estimate! All matrices in the calculations are square, of equal size (and symmetric).

Further assume that the channel is zero-mean and has autocorrelation Rhh, while the noise is i.i.d. zero-mean complex Gaussian with autocorrelation Rnn = σ²I. We also assume that h and n are independent.
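Under these assumptions the LMMSE estimator for y = h + n is ĥ = Rhh (Rhh + σ²I)⁻¹ y. A sketch (the exponential correlation profile below is an assumption chosen for illustration, not from the slides):

```python
import numpy as np

# Illustrative setup for y = h + n; the exponential correlation profile
# for Rhh is an assumption, not from the lecture
N = 8
idx = np.arange(N)
R_hh = 0.9 ** np.abs(idx[:, None] - idx[None, :])  # channel autocorrelation
sigma2 = 0.1                                       # noise variance, Rnn = sigma2*I

# LMMSE estimator: h_hat = W y with W = Rhh (Rhh + sigma2*I)^(-1)
W = R_hh @ np.linalg.inv(R_hh + sigma2 * np.eye(N))

# Per-tap MSE from the error covariance Rhh - W Rhh: always below sigma2,
# i.e., the estimator beats using y directly
mse = np.trace(R_hh - W @ R_hh) / N
assert 0 < mse < sigma2
```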
OFDM Channel estimation

What have we obtained? With the eigendecomposition Rhh = UΛU^H, the estimator can be written

ĥ = Rhh (Rhh + σ²I)⁻¹ y          (full N x N matrix multiplication)
  = U Λ(Λ + σ²I)⁻¹ U^H y         (N-point IFFT, diagonal N x N matrix multiplication, N-point FFT)

COMPLEXITY (operations):

Full matrix multiplication:  N²
Transform-based:             N log2 N + N + N log2 N = N + 2N log2 N
OFDM Channel estimation (scattered pilots)

From an earlier lecture: pilots are scattered in unknown data.

[Figure: Two OFDM systems (TX, Channel, RX). Top: known data on all N subcarriers gives channel measurements on all N subcarriers. Bottom: pilots on some subcarriers and unknown data on the rest gives measurements on the pilot subcarriers only, and ”no” channel measurement on the data subcarriers.]
OFDM Channel estimation (scattered pilots)

During Lecture 7 we modelled the case with ”all pilots” as a completely known diagonal data matrix X = I.

We now have

y = Xh + n

where X is diagonal with unknown data symbols xk on the data subcarriers and ones on the pilot subcarriers.

If the unknown data (the xk’s) are zero-mean and independent, we cannot obtain any information about the channel (h) from measurements (y) on those subcarriers. We have to rely on the subcarriers where we have known data (pilots).
OFDM Channel estimation (scattered pilots)

To simplify the situation, assume a new (reduced-size) model where we only collect measurements from the P pilot subcarriers:

y = Th + n

where
y is the measurements on the P pilot subcarriers (P x 1),
T is the pilot matrix [the data rows removed] (P x N),
h is the full channel vector (N x 1), and
n is the receiver noise vector (P x 1).

We have a rectangular matrix, and the nice calculations from Lecture 7 have to be done in a different way! Smells like an SVD!
OFDM LMMSE Scattered Pilot Estimation

The linear estimator we are looking for is

ĥ_LMMSE = Ay

where A is an N x P matrix on the form

A = Rhy Ryy⁻¹

using the cross-correlation Rhy between the N channel coefficients h and the P measurements y, and the autocorrelation Ryy of the P measurements.

Number of operations required to perform one estimation: PN.

If we want to reduce the complexity of this estimator, we can use the theory of low-rank estimators. [Details can be found in books about estimation theory.]
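Building A = Rhy Ryy⁻¹ for this reduced model can be sketched as follows (the sizes, pilot pattern and correlation profile are illustrative assumptions, not from the slides):

```python
import numpy as np

# Illustrative scattered-pilot setup: sizes, pilot positions and the
# correlation profile are assumptions, not from the lecture
N, P = 16, 4
pilot_idx = np.arange(0, N, N // P)            # pilots on every 4th subcarrier
idx = np.arange(N)
R_hh = 0.95 ** np.abs(idx[:, None] - idx[None, :])
sigma2 = 0.05

T = np.eye(N)[pilot_idx]                       # pilot matrix (P x N), data rows removed
R_hy = R_hh @ T.T                              # cross-correlation (N x P)
R_yy = T @ R_hh @ T.T + sigma2 * np.eye(P)     # measurement autocorrelation (P x P)
A = R_hy @ np.linalg.inv(R_yy)                 # LMMSE estimator matrix (N x P)

# Per-tap MSE from the error covariance Rhh - A Ryh: interpolation from
# P pilots is imperfect but clearly better than no estimate at all
mse = np.trace(R_hh - A @ R_hy.T) / N
assert A.shape == (N, P) and 0 < mse < 1
```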
OFDM Channel est. low-rank approximation
By performing an SVD on the following matrix:

Rhy Ryy^(-1/2) = UΣV^H

... the ”best rank-q estimator” is given by a rank-q approximation of the A matrix:

A ≈ A_q = U diag(σ1, ..., σq, 0, ..., 0) V^H Ryy^(-1/2)
OFDM Channel est. low-rank approximation
ĥ_LMMSE = U diag(σ1, ..., σq, 0, ..., 0) V^H Ryy^(-1/2) y

Keeping only the q used columns of U and V:

ĥ_LMMSE = U_q diag(σ1, ..., σq) V_q^H Ryy^(-1/2) y
          (N x q)   (q x q)     (q x P)  (P x P)

The product V_q^H Ryy^(-1/2) can be combined into one q x P matrix. If the channel has strong correlation, the rank q can be made small. Compare to the NP operations required without the SVD.
Number of operations required to perform one estimation: qP + qN = q(P+N)
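The rank-q truncation and the operation-count comparison can be sketched numerically (all sizes, the rank q and the correlation profile below are illustrative assumptions):

```python
import numpy as np

# Illustrative low-rank setup; sizes, rank and correlation are assumptions
N, P, q = 16, 8, 3
pilot_idx = np.arange(0, N, N // P)
idx = np.arange(N)
R_hh = 0.98 ** np.abs(idx[:, None] - idx[None, :])  # strong correlation
sigma2 = 0.01

T = np.eye(N)[pilot_idx]
R_hy = R_hh @ T.T
R_yy = T @ R_hh @ T.T + sigma2 * np.eye(P)

# Ryy^(-1/2) via the eigendecomposition of the symmetric PD matrix Ryy
lam, Q = np.linalg.eigh(R_yy)
R_yy_isqrt = Q @ np.diag(lam ** -0.5) @ Q.T

# SVD of the whitened cross-correlation M = Rhy Ryy^(-1/2) = U S V^T
M = R_hy @ R_yy_isqrt
U, s, Vt = np.linalg.svd(M)

# Rank-q truncation; by Eckart-Young its spectral-norm error is exactly s[q]
M_q = U[:, :q] @ np.diag(s[:q]) @ Vt[:q, :]
assert np.allclose(np.linalg.norm(M - M_q, 2), s[q])

# Operation counts per estimate: q(P+N) for the low-rank form vs NP for full
assert q * (P + N) < N * P
```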