Christian Pietsch
Coherent Space Time Block Codes
from Sets of Subspaces
-
Multiple input multiple output (MIMO) systems are an attractive option for wireless communications due to their capability of providing extra capacity and/or diversity in comparison with single antenna schemes. This book contributes to the analysis and the construction of space time constellations that allow for a reliable transmission in fading environments where the transmitter does not have any knowledge about the channel state information. Initiated by new findings on the structure of orthogonal space time block codes (OSTBCs), we establish two mappings that link these constellations to unique sets of subspaces, which are termed Grassmannian packings in the mathematical literature. We derive the packing properties that result from OSTBCs. In the first place, this gives us new insight into the structure of OSTBCs. Moreover, since Grassmannian packings have been previously applied for the construction of non-coherent constellations, we identify similarities and differences between a variety of coherent and non-coherent constellations. The packings that are related to OSTBCs are severely constrained. Allowing for more general packings, this lets us construct space time constellations that support higher data rates. It provides a new powerful framework that links the design of general coherent space time constellations with the search for good Grassmannian packings. We derive packing properties that yield space time constellations with excellent performance in terms of mutual information and diversity. We propose two methods that enable the design of these packings. Constructing space time constellations from the resulting packings, we obtain full rate coherent space time block codes (STBCs) that turn out to be superior to the best known coherent STBCs that we are aware of. While the emphasis of this work is on the analysis and the construction of space time constellations, a well defined transmission model eases many derivations. In particular, the relationship between models that include spreading matrices, dispersion matrices, real-valued and complex-valued notation turns out to be important.
Cover design
The figure on the front cover symbolizes the mapping of two points from the Grassmann
manifold GR(2, 1) to the two dimensional Euclidean space $\mathbb{R}^2$. A detailed
explanation is given in Appendix 3.A1.3, see also Figure 3.7.
-
Acknowledgments
Having finalized my Ph. D. with this thesis, I may now look back at some wonderful
years that I was fortunate enough to spend at the Institute of Information Technology
at the University of Ulm. Surely, this time would not have been so rewarding if it had
not been for the great people I had the chance to work with. At this point, I would like
to express my gratitude to everybody who contributed to my work and thesis in one
way or another.
In particular, I owe many thanks to Prof. Jürgen Lindner for providing me with the
great opportunity to do challenging research at his institute and the trust he put into
my work by giving me a lot of freedom to choose an interesting topic and by letting
me present my results at various national and international conferences. Additionally,
I am very much indebted to Prof. Tobias Weber who kindly agreed to act as a
co-supervisor.
I would like to say thank you to all my former colleagues, none of whom I would
want to have missed. It is always difficult to pick out individuals since others may
unintentionally appear less important. Nevertheless, I would like to mention a few
people explicitly. There is my long term friend Ivan Periša with whom I have been lucky
to spend my years throughout high school and university. There are my office
mates Siegfried Grob and Alexander Linduska who always contributed to an agreeable
working atmosphere. Surely, I will always remember the regular tea kitchen discus-
sions with Markus Dangl, Alexander Linduska, Christian Sgraja, Zoran Utkovski, and
Matthias Wetz on various scientific and non-scientific topics. Moreover, I really enjoyed
the trips to various COST meetings with Werner Teich who was also always patient to
listen to virtually anything that one wanted to discuss. Additionally, there are Werner
Birkle and Werner Hack who were invaluable for my work with the MIMO demonstra-
tor. Finally, many thanks to all those colleagues who were involved in proof-reading
my manuscript!
Last but not least, I am deeply grateful to my parents for their continuous support
throughout the past years.
Christian Pietsch
Neu-Ulm, October 2008
Contents
Acknowledgments
1 Introduction
2 Models and Design Criteria for Space Time Block Coding Techniques
2.1 The Multiple Input Multiple Output Channel
2.2 Basic System Models and Their Relationship
2.2.1 Linear Dispersion Model
2.2.2 Spreading Model
2.2.3 Matched Filtering and Transmission Matrices
2.2.4 Relationship Between the Two Models
2.3 Real-Valued Notation
2.3.1 Channel Matrices and Signal Vectors
2.3.2 Real-Valued Spreading Matrices
2.3.3 Real-Valued Dispersion Matrices
2.4 Summary of General System Model Equations and Discussion
2.5 Maximum Likelihood Detection
2.5.1 ML Metric – Decision Criterion
2.5.2 Structural Properties
2.6 Capacity and Rate
2.6.1 Spreading Matrices and Capacity
2.6.2 Dispersion Matrices and Capacity
2.6.3 Outage Capacity Versus Ergodic Capacity
2.7 Diversity
2.7.1 Diversity Considerations for a Single Symbol Transmission
2.7.2 Diversity Considerations for a Multiple Symbol Transmission
2.8 Joint Consideration of Capacity, Rate, and Diversity
Appendix
2.A1 Another Example of Notation
3 Orthogonal Space Time Block Codes – Two Subspace Representations
3.1 Dispersion Matrices and Basic Properties of OSTBCs
3.1.1 Rate Related Issues
3.1.2 Rate 1/2 OSTBCs with Minimal Delay
3.1.3 Notational Aspects
3.2 Sets of Subspaces from Dispersion Matrices
3.2.1 Square Dispersion Matrices
3.2.2 Non-Square Dispersion Matrices
3.3 Subspaces from Square Transmit Matrices and Implications
3.3.1 Identification of Transmit Matrices with Subspaces
3.3.2 Subspace Based ML Metric
3.3.3 Constraints on Packings Related to OSTBCs
3.3.4 Geometrical Interpretation
3.3.5 Link Between Coherent and Non-Coherent Schemes
Appendix
3.A1 A Short Introduction to Grassmann Manifolds
3.A1.1 Definition
3.A1.2 Distance Measures
3.A1.3 Example: Points in GR(2, 1) and the Embedding of GR(2, 1) in $\mathbb{R}^2$
3.A2 Complex Orthogonal Designs from Real Orthogonal Designs
3.A3 STBCs from Packings with Subspace Dimensionality ≤ ℓ/2
3.A4 Derivation of (3.77)
4 Design of High Rate Coherent STBCs from Grassmannian Packings
4.1 Mutual Information Preserving Sets of Subspaces
4.1.1 Packing Properties and Geometrical Interpretation
4.1.2 A Packing from an Orthogonal Spreading Matrix
4.2 Diversity Optimization
4.2.1 Pairwise Diversity
4.2.2 Pairwise Diversity and Real-Valued Notation
4.2.3 Enhanced Packings by Applying Rotations
4.2.4 Random Search for Good Packings
4.3 Comparison with Other STBCs
4.3.1 An Upper Bound on the Performance – OSTBCs
4.3.2 The Linear Dispersion Codes Proposed in [51]
4.3.3 Multi Layer Space Time Block Codes
4.3.4 Golden Code
4.3.5 Comparison in Terms of BER Performance
Appendix
4.A1 Some Principal Angles Between the Subspaces in (4.26)
4.A2 Diversity Considerations with Complex-Valued Data Symbols
4.A3 Dispersion Matrices from Random Grassmannian Packings
4.A3.1 A Set of Real-Valued 4 × 4 Dispersion Matrices
4.A3.2 A Set of Complex-Valued 4 × 4 Dispersion Matrices
5 Summary and Future Work
List of Frequently Used Accents, Acronyms, Operators, and Symbols
Bibliography
1 Introduction
In the mid 1990s, Foschini [34, 35] and Telatar [106, 107] triggered a new field of
interest by pointing out that there exists an enormous capacity advantage in wire-
less communications when using multiple antennas, both at the transmitter and
at the receiver. Ever since, these systems, which have come to be known as multi-
ple input multiple output (MIMO) systems, have been treated extensively by many
research teams, world wide, resulting in an abundance of publications on virtually
any related issue. Most significantly, besides capacity, MIMO schemes are nowadays
also widely appreciated for their ability to enable reliable transmissions. Meanwhile,
several testbeds have successfully verified that MIMO transmissions are possible under
realistic conditions, see [8,10,53,87,92] for some examples. Furthermore, companies
are starting to implement first products utilizing some of the predicted gains. Despite
all these efforts, there is still a big gap between what theory predicts and reality. Hence,
more sophisticated techniques are yet to be developed to make use of the full potential
that MIMO systems bear. The following paragraphs briefly summarize the most impor-
tant existing results and outline the contributions of this thesis to the enhancement of
MIMO techniques.
Concerning capacity, MIMO systems were found to be attractive primarily because
many scenarios allow for a linear increase of mutual information with respect to the
number of transmit antennas or the number of receive antennas (depending on which
number is smaller) while maintaining bandwidth and fixing transmit power [107].
Indeed, this is a significant capacity gain, especially at high signal to noise ratios
(SNRs), since additional bandwidth is usually not available (or very expensive) and
simply increasing transmit power only leads to a logarithmic capacity gain [19]. The
linear increase is due to the fact that the channel impulse response usually spans a
multi-dimensional space when using several antennas at the transmitter and at the
receiver [107]. Each of these dimensions may be used for an independent data
transmission, thus providing a number of parallel transmission links. In practice,
however, accessing these links individually would require perfect channel state
information at the transmitter [81, 95, 110], which is unlikely to be obtained.
Outdated channel state information or simply estimation errors may cause
significant losses in data rate or performance [72,84]. In this context, a lot of theoret-
ical work has been carried out to compute the maximum achievable data rates under
certain constraints like, for example, the availability of perfect or imperfect channel
state information at the transmitter or the receiver, see for instance [45, 77] and the
references therein. Usually, the lack of channel knowledge – no matter whether this
is at the transmitter or at the receiver – leads to a small degradation in terms of mu-
tual information [45], but, most importantly, it does not affect the linear increase of
capacity. As a matter of fact, Foschini already proposed a scheme in his initial pub-
lication [34], which is nowadays commonly referred to as diagonal Bell Labs layered
space time architecture (D-BLAST), that makes use of the additional data rate while
only demanding channel knowledge at the receiver. Most existing schemes have in
common that they transmit data in parallel in order to exploit the additional data rate.
Due to this fact, the capacity gain of MIMO systems is often also termed multiplexing
gain in literature [65].
With respect to reliability, the gain results from the fact that the individual channel
impulse responses from each transmit antenna to each receive antenna often exhibit
independent fading characteristics [18, Chapter 6]. Therefore, letting the same data
experience multiple components of the MIMO channel enhances the likelihood of the data
being received correctly at the receiver. Such gains where replicas of the same data are
transmitted across several independent transmission links are commonly named diver-
sity gains [88]. In the case of MIMO systems, we usually speak of spatial diversity as
it primarily results from the spatial domain. Its magnitude is specified according to
the number of independent transmission links. For a time invariant flat fading MIMO
scheme, the maximum value is easily seen to be given by the product of the number
of transmit antennas and the number of receive antennas [103]. Space time codes
that achieve the maximum level of diversity are said to be fully diverse. Interestingly,
a moderate number of antennas at the transmitter and at the receiver is already suf-
ficient to provide a bit error probability that gives close to additive white Gaussian
noise (AWGN) performance in an independent identically distributed (i. i. d.) Rayleigh
fading environment [91]. Concerning MIMO transmissions, Alamouti [3] proposed
a simple encoding scheme that efficiently exploits the full potential of spatial
diversity without requiring channel state information at the transmitter. While Alamouti’s
scheme was designed for systems with only two transmit antennas, Tarokh et al. [102]
came up with a generalization for an arbitrary number of transmit antennas, named
orthogonal space time block codes (OSTBCs). Besides diversity, these space time codes
are also very attractive because they reduce the detection complexity of the maximum
likelihood (ML) receiver significantly [3, 65], but their drawback is that they are not
capable of providing a multiplexing gain [66, 93, 102]. Even worse, OSTBCs actu-
ally have a rate loss compared with single antenna systems if more than two transmit
antennas are applied. This led to new designs like quasi OSTBCs [61] or diagonal
algebraic space time (DAST) block codes [20] in order to cope with this additional loss.
Initially, capacity and diversity related issues were mainly considered as two
separate lines of research. Consequently, the majority of the early MIMO transmission
techniques put emphasis on maximizing either data rate or diversity – D-BLAST and
OSTBCs being just two examples. Nowadays, it has been widely accepted that one
needs to consider both aspects jointly for the construction of well performing space
time constellations. Bounds have been established on how much mutual information
and diversity can be achieved simultaneously [21, 120]. It becomes clear that large
data rates and a high level of diversity are not necessarily mutually exclusive
asymptotically [21]. In the literature, the term space time coding has been adopted, generally,
for the application of MIMO signaling techniques at the transmitter. It is also common
practice to distinguish between two branches, namely, the design of space time trellis
codes (STTCs) and the design of space time block codes (STBCs) [40, 81, 110]. While
STTCs, see e. g. [103, 114] for some early work, often justify their name since they
allow for gains similar to those that ordinary channel codes provide, it is important to
note that many (but not all) STBCs do not have this ability. Nevertheless, the design
of STBCs has proven to be a very powerful approach to make MIMO transmissions fea-
sible in today’s applications, partly, also due to the fact that they require less complex
receivers. This thesis contributes to the search for superior STBCs with new findings
on the structure of OSTBCs and new strategies for the construction of high rate space
time constellations that evolve thereof.
STBCs themselves may be grouped into various categories. In the first place, they may be
distinguished by the amount of channel state information they need. Some schemes
include partial knowledge about the channel impulse response (like, e. g., its statistical
properties) for optimization at the transmitter, see e. g. [27,64,95,121], whereas oth-
ers do not even require channel state information at the receiver explicitly or implicitly.
The latter ones are usually referred to as differential, see e. g. [50,58,60,67], or non-
coherent space time codes, see e. g. [49,56,57,118,119], respectively. We will mostly
deal with coherent space time coding schemes that do require channel knowledge only
at the receiver, but an interesting link to non-coherent schemes will be discussed [85],
too. There exists an abundance of different proposals on how to design such STBCs.
Most strategies make use of very general concepts like, e. g., the linear dispersion codes
(LDCs) [51] in order to structure the design problem. Having a structured model, nu-
merical techniques are often applied to find good solutions, see e. g. [51, 52]. Unfor-
tunately, the corresponding functions are usually highly non-convex, which makes it
difficult to find an optimal solution. On the other hand, a purely analytical optimiza-
tion is also hard to achieve. Very powerful STBCs have been proposed by applying
division algebras and number theory, see [7,22,25,37,97]. However, these techniques
are difficult to handle, in particular, for higher dimensional STBCs where they may
also require some type of numerical optimization [80].
Now, in detail, the contributions of this work are the following. We show that there
exist two possibilities to identify OSTBCs with sets of subspaces which are named
Grassmannian packings in the mathematical literature [28]. The properties of these
packings give us new insight into the geometrical structure of OSTBCs. In particular,
we obtain a new link between coherent and non-coherent space time constellations
which provides detailed information on their relationship. Motivated by these results,
we generalize this concept to construct high rate STBCs. This gives us a new power-
ful framework that links the design of general coherent space time constellations with
the search for good Grassmannian packings. We derive packing properties that yield
space time constellations with optimized performance in terms of mutual information
and diversity. We also suggest two methods that enable us to construct excellent
packings. From these, we obtain full rate coherent STBCs that turn out to be superior to the
best known coherent STBCs [7, 24, 116, 117] that we are aware of. The emphasis of
this work is clearly on the relationship between coherent STBCs and Grassmannian
packings, but a well defined transmission model turns out to be helpful to identify
properties and design criteria. Therefore, we work out a unified model that includes
spreading matrices and dispersion matrices, both applying real-valued or complex-
valued notation. While most parts of the model have been described in literature
before, it is particularly their connection that gives us new insights. Besides, much of
the work relies on real-valued notation whose potential is often neglected. The main
results and some related work were also published in [82–86].
This thesis is structured into three major parts. In Chapter 2, we discuss various as-
pects of the transmission model. We also state some important design criteria that are
commonly used for the construction of space time constellations. Chapter 3 addresses
the connection between OSTBCs and Grassmannian packings. Further, we show what
the implications are on the relationship between coherent and non-coherent packing
based constellations. Finally, we generalize the concept for the design of high rate
coherent space time constellations in Chapter 4. Here, the emphasis is on how to con-
struct good sets of dispersion matrices from Grassmannian packings. A summary of
the main results is given in Chapter 5. Additionally, we point out how our findings
may affect future research in this field.
A list of symbols, operators, and accents is given at the end of this thesis. We
usually introduce all entities when they appear for the first time, but the reader is
advised to consult this list when in doubt about a particular definition. Concerning
general conventions, we denote vectors and matrices by bold face lower and
upper case letters, respectively. $(\cdot)^H$ is the Hermitian (conjugate transpose),
$(\cdot)^T$ is the transpose, and $(\cdot)^*$ the complex conjugate of a vector or a
matrix. The trace of a matrix $\mathbf{A}$ is denoted by $\mathrm{tr}(\mathbf{A})$. Its
squared Frobenius norm is $\|\mathbf{A}\|^2$. If not introduced otherwise, we refer to
entries of $\mathbf{A}$ by $[\mathbf{A}]_{ij}$ where $i$ indicates the row and $j$ the
column. Similarly, $[\mathbf{A}]_{:j}$ and $[\mathbf{A}]_{i:}$ select entire columns and
rows and $[\mathbf{a}]_i$ addresses the $i$th element of a vector $\mathbf{a}$.
Real-valued notation is indicated by a long bar above the letter (not to be confused
with the short bar, which is the statistical average, e. g., the average energy per bit
$\bar{E}_b$). If the results apply for real-valued and complex-valued notation, we
usually omit the bar. Having a complex-valued symbol, $\Re(\cdot)$ and $\Im(\cdot)$
denote its real part and imaginary part. Finally, we frequently use diagonal matrices
$\mathbf{\Phi}$ with (principal) angles on the diagonal. Then, $\cos(\mathbf{\Phi})$ and
$\sin(\mathbf{\Phi})$ must be interpreted as the cosine and the sine of the diagonal
elements, respectively, without affecting the off-diagonal elements.
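These conventions can be illustrated by a minimal sketch in plain Python, with matrices stored as lists of rows. The helper names `hermitian`, `matmul`, `trace`, `frobenius_sq`, and `cos_diag` as well as the example values are assumptions made for illustration only. The sketch shows in particular that the cosine of a diagonal angle matrix acts on the diagonal entries only, so that off-diagonal zeros are not turned into cos(0) = 1.

```python
import math

def hermitian(A):
    """Conjugate transpose (.)^H of a matrix given as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(A):
    """tr(A): sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

def frobenius_sq(A):
    """Squared Frobenius norm: sum of squared magnitudes of all entries."""
    return sum(abs(a) ** 2 for row in A for a in row)

def cos_diag(Phi):
    """cos(Phi) for a diagonal angle matrix: only the diagonal is mapped."""
    n = len(Phi)
    return [[math.cos(Phi[i][i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

# Hypothetical example values.
A = [[1 + 2j, 0], [3, -1j]]
Phi = [[0.0, 0.0], [0.0, math.pi / 2]]

norm_via_trace = trace(matmul(A, hermitian(A)))  # tr(A A^H)
norm_direct = frobenius_sq(A)                    # same quantity, entry-wise
C = cos_diag(Phi)                                # off-diagonal entries stay zero
```

The two ways of computing the squared Frobenius norm agree up to rounding, which is a convenient check when switching between the trace form and the entry-wise form.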
2 Models and Design Criteria for Space Time Block Coding Techniques
Throughout the literature, plenty of models have been deployed to describe space
time coded transmissions. Most models are related, but each bears its own ad-
vantages and disadvantages. Many space time codes are closely connected with
the model that was used to describe them. Some of these models are similar concern-
ing their concept and merely differ in notation. Some other models, however, apply dif-
ferent concepts and, therefore, provide easier access to other code properties. For this
reason, the description of an existing space time code using such models often reveals
properties that are hidden otherwise. Also, understanding the relationship between
the models is essential for the comparison of space time codes that were introduced
using different frameworks. In this chapter, we discuss two important concepts that
are used to describe space time transmissions, namely the dispersion model [51] and
the spreading model [52, 83]. As most models have in common that using complex-
valued data symbols imposes some restrictions on the code design, we explain how
real-valued data symbols and a purely real-valued notation [79] allow for a general-
ization which has slightly different implications for the two discussed models. We also
address basic aspects about ML detection and design criteria like transmission rate and
diversity. To begin with, we introduce the MIMO channel and its specific assumptions
that we apply in the course of this thesis.
2.1 The Multiple Input Multiple Output Channel
In a transmission system with multiple antennas at the transmitter and the receiver,
each transmit and receive antenna pair defines a wireless channel. The set of all these
wireless channels is commonly referred to as the MIMO channel, see Figure 2.1. The
number of transmit antennas, which we denote by $n_t$, gives the number of channel
inputs and the number of receive antennas, say $n_r$, defines the number of channel
outputs. In total, we thus have $n_t n_r$ different pairs of transmit and receive
antennas. Each of the wireless channels is described by its impulse response
$h_{ij}(\tau, t)$ where $i$ and $j$ identify the receive antenna and the transmit
antenna, respectively. In general, these impulse responses may be a function of the
absolute time $t$ and disperse the transmit signals in time, which is indicated by the
delay time $\tau$.
[Figure 2.1: MIMO channel with $n_t$ transmit antennas and $n_r$ receive antennas; the links are labeled $h_{11}(\tau, t)$, ..., $h_{n_r n_t}(\tau, t)$.]
We restrict our analysis to slowly time variant, frequency flat channels. This
eases the analysis because there exists a simple baseband channel model where each
channel impulse response may be described by a single complex-valued coefficient
$h_{ij}$, see [88]. Besides, the computation of the channel output then reduces to a
multiplication instead of a convolution, which would be much more difficult to handle in
the theoretical analysis. Moreover, we assume that the time variance of the channel is
sufficiently slow so that $h_{ij}$ may be considered constant for the duration of at
least one space time codeword. It may change from codeword to codeword, though. All in
all, our assumptions comply with the well known quasi static [45] channel model where
the channel is assumed constant during one block of symbols, which we assume to
extend across $\ell$ time slots, and varies independently from block to block. Note that
we do not indicate the blockwise time dependence of $h_{ij}$ because we only consider
single blocks in our analysis where $h_{ij}$ is always fixed.
At first glance, these restrictions appear to be quite severe. However, frequency
selective channels may be easily split up into several parallel frequency flat channels
using orthogonal frequency division multiplexing (OFDM) [32]. Then, space time
coding is applied to each of the frequency slots individually. Doing so, one surely
loses frequency diversity, but spatial diversity often provides a level of diversity that
is sufficiently high already. Besides, additional spreading across the frequency domain
may always be applied, independently of the actual space time code. Also, the
assumption that the channel impulse response does not change for a small number of
time slots is quite moderate.
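The splitting of a frequency selective channel into parallel frequency flat channels can be illustrated with a minimal single-antenna sketch in plain Python. It assumes an idealized OFDM system with a cyclic prefix that is longer than the channel memory, so that the channel acts as a circular convolution, and it omits noise. The tap values, the number of frequency slots, and all function names are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (the inverse includes 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolve(x, h):
    """Channel output after cyclic prefix removal:
    y[m] = sum_l h[l] * x[(m - l) mod N]."""
    n = len(x)
    return [sum(h[l] * x[(m - l) % n] for l in range(len(h)))
            for m in range(n)]

# Hypothetical example: 8 frequency slots, 3-tap frequency selective channel.
symbols = [1, -1, 1j, -1j, 1, 1, -1, 1j]   # one data symbol per frequency slot
taps = [0.8, 0.5 - 0.2j, 0.1j]             # impulse response, dispersive in time

tx_time = dft(symbols, inverse=True)        # OFDM modulation
rx_time = circular_convolve(tx_time, taps)  # dispersive channel, noise omitted
rx_freq = dft(rx_time)                      # OFDM demodulation

# Flat coefficient per frequency slot: DFT of the zero-padded impulse response.
h_freq = dft(taps + [0] * (len(symbols) - len(taps)))
```

Each entry of `rx_freq` equals the corresponding data symbol multiplied by a single complex coefficient from `h_freq`, which is the frequency flat behavior that the analysis assumes. In a MIMO setting, the same reasoning would apply to every transmit and receive antenna pair.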
To allow for a discrete matrix transmission model, all individual channel impulse
responses are gathered in the channel matrix (which we also refer to as (MIMO) channel
impulse response in the following)
\[
\mathbf{H} =
\begin{pmatrix}
h_{11} & \cdots & h_{1 n_t} \\
\vdots & \ddots & \vdots \\
h_{n_r 1} & \cdots & h_{n_r n_t}
\end{pmatrix}
\tag{2.1}
\]
where the individual channel impulse response $h_{ij}$ is the entry of row $i$ and column $j$.
[Figure 2.2: Elementary model for MIMO transmissions; blocks 1, . . . , ℓ mark time slots.]
2.2 Basic System Models and Their Relationship
An elementary model for data transmissions across a MIMO channel where the channel
matrix $\mathbf{H}$ stays constant for a duration of $\ell$ time slots is given by
\[
\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{N}. \tag{2.2}
\]
Here, $\mathbf{X}$ is named transmit matrix. It is a matrix with $n_t$ rows and $\ell$
columns whose entries are typically complex values, which we term transmit symbols. The
transmit symbols in row $i$ are assigned to transmit antenna $i$ and the $\ell$
consecutive time slots. Further, $\mathbf{Y}$ is the receive matrix whose entries are
called receive symbols. As for $\mathbf{X}$, each column is associated with a distinct
time slot whereas the rows correspond to different antennas (here, the receive
antennas), see also Figure 2.2 for an illustration of how the model incorporates time
slots. $\mathbf{Y}$ results from a linear combination of the desired signal components,
namely $\mathbf{H}\mathbf{X}$, and the noise matrix $\mathbf{N}$. We model the receiver
noise components that $\mathbf{N}$ gathers by independent identically distributed
(i. i. d.) Gaussian random variables with variance $\sigma_n^2$ per real dimension.
A space time constellation or, equivalently, an STBC is well defined by a set of
transmit matrices $\mathcal{X}$ whose cardinality we denote by $|\mathcal{X}|$. We
convey data across the MIMO channel by selecting and transmitting a matrix from
$\mathcal{X}$. Hereby, a single transmit matrix carries $\log_2(|\mathcal{X}|)$ bits of
information. We often refer to each element of $\mathcal{X}$ as a point of the
constellation. This is motivated by the observation that each transmit matrix
denotes a point in the signal space of the transmission [70]. Clearly, the set of
transmit matrices should be designed such that the receiver has the chance to distinguish
between its elements. The constraints on the elements of $\mathcal{X}$ very much depend
on the assumptions that we impose on the receiver and the characteristics of the MIMO
channel. Details on this are not to be discussed in this section, but we should mention that
the subsequent considerations apply for constellations that are meant to be used for
coherent transmissions, i. e., transmissions where the receiver has knowledge about
H. We do refer to non-coherent space time constellations again towards the end of
Chapter 3, though, to point out some similarities between the coherent ones that we
discuss and general non-coherent constellations.
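The elementary model (2.2) and the notion of a constellation as a set of transmit matrices can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the channel entries are drawn as i. i. d. zero-mean complex Gaussian variables with unit variance (Rayleigh fading), the four 2 × 2 transmit matrices are a toy set chosen for readability rather than a well performing code, and all function and variable names are hypothetical.

```python
import math
import random

def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def random_channel(n_r, n_t, rng):
    """n_r x n_t channel matrix H with i.i.d. complex Gaussian entries
    of unit variance (assumed fading model)."""
    s = math.sqrt(0.5)
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(n_t)]
            for _ in range(n_r)]

def noise(n_r, ell, sigma_n, rng):
    """Noise matrix N with variance sigma_n^2 per real dimension."""
    return [[complex(rng.gauss(0, sigma_n), rng.gauss(0, sigma_n))
             for _ in range(ell)]
            for _ in range(n_r)]

def transmit(H, X, sigma_n, rng):
    """Y = H X + N for one block; H is constant within the block."""
    HX = matmul(H, X)
    N = noise(len(HX), len(HX[0]), sigma_n, rng)
    return [[HX[i][j] + N[i][j] for j in range(len(HX[0]))]
            for i in range(len(HX))]

# Toy constellation: rows are transmit antennas, columns are time slots.
constellation = [
    [[1, 0], [0, 1]],
    [[-1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1], [-1, 0]],
]
bits_per_matrix = math.log2(len(constellation))  # log2(|X|)

rng = random.Random(0)
n_t, n_r = 2, 3
H = random_channel(n_r, n_t, rng)
X = constellation[2]
Y = transmit(H, X, sigma_n=0.1, rng=rng)  # n_r rows, ell = 2 columns
```

Setting `sigma_n` to zero returns the noise-free product of `H` and `X`, which is a simple way to check the orientation of rows and columns.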
Now, an unstructured set $\mathcal{X}$ is impractical for various reasons. Most
importantly, efficient equalization techniques are heavily based on how the information
is embedded in the transmit matrices, see e. g. [2, 29, 31, 44], but also the
construction of well performing sets of transmit matrices requires some sort of
structure. Almost all coherent STBCs (see footnote 1) that have been proposed in the
literature are constructed from multiple data symbols $x_i$ which are linearly combined
to form the entries of the transmit matrices. Much of our analysis is also based on such
linear STBCs where we always assume that $n$ data symbols are transmitted jointly. The
data symbols themselves are chosen independently from discrete sets of scalars that are
termed alphabets $\mathcal{A}_i$, i. e. $x_i \in \mathcal{A}_i$. Typically, all data
symbols are chosen from the same alphabet, i. e. $\mathcal{A} \equiv \mathcal{A}_i$. The
essence of the design of linear STBCs is to intelligently distribute the symbols $x_i$
across all transmit antennas within a block of $\ell$ consecutive time slots. We should
always keep in mind, however, that the linear structure imposes constraints that
prohibit certain gains, which other schemes are capable of achieving. Some more details
on this will be discussed in Section 3.3 and in the first two paragraphs in Chapter 4.
For the time being, though, we restrict our considerations to these linear schemes.
The dispersion model [51] and the spreading model [52, 83] are two schemes that
provide a reasonable mathematical framework for the description of linear STBCs. The
dispersion model is closely linked with the matrix model that we introduced in (2.2)
while the spreading model deploys signal vectors. Nevertheless, having already agreed
on the specifications of the MIMO channel in Section 2.1, both models are equivalent
with respect to the type of STBCs which they describe. Each model has its own advan-
tages and disadvantages when it comes to the identification of certain properties of the
STBCs, though. In the remainder of this section, we will discuss the conceptual aspects
and the structural properties of both models and their relationship. To do so, we do
not specify the elements of A at the moment. Hence, the reader may assume complex-valued
or real-valued elements in A. However, we soon discover that some existing
STBCs require real-valued alphabets in order to be covered properly. So, we stress that
we usually assume real-valued alphabets in all subsequent sections and chapters if not
mentioned otherwise. Note that this is also indicated by the bar above all data symbols
in upcoming sections and chapters (e. g., $\bar{x}_i$ is a real-valued data symbol). Real-valued
alphabets also raise the desire for a general real-valued notation, but this discussion
is postponed to Section 2.3 because it is less common in literature and needs more
attention. Some additional remarks are finally given in Section 2.4, including a brief
summary of the actual model equations that we apply throughout this work.
¹ An example of an exception is the STBC described in [94].
2.2 Basic System Models and Their Relationship
2.2.1 Linear Dispersion Model
The linear dispersion model [51] defines the transmit matrices as a weighted sum of
dispersion matrices Ci where the weighting coefficients are the information carrying
data symbols xi. Mathematically, this may be expressed by
\[ X = \sum_{i=1}^{n} x_i C_i. \tag{2.3} \]
Hence, the STBC is fully defined by a set of dispersion matrices C with |C| = n and the
symbol alphabet A.
Example 2.1 shows how a well known STBC, namely Alamouti’s OSTBC [3], fits
into the linear dispersion model. We note that a formal introduction to OSTBCs does
not follow until Chapter 3. Nevertheless, Alamouti’s OSTBC [3] serves well as an
illustrative example because of its simple structure. We thus make use of it in several
examples to point out specific aspects about space time coding along the way.
Example 2.1 (Linear dispersion matrices). Most commonly Alamouti’s code is repre-
sented by the orthogonal code matrix [102]
\[ X = \begin{pmatrix} x_1 & -x_2^* \\ x_2 & x_1^* \end{pmatrix} \tag{2.4} \]
which we simply assume given at the moment. For implementational purposes, (2.4)
actually gives the necessary detail in the sense that we know when and where to transmit
which symbol. Nevertheless, we require the dispersion matrices because they give
us more insight into the structure of the scheme later on. Obviously, two complex-valued
symbols are transmitted jointly. More specifically, at time slot 1, $x_1$ and $x_2$ are
transmitted from antenna 1 and 2, respectively. At time slot 2, $-x_2^*$ and $x_1^*$ are
transmitted from antenna 1 and 2. This means, we would need two dispersion matrices,
one for $x_1$ and one for $x_2$. However, it immediately becomes clear that a description
with dispersion matrices according to (2.3) using the complex symbols $x_1$ and $x_2$ is
not possible because of the complex conjugate operation. Still, a description with dispersion
matrices exists. By splitting up the symbols into their real and imaginary parts,
we come up with a slightly modified version of (2.4), namely
\[ X = \begin{pmatrix} \Re(x_1) + j\Im(x_1) & -\Re(x_2) + j\Im(x_2) \\ \Re(x_2) + j\Im(x_2) & \Re(x_1) - j\Im(x_1) \end{pmatrix}. \tag{2.5} \]
From this, we can easily derive a set of four dispersion matrices
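\[ C_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad C_2 = j \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{2.6a} \]
\[ C_3 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \text{and} \quad C_4 = j \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{2.6b} \]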
that are weighted with the symbols ℜ(x1), ℑ(x1), ℜ(x2), and ℑ(x2), respectively. Later
on, we simply refer to these real-valued data symbols as $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$, and $\bar{x}_4$.
Remark. In the previous example, it might have been more straightforward to use an
STBC that does not require the complex conjugate operation, but, by using Alamouti’s
OSTBC, we already make the reader aware of the difficulties that occur if the data
symbols are chosen from a complex-valued alphabet.
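As a small numerical sketch of Example 2.1 (not part of the original derivation), the following Python snippet rebuilds the code matrix (2.4) from the four dispersion matrices in (2.6a) and (2.6b). The helper names `disperse` and `alamouti` and the test symbols are our own illustrative choices; matrices are stored as nested lists with rows indexing antennas and columns indexing time slots.

```python
# Dispersion matrices (2.6a) and (2.6b); rows index antennas, columns index time slots.
C1 = [[1, 0], [0, 1]]
C2 = [[1j, 0], [0, -1j]]
C3 = [[0, -1], [1, 0]]
C4 = [[0, 1j], [1j, 0]]


def disperse(symbols, matrices):
    """Weighted sum X = sum_i x_i C_i as in (2.3)."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(x * C[r][c] for x, C in zip(symbols, matrices))
             for c in range(cols)] for r in range(rows)]


def alamouti(x1, x2):
    """Orthogonal code matrix (2.4) built directly from two complex symbols."""
    return [[x1, -x2.conjugate()], [x2, x1.conjugate()]]


# Arbitrary complex test symbols (illustrative values only).
x1, x2 = 1 + 2j, -3 + 0.5j

# Real-valued data symbols: real and imaginary parts of x1 and x2.
real_symbols = [x1.real, x1.imag, x2.real, x2.imag]

X_dispersion = disperse(real_symbols, [C1, C2, C3, C4])
X_direct = alamouti(x1, x2)

# Largest entrywise deviation between the two constructions.
max_error = max(abs(a - b)
                for row_a, row_b in zip(X_dispersion, X_direct)
                for a, b in zip(row_a, row_b))
```

Both constructions agree up to rounding, which illustrates why four real-weighted dispersion matrices are sufficient where two complex-weighted ones are not.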
2.2.2 Spreading Model
Clearly, linear space time block coding may be interpreted as spreading data symbols
across the dimensions space and time. It may appear that this is merely a matter
of wording. However, spreading schemes are usually linked to a certain model de-
scription, which is rather uncommon for MIMO systems, but otherwise a well known
concept in communications [69]. Some authors actually do apply the model implic-
itly to simplify certain types of analysis, see e. g. [22, 52, 101], but we often find it
beneficial to define a spreading model explicitly as such.
The key idea of the linear MIMO spreading scheme is to define a spreading matrix
S which maps the data symbol vector x to its transmit symbol vector s:²
\[ s = S x. \tag{2.7} \]
Here, x is composed of the data symbols $x_i$ from before, i. e. $[x]_i = x_i$. Further, S is
constructed in such a way that s stacks the vectors that are transmitted consecutively.
In other words, the symbols $[s]_1, \ldots, [s]_{n_t}$ are transmitted from antenna 1 to $n_t$ during
the first time slot, the symbols $[s]_{1+n_t}, \ldots, [s]_{2n_t}$ are transmitted from antenna 1 to $n_t$
during the second time slot, and so on. Using the effective channel impulse response
$H_\ell$ that results from the concatenation of ℓ time slots, i. e.
\[ H_\ell = \begin{pmatrix} H & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & H \end{pmatrix}, \tag{2.8} \]
the overall transmission is then described by
y = HℓSx + n (2.9)
where n is now termed noise vector, see also Figure 2.3 for a visualization of the
structure of this mathematical model.
Example 2.2 (A spreading matrix). Let us again consider Alamouti’s scheme. Using
the details from Example 2.1, we may easily construct the transmit vector
\[ s = ( x_1 \;\; x_2 \;\; -x_2^* \;\; x_1^* )^T. \tag{2.10} \]
² Note that we may refer to x and s simply as the symbol vector whenever there is no confusion possible.
Figure 2.3: Spreading model ($y = H_\ell S x + n$); blocks 1, . . . , ℓ mark time slots; white spaces denote zero elements.
A straightforward definition of the symbol vector would be $x = ( x_1 \;\; x_2 )^T$. However,
we have similar difficulty as in the previous example, because there does not exist
a linear mapping from, for example, $x_1$ to $x_1^*$. For this reason, a spreading matrix S
does not exist that maps x to s. If we define a modified symbol vector
\[ \bar{x} = ( \Re(x_1) \;\; \Im(x_1) \;\; \Re(x_2) \;\; \Im(x_2) )^T, \tag{2.11} \]
though, a spreading matrix exists, namely
\[ S = \begin{pmatrix} 1 & j & 0 & 0 \\ 0 & 0 & 1 & j \\ 0 & 0 & -1 & j \\ 1 & -j & 0 & 0 \end{pmatrix}. \tag{2.12} \]
That means, $s = S\bar{x}$.
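The mapping of Example 2.2 can be checked numerically. The sketch below, with helper names and test values chosen by us for illustration, applies the spreading matrix (2.12) to the real-valued symbol vector (2.11) and compares the result with the transmit vector (2.10).

```python
# Spreading matrix (2.12); it takes real-valued symbols and returns complex transmit symbols.
S = [[1, 1j, 0, 0],
     [0, 0, 1, 1j],
     [0, 0, -1, 1j],
     [1, -1j, 0, 0]]


def matvec(M, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def split_real_imag(symbols):
    """Stack real and imaginary parts symbolwise, as in (2.11)."""
    out = []
    for x in symbols:
        out += [x.real, x.imag]
    return out


# Arbitrary complex test symbols (illustrative values only).
x1, x2 = 0.7 - 1.2j, -0.3 + 2.0j

x_bar = split_real_imag([x1, x2])                        # (2.11)
s = matvec(S, x_bar)                                     # s = S x_bar
s_expected = [x1, x2, -x2.conjugate(), x1.conjugate()]   # (2.10)

spreading_error = max(abs(a - b) for a, b in zip(s, s_expected))
```

The first two entries of `s` belong to time slot 1 and the last two to time slot 2, in line with the stacking convention described for (2.7).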
2.2.3 Matched Filtering and Transmission Matrices
The definition of the matched filter output [70] is useful because many receive al-
gorithms rely on it. Additionally, matched filtering often gives more insight into the
structure of the transmission scheme, in particular, when despreading is included in
the matched filtering process. Most importantly, at the moment, we can use matched
filtering to motivate the definition of transmission matrices [69]. To do so, we first
consider the matched filter output for the dispersion model, which is
\[ Y_m = H^H H X + H^H N \tag{2.13} \]
where $H^H$ is the impulse response of the matched filter. Note that the matched filter
covers the spatial dimensions such that it implicitly includes maximum ratio combining
(MRC) with respect to the signal contributions from different receive antennas. It is
self-suggestive to combine the impulse response of the channel and of the matched
filter to form a single matrix
\[ R = H^H H, \tag{2.14} \]
which is the type of matrices that we refer to as transmission matrix from now on (not
to be confused with the transmit matrix X). It provides a direct link between the input
and output symbol matrices in (2.13)
\[ Y_m = R X + H^H N. \tag{2.15} \]
A similar description exists for the spreading model as well. Here, the matched filter
output is
\[ y_m = R_\ell s + H_\ell^H n. \tag{2.16} \]
This particular transmission matrix $R_\ell$ evolves from R in the same manner as $H_\ell$ does
from H. Moreover, additional despreading with $S^H$ motivates yet another definition
of a transmission matrix
\begin{align}
\tilde{x} &= S^H R_\ell S x + S^H H_\ell^H n \tag{2.17} \\
&= R_s x + S^H H_\ell^H n. \tag{2.18}
\end{align}
Here, we have $R_s = S^H R_\ell S$ where the subscript s indicates that spreading is included.
The latter one of these three transmission matrices turns out to be the most useful one
for our considerations since it directly maps the data symbols.
All transmission matrices have in common that they are hermitian. The main di-
agonal values denote the effective transmission coefficients for each of the symbols
whereas the off-diagonal elements determine the amount of crosstalk between differ-
ent pairs of symbols. Therefore, an informal design goal is to make the off-diagonal
elements as small as possible to avoid crosstalk while the main diagonal elements
should be as large as possible. Clearly, this points out an advantage of Rs concerning
the design of transmission strategies, i. e. STBCs. Namely, R and Rℓ are fixed by the
channel impulse response and may not be altered by any type of signal processing at
the transmitter or the receiver, but the structure of Rs may be optimized by designing
the spreading matrix appropriately. Hence, Rs captures the properties of the chan-
nel as well as those of spatial and temporal spreading. It is also important to realize
that models which include matched filtering or matched filtering with despreading
still provide sufficient statistics. This ensures their optimality concerning the
detection process. Another useful property of the transmission matrices is the fact that
they also denote the noise correlation matrices after matched filtering/despreading.
Note that the above considerations hold for uncorrelated noise at the receiver, but a
generalization for colored noise is straightforward, see [86].
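To make the role of the off-diagonal elements concrete, the following sketch evaluates $R_s = S^H R_\ell S$ for Alamouti's scheme with the spreading matrix (2.12). The channel coefficients are arbitrary illustrative values and all helper names are ours. For this particular code, the real part of $R_s$ turns out to be a scaled identity, so real-valued data symbols experience no crosstalk after taking the real part.

```python
def ctranspose(M):
    """Conjugate transpose of a nested-list matrix."""
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]


def matmul(A, B):
    """Matrix product for nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def block_diag(R, blocks):
    """Repeat R on the diagonal, i.e. build R_l from R as H_l is built from H in (2.8)."""
    n = len(R)
    out = [[0j] * (n * blocks) for _ in range(n * blocks)]
    for b in range(blocks):
        for i in range(n):
            for j in range(n):
                out[b * n + i][b * n + j] = R[i][j]
    return out


# Spreading matrix (2.12) of Alamouti's OSTBC.
S = [[1, 1j, 0, 0],
     [0, 0, 1, 1j],
     [0, 0, -1, 1j],
     [1, -1j, 0, 0]]

# Assumed 2x2 channel realization (rows: receive antennas, columns: transmit antennas).
H = [[0.5 + 1.0j, -0.8 + 0.3j],
     [1.1 - 0.4j, 0.2 + 0.9j]]

R = matmul(ctranspose(H), H)                   # (2.14)
R_l = block_diag(R, 2)                         # two time slots
R_s = matmul(matmul(ctranspose(S), R_l), S)    # R_s = S^H R_l S

# Squared Frobenius norm of the channel, the expected main diagonal value.
channel_gain = sum(abs(h) ** 2 for row in H for h in row)
```

The off-diagonal entries of `R_s` are purely imaginary here, while every main diagonal entry equals `channel_gain`.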
2.2.4 Relationship Between the Two Models
To begin with, we again stress that the linear dispersion model and the spreading
model are mathematically equivalent. In other words, both models carry out the same
mapping between the data symbols and the receive symbols, at least, when consider-
ing the type of channel that we introduced in Section 2.1 (see the remark at the end of
Figure 2.4: Mapping between spreading matrices and dispersion matrices.
this section for some more details). We have also mentioned before that, despite this
equivalence, each model provides its own advantages for the code design and analysis,
which we work out in the course of the thesis. Just to give a couple of examples, we
will observe that the spreading model proves useful for any type of capacity analysis
whereas the dispersion model is superior for certain types of code construction. Fur-
thermore, another advantage of the spreading model is that it provides a vector matrix
representation that many detection techniques require for straightforward implemen-
tation [29,31].
To make use of the properties of both models, it is important to understand their
relationship. The link between the two models is actually rather simple. Namely,
y evolves from Y by stacking its columns. The same applies for the relationship between
s and X. In order to compare different space time constellations, it is also
essential to grasp the connection between the spreading matrix S and the dispersion
matrices $C_i$. We illustrate this mapping in Figure 2.4. In words, the ith column of S
is obtained by stacking the columns of the dispersion matrix $C_i$. Further, there exists a
simple link between the entries of the transmission matrix $R_s$ and the dispersion matrices.
Specifically,
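\begin{align}
[R_s]_{ij} &= [S]_{:i}^H R_\ell [S]_{:j} \tag{2.19} \\
&= \operatorname{tr}\left( C_i^H R C_j \right). \tag{2.20}
\end{align}
Applying some equivalence properties of traces, (2.20) becomes
\begin{align}
[R_s]_{ij} = \operatorname{tr}\left( C_i^H R C_j \right) &= \operatorname{tr}\left( C_j^T R^* C_i^* \right) \tag{2.21} \\
&= \frac{1}{2} \operatorname{tr}\left( C_i^H R C_j + C_j^T R^* C_i^* \right) \tag{2.22} \\
&= \frac{1}{2} \operatorname{tr}\left( R C_j C_i^H + R^* C_i^* C_j^T \right) \tag{2.23} \\
&= \frac{1}{2} \operatorname{tr}\left( \Re(R) \left( C_j C_i^H + C_i^* C_j^T \right) + j\Im(R) \left( C_j C_i^H - C_i^* C_j^T \right) \right). \tag{2.24}
\end{align}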
Later on, some special cases occur, namely, whenever the dispersion matrices are either
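real-valued or purely imaginary. Then, we have
\[ [R_s]_{ij} = \frac{1}{2} \operatorname{tr}\left( \Re(R) \left( C_j C_i^H + C_i C_j^H \right) + j\Im(R) \left( C_j C_i^H - C_i C_j^H \right) \right) \tag{2.25} \]
if $C_i$ and $C_j$ are both purely real or both purely imaginary and
\[ [R_s]_{ij} = \frac{1}{2} \operatorname{tr}\left( j\Im(R) \left( C_j C_i^H + C_i C_j^H \right) + \Re(R) \left( C_j C_i^H - C_i C_j^H \right) \right) \tag{2.26} \]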
if one of them is real and the other one is imaginary. For a purely real-valued
transmission matrix R, the second addend vanishes in (2.25) whereas the first one
vanishes in (2.26). Thus, $[R_s]_{ij}$ is determined only by the hermitian or skew hermitian
components of the matrix product of the dispersion matrices $C_i$ and $C_j$. Further, note
that the first addends are real-valued in both (2.25) and (2.26) whereas the last
ones are purely imaginary. Since the imaginary part does not affect the decision if
the dispersion matrices are weighted with real symbols, only the first addends are of
interest in those cases.
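The link between the two models can be verified with a few lines of code. The sketch below, using our own helper names and an arbitrary single-antenna channel, stacks the columns of the dispersion matrices from Example 2.1 to obtain S and compares the entries of $R_s$ computed via (2.19) with those computed via the trace expression (2.20).

```python
def ctranspose(M):
    """Conjugate transpose of a nested-list matrix."""
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]


def matmul(A, B):
    """Matrix product for nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def vec(C):
    """Stack the columns of C into one long vector."""
    return [C[r][c] for c in range(len(C[0])) for r in range(len(C))]


def trace(M):
    return sum(M[k][k] for k in range(len(M)))


# Dispersion matrices (2.6a) and (2.6b).
C = [[[1, 0], [0, 1]],
     [[1j, 0], [0, -1j]],
     [[0, -1], [1, 0]],
     [[0, 1j], [1j, 0]]]

# Column i of the spreading matrix is vec(C_i).
S = [list(row) for row in zip(*[vec(Ci) for Ci in C])]

# Assumed channel with one receive antenna (illustrative values only).
H = [[0.5 + 1.0j, -0.8 + 0.3j]]
R = matmul(ctranspose(H), H)


def entry_from_spreading(i, j):
    """[S]_{:i}^H R_l [S]_{:j} with block diagonal R_l, see (2.19)."""
    si, sj = vec(C[i]), vec(C[j])
    n = len(R)
    return sum(si[b * n + r].conjugate() * R[r][c] * sj[b * n + c]
               for b in range(len(si) // n) for r in range(n) for c in range(n))


def entry_from_dispersion(i, j):
    """tr(C_i^H R C_j), see (2.20)."""
    return trace(matmul(matmul(ctranspose(C[i]), R), C[j]))


max_gap = max(abs(entry_from_spreading(i, j) - entry_from_dispersion(i, j))
              for i in range(4) for j in range(4))
```

The stacked matrix `S` reproduces (2.12), and `max_gap` vanishes up to rounding, which is the equality between (2.19) and (2.20).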
Remarks. a) Although we will not make use of this property throughout this work, we
point out that the spreading model is more general than the dispersion model in terms
of the type of the channel impulse responses it allows for. The dispersion model, by
virtue of its structure, inherently assumes a constant channel impulse response during
the transmission of one block. For the spreading model, on the other hand, a straightforward
extension for more general channel conditions exists, simply by letting $H_\ell$
attain a structure that has different blocks on the diagonal or that is not block diagonal.
This way, the spreading model is easily adapted for time variant and/or frequency
selective channels. However, this structure should be taken into account if one wants
to design STBCs for such scenarios, which we do not. b) Sometimes when we switch
between the description of a constellation in terms of dispersion matrices and the same
constellation in terms of spreading matrices, we omit a scaling factor for convenience.
For example, a unitary dispersion matrix would cause a column with norm $\sqrt{\ell}$ in the
spreading matrix, which we usually set to 1 without mentioning this explicitly.
2.3 Real-Valued Notation
In communications, it is common practice to represent baseband signals with complex
numbers. Complex numbers, however, often prevent certain aspects from being dealt
with in a straightforward way, see e. g. [79, 82]. We already had to cope with some
difficulty when we represented Alamouti’s OSTBC with dispersion matrices in Exam-
ple 2.1 and with spreading matrices in Example 2.2. This was because the model does
not allow for the complex conjugate operation. Generalized, this means that the use of
complex-valued data symbols hides some degrees of freedom that are thus not acces-
sible for any type of optimization. To overcome this problem, we split up the symbols
into their real and imaginary parts and use each of the parts as separate real-valued
symbols. Still, the evolving transmit signals are complex-valued and the modeling of
the transmission itself is also complex-valued. Considering Example 2.2, in particular,
this solution is rather unsatisfactory because the resulting spreading matrix carries out
a mapping from real-valued entities to complex-valued entities. With such a mapping,
an analytical treatment takes more effort or even becomes infeasible at times. Therefore, we
are interested in a model that maps real-valued data symbols to real-valued transmit
symbols. So, an entirely real-valued transmission model also requires a real-valued
representation of the spreading matrices, the dispersion matrices, and the channel
impulse responses.
Remark. For the sake of completeness, we mention another important advantage of
real-valued notation, which is less significant for our considerations in the remainder
of this thesis, though. That is, complex-valued notation lacks the ability to capture
certain types of correlation in a single correlation matrix correctly, which are, however,
easily incorporated with real-valued notation. As a matter of fact, this is the reason
why Neeser and Massey initially suggested the use of real-valued notation in [79].
2.3.1 Channel Matrices and Signal Vectors
The connection between complex-valued notation and real-valued notation is purely
mathematical. A complex number a may be interpreted as a two-dimensional entity
whose components are orthogonal. A 2 × 2 matrix that is constructed from the complex
number in the following manner
\[ \bar{A} = \begin{pmatrix} \Re(a) & -\Im(a) \\ \Im(a) & \Re(a) \end{pmatrix} \tag{2.27} \]
bears the same property. It is an equivalent description of the complex number in the
sense that it yields the same result with respect to addition and multiplication of two
arbitrary complex numbers [79]. The representation of the complex conjugate of a is
simply given by the transpose of the matrix in (2.27).
An extension to matrices is straightforward. Each of the elements of the complex-valued
matrix A is transformed according to (2.27), i. e.,
\[ \bar{A} = \Re(A) \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \Im(A) \otimes \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{2.28} \]
For the impulse response of the MIMO channel, this is
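\[ \bar{H} = \begin{pmatrix}
\Re(h_{11}) & -\Im(h_{11}) & \cdots & \Re(h_{1n_t}) & -\Im(h_{1n_t}) \\
\Im(h_{11}) & \Re(h_{11}) & \cdots & \Im(h_{1n_t}) & \Re(h_{1n_t}) \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\Re(h_{n_r1}) & -\Im(h_{n_r1}) & \cdots & \Re(h_{n_rn_t}) & -\Im(h_{n_rn_t}) \\
\Im(h_{n_r1}) & \Re(h_{n_r1}) & \cdots & \Im(h_{n_rn_t}) & \Re(h_{n_rn_t})
\end{pmatrix}. \tag{2.29} \]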
Again, the product of several matrices gives the same result using either one of the
notations – the complex-valued one or the real-valued one. That means, the con-
catenation of several system matrices (like channel matrices or (complex) spreading
matrices) is the same in both cases, which, of course, is an essential requirement.
For the signal vectors, preserving orthogonality between the real part and the imag-
inary part is not essential. As a matter of fact, the orthogonality between the real
and imaginary part of the signal components is one of the limitations of complex-valued
notation. That is why we simply stack the real and the imaginary part symbolwise
to obtain a real-valued description of the signal vector. Mathematically, we
may express the relationship between the complex-valued transmit vector s and its
real-valued counterpart $\bar{s}$ in the following way:
\[ \bar{s} = \Re(s) \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \Im(s) \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.30} \]
At this point, we want to emphasize again that $y = Hs + n$ and $\bar{y} = \bar{H}\bar{s} + \bar{n}$ are
two equivalent equations if (2.30) and (2.28) apply for the relationship between all
corresponding vectors and matrices, respectively. So, both equations model the same
transmission over the same channel.
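The equivalence of the two notations is easy to confirm numerically. In the sketch below, `real_matrix` follows (2.28) and `real_vector` follows (2.30); both names, the second matrix G, and all numerical values are illustrative assumptions of ours. The checks cover the noise-free transmission Hs and the product of two system matrices.

```python
def real_matrix(A):
    """Real-valued counterpart of a complex matrix according to (2.28)."""
    out = []
    for row in A:
        top, bottom = [], []
        for a in row:
            a = complex(a)
            top += [a.real, -a.imag]
            bottom += [a.imag, a.real]
        out += [top, bottom]
    return out


def real_vector(s):
    """Real-valued counterpart of a complex vector according to (2.30)."""
    out = []
    for v in s:
        v = complex(v)
        out += [v.real, v.imag]
    return out


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Assumed channel matrix, a second arbitrary system matrix, and a transmit vector.
H = [[0.5 + 1.0j, -0.8 + 0.3j],
     [1.1 - 0.4j, 0.2 + 0.9j]]
G = [[1 - 1j, 2j],
     [0.5, -1 + 0.25j]]
s = [1 - 1j, 0.5 + 2j]

# Transmission: converting H s must equal the real-valued product of the converted parts.
vector_error = max(abs(a - b) for a, b in
                   zip(real_vector(matvec(H, s)),
                       matvec(real_matrix(H), real_vector(s))))

# Concatenation of system matrices gives the same result in both notations.
product_error = max(abs(a - b)
                    for row_a, row_b in zip(real_matrix(matmul(H, G)),
                                            matmul(real_matrix(H), real_matrix(G)))
                    for a, b in zip(row_a, row_b))
```

Note that the matrices double in both dimensions whereas the vectors only double in length, which reflects the different treatment of system matrices and signal vectors described above.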
2.3.2 Real-Valued Spreading Matrices
The impulse responses of all physically existing MIMO channels are described properly
with complex-valued notation as well as with real-valued notation where the mapping
between the two is according to (2.28). Therefore, their real-valued representations always
have a structure that allows for this transformation. The real-valued spreading
matrices that we introduce now do not require such a structure. They actually do have
this particular structure if and only if it is possible to express the corresponding STBC
with complex-valued data symbol vectors. From the examples before, we know that
this is not possible for all existing STBCs, including Alamouti’s OSTBC. Therefore, the
real-valued spreading matrices of these STBCs must have a more general structure to
allow for arbitrary linear mappings between real-valued data symbols and real-valued
transmit symbols. A simple example is given by the complex conjugate operation that
is detailed now.
Example 2.3 (Complex conjugate operation). Let x be an arbitrary complex-valued
data symbol. No linear transformation with only x as input is able to provide $x^*$ at
its output. That means, $x'$ where $x' = ax$ will not be the complex conjugate of x for
an arbitrary but fixed complex constant a and an arbitrary complex variable x. Now,
real-valued notation easily incorporates the complex conjugate operation as a linear
transform like the following expression shows:
\[ \begin{pmatrix} \Re(x') \\ \Im(x') \end{pmatrix} = \begin{pmatrix} \Re(x) \\ -\Im(x) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \Re(x) \\ \Im(x) \end{pmatrix} = \bar{S}\bar{x}. \tag{2.31} \]
Again, we want to stress that this simple 2 × 2 spreading matrix does not have a
structure according to (2.27), which is the reason why a complex-valued representation
does not exist in the first place.
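Example 2.3 may be illustrated by the following sketch, in which the function names are hypothetical helpers of ours. It applies the 2 × 2 matrix from (2.31) to the stacked real and imaginary parts of a symbol and tests whether a real 2 × 2 matrix has the structure of (2.27), i. e. whether it represents multiplication with some complex constant.

```python
# Real-valued 2x2 "spreading matrix" from (2.31) that performs complex conjugation.
CONJ = [[1, 0],
        [0, -1]]


def apply_real(M, x):
    """Apply a real 2x2 matrix to (Re x, Im x) and return the result as a complex number."""
    re = M[0][0] * x.real + M[0][1] * x.imag
    im = M[1][0] * x.real + M[1][1] * x.imag
    return complex(re, im)


def has_complex_structure(M):
    """True if M has the form of (2.27): equal diagonal, antisymmetric off-diagonal."""
    return M[0][0] == M[1][1] and M[0][1] == -M[1][0]


def as_real_matrix(a):
    """Real-valued representation (2.27) of multiplication with the complex constant a."""
    return [[a.real, -a.imag], [a.imag, a.real]]


x = 2 + 3j                        # arbitrary test symbol
x_conj = apply_real(CONJ, x)      # equals the complex conjugate of x

# Multiplication with a fixed complex constant, in contrast, always has the structure (2.27).
a = 0.5 - 2j
scaled = apply_real(as_real_matrix(a), x)
```

Here `x_conj` equals `x.conjugate()` although `CONJ` fails the structure test, whereas `as_real_matrix(a)` passes it and reproduces `a * x`.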
The complex conjugate operation is common to various space time coding schemes
and Alamouti’s OSTBC is just one out of many STBCs that apply it. To demonstrate
how real-valued notation actually affects the spreading matrices, we give another ex-
ample.
Example 2.4 (Real-valued spreading matrix). We again consider Alamouti’s OSTBC.
The mapping between data vector and transmit vector is clearly defined by (2.4). From
this, the construction of the real-valued spreading matrix is straightforward, i. e.
\[ \bar{S} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & 1 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 0
\end{pmatrix}^T. \tag{2.32} \]
Note that the relationship between $\bar{S}$ and the complex-valued spreading matrix S from
Example 2.2 is
\[ \bar{S} = \Re(S) \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \Im(S) \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.33} \]
This is the mapping that applies for the relationship between any real-valued spreading
matrices and their complex-valued counterparts, those which take real-valued symbols
at their input. A mapping according to (2.33) exists for any real-valued spreading matrix,
but, as expected, this spreading matrix does not have a complex-valued equivalent
according to (2.28). It is also important to note that $\bar{S}$ is non-square whereas the
spreading matrix in Example 2.2 is square. Some consequences will become clear in
Section 2.6 where we consider the capacity of a MIMO channel.
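The construction in Example 2.4 can be reproduced as follows. The sketch applies (2.33) to the complex-valued spreading matrix (2.12) and compares the result with the matrix printed in (2.32); the helper `real_spreading` and the test symbols are our own illustrative choices.

```python
# Complex-valued spreading matrix (2.12) that takes real-valued data symbols.
S = [[1, 1j, 0, 0],
     [0, 0, 1, 1j],
     [0, 0, -1, 1j],
     [1, -1j, 0, 0]]

# The 4x8 array printed inside the parentheses of (2.32), before the transpose.
ROWS_2_32 = [[1, 0, 0, 0, 0, 0, 1, 0],
             [0, 1, 0, 0, 0, 0, 0, -1],
             [0, 0, 1, 0, -1, 0, 0, 0],
             [0, 0, 0, 1, 0, 1, 0, 0]]
S_BAR_PRINTED = [list(col) for col in zip(*ROWS_2_32)]   # apply the transpose


def real_spreading(S):
    """Mapping (2.33): every row of S becomes a row of real parts and a row of imaginary parts."""
    out = []
    for row in S:
        out.append([complex(v).real for v in row])
        out.append([complex(v).imag for v in row])
    return out


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


S_bar = real_spreading(S)

# Check the transmission of two arbitrary complex symbols (illustrative values).
x1, x2 = 0.7 - 1.2j, -0.3 + 2.0j
x_bar = [x1.real, x1.imag, x2.real, x2.imag]
s = [x1, x2, -x2.conjugate(), x1.conjugate()]            # (2.10)
s_bar_expected = [part for v in s for part in (v.real, v.imag)]
s_bar = matvec(S_bar, x_bar)

real_spreading_error = max(abs(a - b) for a, b in zip(s_bar, s_bar_expected))
```

The resulting matrix has eight rows and four columns, which is the non-square shape mentioned at the end of the example.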
2.3.3 Real-Valued Dispersion Matrices
In Example 2.1, we applied real-valued data symbols to come up with the (complex-
valued) dispersion matrices that correctly describe Alamouti’s OSTBC. In terms of
spreading matrices, the counterpart of these particular dispersion matrices is a
complex-valued spreading matrix that takes real-valued data symbols and returns
complex-valued transmit symbols, which we described in Example 2.2. Naturally, real-
valued spreading matrices also have their corresponding dispersion matrices. Once
more, let us consider Alamouti’s scheme as an example.
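Example 2.5 (Real-valued dispersion matrices). The real-valued dispersion matrices
are
\[ \bar{C}_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}^T, \quad \bar{C}_2 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}^T, \tag{2.34a} \]
\[ \bar{C}_3 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}^T, \quad \text{and} \quad \bar{C}_4 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}^T. \tag{2.34b} \]
It is easily verified that the matrices $\bar{C}_i$ evolve from $\bar{S}$ in the same manner as $C_i$ from
S, see also Figure 2.4. Further, note that $\bar{C}_i$ is constructed from the corresponding $C_i$
in Example 2.1 in the same way as $\bar{S}$ is from S, i. e.,
\[ \bar{C}_i = \Re(C_i) \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \Im(C_i) \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.35} \]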
Just like the spreading matrices, the complex-valued dispersion matrices in Exam-
ple 2.1 differ from the real-valued dispersion matrices in (2.34) with respect to their
dimensionality. The interpretation is different, though. Contrary to the spreading ma-
trices, it is now the number of time slots that is linked to the number of columns. Since
we usually construct sets of square dispersion matrices, this means that the number
of time slots in the real-valued case is twice the number of time slots compared with
the complex-valued case. This means that the rate (in terms of number of symbols per
time slot) of the STBC constructed from the real-valued set is only half of the rate of
the complex-valued one if the cardinality of both sets is the same and the dispersion
matrices are weighted with real-valued data symbols chosen from the same alphabet
in both cases. In other words, having set these constraints concerning the dispersion
matrices and the channel, we need twice as many dispersion matrices if we apply real-
valued notation in order to obtain the same rate. Therefore, detection usually gets
more involved since double the number of symbols have to be decided on jointly, but,
on the positive side, we have more degrees of freedom for optimization due to a larger
signal space, see Chapter 4.
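As a brief check of Example 2.5, the following sketch applies (2.35) to the complex-valued dispersion matrices of Example 2.1 and stacks the columns of the results. The helper names are ours. The stacked vectors should coincide with the columns of the real-valued spreading matrix in (2.32).

```python
# Complex-valued dispersion matrices (2.6a) and (2.6b).
C = [[[1, 0], [0, 1]],
     [[1j, 0], [0, -1j]],
     [[0, -1], [1, 0]],
     [[0, 1j], [1j, 0]]]


def real_dispersion(Ci):
    """Mapping (2.35): each row becomes a row of real parts followed by a row of imaginary parts."""
    out = []
    for row in Ci:
        out.append([complex(v).real for v in row])
        out.append([complex(v).imag for v in row])
    return out


def vec(M):
    """Stack the columns of M into one vector."""
    return [M[r][c] for c in range(len(M[0])) for r in range(len(M))]


C_bar = [real_dispersion(Ci) for Ci in C]

# Column i of the real-valued spreading matrix is vec(C_bar[i]).
S_bar_columns = [vec(Cb) for Cb in C_bar]
```

Each entry of `C_bar` has four rows and two columns, i. e. the row dimension doubles while the number of columns, which is tied to the number of time slots, stays the same for this example.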
2.4 Summary of General System Model Equations and Discussion
From Examples 2.1 and 2.2 and the considerations in the previous section, we have
learned that the general case is covered only if the real part and the imaginary part
of complex-valued data symbols are dealt with independently. Eventually, this means
that all data symbols should be real-valued. Provided that this is the case, we have the
choice to use complex-valued or real-valued entities to model the remaining mappings
and transmission. From now on, we mostly³ restrict our considerations to one of these
³ In Section 4.2.4, we propose an STBC that is optimal in terms of mutual information only if complex-valued data symbols are applied. Further, in Section 4.3, some of the constellations from literature require complex-valued data symbols as well.
general models where it should become clear from the context which one is actually
used. It is important to keep in mind that any upcoming transmission equation could
be rewritten by using any one of these general models. At times, we switch between
the models without explicitly mentioning, simply because one or the other model is
more suited for certain observations and derivations. In other words, we always use
the model that is best suited for the statements that we intend to make. In summary,
the four general model equations are
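\[ y = H S \bar{x} + n \quad \text{and} \quad \bar{y} = \bar{H} \bar{S} \bar{x} + \bar{n} \tag{2.36} \]
and
\[ Y = H \sum_{i=1}^{n} \bar{x}_i C_i + N \quad \text{and} \quad \bar{Y} = \bar{H} \sum_{i=1}^{n} \bar{x}_i \bar{C}_i + \bar{N}. \tag{2.37} \]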
As we usually apply real-valued data symbols, there should not be any confusion concerning
the use of the matrices S and $C_i$ from now on. If not stated otherwise, they
always map real-valued data symbols to complex-valued transmit signals and their
interrelationship with $\bar{S}$ and $\bar{C}_i$ is always according to (2.33) and (2.35), respectively.
Typically, the general case is covered by using complex-valued dispersion matrices
weighted with real-valued data symbols, i. e. it is modeled similarly according to the
equation on the left hand side in (2.37). As a matter of fact, this was already suggested
by Hassibi et al. in their initial publication on linear dispersion codes [51]. Contrary
to our use, it is, however, common practice to define two sets of dispersion matrices,⁴
one set for the real parts of the data symbols and another one for the imaginary parts
of the data symbols. Since the mapping between the real-valued data symbols we use
and the real and imaginary parts of the corresponding complex-valued data symbols
is arbitrary, we prefer the use of a single set of dispersion matrices, which is basically
the union of the two sets that are usually defined. We discuss this issue in a little more
detail when we introduce OSTBCs formally in Section 3.1. Certain alphabets like, e. g.,
8-PSK (phase shift keying) cause correlation among pairs of the real-valued symbols.
Although it is not impossible to handle correlation among data symbols theoretically,
the assumption of uncorrelated data symbols eases the overall analysis. For this reason,
we continue to consider alphabets with uncorrelated real and imaginary part only. The
restriction is only minor and does not affect our overall conclusions. Hence, we suggest
to choose amplitude shift keying (ASK) alphabets for the real-valued data symbols.
Then, if we had the desire to establish a relationship to equivalent complex-valued
data symbols, we would get symbols from a regular quadrature amplitude modulation
(QAM) scheme, but this is not really necessary.
Despite the fact that the data symbols are usually real-valued, we refer to those
model equations that use complex-valued channel matrices, spreading matrices, or
dispersion matrices as complex-valued notation. Real-valued notation requires the
⁴ The authors in [51] do briefly mention in a footnote that this is not necessary, though.
entire model to be real-valued. In Appendix 2.A1, we give another simple example to
point out some more differences between the two notations. It may help to understand
further why we favor real-valued notation at times.
For the sake of completeness, we want to mention that other concepts exist in liter-
ature to cover those mappings that require real-valued data symbols in our considera-
tions. For example, some authors define different effective channel impulse responses
for each time slot in order to cover the complex conjugate operation, see e. g. [81].
By doing so, a spreading model with independent complex-valued data symbols may
be applied for the representation of, e. g., Alamouti’s OSTBC. The resulting channel
matrix Hℓ is still block diagonal, but with different blocks on the diagonal. Unfortu-
nately, this representation lacks intuition because the channel seems to be time variant
although it is actually not. Also, certain theoretical analysis is not as straightforward
as it is with our model. Furthermore, yet another strategy that is often found in litera-
ture is to define two sets of dispersion matrices – one set for the complex-valued data
symbols and another one for the complex conjugate of the data symbols. In terms of
spreading matrices, it would mean that the data vector contains the symbols as well
as their complex conjugate counterparts, which causes correlation among the data
symbols – something we do not desire because it also complicates the analysis. This
concept is often referred to as being widely linear [39, 115] because it applies linear
filtering for the signal and its complex conjugate counterpart [12], but appears to be
non-linear with respect to the overall signal at first glance.
2.5 Maximum Likelihood Detection
At the receiver, a maximum likelihood (ML) detector returns the constellation point
which was most likely transmitted. It performs best or, at least, equally well in terms
of symbol error rate (SER) compared with all other potential receiving methods, as-
suming that all constellation points are equally likely to be transmitted. With the
right mapping between symbols and bits, the same statement applies for the bit error
rate (BER) as well. Unfortunately, its large number of operations often renders such
a detector infeasible for implementation in real applications for complexity reasons.
Nevertheless, progress in the development of the sphere detector [2, 33, 109] and
higher computational power of today’s personal computers enable us to achieve ML
simulation results for many STBCs with moderate constellation size. Moreover, there
exist space time codes that allow for ML detection with reduced complexity like, e. g.,
OSTBCs, see Chapter 3. Besides, other detection techniques exist that often closely
approach ML performance at much lower complexity. Therefore, ML BER results de-
note reasonable performance bounds. Furthermore, the ML decision metric gives a lot
of insight into the structural properties of STBCs, in general. It provides us with ideas
for the construction of new space time constellations and helps us to analyze existing
ones later on in this work.
2.5.1 ML Metric – Decision Criterion
To identify the most likely space time codeword at the receiver, an obvious way – and
sometimes the only way – is to compare the received signal with all potential constellation
points. Among these, the constellation point that is most similar to the received
signal point gives the decision result. What we need is an appropriate measure for similarity.
As we assume independent Gaussian distributed noise components with equal
variance in all dimensions, this measure is the (squared) Euclidean distance between
the received signal and the potential noise free received signal candidates, which we
can easily compute because we assume the channel impulse response to be perfectly
known at the receiver. Mathematically, the decision rule may be expressed as
\[
\hat{\mathbf{X}} = \arg\min_{\breve{\mathbf{X}} \in \mathcal{X}} \left\| \mathbf{Y} - \mathbf{H}\breve{\mathbf{X}} \right\|^2 . \tag{2.38}
\]
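Here, $\breve{\mathbf{X}}$ is the test candidate chosen from the space time constellation $\mathcal{X}$. The same
criterion, simply in the context of the spreading model, is
\[
\hat{\mathbf{s}} = \arg\min_{\breve{\mathbf{s}} \in \mathcal{S}} \left\| \mathbf{y} - \mathbf{H}_\ell \breve{\mathbf{s}} \right\|^2 \tag{2.39}
\]
where $\mathcal{S}$ and $\breve{\mathbf{s}}$ are defined according to $\mathcal{X}$ and $\breve{\mathbf{X}}$, respectively. Note that the hat
always indicates that the decision has been taken on the corresponding variable and
the breve identifies test candidates.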
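Rather than the transmit vector $\mathbf{s}$ and the transmit matrix $\mathbf{X}$ themselves, we are
usually interested in the decisions on the information carrying data symbols $x_1, \dots, x_n$
that $\mathbf{s}$ or $\mathbf{X}$ are constructed from. The definition of the corresponding decision rules
as a function of $x_1, \dots, x_n$ is straightforward and becomes
\[
\hat{\mathbf{x}} = \arg\min_{\breve{\mathbf{x}} \in \mathcal{A}^n} \left\| \mathbf{y} - \mathbf{H}_\ell \mathbf{S} \breve{\mathbf{x}} \right\|^2 \tag{2.40}
\]
for the spreading model and
\[
\left( \hat{x}_1, \dots, \hat{x}_n \right) = \arg\min_{\breve{x}_1, \dots, \breve{x}_n \in \mathcal{A}} \left\| \mathbf{Y} - \mathbf{H} \sum_{i=1}^{n} \mathbf{C}_i \breve{x}_i \right\|^2 \tag{2.41}
\]
for the dispersion model.5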
Altogether, we have $|\mathcal{A}|^n$ different data symbol vectors where $|\mathcal{A}|$ denotes the cardinality
of the alphabet $\mathcal{A}$. This number increases exponentially with the total number
of data symbols. So, the number of tests that have to be carried out becomes infeasible
for $n$ larger than a certain limit, unless the structural properties of the space time
constellations allow us to get by with fewer than $|\mathcal{A}|^n$ comparisons.
5In (2.40), according to the definition of $\mathbf{x}$, $\mathcal{A}^n$ is the set of all $n$-dimensional vectors with elements from $\mathcal{A}$.
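The exhaustive search behind (2.40) can be sketched in a few lines of Python with NumPy. This is only a minimal illustration under assumptions that are not part of the text: the function name `ml_detect`, the random real-valued matrix `H_eff` standing in for the product $\mathbf{H}_\ell \mathbf{S}$, and the binary alphabet $\{-1, +1\}$ are hypothetical choices.

```python
import itertools
import numpy as np

def ml_detect(y, H_eff, alphabet):
    """Exhaustive ML search: candidate minimizing ||y - H_eff x||^2, cf. (2.40)."""
    n = H_eff.shape[1]
    best_x, best_metric = None, np.inf
    # Enumerate all |A|^n candidate data vectors.
    for cand in itertools.product(alphabet, repeat=n):
        x = np.array(cand, dtype=float)
        metric = np.linalg.norm(y - H_eff @ x) ** 2
        if metric < best_metric:
            best_x, best_metric = x, metric
    return best_x, best_metric

rng = np.random.default_rng(0)
n = 4
alphabet = (-1.0, 1.0)                  # hypothetical real-valued alphabet A
H_eff = rng.standard_normal((6, n))     # hypothetical stand-in for H_l S
x_true = rng.choice(alphabet, size=n)
y = H_eff @ x_true                      # noise-free observation
x_hat, metric = ml_detect(y, H_eff, alphabet)
num_candidates = len(alphabet) ** n     # |A|^n tests are carried out
```

Without noise, the search returns the transmitted vector with a metric of zero. The loop length `num_candidates` grows as $|\mathcal{A}|^n$, which is the exponential growth discussed above.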
2 Models and Design Criteria for Space Time Block Coding Techniques
2.5.2 Structural Properties
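The squared Euclidean norm in (2.39) may be expressed as a scalar product. Expanding
the resulting expression, we come up with
\begin{align}
\hat{\mathbf{x}} &= \arg\min_{\breve{\mathbf{x}} \in \mathcal{A}^n} \left( \left( \mathbf{y} - \mathbf{H}_\ell \mathbf{S} \breve{\mathbf{x}} \right)^H \left( \mathbf{y} - \mathbf{H}_\ell \mathbf{S} \breve{\mathbf{x}} \right) \right) \tag{2.42} \\
&= \arg\min_{\breve{\mathbf{x}} \in \mathcal{A}^n} \left( \mathbf{y}^H \mathbf{y} - \left( \tilde{\mathbf{x}}^H \breve{\mathbf{x}} + \breve{\mathbf{x}}^T \tilde{\mathbf{x}} \right) + \breve{\mathbf{x}}^T \mathbf{R}_s \breve{\mathbf{x}} \right) \tag{2.43} \\
&= \arg\min_{\breve{\mathbf{x}} \in \mathcal{A}^n} \left( \mathbf{y}^H \mathbf{y} - \sum_{i=1}^{n} [\breve{\mathbf{x}}]_i \left( [\tilde{\mathbf{x}}]_i + [\tilde{\mathbf{x}}]_i^* \right) + \sum_{i=1}^{n} \sum_{j=1}^{n} [\mathbf{R}_s]_{ij} [\breve{\mathbf{x}}]_i [\breve{\mathbf{x}}]_j \right) . \tag{2.44}
\end{align}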
In (2.44), the first term is independent of $\breve{\mathbf{x}}$ and thus common to all distances. Hence,
it is irrelevant for the minimization. Also, the matched filter output $\tilde{\mathbf{x}}$ does not depend
on the test candidate. That means, the second term is linear with respect to $\breve{x}_1, \dots, \breve{x}_n$.
However, the third one causes addends that are quadratic with respect to $\breve{x}_1, \dots, \breve{x}_n$.
Particularly, the addends with $i \neq j$ introduce dependencies between different data
symbols, which is the reason why the decisions have to be taken jointly for all symbols
$x_1, \dots, x_n$. This is also why we usually have to carry out $|\mathcal{A}|^n$ comparisons for the
decision on a single data vector, a number that increases exponentially with the number of
symbols that are transmitted jointly. Conversely, the complexity is significantly
reduced to only $n|\mathcal{A}|$ comparisons if $\mathbf{R}_s$ has purely imaginary off-diagonal elements
since, in this case, we may minimize the metric individually for each $\breve{x}_i$. Note that the
quadratic terms containing two different data symbols cancel if $[\mathbf{R}_s]_{ij} = [\mathbf{R}_s]_{ji}^*$, which
holds for any Hermitian $\mathbf{R}_s$: each pair of symbols then contributes
$2\,\mathrm{Re}\{[\mathbf{R}_s]_{ij}\}\, \breve{x}_i \breve{x}_j$, which vanishes for purely imaginary $[\mathbf{R}_s]_{ij}$. With
real-valued notation, it is clear that $\mathbf{R}_s$ has to be diagonal to allow for symbolwise
detection.6
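The decoupling argument can be checked numerically. The following sketch is an illustration only: it does not derive $\mathbf{R}_s$ from a channel and a spreading matrix, but assumes a hypothetical Hermitian matrix `R_s` with a real diagonal and purely imaginary off-diagonal elements, a hypothetical matched filter output `x_tilde`, and a hypothetical four-level alphabet. It then compares the joint search over $|\mathcal{A}|^n$ candidates with $n$ separate searches over $|\mathcal{A}|$ candidates, both using the candidate-dependent terms of (2.44).

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 4
alphabet = (-3.0, -1.0, 1.0, 3.0)       # hypothetical real-valued alphabet A

# Hypothetical Hermitian R_s: real diagonal, purely imaginary off-diagonals.
B = rng.standard_normal((n, n))
R_s = np.diag(rng.uniform(1.0, 2.0, n)) + 1j * (B - B.T)
# Hypothetical matched filter output (complex-valued).
x_tilde = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def metric(x):
    # Candidate-dependent terms of (2.44); y^H y is dropped.
    return (-(x @ (x_tilde + x_tilde.conj())) + x @ R_s @ x).real

# Joint decision: |A|^n comparisons.
joint = min(itertools.product(alphabet, repeat=n),
            key=lambda c: metric(np.array(c)))

# Symbolwise decision: n|A| comparisons, one scalar metric per symbol.
symbolwise = tuple(
    min(alphabet,
        key=lambda a, i=i: -2.0 * x_tilde[i].real * a + R_s[i, i].real * a * a)
    for i in range(n)
)
```

Both searches select the same data vector here, because the imaginary off-diagonal elements do not contribute to the quadratic form for real-valued candidates.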
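For academic reasons, especially for the design of new space time constellations, it
is of interest to split up $\mathbf{y}$ into its components and insert these into the metric
\begin{align}
\hat{\mathbf{x}} &= \arg\min_{\breve{\mathbf{x}} \in \mathcal{A}^n} \left\| \mathbf{H}_\ell \mathbf{S} \left( \mathbf{x} - \breve{\mathbf{x}} \right) + \mathbf{n} \right\|^2 \tag{2.45} \\
&= \arg\min_{\breve{\mathbf{x}} \in \mathcal{A}^n} \left( \Delta\mathbf{x}^T \mathbf{R}_s \Delta\mathbf{x} + \Delta\mathbf{x}^T \mathbf{S}^H \mathbf{H}_\ell^H \mathbf{n} + \mathbf{n}^H \mathbf{H}_\ell \mathbf{S} \Delta\mathbf{x} + \mathbf{n}^H \mathbf{n} \right) \tag{2.46}
\end{align}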
where we substituted $\mathbf{x} - \breve{\mathbf{x}} = \Delta\mathbf{x}$ to simplify our notation. Without noise at the
receiver, what remains in the parentheses in (2.46) is only the first term. This term
denotes the squared Euclidean distance between the transmitted data vector $\mathbf{x}$ and the data
vector $\breve{\mathbf{x}}$ chosen at the receiver as a test candidate. It is obvious that we are interested
in making this distance term as large as possible whenever $\Delta\mathbf{x}$ is not the all-zero
vector, i. e., $\mathbf{x}$ and $\breve{\mathbf{x}}$ are not identical. Doing so, the decision becomes less vulnerable
with respect to the noise term. Therefore, the ultimate goal is to construct space
time constellations whose distance profile is optimized for the receiver. As we do not
assume knowledge about the actual channel impulse response at the transmitter, the
optimization has to be carried out for certain statistical assumptions or a deterministic
but unknown realization. Some hints on what the structure of the problem looks like
are already given by simply rewriting the distance measure as
\[
\Delta\mathbf{x}^T \mathbf{R}_s \Delta\mathbf{x} = \sum_{i=1}^{n} [\mathbf{R}_s]_{ii} \, \Delta x_i^2 + \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} [\mathbf{R}_s]_{ij} \, \Delta x_i \Delta x_j . \tag{2.47}
\]
It is a strictly non-negative quadratic form as $\mathbf{R}_s$ is a Hermitian positive semidefinite
matrix. Clearly, this is a necessity also because the quadratic form is a distance measure,
which has to be larger than or equal to zero, by definition. In particular, the
left hand sum consists of non-negative elements whose individual contributions are
essential for reliability reasons. A formal analysis is to follow in Section 2.7.
6If complex symbols were allowed, $\mathbf{R}_s$ would have to be diagonal for decoupled ML decisions, too.
2.6 Capacity and Rate
Since capacity considerations have been one of the major driving forces for the devel-
opment of MIMO systems, it is quite natural that plenty of related questions have been
posed, which again have attracted a lot of attention in literature ever since, see [45]
and the references therein. Depending on certain constraints like, e. g., channel knowl-
edge and time variance, different expressions have been derived for characterizing the
capacity of a MIMO channel. In many introductory papers on MIMO systems like,
e. g., [40], the use of MIMO techniques is motivated with the capacity expression for
scenarios with fixed flat fading channel impulse responses where the transmitter does
not have channel state information. This capacity,7 which applies for the type of sce-
narios that we consider, is known to be
\[
C = \log_2 \left( \det \left( \mathbf{I}_{n_r} + \frac{\sigma_s^2}{\sigma_n^2} \mathbf{H}\mathbf{H}^H \right) \right) . \tag{2.48}
\]
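Here, $\sigma_s^2$ is the average power of the transmit symbols per real dimension. Applying
some well known properties of matrices and determinants, we may easily verify that
the following equalities hold:
\[
\det \left( \mathbf{I}_{n_r} + \frac{\sigma_s^2}{\sigma_n^2} \mathbf{H}\mathbf{H}^H \right) = \prod_{i=1}^{n_\lambda} \left( 1 + \frac{\sigma_s^2}{\sigma_n^2} \lambda_i \right) = \det \left( \mathbf{I}_{n_t} + \frac{\sigma_s^2}{\sigma_n^2} \mathbf{H}^H\mathbf{H} \right) \tag{2.49}
\]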
where $\lambda_i$ with $i \in \{1, \dots, n_\lambda\}$ are the non-zero eigenvalues of $\mathbf{H}\mathbf{H}^H$ or, equivalently,
of $\mathbf{H}^H\mathbf{H}$. It immediately follows that the capacity expression is directly linked with
the transmission matrix $\mathbf{R}$ and its eigenvalues:
\begin{align}
C &= \log_2 \left( \det \left( \mathbf{I}_{n_t} + \frac{\sigma_s^2}{\sigma_n^2} \mathbf{R} \right) \right) \tag{2.50} \\
&= \sum_{i=1}^{n_\lambda} \log_2 \left( 1 + \frac{\sigma_s^2}{\sigma_n^2} \lambda_i \right) . \tag{2.51}
\end{align}
Equation (2.51) scales linearly with the number of non-zero eigenvalues. In other
words, each additional non-zero eigenvalue, which we refer to as an eigenmode of the
MIMO channel, provides a new degree of freedom that may be used for the transmission
of an additional data stream, superimposed with the already existing ones. In total,
the MIMO channel provides $n_\lambda$ degrees of freedom where, expressed in mathematical
terms, $n_\lambda$ is the rank of $\mathbf{R}$. Therefore, the MIMO channel supports a maximum of $n_\lambda$
parallel data streams that can be transmitted independently. An STBC that makes use
of the additional degrees of freedom by transmitting multiple data symbols per time
slot is said to exploit the multiplexing gain. To formalize the term multiplexing gain,
we define the transmission rate of an STBC as
\[
r = \frac{n}{2\ell} . \tag{2.52}
\]
This is the average number of complex data symbols transmitted per time slot. Commonly,
the order of the multiplexing gain is defined by the rate of the scheme. In
the following, we refer to an STBC as having full rate or maximum multiplexing gain if
$r = n_\lambda$.8 In most cases, this means that we choose $r = n_t$ if we want to achieve full
rate. Note that the factor 1/2 in the rate definition is due to the fact that we refer to $n$
as being the number of real-valued data symbols.
7In the strict information theoretic sense [19], this is not a capacity because a maximization over the input constellations has not been carried out. However, unavailable knowledge about the channel impulse response at the transmitter prevents such a maximization. For this reason, it is common practice to refer to these expressions as a capacity, since it is the best we can do considering the constraints of the model.
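The equalities in (2.48) to (2.51) and the rate definition (2.52) can be checked numerically. The sketch below is an illustration under assumed values: the antenna numbers, the ratio `snr` standing for $\sigma_s^2/\sigma_n^2$, and the random channel realization are hypothetical, and `logdet2` is a helper name introduced here. The rate is evaluated for the numbers quoted for Alamouti's scheme in Example 2.6 (four real-valued symbols over two time slots).

```python
import numpy as np

rng = np.random.default_rng(2)
n_r, n_t = 2, 4                          # hypothetical antenna numbers
snr = 10.0                               # hypothetical sigma_s^2 / sigma_n^2
H = (rng.standard_normal((n_r, n_t))
     + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)

def logdet2(M):
    """log2 of |det(M)|, computed in a numerically stable way."""
    _, logabsdet = np.linalg.slogdet(M)
    return logabsdet / np.log(2)

C_rx = logdet2(np.eye(n_r) + snr * H @ H.conj().T)    # (2.48)
R = H.conj().T @ H                                     # transmission matrix
C_tx = logdet2(np.eye(n_t) + snr * R)                  # (2.50)

eigvals = np.linalg.eigvalsh(R)
nonzero = eigvals[eigvals > 1e-10]                     # non-zero eigenvalues
C_eig = np.sum(np.log2(1.0 + snr * nonzero))           # (2.51)
n_lambda = nonzero.size                                # number of eigenmodes

# Rate (2.52) with n real-valued symbols spread over ell time slots.
n, ell = 4, 2
r = n / (2 * ell)
```

All three capacity values agree, and `n_lambda` equals the rank of `R`, which is limited by the smaller of the two antenna numbers.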
Remarks. a) As we do not have channel knowledge at the transmitter, it is not pos-
sible to address each of the eigenmodes separately. The transmission of the parallel
data streams is done jointly across all eigenmodes. That means, it is the task of the
receiver to isolate the data streams from each other. With perfect channel knowledge
at the transmitter, this separation could have been done at the transmitter already. Be-
sides, perfect channel knowledge would allow us to apply water filling techniques [19,
page 349] where the power of each data stream is adjusted according to the state of the
eigenmode. This results in a slightly higher capacity, but does not have any influence
on the actual achievable multiplexing gain. In our case, we transmit all data streams
with equal power – a strategy that is shown to be optimal if the transmitter does not
have channel knowledge in a Rayleigh fading environment [107]. b) Furthermore, we
want to point out that we find (2.50) to be more convenient compared to (2.48) be-
cause it contains R explicitly and allows us to draw some important conclusions on the
structure of the spreading matrices and dispersion matrices. It is also noteworthy that
the close tie between the transmission matrix R and the capacity expression is one of
the advantages of the spreading model that is to be discussed in the next subsection.
8Contrary to our definition, r = 1 is sometimes referred to as being full rate in literature. It matches our definition only if either the transmitter or the receiver has just a single antenna.
2.6.1 Spreading Matrices and Capacity
Since we usually use real-valued data symbols in our considerations, it is more natural
to give the capacity as a function of real-valued matrices, i. e., we apply real-valued
notation:
\[
C = \frac{1}{2} \log_2 \left( \det \left( \mathbf{I}_{2n_t} + \frac{\sigma_s^2}{\sigma_n^2} \mathbf{R} \right) \right) . \tag{2.53}
\]
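Both expressions, (2.50) and (2.53), return the same value. The factor 1/2 is due
to the fact that the real-valued $\mathbf{R}$ in (2.53) simply has double dimensionality, with each
eigenvalue occurring twice, compared to the complex-valued $\mathbf{R}$ in (2.50),9 i. e., the
determinant of the real-valued representation of a matrix $\mathbf{A}$ equals $|\det(\mathbf{A})|^2$. Using
the same ideas, we can easily incorporate several time slots:
\[
C = \frac{1}{2\ell} \log_2 \left( \det \left( \mathbf{I}_{2n_t\ell} + \frac{\sigma_s^2}{\sigma_n^2} \mathbf{R}_\ell \right) \right) . \tag{2.54}
\]
Here, the factor $1/\ell$ guarantees that the expression stays normalized with respect to a
single time slot.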
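The relation between (2.50) and (2.53) can be illustrated numerically. The sketch assumes the common real-valued representation that stacks real and imaginary parts as [[Re, -Im], [Im, Re]]; the definition of the real-valued notation used in this work is given elsewhere and may order the blocks differently, so `real_rep` is a hypothetical helper. The channel realization and `snr` are hypothetical as well.

```python
import numpy as np

def real_rep(A):
    """Assumed real-valued representation of a complex matrix A."""
    return np.block([[A.real, -A.imag], [A.imag, A.real]])

rng = np.random.default_rng(3)
n_t = 3
snr = 5.0                                # hypothetical sigma_s^2 / sigma_n^2
H = rng.standard_normal((2, n_t)) + 1j * rng.standard_normal((2, n_t))
R_c = H.conj().T @ H                     # complex-valued, n_t x n_t
R_r = real_rep(R_c)                      # real-valued, 2n_t x 2n_t

# Capacity with complex-valued notation (2.50) and real-valued notation (2.53).
C_complex = np.log2(np.linalg.det(np.eye(n_t) + snr * R_c).real)
C_real = 0.5 * np.log2(np.linalg.det(np.eye(2 * n_t) + snr * R_r))

# Each eigenvalue of R_c occurs twice in R_r.
eig_c = np.linalg.eigvalsh(R_c)
eig_r = np.linalg.eigvalsh(R_r)

# Determinant identity for an arbitrary complex matrix A.
A = rng.standard_normal((n_t, n_t)) + 1j * rng.standard_normal((n_t, n_t))
det_real = np.linalg.det(real_rep(A))
det_abs_sq = abs(np.linalg.det(A)) ** 2
```

The doubled eigenvalues double the logarithm of the determinant, which the factor 1/2 in (2.53) compensates.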
A simple reasoning gives us some hints on how to choose spreading matrices that
are optimal in terms of capacity. To begin with, let us interpret $\mathbf{H}_\ell \mathbf{S}$ as an effective
channel impulse response. It is straightforward to show that the mutual information
(per time slot) is then upper bounded by
\[
C = \frac{1}{2\ell} \log_2 \left( \det \left( \mathbf{I}_{2n_t\ell} + \frac{\sigma_x^2}{\sigma_n^2} \mathbf{S}^T \mathbf{R}_\ell \mathbf{S} \right) \right) \tag{2.55}
\]
where $\sigma_x^2$ is the average power of the real-valued data symbols. Optimum spreading
matrices have to preserve the value of the original capacity expression in (2.54). In
other words, the mutual information between the transmit signal vector and the signal
vector at the receiver should not be degraded due to spreading. Comparing (2.55)
with (2.54), we have to consider two aspects, namely, the final value of $C$ and the
variance of the signal that is actually transmitted from the antennas. Matrix theory
tells us [11] that orthonormal spreading matrices $\mathbf{S}$ do not affect the eigenvalues of
the original transmission matrix $\mathbf{R}_\ell$. As a result, the value of $C$ stays unchanged
(assuming $\sigma_s^2 = \sigma_x^2$ right now). This immediately becomes clear from
\[
C = \frac{1}{2\ell} \log_2 \left( \det \left( \mathbf{S}^T \right) \det \left( \mathbf{I}_{2n_t\ell} + \frac{\sigma_x^2}{\sigma_n^2} \mathbf{R}_\ell \right) \det \left( \mathbf{S} \right) \right) \tag{2.56}
\]
since the determinant of orthonormal matrices is one or minus one and (2.56) is a valid
representation of (2.55) if $\mathbf{S}$ is orthonormal. To come up with (2.56), we applied some
basic properties of determinants, including that $\det(\mathbf{A}\mathbf{B}) = \det(\mathbf{A}) \det(\mathbf{B})$. Finally,
we have to justify the assumption that $\sigma_s^2 = \sigma_x^2$. Having defined an orthonormal
spreading matrix $\mathbf{S}$ and fixed $\sigma_x^2$, it is easily verified that the variance of the transmit
symbol $\sigma_s^2$ equals the variance of the data symbols $\sigma_x^2$, namely
\[
\sigma_s^2 \mathbf{I} = E\left( \mathbf{s}\mathbf{s}^T \right) = E\left( \mathbf{S}\mathbf{x}\mathbf{x}^T\mathbf{S}^T \right) = \mathbf{S} \, E\left( \mathbf{x}\mathbf{x}^T \right) \mathbf{S}^T = \sigma_x^2 \mathbf{I} . \tag{2.57}
\]
Looking at the entire constellation, orthonormal spreading matrices preserve the norm
of the data vectors and their pairwise distances. Geometrically, the application of such
spreading matrices corresponds to a rotation of the data vector in the signal space. We
remark that similar considerations have also been carried out in [52].
9Information theoretic books often have the factor 1/2 because they consider real-valued quantities, e. g. [19].
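The properties of square orthonormal spreading matrices stated above can be verified with a small numerical experiment. The sketch is an illustration under assumptions: `R` is a random symmetric positive semidefinite stand-in for $\mathbf{R}_\ell$, `S` is a random orthonormal matrix obtained from a QR decomposition, and the dimension `m`, the ratio `snr`, and the variance `sigma_x2` are hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 4                                    # hypothetical dimension 2 n_t ell
snr = 3.0                                # hypothetical sigma_x^2 / sigma_n^2
sigma_x2 = 0.5                           # hypothetical data symbol variance

G = rng.standard_normal((m, m))
R = G.T @ G                              # stand-in for R_l (symmetric, PSD)
S, _ = np.linalg.qr(rng.standard_normal((m, m)))   # square orthonormal matrix

# Eigenvalues of the transmission matrix with and without spreading.
eig_before = np.linalg.eigvalsh(R)
eig_after = np.linalg.eigvalsh(S.T @ R @ S)

# Log-determinants as in (2.54) and (2.55), up to the common factor 1/(2 ell).
C_before = np.log2(np.linalg.det(np.eye(m) + snr * R))
C_after = np.log2(np.linalg.det(np.eye(m) + snr * S.T @ R @ S))

# Covariance of s = S x for uncorrelated data symbols, cf. (2.57).
cov_s = S @ (sigma_x2 * np.eye(m)) @ S.T

# Pairwise distance between two data vectors before and after spreading.
x1, x2 = rng.standard_normal(m), rng.standard_normal(m)
dist_data = np.linalg.norm(x1 - x2)
dist_tx = np.linalg.norm(S @ x1 - S @ x2)
```

The eigenvalues, the log-determinant, the covariance, and the pairwise distance are all unchanged by `S`, in line with the interpretation as a rotation (possibly combined with a reflection, since the determinant may be minus one).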
Generalizing the results from above, it is easily shown that any real-valued spreading
matrix with orthonormal rows fulfills the necessary conditions required to preserve
mutual information. This means that spreading matrices may be rectangular,
in general, with more columns than rows. Despite the fact that more data symbols
are transmitted in parallel, the order of the multiplexing gain is not affected by any
spreading matrix that complies with these conditions. Contrary to other spreading
matrices, we want to stress explicitly that spreading matrices with orthonormal rows
do not affect the mutual information, no matter what $\mathbf{R}$ is. Hence, we refer to space
time constellations whose spreading matrices have orthonormal rows as mutual information
preserving constellations in the following. The construction of such spreading
matrices will be an important design criterion in Chapter 4.
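For the rectangular case, the claim follows from the determinant identity $\det(\mathbf{I} + \mathbf{A}\mathbf{B}) = \det(\mathbf{I} + \mathbf{B}\mathbf{A})$ together with $\mathbf{S}\mathbf{S}^T = \mathbf{I}$. A minimal numerical check, assuming a hypothetical $4 \times 6$ spreading matrix with orthonormal rows and a random stand-in for $\mathbf{R}_\ell$, could look as follows.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 4, 6                              # hypothetical: m rows, n > m columns
snr = 3.0                                # hypothetical sigma_x^2 / sigma_n^2

G = rng.standard_normal((m, m))
R = G.T @ G                              # stand-in for R_l (symmetric, PSD)

# Orthonormal columns from a reduced QR decomposition; transposing gives
# an m x n matrix with orthonormal rows.
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
S = Q.T

rows_orthonormal = np.allclose(S @ S.T, np.eye(m))
cols_orthonormal = np.allclose(S.T @ S, np.eye(n))   # cannot hold for n > m

# Log-determinant without spreading (m x m) and with spreading (n x n).
C_orig = np.log2(np.linalg.det(np.eye(m) + snr * R))
C_spread = np.log2(np.linalg.det(np.eye(n) + snr * S.T @ R @ S))
```

Both log-determinants coincide although six data symbols are spread onto four transmit dimensions, and this holds for any choice of `R`.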
Remarks. a) With complex-valued notation, mutual information preserving spreading
matrices require at least twice as many columns as rows. This is because these matrices
map real-valued data symbols to complex-valued transmit symbols. Only if we
allowed for complex-valued data symbols would the same reasoning from above apply
and unitary spreading matrices be optimal. b) Reconsidering (2.55) with a
fixed realization of $\mathbf{R}_\ell$, we can actually construct non-orthogonal spreading matrices
that do not affect the transmit power, but achieve higher mutual information. An improvement
is impossible only if the eigenvalues of $\mathbf{R}_\ell$ are all the same, i. e., $\mathbf{R}_\ell$ is a
scaled identity matrix. The optimized spreading matrix that maximizes the expression
in (2.55) turns out to be the water filling solution, which is known to achieve the capacity
when the transmitter has knowledge about the channel impulse response [45,81,110].
Such spreading matrices, however, work well only for particular channel realizations
and fail for many other ones. For this reason, channel knowledge at the transmitter
is essential for being able to exploit these additional resources. Without channel
knowledge at the transmitter, we need a fixed spreading matrix that works well for
the majority of channel realizations, because it is impossible to adjust it according
to the channel conditions. The results above show us that spreading matrices with
orthonormal rows fulfill this condition.
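Remark b) can be illustrated by comparing equal power allocation with water filling over the eigenmodes. The sketch below does not construct a spreading matrix; it only computes the power per eigenmode that an optimized spreading matrix would have to realize. The function `water_filling`, the eigenvalues, and the power budget are hypothetical, and all eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def water_filling(eigvals, total_power, noise_var=1.0):
    """Power per eigenmode maximizing sum_i log2(1 + p_i * lam_i / noise_var)."""
    lam = np.asarray(eigvals, dtype=float)
    order = np.argsort(lam)[::-1]            # strongest eigenmode first
    inv_gain = noise_var / lam[order]
    # Use the k strongest modes; reduce k until the weakest used mode
    # receives positive power.
    for k in range(len(lam), 0, -1):
        level = (total_power + inv_gain[:k].sum()) / k   # water level
        if level > inv_gain[k - 1]:
            break
    p_sorted = np.zeros_like(lam)
    p_sorted[:k] = level - inv_gain[:k]
    p = np.zeros_like(lam)
    p[order] = p_sorted                      # undo the sorting
    return p

lam = np.array([4.0, 1.0, 0.1])              # hypothetical unequal eigenvalues
total_power = 3.0                            # hypothetical power budget

p_equal = np.full(lam.size, total_power / lam.size)
p_wf = water_filling(lam, total_power)

C_equal = np.sum(np.log2(1.0 + p_equal * lam))
C_wf = np.sum(np.log2(1.0 + p_wf * lam))

# With identical eigenvalues, water filling reduces to equal power.
p_flat = water_filling([2.0, 2.0, 2.0], total_power)
```

For the unequal eigenvalues, water filling switches off the weakest eigenmode and yields a larger sum than equal power at the same total power. The allocation depends on the eigenvalues, which is why it requires channel knowledge at the transmitter.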
Example 2.6 (Mutual information provided by Alamouti’s OSTBC). Alamouti’s OSTBC
is known to preserve mutual information only for certain channel conditions [93].
Based on the observations we made in this section, we adopt a modified approach
to analyze Alamouti’s OSTBC with respect to mutual information. First of all, let us
consider the corresponding spreading matrix in (2.32). Its dimensionality already tells
us that this spreading matrix cannot be optimal in terms of mutual information for
general MIMO channels because the matrix has fewer columns than rows. Alamouti’s
OSTBC requires two transmit antennas, which means that $\mathbf{R}_2$ is an 8 × 8 matrix. To
fully cover the potential of multiplexing, it is necessary to transmit at least eight real-valued
data symbols in parallel, but Alamouti’s scheme only transmits four jointly.
Nevertheless, Alamouti’s OSTBC may still be optimal as long as the number of eigenmodes
of the channel is limited