ieee transactions on wireless communications, vol. 3,...

13
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 6, NOVEMBER 2004 2299 Iterative Decoding for MIMO Channels Via Modified Sphere Decoding H. Vikalo, B. Hassibi, and T. Kailath Abstract—In recent years, soft iterative decoding techniques have been shown to greatly improve the bit error rate performance of various communication systems. For multiantenna systems em- ploying space–time codes, however, it is not clear what is the best way to obtain the soft information required of the iterative scheme with low complexity. In this paper, we propose a modification of the Fincke–Pohst (sphere decoding) algorithm to estimate the maximum a posteriori probability of the received symbol se- quence. The new algorithm solves a nonlinear integer least squares problem and, over a wide range of rates and signal-to-noise ratios, has polynomial-time complexity. Performance of the algorithm, combined with convolutional, turbo, and low-density parity check codes, is demonstrated on several multiantenna channels. The results for systems that employ space–time modulation schemes seem to indicate that the best performing schemes are those that support the highest mutual information between the transmitted and received signals, rather than the best diversity gain. Index Terms—Expected complexity, iterative decoding, low-den- sity parity check (LDPC) codes, multiantenna systems, polyno- mial-time complexity, space–time codes, sphere decoding, turbo codes, wireless communications. I. INTRODUCTION R ECENTLY, the pursuit of high-speed wireless data ser- vices has generated a significant amount of activity in the communications research community. The physical limitations of the wireless medium present many challenges to the design of reliable communication systems. As shown in [1], multi- antenna wireless communication systems are capable of pro- viding data transmission at potentially very high rates. In mul- tiantenna systems, space–time [2], [3] (along with traditional error-correcting) codes are often employed at the transmitter to induce diversity. Furthermore, to secure high reliability of the data transmission, special attention has to be payed to the re- ceiver design. However, good decoding schemes may result in high complexity of the receiver. A low-complexity detection scheme for multiantenna systems in a fading environment has been proposed in [4]. This detection scheme (so-called “nulling and cancelling”), depending on the adopted criterion, essentially performs zero-forcing or Manuscript received February 21, 2003; revised July 21, 2003; accepted Au- gust 26, 2003. The editor coordinating the review of this paper and approving it for publication is M. Shafi. This work was supported in part by the National Science Foundation under Grant CCR-0133818, by the Office of Naval Re- search under Grant N00014-02-1-0578, and by the Lee Center for Advanced Networking, California Institute of Technology. H. Vikalo and B. Hassibi are with the California Institute of Technology, Pasadena, CA 91125 USA (e-mail: [email protected]; hassibi@sys- tems.caltech.edu). T. Kailath is with the Information Systems Laboratory, Stanford University, Stanford, CA 94309 USA. Digital Object Identifier 10.1109/TWC.2004.837271 minimum-mean-square-error decision feedback equalization on block transmissions. In [5], a technique referred to as “sphere decoding” (based on the Fincke–Pohst algorithm [6]) was proposed for lattice code decoding and further adapted for space–time codes in [7]. The sphere decoder provides the maximum likelihood (ML) estimate of the transmitted signal sequence and so often significantly outperforms heuristic nulling and cancelling. Moreover, it was generally believed that sphere decoding requires much greater computational complexity than the cubic-time nulling and cancelling techniques. However, in [8], an analytic expression for the expected complexity of the sphere decoding has been obtained where it is shown that, over a wide range of rates and signal-to-noise ratios (SNRs), the expected complexity is polynomial-time (often subcubic). This implies that in many cases of interest ML performance can be obtained with complexity similar to nulling and cancelling. Another area of intense research activity is that of soft iter- ative decoding. Such techniques have been reported to achieve impressive results for codes with long codeword length. Fol- lowing the seminal paper by Berrou et al. [9], there have been many results on turbo decoding of concatenated codes, with performances approaching the Shannon limit on single-input single-output systems (see [10] and references therein). More recently, low-density parity check (LDPC) codes, long ne- glected since their introduction by Gallager [11], have also been resurrected (see, e.g., [12] and [13]). Crucial to both turbo and LDPC decoding techniques is the use of the probabilistic (“soft”) information about each bit in the transmitted sequence. For multiantenna systems employing space–time codes it is not clear what is the best way to obtain this soft -information with low complexity. As noted in [14], where turbo-coded modulation for multiantenna systems has been studied, if soft information is obtained by means of an exhaustive search, the computational complexity grows exponentially in the number of transmit antennas and in the size of the constellation. Hence, for high rate systems with a large number of antennas, the exhaustive search proves to be practically infeasible. Therefore, heuristics are often employed to obtain soft channel information [14]. [Also, see [15] and the references therein for a related work in the context of multiuser detection.] In [16] and [17], two variations of the sphere decoding algorithm have been proposed for estimating the soft information. In [17], sphere decoding has been employed to obtain a list of bit sequences that are “good” in a likelihood sense. This list is then used to generate soft information, which is subsequently updated by iterative channel decoder decisions. 1536-1276/04$20.00 © 2004 IEEE

Upload: others

Post on 21-Mar-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 6, NOVEMBER 2004 2299

Iterative Decoding for MIMO ChannelsVia Modified Sphere Decoding

H. Vikalo, B. Hassibi, and T. Kailath

Abstract—In recent years, soft iterative decoding techniqueshave been shown to greatly improve the bit error rate performanceof various communication systems. For multiantenna systems em-ploying space–time codes, however, it is not clear what is the bestway to obtain the soft information required of the iterative schemewith low complexity. In this paper, we propose a modificationof the Fincke–Pohst (sphere decoding) algorithm to estimatethe maximum a posteriori probability of the received symbol se-quence. The new algorithm solves a nonlinear integer least squaresproblem and, over a wide range of rates and signal-to-noise ratios,has polynomial-time complexity. Performance of the algorithm,combined with convolutional, turbo, and low-density parity checkcodes, is demonstrated on several multiantenna channels. Theresults for systems that employ space–time modulation schemesseem to indicate that the best performing schemes are those thatsupport the highest mutual information between the transmittedand received signals, rather than the best diversity gain.

Index Terms—Expected complexity, iterative decoding, low-den-sity parity check (LDPC) codes, multiantenna systems, polyno-mial-time complexity, space–time codes, sphere decoding, turbocodes, wireless communications.

I. INTRODUCTION

RECENTLY, the pursuit of high-speed wireless data ser-vices has generated a significant amount of activity in the

communications research community. The physical limitationsof the wireless medium present many challenges to the designof reliable communication systems. As shown in [1], multi-antenna wireless communication systems are capable of pro-viding data transmission at potentially very high rates. In mul-tiantenna systems, space–time [2], [3] (along with traditionalerror-correcting) codes are often employed at the transmitter toinduce diversity. Furthermore, to secure high reliability of thedata transmission, special attention has to be payed to the re-ceiver design. However, good decoding schemes may result inhigh complexity of the receiver.

A low-complexity detection scheme for multiantenna systemsin a fading environment has been proposed in [4]. This detectionscheme (so-called “nulling and cancelling”), depending onthe adopted criterion, essentially performs zero-forcing or

Manuscript received February 21, 2003; revised July 21, 2003; accepted Au-gust 26, 2003. The editor coordinating the review of this paper and approving itfor publication is M. Shafi. This work was supported in part by the NationalScience Foundation under Grant CCR-0133818, by the Office of Naval Re-search under Grant N00014-02-1-0578, and by the Lee Center for AdvancedNetworking, California Institute of Technology.

H. Vikalo and B. Hassibi are with the California Institute of Technology,Pasadena, CA 91125 USA (e-mail: [email protected]; [email protected]).

T. Kailath is with the Information Systems Laboratory, Stanford University,Stanford, CA 94309 USA.

Digital Object Identifier 10.1109/TWC.2004.837271

minimum-mean-square-error decision feedback equalizationon block transmissions. In [5], a technique referred to as“sphere decoding” (based on the Fincke–Pohst algorithm [6])was proposed for lattice code decoding and further adaptedfor space–time codes in [7]. The sphere decoder provides themaximum likelihood (ML) estimate of the transmitted signalsequence and so often significantly outperforms heuristic nullingand cancelling. Moreover, it was generally believed that spheredecoding requires much greater computational complexity thanthe cubic-time nulling and cancelling techniques. However,in [8], an analytic expression for the expected complexityof the sphere decoding has been obtained where it is shownthat, over a wide range of rates and signal-to-noise ratios(SNRs), the expected complexity is polynomial-time (oftensubcubic). This implies that in many cases of interest MLperformance can be obtained with complexity similar to nullingand cancelling.

Another area of intense research activity is that of soft iter-ative decoding. Such techniques have been reported to achieveimpressive results for codes with long codeword length. Fol-lowing the seminal paper by Berrou et al. [9], there have beenmany results on turbo decoding of concatenated codes, withperformances approaching the Shannon limit on single-inputsingle-output systems (see [10] and references therein). Morerecently, low-density parity check (LDPC) codes, long ne-glected since their introduction by Gallager [11], have alsobeen resurrected (see, e.g., [12] and [13]).

Crucial to both turbo and LDPC decoding techniques isthe use of the probabilistic (“soft”) information about eachbit in the transmitted sequence. For multiantenna systemsemploying space–time codes it is not clear what is the bestway to obtain this soft -information with low complexity. Asnoted in [14], where turbo-coded modulation for multiantennasystems has been studied, if soft information is obtained bymeans of an exhaustive search, the computational complexitygrows exponentially in the number of transmit antennas andin the size of the constellation. Hence, for high rate systemswith a large number of antennas, the exhaustive search provesto be practically infeasible. Therefore, heuristics are oftenemployed to obtain soft channel information [14]. [Also, see[15] and the references therein for a related work in thecontext of multiuser detection.] In [16] and [17], two variationsof the sphere decoding algorithm have been proposed forestimating the soft information. In [17], sphere decoding hasbeen employed to obtain a list of bit sequences that are “good”in a likelihood sense. This list is then used to generate softinformation, which is subsequently updated by iterative channeldecoder decisions.

1536-1276/04$20.00 © 2004 IEEE

Page 2: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

2300 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 6, NOVEMBER 2004

In this paper, we propose a multiple-input multiple-output(MIMO) detector, based on a modification of the originalFincke–Pohst algorithm, which efficiently obtains soft in-formation for the transmitted bit sequence. This modifiedFincke–Pohst algorithm essentially performs a maximum aposteriori (MAP) search and thus provides soft informationfor the channel decoder. The channel decoder’s output is thenfed back to the Fincke–Pohst MAP (FP-MAP) for the nextiteration. [Note that the channel decoder may be iterative aswell, as in the cases when the channel code is turbo or LDPC.]Our method differs from that of [17] in that the sphere decoderis modified (to allow for the introduction of soft informationfrom the iterative decoder), that the detector performs MAPsearch, and that FP-MAP is repeated for each iteration. Thefact that FP-MAP is repeated for each iteration does notnecessarily mean that the technique of [17] is computation-ally more efficient. In particular, due to the feedback fromthe channel decoder, each consecutive iteration of FP-MAPresults in a fewer number of points that are used to generatesoft information, whereas the size of the list in [17] is keptfixed. Our very preliminary simulation results show that listsphere decoding and FP-MAP have comparable performanceand complexity. Detailed quantitative analysis and compar-ison of the two techniques, and determining regimes whereone is superior to the other, is certainly worth examining.However, it goes beyond the scope of this paper and willnot be investigated here.

In addition to the schemes with traditional modulationtechniques, we study the multiantenna systems employingspace–time (S–T) codes. S–T coding is a modulation techniquethat imposes spatial and temporal correlation onto a transmittedsequence of modulated symbols. It has been developed tofully exploit the spatial diversity provided by a wireless link.There has been a tremendous amount of the research activitiesin the field (see, e.g., [2] and [3]). In this paper, we focuson linear-dispersive (LD) codes of [19] and employ them inmultiantenna systems where the transmitted data are encodedwith powerful channel codes. We show that the LD codes allowfor an efficient implementation of the FP-MAP algorithm, andillustrate the excellent performance of the proposed scheme viasimulations. The LD codes are designed to optimize the mutualinformation between the transmitted and received signals.Maximizing the mutual information is a necessary conditionto obtain the excellent performances promised by the powerfulchannel codes. We illustrate this by means of comparisonwith a space–time modulation scheme that does not optimizethe above-mentioned mutual information—in particular, anorthogonal design [3].

This paper is organized as follows. The channel model andproblem statement are in Section II. The Fincke–Pohst algo-rithm is described and the calculation of its expected complexityis outlined in Section III. In Section IV, we introduce the MAPmodification of the Fincke–Pohst algorithm and discuss its com-plexity. In Section V, we present performance simulations ofthe FP-MAP algorithm in multiantenna systems both with andwithout space–time (LD) coding. The channel codes that weconsider are convolutional, turbo, and LDPC codes. We con-clude this paper in Section VI.

II. SYSTEM MODEL

We assume a discrete-time block-fading multiantennachannel model, where the channel is known to the receiver.This is a reasonable assumption for communication systemswhere the signaling rate is much faster than the pace at whichthe propagation environment changes, so that the channel maybe learned, e.g., via transmitting known training sequences.

During any channel use the transmitted signaland received signal are related by

(1)

where is the known channel matrix andis the additive noise vector, both composed of indepen-

dent, identically distributed complex-Gaussian entries (0,1). If we assume that the entries of and have unit variance,then is the expected received SNR. The channel is used mul-tiple times to transmit a vector of data.

An iterative decoding scheme is shown in Fig. 1. The vectorof information bits is encoded with an error-correcting codeto obtain the vector of coded bits , which is then interleaved toresult in the vector . The vector is modulated onto a quadra-ture amplitude modulation (QAM)-constellation. Assume thateach constellation symbol represents modulated bits (e.g.,for a -QAM constellation, ). Then the modula-tion is performed by taking blocks of vector of lengthand mapping them (e.g., by means of a simple Gray mapping)into -dimensional symbol vectors. The resulting symbols aretransmitted across the channel as described by model (1). There-fore, a block of coded bits (corresponding to a singlesymbol vector) is transmitted per each channel use. Let us de-note these blocks of coded bits as . Assume,for simplicity, that the total length of the vector is .Then the entire vector can be blocked as

(2)

and transmitted in channel uses.Consider that the th channel use (i.e., the block has

been modulated onto symbol vector and transmitted acrossthe channel). On the receiver side, the received vector anda priori probabilities of the components of the symbol vector, , are processed by an MIMO de-

tector in order to obtain both the estimated bits in the currentblock and the reliability information about those decisions.Let us denote bits in the block by .The reliabilities of the decisions for the coded bits can be ex-pressed in the form of a log-likelihood ratio (LLR) as

(3)

[Note: we will represent logical 0 with amplitude level 1, andlogical 1 with amplitude level 1.] Let us denote the reliabilityinformation for the block by

Page 3: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

VIKALO et al.: ITERATIVE DECODING FOR MIMO CHANNELS 2301

Fig. 1. System model.

and let denote a vector of concatenated blocks of reliabilities

collected over all uses of the channel. Then is a vector ofLLRs corresponding to all the bits in the vector .

The vector is deinterleaved to obtain vector , which isthen used by a channel decoder to form the estimate of the in-formation bit vector , as well as to provide , the a poste-riori reliability information for the coded bits vector . A pos-teriori reliability information for the vector is obtained bydeinterleaving into . Let us denote the a posteriori reli-ability information for the block by . Furthermore, as-sume that the bits , in the block areindependent (which, for a long vector and an efficient inter-leaver is a valid approximation). Then the a posteriori prob-abilities for the components of the symbol vector (symbolvector corresponding to the block ) can easily be found from

using the modulator mapping function. These probabilitiescan now be used to run the MIMO

detector algorithm [i.e., evaluate (3)] once again. Hence, theMIMO detector is an iterative one, and we use the describedscheme for iterative joint detection and decoding in an MIMOsystem. [Note that for the first iteration of the MIMO detector,we assume that all symbols are equally likely.]

The structure of the channel decoder depends upon the choiceof the error-correcting code. For a simple convolutional code,the channel decoder is a simple soft-in soft-out decoder, suchas the Bahl, Cocke, Jelinek, and Raviv algorithm of [18]. Whenthe channel code is a powerful turbo code, then the channel de-coder is iterative itself [10]. If the channel code is an LDPC, thechannel decoder employs message passing algorithms of [11].

The computational complexity of traditional algorithmsfor evaluating (3) can be prohibitive for applications in mul-tiantenna systems. Since the sphere decoding algorithm ofFincke and Pohst can supply us with the ML estimate ofwith reasonable complexity, one may speculate whether amodification can be devised to yield soft information with low

complexity. To show that this can be done, and to describe howto efficiently approximate the LLRs in (3), it is instructive toreview the original Fincke–Pohst algorithm [6].

III. FINCKE-POHST ALGORITHM

We assume that the components of the transmitted symbolvector in (1) are from a complex-valued QAM constellation.To state the ML detection as an integer least squares problem,we first find the real-valued equivalent of the (1). To this end,let , and let , and denote real vectorsobtained from , and , respectively, as

Furthermore, let be given by

Then the real-valued equivalent of (1) is given by

The ML detector maximizes the likelihood that was receivedgiven that was sent

(4)

The search space is a finite subset of the (shifted) -di-mensional integer lattice , which reflects the fact that theunknown (complex) symbols are from a QAM constellation.Therefore, is an -PAM constellation

where is usually a power of two.

Page 4: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

2302 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 6, NOVEMBER 2004

Since is known and the noise is zero-mean unit-varianceGaussian, the conditional distribution of given is

Hence maximization (4) is equivalent to the optimizationproblem

(5)

Problem (5) is referred to as an integer least squares problemand is known to be NP-hard. Geometrically, it corresponds tothe search for the “closest” point in a skewed lattice to agiven -dimensional vector .

The basic idea of the FP algorithm is that rather than searchover the entire lattice, one should search only over lattice pointsin a hypersphere of radius around . Then the closest latticepoint inside the hypersphere is the solution to (5). To performthe search, however, one needs to 1) determine an appropriateradius and 2) find the lattice points inside the sphere.

The algorithm of Fincke and Pohst does not address thechoice of , but it does propose an efficient way of finding allthe points inside the hypersphere. In particular, the algorithmconstructs a tree, whose nodes at the th level correspondto the lattice points lying inside the sphere of radius anddimension . To find the lattice points inside a sphere ofradius and dimension , the algorithm performs a depth-firsttree search over all lattice points of radius and dimensions

. The nodes in the tree that correspond to thepoints outside the sphere are pruned. This is illustrated inFig. 2, for an dimensional lattice which haspoints in each dimension (i.e., ). [Remark: We need thetree-search interpretation for the discussion on the complexityof the algorithms; further details can be found in [8].]

The search radius can be chosen according to the statisticaldescription of the noise. Note that is a

random variable with degrees of freedom. We choose theradius to be a linear function of the variance of

where the coefficient is chosen in such a way that with a highprobability we find a lattice point inside a sphere

(6)

We find in (6) by a simple table lookup.

A. Computational Complexity of Fincke–Pohst Algorithm

As noted above, the FP algorithm performs a search overall lattice points within a sphere of radius and dimensions

. Hence the complexity of the algorithm isproportional to the number of lattice points visited. In general,the algorithm has worst case and average complexity that isexponential in the number of unknowns (see [8]). However,in communications applications, as is implied by (1), thevector is not arbitrary, but is a lattice point perturbed

Fig. 2. Tree-pruning interpretation of sphere decoding.

by additive noise with known statistical properties. Hence,in this case, the expected complexity is a relevant figureof merit. The expected complexity of the FP algorithm isproportional to the expected number of lattice points that thealgorithm visits. We need two key ingredients to calculatethis expected number of lattice points (and, consecutively,the expected complexity).

1) A probability that an arbitrary lattice point belongsto a -dimensional sphere of radius around the transmittedpoint ; it was shown in [8] that this probability is givenby the following incomplete gamma function

(7)

2) A technique for enumerating points in the latticewith respect to the transmitted point ; in [8], an efficientmethod for counting those lattice points that yield the sameargument of the gamma function in (7), based on certaingenerating functions, is developed.

Using 1) and 2) above, one can find the number of latticepoints visited by Fincke–Pohst algorithm and, therefore, theanalytic expression for its expected complexity. The detailscan be found in [8], e.g., the complexity of the FP algorithmfor a 2-pulse-amplitude-modulation (PAM) constellation is

(8)

For a 4-PAM constellation it is

(9)

where is the coefficient of in the polynomial

Similar expressions can be obtained for 8-PAM, 16-PAM,etc., constellations.

For a wide range of and , the sphere decoding algo-rithm has complexity comparable to cubic-time methods such

Page 5: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

VIKALO et al.: ITERATIVE DECODING FOR MIMO CHANNELS 2303

as nulling and cancelling (cubic in ). As a general principle,for a fixed , the complexity decreases by increasing the SNR

or by decreasing .

IV. MODIFIED FP ALGORITHM FOR MAP DETECTION

The MAP detector maximizes the posterior probability

(10)

Using Bayes’ rule

Further, by assuming that the symbols are inde-pendent, we can write

Then, for a known channel in additive white Gaussian noise(AWGN), (10) is equivalent to the optimization problem

(11)

For an iterative decoding scheme, we also require soft infor-mation, i.e., the probability that each bit is decoded correctly.To this end, consider the LLR defined in (3) and, as in Sec-tion II, consider the th channel use (that is, the currentsymbol vector is obtained by modulating coded block

onto an -PAM constellation)

(12)

Assuming independent bits , (12) becomes

where and denote so-called a priori and ex-trinsic parts of the total soft information, respectively. [Note

that, when used in an iterative decoding scheme, it is onlythat is passed to the other decoding block(s) in the

scheme.] Since the block is uniquely mapped into thesymbol vector , it follows that for an AWGN channel

(13)

Computing (13) over the entire signal space is of prohibitivecomplexity. Instead, we constrain ourselves to thosefor which the argument in (11) is small. [Note that these are thesignal vectors whose contribution to the numerator and denom-inator in (13) is significant.]

Applying the idea of the Fincke–Pohst algorithm, we searchfor the points that belong to the geometric body described by

(14)

where is the lower triangular matrix obtained from the QRfactorization of . (Note that this is no longer a hypersphere.)The search radius in (14) can be chosen according to the sta-tistical properties of the noise and the a priori distribution of .

A necessary condition for to satisfy (14) readily follows:

(15)

Moreover, for every satisfying (15), we define

and obtain a stronger necessary condition for (14) to hold

The procedure continues until all the components of vectorare found. The FP-MAP algorithm can be summarized as fol-lows.

Input: .

1) Set .2) (Bounds for ) Set

.3) (Increase ) . If

and , go to 3), else proceed. Ifgo to 5), else to 4).

Page 6: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

2304 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 6, NOVEMBER 2004

4) (Increase ) ; if , terminatealgorithm, else go to 3).

5) (Decrease ) If go to 6). Else

,and go to 2).

6) Solution found. Save and go to 3).Assume that the search yields the set of points

. The vector that minimizes (11)is the solution to the MAP detection problem (10). The softinformation for each bit can be estimated from (13), by onlysumming the terms in the numerator and denominator suchthat . Note that the logarithms in (13) can be efficientlycomputed using the standard log-MAP implementation [20].

A. Computational Complexity of FP-MAP Algorithm

The complexity of the FP-MAP algorithm can, in principle,be found following the outline of the calculation of complexityof the original Fincke–Pohst algorithm in Section III-A. How-ever, the probability that an arbitrary point belongs to a -di-mensional sphere of radius around the transmitted point(which we need to compute the expected number of points theFP-MAP algorithm visits) now becomes

(16)The argument of this probability function is not as simple asthe one in (7), and the computation of the expected number ofpoints is much more difficult. First and foremost, (16) is a func-tion of the a priori probabilities, which are generally not knownin advance to iterations. Second, since each point in a lat-tice has a distinct a priori probability affiliated with it, argu-ment of the probability function (16) will, in general, be dif-ferent for each pair of points . Thus an efficient enumer-ation that would help the complexity calculation cannot be done.Hence, to compute the expected number of points, one needs toconsider all the possible pairs of points and the corre-sponding probabilities (16) which, as the size of the problem in-creases, clearly becomes rather cumbersome. However, we notethat since , we have

Hence from (7) and (16) it follows that, for the same choice ofradius

and we conclude that, for the same choice of , the expectednumber of points that the FP-MAP algorithm visits is upperbounded by the expected number of points visited by the orig-inal sphere decoding algorithm. Thus, the expected complexityof the FP-MAP is roughly upper bounded by the expectedcomplexity of the sphere decoding, for the same choice of .[“Roughly” upper bounded because since the a priori proba-bilities enter the algorithm, there are a few (two, to be exact)

additional operations per each visited point; this is accountedfor by changing (2 17) to (2 19) in (8) and (9).]

Therefore, the results of [8] suggest that the expected com-plexity of the FP-MAP algorithm is polynomial in over a widerange of rates and SNRs. Generally, we chose the search param-eter so that there are sufficiently many points to make a goodapproximation of (13).

V. SIMULATION RESULTS

In this section, we study performances of several multi-antenna systems employing the FP-MAP algorithm for iterativedecoding.

Example 1 [Convolutional Code]: We consider the multi-antenna system with transmit and receive an-tennas. An information bit sequence with 9216 information bitsis encoded by a rate convolutional code with memorylength 2 and generating polynomials (feed-forward) and (feedback). The coded se-quence is modulated by means of simple Gray mapping onto a16-QAM modulation scheme. On the receiver side, the FP-MAPalgorithm is used for soft detection.

Fig. 3 shows the bit error rate (BER) performance of thesystem compared with the performance of the system em-ploying maximum likelihood (hard) detection. The dashed linein Fig. 3 denotes the SNR corresponding to the ergodic capacityof the channel.

Fig. 4 compares the performance of the FP-MAP with that ofthe soft nulling and cancelling (N/C) algorithm employed in thesame system. For each entry in a transmitted symbol vector, thesoft N/C algorithm cancels the previously decoded symbols andobtains the soft information using the distribution of the noise.[The soft N/C algorithm is similar to the soft minimum meansquared error equalizer of [21]. Also, see [15] for an applica-tion in multiuser context.] Prior to the soft N/C, symbols areoptimally ordered. The complexity of the soft N/C algorithmis roughly cubic (due to the required QR-factorization of thechannel matrix). The FP-MAP even with a single iteration out-performs four iterations of the soft N/C by 2 dB at BER of 10 .

Example 2 [Turbo and LDPC Codes]: We study the multi-antenna system of Example 1 , where the trans-mitted data are encoded with powerful turbo and LDPC codes.

First consider the case where the 9216-bits-long informationsequence is encoded with a rate parallel concatenatedturbo code. The constituent convolutional codes of the turbocode are the same as the convolutional code described in Ex-ample 1. The modulation scheme is 16-QAM. The FP-MAP al-gorithm is used to obtain soft information, which is passed ontothe turbo decoder. For each iteration of the FP-MAP, the turbodecoder performs eight iterations of its own. Fig. 5 shows theBER performance of the system.

Fig. 6 shows the BER performance of the same system, but now the data are encoded with a rate

LDPC code of length 1088 and column weight 4. Furthermore,4-QAM modulation scheme is employed. When the LDPC de-coder receives soft information from FP-MAP, it performs eightiterations before passing what it infers about the coded bits backto the FP-MAP detector.

Page 7: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

VIKALO et al.: ITERATIVE DECODING FOR MIMO CHANNELS 2305

Fig. 3. BER performance ofM = N = 4 system employing rate 1/2, 9216-bits-long convolutional code, 16-QAM, FP-MAP.

Fig. 4. Comparison of BER performances for FP-MAP and soft N/C employed on M = N = 4 system with rate 1/2, 1000-bits-long convolutional code,16-QAM.

Following the convention, the dashed vertical lines in Figs. 5and 6 denote the capacity limits, i.e., they denote the smallestSNR required for reliable transmission at the given data rate of

the system. This means that the error probability of the best codetransmitting at smaller SNR than the one denoted by the dashedlines approaches one as the code length approaches infinity. The

Page 8: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

2306 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 6, NOVEMBER 2004

Fig. 5. BER performance of M = N = 4 system employing rate 1/2 turbo code, 16-QAM, FP-MAP.

Fig. 6. BER performance of M = N = 4 system employing rate 8/9 LDPC code, 4-QAM, FP-MAP.

turbo coded scheme in Fig. 5 gets 3.3 dB close to capacity. AtBER of 10 , it outperforms the convolutional code with perfor-mance in Fig. 3 by approximately 3 dB. The rate of the systemis 8 bits per channel use. The LDPC code, on the other hand, isabout 4.5 dB away from capacity of the system in which it is em-ployed; the data rate in this system is 7.1 bits per channel use.

Although the LDPC code is outperformed by the turbo code,it proves to be an interesting alternative, especially in light ofthe complexity exponents shown in Fig. 7. AtSNR 10 dB, both schemes have BER 10 . As indicatedin Fig. 7, for such SNR, the complexity of the detection in thesystem employing the (high rate) LDPC code and 4-QAM mod-

Page 9: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

VIKALO et al.: ITERATIVE DECODING FOR MIMO CHANNELS 2307

Fig. 7. Maximum mutual information and complexity exponents for 4-QAM and 16-QAM constellations,M = N = 4 system.

ulation is approximately cubic in , while the complexity inthe system with the (1/2 rate) turbo code and 16-QAM modula-tion is significantly higher. So, although the mutual informationplots in Fig. 7 imply that the system with 16-QAM modulationscheme is more efficient, the system that employs 4-QAM mod-ulation is much more favorable from the complexity point ofview.

Example 3 [Systems Employing Both Channel andSpace–Time Codes]: Here we consider performance of theLDPC codes in a multiantenna system with linear dispersive(LD) space–time modulation scheme of [19]. A linear-dis-persion code imposes correlation onto a symbol sequence bytransmitting (over channel uses) columns of the matrixdefined as

where and are fixed matrices, andare complex scalars. The particular choice of

scalars , typically coming from a QAM constellation,determines a specific codeword from the code that is determinedby the set of matrices . The rate of the LD code is

.Assume that . By collecting vectors

received over uses of channel into matrix , and col-lecting corresponding noise vectors into matrix ,the input/output relation of the multiantenna system can be ex-pressed as

(17)

In [19], it was shown that (17) can be written as

(18)

where, and

. Furthermore,the 2 2 real valued equivalent channel matrix isgiven by

......

. . ....

...

where

where , and where the vector, denotes the th column of the MIMO

channel matrix .Note that entries of the vector in (18) come from an

-PAM constellation. Therefore, one can pose the MAP detec-tion problem as

(19)

where is the a priori probability of the th component of thevector . Equation (19) allows for an efficient implementationof the FP-MAP algorithm.

We consider a multiantenna system where the data are en-coded by the -rate LDPC code of length 1088 (the sameLDPC code is used in Example 2). Fig. 8 shows the performanceof the 2 2 system employing the 4-QAM modulation scheme

Page 10: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

2308 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 6, NOVEMBER 2004

Fig. 8. BER performance of M = N = 2 system employing R = 4 LD code and rate 8/9 LDPC code, 4-QAM modulation, FP-MAP.

Fig. 9. BER performance of M = 3; N = 1 system employing R = 2 LD code and rate 8/9 LDPC code, FP-MAP.

and a rate LD code (see [19, Section 4] for the con-struction of this LD code). The significant coding gain due to useof the LDPC code is evident.Even without S–T coding, LDPCcode would yield good performance, as illustrated in Example2. However, S–T coding is necessary in systems with fewer re-

ceive than transmit antennas, to provide as many equations asthere are unknowns and ensure feasibility of decoding (when

, the Fincke–Pohst algorithm is exponential in ).Performance of such a system, with and , em-ploying an LD code is shown in Fig. 9.

Page 11: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

VIKALO et al.: ITERATIVE DECODING FOR MIMO CHANNELS 2309

Fig. 10. Comparison of BER performances of LD code vs. V-BLAST (rate R = 4) employed on M = N = 2 system, using rate 8/9 LDPC code, FP-MAP.

Fig. 11. Comparison of BER performances of LD code versus orthogonal design on M = 3; N = 1 system. Both schemes employ rate 8/9 LDPC code. Thesystem using the LD code uses 64-QAM; the system using orthogonal design uses 256-QAM so that both have rate R = 6. The FP-MAP is used for detection.

We have focused on LD codes. One can ask whether any S–Tcode would suffice, i.e., would any modulation scheme supportthe excellent performance of the powerful channel codes? Based

on extensive simulations, we believe that the answer is negative:S–T modulation schemes need to maximize the mutual informa-tion between the input and output signals. In fact, the dispersion

Page 12: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

2310 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 6, NOVEMBER 2004

matrices of the LD code are actually chosen to maximize themutual information between and

(20)

for a particular of interest and an appropriate power constraint.So, the LD codes satisfy the required criterion. Interestingly,V-BLAST also satisfies this condition. Indeed, as shown inFig. 10, the multiantenna system employing an LDPC channelcode and V-BLAST performs almost as well as the schemewith an LD code ([19, (36)]). Of course, there is an additionalgain that the LD code obtains by spreading the signals acrossspace and time more efficiently than V-BLAST. On the otherhand, if an S–T code throws away some information [i.e., ifit violates (20)], then the system performance is inferior incomparison to the S–T code designed with (20) in mind. Thisis illustrated in Fig. 11, where we compare performance ofthe LD code of Fig. 9 with the orthogonal S–T code ([3]; also[19, (35)]). The orthogonal design does not satisfy (20) and theperformance of the system is clearly much worse than of thatemploying the LD code.

VI. CONCLUSION

In this paper, we developed a modification of the sphere de-coding algorithm to perform the MAP detection and efficientlyestimate soft information. We considered the expected com-plexity of the algorithm and found it is closely related to the ex-pected complexity of the original sphere decoding. In fact, overa wide range of rates and SNR, the FP-MAP algorithm has ex-pected complexity that is polynomial in the number of transmitantennas.

When combined with soft iterative decoding schemes,the proposed detection algorithm FP-MAP provides close tocapacity performances of multiantenna systems. This wasdemonstrated on systems employing both turbo and LDPCcodes. Furthermore, we illustrated performance of the FP-MAPalgorithm in multiantenna systems that employ linear disper-sive space–time modulation scheme as well as powerful LDPCchannel codes. Our results imply that in order to obtain the re-markable performance of the iterative decoding, the space–timetechniques need to optimize for the mutual information be-tween the transmitted and received symbols. Thus, the designparadigms that constructs codes based on mutual informationappear to be very reasonable.

Detailed quantitative analysis of the FP-MAP algorithm, itscomparison with other decoding techniques, and determiningregimes where a particular technique is superior to the othersis certainly worth examining and may be investigated in futurework.

REFERENCES

[1] I. E. Telatar, “Capacity of multi-antenna Gaussian channels,” Eur. Trans.Telecom., vol. 10, pp. 585–595, Nov. 1999.

[2] V. Tarokh V, N. Seshadri, and A. R. Calderbank, “Space-time codes forhigh data rate wireless communication: Performance criterion and codeconstruction,” IEEE Trans. Inform. Theory, vol. 44, pp. 744–765, Mar.1998.

[3] B. M. Hochwald and T. L. Marzetta, “Unitary space-time modulation formultiple-antenna communication in Rayleigh flat-fading,” IEEE Trans.Inform. Theory, vol. 46, pp. 543–564, Mar. 2000.

[4] G. J. Foschini, “Layered space-time architecture for wireless commu-nication in a fading environment when using multi-element antennas,”Bell Labs. Tech. J., vol. 1, no. 2, pp. 41–59, 1996.

[5] E. Viterbo and J. Boutros, “A universal lattice code decoder for fadingchannels,” IEEE Trans. Inform. Theory, vol. 45, pp. 1639–1642, July1997.

[6] U. Fincke and M. Pohst, “Improved methods for calculating vectorsof short length in a lattice, including a complexity analysis,” Math.Comput., vol. 44, no. 4, pp. 463–471, Apr. 1985.

[7] M. O. Damen, A. Chkeif, and J. C. Belfiore, “Lattice code decoder forspace-time codes,” IEEE Commun. Lett., vol. 4, pp. 161–163, May 2000.

[8] B. Hassibi and H. Vikalo, “Expected complexity of the sphere decoderalgorithm,” , submitted for publication.

[9] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limiterror-correcting coding and decoding. Turbo codes,” in Proc. Int. Conf.Communications, 1993, pp. 1064–1070.

[10] J. P. Woodard and L. Hanzo, “Comparative study of turbo decodingtechniques: An overview,” IEEE Trans. Veh. Technol., vol. 49, pp.2208–2233, Nov. 2000.

[11] R. G. Gallager, Low-Density Parity-Check Codes. Cambridge, MA:MIT Press, 1963.

[12] D. J. C. MacKay and R. M. Neal, “Near Shannon limit performance oflow-density parity-check codes,” Electron. Lett., vol. 32, pp. 1645–1646,1996.

[13] D. J. C. MacKay, “Good error correcting codes based on very sparsematrices,” IEEE Trans. Inform. Theory, vol. 45, pp. 399–431, Mar. 1999.

[14] A. Stefanov and T. M. Duman, “Turbo-coded modulation for systemswith transmit and receive antenna diversity over block fading channels:System model, decoding approaches, and practical considerations,”IEEE J. Select. Areas Commun., vol. 19, pp. 958–968, May 2001.

[15] P. D. Alexander, A. J. Grant, and M. C. Reed, “Iterative detection incode-division multiple-access with error control coding,” Eur. Trans.Tel., vol. 9, pp. 419–425, 1998.

[16] H. Vikalo and B. Hassibi, “Low-complexity iterative decoding over mul-tiple antenna channels via a modified sphere decoder,” in Proc. AllertonConf. Commun., Control, and Computing, Monticello, IL, Oct. 2001.

[17] B. M. Hochwald and S. ten Brink, “Achieving near-capacity on a mul-tiple-antenna channel,” in Proc. Allerton Conf. Commun., Control, andComputing, Monticello, IL, Oct. 2001.

[18] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linearcodes for minimizing symbol error rate,” IEEE Trans. Inform. Theory,pp. 284–287, Mar. 1974.

[19] B. Hassibi and B. Hochwald, “High-rate codes that are linear in spaceand time,” IEEE Trans. Inform. Theory, vol. 48, no. 7, pp. 1804–1824,July 2002.

[20] P. Robertson, P. Hoeher, and E. Villebrun, “Optimal and suboptimalmaximum a posteriori algorithms suitable for turbo decoding,” Euro.Trans. Telecommun., vol. 8, pp. 119–125, Mar.–Apr. 1997.

[21] M. Tuechler, A. Singer, and R. Koetter, “Minimum mean squared errorequalization using a-priori information,” IEEE Trans. Signal Processing,vol. 50, pp. 673–683, 2002.

H. Vikalo was born in Tuzla, Bosnia and Herze-govina. He received the B.S. degree from theUniversity of Zagreb, Croatia, in 1995, the M.S.degree from Lehigh University, Bethlehem, PA, in1997, and the Ph.D. degree from Stanford University,Stanford, CA, in 2003, all in electrical engineering.

He held a short-term appointment at Bell Labora-tories, Murray Hill, NJ, in the summer of 1999. FromNovember 2002 to July 2003, he was a PostdoctoralResearcher, and since July 2003, he has been anAssociate Scientist at the California Institute of

Technology, Pasadena. His research interests include wireless communications,signal processing, estimation, and genomic information processing.

Page 13: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, …authors.library.caltech.edu/3279/1/VIKieeetwc04.pdf · channel model, where the channel is known to the receiver. This is

VIKALO et al.: ITERATIVE DECODING FOR MIMO CHANNELS 2311

B. Hassibi was born in Tehran, Iran, in 1967. He re-ceived the B.S. degree from the University of Tehranin 1989 and the M.S. and Ph.D. degrees from Stan-ford University, Stanford, CA, in 1993 and 1996, re-spectively, all in electrical engineering.

From 1996 to 1998 he was a Research Associatewith the Information Systems Laboratory, StanfordUniversity. From 1998 to 2000 he was a Memberof Technical Staff in the Mathematical SciencesResearch Center, Bell Laboratories, Murray Hill,NJ. Since January 2001 he has been an Assistant

Professor of electrical engineering at the California Institute of Technology,Pasadena. He has also held short-term appointments at Ricoh CaliforniaResearch Center, the Indian Institute of Science, and Linkoping University,Sweden. His research interests include wireless communications, robustestimation and control, adaptive signal processing, and linear algebra. He is thecoauthor of Indefinite Quadratic Estimation and Control: A Unified Approachto H and H Theories (New York: SIAM, 1999) and Linear Estimation(Englewood Cliffs, NJ: Prentice-Hall, 2000).

Prof. Hassibi received the 1999 O. Hugo Schuck Best Paper Award of theAmerican Automatic Control Council, a 2002 National Science Foundation Ca-reer Award, and a 2003 Okawa Foundation Research Grant for Information andTelecommunications.

T. Kailath received the B.E. (Telecom) degree fromPune University, India, and the S.M. and Sc.D. de-grees in electrical engineering from the Massachu-setts Institute of Technology, Cambridge.

He has also been awarded honorary degrees fromLinkoping University, Sweden, Strathclyde Univer-sity, Scotland, the University of Carlos III, Madrid,Spain, and the University of Bordeaux, France. From1961 to 1962, he was with the Jet Propulsion Labora-tories, Pasadena, CA, and also taught part-time at theCalifornia Institute of Technology. Since then he has

been with Stanford University, where he is currently Hitachi America Professorof Engineering, Emeritus. He has also held shorter term appointments at sev-eral institutions around the world. His research has spanned a large number ofdisciplines, emphasizing information theory and communications in the 1960s,linear systems, estimation and control in the 1970s, VLSI design and sensorarray signal processing in the 1980s, and applications to semiconductor manu-facturing and digital communications in the 1990s. Concurrently, he contributedto several fields of mathematics, especially stochastic processes, operator theory,and linear algebra. In the course of his research and teaching, he has mentoredmore than a hundred doctoral and postdoctoral students and authored or coau-thored more than 300 journal papers. His books include Linear Systems (En-glewood Cliffs, NJ: Prentice-Hall, 1980), Indefinite Quadratic Estimation andControl (with B. Hassibi and A. H. Sayed) (SIAM, 1999), and Linear Estima-tion (with A. H. Sayed and B.Hassibi) (Englewood Cliffs, NJ: Prentice-Hall,2000). He has held Guggenheim, Churchill, and Humboldt fellowships, amongothers.

Prof. Kailath is a member of the National Academy of Engineering and theNational Academy of Sciences,the American Academy of Arts and Sciences,the Third World Academy of Sciences, the Indian National Academy ofEngineering, and the Royal Spanish Academy of Engineering. He has receivedoutstanding paper prizes from the IEEE Information Theory Society, the IEEESignal Processing Society, the European Signal Processing Society, and theIEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING. He served asPresident of the IEEE Information Theory Society in 1975, and received itsShannon Award in 2000. Among other awards are the Technical Achievement(1989) and Society (1991) Awards of the IEEE Signal Processing Society, the1995 IEEE Education Medal, and (with A. Sayed) the 1996 IEEE Donald G.Fink Prize Award.