

Asynchronous Convolutional-Coded Physical-Layer Network Coding

Qing Yang, Student Member, IEEE, and Soung Chang Liew, Fellow, IEEE

Abstract: This paper investigates the decoding process of asynchronous convolutional-coded physical-layer network coding (PNC) systems. Specifically, we put forth a layered decoding framework for convolutional-coded PNC consisting of three layers: symbol realignment layer, codeword realignment layer, and joint channel-decoding network coding (Jt-CNC) decoding layer. Our framework can deal with phase asynchrony (phase offset) and symbol arrival-time asynchrony (symbol misalignment) between the signals simultaneously transmitted by multiple sources. A salient feature of this framework is that it can handle both fractional and integral symbol misalignments. For the decoding layer, instead of Jt-CNC, previously proposed PNC decoding algorithms (e.g., XOR-CD and reduced-state Viterbi algorithms) can also be used with our framework to deal with general symbol misalignments. Our Jt-CNC algorithm, based on belief propagation (BP), is BER-optimal for synchronous PNC and near optimal for asynchronous PNC. Extending beyond convolutional codes, we further generalize the Jt-CNC decoding algorithm for all cyclic codes. Our simulation shows that Jt-CNC outperforms the previously proposed XOR-CD algorithm and reduced-state Viterbi algorithm by 2 dB for synchronous PNC. For both phase-asynchronous and symbol-asynchronous PNC, Jt-CNC performs better than the other two algorithms. Importantly, for real wireless network experimentation, we implemented our decoding algorithm in a PNC prototype built on the USRP software radio platform. Our experiment shows that the proposed Jt-CNC decoder works well in practice.

Keywords: Physical-layer network coding; convolutional codes; symbol misalignment; phase offset; joint channel-decoding and network coding; cyclic codes.

I. INTRODUCTION

This paper investigates the use of convolutional codes in asynchronous physical-layer network coding (PNC) systems to ensure reliable communication. In particular, we focus on the decoding problem when simultaneous signals from multiple transmitters arrive at a PNC receiver with asynchronies between them.

PNC was first proposed in [1] as a way to exploit network coding [2], [3] at the physical layer. In the simplest PNC setup, two users exchange information via a relay in a two-way relay network (TWRN). The two users transmit their messages simultaneously to the relay; the relay then maps the overlapped signals to a network-coded message and broadcasts it to the two users; and each of the two users recovers the message from the other user based on the network-coded message and the knowledge of its own message. PNC can potentially boost the throughput of TWRN by 100% compared with a traditional relay system [1].

The authors are with the Department of Information Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong (email: {yq010, soung}@ie.cuhk.edu.hk).

This work is partially supported by AoE grant E-02/08 and the General Research Funds Project No. 414911, established under the University Grant Committee of the Hong Kong Special Administrative Region, China. This work is also partially supported by the China 973 Program, Project No. 2012CB315904 and the China NSFC grant, Project No. 61271277.

Our paper focuses on PNC decoding as applied to TWRN. To ensure reliable transmission, communication systems make use of channel coding to protect the information from noise and fading. In channel-coded PNC, the goal of the relay is to decode the simultaneously received signals not into the individual messages of the two users, but into a network-coded message. This process is referred to as the channel-decoding network coding (CNC) process in [4].

In addition to the issue of channel coding, in practice, the signals from the two users may be asynchronous in that there may be relative symbol arrival-time asynchrony (symbol misalignment), phase asynchrony (phase offset), and other asynchronies between the two signals received at the relay. These PNC systems are referred to as asynchronous PNC (APNC) systems [5].

Both [4] and [5] assume the use of repeat accumulate (RA) codes. Our current paper, on the other hand, focuses on the use of convolutional codes. A main motivation is that convolutional codes are commonly adopted in many communications systems (e.g., the channel code in IEEE 802.11 is a convolutional code [6]). Convolutional codes have been well studied and there are many good designs for the encoding/decoding of convolutional codes in the conventional communication setting. Given this backdrop, whether these designs are still applicable to PNC, and what additional considerations and modifications are needed for PNC, are issues of utmost interest. This paper is an attempt to address these issues.

Our main contributions are as follows:

• We put forth a layered decoding framework for asynchronous PNC systems. The proposed decoding framework can deal with synchronous PNC as well as asynchronous PNC with relative phase offset and general symbol misalignment. By general symbol misalignment, we mean that the arrival times of the two users’ signals at the relay are offset by ($\tau_I + \tau_F$) symbol durations, where $\tau_I$ is an integral offset and $\tau_F$ is a fractional offset smaller than one. With our framework, the previous decoding algorithms can also be used to deal with asynchronous PNC.

• We design a joint channel-decoding network coding (Jt-CNC) decoder for convolutional-coded PNC. The Jt-CNC decoder, based on belief propagation (BP), is optimal in terms of bit error rate (BER) performance. We analyze the BER of our Jt-CNC decoder mathematically and derive an approximate expression for the BER.

• We implement the Jt-CNC decoder in a real PNC system built on the USRP software radio platform. Our experiment shows that the Jt-CNC decoder works well over a real wireless channel.

• We propose an algorithm that can handle general symbol misalignment in cyclic-coded PNC, building on the insight obtained from our study of convolutional-coded PNC; that is, the algorithm is applicable to all cyclic codes, not just convolutional codes.


The remainder of this paper is organized as follows. Section II overviews related work. Section III describes the PNC system model. Section IV puts forth our Jt-CNC framework, focusing on synchronous PNC. Section V extends the Jt-CNC framework to asynchronous PNC. We further show how the algorithmic framework is applicable to the general cyclic-coded PNC in the Appendix. Section VI presents simulations and experimental results together with the BER analysis for the Jt-CNC decoder. Section VII concludes this work.

II. RELATED WORK

A. Synchronous PNC with Convolutional Codes

The first implementation of TWRN based on the principle of PNC was recently reported in [7], [8]. This system employs the convolutional code defined in the 802.11 standard and adopts the OFDM modulation to eliminate symbol misalignment [9]. In [7], [8], first the log-likelihood ratio (LLR) of the XORed channel-coded bits is computed; then this soft information is fed to a conventional Viterbi decoder. We refer to this decoding strategy as the soft XOR and channel decoding (XOR-CD) scheme [10]. Detailed explanation and interpretation of the XOR-CD algorithm are given in Appendix III-A. The experiment shows that the use of XOR-CD on the convolutional-coded PNC system, thanks to its simplicity, is feasible and practical.

The acronym XOR-CD refers to a two-step process: first, prior to channel decoding and without considering the correlations among the received symbols due to the channel code, we apply symbol-by-symbol PNC mapping on the received symbols to obtain estimates on the successive XORed bits; after that, we perform channel decoding on the XORed bits to obtain the XORed source bits. The performance of XOR-CD is suboptimal because the PNC mapping in the first step loses information [4]. Furthermore, only linear channel codes can be correctly decoded in the second step. Jt-CNC, on the other hand, performs channel decoding and network coding as an integrated process rather than two disjoint steps. Jt-CNC can be ML (maximum likelihood) optimal, depending on which variations of Jt-CNC we use and whether the underlying PNC system is synchronous or asynchronous.
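The first step of XOR-CD, the symbol-by-symbol soft PNC mapping, can be illustrated with a minimal Python sketch. It is an assumption-laden illustration rather than the formulation of Appendix III-A (which is not reproduced here): it assumes BPSK with the mapping 0 → +1 and 1 → −1, equiprobable coded bits, known channel coefficients, and the Gaussian-noise likelihood implied by the distance metric used later in the paper. The function name `soft_xor_llr` is hypothetical.

```python
import math

def soft_xor_llr(y, h_a, h_b, sigma2):
    """LLR of (c_A xor c_B) for one BPSK sample y = h_a*x_a + h_b*x_b + noise.

    Assumptions (not from the paper): bit 0 -> +1, bit 1 -> -1, equal priors.
    """
    like = [0.0, 0.0]  # like[x] accumulates p(y | xor bit = x)
    for c_a in (0, 1):
        for c_b in (0, 1):
            ideal = h_a * (1 - 2 * c_a) + h_b * (1 - 2 * c_b)
            like[c_a ^ c_b] += math.exp(-abs(y - ideal) ** 2 / (2 * sigma2))
    return math.log(like[0] / like[1])

h_a, h_b = 1 + 0j, 0.7j
llr_same = soft_xor_llr(h_a + h_b, h_a, h_b, sigma2=0.5)  # both bits 0
llr_diff = soft_xor_llr(h_a - h_b, h_a, h_b, sigma2=0.5)  # bits differ
```

A positive LLR favours an XOR bit of 0 and a negative one favours 1. Feeding such per-symbol LLRs to a single-user Viterbi decoder is the second step; the information loss mentioned above comes from collapsing the four joint hypotheses into one number before the code structure is used.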

Within the class of Jt-CNC algorithms, for optimality, there are two possible decoding targets: (i) ML XORed packet; (ii) ML XORed bits. To draw an analogy, for the conventional single-user point-to-point communication, if convolutional codes are used, then the Viterbi algorithm [11] aims to obtain the ML packet, while the BCJR [12] aims to obtain ML bits. For PNC systems, the aim is to obtain the network-coded packet or the network-coded bits instead.

A Jt-CNC algorithm for finding the ML XORed packet was proposed in [13]. However, as will be discussed later, finding the ML XORed packet requires exhaustive search that could have prohibitively high complexity. Therefore, the log-max approximation is adopted in [13] and the ML algorithm is simplified to (approximated with) a full-state Viterbi (FSV) algorithm. The detailed explanation and interpretation of the full-state Viterbi algorithm are given in Appendix III-B. The term “full-state” comes from the fact that this algorithm combines the trellises of both end nodes to make a virtual decoder. By searching the best path on the combined trellis with the Viterbi algorithm, [13] tries to decode the ML pair of packets of the two end nodes. To further reduce the complexity, [13] simplifies the full-state Viterbi algorithm to a “reduced-state” Viterbi algorithm. Reference [13], however, did not benchmark their approximate algorithm with the optimal one. As we will show later, the algorithm proposed by us in this paper can yield better performance than that in [13].

In this paper, we aim to find the ML XORed bits within the source packet rather than the overall ML XORed packet. In Section IV we show that our algorithm is XOR bit-optimal for synchronous PNC. Finding ML XORed bits turns out to have much lower complexity than finding the ML XORed packet. This is quite different from the conventional point-to-point communication system, in which the simple Viterbi algorithm can be used to decode the ML packet, and in which BCJR (slightly more complex than the Viterbi algorithm) can be used to decode the ML bits. The XOR bit-optimal decoding for synchronous PNC is investigated in our prior published work [14].

B. Asynchronous PNC with Convolutional Codes

In asynchronous PNC systems, the signals from the two end nodes may arrive at the relay with symbol misalignment and relative phase offset [5]. To the best of our knowledge, there has been no Jt-CNC decoder for convolutional codes that can deal with integral-plus-fractional symbol misalignment. In [15], a convolutional decoding scheme with an XOR-CD algorithm was proposed to deal with integral symbol misalignment. As pointed out in [15], symbol misalignment entangles the channel-coded bits of the trellises of the two encoders in a way that ordinary Viterbi decoding, based on just one of the trellises, is not applicable. Therefore the XOR-CD algorithm for synchronous PNC cannot be applied anymore in the presence of integral symbol misalignment. Their solution is to rearrange the transmit order of the channel-coded bits into blocks, and pad $D_{\max}$ zeros between adjacent blocks. The zero padding acts as a guard interval between blocks that avoids the entanglement of channel-coded bits and facilitates Viterbi decoding. However, this scheme can only deal with integral symbol misalignment of at most $D_{\max}$ symbols. In addition, it incurs a code-rate loss due to the zero padding between blocks.

C. Asynchronous PNC with Other Channel Codes

The use of LDPC codes in asynchronous PNC systems has previously been considered. In [5], the authors designed a Jt-CNC decoder for the RA code that can deal with fractional symbol misalignment (i.e., symbol misalignment that is less than one symbol duration) and phase offset. Our decoding framework adopts the over-sampling technique proposed in [5] to address fractional symbol misalignment.

To deal with asynchrony in PNC, our decoding framework consists of three layers: symbol-realignment layer, codeword-realignment layer, and joint channel-decoding network coding (Jt-CNC) layer. The first two layers, symbol realignment and codeword realignment, counter fractional and integral symbol misalignments, respectively; the third layer, Jt-CNC, decodes the ML XORed bits. Other decoding schemes (e.g., XOR-CD, reduced-state Viterbi) can also be used in the third layer of the framework. We further show that our decoding framework is applicable not only when convolutional codes are adopted but also when general cyclic codes are used.

Besides convolutional codes, an important class of cyclic codes is the cyclic LDPC. The Jt-CNC LDPC decoder proposed in [5] was extended by [16] to deal with general asynchrony using cyclic LDPC. However, the proposed decoder in [16] discards the non-overlapped part of the received signal, losing useful information that can potentially enhance performance. Therefore, for the decoder in [16], the larger the symbol misalignment, the worse the performance. By contrast, our framework makes full use of the non-overlapped portion of the signal so that a larger symbol misalignment can enhance performance.

D. Two-Way Relay Network with Other Techniques

The goal of our Jt-CNC decoder proposed in this paper is to decode the XOR of the end nodes’ messages in the uplink phase. The relay then broadcasts the XORed message to both end nodes in the downlink phase. Each end node XORs the XORed message with its own message to obtain the other end node’s message. For two-way relay network systems, there are rate-improving techniques other than PNC, such as superposition coding and hierarchical modulation [17], [18]. Both superposition coding and hierarchical modulation are for the downlink. If they are to be used, the relay has to decode the individual messages $U^A$ and $U^B$ instead of their XOR in the uplink. For FSV and Jt-CNC, we can also get the individual messages, and then use superposition coding or hierarchical modulation rather than network coding (NC) for the downlink. For both superposition coding and hierarchical modulation, the self information at the two end nodes is not used to do the decoding in the downlink. In other words, some available information is not exploited. As a result, the overall achievable rates (from an information-theoretic viewpoint) will not be as good as those achieved by the NC schemes.

III. SYSTEM MODEL

We consider the application of PNC in a two-way relay network (TWRN) as shown in Fig. 1. In this model, nodes A and B exchange information with the help of relay node R. We assume that all nodes are half-duplex and there is no direct link between A and B.

With PNC, nodes A and B exchange one packet with each other in two time slots. The first time slot corresponds to an uplink phase, in which node A and node B transmit their channel-coded packets simultaneously to relay R. The relay R then constructs a network-coded packet based on the simultaneously received signals from A and B. This operation is referred to as the channel decoding network coding (CNC) process [10], because the received signals are decoded into a network-coded message rather than the individual messages from A and B. The second time slot corresponds to a downlink phase, in which relay R channel-encodes the network-coded message and broadcasts it to both A and B. Upon receiving the network-coded packet, A (B) then attempts to recover the original packet transmitted by B (A) in the uplink phase using self-information [1]. This paper focuses on the design of the CNC algorithm in the uplink phase; the issue in the downlink phase is similar to that in conventional point-to-point transmission and does not require special treatment [10].
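The two-slot exchange can be summarised at the bit level with a short Python sketch. It deliberately ignores the physical layer (channel coding, superposition, noise) and assumes the relay has already obtained the correct XOR packet; the variable names are illustrative only.

```python
def xor_bits(p, q):
    """Bit-wise XOR of two equal-length bit lists."""
    return [a ^ b for a, b in zip(p, q)]

u_a = [1, 0, 1, 1, 0, 0, 1, 0]   # source packet of node A
u_b = [0, 1, 1, 0, 1, 0, 0, 1]   # source packet of node B

# Uplink: the relay's target is the network-coded packet, not u_a and u_b.
u_r = xor_bits(u_a, u_b)

# Downlink: each end node removes its own packet (self-information).
recovered_at_a = xor_bits(u_r, u_a)   # equals u_b
recovered_at_b = xor_bits(u_r, u_b)   # equals u_a
```

The relay never needs the individual packets, which is why the uplink decoder can target the XOR directly.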

As shown in Fig. 1, in the uplink phase, the source packets of nodes A and B each go through a convolutional encoder, an interleaver, and a modulator. Our framework can accommodate all different types of convolutional codes. We adopt zero-tail convolutional codes¹ in the main text of this paper. We denote the source packets of node A and node B by two K-bit binary sequences:

$$U^i = \left(u_1^i, u_2^i, \cdots, u_K^i\right), \quad i \in \{A, B\}, \tag{1}$$

¹ The use of other convolutional codes (e.g., tail-biting [19] and recursive) is discussed in Appendix II.

[Figure 1: block diagram. Labels recovered from the original: Uplink, Downlink; nodes A, R, B; Encoder, Interleaver Π(·), and Modulator M(·) at each end node; Demodulator M⁻¹(·), Deinterleaver Π⁻¹(·), and Decoder C⁻¹(·) at the relay; signals U^A, C^A, X^A, U^B, C^B, X^B, W^R, Y^R, U^R.]

Fig. 1. System model of two-way relay network operated with physical-layer network coding.

where $u_k^i$ is the input bit of end node i’s source packet at time k. The source packets are encoded into two M-bit channel-coded binary sequences. We assume nodes A and B use the same convolutional code with code rate r. In the following presentation we choose r = 1/R, where R is an integer, as an example. For concreteness, let us consider R = 3. Thus, M = 3K. The two channel-coded packets are

$$C^i = \left(c_1^i, c_2^i, \cdots, c_M^i\right) = \left(c_1^i, c_2^i, \cdots, c_K^i\right) = \left(c_{1,1}^i, c_{1,2}^i, c_{1,3}^i, c_{2,1}^i, c_{2,2}^i, \cdots, c_{K,1}^i, c_{K,2}^i, c_{K,3}^i\right), \quad i \in \{A, B\}, \tag{2}$$

where $c_{k,j}^i$ is the jth channel-coded bit of end node i’s channel-coded packet at time k; the 3-bit tuple $c_k^i = (c_{k,1}^i, c_{k,2}^i, c_{k,3}^i)$ is the output of the convolutional encoder of node i at time k. Then, $C^A$ and $C^B$ are fed into their respective block interleavers that realize the same permutation to produce

$$C^i = \left(c_{1,1}^i, c_{2,1}^i, \cdots, c_{K,1}^i;\; c_{1,2}^i, \cdots, c_{K,2}^i;\; c_{1,3}^i, \cdots, c_{K,3}^i\right), \quad i \in \{A, B\}. \tag{3}$$

Note that after the permutation, the jth coded bits of all the source bits are grouped into a block. There are altogether three blocks. Finally, $C^i$ are modulated to produce the two sequences of N complex symbols:

$$X^i = \left(x_1^i, x_2^i, \cdots, x_N^i\right), \quad i \in \{A, B\}. \tag{4}$$

Throughout this paper, we focus on BPSK and QPSK modulations; our framework can be easily extended to higher order constellations [20]–[22]. For BPSK, $N = 3K$ and $x_n^i \in \{1, -1\}$. For QPSK, $N = 3K/2$ and $x_n^i \in \frac{1}{\sqrt{2}}\{1+j, -1+j, 1-j, -1-j\}$. The complex symbol sequences $X^A$ and $X^B$ are shaped using a pulse shaping function $p(t)$ with symbol duration T and transmitted. Without loss of generality, we assume $p(t)$ is the rectangular pulse throughout this paper.
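The interleaving in (3) and the symbol mapping in (4) can be sketched in a few lines of Python. The permutation follows (3) directly; the bit-to-symbol mappings (0 → +1, 1 → −1 for BPSK, and one bit each on the real and imaginary parts for QPSK) are assumptions made for illustration, since the text specifies only the constellations. The function names are hypothetical.

```python
import math

def block_interleave(coded, R=3):
    """Regroup (c_{1,1}, ..., c_{1,R}, c_{2,1}, ...) into R blocks as in (3)."""
    K = len(coded) // R
    return [coded[k * R + j] for j in range(R) for k in range(K)]

def modulate(bits, scheme="BPSK"):
    """Map bits to unit-energy symbols (assumed mapping: 0 -> +1, 1 -> -1)."""
    if scheme == "BPSK":
        return [complex(1 - 2 * b) for b in bits]
    scale = 1 / math.sqrt(2)
    return [scale * complex(1 - 2 * bits[n], 1 - 2 * bits[n + 1])
            for n in range(0, len(bits), 2)]

# Labels "kj" stand for c_{k,j}; K = 2 source bits, R = 3 coded bits each.
labels = [11, 12, 13, 21, 22, 23]
interleaved = block_interleave(labels)         # [11, 21, 12, 22, 13, 23]
bpsk_symbols = modulate([0, 1, 1, 0, 0, 0])    # N = 3K = 6 symbols
qpsk_symbols = modulate([0, 1, 1, 0, 0, 0], "QPSK")   # N = 3K/2 = 3 symbols
```

Grouping the jth coded bits into contiguous blocks is what later allows an integral symbol offset to be treated block by block.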

Let us denote the channel coefficients of the channels from node A and node B to relay R by $h_A$ and $h_B$, respectively. Both $h_A$ and $h_B$ are complex numbers, whose phase difference $\phi = \angle(h_B/h_A)$ is the relative phase offset between node A and node B. We assume that the channel state information (CSI) $h_A$ and $h_B$ can be estimated at the relay R using preambles. Node A and node B use different pseudo-noise (PN) sequences that have good cross-correlation properties (e.g., Gold sequences) as their preambles. Upon receiving the superposed packet, the relay cross-correlates the received signal with node A’s preamble to estimate the channel coefficient $h_A$. Since node A and node B use different PN sequences as preambles, the influence of node B’s signal is removed by the cross-correlation. The relay also estimates $h_B$ using the same method.

The received complex baseband signal at the relay is

$$y^R(t) = \sum_{n=1}^{N} \left\{ h_A x_n^A\, p(t - nT) + h_B x_n^B\, p(t - nT - \tau T) \right\} + w^R(t), \tag{5}$$

where $\tau T$ is the symbol misalignment (i.e., the arrival time of the signal of B lags the arrival time of the signal of A by $\tau T$). The relay can estimate the symbol misalignment using the two PN preambles. First, the relay cross-correlates the received signal with node A’s preamble to locate the first sample of A’s packet; it then cross-correlates the received signal with node B’s preamble to locate the first sample of B’s packet. Finally, the relay calculates their difference to estimate $\tau$. This method works even if the end nodes’ preambles are partially overlapped, owing to the good cross-correlation property of the PN preambles. The noise term $w^R(t)$ is assumed to be circularly symmetric complex noise with variance $\sigma^2$. We assume the symbol misalignment to consist of two parts: an integral part $\tau_I \in \mathbb{N}$ and a fractional part $\tau_F \in [0, 1)$, so that $\tau = \tau_I + \tau_F$.
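The preamble-based estimation of the channel coefficients and of the arrival-time difference can be illustrated with the following Python sketch. Several simplifications are assumed and are not taken from the paper: random ±1 sequences stand in for the PN (e.g., Gold) preambles, the signal is sampled once per symbol so only an integral offset is resolved (fractional offsets would need the over-sampling mentioned earlier), and only the preambles are simulated. All names are hypothetical.

```python
import cmath
import random

rng = random.Random(0)
L = 255
pn_a = [rng.choice((-1, 1)) for _ in range(L)]   # stand-in for A's preamble
pn_b = [rng.choice((-1, 1)) for _ in range(L)]   # stand-in for B's preamble

def superpose(h_a, h_b, start_a, start_b, total=600, noise_std=0.1):
    """Received samples: both preambles, scaled and offset, plus noise."""
    rx = [complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
          for _ in range(total)]
    for n in range(L):
        rx[start_a + n] += h_a * pn_a[n]
        rx[start_b + n] += h_b * pn_b[n]
    return rx

def locate(rx, preamble):
    """Return (start index, channel estimate) from the correlation peak."""
    best_lag, best_corr = 0, 0j
    for lag in range(len(rx) - len(preamble) + 1):
        corr = sum(rx[lag + n] * p for n, p in enumerate(preamble))
        if abs(corr) > abs(best_corr):
            best_lag, best_corr = lag, corr
    return best_lag, best_corr / len(preamble)

h_a = 1 + 0j
h_b = 0.8 * cmath.exp(1j * cmath.pi / 4)
rx = superpose(h_a, h_b, start_a=40, start_b=47)
start_a, h_a_hat = locate(rx, pn_a)
start_b, h_b_hat = locate(rx, pn_b)
tau_symbols = start_b - start_a                    # offset in symbol periods
phase_offset = cmath.phase(h_b_hat / h_a_hat)      # estimate of the relative phase
```

The correlation peak gives both the packet start and, after normalising by the preamble length, an estimate of the channel coefficient; the other user's preamble contributes only a small residual because the two sequences are nearly uncorrelated.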

IV. SYNCHRONOUS CONVOLUTIONAL-CODED PNC

Let us first focus on synchronous convolutional-coded PNC, where the signals of node A and node B are symbol-aligned ($\tau = 0$). Section V will discuss the asynchronous case. We first derive the XOR packet-optimal Jt-CNC algorithm that aims at finding the ML XORed source packet. We show that such an XOR packet-optimal algorithm has prohibitively high complexity. Then we introduce our XOR bit-optimal Jt-CNC algorithm that finds the ML XORed bits, which has much lower complexity.

A. XOR Packet-Optimal Decoding of Synchronous Convolutional-Coded PNC

In the case of synchronous convolutional-coded PNC, the received baseband signal at relay R is obtained by setting symbol misalignment $\tau$ to zero in (5):

$$y^R(t) = \sum_{n=1}^{N} \left\{ h_A x_n^A\, p(t - nT) + h_B x_n^B\, p(t - nT) \right\} + w^R(t). \tag{6}$$

After matched filtering [5], the received baseband samples at relay R are

$$Y^R = \left(y_1^R, y_2^R, \cdots, y_N^R\right), \tag{7}$$

where

$$y_n^R = h_A x_n^A + h_B x_n^B + w_n^R. \tag{8}$$

The ML XORed source packet $U^R = (u_1^R, u_2^R, \cdots, u_K^R)$ (i.e., the ML XOR of the source packets of node A and node B) is given by

$$U^R = \arg\max_{U^R} \sum_{U^A, U^B :\, U^A \oplus U^B = U^R} \exp\left(-M(X^A, X^B)\right), \tag{9}$$

where $\oplus$ denotes the binary bit-wise XOR operator; $X^A$ and $X^B$ are the convolutional-encoded and modulated baseband signals of $U^A$ and $U^B$, respectively; and $M(X^A, X^B)$ is the distance metric defined as follows:

$$M(X^A, X^B) = \sum_{n=1}^{N} \frac{\left| y_n^R - h_A x_n^A - h_B x_n^B \right|^2}{2\sigma^2} = \frac{\left\| Y^R - h_A X^A - h_B X^B \right\|_2^2}{2\sigma^2}. \tag{10}$$

For source packets $U^A$ and $U^B$ of length K, the functional mapping from $U^A$ and $U^B$ to the XORed source packet $U^R$ can be expressed as

$$f_{\text{packet}} : \{0, 1\}^K \times \{0, 1\}^K \to \{0, 1\}^K. \tag{11}$$

The mapping in (11) is a $2^K$-to-1 mapping; that is, there are $2^K$ possible $(U^A, U^B)$ that can produce a particular $U^R$. This is where the complexity lies in (9). For each possible source packet $U^R$, we need to examine $2^K$ possible combinations of $(U^A, U^B)$, with each $(U^A, U^B)$ associated with one pair of channel-coded signals $(X^A, X^B)$. The Viterbi algorithm is a shortest-path algorithm that computes a path in the trellis of $(U^A, U^B)$. Meanwhile, each $U^R$ is associated with $2^K$ paths in the trellis. There is no known exact computation method for (9) except to exhaustively sum over the possible combinations of $(U^A, U^B)$ for each $U^R$.
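The exhaustive evaluation of (9) with the metric (10) can be written out directly for a toy packet length. The Python sketch below is illustrative only: the rate-1/3 generator taps in `G` are hypothetical (the paper's code is not specified in this section), no zero tail or interleaver is applied, and BPSK with 0 → +1, 1 → −1 is assumed.

```python
import itertools
import math

G = ((1, 1, 1), (1, 0, 1), (1, 1, 0))   # hypothetical rate-1/3 generator taps

def encode(bits):
    """Toy feedforward convolutional encoder with memory 2 (no zero tail)."""
    s1 = s2 = 0
    coded = []
    for u in bits:
        reg = (u, s1, s2)
        coded.extend(sum(g * x for g, x in zip(gen, reg)) % 2 for gen in G)
        s1, s2 = u, s1
    return coded

def bpsk(bits):
    return [1 - 2 * b for b in bits]

def metric(y, x_a, x_b, h_a, h_b, sigma2):
    """Distance metric of (10)."""
    return sum(abs(yn - h_a * a - h_b * b) ** 2
               for yn, a, b in zip(y, x_a, x_b)) / (2 * sigma2)

def packet_optimal_xor(y, K, h_a, h_b, sigma2):
    """Brute-force evaluation of (9): sum exp(-metric) over all pairs per XOR packet."""
    packets = list(itertools.product((0, 1), repeat=K))
    signals = {u: bpsk(encode(u)) for u in packets}
    score = {}
    for u_a in packets:
        for u_b in packets:
            u_r = tuple(a ^ b for a, b in zip(u_a, u_b))
            m = metric(y, signals[u_a], signals[u_b], h_a, h_b, sigma2)
            score[u_r] = score.get(u_r, 0.0) + math.exp(-m)
    return max(score, key=score.get)

h_a, h_b = 1 + 0j, 0.7j
u_a, u_b = (1, 0, 1, 1), (0, 1, 1, 0)
y = [h_a * a + h_b * b for a, b in zip(bpsk(encode(u_a)), bpsk(encode(u_b)))]
decoded = packet_optimal_xor(y, 4, h_a, h_b, sigma2=0.1)   # noiseless input
```

Even at K = 4 the double loop visits 256 packet pairs; the count grows as $2^{2K}$, which is the point made in the complexity discussion.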

We now consider the computational complexity of the XOR packet-optimal decoding algorithm. For each possible $(U^A, U^B)$, we need to sum over N terms in (10) to compute $M(X^A, X^B)$. For a code-rate r code and M-QAM modulation, $N = K/[r \log_2(M)]$. Computing each term in (10) takes six complex operations, and the summation takes $(N - 1)$ operations. Hence the complexity of one combination of $(U^A, U^B)$ is $(7K/[r \log_2(M)] - 1)$. Moreover, to find the maximum in (9), $(2^K - 1)$ comparisons are needed. Given that there are $2^K$ possible $U^R$, from which we want to find the optimal $U^R$, the overall complexity is therefore $2^{2K}(7K/[r \log_2(M)] - 1) + 2^K - 1$. In Big-O notation, the complexity is $O(K 2^{2K})$.
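To get a feel for how quickly this count grows, the expression can be evaluated for a few packet lengths. The Python helper below simply codes the formula from the text, with the code rate given as the integer R = 1/r and the modulation as bits per symbol, $\log_2(M)$; the function name is hypothetical.

```python
def packet_optimal_ops(K, inv_rate, bits_per_symbol):
    """Operation count 2^(2K) * (7N - 1) + 2^K - 1 with N = K * R / log2(M)."""
    N = K * inv_rate // bits_per_symbol
    return 2 ** (2 * K) * (7 * N - 1) + 2 ** K - 1

# Rate-1/3 code with BPSK (one bit per symbol).
for K in (4, 8, 16, 32):
    print(K, packet_optimal_ops(K, inv_rate=3, bits_per_symbol=1))
```

For K = 4 the count is 256 × 83 + 15 = 21263; every additional source bit multiplies the dominant term by roughly four.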

This is in sharp contrast with the situation in the regular point-to-point communication system, in which the Viterbi algorithm for finding the ML codeword has polynomial complexity only. For PNC systems, the complexity of the XOR packet-optimal decoding algorithm is exponential in the packet length K.

B. XOR Bit-Optimal Decoding of Synchronous Convolutional-Coded PNC

To reduce complexity, we consider an XOR bit-optimal Jt-CNC decoder based on the framework of Belief Propagation (BP) algorithms. The proposed decoder aims to find the ML XORed source bit rather than the ML XORed source packet. We give two important results: (i) the proposed Jt-CNC decoder is optimal in terms of BER performance; and (ii) the complexity is linear in packet length K.

Unlike finding ML XORed packets, for which the Viterbi algorithm is of little use, the BP (similar to BCJR) algorithm can find the ML XORed source bit without incurring exponential growth in complexity. We first explain the reason before describing the BP algorithm in detail.

The kth ML XORed source bit $u_k^R$, $k = 1, 2, \ldots, K$, is given by

$$u_k^R = \arg\max_{u_k^R} \sum_{u_k^A, u_k^B :\, u_k^A \oplus u_k^B = u_k^R} \Pr\left(u_k^A, u_k^B \mid Y^R, C\right), \tag{12}$$

[Figure 2: Tanner graph. Labels recovered from the original: state nodes s_0, s_1, ..., s_K; source nodes u_1, u_2, ..., u_K; factor nodes f_1, f_2, ..., f_K; coded-bit nodes c_1, c_2, ..., c_K with their likelihood nodes.]

Fig. 2. Tanner graph of the Jt-CNC decoder on which the BP algorithm operates: $s_k$ is the joint state variable that captures the states of the convolutional codes at the two users at time k; $u_k$ is the pair of source bits of the two users at time k; $c_k$ is the group of channel-coded bits of the two users at time k; $f_i$ is the factor node that represents the state transition function of the Jt-CNC decoder; $\gamma(c_k)$ is the likelihood function of $c_k$.

where $\Pr(u_k^A, u_k^B \mid Y^R, C)$ denotes the a posteriori probability of $(u_k^A, u_k^B)$ given the received signal $Y^R$ and the codebook $C$. We can use the BP algorithm to calculate this probability. Fortunately, finding the ML XORed bits in PNC systems has much lower complexity, because the functional mapping from $(u_k^A, u_k^B)$ to $u_k^R$ can be expressed as

$$f_{\text{bit}} : \{0, 1\} \times \{0, 1\} \to \{0, 1\}. \tag{13}$$
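The bit-wise rule in (12), combined with the mapping in (13), can be sketched as a forward-backward (BCJR-style) pass over the joint trellis of the two encoders, which is what sum-product message passing reduces to on a chain-structured graph. The Python code below is a simplified illustration and not the authors' implementation: the generator taps in `G` are hypothetical, no zero tail or interleaver is used, BPSK with 0 → +1, 1 → −1 is assumed, and both users share the same code.

```python
import itertools
import math

G = ((1, 1, 1), (1, 0, 1), (1, 1, 0))   # hypothetical rate-1/3 generator taps

def step(state, u):
    """One encoder step; state = (previous input bit, the bit before it)."""
    reg = (u,) + state
    out = tuple(sum(g * x for g, x in zip(gen, reg)) % 2 for gen in G)
    return (u, state[0]), out

def encode(bits):
    state, coded = (0, 0), []
    for u in bits:
        state, out = step(state, u)
        coded.extend(out)
    return coded

def xor_bit_decode(y, K, h_a, h_b, sigma2):
    """Per-bit decision on u_A xor u_B via forward-backward on the joint trellis."""
    pairs = list(itertools.product((0, 1), repeat=2))
    joint = list(itertools.product(pairs, pairs))        # 16 joint states
    index = {s: i for i, s in enumerate(joint)}
    branches = []                                         # (from, to, xor bit, ideal samples)
    for s_a, s_b in joint:
        for u_a, u_b in pairs:
            n_a, c_a = step(s_a, u_a)
            n_b, c_b = step(s_b, u_b)
            ideal = [h_a * (1 - 2 * a) + h_b * (1 - 2 * b) for a, b in zip(c_a, c_b)]
            branches.append((index[(s_a, s_b)], index[(n_a, n_b)], u_a ^ u_b, ideal))

    def gamma(k, ideal):
        dist = sum(abs(y[3 * k + j] - ideal[j]) ** 2 for j in range(3))
        return math.exp(-dist / (2 * sigma2))

    S = len(joint)
    alpha = [[0.0] * S for _ in range(K + 1)]
    alpha[0][index[((0, 0), (0, 0))]] = 1.0               # both encoders start at zero
    beta = [[0.0] * S for _ in range(K + 1)]
    beta[K] = [1.0] * S                                   # unterminated trellis
    for k in range(K):                                    # forward recursion
        for i, j, _, ideal in branches:
            alpha[k + 1][j] += alpha[k][i] * gamma(k, ideal)
        total = sum(alpha[k + 1])
        alpha[k + 1] = [a / total for a in alpha[k + 1]]
    for k in range(K - 1, -1, -1):                        # backward recursion
        for i, j, _, ideal in branches:
            beta[k][i] += gamma(k, ideal) * beta[k + 1][j]
        total = sum(beta[k])
        beta[k] = [b / total for b in beta[k]]
    decided = []
    for k in range(K):                                    # marginalise onto the XOR bit
        post = [0.0, 0.0]
        for i, j, x, ideal in branches:
            post[x] += alpha[k][i] * gamma(k, ideal) * beta[k + 1][j]
        decided.append(int(post[1] > post[0]))
    return decided

h_a, h_b = 1 + 0j, 0.7j
u_a, u_b = [1, 0, 1, 1, 0], [0, 0, 1, 0, 1]
y = [h_a * (1 - 2 * a) + h_b * (1 - 2 * b) for a, b in zip(encode(u_a), encode(u_b))]
decoded = xor_bit_decode(y, 5, h_a, h_b, sigma2=0.1)      # noiseless input
```

Each time step touches a fixed number of branches (16 joint states times 4 input pairs in this toy setting), so the work grows linearly with K rather than as $2^{2K}$.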

The mapping in (13) is a 2-to-1 mapping; hence for each possible realization of the XOR bit $u^R_k = u^A_k \oplus u^B_k$, we need to examine only two possible realizations of the pair of source bits $(u^A_k, u^B_k)$. Importantly, the BP algorithm can compute $\Pr(u^A_k, u^B_k \mid Y^R, C)$ easily, from which $\Pr(u^A_k \oplus u^B_k \mid Y^R, C)$ can readily be obtained through the 2-to-1 mapping in (13).

We now explain the details of our BP algorithm that implements the XOR bit-optimal Jt-CNC decoder. BP is a general framework for generating inference-making algorithms for graphical models, in which there are two kinds of nodes: variable nodes and factor nodes. Each variable node represents a variable, such as the state variable of the convolutional encoder; each factor node indicates the relationship among all variable nodes connected to it. For example, the state transition function of a convolutional encoder is represented by a factor node. The goal of BP is to compute the marginal probability distributions $\Pr(u^A_k, u^B_k \mid Y^R, C)$ for all $k$. This goal is achieved by means of a sum-product message-passing algorithm [23].

Fig. 2 shows the Tanner graph of our bit-optimal Jt-CNC

decoder. Unlike the conventional point-to-point convolutional decoder for single-user systems with only one transmitter, the Jt-CNC decoder combines the states and the trellises of both transmitters A and B. Note that node A and node B can use different convolutional codes with the same code rate. In Fig. 2, vector $S = (s_0, s_1, \cdots, s_K)$ represents the state variables, where state $s_k$ combines the states of both end nodes; vector $U = (u_1, u_2, \cdots, u_K)$, where $u_k = (u^A_k, u^B_k)$, represents the "virtual" source packet consisting of the duple of the two source packets from nodes A and B; similarly, vector $C = (c_1, c_2, \cdots, c_K)$, where $c_k = (c^A_k, c^B_k)$ (as defined in (2), $c^i_k$ denotes the group of channel-coded bits of node $i$ at time $k$), represents the "virtual" channel-coded packet. The behavior of the decoder is defined by the factor nodes $f_k(s_{k-1}, u_k, c_k, s_k)$, which represent the state transition rule of the trellis. For a trellis transition $e = (s_{k-1}, u_k, c_k, s_k)$, $f_k(e) = 1$ if $e$ is a valid transition, and $f_k(e) = 0$ otherwise. For example, if input $u_k$ causes a state transition from $s_{k-1}$ to $s_k$ and the output is $c_k$, then $f_k(e) = 1$; on the other hand, if input $u_k$ causes a state transition from $s_{k-1}$ to a state not equal to $s_k$, or the output is not $c_k$, then $f_k(e) = 0$.

Fig. 3. The messages being passed around a factor node during the operation of the sum-product algorithm.

The goal of the Jt-CNC decoder is to find the maximum likelihood XOR bit $u^R_k$ through the a posteriori probability (APP) $\Pr(u_k \mid Y^R, C)$ by

$$\Pr(u^R_k \mid Y^R, C) = \sum_{u_k :\, u^A_k \oplus u^B_k = u^R_k} \Pr(u_k \mid Y^R, C), \tag{14}$$

where $\Pr(u_k \mid Y^R, C)$ can be computed exactly by the sum-product message-passing algorithm thanks to the tree structure of the Tanner graph associated with convolutional codes [24]. The sum-product algorithm, when applied to decode convolutional codes, is the well-known BCJR algorithm [12]. The difference in our situation is that instead of the source bit from one source, we are decoding for the bit duple $u_k = (u^A_k, u^B_k)$ from the two sources.
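As a small illustration of (12) and (14), the sketch below collapses a table of joint APPs into XOR-bit APPs and hard decisions. The array layout `joint_app[k, a, b]` and the function name `xor_bit_decisions` are assumptions made for this example; the numerical values are made up.

```python
import numpy as np

def xor_bit_decisions(joint_app):
    """Collapse joint APPs into XOR-bit APPs and hard decisions.

    joint_app[k, a, b] is assumed to hold Pr(u_k^A = a, u_k^B = b | Y^R, C).
    """
    joint_app = np.asarray(joint_app, dtype=float)
    # u^R = 0 collects the pairs (0,0) and (1,1); u^R = 1 collects (0,1) and (1,0).
    p0 = joint_app[:, 0, 0] + joint_app[:, 1, 1]
    p1 = joint_app[:, 0, 1] + joint_app[:, 1, 0]
    xor_app = np.stack([p0, p1], axis=1)
    xor_app = xor_app / xor_app.sum(axis=1, keepdims=True)
    return xor_app, np.argmax(xor_app, axis=1)

# Two time steps with illustrative (made-up) joint APPs.
joint = np.array([[[0.40, 0.05], [0.10, 0.45]],
                  [[0.10, 0.60], [0.20, 0.10]]])
xor_app, xor_bits = xor_bit_decisions(joint)
```

Each XOR value needs only two joint probabilities, which is the 2-to-1 structure of (13).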

We now explain our sum-product algorithm in detail. Fig. 3 depicts the messages being passed around a factor node within the overall Tanner graph of Fig. 2. We follow the notation of the original paper on the BCJR algorithm [12]. In the forward direction, the message from $s_{k-1}$ to $f_k$ is denoted by $\alpha(s_{k-1})$, and the message from $f_k$ to $s_k$ is denoted by $\alpha(s_k)$. In the backward direction, the message from $s_k$ to $f_k$ is denoted by $\beta(s_k)$, and the message from $f_k$ to $s_{k-1}$ is denoted by $\beta(s_{k-1})$. Additionally, $\gamma(c_k)$ denotes the message from $c_k$ to $f_k$, and $\delta(u_k)$ denotes the message from $f_k$ to $u_k$. Note that $\delta(u_k)$ is the APP $\Pr(u_k \mid Y^R, C)$, and the goal here is to compute it.

Since the Tanner graph of the Jt-CNC decoder is cycle-free, the operation of the sum-product algorithm consists of two natural recursions according to the direction of message flow in the graph: a forward recursion to compute $\alpha(s_k)$ as a function of $\alpha(s_{k-1})$ and $\gamma(c_k)$; and a backward recursion to compute $\beta(s_{k-1})$ as a function of $\beta(s_k)$ and $\gamma(c_k)$. The calculation of $\Pr(u_k \mid Y^R, C)$ can be divided into three steps: initialization, forward/backward recursion, and termination. We present these three steps in detail below.

Initialization: As usual in a cycle-free Tanner graph, the sum-product algorithm begins at the leaf nodes. Since zero-tail convolutional codes are used, the initial and terminal states of the end nodes' convolutional encoders are both the zero state. Therefore the messages $\alpha(s_0)$ and $\beta(s_K)$ are initialized as

$$\alpha(s_0) = \begin{cases} 1 & \text{if } s_0 = 0 \\ 0 & \text{otherwise,} \end{cases} \tag{15a}$$

and

$$\beta(s_K) = \begin{cases} 1 & \text{if } s_K = 0 \\ 0 & \text{otherwise.} \end{cases} \tag{15b}$$

Fig. 4. Symbol misalignment in PNC: a general symbol misalignment consists of an integral part $\tau_I = 2$ and a fractional part $\tau_F = 0.7$.

The message $\gamma(c_k)$ is the likelihood function of $c_k$ based on the evidence $Y^R$. For example, if the code rate is 1/3 and BPSK modulation is used, then $c_k = (c^A_{k,1}, c^A_{k,2}, c^A_{k,3}, c^B_{k,1}, c^B_{k,2}, c^B_{k,3})$. These channel-coded bits $c_k$ are mapped to BPSK modulated symbols $(x^A_{3k-2}, x^A_{3k-1}, x^A_{3k})$ and $(x^B_{3k-2}, x^B_{3k-1}, x^B_{3k})$ at node A and node B, respectively. Given the overlapped signal $Y^R$ at the relay node, the likelihood of $c_k$ is calculated by

$$\gamma(c_k) \propto \prod_{j=1}^{3} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{\left| y^R_{3(k-1)+j} - h_A x^A_{3(k-1)+j} - h_B x^B_{3(k-1)+j} \right|^2}{2\sigma^2} \right\}. \tag{16}$$
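The likelihood in (16) can be tabulated for every candidate $c_k$ of one trellis step. The following is a minimal sketch under stated assumptions: the function name `gamma_ck` and its argument layout are hypothetical, the constant factor $1/\sqrt{2\pi\sigma^2}$ is dropped because (16) is only a proportionality, and the BPSK mapping $x = 1 - 2c$ is the one used later in the text.

```python
import itertools
import numpy as np

def gamma_ck(y, h_a, h_b, sigma2):
    """Unnormalized likelihood of every candidate c_k = (c^A_k, c^B_k).

    y holds the n received samples of trellis step k (n = 3 for rate 1/3).
    Keys of the returned dict are bit tuples (c^A_k,1..n, c^B_k,1..n).
    """
    y = np.asarray(y, dtype=complex)
    n = len(y)
    table = {}
    for bits in itertools.product((0, 1), repeat=2 * n):
        c_a, c_b = np.array(bits[:n]), np.array(bits[n:])
        x_a, x_b = 1 - 2 * c_a, 1 - 2 * c_b            # BPSK mapping x = 1 - 2c
        dist = np.abs(y - h_a * x_a - h_b * x_b) ** 2  # per-sample squared distance
        table[bits] = float(np.exp(-dist.sum() / (2 * sigma2)))
    return table

# Noise-free samples for c^A = (0,1,0), c^B = (1,1,0) with h_A = h_B = 1.
table = gamma_ck([0.0, -2.0, 2.0], 1.0, 1.0, 0.5)
```

Note that a sample equal to zero is explained equally well by the symbol pairs $(+1, -1)$ and $(-1, +1)$, so several candidates can share the largest likelihood; the trellis constraints resolve what matters for the XOR bit.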

Forward/backward recursion: After initializing the messages from the leaf nodes, we can compute the messages $\alpha(s_k)$ and $\beta(s_k)$ recursively by following the message update rule below [24]:

$$\alpha(s_k) = \sum_{s_{k-1}, u_k, c_k} f_k(s_{k-1}, u_k, c_k, s_k)\, \alpha(s_{k-1})\, \gamma(c_k), \tag{17a}$$

$$\beta(s_{k-1}) = \sum_{s_k, u_k, c_k} f_k(s_{k-1}, u_k, c_k, s_k)\, \beta(s_k)\, \gamma(c_k). \tag{17b}$$

Termination: In the final step, the algorithm terminates with the computation of $\delta(u_k)$, which gives the APP of the source bit pair $u_k$:

$$\delta(u_k) = \sum_{s_{k-1}, s_k, c_k} f_k(s_{k-1}, u_k, c_k, s_k)\, \alpha(s_{k-1})\, \gamma(c_k)\, \beta(s_k). \tag{18}$$

The summation in (18) is over the different trellis transitions $e = (s_{k-1}, u_k, c_k, s_k)$ with fixed $u_k$, where $f_k(e) = 1$ if $e$ is a valid transition and $f_k(e) = 0$ otherwise.
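The three steps (15)-(18) can be sketched end to end for a toy configuration. The code below is an illustrative sketch, not the implementation used in our experiments: it assumes both users employ the (5, 7) code with zero tailing, BPSK, real-valued channel gains, and per-step normalization of the messages for numerical stability. All names (`step`, `encode`, `EDGES`, `jt_cnc_decode`) are hypothetical.

```python
import itertools
import numpy as np

G = (0b101, 0b111)   # (5, 7) generators, assumed for both users
MEM = 2              # encoder memory: 4 states per user, 16 joint states
NS = 1 << MEM

def step(state, u):
    """One encoder step: returns (next_state, output bits)."""
    reg = (u << MEM) | state                       # newest bit in the MSB
    out = tuple(bin(reg & g).count("1") & 1 for g in G)
    return reg >> 1, out

def encode(bits):
    """Zero-tailed encoding of one user's source packet."""
    state, out = 0, []
    for u in [int(b) for b in bits] + [0] * MEM:
        state, c = step(state, u)
        out.extend(c)
    return np.array(out)

# Valid joint transitions, i.e. the edges e with f_k(e) = 1.
EDGES = []
for sa, sb, ua, ub in itertools.product(range(NS), range(NS), (0, 1), (0, 1)):
    na, ca = step(sa, ua)
    nb, cb = step(sb, ub)
    EDGES.append((sa * NS + sb, ua, ub, na * NS + nb, ca, cb))

def jt_cnc_decode(y, sigma2, h_a=1.0, h_b=1.0):
    """Return delta[k, uA, uB], the joint APPs of (18), for every trellis step."""
    n = len(G)
    y = np.asarray(y, dtype=float).reshape(-1, n)
    T = y.shape[0]
    prev = np.array([e[0] for e in EDGES])
    nxt = np.array([e[3] for e in EDGES])
    ua = np.array([e[1] for e in EDGES])
    ub = np.array([e[2] for e in EDGES])
    mean = np.array([h_a * (1 - 2 * np.array(e[4])) + h_b * (1 - 2 * np.array(e[5]))
                     for e in EDGES])
    # gamma[k, e]: likelihood (16) of the coded bits on edge e at step k.
    gamma = np.exp(-((y[:, None, :] - mean[None]) ** 2).sum(-1) / (2 * sigma2))
    alpha = np.zeros((T + 1, NS * NS)); alpha[0, 0] = 1.0   # (15a)
    beta = np.zeros((T + 1, NS * NS)); beta[T, 0] = 1.0     # (15b)
    for k in range(T):                                      # (17a)
        np.add.at(alpha[k + 1], nxt, alpha[k, prev] * gamma[k])
        alpha[k + 1] /= alpha[k + 1].sum()
    for k in range(T - 1, -1, -1):                          # (17b)
        np.add.at(beta[k], prev, beta[k + 1, nxt] * gamma[k])
        beta[k] /= beta[k].sum()
    delta = np.zeros((T, 2, 2))
    for k in range(T):                                      # (18)
        np.add.at(delta[k], (ua, ub), alpha[k, prev] * gamma[k] * beta[k + 1, nxt])
        delta[k] /= delta[k].sum()
    return delta
```

The XOR-bit APP of (14) then follows as `delta[k, 0, 1] + delta[k, 1, 0]` for $u^R_k = 1$. With equal channel gains the individual packets cannot be told apart (swapping the users gives the same received signal), but the XOR packet is still determined.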

Fig. 5. Decoding framework for asynchronous convolutional-coded PNC systems. This framework can deal with an integral-plus-fractional symbol misalignment.

Let us consider the computational complexity of the XOR bit-optimal decoding algorithm. The initialization takes $9K/[r\log_2(M)]$ complex operations in (16). The forward/backward recursion step takes $6 \cdot 2^{2/r} K S^2$ operations, where $S$ is the number of the decoder's states. The termination step for computing (18) takes $4 \cdot 2^{2/r} K S^2$ operations. Because finding a single ML XORed source bit in (12) takes two summations and one comparison, finding all the ML XORed source bits takes $3K$ operations beyond the operations of the BP algorithm that computes $\Pr(u^A_k, u^B_k \mid Y^R, C)$, $k = 1, 2, \ldots, K$. Therefore, finding the ML XORed source bits of length-$K$ packets has an overall complexity of $9K/[r\log_2(M)] + 6 \cdot 2^{2/r} K S^2 + 4 \cdot 2^{2/r} K S^2 + 3K$. In Big-O notation, for fixed $r$, $M$, and $S$, the complexity of the XOR bit-optimal decoding algorithm is $O(K)$. Compared with the XOR packet-optimal decoding algorithm, the XOR bit-optimal algorithm has a much lower complexity and is therefore more feasible in practice.
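The operation count above is easy to evaluate numerically. The sketch below simply codes the four terms; the function name `jt_cnc_operations` and the example parameters (rate 1/2, BPSK, 16 decoder states) are assumptions chosen for illustration.

```python
import math

def jt_cnc_operations(K, r, M, S):
    """Operation count of the XOR bit-optimal decoder, term by term."""
    init = 9 * K / (r * math.log2(M))            # likelihoods in (16)
    recursion = 6 * 2 ** (2 / r) * K * S ** 2    # forward/backward recursions (17)
    termination = 4 * 2 ** (2 / r) * K * S ** 2  # APP computation (18)
    xor_step = 3 * K                             # two sums and one comparison per bit
    return init + recursion + termination + xor_step

# Assumed example: rate-1/2 code, BPSK (M = 2), S = 16 decoder states.
ops_1000 = jt_cnc_operations(K=1000, r=1 / 2, M=2, S=16)
ops_2000 = jt_cnc_operations(K=2000, r=1 / 2, M=2, S=16)
```

Doubling $K$ doubles the count, which is the linear growth stated above; the constant in front of $K$ is dominated by the $2^{2/r} S^2$ factor.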

V. ASYNCHRONOUS CONVOLUTIONAL-CODED PNC

In this section, we present our three-layer decoding framework for asynchronous convolutional-coded PNC. The asynchrony causes unique challenges that the synchronous decoder in Section IV cannot handle. As shown in Fig. 4, when the signals of nodes A and B arrive at the relay at different times, their symbols can be misaligned. The symbol misalignment consists of two parts: an integral part $\tau_I$ and a fractional part $\tau_F$. These two components impose different challenges: the fractional symbol misalignment causes overlaps of adjacent symbols, and as a result the symbol-boundary preserving sampling as expressed in (8) is no longer valid; the integral symbol misalignment entangles the channel-coded bits of nodes A and B in such a way that the decoding scheme proposed in Section IV can no longer be applied.

To address these challenges, we add two layers to the Jt-CNC decoder to construct an integrated framework, as illustrated in Fig. 5. First, to address the fractional symbol misalignment, the symbol-realignment layer uses a BP algorithm at the relay to "realign" the soft information of the symbols. Second, the codeword-realignment layer uses an interleaver/deinterleaver set-up to accommodate the integral symbol misalignment. As a result, the three-layer decoding framework can deal with the integral-plus-fractional symbol misalignment.

A. Symbol-Realignment Layer: Addressing Fractional Symbol Misalignment

For simplicity, as in [5], we assume the use of a rectangular pulse to carry the modulated signal in the analog domain. As illustrated in Fig. 4, the fractional symbol misalignment $\tau_F$ causes inter-symbol interference between A's and B's signals (e.g., $x^B_1$ overlaps with part of $x^A_3$ and part of $x^A_4$). In [25], suboptimal sampling was assumed: with respect to Fig. 4, only the overlapped part of $x^B_1$ and $x^A_3$ is sampled, and the useful signal in the overlapped part of $x^B_1$ and $x^A_4$ is discarded. By contrast, our method here is an optimal maximum-likelihood (ML) oversampling method based on the BP algorithm. Specifically, the relay R performs integration (matched filtering) on the overlapped symbols for a duration of $\tau_F$ and a duration of $(1 - \tau_F)$ alternately to generate $(2N + 1)$ samples. Our oversampling method makes full use of the overlapped signal and is ML optimal.

Fig. 6. Tanner graph of the symbol-realignment layer. The variable node $x_{i,j}$ corresponds to the joint symbol $(x^A_i, x^B_j)$; nodes $\psi_o$ and $\psi_e$ are the factor nodes that constrain the relationships among the variable nodes. The likelihood probabilities $\Pr(y^R_{2n-1} \mid x_{n,n-1})$ and $\Pr(y^R_{2n} \mid x_{n,n})$ are the evidences from observation $Y^R$.

Let us first ignore the integral part of symbol misalignment and only consider the fractional part (i.e., $\tau < 1$) in this subsection. When the integral part of symbol misalignment is not zero, the factor graphs follow different equations for the non-overlapping parts of the received signal. Because the non-overlapping parts are "clear" signal, we use the traditional sampling technique for them. The soft information of the non-overlapping parts of the signal can be computed directly, so we focus only on the Tanner graph of the overlapped signal. Furthermore, let us assume $|h_A| = |h_B| = \sqrt{P}$, where $P$ is the transmission power of the end nodes. The design of this layer is similar to the decoding method of asynchronous non-channel-coded PNC expounded in [5], [26]. It is provided here for completeness and to illustrate how this layer is tied to the upper layer.

With over-sampling on the received signal $y^R(t)$, the total number of samples obtained per packet is $(2N + 1)$, where $N$ is the number of symbols per packet (for both users A and B). The relay uses the $(2N + 1)$ samples to compute the soft information $\Pr(x^A_n, x^B_n \mid Y^R)$, where instead of the expression in (7), $Y^R = (y^R_1, y^R_2, \cdots, y^R_{2N+1})$ consists of the $(2N + 1)$ samples. Thus, as far as the soft information fed to the upper layer is concerned, the fractional symbol misalignment is removed and the symbols are realigned. We emphasize that this realignment of soft information is a key step. Once that is done, the channel decoding algorithm for synchronous PNC proposed in Section IV can be applied.

We can write the samples obtained at the relay R as follows (after normalization):

$$y^R_{2n-1} = x^A_n + x^B_{n-1} e^{j\phi} + w^R_{2n-1}, \tag{19a}$$

$$y^R_{2n} = x^A_n + x^B_n e^{j\phi} + w^R_{2n}, \tag{19b}$$

where $n = 1, 2, \ldots, N$, $x^B_0 = 0$, and $y^R_{2N+1} = x^B_N e^{j\phi} + w^R_{2N+1}$. The terms $w^R_{2n-1}$ and $w^R_{2n}$ are zero-mean complex Gaussian noise with variances $\sigma^2/(\tau_F P)$ and $\sigma^2/((1 - \tau_F)P)$ per dimension, respectively [5].
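To make the sample model (19) concrete, the following sketch generates the $(2N + 1)$ normalized samples for given symbol sequences. The function name `oversample` and its parameters are hypothetical; the noise variances per dimension follow the values stated above.

```python
import numpy as np

def oversample(x_a, x_b, phi, tau_f, sigma2, power=1.0, rng=None):
    """Generate the 2N+1 normalized samples of (19a)-(19b)."""
    rng = np.random.default_rng() if rng is None else rng
    x_a, x_b = np.asarray(x_a), np.asarray(x_b)
    N = len(x_a)
    rot = np.exp(1j * phi)
    xb_prev = np.concatenate(([0], x_b[:-1]))     # x^B_0 = 0
    y = np.zeros(2 * N + 1, dtype=complex)
    y[0:2 * N:2] = x_a + xb_prev * rot            # y_{2n-1}, n = 1..N
    y[1:2 * N:2] = x_a + x_b * rot                # y_{2n},   n = 1..N
    y[2 * N] = x_b[-1] * rot                      # y_{2N+1}
    var = np.empty(2 * N + 1)
    var[0::2] = sigma2 / (tau_f * power)          # odd-numbered samples (1-based)
    var[1::2] = sigma2 / ((1 - tau_f) * power)    # even-numbered samples (1-based)
    noise = np.sqrt(var) * (rng.standard_normal(2 * N + 1)
                            + 1j * rng.standard_normal(2 * N + 1))
    return y + noise

# Noise-free check with three BPSK symbols per user.
samples = oversample([1, -1, 1], [-1, -1, 1], phi=0.0, tau_f=0.7, sigma2=0.0)
```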

We use a BP algorithm to compute the soft information $\Pr(x^A_n, x^B_n \mid Y^R)$ from the $(2N + 1)$ samples. The associated Tanner graph is shown in Fig. 6. In the Tanner graph, $x_{i,j} \triangleq (x^A_i, x^B_j)$ are the variable nodes; $\psi_o$ and $\psi_e$ are the compatibility functions associated with the factor nodes. The compatibility functions model the correlation between two adjacent symbols and are defined as

$$\psi_o(x_{n,n-1}, x_{n,n}) = \begin{cases} 1 & \text{if the values of } x^A_n \text{ in } x_{n,n-1} \text{ and } x_{n,n} \text{ are equal} \\ 0 & \text{otherwise,} \end{cases} \tag{20a}$$

$$\psi_e(x_{n,n}, x_{n+1,n}) = \begin{cases} 1 & \text{if the values of } x^B_n \text{ in } x_{n,n} \text{ and } x_{n+1,n} \text{ are equal} \\ 0 & \text{otherwise.} \end{cases} \tag{20b}$$

Note that the Tanner graph in Fig. 6 has a tree structure; hence the BP algorithm can find the "exact" a posteriori probability $\Pr(x_{n,n} \mid Y^R)$ for $n = 1, \ldots, N$. Furthermore, the solution can be found after only one iteration of the message-passing algorithm [24]. We now describe the message-passing algorithm in detail: $\alpha_L(x_{i,j})$ denotes the forward message from the factor node ($\psi_o$ or $\psi_e$) to variable node $x_{i,j}$, and $\alpha_R(x_{i,j})$ denotes the forward message from variable node $x_{i,j}$ to the factor node; $\beta_R(x_{i,j})$ denotes the backward message from the factor node to variable node $x_{i,j}$, and $\beta_L(x_{i,j})$ denotes the backward message from variable node $x_{i,j}$ to the factor node; $\gamma(x_{i,j})$ denotes the evidence of variable node $x_{i,j}$ from the observation. Let us first consider variable node $x_{n,n}$. The forward messages from left to right can be computed as

$$\alpha_L(x_{n,n}) = \sum_{x_{n,n-1}} \alpha_R(x_{n,n-1})\, \psi_o(x_{n,n-1}, x_{n,n}), \tag{21a}$$

$$\alpha_R(x_{n,n}) = \alpha_L(x_{n,n})\, \gamma(x_{n,n}). \tag{21b}$$

The backward messages from right to left can be computed as

$$\beta_R(x_{n,n}) = \sum_{x_{n+1,n}} \beta_L(x_{n+1,n})\, \psi_e(x_{n,n}, x_{n+1,n}), \tag{22a}$$

$$\beta_L(x_{n,n}) = \beta_R(x_{n,n})\, \gamma(x_{n,n}). \tag{22b}$$

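We compute the forward/backward messages of variable node $x_{n+1,n}$ in the same way, except that the factor nodes need to be modified accordingly. The likelihood probabilities $\Pr(y^R_{2n+1} \mid x_{n+1,n})$ and $\Pr(y^R_{2n} \mid x_{n,n})$ are the evidences $\gamma(x_{n+1,n})$ and $\gamma(x_{n,n})$ from observation $Y^R$. The computation of these evidences is given by

$$\gamma(x_{n+1,n}) = \Pr(y^R_{2n+1} \mid x_{n+1,n}) = \Pr(y^R_{2n+1} \mid x^A_{n+1}, x^B_n) = \frac{1}{2\pi\sigma^2/\tau_F} \exp\left\{ -\frac{\left| y^R_{2n+1} - x^A_{n+1} - x^B_n \right|^2}{2\sigma^2/\tau_F} \right\} \tag{23a}$$

and

$$\gamma(x_{n,n}) = \Pr(y^R_{2n} \mid x_{n,n}) = \Pr(y^R_{2n} \mid x^A_n, x^B_n) = \frac{1}{2\pi\sigma^2/(1 - \tau_F)} \exp\left\{ -\frac{\left| y^R_{2n} - x^A_n - x^B_n \right|^2}{2\sigma^2/(1 - \tau_F)} \right\}. \tag{23b}$$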

Finally, the soft information $\Pr(x^A_n, x^B_n \mid Y^R)$ is computed by

$$\Pr(x^A_n, x^B_n \mid Y^R) = \alpha_L(x_{n,n})\, \beta_R(x_{n,n})\, \gamma(x_{n,n}). \tag{24}$$
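The message passing in (21)-(24) amounts to one forward and one backward sweep along a chain of $(2N + 1)$ joint-symbol nodes. The sketch below illustrates this for BPSK under stated assumptions: unit transmit power, a known phase offset `phi` applied to B's symbols in the evidence (set to zero to match (23) as written), per-node normalization of messages, and boundary nodes $x_{1,0}$ and $x_{N+1,N}$ handled by ignoring the symbol that does not exist. The function name `realign` and the array layout are hypothetical.

```python
import numpy as np

SYM = np.array([1.0, -1.0])   # BPSK value for bit 0 / bit 1 (x = 1 - 2c)

def realign(y, tau_f, sigma2, phi=0.0):
    """Return post[n, a, b] ~ Pr(x^A_n, x^B_n | Y^R) for n = 1..N (0-based rows)."""
    y = np.asarray(y, dtype=complex)
    T = len(y)                                   # T = 2N + 1 chain nodes
    a = SYM[:, None]                             # candidate x^A values (rows)
    b = SYM[None, :] * np.exp(1j * phi)          # candidate x^B values (columns)
    gam = np.empty((T, 2, 2))
    for t in range(T):
        var = sigma2 / tau_f if t % 2 == 0 else sigma2 / (1 - tau_f)
        mean = a * (t < T - 1) + b * (t > 0)     # x^B_0 and x^A_{N+1} do not exist
        gam[t] = np.exp(-np.abs(y[t] - mean) ** 2 / (2 * var))
    fwd = np.ones((T, 2, 2))
    bwd = np.ones((T, 2, 2))
    for t in range(1, T):                        # forward sweep, cf. (21)
        m = fwd[t - 1] * gam[t - 1]
        if (t - 1) % 2 == 0:                     # psi_o: neighbours share x^A
            fwd[t] = m.sum(axis=1, keepdims=True) * np.ones((1, 2))
        else:                                    # psi_e: neighbours share x^B
            fwd[t] = m.sum(axis=0, keepdims=True) * np.ones((2, 1))
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):               # backward sweep, cf. (22)
        m = bwd[t + 1] * gam[t + 1]
        if t % 2 == 0:                           # psi_o: neighbours share x^A
            bwd[t] = m.sum(axis=1, keepdims=True) * np.ones((1, 2))
        else:                                    # psi_e: neighbours share x^B
            bwd[t] = m.sum(axis=0, keepdims=True) * np.ones((2, 1))
        bwd[t] /= bwd[t].sum()
    post = fwd[1::2] * bwd[1::2] * gam[1::2]     # (24) at the nodes x_{n,n}
    return post / post.sum(axis=(1, 2), keepdims=True)
```

Because the end samples $y^R_1$ and $y^R_{2N+1}$ each involve a single symbol, the chain can resolve pairs that a single aligned sample would leave ambiguous (for example a zero-valued sum of two BPSK symbols).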


The computation of the forward messages in (21a) and (21b) takes $2NM^2(2M^2 - 1)$ and $2NM^2$ complex operations, respectively, where $N$ is the number of symbols per packet and $M$ is the modulation order. The same computation is needed for the backward messages in (22a) and (22b). Computing the evidences in (23a) and (23b) takes $(2N + 1) \cdot 7M^2$ complex operations. Equation (24) takes another $2NM^2$ complex operations. Therefore, the overall computational complexity of the symbol-realignment layer is $4NM^2(2M^2 - 1) + 4NM^2 + (2N + 1) \cdot 7M^2 + 2NM^2$ complex operations. In Big-O notation, the complexity is $O(N)$. Since the complexity is linear in the packet size $N$, the symbol-realignment layer does not incur heavy overhead.

B. Codeword-Realignment Layer: Countering Integral Symbol Misalignment

Since the fractional part of symbol misalignment has been removed in the symbol-realignment layer, we only consider the integral part of symbol misalignment in this subsection. Recall that in Section IV we used (16) to compute the message $\gamma(c_k)$. Equation (16) requires the modulated symbols of end nodes A and B to be symbol-by-symbol aligned (i.e., $x^A_n$ must align with $x^B_n$). However, with integral symbol misalignment $\tau_I$, $x^A_n$ will be aligned with $x^B_{n-\tau_I}$; consequently, the algorithm proposed in Section IV becomes invalid.

The codeword-realignment layer solves this problem using a specially designed interleaver/deinterleaver at the end/relay nodes. At the end nodes, we use the same block interleaver with $R$ rows and $M/R$ columns, where $r = 1/R$ is the code rate and $M$ is the number of bits in the codeword. For interleaving, the channel-coded bits are filled into the interleaver column-wise and read out row-wise. Let $\Pi$ denote the interleave operation. The interleaving process is

$$\Pi(C^A) = \tilde{C}^A, \quad \Pi(C^B) = \tilde{C}^B. \tag{25}$$
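The block interleaver of (25) can be written in a few lines. This is a minimal sketch with hypothetical function names; it assumes the codeword is stored as a flat array whose length is a multiple of $R$.

```python
import numpy as np

def interleave(c, R):
    """Fill an R x (len(c)/R) array column-wise, then read it out row-wise."""
    return np.asarray(c).reshape(-1, R).T.reshape(-1)

def deinterleave(c_tilde, R):
    """Inverse operation: fill row-wise, read out column-wise."""
    return np.asarray(c_tilde).reshape(R, -1).T.reshape(-1)

codeword = np.arange(12)              # stand-in for 4 groups of R = 3 coded bits
interleaved = interleave(codeword, 3)
```

For a rate-1/3 code, the first third of the interleaved packet carries the first coded bit of every input, the second third the second coded bit, and so on.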

The interleaved packets $\tilde{C}^A$ and $\tilde{C}^B$ are modulated and transmitted simultaneously to the relay. Upon receiving the overlapped signal (with symbol misalignment), the relay first deals with the fractional symbol misalignment using the algorithm proposed in Section V-A, leaving only the integral symbol misalignment $\tau_I$. Then the relay wraps back the nonoverlapped signal at the tail to the head. The overlapped signal becomes $\tilde{C}^A + \tilde{C}^B_{(\tau_I)}$, where $\tilde{C}^B_{(\tau_I)}$ is the $\tau_I$-bit circular-shifted version of $\tilde{C}^B$. Finally, the relay uses the same block deinterleaver to deinterleave the overlapped signal as

$$\Pi^{-1}\left(\tilde{C}^A + \tilde{C}^B_{(\tau_I)}\right) = \Pi^{-1}\left(\tilde{C}^A\right) + \Pi^{-1}\left(\tilde{C}^B_{(\tau_I)}\right) = C^A + C^B_{(\tau_I R)}. \tag{26}$$

Let us consider an example with a convolutional code of code rate 1/3, BPSK modulation, and integral symbol misalignment of $\tau_I = 2$ (in this subsection, we only consider the integral part of $\tau$). As specified in Section III, the channel-coded packets of node A and node B are $C^A$ and $C^B$, respectively. Each channel-coded packet is then bit-interleaved with a block interleaver with 3 rows and $M/3$ columns. The interleaved packets $\tilde{C}^i$, $i \in \{A, B\}$ (the interleaved versions of the packets in (3)), are BPSK modulated to produce the transmitted signal

$$X^i = \left(x^i_{1,1}, x^i_{2,1}, \cdots, x^i_{K,1}, x^i_{1,2}, \cdots, x^i_{K,2}, x^i_{1,3}, \cdots, x^i_{K,3}\right), \quad i \in \{A, B\}, \tag{27}$$

where $x^i_{k,j} = 1 - 2c^i_{k,j}$, and $c^i_{k,j}$ is the $j$th channel-coded bit of node $i$'s channel-coded packet at time $k$ as defined in (3).

TABLE I. DIFFERENT TECHNIQUES TO MAKE THE CONVOLUTIONAL CODE'S INITIAL STATE AND TERMINAL STATE THE SAME.

Code             Technique
Non-recursive    zero tailing or tail biting
Recursive        zero tailing

The received signal samples are the superposition of the following two sequences:

$$\begin{array}{cccccccc} (x^A_{1,1}) & (x^A_{2,1}) & x^A_{3,1} & x^A_{4,1} & \cdots & x^A_{K,3} & & \\ & & x^B_{1,1} & x^B_{2,1} & \cdots & x^B_{K-2,3} & (x^B_{K-1,3}) & (x^B_{K,3}) \end{array} \tag{28}$$

The relay first wraps back the nonoverlapped signals: $x^A_{1,1} x^A_{2,1}$ at the head and $x^B_{K-1,3} x^B_{K,3}$ at the tail (enclosed in brackets in (28) above). After the wrap-back, the realigned sequences look like

$$\begin{array}{ccccccc} (x^A_{1,1}) & (x^A_{2,1}) & x^A_{3,1} & x^A_{4,1} & x^A_{5,1} & \cdots & x^A_{K,3} \\ (x^B_{K-1,3}) & (x^B_{K,3}) & x^B_{1,1} & x^B_{2,1} & x^B_{3,1} & \cdots & x^B_{K-2,3} \end{array} \tag{29}$$

Then the relay deinterleaves the signal in (29) to restore node A's transmission order. After deinterleaving, the received packet becomes the superposition of the following sequences:

$$\underbrace{\begin{matrix} (x^A_{1,1}) & x^A_{1,2} & x^A_{1,3} \\ (x^B_{K-1,3}) & x^B_{K-1,1} & x^B_{K-1,2} \end{matrix}}_{\text{block-1}} \quad \underbrace{\begin{matrix} (x^A_{2,1}) & x^A_{2,2} & x^A_{2,3} \\ (x^B_{K,3}) & x^B_{K,1} & x^B_{K,2} \end{matrix}}_{\text{block-2}} \quad \underbrace{\begin{matrix} x^A_{3,1} & x^A_{3,2} & x^A_{3,3} \\ x^B_{1,1} & x^B_{1,2} & x^B_{1,3} \end{matrix}}_{\text{block-3}} \quad \begin{matrix} \cdots \\ \cdots \end{matrix} \tag{30}$$

We group the received packet into $K$ blocks, with each block-$k$ containing the $R = 3$ coded symbols of the $k$th input. The signal in (30) is equivalent to the superposition of the modulated signals of $C^A$ and $C^B_{(6)}$, where, except for the first two blocks, $C^B_{(6)}$ is the 6-bit right circular-shifted version of $C^B$.

The symbols of the first two blocks are out of order due to the larger-than-one symbol misalignment. However, the disorder does not hinder our decoding because we can still compute the likelihoods of the first and second blocks. For example, the likelihood of the first block can be computed as follows:
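The block structure in (30) can be checked numerically on a small index sequence. The sketch below is illustrative only: it uses integer labels in place of coded bits, a hypothetical packet of $K = 8$ inputs, and NumPy's `roll` as the right circular shift.

```python
import numpy as np

R, K, tau_i = 3, 8, 2
c = np.arange(R * K)                              # labels of node B's coded bits
c_tilde = c.reshape(K, R).T.reshape(-1)           # interleave: column-wise in, row-wise out
shifted = np.roll(c_tilde, tau_i)                 # tail wrapped back to the head
restored = shifted.reshape(R, K).T.reshape(-1)    # deinterleave at the relay
target = np.roll(c, tau_i * R)                    # C^B shifted right by tau_I * R bits

# Compare block by block (each block holds the R coded bits of one input).
blocks_match = (restored.reshape(K, R) == target.reshape(K, R)).all(axis=1)
first_block_same_bits = sorted(restored[:R].tolist()) == sorted(target[:R].tolist())
```

All blocks after the first $\tau_I$ coincide with the shifted codeword, while the first $\tau_I$ blocks contain the right bits in a rotated order, which is why their likelihoods are assembled separately in (31).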

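$$\gamma\left(c^A_1 c^B_{K-1}\right) = \gamma\left(x^A_{1,1} x^A_{1,2} x^A_{1,3} x^B_{K-1,1} x^B_{K-1,2} x^B_{K-1,3}\right) \propto \gamma\left(x^A_{1,1}\right) \gamma\left(x^B_{K-1,3}\right) \gamma\left(x^A_{1,2} x^B_{K-1,1}\right) \gamma\left(x^A_{1,3} x^B_{K-1,2}\right). \tag{31}$$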

To compute the likelihood in (31), we first compute the likelihoods in the second line. We can obtain $\gamma(x^A_{1,1})$, $\gamma(x^B_{K-1,3})$, $\gamma(x^A_{1,2} x^B_{K-1,1})$, and $\gamma(x^A_{1,3} x^B_{K-1,2})$ from the symbol-realignment layer as shown in Fig. 6. After normalization, we get the likelihood of block-1, $\gamma(c^A_1 c^B_{K-1})$. We can compute the likelihood of block-2 in the same manner. The likelihood of block-$k$ when $k > \tau_I$ can be computed using (16).

We then pass the likelihoods of the $K$ blocks to the upper layer, the Jt-CNC decoder, as the message $\gamma(c_k)$. In Appendix I we prove that if a convolutional code $C$ has the same initial state and terminal state, then $C_{(\tau_I R)}$, the $\tau_I R$-bit circular-shifted version of $C$, can be decoded to $U_{(\tau_I)}$. To ensure that the initial state equals the terminal state, we use different techniques for different kinds of convolutional codes, as shown in Table I.

Fig. 7. BER performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) algorithms for PNC. The channel codes are (5, 7) and (13, 15, 17) convolutional codes. We use BPSK modulation and assume AWGN channel.

As a result, in the presence of integral symbol misalignment, our XOR bit-optimal decoding algorithm will output $U^R = U^A \oplus U^B_{(\tau_I)}$. However, node A (B) can still decode the information of node B (A). In the downlink phase, the relay broadcasts the XOR message $U^R$ together with the value of $\tau_I$ to both end nodes. Node A can first XOR $U^R$ with its own packet to obtain $U^B_{(\tau_I)}$, then left shift it by $\tau_I$ bits to restore $U^B$. Node B can first right shift its own packet to obtain $U^B_{(\tau_I)}$ and then XOR it with $U^R$ to obtain $U^A$.
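The downlink recovery at the two end nodes can be sketched as follows. The function names are hypothetical, and the right circular shift is modeled with NumPy's `roll` with a positive argument, which is an assumption about the shift convention.

```python
import numpy as np

def recover_at_a(u_r, u_a, tau_i):
    """Node A: XOR with its own packet, then left shift by tau_I bits."""
    return np.roll(u_r ^ u_a, -tau_i)

def recover_at_b(u_r, u_b, tau_i):
    """Node B: right shift its own packet by tau_I bits, then XOR."""
    return np.roll(u_b, tau_i) ^ u_r

rng = np.random.default_rng(0)
u_a = rng.integers(0, 2, 16)
u_b = rng.integers(0, 2, 16)
tau_i = 2
u_r = u_a ^ np.roll(u_b, tau_i)        # what the relay broadcasts, U^A xor U^B_(tau_I)
u_b_at_a = recover_at_a(u_r, u_a, tau_i)
u_a_at_b = recover_at_b(u_r, u_b, tau_i)
```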

VI. NUMERICAL RESULTS

We evaluate the performance of the proposed PNC decoding framework under the AWGN channel by extensive simulations and analysis. First, we compare the BER performances of Jt-CNC, XOR-CD Viterbi, and full-state Viterbi algorithms in synchronous PNC under both AWGN channel and Rayleigh fading channel. Second, we investigate the effect of phase offset on our Jt-CNC decoder. Third, we present the performance of the three algorithms in the presence of symbol asynchrony. Furthermore, we implement the three algorithms in a real PNC prototype built on the USRP software radio platform, and test them in an indoor environment.

A. BER Performance Comparison and Analysis

1) AWGN Channel: We compare the BER performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) in synchronous PNC under AWGN channel. The XOR-CD Viterbi algorithm and full-state Viterbi algorithm were introduced in Section II and elaborated in Appendix III. In the simulations, we adopt convolutional codes of two different code rates: the rate-1/2 (5, 7) code and the rate-1/3 (13, 15, 17) code. We consider BPSK modulation, assuming AWGN channel.

We plot the BER curve of the full-state Viterbi algorithm as a benchmark for the reduced-state Viterbi algorithm in [13]. In our attempt to replicate the reduced-state Viterbi algorithm, we could not get the same simulation results as in [13] even though we followed the exact specification described in the paper.² Our simulation results are somewhat better than those presented in [13]. To avoid misrepresenting their results, here we just compare the results of full-state Viterbi with Jt-CNC.

² We believe that there are errors in equation (12) and Fig. 3 in [13]. We suspect that in [13], the SNR was not normalized correctly. Our attempt to contact the authors of [13] by email received no reply.

Fig. 8. Frame error rate (FER) performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) algorithms for synchronous PNC. The channel code used is the (5, 7) convolutional code. We use BPSK modulation and assume AWGN channel.

As shown in Fig. 7, Jt-CNC has slightly better BER performance than FSV. In Section II-A and also in Appendix III-B, we explained that FSV is an approximation to the XOR packet-optimal decoding algorithm based on the log-max approximation. As such, FSV is not exactly XOR-packet optimal. The approximation is shown in equation (III.51). During the simplification in (III.51), some possible combinations of $\{U^A, U^B\}$ yielding the same XOR packet $U^R$ are omitted, so FSV is not strictly an XOR packet-optimal decoding scheme, but an approximation to it. As such, there is no guarantee that it will outperform Jt-CNC even if XOR-packet error rate is the performance metric. Meanwhile, when XOR-bit error rate is the performance metric (as shown in Fig. 7), Jt-CNC will be better than FSV, since Jt-CNC targets bit optimality and is an exact bit-optimal algorithm for synchronous PNC. In [13], a performance gap of 2 dB was observed between the reduced-state Viterbi and the full-state Viterbi. If the gap between full-state Viterbi and reduced-state Viterbi is 2 dB, then the gap between Jt-CNC and reduced-state Viterbi is at least 2 dB.

Fig. 7 also shows that Jt-CNC outperforms XOR-CD Viterbi by 2 dB for both rate-1/2 and rate-1/3 convolutional codes. As described previously, XOR-CD loses information in the XOR mapping; hence this 2 dB gap is as expected.

2) Frame Error Rate: We compare the frame error rate (FER) performances of Jt-CNC, XOR-CD Viterbi, and full-state Viterbi in synchronous PNC. As discussed in Section IV, Jt-CNC is optimal in terms of BER, but may not be optimal in terms of FER. As shown in Fig. 8, the FER performances of Jt-CNC and FSV are quite close, and are 1 dB better than XOR-CDV. We have also investigated the FER performances of the three decoders in asynchronous PNC. The same relative performance gaps among the decoders as in synchronous PNC are observed. Furthermore, in terms of the dB gaps between the decoders, there is no substantial difference between BER and FER results. Henceforth, we will only present the BER results.

3) Fading Channel: We compare the BER performances of Jt-CNC, XOR-CDV, and FSV in synchronous PNC under fading channel. In the simulation, we assume block Rayleigh fading and white noise. As shown in Fig. 9, compared with the BER performances under AWGN channel, all three decoding algorithms experience 3 dB degradation under fading channel. The degradation is caused by the phase offset and unbalanced channel coefficients of the fading channel.

Fig. 9. BER performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) algorithms for PNC under fading channel. The channel code used is the (5, 7) convolutional code. We use BPSK modulation and assume block Rayleigh fading channel.

4) BER Analysis: We now analyze the bit error rate (BER) of our Jt-CNC decoder under AWGN channel. It is difficult to obtain a closed-form expression of the BER for the Jt-CNC decoder, which uses the BCJR decoding algorithm. However, the BER of the FSV (full-state Viterbi) algorithm can be a good approximation for the Jt-CNC algorithm for the following two reasons: first, the simulation in Section VI-A1 shows that the BER performances of Jt-CNC and FSV are quite close; second, Jt-CNC and FSV should perform nearly the same in the high-SNR regime. Next we derive the approximate BER expression for Jt-CNC and FSV with BPSK modulation.

Let $\mathrm{BER}_{XOR}$ denote the BER of the XORed source packet $U^R$, and $\mathrm{BER}_{AB}$ denote the BER of the joint source packet $\{U^A, U^B\}$ of the two end nodes. It can be easily proved that $\mathrm{BER}_{XOR} < \mathrm{BER}_{AB}$, since an error of $U^R$ implies an error of $\{U^A, U^B\}$ but not vice versa (e.g., if $U^A$ and $U^B$ have one bit error in the same position, their XOR is still correct). There is no simple way to derive a closed-form expression of $\mathrm{BER}_{AB}$, but we may approximate (upper bound) it using the union bound [27] as

$$\mathrm{BER}_{AB} \approx \sum_{k=d}^{\infty} c_k P_k. \tag{32}$$

In (32), $d$ is the free distance of the FSV decoder, and $c_k$ is the sum of the numbers of bit errors over all paths of distance $k$ from the correct path. We compute $c_k$ by taking the derivative of the FSV decoder's transfer function [27]. $P_k$ is the pairwise error probability that an incorrect path with distance $k$ from the correct path is decoded. To compute $P_k$, let us consider an incorrect path $\{\tilde{X}^A, \tilde{X}^B\}$ merging with the correct path $\{X^A, X^B\}$ at a particular step, which has $k$ incorrect symbols and the remaining symbols correct. Such a path may be incorrectly chosen only if it has a smaller distance metric (as defined in (10)) than the correct path, i.e.,

$$M\left(\tilde{X}^A, \tilde{X}^B\right) < M\left(X^A, X^B\right). \tag{33}$$

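Since the path $\{\tilde{X}^A, \tilde{X}^B\}$ and the path $\{X^A, X^B\}$ differ in exactly $k$ symbols, the pairwise error probability is

$$\begin{aligned} P_k &= \Pr\left\{ M\left(X^A, X^B\right) - M\left(\tilde{X}^A, \tilde{X}^B\right) > 0 \right\} \\ &= \Pr\left\{ \sum_{i=1}^{k} \frac{\left(y^R_i - x^A_i - x^B_i\right)^2}{2\sigma^2} - \sum_{i=1}^{k} \frac{\left(y^R_i - \tilde{x}^A_i - \tilde{x}^B_i\right)^2}{2\sigma^2} > 0 \right\} \\ &= \Pr\left\{ \sum_{i=1}^{k} \left[ 2y^R_i \left(\tilde{x}^A_i + \tilde{x}^B_i - x^A_i - x^B_i\right) + \left(x^A_i + x^B_i\right)^2 - \left(\tilde{x}^A_i + \tilde{x}^B_i\right)^2 \right] > 0 \right\}. \end{aligned} \tag{34}$$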

Without loss of generality, we assume that the correct path $\{X^A, X^B\}$ corresponds to the all-zero source packet [28] (hence $x^A_i = x^B_i = +1$, $\forall i$), and that only one symbol error happens for each overlapped symbol $y^R_i$ (hence $\tilde{x}^A_i = +1, \tilde{x}^B_i = -1$ or $\tilde{x}^A_i = -1, \tilde{x}^B_i = +1$). Therefore

$$P_k = \Pr\left\{ \sum_{i=1}^{k} \left[ 2y^R_i (0 - 2) + 4 \right] > 0 \right\} = \Pr\left\{ \sum_{i=1}^{k} y^R_i < k \right\}. \tag{35}$$

Since the $y^R_i$ are independent Gaussian random variables with variance $\sigma^2$ and mean $(x^A_i + x^B_i)$, where $x^A_i$ and $x^B_i$ are the symbols actually transmitted by the end nodes, the sum $Z = \sum_{i=1}^{k} y^R_i$ is also Gaussian with mean $2k$ and variance $k\sigma^2$; hence

$$P_k = \Pr\{Z < k\} = 1 - Q\left(\frac{k - 2k}{\sqrt{k\sigma^2}}\right) = Q\left(\frac{\sqrt{k}}{\sigma}\right). \tag{36}$$
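The closed form in (36) can be sanity-checked with a short Monte Carlo experiment that draws the $k$ samples directly from the model behind (35): independent Gaussians with mean 2 and variance $\sigma^2$. The function names, the number of trials, and the example values of $k$ and $\sigma$ are assumptions for illustration.

```python
import math
import numpy as np

def q_func(x):
    """Gaussian tail probability Q(x), written via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pairwise_error_mc(k, sigma, trials=200_000, seed=0):
    """Estimate Pr{sum_i y_i < k} with y_i ~ N(2, sigma^2), as in (35)."""
    rng = np.random.default_rng(seed)
    y = 2.0 + sigma * rng.standard_normal((trials, k))
    return float(np.mean(y.sum(axis=1) < k))

k, sigma = 5, 2.0
closed_form = q_func(math.sqrt(k) / sigma)   # right-hand side of (36)
estimate = pairwise_error_mc(k, sigma)
```

The two values should agree up to the Monte Carlo sampling error.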

Consequently,

$$\mathrm{BER}_{AB} \approx \sum_{k=d}^{\infty} c_k Q\left(\frac{\sqrt{k}}{\sigma}\right). \tag{37}$$

To give a concrete example, let us assume both end nodes use (5, 7) convolutional codes as in Section VI-A1. As illustrated in Section III-B, the trellis of the FSV decoder is the combination of node A's and node B's trellises. Therefore, the FSV decoder has a generator matrix of $\begin{pmatrix} 5 & 7 & 0 & 0 \\ 0 & 0 & 5 & 7 \end{pmatrix}$. We compute the free distance $d$ and the coefficients $c_k$ of this FSV decoder using the method proposed in [29]. We obtain $d = 5$ and substitute the values of $c_k$ into (37) to get

$$\begin{aligned} \mathrm{BER}_{AB} &\approx 2P_5 + 8P_6 + 24P_7 + 64P_8 + 160P_9 + \cdots \\ &= 2Q\left(\frac{\sqrt{5}}{\sigma}\right) + 8Q\left(\frac{\sqrt{6}}{\sigma}\right) + 24Q\left(\frac{\sqrt{7}}{\sigma}\right) + 64Q\left(\frac{\sqrt{8}}{\sigma}\right) + 160Q\left(\frac{\sqrt{9}}{\sigma}\right) + \cdots. \end{aligned} \tag{38}$$
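The truncated series in (38) is straightforward to evaluate as a function of the noise standard deviation $\sigma$. The sketch below uses only the five coefficients written out in (38); higher-order coefficients are not listed in the text and are therefore omitted, so the result is a truncation of the bound. The names `COEFFS` and `ber_ab_truncated` are hypothetical.

```python
import math

# Coefficients c_k for k = 5..9 as listed in (38); later terms are omitted here.
COEFFS = {5: 2, 6: 8, 7: 24, 8: 64, 9: 160}

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_ab_truncated(sigma):
    """Five-term truncation of the union-bound approximation (38)."""
    return sum(c * q_func(math.sqrt(k) / sigma) for k, c in COEFFS.items())

low_noise = ber_ab_truncated(0.5)
high_noise = ber_ab_truncated(1.0)
```

Mapping $\sigma$ to $E_b/N_0$ depends on the normalization used in the simulations and is left out of this sketch.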

We omit the higher-order terms (when $k > 12$ the value of $P_k$ is negligible) in (38) and plot the approximate BER curve together with the simulation results from Section VI-A1. As shown in Fig. 10, the BER curves of Jt-CNC and FSV are close to the derived approximate BER curve in (38), and approach it in the high-SNR regime.

Fig. 10. The approximative BER curve derived in (38) and the simulated BER curves of Jt-CNC and full-state Viterbi (FSV) algorithms. The channel code used is the (5, 7) convolutional code. We use BPSK modulation and assume AWGN channel.

Fig. 11. Effects of phase offset on Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV): (a) without random-phase precoding; (b) with random-phase precoding. QPSK modulation and the (13, 15, 17) convolutional code are used in the simulation. We assume the symbols are aligned and the relative phase offset is $\pi/4$. In (a), both nodes transmit their signals directly; in (b), node B precodes its transmit signal with a pseudo-random phase sequence.

B. Effects of Phase Offset

We next evaluate the effect of phase offset on Jt-CNC assuming QPSK modulation (higher-order QAM can also be used). Note that phase offset does not present a challenge to BPSK systems (see [4], [5]). First, we compare the BER performances of the aforementioned three decoding algorithms with phase offset $\phi = 0$ (phase synchronous) and $\phi = \pi/4$ (worst case for QPSK [5]).

As shown in Fig. 11a, when the phase offset is π/4, the BER performances of Jt-CNC, FSV, and XOR-CDV degrade by 3 dB, 3 dB, and 5 dB, respectively. The severe phase penalty is due to the poor confidence of the messages as calculated in (16) when the phase offset is π/4.

One method to improve the confidence is to make the phase offset random so that the symbols with small phase offset can help the symbols with large phase offset during the BP process. To improve our system's resilience against phase offset, we adopt random-phase precoding at the transmitter of one end node. Specifically, node B rotates the phase of its transmitted signal with a pseudo-random phase sequence $\Phi^B = (\phi^B_1, \cdots, \phi^B_N)$, where each $\phi^B_n$ is chosen at random between zero and π/4. We assume that this pseudo-random phase sequence is known at the relay so that the relay can incorporate this knowledge into the decoding process. As shown in Fig. 11b, with random-phase precoding, the phase penalty is reduced to 1 dB, 1 dB, and 3 dB compared with the synchronous case for Jt-CNC, FSV, and XOR-CDV, respectively.
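The precoding step can be illustrated with a minimal sketch. It is not the simulation code used for Fig. 11; the function names, the use of a shared seed to model "known at the relay", and the QPSK constellation labeling are all assumptions made for illustration.

```python
import cmath
import math
import random

# Unit-energy QPSK constellation (labeling is an assumption of this sketch).
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]

def phase_sequence(n, seed):
    """Pseudo-random phases drawn between 0 and pi/4.

    The shared seed is a hypothetical stand-in for the assumption that
    node B's phase sequence is known at the relay.
    """
    rng = random.Random(seed)
    return [rng.uniform(0.0, math.pi / 4) for _ in range(n)]

def precode(symbols, phases):
    """Rotate each transmit symbol of node B by its own phase."""
    return [x * cmath.exp(1j * p) for x, p in zip(symbols, phases)]

def effective_offsets(channel_offset, phases):
    """Per-symbol relative phase offset seen by the relay."""
    return [channel_offset + p for p in phases]

N, SEED = 8, 2014
rng = random.Random(1)
x_b = [rng.choice(QPSK) for _ in range(N)]

phases_tx = phase_sequence(N, SEED)     # generated at node B
phases_relay = phase_sequence(N, SEED)  # regenerated at the relay
tx_b = precode(x_b, phases_tx)

# With a fixed channel offset of pi/4, the per-symbol offsets are now spread
# over an interval instead of all sitting at the same value.
offsets = effective_offsets(math.pi / 4, phases_relay)
print(min(offsets), max(offsets))
```

Because the relay regenerates the same sequence, it knows the total rotation applied to each of node B's symbols and can use it when computing the per-symbol messages.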

C. Effects of Symbol Misalignment

A major advantage of the proposed decoding framework is that it can handle general symbol misalignment with different decoding algorithms. We evaluate the performance of Jt-CNC under varying degrees of symbol misalignment and phase offset (without random-phase precoding). In the simulation, both end nodes transmit 1000-bit source packets (corresponding to 1500 QPSK symbols for a channel code rate of 1/3).

From Fig. 12 we see that although the fractional symbol misalignment (the curve with τ = 0.5, φ = 0) degrades the BER performance by 0.5 dB, the integral symbol misalignment (the curve with τ = 100.5, φ = 0) improves the BER performance slightly. That is because when there are integral symbol misalignments, the head and tail of the signals are non-overlapping and thus yield cleaner information without mutual interference.

[Figure 12: BER versus SNR (dB) Eb/N0; curves: τ=0, φ=0; τ=0.5, φ=0; τ=100.5, φ=0; τ=0, φ=π/8; τ=0.5, φ=π/8; τ=100.5, φ=π/8.]

Fig. 12. BER performance of the Jt-CNC decoder under general symbol misalignment, with the (13, 15, 17) convolutional code and QPSK modulation.

[Figure 13: BER versus SNR (dB) Eb/N0; curves: FSV, Jt-CNC, and XOR-CDV, each at δ=0 and δ=9.5.]

Fig. 13. BER performances of the Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) decoding algorithms under general symbol misalignment (δ = 9.5, φ = 0), with the (13, 15, 17) convolutional code and QPSK modulation.

We next compare the BER performances of the Jt-CNC, XOR-CD Viterbi, and full-state Viterbi algorithms when the symbol misalignment is larger than one symbol. As shown in Fig. 13, when the symbol misalignment is δ = 9.5 and the phase offset is φ = 0, the BER performance degrades by 0.5 dB, 0.5 dB, and 3 dB for Jt-CNC, FSV, and XOR-CDV, respectively. Within the framework, both Jt-CNC and full-state Viterbi are robust to symbol misalignment, while XOR-CD Viterbi is quite sensitive to it.

D. Software Radio Experiment

To evaluate the proposed algorithm in a real communication system, we implemented an OFDM PNC prototype built on USRP N210. The three decoding algorithms are implemented in the prototype. The PNC prototype adopts BPSK modulation with 2 MHz bandwidth and 2.58 GHz carrier frequency. We used the (5, 7) convolutional code and followed the frame format design in [8]. We conducted our experiments in an indoor office environment and evaluated the BER performances of the Jt-CNC, XOR-CDV, and full-state Viterbi algorithms under different SNRs. In the experiment, we balanced the powers of the end nodes and let both nodes transmit 100 packets to the relay. Each packet consisted of 204 OFDM symbols (4 symbols of preambles and 200 symbols of data). The PC used for this experiment has 32 GB RAM and an Intel Core i7 processor. The typical processing times to decode one packet for the three decoding algorithms are: 0.5255 s for XOR-CDV; 2.0596 s for FSV; 1.4946 s for Jt-CNC.

[Figure 14: BER versus SNR (dB) Eb/N0; curves: XOR-CDV, FSV, Jt-CNC.]

Fig. 14. BER performances of Jt-CNC, FSV, and XOR-CDV in an indoor environment. We tested the three algorithms on a practical OFDM PNC prototype implemented on USRP N210. The PNC prototype adopts BPSK modulation and the (5, 7) convolutional code.

As shown in Fig. 14, the BER performances of Jt-CNC and full-state Viterbi are nearly the same in a real indoor environment. Compared with the simulation results in Fig. 7, the BER performance of all three algorithms in the real system is degraded by 5 dB due to imperfections of the real system, such as imperfect channel estimation, carrier-frequency offsets, and frequency-selective channels.

VII. CONCLUSION

We have proposed a three-layer decoding framework for asynchronous convolutional-coded PNC systems. This framework can deal with general (integral plus fractional) symbol misalignment in convolutional-coded PNC systems. Furthermore, we design a Jt-CNC algorithm to achieve BER-optimal decoding of convolutional codes in synchronous PNC. Building on the study of convolutional codes, we further generalize the Jt-CNC decoding algorithm to all cyclic codes (in Appendix II), providing a new angle to counter symbol asynchrony. Simulation shows that our Jt-CNC algorithm outperforms the previous decoding algorithms (XOR-CD, reduced-state Viterbi) by 2 dB. For both phase-asynchronous and symbol-asynchronous PNC, our Jt-CNC algorithm outperforms the two previously proposed algorithms. Importantly, we have implemented the proposed Jt-CNC decoder in a real PNC prototype built on a software radio platform. Our experiment shows that the Jt-CNC decoder works well in practice.

[Figure 15: Tanner graph drawn as a ring; node labels s, u, c, and f with indices 1 through K.]

Fig. 15. Tanner graph of a convolutional code that has the same initial state $S_0$ and terminal state $S_K$. We merge the initial state and terminal state, hence the Tanner graph in Fig. 2 becomes a ring.

APPENDIX I
CORRECT DECODING OF CIRCULAR-SHIFTED SOURCE BITS

Theorem 1: Let C denote the codeword of a convolutional code whose code rate is r = L/R, where L and R are positive integers, and let U denote the corresponding source packet. If the initial state and terminal state of the convolutional encoder are the same, then decoding based on $C_{(kR)}$, the kR-bit right circular-shifted version of C, yields $U_{(kL)}$, the kL-bit right circular-shifted version of U.

Proof: The encoding and decoding process of the convolutional code can be represented by the Tanner graph in Fig. 2. Since the code has the same initial and terminal state, we can merge the initial state and terminal state of the Tanner graph as shown in Fig. 15. For a general convolutional code with code rate L/R, the source message $u_k$ is an L-bit tuple and the coded message $c_k$ is an R-bit tuple.

Let $C_{(kR)} = (c_{K-k+1}, c_{K-k+2}, \cdots, c_K, c_1, \cdots, c_{K-k})$ be the kR-bit right circular-shifted version of codeword C. To decode $C_{(kR)}$, the decoding algorithm starts with the first tuple $c_{K-k+1}$ and ends with the last tuple $c_{K-k}$. Because the Tanner graph has a ring structure, the decoder output is $U_{(kL)} = (u_{K-k+1}, u_{K-k+2}, \cdots, u_K, u_1, \cdots, u_{K-k})$, which is the kL-bit right circular-shifted version of U.

Remark 1: Both zero-tail convolutional codes and tail-biting convolutional codes have the property in Theorem 1, because their initial state and terminal state are the same. For a recursive convolutional code, we can append tail bits to the input packet to force the terminal state of the encoder to the zero state. Recursive convolutional codes can then also be used with the proposed Jt-CNC decoder.

APPENDIX II
HANDLING SYMBOL MISALIGNMENT IN TAIL-BITING CONVOLUTIONAL CODES AND CYCLIC CODES

Theorem 2: For a tail-biting convolutional code with code rate 1/R, $R \in \mathbb{N}^+$, let U denote the source packet of the channel-coded packet C, and $C_{(kR)}$ denote the kR-bit right circular-shifted version of C. The source packet corresponding to $C_{(kR)}$ is $U_{(k)}$, the k-bit right circular-shifted version of U.

Proof: Let m denote the memory length of the convolutional encoder. The generator matrix of the convolutional code is

\[
G = \begin{bmatrix}
g_0 & g_1 & g_2 & \cdots & g_m & & & \\
 & g_0 & g_1 & \cdots & g_{m-1} & g_m & & \\
 & & \ddots & & & & \ddots & \\
 & & & g_0 & g_1 & g_2 & \cdots & g_m \\
g_m & & & & g_0 & g_1 & \cdots & g_{m-1} \\
g_{m-1} & g_m & & & & g_0 & \cdots & g_{m-2} \\
\vdots & & \ddots & & & & \ddots & \vdots \\
g_1 & g_2 & \cdots & g_m & & & & g_0
\end{bmatrix},
\tag{II.39}
\]

where $g_b = [g_0 \; g_1 \; \cdots \; g_m]$ is the basis generator matrix of the convolutional code; each entry $g_i$ is an R-bit vector

\[
g_i = \left[ g_i^{(1)} \; g_i^{(2)} \; \cdots \; g_i^{(R)} \right],
\tag{II.40}
\]

where $g_i^{(r)}$ is equal to 1 or 0, corresponding to whether the ith stage of the shift register contributes (connects) to the rth output. Therefore, the basis generator matrix $g_b$ can be regarded as the "impulse response" of the convolutional encoder. For example, the basis generator matrix of the (5, 7) convolutional code shown in Fig. 16 is [11 01 11]. The encoding process is simply

\[
C = UG. \tag{II.41}
\]

The right circular-shifted codeword can be represented by

\[
C_{(kR)} = U G_{(k)}, \tag{II.42}
\]

where $G_{(k)}$ is obtained by circularly shifting the columns of G to the right by k×R positions. Since G is a circulant matrix (in its blocks $g_i$), we have

\[
C_{(kR)} = U G_{(k)} = U_{(k)} G. \tag{II.43}
\]

Therefore the source packet of $C_{(kR)}$ is $U_{(k)}$, the k-bit right circular-shifted version of U.
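The shift property in (II.43) can be checked numerically for the (5, 7) code. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the tail-biting encoder is written directly as the circular convolution $c_n = \sum_i u_{(n-i) \bmod K}\, g_i$ over GF(2), which is what multiplying by the matrix G in (II.39) computes.

```python
import random

# Basis generator matrix of the (5, 7) code: g_b = [11 01 11].
G_B = [(1, 1), (0, 1), (1, 1)]

def tail_biting_encode(u, g_b=G_B):
    """Rate-1/R tail-biting encoding, i.e. C = UG with G as in (II.39)."""
    K, R = len(u), len(g_b[0])
    c = []
    for n in range(K):
        for r in range(R):
            bit = 0
            for i, g in enumerate(g_b):
                # Indices wrap around: the encoder state "bites its tail".
                bit ^= u[(n - i) % K] & g[r]
            c.append(bit)
    return c

def right_shift(seq, k):
    """Right circular shift of a list by k positions."""
    k %= len(seq)
    return list(seq[-k:]) + list(seq[:-k]) if k else list(seq)

rng = random.Random(0)
K, R = 12, 2
U = [rng.randint(0, 1) for _ in range(K)]
C = tail_biting_encode(U)

# Theorem 2: encoding U shifted by k bits equals C shifted by k*R bits.
shift_property_holds = all(
    tail_biting_encode(right_shift(U, k)) == right_shift(C, k * R)
    for k in range(K)
)
print(shift_property_holds)
```

Encoding a single 1 followed by zeros returns the basis generator matrix itself, which matches its reading as the encoder's impulse response.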

Remark 2: Theorem 2 is also valid for a tail-biting convolutional code with a general code rate L/R, but the resulting source packet will be $U_{(kL)}$, the kL-bit right circular-shifted version of U. The proof is the same except that each entry of the basis generator matrix $g_b$ is an L×R matrix:

\[
g_i = \begin{bmatrix}
g_{1,i}^{(1)} & g_{1,i}^{(2)} & \cdots & g_{1,i}^{(R)} \\
g_{2,i}^{(1)} & g_{2,i}^{(2)} & \cdots & g_{2,i}^{(R)} \\
\vdots & \vdots & & \vdots \\
g_{L,i}^{(1)} & g_{L,i}^{(2)} & \cdots & g_{L,i}^{(R)}
\end{bmatrix}
\tag{II.44}
\]

where $g_{l,i}^{(r)}$ is equal to 1 or 0, depending on whether the ith stage of the shift register for the lth input contributes (connects) to the rth output.

Theorem 2 also indicates that tail-biting convolutional codes are quasi-cyclic with period R. Inspired by the quasi-cyclic property of tail-biting convolutional codes [19], [30], [31], we attempted to generalize the results to general quasi-cyclic codes (as opposed to just convolutional codes). Unfortunately, a general quasi-cyclic code³ may not have the property in Theorem 2. However, building on the insight obtained from our study of convolutional-coded PNC, we propose an algorithm that can deal with general symbol misalignment when cyclic codes are used (as opposed to quasi-cyclic codes). That is, our asynchronous PNC decoding framework can incorporate not just convolutional codes, but all cyclic codes. Since cyclic codes have a period of one, we do not need the interleaver/deinterleaver here.

Fig. 16. Convolutional encoder of the (5, 7) convolutional code. $u_k$ is the input source bit at time k; $c_{k,1}$ and $c_{k,2}$ are the first and second output bits of the encoder at time k, respectively.

Let $\mathcal{C}(\cdot)$ and $\mathcal{C}^{-1}(\cdot)$ denote the encoding function and decoding function of a particular linear cyclic code (e.g., a BCH code), respectively. Then the encoding process in the end nodes is $C^i = \mathcal{C}(U^i)$, $i \in \{A, B\}$. To ease presentation, we assume BPSK modulation and a symbol misalignment of $\tau_I$. The received signal at the relay is the overlap of the following two signals:

\[
\begin{array}{ccccccccc}
x^A_1 & \cdots & x^A_{\tau_I} & x^A_{\tau_I+1} & \cdots & x^A_N & & & \\
 & & & x^B_1 & \cdots & x^B_{N-\tau_I} & x^B_{N-\tau_I+1} & \cdots & x^B_N
\end{array}
\tag{II.45}
\]

Upon receiving the overlapped signal, the relay first aligns the last $\tau_I$ symbols with the first $\tau_I$ symbols to obtain a new overlapped signal

\[
\begin{array}{cccccc}
x^A_1 & \cdots & x^A_{\tau_I} & x^A_{\tau_I+1} & \cdots & x^A_N \\
x^B_{N-\tau_I+1} & \cdots & x^B_N & x^B_1 & \cdots & x^B_{N-\tau_I}
\end{array}
\tag{II.46}
\]

The result in (II.46) is actually the signal $X^A + X^B_{(\tau_I)}$, where $X^B_{(\tau_I)}$ is the $\tau_I$-symbol right circular-shifted version of node B's signal. Then the relay can map the signal of (II.46) to $C^A \oplus C^B_{(\tau_I)}$, where $C^B_{(\tau_I)}$ is the $\tau_I$-bit right circular-shifted version of $C^B$. Note that $C^B_{(\tau_I)}$ is also a valid codeword due to the property of cyclic codes. We denote the source packet corresponding to $C^B_{(\tau_I)}$ by $\tilde{U}^B$, such that $\tilde{U}^B = \mathcal{C}^{-1}(C^B_{(\tau_I)})$. Because the XOR operator preserves the linearity of codes, the relay first decodes the XORed packet by

\[
U^R = \mathcal{C}^{-1}\left(C^A \oplus C^B_{(\tau_I)}\right) = \mathcal{C}^{-1}\left(C^A\right) \oplus \mathcal{C}^{-1}\left(C^B_{(\tau_I)}\right) = U^A \oplus \tilde{U}^B,
\tag{II.47}
\]

and then broadcasts this packet to both end nodes. After decoding $U^R$, node A first XORs $U^R$ with its own information $U^A$ to obtain $\tilde{U}^B$; then node A re-encodes $\tilde{U}^B$ to obtain $C^B_{(\tau_I)} = \mathcal{C}(\tilde{U}^B)$, and left circular-shifts $C^B_{(\tau_I)}$ to obtain $C^B$; finally, from $C^B$ node A can decode $U^B$. Node B first right circular-shifts its codeword $C^B$ to produce $C^B_{(\tau_I)}$ and decodes $C^B_{(\tau_I)}$ to obtain $\tilde{U}^B$; then node B XORs $U^R$ with $\tilde{U}^B$ to obtain $U^A$.

³ A recent paper [32] investigates the use of quasi-cyclic LDPC codes to deal with the symbol misalignment in PNC without requiring the validity of Theorem 2.
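The procedure of (II.45) to (II.47) can be traced end to end on a toy example. The sketch below makes several assumptions that are not in the text: it uses the (7, 4) cyclic Hamming code with generator polynomial $g(x) = 1 + x + x^3$ in place of a BCH code, a noise-free channel with unit gains, the BPSK mapping $x = 1 - 2c$, and hypothetical function names. Decoding is plain polynomial division, which is valid only without noise.

```python
G_POLY = [1, 1, 0, 1]        # g(x) = 1 + x + x^3, lowest degree first
N_CODE, K_INFO = 7, 4

def encode(u):
    """Non-systematic cyclic encoding c(x) = u(x) g(x) over GF(2)."""
    c = [0] * N_CODE
    for i, ui in enumerate(u):
        if ui:
            for j, gj in enumerate(G_POLY):
                c[i + j] ^= gj
    return c

def decode(c):
    """Noise-free inverse of encode: divide c(x) by g(x) over GF(2)."""
    r, deg = list(c), len(G_POLY) - 1
    u = [0] * K_INFO
    for i in range(N_CODE - 1, deg - 1, -1):
        if r[i]:
            u[i - deg] = 1
            for j, gj in enumerate(G_POLY):
                r[i - deg + j] ^= gj
    assert not any(r), "input is not a codeword"
    return u

def rshift(seq, k):
    k %= len(seq)
    return list(seq[-k:]) + list(seq[:-k]) if k else list(seq)

def lshift(seq, k):
    return rshift(seq, -k)

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def bpsk(c):
    return [1 - 2 * b for b in c]

def relay_realign(y, n, tau):
    """Fold the last tau samples onto the first tau, as in (II.46)."""
    z = list(y[:n])
    for i in range(tau):
        z[i] += y[n + i]
    return z

def pnc_map(z):
    """Noise-free BPSK PNC mapping: sum 0 -> XOR bit 1, sum +/-2 -> XOR bit 0."""
    return [1 if abs(s) < 1 else 0 for s in z]

U_A, U_B, TAU = [1, 0, 1, 1], [0, 1, 1, 0], 3
C_A, C_B = encode(U_A), encode(U_B)

# Received signal (II.45): node B's packet arrives TAU symbols late.
y = [0] * (N_CODE + TAU)
for n, (xa, xb) in enumerate(zip(bpsk(C_A), bpsk(C_B))):
    y[n] += xa
    y[n + TAU] += xb

# Relay: realign, map to the XORed codeword, decode as in (II.47).
U_R = decode(pnc_map(relay_realign(y, N_CODE, TAU)))

# Node A: recover the shifted packet, re-encode, undo the shift, decode.
U_B_tilde = xor(U_R, U_A)
U_B_hat = decode(lshift(encode(U_B_tilde), TAU))

# Node B: compute its own shifted source packet and XOR it out.
U_A_hat = xor(U_R, decode(rshift(C_B, TAU)))

print(U_B_hat == U_B and U_A_hat == U_A)
```

The assertion inside `decode` is where the cyclic property is used: the shifted and XORed words must themselves be codewords for the division to leave no remainder.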

APPENDIX III
EXPLANATIONS OF XOR-CD AND FULL-STATE VITERBI ALGORITHMS

In this appendix, we explain and provide interpretations for the XOR-CD algorithm and the full-state Viterbi (FSV) algorithm. To ease the presentation, we consider BPSK modulation and synchronous PNC.

A. XOR-CD

As pointed out in Section II, XOR-CD refers to a two-step process: (i) symbol-by-symbol PNC mapping; (ii) channel decoding. In the first step, the received symbol $y^R_n$ is mapped to the XORed coded bit $c^R_n = c^A_n \oplus c^B_n$ for n = 1, . . . , N. Upon receiving the overlapped symbols, the demodulator at the relay computes the likelihood

\[
\Pr\left(y^R_n \mid x^A_n, x^B_n\right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{\left| y^R_n - h^A x^A_n - h^B x^B_n \right|^2}{2\sigma^2} \right\}.
\tag{III.48}
\]

Then the likelihood in (III.48) is mapped to XORed coded bits by

\[
\Pr\left(y^R_n \mid c^R_n\right) = \sum_{x^A_n, x^B_n :\, c^A_n \oplus c^B_n = c^R_n} \Pr\left(y^R_n \mid x^A_n, x^B_n\right).
\tag{III.49}
\]

After the mapping, we obtain $C^R = C^A \oplus C^B$, which is fed into the channel decoder in the second step. Based on the likelihood in (III.49), we can either make a hard decision on $c^R_n$, which is called "hard" XOR-CD, or pass the probability directly to the decoder, which is called "soft" XOR-CD (the variant used in our main text).

In the second step, an ordinary point-to-point channel decoder can be used to decode the XORed source bits. Since a convolutional code is a linear code and XOR is a linear operator, the decoding process of $C^R$ is

\[
\mathcal{C}^{-1}(C^R) = \mathcal{C}^{-1}(C^A \oplus C^B) = \mathcal{C}^{-1}(C^A) \oplus \mathcal{C}^{-1}(C^B) = U^A \oplus U^B.
\tag{III.50}
\]

Two points are noteworthy: (i) any linear code, not only convolutional codes, can be used with XOR-CD; (ii) the symbols of $X^A$ and $X^B$ must be aligned symbol by symbol, otherwise (III.50) is invalid.
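The first step of XOR-CD, (III.48) and (III.49), can be written out for a single received sample. The following is a minimal sketch, not the code used in our experiments: the function names are hypothetical, and it assumes the BPSK mapping $x = 1 - 2c$ and default channel gains $h^A = h^B = 1$.

```python
import math

def likelihood(y, x_a, x_b, h_a, h_b, sigma):
    """Gaussian likelihood of (III.48) for one received sample."""
    d2 = abs(y - h_a * x_a - h_b * x_b) ** 2
    return math.exp(-d2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def xor_likelihoods(y, h_a=1.0, h_b=1.0, sigma=1.0):
    """Return (Pr(y | c^R = 0), Pr(y | c^R = 1)) as in (III.49)."""
    out = [0.0, 0.0]
    for c_a in (0, 1):
        for c_b in (0, 1):
            # Assumed BPSK mapping: bit 0 -> +1, bit 1 -> -1.
            x_a, x_b = 1 - 2 * c_a, 1 - 2 * c_b
            out[c_a ^ c_b] += likelihood(y, x_a, x_b, h_a, h_b, sigma)
    return tuple(out)

def hard_decision(y, **channel):
    """'Hard' XOR-CD: pick the more likely XORed coded bit."""
    p0, p1 = xor_likelihoods(y, **channel)
    return 0 if p0 >= p1 else 1

# "Soft" XOR-CD would pass these two values to the channel decoder.
print(xor_likelihoods(0.1))
```

With equal unit gains, a sample near 0 favors $c^R_n = 1$ (the two symbols cancel), while a sample near +2 or -2 favors $c^R_n = 0$.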

B. Full-State Viterbi

In Section IV-A, we show that the complexity of finding the ML XORed source packet is prohibitively high. The full-state Viterbi algorithm is proposed in [13] to reduce the complexity. Equation (9) is simplified using the log-max approximation to

\[
\begin{aligned}
\hat{U}^R &= \arg\max_{U^R} \log \sum_{U^A, U^B :\, U^A \oplus U^B = U^R} \exp\left(-M\left(X^A, X^B\right)\right) \\
&\approx \arg\min_{U^R} \min_{U^A, U^B :\, U^A \oplus U^B = U^R} M\left(X^A, X^B\right).
\end{aligned}
\tag{III.51}
\]

The computation of (III.51) consists of two steps. First, we find the best pair of source packets $\hat{U}^A$ and $\hat{U}^B$ such that

\[
\left\{\hat{U}^A, \hat{U}^B\right\} = \arg\min_{U^A, U^B} M\left(X^A, X^B\right).
\tag{III.52}
\]

Computing (III.52) is equivalent to finding the shortest path on the joint trellis of node A's and node B's encoders. The Viterbi algorithm is a well-known algorithm for solving this problem. Since the state space of the joint trellis is the combination of node A's and node B's state spaces, we call this decoding algorithm the "full-state" Viterbi algorithm. Second, we obtain $\hat{U}^R$ by XORing $\hat{U}^A$ with $\hat{U}^B$ (i.e., $\hat{U}^R = \hat{U}^A \oplus \hat{U}^B$).
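The two steps can be sketched for the (5, 7) code with BPSK. This is an illustrative sketch only, not the implementation of [13] or of our prototype: it assumes unit channel gains, the mapping $x = 1 - 2c$, a squared-Euclidean branch metric standing in for M(·), both encoders starting from the zero state, and no trellis termination. All names are hypothetical.

```python
import itertools

def step(state, bit):
    """(5, 7) encoder step; state = (u_{k-1}, u_{k-2})."""
    s1, s2 = state
    c1 = bit ^ s2           # generator 5 = 101
    c2 = bit ^ s1 ^ s2      # generator 7 = 111
    return (bit, s1), (c1, c2)

def encode(bits):
    state, out = (0, 0), []
    for b in bits:
        state, c = step(state, b)
        out.extend(c)
    return out

def bpsk(c):
    return 1.0 - 2.0 * c

def full_state_viterbi(y, num_bits):
    """Shortest path on the joint trellis (III.52), then XOR the pair."""
    # Joint state (state of A, state of B) -> (path metric, input pairs).
    survivors = {((0, 0), (0, 0)): (0.0, [])}
    for k in range(num_bits):
        y1, y2 = y[2 * k], y[2 * k + 1]
        nxt = {}
        for (s_a, s_b), (metric, path) in survivors.items():
            for u_a, u_b in itertools.product((0, 1), repeat=2):
                n_a, (a1, a2) = step(s_a, u_a)
                n_b, (b1, b2) = step(s_b, u_b)
                m = (metric
                     + (y1 - bpsk(a1) - bpsk(b1)) ** 2
                     + (y2 - bpsk(a2) - bpsk(b2)) ** 2)
                key = (n_a, n_b)
                if key not in nxt or m < nxt[key][0]:
                    nxt[key] = (m, path + [(u_a, u_b)])
        survivors = nxt
    _, best = min(survivors.values(), key=lambda t: t[0])
    return [u_a ^ u_b for u_a, u_b in best]

U_A = [1, 0, 1, 1, 0, 0]
U_B = [0, 1, 1, 0, 0, 0]
y = [bpsk(a) + bpsk(b) for a, b in zip(encode(U_A), encode(U_B))]
U_R_hat = full_state_viterbi(y, len(U_A))
print(U_R_hat == [a ^ b for a, b in zip(U_A, U_B)])
```

With equal channel gains the pair (U^A, U^B) cannot be told apart from (U^B, U^A), but both give the same XOR, which is all the relay needs. The joint trellis here has 4 × 4 = 16 states with four branches leaving each state, which is the source of the higher decoding cost of FSV relative to XOR-CD.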

REFERENCES

[1] S. Zhang, S. C. Liew, and P. P. Lam, “Hot topic: physical-layer network coding,” in Proceedings of Mobicom 2006. ACM, 2006, pp. 358–365.

[2] R. Ahlswede, N. Cai, S.-Y. Li, and R. W. Yeung, “Network information flow,” IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, 2000.

[3] S.-Y. Li, R. W. Yeung, and N. Cai, “Linear network coding,” IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371–381, 2003.

[4] S. Zhang and S. C. Liew, “Channel coding and decoding in a relay system operated with physical-layer network coding,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, pp. 788–796, 2009.

[5] L. Lu and S. C. Liew, “Asynchronous physical-layer network coding,” IEEE Trans. Wireless Commun., vol. 11, no. 2, pp. 819–831, 2012.

[6] IEEE-SA Standards Board, “Wireless LAN medium access control (MAC) and physical layer (PHY) specifications,” IEEE Std 802.11 part 11, 2003.

[7] L. Lu, T. Wang, S. C. Liew, and S. Zhang, “Implementation of physical-layer network coding,” Physical Communication, 2012.

[8] L. Lu, L. You, Q. Yang, T. Wang, M. Zhang, S. Zhang, and S. C. Liew, “Real-time implementation of physical-layer network coding,” in Proceedings of the 2nd Workshop on Software Radio Implementation Forum. ACM, 2013, pp. 71–76.

[9] F. Rossetto and M. Zorzi, “On the design of practical asynchronous physical layer network coding,” in IEEE 10th Workshop on SPAWC ’09. IEEE, 2009, pp. 469–473.

[10] S. C. Liew, S. Zhang, and L. Lu, “Physical-layer network coding: Tutorial, survey, and beyond,” Physical Communication, vol. 6, pp. 4–42, 2013.

[11] A. Viterbi, “Error bounds for convolutional codes and an asymptotically optimum decoding algorithm,” IEEE Trans. Inf. Theory, vol. 13, no. 2, pp. 260–269, 1967.

[12] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inf. Theory, vol. 20, no. 2, pp. 284–287, 1974.

[13] D. To and J. Choi, “Convolutional codes in two-way relay networks with physical-layer network coding,” IEEE Trans. Wireless Commun., vol. 9, no. 9, pp. 2724–2729, 2010.

[14] Q. Yang and S. C. Liew, “Optimal decoding of convolutional-coded physical-layer network coding,” accepted by WCNC 2014, 2014.

[15] D. Wang, S. Fu, and K. Lu, “Channel coding design to support asynchronous physical layer network coding,” in Proceedings of Globecom 2009. IEEE, 2009.

[16] X. Wu, C. Zhao, and X. You, “Joint LDPC and physical-layer network coding for asynchronous bi-directional relaying,” IEEE J. Sel. Areas Commun., vol. 31, no. 8, pp. 1446–1454, 2013.

[17] S. Vanka, S. Srinivasa, Z. Gong, P. Vizi, K. Stamatiou, and M. Haenggi, “Superposition coding strategies: Design and experimental evaluation,” IEEE Trans. Wireless Commun., vol. 11, no. 7, pp. 2628–2639, 2012.

[18] H. Jiang and P. A. Wilford, “A hierarchical modulation for upgrading digital broadcast systems,” IEEE Trans. Broadcast, vol. 51, no. 2, pp. 223–229, 2005.

[19] H. Ma and J. Wolf, “On tail biting convolutional codes,” IEEE Trans. Commun., vol. 34, no. 2, pp. 104–111, 1986.

[20] V. Namboodiri, K. Venugopal, and B. Rajan, “Physical layer network coding for two-way relaying with QAM,” IEEE Trans. Wireless Commun., vol. 12, no. 10, pp. 5074–5086, October 2013.

[21] H. J. Yang, Y. Choi, and J. Chun, “Modified high-order PAMs for binary coded physical-layer network coding,” IEEE Commun. Lett., vol. 14, no. 8, pp. 689–691, 2010.

[22] T. Koike-Akino, P. Popovski, and V. Tarokh, “Optimized constellations for two-way wireless relaying with physical network coding,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, pp. 773–787, 2009.

[23] J. Pearl, Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 1988.

[24] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, 2001.

[25] S. Zhang, S.-C. Liew, and P. P. Lam, “On the synchronization of physical-layer network coding,” in Information Theory Workshop 2006. IEEE, 2006, pp. 404–408.

[26] L. Lu, S. C. Liew, and S. Zhang, “Optimal decoding algorithm for asynchronous physical-layer network coding,” in Proceedings of ICC 2011. IEEE, 2011.

[27] A. J. Viterbi, “Convolutional codes and their performance in communication systems,” IEEE Trans. Commun. Tech., vol. 19, no. 5, pp. 751–772, 1971.

[28] B. Sklar, Digital communications. Prentice Hall NJ, 2001, vol. 2.

[29] M. L. Cedervall and R. Johannesson, “A fast algorithm for computing distance spectrum of convolutional codes,” IEEE Trans. Inf. Theory, vol. 35, no. 6, pp. 1146–1159, 1989.

[30] M. Esmaeili, T. A. Gulliver, N. P. Secord, and S. A. Mahmoud, “A link between quasi-cyclic codes and convolutional codes,” IEEE Trans. Inf. Theory, vol. 44, no. 1, pp. 431–435, 1998.

[31] G. Solomon and H. Tilborg, “A connection between block and convolutional codes,” Journal on Applied Mathematics, vol. 37, no. 2, pp. 358–369, 1979.

[32] P.-C. Wang, Y.-C. Huang, and K. R. Narayanan, “Asynchronous compute-and-forward/integer-forcing with quasi-cyclic codes,” arXiv preprint arXiv:1312.4003, 2013.

Qing Yang received his B.Eng. degree in electronics and information engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2010. Since then he has been a Ph.D. student at the Department of Information Engineering, The Chinese University of Hong Kong. His research interests include physical-layer network coding, multi-user MIMO, and software-defined radio.

Soung Chang Liew received his S.B., S.M., E.E., and Ph.D. degrees from the Massachusetts Institute of Technology. From 1984 to 1988, he was at the MIT Laboratory for Information and Decision Systems, where he investigated Fiber-Optic Communications Networks. From March 1988 to July 1993, he was at Bellcore (now Telcordia), New Jersey, where he engaged in Broadband Network Research. He has been a Professor at the Department of Information Engineering, The Chinese University of Hong Kong (CUHK), since 1993. Prof. Liew is currently the Division Head of the Department of Information Engineering and a Co-Director of the Institute of Network Coding at CUHK. He is also an Adjunct Professor of Peking University and Southeast University, China.

Prof. Liew's research interests include wireless networks, Internet protocols, multimedia communications, and packet switch design. Prof. Liew's research group won the best paper awards in IEEE MASS 2004 and IEEE WLN 2004. Separately, TCP Veno, a version of TCP to improve its performance over wireless networks proposed by Prof. Liew's research group, has been incorporated into a recent release of Linux OS. In addition, Prof. Liew initiated and built the first inter-university ATM network testbed in Hong Kong in 1993. More recently, Prof. Liew's research group pioneered the concept of Physical-layer Network Coding (PNC). Publications of Prof. Liew can be found at www.ie.cuhk.edu.hk/soung.