[ieee 2008 ieee international symposium on signal processing and information technology (isspit) -...

FPGA Implementation of a Multi-Rate Punctured Viterbi Decoder Compatible with the DVB-T Standard

Ahmad Anwar Abdellatif¹, Samar Moustafa Ismail², Darek Korzec³

Department of Electronics and Electrical Engineering, Faculty of Information Engineering and Technology,

German University in Cairo.1 ahmad.abde latif t g 2 samarJsmai (jgucedu.eg,d3 darek.korzec k

Abstract - In this paper, a Field Programmable Gate Array (FPGA) implementation of a Viterbi decoder using the Register Exchange Algorithm (REA) is presented. The REA offers a higher operating frequency than its competitor, the Trace Back Algorithm (TBA). The proposed design is compatible with the Digital Video Broadcasting standard for Terrestrial networks (DVB-T). The design uses less area and operates at a higher frequency than the existing designs. A multi-rate de-puncturing unit is integrated in the design to support the five code rates of 1/2, 2/3, 3/4, 5/6 and 7/8, which are adopted by the DVB-T standard.

Keywords - DVB-T, Viterbi Decoder, REA, FPGA, Puncturing.

978-1-4244-3555-5/08/$25.00 ©2008 IEEE

I. INTRODUCTION

The European standard for Digital Video Broadcasting for terrestrial networks (DVB-T) has been adopted as a standard for terrestrial broadcasting of High Definition Television (HDTV) in many countries all around the world [1]. The channel coding of the DVB-T standard uses two concatenated error correcting codes: the outer code is a block code (Reed-Solomon) and the inner code is a convolutional code.

Convolutional coding is a Forward Error Correction technique that is particularly suited for channels corrupted mainly with Additive White Gaussian Noise (AWGN) [2]. Convolutional coding has found numerous applications in Digital Video Broadcasting (DVB) and Digital Audio Broadcasting (DAB) due to its high error correcting capability. The Viterbi algorithm [3,4] is the most widely used algorithm for decoding convolutional codes with small constraint length. The code rate of a convolutional encoder can be increased by introducing puncturing [4], which removes some of the redundancy bits to increase the channel bit rate. The DVB-T standard supports different punctured rates of 1/2, 2/3, 3/4, 5/6 and 7/8 [2].

The paper is organized as follows. Section II introduces convolutional codes and their parameters. Section III highlights puncturing to achieve higher code rates. Section IV explains the Viterbi algorithm and gives a brief overview of the existing designs. In Section V, the proposed hardware design of the Viterbi decoder is presented in detail. In Section VI, the results are elaborated, showing the hardware results as well as the performance analysis of the software model. Section VII summarizes the outcome of this paper.

II. CONVOLUTIONAL CODES

Convolutional codes are a type of error correcting code that operates on a series of bits and has memory, as each encoded bit depends on the current message bit and some of the previous message bits.

A. Some characteristics of convolutional codes

A convolutional code has some main characteristics:

The code rate (k/n): the number of input bits divided by the number of output bits.

The encoder memory (m): the number of memory elements in a convolutional encoder.

The constraint length (K): the number of message bits that influence each output, that is, the current input bit plus the previous m message bits; it is equal to m+1.

The connection vectors (g1, g2): vectors describing the connections from the memory elements to the adders of the encoder.

Fig. 1. The convolutional encoder used in the DVB-T standard [2].

Fig. 1 shows the convolutional encoder of the DVB-T standard, which has 6 memory elements and 2 modulo-2 adders. The code rate is 1/2 and the constraint length is 7, giving 64 possible encoder states, with g1 and g2 equal to (171)OCT and (133)OCT respectively.

B. The trellis diagram

The trellis diagram is a representation of the state diagram of the encoder with a time axis [3]. It is useful for tracing the output bits produced by a series of input bits, which facilitates decoding by finding the path in the trellis closest to the received message.

Fig. 2. The trellis diagram of a 4-state code

Fig. 2 shows a trellis diagram for a 4-state code, with the possible transitions between states and the corresponding inputs and outputs.

III. PUNCTURING

Higher rate codes can be derived from rate 1/n codes by introducing puncturing [3], which is the removal of some of the bits of a message according to a predefined puncturing pattern. This reduces the error protection but increases the bit rate of the channel, as fewer redundancy bits are sent. This is treated in the decoder by inserting bit erasures in the places where the bits were removed [3].

Table 1. Puncturing rates supported by the DVB-T standard [2].

Code rate r   Puncturing pattern     Transmitted sequence (after parallel-to-serial conversion)
1/2           X: 1                   X1 Y1
              Y: 1
2/3           X: 1 0                 X1 Y1 Y2
              Y: 1 1
3/4           X: 1 0 1               X1 Y1 Y2 X3
              Y: 1 1 0
5/6           X: 1 0 1 0 1           X1 Y1 Y2 X3 Y4 X5
              Y: 1 1 0 1 0
7/8           X: 1 0 0 0 1 0 1       X1 Y1 Y2 Y3 Y4 X5 Y6 X7
              Y: 1 1 1 1 0 1 0

Table 1 shows the puncturing patterns associated with the different puncturing rates according to the DVB-T standard; five different rates are adopted. The mother code rate is 1/2; X represents the first output and Y represents the second output.

IV. THE VITERBI ALGORITHM

A. Background

The Viterbi algorithm was invented by Andrew Viterbi in 1967 and is one of the most commonly used algorithms for decoding convolutional codes [3]. It is a Maximum Likelihood Decoding algorithm [3]: it follows the trellis so as to deduce the most likely path, which is the valid path closest to the received codeword, and it discards less likely paths. When 2 paths merge at a state, the less likely path is discarded, since it is known that it cannot be the survivor.

We will not immediately consider complete code sequences. Instead, we pass the trellis from the initial node to the terminating node and thereby calculate distances for partial code sequences [4]. A Viterbi decoder consists of 3 main units: the Branch Metric Unit, the Add Compare Select Unit and the Survivor Memory Unit.

B. Existing architectures

Three of the main existing architectures were investigated:

Fully Parallel Approach: it is the fastest method, which takes the whole message in parallel and dedicates an ACS unit to each n bits, where k/n is the code rate [5]. The drawback is the huge area and power consumption required for parallel operation.

Trace Back Algorithm (TBA): it stores the survivor path as a sequence of states; at the end of the trellis, this sequence is traced back to get the decoded message using a simple combinational circuit [5]. The drawback here is the combinational circuit, which reduces the maximum clock frequency.

Register Exchange Algorithm (REA): it stores the possible decoded messages instead of storing sequences of states. Register exchange is applied between possible paths according to the ACS decisions.

Register exchange was applied in the proposed design so as to increase the operating frequency, as the combinational logic required in trace back limits the maximum operating frequency. The survivor memory for REA is a regular structure that can easily be modeled in VHDL [8].

C. Previous work reviewed

Some of the existing designs were reviewed. In [6], an ASIC implementation of a trace back solution with a pipelined ACS is presented, achieving a frequency of up to 101 MHz with a configurable constraint length from 7 to 10. In [7], the implementation of a low power architecture called Memory-less Viterbi Decoder (MLVD) is presented. The author, using a bit serial architecture, achieved reduced power consumption and worked up to a frequency of 100 MHz when implementing the design on an IC. The author also implemented the design on an FPGA, achieving a frequency of 25 MHz and a decoding speed of more than 3 Mb/sec. In [8], a Viterbi decoder on an IC compatible with DVB standards was presented. It used REA and achieved a maximum frequency of 70 MHz on IC; the chip size was 10.6 mm2. In [9], a Soft Output Viterbi Algorithm (SOVA) decoder was presented. It achieved a throughput of 500 Mb/sec; however, it only supports an 8-state code (K=4), which is impractical for DVB-T, whose code has K=7. In [10], a Viterbi decoder compatible with DVB standards was presented. It supports multiple rates and a BER performance graph was given; it used a 5-bit soft decision metric and could reduce the number of adders in the ACS to 192. However, no measurement of area, speed, power efficiency or puncturing performance was given.

V. THE PROPOSED DESIGN

The decoder core comprises 3 main blocks:

The Branch Metric Unit (BMU) is used to calculate the four possible branch metrics.

The Add Compare Select Unit (ACS) is used to calculate the survivor path to each state.

The Survivor Memory Unit (SMU) is used to record the most likely decoded bits according to the ACS decisions.

A block diagram of the proposed design for the decoder core is given in Fig. 3.

Fig. 3. The block diagram of the proposed Viterbi decoder core

A. The Branch Metric Unit

A two-bit soft decision has been implemented using the Manhattan distance, which differs from the commonly used Euclidean distance in that the use of a multiplier is avoided.

The Manhattan distance in Fig. 4 does not need a multiplier or a square root circuit, which reduces area as well as combinational delay upon hardware implementation.

B. The ACS unit

For a bit serial operation, an add compare select recursion is carried out for each 2 bits of the received message. The branch metric for those two bits is added to the path metric to form the new path metric. Then, the path with minimum metric is selected to be the survivor path for a certain state. The ACS unit performs the Add Compare Select recursion every clock cycle to give the decisions indicating the survivor paths as an output, as shown in Fig. 3. Each decision is a bit which indicates whether the survivor trellis path to each state comes from an upper or a lower state in the trellis. The ACS unit takes the possible branch metrics as inputs from the BMU and accumulates them on the stored path metrics in the path metric storage. Using the ACS butterfly, it outputs the new path metrics (64 path metrics) to the path metric storage and the survivor path decisions (64 bits) to the SMU.

Euclidean distance: the shortest distance on a straight line between two points,

$ED = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$

Manhattan distance: the distance between two points measured along axes at right angles,

$MD = |x_1 - x_2| + |y_1 - y_2|$

Fig. 4. Definitions of the distances

C. The ACS butterfly

The ACS butterfly is the core of the ACS unit, responsible for calculation of the metrics and comparison to select the survivor path.

Fig. 5. A Trellis Butterfly

Fig. 5 shows the relation between the states of the trellis. This butterfly-shaped relation was used for the construction of the ACS unit as well as the SMU.
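Returning to the Branch Metric Unit of Section V.A, the following Python sketch shows how the four candidate branch metrics for one received pair could be formed with the Manhattan distance. It is an illustration only, not the authors' VHDL: the function name `branch_metrics`, the mapping of the 2-bit soft values onto levels 0 (strong zero) to 3 (strong one), and the convention that an erased bit contributes nothing to the metric are all assumptions.

```python
def branch_metrics(rx, ry, x_valid=True, y_valid=True, levels=4):
    """Return the 4 Manhattan branch metrics for one received soft pair.

    rx, ry   : soft values in 0..levels-1 (assumed: 0 = strong 0, 3 = strong 1)
    x_valid,
    y_valid  : False marks a bit inserted by de-puncturing (erasure), which
               is assumed to add nothing to any metric
    Index of the result is the expected pair packed as 2*X + Y.
    """
    top = levels - 1                    # ideal soft level of an expected '1'
    metrics = []
    for pair in range(4):
        x, y = pair >> 1, pair & 1      # expected encoder outputs X and Y
        dx = abs(rx - top * x) if x_valid else 0
        dy = abs(ry - top * y) if y_valid else 0
        metrics.append(dx + dy)         # additions only, no multiplier
    return metrics


if __name__ == "__main__":
    print(branch_metrics(0, 3))                  # [3, 0, 6, 3]
    print(branch_metrics(0, 3, y_valid=False))   # [0, 0, 3, 3]
```

For the soft pair (0, 3) the expected pair 01 gets metric 0 and the pair 10 gets metric 6. Only subtractions, absolute values and one addition are used, which is the reason given in the text for preferring the Manhattan distance.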

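The add compare select recursion over the 32 butterflies can be sketched in Python as follows. This is a behavioural model under assumptions, not the proposed hardware: the state numbering (newest bit in the most significant position), the names `expected_pair` and `acs_step`, and the tie-breaking rule are illustrative choices. Only the generators (171 and 133 octal) and the 64 states come from the text.

```python
G1, G2 = 0o171, 0o133      # generator polynomials of the DVB-T code (octal)
N_STATES = 64              # 2**(K-1) states for K = 7


def parity(v):
    """Modulo-2 sum of the bits of v."""
    return bin(v).count("1") & 1


def expected_pair(state, bit):
    """Encoder outputs for (state, input bit), packed as 2*X + Y."""
    reg = (bit << 6) | state            # input bit followed by the 6 stored bits
    return (parity(reg & G1) << 1) | parity(reg & G2)


def acs_step(path_metrics, bm):
    """One trellis stage.

    path_metrics : list of 64 accumulated metrics
    bm           : list of 4 branch metrics indexed by 2*X + Y
    Returns (new path metrics, 64 decision bits).
    """
    new_pm = [0] * N_STATES
    decisions = [0] * N_STATES
    for j in range(N_STATES // 2):              # one butterfly per j
        upper, lower = 2 * j, 2 * j + 1         # the two predecessor states
        for bit in (0, 1):                      # one wing per destination state
            nxt = (bit << 5) | j
            cand_u = path_metrics[upper] + bm[expected_pair(upper, bit)]
            cand_l = path_metrics[lower] + bm[expected_pair(lower, bit)]
            decisions[nxt] = int(cand_l < cand_u)   # 1 selects the lower state
            new_pm[nxt] = min(cand_u, cand_l)
    return new_pm, decisions
```

With this numbering, predecessor states 2j and 2j+1 feed destination states j and j+32, so the 64 new path metrics and the 64 decision bits of one clock cycle come from 32 independent butterflies.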

As shown in Fig. 6, the add compare select recursion is done for each state in a "wing" of the butterfly. So, each butterfly consists of 2 wings, and the ACS unit consists of 32 ACS butterflies.

Fig. 6. Butterfly wing

D. The Survivor Memory Unit

The SMU was implemented using the REA concept, which gives an L-bit register to each state, where L is the truncation length. It stores the decoded bits in those registers, where each state represents a path. When the ACS calculations determine the survivor to each state, the register of the survivor path is copied into the register of the new state. The ACS decisions act as selection lines to the multiplexers.

Fig. 7 shows the REA architecture for a 4-state code with a truncation length of 3 bits; the implemented design was done for a 64-state code with a truncation length of 31 bits.

Fig. 7. SMU REA architecture for a 4-state code

E. Minimum Path Metric unit

The Minimum Path Metric unit (Min PM) is used to determine the path with minimum metric after the end of the trellis. It is realized as arrays of comparators that work in parallel, and it gives the index of the survivor path in the Path Metric Storage as an output, as well as the minimum cost.

Fig. 8 shows only three levels of comparators, which are used for an 8-state code; for our design, six levels were implemented for a 64-state code.

Fig. 8. Finding the minimum index for an 8-state code.

F. The de-puncturing unit

The de-puncturing unit reverses the action of puncturing. It inserts don't cares in the bit-stream in place of the bits removed by the action of puncturing. The puncturing pattern is stored in a circular shift register which rotates each clock cycle; according to this puncturing pattern, zeros are inserted in place of the deleted bits. An output is taken from this puncturing pattern to the BMU. This output acts as a flag indicating whether the current bit is inserted (treated as an erasure) or is originally a channel output (treated normally).

A clock generation block is used to adjust the clock frequency of the decoder in accordance with the puncturing rate, as the decoder should work faster with higher puncturing rates.

Fig. 9. Simplified block diagram for the de-puncturing unit
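The behaviour of the de-puncturing unit can be illustrated with a short Python sketch. It is a software model under assumptions rather than the circuit of Fig. 9: the names `PATTERNS`, `puncture` and `depuncture` are hypothetical, the rotating list index stands in for the circular shift register, and the patterns are those listed for the DVB-T rates in Table 1 [2].

```python
# Puncturing patterns per code rate as (X pattern, Y pattern); 1 = bit is sent.
PATTERNS = {
    "1/2": ([1], [1]),
    "2/3": ([1, 0], [1, 1]),
    "3/4": ([1, 0, 1], [1, 1, 0]),
    "5/6": ([1, 0, 1, 0, 1], [1, 1, 0, 1, 0]),
    "7/8": ([1, 0, 0, 0, 1, 0, 1], [1, 1, 1, 1, 0, 1, 0]),
}


def puncture(pairs, rate):
    """Drop encoder outputs according to the pattern; returns a serial stream."""
    px, py = PATTERNS[rate]
    out = []
    for i, (x, y) in enumerate(pairs):
        j = i % len(px)                 # position in the rotating pattern
        if px[j]:
            out.append(x)
        if py[j]:
            out.append(y)
    return out


def depuncture(stream, rate):
    """Re-insert zeros for removed bits.

    Returns tuples (x, y, x_valid, y_valid); a False flag marks an inserted
    bit that the branch metric unit should treat as an erasure.
    """
    px, py = PATTERNS[rate]
    it = iter(stream)
    out, j = [], 0
    while True:
        try:
            x = next(it) if px[j] else 0
            y = next(it) if py[j] else 0
        except StopIteration:           # stream exhausted
            break
        out.append((x, y, bool(px[j]), bool(py[j])))
        j = (j + 1) % len(px)           # rotate the pattern by one position
    return out
```

Selecting a different key of `PATTERNS` changes the rate without touching the rest of the model, which mirrors the idea of choosing among the five rates at runtime. The validity flags play the role of the flag output described above.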

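The register exchange idea of Section V.D, together with the selection of the minimum path metric from Section V.E, can be modelled for a small code as in Fig. 7. The Python sketch below is a simplified illustration and makes several assumptions that are not taken from the paper: the 4-state code uses generators 7 and 5 (octal), hard-decision Hamming metrics replace the soft Manhattan metrics, the oldest bit of the best register is emitted at every step once the registers are full, and the names `encode` and `rea_decode` are hypothetical.

```python
K = 3                          # constraint length of the illustrative code
G = (0o7, 0o5)                 # assumed generators for the 4-state example
N_STATES = 1 << (K - 1)


def parity(v):
    return bin(v).count("1") & 1


def encode(bits):
    """Rate 1/2 convolutional encoder; returns a list of (X, Y) pairs."""
    state, out = 0, []
    for u in bits:
        reg = (u << (K - 1)) | state
        out.append((parity(reg & G[0]), parity(reg & G[1])))
        state = reg >> 1
    return out


def rea_decode(pairs, trunc_len):
    """Viterbi decoding with one decoded-bit register per state."""
    inf = 10 ** 9
    pm = [0] + [inf] * (N_STATES - 1)        # encoder starts in state 0
    regs = [[] for _ in range(N_STATES)]     # survivor registers
    half = N_STATES // 2
    decoded = []
    for rx, ry in pairs:
        new_pm, new_regs = [0] * N_STATES, [None] * N_STATES
        for ns in range(N_STATES):
            u = ns >> (K - 2)                # input bit that leads to state ns
            best = None
            for p in (2 * (ns % half), 2 * (ns % half) + 1):
                reg = (u << (K - 1)) | p
                bm = (parity(reg & G[0]) != rx) + (parity(reg & G[1]) != ry)
                cand = pm[p] + bm            # add
                if best is None or cand < best[0]:   # compare
                    best = (cand, p)         # select
            new_pm[ns] = best[0]
            new_regs[ns] = regs[best[1]] + [u]   # copy survivor register, append bit
        pm, regs = new_pm, new_regs
        if len(regs[0]) > trunc_len:         # registers full: emit oldest bit
            best_state = min(range(N_STATES), key=pm.__getitem__)
            decoded.append(regs[best_state][0])
            regs = [r[1:] for r in regs]
    best_state = min(range(N_STATES), key=pm.__getitem__)
    return decoded + regs[best_state]        # flush the best register at the end
```

No trace back is needed: each step only copies registers under control of the ACS decisions, which corresponds to the multiplexer structure described for the SMU. In hardware the comparator tree of the Min PM unit would replace the `min` call used here.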

Fig. 9 is a diagram showing a simple construction of the de-puncturing unit for only one puncturing pattern. However, in our design five different rates are supported, with the ability to choose between them online during runtime.

Fig. 10 shows the top level diagram for the whole Viterbi decoder with the de-puncturing unit. Note that the serial to parallel modules provide inputs at a rate of 2 bits/clock cycle and are not for parallel operation.

Fig. 10. Top level diagram for the proposed design

VI. RESULTS

The design was modeled in software for performance evaluation using MATLAB.

Fig. 11. Comparing the proposed design (using the Manhattan distance) with the MATLAB decoder (using the Euclidean distance); MATLAB AWGN channel model, truncation length of 50 bits, horizontal axis SNR (dB).

As shown, our design exhibits almost the same performance as the MATLAB decoder, but with no need for multiplication, as explained in Section V.A.

Fig. 12 shows the performance of the different puncturing rates adopted by the DVB-T standard in the proposed design.

Fig. 12. Performance for different puncturing rates of the proposed design; MATLAB AWGN channel model, truncation length of 50 bits.

A performance loss is noticed as the puncturing rate increases, which is normal due to the removal of more redundancy bits as the puncturing rate increases.

The design was implemented using VHDL [12] and simulated on ModelSim. It was successfully synthesized and implemented on a Spartan-3 S500E FPGA [13], which offers an easily reconfigurable platform for testing the design. The software used was Xilinx ISE 9.1i. The design area usage and operating frequency are given in Table 2.

Table 2. Area usage and frequency.

Slice Flip-Flops       652
Operating Frequency    274 MHz

VII. CONCLUSION

A Viterbi decoder compatible with the DVB-T standard, as well as a multi-rate de-puncturing unit which can handle the five puncturing rates supported by the DVB-T standard, were implemented on an FPGA. The use of the REA offers an increase in the operating frequency compared to the TBA. The maximum operating frequency achieved for the core is 274 MHz. Using the Manhattan distance instead of the Euclidean distance gave the proposed design less area than the designs which use the Euclidean distance. The design is capable of giving 1 bit/clock cycle as an output.


REFERENCES

[1] Seamus O'Leary, "Understanding Digital Terrestrial Broadcasting", Artech House, 2000.

[2] ETSI EN 300 744 V1.5.1 (2004-11), "Digital Video Broadcasting (DVB); Framing structure, channel coding and modulation for digital terrestrial television", European Standard.

[3] Robert H. Morelos-Zaragoza, "The Art of Error Correcting Coding", Second Edition, John Wiley & Sons, 2006.

[4] Andre Neubauer, Jurgen Freudenberger, Volker Kühn, "Coding Theory: Algorithms, Architectures and Applications", John Wiley & Sons, 2007.

[5] Herbert Dawid, Olaf J. Joeressen and Heinrich Meyr, "Viterbi Decoders: High Performance Algorithms and Architectures", 1995.

[6] Mohammed Benaissa, Yiqun Zhu, "A Novel High-Speed Configurable Viterbi Decoder for Broadband Access", EURASIP Journal on Applied Signal Processing 2003:13, 1317-1327.

[7] Dalia Abdel-Wahed Fouad El-Dib, "Low Power Register Exchange Viterbi Decoder for Wireless Applications", University of Waterloo, Ontario, Canada, 2004.

[8] C. Deltoso, M. Cand, L. Sponga, "A Punctured Viterbi Decoder Compatible with DVB Standards", France Telecom - CNET - 28.

[9] Engling Yeo, Stephanie A. Augsburger, W. Rhett Davis, Borivoje Nikolic, "A 500-Mb/s Soft-Output Viterbi Decoder", IEEE Journal of Solid-State Circuits, Vol. 38, No. 7, July 2003.

[10] Yih-Min Chen, Wei-Cheng Lin, "Design and Implementation of a Viterbi Decoder for DVB-T System", National Central University, Taiwan.

[11] Paul E. Black, "The Manhattan Distance", Dictionary of Algorithms and Data Structures, May 2006.

[12] Douglas L. Perry, "VHDL Programming by Example", 4th edition, McGraw-Hill, 2002.

[13] Spartan-3E Starter Kit Board User Guide, Xilinx UG230 (v1.0), March 9, 2006.
