
Institutionen för systemteknik
Department of Electrical Engineering

Master's Thesis (Examensarbete)

Evaluation of Word Length Effects on Multistandard Soft Decision Viterbi Decoding

Master's thesis carried out in Electronics Systems at the Institute of Technology, Linköping University

by

Ahmed Salim

LiTH-ISY-EX--11/4416--SE

Linköping 2011

Department of Electrical Engineering
Linköpings tekniska högskola
Linköpings universitet
SE-581 83 Linköping, Sweden


Evaluation of Word Length Effects on Multistandard Soft Decision Viterbi Decoding

Master's thesis carried out in Electronics Systems at the Institute of Technology in Linköping

by

Ahmed Salim

LiTH-ISY-EX--11/4416--SE

Supervisor: Kent Palmkvist, ISY, Linköpings universitet

Examiner: Kent Palmkvist, ISY, Linköpings universitet

Linköping, 6 December 2011


Avdelning, Institution / Division, Department:
Division of Automatic Control
Department of Electrical Engineering
Linköpings universitet
SE-581 83 Linköping, Sweden

Datum / Date: 2011-12-06

Språk / Language: Engelska / English

Rapporttyp / Report category: Examensarbete

URL för elektronisk version:
http://www.control.isy.liu.se
http://www.ep.liu.se

ISBN: —

ISRN: LiTH-ISY-EX--11/4416--SE

Serietitel och serienummer / Title of series, numbering: —

ISSN: —

Titel / Title: Evaluation of Word Length Effects on Multistandard Soft Decision Viterbi Decoding

Författare / Author: Ahmed Salim

Sammanfattning / Abstract:

Many parity-inducing techniques, such as Forward Error Correction (FEC), have been proposed to cope with channel-induced errors to a large extent, if not eradicate them completely. Convolutional codes are widely recognized as being very efficient among the known channel coding techniques. However, the process of decoding a convolutionally encoded data stream at the receiving node can be quite complex, time consuming and memory inefficient.

This thesis outlines the implementation of a multistandard soft decision Viterbi decoder and the word length effects on it. The classic Viterbi algorithm and its variant, the soft decision Viterbi algorithm, as well as zero-tail and tail-biting termination of the trellis, are discussed. For the final implementation in the C language, the zero-tail termination approach with soft decision Viterbi decoding is adopted. This memory-efficient implementation approach is flexible for any code rate and any constraint length.

The results obtained are compared with a MATLAB reference decoder. Simulation results are provided which show the performance of the decoder and reveal the interesting trade-off between finite word length and system performance. Such an investigation can be very beneficial for the hardware design of communication systems. This is of high interest for the Viterbi algorithm, as convolutional codes have been selected in several well-known standards such as WiMAX, EDGE, IEEE 802.11a, GPRS, WCDMA, GSM, CDMA 2000 and 3GPP-LTE.

Nyckelord / Keywords: Soft-Decision Viterbi, Tail-biting, Zero-tail, Direct Trace-back, Word Length effects, floating point quantization, BER analysis


Acknowledgments

First of all, I am thankful to Linköping University for admitting me to the master's programme "Communication Electronics" and giving me the chance to learn more.

I am really thankful to Kent Palmkvist (Associate Professor in the Electronics Systems division), ISY Department, Linköping University, for helping me unconditionally. He supervised this thesis and was also its examiner. He helped me a lot during the design phase, with MATLAB, C and the test bench for verification of results, and finally while writing the thesis report. I believe I learned a lot from him, which is a beautiful part of my memory.

I am especially thankful to Mikael Olofsson (Associate Professor) and Reza Mossavi (PhD student) from the Communication Systems division for long and conclusive discussions. They both contributed significantly to developing my in-depth understanding of the Viterbi algorithm. Occasionally, I got inspiration from Jacob Wikner (Associate Professor), Kent Palmkvist and Mikael Olofsson at various stages during my MS.

I am thankful to some of my fellow master's students: Azam Zia, Aqeel Afzal and Vishnu Unnikrishnan.

I am thankful to Tomas Otby (study coordinator) and Kerstin Hawkins (study coordinator). I found them really helpful and encouraging in all matters.

Though I took help from many people at various stages, ranging from research paper selection, understanding, design, implementation, analysis, tabulation of results and drawing graphs to writing the thesis in LaTeX, it is my examiner and supervisor Kent Palmkvist alone who gave direction to my thesis work in a reasonable way; I dedicate this thesis to him.

My parents and family always took care of me and encouraged me even when I was in a state of "floating"; I dedicate this thesis to them as well.


Contents

Acknowledgments
Contents
List of Figures
List of Tables
Acronyms

1 Channel coding in communication systems
   1.1 Communication Systems and their Evolution
   1.2 A Basic Communication System
   1.3 Fundamental Limits of Communication
   1.4 Channel Coding and its importance
   1.5 Error Detection and Correction
   1.6 Forward Error Correction (FEC)
   1.7 Objectives

2 Introduction to Convolutional Codes
   2.1 Introduction
   2.2 Convolutional Codes and Basic Terminology
      2.2.1 Code Rate
      2.2.2 Constraint Length
      2.2.3 Hamming Distance
      2.2.4 Free Distance and Error Correction Capability
   2.3 Convolutional Encoders
   2.4 State Diagram and Trellis Structure

3 Channel Models
   3.1 Introduction to Discrete Channel
      3.1.1 Discrete Memoryless Channel
      3.1.2 Discrete Channel with Memory
      3.1.3 Binary Symmetric Channel
   3.2 Continuous Channel
      3.2.1 Additive White Gaussian Noise Channel

4 Convolutional Codes: Decoding
   4.1 Convolutional Codes: Decoding
   4.2 Background of Viterbi Algorithm
   4.3 Maximum Likelihood (ML)
   4.4 Viterbi Algorithm
   4.5 Hard Decision Decoding
   4.6 Soft Decision Decoding
   4.7 Methods of Trellis Termination
      4.7.1 Trellis Truncation
      4.7.2 Trellis Termination
      4.7.3 Tail Biting

5 Word Length Effects
   5.1 Introduction
   5.2 Finite Length for Storage and its Effects
   5.3 Floating Point Numbers and Quantization
   5.4 Quantization for Fixed Point Programming
   5.5 Saturation Effects
   5.6 Word Length Estimation

6 Implementation Details
   6.1 Implementation Details
   6.2 Table: Convolutional encoding used in different standards
   6.3 Pseudocode: The Viterbi Algorithm
   6.4 Steps of Viterbi Algorithm

7 Simulation Results: Bit Error Rate Analysis
   7.1 Introduction
   7.2 Simulation System and Basic Definitions
   7.3 Simulation Results and Comparisons
   7.4 Conclusions

Bibliography


List of Figures

1.1 Block Diagram of a Simple Communication System.
1.2 Block diagram of an FEC System.
2.1 A 1/2-Rate (5, 7) Convolutional Encoder [1].
2.2 State Diagram for 1/2-Rate (5, 7) Convolutional Encoder [1].
2.3 Trellis Diagram for 1/2-Rate (5, 7) Convolutional Encoder [1].
3.1 Description of a Binary Symmetric Channel [1].
4.1 Block Diagram of Zero-Tail Encoder [2].
4.2 Block Diagram of Tail-Biting Encoder [2].
5.1 3 bit Saturation.
5.2 4 bit Saturation.
5.3 Quantization and Saturation.
5.4 Word length computation for Viterbi decoding algorithm.
6.1 Pseudocode of Viterbi Algorithm.
6.2 Soft decision decoding to determine the survivor at a particular time instant [3].
6.3 Operation Level Flow Chart.
6.4 Zero-Tail Traceback [2].
6.5 Tail-Biting Traceback [2].
7.1 Bit error rate calculation.
7.2 Comparison with MATLAB.
7.3 SNR versus BER at different word lengths.


List of Tables

7.1 Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.
7.2 Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.
7.3 Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.
7.4 Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.
7.5 Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.
7.6 Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.


Acronyms

Here are the main acronyms used in this document. The meaning of an acronym is usually indicated once, when it first appears in the text.

2G        2nd Generation
3G        3rd Generation
3GPP-LTE  3rd Generation Partnership Project - Long Term Evolution
4G        4th Generation
ACS       Add, Compare and Select Unit
AWGN      Additive White Gaussian Noise
BMU       Branch Metric Unit
BPSK      Binary Phase Shift Keying
BSC       Binary Symmetric Channel
ECC       Error Correcting Code
EDGE      Enhanced Data Rates for GSM Evolution
FEC       Forward Error Correction
GSM       Global System for Mobile Communication
GPRS      General Packet Radio Service
LLR       Log-Likelihood Ratio
ML        Maximum Likelihood
TBU       Trace Back Unit
UMTS      Universal Mobile Telecommunications System
VA        Viterbi Algorithm (classic)
WCDMA     Wideband Code Division Multiple Access
WiMAX     Worldwide Interoperability for Microwave Access
WLAN      Wireless Local Area Network


Chapter 1

Channel coding in communication systems

1.1 Communication Systems and their Evolution

The past few decades have seen the most rapid growth of technology and modernization of infrastructure in the realm of wireless communications. Wireless communication has been one of the most vibrant areas in the communication field [4]. Many wireless systems were standardized and put into effect, later replaced by new systems, and newer generations are in the process of standardization. This flurry of research in modern communication systems is the consequence of the quest to transmit large amounts of data reliably. This progress owes itself mainly to three factors: first, the human demand for connectivity; second, the advances in implementation of signal-processing algorithms; and third, the success of previous wireless standards, which provided real-life demonstrations of communication principles and algorithms [4].

Although research in the area of communication systems has been enormous, the quest to transmit large amounts of data reliably is still unquenched. There are two fundamental aspects of wireless communication that make the problem challenging and interesting. The first is the phenomenon of channel fading. The second is that the wireless medium is an open broadcast medium in which each receiver also listens to unintended transmitters. This extra interference power in some sense increases the noise level in the communication and reduces the communication reliability, unless proper measures are taken in the form of transmit-receive processing.


1.2 A Basic Communication System

A basic point-to-point communication system consists of a transmitter that is trying to send data to a receiver through a channel. The transmitter gets the data to be transmitted from some source, one example of which is human speech for an audio call over a cellular system, where the hand-held mobile set serves as the transmitter. Generally, a simple transmitter is an implementation of the following functions: source encoding, channel encoding and modulation. The receiver receives the transmitted data, which has undergone distortion and modification due to various channel effects, and it has to perform the transmit processing operations in the reverse order to retrieve the desired data. Thus a simple receiver is an implementation of demodulation, channel decoding and source decoding. In many applications, transmit and receive functionality is implemented in one unit, the prime example of which is the modern mobile phone. To send human speech it serves as transmitter, and during reception it receives the data through the channel (air) and, after the necessary processing, outputs the data (voice). The basic building blocks of a communication system are shown in Figure 1.1.

Figure 1.1: Block Diagram of a Simple Communication System.

1.3 Fundamental Limits of Communication

There is a branch of communication, called information theory, which studies the fundamental limits of communication. Claude E. Shannon is, without doubt, considered to be the founder of information theory, which was born with his landmark paper [5]. Information theory tries to


answer fundamental questions in the area [5], [6], some of which are given below:

• The minimum rate to which an information source can be compressed without any loss. This measure is termed the information rate of a source.

• The relation between distortion (with lossy compression) and the source information rate.

• The maximum rate that can be sent through a specific channel with an acceptably small error probability. This maximum data rate that can be sent through a channel with a small probability of error is termed the channel capacity.

There is a very nice and intricate interplay between the source information rate and the channel capacity, which is related by one of Shannon's theorems, the channel coding theorem [7], stated as follows:

"If the information rate of a given source does not exceed the capacity of a given channel, then there exists a coding technique that makes possible transmission through this unreliable channel with an arbitrarily low error rate."

This important theorem predicts the possibility of error-free (a very small probability of error, to be precise) transmission through a noisy or unreliable channel. Shannon's theorem does not indicate what coding technique to use, but it indicates the existence of at least one such coding technique if the source information rate is less than the channel capacity. This is how channel coding comes into the picture of communication systems and plays a prime role in data transmission.

As the main focus of this thesis work is channel coding and a decoding algorithm, details about source encoding and decoding are omitted; rather, it is assumed that these blocks are present in the system of concern. The discussion from here on concerns channel coding at the transmitter and channel decoding at the receiver. The next section describes why channel coding is essential for successful data transmission in a communication system.

1.4 Channel Coding and its importance

When information bits pass through a channel, they are corrupted due to channel impairments [4]. Channels are characterized by statistical and physical properties; as a result, some probability always exists that the channel may flip a bit to another value.


Different kinds of channel models already exist and are used to model real-world communication channels. The most common channel model used to study various systems is the additive white Gaussian noise (AWGN) channel, which only adds noise to the transmitted symbols. A number of fading channels have also been standardized, the most common of which are the Rayleigh fading and Rician fading channels. Contrary to these continuous channels, several discrete channels have also been studied, mostly in information theory, such as the binary symmetric channel (BSC) and the binary erasure channel (BEC) [6].

Channel coding in general adds some redundancy which may indicate whether the information received at the receiver is erroneous or not. Similarly, there are channel coding algorithms which can even correct the errors. The simplest example of a channel code design is the addition of a parity bit to a block of information bits to make the number of 1's in the block either even or odd. When the receiver receives this block, it may check the parity. If it is not the same as intended, the receiver knows that there have been some bit flips in the channel. But this simple code is helpless if two bits flip in one block; thus this code can at best indicate a single-bit error per block. Robust error indication requires the addition of more parity bits. Sophisticated channel coding algorithms provide a lot of flexibility: they not only provide error indication but also automatic error correction, within certain limits.
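As a small illustration of this parity-bit idea (a sketch only; the helper names are assumptions, not taken from any standard), the following C fragment computes an even-parity bit for a block and checks it at the receiver. It detects any odd number of bit flips but, as noted above, misses a double flip:

#include <stdio.h>

/* Parity of a block of bits: returns 1 if the number of 1s is odd,
 * so appending this bit makes the total number of 1s even. */
static int parity_bit(const int *bits, int len)
{
    int p = 0;
    for (int i = 0; i < len; i++)
        p ^= bits[i] & 1;           /* XOR accumulates the parity */
    return p;
}

int main(void)
{
    int block[8] = {1, 0, 1, 1, 0, 0, 1, 0};
    int tx_parity = parity_bit(block, 8);

    block[3] ^= 1;                  /* emulate a single bit flip in the channel */

    int rx_parity = parity_bit(block, 8);
    printf("error detected: %s\n", rx_parity != tx_parity ? "yes" : "no");
    return 0;
}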

Channel coding has been one of the favorite areas for communication researchers, and as a result various classes of codes have been developed. The two main categories of channel codes are block codes and convolutional codes [8], [9]. Both have their advantages and disadvantages depending upon the application. In the following, a brief overview of both categories is given to differentiate between them.

• Block codes work on fixed-size blocks of bits or symbols of predetermined size. Practical block codes can generally be decoded in time polynomial in their block length. Classical examples of block codes are Reed-Solomon, Golay, BCH, multidimensional parity and Hamming codes.

• Convolutional codes work on bit or symbol streams of arbitrary length. They are most often decoded with the Viterbi algorithm, though other algorithms are also used sometimes. The Viterbi algorithm provides optimal results efficiently, though at the cost of a complexity that grows exponentially with the constraint length.

In this thesis, the main emphasis is on convolutional codes.

1.5 Error Detection and Correction

As described earlier, channel codes of varying flexibility and functionality are available, ranging from pure error detectors to codes which (try to) correct the errors as well.


For a given set of working parameters of a system and a particular channel, the detection of errors is simpler than the correction of errors. The decision to apply detection or correction in a given code design depends upon the application requirements more than anything else.

When the communication system is able to provide full-duplex transmission (that is, a transmission for which the source and the destination can communicate at the same time in a two-way mode), codes can be designed for detecting errors, because the correction may be performed by requesting a repetition of the transmission. These schemes are known as automatic repeat request (ARQ) schemes [3]: if the code (and associated parity bits) indicates a possible error to the receiver, the receiver will ask for a retransmission of this data block. This may continue until the correct reception of the data. This does not mean that error correcting codes are not employed in full-duplex systems; of course, an error correcting code used in such a system will avoid a lot of retransmissions and will increase the overall system throughput.

On the other hand, there are communication systems for which the full-duplex mode is not allowed. An example of one such system is 'paging' in cellular systems, the sending of alphanumerical characters as text messages to a mobile user. In this type of transmission there is no possibility of demanding a retransmission in the case of a detected error, so the receiver has to implement some error correction algorithm to properly decode the message. This transmission mode is known as forward error correction (FEC) [3].

1.6 Forward Error Correction (FEC)

In an FEC system there is presumably no ARQ-style feedback mechanism, so if the receiver gets some erroneous data, it has no way to inform the transmitter that the data was incorrectly received. This absence of an ARQ mechanism makes it desirable for such systems to have good error correcting capabilities.

The source generates some data, which could be human speech for example. Then, after some processing (typically quantization and source encoding), binary values are obtained which represent the source data. Suppose that the rate of this source data is r_b bits per second. The encoder takes a group of k bits and encodes them into a group of n bits, where normally n is larger than k; the extra encoded bits represent the added redundancy. This redundancy, if intelligently added, may correct the errors incurred during the communication process. Figure 1.2 shows a block diagram of an FEC communication system. At this stage it is useful to mention that the channel should be rich enough to have an information capacity, denoted by r, larger than the source information rate r_b.


If this is not the case, the laws of information theory [6], [5] tell us that there is no hope for such a communication to get through successfully.

Figure 1.2: Block diagram of an FEC System.

The code used in the FEC system is characterized by various parameters. In the next chapter, parameters such as minimum distance, constraint length and code rate are defined, which govern the error detection and correction capabilities of a code. In most theoretical analyses, additive white Gaussian noise (AWGN) channels are normally assumed and the performance of codes is studied through probability-of-error analysis. This probability of error is studied at bit level, symbol level and frame level, although for practical systems it is normally the frame error rate which is specified and studied.

1.7 Objectives

Channel coding techniques lie at the heart of a communication system. Due to the ubiquity of channel impairments, channel coding techniques are equally ubiquitous, and every major communication standard, for example 2G (GSM), 3G UMTS (WCDMA), LTE, LTE-Advanced and WiMAX to name a few, has standardized some suitable channel coding techniques. This standardization means that every standard has defined the type of codes and their associated parameters to be used in that standard, so that base stations and mobile devices developed by different manufacturers can communicate without any problem. Most of the standards have standardized convolutional codes, and for these codes decoders based upon the Viterbi decoding algorithm are the de facto standard.

The objective of this thesis is the implementation of a soft decision Viterbi decoder which has enough flexibility to decode codes following various standards. To achieve this objective, as a first step, convolutional coding/decoding algorithms need to be studied in detail; they are described in the next chapter along with their decoding algorithms. Then data is gathered about which codes (and associated parameters) are popular among the standards. Basic channel models are also described, which are normally used to analyze the performance of communication systems and the various blocks therein, channel decoding in the case of this thesis work.


The classic Viterbi decoding algorithm and its variant, the soft decision Viterbi algorithm, are discussed in detail. Two widely used approaches for termination of the trellis, tail-biting and zero-tail termination, are discussed theoretically, while for the implementation in the C language the zero-tail termination approach with soft decision Viterbi decoding is adopted.

For performance benchmarking, a communication chain has been implemented in MATLAB, mostly using built-in functions. In the end, the performance comparison of the C implementation of the Viterbi decoder with the MATLAB decoder (used as reference) is provided. The implementation is basically done using floating point C variables. As hardware implementation of any algorithm brings about questions of finite word lengths and quantization, very basic quantizers (truncation blocks) are introduced in the proposed decoder implementation. These blocks allow investigation of the impact of fixed point programming, overflow detection and handling, and finite word length memory storage by adopting different levels of truncation. The simulation results provided in the end shed light on the trade-off between system performance and the available hardware word length.
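As a rough illustration of such a truncation block (a sketch only; the function name and interface are assumptions, not the thesis implementation), a metric can be saturated to a signed W-bit range before being stored:

#include <stdint.h>

/* Saturate a value to a signed two's-complement range of 'bits' bits,
 * emulating a finite word length register or memory location. */
static int32_t saturate_to_bits(int32_t value, int bits)
{
    const int32_t max = (1 << (bits - 1)) - 1;   /* e.g. bits = 4 -> +7 */
    const int32_t min = -(1 << (bits - 1));      /* e.g. bits = 4 -> -8 */
    if (value > max) return max;
    if (value < min) return min;
    return value;
}

A path metric computed in floating point can be quantized with a chosen step size and then passed through a block like this before storage, which is the kind of experiment the later word length analysis builds on.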


Chapter 2

Introduction to Convolutional Codes

2.1 Introduction

This chapter is dedicated to the description of convolutional codes. Most of the information contained in this chapter is taken from [3] and [1], which discuss these codes in great detail. Binary convolutional codes were first introduced by Elias in 1955. They are one of the most popular forms of binary error correcting codes and they have been used extensively in various applications, for example in different wireless communication standards (IMT-2000, GSM, IS-95), in digital terrestrial and satellite communications, and in broadcasting systems.

2.2 Convolutional Codes and Basic Terminology

Convolutional codes have been in use since the early 1950s. While the implementation of the encoder of a convolutional code is quite simple, the process of decoding the resultant data stream at the receiving node can be highly complicated. Dr. Andrew J. Viterbi was the first to put effort into the Viterbi algorithm, in the late 1960s. This algorithm revolutionized convolutional decoding and has been used as the major decoding algorithm in decoders for convolutional codes. The Viterbi decoder performs maximum likelihood decoding of the received coded sequence and can be regarded as a breakthrough compared to the highly complex implementations used in the past. Convolutional coding on the transmission side and Viterbi decoding on the receiver side have become a de facto industry standard for various modern communication systems.


The Viterbi algorithm is used in almost every mobile handset, because wireless standards have for many years been using convolutional codes; there are normally only slight differences in the associated code parameters, which are explained later in this chapter. Although the Viterbi algorithm, unlike many other decoding algorithms, simplifies the process of reconstructing the original data stream, its operation is still very complex in terms of implementation. Convolutional codes are a class of linear codes whose encoding operation can be viewed as a filtering process [8]. The name 'convolutional codes' comes from the mathematical idea behind any digital filter: linear convolution. In the following, some basic parameters associated with convolutional codes are defined.

2.2.1 Code Rate

The code rate R for any coded system is defined as R = k/n, where k is the number of input symbols and n is the number of output symbols. Convolutional codes operate on streams of input symbols, producing streams of output symbols. Normally k ≤ n, and this expansion into longer output blocks is what makes it possible to correct the errors introduced by the channel.

2.2.2 Constraint Length

The convolutional encoder has a simple structure of memory elements and adders. An important parameter of convolutional codes is the constraint length, which is governed by the number of memory elements in the convolutional encoder. If the number of memory elements in the encoder is m, the constraint length, denoted by K, is K = m + 1.

2.2.3 Hamming Distance

The Hamming distance between two bit sequences of equal length is the number of positions at which the corresponding symbols differ.

To understand the free distance dfree of a convolutional code it is necessary to understand the minimum Hamming distance. The minimal Hamming distance between different encoded sequences is measured with the help of the state diagram of the encoder. Suppose there are two code vectors c_1 and c_2 belonging to C, and d(c_1, c_2) is the number of positions where the two code vectors differ. For instance, if c_1 = (101010) and c_2 = (101100), then d(c_1, c_2) = 2, which is the Hamming distance [1]. According to the definition, it can be verified that

d(c_i, c_j) = Σ (c_i ⊕ c_j)    (2.1)
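As a minimal C sketch of this definition (the helper name is assumed here for illustration, not taken from the thesis code), the Hamming distance between two equal-length bit vectors can be computed directly from Equation 2.1:

/* Hamming distance: number of positions where two equal-length
 * binary vectors differ (Equation 2.1). */
static int hamming_distance(const int *c1, const int *c2, int len)
{
    int d = 0;
    for (int i = 0; i < len; i++)
        d += (c1[i] ^ c2[i]) & 1;   /* XOR is 1 exactly where the bits differ */
    return d;
}

For c1 = (101010) and c2 = (101100) this returns 2, matching the example above.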


The minimum value of the distance between all possible pairs of code sequences can be calculated and is called the minimum distance of the code; for a convolutional code this minimum distance over code sequences is the free distance dfree [3].

2.2.4 Free Distance and Error Correction Capability

The free distance of a code, denoted by dfree, is the minimal Hamming distance between all possible encoded sequences of that particular code. This parameter is very important and governs the error correction capability of the convolutional code.

When a bit is shifted into the encoder, the bit stored in the rightmost register is discarded and the current state becomes the next state [1]. In Figure 2.1 there are two memory elements, m = 2, so the constraint length is K = 3. The total number of states of any convolutional encoder is 2^(K-1), so four possible states exist for this encoder.

The free distance is thus the minimum Hamming distance between any two distinct code sequences and is a measure of the error-correcting capability of trellis-based codes. To achieve high performance through coding and decoding (fewer errors on the decoder side), convolutional codes with a higher free distance are used. A larger free distance means that the paths at a particular time instant are further apart, so there is less chance of a wrong selection; this is easy to understand once decoding through the trellis is understood.

The error correction capability t of a convolutional code is defined as the number of errors that can be corrected by that convolutional code. It can be calculated as

t = ⌊(dfree − 1) / 2⌋    (2.2)
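For example, the rate-1/2 (5, 7) encoder used as the running example in this chapter has dfree = 5, which gives

t = ⌊(5 − 1) / 2⌋ = 2,

i.e. up to two channel errors can be corrected, provided error events are sufficiently separated (as discussed below).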

To achieve reasonable error correction capability, error events should be separated by at least the constraint length of the code, measured in bits [3], [9]. In the past it has been determined through simulations which specific positions of the XOR gates give the best combination, i.e. the maximum possible free distance, for any particular K. A larger free distance of a code means more error correction capability, since there is less chance of a wrong path selection. Keeping the free distance high is therefore really important, and one obvious solution is to increase the number of memory elements, within certain limits.

One apparent drawback of increasing K (more states, a larger search space) is that a computationally more complex decoder is required, apart from more hardware [3]. Thus convolutional codes involve a trade-off between error correction capability (which increases with the number of memory elements) and decoding complexity (which also increases enormously with the number of memory elements).


2.3 Convolutional Encoders

A convolutional encoder is, in mathematical terms, a discrete and linear time-invariant system. Convolutional encoders can be described as digital filters which give the encoded output as a single stream. The structure of convolutional encoders is quite simple [1].

A convolutional encoder has memory, in the sense that the output symbols depend not only on the current input symbols but also on previous input symbols. A convolutional encoder can therefore be seen as a sequential circuit, or a finite-state machine, whose states are based on the contents of its memory elements. More on this is revealed with an example later on.

At a particular time instant the convolutional encoder is defined to be in a specific state, which is based on its memory contents. In a later section it is shown how the state diagram of a convolutional encoder helps in building a complex but systematic structure known as the trellis [3].

A convolutional code consists of the set of all binary sequences produced by a convolutional encoder. In practice, the state of the convolutional encoder is periodically forced to a known state. The main advantage of this becomes apparent when decoding the data at the receiver side, because the received data is decoded packet/frame based, or in a block-wise manner.

Convolutional encoders are a kind of digital filter. Their structure can be defined in terms of storage elements and adders. As all practical implementations are binary, so are the encoders. The memory elements of the encoder are simply 1-bit registers. The adders in the encoder take the current input as well as delayed versions of the input (stored in the memory elements) and perform modulo-2 addition. If an encoder is constructed without any feedback loop, it is said to be a feedforward encoder. Mathematically, these encoders are defined by connection vectors linking the memory elements to the adders that perform modulo-2 addition. An element of such a connection vector is one if the corresponding connection exists and zero if it is absent [1].

As an example, a rate R = 1/2 encoder is shown in Figure 2.1. Its connection vectors are [101] and [111], which correspond to the decimal equivalents 5 and 7; therefore it is called a rate (5, 7) convolutional encoder. A binary stream is fed as input to this encoder and it produces two output streams, as its design suggests. The values c[1] and c[0] are its two outputs, which come out when each bit is entered into the encoder and processed. The output of the encoder is taken in an interleaved manner, which means taking one after the other, each time producing a complete codeword c for that particular time instant. As a result, it can be said that every codeword is a specific combination of the binary input value at that time and the values stored in the memory elements. This specific combination consists of the positions of the adders and their relation to the memory elements; these combinations were investigated by researchers in the past.


Figure 2.1: A 1/2-Rate (5, 7) Convolutional Encoder [1].

In simple words, each input bit produces two output bits, so this is called a rate R = 1/2 convolutional encoder [1].

It is important to understand that c_K[i] represents bit i in the vector c_K, and that the least recently entered bit is farthest to the right and the most recently entered bit is farthest to the left in the memory elements. For example, if c_K = 10, then c_K[1] = 1 and c_K[0] = 0. The same notation is used in the rest of the thesis.
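The following C sketch illustrates this encoder; it is a minimal illustration under the conventions above, not the thesis implementation, and the function name is an assumption. It encodes one input bit at a time with the (5, 7) feedforward encoder of Figure 2.1, using a 2-bit shift register as the state:

/* Rate-1/2 (5, 7) feedforward convolutional encoder, constraint length K = 3.
 * The 2-bit state holds the previous two input bits: bit 1 is the most
 * recent input, bit 0 the one before it (most recent bit to the left,
 * matching the notation above). */
static void encode_bit(unsigned *state, unsigned in, unsigned out[2])
{
    unsigned b1 = (*state >> 1) & 1;   /* input from one step ago  */
    unsigned b2 = *state & 1;          /* input from two steps ago */

    out[0] = in ^ b2;                  /* generator 101 (octal 5)  */
    out[1] = in ^ b1 ^ b2;             /* generator 111 (octal 7)  */

    *state = ((in << 1) | b1) & 0x3;   /* shift the new bit in     */
}

Feeding a zero-initialized state with the message bits, followed by K − 1 = 2 zero bits, produces the zero-tail codeword discussed in Chapter 4.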

2.4 State Diagram and Trellis Structure

The state of a convolutional encoder is described by the bits stored in its memory elements at that time instant. Since a convolutional encoder is a state machine, it is best described by a state diagram. In the state diagram it can easily be seen how all possible inputs affect the output and the next state of the encoder. The (5, 7) encoder shown in Figure 2.1 contains two 1-bit registers storing one bit each, which is the obvious reason why the state of this convolutional encoder is represented by 2 bits. How the state changes from one to another is represented by arrows, and the corresponding values of input and output are also labeled. Figure 2.2 shows the state diagram of the (5, 7) encoder described in the previous example [1].

A trellis diagram shows all the possible states and the changes of state, with information about the corresponding input and output of the encoder, over time.


Figure 2.2: State Diagram for 1/2-Rate (5, 7) Convolutional Encoder [1].


Any encoded sequence (i.e. codeword) can be represented as a path in this trellis diagram (a kind of butterfly extension), and this is the key point in using the trellis to process encoded paths when performing decoding. Figure 2.3 shows one stage of the trellis corresponding to the rate-1/2 (5, 7) encoder [1].

Figure 2.3: Trellis Diagram for 1/2-Rate (5, 7) Convolutional Encoder [1].

There are 2^m states at each stage of the trellis, where m is the number of memory elements in the encoder. Each state gives two outgoing edges, corresponding to the two possible inputs '0' and '1'. Thus each stage of the trellis has 2^(m+1) edges. To proceed further with the trellis structure, it is important to understand a few parameters:

The state index q. The states of the trellis are labeled from top to bottom in each stage and can take the values q ∈ {0, 1, 2, ..., 2^m − 1}. These values are taken from the contents of the memory elements. Each stage of the trellis can be seen as two columns of states, with arrows from the starting states to the ending states, labeled q_S and q_E respectively [1].

The edge index e. The edge index is labeled from top to bottom according to its leaving state. It can take the values e ∈ {0, 1, 2, ..., 2^(m+1) − 1} [1].


Input and output, m(e)/c(e). Each input into the encoder gives an output vector, which depends on the current input and the current contents of the memory elements. Every state has two outgoing edges, each labeled with its own input and output information, defined by the encoder's structure [1].
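A trellis stage can be precomputed as next-state and output tables indexed by (state, input). The sketch below is illustrative only; it reuses the hypothetical encode_bit() helper sketched in Section 2.3 and is not code from the thesis:

#define NUM_STATES 4                         /* 2^m with m = 2 memory elements */

static unsigned next_state[NUM_STATES][2];   /* next state for input 0 or 1    */
static unsigned output_sym[NUM_STATES][2];   /* packed 2-bit codeword per edge */

static void build_trellis(void)
{
    for (unsigned q = 0; q < NUM_STATES; q++) {
        for (unsigned in = 0; in < 2; in++) {
            unsigned state = q, out[2];
            encode_bit(&state, in, out);     /* run the encoder for one step */
            next_state[q][in] = state;
            output_sym[q][in] = (out[0] << 1) | out[1];
        }
    }
}

Tables like these let a decoder walk the trellis without re-deriving the encoder logic at every stage.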


Chapter 3

Channel Models

In any communication system it is important to understand how a signal is affected by the transmission channel it encounters. A channel is the medium through which the transmitted information travels, and the information is affected by noise along the way. There are also other channel properties that may affect the received data, making it difficult to decode. There exists a probability that a given transmitted symbol is converted into another symbol; from this point of view, a channel is considered unreliable. A few key parameters related to channel modeling are described below. This chapter focuses on two common models for communication channels that are used when evaluating the performance of convolutional codes: the BSC and the AWGN channel.

3.1 Introduction to Discrete Channel

A discrete channel consists of a set of input symbols, a set of output symbols and a relation between them characterized by conditional probabilities. Suppose the input symbols are represented by (x_1, x_2, ..., x_U) and the output symbols by (y_1, y_2, ..., y_V); the conditional probability is then given as P(y_j | x_i), the probability that symbol y_j is received when x_i was transmitted [3].

3.1.1 Discrete Memoryless Channel

A discrete memoryless channel treats every input symbol independently of the input symbols at earlier times.


3.1.2 Discrete Channel with Memory

A discrete channel with memory treats every input symbol with some relation to earlier input symbols.

3.1.3 Binary Symmetric Channel

The binary symmetric channel (BSC) is the simplest model used for channel modeling when establishing a basic communication system in simulations. It is a channel model that does not add continuous-valued noise to the transmitted symbols; instead, each transmitted bit may be flipped. At each time instant a bit is transmitted; the probability that this bit is received in error is p, and 1 − p is the probability of success. When the BSC arises from hard-decision BPSK reception over a noisy channel, the bit error probability is given by

P_e ≅ Q( √( 2 E_b / N_0 ) )    (3.1)

where E_b/N_0 is the energy-per-bit-to-noise ratio [9], also referred to as the bit signal-to-noise ratio (SNR) or SNR per bit, and the Gaussian Q function is given by

Q(x) = (1/√(2π)) ∫_x^∞ e^(−z²/2) dz    (3.2)

The value p is known as the probability of error, because it represents the probability that a bit changes from 0 to 1 or from 1 to 0, as can be seen in Figure 3.1. A BSC transmission can be modeled as

r = s ⊕ n,    (3.3)

where r represents the received bits, s the transmitted bits, and n is used to model possible bit errors. If there is a bit error at time i, n_i will be 1; otherwise it is 0. The ⊕ operator, also known as XOR, applies any bit error at n_i by changing the value of s_i from 0 to 1 or from 1 to 0 [1].
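A BSC is straightforward to simulate. The sketch below is illustrative only (the rand()-based randomness would be replaced by a better generator in practice); it flips each bit independently with crossover probability p, i.e. it realizes r = s ⊕ n:

#include <stdlib.h>

/* Pass a bit vector through a binary symmetric channel with crossover
 * probability p: r[i] = s[i] XOR n[i], where n[i] = 1 with probability p. */
static void bsc_channel(const int *s, int *r, int len, double p)
{
    for (int i = 0; i < len; i++) {
        int flip = ((double)rand() / RAND_MAX) < p;   /* error indicator n[i] */
        r[i] = s[i] ^ flip;
    }
}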

To check the performance of a channel used for a particular communication system, the transmitted bits are typically compared with the received bits, and the most basic approach for this is the Hamming distance. The Hamming distance compares every bit of one sequence (the transmitted one) to the corresponding bit of the other sequence (the received one) and counts the number of bits which differ. When a sequence is received over a BSC, it is compared with a few possible transmitted sequences to determine which one it most closely resembles. The candidate sequence which gives the least Hamming distance is declared the best possible sequence, or in other words the best possible path.


Figure 3.1: Description of a Binary Symmetric Channel [1].

However, different scenarios are possible.

3.2 Continuous Channel

Unlike a discrete channel, where the input and output may take values only from a finite alphabet, the outputs of a continuous-valued channel may take any value (they are not limited to a finite set). The most well-known example of a continuous-valued channel is the additive white Gaussian noise (AWGN) channel.

3.2.1 Additive White Gaussian Noise Channel

The additive white Gaussian noise (AWGN) channel is the most commonly used model to characterize a communication channel. In an AWGN channel, noise is introduced (contrary to the BSC), and the probability distribution of the noise is Gaussian, hence the name. The AWGN model can be used to capture the effects of several kinds of channels, so it is regarded as a standard channel model, with different parameters according to the environment. It is used for benchmarking of communication systems and it helps in evaluating a system's performance as the noise parameter varies.

Prior to passing through the channel, modulation is performed on the transmission side. A simple modulation scheme is binary phase-shift keying (BPSK), where 0 is represented with some positive amplitude and 1 with the negative amplitude. These are kept distinct so that on the receiver side it is possible to distinguish them. Higher order modulation schemes (like 8PSK, 16QAM and 64QAM) provide higher data rates, as one modulated symbol of these schemes carries more than one bit of information (as opposed to BPSK).


The two major drawbacks of these modulation schemes are poor performance in the case of a very noisy channel and a higher required E_b/N_0 [10]. After modulation, the bits are sent over the channel, where they encounter additive Gaussian noise. A mathematical representation of this is:

r = s ⊕ n,

In this equation r represents the received values (a noisy version), s represents the transmitted values (the modulated bits), and n represents the noise values. This is similar to the BSC model in Equation 3.3; however, each sequence is real-valued and the ⊕ operator here performs ordinary addition of reals. According to stochastic theory, the values of n are independent Gaussian random variables with mean µ_n = 0 and variance σ² = N_0/2, where µ_n is the mean (expected value) of the noise and σ² is its variance [1].

The performance of a digital communication system is often evaluated with the bit error rate (BER) versus the signal-to-noise ratio (SNR). The SNR is defined as E_b/N_0, where E_b is the energy per information bit. In an uncoded BPSK system, each transmitted symbol corresponds to one information bit, which means E_s = E_b [1].

d_E²(s, r) = Σ_{i=0}^{n−1} ( s[i] − r[i] )²    (3.4)

When the channel is AWGN, the Euclidean distance is most often used as the key metric for selecting the most likely transmitted sequence. The Euclidean distance gives much better results than the Hamming distance in this case, as is explained in the following chapter [1].
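As a minimal simulation sketch of this channel model (the function names and the Box-Muller noise generator are illustrative assumptions, not the thesis code), BPSK maps bits to ±1, Gaussian noise of variance N_0/2 is added, and the squared Euclidean distance of Equation 3.4 serves as the soft metric:

#include <math.h>
#include <stdlib.h>

/* One zero-mean Gaussian sample with the given standard deviation
 * (Box-Muller transform). */
static double gaussian(double sigma)
{
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);   /* in (0,1), avoids log(0) */
    double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
    return sigma * sqrt(-2.0 * log(u1)) * cos(2.0 * M_PI * u2);
}

/* BPSK over AWGN: bit 0 -> +1, bit 1 -> -1, then add noise of variance N0/2.
 * Es = Eb = 1 is assumed (uncoded BPSK, as in the text). */
static void bpsk_awgn(const int *bits, double *r, int len, double ebno_db)
{
    double sigma = sqrt(1.0 / (2.0 * pow(10.0, ebno_db / 10.0)));
    for (int i = 0; i < len; i++)
        r[i] = (bits[i] ? -1.0 : 1.0) + gaussian(sigma);
}

/* Squared Euclidean distance between a candidate symbol sequence s
 * and the received sequence r (Equation 3.4). */
static double euclidean_sq(const double *s, const double *r, int len)
{
    double d = 0.0;
    for (int i = 0; i < len; i++)
        d += (s[i] - r[i]) * (s[i] - r[i]);
    return d;
}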


Chapter 4

Convolutional Codes: Decoding

4.1 Convolutional Codes: Decoding

This chapter describes in great detail two important algorithms for decoding convolutional codes: the classic Viterbi algorithm and its variant, the soft decision Viterbi algorithm. Later on, different methods of trellis termination are discussed and analyzed from a performance point of view.

4.2 Background of Viterbi Algorithm

The most common method for decoding a convolutionally encoded sequence is the Viterbi algorithm. It was first introduced in [11] and later analyzed in [12]. As mentioned in Chapter 2, the Viterbi algorithm uses a special structure called a trellis, consisting of states and their connections according to a specific relation governed by the encoder used on the transmission side. The Viterbi algorithm performs its processing on the trellis structure and finds the best possible path among the many followed through the trellis. This is possible because, during encoding, each sequence follows a unique path through the trellis. All convolutional decoders essentially try to find that best possible path, using different criteria.

It is claimed in [12] that the Viterbi algorithm is an optimum decoding algorithm for convolutional codes. The classic Viterbi algorithm was invented by Andrew J. Viterbi in 1967. Its optimality was proven by David Forney, who is also credited with the first idea of the trellis structure [13].


The computational complexity of the Viterbi algorithm has motivated many newer algorithms, which are basically different non-optimum variants of the classic Viterbi algorithm; their performance degradation, however, is still acceptable within certain bounds.

4.3 Maximum Likelihood (ML)

One could propose a decoding algorithm that compares the received data stream with every possible path through the trellis and gives the most likely path. The Viterbi algorithm uses the maximum likelihood (ML) approach to find the best possible path. The details of maximum likelihood detection can be explained in the following way:

Suppose a given code sequence c is generated by the encoder of a convolutional code; the channel noise converts this sequence into the received sequence r_k, which is in fact a noisy version of the code sequence c. An optimal decoder compares the conditional probabilities P(r_k | c) that the received sequence r_k corresponds to each possible code sequence c, and then decides which code sequence has the highest conditional probability:

P(r_k | c') = max_{all c} P(r_k | c)    (4.1)

This criterion is called maximum likelihood. The decoding procedure tries to maximize this probability function during the selection of a sequence: it gives the most likely transmitted sequence by selecting among all code sequences. Interestingly, this does not provide an efficient approach if applied by brute force. For a code sequence of L bits there are 2^(RL) possible sequences, where R is the rate of the code, so the decoder would still have to process that many possibilities, which becomes impractical for long input data streams. In simple words, the maximum likelihood decoder selects the sequence c', from the set of all these possible sequences, which most closely resembles the received one. It is also assumed that the channel is memoryless and the noise is AWGN (additive white Gaussian noise), which collectively means that each symbol is independently affected by the noise [3].

4.4 Viterbi Algorithm

The Viterbi Algorithm (VA) performs maximum likelihood decoding on the received sequence. It is applied to the trellis of a convolutional code, where the trellis itself is constructed from the convolutional code. The ML criterion helps the Viterbi decoder find the most likely sequence; its main drawback is that the number of possible code sequences is very large, which causes a huge computational complexity. Here the VA plays its important role and reduces this computational complexity by avoiding the calculation for all possible code sequences.

The Viterbi decoding procedure consists of calculating the cumulative distance between the received sequence at a time instant ti at a given state of the trellis, and each of the code sequences that arrive at that instant ti. This calculation is done for all states of the trellis, and for successive time instants, in order to look for the sequence with the minimum cumulative distance. The sequence with the minimum cumulative distance is the same as the sequence with the highest probability of having been transmitted, given the received sequence, provided that transmission is done over the AWGN channel [11].

4.5 Hard Decision Decoding

In hard decision decoding, the metric used to measure the cumulative distance between the received sequence and the output of each transition of each trellis stage is the Hamming distance. The Hamming distance has already been discussed with an example in Chapter 2. To calculate the Hamming distance, the received signal values first have to be converted into binary values. This conversion can itself introduce some errors, and the metric is not very sophisticated for branch metric calculation. The branch metric is a parameter related to Viterbi decoding, and its details are given in a later chapter. The Hamming distance is used to find the maximum likelihood path.
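As a small illustrative sketch in C (not code from the thesis; the function name and bit packing are assumptions), a hard decision branch metric is simply the Hamming distance between the hard-sliced received bits and the output bits labelling a branch:

    #include <stdint.h>

    /* Hamming distance between two bit patterns packed in the low 'n' bits
       (n < 32). Used as the branch metric in hard decision Viterbi decoding. */
    static unsigned hamming_distance(uint32_t a, uint32_t b, unsigned n)
    {
        uint32_t diff = (a ^ b) & ((1u << n) - 1u); /* differing bit positions */
        unsigned count = 0;
        while (diff) {                              /* count the set bits      */
            count += diff & 1u;
            diff >>= 1;
        }
        return count;
    }

For a rate 1/2 code, n = 2 and the metric of each branch is 0, 1 or 2.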

4.6 Soft Decision Decoding

In soft decision decoding, the metric used for the distance calculation is the Euclidean distance. The Euclidean distance is calculated, element by element, between the real values of the received sequence and the binary-mapped output of each state transition. It is assumed that soft values are provided to this type of decoder; soft values here mean reliability information associated with the channel output values. While the Euclidean distance is used to find the maximum likelihood path, the channel assumed is the Additive White Gaussian Noise (AWGN) channel. Almost all practical implementations of Viterbi decoding use the soft decision criterion.

Maximum likelihood decoding can be considered a special case of soft decision decoding; in ML decoding the decoder outputs the most likely sequence. It has been reported that soft decision decoding gives 2 to 3 dB more coding gain compared to hard decision decoding while maintaining the same bit error probability [9]. The Euclidean distance gives better performance and helps in finding the most probable ML path. However, the improved results are achieved at the cost of higher computational complexity. An important thing to notice is that the Euclidean distance metric performs its computation over real values, unlike the binary values in the Hamming distance formula. In the literature it is said that eight quantization levels are enough to achieve essentially the same performance as with unquantized channel output values [14]. While the reliability information associated with the channel output gives finer results and a lower bit error rate, the cost is paid in terms of complexity. In other words, only 50% to 63% of the transmitted power is required compared to hard decision decoding [9].

The maximum likelihood criterion is applied by considering which code vector or sequence is closest to the received vector or sequence, that is, which code vector or sequence minimizes the distance with respect to the received vector or sequence. In this case the decision criterion is not based on a single bit alone; the probability of each bit being zero or one relative to the others is also taken into account. So probability theory helps in finding the correct transmitted sequence, while adding complexity to the computational details and the receiver architecture [15], [3].

4.7 Methods of Trellis Termination

The Viterbi algorithm operates over a special structure called a trellis, which is formed by state probabilities and branch metrics. The trellis structure makes it easy to understand the processing steps of the Viterbi algorithm. Trellis termination is related to the starting and ending states of the convolutional encoder, and it directly helps the decoding procedure. In simple words, trellis termination is the way in which the trellis is ended in some specific states according to a rule or criterion. Trellis termination is important from the traceback point of view. It should be remembered that convolutional codes are not really block codes and the Viterbi decoder operates on a continuous sequence of encoded data. However, this continuous data is split into traceback lengths for processing, so the data is considered in the form of packets (or possibly blocks), and each packet has one or a few independent blocks. The number of blocks in a packet and the traceback lengths vary between implementation standards. Though encoding and decoding can start from any state, since there is no constraint imposed by the algorithm or the trellis structure, this is not the usual practice. While encoding, an assumption is used: start encoding from a known state and, similarly, end encoding in a known state after all data bits have been encoded. The next section explains why this assumption has become an essential part of the encoding/decoding procedure [2], [16].


Types of Termination:

The three most popular methods of terminating a trellis are the following:

• Trellis truncation

• Trellis termination

• Tail biting

Different standards use different types of trellis termination, depending upon the application. For example, if the packet length is small then "Tail Biting" is more efficient than the other two techniques. For reasonable packet lengths, "Zero Tail" is considered a promising technique. The details of these techniques are given in the following subsections.

4.7.1 Trellis Truncation

Trellis truncation is the simplest method. The encoder has zeros filled into all memory elements each time it starts encoding a new packet or block. Since there is no mechanism to bring this type of encoder into a specific ending state, it is a crude form from the traceback point of view: the corresponding decoder does not know from which state to start the final traceback, and random selection, or any other proposed criterion, does not give good results [2]. This method therefore gives degraded performance in terms of bit error rate (BER).

4.7.2 Trellis Termination

Trellis termination is used in almost all Viterbi decoders, with rare exceptions. As in "Trellis Truncation", this type of encoder is also reset to zero before starting to encode each packet or block. The difference from the previous technique is that a fixed number of zero bits are shifted into the encoder at the end of each packet. This fixed number of zero bits is equal to the number of memory elements of the encoder used. This is obviously done to bring the encoder back to the zero state after each block is encoded. An important thing to remember is that the output bits corresponding to those zero bits are not part of the information; they are either discarded or taken care of while decoding on the receiver side [2].

This method is also given the name "Zero-Tail" termination due to the zero bits at the end of each packet. In this method the encoder's starting and ending states are both zero and therefore known. The decoder uses this knowledge to select the right traceback path at both its start and its end. As is obvious, its BER performance is much better than that of the trellis truncation method; the cost is paid in terms of extra transmitted bits (to make the starting and ending states known).

Figure 4.1: Block Diagram of Zero-Tail Encoder. [2]

Another point to notice is that the code rate is reduced when this type of encoder and/or decoder is used. Suppose there were N information bits; then (N + number of zero bits)/R bits are transmitted. These extra bits (the zero bits) consume transmission time as well as the energy needed to send them, while the benefit is achieved in terms of a reduced required Eb/N0 for a given probability of error. This overhead is only significant when N is very small [2].
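A minimal sketch of how zero-tail termination can be applied before encoding (illustrative only; the function name and array layout are assumptions, although K = 7 matches the thesis parameters):

    #include <string.h>

    #define K 7               /* constraint length                 */
    #define M (K - 1)         /* number of encoder memory elements */

    /* Append M zero tail bits to a packet of 'len' information bits.
       'out' must have room for len + M bits. Returns the new length.
       The tail of zeros drives the encoder back to the all-zero state. */
    static int append_zero_tail(const unsigned char *info, int len, unsigned char *out)
    {
        memcpy(out, info, (size_t)len);
        memset(out + len, 0, (size_t)M);
        return len + M;
    }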

4.7.3 Tail Biting

This technique was introduced to avoid the extra cost of transmitting zero bits in zero-tail termination. In this method no zero bits are transmitted; instead, the last few bits of each packet are shifted into the encoder before the encoding of that packet starts. The number of these bits is equal to the number of memory elements. As a result the starting and ending states of the encoder are again identical and known, without transmitting extra zeros, which is obviously useful on the decoder side. An important thing to notice is that during this initialization of the encoder with the last bits of the packet, the output bits of the encoder are discarded, as they do not correspond to information bits. One major drawback of this approach is that the complete packet must be available at the encoder input before encoding can start, so this scheme introduces some delay.

The encoder in figure 4.2 shows tail biting for a constraint length 7 example. In this figure the encoder is initialized with the last six bits of the packet, and the first symbol at the output of the encoder is produced when "inputdata" = d0, which is the very first bit of the packet. The last symbol at the output of the encoder is formed as soon as the last bit of the packet enters the input.

A variant of this technique is to initialize the encoder with the first few data bits of the packet (equal to the number of memory elements) and ignore the encoder output corresponding to these data bits. Here again no extra symbols are transmitted to bring the encoder into the same starting and ending state. After the first Z data bits, the remaining (N - Z) data bits are encoded and transmitted. This has the same effect of making the starting and ending states identical. Its advantage is that it does not require the entire packet before encoding starts; however, the bits are out of sequence at the receiver [2].
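As an illustrative sketch (the bit ordering convention and the function name are assumptions, not taken from the thesis code), tail biting only requires initializing the encoder shift register with the last m = K - 1 bits of the packet so that the starting and ending states coincide:

    #define K 7
    #define M (K - 1)

    /* Build the initial encoder state from the last M bits of the packet,
       with the most recent bit placed in the least significant position. */
    static unsigned tail_biting_init_state(const unsigned char *packet, int len)
    {
        unsigned state = 0;
        for (int i = 0; i < M; ++i) {
            /* packet[len - 1 - i] is the (i+1)-th most recent bit */
            state |= (unsigned)(packet[len - 1 - i] & 1u) << i;
        }
        return state;
    }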

This was an overview of the three schemes for terminating a trellis. All three have their own advantages and disadvantages, as discussed briefly, and none of them can be declared universally acceptable. Depending on packet length, delay tolerance and performance requirements, different standards use different approaches, and sometimes combinations of them.


Figure 4.2: Block Diagram of Tail-Biting Encoder. [2]


Chapter 5

Word Length Effects

5.1 Introduction

The main task of this chapter is to give an idea of the fixed point behaviour of floating point numbers and the issues related to it. In a practical implementation, although many floating point signals (interacting variables) are encountered, the storage spaces (whatever kind of memory is used) are always given as some fixed number of bits. Hence a good knowledge of the behaviour of floating point signals (variables) when confined to fixed storage spaces is very important for system design.

The term "word length" is used to denote this fixed storage space (mostly expressed as a number of bits) in which a particular signal can be stored. A fully general, detailed investigation of floating point quantization issues is out of the scope of this thesis. Hence, in the implementation of the Viterbi algorithm, which is the main objective of this thesis, the effects of quantization by truncation are simulated, i.e., by limiting the signal to some specific range. For the specific scenario of this thesis implementation, introducing truncation raises a few basic issues:

• The relation between data rates, hardware, and fixed point numbers (storage spaces).

• The reasons for introducing quantization/truncation blocks in the proposed system and the benefits achieved from them.

• The numerical errors introduced while estimating the word length and how they are handled efficiently.


• Selection of a reasonable word length so as to achieve an acceptable performance.

• The trade-off between quantization/truncation and system performance.

5.2 Finite Length for Storage and its Effects

In the modern world, micro-controllers and digital computers are in use everywhere. Whenever computers (or digital devices in general) have to store floating point numbers, they have to manage them in such a way that they fit in fixed memory spaces. There will be numbers which the computer will not be able to store exactly in this memory space, and this requires some kind of truncation or rounding-off operation. Whenever numbers go through truncation or rounding-off before storage, they may become different from their original values, to an extent which depends upon the level of this truncation or rounding-off operation. Sometimes this error can be ignored, depending upon the application or the criticality of the situation; however, it is not possible to ignore it completely in all cases.

These round-off problems arise because every numerical value is stored at a specific location in hardware memory, which is finite and governed by a fixed number of bits. This fixed number of bits is usually a power of 2, most often 8, 16, 32 or 64 bits, depending upon the system in use. This is a fundamental limitation when working with hardware and cannot be bypassed. A trivial solution to this problem would simply be to increase the storage space used to hold the value of the variable. While this definitely increases the storage capacity, which can hold 2^b bit patterns for b-bit storage, one can never have infinite precision, as it would require infinite memory, which simply translates into infinite hardware. On the other hand, it is very important to design cost-effective hardware, which requires using only the memory that is actually needed; excessive memory utilization is not considered good design.

The size of the memory space for a variable, which is termed the word length, should be selected as the minimum that does not harm the performance of the system. This requires the system designer to have a deep knowledge of the signals inside the system, the values these signals take, and the impact of their truncation or rounding-off.

In computing, numbers are represented either as floating point (also called real numbers) or as fixed point. The evaluation of fixed point and floating point numbers can be understood only with a good concept of range and precision. In simple words, the range is defined by the smallest and the largest number that can be represented, while the precision can be defined as the size of the gap between two consecutive representable values (numbers). In computer hardware the possible number widths define the CPU and ALU architecture, memory buses, data buses, registers, etc., and these definitions become a fundamental part of the architecture design. The details of fixed point and floating point number representation are discussed later, but for the time being it is pointed out that floating point numbers have a larger dynamic range than fixed point numbers when equal storage space is available.

5.3 Floating Point Numbers and Quantization

Floating point and fixed point number representations can both store positive as well as negative numbers, but with different precision. Fixed point numbers have a fixed number of digits before and after the binary point (the decimal point in everyday life), depending upon the particular application or system. High level computer programming languages, such as C and MATLAB, usually allocate 16 bits or 32 bits to store each integer, depending on the architecture used [17].

A floating point number is represented by a mantissa and an exponent, where the mantissa is multiplied by ten raised to the exponent. For example, in the number 6.3971 × 10^8, 6.3971 is the mantissa and 8 is the exponent. Usually floating point numbers are normalized so that there is only a single non-zero digit to the left of the decimal point. This is achieved by adjusting the exponent as needed.

The most common format for floating point systems is ANSI/IEEE Std. 754-1985. 32 bit numbers are called single precision, and "float" in the C language. The standard also defines 64 bit numbers, called double precision and "double" as a data type in the C language. For example, the decimal fraction 3.496 is interpreted as: 3 + 4/10 + 9/100 + 6/1000. The binary fraction 1.1010 means: 1 + 1/2 + 0/4 + 1/8 + 0/16 [18].

Floating point variables do not need special care during arithmetic operations; they do not even require overflow or saturation detection except in rare cases. It is evident that floating point numbers have a large dynamic range. Apart from the larger dynamic range, a 32 bit floating point number is superior to a 16 bit floating point number. The reason lies in what happens when the number is stored. Whenever floating point numbers are stored in a fixed number of bits, they may need a rounding operation. The difference between the true value and its representation in a fixed number of bits is termed noise. When more bits are available for the representation, adjacent representable values are quite close and the noise due to this fixed length representation becomes very small [17]. This indicates that when the system of concern requires very high precision, one needs to choose the word length carefully so as not to destroy the system performance.


5.4 Quantization for Fixed Point Programming

Signal quantization occurs not only when analog quantities are converted to digital quantities, but also, sometimes, when a calculated quantity is stored into the memory of a computer (or digital system). This is called arithmetic rounding in computing jargon.

Arithmetic rounding differs from the conversion of analog quantities to digital quantities in that the quantizer input is not an analog value but is itself quantized data. For example, the multiplication of two 16 bit numbers approximately doubles the number of bits (or that of the mantissa) compared to the numbers multiplied. So if the result of the multiplication has to be stored in a location which is again 16 bits wide, arithmetic rounding is required. A similar thing may happen while adding two numbers: each addition may result in a number which is one bit larger than the numbers added. To store the result in the same number of bits and avoid overflow, one either has to scale the values being added or quantize the resulting value to the number of bits available for storage. This correspondingly adds quantization noise at each such step. In the worst case, this quantization noise may simply add up through the various processing stages, and the resulting accumulated noise may degrade the system performance [18].
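A hedged sketch of the bit growth described above (the Q15 format and the rounding and saturation choices are assumptions made for illustration, not the thesis code; an arithmetic right shift is assumed for negative intermediate values):

    #include <stdint.h>

    /* Multiply two Q15 fixed point values (16 bits, 15 fractional bits).
       The exact product needs up to 32 bits; rounding it back into 16 bits
       introduces a small quantization error. */
    static int16_t q15_mul(int16_t a, int16_t b)
    {
        int32_t wide = (int32_t)a * (int32_t)b;     /* full precision product   */
        int32_t r = (wide + (1 << 14)) >> 15;       /* round to nearest Q15     */
        if (r > 32767) r = 32767;                   /* saturate the corner case */
        return (int16_t)r;
    }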

Throughout fixed point programming, one needs to take care of the results of the arithmetic operations (say multiplication and addition). Most computer based simulations use floating point variables. When one switches to hardware programming, or even to computer simulations whose objective is to investigate hardware performance, one opts for fixed point programming. For a real world digital system design, the definitions of the word length, how the quantization is performed, and so on are important design parameters. Extra hardware cost is never welcomed by hardware engineers, and hence fixed point hardware parameters need careful selection.

If a reasonable word length can be proposed for the floating point numbers used in a system, fixed point hardware can be used for the floating point implementation. First of all, the word lengths of the variables at different points (considered important) in the program flow should be measured. To compute an optimized word length, the quantizers at different nodes in the system can be very different from each other.

The word length can be further partitioned into two parameters, namely the range and the resolution. The range defines how small and how large the numbers of a particular system are allowed to be without truncation. The selection of a suitable range is very important for efficient results, and it is application dependent. Loosely speaking, the range corresponds to the integer part of the values used. The resolution is interchangeable with precision, and its significance also relates to the nature of the floating point numbers used in the particular application. The precision corresponds to the fractional part of the floating point numbers used. Obviously, truncation of the integer part affects the quantized value much more than truncation of the fractional part, due to the smaller weight of the fractional part. To quantize the floating point numbers means to start varying the resolution and observe its effect on the performance. This truncation process continues toward the integer side until performance degradation starts to be observed. In this way the necessary integer part as well as fractional part can be determined and a reasonable fixed point number format can be proposed.
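An illustrative sketch of quantization by truncation as just described (the function name and parameter are assumptions): the value is snapped to a grid whose resolution is 2^(-frac_bits), i.e., only frac_bits bits of the fractional part survive.

    #include <math.h>

    /* Truncate a floating point value to a resolution of 2^(-frac_bits).
       Only non-negative values occur in this decoder, so floor() acts as
       plain truncation of the finer fractional bits. */
    static double truncate_fraction(double x, int frac_bits)
    {
        double scale = (double)(1 << frac_bits);    /* 2^frac_bits */
        return floor(x * scale) / scale;
    }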

5.5 Saturation Effects

It is good to have an idea of the saturation phenomenon and the situations in which saturation plays its role and prevents wrong results. The maximum and minimum numbers of a particular system set the "limits" of saturation. When values reach a certain maximum or minimum and a further increase or decrease is not possible due to hardware limitations, it is saturation that prevents a wrong result. After saturation the result is usually close enough to the true (expected) value, and most often the minor difference between the saturated value and the true (expected) value is tolerable. The absence of saturation might lead to very wrong results during arithmetic, so it is an essential part of digital arithmetic and system design hardware. The extreme cases (also called corner cases) cause overflow if saturation is not applied. Saturation not only detects overflow but also gives results that are reasonably close. This is clearly illustrated by figure 5.1, which shows 3 bit saturation.

The behaviour of numbers when saturation occurs is also accounted for in the quantization proposed in this thesis. 8 bit saturation means that if at some time instant the magnitude of a number exceeds 255, the saturated value is still 255. Similarly, if 7 bits are used to store the intermediate results and the value generated at a particular time instant exceeds 127, the saturation operation sets the value to 127. In other words, values exceeding certain limits (the minimum number and the maximum number) are clipped, and the values after clipping are called "saturated values". In figure 5.2 the curve shows 4 bit saturation; the input values have a direct relation with the output until the input value reaches 15, after which all higher input values correspond to the same output value/level, and all of these are called saturated values.
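A minimal saturation sketch consistent with the description above (the unsigned convention and the function name are assumptions):

    /* Saturate a non-negative value to the n-bit unsigned range [0, 2^n - 1].
       For n = 3 this reproduces the clipping shown in figure 5.1. */
    static double saturate_unsigned(double x, int n)
    {
        double max_val = (double)((1u << n) - 1u);  /* e.g. 7 for 3 bits */
        if (x > max_val)
            return max_val;                         /* a "saturated value" */
        if (x < 0.0)
            return 0.0;                             /* clamp, although negatives do not occur here */
        return x;
    }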

For a comparison between different levels of saturation, curves for 2 bit, 3 bit and 4 bit fixed point are shown in figure 5.3. The relative difference between them translates into the level of approximation, which in turn means the amount of noise added. These curves are interesting as they clearly show how the output resolution (which dictates the quantization noise) changes with the truncation level of the integer. The wider the (horizontal) gap between two truncation levels, the lower the resolution and the larger the quantization noise.


Figure 5.1: 3 bit Saturation.

Figure 5.2: 4 bit Saturation.


The detailed system performance results are presented in the last chapter. It is obvious from the curves that the output follows the input signal behaviour within the truncation level limits dictated by the allowed number of bits for the integer part.

Figure 5.3: Quantization and Saturation.

5.6 Word Length Estimation

To compute the word length of a system, the Viterbi decoding system, whose efficient implementation has been the main objective of this thesis, is taken as an example. To get an idea of a reasonable word length and to investigate the issues related to range and resolution, three "artificial" blocks are added to the proposed decoding program flow. These three blocks are added at three different stages of the trellis used in the Viterbi decoding algorithm.

The idea of using three "quantizers" is based on the trellis structure being implemented in three sub-phases. Each of these blocks is capable of performing two things:

• The first task is to calculate the range of the values (of variables) at those points where the block is inserted and where a quantizer is later intended to be used.

• The second task is to vary the resolution and observe the effects on the system-wide performance, which is the bit error rate for the proposed Viterbi decoding algorithm.


For simplicity of implementation, these artificial blocks at the three different locations use the same truncation/quantization parameters (number of bits before and after the binary point).
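A hedged sketch of what such an artificial block could look like in C (not the thesis code; the structure, names and initial values are assumptions): it records the observed range at its insertion point and, when applied, truncates the value to the configured resolution and saturates it to the configured range.

    #include <math.h>

    typedef struct {
        int    int_bits;   /* bits for the integer part (range)         */
        int    frac_bits;  /* bits for the fractional part (resolution) */
        double min_seen;   /* observed minimum, for range estimation    */
        double max_seen;   /* observed maximum, for range estimation    */
    } quant_block_t;

    static void quant_init(quant_block_t *q, int int_bits, int frac_bits)
    {
        q->int_bits  = int_bits;
        q->frac_bits = frac_bits;
        q->min_seen  =  1e300;  /* shrinks to the observed minimum */
        q->max_seen  = -1e300;  /* grows to the observed maximum   */
    }

    /* Task 1: record the range of the values passing through this point. */
    static void quant_observe(quant_block_t *q, double x)
    {
        if (x < q->min_seen) q->min_seen = x;
        if (x > q->max_seen) q->max_seen = x;
    }

    /* Task 2: truncate to the configured resolution and saturate to the range. */
    static double quant_apply(const quant_block_t *q, double x)
    {
        double scale   = (double)(1 << q->frac_bits);
        double max_val = (double)(1 << q->int_bits) - 1.0 / scale;
        double y = floor(x * scale) / scale;    /* keep frac_bits of fraction */
        if (y > max_val) y = max_val;           /* saturate the integer part  */
        if (y < 0.0)     y = 0.0;               /* only positive values occur */
        return y;
    }

For example, an 8.1 configuration as used later in this thesis would be set up with quant_init(&q, 8, 1).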

Figure 5.4 shows where a truncation block is inserted relative to the other blocks of the Viterbi algorithm. Q is used as a universal notation for quantization (truncation in this case, for simplicity). The figure shows only one quantization/truncation block, but in the proposed decoding program flow three such blocks are put in the trellis. Showing all of them would require a very detailed figure with all the mathematical operations carried out in the decoder, so only one is shown for better understanding and to avoid unnecessary complexity. The figure also shows that, in the end, the results are always compared with the MATLAB floating point implementation. Most of the blocks appearing in this figure have already been discussed and explained in the previous chapters.

The computation of the range of values the variables take, and of the required resolution, becomes easy with the help of these blocks. The values at the intended nodes or points in the program flow were observed to evaluate the range. Later, these blocks were set to quantize the variables at those points, and the performance was evaluated for different levels of quantization.

It was observed that negative numbers were not generated in the algorithm, so only the positive range was considered, from 0 up to some higher positive value (depending upon the maximum value generated). After careful observation, the number of bits used was proposed to be 8. It was also observed that in this decoding program flow the fractional parts of variables such as path metrics do not have a significant effect on the performance: the survivor paths and the discarded paths in the trellis remain the same even if most of the fractional part is ignored. Only 1 bit of the fractional part is kept to refine the results.

Dropping most of the fractional bits clearly gives a benefit in terms of storage space. The next question is how many bits should be used for the integer part of these variables. This quantization is important as it may strongly impact the system performance. The artificial blocks were set to various levels of quantization and the corresponding impact on the performance was evaluated. It will be seen later that even one or two bits of truncation of the integer part may strongly degrade the system performance, which reveals itself as a poor BER for the proposed system. This study may indicate the quantization parameters suitable for particular situations of this decoding algorithm implementation.

After computing this word length, it becomes possible to run this system (the Viterbi decoder algorithm) on fixed point hardware of the intended word length (say 8 bits), which means an obvious saving in hardware cost. It helps in understanding how many bits should be kept to achieve the same performance as without truncation. Similarly, if the objective is to design a specific system where the word lengths are very stringent, this study can show what kind of performance one can expect with those (stringent) word lengths.

Figure 5.4: Word length computation for Viterbi decoding algorithm.


Chapter 6

Implementation Details

6.1 Implementation Details

The aim of this chapter is to give an insight into the implementation phase. It is good to have an idea of the widespread use of the Viterbi algorithm in various standards and protocols. First, a table is shown which lists the parameters (constraint length, rate, block size and trellis termination, where available) used in different standards. Later, the pseudocode of the Viterbi algorithm, the steps of the algorithm, and the traceback techniques for both Zero-Tail and Tail-Biting termination are discussed from the implementation point of view, with figures.

6.2 Table: Convolutional encoding used in different standards

The table below shows the convolutional encoding with corresponding parameters, taken from the following references: [19], [20], [21], [22], [23], [24], [25], [26].

This table reveals that a convolutional code with rate 1/2, constraint length K = 7 and polynomials [171 133] is the specification used in many standards, so it is chosen as the main target of the implementation for this thesis. However, constraint lengths K = 5 and K = 9 are also targeted, with rates 1/2, 1/3 and 1/4 each. In the last column of the table, the abbreviations ZT and TB represent "Zero Tail" and "Tail Biting" termination respectively.


Standard      R              K           Polynomial                          Block Size          Termination
DAB           1/4            7
DRM           1/4            7
UMTS          1/3            9           557, 663, 771                       504 + 8 null bits
UMTS          1/2            9           561, 753                            504 + 8 null bits
UMTS          1/2            7           171, 133
HiperLAN/2    1/2            7           171, 133
LTE           1/2, 1/3, 1/4  5, 6, 7, 9  561, 753                                                Both ZT and TB
GPRS          1/2            5           23, 33                              504 + 8 null bits   Both ZT and TB
EDGE          1/2, 1/3       7           561, 753                            190-606, max. 876   Both ZT and TB
WiMax         1/2            7           171, 133                                                ZT option, TB must
GSM           1/2            5                                                                   Both ZT and TB
WLAN          1/2            7           171, 133
WCDMA/UMTS    1/2, 1/3       9           1/2: 753, 561; 1/3: 577, 663, 711;
                                         1/4: 765, 671, 513, 473

6.3 Pseudocode: The Viterbi Algorithm

The input of this algorithm is LLR (log-likelihood ratio) values, which are the channel outputs (real values). These LLR values capture the conditional probabilities of the received value being y = 0 or y = 1. Using these values in decoding helps in making more correct decisions. The output of the algorithm is a sequence of binary values obtained by maximizing the likelihood probability. The output of the soft decision Viterbi decoder is compared with MATLAB (used as a reference) as well as with the binary input of the encoder.

In principle, the survivor path has to be computed by processing the trellis at each stage while maximizing the likelihood probability. Two arrows leave each state, corresponding to the two possible inputs of each state: 0 and 1. Each arrow is labeled with the input and output values associated with leaving one state and entering its "next state". The trellis structure is formed with the help of the state diagram of the encoder. Figure 6.1 shows the pseudocode of the Viterbi algorithm as implemented in this thesis.


Figure 6.1: Pseudocode of Viterbi Algorithm.

The main for loop runs over half of the entire length of the received data stream and covers all sub-operations. The trellis structure is divided into three sub-parts (hence three for loops) for ease of calculation and to make the computation efficient. Another thing to note is that the whole procedure above is done over a fixed number of bits (the packet length), which is 70 bits including the last six zeros required by the zero-tail termination method. When the three essential steps of the forward direction (BM, PM, survivor selection) have been performed over all trellis stages (at least five times the constraint length of the encoder, called the decoding depth), traceback needs to start in the reverse direction so that the decoded bits can be obtained. In the case of Zero-Tail termination, the NPM metric of the zero state is selected.

In the case of Tail-Biting termination, the best metric is the choice for starting the traceback. Selecting the smallest metric (NPM) means minimizing the error and maximizing the conditional probability criterion. The Euclidean distance helps in getting closest to the transmitted sequence. Termination is done in a known state (for example state 0, or with tail biting, where the starting and ending states of the encoder are the same) [27].

Assumption: In the implementation, a reasonable assumption is made that the length of the received data stream is exactly divisible by the length of each packet, with zero remainder. This was done to avoid the complexity of handling left-over bits if, say, 5, 10 or 30 bits were to remain.

Note: This arithmetic division of the length of the received stream by the length of each packet is done to calculate the number of trellis iterations as a whole number, for which BM, PM and survivor selection are performed.
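A highly simplified skeleton of this flow (a sketch only: the function names and prototypes are assumptions, and the forward pass and traceback are assumed to be implemented elsewhere as described in the following sections):

    #define RATE_INV     2     /* rate 1/2: two LLR values per encoded bit */
    #define PACKET_BITS  70    /* 64 information bits + 6 zero tail bits   */

    /* Hypothetical helpers assumed to exist elsewhere. */
    void forward_pass(const double *llr, int n_bits);         /* BM, PM, survivors */
    void traceback_from_state(int state, unsigned char *out, int n_bits);

    void viterbi_decode_stream(const double *llr, int llr_len, unsigned char *decoded)
    {
        /* Whole number of packets: llr_len is assumed exactly divisible. */
        int n_packets = (llr_len / RATE_INV) / PACKET_BITS;

        for (int p = 0; p < n_packets; ++p) {
            const double *packet_llr = llr + p * PACKET_BITS * RATE_INV;

            forward_pass(packet_llr, PACKET_BITS);             /* forward direction */
            traceback_from_state(0 /* zero-tail: state 0 */,
                                 decoded + p * PACKET_BITS, PACKET_BITS);
        }
    }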


6.4 Steps of Viterbi Algorithm

Trellis Construction

For zero-tail decoding, trellis construction starts from the zero state at time instant ti = 0. So in the first stage only the two branches leaving state zero are calculated. In the second and third stages, four and eight branches are calculated respectively, and so on. In this thesis implementation, with K = 7 and rate R = 1/2, the full trellis is constructed after six stages (time instants). This applies to zero-tail trellis construction. In the case of tail-biting construction, every branch of every stage is active even at time instant ti = 0.

Branch Metric Computation

At each branch (two from each state since k = 1, with only two possible input values, 0 or 1, in the trellis), the receiver compares the received signal (LLR) values, which are real values, to each signal allowed for that branch. Each branch is labeled with a metric proportional to the distance between the two signals (the allowed signal level and the received one), and this value is stored in memory for every branch. The Euclidean distance formula is given below:

\[
d_E^2(r_i, c_k) = \sum_{j=0}^{n-1} \left( r_{i,j} - c_{k,j} \right)^2 \qquad (6.1)
\]

where r_i denotes the block of received LLR values at time instant i, c_k the codeword labelling the k-th branch (the output of the encoder corresponding to that transition, shown as the output on the trellis edge), and j runs over the n output bits of a branch. The squared difference is calculated bit by bit and summed to give the branch metric value.
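An illustrative branch metric computation matching equation (6.1) (a sketch; the 0 -> -1, 1 -> +1 mapping of the code bits and the function name are assumptions):

    /* Squared Euclidean distance between the n received LLR values of one
       trellis stage and the n output bits labelling one branch. */
    static double branch_metric(const double *llr, const unsigned char *bits, int n)
    {
        double sum = 0.0;
        for (int j = 0; j < n; ++j) {
            double c = bits[j] ? +1.0 : -1.0;  /* expected symbol on this branch */
            double d = llr[j] - c;
            sum += d * d;                      /* accumulate squared distance    */
        }
        return sum;
    }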

Path Metric (New Path Metric) Computation

Each branch metric (current value) is added to a path metric, and the result is named the "new path metric" for the current state. It is computed for every edge of every state at each time instant. For the first stage this parameter is equal to zero; it is computed by adding the previous path metric (which is also equal to zero at the zero state) and the current branch metric. Each branch of each state now has a value associated with it, called the path metric. At each time instant ti, a path metric is stored for each state q in the trellis. This path metric helps in finding the state which corresponds to the current ML path through the trellis. After a few initial stages (equal to the number of memory elements of the encoder, or one less than K), each state has two path metric values, and to become the survivor value for that state one of them has to knock out the other.

Survivor Selection

The decision is based on the ML criterion, so the minimum of these path metrics is selected as the survivor for that state [3]. Suppose a message (information bits) is encoded and transmitted through an AWGN channel, and the received sequence is:

\[
s_r = (+1.35, -0.15, -1.25, +1.40, -0.85, -0.10, -0.95, -1.75, +0.5, +1.30) \qquad (6.2)
\]

The soft decision Viterbi algorithm is applied to this sequence to show survivor selection. The first step is to calculate the Euclidean distance between the received LLR values and the corresponding outputs for each transition of each state. As an example of the squared distance calculation, at time instant t2, the Euclidean distance for the transition from state 00 to the same state 00 is:

\[
d_1^2\left[(+1.35, -0.15), (-1, -1)\right] = (1.35 + 1)^2 + (-0.15 + 1)^2 = 6.245 \qquad (6.3)
\]

In the figure below, the bold lines at time instant t4 deserve special attention, as they are the survivors at that time instant for the corresponding trellis. It is obvious from the figure that the minimum cumulative distance is selected as the survivor, since it minimizes the error corresponding to the ML path, as has already been discussed in Chapter 4.

Once again it is noted that soft decision gives better (finer) results compared to its hard decision counterpart. In addition, the use of the squared Euclidean distance as a measure makes it quite unlikely that two paths arriving at the same node of the trellis have the same cumulative squared distance.

Often a block or operation named Add-Compare-Select (ACS) is used in the literature to describe the combined operation of branch metric computation, path metric update and survivor selection. It receives the possible branch metrics and the state metric storage values corresponding to each edge at each time instant. As its name suggests, this block adds the branch metric and the path metric to update the new path metric, and then selects the survivor value from the pair of candidates for each state.
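A minimal add-compare-select sketch for one trellis state (illustrative; the data layout and names are assumptions): the two candidate metrics are formed by adding the incoming branch metrics to the metrics of the two predecessor states, and the smaller one survives.

    /* Add-Compare-Select for one trellis state.
       pm_prev[] : path metrics of the two predecessor states
       bm[]      : branch metrics of the two incoming branches
       Stores which predecessor survived in *decision (0 or 1) for later
       traceback and returns the surviving (minimum) new path metric. */
    static double acs(const double pm_prev[2], const double bm[2], unsigned char *decision)
    {
        double cand0 = pm_prev[0] + bm[0];   /* add     */
        double cand1 = pm_prev[1] + bm[1];

        if (cand0 <= cand1) {                /* compare */
            *decision = 0;                   /* select  */
            return cand0;
        }
        *decision = 1;
        return cand1;
    }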

The algorithm (implementation) loops over the length of the received sequence twice: once in the forward direction (calculating branch metrics and path metrics and selecting survivors), and once in the backward direction.


Figure 6.2: Soft decision decoding to determine the survivor at a particular time instant [3].

The backward loop (also known as the traceback loop) processes all of the information obtained in the forward loop and uses it to output the decoded data. As its name implies, it starts from the ending stage and runs back to the first stage of the trellis. Different criteria have been proposed for choosing the state from which it starts.

Traceback

After the whole forward pass is done, the survivor paths for each state are ready and saved in memory. To start the decoding process, traceback is performed first. Traceback essentially consists of finding the most likely path from the last state back to the first state through the survivor path metrics. It then generates the decoded output sequence.

It has been reported that traceback windows of five or more memory constraint lengths, i.e., T > 5(K - 1), lose very little performance compared to tracing back over the entire received sequence [3].
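A sketch of the traceback loop (illustrative; the decision storage layout and the "input shifted into the LSB" state convention are assumptions, not necessarily those of the thesis code):

    #define M 6                          /* K - 1 memory elements (K = 7) */
    #define NUM_STATES (1 << M)

    /* Trace back over n_bits trellis stages, starting from 'state'
       (state 0 for zero-tail termination). decisions[t][s] holds the bit
       that was shifted out of the encoder on the surviving transition
       into state s at stage t. */
    static void traceback(unsigned char decisions[][NUM_STATES],
                          int n_bits, unsigned state, unsigned char *decoded)
    {
        for (int t = n_bits - 1; t >= 0; --t) {
            decoded[t] = (unsigned char)(state & 1u);      /* newest bit = input */
            unsigned dropped = decisions[t][state];        /* surviving choice   */
            state = (dropped << (M - 1)) | (state >> 1);   /* predecessor state  */
        }
    }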

Traceback for Zero-Tail Decoder

Traceback for the zero-tail decoder starts from the zero state, whatever its metric is. Since each packet has Z zero bits (equal to the number of memory elements) at its end, the decoded values corresponding to these Z bits are discarded for every packet. This is the cost of making the encoder's starting and ending states the same in the zero-tail decoder. In zero-tail decoding there was no need to take care of buffer (intermediate result storage) overflow, as the path metric is reset for each packet.

Figure 6.3: Operation Level Flow Chart.

Figure 6.4: Zero-Tail Traceback [2].

Traceback for Tail-biting Decoder

In tail-biting decoding, the first time, three blocks are processed: the trellis is constructed and traceback is performed over the three blocks as well. However, only the decoded values of the middle block are kept, as they are the most reliable. The decoded values of the first block are ignored this time, since the decoder is not aware of any known starting state. After the middle block, another block is reinserted to serve as a training sequence for the middle block. This takes advantage of the phenomenon that, during traceback, all paths merge at a common point after five to eight times the constraint length.

After the first three blocks, two blocks are processed in each step, and traceback is likewise performed over two blocks at a time. The first block is reinserted after the last block of the packet, as it serves as the training sequence for the last block of the packet. However, one extra block is also reinserted after this reinserted first block (as the training sequence for the first block) in order to decode it. The reinsertion of the second block after the first one could be avoided by saving the first six decoded bits of the first block; however, to keep the scheme flexible if the constraint length changes, the whole block is reinserted rather than just six bits (for K = 7). This is shown graphically in the tail-biting traceback figure.

An important thing to take care of in tail-biting decoding is overflow handling. After each block, the path metric values keep accumulating during the processing of new blocks, since two blocks are processed each time. At each stage the minimum path metric value is subtracted from all other path metric values to avoid overflow.
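A sketch of this overflow handling (names are assumptions): subtracting the current minimum path metric from all state metrics keeps the stored values bounded without changing which paths survive.

    /* Normalize the path metrics by subtracting the current minimum.
       The relative ordering, and hence the survivor decisions, is unchanged. */
    static void normalize_path_metrics(double *pm, int num_states)
    {
        double min_pm = pm[0];
        for (int s = 1; s < num_states; ++s)
            if (pm[s] < min_pm)
                min_pm = pm[s];
        for (int s = 0; s < num_states; ++s)
            pm[s] -= min_pm;
    }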

In tail-biting decoding, to make sure that traceback does not start from a wrong state, more than one block is processed and traced back, but the decoded values of only one block are kept and the others are ignored. A major difference between zero-tail and tail-biting decoding is that in tail biting the first block is decoded at the end of the packet, so the delay is significant compared to zero-tail decoding. In zero-tail decoding each block is processed and decoded as soon as it is received; however, the extra zeros are a kind of bandwidth waste, or in other words a reduced transmission rate, as already described for the zero-tail encoder. On the other hand, tail-biting decoding is somewhat more complex than zero-tail decoding from a processing point of view, but it is more bandwidth efficient.

Page 64: Ahmed Salim - DiVA portalliu.diva-portal.org/smash/get/diva2:469272/FULLTEXT01.pdf · WCDMA Wideband Code Division Multiple Access ... 1.2 A Basic Communication System ... speech

48 Chapter 6 Implementation Details

Figure 6.5: Tail-Biting Traceback [2].


Chapter 7

Simulation Results: Bit Error Rate Analysis

7.1 Introduction

This chapter is aimed at demonstrating the system-wide simulation results for the system of concern. Hence a communication system is used where the receiver employs the Viterbi decoding algorithm implementation detailed in the previous chapters. The bit error rate (BER) is selected as the system performance metric, as is usual for communication systems. The impacts of quantization and word length on the system performance are also investigated.

7.2 Simulation System and Basic Definitions

In figure 7.1, a communication system is shown with emphasis on the coding and decoding blocks, which are the major concern of this thesis work. As the zero-tail trellis termination method is used in the construction of the trellis, six zeros (equal to the number of memory elements for K = 7) are appended at the end of each block. It is made sure that those six zeros are excluded correctly when computing the bit error rate (BER). Zero separation is done in the corresponding block on the receiver side, after decoding, since these zero bits are not "information bits", even though they are useful for the decoding process. All this is shown at block level in figure 7.1. This figure also shows a block for computing the BER, which is, of course, not part of a real communication system.

Figure 7.1: Bit error rate calculation.

For the validation and comparison of results, the MATLAB floating point implementation of the Viterbi decoding algorithm is also used. This is shown in Fig. 7.2: the left part is the MATLAB implementation and the right part is the C implementation done for this thesis. In the end, the results are compared in the "Comparison" block.

The BER of the system versus the SNR is analyzed frequently in this chapter. This is done by providing tables of BER against SNR, and sometimes with the help of plots. To make things clear, these key terms are defined below.

Signal-to-noise ratio (SNR). This is the most important metric expressing the quality of a channel, i.e., whether the channel is good or bad in terms of the noise it adds. It can be defined as the ratio of the received signal strength to the received noise power level. It is a unitless parameter, being the ratio of two power values; however, it is mostly described on a dB scale.

Figure 7.2: Comparison with MATLAB.

Bit error rate (BER). The bit error rate is one of the most important system performance metrics. It is defined as the ratio of the number of erroneous bits received to the total number of bits transmitted. It is a direct measure of channel quality, of the performance of modulation schemes, and of algorithm efficiency. In an implementation it is used to benchmark the system's performance under specific constraints. It is a unitless quantity and may be expressed as a percentage. The value of the BER is an indication of channel quality and hence depends directly on the SNR. It should be noted that the AWGN channel is mostly used to investigate the impact of coding/decoding, due to its simplicity.
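A small sketch of how a BER figure of this kind is typically accumulated (illustrative only, not the thesis measurement code):

    /* Count bit errors between the transmitted and decoded bit sequences
       and return the bit error rate. */
    static double bit_error_rate(const unsigned char *tx, const unsigned char *rx, long n)
    {
        long errors = 0;
        for (long i = 0; i < n; ++i)
            if ((tx[i] & 1) != (rx[i] & 1))
                ++errors;
        return (double)errors / (double)n;
    }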

Packet error rate. For real-world communication systems, the packet error rate (sometimes called block error rate) is a very important performance metric. All modern communication systems are packet based, and coding/decoding operations are packet based as well, so the packet error rate is a very suitable system-wide performance metric. It is defined as the ratio of the number of erroneous received packets to the total number of packets received. Packet error rate and bit error rate are strongly related, but they are not directly equivalent: even if only a single bit in a packet containing thousands of bits is decoded incorrectly, the whole packet may become useless. This also makes the packet error rate much larger in value than the BER.

In this thesis the packet error rate is not used to evaluate the system performance.

7.3 Simulation Results and Comparisons

In this section, the performance results of the proposed implementation of the Viterbi decoding algorithm are given, together with comparisons against the MATLAB implementation. The metric used for the performance comparison is the bit error rate. This figure of merit is frequently used in real communication systems and is considered satisfactory in the realm of a communication system if it is close to 10^-4 or lower. To compute the BER values, a reasonable number of Monte Carlo simulations were run so as to accumulate a sufficient number of errors. It was made sure that even for very small BER values, approaching 10^-4 or lower, BER values are reported only after at least 100 errors have occurred during the simulation.

The coding parameters for all simulations have been selected to be those in widespread use in multiple standards: a half rate convolutional code with polynomials [171 133] and constraint length 7 (K = 7). These specifications are used in various well known communication standards. Table 7.1 shows the performance of the C implementation (Zero-Tail decoder) in terms of bit error rate (BER). This table shows the full floating point implementation with "double" values of the C programming language; hence there is neither quantization/truncation nor limiting of the values at any stage.

The values in tables 7.2, 7.3, 7.4, 7.5 and 7.6 are those obtained after truncation and/or saturation. For the results in these tables, either the fractional parts of all values have been discarded or only 1 bit of the fractional part is kept. Although detailed BER versus SNR results are provided in tabular format for various levels of truncation, it is also interesting to see the joint effect of SNR and truncation level on the resulting system performance, which is captured through the BER.

Table 7.1: Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.

SNR   BER of C-Implementation with "double" width   BER of MATLAB
-3    3.56 × 10^-1                                  3.56 × 10^-1
-2    2.78 × 10^-1                                  2.87 × 10^-1
-1    2.12 × 10^-1                                  2.03 × 10^-1
 0    7.25 × 10^-2                                  9.88 × 10^-2
 1    2.35 × 10^-2                                  3.77 × 10^-2
 2    3.14 × 10^-3                                  2.58 × 10^-3
 3    3.14 × 10^-4                                  3.86 × 10^-4
 4    1.03 × 10^-5                                  1.08 × 10^-5

Table 7.2: Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.

SNR   BER of C-Implementation with 8.1 Fixed Point   BER of MATLAB
-3    3.42 × 10^-1                                   3.56 × 10^-1
-2    2.84 × 10^-1                                   2.87 × 10^-1
-1    2.17 × 10^-1                                   2.03 × 10^-1
 0    8.64 × 10^-2                                   9.88 × 10^-2
 1    2.55 × 10^-2                                   3.77 × 10^-2
 2    3.33 × 10^-3                                   2.58 × 10^-3
 3    3.43 × 10^-4                                   3.86 × 10^-4
 4    1.14 × 10^-5                                   1.08 × 10^-5

In Figure 7.3, BER values have been plotted against different SNR values. The SNR values appear along the horizontal axis, while the vertical axis shows the BER values on a logarithmic scale. Each curve in this plot corresponds to a particular truncation level. For the fixed word length analysis, curves have been plotted for the fixed point formats 8.1, 7.1 and 6.1, i.e., only one bit of the fractional part is kept in these three curves. A curve has also been plotted for no truncation at all, which serves as a benchmark for these fixed word length curves. The curves show that for longer fixed point formats (more bits to store), the BER decreases with increasing SNR.


Figure 7.3: SNR versus BER at different word lengths.


Table 7.3: Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including last six zeros.

SNR   BER of C-Implementation with 8.0 Fixed Point   BER of MATLAB
-3    3.59 × 10^-1                                   3.56 × 10^-1
-2    2.82 × 10^-1                                   2.87 × 10^-1
-1    2.10 × 10^-1                                   2.03 × 10^-1
 0    1.01 × 10^-2                                   9.88 × 10^-2
 1    3.19 × 10^-2                                   3.77 × 10^-2
 2    3.04 × 10^-3                                   2.58 × 10^-3
 3    4.40 × 10^-4                                   3.86 × 10^-4
 4    1.30 × 10^-5                                   1.08 × 10^-5

Table 7.4: Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including the last six zeros.

SNR (dB)    BER of C implementation (7.1 fixed point)    BER of MATLAB
   -3                  3.88 × 10^-1                       3.56 × 10^-1
   -2                  3.35 × 10^-1                       2.87 × 10^-1
   -1                  2.27 × 10^-1                       2.03 × 10^-1
    0                  1.10 × 10^-1                       9.88 × 10^-2
    1                  2.60 × 10^-2                       3.77 × 10^-2
    2                  3.84 × 10^-3                       2.58 × 10^-3
    3                  4.42 × 10^-4                       3.89 × 10^-4
    4                  1.38 × 10^-5                       1.08 × 10^-5

If attention is given to any fixed SNR point along the horizontal axis, e.g. 3 dB, it can be seen that the floating-point implementation without quantization performs best. Furthermore, it is clear that as the truncation level increases (i.e. as the word length decreases), performance degrades. Performance very close to that of the double-precision implementation is achieved for word length 8, and fixed point 7 shows only a negligible degradation. For the curve with fixed-point length 6, however, the performance gap becomes large. From these simulation results it is observed that the 8.1 word length performs very close to the floating-point implementation, which leads to the conclusion that, for this specific system, a word length of 8.1 should be used.
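As a worked example of what each bit of word length provides, under the same reading of the a.b notation as above (a total bits, b fractional), a signed fixed-point word represents values in

    \left[ -2^{\,a-1-b},\; \left(2^{\,a-1}-1\right) 2^{-b} \right] \quad \text{in steps of } 2^{-b},

so 8.1 covers [-64.0, +63.5], 7.1 covers [-32.0, +31.5] and 6.1 covers [-16.0, +15.5], all in steps of 0.5; each bit removed halves the headroom left for the accumulated path metrics, which is one plausible contributor to the growing gap seen in Figure 7.3.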


Table 7.5: Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including the last six zeros.

SNR (dB)    BER of C implementation (7.0 fixed point)    BER of MATLAB
   -3                  3.78 × 10^-1                       3.56 × 10^-1
   -2                  2.89 × 10^-1                       2.87 × 10^-1
   -1                  2.12 × 10^-1                       2.03 × 10^-1
    0                  1.15 × 10^-2                       9.88 × 10^-2
    1                  4.37 × 10^-2                       3.77 × 10^-2
    2                  4.07 × 10^-3                       2.58 × 10^-3
    3                  8.02 × 10^-4                       3.89 × 10^-4
    4                  3.78 × 10^-5                       1.08 × 10^-5

Table 7.6: Polynomials: [171 133], Rate = 1/2, Constraint Length = 7, Packet size = 70 bits including the last six zeros.

SNR (dB)    BER of C implementation (6 fixed point)    BER of MATLAB
   -3                 3.75 × 10^-1                      3.56 × 10^-1
   -2                 3.34 × 10^-1                      2.87 × 10^-1
   -1                 2.67 × 10^-1                      2.03 × 10^-1
    0                 1.72 × 10^-1                      9.88 × 10^-2
    1                 1.06 × 10^-1                      3.77 × 10^-2
    2                 3.62 × 10^-2                      2.58 × 10^-3
    3                 1.13 × 10^-2                      3.86 × 10^-4
    4                 2.93 × 10^-3                      1.08 × 10^-5

7.4 Conclusions

The Viterbi decoding algorithm has become an essential part of communication receivers due to the widespread adoption of convolutional codes in almost all communication standards. This makes the investigation of fixed-point implementations of the algorithm very important. In this thesis, the trade-off between the word length used to store the trellis values per node and the decoding performance has been analysed. The investigation indicates that an 8-bit word length for storing the trellis values gives decoding performance comparable to a full floating-point implementation, which can enable a pragmatic, cost-effective hardware implementation of the decoding algorithm.
