Coding Theory: Tutorial & Survey

Madhu Sudan*

Abstract

Coding theory has played a central role in theoretical computer science. Computer scientists have long exploited notions, constructions, theorems and techniques of coding theory. More recently, theoretical computer science has also been contributing to the theory of error-correcting codes, in particular by making progress on some fundamental algorithmic connections. Here we survey some of the central goals of coding theory and the progress made via algebraic methods. We stress that this is a very partial view of coding theory and that a lot of promising combinatorial and probabilistic approaches are not covered by this survey.

1 Introduction, Disclaimers

This is a hastily written article intended to go along with a 2-hour tutorial whose mandate was to introduce coding theory to theoretical computer scientists. Limited by my (lack of) knowledge, time, and energy, I have chosen to focus the survey on some of the classical algebraic results and some recent algorithmic progress in this direction. This article is just a transcription of my notes. They lack polish, and more relevantly, rigor in many (most?) places. The hope is that they will still provide sufficiently many pointers to help the interested reader gain more information about the area.

Coding theory is the umbrella term used to cover the study of two related, but distinguishable, topics: the theory of "error-correcting codes" and the mathematics behind "reliable communication in the presence of noise". The latter term is self-explanatory, while the former could use some elaboration. Error-correcting codes are collections of sequences of elements from a finite set (or words over a finite alphabet) with some nice extremal properties, namely, any two words in the collection disagree on many coordinates. Thus the theory of error-correcting codes could be considered an area of combinatorial mathematics (with some natural algorithmic tasks that arise within it). The theory of communication in the presence of noise, on the other hand, often leads to information theory and/or statistics. The two aspects of coding theory have enjoyed a symbiotic relationship from the days of their origin.

*MIT Laboratory for Computer Science, 200 Technology Square, Cambridge, MA 02143, USA. http://theory.lcs.mit.edu/~madhu. This work was supported in part by NSF Career Award CCR 9875511, NSF Award 9912342, and MIT-NTT Award MIT 2001-04.

Coding theory, and the dichotomy within it, owes its origins to two roughly concurrent seminal works by Hamming [45] and Shannon [80] in the late 1940s.

Hamming was studying devices to store information and wanted to design simple schemes to protect against the corruption of a small number of bits. Hamming realized the explicit need to consider sets of words, or "codewords", where every pair differs in a large number of coordinates. In doing so, he extracted the combinatorial underpinnings of the theory of error-correcting codes. He defined the notion of distance between two sequences over finite alphabets, now called the Hamming distance, and observed that it is a metric and hence brings interesting properties into play; this is the notion of distance we work with in coding theory. Hamming also constructed an explicit family of codes that achieved non-trivial distance. We will see these codes shortly. Hamming's paper was eventually published in 1950.

Slightly prior to Hamming's publication, Shannon (1948) wrote a treatise of monumental impact formalizing the mathematics behind the theory of communication. In this treatise he developed probability and statistics to formalize a notion of information. He then applied this notion to study how a sender can communicate "efficiently" over different media, or more generally channels of communication, to a receiver. The channels under consideration were of two distinct types: noiseless or noisy. In the former case, the goal is to compress information at the sender's end so as to, say, minimize the total number of "symbols" communicated while allowing the receiver to recover the transmitted information correctly. The noisy case, which is more important to the topic of this article, considers a channel that alters the signal being sent by adding to it a "noise". The goal in this case is to add some redundancy to the message being sent so that a few erroneous "symbols" at the receiver's end still allow the receiver to recover the sender's intended message. Shannon's work showed, somewhat surprisingly, that the same underlying notions captured the "rate" at which one could communicate over either class of channels. Even more surprisingly, Shannon's work showed, contrary to popular belief at the time, that when transmitting information at a fixed "feasible" rate, longer messages were more likely to be recovered (completely correctly) than short ones; so asymptotics actually help! Shannon's methods roughly involved encoding messages using long random strings, and the theory relied on the fact that long messages chosen at random (over a fixed alphabet) tend to be far away from each other. The distance measure under which these strings tend to be far away was not focussed on explicitly in Shannon's work.

Shannon's and Hamming's works were chronologically and technically deeply intertwined. Technically, these works complement each other perfectly. Hamming focusses on the mathematical properties behind these combinatorial objects, while Shannon creates the perfect setting for their application. Hamming focusses on combinatorial aspects, based more on an "adversarial" perspective, while Shannon bases his theory on "probabilistic" models (of error, message, etc.). Hamming's results were highly constructive, and he pays special attention to the small finite cases. Shannon's results were highly non-constructive (and a stellar example of the probabilistic method in action), and his theory pays special attention to the asymptotic, rather than finite, behavior. These complementary influences led Berlekamp [11] to characterize Shannon's work as "statistical and existential" and Hamming's work as "combinatorial and constructive". (A brief digression: the edited volume by Berlekamp [11] is a personal favorite of mine, as a source book of ideas, their history, the mood surrounding the research, as well as for the refreshingly candid opinions of the editor on a broad variety of topics. Reading at least the editorials is highly recommended.) It may be pointed out that the statistical/combinatorial distinction seems to point to differences in their goals, while the existential/constructive distinction seems to be a consequence of the state of their knowledge/capability. Shannon would probably have loved to see his results made more constructive, but the field was not yet developed enough to prove such results. (He definitely lived to see his results transformed into more constructive ones.) Hamming, on the other hand, started with some constructive results and built the theory around these; and thus came this distinction. In the sequel we will refer to the Hamming/Shannon dichotomy, but we will use it to refer to the combinatorial/statistical distinction in their goals.

Moving on to the chronology: while Shannon's paper appeared first, he borrows one code from Hamming's construction and cites Hamming for this code. So he was certainly aware of Hamming's work. Hamming presumably was also aware of Shannon's work by the time of publication of his own, but does not cite it explicitly. However, he cites a short (2/3-page) article by Golay [34], who in turn cites Shannon's article (so Hamming was likely aware of Shannon's work). Both papers seem to regard the other as far away from their own work. Shannon's paper never explicitly refers to error-correcting codes or even the distance property in his main technical results; they only occur between the lines of the proofs of his main results! (He only uses Hamming's codes as an example.) Hamming, in turn, does not mention the applicability of his notion to reliable communication. He goes to the extent of enumerating all prior related works, and mentions Golay's work [34] as the only prior work on error-correcting codes.

Both works, however, were immediately seen to be of monumental impact. Initially both works were cited almost equally often (with a fair number of works that would cite Hamming's paper but not Shannon's). Over time the distinctions between the two perspectives have been overwhelmed by the common themes. Further, the asymptotic theory of Shannon started driving the practice of communication and storage of information. This, in turn, became the primary motivator for much research in the theory of error-correcting codes. As a result, today it is very common to see articles that ascribe the origins of the entire theory to Shannon's work, cf. [49, Chapter 1], [96]. (The Handbook of Coding Theory [49], for instance, introduces Shannon's work on the first page and waits about ten pages before mentioning Hamming's work.)

1.1 Basic definitions

We now move on to some of the basic notions of coding theory. To define codes, messages, and other such notions, let us follow a message on its path from the source to the receiver as it goes over a noisy channel, and let us see what relevant features emerge. (In what follows, emphasis is used to indicate that a phrase is a reserved one, at least in this article if not in all of coding theory. Parenthesized portions of emphasized text indicate portions that are typically omitted in usage.)


As already mentioned, there are three entities involved in this process: the sender, the receiver, and the noisy channel separating the two. The goal of the sender is to communicate one element, or message m, of a finite set M, the message space, to the receiver. Both sender and receiver know this space of possible messages M. This space of messages is "large": exponentially larger than the time we hope to spend in computing or communicating at either the sender's or the receiver's end.

The noisy channel is capable of communicating arbitrarily long sequences of symbols from a finite alphabet Σ. While both M and Σ are finite, Σ should be thought of as small (the reader may even think of it as {0, 1} for the initial part of this article), while M is thought of as being significantly large. So, e.g., running times that are polynomial or even exponential in |Σ| are reasonable, while running times that are polynomial in |M| are typically unreasonable. To send messages over this channel, the sender and receiver agree upon the length of the sequences to be transmitted, termed the block length, usually denoted n. Thus the space of words being transmitted over the channel is Σ^n, referred to as the ambient space. The sender and receiver also agree upon an encoding E, an injective map from the message space to the ambient space (E : M → Σ^n), to be used to encode the messages before transmitting. Thus the sender, to communicate the message m to the receiver, instead transmits E(m) over the channel. The image of the encoding function, {E(m) | m ∈ M}, is the error-correcting code, or simply code. To enable easy quantitative comparisons of the performance or redundancy of the code, it is useful to identify the message space with the space of sequences of length k over Σ. This parameter k does not seem to have a well-defined name in the coding theory literature, so we'll call it the message length. The ratio k/n then measures the (message) rate of the code, a fundamental parameter in coding theory.

Now moving on to the channel: the channel inserts some noise into the transmitted word. Very broadly, the noise of the channel is just a map from the ambient space to itself. To make this definition slightly more interesting, we ascribe a group structure to the alphabet Σ with operator +. This naturally endows the ambient space Σ^n with a group structure as well, with ⟨a_1, …, a_n⟩ + ⟨b_1, …, b_n⟩ = ⟨a_1 + b_1, …, a_n + b_n⟩. In these terms, we can say that the channel produces a noise vector, commonly termed the error pattern, η ∈ Σ^n, and that the receiver receives the corruption y = E(m) + η as the received word. The receiver now employs some decoding function D : Σ^n → M. Hopefully, D runs efficiently and produces D(y) = m.

The information-theoretic focus of coding theory revolves around questions of the form:

Given a distribution P on the error inserted by the channel (i.e., P : Σ^n → [0, 1]), what are the best encoding and decoding functions? More formally, what is the largest attainable probability of correct decoding,

    max_{E, D} { E_{m ∈ M} [ Pr_{η ∼ P} [ D(E(m) + η) = m ] ] } ?

If the error induced by the channel can be described uniformly as a function of n, then this enables asymptotic studies of the channel. E.g., Shannon studied the case where the channel distribution is just an n-wise Cartesian product of a distribution over individual coordinates. In this case he showed that the channel has a capacity C_0 ∈ [0, 1] such that if one picks C < C_0 and ε > 0, then for every large enough n there exists an encoding/decoding pair of rate at least C which achieves an error probability of at most ε. (Thus, asymptotics help.) He also showed that it is not feasible to communicate at rates larger than C_0. It is known that the error probability drops off exponentially with n for every C < C_0, but the exact exponent with which this error drops off is still not known, and is one of the fundamental questions in the existential aspects of Shannon's theory. Of course, once one requires the code and the associated functions to be efficiently computable, a vast host of open questions emerge. We will not dwell on these. The texts of Blahut [14] and Cover and Thomas [18] are possibly good starting points to read more about this direction of work. Let us move on to the Hamming notions.

Recall that Σ is a group; let zero denote the identity element of this group. The (Hamming) weight of a sequence η ∈ Σ^n, denoted wt(η), is the number of non-zero symbols in the sequence. The usage of the letter η is suggestive in that this notion is meant to be applied to the error pattern. It is reasonable to expect (from the noisy channel) that the noise should affect a relatively small number of symbols of the transmitted word, and thus error patterns are restricted in their Hamming weight. The (Hamming) distance between two sequences x and y, denoted Δ(x, y), is simply the Hamming weight of x − y. Hamming defined this notion (explicitly) and noticed that it is a metric, so that other metric concepts can be brought into play here. For example, the notion of a Hamming ball of radius r around a sequence y, i.e., the set {x ∈ Σ^n | Δ(y, x) ≤ r}, will be of interest to us.
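These notions translate directly into code. Here is a minimal Python sketch (the function names are ours, not from any standard library); the ball-volume function V_q(n, r) will be needed again in Section 2:

    from math import comb

    def hamming_weight(x, zero=0):
        """Number of non-zero symbols in the sequence x."""
        return sum(1 for xi in x if xi != zero)

    def hamming_distance(x, y):
        """Number of coordinates on which x and y disagree."""
        assert len(x) == len(y)
        return sum(1 for xi, yi in zip(x, y) if xi != yi)

    def ball_volume(q, n, r):
        """V_q(n, r): the number of words within Hamming distance r of a fixed word."""
        return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))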

Hamming introduced the notions of error-correcting and error-detecting codes and used them to motivate the concept of the (minimum) distance of a code. A code C ⊆ Σ^n corrects e errors if for every sequence y in the ambient space there is at most one codeword of C in the Hamming ball of radius e around y. (Thus if a transmission is corrupted by an error pattern of weight at most e, then error recovery is feasible, at least if we ignore computational efficiency issues.) Using the metric properties, this is equivalent to saying that the Hamming balls of radius e around the codewords are non-overlapping. Similarly, a code C detects e errors if no error pattern of weight at most e maps a codeword into another one. Thus the test "Is the received vector a codeword?" is guaranteed to detect any occurrence of up to e errors. Finally, the minimum distance of a code C, denoted Δ(C), is the minimum Hamming distance between any pair of distinct codewords in it, i.e.,

    Δ(C) = min_{x, y ∈ C, x ≠ y} { Δ(x, y) }.

The metric properties lead to the immediate observation: for every integer e, C is an e-error-correcting code if and only if it is a (2e)-error-detecting code, if and only if it has minimum distance at least 2e + 1. The minimum distance is another fundamental parameter associated with a code, and it makes sense to study it as a fraction of the length of the code. This quantity d/n is sometimes referred to as the distance rate of a code; since "rate" is already a reserved word, this phrasing can lead to awkwardness, and we avoid it by referring to this quantity as the relative distance.
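Continuing the sketch, the observations above give a brute-force check (exponential in general, so only for toy-sized codes) of the minimum distance and hence of the error-correction radius:

    from itertools import combinations

    def minimum_distance(code):
        """Minimum Hamming distance over all pairs of distinct codewords."""
        return min(hamming_distance(x, y) for x, y in combinations(code, 2))

    def corrects(code, e):
        """A code corrects e errors iff its minimum distance is at least 2e + 1."""
        return minimum_distance(code) >= 2 * e + 1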

The fundamental questions of coding theory arising from the Hamming theory concern the construction of codes with good rate and good relative distance. E.g., the most elementary question is:

Given an alphabet Σ of size q, and integers n and k, what is the largest minimum distance d that a code C ⊆ Σ^n of message length k can have?

(Note that the actual alphabet Σ itself does not really play a role in this question once its size is fixed. So q will be an important parameter in the questions below, but not Σ.) The question raised above remains wide open for small q. When q ≥ n, this question has satisfactory answers. Even getting a constant (absolute) distance while maintaining a rate close to 1 is non-trivial. Getting codes of distance 1 is trivial (the identity function does this)! Hamming points out that appending the parity of all bits gives a code of minimum distance 2 over the binary alphabet. His main result is a family of codes of minimum distance 3 that are now named the Hamming codes. Hamming also gives codes of distance 4, as part of a general result showing how to go from a code of odd minimum distance to a code of even minimum distance (by increasing the block length by 1).

The question raised above remains open even in an asymptotic sense. In this setting we look at families of codes C = {C_i | i ∈ Z^+}, with C_i having block length n_i, where the n_i's form an increasing sequence. The rate of the family is the infimum of the rates of the codes in the family. The relative distance of the family is the infimum of the relative distances of the codes in the family. The asymptotic version of the question asks:

Given an alphabet Σ of size q and a rate R, what is the largest relative distance δ for which there exists an infinite family of codes of rate at least R and relative distance at least δ?

A family of codes is said to be asymptotically good if both its rate and its relative distance are greater than zero. While early results showed that asymptotically good codes do exist (see below), explicit constructions of such families took significantly longer. Study has since turned to the explicit relation between R and δ above, which is the core question driving much of the Hamming-based theory.

Before going on to describe some of the early results of coding theory, we mention one broad subclass of codes that covers essentially most of what is investigated: the class of linear codes. In these codes, one associates the alphabet Σ with some finite field of size q. Sequences of length n over Σ can then be thought of as n-dimensional vectors. If a code forms a linear subspace of the ambient space, then the code is said to be linear. While there are no theorems indicating that linear codes should be as good (asymptotically) as general codes, this has been the empirical observation so far. Since these codes bring a host of nice properties with them, their study has been significantly more popular than that of the general case. For example, Hamming suggested the notion of a systematic code: a code whose coordinates can be split into k message symbols and n − k check symbols. Hamming noted that every linear code can be made systematic without loss of performance. Linear codes are special in that such a code C can be described succinctly by giving a generator matrix G ∈ Σ^{k×n} such that C = {xG | x ∈ Σ^k}.

1.2 Rest of this article

The results of coding theory can be classified into one of the following three broad categories (obviously with plenty of scope for further subdivision): (1) constructions/existence results for codes; (2) limitations of codes (non-existence results); (3) efficient algorithms (for encoding and decoding). A fourth category which can be added, but whose contents would vary widely depending on the audience, is (4) applications of codes to other areas. In this article we start by describing the limits to performance first (a non-standard choice), and use this to motivate some of the algebraic constructions. We then move on to the task of decoding. Notable omissions from this article are the non-algebraic constructions and the work on highly efficient encoding. Applications of codes to theoretical computer science abound (cf. [90, Lecture 9]) and new ones continue to emerge at a regular rate (see the thesis of Guruswami [40] for many recent connections). However, we will not describe any here.

2 Bounds and constructions for codes

2.1 The random linear code

Perhaps the most classical "codes" are the random code and the random linear code. Hints that the random code is likely to perform quite well were implicit in the work of Shannon, but he does not make this part explicit. Gilbert [33] was the first to make it explicit. Later, Varshamov showed that the bounds actually hold even for random linear codes, making the search space significantly smaller: singly exponential instead of doubly exponential in the code parameters. Finally, Wozencraft reduced the complexity even further for codes of rate 1/2 (extensions of the technique work for codes of rate 1/l for any integer l), making the complexity of searching for such a code simply exponential (2^{O(n)} instead of 2^{n^{O(1)}}).

To describe the construction and performance of these codes, let us introduce some notation. Let V_q(n, r) denote the volume of the Hamming ball of radius r in Σ^n, where |Σ| = q. Gilbert's bound is based on the following greedy process. Pick K = q^k sequences greedily as follows: having picked v_1, …, v_i, pick v_{i+1} so that it does not lie within distance d of any of v_1, …, v_i. When does such a vector v_{i+1} exist? Note that each vector v_j rules out V_q(n, d) vectors of Σ^n, so together they rule out at most i · V_q(n, d) vectors. Thus if (K − 1) · V_q(n, d) < q^n, the greedy process can always make progress. Varshamov picks the generator matrix of a linear code in a similar greedy process; it is a simple exercise to show that this process also achieves the same bound. Wozencraft's construction (personal communication in [65, Section 2.5]) is quite different, but achieves the same bounds for rate-1/2 codes. It is as follows: let n = 2k and associate Σ^k with a finite field F_{q^k} of size q^k. For every member α ∈ F_{q^k}, let the code C_α be given by {(x, αx) | x ∈ F_{q^k}}. (Note that the messages are now being viewed as elements of F_{q^k}, while the encodings are two elements of the same field.) Wozencraft shows that one of the codes C_α achieves the same performance as guaranteed by the results of Gilbert and Varshamov. (Wozencraft's ensemble is actually a wider class of constructions, of which the above is one specific example; this is the example given explicitly in Justesen [51].)
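Gilbert's greedy process is short enough to sketch directly (reusing hamming_distance from Section 1.1; the scan is exhaustive over Σ^n, so this is for tiny parameters only):

    from itertools import product

    def gilbert_greedy(q, n, d):
        """Greedily collect words that are pairwise at Hamming distance >= d.
        The volume argument in the text guarantees progress while
        (K - 1) * V_q(n, d) < q^n."""
        code = []
        for v in product(range(q), repeat=n):
            if all(hamming_distance(v, c) >= d for c in code):
                code.append(v)
        return code

    # e.g. gilbert_greedy(2, 7, 3) collects 16 words, matching the parameters
    # of the [7, 4, 3] Hamming code we meet in Section 2.3.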

To understand the bound q^k · V_q(n, d) ≤ q^n in a slightly clearer light, let us simplify it. For 0 < δ < 1, let

    H_q(δ) = δ · log_q((q − 1)/δ) + (1 − δ) · log_q(1/(1 − δ)).

Then V_q(n, δn) is well approximated by q^{H_q(δ)·n}. The results of Gilbert and Varshamov (and Wozencraft) show that for every choice of n and δ, their sample space contains a code of block length n, minimum distance δn, and message length n − log_q(V_q(n, δn)) ≈ n(1 − H_q(δ)). Thus asymptotically one can obtain codes of relative distance δ and rate 1 − H_q(δ). This result is usually termed the Gilbert-Varshamov bound: there exist codes of rate R and relative distance δ satisfying

    R ≥ 1 − H_q(δ).   (Gilbert-Varshamov bound)

(This bound is syntactically similar to Shannon's bound for the capacity of the "q-ary symmetric channel" of error rate δ. Such a channel acts on the coordinates of the transmitted word independently, preserving each coordinate with probability 1 − δ and flipping it to a random different symbol with probability δ/(q − 1). The Shannon capacity of such a channel is 1 − H_q(δ). This is not a coincidence!)
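For concreteness, the entropy function and the Gilbert-Varshamov rate are easy to tabulate (a small sketch; H_q exactly as displayed above, and reused for the bounds of the next subsection):

    from math import log

    def H(q, delta):
        """The q-ary entropy function H_q(delta), with H_q(0) = 0."""
        if delta == 0:
            return 0.0
        logq = lambda x: log(x) / log(q)
        return delta * logq((q - 1) / delta) + (1 - delta) * logq(1 / (1 - delta))

    def gv_rate(q, delta):
        """Rate guaranteed to exist by the Gilbert-Varshamov bound: 1 - H_q(delta)."""
        return 1 - H(q, delta)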

2.2 Non-existence results for codes

Given a choice of n, k, d and q, does a q-ary code of length n, message length k and minimum distance d exist? We already know from the previous section, via the probabilistic method, that such a code exists whenever q^k · V_q(n, d) ≤ q^n. The main question driving much of this section is whether the performance of the random code can be beaten in an asymptotic sense, and if not, how can we prove this? In other words, what techniques can we use to prove that certain choices of code parameters are impossible to achieve? (These are the sort of results that we computer scientists are used to calling "lower bounds", except that in the coding context these are really upper bounds on the rate as a function of the relative distance.)


We will start by describing some simple ideas used to prove these upper bounds and then give pointers to the state-of-the-art results.

We start with a simple bound due to Singleton (cf. [63]). Suppose C is a code of length n and message length k. Project C down to any k − 1 coordinates. By the pigeonhole principle, two codewords must project to the same string; these two codewords thus differ in at most n − k + 1 coordinates. This proves that the minimum distance d of the code satisfies:

    d ≤ n − k + 1.   (Singleton bound)

The bound seems naive, yet it is tight in that some codes achieve it! Codes that match the Singleton bound are called Maximum Distance Separable (MDS) codes, and we will see an example in the next section.

The Singleton bound is independent of the alphabet size q. Indeed, the codes that achieve it use large values of q. In the following we derive bounds that improve when q is small. We start with a bound from Hamming's original paper. (Note that in the very first paper, Hamming had formalized codes to the extent that he could prove negative results!) This bound is based on the following simple observation: if we consider a code of minimum distance d, then Hamming balls of radius d/2 around the codewords are disjoint. Since there are at least q^k such balls, we get q^k · V_q(n, d/2) ≤ q^n. Written asymptotically: if a code of rate R has relative distance δ, then

    R ≤ 1 − H_q(δ/2).   (Hamming bound)

Once again, there do exist codes that meet the Hamming bound (the exact version, not the asymptotic version); such codes are called perfect. However, they don't exist for all choices of parameters.

In fact, while the Hamming bound is quite good as δ → 0, it deteriorates considerably for larger δ and ends up proving obviously weak bounds for large δ. In particular, random codes only achieve a relative distance of 1 − 1/q in the best of cases, while the Hamming bound allows a relative distance of twice this quantity, which is clearly impossible (since the relative distance should not be greater than one!). This bound was improved upon by Elias (unpublished, but reported later in [81]) and independently by Bassalygo [7], building upon proofs of Plotkin [74] and Johnson [50] respectively. The principal idea behind these bounds is the following. We know that no Hamming ball in Σ^n of radius d/2 includes two codewords from a code of distance d. We can do slightly better: it is possible to show that no Hamming ball of radius

    r := (1 − 1/q) · (n − √(n · (n − qd/(q − 1))))

contains more than qn codewords. This quantity r lies between d/2 and d, and tends to d as d/n → 1 − 1/q. Thus if we take the q^k balls of radius r around the codewords, we overcount the points of the ambient space only by a factor of at most qn. Thus we get q^k · V_q(n, r) ≤ qn · q^n. Asymptotically this shows that if a q-ary code of rate R and relative distance δ exists, then:

    R ≤ 1 − H_q((1 − 1/q) · (1 − √(1 − qδ/(q − 1)))).   (Elias-Bassalygo bound)

The Elias-Bassalygo bound already gives decent asymptotic bounds on the rate-versus-relative-distance performance of codes. (In particular, it meets the Gilbert-Varshamov bound at the two ends, i.e., at R = 0 and R = 1.) In the interim region the two bounds are separated from each other, but it is hard to get a qualitative sense of the difference in this regime. However, a closer look at the two endpoints does reveal a qualitative difference. For instance, if we fix an absolute distance d and ask for the limit of n − k as n, k → ∞, then the Hamming/Elias-Bassalygo bounds require n − k ≥ (d/2) log_q n, while the Gilbert-Varshamov bound achieves n − k ≈ d log_q n. So these bounds are only off by a factor of 2 (in some sense), but it turns out that the non-existence bound is the right one (and the Gilbert-Varshamov bound is not!).
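Reusing H from the sketch above, the bounds discussed so far are easy to compare numerically (formulas exactly as displayed in the text):

    from math import sqrt

    def hamming_rate(q, delta):
        """Hamming (sphere-packing) upper bound: R <= 1 - H_q(delta / 2)."""
        return 1 - H(q, delta / 2)

    def elias_bassalygo_rate(q, delta):
        """Elias-Bassalygo upper bound: R <= 1 - H_q(J) at the Johnson radius J."""
        J = (1 - 1 / q) * (1 - sqrt(1 - q * delta / (q - 1)))
        return 1 - H(q, J)

    # At q = 2, delta = 0.3: gv_rate ~ 0.12, elias_bassalygo_rate ~ 0.31,
    # hamming_rate ~ 0.39, so existence and impossibility are still far apart.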

On the other hand, at the other end, consider codes of relative minimum distance 1 − 1/q − ε and ask what is the largest rate they can achieve: the Elias-Bassalygo bound shows that R ≤ O(ε), while the Gilbert-Varshamov bound only achieves R ≥ Ω(ε²). So here again the two bounds are qualitatively off from each other. In this case, it turns out that the Elias-Bassalygo bound is the one that is too weak. A better bound does exist in the literature, termed the linear programming bound. This bound was introduced by Delsarte [19], who showed that a certain linear program on n variables with O(n) inequalities can be used to give an upper bound on the message length of a code of minimum distance d. (This linear program was in turn inspired by a series of identities due to MacWilliams [62] which establish quantitative relationships between a linear code and its "dual".) Going from the linear program to an asymptotic bound on the performance of a code turns out to be highly non-trivial. Such an analysis (not known to be tight) was given by McEliece, Rodemich, Rumsey and Welch [68]; it shows, in our example, that a code of relative minimum distance 1 − 1/q − ε has rate at most O(ε² log(1/ε)), and thus the Gilbert-Varshamov bound is almost tight at this extreme. A complete description of the analysis leading to the [68] bound can be found in the text of van Lint [59]. Levenshtein's article [57] on this subject is a source of more information.

2.3 Codes

We will introduce various codes to the reader in approximately historical order, contrasting them with the various bounds achieved so far. All codes of this section are linear unless otherwise noted. The associated alphabet will be F_q, the field of cardinality q. All vectors will be row vectors (the standard convention of coding theory). Almost all of these codes beat the Gilbert-Varshamov bound at some point or the other.

Hamming codes. These are linear codes obtained as follows. For any integer l, consider the q^l × l matrix H' whose rows include all l-dimensional vectors over F_q. Delete the row corresponding to the all-zeroes vector from H'. Next, delete scalar multiples from H': i.e., if v_i = λ·v_j for some λ ∈ F_q, delete one of the rows v_i or v_j arbitrarily; continue this process till H' has no scalar multiples left. This leaves us with an n × l matrix, call it H, with n = (q^l − 1)/(q − 1). The Hamming code of length n is the collection of vectors {x ∈ F_q^n | xH = 0}. Note that this forms a linear subspace of F_q^n. The message length equals n − l ≈ n − log_q n. It can be established fairly easily that the minimum distance of this code is at least 3 (after some reasoning, this corresponds to establishing that no two rows of H are linearly dependent). The Hamming codes are perfect in that they meet the Hamming bound exactly. The earlier-mentioned paper of Golay mentions two other codes (one binary and one ternary) which also meet the Hamming bound. After a significant amount of work, van Lint [58] and Tietäväinen [94] established that these are the only perfect codes!
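In the binary case (q = 2, where deleting scalar multiples just removes the zero row), the construction is a few lines of Python (a sketch; helper names are ours):

    import itertools

    def hamming_parity_check(l):
        """Rows are all non-zero l-bit vectors: an n x l matrix with n = 2^l - 1."""
        return [v for v in itertools.product([0, 1], repeat=l) if any(v)]

    def in_hamming_code(x, H):
        """x is a codeword iff xH = 0 over F_2."""
        return all(sum(xi * row[j] for xi, row in zip(x, H)) % 2 == 0
                   for j in range(len(H[0])))

    H = hamming_parity_check(3)      # parity checks of the [7, 4, 3] Hamming code
    assert in_hamming_code([0, 0, 0, 0, 0, 0, 0], H)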

(Generalized) Reed-Solomon codes. These codes, due to Reed and Solomon [76], are defined for q ≥ n. Let α_1, …, α_n be n distinct elements of F_q. The Reed-Solomon code of message length k, with parameters α_1, …, α_n, is defined as follows: associate with a message m = ⟨m_0, …, m_{k−1}⟩ the polynomial M(x) = Σ_{j=0}^{k−1} m_j x^j. The encoding of m is the evaluation of M at the n given points, i.e., E(m) = ⟨M(α_1), …, M(α_n)⟩. By construction, the Reed-Solomon codes have message length k and block length n. Their minimum distance is n − k + 1, an immediate consequence of the fact that two distinct polynomials of degree k − 1 can agree on at most k − 1 points. Thus the Reed-Solomon codes give a simple class of codes that meet the Singleton bound. Such codes are called Maximum Distance Separable (MDS) codes. These are essentially the only such codes that are known.
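A minimal Reed-Solomon encoder over a prime field F_p (a sketch; we take p prime so that arithmetic mod p is field arithmetic):

    def rs_encode(msg, alphas, p):
        """Evaluate M(x) = sum_j msg[j] * x^j at each alpha (Horner's rule), mod p."""
        def eval_poly(a):
            v = 0
            for c in reversed(msg):
                v = (v * a + c) % p
            return v
        return [eval_poly(a) for a in alphas]

    # A [6, 2] Reed-Solomon code over F_7, with minimum distance 6 - 2 + 1 = 5:
    codeword = rs_encode([3, 5], alphas=[0, 1, 2, 3, 4, 5], p=7)   # [3, 1, 6, 4, 2, 0]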

BCH and alternant codes. These codes are named after their co-discoverers: Hocquenghem [47] and Bose and Ray-Chaudhuri [17] proposed these binary codes essentially around the same time as the Reed-Solomon codes were proposed. We won't describe the codes explicitly here, but instead describe a slightly more general class of codes called alternant codes. Essentially, these codes can be obtained from Reed-Solomon codes as we describe next. For n-dimensional vectors x and y (over some field F), let x ⋆ y be the coordinatewise product of x and y. Pick a finite field F_{2^l} such that 2^l ≥ n, pick α_1, …, α_n distinct from F_{2^l}, and let C_1 be the Reed-Solomon code of dimension k over this field. Now pick a vector v = ⟨v_1, …, v_n⟩ with v_i ∈ F_{2^l} \ {0}, and consider the code C_2 = {v ⋆ c | c ∈ C_1}. Note that C_2 has the same message length and distance as C_1. Now take the intersection of C_2 with F_2^n (i.e., take only the binary vectors from this code). Often it is possible to make non-trivial choices of the vectors α and v so that the resulting code has large dimension. BCH codes are one such family of codes. The BCH codes yield codes of block length n, minimum distance d and dimension k = n − ((d − 1)/2) log_2 n, and thus asymptotically match the Hamming/Elias-Bassalygo bounds described above. (In particular, these codes generalize the Hamming codes.)

Reed-Muller codes. These codes are named after Muller [69], their discoverer, and Reed [75], who gave a decoding algorithm for them. Appropriately generalized, these codes are based on evaluations of multivariate polynomials. Consider an l-variate domain over a q-ary field, and consider the space of all polynomials of total degree at most m over this field, subject to the condition that no variable takes on a degree of q or more. If m < q, then this is a vector space of dimension C(m + l, m). For m ≥ q, the dimension of this space does not have a closed form, but it can easily be lower bounded by the binomial coefficient C(l, m). We identify this space with the vector space of messages and let the encoding of a message be the evaluations of the corresponding polynomial at all points in the domain; thus we get n = q^l. The relative distance of these codes is at least 1 − m/q if m < q, and at least q^{−m'}, where m' = ⌈m/q⌉, for m ≥ q. (The first part follows from the familiar Schwartz-Zippel lemma of theoretical computer science; of course, the Reed-Muller codes predate this lemma by 25 years.) The parameter m above (the degree) is referred to as the order of the Reed-Muller code.
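A sketch of the encoding map in the case m < q over a prime field (enumerate monomials of total degree at most m, then evaluate at all q^l points; helper names are ours):

    from itertools import product

    def rm_monomials(l, m):
        """Exponent vectors of total degree <= m (each exponent < q holds since m < q)."""
        return [e for e in product(range(m + 1), repeat=l) if sum(e) <= m]

    def point_eval(pt, e, q):
        """The monomial prod_i pt[i]^e[i], mod q."""
        v = 1
        for x, ei in zip(pt, e):
            v = (v * pow(x, ei, q)) % q
        return v

    def rm_encode(coeffs, l, m, q):
        """Evaluate the polynomial with the given monomial coefficients on F_q^l."""
        mons = rm_monomials(l, m)
        assert len(coeffs) == len(mons)    # message length = C(m + l, m)
        return [sum(c * point_eval(pt, e, q) for c, e in zip(coeffs, mons)) % q
                for pt in product(range(q), repeat=l)]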

Hadamard codes. Reed-Muller codes of order one are special in coding theory, since they form a subclass of the "Hadamard" codes, or simplex codes. Binary Hadamard codes are obtained from integer n × n matrices H with entries from {+1, −1} that are self-orthogonal (i.e., H^T H = nI). The rows of such a matrix (interpreted as binary vectors) make a binary code (not necessarily linear) in which every pair of codewords differ from each other in half the places. Codes over a q-ary alphabet with a relative distance of 1 − 1/q are what we term Hadamard codes.

Algebraic-geometry codes. No description of algebraic coding theory is complete without mention of the famed "algebraic-geometry codes", or "geometric Goppa codes". These codes may be thought of as codes obtained by starting with Reed-Muller codes of some order and then removing a large number of coordinates (or projecting onto a small number of coordinates). This process may identify many codewords. A very careful choice of the coordinates onto which one projects, based on deep insights offered by algebraic geometry, leads to codes over q-ary alphabets of dimension k and block length n, with minimum distance at least n − k − n/(√q − 1), when q is a square. The most interesting aspect of these codes is that over constant-sized alphabets they can beat the Gilbert-Varshamov bound, with rate and relative distance bounded away from 0 and 1. (Note that all the other codes of this section also beat the random codes; however, they either require a large alphabet (Reed-Solomon codes), or do better only for rate 1 − o(1) (Hamming codes, BCH codes), or do better only for rate o(1) (Hadamard codes).)

Concatenated codes. Concatenated codes don't refer to any fixed family of codes, but rather to codes obtained by a certain mechanism for combining two families of codes. This construction and some of its immediate applications are extremely useful, so we describe them in this section. Suppose we have two codes: C_1 of length n_1, dimension k_1 and distance d_1, and C_2 of length n_2, dimension k_2 and distance d_2. Suppose further that the alphabet of the second code has size q, and that the alphabet of the first code is carefully chosen to be of size q^{k_2}. We call the first code the outer code and the second one the inner code. The two codes are thus not in a symmetric relationship; in particular, the number of codewords of the inner code corresponds exactly to the number of letters in the outer alphabet. This gives a new way to encode messages of the outer code: first encode using the outer code, then encode each letter of the ensuing codeword (an element of the outer alphabet) using the encoding provided by the inner code. It is easy to prove that this leads to a q-ary code of length n_1·n_2 and message length k_1·k_2 with minimum distance at least d_1·d_2.
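The construction is essentially a composition of encoders (a sketch; the two encoding functions are passed in, with each outer-codeword symbol being a k_2-tuple over the inner message alphabet):

    def concatenate(outer_encode, inner_encode):
        """Encoder for the concatenation of C_1 with C_2: outer-encode the
        message, then inner-encode each symbol of the outer codeword."""
        def encode(msg):
            word = []
            for symbol in outer_encode(msg):
                word.extend(inner_encode(symbol))
            return word
        return encode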

This process was proposed by Forney [27], who noted that if one uses a Reed-Solomon code as the outer code and searches for the best possible inner code (in polynomial time), then one gets a code with very good error-correcting properties. In particular, for an appropriate choice of parameters one gets asymptotically good codes that are constructible in polynomial time: the first such construction! Forney did not state such a result (of the Hamming type) explicitly. Instead he cast his results in the Shannon model (correcting random errors), and in this setting Forney's results were even more impressive. He gave a decoding algorithm for concatenated codes based on a natural combination of the inner and outer decoding, and using this he got an algorithmic version of Shannon's theorem, where both the encoder and the decoder work in polynomial time and offer exponentially small probability of error! In some sense Forney closed two of the most significant open questions in coding theory, one in the Shannon style and one in the Hamming style, using a very elementary but deeply insightful idea.

These ideas were built upon further by Justesen, who suggested the use of n_1 different inner codes, one for each coordinate of the outer code. He noticed that it suffices if most of the inner codes exhibit good distance. He then applied this idea with the Wozencraft ensemble as the n_1 inner codes to get a fully explicit (logspace-constructible?) family of codes that are asymptotically good. More recently, Katsman, Tsfasman and Vladuts [53] concatenated algebraic-geometry codes with Hadamard codes to get binary codes of rate Ω(ε³) and relative distance 1/2 − ε, giving the best codes at such high relative distance.

Brief history of developments. The above codes were mentioned in roughly chronological order, but with a few inversions. Here we clarify the exact order and some interim history. Hamming had already introduced linear algebra (at least over F_2) in the context of coding theory. Slepian [85] was the first to systematically explore the role of linear algebra in coding, and thus to introduce linear error-correcting codes as a field of their own.

The first truly "algebraic" codes were the Reed-Muller codes; this is where field multiplication seems to have first been used in a strong form. The Reed-Solomon and BCH codes were all developed around the same time. The original description of BCH codes was not like the description here, and they were not known/felt to be related to RS codes in any way. BCH codes were an immediate hit, since they worked over the binary alphabet and generalized Hamming codes in the right way. Peterson analyzed these codes and gave a polynomial-time decoding algorithm, a highly non-trivial achievement. Later, Gorenstein and Zierler [38] extended the BCH codes to non-binary alphabets and also extended the decoding algorithm to these codes. They also noticed that Reed-Solomon codes exhibit an interesting duality property (under an appropriate choice and ordering of the α_i's) which allows them to be treated as a subfamily of BCH codes. Since then the study and description of RS codes has been merged into that of the BCH codes in the common texts (a pity, in my opinion). RS codes do not seem to have enjoyed the same attention as combinatorial objects because of their large alphabet. Forney's work changed some of this by finding a use for codes over large alphabets. Also, for some storage media it made sense to view the channel as naturally offering a large alphabet. People also noticed that large alphabets provide a natural way to cope with bursty error patterns. So Reed-Solomon codes eventually became quite popular and got widely deployed, more so than any other algebraic code family (perhaps excluding the parity code!).

We move on to the algebraic-geometry codes. These codes were an "invention" of V. D. Goppa [37], who suggested this route way before it was clear that these codes could offer any interesting possibilities. It took a significant amount of time and effort before the full potential of this idea was realized, in the work of Tsfasman, Vladuts and Zink [95], which builds upon some deep work in algebraic number theory. Proofs of the properties of these codes are being simplified significantly nowadays, with the works of Garcia and Stichtenoth [30] being major steps in this direction. For the time being, the best references are [87, 48].

Non-algebraic codes. As already mentioned, we won't be covering such codes. Yet we should mention that there is an extremely elegant body of work based on binary codes whose parity-check matrices are of "low density" (i.e., have a small number of non-zero entries). These codes, called Low-Density Parity-Check (LDPC) codes, were innovated by Gallager [28], who showed that a random matrix with this property produces codes that are asymptotically good and allow for efficient (linear-time) decoding of a constant fraction of errors. This result completely belies the intuition that for a code to be decodable efficiently it must also be constructible efficiently. Subsequent work by Tanner [92] started to make many of these constructions explicit. Finally, the works of Sipser and Spielman [84] and Spielman [86] culminated in fully explicit constructions of linear-time encodable and decodable codes in this family. LDPC codes remain an active area of study today, with the works of Luby et al. [61, 60] and Richardson et al. [77, 78] showing the potential these codes have for achieving the Shannon capacity of various channels even at small values of the block length. The work of Barg and Zemor [6] shows how these codes lead to linear-time decodable codes that achieve Shannon capacity!

3 Decoding algorithms

3.1 Algorithmic Tasks

There are three essential algorithmic tasks associated with error-correcting codes: (1) construction of the code; (2) encoding; and (3) decoding. The first of these tasks was implicitly dealt with in the previous section. For linear codes, the task of encoding takes only O(n²) time once the code is specified by its generator matrix, so this part is not too hard. Faster algorithms can, should, and have been sought, but we will not dwell on this part here. A somewhat harder task, with naive algorithms running in exponential time, is the task of correcting errors, or decoding.

The decoding problem is not entirely trivial to formulate, and numerous variants of it exist in the literature. The first question raised by these problems is: how should the code be specified? If it is part of the input, hardness results crop up almost immediately. The first such results were due to Berlekamp, McEliece and van Tilborg [13]. Subsequently, many variants have been shown to remain hard, e.g., approximation [2, 20], decoding to within error bounded by the distance [23], and decoding from a fixed number of errors [21]. (See also the survey by Barg [5].) In this section we will not deal with such problems, but focus on the decoding problem for fixed classes of (algebraic) codes.

Even after we fix the code (family) to be decoded, the decoding problem is not completely fixed. The literature includes a multitude of decoding problems: maximum likelihood decoding (MLD), the nearest codeword problem (NCP), bounded distance decoding (BDD), unambiguous decoding, list decoding, etc. Every pair of these problems differ from each other (sometimes very slightly). Here we attempt to fix these terms. In all problems the input is a vector r from the ambient space Σ^n. In all cases the output is a (possibly empty) set of messages m (or, equivalently, codewords c). The main question is what is expected of c. The following criteria distinguish the above problems:

MLD: Here we are given a distribution D on the error patterns, and we wish to output a single codeword c that maximizes the probability of the error pattern r − c.

NCP: Here we wish to output a single codeword c that is closest to r in Hamming distance. (Note that this is essentially the MLD problem when the distribution on errors is that of the q-ary symmetric channel, i.e., the error is i.i.d. on the coordinates, with the error on any coordinate being equally likely to be any non-zero symbol.)

List decoding: Here an error bound e is fixed, and the goal is to output the set of all codewords within Hamming distance e of the received vector.

Bounded distance decoding: Here, along with the code, an error bound e is fixed, and one is expected to output a single codeword c within Hamming distance e of r if one exists, or the empty set otherwise.

Unambiguous decoding: This is the bounded distance decoding problem with e set to (d − 1)/2, where d is the minimum distance of the code C.

Informally, the problems roughly decrease in complexity as one goes down the list (though of course this depends on the choice of the error parameter e in the list decoding and bounded distance decoding problems above). The classically studied problem is that of unambiguous decoding. This is the problem that Peterson [73] solved in polynomial time for BCH codes, and the problem that led to the famed Berlekamp-Massey algorithm. We will discuss such an algorithm in the following section. Attempts to solve either the bounded distance decoding problem or the potentially harder list decoding problem for larger error values were not really pursued actively till recently. Recently, the author and several others have found list decoding algorithms for a varied family of codes, solving the problem for e larger than d/2; we will discuss these later. The MLD and NCP problems have not really been solved exactly in any non-trivial cases. (Two trivial cases are the Hamming code, where it is possible to enumerate all error patterns in polynomial time, and the Hadamard code, where it is possible to enumerate all codewords in polynomial time.) But reasonably good approximations to these problems can be solved efficiently, as was shown by Forney [27]. We will discuss some of this later.

3.2 Unambiguous decoding for algebraic codes

Unambiguous decoding algorithms are now known for almost all algebraically (or number-theoretically) defined codes, including BCH, Reed-Solomon, Reed-Muller and algebraic-geometry codes. Furthermore, these algorithms have been unified to the extent that a single decoding algorithm, abstracted appropriately, actually works for all these cases. Such an algorithm is described in [48, Section 6]. Here we will be slightly less ambitious and describe an algorithm that works for all the code families above except the Reed-Muller case. We will describe the algorithm in general and show how it applies to the Reed-Solomon case. This algorithm is essentially from the thesis of Duursma [24] and is attributed to Duursma, Kötter [54] and Pellikaan [72].

Recall that x ⋆ y denotes the coordinatewise product of x and y. For sets of vectors A and B, let A ⋆ B = {x ⋆ y | x ∈ A, y ∈ B}.

Definition 1. A pair of linear codes (A, B) is said to be a t-error-correcting pair for a linear code C of block length n over Σ if (1) A ⋆ C ⊆ B, (2) the dimension of A is at least t + 1, (3) the minimum distance of A is at least n − Δ(C) + 1, and (4) the minimum distance of B is at least t + 1.

The above definition consists almost entirely of basic notions from linear codes, except for the first condition, which involves "higher algebra" (and sets things up so that quadratic constraints may be made to look linear).

Theorem 2. For t ≤ n, let A, B, C be linear codes over F of block length n such that (A, B) forms a t-error-correcting pair for C. Then, given generator matrices for A, B, C and a "received word" r ∈ F^n, a codeword c ∈ C that differs from r in at most t locations can be found in O(n³) time.

Proof: The algorithm for finding the codeword c works in two steps.

(Step 1) Find a, b satisfying:

    a ∈ A \ {0},  b ∈ B,  and  a ⋆ r = b.   (1)

(Step 2) Let I := {i : a_i ≠ 0}. Find c ∈ C such that c_i = r_i for every i ∈ I, if such a c exists, and output it.

Let G_A, G_B and G_C be the generator matrices of A, B and C respectively, and let k_A, k_B, k_C denote their dimensions. Step 1 above amounts to finding vectors x_A ∈ F^{k_A} and x_B ∈ F^{k_B} such that (x_A G_A)_i · r_i = (x_B G_B)_i for every i. This is a linear system with n equations and k_A + k_B ≤ 2n unknowns, and can certainly be solved using O(n³) field operations. Similarly, Step 2 amounts to finding x_C ∈ F^{k_C} such that (x_C G_C)_i = r_i for i ∈ I. Again, this forms at most n linear equations in k_C ≤ n unknowns and can be solved in O(n³) time. This yields the claim about the running time.

The correctness is proved via three claims. (We enumerate the claims below but omit the details of the proofs; the reader is referred to [36] for full details.)


We first claim that if r is in fact close to some codeword c, then a pair satisfying Equation (1) must exist, a claim that follows from a rank argument (the relevant linear system is homogeneous and has more unknowns than constraints). Next, we claim that for any pair a, b satisfying Equation (1), every codeword c that is close to r will satisfy the conditions required in Step 2. This claim follows from the fact that A ⋆ C ⊆ B and B has large minimum distance. Finally, we claim that for any pair a, b, there is at most one solution to Step 2. This step follows from the fact that C has large minimum distance. This completes the proof that c is the unique answer output by our algorithm.

The above paradigm for decoding unifies most of the known (unique) decoding algorithms. For example, the Welch-Berlekamp algorithm [98, 12] (as described in [32]) can be obtained as a special case of the above. To decode a Reed-Solomon code C of dimension k (with the message space being all polynomials of degree less than k), we use as the code A the Reed-Solomon code consisting of all polynomials of degree at most ⌊(n − k)/2⌋. We note that A ⋆ C is contained in the space B of polynomials of degree at most ⌊(n + k)/2⌋. Now A and B satisfy all the conditions required of a t-error-correcting pair for t = ⌊(n − k)/2⌋, and thus give a t-error-correcting algorithm.
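For the Reed-Solomon case, this specialization is the classical Welch-Berlekamp algorithm, and it is short enough to sketch in full. The following Python sketch (over a prime field F_p; the helper names are ours, and the example continues rs_encode from Section 2.3) sets up Step 1 as one homogeneous linear system and performs Step 2 as a polynomial division:

    def wb_decode(r, alphas, k, p):
        """Welch-Berlekamp: find a(x) != 0 of degree <= t and b(x) of degree
        <= t + k - 1 with a(alpha_i) * r_i = b(alpha_i) for all i; then b = a * M,
        so the message polynomial M is recovered as b / a."""
        n = len(r)
        t = (n - k) // 2
        rows = []
        for alpha, ri in zip(alphas, r):
            row = [(ri * pow(alpha, j, p)) % p for j in range(t + 1)]    # a-part
            row += [(-pow(alpha, j, p)) % p for j in range(t + k)]       # b-part
            rows.append(row)
        sol = nullspace_vector(rows, p)
        if sol is None or not any(sol[:t + 1]):
            return None
        a, b = sol[:t + 1], sol[t + 1:]
        quot, rem = poly_divmod(b, a, p)
        if any(rem) or any(quot[k:]):
            return None                           # more than t errors
        return (quot + [0] * k)[:k]

    def nullspace_vector(rows, p):
        """A nonzero solution of a homogeneous linear system over F_p, if one exists."""
        m, nvars = len(rows), len(rows[0])
        A = [row[:] for row in rows]
        pivots, rank = {}, 0                      # pivot column -> row index
        for col in range(nvars):
            piv = next((i for i in range(rank, m) if A[i][col]), None)
            if piv is None:
                continue
            A[rank], A[piv] = A[piv], A[rank]
            inv = pow(A[rank][col], p - 2, p)     # inverse via Fermat's little theorem
            A[rank] = [(x * inv) % p for x in A[rank]]
            for i in range(m):
                if i != rank and A[i][col]:
                    f = A[i][col]
                    A[i] = [(x - f * y) % p for x, y in zip(A[i], A[rank])]
            pivots[col], rank = rank, rank + 1
        free = next((c for c in range(nvars) if c not in pivots), None)
        if free is None:
            return None
        x = [0] * nvars
        x[free] = 1
        for col, rowi in pivots.items():
            x[col] = (-A[rowi][free]) % p
        return x

    def poly_divmod(num, den, p):
        """Quotient and remainder of polynomial division over F_p (low degree first)."""
        num, den = num[:], den[:]
        while den and den[-1] == 0:
            den.pop()                             # strip leading zeros
        quot = [0] * max(len(num) - len(den) + 1, 1)
        inv = pow(den[-1], p - 2, p)
        for i in range(len(num) - len(den), -1, -1):
            quot[i] = coef = (num[i + len(den) - 1] * inv) % p
            for j, d in enumerate(den):
                num[i + j] = (num[i + j] - coef * d) % p
        return quot, num

    # One error in the [6, 2] code over F_7 from the encoding sketch above:
    received = [3, 1, 0, 4, 2, 0]      # rs_encode([3, 5], [0, ..., 5], 7) with r[2] hit
    assert wb_decode(received, [0, 1, 2, 3, 4, 5], k=2, p=7) == [3, 5]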

3.2.1 Decoding concatenated codes

In this section we describe an extremely elegant polynomial-time algorithm for decoding concatenated codes under fairly minimal assumptions about the codes involved in the concatenation. Specifically, the assumptions are that the alphabet size of the outer code is polynomially bounded in the block length of the concatenated code, and that the outer code has a polynomial-time "errors and erasures" decoding algorithm. The latter assumption needs some clarification, and we will get to it shortly.

First, let us recall the notion of concatenation. If C_1 is a code of block length n_1, message length k_1 and minimum distance d_1 over an alphabet of size q^{k_2}, and C_2 is a code of block length n_2, message length k_2 and minimum distance d_2 over an alphabet of size q, then C = C_1 ∘ C_2, the concatenation of C_1 and C_2, is a code of block length n = n_1·n_2, message length k = k_1·k_2 and minimum distance d = d_1·d_2 over an alphabet of size q. This code is obtained by encoding a message (coming from a space of size (q^{k_2})^{k_1}) first using the encoding function associated with C_1, and then encoding each coordinate of the resulting word using the encoding function associated with C_2.

Going on to the decoding problem: let us use Σ to denote the q-ary alphabet of C_2 and C, and Γ to denote the q^{k_2}-ary alphabet of C_1 (which is also the message space of C_2). Let E_1 and E_2 denote the encoding functions used by C_1 and C_2. We receive a vector r = ⟨r_1, …, r_{n_1}⟩, where the r_i are elements of Σ^{n_2}. A natural algorithm to decode r is the following two-stage process: (1) first, for each i ∈ {1, …, n_1}, compute the y_i ∈ Γ that minimizes the Hamming distance between r_i and E_2(y_i); (2) let y = ⟨y_1, …, y_{n_1}⟩ ∈ Γ^{n_1}, compute the nearest codeword c ∈ C_1 to y, and output it.

Let us first examine the computational complexity of the above algorithm. (1) The first step runs in time polynomial in |Γ| by a brute-force implementation, and this is not an unreasonable running time. (2) The second step is a decoding step for the code C_1. So if we assume that the outer code has a small alphabet and a fast, say, unambiguous decoding algorithm, then the above algorithm runs efficiently. Now let us examine the question of how many errors the above algorithm corrects. Say the transmitted message is m, and let E_1(m) = ⟨z_1, …, z_{n_1}⟩. For the second stage to output E_1(m), we need y_i ≠ z_i for fewer than d_1/2 values of i. In turn, this forces us to make sure that on at most d_1/2 of the blocks the Hamming distance between E_2(z_i) and r_i exceeds d_2/2. All in all, this quick calculation reveals that the algorithm can correct essentially d_1·d_2/4 = d/4 errors. This is a reasonable achievement under minimal assumptions, but it does not yet achieve the goal of unambiguous decoding (which would correct d/2 errors).

We now describe the elegant Generalized Minimum Distance (GMD) decoding algorithm of Forney [27], which accomplishes unambiguous decoding under slightly stronger assumptions. Forney considers the following issue. When decoding the inner codes on the various coordinates, one gets more information than just the fact that the outer symbol could have been y_i in the i-th coordinate. We actually get some reliability information, telling us that if y_i is the symbol, then a certain number of errors must have occurred, and if it is not, then a (larger) number of errors must have occurred. What if we simply throw out the coordinates where, in either case, a large number of errors have occurred? This should help the outer decoding process in principle. Forney makes this intuition precise using the notion of an errors-and-erasures decoding algorithm: an algorithm for which symbols are allowed to be erased, and an erasure only counts as half an error. This notion is formalized below.

Definition 3 For a code C1 ⊆ Γ^n1 of minimum distance d1, we say that an algorithm A is an unambiguous errors and erasures decoding algorithm if it takes as input y ∈ (Γ ∪ {?})^n1 (where ? is a special symbol not in Γ) and outputs a codeword c ∈ C1 such that 2t + s < d1, where s := |{i : yi = ?}| and t := |{i : ci ≠ yi}|, if such a codeword exists.

Forney's algorithm, using such an unambiguous decoder, is now the following (on input r ∈ Σ^{n1*n2}, viewed as blocks r1, ..., rn1 ∈ Σ^n2).

Stage 1 For i ∈ {1, ..., n1} do: Let yi be the element of Γ minimizing ei := Δ(E2(yi), ri), where Δ denotes Hamming distance.

Stage 2 For τ ∈ {0, ..., n2} do: Let y'i = yi if ei ≤ τ, and let y'i = ? otherwise. Use an unambiguous errors and erasures decoder to decode y'. If the resulting codeword c is close enough to r in Hamming distance, then output c, else continue.
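
The loop structure is simple enough to sketch in Python. This is only an illustration under the same assumptions as before (a hypothetical inner_codebook), plus two further assumed helpers: ee_decode_outer, an errors-and-erasures decoder for C1 that returns a codeword or None, and reencode, which maps an outer codeword to its n1 inner blocks so candidates can be compared with r. It is not Forney's actual implementation.

    ERASURE = None  # plays the role of the erasure symbol '?'

    def hamming(u, v):  # as in the earlier sketch
        return sum(a != b for a, b in zip(u, v))

    def gmd_decode(received_blocks, inner_codebook, ee_decode_outer, reencode, d):
        """Stage 1: decode each block and record its reliability e_i.
        Stage 2: for each erasure threshold tau, erase the blocks with
        e_i > tau, run the errors-and-erasures decoder, and accept a
        candidate whose encoding lies within half the distance d = d1*d2."""
        y, e = [], []
        for r in received_blocks:
            best = min(inner_codebook, key=lambda s: hamming(inner_codebook[s], r))
            y.append(best)
            e.append(hamming(inner_codebook[best], r))
        n2 = len(received_blocks[0])
        for tau in range(n2 + 1):
            y_prime = [yi if ei <= tau else ERASURE for yi, ei in zip(y, e)]
            c = ee_decode_outer(y_prime)
            if c is not None:
                dist = sum(hamming(b, r)
                           for b, r in zip(reencode(c), received_blocks))
                if 2 * dist < d:  # close enough: within half the distance
                    return c
        return None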

Forney analyzes the algorithm and shows, somewhat surprisingly, that it decodes up to half the minimum distance of the concatenated code. We conclude by observing that the algebraic decoding algorithm of the previous section naturally provides an unambiguous errors and erasures decoding algorithm, and so this assumption is a very reasonable one.

Before moving on, we make a small aside. Forney [27] actually used the algorithms above (the naive one, or the more sophisticated GMD) to achieve the Shannon capacity of, say, the binary symmetric channel with polynomial time algorithms. This is a significantly stronger result than the unambiguous decoder given above, since it needs to correct about twice as many errors as the number guaranteed above; however, in this case the errors are random. How does he take advantage of this fact? Forney notices that when errors occur at random with, say, probability p, then most of the inner blocks have at most a (p + ε)-fraction of errors. So the inner decoding actually gets rid of most errors. Hence the outer code and decoder only need to correct a small number of errors: not proportional to p, but rather a function of ε. So one does not lose much in the outer decoding, and the fact that the outer decoding algorithm corrects only half as many errors as the outer distance does not adversely affect the overall performance. Careful formalization of this intuition and analysis lead him to the algorithmic version of Shannon's coding theorem. (As should be evident, Forney's thesis [27] is another personal favorite; again, reading this monumental work is highly recommended. Some of the sketchy arguments above are also described in more detail in my notes on my webpage [90].)

3.3 List decoding

We now move beyond the scope of unambiguous decoding. The main motivation of this part is to solve the bounded distance decoding problem for e > d/2, where d is the minimum distance of the code we are given. However, thus far, we have only found algorithms that solve this problem by enumerating all codewords lying within a Hamming ball of radius e around the received vector, and thus solving the seemingly harder list-decoding problem.

The notion of list-decoding was introduced by Elias [25] and Wozencraft [99], who considered this as a weak form of recovery from errors in the Shannon model. In this weak form, error recovery was considered successful if the decoding algorithm output a small set of codewords that included the transmitted message. This notion led to tighter analysis of notions such as the error probability and the error exponent of a channel. It may also be noted that the Elias-Bassalygo upper bound (on the rate of a code given its minimum distance) revolves around the combinatorial notion of list-decoding. Thus list-decoding was evidently a useful notion in both the Shannon-based and the Hamming-based theories of error-correcting codes.

However, till recently no non-trivial algorithmic results were achieved in this direction. A work of Goldreich and Levin [35] re-exposed the importance of this concept to theoretical computer science (they exploited this notion to get hardcore predicates for one-way functions). They also construct a very non-trivial list-decoder for the Hadamard code; unfortunately this result is non-trivial only in a non-standard model of computation, and we won't get further into a description of their result. The first result for list-decoding in the standard models is due to this author [88]. This work built upon previous work of Ar, Lipton, Rubinfeld and Sudan [1] to get a list-decoding algorithm for Reed-Solomon codes. Since this algorithm was discovered, it has been extended to many classes of codes, improved significantly in terms of efficiency and the number of errors it corrects, and used to design list-decoding algorithms for other families of codes. Below we describe this algorithm and some of the subsequent work.

3.3.1 List-decoding of Reed Solomon codes

The problem we are interested in is the following: Given vectors α, y ∈ F_q^n, find all polynomials of degree less than k that agree with the point pairs (αi, yi) for many values of i. The unambiguous decoding algorithm for this problem can be explained in terms of the following idea: Suppose the polynomial being encoded is M. First construct an algebraic explanation for the error points, by constructing a polynomial E(x) such that E(αi) = 0 if there is an error at location i. Next notice that for every coordinate i, the following condition holds: yi = M(αi) "or" E(αi) = 0. The logical "or" can be arithmetized using simple multiplication, and we get, for every coordinate i, E(αi) * (yi − M(αi)) = 0. Thus the unambiguous decoding algorithm actually finds an algebraic relation in the (x, y)-plane which is satisfied by all the points (αi, yi), and then factors the algebraic relation to extract the polynomial M.

The algorithm of [1, 88] implements this idea and works in two phases: (1) Find a non-zero bivariate polynomial Q(x, y) of small total degree such that Q(αi, yi) = 0 for all i ∈ {1, ..., n}. (2) Factor the polynomial Q(x, y) into irreducible factors and report every polynomial M of degree less than k such that y − M(x) is a factor of Q(x, y) and yi = M(αi) for sufficiently many coordinates i ∈ {1, ..., n}.
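
The first phase is concrete enough to spell out. The following Python sketch (our illustration over a prime field GF(p), not the actual implementation of [1, 88]) builds the homogeneous linear system in the coefficients of Q and extracts a non-trivial solution by Gaussian elimination; the second, factoring phase is omitted here, since it is delegated to a general bivariate factorization routine [39, 52, 56].

    def interpolation_matrix(points, D, p):
        """Rows of the homogeneous system Q(alpha_i, y_i) = 0 over GF(p),
        where Q ranges over bivariate polynomials of total degree at most D.
        Columns are indexed by the monomials x^a y^b with a + b <= D."""
        monomials = [(a, b) for a in range(D + 1) for b in range(D + 1 - a)]
        rows = [[pow(x, a, p) * pow(y, b, p) % p for (a, b) in monomials]
                for (x, y) in points]
        return rows, monomials

    def kernel_vector(rows, ncols, p):
        """Gauss-Jordan elimination mod a prime p; returns a non-zero v with
        A v = 0. Such a v exists whenever ncols > len(rows), i.e., when there
        are more monomials than interpolation constraints."""
        A = [row[:] for row in rows]
        pivots, r = {}, 0
        for c in range(ncols):
            pr = next((i for i in range(r, len(A)) if A[i][c]), None)
            if pr is None:
                continue                      # no pivot in this column
            A[r], A[pr] = A[pr], A[r]
            inv = pow(A[r][c], p - 2, p)      # inverse via Fermat's little theorem
            A[r] = [x * inv % p for x in A[r]]
            for i in range(len(A)):
                if i != r and A[i][c]:
                    A[i] = [(x - A[i][c] * z) % p for x, z in zip(A[i], A[r])]
            pivots[c] = r
            r += 1
        free = next(c for c in range(ncols) if c not in pivots)
        v = [0] * ncols
        v[free] = 1                           # set one free variable to 1 ...
        for c, row in pivots.items():
            v[c] = -A[row][free] % p          # ... and solve for the pivot variables
        return v

Taking D to be about 2√n guarantees more columns than rows, so a non-zero Q always exists; the solution vector, read against the list of monomials, gives the coefficients of the polynomial Q sought in phase (1).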

The first thing to notice about the above algorithm is that once the parameters are fixed, it can be implemented in polynomial time. The first step just amounts to finding a non-trivial solution to a system of homogeneous linear equations. The second step is non-trivial, but fortunately solved thanks to the (independent) works of Grigoriev [39], Kaltofen [52], and Lenstra [56], who gave polynomial time algorithms to factor multivariate polynomials. (Of course, the seminal work of Berlekamp [10] lies at the core of all these results.)

Now we think about correctness. Ar et al. [1] use a standard result from algebra (Bezout's theorem in the plane) to note that if Q is a polynomial of total degree D, M has degree less than k, and Q(a, b) = 0 for more than Dk pairs (a, b) with b = M(a), then y − M(x) must divide (and hence be an irreducible factor of) Q(x, y). So the question reduces to the following: How small can we make the degree of Q? It turns out we can assume D is at most 2√n, as pointed out in [88]. This is seen easily via a counting argument: a bivariate polynomial of total degree 2√n has more than n coefficients, while the conditions Q(αi, yi) = 0 give only n homogeneous linear constraints. Thus a non-trivial solution does exist to this linear system. Putting the above together, we find that the algorithm described above gives a Reed-Solomon decoding algorithm that can recover the polynomial M provided the number of agreements (non-error locations) is at least 2k√n!
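
Spelled out, the count is as follows (our arithmetic, matching the argument just given): the number of monomials x^a y^b with a + b ≤ D is

\[
  \binom{D+2}{2} \;=\; \frac{(D+1)(D+2)}{2} \;>\; \frac{D^2}{2},
\]

which for D = 2√n exceeds 2n > n. So the n homogeneous conditions Q(αi, yi) = 0 involve more unknowns than equations, and a non-trivial solution exists.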

Is this an improvement? Not necessarily! In fact, for some choices of code parameters the above algorithm, as analyzed above, is even worse than the unambiguous decoding algorithm. But that is just because we have been sloppy in our analysis. If we are careful and pick the right set of coefficients of Q to be non-zero, then at the very least we don't do worse than the unambiguous decoding algorithm (in particular, interpreted properly, this algorithm is a strict generalization of the Welch-Berlekamp algorithm). But for other choices of code parameters this algorithm does significantly better. To see such a choice, consider a case where k = n^{1/3}. In such a case, the number of errors the algorithm can decode from is n − O(n^{5/6}) = n − o(n), while the unambiguous decoder could not allow more than n/2 errors. So in this kind of scenario, this algorithm can recover from errors even when the amount of error is far larger than the true information. A careful analysis in [89] shows that this list-decoding algorithm can do better than the unambiguous decoding algorithms provided k < n/3. Even better results are known from subsequent work, and we describe some of these in the sequel.
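
For the record, here is the arithmetic behind this example (our computation, using the agreement bound 2k√n derived above):

\[
  2k\sqrt{n} \;=\; 2\, n^{1/3} \cdot n^{1/2} \;=\; 2\, n^{5/6} \;=\; o(n),
\]

so the algorithm needs only about 2n^{5/6} agreements and hence tolerates n − 2n^{5/6} = n − o(n) errors, while unambiguous decoding can never correct n/2 or more.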

3.3.2 Further results in list-decoding

List-decoding in algebraic settings. Since the discovery of the list-decoding algorithm for the Reed-Solomon codes, list-decoding algorithms for a number of other families of codes have been discovered. Many of these work in the spirit of the algorithm described above. The first such algorithm was for algebraic-geometry codes, due to Shokrollahi and Wasserman [82]. Later Goldreich, Ron, and Sudan [36] gave an algorithm decoding a number-theoretic analog of the Reed-Solomon code that they call the Chinese Remainder Theorem (CRT) code. (These codes are studied in the coding theory literature under the name Redundant Residue Number System (RRNS) codes [64].) Finally, Guruswami, Sahai, and Sudan [44] showed how to unify the above codes and algorithms using an "ideal-theoretic framework" and how the above-mentioned algorithms can be expressed as special cases of this ideal-theoretic algorithm. (Unfortunately, we are not yet aware of any interesting ideal-theoretic codes other than the ones already mentioned!)

New paradigms for list-decoding. The above results show how to adapt a paradigm for a decoding algorithm to work for different classes of codes. A second strain of results shows how the Reed-Solomon decoding algorithm can be used as a subroutine to construct list-decoding algorithms for other families of codes. These results are highly non-trivial in that the resulting algorithms (and sometimes also the codes) are very distinct from the algorithm described above and potentially generate new paradigms in decoding algorithms. One such result, due to Arora and Sudan [3] with improvements by Sudan, Trevisan and Vadhan [91], produces a list-decoding algorithm for Reed-Muller codes. This result is a culmination of a long series of works on the "random self-reducibility" of multivariate polynomials [8, 31, 32, 26]. These results are typically stated and studied in very different contexts; however, they are essentially just randomized decoding algorithms for Reed-Muller codes, whose efficiency can be highlighted even further if one looks into the non-standard models of computing used in these papers. A second such result is a new and exciting construction of codes based on "extractors", due to Ta-Shma and Zuckerman [93], whose constructions lead to new codes with interesting "soft-decision decoding" capabilities. Unfortunately, we won't be able to discuss the terms "extractors" or "soft-decision decoding" in this article, so we won't dwell on this result. Nevertheless this result gives one of only four distinctive flavors of list-decoding algorithms, and thus is important for future study. Finally, Guruswami and Indyk [41] construct several families of codes with efficient list-decoding algorithms. An example of the many results given there is a binary code of positive rate which is list-decodable up to nearly a 1/2 fraction of errors in nearly linear time. Formally, for every ε, γ > 0, they show there exists an R > 0 and an infinite family of codes of rate R which is list-decodable to within a (1/2 − ε) fraction of errors in time O(n^{1+γ}). (Previously it was known how to achieve such error-correction capability in polynomial time [91, 43].)

Improvements to error-correction capability. As noted above, the decoding algorithm for the Reed-Solomon codes only improved upon the error-correction capability for codes of rate less than 1/3. (The case of other codes gets even worse!) Focusing on this aspect, Guruswami and Sudan [42] proposed an extension to the Reed-Solomon decoding algorithm that overcame this limitation. This algorithm has the feature that it can decode Reed-Solomon codes for the largest error bound e for which it is known that a ball of radius e contains only polynomially many codewords (see the discussion leading to the Elias-Bassalygo bound). Thus any improvement to this algorithm would require new combinatorial insights about Reed-Solomon codes. [42] gives similar results also for algebraic-geometry codes, where the best results are due to Kötter and Vardy [55]. Similar improvements were also obtained for the case of CRT codes by Boneh [16] and Guruswami, Sahai, and Sudan [44].

Improving the running time. Most of the decoding algorithms described above (with the exception of the result by Guruswami and Indyk [41]) run in polynomial time, but not a particularly small polynomial! Fortunately, a fair amount of effort has been expended in the coding theory literature on making these running times more respectable. For instance, [70, 79] show how to solve the interpolation problem defined in the first stage of the list-decoding algorithm for Reed-Solomon codes (as well as the variant that comes up in the case of algebraic-geometry codes) in O(n^2) time, when the output list size is a constant. Similar running times are also known for the root-finding problem (which suffices for the second step in the algorithms above) [4, 29, 67, 70, 79, 100]. Together these algorithms further enhance the practical potential of list-decoding (at the very least, they turn the focus away from algorithmic issues and move it towards combinatorial ones).

The wealth of results revolving around list-decoding is growing even as we write this article, shedding further light on its combinatorial significance, algorithmic capability, and use in theoretical computer science. A good starting point for further reading on this topic may be the thesis of Guruswami [40].

4 Conclusions

In summary, we hope to have introduced coding theory to a wide audience and convinced them that this field contains a wealth of results, while also offering the potential for new research problems. In particular, some central algorithmic questions of coding theory, both in the Shannon sense and in the Hamming sense, are open today, and theoretical computer scientists can contribute (and are contributing). Readers seeking further material are encouraged to check out the website of the author [90]. More stable sources of information include the classical text of MacWilliams and Sloane [63], the concise text of van Lint on algebraic coding theory [59], the out-of-print, but highly recommended, book by Blahut [15], which is an excellent source for some of the algorithmic works, and the highly detailed (and not-so-handy) handbook of coding theory [49].

Acknowledgments

Many thanks to Venkatesan Guruswami for manyconversations, and pointers to the early history.

References

[1] Sigal Ar, Richard Lipton, Ronitt Rubinfeld, and Madhu Sudan. Reconstructing algebraic functions from mixed data. SIAM Journal on Computing, 28(2):488–511, 1999.

[2] Sanjeev Arora, Laszlo Babai, Jacques Stern, and Elizabeth Z. Sweedyk. The hardness of approximating problems defined by linear constraints. Journal of Computer and System Sciences, 54(2):317–331, April 1997.

[3] Sanjeev Arora and Madhu Sudan. Improved low-degree testing and its applications. Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, pages 485–495, 1997.

[4] Daniel Augot and Lancelot Pecquet. A Hensel lifting to replace factorization in list decoding of algebraic-geometric and Reed-Solomon codes. IEEE Transactions on Information Theory, 46:2605–2613, November 2000.

[5] Alexander Barg. Complexity issues in coding theory. In [49, Chapter 7].

[6] Alexander Barg and Gilles Zémor. Linear-time decodable, capacity achieving binary codes with exponentially falling error probability. IEEE Transactions on Information Theory, 2001.

[7] L. A. Bassalygo. New upper bounds for error correcting codes. Problems of Information Transmission, 1(1):32–35, 1965.

[8] D. Beaver and J. Feigenbaum. Hiding instances in multioracle queries. Proceedings of the 7th Symposium on Theoretical Aspects of Computer Science, Lecture Notes in Computer Science v. 415, pages 37–48, February 1990.

[9] Elwyn R. Berlekamp. Algebraic Coding Theory. McGraw Hill, New York, 1968.

[10] Elwyn R. Berlekamp. Factoring polynomials over large finite fields. Mathematics of Computation, 24:713–735, 1970.

[11] Elwyn R. Berlekamp. Key papers in the development of coding theory. IEEE Press, New York, 1974.

[12] Elwyn R. Berlekamp. Bounded distance +1 soft-decision Reed-Solomon decoding. IEEE Transactions on Information Theory, 42(3):704–720, 1996.

[13] Elwyn R. Berlekamp, Robert J. McEliece, and Henk C. A. van Tilborg. On the inherent intractability of certain coding problems (Corresp.). IEEE Transactions on Information Theory, 24(3):384–386, May 1978.

[14] Richard E. Blahut. Principles and Practice of Information Theory. Addison-Wesley, Reading, Massachusetts, 1987.

[15] Richard E. Blahut. Theory and Practice of Error Control Codes. Addison-Wesley, Reading, Massachusetts, 1983.

[16] Dan Boneh. Finding smooth integers in short intervals using CRT decoding. Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pages 265–272, 2000.

[17] R. C. Bose and D. K. Ray-Chaudhuri. On a class of error correcting binary group codes. Information and Control, 3:68–79, 1960.

[18] T. M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley Publishing, New York, 1991.

[19] Ph. Delsarte. An algebraic approach to the association schemes of coding theory. Philips Research Reports, Suppl. 10, 1973.

[20] Irit Dinur, Guy Kindler, and Shmuel Safra. Approximating-CVP to within almost-polynomial factors is NP-hard. Proceedings of the 39th Annual IEEE Symposium on Foundations of Computer Science, 1998.

[21] Rod G. Downey, Michael R. Fellows, Alexander Vardy, and Geoff Whittle. The parametrized complexity of some fundamental problems in coding theory. SIAM Journal on Computing, 29(2):545–570, 1999.

[22] Ilya I. Dumer. Two algorithms for the decoding of linear codes. Problems of Information Transmission, 25(1):24–32, 1989.

[23] Ilya Dumer, Daniele Micciancio, and Madhu Sudan. Hardness of approximating the minimum distance of a linear code. Proceedings of the 40th Annual Symposium on Foundations of Computer Science, pages 475–484, New York City, New York, 17–19 October 1999.

[24] Iwan M. Duursma. Decoding codes from curves and cyclic codes. Ph.D. Thesis, Eindhoven, 1993.

[25] Peter Elias. List decoding for noisy channels. Technical Report 335, Research Laboratory of Electronics, MIT, 1957.

[26] Uriel Feige and Carsten Lund. On the hardness of computing the permanent of random matrices. Computational Complexity, 6(2):101–132, 1997.

[27] G. David Forney. Concatenated Codes. MIT Press, Cambridge, MA, 1966.


[28] Robert G. Gallager. Low Density Parity Check Codes. MIT Press, Cambridge, Massachusetts, 1963.

[29] Shuhong Gao and M. Amin Shokrollahi. Computing roots of polynomials over function fields of curves. Coding Theory and Cryptography: From Enigma and Geheimschreiber to Quantum Theory (D. Joyner, Ed.), Springer, pages 214–228, 2000.

[30] Arnaldo Garcia and Henning Stichtenoth. A tower of Artin-Schreier extensions of function fields attaining the Drinfeld-Vlădut bound. Inventiones Mathematicae, 121:211–222, 1995.

[31] Peter Gemmell, Richard Lipton, Ronitt Rubinfeld, Madhu Sudan, and Avi Wigderson. Self-testing/correcting for polynomials and for approximate functions. Proceedings of the Twenty-Third Annual ACM Symposium on Theory of Computing, pages 32–42, New Orleans, Louisiana, 6–8 May 1991.

[32] Peter Gemmell and Madhu Sudan. Highly resilient correctors for multivariate polynomials. Information Processing Letters, 43(4):169–174, 1992.

[33] E. N. Gilbert. A comparison of signalling alphabets. Bell System Technical Journal, 31:504–522, May 1952.

[34] Marcel J. E. Golay. Notes on digital coding. Proceedings of the IRE, v. 37, page 657, June 1949.

[35] Oded Goldreich and Leonid Levin. A hard-core predicate for all one-way functions. Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pages 25–32, May 1989.

[36] Oded Goldreich, Dana Ron, and Madhu Sudan. Chinese remaindering with errors. IEEE Transactions on Information Theory, 46(5):1330–1338, July 2000. (Fuller version available as TR98-062 (Revision 4), Electronic Colloquium on Computational Complexity, http://www.eccc.uni-trier.de/eccc.)

[37] V. D. Goppa. Codes on algebraic curves. Soviet Math. Doklady, 24:170–172, 1981.

[38] Daniel Gorenstein and Neal Zierler. A class of error-correcting codes in p^m symbols. Journal of the Society for Industrial and Applied Mathematics, 9:207–214, June 1961.

[39] Dima Grigoriev. Factorization of polynomials over a finite field and the solution of systems of algebraic equations. Translated from Zapiski Nauchnykh Seminarov Leningradskogo Otdeleniya Matematicheskogo Instituta im. V. A. Steklova AN SSSR, 137:20–79, 1984.

[40] Venkatesan Guruswami. List Decoding of Error-Correcting Codes. Ph.D. Thesis, Massachusetts Institute of Technology, 2001.

[41] Venkatesan Guruswami and Piotr Indyk. Expander-based constructions of efficiently decodable codes. Proceedings of the 42nd Annual Symposium on Foundations of Computer Science, to appear, October 2001.

[42] Venkatesan Guruswami and Madhu Sudan. Improved decoding of Reed-Solomon and algebraic-geometric codes. IEEE Transactions on Information Theory, 45:1757–1767, 1999.

[43] Venkatesan Guruswami and Madhu Sudan. List decoding algorithms for certain concatenated codes. Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pages 181–190, 2000.

[44] Venkatesan Guruswami, Amit Sahai, and Madhu Sudan. Soft-decision decoding of Chinese Remainder codes. Proceedings of the 41st IEEE Symposium on Foundations of Computer Science, pages 159–168, 2000.

[45] Richard W. Hamming. Error detecting and error correcting codes. Bell System Technical Journal, 29:147–160, April 1950.

[46] H. J. Helgert. Alternant codes. Information and Control, 26:369–380, 1974.

[47] A. Hocquenghem. Codes correcteurs d'erreurs. Chiffres (Paris), 2:147–156, 1959.

[48] Tom Høholdt, Jacobus H. van Lint, and Ruud Pellikaan. Algebraic geometry codes. In [49].

[49] C. Huffman and V. Pless. Handbook of Coding Theory, volumes 1 & 2. Elsevier Sciences, 1998.

[50] S. M. Johnson. A new upper bound for error-correcting codes. IEEE Transactions on Information Theory, 8:203–207, 1962.

[51] Jørn Justesen. A class of constructive asymptotically good algebraic codes. IEEE Transactions on Information Theory, 18:652–656, 1972.


[52] Erich Kaltofen. Polynomial-time reductions from multivariate to bi- and univariate integral polynomial factorization. SIAM Journal on Computing, 14(2):469–489, 1985.

[53] G. L. Katsman, Michael Tsfasman, and Serge Vlădut. Modular curves and codes with a polynomial construction. IEEE Transactions on Information Theory, 30:353–355, 1984.

[54] Ralf Kötter. A unified description of an error locating procedure for linear codes. Proceedings of Algebraic and Combinatorial Coding Theory, Voneshta Voda, Bulgaria, 1992.

[55] Ralf Kötter and Alexander Vardy. Algebraic soft-decision decoding of Reed-Solomon codes. Proceedings of the 38th Annual Allerton Conference on Communication, Control and Computing, pages 625–635, October 2000.

[56] Arjen K. Lenstra. Factoring multivariate polynomials over finite fields. Journal of Computer and System Sciences, 30(2):235–248, April 1985.

[57] V. I. Levenshtein. Universal bounds for codes and designs. In [49, Chapter 6], 1998.

[58] Jacobus H. van Lint. Nonexistence theorems for perfect error-correcting codes. Proceedings of the Symposium on Computers in Algebra and Number Theory, New York, 1970, pages 89–95, G. Birkhoff and M. Hall Jr. (Eds.), SIAM-AMS Proceedings vol. IV, American Mathematical Society, Providence, RI, 1971.

[59] Jacobus H. van Lint. Introduction to Coding Theory. Graduate Texts in Mathematics 86 (Third Edition), Springer-Verlag, Berlin, 1999.

[60] M. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. Spielman. Analysis of low density codes and improved designs using irregular graphs. STOC 1998.

[61] M. Luby, M. Mitzenmacher, M. A. Shokrollahi, D. Spielman, and V. Stemann. Practical loss-resilient codes. Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, pages 150–159, 1997.

[62] Jessie MacWilliams. A theorem on the distribution of weights in a systematic code. Bell System Technical Journal, 42:79–94, January 1963.

[63] F. J. MacWilliams and Neil J. A. Sloane. The Theory of Error-Correcting Codes. Elsevier/North-Holland, Amsterdam, 1981.

[64] D. M. Mandelbaum. On a class of arithmetic codes and a decoding algorithm. IEEE Transactions on Information Theory, 21:85–88, 1976.

[65] J. L. Massey. Threshold Decoding. MIT Press, Cambridge, 1963.

[66] J. L. Massey. Shift-register synthesis and BCH decoding. IEEE Transactions on Information Theory, 15:122–127, 1969.

[67] R. Matsumoto. On the second step in the Guruswami-Sudan list decoding algorithm for AG-codes. Technical Report of IEICE, pages 65–70, 1999.

[68] Robert J. McEliece, E. R. Rodemich, H. C. Rumsey, and L. R. Welch. New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities. IEEE Transactions on Information Theory, 23:157–166, 1977.

[69] D. E. Muller. Application of Boolean algebra to switching circuit design and to error detection. IRE Transactions on Electronic Computation, vol. EC-3, pages 6–12, September 1954.

[70] Rasmus R. Nielsen and Tom Høholdt. Decoding Hermitian codes with Sudan's algorithm. Proceedings of AAECC-13, LNCS 1719, pages 260–270, 1999.

[71] Vadim Olshevsky and M. Amin Shokrollahi. A displacement structure approach to efficient list decoding of algebraic geometric codes. Proceedings of the 31st Annual ACM Symposium on Theory of Computing, 1999.

[72] Ruud Pellikaan. On decoding linear codes by error correcting pairs. Eindhoven University of Technology, preprint, 1988.

[73] W. W. Peterson. Encoding and error-correction procedures for Bose-Chaudhuri codes. IRE Transactions on Information Theory, 6:459–470, 1960.

[74] M. Plotkin. Binary codes with specified minimum distance. IRE Transactions on Information Theory, IT-6:445–450, 1960.

[75] Irving S. Reed. A class of multiple-error-correcting codes and the decoding scheme. IRE Transactions on Information Theory, vol. PGIT-4, pages 38–49, September 1954.

[76] I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. J. SIAM, 8:300–304, 1960.


[77] Tom Richardson and Rudiger Urbanke. The capacity of low-density parity-check codes under message-passing decoding. IEEE Transactions on Information Theory, March 2001.

[78] Tom Richardson, Amin Shokrollahi, and Rudiger Urbanke. Design of capacity-approaching irregular low-density parity-check codes. IEEE Transactions on Information Theory, March 2001.

[79] Ronny Roth and Gitit Ruckenstein. Efficient decoding of Reed-Solomon codes beyond half the minimum distance. IEEE Transactions on Information Theory, 46(1):246–257, January 2000.

[80] Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 623–656, 1948.

[81] Claude E. Shannon, Robert G. Gallager, and Elwyn R. Berlekamp. Lower bounds to error probability for coding on discrete memoryless channels. Information and Control, 10:65–103 (Part I), 522–552 (Part II), 1967.

[82] M. Amin Shokrollahi and Hal Wasserman. List decoding of algebraic-geometric codes. IEEE Transactions on Information Theory, 45(2):432–437, 1999.

[83] V. M. Sidelnikov. Decoding Reed-Solomon codes beyond (d−1)/2 and zeros of multivariate polynomials. Problems of Information Transmission, 30(1):44–59, 1994.

[84] Michael Sipser and Daniel Spielman. Expander codes. IEEE Transactions on Information Theory, 42(6):1710–1722, 1996.

[85] David Slepian. A class of binary signalling alphabets. Bell System Technical Journal, 35:203–234, January 1956.

[86] Daniel Spielman. Linear-time encodable and decodable error-correcting codes. IEEE Transactions on Information Theory, 42(6):1723–1732, 1996.

[87] Henning Stichtenoth. Algebraic Function Fields and Codes. Springer-Verlag, Berlin, 1993.

[88] Madhu Sudan. Decoding of Reed-Solomon codes beyond the error-correction bound. Journal of Complexity, 13(1):180–193, 1997.

[89] Madhu Sudan. Decoding of Reed-Solomon codes beyond the error-correction diameter. Proceedings of the 35th Annual Allerton Conference on Communication, Control and Computing, 1997.

[90] Madhu Sudan. A Crash Course in Coding Theory. Slides available from http://theory.lcs.mit.edu/~madhu, November 2000.

[91] Madhu Sudan, Luca Trevisan, and Salil Vadhan. Pseudorandom generators without the XOR lemma. Journal of Computer and System Sciences, 62(2):236–266, March 2001.

[92] R. Michael Tanner. A recursive approach to low complexity codes. IEEE Transactions on Information Theory, 27:533–547, September 1981.

[93] Amnon Ta-Shma and David Zuckerman. Extractor codes. Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, pages 193–199, July 2001.

[94] Aimo Tietavainen. On the nonexistence of perfect codes over finite fields. SIAM Journal of Applied Mathematics, 24(1):88–96, January 1973.

[95] Michael A. Tsfasman, Serge G. Vlădut, and Thomas Zink. Modular curves, Shimura curves, and codes better than the Varshamov-Gilbert bound. Math. Nachrichten, 109:21–28, 1982.

[96] Alexander Vardy. Algorithmic complexity in coding theory and the minimum distance problem. STOC, pages 92–109, 1997.

[97] R. R. Varshamov. Estimate of the number of signals in error correcting codes. Dokl. Akad. Nauk SSSR, 117:739–741, 1957.

[98] Lloyd Welch and Elwyn R. Berlekamp. Error correction of algebraic block codes. US Patent Number 4,633,470, December 1986.

[99] J. M. Wozencraft. List decoding. Quarterly Progress Report, Research Laboratory of Electronics, MIT, 48:90–95, 1958.

[100] Xin-Wen Wu and P. H. Siegel. Efficient list decoding of algebraic geometric codes beyond the error correction bound. Proceedings of the International Symposium on Information Theory, June 2000.
