List Decoding Product and Interleaved Codes
Prasad Raghavendra
Joint work with Parikshit Gopalan & Venkatesan Guruswami
Error-correcting codes
• Encoding E : Σ^k → Σ^n
• Code C = { E(m) : m ∈ Σ^k }
• Rate R of the code = k/n
• (Relative) distance of the code = δ if every two codewords differ in ≥ δn positions
Noise model:
– Any subset of up to a fraction p of the symbols can be arbitrarily corrupted by the channel
[Figure: message m is encoded to the codeword E(m) and sent over the noisy channel]
What fraction of errors can various codes correct?
Decoding radii
If error fraction p < δ/2, the correct codeword is the unique closest codeword.
When p > δ/2, unambiguous recovery is not always possible.
Pessimistic limit: even for p ≫ δ/2, many error patterns have a unique close-by codeword.
Relax the decoding model to list decoding: “Allow a small list of possible answers”
List Decoding [Elias’57, Wozencraft’58, Goldreich-Levin’89]
List decoding C ⊆ Σ^n up to error fraction p:
Given: a received word r ∈ Σ^n
Output: a list of all codewords c ∈ C s.t. HammingDist(r, c) ≤ e = pn
[Figure: Hamming ball of radius e around the received word r]
1. Combinatorics:
• Code must guarantee that the list is small (say, bounded by a constant independent of n) for every r
2. Algorithmics:
• There are exp(n) many codewords; want to find the close-by codewords in poly(n) time
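To make the definition concrete, here is a minimal brute-force list decoder in Python; the helper names (`encode`, `hamming_dist`) are illustrative, not from the talk. It returns exactly the list in the definition, but by enumerating all q^k messages, which is precisely the exponential cost that an efficient list decoding algorithm must avoid.

```python
from itertools import product

def hamming_dist(a, b):
    """Number of positions where two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def brute_force_list_decode(encode, q, k, received, p):
    """Return all codewords within error fraction p of `received`.

    `encode` maps a message (a tuple over range(q)) to a codeword.
    Runs in time q^k: fine for toy parameters, exponential in general,
    which is why nontrivial list decoding algorithms are needed.
    """
    n = len(received)
    radius = p * n
    out = []
    for msg in product(range(q), repeat=k):
        c = encode(msg)
        if hamming_dist(c, received) <= radius:
            out.append(c)
    return out
```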
List Decoding Radius LDR(C): the largest error rate for which the code can be (efficiently) list decoded
List decoding radius
• Natural question: Given a code, what is its list decoding radius?
– combinatorial, algorithmic
• Johnson radius: LDR(C) ≥ J(δ) = 1 − (1 − δ)^{1/2}
– (1 − (1 − 2δ)^{1/2})/2 for binary codes
– Johnson radius always ≥ δ/2
– generic combinatorial bound as a function of δ alone
• Johnson bound tight in some cases, but specific codes may have higher LDR
– Random codes in fact have LDR ≈ δ w.h.p.
– Determining the LDR of even well-understood codes is often very difficult
List decoding radii: previous salient results
• Hadamard (first order Reed-Muller) codes: LDR = 1 - 1/q [Goldreich-Levin; Goldreich-Rubinfeld-Sudan]
• Reed-Solomon (RS) codes: Efficient algorithm to list decode up to Johnson radius [Sudan; G.-Sudan] (Exact LDR open!)
• Binary Reed-Muller codes (of any order r): LDR = 2^{-r} [Gopalan-Klivans-Zuckerman]
• “Correlated” RS codes: Efficient algorithm to list decode beyond J(δ) (for δ close to 1) [Parvaresh-Vardy]
• Folded RS codes: LDR = 1 − R (optimal) [G.-Rudra]
– Some algebraic-geometric extensions [G.-Patthak’06] [G.’09]
• Group homomorphisms: LDR determined [Dinur-Grigorescu-Kopparty-Sudan]
All results give an algorithm to decode up to the stated bound on LDR.
Our work
• Efficient list decoding only known for specific well-structured codes
– Mostly algebraic codes (exceptions: [G.-Indyk’03], [Trevisan’03])
• Our work: General list decoding algorithms for a broad class of codes
– Abstract look at two well-known “product” operations to combine codes:
1. Tensoring 2. Interleaving
– Algorithmic solutions (and combinatorial bounds) for the associated list decoding problems
Tensor Products
Given a linear code C ⊆ F_q^n, the (tensor) product code C × C ⊆ F_q^{n × n} is:
• C × C = { n × n matrices such that each row and each column is a codeword of C }
Parameters:
– Blocklength(C × C) = Blocklength(C)^2
– Rate(C × C) = Rate(C)^2
– δ(C × C) = δ(C)^2
– Alphabet size q remains the same
– Parameters in general worse, but can get longer codes
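The row/column condition translates directly into a membership test. A minimal sketch, assuming a membership oracle `in_C` for the base code (an illustrative interface, not from the talk):

```python
def in_tensor_product(M, in_C):
    """Check whether the n x n matrix M (a list of rows) lies in C x C.

    `in_C` is a membership oracle for the base code C.
    M is in C x C iff every row and every column is a codeword of C.
    """
    rows_ok = all(in_C(list(row)) for row in M)
    cols_ok = all(in_C(list(col)) for col in zip(*M))
    return rows_ok and cols_ok
```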
Tensoring of Reed-Solomon Codes
Basis for the Reed-Solomon code C: the monomials { 1, x, x^2, x^3, …, x^{k-1} }
Basis for the tensor product C × C: the monomials { 1, x, y, xy, x^2·y, …, x^i·y^j, …, x^{k-1}·y^{k-1} }
C × C = { evaluations of bivariate polynomials P(x, y) with the degree of each variable bounded by k−1 }
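A sketch of this view, assuming a prime field of size q (the coefficient layout is illustrative): a codeword of C × C is the full evaluation table of a bivariate polynomial with individual degrees at most k−1, so every row and column of the table is a univariate Reed-Solomon codeword.

```python
def rs_tensor_codeword(coeffs, q):
    """Evaluation table of P(x, y) = sum coeffs[i][j] * x^i * y^j over F_q.

    coeffs is a k x k coefficient matrix (individual degrees <= k-1); q is prime.
    Entry (a, b) of the returned q x q table is P(a, b) mod q.
    """
    k = len(coeffs)
    def P(a, b):
        return sum(coeffs[i][j] * pow(a, i, q) * pow(b, j, q)
                   for i in range(k) for j in range(k)) % q
    return [[P(a, b) for b in range(q)] for a in range(q)]
```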
"Applications"
(Tensor) product codes arise in many contexts:
• Explicit codes for the BSC: Product of Hamming codes [Elias]
• Hardness of approximation [Dumer-Micciancio-Sudan]
• Construction of Locally Testable Codes [BenSasson-Sudan], [Meir]
• Local testability of tensor products has received a lot of attention: [BenSasson-Sudan], [G.-Rudra], [Dinur-Sudan-Wigderson], [Valiant], [BenSasson-Viderman]
Our Result: Tensor Products
For every code C, LDR(C × C) = LDR(C) · δ(C)
• Using a list decoding algorithm for C as a black box, we get a list decoding algorithm for C × C
• list size = 2^{O((log L)^2)} if the list size for C is L.
Corollary: For repeated tensoring, LDR(C^{×m}) = LDR(C) · δ(C)^{m-1}
Tensor products are always decodable beyond the Johnson radius!
– δ^{m-1} · J(δ) ≫ J(δ^m)
Corollary: New list decoding algorithms for multivariate polynomial codes when the degree of each variable is bounded.
• even decoding these up to the Johnson radius was open
The ratio LDR/distance stays the same under tensoring.
So far: List Decoding Tensor Products: Definition and Results
Next: Interleaved Product
Interleaving
Bursty errors: a few codewords incur large errors.
Solution: Interleave the symbols from different codewords while transmitting. Bursty errors get uniformly distributed across the different codewords.
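A minimal sketch of the round-robin transmission order (function names are illustrative): the m codewords are sent column by column, so a burst of consecutive channel errors touches each individual codeword in only a few positions.

```python
def interleave(codewords):
    """Transmit m codewords (rows of equal length) column by column.

    A burst hitting consecutive transmitted symbols now corrupts
    at most a few positions of each individual codeword.
    """
    return [sym for column in zip(*codewords) for sym in column]

def deinterleave(stream, m):
    """Invert `interleave`: recover the m rows from the symbol stream."""
    return [stream[i::m] for i in range(m)]
```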
Interleaved Product: Definition
For a code C, a codeword of C^{•m} (the m-wise interleaving of C) is given by stacking m codewords c1, c2, …, cm of C as the rows of an m × n matrix; each column is transmitted as one symbol.
Parameters:
• Rate, distance, and block length of C^{•m} are the same as those of C
• Alphabet of C^{•m} = [q]^m, where the alphabet of C = [q]
Reed-Solomon Codeword
Evaluation of a degree-k polynomial P at n points: (P(1), P(2), …, P(n)).
A degree-k curve P(t) in the one-dimensional space [q].
Interleaving in Practice: 3-interleaved Reed-Solomon Codeword
Evaluation of degree-k polynomials (P1, P2, P3) at n points; column i of the 3 × n matrix is (P1(i), P2(i), P3(i)).
A degree-k curve (P1(t), P2(t), P3(t)) in the three-dimensional space [q]^3.
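A sketch of this layout over a prime field F_q (the parameter names are illustrative): row j holds the evaluations of the j-th polynomial, so column i is the single [q]^m-symbol (P1(i), …, Pm(i)).

```python
def interleaved_rs_codeword(poly_coeff_lists, points, q):
    """m-interleaved Reed-Solomon codeword over a prime field F_q.

    poly_coeff_lists: m coefficient lists, one per polynomial.
    Row j holds the evaluations of the j-th polynomial at `points`;
    column i is one alphabet-[q]^m symbol (P1(i), ..., Pm(i)).
    """
    def ev(coeffs, t):
        return sum(c * pow(t, e, q) for e, c in enumerate(coeffs)) % q
    return [[ev(coeffs, t) for t in points] for coeffs in poly_coeff_lists]
```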
List decoding interleaved codes
Question: LDR(C) = p → LDR(C^{•m}) = ?
Naive Approach:
Received word: rows R1, R2, R3, with a p fraction of the columns erroneous; hence each row Ri clearly has at most a p fraction of errors.
List decode each row separately:
L1 = {A1, A2, …, At}
L2 = {B1, B2, …, Bt}
L3 = {C1, C2, …, Ct}
Try all possibilities in L1 × L2 × L3.
Drawback: List size and running time grow exponentially in the number m of interleavings.
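A sketch of this naive decoder, assuming a row list decoder `list_decode(row, p)` for C (an illustrative interface); the final cross-product step is what blows up exponentially in m.

```python
from itertools import product

def naive_interleaved_decode(R, list_decode, p):
    """Naively list decode an m-interleaved received word R (list of rows).

    Decodes each row independently, then tries every combination of
    candidates: list size and running time grow like L^m.
    """
    n = len(R[0])
    row_lists = [list_decode(row, p) for row in R]     # one list per row
    decodings = []
    for combo in product(*row_lists):                  # up to L^m combinations
        # keep the combination if at most a p fraction of columns disagree
        bad_cols = sum(any(combo[j][i] != R[j][i] for j in range(len(R)))
                       for i in range(n))
        if bad_cols <= p * n:
            decodings.append(combo)
    return decodings
```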
Our Result: Interleaved Codes
For every code C, LDR(C^{•m}) = LDR(C)
Specifically, we design a list decoding algorithm and bound the list size of C^{•m}.
The list size is independent of the number of interleavings m!
Related work
• Interleaved Hadamard codes
– Decoding linear transformations [Dinur-Grigorescu-Kopparty-Sudan], [Dwork-Shaltiel-Smith-Trevisan]
• Interleaved Reed-Solomon codes
– Decoding under random noise (for cryptanalysis) [Coppersmith-Sudan], [Bleichenbacher-Kiayias-Yung]
• Parvaresh-Vardy codes & folded Reed-Solomon codes [G.-Rudra] are carefully chosen subcodes of interleaved RS codes
Other results: Better list size bounds
• Codewords of the interleaved code C^{•m} and the tensor product C × C are naturally viewed as matrices
• If the rank of a codeword of C^{•m} is r, its Hamming weight equals the r-th generalized Hamming weight δ_r(C)
– For binary codes and large r, δ_r(C) ≈ 2·δ(C)
• “Deletion” argument: list size bound dominated by the number of close-by low-rank codewords
– Bound the low-rank list size by combinatorial techniques
– Leads to better list size bounds for binary linear codes
• E.g., for decoding linear transformations over F_q, we get a tight list size bound for fixed q
– list size O_q(1/ε^2) for error fraction (1 − 1/q − ε)
Rest of talk
• List decoding interleaved codes
• Sketch of (tensor) product list decoding
• GHW and decoding linear transformations
• Summary
Interleaved codes (recall)
For a code C, a codeword of C^{•m} (the m-wise interleaving of C) is given by stacking m codewords c1, c2, …, cm of C as rows; each column is one symbol.
Naive Approach Revisited
Received rows R1, R2, R3 (Ri = Ri[1] Ri[2] … Ri[n]).
Decode row by row, building a tree: the root branches into the candidates A1, A2, …, At for row 1; each of these branches into candidates B1, …, Bt for row 2; and so on (C1, …, Ct for row 3).
Each leaf is a candidate decoding.
Running time and list size depend on the number of leaves.
Two simple ideas
Working on the same tree of candidate decodings:
Idea 1: Erase columns containing an erroneous symbol - if the codeword chosen for a row is the correct one, then every column where it differs from the received row is erroneous, so erase those columns for the remaining rows.
Idea 2: Prune branches with more than a p fraction of errors.
Branching → Erasures
Notation, for a node v of the tree:
e(v) = fraction of columns erased up to v
δ(v) = δ − e(v) = distance of the punctured code with the erased positions removed
We decode at v up to radius p − e(v).
Suppose v branches into the codewords A1, A2, …, At.
Observation 1: At most one of the codewords A1, A2, …, At is at distance < δ(v)/2. For all but one Ai:
e(Ai) ≥ e(v) + δ(v)/2 = (δ + e(v))/2
Observation 2: Even the nearest codeword has to be δ − p away (if t > 1). For the nearest codeword Ai:
e(Ai) ≥ e(v) + (δ − p)
So at a branching node v, along all children wi the fraction of erasures increases by one of the following two rules:
e(wi) ≥ (δ + e(v))/2  OR  e(wi) ≥ e(v) + (δ − p)
Erasures → Low depth
At a branching node v with children w1, w2, …, wt, the erasure fraction jumps from e to at least min( (e + δ)/2, e + δ − p ), moving it towards δ.
In about δ/(δ − p) branching steps, the fraction of erasures reaches p.
• From then on, unique decoding from a p < δ fraction of erasures (no error budget remains)
List size bound
• On every root-to-leaf path there are at most δ/(δ − p) branching nodes.
• Every branching node has at most L children, where L = list size for C.
• Number of leaves ≤ L^{δ/(δ − p)}
– Careful bound: 2^{δ/(δ − p)} · L^{log(δ/(δ − p))}
• The proof is algorithmic
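A sketch of this algorithmic proof, assuming a base decoder `list_decode_with_erasures(row, radius)` that handles erased positions (marked None) — a hypothetical interface, not the paper's exact pseudocode. It decodes the rows one at a time, erases the columns where the chosen row codeword disagrees with the received row (Idea 1), and prunes whenever the error/erasure budget p is exhausted (Idea 2).

```python
def decode_interleaved(R, list_decode_with_erasures, p):
    """List decode an m-interleaved word by the erase-and-prune tree search.

    R: list of m received rows of length n (symbols, None = erased).
    Returns all candidate decodings (tuples of row codewords).
    """
    n = len(R[0])
    results = []

    def recurse(rows_left, erased, prefix):
        if not rows_left:
            results.append(tuple(prefix))
            return
        if len(erased) > p * n:            # Idea 2: prune heavy branches
            return
        row = [None if i in erased else s
               for i, s in enumerate(rows_left[0])]
        # decode the current row up to the remaining error budget
        for cand in list_decode_with_erasures(row, p * n - len(erased)):
            # Idea 1: columns where cand disagrees must be erroneous -> erase
            new_erased = erased | {i for i in range(n)
                                   if row[i] is not None and cand[i] != row[i]}
            recurse(rows_left[1:], new_erased, prefix + [cand])

    recurse(R, set(), [])
    return results
```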
Rest of talk
• List decoding interleaved codes
• Sketch of (tensor) product list decoding
• GHW and decoding linear transformations
• Summary
Unique decoding product codes (δ = δ(C), so δ(C × C) = δ^2)
Decode along rows, then along columns:
• Decode rows up to radius δ/2 (at most a δ/2 fraction of rows decoded wrong)
• Decode columns up to radius δ/2
Decoding up to radius δ^2/4.
List decoding product codes
List decode along rows, then columns, then do it again!
Goal: Decode up to radius pδ − ε, where p = LDR(C). Pick a random subset S of rows and T of columns.
Phase 1: Given correct advice (the correct codeword's values on the random S × T):
List decode to radius p along the rows in S, and use the advice to disambiguate.
B := the value on the rows that didn't fail; this is the advice for the next phase.
Outcome: all but a small fraction of the rows of S are correct; at most an ε fraction are wrong; the rest fail.
Phase 2: Given the noisy advice B:
List decode the received word to radius p along all the columns.
Use the advice B to disambiguate: pick the codeword that differs from B in only a small fraction of the rows of B.
D := the advice for the next phase.
Averaging + code distance + sampling ⇒ correct decoding on all but a small fraction of the columns, with at most an ε fraction of mistakes.
Phase 3: D is “noisy but sound”:
Row-wise, a unique codeword (the correct one) differs from D in only a small (< δ/2) fraction of positions.
List decode the received word to radius p along the rows, and use the advice D to disambiguate.
E := advice with only erasures (rows that fail are marked as erased).
All but a small fraction of the rows are correctly decoded; the rest fail.
Phase 4: Given the codeword with erasures E:
Unique decode each column of E from the (< δn) erasures - easy for every linear code.
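Phase 4's only primitive is erasure decoding of a linear code, which is just solving a linear system. A sketch over F_2 with a hypothetical generator-matrix interface: with fewer than δn erasures the known coordinates determine the message uniquely.

```python
import numpy as np

def erasure_decode_f2(G, word):
    """Unique-decode a word of a binary linear code from erasures only.

    G: k x n generator matrix over F_2 (numpy array of 0/1 ints).
    word: length-n list with None at erased positions, 0/1 elsewhere.
    Solves m @ G = word on the known coordinates by Gaussian elimination;
    with fewer than (distance)*n erasures the solution m is unique.
    """
    known = [i for i, s in enumerate(word) if s is not None]
    A = G[:, known].T % 2                       # equations: A @ m = b over F_2
    b = np.array([word[i] for i in known])
    k = G.shape[0]
    Ab = np.concatenate([A, b[:, None]], axis=1).astype(np.uint8)
    row = 0
    for col in range(k):                        # Gaussian elimination mod 2
        piv = next((r for r in range(row, len(Ab)) if Ab[r, col]), None)
        if piv is None:
            continue
        Ab[[row, piv]] = Ab[[piv, row]]
        for r in range(len(Ab)):
            if r != row and Ab[r, col]:
                Ab[r] ^= Ab[row]
        row += 1
    m = np.zeros(k, dtype=np.uint8)             # read off the solution
    r = 0
    for col in range(k):
        if r < len(Ab) and Ab[r, col]:
            m[col] = Ab[r, k]
            r += 1
    return (m @ G) % 2                          # the unique consistent codeword
```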
Rest of talk
• List decoding interleaved codes
• Sketch of (tensor) product list decoding
• GHW and decoding linear transformations
• Summary
(Binary) Linear Transformation Code
A linear transformation L ∈ {0,1}^{m × k} with rows a1, a2, …, am is encoded by the m × 2^k matrix whose columns, indexed by x ∈ {0,1}^k, are (a1·x, a2·x, …, am·x).
Equivalently: the rows are the Hadamard codewords Had(a1), …, Had(am), so this is the m-wise interleaved Hadamard code.
Given a received matrix R, how many codewords differ from R in at most a (1/2 − ε) fraction of the columns?
– For Hadamard codes (each row), the list size is O(1/ε^2).
– Our general algorithm gives the upper bound (1/ε)^{O(log(1/ε))}.
• Consider the rank of the codeword.
• If rank > 1, the Hamming weight is ≥ 3/4 (the GHW connection).
• Johnson radius J(3/4) = 1/2.
• By the “GKZ deletion argument,” list size ≤ O(1/ε) · L^{(1)}(1/2 − ε), where L^{(1)}(1/2 − ε) = # of close-by rank-1 codewords.
– [Dinur-Grigorescu-Kopparty-Sudan]: 1/ε^C for an absolute constant C.
– We get the tight O(1/ε^2) bound. (Also min{ c_q/ε^2, C/ε^4 } over F_q.)
Rank-1 list size
• A rank-1 codeword has a simple form:
– half the columns are some vector v, half are 0 (e.g., columns 0 0 0 v v v)
• Each candidate v must occur with frequency ≥ ε in R
– so there are t ≤ 1/ε candidates for v
• Reduces to t Hadamard decodings to radius (1/2 − ε)
– Naively, list size t/ε^2 = 1/ε^3
– Using erasures in the decoding, list size O(1/ε^2)
• More complex argument: rank-2 list size L^{(2)}(1/2 − ε) ≤ O(1/ε^2)
• Deletion argument: overall list size ≤ C0 · L^{(2)}(1/2 − ε) for an absolute constant C0
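A sketch of the first step of this argument (the function name is illustrative): scan the received matrix for columns that occur with frequency at least ε; by the two-valued structure of rank-1 codewords, these are the only viable candidates for v, so at most 1/ε candidates survive.

```python
from collections import Counter

def rank1_candidates(R, eps):
    """Candidate vectors v for close-by rank-1 codewords.

    A rank-1 codeword of the interleaved Hadamard code has columns
    equal to v or 0 (half each), so to be (1/2 - eps)-close to R,
    v must appear in roughly an eps fraction of R's columns.
    Hence at most 1/eps candidates are returned.
    """
    n = len(R[0])
    counts = Counter(tuple(col) for col in zip(*R))
    return [v for v, c in counts.items()
            if c >= eps * n and any(v)]     # skip the all-zero column
```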
Summary & Open questions
• Studied the effect of two product operations on list decoding:
– Tensoring: LDR(C^{×m}) = LDR(C) · δ(C)^{m-1}
– Interleaving: LDR(C^{•m}) = LDR(C), irrespective of m
• Both decoding radii are tight, but our list size bounds could be improved
– Lower bounds on list size?
– List size C/ε^2 for linear transformations over F_q?
– Construction of list-decodable codes via repeated tensoring?
• Combinatorial folded codes?
• LDR of Reed-Solomon codes? Non-binary (degree ≥ 3) Reed-Muller codes?
– LDR = 1 − 2/q for the quadratic case [Gopalan’09]