Codes and Pseudorandomness: A Survey

David Zuckerman

University of Texas at Austin

Randomness and Computing

• Randomness is extremely useful in computing.
– Randomized algorithms
– Monte Carlo simulations
– Cryptography
– Distributed computing

• Problem: high-quality randomness expensive.

What is minimal randomness requirement?

• Can we eliminate randomness completely?
• If not:
– Can we minimize quantity of randomness?
– Can we minimize quality of randomness?
• What does this mean?

What is minimal randomness requirement?

• Can we eliminate randomness completely?
• If not:
– Can we minimize quantity of randomness?
• Pseudorandom generator
– Can we minimize quality of randomness?
• Randomness extractor

Outline

• PRGs and Codes– Intro to PRGs.– Various connections.

• Extractors and Codes– Intro to Extractors.– Connections with list decoding.– Non-Malleable Codes and Extractors.

• Conclusions

Pseudorandom Numbers

• Computers rely on pseudorandom generators:

[Figure: PRG maps a short random string (71294) to a long “random-enough” string (141592653589793238).]

What does “random enough” mean?

Modern Approach to PRGs [Blum-Micali 1982, Yao 1982]

[Figure: algorithm Alg run on random input and on pseudorandom input shows ≈ same behavior.]

Require PRG to “fool” all efficient algorithms.

Which efficient algorithms?

• Poly-time PRG fooling all polynomial-time circuits implies NP≠P.

• So either:
– Make unproven assumption.
– Try to fool interesting subclasses of algorithms.

Existence vs. Explicit Construction

• Most functions are excellent PRGs.
– Challenge: find explicit one.
• Most codes have excellent properties.
– Known: good explicit codes.

• Can codes give good PRGs?

Idea 1: PRG = Random Codeword

• Choose a random codeword in an [n,k] code.
– n random variables, |sample space| = 2^k.
• But: a linear code can’t fool all linear tests.
• t-wise independence:
– Dual distance > t ⇒ any t coordinates are independent as r.v.’s, since any t columns of G are linearly independent.
– t=2: Hadamard code, |Ω| = n+1 [Lancaster 1965]
– t odd: dual BCH code, |Ω| = 2(n+1)^((t−1)/2) [ABI ’86]
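The t=2 case can be checked by brute force. The sketch below is illustrative only: it builds the sample space from a small Hadamard-type code (seed s in {0,1}^r, one output bit ⟨a,s⟩ for each nonzero a) and tests pairwise independence by enumeration. The function names and the choice r=3 are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations, product

def hadamard_sample_space(r):
    """Each seed s in {0,1}^r yields the n = 2^r - 1 bits X_a = <a, s> mod 2, a != 0."""
    points = [a for a in product((0, 1), repeat=r) if any(a)]
    space = []
    for s in product((0, 1), repeat=r):
        space.append(tuple(sum(ai * si for ai, si in zip(a, s)) % 2 for a in points))
    return space

def is_pairwise_independent(space):
    """Check that every pair of coordinates is uniform on {0,1}^2."""
    n = len(space[0])
    for i, j in combinations(range(n), 2):
        for bits in product((0, 1), repeat=2):
            hits = sum(1 for w in space if (w[i], w[j]) == bits)
            if Fraction(hits, len(space)) != Fraction(1, 4):
                return False
    return True

space = hadamard_sample_space(3)
# n = 7 random variables from a sample space of size n + 1 = 8
print(len(space[0]), len(space), is_pairwise_independent(space))
```

Any two distinct nonzero vectors a, b are linearly independent over F_2, which is why each pair of output bits is uniform.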

Idea 2: PRG=Random Column of G

• k random variables, |sample space| = n.
• If dual distance > 4, then Ω is a Sidon set:
– All pairwise sums distinct.
• Dual BCH code: |Ω| = 2^((k−1)/2).
– Don’t need row of 1’s; Ω = {(x, x^3) | x in F_(2^(k/2))}.

[Figure: generator matrix G with k rows; its columns form the sample space Ω.]
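The Sidon property of Ω = {(x, x^3)} can be verified directly for a small field. This is a minimal sketch assuming GF(2^4) represented modulo x^4 + x + 1; the field size and helper names are choices made for the example, not taken from the slides.

```python
from itertools import combinations

M = 4              # work in GF(2^4), so each point (x, x^3) has k = 2*M = 8 bits
MODULUS = 0b10011  # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Carry-less multiplication in GF(2^M), reduced modulo MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= MODULUS
    return result

def cube(x):
    return gf_mul(x, gf_mul(x, x))

omega = [(x, cube(x)) for x in range(1 << M)]

def is_sidon(points):
    """True if the sums of all unordered pairs of distinct points are distinct."""
    sums = set()
    for (a1, a2), (b1, b2) in combinations(points, 2):
        s = (a1 ^ b1, a2 ^ b2)  # addition in characteristic 2 is XOR
        if s in sums:
            return False
        sums.add(s)
    return True

print(len(omega), is_sidon(omega))
```

The check succeeds because s = x+y and t = x^3+y^3 determine xy = t/s + s^2, and hence the unordered pair {x, y}.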

PRG=Random Column of G

• Say all codewords ≠ 0 have relative wt ½±ε.
• ⊕_(i∈S) X_i corresponds to a codeword, S ≠ ∅.
• ½−ε ≤ Pr[⊕_(i∈S) X_i = 1] ≤ ½+ε: ε-biased space [NN ’90]
• RS concat Hadamard: |Ω| = O(k^2/ε^2) [AGHP ’90]
• AG concat Hadamard: |Ω| = O(k/ε^3)
– Degree < genus: |Ω| = O((k/ε^2)^(5/4)) [BT 2009]
• Optimal, non-explicit: O(k/ε^2). Ignored logs.

[Figure: generator matrix G with rows X_1, …, X_k; its columns form the sample space Ω.]
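The bias of a sample space can be computed by enumerating all nonempty S. The sketch below assumes a tiny example, the columns of the generator matrix whose columns are all nonzero vectors of {0,1}^3; the function name `max_bias` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def max_bias(omega, k):
    """Largest |Pr[XOR_{i in S} X_i = 1] - 1/2| over nonempty S, X uniform on omega."""
    worst = Fraction(0)
    for mask in product((0, 1), repeat=k):
        if not any(mask):
            continue  # S must be nonempty
        ones = sum(1 for x in omega
                   if sum(m * xi for m, xi in zip(mask, x)) % 2 == 1)
        worst = max(worst, abs(Fraction(ones, len(omega)) - Fraction(1, 2)))
    return worst

k = 3
columns = [c for c in product((0, 1), repeat=k) if any(c)]
print(max_bias(columns, k))  # 1/14
```

Here every nonzero codeword has weight 4 out of 7, so the relative weight is 4/7 and the bias is 4/7 − 1/2 = 1/14.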

PRGs from Hard Functions [NW ‘88]

• PRGs ⇒ lower bounds.
• Nisan-Wigderson: lower bounds ⇒ PRGs.
• Suppose f is hard on average.

[Figure: f is applied to the seed restricted to subsets S_i and S_j.]

Design (wt w code): |S_i| = w

Worst-Case ⇒ Average-Case Hardness [L, BF]

• Given: worst-case hard f: {0,1}^w → {0,1}.
• Encode f using an RM code as g: F_q^w → F_q.
– g = unique multilinear function s.t. g = f on {0,1}^w.
• g is avg-case hard.
– Efficiently compute g on 1−1/(4m) fraction ⇒ efficiently compute g everywhere whp.
– Pick random line L with L(0) = x.
– Degree(g(L(·))) ≤ w.
– Interpolate g(L(0)) from g(L(1)),…,g(L(w+1)).
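The line-interpolation step can be sketched as follows. This is a toy illustration with assumed parameters (prime field of size 101, w = 3) and an error-free oracle; in the setting of the slide the oracle errs on a small fraction of inputs, and the argument works because each queried point L(t), t ≠ 0, is uniformly distributed.

```python
import random
from itertools import product

P = 101  # prime field size q; must exceed w + 1
W = 3    # number of variables w

def multilinear_extension(f_table, point):
    """Evaluate the unique multilinear g with g = f on {0,1}^W at a point of F_P^W."""
    total = 0
    for corner in product((0, 1), repeat=W):
        term = f_table[corner]
        for c, z in zip(corner, point):
            term = term * (z if c else (1 - z)) % P
        total = (total + term) % P
    return total

def interpolate_at_zero(values):
    """Given h(1), ..., h(m) for a polynomial h of degree < m, return h(0) mod P."""
    m = len(values)
    total = 0
    for j in range(1, m + 1):
        num, den = 1, 1
        for i in range(1, m + 1):
            if i != j:
                num = num * (0 - i) % P
                den = den * (j - i) % P
        total = (total + values[j - 1] * num * pow(den, -1, P)) % P
    return total

def self_correct(oracle, x, rng):
    """Recover g(x) from oracle values at L(1), ..., L(W+1) on a random line L."""
    direction = [rng.randrange(P) for _ in range(W)]
    def line(t):
        return tuple((xi + t * di) % P for xi, di in zip(x, direction))
    return interpolate_at_zero([oracle(line(t)) for t in range(1, W + 2)])

rng = random.Random(0)
f_table = {c: rng.randrange(2) for c in product((0, 1), repeat=W)}
g = lambda pt: multilinear_extension(f_table, pt)
x = (5, 17, 42)
print(self_correct(g, x, rng) == g(x))  # True
```

The modular inverse `pow(den, -1, P)` requires Python 3.8 or later.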

Local Decodability

• Compute any bit of message whp by querying at most r bits of encoding.
– RM codes.
– New family: Matching Vector Codes [Yekhanin, Efremenko, …]. Beats RM codes.
• Stronger notion: Local correctability.
– Can compute any bit of encoding.
– RM codes.

List Decoding [Elias 1957]

• Output list of all close codewords.
• Can sometimes decode beyond distance/2.
• Efficient algorithms for:
– Hadamard [Goldreich-Levin]
– Reed-Solomon [Sudan, Guruswami-Sudan]
– AG codes [Shokrollahi-Wasserman, GS]
– Reed-Muller [large q: STV, q=2: GKZ]
– Certain concatenated codes [GS, STV, GI, BM]
– PV codes [Parvaresh-Vardy]
– Folded RS [Guruswami-Rudra]
– Multiplicity codes [Kopparty, Guruswami-Wang]

PRGs and Hardcore Bits [Goldreich-Levin 1989, Impagliazzo 1997]

• Given one-way function f:
– Easy to compute, hard to invert.
– E.g., f(x) = g^x mod p (g generator, p prime).
• Goal: computing bit b(x) hard given f(x).
• Thm: Suppose C is (locally) list decodable. Then b(x,i) = C(x)_i is hard given f(x), i.
• Pf idea: Suppose easy. List decode few candidates for x. Check if f(candidate) = f(x).

PRG from Hardcore Bits

• Given one-way permutation f, hardcore b.
• PRG(x,i) = b(x,i), b(f(x),i), b(f^2(x),i), …
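The iteration PRG(x,i) = b(x,i), b(f(x),i), … can be sketched with toy parameters. Everything concrete here is an assumption for illustration: p = 23 and g = 5 are far too small to be one-way, and the hardcore bit is taken to be the Hadamard-code bit ⟨x,r⟩ mod 2 on the binary representations, in the spirit of Goldreich-Levin.

```python
P = 23  # toy prime, NOT cryptographically meaningful
G = 5   # a generator of the multiplicative group mod 23

def f(x):
    """Toy candidate permutation on {1, ..., P-1}: x -> G^x mod P."""
    return pow(G, x, P)

def b(x, r):
    """Hadamard-code bit C(x)_r: inner product mod 2 of the bits of x and r."""
    return bin(x & r).count("1") % 2

def prg(x, r, length):
    """Output b(x,r), b(f(x),r), b(f^2(x),r), ... for the given number of bits."""
    out = []
    for _ in range(length):
        out.append(b(x, r))
        x = f(x)
    return out

print(prg(7, 0b10110, 10))
```

Because G generates the group, f permutes {1, …, P−1}, which is what the construction needs from f structurally; the hardness assumptions are of course not met at this size.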

List Decoding Related to Randomness Extractors

General Weak Random Source [Z ‘90]

• Random variable X on {0,1}^n.
• General model: min-entropy H∞(X) = min_x log(1/Pr[X = x]) ≥ k.
• Flat source:
– Uniform on A ⊆ {0,1}^n, |A| ≥ 2^k.
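As a small illustration of the definition, the sketch below computes min-entropy (base-2 logarithm) for a distribution given as a dictionary; the representation and function names are assumptions made for the example.

```python
import math

def min_entropy(dist):
    """H_inf(X) = -log2(max_x Pr[X = x]) for a dict mapping outcomes to probabilities."""
    return -math.log2(max(dist.values()))

def flat_source(support):
    """Uniform distribution on the given support set A."""
    support = list(support)
    return {x: 1 / len(support) for x in support}

print(min_entropy(flat_source(range(8))))         # 3.0
print(min_entropy({0: 0.5, 1: 0.25, 2: 0.25}))    # 1.0
```

A flat source on a set of size 2^k has min-entropy exactly k, while the second example shows that only the heaviest outcome matters.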

General Weak Random Source [Z ‘90]

• Can arise in different ways:
– Physical source of randomness.
– Cryptography: condition on adversary’s information, e.g. bounded storage model.
– Pseudorandom generators (for space s machines): condition on TM configuration.

Goal: Extract Randomness

[Figure: Ext maps n bits from the weak source to m bits, with statistical error ε.]

Problem: Impossible, even for k=n-1, m=1, ε<1/2.

Impossibility Proof

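• Suppose f: {0,1}^n → {0,1} satisfies: ∀ sources X with H∞(X) ≥ n−1, f(X) ≈ U.

[Figure: {0,1}^n partitioned into f^(−1)(0) and f^(−1)(1).]

• One of f^(−1)(0), f^(−1)(1) has size ≥ 2^(n−1); say f^(−1)(0). Take X uniform on f^(−1)(0): then H∞(X) ≥ n−1, but f(X) is constant.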
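The argument is constructive, as the following sketch shows: for any candidate f it returns a flat source of min-entropy at least n−1 on which f is constant. Parity is used only as an example of f; the helper name `bad_source` is hypothetical.

```python
from itertools import product

def bad_source(f, n):
    """Return a set A with |A| >= 2^(n-1) on which f: {0,1}^n -> {0,1} is constant."""
    preimages = {0: [], 1: []}
    for x in product((0, 1), repeat=n):
        preimages[f(x)].append(x)
    # The larger preimage has at least half of all 2^n points.
    return max(preimages.values(), key=len)

parity = lambda x: sum(x) % 2
A = bad_source(parity, 4)
print(len(A), {parity(x) for x in A})
```

The same routine works for any f, which is why no deterministic one-source extractor exists even for k = n−1 and m = 1.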

Randomness Extractor: short seed [Nisan-Z ’93, …, Guruswami-Umans-Vadhan ’07]

[Figure: Ext takes n bits from the weak source plus a d = O(log(n/ε))-bit random seed Y and outputs m = .99k bits, with statistical error ε.]

Strong extractor: (Ext(X,Y),Y) ≈ Uniform

Graph-Theoretic View: “Expansion”

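[Figure: bipartite graph with N = 2^n left vertices of degree D = 2^d and M = 2^m right vertices; x is joined to Ext(x,y) for each seed y. Any K = 2^k left vertices have at least (1−ε)M neighbors, and the output is close to uniform.]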

Alternate View

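[Figure: the same bipartite graph (N = 2^n left vertices, degree D = 2^d, M = 2^m right vertices) with a set S on the right and the set BAD_S of left vertices x on the left.]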

Extractor Codes via Alt-View [Ta-Shma-Z 2001]

• List recovery – generalizes list decoding.
– S = (S_1,…,S_D), agreement = |{i | x_i in S_i}|
– |{Codewords with agreement ≥ (μ(S) + ε)D}| ≤ |BAD_S|.
• Can construct extractor codes with efficient decoding.
• Give hardcore bits Ext(x,y) wrt 1-way (f(x),y).

Leftover Hash Lemma and Johnson Bound

• Johnson bound: An [n, k, (½−ε^2)n]-code has < L = 1/ε^2 codewords within relative distance ½−ε of received word r. Alt pf [TZ ’01]:
• Let V = close codewords, D = distribution of (i, v_i) for uniform i in [n], v in V. |D−U| ≥ ε: the event (i, v_i) = (i, r_i) has probability ≥ ½+ε under D but ½ under U.
• If |V| ≥ L, with d = (½−ε^2)n the distance:
– collision-pr(D) < (1/n)(1/|V| + 1 − d/n) ≤ (1+4ε^2)/(2n)
• Implies |D−U| < ε. Contradiction.
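The collision-probability step can be written out as follows. This is a reconstruction of the calculation under the stated parameters (distance d = (½−ε²)n, |V| ≥ 1/ε², universe [n] × {0,1} of size 2n); the final inequality is the standard bound on statistical distance in terms of collision probability.

```latex
\begin{align*}
\mathrm{cp}(D)
  &= \Pr[i = i'] \cdot \Pr_{i,\,v,\,v'}[v_i = v'_i] \\
  &= \frac{1}{n}\Bigl(\Pr[v = v'] + \Pr[v \ne v'] \cdot \Pr[v_i = v'_i \mid v \ne v']\Bigr) \\
  &< \frac{1}{n}\Bigl(\frac{1}{|V|} + 1 - \frac{d}{n}\Bigr)
   \le \frac{1}{n}\Bigl(\varepsilon^2 + \frac12 + \varepsilon^2\Bigr)
   = \frac{1 + 4\varepsilon^2}{2n}, \\[4pt]
|D - U|
  &\le \frac12 \sqrt{2n \cdot \mathrm{cp}(D) - 1}
   < \frac12 \sqrt{4\varepsilon^2} = \varepsilon .
\end{align*}
```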

Codes ⇒ Extractors

• PRGs + Codes ⇒ Extractor [Trevisan 1999]
• RM Codes ⇒ Extractor [Ta-Shma, Z, Safra 2001; Shaltiel, Umans 2001]
• Parvaresh-Vardy Codes ⇒ Extractor [Guruswami, Umans, Vadhan 2007]

2-Stage Extractor

[Figure: Condense: weak source + O(log n) random bits → source of entropy rate .9. Extract: rate-.9 source + O(log n) random bits → uniform.]

Parvaresh-Vardy Codes ⇒ Condenser [Guruswami-Umans-Vadhan 2007]

• F_q finite field
• parameter h ≤ q
• deg. n polynomial E(Y) irreducible over F_q
– source: degree n−1 univariate polynomial f
– define f_i(Y) = f^(h^i)(Y) mod E(Y)

C(f, y in F_q) = (y, f_0(y), f_1(y), f_2(y), …, f_(m−1)(y))
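The map C(f, y) can be made concrete with tiny parameters. The values below (q = 5, E(Y) = Y^2 + 2, h = 2, m = 2) are assumptions chosen only so the arithmetic is easy to follow; they are not meaningful condenser parameters.

```python
Q = 5          # field size q (prime, so arithmetic is mod Q)
E = [2, 0, 1]  # E(Y) = Y^2 + 2, monic and irreducible over F_5; degree n = 2
H = 2          # parameter h <= q
M_OUT = 2      # number of evaluations f_0, ..., f_{m-1}

def poly_mulmod(a, b):
    """Multiply coefficient lists a, b (lowest degree first) modulo E(Y) and Q."""
    n = len(E) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % Q
    # Cancel each high-degree term by subtracting a multiple of the monic E.
    for deg in range(len(prod) - 1, n - 1, -1):
        lead = prod[deg]
        for t in range(n + 1):
            prod[deg - n + t] = (prod[deg - n + t] - lead * E[t]) % Q
    return (prod + [0] * n)[:n]

def poly_powmod(f, e):
    """Compute f^e mod E(Y) by repeated multiplication (fine for toy sizes)."""
    result = [1] + [0] * (len(E) - 2)
    for _ in range(e):
        result = poly_mulmod(result, f)
    return result

def evaluate(f, y):
    return sum(c * pow(y, i, Q) for i, c in enumerate(f)) % Q

def condense(f, y):
    """C(f, y) = (y, f_0(y), ..., f_{m-1}(y)) with f_i = f^(h^i) mod E."""
    return (y,) + tuple(evaluate(poly_powmod(f, H ** i), y) for i in range(M_OUT))

print(condense([1, 1], 3))  # (3, 4, 0)
```

For f(Y) = 1 + Y, f^2 = 1 + 2Y + Y^2 reduces to 4 + 2Y modulo Y^2 + 2 over F_5, which gives the last coordinate 4 + 2·3 = 0 mod 5.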

Independent Sources

[Figure: Ext takes two independent sources of n/2 bits each and outputs m = Ω(k) bits, with statistical error ε.]

Bounds for 2 Independent Sources

• Classical: H∞ (X) > n/2.– Lindsey Lemma: inner product.

• Bourgain: H∞ (X) > .4999n.

• Existence: H∞ (X) > 2 log n.
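The classical inner-product bound can be checked numerically. In this sketch each source is a flat source on n-bit strings (so Lindsey's lemma applies with N = 2^n), and the sets A and B are drawn with a fixed seed; sizes and names are assumptions for the example.

```python
import math
import random

def inner_product_bit(x, y):
    """Ext(x, y) = <x, y> mod 2 for n-bit integers x and y."""
    return bin(x & y).count("1") % 2

def bias_on_flat_sources(A, B):
    """|Pr[Ext = 0] - Pr[Ext = 1]| for X uniform on A and Y uniform on B."""
    signed = sum(1 - 2 * inner_product_bit(a, b) for a in A for b in B)
    return abs(signed) / (len(A) * len(B))

def lindsey_bound(A, B, n):
    """Lindsey's lemma: the bias is at most sqrt(2^n / (|A| |B|))."""
    return math.sqrt(2 ** n / (len(A) * len(B)))

n = 8
rng = random.Random(1)
A = rng.sample(range(2 ** n), 2 ** 6)
B = rng.sample(range(2 ** n), 2 ** 6)
print(bias_on_flat_sources(A, B) <= lindsey_bound(A, B, n))  # True
```

With |A| = |B| = 2^6 and n = 8 the bound is 1/4; it becomes nontrivial exactly when the two min-entropies sum to more than n.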

Privacy Amplification With Active Adversary

• Problem: Active adversary could change Y to Y’.

[Figure: one party picks a seed Y and sends it over the public channel; shared secret = Ext(X,Y).]

Active Adversary

• Can arbitrarily insert, delete, modify, and reorder messages.

• E.g., can run several rounds with one party before resuming execution with other party.

Non-Malleable Extractor [Dodis-Wichs 2009]

• Strong extractor: (Ext(X,Y),Y) ≈ (U,Y).
• nmExt is a non-malleable extractor if for arbitrary A: {0,1}^d → {0,1}^d with y’ = A(y) ≠ y:
(nmExt(X,Y), nmExt(X,Y’), Y) ≈ (U, nmExt(X,Y’), Y)
• nmExt can’t ignore a bit of the seed.
• Existence: k > log log n + c, d = log n + O(1), m = (k − log d)/2.01.
• Gives privacy amplification with active adversary in 2 rounds with optimal entropy loss.

Explicit Non-Malleable Extractor

• Even k=n−1, m=1 nontrivial.
– E.g., Ext(x,y) = x·y. X = 0??…?, y’ = A(y) flips first bit ⇒ x·y’ = x·y.
• Dodis-Li-Wooley-Z 2011: H∞(X) > n/2.
• Cohen-Raz-Segev 2012: Seed length O(log n).
• Li 2012: H∞(X) > .499n.
– Connection with 2-source extractors.
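The counterexample for the inner-product extractor is easy to confirm exhaustively. The sketch assumes n = 4 and represents strings as bit tuples; the helper names are hypothetical.

```python
from itertools import product

def inner_product(x, y):
    return sum(a * b for a, b in zip(x, y)) % 2

def flip_first_bit(y):
    """The adversary's map A: y -> y' with y' != y."""
    return (1 - y[0],) + y[1:]

n = 4
# Source X = 0??...?: first bit fixed to 0, remaining n-1 bits free (k = n-1).
source = [(0,) + rest for rest in product((0, 1), repeat=n - 1)]
related = all(inner_product(x, y) == inner_product(x, flip_first_bit(y))
              for x in source for y in product((0, 1), repeat=n))
print(related)  # True
```

Since the output on the tampered seed always equals the output on the original seed, the pair of outputs is maximally correlated and the extractor is malleable on this source.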

A Simple 1-Bit Construction [Li]

• Sidon set: set S with all sums s+t, s ≠ t in S, distinct.
• Thm [Li]: f(x,y) = x·y, y uniform from S, is a non-malleable extractor for H∞(X) > n/2.
• Proof: H∞(Y) = n/2, so X·Y ≈ U (Lindsey’s lemma).
• Suffices to show X·Y + X·A(Y) ≈ U (XOR lemma).
• X·Y + X·A(Y) = X·(Y+A(Y)).
• H∞(Y+A(Y)) ≥ H∞(Y) − 1 = n/2 − 1, since y ↦ y+A(y) is at most 2-to-1 on a Sidon set.

Non-Malleable Codes [Dziembowski, Pietrzak, Wichs 2010]

• Adversary tampers with Enc(m) via f in F.
– Ideally Dec(f(Enc(m))) = m or “error”.
– Impossible if f(x) = Enc(m’) allowed.
• Dec(f(Enc(m))) = m or is independent of m.
– Randomized encoding allowed.
• Prob method: exist if |F| < 2^(2^(αn)), α<1.
• Explicit?
• Codes for f(x_1,…,x_n) = f_1(x_1),…,f_n(x_n).

Split-State Tampering

• f(x,y) = g(x),h(y), |x|=|y|=n/2.
• 2-source ext for H(X)+H(Y) > 2n/3 ⇒ codes for 1-bit messages [Dz, Kazana, Obremski 2013]
• Poly rate: n = k^(7+o(1)) via additive combinatorics [Aggarwal, Dodis, Lovett 2013].
• Constant rate if can construct non-malleable 2-source extractors for entropy rate .99 [Cheraghchi, Guruswami 2013].

Non-Malleable 2-Source Extractor [Cheraghchi, Guruswami 2013]

• X and Y independent weak sources.
• Think of H∞(X) = H∞(Y) = .99(n/2).
• For all A1, A2, x’ = A1(x) ≠ x, y’ = A2(y) ≠ y:
– (nmExt(X,Y), nmExt(X,Y’)) ≈ (U, nmExt(X,Y’))
– (nmExt(X,Y), nmExt(X’,Y)) ≈ (U, nmExt(X’,Y))
– (nmExt(X,Y), nmExt(X’,Y’)) ≈ (U, nmExt(X’,Y’))
• Open question: explicit construction.

Key Properties of Codes

• Dual distance ↔ k-wise independence, Sidon sets.
• Relative distance ≈ ½ ↔ small-bias spaces.
• Local decodability ↔ amplifying hardness of functions for PRGs, extractors.
• List decodability ↔ cryptographic PRGs, extractors.
• Non-malleability ↔ non-malleable 2-source extractors.

Open Questions

• Construct ε-biased spaces of size n = O(k/ε^2).
– [n = O(k/ε^2), k, (½−ε)n] codes.
• 2-source extractors for entropy rate α, any α>0.
• Non-malleable extractors for H∞(X) = αn.
• Non-malleable codes of constant rate.
– Non-malleable 2-source extractors.

• Other Applications & Connections.

Thank you!
