Approximate List-Decoding and
Hardness Amplification
Valentine Kabanets (SFU), joint work with
Russell Impagliazzo and Ragesh Jaiswal (UCSD)
Error-Correcting Codes, Randomness and Complexity
“Classical” complexity (pre-Randomness): P vs NP, P vs NL, …
“Modern” complexity (Randomness): cryptography, NP = PCP(log n, 1), BPP vs P, expanders/extractors/…
Use of classical error-correcting codes (Hadamard, Reed-Solomon, …)
Invention of new kinds of codes (locally testable, locally decodable, locally list-decodable, …)
Example: Derandomization
Idea: Replace truly random string with
computationally random (pseudorandom) string
Goal: Save on the number of random bits used
Example: Derandomization
Computational Randomness =
Computational Hardness
Hard-to-compute functions $\Rightarrow$ Pseudorandom Generators (PRG) $\Rightarrow$ derandomization (BPP = P)
PRG $\Rightarrow$ Hard-to-compute functions
Hardness of Boolean functions
worst-case hardness of f : every (resource-bounded) algorithm A computes f(x) incorrectly for at least one input x
average-case hardness of f : every (resource-bounded) algorithm A computes f(x) incorrectly for at least a $\delta$ fraction of inputs x
PRG requires average-case hard functions
Worst-Case to Average-Case
$f(x_1)\ f(x_2)\ f(x_3)\ \dots\ f(x_N)$, where $N = 2^n$
Error-Correcting Encoding
$g(x_1)\ g(x_2)\ g(x_3)\ \dots\ g(x_M)$, where $M = 2^{O(n)}$
Correcting Errors
$f(x_1)\ f(x_2)\ f(x_3)\ \dots\ f(x_N)$
Error-Correcting Decoding
$g(x_1)\ g(x_2)\ g(x_3)\ \dots\ g(x_M)$
If we can compute g on “many” inputs, then we can compute f on all inputs.
A Closer Look
Implicit Error-Correcting Decoding
If h(x) = g(x) for “many” x, and h is computable by a “small” circuit, then f is computable by a “small” circuit.
[Diagram labels: h, f, $h \approx g$]
Use Locally Decodable error-correcting codes!
List-Decodable Codes
Implicit Error-Correcting List-Decoding
If h(x) = g(x) for a $1/2 + \epsilon$ fraction of inputs, and h is computable by a “small” circuit, then f is computable by a “small” circuit.
[Diagram labels: h, f, $h \approx g$]
Use Locally List-Decodable error-correcting codes! [Sudan, Trevisan, Vadhan ’01] (algebraic polynomial-based codes)
Hardness Amplification
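Yao’s XOR Lemma: If $f : \{0,1\}^n \to \{0,1\}$ is $\delta$-hard for size s (i.e., any size-s circuit errs on at least a $\delta$ fraction of inputs), then
$f^{\oplus k}(x_1, \dots, x_k) = f(x_1) \oplus \dots \oplus f(x_k)$ is $(1/2 - \epsilon)$-hard for size $s' = s \cdot \mathrm{poly}(\epsilon, \delta)$, for $\epsilon \approx 2^{-\Omega(\delta k)}$.
Proof: By contradiction. Suppose we have a small circuit computing $f^{\oplus k}$ on more than a $1/2 + \epsilon$ fraction of inputs; show how to build a new circuit computing f on more than a $1 - \delta$ fraction of inputs.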
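As a purely heuristic illustration of why the advantage should decay like $2^{-\Omega(\delta k)}$ (this is not the proof of the lemma), consider an idealized model in which each of the k bits is guessed wrongly with probability delta, independently of the others. The XOR of the guesses is then correct with probability $1/2 + (1 - 2\delta)^k / 2$, a standard identity for sums of independent bits. The function name `xor_advantage` and the chosen parameters are assumptions made for this sketch.

```python
import math

def xor_advantage(delta, k):
    """Advantage over 1/2 when guessing the XOR of k independent bits,
    each guessed wrongly with probability delta (idealized model only)."""
    return (1 - 2 * delta) ** k / 2

delta = 0.1
for k in (10, 50, 100):
    # (1 - 2*delta)^k <= exp(-2*delta*k), so the advantage shrinks exponentially in delta*k.
    print(k, xor_advantage(delta, k), math.exp(-2 * delta * k) / 2)
```

The real lemma is harder precisely because a circuit's errors on different inputs need not behave independently.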
XOR-based Code
Think of a binary message msg on $N = 2^n$ bits as the truth table of a Boolean function f.
The code of msg has length $N^k$, where $\mathrm{code}(x_1, \dots, x_k) = f(x_1) \oplus \dots \oplus f(x_k)$.
This is very similar to a version of Hadamard code …
Hadamard Code
Given a binary msg on N bits, the Hadamard code of msg is a string of $2^N$ bits, where for an N-bit string r, the code at r is
$\mathrm{Had}(\mathrm{msg})_r = \langle \mathrm{msg}, r \rangle \bmod 2$
(the inner product of msg and r).
Our XOR-code is essentially a truncated Hadamard code where we only consider N-bit strings r of Hamming weight k:
$f(x_1) \oplus \dots \oplus f(x_k) = \langle \mathrm{msg}, r \rangle$,
where $r_i = 1$ for $i = x_1, \dots, x_k$ and $r_i = 0$ elsewhere.
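The correspondence between the XOR-code and the weight-k positions of the Hadamard code can be checked on a toy message. The sketch below is illustrative only: the helper names `hadamard_bit`, `xor_code_bit`, and `indicator`, and the 8-bit message, are assumptions and not part of the talk. It considers only tuples of distinct positions, since a repeated position cancels in the XOR (one reason the slide says "essentially").

```python
from itertools import combinations

def hadamard_bit(msg, r):
    """Had(msg)_r = <msg, r> mod 2 for bit lists msg and r of equal length."""
    return sum(m & b for m, b in zip(msg, r)) % 2

def xor_code_bit(msg, positions):
    """k-XOR code at (x_1, ..., x_k): XOR of the message bits at those positions."""
    bit = 0
    for x in positions:
        bit ^= msg[x]
    return bit

def indicator(n_bits, positions):
    """N-bit string r with r_i = 1 exactly for i in positions."""
    r = [0] * n_bits
    for x in positions:
        r[x] = 1
    return r

msg = [1, 0, 1, 1, 0, 0, 1, 0]   # toy truth table with N = 8 (hypothetical)
k = 3
# Compare the two codes on every set of k distinct positions.
matches = all(
    xor_code_bit(msg, pos) == hadamard_bit(msg, indicator(len(msg), pos))
    for pos in combinations(range(len(msg)), k)
)
print(matches)
```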
List-Decoding Hadamard Code
Given a $2^N$-bit string w, how many N-bit strings $m_1, \dots, m_t$ are there such that $\mathrm{Had}(m_i)$ agrees with w in at least a $1/2 + \epsilon$ fraction of positions?
Answer: $O(1/\epsilon^2)$ (easy to show using discrete Fourier analysis, or elementary probability theory)
The famous Goldreich-Levin algorithm provides an efficient way of list-decoding the Hadamard code with optimal list size $O(1/\epsilon^2)$.
List-Decoding k-XOR-Code
Given a string w, how many strings $m_1, \dots, m_t$ are there such that each k-XOR codeword $\mathrm{code}(m_i)$ agrees with w in at least a $1/2 + \epsilon$ fraction of positions?
Answer: Too many! (Any two messages that differ in less than a 1/k fraction of bits have almost identical codewords.)
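A small brute-force check makes the "almost identical codewords" remark concrete. The sketch below builds the full k-XOR codewords of two toy messages that differ in a single bit and compares the observed agreement with the value $(1 + (1 - 2\delta)^k)/2$, which follows from the standard formula for the parity of k independent biased bits (here delta is the fraction of positions where the messages differ). The message values, N = 16, k = 3, and the function names are assumptions chosen for illustration.

```python
from itertools import product

def xor_codeword(msg, k):
    """Full k-XOR codeword: one bit per k-tuple of positions (repeats allowed)."""
    word = []
    for tup in product(range(len(msg)), repeat=k):
        bit = 0
        for x in tup:
            bit ^= msg[x]
        word.append(bit)
    return word

def agreement(u, v):
    """Fraction of positions where two equal-length bit lists agree."""
    return sum(a == b for a, b in zip(u, v)) / len(u)

N, k = 16, 3
msg_a = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0]
msg_b = list(msg_a)
msg_b[5] ^= 1                      # differ in one bit: delta = 1/16 < 1/k

delta = 1 / N
observed = agreement(xor_codeword(msg_a, k), xor_codeword(msg_b, k))
predicted = (1 + (1 - 2 * delta) ** k) / 2
print(observed, predicted)
```

Both numbers come out above 0.8, so a received word that agrees well with one codeword also agrees well with the other, which is why exact list-decoding is the wrong question here.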
List-Decoding k-XOR-Code
Correct question: Given a string m, how many k-XOR codewords $\mathrm{code}(\mathrm{msg}_1), \dots, \mathrm{code}(\mathrm{msg}_t)$ are there such that
(1) each $\mathrm{code}(\mathrm{msg}_i)$ agrees with m in at least a $1/2 + \epsilon$ fraction of positions, and
(2) every pair $\mathrm{msg}_i$ and $\mathrm{msg}_j$ differ in at least a $\delta$ fraction of positions?
Answer: $1/(4\epsilon^2 - e^{-2\delta k})$, which is $O(1/\epsilon^2)$ for $\delta > \log(1/\epsilon)/k$ (as is the case for Yao’s XOR Lemma!)
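A short worked step shows why the bound collapses to $O(1/\epsilon^2)$ in that regime. Here $\epsilon$ is the agreement advantage over 1/2 and $\delta$ is the pairwise distance between messages. The step is written with the natural logarithm; since $\log_2(1/\epsilon) \ge \ln(1/\epsilon)$ for $\epsilon < 1$, it also covers the case where the slide's log is base 2.

```latex
\delta > \frac{\ln(1/\epsilon)}{k}
\;\Longrightarrow\;
e^{-2\delta k} < e^{-2\ln(1/\epsilon)} = \epsilon^{2}
\;\Longrightarrow\;
\frac{1}{4\epsilon^{2} - e^{-2\delta k}} < \frac{1}{3\epsilon^{2}} = O\!\left(1/\epsilon^{2}\right).
```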
The List Size
The proof of Yao’s XOR Lemma yields an approximate list-decoding algorithm for the XOR-code defined above.
But the list size is $2^{\mathrm{poly}(1/\epsilon)}$ rather than the optimal $\mathrm{poly}(1/\epsilon)$.
Our Result for k-XOR Code
There is a randomized algorithm such that, for $\epsilon \ge \mathrm{poly}(1/k)$:
Given a circuit C that computes code(msg) in a $1/2 + \epsilon$ fraction of positions, the algorithm outputs, with high probability, a list of $\mathrm{poly}(1/\epsilon)$ circuits that contains a circuit agreeing with msg in at least a $1 - k^{-0.1}$ fraction of positions. The running time is $\mathrm{poly}(|C|, 1/\epsilon)$.
Direct Product Lemma
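If $f : \{0,1\}^n \to \{0,1\}$ is $\delta$-hard for size s (i.e., any size-s circuit errs on at least a $\delta$ fraction of inputs), then
$f^k(x_1, \dots, x_k) = f(x_1) \dots f(x_k)$ is $(1 - \epsilon)$-hard for size $s' = s \cdot \mathrm{poly}(\epsilon, \delta)$, for $\epsilon \approx 2^{-\Omega(\delta k)}$ (i.e., any size-$s'$ circuit computes $f^k$ correctly on at most an $\epsilon$ fraction of inputs).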
XOR Lemma and Direct Product Lemma are essentially equivalent, thanks to the Goldreich-Levin list-decoding algorithm for Hadamard codes. Hence, enough to list-decode the Direct Product Lemma.
The proof of the DP Lemma
[Impagliazzo & Wigderson]: Give an efficient randomized algorithm LEARN that, given as input a circuit C $\epsilon$-computing $f^k$ (i.e., correct on at least an $\epsilon$ fraction of k-tuples, where $f : \{0,1\}^n \to \{0,1\}$) and $\mathrm{poly}(n, 1/\epsilon)$ pairs (x, f(x)) for independent uniform x’s, with high probability outputs a circuit C’ $(1 - \delta)$-computing f.
Need to know f(x) for $\mathrm{poly}(n, 1/\epsilon)$ random x’s. Let’s choose x’s at random, and then try all possibilities for the values of f on these x’s. This gives a list of $2^{\mathrm{poly}(n, 1/\epsilon)}$ circuits.
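The blow-up in the list size comes from enumerating every possible labeling of the sample points. The sketch below shows only that enumeration step; it does not implement LEARN, and the hidden function `f`, the sample points, and the name `candidate_labelings` are hypothetical stand-ins.

```python
from itertools import product

def candidate_labelings(xs):
    """Yield every way to guess a bit for each of the t sample points."""
    for bits in product((0, 1), repeat=len(xs)):
        yield list(zip(xs, bits))

f = lambda x: x % 2                 # hypothetical hidden function
xs = [3, 8, 13, 6]                  # t = 4 sample points
labelings = list(candidate_labelings(xs))
truth = [(x, f(x)) for x in xs]
# One of the 2^t labelings is the correct one, but we cannot tell which.
print(len(labelings), truth in labelings)
```

With t sample points there are 2^t candidate labelings, so running the learner once per labeling yields a list that is exponential in the number of samples.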
Reducing the List Size
Magic: We will use the circuit C $\epsilon$-computing $f^k$ to generate $\mathrm{poly}(n, 1/\epsilon)$ pairs (x, f(x)) for independent uniform x’s, and then run LEARN on C and the generated pairs (x, f(x)).
Well… Cannot do exactly that, but …
Imperfect samples
We will use the circuit C $\epsilon$-computing $f^k$ to generate $\mathrm{poly}(n, 1/\epsilon)$ pairs $(x, b_x)$ for a distribution on x’s that is statistically close to uniform and such that for most x’s we have $b_x = f(x)$.
Then run a generalization of LEARN on C and the generated pairs $(x, b_x)$, where the generalized LEARN is tolerant of imperfect samples $(x, b_x)$.
How to generate imperfect samples
Warm-up
Given a circuit C $\epsilon$-computing $f^k$, we want to generate (x, f(x)) where x is almost uniformly distributed.
First attempt: Pick a k-tuple $(x_1, \dots, x_k)$ uniformly at random from the $\epsilon$ fraction of k-tuples where C is correct. Evaluate $C(x_1, \dots, x_k) = b_1 \dots b_k$. Pick a random i, $1 \le i \le k$, and output $(x_i, b_i)$.
A Sampling Lemma
Let $S \subseteq \{0,1\}^{nk}$ be any set of density $\epsilon$. Define a distribution D as follows: Pick a k-tuple of n-bit strings $(x_1, \dots, x_k)$ uniformly at random from S, pick uniformly an index $1 \le i \le k$, and output $x_i$.
Then the statistical distance between D and the uniform distribution is at most $(\log(k/\epsilon)/k)^{1/2} \approx 1/\sqrt{k}$.
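The lemma can be sanity-checked by exact enumeration on a tiny universe. The sketch below computes the distribution D for one deliberately skewed set S and compares its distance from uniform with the stated bound, read here as sqrt(log(k/eps)/k) with eps the density of S and log taken as the natural logarithm (both readings are assumptions about the slide's formula). The universe size, k, the choice of S, and the function names are also assumptions made for illustration.

```python
from itertools import product
from math import log, sqrt

def sampled_coordinate_distribution(S, universe_size, k):
    """Exact distribution D: uniform tuple from S, then a uniform coordinate."""
    probs = [0.0] * universe_size
    for tup in S:
        for x in tup:
            probs[x] += 1.0 / (len(S) * k)
    return probs

def statistical_distance_to_uniform(probs):
    """Half the L1 distance between probs and the uniform distribution."""
    u = 1.0 / len(probs)
    return 0.5 * sum(abs(p - u) for p in probs)

universe_size, k = 4, 6            # toy stand-in for {0,1}^n with n = 2
all_tuples = list(product(range(universe_size), repeat=k))
# Hypothetical skewed set: tuples in which element 0 appears at least 3 times.
S = [t for t in all_tuples if t.count(0) >= 3]
eps = len(S) / len(all_tuples)

D = sampled_coordinate_distribution(S, universe_size, k)
dist = statistical_distance_to_uniform(D)
bound = sqrt(log(k / eps) / k)
print(eps, dist, bound)
```

Even for a set chosen to over-represent one element, the single-coordinate distribution stays within the bound; for such a small k the bound is loose, and it only becomes meaningful as k grows.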
Using the Sampling Lemma
If we could sample k-tuples on which C is correct, then we would have a pair (x, f(x)) for x that is $\approx 1/\sqrt{k}$-close to uniform.
But we can’t! Instead, run the previous sampling procedure with a random k-tuple $(x_1, \dots, x_k)$ some $\mathrm{poly}(1/\epsilon)$ number of times.
With high probability, at least one pair will be of the form (x, f(x)) for x close to uniform.
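The number of repetitions needed can be estimated with a one-line calculation: if C is correct on a set of tuples of density eps, each independent uniform tuple lands in that set with probability eps, so t tries all miss with probability (1 - eps)^t. The sketch below computes the smallest t for a target failure probability. The function names and the numbers 0.1 and 0.01 are assumptions chosen for illustration.

```python
import math

def trials_needed(eps, failure_prob):
    """Smallest t with (1 - eps)**t <= failure_prob, for 0 < eps < 1."""
    return math.ceil(math.log(failure_prob) / math.log(1 - eps))

def hit_probability(eps, t):
    """Chance that at least one of t independent uniform tuples lands in a set of density eps."""
    return 1 - (1 - eps) ** t

t = trials_needed(0.1, 0.01)
print(t, hit_probability(0.1, t))
```

Since t grows like (1/eps) * ln(1/failure_prob), a poly(1/eps) number of tries is enough. Note that the procedure still cannot tell which of the tries produced the good pair.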
Getting more pairs
Given a circuit C $\epsilon$-computing $f^k$, we can get $k^{1/2}$ pairs (x, f(x)), for x’s statistically close to uniform, by viewing the input k-tuple as a $k^{1/2}$-tuple of $k^{1/2}$-tuples, and applying the Sampling Lemma to that “meta-tuple”.
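The meta-tuple view can be sketched as follows: split the k inputs into sqrt(k) consecutive blocks of sqrt(k) inputs each, pick one block uniformly, and output all of its (input, answer) pairs. The sketch assumes k is a perfect square and models the circuit C as a Python callable that returns a list of k answer bits; `sample_block_pairs`, the toy function `f`, and the always-correct `C` are hypothetical stand-ins, not the construction from the talk.

```python
import math
import random

def sample_block_pairs(C, tuple_x, rng):
    """View a k-tuple as sqrt(k) blocks of sqrt(k) inputs and return the
    (x, b) pairs of one uniformly chosen block, with b read off C's output."""
    k = len(tuple_x)
    m = math.isqrt(k)
    if m * m != k:
        raise ValueError("this sketch assumes k is a perfect square")
    answers = C(tuple_x)                      # claimed values (b_1, ..., b_k)
    j = rng.randrange(m)                      # index of the chosen block
    return [(tuple_x[i], answers[i]) for i in range(j * m, (j + 1) * m)]

# Toy stand-ins (hypothetical): f is the parity of a small integer, C is always right.
f = lambda x: bin(x).count("1") % 2
C = lambda xs: [f(x) for x in xs]
rng = random.Random(0)
xs = [rng.randrange(16) for _ in range(9)]    # k = 9, so 3 blocks of 3
pairs = sample_block_pairs(C, xs, rng)
print(pairs)
```

Treating each block as a single coordinate is what lets a sampling lemma about one random coordinate speak about sqrt(k) inputs at once.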
What does it give us ?
Given a circuit C $\epsilon$-computing $f^k$, we can generate about $k^{1/2}$ samples (x, f(x)). (Roughly speaking.)
Need about $n/\epsilon^2$ samples (to run LEARN).
If $n/\epsilon^2 < k^{1/2}$, then done. What if $n/\epsilon^2 > k^{1/2}$?
Direct Product Amplification
Idea: Given a circuit C $\epsilon$-computing $f^k$, construct a new circuit C’ that $\epsilon'$-computes $f^{k'}$ for $k' = k^{3/2}$ and $\epsilon' > \epsilon^2$.
Iterate a constant number of times, and get a circuit $\mathrm{poly}(\epsilon)$-computing $f^{\mathrm{poly}(k)}$ for any poly(k). If $\epsilon = \mathrm{poly}(1/k)$, we are done [since $n/\epsilon^2 \le \mathrm{poly}(k)$].
Direct Product Amplification
Cannot achieve perfect DP amplification! Instead, can create a circuit C’ such that, for at least an $\epsilon'$ fraction of tuples $(x_1, \dots, x_{k'})$, $C'(x_1, \dots, x_{k'})$ agrees with $f(x_1), \dots, f(x_{k'})$ in “most” positions.
Because of this imperfection, we can only get pairs of the form $(x, b_x)$ where x’s are almost uniform and “most” $b_x = f(x)$.
Putting Everything Together
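C for $f^k$ $\to$ [DP amplification] $\to$ C’ for $f^{k^c}$ $\to$ [Sampling] $\to$ pairs $(x, b_x)$ $\to$ [LEARN] $\to$ circuit $(1 - \delta)$-computing f, with probability > $\mathrm{poly}(\epsilon)$
Repeat $\mathrm{poly}(1/\epsilon)$ times to get a list containing a good circuit for f, w.h.p.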
An application to uniform hardness amplification
Hardness amplification in PH
Theorem: Suppose there is a language L in $P^{NP_{\|}}$ that is $1/n^c$-hard for BPP. Then there is a language L’ in $P^{NP_{\|}}$ that is $(1/2 - n^{-d})$-hard for BPP, for any constant d.
Trevisan gives a weaker reduction (from $1/n^c$ to $(1/2 - \log^{-\alpha} n)$ hardness) but within NP. Since we use the nonmonotone function XOR as an amplifier, we get outside NP.
Open Questions
Achieving optimal list-size decoding for arbitrary $\epsilon$.
What monotone functions f yield efficiently list-decodable f-based error-correcting codes ? Getting an analogue of the Goldreich-Levin algorithm for monotone f-based codes would yield better uniform hardness amplification in NP.