1 distributed (local) monotonicity reconstruction michael saks rutgers university c. seshadhri...
Post on 19-Dec-2015
Distributed (Local) Monotonicity Reconstruction
Michael Saks, Rutgers University
C. Seshadhri, Princeton University (now IBM Almaden)
Overview
Introduce a new class of algorithmic problems: Distributed Property Reconstruction
(extending the frameworks of program self-correction, robust property testing, and locally decodable codes)
A solution for the property Monotonicity
Data Sets
Data set = function f : Γ → V
Γ = finite index set
V = value set
In this talk:
Γ = [n]^d = {1,…,n}^d
V = nonnegative integers
f = d-dimensional array of nonnegative integers
Distance between two data sets
For f, g with common domain Γ:
dist(f,g) = fraction of the domain where f(x) ≠ g(x)
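The distance definition can be sketched in a few lines. This is only an illustration: it assumes the data sets are stored as equal-length Python lists (the d=1 case), and the function name `dist` simply mirrors the slide's notation.

```python
def dist(f, g):
    """Fraction of the common domain on which f and g disagree."""
    if len(f) != len(g):
        raise ValueError("f and g must share a domain")
    # Count positions where the two data sets differ, then normalize.
    return sum(1 for a, b in zip(f, g) if a != b) / len(f)
```

For example, two arrays of length 4 that differ in one position are at distance 0.25.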
Properties of data sets
Focus of this talk:
Monotone: nondecreasing along every line
(Order preserving)
When d=1, monotone = sorted
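A minimal sketch of a monotonicity check for general d, assuming (for illustration only) that f is stored as a dictionary keyed by points of {1,…,n}^d given as tuples. Checking each point against its successor along every axis suffices, since the order is transitive.

```python
from itertools import product

def is_monotone(f, n, d):
    """f maps each point of {1..n}^d (a tuple) to a value."""
    for x in product(range(1, n + 1), repeat=d):
        for axis in range(d):
            if x[axis] < n:
                # y is the next point after x along this axis.
                y = x[:axis] + (x[axis] + 1,) + x[axis + 1:]
                if f[x] > f[y]:
                    return False
    return True
```

With d=1 this reduces to checking that the array is sorted.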
Some algorithmic problems for P
Given data set f (as input):
Recognition: Does f satisfy P?
Robust Testing:
(Define ε(f) = min{ dist(f,g) : g satisfies P })
For some 0 ≤ ε1 < ε2 < 1, output either
ε(f) > ε1: f is far from P
ε(f) < ε2: f is close to P
(If ε1 < ε(f) < ε2, either answer is acceptable.)
Property Reconstruction
Setting:
Given f
We expect f to satisfy P (e.g. we run algorithms on f that assume P), but f may not satisfy P
Reconstruction problem for P:
Given data set f, produce data set g that
satisfies P
is close to f: dist(f,g) is not much bigger than ε(f)
What does it mean to produce g?
Offline computation
Input: function table for f
Output: function table for g
Distributed monotonicity reconstruction
Want an algorithm A that, on input x, computes g(x)
A may query f(y) for any y
A has access to a short random string s
and is otherwise deterministic.
Distributed Property Reconstruction
Goal: WHP (with probability close to 1, over the choice of the random string s):
g has property P
dist(g,f) = O(ε(f))
Each A(x) runs quickly; in particular, it reads f(y) for only a small number of y.
Distributed Property Reconstruction
Precursors:
Online Data Reconstruction Model (Ailon-Chazelle-Liu-Seshadhri) [ACCL]
Locally Decodable Codes and Program Self-Correction (Blum-Luby-Rubinfeld; Rubinfeld-Sudan; etc.)
Graph Coloring (Goldreich-Goldwasser-Ron)
Monotonicity Testing (Dodis-Goldreich-Lehman-Raskhodnikova-Ron-Samorodnitsky; Goldreich-Goldwasser-Lehman-Ron-Samorodnitsky; Fischer; Fischer-Lehman-Newman-Raskhodnikova-Rubinfeld-Samorodnitsky; Ergun-Kannan-Kumar-Rubinfeld-Viswanathan; etc.)
Tolerant Property Testing (Parnas-Ron-Rubinfeld)
Example: Local Decoding of Codes
Data set f = boolean string of length n
Property = is a codeword of a given error-correcting code C
Reconstruction = decoding to a close codeword
Distributed reconstruction = local decoding
Key issue: making answers consistent
For an error-correcting code, can assume input f decodes to a unique g.
The set of positions that need to be corrected is determined by f.
For a general property, many different g (even exponentially many) that are close to f may have the property.
We want to ensure that A produces one of them.
An example
Monotonicity with input array:
1,…,100, 111,…,120, 101,…,110, 121,…,200
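For this example, ε(f) can be computed directly. In one dimension, the largest f-monotone subset is a longest nondecreasing subsequence, so ε(f) = 1 − (its length)/n. The sketch below uses the standard patience-sorting technique; the names `eps_1d` and `example` are illustrative, not from the talk.

```python
from bisect import bisect_right

def eps_1d(f):
    """Distance to monotonicity of a 1-d array: 1 - LNDS(f)/n."""
    tails = []  # tails[i] = smallest last value of a nondecreasing subsequence of length i+1
    for v in f:
        i = bisect_right(tails, v)
        if i == len(tails):
            tails.append(v)
        else:
            tails[i] = v
    return (len(f) - len(tails)) / len(f)

example = (list(range(1, 101)) + list(range(111, 121))
           + list(range(101, 111)) + list(range(121, 201)))
print(eps_1d(example))  # 0.05
```

Here ε(f) = 10/200 = 0.05, and there is more than one closest monotone g: one can correct either the block 111,…,120 or the block 101,…,110. A distributed algorithm has to make the same choice no matter which x it is asked about.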
Monotonicity Reconstruction: d=1
f is a linear array of length n
First attempt at distributed reconstruction:
A(x) looks at f(x) and f(x-1)
If f(x) ≥ f(x-1), then g(x) = f(x)
Otherwise, we have a non-monotonicity: g(x) = max{ f(x), f(x-1) }
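A small sketch of why this first attempt is not enough. It assumes 0-indexed lists and, as an extra assumption, sets g at the first position to f at that position since it has no predecessor.

```python
def first_attempt(f):
    """g(x) = max(f(x), f(x-1)); each entry depends only on two values of f."""
    g = [f[0]]  # assumption: the first position is left unchanged
    for x in range(1, len(f)):
        g.append(max(f[x], f[x - 1]))
    return g

print(first_attempt([5, 1, 1]))  # [5, 5, 1]
```

On input 5, 1, 1 the output is 5, 5, 1, which is still not monotone: a single large value can violate monotonicity with points far to its right, and a purely local fix does not see them.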
Monotonicity Reconstruction: d=1
Second attempt:
Set g(x) = max{ f(1), f(2), …, f(x) }
g is monotone, but
A(x) requires time Ω(x)
dist(g,f) may be much larger than ε(f)
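The second drawback can be seen on a small illustrative input (the array below is chosen for this sketch and is not from the talk): one large value at the front forces the prefix maximum to overwrite almost everything, although changing that single value would already make the array sorted.

```python
from itertools import accumulate

def second_attempt(f):
    """g(x) = max(f(1), ..., f(x)), computed here for all x at once."""
    return list(accumulate(f, max))

f = [100] + list(range(1, 100))   # epsilon(f) = 1/100: fix the first entry only
g = second_attempt(f)
changed = sum(a != b for a, b in zip(f, g)) / len(f)
print(changed)  # 0.99
```

So dist(g,f) = 0.99 while ε(f) = 0.01.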
Our results (for general d)
A distributed monotonicity reconstruction algorithm for general dimension d such that:
Time to compute g(x) is (log n)^O(d)
dist(f,g) ≤ C1(d) ε(f)
Shared random string s has size (d log n)^O(1)
(Builds on prior results on monotonicity testing and online monotonicity reconstruction.)
Which array values should be changed?
A subset S of Γ is f-monotone if f restricted to S is monotone.
For each x in Γ, A(x) must:
Decide whether g(x) = f(x)
If not, then determine g(x)
Preserved = { x : g(x) = f(x) }
Corrected = { x : g(x) ≠ f(x) }
In particular, Preserved must be f-monotone.
Identifying Preserved
The partition (Preserved, Corrected) must satisfy:
Preserved is f-monotone
|Corrected|/|Γ| = O(ε(f))
Preliminary algorithmic problem:
Classification problem
Classify each y in Γ as Green or Red so that:
Green is f-monotone
Red has size O(ε(f)|Γ|)
Need subroutine Classify(y).
A sufficient condition for f-monotonicity
A pair (x,y) in Γ × Γ is a violation if x < y and f(x) > f(y)
To guarantee that Green is f-monotone, Red should hit all violations:
for every violation (x,y), at least one of x, y is Red
Classify: 1-dimensional case
d=1: Γ={1,…,n}
f is a linear array.
For x in Γ, and subinterval J of Γ:
violations(x,J) = |{ y in J : (x,y) or (y,x) is a violation }|
Constructing a large f-monotone set
The set Bad:
x is in Bad if, for some interval J containing x, violations(x,J) ≥ |J|/2
Lemma.
1) Good = Γ \ Bad is f-monotone
2) |Bad| ≤ 4 ε(f)|Γ|
Proof of 1): If (x,y) is a violation, then every point of the interval [x,y] forms a violation with x or with y, so one of x, y is Bad for the interval [x,y].
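A brute-force sketch of the set Bad, useful only for checking part 1 of the lemma on small arrays. It assumes 0-indexed lists and counts a point y toward violations(x,J) when x and y form a violation in either order; all function names are illustrative.

```python
def violations(f, x, lo, hi):
    """Number of y in the inclusive interval [lo, hi] forming a violation with x."""
    count = 0
    for y in range(lo, hi + 1):
        a, b = min(x, y), max(x, y)
        if a < b and f[a] > f[b]:
            count += 1
    return count

def bad_set(f):
    """All x with violations(x, J) >= |J|/2 for some interval J containing x."""
    n = len(f)
    bad = set()
    for x in range(n):
        for lo in range(x + 1):
            if any(2 * violations(f, x, lo, hi) >= hi - lo + 1
                   for hi in range(x, n)):
                bad.add(x)
                break
    return bad

def is_f_monotone(f, points):
    """True if f restricted to the given set of points is nondecreasing."""
    vals = [f[p] for p in sorted(points)]
    return all(a <= b for a, b in zip(vals, vals[1:]))
```

This takes time polynomial in n per point, which is exactly the cost the rest of the talk works to avoid.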
Lemma.
Good = Γ \ Bad is f-monotone
|Bad| ≤ 4 ε(f)|Γ|
So we’d like to take:
Green = Good
Red = Bad
How do we compute Good?
To test whether y is in Good:
For each interval J containing y, check violations(y,J) < |J|/2
Difficulties:
There are Ω(n) intervals J containing y
For each J, computing violations(y,J) takes time Ω(|J|)
Speeding up the computation
Estimate violations(y,J) by random sampling
sample size polylog(n) is sufficient
violations*(y,J) denotes the estimate
Compute violations*(y,J) only for a carefully chosen set of test intervals
The Test Set T
Assume n = |Γ| = 2^k
k layers of intervals
Layer j consists of 2^(k-j+1) - 1 intervals of size 2^j
Subroutine Classify
To classify y:
If for each J in T containing y, violations*(y,J) < 0.1 |J|,
then y is Green
else y is Red
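A hedged sketch of the subroutine. Several details are assumptions made for illustration, not taken from the talk: the array length is a power of two, test intervals are half-overlapping dyadic intervals, the sample size is a fixed 64, small intervals are scanned exactly, and the shared random string is modeled by a `seed` that is combined with the interval endpoints so that every call sees the same samples.

```python
import random

def forms_violation(f, a, b):
    lo, hi = min(a, b), max(a, b)
    return lo < hi and f[lo] > f[hi]

def estimate_violations(f, y, lo, hi, seed, samples=64):
    """Estimate of violations(y, J) for the half-open interval J = [lo, hi)."""
    size = hi - lo
    if size <= samples:  # small interval: count exactly
        return sum(forms_violation(f, y, z) for z in range(lo, hi))
    rng = random.Random(f"{seed}:{lo}:{hi}")  # same samples on every call
    hits = sum(forms_violation(f, y, rng.randrange(lo, hi))
               for _ in range(samples))
    return hits * size / samples

def classify(f, y, seed, samples=64):
    n = len(f)  # assumed to be a power of two
    size = 2
    while size <= n:
        step = size // 2
        first = max(0, (y // step - 1) * step)
        for lo in (first, first + step):  # the (at most two) intervals containing y
            hi = lo + size
            if (lo <= y < hi <= n and
                    estimate_violations(f, y, lo, hi, seed, samples) >= 0.1 * size):
                return "Red"
        size *= 2
    return "Green"
```

On a sorted array every point comes out Green, since no sample can form a violation.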
Where are we?
We have a subroutine Classify
On input x, Classify outputs Green or Red
Runs in time polylog(n)
WHP:
Green is f-monotone
|Red| ≤ 20 ε(f)|Γ|
Defining g(x) for Red x
The natural way to define g(x) is:
Green(x) = { y : y ≤ x and y is Green }
g(x) = max{ f(y) : y in Green(x) } = f(max Green(x))
In particular, this gives g(x) = f(x) for Green x
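The natural definition can be sketched as a backward scan from x. The sketch assumes 0-indexed lists and a predicate `is_green` standing in for the classification; returning 0 when no Green point lies at or before x is an assumption (0 is the smallest value in V), since the slides do not cover that case.

```python
def g_at(f, is_green, x, default=0):
    """g(x) = f(max Green(x)), found by scanning back from x."""
    for y in range(x, -1, -1):
        if is_green(y):
            return f[y]
    return default  # assumption: no Green point at or before x

f = [1, 2, 9, 3, 4]
green = [True, True, False, True, True]
print([g_at(f, lambda y: green[y], x) for x in range(len(f))])  # [1, 2, 2, 3, 4]
```

The scan takes time proportional to the length of the Red stretch before x, which can be long.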
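Computing m(x)
m(x) = max Green(x), the nearest Green point at or before x
Can search back from x to find the first Green point
Inefficient if x is preceded by a long Red stretch
[Figure: linear array with m(x) marked to the left of x]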
Approximating m(x)?
Pick a random Sample(x) of points less than x
Density inversely proportional to distance from x
Size is polylog(n)
Green*(x) = { y : y in Sample(x), y is Green }
m*(x) = max{ y in Green*(x) }
[Figure: linear array with m*(x) marked to the left of x]
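One way to get density inversely proportional to distance is to draw a constant number of points from each distance scale [d, 2d) for d = 1, 2, 4, …. The sketch below does this. It is an illustration under assumptions: points are 0-indexed, x itself is always included (so that a Green x can keep its own value), `per_scale` is an arbitrary constant, and the shared random string is modeled by a `seed` combined with x.

```python
import random

def sample_at_or_below(x, seed, per_scale=4):
    """Points y <= x, with about per_scale points at each distance scale."""
    rng = random.Random(f"{seed}:{x}")  # reproducible for a fixed seed and x
    points = {x}
    d = 1
    while d <= x:
        far = min(2 * d - 1, x)  # largest distance in this scale that stays >= 0
        for _ in range(per_scale):
            points.add(x - rng.randint(d, far))
        d *= 2
    return points

def m_star(x, is_green, seed):
    """Largest sampled Green point at or below x, or None if there is none."""
    green = [y for y in sample_at_or_below(x, seed) if is_green(y)]
    return max(green) if green else None
```

The sample has O(log n) points per call, and calling it twice with the same seed and x returns the same set.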
Is m*(x) good enough?
[Figure: linear array with points m*(x) ≤ y ≤ x]
Suppose y is Green and m*(x) ≤ y ≤ x
Since y is Green: g(y) = f(y)
If f(m*(x)) < f(y), then g(x) = f(m*(x)) < f(y) = g(y)
So y ≤ x but g(y) > g(x): g is not monotone
Is m*(x) good enough?
To ensure monotonicity we need: x < y implies m*(x) ≤ m*(y)
This requires relaxing the requirement: for all Green z, m*(z) = z
[Figure: linear array with points m*(x), y, x]
Thinning out Green*(x)
Plan: Eliminate certain unsafe points from Green*(x)
Roughly, y is unsafe for x if, for some z > x, some interval beginning with y and containing x has a high density of Reds.
(There is a non-trivial chance that Sample(z) has no Green points ≥ y.)
[Figure: linear array with points y, x, z]
Thinning out Green*(x)
Green*(x) = { y : y in Sample(x), y is Green }
m*(x) = max{ y in Green*(x) }
Green^(x) = { y : y in Green*(x), y is safe for x }
m^(x) = max{ y in Green^(x) }
(Hiding: efficient implementation of Green^(x))
Redefining Green^(x)
WHP:
if x ≤ y, then m^(x) ≤ m^(y)
|{ x : m^(x) ≠ x }| is O(ε(f)|Γ|)
Summary of 1-dimensional case
Classify points as Green and Red
Few Red points
Green is f-monotone
For each x, choose Sample(x)
Size polylog(n)
All points less than x
Density inversely proportional to distance from x
Green^(x) = points of Sample(x) that are Green and safe for x
m^(x) is the maximum of Green^(x)
Output g(x) = f(m^(x))
Dimension greater than 1
For x < y, want g(x) ≤ g(y)
[Figure: two points x < y in the grid]
Red/Green Classification
Extend the Red/Green classification to higher dimensions:
f restricted to Green is monotone
Red is small
Straightforward (mostly) extension of the 1-dimensional case
Given Red/Green classification
In the one-dimensional case,
Green^(x) = sampled Green points safe for x
g(x) = f(max{ y : y in Green^(x) })
The Green points below x
Set of Green maxima could be very large
Sparse random sampling will only roughly capture the frontier
Finding an appropriate definition of unsafe points is much harder than in the one-dimensional case
[Figure: grid with the Green points below x]
Further work
The g produced by our algorithm differs from f on at most C(d) ε(f)|Γ| points, i.e. dist(g,f) ≤ C(d) ε(f)
Our C(d) is exp(d^2). What should C(d) be? (Guess: C(d) = exp(d))
Distributed reconstruction for other interesting properties?
(Reconstructing expanders: Kale, Peres, Seshadhri, FOCS 08)