Adaptive annealing: a near-optimal connection between sampling and counting
Daniel Štefankovič (University of Rochester), Santosh Vempala, Eric Vigoda (Georgia Tech)
[Slides 1-2: title]
If you want to count using MCMC, then statistical physics is useful.
[Slide 3: Outline]
1. Counting problems
2. Basic tools: Chernoff, Chebyshev
3. Dealing with large quantities (the product method)
4. Statistical physics
5. Cooling schedules (our work)
6. More...
[Slides 4-5: Counting]
independent sets
spanning trees
matchings
perfect matchings
k-colorings
[Slides 6-7]
Compute the number of spanning trees.

Kirchhoff's Matrix Tree Theorem: the number of spanning trees equals det (D - A)_vv, the determinant of the Laplacian D - A with the row and column of an arbitrary vertex v deleted. For the 4-cycle:

A = [0 1 0 1; 1 0 1 0; 0 1 0 1; 1 0 1 0],   D = [2 0 0 0; 0 2 0 0; 0 0 2 0; 0 0 0 2]

det [ 2 -1  0 ]
    [-1  2 -1 ]  =  4
    [ 0 -1  2 ]
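The determinant above can be checked directly. A minimal sketch: the 4-cycle matrices are from the slide; the exact rational elimination routine is ours.

```python
# Kirchhoff's Matrix Tree Theorem on the 4-cycle: # spanning trees =
# det of (D - A) with one row and column deleted.
from fractions import Fraction

def det(m):
    """Determinant by Gaussian elimination over the rationals (exact)."""
    m = [[Fraction(x) for x in row] for row in m]
    n = len(m)
    sign = 1
    for i in range(n):
        p = next((r for r in range(i, n) if m[r][i] != 0), None)  # pivot
        if p is None:
            return Fraction(0)
        if p != i:
            m[i], m[p] = m[p], m[i]
            sign = -sign
        for r in range(i + 1, n):
            factor = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= factor * m[i][c]
    result = Fraction(sign)
    for i in range(n):
        result *= m[i][i]
    return result

# Adjacency matrix A of the 4-cycle; degree matrix D = 2I.
A = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
L = [[(2 if i == j else 0) - A[i][j] for j in range(4)] for i in range(4)]
reduced = [row[:3] for row in L[:3]]  # delete row/column of one vertex
print(det(reduced))  # → 4
```

Any vertex can be deleted; the theorem guarantees the same value.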
[Slide 8]
Compute the number of spanning trees: there is a polynomial-time algorithm that takes G and outputs the number of spanning trees of G.

[Slide 9]
Counting: what about independent sets, matchings, perfect matchings, k-colorings?
[Slides 10-11]
Compute the number of independent sets (the hard-core gas model).
An independent set is a subset S of the vertices of a graph such that no two vertices in S are neighbors. Example: # independent sets = 7.
[Slides 12-13]
For the path graph, the count for G_n is the sum of the counts for G_{n-1} and G_{n-2} (split on whether the last vertex is used), so the counts follow the Fibonacci recurrence:
G_1, G_2, G_3, ..., G_{n-2}, G_{n-1}, G_n have
2, 3, 5, ..., F_{n-1}, F_n, F_{n+1} independent sets.
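The recurrence above can be sketched and cross-checked against brute force on small paths; the helper names are ours.

```python
# Counting independent sets of the path P_n two ways: by the
# Fibonacci-style recurrence from the slide, and by brute force.
from itertools import combinations

def count_path_recurrence(n):
    """c(n) = c(n-1) + c(n-2): the last vertex is either excluded
    (leaving P_{n-1}) or included (leaving P_{n-2})."""
    a, b = 1, 2  # c(0) = 1 (empty graph), c(1) = 2
    for _ in range(n - 1):
        a, b = b, a + b
    return b if n >= 1 else a

def count_path_bruteforce(n):
    edges = [(i, i + 1) for i in range(n - 1)]
    total = 0
    for k in range(n + 1):
        for s in combinations(range(n), k):
            ss = set(s)
            if all(not (u in ss and v in ss) for u, v in edges):
                total += 1
    return total

for n in range(1, 8):
    assert count_path_recurrence(n) == count_path_bruteforce(n)
print([count_path_recurrence(n) for n in range(1, 6)])  # → [2, 3, 5, 8, 13]
```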
[Slide 14]
A larger example: # independent sets = 5598861.

[Slide 15]
Compute the number of independent sets: is there a polynomial-time algorithm that takes G and outputs the number of independent sets of G?
[Slide 16]
No (unlikely): such a polynomial-time algorithm probably does not exist.

[Slide 17]
Computing the number of independent sets (graph G → # independent sets in G) is #P-complete, and remains #P-complete even for 3-regular graphs (Dyer, Greenhill, 1997). [Pictured: P vs NP and FP vs #P.]
[Slides 18-19]
graph G → # independent sets in G: what if we allow approximation and randomization? Which of the two is more important?

[Slide 20]
My world-view: (true) randomness is important conceptually but NOT computationally (i.e., I believe P = BPP); approximation makes problems easier (i.e., I believe #P = BPP).
[Slide 21]
We would like to know Q. Goal: a random variable Y such that
P( (1-ε)Q ≤ Y ≤ (1+ε)Q ) ≥ 1 - δ
("Y gives a (1±ε)-estimate").

[Slide 22]
FPRAS (fully polynomial randomized approximation scheme): a polynomial-time algorithm that, given G, ε, δ, outputs such a Y.
[Slide 23: Outline. Next: 2. Basic tools: Chernoff, Chebyshev]
[Slide 24]
We would like to know Q.
1. Get an unbiased estimator X, i.e., E[X] = Q.
2. "Boost the quality" of X by averaging: Y = (X1 + X2 + ... + Xn)/n.

[Slide 25]
The Bienaymé-Chebyshev inequality:
P( Y gives a (1±ε)-estimate ) ≥ 1 - (1/ε²)·V[Y]/E[Y]².
[Slide 26]
For Y = (X1 + X2 + ... + Xn)/n we have
V[Y]/E[Y]² = (1/n)·V[X]/E[X]²,
where V[X]/E[X]² is the squared coefficient of variation (SCV), so
P( Y gives a (1±ε)-estimate ) ≥ 1 - (1/ε²)·(1/n)·V[X]/E[X]².

[Slide 27]
The Bienaymé-Chebyshev inequality: let X1,...,Xn,X be independent, identically distributed random variables with Q = E[X], and let Y = (X1 + X2 + ... + Xn)/n. Then
P( Y gives a (1±ε)-estimate of Q ) ≥ 1 - (1/ε²)·V[X]/(n·E[X]²).
[Slide 28]
Chernoff's bound: let X1,...,Xn,X be independent, identically distributed random variables with 0 ≤ X ≤ 1 and Q = E[X], and let Y = (X1 + X2 + ... + Xn)/n. Then
P( Y gives a (1±ε)-estimate of Q ) ≥ 1 - e^(-ε²·n·E[X]/3).
[Slides 30-31]
Number of samples n needed for precision ε with confidence δ:

Chebyshev:  n ≥ (V[X]/E[X]²)·(1/ε²)·(1/δ)
Chernoff:   n ≥ (1/E[X])·(3/ε²)·ln(1/δ)    (requires 0 ≤ X ≤ 1)

The 1/δ factor in Chebyshev is BAD; the ln(1/δ) in Chernoff is GOOD, but its 1/E[X] factor is BAD when E[X] is small.
[Slide 32]
Median "boosting trick": by Bienaymé-Chebyshev, taking
n ≥ 4·(V[X]/E[X]²)·(1/ε²) samples in Y = (X1 + X2 + ... + Xn)/n gives
P( (1-ε)Q ≤ Y ≤ (1+ε)Q ) ≥ 3/4.

[Slide 33]
Median trick: repeat 2T times. Each repetition lands in [(1-ε)Q, (1+ε)Q] with probability ≥ 3/4 (by Bienaymé-Chebyshev), so by Chernoff more than T of the 2T repetitions land in the interval, and hence the median lands in it, with probability ≥ 1 - e^(-T/4).
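The trick above is the median-of-means estimator. A minimal sketch on an invented toy target Q = E[X] = 0.25 (the batch sizes are illustrative, not the slide's constants):

```python
# Median "boosting trick": average batches of samples, then take the
# median of the batch means.
import random
import statistics

def median_of_means(draw, batch_size, num_batches):
    means = [sum(draw() for _ in range(batch_size)) / batch_size
             for _ in range(num_batches)]
    return statistics.median(means)

random.seed(0)
Q = 0.25
draw = lambda: 1.0 if random.random() < Q else 0.0  # unbiased estimator of Q

# Each batch mean is a decent estimate with probability >= 3/4; the median
# of 11 batches fails only if at least 6 batches fail simultaneously.
est = median_of_means(draw, batch_size=400, num_batches=11)
print(est)
assert abs(est - Q) < 0.05
```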
[Slide 34]
Chebyshev + median trick:  n ≥ (V[X]/E[X]²)·(32/ε²)·ln(1/δ)
Chernoff (0 ≤ X ≤ 1):      n ≥ (1/E[X])·(3/ε²)·ln(1/δ)    (1/E[X] can be BAD)

[Slide 35]
Creating an "approximator" from X (ε = precision, δ = confidence):
n ≥ (V[X]/E[X]²)·(1/ε²)·ln(1/δ) samples suffice, up to constants.
[Slide 36: Outline. Next: 3. Dealing with large quantities (the product method)]
[Slide 37]
(Approximate) counting reduces to sampling: Valleau, Card '72 (physical chemistry), Babai '79 (for matchings and colorings), Jerrum, Valiant, V. Vazirani '86. The outcome of the JVV reduction: random variables X1, X2, ..., Xt such that
1) E[X1 X2 ... Xt] = "WANTED", and
2) the Xi are easy to estimate: V[Xi]/E[Xi]² = O(1) (squared coefficient of variation, SCV).

[Slide 38]
Theorem (Dyer-Frieze '91): given 1) and 2), O(t²/ε²) samples (O(t/ε²) from each Xi) give a (1±ε)-estimator of "WANTED" with probability ≥ 3/4.
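The product method can be sketched on a toy instance: the factor values below are invented, and each E[X_i] is estimated independently from Bernoulli samples before the estimates are multiplied.

```python
# Product method: estimate each easy factor E[X_i] from samples and
# multiply the estimates. With SCV = O(1) per factor, O(t/eps^2) samples
# per factor suffice (Dyer-Frieze-style analysis).
import random

random.seed(1)
probs = [0.7, 0.5, 0.6, 0.8]  # E[X_i]; "WANTED" = product = 0.168
wanted = 1.0
for p in probs:
    wanted *= p

samples_per_factor = 20000
estimate = 1.0
for p in probs:
    hits = sum(random.random() < p for _ in range(samples_per_factor))
    estimate *= hits / samples_per_factor  # empirical mean of X_i

print(wanted, estimate)
assert abs(estimate - wanted) < 0.05 * wanted
```

Note how the relative errors of the factors add up, which is why the sample count grows with the number of factors t.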
[Slides 39-41]
JVV for independent sets. GOAL: given a graph G, estimate the number of independent sets of G. Start from
P( the empty set is drawn uniformly at random ) = 1/(# independent sets).
Using P(A∩B) = P(A)·P(B|A), write this probability as a product X1·X2·X3·X4 of conditional probabilities, one per vertex. Each factor satisfies Xi ∈ [0,1] and E[Xi] ≥ 1/2, hence V[Xi]/E[Xi]² = O(1).
[Slides 42-47]
Self-reducibility for independent sets: each conditional probability, e.g. P( a fixed vertex is not in a random independent set ) = 5/7, is the ratio of the counts for a smaller graph and the current graph, so conditioning reduces to the same problem on a smaller graph. Telescoping,
(2/3)·(3/5)·(5/7) = 2/7,
and inverting the product recovers # independent sets = 7.
[Slide 48]
JVV: if we have a sampler oracle (graph G → random independent set of G), then we get an FPRAS using O(n²) samples.

[Slide 49]
ŠVV: if we have a sampler oracle for the hard-core (gas-model) Gibbs distribution at inverse temperature β (graph G → set drawn from the Gibbs distribution), then we get an FPRAS using O*(n) samples.

[Slide 50]
Application, independent sets: O*(|V|) samples suffice for counting.
Cost per sample (Vigoda '01, Dyer-Greenhill '01): time O*(|V|) for graphs of degree ≤ 4. Total running time: O*(|V|²).
[Slide 51]
Other applications (total running times):
matchings: O*(n²m) (using Jerrum, Sinclair '89)
spin systems, e.g. the Ising model: O*(n²) for β < βC (using Marinelli, Olivieri '95)
k-colorings: O*(n²) for k > 2Δ (using Jerrum '95)
[Slide 52: Outline. Next: 4. Statistical physics]
[Slide 53]
easy = hot, hard = cold.

[Slide 54]
A Hamiltonian assigns an energy level to each configuration (pictured: levels 0, 1, 2, 4).

[Slide 55]
Hamiltonian H : Ω → {0,...,n}; the big set is Ω.
Goal: estimate |H⁻¹(0)|, written as a product |H⁻¹(0)| = E[X1]·...·E[Xt].
[Slide 56]
Distributions between hot and cold (Gibbs distributions):
μβ(x) ∝ exp(-β·H(x)), with β = inverse temperature.
β = 0: hot, uniform on Ω.   β = ∞: cold, uniform on H⁻¹(0).

[Slide 57]
μβ(x) = exp(-β·H(x))/Z(β); the normalizing factor is the partition function
Z(β) = Σx exp(-β·H(x)).

[Slide 58]
Partition function: Z(β) = Σx exp(-β·H(x)).
Have: Z(0) = |Ω|.   Want: Z(∞) = |H⁻¹(0)|.
[Slide 59]
Partition function, example:
Z(β) = 1·e^(-4β) + 4·e^(-2β) + 4·e^(-β) + 7·e^(-0·β),
so Z(0) = 16 and Z(∞) = 7.
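The example above can be evaluated directly; a large β stands in for β → ∞.

```python
# The slide's partition function: a_k = |H^-1(k)| states at energy k.
import math

a = {0: 7, 1: 4, 2: 4, 4: 1}

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in a.items())

print(Z(0))   # → 16.0   (the hot end: |Omega|)
print(Z(50))  # → 7.0    (the cold end: only the ground states survive)
```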
[Slide 60]
Assumption: we have a sampler oracle for μβ(x) = exp(-β·H(x))/Z(β): given a graph G and β, it returns a subset of V drawn from μβ.

[Slides 61-63]
Draw W from μβ and set X = exp(H(W)·(β - β')). Then
E[X] = Σs μβ(s)·X(s) = Z(β')/Z(β),
so from samples at β we can obtain the ratio Z(β')/Z(β).
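The identity E[X] = Z(β')/Z(β) can be verified exactly by summing over the toy state space from the example (β = 0.3 and β' = 0.7 are arbitrary choices):

```python
# Verify E[exp(H(W)(beta - beta'))] = Z(beta')/Z(beta) by exact summation.
import math

a = {0: 7, 1: 4, 2: 4, 4: 1}  # a_k = |H^-1(k)|

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in a.items())

beta, beta2 = 0.3, 0.7
# E[X] = sum over states s of mu_beta(s) * exp(H(s)(beta - beta2));
# grouping states by energy level k:
EX = sum(ak * math.exp(-beta * k) / Z(beta) * math.exp(k * (beta - beta2))
         for k, ak in a.items())
print(EX, Z(beta2) / Z(beta))
assert abs(EX - Z(beta2) / Z(beta)) < 1e-12
```

Each term simplifies to a_k·e^(-β'k)/Z(β), so the sum telescopes to Z(β')/Z(β), exactly as on the slide.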
[Slide 64]
Our goal restated: pick a cooling schedule 0 = β0 < β1 < β2 < ... < βt = ∞ and write
Z(∞) = Z(β0) · (Z(β1)/Z(β0)) · (Z(β2)/Z(β1)) · ... · (Z(βt)/Z(βt-1)).

[Slides 65-66]
How to choose the cooling schedule? Minimize its length t while keeping each factor easy to estimate:
E[Xi] = Z(βi)/Z(βi-1)   with   V[Xi]/E[Xi]² ≤ O(1).
[Slide 67: Outline. Next: 5. Cooling schedules (our work)]
[Slide 68]
Parameters A and n: Z(0) = A and H : Ω → {0,...,n}, so
Z(β) = Σx exp(-β·H(x)) = Σk=0..n ak·e^(-β·k), where ak = |H⁻¹(k)|.
[Slides 69-74]
Parameters for concrete problems (Z(0) = A, H : Ω → {0,...,n}):

independent sets:   A = 2^|V|, n = |E|
perfect matchings:  A = |V|!,  n = |V|
k-colorings:        A = k^|V|, n = |E|
(matchings fit the same framework)

Intuition for perfect matchings: marry everyone ignoring "compatibility"; the Hamiltonian counts the number of unhappy couples, so H⁻¹(0) is the set of perfect matchings ("# ways of marrying them so that no couple is unhappy").
[Slides 75-76]
Previous cooling schedules for Z(0) = A, H : Ω → {0,...,n}
(Bezáková, Štefankovič, Vigoda, V. Vazirani '06).
"Safe steps":
  β → β + 1/n
  β → β·(1 + 1/ln A)
  ln A → ∞
These yield cooling schedules of length O(n ln A) and O((ln n)(ln A)).

[Slide 77]
Why β → β + 1/n is safe: for W from μβ and X = exp(H(W)·(β - β')) with β' = β + 1/n, we have 1/e ≤ X ≤ 1, so E[X] ≥ 1/e and V[X]/E[X]² ≤ e.

[Slides 78-79]
Why the jump ln A → ∞ is safe: Z(∞) = a0 ≥ 1 and Z(ln A) ≤ a0 + 1 (since Σk ak = A), so the ratio estimator has E[X] ≥ 1/2; combined with a safe step, E[X] ≥ 1/(2e).

[Slide 80]
The resulting schedule: 1/n, 2/n, 3/n, ..., (ln A)/n, ..., ln A, of length O(n ln A).
[Slide 81]
No better fixed schedule is possible.
THEOREM: let Za(β) = (A/(1+a))·(1 + a·e^(-β·n)). Any schedule that works for all Za with a ∈ [0, A-1] has length Ω((ln n)(ln A)).
[Slide 82]
Our main result: for Z(0) = A, H : Ω → {0,...,n}, we can get an adaptive schedule of length O*((ln A)^(1/2)). Previously: non-adaptive schedules of length Θ*(ln A).

[Slide 83]
Related work: Lovász-Vempala compute the volume of convex bodies in O*(n⁴) using a schedule of length O(n^(1/2)) (a non-adaptive cooling schedule exploiting specific properties of the "volume" partition functions).

[Slide 84]
Existential part. Lemma: for every partition function there exists a cooling schedule of length O*((ln A)^(1/2)).
[Slide 85]
Cooling schedule (definition refresh): 0 = β0 < β1 < ... < βt = ∞; minimize the length t while each E[Xi] = Z(βi)/Z(βi-1) satisfies V[Xi]/E[Xi]² ≤ O(1).
[Slide 86]
Express the SCV using the partition function (going from β to β'): for W from μβ and X = exp(H(W)·(β - β')),
E[X] = Z(β')/Z(β)   and   E[X²]/E[X]² = Z(2β'-β)·Z(β)/Z(β')² =: C,
so V[X]/E[X]² = C - 1.

[Slide 87]
Proof: let f(β) = ln Z(β) and C' = (ln C)/2. The condition E[X²]/E[X]² ≤ C becomes
(f(2β'-β) + f(β))/2 ≤ C' + f(β'),
i.e. on the graph of f, the chord over [β, 2β'-β] lies at most C' above f at the midpoint β'.
[Slides 88-89]
Properties of partition functions: f(β) = ln Z(β) is decreasing and convex, with f'(0) ≥ -n and f(0) ≤ ln A. Indeed,
f(β) = ln Σk=0..n ak·e^(-β·k),
and (ln g)' = g'/g gives
f'(β) = - (Σk=0..n ak·k·e^(-β·k)) / (Σk=0..n ak·e^(-β·k)).
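These properties can be checked numerically on the toy example from slide 59 (A = 16, n = 4): f(0) = ln A, f'(0) ≥ -n, and f is decreasing and convex.

```python
# Check the slide's properties of f(beta) = ln Z(beta) on the toy example.
import math

a = {0: 7, 1: 4, 2: 4, 4: 1}   # a_k = |H^-1(k)|; A = 16, n = 4
A_total, n = sum(a.values()), max(a)

def f(beta):
    return math.log(sum(ak * math.exp(-beta * k) for k, ak in a.items()))

def fprime(beta):
    num = sum(ak * k * math.exp(-beta * k) for k, ak in a.items())
    den = sum(ak * math.exp(-beta * k) for k, ak in a.items())
    return -num / den

assert abs(f(0) - math.log(A_total)) < 1e-12   # f(0) = ln A
assert -n <= fprime(0) < 0                     # f'(0) >= -n; here -1.0

betas = [0.1 * i for i in range(40)]
vals = [f(b) for b in betas]
assert all(x > y for x, y in zip(vals, vals[1:]))        # f decreasing
diffs = [y - x for x, y in zip(vals, vals[1:])]
assert all(d2 >= d1 for d1, d2 in zip(diffs, diffs[1:])) # f convex
print(f(0), fprime(0))
```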
[Slide 90]
GOAL, proving the Lemma: for every partition function there exists a cooling schedule of length O*((ln A)^(1/2)).
Proof idea: over each step, either f or ln |f'| changes a lot.
Claim: if K := Δf over a step, then Δ(ln |f'|) ≥ 1/K.

[Slides 91-92]
Proof of the claim: let c := (a+b)/2 and choose the step so that f(c) = (f(a)+f(b))/2 - 1 (the chord is exactly 1 above the graph at the midpoint). Since f is convex and decreasing,
|f'(a)| ≥ (f(a) - f(c))/(c - a)   and   |f'(b)| ≤ (f(c) - f(b))/(b - c),
so with K = f(a) - f(b),
|f'(b)|/|f'(a)| ≤ (f(c) - f(b))/(f(a) - f(c)) ≤ 1 - 1/K ≤ e^(-1/K),
i.e. ln |f'| decreases by at least 1/K.

[Slide 93]
Consequence: a convex decreasing f : [a,b] → R can be "approximated" using about ((f(a) - f(b))·ln(f'(a)/f'(b)))^(1/2) segments.
![Page 94: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/94.jpg)
Proof:
2-
Technicality: getting to 2-
![Page 95: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/95.jpg)
Proof:
2-
i
i+1
Technicality: getting to 2-
![Page 96: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/96.jpg)
Proof:
2-
i
i+1 i+2
Technicality: getting to 2-
![Page 97: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/97.jpg)
Technicality: getting from β to 2β (figure: intermediate steps βᵢ, βᵢ₊₁, βᵢ₊₂, βᵢ₊₃): ≈ ln ln A extra steps.
![Page 98: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/98.jpg)
Existential → Algorithmic:
there exists an adaptive schedule of length O*((ln A)^{1/2})
can we algorithmically construct an adaptive schedule of length O*((ln A)^{1/2})?
![Page 99: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/99.jpg)
Algorithmic construction
Our main result: using a sampler oracle for μ_β(x) = exp(−β H(x))/Z(β),
we can construct a cooling schedule of length ≤ 38 (ln A)^{1/2} (ln ln A)(ln n).
Total number of oracle calls: ≤ 10^7 (ln A)(ln ln A + ln n)^7 ln(1/δ).
![Page 100: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/100.jpg)
Algorithmic construction
current inverse temperature β
ideally move to β' such that
B₁ ≤ E[X²]/E[X]² ≤ B₂, where E[X] = Z(β')/Z(β)
![Page 101: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/101.jpg)
B₁ ≤ E[X²]/E[X]² ≤ B₂, where E[X] = Z(β')/Z(β)
E[X²]/E[X]² ≤ B₂: X is "easy to estimate"
![Page 102: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/102.jpg)
B₁ ≤ E[X²]/E[X]² ≤ B₂, where E[X] = Z(β')/Z(β)
B₁ ≤ E[X²]/E[X]² (with B₁ > 1): we make progress
![Page 103: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/103.jpg)
B₁ ≤ E[X²]/E[X]² ≤ B₂, where E[X] = Z(β')/Z(β)
need to construct a "feeler" for E[X²]/E[X]²
![Page 104: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/104.jpg)
Algorithmic construction: need a "feeler" for
E[X²]/E[X]² = (Z(β)/Z(β')) · (Z(2β'−β)/Z(β'))
![Page 105: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/105.jpg)
E[X²]/E[X]² = (Z(β)/Z(β')) · (Z(2β'−β)/Z(β')), a bad "feeler"
![Page 106: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/106.jpg)
estimator for Z(β')/Z(β)
Z(β) = Σ_{k=0}^n a_k e^{−βk}
For W ~ μ_β we have P(H(W)=k) = a_k e^{−βk} / Z(β)
![Page 107: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/107.jpg)
Z(β) = Σ_{k=0}^n a_k e^{−βk}
For W ~ μ_β we have P(H(W)=k) = a_k e^{−βk} / Z(β)
For U ~ μ_β' we have P(H(U)=k) = a_k e^{−β'k} / Z(β')
If H(X)=k is likely at both β and β', this yields an estimator for Z(β')/Z(β)
![Page 108: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/108.jpg)
![Page 109: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/109.jpg)
For W ~ μ_β we have P(H(W)=k) = a_k e^{−βk} / Z(β)
For U ~ μ_β' we have P(H(U)=k) = a_k e^{−β'k} / Z(β')
(P(H(U)=k) / P(H(W)=k)) · e^{k(β'−β)} = Z(β)/Z(β')
estimator for Z(β)/Z(β')
![Page 110: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/110.jpg)
(P(H(U)=k) / P(H(W)=k)) · e^{k(β'−β)} = Z(β)/Z(β'), an estimator for Z(β)/Z(β')
PROBLEM: P(H(W)=k) can be too small
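The identity on this slide can be checked exactly on a toy partition function (the coefficients a_k and the temperatures β, β' below are arbitrary choices, not from the talk):

```python
import math

# Toy model: explicit coefficients a_k, so Z and the slide's identity
# (P(H(U)=k)/P(H(W)=k)) * e^{k(beta'-beta)} = Z(beta)/Z(beta')
# can be verified exactly for every k.
a = [3.0, 1.0, 4.0, 1.0, 5.0]          # hypothetical a_0..a_4
beta, beta2 = 0.7, 1.3                 # beta2 plays the role of beta'

def Z(b):
    return sum(ak * math.exp(-b * k) for k, ak in enumerate(a))

for k in range(len(a)):
    p_W = a[k] * math.exp(-beta * k) / Z(beta)     # P(H(W)=k), W ~ mu_beta
    p_U = a[k] * math.exp(-beta2 * k) / Z(beta2)   # P(H(U)=k), U ~ mu_beta'
    lhs = (p_U / p_W) * math.exp(k * (beta2 - beta))
    assert abs(lhs - Z(beta) / Z(beta2)) < 1e-12
print("identity holds for every k")
```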
![Page 111: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/111.jpg)
Rough estimator for Z(β)/Z(β'): use an interval instead of a single value.
Z(β) = Σ_{k=0}^n a_k e^{−βk}
For W ~ μ_β we have P(H(W) ∈ [c,d]) = Σ_{k=c}^d a_k e^{−βk} / Z(β)
For U ~ μ_β' we have P(H(U) ∈ [c,d]) = Σ_{k=c}^d a_k e^{−β'k} / Z(β')
![Page 112: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/112.jpg)
Rough estimator for Z(β)/Z(β'): if |β − β'| · |d − c| ≤ 1 then
(P(H(U) ∈ [c,d]) / P(H(W) ∈ [c,d])) · e^{c(β'−β)}
is within a factor e of Z(β)/Z(β'), since
(Σ_{k=c}^d a_k e^{−β'k} / Σ_{k=c}^d a_k e^{−βk}) · e^{c(β'−β)} = Σ_{k=c}^d a_k e^{−β'(k−c)} / Σ_{k=c}^d a_k e^{−β(k−c)}
and each exponent satisfies 0 ≤ (k−c)|β − β'| ≤ 1.
We also need P(H(U) ∈ [c,d]) and P(H(W) ∈ [c,d]) to be large.
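A quick numerical sanity check of the rough estimator (the random coefficients a_k are hypothetical; the interval [c,d] and the temperatures are chosen so that |β − β'| · |d − c| = 1):

```python
import math, random

random.seed(0)
a = [random.uniform(0.5, 2.0) for _ in range(40)]   # hypothetical a_0..a_39
c, d = 10, 14                                       # the interval [c,d]
beta = 0.8
beta2 = beta + 1.0 / (d - c)                        # |beta - beta'| * (d - c) = 1

def Z(b):
    return sum(ak * math.exp(-b * k) for k, ak in enumerate(a))

pW = sum(a[k] * math.exp(-beta  * k) for k in range(c, d + 1)) / Z(beta)
pU = sum(a[k] * math.exp(-beta2 * k) for k in range(c, d + 1)) / Z(beta2)

est = (pU / pW) * math.exp(c * (beta2 - beta))      # the rough estimator
true = Z(beta) / Z(beta2)
ratio = est / true
assert math.exp(-1) <= ratio <= math.exp(1)         # within a factor e
print(ratio)
```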
![Page 113: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/113.jpg)
Split {0,1,...,n} into h ≤ 4(ln n)(ln A) intervals
[0],[1],[2],...,[c, c(1 + 1/ln A)],...
We will show: for any inverse temperature β there exists an interval I with P(H(W) ∈ I) ≥ 1/(8h).
We say that I is HEAVY for β.
![Page 114: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/114.jpg)
![Page 115: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/115.jpg)
Algorithm:
find an interval I which is heavy for the current inverse temperature β;
see how far I stays heavy (until some β*);
use the interval I as the "feeler" for Z(β)/Z(β') and Z(2β'−β)/Z(β');
repeat.
ANALYSIS: in each iteration we either
* make progress, or
* eliminate the interval I, or
* make a "long move".
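A highly idealized sketch of this loop, assuming exact access to Z (a real implementation only has a sampler oracle and must estimate the feeler; the coefficients a_k, the cap C, and the β range are arbitrary choices, not from the talk):

```python
import math

# Idealized schedule construction: Z is computed exactly from hypothetical
# coefficients a_k, and each step binary-searches for the largest beta'
# whose feeler Z(beta)Z(2beta'-beta)/Z(beta')^2 stays below C.
a = [1.0] * 21                     # hypothetical a_0..a_20
C = 2.0
BETA_MAX = 50.0

def Z(b):
    return sum(ak * math.exp(-b * k) for k, ak in enumerate(a))

def feeler(beta, beta2):           # = E[X^2]/E[X]^2 for the step estimator
    return Z(beta) * Z(2 * beta2 - beta) / Z(beta2) ** 2

def next_beta(beta):
    if feeler(beta, BETA_MAX) <= C:
        return BETA_MAX            # the rest of the range is already easy
    lo, hi = beta, BETA_MAX
    for _ in range(60):            # binary search for the threshold
        mid = (lo + hi) / 2
        if feeler(beta, mid) <= C:
            lo = mid
        else:
            hi = mid
    return lo

schedule = [0.0]
while schedule[-1] < BETA_MAX and len(schedule) < 200:
    schedule.append(next_beta(schedule[-1]))
print(len(schedule), [round(b, 3) for b in schedule])
```

On this toy instance the resulting schedule is short and its steps grow as the distribution concentrates, mirroring the adaptive behaviour described above.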
![Page 116: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/116.jpg)
(figure: distribution of H(X) for X ~ μ_β; I = a heavy interval at β)
![Page 117: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/117.jpg)
(figure: distribution of H(X) at a larger inverse temperature; I is no longer heavy there!)
![Page 118: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/118.jpg)
(figure: distribution of H(X) for X ~ μ_β'; I is heavy at β, still heavy at β', but NOT heavy beyond)
![Page 119: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/119.jpg)
(figure: along the β axis, I = [a,b] is heavy, heavy, then NOT heavy)
use binary search to find β* (localized to within 1/(2n): β* ≤ β ≤ β* + 1/(2n))
move by min{1/(b−a), ln A}
![Page 120: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/120.jpg)
use binary search to find β*
How do you know that you can use binary search?
![Page 121: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/121.jpg)
How do you know that you can use binary search?
Lemma: the set of inverse temperatures for which I is h-heavy is an interval.
I is h-heavy at β ⇔ P(H(X) ∈ I) ≥ 1/(8h) for X ~ μ_β, i.e.,
Σ_{k∈I} a_k e^{−βk} ≥ (1/(8h)) Σ_{k=0}^n a_k e^{−βk}
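The lemma can be sanity-checked numerically on a random instance: sweeping β over a grid, the heaviness indicator should change value at most twice (the coefficients a_k, the interval I, and h below are hypothetical choices, not from the talk):

```python
import math, random

# Numeric check: the set of beta where I is (1/8h)-heavy is an interval,
# so the indicator along a beta grid changes value at most twice.
random.seed(1)
n, h = 30, 8.0
a = [random.uniform(0.1, 3.0) for _ in range(n + 1)]  # hypothetical a_k
I = range(10, 16)                                     # the interval I = [10,15]

def heavy(beta):
    Zb = sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))
    ZI = sum(a[k] * math.exp(-beta * k) for k in I)
    return ZI >= Zb / (8 * h)

flags = [heavy(b / 100.0) for b in range(0, 500)]
changes = sum(1 for x, y in zip(flags, flags[1:]) if x != y)
assert changes <= 2      # heaviness pattern is 0...0 1...1 0...0
print(changes)
```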
![Page 122: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/122.jpg)
How do you know that you can use binary search?
Σ_{k∈I} a_k e^{−βk} ≥ (1/(8h)) Σ_{k=0}^n a_k e^{−βk}
Substitute x = e^{−β}; the condition becomes c₀x⁰ + c₁x¹ + c₂x² + ... + cₙxⁿ ≥ 0.
Descartes' rule of signs: the number of positive roots is at most the number of sign changes
(e.g., + − + + has 2 sign changes).
![Page 123: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/123.jpg)
How do you know that you can use binary search?
Here cₖ = aₖ(1[k ∈ I] − 1/(8h)), so the sign pattern of c₀, c₁, ..., cₙ is − ... − + ... + − ... −,
with at most 2 sign changes, hence at most 2 positive roots.
(Example: −1 + x + x² + x³ + ... + xⁿ has one sign change, hence at most one positive root.)
![Page 124: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/124.jpg)
![Page 125: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/125.jpg)
(figure: I is heavy on [β, β*], NOT heavy at β* + 1/(2n))
can roughly compute the ratio Z(β)/Z(β') for β' ∈ [β, β*] if |β − β'| · |b − a| ≤ 1 (I = [a,b])
![Page 126: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/126.jpg)
can roughly compute the ratio Z(β)/Z(β') for β' ∈ [β, β*] if |β − β'| · |b − a| ≤ 1 (I = [a,b])
find the largest β' such that (Z(β)/Z(β')) · (Z(2β'−β)/Z(β')) ≤ C; then either
1. success (we make progress), or
2. we eliminate the interval I, or
3. we make a "long move".
![Page 127: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/127.jpg)
![Page 128: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/128.jpg)
if we have sampler oracles for μ_β then we can get an adaptive schedule of length t = O*((ln A)^{1/2})
independent sets O*(n²) (using Vigoda'01, Dyer-Greenhill'01)
matchings O*(n² m) (using Jerrum, Sinclair'89)
spin systems: Ising model O*(n²) for β < β_C (using Marinelli, Olivieri'95)
k-colorings O*(n²) for k > 2Δ (using Jerrum'95)
![Page 129: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/129.jpg)
1. Counting problems
2. Basic tools: Chernoff, Chebyshev
3. Dealing with large quantities (the product method)
4. Statistical physics
5. Cooling schedules (our work)
6. More...
Outline
![Page 130: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/130.jpg)
6. More… a) proof of Dyer-Frieze
b) independent sets revisited
c) warm starts
Outline
![Page 131: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/131.jpg)
Appendix: proof of Theorem (Dyer-Frieze'91):
if 1) E[X₁ X₂ ... X_t] = "WANTED" and
2) V[Xᵢ]/E[Xᵢ]² = O(1) (the Xᵢ are easy to estimate),
then O(t²/ε²) samples (O(t/ε²) from each Xᵢ) give a (1 ± ε)-estimator of "WANTED" with probability ≥ 3/4.
![Page 132: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/132.jpg)
How precise do the Xᵢ have to be? First attempt: term by term.
(1 ± ε/t)(1 ± ε/t) ... (1 ± ε/t) ≈ 1 ± ε
Main idea: by Chebyshev, ≈ (V[X]/E[X]²)(1/ε²) ln(1/δ) samples give a (1 ± ε)-estimate of a single term with probability ≥ 1 − δ,
so a (1 ± ε/t)-estimate of each term takes Θ(t²/ε²) samples, Θ(t³/ε²) in total.
![Page 133: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/133.jpg)
How precise do the Xᵢ have to be? Analyzing the SCV is better (Dyer-Frieze'91):
P( X gives a (1 ± ε)-estimate ) ≥ 1 − (1/ε²) · V[X]/E[X]²
squared coefficient of variation (SCV): SCV(X) = V[X]/E[X]²
GOAL: SCV(X) ≤ ε²/4 for X = X₁ X₂ ... X_t
![Page 134: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/134.jpg)
How precise do the Xᵢ have to be? Analyzing the SCV is better (Dyer-Frieze'91):
SCV(X) = V[X]/E[X]² = E[X²]/E[X]² − 1
Main idea: SCV(Xᵢ) ≤ ε²/(5t) ⇒ SCV(X) < ε²/4, because
SCV(X) = (1 + SCV(X₁)) ... (1 + SCV(X_t)) − 1
![Page 135: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/135.jpg)
SCV(X) = (1 + SCV(X₁)) ... (1 + SCV(X_t)) − 1
proof: X₁, X₂ independent ⇒ E[X₁X₂] = E[X₁]E[X₂]
X₁, X₂ independent ⇒ X₁², X₂² independent
⇒ SCV(X₁X₂) = (1 + SCV(X₁))(1 + SCV(X₂)) − 1
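The product rule for the SCV of independent variables can be verified exactly on two small finite distributions (the values and probabilities below are arbitrary choices):

```python
# Exact check of SCV(X1 X2) = (1 + SCV(X1))(1 + SCV(X2)) - 1 for
# independent X1, X2 with small finite supports.
d1 = [(1.0, 0.2), (2.0, 0.5), (5.0, 0.3)]   # (value, probability) pairs
d2 = [(1.0, 0.6), (3.0, 0.4)]

def scv(d):
    m1 = sum(v * p for v, p in d)           # E[X]
    m2 = sum(v * v * p for v, p in d)       # E[X^2]
    return m2 / m1 ** 2 - 1

# product distribution of independent X1, X2
prod = [(v1 * v2, p1 * p2) for v1, p1 in d1 for v2, p2 in d2]
lhs = scv(prod)
rhs = (1 + scv(d1)) * (1 + scv(d2)) - 1
assert abs(lhs - rhs) < 1e-12
print(lhs)
```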
![Page 136: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/136.jpg)
How precise do the Xᵢ have to be? Analyzing the SCV is better (Dyer-Frieze'91):
X = X₁ X₂ ... X_t; main idea: SCV(Xᵢ) ≤ ε²/(5t) ⇒ SCV(X) < ε²/4
each term takes Θ(t/ε²) samples, Θ(t²/ε²) in total.
![Page 137: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/137.jpg)
6. More… a) proof of Dyer-Frieze
b) independent sets revisited
c) warm starts
Outline
![Page 138: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/138.jpg)
(figure: Hamiltonian values of example configurations: 0, 1, 2, 4)
![Page 139: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/139.jpg)
Hamiltonian – many possibilities
(figure: example configurations with H = 0, 1, 2)
(hardcore lattice gas model)
![Page 140: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/140.jpg)
What would be a natural Hamiltonian for planar graphs?
![Page 141: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/141.jpg)
What would be a natural Hamiltonian for planar graphs?
H(G) = number of edges
natural MC: pick u,v uniformly at random;
with probability λ/(1+λ) try G + {u,v},
with probability 1/(1+λ) try G − {u,v}
![Page 142: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/142.jpg)
natural MC: pick u,v uniformly at random; with probability λ/(1+λ) try G + {u,v}, with probability 1/(1+λ) try G − {u,v}
(figure: G and G' = G + {u,v})
P(G,G') = λ / ((1+λ) · n(n−1)/2), P(G',G) = 1 / ((1+λ) · n(n−1)/2)
![Page 143: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/143.jpg)
P(G,G') = λ / ((1+λ) · n(n−1)/2), P(G',G) = 1 / ((1+λ) · n(n−1)/2)
π(G) ∝ λ^{number of edges of G}
satisfies the detailed balance condition:
π(G) P(G,G') = π(G') P(G',G)
(λ = exp(−β))
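A toy simulation of this chain on all subgraphs of K₄ (ignoring the planarity restriction; the values of λ, n, and the step count are arbitrary): each edge is flipped as on the slide, and under π ∝ λ^{#edges} every edge should be present independently with probability λ/(1+λ):

```python
import random

# Simulate the slide's chain on subgraphs of K_4: pick a pair u,v uniformly,
# add the edge with prob lambda/(1+lambda), remove it with prob 1/(1+lambda).
random.seed(2)
n, lam, steps = 4, 0.5, 200_000
pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
G = set()
total_edges = 0
for _ in range(steps):
    e = random.choice(pairs)                  # pick u,v uniformly at random
    if e in G:
        if random.random() < 1 / (1 + lam):   # try G - {u,v}
            G.remove(e)
    else:
        if random.random() < lam / (1 + lam): # try G + {u,v}
            G.add(e)
    total_edges += len(G)

# under pi each edge is independently present with prob lam/(1+lam)
expected = len(pairs) * lam / (1 + lam)
print(total_edges / steps, expected)
```

The empirical average edge count matches the stationary prediction, which is one way to see that the update satisfies detailed balance.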
![Page 144: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/144.jpg)
6. More… a) proof of Dyer-Frieze
b) independent sets revisited
c) warm starts
Outline
![Page 145: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/145.jpg)
Mixing time: τ_mix = smallest t such that |μ_t − μ|_TV ≤ 1/e
Relaxation time: τ_rel = 1/(1 − λ₂)
τ_rel ≤ τ_mix ≤ τ_rel · ln(1/μ_min)
(figure: example with τ_mix = Θ(n ln n) and τ_rel = Θ(n), shown for n = 3)
(discrepancy may be substantially bigger for, e.g., matchings)
![Page 146: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/146.jpg)
Mixing time: τ_mix = smallest t such that |μ_t − μ|_TV ≤ 1/e
Relaxation time: τ_rel = 1/(1 − λ₂)
Estimating μ(S), METHOD 1: independent samples X₁, X₂, X₃, ..., X_s ~ μ;
Y = { 1 if X ∈ S, 0 otherwise }; E[Y] = μ(S)
![Page 147: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/147.jpg)
Estimating μ(S), METHOD 1: independent samples X₁, X₂, X₃, ..., X_s ~ μ
METHOD 2 (Gillman'98, Kahale'96, ...): use all the samples X₁ X₂ X₃ ... X_s along a single run of the chain
![Page 148: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/148.jpg)
Further speed-up:
|μ_t − μ|_TV ≤ exp(−t/τ_rel) · Var_μ(μ₀/μ)^{1/2}
μ₀ with (Σ_x μ(x)(μ₀(x)/μ(x) − 1)²)^{1/2} small is called a warm start
METHOD 2 (Gillman'98, Kahale'96, ...)
![Page 149: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/149.jpg)
Further speed-up (METHOD 2, with warm starts):
a sample at β can be used as a warm start at β',
so the cooling schedule can step from β to β'.
![Page 150: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/150.jpg)
a sample at β can be used as a warm start at β'
β₀, β₁, β₂, β₃, ..., β_m = "well mixed" states
m = O( (ln n)(ln A) )
![Page 151: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/151.jpg)
β₀, β₁, β₂, β₃, ..., β_m = "well mixed" states
METHOD 2: X₁ X₂ X₃ ... X_s
run our cooling-schedule algorithm with METHOD 2, using the "well mixed" states as starting points
![Page 152: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/152.jpg)
Output of our algorithm: β₀, β₁, ..., β_k with k = O*((ln A)^{1/2})
small augmentation of β₀, β₁, β₂, β₃, ..., β_m (so that a sample at the current β can be used as a warm start at the next); still O*((ln A)^{1/2})
Use an analogue of Dyer-Frieze for independent samples from vector variables with slightly dependent coordinates.
![Page 153: Adaptive annealing: a near-optimal connection between sampling and counting](https://reader036.vdocument.in/reader036/viewer/2022062521/56816868550346895dded0cb/html5/thumbnails/153.jpg)
if we have sampler oracles for μ_β then we can get an adaptive schedule of length t = O*((ln A)^{1/2})
independent sets O*(n²) (using Vigoda'01, Dyer-Greenhill'01)
matchings O*(n² m) (using Jerrum, Sinclair'89)
spin systems: Ising model O*(n²) for β < β_C (using Marinelli, Olivieri'95)
k-colorings O*(n²) for k > 2Δ (using Jerrum'95)