Slow and Fast Mixing of Tempering and Swapping for the Potts Model
Nayantara Bhatnagar, UC Berkeley
Dana Randall, Georgia Tech
Markov Chains
K = (Ω, P)
Theorem: If K is connected and “aperiodic”, the Markov chain $X_0, X_1, \ldots$ converges in the limit to a unique stationary distribution π over Ω:
$\lim_{t \to \infty} \Pr[X_t = Y \mid X_0] = \pi(Y)$
If P(X,Y) = P(Y,X) for all X, Y, π is uniform over Ω.
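A small numerical check of the two statements above, as a sketch only: the 3-state transition matrix `P` is made up for illustration (symmetric, connected, with self-loops for aperiodicity), and the helper names are not from the talk.

```python
def step(dist, P):
    """One step of the chain: new_dist[y] = sum_x dist[x] * P[x][y]."""
    n = len(P)
    return [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]

def distribution_after(P, start, t):
    """Distribution of X_t given X_0 = start."""
    dist = [0.0] * len(P)
    dist[start] = 1.0
    for _ in range(t):
        dist = step(dist, P)
    return dist

# Hypothetical symmetric chain: P[x][y] == P[y][x]; self-loops give aperiodicity.
P = [
    [0.5, 0.3, 0.2],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
]

limit = distribution_after(P, start=0, t=200)
print(limit)  # every entry should be close to 1/3
```

Starting from any other state gives the same limit, which is the content of the theorem for this toy chain.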
Introduction: Markov Chain Monte Carlo
Markov Chains:
• Matchings – Broder’s Markov chain
• Colorings – Glauber dynamics
• Independent Sets – Glauber dynamics
• Partition functions of Ising, Potts models – Glauber dynamics
• Volume of a convex body – Ball walk, Lattice walk
Mixing Time, T: time to get within 1/4 in variation distance to π.
Rapid mixing (polynomial), slowly mixing (exponential).
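For a chain small enough to write down as a matrix, the mixing time defined above can be computed directly. The sketch below assumes the worst case over start states and uses a lazy random walk on a cycle as a made-up test chain; `mixing_time` and `lazy_cycle` are illustrative names.

```python
def tv_distance(mu, nu):
    """Total variation distance between two distributions given as lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def mixing_time(P, pi, eps=0.25, t_max=10_000):
    """Smallest t with TV(P^t(x, .), pi) <= eps for every start state x."""
    n = len(P)
    # rows[x] holds the distribution of X_t given X_0 = x.
    rows = [[1.0 if x == y else 0.0 for y in range(n)] for x in range(n)]
    for t in range(t_max + 1):
        if max(tv_distance(row, pi) for row in rows) <= eps:
            return t
        rows = [[sum(row[z] * P[z][y] for z in range(n)) for y in range(n)]
                for row in rows]
    return None  # did not get within eps in t_max steps

def lazy_cycle(n):
    """Lazy walk on an n-cycle: stay w.p. 1/2, move to each neighbor w.p. 1/4."""
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        P[x][x] = 0.5
        P[x][(x + 1) % n] += 0.25
        P[x][(x - 1) % n] += 0.25
    return P

print(mixing_time(lazy_cycle(8), [1 / 8] * 8))
print(mixing_time(lazy_cycle(16), [1 / 16] * 16))
```

This brute-force approach only works for tiny state spaces; for the chains in this talk, Ω is exponentially large, which is why the proof techniques listed next are needed.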
Techniques for proving rapid mixing:
Coupling, Spectral Gap, Conductance and isoperimetry, Multicommodity flows, Decomposition, Comparison ...
What if the natural Markov chain is slowly mixing?
The q-state Potts Model
q-state Ferromagnetic Potts Model:
Underlying graph: G(V,E)
Configurations: $\Omega = \{ x : x \in [q]^n \}$
Inverse temperature β > 0,
$\pi_\beta(x) \propto e^{\beta H(x)}$, $H(x) = \sum_{(i,j) \in E} \delta_{x_i = x_j}$
Glauber dynamics Markov Chain:
• Choose $(v, c_{t+1}(v)) \in_R V \times [q]$.
• Update $c_t(v)$ to $c_{t+1}(v)$ with Metropolis probabilities.
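The Glauber update can be sketched in a few lines of Python. This is a minimal illustration, assuming the stationary distribution is proportional to exp(β·H(x)) with H counting monochromatic edges, and assuming the Metropolis rule accepts with probability min(1, exp(β·ΔH)); the function names and the parameter values in the usage lines are made up.

```python
import math
import random

def complete_graph_neighbors(n):
    """Adjacency lists of K_n."""
    return [[u for u in range(n) if u != v] for v in range(n)]

def glauber_step(x, neighbors, q, beta, rng):
    """One Metropolis update: pick (v, c) uniformly from V x [q], then accept
    the recoloring of v with probability min(1, exp(beta * dH))."""
    v = rng.randrange(len(x))
    c = rng.randrange(q)
    old = x[v]
    # Only edges at v change: dH = (#neighbors colored c) - (#neighbors colored old).
    gain = sum(1 for u in neighbors[v] if x[u] == c)
    loss = sum(1 for u in neighbors[v] if x[u] == old)
    dH = gain - loss
    if dH >= 0 or rng.random() < math.exp(beta * dH):
        x[v] = c
    return x

rng = random.Random(1)
n, q, beta = 6, 3, 0.5          # illustrative values only
x = [rng.randrange(q) for _ in range(n)]
nbrs = complete_graph_neighbors(n)
for _ in range(1000):
    glauber_step(x, nbrs, q, beta, rng)
print(x)
```

Because only the edges at v are affected, each update costs time proportional to the degree of v rather than the number of edges.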
Why Simulated Tempering
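[Figure: $\pi_\beta(x)$ plotted against $H(x)$]
Glauber dynamics mixes slowly for the q-state Potts model on $K_n$ for q ≥ 2, at large enough β.
Conductance: [Jerrum-Sinclair ’89, Lawler-Sokal ’88]
$\Phi_S = \Pr[X_{t+1} \notin S \mid X_t \sim \pi(S)]$, $\Phi = \min_{S:\ \pi(S) \le 1/2} \Phi_S$
[Figure: the cut between $S$ and $S^c$]
Theorem: $\frac{c_1}{\Phi} \le T \le \frac{c_2}{\Phi^2}$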
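To make the conductance definition concrete, the sketch below computes Φ by brute force over all subsets S with π(S) ≤ ½, reading Φ_S as the probability of leaving S in one step when the current state is drawn from π conditioned on S. The 4-state chain with a weak middle edge `eps` is a made-up example of a bottleneck.

```python
from itertools import combinations

def conductance(P, pi):
    """Return (Phi, S) minimizing Phi_S = sum_{x in S, y not in S} pi[x] P[x][y] / pi(S)
    over nonempty S with pi(S) <= 1/2. Exponential in the number of states."""
    n = len(P)
    best = None
    for k in range(1, n):
        for S in combinations(range(n), k):
            in_S = set(S)
            mass = sum(pi[x] for x in S)
            if mass > 0.5 + 1e-12:
                continue
            flow = sum(pi[x] * P[x][y] for x in S for y in range(n) if y not in in_S)
            phi_S = flow / mass
            if best is None or phi_S < best[0]:
                best = (phi_S, S)
    return best

eps = 0.01  # hypothetical weak link between the clusters {0, 1} and {2, 3}
P = [
    [0.5, 0.5,       0.0,       0.0],
    [0.5, 0.5 - eps, eps,       0.0],
    [0.0, eps,       0.5 - eps, 0.5],
    [0.0, 0.0,       0.5,       0.5],
]
pi = [0.25] * 4  # P is symmetric, so the stationary distribution is uniform

phi, S = conductance(P, pi)
print(phi, S)
```

The minimizing cut separates the two clusters, and shrinking `eps` shrinks Φ, which by the theorem forces the mixing time up.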
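Simulated Tempering [Marinari-Parisi ’92]
Define inverse temperatures $0 = \beta_0 < \cdots < \beta_M = \beta$ and distributions $\pi_0, \pi_1, \ldots, \pi_M = \pi_\beta$ on Ω.
$\beta_i = \beta_M \cdot \frac{i}{M}$
$\hat\Omega = \Omega \times [M+1]$, $\hat\pi(x,i) = \frac{1}{M+1}\,\pi_i(x)$
Tempering Markov Chain:
From (x,i),
• W.p. ½, Glauber dynamics at $\beta_i$
• W.p. ½, randomly move to (x, i±1)
[Figure: the ladder of distributions $\pi_M, \ldots, \pi_0$]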
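One move of the tempering chain might look as follows. This is a sketch under stated assumptions: the temperature move is accepted with the Metropolis probability min(1, π_j(x)/π_i(x)), which is a standard choice that keeps π̂(x,i) = π_i(x)/(M+1) stationary but is not spelled out on the slide; the distributions are passed in already normalized; and the 5-state bimodal example with weights `weights` is invented for illustration.

```python
import random

def tempering_step(x, i, pis, local_step, rng):
    """One move on pairs (x, i). pis[i][x] is the normalized, positive probability
    pi_i(x); local_step(x, i, rng) is any move reversible w.r.t. pi_i
    (for example Glauber dynamics at beta_i)."""
    M = len(pis) - 1
    if rng.random() < 0.5:
        return local_step(x, i, rng), i
    j = i + rng.choice((-1, 1))
    if j < 0 or j > M:
        return x, i  # proposal falls off the temperature ladder: stay put
    if rng.random() < min(1.0, pis[j][x] / pis[i][x]):
        return x, j
    return x, i

# Toy instance: pi_i(x) proportional to weights[x] ** (i / M) on states 0..4.
M = 4
weights = [8.0, 1.0, 0.2, 1.0, 8.0]
pis = []
for i in range(M + 1):
    w = [wx ** (i / M) for wx in weights]
    Z = sum(w)
    pis.append([wx / Z for wx in w])

def path_walk(x, i, rng):
    """Metropolis random walk on the path 0..4 with stationary distribution pis[i]."""
    y = x + rng.choice((-1, 1))
    if 0 <= y < len(weights) and rng.random() < min(1.0, pis[i][y] / pis[i][x]):
        return y
    return x

rng = random.Random(0)
x, i = 0, M
visited = set()
for _ in range(5000):
    x, i = tempering_step(x, i, pis, path_walk, rng)
    visited.add((x, i))
print(len(visited))
```

Note that the acceptance ratio needs the normalizing constants of π_i and π_j, which is one practical difference from the swapping chain.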
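Swapping [Geyer ’91]
Define inverse temperatures $0 = \beta_0 < \cdots < \beta_M = \beta$ and distributions $\pi_0, \pi_1, \ldots, \pi_M = \pi_\beta$ on Ω.
$\beta_i = \beta_M \cdot \frac{i}{M}$
$\hat\Omega = \Omega^{[M+1]}$, $\hat\pi(x) = \prod_i \pi_i(x_i)$
Swapping Markov Chain:
From x, choose random i
• W.p. ½, Glauber dynamics at $\beta_i$
• W.p. ½, move to $x^{(i,i+1)}$
[Figure: the ladder of distributions $\pi_M, \ldots, \pi_0$]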
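A sketch of one swapping move, assuming each π_k(x) is proportional to exp(β_k·H(x)) and that the exchange of x_i and x_{i+1} is accepted with the usual Metropolis probability (an assumption; the slide only says "move to x(i,i+1)"). With this rule the normalizing constants cancel, so only the energies are needed. The names `swap_acceptance` and `swapping_step`, and the toy energy in the usage lines, are illustrative.

```python
import math
import random

def swap_acceptance(beta_i, beta_j, H_i, H_j):
    """min(1, pi_i(x_j) pi_j(x_i) / (pi_i(x_i) pi_j(x_j)))
    = min(1, exp((beta_i - beta_j) * (H_j - H_i))); Z_i and Z_j cancel."""
    exponent = (beta_i - beta_j) * (H_j - H_i)
    return 1.0 if exponent >= 0 else math.exp(exponent)

def swapping_step(xs, betas, H, local_step, rng):
    """xs[k] is the configuration at inverse temperature betas[k]."""
    M = len(betas) - 1
    xs = list(xs)
    i = rng.randrange(M + 1)
    if rng.random() < 0.5:
        # Local move (for example Glauber dynamics) at beta_i.
        xs[i] = local_step(xs[i], betas[i], rng)
    elif i < M:
        a = swap_acceptance(betas[i], betas[i + 1], H(xs[i]), H(xs[i + 1]))
        if rng.random() < a:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs

# Toy usage: integer "configurations" with energy H(x) = x and a do-nothing local move.
betas = [0.0, 0.25, 0.5, 0.75, 1.0]
out = swapping_step([3, 1, 4, 1, 5], betas, lambda x: x, lambda x, b, r: x, random.Random(0))
print(out)
```

A swap that sends the higher-energy configuration to the larger β is always accepted; the reverse exchange is accepted with probability exp(−|Δβ·ΔH|).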
Theoretical Results
• Madras-Zheng ’99:
∙ Tempering mixes rapidly at all temperatures for the ferromagnetic Ising model (Potts model, q = 2) on Kn.
∙ Rapid mixing for symmetric bimodal exponential distribution on an interval.
• Zheng ’99:
∙ Rapid mixing of swapping implies tempering mixes rapidly.
• B-Randall ’04:
∙ Simulated Tempering mixes slowly for 3-state ferromagnetic Potts model on Kn.
∙ Modified swapping algorithm is rapidly mixing for mean-field Ising model with an external field.
• Woodard, Schmidler, Huber ’08:
∙ Sufficient conditions for rapid mixing of tempering and swapping.
∙ Sufficient conditions for torpid mixing of tempering and swapping.
In This Talk:
B-Randall ’04: Tempering and swapping for the mean-field Potts model.
• Slow Mixing: Tempering can be slowly mixing for any choice of temperatures.
• Rapid Mixing: Alternative tempered distributions for rapid mixing.
Tempering for Potts Model
Theorem [BR]: There exists $\beta_{crit} > 0$ such that tempering for the Potts model on $K_n$ at $\beta_{crit}$ mixes slowly.
Proof idea: Bound conductance on $\hat\Omega = \Omega \times [M+1]$.
The space Ω is partitioned into equivalence classes σ:
• Cut depends on number of vertices of each color.
• Induces the same cut on Ω at each $\beta_i$.
[Figure: the simplex of color counts, with the points (n,0,0), (n/2, 0, n/2) and (0,0,n) labeled]
Stationary Distribution of Tempering Chain
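$\pi_i(\sigma) \propto \binom{n}{\sigma_R\ \sigma_B\ \sigma_G}\, e^{\beta_i\left((\sigma_R)^2 + (\sigma_B)^2 + (\sigma_G)^2\right)}$
[Figure: $\pi_i(\sigma)$ at $\beta_0$, …, at $0 < \beta_i < \beta_{crit}$, …, and at $\beta_{crit}$, with the disordered mode and the ordered mode marked]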
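The class weights can be explored numerically. The sketch below takes the formula as written on the slide, a multinomial coefficient times exp(β_i(σ_R² + σ_B² + σ_G²)), works in log space, and finds the heaviest class for a given β_i. Any rescaling of β with n is left to the caller, and the values n = 30, β = 0 and β = 1 are chosen only to show the two extremes.

```python
import math

def log_weight(sigma, beta):
    """Unnormalized log pi_i(sigma) for sigma = (sigma_R, sigma_B, sigma_G)."""
    n = sum(sigma)
    log_multinomial = math.lgamma(n + 1) - sum(math.lgamma(s + 1) for s in sigma)
    return log_multinomial + beta * sum(s * s for s in sigma)

def classes(n):
    """All color-count vectors (r, b, g) with r + b + g = n."""
    return [(r, b, n - r - b) for r in range(n + 1) for b in range(n + 1 - r)]

def mode(n, beta):
    """A class of maximum weight (ties broken by enumeration order)."""
    return max(classes(n), key=lambda s: log_weight(s, beta))

print(mode(30, 0.0))  # entropy only: the balanced, disordered class
print(mode(30, 1.0))  # energy dominates: an ordered class with a single color
```

Between these extremes the entropy and energy terms compete, which is where the disordered and ordered modes in the figure coexist.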
Tempering Fails to Converge
[Figure: the distributions at $\beta_0$, …, $0 < \beta_i < \beta_{crit}$, …, $\beta_{crit}$]
At $\beta_{crit}$ tempering mixes slowly for any set of intermediate temperatures.
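Swapping and Tempering for Asymmetric Distributions – Rapid Mixing
• Asymmetric exponential: $\pi(x) \propto C^{|x|}$, $x \in [-n_1, n_2]$, $n_1 > n_2$
• Ising model with an external field: $\pi_\beta(x) \propto e^{\beta H(x)}$, $H(x) = \sum_{(i,j) \in E} \delta_{x_i = x_j} + B \sum_i \delta_{x_i = +}$
• Potts model on $K_R$, the line $\sigma_B = \sigma_G \le n/3$: $\pi_\beta(x) \propto e^{\beta H(x)}$, $H(x) = \sum_{(i,j) \in E} \delta_{x_i = x_j}$
[Figures: the asymmetric exponential on an axis marked $-n_1$, $0$, $n_2$; the Potts line on an axis from $0$ to $n$ with ticks at $n/3$, $n/2$, $2n/3$]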
Decomposition of Swapping Chain
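$\pi_i(x) \propto C^{|x| \cdot i/M}$
Madras-Randall ’02: Decomposition for Markov chains
1. Mixing of restricted chains $R_{0,i}$ and $R_{1,i}$ at each temperature.
2. Mixing of the projection chain P.
$T_{swap} \le C \min_{b \in \{0,1\},\ i \le M} T_{R_{b,i}} \times T_P$
[Figure: the interval on an axis marked $-n_1$, $0$, $n_2$]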
Projection for Swapping chain (example moves): 011010 → 010110, 011010 → 011011
Weighted Cube (WC) (example move): 011010 → 010010
Up to polynomial factors, $\pi_i(0) \approx C^{n_1 i/M}/Z_i$ and $\pi_i(1) \approx C^{n_2 i/M}/Z_i$.
Lemma: If for i > j,
$\pi_i(1)\,\pi_j(0) \le p(n)\,\pi_i(0)\,\pi_j(1)$,
then $T_P \le q(n)\, T_{WC}$.
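The hypothesis of the lemma can be checked numerically for the asymmetric exponential example. The sketch below assumes π_i(x) is proportional to C^(|x|·i/M) on the integers in [−n1, n2], takes side 0 to be {x < 0} and side 1 to be {x ≥ 0} (the exact placement of the cut is an assumption here), and uses p(n) = 1; the parameter values are made up.

```python
def side_masses(C, n1, n2, i, M):
    """Return (pi_i(0), pi_i(1)) for pi_i(x) proportional to C ** (|x| * i / M)."""
    weights = {x: C ** (abs(x) * i / M) for x in range(-n1, n2 + 1)}
    Z = sum(weights.values())
    side0 = sum(w for x, w in weights.items() if x < 0) / Z
    side1 = sum(w for x, w in weights.items() if x >= 0) / Z
    return side0, side1

def lemma_condition_holds(C, n1, n2, M, p=1.0):
    """Check pi_i(1) * pi_j(0) <= p * pi_i(0) * pi_j(1) for all M >= i > j >= 0."""
    masses = [side_masses(C, n1, n2, i, M) for i in range(M + 1)]
    for i in range(M + 1):
        for j in range(i):
            (i0, i1), (j0, j1) = masses[i], masses[j]
            if i1 * j0 > p * i0 * j1 * (1 + 1e-12):  # small slack for rounding
                return False
    return True

print(lemma_condition_holds(C=2.0, n1=6, n2=4, M=5))
```

With n1 > n2 the larger mode gains relative weight as i grows, so the condition is expected to hold; swapping the roles of n1 and n2 in the call shows that the labeling of the two sides matters.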
Flat-Swap: Fast Mixing for Mean-Field Models
• Modify more than just temperature
• Define π’M … π’0 so cut is not preserved.
$\pi_i(\sigma) \propto \binom{n}{\sigma_R\ \sigma_B\ \sigma_G}\, e^{\beta_i\left((\sigma_R)^2 + (\sigma_B)^2 + (\sigma_G)^2\right)}$
$\pi'_i(\sigma) = \pi_i(\sigma)\, f_i(\sigma) = \pi_i(\sigma) \binom{n}{\sigma_R\ \sigma_B\ \sigma_G}^{\frac{i-M}{M}}$
$\pi'_i(\sigma) \propto \binom{n}{\sigma_R\ \sigma_B\ \sigma_G}^{\frac{i}{M}} e^{\beta_i\left((\sigma_R)^2 + (\sigma_B)^2 + (\sigma_G)^2\right)}$
[Figure: the distributions $\pi'_M, \ldots, \pi'_0$ on an axis from $0$ to $n$ with ticks at $n/3$, $n/2$, $2n/3$]
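The flattened distributions can be sketched directly from their definition, π'_i(σ) = π_i(σ)·f_i(σ) with f_i(σ) equal to the multinomial coefficient raised to the power (i−M)/M. The code below works with unnormalized log weights and assumes β_i = β·i/M; the function names and the values n = 30, M = 5, β = 1 are illustrative only.

```python
import math

def log_multinomial(sigma):
    """log of n! / (sigma_R! sigma_B! sigma_G!) with n = sum(sigma)."""
    n = sum(sigma)
    return math.lgamma(n + 1) - sum(math.lgamma(s + 1) for s in sigma)

def log_pi(sigma, beta_i):
    """Unnormalized log pi_i(sigma)."""
    return log_multinomial(sigma) + beta_i * sum(s * s for s in sigma)

def log_pi_flat(sigma, i, M, beta):
    """Unnormalized log pi'_i(sigma) = log pi_i(sigma) + ((i - M) / M) * log multinomial,
    assuming beta_i = beta * i / M."""
    beta_i = beta * i / M
    return log_pi(sigma, beta_i) + ((i - M) / M) * log_multinomial(sigma)

M, beta = 5, 1.0
balanced, ordered = (10, 10, 10), (30, 0, 0)
for i in (0, M):
    print(i, log_pi_flat(balanced, i, M, beta), log_pi_flat(ordered, i, M, beta))
```

At i = 0 both the entropy term and the energy term vanish, so every class σ gets the same weight; at i = M the original π_M is recovered. Flattening the entropy along with the temperature is what "modify more than just temperature" refers to.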
Flat Swap for Mean-Field Models
Theorem [B-Randall]:
• Flat swap for the 3-state Potts model on $K_R$ using the distributions π’M … π’0 mixes rapidly at every temperature.
• Flat swap mixes rapidly for the mean field Ising model at every temperature and for any external field B.
Lemma: For i > j, π’i(0) π’j(1) p(n)π’i(1) π’j(0)
Summary and Open problems
Summary
• Insight into why tempering can fail to converge.
• Designing more robust tempering algorithms.
Open Problems
• Simulated tempering algorithms for other problems?
• Relative complexity of swapping and tempering
Tempering vs. Fixed Temperature
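[Figure: distributions from $\beta_0$ up to $\beta_M > \beta_{crit}$, on an axis from $0$ to $n$ with ticks at $n/3$, $n/2$, $2n/3$]
Theorem [BR]: On the line $K_R$, $\sigma_G = \sigma_B \le n/3$, tempering mixes slower than Metropolis at $\beta_M > \beta_{crit}$ by an exponential factor.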