ON THE OPTIMAL USE OF THE BETHE APPROXIMATION FOR MODELS ON GRAPHS WITH LOOPS
GABRIELE PERUGINI TUTOR: PROF. FEDERICO RICCI TERSENGHI
OUTLINE OF TALK
INTRODUCTION
the Bethe approximation [4 slides]
Belief propagation algorithm [2 slides]
ONGOING PROJECTS
Bethe approximation and the T=0 random field Ising model [6 slides]
Belief propagation inspired heuristics for spin glass optimization problems [3 slides]
FUTURE PROJECTS
INTRODUCTION
MODELS
Interacting Ising spins on a generic graph
Probability of finding the system in a configuration {s} with a given energy (cost function): the Gibbs distribution
too many variables!
PROBLEMS ONE WOULD LIKE TO SOLVE [on a given graph]
compute marginal distributions
sample from the Gibbs distribution
compute the free energy
HOW CAN WE DO IT
approximate the Gibbs distribution
few variables, hope errors are under control
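For reference, the objects named on this slide can be written out as follows. The formulas on the original slide are not part of this transcript, so the sign conventions and the symbols J_{ij}, h_i, \beta and E (the edge set) are assumptions, chosen to match the standard Ising notation.

```latex
% Assumed conventions: s_i = \pm 1, couplings J_{ij} on the edges E of the graph,
% local fields h_i, inverse temperature \beta.
\begin{align}
  H(\{s\}) &= -\sum_{(ij)\in E} J_{ij}\, s_i s_j - \sum_i h_i s_i
      && \text{(energy / cost function)} \\
  P(\{s\}) &= \frac{e^{-\beta H(\{s\})}}{Z}, \qquad
  Z = \sum_{\{s\}} e^{-\beta H(\{s\})}
      && \text{(Gibbs distribution)} \\
  P_i(s_i) &= \sum_{\{s\}\setminus s_i} P(\{s\})
      && \text{(marginal distribution)} \\
  F &= -\frac{1}{\beta}\,\ln Z
      && \text{(free energy)}
\end{align}
```

The sums over {s} run over 2^N configurations, which is the "too many variables" problem that the approximations in the talk are meant to avoid.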
NAIVE MEAN FIELD APPROXIMATION
approximate the Gibbs distribution by a product of single-spin marginals
parametrize each marginal with one number (local magnetization)
write the free energy in terms of the local magnetizations
the local magnetizations are those that minimize the free energy
exact for weak interactions (e.g. fully connected models)
No correlations!
Useful only for oversimplified models
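The naive mean field recipe above can be sketched in a few lines. This is a minimal illustration, not the code used in the talk: it assumes the energy H = -sum_ij J_ij s_i s_j - sum_i h_i s_i, for which minimizing the mean field free energy gives the self-consistency equations m_i = tanh(beta (h_i + sum_j J_ij m_j)). The function name, the damping parameter and the 4-spin ring are all hypothetical choices.

```python
import math

def naive_mean_field(J, h, beta, damping=0.5, max_iter=10_000, tol=1e-12):
    """Iterate m_i = tanh(beta * (h_i + sum_j J[i][j] * m_j)) to a fixed point.

    J is a symmetric matrix (list of lists), h a list of local fields.
    Damping is an illustrative choice to help convergence.
    """
    n = len(h)
    m = [0.0] * n
    for _ in range(max_iter):
        target = [math.tanh(beta * (h[i] + sum(J[i][j] * m[j] for j in range(n))))
                  for i in range(n)]
        new_m = [damping * m[i] + (1.0 - damping) * target[i] for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_m, m)) < tol:
            return new_m
        m = new_m
    return m

# Hypothetical example: a ring of 4 spins with uniform coupling 0.2.
c = 0.2
J_ring = [[0, c, 0, c],
          [c, 0, c, 0],
          [0, c, 0, c],
          [c, 0, c, 0]]
h_ring = [0.1, -0.2, 0.3, 0.0]
m_ring = naive_mean_field(J_ring, h_ring, beta=1.0)

# How far the result is from satisfying the self-consistency equations.
residual = max(
    abs(m_ring[i] - math.tanh(h_ring[i] + sum(J_ring[i][j] * m_ring[j] for j in range(4))))
    for i in range(4)
)
```

Each spin only sees the average magnetization of its neighbors, which is why no correlations survive in this approximation.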
BETHE APPROXIMATION (1/2)
approximate the Gibbs distribution using single-spin and neighboring-pair marginals
parametrize the marginals with:
the local magnetizations
the connected correlations between neighboring spins
write down the parametrized (Bethe) free energy
find the parameters that minimize the free energy
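The slide's formulas are missing from this transcript. As a sketch in the standard textbook form, the Bethe construction can be written as below, where the symbols m_i, c_{ij}, d_i (the degree of site i) and the energy convention H = -\sum J_{ij} s_i s_j - \sum h_i s_i are assumptions rather than the author's notation.

```latex
\begin{align}
  P(\{s\}) &\approx \prod_i P_i(s_i) \prod_{(ij)\in E}
      \frac{P_{ij}(s_i,s_j)}{P_i(s_i)\,P_j(s_j)} \\
  P_i(s_i) &= \frac{1 + m_i s_i}{2}, \qquad
  P_{ij}(s_i,s_j) = \frac{1 + m_i s_i + m_j s_j + (c_{ij} + m_i m_j)\, s_i s_j}{4} \\
  \beta F_{\mathrm{Bethe}} &= -\beta \sum_{(ij)\in E} J_{ij}\,(c_{ij} + m_i m_j)
      - \beta \sum_i h_i m_i \nonumber \\
  &\quad + \sum_{(ij)\in E} \sum_{s_i,s_j} P_{ij} \ln P_{ij}
      - \sum_i (d_i - 1) \sum_{s_i} P_i \ln P_i
\end{align}
```

With this parametrization, m_i is the average of s_i and c_{ij} is the average of s_i s_j minus m_i m_j, so the free energy depends on one number per site and one per edge.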
BETHE APPROXIMATION (2/2)
exact on trees
asymptotically exact on locally tree-like graphs
Many states (RSB?)
Small loops ?
BELIEF PROPAGATION ALGORITHM (1/2)
Born as a way of computing exact marginals on trees
Later, it was realized that:
FIXED POINTS OF B.P. = STATIONARY POINTS (E.G. MINIMA) OF THE BETHE FREE ENERGY [Yedidia, 2001]
Each edge carries a message
Update each message using incoming messages
Compute marginals (e.g. the local magnetizations m_i) with the fixed-point messages
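The three steps above (one message per edge, update from incoming messages, marginals from the fixed point) can be sketched for an Ising model as follows. This is a hedged illustration, not the speaker's implementation: messages are stored as cavity fields, the energy is assumed to be H = -sum J_ij s_i s_j - sum h_i s_i, and every function and variable name is hypothetical. A brute-force enumeration is included only to check the result on a small tree, where BP is expected to be exact.

```python
import math
from itertools import product

def bp_magnetizations(n, couplings, h, beta, max_iter=1000, tol=1e-12):
    """Belief propagation for an Ising model, written in terms of cavity fields."""
    # neighbors[i][j] = J_ij for every edge (i, j)
    neighbors = [dict() for _ in range(n)]
    for (i, j), J_ij in couplings.items():
        neighbors[i][j] = J_ij
        neighbors[j][i] = J_ij

    def u(J_ij, cavity_field):
        # effective field transmitted across an edge with coupling J_ij
        return math.atanh(math.tanh(beta * J_ij) * math.tanh(beta * cavity_field)) / beta

    # one message per directed edge: the field on i when neighbor j is removed
    cavity = {(i, j): h[i] for i in range(n) for j in neighbors[i]}
    for _ in range(max_iter):
        new = {(i, j): h[i] + sum(u(neighbors[i][k], cavity[(k, i)])
                                  for k in neighbors[i] if k != j)
               for (i, j) in cavity}
        delta = max((abs(new[e] - cavity[e]) for e in cavity), default=0.0)
        cavity = new
        if delta < tol:
            break
    # marginals (local magnetizations) from the fixed-point messages
    return [math.tanh(beta * (h[i] + sum(u(neighbors[i][k], cavity[(k, i)])
                                         for k in neighbors[i])))
            for i in range(n)]

def exact_magnetizations(n, couplings, h, beta):
    """Brute-force enumeration of the Gibbs distribution (small n only)."""
    Z = 0.0
    m = [0.0] * n
    for s in product((-1, 1), repeat=n):
        energy = -sum(J_ij * s[i] * s[j] for (i, j), J_ij in couplings.items())
        energy -= sum(h[i] * s[i] for i in range(n))
        w = math.exp(-beta * energy)
        Z += w
        for i in range(n):
            m[i] += s[i] * w
    return [x / Z for x in m]

# Hypothetical example: a 4-site tree (a star centred on site 1).
tree = {(0, 1): 1.0, (1, 2): -0.5, (1, 3): 0.7}
fields = [0.3, -0.1, 0.2, 0.0]
m_bp = bp_magnetizations(4, tree, fields, beta=0.8)
m_exact = exact_magnetizations(4, tree, fields, beta=0.8)
```

On a graph with loops the same update can still be run, but convergence and accuracy are no longer guaranteed, which is the subject of the rest of the talk.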
BELIEF PROPAGATION ALGORITHM (2/2)
Well established in various disciplines
Coding theory
Probabilistic inference
Artificial intelligence
Statistical mechanics
(almost) LINEAR-TIME ALGORITHM, and easy to implement
up to now, limited to applications on safe ground
PROBLEMS:
It may not converge…
Rigorous results are limited to the single-fixed-point scenario
Is the Bethe free energy a good approximation when the model has loops?
Sensitive to the initial condition
ONGOING PROJECTS
AIMS OF THE PROJECT
Quantify how good or bad the Bethe approximation is
Improve our understanding of the B.P. algorithm
Develop fast and efficient (approximate) algorithms for inference and optimization
MODELS CURRENTLY UNDER STUDY
T = 0 RANDOM FIELD ISING MODEL
T = 0 SPIN GLASS
Random regular graphs (best case)
finite dimensional lattices (worst case)
ZERO TEMPERATURE RANDOM FIELD ISING MODEL
‣ ferromagnet + i.i.d. random fields acting on each site
‣ can be studied at zero temperature (optimization problem)
‣ “peculiar” second order phase transition
‣ lots of metastable states (1-spin-flip stable states)
‣ anomalously slow dynamics (Griffiths singularities)
the ground state can be obtained in polynomial time with a max-flow / minimum-cut algorithm
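The polynomial-time statement can be illustrated with the standard mapping of a ferromagnetic RFIM onto a minimum cut. The sketch below is an assumption-laden illustration, not the code behind the talk: it takes H = -J sum_<ij> s_i s_j - sum_i h_i s_i with J > 0, uses a plain Edmonds-Karp max-flow written from scratch, and all names are hypothetical. A brute-force minimum over all configurations is computed only as a check on a tiny lattice.

```python
import random
from collections import deque
from itertools import product

def rfim_energy(spins, edges, h, J):
    """H = -J * sum_<ij> s_i s_j - sum_i h_i s_i (assumed convention)."""
    return (-J * sum(spins[i] * spins[j] for i, j in edges)
            - sum(hi * si for hi, si in zip(h, spins)))

def rfim_ground_state(n, edges, h, J):
    """Ground state of a ferromagnetic RFIM (J > 0) via max-flow / min-cut."""
    source, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_arc(a, b, c):
        cap[a][b] = cap[a].get(b, 0.0) + c
        cap[b].setdefault(a, 0.0)

    for i, j in edges:          # breaking a bond costs 2J
        add_arc(i, j, 2 * J)
        add_arc(j, i, 2 * J)
    for i, hi in enumerate(h):  # opposing the local field costs 2|h_i|
        if hi > 0:
            add_arc(source, i, 2 * hi)
        elif hi < 0:
            add_arc(i, sink, -2 * hi)

    while True:
        # breadth-first search for an augmenting path (Edmonds-Karp)
        parent = {source: None}
        queue = deque([source])
        while queue:
            a = queue.popleft()
            for b, c in cap[a].items():
                if c > 1e-12 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if sink not in parent:
            break
        path, b = [], sink
        while parent[b] is not None:
            path.append((parent[b], b))
            b = parent[b]
        flow = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= flow
            cap[b][a] += flow
    # spins still reachable from the source sit on the "up" side of the cut
    return [1 if i in parent else -1 for i in range(n)]

# Hypothetical example: 3 x 3 open square lattice with Gaussian random fields.
L = 3
n = L * L
edges = [(r * L + c, r * L + c + 1) for r in range(L) for c in range(L - 1)]
edges += [(r * L + c, (r + 1) * L + c) for r in range(L - 1) for c in range(L)]
rng = random.Random(1)
fields = [rng.gauss(0.0, 1.0) for _ in range(n)]
J = 0.6
ground_state = rfim_ground_state(n, edges, fields, J)
e_mincut = rfim_energy(ground_state, edges, fields, J)
e_brute = min(rfim_energy(s, edges, fields, J) for s in product((-1, 1), repeat=n))
```

The brute-force check costs 2^n evaluations and is only feasible for a handful of spins, while the cut construction scales polynomially with the system size.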
BELIEF PROPAGATION APPROACH TO THE T=0 R.F.I.M.
GLOBAL MINIMUM OF THE BETHE FREE ENERGY = GROUND STATE OF THE RFIM [Chertkov, 2008]
What should we expect?
[Sketch: free energy F versus magnetization M, with unbalanced ferromagnetic-like minima and some higher energy states]
Metastable states are relevant (dominant?) at criticality
[Sketch: F versus M, with unbalanced ferromagnetic-like minima and some (not so) higher energy states]
T=0 R.F.I.M.: MAXIMAL SOLUTIONS
BP turns out to be very sensitive to the initialization of the messages; however...
We proved that two special initial conditions, (+) and (-), exist that bound every fixed point
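A minimal sketch of how such bounding solutions can be obtained numerically. It assumes the usual zero-temperature form of BP for a ferromagnetic RFIM, in which each message is the cavity field clipped to the interval [-J, J], and it assumes that the (+) and (-) initial conditions correspond to starting with all messages at +J and at -J. Neither detail is stated in this transcript, and all names are hypothetical.

```python
import random

def torus_neighbors(L):
    """Neighbor lists of an L x L square lattice with periodic boundaries (L >= 3)."""
    def site(r, c):
        return (r % L) * L + (c % L)
    return [[site(r + 1, c), site(r - 1, c), site(r, c + 1), site(r, c - 1)]
            for r in range(L) for c in range(L)]

def bp_t0_rfim(neighbors, h, J, init, max_iter=1000, tol=1e-12):
    """Zero-temperature BP; returns the local field H_i on every spin.

    init maps each directed edge (i, j) to the starting message from i to j.
    """
    u = dict(init)
    for _ in range(max_iter):
        # message i -> j: field on i without j, clipped to [-J, J]
        new = {(i, j): max(-J, min(J, h[i] + sum(u[(k, i)]
                                                 for k in neighbors[i] if k != j)))
               for (i, j) in u}
        delta = max(abs(new[e] - u[e]) for e in u)
        u = new
        if delta < tol:
            break
    return [h[i] + sum(u[(k, i)] for k in neighbors[i]) for i in range(len(h))]

# Hypothetical example: 4 x 4 torus with Gaussian random fields.
neighbors = torus_neighbors(4)
rng = random.Random(0)
h = [rng.gauss(0.0, 1.0) for _ in neighbors]
J = 0.5
directed_edges = [(i, j) for i, nb in enumerate(neighbors) for j in nb]

H_plus = bp_t0_rfim(neighbors, h, J, {e: +J for e in directed_edges})
H_minus = bp_t0_rfim(neighbors, h, J, {e: -J for e in directed_edges})
H_rand = bp_t0_rfim(neighbors, h, J,
                    {e: rng.uniform(-J, J) for e in directed_edges})
```

Because the clipped update is non-decreasing in every incoming message, a run started between the two extreme initializations stays between them at every iteration, so the local fields obtained from any other initialization are sandwiched between `H_minus` and `H_plus`.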
[Figure: n(+) and n(-) versus J. Panel captions: prob. that the “up” fixed point is the G.S.; prob. that the “down” fixed point is the G.S. 3D lattice (worst case): L = 8, 10, 12, 16, 24. Random regular graphs (best case): N = 2^11, 2^12, 2^13, 2^14, 2^15.]
T=0 R.F.I.M.: PERCOLATION OF UNFROZEN VARIABLES
Every fixed point is bounded by the (+) and (-) solutions
A spin on which both bounds agree is frozen
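The two quantities studied on this slide can be computed from the bounding solutions as sketched below. The freezing criterion used here (a spin is frozen when the local fields of the (+) and (-) fixed points have the same sign) is an assumption, since the slide's formula is not part of this transcript. The function name and the toy chain are hypothetical, and the local fields are given by hand rather than produced by a BP run.

```python
from collections import deque

def frozen_and_giant_cluster(neighbors, H_plus, H_minus):
    """Return (fraction of frozen spins, fraction of spins in the largest
    connected cluster of unfrozen spins).

    Assumes H_minus[i] <= H_plus[i]: a spin is called frozen when both
    bounds have the same sign.
    """
    n = len(neighbors)
    frozen = [H_minus[i] > 0 or H_plus[i] < 0 for i in range(n)]
    seen = [False] * n
    largest = 0
    for start in range(n):
        if frozen[start] or seen[start]:
            continue
        # breadth-first search restricted to unfrozen spins
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            i = queue.popleft()
            size += 1
            for j in neighbors[i]:
                if not frozen[j] and not seen[j]:
                    seen[j] = True
                    queue.append(j)
        largest = max(largest, size)
    return sum(frozen) / n, largest / n

# Hypothetical example: an open chain of 6 spins with hand-written bounds.
chain = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
H_minus = [0.5, -0.2, -0.1, 0.3, -0.4, -0.6]
H_plus = [0.9, 0.4, 0.2, 0.8, 0.1, -0.2]
frozen_fraction, giant_fraction = frozen_and_giant_cluster(chain, H_plus, H_minus)
```

In this toy chain spins 0, 3 and 5 are frozen, and the unfrozen spins split into the clusters {1, 2} and {4}.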
[Figure: FRACTION OF FROZEN VARIABLES and GIANT CLUSTER OF UNFROZEN VAR. versus J. Random regular graphs: N = 2^10 to 2^18, #clusters ~ 1. 3D lattice: L = 10, 12, 16, 24, 32, 48, #clusters ~ N.]
Correlated percolation?
T=0 R.F.I.M.: SEARCHING BETWEEN SOLUTIONS
We designed a modified version of B.P. that efficiently finds many different fixed points starting from different initial conditions
INITIAL CONDITIONS = CONVEX COMBINATIONS OF FIXED POINTS ALREADY FOUND
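The idea of seeding new runs with convex combinations of known fixed points can be sketched as follows. This is not the modified B.P. of the talk, whose details are not given in this transcript: the sketch uses the plain zero-temperature update for a ferromagnetic RFIM (messages clipped to [-J, J]), seeds the search with the all +J and all -J initial conditions, and simply discards runs that do not converge. All names and parameters are hypothetical.

```python
import random

def t0_update(u, neighbors, h, J):
    """One parallel sweep of zero-temperature BP for the RFIM."""
    return {(i, j): max(-J, min(J, h[i] + sum(u[(k, i)]
                                              for k in neighbors[i] if k != j)))
            for (i, j) in u}

def iterate(u, neighbors, h, J, max_iter):
    """Iterate until the messages stop changing; None if that never happens."""
    for _ in range(max_iter):
        new = t0_update(u, neighbors, h, J)
        if new == u:
            return new
        u = new
    return None

def search_fixed_points(neighbors, h, J, n_trials=20, seed=0):
    rng = random.Random(seed)
    edges = [(i, j) for i in range(len(h)) for j in neighbors[i]]
    found = []
    for value in (J, -J):  # the two extreme initial conditions
        fp = iterate({e: value for e in edges}, neighbors, h, J, max_iter=10_000)
        if fp is not None and fp not in found:
            found.append(fp)
    for _ in range(n_trials):
        if not found:
            break
        a, b = rng.choice(found), rng.choice(found)
        lam = rng.random()
        # initial condition = convex combination of fixed points already found
        start = {e: lam * a[e] + (1.0 - lam) * b[e] for e in edges}
        fp = iterate(start, neighbors, h, J, max_iter=500)
        if fp is not None and fp not in found:
            found.append(fp)
    return found

# Hypothetical example: 4 x 4 torus with Gaussian random fields.
L = 4
nbrs = [[((r + 1) % L) * L + c, ((r - 1) % L) * L + c,
         r * L + (c + 1) % L, r * L + (c - 1) % L]
        for r in range(L) for c in range(L)]
rng_fields = random.Random(3)
h = [rng_fields.gauss(0.0, 1.0) for _ in range(L * L)]
J = 0.5
fixed_points = search_fixed_points(nbrs, h, J)
```

How many distinct fixed points such a search returns depends on the instance and on the coupling J; a small sample may have only one or two.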
competitive for optimization, Prob[g.s.] ~ 1
Too many states; however, at present it is the fastest way of finding metastable states
Time complexity ~
Random regular graphs
best case
3D lattice: worst case
[Figure: <N_sol> versus J, for N = 2^10, 2^11, 2^12, 2^13, 2^14, 2^15.]
T=0 R.F.I.M.: PERSPECTIVES
percolation of unconstrained variables and anomalously slow dynamics: are they related?
are all these stable states relevant for the thermodynamics?
are there situations where “faster” and “flexible” is better than “exact”?
SPIN GLASS OPTIMIZATION
One of the hardest optimization problems
Many open questions can be answered only through numerical simulations
Ubiquitous in applications
ALGORITHMS
EXACT
HEURISTICS
GENETIC ALGORITHMS
CLUSTER EXACT APPROXIMATION
BP-BASED HEURISTICS
S.G. OPTIMIZATION: IMPROVING C.E.A.
build an unfrustrated cluster
compute the g.s. of the cluster with the min-cut algorithm
BP is faster than min-cut
BP can handle a small amount of frustration
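The first step, building an unfrustrated cluster, can be sketched with a greedy gauge construction. This is a simplified illustration under stated assumptions, not the cluster-exact-approximation code used for the talk: spins are visited in random order, and a spin joins the cluster only if a single gauge sign t_i makes all its bonds to the cluster ferromagnetic (J_ij t_i t_j > 0). The names, the ±1 couplings and the 4 x 4 torus are hypothetical.

```python
import random

def grow_unfrustrated_cluster(n, couplings, seed=0):
    """Greedily select spins and gauge signs t_i such that every bond inside
    the selected set satisfies J_ij * t_i * t_j > 0 (no frustration).

    Returns a dict {site: gauge sign}; sites not in the dict are left out.
    """
    rng = random.Random(seed)
    neighbors = [dict() for _ in range(n)]
    for (i, j), J_ij in couplings.items():
        neighbors[i][j] = J_ij
        neighbors[j][i] = J_ij
    order = list(range(n))
    rng.shuffle(order)
    gauge = {}
    for i in order:
        # gauge sign demanded by each neighbor that is already in the cluster
        required = {(1 if neighbors[i][j] > 0 else -1) * gauge[j]
                    for j in neighbors[i] if j in gauge}
        if len(required) <= 1:  # all demands agree, so no frustrated loop closes
            gauge[i] = required.pop() if required else 1
    return gauge

# Hypothetical example: 4 x 4 torus with random +1 / -1 couplings.
L = 4
rng = random.Random(7)
couplings = {}
for r in range(L):
    for c in range(L):
        i = r * L + c
        couplings[(i, r * L + (c + 1) % L)] = rng.choice((-1.0, 1.0))
        couplings[(i, ((r + 1) % L) * L + c)] = rng.choice((-1.0, 1.0))
cluster = grow_unfrustrated_cluster(L * L, couplings)
```

After the gauge transformation s_i -> t_i s_i the bonds inside the cluster are all ferromagnetic, which is the setting where min-cut applies. The slide's point is that BP could replace min-cut there and tolerate a few frustrated bonds.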
[Figure: E / N versus number of i.c., for N = 16^3 = 4096. Curves: tree, cluster no frus, numFrus = 1, 2, 3, 4, 5.]
FUTURE WORKS
A BETTER UNDERSTANDING OF BETHE APPROXIMATION (B.P.) VIA…
▸ Characterization of unfrozen variables percolation
▸ Correct counting of metastable states
▸ Use of the Bethe approximation as a heuristic when it is not exact:
▸ when B.P. does not converge
▸ in presence of strongly frustrated short loops
▸ Comparison with exact (and slow) solutions and with other fast (and approximate) heuristics
S.G. OPTIMIZATION: BP + PINNING
fix a fraction of variables
let them act as an external field
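The pinning step can be made concrete: once a set of spins is fixed, each pinned spin contributes J_ij s_j to the field felt by its free neighbors, and the remaining problem involves the free spins only. The sketch below assumes the energy H = -sum J_ij s_i s_j - sum h_i s_i; the function names and the 5-spin example are hypothetical, and it is not the procedure used to produce the results of the talk.

```python
from itertools import product

def energy(spins, couplings, h):
    """H = -sum_(ij) J_ij s_i s_j - sum_i h_i s_i (assumed convention)."""
    return (-sum(J_ij * spins[i] * spins[j] for (i, j), J_ij in couplings.items())
            - sum(h[i] * spins[i] for i in range(len(h))))

def pin(couplings, h, pinned):
    """Absorb pinned spins (dict site -> +1/-1) into effective fields.

    Returns (couplings among free spins, effective fields of free spins,
    constant energy contributed by the pinned spins alone).
    """
    eff_h = {i: h[i] for i in range(len(h)) if i not in pinned}
    free_J = {}
    const = -sum(h[i] * s for i, s in pinned.items())
    for (i, j), J_ij in couplings.items():
        if i in pinned and j in pinned:
            const -= J_ij * pinned[i] * pinned[j]
        elif i in pinned:
            eff_h[j] += J_ij * pinned[i]  # pinned i acts as a field on j
        elif j in pinned:
            eff_h[i] += J_ij * pinned[j]  # pinned j acts as a field on i
        else:
            free_J[(i, j)] = J_ij
    return free_J, eff_h, const

# Hypothetical example: 5 spins, two of them pinned.
couplings = {(0, 1): 1.0, (1, 2): -1.0, (2, 3): 1.0,
             (3, 4): -1.0, (4, 0): 1.0, (1, 3): 0.5}
h = [0.1, -0.3, 0.2, 0.0, 0.4]
pinned = {0: 1, 3: -1}
free_J, eff_h, const = pin(couplings, h, pinned)
free_sites = sorted(eff_h)

# Check: the reduced problem reproduces the full energy for every free configuration.
max_gap = 0.0
for values in product((-1, 1), repeat=len(free_sites)):
    s = dict(pinned)
    s.update(zip(free_sites, values))
    full = energy([s[i] for i in range(len(h))], couplings, h)
    reduced = (const
               - sum(J_ij * s[i] * s[j] for (i, j), J_ij in free_J.items())
               - sum(eff_h[i] * s[i] for i in free_sites))
    max_gap = max(max_gap, abs(full - reduced))
```

BP (or any other solver) can then be run on the reduced problem defined by `free_J` and `eff_h`.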
[Figure: plot versus the fraction of pinned variables (0 to 1), for L = 8, 10, 12, 14, 64; vertical axis from 0 to 1000, label not recoverable.]
No phase transition : (
Seems to be great for optimization
…however
beliefs can be better than we think