Statistical Physics Tools in Information Science
Marc Mézard (1) and Andrea Montanari (2)
(1) Université Paris-Sud and (2) Stanford University
June 23, 2007
Marc Mezard1 and Andrea Montanari2 Statistical Physics Tools in Information Science
Structure of the presentation

Andrea: What is statistical physics and why should you care.
Marc: Two test cases: (1) counting matchings, (2) random k-SAT.
Ask whatever you want!
Sources

General: → M. Mézard and A. Montanari, ‘Information, Physics and Computation,’ upcoming book (our web pages)
Random k-SAT: → M. Mézard, G. Parisi, and R. Zecchina, ‘Analytic and Algorithmic Solution of Random Satisfiability Problems,’ Science
→ F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, L. Zdeborová, ‘Gibbs States and the Set of Solutions of Random Constraint Satisfaction Problems,’ PNAS
Coding: → A. Montanari and R. Urbanke, ‘Modern Coding Theory: The Statistical Mechanics and Computer Science Point of View,’ lecture notes
General graphical models: → google ee374
Outline

1 Problems
2 Methods
3 Results
4 The cavity method at work
5 Mean Field (BP) on graphical models
6 Matching
7 K-SAT
8 Appendices
Problems
Probabilistic description of a physical system

State: x = (x1, . . . , xN), xi ∈ X
Inverse temperature: β
Energy: E : x ↦ E(x) ∈ R
(Boltzmann) probability distribution:

    µ(x) = (1/Z) exp{−βE(x)} .

Absorbing β into the energy and writing w(x) = exp{−E(x)}, this is just a generic probability distribution with nonnegative weights:

    µ(x) = (1/Z) w(x) .
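The definitions above are easy to play with numerically. The following sketch (my own illustration, not from the slides; the chain energy function is an arbitrary choice) enumerates a tiny state space and builds the Boltzmann distribution µ(x) = (1/Z) exp{−βE(x)}:

```python
import itertools
import math

def boltzmann(energy, states, beta=1.0):
    """Return the Boltzmann distribution mu(x) = exp(-beta*E(x)) / Z
    over an explicit list of states, together with Z."""
    weights = [math.exp(-beta * energy(x)) for x in states]
    Z = sum(weights)
    return {x: w / Z for x, w in zip(states, weights)}, Z

def chain_energy(x):
    # Toy energy: number of disagreeing adjacent pairs in a chain of bits.
    return sum(1 for a, b in zip(x, x[1:]) if a != b)

states = list(itertools.product([0, 1], repeat=4))
mu, Z = boltzmann(chain_energy, states, beta=2.0)
```

At larger β the low-energy (aligned) configurations carry most of the probability mass, which is the qualitative "ordered versus disordered" distinction the next slides discuss.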
What is left? Locality: an example

L × L grid: G = (V, E), xi ∈ X = {0, 1}, i ∈ V

    µ(x) = (1/Z(λ; G)) λ^|x| I{x is an independent set} ,

which is local, i.e. it factorizes over vertices and edges:

    µ(x) = (1/Z(λ; G)) ∏_{i∈V} λ^{xi} ∏_{(ij)∈E} I{(xi, xj) ≠ (1, 1)} .
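For a small grid the partition function Z(λ; G) of this hard-core (independent set) model can be computed by brute force. A minimal sketch of mine, with the grid size and λ as illustrative parameters:

```python
import itertools

def grid_edges(L):
    """Edges of an L x L grid graph; vertices are (row, col) pairs."""
    E = []
    for r in range(L):
        for c in range(L):
            if r + 1 < L:
                E.append(((r, c), (r + 1, c)))
            if c + 1 < L:
                E.append(((r, c), (r, c + 1)))
    return E

def partition_function(L, lam):
    """Z(lambda; G) = sum over independent sets x of lambda^{|x|},
    by enumerating all 2^(L*L) configurations (tiny L only)."""
    V = [(r, c) for r in range(L) for c in range(L)]
    E = grid_edges(L)
    Z = 0.0
    for bits in itertools.product([0, 1], repeat=len(V)):
        x = dict(zip(V, bits))
        if all(not (x[i] and x[j]) for i, j in E):  # independent set?
            Z += lam ** sum(bits)
    return Z
```

At λ = 1, Z simply counts the independent sets; for the 2 × 2 grid (a 4-cycle) there are 7.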
A more abstract version of locality

[Figure: a sparse graph on variables x1, . . . , x12]

G = (V, E), V = [N], x = (x1, . . . , xN) ∈ X^N

    µ(x) = (1/Z) ∏_{(ij)∈E} ψij(xi, xj) .
Statistical mechanics questions: I. Qualitative

What does a typical configuration sampled from µ look like?
Disordered versus Ordered; Liquid versus Solid
Statistical mechanics questions: II. Quantitative

L × L grid: N = L²
Compute (for N large)

    φN(λ) = (1/N) log Z(G; λ) = (1/N) log ∑_{x∈IS(G)} λ^|x|
Isn’t Z just an irrelevant normalization constant?

    H(X) = −∑_x µ(x) log µ(x)
         = log Z(λ; G) − ∑_{x∈IS(G)} µ(x) |x| log λ
         = log Z(λ; G) − log λ ∑_{i∈V} ⟨xi⟩_G
         = log Z(λ; G) − log λ · ∂ log Z(λ; G)/∂ log λ

[this relation is completely general]
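The identity above can be checked numerically on a tiny instance. The sketch below (my own check; the path graph and λ = 0.7 are arbitrary choices) compares the entropy of µ with log Z − log λ · ∂ log Z/∂ log λ, the derivative taken by finite differences in log λ:

```python
import itertools
import math

# Tiny graph: a path on 3 vertices (edges 0-1 and 1-2).
EDGES = [(0, 1), (1, 2)]
N = 3

def ind_sets():
    for bits in itertools.product([0, 1], repeat=N):
        if all(not (bits[i] and bits[j]) for i, j in EDGES):
            yield bits

def log_Z(lam):
    return math.log(sum(lam ** sum(x) for x in ind_sets()))

def entropy(lam):
    Z = math.exp(log_Z(lam))
    H = 0.0
    for x in ind_sets():
        p = lam ** sum(x) / Z
        H -= p * math.log(p)
    return H

lam = 0.7
h = 1e-5
# d log Z / d log lambda, by central finite difference in log lambda.
deriv = (log_Z(lam * math.exp(h)) - log_Z(lam * math.exp(-h))) / (2 * h)
identity_rhs = log_Z(lam) - math.log(lam) * deriv
```

`entropy(lam)` and `identity_rhs` agree up to the finite-difference error, as the derivation predicts.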
Questions I and II are related!

    ∆(x) = ∑_{i∈EVEN} xi − ∑_{i∈ODD} xi .

    φN(λ, δ) = (1/N) log Z(G; λ, δ) = (1/N) log ∑_{x : ∆(x)=Nδ} λ^|x|
Liquid

[Figure: plot of φN(λ, δ) versus δ in the liquid phase]
Solid

[Figure: plot of φN(λ, δ) versus δ in the solid phase, with a bottleneck of depth B]

Theorem (Mossel/Weitz/Wormald 06)
On a random sparse bipartite graph, B = Θ(1) whp for λ > λ*.

Similar theorem for Ising models [A. Gerschenfeld/AM 07]
An artistic view of µ in the solid phase

[Figure: sketch of µ; labels δ and δ = 0]
What about non-bipartite graphs?
Frustration

[Figure: a frustrated configuration, with conflicting assignments marked ‘?’]

No ‘simple ordering’ ⇒ Solid amorphous state?
[Solid + Amorphous = Glass]
How do you define ‘solid’?

i ∈ V
B(i, r): ball of radius r around i
x∼i,r = {xj : j ∉ B(i, r)}

Liquid: I(Xi; X∼i,r) → 0 as r → ∞
Solid: I(Xi; X∼i,r) → I∞ > 0 as r → ∞
Methods

Mean Field Methods
Mean field

Mean field methods: a family of techniques for approximate calculations in statistical mechanics and graphical models.¹
Mean field models: a class of models on which mean field methods are asymptotically exact in the large-system limit.

¹ And more: Markov chains, queuing theory, stochastic networks, etc.
The simplest mean field calculation

[Figure: vertex i and its neighborhood ∂i]

µ_A(·): marginal of X_A, A ⊆ V

    µi(1) = ∑_{x∂i} µ_{i|∂i}(1|x∂i) µ∂i(x∂i) = (λ/(1 + λ)) µ∂i(0)
          ≈ (λ/(1 + λ)) ∏_{j∈∂i} µj(0) = (λ/(1 + λ)) ∏_{j∈∂i} (1 − µj(1))
Solving the equations

Bipartite, degree k + 1; assume

    µi(1) = p1 if i ∈ EVEN, p2 if i ∈ ODD.

Then the MF equations are

    p1 = fλ(p2) ,  p2 = fλ(p1) ,

where fλ(x) = λ(1 + λ)^{−1} (1 − x)^{k+1}.
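Iterating these equations numerically exhibits the two regimes. A sketch of mine; the values of λ, k and the starting point are illustrative choices:

```python
def f(lam, x, k):
    """Mean field map f_lambda(x) = lambda/(1+lambda) * (1-x)^(k+1)."""
    return lam / (1.0 + lam) * (1.0 - x) ** (k + 1)

def iterate(lam, k, p1=0.9, p2=0.1, steps=2000):
    """Iterate the coupled MF equations p1 = f(p2), p2 = f(p1)
    from an asymmetric start; return the point reached."""
    for _ in range(steps):
        p1, p2 = f(lam, p2, k), f(lam, p1, k)
    return p1, p2

# Small lambda: the iteration collapses onto the symmetric fixed point.
p1, p2 = iterate(lam=0.5, k=3)
# Large lambda: it settles on an asymmetric pair, p1 != p2.
q1, q2 = iterate(lam=5.0, k=3)
```

The symmetric solution is the ‘liquid’, the asymmetric pair the ‘solid’ of the next slide.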
Solving the equations (continued)

[Figure: iteration trajectories (p1(ℓ), p2(ℓ)); Liquid vs Solid]
The family of mean field approximations

Method          Basic intuition                Asympt. exact for
Naive mf        Neglects correlations          Some dense G’s
Bethe-Peierls   ‘Nearest neighbors’ correls    Some sparse rand. G’s
Cavity (2)      As BP + Glassy states          ‘Any’ sparse rand. G
Kikuchi (3)     Short loops / Nonpert.         ???
Loop corr. (4)  Loops / Perturbative           ***

(2) Mézard/Parisi, . . .
(3) Kikuchi, Yedidia/Freeman/Weiss
(4) AM/Rizzo, Parisi/Slanina, Chernyak/Chertkov
The family of mean field approximations

Method          Basic intuition                Algorithmic version
Naive mf        Neglects correlations          Mean field
Bethe-Peierls   ‘Nearest neighbors’ correls    Belief Propagation
Cavity          As BP + Glassy states          Survey Propagation
Kikuchi         Short loops / Nonpert.         Generalized BP
Loop corr.      Loops / Perturbative           Loop-corrected BP
‘Any sparse random graph?’

Caveats:
Many (rigorous and non-rigorous) indications, but no proof.
‘Sparse random graph’ is a bit vague: one can define a family of ensembles.
‘Any sparse random graph?’

Factor graph G = (V, F, E):

[Figure: factor graph on variables x1, . . . , x10]

← variables xi ∈ X
← factors, e.g. ψa(x5, x7, x9, x10)

    µ(x) = (1/Z) ∏_{a∈F} ψa(x∂a) ,   where ∂a ≡ {i ∈ V : (i, a) ∈ E}
Graph ensemble

[Figure: factor nodes of degree 2, 3, . . . , dmax connected through a random permutation π to variable nodes of degree 2, 3, . . . , dmax]

[∼ irregular LDPC ensembles]
Compatibility functions ensemble

Assign, for each d ∈ {1, . . . , dmax}, a set of functions

    {ψ^(d,r) : X × · · · × X (d factors) → R+} , r = 1, 2, . . .

and a distribution {pd(r)} (pd(r) ≥ 0, ∑_r pd(r) = 1).

Then, for each f-node a of degree d(a),

    ψa = ψ^(d(a),r) independently, with probability pd(a)(r).
The cavity method: A high-level view

0. Cavity method = Replica method
The replica method is formal, while the cavity method makes some probability assumptions.
The cavity method: A high-level view

1. What does ‘asymptotically exact’ mean?

Partition function:

    lim_{N→∞} (1/N) log ZN = φ_cavity   almost surely.

Marginals:

    lim_{N→∞} (1/N) ∑_{i=1}^{N} ||µi − µi^cavity||_TV = 0   almost surely.
The cavity method: A high-level view

2. Naive mean field → µi ≈ νi (vertex quantities)
   Cavity → νi→j (messages)
The cavity method: A high-level view

3. A hierarchy

Std terminology   Cavity jargon   Message space
Bethe-Peierls     RS (0RSB)       M0 = distribs over X
***               1RSB            M1 = distribs over M0
***               2RSB            M2 = distribs over M1
***               3RSB            M3 = distribs over M2
· · ·             · · ·           · · ·
***               ∞RSB            ???
Results
A list of models from. . .
Coding
Multi-user detection
Stochastic networks
Channel coding

[Figure: x = (x1 . . . xN) → BMS channel → y = (y1 . . . yN)]

Channel transition probability: {Q(y|x)}.
Codeword: x ∈ {0, 1}^N with

    Hx = 0 mod 2 .
LDPC codes [Gallager, MacKay, Luby et al.]

[Figure: factor graph with check nodes x1 ⊕ x2 ⊕ x3 ⊕ x4 = 0, . . . , x5 ⊕ x6 ⊕ x8 = 0 over variables x1, . . . , x8]
[Figure: the same factor graph, with each variable xi attached to its channel output yi]

    µy(x) = (1/ZN(y)) I(x1 ⊕ x2 ⊕ x3 ⊕ x4 = 0) · · · I(x5 ⊕ x6 ⊕ x8 = 0) · Q(y1|x1) · · · Q(y8|x8)
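A toy version of this posterior can be written in a few lines. The parity-check matrix and the binary symmetric channel below are my own illustrative choices (not the 8-bit code on the slide):

```python
import itertools

# Toy parity-check matrix H: 2 checks on 4 bits.
H = [[1, 1, 1, 0],
     [0, 1, 1, 1]]

def is_codeword(x):
    """Check H x = 0 mod 2."""
    return all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)

def posterior(y, eps):
    """mu_y(x) propto I(Hx = 0) * prod_i Q(y_i | x_i) for a BSC(eps)."""
    def Q(yi, xi):
        return 1 - eps if yi == xi else eps
    weights = {}
    for x in itertools.product([0, 1], repeat=4):
        if is_codeword(x):
            w = 1.0
            for yi, xi in zip(y, x):
                w *= Q(yi, xi)
            weights[x] = w
    Z = sum(weights.values())
    return {x: w / Z for x, w in weights.items()}

mu = posterior(y=(0, 0, 1, 1), eps=0.1)
```

The support of µy is exactly the set of codewords, and the channel factors Q(yi|xi) tilt the distribution toward codewords close to the received y.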
Some results

Saad/Kabashima et al., AM/Sourlas (replica method):

    φ = lim_{N→∞} (1/N) E log ZN(Y) ⇒ [conditional entropy per bit, H(X|Y)/N]

Proof: lower bound → AM, Macris; upper bound → Measson/AM/Urbanke (BEC)
Multi-user detection (CDMA channel)

N users: x ≡ (x1, x2, . . . , xN), xi ∈ {+1, −1} i.i.d. uniform
M chips: y = (y1, y2, . . . , yM), ya ∈ R

    ya = sa1 xi1(a) + · · · + sak xik(a) + wa

    wa ∼ Normal(0, σ²) ,   {sai} spreading sequences
Multi-user detection (CDMA channel)

[Figure: factor graph with, e.g., y1 = (+x1 − x2 + x3 + x4) + w1, . . . , y6 = (−x5 − x6 + x8) + w6, over users x1, . . . , x8; wa is the noise]

A posteriori distribution: µy(x) ≡ P{x|y} → graphical model . . .
    µy(x) = (1/ZK(y)) ∏_{a=1}^{N} (1/√(2πσ²)) exp{ −(1/(2σ²)) (ya − ∑_l sal xil(a))² } .

Tanaka (replica method):

    φ = lim_{K→∞} (1/K) E log ZK(Y) ⇒ [capacity per user]

Several generalizations: Guo/Verdú, Caire et al., Kabashima et al.
Proof: AM/Tse
Channel assignment in cellular networks

ni ≥ 0: number of channels in cell i

    µ(n) = (1/Z) ∏_{i∈V} (λi^{ni} / ni!) ∏_{(ij)∈E} I(ni + nj ≤ C) .

Z → loss probability
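The normalization Z of this loss-network measure is again a sum over admissible configurations. A brute-force sketch of mine (a single fugacity λ for all cells, and cell counts capped at C, both simplifications):

```python
import itertools
import math

def loss_network_Z(edges, n_cells, lam, C):
    """Z of mu(n) = (1/Z) prod_i lam^{n_i}/n_i! * prod_(ij) I(n_i+n_j <= C),
    by enumerating channel counts n_i in 0..C for each cell."""
    Z = 0.0
    for n in itertools.product(range(C + 1), repeat=n_cells):
        if all(n[i] + n[j] <= C for i, j in edges):
            w = 1.0
            for ni in n:
                w *= lam ** ni / math.factorial(ni)
            Z += w
    return Z
```

Sanity checks: with no interference edges and a single cell, Z is a truncated Poisson normalization (≈ e^λ for large C); with one edge and C = 1, the admissible configurations are (0,0), (0,1), (1,0), so Z = 1 + 2λ.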
END OF FIRST HALF

BEGINNING OF SECOND HALF
Cavity method: general (heuristic) framework

1. Draw the factor graph.
2. Write the elementary “mean field (BP) equations”, assuming that the local environment of a variable in the factor graph is a tree.
3. Two ways to use them: (a) statistical analysis of the equations in a graph ensemble; (b) iteration of the message passing on a single instance (belief propagation).
4. Check for the existence of “Replica Symmetry Breaking” = dependence of the root on the boundary, using typical boundary conditions.
5. If needed, write the 1RSB cavity equations → survey propagation . . .
Factor graphs for graphical models

Many discrete variables xi, many constraints fa(Xa), each involving a small number of variables. Factor graph:

[Figure: factor graph with variable nodes 1, . . . , 5 and factor nodes a, . . . , e]

    P(x1, . . . , x5) = (1/Z) fa(x1, x2, x3, x4) fb(x1, x2, x3) fc(x2, x4, x5) fd(x1, x2, x5) fe(x1, x3, x5)

Q: Estimate marginals. Ubiquitous: inference, coding, combinatorial optimization, physics . . .

NB: In physics, ‘energy’ and ‘temperature’ enter through

    fa(x1, x2, x3, x4) = e^{−βEa(x1, x2, x3, x4)}
Locally tree-like factor graph

In LDPC error correcting codes, random K-satisfiability, colouring of random Erdős–Rényi graphs, matching in random graphs, etc., the factor graph is locally tree-like.

Ex: random 3-SAT. Loops have length ∼ log N.
Simple mean field recursion: merge rooted trees

[Figure: messages m1(x1), m2(x2) enter factor node a; m3(x3), m4(x4) enter factor node b; both factors attach to the root variable 0, which emits m0(x0)]

    µa(x0) = ∑_{x1,x2} m1(x1) m2(x2) fa(x1, x2, x0)
    µb(x0) = ∑_{x3,x4} m3(x3) m4(x4) fb(x3, x4, x0)
    m0(x0) = C µa(x0) µb(x0)

    m0 = F(m1, m2, m3, m4) = Belief propagation
Belief propagation = iteration of mean field equations onone instance
m_{i→a}(xi) = C ∏_{b∈V(i)\a} µ_{b→i}(xi)

µ_{a→i}(xi) = ∑_{{xj}, j∈V(a)\i} fa(xi, {xj}) ∏_{j∈V(a)\i} m_{j→a}(xj)

Marginal on i ("belief"): pi(xi) = C ∏_{b∈V(i)} µ_{b→i}(xi)

Marginal around node a: Pa(Xa) = C fa(Xa) ∏_{j∈V(a)} m_{j→a}(xj)
Entropy (exact on a tree):

P(x) ≃ C ∏_a Pa(Xa) ∏_i pi(xi)^{1−di};   S = −∑_x P(x) log P(x)

(di = degree of variable i)
Statistical analysis
Factor graph ensembles:
1- Random regular graph: the local environment is a regular tree for almost all points → the measure should be translationally invariant: m = F(m, m, m, m).
2- Erdős-Rényi graph: P(m) = probability that m_i = m, when i is taken at random in the graph with uniform probability. The number of neighbours k is Poisson distributed; m_0 = F(m_1, ..., m_k) → integral equation for P(m), easily solved numerically.
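The integral equation for P(m) is usually solved numerically by "population dynamics": represent P(m) by a large pool of samples, and repeatedly replace a random member by F applied to k randomly drawn members, with k ~ Poisson(c). Below is a minimal sketch, tested on a toy linear recursion whose fixed-point mean is known in closed form (the toy recursion and all parameters are illustrative, not from the slides):

```python
import math, random

rng = random.Random(0)

def poisson(lam):
    # Knuth's method, adequate for small lambda.
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def population_dynamics(update, c, pop_size=2000, sweeps=60):
    # Represent P(m) by a pool of samples; each step replaces a random member
    # by update([m_1, ..., m_k]) with k ~ Poisson(c), m_j drawn from the pool.
    pop = [0.0] * pop_size
    for _ in range(sweeps * pop_size):
        k = poisson(c)
        args = [pop[rng.randrange(pop_size)] for _ in range(k)]
        pop[rng.randrange(pop_size)] = update(args)
    return pop

# Toy linear recursion m0 = a + b * sum(m_j); its fixed-point mean is a / (1 - b*c).
a, b, c = 1.0, 0.3, 2.0
pop = population_dynamics(lambda ms: a + b * sum(ms), c)
mean_m = sum(pop) / len(pop)   # close to 1.0 / (1 - 0.6) = 2.5
```

The same loop, with the cavity update of the problem at hand in place of the toy recursion, is how the distributional equations in the rest of the talk are solved.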
Example: matching
Edge i: si ∈ {0, 1}.
Matching: constraint on each vertex a: ∑_{i∈V(a)} si ≤ 1.
Energy E(s) = number of unmatched vertices.
Probability: P(s) = (1/Z) exp(−β E(s))
BP equations in the matching problem
ψa(s) = I(∑_{i∈V(a)} si ≤ 1) e^{−β (1 − ∑_{i∈V(a)} si)}
BP equations:
[Figure: edge i with endpoints a and b; j ranges over the other edges incident on vertex b]
m_{i→a}(si = 1) = ∏_{j∈∂b\i} m_{j→b}(sj = 0)

m_{i→a}(si = 0) = e^{−β} ∏_{j∈∂b\i} m_{j→b}(sj = 0) + ∑_{j∈∂b\i} m_{j→b}(sj = 1) ∏_{k∈∂b\{i,j}} m_{k→b}(sk = 0)

Closed set of equations for h_{i→a} = −(1/β) log [ m_{i→a}(0) / m_{i→a}(1) ]
BP equations in the matching problem
h_{i→a} = −(1/β) log [ e^{−β} + ∑_{j∈∂b\i} e^{β h_{j→b}} ] = F(h_{1→b}, h_{2→b}, h_{3→b})
Statistical analysis:
1: r-regular random graph: h = (1/β) log [ (√(4(r−1) + e^{−2β}) − e^{−β}) / (2(r−1)) ]
2: Erdős-Rényi graph: P(h), solution of a simple integral equation

→ entropy S(β) = (1/N) E log N
→ size of the matching x(β) = (number of matched vertices)/N
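The closed-form solution for the r-regular graph can be checked numerically: iterating the cavity recursion h = −(1/β) log[e^{−β} + (r−1) e^{βh}] converges to the quoted expression (damping is added because the undamped map oscillates). A sketch:

```python
import math

def bp_fixed_point(r, beta, iters=1000, damp=0.5):
    # Damped iteration of the r-regular cavity recursion
    # h = -(1/beta) * log(exp(-beta) + (r-1) * exp(beta*h)).
    h = 0.0
    for _ in range(iters):
        h_new = -(1.0 / beta) * math.log(math.exp(-beta)
                                         + (r - 1) * math.exp(beta * h))
        h = damp * h + (1.0 - damp) * h_new
    return h

def closed_form(r, beta):
    # Closed-form fixed point quoted on the slide.
    return (1.0 / beta) * math.log(
        (math.sqrt(4.0 * (r - 1) + math.exp(-2.0 * beta)) - math.exp(-beta))
        / (2.0 * (r - 1)))

h = bp_fixed_point(3, 1.0)   # agrees with closed_form(3, 1.0)
```

The closed form follows by setting y = e^{βh} in the fixed-point condition, which gives the quadratic (r−1) y² + e^{−β} y − 1 = 0.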
Entropy of matchings: results
r-regular random graph: E log N = log E N, a simple explicit formula (Bollobás and McKay 1986)
Erdős-Rényi graph:
NB1: size of the largest matching known from Karp-Sipser 1981
NB2: the cavity method computes E log N
How to control this heuristic approach?
One assumption:
P(x1, x2, x3, x4 | x0, a, b absent) = m1(x1) m2(x2) m3(x3) m4(x4)
[Figure: node 0 with factors a (linked to 1, 2) and b (linked to 3, 4); messages m1(x1), ..., m4(x4) and µa(x0), µb(x0)]
Two conditions:
- 1, 2, 3, 4 should be far away when 0, a, b are absent: OK for broad classes of random graphs
- Correlations should decay at large distances: depends on the problem
Correlation decay
Cavity = tree. Correlations (mutual information) between the root and the boundary should decay at large distances, for typical configurations outside the tree.

Sufficient condition (much easier, but too strong): correlations decay in the worst case.

Correlations in the typical case (more difficult) → replica symmetry breaking.
“Replica symmetry breaking”
Non-trivial correlations between the root and the boundary.

NB1: point-to-set correlation. NB2: not necessarily detected by a local stability condition.

Random regular graph: m_0 = F(m_1, ..., m_4)
RS solution: m = F(m, m, m, m) (translational invariance)
Modulated solutions: m^α_0 = F(m^α_1, ..., m^α_4)
“Replica symmetry breaking 2”
RSB: exponentially many solutions to the BP equations (extremal Gibbs states).
Survey: statistics over the solutions.
µ^α_{a→i}(xi): message from a to i in solution α.
Q_{a→i}(µ) = probability that the message µ^α_{a→i} equals µ, when α is chosen at random (with measure exp(−β x F^α)).

Random regular graph: translational invariance is recovered by the statistics over the solutions → Q_{a→i}(µ) = Q(µ), which satisfies a self-consistent equation.

Matching: no RSB: Q(µ) = δ(µ − µ_RS).
In many problems (SAT, colouring, 3-matching, ...), RSB is present when the density of constraints is large enough.
Random 3-satisfiability
NP-complete (Cook)
Pb: random Boolean formula in conjunctive normal form, three variables per clause, chosen randomly in {x1, ..., xN}, negated randomly with probability 1/2:
(x1 ∨ x27 ∨ x3) ∧ (x11 ∨ x3 ∨ x2) ∧ ... ∧ (x9 ∨ x8 ∨ x30)

Control parameter: α = M/N = constraints/variables.

Numerically: threshold phenomenon at αc ≈ 4.26:
Proba(SAT) → 1 when α < αc; Proba(SAT) → 0 when α > αc.

Numerics: Mitchell, Selman, Levesque; Kirkpatrick; Crawford, Auton, ...
Threshold: Friedgut. Bounds: Kaporis, Kirousis, Lalas; Dubois, Boufkhad, ...
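The threshold phenomenon is easy to reproduce at small N. The sketch below draws random 3-SAT formulas as described above and checks satisfiability by brute force; even at N = 12 the SAT fraction drops sharply between α = 2 and α = 8 (the sizes and α values are chosen for illustration, not taken from the slides):

```python
import itertools, random

rng = random.Random(0)

def random_3sat(n_vars, alpha):
    # M = alpha * N clauses; each clause: 3 distinct variables,
    # each negated with probability 1/2 (+v means x_v, -v means NOT x_v).
    m = int(round(alpha * n_vars))
    return [tuple(v if rng.random() < 0.5 else -v
                  for v in rng.sample(range(1, n_vars + 1), 3))
            for _ in range(m)]

def is_sat(formula, n_vars):
    # Brute force over all 2^N assignments (small N only).
    for bits in itertools.product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in formula):
            return True
    return False

def frac_sat(alpha, n=12, trials=20):
    return sum(is_sat(random_3sat(n, alpha), n) for _ in range(trials)) / trials

low_alpha, high_alpha = frac_sat(2.0), frac_sat(8.0)   # well below / above alpha_c
```

At these small sizes the transition is smeared out, which is exactly the finite-size rounding visible in the %SAT curves below.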
Threshold phenomenon → Phase transition
[Figure: %SAT vs α = M/N for N = 100 and N = 200; the curves sharpen around αc, between α = 4 and 5]

generically SAT for α < αc
generically UNSAT for α > αc

Friedgut: → step function
Threshold phenomenon → Phase transition
[Figure: %SAT and computing time vs α = M/N; the computing time peaks near αc]

Computer time:
Easy, and generically SAT, for α < αc
Hard, in the region α ∼ αc
Easy, generically UNSAT, for α > αc
Statistical physics of the random 3-SAT problem
Monasson, Zecchina, Weigt, Biroli, ..., MM, Parisi, Zecchina: → phase diagram + new algorithm.

1- Analytic result: discontinuous glass transition. Three phases: Easy-SAT, Hard-SAT, UNSAT.

[Phase diagram vs α = M/N: one state with E = 0 for α < αd (SAT); many states with E = 0 for αd < α < αc (Hard-SAT); many states with E > 0 for α > αc = 4.267 (UNSAT)]

2- New algorithm: survey propagation (N = 10^7 at α = 4.23)
Simple mean field message passing: warning propagation (Min-Sum)

[Figure: clause a connected to variables 1, 2, 3; message u_{a→1} = 1]

Message u_{a→1} ∈ {0, 1}, sent from clause a to variable 1.
Simple message passing: warning propagation
[Figure: clause a connected to variables 1, 2, 3; the incoming warnings force variables 2 and 3, so a sends u_{a→1} = 1]

Warning u_{a→i} = 1:
"According to the messages I received, you should take the value which satisfies me!"
Simple message passing: warning propagation
[Figure: clause a connected to variables 1, 2, 3; the incoming warnings leave variable 2 or 3 free, so a sends u_{a→1} = 0]

No warning u_{a→i} = 0:
"No problem, take any value!"

Warning propagation (= Min-Sum) converges and gives the correct answer on a tree: SAT iff there is no contradictory message.
On real random 3-SAT instances: limited to α < 3.9; it cannot get close to the SAT-UNSAT transition.
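A minimal warning-propagation sketch, with the message conventions assumed rather than taken from the slides: u_{a→i} = 1 iff every other variable of clause a is forced, by the signed warnings it receives from its other clauses, to the value that does not satisfy a; a variable warned in both directions signals a contradiction (UNSAT):

```python
def warning_propagation(clauses, max_iter=100):
    # Clauses are tuples of nonzero ints: +v for x_v, -v for NOT x_v
    # (no variable repeated inside a clause). Returns (local fields, contradiction?).
    sign = {(a, abs(l)): (1 if l > 0 else -1)
            for a, cl in enumerate(clauses) for l in cl}
    var_clauses = {}
    for (a, j) in sign:
        var_clauses.setdefault(j, []).append(a)
    u = {e: 0 for e in sign}                      # warnings u_{a->i} in {0, 1}
    for _ in range(max_iter):
        new_u = {}
        for (a, i) in u:
            # a warns i iff every other variable j of clause a is forced,
            # by its other clauses, to the value that does NOT satisfy a.
            w = 1
            for l in clauses[a]:
                j = abs(l)
                if j == i:
                    continue
                h = sum(u[(b, j)] * sign[(b, j)]
                        for b in var_clauses[j] if b != a)
                if h * sign[(a, j)] >= 0:         # j is still free to satisfy a
                    w = 0
                    break
            new_u[(a, i)] = w
        if new_u == u:
            break
        u = new_u
    fields = {j: sum(u[(a, j)] * sign[(a, j)] for a in var_clauses[j])
              for j in var_clauses}
    contradiction = any(
        any(u[(a, j)] and sign[(a, j)] > 0 for a in var_clauses[j]) and
        any(u[(a, j)] and sign[(a, j)] < 0 for a in var_clauses[j])
        for j in var_clauses)
    return fields, contradiction
```

For example, on the tree formula (x1) ∧ (¬x1 ∨ x2) the fixed point warns both variables towards 1 with no contradiction, while (x1) ∧ (¬x1) produces contradictory warnings on x1.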
Replica symmetry breaking
Minimum Energy Configurations (MEC): the energy cannot be lowered by a finite number of flips.
State/Cluster = {MEC connected by finite flips} → one fixed point of WP.

Proliferation of states: at α > αd, many states:

N(E) ∼ exp(N Σ(E/N))
[Figure: complexity Σ vs E/N for αd < α < αc, α = αc, and α > αc]

Σ(0) → clusters of SAT configurations
Σ(e_th) → metastable clusters
From warning propagation to survey propagation
RSB: assume many states: N(E) ∼ exp(N Σ(E/N)).
Message = survey of the elementary warnings in the various states:

η_{a→i} = probability of a warning being sent from constraint a to variable i, when a state is picked at random.

→ Propagate the surveys along the graph. Converges!

→ Results on the phase diagram and the complexity, from the statistical analysis of the distribution of surveys in a generic sample.

→ Information on a single sample: a local field on each variable → new algorithmic strategies.
Survey propagation
[Figure: clause a with variables 1, 2, 3; variable 2 also belongs to clause b, which sends the survey η_{b→2}]

η_{a→1} = Prob(warning): known exactly from the surveys of incoming warnings.
Statistical analysis of the SP equations in random K-SAT: phase diagram

Thresholds from an integral equation, solved numerically or through a large-K asymptotic expansion.

αc: SAT-UNSAT threshold.

αd: onset of clustering → clusters with frozen variables.
K     αd      αc        α_c^(7)
3     3.93    4.2667    4.307
4     8.30    9.931     9.938
5     16.1    21.117    21.118
6     30.5    43.37     43.372
7     57.2    87.79     87.785
8     107.2   176.543
9     201.3   354.010
10    379.1   708.915
αc is conjectured to be exact (not αd).
Using the surveys: local field

In one given cluster of solutions α:
H^α_j = ∑_a u_{a→j}

H^α_j > 0: number of warnings telling "x_j should be one"
H^α_j < 0: number of warnings telling "x_j should be zero"
H^α_j = 0: no warning

→ Survey of the local field:
P_j(H) = probability that H^α_j = H, when α is chosen at random.
[Figure: distribution P(H) over H = −3, ..., 3, with weights W−, W0, W+ on H < 0, H = 0, H > 0]

Some types of variables:
Balanced: W± ≈ 1/2, W0 ≈ 0
Polarized: W+ ≈ 1 or W− ≈ 1
Underconstrained: W0 ≈ 1
Survey Inspired Decimation
Biased variable, W^i_+ ≈ 1: in almost all clusters of solutions, x_i = 1.
→ Fix x_i = 1

SID algorithm, iterate:
Run SP until convergence.
Find the most biased variable: i such that |W^i_+ − W^i_−| is maximal.
Fix it to x_i = 1 if W^i_+ > W^i_−, to x_i = 0 if W^i_+ < W^i_−, and simplify the formula.

Two possible ends: 1) all variables are fixed; 2) the formula is reduced to a stage where all W^i_0 = 1: an underconstrained problem, easily solved by e.g. simulated annealing or WalkSAT.

Solves 10^7 variables at α ≈ 4.2-4.25. Time O(N²), reduced to O(N) by fixing a fraction of the variables.
Survey decimation example
Number of clusters of assignments which violate E clauses: e^{Σ(E)}.
N = 10000, plot every 500 decimation steps.

[Figure: Σ vs E, curves shifting as the decimation process proceeds]
Glass phase in LDPC codes
Binary Symmetric Channel, flip probability p.

Complexity of the landscape (configurations on the sphere):
Σ(e) = (1/N) log N(E = Ne)

[Figure: Σ(e) for the (6,5) regular code at p = 0.155, 0.2, pc, 0.3; pd = 0.139, pc = 0.264]

pd = threshold for BP decoding
pc = threshold for optimal decoding
Miscellaneous comments
General approach to many constraint satisfaction networks, when the factor graph has a local tree structure (large girth).

Simple case (low density of constraints): the RS cavity method is OK, e.g. decoding with belief propagation at low enough noise.

Increasing density, 1RSB: many pure states → statistical physics in the space of pure states. Phase diagrams for K-SAT, q-colouring, LDPC codes, ...
Generic picture: SAT → Hard-SAT (clusters) → UNSAT

[Phase diagram vs α = M/N: one state with E = 0 for α < αd; many states with E = 0 for αd < α < αc; many states with E > 0 for α > αc = 4.267]
Miscellaneous comments
Always "tree computations" (= iterative mappings of probability distributions), but with different interpretations.

Algorithmic implementation (single instance): belief propagation, survey propagation. Very powerful.

Statistical analysis: typical samples, typical configurations, viewed from a typical point: phase diagrams.

Some predictions are rigorously confirmed (weighted matching, clusters in the hard-SAT phase, satisfiability threshold as an upper bound, ...).
Appendix 1: Survey propagation equations
[Figure: clause a with variables 1, 2, 3; η_{a→1} = Prob(warning); for variable 2 (resp. 3), U and V (resp. W and X) are the sets of other clauses sending surveys η_{b→2} (resp. η_{b→3})]
π^2_+ = ∏_{b∈U} (1 − η_{b→2}),   π^2_− = ∏_{b∈V} (1 − η_{b→2})

P(no contradiction): π^2_+ + π^2_− − π^2_+ π^2_−

q_2 ≡ Prob(x_2 = 1) = π^2_− (1 − π^2_+) / (π^2_+ + π^2_− − π^2_+ π^2_−)

q_3 ≡ Prob(x_3 = 0) = π^3_+ (1 − π^3_−) / (π^3_+ + π^3_− − π^3_+ π^3_−)

η_{a→1} = q_2 q_3

Survey propagation: statistical analysis, or single sample → algorithms.
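The appendix update can be written directly as code. The sketch below assumes the convention suggested by the figure (U, V are the clauses sending surveys to variable 2 on its two sides, W, X likewise for variable 3; which list plays which role is an assumption, not stated on the slides). With no incoming warnings the outgoing survey vanishes, and fully forced neighbours give η = 1:

```python
def sp_clause_update(etas_U, etas_V, etas_W, etas_X):
    # One SP update eta_{a->1} for a 3-clause a = (1, 2, 3), following the
    # appendix formulas; etas_* are lists of incoming surveys eta_{b->2}, eta_{b->3}.
    def prod(etas):
        p = 1.0
        for e in etas:
            p *= 1.0 - e
        return p

    pi2_plus, pi2_minus = prod(etas_U), prod(etas_V)
    pi3_plus, pi3_minus = prod(etas_W), prod(etas_X)
    # q_2 = Prob(x_2 = 1), q_3 = Prob(x_3 = 0), conditioned on no contradiction.
    q2 = (pi2_minus * (1 - pi2_plus)
          / (pi2_plus + pi2_minus - pi2_plus * pi2_minus))
    q3 = (pi3_plus * (1 - pi3_minus)
          / (pi3_plus + pi3_minus - pi3_plus * pi3_minus))
    return q2 * q3   # eta_{a->1}
```

For instance, with a single incoming survey of 0.5 on each of the forcing sides, sp_clause_update([0.5], [], [], [0.5]) gives q_2 = q_3 = 0.5, hence η = 0.25.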
Appendix 2: Origins of the cavity method

1975: definition of the SK model of spin glasses, E = −∑_{ij} J_{ij} s_i s_j.
1979: Parisi's solution of this model with replicas.
1986: an alternative approach, the cavity method (Mézard, Parisi, Virasoro): a direct probabilistic approach, based on N → N + 1 but using N ≫ 1; equivalent to the replica approach.
2001: a new version of the cavity method to handle 'finite connectivity' problems (Mézard, Parisi).
2002: applications to XORSAT, K-SAT, colouring, ... → phase diagrams (thresholds) and algorithms (survey propagation).
2003: rigorous confirmation of Parisi's solution for the SK model (Talagrand, Guerra).