Statistical Physics Tools in Information Science

Marc Mezard (1) and Andrea Montanari (2)

(1) Universite de Paris Sud and (2) Stanford University

June 23, 2007

Structure of the presentation

Andrea: What is statistical physics and why should you care.

Marc: Two test cases: (1) counting matchings, (2) random k-SAT.

Ask whatever you want!

Sources

General: M. Mezard and A. Montanari, 'Information, Physics and Computation', upcoming book → our web pages

Random k-SAT: M. Mezard, G. Parisi, and R. Zecchina, 'Analytic and Algorithmic Solution of Random Satisfiability Problems', Science

F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, L. Zdeborova, 'Gibbs States and the Set of Solutions of Random Constraint Satisfaction Problems', PNAS

Coding: A. Montanari and R. Urbanke, 'Modern Coding Theory: The Statistical Mechanics and Computer Science Point of View', lecture notes

General graphical models: google ee374

Outline

1 Problems

2 Methods

3 Results

4 The cavity method at work

5 Mean Field (BP) on graphical models

6 Matching

7 K-SAT

8 Appendices

Problems

Probabilistic description of a physical system

State: x = (x_1, ..., x_N), x_i ∈ X

Inverse temperature: β

Energy: E : x ↦ E(x) ∈ R

(Boltzmann) probability distribution:

$$\mu(x) = \frac{1}{Z}\exp\{-\beta E(x)\}$$

Without the temperature:

State: x = (x_1, ..., x_N), x_i ∈ X

Energy: E : x ↦ E(x) ∈ R

$$\mu(x) = \frac{1}{Z}\exp\{-E(x)\}$$

Without the energy, using a generic weight w(x):

$$\mu(x) = \frac{1}{Z}\, w(x)$$

Without 'Boltzmann': what remains is simply a probability distribution over states x.

What is left? An example

L × L grid: G = (V, E), x_i ∈ X = {0, 1}, i ∈ V

$$\mu(x) = \frac{1}{Z(\lambda;G)}\,\lambda^{|x|}\;\mathbb{I}\{x \text{ is an independent set}\}$$

What is left? Locality

L × L grid: G = (V, E), x_i ∈ X = {0, 1}, i ∈ V

$$\mu(x) = \frac{1}{Z(\lambda;G)}\prod_{i\in V}\lambda^{x_i}\prod_{(ij)\in E}\mathbb{I}\{(x_i,x_j)\neq(1,1)\}$$
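The product form above can be checked by brute force on a very small grid. The following is a minimal sketch, not part of the original slides: the helper names `grid_edges`, `weight` and `partition_function` are hypothetical, and the enumeration over all 2^(L·L) configurations is only feasible for tiny L.

```python
from itertools import product

def grid_edges(L):
    """Edges of the L x L grid; vertex (r, c) is indexed as r * L + c."""
    edges = []
    for r in range(L):
        for c in range(L):
            if c + 1 < L:
                edges.append((r * L + c, r * L + c + 1))
            if r + 1 < L:
                edges.append((r * L + c, (r + 1) * L + c))
    return edges

def weight(x, edges, lam):
    """Unnormalized weight: prod_i lam^{x_i} * prod_{(ij)} I{(x_i, x_j) != (1, 1)}."""
    for i, j in edges:
        if x[i] == 1 and x[j] == 1:
            return 0.0  # an occupied edge violates the independent-set constraint
    return lam ** sum(x)

def partition_function(L, lam):
    """Z(lam; G): sum of the weights over all 0/1 configurations of the grid."""
    edges = grid_edges(L)
    return sum(weight(x, edges, lam) for x in product((0, 1), repeat=L * L))
```

For the 2 × 2 grid (a 4-cycle) the independent sets are the empty set, four single vertices and two diagonal pairs, so the sketch should reproduce Z = 1 + 4λ + 2λ².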

A more abstract version of locality

G = (V, E), V = [N], x = (x_1, ..., x_N) ∈ {0, 1}^V

$$\mu(x) = \frac{1}{Z(\lambda;G)}\prod_{i\in V}\lambda^{x_i}\prod_{(ij)\in E}\mathbb{I}\{(x_i,x_j)\neq(1,1)\}$$

More generally:

[Figure: a graph on 12 vertices x_1, ..., x_12]

G = (V, E), V = [N], x = (x_1, ..., x_N) ∈ X^N

$$\mu(x) = \frac{1}{Z}\prod_{(ij)\in G}\psi_{ij}(x_i,x_j)$$

Statistical mechanics questions: I. Qualitative

What does a typical configuration sampled from µ look like?

Disordered versus ordered

Liquid versus solid

Statistical mechanics questions: II. Quantitative

L × L grid: N = L^2

Compute (for N large)

$$\phi_N(\lambda) = \frac{1}{N}\log Z(G;\lambda) = \frac{1}{N}\log\sum_{x\in IS(G)}\lambda^{|x|}$$
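For moderate L, φ_N(λ) on the grid can be evaluated exactly by summing row by row (a transfer-matrix computation). This is a sketch under stated assumptions and is not taken from the slides: free boundary conditions are assumed, and `row_states`, `compatible`, `log_Z_grid` and `phi` are hypothetical names.

```python
import math
from itertools import product

def row_states(L):
    """All 0/1 rows of length L with no two horizontally adjacent ones."""
    return [s for s in product((0, 1), repeat=L)
            if all(not (s[c] and s[c + 1]) for c in range(L - 1))]

def compatible(s, t):
    """Two consecutive rows are compatible if no column is occupied in both."""
    return all(not (a and b) for a, b in zip(s, t))

def log_Z_grid(L, lam):
    """log Z(G; lam) for the L x L grid with free boundaries."""
    states = row_states(L)
    # vec[k]: total weight of all valid fillings of the rows so far ending in states[k]
    vec = [lam ** sum(s) for s in states]
    for _ in range(L - 1):
        vec = [lam ** sum(t) * sum(v for s, v in zip(states, vec) if compatible(s, t))
               for t in states]
    return math.log(sum(vec))

def phi(L, lam):
    """phi_N(lam) = (1/N) log Z with N = L^2."""
    return log_Z_grid(L, lam) / (L * L)
```

The cost grows with the number of valid rows rather than with 2^N, so values of L well beyond the reach of direct enumeration become accessible.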

Isn’t Z just an irrelevant normalization constant?

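$$
\begin{aligned}
H(X) &= -\sum_{x}\mu(x)\log\mu(x)\\
&= \log Z(\lambda;G) - \sum_{x\in IS(G)}\mu(x)\,|x|\log\lambda\\
&= \log Z(\lambda;G) - \log\lambda\sum_{i\in V}\langle x_i\rangle_G\\
&= \log Z(\lambda;G) - \log\lambda\,\frac{\partial\log Z(\lambda;G)}{\partial\log\lambda}
\end{aligned}
$$

[this relation is completely general]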
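The chain of equalities can be verified numerically on a small graph. The sketch below is illustrative only: the 5-cycle, the value λ = 2 and the function names are arbitrary choices, not from the slides. It computes the entropy directly and via log Z, using the fact that the mean set size Σ_i ⟨x_i⟩ equals ∂ log Z / ∂ log λ.

```python
import math
from itertools import product

def independent_sets(n, edges):
    """Yield every 0/1 vector of length n with no edge having both endpoints set."""
    for x in product((0, 1), repeat=n):
        if all(not (x[i] and x[j]) for i, j in edges):
            yield x

def entropy_two_ways(n, edges, lam):
    sets = list(independent_sets(n, edges))
    Z = sum(lam ** sum(x) for x in sets)
    mu = [lam ** sum(x) / Z for x in sets]
    direct = -sum(p * math.log(p) for p in mu)             # -sum mu log mu
    mean_size = sum(p * sum(x) for p, x in zip(mu, sets))  # sum_i <x_i>
    via_Z = math.log(Z) - math.log(lam) * mean_size
    return direct, via_Z

# Hypothetical test graph: a cycle on 5 vertices.
cycle5 = [(i, (i + 1) % 5) for i in range(5)]
direct, via_Z = entropy_two_ways(5, cycle5, lam=2.0)
```

The two returned values should agree up to floating-point error.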

Questions I and II are related!

$$\Delta(x) = \sum_{i\in\mathrm{EVEN}} x_i - \sum_{i\in\mathrm{ODD}} x_i$$

$$\phi_N(\lambda,\delta) = \frac{1}{N}\log Z(G;\lambda,\delta) = \frac{1}{N}\log\sum_{x:\,\Delta(x)=N\delta}\lambda^{|x|}$$

Liquid

[Figure: φ_N(λ, δ) plotted against δ in the liquid phase]

Solid

[Figure: φ_N(λ, δ) plotted against δ in the solid phase, with the bottleneck and its size B marked]

Theorem (Mossel/Weitz/Wormald/06)

On a random sparse bipartite graph, B = Θ(1) with high probability for λ > λ∗.

Similar theorem for Ising models [A. Gerschenfeld/AM/07]

An artistic view of µ in the solid phase

[Figure: sketch of µ along the δ axis, with δ = 0 marked]

What about non-bipartite graphs?

Frustration

[Figure: a graph with two vertices marked '?']

No 'simple ordering' ⇒ solid amorphous state?

[Solid + Amorphous = Glass]

How do you define ‘solid’?

i ∈ V

B(i, r): ball of radius r around i

$$x_{\sim i,r} = \{x_j : j \notin B(i,r)\}$$

Liquid: I(X_i ; X_{∼i,r}) → 0 as r → ∞

Solid: I(X_i ; X_{∼i,r}) → I_∞ > 0 as r → ∞

Methods

Mean Field Methods

Mean field ******

Mean field methods: A family of techniques for approximate calculations in statistical mechanics and graphical models (and more: Markov chains, queuing theory, stochastic networks, etc.).

Mean field models: A class of models on which mean field methods are asymptotically exact in the large system limit.

The simplest mean field calculation

[Figure: vertex i and its neighborhood ∂i]

µ_A(·): marginal of X_A, A ⊆ V

$$
\mu_i(1) = \sum_{x_{\partial i}}\mu_{i|\partial i}(1\,|\,x_{\partial i})\,\mu_{\partial i}(x_{\partial i})
= \frac{\lambda}{1+\lambda}\,\mu_{\partial i}(0)
\approx \frac{\lambda}{1+\lambda}\prod_{j\in\partial i}\mu_j(0)
= \frac{\lambda}{1+\lambda}\prod_{j\in\partial i}\bigl(1-\mu_j(1)\bigr)
$$
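To see which step of this calculation is exact and which is the approximation, one can enumerate a small instance. The sketch below is an illustration under assumptions of mine: the 4-cycle, λ = 1 and the name `marginals_check` are not from the slides. It returns the exact µ_i(1), the exact identity (λ/(1+λ)) µ_∂i(0), where µ_∂i(0) is the probability that all neighbours of i are empty, and the mean field value obtained by factorizing µ_∂i(0) over the neighbours.

```python
from itertools import product

def marginals_check(n, edges, lam, i):
    nbrs = [b if a == i else a for a, b in edges if i in (a, b)]
    Z = 0.0
    p_i1 = 0.0         # accumulates weight of configurations with x_i = 1
    p_nbrs0 = 0.0      # accumulates weight with every neighbour of i empty
    p_j0 = [0.0] * n   # accumulates weight with x_j = 0, for each j
    for x in product((0, 1), repeat=n):
        if any(x[a] and x[b] for a, b in edges):
            continue   # not an independent set
        w = lam ** sum(x)
        Z += w
        p_i1 += w * x[i]
        p_nbrs0 += w * all(x[j] == 0 for j in nbrs)
        for j in range(n):
            p_j0[j] += w * (x[j] == 0)
    exact = p_i1 / Z
    identity = lam / (1 + lam) * p_nbrs0 / Z
    factorized = 1.0
    for j in nbrs:
        factorized *= p_j0[j] / Z
    mean_field = lam / (1 + lam) * factorized
    return exact, identity, mean_field

# Hypothetical example: vertex 0 of a 4-cycle at lam = 1.
exact, identity, mean_field = marginals_check(4, [(0, 1), (1, 2), (2, 3), (3, 0)], 1.0, 0)
```

On this example the first two values coincide (2/7), while the factorized value is 25/98: the neighbours of vertex 0 are correlated through vertex 2, which is exactly what the mean field step neglects.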

Solving the equations

Bipartite, degree k + 1, assume

$$\mu_i(1) = \begin{cases} p_1 & \text{if } i\in\mathrm{EVEN},\\ p_2 & \text{if } i\in\mathrm{ODD}.\end{cases}$$

Then, MF equations are

$$p_1 = f_\lambda(p_2)\,,\qquad p_2 = f_\lambda(p_1)$$

where

$$f_\lambda(x) = \lambda(1+\lambda)^{-1}(1-x)^{k+1}$$
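These two equations can be solved by simple iteration. The following is a minimal sketch, assuming a parallel update started from the asymmetric point (p1, p2) = (0, 1); the function names, the starting point, the number of iterations and the parameter values in the example are illustrative choices, not from the slides.

```python
def f(x, lam, k):
    """f_lam(x) = lam / (1 + lam) * (1 - x)^(k + 1)."""
    return lam / (1.0 + lam) * (1.0 - x) ** (k + 1)

def iterate_mf(lam, k, p1=0.0, p2=1.0, n_iter=200):
    """Iterate p1 <- f(p2), p2 <- f(p1) in parallel and return the final pair."""
    for _ in range(n_iter):
        p1, p2 = f(p2, lam, k), f(p1, lam, k)
    return p1, p2

# Illustrative values with k = 3 (degree 4):
liquid = iterate_mf(lam=0.2, k=3)   # small lam: p1 and p2 end up equal
solid = iterate_mf(lam=5.0, k=3)    # large lam: p1 and p2 stay different
```

For small λ the iteration settles on a symmetric solution p1 = p2 (liquid), while for large λ the symmetric fixed point is unstable under this iteration and the two sublattices keep different densities (solid).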

Solving the equations (continued)

[Figure: two panels, each plotting the iterates p_1(ℓ) and p_2(ℓ) on axes running from 0 to 0.7]

Liquid vs Solid

The family of mean field approximations

| Method | Basic intuition | Asympt. exact for |
|---|---|---|
| Naive mean field | Neglects correlations | Some dense G's |
| Bethe-Peierls | 'Nearest neighbors' correlations | Some sparse random G's |
| Cavity (a) | As BP + glassy states | 'Any' sparse random G |
| Kikuchi (b) | Short loops / nonperturbative | ??? |
| Loop corrections (c) | Loops / perturbative | *** |

(a) Mezard/Parisi, ... (b) Kikuchi, Yedidia/Freeman/Weiss (c) AM/Rizzo, Parisi/Slanina, Chernyak/Chertkov

The family of mean field approximations

| Method | Basic intuition | Algorithmic version |
|---|---|---|
| Naive mean field | Neglects correlations | Mean field |
| Bethe-Peierls | 'Nearest neighbors' correlations | Belief Propagation |
| Cavity (a) | As BP + glassy states | Survey Propagation |
| Kikuchi (b) | Short loops / nonperturbative | Generalized BP |
| Loop corrections (c) | Loops / perturbative | Loop-corrected BP |

(a) Mezard/Parisi, ... (b) Kikuchi, Yedidia/Freeman/Weiss (c) AM/Rizzo, Parisi/Slanina, Chernyak/Chertkov

‘Any sparse random graph?’

Caveats

Many indications (rigorous and non-rigorous), but no proof.

'Sparse random graph' is a bit vague.

Can define a family of ensembles.

‘Any sparse random graph?’

Factor graph G = (V, F, E)

[Figure: factor graph with variable nodes x_1, ..., x_10 (variables x_i ∈ X) and factor nodes, e.g. ψ_a(x_5, x_7, x_9, x_10)]

$$\mu(x) = \frac{1}{Z}\prod_{a\in F}\psi_a(x_{\partial a})$$

$$\partial a \equiv \{i\in V : (i,a)\in E\}$$

Graph ensemble

[Figure: factor nodes grouped by degree (degree 2, degree 3, ..., degree d_max) and variable nodes grouped by degree (degree 2, degree 3, ..., degree d_max), with their sockets joined through a random permutation π]

[∼ irregular LDPC ensembles]
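One way to read the picture is as a socket-matching construction: each node gets as many sockets as its degree, and a uniformly random permutation pairs variable sockets with factor sockets. The sketch below follows that reading; the function name, the argument format and the example degree sequences are assumptions, and multiple edges between the same pair of nodes are not excluded.

```python
import random

def sample_factor_graph(var_degrees, factor_degrees, seed=None):
    """Return a list of edges (i, a) linking variable i to factor a."""
    var_sockets = [i for i, d in enumerate(var_degrees) for _ in range(d)]
    fac_sockets = [a for a, d in enumerate(factor_degrees) for _ in range(d)]
    if len(var_sockets) != len(fac_sockets):
        raise ValueError("total variable degree must equal total factor degree")
    rng = random.Random(seed)
    rng.shuffle(fac_sockets)  # plays the role of the random permutation pi
    return list(zip(var_sockets, fac_sockets))

# Hypothetical degree sequences: 5 variables and 4 factors, 12 sockets on each side.
edges = sample_factor_graph([2, 2, 3, 3, 2], [3, 3, 3, 3], seed=0)
```

By construction every node keeps its prescribed degree, whatever permutation is drawn.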

Compatibility functions ensemble

Assign, for d ∈ {1, ..., d_max}, a set of functions

$$\bigl\{\psi^{(d,r)} : \underbrace{X\times\cdots\times X}_{d}\to\mathbb{R}_+\bigr\}_{r=1,2,\dots}$$

and a distribution {p_d(r)} (p_d(r) ≥ 0, Σ_r p_d(r) = 1).

Then, for each function node a of degree d(a):

$$\psi_a = \psi^{(d(a),r)} \quad\text{independently, with probability } p_{d(a)}(r)$$

The cavity method: A high-level view

0. Cavity method = Replica method

The replica method is formal, while the cavity method makes some probabilistic assumptions.

The cavity method: A high-level view

1. What does 'asymptotically exact' mean?

Partition function:

$$\lim_{N\to\infty}\frac{1}{N}\log Z_N = \phi^{\mathrm{cavity}}\quad\text{almost surely.}$$

Marginals:

$$\lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\bigl\|\mu_i-\mu_i^{\mathrm{cavity}}\bigr\|_{TV} = 0\quad\text{almost surely.}$$

The cavity method: A high-level view

2. Naive mean field → µ_i ≈ ν_i (vertex quantities)

Cavity → ν_{i→j} (messages)

The cavity method: A high-level view

3. A hierarchy

| Std terminology | Cavity jargon | Message space |
|---|---|---|
| Bethe-Peierls | RS (0RSB) | M_0 = distributions over X |
| *** | 1RSB | M_1 = distributions over M_0 |
| *** | 2RSB | M_2 = distributions over M_1 |
| *** | 3RSB | M_3 = distributions over M_2 |
| ... | ... | ... |
| *** | ∞RSB | ??? |

Results

A list of models from...

Coding

Multi-user detection

Stochastic networks

Channel coding

x = (x_1 ... x_N) → [BMS channel] → y = (y_1 ... y_N)

Channel transition probability {Q(y|x)}.

Codeword: x ∈ {0, 1}^N

$$H x = 0 \mod 2$$

LDPC codes [Gallager, MacKay, Luby et al.]

[Figure: factor graph with check nodes x_1 ⊕ x_2 ⊕ x_3 ⊕ x_4 = 0, ..., x_5 ⊕ x_6 ⊕ x_8 = 0 connected to variable nodes x_1, ..., x_8; in the second version each variable x_i is also attached to its channel output y_i]

$$\mu_y(x) = \frac{1}{Z_N(y)}\,\mathbb{I}(x_1\oplus x_2\oplus x_3\oplus x_4 = 0)\cdots\mathbb{I}(x_5\oplus x_6\oplus x_8 = 0)\cdot Q(y_1|x_1)\cdots Q(y_8|x_8)$$
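The posterior µ_y(x) can be tabulated exactly for a toy code. The sketch below is illustrative only: the three parity checks in `CHECKS` are a hypothetical example (the checks on the slide are partly elided and are not reproduced here), and the channel is assumed to be a binary symmetric channel with flip probability p, one particular BMS channel.

```python
from itertools import product

# Hypothetical toy code, not the one drawn on the slide:
# each tuple lists the variables entering one parity check.
CHECKS = [(0, 1, 2), (2, 3, 4), (0, 3, 5)]
N = 6

def is_codeword(x):
    """True if every parity check sums to 0 mod 2."""
    return all(sum(x[i] for i in check) % 2 == 0 for check in CHECKS)

def posterior(y, p):
    """mu_y(x) over codewords, for a binary symmetric channel with flip probability p."""
    weights = {}
    for x in product((0, 1), repeat=N):
        if is_codeword(x):
            w = 1.0
            for yi, xi in zip(y, x):
                w *= (1 - p) if yi == xi else p   # Q(y_i | x_i)
            weights[x] = w
    Z = sum(weights.values())                      # Z_N(y)
    return {x: w / Z for x, w in weights.items()}

# Received word: the all-zero codeword with its first bit flipped.
post = posterior((1, 0, 0, 0, 0, 0), p=0.1)
best = max(post, key=post.get)
```

With three independent checks on six bits there are 8 codewords, and for this received word the most probable one is the all-zero codeword.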

Some results

Saad/Kabashima et al., AM/Sourlas (replica method)

$$\phi = \lim_{N\to\infty}\frac{1}{N}\,\mathbb{E}\log Z_N(Y)\;\Rightarrow\;[\text{Conditional entropy per bit } H(X|Y)/N]$$

Proof: Lower bound → AM, Macris

Upper bound: Measson/AM/Urbanke (BEC)

Multi-user detection (CDMA channel)

N users: x ≡ (x_1, x_2, ..., x_N), x_i ∈ {+1, −1} i.i.d. uniform

M chips: y = (y_1, y_2, ..., y_M), y_a ∈ R

$$y_a = s_{a1}x_{i_1(a)} + \cdots + s_{ak}x_{i_k(a)} + w_a$$

w_a ∼ Normal(0, σ^2), {s_{ai}} spreading sequences

[Figure: factor graph linking users x_1, ..., x_8 to chips, each chip with its own noise term, e.g.]

$$y_1 = (+x_1 - x_2 + x_3 + x_4) + w_1\,,\quad\dots\,,\quad y_6 = (-x_5 - x_6 + x_8) + w_6$$

A posteriori distribution: µ_y(x) ≡ P{x|Y} → graphical model...

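$$\mu_y(x) = \frac{1}{Z_K(y)}\prod_{a=1}^{N}\frac{1}{\sqrt{2\pi\sigma^2}}\exp\Bigl\{-\frac{1}{2\sigma^2}\Bigl(y_a-\sum_{l}s_{al}\,x_{i_l(a)}\Bigr)^2\Bigr\}$$

(In this formula the product runs over the chips a = 1, ..., N, and K denotes the number of users.)

Tanaka (replica method)

$$\phi = \lim_{K\to\infty}\frac{1}{K}\,\mathbb{E}\log Z_K(Y)\;\Rightarrow\;[\text{Capacity per user}]$$

Several generalizations: Guo/Verdu, Caire et al., Kabashima et al.

Proof: AM/Tse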

Channel assignment in cellular networks

n_i ≥ 0, number of channels in cell i

$$\mu(n) = \frac{1}{Z}\prod_{i\in V}\frac{\lambda_i^{n_i}}{n_i!}\prod_{(ij)\in E}\mathbb{I}(n_i+n_j\le C)$$

Z → loss probability
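For a handful of cells, Z can be computed by direct enumeration. The following is a minimal sketch with assumed names and inputs: it takes for granted that every cell has at least one neighbour, so that each n_i is bounded by C, and it does not attempt to derive the loss probability from Z.

```python
from itertools import product
from math import factorial

def channel_Z(n_cells, edges, lam, C):
    """Z = sum_n prod_i lam_i^{n_i} / n_i! * prod_{(ij) in E} I(n_i + n_j <= C).

    Assumes every cell appears in at least one edge, so 0 <= n_i <= C.
    """
    Z = 0.0
    for n in product(range(C + 1), repeat=n_cells):
        if all(n[i] + n[j] <= C for i, j in edges):
            w = 1.0
            for li, ni in zip(lam, n):
                w *= li ** ni / factorial(ni)
            Z += w
    return Z

# Hypothetical example: two neighbouring cells sharing C = 2 channels, lam_i = 1.
Z_two_cells = channel_Z(2, [(0, 1)], [1.0, 1.0], C=2)
```

In this example the allowed pairs are (0,0), (0,1), (1,0), (1,1), (0,2) and (2,0), with weights 1, 1, 1, 1, 1/2 and 1/2, giving Z = 5.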

END OF FIRST HALF

BEGINNING OF SECOND HALF

Cavity method: general (heuristic) framework

1- Draw the factor graph

2- Write elementary “mean field (BP) equations”, assuming that the local environment of a variable in the factor graph is a tree

3- Two ways to use them: a) Statistical analysis of the equations in a graph ensemble. b) Iteration of the message passing on a single instance (belief propagation)

4- Check the existence of “Replica Symmetry Breaking” = dependence of the root on the boundaries, using typical boundaries

5- If needed, write the 1RSB cavity equations → survey propagation ....

Factor graphs for graphical models

Many discrete variables x_i, many constraints f_a(X_a), each involving a small number of variables. Factor graph:

[Figure: factor graph with variable nodes 1, ..., 5 and factor nodes a, ..., e]

$$P(x_1,\dots,x_5) = \frac{1}{Z}\,f_a(x_1,x_2,x_3,x_4)\,f_b(x_1,x_2,x_3)\,f_c(x_2,x_4,x_5)\,f_d(x_1,x_2,x_5)\,f_e(x_1,x_3,x_5)$$

Q: Estimate marginals. Ubiquitous: inference, coding, combinatorial optimization, physics....

NB: In physics, 'energy', 'temperature':

$$f_a(x_1,x_2,x_3,x_4) = e^{-\beta E_a(x_1,x_2,x_3,x_4)}$$

Locally tree-like factor graph

In LDPC error correcting codes, random K-satisfiability, colouring of random Erdős-Rényi graphs, matching in random graphs, etc.: the factor graph is locally tree-like.

Ex: random 3-SAT

[Figure: factor graph of a random 3-SAT instance; loops have length of order log N]

Simple mean field recursion: merge rooted trees

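[Figure: rooted trees with roots 1, 2 (attached to factor a) and 3, 4 (attached to factor b) are merged at a new root 0; incoming messages $m_1(x_1), \dots, m_4(x_4)$, factor-to-root messages $\mu_a(x_0)$, $\mu_b(x_0)$, outgoing message $m_0(x_0)$]

$$\mu_a(x_0) = \sum_{x_1, x_2} m_1(x_1)\, m_2(x_2)\, f_a(x_1, x_2, x_0)$$

$$\mu_b(x_0) = \sum_{x_3, x_4} m_3(x_3)\, m_4(x_4)\, f_b(x_3, x_4, x_0)$$

$$m_0(x_0) = C\, \mu_a(x_0)\, \mu_b(x_0)$$

$m_0 = F(m_1, m_2, m_3, m_4)$ = Belief propagation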


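Belief propagation = iteration of mean field equations on one instance

$$m_{i \to a}(x_i) = C \prod_{b \in V(i) \setminus a} \mu_{b \to i}(x_i)$$

$$\mu_{a \to i}(x_i) = \sum_{\{x_j\},\, j \in V(a) \setminus i} f_a(x_i, \{x_j\}) \prod_{j \in V(a) \setminus i} m_{j \to a}(x_j)$$

Marginal on i ("belief"): $p_i(x_i) = C \prod_{b \in V(i)} \mu_{b \to i}(x_i)$

Marginal around node a: $P_a(X_a) = C\, f_a(X_a) \prod_{j \in V(a)} m_{j \to a}(x_j)$

Entropy (exact on tree):

$$P(x) \simeq C \prod_a P_a(X_a) \prod_i p_i(x_i)^{1 - d_i}; \qquad S = -\sum_x P(x) \log P(x)$$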
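The two message updates can be iterated directly on a small instance. The following is a minimal sketch for binary variables, assuming each factor is stored as a scope plus a NumPy table; the function names `run_bp` and `exact_marginals` and the data layout are illustrative choices, not part of the talk. On a tree the resulting beliefs should agree with the exact marginals.

```python
import itertools
import numpy as np


def run_bp(factors, n_vars, n_iters=20):
    """Sum-product BP for binary variables.

    factors: dict name -> (scope, table), table of shape (2,) * len(scope).
    Returns beliefs p_i(x_i) as an array of shape (n_vars, 2).
    """
    edges = [(a, i) for a, (scope, _) in factors.items() for i in scope]
    m = {(i, a): np.full(2, 0.5) for a, i in edges}    # variable -> factor
    mu = {(a, i): np.full(2, 0.5) for a, i in edges}   # factor -> variable
    for _ in range(n_iters):
        # mu_{a->i}(x_i): sum over the other variables of f_a times incoming m's
        for a, (scope, table) in factors.items():
            for pos, i in enumerate(scope):
                out = np.zeros(2)
                for xs in itertools.product((0, 1), repeat=len(scope)):
                    w = table[xs]
                    for q, j in enumerate(scope):
                        if j != i:
                            w = w * m[(j, a)][xs[q]]
                    out[xs[pos]] += w
                mu[(a, i)] = out / out.sum()
        # m_{i->a}(x_i): product of mu_{b->i} over the other factors b of i
        for a, i in edges:
            prod = np.ones(2)
            for b, (scope_b, _) in factors.items():
                if b != a and i in scope_b:
                    prod = prod * mu[(b, i)]
            m[(i, a)] = prod / prod.sum()
    beliefs = np.ones((n_vars, 2))
    for a, i in edges:
        beliefs[i] *= mu[(a, i)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)


def exact_marginals(factors, n_vars):
    """Reference marginals by exhaustive enumeration (small n_vars only)."""
    p = np.zeros((n_vars, 2))
    for x in itertools.product((0, 1), repeat=n_vars):
        w = 1.0
        for scope, table in factors.values():
            w *= table[tuple(x[i] for i in scope)]
        for i in range(n_vars):
            p[i, x[i]] += w
    return p / p.sum(axis=1, keepdims=True)
```

A quick check is to build a tree, for instance a factor on variables (0, 1, 2) and a factor on (2, 3) with arbitrary positive tables, and compare `run_bp` with `exact_marginals`. On graphs with loops the same iteration can still be run, but the beliefs are then only approximations.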


Statistical analysis

Factor graph ensembles:
1- Random regular graph: local environment = regular tree for almost all points → the measure should be translationally invariant: $m = F(m, m, m, m)$.
2- Erdős-Rényi graph: $P(m)$ = probability that $m_i = m$, when i is taken at random in the graph with uniform probability. k neighbours, Poisson distributed: $m_0 = F(m_1, \dots, m_k)$ → integral equation for $P(m)$, easily solved numerically.

Example: matching

Edge i: $s_i \in \{0, 1\}$.
Matching: constraint on each vertex a: $\sum_{i \in V(a)} s_i \le 1$.
Energy: $E(s)$ = number of unmatched vertices.
Probability: $P(s) = \frac{1}{Z} \exp(-\beta E(s))$
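On a very small graph the matchings and the partition function Z can simply be enumerated, which makes the definitions concrete. This is a sketch under the assumption that vertices are numbered from 0 and edges are given as pairs; the function names are illustrative.

```python
import itertools
import math


def matchings(n_vertices, edges):
    """Yield every matching as a tuple s of 0/1 edge variables (s_i = 1: edge i is used)."""
    for s in itertools.product((0, 1), repeat=len(edges)):
        degree = [0] * n_vertices
        for s_i, (u, v) in zip(s, edges):
            degree[u] += s_i
            degree[v] += s_i
        # Matching constraint: at most one used edge per vertex.
        if max(degree, default=0) <= 1:
            yield s


def energy(n_vertices, s):
    """E(s) = number of unmatched vertices; each used edge matches two vertices."""
    return n_vertices - 2 * sum(s)


def partition_function(n_vertices, edges, beta):
    """Z(beta) = sum over matchings of exp(-beta * E(s))."""
    return sum(math.exp(-beta * energy(n_vertices, s))
               for s in matchings(n_vertices, edges))
```

At β = 0 the partition function counts the matchings; for a path on four vertices, `partition_function(4, [(0, 1), (1, 2), (2, 3)], 0.0)` counts the empty matching, the three single edges and the pair of end edges.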


BP equations in the matching problem

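$$\psi_a(s) = \mathbb{I}\Big(\sum_{i \in V(a)} s_i \le 1\Big)\, e^{-\beta \left(1 - \sum_{i \in V(a)} s_i\right)}$$

BP equations:

[Figure: edge i joins vertices a and b; j is another edge incident on b]

$$m_{i \to a}(s_i = 1) = \prod_{j \in \partial b \setminus i} m_{j \to b}(s_j = 0)$$

$$m_{i \to a}(s_i = 0) = e^{-\beta} \prod_{j \in \partial b \setminus i} m_{j \to b}(s_j = 0) + \sum_{j \in \partial b \setminus i} m_{j \to b}(s_j = 1) \prod_{k \in \partial b \setminus \{i, j\}} m_{k \to b}(s_k = 0)$$

Closed set of equations for $h_{i \to a} = -\frac{1}{\beta} \log \frac{m_{i \to a}(0)}{m_{i \to a}(1)}$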
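The step from the two message equations to a recursion for the fields h can be spelled out as follows, assuming all $m_{j \to b}(s_j = 0)$ are nonzero so that the ratio is well defined. Dividing the $s_i = 0$ equation by the $s_i = 1$ equation and using $m_{j \to b}(1) / m_{j \to b}(0) = e^{\beta h_{j \to b}}$:

```latex
\begin{align*}
\frac{m_{i \to a}(0)}{m_{i \to a}(1)}
  &= e^{-\beta} + \sum_{j \in \partial b \setminus i}
     \frac{m_{j \to b}(1)}{m_{j \to b}(0)}
   = e^{-\beta} + \sum_{j \in \partial b \setminus i} e^{\beta h_{j \to b}}, \\
h_{i \to a}
  &= -\frac{1}{\beta} \log \frac{m_{i \to a}(0)}{m_{i \to a}(1)}
   = -\frac{1}{\beta} \log \Big[ e^{-\beta}
     + \sum_{j \in \partial b \setminus i} e^{\beta h_{j \to b}} \Big].
\end{align*}
```

The normalisation of the messages drops out of the ratio, which is why the equations close on the fields h alone.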

BP equations in the matching problem

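$$h_{i \to a} = -\frac{1}{\beta} \log \Big[ e^{-\beta} + \sum_{j \in \partial b \setminus i} e^{\beta h_{j \to b}} \Big] = F(h_{1 \to b}, h_{2 \to b}, h_{3 \to b})$$

Statistical analysis:

1: r-regular random graph: $h = \frac{1}{\beta} \log \left[ \frac{\sqrt{4(r-1) + e^{-2\beta}} - e^{-\beta}}{2(r-1)} \right]$
2: Erdős-Rényi graph: $P(h)$, solution of a simple integral equation

→ entropy $S(\beta) = \frac{1}{N}\, \mathbb{E} \log[1 + \mathcal{N}]$,

→ size of the matching $x(\beta) = \frac{\text{Number of Matched Vertices}}{N}$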
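Both statistical analyses can be tried numerically. For the r-regular graph, setting all fields equal and writing $u = e^{\beta h}$ turns the recursion into $(r-1)u^2 + e^{-\beta} u - 1 = 0$, whose positive root gives the closed form above; the sketch below checks this against plain iteration. For the Erdős-Rényi graph it uses a population-dynamics sampler, which is one common way of solving such integral equations and is an assumption here (the talk does not say which numerical method was used), as is drawing the number of incoming fields from a Poisson law of mean c. All function names are illustrative.

```python
import numpy as np


def cavity_update(beta, incoming):
    """h = -(1/beta) log[exp(-beta) + sum_j exp(beta h_j)] for incoming fields h_j."""
    incoming = np.asarray(incoming, dtype=float)
    return -np.log(np.exp(-beta) + np.exp(beta * incoming).sum()) / beta


def regular_fixed_point(beta, r, n_iters=500):
    """Iterate h <- F(h, ..., h) with r - 1 identical incoming fields."""
    h = 0.0
    for _ in range(n_iters):
        h = cavity_update(beta, [h] * (r - 1))
    return h


def regular_closed_form(beta, r):
    """Closed form quoted on the slide for the r-regular random graph."""
    num = np.sqrt(4 * (r - 1) + np.exp(-2 * beta)) - np.exp(-beta)
    return np.log(num / (2 * (r - 1))) / beta


def population_dynamics(beta, c, pop_size=1000, n_updates=50_000, seed=0):
    """Represent P(h) by a population of fields and refresh one member at a time."""
    rng = np.random.default_rng(seed)
    pop = np.zeros(pop_size)
    for _ in range(n_updates):
        k = rng.poisson(c)                       # number of incoming fields
        idx = rng.integers(pop_size, size=k)     # sampled from the current population
        pop[rng.integers(pop_size)] = cavity_update(beta, pop[idx])
    return pop
```

A histogram of the returned population is then an estimate of $P(h)$. The plain iteration for the regular graph contracts more slowly as β grows, so large β may need more iterations or a direct use of the closed form.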

Entropy of matchings: results

r-regular random graph: $\mathbb{E} \log \mathcal{N} = \log \mathbb{E}\, \mathcal{N}$, simple explicit formula (Bollobás and McKay 86)

Erdős-Rényi graph:

NB1: Size of largest matching known from Karp-Sipser 1981

NB2: Cavity method computes $\mathbb{E} \log \mathcal{N}$


How to control this heuristic approach?

One assumption:

$$P(x_1, x_2, x_3, x_4 \mid x_0, a, b \text{ absent}) = m_1(x_1)\, m_2(x_2)\, m_3(x_3)\, m_4(x_4)$$

[Figure: rooted trees with roots 1, 2, 3, 4 joined to a new root 0 through factors a and b]

Two conditions:

- 1, 2, 3, 4 should be far away when 0, a, b are absent: OK for broad classes of random graphs

- Correlations should decay at large distances: ?? Depends...

Correlation decay

Cavity = tree
Correlations (mutual information) between root and boundary should decay at large distances, for typical configurations outside the tree

Sufficient condition (much easier, but too strong): correlations decay for worst case

Correlations for typical case (more difficult) → replica symmetry breaking


“Replica symmetry breaking”

Non-trivial correlations between the root and the boundary

NB1: point-to-set correlation
NB2: not necessarily detected by local stability condition

Random regular graph: $m_0 = F(m_1, \dots, m_4)$

RS solution: $m = F(m, m, m, m)$ (translational invariance)

Modulated solutions: $m_0^\alpha = F(m_1^\alpha, \dots, m_4^\alpha)$


“Replica symmetry breaking 2”

RSB: exponentially many solutions to BP equations (extremal Gibbs states)
Survey: statistics on the solutions

$\mu^\alpha_{a \to i}(x_i)$: message from a to i in the solution α.

$Q_{a \to i}(\mu)$ = probability that the message $\mu^\alpha_{a \to i}$ is equal to µ, when α is chosen at random (with measure $\exp(-\beta x F_\alpha)$).

Random reg. graph: translational invariance recovered with the statistics over the solutions → $Q_{a \to i}(\mu) = Q(\mu)$, which satisfies a self-consistent equation.

Matching: no RSB: $Q(\mu) = \delta(\mu, \mu_{rs})$
In many problems (SAT, colouring, 3-matching, ...): RSB present when the density of constraints is large enough


Random 3-satisfiability

NP-complete (Cook)

Pb: random Boolean formula in conjunctive normal form, three variables per clause, chosen randomly in $\{x_1, \dots, x_N\}$, negated randomly with probability 1/2:
$(x_1 \lor x_{27} \lor x_3) \land (x_{11} \lor x_3 \lor x_2) \land \dots \land (x_9 \lor x_8 \lor x_{30})$

Control parameter: $\alpha = M/N$ = Constraints/Variables.

Numerically: threshold phenomenon at $\alpha_c \sim 4.26$.

Proba(SAT) = 1 when $\alpha < \alpha_c$; Proba(SAT) = 0 when $\alpha > \alpha_c$.

Numerics: Mitchell, Selman, Levesque, Kirkpatrick, Crawford, Auton, ...
Threshold: Friedgut
Bounds: Kaporis, Kirousis, Lalas, Dubois, Boufkhad, ...
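The ensemble is easy to sample, and for small N satisfiability can be checked exhaustively, which is enough to see the fraction of SAT instances fall as α grows. The sketch below assumes the three variables of a clause are distinct and encodes a literal as a signed integer (+i for $x_i$, -i for its negation); both conventions, and the function names, are choices made for this example.

```python
import itertools
import random


def random_3sat(n_vars, alpha, seed=0):
    """Return M = round(alpha * N) random clauses of three signed literals."""
    rng = random.Random(seed)
    n_clauses = round(alpha * n_vars)
    formula = []
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), 3)   # three distinct variables
        formula.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return formula


def is_satisfiable(formula, n_vars):
    """Exhaustive search over the 2^N assignments; only feasible for small N."""
    for bits in itertools.product((False, True), repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in formula):
            return True
    return False


def fraction_sat(n_vars, alpha, n_samples=20):
    """Empirical Proba(SAT) at density alpha over n_samples random formulas."""
    hits = sum(is_satisfiable(random_3sat(n_vars, alpha, seed=s), n_vars)
               for s in range(n_samples))
    return hits / n_samples
```

Calling `fraction_sat` on a grid of α values for a few small N gives a rough, finite-size version of the %SAT curve; the sharp threshold is a large-N statement and is not visible at the sizes this brute-force check can reach.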


Threshold phenomenon → Phase transition

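[Figure: %SAT (from 0 to 100) versus α = M/N (from 1 to 6) for N = 100 and N = 200, with $\alpha_c$ marked on the α axis]

generically SAT for $\alpha < \alpha_c$

generically UNSAT for $\alpha > \alpha_c$

Friedgut: → step function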

Threshold phenomenon → Phase transition

[Figure: %SAT (from 0 to 100) and computer time versus α = M/N (from 1 to 6), with $\alpha_c$ marked on the α axis]

Easy, and generically SAT, for $\alpha < \alpha_c$

Hard, in the region $\alpha \sim \alpha_c$

Easy, generically UNSAT, for $\alpha > \alpha_c$

Statistical physics of the random 3-SAT problem

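Monasson, Zecchina, Weigt, Biroli, ..., MM, Parisi, Zecchina: → Phase diagram + New algorithm.

1- Analytic result: discontinuous glass transition

Three phases: Easy-SAT, Hard-SAT, UNSAT

[Figure: phase diagram along the α = M/N axis. SAT ($E_0 = 0$) below $\alpha_c = 4.267$, UNSAT ($E_0 > 0$) above. 1 state for $\alpha < \alpha_d$; many states ($E = 0$ and $E > 0$) for $\alpha_d < \alpha < \alpha_c$; many states ($E > 0$) for $\alpha > \alpha_c$]

2- New algorithm: survey propagation ($N = 10^7$ at $\alpha = 4.23$)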


Simple mean field message passing: warning propagation (Min Sum)

[Figure: clause a connected to variables 1, 2, 3, with the message $u_{a \to 1}$ on the edge from a to 1]

Message $u_{a \to 1} \in \{0, 1\}$ sent from clause a to variable 1

Simple message passing: warning propagation

[Figure: clause a connected to variables 1, 2, 3, with 0/1 warnings on the neighbouring edges; here $u_{a \to 1} = 1$]

Warning $u_{a \to i} = 1$:

"According to the messages I received, you should take the value which satisfies me!"

Simple message passing: warning propagation

[Figure: clause a connected to variables 1, 2, 3, with a different set of 0/1 warnings on the neighbouring edges]

No warning $u_{a \to i} = 0$:

"No problem, take any value!"

Warning propagation (= "Min Sum") converges and gives the correct answer on a tree: SAT iff no contradictory message.
On a real random 3-SAT instance: limited to $\alpha < 3.9$. Cannot get close to the SAT-UNSAT transition.
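The warning rule can be written down in a few lines. The sketch below is one possible reading of it, not the speakers' code: clause a warns variable i exactly when every other variable of a is pushed, by the warnings it receives from its other clauses, toward the value that violates a. Literals are encoded as signed integers (+i for $x_i$, -i for its negation), updates are done in parallel, and a variable is assumed to appear at most once per clause; the function name and data layout are illustrative.

```python
def warning_propagation(clauses, n_iters=50):
    """Return (warnings, contradiction) for a CNF formula.

    clauses: list of tuples of nonzero ints (+i for x_i, -i for NOT x_i).
    warnings[(a, i)] is the message u_{a->i} in {0, 1}.
    """
    u = {(a, abs(lit)): 0 for a, clause in enumerate(clauses) for lit in clause}
    sign = {(a, abs(lit)): (1 if lit > 0 else -1)
            for a, clause in enumerate(clauses) for lit in clause}

    def field(j, exclude):
        # Each warning pushes j toward the value satisfying the clause that sent it.
        return sum(sign[(b, k)] * u[(b, k)] for (b, k) in u if k == j and b != exclude)

    for _ in range(n_iters):
        new_u = {}
        for (a, i) in u:
            others = [abs(lit) for lit in clauses[a] if abs(lit) != i]
            # Warn i only if every other variable of a is pushed to violate a.
            new_u[(a, i)] = int(all(field(j, a) * sign[(a, j)] < 0 for j in others))
        if new_u == u:
            break
        u = new_u

    # Contradiction: some variable is warned toward both values.
    contradiction = False
    for j in {i for (_, i) in u}:
        pos = any(u[(b, k)] and sign[(b, k)] > 0 for (b, k) in u if k == j)
        neg = any(u[(b, k)] and sign[(b, k)] < 0 for (b, k) in u if k == j)
        contradiction = contradiction or (pos and neg)
    return u, contradiction
```

For example, on the chain $x_1 \land (\bar{x}_1 \lor x_2)$ the unit clause warns variable 1, which in turn makes the second clause warn variable 2; adding the clause $\bar{x}_2$ produces contradictory warnings, in line with the formula being UNSAT. The unit clauses are used here only to keep the example short.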

Replica symmetry breaking

Minimum Energy Configurations (MEC): energy cannot be lowered by a finite number of flips

State/Cluster = { MEC connected by finite flips } → one fixed point of WP

Proliferation of states:

At $\alpha > \alpha_d$, many states:

$$\mathcal{N}(E) \sim \exp\left( N\, \Sigma\!\left( \frac{E}{N} \right) \right)$$

[Figure: complexity Σ versus energy density E/N for several values of α relative to $\alpha_d$ and $\alpha_c$, with the threshold energy $e_{th}$ marked]

$\Sigma(0)$ → clusters of SAT configurations
$\Sigma(e_{th})$ → metastable clusters

From warning propagation to survey propagation

RSB: assume many states: $\mathcal{N}(E) \sim \exp\left( N\, \Sigma\!\left( \frac{E}{N} \right) \right)$
Message = survey of the elementary warnings in the various states:

$\eta_{a \to i}$ = probability of a warning being sent from constraint a to variable i, when a state is picked at random.

→ Propagate the surveys along the graph. Converges!

→ Results on the phase diagram and the complexity, from the statistical analysis of the distribution of surveys in a generic sample.

→ Information on a single sample: a local field on each variable → new algorithmic strategies


Survey propagation

[Figure: clause a connected to variables 1, 2, 3; a clause b sends the survey $\eta_{b \to 2}$ to variable 2; $\eta_{a \to 1}$ = Prob(warning)]

$\eta_{a \to 1}$: known exactly from surveys of incoming warnings.

Statistical analysis of the SP equations in random K-SAT: phase diagram

Thresholds from integral equation. Solved numerically or through large K asymptotic expansion.

$\alpha_c$: SAT-UNSAT threshold.

$\alpha_d$: Onset of clustering → clusters with frozen variables.

K     α_d      α_c       α_c^(7)
3     3.93     4.2667    4.307
4     8.30     9.931     9.938
5     16.1     21.117    21.118
6     30.5     43.37     43.372
7     57.2     87.79     87.785
8     107.2    -         176.543
9     201.3    -         354.010
10    379.1    -         708.915

$\alpha_c$ is conjectured to be exact (not $\alpha_d$).

Using the surveys : local field

In one given cluster of solutions, α: $H_j^\alpha = \sum_a u_{a \to j}$

$H_j^\alpha > 0$: number of warnings telling "$x_j$ should be one"

$H_j^\alpha < 0$: number of warnings telling "$x_j$ should be zero"

$H_j^\alpha = 0$: no warning

→ Survey of local field.

$P_j(H)$ = probability that $H_j^\alpha = H$ when α is chosen at random.

[Figure: histogram of P(H) for H = -3, ..., 3, with total weights $W_-$ (H < 0), $W_0$ (H = 0) and $W_+$ (H > 0)]

Some types of variables:

Balanced: $W_\pm \simeq 1/2$, $W_0 \simeq 0$

Polarized: $W_+ \simeq 1$ or $W_- \simeq 1$

Underconstrained: $W_0 \simeq 1$

Survey Inspired Decimation

Biased variable $W_+^i \simeq 1$: in almost all clusters of solutions, $x_i = 1$.

→ Fix $x_i = 1$

SID algorithm: Iterate:

- Run SP until convergence
- Find the most biased variable, i such that $|W_+^i - W_-^i|$ is maximal.
- Fix it to $x_i = 1$ if $W_+^i > W_-^i$, to $x_i = 0$ if $W_+^i < W_-^i$, and simplify the formula.

Two possible ends: 1) all variables are fixed; 2) the formula is reduced to a stage where all $W_0^i = 1$: an underconstrained problem, easily solved by e.g. simulated annealing or Walksat.

Solves: $10^7$ variables at $\alpha \simeq 4.2 - 4.25$. Time $O(N^2)$, reduced to $O(N)$ by fixing a fraction of the variables.

Survey decimation example

Number of clusters of assignments which violate E clauses: $e^{\Sigma(E)}$

[Figure: Σ versus E along the decimation process; N = 10000, one curve plotted every 500 decimation steps]

Glass phase in LDPC codes

Binary Symmetric Channel, flip probability p

Complexity of the landscape (configurations on the sphere):

$$\Sigma(e) = \frac{1}{N} \log \mathcal{N}(E = Ne)$$

[Figure: Σ versus e for a (6,5) regular code, with curves for p = .155, p = .2, p = $p_c$ and p = .3]

$p_d$ = .139 = threshold for BP decoding

$p_c$ = .264 = threshold for optimal decoding


Miscellaneous comments

General approach to many constraint satisfaction networks, when the factor graph has a local tree structure (large girth)

Simple case (low density of constraints): RS cavity method OK, e.g. decoding with belief propagation at low enough noise

Increasing density, 1RSB: many pure states → statistical physics in the space of pure states. Phase diagram for K-SAT, q-colouring, LDPC codes...

Generic picture: SAT, Hard-SAT (clusters), UNSAT

[Figure: phase diagram along the α = M/N axis. SAT ($E_0 = 0$) below $\alpha_c = 4.267$, UNSAT ($E_0 > 0$) above. 1 state for $\alpha < \alpha_d$; many states ($E = 0$ and $E > 0$) for $\alpha_d < \alpha < \alpha_c$; many states ($E > 0$) for $\alpha > \alpha_c$]

Miscellaneous comments

Always "tree computations" (= iterative mapping of pdf), but with different interpretations

Algorithmic implementation (single instance): belief propagation, survey propagation. Very powerful

Statistical analysis: typical samples, typical configurations, viewed from a typical point: phase diagrams

Some predictions are rigorously confirmed (weighted matching, clusters in hard SAT phase, satisfiability threshold as upper bound, ...).

Appendix 1: Survey propagation equations

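[Figure: clause a connected to variables 1, 2, 3; the other clauses around variables 2 and 3 are grouped into the sets U, V, W, X; a clause b sends the survey $\eta_{b \to 2}$ to variable 2; $\eta_{a \to 1}$ = Prob(warning)]

In the following, the superscripts 2 and 3 label the variable.

$$\pi_+^2 = \prod_{b \in U} (1 - \eta_{b \to 2}), \qquad \pi_-^2 = \prod_{b \in V} (1 - \eta_{b \to 2})$$

P(no contrad): $\pi_+^2 + \pi_-^2 - \pi_+^2 \pi_-^2$

$$q_2 \equiv \mathrm{Prob}(x_2 = 1) = \frac{\pi_-^2 (1 - \pi_+^2)}{\pi_+^2 + \pi_-^2 - \pi_+^2 \pi_-^2}$$

$$q_3 \equiv \mathrm{Prob}(x_3 = 0) = \frac{\pi_+^3 (1 - \pi_-^3)}{\pi_+^3 + \pi_-^3 - \pi_+^3 \pi_-^3}$$

$$\eta_{a \to 1} = q_2\, q_3$$

Survey propagation: statistical analysis, or single sample → algorithms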
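The update of one survey can be sketched as below. The reading assumed here is that, for each other variable j of clause a, the incoming surveys are split into those from clauses pushing j toward the value that violates a (the role played by the set U for variable 2) and those pushing j toward the value that satisfies a (the set V for variable 2); this split, the function names and the argument layout are assumptions made for the example, not taken from the slides.

```python
import math


def violation_probability(etas_unsat, etas_sat):
    """q_j for one variable j of clause a (the slide's q2 and q3).

    etas_unsat: surveys from clauses pushing j toward the value violating a.
    etas_sat:   surveys from clauses pushing j toward the value satisfying a.
    """
    pi_u = math.prod(1.0 - eta for eta in etas_unsat)   # no warning from the first set
    pi_s = math.prod(1.0 - eta for eta in etas_sat)     # no warning from the second set
    no_contradiction = pi_u + pi_s - pi_u * pi_s        # P(no contrad)
    if no_contradiction == 0.0:
        raise ValueError("surveys are contradictory with probability 1")
    # j is forced to violate a: warned by the first set, not warned by the second.
    return pi_s * (1.0 - pi_u) / no_contradiction


def survey(neighbours):
    """eta_{a->i}: product of q_j over the other variables j of clause a.

    neighbours: one (etas_unsat, etas_sat) pair per other variable of the clause.
    """
    return math.prod(violation_probability(u, s) for u, s in neighbours)
```

With no incoming surveys a variable is never forced, so `violation_probability([], [])` is 0 and the clause sends no warning; iterating `survey` over all edges of a formula until the values stop changing is the propagation step described on the earlier slides.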

Appendix 2: Origins of the cavity method

1975: Definition of the SK model of spin glasses, $E = -\sum_{ij} J_{ij} s_i s_j$
1979: Parisi solution of this model with replicas
1986: An alternative approach: the cavity method (M, Parisi, Virasoro). Direct probabilistic approach, based on N → N + 1 but using N ≫ 1. Equivalent to replica approach.
2001: A new version of the cavity method to handle "finite connectivity" problems (M, Parisi)
2002: Applications to XORSAT, K-SAT, colouring... → phase diagrams (thresholds) and algorithms (survey propagation).
2003: Rigorous confirmation of Parisi's solution for the SK model (Talagrand, Guerra)