
Page 1

Computer Vision: Models, Learning and Inference – Markov Random Fields, Part 1

Oren Freifeld and Ron Shapira-Weber

Computer Science, Ben-Gurion University

March 11, 2019

www.cs.bgu.ac.il/~cv192/

Page 2

Bayesian Image Restoration with a Markov Random Field

From left to right:

$x$: the true binary image.

$y$: its degraded version (20% random flips – this defines $p(y \mid x)$).

$\arg\max_x p(x \mid y) = \arg\max_x p(y \mid x)\, p(x)$, where $p(x)$ was taken to be a particular MRF prior called the Ising model.

A sample from $p(x \mid y)$.

In about a week from now, you will know how to do this (in terms of both the math and the coding involved).

Figure from Winkler's book on MRFs.
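To make the degradation model concrete, here is a minimal sketch (assuming NumPy; the toy image and the `degrade` helper are hypothetical, not from the slides) of sampling $y$ from $p(y \mid x)$ by flipping each pixel independently with probability 0.2:

```python
import numpy as np

def degrade(x, flip_prob=0.2, seed=0):
    """Simulate p(y|x): flip each binary pixel i.i.d. with probability flip_prob."""
    rng = np.random.default_rng(seed)
    flips = rng.random(x.shape) < flip_prob      # True where a pixel gets flipped
    return np.where(flips, 1 - x, x)

x = np.zeros((8, 8), dtype=int)                  # toy "true" binary image
x[2:6, 2:6] = 1
y = degrade(x)                                   # its degraded version
```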

Page 3

1 Few Words on Probabilistic Graphical Models

2 Markov Chains

3 Markov Random Fields

Page 4

Few Words on Probabilistic Graphical Models

Probabilistic Graphical Models (PGMs)

PGMs come in two main flavors:

Bayesian Networks – directed graphs

Markov Random Fields (MRFs) – undirected graphs

In either case, a PGM encodes (and visualizes) the dependency structure of a joint pdf/pmf.

Both types generalize Markov chains.

PGMs and Neural Networks are different beasts – but there are relations between them.

pdf: probability density function
pmf: probability mass function

Page 5

Markov Chains

Markov Chain as a Directed Linear Graph

A Markov Chain, $(X_1, X_2, \ldots, X_n)$, may be graphically represented as

$$x_1 \to x_2 \to \cdots \to x_{n-1} \to x_n$$

This highlights the fact that

$$p(x) = p(x_1) \prod_{i=2}^{n} p(x_i \mid x_{i-1})$$
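A minimal sketch of evaluating this factorization numerically (assumptions: NumPy, a finite state space, and hypothetical names `p1` for the initial pmf and `P` for the transition matrix):

```python
import numpy as np

def chain_log_prob(x, p1, P):
    """log p(x) = log p(x_1) + sum_{i>=2} log p(x_i | x_{i-1}) for a Markov chain."""
    logp = np.log(p1[x[0]])
    for prev, cur in zip(x[:-1], x[1:]):
        logp += np.log(P[prev, cur])             # one transition term per edge
    return logp

p1 = np.array([0.5, 0.5])                        # initial pmf p(x_1)
P = np.array([[0.9, 0.1],                        # P[a, b] = p(x_{i+1} = b | x_i = a)
              [0.2, 0.8]])
print(chain_log_prob([0, 0, 1, 1], p1, P))
```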

Page 6

Markov Chains

Markov Chain as an Undirected Linear Graph

A Markov Chain, $(X_1, X_2, \ldots, X_n)$, may also be graphically represented as

$$x_1 - x_2 - \cdots - x_{n-1} - x_n$$

This highlights the fact that

$$p(x_{1:n}) = \prod_{i=1}^{n-1} \phi_{i,i+1}(x_i, x_{i+1}) \overset{\text{in sloppier notation}}{=} \prod_{i=1}^{n-1} \phi(x_i, x_{i+1})$$

where:

$$\phi_{1,2}(x_1, x_2) = p(x_1)\, p(x_2 \mid x_1)$$
$$\phi_{i,i+1}(x_i, x_{i+1}) = p(x_{i+1} \mid x_i) \quad \forall i \in \{2, \ldots, n-1\}$$

Page 7

Markov Chains

Factorization Simplifies Computations

For now, forget about probability, and consider the following:

$x, y, z$ are binary variables.

$f : \mathbb{R}^3 \to \mathbb{R}_{\geq 0}$ factorizes as $f(x, y, z) = \phi_{x,y}(x, y)\, \phi_{y,z}(y, z)$ for some two nonnegative functions, $\phi_{x,y} : \mathbb{R}^2 \to \mathbb{R}_{\geq 0}$ and $\phi_{y,z} : \mathbb{R}^2 \to \mathbb{R}_{\geq 0}$.

Want:

$$\max_{x,y,z} f(x, y, z) \qquad (1)$$

Page 8

Markov Chains

Factorization Simplifies Computations

Brute force requires $2^3$ computations of $f(x, y, z)$. However, we can do better by exploiting the factorization:

$$\begin{aligned}
\max_{x,y,z} f(x, y, z) &= \max_{x,y,z} \phi_{x,y}(x, y)\, \phi_{y,z}(y, z) \\
&= \max_{x,y} \phi_{x,y}(x, y) \underbrace{\max_z \phi_{y,z}(y, z)}_{\psi_y(y)} \\
&= \max_{x,y} \underbrace{\phi_{x,y}(x, y)\, \psi_y(y)}_{\psi_{x,y}(x, y)} \\
&= \max_{x,y} \psi_{x,y}(x, y)
\end{aligned}$$

Page 9

Markov Chains

Factorization Simplifies Computations

We want $\max_{x,y} \psi_{x,y}(x, y)$, where $\psi_{x,y}(x, y) \triangleq \phi_{x,y}(x, y)\, \psi_y(y)$ and $\psi_y(y) \triangleq \max_z \phi_{y,z}(y, z)$.

Now:

$\psi_y(0) = \max\{\phi_{y,z}(0, 0),\, \phi_{y,z}(0, 1)\}$ (2 evaluations of $\phi_{y,z}$)
$\psi_y(1) = \max\{\phi_{y,z}(1, 0),\, \phi_{y,z}(1, 1)\}$ (2 evaluations of $\phi_{y,z}$)
$\psi_{x,y}(0, 0) = \phi_{x,y}(0, 0)\, \psi_y(0)$ (1 evaluation of $\phi_{x,y}$)
$\psi_{x,y}(0, 1) = \phi_{x,y}(0, 1)\, \psi_y(1)$ (1 evaluation of $\phi_{x,y}$)
$\psi_{x,y}(1, 0) = \phi_{x,y}(1, 0)\, \psi_y(0)$ (1 evaluation of $\phi_{x,y}$)
$\psi_{x,y}(1, 1) = \phi_{x,y}(1, 1)\, \psi_y(1)$ (1 evaluation of $\phi_{x,y}$)

The solution $= \max\{\psi_{x,y}(0, 0),\, \psi_{x,y}(0, 1),\, \psi_{x,y}(1, 0),\, \psi_{x,y}(1, 1)\}$.
Again $2^3$ evaluations, but of simpler functions.

There is some overhead (e.g., memory, bookkeeping)
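A quick numeric sanity check of the bookkeeping above (a sketch with made-up nonnegative factor tables, not from the slides):

```python
import itertools

# Made-up nonnegative factor tables over binary arguments.
phi_xy = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.5, (1, 1): 3.0}
phi_yz = {(0, 0): 4.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 5.0}

# Elimination, exactly as in the table above.
psi_y = {y: max(phi_yz[(y, 0)], phi_yz[(y, 1)]) for y in (0, 1)}
psi_xy = {(x, y): phi_xy[(x, y)] * psi_y[y] for x in (0, 1) for y in (0, 1)}
best = max(psi_xy.values())

# Brute force over all 2^3 assignments agrees.
brute = max(phi_xy[(x, y)] * phi_yz[(y, z)]
            for x, y, z in itertools.product((0, 1), repeat=3))
assert best == brute
print(best)
```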

Page 10

Markov Chains

Factorization Simplifies Computations

More generally:

If the 3 variables, instead of binary, take values in $\{0, 1, \ldots, s-1\}$, then brute force requires $s^3$ evaluations of $f$ while exploiting the factorization leads to $2s^2$ evaluations of its factors.

If $x = (x_1, \ldots, x_n)$ where each $x_i$ takes values in $\{0, 1, \ldots, s-1\}$, and we want $\max_x f(x)$ where

$$f(x) = \prod_{i=1}^{n-1} \phi_{i,i+1}(x_i, x_{i+1}) \qquad (2)$$

then brute force requires $s^n$ evaluations of $f$ while exploiting the factorization leads to $(n-1)s^2$ evaluations of its factors. The difference can be huge, e.g.: $s = 10$ and $n = 100$ $\Rightarrow$ $s^n = 10^{100}$ while $(n-1)s^2 = 9900$.

Obviously: more overhead due to memory and bookkeeping.
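A sketch of this chain elimination for general $n$ and $s$ (assuming NumPy; `chain_max` is a hypothetical helper name), doing $O(ns^2)$ work instead of $s^n$:

```python
import numpy as np

def chain_max(phis):
    """max_x prod_i phi_i(x_i, x_{i+1}) via backward elimination.

    phis: list of n-1 nonnegative (s, s) arrays; cost O(n s^2) instead of s^n.
    """
    msg = np.ones(phis[-1].shape[1])              # message from beyond the last variable
    for phi in reversed(phis):
        msg = (phi * msg[None, :]).max(axis=1)    # eliminate the rightmost variable
    return msg.max()

rng = np.random.default_rng(0)
phis = [rng.random((10, 10)) for _ in range(99)]  # s = 10, n = 100: brute force is 10^100
print(chain_max(phis))
```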

Page 11

Markov Chains

Factorization Simplifies Computations

Similar results hold if $f(x) = \prod_{i=1}^{n-1} \phi_{i,i+1}(x_i, x_{i+1})$ and we want

$$\sum_x f(x);$$

this is useful, e.g., if we want to create a normalized version of $f$, i.e.,

$$\frac{f(x)}{\sum_x f(x)}$$

A bit less trivial: as we will see, similar results hold if $f(x) = \prod_{i=1}^{n-1} \phi_{i,i+1}(x_i, x_{i+1})$ is a pmf and we want to sample from $f$:

$$x \sim f(x) \qquad (3)$$
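A sketch of the summation variant (assuming NumPy; names are hypothetical): the same backward pass computes $\sum_x f(x)$, and, when $f$ is a pmf, a forward pass then draws an exact sample, as hinted above:

```python
import numpy as np

def chain_sum_and_sample(phis, rng):
    """Z = sum_x prod_i phi_i(x_i, x_{i+1}) via a backward pass, plus one exact sample.

    phis: list of n-1 nonnegative (s, s) arrays for variables x_0, ..., x_{n-1}.
    """
    msgs = [np.ones(phis[-1].shape[1])]          # message at the last variable
    for phi in reversed(phis):                   # sum out x_{n-1}, then x_{n-2}, ...
        msgs.append(phi @ msgs[-1])
    msgs = msgs[::-1]    # msgs[k][v]: sum of all factors right of x_k, with x_k = v
    Z = msgs[0].sum()

    # Forward sampling: p(x_0) ∝ msgs[0]; p(x_{k+1} | x_k) ∝ phis[k][x_k, :] * msgs[k+1].
    x = [rng.choice(len(msgs[0]), p=msgs[0] / Z)]
    for k, phi in enumerate(phis):
        w = phi[x[-1], :] * msgs[k + 1]
        x.append(rng.choice(len(w), p=w / w.sum()))
    return Z, x

rng = np.random.default_rng(0)
phis = [rng.random((3, 3)) for _ in range(4)]    # n = 5 variables, s = 3 states
print(chain_sum_and_sample(phis, rng))
```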

Page 12

Markov Chains

Another Characterization of Markov Chain

Recall: a sequence, $(X_1, \ldots, X_n)$, is called an MC if $p(x_i \mid x_{1:(i-1)}) = p(x_i \mid x_{i-1})$ for every $i \in \{2, \ldots, n\}$. This property is referred to as the "1-sided MC" property.

If a sequence, $(X_1, \ldots, X_n)$, satisfies

$$p(x_i \mid x_{1:(i-1)}, x_{(i+1):n}) = p(x_i \mid x_{i-1}, x_{i+1}) \quad \forall i \in \{2, \ldots, n-1\}$$
$$p(x_1 \mid x_{2:n}) = p(x_1 \mid x_2)$$
$$p(x_n \mid x_{1:(n-1)}) = p(x_n \mid x_{n-1})$$

it is said to satisfy the "2-sided MC" property. In words: given all the others, each RV depends only on its neighbors.

Page 13

Markov Chains

Fact

1-sided MC ⇐⇒ 2-sided MC

Corollary

By symmetry, it follows that if $(X_1, \ldots, X_n)$ is an MC, then we also have

$$i \in \{1, \ldots, n-1\} \Rightarrow p(x_i \mid x_{(i+1):n}) = p(x_i \mid x_{i+1})$$
$$i \in \{1, \ldots, n-1\} \Rightarrow p(x_{i:n}) = p(x_n) \prod_{j=i}^{n-1} p(x_j \mid x_{j+1})$$

Particularly,

$$p(x) = p(x_n) \prod_{i=1}^{n-1} p(x_i \mid x_{i+1})$$

where $x = (x_1, \ldots, x_n)$.

Page 14

Markov Random Fields

Markov Random Fields

One of the two main types of Probabilistic Graphical Models

Generalize Markov Chains to general undirected graphs

Many computer-vision and machine-learning applications

Page 15

Markov Random Fields

Markov Random Fields

Informal Definition

Associate an RV with each vertex of an undirected graph, $G$, and say that each variable, given all the others, depends only on its neighbors (according to the graph). In which case, we say that $p$, the joint pdf (or pmf) of all these RVs, is an MRF (w.r.t. $G$).

Example

Page 16

Markov Random Fields

Cliques

Definition

A clique (in the graph) is a set of vertices that are fully connected. By convention, each singleton is a clique.

Example

Notation

Let $C$ denote the set of all cliques in the graph. If $c \in C$, then $x_c \triangleq \{x_s : s \in c\}$.

Page 17

Markov Random Fields

The structure of MRFs leads to computational advantages in calculating probabilities on a graph. Examples for (graphs of) MRFs:

graphs defined over pixels (regular 2D lattice)

speech recognition

Page 18

Markov Random Fields

Notation

$S$: collection of indices

$X_a$, $a \in S$: an RV

$R$: the range of $X_a$, called the "state space". Usually $|R| < \infty$ (but we will see cases where this is not true)

$X_A$, for $A \subset S$: the set $\{X_s : s \in A\}$

${}^{B}X_A = X_{A \setminus B} = \{X_s : s \in A \setminus B\}$

If $A = S$, we can also just write ${}^{B}X = X_{S \setminus B} = \{X_s : s \in S \setminus B\}$

$p$: pmf (or pdf) of $X_S$

$x_s$: a generic value for $X_s$, $s \in S$

Page 19

Markov Random Fields

Notation

$G = (S, \eta)$, where $\eta$ is a "neighborhood system": $\eta = \{\eta_s\}_{s \in S}$ where

$$\eta_s \subset S, \qquad s \notin \eta_s, \qquad s \in \eta_t \iff t \in \eta_s$$

Example

$S = \{1, 2, 3, 4, 5, 6, 7, 8\}$
$\eta_1 = \{2, 3\}$, $\eta_2 = \{1, 3, 4\}$, $\eta_3 = \{1, 2\}$, $\eta_4 = \{2, 5, 6\}$, $\eta_5 = \{4\}$, $\eta_6 = \{4, 7, 8\}$, $\eta_7 = \{6, 8\}$, $\eta_8 = \{6, 7\}$
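This neighborhood system is easy to encode and check in code; a minimal sketch (plain Python; the helper names are hypothetical):

```python
from itertools import combinations

# The neighborhood system from the example, as a plain dict.
eta = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2, 5, 6},
       5: {4}, 6: {4, 7, 8}, 7: {6, 8}, 8: {6, 7}}

# Validity checks: s is not its own neighbor, and neighborhoods are symmetric.
assert all(s not in nbrs for s, nbrs in eta.items())
assert all(s in eta[t] for s, nbrs in eta.items() for t in nbrs)

def is_clique(c):
    """A vertex set is a clique iff every two distinct members are neighbors."""
    return all(s in eta[t] for s, t in combinations(c, 2))

# All cliques (singletons included by convention), by brute force over subsets.
cliques = [c for k in range(1, len(eta) + 1)
           for c in combinations(sorted(eta), k) if is_clique(c)]
print(cliques)
```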

Page 20

Markov Random Fields

Cliques and Neighborhoods

Example

Recall $C$ is the set of cliques in $G$; i.e., $c \in C \Rightarrow c \subset S$ such that for all distinct $s, t \in c$ we have $s \in \eta_t$.

Page 21

Markov Random Fields

Definition (Markov Random Field)

$p$ is an MRF w.r.t. $G$ if $p(x_s \mid {}^{s}x) = p(x_s \mid x_{\eta_s})\ \forall s \in S$ (provided the LHS exists).

Remark: some authors also require p(x) > 0, ∀x

Page 22

Markov Random Fields

Definition (Gibbs distribution)

$p$ is Gibbs w.r.t. $G$ if $p(x) > 0\ \forall x$ and

$$p(x) = \prod_{c \in C} F_c(x_c)$$

for some set of functions $\{F_c\}_{c \in C}$.
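For a concrete instance, here is a sketch of an unnormalized Gibbs pmf with Ising-style pairwise factors $F_c(x_s, x_t) = \exp(\beta \cdot 1[x_s = x_t])$ on a toy triangle graph (NumPy assumed; `beta` and the graph are made up for illustration):

```python
import numpy as np
from itertools import product

beta = 1.0                                   # made-up coupling strength
edges = [(0, 1), (1, 2), (0, 2)]             # pairwise cliques of a toy triangle graph

def F(a, b):
    """Ising-style pairwise factor: rewards agreeing neighbors."""
    return np.exp(beta * (a == b))

def p_unnormalized(x):
    return np.prod([F(x[s], x[t]) for s, t in edges])

# Normalize by brute force (feasible only for tiny graphs).
Z = sum(p_unnormalized(x) for x in product((0, 1), repeat=3))
print(p_unnormalized((0, 0, 0)) / Z)         # probability of the all-zeros state
```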

Page 23

Markov Random Fields

Theorem (Hammersley & Clifford)

If p(x) > 0 ∀x then:

p MRF w.r.t. G ⇐⇒ p Gibbs w.r.t. G

AKA the fundamental theorem of random fields.

Page 24

Markov Random Fields

Hammersley-Clifford and Markov Chains

Consider an MC, $X = (X_1, X_2, \ldots, X_n)$; i.e., $p(x)$ factorizes as

$$p(x) = p(x_1) \prod_{i=1}^{n-1} p(x_{i+1} \mid x_i)$$

Assume also $p(x) > 0\ \forall x$.
$\overset{\text{HC}}{\Rightarrow}$ $p(x)$ is an MRF w.r.t. $G$ (which here is a linear undirected graph)
$\Rightarrow p(x_i \mid {}^{i}x) = p(x_i \mid x_{i-1}, x_{i+1})\ \forall i \in \{2, \ldots, n-1\}$.
We just showed, using HC, that the 1-sided Markov property implies the 2-sided Markov property.

In fact, we don't need HC for this, as we can prove it directly. But before we do, we need the following fact.

Page 25

Markov Random Fields

Fact

If $p(x_s \mid x_A, x_B) = g(x_s, x_A)$ for some $s \in S$, $A \subset S$, $B \subset S$ with $A \cap B = \emptyset$, $s \notin A \cup B$, and some function $g$, then $g(x_s, x_A) = p(x_s \mid x_A)$. Thus, $p(x_s \mid x_A, x_B) = p(x_s \mid x_A)$.

Proof.

$$p(x_s \mid x_A) = \sum_{x_B} p(x_s, x_B \mid x_A) = \sum_{x_B} p(x_s \mid x_A, x_B)\, p(x_B \mid x_A) \overset{\text{assumption}}{=} \sum_{x_B} g(x_s, x_A)\, p(x_B \mid x_A) = g(x_s, x_A) \sum_{x_B} p(x_B \mid x_A) = g(x_s, x_A).$$

Page 26

Markov Random Fields

1-sided Markov Property ⇒ 2-sided Markov Property.

$$p(x_i \mid {}^{i}x) = \frac{p(x)}{p({}^{i}x)} = \frac{p(x)}{\sum_{x_i} p(x)} \overset{\text{MC}}{=} \frac{p(x_1) \prod_{j=1}^{n-1} p(x_{j+1} \mid x_j)}{\sum_{x_i} p(x_1) \prod_{j=1}^{n-1} p(x_{j+1} \mid x_j)} = \frac{p(x_i \mid x_{i-1})\, p(x_{i+1} \mid x_i)}{\sum_{x_i} p(x_i \mid x_{i-1})\, p(x_{i+1} \mid x_i)} =: g(x_i, x_{i-1}, x_{i+1})$$

(in the second-to-last step, all factors not involving $x_i$ cancel between the numerator and the denominator).

Claim: $p(x_i \mid {}^{i}x) = g(x_i, x_{i-1}, x_{i+1}) \Rightarrow p(x_i \mid {}^{i}x) = p(x_i \mid x_{i-1}, x_{i+1})$. This follows directly from the previous fact: just take $A = \{i-1, i+1\}$.

Page 27

Markov Random Fields

2-sided Markov Property ⇒ 1-sided Markov Property.

$p(x_i \mid {}^{i}x) = p(x_i \mid x_{i-1}, x_{i+1}) \Rightarrow p$ is an MRF w.r.t. the (linear) graph $G$ $\overset{\text{HC}}{\Rightarrow}$ $p$ is Gibbs w.r.t. $G$ $\Rightarrow$ $p(x) = \prod_{i=1}^{n-1} F(x_{i+1}, x_i)$ $\Rightarrow$

$$p(x_{i+1} \mid x_{1:i}) = \frac{p(x_{1:(i+1)})}{p(x_{1:i})} = \frac{\sum_{x_{(i+2):n}} p(x)}{\sum_{x_{(i+1):n}} p(x)} = \frac{\sum_{x_{(i+2):n}} \prod_{j=1}^{n-1} F(x_{j+1}, x_j)}{\sum_{x_{(i+1):n}} \prod_{j=1}^{n-1} F(x_{j+1}, x_j)}$$

$$= \frac{\overbrace{\prod_{j=1}^{i} F(x_{j+1}, x_j)}^{\text{func}(x_{1:(i+1)})}\; \sum_{x_{(i+2):n}} \overbrace{\prod_{j=i+1}^{n-1} F(x_{j+1}, x_j)}^{\text{func}(x_{i+1}, x_{(i+2):n})}}{\underbrace{\prod_{j=1}^{i-1} F(x_{j+1}, x_j)}_{\text{func}(x_{1:i})}\; \sum_{x_{(i+1):n}} \underbrace{\prod_{j=i}^{n-1} F(x_{j+1}, x_j)}_{\text{func}(x_i, x_{(i+1):n})}}$$

$$= F(x_{i+1}, x_i)\, \frac{\text{func}(x_{i+1})}{\text{func}(x_i)} =: g(x_{i+1}, x_i) \;\Rightarrow\; p(x_{i+1} \mid x_{1:i}) = p(x_{i+1} \mid x_i)$$

Page 28

Markov Random Fields

Proof of the Hammersley-Clifford Theorem

Standing assumption: $p(x) > 0\ \forall x$.

Proving “p is Gibbs w.r.t. G ⇒ p is MRF w.r.t. G”.

$p$ is Gibbs w.r.t. $G$ $\Rightarrow$ $p(x) = \prod_{c \in C} F_c(x_c)$ for some $\{F_c\}_{c \in C}$

$$p(x_s \mid {}^{s}x) = \frac{\prod_{c \in C} F_c(x_c)}{\sum_{x_s} \prod_{c \in C} F_c(x_c)} = \frac{\prod_{c \in C: s \notin c} F_c(x_c)\ \prod_{c \in C: s \in c} F_c(x_c)}{\prod_{c \in C: s \notin c} F_c(x_c)\ \sum_{x_s} \prod_{c \in C: s \in c} F_c(x_c)} = \frac{\prod_{c \in C: s \in c} F_c(x_c)}{\sum_{x_s} \prod_{c \in C: s \in c} F_c(x_c)} = \frac{\text{func}(x_s, x_{\eta_s})}{\text{func}(x_{\eta_s})} = g(x_s, x_{\eta_s}) = p(x_s \mid x_{\eta_s})$$

(recall that $p(x_s \mid {}^{s}x) = g(x_s, x_{\eta_s})$ implies that $p(x_s \mid {}^{s}x) = p(x_s \mid x_{\eta_s})$)
$\Rightarrow p$ is an MRF w.r.t. $G$.

The other direction is hard; we omit the proof (cf. Winkler's book if interested).

Page 29

Markov Random Fields

Marginals and Posteriors

Suppose we divide $S$ into "unobservable" (AKA hidden/latent) and "observable" sites:

$$S = A \cup B, \quad A \cap B = \emptyset, \quad x = x_S = (x_A, y_B)$$

Example

Of interest are the statistical structures of $p(x_A \mid y_B)$ and $p(y_B)$.

Page 30

Markov Random Fields

Fact (equivalent characterizations of MRFs)

Let $p(x) > 0\ \forall x$. Let MP stand for "Markov Property". If $A \subset S$, then $\partial A = A^c \cap \bigcup_{s \in A} \eta_s$ is called the Markov blanket of $A$. Let $\bar{A} = A \cup \partial A$.

The following are equivalent:

1. $p(x_s \mid {}^{s}x) = p(x_s \mid x_{\eta_s})\ \forall s \in S$ (i.e., our original definition of an MRF)
2. $p$ is Gibbs w.r.t. $G$
3. Global MP: $A, B, C \subset S$ are disjoint and $C$ separates $A$ and $B$ (i.e., for every $s \in A$ and $t \in B$, any path in $G$ between $s$ and $t$ passes through some $q \in C$) $\Rightarrow x_A \perp\!\!\!\perp x_B \mid x_C$
4. Setwise local MP: $A \subset S \Rightarrow x_A \perp\!\!\!\perp x_{S \setminus \bar{A}} \mid x_{\partial A}$
5. Local MP: $s \in S \Rightarrow x_s \perp\!\!\!\perp x_{S \setminus (\{s\} \cup \eta_s)} \mid x_{\eta_s}$
6. Pairwise MP: distinct $s, t \in S$ with $s \notin \eta_t \Rightarrow x_s \perp\!\!\!\perp x_t \mid x_{S \setminus \{s, t\}}$

Page 31

Markov Random Fields

Posteriors

For every clique (more generally, any subset of $S$) $c$, we have $c = S \cap c = (A \cup B) \cap c = (A \cap c) \cup (B \cap c)$, so we can write $F_c(x_c) = F_c(x_{A \cap c}, y_{B \cap c})$.

We have

$$p(x_A \mid y_B) = \frac{p(x_A, y_B)}{p(y_B)} = \frac{p(x)}{p(y_B)} = \frac{\prod_{c \in C} F_c(x_{A \cap c}, y_{B \cap c})}{p(y_B)} = \frac{\prod_{c \in C: c \cap A \neq \emptyset} F_c(x_{A \cap c}, y_{B \cap c})\ \prod_{c \in C: c \cap A = \emptyset} F_c(y_{B \cap c})}{p(y_B)} \propto \prod_{c \in C: c \cap A \neq \emptyset} F_c(x_{A \cap c}, y_{B \cap c}) = \prod_{c \in C: c \cap A \neq \emptyset} F_c(x_{A \cap c})$$

(in the last equality, the fixed $y_{B \cap c}$ is absorbed into the factor).

Page 32

Markov Random Fields

Posteriors

$$p(x_A \mid y_B) \propto \prod_{c \in C: c \cap A \neq \emptyset} F_c(x_{A \cap c})$$

$\Rightarrow p(x_A \mid y_B)$ is Gibbs w.r.t. $G_A$ (i.e., $G$ restricted to $A$).

$\Rightarrow p(x_A \mid y_B)$ is an MRF w.r.t. $G_A$.

In words: conditioning on a subset of an MRF yields another (somewhat simpler/smaller) MRF.

Example (Hidden Markov Model (HMM))

Here $p(x_A \mid y_B)$ is an MRF w.r.t. a linear graph (i.e., a Markov chain).

Page 33

Markov Random Fields

Marginals

$$p(y_B) = \sum_{x_A} p(x_A, y_B) = \sum_{x_A} \prod_{c \in C} F_c(x_{A \cap c}, y_{B \cap c})$$

Example ($y_1$ and $y_2$ are conditionally independent but not independent):

$$p(x_1, x_2, y_1, y_2) = F_{12}(x_1, x_2)\, G_1(x_1, y_1)\, G_2(x_2, y_2) \Rightarrow$$

$$p(y_1, y_2) = \sum_{x_1, x_2} F_{12}(x_1, x_2)\, G_1(x_1, y_1)\, G_2(x_2, y_2) = G_{12}(y_1, y_2) \overset{\text{typically}}{\neq} H_1(y_1)\, H_2(y_2)$$

so $y_1 \not\perp\!\!\!\perp y_2$ even though $y_1 \perp\!\!\!\perp y_2 \mid x_1, x_2$.
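A quick numeric check of this example (a sketch with random, made-up factor tables):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
F12 = rng.random((2, 2))                     # made-up factor tables, binary variables
G1, G2 = rng.random((2, 2)), rng.random((2, 2))

def joint(x1, x2, y1, y2):
    return F12[x1, x2] * G1[x1, y1] * G2[x2, y2]

# Marginal p(y1, y2): sum out x1 and x2.
p_y = np.array([[sum(joint(x1, x2, y1, y2) for x1, x2 in product((0, 1), repeat=2))
                 for y2 in (0, 1)] for y1 in (0, 1)])
p_y /= p_y.sum()

# Independence would mean p_y equals the outer product of its marginals.
print(np.allclose(p_y, np.outer(p_y.sum(axis=1), p_y.sum(axis=0))))  # typically False
```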

Page 34

Markov Random Fields

Marginals

In fact, more generally, every time we sum out a variable, we create a clique involving all its neighbors ("creating new edges").

Example

Page 35

Markov Random Fields

Marginals

$p(y_B)$ is an MRF w.r.t. $G_B$ ($G$ restricted to $B$) with, in general, an added edge between $s, t \in B$ provided there is a path in $G$ from $s$ to $t$ that goes exclusively through $A$.

Example (Hidden Markov Model (HMM))

Here $p(x_A \mid y_B)$ is an MRF w.r.t. a linear graph (i.e., a Markov chain) while the graph for $p(y_B)$ is fully connected.

Page 36

Markov Random Fields

Version Log

11/3/2019, ver 1.01. S7: Changed $\mathbb{R}_{>0}$ to $\mathbb{R}_{\geq 0}$. S24: Added a sentence. S28: Added a step.

9/3/2019, ver 1.00.
