Overview History and Background Graphical Models Reconstruction Open Issues References Graphical Models Reconstruction Graph Theory Course Project Firoozeh Sepehr April 27 th 2016 Firoozeh Sepehr — Graphical Models Reconstruction 1/50


Page 1: Graphical Models Reconstruction - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2016/Graphical...Foundations of probability theory2 Go back to 16th century when Gerolamo Cardano began

Overview History and Background Graphical Models Reconstruction Open Issues References

Graphical Models Reconstruction

Graph Theory Course Project

Firoozeh Sepehr

April 27th 2016

Firoozeh Sepehr — Graphical Models Reconstruction 1/50


Outline

1 Overview

2 History and Background

3 Graphical Models

4 Reconstruction

5 Open Issues


Overview: What are graphical models?

Graphical Models [1, 2]

Combination of Probability Theory and Graph Theory

Tackling problems of uncertainty and complexity

Utilizing modularity for complex systems

Graphical representation of dependencies embedded in probabilistic models

[Figure: two example graphs on the nodes a, b, c, d, e, f, g. Left: Bayesian/Belief Network. Right: Markov Network.]


Overview: Markov vs Bayesian Networks

Markov Networks

Undirected graphical models

Correlations between variables

Mostly used in physics and vision communities

Bayesian/Belief Networks

Directed graphical models

Directed Acyclic Graphs (DAGs)

Causal relationships between variables

Mostly used in AI and machine learning communities

Use Bayes’ rule for inference


Overview: Applications

Many different applications

Pattern recognition

Diagnosis of diseases

Decision-theoretic systems [4]

Statistical physics

Signal and image processing

Inferring cellular networks in biological systems [3]


History and Background: Probability theory

Foundations of probability theory [2]

The foundations of probability theory go back to the 16th century, when Gerolamo Cardano began a formal analysis of games of chance, followed by additional key developments by Pierre de Fermat and Blaise Pascal in the 17th century. The initial development involved only discrete probability spaces, and the analysis methods were purely combinatorial.

Gerolamo Cardano (Italian, 1501-1576): science, maths, philosophy, and literature [9]

Pierre de Fermat (French, 1601-1665): mathematics and law [10]

Blaise Pascal (French, 1623-1662): theology, mathematics, philosophy, and physics [11]


History and Background: Probability theory

Foundations of probability theory - cont’d

The foundations of modern probability theory were laid by Andrey Kolmogorov in the 1930s.

Andrey Kolmogorov (Russian, 1903-1987): mathematics; known for topology, intuitionistic logic, turbulence studies, classical mechanics, mathematical analysis, and Kolmogorov complexity [12]


History and Background: Bayes' rule

Bayes' theorem [2]

Shown in the 18th century by Reverend Thomas Bayes. This theorem allows us to use a model that tells us the conditional probability of event a given event b in order to compute the reverse conditional: the conditional probability of event b given event a. This type of reasoning is central to the use of graphical models, and of Bayesian networks in particular.

Thomas Bayes (English, 1701-1761): statistician, philosopher, and Presbyterian minister [13]
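As a minimal numeric sketch of this reasoning, the snippet below turns p(a | b) into p(b | a). The disease and test probabilities are made-up illustration values, not taken from the slides, and the function name is hypothetical.

```python
def bayes_posterior(p_a_given_b, p_b, p_a_given_not_b):
    """Return p(b | a) from p(a | b), the prior p(b), and p(a | not b)."""
    # Total probability of a: p(a) = p(a|b) p(b) + p(a|not b) p(not b)
    p_a = p_a_given_b * p_b + p_a_given_not_b * (1.0 - p_b)
    return p_a_given_b * p_b / p_a

# Hypothetical numbers: b = "patient has the disease", a = "test is positive".
posterior = bayes_posterior(p_a_given_b=0.9, p_b=0.01, p_a_given_not_b=0.05)
print(posterior)
```

With these assumed numbers a positive test still leaves the disease fairly unlikely, because the prior p(b) is small.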


History and Background: Origins of graphical models

Origins of graphical models [2]

Representing interactions between variables in a multidimensional distribution using a graph structure originates in several communities

Statistical physics: Gibbs used an undirected graph to represent the distribution over a system of interacting particles

Genetics: path analysis of Sewall Wright proposed the use of a directed graph to study inheritance in natural species

Statistics: Bartlett analyzed interactions between variables in the study of contingency tables, also known as log-linear models

Computer science: Artificial Intelligence (AI) to perform difficult tasks such as oil-well location or medical diagnosis at an expert level


History and Background: Origins of graphical models

Expert systems [2]

Need for methods that allow the integration of multiple pieces of evidence and provide support for making decisions under uncertainty

Considerable success in the 1970s in predicting diseases from evidence such as symptoms and test results

Fell into disfavor in the AI community for two reasons

1 A belief that AI should be based on methods similar to human intelligence

2 The strong independence assumptions made in the existing expert systems were not a flexible, scalable mechanism


History and Background: Origins of graphical models

Expert systems - cont’d

Widespread acceptance of probabilistic methods began in the late 1980s

1 Series of seminal theoretical developments

Bayesian network framework by Judea Pearl and his colleagues in 1988

Foundations for efficient reasoning using probabilistic graphical models by S. L. Lauritzen and D. J. Spiegelhalter in 1988

2 Construction of large-scale, highly successful expert systems based on this framework that avoided the unrealistically strong assumptions made by early probabilistic expert systems

Pathfinder expert system (which assists community pathologists with the diagnosis of lymph-node pathology) constructed by Heckerman and colleagues in 1992 [14]


Graphical Models: Definitions

Directed and undirected graphs

$G = (N, E)$ is an undirected graph

$G = (N, \vec{E})$ is a directed graph

Degree, indegree and outdegree

For a vertex $y \in N$

degree is $\deg(y)$

indegree is $\deg^{-}(y)$

outdegree is $\deg^{+}(y)$

Root and leaf

If $\deg^{-}(y) = 0$, y is a root, and if $\deg^{+}(y) = 0$, y is a leaf
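The degree definitions above can be sketched in a few lines. The edge list is a hypothetical example graph (not the one drawn on the slides), with each directed edge stored as a (tail, head) pair.

```python
def degrees(nodes, edges):
    """Return (indegree, outdegree) dicts for a directed graph of (tail, head) pairs."""
    indeg = {y: 0 for y in nodes}
    outdeg = {y: 0 for y in nodes}
    for tail, head in edges:
        outdeg[tail] += 1
        indeg[head] += 1
    return indeg, outdeg

# Hypothetical example graph: a -> b, a -> c, b -> d, c -> d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
indeg, outdeg = degrees(nodes, edges)
roots = [y for y in nodes if indeg[y] == 0]     # indegree 0
leaves = [y for y in nodes if outdeg[y] == 0]   # outdegree 0
print(roots, leaves)
```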


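Graphical Models: Definitions

Chains and paths

A chain starting from $y_i$ and ending in $y_j$ is an ordered sequence of distinct nodes $(y_{\pi_1}, y_{\pi_2}, \ldots, y_{\pi_{l-1}}, y_{\pi_l})$ where $y_i = y_{\pi_1}$, $y_j = y_{\pi_l}$ and $(y_{\pi_k}, y_{\pi_{k+1}}) \in \vec{E}$ for every $k = 1, \ldots, l-1$

A path starting from $y_i$ and ending in $y_j$ is an ordered sequence of distinct nodes $(y_{\pi_1}, y_{\pi_2}, \ldots, y_{\pi_{l-1}}, y_{\pi_l})$ where $y_i = y_{\pi_1}$, $y_j = y_{\pi_l}$ and either $(y_{\pi_k}, y_{\pi_{k+1}}) \in \vec{E}$ or $(y_{\pi_{k+1}}, y_{\pi_k}) \in \vec{E}$ for every $k = 1, \ldots, l-1$

Note

Chains are a special case of paths!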


Graphical Models: Definitions

Parents, Children, Ancestors, Descendants

Consider a directed graph $G = (N, \vec{E})$ and nodes $y_i, y_j \in N$. Given a set $X \subseteq N$:

$y_i$ is a parent of $y_j$ if there is a directed edge from $y_i$ to $y_j$

$pa(X) := \{y_i \in N \mid \exists y_j \in X : y_i \text{ is a parent of } y_j\}$

$y_j$ is a child of $y_i$ if there is a directed edge from $y_i$ to $y_j$

$ch(X) := \{y_j \in N \mid \exists y_i \in X : y_j \text{ is a child of } y_i\}$

$y_i$ is an ancestor of $y_j$ if there is a chain from $y_i$ to $y_j$

$an(X) := \{y_i \in N \mid \exists y_j \in X : y_i \text{ is an ancestor of } y_j\}$

$y_j$ is a descendant of $y_i$ if there is a chain from $y_i$ to $y_j$

$de(X) := \{y_j \in N \mid \exists y_i \in X : y_j \text{ is a descendant of } y_i\}$

Neighbors

The neighbors of $y_i$, $ngb(y_i)$, are the union of its parents and its children.
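A small sketch of these set definitions, assuming a directed acyclic graph stored as (parent, child) pairs and assuming that a chain has at least one edge, so a node is not counted as its own ancestor or descendant. The example graph and function names are hypothetical.

```python
def parents(edges, X):
    """pa(X): nodes with a directed edge into some node of X."""
    return {u for (u, v) in edges if v in X}

def children(edges, X):
    """ch(X): nodes with a directed edge from some node of X."""
    return {v for (u, v) in edges if u in X}

def descendants(edges, X):
    """de(X): nodes reachable from X by a chain of at least one edge."""
    found, frontier = set(), set(X)
    while frontier:
        frontier = children(edges, frontier) - found
        found |= frontier
    return found

def ancestors(edges, X):
    """an(X): descendants in the graph with every edge reversed."""
    return descendants([(v, u) for (u, v) in edges], X)

def neighbors(edges, y):
    """ngb(y): union of the parents and the children of y."""
    return parents(edges, {y}) | children(edges, {y})

# Hypothetical example graph: a -> b, a -> c, b -> d, c -> d.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(ancestors(edges, {"d"}), neighbors(edges, "b"))
```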


Graphical Models: Definitions

Visualize ...

Roots, Leaves

Paths, Chains

Parents, Children, Ancestors, Descendants, Neighbors

[Figure: example graph on the nodes a, b, c, d, e, f, g.]


Graphical Models: Definitions

Forks, inverted forks and chain links [6]

Consider a path $(y_{\pi_1}, y_{\pi_2}, \ldots, y_{\pi_{l-1}}, y_{\pi_l})$ in a directed graph $G = (N, \vec{E})$. Vertex $y_{\pi_i}$ is

a fork if $(y_{\pi_i}, y_{\pi_{i-1}})$ and $(y_{\pi_i}, y_{\pi_{i+1}})$ are in $\vec{E}$

an inverted fork (or collider) if $(y_{\pi_{i-1}}, y_{\pi_i})$ and $(y_{\pi_{i+1}}, y_{\pi_i})$ are in $\vec{E}$

a chain link in all other cases

[Figure: example graph on the nodes a, b, c, d, e, f, g.]
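The three cases can be checked mechanically for an interior vertex of a path. This is a minimal sketch on a hypothetical graph (not the one in the figure), with directed edges stored as (tail, head) pairs.

```python
def classify(path, edges, i):
    """Classify interior vertex path[i] as 'fork', 'inverted fork', or 'chain link'."""
    prev, node, nxt = path[i - 1], path[i], path[i + 1]
    if (node, prev) in edges and (node, nxt) in edges:
        return "fork"            # both edges point away from the vertex
    if (prev, node) in edges and (nxt, node) in edges:
        return "inverted fork"   # both edges point into the vertex (collider)
    return "chain link"

# Hypothetical example graph: a -> b, a -> c, b -> d, c -> d.
edges = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
path = ["b", "a", "c", "d"]      # b <- a -> c -> d
print([classify(path, edges, i) for i in (1, 2)])
```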


Graphical Models: What is factorization?

Factorization

Joint probability distribution

Using the chain rule and assuming an arbitrary order on the variables [2]

$p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid x_1, x_2, \ldots, x_{i-1})$ (1)

Using graphical models leads to a compact representation [8]

Undirected GM

$p(x_1, x_2, \ldots, x_n) = \frac{1}{Z} \prod_{(i,j) \in E} \phi_{ij}(x_i, x_j)$ (2)

Undirected Tree GM (using junction tree theory)

$p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(x_i) \prod_{(i,j) \in E} \frac{p(x_i, x_j)}{p(x_i)\, p(x_j)}$ (3)

Directed GM

$p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid pa(x_i))$ (4)


Graphical Models: What is factorization?

Example 1

Consider N binary random variables. To represent their joint probability distribution,

the chain rule requires $O(2^N)$ parameters

a GM requires $O(2^{|pa|})$ parameters per variable, where $|pa|$ is the number of parents, which can reduce the number of parameters exponentially depending on which conditional independence assumptions we make; this helps in inference and learning
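To make the gap concrete, the sketch below counts free parameters for binary variables. The chain-structured network used as the example is an assumption for illustration, as are the function names.

```python
def full_joint_params(n):
    """Free parameters of an unrestricted joint over n binary variables."""
    return 2 ** n - 1

def bayes_net_params(num_parents):
    """Free parameters of a binary Bayesian network; num_parents[i] = |pa(x_i)|."""
    # Each variable needs one probability per configuration of its parents.
    return sum(2 ** k for k in num_parents)

n = 20
chain = [0] + [1] * (n - 1)   # hypothetical Markov chain x1 -> x2 -> ... -> xn
print(full_joint_params(n), bayes_net_params(chain))
```

For this assumed chain, the count drops from about a million to a few dozen.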


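Graphical Models: What is factorization?

Example 2

Joint probability distribution [1]

Using the chain rule

$p(x_1, x_2, x_3, x_4, x_5, x_6) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_1, x_2)\, p(x_4 \mid x_1, x_2, x_3)\, p(x_5 \mid x_1, x_2, x_3, x_4)\, p(x_6 \mid x_1, x_2, x_3, x_4, x_5)$ (5)

Using graphical models

$p(x_1, x_2, x_3, x_4, x_5, x_6) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_2, x_5)\, p(x_4 \mid x_1)\, p(x_5 \mid x_4)\, p(x_6 \mid x_5)$ (6)

[Figure: directed graph on the nodes x1, x2, x3, x4, x5, x6.]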
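As a sketch of how the graphical-model factorization is evaluated, the code below reads the parent sets off the conditioning sets in equation (6), fills in arbitrary made-up conditional probabilities for binary variables, and checks that the product of local factors sums to one. The function `p_one` and its numbers are hypothetical.

```python
from itertools import product

# Parent sets read off equation (6): x3 depends on (x2, x5), and so on.
parents = {1: (), 2: (1,), 3: (2, 5), 4: (1,), 5: (4,), 6: (5,)}

def p_one(i, parent_values):
    """Hypothetical CPT entry p(x_i = 1 | parents); the numbers are arbitrary."""
    return (1 + i + 2 * sum(parent_values)) / 12.0

def joint(x):
    """p(x1, ..., x6) as the product of p(x_i | pa(x_i)); x maps index -> 0/1."""
    prob = 1.0
    for i, pa in parents.items():
        p1 = p_one(i, tuple(x[j] for j in pa))
        prob *= p1 if x[i] == 1 else 1.0 - p1
    return prob

total = sum(joint(dict(zip(range(1, 7), bits)))
            for bits in product((0, 1), repeat=6))
num_params = sum(2 ** len(pa) for pa in parents.values())
print(total, num_params)
```

Under these assumptions the factorized model needs 13 parameters, compared with 63 for an unrestricted joint over six binary variables.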


Graphical Models: Fun application of joint distribution factorization

In rooted trees

The joint probability distribution is the same whichever node is chosen as the root! Use Bayes' rule ...

[Figure: the tree on nodes a, b, c in four versions: undirected, a is root, b is root, c is root.]

$p(a, b, c) = p(a)\, p(b \mid a)\, p(c \mid b)$

$= p(a)\, \frac{p(b)}{p(b)}\, \frac{p(a, b)}{p(a)}\, p(c \mid b)$

$= p(b)\, p(a \mid b)\, p(c \mid b)$

$= p(c)\, p(b \mid c)\, p(a \mid b)$ (7)
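Equation (7) can be checked numerically. The sketch below builds the joint of a chain a, b, c from made-up binary tables rooted at a, then rebuilds it from the factorizations rooted at b and at c. All numbers and helper names are assumptions for illustration.

```python
from itertools import product

p_a = {0: 0.3, 1: 0.7}                                # hypothetical p(a)
p_b_a = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.4, 1: 0.6}}    # p_b_a[a][b] = p(b | a)
p_c_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}    # p_c_b[b][c] = p(c | b)

states = list(product((0, 1), repeat=3))
joint = {(a, b, c): p_a[a] * p_b_a[a][b] * p_c_b[b][c] for a, b, c in states}

def marginal(keep):
    """Marginal over the variable positions in keep (0 = a, 1 = b, 2 = c)."""
    out = {}
    for s, p in joint.items():
        key = tuple(s[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

pb, pc = marginal([1]), marginal([2])
pab, pbc = marginal([0, 1]), marginal([1, 2])

def rooted_at_b(a, b, c):
    # p(b) p(a | b) p(c | b)
    return pb[(b,)] * (pab[(a, b)] / pb[(b,)]) * (pbc[(b, c)] / pb[(b,)])

def rooted_at_c(a, b, c):
    # p(c) p(b | c) p(a | b)
    return pc[(c,)] * (pbc[(b, c)] / pc[(c,)]) * (pab[(a, b)] / pb[(b,)])

max_gap = max(
    max(abs(joint[s] - rooted_at_b(*s)), abs(joint[s] - rooted_at_c(*s)))
    for s in states
)
print(max_gap < 1e-12)
```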


Graphical Models: Undirected Graphical Models

Undirected graphical models

Family of multivariate probability distributions that factorize according to a graph $G = (N, E)$

Set of vertices, N, represents random variables

Set of edges, E, encodes the set of conditional independencies between variables

Definition

A random vector X is said to be Markov on G if, for every i, the random variable $x_i$ is conditionally independent of all other variables given its neighbors.

$p(x_i \mid x_{\setminus i}) = p(x_i \mid ngb(x_i))$ (8)

where p is the joint probability distribution.


Graphical Models: Undirected Graphical Models

Tree-structured graphical models

Family of multivariate probability distributions that are Markov on a tree $T = (N, E)$


Graphical Models: Definition

d-separation [6]

A subset of variables S is said to separate $x_i$ from $x_j$ if all paths between $x_i$ and $x_j$ are separated by S

A path P is separated by a subset S of variables if at least one pair of successive edges along P is blocked by S

block [6]

Two edges meeting head-to-tail or tail-to-tail at node x (x is a chain link or a fork) are blocked by S if x is in S

Two edges meeting head-to-head at node x (x is an inverted fork) are blocked by S if neither x nor any of its descendants is in S.
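The definition translates fairly directly into a brute-force check: enumerate every path between the two nodes and test each interior node against the two blocking rules. This is a sketch for small graphs only, and the example graph is hypothetical, not the one used in the slides.

```python
def descendants(edges, node):
    """Nodes reachable from node by following directed edges."""
    found, frontier = set(), {node}
    while frontier:
        frontier = {v for (u, v) in edges if u in frontier} - found
        found |= frontier
    return found

def simple_paths(edges, start, goal, path=None):
    """Yield all paths of distinct nodes from start to goal, ignoring direction."""
    path = path or [start]
    if path[-1] == goal:
        yield path
        return
    for u, v in edges:
        for here, there in ((u, v), (v, u)):
            if here == path[-1] and there not in path:
                yield from simple_paths(edges, start, goal, path + [there])

def path_is_blocked(edges, path, S):
    """True if some pair of successive edges along the path is blocked by S."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, node) in edges and (nxt, node) in edges
        if collider:
            if node not in S and not (descendants(edges, node) & S):
                return True
        elif node in S:          # chain link or fork that is in S
            return True
    return False

def d_separated(edges, x, y, S):
    return all(path_is_blocked(edges, p, S) for p in simple_paths(edges, x, y))

# Hypothetical DAG: x1 -> x2, x1 -> x3, x2 -> x4, x3 -> x4, x4 -> x5.
edges = {("x1", "x2"), ("x1", "x3"), ("x2", "x4"), ("x3", "x4"), ("x4", "x5")}
print(d_separated(edges, "x2", "x3", {"x1"}),
      d_separated(edges, "x2", "x3", {"x1", "x5"}))
```

In this assumed graph, conditioning on x1 separates x2 from x3, while adding x5 (a descendant of the collider x4) opens the path through x4.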


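Graphical Models: Definition

d-separation Example [6]

$\text{d-sep}(x_2, x_3 \mid \{x_1\})$?

$\text{d-sep}(x_2, x_3 \mid \{x_1, x_4\})$?

$\text{d-sep}(x_2, x_3 \mid \{x_1, x_6\})$?

[Figure: directed graph on the nodes x1, x2, x3, x4, x5, x6.]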


Graphical Models: Interesting application

Lumiere project [5]

The Lumiere Project centers on harnessing probability and utility to provide assistance to computer software users. Lumiere prototypes served as the basis for components of the Office Assistant in the Microsoft Office ’97 suite of productivity applications.

Infers a user’s needs by considering a user’s background, actions, and queries

Challenges are

Model construction about time-varying goals of computer users

Needs a large database: over 25,000 hours of usability studies were invested in Office ’97


Reconstruction: What is reconstruction?

Reconstruction

The problem is that samples are available only from a subset of the variables

The goal is to learn the minimal latent tree, that is, a tree without any redundant hidden nodes

Latent and minimal latent trees

A latent tree is a tree with node set $N = V \cup H$, where V is the set of observed nodes and H is the set of latent (hidden) nodes.

The set of minimal latent trees, $\mathcal{T}_{\geq 3}$, is the set of latent trees in which each hidden node has at least three neighbors (hidden or observed)

Note

All leaves are observed, although not all observed nodes need to be leaves.


Graphical Models: Interesting application

Vista system [4]

A decision-theoretic system that has been used at NASA Mission Control Center in Houston for several years.

Uses Bayesian networks to interpret live telemetry and provides advice on the likelihood of alternative failures of the space shuttle’s propulsion systems.

Considers time criticality and recommends actions of the highest expected utility

Employs decision-theoretic methods for controlling the display of information to dynamically identify the most important information to highlight


Reconstruction: Additive metric

Define a measurement [8]

Information distances

Defined for pairwise distributions

For Gaussian graphical models, based on the correlation coefficient of two random variables $x_i$ and $x_j$

$\rho_{ij} = \frac{cov(x_i, x_j)}{\sqrt{var(x_i)\, var(x_j)}}$ (9)

Information distance

$d_{ij} = -\log |\rho_{ij}|$ (10)

Inverse relation between information distance and correlation

Extendable to discrete random variables
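A minimal sketch of equations (9) and (10) on sample data, using the sample correlation as a stand-in for the true correlation coefficient. The data values are made up to show the inverse relation: the strongly correlated pair gets the smaller distance.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def information_distance(xs, ys):
    """d = -log|rho|, as in equation (10)."""
    return -math.log(abs(correlation(xs, ys)))

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
strong = [1.1, 1.9, 3.2, 3.9, 5.1]   # hypothetical, nearly linear in xs
weak = [2.0, 5.0, 1.0, 4.0, 3.5]     # hypothetical, weakly related to xs
d_strong = information_distance(xs, strong)
d_weak = information_distance(xs, weak)
print(d_strong, d_weak)
```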


Reconstruction: Additive metric

Proposition [8]

The information distances $d_{ij}$ are additive tree metrics. In other words, if the joint probability distribution $p(x)$ is a tree-structured graphical model Markov on the tree $T_p = (N, E_p)$, then the information distances are additive on $T_p$.

$\forall k, l \in N : \; d_{kl} = \sum_{(i,j) \in Path(k, l)} d_{ij}$ (11)

Proof

Homework!
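This is not a proof, but the proposition can be sanity-checked on a small case. The sketch assumes a Gaussian Markov chain x0, x1, x2 with unit variances, where conditional independence of x0 and x2 given x1 implies that the end-to-end correlation is the product of the two edge correlations. The numbers are made up.

```python
import math

def information_distance(cov, i, j):
    """d_ij = -log|rho_ij| computed from a covariance matrix (list of lists)."""
    rho = cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])
    return -math.log(abs(rho))

# Hypothetical chain x0 - x1 - x2: rho_01 = 0.8, rho_12 = 0.5,
# and therefore rho_02 = 0.8 * 0.5 = 0.4.
cov = [[1.0, 0.8, 0.4],
       [0.8, 1.0, 0.5],
       [0.4, 0.5, 1.0]]
d01 = information_distance(cov, 0, 1)
d12 = information_distance(cov, 1, 2)
d02 = information_distance(cov, 0, 2)
print(d02, d01 + d12)
```

Taking the negative logarithm turns the product of correlations into a sum of distances, which is the additivity in equation (11).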


Reconstruction: Sibling grouping

Lemma [8]

For distances $d_{ij}$ for all $i, j \in V$ on a tree $T \in \mathcal{T}_{\geq 3}$, the following two properties of $\Phi_{ijk} = d_{ik} - d_{jk}$ hold.

1 $\Phi_{ijk} = d_{ij}$ for all $k \in V \setminus \{i, j\}$ iff i is a leaf and j is its parent. Symmetrically, $\Phi_{ijk} = -d_{ij}$ for all $k \in V \setminus \{i, j\}$ iff j is a leaf and i is its parent

2 $-d_{ij} < \Phi_{ijk} = \Phi_{ijk'} < d_{ij}$ for all $k, k' \in V \setminus \{i, j\}$ iff both i and j are leaves and they have the same parent (they belong to the same sibling group)

Proof of 2

Homework!


Reconstruction: Sibling grouping

Proof of 1

⇐: Using the additive property of information distances, if i is a leaf and j is its parent, then $d_{ik} = d_{ij} + d_{jk}$, and therefore $\Phi_{ijk} = d_{ij}$ for all $k \neq i, j$.

⇒: By contradiction, suppose i and j are not connected by an edge. Then there exists a node $u \neq i, j$ on the path connecting i and j. If $u \in V$, let $k = u$; otherwise, let k be an observed node in the subtree away from i and j, which exists since $T \in \mathcal{T}_{\geq 3}$. Therefore, $d_{ij} = d_{iu} + d_{uj} > d_{iu} - d_{uj} = d_{ik} - d_{kj} = \Phi_{ijk}$, which is a contradiction.

[Figure: nodes i, u, j in a row, with node k attached away from i and j.]


Reconstruction: Sibling grouping

Proof of 1 - cont’d

⇒: By contradiction, suppose i is not a leaf. Then there exists a node $u \neq i, j$ such that $(i, u) \in E$. Let $k = u$ if $u \in V$; otherwise, let k be an observed node in the subtree away from i and j. Therefore, $\Phi_{ijk} = d_{ik} - d_{jk} = -d_{ij} < d_{ij}$, which is again a contradiction. Therefore, i is a leaf.

[Figure: nodes j, i, u, and k, with i adjacent to both j and u.]


Reconstruction: Sibling grouping

Using previous Lemma to determine node relationships [8]

For every pair of $i, j \in V$ consider the following:

1 If $\Phi_{ijk} = d_{ij}$ for all $k \in V \setminus \{i, j\}$, then i is a leaf node and j is a parent of i. Similarly, if $\Phi_{ijk} = -d_{ij}$ for all $k \in V \setminus \{i, j\}$, then j is a leaf and i is a parent of j.

2 If $\Phi_{ijk}$ is constant for all $k \in V \setminus \{i, j\}$ but not equal to either $d_{ij}$ or $-d_{ij}$, then i and j are leaves and they are siblings.

3 If $\Phi_{ijk}$ is not equal for all $k \in V \setminus \{i, j\}$, then there are three cases:

(a) Nodes i and j are not siblings nor have a parent-child relationship.

(b) Nodes i and j are siblings but at least one of them is not a leaf.

(c) Nodes i and j have a parent-child relationship but the child is not a leaf.
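The test above can be sketched as a function of exact pairwise distances. The example is a hypothetical latent tree with one hidden node h joined to observed nodes 1, 2, and 3 (edge lengths 1.0, 2.0, 1.5), plus a leaf 4 attached to node 3 (length 0.5); only the distances between observed nodes are given to the function. A tolerance is used because the distances are floating-point numbers.

```python
def relation(d, i, j, V, tol=1e-9):
    """Classify the pair (i, j) from exact information distances d[u][v]."""
    phis = [d[i][k] - d[j][k] for k in V if k not in (i, j)]
    if all(abs(p - d[i][j]) < tol for p in phis):
        return "i is a leaf, j is its parent"      # case 1
    if all(abs(p + d[i][j]) < tol for p in phis):
        return "j is a leaf, i is its parent"      # case 1, symmetric
    if all(abs(p - phis[0]) < tol for p in phis):
        return "leaf siblings"                     # case 2
    return "neither"                               # case 3 (a, b, or c)

# Distances between observed nodes of the hypothetical tree described above.
V = [1, 2, 3, 4]
pairs = {(1, 2): 3.0, (1, 3): 2.5, (1, 4): 3.0,
         (2, 3): 3.5, (2, 4): 4.0, (3, 4): 0.5}
d = {u: {u: 0.0} for u in V}
for (u, v), w in pairs.items():
    d[u][v] = d[v][u] = w

print(relation(d, 1, 2, V), "|", relation(d, 4, 3, V), "|", relation(d, 1, 3, V))
```

Nodes 1 and 3 share the hidden neighbor h, but node 3 is not a leaf, so the pair falls under case 3(b) and the function reports "neither".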


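Reconstruction: Sibling grouping

Visualize ...

[Figure: two copies of an example tree with edge distances d1 to d8 and marked nodes i and j, illustrating Case 1 and Case 2.]

Case 1: $\Phi_{ijk} = -d_8 = -d_{ij}$ for all $k \in V \setminus \{i, j\}$

Case 2: $\Phi_{ijk} = d_6 - d_7 \neq d_{ij}$, where $d_{ij} = d_6 + d_7$, for all $k \in V \setminus \{i, j\}$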


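Reconstruction: Sibling grouping

Visualize ...

[Figure: three copies of the example tree with edge distances d1 to d8 and marked nodes i, j, k, and k′, illustrating Case 3a, Case 3b, and Case 3c.]

In all three cases $\Phi_{ijk} \neq \Phi_{ijk'}$ for the marked nodes $k, k' \in V \setminus \{i, j\}$

Case 3a: $\Phi_{ijk} = d_4 + d_2 + d_3 - d_7$ and $\Phi_{ijk'} = d_4 - d_2 - d_3 - d_7$

Case 3b: $\Phi_{ijk} = d_4 + d_5$ and $\Phi_{ijk'} = d_4 - d_5$

Case 3c: $\Phi_{ijk} = d_5$ and $\Phi_{ijk'} = -d_5$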


Reconstruction: Recursive Grouping (RG) Algorithm

Recursive Grouping (RG) Algorithm

1 Initialize $Y = V$

2 Compute $\Phi_{ijk} = d_{ik} - d_{jk}$ for all $i, j, k \in Y$

3 Using sibling grouping, define $\{\Pi_l\}_{l=1}^{L}$ to be a partition of Y such that, within every subset $\Pi_l$ with $|\Pi_l| \geq 2$, any two nodes are either siblings which are leaves or have a parent-child relationship in which the child is a leaf

4 Add the singleton sets to $Y_{new}$

5 For each $\Pi_l$ with $|\Pi_l| \geq 2$: if $\Pi_l$ contains a parent node, add that node to $Y_{new}$; otherwise, create a new hidden node, connect it to all the nodes in $\Pi_l$, and add the new node to $Y_{new}$

6 Update $Y_{old}$ to be $Y$ and $Y$ to be $Y_{new}$

7 Compute the distances of the new hidden nodes

8 If $|Y| \geq 3$, go to step 2. Otherwise, if $|Y| = 2$, connect the two remaining nodes in Y and stop; if $|Y| = 1$, stop.
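The steps above can be sketched in code for the idealized case of exact additive distances. This is an illustrative sketch, not the reference implementation: node names are assumed to be strings, new hidden nodes get the hypothetical names "h1", "h2", ..., and the distance from a new hidden node to every other known node is obtained as d(i, l) - d(i, h) through one child i, which is how step 7 is carried out here.

```python
from itertools import permutations

def recursive_grouping(dist, V, tol=1e-9):
    """Sketch of RG for exact additive distances dist[u][v] on observed nodes V.

    Returns the recovered tree as a list of undirected edges.
    """
    d = {u: dict(row) for u, row in dist.items()}    # local copy, extended below
    Y, edges, n_hidden = list(V), [], 0
    while len(Y) >= 3:
        # Steps 2-3: group leaf siblings and leaf/parent pairs using Phi_ijk.
        family, parent_of = {y: {y} for y in Y}, {}
        for i, j in permutations(Y, 2):
            phis = [d[i][k] - d[j][k] for k in Y if k not in (i, j)]
            if any(abs(p - phis[0]) > tol for p in phis):
                continue                              # Phi not constant: no grouping
            if abs(phis[0] - d[i][j]) < tol:
                parent_of[i] = j                      # i is a leaf, j is its parent
            merged = family[i] | family[j]
            for y in merged:
                family[y] = merged
        groups = sorted({frozenset(f) for f in family.values()}, key=sorted)
        Y_new, known = [], list(Y)
        for g in groups:
            if len(g) == 1:                           # step 4: singletons stay active
                Y_new.extend(g)
                continue
            group_parents = {parent_of[i] for i in g if i in parent_of}
            if group_parents:                         # step 5: existing parent node
                p = group_parents.pop()
            else:                                     # step 5: new hidden parent
                n_hidden += 1
                p = f"h{n_hidden}"
                d[p] = {p: 0.0}
                members = sorted(g)
                for i in members:                     # step 7: child-to-parent distance
                    j = next(m for m in members if m != i)
                    k = next(y for y in Y if y not in (i, j))
                    d[i][p] = d[p][i] = 0.5 * (d[i][j] + d[i][k] - d[j][k])
                i = members[0]
                for l in known:                       # step 7: distance to other nodes
                    if l not in g:
                        d[l][p] = d[p][l] = d[i][l] - d[i][p]
                known.append(p)
            edges += [(p, c) for c in sorted(g) if c != p]
            Y_new.append(p)
        if len(Y_new) == len(Y):
            raise ValueError("no grouping found; distances may not be tree-additive")
        Y = Y_new
    if len(Y) == 2:                                   # step 8
        edges.append((Y[0], Y[1]))
    return edges

# Hypothetical latent tree: hidden h joined to "1", "2", "3"; leaf "4" under "3".
V = ["1", "2", "3", "4"]
pairs = {("1", "2"): 3.0, ("1", "3"): 2.5, ("1", "4"): 3.0,
         ("2", "3"): 3.5, ("2", "4"): 4.0, ("3", "4"): 0.5}
dist = {u: {u: 0.0} for u in V}
for (u, v), w in pairs.items():
    dist[u][v] = dist[v][u] = w
print(recursive_grouping(dist, V))
```

With distances estimated from samples, the exact equality tests would need to be replaced by thresholded comparisons; the sketch does not attempt that.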


Reconstruction: Recursive Grouping (RG) Algorithm

Visualize ...

[Figure: a latent tree with observed nodes 1 to 6 and hidden nodes h1, h2, h3, shown in four panels: original latent tree, first iteration, second iteration, third iteration.]


Reconstruction: Recursive Grouping (RG) Algorithm

Proof of step 7

7 Compute the distances of new hidden nodes

Let $i, j \in ch(h)$ and $k \in Y_{old} \setminus \{i, j\}$. We know that $d_{ih} - d_{jh} = d_{ik} - d_{jk} = \Phi_{ijk}$ and $d_{ih} + d_{jh} = d_{ij}$. Therefore, we can recover the distance between a previously active node $i \in Y_{old}$ and its new hidden parent $h \in Y$ using

$d_{ih} = \frac{1}{2}(d_{ij} + \Phi_{ijk})$ (12)

For any other active node $l \in Y$, we can compute $d_{hl}$ from a child node $i \in ch(h)$ using

$d_{hl} = \begin{cases} d_{il} - d_{ih}, & \text{if } l \in Y_{old} \\ d_{ik} - d_{ih} - d_{lk}, & \text{otherwise, where } k \in ch(l) \end{cases}$ (13)


Reconstruction: Recap

Steps to learn a latent tree

1 Define an additive metric

2 Perform the sibling grouping test to determine node relationships

3 Perform RG algorithm


Open Issues: What next?

Improvement!

Probabilistic models are a key component in some challenging applications, and they remain to be applied in other fields

Learning other types of GMs

Polytrees

General graphs

Applying the theorems on random processes

Define interrelations


Homework

Question 1

Prove that information distances are additive tree metrics.
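As a numerical sanity check (not a proof), additivity can be observed on the simplest tree, a three-node chain 1 - 2 - 3. The sketch assumes the Gaussian form of the information distance, $d_{ij} = -\log|\rho_{ij}|$ with $\rho_{ij}$ the correlation coefficient, which is not restated in this section, and it uses the fact that correlations multiply along a Gaussian Markov chain. The correlation values are made up.

```python
import math


def info_distance(rho):
    """Assumed Gaussian information distance: d = -log|rho|."""
    return -math.log(abs(rho))


# Hypothetical correlations on the edges of the chain 1 - 2 - 3.
rho_12, rho_23 = 0.8, -0.5
# For a Gaussian Markov chain, correlations multiply along the path.
rho_13 = rho_12 * rho_23

d_12 = info_distance(rho_12)
d_23 = info_distance(rho_23)
d_13 = info_distance(rho_13)

# Additivity along the path: d_13 should equal d_12 + d_23.
print(d_13, d_12 + d_23)
```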

Question 2

Prove that for distances $d_{ij}$, for all $i, j \in V$, on a tree $T \in \mathcal{T}_{\geq 3}$, the following property of $\Phi_{ijk} = d_{ik} - d_{jk}$ holds:

2 $-d_{ij} < \Phi_{ijk} = \Phi_{ijk'} < d_{ij}$ for all $k, k' \in V \setminus \{i, j\}$ iff both $i$ and $j$ are leaves and they have the same parent (they belong to the same sibling group)


Homework

Question 3

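Draw the digraph associated with the following matrix and answer the following questions.

$\text{d-sep}(x_1, x_2 \mid \{x_6, x_7\})$?

$\text{d-sep}(x_4, x_5 \mid \{x_1, x_2, x_3, x_6\})$?

$\text{d-sep}(x_1, x_7 \mid \{x_3, x_4, x_5\})$?

$$M = \begin{bmatrix}
0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{14}$$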
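Reading the edges off the matrix by hand is error-prone, so a short sketch may help with the drawing step. It assumes the common convention that a 1 in row r, column c means a directed edge from $x_r$ to $x_c$; the slides do not state the convention here, so check it against the definition used earlier in the course. The function name is illustrative.

```python
# Matrix M from Question 3, one row per list.
M = [
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]


def edges_from_matrix(m):
    """List directed edges, assuming m[r][c] == 1 means x_{r+1} -> x_{c+1}."""
    return [
        (f"x{r + 1}", f"x{c + 1}")
        for r, row in enumerate(m)
        for c, value in enumerate(row)
        if value
    ]


edges = edges_from_matrix(M)
for tail, head in edges:
    print(f"{tail} -> {head}")
```

Under this convention the matrix is strictly upper triangular, so every edge points from a lower-indexed node to a higher-indexed one and the digraph has no directed cycles.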


Questions?


References I

[1] Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. J. Pearl, 1988.

[2] Probabilistic Graphical Models: Principles and Techniques. D. Koller, N. Friedman, MIT Press, 2009.

[3] Inferring Cellular Networks Using Probabilistic Graphical Models. N. Friedman, Vol. 303, Issue 5659, pp. 799-805, 2004.

[4] Vista Goes Online: Decision-Analytic Systems for Real-Time Decision-Making in Mission Control. M. Barry, E. Horvitz, C. Ruokangas, S. Srinivas, N94-35063, 1994.

[5] The Lumiere Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users. E. Horvitz, J. Breese, D. Heckerman, D. Hovel, K. Rommelse, 1998.

[6] Fusion, Propagation, and Structuring in Belief Networks. J. Pearl, Artificial Intelligence 29, 1986.


References II

[7] The Recovery of Causal Polytrees from Statistical Data. G. Rebane, J. Pearl, Proceedings of the Third Conference on Uncertainty in Artificial Intelligence, 1987.

[8] Learning Latent Tree Graphical Models. M. J. Choi, V. Y. F. Tan, A. S. Willsky, Journal of Machine Learning Research, Volume 12, 2011.

[9] Gerolamo Cardano. https://en.wikipedia.org/wiki/Gerolamo_Cardano

[10] Pierre de Fermat. https://en.wikipedia.org/wiki/Pierre_de_Fermat

[11] Blaise Pascal. https://en.wikipedia.org/wiki/Blaise_Pascal

[12] Andrey Kolmogorov. https://en.wikipedia.org/wiki/Andrey_Kolmogorov

[13] Thomas Bayes. https://en.wikipedia.org/wiki/Thomas_Bayes

[14] An Evaluation of the Diagnostic Accuracy of Pathfinder. D. E. Heckerman, 1991.
