advanced algorithms and models for computational biology -- a machine learning approach



TRANSCRIPT

Page 1: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Advanced Algorithms and Models for Computational Biology
-- a machine learning approach

Systems Biology:
Inferring gene regulatory networks using graphical models

Eric Xing

Lecture 25, April 19, 2006

Page 2: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: the classic alarm network --- Burglary and Earthquake point to Alarm, Earthquake points to Radio, and Alarm points to Call --- annotated with the CPT for P(A | E, B):]

E    B    | P(a)   P(¬a)
e    b    | 0.9    0.1
e    ¬b   | 0.2    0.8
¬e   b    | 0.9    0.1
¬e   ¬b   | 0.01   0.99

Bayesian Network – CPDs

Local Probabilities: CPD - conditional probability distribution P(Xi|Pai)

Discrete variables: Multinomial Distribution (can represent any kind of statistical dependency)
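For concreteness (not from the slides), such a tabular CPD can be stored as a simple lookup structure; the dictionary layout below is purely illustrative:

```python
# A minimal sketch of storing the discrete CPD P(A | E, B) from the
# alarm network as a lookup table. Keys are (e, b) parent assignments;
# values are distributions over A. Numbers come from the slide's CPT.

cpd_alarm = {
    (1, 1): {1: 0.9,  0: 0.1},
    (1, 0): {1: 0.2,  0: 0.8},
    (0, 1): {1: 0.9,  0: 0.1},
    (0, 0): {1: 0.01, 0: 0.99},
}

def p_alarm(a, e, b):
    """Return P(A=a | E=e, B=b) by table lookup."""
    return cpd_alarm[(e, b)][a]

print(p_alarm(1, e=0, b=1))  # 0.9
```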

Page 3: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: a node X with parents Y_1, ..., Y_k.]

$P(X \mid Y_1, \ldots, Y_k) \sim N\left(a_0 + \sum_{i=1}^{k} a_i y_i,\; \sigma^2\right)$

Bayesian Network – CPDs (cont.)

Continuous variables: e.g. linear Gaussian
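A minimal sketch of drawing a sample from such a linear Gaussian CPD; the parameter values are made up for illustration:

```python
# X | y_1..y_k  ~  N(a_0 + sum_i a_i * y_i, sigma^2)
import numpy as np

def sample_linear_gaussian(y, a0, a, sigma, rng):
    """Draw one sample of X given parent values y (array of length k)."""
    mean = a0 + np.dot(a, y)      # linear function of the parents
    return rng.normal(mean, sigma)

rng = np.random.default_rng(0)
x = sample_linear_gaussian(y=np.array([0.5, -1.0]), a0=0.1,
                           a=np.array([2.0, 0.3]), sigma=0.5, rng=rng)
print(x)
```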

Page 4: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: the alarm network over (B, E, A, C, R) with its CPDs, as on the earlier slide.]

Learning Bayesian Network

The goal:

Given a set of independent samples (assignments to the random variables), find the best (the most likely?) Bayesian network (both the DAG and the CPDs)

(B,E,A,C,R)=(T,F,F,T,F)

(B,E,A,C,R)=(T,F,T,T,F)

……..

(B,E,A,C,R)=(F,T,T,T,F)

Page 5: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Learning Graphical Models

Scenarios:
• completely observed GMs: directed / undirected
• partially observed GMs: directed / undirected (an open research topic)

Estimation principles:
• Maximum likelihood estimation (MLE)
• Bayesian estimation

We use learning as a name for the process of estimating the parameters, and in some cases, the topology of the network, from data.

Page 6: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: a three-node network in which X1 and X2 are the parents of X3.]

Likelihood:

$L(\theta \mid X) = p(X \mid \theta) = p(X_1 \mid \theta_1)\, p(X_2 \mid \theta_2)\, p(X_3 \mid X_1, X_2, \theta_3)$

Log-likelihood:

$l(\theta \mid X) = \log p(X \mid \theta) = \log p(X_1 \mid \theta_1) + \log p(X_2 \mid \theta_2) + \log p(X_3 \mid X_1, X_2, \theta_3)$

Data log-likelihood:

$l(\theta \mid \mathrm{DATA}) = \sum_n \log p(X^{(n)} \mid \theta) = \sum_n \log p(x_1^{(n)} \mid \theta_1) + \sum_n \log p(x_2^{(n)} \mid \theta_2) + \sum_n \log p(x_3^{(n)} \mid x_1^{(n)}, x_2^{(n)}, \theta_3)$

MLE:

$\{\theta_1, \theta_2, \theta_3\}_{\mathrm{MLE}} = \arg\max_\theta\, l(\theta \mid \mathrm{DATA})$

which decomposes into independent sub-problems:

$\theta_1^* = \arg\max \sum_n \log p(x_1^{(n)} \mid \theta_1), \quad \theta_2^* = \arg\max \sum_n \log p(x_2^{(n)} \mid \theta_2), \quad \theta_3^* = \arg\max \sum_n \log p(x_3^{(n)} \mid x_1^{(n)}, x_2^{(n)}, \theta_3)$

The basic idea underlying MLE
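Because the data log-likelihood decomposes per node, the MLE for discrete CPDs reduces to count-and-normalize, per node and per parent configuration. A small illustrative sketch (not the lecture's code):

```python
from collections import Counter, defaultdict

def mle_cpd(samples, node, parents):
    """samples: list of dicts {var: value}.
    Returns the MLE of P(node | parents) as {parent_config: {value: prob}}."""
    counts = defaultdict(Counter)
    for s in samples:
        pa = tuple(s[p] for p in parents)
        counts[pa][s[node]] += 1
    return {pa: {v: n / sum(c.values()) for v, n in c.items()}
            for pa, c in counts.items()}

data = [{"B": 1, "E": 0, "A": 1}, {"B": 1, "E": 0, "A": 0},
        {"B": 0, "E": 0, "A": 0}, {"B": 1, "E": 0, "A": 1}]
print(mle_cpd(data, node="A", parents=("B", "E")))
# {(1, 0): {1: 0.667, 0: 0.333}, (0, 0): {0: 1.0}}
```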

Page 7: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

• Learning the best CPDs given the DAG is easy: collect the statistics of the values of each node given specific assignments to its parents.

• Learning the graph topology (structure) is NP-hard: heuristic search must be applied, and it generally leads to a locally optimal network.

• Overfitting: it turns out that richer structures give higher likelihood P(D|G) to the data (adding an edge is always preferable); more parameters to fit means more freedom, so a more "optimal" CPD always exists. Comparing the structures A → C and {A, B} → C:

$P(C \mid A) \le P(C \mid A, B)$

• We prefer simpler (more explanatory) networks: practical scores regularize the likelihood improvement of complex networks.

Learning Bayesian Network
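The slide does not name a specific score; one common regularized choice (an assumption here, not necessarily the lecture's) is BIC, sketched below:

```python
# BIC = logL - (#params / 2) * log N; higher is better. A denser graph may
# raise the log-likelihood yet still lose once the penalty is counted.
import math

def bic_score(loglik, num_params, num_samples):
    return loglik - 0.5 * num_params * math.log(num_samples)

print(bic_score(-1200.0, num_params=18, num_samples=500))  # sparser graph wins
print(bic_score(-1190.0, num_params=60, num_samples=500))  # despite higher logL
```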

Page 8: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: a learning algorithm takes expression data as input and outputs a network over (B, E, A, C, R).]

Structural EM (Friedman 1998): the original algorithm.

Sparse Candidate algorithm (Friedman et al.): discretize array signals; hill-climbing search using local operators (add/delete/swap of a single edge); feature extraction (Markov relations, order relations); re-assemble high-confidence sub-networks from the features.

Module network learning (Segal et al.): heuristic search of structure in a "module graph"; module assignment; parameter sharing; prior knowledge about possible regulators (TF genes).

BN Learning Algorithms

Page 9: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Bootstrap approach:

[Figure: the data D is resampled (with replacement) into D_1, D_2, ..., D_m, and a network G_i is learned from each resample.]

Estimate the "confidence level" of a feature f as the fraction of the learned networks in which it appears:

$C(f) = \frac{1}{m} \sum_{i=1}^{m} f(G_i)$

Confidence Estimates
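A sketch of this bootstrap procedure; `learn_network` is a hypothetical placeholder for any structure learner that returns the set of learned edges:

```python
# Bootstrap confidence C(f) = (1/m) * sum_i f(G_i): resample the data,
# relearn a network each time, record how often the feature appears.
import random

def bootstrap_confidence(data, feature, learn_network, m=100, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(m):
        resample = [rng.choice(data) for _ in data]  # sample with replacement
        graph = learn_network(resample)              # placeholder: set of edges
        hits += feature in graph
    return hits / m
```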

Page 10: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

The initially learned network of ~800 genes.

[Figure: the full learned network, with a zoomed-in view of the "mating response" substructure, including the genes KAR4, AGA1, PRM1, TEC1, SST2, STE6, KSS1, NDJ1, FUS3, AGA2, YEL059W, TOM6, FIG1, YLR343W, YLR334C, MFA1 and FUS1.]

The “mating response” substructure

Results from SCA + feature extraction (Friedman et al.)

Page 11: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Nature Genetics 34:166-176 (2003)

A Module Network

Page 12: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Why?

Sometimes an UNDIRECTED association graph makes more sense and/or is more informative: gene expression may be influenced by unobserved factors that are post-transcriptionally regulated.

The unavailability of the state of B results in a constraint over A and C.

[Figure: three small graphs over A, B, C in which marginalizing out the unobserved B couples A and C.]

Gaussian Graphical Models

Page 13: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Probabilistic inference on Graphical Models

Page 14: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: a six-node BN with edges X1→X2, X2→X3, X1→X4, X4→X5, X2→X6, X5→X6, annotated with the local CPDs p(X1), p(X2|X1), p(X3|X2), p(X4|X1), p(X5|X4), p(X6|X2,X5).]

Recap of Basic Prob. Concepts

Joint probability distribution on multiple variables, by the chain rule (which holds for any distribution):

$P(X_1, X_2, X_3, X_4, X_5, X_6) = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_1, X_2)\, P(X_4 \mid X_1, X_2, X_3)\, P(X_5 \mid X_1, \ldots, X_4)\, P(X_6 \mid X_1, \ldots, X_5)$

If the Xi's are independent (P(Xi | ·) = P(Xi)):

$P(X_1, X_2, X_3, X_4, X_5, X_6) = P(X_1)\, P(X_2)\, P(X_3)\, P(X_4)\, P(X_5)\, P(X_6) = \prod_i P(X_i)$

If the Xi's are conditionally independent (as described by the GM above), the joint can be factored into simpler products, e.g.,

$P(X_1, X_2, X_3, X_4, X_5, X_6) = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_2)\, P(X_4 \mid X_1)\, P(X_5 \mid X_4)\, P(X_6 \mid X_2, X_5)$
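For concreteness, such a factored joint can be evaluated as a product of local CPD lookups; the `cpds`/`parents` structures below are hypothetical, illustrative choices:

```python
def joint(x, cpds, parents):
    """x: dict node->value; cpds: dict node->fn(value, parent_values);
    parents: dict node->tuple of parent names. Returns the joint probability."""
    p = 1.0
    for node, fn in cpds.items():
        p *= fn(x[node], tuple(x[pa] for pa in parents[node]))
    return p

# Two-node example X1 -> X2:
parents = {"X1": (), "X2": ("X1",)}
cpds = {"X1": lambda v, pa: 0.6 if v else 0.4,
        "X2": lambda v, pa: 0.9 if v == pa[0] else 0.1}
print(joint({"X1": 1, "X2": 1}, cpds, parents))  # 0.6 * 0.9 = 0.54
```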

Page 15: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Probabilistic Inference

We now have compact representations of probability distributions: Graphical Models

A GM M describes a unique probability distribution P

How do we answer queries about P?

We use inference as a name for the process of computing answers to such queries

Page 16: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Most of the queries one may ask involve evidence. Evidence e is an assignment of values to a set E of variables in the domain. Without loss of generality, E = { X_{k+1}, ..., X_n }.

Simplest query: compute the probability of the evidence,

$P(e) = \sum_{x_1} \cdots \sum_{x_k} P(x_1, \ldots, x_k, e)$

This is often referred to as computing the likelihood of e.

Query 1: Likelihood

Page 17: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Often we are interested in the conditional probability distribution of a variable given the evidence:

$P(X \mid e) = \frac{P(X, e)}{P(e)} = \frac{P(X, e)}{\sum_x P(X = x, e)}$

This is the a posteriori belief in X, given evidence e.

We usually query a subset Y of all domain variables, X = {Y, Z}, and "don't care" about the remaining ones, Z:

$P(Y \mid e) = \sum_z P(Y, Z = z \mid e)$

The process of summing out the "don't care" variables z is called marginalization, and the resulting P(y|e) is called a marginal probability.

Query 2: Conditional Probability

Page 18: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Prediction: what is the probability of an outcome given the starting condition?

The query node is a descendant of the evidence.

Diagnosis: what is the probability of a disease/fault given symptoms?

The query node is an ancestor of the evidence.

Learning under partial observation: fill in the unobserved values under an "EM" setting (more on this later).

The directionality of information flow between variables is not restricted by the directionality of the edges in a GM: probabilistic inference can combine evidence from all parts of the network.

[Figure: a chain A -- B -- C queried in both directions.]

Applications of a posteriori Belief

Page 19: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

In this query we want to find the most probable joint assignment (MPA) for some variables of interest.

Such reasoning is usually performed under some given evidence e, and ignoring (the values of) other variables z:

$\mathrm{MPA}(Y \mid e) = \arg\max_y P(y \mid e) = \arg\max_y \sum_z P(y, z \mid e)$

This is the maximum a posteriori configuration of y.

Query 3: Most Probable Assignment

Page 20: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Classification: find the most likely label, given the evidence.

Explanation: what is the most likely scenario, given the evidence?

Cautionary note: the MPA of a variable depends on its "context" --- the set of variables being jointly queried.

Example: MPA of X vs. MPA of (X, Y)?

x  y  | P(x, y)
0  0  | 0.35
0  1  | 0.05
1  0  | 0.30
1  1  | 0.30

Here the MPA of X alone is x = 1 (marginal probability 0.6), while the MPA of (X, Y) is (0, 0) (joint probability 0.35).

Applications of MPA
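A minimal numeric check of this example (illustrative only):

```python
# The table above, worked numerically: the MPA of X alone differs from
# the X-component of the MPA of (X, Y).
table = {(0, 0): 0.35, (0, 1): 0.05, (1, 0): 0.30, (1, 1): 0.30}

marg_x = {x: sum(p for (xi, y), p in table.items() if xi == x) for x in (0, 1)}
print(max(marg_x, key=marg_x.get))   # 1       (P(x=1) = 0.6)
print(max(table, key=table.get))     # (0, 0)  with probability 0.35
```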

Page 21: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Thm:

Computing P(X = x | e) in a GM is NP-hard

Hardness does not mean we cannot solve inference

It implies that we cannot find a general procedure that works efficiently for arbitrary GMs

For particular families of GMs, we can have provably efficient procedures

Complexity of Inference

Page 22: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Approaches to inference

Exact inference algorithms:
• The elimination algorithm
• The junction tree algorithms (but we will not cover them in detail here)

Approximate inference techniques:
• Stochastic simulation / sampling methods
• Markov chain Monte Carlo methods
• Variational algorithms (later lectures)

Page 23: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

A signal transduction pathway A → B → C → D → E: what is the likelihood that protein E is active?

Query: P(e)

$P(e) = \sum_d \sum_c \sum_b \sum_a P(a, b, c, d, e)$

A naive summation needs to enumerate over an exponential number of terms. By the chain decomposition, we get

$P(e) = \sum_d \sum_c \sum_b \sum_a P(a)\, P(b \mid a)\, P(c \mid b)\, P(d \mid c)\, P(e \mid d)$

Marginalization and Elimination

Page 24: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Rearranging terms ...

$P(e) = \sum_d \sum_c \sum_b \sum_a P(a)\, P(b \mid a)\, P(c \mid b)\, P(d \mid c)\, P(e \mid d) = \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d) \sum_a P(a)\, P(b \mid a)$

Elimination on Chains

Page 25: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Now we can perform the innermost summation:

$P(e) = \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d) \sum_a P(a)\, P(b \mid a) = \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d)\, p(b)$

This summation "eliminates" one variable (here A, leaving the chain B → C → D → E) from our summation argument, at a "local cost".

Elimination on Chains

Page 26: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Rearranging and then summing again, we get

$P(e) = \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d)\, p(b) = \sum_d \sum_c P(d \mid c)\, P(e \mid d) \sum_b P(c \mid b)\, p(b) = \sum_d \sum_c P(d \mid c)\, P(e \mid d)\, p(c)$

Elimination in Chains

Page 27: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Eliminating nodes one by one all the way to the end, we get

$P(e) = \sum_d P(e \mid d)\, p(d)$

Complexity:

• Each step costs O(|Val(X_i)| × |Val(X_{i+1})|) operations: O(nk^2) overall.

• Compare to naive evaluation, which sums over the joint values of n-1 variables: O(k^n).

Elimination in Chains
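As an illustration (not the lecture's code), here is a sketch of this left-to-right elimination on a discrete chain; the CPT values are made up:

```python
# Elimination on the chain A->B->C->D->E with binary nodes: each pass
# folds one variable into a running marginal, O(n k^2) overall.
import numpy as np

def chain_marginal(p0, cpds):
    """p0: prior over the first node (length-k array).
    cpds: list of k x k arrays, cpds[i][x, y] = P(X_{i+1}=y | X_i=x).
    Returns the marginal over the last node."""
    msg = p0
    for cpd in cpds:        # eliminate nodes left to right
        msg = msg @ cpd     # new_msg[y] = sum_x msg[x] * P(y | x)
    return msg

prior = np.array([0.3, 0.7])                                   # P(A)
cpds = [np.array([[0.9, 0.1], [0.2, 0.8]]) for _ in range(4)]  # B, C, D, E
print(chain_marginal(prior, cpds))                             # P(E)
```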

Page 28: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

General idea: write the query in the form

$P(X_1, e) = \sum_{x_n} \cdots \sum_{x_3} \sum_{x_2} \prod_i P(x_i \mid pa_i)$

This suggests an "elimination order" over the latent variables to be marginalized. Iteratively:

• Move all irrelevant terms outside of the innermost sum
• Perform the innermost sum, getting a new term
• Insert the new term into the product

Wrap-up:

$P(X_1 \mid e) = \frac{P(X_1, e)}{P(e)}$

Inference on General GM via Variable Elimination
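Below is a compact, illustrative variable-elimination routine; the factor representation (tuples of variable names plus dict tables) is an assumption made for this sketch, not the lecture's implementation:

```python
from itertools import product

def multiply(f, g, domains):
    """Pointwise product of two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    vs = tuple(dict.fromkeys(fv + gv))   # union of variables, order preserved
    table = {}
    for vals in product(*(domains[v] for v in vs)):
        a = dict(zip(vs, vals))
        table[vals] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return vs, table

def sum_out(f, var):
    """Marginalize `var` out of factor f, producing a smaller factor."""
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for vals, p in ft.items():
        key = tuple(v for n, v in zip(fv, vals) if n != var)
        table[key] = table.get(key, 0.0) + p
    return keep, table

def eliminate(factors, order, domains):
    """Eliminate variables in `order`; returns the remaining factor."""
    for var in order:
        related = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        prod = related[0]
        for f in related[1:]:
            prod = multiply(prod, f, domains)
        factors.append(sum_out(prod, var))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, domains)
    return result

# Tiny check on the chain A -> B: eliminating A yields the marginal P(B).
domains = {"A": (0, 1), "B": (0, 1)}
pA = (("A",), {(0,): 0.3, (1,): 0.7})
pB_A = (("A", "B"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8})
print(eliminate([pA, pB_A], order=["A"], domains=domains))
# (('B',), {(0,): 0.41, (1,): 0.59})
```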

Page 29: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: a food web modeled as a network over nodes A through H.]

A food web: what is the probability that hawks are leaving given that the grass condition is poor?

A more complex network

Page 30: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: a regulatory network over nodes A through H, with edges B → C; A → D; A → F; C, D → E; E → G; E, F → H.]

A regulatory network

Example: Variable Elimination

Query: P(A | h). Need to eliminate: B, C, D, E, F, G, H.

Initial factors:

$P(a)\, P(b)\, P(c \mid b)\, P(d \mid a)\, P(e \mid c, d)\, P(f \mid a)\, P(g \mid e)\, P(h \mid e, f)$

Choose an elimination order: H, G, F, E, D, C, B.

Step 1: Conditioning (fix the evidence node, i.e., h, to its observed value $\tilde{h}$):

$m_h(e, f) = p(h = \tilde{h} \mid e, f)$

This step is isomorphic to a marginalization step:

$m_h(e, f) = \sum_h p(h \mid e, f)\, \delta(h = \tilde{h})$

Page 31: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: Variable Elimination

Query: P(A | h). Need to eliminate: B, C, D, E, F, G.

Current factors:

$P(a)\, P(b)\, P(c \mid b)\, P(d \mid a)\, P(e \mid c, d)\, P(f \mid a)\, P(g \mid e)\, m_h(e, f)$

Step 2: Eliminate G; compute

$m_g(e) = \sum_g p(g \mid e) = 1$

$\Rightarrow P(a)\, P(b)\, P(c \mid b)\, P(d \mid a)\, P(e \mid c, d)\, P(f \mid a)\, m_h(e, f)$

Page 32: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: Variable Elimination

Query: P(A | h). Need to eliminate: B, C, D, E, F.

Current factors:

$P(a)\, P(b)\, P(c \mid b)\, P(d \mid a)\, P(e \mid c, d)\, P(f \mid a)\, m_h(e, f)$

Step 3: Eliminate F; compute

$m_f(e, a) = \sum_f p(f \mid a)\, m_h(e, f)$

$\Rightarrow P(a)\, P(b)\, P(c \mid b)\, P(d \mid a)\, P(e \mid c, d)\, m_f(e, a)$

Page 33: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: Variable Elimination

Query: P(A | h). Need to eliminate: B, C, D, E.

Current factors:

$P(a)\, P(b)\, P(c \mid b)\, P(d \mid a)\, P(e \mid c, d)\, m_f(e, a)$

Step 4: Eliminate E; compute

$m_e(a, c, d) = \sum_e p(e \mid c, d)\, m_f(e, a)$

$\Rightarrow P(a)\, P(b)\, P(c \mid b)\, P(d \mid a)\, m_e(a, c, d)$

Page 34: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: Variable Elimination

Query: P(A | h). Need to eliminate: B, C, D.

Current factors:

$P(a)\, P(b)\, P(c \mid b)\, P(d \mid a)\, m_e(a, c, d)$

Step 5: Eliminate D; compute

$m_d(a, c) = \sum_d p(d \mid a)\, m_e(a, c, d)$

$\Rightarrow P(a)\, P(b)\, P(c \mid b)\, m_d(a, c)$

Page 35: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: Variable Elimination

Query: P(A | h). Need to eliminate: B, C.

Current factors:

$P(a)\, P(b)\, P(c \mid b)\, m_d(a, c)$

Step 6: Eliminate C; compute

$m_c(a, b) = \sum_c p(c \mid b)\, m_d(a, c)$

$\Rightarrow P(a)\, P(b)\, m_c(a, b)$

Page 36: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: Variable Elimination

Query: P(A | h). Need to eliminate: B.

Current factors:

$P(a)\, P(b)\, m_c(a, b)$

Step 7: Eliminate B; compute

$m_b(a) = \sum_b p(b)\, m_c(a, b)$

$\Rightarrow P(a)\, m_b(a)$

Page 37: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: Variable Elimination

Query: P(A | h). Need to eliminate: { }.

Current factors:

$P(a)\, m_b(a)$

Step 8: Wrap-up:

$p(a, \tilde{h}) = p(a)\, m_b(a), \qquad p(\tilde{h}) = \sum_a p(a)\, m_b(a)$

$\Rightarrow P(a \mid \tilde{h}) = \frac{p(a)\, m_b(a)}{\sum_a p(a)\, m_b(a)}$

Page 38: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Suppose in one elimination step we compute

$m_x(y_1, \ldots, y_k) = \sum_x m'_x(x, y_1, \ldots, y_k), \qquad m'_x(x, y_1, \ldots, y_k) = \prod_{i=1}^{k} m_i(x, \mathbf{y}_{c_i})$

This requires

$k \cdot |\mathrm{Val}(X)| \cdot \prod_i |\mathrm{Val}(\mathbf{Y}_{c_i})|$ multiplications: for each value of x, y_1, ..., y_k, we do k multiplications;

$|\mathrm{Val}(X)| \cdot \prod_i |\mathrm{Val}(\mathbf{Y}_{c_i})|$ additions: for each value of y_1, ..., y_k, we do |Val(X)| additions.

Complexity is exponential in the number of variables in the intermediate factor.

Complexity of variable elimination

Page 39: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: moralization converts the directed network over A through H into an undirected graph by connecting co-parents and dropping edge directions; graph elimination then produces a sequence of progressively smaller graphs, one per eliminated node.]

Understanding Variable Elimination

A graph elimination algorithm

Intermediate terms correspond to the cliques that result from elimination; "good" elimination orderings lead to small cliques and hence reduce complexity. (What will happen if we eliminate "e" first in the above graph?)

Finding the optimal ordering is NP-hard, but for many graphs an optimal or near-optimal ordering can often be found heuristically.

The same idea applies to undirected GMs.

Page 40: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: the elimination cliques arranged as a clique tree; messages m_b, m_c, m_d, m_e, m_f, m_g, m_h flow along the tree toward the query node. For example,]

$m_e(a, c, d) = \sum_e p(e \mid c, d)\, m_g(e)\, m_f(a, e)$

From Elimination to Message Passing

Our algorithm so far answers only one query (e.g., on one node); do we need to do a complete elimination for every such query?

Elimination amounts to message passing on a clique tree.

Messages can be reused

Page 41: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: the same clique tree answering another query; the messages now flow toward the new query node.]

From Elimination to Message Passing

Our algorithm so far answers only one query (e.g., on one node); do we need to do a complete elimination for every such query?

Elimination amounts to message passing on a clique tree. Another query ...

Messages mf and mh are reused, others need to be recomputed

Page 42: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

The algorithm:
• Construction of a junction tree --- a special clique tree
• Propagation of probabilities --- a message-passing protocol

It results in the marginal probabilities of all cliques --- solving all queries in a single run.

A generic exact inference algorithm for any GM.

Complexity: exponential in the size of the maximal clique --- a good elimination order often leads to a small maximal clique, and hence a good (i.e., thin) JT.

Many well-known algorithms are special cases of JT: forward-backward, Kalman filter, peeling, sum-product ...

A Sketch of the Junction Tree Algorithm

Page 43: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Approaches to inference

Exact inference algorithms:
• The elimination algorithm
• The junction tree algorithms (but we will not cover them in detail here)

Approximate inference techniques:
• Stochastic simulation / sampling methods
• Markov chain Monte Carlo methods
• Variational algorithms (later lectures)

Page 44: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Monte Carlo methods

Draw random samples from the desired distribution; these yield a stochastic representation of a complex distribution, and marginals and other expectations can be approximated using sample-based averages:

$E[f(x)] \approx \frac{1}{N} \sum_{t=1}^{N} f(x^{(t)})$

Monte Carlo is asymptotically exact and easy to apply to arbitrary models.

Challenges:
• How to draw samples from a given distribution (not all distributions can be trivially sampled)?
• How to make better use of the samples (not all samples are useful, or equally useful; see an example later)?
• How do we know we've sampled enough?

Page 45: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: naive sampling

Sampling: construct samples according to the probabilities given in a BN.

Alarm example (choose the right sampling sequence):
1) Sampling: P(B) = <0.001, 0.999>; suppose it comes out false, B0. Same for E0. P(A|B0, E0) = <0.001, 0.999>; suppose it comes out false, and so on down the network.
2) Frequency counting: in the samples at right, P(J|A0) = P(J,A0)/P(A0) = <1/9, 8/9>.

E0 B0 A0 M0 J0

E0 B0 A0 M0 J0

E0 B0 A0 M0 J1

E0 B0 A0 M0 J0

E0 B0 A0 M0 J0

E0 B0 A0 M0 J0

E1 B0 A1 M1 J1

E0 B0 A0 M0 J0

E0 B0 A0 M0 J0

E0 B0 A0 M0 J0

Page 46: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Example: naive sampling (cont.)

Sampling: construct samples according to the probabilities given in a BN.

[The same ten samples as on the previous slide.]

Alarm example (choose the right sampling sequence):
3) What if we want to compute P(J|A1)? We have only one such sample ... P(J|A1) = P(J,A1)/P(A1) = <1, 0>.
4) What if we want to compute P(J|B1)? No such sample is available! P(J|B1) = P(J,B1)/P(B1) cannot even be defined.

For a model with hundreds or more variables, it will be very hard to garner enough samples of rare events, even after a long time of sampling ...
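A sketch of this naive forward-sampling procedure for the alarm network; apart from P(B) = <0.001, 0.999> taken from the slide, the CPT numbers below are assumed for illustration:

```python
import random

rng = random.Random(0)

def bern(p):
    """Return 1 with probability p."""
    return int(rng.random() < p)

def draw_sample():
    b = bern(0.001)                       # P(B) from the slide
    e = bern(0.002)                       # assumed value
    a = bern({(1, 1): 0.95, (1, 0): 0.94,
              (0, 1): 0.29, (0, 0): 0.001}[(b, e)])  # assumed CPT
    j = bern(0.90 if a else 0.05)         # assumed
    m = bern(0.70 if a else 0.01)         # assumed
    return {"B": b, "E": e, "A": a, "J": j, "M": m}

samples = [draw_sample() for _ in range(100000)]
a0 = [s for s in samples if s["A"] == 0]
print(sum(s["J"] for s in a0) / len(a0))  # frequency estimate of P(J=1 | A=0)
```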

Page 47: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Direct sampling: we have seen it; it is very difficult to populate a high-dimensional state space.

Rejection sampling: create samples as in direct sampling, but only count the samples that are consistent with the given evidence.

....

Markov chain Monte Carlo (MCMC)

Monte Carlo methods (cont.)

Page 48: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Samples are obtained from a Markov chain (of sequentially evolving distributions) whose stationary distribution is the desired p(x).

Gibbs sampling: we have the variable set X = {x1, x2, x3, ..., xN};
• at each step, one of the variables Xi is selected (at random or according to some fixed sequence);
• the conditional distribution p(Xi | X_{-i}) is computed;
• a value xi is sampled from this distribution;
• the sample xi replaces the previous value of Xi in X.

Markov chain Monte Carlo

Page 49: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

MCMC

Markov blanket: a variable is independent of all others given its parents, children, and children's parents (cf. d-separation); hence

p(Xi | X_{-i}) = p(Xi | MB(Xi))

Gibbs sampling: create a random initial sample; at every step, choose one variable and resample it from P(X | MB(X)) given the previous sample.

MB(A)={B, E, J, M}

MB(E)={A, B}

Page 50: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

MCMC

To calculate P(J | B1, M1):

Choose (B1, E0, A1, M1, J1) as a start. The evidence variables are B1 and M1; the free variables are A, E, J.

Choose the next variable as A. Sample A from P(A|MB(A)) = P(A|B1, E0, M1, J1); suppose it comes out false:

(B1, E0, A0, M1, J1)

Choose the next variable as E; sample E ~ P(E|B1, A0).

...
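A minimal sketch of this Gibbs pass; `conditional` is a hypothetical helper that would compute P(X = 1 | MB(X)) from the network's CPDs:

```python
import random

def gibbs(state, free_vars, conditional, steps, rng=random.Random(0)):
    """Fix the evidence in `state`, then repeatedly resample each free
    variable from P(X | MB(X)); returns the trace of visited states."""
    trace = []
    for _ in range(steps):
        var = rng.choice(free_vars)          # pick a free variable at random
        p1 = conditional(var, state)         # placeholder: P(var=1 | MB(var))
        state[var] = int(rng.random() < p1)  # resample it
        trace.append(dict(state))
    return trace

# e.g. for P(J | B1, M1): start from (B1, E0, A1, M1, J1) and average J:
# start = {"B": 1, "E": 0, "A": 1, "M": 1, "J": 1}
# trace = gibbs(start, ["A", "E", "J"], conditional, steps=10000)
# est = sum(s["J"] for s in trace) / len(trace)
```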

Page 51: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Complexity for Approximate Inference

The inference problem is NP-hard.

Approximate inference will not reach the exact probability distribution in finite time; it only gets close to it.

It is often much faster than exact inference when the BN is big and complex: in MCMC, we only ever consider P(X|MB(X)), not the whole network.

Page 52: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Multivariate Gaussian over all continuous expression values:

$p(x_1, \ldots, x_n) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right\}$

The precision matrix K = Σ^{-1} reveals the topology of the (undirected) network:

Edge ~ |K_ij| > 0, since

$E(x_i \mid x_{-i}) = -\sum_{j \neq i} (K_{ij} / K_{ii})\, x_j$

Learning algorithm: covariance selection --- we want a sparse precision matrix.
• Regression for each node with a degree constraint (Dobra et al.)
• Regression for each node with a hierarchical Bayesian prior (Li et al.)

Covariance Selection
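A toy illustration of the Edge ~ |K_ij| > 0 rule; plain inversion of the sample covariance stands in for a proper sparse covariance-selection estimator, and the data and threshold are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 3))              # three independent "genes"
x3 = Z[:, 0] + 0.5 * rng.normal(size=500)  # a fourth gene driven by gene 0
X = np.column_stack([Z, x3])               # stand-in expression matrix

K = np.linalg.inv(np.cov(X, rowvar=False))  # empirical precision matrix
edges = [(i, j) for i in range(4) for j in range(i + 1, 4)
         if abs(K[i, j]) > 0.2]             # illustrative cutoff
print(edges)                                # expect roughly [(0, 3)]
```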

Page 53: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: a side-by-side comparison of the network structures recovered by a GGM and by a BN on the same data.]

Gene modules identified using GGM (Li, Yang and Xing)

Page 54: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

[Figure: two Markov-equivalent three-node structures over A, B, C.]

2: Protein-DNA Interaction Network

Expression networks are not necessarily causal. BNs are identifiable only up to Markov equivalence: the two structures above can give the same optimal score, and are not further distinguishable under a likelihood score unless further perturbation experiments are performed.

GGMs yield functional modules, but no causal semantics.

TF-motif interactions provide direct evidence of causal, regulatory dependencies among genes: they are stronger evidence than expression correlations, and they indicate the presence of binding sites on the target gene, which is more easily verifiable. Disadvantages: often very noisy, only applicable to cell cultures, restricted to known TFs ...

Page 55: Advanced Algorithms  and Models for  Computational Biology -- a machine learning approach

Simon I et al. (Young RA), Cell 2001, 106:697-708. Ren B et al. (Young RA), Science 2000, 290:2306-2309.

Advantages:
- Identifies "all" the sites where a TF binds "in vivo" under the experimental condition.

Limitations:
- Expense: only one TF per experiment.
- Feasibility: an antibody is needed for the TF.
- Prior knowledge: one needs to know which TF to test.

ChIP-chip analysis