

Page 1

Network Biology- part IV

Jun Zhu, Ph. D.

Professor of Genomics and Genetic Sciences

Icahn Institute of Genomics and Multi-scale Biology

The Tisch Cancer Institute

Icahn School of Medicine at Mount Sinai

New York, NY

@IcahnInstitute

http://research.mssm.edu/integrative-network-biology/

Email: [email protected]

Page 2

[Figure: association networks, probabilistic causal networks, and biological networks/pathways, compared by biological details revealed and data required to train models]

1. How do genes in the same module interact?

2. How do genes in different modules interact?

3. Can we make causal inferences to elucidate signaling pathway for disease targets?

[Figure title: 4267 top genes in BxH liver female rescan qtl overlap (num(p(GGC)<1e-15)>100 ~abs(cor)>0.5886)]

Page 3

What are Bayesian networks? Association vs Causality

From Stephen Friend

Page 4

A simple biological question: are there causal/reactive relationships?

Page 5

A Bayesian network approach:

Best model

Page 6

What are Bayesian networks?

• A Bayesian network is an expert system that captures all existing knowledge;

• Bayesian networks are also called belief networks, Bayesian belief networks, or causal probabilistic networks;

• A Bayesian network consists of

  • a directed acyclic graph (DAG): a set of nodes and directed edges connecting the nodes

  • a set of conditional probability tables (for discrete data) or probability density functions (for continuous data)

Page 7

A Bayesian network

[Figure: a DAG over the nodes A, B, C, D, E, and F, shown with its conditional probability tables]

$p(C \mid A, B)$

$p(D \mid B)$

$p(E \mid B)$

$p(F \mid C)$
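As a minimal sketch of how a DAG and its conditional probability tables together define a joint distribution, the Python below multiplies one conditional probability per node. The parent sets are read off the conditional distributions listed on this slide; the binary states and every probability value are hypothetical and chosen only for illustration.

```python
from itertools import product

# Structure read off the slide's conditional distributions:
# p(C | A, B), p(D | B), p(E | B), p(F | C); A and B have no parents.
parents = {"A": [], "B": [], "C": ["A", "B"], "D": ["B"], "E": ["B"], "F": ["C"]}

# Hypothetical probability that each binary node equals 1,
# indexed by the tuple of its parents' values.
p_one = {
    "A": {(): 0.3},
    "B": {(): 0.6},
    "C": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9},
    "D": {(0,): 0.2, (1,): 0.7},
    "E": {(0,): 0.5, (1,): 0.1},
    "F": {(0,): 0.3, (1,): 0.8},
}

def joint(assignment):
    """Product over nodes of p(node | parents) for one full 0/1 assignment."""
    prob = 1.0
    for node, pa in parents.items():
        key = tuple(assignment[q] for q in pa)
        p1 = p_one[node][key]
        prob *= p1 if assignment[node] == 1 else 1.0 - p1
    return prob

nodes = list(parents)
# Summing the factorized joint over all 64 states should give 1.
total = sum(joint(dict(zip(nodes, values)))
            for values in product([0, 1], repeat=len(nodes)))
print(round(total, 10))
```

Six small tables (14 numbers here) stand in for a full joint table over 64 states, which is the practical appeal of the factorization.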

Page 8

Bayesian network

• A tree is a Bayesian network

[Figure: a tree over the nodes A, B, C, D, E, and F]

Page 9

Bayesian network

[Figure: a DAG over the nodes A, B, C, D, E, and F]

• A Bayesian network is not necessarily a tree

Page 10

Bayesian network

• Conventional Notations

$$p(\mathbf{A}) = \prod_i p(A_i \mid pa(A_i))$$

$\mathbf{A} = \{A_1, A_2, \ldots, A_n\}$ are nodes.

$p(\mathbf{A})$ is the joint probability of nodes $\mathbf{A}$.

$pa(A_i)$ are parent nodes of $A_i$.

Page 11

Bayesian network

[Figure: node A with directed edges to B, C, D, and E]

• A diverging structure

out-degree = 4

Page 12

Bayesian network

[Figure: nodes A, B, and C with directed edges to D]

• A converging structure

in-degree = 3

Page 13

Bayesian network

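• Why is a DAG required?

$$p(\mathbf{A}) = \prod_i p(A_i \mid pa(A_i))$$

• It is guaranteed that there is a node $A_j$ in a DAG that has no child.

$$
\begin{aligned}
p(\mathbf{A}) &= p(\mathbf{A} \setminus \{A_j\})\, p(A_j \mid \mathbf{A} \setminus \{A_j\}) \\
&= p(\mathbf{A} \setminus \{A_j\})\, p(A_j \mid pa(A_j)) \\
&= \Big( \prod_{i \neq j} p(A_i \mid pa(A_i)) \Big) \, p(A_j \mid pa(A_j))
\end{aligned}
$$

The second line holds because a childless node depends on the remaining nodes only through its parents; the third line follows by applying the same step to $\mathbf{A} \setminus \{A_j\}$, which is again a DAG.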
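The argument on this slide peels off one childless node at a time. A minimal Python sketch of that peeling is given below; the function name `peel_order` and the example graphs are illustrative assumptions, not part of the lecture. If the graph contains a directed cycle, at some point no childless node exists and the factorization cannot be completed.

```python
def peel_order(parents):
    """Repeatedly remove a node with no child; fail if none exists (a cycle).

    parents maps each node to the list of its parent nodes.
    """
    remaining = set(parents)
    order = []
    while remaining:
        # A node is childless if no remaining node lists it as a parent.
        childless = [n for n in remaining
                     if not any(n in parents[m] for m in remaining)]
        if not childless:
            raise ValueError("graph has a directed cycle; it is not a DAG")
        node = sorted(childless)[0]  # deterministic choice for the example
        order.append(node)
        remaining.remove(node)
    return order

# Hypothetical example: edges A->C, B->C, B->D, B->E, C->F.
dag = {"A": [], "B": [], "C": ["A", "B"], "D": ["B"], "E": ["B"], "F": ["C"]}
print(peel_order(dag))
```

Each node removed contributes one factor $p(A_j \mid pa(A_j))$ to the product, so the order in which nodes are peeled is one valid order for building the factorization.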

Page 14

Bayesian network: usages

• Bayesian networks can be used to predict outcomes or diagnose causal effects (if structures are known)

• Bayesian networks can be used to discover causal relationships (if structures are not known)

Page 15

Bayesian network: an example

[Figure: Bayesian network over the nodes Burglar, Earthquake, Alarm, Phone call, Radio, and Internet]

• A burglar alarm system

Page 16

Bayesian network: a classifier

• What is a naïve Bayes net?

[Figure: node A with directed edges to B, C, D, and E]

$$p(A, B, C, D, E) = p(B \mid A)\, p(C \mid A)\, p(D \mid A)\, p(E \mid A)\, p(A)$$

$$p(A \mid B, C, D, E) = \frac{p(A, B, C, D, E)}{p(B, C, D, E)}$$
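To make the classifier concrete, here is a minimal Python sketch that evaluates the posterior of the class node A from the naïve Bayes factorization. All states are binary and every probability is a hypothetical value chosen for illustration; only the structure (A as the single parent of B, C, D, and E) comes from the slide.

```python
# Hypothetical numbers: A is a binary class, B..E are binary features.
p_A = {0: 0.7, 1: 0.3}
# p_feat[X][a] = p(X = 1 | A = a)
p_feat = {
    "B": {0: 0.2, 1: 0.9},
    "C": {0: 0.5, 1: 0.6},
    "D": {0: 0.1, 1: 0.4},
    "E": {0: 0.3, 1: 0.3},
}

def joint(a, obs):
    """p(A = a, B, C, D, E) = p(A = a) * product of p(feature | A = a)."""
    prob = p_A[a]
    for name, value in obs.items():
        p1 = p_feat[name][a]
        prob *= p1 if value == 1 else 1.0 - p1
    return prob

def posterior(obs):
    """p(A | B, C, D, E) = p(A, B, C, D, E) / p(B, C, D, E)."""
    joints = {a: joint(a, obs) for a in p_A}
    evidence = sum(joints.values())  # p(B, C, D, E), summed over A
    return {a: j / evidence for a, j in joints.items()}

obs = {"B": 1, "C": 1, "D": 0, "E": 1}
post = posterior(obs)
print(post)
```

Feature E has the same distribution under both classes in this example, so it leaves the posterior unchanged; the classification is driven by B, C, and D.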

Page 17

Bayesian network

[Figure: network over the nodes A, B, and C]

• How to train a Bayesian network

        A=a1  A=a2  A=a3
B=b1       7    12    25
B=b2      20    30    28
B=b3      25    20     6

        A=a1  A=a2  A=a3
C=c1      15     8    20
C=c2      11    25    18
C=c3      27    10    16
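One common way to train a discrete network with a known structure is to turn such tables into conditional probabilities by normalizing each column. The Python sketch below does this for the B-by-A table, under two assumptions that the slide does not state explicitly: the entries are sample counts, and A is the parent of B. It uses plain maximum-likelihood estimates with no smoothing.

```python
# Table from the slide: rows are states of B, columns are states of A.
# Treated here as sample counts (an assumption).
counts_BA = {
    "b1": {"a1": 7,  "a2": 12, "a3": 25},
    "b2": {"a1": 20, "a2": 30, "a3": 28},
    "b3": {"a1": 25, "a2": 20, "a3": 6},
}

def estimate_cpt(counts):
    """Return cpt[parent_state][child_state] = count / column total."""
    parent_states = list(next(iter(counts.values())))
    cpt = {}
    for a in parent_states:
        column_total = sum(row[a] for row in counts.values())
        cpt[a] = {b: row[a] / column_total for b, row in counts.items()}
    return cpt

cpt_B_given_A = estimate_cpt(counts_BA)
print(cpt_B_given_A["a1"])  # estimates of p(B | A = a1)
```

The same function applied to the C-by-A table would give an estimate of p(C | A) under the same assumptions.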

Page 18

Bayesian network

[Figure: network over the nodes A, B, and C]

• How to construct a Bayesian network? Enumerating possible structures

[Figure: six candidate structures over the nodes A, B, and C]

Page 19

Bayesian network

• How to construct a Bayesian network? Enumerating all possible structures is impossible

$N^N$, where $N$ is the number of nodes
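To give a feel for how fast the search space grows, the Python sketch below counts labeled DAGs exactly using Robinson's recurrence. This is a swapped-in illustration: the slide quotes only the rough figure $N^N$, and the recurrence is not taken from the lecture.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
               for k in range(1, n + 1))

for n in range(1, 8):
    print(n, count_dags(n))
```

Three nodes already admit 25 DAGs and five nodes 29281, so exhaustive scoring is out of reach for networks with thousands of genes.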

Page 20

Bayesian network

• How to construct a Bayesian network? Heuristic approach

[Figure, panels a-c: node $x_i$ with parents $Pa_1$, $Pa_2$, ..., $Pa_n$; further labels in the figure: X, $Pa_{n+1}$, $Pa_j$]

Page 21

Bayesian network

• How to construct a Bayesian network? Heuristic approach

[Figure: node D with parents A and B]

Parameters to estimate = 3 × 3 × 3

[Figure: node D with parents A, B, and C]

Parameters to estimate = 3 × 3 × 3 × 3
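The two counts above follow one pattern: a node with n parents needs a table with one entry per combination of its own state and its parents' states. A short Python sketch, assuming every node has three states as the slide's numbers imply (the function name is illustrative):

```python
def cpt_entries(n_states, n_parents):
    """Entries in a full conditional probability table: states^(parents + 1)."""
    return n_states ** (n_parents + 1)

for n_parents in range(0, 6):
    print(n_parents, cpt_entries(3, n_parents))
```

Each added parent multiplies the table size by three while the number of samples stays fixed, which is why adding parents quickly becomes hard to support with data.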

Page 22

Bayesian network

• How to construct a Bayesian network? Heuristic approach

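$$p(M \mid D) = \frac{p(D \mid M)\, p(M)}{p(D)}$$

$$\mathrm{BIC} = -2 \ln \hat{p}(D \mid M) + k \ln(n)$$

$\hat{p}(D \mid M)$: maximized likelihood of the data $D$ under model $M$

$n$: number of samples

$k$: number of parameters to estimate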
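As a hedged illustration of BIC-based structure scoring, the Python sketch below compares two local models for node B using the B-by-A table from the training slide: B with no parent versus B with parent A. Treating the table entries as counts, using maximum-likelihood estimates, and counting only free parameters for k are assumptions of this sketch; the slides count every table entry instead.

```python
from math import log

# B-by-A table from the training slide, treated as sample counts (assumption).
counts_BA = {
    "b1": {"a1": 7,  "a2": 12, "a3": 25},
    "b2": {"a1": 20, "a2": 30, "a3": 28},
    "b3": {"a1": 25, "a2": 20, "a3": 6},
}

def log_lik_with_parent(counts):
    """Maximized log-likelihood of B when A is its parent."""
    ll = 0.0
    for a in next(iter(counts.values())):
        column = [row[a] for row in counts.values()]
        total = sum(column)
        ll += sum(c * log(c / total) for c in column if c > 0)
    return ll

def log_lik_no_parent(counts):
    """Maximized log-likelihood of B when it has no parent."""
    rows = [sum(row.values()) for row in counts.values()]
    total = sum(rows)
    return sum(r * log(r / total) for r in rows if r > 0)

def bic(log_lik, k, n):
    """BIC = -2 ln(maximized likelihood) + k ln(n); lower is better."""
    return -2.0 * log_lik + k * log(n)

n = sum(sum(row.values()) for row in counts_BA.values())  # number of samples
r, q = 3, 3  # states of B, states of A
bic_no_parent = bic(log_lik_no_parent(counts_BA), r - 1, n)
bic_with_parent = bic(log_lik_with_parent(counts_BA), (r - 1) * q, n)
print(bic_no_parent, bic_with_parent)
```

With free-parameter counting the edge from A to B gets the lower score here. The penalty term is sensitive to how k is counted, and a heavier count can change which structure wins, so the convention should be fixed before comparing models.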

Page 23

Bayesian network

• How to construct a Bayesian network? averaging

Zhu et al., PLoS CompBio, 2007

Zhu et al., Nature Genetics, 2008

Page 24

Bayesian network

• How to construct a Bayesian network? Enforcing DAG after averaging

1. Calculate shortest distance

2. Identify loops

3. Remove the weakest link in a loop

4. Go to step 1
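The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the published implementation: shortest distances are found with breadth-first search, a loop is detected when the target of an edge can reach its source, and the "weakest link" is taken to be the edge with the lowest weight, where the weights are hypothetical confidence values.

```python
from collections import deque

def shortest_path(edges, src, dst):
    """Breadth-first search; returns the node list src..dst, or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for (u, v) in edges:
            if u == node and v not in prev:
                prev[v] = node
                queue.append(v)
    return None

def enforce_dag(weights):
    """weights: {(u, v): confidence}. Returns a copy with all loops broken."""
    edges = dict(weights)
    while True:
        # Steps 1-2: edge u -> v lies on a loop if v can reach u.
        loops = []
        for (u, v) in edges:
            back = shortest_path(edges, v, u)
            if back is not None:
                loops.append([(u, v)] + list(zip(back, back[1:])))
        if not loops:
            return edges
        loop = min(loops, key=len)                   # handle a shortest loop first
        weakest = min(loop, key=lambda e: edges[e])  # step 3: weakest link
        del edges[weakest]                           # step 4: repeat from step 1

# Hypothetical averaged network with one loop A -> B -> C -> A.
weights = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "A"): 0.3, ("C", "D"): 0.7}
print(enforce_dag(weights))
```

In the example only the low-confidence edge C -> A is dropped, which leaves a DAG while keeping the better-supported edges.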

Zhu et al., PLoS CompBio, 2007

Page 25

Bayesian network

• How to construct a Bayesian network? Upper limit on in-degree

[Figure: node $x_i$ with parents $Pa_1$, $Pa_2$, ..., $Pa_n$]

Parameters to estimate = $3^{n+1}$

Page 26

Bayesian network

• Continuous vs discrete models

• A discrete model is faster and captures high-order interactions more easily

• Any discretization loses information
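A small Python sketch of what discretization does to continuous measurements is shown below. Splitting at empirical tertiles into three levels is an assumption made for illustration (the lecture does not say which discretization it uses), and the expression values are made up.

```python
from statistics import quantiles

def discretize(values, n_bins=3):
    """Map each value to a bin index 0..n_bins-1 using empirical quantiles."""
    cuts = quantiles(values, n=n_bins)  # n_bins - 1 cut points
    return [sum(v > c for c in cuts) for v in values]

expression = [0.1, 0.4, 0.5, 1.2, 1.3, 2.8, 3.0, 3.1, 7.5]  # hypothetical values
levels = discretize(expression)
print(levels)
```

The values 3.0 and 7.5 fall into the same level, so the distinction between them is no longer available to the model; that is the information lost in exchange for small, fast-to-estimate tables.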

Page 27

Bayesian network

• Missing information

[Figure: three networks over the nodes A, B, and C; two of them include an additional node X]

Page 28

Bayesian network

• A biological network is context specific

• A Bayesian network is just a snapshot under a specific condition

Page 29

Other ways to infer causal networks

• Boolean network, graphical Gaussian model

• Conditional mutual information

• ODE model

• Structural equation model

Page 30

Modeling by differential equations

$$dx/dt = f(x) + u$$

$f$: response function

$x$: observed states

$u$: perturbation

Gardner et al, Science, 2003

Page 31

Modeling by ordinary differential equations (ODE)

▶ Assume steady state: $dx/dt = 0$

▶ Assume linear relationships: $f(x) = Ax$

$$Ax + u = 0$$

$A$: regulatory matrix

$$x + A^{-1}u = 0$$

$A^{-1}$: response matrix
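Under the two assumptions on this slide, the steady-state expression follows from a perturbation by solving a linear system. The Python sketch below does this for a two-gene example; the matrix entries and perturbations are hypothetical, and the explicit 2x2 inverse is used only to keep the example free of external libraries.

```python
def solve_2x2(A, b):
    """Solve A x = b for a 2x2 matrix A using its explicit inverse."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("A is singular; no unique steady state")
    return [(a22 * b[0] - a12 * b[1]) / det,
            (-a21 * b[0] + a11 * b[1]) / det]

def steady_state(A, u):
    """x such that A x + u = 0, i.e. x = -A^{-1} u."""
    return solve_2x2(A, [-u[0], -u[1]])

# Hypothetical regulatory matrix: gene 2 activates gene 1, both self-degrade.
A = [[-1.0, 0.5],
     [0.0, -2.0]]
x_perturb_gene1 = steady_state(A, [1.0, 0.0])
x_perturb_gene2 = steady_state(A, [0.0, 1.0])
print(x_perturb_gene1, x_perturb_gene2)
```

Perturbing gene 2 shifts gene 1 as well, while perturbing gene 1 leaves gene 2 untouched; that asymmetry reflects the direction of the single regulatory edge in A.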

Page 32

ODE: advantages and disadvantages

▶ Advantages:

• Simple

• Can model feedback loops

▶ Disadvantages:

• Need a large amount of data

• Need even more data to capture non-linear relationships

Page 33

Acknowledgements

Mount Sinai

Genomics Institute

Eric Schadt

Bin Zhang

Zhidong Tu

Charles Powell

Patrizia Casaccia

Zhu lab

Seungyeul Yoo

Eunjee Lee

Li Wang

Luan Lin

Quan Long

Supported by:

• Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai

• Janssen

• Canary Foundation

• Prostate Cancer Foundation

• NIH

• NCI

Boston University

Avrum Spira

Joshua Campbell

U Washington

Roger Bumgarner

Berkeley

Rachel Brem

Princeton

Leonid Kruglyak