Learning Module Networks

Eran Segal, Stanford University

Joint work with: Dana Pe’er (Hebrew U.), Daphne Koller (Stanford), Aviv Regev (Harvard), Nir Friedman (Hebrew U.)

Learning Bayesian Networks

Density estimation: model the data distribution in the population

Probabilistic inference: prediction, classification

Dependency structure: interactions between variables, causality, scientific discovery

Stock Market

Learn dependency of stock prices as a function of:

Global influencing factors

Sector influencing factors

Price of other major stocks

[Figure: stock price series, Jan. ’02 to Jan. ’03, for MSFT, DELL, INTL, NVLS, and MOT, with a fragment of the learned Bayesian network (BN)]

Stock Market

4411 stocks (variables)

273 trading days (instances), from Jan. ’02 to Mar. ’03

Problems: statistical robustness, interpretability

Key Observation

Many stocks depend on the same influencing factors in much the same way

Example: Intel, Novellus, Motorola, and Dell depend on the price of Microsoft

Many other domains have similar characteristics: gene expression, collaborative filtering, computer network performance, …

[Figure: Bayesian network over the stock variables, including AMAT and HPQ, with price series from Jan. ’02 to Jan. ’03]

The Module Network Idea

[Figure: module network with the stocks grouped into Module I, Module II, and Module III]

Problems and Solutions

Statistical robustness: share parameters and dependencies between variables with similar behavior

Interpretability: explicit modeling of modular structure

Outline

Module Network: probabilistic model, learning the model

Experimental results

Module Network Components

Module assignment function: A(MSFT) = M_I, A(MOT) = A(DELL) = A(INTL) = M_II, A(AMAT) = A(HPQ) = M_III

Set of parents for each module: Pa(M_I) = ∅, Pa(M_II) = {MSFT}, Pa(M_III) = {DELL, INTL}

CPD template for each module

[Figure: module network with Modules I, II, and III, each with its own CPD template]

Ground Bayesian Network

A module network induces a ground BN over X

A module network defines a coherent probability distribution over X if the ground BN is acyclic
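The way a module network induces a ground BN can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `assignment`, `module_parents`, and `ground_bn_edges` are hypothetical, the module labels M1 to M3 stand for Modules I to III, and the empty parent set for M1 is an assumption.

```python
# Module assignment function A (variable -> module), as in the slide example.
assignment = {
    "MSFT": "M1",
    "MOT": "M2", "DELL": "M2", "INTL": "M2",
    "AMAT": "M3", "HPQ": "M3",
}

# Parent set Pa(M) of each module. The empty set for M1 is an assumption.
module_parents = {"M1": [], "M2": ["MSFT"], "M3": ["DELL", "INTL"]}


def ground_bn_edges(assignment, module_parents):
    """Every variable inherits the parents of the module it is assigned to."""
    return {
        (parent, child)
        for child, module in assignment.items()
        for parent in module_parents[module]
    }


edges = ground_bn_edges(assignment, module_parents)
for parent, child in sorted(edges):
    print(f"{parent} -> {child}")
```

Each variable in Module II receives MSFT as its parent, and each variable in Module III receives both DELL and INTL, so six variables and three parent sets expand into seven ground edges.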

[Figure: module network and the ground Bayesian network it induces]

Module Graph

Nodes correspond to modules

Edge M_i → M_j if at least one variable in M_i is a parent of M_j

[Figure: module network and its module graph over M_I, M_II, M_III]

Theorem: The ground BN is acyclic if the module graph is acyclic

Acyclicity can be checked efficiently using the module graph
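A minimal sketch of the module-graph check, assuming the same example assignment as in the slides. The function names are hypothetical, and Kahn's topological sort is one standard way to test acyclicity, not necessarily the one used by the authors.

```python
def module_graph(assignment, module_parents):
    """Edge Mi -> Mj if at least one variable assigned to Mi is a parent of Mj."""
    return {
        (assignment[parent], module)
        for module, parents in module_parents.items()
        for parent in parents
    }


def is_acyclic(nodes, edges):
    """Kahn's algorithm: a graph is acyclic iff every node can be removed."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for u, v in edges:
        indegree[v] += 1
        children[u].append(v)
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in children[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(nodes)


assignment = {"MSFT": "M1", "MOT": "M2", "DELL": "M2", "INTL": "M2",
              "AMAT": "M3", "HPQ": "M3"}
module_parents = {"M1": [], "M2": ["MSFT"], "M3": ["DELL", "INTL"]}

graph_edges = module_graph(assignment, module_parents)
ok = is_acyclic(list(module_parents), graph_edges)

# Moving DELL into Module III makes Module III its own parent (a self-loop).
moved = dict(assignment, DELL="M3")
ok_after_move = is_acyclic(list(module_parents), module_graph(moved, module_parents))
```

The check runs on three module nodes rather than six variable nodes; with thousands of variables and far fewer modules the saving is correspondingly larger.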

Outline

Experimental results

Learning Overview

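Given data D, find assignment function A and structure S that maximize the Bayesian score

Marginal data likelihood:

$$P(D \mid S, A) = \int P(D \mid S, A, \theta)\, P(\theta \mid S, A)\, d\theta$$

where P(D | S, A, θ) is the data likelihood and P(θ | S, A) is the parameter prior

Bayesian score:

$$\mathrm{score}(S, A : D) = \log P(D \mid S, A) + \log P(S, A)$$

where P(D | S, A) is the marginal likelihood and P(S, A) is the assignment/structure prior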

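Likelihood Function

[Figure: module network with data Instances 1 to 3 and shared module parameters θ_{M_II | MSFT} and θ_{M_III | DELL, INTL}]

Ŝ(X, Y): sufficient statistics of (X, Y)

$$\hat{S}(M_{II}, \mathrm{MSFT}) = \hat{S}(\mathrm{MOT}, \mathrm{MSFT}) + \hat{S}(\mathrm{DELL}, \mathrm{MSFT}) + \hat{S}(\mathrm{INTL}, \mathrm{MSFT})$$

Likelihood function decomposes by modules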
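Pooling sufficient statistics across the variables of a module can be illustrated with simple counts. This sketch assumes discrete "up"/"down" values and three made-up instances; the data and the function names are hypothetical and only serve to show how the statistics of all module members are added together.

```python
from collections import Counter


def suff_stats(data, child, parent):
    """Counts of (parent value, child value) pairs for a single variable."""
    return Counter((row[parent], row[child]) for row in data)


def module_suff_stats(data, members, parent):
    """Module statistics: the sum of the statistics of all member variables."""
    total = Counter()
    for var in members:
        total += suff_stats(data, var, parent)
    return total


# Three hypothetical instances (not taken from the slides).
data = [
    {"MSFT": "up", "MOT": "up", "DELL": "up", "INTL": "down"},
    {"MSFT": "down", "MOT": "down", "DELL": "down", "INTL": "down"},
    {"MSFT": "up", "MOT": "up", "DELL": "down", "INTL": "up"},
]

pooled = module_suff_stats(data, ["MOT", "DELL", "INTL"], "MSFT")
print(pooled)
```

With three instances and three member variables, the shared CPD of the module is estimated from nine observations instead of three, which is the source of the statistical robustness.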

Bayesian Score Decomposition

Bayesian score decomposes by modules

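$$\mathrm{score}(S, A : D) = \sum_j \mathrm{score}_{M_j}(\mathrm{Pa}_{M_j}, X^j : D)$$

where X^j denotes the module j variables and Pa_{M_j} the module j parents

Module I: score_{M_1}(∅, X^1 : D)

Module II: score_{M_2}(MSFT, X^2 : D)

Module III: score_{M_3}({DELL, INTL}, X^3 : D)

Operator "Delete INTL → Module III": the Module III term becomes score_{M_3}(DELL, X^3 : D)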

Reassignment example: changing A(MOT) = 2 to A(MOT) = 1, with module terms score_{M_1}(∅, X^1 : D), score_{M_2}(MSFT, X^2 : D), and score_{M_3}(DELL, X^3 : D)
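The practical consequence of the decomposition is that a local change only requires rescoring the affected module. The sketch below uses a stand-in local score (the maximized log-likelihood of a shared CPD table) rather than the Bayesian score of the talk; the data values and all names are hypothetical.

```python
import math
from collections import Counter


def module_score(data, members, parents):
    """Stand-in for score_Mj: maximized log-likelihood of one shared CPD table."""
    joint, context_total = Counter(), Counter()
    for row in data:
        context = tuple(row[p] for p in parents)
        for var in members:
            joint[(context, row[var])] += 1
            context_total[context] += 1
    return sum(n * math.log(n / context_total[c]) for (c, _), n in joint.items())


# Four hypothetical instances (not taken from the slides).
rows = [
    {"MSFT": "up", "MOT": "up", "DELL": "up", "INTL": "up", "AMAT": "up", "HPQ": "down"},
    {"MSFT": "up", "MOT": "up", "DELL": "down", "INTL": "up", "AMAT": "down", "HPQ": "down"},
    {"MSFT": "down", "MOT": "down", "DELL": "down", "INTL": "up", "AMAT": "down", "HPQ": "up"},
    {"MSFT": "down", "MOT": "down", "DELL": "up", "INTL": "down", "AMAT": "up", "HPQ": "up"},
]
members = {"M1": ["MSFT"], "M2": ["MOT", "DELL", "INTL"], "M3": ["AMAT", "HPQ"]}
parents = {"M1": [], "M2": ["MSFT"], "M3": ["DELL", "INTL"]}

before = {m: module_score(rows, members[m], parents[m]) for m in members}

# Operator "delete INTL as a parent of Module III": only the M3 term is rescored.
after = dict(before)
after["M3"] = module_score(rows, members["M3"], ["DELL"])

total_before = sum(before.values())
total_after = sum(after.values())
```

The terms for M1 and M2 are reused unchanged, so the change in the total score equals the change in the M3 term alone.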

Algorithm Overview

Find assignment function A and structure S that maximize the Bayesian score

$$\mathrm{score}(S, A : D) = \log P(D \mid S, A) + \log P(S, A)$$

Find initial assignment A

Iterate:

Improve structure (dependency structure S)

Improve assignment (assignment function A)

Initial Assignment Function

[Figure: data matrix with variables (stocks) along one axis and instances x[2], x[3], x[4], … along the other]

Find variables that are similar across instances

Example: A(MOT) = A(INTL) = A(DELL) = M_II
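One way to find variables that are similar across instances is to cluster their profiles. The sketch below uses greedy agglomerative clustering with average-linkage Pearson correlation; this particular clustering choice, the function names, and the numeric profiles are assumptions made for illustration, since the slides do not specify the method. It also assumes no profile is constant (a constant profile would give zero variance).

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def initial_assignment(profiles, num_modules):
    """Merge the two most correlated clusters until num_modules remain."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        pairs = [(u, v) for u in c1 for v in c2]
        return sum(pearson(profiles[u], profiles[v]) for u, v in pairs) / len(pairs)

    while len(clusters) > num_modules:
        i, j = max(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return {name: k for k, cluster in enumerate(clusters) for name in cluster}


# Hypothetical profiles: one value per instance for each variable.
profiles = {
    "MSFT": [1.0, 2.0, 1.5, 3.0],
    "MOT": [2.0, 1.0, 2.5, 0.5],
    "DELL": [2.1, 1.2, 2.4, 0.6],
    "INTL": [1.9, 0.9, 2.6, 0.4],
    "AMAT": [0.5, 0.6, 3.0, 3.1],
    "HPQ": [0.4, 0.7, 2.9, 3.2],
}
init = initial_assignment(profiles, num_modules=3)
```

The resulting module indices are arbitrary labels; only the grouping matters, since the later structure and assignment steps refine it.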

Learning Dependency Structure

Heuristic search with operators: add/delete parent for module; cannot reverse edges

Handle acyclicity: can be checked efficiently on the module graph

Efficient computation: after applying an operator for module M_j, only update the scores of operators for module M_j

[Figure: module network and module graph with candidate operators INTL → Module I, INTL → Module III, and MSFT → Module II]

Learning Dependency Structure

Structure search done at module level: parent selection, acyclicity checking

Reduced search space relative to BN

Individual variables only used for computation of sufficient statistics
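A module-level greedy search with add/delete-parent operators might look like the sketch below. It is an illustration under stated assumptions: the local score is a stand-in (shared-CPD log-likelihood minus a per-parent penalty, not the Bayesian score), all names and data are hypothetical, and for brevity every operator is rescored in each round instead of caching the scores of untouched modules.

```python
import math
from collections import Counter


def module_score(data, members, parents, penalty=1.0):
    """Stand-in local score: shared-CPD log-likelihood minus a per-parent penalty."""
    joint, context_total = Counter(), Counter()
    for row in data:
        context = tuple(row[p] for p in sorted(parents))
        for var in members:
            joint[(context, row[var])] += 1
            context_total[context] += 1
    loglik = sum(n * math.log(n / context_total[c]) for (c, _), n in joint.items())
    return loglik - penalty * len(parents)


def creates_cycle(assignment, parents):
    """Depth-first search for a cycle in the module graph."""
    graph = {m: {assignment[p] for p in ps} for m, ps in parents.items()}
    state = {}

    def visit(m):
        if state.get(m) == "open":
            return True
        if state.get(m) == "done":
            return False
        state[m] = "open"
        if any(visit(q) for q in graph[m]):
            return True
        state[m] = "done"
        return False

    return any(visit(m) for m in graph)


def greedy_structure_search(data, assignment, candidates, modules):
    parents = {m: set() for m in modules}
    members = {m: [v for v, a in assignment.items() if a == m] for m in modules}
    scores = {m: module_score(data, members[m], parents[m]) for m in modules}
    while True:
        best = None
        for m in modules:
            for var in candidates:
                trial = dict(parents)
                trial[m] = parents[m] ^ {var}  # add if absent, delete if present
                if creates_cycle(assignment, trial):
                    continue
                gain = module_score(data, members[m], trial[m]) - scores[m]
                if gain > 1e-9 and (best is None or gain > best[0]):
                    best = (gain, m, trial[m])
        if best is None:
            return parents, scores
        gain, m, new_parents = best
        parents[m] = new_parents
        scores[m] += gain


# Hypothetical data in which every Module II stock copies MSFT.
data = [{"MSFT": v, "MOT": v, "DELL": v, "INTL": v} for v in ["up", "down"] * 4]
assignment = {"MSFT": "M1", "MOT": "M2", "DELL": "M2", "INTL": "M2"}
learned, scores = greedy_structure_search(data, assignment, ["MSFT"], ["M1", "M2"])
```

In this toy run MSFT is added as a parent of M2, while adding it to its own module M1 is rejected because it would create a self-loop in the module graph.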

Learning Assignment Function

Candidate assignments for DELL:

A(DELL) = M_I, score: 0.7

A(DELL) = M_II, score: 0.9

A(DELL) = M_III, score: cyclic!

[Figure: module network (Modules I, II, III) showing DELL moved to each candidate module]

Ideal Algorithm

Learn the module assignment of all variables simultaneously

$$A = \arg\max_{A'} \mathrm{score}(S, A' : D)$$

Problem

Due to the acyclicity constraint, the assignment cannot be optimized for each variable separately

[Figure: module network (Modules I to IV) and its module graph, with candidate reassignments A(DELL) = Module I and A(MSFT) = Module I]

Sequential update algorithm: iterate over all variables; for each variable, find its optimal assignment given the current assignment to all other variables

Efficient computation: when changing an assignment from M_i to M_j, only the scores for modules i and j need to be recomputed
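The sequential update can be sketched as follows. The local score is again a stand-in (maximized log-likelihood of a shared CPD table), the data and names are hypothetical, and the outcome depends on the order in which variables are visited; here DELL is deliberately listed first so that the run mirrors the DELL example.

```python
import math
from collections import Counter


def module_score(data, members, parents):
    """Stand-in local score: maximized log-likelihood of one shared CPD table."""
    joint, context_total = Counter(), Counter()
    for row in data:
        context = tuple(row[p] for p in parents)
        for var in members:
            joint[(context, row[var])] += 1
            context_total[context] += 1
    return sum(n * math.log(n / context_total[c]) for (c, _), n in joint.items())


def has_cycle(assignment, parents):
    """Depth-first search for a cycle in the module graph."""
    graph = {m: {assignment[p] for p in ps} for m, ps in parents.items()}
    state = {}

    def visit(m):
        if state.get(m) == "open":
            return True
        if state.get(m) == "done":
            return False
        state[m] = "open"
        if any(visit(q) for q in graph[m]):
            return True
        state[m] = "done"
        return False

    return any(visit(m) for m in graph)


def sequential_update(data, assignment, parents):
    assignment = dict(assignment)

    def pair_score(mi, mj):
        # Only the source and target modules are rescored.
        return sum(
            module_score(data, [v for v, a in assignment.items() if a == m], parents[m])
            for m in (mi, mj)
        )

    changed = True
    while changed:
        changed = False
        for var in list(assignment):
            current = assignment[var]
            best_module, best_gain = current, 1e-9
            for target in parents:
                if target == current:
                    continue
                before = pair_score(current, target)
                assignment[var] = target  # tentative move
                legal = not has_cycle(assignment, parents)
                gain = pair_score(current, target) - before
                assignment[var] = current  # undo
                if legal and gain > best_gain:
                    best_module, best_gain = target, gain
            if best_module != current:
                assignment[var] = best_module
                changed = True
    return assignment


names = ("DELL", "MSFT", "MOT", "INTL", "AMAT", "HPQ")
data = [{name: v for name in names} for v in ["up", "down"] * 4]
start = {"DELL": "M1", "MSFT": "M1", "MOT": "M2", "INTL": "M2", "AMAT": "M3", "HPQ": "M3"}
parents = {"M1": [], "M2": ["MSFT"], "M3": ["DELL"]}
result = sequential_update(data, start, parents)
```

DELL moves from M1 to M2; the move to M3 is rejected because DELL is a parent of M3, and once DELL sits in M2 any move of MSFT out of M1 would also close a cycle, so MSFT stays.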

Learning the Model

Initialize module assignment A

Optimize structure S

Optimize module assignment A: for each variable, find its optimal assignment given the current assignment to all other variables

Related Work

[Figure: related-work diagram; recoverable labels: Bayesian networks, parameter sharing, module networks, Langseth+]

Outline

Experimental results: statistical validation, case study on gene regulation

Learning Algorithm Performance

[Figure: two panels plotted against algorithm iterations (0 to 20), one showing the Bayesian score; structure change iterations are marked]

Generalization to Test Data

Synthetic data: 10 modules, 500 variables

[Figure: test-data performance vs. number of modules (0 to 200), with curves for 25, 50, 100, 200, and 500 instances]

Best performance achieved for models with 10 modules

Gain beyond 100 instances is small

Structure Recovery

Synthetic data: 10 modules, 500 variables

[Figure: structure recovery vs. number of modules (0 to 200), with curves for 100, 200, and 500 instances]

74% of 2250 parent-child relationships recovered

Stock Market

4411 variables (stocks), 273 instances (trading days)

Comparison to Bayesian networks (cross validation)

[Figure: performance vs. number of modules (0 to 300), with Bayesian network performance shown for reference]

Regulatory Networks

Learn structure of regulatory networks: which genes are regulated by each regulator

Gene Expression Data

Measures mRNA level for all genes in one condition

Learn dependency of the expression of genes as a function of the expression of regulators

[Figure: gene expression matrix across experiments, colored by induced vs. repressed]

Gene Expression

2355 variables (genes), 173 instances (arrays)

Comparison to Bayesian networks

[Figure: performance vs. number of modules (0 to 500), with Bayesian network performance shown for reference]

Biological Evaluation

Find sets of co-regulated genes (regulatory module)

Find the regulators of each module

Segal et al., Nature Genetics, 2003

Experimental Design

Hypothesis: Regulator ‘X’ activates process ‘Y’

Experiment: Knock out ‘X’ and repeat experiment

[Figure: regulation tree with HAP4 and Ypl230w as true/false splits and an unknown regulator X; expression time courses comparing wild type (wt) with knockouts, one panel sampled over 24 hrs. (wt vs. Ypl230w) and further panels sampled over 60 min.]

341 differentially expressed genes

Differentially Expressed Genes

Were the differentially expressed genes predicted as targets?

Rank modules by enrichment for differentially expressed genes

#    Module                                  Significance
14   Ribosomal and phosphate metabolism      8/32, 9e-3
11   Amino acid and purine metabolism        11/53, 1e-2
15   mRNA, rRNA and tRNA processing          9/43, 2e-2
39   Protein folding                         6/23, 2e-2
30   Cell cycle                              7/30, 2e-2

Ypl230w:

#    Module                                  Significance
39   Protein folding                         7/23, 1e-4
29   Cell differentiation                    6/41, 2e-2
5    Glycolysis and folding                  5/37, 4e-2
34   Mitochondrial and protein fate          5/37, 4e-2

#    Module                                  Significance
3    Energy and osmotic stress I             8/31, 1e-4
2    Energy, osmolarity & cAMP signaling     9/64, 6e-3
15   mRNA, rRNA and tRNA processing          6/43, 2e-2

Biological Experiments Validation

All regulators regulate predicted modules

Summary

Probabilistic model for learning modules of variables and their structural dependencies

Improved performance over Bayesian networks: statistical robustness, interpretability

Application to gene regulation: reconstruction of many known regulatory modules, prediction of targets for unknown regulators
