dependency networks sushmita roy bmi/cs 576 sroy@biostat.wisc.edu nov 26 th, 2013

Dependency networks

Sushmita Roy
BMI/CS 576

www.biostat.wisc.edu/bmi576
sroy@biostat.wisc.edu

Nov 26th, 2013

Goals for today

• Introduction to dependency networks
• GENIE3: A network inference algorithm for learning a dependency network from gene expression data
• Comparison of various network inference algorithms

What you should know

• What are dependency networks?
• How do they differ from Bayesian networks?
• Learning a dependency network from expression data
• Evaluation of various network inference methods

Graphical models for representing regulatory networks

• Bayesian networks
• Dependency networks

Structure: random variables encode expression levels; edges from REGULATORS to a TARGET correspond to some form of statistical dependency.

Function: Y3 = f(X1, X2)

Dependency network

• A type of probabilistic graphical model
• As in Bayesian networks, has
  – A graph component
  – A probability component
• Unlike Bayesian networks
  – Can have cyclic dependencies

Heckerman, Chickering, Meek, Rounthwaite, Kadie. Dependency Networks for Inference, Collaborative Filtering, and Data Visualization. 2000.

Notation

• $X_i$: the $i$th random variable
• $X = \{X_1, \ldots, X_p\}$: set of $p$ random variables
• $x_i^k$: an assignment of $X_i$ in the $k$th sample
• $x_{-i}^k$: set of assignments to all variables other than $X_i$ in the $k$th sample

Dependency networks

[Figure: a target variable whose regulators are unknown ("?")]

• Function: $f_j$ can be of different types.
• Learning requires estimation of each of the $f_j$ functions.
• In all cases, learning tries to minimize an error of predicting $X_j$ from its neighborhood.
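As an illustration, assuming a squared-error loss (one common choice, not necessarily the only one used in practice), the per-variable objective could be written with the notation above as follows, where $N$ is the number of samples:

```latex
\hat{f}_j = \arg\min_{f_j} \sum_{k=1}^{N} \left( x_j^k - f_j\!\left(x_{-j}^k\right) \right)^2
```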

Different representations of the $f_j$ function

• If X is continuous
  – $f_j$ can be a linear function
  – $f_j$ can be a regression tree
  – $f_j$ can be a random forest (an ensemble of trees)
• If X is discrete
  – $f_j$ can be a conditional probability table
  – $f_j$ can be a conditional probability tree

Linear regression

Linear regression assumes that the output (Y) is a linear function of the input (X):

$$Y = \beta_1 X + \beta_0$$

where $\beta_1$ is the slope and $\beta_0$ is the intercept.

Estimating the regression coefficient

• Assume we have N training samples
• We want to minimize the sum of squared errors between the true and predicted values of the output Y.
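A minimal sketch of this estimation for a single input, assuming the standard closed-form least-squares solution. The function names and the toy data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_error(xs, ys, slope, intercept):
    """Sum of squared differences between true and predicted outputs."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Toy data lying exactly on y = 2x + 1 (hypothetical values)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
error = sum_squared_error(xs, ys, slope, intercept)
```

On noisy data the error would not reach zero; the fitted line is the one with the smallest sum of squared errors among all lines.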

An example random forest for predicting gene expression

[Figure: an ensemble of regression trees mapping inputs to an output, with a selected path for a set of genes highlighted; one test node on the path is Sox6 > 0.5]

Considerations for learning regression trees

• Assessing the purity of samples under a leaf node
  – Minimize prediction error
  – Minimize entropy
• How to determine when to stop building a tree?
  – Minimum number of data points at each leaf node
  – Depth of the tree
  – Purity of the data points under any leaf node

Algorithm for learning a regression tree

• Input: output variable $X_j$, input variables $X_{-j}$
• Initialize the tree to a single node with all samples under that node
  – Estimate
    • $m_c$: the mean of all samples under the node
    • $S$: the sum of squared errors
• Repeat until there are no more nodes to split
  – Search over all input variables and split values, and compute $S$ for the possible splits
  – Pick the variable and split value that give the largest improvement in error
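The split search in the loop above can be sketched as follows. This is a minimal illustration that assumes candidate thresholds are midpoints between consecutive observed values; the function names and toy data are hypothetical.

```python
def sse(values):
    """Sum of squared errors of values around their mean (S on the slide)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets):
    """Search all input variables and thresholds for the largest drop in S.

    rows: list of input vectors; targets: list of output values.
    Returns (variable_index, threshold, improvement), or None if no
    variable has two distinct values.
    """
    parent = sse(targets)
    best = None
    for var in range(len(rows[0])):
        values = sorted(set(r[var] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [t for r, t in zip(rows, targets) if r[var] <= thr]
            right = [t for r, t in zip(rows, targets) if r[var] > thr]
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[2]:
                best = (var, thr, gain)
    return best


# Hypothetical data: the output tracks the first input, not the second
rows = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
targets = [1.0, 1.2, 3.0, 3.2]
split = best_split(rows, targets)
```

A full tree learner would apply `best_split` recursively to the two resulting subsets until a stopping criterion (minimum leaf size, depth, or purity) is met.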

GENIE3: GEne Network Inference with Ensemble of trees

• Solves a set of regression problems
  – One per random variable
• Models non-linear dependencies
• Outputs a directed, cyclic graph with a confidence for each edge
• Focuses on generating a ranking over edges rather than a graph structure and parameters

Van Anh Huynh-Thu, Alexandre Irrthum, Louis Wehenkel, Pierre Geurts. Inferring Regulatory Networks from Expression Data Using Tree-Based Methods. PLoS ONE, 2010.

GENIE3 algorithm sketch

• For each gene $j$, generate input/output pairs
  – $LS_j = \{(x_{-j}^k, x_j^k),\ k = 1, \ldots, N\}$
  – Use a feature selection technique on $LS_j$, such as tree building, to compute $w_{ij}$ for all genes $i \neq j$
  – $w_{ij}$ quantifies the confidence of the edge between $X_i$ and $X_j$
• Generate a global ranking of regulators based on each $w_{ij}$
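The overall loop can be sketched as below. To stay self-contained, this sketch swaps in a deliberately simple stand-in for the feature selection step: the weight of an edge is the best variance reduction from a single split on the candidate regulator. GENIE3 itself uses tree ensembles for this step; `stump_score`, `rank_edges`, and the toy expression values are hypothetical.

```python
def variance(values):
    """Population variance of a non-empty list."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def stump_score(x, y):
    """Stand-in for w_ij: best variance reduction from one split on x."""
    total = len(y) * variance(y)
    best = 0.0
    cuts = sorted(set(x))
    for lo, hi in zip(cuts, cuts[1:]):
        thr = (lo + hi) / 2.0
        left = [b for a, b in zip(x, y) if a <= thr]
        right = [b for a, b in zip(x, y) if a > thr]
        gain = total - len(left) * variance(left) - len(right) * variance(right)
        best = max(best, gain)
    return best


def rank_edges(expr, score=stump_score):
    """expr: dict gene -> list of N expression values.

    Returns (regulator, target, weight) triples sorted by decreasing weight.
    """
    edges = []
    for target, y in expr.items():          # one regression problem per gene j
        for regulator, x in expr.items():   # candidate regulators i != j
            if regulator != target:
                edges.append((score(x, y), regulator, target))
    edges.sort(key=lambda e: e[0], reverse=True)  # global ranking over edges
    return [(reg, tgt, w) for w, reg, tgt in edges]


# Hypothetical expression data for three genes over four samples
expr = {
    "g1": [0.1, 0.2, 0.8, 0.9],
    "g2": [1.0, 1.2, 3.0, 3.2],
    "g3": [0.5, 0.4, 0.5, 0.4],
}
ranking = rank_edges(expr)
```

Because the weights depend on the scale of each target, comparing edges across targets is only meaningful if the expression values are on comparable scales; the toy data here ignores that concern.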

GENIE3 algorithm sketch

Figure from Huynh-Thu et al.

Feature selection in GENIE3

• Random forest to represent the $f_j$
• Learning the random forest
  – Generate M = 1000 bootstrap samples
  – At each node to be split, search for the best split among K randomly selected variables
  – K was set to $p-1$ or $\sqrt{p-1}$
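The two sources of randomness described above can be sketched as follows: a bootstrap sample of the data for each tree, and a random subset of K candidate variables at each node. The helper names, the seed, and the sizes are hypothetical and only illustrate the idea.

```python
import math
import random


def bootstrap_sample(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def candidate_variables(n_inputs, k, rng):
    """Pick k distinct input variables to consider at one node."""
    return rng.sample(range(n_inputs), k)


rng = random.Random(0)   # fixed seed, only to make the sketch reproducible
p = 10                   # hypothetical number of genes
n_inputs = p - 1         # every gene except the target is a candidate input
k_sqrt = round(math.sqrt(n_inputs))  # K = sqrt(p-1); K = p-1 would use all inputs
rows = bootstrap_sample(20, rng)     # indices of samples used to grow one tree
cands = candidate_variables(n_inputs, k_sqrt, rng)
```

With K = p-1 every input is considered at every node, so the only randomness left is the bootstrap sampling.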

Computing the importance weight of each predictor

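• Feature importance is computed at each test node
• Remember there can be multiple test nodes per regulator
• For a test node, importance is given by the reduction in variance if we make a split on that node:

$$I(\mathcal{N}) = \#S \cdot \mathrm{Var}(S) - \#S_t \cdot \mathrm{Var}(S_t) - \#S_f \cdot \mathrm{Var}(S_f)$$

• $\mathcal{N}$: test node
• $S$: set of data samples that reach the test node
• $S_t$, $S_f$: subsets of $S$ for which the test is true and false, respectively
• $\#S$: size of the set $S$
• $\mathrm{Var}(S)$: variance of the output variable in set $S$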
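A minimal sketch of the importance of one test node, taking it to be #S · Var(S) minus the same size-weighted variance for the two child subsets, following the variance-reduction measure of Huynh-Thu et al. The function names and toy values are hypothetical.

```python
def variance(values):
    """Population variance; 0.0 for an empty set."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def node_importance(s, s_true, s_false):
    """#S * Var(S) minus the same quantity for the two child subsets."""
    return (len(s) * variance(s)
            - len(s_true) * variance(s_true)
            - len(s_false) * variance(s_false))


# Output values of the samples reaching a node, and how a split divides them
s = [1.0, 1.2, 3.0, 3.2]
imp = node_importance(s, [1.0, 1.2], [3.0, 3.2])
```

A split that separates low from high output values, as here, removes most of the variance and therefore receives a high importance.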

Computing the importance of a predictor

• For a single tree, the overall importance of a predictor is the sum over all test nodes in the tree where this predictor is used to split
• For an ensemble, the importance is averaged over all trees
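The two aggregation steps can be sketched as follows, assuming each tree is summarized as a list of (predictor, node importance) pairs. That representation, the function names, and the numbers are hypothetical.

```python
from collections import defaultdict


def tree_importance(test_nodes):
    """Sum node importances per predictor within one tree.

    test_nodes: list of (predictor_name, node_importance) pairs.
    """
    totals = defaultdict(float)
    for predictor, importance in test_nodes:
        totals[predictor] += importance
    return dict(totals)


def forest_importance(trees):
    """Average the per-tree totals over all trees in the ensemble."""
    sums = defaultdict(float)
    for test_nodes in trees:
        for predictor, total in tree_importance(test_nodes).items():
            sums[predictor] += total
    # A predictor absent from a tree contributes 0 for that tree
    return {p: s / len(trees) for p, s in sums.items()}


# Hypothetical test nodes from two trees
trees = [
    [("Sox6", 4.0), ("GeneB", 1.0), ("Sox6", 0.5)],
    [("Sox6", 3.5)],
]
weights = forest_importance(trees)
```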

Computational complexity of GENIE3

• Complexity per variable
  – O(TKN log N)
  – T is the number of trees
  – K is the number of random attributes selected per split
  – N is the learning sample size

Evaluation of network inference methods

• Assume we know what the “right” network is
• One can use Precision-Recall curves to evaluate the predicted network
• Area under the PR curve (AUPR) quantifies performance
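A minimal sketch of this evaluation for a ranked edge list, assuming a step-wise sum for the area (equivalent to average precision). Other interpolation schemes exist and give slightly different values; the function names and the toy network are hypothetical.

```python
def precision_recall_points(ranked_edges, true_edges):
    """(recall, precision) after each prefix of the ranked edge list."""
    points = []
    hits = 0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
        points.append((hits / len(true_edges), hits / rank))
    return points


def aupr(points):
    """Area under the PR curve using a step-wise sum over recall increments."""
    area = 0.0
    prev_recall = 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Hypothetical predicted ranking and "right" network
ranked = [("g1", "g2"), ("g3", "g2"), ("g2", "g1"), ("g1", "g3")]
truth = {("g1", "g2"), ("g2", "g1")}
points = precision_recall_points(ranked, truth)
score = aupr(points)
```

A ranking that places all true edges first gives an AUPR of 1; pushing true edges further down the list lowers it.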

AUPR based performance comparison

DREAM: Dialogue for Reverse Engineering Assessments and Methods

Community effort to assess regulatory network inference

DREAM 5 challenge

Previous challenges: 2006, 2007, 2008, 2009, 2010

Marbach et al. 2012, Nature Methods

Where do different methods rank?

Marbach et al., 2010 Com

Comparing module (LeMoNe) and per-gene (CLR) methods

Summary of network inference methods

• Probabilistic graphical models provide a natural representation of networks

• A lot of network inference is done using gene expression data

• Many algorithms exist; we have seen three
  – Bayesian networks
    • Sparse candidates
    • Module networks
  – Dependency networks
    • GENIE3
• Algorithms can be grouped into per-gene and per-module
