Post on 21-Dec-2015
Gene Regulatory Networks - the Boolean Approach

Andrey Zhdanov

Based on the papers by Tatsuya Akutsu et al. and others

Gene Expressions Revisited

One of the major subjects of study in cell biology is the behaviour of proteins, the “workhorses” of a cell.

Myoglobin molecule

We are interested in analysing protein expression levels, that is, the amounts of the different proteins synthesized by the cell.

The “blueprints” for all the proteins that a cell can synthesize (its genes) are stored in the cell's nucleus.

Only a small fraction of all possible proteins is synthesized in each cell.

Proteins are synthesized from genes by the processes of transcription and translation.

We estimate protein expression levels indirectly, by measuring gene expression levels (the amount of mRNA produced for a given gene) with DNA chips.

This approach makes a number of assumptions:

• Genes exist and are easily identifiable
• Each protein is encoded by a single gene
• Protein expression (amount of protein produced) is determined by the corresponding gene expression (amount of mRNA produced)

These assumptions do not always hold (but we use them anyway :-)

Gene Regulatory Networks

We want to use protein (or gene) expression measurements to understand the mechanisms regulating protein production.

Note that there is a certain circularity to our logic, since we made assumptions about these very same mechanisms in order to measure protein expression.

In the talks by Shahar and Leon we have seen the “regulatory network” approach to modelling the protein expression mechanisms.

In his talk, Oded introduced tools for time series analysis that can be applied to our problem.

We are looking for a formal model of the protein expression control mechanism that can serve as a framework for a rigorous treatment of the problem.

To that end we assume that the production rate of a certain protein at any given time is regulated only by the amounts of the other proteins within the cell at that time.

Example:

[Diagram: Proteins A, B, C and D connected by “excites” and “inhibits” links]

[Plot: expression levels of Proteins A, B, C and D against time]

Treating the gene expressions as real-valued functions of a continuous time variable leads to a system of differential equations as the model for the gene regulatory network:

$$\frac{dX_i}{dt} = f_i(X_1, \ldots, X_n)$$
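As a rough illustration of the continuous model, the system can be integrated numerically once the rate functions are chosen. The sketch below uses the forward Euler method; the three rate functions, the step size and the function name `simulate` are assumptions made up for this example and do not come from the slides.

```python
def simulate(rate_functions, x0, dt=0.01, steps=1000):
    """Integrate dX_i/dt = f_i(X_1, ..., X_n) with the forward Euler method."""
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        # Every rate is evaluated on the same current state before any update.
        rates = [f(x) for f in rate_functions]
        # Expression levels are amounts, so they are clamped at zero.
        x = [max(0.0, xi + dt * ri) for xi, ri in zip(x, rates)]
        trajectory.append(x)
    return trajectory


# Hypothetical three-protein system (indices 0, 1, 2 stand for A, B, C):
# A is produced at a constant rate, A excites B, and B inhibits C.
# Each protein also decays in proportion to its own level.
rate_functions = [
    lambda x: 1.0 - x[0],                  # A: constant production minus decay
    lambda x: x[0] - x[1],                 # B: excited by A
    lambda x: 1.0 / (1.0 + x[1]) - x[2],   # C: inhibited by B
]

trajectory = simulate(rate_functions, x0=[0.0, 0.0, 0.0])
final_state = trajectory[-1]
```

With these made-up rate functions the levels settle near A = 1, B = 1 and C = 0.5; other choices of `rate_functions` give different dynamics.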

Boolean Regulatory Networks

To facilitate the treatment of the problem we further simplify our model to the Boolean Regulatory Network. We assume:

1. Discrete time and synchronous update model

2. Genes’ expression level is binary
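The slides do not say how real-valued measurements are turned into binary levels. One possible sketch, assuming each gene is thresholded at the median of its own series (the function name `binarize` and the sample values are hypothetical), is:

```python
from statistics import median


def binarize(expression_series):
    """Map each gene's real-valued series to 0/1, using its own median as threshold."""
    binary = {}
    for gene, values in expression_series.items():
        threshold = median(values)
        binary[gene] = [1 if v > threshold else 0 for v in values]
    return binary


# Made-up measurements for two genes at four time points.
measurements = {
    "gene_a": [0.1, 0.9, 0.8, 0.2],
    "gene_b": [5.0, 1.0, 1.5, 6.0],
}
binary_levels = binarize(measurements)
```

Other thresholding rules are equally plausible; the choice affects which boolean networks later appear consistent with the data.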

More formally, a boolean network $G(V, F)$ consists of a set of nodes $V = \{v_1, \ldots, v_n\}$ representing genes and a list of boolean functions $F = (f_1, \ldots, f_n)$, where $f_i(v_{i_1}, \ldots, v_{i_k})$ computes a boolean function of the nodes $v_{i_1}, \ldots, v_{i_k}$ and assigns the output to $v_i$.

The state of the network at time $t$ is defined by an assignment of 0s and 1s to the node variables.

The state of each node $v_i$ at time $t+1$ is calculated from the states of the nodes $v_{i_1}, \ldots, v_{i_k}$ at time $t$ according to $f_i(v_{i_1}, \ldots, v_{i_k})$.
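The synchronous update rule can be sketched in a few lines. The representation below (a list of input index tuples plus a list of Python functions) and the three-node network are assumptions chosen for illustration, not something given in the slides.

```python
def step(state, inputs, functions):
    """Compute the network state at time t+1 from the state at time t.

    state:     tuple of 0/1 values, one per node
    inputs:    inputs[i] is the tuple of node indices feeding node i
    functions: functions[i] maps the input values of node i to 0 or 1
    """
    # Every node reads the old state, so all nodes update synchronously.
    return tuple(
        functions[i](*(state[j] for j in inputs[i]))
        for i in range(len(state))
    )


# Hypothetical three-node network:
#   v0(t+1) = v1(t) AND v2(t)
#   v1(t+1) = NOT v0(t)
#   v2(t+1) = v0(t) OR v1(t)
inputs = [(1, 2), (0,), (0, 1)]
functions = [
    lambda a, b: a & b,
    lambda a: 1 - a,
    lambda a, b: a | b,
]

state = (1, 0, 1)
trajectory = [state]
for _ in range(4):
    state = step(state, inputs, functions)
    trajectory.append(state)
```

Because `step` builds the whole new tuple from the old one, no node ever sees a partially updated state, which is what the synchronous assumption requires.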

A boolean regulatory network can be visualized by means of a wiring diagram.

Since the network’s state at $t+1$ is completely determined by its state at $t$, we can treat the gene expression time series as an unordered set of input/output pairs.

We say that the network is consistent with a set of input/output pairs if, for each pair, setting the network to the input state at time $t$ causes it to reach the output state at $t+1$.
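Both ideas, splitting a time series into input/output pairs and testing a network against them, can be sketched as follows. The helper names and the two-node network are hypothetical and serve only as an example.

```python
def step(state, inputs, functions):
    """Synchronous update: the state at t+1 computed from the state at t."""
    return tuple(
        functions[i](*(state[j] for j in inputs[i]))
        for i in range(len(state))
    )


def pairs_from_series(series):
    """Turn consecutive states of a time series into an unordered set of pairs."""
    return set(zip(series, series[1:]))


def is_consistent(inputs, functions, pairs):
    """True if the network maps every input state to its paired output state."""
    return all(step(inp, inputs, functions) == out for inp, out in pairs)


# Hypothetical two-node network: v0(t+1) = v1(t), v1(t+1) = NOT v0(t).
inputs = [(1,), (0,)]
functions = [lambda a: a, lambda a: 1 - a]

series = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
pairs = pairs_from_series(series)

matches_series = is_consistent(inputs, functions, pairs)
matches_bad_pair = is_consistent(inputs, functions, {((0, 0), (1, 1))})
```

Here `matches_series` is true because the series was generated by this very network, while the single made-up pair `((0, 0), (1, 1))` contradicts it.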

We can now start formulating some of the fundamental problems for our model.

CONSISTENCY: Given the number of nodes and a set of input/output pairs, decide whether there is a boolean network consistent with the pairs.

COUNTING: Given the number of nodes and a set of input/output pairs, count the number of boolean networks consistent with the pairs.

ENUMERATION: Given the number of nodes and a set of input/output pairs, output all the boolean networks consistent with the pairs.

IDENTIFICATION: Given the number of nodes and a set of input/output pairs, decide whether there is a unique boolean network consistent with the pairs, and output it if it exists.

The four problems presented above are closely related. We address them in a straightforward manner by constructing all possible boolean networks and checking them on all the input/output pairs.

To make this task computationally feasible we need yet another assumption: we assume that the network’s indegree is bounded by some constant K.
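A minimal sketch of such a brute-force search is given below. It is not the authors' implementation: it assumes every node has exactly K inputs, represents a boolean function as a truth table, and counts (input set, truth table) combinations, so a function that ignores one of its inputs is counted once per input set under which it can be written. The function names are hypothetical.

```python
from itertools import combinations, product


def consistent_choices(node, n, K, pairs):
    """List every (input set, truth table) for `node` that agrees with all pairs."""
    patterns = list(product((0, 1), repeat=K))  # all 2^K input patterns
    choices = []
    for input_set in combinations(range(n), K):
        # Each assignment of outputs to patterns is one of 2^(2^K) functions.
        for outputs in product((0, 1), repeat=len(patterns)):
            table = dict(zip(patterns, outputs))
            if all(
                table[tuple(inp[j] for j in input_set)] == out[node]
                for inp, out in pairs
            ):
                choices.append((input_set, table))
    return choices


def count_consistent_networks(n, K, pairs):
    """Nodes can be chosen independently, so the count is a product over nodes."""
    total = 1
    for node in range(n):
        total *= len(consistent_choices(node, n, K, pairs))
    return total


# Pairs produced by a made-up two-node network: v0(t+1) = v1(t), v1(t+1) = NOT v0(t).
all_pairs = {
    ((0, 0), (0, 1)),
    ((0, 1), (1, 1)),
    ((1, 1), (1, 0)),
    ((1, 0), (0, 0)),
}
count_all = count_consistent_networks(2, 1, all_pairs)
count_one = count_consistent_networks(2, 1, {((0, 0), (0, 1))})
```

In this sketch CONSISTENCY corresponds to a non-zero count, ENUMERATION to the product of the per-node choice lists, and IDENTIFICATION to a count of exactly one. With all four pairs a single network remains, whereas a single pair leaves many candidates.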

Some of the results:

The complexity of the brute-force algorithm for the CONSISTENCY problem is

$$O\left(2^{2^K} \cdot n^{K+1} \cdot m\right)$$

where $n$ is the number of nodes (genes) and $m$ is the number of input/output pairs. The results for the other problems are similar.
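A rough count suggests where a bound of the form $O(2^{2^K} \cdot n^{K+1} \cdot m)$ comes from. It assumes that each node is handled independently and that evaluating one candidate function on one input/output pair takes constant time for fixed $K$; both are assumptions of this sketch rather than statements from the slides.

```latex
\underbrace{n}_{\text{nodes}}
\times \underbrace{\binom{n}{K}}_{\le\, n^{K}\ \text{input sets}}
\times \underbrace{2^{2^{K}}}_{\text{boolean functions of } K \text{ inputs}}
\times \underbrace{m}_{\text{pairs to check}}
= O\!\left(2^{2^{K}} \cdot n^{K+1} \cdot m\right)
```

The factor $2^{2^K}$ is a constant once $K$ is fixed, which is why bounding the indegree makes the search polynomial in $n$ and $m$.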

Another theoretical result concerns the number of input/output pairs required to uniquely identify a boolean network.

Again, to facilitate calculations, we make an unrealistic assumption: we assume that the input/output pairs are randomly drawn from a uniform distribution.

Theorem: If $O\left(2^{2K} \cdot (2K + \alpha) \cdot \log n\right)$ input/output expressions are drawn from a uniform distribution, the probability that there is more than one boolean network consistent with them is at most $\frac{1}{n^{\alpha}}$, where $\alpha$ is a fixed constant.

Conclusions:

Boolean gene expression networks represent a relatively simple model of the gene expression control mechanisms of the cell. However, despite many (often unrealistic) simplifying assumptions, this model has not yet yielded any interesting theoretical results, which indicates the intrinsic difficulty of modeling gene expression mechanisms.