Gene Network Motifs as Dynamic Systems

Abstract: Gene networks are a particular class of complex networks found in living organisms. These networks contain statistically significant connection patterns called motifs. This paper studies four motifs present in the E. coli gene network (FFL, SIM, MO-FFL, and BIFAN) from the perspective of control theory.

I. INTRODUCTION

Gene networks are chemical networks in which molecules called transcription factors modulate (enable or repress) the expression of genes [1, 2]. In these networks, nodes represent genes and links represent chemical interactions, i.e. the protein from one gene regulates the production of the protein of another gene. E. coli is a well-known organism with a reliable gene network derived from experimental studies [3, 4]. This network contains statistically significant connection patterns, or subnetworks, called network motifs. Here, four network motifs from the E. coli gene network are analyzed using control systems theory. Each motif is translated into a state variable representation, and then both the transfer function and the step response are found. An interesting result is that nature has evolved stable structures, i.e. stable motifs. To illustrate a non-trivial motif variation cited in the cancer literature, we introduce the interaction between genes p53 and mdm2; this motif includes delay, feedback, and a nonlinear gain [5-7]. The level of detail of gene networks can be increased when activation networks between transcription factors and DNA motifs are included. This type of modeling results in a network of networks, a possible integration between control theory and complex networks [8-9].

The paper is organized as follows. Section II presents four E. coli network motifs (FFL, SIM, MO-FFL, BIFAN) and their corresponding state variable representations. The resolvent matrix is also calculated for each motif; in every case the state matrix is lower triangular, so the poles are the diagonal feedback coefficients and the motif is stable as long as each gene has negative feedback (with zero feedback the poles lie at the origin and the step responses grow without bound, as Section III shows). Section III illustrates the step responses for each motif in the single-input single-output case. In Section IV, a different gene network motif with feedback and delay is introduced. Section V increases the level of detail of gene networks by considering activation networks between transcription factors and gene motifs. Finally, conclusions and future work are mentioned in Section VI.

1 The author is with the Electrical and Electronics Engineering Department of the National University of Colombia at Bogota. Email: [email protected].

II. NETWORK MOTIFS

There are mainly four types of motifs that are statistically significant in the gene network of E. coli. All motifs will be formulated as a state variable representation (1),

$$\frac{dX}{dt} = A\,X + b\,u, \qquad y = c\,X \tag{1}$$

Each motif defines its own state matrix A and hence the resolvent matrix Φ(s),

$$\Phi(s) = [sI - A]^{-1} \tag{2}$$

If the input and output vectors, b and c, are given, then a transfer function T(s) can be derived from the state space model (1),

$$T(s) = \frac{y(s)}{u(s)} = c\,[sI - A]^{-1}\,b \tag{3}$$

Notice that T(s) is calculated with the resolvent matrix. Finally, the unit step response (4) is found with the inverse Laplace transform,

$$y(t) = \mathcal{L}^{-1}\left\{ T(s)\,\frac{1}{s} \right\} \tag{4}$$
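Equations (2) and (3) can be checked numerically at any single value of s. The following is a minimal sketch, not the author's implementation: the helper names (`invert`, `resolvent`, `transfer`) and the two-gene example matrix are assumptions chosen for illustration, and exact rational arithmetic from the standard library is used so that the result can be compared with the closed-form expression.

```python
from fractions import Fraction


def identity(n):
    """Return the n x n identity matrix as nested lists of Fractions."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def invert(M):
    """Invert a square matrix by Gauss-Jordan elimination with exact arithmetic."""
    n = len(M)
    # Augment M with the identity: [M | I].
    aug = [list(map(Fraction, row)) + identity(n)[i] for i, row in enumerate(M)]
    for col in range(n):
        # Find a row with a nonzero pivot in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resolvent(A, s):
    """Evaluate Phi(s) = (sI - A)^(-1), equation (2), at one value of s."""
    n = len(A)
    return invert([[(s if i == j else 0) - A[i][j] for j in range(n)]
                   for i in range(n)])


def transfer(A, b, c, s):
    """Evaluate T(s) = c (sI - A)^(-1) b, equation (3), at one value of s."""
    Phi = resolvent(A, s)
    n = len(A)
    return sum(c[i] * Phi[i][j] * b[j] for i in range(n) for j in range(n))


# Hypothetical two-gene example (values are not from the paper):
# gene 1 regulates gene 2, and both genes have negative feedback.
A = [[-1, 0], [2, -3]]
b = [1, 0]   # input applied to gene 1
c = [0, 1]   # output measured at gene 2
print(transfer(A, b, c, Fraction(1)))  # equals a21 / ((s - a11)(s - a22)) at s = 1
```

Evaluating at a point rather than symbolically keeps the sketch dependency-free; a computer algebra system would be the natural choice for obtaining Φ(s) as a function of s.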

A. FFL

The first motif is called the Feed Forward Loop (FFL) and is shown in Fig. 1. Gene 3 is regulated directly by gene 1 and indirectly through gene 2. For generality, each gene has a feedback coefficient $a_{ii}$. The FFL motif matrix A is,

$$A = \begin{bmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

Gene Network Motifs as Dynamic Systems, A. Delgado¹

2009 IEEE International Symposium on Intelligent Control, part of the 2009 IEEE Multi-conference on Systems and Control, Saint Petersburg, Russia, July 8-10, 2009

978-1-4244-4603-2/09/$25.00 ©2009 IEEE

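Fig. 1. Feed Forward Loop (FFL).

The resolvent matrix for the FFL motif with feedback is,

$$\Phi(s) = \begin{bmatrix}
\dfrac{1}{s-a_{11}} & 0 & 0 \\[6pt]
\dfrac{a_{21}}{(s-a_{11})(s-a_{22})} & \dfrac{1}{s-a_{22}} & 0 \\[6pt]
\dfrac{a_{21}a_{32} - a_{31}a_{22} + a_{31}s}{(s-a_{11})(s-a_{22})(s-a_{33})} & \dfrac{a_{32}}{(s-a_{22})(s-a_{33})} & \dfrac{1}{s-a_{33}}
\end{bmatrix}$$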
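The entry in row 3, column 1 of the FFL resolvent combines the direct path from gene 1 to gene 3 with the indirect path through gene 2. One way to gain confidence in the closed form is to multiply it by (sI - A) and confirm that the identity matrix results. The sketch below does this with exact fractions; the coefficient values are hypothetical and the function names are illustrative only.

```python
from fractions import Fraction


def ffl_resolvent(a, s):
    """Closed-form FFL resolvent; `a` maps (row, col) to the coefficient a_rc."""
    d1, d2, d3 = s - a[1, 1], s - a[2, 2], s - a[3, 3]
    phi31 = (a[2, 1] * a[3, 2] - a[3, 1] * a[2, 2] + a[3, 1] * s) / (d1 * d2 * d3)
    return [
        [1 / d1, 0, 0],
        [a[2, 1] / (d1 * d2), 1 / d2, 0],
        [phi31, a[3, 2] / (d2 * d3), 1 / d3],
    ]


def s_minus_a(a, s):
    """Build sI - A for the lower-triangular FFL state matrix."""
    return [
        [s - a[1, 1], 0, 0],
        [-a[2, 1], s - a[2, 2], 0],
        [-a[3, 1], -a[3, 2], s - a[3, 3]],
    ]


def matmul(X, Y):
    """Multiply two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical coefficients (not from the paper); the diagonal entries
# a11, a22, a33 are negative, i.e. every gene has negative feedback.
a = {(1, 1): Fraction(-1), (2, 1): Fraction(2), (2, 2): Fraction(-3),
     (3, 1): Fraction(1), (3, 2): Fraction(4), (3, 3): Fraction(-2)}
s = Fraction(1)
product = matmul(s_minus_a(a, s), ffl_resolvent(a, s))
print(product == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```

The same check applies to the other motifs once their closed-form entries are substituted.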

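When genes have no feedback ($a_{ii} = 0$), the resolvent matrix reduces to,

$$\Phi(s) = \begin{bmatrix}
\dfrac{1}{s} & 0 & 0 \\[6pt]
\dfrac{a_{21}}{s^{2}} & \dfrac{1}{s} & 0 \\[6pt]
\dfrac{a_{21}a_{32} + a_{31}s}{s^{3}} & \dfrac{a_{32}}{s^{2}} & \dfrac{1}{s}
\end{bmatrix}$$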

B. SIM

The second motif is the Single Input Module (SIM), illustrated in Fig. 2. Gene 1 regulates several genes, in this case genes 2 to 4.

Fig. 2. Single Input Module (SIM).

The SIM matrix A is,

$$A = \begin{bmatrix} a_{11} & 0 & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 \\ a_{31} & 0 & a_{33} & 0 \\ a_{41} & 0 & 0 & a_{44} \end{bmatrix}$$

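The resolvent matrix with feedback is given by,

$$\Phi(s) = \begin{bmatrix}
\dfrac{1}{s-a_{11}} & 0 & 0 & 0 \\[6pt]
\dfrac{a_{21}}{(s-a_{11})(s-a_{22})} & \dfrac{1}{s-a_{22}} & 0 & 0 \\[6pt]
\dfrac{a_{31}}{(s-a_{11})(s-a_{33})} & 0 & \dfrac{1}{s-a_{33}} & 0 \\[6pt]
\dfrac{a_{41}}{(s-a_{11})(s-a_{44})} & 0 & 0 & \dfrac{1}{s-a_{44}}
\end{bmatrix}$$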
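As a worked instance of equation (3) for the SIM motif, assume the input enters at gene 1 and the output is read at gene 3. The vectors b and c below are illustrative choices, not values given in the paper; with them the transfer function is simply the entry in row 3, column 1 of the resolvent.

```latex
b = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix}^{T}, \quad
c = \begin{bmatrix} 0 & 0 & 1 & 0 \end{bmatrix}
\;\Longrightarrow\;
T(s) = c\,\Phi(s)\,b = \Phi_{31}(s) = \frac{a_{31}}{(s-a_{11})(s-a_{33})}
```

This is a second-order transfer function whose gain is the link coefficient $a_{31}$ and whose poles are the feedback coefficients of the two genes on the path.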

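The resolvent matrix without feedback is,

$$\Phi(s) = \begin{bmatrix}
\dfrac{1}{s} & 0 & 0 & 0 \\[6pt]
\dfrac{a_{21}}{s^{2}} & \dfrac{1}{s} & 0 & 0 \\[6pt]
\dfrac{a_{31}}{s^{2}} & 0 & \dfrac{1}{s} & 0 \\[6pt]
\dfrac{a_{41}}{s^{2}} & 0 & 0 & \dfrac{1}{s}
\end{bmatrix}$$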

C. MO-FFL

The third motif, the Multi-Output Feed Forward Loop (MO-FFL), is shown in Fig. 3. Gene 1 regulates genes 3 and 4 directly, and also indirectly through gene 2.

Fig. 3. Multi-Output Feed Forward Loop (MO-FFL).


The MO-FFL matrix A is,

$$A = \begin{bmatrix} a_{11} & 0 & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 \\ a_{31} & a_{32} & a_{33} & 0 \\ a_{41} & a_{42} & 0 & a_{44} \end{bmatrix}$$

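The resolvent matrix with feedback is,

$$\Phi(s) = \begin{bmatrix}
\dfrac{1}{s-a_{11}} & 0 & 0 & 0 \\[6pt]
\dfrac{a_{21}}{(s-a_{11})(s-a_{22})} & \dfrac{1}{s-a_{22}} & 0 & 0 \\[6pt]
\dfrac{a_{21}a_{32} - a_{31}a_{22} + a_{31}s}{(s-a_{11})(s-a_{22})(s-a_{33})} & \dfrac{a_{32}}{(s-a_{22})(s-a_{33})} & \dfrac{1}{s-a_{33}} & 0 \\[6pt]
\dfrac{a_{21}a_{42} - a_{41}a_{22} + a_{41}s}{(s-a_{11})(s-a_{22})(s-a_{44})} & \dfrac{a_{42}}{(s-a_{22})(s-a_{44})} & 0 & \dfrac{1}{s-a_{44}}
\end{bmatrix}$$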

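The resolvent matrix without feedback ($a_{ii} = 0$) is,

$$\Phi(s) = \begin{bmatrix}
\dfrac{1}{s} & 0 & 0 & 0 \\[6pt]
\dfrac{a_{21}}{s^{2}} & \dfrac{1}{s} & 0 & 0 \\[6pt]
\dfrac{a_{21}a_{32} + a_{31}s}{s^{3}} & \dfrac{a_{32}}{s^{2}} & \dfrac{1}{s} & 0 \\[6pt]
\dfrac{a_{21}a_{42} + a_{41}s}{s^{3}} & \dfrac{a_{42}}{s^{2}} & 0 & \dfrac{1}{s}
\end{bmatrix}$$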

D. BIFAN

Finally, the fourth motif, called BIFAN, is illustrated in Fig. 4. Genes 3 and 4 are regulated by genes 1 and 2 simultaneously.

Fig. 4. BIFAN.

The BIFAN matrix A is,

$$A = \begin{bmatrix} a_{11} & 0 & 0 & 0 \\ 0 & a_{22} & 0 & 0 \\ a_{31} & a_{32} & a_{33} & 0 \\ a_{41} & a_{42} & 0 & a_{44} \end{bmatrix}$$

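The resolvent matrix with feedback is,

$$\Phi(s) = \begin{bmatrix}
\dfrac{1}{s-a_{11}} & 0 & 0 & 0 \\[6pt]
0 & \dfrac{1}{s-a_{22}} & 0 & 0 \\[6pt]
\dfrac{a_{31}}{(s-a_{11})(s-a_{33})} & \dfrac{a_{32}}{(s-a_{22})(s-a_{33})} & \dfrac{1}{s-a_{33}} & 0 \\[6pt]
\dfrac{a_{41}}{(s-a_{11})(s-a_{44})} & \dfrac{a_{42}}{(s-a_{22})(s-a_{44})} & 0 & \dfrac{1}{s-a_{44}}
\end{bmatrix}$$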

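With no feedback in the motif, the resolvent matrix becomes,

$$\Phi(s) = \begin{bmatrix}
\dfrac{1}{s} & 0 & 0 & 0 \\[6pt]
0 & \dfrac{1}{s} & 0 & 0 \\[6pt]
\dfrac{a_{31}}{s^{2}} & \dfrac{a_{32}}{s^{2}} & \dfrac{1}{s} & 0 \\[6pt]
\dfrac{a_{41}}{s^{2}} & \dfrac{a_{42}}{s^{2}} & 0 & \dfrac{1}{s}
\end{bmatrix}$$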
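The four state matrices of this section share one construction rule: a link from gene j to gene i fills entry a_ij, and the feedback of gene i fills the diagonal entry a_ii. A minimal sketch of that rule follows. The function names, the placeholder link weight of 1.0, and the uniform feedback value are assumptions made for illustration; only the link structure is taken from the motif descriptions above.

```python
def state_matrix(n, edges, feedback=None):
    """Build A for an n-gene motif.

    edges: iterable of (source, target, weight) with 1-based gene numbers;
           a link from gene j to gene i fills entry a_ij.
    feedback: optional list of n diagonal entries a_ii (defaults to zeros).
    """
    A = [[0.0] * n for _ in range(n)]
    for source, target, weight in edges:
        A[target - 1][source - 1] = weight
    for i, a_ii in enumerate(feedback or [0.0] * n):
        A[i][i] = a_ii
    return A


# Link structure of the four motifs: (number of genes, [(source, target), ...]).
MOTIF_EDGES = {
    "FFL": (3, [(1, 2), (1, 3), (2, 3)]),
    "SIM": (4, [(1, 2), (1, 3), (1, 4)]),
    "MO-FFL": (4, [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4)]),
    "BIFAN": (4, [(1, 3), (1, 4), (2, 3), (2, 4)]),
}


def build_motif(name, feedback_value=-1.0):
    """Return A for a named motif, with placeholder weights of 1.0 on every
    link and the same feedback coefficient on every gene."""
    n, links = MOTIF_EDGES[name]
    edges = [(src, dst, 1.0) for src, dst in links]
    return state_matrix(n, edges, [feedback_value] * n)


def poles(A):
    """For a lower-triangular A the poles are the diagonal entries."""
    return [A[i][i] for i in range(len(A))]


for name in MOTIF_EDGES:
    A = build_motif(name)
    verdict = "stable" if all(p < 0 for p in poles(A)) else "not asymptotically stable"
    print(name, A, verdict)
```

Because every motif matrix is lower triangular, reading the poles off the diagonal is enough to decide stability; no eigenvalue solver is needed.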

III. STEP RESPONSES

After finding the resolvent matrices Φ(s) for the motifs FFL, SIM, MO-FFL, and BIFAN, it is possible to define the vectors b and c. Consider the single-input single-output case, in which the input is applied to one node and the output is measured at another node. If there is negative feedback in all nodes, three typical transfer functions arise,

$$T_1(s) = \frac{k}{s - a_{11}}$$

$$T_2(s) = \frac{k}{(s - a_{11})(s - a_{22})}$$

$$T_3(s) = \frac{k\,(s + \beta)}{(s - a_{11})(s - a_{22})(s - a_{33})}$$

Here k is a gain and β sets the zero of $T_3(s)$; both are determined by the link coefficients of the motif.

The unit step responses are,

$$y(t) = A + B\,e^{a_{11}t}, \quad t \ge 0$$

$$y(t) = A + B\,e^{a_{11}t} + C\,e^{a_{22}t}, \quad t \ge 0$$

$$y(t) = A + B\,e^{a_{11}t} + C\,e^{a_{22}t} + D\,e^{a_{33}t}, \quad t \ge 0$$
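The constants in these step responses follow from a partial-fraction expansion of T(s)/s. The sketch below computes them for distinct, nonzero poles using the residue at s = 0 and at each pole. The function names and numeric values are hypothetical, and the optional `beta` argument assumes a numerator of the form k(s + beta).

```python
import math


def step_coefficients(k, poles, beta=None):
    """Coefficients of y(t) = A + sum_i C_i * exp(p_i * t) for a unit step.

    T(s) = k / prod(s - p_i), or k * (s + beta) / prod(s - p_i) when beta
    is given. Poles must be distinct and nonzero.
    Returns (A, [C_1, C_2, ...]).
    """
    def numerator(s):
        return k * (s + beta) if beta is not None else k

    # Residue of T(s)/s at s = 0 gives the constant (steady-state) term.
    const = numerator(0.0) / math.prod(-p for p in poles)
    # Residue of T(s)/s at each pole p_i gives the weight of exp(p_i * t).
    coeffs = []
    for i, p in enumerate(poles):
        others = math.prod(p - q for j, q in enumerate(poles) if j != i)
        coeffs.append(numerator(p) / (p * others))
    return const, coeffs


def step_response(k, poles, t, beta=None):
    """Evaluate the unit step response at time t >= 0."""
    const, coeffs = step_coefficients(k, poles, beta)
    return const + sum(c * math.exp(p * t) for c, p in zip(coeffs, poles))


# Hypothetical values: k = 2, a11 = -1, a22 = -3 (negative feedback on both genes).
A, (B, C) = step_coefficients(2.0, [-1.0, -3.0])
print(A, B, C)
print(step_response(2.0, [-1.0, -3.0], 0.0))
```

With negative poles every exponential decays, so y(t) settles at the constant term A, which is why negative feedback yields bounded step responses.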

Here A, B, C and D are constant coefficients that depend on k, β and $a_{ii}$. If the motifs have negative feedback, all unit step responses y(t) are bounded. When there is no feedback, the transfer functions are,

$$T_1(s) = \frac{k}{s}$$

$$T_2(s) = \frac{k}{s^{2}}$$

$$T_3(s) = \frac{k\,(s + \beta)}{s^{3}}$$

Here the step responses are,

$$y(t) = k\,t, \quad t \ge 0$$

$$y(t) = \frac{k\,t^{2}}{2}, \quad t \ge 0$$

$$y(t) = \frac{k\,t^{2}}{2} + \frac{k\,\beta\,t^{3}}{6}, \quad t \ge 0$$
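The polynomial growth of the no-feedback responses follows directly from equation (4) and the transform pair $\mathcal{L}^{-1}\{1/s^{n+1}\} = t^{n}/n!$, which is where the factors 1/2 and 1/6 come from. Writing the third transfer function with an explicit gain k is an assumption made here so that all three cases share the same constant.

```latex
\begin{aligned}
T_1(s) = \frac{k}{s}
  &\;\Rightarrow\; y(t) = \mathcal{L}^{-1}\!\left\{\frac{k}{s^{2}}\right\} = k\,t \\
T_2(s) = \frac{k}{s^{2}}
  &\;\Rightarrow\; y(t) = \mathcal{L}^{-1}\!\left\{\frac{k}{s^{3}}\right\} = \frac{k\,t^{2}}{2} \\
T_3(s) = \frac{k\,(s+\beta)}{s^{3}}
  &\;\Rightarrow\; y(t) = \mathcal{L}^{-1}\!\left\{\frac{k}{s^{3}} + \frac{k\,\beta}{s^{4}}\right\}
   = \frac{k\,t^{2}}{2} + \frac{k\,\beta\,t^{3}}{6}
\end{aligned}
```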

When there is no feedback, all step responses are unbounded. In real gene networks, however, expression saturates, so the steady-state value of y(t) is finite.

IV. FEEDBACK AND DELAY

Another interesting motif that has been studied in detail is the relationship between genes p53 and mdm2 [5]. Gene p53 is a tumor suppressor that is over-expressed when the cell is under stress conditions such as DNA damage, oncogene activation, metabolic changes, hypoxia, and changes in pH or temperature [6]. The protein P53 is short-lived and is maintained at low levels in normal cells [7]. When p53 is over-expressed, cells undergo apoptosis (programmed cell death). Gene mdm2 is activated by protein P53, and the oncoprotein Mdm2 binds to P53 with two effects: (i) blocking its ability to function as a transcription factor, and (ii) targeting it, directly or indirectly, for degradation. In other words, p53 limits its own activity through the production of Mdm2.

In normal cells, mdm2 and p53 form a negative feedback loop that limits the tumor-suppressing activity of p53, see Fig. 5. Here X1 is p53, X2 is mdm2, a21 is a time delay, and a12 is a nonlinear gain.

Fig. 5. Feedback and delay motif from the p53-mdm2 network.

The p53-mdm2 motif introduces elements not present in the E. coli motifs, namely feedback between genes, delay, and a nonlinear gain. Delays and nonlinearities in control systems are associated with oscillations, so the analysis of the p53-mdm2 motif is not as direct as before and requires computer simulations such as the one in Fig. 6.

Fig. 6. P53 and Mdm2 (dashed) responses when the stress signal increases.
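The paper does not list the equations behind Fig. 6, so the following is only a sketch of how a loop with delay and a nonlinear gain can be simulated. Every equation, parameter name and numeric value in it is an assumption introduced for illustration: p53 (x1) is produced by the stress signal and removed at a rate that saturates with Mdm2 (x2), while Mdm2 is produced from the delayed p53 level.

```python
from collections import deque


def simulate_p53_mdm2(stress=1.0, delay=1.0, t_end=30.0, dt=0.01,
                      d1=0.2, d2=0.5, a21=1.0, k12=2.0, half_sat=0.5):
    """Forward-Euler simulation of a two-gene loop with delay and nonlinear gain.

    Illustrative model only (every equation and parameter is an assumption):
        dx1/dt = stress - d1*x1 - k12 * x2/(half_sat + x2) * x1   # p53
        dx2/dt = a21 * x1(t - delay) - d2*x2                      # Mdm2
    """
    steps = int(round(t_end / dt))
    lag = int(round(delay / dt))
    # History buffer holding x1 over the last `lag` steps (zero before t = 0).
    history = deque([0.0] * (lag + 1), maxlen=lag + 1)
    x1, x2 = 0.0, 0.0
    p53, mdm2 = [x1], [x2]
    for _ in range(steps):
        x1_delayed = history[0]                # x1 at time t - delay
        gain = k12 * x2 / (half_sat + x2)      # saturating (nonlinear) gain
        x1_new = x1 + dt * (stress - d1 * x1 - gain * x1)
        x2_new = x2 + dt * (a21 * x1_delayed - d2 * x2)
        x1, x2 = x1_new, x2_new
        history.append(x1)                     # oldest sample drops out
        p53.append(x1)
        mdm2.append(x2)
    return p53, mdm2


p53, mdm2 = simulate_p53_mdm2(stress=1.0)
print(len(p53), p53[-1], mdm2[-1])
```

Whether the trajectories overshoot or oscillate depends on the assumed delay and gains; a dedicated delay-differential-equation solver would be preferable to forward Euler for quantitative work.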

V. COMPLEX NETWORKS IN GENE EXPRESSION

The previous models only considered interactions at the gene level, i.e., only gene networks. In general, a gene is a segment of DNA that has two parts: a regulatory region with DNA motifs and a coding region that codes for proteins. Gene expression requires a network of interactions between transcription factors, such as proteins, and DNA motifs inside the gene regulatory region; in this sense a gene network is a network of networks. The interaction between transcription factors and motifs in the regulatory region of any gene can be analyzed from the perspective of complex networks [8].


In Fig. 7, gene y has five transcription factors {a, b, c, d, e} and four motifs in the regulatory region {A, B, C, D}. Their interaction can be studied from a complex network perspective. For example, transcription factor {a} interacts with motifs {A, B} and also with transcription factors {b, e}, as shown in Fig. 8.

Fig. 7. Transcription factors {a, b, c, d, e} interacting with motifs {A, B, C, D} in the regulatory region of gene y.

A set of features can be calculated for the interaction network between transcription factors and motifs for gene y.

In this case, k = 2.22, δ = 0.28, d = 2.3, and D = 5, where k is the average number of links per node; δ, or density, is the actual number of links as a fraction of the total possible; d is the average distance between nodes; and D, or diameter, is the largest of the shortest-path distances between any two nodes.
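The four features can be computed for any undirected, connected network with a breadth-first search. The sketch below uses a small hypothetical network, not the network of Fig. 8, whose link list is not given in the text; the function name and node labels are illustrative. As a sanity check on the definitions, the reported k = 2.22 and δ = 0.28 are consistent with N = 9 nodes and E = 10 links, since 2E/N = 20/9 and E/(N(N-1)/2) = 10/36.

```python
from collections import deque


def network_features(nodes, edges):
    """Return (k, delta, d, D) for an undirected, connected graph.

    k: average number of links per node, 2E/N
    delta: density, E / (N(N-1)/2)
    d: average shortest-path distance over all node pairs
    D: diameter, the largest shortest-path distance
    """
    neighbours = {v: set() for v in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(nodes)
    e = sum(len(s) for s in neighbours.values()) // 2

    distances = []
    for source in nodes:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        distances.extend(dist[v] for v in nodes if v != source)

    k = 2 * e / n
    delta = e / (n * (n - 1) / 2)
    d = sum(distances) / len(distances)
    return k, delta, d, max(distances)


# Hypothetical toy network (NOT the network of Fig. 8): two transcription
# factors a, b and two DNA motifs A, B.
nodes = ["a", "b", "A", "B"]
edges = [("a", "A"), ("a", "B"), ("a", "b"), ("b", "B")]
print(network_features(nodes, edges))
```

Applying the same function to the full link list of an activation network would yield its features vector.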

Fig. 8. Activation network between transcription factors {a, b, c, d, e} and motifs {A, B, C, D} for gene y.

The features vector for this complex network, with four parameters, is {2.22, 0.28, 2.3, 5}; more parameters are possible following the literature [8-9]. One question that remains open is whether the features vector uniquely characterizes a particular interaction network. After formulating the features vector for the regulatory region of gene y, it is important to realize that genes interact. In this sense gene networks are networks of activation networks, i.e., networks of networks, as shown in Fig. 9.

Fig. 9. Gene interaction networks can create complex networks.

VI. CONCLUSIONS

After millions of years of trial and error, nature has produced stable chemical networks. In this paper we showed that four motifs present in E. coli (FFL, SIM, MO-FFL, BIFAN) have bounded step responses when genes have negative feedback. If feedback is eliminated, the step responses are unbounded, but gene saturation (Hill function) causes a finite output. The presence or absence of gene feedback affects the rise time of the step response; in all cases there are no oscillations. The p53-mdm2 interaction, proposed in the cancer literature, was also introduced to illustrate the diversity of gene network motifs and to consider elements such as delays, feedback and nonlinearities. In this case numerical solutions are needed to understand gene expression. Finally, gene networks are in fact networks of networks when transcription factors and DNA motifs in the regulatory region of genes are considered.

ACKNOWLEDGMENT

The author thanks the National University of Colombia for providing an environment for free thinking.

REFERENCES

[1] Campbell, A.M., Heyer, L.J.: Discovering Genomics, Proteomics and Bioinformatics. Benjamin Cummings, San Francisco, 2003.

[2] Yuh, C.H., Bolouri, H., Bower, J.M., Davidson, E.H.: A Logical Model of cis-Regulatory Control in a Eukaryotic System. In: Bower, J.M., Bolouri, H. (eds.) Computational Modeling of Genetic and Biochemical Networks, pp. 73-100. The MIT Press, Cambridge, 2001.

[3] Alon, U.: An Introduction to Systems Biology: Design Principles of Biological Circuits. Chapman & Hall, Boca Raton, FL, 2007.

[4] Li, C., Chen, L., Aihara, K.: A Systems Biology Perspective on Signal Processing in Genetic Network Motifs. IEEE Signal Processing Magazine, pp. 136-147, March 2007.

[5] Freedman, D.A., Levine, A.J.: Regulation of the p53 protein by the Mdm2 oncoprotein. Cancer Research, 59, pp. 1-7, 1999.

[6] Ashcroft, M., Vousden, K.H.: Regulation of p53 stability. Oncogene, 18, pp. 7637-7643, 1999.

[7] Kubbutat, M.H.G., Jones, S.N., Vousden, K.H.: Regulation of p53 stability by Mdm2. Nature, 387, pp. 299-303, 1997.

[8] Newman, M., Barabasi, A.L., Watts, D.: The Structure and Dynamics of Networks. Princeton University Press, Princeton, 2006.

[9] Caldarelli, G.: Scale-Free Networks. Oxford University Press, New York, 2007.
