
Epigenetic vs. Genetic, a Story of the Evolution of the Germline

Michael Lachmann

Guy Sella

SFI WORKING PAPER: 2003-02-012

SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant.

©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may be reposted only with the explicit permission of the copyright holder.

www.santafe.edu

SANTA FE INSTITUTE

Epigenetic vs. Genetic, a story of the evolution of the germline

Michael Lachmann

Max Planck Institute for Mathematics in the Sciences

Inselstr 22

D-04103 Leipzig, Germany

e-mail: dirk@santafe.edu

Guy Sella

The Weizmann Institute of Science

POB 26

Rehovot 76100, Israel

e-mail: sella@wisdom.weizmann.ac.il

17th January 2003

Abstract

Differentiation of multicellular organisms is controlled by epigenetic markers transmitted

through cell division. Many of the systems that encode these markers exist also in

unicellulars, but in unicellulars these systems do not control differentiation. Thus during

the evolution of multicellularity, epigenetic inheritance systems were exapted for their

current use in differentiation. During this transition there must have been stages at which

epigenetic information passed between generations to an even larger extent than it does

now. We show that this can lead to the evolution of cells that do not contribute to the

progeny of the organism, and thus to a germline-soma distinction. This hints that an

intrinsic instability during a transition from unicellulars to multicellulars may be the

reason for the widespread evolution of a germline.

1 Introduction

Waddington coined the term epigenetics to refer to “the branch of biology which studies

the causal interactions between genes and their products which bring the phenotype into

being” (Waddington 1942). In the same paper Waddington also coined another term,

“epigenotype”, to describe the sum total of the patterns of development that a particular

genotype manifests during the process that leads from a fertilized egg into an adult

phenotype. It is believed today that part of the information of the epigenotype is carried

within epigenetic inheritance systems, systems that can transmit information stably

across cell-division, but are not encoded as bases on the DNA. Epigenetic inheritance

systems exist also in unicellulars, where they play a different role than the role of

regulating differentiation that they play in multicellulars. In unicellulars these systems

seem to mainly transmit state information, or are simply another inheritance system. In

this paper we study the consequences of taking an inheritance system that transmits state

information across cell-division, and evolving to use it for control of differentiation.

During the transition period, epigenetic information is both modified by gene-regulation,

and selected on at the population level. We will show that this process makes the state in

which two or more cell types produce a new generation of organisms unstable. This

eventually leads to the evolution of a germline-soma separation.

Because the full models that this paper examines are somewhat complex, we will

introduce some of the assumptions gradually. Section 2 gives an overview of epigenetic

systems, as they are used in this paper. In section 3 we introduce the first model, with just

two cell types, and equal reproduction for all cells in the adult. This model is then

analyzed in section 4. Section 5 adds another assumption to the model: cells in the adult

do not necessarily reproduce equally. It will be shown that in the model some cells evolve

to give up their reproductive ability. In section 6 it is shown that these results do not

depend on the fact that we chose a system with two cell types. It is shown that a system

with an arbitrary number of cell types, and an arbitrary differentiation graph will tend to

evolve towards a more star-like differentiation graph, which corresponds to the evolution

of a germline-soma separation. Section 7 then introduces a model, which is fully

described in Appendix A, that allows for evolution of “generalized differentiation”. We

will discuss a few simulation results from this model, and to what extent they agree with

the preceding sections. Finally, the paper ends with a discussion of the results.

2 Epigenetics

There have recently been several papers that commented on the “hijacking” of the term

epigenetics to be used mainly in describing transmission of information across

generations in addition to the information on the DNA, and less to mean the translation of

genotype to phenotype (Wu and R. 2001; Cavalli 2002). Since this paper will deal with

both, we will use the term “epigenetic inheritance systems” (EIS), for the chemical

systems that enable the transmission of information across cell division, in addition to the

information in the DNA. We will not specify whether these transmit information across

generations.

The best understood epigenetic inheritance system is the methylation marking system

where the presence of methyl (CH3) groups on some cytosines (in most vertebrates and

plants these are cytosines that have guanines as neighbors) or other nucleotides, are

transmitted from one cell generation to the next. Inactive genes are often highly

methylated, whereas the same genes may be transcribed if the methylation level is low.

Developmental and environmental cues lead to changes in methylation, so the same gene

may carry distinctly different methylation patterns (’marks’) in different cell types

(Holliday 1990). Other than methylation marks, which are the best understood DNA

associated marks, there exist other types of marks, involving DNA-associated proteins

that affect gene activity and can also be transmitted in cell lineages, and are maintained

and reconstituted following DNA replication (Lyko and Paro 1999). Differences in cell

states can also be transmitted in other ways, such as through positive regulatory

feedback-loops where a gene that was turned on by a transitory external stimulus,

produces a product that then acts as a positive regulator that maintains transcription of the

gene. Another type of epigenetic inheritance is based on the 3D templating of protein

complexes. (A more detailed review of these and other EISs is given in Jablonka, Lachmann,

and Lamb 1992 and in Jablonka and Lamb 1995.)

Epigenetic inheritance systems involving all three types of EISs (based on chromatin

marking, on self sustaining regulatory loops, or on 3D templating) are observed in

unicellular organisms. For example, bacteria and yeast cells have epigenetic inheritance

systems and can therefore transmit induced and accidental functional and structural states

to their progeny (Grandjean, Hauck, Beloin, Le Hegarat, and Hirschbein 1998; Klar

1998). Unicellular organisms do not, however, undergo epigenesis in the classical sense:

this notion usually refers to processes of development and cell differentiation in

multicellular organisms. Since unicellular EISs were the precursors of cellular heredity in

multicellular organisms, the question is how did this change of function happen during

the evolution of multicellularity? Are there interesting phenomena that happen during

this transition?

In order to answer this question, we must clarify the difference between epigenetic and

genetic heritable variations. One could say that genetic information provides a plan or

program for the cell’s action, whereas epigenetic information codes for the current state

the cell is in, when the state is to be transmitted across cell divisions. In many cases, this

distinction is clear, since epigenetic variations are usually erased during the formation of

gametes. The neo-Darwinian view assumed for some time that in multicellular organisms

this distinction is completely clear-cut. It was assumed that epigenetic inheritance only

codes for the current state, and is never transmitted from generation to generation, while

genomic information is never modified during development. It was assumed that during

sexual reproduction epigenetic marks from the previous generation are always erased

before development gets underway. However, there are cases in multicellular organisms

in which epigenetic marks are transmitted through several generations. For example, it is

now recognized that a well-studied heritable variation in mouse coat color is caused by an

epigenetic modification, not a mutation (Morgan, Sutherland, Martin, and Whitelaw

1999). Similarly, the peloric form of toadflax, a “mutant” described by Linnaeus over 250

years ago, has turned out to be an “epimutant” - a difference in methylation pattern, not

DNA sequence (Cubas, Vincent, and Coen 1999).

Another property that may distinguish between genetic and epigenetic variations is the

inducibility of the variation. Genetic variations are usually random in the sense that their

specificity with respect to the inducing stimulus is rather low, and they are usually not

adaptive responses to the stimulus. Epigenetic variations can be random or induced, and

in the latter case both specificity and adaptedness may be quite high. Since there are

examples of genetic variations that are highly specific and developmentally regulated this

distinction is not always valid.

For the purpose of this paper, we will use the following definition: Information that is

transmitted across cell division, and can be induced to change by the organism is part of

the epigenetic inheritance system. Information that can be transmitted across cell

division, but changes only through random mutation will be called part of the genetic

inheritance system.

Will the grey area in which epigenetic inheritance systems play both the role of

genetic-information carriers, and the role of state-information carriers affect evolution?

We will present an example in which it does. In this example novel evolutionary effects

occur as a result, and we will see how they can also facilitate the evolution of a

germline-soma separation.

3 Description of the model

Now let us turn to the main model this paper is based on. The organisms in the model

will be primitive multicellulars, with functional cell differentiation, but no germline-soma

separation. Cell state is determined through an epigenetic inheritance system, and all cell

types can produce spores to make a new adult. The genetic/epigenetic system of the

organism controls differentiation through changes in the epigenetic state of cells. For

clarity we will assume that the epigenetic system is based on DNA methylation, though

the model would apply equally well to other epigenetic inheritance systems. For

simplicity, we assume asexual reproduction. The implications of sexual reproduction, and

in particular of diploid organisms are addressed in the discussion. The epigenetic

inheritance system will exhibit heritable variation arising from both random mutations and induced

changes.

Each organism starts its life as a single cell, a spore. The organism then undergoes

growth through cell division, and differentiation. For simplicity we assume that all adult

organisms express the same organization of cells. The adult contains cells of two types -

A and B. Every adult has $N_A$ cells of type A, and $N_B$ cells of type B. This assumption

corresponds to a fitness function in which any offspring that does not have exactly $N_A$

cells of type A and $N_B$ of type B is not viable. In section 7 we explore such a model.

As stated above, this model describes an organism before the evolution of a germline.

Thus the initial cell of the organism is not necessarily of a certain epigenotype – it may be

of type A or of type B. This first cell will then divide and replicate its epigenetic pattern to

create cells of the same type, and also divide and modify its epigenetic pattern to create

cells of the opposite type. The epigenetic pattern of the cell determines gene activity in

the cell, and thus determines the difference in activity between type A and type B. The

type of the first cell is determined by the type of the spore, which in turn is determined by

the type of the cell that produced it.

The number of spores produced by each cell in the adult is under genetic control, but

initially we just assume that each cell in the adult produces $k$ spores. An adult organism

has a fitness $f$, which determines how many spores it can produce. Following is a

summary of these assumptions:

• Each multicellular organism consists of cells of two distinct epigenetic types, with

distinct functions: cells of type A and cells of type B.

• The adult form always contains exactly $N_A$ cells of type A, and $N_B$ cells of type B.

• A cell of type $i$ in the adult produces $k \cdot f$ spores, where $f$ is the fitness of the

parent. The spores maintain the epigenetic identity of their parent cells.

• During differentiation, a cell of type A can replicate its epigenetic patterns to

produce cells of type A, or change its epigenetic patterns (i.e. differentiate) through

genetically coded mechanisms to produce cells of type B. Similarly, cells of type B

replicate to produce type B and differentiate to produce type A.

Figure 1 shows the life cycle of these organisms, and the differentiation graph.

Now we turn to the main assumptions of the model. We assume that at some point

the epigenetic pattern of a cell of type A undergoes an epimutation, to give a cell of type

A′. The change is in the epigenetic information, and not in the genetic code of the cell -

its genes are still identical to the genes of all other cells in the organism (see figure 2 for

possible methylation patterns of cells of type A, B, and A′).

Figure 1: (a) Original life cycle of the organism, as described above. The adult has a

fixed number of cells of two types, A and B. Cells of both types give rise to spores which

maintain their epigenetic identity, and can differentiate into a new adult.

(b) Differentiation graph of the organism. Nodes represent cells of an epigenetic type,

arrows represent the possibility for a cell type in the spore to produce another cell type in

the adult through replication or differentiation.

Figure 2: Possible methylation pattern for the different cell types. The patterns for types

A and B are different, and transform to one another during epigenesis. The pattern for A’

is different from these two, and the process that transforms A to B will transform A’ to B,

Figure 3: (a) Modified life cycle of the system with cells of type A, B, and A′.

(b) Differentiation graph with the epimutation A′. Notice that cells of type A′ produce cells

of type A′ and B, but that cells of type B cannot produce cells of type A′.

We also assume the following:

• Cells of type A′ will produce spores of type A′. Organisms originating from such

spores will contain $N_A$ cells of type A′, and $N_B$ cells of type B. During

differentiation, cells of type A′ will replicate to produce cells of type A′, and

differentiate to cells of type B. Thus we assume that the differentiation pathway

which causes cells of type A to become type B will cause cells of type A′ to become

cells of type B.

• An organism with cells of type A′ and B has a sufficiently higher fitness than the

original organism with cells of type A and B. Later it will be shown how much

higher the fitness of such an organism has to be. An adult with cells of type A and B

has fitness $f$, and one with cells of type A′ and B has fitness $f'$.

4 Formal description of the model

Epigenetic types of the cells are A, B, and A′. The frequency of spores of type $i$ at the

beginning of a generation is $p_i$. The number of cells of type $j$ in an adult that started from

a spore of type $i$, denoted by $G_{ij}$, is:

$$G = \begin{pmatrix} G_{AA} & G_{BA} & G_{A'A} \\ G_{AB} & G_{BB} & G_{A'B} \\ G_{AA'} & G_{BA'} & G_{A'A'} \end{pmatrix} = \begin{pmatrix} N_A & N_A & 0 \\ N_B & N_B & N_B \\ 0 & 0 & N_A \end{pmatrix}$$

This matrix represents the differentiation graph of the system, in which an entry at

position $i, j$ is non-zero if a spore of type $i$ will produce a cell of type $j$ in the adult.

The fitness of an adult that started as a spore of type $i$ is $f_i$. This fitness can be

represented by the following matrix:

$$F = \begin{pmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & f' \end{pmatrix}$$

Thus the number of cells of the various epigenetic types in the population in the adult

stage is $G F \vec{p}$ (fitness is indexed by spore type, so $F$ acts on $\vec{p}$ first), and the

number of spores of the different types in the next generation is $k \cdot G F \vec{p}$. The total

number of spores produced is $\vec{1} \cdot k \cdot G F \vec{p}$, where we

denote $\vec{1} \equiv (1,1,1)$. Thus the distribution of spores in the next generation is

$$\vec{p}_{t+1} = \frac{k \cdot G F \vec{p}_t}{\vec{1} \cdot k \cdot G F \vec{p}_t}$$

The equilibrium distribution of this system depends on the eigenvalues of the matrix

$G F$, which is

$$G F = \begin{pmatrix} f N_A & f N_A & 0 \\ f N_B & f N_B & f' N_B \\ 0 & 0 & f' N_A \end{pmatrix}$$

When $f' N_A$ is bigger than the largest eigenvalue of the upper left-hand 2×2 matrix

$$\begin{pmatrix} f N_A & f N_A \\ f N_B & f N_B \end{pmatrix}$$

then the eigenvector with largest eigenvalue of the system cannot be of the form $(a,b,0)$,

and thus A′ is present in the equilibrium population. This 2×2 matrix has rank one, so its

largest eigenvalue equals its trace, $f (N_A + N_B)$. Thus if

$$f' N_A > f\,(N_A + N_B)$$

A′ will successfully invade the population. In that case, the largest eigenvalue is $f' N_A$

and thus type A′ acts as a source for the population, and types A and B as sinks.
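The recursion of this section can be explored numerically. The following is a minimal sketch in plain Python, not the authors' code: the function names (`next_generation`, `iterate`, `invades`) are hypothetical, the type order (A, B, A′) is an assumption of the sketch, and the parameter values are taken from the caption of Figure 4. Fitness is applied per spore type, as in the matrix product written out above.

```python
# Sketch of the deterministic spore-frequency recursion (infinite population).
# Type order is (A, B, A'); all names here are illustrative assumptions.

N_A, N_B = 2, 2

# G[j][i]: number of cells of type j in an adult grown from a spore of type i.
G = [
    [N_A, N_A, 0],
    [N_B, N_B, N_B],
    [0,   0,   N_A],
]


def next_generation(p, G, fitness, k=1.0):
    """One step of the recursion: weight each spore type by the fitness of the
    adult it grows into, count the spores its cells produce, then normalise."""
    size = len(p)
    spores = [
        k * sum(G[j][i] * fitness[i] * p[i] for i in range(size))
        for j in range(size)
    ]
    total = sum(spores)
    return [s / total for s in spores]


def iterate(p, G, fitness, generations, k=1.0):
    """Apply the recursion for a fixed number of generations."""
    for _ in range(generations):
        p = next_generation(p, G, fitness, k)
    return p


def invades(f, f_prime, N_A, N_B):
    """Invasion condition for A': f' N_A must exceed f (N_A + N_B)."""
    return f_prime * N_A > f * (N_A + N_B)


if __name__ == "__main__":
    start = [0.5, 0.499, 0.001]          # A' starts rare
    print(invades(1.0, 3.0, N_A, N_B))   # parameters of Figure 4
    print(iterate(start, G, [1.0, 1.0, 3.0], 200))
```

With these parameters the condition holds, and A and B persist at equilibrium only because A′ adults keep producing B cells, which is the source-sink structure described above.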

5 Reproductive ability of cells

Until now we assumed that every cell in the adult produces exactly $k$ spores. We will now

add an additional assumption, which will make it possible for a cell to specialize in

reproduction. Assume that the genome can determine allocation of resources in a certain

cell type, either to spore production, or towards aiding in spore production of other cells.

We will also assume that in transferring resources from one cell to another, potentially

some of the resources are lost.

In general the genome could specify how many resources a certain cell type gives up, and

how much it contributes to each of the other cells. For simplicity, we will assume that the

resources are simply redistributed among the cells in proportion to their reproductive ability,

though our results do not rely on this assumption. We will assume that the genome can

specify a single parameter per cell type, the reproductive ability $r$. Cells of type B with

reproductive ability $r$ then use a proportion $r$ of their resources to produce spores. A

proportion $1 - r$ of B’s resources is contributed to a general pool that is divided with a

loss among all cells, in proportion to their reproductive ability. In transferring resources

between different cells, a proportion $L$ is lost. $L$ is called the loss factor. The number of

spores of type $i$ produced in the next generation in an organism with fitness $f$ is

$$\left( f r_i + f\,\frac{r_i}{\sum_j r_j N_j}\,(1-L) \sum_j (1-r_j) N_j \right) k\, N_i \tag{7}$$

where $r_i$ is the reproductive ability of cells of type $i$. This is a linear model, in which the

reproductive ability given up by some cells is divided among all cells in the organism in

proportion to their reproductive ability. The constant $k$ can be seen as the number of

spores produced by one unit of resources of the cells. Summing over $i$ in equation 7 we

can see that the total number of spores produced by the organism is

$$\left( \frac{\sum_j N_j}{\sum_j r_j N_j}\,(1-L) + L \right) \cdot f k \cdot \sum_i r_i N_i \tag{8}$$

To understand this equation better, we will also write $N$ for $\sum_i N_i$ and $n$ for $\sum_i r_i N_i$. Then

the total number of spores produced by the organism is

$$f \cdot k \cdot \left( \frac{N}{n}(1-L) + L \right) n = f \cdot k \cdot \left( N - L(N-n) \right)$$

When the reproductive ability of all cells is 1, the total number of spores produced is

$f \cdot k \cdot N$, so that when the reproductive ability of some cells is smaller than 1, the organism

gives up the production of $f \cdot k \cdot L \cdot (N-n)$ spores. Notice that in models of

specialization, it is assumed that when a cell type gives up its reproductive ability, the

organism gains more than it loses in the overall production of spores - this would

correspond to $L < 0$ using our formalism.
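As a quick check of the bookkeeping, the per-type spore count and the closed form for the total can be compared numerically. This is a sketch only: the function names are hypothetical, and it assumes at least one cell type has positive reproductive ability so that the pool has somewhere to go.

```python
# Sketch: spores per cell type (equation 7) versus the closed-form total
# f * k * (N - L * (N - n)). Names are illustrative, not from the paper.

def spores_per_type(r, N, f, k, L):
    """Spores of each type produced by one adult.

    r[i]: reproductive ability of cell type i (between 0 and 1)
    N[i]: number of cells of type i in the adult
    Assumes sum(r[i] * N[i]) > 0.
    """
    n = sum(ri * Ni for ri, Ni in zip(r, N))
    # Resources given up by all cells, after the transfer loss L.
    pool = (1 - L) * sum((1 - ri) * Ni for ri, Ni in zip(r, N))
    # Each cell keeps a share r_i and receives pool resources in proportion to r_i.
    return [k * Ni * (f * ri + f * ri / n * pool) for ri, Ni in zip(r, N)]


def total_spores(r, N, f, k, L):
    """Closed form for the total number of spores produced by one adult."""
    n = sum(ri * Ni for ri, Ni in zip(r, N))
    N_total = sum(N)
    return f * k * (N_total - L * (N_total - n))


if __name__ == "__main__":
    r, N = [1.0, 0.5], [2, 2]
    per_type = spores_per_type(r, N, f=1.0, k=1.0, L=0.5)
    print(per_type, sum(per_type), total_spores(r, N, f=1.0, k=1.0, L=0.5))
```

The sum of the per-type counts and the closed form agree, and both fall below $f \cdot k \cdot N$ whenever some $r_i < 1$ and $L > 0$.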

Now we will make the following claim, for $1 > L > 0$: before the invasion of the

epimutation A′, any mutation that increases $r$ will invade, and thus all cell types will

reproduce equally. After the invasion of the epimutation A′, any mutation that decreases

$r_B$ will invade, and thus cells of type B will give up their reproductive ability. These cells

will become soma, and cells of type A′ will become the germline.
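One way to see why the direction of selection on $r_B$ reverses is to compare lineage growth factors in the two regimes. The sketch below rests on assumptions that are ours, not the paper's: before the invasion every spore rebuilds the same adult, so the growth factor is taken to be the total spore output; after the invasion only A′ spores rebuild the high-fitness adult, so the growth factor is taken to be the number of A′ spores per such adult, with A′ cells kept at reproductive ability 1. Function names are hypothetical.

```python
# Sketch: how the lineage growth factor responds to r_B before and after
# the invasion of A'. Illustrative assumptions only; see the lead-in.

def spores_per_type(r, N, f, k, L):
    """Spores of each type from one adult (equation 7); needs sum(r*N) > 0."""
    n = sum(ri * Ni for ri, Ni in zip(r, N))
    pool = (1 - L) * sum((1 - ri) * Ni for ri, Ni in zip(r, N))
    return [k * Ni * (f * ri + f * ri / n * pool) for ri, Ni in zip(r, N)]


def growth_before_invasion(r_A, r_B, N_A, N_B, f, k, L):
    """Adults made of A and B cells: every spore type rebuilds the same adult,
    so all spores count towards lineage growth."""
    return sum(spores_per_type([r_A, r_B], [N_A, N_B], f, k, L))


def growth_after_invasion(r_B, N_A, N_B, f_prime, k, L):
    """Adults made of A' and B cells: only A' spores rebuild this adult,
    so only they count. A' cells are assumed to have r = 1."""
    return spores_per_type([1.0, r_B], [N_A, N_B], f_prime, k, L)[0]


if __name__ == "__main__":
    for r_B in (1.0, 0.5, 0.0):
        before = growth_before_invasion(1.0, r_B, 2, 2, f=1.0, k=1.0, L=0.5)
        after = growth_after_invasion(r_B, 2, 2, f_prime=3.0, k=1.0, L=0.5)
        print(r_B, before, after)
```

Under these assumptions, lowering $r_B$ reduces the growth factor before the invasion (resources are lost in transfer and nothing is gained), but raises it afterwards, because resources moved from B to A′ end up in spores that can rebuild the high-fitness organism.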

The model as described in the previous section was simulated, for a population size of

1000. Figure 4 shows the results from a representative run. We can see that at first the

reproductive ability of B stayed at 1. Once A′ invaded the population, the reproductive

ability of B declined, until it reached almost 0 by generation 1000. The selection pressure

to lower the reproductive ability of B declines as the reproductive ability of B declines,

because the cost imposed by B’s production of offspring declines.

One can ask what happens if through genetic changes cells of type B gain the ability to

produce cells of type A′ instead of A. Such a mutation will invade the population. In that

case, the life cycle of the organism will return to the one described in Figure 1. The

selection pressure for such a mutation to invade declines as the reproductive ability of B

declines, until, when B has lost all reproductive ability and become soma, there is no pressure

for B to gain the ability to produce cells of type A′. Once B has gained the ability to produce A′, it

will regain its reproductive ability. In this case, the bigger the loss factor $L$, the faster B

will regain its reproductive ability. When there is no loss, there is no pressure for B to

regain its reproductive ability. So, with a big loss factor B will lose its

reproductive ability slowly, and regain it quickly; with a small loss, B will lose its

reproductive ability quickly, and regain it slowly.

Figure 4: Results of a typical run. The population size is 1000. $N_A = N_B = 2$, $f = 1$, $f' = 3$,

$k = 1$, $L = 0.5$. Shown are proportions of spores of the different types, p(A), p(B), and p(A′),

over time (generations 0 to 1000), and the average reproductive ability of B over time. At

around generation 130 the epimutation A′ invaded. At that stage A′ is the source for the

population, and A and B are sinks. Afterwards, a series of mutations caused the reproductive

ability of B to decline to almost 0 - A′ could be considered germline at the end of the run,

and B soma.

6 Other differentiation graphs

Till now we discussed a differentiation graph with two nodes, with the following

properties after an epimutation: One of the cell types was mutated into a new cell type - A

into A′. The old differentiation process acts in basically the same way, except that in one

of the differentiation pathways the mutation is lost: when A′ differentiates into B it loses

the epimutation.

Now we will show that the observations we have made, will hold also for bigger

differentiation graphs, in which the epimutation acts in the same way: Some of the cell

types will become new types, the basic differentiation graph will be preserved, except

that in some differentiation pathways the epimutation will be lost. First we will go

through two examples of a differentiation graph of size 3. In Figure 5 part (a) we see a

differentiation graph with three nodes, and in parts (b) and (c) we see two possible

epimutations that have the properties mentioned above. In part (b), A mutated into A′, but

the epimutation is lost when A′ makes B and when A′ makes C. In part (c) A mutates into

A′, and in the differentiation process that made B out of A the mutation is not lost, so that

A′ becomes B′. In the transition from B to C the mutation is lost, so B′ becomes C.

Figure 5: Differentiation graph for 3 cell types. Nodes represent cell type, arrows indicate

that the cell type pointed to will be present in an adult that originated in a spore of the

originating cell type. (a) Original differentiation graph. (b) Differentiation graph with

epimutation in which A mutates to A′, and this epimutation is lost in the differentiation

to B and C. (c) Differentiation graph in which A is mutated to A′. The epimutation is

retained in differentiation to B, which leads to the creation of B′. The epimutation is lost

in differentiation of A′ and B′ into C.

As before, we are still concerned with differentiation graphs in which each node is a cell

type, and an arrow between nodes A and B means that a spore of type A will make a cell

of type B in the adult. This graph can be represented by a matrix, in which each

row/column corresponds to a cell type, and entry $i, j$ indicates how many cells of type $j$ in

the adult are produced by a spore of type $i$. Let us consider an organism with $n$ cell types,

and with a differentiation matrix $M$. An epimutation occurs, and now we have potentially

$n + n$ cell types. At first let us assume that the mutation is preserved through all

differentiation processes. In that case the new differentiation graph can be represented by

the following $2n \times 2n$ matrix:

$$\begin{pmatrix} M & \tilde{0} \\ \tilde{0} & M \end{pmatrix}$$

in which $\tilde{0}$ represents an $n \times n$ matrix with all entries 0. In this case the epimutation is not

different from a genetic mutation, and if an organism with the new cell types has a higher

fitness, then it will invade the population. The eigenvector with the highest eigenvalue of

the differentiation matrix times the fitness matrix will have either 0 in the first $n$ entries,

or 0 in the last $n$ entries.

Now let us assume that in some differentiation process the epimutation is lost. In this

case the new differentiation matrix will take the form:

$$\begin{pmatrix} M & A \\ \tilde{0} & M - A \end{pmatrix}$$

The matrix $A$ represents the transitions in which the epimutation was lost. Since the

epimutation cannot be regained in the differentiation process of the original organism, the

lower left corner of the new differentiation matrix remains 0.

Now the eigenvector with the largest eigenvalue can take one of two forms: either the last

$n$ entries are all 0, in which case the epimutation did not invade, or there are non-zero entries

among the last $n$ entries in the eigenvector, in which case the epimutation manages to

remain stable in the population. When the eigenvector is of the form $(x \; 0)$ then $x$ must

be an eigenvector of the original differentiation matrix $M$. When the eigenvector is of the

form $(x \; y)$ then $y$ must be an eigenvector of the matrix $(M - A)$. For the epimutation

to be stable in the population, it must provide a fitness advantage that is bigger than the

ratio of the largest eigenvalue of $M$ to the largest eigenvalue of $M - A$.
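The threshold in the last sentence can be computed for a concrete graph. The sketch below is illustrative only: it uses simple power iteration in plain Python, the function names are hypothetical, and the example encodes the two-type model with $N_A = 3$, $N_B = 1$ (values chosen for the sketch). Because a type B′ never arises in that model, its column is treated here as losing the epimutation entirely, which is an assumption of the sketch.

```python
# Sketch: fitness advantage needed for an epimutation to persist,
# computed as lambda_max(M) / lambda_max(M - A). Matrices are lists of rows;
# entry [j][i] counts cells of type j in an adult grown from a spore of type i.

def dominant_eigenvalue(M, iterations=1000):
    """Power iteration for a non-negative matrix with a positive dominant
    eigenvalue. The vector is kept normalised to sum 1, so the sum of M v
    converges to the dominant eigenvalue."""
    size = len(M)
    v = [1.0 / size] * size
    lam = 0.0
    for _ in range(iterations):
        w = [sum(M[j][i] * v[i] for i in range(size)) for j in range(size)]
        lam = sum(w)
        v = [x / lam for x in w]
    return lam


def block_matrix(M, A):
    """Build the 2n x 2n matrix [[M, A], [0, M - A]]."""
    size = len(M)
    top = [list(M[j]) + list(A[j]) for j in range(size)]
    bottom = [[0] * size + [M[j][i] - A[j][i] for i in range(size)]
              for j in range(size)]
    return top + bottom


def invasion_threshold(M, A):
    """Minimum ratio f'/f for the epimutation to remain in the population."""
    M_minus_A = [[M[j][i] - A[j][i] for i in range(len(M))]
                 for j in range(len(M))]
    return dominant_eigenvalue(M) / dominant_eigenvalue(M_minus_A)


def with_fitness(full, f, f_prime):
    """Scale columns by the fitness of the adult grown from each spore type."""
    size = len(full) // 2
    fit = [f] * size + [f_prime] * size
    return [[row[i] * fit[i] for i in range(2 * size)] for row in full]


if __name__ == "__main__":
    N_A, N_B = 3, 1
    M = [[N_A, N_A], [N_B, N_B]]
    A = [[0, N_A], [N_B, N_B]]   # A' -> B loses the mark; B' column is hypothetical
    print(invasion_threshold(M, A))
    print(dominant_eigenvalue(with_fitness(block_matrix(M, A), 1.0, 2.0)))
```

For this example the threshold reduces to $(N_A + N_B)/N_A$, the same condition obtained for the two-type model in section 4.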

Since the epimutation can not be regained by the unmutated cells, and yet the epimutation

is stable in the population, this means that as in the simple 2-cell case, the population

with the epimutation acts in this case as a source, and the unmutated cells as a sink for

organisms with the epimutation. Therefore, as before, genetic mutations can invade that

cause those cell types that have lost the epimutation to give up their replication ability if

they can contribute some of their resources to the other cells.

7 General Differentiation

We have shown a scenario in which a germline-soma separation evolves. How likely is

this to happen when differentiation is free to evolve? This question is very hard to answer

solely by theoretical means. On the theoretical side, one can ask whether models in

which differentiation is free to evolve will exhibit the same behavior as the specialized

model described in the paper. To address this question we constructed a model as

described in Appendix A. The model exhibits a phenotype landscape in which

“generalized” differentiation is possible, for organisms living in a fitness landscape that

rewards a certain differentiation pattern. An organism in the model starts as a single cell

which then divides and differentiates repeatedly, until the organism reaches the adult

stage of 16 cells. At this stage the organism is rewarded with fitness benefits for certain

differentiation patterns: a certain proportion of cells need to exhibit a concrete state of

gene activity. Gene activity is regulated by epigenetic markers, and these markers are

transmitted from the cells in the adult to the spores that will produce a new organism.

Under these conditions, germline-soma differentiation does indeed evolve. Figure 6

shows a representative run with such differentiation. Notice that around generation 200 a

mutant seems to invade the population, which then is followed by a series of further

mutations which eventually lead to reduced reproductive ability in some of the cells in the

organism. In Figure 7 we follow this evolution in more detail. At generation 0, the

organism has very low fitness, and a very simple differentiation graph. At generation 68,

after a series of epigenetic mutations, the organism obtained a higher fitness, while still

retaining the very simple differentiation graph. At generation 92 a mutant invades, that

obtains a higher fitness through a more elaborate differentiation graph. In this organism,

spores that are produced from some cells of the adult cannot reproduce all the cells of the

adult, and thus have a much lower fitness. After a series of mutations, these cells give up

their reproductive ability and become soma.

Figure 6: Evolution of fitness and reproductive value in a representative run of the model

described in detail in appendix A. (a) Average fitness of the population vs. time

(generations 0 to 2000). (b) Average reproductive value of the cells in the adult vs. time.

The population size is 1000, epimutation rate is $5 \cdot 10^{-3}$, mutation rate is $3 \cdot 10^{-4}$ per

individual per generation. After 30000 generations, the reproductive value of the soma

still continued to decline. The gradual increase in average fitness results from the

production of fewer and fewer unfit offspring.

8 Discussion

The phenomenon observed in this paper is based on a very simple premise: a

multicellular organism reaches an evolutionary impasse in which certain cell-types can

not recreate the whole organism. When these cell types transmit their ineptness to their

offspring, one of two things can happen: the organism can evolve so that all cell types can

recreate the whole organism again, or it can evolve to transfer resources from the cell

types that can’t recreate the whole organism to the cell types that can. When the second

path is taken, a soma-germline differentiation will have happened.

What is the difference between the model presented here, and a model in which

everything was only under genetic control, with no epimutations that are transmitted

across generations? Figure 8 shows transmission of genetic and epigenetic mutations

across cell division for the two cases. As can be seen, in the epigenetic case, ’conflict’

between mutated and non-mutated cells is transmitted across generations. In the parent

mutated and unmutated cell types exist. In the offspring, again, both the mutated and the

non-mutated types co-exist, because the differentiation process restores the original

epigenetic state in some cells. This enables the selection process that was outlined in the

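[Figure 7 node data, by panel. Columns: epigenetic state, reproductive value of the
cell type, average fitness of an adult produced by a spore of this cell type.]

(a) gen. 0
    00000001    1            0.04001
    00000000    1            0.03965

(b) gen. 61
    01100101    1            11.6822
    01100100    1            5.50342

(c) gen. 96
    01100101    1            525.83
    01100100    1            623.573
    00100111    0.94874      3.93593    (further figure label: 0.617646)
    00100110    0.94874      3.94672    (further figure label: 0.617646)

(d) gen. 749
    01111011    1            1207.65
    01111010    1            1344.34
    00001001    0.0996392    9.03731    (further figure label: 0.258539)
    00001000    0.0996392    0.02284    (further figure label: 0.258539)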

Figure 7: Differentiation graphs at various stages of the evolutionary run. Each node
represents a cell type in the adult organism. Arrows indicate differentiation. The top bits in
each rectangle represent the epigenetic state of gene activity. The bottom left number is the
reproductive value of this cell type in the adult. The bottom right number is the average
fitness of an adult that is produced by a spore of this cell type. (a) Initial differentiation
graph. (b) At generation 61 an individual with a mutated epigenotype invades. The
epigenotype is still not changed by the genome, but this epigenotype achieves a higher
fitness. (c) At generation 96 a mutant invades that has a much higher fitness, and an elaborate
differentiation graph. The two bottom cell types cannot reproduce the top ones, and thus
have a lower fitness as spores. Because of drift in previous generations, these cells already
have a slightly lower reproductive value. (d) At generation 749 the cell types which
cannot reproduce the full organism have a much lower reproductive value, after a series of
invasions of mutants.

Figure 8: Comparison between inheritance of genetic and epigenetic mutations in an
organism without a germline. On the left a genetic mutation occurs, and is transmitted to all
offspring that are produced by the mutated cell. On the right an epimutation occurs.
It is transmitted only to the offspring of the mutated cell, but in the differentiation process
the epimutation is lost, and thus in the offspring, again, both mutated and non-mutated
cells exist.

paper, in which one of the two cell types loses its reproductive ability. In the genetic case,

on the other hand, the offspring have either only non-mutated cells, or only mutated cells.

The mutant and non-mutant cells co-exist in the same organism only for one generation.

Previous work (Michod and Roze 2001; Michod and Roze 1997) pointed out that even

this one-generation coexistence can be an evolutionary force to lessen the chance of

mutations in the organism.

The model presented deals with a scenario for asexual organisms, whereas multicellularity
arose in sexual organisms. In sexual organisms, recombination presents an additional constraint
on the transmission of epigenetic information. If different cell types produce gametes
with different epigenetic states, and the epigenetic information resides on the DNA, then
recombinations of the different epigenetic markings have to be viable. This process
probably caused a strong pressure for uniformity of epigenetic information in the
gametes. When the gametes are absolutely uniform with respect to epigenetic
information, the process outlined in the paper will not occur. If there still is some
residual transfer of epigenetic information through the gametes, which is neutral with
respect to recombination, then the process outlined in the paper can still occur.

One of the previous explanations for the evolution of germline-soma differentiation is

based on models of specialization. In those models, one cell type specializes in

reproduction, and other(s) in other tasks. It is assumed that by specializing in other tasks,

more fitness is gained than is lost by giving up the cell's reproductive ability. In our

model, on the other hand, during the specialization process, a mutation which causes one

cell type to lose its reproductive ability would invade even if it contributes only a very

small fraction of the reproductive ability it gives up to other cells. The difference between

our model and models of specialization stems from the fact that the cell types that give up

their reproductive ability are already an evolutionary dead end, so nothing is lost from

giving up their reproductive ability. Thus, the condition for invasion of specialization is

much weaker. A stronger condition had to happen earlier: in order for the epigenetic

mutation to fix in the population, it had to provide a strong fitness benefit. One can say

that the process of invasion of specialization has been broken into two parts: first invasion

of an epimutation through a strong fitness advantage, and then the invasion of

specialization through a much smaller fitness change.

In some organisms, and in particular in plants, the germline-soma distinction is not as

strong as it is in higher animals. In those organisms it might be possible to observe the

processes described in this paper. A possible empirical question could be how fit are

offspring that are produced by one cell type vs. those that are produced by another. Are

there epigenetic mutations that are transmitted through one type, but erased during

differentiation to another cell type?

We discussed a model for the evolution of differentiation in multicellular organisms, and

showed how in this case reproduction through multiple germ-lines is not stable. The same

process could also describe the evolution of a germline in social insects, such as wasps

and termites. Here the control of the state of an individual in the colony could be carried

out by pheromonal interactions. One would need to show that some of the epigenetic

information of individuals in primitive colonies can be transmitted to the next generation

of the colony.

It is also possible that the model touches on a general phenomenon in nature - a

phenomenon that can occur whenever individuals at a lower level of organization

aggregate to form a higher level of individuality, and in the transition use one of the

low-level modes of information transfer to control organization. The separation of a

germline is a universal phenomenon at many levels of organization in nature. One

important example is the transition to DNA as the information carrier. If pre-cellular life
consisted of a collection of self-replicating molecules, or molecules assisting each other
in self-replication (a hypercycle), then at some point a transition must have occurred that
made DNA the main self-replicating molecule in the organism, with the other molecules
in the cell being produced by the DNA and thus being, in a sense, 'somatic' molecules. An
instability as described in this paper could be one of the causes for this transition.

Acknowledgments

We would like to thank Eva Jablonka for many interesting discussions that led to this
paper, and Lauren Ancel, Susan Ptak and Carl Bergstrom for many comments. The Santa Fe
Institute provided generous support and hospitality.

References

Cavalli, G. (2002). Chromatin as a eukaryotic template of genetic information. Current Opinion in Cell Biology 14, 269–278.

Cubas, P., C. Vincent, and E. Coen (1999). An epigenetic mutation responsible for natural variation in floral symmetry. Nature 401, 157–161.

Grandjean, V., Y. Hauck, C. Beloin, F. Le Hegarat, and L. Hirschbein (1998, Apr-May). Chromosomal inactivation of Bacillus subtilis exfusants: a prokaryotic model of epigenetic regulation. Biol Chem 379(4-5), 553–7.

Holliday, R. (1990). Mechanisms for the control of gene activity during development. Biological Reviews 65, 431–471.

Jablonka, E., M. Lachmann, and M. Lamb (1992). Evidence, mechanisms and models for the inheritance of acquired characters. J. Theor. Biol. 158, 167–170.

Jablonka, E. and M. Lamb (1995). Epigenetic Inheritance and Evolution. Oxford University Press.

Klar, A. J. S. (1998). Propagating epigenetic states through meiosis: where Mendel's gene is more than a DNA moiety. Trends Genet. 14, 299–301.

Lyko, F. and R. Paro (1999). Chromosomal elements conferring epigenetic inheritance. BioEssays 21, 824–832.

Michod, R. E. and D. Roze (1997, Jun 22). Transitions in individuality. Proc R Soc Lond B Biol Sci 264(1383), 853–7.

Michod, R. E. and D. Roze (2001, January). Cooperation and conflict in the evolution of multicellularity. Heredity 86(Pt 1), 1–7.

Morgan, H. D., H. G. Sutherland, D. I. Martin, and E. Whitelaw (1999, November). Epigenetic inheritance at the agouti locus in the mouse. Nat Genet 23(3), 314–8.

Waddington, C. H. (1942). The epigenotype. Endeavour 1, 18–20.

Wu, C. and J. R. Morris (2001). Genes, genetics, and epigenetics: a correspondence. Science 293, 1103–1105.

A Appendix - Description of differentiation model

In the paper we described a scenario in which a germline could evolve. The question is

whether this scenario is similar to the scenario that brought about the evolution of a

germline in so many colonial organisms. In this appendix we describe a model for the

evolution of differentiation, to test whether a germline will evolve here through a scenario

similar to the one described in the paper.

In this model organisms are placed in an environment that favors differentiation. We

assume that the organism is already multicellular, developing from a single-celled spore

to an adult with 16 cells. Initially all the cells in the adult are identical. The fitness

function takes into account the distribution of which genes are turned on or off in how

many cells in the adult. Following is an exact description of the model:

A.1 Epigenotype

Each cell has an epigenotype, which is specified by which genes are on or off. This
epigenotype is transmitted to both of the daughter cells when the cell divides. We model
the epigenotype as a string of bits $e_i$, $i = 1, \dots, n_G$, where $e_i$ tells whether gene $i$
is on or off. In our model we used 8 genes, i.e. $n_G = 8$. We also assumed that one of the
genes distinguishes between the two daughter cells of a cell division, so that it is on in one
of the daughter cells, and off in the other.

A.2 Genotype

The genome of the organism determines two things: It determines differentiation, and it

determines the reproductive ability of the various cells in the adult.

A.2.1 Genetic determination of differentiation

The genome determines differentiation by specifying a list of classifiers to match the
epigenotype of the cell, probabilities for a subsequent action to be taken, and a list of
which genes should be turned on and off when this action is taken. Thus the genome has
a list of classifiers, chances and actions $(c_g, p_g, a_g)$. Each classifier is a string of length
$n_G$, each element of which is 1, 0, or X. A '1' at position $i$ specifies that gene $i$ has to be
on for the classifier to match, a '0' that the gene has to be off, and an 'X' that the gene can
be either on or off. If the classifier matches the epigenotype of the cell after division, then
with probability $p_g$ an action is taken. The action is simply a list of which genes will be
turned on, and which will be turned off. For example: Let us assume that the epigenotype
specifies the activity of 8 genes. The initial epigenotype of the cell in consideration is
(subscripts give the gene number)

$1_1 0_2 0_3 1_4 0_5 1_6 1_7 1_8$

The genome has two classifiers:

$1_1 0_2 0_3 X_4 X_5 1_6 1_7 0_8$    0.5    on: 2,3,4; off: 1
$1_1 X_2 X_3 X_4 X_5 0_6 1_7 0_8$    0.5    off: 3

Now let us go through cell division and differentiation. First the cell divides into two
daughter cells, which will have the following epigenotypes:

$1_1 0_2 0_3 1_4 0_5 1_6 1_7 1_8$
$1_1 0_2 0_3 1_4 0_5 1_6 1_7 0_8$

As was mentioned before, the last gene is on in one of the daughter cells, and off in the
other. Now these two daughter cells differentiate. The first daughter cell doesn't match
the first classifier, since it specifies that the last gene should be off, whereas in the cell it
is on. It also doesn't match the second classifier, for the same reason. The second cell
matches the first classifier: the first 3 genes are on, off, off, as specified by the classifier.
The classifier does not specify the state of genes 4 and 5, and then specifies that genes 6,
7, and 8 should be on, on, and off, as they are in the second cell. Thus with probability
0.5 the action specified by this classifier will be taken, which is to turn genes 2, 3, and 4
on, and turn gene 1 off. Thus with probability 0.5 the second cell will switch to the state

$0_1 1_2 1_3 1_4 0_5 1_6 1_7 0_8$

Once the epigenotype of a cell matches a classifier and an action is taken, it is not

matched against other classifiers.
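The matching and action rules of this subsection can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the authors' implementation: the function names (`matches`, `apply_action`, `differentiate`) and the tuple layout of a rule are assumptions, epigenotypes are written as plain 8-character bit strings with gene 1 leftmost, and a rule that matches but whose action is not taken is assumed to let the cell be tested against the remaining rules.

```python
import random

def matches(classifier: str, epigenotype: str) -> bool:
    """True if every non-X position of the classifier equals the cell's bit."""
    return all(c in ("X", e) for c, e in zip(classifier, epigenotype))

def apply_action(epigenotype: str, on=(), off=()) -> str:
    """Turn the listed genes (numbered from 1) on or off."""
    bits = list(epigenotype)
    for gene in on:
        bits[gene - 1] = "1"
    for gene in off:
        bits[gene - 1] = "0"
    return "".join(bits)

def differentiate(epigenotype: str, rules, rng=random) -> str:
    """Go through the rules in order; stop at the first action that is taken.

    Each rule is (classifier, probability, genes_on, genes_off).
    Assumption: a matching rule whose action is not taken does not stop
    the cell from being tested against later rules.
    """
    for classifier, p, on, off in rules:
        if matches(classifier, epigenotype) and rng.random() < p:
            return apply_action(epigenotype, on, off)
    return epigenotype

# The two classifiers of the worked example, in plain-string form.
RULES = [
    ("100XX110", 0.5, (2, 3, 4), (1,)),
    ("1XXXX010", 0.5, (), (3,)),
]

# The two daughter cells of the worked example.
DAUGHTERS = ("10010111", "10010110")
new_states = [differentiate(cell, RULES) for cell in DAUGHTERS]
```

Because the action is taken only with probability 0.5, repeated calls on the second daughter cell return either its unchanged state or the switched state.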

A.2.2 Genetic determination of reproductive ability

The genome also determines the reproductive ability of the various cell types in the
organism. This is done through a list of classifiers and $r$ values: $(c_g, r_g)$. For each cell
type, the $r$ values of all classifiers it matches are multiplied to give the reproductive
ability of the cell. Thus if the adult has the following two cell types:

$1_1 0_2 0_3 1_4 0_5 1_6 1_7 1_8$
$0_1 1_2 1_3 1_4 0_5 1_6 1_7 0_8$

and the following list of classifiers specifying reproductive ability:

$1_1 0_2 X_3 X_4 X_5 1_6 1_7 1_8$    0.7
$X_1 X_2 X_3 1_4 0_5 X_6 1_7 X_8$    0.8

then the first cell type matches both classifiers (gene 1 is on, gene 2 off and genes 6 to 8
on, as the first classifier requires; gene 4 is on, gene 5 off and gene 7 on, as the second
requires), and thus its reproductive ability is 0.7 × 0.8 = 0.56. The second cell type matches
only the second classifier, because its gene 1 is off, and thus its reproductive ability is
0.8. Resources are divided among the different cells in the organism to produce spores
as specified in section 5.
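The multiplication rule for reproductive ability can be written down directly. The sketch below is an illustration under stated assumptions rather than the authors' code: `reproductive_ability` and `R_CLASSIFIERS` are hypothetical names, epigenotypes are plain bit strings with gene 1 leftmost, and a cell type that matches no classifier is assumed to keep a reproductive ability of 1 (the empty product).

```python
from math import prod

def matches(classifier: str, epigenotype: str) -> bool:
    """True if every non-X position of the classifier equals the cell's bit."""
    return all(c in ("X", e) for c, e in zip(classifier, epigenotype))

def reproductive_ability(epigenotype: str, r_classifiers) -> float:
    """Multiply the r values of all classifiers that the cell type matches.

    Assumption: with no matching classifier the result is 1 (empty product).
    """
    return prod(r for classifier, r in r_classifiers
                if matches(classifier, epigenotype))

# The two (classifier, r) pairs of the example, in plain-string form.
R_CLASSIFIERS = [
    ("10XXX111", 0.7),
    ("XXX10X1X", 0.8),
]

# Reproductive ability of the two adult cell types of the example.
abilities = {cell: reproductive_ability(cell, R_CLASSIFIERS)
             for cell in ("10010111", "01110110")}
```

Keeping the classifiers as an ordered list of pairs mirrors the $(c_g, r_g)$ notation; since multiplication is commutative, the order does not affect the result.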

A.3 Fitness

Fitness is determined through a list of fitness components. Each of these fitness

components consists of a list of classifiers, the minimal number of cells that need to

match that classifier, and the fitness benefit that the organism gets if the adult organism

contains the minimal number of cells matching each of the classifiers. For example,

assume that the adult organism contains the following cells:

i    epigenotype                          no. of cells of this type
1    $1_1 0_2 0_3 1_4 0_5 1_6 1_7 1_8$    6
2    $1_1 1_2 1_3 0_4 0_5 1_6 1_7 0_8$    5
3    $0_1 1_2 1_3 0_4 0_5 1_6 1_7 1_8$    5

And assume that the fitness function has the following fitness benefits:

i    min n    pattern     min n    pattern     fitness benefit
1    8        XX1001XX    5        XXXX0XXX    1
2    3        XXX1101X    8        XX0X11XX    2
3    7        111XXXXX    6        XX00XXXX    8
4    4        01XXXXXX    11       X11X0XXX    512

1. The first classifier of fitness component 1 matches cell types 2 and 3, and the
second classifier matches all cells. So, the organism has 10 cells matching the first
classifier, which is more than the minimum needed, 8. It also has 16 matching the
second, which is more than the needed 5. So the organism gains a fitness benefit of 1.

2. No cell matches the first classifier of fitness component 2, and a minimum of 3 are
needed, so the organism doesn't gain any fitness benefit.

3. Cells of type 2 match the first classifier of fitness component 3, so 5 cells match,
but 7 are needed, so no fitness benefit.

4. Cells of type 3 match the first classifier of fitness component 4, so 5 cells match
and a minimum of 4 are needed. Cells of type 2 and 3 match the second classifier,
so 10 cells match, but 11 are needed, so again no fitness benefit.

Overall only fitness component 1 matches, so the total relative fitness of the organism is 1.
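The fitness evaluation of this example can be checked with a short Python sketch. It is a hedged illustration, not the authors' code: `count_matching` and `fitness` are hypothetical names, epigenotypes and patterns are plain 8-character strings with gene 1 leftmost, and the benefits of all satisfied components are assumed to be summed (the example satisfies only one component, so it does not by itself show how several benefits combine).

```python
def matches(classifier: str, epigenotype: str) -> bool:
    """True if every non-X position of the classifier equals the cell's bit."""
    return all(c in ("X", e) for c, e in zip(classifier, epigenotype))

def count_matching(pattern: str, cells: dict) -> int:
    """Number of adult cells whose epigenotype matches the pattern."""
    return sum(n for epigenotype, n in cells.items()
               if matches(pattern, epigenotype))

def fitness(cells: dict, components) -> int:
    """Add up the benefits of all fitness components that are satisfied.

    A component is satisfied when each of its (min_n, pattern) requirements
    is met. Assumption: benefits of satisfied components are summed.
    """
    total = 0
    for requirements, benefit in components:
        if all(count_matching(pattern, cells) >= min_n
               for min_n, pattern in requirements):
            total += benefit
    return total

# Adult organism of the example: epigenotype -> number of cells.
CELLS = {"10010111": 6, "11100110": 5, "01100111": 5}

# Fitness components of the example: ([(min n, pattern), ...], benefit).
COMPONENTS = [
    ([(8, "XX1001XX"), (5, "XXXX0XXX")], 1),
    ([(3, "XXX1101X"), (8, "XX0X11XX")], 2),
    ([(7, "111XXXXX"), (6, "XX00XXXX")], 8),
    ([(4, "01XXXXXX"), (11, "X11X0XXX")], 512),
]

total_fitness = fitness(CELLS, COMPONENTS)  # 1 for this example
```

Storing the adult as a mapping from epigenotype to cell count keeps the check independent of the order in which cells were produced during development.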

A.4 Dynamics

1. Each generation starts with a population of spores. Each spore has the epigenotype

and genotype of the cell that produced it.

2. Each spore goes through differentiation and growth. Differentiation and growth

consist of 4 rounds of cell division and differentiation. During differentiation both

genotype and epigenotype can mutate. Each cell division is followed by a round of

differentiation of both daughter cells. The adult then consists of 16 cells.

3. The fitness of the adult organism is calculated. This is the relative fitness of the
organism if all cells had a reproductive value of 1. When cells have a reproductive
value less than 1, then the total number of spores that will be produced will be
smaller. The transfer of resources between cell types is as described in section 5.
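Step 2 of the dynamics, growth from a spore to a 16-cell adult, can be sketched as follows. This is a simplified illustration under several assumptions, none of which come from the authors' code: mutation is left out, the gene that distinguishes the two daughter cells is taken to be the last one (as in the worked example of A.2.1), and the names `divide`, `differentiate` and `develop` are hypothetical.

```python
import random
from collections import Counter

def matches(classifier: str, epigenotype: str) -> bool:
    """True if every non-X position of the classifier equals the cell's bit."""
    return all(c in ("X", e) for c, e in zip(classifier, epigenotype))

def differentiate(epigenotype: str, rules, rng=random) -> str:
    """Apply the first rule that matches and whose action is taken.

    Each rule is (classifier, probability, genes_on, genes_off),
    with genes numbered from 1.
    """
    for classifier, p, on, off in rules:
        if matches(classifier, epigenotype) and rng.random() < p:
            bits = list(epigenotype)
            for gene in on:
                bits[gene - 1] = "1"
            for gene in off:
                bits[gene - 1] = "0"
            return "".join(bits)
    return epigenotype

def divide(epigenotype: str):
    """Two daughters: the last gene is on in one and off in the other.

    Assumption: the distinguishing gene is the last gene.
    """
    return epigenotype[:-1] + "1", epigenotype[:-1] + "0"

def develop(spore: str, rules, rounds: int = 4, rng=random) -> Counter:
    """Grow an adult: each round is a division followed by differentiation."""
    cells = [spore]
    for _ in range(rounds):
        daughters = [d for cell in cells for d in divide(cell)]
        cells = [differentiate(d, rules, rng) for d in daughters]
    return Counter(cells)

# With an empty rule list no differentiation occurs beyond the
# daughter-distinguishing gene, so the adult has two cell types.
adult = develop("10010111", rules=[])
```

The returned `Counter` maps each epigenotype to its number of cells in the adult, which is the form needed to evaluate the fitness components of A.3.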

A.5 Simulation results

A deeper analysis of the results of this model is beyond the scope of this paper. A
simulation as described here was run with a population size of 1000, epigenetic mutation
rate of 0.005 and mutation rate of 0.0003. The loss factor was set to 0.5, which means
that half of the resources transferred from one cell to others are lost. Fitness components
with exponential fitness benefits were chosen at random. In approximately half of the
runs soma evolved, and this depended mainly on the fitness function, i.e. with certain
fitness functions most runs produced a soma. Figure 6 shows the average fitness and
average reproductive ability of a representative run in which soma evolved. Figure 7
shows representatives of the surviving lineage in this run. As one can see, a
soma-germline distinction evolved in this run. In the organism that appeared at
generation 749, some of the cells contribute 90% of their reproductive ability to other
cells. In Figure 9 we show a possible differentiation path of this organism from single
spore to adult. The fitness function of this run is represented in Table 1.

[Figure 9 diagram: cell lineage from spore (left) to adult (right) along a time axis. The
cells carry the epigenetic states 01111011, 01111010, 00001001 and 00001000. In the adult,
cells in the first two states have reproductive value 1 and cells in the last two states
have reproductive value 0.0996392.]

Figure 9: Sample development of an organism from generation 749. Time goes from left

to right. The organism starts as a spore, and then goes through 4 cycles of cell-division

and differentiation. Dashed arrows point to the products of cell division, solid arrows to

the products of differentiation.

min n    pattern     min n    pattern     fitness benefit
8        XX1001XX    5        XXXX0XXX    1
3        XXX1101X    8        XX0X11XX    2
8        XX011XXX    4        X11XXXXX    4
7        111XXXXX    6        XX00XXXX    8
4        X11XXXXX    3        X0001XXX    16
4        XXX1110X    7        XXX0X01X    32
3        X000XXXX    5        1X10XXXX    64
3        XXXX01XX    3        X1101XXX    128
6        X010XXXX    7        XXX1000X    256
4        00XXXXXX    4        X11X0XXX    512
5        XXX010XX    6        XXX110XX    1024

min n    classifier    min n    classifier    min n    classifier    fitness benefit
3        XXX1001X      3        XXX101XX      5        XXXX111X      4
4        XXXX101X      3        XX1X0XXX      6        11X1XXXX      8
6        X110XXXX      3        0111XXXX      5        XX01X1XX      16
3        XX101XXX      5        10X1XXXX      5        XX10XXXX      32
5        X1100XXX      5        XX1001XX      6        XX111XXX      64
4        XX110XXX      3        X0101XXX      6        XX0111XX      128
4        X0110XXX      3        XX0110XX      6        XXX111XX      256
5        XX00X1XX      6        XX1XX0XX      6        XXX000XX      512
3        XXXXXXXX      5        XXX010XX      6        XXX110XX      1024
4        XXX01X1X      4        XX1101XX      6        XX101XXX      2048

Table 1: Table of fitness components for the fitness function of the run that is described in

this paper.
