arXiv:1509.04145v1 [physics.bio-ph] 2 Sep 2015



    The physics of epigenetics

    Ruggero Cortini,1,2,3 Maria Barbi,1,2,3 Bertrand R. Caré,1,2,3 Christophe Lavelle,3,4 Annick Lesne,1,2,3,5 Julien Mozziconacci,1,2,3 and Jean-Marc Victor1,2,3,5,∗

    1 Sorbonne Universités UPMC Univ. Paris 06 UMR 7600 LPTMC F-75005 Paris France.
    2 CNRS UMR 7600 LPTMC, F-75005 Paris France
    3 Nuclear Architecture and Dynamics, CNRS GDR 3536, UPMC Université Paris 6, 75005 Paris France
    4 Genomes Structure and Instability, Sorbonne Universités, National Museum of Natural History, Inserm U 1154, CNRS UMR 7196, 75005 Paris France
    5 Institut de Génétique Moléculaire de Montpellier CNRS UMR 5535 Montpellier France.

    (Dated: January 25, 2021)

    In higher organisms, all cells share the same genome, but every cell expresses only a limited and specific set of genes that defines the cell type. During cell division, not only the genome, but also the cell type is inherited by the daughter cells. This intriguing phenomenon is achieved by a variety of processes that have been collectively termed epigenetics: the stable and inheritable changes in gene expression patterns. This article reviews the extremely rich and exquisitely multi-scale physical mechanisms that govern the biological processes behind the initiation, spreading and inheritance of epigenetic states. These include not only the change in the molecular properties associated with the chemical modifications of DNA and histone proteins, such as methylation and acetylation, but also less conventional ones, such as the physics that governs the three-dimensional organization of the genome in cell nuclei. Strikingly, to achieve stability and heritability of epigenetic states, cells take advantage of many different physical principles, such as the universal behavior of polymers and copolymers, the general features of non-equilibrium dynamical systems, and the electrostatic and mechanical properties related to chemical modifications of DNA and histones. By putting the complex biological literature under this new light, the emerging picture is that a limited set of general physical rules plays a key role in initiating, shaping and transmitting this crucial “epigenetic landscape”. This new perspective not only makes it possible to rationalize normal cellular functions, but also helps to understand the emergence of pathological states, in which the epigenetic landscape becomes dysfunctional.

    ∗ Corresponding author: Jean-Marc Victor, LPTMC, case courrier 121, Université Pierre et Marie Curie, 4 place Jussieu, 75252 Paris cedex 05, France. Email: [email protected]

    CONTENTS

    I. Introduction
        A. An intricate history
        B. Scope of this review

    II. The physical template of epigenetics: chromatin
        A. Molecular picture of chromatin and its modifications
        B. Large-scale picture of chromatin
        C. Chromosomes as polymers

    III. From epigenetic marks to regulation of gene expression through the 3D organization of the genome
        A. General principles of gene silencing. The paradigm of DNA accessibility
        B. Histone modifications as chromatin structural modulators
            1. Histone tails and their role in internucleosomal interactions
            2. Histone tail post-translational modifications (PTMs)
            3. Histone tail acetylation: direct effects on chromatin accessibility
            4. H4K16 acetylation is a silencing mark in budding yeast
            5. Histone tail methylation: indirect effects on chromatin condensation
        C. How epigenetic marks organize the chromosomes in the cell nucleus. General rules. Physical modeling of epigenome wide studies.
            1. Epigenome wide studies
            2. The physics of TADs: finite-size effects in the coil-globule transition of copolymers

    IV. Physical mechanisms involved in the initiation, spreading, maintenance and heritability of epigenetic marks
        A. Mathematical modelling
        B. Zero-dimensional models
        C. Higher-dimensional models
            1. One-dimensional models.
            2. Three-dimensional models.
        D. Biological relevance of the models
            1. Waddington’s epigenetic landscape revisited.
            2. Hysteresis
        E. Example: plant vernalization

    V. Toward a more complex scenario: DNA methylation, role of RNAs, supercoiling in epigenetics
        A. DNA methylation
            1. Mechanical properties of DNA change upon methylation
            2. Impact of cytosine methylation on DNA-protein interactions
            3. Relationship between nucleosome positioning and DNA methylation
            4. Remarks and perspectives
        B. Parental imprinting


        C. Chromosome X inactivation
        D. Non-coding RNA and microRNA
        E. Supercoilingomics: supercoiling as a physical epigenetic mark, and its role in the initiation and maintenance of epigenetic marks

    VI. Conclusion and perspectives

    Acknowledgments

    References

    I. INTRODUCTION

    A. An intricate history

    The word “epigenetic” was introduced by Waddington in 1942 in the context of development, to qualify all the processes relating the genotype and the phenotype of an organism (Waddington, 1942). The associated investigations belonged to the domain, novel at that time, of developmental genetics. The word “epigenetic” in this original meaning was imprinted by the pre-existing concept of epigenesis, namely the fact that the organism is not fully formed in the initial cell but experiences complex developmental processes (Gilbert, 2011). Epigenetics was at that time the mechanistic study of how genes guide the epigenesis (development) of an organism, which is captured metaphorically by Waddington’s famous epigenetic landscape: a landscape, shaped by the genes, on which the organism evolves during its development like a stone rolling down the landscape, following one of the possible epigenetic pathways (see Fig. 1).

    In parallel, the adjective has been used with the meaning of “para-genetic”. Epigenetic systems, as opposed to the genetic system, were conceived as “signal interpreting devices” (Nanney, 1958), i.e. mediators between signals (environmental or physiological cues) and the genomic response, mainly at the level of transcriptional regulation.

    Due to this dual origin of the word “epigenetic”, the associated concepts have developed in several ways (see (Haig, 2004) for a detailed historical account), and in 1994 one could find the following two complementary definitions of epigenetics (Holliday, 1994): (1) changes in gene expression which occur in organisms with differentiated cells, and the mitotic inheritance of the associated patterns of gene expression; (2) transgenerational inheritance, that is, transmission through meiosis of non-genomic information.

    Due to this intricate history, a consensus definition of epigenetics is still lacking today (Dawson and Kouzarides, 2012). Notably, transgenerational inheritance, albeit largely documented in plants, remains a matter of debate in animals and especially in humans.

    Recently, some authors proposed an operational definition of epigenetics: “An epigenetic trait is a stably heritable phenotype resulting from changes in a chromosome without alterations in the DNA sequence”. This will be the definition we use in this review. To be even more specific, we note that:

    Epigenetics is a modification of the function(s) of a gene that is stable and heritable during mitosis, and possibly during meiosis.

    Epigenetics is not the reversible regulation of transcription in response to metabolic cues, because this is neither stable nor heritable.

    According to a scenario proposed by Berger et al. (Berger et al., 2008), there are three categories of signals that culminate in the establishment of a stably heritable epigenetic state:

    1. Epigenetor: a signal (cue) from the environment that triggers an intracellular signaling pathway (e.g. by means of membrane receptors, notably G Protein-Coupled Receptors);

    2. Epigenetic initiator: epigenetors activate transcription factors (TFs) that bind to specific DNA targets;

    3. Epigenetic maintainer: molecular covalent modifications of DNA or histones (the DNA-binding proteins most strongly bound to DNA).

    The molecular covalent modifications that eventually result from epigenetors are the so-called “epigenetic marks”. The epigenetic marking of the genome is thus a key component of the dialogue between genes and environment in the eukaryotic realm.

    B. Scope of this review

    In this review, we intend not only to analyze the physics that drives or accompanies epigenetic marking, but also to understand the rationale behind this marking. Physics is a beautiful, yet underrated, guide to reach this goal.

    Several epigenetic mechanisms will be distinguished: those occurring at the level of DNA, those involving histone post-translational modification, and less conventional ones involving chromatin topology (supercoiling) and nuclear architecture.

    We first introduce in Section II the physical template of epigenetic marking, namely chromatin.

    Section III is devoted to the physics behind the family of processes at work in the way epigenetic marks control gene expression in different cell types.

    Section IV addresses the issue of the initiation, spreading, maintenance, and heritability of the epigenetic marks in the framework of dynamical systems.

    In Section V we review other epigenetic processes that have a less clear-cut physical interpretation: DNA methylation, imprinting, chromosome X inactivation, supercoiling marking. In conclusion we finally propose a list of currently significant and challenging issues.

    FIG. 1 (Left) The epigenetic landscape described by Waddington (Waddington, 1957) represents the process by which the cell (represented by the ball) faces different possible paths during development (i.e. chooses one of the permitted trajectories), leading to different cell fates. (Right) The landscape is dynamically determined by hidden wires that symbolize gene expression and interactions. See Sec. IV.D and Sec. VI for a physical reanalysis of this metaphorical picture.

    Due to the fundamentally different logic of transcriptional regulation in prokaryotes and eukaryotes (Struhl, 1999), we will leave aside the realm of bacteria, although epigenetic switches have been observed in prokaryotic cells as well and have been modeled successfully (Lim and Van Oudenaarden, 2007; Norregaard et al., 2013).

    We hope this review will be a stimulating introduction to epigenetics for physicists as well as an “alternative reading frame” of epigenetics for biologists that will help tackle cutting-edge advances in current topics ranging from nuclear organization and cell differentiation up to cancer progression and chronic diseases.

    II. THE PHYSICAL TEMPLATE OF EPIGENETICS: CHROMATIN

    In all living organisms, DNA encodes the genetic instructions required to synthesize proteins, the basic bricks ensuring the proper functioning of the cell. The main steps of protein synthesis are DNA transcription into RNA, then RNA translation into an amino acid chain, and finally chain folding to form a functional protein (Alberts et al., 2013).

    The very same genome is found in each cell. It has to be packaged inside the cell's tiny volume, and has to be retrieved at will for physiological purposes. DNA is therefore embedded in an orderly and dynamically retrievable architecture. Two main organizational strategies can be identified. In prokaryotes (bacteria), DNA is located in the same compartment as all other intracellular components. In eukaryotes (from the unicellular yeast up to multicellular organisms, including fungi, animals and plants), DNA is sequestered in the nucleus, a dedicated compartment enclosed within a membrane.

    In the cell nucleus, multiple long linear DNA molecules are organized by architectural proteins to form chromosomes. From a physicist's point of view, chromosomes are giant polymers. During mitosis, i.e. cell division, chromosomes duplicate and then condense in the well-known “X” shape, with each DNA copy forming one of the two rods (the sister chromatids, bound together at the centromere). The rest of the time (i.e. during interphase), chromosomes are less condensed and fill the whole nucleus, more or less homogeneously (Leblond and El-Alfy, 1998). To give a quantitative idea of the composition of an interphase nucleus, the dry matter of a yeast nucleus is about 70-80% protein, 20-30% RNA, and only about 2% DNA (Rozijn and Tonino, 1964).

    In this section we introduce the basic concepts that come into play in the study of epigenetics. In Sec. II.A we give a synthetic overview of the molecular structure of chromatin, and we introduce the concept of epigenetic marks. In Sec. II.B we give an overview of the large-scale organization of chromatin in the cell nucleus, stressing the importance of this organization in gene expression. Finally, in Sec. II.C we give a synthetic picture of these two aspects, in the framework of polymer physics.

    A. Molecular picture of chromatin and its modifications

    In eukaryotic organisms, chromosomal DNA is associated with proteins to form chromatin. The principal proteins associated with DNA are called histones. Histones are polypeptidic monomers of five types: Histone 1 (H1) class, Histone 2A (H2A) class, Histone 2B (H2B) class, Histone 3 (H3) class and Histone 4 (H4) class. Each histone family has variants whose presence in chromatin depends on the species, the cell type, and the developmental stage. The classical structure of the histone-DNA assembly consists of 1.7 left-handed turns of double-stranded DNA (approximately 147 base pairs, or bp) wrapped around a histone octamer composed of two copies of each of the histone monomers H2A, H2B, H3 and H4 (Davey et al., 2002; Luger et al., 1997). In most species, this assembly, referred to as the nucleosome core particle (NCP) (see Fig. 2a,b), may also integrate a copy of H1 (“linker histone”) at the DNA entry/exit point, although H1 does not share the ubiquity of the other histone classes.

    FIG. 2 Detailed structure of the Nucleosome Core Particle (NCP) and chromatin fiber. (a) Left: internal tertiary structure of the NCP147 (PDB: 1KX5). H2A is in green, H2B in blue, H3 in yellow, H4 in red, and DNA in black. Right: electrostatic potential at the NCP surface (computed using the PDB2PQR/APBS plugin (Baker et al., 2001; Dolinsky et al., 2007) of the VMD software package (Humphrey et al., 1996), ionic strength 0.15 M monovalent salt). (b) Cartoon of the NCP showing histone tails and the globular core. Lysine and arginine residues are marked by asterisks (from (Wolffe and Hayes, 1999)). (c) EM images of nucleosome arrays at low (left) or high (far right) ionic strength, with details of a nucleosome (middle), from (Olins and Olins, 2003). (d) Principles of reconstituted chromatin sedimentation assays, adapted from (Pepenella et al., 2013).

    In addition, consecutive NCPs are separated by linker DNA whose length ranges from 20 to 60 bp. Chromosomes are thus a succession of NCPs and DNA linkers. The basic structural unit (monomer) is made of one NCP and one DNA linker, and is called the nucleosome. The number of DNA base pairs inside one nucleosome is the Nucleosome Repeat Length (NRL) (see Fig. 2c,d), which is not constant and may vary along the genome and across tissues.
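    As a back-of-the-envelope illustration of these numbers, the sketch below computes the NRL range implied by a 147 bp core and 20 to 60 bp linkers, and the rough number of nucleosomes needed to cover a genome. The function names, the assumption of a uniform NRL, and the 10 Mb example genome (the order of magnitude quoted for yeast in Sec. II.B) are illustrative choices, not values taken from the cited references.

```python
CORE_DNA_BP = 147  # DNA wrapped in one nucleosome core particle (from the text)


def nucleosome_repeat_length(linker_bp: int) -> int:
    """NRL = core DNA + linker DNA, in base pairs."""
    if linker_bp < 0:
        raise ValueError("linker length must be non-negative")
    return CORE_DNA_BP + linker_bp


def nucleosome_count(genome_bp: int, linker_bp: int) -> int:
    """Rough number of nucleosomes covering a genome.

    Assumption: the NRL is uniform along the genome, which the text
    explicitly says is not true in general.
    """
    return genome_bp // nucleosome_repeat_length(linker_bp)


if __name__ == "__main__":
    for linker in (20, 60):  # bounds of the linker range quoted in the text
        nrl = nucleosome_repeat_length(linker)
        count = nucleosome_count(10_000_000, linker)
        print(f"linker {linker} bp -> NRL {nrl} bp, ~{count} nucleosomes per 10 Mb")
```

    With these inputs the NRL spans 167 to 207 bp, so a 10 Mb genome would carry a few tens of thousands of nucleosomes.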

    Electrostatic interactions are important because the NCP has a net charge of -150e, to which DNA contributes -294e and histones +144e. The NCP is therefore not electrically neutral, so the folding of nucleosome arrays is highly dependent on the presence of positive counterions (Bertin et al., 2007b; Yang and Hayes, 2011). Additionally, the charge distribution in the NCP is not spatially homogeneous (see Fig. 2a).

    Epigenetic marks are covalent chemical modifications of either DNA (namely DNA methylation, see Sec. V.A) or histones (so-called post-translational modifications, PTMs, see Sec. III.B). The DNA methylation state and the histone PTMs are transmitted through cell division both because they are covalent and thanks to specific mechanisms. DNA methylation is accurately transmitted by a specific molecular mechanism (see Sec. V.A). Histone PTMs are inherited in a fundamentally different way, which will be the principal subject of Sec. IV.
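    The charge bookkeeping above can be checked in a few lines. The sketch assumes that each base pair carries two elementary negative charges (one fully ionized phosphate per strand), which reproduces the -294e quoted for 147 bp; the histone contribution of +144e is taken directly from the text. The function names are hypothetical.

```python
BASE_PAIRS_IN_NCP = 147        # from the text
CHARGE_PER_BASE_PAIR = -2      # assumption: one ionized phosphate per strand
HISTONE_OCTAMER_CHARGE = +144  # from the text, in units of e


def dna_charge(base_pairs: int = BASE_PAIRS_IN_NCP) -> int:
    """Total DNA charge in units of the elementary charge e."""
    return base_pairs * CHARGE_PER_BASE_PAIR


def ncp_net_charge() -> int:
    """Net charge of the nucleosome core particle in units of e."""
    return dna_charge() + HISTONE_OCTAMER_CHARGE


def neutralized_fraction() -> float:
    """Fraction of the DNA charge compensated by the histone octamer."""
    return HISTONE_OCTAMER_CHARGE / abs(dna_charge())


if __name__ == "__main__":
    print("DNA charge:", dna_charge(), "e")
    print("NCP net charge:", ncp_net_charge(), "e")
    print(f"fraction neutralized by histones: {neutralized_fraction():.2f}")
```

    Under these assumptions the histones compensate roughly half of the DNA charge, which is why the remaining charge has to be screened by counterions.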

    B. Large-scale picture of chromatin

    Eukaryotic chromosomes are giant polymers, each formed by a huge string of nucleosomes. The conformation of this string at different length scales is generally described using an analogy with proteins: the string of nucleosomes itself can be viewed as the primary structure of chromatin; the conformation adopted by an array of a few dozen successive nucleosomes forms the secondary structure of chromatin. The 3D structural arrangement of several arrays can finally be viewed as the chromatin tertiary structure (Luger et al., 2012; Pepenella et al., 2013), see Fig. 2d.

    When observed by electron microscopy (see Fig. 3a,b,c), interphase chromatin appears to fill the entire nuclear volume. As genome length may vary considerably from organism to organism, the nucleus size varies accordingly: orders of magnitude go from ∼10 Mb (mega base pairs) for a diameter of the order of 2 µm in yeast (Fig. 3a), to ∼100 Mb and 4 µm in the drosophila fly (Fig. 3b), and up to ∼1000 Mb and 10 µm in mammals (Fig. 3c). These differences in size are certainly correlated with the differences in chromatin organization that can be directly deduced by simple inspection of electron microscopy images.
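    To make the orders of magnitude above concrete, the following sketch estimates the genomic density (Mb per cubic micrometre) in the three cases. It assumes a spherical nucleus and uses the quoted genome sizes and diameters at face value; the dictionary and function names are illustrative.

```python
import math

# (genome size in Mb, nuclear diameter in micrometres): orders of magnitude from the text
NUCLEI = {
    "yeast": (10.0, 2.0),
    "drosophila": (100.0, 4.0),
    "mammal": (1000.0, 10.0),
}


def sphere_volume(diameter_um: float) -> float:
    """Volume of a sphere of given diameter, in cubic micrometres."""
    return math.pi * diameter_um ** 3 / 6.0


def genomic_density(genome_mb: float, diameter_um: float) -> float:
    """Genome length per unit nuclear volume (Mb / um^3), assuming a spherical nucleus."""
    return genome_mb / sphere_volume(diameter_um)


if __name__ == "__main__":
    for name, (genome_mb, diameter_um) in NUCLEI.items():
        density = genomic_density(genome_mb, diameter_um)
        print(f"{name:>10}: {density:.2f} Mb per cubic micrometre")
```

    With these rough inputs the three densities fall within about a factor of two of each other, which is one way to read the statement that nuclear size varies with genome length.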

    Yeast nuclei are the most homogeneously filled ones, with a large, denser region called the nucleolus, which is known to be the site of very intense ribosomal RNA synthesis. A smaller, dark linear body can also be seen in the inset, connected with a star-shaped structure, the spindle pole body (SPB), from which tubular protein assemblies, the microtubules, stem and “hold” chromosomes at their centromeres. In contrast with multicellular organisms, in yeast this microtubule bundle is preserved throughout the cell cycle. It is a crucial organization center for the assembly of chromosomes in interphase and for chromosome segregation during mitosis (see Fig. 3a,d,g).

    When the nuclei of multicellular organisms are considered (Fig. 3b,c), their most striking feature is the coexistence of distinct denser and less compact regions. These regions are persistent and are not simply the result of temporal fluctuations of chromatin density. These features have been shown to strongly correlate with the transcription activity of genes. Active genes indeed tend to gather at the center of the nucleus, in a region where chromatin is less dense and more accessible, which is called euchromatin. Inactive genes are found instead in denser regions, called heterochromatin, and tend to associate with the nuclear periphery. As a stunning example of chromatin compaction and localization changes induced by transcription, the activation of a genomic locus results in a dramatic change of its topology (Fig. 4a).

    With the improvement of imaging and labeling techniques, gene transcription by the RNA polymerase PolII in multicellular organisms has been shown to occur in well-defined loci, called factories (Fig. 4b) (Jackson et al., 1993). These factories are located within the euchromatin domain and each factory has a propensity to gather co-regulated genes (Jackson et al., 1998). In this picture, it appears that the functional differences between cell types are related to the way the genome is folded in the nucleus of these cells.

    In the last two decades, impressive advances have been made in experimental techniques for measuring 3D chromosomal contacts, starting from the “Chromosome conformation capture” approach (Dekker et al., 2002). Its genome-wide derivative (Hi-C) enables the generation of contact maps at the genome scale (Lieberman-Aiden et al., 2009). From these maps, it is possible to reconstruct the underlying 3D structure of the genome; such structures are represented in Fig. 3d,e,f. The results confirmed the tethering of centromeres in yeast and drosophila, but also of telomeres. In humans the reconstruction of the first chromosome gives a visual illustration of decondensed euchromatin loops emanating from heterochromatin globules. These are decorated with two different specific histone marks. We will return to the results of these investigations in Sec. III.C.1.
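    To fix ideas about what a contact map encodes, the toy sketch below goes in the forward direction only: from a single set of 3D bead coordinates to a binary contact matrix. Real Hi-C maps are contact frequencies measured over a population of cells, and the reconstruction mentioned above solves the much harder inverse problem, which is not attempted here. The cutoff distance, the hairpin coordinates and the function name are all hypothetical.

```python
import math
from itertools import combinations


def contact_map(coords, cutoff):
    """Binary contact matrix: entry (i, j) is 1 if beads i and j lie within `cutoff`."""
    n = len(coords)
    contacts = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            contacts[i][j] = contacts[j][i] = 1  # the map is symmetric
    return contacts


if __name__ == "__main__":
    # Toy conformation: a hairpin, so the two chain ends are close in space
    beads = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0), (0, 1, 0)]
    for row in contact_map(beads, cutoff=1.0):
        print(row)
```

    In this toy hairpin, beads that are far apart along the chain (the first and the last) show up as a contact, which is the kind of off-diagonal signal that a 3D reconstruction exploits.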

    C. Chromosomes as polymers

    Most of the modeling efforts addressing the question of nuclear organization have so far been guided by polymer physics. The question then arises whether polymer physics is the main player that drives chromosome organization.
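    As a point of reference for what “polymer physics” means at its simplest, the sketch below simulates an ideal freely jointed chain and checks that the mean-squared end-to-end distance grows linearly with the number of segments. This is a textbook baseline under strong assumptions (no excluded volume, no confinement, no tethering), not one of the chromosome models cited in this section; all names and parameters are illustrative.

```python
import math
import random


def random_unit_vector(rng):
    """Direction uniformly distributed on the unit sphere (normalized Gaussian triple)."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 0.0:
            return x / norm, y / norm, z / norm


def end_to_end_sq(n_segments, rng, bond=1.0):
    """Squared end-to-end distance of one freely jointed chain."""
    x = y = z = 0.0
    for _ in range(n_segments):
        dx, dy, dz = random_unit_vector(rng)
        x += bond * dx
        y += bond * dy
        z += bond * dz
    return x * x + y * y + z * z


def mean_end_to_end_sq(n_segments, n_chains=1000, seed=0):
    """Average of the squared end-to-end distance over independent chains."""
    rng = random.Random(seed)
    return sum(end_to_end_sq(n_segments, rng) for _ in range(n_chains)) / n_chains


if __name__ == "__main__":
    for n in (50, 100, 200):
        # For an ideal chain with unit bonds the expected value is n
        print(n, round(mean_end_to_end_sq(n), 1))
```

    Deviations from this ideal scaling, for instance due to confinement in the nucleus or tethering of centromeres, are what the more realistic models discussed below have to capture.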

    FIG. 3 Nuclear organization in yeast (S. cerevisiae), drosophila and mammals. (a-c) Electron microscopy images of nuclei in yeast (a), in drosophila (b) and in human (c). Scale bars correspond respectively to 1 µm, 2 µm and 2 µm. In (a), the nucleolus is the darker region in the upper part of the nucleus. The SPB is shown by a circle. In (b), the nucleolus is the dark circular region and heterochromatin can be seen as darker spots. In (c), the nucleolus is marked by a dashed circle. (Figure (a) from http://scienceblogs.com/transcript/2006/08/16/the-centrosome-and-the-spindle/, (b) from http://pixgood.com/nuclear-pore-em.html, (c) from http://tinyurl.com/m6phpf8). (d-f) 3D model reconstructions (Lesne et al., 2014) from contact maps obtained using the Hi-C protocol in these three organisms (Dixon et al., 2012; Duan et al., 2010; Sexton et al., 2012). In (d) and (e) all the chromosomes are represented with different colors. In (f), only the first chromosome is shown and the colors correspond to regions harboring different epigenetic marks: H3K9Ac and H3K9me3 (see Sec. III.B). On each reconstruction, centromeres are shown as black beads and telomeres (chromosome ends) as purple beads. Each bead represents respectively 12 kb, 40 kb and 40 kb. (g-h) Polymer models of the genome in yeast and drosophila: In (h) each chromosome is labeled with a different color (Wong et al., 2012). In (g) colors correspond to the colors of chromatin (see Sec. III.C, courtesy of Giacomo Cavalli and Pascal Carrivain (IGH, Montpellier)). (i) To our knowledge, physical models of the human genome have not been developed so far.

    FIG. 4 Fluorescence microscopy of nuclei. (a) A specific region of the genome, labeled in red, is decompacted after the induction of transcription (Tumbar et al., 1999). (b) A human cell nucleus. DNA is labeled in blue, PolII in red (Crepaldi et al., 2013).

    (i) In the simplest case of yeast, where chromosomes are shorter and all anchored at the SPB by their centromeres, this indeed seems to be the case. Several polymer simulations have been able to reproduce the structure of interphase yeast nuclei (Tjong et al., 2012; Wong et al., 2012), see Fig. 3g and footnote 1. Moreover, fluorescence microscopy has been used to check the dynamical behavior in vivo of given chromosomal loci (Albert et al., 2013; Hajjoul et al., 2013). Single particle tracking has revealed a quite uniform response within the genome, characteristic of polymers in confined spaces. Except for telomeres and for the highly transcribed DNA in the nucleolus, yeast chromosomes behave as a polymer brush, and are essentially organized by simple physical principles (Huet et al., 2014) (see Fig. 3d,g).

    (ii) In the well-studied, intermediate-size case of the drosophila, recent investigations tend to indicate that this polymer behavior is partially conserved, but with some significant changes that point toward greater complexity (see Fig. 3e,h). Roughly speaking, it has been proposed that euchromatin and heterochromatin have intrinsically different biochemical and physical properties, due to a deeply different protein “dressing” of the DNA molecules. More precisely, Filion and co-workers have identified five principal chromatin states, called chromatin “colors”, from the analysis of 53 chromatin protein genome-binding profiles in drosophila cells (Filion et al., 2010). Among these states, some essentially correspond to active, transcribing euchromatin, others to dense, repressed heterochromatin. These chromatin states result from the recruitment of DNA-binding proteins that are specific to the underlying epigenetic marks (see Sec. III.B).

    1 Note that, in yeast, the whole genome is actively transcribing most of the time, with the only exceptions being the regions that govern the cell's sexual behavior, called “hidden mating type loci”, and the chromosome extremities, called telomeres, which protect the ends of the chromosome from damage or from fusion with other chromosomes. Therefore heterochromatin is restricted to telomeres and mating type loci in this case.

    As a consequence, drosophila chromosomes are more properly described as copolymers, i.e. polymers containing more than one type of monomer. A model of the resulting copolymer brush is depicted in Fig. 3h.

    (iii) In mammals, heterochromatin is mainly located at the nuclear membrane and euchromatin at the center of the nucleus (see Fig. 3c). The reconstructed 3D structure of chromosome 1 (the longest human chromosome) shows an alternation of long loops of euchromatin and dense parts of heterochromatin tethered to the nuclear membrane (see Fig. 3f).

    In summary, the conformation adopted by chromatin is affected by its intrinsic structural parameters such as the NRL (the reader may find an extensive review in (Boulé et al., 2015)), on top of which lies an additional layer of modulation by internucleosomal electrostatic interactions (Hansen, 2002; Pepenella et al., 2013) and binding of architectural proteins. This conformation is essential for gene regulation. The epigenetic marks present on DNA and histones, by mediating specific interactions between portions of chromatin, alter its conformation and hence its function. The next section will be devoted to understanding the complex relationship between epigenetic marking and genome structure and function.

    III. FROM EPIGENETIC MARKS TO REGULATION OF GENE EXPRESSION THROUGH THE 3D ORGANIZATION OF THE GENOME

    A. General principles of gene silencing. The paradigm of DNA accessibility

    During development, the determination of the cell type (cell fate) involves progressive restrictions in its developmental potency and results from differential gene expression. DNA methylation is a key control parameter of this process: genes that are specific for the desired tissue are kept unmethylated, whereas the others are methylated. Moreover, patterns of DNA methylation are faithfully propagated throughout successive cell divisions (see Sec. V.A). However, the physics of DNA methylation is still elusive and we therefore postpone further developments on DNA methylation to the last part of this review (see Sec. V.A).

    Epigenetic regulation of gene expression involves silencing, i.e. a permanent and heritable inhibition of gene transcription (transcriptional gene silencing) or translation (post-transcriptional gene silencing). The current paradigm is that gene silencing is achieved through chromatin condensation, in a so-called heterochromatinization process (Grewal and Moazed, 2003). Can we characterize the physical properties of heterochromatin and euchromatin? What are the physical consequences of heterochromatinization in terms of structure and dynamics, and how do these physical consequences translate into functional consequences?

    Histones play a crucial role in determining the structure of chromatin and are, at the same time, the substrate of a vast catalog of epigenetic markings (Cantone and Fisher, 2013; Kouzarides, 2007), which is not a coincidence. This supports the hypothesis that epigenetic histone marks modulate gene expression through chromatin structural rearrangements at each level of the nuclear organization: nucleosome, chromatin fiber, chromatin loops, chromosome territories, whole nucleus (Poirier et al., 2009; Zhou et al., 2007).

    B. Histone modifications as chromatin structural modulators

    Most epigenetic marking occurs on the histones that coat DNA. What are the physical consequences of this marking and what is its effect on chromatin organization?

    1. Histone tails and their role in internucleosomal interactions

    As already mentioned, nucleosomes are formed by wrapping DNA around an octameric protein assembly formed by histone proteins. The N-terminal sequences of H2A, H3 and H4 extend from the globular histone core to form the so-called histone tails (see Fig. 2b). The H3 and H4 tails consist respectively of 35 and 20 residues, of which respectively 13 and 9 are positively charged (lysines, K, and arginines, R). These tails are intrinsically disordered protein domains, hence adopt a random coil configuration, as suggested by crystallographic studies (Davey et al., 2002; Luger et al., 1997) and proteolytic cleavage assays. Tails contribute differently to intranucleosomal stability and internucleosomal interactions (Allan et al., 1982; Arya and Schlick, 2006, 2009; Sinha and Shogren-Knaak, 2010; Zhou et al., 2007). The two H3 tails exit from the histone core close to the DNA entry-exit site of the nucleosome, and associate preferentially with DNA to “lock” its wrapping around the histone core. The H4 tails are known to associate with a set of seven residues referred to as the H2A/H2B acidic patch, located on the H2A-H2B interface (see Fig. 2a). An H4 tail on one nucleosome may interact with an H2A/H2B acidic patch on an adjacent nucleosome, acting as a tether connecting the two nucleosomes (Kalashnikova et al., 2013; Kan et al., 2009). The H2A and H2B tails, much shorter than their H3 and H4 counterparts and the subject of a much smaller literature, do not seem to significantly contribute to internucleosome interactions, although they are required for proper nucleosome reconstitution (Bertin et al., 2007a).

    2. Histone tail post-translational modifications (PTMs)

    Histone tails, besides their role in the structuring of nucleosome arrays, are also the support of virtually all PTMs targeting histones, which consist in replacing a group of atoms on one residue by another, chemically different one (see Fig. 5). For a historical account of their discovery, see (Morange, 2013). The globular histone core and the lateral surface of the nucleosome may also undergo post-translational modifications, which modulate the nucleosome stability and DNA wrapping (Tessarz and Kouzarides, 2014; Tropberger and Schneider, 2013), hence chromatin architecture. The repertory of histone-tail PTMs is vast both in terms of types of modifications and in terms of where the modification can take place (Fierz, 2014; Pepenella et al., 2013; Zentner and Henikoff, 2013).

    In order to reach a comprehensive physical picture we oversimplify the daunting complexity of epigenetic histone PTMs (Kouzarides, 2007) to focus here on:

    (i) lysine methylation and notably the two main histone PTMs that are involved in gene silencing: tri-methylation of the lysine 9 of H3, noted H3K9me3, which recruits HP1, and tri-methylation of the lysine 27 of H3, noted H3K27me3, which recruits the Polycomb architectural complex;

    (ii) lysine acetylation and specifically the acetylation of lysine 16 of H4, H4K16ac, which is a hallmark of active chromatin (actively expressed genes).

    Epigenetic marks are deposited on or removed from histone tails by dedicated enzymes, so-called “writers” and “erasers” (Fierz, 2014). Writers devoted to acetylation are histone acetyltransferases (HAT), notably lysine acetyltransferases (KAT), and writers devoted to methylation are histone methyltransferases (HMT), notably lysine methyltransferases (KMT). Erasers are histone deacetylases (HDAC) and histone demethylases (HDM), notably lysine demethylases (KDM), see Fig. 5.

    A wealth of data exists regarding the presence of histone tail modifications in different species, developmental stages and cell types – the so-called epigenome – but efforts for characterizing the effect of histone PTMs are currently limited by the difficulty of examining in vivo chromatin structure. Interestingly, the two main modifications discussed here – lysine acetylation and lysine methylation – seem to act on the chromatin architecture and state of activity through rather different mechanisms. In the case of acetylation, a direct effect on nucleosome-nucleosome interactions is at play, with a certain but subtle relationship with the associated loss of a positive charge (see Sec. III.B.3). In contrast, methylation preserves electric charges, while introducing significant steric hindrance and potentially hydrophobic interactions, and mainly acts on chromatin indirectly by recruiting additional architectural proteins (see Sec. III.B.5). For these reasons, acetylation mechanisms are more easily studied by in vitro experiments, while methylation effects are more generally studied in the in vivo context in the presence of their multiple partners. We will now sum up some of the main experimental results and theoretical interpretations concerning both these PTMs.

    FIG. 5 Chemical formulas of unmodified, acetylated, and mono-, di- and tri-methylated lysine residues. Methylation (from lysine to the left) is achieved by successive additions of single methyl groups, using the SAM metabolite as a source of methyl groups. Acetylation (from lysine to the right) is achieved using the acetyl-coA cofactor as a source of acetyl groups. Acetylation reduces the charge at biological pH, whereas methylation preserves the charge. Labels in the figure: lysine, acetyl-lysine, monomethyl-lysine, dimethyl-lysine, trimethyl-lysine; histone acetyltransferase / lysine acetyltransferase (HAT, KAT) and histone deacetylase (HDAC), with ac-coA and coA; histone methyltransferase (HMT) and histone demethylase (KDM), with SAM and SAH.

    3. Histone tail acetylation: direct effects on chromatin accessibility

    a. Experiments. Experimental studies of the role of histone tail acetylation in the architecture of nucleosomal arrays are conducted using reconstituted, in vitro chromatin. In this approach, nucleosomes are reconstituted by incorporating recombinant histones with tailored amino-acid sequences on tandem repeats of a DNA sequence with very high histone affinity (the so-called “601 sequence”). The sedimentation coefficient of such arrays is then measured as a proxy for their folding propensity, comparing the sedimentation coefficient of arrays with or without combinations of histone tail acetylation (Allahverdi et al., 2011; Liu et al., 2011; Shogren-Knaak et al., 2006; Wang and Hayes, 2008). In addition, small-angle X-ray scattering assays on folded nucleosome arrays give estimations of internucleosome interaction energies (Bertin et al., 2007c; Howell et al., 2013). Taken together, these studies show that H4 tail acetylation decreases internucleosomal intra-array associations (Hizume et al., 2010).

    Acetylation of lysine 16 of histone H4 (H4K16ac) has the strongest effect in this regard, and may lead to massive disruption of dense chromatin fibers in vitro (Shogren-Knaak et al., 2006). Structural effects of H4K16 acetylation on chromatin compaction are also confirmed by the observation of a weakening of chromatin packing in vivo (Shahbazian and Grunstein, 2007), and are in general associated with actively transcribed genes (e.g., (Taylor et al., 2013)).

    Surprisingly enough, histone H3 acetylation, which also reduces the charge of the tails, does not seem to modify the folding propensity of nucleosome arrays (Wang and Hayes, 2008), pointing to a specific mechanism of H4K16 acetylation.

    b. Models. Experimental studies are often combined with computational models to provide deeper insights into how the electrostatic nature of histone tail PTMs influences chromatin folding.

    Potoyan & Papoian (Potoyan and Papoian, 2012) addressed the question of the decompaction induced by H4K16 acetylation, and carried out all-atom simulations in explicit solvent to compare the conformation of the H4 tail with and without this modification. For the isolated histone tails, H4K16ac leads to slightly more compact and significantly more structured globular H4 tails. At this level, compaction is not surprising since the net charge reduction weakens self repulsion between the tail residues. When DNA is present, i.e. when the entire nucleosome is considered, tails have a similar behavior: acetylated tails are more compact, less fluctuating, and are more frequently bound to their own nucleosomal DNA. However, the less charged acetylated tail interacts much more strongly (∼ 5−6 kBT) with DNA than the unmodified one (∼ 2 kBT), in contrast to what is expected on electrostatic grounds. This counterintuitive effect is achieved thanks to an important tail reorganization that brings other lysines closer to DNA. While the overall electrostatic attraction is basically unchanged, the collapse of the tail is favored by hydrophobic interaction and entropic gain. In contrast, unmodified H4 tails are more extended and flexible. They showed a preferential interaction with linker DNA (Angelov et al., 2001) and with an acidic patch exposed on the surface of the H2A/H2B dimers of neighboring nucleosomes (Zhou et al., 2007) (see Figs. 2a and 6a). Hence, while unmodified H4 tails may contribute to the nucleosome-nucleosome attractive interaction by the so-called “tail bridging” effect (Mühlbacher et al., 2006), the acetylation of lysine 16 might oppose this effect, leading to weakened nucleosome–nucleosome interactions (Potoyan and Papoian, 2012) (see Fig. 6b,c,d). Of note, this is qualitatively consistent with experiments on the disordered C-terminal tail of the p53 protein, where a significant increase of its site-specific DNA binding upon acetylation is observed both in vitro and in vivo (Luo et al., 2004).

    FIG. 6 Nucleosome arrays and histone PTMs. (a) Cartoon of a nucleosome core particle (histone core in yellow, DNA in blue). (b-d) Examples of chromatin structural modulation by histone PTMs. Right: H4K16ac decreases nucleosome stacking by preventing the H4 tail from binding the H2A/H2B acidic patch. Global acetylation removes positive charges on H4 tails and decreases the electrostatic screening of NCP electrostatic repulsion. Middle: H3 methylation recruits chromatin-associated proteins to form heterochromatin (e.g. H3K9me3 and HP1, H3K27me3 and the Polycomb family complexes). Left: acetylation of H3 tails decreases their affinity for nucleosomal or linker DNA and reduces the electrostatic screening of DNA negative charges, leading to changes in the mechanical properties of the linker and accessibility of nucleosomal DNA for other proteins. Labels in the figure: H3 tails, H4 tails, H2A/H2B acidic patch, acetylation, (tri)methylation, H4K16ac, global acetylation level, H3 acetylation, H3K9me3 and HP1, H3K27me3 and Polycomb.

    Other computational models generally rely on coarse-grained approximations of the nucleosome core particle and linker DNA which integrate the mechanical dynamics of the nucleosome as well as its distribution of charges. Arya & Schlick used their Discrete Surface Charge Optimization framework to provide estimations of the contribution of tails to electrostatic interaction energies, showing that H3 tails principally screen the negative charge of linker DNA, while H4 tails mediate internucleosomal interactions (Arya and Schlick, 2006, 2009), in agreement with previous experimental findings. However, these studies do not compare interaction energies with or without histone PTMs. Several other coarse-grained models have been used so far to specifically investigate histone tail acetylation (Allahverdi et al., 2011; Liu et al., 2011; Yang et al., 2009), showing that the effect of PTMs also largely depends on the valency and the concentration of bulk counterions, consistent with sedimentation assays.

    4. H4K16 acetylation is a silencing mark in budding yeast

    In budding yeast, and this is specific to budding yeast, silencing is not achieved by histone methylation. Instead, heterochromatin is induced by SIR (Silent Information Regulatory) complexes which are recruited by deacetylated nucleosomes, crucially relying on H4K16 (Dayarian and Sengupta, 2013) (see below Sec. IV.C.1).

    5. Histone tail methylation: indirect effects on chromatin condensation

    In animals, notably in drosophila and mammals, silencing is mainly achieved through histone tail methylation which, as mentioned above, does not directly induce chromatin fiber compaction (a notable exception was reported in (North et al., 2014)) but leads to the recruitment of additional architectural proteins, typically heterochromatin proteins.

    Importantly, such architectural proteins are included in the set of proteins that have been used to define the chromatin colors in drosophila (Filion et al., 2010). Precisely, chromatin colors are specific combinations of epigenetic marks and associated proteins belonging to the following set: histone-modifying enzymes, proteins that bind specific histone modifications, general transcription machinery components, nucleosome remodelers, insulator proteins, heterochromatin proteins, structural components of chromatin, and a selection of DNA binding factors (Filion et al., 2010). Histone tail methylation seems therefore to act as a (region specific) substrate to recruit (non specific) proteins. In turn, these proteins induce different chromatin-chromatin interactions in different regions, and eventually different chromatin folding leading in particular to different compaction degrees (see Sec. III.C.1).

    There are various kinds of heterochromatin in animals (e.g. black, blue and green chromatin in drosophila; even more “colors” in mammals). We focus here on the physical mechanisms that drive the two main silencing processes in animals, namely the recruitment and spreading of HP1 (Heterochromatin Protein 1) by the H3K9me3 mark (Azzaz et al., 2014; Hathaway et al., 2012) and the recruitment of the Polycomb architectural complex (PcG) by the H3K27me3 mark (Tie et al., 2009). We moreover discuss the role of these architectural proteins in the physical process of heterochromatinization.

    Unlike for acetylation, results obtained in vitro using reconstituted chromatin arrays are not directly transferable to in vivo contexts for at least two reasons: (i) lysine methylation has no direct physical effect (recall that, unlike lysine acetylation, lysine methylation does not change electric charges); instead, lysine methylation is recognized as a biochemical tag by dedicated chromatin proteins, either architectural (Gosalia et al., 2014; Mulligan et al., 2015; Ong and Corces, 2014; Zentner and Henikoff, 2013) or remodeling proteins (Becker and Hörz, 2002); (ii) there is considerable cross-talk among histone tail PTMs (Bannister and Kouzarides, 2011; Kouzarides, 2007; Li and Shogren-Knaak, 2008), which can then form networks comparable to signaling pathways, eventually resulting in a structural effect. An example of such a pathway is given by (Wilkins et al., 2014) in the context of budding yeast cell division, where phosphorylation of Serine 10 of the H3 tail induces H4K16 deacetylation, which eventually leads to chromatin compaction.

    a. HP1-mediated heterochromatin. The proteins of the Heterochromatin Protein 1 (HP1) family are fundamental components of heterochromatin. They are abundant at the centromeres and telomeres (which correspond roughly, as we have seen, to central and ending regions of the chromosomes, respectively) in nearly all eukaryotes.

    They display high binding affinity for the H3K9me3 mark and are therefore specifically targeted to nucleosomes harboring this mark. However, the spreading of HP1 along an H3K9me3 epigenetic domain is still a matter of debate. Thus, in the latest special issue of JPCM (Everaers and Schiessel, 2015), devoted to the physics of chromatin, two contrasting models have been proposed: the group of Andrew Spakowitz (Mulligan et al., 2015) claims that bridging interaction between HP1 dimers is critical for HP1 spreading, at odds with the group of Karsten Rippe (Teif et al., 2015), who claims that the binding of one HP1 dimer can stabilize a stacked nucleosome conformation and facilitate the binding of a second dimer via an allosteric change of the nucleosome substrate, with no need for a direct interaction between neighboring HP1 dimers. It is to be noted that both groups could reproduce the in vitro binding curves of the yeast analog of HP1 (Swi6) on mono- and dinucleosomes as well as on arrays of nucleosomes. Moreover, Spakowitz’s group claims that HP1 bridging interaction between different chromatin fibers explains the phase separation of hetero- and euchromatin (Mulligan et al., 2015), whereas Rippe’s group evidenced a dependence of the binding stoichiometry on the NRL (nucleosome repeat length) due to allosteric cooperativity of binding for nucleosome arrays with long but not with short DNA linkers, pointing to a facilitated spreading of HP1 on long NRL substrates.

    b. Polycomb-mediated heterochromatin. Polycomb proteins are a family of proteins that mediate transcriptional silencing (Di Croce and Helin, 2013; Simon and Kingston, 2013). In drosophila, it was found that two distinct regulatory complexes (PRC1 and PRC2) are able to silence the Hox genes in a stable and inheritable way (Beuchle et al., 2001; Paro et al., 1998). This provides a mechanism for “cellular memory” (Ringrose and Paro, 2004) that has been speculated to be an alternative to DNA methylation (Bird, 2002).

    The precise mechanism underlying the heritability of the repressed state of genes silenced by the Polycomb complexes is still debated. It is known that the repressive histone mark H3K27me3 (see Sec. III.B.5) is deposited by the PRC2 complex. In turn, H3K27me3 recruits PRC1, which then induces histone H2AK119 ubiquitination. However, recent studies showed that this relationship may also work in the opposite sense (Blackledge et al., 2014; Cooper et al., 2014). It has also been suggested that in X chromosome inactivation (see Sec. V.C), histone ubiquitination and Polycomb proteins are mechanistically related to propagate the silenced state (de Napoles et al., 2004).

    A physical model of the cross-talk between histone marks and the Polycomb complexes would be useful and is, to the best of our knowledge, still missing.

    C. How epigenetic marks organize the chromosomes in the cell nucleus. General rules. Physical modeling of epigenome wide studies.

    1. Epigenome wide studies

    One of the current paradigms in the field is that the epigenetic landscape is driving the 3D genome folding and by extension the functional state of the cell. In order to tackle this issue at the genome scale, epigenomic techniques based on Next Generation Sequencing (NGS) are increasingly used (Rivera and Ren, 2013). These techniques are commonly used to map accessibility, protein binding sites, and biochemical modification of histones or DNA along the linear genome (e.g. in drosophila, Fig. 7a). A new technique, genome-wide Chromosomal Conformation Capture (Hi-C), has been developed in order to address the issue of genome 3D folding using NGS. This technique allows the generation of a list of pairwise contacts between distal parts of the genome in various organisms and cell types (e.g. in drosophila, Fig. 7b, (Sexton et al., 2012)). The first results have confirmed the physical segregation of the genome into heterochromatin and euchromatin regions (Lieberman-Aiden et al., 2009). At finer scales, Hi-C also led to the identification of domains along the genome within which contacts are numerous, whereas very few contacts are established between different domains. These regions are termed topologically associating domains, TADs (Dixon et al., 2012; Nora et al., 2012). TADs can be seen as high intensity blocks along the diagonal of the chromosomal contact maps (Fig. 7b, cyan squares). Combining Hi-C results with the linear epigenomic annotations of the genome (i.e. the biological information of the underlying sequences) is in principle a powerful method to comprehend the functional architecture of the genome.
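    To make the picture of a TAD as a high-intensity diagonal block concrete, the following toy sketch builds a synthetic block-structured contact map and compares the average contact value within and between domains. The domain sizes, the contact values `p_intra` and `p_inter`, and the function names are illustrative assumptions; this is not Hi-C data, nor the analysis pipeline of any of the cited studies.

```python
# Toy illustration only: a synthetic, block-structured "contact map".
def toy_contact_map(domain_sizes, p_intra=0.8, p_inter=0.1):
    """Return (labels, matrix) with high contact values inside each domain."""
    labels = [d for d, size in enumerate(domain_sizes) for _ in range(size)]
    n = len(labels)
    matrix = [[p_intra if labels[i] == labels[j] else p_inter
               for j in range(n)] for i in range(n)]
    return labels, matrix


def mean_contacts(labels, matrix):
    """Average off-diagonal contact value within and between domains."""
    intra, inter = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            target = intra if labels[i] == labels[j] else inter
            target.append(matrix[i][j])
    return sum(intra) / len(intra), sum(inter) / len(inter)


# Three hypothetical domains of 4, 6 and 5 monomers.
labels, cmap = toy_contact_map([4, 6, 5])
intra_mean, inter_mean = mean_contacts(labels, cmap)
print(f"intra-domain mean: {intra_mean:.2f}, inter-domain mean: {inter_mean:.2f}")
```

    In real contact maps the contrast between the two averages is of course noisy and depends on genomic distance; the sketch only shows the block structure that the term TAD refers to.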

    Several physical models have been developed so far in order to understand the 3D folding properties of block copolymers. The main goal of these studies is to recover the chromosomal contact maps observed from the Hi-C data. Two main classes can be distinguished: simulations that explicitly compute the 3D chromosome conformations (Barbieri et al., 2012; Benedetti et al., 2014) and implicit models in which average contact maps are directly computed in a self-consistent Gaussian approximation (Jost et al., 2014). The different explicit models can account for the formation of TADs, either by preferential binding of co-factors along specific regions of the genome (Barbieri et al., 2012) or by topological constraints (Benedetti et al., 2014), but so far the direct comparison with experimental results has only been done using the implicit approach (Jost et al., 2014). In this study, the authors use the previously described chromatin colors of drosophila (Filion et al., 2010), which, as discussed previously, assign to each subregion of the genome an epigenetic “color”, based on the specific protein binding and histone marks found in this region (Fig. 7a). They then assign specific pair-potentials between beads of the same or different colors (Fig. 7c) and compute the corresponding contact maps using a statistical approach previously described (Timoshenko et al., 1998). With well-chosen parameters, they were able to retrieve the contacts found experimentally (Fig. 7d). An important outcome of their study is that a fixed epigenetic landscape is compatible with several 3D conformations of the chain, a phenomenon which they call multistable folding (see below “the physics of TADs”).

    2. The physics of TADs: finite-size effects in the coil-globule transition of copolymers

    Jost et al. (Jost et al., 2014) show a phase diagram of a toy model copolymer as a function of the intensity of (i) block-specific and (ii) non-specific interactions, which we reproduce in Fig. 8. On top of the coil-globule transition of the whole copolymer, there is also a coil-globule transition restricted to each separate block. Importantly, both coil and globule phases coexist in a region of the phase diagram, the size of which depends on the (average) size of the blocks. This is consistent with the finite-size scaling analysis of the coil-globule transition which has been proposed in (Caré et al., 2014) (see also arXiv:cond-mat/0004273).

    Let us show that both transitions, namely the coil-globule transition inside a given block and the segregation of different blocks of the same color into separated microphases, overlap in the phase diagram because of finite-size effects.

    We first recall that a polymer of N monomers, with monomer-monomer attractive interactions, undergoes a coil-globule transition around the critical temperature $\Theta(N) = \Theta\,(1 - b\sqrt{\ln(N)/N})$, where b is a dimensionless prefactor of order unity. More precisely, there is an equilibrium between coil and globule conformations over a temperature range between $\Theta(N) - a/\sqrt{\ln(N)}$ and $\Theta(N) + a/\sqrt{\ln(N)}$, where a is a dimensionless prefactor of order unity. At T = Θ(N) both coil and globule conformations are in equal proportions. Therefore, at a given temperature T, longer polymers are more globular than shorter polymers of the same kind.

    FIG. 7 Modeling of chromosomal contact maps from the epigenomic landscape. (a) Profiles of H1 occupancy, DNA accessibility, H3K27me3, H3K4me3, HP1 and a histone modifier, Su(Hw), along a region of the Drosophila chromosome 3R. At the bottom, the colors corresponding to these profiles are shown. Yellow and red correspond to active chromatin, blue to Polycomb-bound regions (Filion et al., 2010). (b) The corresponding contact map (Sexton et al., 2012). (c) Schematics of the copolymer model used in (Jost et al., 2014). (d) Two predicted contact maps corresponding to the region indicated by the pink dashed square in (b).

    FIG. 8 Phase diagram of a toy model copolymer, as a function of specific and non-specific interactions. Figure taken from (Jost et al., 2014). Axes: specific interaction Us and non-specific interaction Uns. Regions: (a) coil, (b) globule, (c) microphase separation, (d) multistability. Insets show contact probability maps (color scale from low to high, monomer indices 1 to 120) for (a), (b), (c), (d1) and (d2).

    We then consider a copolymer ABAB... made of small blocks A and long blocks B, with monomer-monomer attractive interactions represented by an energy of interaction $E_{ij}$ between monomers with epigenetic states i and j of the following kind: $E_{ij} = U_{ns} + \delta_{ij} U_s$, where $U_{ns}$ is a non-specific term (it does not depend on i and j), $\delta_{ij}$ is the Kronecker delta, and $U_s$ is a specific interaction term. According to the preceding results on the coil-globule transition of finite-size polymers, long blocks B go into globules when small blocks A are still coils. When lowering the temperature (or equivalently increasing the interactions), blocks B start to transiently bind together into a macroglobule: this is now the coil-globule transition of the whole copolymer, which is equivalent to a chain of B globules separated by A linkers; and while this chain collapses (folds) the A linkers start to go into globules, so that both transitions overlap.
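    The interaction rule above can be written down in a few lines. The sketch below builds the matrix of pair energies for a short ABAB... chain; the block sizes, the numerical values of `U_NS` and `U_S` (arbitrary units) and the function names are illustrative assumptions, not parameters taken from (Jost et al., 2014).

```python
def interaction_energy(state_i, state_j, u_ns, u_s):
    """E_ij = U_ns + delta_ij * U_s for two monomers with given epigenetic states."""
    delta = 1.0 if state_i == state_j else 0.0  # Kronecker delta on the states
    return u_ns + delta * u_s


def block_copolymer(n_repeats, size_a, size_b):
    """Sequence of monomer states for an ABAB... chain (small A, long B blocks)."""
    sequence = []
    for _ in range(n_repeats):
        sequence.extend(["A"] * size_a)
        sequence.extend(["B"] * size_b)
    return sequence


# Assumed, purely illustrative values: both terms attractive (negative).
U_NS, U_S = -0.1, -0.3
chain = block_copolymer(n_repeats=2, size_a=2, size_b=5)

# Pair-energy matrix; the diagonal (a monomer with itself) has no physical meaning.
energy = [[interaction_energy(si, sj, U_NS, U_S) for sj in chain] for si in chain]

print("A-A:", energy[0][1], " A-B:", energy[0][2], " B-B:", energy[2][3])
```

    Monomers of the same state attract with $U_{ns} + U_s$, monomers of different states only with $U_{ns}$, which is what distinguishes block-specific from non-specific interactions in the phase diagram.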

    Importantly, the macroglobule fluctuates between coil and globule conformations (as does any B globule), so that it transiently dissociates, thus permitting the small A blocks, even in remote locations on the genome, to come transiently into contact (see fig. 5). This corresponds to the multistate folding region calculated by Jost et al. and depicted in Fig. 8. Note that the width of this multistate folding region varies as 1/ln(n), where n is the typical size of the small(est) blocks.

    Below the lower critical temperature $\Theta(N) - a/\sqrt{\ln(N)}$, all the B globules are permanently collapsed in a macroglobule with the A blocks located at the macroglobule surface (because of interfacial tension). Crucially, the A blocks are still coils, hence in the euchromatin phase, so that their genomic sequence is expressed, whereas the B blocks are globular and as such in the heterochromatin phase, hence their underlying sequence is repressed.

    IV. PHYSICAL MECHANISMS INVOLVED IN THE INITIATION, SPREADING, MAINTENANCE AND HERITABILITY OF EPIGENETIC MARKS

    Stem cells are capable of differentiating to the desired fate depending on the tissue. Dramatic changes in gene expression occur during development. These changes are then stabilized and become heritable. Epigenetic modifications take part in initiating, stabilizing and propagating the patterns of gene expression. Gene regulation by epigenetic modifications is indeed stably propagated through cell divisions (and, in some cases, across generations). At each cell division, the whole DNA is replicated. Chromosomes then consist of two sister chromatids which both have identical genetic information, joined together at their centromere. Then, during mitosis, the two chromatids are separated and segregated into the two nuclei of the daughter cells.

    Eukaryotic replication involves both DNA synthesis and chromatin assembly. As the two double helices are synthesized from the two single strands of the mother-cell DNA, nucleosomes on the mother-cell DNA strand should also be distributed to both daughter double helices, and completed by de novo nucleosome assembly. In order to ensure the transmission of epigenetic marks to daughter cells, mother-cell nucleosomes should be shared by both newly formed chromosomes, even if the detailed mechanisms of this distribution are still debated (MacAlpine and Almouzni, 2013).

    While it is clear that histone modifications are involved in gene silencing, hence gene regulation, the questions of how epigenetic marking is initiated, how it may spread over specific chromosome regions (and not beyond), and how it can be stably maintained along the cell cycle and through cell division are still under investigation. In this section, we will review the main modeling efforts that have been made in order to address these questions.

    A. Mathematical modelling

    Many recent theoretical works addressed the question of how epigenetic marks are initiated, spread, and maintained. The main objective of these models is to reproduce a few essential features observed in vivo: (a) the multistability of the epigenetic marks; (b) their spatial patterns and (c) their heritability.

    By multistability, it is generally meant that the epigenetic marks act as switches between different functional states. In the simplest case, different patterns of epigenetic marks allow switching between two states that have a well-defined functional characterization (bistability). Such functional states are then inherited by the daughter cells through mitosis, which is what we call heritability. As observed in genome-wide studies, the epigenetic patterns correspond to distinct epigenomic domains that are separated by boundaries (see Sec. III.C).

    We consider a system of N nucleosomes that can be in $n_S$ different states. In the simplest case, $n_S = 2$ and one refers to “modified” or “unmodified” states, which can be related to active or inactive genes.

    The state of the system is described by the variables $\{s_1, s_2, \ldots, s_N\}$, where $s_i$ is the state of nucleosome i. If we define $n_j$ as the number of nucleosomes in the state j, then one can write the conservation of the number of nucleosomes as

    $$\sum_{j=1}^{n_S} n_j = N \qquad (1)$$
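    As a minimal illustration of this bookkeeping, the sketch below counts the occupation numbers $n_j$ from a list of nucleosome states and checks the conservation relation (1). The example state list and the function name `occupation_numbers` are hypothetical.

```python
from collections import Counter


def occupation_numbers(states):
    """Return n_j, the number of nucleosomes found in each state j."""
    return Counter(states)


# Hypothetical system of N = 8 nucleosomes with n_S = 2 states:
# "M" (modified) and "U" (unmodified).
states = ["U", "M", "M", "U", "M", "U", "U", "M"]
n = occupation_numbers(states)
total = sum(n.values())

# Conservation of the number of nucleosomes, Eq. (1): sum_j n_j = N.
print(dict(n), "sum =", total, "N =", len(states))
```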

    Many theoretical works use the silenced mating-type locus of the fission yeast Schizosaccharomyces pombe (reviewed in (Grewal and Elgin, 2002)) as a model system. In this system, the region containing the two mating-type regions is normally “silenced”, i.e. not expressed. The expression of the mating-type genes may become bistable in mutants, flipping between a silenced state and an active state (Grewal and Klar, 1996; Thon and Friis, 1997). Each state is stable and heritable; transitions between them occur apparently stochastically. The S. pombe HMT, HDAC and other proteins are necessary for silencing, and all may bind H3K9me directly or indirectly.

    In the following, we review the models of this behaviorproposed so far.

    B. Zero-dimensional models

    In zero-dimensional models, neither the spatial organization of the $N$ nucleosomes nor the notion of distance is introduced. In general, the model consists of rate equations describing how the variables $n_j$ vary as a function of time, and the objective of the models is to show how bistable or multistable states can appear. In this class of models, the initiation of the epigenetic mark is implicitly defined as the initial state of the dynamical system, and the spreading is described as the time evolution of the initial state. Mitosis can be modeled as an instantaneous process in which the concentrations of all species (modified and unmodified nucleosomes) are diluted and the system restarts. The dilution is due to the sharing of mother-cell nucleosomes between both daughter cells. Nucleosomes are not necessarily shared in equal parts between daughter chromosomes, but this may be assumed without loss of generality, as is done for convenience in most models.

    We can write a general expression for the time evolution of the variables $n_j$:

    $$\frac{dn_j}{dt} = \sum_{k \neq j}^{n_S} R^+_{kj}\, n_k - \sum_{k \neq j}^{n_S} R^-_{jk}\, n_j + \text{noise} \tag{2}$$

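    Here, $R^+_{kj}$ is the rate of transition of nucleosomes from the state $k$ to the state $j$, while $R^-_{jk}$ is the rate of transition from state $j$ to state $k$. Both are non-negative rates labelled by their source and destination states: the same process $k \to j$ enters the equation for $n_j$ as a gain and the equation for $n_k$ as a loss, which guarantees the conservation law (1). In general, these coefficients are not constant, but depend on the other dynamical variables. The “noise” term may be included to describe the effect of stochastic processes involved in the system.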
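    As an illustration of how the deterministic part of equation (2) can be integrated, the following is a minimal sketch, not taken from any of the reviewed papers. The function names, the explicit Euler scheme and the constant two-state rates in the usage example are assumptions made for illustration; `rate_matrix(n)[k, j]` is taken to be the per-nucleosome rate of the transition k → j.

```python
import numpy as np

def rate_equation_rhs(n, rate_matrix):
    """Deterministic part of equation (2), without the noise term.

    rate_matrix(n) must return a square array R where R[k, j] is the
    per-nucleosome rate of the transition k -> j (possibly depending on n).
    """
    R = np.array(rate_matrix(n), dtype=float)
    np.fill_diagonal(R, 0.0)          # no k -> k transitions
    gain = R.T @ n                    # sum over k of R[k, j] * n[k]
    loss = R.sum(axis=1) * n          # n[j] * sum over k of R[j, k]
    return gain - loss

def integrate(n0, rate_matrix, dt=0.01, n_steps=10000):
    """Explicit Euler integration (an illustrative choice of scheme)."""
    n = np.array(n0, dtype=float)
    for _ in range(n_steps):
        n = n + dt * rate_equation_rhs(n, rate_matrix)
    return n

# Hypothetical two-state example with constant rates: 0 -> 1 at 0.3, 1 -> 0 at 0.1.
def constant_rates(n):
    return [[0.0, 0.3],
            [0.1, 0.0]]

n_final = integrate([100.0, 0.0], constant_rates)
```

    Because every gain term has a matching loss term, the total $\sum_j n_j$ stays equal to $N$ during the integration, up to rounding errors.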

    The simplest possible model of this kind was proposed by Micheelsen et al. (Micheelsen et al., 2010). The authors consider the case of $n_S = 2$, that is, they consider only a modified (M) or unmodified (U) state. Using equation (1), the system may be described by only one variable $n_M$, the number of modified nucleosomes (normalized by $N$ in the rates below, so that $1 - n_M$ is the unmodified fraction). The transition rates are given by

    $$\begin{aligned} R^+_{UM} &= \alpha\, n_M^2 + (1-\alpha) \\ R^-_{MU} &= \alpha\, (1-n_M)^2 + (1-\alpha). \end{aligned} \tag{3}$$

    This model supposes that a nucleosome changes state either through a cooperative transition (as evidenced by the quadratic terms in equation (3)) or through a spontaneous conversion (described by the $(1-\alpha)$ term). Despite its simplicity, the model can account for the emergence of bistability. The parameter $F = \alpha/(1-\alpha)$ (feedback-to-noise ratio) governs the behavior of the system. For $F > 4$, three fixed points emerge in the system: $n_{M1} \approx 0$ and $n_{M3} \approx N$, which are stable, and $n_{M2} = N/2$, which is unstable. The $F$ parameter is possibly under active control by the cell, which can then regulate its function (it can notably be altered by HDAC inhibitors (Dayarian and Sengupta, 2013)). Heritability can be partially accounted for by this model, since one can speculate that cell division brings the system close to the unstable point, from which it then returns to its stable attractor.
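    The role of the threshold $F = 4$ can be checked with a few lines of code. The sketch below is an illustration under stated assumptions: it reads equation (3) as a deterministic mean-field rate equation for the fraction $n$ of modified nucleosomes (the original model is stochastic), and the function names are hypothetical. Under this reading the drift factorizes as $(1-2n)\,[(1-\alpha) - \alpha n(1-n)]$, so the outer fixed points sit at $n_\pm = \tfrac{1}{2}\left[1 \pm \sqrt{1 - 4/F}\right]$ and approach 0 and 1 only for large $F$.

```python
import math

def drift(n, alpha):
    """dn/dt for the modified fraction n, using the rates of equation (3)."""
    rate_u_to_m = alpha * n**2 + (1.0 - alpha)
    rate_m_to_u = alpha * (1.0 - n)**2 + (1.0 - alpha)
    return rate_u_to_m * (1.0 - n) - rate_m_to_u * n

def fixed_points(alpha):
    """Zeros of the drift: n = 1/2 always, plus two outer points when F > 4."""
    F = alpha / (1.0 - alpha)
    if F <= 4.0:
        return [0.5]
    root = math.sqrt(1.0 - 4.0 / F)
    return [0.5 * (1.0 - root), 0.5, 0.5 * (1.0 + root)]

def is_stable(n, alpha, h=1e-6):
    """A fixed point is stable when the drift decreases across it."""
    return drift(n + h, alpha) - drift(n - h, alpha) < 0.0

# alpha = 0.9 gives F = 9 (bistable); alpha = 0.5 gives F = 1 (monostable).
bistable_points = fixed_points(0.9)
monostable_points = fixed_points(0.5)
```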

    David-Rus et al. thoroughly investigated a more general model that still has $n_S = 2$ (David-Rus et al., 2009). Their rates read:

    $$\begin{aligned} R^+_{UM} &= \chi + \alpha\, n_M^H \\ R^-_{MU} &= \gamma + \eta\, (1-n_M)^K \end{aligned} \tag{4}$$

    The first interesting result they obtained is that this model can reproduce bistability only for $H, K > 1$. The simple quadratic case $H = K = 2$ is a generalisation of the model of Micheelsen et al. (Micheelsen et al., 2010), in which the cooperative transition rate from U to M is independent of that from M to U. If the basal rates $\chi$ and $\gamma$ are small, one again obtains three fixed points, with the intermediate unstable point being $n_{M2} \approx \eta/(\alpha + \eta)$. Assuming now that cell division exactly halves the concentration of modified nucleosomes for each daughter cell, if $n_{M2} > n_{M3}/2$, then the system will always fall in the basin of attraction of $n_{M1}$ after a cell division, hence the only stable point is the unmodified state $n_{M1} \approx 0$. Conversely, for $n_{M2} < n_{M3}/2$ (hence $\eta < \alpha$), the system will converge to the modified-state fixed point $n_{M3} \approx N$ for initial conditions larger than $2 n_{M2}$, and bistability becomes effective.
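    The effect of cell division on the two-state model can be illustrated numerically. The following is a sketch under assumptions that are not in the source: a deterministic mean-field reading of equation (4) for the modified fraction $n$, small illustrative basal rates, an explicit Euler scheme, and a cycle time long enough for the system to relax between divisions. All names and numerical values are hypothetical.

```python
def drift(n, alpha, eta, chi=1e-3, gamma=1e-3, H=2, K=2):
    """dn/dt for the modified fraction n, using the rates of equation (4)."""
    rate_u_to_m = chi + alpha * n**H
    rate_m_to_u = gamma + eta * (1.0 - n)**K
    return rate_u_to_m * (1.0 - n) - rate_m_to_u * n

def simulate_divisions(alpha, eta, n_cycles=10, cycle_time=50.0, dt=0.01, n_start=1.0):
    """Halve n at each division, then relax for one cell cycle; return the final n."""
    n = n_start
    steps_per_cycle = round(cycle_time / dt)
    for _ in range(n_cycles):
        n *= 0.5                      # dilution of the marks at division
        for _ in range(steps_per_cycle):
            n += dt * drift(n, alpha, eta)
    return n

# eta < alpha: the unstable point eta/(alpha+eta) lies below 1/2, the mark survives.
n_inherited = simulate_divisions(alpha=1.0, eta=0.3)
# eta > alpha: the unstable point lies above 1/2, the mark is lost after division.
n_lost = simulate_divisions(alpha=0.3, eta=1.0)
```

    In the first case the halved state stays above the unstable point and relaxes back to the modified state; in the second it falls below it and decays to the unmodified state, in line with the argument above.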

    This scenario is however modified by the presence of noise in the system. In fact, if the probability of transition from U to M is larger than the probability of transition from M to U (that is, $\eta < \alpha$), then the $n_{M1}$ fixed point is no longer stable. Noise drives the system out of the $n_M = 0$ state, and brings it to the fully modified $n_M \approx N$ state. This consideration highlights the importance of asymmetric recruitment rates.

    The same authors also considered the case of $n_S = 3$, which had already been considered by Dodd et al. (Dodd et al., 2007) in a very similar form. They introduce an “antimodified” state (A), possibly an acetylated state (active chromatin mark), that is opposed to the M state, possibly a methylated (repressive) state (see Fig. 9a). A hypothesis is that only the transitions between U and M and between U and A are allowed, while the direct M → A transition is not (i.e. $R^+_{MA} = 0$). They write the following transition rates:

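    $$\begin{aligned} R^+_{UA} &= \alpha_A n_A + \chi_A \\ R^+_{UM} &= \alpha_M n_M + \chi_M \\ R^-_{AU} &= \beta_M n_M + \gamma_A \\ R^-_{MU} &= \beta_A n_A + \gamma_M. \end{aligned} \tag{5}$$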

    The study of the system in the case where the basal rates $\chi_M$, $\gamma_M$, $\chi_A$ and $\gamma_A$ vanish already shows the existence of four fixed points: two stable fixed points, $\{a = 1, m = 0\}$ and $\{a = 0, m = 1\}$, an unstable saddle point $\{a = 0, m = 0\}$ and an unstable fixed point $\{a = \alpha_M\beta_A/(\alpha_A\beta_M + (\alpha_M + \beta_M)\beta_A),\ m = \alpha_A\beta_M/(\alpha_A\beta_M + (\alpha_M + \beta_M)\beta_A)\}$. The two latter points are aligned along the $a = m$ line and create a barrier between the two basins of attraction (David-Rus et al., 2009).
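    The fixed-point structure of the three-state model can be explored numerically. The sketch below is an illustration, not the authors' code: it assumes a deterministic mean-field reading of equation (5) with vanishing basal rates, fractions $a$, $m$ and $u = 1 - a - m$, and an explicit Euler scheme; all function names are hypothetical. With the labelling of equation (5) exactly as written here, solving $\alpha_A u = \beta_M m$ and $\alpha_M u = \beta_A a$ gives the interior fixed point $a^* = \alpha_M\beta_M/D$, $m^* = \alpha_A\beta_A/D$ with $D = \alpha_A\beta_A + \alpha_M\beta_M + \beta_A\beta_M$, which coincides with the expression quoted in the text up to an exchange of the labels $\beta_A$ and $\beta_M$.

```python
def three_state_rhs(a, m, alpha_A=1.0, alpha_M=1.0, beta_A=1.0, beta_M=1.0):
    """(da/dt, dm/dt) from equation (5) with all basal rates set to zero."""
    u = 1.0 - a - m
    da = alpha_A * a * u - beta_M * m * a   # U -> A recruited by A, A -> U recruited by M
    dm = alpha_M * m * u - beta_A * a * m   # U -> M recruited by M, M -> U recruited by A
    return da, dm

def interior_fixed_point(alpha_A, alpha_M, beta_A, beta_M):
    """Mixed fixed point for the labelling of equation (5) as written above."""
    D = alpha_A * beta_A + alpha_M * beta_M + beta_A * beta_M
    return alpha_M * beta_M / D, alpha_A * beta_A / D

def relax(a, m, t_end=100.0, dt=0.01, **rates):
    """Explicit Euler relaxation from the initial fractions (a, m)."""
    for _ in range(round(t_end / dt)):
        da, dm = three_state_rhs(a, m, **rates)
        a, m = a + dt * da, m + dt * dm
    return a, m

# Two initial conditions on either side of the a = m line (symmetric rates).
a_rich = relax(0.6, 0.3)
m_rich = relax(0.3, 0.6)
```

    Although all rates are linear in the recruiting species, the two initial conditions end in different states, which is the bistability discussed above.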

    The last model of this class that we consider is the one proposed by Jost (Jost, 2014). The author considers a special case of the three-state model outlined above:

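    $$\begin{aligned} R^+_{UA} &= \epsilon_A n_A + k_0 \\ R^+_{UM} &= \epsilon_M n_M + k_0 \\ R^-_{AU} &= \epsilon_M n_M + k_0 \\ R^-_{MU} &= \epsilon_A n_A + k_0, \end{aligned} \tag{6}$$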

    that is, it is the same model with $\alpha_{A,M} = \beta_{A,M} = \epsilon_{A,M}$ and $\chi_{A,M} = \gamma_{A,M} = k_0$. Interestingly, this particular choice allows the system to be mapped onto the zero-dimensional Ising model, with, e.g., the correspondence A = +1, U = 0, M = −1. Within this analogy, recruitment corresponds to coupling between spins and random transitions are associated with thermal fluctuations. A new observable, equivalent to the magnetization in the Ising model, is introduced here: $\mu = a - m$.

    Some known results can thus be recalled for the symmetric recruitment case $\epsilon_A = \epsilon_M = \epsilon$. Similarly to what was previously discussed, three fixed points exist. The first one, $\mu = 0$, is stable for weak recruitment, i.e. for $\epsilon < \epsilon_c = 3k_0$. Above this critical value of $\epsilon$, $\mu = 0$ becomes unstable and bistability sets in with the appearance of two stable fixed points, $\mu_\pm = \pm (k_0/\epsilon)\sqrt{(\epsilon/k_0 + 1)(\epsilon/k_0 - 3)}$.
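    The expression for $\mu_\pm$ can be recovered directly from equations (6). The following is a short derivation sketch, assuming the deterministic mean-field limit with fractions $a$, $m$ and $u = 1 - a - m$ (notation introduced here for the derivation only):

```latex
\begin{aligned}
\dot a &= (\epsilon a + k_0)\,u - (\epsilon m + k_0)\,a, \qquad
\dot m = (\epsilon m + k_0)\,u - (\epsilon a + k_0)\,m, \\
\dot\mu &= \dot a - \dot m = \mu\,(\epsilon u - k_0)
  \quad\Longrightarrow\quad \mu = 0 \ \ \text{or}\ \ u = k_0/\epsilon, \\
0 &= \dot a + \dot m = (\epsilon (a+m) + 2k_0)\,u - 2\epsilon\, a m - k_0 (a+m)
  \quad\Longrightarrow\quad a\,m = \frac{k_0^2}{\epsilon^2}
  \quad \text{for } u = k_0/\epsilon, \\
\mu_\pm^2 &= (a+m)^2 - 4\,a m
  = \left(1 - \frac{k_0}{\epsilon}\right)^2 - \frac{4k_0^2}{\epsilon^2}
  = \frac{k_0^2}{\epsilon^2}\left(\frac{\epsilon}{k_0} + 1\right)\left(\frac{\epsilon}{k_0} - 3\right).
\end{aligned}
```

    The right-hand side is positive only for $\epsilon > 3k_0$, which recovers the critical value $\epsilon_c = 3k_0$.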

    The non-local character of the nucleosome-nucleosome interaction, which is the main hypothesis of the zero-dimensional models, has been further justified by a recent work (Zhang et al., 2014). The authors proposed a two-layer Potts model in which one layer describes the nucleosomes, and the other explicitly includes the enzymes that modify the nucleosomes. The interaction between the nucleosomes is then effectively mediated by the modifying enzymes. Interestingly, by integrating out the effect of the modifiers, it is possible to prove the exact equivalence to the model proposed by Dodd et al. (Dodd et al., 2007).

    To conclude this section, we stress the main results of this comparative analysis. Bistability is obtained by this class of models in two ways: in two-state models only when nonlinear rates are included, and in three-state models even with linear rates. The reason for this is that in three-state models the transition from a modified to an antimodified state can proceed only in a two-step process, effectively requiring cooperativity, hence producing bistable states (Dodd et al., 2007).

    C. Higher-dimensional models

    An inherent limit of the models discussed above is that they cannot reproduce spatial patterning of the epigenetic marks. Hence, the limitation of mark spreading has to be included by limiting the extent of the concerned domain, i.e. the total number of nucleosomes. While this assumption may be relevant e.g. for the mating-type loci in yeast, it probably fails when multicellular organisms are considered. It is known, for example, that nearly all noncentromeric H3K9me3 domains in mouse embryonic stem cells have a peaked shape, with continuously decaying mark densities on both sides (Hathaway et al., 2012).

    1. One-dimensional models.

    Even when mark spreading is surrounded by boundaries, the question arises of how to model their presence and effects. Dodd and Sneppen pointed out in their 2011 work (Dodd and Sneppen, 2011) that positive feedback can lead to spreading of the modifications to genome regions other than the target. They refer in particular to the silent mating-type loci in budding yeast Saccharomyces cerevisiae, HML and HMR, that are able to spontaneously flip between high and low expression states (Xu et al., 2006). These domains are stable over up to 80 cell generations, and are surrounded by boundary elements that prevent silencing from spreading out of the domains. These “barriers” are specific sequences, and may simply be target sites for certain DNA-binding proteins, strong gene promoters, or nucleosome-excluding structures. Dodd and Sneppen therefore consider a model in which all nucleosomes are explicitly treated, and the long-distance interaction between nucleosomes is modeled in a “local-local”, “local-global”, or “global-global” scheme (see Fig. 9c). To limit the long-range interaction between DNA sites one can introduce a distance-dependent cooperativity, i.e. make the reaction rate $R_{UM}$ depend on the distance between nucleosomes. A power-law dependence, typical of the three-dimensional probability of contact, can be assumed.

    Then, the confinement of silenced regions can be obtained by introducing local barriers, modeled as single nucleosomes fixed in the active (A) state. Due to the local character of the modification step, a single silencing-resistant nucleosome (e.g. H3K4me3 (Venkatasubrahmanyam et al., 2007)) or a nucleosome-depleted region (notably promoters of TEF genes (Bi et al., 2004)) is enough to stop the spreading of silencing, provided that the flanking regions are entirely in the active state. However, an occasional inactivation of the barrier makes the silencing spread out. This effect can be limited by introducing regularly spaced weak barriers, modeled as anti-silencers (enhancers) of the U → A reaction, or by implementing in the model a Michaelis-Menten saturation effect when the number of U state nucleosomes increases. The combination of both effects results in robust prevention of silencing spreading.
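    The distance-dependent cooperativity mentioned above can be sketched as a rule for choosing which nucleosome recruits the modification of nucleosome $i$. The following is an illustration only: the function names are hypothetical, and the exponent 1.5 is an assumption (the contact probability of an ideal chain decays as the genomic distance to the power −3/2), not a value taken from Dodd and Sneppen.

```python
import random

def contact_weights(i, n_sites, exponent=1.5):
    """Unnormalised power-law weights |i - j|**(-exponent); zero for j == i."""
    return [0.0 if j == i else abs(i - j) ** (-exponent) for j in range(n_sites)]

def pick_recruiter(i, n_sites, exponent=1.5, rng=random):
    """Sample the index of the recruiting nucleosome for a modification at site i."""
    weights = contact_weights(i, n_sites, exponent)
    return rng.choices(range(n_sites), weights=weights, k=1)[0]

# Usage: recruiting partners of the central nucleosome of a 61-site array.
rng = random.Random(0)
partners = [pick_recruiter(30, 61, rng=rng) for _ in range(1000)]
```

    With such a kernel most recruitment events involve nearby nucleosomes, while occasional long-range contacts remain possible, which is why a single barrier nucleosome may not be sufficient on its own.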

    Still referring to the mating-type loci in budding yeast, Dayarian and Sengupta consider a four-state model with site-dependent rate equations (Dayarian and Sengupta, 2013). The fourth state they consider is the diacetylated state, which would correspond to acetylation of two H4K16 sites. Importantly, in this model, the modified state M is supposed to be a state where nucleosomes are bound to silencing (Sir) proteins, and therefore depends on their availability (concentration). In its most general form, this is a one-dimensional model that explicitly describes cooperative transitions that involve any nucleosome pair. However, it can be simplified into a zero-dimensional model when considering uniform solutions, which again show bistability and a characteristic bifurcation diagram. Moreover, such a concentration-dependent model allows for additional interesting effects, involving a fine balance between the silencing of mating-type loci, which have a definite extent, and of the telomeres, whose extent may vary depending on the protein availability (Dayarian and Sengupta, 2013).

    Focusing instead on mammalian silenced regions, Hathaway et al. were able to reproduce the sharp peaks observed in the experimental modification patterns by including a “source” term in their model (Hathaway et al., 2012). This is a model in which initiation and spreading are explicitly separated, which in turn allows spatial patterning to be reproduced. They write the reactions as

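    $$\begin{aligned} U_0 &\xrightarrow{\;k^{\mathrm{target}}_+\;} M_0 \\ \{M_i, U_{i \pm 1}\} &\xrightarrow{\;k_+\;} \{M_i, M_{i \pm 1}\} \\ M_i &\xrightarrow{\;k_-\;} U_i. \end{aligned} \tag{7}$$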

    This description means that at site 0 there is an active modification source with rate $k^{\mathrm{target}}_+$; the modification then spreads to the neighboring nucleosomes with rate $k_+$ and is lost with rate $k_-$. Fitting to experimental results leads to $k_+$ and $k_-$ rates both of the order of 0.1–0.2 h$^{-1}$ (in agreement with different experimental estimates of $k_-$).

    In Ref. (Hodges and Crabtree, 2012) a more detailed study of the model is presented. The source term ensures that the resulting mark distributions are peaked at the nucleation site, as experimentally observed, provided $k^{\mathrm{target}}_+/k_-$ is large enough ($\gtrsim 0.2$), with increased amplitude and formation rate for increasing $k^{\mathrm{target}}_+$.
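    The peaked profile produced by the reactions of equation (7) can be illustrated with a small stochastic simulation. The sketch below is not the implementation used in the cited works: it assumes a finite array with open ends, a synchronous discrete-time update with time step `dt` (an approximation of the continuous-time process), and illustrative parameter values; `k_target` in particular is a hypothetical choice, while `k_plus` and `k_minus` are taken within the 0.1–0.2 h$^{-1}$ range quoted above.

```python
import numpy as np

def simulate_source_model(n_sites=41, k_target=1.0, k_plus=0.12, k_minus=0.15,
                          dt=0.1, n_steps=20000, burn_in=2000, seed=0):
    """Time-averaged fraction of time each site is marked (rates in 1/h)."""
    rng = np.random.default_rng(seed)
    source = n_sites // 2                    # nucleation site ("site 0" in the text)
    marked = np.zeros(n_sites, dtype=bool)   # True = M, False = U
    occupancy = np.zeros(n_sites)
    p_loss = 1.0 - np.exp(-k_minus * dt)     # probability of M -> U in one step
    for step in range(n_steps):
        # Number of marked nearest neighbours, with open (non-periodic) ends.
        neighbours = np.zeros(n_sites)
        neighbours[1:] += marked[:-1]
        neighbours[:-1] += marked[1:]
        gain_rate = k_plus * neighbours
        gain_rate[source] += k_target        # targeted nucleation at the source
        p_gain = 1.0 - np.exp(-gain_rate * dt)
        draw = rng.random(n_sites)
        # Marked sites survive unless they lose the mark; unmarked sites may gain it.
        marked = np.where(marked, draw >= p_loss, draw < p_gain)
        if step >= burn_in:
            occupancy += marked
    return occupancy / (n_steps - burn_in)

profile = simulate_source_model()
```

    With these illustrative rates the averaged profile is highest at the source and decays with distance on both sides, qualitatively like the peaked domains described above.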

    2. Three-dimensional models.

    Erdel et al. (Erdel et al., 2013) addressed some more specific questions about the establishment of epigenetic domains: how chromatin-modifying enzymes are targeted to or excluded from given chromatin regions, how exactly the modification can propagate from one nucleosome to another, and how this state is reestablished or maintained during replication. The proposed model focuses on the permanent binding of enzymes to a scaffold, either on chromatin itself or on the nuclear membrane, which defines a limited chromatin region that can interact with the enzyme by short-range diffusion. The spatial distribution of the enzyme may hence result in a spatially limited enzymatic activity, and thereby in the definition of epigenetic domains. This first attempt to take the chromatin architecture into account in a three-dimensional model is noteworthy, despite the difficulty in estimating many of the geometrical and physical parameters involved in the model, such as the linear base-pair density along the chromatin fiber, the fiber stiffness, or the local nucleosome density. Moreover, the question of how the correct architecture is set up during the initial enzyme binding and the definition of the functional chromatin domains remains open.

    D. Biological relevance of the models

    In this section we intend to examine the biological relevance of a few key points that emerged in the discussion of models of initiation and spreading of epigenetic modifications.

    1. Waddington’s epigenetic landscape revisited.

    First, let us return to the discussion on the Waddington landscape that we started in the Introduction. The classical image in the original Waddington representation (Waddington, 1957) of a marble rolling down a hill rather suggests a fixed landscape, leading to erroneous interpretations when one goes beyond the metaphorical level (see Fig. 1).

    In the simplest model we discussed, the one by Micheelsen et al. (Micheelsen et al., 2010), the authors show that the model can be reformulated as a Fokker-Planck equation for the 1D diffusion of a particle in an effective potential $V(n_M)$ (see Fig. 9b). The latter accounts at the same time for drift (external forces) and noise events (with a term of the type $D/\mu$, with $D$ the diffusion coefficient and $\mu$ the mobility). The Waddington idea of an epigenetic landscape is translated in (Micheelsen et al., 2010) into more modern terms, by defining a physically consistent energy profile. Note however that the mechanism invoked here is not an evolution along the profile of Fig. 9b toward the minimum energy states, since different values of the $F$ parameter correspond to different system parameters, hence different external constraints. In other words, the equivalent of an epigenetic landscape corresponds here to a given $F$ = constant section of the two-dimensional potential surface of Fig. 9b. This allows in turn to suppose that external constraints may be included in the parameter $F$, which may vary as a function of metabolism (level of activity) or of the delivery of drugs acting on “writers” or “erasers” (see Sec. III.B), notably HDAC inhibitors (Dayarian and Sengupta, 2013), thus typically making the system switch from bistable to monostable conditions. As discussed by Jost (Jost, 2014), this may also represent a strategy to gain in system sensitivity, hence plasticity, during development. Note that the switching mechanism between bistable and monostable conditions can be interpreted as the result of an active process bringing the system out of bistability and favoring its switching to a different state.
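    How a section of the landscape at fixed $F$ changes from one well to two wells can be sketched with a simplified potential. The code below is an illustration under assumptions that go beyond the source: it keeps only the deterministic drift obtained from the rates of equation (3) for the modified fraction $n$, treats the diffusion coefficient as constant, and defines $V(n) = -\int \mathrm{drift}\, dn$, which with $x = n(1-n)$ gives $V = -(1-\alpha)\,x + \alpha x^2/2$. The potential of Micheelsen et al. also contains the noise contribution, so this is only a qualitative stand-in; the function names are hypothetical.

```python
import numpy as np

def effective_potential(n, alpha):
    """Drift-only potential V(n) = -(1 - alpha) x + alpha x**2 / 2, with x = n (1 - n)."""
    x = n * (1.0 - n)
    return -(1.0 - alpha) * x + 0.5 * alpha * x**2

def count_minima(alpha, n_grid=1001):
    """Number of strict interior local minima of V on a regular grid in [0, 1]."""
    n = np.linspace(0.0, 1.0, n_grid)
    V = effective_potential(n, alpha)
    interior = V[1:-1]
    is_minimum = (interior < V[:-2]) & (interior < V[2:])
    return int(np.sum(is_minimum))

# F = alpha / (1 - alpha): F = 2 for alpha = 2/3, F = 9 for alpha = 0.9.
wells_weak_feedback = count_minima(2.0 / 3.0)
wells_strong_feedback = count_minima(0.9)
```

    Changing $F$ therefore reshapes the landscape itself (one well below $F = 4$, two wells above), rather than moving the system along a fixed profile.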

    FIG. 9 A modern view on epigenetic landscapes. [Figure not reproduced; the plot residue indicates a panel showing V(m) as a function of m and F, a diagram in the (εA/k0, εI/k0) plane with regions labelled “active”, “inactive” and “bistability”, and curves of m as a function of εA/k0 for εI/k0 = 1, 3 and 5.] (a) The basic mechanism of a three-state nucleosome modification model, depicting the modified (M), unmodified (U) and antimodified (A) states. The transition between M and U is catalyzed by histone methyltransferases (HMTs) and histone demethylases (HDMs), which depend on an antimodified histone; between U and A the transition is catalyzed by histone acetylases (HATs) and histone deacetylases (HDACs), which depend on a modified histone. Figure taken from (Dodd et al., 2007). (b) The coarse-grained potential V(m) as defined in the main text, as a function of the mean fractional number of modified nucleosomes m, and the feedback-to-noise ratio F (taken from (Micheelsen et al., 2010)). (c) Different modes of coupling between histone modification states, as described in (Dodd and Sneppen, 2011). Unrecruited enzymes may modify histones directly. Otherwise, recruited enzymes may operate in non-cooperative (global or local) or cooperative (local-local, local-global, or global-global) modes. Figure taken from (Dodd and Sneppen, 2011). (d-h) Illustration of the “epigenetic landscape” as proposed by Jost (Jost, 2014). As a function of the control parameters εA and εI, the system may undergo a transition between an inactive, active or bistable state. As shown in (h), the system may also develop a hysteretic behavior.

    We then stress that it is important to consider the asymmetry of the modification rates. Taking the notation of Ref. (Jost, 2014) (equations (6)), we notice that if the recruitment of enzymes by modified and by antimodified marks differs, the stability diagram and the boundaries between the mono- and bistable regions can be traced as a function of the two parameters $\epsilon_A$ and $\epsilon_M$. Bistability is observed only for strong recruitment ($\epsilon_{A,M} > \epsilon_c$) and small asymmetry.

    2. Hysteresis

    For even stronger recruitment, a typical hysteretic behavior appears that may have important biological consequences. One can indeed expect that, while for differentiated, stable cells the recruitment parameters are almost symmetric, modifications of the environment might actively induce asymmetric recruitment. The increase of one recruitment parameter can thus bring the system along the metastable branch, then make it abruptly switch to the alternative state, which will then remain stable even when the recruitment parameters come back to their initial values, thanks to the hysteretic shape of the bifurcation curve. In Fig. 9d, starting for instance from the low m state and symmetric recruitment, one can increase $\epsilon_I/k_0$ and switch to the upper, high m branch, then come back to $\epsilon_A = \epsilon_I$ without switching back (see also Fig. 9e-h).
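    The memory effect associated with hysteresis can be illustrated with the rates of equations (6). The sketch below is an illustration under stated assumptions, not a reproduction of Fig. 9: it uses the deterministic mean-field limit with fractions $a$, $m$ and $u = 1 - a - m$, the observable $\mu = a - m$, an explicit Euler scheme, and hypothetical parameter values (symmetric recruitment 4 $k_0$, transiently raised to 30 $k_0$ for the A mark, a deliberately large value chosen so that the switch is certain to occur).

```python
def relax(a, m, eps_A, eps_M, k0=1.0, t_end=200.0, dt=0.005):
    """Relax the mean-field version of equations (6) with explicit Euler steps."""
    for _ in range(round(t_end / dt)):
        u = 1.0 - a - m
        da = (eps_A * a + k0) * u - (eps_M * m + k0) * a
        dm = (eps_M * m + k0) * u - (eps_A * a + k0) * m
        a, m = a + dt * da, m + dt * dm
    return a, m

def hysteresis_loop(eps_sym=4.0, eps_high=30.0):
    """Return mu before, during and after a transient increase of eps_A."""
    a, m = relax(0.05, 0.9, eps_sym, eps_sym)    # start on the M-rich branch
    mu_start = a - m
    a, m = relax(a, m, eps_high, eps_sym)        # transient asymmetric recruitment
    mu_pushed = a - m
    a, m = relax(a, m, eps_sym, eps_sym)         # parameters back to initial values
    mu_end = a - m
    return mu_start, mu_pushed, mu_end

mu_start, mu_pushed, mu_end = hysteresis_loop()
```

    After the transient, the recruitment parameters are identical to the initial ones, yet $\mu$ has changed sign: the system remembers the signal.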

    Close to $\epsilon_A \sim \epsilon_M \sim \epsilon_c$, the system becomes ultrasensitive to perturbations, and highly unstable. This regime may be associated with diseases. A pathological increase in the frequency of replication, for instance, may result in an increase of the random transition rate $k_0$, which in turn may bring the system close to the critical point and induce epigenetic instability and misregulation.

    However, the existence of a critical region may also represent an advantage. During development, the ability to switch between two coherent states when applying a weak asymmetric signal (the developmental signal) may facilitate developmental transitions. Since the random transition rate $k_0$ may be increased by shortening the cell cycle, the system can be brought closer to the critical region and the switch induced by the application of a weak asymmetric signal during a finite period of time (Jost, 2014).

    E. Example: plant vernalization

    The 3-state model proposed by Dodd et al. (Dodd et al., 2007) has been successfully adapted to the description of vernalization, the mechanism allowing plants to flower after a prolonged cold period.

    Plants have the ability to measure the duration of a cold season and to remember this prior cold exposure in the spring. In Arabidopsis thaliana, an annual plant, a prolonged cold exposure progressively triggers the H3K27me3-mediated epigenetic silencing of Flowering Locus C (FLC), a locus encoding proteins that in turn act as flowering repressors. The accumulation of histone epigenetic marks in the FLC locus keeps increasing during the cold. This slow dynamics of vernalization, taking place over weeks in the cold, generates a level of stable silencing of FLC in the subsequent warm that depends quantitatively on the length of the prior cold. Then, once FLC is switched off, the silencing persists at the return of the warm season, and is mitotically stable through the rest of development (often for many months) (see Fig. 10) (Song et al., 2013). This latter feature is characteristic of annual plants, while in perennial plants FLC is repressed only transiently.

    FIG. 10 Mechanism of vernalization. The expression of the floral repressor gene, FLC, is repressed when plants are exposed to cold and remains stably repressed on the return to warm temperatures. Since this repression increases with the duration of the exposure to cold, flowering is more abundant for longer cold duration (Figure 1 from (Song et al., 2013)).

    Satake and Iwasa (Satake and Iwasa, 2012) show that this behavior can be accounted for by means of the Dodd 3-state model (Dodd et al., 2007), provided that an explicit dependence on temperature of the model parameters is included. Explicitly, the transition rates are written in this case as

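    $$R_{U \to A} = \beta\, n_A + \chi \tag{8}$$

    $$R_{U \to M} = u(T)\,(\beta\, n_M + \chi) \tag{9}$$

    $$R_{A \to U} = v(T)\,(\beta\, n_M + \chi) \tag{10}$$

    $$R_{M \to U} = \epsilon\,(\beta\, n_A + \chi), \tag{11}$$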

    where $u(T)$ and $v(T)$ account for the temperature tuning and take different values in warm conditions before vernalization, in cold conditions during vernalization, and in warm conditions after vernalization. Transition rates are in fact under the control of a series of proteins (in particular Vernalization Insensitive 3, VIN3) whose expression is temperature dependent. The authors show that a strong feedback, hence bistability, is necessary to reproduce the experimental observations. Interestingly, when the system evolution is simulated, the M ↔ A transition is observed at a random time during the cold, for a given system containing $N$ nucleosomes (i.e. a given cell). Different cells therefore switch to the repressed state after different delays following the change from warm to cold. However, the average over a cell population leads to a typical behavior that can be reproduced, if the cell population is large enough (Satake and Iwasa, 2012). The duration of winter memory is also tuned by model parameters, and in particular by those accounting for the cell division rate and for the rapidity of deposition of epigenetic marks after vernalization. Changes in these parameters may lead to a much shorter memory extent (from more than one year to a few days), potentially explaining the different behavior observed in annual and perennial plants.
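    The way a temperature-dependent tuning of the rates (8)-(11) can store a memory of cold can be sketched as follows. This is an illustration under several assumptions that are not in the source: a deterministic mean-field limit for the fractions $a$ and $m$ (so the random switching times of individual cells are not captured), an explicit Euler scheme, and hypothetical parameter values ($\beta = 10$, $\chi = 1$, $\epsilon = 1$, $u = v = 1$ in the warm and $u = v = 5$ in the cold). No cell division is included.

```python
def simulate_phase(a, m, u_T, v_T, beta=10.0, chi=1.0, eps=1.0, t_end=50.0, dt=0.002):
    """Integrate the mean-field version of rates (8)-(11) at fixed temperature."""
    for _ in range(round(t_end / dt)):
        free = 1.0 - a - m                       # fraction of unmodified nucleosomes
        da = (beta * a + chi) * free - v_T * (beta * m + chi) * a
        dm = u_T * (beta * m + chi) * free - eps * (beta * a + chi) * m
        a, m = a + dt * da, m + dt * dm
    return a, m

def vernalization_run(u_cold=5.0, v_cold=5.0):
    """Warm -> cold -> warm protocol; returns the M fraction at the end of each phase."""
    a, m = simulate_phase(0.9, 0.0, 1.0, 1.0)        # warm, FLC active (A-rich)
    m_before = m
    a, m = simulate_phase(a, m, u_cold, v_cold)      # cold: M favoured, A destabilised
    m_cold = m
    a, m = simulate_phase(a, m, 1.0, 1.0)            # return to warm
    return m_before, m_cold, m

m_before, m_cold, m_after = vernalization_run()
```

    In this sketch the locus is almost free of the M mark before the cold, becomes M-rich during the cold, and stays M-rich after the return to warm conditions because the warm-condition dynamics is bistable.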

    While the previous work addressed the question of bistable behavior in vernalization, the question of the establishment of epigenetic marks induced by cold is discussed by Angel et al. (Angel et al., 2011), both theoretically and by experiments in Arabidopsis thaliana. These authors focus on the fact that, when plants are subjected to cold, repression (H3K27me3, M state) only concerns a small (1 kb) nucleation region inside the FLC locus (8 kb), close to the first exon (non coding region) after the promoter. While during the cold only the nucleation region is marked, after the return to warm the profile changes rather little in the nucleation region but rises quantitatively across the rest of the FLC locus according to the length of the cold period. They ask whether the small size of the nucleation region would be sufficient to cause a quantitative switch in the epigenetic state of the whole FLC locus after return to the warm. Experimental results are shown to be compatible with a 3-state zero-dimensional model, provided that two supplementary ingredients are included: (i) a site-specific nucleation of the silencing modification during cold, described as an increased probability to switch to the M state for a sub-ensemble of the nucleosomes, and (ii) a permanent bias in the histone dynamics towards the M modification on return to the warm. Within these assumptions, simulated population-averaged levels of the M modification are found to be approximately stable up to 30 days after the cold, with a modification intensity which depends on the duration of cold, in good agreement with experiments.

    Together, these two studies show that relatively simple models displaying a strong bistability can be usefully employed to model epigenetic mechanisms involved in real systems, such as plants in the case discussed here, even if real systems typically include a few additional features needed to respond specifically to the particular functional task they are designed for.

    V. TOWARD A MORE COMPLEX SCENARIO: DNA METHYLATION, ROLE OF RNAS, SUPERCOILING IN EPIGENETICS

    Up to now we have focused on histone PTMs and presented them as a crucial issue in the transmission of epigenetic information. However, the global picture is more complex. Among the additional epigenetic mechanisms, some have been known for a long time, such as DNA methylation (see Sec. V.A), while others have been evidenced quite recently, such as chromosome coating with (long) non-coding RNAs as in X inactivation (see Sec. V.C), messenger RNA silencing by interaction with micro RNAs (see Sec. V.D), or the coupling between epigenetics and supercoiling (see Sec. V.E). An exhaustive description of the overall picture would represent a titanic task, well beyond the aim of this introductory review. Therefore we focus here on the main physical aspects of these biologically relevant mechanisms, drawing on a few concrete examples.

    A. DNA methylation

    Historically, DNA methylation has been the first epigenetic mark to be recognized as a “stable, inheritable chemical modification that alters gene expression and does not modify the sequence” (see Sec. I). In fact, in the early days of research on DNA methylation, it was found that methylation states are propagated through mitosis (Wigler et al., 1981).

    DNA methylation is the addition of a methyl ($-\mathrm{CH}_3$) group to the carbon atom at position 5 of the cytosine base (5mC). Importantly, DNA methylation is coupled to metabolism through the methyl donor S-adenosylmethionine (SAM) (see Fig. 11a).

    The prevalence of DNA methylation in the genome changes significantly among different organisms: it is very high in vertebrates (where one refers to a “global” methylation), very low in Drosophila, and absent in the nematode worm C. elegans. In somatic cells, cytosine methylation occurs predominantly at CpG dinucleotides, although it has been detected in any sequence context both in plants (Cokus et al., 2008) and humans (Lister et al., 2009), where 70–80% of CpG dinucleotides are methylated.

    The patterns of DNA methylation in the genome are established in early development, and then faithfully propagated throughout successive cell divisions. Crucially, tissue-specific genes are kept unmethylated, whereas the others are heavily methylated. These processes are catalyzed by DNA methyltransferases (DNMTs). It is generally thought that the two methyltransferases DNMT3A and DNMT3B are responsible for establishing the methylation