evolution of new genes and functions - diva...

74
ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2018 Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1502 Evolution of New Genes and Functions ERIK LUNDIN ISSN 1651-6206 ISBN 978-91-513-0463-2 urn:nbn:se:uu:diva-362157

Upload: others

Post on 24-Apr-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2018

Digital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Medicine 1502

Evolution of New Genes andFunctions

ERIK LUNDIN

ISSN 1651-6206ISBN 978-91-513-0463-2urn:nbn:se:uu:diva-362157

Page 2: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

Dissertation presented at Uppsala University to be publicly examined in B42, BMC,Husargatan 3, Uppsala, Friday, 23 November 2018 at 09:15 for the degree of Doctor ofPhilosophy (Faculty of Medicine). The examination will be conducted in English. Facultyexaminer: Professor Arjan de Visser (Wageningen University & Research, Laboratory ofGenetics, Department of Plant Sciences).

AbstractLundin, E. 2018. Evolution of New Genes and Functions. Digital Comprehensive Summariesof Uppsala Dissertations from the Faculty of Medicine 1502. 72 pp. Uppsala: ActaUniversitatis Upsaliensis. ISBN 978-91-513-0463-2.

To answer major evolutionary questions, we need a better understanding of the effects ofmutations on specific functions and organism fitness. The aim of this thesis was to elucidatehow new functions evolve and potential trade-offs with the original function.

Paper I identified a bimodal distribution of fitness effects of mutations in Salmonella entericaHisA protein. Most mutations negatively affected protein function but the effect was maskedat high gene expression. Expression levels and the extent to which the studied protein limitedgrowth were important in determining the fitness effects of mutations. No fitness prediction toolwas satisfactorily alone but in combination predictions were improved.

In Paper II, S. enterica HisA was evolved to acquire TrpF activity. Numerous pathwaystowards improved TrpF activity were examined and several improvement mechanisms wereidentified. Improved TrpF activity extensively reduced the original activity, generalist enzymeswere rare and restoring original activity after an initial loss was difficult. Furthermore,expression levels had a major impact on the shape of the trade-off curve.

In Paper III, adaptation during serial passage of Escherichia coli and S. enterica underlaboratory conditions were examined. Adaptive mutations were identified in four differentlaboratory media and their fitness effects were determined. Little overlap in mutation spectrawas found in the different media and species suggesting that adaptation was media-specific.Furthermore, media adaptation mutations reduced the accuracy of fitness assays and the use ofpre-adapted strains improved the sensitivity of fitness assays 10-fold.

Paper IV examined evolution of novel metabolic capabilities in S. enterica by analyzinggrowth on 124 non-native carbon sources. Growth was observed on 25 compounds and for fiveof these, the causative mutation was identified. Increased gene expression of cryptic genes wasa major mechanism for acquiring the novel phenotypes.

In conclusion, my results show that in most cases many types of mutations can improvea function and allow adaptive evolution but this often is associated with a trade-off and lossin other abilities. Increased gene expression was a major mechanism by which bacteria couldcompensate for loss of an activity as well as acquire new metabolic capabilities.

Erik Lundin, Department of Medical Biochemistry and Microbiology, Box 582, UppsalaUniversity, SE-75123 Uppsala, Sweden.

© Erik Lundin 2018

ISSN 1651-6206ISBN 978-91-513-0463-2urn:nbn:se:uu:diva-362157 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-362157)

Page 3: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

Evolution is a light illuminating all facts, a curve that all lines must follow. Pierre Teilhard de Chardin

Page 4: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types
Page 5: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Lundin, E., Tang, P–C., Guy, L., Näsvall, J., Andersson, D. I.

(2017) Experimental Determination and Prediction of the Fitness Effects of Random Point Mutations in the Biosynthetic Enzyme HisA. Molecular Biology and Evolution, 35(3):704–718

II Lundin, E., Näsvall, J., Andersson, D. I. (2018) Functional trade–offs during evolution of new functions. Manuscript

III Knöppel, A., Knopp, M*., Albrecht, L. M.*, Lundin, E.*, Lustig, U.*, Näsvall, J., Andersson, D. I. (2018) Genetic Adap-tation to Growth Under Laboratory Conditions in Escherichia coli and Salmonella enterica. Frontiers in Microbiology, 9:756

IV Warsi, O., Lundin, E., Lustig, U., Näsvall, J., Andersson, D. I. (2018) Experimental evolution of novel metabolic capabilities in Salmonella enterica. Manuscript

* These authors have contributed equally to this work

Reprints were made with permission from the respective publishers.

Page 6: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types
Page 7: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

Contents

Introduction ...............................................................................................11Fitness ..................................................................................................13Effective population size .......................................................................14Mutations and how to find them ............................................................15Effects of mutations ..............................................................................16

Synonymous mutations and mutations affecting the RNA .................18Nonsynonymous mutations ...............................................................18

Distribution of fitness effects.................................................................20Beneficial mutations .........................................................................21Deleterious mutations .......................................................................22Neutral mutations .............................................................................22

Trade–offs ............................................................................................24The fate of a new mutation ....................................................................27Fitness landscapes .................................................................................30Evolution of new functions....................................................................31

Horizontal gene transfer ....................................................................32Duplication and divergence of genes encoding promiscuous enzymes ...........................................................................................33De novo origin ..................................................................................37

Experimental evolution .........................................................................38Salmonella enterica and Escherichia coli as model organisms ...............40

Present investigations ................................................................................42Paper I ..................................................................................................42

Experimental setup ...........................................................................42DFE of all mutants ............................................................................43DFE of single amino acid substitution mutants ..................................43HisA results compared to other systems ............................................43Epistatic effects between mutations ...................................................44Predicting fitness effects ...................................................................44Conclusion .......................................................................................44

Page 8: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

Paper II .................................................................................................45Evolving increased TrpF activity ......................................................45Improving TrpF activity ....................................................................46Expression levels affect the shape of the trade–off curve ...................46Mutational pathways .........................................................................47Conclusions ......................................................................................47

Paper III ................................................................................................48Adaptation to laboratory media .........................................................48Coupling genotype to phenotype .......................................................48Adaptations to different media ..........................................................49Conclusions ......................................................................................49

Paper IV................................................................................................51Novel metabolic capabilities .............................................................51Mutational trajectories ......................................................................51Re–wiring metabolic pathways .........................................................51Trade–offs during acquisition of novel phenotypes ...........................52Conclusions ......................................................................................52

Concluding remarks...................................................................................53

Future perspectives ....................................................................................55

Acknowledgements ...................................................................................56

References .................................................................................................59

Page 9: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

Abbreviations

E. coli Escherichia coli S. enterica Salmonella enterica Ne Effective population size N Actual population of size DNA Deoxyribonucleic acid RNA Ribonucleic acid mRNA Messenger RNA ΔG Folding free energy difference DFE Distribution of fitness effects Q Probability of mutation fixation in the population s Selection coefficient ORF Open reading frame HGT Horizontal gene transfer UV Ultraviolet light IAD Innovation–amplification–divergence Bp Basepair Kbp Kilobasepairs Mbp Megabasepairs ATP Adenosine triphosphate WGS Whole genome sequencing PCR Polymerase chain reaction MIC Minimum inhibitory concentration GFP Green fluorescent protein LB Lysogeny broth MH Mueller Hinton broth

Page 10: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types
Page 11: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

11

Introduction

“I think” Charles Darwin wrote in 1859 right next to a drawing of the first evolutionary tree illustrating the origin of species (Figure 1) (Darwin 1859). Without any understanding of genetics, he introduced the theory of evolution driven by natural selection, or “survival of the fittest” as it was later termed (Darwin 1859). He defined it as “the principle by which each slight variation [of a trait], if useful, is preserved,” meaning that the diversity we observe among living organisms is a result of numerous generations of natural selec-tion where the organisms most adapted to their surroundings will prosper and others will diminish or go extinct. The diversity of organisms we see around us arose by descent and further development from the most well–adapted crea-tures of the past, giving rise to the tree of life (Figure 1).

Figure 1. Tree of life as sketched by Darwin. “The affinities of all the beings of the same class have sometimes been represented by a great tree. I believe this simile largely speaks the truth.” – Charles Darwin (1859)

Page 12: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

12

Despite the importance of inheritance for his theories, Darwin did not under-stand the mechanisms behind it. In the late 19th century, Gregor Mendel stud-ied these mechanisms by examining trait inheritance and noted that traits are passed down to offspring by discrete “units of inheritance” (Mendel 1866; Charlesworth and Charlesworth 2009). Today we know that these “units of inheritance” are genes within the DNA (genome) of an organism and that it is genetic diversity within populations that makes evolution by natural selection possible. Other driving forces for evolution might be mutation, drift, or mi-gration.

Darwin’s theories of evolution explain how the first simple organisms on earth developed into the complex lifeforms we see today. But how did the first lifeforms arise? The most prevailing theory is that they developed through abiogenesis, a chemical phase before the biological (i.e. Darwinian and evo-lutionary) phase (Follmann and Brownson 2009; Pross 2011). Today’s life on earth can hence be said to have developed from a prebiotic stage where simple chemistry allowed for non–living but self–reproducing matter. This was used for the formation of membranes that were then able to protect the synthesis of RNA–like molecules and created a suitable environment for certain types of chemistry required for life. A biotic stage then followed with simple organ-isms susceptible to evolution by natural selection (Mansy et al. 2008; Pross 2011).

With today’s knowledge and technology, we can study the molecular mechanisms (genetics) behind different traits and their evolution. This enables us to understand pressing questions such as pathogenesis of bacteria, how they infect humans and how they can survive antibiotic treatment (i.e. become re-sistant to the antibiotics). These bacteria are threatening modern medicine, making trivial infections potentially non–curable and lethal. By understanding the evolutionary mechanisms behind this we can develop better treatments, develop new antibiotics, and hinder the spread of severe diseases in our glob-alized society.

Page 13: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

13

Fitness Darwin’s theories about natural selection stipulate that the most well adapted organisms will prosper and the poorly adapted will diminish. But what defines “adapted?

“Adaptedness” of an organism is called “fitness” and can be seen as a meas-urement of the ability of that organism to survive and reproduce in a popula-tion. Major evolutionary questions about how genetic diseases arise (Yue et al. 2005; Eyre-Walker et al. 2006) and how we can combat antibiotic re-sistance (Andersson and Hughes 2010; Jacquier et al. 2013; Firnberg et al. 2014; Knight et al. 2015) is dependent on our understanding of what is evolu-tionarily beneficial, how organisms can adapt to changing environments, and the impact of most mutations. All these questions relate to fitness. However, it is not possible to uniformly define what traits can be considered to provide a high fitness. Instead, the definition of what is considered to be fit, how fit-ness is measured, and what it tells us about evolution varies and depends on the research question of interest.

For example, regarding antibiotic resistance, fitness can be defined in sev-eral different ways. First, one can measure susceptibility to a certain antibiotic in vitro. In an antibiotic–containing environment, bacteria susceptible to that antibiotic would be defined as having a low fitness compared to resistant ones. In vitro testing is important for determining treatment success or failure with a certain antibiotic in clinics (Doern 2011), but the fitness of bacteria in the human body cannot always be directly inferred from in vitro experiments, as the bacteria in the human body would also be affected by the immune system. As antibiotic resistance commonly comes with a cost (Andersson and Hughes 2010), bacteria resistant to antibiotics are usually growing slower in an anti-biotic–free environment compared to the susceptible ones – i.e. in the absence of antibiotics, resistant bacteria are less fit.

To study the effect of antibiotic-resistance mutations, it is common to measure changes in minimal selective concentration (Gullberg et al. 2011; Jacquier et al. 2013; Firnberg et al. 2014; Knight et al. 2015). Measurements of MIC are indeed relevant for evolution of antibiotic resistance, but only dis-plays the minimal concentration inhibiting growth. It does not directly de-scribe the fitness of the organism that would also be dependent on the growth environment and concentration of the antibiotic present.

In another study of fitness effects, fitness of mutations in a green fluores-cent protein (GFP) that was moved from its native organism (the jellyfish Ae-quorea victoria) to Escherichia coli was measured (Sarkisyan et al. 2016). The GFP–encoding gene was overexpressed and fitness was assayed as fluo-rescence at a certain wavelength with conditions far from the native conditions of A. victoria. As fluorescence has no benefit to E. coli in any environment, the measurements had no correlation to the organismal fitness of E. coli but is purely an assessment of the protein function.

Page 14: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

14

To understand and predict evolutionary outcomes, it is essential to under-stand the effects of fitness at the protein level. However, function of a protein should not be confused with fitness of an organism. Protein function and or-ganismal fitness are often linked, but not always. For an enzyme, protein func-tion is a product of the level of folded and functional protein (dosage) and the specific activity of the protein (Protein function ∝ Protein dose × Protein ac-tivity) (Soskine and Tawfik 2010). Higher expression (i.e. increased gene dos-age) of a gene can hence compensate for a poor activity and no functional change of the protein would be observed. Higher dosage of a gene/protein can be achieved through amplification of the gene (Aharoni et al. 2005; Näsvall et al. 2012). With a high expression level of a gene/protein, a large drop in pro-tein function can be tolerated before organismal fitness is affected. Higher ex-pression levels can also yield higher tolerance to antibiotics. A dramatic in-crease in expression (e.g. as a result from extensive amplifications) often comes with a cost (Adler et al. 2014). Organisms strive to balance the need of increase in tolerance and the loss of fitness resulting from such increases in expression.

Effective population size The effective population size (Ne) describes how big an idealized population (i.e. a population with the same number of reproductive males and females, same amount of children per parent, random mating and a constant population size over generations) would be in order for the genetic diversity to be the same as in the actual population of size N (Husband and Barrett 1992). The effective population size is smaller than the actual population and is important in genetic models as it provides a single number and allows for analysis of the effect of population size with simplified approximations of demographic, ge-netic and spatial structuring (Wakeley 2008; Charlesworth 2009). The effec-tive population size is the harmonic mean of the actual population size at dif-ferent timepoints. When the actual population size change, this affects the ef-fective population size but with a delay (Husband and Barrett 1992). At times when the actual population size is small (bottle–necks), the genetic variability is limited which will then be reflected in calculations of the effective popula-tion size (Wright 1940). (Wright 1940). The human population has only re-cently become large (N = 7.6×109 (United Nations 2017)), but still the effec-tive population size is only 104 (Yu 2004). Effective population size is also of great importance when it comes to experimental evolution since it affects the fate of new mutations (e.g. the probability of fixation of a beneficial mutation) (Wahl et al. 2002).

Page 15: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

15

Mutations and how to find them A mutation is in its most basic definition a heritable change in the nucleotide sequence of a genome. Mutations can be said to affect either the content of the genome or the amount of DNA. There can be large content changes such as genome rearrangements or smaller changes in the form of mutations of a sin-gle nucleotide. The amount of DNA can be changed either by amplification of existing DNA, by introduction of new DNA by horizontal gene transfer or by deletions (Figure 2). The frequencies of these different events vary between organisms and environment and are highly dependent on the fitness effect of the mutation and selective pressure. Even though these mechanisms are di-verse across different types of organisms, the mechanisms described in this work are specific to the bacterial species E. coli and S. enterica.

Figure 2. Mutations in a genome. Mutation size ranges from very small to very large, and can affect both the content of the genome and the amount of DNA. Large mutations can be amplifications, deletions, introduction of a new gene by horizontal gene transfer, or inversions. Smaller changes often take place within coding se-quences and can be as small as single nucleotide changes (deletions, substitutions, or insertions).

Given an observed change in phenotype during evolution, there are several ways of studying the genetics behind these changes. One method is via a ‘bot-tom–up’ approach in which a wild–type copy of a suspected mutated gene is introduced, a gene is inactivated, or in any other way where the phenotypic effect is compensated by changing or providing genetic material (Wassenaar and Gaastra 2001). However, this approach requires previous knowledge about the genetics behind a certain trait, and can be extremely time consuming (Nowrousian et al. 2012).

Modern techniques allow for ‘top–down’ approaches such as whole ge-nome sequencing, which enables one to sequence and compare the full

Page 16: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

16

genomes of the changed (the evolved) and non–changed (the wild–type) var-iants and determine all genetic differences (Dettman et al. 2012). The genetic differences in the evolved variants can then be attributed to the changed phe-notype.

Even though whole genome sequencing provides a list of all mutations in the genome following an evolution experiment, only a small fraction of these mutations are responsible for the fitness increase (Dettman et al. 2012). Thus, the ‘bottom–up’ and ‘top–down’ approaches are usually combined to deter-mine what mutations that are responsible for the increase in fitness.

Effects of mutations Although small in size, single nucleotide substitutions can have a large impact on organism fitness. Their emergence can have many different origins, such as errors in DNA replication, mistakes in DNA repair or external induction by chemicals or radiation. Depending on the position of the mutation, i.e. within genes or in intergenic regions, the effect of the mutation will vary. Mutations within genes are expected to have the greatest impact on fitness and are di-vided into three classes – synonymous, nonsynonymous, and nonsense muta-tions. Synonymous mutations change the codon sequence but not the amino acid sequence, whereas nonsynonymous mutations change the amino acid se-quence of the protein. Nonsense mutations introduce a premature stop codon, resulting in a truncated (and often nonfunctional) protein. All natural selection is based on variability within genomes in a population and mutations occur at random over time. The overall effect of mutations on fitness can be classified into three major categories based on their predicted impact on evolution: ben-eficial, neutral, and deleterious (Figure 3).

Figure 3. Effects of mutations. Mutations occurring within a genome can either be beneficial (increased fitness, green area), neutral (not affecting fitness, yellow area) or deleterious (decreasing fitness, red area).

Page 17: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

17

Beneficial mutations increase the fitness of the organism whereas neutral mu-tations will have no effect on organism fitness. Deleterious mutations are harmful to the organism and have a high probability of reducing survival or fertility of the organism (Eyre-Walker and Keightley 2007).

Neutral mutations can accumulate within genomes over time (random ge-netic drift) (Figure 4A) (Husband and Barrett 1992; Bloom et al. 2007; Burch et al. 2007; Charlesworth 2009), whereas beneficial mutations will be selected for (Luria and Delbrück 1943; Roy 2016) (Figure 4). This selection can be a direct adaption to environmental conditions (Figure 4B) or as compensation to deleterious mutations (Figure 4C). Like all other types of mutations, dele-terious mutations occur by chance, but unlike beneficial mutations they will be selected against (so called purifying selection) and are generally lost over time (Figure 4D).

Figure 4. Effects of mutations during evolution. Mutations can have neutral, benefi-cial, or deleterious effects on organismal fitness. (A) Neutral mutations (yellow cir-cles) with no effect on fitness, can accumulate over time by random genetic drift. (B) Beneficial mutations (green triangles and stars) can be fixed by direct adaption to selective conditions, (C) or by compensating the cost of deleterious mutations. (D) Deleterious mutations (red triangles), that have a negative effect on the organis-mal fitness, can be selected against by purifying selection.

Page 18: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

18

There are hypotheses that state an organism can increase its mutation rate at certain locations within its genome (such as in a gene encoding an enzyme e.g.) to enable more efficient adaptation (Martincorena and Luscombe 2013). However, the support for this is weak and the prevailing theory is that muta-tions occur by chance (Roy 2016).

Effects of mutations can be determined experimentally, or estimated by comparing sequence data from related organisms or proteins. By comparing sequence data, conclusions can be made about which amino acid changes have not been selected against during millions of years of evolution (i.e. the varia-bility of amino acids in different positions). Furthermore, it can also reveal which positions are conserved in genomes or proteins, providing insight into which positions are important for survival (Mirny and Shakhnovich 1999; Franzosa and Xia 2009; Worth et al. 2009; Daudé et al. 2013; Nevin Gerek et al. 2013; Shahmoradi et al. 2014; Sikosek and Chan 2014; Yeh et al. 2014; Echave et al. 2016).

Synonymous mutations and mutations affecting the RNA Synonymous mutations usually have relatively small effects but can still in-fluence the evolutionary outcomes of different species. There is a bias in which codons are used to encode a particular amino acid and changing a codon to a synonymous codon could alter the translation efficiency of a gene. It has been estimated that the fitness cost of non–optimal codons ranges from 10−4

10−9 per codon per generation in E. coli (Bulmer 1991; Hartl et al. 1994) and was determined to be 10-4 per codon per generation in a highly expressed gene in S. enterica ( . Selection pressure varies among species and genes but codon usage can in some cases affect protein expression by several orders of magnitude (Welch et al. 2009; Tuller et al. 2010). Changes in the translation rate of a protein can impact protein folding and subsequently fitness (Tuller et al. 2010; Jacobson and Clark 2016). mRNA levels as a result of changes in mRNA secondary structure can be also be affected by synony-mous substitutions leading to reduced fitness of the organism (Kudla et al. 2009; Tuller et al. 2010; Sabarinathan et al. 2013). Changes in the first ~40 nucleotides of an mRNA have the greatest impact on fitness. The region is close to the ribosome binding site, and perturbations to ribosome binding and translation initiation can cause a 250–fold difference in protein levels (Tuller et al. 2010).

Nonsynonymous mutations Nonsynonymous mutations that change the amino acid sequence of a protein can affect the function of the protein. Effects on fitness of the organism ex-pressing the protein vary greatly. Mutations in an enzyme can change the amino acids directly involved in the active site rendering the enzyme

Page 19: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

19

nonfunctional or severely damaged, which can have a lethal or highly delete-rious effect on organism fitness (Cupples and Miller 1988; Daudé et al. 2013). Mutations that do not affect the active site directly can still have a major im-pact on fitness by changing the stability or conformation of an enzyme that hinders the catalysis of the reaction (Echave et al. 2016). The impact on fitness and subsequent evolution is affected by the type of amino acid change, e.g. how the amino acid size and polarity changes (Cupples and Miller 1988; Suckow et al. 1996). By analyzing related sequences, amino acid changes have been studied and scoring measures have been generated to analyze the rela-tionship between amino acid sequences in multiple sequence alignments such as the Basic Local Alignment Search Tool (BLAST) (Grantham 1974; Day-hoff et al. 1978; Henikoff and Henikoff 1992; Altschul 1997). These scoring measures are calculated for single amino acid changes and are generalized to be used for any protein. Changes from for example alanine to valine (both small, non–polar) are scored as less deleterious than changing threonine (small, polar) to tryptophan (large, hydrophobic). The changes in size and po-larity of a mutated residue will also affect the protein structure. Early studies determined that mutating amino acids buried deep within a protein are less tolerated in evolution, indicating that mutations that affect the relative acces-sible surface area of amino acids will have an impact on the function of the protein (Hubbard and T.J.P. Hubbard 1987; Lim and Sauer 1989; Overington et al. 1990; Topham et al. 1993), and additional support has been presented in recent years (Ramsey et al. 2011; Shahmoradi et al. 2014). An alternative measure of the accessibility of the amino acids is their packing density, i.e. how many other amino acids surround a particular amino acid (Marcos and Echave 2015).

The stability of a protein can also be measured by the fraction of properly folded protein, directly linked to and assessed by the folding stability and free energy ΔG (Jacquier et al. 2013). Most mutations destabilize the protein struc-ture and even though the changes are usually small, destabilizing mutations have an impact on protein fitness and can in rare cases cause complete loss of protein function (Shortle et al. 1990; Green et al. 1992; Meeker et al. 1996; Chen and Stites 2001; Holder et al. 2001; Serrano et al.). Loss in stability leads to a lower probability of proper folding, resulting in a lower availability of functional protein. Some stabilizing mutations might not affect protein func-tion but could increase the tolerance to other destabilizing mutations, whereas others might cause a loss in flexibility and activity, and thus be associated with a fitness cost (DePristo et al. 2005; Jacquier et al. 2013).

Some proteins such as those with the TIM barrel fold (also known as (βα)8–barrel) contain distinct domains to stabilize the protein and provide catalytic activity (Höcker et al. 2001). The fold consists of a base of α–helices and β–sheets that form a rigid structure and provides stability (usually referred to as the “stability face”). Flexible loops are connected to this base and contain the residues responsible for the catalytic activity (usually referred to as the

Page 20: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

20

“catalytic face”) (Höcker et al. 2001). This division of labor can make it diffi-cult to accurately predict effects on protein function as changes in different biophysical properties might have different effect depending on the region of change (stability or catalytic face).

Distribution of fitness effects Every mutation in the genome or a certain protein can affect organismal fitness which can then influence the possible evolutionary pathways for the organism. Elucidating these effects is the key to understanding certain diseases (Yue et al. 2005; Eyre-Walker et al. 2006), how to maintain healthy population sizes (Silander et al. 2007), and antibiotic resistance development (Wang et al. 2002; Jacquier et al. 2013; Firnberg et al. 2014; Knight et al. 2015).The dis-tribution of these effects, i.e. the distribution of fitness effects of new muta-tions (DFE), describes the probability distribution of the effect on fitness from a new mutation (Figure 5).

Figure 5. Distribution fitness effects (DFE). The DFE describes the probability of certain mutations having a given fitness effect. The distribution is unique for any condition, organism, or protein, but commonly have one or several peaks. A narrow peak at high fitness (A) indicates that most mutations have little or no effect. If a dis-tribution has a broad peak at lower fitness, the effect of most mutations is deleterious but the effects are highly variable. Some distributions have several peaks with most mutations having either a large effect, or little to no effect on fitness.

A study of the impact of mutations in the active site of the β–lactamase TEM–1 on antibiotic resistance for example showed that this enzyme can evolve towards having a larger active site to handle larger penicillins. The larger ac-tive site rendered the enzyme less stable and these findings have led to the development of new inhibitors that can be used in the battle against antibiotic resistance (Wang et al. 2002).

What is beneficial or deleterious effects of mutations in a long–term evo-lutionary perspective depends not only on the fitness change from a single mutation in the organism but also on the overall population size. Their

Page 21: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

21

combined effect will determine the probability of fixation of the mutation in the population, Q, according to equation

𝑄 ≈𝑁&𝑠𝑁 ×

11 − e,-./0

where Ne is the effective population size, N the total population size and s the selection coefficient (Kimura 1962). Neutral mutations are subjected to ran-dom genetic drift and will have |𝑁&𝑠| ≪ 1, i.e. either mutations with a very small impact on the fitness of the organism or organisms with a very small effective population size. For s < 0, the mutation will have a deleterious effect on the organism and for s > 0 the effect will be beneficial. Fractions of dele-terious, neutral and beneficial mutations (i.e. the distribution of fitness effects) will depend on the organism, the environment for that organism, and the type of mutation (Eyre-Walker and Keightley 2007; Keightley and Eyre-Walker 2010).

When assessing the DFE, a single point in evolution is used as a starting point and possible outcomes are studied given this starting point. Beneficial mutations are rare and once one is fixed, the probability that additional bene-ficial mutations will further improve fitness is low (Eyre-Walker and Keight-ley 2007). When acquiring a deleterious mutation, there are more mutations available that are beneficial and can compensate for the deleterious mutation (Björkman 2000). Neutral mutations that stabilize the protein can buffer the effects of otherwise deleterious mutations (de Visser et al. 2003; Bloom et al. 2005). The DFE is hence highly dependent on the chosen starting point and can change as soon as a mutation is introduced.

Beneficial mutations Beneficial mutations tend to be rare, and under the assumption of the extreme value theory, those found are exponentially distributed (Kassen and Bataillon 2006; Orr 2006; Eyre-Walker and Keightley 2007; MacLean and Buckling 2009). This means that beneficial mutations are uncommon and most of those that exist have a very small effect on fitness. The fraction of beneficial muta-tions varies between species and experiments. Studies have found that 7 % of mutations introduced into the β–lactamase TEM–1 in E. coli are beneficial (Firnberg et al. 2014), 4 % of mutations in an RNA virus are beneficial (Sanjuán et al. 2004), 3 % in AraC, AraD and AraE proteins of S. enterica (Lind et al. 2017), 2 % in the bacteriophage f1 (Peris et al. 2010), none in the bacteriophage φ6 (Burch et al. 2007), none in E. coli (Elena et al. 1998), none in ribosomal proteins of Salmonella enterica (Lind et al. 2010), and close to none in humans (Zhang and Li 2005). The fruit fly, Drosophila melanogaster, is a rare case in which many substitutions are beneficial. Almost 15 % of all

Page 22: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

22

substitutions within the genome are estimated to be beneficial (Andolfatto 2005).

This indicates that most organisms are well–adapted for their niche, and that it is uncommon to extensively improve existing proteins. It is generally accepted that if |𝑁&𝑠| > 1, a mutation will be under selection. Thus, in E. coli with a suggested Ne of 107 (Charlesworth 2006), an |s| < 10−7 is neutral and beneficial mutations are commonly fixed (Silander et al. 2007). Conversely, for organisms or populations with a small effective population size, beneficial mutations will commonly be lost as a result of genetic drift (Silander et al. 2007). Due to clonal interference, mutations with small beneficial effects will only rarely become fixed in nature (Rozen et al. 2002; de Visser and Rozen 2006).

Deleterious mutations The distribution of deleterious effects is more complex as it ranges from com-plete loss–of–function mutations to smaller effects. Lethal and weakly delete-rious effects are more common than other deleterious effects resulting in a bimodal U–shaped distribution (Eyre-Walker et al. 2006; Eyre-Walker and Keightley 2007). Lethal mutations are not present in all experimental systems (Lind et al. 2010) but in other systems up to 40 % of the mutations were ob-served to be lethal (Wloch et al. 2001; Sanjuán et al. 2004). In studies that focused on single proteins, the fraction of deleterious mutations have been reported to vary from 4–10 % (Soskine and Tawfik 2010; Jacquier et al. 2013; Firnberg et al. 2014; Sarkisyan et al. 2016; Lind et al. 2017).

It is not surprising that the DFE varies given the organism, protein, or ge-nome mutations studied. Mutations in different sites within genomes (e.g. cod-ing and non–coding regions) will have diverse effects on fitness and the DFE for single point mutations might be distinct from the DFE for random trans-poson insertions. The DFE tends to shift towards more deleterious as the amount of accumulated mutations within a single gene (Sarkisyan et al. 2016).

Mutations with very large effects can only be studied in a laboratory set-ting. In nature, they are subject to purifying selection and are subsequently removed from the population over time. Effects of purifying selection will depend on effective population size, as small population sizes will allow for slightly deleterious mutations to accumulate and reach fixation by random ge-netic drift, a phenomenon known as Muller’s ratchet (Butcher 1995).

Neutral mutations Mutations with a very small effect can be challenging to detect in a laboratory setting as a result of limited resolution in the experiments (Davies 1999; Den-ver et al. 2004). While |s| < 10−3 is the highest resolution reported in a labora-tory setting, evolution can act on fitness differences as small as |s| < 10−7 in

Page 23: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

23

bacteria with large Ne (Lind et al. 2017). Even though effects of a mutation in a protein can have an effect on organismal fitness of |s| < 10−7, it is unlikely that it is completely neutral with regard to protein function. All changes can have some sort of effect on the protein, even if it is negligibly small and not selected for or against. From an evolutionary perspective, neutral mutations are those that have |𝑁&𝑠| ≪ 1 and these can only be fixed over time due to genetic drift (Ohta 1992) or by hitchhiking with beneficial mutations (Lang et al. 2013). What is neutral during the course of evolution will hence be depend-ent on the population size as well as the direct effect of the mutation. In species with a large effective population size, the fraction of neutral mutations is hence predicted to be lower. The effect of population size on neutrality also implies that the larger the population size, the more well–adapted the species will be as more of the smaller effect advantageous mutations will be se-lectable.

In humans that have an effective population size of about 104, it is estimated that the fraction of slightly deleterious mutations (s > 10−7) and neutral muta-tions is around 44–57 % of all mutations (Charlesworth 2009; Bataillon and Bailey 2014). Previous studies on microorganisms, with bacterial population sizes estimated to 106–107(Charlesworth 2006; Eyre-Walker and Keightley 2007; Charlesworth 2009), have reported a fraction of 30–56 % of the muta-tions as undistinguishable from wild–type controls (Sanjuán et al. 2004; Peris et al. 2010; Lind et al. 2017). Only in one study the seemingly neutral muta-tions were detected at a frequency of 5 % neutral mutations (Lind et al. 2010). In Drosophilia, the fraction of mutations that are neutral has been reported to be 16 % (Eyre-Walker 2002), and for enteric bacteria, 3 % (Charlesworth 2006).

To assess the fitness of many seemingly neutral mutations and to circum-vent limitations of direct measurements, DFEs have also been studied using DNA sequence data. This technique relies on mutation frequencies in popula-tions and relates them to the deleteriousness of the mutation under the assump-tion that more deleterious mutations are less likely to be widespread in a pop-ulation as negative selection would act against them (Eyre-Walker and Keight-ley 2007). The efficiency of selection also depends on the effective population size and by comparing different genomes or genomic regions and their popu-lations sizes, conclusions can be made about the mutational effects and the DFE (Charlesworth 2009). This technique is powerful and can be used to an-alyze vast amounts of data but can also yield misleading results. The method-ology is dependent on defining neutral sites (i.e. sites where most mutational changes are neutral) that can be compared with the evolution of mutations in sites under selection (Keightley and Eyre-Walker 2010). If these chosen neu-tral sites are in fact not neutral but under selection, the estimate of the muta-tional effect would be incorrect and conclusions about a certain DFE could be invalid.

Page 24: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

24

Trade–offs To be good at something, organisms usually have to compromise on some-thing else, i.e. if an organism invests resources in maintaining one trait it will most likely loose in fitness for another trait (Lenoir 1984; Ferenci 2016) – there is a tradeoff between the traits (Figure 6).

Figure 6. Trade–offs between new and old function in proteins. During evolution to-wards acquiring a new trait, organisms commonly lose in fitness for an existing trait. This trade–off can either be weak; with large gains in the new function and little loss in the original function, or strong; with small gains in the new function at the ex-pense of large losses in original function. During evolution towards a new function (from specializing in function A to specializing in function B), generalist stages can occur where an organism or protein possesses both functions, most often with a re-duced capability.

All evolutionary adaptive processes are subject to trade–offs. Regardless of the beneficial effect of acquiring mutations and new functions, mutations will not be fixed during evolution if there is a severe deleterious effect on another important function (Tokuriki et al. 2008). This limitation prevents all living organisms from having perfect fitness in several different environments (Ferenci 2016) (Figure 6). Elucidating the mechanisms and impact of these

Page 25: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

25

mutational effects on protein function and organism fitness (i.e. their trade–offs) is the key to understanding evolutionary dynamics (DePristo et al. 2005; Pal et al. 2006; Poelwijk et al. 2007; Camps et al. 2008).

Evolutionary trade–offs have a major impact on a broad range of microbial processes (Ferenci 2016), such as antibiotic resistance, which is a major chal-lenge for today’s healthcare (Rice 2009). Several studies have shown that re-sistance comes with a cost (Andersson and Hughes 2010), and that this cost can be due to a trade–off between removing or having fully functional outer membrane porins (Wang et al. 2002; Pages et al. 2008; Delcour 2009; Tängdén et al. 2013). Closing or mutating outer membrane porins can confer resistance to antibiotics as well as to bacteriophages and toxic compounds (Ni-kaido 2003; Sugawara and Nikaido 2005), but it will also limit access to nu-trients from the surroundings. Highest level of resistance is achieved by re-moval of all porins but fully functional porins are required for optimal uptake of nutrients. Either of the extreme points will be strongly deleterious (lethal effects from exposure to antibiotics, phage, or toxic compound, versus no ac-cess to nutrients, thus bacteria evolve to find the optimum trade–off between the two states.

Three major classes of trade–offs have been identified in bacteria, each act-ing separately or with a combined effect: resource allocation, design con-straint, and information processing (Ferenci 2016).

Resource allocation is controlled by several bacterial mechanisms. In gen-eral, a certain mechanism, protein, or pathway may have a central role in a metabolic pathway (such as utilizing a certain molecule as a source of carbon). Upon changes in the environment (e.g. if a new carbon source becomes avail-able), this role can be evolved to enable the bacteria to continue to survive and proliferate. The change in the focus of the pathway or protein will be a trade–off on the original function (Ferenci 2005). Trade–offs between nutrient–rich and starvation conditions (and other stresses) can to some extent be regulated on the transcriptional level by the use of different RNA polymerases (σ–fac-tors). The transcriptional regulator σS shifts the expression from growth–asso-ciated genes that are dependent on σ70, to genes that are involved in stress–response. Higher expression of σS results in higher tolerance to external stress, but under these conditions, fewer substrates can be metabolized (Nyström 2004; Maharjan et al. 2013). Trade–offs can also be observed between usage of fast but inaccurate metabolic pathways and accurate but slow pathways. Fast pathways are beneficial in a nutrient–rich environment but a more careful usage of nutrients is required in nutrient poor environments (Paczia et al. 2012; Peebo et al. 2015).

At the protein level, there are trade–offs between the function of outer membrane porins for inclusion of nutrients but exclusion of harmful com-pounds and phages. In many evolutionary pathways, an enzyme is involved in adaption to a change in the environment. That particular enzyme can either be accurate or inaccurate in the catalysis of the substrate into product (Tawfik

Page 26: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

26

2014). Accuracy is accomplished through shaping the active site of the en-zyme to become optimal for carrying out a certain reaction, and improvement of catalytic efficiency is selected for in evolution (Beard et al. 2002). On the other hand, inaccuracy is not a selected trait but rather just the result of the fact that beyond a certain point, accuracy can either be costly or not beneficial enough for selection to act further on it (Murugan et al. 2012). Inaccuracy can also provide weak secondary functions in proteins that can become beneficial upon an environmental change, and improvement can be selected as explained by e.g. the IAD model (Khersonsky et al. 2006; Näsvall et al. 2012; Andersson et al. 2015).

It has also been observed that there is a trade–off between protein stability and the evolution of new functions. In enzymes for example, catalytic effi-ciency is largely determined by residues in the active site (Beadle and Shoichet 2002). An active site usually has a negative impact on protein stability as the residues are often polar in a hydrophobic environment and close to other res-idues with similar charges (Baldwin 2008). Furthermore, the residues can have uncomplemented hydrogen bond donors or acceptors (only complemented in the presence of the substrate). An active site can also expose a hydrophobic surface that should be buried to achieve maximum stability (Beadle and Shoi-chet 2002). Changes in an active site that promote stability might be highly deleterious to an enzyme’s function. Changes to the residues outside of the active site are likely more favorable. These can increase tolerance to, and buffer the effect of, otherwise deleterious mutations, transcriptional or trans-lational errors, and tolerance to fluctuating environments in terms of temper-ature, salinity, or redox potential (de Visser et al. 2003; Bloom et al. 2005). Even though stability buffers effects of deleterious mutations and errors, it can also limit the ability of an enzyme to acquire new functions through mutations, as new functions can arise from small changes in a flexible enzyme (Aharoni et al. 2005). On the contrary, stability could also be said to increase the evolv-ability of an enzyme by buffering the effects of mutations beneficial for a new activity that have a negative effect on enzyme stability – an effect seen in both cytochrome P450 and the β–lactamase TEM–1 (Bloom et al. 2006; Bershtein et al. 2008).

Information storage in and processing of genomes is a trade–off all bacteria handle differently. The ability to grow in different environments is stored in the genome and there is a cost associated with maintaining a genome. A large genome costs more to maintain and is more prone to mutation than a small genome, but a small genome reduces the number of environments in which a bacterium can survive (D'Souza et al. 2014). Genome sizes vary between spe-cies, and genome reduction can give a selective advantage when certain bio-synthetic genes are no longer needed, such is the case for endosymbiotic bac-teria living inside other cells (Nilsson et al. 2005; McCutcheon and Moran 2011). Processing of information is also subjected to trade–offs such as the rate versus accuracy of transcription (i.e. RNA polymerases) (Zhou et al.

Page 27: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

27

2013) and translation (i.e. ribosomes) (Ehrenberg and Kurland 1984). The trade–offs present in DNA replication and repair systems include balancing the accuracy of these processes against the cost and growth rate limitations of maintaining a certain level of accuracy (Ferenci 2016). With less accuracy, more mutations can accumulate and provide a greater diversity for further evo-lution. However, as mutations are usually costly, evolvability trades-off with the cost of the mutations.

It is also known that the first mutations for gaining a new function have the greatest impact and that the contributions of new additional mutations de-crease as the number of accumulated mutations increases. These diminishing returns have been seen in a range of different experimental systems, such as metabolic pathways, mutation rates, and development of antibiotic resistance (de Visser et al. 1999; MacLean et al. 2010; Chou et al. 2011; Tokuriki et al. 2012). Trade–offs are not static but rather they vary with changes in the envi-ronment. For example, when a new carbon source becomes available, the tem-perature shifts, or antibiotics are introduced, trade–offs will be shifted towards the current most favorable state.

The fate of a new mutation At strong selection pressure and low mutation rates, a beneficial mutation will be selected for and kept in a population, the existence of neutral mutations will be dependent on the population size, and deleterious mutations will be lost or compensated over time. If the mutation rate increases, not only will the num-ber of beneficial mutations increase, but so will the frequencies of neutral and deleterious mutations. The evolutionary outcome would be dependent on the distribution of fitness effects of mutations and selection pressure. To under-stand and accurately predict the outcomes of evolution, we need data on mu-tation rates, distribution of fitness effects, and the effective population sizes – all of which are dependent on the genetic background of the organism and the studied environment.

Depending on selection pressure and the time scale, different mutational pathways are available. In a recent study, it was shown that evolutionary path-ways at different selection pressures (all ultimately leading to the same evolu-tionary endpoint – high level of antibiotic resistance) can be very different (Wistrand-Yuen et al. 2018). At a low selection pressure at a long period of time, the strains studied adapted a mutator phenotype resulting in roughly 100 different point mutations per genome. Only five of these mutations were needed to provide the highest measurable resistance (Wistrand-Yuen et al. 2018). At high selection pressure and over a short time period, no such mutator strains were observed and the mutations providing high–level resistance were different. The fate of a new mutation is hence highly dependent on the selec-tion pressure and the other mutations present. Genetic variability is driven by

Page 28: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

28

selection pressure and cannot only lead to changes in mutation rate but also to changes in rates of recombination and horizontal gene transfer (Andersson and Hughes 2014).

Beneficial mutations, by definition, increase the fitness of an organism and can be assumed to take over a population rapidly, resulting in sequential fixa-tion of mutations and periods of waiting for the next beneficial mutation to sweep (so called “periodic selection”) (Atwood et al. 1951; Paquin and Adams 1983; Conrad et al. 2011). However, in cases of large population sizes or high mutation rates, different beneficial mutations can co–occur in different indi-viduals within the same population. This creates competition between the al-leles, a phenomenon called “clonal interference” (Lenski et al. 1991; Gerrish and Lenski 1998; de Visser et al. 1999; Miralles et al. 1999; Rozen et al. 2002; de Visser ). As a result, the fate of a new beneficial mutation will not only depend on the fitness effect of that particular mutation but also on the population size and its fitness effect in relation to other clones with other acquired beneficial mutations within the same population. The rate of fixation of the beneficial mutations will hence vary and some beneficial mu-tations might not become fixed at all.

A beneficial mutation leading to a new trait, such as resistance towards antibiotics, can provide survival in a changed environment. However, even though the mutation is beneficial for the changed conditions, it might yield a fitness cost in the original environment (i.e. a trade–off) (Andersson and Hughes 2010). The deleterious effects of a mutation can be compensated for by another mutation in the same gene or somewhere else on the chromosome, so that the combined effect of the mutations is neutral or beneficial (Kimura 1985; Andersson and Levin 1999; Maisnier-Patin and Andersson 2004; Te-naillon 2018). Compensation of the deleterious effects of a mutation enables organisms to retain the new property (antibiotic resistance) and still remain competitive in the environment (in the absence of antibiotic) (Andersson 2006; Andersson and Hughes 2010) (Figure 7).

Compensation can also be accomplished by increasing the expression of certain enzymes, pathways, or chaperones (Gordon et al. 1994; Fares et al. 2002; de Visser et al. 2003; Tokuriki and Tawfik 2009a; Tawfik 2014; Goyal and Chaudhuri 2015). High expression and chaperones can also buffer delete-rious effects on for example enzyme kinetics by allowing for an otherwise deleterious mutation to be introduced into a protein (Gordon et al. 1994; Fares et al. 2002; de Visser et al. 2003; Tokuriki and Tawfik 2009a; Tawfik 2014; Goyal and Chaudhuri 2015). This is known as robustness and defined as the invariance of the phenotype given a change in either the environment or the genome (de Visser et al. 2003) (Figure 7).

Page 29: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

29

Figure 7. Robustness and compensation. By different buffering or compensatory mechanisms (black arrows), fitness effects of mutations or of changes in the envi-ronment (blue circles) can be improved (blue stars). Without this robustness or com-pensation, organisms would be very sensitive to change (environmental or mutation) but with it, organism can tolerate a vast number of conditions and genetic changes, improving the ability to tolerate many different, otherwise deleterious, conditions.

Robustness phenotypes can buffer and mask the effect of deleterious muta-tions or environmental changes on fitness. Environmental robustness buffers against fluctuations in the environment whereas genetic robustness buffers against mutational changes in the genome, the latter enabling mutations to oc-cur without loss of fitness. These can both be accomplished by chaperones (e.g. compensation for increased temperature in the surroundings or deleteri-ous mutations in enzymes), gene overexpression, or gene redundancy in the genome (de Visser et al. 2003; Gu et al. 2003; Stelling et al. 2004). Robustness allows for more mutations to be neutral and enables cryptic genetic variation that can serve as the basis for adaptation to a new niche and thus improve the overall evolvability of the organism (Félix and Wagner 2006; Masel and Trot-ter 2010) (Figure 7).

Per definition, neutral mutations do not affect the fitness of an organism. However, the combination of several neutral mutations can lead to a signifi-cant fitness increase (Wistrand-Yuen et al. 2018). In general, if there are no interactions between two mutations, the combined effect of the mutations should be additive (i.e. 1 + 1 = 2), but in some occasions the combined effect is larger than the individual contribution of the separate mutations (i.e. 1 + 1 = 3). This phenomenon is referred to as epistasis and includes all situations in which the observed phenotype deviates from the sum of the independent ef-fects of the mutations (de Visser et al. 2011). Combined effects that are greater than expected are referred to as synergistic epistasis whereas combined effects that are smaller than expected are known as antagonistic epistasis (Phillips 2008).

Robustness and epistasis are also related to one other. Both positive and negative epistasis can evolve from purifying selection, and in certain cases, robustness mechanisms can cause positive and negative epistasis. For exam-ple, this occurs when chaperones are induced to compensate for the effects of

Page 30: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

30

the accumulation of deleterious mutations (positive epistasis). Genetic redun-dancy allows for one mutation to be neutral but a combination of two muta-tions (in two copies of the same gene for example) to be deleterious (negative epistasis)(de Visser et al. 2011).

The complexity of evolution with robustness and epistatic interactions makes the accumulative effects of a mutation hard to predict.

Fitness landscapes Understanding how a certain genotype yields a certain phenotype, and how that phenotype compares to the organisms in a population (i.e. its relative fit-ness), are central questions in biological research (de Visser and Krug 2014). This genotype–fitness map has a large role in the understanding of causes and course of evolution and has a major role in several evolutionary questions such as speciation, evolvability, evolution of sex, and robustness (Wright 1932). Also known as a fitness landscape, this map is a high–dimensional map with one dimension for every possible mutation. It is usually visualized as a three–dimensional mountainous landscape with different genotypes in the x–y plane and fitness on the z–axis (Figure 8). The complexity of the landscape varies depending on the trait assessed and can be either relatively simple or rather complex (Figure 8).

Figure 8. Fitness landscapes. Genotype–fitness maps can have as many dimensions as there are possible mutations but can be simplified and thought of as a three–di-mensional landscape. Movements in the X–Y plane represent the acquisition of mu-tations (the genotype space) and the Z–axis represents fitness (i.e. phenotype). By acquiring mutations, an organism moves along the fitness landscape and can reach local fitness peaks (maxima) or valleys (minima). The fitness landscape has one global optimum but often several local optima. At a local optimum, the organism risks getting caught as fitness will decrease in all movable directions. From this point, the organism might not reach the global optima. (A) Landscapes can either be relatively smooth with few optima and several neutral mutations or (B) very rugged with many local optima and few neutral mutations.

Page 31: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

31

Movements in this X–Y plane (so called adaptive walks) lead to local fitness peaks (mountain tops) or fitness valleys (Wright 1932; de Visser and Krug 2014) (Figure 8). With a complete picture of the fitness landscape, evolution would be highly predictable as the fitness of every possible genotype would be known. However, this is impossible. Different techniques have assayed ei-ther a large area of the genotypic space at low resolution, or assessed every single possible variant (i.e. high resolution) but only on a very limited subset (such as a set of a few mutations) of the genotypic space (de Visser and Krug 2014). Another approach to this problem is to study parallel evolution in mi-crobes (large scale, low resolution) and select the most interesting mutations and mutation combinations to assay further (small scale, high resolution) (Wistrand-Yuen et al. 2018).

Fitness landscapes illustrate one of the most basic concepts in evolution – fitness is often gained sequentially, and to climb the peak of a mountain it is required to take one step at a time. However, this step–by–step climbing also means that populations risk getting caught at local suboptimal fitness if the landscape is rugged (DePristo et al. 2005) (Figure 8B).

Another challenge with the notion of a static fitness landscape that organ-isms are moving around and trying to increase their fitness within is that fit-ness landscapes are not always static – organisms must survive in everchang-ing environments. The challenge of any organism is, rather than over time climbing a mountain, surviving on an ever–changing landscape (Gillespie 1994). In addition to sequential mutation accumulation, there are also other ways of increasing fitness and gaining new function.

Evolution of new functions When environments change and an organism needs to adapt to survive or take advantage of a new niche, it might need to evolve a new function. To acquire this new function, the organism somehow needs to acquire or create a new gene. This can be accomplished by receiving a pre–existing gene with the re-quired new function from another organism by horizontal gene transfer (me-diated by transformation, transduction or conjugation), evolving it from a pre–existing gene by a duplication–divergence mechanism, or generating it de novo from noncoding DNA (Figure 9). The relative importance of these mech-anisms is dependent on the time frame available and the complexity of the new function required. Horizontal gene transfer is a much faster process but is limited to already existing functions that are transferred between organisms, in comparison to de novo origin and duplication–divergence that can give rise to new functions.

Page 32: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

32

Figure 9. Origins of genes. When the environment changes, organisms need to adapt to new conditions, which often requires the evolution of a new function by evolving or acquiring a new gene. This can occur either by (A) acquiring a gene from another organism by horizontal gene transfer (HGT), (B) by evolution through duplication and divergence of a current gene that can be mutated to yield a new function, or (C) by de novo origin where a new gene evolves from native, previously non–coding DNA.

Horizontal gene transfer The fastest way to acquire a new function is through horizontal gene transfer (HGT). Fully functional genes with a clear role in an organism can be intro-duced into a new organism (Thomas and Nielsen 2005) (Figure 9A). The frac-tion of genes in bacteria that originate from a HGT event ranges from 0–25 % and is highly dependent on the species and lifestyle of the bacteria (Nakamura et al. 2004). Endosymbiotic bacteria with reduced genome sizes tend to have less foreign transferred genes compared to free–living pathogens and extremo-philes. HGT is also one of the major factors involved in the spread of antibiotic resistance (Ambur et al. 2009; Barlow 2009; Juhas et al. 2009).

DNA can be transferred from an organism to another by three distinct mechanisms: conjugation, transformation, and transduction. In conjugation, the DNA of interest is linked (e.g. by sitting on the same plasmid) to genes enabling physical contact with an acceptor cell. A channel is made into the acceptor cell, and the DNA segment is copied into the acceptor. Another mechanism of DNA transmission is through transformation, where DNA in the environment (e.g. through lysis of the donor cell) can be taken up by a recipient cell. DNA can also be moved by phages from a donor cell to a recip-ient cell upon infection (Thomas and Nielsen 2005; Soucy et al. 2015).

There are several mechanistic constraints that limit the transfer in these ways, such as host limitations among conjugative plasmids and phages, host cell defenses against foreign DNA, and the probability of sharing the same growth niche (Thomas and Nielsen 2005). Furthermore, it is usually costly to incorporate foreign DNA into a host chromosome, as it might disrupt existing genes or be involved in a costly metabolic reaction (Lawrence and Hendrick-son 2005; Koskiniemi et al. 2012; Baltrus 2013).

Page 33: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

33

Duplication and divergence of genes encoding promiscuous enzymes Natural environments fluctuate and organisms evolve to cope with these changes. Often a new function is required to survive or to outgrow competi-tors. One way of acquiring a new function is by accumulating mutations that can modify the active site of an enzyme and thereby allow for a new or im-proved chemical reaction. As discussed earlier (Trade–offs, p. 24), such mu-tation accumulation likely comes with a trade–off that may eventually result in the loss of the original function. As the environmental change does not nec-essarily infer that the original function of the gene is no longer required, the duplication–divergence model provides a means of maintaining the old func-tion while evolving a new one. The gene is duplicated, and after evolution the organism will have two genes, one with the original function and one with a new function (Dittmar and Liberles 2010) (Figure 9B). However, it is com-mon that the new function required is already present in an existing enzyme, albeit at a low level and not as the primary function of the enzyme (Jensen 1976; O'Brien and Herschlag 1999; Babtie et al. 2010; Pandya et al. 2014; Newton et al. 2018). Enzyme promiscuity Many modern–day enzymes are highly specialized to carry out a narrow range of reactions with high efficiency. However, this is the result of evolution over a long time period and the primordial enzymes most likely possessed a broad specificity and low efficiency (Jensen 1976; Khersonsky and Tawfik 2010a; Newton et al. 2018). On the contrary, some enzymes such as glutathione S–transferases (GSTs) and cytochrome P450s have evolved into broad–specific-ity enzymes capable of transforming a wide range of different substrates (Khersonsky and Tawfik 2010b). Modern–day enzymes that evolved to spe-cialize in a narrow range of reactions or only one specific reaction are some-times able to carry out a different reaction than what they are used for in the organism (Khersonsky and Tawfik 2010a).

Promiscuity of proteins is common and when the proteins are overex-pressed at selective conditions, these weak secondary activities are enough to result in an increase in fitness (Copley 2003; Khersonsky et al. 2006; Kher-sonsky and Tawfik 2010a). Several cases of promiscuous proteins have been reported and include β–lactamase conferring resistance to other classes of β–lactams (Sun et al. 2009; Adler et al. 2013); HisB, Gph, and YtjC all inde-pendently rescuing SerB knockouts in E. coli (Patrick et al. 2007; Yip and Matsumura 2013); YeaB and ThrB aiding in pyridoxal–5’–phospahte synthe-sis (Kim et al. 2010); and E. coli sugar kinases rescuing a glucokinase mutant (Miller and Raines 2004; Miller and Raines 2005; Larion et al. 2007).

It is prevalent that promiscuous enzyme functions are catalyzed at very low efficiency. However, enzymes can have several promiscuous activities that

Page 34: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

34

can vary in catalytic efficiency depending on the similarities of the substrates (van Loo et al. 2010). This can be assessed by analyzing the different types of bonds broken or formed in the reaction and by the differences in native and promiscuous reaction mechanisms (Bornscheuer and Kazlauskas 2004).

While primary enzymatic reactions typically have kcat/KM values of 105–108 M−1s−1, the variation among the kcat/KM values for promiscuous reactions var-ies greatly and are usually the values very low (O'Brie and Herschlag 2001; Catrina et al. 2007; van Loo et al. 2010).

Enzyme specificity is under strong selection pressure but is only selected to be adequate or sufficient rather than optimal. High specificity comes with a high energy cost for substrate–binding and a lower turn–over rate (Fersht 1999) indicating that rather than being a selected trait, promiscuity is a result of enzymes not being selected for maximum efficiency.

Promiscuous enzymes not only provide new reactions, but they can also rescue the loss of an existent reaction by carrying out the reaction of the lost enzyme (Yang and Metcalf 2004; McLoughlin and Copley 2008).

There are numerous ways that an enzyme can be promiscuous. It could be a result of diverse conformational states of the enzyme active site that allows for various substrates to fit into the catalytic site (Meier and Özbek 2007; To-kuriki and Tawfik 2009b). In some cases, the active site remains the same but the binding of the substrate differs due to dissimilarities in the hydrogen bond-ing network (Theodossis et al. 2004). Another way of providing promiscuous activity is through changes in the protonation state of the catalytic residues (Wang et al. 2003; Poelarends et al. 2004). Within the same catalytic site, dis-tinct sets of residues can also be responsible for the catalysis of unique reac-tions (providing subsites within the same catalytic site) (Yeung et al. 2005; Khersonsky and Tawfik 2006).

Gene amplification and divergence The concept of gene duplication and divergence was first published in the 1930’s by Haldane (1932) and Fisher (1935), was further developed by Ohno (Ohno 1970), and has been refined over the last decades (Jensen 1976; Pia-tigorsky 1991; Hughes 1994; Force et al. 1999; Kondrashov et al. 2002; Bergthorsson et al. 2007; Näsvall et al. 2012).

A gene duplication comes with a cost (Wagner 2005; Stoebel et al. 2008). This can be due to the maintenance of excess DNA, the metabolic burden of overexpressing the duplicated gene, the increased energy requirement of driv-ing an additional chemical reaction, the reduced energy requirement of other genes, or undesirable interactions with other proteins. The individual contri-butions of these are hard to disentangle and will depend on the protein and the system studied. It has been estimated that the cost of amplifications is s = 10−3 per kbp (Pettersson et al. 2008). Given the effective population sizes in bacte-ria and the size of spontaneous duplications, the extra copy is frequently lost through recombination (Pettersson et al. 2008; Reams et al. 2010; Näsvall et

Page 35: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

35

al. 2012). The duplicated gene can also accumulate deleterious mutations which render the protein nonfunctional. Depending on the underlying mecha-nism causing a duplication (e.g. through rolling circle amplification, unequal crossing–over, duplication–inversion, transposable elements), duplication sta-bility relies on the amounts of nearby homologies necessary for homologous recombination (Andersson and Hughes 2009; Hastings et al. 2009; Kugelberg et al. 2010). Most duplications are lost as a result of the cost and instability unless there is a selection pressure for the duplicated copy. In bacterial popu-lations, the spontaneous gain and loss of duplications reaches a steady state at around 10−5–10−2 duplications per cell and gene at different loci (Anderson and Roth 1981; Reams et al. 2010).

Following an environmental change, a weak secondary (promiscuous) function can become beneficial. According to the innovation–amplification–divergence model, this weak secondary function will be under selection for improvement by amplifications and mutations (Figure 10).

To make use of the beneficial secondary function of the protein, selection will favor amplification of the gene (i.e. increased dosage). This increased dosage enhances the weak secondary activity, buffers against any loss in the primary activity (Figure 7), and provides a larger mutational target for bene-ficial mutations (Näsvall et al. 2012) (Figure 10). Under selection for both increasing a new activity and maintaining the original function, mutations can accumulate in the different copies of the amplified gene (divergence). Gene copies with beneficial mutations will be further selected by amplification of that copy and gene copies with deleterious mutations will be lost through re-combination (Näsvall et al. 2012) (Figure 10). The rate of adaption will largely depend on the selection pressure, the availability of beneficial muta-tions, and the magnitude of the fitness advantage of the beneficial mutations (Näsvall et al. 2012; Andersson et al. 2015).

Page 36: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

36

Figure 10. Innovation–Amplification–Divergence model. To adapt to a new environ-mental condition, a new function may be required. Selection pressure can be applied to a weak secondary function of a promiscuous protein that is beneficial in the new conditions. Selection will drive amplification of the gene, enhancing the secondary function and buffering the loss of the original function. Mutations can accumulate and the genes can diverge into separate specialists. With time and improved genes, the copies can segregate, eventually resulting in two distinct genes, one that special-izes in the original function, and one that specializes in the new function.

Page 37: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

37

De novo origin Even though many new functions can be acquired through amplification and divergence, some functions originate from a completely new gene (Figure 9C). However, given the vast number of functions but the limited number of folds it has been proposed that all new genes and functions originate from duplication–divergence events (Ohno 1970). The requirements for acquiring a new function through de novo evolution of a gene are also high, strengthen-ing that argument. A random sequence must first be transcribed (i.e. have an upstream promoter region), then translated (i.e. have a ribosomal binding site, a start codon and a stop codon), fold properly, and then exhibit a function. The resulting polypeptide must also give a fitness benefit high enough and during a long enough time for evolution to act on it so that it reaches fixation in the population (Andersson et al. 2015). It seems unlikely that all factors required for de novo gene birth would arise simultaneously as there would be no posi-tive selection until the gene is fully transcribed and translated. Rather than arising at the same time, studies indicate that de novo originated genes can arise by a certain genomic region first being transcribed as noncoding RNA and then acquiring mutations making the RNA into a translated open reading frame (ORF) (Reinhardt et al. 2013). There is however evidence that expres-sion and ORF can also be formed within the same time frame (Reinhardt et al. 2013). It has also been shown that de novo genes can originate from pre–ex-isting ORFs that acquire transcription (Heinen et al. 2009; Zhao et al. 2014).

Transcription studies of genomes have shown that many loci without pro-tein–encoding genes are transcribed into long noncoding RNAs (Dinger et al. 2008). Some of these long noncoding RNAs contain short ORFs which are bound by ribosomes and translated into a protein (Wilson and Masel 2011; Carvunis et al. 2012). If these ORFs provide a beneficial function, they will be positively selected for and evolve to provide improved function (e.g. through duplication–divergence). There are also examples of bifunctional RNAs that act as regulators or structural stabilizers, as well as encoding pro-teins with an important cellular function (Rastinejad and Blau 1993; Chooniedass-Kothari et al. 2004; Kloc 2005; Jenny et al. 2006; Dinger et al. 2008).

De novo genes do not always emerge from previously non–coding regions. A novel ORF can originate by overprinting of another gene. A pre–existing gene can acquire point mutations that give rise to a new ORF that can be tran-scribed and translated and have a separate function (Delaye et al. 2008). A gene can also lose a stop codon in a different frame, allowing for an overlap-ping gene to be translated from the same mRNA. Similarly, a start codon can be shifted, again providing a different gene to be translated from the same mRNA (Delaye et al. 2008). The process is well–studied in viruses and a com-mon mechanism of acquiring new viral genes (Sabath et al. 2012; Pavesi et al. 2013) but is not as common in prokaryotes (Delaye et al. 2008).

Page 38: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

38

Although protein coding genes can emerge de novo, there is still a major challenge to overcome. These novel proteins need to fold properly to be func-tional. Studies examining functionality of randomly generated polypeptides have shown that functional proteins are rare. In a study of ATP–binding pro-teins, it was estimated that randomly generated polypeptides that fold and function (i.e. can bind ATP) emerge with a frequency of about 10−11 (Keefe and Szostak 2001). In another study of streptavidin–binding proteins, the fre-quency of functional proteins generated from a random library emerges at a frequency of 10−12 (Wilson et al. 2001).

Experimental evolution An inherent problem with studying current states of bacterial systems and mechanisms is that we cannot study the intermediate states of the natural evo-lution but only the end–points. To better understand how the evolution of a certain enzyme occurred in nature, we can use experimental evolution to fol-low evolution in real–time and study every intermediate on the path to a new function for example (Peisajovich and Tawfik 2007; Bloom and Arnold 2009; Tracewell and Arnold 2009).

Experimental evolution is defined as the study of how experimental popu-lations adapt as a response to conditions defined by an experiment (Kawecki et al. 2012). Experimental evolution is limited by the resolution of fitness measurements as compared to fitness differences selected for or against in na-ture. Evolution can also be a very slow process and not all types of evolution-ary questions can be tested with experimental evolution as a result of time requirements. Nevertheless, experimental evolution can still be a powerful tool to better understand nature, test evolutionary hypotheses, and study prob-lems for human society, such as the development and spread of antibiotic re-sistance (Conrad et al. 2011; Jacquier et al. 2013; Gullberg et al. 2014; Tenail-lon 2018).

One experimental evolution method is batch cultivation. Parallel cultures with microbial cells are allowed to grow in laboratory medium and a sub–population is transferred to fresh medium on a regular basis, often daily (called serial passage) (DePristo et al. 2005). The transfer can be carried out for a set number of generations, until a certain trait has emerged, or until fitness has been gained (or fully restored). Whole populations or isolated clones can be sequenced by whole genome sequencing (WGS) and mutations of interest can be re–constructed in a wild–type background (Wistrand-Yuen et al. 2018). Re–constructed strains can be used to disentangle the contributions of single or multiple mutations to an observed fitness change or a new trait.

The number of cells transferred at each occasion will also determine the outcome of the experiment as small bottlenecks increase the chance of changes due to random genetic drift (Kibota and Lynch 1996). Another factor

Page 39: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

39

affecting evolutionary outcome is the mutation rate. Evolution and the im-provement of fitness is dependent on mutations, and with a low mutation rate little to no adaption will occur. In the case of serial passaging, a low mutation rate makes the probability of a beneficial mutation very low. On the other hand, as most mutations are deleterious, a high mutation rate allows for the accumulation of many deleterious mutations and a rapid decrease in fitness over the course of the experiment (Kibota and Lynch 1996). The size of the beneficial fitness effect is also important, as clonal interference lowers the probability that mutations with very small fitness effects are passed on to the next round upon passaging.

As bacterial populations are large the probability of random genetic drift during serial passaging of bacterial populations is low. Thus, mutations with a high fitness effect will accumulate in the population. Furthermore, this in-creases the probability of transferring beneficial mutations in the next passage.

As the serial passaging is carried out in laboratory medium, there is a risk that the bacteria adapt not only to the stressful conditions being studied, but also to the medium used. When analyzing the mutations from WGS, some mutations will be an adaption to the intended condition of interest (such as developing antibiotic resistance) and some mutations will be due to adaption to the growth media (Gullberg et al. 2011). Adaptive mutations to the growth medium are often beneficial and the mutations can sweep the population. The beneficial effect from mutations conferring, for example, antibiotic resistance might be lower and these mutations could potentially be lost in the population, resulting in a different experimental outcome than intended. A solution is to perform evolution experiments with strains that have been pre–adapted to the growth–conditions used.

As the spontaneous mutation frequency is low, the frequency of the emer-gence of any new traits will also be low. At times when evolving a new trait is the experimental goal (such as survival on antibiotics above the minimum inhibitory concentration (MIC) or growth in a new environment), an increased mutation rate can be useful to increase the probability of beneficial mutations (Maisnier-Patin et al. 2005). Overexpression of an error–prone DNA polymer-ase allowing for more than a 200–fold increase in mutation rate can increase the pool of mutations available for selection and thus increase the probability of finding a beneficial mutation (Maisnier-Patin et al. 2005). The overexpres-sion is usually carried out by regulating the error–prone DNA polymerase with an inducible promoter. The inducing agent can be added during an initial growth phase to allow for the accumulation of many mutations and then re-moved when selecting for and assessing fitness in the new environment of interest (Tracewell and Arnold 2009). As this generates several mutations per cell, there is a reasonable probability of finding mutations that enable the or-ganism to survive in the new environment.

If the study is limited to evolutionary questions regarding a single protein, mutations could be introduced directly into that protein rather than to the

Page 40: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

40

whole chromosome. This is referred to as directed evolution (Dougherty and Arnold 2009). The protein can be mutagenized by several rounds of error–prone PCR to introduce mutations whose impact on fitness can be studied. In this method, all mutations affecting fitness will be localized to a single gene, and conclusions can be made regarding the function and structure of the pro-tein, the importance of the protein in the cell, the general fitness effect of mu-tations in the protein in the organism or the ability of the protein to provide a new function (Nannemann et al. 2011). This procedure provides a way to mimic the mutations–selection procedure of natural evolution but in a much faster and more efficient way (Meysman et al. 2013).

Salmonella enterica and Escherichia coli as model organisms The organisms used in this work are the two bacterial species Escherichia coli K–12 MG1655 (designated E. coli in text) and Salmonella enterica serovar Typhimurium str. LT2 (designated S. enterica in the text). These two species are both members of the Enterobacteriaceae family and their common ances-tor lived about 100–160 million years ago (McClelland et al. 2001; Riley et al. 2006). They are both facultative aerobic, rod–shaped Gram negatives and both have genome sizes of about 5 Mbp, each containing approximately 4,500 and 4,600 genes (E. coli and S. enterica respectively) (Meysman et al. 2013) out of which 2,900 are thought to be orthologous (Meysman et al. 2013). S. enterica was the sole organism used in Papers I, II and IV whereas Paper III made use of both S. enterica and E. coli.

E. coli K–12 is a commensal bacterium and a natural inhabitant of the hu-man (and other warm–blooded organisms) lower intestine (Riley et al. 2006; Meysman et al. 2013). It is commonly used as model organism for bacterial evolution and also broadly used in biotechnology (Bachmann 1996). It was originally isolated from a convalescent diphtheria patient in Palo Alto, Cali-fornia but has since then been subjected to X–rays, UV radiation, and acri-dinorange, which heavily mutated the original genome but allowed for loss of phage lambda and cured it of the F plasmid present in the original strain (McClelland et al. 2001; Meysman et al. 2013). E. coli K–12 grows well on standard laboratory media and has been cultivated in laboratory conditions since its isolation.

S. enterica is also well suited for laboratory work. It has not been mutagen-ized (except for adaptation to laboratory conditions during the years after iso-lation) and allows an easier way of moving genetic material between strains by transduction as compared to E. coli. S. enterica is a facultative intracellular pathogen that causes gastroenteritis in humans (Kröger et al. 2012) and it has

Page 41: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

41

been used as a model organism for more than 50 years to study bacterial evo-lution (Riley et al. 2006; Kröger et al. 2012; Meysman et al. 2013).

The physiology, genetics and biochemistry are all well studied in both spe-cies and they are also appreciated for the ease of culturing as well as the fast growth rate (de Visser and Krug 2014). There are also a wide range of molec-ular tools that can be used with the two species, such as phage transduction and lambda red recombineering.

Page 42: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

42

Present investigations

Paper I How a certain genotype relates to a certain phenotype is of fundamental im-portance in biological research (de Visser and Krug 2014). A part of compre-hending this relationship is to understand how a certain mutation (genotype) affects the fitness of the organism (phenotype). The sum of all these effects can be seen as a map over a fitness landscape where the beneficial mutations are hills and deleterious mutations are valleys (Yue et al. 2005; Eyre-Walker et al. 2006). This map over the landscape can aid our knowledge of the devel-opment of disease (Silander et al. 2007), the healthy population size for an endangered species (Wright 1932), and speciation and the evolution of sex (Höcker et al. 2001).

Experimental setup In this paper, we studied the effects of mutations (i.e. the distribution of fitness effects) in the biosynthetic enzyme HisA in Salmonella enterica to gain a bet-ter understanding of how different mutations affect fitness. The (βα)8 fold of this enzyme is found in approximately 10 % of all known enzymes (O'Brien and Herschlag 1999; Copley 2003; Bornscheuer and Kazlauskas 2004; Kher-sonsky and Tawfik 2010a; Pandya et al. 2014) indicating that findings for HisA can potentially be applied to may enzymes. HisA is conditionally essen-tial and the function of the enzyme can be set to be limit the growth of the organism. Mutations were introduced by error–prone PCR and the gene was placed on the chromosome under control of a strong constitutive promoter. In total, the fitness effects of mutations (ranging from 1–10 mutations per hisA gene) for 510 different gene variants were determined. Eighty–one of these variants had a single amino acid substitution and these were analyzed further at lower expression conditions. Under these conditions, in silico fitness pre-dictions were correlated to the experimental data and the link between residues crucial for protein function and the measured DFE was determined. We also examined how the accumulation of several mutations affected fitness and how epistatic effects influence overall. We were able to relate protein fitness to organismal fitness and determine how protein levels affect this relationship.

Page 43: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

43

DFE of all mutants Studying the DFE of all 510 mutants, we found that the distribution was bi-modal at high expression levels, with one mode at neutral and one at lethal. The neutral mutations mode (i.e. mutations without any measurable fitness difference compared to wild–type) encompassed 44 % of the mutants, whereas the lethal mode encompassed 30 % of the mutants. The remaining 26 % of the mutations ranged from having a minor deleterious effect to nearly lethal ef-fects. In total, the average fitness cost per nonsynonymous mutation was 20.5 % and 10.5 % when including and excluding lethal mutations, respectively.

DFE of single amino acid substitution mutants When examining the 81 single amino acid substitution mutants at a high ex-pression level, the distribution was unimodal with the mode at neutral fitness and a tail of mutants with various deleterious effects. This DFE is most likely due to the robustness of HisA brought about by high expression. High expres-sion allows for reduction in specific enzymatic activity (protein fitness) with-out loss of growth rate (organismal fitness). Deleterious effects on protein fit-ness would hence not affect the organismal fitness until the total enzymatic activity (specific enzymatic activity × enzyme concentration) falls below a certain threshold. This indicates that high expression can buffer the deleterious effects of mutations and result in no measurable fitness effects on organism level even though the enzyme activity is less than wild–type.

To remove the buffering effect of expression, we lowered the expression level of the enzyme to make HisA function limiting for the growth rate and subsequently studied the DFE of all 81 single mutants under these conditions. At low expression levels, the DFE was bimodal with one mode at lethal and one broad mode centered at 20 % fitness cost.

HisA results compared to other systems In HisA, no beneficial mutations were found at any expression level. Benefi-cial mutations have been reported for various proteins at various conditions but are lacking for HisA. The fractions of lethal mutations have been reported to vary extensively among proteins and the fraction in HisA was is in line with previous studies. The amount of neutral mutations in HisA was low compared to other studies.

The variations reported in literature stress the importance of using highly sensitive assays to distinguish between fitness effects and the fact that protein expression must be rate limiting when studying DFE and measuring fitness via organismal growth. To obtain a more biologically relevant readout of how a protein function directly impacts organismal fitness, the protein of interest should be studied in its native locus and expressed under its native conditions.

Page 44: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

44

Epistatic effects between mutations Analyzing the whole set of 510 mutations, fitness is expected to decline expo-nentially as mutations accumulate. If, however, there are epistatic interactions between mutations, the fitness decrease would deviate from the exponential function. When synonymous mutations were excluded from the analysis, there were epistatic effects between the mutations on average, but when synony-mous mutations were included, no epistatic effects could be observed on av-erage.

Predicting fitness effects To be able to forecast evolutionary trajectories, one needs to be able to accu-rately predict fitness effects of single mutations in an enzyme. We used 40 different tools and measures of biophysical and phylogenetic properties of proteins to predict the fitness effects of mutations in HisA and assess the ac-curacy of the predictions. The phylogenetic methods used were substitution matrices and phylogenetic conservation scores. The biophysical properties an-alyzed for the mutated amino acid were the shortest distance to the enzyme substrate, relative accessible surface area, effect on the free folding energy difference, and melting point changes. In addition, the effect on RNA folding free energy difference and hybrid/machine learning methods were used. None of the predictors was able to accurately predict the fitness effect of single mu-tations, but a linear combination of predictors significantly improved the pre-dictions.

Conclusion The shape of the fitness distribution is determined by the extent to which high expression levels mask the effects of mutations on the function of the protein.

On average, the combined deleterious effects of mutations were greater than expected for their individual contributions, thus demonstrating negative epistatic interaction between the mutations.

No single fitness predictor could accurately predict the fitness of a certain mutation. A combination of several different predictors significantly im-proved the accuracy of predictions.

Page 45: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

45

Paper II To survive in a new environment, organisms must adapt to the new conditions. This might require acquiring a novel function, such as the utilization of a novel carbon source. To acquire this new function, organisms may make use of a preexisting but weak and promiscuous function in a native enzyme (Gullberg et al. 2011; Toprak et al. 2011; Miller et al. 2013). Continued evolution could then take place by amplification and divergence of the promiscuous enzyme, ultimately leading to a new enzyme with improved function in the new trait (O'Brien and Herschlag 1999; Khersonsky et al. 2006; Khersonsky and Taw-fik 2010; Näsvall et al. 2012; Pandya et al. 2014; Andersson et al. 2015). The trade–off between the new and the original activity of the enzyme will dictate the timing, size and frequency of the required amplifications (Khersonsky et al. 2006; Soskine and Tawfik 2010; Andersson et al. 2015). The trade–off also determines what mechanisms of adaption are feasible. Weak trade–offs would remove the need of amplifications whereas severe trade–offs on the original activity require compensatory mechanisms such as gene amplifications (Adler et al. 2015; Andersson et al. 2015).

We used S. enterica HisA to determine the trade–offs between its original function and acquiring a new (TrpF) activity. HisA is an essential part of the L–histidine biosynthesis pathway while the TrpF enzyme is an essential part of the L–tryptophan biosynthesis pathway. These two enzymes share the com-mon (βα)8–barrel fold and carry out the same type of catalytic reaction on similar substrates. We identified 15 unique mutations in hisA, all of which provide TrpF activity, but only a fraction retain HisA activity. Eleven of these mutations, and a previously reported variant were chosen for continued evo-lution towards TrpF activity and the trade–offs were determined.

Evolving increased TrpF activity 2.2×104 mutated clones of hisA were screened for TrpF activity and 73 clones were found. Of these, 11 retained HisA activity while the remaining clones lost all HisA activity, demonstrating the frequent loss of HisA activity upon evolution of TrpF activity. The TrpF activity varied among the clones that retained HisA activity, but an even greater disparity in TrpF activity was ob-served among the clones that lost all HisA activity.

Many of the isolated clones had several mutations and we re–constructed all single amino acid substitutions that occurred more than once. Fifteen mu-tations were found to yield TrpF activity and one additional was known from previous work (Näsvall et al 2012). Of these, 13 clones were found to have lost all HisA activity.

In summary, this shows that there are strong trade–offs between acquiring TrpF activity and retaining HisA activity. Furthermore, there are many

Page 46: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

46

pathways towards TrpF activity, demonstrating the versatility of the (βα)8–barrel fold to evolve new functions.

Improving TrpF activity Of the 16 mutants, we selected 12 as a starting point for continued evolution (henceforth referred to as 12 lineages) towards improved TrpF activity and determined the shape of the trade–off curve for these separate lineages. The clones were improved in three steps, at which the clones from the previous step were mutagenized and the fittest clones were analyzed. In general, the fitness increased for each step but the variation and maximum of the deter-mined fitnesses differed between the lineages.

During the step–wise evolution, the fittest clones in each step did not al-ways yield the fittest clones in the subsequent step. The improved clones in each step were often based on suboptimal clones from the previous step.

Upon repeating some of the evolution steps for some of the lineages, the fitness among the new sets of clones resembled the fitness among the original ones. However, the isolated clones in each step were not always the progeny of the same clones as the isolated clones in the original isolation of improved clones. Thus, there is a vast number of possible mutations resulting in in-creased fitness. We also observed epistatic effects between the mutations dur-ing the step–wise increase of fitness.

The HisA activity was commonly lost as TrpF activity was improved. No starting mutant lacking HisA activity regained that activity. This is in stark contrast to previous findings indicating that HisA is very robust at high ex-pression levels.

Expression levels affect the shape of the trade–off curve In a previous study, protein expression levels were shown to influence the distribution of fitness effects (DFE) (Paper I). Therefore, all re–constructed hisA variants providing TrpF activity were also put under control of an induc-ible promoter and tested at high and low expression levels. The resulting data was then compared to that from the experiments using the original constitutive promoter. With the original promoter, three variants had both HisA and TrpF activity, of which one had a HisA fitness lower than wild–type levels. At high expression from the inducible promoter, all variants increased in TrpF fitness. None the three hisA variants with both HisA and TrpF activity had any meas-urable fitness cost in HisA activity and no variant originally lacking HisA ac-tivity gained in this activity. At low expression, the HisA activity of the vari-ants with double activity varied but all three still had an activity lower than wild–type levels.

The trade–off between original and new activity is hence highly dependent on the expression level of the protein. Thus, the up–regulation of enzymes can

Page 47: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

47

aid in the evolution of novel activities and buffer deleterious effects of muta-tions.

Mutational pathways The mutations that led to the increase in TrpF activity differed between the lineages. During evolution towards increased TrpF activity, a total of 848 amino acid substitutions were observed. Of these, 16 amino acid substitutions were observed six or more times in the different isolated clones but the most common mutation found in each lineage differed between lineages. Many of the mutations occurred in amino acids that interact with either of the two phos-phates of the HisA native substrate. These interactions are thought to be im-portant for positioning of the substrate. Mutations in these residues could im-prove the positioning of the TrpF substrate. Some mutations occurred in posi-tions with no known interaction with the HisA substrate and were thought to affect the enzyme structure, and thereby improving TrpF substrate binding. Several single mutations known to provide TrpF activity were also observed during the step–wise evolution.

Conclusions In conclusion, several mutations and mutational pathways lead to improved TrpF activity. HisA activity was on rare occasions possible to maintain at poor TrpF activity. Complete loss of HisA activity was observed in all strains that gained high TrpF activity.

The mutational pathways differed depending on starting point but numer-ous mutations were found in several lineages. Some mutations improved TrpF activity on their own and others did so only in a certain mutational back-ground, suggesting epistatic interactions between mutations.

In contrast to previous studies on other proteins, evolution towards in-creased TrpF activity conferred strong trade–offs on the original HisA activ-ity. The fitness was highly dependent on hisA expression levels and expression levels are thus an important determinant for the shape of the trade–off curve and the paths available during continued evolution.

Finally, the fittest mutation in each step of the evolution experiments sel-dom gave rise to the fittest mutants in the succeeding step. Thus, the fitness landscape is rugged and epistatic interactions between mutations determine the fitness.

Page 48: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

48

Paper III Laboratory evolution experiments have improved our understanding of evo-lution of antibiotic resistance (Tenaillon et al. 2012; Lang et al. 2013; Tenail-lon et al. 2016), convergent and parallel evolution (Blaby et al. 2012) and in-dustrial innovations (Orr 2005; Dettman et al. 2012; Tenaillon et al. 2012). All of these studies have been dependent on evolving a new phenotype, whole genome sequencing of the resulting strains, and determining the genotype re-sponsible for the observed phenotype.

Mutations leading to increased fitness in the studied trait will be acquired during evolution. There is also a selection for mutations improving fitness in the growth conditions used, such as standard laboratory media. To disentangle the mutations arising from the intended selection pressure and those arising from adaption to the growth conditions used, we studied the adaptations to commonly used laboratory media.

Adaptation to laboratory media We selected for adaptive mutations in E. coli and S. enterica in four liquid laboratory media; two complex media (lysogeny broth (LB) and Mueller Hin-ton broth (MH)) and two minimal media (M9 minimal medium supplemented with either 0.2% glycerol (M9gly) or 0.2% glucose (M9glu)). The strains were cycled for 500–1,000 generations.

Whole genome sequencing revealed 138 unique mutations in E. coli and 83 mutations in S. enterica. About half of the mutations observed in each spe-cies were amino acid substitutions. The mutations could be grouped into two categories: those that directly affected resource utilization and those that con-ferred a more global effect on gene expression.

Little mutational overlap was found between the two species evolved in the same media. However, within the same species, the same genes were mutated under several growth conditions. The same nucleotide mutation within the same gene in parallel independent lineages is uncommon (Barberán et al. 2014; Gilbert et al. 2014; Manoharan et al. 2015; Anantharaman et al. 2016; Locey and Lennon 2016). Several of the mutations observed in this study have been previously reported, indicating parallelism in the evolution. On the con-trary, some mutations from previous studies using similar conditions were not found, thus showing that subtle variations in growth conditions and genotype of the starting strain (even within the same species) affects the selection of mutations.

Coupling genotype to phenotype To elucidate the importance of single mutations, mutations appearing in sev-eral lineages were re–constructed alone or in combination with other

Page 49: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

49

mutations in the parental non–evolved background. The fitness of the re–con-structed strains was assessed by competition experiments (measuring fitness over the whole growth curve) and by exponential growth rate assays (examin-ing only the exponential growth rate). In most cases, fitness increased for the re–constructed strains containing a single mutation. In general, fitness also increased for the re–constructed strains containing combinations of several mutations. We sometimes observed an increase in fitness of strains carrying two mutations with no discernible individual fitness advantage, thus suggest-ing epistatic effects between mutations.

The increase in fitness differed between the conditions used. In minimal media, the effects were generally larger and more varied than in complex me-dia and most effects were observed on exponential growth rate. In complex media, there was generally no measurable increase in exponential growth rates but rather improvements in lag or stationary phase. This is likely due to the two species being more pre–adapted to the complex media than to the minimal media.

Adaptations to different media For E. coli grown in LB, we observed several mutations to nutrient uptake systems resulting in, for example, increased uptake of glutamic acid and al-lowing for it to be utilized as a carbon source. In S. enterica, mutations affect-ing the motility (such as mutations in flagella) were commonly found as well as mutations restoring the function of the trehalose uptake system. Restoration of this system also decreased the standard deviation up to 10–fold when com-peting two restored mutants against each other.

Neither E. coli or S. enterica can utilize starch as a carbon source. However, S. enterica evolved in Muller Hinton (MH) medium that is rich in starch ac-quired several mutations in genes involved in the utilization of starch degra-dation products, suggesting that simpler oligosaccharides are formed from starch degradation during preparation of MH.

The M9 minimal medium is an inorganic sodium phosphate buffer with essential trace elements present only as a result of component impurities and is usually supplemented with either glucose or glycerol as the sole carbon source. These conditions are quite unlike the rich and complex environment in animal guts where both E. coli and S. enterica are commonly found. In M9 supplemented with glycerol, glycerol utilization was improved in both spe-cies. In M9 supplemented with glucose, no overlap of mutations between the species could be found.

Conclusions There is a vast number of possible mutations the two species can acquire to adapt to the different media tested. The pathways towards increased fitness in

Page 50: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

50

the media are very different with few overlaps between mutations in the dif-ferent conditions and species. The majority of the tested mutations were ben-eficial, and fitness increased when several mutations were combined. Fitness measurements as measured by exponential growth rate and by competition experiments were similar in minimal media conditions, but differed in com-plex media.

Page 51: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

51

Paper IV Microorganisms can thrive in many different environments. This requires the ability to utilize a diverse set of metabolites (Bhattacharya et al. 2003; Copley 2003; Nielsen 2003; Fani and Fondi 2009; Nobeli et al. 2009; Khersonsky and Tawfik 2010b). Promiscuous enzymes and coupled biochemical pathways in-crease the metabolic repertoire of an organism (Sokurenko et al. 1999; Eisen-reich et al. 2010; Bianconi et al. 2011; Rohmer et al. 2011; Fuchs et al. 2012). The ability of an organism to evolve new metabolic capabilities dictates its ability to colonize new environments. This also depends on the constraints and trade–offs that accompany the novel metabolic capabilities. Understanding these capabilities can improve our knowledge of how pathogens can colonize human hosts (Gutnick et al. 1969).

Novel metabolic capabilities In this study, we examined if S. enterica could evolve to utilize a non–native carbon compound as sole carbon source. We selected a set of 124 compounds known not to be used as a carbon source for S. enterica (Gutnick et al. 1969) and tested a large mutagenized population of S. enterica for growth with these compounds as the sole carbon source. The large population size allowed for every single possible point mutation to be examined.

Mutational trajectories S. enterica gained ability to utilize 18 of the 124 (14 %) tested compounds as the sole carbon source. In four of these cases, we identified that a single mu-tation caused the phenotype and showed that the activation of cryptic operons and re–wiring of metabolic pathways provided a path towards these novel phe-notypes. In two cases, no mutational overlaps were found between clones able to grow on the same compound suggesting multiple available routes for utili-zation of certain non–native compounds. Furthermore, mutants with a novel phenotype not transferable by P22 transduction to a wild–type background suggest that multiple mutations were required for the utilization of the specific carbon source.

Re–wiring metabolic pathways Some carbon sources are not utilized due to physiological constraints within the cell. One such constraint we found was the utilization of isoleucine. This compound can be degraded into acetyl co–A by S. enterica to be used for gen-erating other carbon compounds within the cell. The production of isoleucine utilizes the same enzymes required for valine and leucine biosynthesis, thus excess isoleucine can inhibit the production of valine and leucine, resulting in

Page 52: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

52

starvation of these two amino acids. In the strains that overcame this physio-logical constraint, we observed mutations in a protein that is directly involved in this feedback inhibition, most likely relaxing the inhibition and allowing for synthesis of valine and leucine in the presence of excess isoleucine.

Trade–offs during acquisition of novel phenotypes The evolution of novel phenotypes can be limited by physiological con-straints, but it can also be limited by trade–offs between the native and the new abilities. Mutations allowing utilization of isoleucine as carbon source result in the inability to grow on poor carbon sources like acetate and citrate. Mutations allowing growth on threonine as sole carbon source result in de-creased growth rate (with respect to the ancestor) on minimal media with glu-cose as sole carbon source.

Conclusions Our study highlights the vast possibilities of organisms to acquire novel met-abolic capabilities from point mutations. We show that cryptic operons can be a source for the acquisition of new functions and that removal of physiological constraints by mutations allows for evolution of novel capabilities. Further-more, we show that trade–offs constrain the evolution towards novel capabil-ities.

Page 53: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

53

Concluding remarks

One of the most fundamental questions in genome biology is how new genes and novel functions have originated. The evolution of new genes and functions enables bacteria to adapt to new selection pressures and environments, to col-onize new niches, and to increase in complexity.

The acquisition of novel metabolic capabilities was studied in different ways in Papers II, III and IV and was shown to be caused by single point mutations in S. enterica (Papers II and IV). In Paper II, this occurred through the introduction of a new activity into an already existing enzyme. In Paper IV, novel capabilities were achieved through increased expression of relevant genes by mutations in regulatory proteins. Even though they were not novel, metabolic capabilities were significantly improved in Paper III by effects on global gene expression.

Differences in gene expression affected the metabolic capabilities of S. en-terica in Papers III and IV. The gene dosage was also important for the over-all effect of mutations on fitness. In Paper I, we show that mutations that are very deleterious to the activity of the protein can still appear neutral when measuring fitness of the organism. Elevated expression levels can buffer the effect of mutations on protein activity. This can potentially aid in the evolution of new genes or functions by allowing more mutations to be tested in the or-ganism without fitness trade–offs. These buffering mechanisms were also seen in Paper II, where certain mutations in S. enterica HisA that resulted in a novel activity were buffered by high expression such that no measurable effect on the native activity of the enzyme could be detected. The fitness as measured by the amount of novel activity was also heavily affected by the expression level, ranging from no measurable activity to high, but suboptimal activity.

It is well known that epistasis is a common phenomenon during evolution (Bershtein and Tawfik 2008; Phillips 2008; de Visser et al. 2011; Perfeito et al. 2011; Jiang et al. 2013; Chou et al. 2014; Bank et al. 2016; Li et al. 2016; Sarkisyan et al. 2016; Wistrand-Yuen et al. 2018) and epistasis was observed in all papers in this thesis. In Paper I, it was shown that during the accumula-tion of random mutations, there are epistatic interactions between the muta-tions. In both Papers II and III, epistasis had a significant positive effect on the development of new capabilities. In Papers II and IV, the suggested epi-static interaction had an effect on the trade–off between evolving a new func-tion while maintaining the original function.

Page 54: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

54

In conclusion, I have examined different pathways of evolution towards new functions, described the mechanisms behind them, and revealed con-straints and difficulties associated with such evolution.

Page 55: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

55

Future perspectives

In Paper II, we studied the trade-offs during evolution of the biosynthetic enzyme HisA upon acquiring and improving novel TrpF activity. In S. enter-ica, the biosynthesis of L-histidine and L-tryptophan is divided between the HisA and TrpF enzymes and their respective operons. However, in Actino-bacteria, there is no specialized TrpF enzyme. Instead the phosphoribosyl iso-merase A (PriA) can catalyze both the reactions of HisA and TrpF. In Paper II, we were able to isolate several clones that were capable of performing both reactions and one of these had a Q18R mutation. In a previous study, a dupli-cation of amino acids 13-15 (dup13-15; amino acids VVR) also introduced TrpF activity in HisA, at the expense of all HisA activity. This elongated a flexible loop and positioned an arginine (R) in position 18, making it a variant of the single Q18R mutation. HisA activity was restored upon acquisition of a D10G or G11D mutation in the dup13-15 duplication. The dup-13-15 creates a longer loop 1 in HisA that clashes with loop 5 and as a result, amino acids important for the HisA reaction are unable to interact with the substrate (Söderholm et al. 2015; Newton et al. 2017). As this clashing does not occur in the Q18R mutant, it retains the HisA activity. In another study, it was also found that for the dup13-15 variant, a valine in position 16 affects both HisA and TrpF activity of the dup13-15 variant (Erdélyi 2016). Other changes in direct vicinity of the duplicated amino acids were also shown to change the activities. How these subtle amino acid changes at certain positions modify the function from a HisA specialist to a TrpF specialist, restores HisA activity in a TrpF specialist, or provides a promiscuous enzyme available for further evolution towards increased TrpF activity remains unclear and would be in-teresting to investigate.

In Paper II, we also observed that during step-wise evolution towards in-creased TrpF activity in the HisA enzyme, the most fit clone only gave rise to improved clones in roughly half of the studied cases. In the other instances, a less fit clone(s) gave rise to all of the variants with improved TrpF activity in the following step, indicating epistasis. We observed that a few mutations were able to increase fitness in several different lineages, while other muta-tions were specific to a particular lineage. The mechanisms underlying these improvements and how a particular combination of mutations alters the HisA structure and improves the TrpF activity is beyond the scope of Paper II and would be interesting to examine further by kinetic measures and structural studies.

Page 56: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

56

Acknowledgements

First and foremost, I would like to thank Dan. Your passion is truly inspiring and I am grateful for your guidance. Under you lead, I have grown both as a person and as a scientist. To my main co-supervisor Jocke – it is no secret that Dan is not always around and especially not doing lab work. Given this, it has been invaluable to have your feedback on a regular basis and especially in the lab. You have been a great supervisor in many different ways. A big thank you for enjoyable collaborations goes to my co-authors Po, Lionel, Anna, Michael, Lisa A, Ulrika, and Omar. I would not have been able to do all the statistics and coding in R without you Lionel. To Diarmaid, my other co-supervisor. You have always given me great ad-vice and thoughtful suggestions for improvements to my work which I appre-ciate. Erik W-Y, we have had a lot of fun inside and outside of the lab. It was great fun working with you with iGEM and I am grateful that you designed such a nice front cover for me. Po, always happy and spreading joy. Starting as my student, you soon also became my friend. You are truly a unicorn. Greta, I am glad that you joined the corridor. You are a very nice person and office mate, and a true friend. Your cat pictures are invaluable in tough times in the office and I really think that you should get a cat. Marius, I have been jokingly calling you a horrible person but as you know there is no (or maybe just a little) truth to this. You have always listened to my complained and we have had a lot of fun together. Pikkei, it all started with iGEM and over the past years you have been really helpful with all kinds of things, especially python programming and calculations. I hope to repay you by being as helpful next time we play Carcassonne. The lab environment would be so much duller without your laughter Hind and I wish you the best of luck with our common favorite protein. The DA lab would not function without Ulrika taking care of many administrative things. I also appreciate your efforts in our common strive in the IMBIM board to stop what we think is bad for students and the university. Arianne, thanks a

Page 57: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

57

lot for reading my thesis and providing invaluable feedback. For the same rea-son I am grateful to you Jenny. I also think that you will do a great job in the IMBIM board in the future, representing PhD students and fighting despicable proposals. Eva, you did a great job in the IMBIM PhD association and I am glad that we worked together to make good and fun things for PhD students at IMBIM. Now you have become a doctor Lisa A and I hope to follow in your footsteps and become the next doctor in the DA group. Hervé, you set a really nice tone in the lab with your whistling and you are a great photogra-pher. Michael I am looking forward to having my pirra returned. To Karin – thank you for all our interesting late evening discussions in the lab. Fredrika and Linnéa – we did a great job supporting each other during tough times in the ISR course. Roderich, welcome back to Uppsala and the corridor. I hope you get a chance to further improve your Swedish aside from doing great sci-ence. Doug, you have been a great lunch company over the years and it was fun watching GoT with you and others at yours and Bethany’s place. Dennis and Isa – I am confident that IPhA will stand strong and do remarkable things when you are a part of the board. Susan, we had a lot of fruitful discussions and a fun time at Noors castle. Serhiy, if you ever leave science, you should try a career as an actor. Your contribution to the spex movies has not passed unnoticed for anyone. Lisa All – it was a lot of fun supervising you. Vicky and Lisa C – you really showed the way for the rest of the team and had a major role at our win in the music competition at the last IMBIM day. Omar, you are a bright scientist and I am glad to have worked with you on our com-mon project. Annika, thank you for invaluable comments on some of my mu-tants. Thank you, Cao Sha, Gerrit, Katrin, Rachel, Linus, and Marie for making the D7:3 corridor such an enjoyable work environment. Veronica, Mårten, Eva Gottfridsson – we had a great time in Kolmården and all IM-BIM days thereafter. Rakan, it was really nice teaching with you. Lisa T, thank you for really nice lunch company over the years. I am sorry that I took you chair when I first came to the lab but I hope that this is forgiven by now. Marlen, we had a lot of fun over the years both inside and outside of the lab. It is sad that you have left Uppsala and Sweden for good. Cecilia, you were only in the lab for a year but it was a great year you left a big hole after you. You are a remarkable person. Anna, you are a true biologist and the ter-rarium you made for the lab gives a nice touch in the corridor. Mallen and Anders – I had a really great time during the periods that I worked with you in your company. Anders, you are doing a great job in Örebro/Lindesberg but I do think that you should move to or at least closer to Uppsala. Jenny, you have been my friend and supported me for a long long time by now and even though I am terrible at answering to messages, I hope we will continue on this path

Page 58: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

58

throughout life. To all my fellow Sillkalvar - Niklas, Kajsa, Henrik, Stina, and Dorothea. Without you and all the fun over the years this work would never have been possible. Malin, I think you will be a great aunt in the future. To my parents-in-law, Åsa and Conny – you have always shown great interest in my work and supported me throughout my studies which I am grateful for. Till Elinor, du kämpar med ditt och jag kämpar med mitt men som bonussys-kon och sjusovare tar vi oss an framtiden tillsammans. To my family, mamma och pappa – irrespective of my choices you have always been there for me and supported me. Regardless of the struggles, scientific or life in general, you have always been there. Emelie, words cannot describe what you mean to me and I am looking forward to next year when our new family member arrives.

Page 59: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

59

References

Adler M, Anjum M, Andersson DI, Sandegren L. 2013. Influence of acquired β-lactamases on the evolution of spontaneous carbapenem resistance in Escherichia coli. Journal of Antimicrobial Chemotherapy 68:51–59.

Aharoni A, Gaidukov L, Khersonsky O, McQ Gould S, Roodveldt C, Tawfik DS. 2005. The “evolvability” of promiscuous protein functions. Nature Reviews Ge-netics 37:73–76.

Altschul S. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein da-tabase search programs. Nucleic Acids Research 25:3389–3402.

Ambur OH, Davidsen T, Frye SA, Balasingham SV, Lagesen K, Rognes T, Tønjum T. 2009. Genome dynamics in major bacterial pathogens. FEMS microbiology reviews 33:453–470.

Anantharaman K, Brown CT, Hug LA, Sharon I, Castelle CJ, Probst AJ, Thomas BC, Singh A, Wilkins MJ, Karaoz U, et al. 2016. Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Na-ture Communications 2016 7 7:13219.

Anderson P, Roth J. 1981. Spontaneous tandem genetic duplications in Salmonella typhimurium arise by unequal recombination between rRNA (rrn) cistrons. Pro-ceedings of the National Academy of Sciences 78:3113–3117.

Andersson DI, Hughes D. 2009. Gene amplification and adaptive evolution in bacte-ria. Annual Review of Genetics 43:167–195.

Andersson DI, Hughes D. 2010. Antibiotic resistance and its cost: is it possible to reverse resistance? 8:260–271.

Andersson DI, Hughes D. 2014. Microbiological effects of sublethal levels of antibi-otics. Nature Reviews Microbiology 12:465–478.

Andersson DI, Jerlström-Hultqvist J, Näsvall J. 2015. Evolution of new functions de novo and from preexisting genes. Cold Spring Harb Perspect Biol 7:a017996.

Andersson DI, Levin BR. 1999. The biological cost of antibiotic resistance. Current Opinion in Microbiology 2:489–493.

Andersson DI. 2006. The biological cost of mutational antibiotic resistance: any prac-tical conclusions? Current Opinion in Microbiology 9:461–465.

Andolfatto P. 2005. Adaptive evolution of non-coding DNA in Drosophila. Nature 437:1149–1152.

Atwood KC, Schneider LK, Ryan FJ. 1951. Periodic Selection in Escherichia coli. Proceedings of the National Academy of Sciences 37:146–155.

Babtie A, Tokuriki N, Hollfelder F. 2010. What makes an enzyme promiscuous? Cur-rent Opinion in Chemical Biology 14:200–207.

Bachmann BJ. 1996. Derivations and Genotypes of Some Mutant Derivatives of Esch-erichia coli K-12. In: Neidhardt, editor. Escherichia coli and Salmonella. 2nd ed. Washington DC: ASM Press. pp. 2460–2488.

Baldwin RL. 2008. Structure and mechanism in protein science. A guide to enzyme catalysis and protein folding, by A. Fersht. 1999. New York: Freeman. 631 pp. $67.95 (hardcover). Protein Science 9:207–207.

Page 60: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

60

Baltrus DA. 2013. Exploring the costs of horizontal gene transfer. Trends in Ecology & Evolution 28:489–495.

Barberán A, Ramirez KS, Leff JW, Bradford MA, Wall DH, Fierer N. 2014. Why are some microbes more ubiquitous than others? Predicting the habitat breadth of soil bacteria. Ecology Letters 17:794–802.

Barlow M. 2009. What antimicrobial resistance has taught us about horizontal gene transfer. Methods Mol. Biol. 532:397–411.

Bataillon T, Bailey SF. 2014. Effects of new mutations on fitness: insights from mod-els and data.Fox CW, Mousseau TA, editors. Ann. N. Y. Acad. Sci. 1320:76–92.

Beadle BM, Shoichet BK. 2002. Structural Bases of Stability–function Tradeoffs in Enzymes. Journal of Molecular Biology 321:285–296.

Beard WA, Shock DD, Vande Berg BJ, Wilson SH. 2002. Efficiency of Correct Nu-cleotide Insertion Governs DNA Polymerase Fidelity.

Bergthorsson U, Andersson DI, Roth JR. 2007. Ohno's dilemma: evolution of new

genes under continuous selection. Proceedings of the National Academy of Sci-ences 104:17004–17009.

Bershtein S, Goldin K, Tawfik DS. 2008. Intense Neutral Drifts Yield Robust and Evolvable Consensus Proteins. 379:1029–1044.

Bhattacharya S, Chakrabarti S, Nayak A, Bhattacharya SK. 2003. Metabolic networks of microbial systems. Microb. Cell Fact. 2:3.

Bianconi I, Milani A, Cigana C, Paroni M, Levesque RC, Bertoni G, Bragonzi A. 2011. Positive Signature-Tagged Mutagenesis in Pseudomonas aeruginosa: Tracking Patho-Adaptive Mutations Promoting Airways Chronic Infection.Co-burn J, editor. PLOS Pathogens 7:e1001270.

Björkman J. 2000. Effects of Environment on Compensatory Mutations to Ameliorate Costs of Antibiotic Resistance. Science 287:1479–1482.

Blaby IK, Lyons BJ, Wroclawska-Hughes E, Phillips GCF, Pyle TP, Chamberlin SG, Benner SA, Lyons TJ, Crécy-Lagard V de, Crécy E de. 2012. Experimental evo-lution of a facultative thermophile from a mesophilic ancestor. Appl. Environ. Microbiol. 78:144–155.

Bloom JD, Arnold FH. 2009. In the light of directed evolution: Pathways of adaptive protein evolution.

. Bloom JD, Labthavikul ST, Otey CR, Arnold FH. 2006. Protein stability promotes

evolvability. Proceedings of the National Academy of Sciences 103:5869–5874. Bloom JD, Romero PA, Lu Z, Arnold FH. 2007. Neutral genetic drift can alter pro-

miscuous protein functions, potentially aiding functional evolution. Biol Direct 2:17.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Ther-modynamic prediction of protein neutrality. Proceedings of the National Acad-emy of Sciences 102:606–611.

Bornscheuer UT, Kazlauskas RJ. 2004. Catalytic Promiscuity in Biocatalysis: Using Old Enzymes to Form New Bonds and Follow New Pathways. Angewandte Chemie International Edition 43:6032–6040.

Brandis G, Hughes D. 2016. The Selective Advantage of Synonymous Codon Usage Bias in Salmonella. PLOS Genetics 12:e1005926.

Bulmer M. 1991. The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897–907.

Burch CL, Guyader S, Samarov D, Shen H. 2007. Experimental estimate of the abun-dance and effects of nearly neutral mutations in the RNA virus phi 6. Genetics 176:467–476.

Page 61: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

61

Camps M, Herman A, Loh E, Loeb LA. 2008. Genetic Constraints on Protein Evolu-tion. Critical reviews in biochemistry and molecular biology 42:313–326.

Carvunis A-R, Rolland T, Wapinski I, Calderwood MA, Yildirim MA, Simonis N, Charloteaux B, Hidalgo CA, Barbette J, Santhanam B, et al. 2012. Proto-genes and de novo gene birth. 487:370–374.

Catrina I, O'Brien PJ, Purcell J, Nikolic-Hughes I, Zalatan JG, Hengge AC, Herschlag D. 2007. Probing the Origin of the Compromised Catalysis of E. coli Alkaline Phosphatase in its Promiscuous Sulfatase Reaction. Journal of the American Chemical Society 129:5760–5765.

Charlesworth B, Charlesworth D. 2009. Darwin and Genetics. Genetics 183:757–766. Charlesworth B. 2009. Fundamental concepts in genetics: effective population size

and patterns of molecular evolution and variation. Nature Reviews Genetics 2015 16:8 10:195–205.

Charlesworth J. 2006. The Rate of Adaptive Evolution in Enteric Bacteria. Molecular Biology and Evolution 23:1348–1356.

Chen J, Stites WE. 2001. Energetics of Side Chain Packing in Staphylococcal Nucle-ase Assessed by Systematic Double Mutant Cycles. Biochemistry 40:14004–14011.

Chooniedass-Kothari S, Emberley E, Hamedani MK, Troup S, Wang X, Czosnek A, Hube F, Mutawe M, Watson PH, Leygue E. 2004. The steroid receptor RNA ac-tivator is the first functional RNA encoding a protein. FEBS Lett. 566:43–47.

Chou H-H, Chiu H-C, Delaney NF, Segrè D, Marx CJ. 2011. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science 332:1190–1192.

Conrad TM, Lewis NE, Palsson BO. 2011. Microbial laboratory evolution in the era of genome-scale science. Molecular Systems Biology 7:509–509.

Copley S. 2003. Enzymes with extra talents: moonlighting functions and catalytic promiscuity. Current Opinion in Chemical Biology 7:265–272.

Cupples CG, Miller JH. 1988. Effects of Amino Acid Substitutions at the Active Site in Escherichia coli β-Galactosidase. 120:637–644.

D'Souza G, Waschina S, Pande S, Bohl K, Kaleta C, Kost C. 2014. Less is more: selective advantages can explain the prevalent loss of biosynthetic genes in bac-teria. Evolution 68:2559–2570.

Darwin C. 1859. On the origin of species by means of natural selection, or preserva-tion of favoured races in the struggle for life. London: John Murray

Daudé D, Topham CM, Remaud-Siméon M, André I. 2013. Probing impact of active site residue mutations on stability and activity of Neisseria polysaccharea amyl-osucrase. Protein Science 22:1754–1765.

Davies EK. 1999. High Frequency of Cryptic Deleterious Mutations in Caenorhabdi-tis elegans. 285:1748–1751.

Dayhoff MO, Schwartz RM, Orcutt BC. 1978. A model of evolutionary change in proteins. In: Atlas of Protein Sequence and Structure 5. Vol. 40. National Bio-medical Research Foundation. pp. 345–352.

de Visser JAGM, Cooper TF, Elena SF. 2011. The causes of epistasis. Proceedings of the Royal Society B 278:3617–3624.

de Visser JAGM, Hermisson J, Wagner GP, Meyers LA, Chaichian HB, Blanchard JL, Chao L, Cheverud JM, Elena SF, Fontana W, et al. 2003. Perspective: Evolu-tion and Detection of Genetic Robustness. Evolution 57:1959–1972.

de Visser JAGM, Krug J. 2014. Empirical fitness landscapes and the predictability of evolution. Nature Reviews Genetics 2015 16:8 15:480–490.

Page 62: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

62

de Visser JAGM, Zeyl CW, Gerrish PJ, Blanchard JL, Lenski RE. 1999. Diminishing Returns from Mutation Supply Rate in Asexual Populations. Science 283:404–406.

. Delaye L, DeLuna A, Lazcano A, Becerra A. 2008. The origin of a novel gene through

overprinting in Escherichia coli. BMC Evolutionary Biology 2008 8:1 8:31. Delcour AH. 2009. Outer membrane permeability and antibiotic resistance. Bio-

chimica et Biophysica Acta (BBA) - Proteins and Proteomics 1794:808–816. Denver DR, Morris K, Lynch M, Thomas WK. 2004. High mutation rate and predom-

inance of insertions in the Caenorhabditis elegans nuclear genome. 430:679–682.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nature Reviews Genetics 2015 16:8 6:678–687.

Dettman JR, Rodrigue N, Melnyk AH, Wong A, Bailey SF, Kassen R. 2012. Evolu-tionary insight from whole-genome sequencing of experimentally evolved mi-crobes. Molecular Ecology 21:2058–2077.

Dinger ME, Pang KC, Mercer TR, Mattick JS. 2008. Differentiating Protein-Coding and Noncoding RNA: Challenges and Ambiguities. 4:e1000176.

Dittmar K, Liberles D. 2010. Evolution after Gene Duplication. Hoboken, New Jer-sey: John Wiley & Sons, Inc.

Doern GV. 2011. Antimicrobial Susceptibility Testing. Journal of Clinical Microbi-ology 49:S4.

Dougherty MJ, Arnold FH. 2009. Directed evolution: new parts and optimized func-tion. Current Opinion in Biotechnology 20:486–491.

Echave J, Spielman SJ, Wilke CO. 2016. Causes of evolutionary rate variation among protein sites. 17:109–121.

Ehrenberg M, Kurland CG. 1984. Costs of accuracy determined by a maximal growth rate constraint. Quarterly reviews of biophysics 17:45–82.

Eisenreich W, Dandekar T, Heesemann J, Goebel W. 2010. Carbon metabolism of intracellular bacterial pathogens and possible links to virulence. Nature Reviews Microbiology 8:401–412.

Elena SF, Ekunwe L, Hajela N, Oden SA, Lenski RE. 1998. Mutation and Evolution. In: Woodruff R, Thompson J, editors. Distribution of fitness effects caused by random insertion mutations in Escherichia coli.

Erdélyi A. 2016. HisA mutants with minor structural differences display major func-

tional deviations. Independent thesis Advanced level (degree of Master (Two Years)).

Eyre-Walker A, Keightley PD. 2007. The distribution of fitness effects of new muta-tions. Nature Reviews Genetics 8:610–618.

Eyre-Walker A, Woolfit M, Phelps T. 2006. The Distribution of Fitness Effects of New Deleterious Amino Acid Mutations in Humans. Genetics 173:891–900.

Eyre-Walker A. 2002. Changing effective population size and the McDonald-Kreit-man test. .

Fani R, Fondi M. 2009. Origin and evolution of metabolic pathways. Physics of Life Reviews 6:23–52.

Fares MA, Ruiz-Gonzalez MX, Moya A, Elena SF, Barrio E. 2002. Endosymbiotic bacteria GroEL buffers against deleterious mutations. Nature 417:398–398.

Ferenci T. 2005. Maintaining a healthy SPANC balance through regulatory and mu-tational adaptation. Molecular Microbiology 57:1–8.

Page 63: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

63

Ferenci T. 2016. Trade-off Mechanisms Shaping the Diversity of Bacteria. Trends in Microbiology 24:209–223.

Fersht A. 1999. Structure and mechanism in protein science: a guide to enzyme catal-ysis and protein folding. Macmillan

Félix M-A, Wagner A. 2006. Robustness and evolution: concepts, insights and chal-lenges from a developmental model system. Heredity 100:132–140.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-reso-lution map of a gene's fitness landscape. Molecular Biology and Evolution 31:1581–1592.

Fisher RA. 1935. The Sheltering of Lethals. The American Naturalist 69:446–455. Follmann H, Brownson C. 2009. Darwin’s warm little pond revisited: from molecules

to the origin of life. Naturwissenschaften 96:1265–1292. Force A, Lynch M, Pickett FB, Amores A, Yan Y-L, Postlethwait J. 1999. Preserva-

tion of Duplicate Genes by Complementary, Degenerative Mutations. .

Franzosa EA, Xia Y. 2009. Structural Determinants of Protein Evolution Are Context-Sensitive at the Residue Level. Molecular Biology and Evolution 26:2387–2395.

Fuchs TM, Eisenreich W, Heesemann J, Goebel W. 2012. Metabolic adaptation of human pathogenic and related nonpathogenic bacteria to extra- and intracellular habitats. FEMS microbiology reviews 36:435–462.

Gerrish PJ, Lenski RE. 1998. The fate of competing beneficial mutations in an asexual population. Genetica 102/103:127–144.

Gilbert JA, Jansson JK, Knight R. 2014. The Earth Microbiome project: successes and aspirations. BMC Biology 12:69.

Gillespie JH. 1994. The Causes of Molecular Evolution. New York Gordon CL, Sather SK, Casjens S, King J. 1994. Selective in vivo rescue by

GroEL/ES of thermolabile folding intermediates to phage P22 structural proteins. Journal of Biological Chemistry 269:27941–27951.

Goyal M, Chaudhuri TK. 2015. GroEL–GroES assisted folding of multiple recombi-nant proteins simultaneously over-expressed in Escherichia coli. The Interna-tional Journal of Biochemistry & Cell Biology 64:277–286.

Grantham R. 1974. Amino Acid Difference Formula to Help Explain Protein Evolu-tion. Science 185:862–864.

Green SM, Meeker AK, Shortle D. 1992. Contributions of the polar, uncharged amino acids to the stability of staphylococcal nuclease: evidence for mutational effects on the free energy of the denatured state. 31:5717–5728.

Gu Z, Steinmetz LM, Gu X, Scharfe C, Davis RW, Li W-H. 2003. Role of duplicate genes in genetic robustness against null mutations. Nature 421:63–66.

Gullberg E, Albrecht LM, Karlsson C, Sandegren L, Andersson DI. 2014. Selection of a Multidrug Resistance Plasmid by Sublethal Levels of Antibiotics and Heavy Metals. mBio 5:e01918–14–e01918–14.

Gullberg E, Cao S, Berg OG, Ilbäck C, Sandegren L, Hughes D, Andersson DI. 2011. Selection of Resistant Bacteria at Very Low Antibiotic Concentrations. PLOS Pathogens 7:e1002158.

Gutnick D, Calvo JM, Klopotowski T, Ames BN. 1969. Compounds which serve as the sole source of carbon or nitrogen for Salmonella typhimurium LT-2. Journal of Bacteriology 100:215–219.

Haldane JBS. 1932. The causes of evolution. London: Longmans Hartl DL, Moriyama EN, Sawyer SA. 1994. Selection intensity for codon bias. Ge-

netics 138:227–234. Hastings PJ, Lupski JR, Rosenberg SM, Ira G. 2009. Mechanisms of change in gene

copy number. 10:551–564.

Page 64: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

64

Heinen TJAJ, Staubach F, Häming D, Tautz D. 2009. Emergence of a New Gene from an Intergenic Region. Current Biology 19:1527–1531.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences 89:10915–10919.

Holder JB, Bennett AF, Chen J, Spencer DS, Byrne MP, Stites WE. 2001. Energetics of Side Chain Packing in Staphylococcal Nuclease Assessed by Exchange of Va-lines, Isoleucines, and Leucines. Biochemistry 40:13998–14003.

Höcker B, Jürgens C, Wilmanns M, Sterner R. 2001. Stability, catalytic versatility and evolution of the (βα)8-barrel fold. Current Opinion in Biotechnology 12:376–381.

Hubbard TJP, T.J.P. Hubbard. 1987. Comparison of solvent-inaccessible cores of ho-mologous proteins: definitions useful for protein modelling.

. Hughes AL. 1994. The evolution of functionally novel proteins after gene duplication.

Proceedings of the Royal Society B 256:119–124. Husband BC, Barrett SCH. 1992. Effective Population Size and Genetic Drift in Tri-

stylous Eichhornia paniculata (Pontederiaceae). . Jacobson GN, Clark PL. 2016. Quality over quantity: optimizing co-translational pro-

tein folding with non-“optimal” synonymous codons. .

Jacquier H, Birgy A, Le Nagard H, Mechulam Y, Schmitt E, Glodt J, Bercot B, Petit E, Poulain J, Barnaud G, et al. 2013. Capturing the mutational landscape of the beta-lactamase TEM-1.

110:13067–13072. Jenny A, Hachet O, Závorszky P, Cyrklaff A, Weston MDJ, Johnston DS, Erdélyi M,

Ephrussi A. 2006. A translation-independent role of oskar RNA in early Drosoph-ila oogenesis. Development 133:2827–2833.

Jensen RA. 1976. Enzyme recruitment in evolution of new function. Annual Review of Microbiology 30:409–425.

Juhas M, van der Meer JR, Gaillard M, Harding RM, Hood DW, Crook DW. 2009. Genomic islands: tools of bacterial horizontal gene transfer and evolution. FEMS microbiology reviews 33:376–393.

Kassen R, Bataillon T. 2006. Distribution of fitness effects among beneficial muta-tions before selection in experimental populations of bacteria. Nature Reviews Genetics 38:484–488.

Kawecki TJ, Lenski RE, Ebert D, Hollis B, Olivieri I, Whitlock MC. 2012. Experi-mental evolution. Trends in Ecology & Evolution 27:547–560.

Keefe AD, Szostak JW. 2001. Functional proteins from a random-sequence library. 410:715–718.

Keightley PD, Eyre-Walker A. 2010. What can we learn about the distribution of fit-ness effects of new mutations from DNA sequence data? Philosophical Transac-tions of the Royal Society B: Biological Sciences 365:1187–1193.

Khersonsky O, Roodveldt C, Tawfik DS. 2006. Enzyme promiscuity: evolutionary and mechanistic aspects. Current Opinion in Chemical Biology 10:498–508.

Khersonsky O, Tawfik DS. 2006. The histidine 115-histidine 134 dyad mediates the lactonase activity of mammalian serum paraoxonases. Journal of Biological Chemistry 281:7649–7656.

Khersonsky O, Tawfik DS. 2010a. Enzyme promiscuity: a mechanistic and evolution-ary perspective. Annu. Rev. Biochem. 79:471–505.

Khersonsky O, Tawfik DS. 2010b. 8.03 - Enzyme Promiscuity – Evolutionary and Mechanistic Aspects. In: Comprehensive Natural Products II. Elsevier. pp. 47–88.

Page 65: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

65

Kibota TT, Lynch M. 1996. Estimate of the genomic mutation rate deleterious to over-all fitness in E. coll. 381:694–696.

Kim J, Kershner JP, Novikov Y, Shoemaker RK, Copley SD. 2010. Three serendipi-tous pathways in E. coli can bypass a block in pyridoxal-5'-phosphate synthesis. Molecular Systems Biology 6:436.

Kimura M. 1962. On the Probability of Fixation of Mutant Genes in a Population. Genetics 47:713–719.

Kimura M. 1985. The role of compensatory neutral mutations in molecular evolution. Journal of Genetics 64:7–19.

Kloc M. 2005. Potential structural role of non-coding and coding RNAs in the organ-ization of the cytoskeleton at the vegetal cortex of Xenopus oocytes. Development 132:3445–3457.

Knight GM, Colijn C, Shrestha S, Fofana M, Cobelens F, White RG, Dowdy DW, Cohen T. 2015. The Distribution of Fitness Costs of Resistance-Conferring Mu-tations Is a Key Determinant for the Future Burden of Drug-Resistant Tuberculo-sis: A Model-Based Analysis. Clinical Infectious Diseases 61Suppl 3:S147–S154.

Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV. 2002. Selection in the evolution of gene duplications. Genome Biology 2002 3:2 3:1–8.

Koskiniemi S, Sun S, Berg OG, Andersson DI. 2012. Selection-driven gene loss in bacteria. PLOS Genetics 8:e1002787.

Kröger C, Dillon SC, Cameron ADS, Papenfort K, Sivasankaran SK, Hokamp K, Chao Y, Sittka A, Hébrard M, Händler K, et al. 2012. The transcriptional land-scape and small RNAs of Salmonella enterica serovar Typhimurium. Proceedings of the National Academy of Sciences109:E1277–E1286.

Kudla G, Murray AW, Tollervey D, Plotkin JB. 2009. Coding-Sequence Determinants of Gene Expression in Escherichia coli. Science 324:255–258.

Kugelberg E, Kofoid E, Andersson DI, Lu Y, Mellor J, Roth FP, Roth JR. 2010. The Tandem Inversion Duplication in Salmonella enterica: Selection Drives Unstable Precursors to Final Mutation Types. Genetics 185:65–80.

Lang GI, Rice DP, Hickman MJ, Sodergren E, Weinstock GM, Botstein D, Desai MM. 2013. Pervasive genetic hitchhiking and clonal interference in forty evolv-ing yeast populations. Nature 500:571–574.

Larion M, Moore LB, Thompson SM, Miller BG. 2007. Divergent Evolution of Func-tion in the ROK Sugar Kinase Superfamily: Role of Enzyme Loops in Substrate Specificity. Biochemistry 46:13564–13572.

Lawrence JG, Hendrickson H. 2005. Genome evolution in bacteria: order beneath chaos. Antimicrobials / Edited by Malcolm Page and Christopher T Walsh · Ge-nomics / Edited by Stephan C Schuster and Gerhard Gottschalk 8:572–578.

Lenoir T. 1984. The eternal laws of form: Morphotypes and the conditions of exist-ence in Goethe's biological thought. Journal of Social and Biological Structures 7:317–324.

Lenski RE, Rose MR, Simpson SC, Tadler SC. 1991. Long-Term Experimental Evo-lution in Escherichia coli. I. Adaptation and Divergence During 2,000 Genera-tions. The American Naturalist 138:1315–1341.

Lim WA, Sauer RT. 1989. Alternative packing arrangements in the hydrophobic core of λrepresser. Nature 339:31–36.

Lind PA, Arvidsson L, Berg OG, Andersson DI. 2017. Variation in Mutational Ro-bustness between Different Proteins and the Predictability of Fitness Effects. Mo-lecular Biology and Evolution 34:408–418.

Lind PA, Berg OG, Andersson DI. 2010. Mutational robustness of ribosomal protein genes. Science 330:825–827.

Page 66: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

66

Locey KJ, Lennon JT. 2016. Scaling laws predict global microbial diversity. Proceed-ings of the National Academy of Sciences 113:5970–5975.

Luria SE, Delbrück M. 1943. Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28:491.

MacLean RC, Buckling A. 2009. The distribution of fitness effects of beneficial mu-tations in Pseudomonas aeruginosa. PLOS Genetics 5:e1000406.

MacLean RC, Perron GG, Gardner A. 2010. Diminishing Returns From Beneficial Mutations and Pervasive Epistasis Shape the Fitness Landscape for Rifampicin Resistance in Pseudomonas aeruginosa. Genetics 186:1345–1354.

Maharjan R, Nilsson S, Sung J, Haynes K, Beardmore RE, Hurst LD, Ferenci T, Gudelj I. 2013. The form of a trade-off determines the response to competition. Ecology Letters 16:1267–1276.

Maisnier-Patin S, Andersson DI. 2004. Adaptation to the deleterious effects of anti-microbial drug resistance mutations by compensatory evolution. Genome plastic-ity and the evolution of microbial genomes. Research in Microbiology 155:360–369.

Maisnier-Patin S, Roth JR, Fredriksson A, Nyström T, Berg OG, Andersson DI. 2005. Genomic buffering mitigates the effects of deleterious mutations in bacteria. Na-ture Reviews Genetics 37:1376–1379.

Manoharan L, Kushwaha SK, Hedlund K, Ahrén D. 2015. Captured metagenomics: large-scale targeting of genes based on “sequence capture” reveals functional di-versity in soils. DNA Res 22:451–460.

Mansy SS, Schrum JP, Krishnamurthy M, Tobé S, Treco DA, Szostak JW. 2008. Template-directed synthesis of a genetic polymer in a model protocell. Nature 454:122–125.

Marcos ML, Echave J. 2015. Too packed to change: side-chain packing and site-spe-cific substitution rates in protein evolution. PeerJ 3:e911.

Martincorena I, Luscombe NM. 2013. Non-random mutation: the evolution of tar-geted hypermutation and hypomutation. Bioessays 35:123–130.

Masel J, Trotter MV. 2010. Robustness and Evolvability. Trends in Genetics 26:406–414.

McClelland M, Sanderson KE, Spieth J, Clifton SW, Latreille P, Courtney L, Porwol-lik S, Ali J, Dante M, Du F, et al. 2001. Complete genome sequence of Salmonella enterica serovar Typhimurium LT2. Nature 413:852–856.

McCutcheon JP, Moran NA. 2011. Extreme genome reduction in symbiotic bacteria. Nature Reviews Microbiology 224:1209–1226.

McLoughlin SY, Copley SD. 2008. A compromise required by gene sharing enables survival: Implications for evolution of new enzyme activities. Proceedings of the National Academy of Sciences 105:13497–13502.

Meeker AK, Garcia-Moreno B, Shortle D. 1996. Contributions of the ionizable amino acids to the stability of staphylococcal nuclease. Biochemistry 35:6443–6449.

Meier S, Özbek S. 2007. A biological cosmos of parallel universes: Does protein structural plasticity facilitate evolution? BioEssays 29:1095–1104.

Mendel G. 1866. Versuche über Pflanzenhybriden. Verhandlungen des naturforschen-den Vereines in Brunn 4: 3 44.

Meysman P, Sánchez-Rodríguez A, Fu Q, Marchal K, Engelen K. 2013. Expression Divergence between Escherichia coli and Salmonella enterica serovar Typhi-murium Reflects Their Lifestyles. Molecular Biology and Evolution 30:1302–1314.

Miller BG, Raines RT. 2004. Identifying latent enzyme activities: substrate ambiguity within modern bacterial sugar kinases. Biochemistry 43:6387–6392.

Page 67: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

67

Miller BG, Raines RT. 2005. Reconstitution of a defunct glycolytic pathway via re-cruitment of ambiguous sugar kinases. Biochemistry 44:10776–10783.

Miller C, Kong J, Tran TT, Arias CA, Saxer G, Shamoo Y. 2013. Adaptation of En-terococcus faecalis to Daptomycin Reveals an Ordered Progression to Resistance. Antimicrobial Agents and Chemotherapy. 57:5373–5383.

Miralles R, Gerrish PJ, Moya A, Elena SF. 1999. Clonal Interference and the Evolu-tion of RNA Viruses. Science 285:1745–1747.

Mirny LA, Shakhnovich EI. 1999. Universally conserved positions in protein folds: reading evolutionary signals about stability, folding kinetics and function11Ed-ited by A. R. Fersht. Journal of Molecular Biology 291:177–196.

Murugan A, Huse DA, Leibler S. 2012. Speed, dissipation, and error in kinetic proof-reading. Proceedings of the National Academy of Sciences109:12034–12039.

Nakamura Y, Itoh T, Matsuda H, Gojobori T. 2004. Biased biological functions of horizontally transferred genes in prokaryotic genomes. Nature Reviews Genetics 36:760–766.

Nannemann DP, Birmingham WR, Scism RA, Bachmann BO. 2011. Assessing di-rected evolution methods for the generation of biosynthetic enzymes with poten-tial in drug biosynthesis. Future Medicinal Chemistry 3:809–819.

Näsvall J, Sun L, Roth JR, Andersson DI. 2012. Real-time evolution of new genes by innovation, amplification, and divergence. Science 338:384–387.

Nevin Gerek Z, Kumar S, Banu Ozkan S. 2013. Structural dynamics flexibility in-forms function and evolution at a proteome scale. Evolutionary Applications 6:423–433.

Newton MS, Arcus VL, Gerth ML, Patrick WM. 2018. Enzyme evolution: innovation is easy, optimization is complicated. Current Opinion in Structural Biology 48:110–116.

Newton MS, Guo X, Söderholm A, Näsvall J, Lundström P, Andersson DI, Selmer M, Patrick WM. 2017. Structural and functional innovations in the real-time evo-lution of new (βα) 8barrel enzymes. Proceedings of the National Academy of Sci-ences 114:4727–4732.

Nielsen J. 2003. It Is All about Metabolic Fluxes. Journal of Bacteriology 185:7031–7035.

Nikaido H. 2003. Molecular Basis of Bacterial Outer Membrane Permeability Revis-ited. Microbiology and Molecular Biology Reviews 67:593–656.

Nilsson AI, Koskiniemi S, Eriksson S, Kugelberg E, Hinton JCD, Andersson DI. 2005. .

Nobeli I, Favia AD, Thornton JM. 2009. Protein promiscuity and its implications for biotechnology. Nature Biotechnology 27:157–167.

Nowrousian M, Teichert I, Masloff S, Kück U. 2012. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes. G3 2:261–270.

Nyström T. 2004. Growth versus maintenance: a trade-off dictated by RNA polymer-ase availability and sigma factor competition? Molecular Microbiology 54:855–862.

O'Brie PJ, Herschlag D. 2001. Functional Interrelationships in the Alkaline Phospha-tase Superfamily: Phosphodiesterase Activity of Escherichia coli Alkaline Phos-phatase. Biochemistry 40:5691–5699.

O'Brien PJ, Herschlag D. 1999. Catalytic promiscuity and the evolution of new enzy-matic activities. Chemistry & Biology 6:R91–R105.

Ohno S. 1970. Evolution by Gene Duplication. Berlin, Heidelberg: Springer Berlin Heidelberg

Ohta T. 1992. The Nearly Neutral Theory of Molecular Evolution. Annu. Rev. Ecol. Syst. 23:263–286.

Page 68: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

68

Orr HA. 2005. The probability of parallel evolution. Evolution 59:216–220. Orr HA. 2006. The distribution of fitness effects among beneficial mutations in Fish-

er's geometric model of adaptation. Journal of Theoretical Biology 238:279–285. Overington J, Johnson MS, Sali A, Blundell TL. 1990. Tertiary structural constraints

on protein evolutionary diversity: templates, key residues and structure predic-tion. Proceedings of the Royal Society B 241:132–145.

Paczia N, Nilgen A, Lehmann T, Gätgens J, Wiechert W, Noack S. 2012. Extensive exometabolome analysis reveals extended overflow metabolism in various micro-organisms. Microbial cell factories 11:122–122.

Pages J-M, James CE, Winterhalter M. 2008. The porin and the permeating antibiotic: a selective diffusion barrier in Gram-negative bacteria. Nature Reviews Microbi-ology 6:893–903.

Pal C, Papp B, Lercher MJ. 2006. An integrated view of protein evolution. Nature Reviews Genetics 7:337–348.

Pandya C, Farelli JD, Dunaway-Mariano D, Allen KN. 2014. Enzyme Promiscuity: Engine of Evolutionary Innovation. Journal of Biological Chemistry 289:30229–30236.

Paquin C, Adams J. 1983. Frequency of fixation of adaptive mutations is higher in evolving diploid than haploid yeast populations. Nature 302:495–500.

Patrick WM, Quandt EM, Swartzlander DB, Matsumura I. 2007. Multicopy Suppres-sion Underpins Metabolic Evolvability. Molecular Biology and Evolution 24:2716–2722.

Pavesi A, Magiorkinis G, Karlin DG. 2013. Viral Proteins Originated De Novo by Overprinting Can Be Identified by Codon Usage: Application to the “Gene Nursery” of Deltaretroviruses.Wilke CO, editor. PLOS Computational Biology 9:e1003162.

Peebo K, Valgepea K, Maser A, Nahku R, Adamberg K, Vilu R. 2015. Proteome reallocation in Escherichia coli with increasing specific growth rate. Molecular BioSystems 11:1184–1193.

Peisajovich SG, Tawfik DS. 2007. Protein engineers turned evolutionists. Nature Methods 4:991–994.

Peris JB, Davis P, Cuevas JM, Nebot MR, Sanjuan R. 2010. Distribution of Fitness Effects Caused by Single-Nucleotide Substitutions in Bacteriophage f1. Genetics 185:603–609.

Pettersson ME, Sun S, Andersson DI, Berg OG. 2008. Evolution of new gene func-tions: simulation and analysis of the amplification model. Genetica 135:309–324.

Phillips PC. 2008. Epistasis — the essential role of gene interactions in the structure and evolution of genetic systems. Nature Reviews Genetics 2015 16:8 9:855–867.

Piatigorsky J. 1991. The recruitment of crystallins: new functions precede gene dupli-cation. Science 252:1078–1079.

Poelarends GJ, Serrano H, Johnson WH, Hoffman DW, Whitman CP. 2004. The Hy-dratase Activity of Malonate Semialdehyde Decarboxylase: Mechanistic and Evolutionary Implications. Journal of the American Chemical Society 126:15658–15659.

Poelwijk FJ, Kiviet DJ, Weinreich DM, Tans SJ. 2007. Empirical fitness landscapes reveal accessible evolutionary paths. Nature 445:383–386.

Pross A. 2011. Toward a general theory of evolution: Extending Darwinian theory to inanimate matter. Journal of Systems Chemistry 2:1–14.

Ramsey DC, Scherrer MP, Zhou T, Wilke CO. 2011. The Relationship Between Rel-ative Solvent Accessibility and Evolutionary Rate in Protein Evolution. Genetics 188:479–488.

Page 69: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

69

Rastinejad F, Blau HM. 1993. Genetic complementation reveals a novel regulatory role for 3′ untranslated regions in growth and differentiation. Cell 72:903–917.

Reams AB, Kofoid E, Savageau M, Roth JR. 2010. Duplication frequency in a popu-lation of Salmonella enterica rapidly approaches steady state with or without re-combination. Genetics 184:1077–1094.

Reinhardt JA, Wanjiru BM, Brant AT, Saelao P, Begun DJ, Jones CD. 2013. De Novo ORFs in Drosophila Are Important to Organismal Fitness and Evolved Rapidly from Previously Non-coding Sequences. PLoS Genetics 9:e1003860.

Rice LB. 2009. The clinical consequences of antimicrobial resistance. Current Opin-ion in Microbiology 12:476–481.

Riley M, Abe T, Arnaud MB, Berlyn MKB, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, et al. 2006. Escherichia coli K-12: a cooper-atively developed annotation snapshot--2005. Nucleic Acids Research 34:1–9.

Rohmer L, Hocquet D, Miller SI. 2011. Are pathogenic bacteria just looking for food? Metabolism and microbial pathogenesis. Trends in Microbiology 19:341–348.

Roy SW. 2016. Is Mutation Random or Targeted?: No Evidence for Hypermutability in Snail Toxin Genes. Molecular Biology and Evolution 33:2642–2647.

Rozen DE, de Visser JAGM, Gerrish PJ. 2002. Fitness Effects of Fixed Beneficial Mutations in Microbial Populations. Current Biology 12:1040–1045.

Sabarinathan R, Tafer H, Seemann SE, Hofacker IL, Stadler PF, Gorodkin J. 2013. The RNAsnp web server: predicting SNP effects on local RNA secondary struc-ture. Nucleic Acids Research 41:W475–W479.

Sabath N, Wagner A, Karlin D. 2012. Evolution of Viral Proteins Originated De Novo by Overprinting. Molecular Biology and Evolution 29:3767–3780.

Sanjuán R, Moya A, Elena SF. 2004. The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus. Proceedings of the National Academy of Sciences 101:8396–8401.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. 533:397–401.

Serrano L, Kellis JT Jr, Cann P, molecular AMJO, 1992. The folding of an enzyme: II. Substructure of barnase and the contribution of different interactions to protein stability. Journal of Molecular Biology 224:783-804

Shahmoradi A, Sydykova DK, Spielman SJ, Jackson EL, Dawson ET, Meyer AG, Wilke CO. 2014. Predicting Evolutionary Site Variability from Structure in Viral Proteins: Buriedness, Packing, Flexibility, and Design. J Mol Evol 79:130–142.

Shortle D, Stites WE, Meeker AK. 1990. Contributions of the large hydrophobic amino acids to the stability of staphylococcal nuclease. 29:8033–8041.

Sikosek T, Chan HS. 2014. Biophysics of protein evolution and evolutionary protein biophysics. Journal of The Royal Society Interface 11:1–35.

Silander OK, Tenaillon O, Chao L. 2007. Understanding the Evolutionary Fate of Finite Populations: The Dynamics of Mutational Effects. PLOS Biology 5:e94.

Sokurenko EV, Hasty DL, Dykhuizen DE. 1999. Pathoadaptive mutations: gene loss and variation in bacterial pathogens. Trends in Microbiology 7:191–195.

Sonti RV, Roth JR. 1989. Role of gene duplications in the adaptation of Salmonella typhimurium to growth on limiting carbon sources. Genetics 123:19–28.

Soskine M, Tawfik DS. 2010. Mutational effects and the evolution of new protein functions. Nature Reviews Genetics 11:572–582.

Soucy SM, Huang J, Gogarten JP. 2015. Horizontal gene transfer: building the web of life. Nature Reviews Genetics 16:472–482.

Page 70: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

70

Söderholm A, Guo X, Newton MS, Evans GB, Näsvall J, Patrick WM, Selmer M. 2015. Two-step Ligand Binding in a (βα)8 Barrel Enzyme. Journal of Biological Chemistry 290:24657–24668.

Stelling J, Sauer U, Szallasi Z, Doyle FJ III, Doyle J. 2004. Robustness of Cellular Functions. Cell 118:675–685.

Stoebel DM, Dean AM, Dykhuizen DE. 2008. The Cost of Expression of Escherichia coli lac Operon Proteins Is in the Process, Not in the Products. Genetics 178:1653–1660.

Suckow J, Markiewicz P, Kleina LG, Miller J, Kisters-Woike B, Müller-Hill B. 1996. Genetic Studies of the Lac Repressor XV: 4000 Single Amino Acid Substitutions and Analysis of the Resulting Phenotypes on the Basis of the Protein Structure.

261:509–523. Sugawara E, Nikaido H. 2005. OmpA/OprF: Slow Porins or Channels Produced by

Alternative Folding of Outer Membrane Proteins. In: Bacterial and Eukaryotic Porins. Structure, Function, Mechanism. Weinheim, FRG: Wiley-VCH Verlag GmbH & Co. KGaA. pp. 119–138.

Sun S, Berg OG, Roth JR, Andersson DI. 2009. Contribution of gene amplification to evolution of increased antibiotic resistance in Salmonella typhimurium. Genetics 182:1183–1195.

Tawfik DS. 2014. Accuracy-rate tradeoffs: how do enzymes meet demands of selec-tivity and catalytic efficiency? Current Opinion in Chemical Biology 21:73–80.

Tängdén T, Adler M, Cars O, Sandegren L, Löwdin E. 2013. Frequent emergence of porin-deficient subpopulations with reduced carbapenem susceptibility in ESBL-producing Escherichia coli during exposure to ertapenem in an in vitro pharma-cokinetic model. Journal of Antimicrobial Chemotherapy 68:1319–1326.

Tenaillon O, Barrick JE, Ribeck N, Deatherage DE, Blanchard JL, Dasgupta A, Wu GC, Wielgoss S, Cruveiller S, Médigue C, et al. 2016. Tempo and mode of ge-nome evolution in a 50,000-generation experiment. Nature 536:165–170.

Tenaillon O, Rodriguez-Verdugo A, Gaut RL, McDonald P, Bennett AF, Long AD, Gaut BS. 2012. The Molecular Diversity of Adaptive Convergence. Science 335:457–461.

Tenaillon O. 2018. Experimental evolution heals the scars of genome-scale recoding. Proceedings of the National Academy of Sciences 115:2853–2855.

Theodossis A, Walden H, Westwick EJ, Connaris H, Lamble HJ, Hough DW, Danson MJ, Taylor G. 2004. The structural basis for substrate promiscuity in 2-keto-3-deoxygluconate aldolase from the Entner-Doudoroff pathway in Sulfolobus sol-fataricus. Journal of Biological Chemistry 279:43886–43892.

Thomas CM, Nielsen KM. 2005. Mechanisms of and Barriers to, Horizontal Gene Transfer between Bacteria. Nature Reviews Microbiology 3:711–721.

Tokuriki N, Jackson CJ, Afriat-Jurnou L, Wyganowski KT, Tang R, Tawfik DS. 2012. Diminishing returns and tradeoffs constrain the laboratory optimization of an en-zyme. Nature Communications 2016 7 3:1257.

Tokuriki N, Stricher F, Serrano L, Tawfik DS. 2008. How protein stability and new functions trade off. PLOS Computational Biology 4:e1000002.

Tokuriki N, Tawfik DS. 2009a. Chaperonin overexpression promotes genetic varia-tion and enzyme evolution. Nature 459:668–673.

Tokuriki N, Tawfik DS. 2009b. Protein Dynamism and Evolvability. Science 324:203–207.

Topham CM, McLeod A, Eisenmenger F, Overington JP, Johnson MS, Blundell TL. 1993. Fragment Ranking in Modelling of Protein Structure.

Page 71: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

71

Toprak E, Veres A, Michel J-B, Chait R, Hartl DL, Kishony R. 2011. Evolutionary paths to antibiotic resistance under dynamically sustained drug selection. Nature Reviews Genetics 44:101–105.

Tracewell CA, Arnold FH. 2009. Directed enzyme evolution: climbing fitness peaks one amino acid at a time. Current Opinion in Chemical Biology 13:3–9.

Tuller T, Waldman YY, Kupiec M, Ruppin E. 2010. Translation efficiency is deter-mined by both codon bias and folding energy. Proceedings of the National Acad-emy of Sciences107:3645–3650.

United Nations. 2017. World Population Prospects: The 2017 Revision, Key Findings and Advance Tables. United Nations, Department of Economic and Social Af-fairs, Population Division

van Loo B, Jonas S, Babtie AC, Benjdia A, Berteau O, Hyvonen M, Hollfelder F. 2010. An efficient, multiply promiscuous hydrolase in the alkaline phosphatase superfamily. Proceedings of the National Academy of Sciences 107:2740–2745.

Wagner A. 2005. Energy Constraints on the Evolution of Gene Expression. Molecular Biology and Evolution 22:1365–1374.

Wahl LM, Gerrish PJ, Saika-Voivod I. 2002. Evaluating the Impact of Population Bottlenecks in Experimental Evolution. Heredity 162:961.

Wakeley J. 2008. Coalescent Theory: An Introduction. W. H. Freeman Wang SC, Person MD, Johnson WH, Whitman CP. 2003. Reactions of trans-3-Chlo-

roacrylic Acid Dehalogenase with Acetylene Substrates:  Consequences of and Evidence for a Hydration Reaction. Biochemistry 42:8762–8773.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an Antibiotic Resistance En-zyme Constrained by Stability and Activity Trade-offs. Journal of Molecular Bi-ology 320:85–95.

Wassenaar TM, Gaastra W. 2001. Bacterial virulence: can we draw the line? FEMS Microbiology Letters 201:1–7.

Welch M, Govindarajan S, Ness JE, Villalobos A, Gurney A, Minshull J, Gustafsson C. 2009. Design Parameters to Control Synthetic Gene Expression in Escherichia coli. PLOS ONE 4:e7002.

Wilson BA, Masel J. 2011. Putatively Noncoding Transcripts Show Extensive Asso-ciation with Ribosomes. Genome Biology and Evolution 3:1245–1252.

Wilson DS, Keefe AD, Szostak JW. 2001. The use of mRNA display to select high-affinity protein-binding peptides. Proceedings of the National Academy of Sci-ences 98:3750–3755.

Wistrand-Yuen E, Knopp M, Hjort K, Koskiniemi S, Berg OG, Andersson DI. 2018. Evolution of high-level resistance during low-level antibiotic exposure. Nature Communications 9:1599.

Wloch DM, Szafraniec K, Borts RH, Korona R. 2001. Direct estimate of the mutation rate and the distribution of fitness effects in the yeast Saccharomyces cerevisiae. 159:441–452.

Worth CL, Gong S, Blundell TL. 2009. Structural and functional constraints in the evolution of protein families. Nature Reviews Molecular Cell Biology 10:709–720.

Wright S. 1932. The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of the Sixth International Congress of Genetics 1:356–366.

Wright S. 1940. Breeding Structure of Populations in Relation to Speciation. The American Naturalist 74:232–248.

Yang K, Metcalf WW. 2004. A new activity for an old enzyme: Escherichia coli bac-terial alkaline phosphatase is a phosphite-dependent hydrogenase. Proceedings of the National Academy of Sciences 101:7919–7924.

Page 72: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

72

Yeh S-W, Huang T-T, Liu J-W, Yu S-H, Shih C-H, Hwang J-K, Echave J. 2014. Local Packing Density Is the Main Structural Determinant of the Rate of Protein Se-quence Evolution at Site Level. BioMed Research International 2014:1–10.

Yeung DT, Lenz DE, Cerasoli DM. 2005. Analysis of active-site amino-acid residues of human serum paraoxonase using competitive substrates. FEBS Journal 272:2225–2230.

Yip SH-C, Matsumura I. 2013. Substrate Ambiguous Enzymes within the Escherichia coli Proteome Offer Different Evolutionary Solutions to the Same Problem. Mo-lecular Biology and Evolution 30:2001–2012.

Yu N. 2004. Nucleotide Diversity in Gorillas. Genetics 166:1375–1383. Yue P, Yue P, Li Z, Li Z, Moult J, Moult J. 2005. Loss of Protein Structure Stability

as a Major Causative Factor in Monogenic Disease. Journal of Molecular Biology 353:459–473.

Zhang L, Li W-H. 2005. Human SNPs Reveal No Evidence of Frequent Positive Se-lection. Molecular Biology and Evolution 22:2504–2507.

Zhao L, Saelao P, Jones CD, Begun DJ. 2014. Origin and Spread of de Novo Genes in Drosophila melanogaster Populations. Science 343:769–772.

Zhou YN, Lubkowska L, Hui M, Court C, Chen S, Court DL, Strathern J, Jin DJ, Kashlev M. 2013. Isolation and Characterization of RNA Polymerase rpoB Mu-tations That Alter Transcription Slippage during Elongation in Escherichia coli. Journal of Biological Chemistry 288 :2700–2710.

Page 73: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types
Page 74: Evolution of New Genes and Functions - DiVA portaluu.diva-portal.org/smash/get/diva2:1252552/FULLTEXT01.pdfRNA–like molecules and created a suitable environment for certain types

Acta Universitatis UpsaliensisDigital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Medicine 1502

Editor: The Dean of the Faculty of Medicine

A doctoral dissertation from the Faculty of Medicine, UppsalaUniversity, is usually a summary of a number of papers. A fewcopies of the complete dissertation are kept at major Swedishresearch libraries, while the summary alone is distributedinternationally through the series Digital ComprehensiveSummaries of Uppsala Dissertations from the Faculty ofMedicine. (Prior to January, 2005, the series was publishedunder the title “Comprehensive Summaries of UppsalaDissertations from the Faculty of Medicine”.)

Distribution: publications.uu.seurn:nbn:se:uu:diva-362157

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2018