stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/fulltext01.pdf ·...

72
ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1892 Stolen genes, a shortcut to success Evolution of metabolic and detoxification capacities in Diplomonads ALEJANDRO JIMENEZ-GONZALEZ ISSN 1651-6214 ISBN 978-91-513-0844-9 urn:nbn:se:uu:diva-401276

Upload: others

Post on 22-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2020

Digital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Science and Technology 1892

Stolen genes, a shortcut to success

Evolution of metabolic and detoxification capacitiesin Diplomonads

ALEJANDRO JIMENEZ-GONZALEZ

ISSN 1651-6214ISBN 978-91-513-0844-9urn:nbn:se:uu:diva-401276

Page 2: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

Dissertation presented at Uppsala University to be publicly examined in Room B21,Biomedicinskt centrum (BMC), Husargatan 3, Uppsala, Friday, 28 February 2020 at 09:15 forthe degree of Doctor of Philosophy. The examination will be conducted in English. Facultyexaminer: Professor Thomas A. Richards (University of Oxford, Department of Zoology).

AbstractJimenez-Gonzalez, A. 2020. Stolen genes, a shortcut to success. Evolution of metabolic anddetoxification capacities in Diplomonads. Digital Comprehensive Summaries of UppsalaDissertations from the Faculty of Science and Technology 1892. 71 pp. Uppsala: ActaUniversitatis Upsaliensis. ISBN 978-91-513-0844-9.

Parasites do not represent a single evolutionary lineage meaning that they have evolved severaltimes. The changes that parasites have undergone to adapt to such a lifestyle are not entirelyunderstood. This thesis focuses on the study of diplomonads (Fornicata, Metamonada), a groupof host-associated or free-living protists, to better understand how they adapted to differentenvironments and hosts.

Diplomonads, and close relatives, are found in low-oxygen environments. However, somespecies can withstand fluctuating levels of oxygen. In a first study, we reconstructed the oxygendetoxification pathway of Fornicata and study its evolution. Comparative genomics showed thatFornicata shares a common pathway with lineage-specific modifications. Phylogenetic analysesshowed a pathway in a constant change where proteins have been gained and lost.

In a second study, the Giardia muris genome was sequenced and compared to the genomeof G. intestinalis WB. We reconstructed the metabolic capacities of both species. Our analysesshowed that the observed differences are the result of gene acquisitions or differential lossesthat can be explained based on differences in the environment of the hosts.

Considering what we observed in the two previous studies, we reconstructed the metaboliccapacities of four diplomonads. Using cluster analysis, we reconstructed the putativemetabolism of the last Diplomonadida common ancestor. Our analyses suggested that thisancestor was, most likely, an obligate host-associated organism. However, we identified thattraits associated with parasitic diplomonads evolved in a free-living ancestor.

In the last study, we analyzed the genome of Hexamita inflata, a free-living, diplomonad.Our analyses showed that Trepomonas sp. PC1 and H. inflata acquired important genes forthe adaptation to a secondary free-living in a common ancestor. However, our analyses alsoshowed independent adaptations. The synthesis of glutathione and the acquisition of glutathioneperoxidase, most likely, allow H. inflata to detoxify higher levels of oxygen and arsenic thanother diplomonads.

In conclusion, this thesis highlights the value of metabolic analyses to identify howmicrobial eukaryotes interact with their environment. The phylogenetic approach shows that theacquisition of genes and differential losses have been important processes in the adaptation ofdifferent hosts and environments.

Keywords: Diplomonads, Parasitism, Evolutionary Genomics, Metabolism, Lateral GeneTransfer

Alejandro Jimenez-Gonzalez, Department of Cell and Molecular Biology, MolecularEvolution, Box 596, Uppsala University, SE-752 37 Uppsala, Sweden.

© Alejandro Jimenez-Gonzalez 2020

ISSN 1651-6214ISBN 978-91-513-0844-9urn:nbn:se:uu:diva-401276 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-401276)

Page 3: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

To my family

Page 4: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National
Page 5: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Jiménez-González, A., Xu, F., Andersson, JO. (2019) Lateral

acquisitions repeatedly remodel the oxygen detoxification path-way in diplomonads and relatives. Genome Biology and Evolu-tion, 11(9), 2542-2556.

II Xu, F., Jiménez-González, A., Einarsson, E., Ástvaldsson, Á.,

Peirasmaki, D., Eckmann, L., Andersson, JO., Svärd, SG., Jerl-ström-Hultqvist, J. (2019) The compact genome of Giardia mu-ris reveals important steps in the evolution of intestinal protozoan parasites. bioRxiv.

III Jiménez-González, A., Andersson, JO. (2019) Metabolic recon-

struction elucidates the lifestyle of the last Diplomonadida com-mon ancestor. Manuscript.

IV Jiménez-González, A., Kurt, Z., Stairs, C., Xu, F., Stoklasa, M.,

Svärd, SG., Tachezy, J., Andersson, JO. (2019) The Hexamita inflata genome reveals new trends in diplomonads evolution. Manuscript.

Reprints were made with permission from the respective publishers.

Note The supplementary materials of the papers included in the thesis (published and unpublished) are available on the Google Drive directory and will be maintained until February 29 2020:

https://drive.google.com/open?id=1bKmsFU1yo0W94pzuSl73lU1CDJKNsVTa

Page 6: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National
Page 7: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

Contents

Introduction ...............................................................................................11

Parasitism ..................................................................................................12

Diplomonads .............................................................................................16 The discovery of diplomonads ...............................................................16 Classification ........................................................................................17

Diplomonads within the eukaryotic tree of life ..................................18 Diplomonads phylogeny ...................................................................22

Cellular features and lifestyle ................................................................23 Giardiasis and spironucleosis ............................................................26

Adaptation via the acquisition of new functions .........................................27 The origin of new functions...................................................................27

Duplication and new function ...........................................................27 Domain fusion ..................................................................................29 De novo origin ..................................................................................30

The spread of functions .........................................................................30 Endosymbiosis gene transfer .............................................................30 Lateral Gene Transfer .......................................................................32

Aims .........................................................................................................36

Comparative genomics and phylogenetic analyses .....................................37 Data sources (genomes).........................................................................37 Annotation (from genomes to functions) ...............................................38

Structural annotation .........................................................................38 Functional annotation .......................................................................38

BLAST and CD-HIT-2D (from functions to datasets) ............................40 Single gene tree (from datasets to phylogenetic histories) ......................42 Ancestral reconstruction ........................................................................43

Results ......................................................................................................45 Evolution of the oxygen detoxification pathway in Fornicata (Paper I) ..45 The genome of Giardia muris (Paper II) ................................................46 Metabolic reconstruction and the last Diplomonadida common ancestor (Paper III)................................................................................48 The genome of Hexamita inflata (Paper IV) ..........................................50

Page 8: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

Discussion .................................................................................................52

Conclusions and future work .....................................................................55

Svensk sammanfattning .............................................................................56

Resumen en español ..................................................................................58

Acknowledgements ...................................................................................61

References .................................................................................................63

Page 9: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

Abbreviations

YVSP Pseudogenized variant-specific surface protein BLAST Basic Local Alignment Search Tool CDD Conserved Domain database eToL Eukaryotic tree of life KEGG Kyoto Encyclopedia of Genes and Genomes LBA Long branch attraction LGT Lateral gene transfer Mbp Mega base pairs MSA Multiple sequence alignment NCBI National Center for Biotechnology Information NO Nitric oxide PPDK Pyrophosphate-dependent pyruvate phosphate dikinase ROS Reactive oxygen species SOD Superoxide dismutase SOR Superoxide reductase VSP Variant-specific surface protein

Page 10: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

10

Page 11: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

11

Introduction

During the past decades, the number of available eukaryotic genomes and transcriptomes has increased extraordinarily, both from host-associated and free-living species. Genomic comparisons between parasites and free-living species have helped us to understand the transition from a free-living to a par-asitic lifestyle. The analyses of these data have revealed some of the mecha-nisms that allow parasites to adapt to such a lifestyle. However, do we really know what a parasite is? In the first chapter of this thesis, I will discuss the actual knowledge about this particular lifestyle.

Every contribution in science is based on discoveries that someone made before. For that reason, I presented the subsequent chapters as a journey through the history of diplomonads and the discovery of genetic mechanisms like gene transfer. I wanted to honor the people before me that made these studies possible. With this idea in mind, I hope that the data, analyses and personal suggestions can help other scientists in the future.

The results presented here represent our efforts to understand how diplo-monads adapted to different environments and hosts. We focused our studies on the metabolism of diplomonads and how it evolved in different lineages. Our studies revealed that the acquisition of genes via lateral gene transfer and the differential loss of genes have been two crucial mechanisms that shaped the metabolism of organisms in this group.

I do not want to extend this introduction much longer. I would like to finish it with the hope that you enjoy reading the following chapters as much as I enjoyed writing them.

Page 12: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

12

Parasitism

Man is infested with internal parasites, sometimes causing fatal effects; and is plagued by external parasites, all of which belong to the same genera of fami-lies as those infesting other mammals, and in the case of scabies to the same species.

Charles R. Darwin, 1871. The descent of man, and selection in relation to sex • 1st edition.

In nature, the organisms do not live isolated from each other. Quite the oppo-site, the ecosystems are made of different species interacting with it and with each other. The interactions can be divided into mutualistic, commensal and parasitic, based on the benefit obtained for the members. Mutualistic interac-tions are those in which the members obtain a clear benefit from the interac-tion. In commensal interactions, only one member of the interaction obtains a clear advantage, while the interaction is neutral for the other member. Parasitic interactions are beneficial for one member but, unlike the commensalism, the result is harmful to the other member. In this section, I will introduce parasitic interaction and how our understanding and approaches for their study have changed over the years.

Etymologically, the word parasite comes from the Greek term parasitos meaning “a person eating at another’s table”. In biology, the term parasite de-notes the organism that obtains nutrients, or any other benefit, at the expense of a host organism, which may directly or indirectly be harmed. We can find different kinds of parasites. Ectoparasites that infest the outside or endopara-sites that infest the inside of the host. Intracellular parasites that invade cells of the host or extracellular parasites that infest the tissue but do not invade the cells. Finally, parasites can be considered obligate when they always parasi-tize the host or facultative when they become parasitic depending on the con-ditions of the host. We can find parasites in all domains of the tree of life. Parasites share a common ancestor with free-living species and do not repre-sent a single lineage. This distribution indicates that parasitism has appeared independently several times in evolution (Klinger et al. 2016).

Historically, parasites have been extensively studied because of their im-pact on the economy. Nowadays, a large number of diseases caused by para-sites in humans, livestock and crops can be cured. However, other parasites, both human and non-human parasites, have been neglected, and we have just started to focus on understanding their molecular mechanisms and evolution.

Page 13: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

13

Our approach to study parasites has also changed throughout the years. In 1890, Robert Hermann Koch published his famous postulates to establish a causative relationship between a pathogen and a disease (Blevins & Bronze 2010). The original postulates have been updated several times because of the increment of knowledge about parasites. For example, Sir Austin Bradford Hill proposed ten criteria to provide the causative relationship between a path-ogen and a disease in 1965 (Hill 1965). Both Koch’s and Hill’s postulates helped to identify lots of microorganisms as the agents responsible for dis-eases and have been very helpful since their publication. However, they do not apply to certain diseases or parasites for different reasons. Recently, the pathobiome concept has been introduced in the study of parasitism (Vayssier-Taussat et al. 2014). Unlike the previous idea of one-pathogen-one-disease, under the pathobiome, a disease is defined as a result of the interaction be-tween a set of host-associated organisms (bacteria, protists, fungi, viruses or phages) and the host (Bass et al. 2019).

Identifying the agent or agents responsible for a disease is just the begin-ning to understand how parasites adapted to live in such conditions and how they can harm the host. For example, the improvement of molecular tech-niques allowed to identify several virulence factors. Some of these factors were considered unique to parasitic species, and it was thought that they orig-inated as an adaptation to parasitism. With the rise of the large scale the se-quencing techniques, parasitic species are compared with their free-living rel-atives easier than years ago. Some of the virulence factors have shown homol-ogy between parasites and their free-living relatives (Janouškovec et al. 2015). This homology indicates that they appeared in a pre-parasitic lifestyle, and that their original role was not related to parasitism most likely. This discovery helps us to unveil the adaptation to parasitism. It allows us to divided the char-acteristic of a parasite into two categories (Figure 1) (Janouskovec & Keeling 2016): those characteristics that evolved under a host-associated interaction (characteristic driven by parasitism), and those characteristics that evolved in a free-living but helped in the transition to parasitism (parasitism driven by a characteristic).

Page 14: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

14

Figure 1. Models for the transition from free-living organism to parasite. Adapted from Janouskovec, J. and Keeling, PJ. (2016).

Then, what is needed to become a parasite? The answer to this question is not easy. As it is mentioned at the beginning of this section, parasites evolved from a free-living ancestor, and they do not represent a single evolutionary lineage. However, parasites share the same need. They need a host to grow and multiply. Without a host, they cannot complete their life-cycle. To do so, they face similar problems (Poulin & Randhawa 2015). First, parasites need to avoid the immune system of the host and to transmit between hosts. Second, parasites need to withstand abiotic factors like oxygen, pH or temperature. Depending on which part of the host is infected, these factors are very varia-ble. Third, like other species, parasites need to have access to a source of nu-trients. Fourth, parasites need to fight other species that normally live within the host or also try to infest it.

If we compare free-living and parasitic species from a metabolic point of view, parasites have usually lost the capacity to synthesize several com-pounds. For example, some parasites have lost the synthesis of nucleotides de novo and import them actively from the host (Poulin & Randhawa 2015). Trichomonas vaginalis, for instance, lost most of the enzymes related to the synthesis of lipids, and it takes in most of them, including cholesterol, from the host (Carlton et al. 2007). Entamoeba histolytica lacks the genes for the synthesis of folate (Loftus et al. 2005). In contrast to this biosynthesis simpli-fication, parasites have acquired metabolic genes as an adaptation to a para-sitic lifestyle. Cryptosporidium parvum acquired proteins involved in the sal-vage of nucleotides via lateral gene transfer (LGT) (Striepen et al. 2004). Sometimes the acquisitions allow the parasite to have access to new sources of carbohydrates. For example, Blastocystis acquired chitinases and sialidases (Eme et al. 2017) and, E. histolytica acquired several glycosidases via LGT (Loftus et al. 2005).

Page 15: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

15

Parasites also developed unique cellular structures in response to the life-style. For example, Giardia intestinalis expose variable-surface proteins (VSPs) outside the membrane. It has been proposed that these proteins help the cell to escape the immune system of the host (Adam et al. 1988; Nash et al. 1988). In other cases, parasites repurpose proteins already present in the free-living ancestor. For example, apicomplexans use proteins to invade the host that are orthologs to proteins of the flagellar apparatus in the free-living relatives (Woo et al. 2015). Giardia species have a unique structure called the ventral disk, which is involved in the attachment to epithelial cells. Elements of this disk are derived from modifications of the cytoskeleton (Hagen et al. 2011; Brown et al. 2016).

Independently of the host, the area infected and the life-cycle, all parasites share a series of characteristics related to such a lifestyle. In general, parasites have more reduced nuclear genomes compared to their free-living relatives. The most extreme examples of nuclear reduction known are from intracellular parasites. To give an idea of such reduction, Ostreococcus tauri has the small-est free-living eukaryotic genome with 12.56 Mbp (Derelle et al. 2006), while the Encephalitozoon intestinalis has the smallest parasitic eukaryotic genome with only 2.3 Mbp (Corradi et al. 2010). This genome reduction is not limited only to the nucleus. The reduction of the mitochondria has been reported in many parasitic lineages. However, this reduction can be the result of an adap-tation to low-oxygen environments since many free-living species also have reduced mitochondria or mitochondria-related organelle (Tsaousis et al. 2012). The reduction of genomes is also achieved through the reduction of numbers of introns and mobile elements or the length of intergenic regions (Poulin & Randhawa 2015).

Diplomonads constitute an excellent group to study the evolution of para-sitism for several reasons (see chapter “Diplomonads”). Almost all the closest relatives to diplomonads are free-living and share some cellular and ecological features with diplomonads. Within this group, we can find free-living, com-mensal and parasitic species. Some of these parasitic species show a broad range of hosts, while others seem to be more specific. Some diplomonads can be grown in vitro, allowing us to establish cultures under different conditions.

Page 16: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

16

Diplomonads

Alle dese verhaelde deeltgens lagen in een heldere doorschijnende materie, in welke heldere materie ik op eenige tijden gesien heb, dat seer aerdig beweeg-den, dierkens eenige wat grooter andere wat kleijnder, als een globule bloet, alle van een ende deselvige maeksel, haer lichamen waren wat langer als breet, en haer onderlijf dat platagtig was, met verscheijde pooten versien, met de welke sij soo danige beweging door de heldere materie, en globulen maekten, als of wij ons inbeelden, een pissebedde tegen een muijer te sien op loopen; en al hoe wel sij een vaerdige beweginge met haer pooten maekten, soo hadden deselvige nogtans een trage voortgang1.

Antony van Leeuwenhoek, 1948. Alle de brieven. Deel 3: 1679-1683.

The discovery of diplomonads The paragraph that opens this chapter is one of the earliest descriptions of a protist and the first description of a parasitic protist. Leeuwenhoek observed these animalcules, as he called them, in a sample from his diarrheic stools. Based on this description, the animalcule could be identified as the parasite Giardia intestinalis, the agent responsible for the diarrheal-disease giardiasis (Dobell 1920). He was infected with this protist and made precise observa-tions. However, Leeuwenhoek did not correlate its present with the diarrheic episodes. It is not until almost two hundred years later when the Czech physi-cian Vilem Lambl described scientifically G. intestinalis in 1859. He classi-fied it as Cercomonas intestinalis and provided the first microscopic drawing of this protist (Lambl 1859). This species suffered changes in the nomencla-ture several times throughout history. Blanchard suggested the name Lamblia intestinalis in 1888 (Blanchard 1888); Styles used G. duodenalis in 1902 (Stiles 1902); Kofoid and Christiansen proposed the name G. lamblia in 1915 (Kofoid & Christiansen 1915), and subsequently, Kofoid changed it to G. en-terica in 1920 (Kofoid 1920). Nowadays, G. intestinalis, G. lamblia and G. duodenalis are used as synonyms. 1All these described particles lay in a clear transparent medium, in which I have at times seen very prettily moving animalcules, some rather larger, others somewhat smaller than a blood corpuscle, and all of one and the same structure. Their bodies were somewhat longer than broad, and their belly, which was flattened, provided with several feet, with which they made such a movement through the clear medium and the globules that we might fancy we saw a pissabed running up against a wall. But although they made a rapid movement with their feet, yet they made but slow progress (Dobell 1920).

Page 17: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

17

Giardia intestinalis was not the only described Giardia during the 19th cen-tury. Grassi described a rodent organism as Dimorphus muris in 1879. Later, this organism would be known as G. muris. Curiously, the first time that Gi-ardia was used as a genus name was not with G. intestinalis. Kunstler de-scribed a flagellated in tadpoles and classified it as Giardia in 1882 (Kunstler 1882). Subsequently, this organism was classified as G. agilis. Since then, three more species of Giardia have been described: G. psittaci in 1987 (Erlandsen & Bemrick 1987), G. microti in 1988 (Feely 1988), and G. ardeae in 1990 (Erlandsen et al. 1990).

Eighteen years before Lambl’s description of C. intestinalis (G. intesti-nalis), Dujardin described the species Trepomonas agilis, Hexamita nodulosa, H. inflata and H. intestinalis in 1841 (Dujardin 1841). This description is con-sidered the oldest reference for these two genera and the first description of free-living diplomonads. In 1878, Bütschli unified H. nodulosa and H. inflata under the latter. By the end of the 19th century, two more genera were de-scribed: Gyromonas (Seligo 1887) and Trigonomonas (Klebs 1892). The be-ginning of the 20th century saw the description of three new diplomonad gen-era: Octomitus in 1904 (Prowazek 1904), Trimitus in 1910 (Aléxéieff 1910) and Enteromonas in 1915 (da Fonseca 1915).

It is not until 1936 when the second most studied diplomonad genus, Spi-ronucleus, was described for the first time (Lavier 1936). To date, nine Spiro-nucleus species have been identified: S. barkchanus, S. columbae, S. elegans, S. meleagridis, S. muris, S. salmonicida, S. salmonis, S. torosa and S. vortens (Ástvaldsson 2019). However, some species were described before but classi-fied under Hexamita or Octomitus. At the time, only light microscopy obser-vations were used to describe species. This technique could create confusion because these three genera are morphological very similar. For example, S. muris was previously known as H. muris, and S. salmonis as O. salmonis or H. salmonis (Poynton et al. 2004). These names could be changed in the future because, as I will explain in the next section of this chapter, the classification within diplomonads is still a matter of discussion with Spironucleus as a pol-yphyletic group.

The last diplomonad genus to be described to date is Brugerolleia (Desser et al. 1993). The only species that is known was isolated from the blood of frogs in Ontario, Canada.

Classification The classification of diplomonads has undergone several changes through his-tory both within the group and within the eukaryotic tree of life (eToL). Here, I will try to summarize both, starting from the position of diplomonads within the eToL.

Page 18: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

18

Diplomonads within the eukaryotic tree of life Diplomonads lack some of the classical eukaryotic characteristics, such as pe-roxisomes, Golgi apparatus and mitochondria (see section “Cellular features and lifestyle”). The absence of these traits is not unique to this group. We can find members of Parabasalia (e.g., Trichomonas), Archamoebea (e.g., Enta-moeba) and Microsporidia (e.g., Encephalitozoon) without peroxisomes, Golgi and mitochondria. The Archezoa hypothesis unified these organisms in an early-branching monophyletic group of eukaryotes (Cavalier-Smith 1983). It was believed that these amitochondriate species were the eukaryotic survi-vor lineages before the acquisition of the classical eukaryotic features (Figure 2). Molecular data seemed to support this hypothesis. Phylogenetic trees using rRNA or proteins, like EF-1a and EF-2, placed these groups as the deepest branching eukaryotes (Vossbrinck et al. 1987; Sogin et al. 1989; Kamaishi et al. 1996).

Figure 2. Archezoa hypothesis. Adapted from Keeling, PJ (1998).

However, experimental data and phylogenetic analyses contradicted the Archezoa hypothesis in the coming years after its publication. Mitochondriate eukaryotes have several genes in the nuclear genome that codes for proteins used in the mitochondria. Several of these genes were transferred from the bacterial symbiont that gave rise to mitochondria and can be separated from other nuclear genes. Finding mitochondrial genes in any amitochondriate or-ganism would suggest that this organism evolved after the acquisition of the

Page 19: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

19

mitochondrion, and that the absence of mitochondria is a secondary adapta-tion.

Entamoeba and other relatives were placed among the mitochondriate eu-karyotes when two genes with mitochondrial origin were found, pyridine nu-cleotide transhydrogenase and cpn60 (Clark & Roger 1995). Phylogenetic analyses placed Cpn60 with good support together with homologs from other eukaryotes. The identification of these two proteins also allowed to identify a double membrane-bound organelle called the mitosome (Tovar et al. 1999). Mitosomes and hydrogenosomes (see below) are mitochondria-related orga-nelles derived from aerobic mitochondria. They are modified mitochondria adapted to anaerobic or microaerobic environments (Graaf & Hackstein 2012). Hydrogenosomes lack the electron transport chain and produce ATP via fermentation of pyruvate with the release of H2 (Müller 1988). Mitosomes, on the other hand, have lost all functions related to energy production, and their only known function is the assembly and maturation of iron-sulfur clus-ters (Tovar et al. 2003).

The same thing happened with Microsporidia. The mitochondrial-origin gene 70-kDa heat shock protein was identified in the nuclear genome of En-cephalitozoon cuniculi (Peyretaillade et al. 1998). This led to the description of mitosomes in this group (Williams et al. 2002). Nowadays, Microsporidia are classified within Fungi (Adl et al. 2019).

The case of Parabasalia species was different from Entamoeba and Micro-sporidia. Parabasalia species have a double membrane-bound organelle called the hydrogenosome (Lindwiark & Muller 1973; Cerkasovová et al. 1973). The origin of this organelle remained unclear because hydrogenosomes were de-scribed in both Archezoa, like Trichomonas, and well-established mitochon-driate groups and resembled that of aerobic mitochondria in some cases (Graaf & Hackstein 2012). Once more, the identification of mitochondrial-origin genes in the nuclear genome of Trichomonas gave the first probe of the mitochondrial origin of these organelles. Phylogenetic analyses showed that the T. vaginalis Cpn60 formed a phylogenetic group with mitochondrial Cpn60 (Roger et al. 1996). Molecular data also showed that the hydrogenoso-mal targeting signal from T. vaginalis is functional in mitochondria of yeast and Trypanosoma (Häusler et al. 1997)

However, what happened with diplomonads? Studies on G. intestinalis were the first one to shake the foundation of the Archezoa hypothesis. Immu-nofluorescence micrographs using antibodies against Hsp60 (a mitochondrial protein) showed cross-reaction with proteins in G. intestinalis (Soltys & Gupta 1994). However, this immunofluorescence reaction did not target any partic-ular compartment of the cell. It seemed that this protein was dispersed all around the cytoplasm, something unexpected for a protein with a mitochon-drial origin. The final evidence that showed the mitochondrial origin of this protein was the isolation and phylogenetic analysis of this protein in 1998 (Roger et al. 1998).

Page 20: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

20

Figure 3. New eukaryotic Tree of Life. Adapted from Burki, F. et al. (2019)

Page 21: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

21

Table 1. Metamonada group and Fornicata subdivisions. Adapted from Adl, S. et al. (2019)

Metamonada

Fornicata

Diplomonadida

Giardiinae

Brugerolleia

Giardia

Octomitus

Hexamitinae

Enteromonas

Gyromonas

Hexamita

Spironucleus

Trepomonas

Trigonomonas

Trimitus

“Carpediemonas-like organisms”

Aduncisulcus

Carpediemonas

Dysnectes

Ergobibamus

Hicanonectes

Kipferlia

Caviomonadidae Caviomonas

Iotanema

Retortamonadida Chilomastix

Retortamonas

Parabasalia

Preaxostyla

Page 22: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

22

Diplomonads are classified within Fornicata (Table 1). Recently, the genome and transcriptome of different Fornicata have been published (Leger et al. 2017; Tanifuji et al. 2018). These data helped to solve the classification of Fornicata and the position of diplomonads within it. Fornicata is classified within Metamonada (Figure 3). This group only comprises anaerobic protists, and phylogenetic analyses support its monophyly. Apart from Fornicata, Met-amonada is divided into two more subgroups: Preaxostyla and Parabasalia (Simpson 2003). For years Metamonada has been classified within Excavata (Cavalier-Smith 2002). Three groups were considered within Excavata: Dis-coba, Malawimonada and Metamonada. However, recent phylogenomic stud-ies showed that the Excavata group could be polyphyletic and the result of phylogenetic artifacts. In the last eukaryotic classification, Excavata is con-sidered as Incertae sedis (Figure 3) (Adl et al. 2019; Burki et al. 2019). With the continuous description of new species and lineages and the improvement of phylogenomic analysis, the real position of Metamonada will be defined in the future.

Diplomonads phylogeny Probably the earliest classification of diplomonads was made by Dujardin in 1841 (Dujardin 1841). He classified the genera Hexamita and Trepomonas within the family Monadiens or Monads. Within this family, we could also find the genera Trichomonas and Cercomonas, where G. intestinalis was first classified. Dujardin based this classification on morphological characters. In his classification, Monads were considered the simplest of the aquatic animal-cules, without any kind of tegument or visible organs (Dujardin 1841). It was not until 1926 when Wenyon described the order Diplomonadida based on the binary axial symmetry and the duplication of the nucleus and other organelles (Wenyon 1926). In this classification, he only included the genera that have two nuclei, Hexamita, Giardia and Trepomonas within diplomonads, while he kept Enteromonas within the family Monadidae and placed Trimitus within the family Cercomonadidae. He also proposed that the origin of this symmetry was originated from a uninucleated ancestor (see section “Cellular features and lifestyle”).

The International Society of Protistology (Protozoology at the time) estab-lished a Committee on Taxonomy and Taxonomic Problems in 1954. In its first classification, ten years after its creation, diplomonads were classified within the class Zoomastigophorea (superclass Mastigophora) (Honigberg et al. 1964). Due to discoveries, a revision on the classification was made in 1980 (Levine et al. 1980). This new classification divided the order Diplomonadida into two suborders: Diplomonadina (Giardia, Hexamita and Trepomonas) and Enteromonadina (Enteromonas and Trimitus). It was not until 2005 when En-teromonas and Trimitus, both of which have a single nucleus started to be classified within diplomonads based on molecular data, and all known genera

Page 23: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

23

were included (Kolisko et al. 2005). The lastest eukaryotic classification di-vides the diplomonads into two monophyletic groups (Table 1): Giardiinae, with Giardia, Octomitus and Brugerolleia, and Hexamitinae, with Hexamita, Spironucleus, Trepomonas, Enteromonas, Trimitus, Gyromonas and Trigono-monas (Adl et al. 2019). At the same time, molecular phylogenetic analyses divided diplomonads into four groups (Kolisko et al. 2010; Takishita et al. 2012; Xu et al. 2016). Group I is equivalent to Giardiinae, while groups II-IV correspond to Hexamitinae. Trepomonas and Hexamita are included only in the group IV together with Trimitus, Enteromonas and S. muris. One of the main issues of the actual diplomonads taxonomy is that the genus Spironu-cleus is polyphyletic with species spread within groups II-IV. This polyphyly suggests that this genus will need a revision in the future.

Cellular features and lifestyle The first trait that draws the attention in most diplomonads is the bilateral axial symmetry with double karyomastigont where every cell contains two nuclei and eight basal bodies. There are two exceptions to this cellular architecture. Enteromonas and Trimitus lack the dizoic structure, having only one kary-omastigont per cell (Adam 2017). Two theories tried to explain the evolution of the karyomastigont within diplomonads. The first theory suggested that the last Diplomonadida common ancestor had a single karyomastigont and the double karyomastigont evolved several times within the group (Siddall et al. 1992). The second explanation suggested that the last Diplomonadida com-mon ancestor had already a double karyomastigont and Enteromonas and Tri-mitus reverted secondarily to a single karyomastigont situation (Kolisko et al. 2005, 2008). This question is difficult to address from a bioinformatic and an experimental point of view.

Figure 4. Scanning electron microscopy (SEM) images of S. salmonicida. Images by Ásgeir Ástvaldsson, Uppsala University

Page 24: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

24

Diplomonads are flagellated protists (Figure 4). One flagellum emerges from each basal body, making eight flagella in dizoic cells and four in unizoic cells. There are two exceptions to this rule. Trimitus (unizoic) has three flagella per cell, and Gyromonas (dizoic) has only four flagella per cell (Adam 2017). The flagella are mainly related to locomotion. However, they are also related to the feeding in bacteriophagous species. In these genera, the endocytosis oc-curs through the cytostome. For example, there is a groove where three fla-gella lie inside in Trepomonas. The movement of these flagella drags bacteria to the cytostome localized at the base of the flagella. This cytostome is absent in non-phagocytic species (e.g., Giardia), and nutrients enter via pinocytosis (Adam 2017).

Diplomonads lack the classical aerobic mitochondria and were considered amitochondiate organisms for years (see section “Diplomonads within the eu-karyotic tree of life”). Now we know that this amitochondriate situation was incorrect, and diplomonads have hydrogenosomes (e.g., S. salmonicida) or mitosomes (e.g., G. intestinalis). Having mitochondria-related organelles is not exclusive of diplomonads, and examples of species with mitochondria-related organelles are spread all around the eToL (Tsaousis et al. 2012).

The overall metabolism of diplomonads remains unstudied and most of what it is known experimentally is based on G. intestinalis and Hexamita. Diplomonads reduce glucose to pyruvate via glycolysis (Lindmark 1980; Biagini et al. 2003). This pathway is energetically optimized in G. intestinalis but no in H. inflata. Both species have a pyrophosphate-dependent phos-phofructokinase, instead of the widespread ATP-dependent phosphofructoki-nase (Li & Phillips 1995; Phillips et al. 1997). G. intestinalis also has a pyro-phosphate-dependent pyruvate phosphate dikinase (PPDK) that catalyzes the production of five ATPs per molecule of pyruvate (Feng et al. 2008). The pro-duced pyruvate is used to synthesize ATP. The synthesis of ATP takes place within the hydrogenosomes or in the cytosol of species with mitosomes. The degradation of amino acids is another important source of energy in diplomon-ads. For example, the catabolism of arginine via the arginine dihydrolase path-way leads to the synthesis of ATP (Schofield et al. 1992; Dimopoulos et al. 2000). Experimental data showed that the glycolysis is favored in microaero-bic condition, while arginine metabolism predominates in anaerobic condi-tions in Hexamita (Biagini, McIntyre, et al. 1998).

The metabolism of lipids is very limited in diplomonads. They lack the synthesis de novo of lipids and need to uptake most of them, including cho-lesterol, from the environment (Lujan et al. 1996; Biagini, Rutter, et al. 1998).

Giardia intestinalis and S. salmonicida are not able to synthesize nucleo-tides de novo, and only salvage pathways have been detected (Wang & Aldritt 1983; Aldritt et al. 1985; Xu et al. 2014). This situation contrasts with S. vortens, S. barkhanus and Trepomonas sp. PC1 because they can synthesize deoxynucleotides de novo using an anaerobic ribonucleoside-triphosphate

Page 25: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

25

reductase, which uses formate and nucleoside-triphosphate as substrates (Xu et al. 2016).

Diplomonads are found in anaerobic or microaerobic environments (Adam 2017). However, they can withstand fluctuating levels of oxygen (Biagini et al. 1997; Lloyd & Williams 2014). Analyzed diplomonads lack catalases, pe-roxidases and superoxide dismutase. Experimental data have shown that G. intestinalis uses NADH oxidase, superoxide reductase, A-type flavoprotein as the main defenses against oxygen (Brown et al. 1995, 1996, 1998; Vicente et al. 2009; Testa et al. 2011). They are among the proteins with the highest ex-pression during oxidative stress in G. intestinalis and S. salmonicida (Raj et al. 2014; Ma’ayeh et al. 2015; Stairs et al. 2019). In addition to these proteins, G. intestinalis has a flavohemoprotein that removes O2 but also NO (Rafferty et al. 2010; Mastronicola et al. 2010)

Host-associated diplomonads, both parasites and commensals, can be iso-lated from a wide range of animals. Giardia species have been isolated from rodents (G. muris), voles and muskrats (G. microti), psittacine birds (G. psit-taci), herons (G. ardeae), amphibians (G. agilis), and human and other mam-mals (G. intestinalis) (Adam 2001). Even though G. intestinalis can be found in such a wide range of mammals, the species can be divided into different assemblages that have a narrower specificity (Monis et al. 2009). The different assemblages are genetically isolated and probably should be considered dif-ferent species (Xu et al. 2012). Spironucleus species are mainly found in fish (S. barkhanus, S. elegans, S. salmonicida, S. salmonis, S. torosa and S. vortens). However, other Spironucleus species have been isolated from ro-dents (S. muris), and birds (S. meleagridis and S. columbae) (Ástvaldsson 2019). Octomitus intestinalis colonizes the digestive tract of rodents and am-phibians (Keeling & Brugerolle 2006), while Enteromonas hominis can be found in mammals, including humans (Adam 2017; Lokmer et al. 2019).

In diplomonads, it is also possible to find free-living species. They belong to four genera classified within Hexamitinae. Gyromonas and Trigonomonas species are rarely found and always in sediments and freshwater (Adam 2017). Hexamita and Trepomonas species are free-living but also host-associated (Siddall et al. 1993). For example, Trepomonas agilis has been described in the intestine of amphibians (Bishop 1937), while Hexamita species have been isolated from the intestine of different vertebrates and invertebrates (Adam 2017). Recently, Trepomonas sp. PC1 has been described as a secondary free-living (Xu et al. 2016).

Host-associated species have a life cycle with two well-differentiated stages. The cyst is the non-replicating stage and can be found outside the host in sediments, while the trophozoite is the active, replicating stage and is found within the intestinal tract of the host (Adam 2017). In G. intestinalis, it is pos-sible to induce the formation of cyst in vitro (Gillin et al. 1988). However, the formation of cysts is not exclusive of host-associated and it has also been ob-served in free-living Hexamita species (Brugerolle & Lee 2000). In some

Page 26: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

26

Spironucleus species, the cyst form has been isolated in the environment. However, in the case of S. salmonicida, the cyst form has not been described or isolated (Ástvaldsson 2019). However, genomic evidence suggested the ex-istence of a cyst stage in this species (Xu et al. 2014). It is possible that some Spironucleus species spread among hosts as a trophozoite.

Diplomonads are divided into two monophyletic lineages (see section “Diplomonads phylogeny”). Interestingly, Giardiinae and Hexamitinae do not use the same genetic code. While Giardiinae species use the canonical genetic code, Hexamitinae species use a non-canonical genetic code where TAA and TAG encode glutamine instead of being stop codons (Keeling & Doolittle 1996; Kolisko et al. 2008). This alternative codon usage leaves TGA as the only stop codon in Hexamitinae species. This characteristic has biological im-plications that will be addressed in the next chapters.

Giardiasis and spironucleosis The disease caused by Giardia species is called giardiasis. However, the mechanisms of the giardiasis are not entirely understood. Giardia is a non-invasive parasite and colonizes the upper part of the small intestine, where the trophozoites replicate attached to the intestinal epithelium through the ventral disk. In humans, the most common symptoms are diarrhea, nausea, epigastric pain and weight loss (Ankarklev et al. 2010). Giardia expresses VSPs on the cytoplasmic membrane. It has been proposed that VSPs act as an antigenically variable protein shield to avoid the detection by the immune system of the host (Adam et al. 1988; Nash et al. 1988). During the interaction with intestinal cells, Giardia releases proteins that could play different roles attenuating the immune response during giardiasis, competing for resources with the host and the microbiota of the host (Ma’ayeh et al. 2017). Studies showed that proteases released by Giardia disrupt the normal microbiota of the host. This disruption produces the imbalance of the microbiota community, causing alterations of the epithelial cells, even after the clearance of Giardia (Beatty et al. 2017).

Some species of Spironucleus were classified under Hexamita. This classi-fication is the reason that spironucleosis and hexamitiasis are sometimes used as synonyms in the bibliography (Wood & Smith 2005). Our knowledge re-garding the mechanisms of infection is even more limited, compared with gi-ardiasis. Like Giardia, Spironucleus is an extracellular parasite and also col-onizes the intestinal tract of the host. However, S. salmonicida can cross the intestinal epithelium and cause systemic infections (Williams et al. 2011). Spi-ronucleus also expresses VSP-like proteins on the outside of the cell mem-brane as a mechanism to avoid the immune system of the host (Xu et al. 2014).

Page 27: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

27

Adaptation via the acquisition of new functions

Some genes of bacterial origin were expected as the result of transfer from mitochondria, of course, but these were thought to be relatively few, and lim-ited to producing proteins reimported into mitochondria. Here, I suggest that the presence of many bacterial genes with many kinds of functions should not be a surprise. The operation of a gene transfer ratchet would inevitably result in the replacement of nuclear genes of early eukaryotes by genes from the bac-teria taken by them as food.

W. Ford Doolittle, 1998. You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes.

In this chapter, I will do an overview of some of the mechanisms that facili-tated the adaptation of species. It is important to mention that they are not isolated mechanisms, and the emergence of a new function can be the result of several of these mechanisms working together.

The origin of new functions The origin of new functions takes place within the genome of the species. During many years, these mechanisms were considered the only option for the adaptation of species. In general, they are gradual and slow processes.

Duplication and new function DNA duplication is an event in which a region of the DNA gives rise to two indistinguishable regions. It can be divided into internal gene duplication, if only a section of the gene is duplicated, and complete gene duplication, if the whole gene suffered the duplication (Figure 5). The duplications can be placed one after the other, but they can also finish on distant regions of the DNA.

Internal gene duplications only affect a section of the gene or domain. Do-main duplications produce the increment in the gene length or gene elongation as a result. This can be an advantage because the protein increases the number of active sites, but also because the new domain (or domains) can evolve to interact with new elements (Figure 5A). Other mechanisms can produce gene elongation. For example, a single mutation can change a stop codon into a

Page 28: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

28

sense codon. An insertion of DNA would increase the size of the gene. The elongation, due to these reasons, usually affects the function of the resultant protein negatively.

Complete gene duplications produce two identical copies of the gene (Fig-ure 5B). After the duplication, each copy evolves independently. Four differ-ent scenarios has been suggested after a gene duplication: (i) gene conserva-tion where both copies are kept increasing the number of proteins (Ohno 1970); (ii) one copy accumulates deleterious mutations due to relaxed selec-tion and becomes a pseudogene or disappears completely (Lynch & Conery 2000); (iii) subfunctionalization where both genes accumulate mutations, re-sulting in genes that only have part of the function but complement each other (Force et al. 1999); (iv) neofunctionalization where one copy accumulates mu-tations and acquires a new function (Ohno 1970).

Figure 5. Internal and complete gene duplications. (A) Domain duplication and acquisition of a new function. (B) Gene duplication and possible evolution of the copies.

Page 29: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

29

Domain fusion Two different mechanisms produce the fusion of domains: the loss of introns causing the fusion of adjacent exons, and the fusion of domains from different genes (also known as domain shuffling). I will consider only the latter as a mechanism for the creation of new functions (Figure 6).

In 1975, The structure of four dehydrogenases was described (Rossman et al. 1975). The four of them have a domain with similar structure and proper-ties, while other domain was different in each dehydrogenase. Most likely, the NAD-binding domains originated via domain duplication and were subse-quently linked to different domains to form the different dehydrogenases. Gil-bert gave a genetic explanation to this domain shuffling, suggesting that, in eukaryotes, it was the result of the recombination of introns (Gilbert 1978).

Figure 6. Domain shuffling.

Mobile elements can also produce domain shuffling (Ekman et al. 2007). These mobile elements are very diverse and are present across the tree of life (Shapiro 2012). They can move gene sequences from one region of the ge-nome to another. These rearrangements put different domains or complete proteins in contact, which eventually can evolve into new functions.

A third mechanism of domain shuffling is the illegitimate recombination, which takes place between non-homologous sequences (Van Rijk & Bloemendal 2003). This illegitimate recombination can happen, for example, in misguided recombination during meiosis.

Page 30: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

30

De novo origin The mechanisms mentioned above create new functions from already func-tional regions of DNA. However, all genes have originated from non-coding DNA at some point. Nowadays, the rise of de novo genes from non-coding regions is considered an important mechanism for the creation of new func-tions. (Tautz & Domazet-Lošo 2011). In all species, there are genes that lack homology to genes in other lineages. These orphan genes can be fast-evolving genes that have lost the homology signal, but they can also be the result of a de novo origin in the genome.

Simply put, the rise of de novo genes occurs when random mutations form cryptic functional sites, like transcription initiation regions, that are tran-scribed into RNA. Eventually, this new RNA could acquire a functional ORF, and a new protein is translated.

The spread of functions The spread of function is the insertion of foreign DNA into a new genome. This new DNA can codify for proteins absent in the host so that it increases the capacities of this host. In eukaryotes, we need to differentiate between those genes that arrived via endosymbiosis gene transfer and those that arrived via LGT (Figure 7). Here, I present both concepts, their history and the impli-cations for our studies.

Endosymbiosis gene transfer Two major endosymbiotic events have happened during the evolution of eu-karyotes. The first, which happened in the Last Eukaryote Common Ancestor, resulted in the mitochondria, while the second that happened in one eukaryotic lineage resulted in the chloroplasts (Gray & Doolittle 1982).

However, the endosymbiotic origin of mitochondria and chloroplasts was not always accepted, and even other organelles were proposed to have a sym-biotic origin. Russian biologist Konstantin Mereschowski was the first person that proposed the symbiotic origin, or symbiogenesis, for the eukaryotic cell at the begging the 20th century (Mereschkowski 1905). In his work, he pro-posed the symbiotic origin for the chloroplasts and the nucleus (Mereschkowsky 1910). He could be considered the first scientists to propose the blue-green algae (name of Cyanobacteria at the time) origin of chloro-plasts. During the 1910s and 1920s, two publications proposed the symbiotic origin from purple bacteria (name of Alphaproteobacteria at the time) of the mitochondria (Portier 1918; Wallin 1927). The idea of a symbiotic origin was based on previous studies on the division of chloroplasts and mitochondria and their resemblance with certain free-living bacteria. The symbiogenesis

Page 31: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

31

ideas found opposition in the scientific community (Wilson 1925). Sixty years after Mereschowski’s hypothesis, in 1967, Lynn Margulis (Lynn Sagan at the time) revived the idea of the symbiotic origin of mitochondria and chloroplast with the publication of her first study (Sagan 1967). Margulis hypothesized that a heterotrophic anaerobe ingested an aerobic bacterium. This endosymbi-osis became obligated, resulting in the first mitochondrion. Margulis also sug-gested the symbiotic origin of the flagellar basal bodies via ingestion and pos-terior obligated symbiosis of a motile prokaryote. However, this last hypoth-esis is not accepted nowadays.

Margulis’ hypotheses were controversial at the time. However, molecular studies supported the endosymbiotic origin of mitochondria and chloroplasts. Both organelles still retain a vestigial genome (mtDNA and cpDNA, for mi-tochondrial and chloroplastic DNA, respectively). The characteristics of both genomes resemble the prokaryotic genomes. Phylogenetic analyses using cpDNA placed the origin of the chloroplast within Cyanobacteria, corroborat-ing Mereschowski’s hypothesis (Ponce-Toledo et al. 2017). On the other hand, the exact origin of the mitochondria remains controversial. The most accepted idea places the origin within Alphaproteobacteria (Yang et al. 1985; Andersson et al. 1998; Roger et al. 2017), but a recent study placed the mito-chondria as a sister-group lineage of Alphaproteobacteria (Martijn et al. 2018).

Several genes were transferred from mitochondria or chloroplast to the nu-cleus during the evolution via endosymbiotic gene transfer (EGT) (Figure 7) (Gray & Spencer 1996; Martin & Müller 1998; Timmis et al. 2004). This flux of genes reduced the autonomy of the organelles that now are controlled by the nuclear genome. The flux of gene from the organelles has also a conse-quence in the study of the evolution of genes via LGT events. Let us imagine that our sequence of interest clusters within Proteobacteria in our single gene tree phylogeny (see section “Single gene tree (from datasets to phylogenetic history)” in “Comparative genomics and phylogenetic analyses”). If these bac-teria belong to Alphaproteobacteria, this gene might have arrived at the nu-clear genome via an EGT event instead of an LGT event. If that is the case, then our gene of interest could have been inherited vertically in opposition to a more recent acquisition via LGT. Similar reasoning is applied to those genes with cyanobacterial origin in photosynthetic eukaryotes or eukaryotes with a photosynthetic ancestor.

Page 32: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

32

Figure 7. Endosymbiotic gene transfer versus lateral gene transfer in eukary-otes. EGT, endosymbiotic gene transfer; LGT, lateral gene transfer; N, nucleus; Mt, mitochondrion; Cp, chloroplast.

Lateral Gene Transfer LGT, also known as horizontal gene transfer, is the acquisition of genetic ma-terial from a distantly related species (Figure 7). Under an LGT scenario, the size of the genomes should increase. If this increment continues without any control, genomes would be continuously expanded. However, we should not forget that genomes also lose genes. Under these two factors, we should con-sider genomes as dynamic entities were genes are acquired and lost continu-ously through the evolution of a lineage.

Lateral gene transfer in prokaryotes Lateral gene transfer has been studied extensively in bacteria. Classically, the mechanisms are classified into three categories: transformation, conjugation and transduction (Thomas & Nielsen 2005). Transformation is the acquisition of DNA directly from the environment. It is a mechanism that is entirely de-pendent on the receptor. Bacterial transformation was first described in 1928 (Griffith 1928). This was the first time that the term lateral gene transfer was used. Conjugation was first described in 1946 (Lederberg & Tatum 1946). This genetic transfer needs direct cell-to-cell contact or a bridge-like connec-tion between two cells. Transduction was described in 1952 and is the transfer

Page 33: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

33

mechanisms of genetic material through infections with bacteriophage (Zinder & Lederberg 1952).

Lateral gene transfer in eukaryotes The endosymbiotic theory explains the origin of mitochondria and chloro-plasts, but what was the receptor cell? After Woese and collaborators de-scribed the domain Archaea in 1977 (Woese & Fox 1977), phylogenetic anal-yses showed that the receptor cell was of archaeal nature (Spang et al. 2015) and thus explained the origin of certain eukaryotic genes. However, phyloge-netic analyses have identified genes whose distribution and origin cannot be explained based on archaeal or organelle origin (Brown & Doolittle 1997). Interestingly, most of these genes are involved in metabolic functions. They showed two different origins: distant eukaryotes relatives, or bacterial origin other than Alphaproteobacteria, or Cyanobacteria in photosynthetic eukary-otes lineages.

Mechanisms In multicellular eukaryotes, the exogenous DNA needs to be incorporated in the chromosomes of the germ cells to be transmitted to the next generations. This requirement makes LGT in these organisms less frequent. In protists, on the other hand, the exogenous DNA only needs to cross two membranes to be incorporated in the chromosomes of the host, the cytoplasmic membrane and the nuclear membrane. The mechanisms of how this happens are still not well established. However, some mechanisms have been proposed. Inter-species crosses between eukaryote species have been proposed as the major cause of LGT (Syvanen 1985). Hybridization happens, indeed, between eukaryotic species, but this cannot explain the bacterial origin of genes. Other mecha-nisms and vectors are needed to transfer DNA between domains. Studies have shown that bacterial species like Bartonella henselae, Rhizobium etli or Esch-erichia coli can transfer DNA to eukaryotic (Lacroix 2016; Sieber et al. 2017). Retroviruses can transform eukaryotic cells and, in some cases, act as inter-domain vectors (Kidwell 1993). In phagotrophic protists, acquired genes could come from the food (Doolittle 1998). This theory is based on the prem-ise that fragments of DNA would escape digestion and would be incorporated into the chromosomes.

Lateral gene transfer in diplomonads Previous studies identified several genes as laterally acquired in diplomonads (Nixon et al. 2002; Andersson et al. 2003, 2007; Xu et al. 2016). Like in other cases, these genes seem to be related to metabolic functions. Interestingly, these phylogenetic studies pointed to bacteria and eukaryotes that share the niche with diplomonads as the donors. The transfer between species that share niche seems logical if we think about it. Enzymes could be deactivated by abiotic factors like oxygen or could have a preference for certain substrates.

Page 34: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

34

The acquisition of better-adapted enzymes is an advantage for diplomonads (Andersson et al. 2003). For example, Diplomonads, but other Metamonada species as well, seem to have a recurrent history of eukaryote-to-eukaryote LGT with Entamoeba species (Andersson et al. 2003; Hampl et al. 2008; Grant & Katz 2014) (see Paper I and III for more examples). Metamonada and Entamoeba species also share niche, suggesting that there is a selective pres-sure to acquire genes for enzymes adapted to anaerobic or microaerobic envi-ronments.

Figure 8. Eukaryote-to-eukaryote LGT events. (A) LGT from Fornicata to other eukaryotic lineage. (B) LGT event between some Fornicata sequences and the or-phan eukaryotic lineage. (C) LGT event or vertical inherited gene can explain the tree.

Eukaryote-to-eukaryote LGT events are probably the most delicate assump-tions. There will be clear cases of eukaryote-to-eukaryote LGT events. For example, if a Fornicata cluster within a well-classified eukaryotic group or a distantly related organism clusters within diplomonads (Figure 8A). However, the position of Metamonada remains unsolved in the eToL. This uncertainty has an important implication in the identification of eukaryote-to-eukaryote LGT events. Let us imagine that we have a phylogenetic tree showing diplo-monads clustering with another lineage which position also remains uncertain. If this gene is present in other Metamonada species, and they cluster

Page 35: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

35

independently in the phylogenetic tree, we can assume that at least one eukar-yote-to-eukaryote LGT event has happened (Figure 8B). However, if the gene is not present in more Metamonada, or all Metamonada sequences cluster with this orphan lineage (Figure 8C), we can assume a eukaryote-to-eukaryote LGT event. However, due to the uncertain taxonomic relation between Metamon-ada and the orphan lineage, a vertical inheritance could also explain this case.

Phylogenetic trees are subject to artifacts that can hide the real evolutionary history. By my experience, long-branch attraction (LBA) is the most common artifact working with diplomonads. This phylogenetic artifact can suggest phylogenetic relationships that hide the real one (see section “Single gene tree (from datasets to phylogenetic history)” in “Comparative genomics and phy-logenetic analyses”). The selection of appropriate phylogenetic models can mitigate the effect of LBA, but also the increment of sampled species. Adding data from different diplomonads or Fornicata can break long branches helping the model to find the correct sister group.

Hexamitinae species have an alternative codon usage where TAA and TAG encode glutamine instead of being stop codons and TGA is the only stop co-don (Keeling & Doolittle 1996; Kolisko et al. 2008). This alternative codon usage might have several implications for recent LGT events in this lineage. As a donor, it could be a problem because the protein might be shorter in the receptor cell if it uses TAA and TAG as stop codons. As a receptor cell, only 1 out of 3 acquisitions would be functional, in principle. However, when the GC content increase in bacteria, the use of the stop codon TGA also increases while the codon TAA decreases and TAG remains unchanged (Korkmaz et al. 2014). Acquisition from these bacteria might have a higher probability of be-ing functional. In most of the cases, the bacterial donor cannot be specified, but it might happen more often with this kind of bacteria. This idea is merely a speculation and is not the aim of this thesis, but it should be addressed in the future.

The acquisition and loss of genes create a dynamic situation in all genomes. We reported several cases where a function was acquired at some point in the evolution of the group, subsequently lost and reacquired again in some line-ages. We identified these cases because sequences of diplomonads, or Forni-cata, did not cluster together in the phylogenetic tree indicating independent evolutionary histories in the different lineages.

Page 36: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

36

Aims

The general goal of this Ph.D. thesis can be divided into three categories: • To expand our knowledge about how diplomonads adapted to dif-

ferent environments and hosts. • To determine the role of different mechanisms in the adaptation of

diplomonads to different environments and hosts. • To describe the molecular characteristics of the last Diplomonadida

common ancestor.

The specific goals of the different studies include:

Paper I • To understand how diplomonads can withstand oxygen and reactive

oxygen species (ROS) damage. • To describe how the oxygen detoxification pathway evolved since

the last Fornicata common ancestor.

Paper II • To investigate the genome and the genomic features of G. muris. • To identify the metabolic differences between G. intestinalis and

G. muris. • To identify mechanisms of interaction with the host’s microbiota. • To understand the implications of these differences in terms of host

specificity.

Paper III • To describe the metabolic capacities of diplomonads. • To describe the molecular characteristics of the last Diplomonadida

common ancestor. • To address the lifestyle of the last Diplomonadida common ances-

tor.

Paper IV • To investigate the genome of H. inflata. • To identify new features and differences in previously studied path-

ways.

Page 37: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

37

Comparative genomics and phylogenetic analyses

In this chapter, I will give an overview of the methods used in our studies. My intention here is to share some personal experiences using the different meth-ods rather than giving a formal explanation of how they work.

Data sources (genomes) Before the analyses that I will explain below, we need to obtain genomic data (i.e., genomes or transcriptomes) of the organisms of interest. Without data, there will not be research.

We can download any published genome or transcriptome from different databases. The National Center for Biotechnology Information (NCBI) is probably the most broadly-used of all of them. In this database, we can find genomic data in the form of raw reads to fully annotated genomes. However, the type and quality of data that can be found in this database largely rely on individual researchers who deposit the data. In this respect, a common issue is that the data associated with a sequencing project are not always deposited in NCBI. Researches might store the data in a different database or event on a personal webpage. Independently, information about data availability always needs to be indicated in the corresponding publication.

On the other hand, we can be interested in an organism for which there are no genetic data available. In such cases, additional experimental and bioinfor-matics work is required to obtain genetic information. This was the case for the two new diplomonad genomes included in this thesis. The approach we followed consisted of cultivation and DNA extraction of the diplomonad lin-eages before whole genome sequencing and genome reconstruction through de novo assembly. However, since other members of the group were respon-sible for the assemblies, I will not enter into details about this part.

Page 38: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

38

Annotation (from genomes to functions) Simply put, genome annotation refers to the identification of genes along the genome (structural annotation) and their assignment to biological function (functional annotation).

Structural annotation Several existing tools aim to identify genes from genomic sequences. In all of them, the genes are identified based on the presence of patterns (e.g., TATA box, ORFs). Since every genome has its peculiarities, the prediction made cannot be 100% correct. There will always be errors. As such, no current tool is entirely accurate. Some programs might over-predict genes. Others could predict too many or too few introns. Obtaining a good annotation that can be reliable for the next steps often requires manual curation and, thus, can be very time-consuming. However, it is possible to use biological data (e.g., Hexa-mitinae species uses alternative genetic code). For G. muris and H. inflata, we used two different gene prediction programs, based on the experience with the genome of S. salmonicida. Since we expected few introns, we used Prodigal, a program designed for prokaryotes (Hyatt et al. 2010). We also used Glim-merHMM (Majoros et al. 2004), a eukaryotic gene prediction program that uses a set of previously identified genes as training sets. In our case, we used highly conserved proteins predicted by Prodigal as a training set. In general, GlimmerHMM made the best prediction when verified with transcriptome data, although some regions were complemented with the prediction made by Prodigal.

Functional annotation Unless we have experimental data showing the function of a particular gene, functional annotation is based on sequence similarity to previously annotated sequences. Generally, the transfer of annotation from a close relative is more accurate. In our case, we could use the genomes of G. intestinalis and S. salm-onicida and the transcriptome of Trepomonas sp. PC1.

The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database that stores information based on a KEGG orthology, or K number. Every K num-ber corresponds to a molecular function. The associated tools BlastKoala and GhostKoala, assign to every gene a K number (Kanehisa et al. 2016) based on similarity searches against a non-redundant KEGG-annotated set of genes. The KEGG output also provides access to the KEGG Mapper, a tool that al-lows pathway reconstructions of a genome based on the assigned K numbers.

EggNOG mapper assigns functional annotation based on orthology assign-ments using precomputed eggnog clusters. The EggNOG mapper output is very informative and includes information from other databases (e.g., KEGG,

Page 39: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

39

COG or GO) apart from the EggNOG assignment for every gene. GhostKoala and EggNOG mapper are based on sequence similarity using automatic pre-diction previously generated. Most of these predictions have not been manu-ally curated, and if an error is introduced, it can be spread into the annotation of other genomes.

BioCyc is a collection of pathways and genome databases containing met-abolic predictions of thousands of sequenced organisms (Karp et al. 2018). The databases are divided into three categories: Tier 1, databases that are man-ually curated and frequently updated; Tier 2, databases that are computation-ally generated and receive moderate manual updating; and Tier 3, databases that are also computationally generated but do not receive manual updating. The associated program Pathway Tools generates a metabolic prediction for a species from its genome (Karp et al. 2016). Pathway Tools has the disad-vantage that it bases the metabolic prediction on gene annotation or experi-mental data using an EC number or BioCyc ID. However, the database can be manually curated, allowing the correction of errors.

Pathway Tool has an option, Pathway Hole Filler (Green & Karp 2004), that can be very useful to improve the metabolic prediction. This option allows closing metabolic gaps based on BLASTp searches (see next section) against other BioCyc databases. Unfortunately, BioCyc does not include any Meta-monada species. However, we can use Pathway Hole Filler against the data-bases generated with Pathway Tool. For example, once I manually curated the metabolic prediction generated by Pathway Tool of G. intestinalis WB and G. muris, I used Pathway Hole Filler to reduce the errors.

There are other databases used in this thesis that can provide functional information. Conserved Domain database (CDD) implemented in NCBI con-sists of a collection of curated protein domains (Marchler-Bauer & Bryant 2004). Eukaryotic Pathogen Database (EuPathDB) resources provide genomic information from a variety of eukaryotic parasites (Aurrecoechea et al. 2013). Diplomonads have their own database, GiardiaDB (Aurrecoechea et al. 2009). Here, we can find biological information about genes from G. intestinalis, S. salmonicida and, since the last update (December 2019), from G. muris and Monocercomonoides exilis as well.

To obtain a reliable annotation, I combined the output of GhostKoala, Egg-NOG mapper and Pathway Tool and manually curated the annotations within the latter. CDD and GiardiaDB were used in controversial cases in which the selected tools predicted different functions.

During the manual curation of the genomes of parasitic diplomonads, I found that the VSP proteins tend to confuse the prediction tools since they detect local similarity to proteins that present otherwise low similarity with the rest of the sequence (e.g., CXXC signature).

Page 40: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

40

BLAST and CD-HIT-2D (from functions to datasets) Once we have a reliable annotation, it is possible to study the evolution of different genes. First, we need to collect sequences related to our sequence of interest, also known as the query, from other organisms. NCBI stores all pub-lished genomes in different databases. Here, we can find a human genomic database or a reference prokaryote genomes database, for example. There is one database, called non-redundant database, or nr, that contains more than 242 million protein sequences (update date March 2019). Almost everything that has been sequenced is there. With such a number of sequences, we need a tool to find those sequences related to our query.

Basic Local Alignment Search Tool (BLAST) compares sequences and finds regions of local similarity between them. BLAST also calculates the sta-tistical significance of the found sequences, also known as the hits. These val-ues can be used as a cutoff to determine how likely it is that the two sequences are similar by chance. In our studies, we wanted to retrieve the minimum num-ber of hits containing the maximum diversity related to the query (see Paper I for more details). To do so, we ran BLASTp (p for protein) searches with three different settings, keeping 1,000, 5,000 or 10,000 hits with e-values <0.001. The diversity was measured as the number of different bacterial and archaeal phyla and the number of eukaryotic supergroups present in the different set-tings (Figure 9).

Most of the time, it is assumed that a gene present in two or more close relatives shares the same origin, but this is not always the case. Some of the first single gene trees I produced for the proteins involved in oxygen detoxifi-cation (Paper I) showed Fornicata sequences clustered independently and with long branches. These phylogenetic trees were the result of two BLASTp searches using one sequence from diplomonads and one sequence from K. bi-alata. Posteriorly, the two BLASTp searches were merged, all Fornicata hom-ologs were added and a phylogenetic tree was computed using FastTree (Price et al. 2010). After checking that the queries were the correct ones, and the BLASTp searches and the phylogenetic tree ran without problems, we as-sumed that something was happening with the approach. We compared the number of hits in common between the different BLASTp searches from the different Fornicata queries (Figure 9). To do so, we used CD-HIT-2D, a pack-age included in CD-HIT (Li & Godzik 2006). In some cases, the number of hits in common between different BLASTp searches was very low or even 0%. Those Fornicata sequences with a number of hits in common equal or lower than 20% resulted in phylogenetic trees with long branches and Forni-cata clustering independently. We considered that a gene had an independent evolutionary history in two or more Fornicata when the number of hits in com-mon was equal or lower than 20% (Figure 9 and Paper I).

Page 41: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

41

Figure 9. BLASTp and CD-HIT-2D schematic workflow. When possible, two dif-ferent Fornicata sequences were used as queries for BLASTp searches with three different settings, keeping 1,000, 5,000 or 10,000 hits. The BLASTp searches were merged pairwise and a preliminary phylogenetic tree was computed with FastTree. A diversity analysis was used to select the final BLASTp setting. The fraction of hits in common between the two selected BLASTp searches was calculated using CD-HIT-2D. A gene had an independent evolutionary history in two or more Fornicata when the number of hits in common was equal or lower than 20% (Paper I).

This 20% rule was decided based on the data from the oxygen detoxification pathway and should be used in other phylogenetic analysis with caution. When the proportion of hits in common is 0% or very low, the sequences used as queries will most likely not cluster together in the preliminary phylogenetic tree, and the datasets can be split. Based on my experience, the problem arises around 20% of hits in commons. Around this percentage, there is a twilight

Page 42: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

42

zone where false positives and false negatives can appear. For example, I used the same cutoff for the phylogenetic analyses of the genome of H. inflata (Pa-per IV). Later, I realized that the split datasets had several sequences from Fornicata in common, and they should remain together.

Single gene tree (from datasets to phylogenetic histories) A single gene tree can be defined as the family tree of a gene. It shows the evolutionary history of that particular gene. It should not be mistaken for a species tree. The later shows the relationship between species using genomic data. Both single gene trees and species trees are hypotheses. They represent the most likely explanation based on the data that we are using at that moment. I will focus here on my experience with single gene trees. However, some of the steps can also be used for species trees.

Any phylogenetic analysis involves three steps: i) creating a reliable dataset of sequences (see the previous section), ii) aligning the sequences, and iii) inferring the phylogenetic tree.

Multiple sequence alignments (MSA) methods try to arrange homologous positions between sequences in columns. That is, they assume that these posi-tions share a common ancestral state. A good alignment is an important step for any phylogenetic analysis. During these years, I learned that before run-ning a phylogenetic tree, visual checking of the alignment is needed. Almost all homologous proteins, especially enzymes, have more or less conserved re-gions across all species. A good MSA method should be able to identify such conserved regions and align them.

After an MSA, there are useful regions for phylogenetic analyses (e.g., well-aligned regions), and regions that are not (e.g., a long region only present in one sequence). We need to trim the alignment to extract to informative re-gions for the phylogenetic analysis. In my studies, I used BMGE (Criscuolo & Gribaldo 2010). This program selects those positions that are biologically relevant based on the use of standard similarity matrices such as PAM (for nucleotides) or BLOSUM (for proteins). In our studies, we used the BLOSUM30 matrix because we were working with distantly related proteins. BMGE also removes regions with a too large proportion of gaps (e.g., equal or more than 50%).

Phylogenetic methods try to infer a phylogenetic tree given an MSA and a substitution model. The selection of the substitution model is as critical as having a good MSA. There are programs that estimate the best-fitting model for our data. In some cases, the program is implemented in the phylogenetic program. For example, IQ-Tree (Nguyen et al. 2015) uses ModelFinder (Kalyaanamoorthy et al. 2017).

Page 43: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

43

There are several methods to infer a phylogenetic tree. However, Maximum Likelihood (ML) and Bayesian inference are the most used nowadays. In both cases, we have what it is called a landscape of phylogenetic trees. Both meth-ods try to find the best phylogenetic tree within this landscape. Maximum likelihood methods try to find the phylogenetic tree with the highest likeli-hood. The likelihood is defined as the probability of observing the data (i.e., the alignment), given the parameters (i.e., the tree, branch lengths and substi-tution model). On the other hand, Bayesian inference tries to find the phylo-genetic tree based on the posterior probability (Lemey et al. 2009).

Phylogenetic trees are hypotheses, and as such, are subject to errors or phy-logenetic artifacts. Sometimes the phylogenetic methods underestimate the number of substitution between fast-evolving sequences and consider that they are more closely related than they are. Long branches on the phylogenetic tree represent these fast-evolving sequences. The long branches are the reason that this phylogenetic artifact is called long branch attraction (LBA). Like most of the genes from parasites, genes from diplomonads are fast-evolving sequences, Single gene trees using these species need to be treated carefully and visually inspected before reaching a conclusion. Preventing LBA when studying a group that itself will have long branches is not always easy. How-ever, some things can be done to try to reduce this artifact. One option is to remove from the dataset short sequences that introduce lots of gaps in the alignment. These short sequences could be considered fast-evolving se-quences and produce LBA. Usually, removing sequences with lots of gaps have a second advantage. After the MSA, we could have more informative sites to use to infer the phylogenetic tree.

A second option to avoid LBA is to test if the controversial placement is real. We can remove from the dataset the controversial Fornicata sequences except the Fornicata sequence with the shortest branch. This sequence is now our reference sequence. If after rerunning the phylogenetic tree, it does not suffer changes, the placement is most likely real and not due to LBA artifacts

Ancestral reconstruction For the evolution of the oxygen detoxification (Paper I), we reconstructed the last Fornicata common ancestor and the last Diplomonadida common ancestor based on the evolutionary history of each protein. When we analyzed the met-abolic capacities of diplomonads (Paper III), we used a cluster approach based on BLAST hits, rather than a phylogenetic analysis. We ran BLASTp searches against a reference database to which we added the genome of the analyzed diplomonads and the genome of K. bialata as the outgroup. Every protein used for the BLASTp searches was associated with, at least, one metabolic reaction. We classified each reaction as putatively present in the last Diplomonadida common ancestor or most likely acquired after this ancestor. Those reactions

Page 44: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

44

most likely present in the last Diplomonadida common ancestor were classi-fied as vertical inherited or LGT candidates.

Other ancestral reconstruction methods that simultaneously model various types of evolutionary events exist (Szöllősi, Tannier, et al. 2015). These meth-ods require a species phylogeny and information about homologous gene fam-ilies present in the selected species (Szöllősi, Davín, et al. 2015). With this information, these approaches can infer different types of evolutionary events among the selected species (e.g., duplication, losses, gene transfers, origina-tion, speciation). Transfers involving species not represented in the dataset will not be detected and will be instead masked as originations. Since these methods are computationally expensive, they need to be applied to a reduced dataset, usually representing a specific taxonomical group of interest. Hence, such approaches are not appropriate to pinpoint the origin of laterally acquired genes between very distant lineages since it would require the use of a massive dataset representing most clades in the Tree of Life.

Page 45: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

45

Results

Evolution of the oxygen detoxification pathway in Fornicata (Paper I) In this study, we reconstruct the putative oxygen detoxification pathway of diplomonads and the close relative K. bialata and study its evolution from the last Fornicata common ancestor to extant Fornicata species.

We searched for all possible homologs in all available Fornicata genomes and transcriptomes and used a phylogenetic approach to study the evolution-ary history of each protein family.

We identified 25 proteins in total and proposed how they can interact in an oxygen detoxification pathway. We divided these proteins into two main groups: proteins forming an oxygen detoxification pathway and proteins in accessory pathways involved in the synthesis of non-protein thiols. At the same time, we divided the oxygen detoxification pathway into direct systems that reduce oxygen and ROS to H2O, and indirect systems that repair protein and membrane damage.

Evolution of the oxygen detoxification pathway We identified a similar oxygen detoxification pathway in all the analyzed For-nicata independently of the lifestyle or the host in the case of parasitic diplo-monads. However, these similarities masked a process of convergence. Our analyses identified at least seven proteins that have been acquired via inde-pendent LGT events in different Fornicata lineages. Our analyses also reveal a pathway in constant evolutionary change where proteins have been gained and lost repeatedly.

Our study showed that the majority of proteins in the oxygen detoxification pathway were acquired via LGT events. Only two proteins were identified as vertically inherited, hybrid cluster protein and serine hydroxymethyltransfer-ase, with the latter lost in the last Diplomonadida common ancestor and reac-quired in the Hexamitinae ancestor.

Oxygen detoxification pathway in the last Diplomonadida common ancestor and the last Fornicata common ancestor Using transcriptome data from different Fornicata species, we were able to date the different acquisitions and reconstruct the oxygen detoxification path-way in the last Diplomonadida common ancestor. Most likely, this ancestor

Page 46: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

46

was already adapted to low-oxygen environments. It had the complete path-way with most of the enzymes that we observed in the extant diplomonads, and was also able to synthesize both glutathione and L-cysteine. This duality on the use of non-protein thiols is a difference with the extant diplomonads where Giardiinae species lost the capacity of synthesizing L-cysteine while Hexamitinae species lost the synthesis of glutathione and the use of glu-taredoxin. We suggest that these differential losses might be due to the differ-ent lifestyles. Giardiinae species are gut parasites and take in L-cysteine di-rectly from the host’s diet. Hexamitinae species are parasites of different tis-sues of the host, commensal or free-living. Being able to synthesize L-cysteine can be an advantage for them. Surprisingly, a mechanism for restoring oxi-dized methionine residues was inferred to be lacking in the last Diplomona-dida common ancestor.

We were also able to reconstruct the oxygen detoxification pathway par-tially in the last Fornicata common ancestor. Our analysis showed that four proteins were acquired before this ancestor. One of them was SOD, which would be replaced by SOR later in the ancestor of D. brevis and diplomonads. Most likely, this ancestor could not synthesize L-cysteine but was able to syn-thesize glutathione and detoxify oxygen and ROS in a pathway similar to that observed in diplomonads and K. bialata.

The genome of Giardia muris (Paper II) In this study, we describe the draft genome sequence of G. muris, representing the first genome of any Giardia outside of the G. intestinalis species.

The G. muris genome consisted of 59 contigs spanning 9.8 Mbp, which is notably smaller than the G. intestinalis WB genome (12.55 Mbp). The vast majority of the genome (9.0 Mbp, 92%) was found on five contigs (>1 Mbp). Of the remaining contigs, ten were short (<30 kbp) and terminated in telomeric repeats (TAGGG), suggesting that there were the terminal points of five chro-mosomes. Gene prediction and manually curated annotation identified 4721 protein coding genes in G. muris, covering 82.4% of the entire genome. This genome is an example of a very compacted eukaryotic genome. The average intergenic region size was only 242 bp and 441 genes showing overlap with neighboring genes for an average size of 21 bp.

Cysteine-rich proteins, such as VSPs, are important proteins for all parasitic diplomonads. They have been described in G. intestinalis and S. salmonicida (Adam et al. 1988; Nash et al. 1988; Xu et al. 2014). We identified 265 VSP homologs in G. muris. We identified pseudogenized VSPs (YVSPs). Nine out of the ten ends of the main contigs have a YVSP array at, or close to, the edge of the chromosomes. The adhesive disc is a unique cytoskeletal structure of Giardia parasites essential for the attachment of the trophozoite in the small

Page 47: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

47

intestine but is missing in other diplomonads. Most of G. intestinalis disc pro-teins were also identified in G. muris.

Metabolic pathways in G. muris We reconstructed the metabolic capacities of G. muris and G. intestinalis WB combining several metabolic prediction tools and databases. These predictions were manually curated to reduce the error due to misannotations. We identi-fied 95 pathways in G. muris compared to 98 in G. intestinalis WB. We iden-tified four unique pathways and 18 unique reactions in G. muris and eight unique pathways and 25 unique reactions in G. intestinalis WB.

Both species have glycolysis as the main energy production pathway. How-ever, the potential utilization of carbon sources for this pathway is different. Fructokinase and mannose 6-phosphate isomerase enable G. muris to use fruc-tose and mannose 6-phosphate, unlike G. intestinalis. Both genes were ac-quired via LGT from bacteria. The G. muris gene for mannose 6-phosphate isomerase is found next to a bacterial transcriptional regulator/sugar kinase gene. Our phylogenetic analyses showed that G. muris mannose 6-phosphate isomerase and the transcriptional regulator/sugar kinase gene were acquired form Bacteroidetes in one single LGT event.

The use of arginine is important for organisms living within the intestinal tract. In all studied Metamonada, arginine is degraded by the arginine deimi-nase pathway, generating energy. Our study suggested that G. muris can also degrade arginine through a second pathway, the arginase pathway. The phy-logenetic analyses suggested that this protein was an ancestral acquisition in Metamonada and subsequently lost in several lineages, including G. intesti-nalis WB.

Our analysis suggested that G. muris can synthesize coenzyme A from pan-tothenate. The same pathways have been described in S. salmonicida, but it seems to be absent in G. intestinalis. The utilization of glycerol for ATP syn-thesis via glycerol kinase has been suggested in G. intestinalis upon depletion of primary carbon sources. However, we could not find any enzyme for this reaction in G. muris, suggesting that this compound could be toxic for G. mu-ris.

Interaction with the intestinal microbiota We also identified three enzymes that have a potential role in the interaction with the intestinal microbiota: tryptophanase, quorum-quenching N-acyl-ho-moserine lactonase and Tae4. Tryptophanase produces pyruvate, indole and ammonium from tryptophan. While the pyruvate is used for the production of energy, indole affects the gut microbiota composition, possibly by interfering with the quorum-sensing systems. Quorum-quenching N-acyl-homoserine lactonase degrades N-acyl-homoserine lactone, a molecule used by different bacteria, both Gram-positive and Gram-negative, for quorum-sensing. Tae4 is an amidase that hydrolyzes the amide bond, γ-D-glutamyl-mDAP (DL-bond)

Page 48: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

48

of Gram-negative bacteria destabilizing the bacterial peptidoglycan. Our phy-logenetic analyses suggested that these proteins were acquired via LGT from bacteria. While tryptophanase and quorum-quenching N-acyl-homoserine lac-tonase represent a more ancestral acquisition, Tae4 has been acquired recently in G. muris lineage.

Metabolic reconstruction and the last Diplomonadida common ancestor (Paper III) In this study, we reconstructed the metabolic capacities of four diplomonad genomes (G. intestinalis WB, G. intestinalis GS B, G. muris and S. salm-onicida) and one transcriptome (Trepomonas sp. PC1). To do so, we combined the prediction from different metabolic tools and databases for each genome and transcriptome. Intending to decrease the error due to misannotations, we manually curated the different predictions and created a metabolic database for each genome and transcriptome.

In total, we identified 852 reactions and 147 pathways present in at least one of the analyzed diplomonads. Among these, 559 reactions (66%) and 82 pathways (56%) were common to all the analyzed diplomonads, while 101 reactions (12%) and 22 pathways (15%) were unique to one diplomonad. Trepomonas sp. PC1 showed the most complex metabolism with 763 reac-tions and 119 pathways, while G. muris showed the most reduced metabolism with 668 reactions and 95 pathways.

The reactions were classified as putatively present in the last Diplomona-dida common ancestor or acquired after this ancestor using a clustering ap-proach. We also classified the reactions as LGT candidates based on the num-ber of prokaryote organisms in the cluster.

Last Diplomonadida common ancestor The last Diplomonadida common ancestor encoded 700 of the 852 reactions, and 104 of the 147 pathways, annotated in the extant species. The predicted overall metabolism was fairly similar to extant diplomonads. The last Diplo-monadida common ancestor produced pyruvate via glycolysis and had most likely a limited capacity to synthesize lipids de novo, suggesting that it was depended on external sources.

Our analysis identified several traits already present in the common ances-tor of diplomonads and K. bialata. For example, the capacity for synthesizing UDP-N-acetyl-D-galactosamine and eleven of the proteins suggested as viru-lence factors in G. intestinalis were already present in this common ancestor.

Our phylogenetic analysis suggested that the Diplomonadida ancestor lacked the synthesis of deoxynucleotides de novo. Instead, it had several path-ways for the salvage and degradation of nucleotides and nucleosides. All key

Page 49: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

49

enzymes related to salvage and degradation of purines were most likely ac-quired after the divergence from K. bialata, whereas the key enzymes for the salvage and degradation of pyrimidines shared an origin with K. bialata.

Giardiinae ancestor, Giardia intestinalis and Giardia muris In the lineage leading to the Giardiinae ancestor, 52 reactions were lost. For example, this ancestor lost the ability to degrade galactose and triacylglycer-ols. The nucleotide metabolism was also affected by the loss of several reac-tions. On the other hand, our analyses classified 32 reactions as acquired. The Giardiinae ancestor acquired the oxidative branch of the pentose phosphate pathway. The combination of this branch, together with the partial branch, already present since the last Diplomonadida common ancestor, allows the synthesize of glyceraldehyde 3-phosphate from glucose 6-phosphate.

Our analysis revealed that the different G. intestinalis assemblages used in this study were not metabolically identical. Interestingly, our analysis sug-gested that the Giardiinae ancestor was metabolically very similar to extant G. intestinalis. In contrast, G. muris suffered a significant reduction since the Gi-ardiinae ancestor. This reduction is especially important in the salvage and degradation of nucleosides and nucleotides.

Hexamitinae ancestor, Spironucleus salmonicida and Trepomonas sp. PC1 In the lineage leading to the Hexamitinae ancestor, 40 reactions were lost, while 35 reactions were gain. Interestingly, this ancestor was able to synthe-size deoxynucleotides de novo due to the acquisition of the enzyme anaerobic ribonucleoside-triphosphate reductase. We identified the protein formate C-acetyltransferase in Trepomonas sp. PC1. This protein was present in the Diplomonadida ancestor, lost in the lineage leading to Giardiinae but retain in the lineage leading to Hexamitinae. Having this protein, most likely, facili-tated the acquisition of the protein anaerobic ribonucleoside-triphosphate re-ductase several times in this group.

Our analyses suggested that S. salmonicida could be more sensitive than other diplomonads to NADHX and arsenic. Unlike other diplomonads, S. salmonicida lacks the enzymes to repair NADHX, and the arsenical pump-driving ATPase is the only identified protein for the detoxification of arsenic.

We identified several acquisitions in Trepomonas sp. PC1 related to the nucleotide metabolism and degradation of different sources of carbohydrates that most likely contributed to the development of the secondary free-living in this species.

Page 50: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

50

The genome of Hexamita inflata (Paper IV) In this study, we describe the draft genome sequence of H. inflata, represent-ing the first genome of any free-living diplomonad.

The H. inflata genome consisted of 6344 contigs spanning 134 Mbp, which is notably larger than the other diplomonad genomes. Gene prediction identi-fied 106,012 protein coding genes in H. inflata. We only identified nine in-trons in cis in paralogs of two different genes, representing the lowest number of unique genes with introns in a diplomonad.

Overview of Hexamita inflata metabolism Hexamita inflata has glycolysis as the main energy producting pathway. Like G. intestinalis, the glycolysis is energetically optimized by the use of the en-zymes pyrophosphate-fructose 6-phosphate 1-phosphotransferase and PPDK. Like in Trepomonas sp. PC1, acetyl-CoA is synthesized via pyruvate:ferre-doxin oxidoreductase and formate C-acetyltransferase. We also detected the potential use of other sources of energy, like fructose, tryptophan and arginine.

Our analyses did not detect the enzyme squalene-tetrahymanol cyclase, an important enzyme for the stabilization of the membrane in low-oxygen level conditions, which was previously reported from Trepomonas sp. PC1 (Xu et al. 2016).

Previously, the formation of cysts has been described in H. inflata (Brugerolle & Lee 2000). Our analyses showed that the synthesis of UDP-N-acetyl-galactosamine, an essential compound for the cyst wall in other diplo-monads, is conserved in this species. We also detected homologs for the cyst wall proteins.

The nucleotide metabolism in H. inflata resembles that in Trepomonas sp. PC1. Hexamita inflata can synthesize deoxynucleotides de novo using the en-zyme anaerobic ribonucleoside-triphosphate reductase. Our phylogenetic analysis showed that a common ancestor to Trepomonas sp. PC1 and H. in-flata acquired this protein from bacteria.

Oxygen and arsenic detoxification pathways Hexamita inflata can detoxify oxygen and ROS using the same main pathway described in diplomonads. However, like in other diplomonads, we detected modifications in this pathway. Hexamita inflata can synthesize L-cysteine, but also glutathione, representing the first diplomonads with this versatility in the synthesis of non-protein thiols. Hexamita inflata also encodes for the protein glutathione peroxidase. Our phylogenetic analyses suggested that the loss of the synthesis of glutathione happened independently in S. salmonicida and Trepomonas sp. PC1, and that the acquisition of the glutathione peroxidase from bacteria was independent of the K. bialata homolog.

Our analysis suggested that H. inflata could withstand a higher concentra-tion of arsenic than other diplomonads. The action of glutathione and the

Page 51: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

51

enzyme arsenite methyltransferase, together with other proteins, most likely allows H. inflata to detoxify different arsenic states.

Page 52: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

52

Discussion

We revealed some of the changes that diplomonads has undergone to adapt to different environments from the Diplomonadida ancestor to extant species. Our studies suggested that this Diplomonadida ancestor was already a host-associated organism (Papers III). We identified three important pre-adapta-tions to parasitism in this organism that, most likely, helped in the shift of lifestyle: the adaptation to low-oxygen environments (Paper I), the capacity to form a cyst, and the present of proteins that are considered virulence factors in G. intestinalis (Paper III).

In other parasitic or host-associated eukaryotic groups, LGT has been an important element of adaptation (Hirt et al. 2015). For example, the acquisi-tion of different transporters in Oomycetes expanded the accessibility of nu-trients in a group where the ancestor was phagotrophic, but the extant species are osmotrophic (Savory et al. 2018; Milner et al. 2019). In diplomonads, we identified acquired enzymes that expanded the use of different sources of en-ergy. For example, G. muris, S. salmonicida and Trepomonas sp. PC1 can use mannose 6-phosphate, unlike G. intestinalis and H. inflata (Paper II and Paper III). Interestingly the acquisition of this protein happened independently in the different lineages.

The acquisition of new functions is also related to the interaction with the host. Giardia intestinalis removes NO, a toxic compound and a signal of the inflammatory process, using the acquired protein flavohemoprotein (Mastronicola et al. 2010). In G. muris, some of the acquired functions allow diplomonads to interact with the microbiota of the host (Paper II). Quorum-quenching N-acyl-homoserine lactonase and Tae4 interfere with the bacterial growth allowing G. muris to colonize the intestine of the host. Interestingly, quorum-quenching N-acyl-homoserine lactonase was acquired before the common ancestor of G. intestinalis and G. muris and lost in G. intestinalis WB and P15. Differences in the microbiota of the hosts could explain this differential loss.

LGT has also allowed diplomonads to colonize new niches (Paper IV). Hexamita inflata can tolerate a higher concentration of oxygen than other diplomonads (Biagini et al. 1997). This tolerance increases the niches where H. inflata can find bacterial prey. The addition of the enzyme glutathione pe-roxidase to the oxygen detoxification pathway present in its ancestor, most likely, allows H. inflata to survive in an environment that should be toxic oth-erwise.

Page 53: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

53

In some cases, the acquisition of genes resulted in the replacement of the gene previously present in the organism. Sometimes this replacement has a clear advantage if the acquired enzyme is better-adapted to the environment than the existing version. Blastocystis replaced the enzyme glutamine-depend-ent NAD synthetase by a bacterial NH3-dependent NAD synthetase (Eme et al. 2017). The reason for this replacement could be due to the availability of glutamine and NH3 in the environment. Our analyses revealed that SOR re-placed the Fornicata SOD in the common ancestor of Dysnectes brevis and diplomonads (Paper I). Previous analyses have shown that SOR is adapted to work at lower oxygen concentrations than SOD (Testa et al. 2011; Sheng et al. 2014).

Sometimes the advantage of the replacement is unknown. Most likely, the Diplomonadida ancestor lacked the synthesis of deoxynucleotide de novo and had pathways for the salvage of nucleotides. These salvage pathways are com-mon to all species. Surprisingly, our studies suggested that the Diplomonadida ancestor, most likely, replaced key enzymes for the salvage of purines via LGT (Paper III). Interestingly, Cryptosporidium parvum shows a similar pat-tern with the acquisition of enzymes for the salvage of purines and pyrimidines (Striepen et al. 2004).

The acquisition of functions does not always replace the existing functions. G. muris and G. intestinalis GS B degrade arginine through two different path-ways, arginine dihydrolase pathway and arginase pathway (Paper II). The lat-ter represents an ancestral acquisition in Metamonada with differential loss in different lineages.

Most of the identified LGT events pointed to bacteria as donors. However, we identified four cases of eukaryote-to-eukaryote LGT events. Nitroreduc-tase (Paper I) and dihydropyrimidine dehydrogenase (Paper IV), were trans-ferred most likely from a diplomonad ancestor to an Entamoeba ancestor. Glu-tamate-cysteine synthase most likely was transferred between Fornicata and Blastocystis (Paper I). However, the direction of the LGT event is unknown. Cases of eukaryote-to-eukaryote LGT events have been reported before be-tween these groups (Andersson et al. 2003; Grant & Katz 2014; Eme et al. 2017). Taken together, these data suggest that these groups adapted to the host by sharing genes.

The fourth eukaryote-to-eukaryote LGT event, flavohemoprotein (Paper I), involves four members of the family Trypanosomatidae and G. intestinalis. These four species are gut parasites of insects and do not share niche with G. intestinalis. However, within Trypanosomatidae, there are species that para-site vertebrates and use insects as vectors. It might be possible that an ancestor of these four species and an ancestor of G. intestinalis shared host.

LGT only cannot explain all the metabolic differences that our studies de-tected in diplomonads. Like other parasitic lineages, diplomonad lineages have suffered important losses. For example, G. muris lost all the pathways for the degradation and salvage of purines (Paper II). These losses suggested

Page 54: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

54

that this species is entirely dependent on the uptake of purines from the envi-ronment within the host. In Paper I, the loss of the capacity to synthesize glu-tathione in S. salmonicida is explained as a differential loss due to redundant functions with the synthesis of L-cysteine. However, the loss of the synthesis of glutathione had other consequences for the detoxification of other com-pounds like arsenic (Paper III). Interestingly, Trepomonas sp. PC1 also lost the synthesis of glutathione but acquired the enzyme arsenite methyltransfer-ase that allows detoxifying As(III).

The work presented here highlights the metabolic and detoxification diver-sity existing in diplomonads and the importance of LGT and differential loss in the evolution of the group. Our studies have broadened the knowledge of the lifestyle of diplomonads and the Diplomonadida ancestor.

Page 55: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

55

Conclusions and future work

In this thesis, we combined metabolic analyses with clustering and phyloge-netic analyses to further understand the evolution of the metabolism in diplo-monads. Taken together, our studies strongly indicated that the acquisition of genes via LGT has played an important role in remodeling the metabolism of the group. It has been an important piece in the adaptation to parasitism and also in the posterior adaptation to secondary free-living in Trepomonas sp. PC1 and H. inflata.

However, our studies have raised new questions regarding how the acqui-sition of genes works in microbial eukaryotes. Our studies showed cases of convergence where the adaptation to the same factor was the acquisition of homologous genes from different donors (e.g., different bacterial groups). There were other cases where a function is acquired, lost, and regain, creating a dynamic situation where some pathways are in constant change. A better understanding of the different factors that result in the selection of inserted pieced of exogenous DNA would help to understand why some functions are kept while others are replaced more frequently.

Here, we reconstructed the Diplomonadida ancestor, the Giardiinae ances-tor and the Hexamitinae ancestor. They represent the information obtained from five genomes and one transcriptome. Fornicata is a diverse group. In order to improve our knowledge of diplomonads and their relatives, it would be valuable to obtain genomes from different diplomonads and other Fornicata species.

There are cellular features in diplomonads that need more experimental data. For example, diplomonads encode for a large diversity of transporters. Predicting the function and specificity of them using a bioinformatic approach is difficult without experimental data.

The knowledge about parasitic diplomonads has increased drastically dur-ing the last years. However, there are elements of the pathobiome that are not fully understood. For example, the role of the microbiota of the host during giardiasis or spironucleosis needs to be addressed.

Even when metabolic analyses are very time-consuming, the result can be used to study different aspects of microbial eukaryotes. This thesis has demon-strated the power of these analyses in non-model organisms to predict the re-lationship between different species and their environment.

Page 56: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

56

Svensk sammanfattning

Alla arter är i ständig interaktion med andra arter. Interaktionerna kan delas in i tre klasser, baserat på hur de ingående arterna påverkar varandra:

• mutualism: båda arterna får en fördel, • kommensalism: den ena arten får en fördel, medan den andra inte

påverkas, • parasitism: den ena arten får en fördel, medan den andra skadas.

Parasiter delar gemensamma egenskaper. Eftersom de erhåller många närings-ämnen från värden har parasiter genomgått en evolutionär process som resul-terat i en mindre komplex ämnesomsättning än fritt levande organismer. De har tappat förmågan att tillverka och bryta ned flera av de näringsämnen som behövs för att cellen ska överleva. Å andra sidan har de utvecklat nya sätt att komma åt dessa viktiga näringsämnen direkt från värdorganismerna. Vissa parasiter har också utvecklat speciella strukturer för att kolonisera värden, som en anpassning till denna livsstil. Traditionellt har parasiter betraktats som för-enklade versioner av deras fritt levande förfäder. Idag ses parasiter som org-anismer som är mycket anpassade till deras specifika nischer.

Oberoende av livsstilen kan alla levande varelser klassificeras i tre grupper: eukaryoter, bakterier och arkeer. Eukaryota celler kännetecknas av att det ge-netiska materialet finns inuti cellkärnan och avatt ha strukturer i cellen som kallas organeller (t.ex. mitokondrien). Bakterier och arkeer saknar cellkärna och organeller. Växter, djur, svampar och protister är alla grupper av eukaryo-ter. De flesta protister är encelliga organismer som är omöjliga att se utan mikroskop och de representerar huvuddelen av den eukaryota mångfalden. Det är viktigt att nämna att protister inte är en enskild evolutionär grupp. Det finns t.ex. protister som är närmare besläktade med djur än med andra protis-ter. Det samma gäller för växter och svampar.

Inom protister kan vi hitta diplomonader, den grupp som denna avhandling handlar om. Diplomonader finns i miljöer med syrekoncentrationer som är mycket lägre än i atmosfären, såsom i sjösediment och människotarmar. De flesta diplomonader har två cellkärnor och åtta flageller som används till t.ex. att simma med. De saknar aeroba mitokondrier, kraftverken i våra och de flesta andra eukaryota celler, och har hydrogenosomer eller mitosomer istället. Båda är förenklade versioner av de aeroba mitokondrierna som inte använder syre. Diplomonader är välkända som parasiter. Till exempel orsakar Giardia

Page 57: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

57

intestinalis diarré hos däggdjur, inklusive människor, och Spironucleus sal-monicida orsakar systemisk infektion i laxfiskar. Men vi kan också hitta kom-mensala och fritt levande arter inom denna grupp. Diplomonaderna produce-rar energi främst via nedbrytning av glukos. Nedbrytningen av aminosyror kan också användas för produktion av energi. Medan vissa arter kan skapa deoxyn-ukleotider, byggstenen av DNA, från enklare föreningar, har andra arter tappat denna förmåga och behöver ta upp dem från miljön inuti värdorganismen. Flera arter av diplomonader har en livscykel med två stadier. Trofozoiten är den aktiva, replikerande cellen, och cystan är den icke-replikerande cellen. Trofozoiterna från parasiten G. intestinalis utvecklas till cystor när de lämnar människokroppen och blir bara aktiva igen när de kommer in i tarmen hos en annan människa.

Jag har studenterat diplomonader från en evolutionär synvinkel. Grunden för evolutionen är att egenskaper ärvs från föräldrar till nästa generation, och att, som Darwin visade, de i nästa generation som är bäst anpassade till miljön kommer att ha störst chans att överleva. De som överlever bäst har också störst chans att föra sina egenskaper vidare till kommande generationer. Efter tusen-tals generationer kommer den här processen att leda till uppkomsten av nya arter med egenskaper anpassade till den ständigt föränderliga omgivande mil-jön. Det allra vanligaste är att en individ i en ny generation endast har varianter av egenskaper som fanns i den tidigare generationen, det vill säga att det ge-netiska materialet ärvs vertikalt, från föräldrarna till avkomman. Tidigare stu-dier av bakterier och protister har dock visat att arter ibland kan förvärva egen-skaper (gener) från avlägset besläktade arter och integrera dem i deras arvs-massa. Denna process kallas horisontal genöverföring och möjliggör att egen-skaper kan spridas mellan vitt skilda arter.

Denna avhandling fokuserar på studier av diplomonaders metabolism för att förstå hur de interagerar med sin miljö. Från en evolutionär synvinkel har vi studerat hur ämnesomsättningen förändrades från förfäder till de existe-rande arterna. Resultaten visar att diplomonadernas gemensamma förfader var en organism som redan var anpassad till miljöer med låg syrehalt, kunde bilda cystor och hade en mindre komplex ämnesomsättning. Detta antyder att för-fadern troligen var en organism som var anpassad till att leva inuti en värd-organism, kanske redan som en parasit. Våra resultat visar också att förvärv av gener via horisontell genöverföring och förlust av gener har format ämnes-omsättningen i en dynamisk evolutionär process. Flera av genförvärven är re-laterade till tillgången till nya näringsämnen, förmågan att ta hand om giftiga ämnen och interaktion med värden, eller de andra mikroberna som strävar in-uti värden. Dessa resultat visar styrkan i kombinationen av metaboliska och evolutionära analyser för att förstå sambandet mellan arter och deras miljö.

Page 58: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

58

Resumen en español

Todos los organismos se encuentran en constante interacción con otras espe-cies. Estas interacciones se pueden dividir en tres categorías en función del beneficio obtenido por las especies involucradas en la interacción:

• Mutualismo: cuando la interacción produce un beneficio mutuo. • Comensalismo: sólo una especie de la interacción obtiene un bene-

ficio, mientras que la otra no se ve afectada. • Parasitismo: una especie obtiene un beneficio, mientras que la otra

obtiene un daño.

Desde el punto de vista molecular, los parásitos representan un estilo de vida muy interesante y comparten una serie de características comunes. Debido a que obtienen muchos nutrientes del hospedador, las especies parásitas han ex-perimentado un proceso evolutivo que da lugar a un metabolismo menos com-plejo si lo comparamos con las especies de vida libre. Los parásitos han per-dido la capacidad de producir y degradar muchos de los compuestos necesa-rios para la supervivencia de la célula. Por otro lado, han desarrollado nuevas maneras de acceder a estos nutrientes esenciales directamente del hospedador. También como consecuencia del estilo de vida, algunos parásitos han desarro-llado estructuras especiales para poder colonizar el hospedador. Antigua-mente, los parásitos eran considerados versiones simplificadas de organismos de vida libre, pero hoy día son considerados organismos altamente especiali-zados a su entorno.

Independientemente del estilo de vida, todos los seres vivos conocidos se pueden clasificar en tres grupos: eucariota, bacteria y arquea. Las células eu-cariotas se caracterizan por tener su material genético dentro del núcleo y por presentar compartimentos llamados orgánulos (p. ej. la mitocondria), mientras que las células bacterianas y arqueas carecen tanto de núcleo como de orgá-nulos. Dentro de los eucariotas encontramos plantas, animales, hongos y pro-tistas. Los protistas son organismos unicelulares imposible de ver sin un mi-croscopio, en su mayoría, y representan la mayor diversidad dentro los euca-riotas. Cabe destacar que los protistas no representan una única línea evolu-tiva, es decir, existen protistas más relacionados con animales que con otros protistas. Lo mismo ocurre con plantas y hongos.

Dentro de los protistas, encontramos a las diplomonas, el grupo de estudio de esta tesis. Las diplomonas se encuentran en ambientes con niveles de

Page 59: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

59

oxigeno inferiores a los niveles ambientales, como por ejemplo sedimentos en lagos o el intestino humano. Presentan la peculiaridad de que muchos de sus miembros tienen dos núcleos y ocho flagelos que usan, por ejemplo, para el desplazamiento. Las diplomonas carecen de mitocondria aeróbica, la planta de energía de la mayoría de células eucarióticas, y en su lugar pueden tener hidrogenosomas o mitosomas. Ambas son versiones simplificadas de la mito-condria aeróbica y no usan oxigeno. Las diplomonas son conocidas en su ma-yoría como especies parásitas. Por ejemplo, Giardia intestinalis causa diarrea en mamíferos incluido los humanos y Spironucleus salmonicida produce in-fección sistémica en salmones. Dentro de este grupo también encontramos es-pecies comensales y especies de vida libre. El metabolismo global de las di-plomonas no está completamente estudiado. Se sabe que obtienen energía principalmente a través de la degradación de glucosa. También son capaces de degradar algunos aminoácidos para la producción de energía. Mientras que algunas especies de diplomonas son capaces de producir desoxirribonucleóti-dos, moléculas esenciales para la síntesis del DNA, desde compuestos sim-ples, otras han perdido esta capacidad y dependen del entorno del hospedador para obtener estas moléculas. Muchas especies de diplomonas, ya sean para-sitas, comensales o de vida libre, tienen un ciclo de vida con dos fases. El trofozoito que es la forma activa, y el quiste que es la forma de resistencia. Los trofozoitos del parasito G. intestinalis se convierten en quistes cuando abandonan el hospedador y sólo se vuelven activos de nuevo cuando entran en el intestino de un nuevo hospedador.

En esta tesis, he estudiado las diplomonas desde un punto de vista evolu-tivo. La base de la evolución es la herencia de los caracteres del progenitor a la descendencia y, como Darwin mostró, aquellos mejor adaptados al entorno tendrán mayores posibilidades de sobrevivir y pasar sus caracteres a una futura generación. Después de miles de generaciones, este proceso conducirá a la aparición de nuevas especies con características adaptadas al siempre cam-biante ambiente. Lo más común es que el individuo de una nueva generación sólo tenga variaciones de características presentes en generaciones previas puesto que el material genético se hereda de forma vertical (del progenitor a la descendencia). Sin embargo, estudios previos han mostrado que en ciertas ocasiones las especies adquieren caracteres (genes) procedentes de especies distantes evolutivamente y lo integran en su genoma. Este proceso es conocido como transferencia lateral de genes y permite la dispersión y adquisición de características de diferentes especies.

Esta tesis se ha centrado en el estudio del metabolismo de las diplomonas con la intención de determinar cómo interaccionan con el entorno. Desde un marco evolutivo, se ha estudiado cómo dicho metabolismo ha ido cambiando desde el ancestro de las diplomonas hasta las especies que vemos hoy en día. Nuestros resultados muestran que el ancestro era un organismo adaptado a ambientes con bajo nivel de oxigeno, capaz de formar quistes y con un meta-bolismo reducido. Estos datos indicarían que se trataría de un organismo con

Page 60: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

60

un estilo de vida dependiente de otro organismo, o completamente adaptado a un estilo de vida parasítica. Nuestros resultados también indican que la adqui-sición de genes vía transferencia lateral y la perdida de genes han moldeado el metabolismo de estas especies en un proceso dinámico de perdida y ganan-cia. Muchas de estas adquisiciones suponen una ventaja adaptativa relacio-nada con la obtención de nutrientes, interacción con el hospedador o con los otros microorganismos presentes en el hospedador, o bien con la eliminación de compuestos tóxicos, entre otros procesos. Estos resultados muestran la uti-lidad de la combinación de estudios metabólicos con estudios evolutivos para entender la adaptación de cualquier grupo de organismos a su entorno.

Page 61: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

61

Acknowledgements

People who know me know that I am not a person who usually talks about myself. However, I feel it’s a good time to do it and to explain some of the reasons why I ended up doing my Ph.D. in Uppsala.

When I was a child in Spain, there was this cartoon show called Érase una vez… la Vida (Once Upon a Time… Life). This series explained the DNA, diseases and the immune system, how the organs and systems work, and more in a really cool way. Suddenly, there was a new world just in from of me and I wanted to know more about it. My mind was set, I would study biology.

During my first year at the university, I studied those concepts that I learned through the TV show but in a more accurate way. However, I had the feeling that something was missing. It wasn’t until my second year that I took this course called Evolutionary Genetics, that I found that missing link. Of course, I studied evolution before, but this was the first time I had the feeling of what study evolution means. After that, I took all the courses where I could study a bit more about evolution, the origin of life, diversity, and the tree of life. When I finish my degree in biology, I started a two years degree in biochemistry. After that, I did my master in evolutionary biology. I can say now they were two of my best decisions because I realized then that I wanted to do a Ph.D.

I remembered when I first read the position. I didn’t know where Uppsala was in the Swedish map, I knew Giardia but nothing more about diplomonads, so of course, I decided to apply. Jan, I am grateful for all these years under your supervision. From day one, you were there, showing me the way but also allowing me to explore it on my own. Your office was always open if I had any problem or if I just found a new piece of the puzzle. Staffan, thank you for being my co-supervisor. Your energy and passion for the study of diplo-monads are contagious!

When I joined Jan’s group, I also joined the great MolEvo family. Thank you all for these incredible years full of good memories, fikas, beers, and a bit of science J. Siv, Lisa (2020 and Mötley Crüe are back on tour!!!), Emil, Karl, Guilhe, Anna, Andrea, Zeynep, Christian, Erik H, Tom, Mayank, Kasia, the current members of this big family. However, once molever, for-ever molever!!! Thijs, Laura, Dani, Eva, Anders, Wei, Joran, Courtney, Ellie, Moritz, Vicente, Gaëlle, Minna, Feifei, Jon, Henning, Jonathan L, Disa (always in our hearts), Jennah, Will, Lina, Felix, Max, Erik P, Ajith, Anja, Jun-hoe, Erik H, Jonathan H, Kirsten. I learned so much from all of you.

Page 62: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

62

If the MoleEvo family is great, I was lucky to expand the uppsaliensis fam-ily with more amazing members, Ale, Ana Rosa, Emilio, Giulia, Gianni, Sophia, Lucia, Mahsa, Guillaume, Ana B, Maria, Laia, Claudia, Ana M, Ana Z, Jesús, Carmen, Davide, Mercé, Karin, Dennis, Javi V, David, Showgy, Ásgeir, Laura, Tiscar, Javi C. Thank you for being there.

Writing this section, it came to my memory all the good moments together during these years: the Tough Viking (even when I don’t remember much of the end of the race), the Toughest, blodomloppet races, climbing, the trip to Abisko, Oslo, Gothenburg, midsummer in Fulufjället, table tennis evenings, Beer Clubs, Valborg, Darwin Day, Tasty Tuesdays, concerts, the Short Film Festival, Katuskas, Wilde parties at Kalmar, KulturNatten, snowboarding (in my case, eating snow) or just a beer after work. I will keep them as a treasure. Thank you all for being part of this adventure.

Todo tiene un origen y el mío está en España. Por supuesto tengo que em-pezar con mis padres, aunque algunos (por no decir muchos) días soy inso-portable, ellos siempre han estado ahí ayudándome en este camino. Mi her-mano, después de todas las peleas (y han sido muchas J), no cambio nada de haber crecido juntos. Sino es por ti, probablemente no sería el hijo del trueno que soy hoy. Eugenia, gracias por todo, especialmente por ser capaz de aguan-tar a mi hermano J. Mis sobrinos, Jaime y Eugenia, mis dos ojitos derechos. Gracias también a mis abuelos, os echo de menos, a mis tíos y a mis primos, por esas memorables comidas familiares. Manolo, Alicia, Enrique, Justi, Silvia, Ignacio, Manolo, Jesús, no sabría escoger una única cosa. Me habéis visto crecer!!! Tantos buenos amigos, Rubén, Marga, Manuel, Gloria y Carlitos, con vosotros una Cruzcampo sabe de lujo J, Chema, siempre he estado orgulloso de nuestra amistad, Lourdes, mi amore, Mar, dispuesto a dar guerra en Laos y Camboya, que lo sepas J. Esas vacaciones en los Piri-neos con Jesús, Pili, Marta, Jorge, Roberto, María Jesús, Manolo y Justi en nuestro centro de operaciones, el Hotel Mariana.

Thank you very much for everything! ¡Muchas gracias por todo!

Page 63: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

63

References

Adam RD et al. 1988. Antigenic variation of a cysteine-rich protein in Giardia lamblia. J. Exp. Med. 167:109–118.

Adam RD. 2001. Biology of Giardia lamblia. Clin. Microbiol. Rev. 14:447–469. Adam RD. 2017. Diplomonadida. In: Handbook of the Protists: Second Edition.

Archibald, JM, Simpson, AGB, & Slamovits, CH, editors. Springer, Cham pp. 1219–1246.

Adl SM et al. 2019. Revisions to the Classification, Nomenclature, and Diversity of Eukaryotes. J. Eukaryot. Microbiol. 66:4–119.

Aldritt SM, Tien P, Wang CC. 1985. Pyrimidine salvage in Giardia lamblia. J. Exp. Med. 161:437–445.

Aléxéieff A. 1910. Sur les flagellés intestinaux des poissons marins (note préliminaire). Arch. Zool. Exp. 5:1–22.

Andersson JO et al. 2007. A genomic survey of the fish parasite Spironucleus salmonicida indicates genomic plasticity among diplomonads and significant lateral gene transfer in eukaryote genome evolution. BMC Genomics. 8:51.

Andersson JO, Sjögren ÅM, Davis LAM, Embley TM, Roger AJ. 2003. Phylogenetic analyses of diplomonad genes reveal frequent lateral gene transfers affecting eukaryotes. Curr. Biol. 13:94–104.

Andersson SGE et al. 1998. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature. 396:133–140.

Ankarklev J, Jerlström-Hultqvist J, Ringqvist E, Troell K, Svärd SG. 2010. Behind the smile: Cell biology and disease mechanisms of Giardia species. Nat. Rev. Microbiol. 8:413–422.

Ástvaldsson Á. 2019. Pathogenesis and Cell Biology of the Salmon Parasite Spironucleus salmonicida. Uppsala University.

Aurrecoechea C et al. 2013. EuPathDB: the eukaryotic pathogen database. Nucleic Acids Res. 41:684–691.

Aurrecoechea C et al. 2009. GiardiaDB and TrichDB: integrated genomic resources for the eukaryotic protist pathogens Giardia lamblia and Trichomonas vaginalis. Nucleic Acids Res. 37:D526–D530.

Bass D, Stentiford GD, Wang HC, Koskella B, Tyler CR. 2019. The pathobiome: new insights in animal and plant health. Trends Ecol. Evol. xx:1–13.

Beatty JK et al. 2017. Giardia duodenalis induces pathogenic dysbiosis of human intestinal microbiota biofilms. Int. J. Parasitol. 47:311–326.

Biagini GA et al. 2003. Bacterial-like energy metabolism in the amitochondriate protozoon Hexamita inflata. Mol. Biochem. Parasitol. 128:11–19.

Biagini GA, McIntyre PS, Finlay BJ, Lloyd D. 1998. Carbohydrate and amino acid fermentation in the free-living primitive protozoon Hexamita sp. Appl. Environ. Microbiol. 64:203–207.

Biagini GA, Rutter AJ, Finlay BJ, Lloyd D. 1998. Lipids and lipid metabolism in the microaerobic free-living diplomonad Hexamita sp. Eur. J. Protistol. 34:148–152.

Page 64: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

64

Biagini GA, Suller MT, Finlay BJ, Lloyd D. 1997. Oxygen uptake and antioxidant responses of the free-living diplomonad Hexamita sp. J Eukaryot Microbiol. 44:447–453.

Bishop A. 1937. The Method of Division of Trepomonas agilis in Culture. Parasitology. 29:413–418.

Blanchard R. 1888. Remarques sur le megastome intestinal. Bull. Soc. Zool. Fr. 30:13–18.

Blevins SM, Bronze MS. 2010. Robert Koch and the “golden age” of bacteriology. Int. J. Infect. Dis. 14.

Brown DM, Upcroft JA, Edwards MR, Upcroft P. 1998. Anaerobic bacterial metabolism in the ancient eukaryote Giardia duodenalis. Int. J. Parasitol. 28:149–164.

Brown DM, Upcroft JA, Upcroft P. 1996. A H2O-producing NADH oxidase from the protozoan parasite Giardia duodenalis. Eur. J. Biochem. 241:155–61.

Brown DM, Upcroft JA, Upcroft P. 1995. Free radical detoxification in Giardia duodenalis. Mol. Biochem. Parasitol. 72:47–56.

Brown JR, Doolittle WF. 1997. Archaea and the prokaryote-to-eukaryote transition. Microbiol. Mol. Biol. Rev. 61:456–502.

Brown JR, Schwartz CL, Heumann JM, Dawson SC, Hoenger A. 2016. A detailed look at the cytoskeletal architecture of the Giardia lamblia ventral disc. J. Struct. Biol. 194:38–48.

Brugerolle G, Lee JJ. 2000. Orden Diplomonadida. In: The Illustrated Guide to the Protozoa. Lee, JJ, Leedale, GF, & Bradbury, P, editors. Society of Protozoologists: Lawrence, Kansas pp. 1125–1135.

Burki F, Roger A, Brown MW, Simpson AGB. 2019. The New Tree of Eukaryotes. Trends Ecol. Evol. In press:1–13.

Carlton JM et al. 2007. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science (80-. ). 315:207–212.

Cavalier-Smith T. 1983. A 6-kingdom classification and a unified phylogeny. In: Endocytobiology. II. Intracellular Space as Oligogenetic. Schenk, HEA & Schwemmler, WS, editors. Walter de Gruyter: Berlin pp. 1027–1034.

Cavalier-Smith T. 2002. The phagotrophic origin of eukaryotes and phylogenetic classification on protozoa. Int. J. Syst. Evol. Microbiol. 52:297–354.

Cerkasovová A, Lukasová G, Cerkasòv J, Kulda J. 1973. Biochemical characterization of large granule fraction of Tritrichomonas foetus (strain KV1). In: Journal of Protozoology.Vol. 20 SOC Protozoologists p. 535.

Clark CG, Roger AJ. 1995. Direct evidence for secondary loss of mitochondria in Entamoeba histolytica. Proc. Natl. Acad. Sci. U. S. A. 92:6518–6521.

Corradi N, Pombert JF, Farinelli L, Didier ES, Keeling PJ. 2010. The complete sequence of the smallest known nuclear genome from the microsporidian Encephalitozoon intestinalis. Nat. Commun. 1.

Criscuolo A, Gribaldo S. 2010. BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10:210.

Darwin CR. 1871. The Descent of Man, and Selection in Relation to Sex. 1st ed. London: John Murray.

Derelle E et al. 2006. Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features. Proc. Natl. Acad. Sci. U. S. A. 103:11647–11652.

Page 65: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

65

Desser SS, Hong H, Siddall ME, Barta JR. 1993. An ultrastructural study of Brugerolleia algonquinensis gen. nov., sp. nov. (Diplomonadina; Diplomonadida), a flagellate parasite in the blood of frogs from Ontario, Canada. Eur. J. Protistol. 29:72–80.

Dimopoulos M, Bagnara AS, Edwards MR. 2000. Characterisation and sequence analysis of a carbamate kinase gene from the diplomonad Hexamita inflata. J. Eukaryot. Microbiol. 47:499–503.

Dobell C. 1920. The Discovery of the Intestinal Protozoa of Man. Proc. R. Soc. Med. 13:1–15.

Doolittle WF. 1998. You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes. Trends Genet. 14:307–311.

Dujardin F. 1841. Histoire naturelle des Zoophytes. Infusoires, comprenant la physiologie et la classificatin de ces animaux, et la manière de les étudier à l’aide du microscope.

Ekman D, Björklund ÅK, Elofsson A. 2007. Quantification of the Elevated Rate of Domain Rearrangements in Metazoa. J. Mol. Biol. 372:1337–1348.

Eme L, Gentekaki E, Curtis B, Archibald JM, Roger AJ. 2017. Lateral gene transfer in the adaptation of the anaerobic parasite Blastocystis to the gut. Curr. Biol. 27:807–820.

Erlandsen SL et al. 1990. Axenic Culture and Characterization of Giardia ardeae from the Great Blue Heron (Ardea herodias). J. Parasitol. 76:717.

Erlandsen SL, Bemrick WJ. 1987. SEM Evidence for a New Species, Giardia psittaci. J. Parasitol. 73:623.

Feely DE. 1988. Morphology of the Cyst of Giardia microti by Light and Electron Microscopy. J. Protozool. 35:52–54.

Feng XM, Cao LJ, Adam RD, Zhang XC, Lu SQ. 2008. The catalyzing role of PPDK in Giardia lamblia. Biochem. Biophys. Res. Commun. 367:394–398.

da Fonseca OR. 1915. Estudos sobre os flagellados parasitos dos Mammiferos do Brazil. Typographia Leuzinger: Rio de Janeiro.

Force A et al. 1999. Preservation of Duplicate Genes by Complementary, Degenerative Mutations. Genetics. 151:1531–1545.

Gilbert W. 1978. Why genes in pieces? Nature. 271:501. Gillin FD, Reiner DS, Boucher SE. 1988. Small-intestinal factors promote encystation

of Giardia lamblia in vitro. Infect. Immun. 56:705–707. Graaf RM De, Hackstein JHP. 2012. Hydrogenosomes and Mitosomes: Mitochondrial

Adaptations to Life in Anaerobic Environments. Anoxia. 21:83–111. Grant JR, Katz LA. 2014. Phylogenomic study indicates widespread lateral gene

transfer in Entamoeba and suggests a past intimate relationship with parabasalids. Genome Biol. Evol. 6:2350–2360.

Gray MW, Doolittle WF. 1982. Has the endosymbiont hypothesis been proven? Microbiol. Rev. 46:1–42.

Gray MW, Spencer DF. 1996. Organellar evolution. Symp. Soc. Gen. Microbiol. 54:109–126.

Green ML, Karp PD. 2004. A Bayesian method for identifying missing enzymes in predicted metabolic pathway databases. BMC Bioinformatics. 5:1–16.

Griffith F. 1928. The Significance of Pneumococcal Types. J. Hyg. (Lond). 27:113–159.

Hagen KD et al. 2011. Novel structural components of the ventral disc and lateral crest in Giardia intestinalis. PLoS Negl. Trop. Dis. 5.

Hampl V et al. 2008. Genetic evidence for a mitochondriate ancestry in the “amitochondriate” flagellate Trimastix pyriformis. PLoS One. 3.

Page 66: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

66

Häusler T, Stierhof YD, Blattner J, Clayton C. 1997. Consevation of mitochondrial targeting sequence function in mitochondrial and hydrogenosomal proteins from the early-branching eukaryotes Crithidia, Trypanosoma and Trichomonas. Eur. J. Cell Biol. 73:240–251.

Hill AB. 1965. The Environment and Disease: Association or Causation? J. R. Soc. Med. 58:295–300.

Hirt RP, Alsmark C, Embley TM. 2015. Lateral gene transfers and the origins of the eukaryote proteome: A view from microbial parasites. Curr. Opin. Microbiol. 23:155–162.

Honigberg BM et al. 1964. A Revised Classification of the Phylum Protozoa. J. Protozool. 11:7–20.

Hyatt D et al. 2010. Prodigal: Prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 11.

Janouškovec J et al. 2015. Factors mediating plastid dependency and the origins of parasitism in apicomplexans and their close relatives. Proc. Natl. Acad. Sci. U. S. A. 112:10200–10207.

Janouskovec J, Keeling PJ. 2016. Evolution: Causality and the origin of parasitism. Curr. Biol. 26:174–177.

Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. 2017. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods. 14:587.

Kamaishi T et al. 1996. Protein phylogeny of translation elongation factor EF-1α suggests microsporidians are extremely ancient eukaryotes. J. Mol. Evol. 42:257–263.

Kanehisa M, Sato Y, Morishima K. 2016. BlastKOALA and GhostKOALA: KEGG Tools for Functional Characterization of Genome and Metagenome Sequences. J. Mol. Biol. 428:726–731.

Karp PD et al. 2016. Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology. Brief. Bioinform. 17:877–890.

Karp PD et al. 2018. The BioCyc collection of microbial genomes and metabolic pathways. Brief. Bioinform. 20:1085–1093.

Keeling PJ. 1998. A kingdom’s progress: Archezoa and the origin of eukaryotes. BioEssays. 20:87–95.

Keeling PJ, Brugerolle G. 2006. Evidence from SSU rRNA Phylogeny that Octomitus is a Sister Lineage to Giardia. Protist. 157:205–212.

Keeling PJ, Doolittle WF. 1996. A non-canonical genetic code in an early diverging eukaryotic lineage. EMBO J. 15:2285–2290.

Kidwell MG. 1993. Lateral Trasnfer in Natural Populations of Eukaryotes. Annu. Rev. Genet. 27:235–256.

Klebs G. 1892. Flagellatenstudien. Zeitschrift für Wissenschaftliche Zool. 55:265–445.

Klinger CM, Karnkowska A, Herman EK, Hampl V, Dacks JB. 2016. Phylogeny and evolution. Mol. Parasitol. Protozoan Parasites their Mol. 383–408.

Kofoid CA. 1920. A Critical Review of the Nomenclature of Human Intestinal Flagellates, Cercomonas, Chilomastix, Trichomonas, Tetratrichomonas, and Giardia. Univiversity Calif. Publ. Zool. 20:145-168 pp.

Kofoid CA, Christiansen EB. 1915. On Binary and Multiple Fission in Giardia muris (Grassi). Univiversity Calif. Publ. Zool. 16:30-54 pp.

Kolisko M et al. 2010. A wide diversity of previously undetected free-living relatives of diplomonads isolated from marine/saline habitats. Environ. Microbiol. 12:2700–2710.

Page 67: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

67

Kolisko M et al. 2008. Molecular phylogeny of diplomonads and enteromonads based on SSU rRNA, alpha-tubulin and HSP90 genes: implications for the evolutionary history of the double karyomastigont of diplomonads. BMC Evol. Biol. 8:205.

Kolisko M, Cepicka I, Hampl V, Kulda J, Flegr J. 2005. The phylogenetic position of enteromonads: A challenge for the present models of diplomonad evolution. Int. J. Syst. Evol. Microbiol. 55:1729–1733.

Korkmaz G, Holm M, Wiens T, Sanyal S. 2014. Comprehensive analysis of stop codon usage in bacteria and its correlation with release factor abundance. J. Biol. Chem. 289:30334–30342.

Kunstler J. 1882. Sur cinque Protozoaires parasites nouveaux. C.R. Hebd. Seances Acad. Sci. 95:347–349.

Lacroix B. 2016. Transfer of DNA from Bacteria to Eukaryotes. MBio. 7:1–9. Lambl W. 1859. Mikroskopische Untersuchungen der Darm-Excrete : Beitrag zur

Pathologie des Darms und zur Diagnostik am Krankenbette. Vierteljahrschrift für die Prakt. Heilkd. 61:1–58.

Lavier G. 1936. Sur la structure des Flagellés du genre Hexamita (Dujardin). C.R. Soc. Biol. 121:1177–1180.

Lederberg J, Tatum EL. 1946. Gene recombination in Escherichia coli. Nature. 158:558.

van Leeuwenhoek A. 1948. Alle de brieven. Deel 3: 1679-1683. N. V. Swets & Zeitlinger.

Leger MM et al. 2017. Organelles that illuminate the origins of Trichomonas hydrogenosomes and Giardia mitosomes. Nat. Ecol. Evol. 1:92.

Lemey P, Salemi M, Vandamme A-M, eds. 2009. The Phylogenetic Handbook. Cambridge University Press: Cambridge.

Levine ND et al. 1980. A Newly Revised Classification of the Protozoa. J. Protozool. 27:37–58.

Li W, Godzik A. 2006. Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 22:1658–1659.

Li Z, Phillips NFB. 1995. Pyrophosphate-dependent phosphofructokinase from Giardia lamblia: Purification and characterization. Protein Expr. Purif. 6:319–328.

Lindmark DG. 1980. Energy metabolism of the anaerobic protozoan Giardia lamblia. Mol. Biochem. Parasitol. 1:1–12.

Lindwiark G, Muller M. 1973. A Cytoplasmic Organelle Flagellate Tritrichomonas foetus, and Its Role in Pyruvate Metabolism. Biol. Chem. 248:7724–7728.

Lloyd D, Williams CF. 2014. Comparative biochemistry of Giardia, Hexamita and Spironucleus: Enigmatic diplomonads. Mol. Biochem. Parasitol. 197:43–49.

Loftus B et al. 2005. The genome of the protist parasite Entamoeba histolytica. Nature. 433:865–868.

Lokmer A et al. 2019. Use of shotgun metagenomics for the identification of protozoa in the gut microbiota of healthy individuals from worldwide populations with various industrialization levels. PLoS One. 14:1–20.

Lujan HD, Mowatt MR, Nash TE. 1996. Lipid requirements and lipid uptake by Giardia lamblia trophozoites in culture. J. Eukaryot. Microbiol. 43:237–242.

Lynch M, Conery JS. 2000. The evolutionary fate and consequences of duplicate genes. Science (80-. ). 290:1151–1155.

Ma’ayeh SY et al. 2017. Characterization of the Giardia intestinalis secretome during interaction with human intestinal epithelial cells: the impact on host cells. PLoS Negl. Trop. Dis. 11.

Ma’ayeh SY, Knörr L, Svärd SG. 2015. Transcriptional profiling of Giardia intestinalis in response to oxidative stress. Int. J. Parasitol. 45:925–938.

Page 68: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

68

Majoros WH, Pertea M, Salzberg SL. 2004. TigrScan and GlimmerHMM: Two open source ab initio eukaryotic gene-finders. Bioinformatics. 20:2878–2879.

Marchler-Bauer A, Bryant SH. 2004. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 32:327–331.

Martijn J, Vosseberg J, Guy L, Offre P, Ettema TJG. 2018. Deep mitochondrial origin outside the sampled alphaproteobacteria. Nature. 557:101–105.

Martin W, Müller M. 1998. The hydrogen hypothesis for the first eukaryote. Nature. 392:37–41.

Mastronicola D et al. 2010. Flavohemoglobin and nitric oxide detoxification in the human protozoan parasite Giardia intestinalis. Biochem. Biophys. Res. Commun. 399:654–658.

Mereschkowski K. 1905. Über Natur und Ursprung der Chromatophoren im Pflanzenreiche. Biol. Cent. 25:593–604.

Mereschkowsky C. 1910. Theorie der zwei Plasmaarten als Grundlage der Symbiogenesis, einer neuen Lehre von der Entstehung der Organismen. Biol. Cent. 30:278–288.

Milner DS et al. 2019. Environment-dependent fitness gains can be driven by horizontal gene transfer of transporter-encoding genes. Proc. Natl. Acad. Sci. U. S. A. 116:5613–5622.

Monis PT, Caccio SM, Thompson RCA. 2009. Variation in Giardia: towards a taxonomic revision of the genus. Trends Parasitol. 25:93–100.

Müller M. 1988. Energy metabolism of protozoa without mitochondria. Annu. Rev. Microbiol. 42:465–488.

Nash TE, Aggarwal A, Adam RD, Conrad JT, Merritt Jr JW. 1988. Antigenic variation in Giardia lamblia. J. Immunol. 141:636–641.

Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQTree: A fast and effective stochastic algorithm for estimating maximum likelihood phylogenies. Mol. Biol. Evol. 32:268–274.

Nixon JEJ et al. 2002. Evidence for lateral transfer of genes encoding ferredoxins, nitroreductases, NADH oxidase, and alcohol dehydrogenase 3 from anaerobic prokaryotes to Giardia lamblia and Entamoeba histolytica. Eukaryot. Cell. 1:181–190.

Ohno S. 1970. Evolution by Gene Duplication. Springer, Berlin, Heidelberg. Peyretaillade E et al. 1998. Microsporidia, amitochondrial protists, possess a 70-kDa

heat shock protein gene of mitochondrial evolutionary origin. Mol. Biol. Evol. 15:683–689.

Phillips NFB, Li Z, Lindmark DG. 1997. Isolation of a pyrophosphate-dependent phosphofructokinase from Hexamita inflata. Mol. Biochem. Parasitol. 90:377–380.

Ponce-Toledo RI et al. 2017. An Early-Branching Freshwater Cyanobacterium at the Origin of Plastids. Curr. Biol. 27:1–6.

Portier P. 1918. Les symbiotes. Masson. Poulin R, Randhawa HS. 2015. Evolution of parasitism along convergent lines: From

ecology to genomics. Parasitology. 142:S6–S15. Poynton SL, Fard MRS, Jenkins J, Ferguson HW. 2004. Ultrastructure of

Spironucleus salmonis n. comb. (formerly Octomitus salmonis sensu Moore 1922, Davis 1926, and Hexamita salmonis sensu Ferguson 1979), with a guide to Spironucleus species. Dis. Aquat. Organ. 60:49–64.

Price MN, Dehal PS, Arkin AP. 2010. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One. 5.

Prowazek S v. 1904. Untersuchungen über einige parasitische Flagellaten. Arb. aus d. Kaiserl. Gesundheitsamte. 21:1–41.

Page 69: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

69

Rafferty S, Luu B, March RE, Yee J. 2010. Giardia lamblia encodes a functional flavohemoglobin. Biochem. Biophys. Res. Commun. 399:347–351.

Raj D, Ghosh E, Mukherjee AK, Nozaki T, Ganguly S. 2014. Differential gene expression in Giardia lamblia under oxidative stress: significance in eukaryotic evolution. Gene. 535:131–139.

Van Rijk A, Bloemendal H. 2003. Molecular mechanisms of exon shuffling: Illegitimate recombination. Genetica. 118:245–249.

Roger AJ et al. 1998. A mitochondrial-like chaperonin 60 gene in Giardia lamblia: Evidence that diplomonads once harbored an endosymbiont related to the progenitor of mitochondria. Proc. Natl. Acad. Sci. U. S. A. 95:229–234.

Roger AJ, Clark CG, Doolittle WF. 1996. A possible mitochondrial gene in the early-branching amitochondriate protist Trichomonas vaginalis. Proc. Natl. Acad. Sci. U. S. A. 93:14618–14622.

Roger AJ, Muñoz-Gómez SA, Kamikawa R. 2017. The Origin and Diversification of Mitochondria. Curr. Biol. 27:R1177–R1192.

Rossman MG, Liljas A, Brändén CI, Banaszak LJ. 1975. 2 Evolutionary and Structural Relationships among Dehydrogenases. In: Enzymes. Boyer, PD, editor. Vol. 11 Academic Press pp. 61–102.

Sagan L. 1967. On the origin of mitosing cells. J. Theor. Biol. 14:255–74. Savory FR, Milner DS, Miles DC, Richards TA. 2018. Ancestral function and

diversification of a horizontally acquired oomycete carboxylic acid transporter. Mol. Biol. Evol. 35:1887–1900.

Schofield PJ, Edwards MR, Matthews J, Wilson JR. 1992. The pathway of arginine catabolism in Giardia intestinalis. Mol. Biochem. Parasitol. 51:29–36.

Seligo A. 1887. Untersuchungen über Flagellaten. Beiträge zur Biol. der Pflanz. 4:145–178.

Shapiro J. 2012. Mobile Genetic Elements. Elsevier Science. Sheng Y et al. 2014. Superoxide dismutases and superoxide reductases. Chem. Rev.

114:3854–3918. Siddall ME, Brooks DR, Desser SS. 1993. Phylogeny and the Reversibility of

Parasitism. Evolution (N. Y). 47:308. Siddall ME, Hong H, Desser SS. 1992. Phylogenetic Analysis of the Diplomonadida

(Wenyon, 1926) Brugerolle, 1975: Evidence for Heterochrony in Protozoa and Against Giardia lamblia as a “Missing Link.” J. Protozool. 39:361–367.

Sieber KB, Bromley RE, Dunning Hotopp JC. 2017. Lateral gene transfer between prokaryotes and eukaryotes. Exp. Cell Res. 358:421–426.

Simpson AGB. 2003. Cytoskeletal organization, phylogenetic affinities and systematics in the contentious taxon Excavata (Eukaryota). Int. J. Syst. Evol. Microbiol. 53:1759–1777.

Sogin ML, Gunderson JH, Elwood HJ, Alonso RA, Peatrie DA. 1989. Phylogenetic meaning of the kingdom concept: An unusual ribosomal RNA from Giardia lamblia. Science (80-. ). 243:75–77.

Soltys BJ, Gupta RS. 1994. Presence and Cellular Distribution of a 60-kDa Protein Related to Mitochondrial HSP60 in Giardia lamblia. J. Parasitol. 80:580–590.

Spang A et al. 2015. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature. 521:173–179.

Stairs CW et al. 2019. Oxygen induces the expression of invasion and stress response genes in the anaerobic salmon parasite Spironucleus salmonicida. BMC Biol. 17:19.

Stiles CW. 1902. The Type-species of Certain Genera of Parasitic Flagellates, Particularly Grassi’s Genera of 1879 and 1881.

Page 70: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

70

Striepen B et al. 2004. Gene transfer in the evolution of parasite nucleotide biosynthesis. Proc. Natl. Acad. Sci. U. S. A. 101:3154–3159.

Syvanen M. 1985. Cross-species gene transfer; implications for a new theory of evolution. J. Theor. Biol. 112:333–343.

Szöllősi GJ, Davín AA, Tannier E, Daubin V, Boussau B. 2015. Genome-scale phylogenetic analysis finds extensive gene transfer among fungi. Philos. Trans. R. Soc. B Biol. Sci. 370.

Szöllősi GJ, Tannier E, Daubin V, Boussau B. 2015. The inference of gene trees with species trees. Syst. Biol. 64:e42–e62.

Takishita K et al. 2012. Multigene Phylogenies of Diverse Carpediemonas-like Organisms Identify the Closest Relatives of “Amitochondriate” Diplomonads and Retortamonads. Protist. 163:344–355.

Tanifuji G et al. 2018. The draft genome of Kipferlia bialata reveals reductive genome evolution in fornicate parasites Schierwater, B, editor. PLoS One. 13:e0194487.

Tautz D, Domazet-Lošo T. 2011. The evolutionary origin of orphan genes. Nat. Rev. Genet. 12:692–702.

Testa F et al. 2011. The superoxide reductase from the early diverging eukaryote Giardia intestinalis. Free Radic. Biol. Med. 51:1567–1574.

Thomas CM, Nielsen KM. 2005. Mechanisms of, and barriers to, horizontal gene transfer between bacteria. Nat. Rev. Microbiol. 3:711–721.

Timmis JN, Ayliff MA, Huang CY, Martin W. 2004. Endosymbiotic gene transfer: Organelle genomes forge eukaryotic chromosomes. Nat. Rev. Genet. 5:123–135.

Tovar J et al. 2003. Mitochondrial remnant organelles of Giardia function in iron-sulphur protein maturation. Nature. 426:172–176.

Tovar J, Fischer A, Clark CG. 1999. The mitosome, a novel organelle related to mitochondria in the amitochondrial parasite Entamoeba histolytica. Mol. Microbiol. 32:1013–1021.

Tsaousis AD, Leger MM, Stairs CAW, Roger AJ. 2012. The Biochemical Adaptations of Mitochondrion-Related Organelles of Parasitic and Free-Living Microbial Eukaryotes to Low Oxygen Environments. In: Anoxia: Evidence for Eukaryote Survival and Paleontological Strategies. Altenbach, A V, Bernhard, JM, & Seckbach, J, editors. Springer Netherlands: Dordrecht pp. 51–81.

Vayssier-Taussat M et al. 2014. Shifting the paradigm from pathogens to pathobiome new concepts in the light of meta-omics. Front. Cell. Infect. Microbiol. 5:1–7.

Vicente JB et al. 2009. Redox properties of the oxygen-detoxifying flavodiiron protein from the human parasite Giardia intestinalis. Arch. Biochem. Biophys. 488:9–13.

Vossbrinck CR, Maddox J V., Friedman S, Debrunner-Vossbrinck BA, Woese CR. 1987. Ribosomal RNA sequence suggests microsporidia are extremely ancient eukaryotes. Nature. 326:411–414.

Wallin IE. 1927. Symbionticism and the origin of species. Baltimore :Williams & Wilkins Company.

Wang CC, Aldritt S. 1983. Purine salvage networks in Giardia lamblia. J. Exp. Med. 158:1703–1712.

Wenyon CM. 1926. Protozoology: A Manual For Medical Men, Veterinarians and Zoologists.

Williams BAP, Hirt RP, Lucocq JM, Embley TM. 2002. A mitochondrial remnant in the microsporidian Trachipleistophora hominis. Nature. 418:865–869.

Williams CF et al. 2011. Spironucleus species: Economically-important fish pathogens and enigmatic single-celled eukaryotes. J. Aquac. Res. Dev.

Wilson EB. 1925. The cell in development and heredity. Macmillan: New York.

Page 71: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

71

Woese CR, Fox GE. 1977. Phylogenetic structure of the prokaryotic domain: The primary kingdoms. Proc. Natl. Acad. Sci. U. S. A. 74:5088–5090.

Woo YH et al. 2015. Chromerid genomes reveal the evolutionary path from photosynthetic algae to obligate intracellular parasites. Elife. 4:1–41.

Wood AM, Smith H V. 2005. Spironucleosis (Hexamitiasis, Hexamitosis) in the Ring-Necked Pheasant (Phasianus colchicus): Detection of Cysts and Description of Spironucleus meleagridis in Stained Smears. Avian Dis. 49:138–143.

Xu F et al. 2016. On the reversibility of parasitism: Adaptation to a free-living lifestyle via gene acquisitions in the diplomonad Trepomonas sp. PC1. BMC Biol. 14:62.

Xu F et al. 2014. The genome of Spironucleus salmonicida highlights a fish pathogen adapted to fluctuating environments. PLoS Genet. 10.

Xu F, Jerlström-Hultqvist J, Andersson JO. 2012. Genome-wide analyses of recombination suggest that Giardia intestinalis assemblages represent different species. Mol. Biol. Evol. 29:2895–2898.

Yang D, Oyaizu Y, Oyaizu H, Olsen GJ, Woese CR. 1985. Mitochondrial origins. Proc. Natl. Acad. Sci. U. S. A. 82:4443–4447.

Zinder ND, Lederberg J. 1952. Genetic exchange in Salmonella. J. Bacteriol. 64:679–699.

Page 72: Stolen genes, a shortcut to successuu.diva-portal.org/smash/get/diva2:1383246/FULLTEXT01.pdf · Resumen en español ... Mbp Mega base pairs MSA Multiple sequence alignment NCBI National

Acta Universitatis UpsaliensisDigital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Science and Technology 1892

Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science andTechnology, Uppsala University, is usually a summary of anumber of papers. A few copies of the complete dissertationare kept at major Swedish research libraries, while thesummary alone is distributed internationally throughthe series Digital Comprehensive Summaries of UppsalaDissertations from the Faculty of Science and Technology.(Prior to January, 2005, the series was published under thetitle “Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Science and Technology”.)

Distribution: publications.uu.seurn:nbn:se:uu:diva-401276

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2020