expanding magnetic organelle biogenesis in the domain bacteria · 55 of bacteria and archaea,...

28
1 Expanding magnetic organelle biogenesis in the domain Bacteria 1 2 Wei Lin 1,2,3* , Wensi Zhang 1,2,3,4 , Greig A. Paterson 5 , Qiyun Zhu 6 , Xiang Zhao 7 , Rob 3 Knight 6 , Dennis A. Bazylinski 8 , Andrew P. Roberts 7 , and Yongxin Pan 1,2,3,4* 4 5 1 Key Laboratory of Earth and Planetary Physics, Institute of Geology and Geophysics, Chinese 6 Academy of Sciences, Beijing 100029, China. 7 2 Innovation Academy for Earth Science, Chinese Academy of Sciences, Beijing 100029, China. 8 3 France-China Joint Laboratory for Evolution and Development of Magnetotactic Multicellular 9 Organisms, Chinese Academy of Sciences, Beijing 100029, China. 10 4 College of Earth and Planetary Sciences, University of Chinese Academy of Sciences, Beijing 11 100049, China. 12 5 Department of Earth, Ocean and Ecological Sciences, University of Liverpool, Liverpool, L69 13 7ZE, UK. 14 6 Department of Pediatrics, University of California San Diego, La Jolla, CA 92037, USA. 15 7 Research School of Earth Sciences, Australian National University, Canberra, ACT 2601, 16 Australia. 17 8 School of Life Sciences, University of Nevada at Las Vegas, Las Vegas, NV 89154-4004, USA. 18 19 * Email: [email protected]; [email protected] 20 (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint this version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960 doi: bioRxiv preprint

Upload: others

Post on 06-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

1

Expanding magnetic organelle biogenesis in the domain Bacteria 1

2

Wei Lin1,2,3*, Wensi Zhang1,2,3,4, Greig A. Paterson5, Qiyun Zhu6, Xiang Zhao7, Rob 3

Knight6, Dennis A. Bazylinski8, Andrew P. Roberts7, and Yongxin Pan1,2,3,4* 4

5

1Key Laboratory of Earth and Planetary Physics, Institute of Geology and Geophysics, Chinese 6

Academy of Sciences, Beijing 100029, China. 7

2Innovation Academy for Earth Science, Chinese Academy of Sciences, Beijing 100029, China. 8

3France-China Joint Laboratory for Evolution and Development of Magnetotactic Multicellular 9

Organisms, Chinese Academy of Sciences, Beijing 100029, China. 10

4College of Earth and Planetary Sciences, University of Chinese Academy of Sciences, Beijing 11

100049, China. 12

5Department of Earth, Ocean and Ecological Sciences, University of Liverpool, Liverpool, L69 13

7ZE, UK. 14

6Department of Pediatrics, University of California San Diego, La Jolla, CA 92037, USA. 15

7Research School of Earth Sciences, Australian National University, Canberra, ACT 2601, 16

Australia. 17

8School of Life Sciences, University of Nevada at Las Vegas, Las Vegas, NV 89154-4004, USA. 18

19

*Email: [email protected]; [email protected] 20

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 2: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

2

Abstract 21

The discovery of membrane-enclosed, metabolically functional organelles in 22

Bacteria and Archaea has transformed our understanding of the subcellular 23

complexity of prokaryotic cells. However, whether prokaryotic organelles 24

emerged early or late in evolutionary history remains unclear and limits 25

understanding of the nature and cellular complexity of early life. 26

Biomineralization of magnetic nanoparticles within magnetosomes by 27

magnetotactic bacteria (MTB) is a fascinating example of prokaryotic organelles. 28

Here, we reconstruct 168 metagenome-assembled MTB genomes from various 29

aquatic environments and waterlogged soils. These genomes represent nearly a 3-30

fold increase over the number currently available, and more than double the 31

known MTB species. Phylogenomic analysis reveals that these newly described 32

genomes belong to 13 Bacterial phyla, six of which were previously not known to 33

include MTB. These findings indicate a much wider taxonomic distribution of 34

magnetosome organelle biogenesis across the domain Bacteria than previously 35

thought. Comparative genome analysis reveals a vast diversity of magnetosome 36

gene clusters involved in magnetosomal biogenesis in terms of gene content and 37

synteny residing in distinct taxonomic lineages. These gene clusters therefore 38

represent a promising, diverse genetic resource for biosynthesizing novel magnetic 39

nanoparticles. Finally, our phylogenetic analyses of the core magnetosome 40

proteins in this largest available and taxonomically diverse dataset support an 41

unexpectedly early evolutionary origin of magnetosome biomineralization, likely 42

ancestral to the origin of the domain Bacteria. These findings emphasize the 43

potential biological significance of prokaryotic organelles on the early Earth and 44

have important implications for our understanding of the evolutionary history of 45

cellular complexity. 46

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 3: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

3

Introduction 47

It was accepted widely that intracellular, membrane-bounded, metabolically functional 48

organelles are present exclusively in eukaryotic cells and that they are absent from 49

Bacteria and Archaea. This long-held view was revised after numerous recent 50

discoveries of a diverse group of highly organized, membrane-enclosed organelles in 51

the domains Bacteria and Archaea associated with specific cellular functions 1–3. 52

However, the origin and evolution of prokaryotic organelles remain largely elusive. It 53

is still unclear whether organelle biogenesis emerged early or late during the evolution 54

of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 55

cellular complexity. 56

Magnetosomes within magnetotactic bacteria (MTB) are a striking example of 57

prokaryotic organelles 4. Magnetosomes consist of a lipid bilayer-bounded membrane 58

in which nanosized, ferrimagnetic magnetite (Fe3O4) and/or greigite (Fe3S4) crystals 59

are biomineralized and usually arranged in chain-like structure(s) that maximize the 60

magnetic dipole moment 5,6 (Figure 1). The most accepted major function of 61

magnetosomes is to produce tiny compass needles that facilitate MTB navigation to 62

their preferred low-O2 or anaerobic microenvironments in chemically-stratified aquatic 63

systems, a behaviour referred to as magnetotaxis or microbial magnetoreception 7. 64

Additional suggested functions of magnetosomal crystals include 65

detoxification/elimination of toxic reactive oxygen species (ROS), iron sequestration 66

and storage in which they act as an electrochemical battery, or as a gravity sensor 8–10 67

(Figure 1). Understanding the phylogenetic and genomic diversity of MTB could 68

advance significantly our understanding of the evolutionary origin of bacterial 69

organelle biogenesis in general. Moreover, considering that magnetoreception occurs 70

widely in both micro- and macro-organisms and that magnetosomal crystals are the 71

only magnetoreceptors definitively characterized thus far, MTB also represent a 72

valuable system for exploring the origin and early evolution of magnetoreception 11–14. 73

MTB are distributed globally across a broad range of O2-limited or anaerobic aquatic 74

habitats, ranging from freshwater lakes to oceans and even to some extreme 75

environments 15–17. However, few studies have reported MTB in waterlogged soils 18,19. 76

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 4: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

4

Magnetosomal biogenesis by MTB is recognized as a key component of the global iron 77

cycle 20,21, and is also potentially important in the biogeochemical cycling of 78

phosphorus, nitrogen, and sulphur 9,22–24. Historically, magnetosomal biomineralization 79

has been viewed as a specialized type of metabolism restricted to two Bacterial phyla: 80

the Proteobacteria (Alphaproteobacteria, Deltaproteobacteria, and 81

Gammaproteobacteria classes) and the Nitrospirae 25. Initial attempts to explain this 82

restricted yet scattered distribution of magnetosomal biogenesis based on a few taxa 83

gave rise to two alternative hypotheses: polyphyletic origin in different taxonomic 84

lineages 26, or extensive horizontal gene transfers (HGTs) 27,28. 85

In recent years, the known extent of MTB diversity has undergone a significant 86

expansion due to methodological advances, such as the successful cultivation of novel 87

MTB strains e.g., 29–32, 16S rRNA gene-based characterization e.g., 33–35, genome mining 88

of public repositories including GenBank and IMG/ER 17,36, and cultivation-89

independent surveys of magnetosome gene clusters (MGCs, which are physically 90

clustered groups of genes that are together responsible for magnetosomal biogenesis) 91

37,38. MTB have now been further identified in other Bacterial taxa, including the 92

Betaproteobacteria, Zetaproteobacteria, “Candidatus Etaproteobacteria”, and 93

“Candidatus Lambdaproteobacteria” classes of the Proteobacteria phylum, the 94

candidate phylum Omnitrophica (previously known as the candidate division OP3), the 95

candidate phylum Latescibacteria (previously known as the candidate division WS3), 96

and the phylum Planctomycetes, according to the NCBI taxonomy. Analyses including 97

data from these latter groups suggest that magnetosomal biogenesis among different 98

Bacterial lineages has a monophyletic origin from a common ancestor, which occurred 99

prior to divergence of the Nitrospirae and Proteobacteria phyla, or perhaps even earlier, 100

in the last common ancestor of five MTB-containing Bacterial phyla: Proteobacteria, 101

Nitrospirae, Omnitrophica, Latescibacteria, and Planctomycetes 15,37,39–41. 102

Discovery of novel MTB outside the traditionally recognized taxonomic lineages 103

suggests that the diversity and distribution of MTB across the domain Bacteria are 104

considerably underestimated. This raises important questions regarding the origin and 105

evolution of magnetosome organelle biosynthesis. Here we present the most 106

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 5: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

5

comprehensive metagenomic analysis available of MTB communities from 107

geographically, physically, and chemically diverse sites to evaluate two questions: (1) 108

are MTB taxonomically widespread among the phyla of the domain Bacteria; and (2) 109

did the magnetosome organelle originated early? 110

111

Results and Discussion 112

Survey of MTB from diverse environments. We perform a light microscopy survey 113

of MTB from wide-ranging environments across China and Australia (Figure 2a), 114

including sediments from freshwater lakes, ponds, rivers, creeks, paddy fields, 115

intertidal zones, and soils from acidic peatlands. MTB were observed microscopically 116

in these habitats with a salinity range of <0.1-37.0 ppt and a pH range of 4.3-8.6 117

(Supplementary Table 1). Unexpectedly, we find living MTB cells in acidic peatland 118

soils (pH 4.3-5.7) with high water contents (>60%) and organic matter contents 119

(typically >20%) 42–44. MTB have been found broadly in diverse aquatic ecosystems 17, 120

including some extreme environments such as hot springs 45,46, saline-alkaline lakes 47, 121

acidic lagoons and mine drainage systems 33,48, and deep-sea sediments 49, but reports 122

of MTB in soils are limited to a few studies that were published almost 30 years ago 123

18,19. Whether MTB can survive and reproduce in waterlogged soil environments 124

remains unresolved, and the taxonomic diversity of MTB in these environments has not 125

been elucidated. Our finding represents, to the best of our knowledge, the first discovery 126

of MTB in soils with relatively acidic pH. MTB in acidic peatland soils are represented 127

mainly by deep-branching lineages, which is markedly different from other 128

environments (Figure 3, discussed below). Our finding, together with the previous 129

studies 18,19, suggests widespread MTB occurrence in different types of waterlogged 130

soils, in addition to subaqueous sediments and bodies of water. 131

132

Expanded genomic diversity of MTB. Metagenomic DNA from magnetically 133

enriched MTB was sequenced and metagenome-assembled genomes were 134

reconstructed with a single-sample assembly and binning strategy (see Methods). 135

Reconstructed genomes were checked manually for the presence of MGCs. From this 136

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 6: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

6

analysis, we recover a total of 168 MGC-containing genomes with a quality score 50 137

above 50 (Supplementary Table 2). Of these putative MTB genomes, 69 (41%) are 138

high-quality (with >90% completeness and <5% contamination), 64 (38%) 139

are medium-quality (>70% complete with <10% contamination ), and 35 (21%) are 140

partial (with 50-70% completeness and <5% contamination) genomes (Figure 2b and 141

Supplementary Table 2). These genomes increase substantially the known genomic 142

diversity of MTB, expanded from 59 (as of January 2020, Supplementary Table 3) to 143

227 MTB genomes, which can now be clustered into 164 unique species-level genomes 144

based on a 95% average nucleotide identity (ANI) 51. Of these species, 110 (67%) are 145

reported here for the first time. 146

The taxonomy of the newly acquired MTB genomes is classified using a standardized 147

phylogenomic curated taxonomy system Genome Taxonomy Database (GTDB) Toolkit 148

52,53. Around 80% of the recovered genomes cannot be assigned at the species level, 149

and more than 45% cannot be assignable at the genus level (Figure 2c), which confirms 150

that many of these genomes represent previously unknown populations. The 151

phylogenetic relationships of these genomes were further determined by phylogenomic 152

analysis (see Methods). The reconstructed genome tree is congruent with the GTDB 153

taxonomy as shown in Figure 3a and Supplementary Figure 1; therefore, the GTDB 154

taxonomy is used for taxonomic classification of MTB throughout, unless otherwise 155

noted. The corresponding NCBI taxonomy of each lineage is also given in Figure 3a. 156

The 168 genomes belong to organisms from 13 distinct Bacterial phyla as defined in 157

the GTDB taxonomy and 7 phyla according to the NCBI taxonomy (Figure 3a and 158

Supplementary Table 2). These genomes are recovered from several previously poorly 159

characterized MTB groups, including 26 genomes from the Desulfobacterota phylum, 160

23 genomes from the Nitrospirota phylum, and 20 genomes from the Omnitrophota 161

phylum (Figure 3a). These novel genomes also expand substantially the representation 162

of common MTB lineages, such as the Magnetococcia (57 genomes) and 163

Alphaproteobacteria (13 genomes) classes of the Proteobacteria phylum. More 164

importantly, we identify 16 genomes that are affiliated within 6 phyla that were not 165

known previously to contain MTB, including the Nitrospinota, UBA10199, 166

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 7: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

7

Bdellovibrionota, Bdellovibrionata_B, Fibrobacterota, and Riflebacteria phyla (Figure 167

3a and Supplementary Figure 1). This expands greatly the number of Bacterial lineages 168

associated with magnetosomal biogenesis and magnetic navigation. 169

To deepen our understanding of the environmental distribution patterns of MTB, the 170

taxonomic diversity of novel MTB genomes was compared across environments. 171

Genomes belonging to the phyla Proteobacteria and Desulfobacterota are found 172

predominantly in both freshwater/marginal (<1 ppt) and saline/brackish/marine (>1 ppt) 173

sediments, whereas genomes affiliated with the Nitrospirata and Omnitrophota phyla 174

represent the dominant MTB groups in acidic peatland soils (Figure 3d). MTB from the 175

Proteobacteria, Desulfobacterota, and Omnitrophota exist in all three environmental 176

sample types, while those of the Desulfobacterota_A phylum and of the SAR324, 177

Fibrobacterota, and Planctomycetota phyla are observed exclusively in acidic peatland 178

soils and brackish/saline/marine environments, respectively (Figure 3b). For 179

Proteobacteria classes, MTB genomes from the Magnetococcia and 180

Alphaproteobacteria are found in wide-ranging environments, while those with 181

Zetaproteobacteria and Gammaproteobacteria genomes occur solely in 182

brackish/saline/marine environments. 183

184

Diverse MGCs across distinct taxonomic lineages. Genes for the metabolic pathway 185

responsible for magnetosomal biogenesis have been found in contiguous gene clusters 186

in MTB genomes 54, which are referred to as MGCs. These gene clusters are not only 187

the key to deciphering the mechanisms and evolutionary origin of magnetosome 188

formation and magnetotaxis 5, but they also provide a wealth of gene resources for 189

biosynthesis of membrane-bounded, single-domain magnetic nanoparticles with 190

diverse properties for various applications 55. The genomes acquired here contain 191

diverse MGC types in terms of gene content and synteny (Figure 4), which expands our 192

knowledge of MGC diversity considerably. Discovery of various MGCs suggests the 193

potential for diverse magnetosomal biogenesis and magnetotaxis across the domain 194

Bacteria. The mam 56–58 genes that play essential roles in magnetosomal biogenesis are 195

present in all MGCs identified here. Remarkably, we note that the feoB gene, which is 196

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 8: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

8

responsible for iron transport into the cell, is also shared by most MTB lineages and is 197

usually included in the MGCs (Figure 3c and Figure 4). Deletion of feoB in 198

Magnetospirillum strains results in reduced magnetite biomineralization 59,60, which 199

indicates its potentially significant role in magnetosomal biomineralization. The mad 200

61,62 and man 63 genes, which have been proposed to play important accessary functions 201

in magnetosomal biomineralization, represent a much wider distribution across MTB 202

genomes than previously thought: mad genes are present in the genomes of 11 phyla 203

and man genes occur in the genomes of the Desulfobacterota (nTS_bin18 and 204

nDJH15_bin4), Nitrospirota, and Nitrospinota (nPCR_bin9 and nNGH_bin12) phyla. 205

The mms6 operon, which contains magnetosome genes mms6, mmsF, mms36, and 206

mms48, controls the size and/or number of magnetic magnetosomal crystals in 207

Magnetospirillum strains 57,58,64,65, and these genes have previously only been found in 208

the Proteobacteria phylum. Here we find that MGCs from genomes of the phyla 209

Nitrospinota (nNGH_bin12) and SAR324 (nKLK_bin6 and nPCR_bin7) also contain 210

mms genes (mms6 and mmsF). In general, the gene content and orientation of MGCs 211

vary considerably across different taxonomic lineages but are generally conserved 212

within the same lineage (Figure 4), which is indicative of lineage specific MGC 213

evolution without extensive inter-phylum HGTs and inter-class HGTs among the 214

Proteobacteria phylum. Considering the potentially high metabolic cost of maintaining 215

such complex gene clusters in MTB, the wide observed MGC distribution across 216

different lineages suggests that magnetosomal biogenesis and magnetotaxis must 217

confer selective advantages on these organisms. 218

Two types of magnetosomal mineral crystals have been identified to date: magnetite 219

(Fe3O4) and greigite (Fe3S4). Some MTB biomineralize both minerals within the same 220

cell 66,67. Before this study, Fe3O4-type MGCs had been identified in all MTB lineages 221

except for the Latescibacterota and Planctomycetota 17, while Fe3S4-type MGCs were 222

only found in the Desulfobacterota (e.g., Candidatus Magnetoglobus multicellularis 68 223

and Candidatus Desulfamplus magnetomortis strain BW-1 67), Latescibacterota 36, and 224

Planctomycetota 17 phyla. Here, most of the identified MGCs contain Fe3O4-type 225

magnetosome genes (including the first identification of Fe3O4-type MGCs in the 226

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 9: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

9

Planctomycetota phylum) (Figure 4). Only a small fraction (nER2_bin1, nHLH_bin7, 227

nN2-2_bin5, nS315_bin9, nS315_bin20, nS315_bin24, nTS_bin4, and nXX_bin1) 228

harbor putative Fe3S4-type or both Fe3O4- and Fe3S4-type magnetosome genes, which 229

suggests that Fe3O4-producing MTB are distributed more widely than Fe3S4-producing 230

MTB in present-day habitats. The coexistence of Fe3O4- and Fe3S4-type MGCs in the 231

same genomes of UBA10199 (nXX_bin1) and Omnitrophota (nS315_bin20 and 232

nS315_bin24) phyla has never been reported previously, which not only indicates an 233

unexpected phylogenetic diversity of Fe3S4-producing MTB, but, more importantly, 234

challenges the traditional hypothesis regarding the origin of Fe3S4-producing MTB 235

(discussed below). Thus, this study reveals that MTB in the domain Bacteria contain 236

many more MGCs than anticipated, which captures a more complete picture of the 237

genomic diversity of MTB. The diverse MGCs reconstructed here also represent a 238

promising new gene resource for magnetic bio-nanoparticles that can be used to modify 239

magnetosomal biogenesis pathways in MTB or even build new pathways in non-MTB 240

55. 241

242

Biogenesis of a magnetic organelle in the last Bacterial common ancestor (LBCA)? 243

A group of nine genes (mamABEIKMOPQ) was identified previously as the core 244

magnetosome gene set shared by both Fe3O4- and Fe3S4-producing MTB 61. The 245

products of these genes are thought to have important functions in magnetosomal 246

biomineralization and for magnetosome chain construction. To trace the evolutionary 247

history of magnetosomal biogenesis, we first performed a comparative genome analysis 248

of 83 representative high-quality MTB genomes (with >90% completeness and <5% 249

contamination) to identify core magnetosome genes that are defined such that >90% of 250

input genomes (i.e., ≥74 genomes) must contain these genes, which allows for missing 251

or fragmented genes due to the incomplete nature of draft genomes. Six magnetosome 252

genes (mamABIKMQ) meet these criteria. Among these genes, proteins encoded by 253

mamBIMQ are identified to be essential for magnetosomal biogenesis in 254

Magnetospirillum strains; deletion of these genes results in non-magnetotactic mutants 255

5,56,58. Although mamA and mamK are not essential for magnetic mineral formation in 256

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 10: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

10

Magnetospirillum species based on previous studies, the proteins they encoded are both 257

involved in fine-tuning the magnetic dipole moment of the cell and thus magnetotaxis: 258

MamA is responsible for magnetosome membrane assembly and MamK is involved in 259

biomineral chain formation 69–72. 260

We then examined the phylogeny of magnetosome proteins MamABKMQ across 261

available MTB genomes (Figure 5), with the exception of MamI due to its <60 aa 262

positions after alignment trimming. The resulting trees are congruent overall with the 263

genome-based phylogeny in that they are consistent with monophyly of the major phyla. 264

This provides good evidence that the current MGC distribution across major Bacterial 265

phyla is not due to extensive recent HGT events, but that it is due mainly to vertical 266

inheritance. Coupled with the sharing of a common set of magnetosome genes, this 267

strengthens the scenario of a single MGC emergence with later multiple independent 268

losses during Bacterial diversification. However, a few discrepancies must be noted, 269

such as separation between the UBA10199 phylum of nDJH6_bin20 and nXX_bin1 270

(Fe3O4-type), separate branching of nPCR_bin9 from the other Nitrospinota lineages, 271

clustering of the Fe3O4-type Planctomycetota and Omnitrophota phyla, and clustering 272

of Desulfobacterota_A and Desulfobacterota phyla in most protein trees, and clustering 273

of Nitrospinota (except nPCR_bin9) and SAR324 within the Proteobacteria in trees of 274

MamB and MamM (Figure 5 and Supplementary Figures 2 to 6). These discrepancies 275

could have resulted from ancient HGT events. Alternatively, considering that only 276

limited MTB genomes are available for the UBA10199, SAR324, Nitrospinota, and 277

Planctomycetota phyla, these groupings could also be an artefact of tree reconstruction 278

73. Additional MTB genomes from these phyla will help to differentiate between these 279

possibilities. 280

Initial discovery of both Fe3O4- and Fe3S4-type MGCs within the same genome in 281

the Desulfobacterota phylum (Deltaproteobacteria class in the NCBI taxonomy) led to 282

the proposal that genes for Fe3S4 biomineralization in magnetosomes originated in this 283

lineage 15. However, our finding of the coexistence of Fe3O4- and Fe3S4-type MGCs in 284

the genomes of UBA10199 and Omnitrophota phyla weakens this hypothesis. Fe3S4-285

type magnetosome proteins form a monophyletic group in all MamABMQ trees with 286

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 11: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

11

only the exception of Fe3S4-type MamA of nXX_bin1 that is clustered with Nitrospirota 287

Fe3O4-type counterparts with low bootstrap support (<75%, Supplementary Figure 2). 288

The structures of Fe3S4-type protein trees generally represent their phylum-level 289

taxonomic phylogenies (Figure 5 and Supplementary Figures 2 to 6), which suggests 290

that all Fe3S4-type magnetosome proteins likely shared a common ancestor and evolved 291

separately in each phylum. These findings suggest that Fe3S4-type MGCs arose before 292

the common ancestor of the Fe3S4-MTB containing phyla of Desulfobacterota, 293

Latescibacterota, Planctomycetota, UBA10199, and Omnitrophota. The Fe3S4-type 294

protein cluster is not associated robustly with any other Fe3O4-producing MTB lineages 295

in protein trees, which precludes determination of the evolutionary origin of Fe3S4-296

producing MTB. Considering that Fe3O4-type MGCs are the most widespread, with 297

identification in all major MTB lineages, it is likely that the Fe3O4-types are the 298

ancestral form of magnetosome biosynthesis and that Fe3S4-type MGCs arose from 299

Fe3O4-type MGCs through gene cluster duplication and divergence 15. However, both 300

Fe3O4- and Fe3S4-type MGCs originating from an ancient unknown MGC type (Fe3S4 301

or other iron-containing biominerals) is also plausible 37. 302

Magnetosomal biogenesis is now considered to have an ancient origin 15,37,41,74. 303

Discovery in this study of divergent MGCs across various Bacterial phyla expands 304

significantly the database of MTB genomes and allows development of a more 305

comprehensive scenario for the origin of magnetosomal biogenesis. The phylogenies 306

of core magnetosome proteins (Figure 5) strengthens the notion of an ancient MGC 307

origin, which then spread through a combination of vertical inheritance over geological 308

times followed by multiple independent losses, potential HGT events, and gene/cluster 309

duplications. This should have occurred from a more basal ancestor than previously 310

thought prior to the divergence of all 14 known MTB-containing phyla (Figure 3). 311

These phyla are scattered within the Bacterial tree of life and their common ancestor 312

can be traced to near the base of the tree (Supplementary Figure 7), which indicates 313

parsimoniously that close descendants of LBCA or the LBCA itself may have already 314

contained ancestors of magnetosome genes and were, thus, capable of biomineralizing 315

primitive magnetosomes. 316

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 12: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

12

Potential biogenesis of the magnetosome organelle in the LBCA has two major 317

implications. First, the LBCA could have already possessed a relatively complex 318

subcellular organization and, thus, was not as “primitive” as is usually imagined. Early 319

organisms on Earth are described typically as simple organisms that lacked complex 320

subcellular structures; however, a conceptual model of a complex LBCA or even a 321

complex last universal common ancestor (LUCA) has been proposed based on the 322

presence of eukaryote-like features present in some members of the Planctomycetes 323

75,76. Our results imply the formation of a magnetosome organelle in the LBCA, which 324

supports the idea of a relatively complex LBCA. On early Earth the lack of a protective 325

ozone layer resulted in higher harmful ultraviolet radiation than the present-day Earth, 326

which would have been one of major challenges for life in the surface and shallow-327

water conditions 77. The intrinsic enzyme-like properties of magnetosome iron 328

nanoparticles 78,79 and their potentially enhanced enzyme-like stability under a wide 329

range of temperatures and pH 80 might have helped life to cope with environmental 330

stresses on early Earth (e.g., detoxification of ultraviolet radiation and free-iron-331

generated ROS 12). No MGC has yet been found in the Archaea, which suggests that 332

magnetosomal biogenesis evolved after the divergence of Bacteria and Archaea. 333

Second, a magnetosome-forming LBCA implies that this feature may have been 334

inherited by a taxonomically wide group of Bacterial phyla, while other phyla lost this 335

trait during evolution. Thus, we argue that many MTB-containing lineages await 336

discovery. Future discovery of additional MTB affiliated within other Bacterial phyla, 337

especially those near the base of the Bacterial tree 81,82, will help to understand and 338

constrain the evolutionary origin of the magnetosome organelle. 339

In summary, we reconstruct 168 MTB genomes here, which expand substantially the 340

genomic representation of MTB and indicate a much more widespread distribution of 341

magnetosome organelle biogenesis across the domain Bacteria. Analysis of core 342

magnetosome proteins in the largest available taxonomic representation strengthens the 343

notion of an ancient origin for magnetosome organelle biogenesis, which may date back 344

to the LBCA. Genomes from this study will enable a better understanding of the biology 345

and biomineralization of MTB and offer clues to assist in cultivation of uncultured 346

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 13: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

13

MTB from different lineages. 347

348

Methods 349

Sample collection. A total of 53 sediment and soil samples were collected from a wide 350

range of natural environments across China and Australia (Figure 2a), including 13 351

sediment samples that have been previously described 37 (for details see Supplementary 352

Table 1). Each sample was examined for the presence of living MTB by light 353

microscopy using the hanging-drop method 83. MTB cells from sediments and peatland 354

soils were enriched magnetically using a “MTB trap” and enriched MTB cells were 355

then subjected to metagenomic analyses, the detailed procedures for which are 356

described elsewhere 37,41,84. 357

358

Metagenome assembly, population genome binning, and comparative genomic 359

analyses. Metagenomic DNA from each location was sequenced on Illumina HiSeq 360

2000, 2500, or 4000 platforms. Sequencing data were processed through a single-361

sample assembly and binning strategy using a MetaWRAP pipeline 85. The individual 362

metagenomic datasets were assembled separately de novo using metaSPAdes (version 363

3.13.0) 86 with default parameters. Assembled scaffolds ≥2000 bp were binned 364

separately using MetaBAT2 (version 2.12.1) 87, MaxBin2 (version 2.2.4) 88, and 365

CONCOCT 89. Results of three binning methods for each sample were refined using 366

MetaWRAP’s Bin_refinement and Reassemble_bins 85. Genome completeness and 367

contamination were estimated with CheckM 90 using the ‘lineage_wf’ workflow. Only 368

genomes with an estimated quality of >50 (defined as completeness − 5 × 369

contamination) 50 were retained. Statistics for each genome were obtained using 370

QUAST (version 4.2) 91, including the genome length, number of scaffolds, largest 371

scaffold, GC content, and N50. Resultant genomes were annotated using Prokka 372

(version 1.11) 92. Reconstructed genomes were checked manually for the presence of 373

MGCs, followed by extensive manual verification of candidate magnetosome genes 374

using NCBI PSI-BLAST 93. Notably, genome sequences with >99% ANI of previously 375

reconstructed MTB genomes 37 from the same samples were recovered here using a 376

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 14: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

14

different approach, which emphasizes the reproducibility of different genome-resolved 377

metagenomic approaches. Taxonomic annotation of all acquired MTB genomes was 378

performed using the Genome Taxonomy Database Toolkit GTDB-Tk 53 (version 0.3.2, 379

database Release 04-RS89) with the ‘classify_wf’ function and default parameters. The 380

genomes reconstructed here were combined with published MTB genomes 381

(Supplementary Table 3). These genomes were dereplicated at 95% ANI for species 51 382

delineation using dRep 94 with ‘-sa 0.95’. 383

To identify the core magnetosome gene set shared by MTB in their respective 384

genomes, only complete genomes and those with >90% completeness and <5% 385

contamination are considered. These genomes were dereplicated using dRep 94 with ‘-386

sa 0.99’ for dereplication at 99% ANI, and finally 83 high-quality representative MTB 387

genomes were selected. Most are draft genomes, so we define a core magnetosome 388

protein as being present in >90% of the input genomes (i.e., at least 74 of 83 genomes) 389

to minimize exclusion of potential core proteins due to the incomplete nature of draft 390

genomes. Core proteins were calculated with COGtriangles 95 using 391

GET_HOMOLOGUES 96 and were checked manually for magnetosome proteins. 392

MamABIKMQ protein sequences were then searched with hidden Markov Models 97 393

(HMM) across known MTB genomes. 394

395

Phylogenetic analyses. The phylogenetic tree composed of MTB genomes and those 396

of relatively closely related non-MTB was inferred from 120 concatenated Bacterial 397

single-copy marker proteins 50. Maximum-likelihood phylogeny was calculated using 398

IQ-TREE (version 1.6.9) 98 under the LG+I+G4 substitution model with 1000 ultrafast 399

bootstraps. The genome tree was rooted with the genome from the candidate phyla 400

radiation (accession number LCFW00000000). For each MamABKMQ protein, 401

sequences were aligned using MAFFT (version 7.407) 99 in ‘auto’ mode and filtered 402

using trimAL 100 with ‘-gappyout’ option. Maximum-likelihood phylogenetic protein 403

trees were then constructed using IQ-TREE under the TEST option for best model 404

selection with 1000 ultrafast bootstraps. Protein trees were rooted at the midpoint. All 405

trees were visualized using FigTree version 1.4.2 406

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 15: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

15

(http://tree.bio.ed.ac.uk/software/figtree/) and Interactive Tree Of Life (iTOL) v4 101. 407

AnnoTree 102 was used for phylogenomic visualization of the distribution of MTB-408

containing phyla across the Bacterial tree of life at the taxonomic phylum level. 409

410

Data availability. Genome sequences are deposited in the NCBI BioProject under 411

PRJNA400260 (accession numbers pending). 412

413

Acknowledgments 414

We thank Xianyu Huang from State Key Laboratory of Biogeology and Environmental 415

Geology, China University of Geosciences (Wuhan) and Zhiqi Zhang from Shennongjia 416

National Park Administration for their contributions to the collection of acidic peatland 417

soil samples. We thank Li Liu, Jia Liu, Runjia Ji, Yan Chen, and Yuan Fang for 418

assistance with fieldwork. We thank Patrick De Deckker for suggesting field sampling 419

sites and David Gordon and Samantha Burn for granting access to laboratory and 420

materials at the Australian National University that enabled MTB extraction after our 421

Australian fieldwork. This work was supported by the National Natural Science 422

Foundation of China (NSFC) Grants 41621004 and 41822704, the Youth Innovation 423

Promotion Association of the Chinese Academy of Sciences, the Natural Environment 424

Research Council Independent Research Fellowship NE/P017266/1, and the Australian 425

Research Council Grant DP140104544. 426

427

Author contributions 428

W.L. and Y.X.P. conceived and planned the study. W.L., W.S.Z., G.A.P., X.Z., A.P.R., 429

and Y.X.P. performed field sampling. W.L. and W.S.Z. enriched environmental 430

magnetotactic bacteria and analyzed the data. W.L. wrote the manuscript with input 431

from G.A.P., Q.Y.Z., R.K., D.A.B., A.P.R., and Y.X.P All authors approved the final 432

manuscript. 433

434

Competing interests 435

The authors declare no competing interests. 436

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 16: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

16

References 437 1. Yeates, T. O., Crowley, C. S. & Tanaka, S. Bacterial microcompartment organelles: protein 438

shell structure and evolution. Annu. Rev. Biophys. 39, 185–205 (2010). 439 2. Grant, C. R., Wan, J. & Komeili, A. Organelle formation in Bacteria and Archaea. Annu. Rev. 440

Cell Dev. Biol. 34, 217–238 (2018). 441 3. Saier, M. H. & Bogdanov, M. V. Membranous organelles in Bacteria. J. Mol. Microbiol. 442

Biotechnol. 23, 5–12 (2013). 443 4. Lower, B. H. & Bazylinski, D. A. The bacterial magnetosome: a unique prokaryotic organelle. 444

J. Mol. Microbiol. Biotechnol. 23, 63–80 (2013). 445 5. Uebe, R. & Schüler, D. Magnetosome biogenesis in magnetotactic bacteria. Nat. Rev. 446

Microbiol. 14, 621–637 (2016). 447 6. McCausland, H. C. & Komeili, A. Magnetic genes: studying the genetics of biomineralization 448

in magnetotactic bacteria. PLOS Genet. 16, e1008499 (2020). 449 7. Frankel, R. B. & Blakemore, R. P. Magnetite and magnetotaxis in microorganisms. 450

Bioelectromagnetics 10, 223–237 (1989). 451 8. Vali, H. & Kirschvink, J. L. Observations of magnetosome organization, surface structure, and 452

iron biomineralization of undescribed magnetotactic bacteria: evolutionary speculations. in 453 Iron Biominerals (eds. Frankel, R. B. & Blakemore, R. P.) 97–115 (Plenum Press, 1990). 454

9. Simmons, S. L. & Edwards, K. J. Geobiology of magnetotactic bacteria. in Magnetoreception 455 and Magnetosomes in Bacteria (ed. Schüler, D.) 77–102 (Springer, 2007). 456

10. Kopp, R. E. & Kirschvink, J. L. The identification and biogeochemical interpretation of fossil 457 magnetotactic bacteria. Earth-Sci. Rev. 86, 42–61 (2008). 458

11. Kirschvink, J. L., Walker, M. M. & Diebel, C. E. Magnetite-based magnetoreception. Curr. 459 Opin. Neurobiol. 11, 462–467 (2001). 460

12. Lin, W., Kirschvink, J. L., Paterson, G. A., Bazylinski, D. A. & Pan, Y. On the origin of 461 microbial magnetoreception. Natl. Sci. Rev. 7, 472–479 (2020). 462

13. Monteil, C. L. et al. Ectosymbiotic bacteria at the origin of magnetoreception in a marine 463 protist. Nat. Microbiol. 4, 1088–1095 (2019). 464

14. Monteil, C. L. & Lefèvre, C. T. Magnetoreception in microorganisms. Trends Microbiol. 28, 465 266–275 (2020). 466

15. Lefèvre, C. T. & Bazylinski, D. A. Ecology, diversity, and evolution of magnetotactic bacteria. 467 Microbiol. Mol. Biol. Rev. 77, 497–526 (2013). 468

16. Bazylinski, D. A. & Lefèvre, C. T. Magnetotactic bacteria from extreme environments. Life 3, 469 295–307 (2013). 470

17. Lin, W., Pan, Y. & Bazylinski, D. A. Diversity and ecology of and biomineralization by 471 magnetotactic bacteria. Environ. Microbiol. Rep. 9, 345–356 (2017). 472

18. Fassbinder, J. W. E., Stanjek, H. & Vali, H. Occurrence of magnetic bacteria in soil. Nature 473 343, 161–162 (1990). 474

19. Fassbinder, J. W. E. & Stanjek, H. Magnetic properties of biogenic soil greigite (Fe3S4). 475 Geophys. Res. Lett. 21, 2349–2352 (1994). 476

20. Lin, W., Bazylinski, D. A., Xiao, T., Wu, L.-F. & Pan, Y. Life with compass: diversity and 477 biogeography of magnetotactic bacteria. Environ. Microbiol. 16, 2646–2658 (2014). 478

21. Amor, M., Tharaud, M., Gélabert, A. & Komeili, A. Single-cell determination of iron content 479 in magnetotactic bacteria: implications for the iron biogeochemical cycle. Environ. Microbiol. 480

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 17: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

17

22, 823–831 (2020). 481 22. Schulz-Vogt, H. N. et al. Effect of large magnetotactic bacteria with polyphosphate inclusions 482

on the phosphate profile of the suboxic zone in the Black Sea. ISME J. 13, 1198–1208 (2019). 483 23. Rivas-Lamelo, S. et al. Magnetotactic bacteria as a new model for P sequestration in the 484

ferruginous Lake Pavin. Geochemical Perspect. Lett. 35–41 (2017). 485 doi:10.7185/geochemlet.1743 486

24. Cox, B. L. et al. Organization and elemental analysis of P-, S-, and Fe-rich inclusions in a 487 population of freshwater magnetococci. Geomicrobiol. J. 19, 387–406 (2002). 488

25. Amann, R., Peplies, J. & Schüler, D. Diversity and taxonomy of magnetotactic bacteria. in 489 Magnetoreception and Magnetosomes in Bacteria (ed. Schüler, D.) 25–36 (Springer, 2007). 490

26. DeLong, E. F., Frankel, R. B. & Bazylinski, D. A. Multiple evolutionary origins of 491 magnetotaxis in bacteria. Science. 259, 803–806 (1993). 492

27. Jogler, C. et al. Comparative analysis of magnetosome gene clusters in magnetotactic bacteria 493 provides further evidence for horizontal gene transfer. Environ. Microbiol. 11, 1267–1277 494 (2009). 495

28. Jogler, C. & Schüler, D. Genomics, genetics, and cell biology of magnetosome formation. 496 Annu. Rev. Microbiol. 63, 501–521 (2009). 497

29. Morillo, V. et al. Isolation, cultivation and genomic analysis of magnetosome 498 biomineralization genes of a new genus of South-seeking magnetotactic cocci within the 499 Alphaproteobacteria. Front. Microbiol. 5, 72 (2014). 500

30. Monteil, C. L. et al. Genomic study of a novel magnetotactic Alphaproteobacteria uncovers the 501 multiple ancestry of magnetotaxis. Environ. Microbiol. 20, 4415–4430 (2018). 502

31. Du, H. et al. Magnetosome gene duplication as an important driver in the evolution of 503 magnetotaxis in the Alphaproteobacteria. mSystems 4, e00315-19 (2019). 504

32. Geurink, C. et al. Complete genome sequence of strain BW-2, a magnetotactic 505 Gammaproteobacterium in the family Ectothiorhodospiraceae, isolated from a brackish spring 506 in Death Valley, California. Microbiol. Resour. Announc. 9, e01144-19 (2020). 507

33. Abreu, F. et al. Culture-independent characterization of a novel magnetotactic member 508 affiliated to the Beta class of the Proteobacteria phylum from an acidic lagoon. Environ. 509 Microbiol. 20, 2615–2624 (2018). 510

34. Li, J. et al. Phylogenetic and structural identification of a novel magnetotactic 511 Deltaproteobacteria strain, WYHR-1, from a freshwater lake. Appl. Environ. Microbiol. 85, 512 e00731-19 (2019). 513

35. Kolinko, S. et al. Single-cell analysis reveals a novel uncultivated magnetotactic bacterium 514 within the candidate division OP3. Environ. Microbiol. 14, 1709–1721 (2012). 515

36. Lin, W. & Pan, Y. A putative greigite type magnetosome gene cluster from the candidate 516 phylum Latescibacteria. Environ. Microbiol. Rep. 7, 237–242 (2015). 517

37. Lin, W. et al. Genomic expansion of magnetotactic bacteria reveals an early common origin of 518 magnetotaxis with lineage-specific evolution. ISME J. 12, 1508–1519 (2018). 519

38. Koziaeva, V. et al. Genome-based metabolic reconstruction of a novel uncultivated freshwater 520 magnetotactic coccus “Ca. Magnetaquicoccus inordinatus” UR-1, and proposal of a candidate 521 family “Ca. Magnetaquicoccaceae”. Front. Microbiol. 10, 2290 (2019). 522

39. Abreu, F. et al. Common ancestry of iron oxide- and iron-sulfide-based biomineralization in 523 magnetotactic bacteria. ISME J. 5, 1634–1640 (2011). 524

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 18: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

18

40. Lefèvre, C. T. et al. Monophyletic origin of magnetotaxis and the first magnetosomes. Environ. 525 Microbiol. 15, 2267–2274 (2013). 526

41. Lin, W. et al. Origin of microbial biomineralization and magnetotaxis during the Archean. 527 Proc. Natl. Acad. Sci. U. S. A. 114, 2171–2176 (2017). 528

42. Huang, X. et al. Response of carbon cycle to drier conditions in the mid-Holocene in central 529 China. Nat. Commun. 9, 1369 (2018). 530

43. Tian, W., Wang, H., Xiang, X., Wang, R. & Xu, Y. Structural variations of bacterial 531 community driven by sphagnum microhabitat differentiation in a subalpine peatland. Front. 532 Microbiol. 10, 1661 (2019). 533

44. Zhao, M., Zhang, Y., Zhang, Z. & Huang, X. Comparison of microbial community in topsoil 534 among different habitats in Dajiuhu, Hubei Province: evidence from phospholipid fatty acids 535 (in Chinese with English abstract). Earth Sci. in press, (2020). 536

45. Nash, C. Mechanisms and evolution of magnetotactic bacteria. California Institute of 537 Technology (California Institute of Technology, 2008). 538

46. Lefèvre, C. T. et al. Moderately thermophilic magnetotactic bacteria from hot springs in 539 Nevada. Appl. Environ. Microbiol. 76, 3740–3743 (2010). 540

47. Lefèvre, C. T., Frankel, R. B., Posfai, M., Prozorov, T. & Bazylinski, D. A. Isolation of 541 obligately alkaliphilic magnetotactic bacteria from extremely alkaline environments. Environ. 542 Microbiol. 13, 2342–2350 (2011). 543

48. Aliaga Goltsman, D. S., Comolli, L. R., Thomas, B. C. & Banfield, J. F. Community 544 transcriptomics reveals unexpected high microbial diversity in acidophilic biofilm 545 communities. Isme J 9, 1014–1023 (2015). 546

49. Liu, J. et al. Bacterial community structure and novel species of magnetotactic bacteria in 547 sediments from a seamount in the Mariana volcanic arc. Sci. Rep. 7, 17964 (2017). 548

50. Parks, D. H. et al. Recovery of nearly 8,000 metagenome-assembled genomes substantially 549 expands the tree of life. Nat. Microbiol. 2, 1533–1542 (2017). 550

51. Klappenbach, J. A. et al. DNA–DNA hybridization values and their relationship to whole-551 genome sequence similarities. Int. J. Syst. Evol. Microbiol. 57, 81–91 (2007). 552

52. Parks, D. H. et al. A standardized bacterial taxonomy based on genome phylogeny 553 substantially revises the tree of life. Nat. Biotechnol. 36, 996–1004 (2018). 554

53. Chaumeil, P.-A., Mussig, A. J., Hugenholtz, P. & Parks, D. H. GTDB-Tk: a toolkit to classify 555 genomes with the Genome Taxonomy Database. Bioinformatics 1–3 (2019). 556 doi:10.1093/bioinformatics/btz848 557

54. Grünberg, K., Wawer, C., Tebo, B. M. & Schüler, D. A large gene cluster encoding several 558 magnetosome proteins is conserved in different species of magnetotactic bacteria. Appl. 559 Environ. Microbiol. 67, 4573–4582 (2001). 560

55. Kolinko, I. et al. Biosynthesis of magnetic nanostructures in a foreign organism by transfer of 561 bacterial magnetosome gene clusters. Nat. Nanotechnol. 9, 193–197 (2014). 562

56. Murat, D., Quinlan, A., Vali, H. & Komeili, A. Comprehensive genetic dissection of the 563 magnetosome gene island reveals the step-wise assembly of a prokaryotic organelle. Proc. 564 Natl. Acad. Sci. U. S. A. 107, 5593–5598 (2010). 565

57. Lohße, A. et al. Functional analysis of the magnetosome island in Magnetospirillum 566 gryphiswaldense: the mamAB operon is sufficient for magnetite biomineralization. PLoS One 567 6, e25561 (2011). 568

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 19: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

19

58. Lohße, A. et al. Genetic dissection of the mamAB and mms6 operons reveals a gene set 569 essential for magnetosome biogenesis in Magnetospirillum gryphiswaldense. J. Bacteriol. 196, 570 2658–2669 (2014). 571

59. Rong, C. et al. Ferrous iron transport protein B gene (feoB1) plays an accessory role in 572 magnetosome formation in Magnetospirillum gryphiswaldense strain MSR-1. Res. Microbiol. 573 159, 530–536 (2008). 574

60. Rong, C. et al. FeoB2 functions in magnetosome formation and oxidative stress protection in 575 Magnetospirillum gryphiswaldense strain MSR-1. J. Bacteriol. 194, 3972–3976 (2012). 576

61. Lefèvre, C. T. et al. Comparative genomic analysis of magnetotactic bacteria from the 577 Deltaproteobacteria provides new insights into magnetite and greigite magnetosome genes 578 required for magnetotaxis. Environ. Microbiol. 15, 2712–2735 (2013). 579

62. Rahn-Lee, L. et al. A genetic strategy for probing the functional diversity of magnetosome 580 formation. Plos Genet. 11, e1004811 (2015). 581

63. Lin, W. et al. Genomic insights into the uncultured genus ‘Candidatus Magnetobacterium’ in 582 the phylum Nitrospirae. ISME J. 8, 2463–2477 (2014). 583

64. Murat, D. et al. The magnetosome membrane protein, MmsF, is a major regulator of magnetite 584 biomineralization in Magnetospirillum magneticum AMB-1. Mol Microbiol 85, 684–699 585 (2012). 586

65. Tanaka, M., Mazuyama, E., Arakaki, A. & Matsunaga, T. Mms6 protein regulates crystal 587 morphology during nano-sized magnetite biomineralization in vivo. J. Biol. Chem. 286, 6386–588 6392 (2011). 589

66. Bazylinski, D. A., Heywood, B. R., Mann, S. & Frankel, R. B. Fe3O4 and Fe3S4 in a bacterium. 590 Nature 366, 218 (1993). 591

67. Lefèvre, C. T. et al. A cultured greigite-producing magnetotactic bacterium in a novel group of 592 sulfate-reducing bacteria. Science. 334, 1720–1723 (2011). 593

68. Abreu, F. et al. Deciphering unusual uncultured magnetotactic multicellular prokaryotes 594 through genomics. Isme J 8, 1055–1068 (2014). 595

69. Komeili, A., Li, Z., Newman, D. K. & Jensen, G. J. Magnetosomes are cell membrane 596 invaginations organized by the actin-like protein MamK. Science. 311, 242–245 (2006). 597

70. Zeytuni, N. et al. Self-recognition mechanism of MamA, a magnetosome-associated TPR-598 containing protein, promotes complex assembly. Proc. Natl. Acad. Sci. 108, E480–E487 599 (2011). 600

71. Katzmann, E., Scheffel, A., Gruska, M., Plitzko, J. M. & Schüler, D. Loss of the actin-like 601 protein MamK has pleiotropic effects on magnetosome formation and chain assembly in 602 Magnetospirillum gryphiswaldense. Mol. Microbiol. 77, 208–224 (2010). 603

72. Komeili, A., Vali, H., Beveridge, T. J. & Newman, D. K. Magnetosome vesicles are present 604 before magnetite formation, and MamA is required for their activation. Proc. Natl. Acad. Sci. 605 U. S. A. 101, 3839–3844 (2004). 606

73. Yang, Z. & Rannala, B. Molecular phylogenetics: principles and practice. Nat. Rev. Genet. 13, 607 303–314 (2012). 608

74. Zeytuni, N. et al. MamA as a model protein for structure-based insight into the evolutionary 609 origins of magnetotactic bacteria. PLoS One 10, e0130394 (2015). 610

75. Forterre, P. A new fusion hypothesis for the origin of Eukarya: better than previous ones, but 611 probably also wrong. Res. Microbiol. 162, 77–91 (2011). 612

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 20: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

20

76. Fuerst, J. Keys to eukaryality: Planctomycetes and ancestral evolution of cellular complexity. 613 Front. Microbiol. 3, 167 (2012). 614

77. Cnossen, I. et al. Habitat of early life: Solar X-ray and UV radiation at Earth’s surface 4–3.5 615 billion years ago. J. Geophys. Res. 112, E02008 (2007). 616

78. Guo, F. F. et al. Magnetosomes eliminate intracellular reactive oxygen species in 617 Magnetospirillum gryphiswaldense MSR-1. Environ. Microbiol. 14, 1722–1729 (2012). 618

79. Li, K. et al. Light irradiation helps magnetotactic bacteria eliminate intracellular reactive 619 oxygen species. Environ. Microbiol. 19, 3638–3648 (2017). 620

80. Gao, L., Fan, K. & Yan, X. Iron oxide nanozyme: a multifunctional enzyme mimetic for 621 biomedical applications. Theranostics 7, 3207–3227 (2017). 622

81. Brown, C. T. et al. Unusual biology across a group comprising more than 15% of domain 623 Bacteria. Nature 523, 208–211 (2015). 624

82. Zhu, Q. et al. Phylogenomics of 10,575 genomes reveals evolutionary proximity between 625 domains Bacteria and Archaea. Nat. Commun. 10, 5477 (2019). 626

83. Greenberg, M., Canter, K., Mahler, I. & Tornheim, A. Observation of magnetoreceptive 627 behavior in a multicellular magnetotactic prokaryote in higher than geomagnetic fields. 628 Biophys. J. 88, 1496–1499 (2005). 629

84. Jogler, C. et al. Towards cloning the magnetotactic metagenome: identification of 630 magnetosome island gene clusters in uncultivated magnetotactic bacteria from different aquatic 631 sediments. Appl. Environ. Microbiol. 75, 3972–3979 (2009). 632

85. Uritskiy, G. V, DiRuggiero, J. & Taylor, J. MetaWRAP—a flexible pipeline for genome-633 resolved metagenomic data analysis. Microbiome 6, 158 (2018). 634

86. Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile 635 metagenomic assembler. Genome Res. 27, 824–834 (2017). 636

87. Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome 637 reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019). 638

88. Wu, Y.-W., Simmons, B. A. & Singer, S. W. MaxBin 2.0: an automated binning algorithm to 639 recover genomes from multiple metagenomic datasets. Bioinformatics 32, 605–607 (2016). 640

89. Alneberg, J. et al. Binning metagenomic contigs by coverage and composition. Nat. Methods 641 11, 1144–1146 (2014). 642

90. Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: 643 assessing the quality of microbial genomes recovered from isolates, single cells, and 644 metagenomes. Genome Res. 25, 1043–1055 (2015). 645

91. Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for 646 genome assemblies. Bioinformatics 29, 1072–1075 (2013). 647

92. Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 648 (2014). 649

93. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database 650 search programs. Nucleic Acids Res. 25, 3389–3402 (1997). 651

94. Olm, M. R., Brown, C. T., Brooks, B. & Banfield, J. F. dRep: a tool for fast and accurate 652 genomic comparisons that enables improved genome recovery from metagenomes through de-653 replication. ISME J. 11, 2864–2868 (2017). 654

95. Kristensen, D. M. et al. A low-polynomial algorithm for assembling clusters of orthologous 655 groups from intergenomic symmetric best matches. Bioinformatics 26, 1481–1487 (2010). 656

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 21: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

21

96. Contreras-Moreira, B. & Vinuesa, P. GET_HOMOLOGUES, a versatile software package for 657 scalable and robust microbial pangenome analysis. Appl. Environ. Microbiol. 79, 7696–7701 658 (2013). 659

97. Johnson, L. S., Eddy, S. R. & Portugaly, E. Hidden Markov model speed heuristic and iterative 660 HMM search procedure. BMC Bioinformatics 11, 431 (2010). 661

98. Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective 662 stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–663 274 (2015). 664

99. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: 665 improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013). 666

100. Capella-Gutierrez, S., Silla-Martinez, J. M. & Gabaldon, T. trimAl: a tool for automated 667 alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 668 (2009). 669

101. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v4: recent updates and new 670 developments. Nucleic Acids Res. 47, W256–W259 (2019). 671

102. Mendler, K. et al. AnnoTree: visualization and exploration of a functionally annotated 672 microbial tree of life. Nucleic Acids Res. 47, 4442–4448 (2019). 673

103. Blakemore, R. P. Magnetotactic bacteria. Science. 190, 377–379 (1975). 674 104. Frankel, R. B., Bazylinski, D. A., Johnson, M. S. & Taylor, B. L. Magneto-aerotaxis in marine 675

coccoid bacteria. Biophys. J. 73, 994–1000 (1997). 676 677

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 22: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

22

678

Figure 1. The magnetosome in which magnetotactic bacteria (MTB) biomineralize 679

magnetic crystals is a typical example of a bacterial organelle. Membrane-bounded 680

magnetosomes contain intracellular magnetic nanoparticles (Fe3O4 or Fe3S4), with 681

typical ~20-150 nm sizes. Magnetic particles within MTB magnetosomes are typically 682

organized into (a) chain-like structure(s) within the cell in order to optimize the cellular 683

magnetic dipole moment. Functions of magnetosomes include magnetoreception 103,104 684

and ROS detoxification 78,79, both of which have been experimentally proven. 685

Additional proposed functions, such as iron storage and sequestration, acting as an 686

electrochemical battery or a gravity sensor, need further testing. 687

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 23: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

23

688

Figure 2. Recovery of 168 MTB genomes from various environments. a. Map of 689

sampling locations (generated using the GeoMapApp 3.6.0, 690

http://www.geomapapp.org/). Further site details are given in Supplementary Table 1. 691

b. Estimated completeness and contamination of MTB genomes reconstructed in this 692

study. CheckM was used to estimate completeness and contamination. Of these 693

genomes, 69 are high-quality (>90% completeness and <5% contamination), 64 are 694

medium-quality (>70% completeness and <10% contamination) and 35 are partial (50-695

70% completeness and <5% contamination) genomes. c. Relative abundance of 696

recovered MTB genomes that can be classified according to the GTDB taxonomy 697

(database Release 04-RS89). Of the 168 recovered genomes, 34 were classified at 698

species level, 91 were classified at genus level, 140 were classified at family level, 160 699

were classified at order lever, and 168 could be classified at class and phylum levels. 700

Details are given in Supplementary Table 2. 701

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 24: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

24

702

Figure 3. Distribution of MTB genomes across Bacterial phyla and distinct 703

environments. a. The maximum-likelihood phylogenomic tree of MTB genomes and 704

their close non-MTB relatives inferred from concatenated 120 Bacterial single-copy 705

marker proteins 50, which was constructed using IQ-TREE under the LG+I+G4 706

substitution model. The number in each clade refers to the number of MTB genomes 707

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 25: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

25

reconstructed in this study. The complete tree is shown in Supplementary Figure 1. b. 708

Relative abundances of reconstructed MTB genomes in this study across different 709

environments within each lineage. c. Distribution of magnetosome genes (mam, mms, 710

mad, and man) and feoB gene within MGCs across different lineages. d. Distribution 711

of acquired MTB genomes at the phylum level across different environments, including 712

freshwater/marginal (<1 ppt) and saline/brackish/marine (>1 ppt) sediments, and soils 713

from acidic peatland. Details are given in Supplementary Table 1. 714

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 26: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

26

715

Figure 4. Representative magnetosome gene clusters (MGCs) from distinct MTB 716

lineages recovered in this study. Genomes containing Fe3S4-type MGCs are 717

highlighted with # and putative Fe3S4-type magnetosome genes in MGCs are denoted 718

by *. 719

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 27: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

27

720

Figure 5. Maximum-likelihood trees of core magnetosome proteins. Trees were 721

inferred using the MamABKMQ found in available MTB genomes. Complete trees are 722

shown in Supplementary Figures 2 to 6. 723

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint

Page 28: Expanding magnetic organelle biogenesis in the domain Bacteria · 55 of Bacteria and Archaea, posing problems for elucidating the evolutionary history of 56 cellular complexity. 57

28

Supplementary materials 724

725

Supplementary Figure 1. Maximum likelihood phylogenomic tree of MTB genomes 726

and their close non-MTB relatives. 727

Supplementary Figure 2. Maximum-likelihood tree of magnetosome protein MamA. 728

Supplementary Figure 3. Maximum-likelihood tree of magnetosome protein MamB. 729

Supplementary Figure 4. Maximum-likelihood tree of magnetosome protein MamK. 730

Supplementary Figure 5. Maximum-likelihood tree of magnetosome protein MamM. 731

Supplementary Figure 6. Maximum-likelihood tree of magnetosome protein MamQ. 732

Supplementary Figure 7. Phylogenetic distribution of MTB-containing phyla across 733

the Bacterial tree of life. The phylum level Bacterial tree of life with MTB-containing 734

phyla highlighted in blue. The Bacterial tree was made using the AnnoTree server. 735

736

Supplementary Table 1. Summary of sampled sites. 737

Supplementary Table 2. General characteristics of the 168 MTB genomes reported in 738

this study. Genome completeness and contamination were estimated using CheckM and 739

genome statistics were obtained using QUAST (version 4.2). Genome quality was 740

defined as (completeness - 5 × contamination). 741

Supplementary Table 3. Previously published MTB genomes included in this study. 742

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted April 28, 2020. . https://doi.org/10.1101/2020.04.27.061960doi: bioRxiv preprint