a broad genomic survey reveals multiple origins and frequent...

21
1 A broad genomic survey reveals multiple origins and frequent losses in the evolution of respiratory hemerythrins and hemocyanins José M. Martín-Durán §1* , Alex de Mendoza §2,3 , Arnau Sebé-Pedrós §2,3 , Iñaki Ruiz-Trillo 2,3,4 , Andreas Hejnol 1 1 Sars International Centre for Marine Molecular Biology University of Bergen Thormøhlensgate 55 5008 Bergen, Norway 2 Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra) Passeig Marítim de la Barceloneta, 37–49 08003 Barcelona, Spain 3 Departament de Genètica, Universitat de Barcelona Av. Diagonal 645 08028 Barcelona, Spain 4 Institució Catalana de Recerca i Estudis Avançats (ICREA) Passeig Lluís Companys, 23 08010 Barcelona, Spain § These authors contributed equally to this work. *Corresponding author: José M. Martín-Durán. Sars International Centre for Marine Molecular Biology. University of Bergen, Bergen, Norway. Phone +47-55584291. Fax +47-55584305. [email protected] © The Author(s) 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] Genome Biology and Evolution Advance Access published July 9, 2013 doi:10.1093/gbe/evt102 by guest on July 15, 2013 http://gbe.oxfordjournals.org/ Downloaded from

Upload: others

Post on 22-Jan-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

1

A broad genomic survey reveals multiple origins and frequent losses in the

evolution of respiratory hemerythrins and hemocyanins

José M. Martín-Durán§1*, Alex de Mendoza§2,3, Arnau Sebé-Pedrós§2,3, Iñaki Ruiz-Trillo2,3,4,

Andreas Hejnol1

1Sars International Centre for Marine Molecular Biology

University of Bergen

Thormøhlensgate 55

5008 Bergen, Norway

2Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra)

Passeig Marítim de la Barceloneta, 37–49

08003 Barcelona, Spain

3Departament de Genètica, Universitat de Barcelona

Av. Diagonal 645

08028 Barcelona, Spain

4Institució Catalana de Recerca i Estudis Avançats (ICREA)

Passeig Lluís Companys, 23

08010 Barcelona, Spain

§These authors contributed equally to this work.

*Corresponding author:

José M. Martín-Durán. Sars International Centre for Marine Molecular Biology. University

of Bergen, Bergen, Norway. Phone +47-55584291. Fax +47-55584305. [email protected]

© The Author(s) 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

Genome Biology and Evolution Advance Access published July 9, 2013 doi:10.1093/gbe/evt102 by guest on July 15, 2013

http://gbe.oxfordjournals.org/D

ownloaded from

Page 2: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

2

Abstract

Hemerythrins and hemocyanins are respiratory proteins present in some of the most ecologically

diverse animal lineages; however the precise evolutionary history of their enzymatic domains

(hemerythrin, hemocyanin M and tyrosinase) is still not well understood. We survey a wide dataset

of prokaryote and eukaryote genomes and RNAseq data to reconstruct the phylogenetic origins of

these protiens. We identify new species with hemerythrin, hemocyanin M and tyrosinase domains in

their genomes, particularly within animals, and demonstrate that the current distribution of

respiratory proteins is due to several events of lateral gene transfer and/or massive gene loss. We

conclude that the last common metazoan ancestor had at least two hemerythrin domains, one

hemocyanin M domain and six tyrosinase domains. The patchy distribution of these proteins among

animal lineages can be partially explained by physiological adaptations, making these genes good

targets for investigations into the interplay between genomic evolution and physiological

constraints.

Keywords: comparative genomics, hemerythrin, hemocyanin, tyrosinase, respiration, lateral gene

transfer.

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 3: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

3

Text

Hemoglobins, hemerythrins and hemocyanins are three different respiratory proteins present

in animals (Terwilliger 1998). Hemoglobins have a Fe–protoporphyrin ring to reversibly bind

oxygen, and are the most common molecules for oxygen transport and storage in the Bilateria

(Weber and Vinogradov 2001). Globin proteins are widespread in the tree of life and, in animals,

respiratory globins likely evolved from a membrane bound ancestor that acquired a respiratory

function independently in different lineages (Blank and Burmester 2012; Roesner, et al. 2005). In

contrast to the widespread hemoglobins, hemerythrins and hemocyanins have been detected in

fewer animal groups. Hemerythrins transport oxygen using two Fe2+ ions that bind directly to the

polypeptide chain, and have been described in a cnidarian (Nematostella vectensis), priapulids,

brachiopods, some annelids and sipunculans (Bailly, et al. 2008; Terwilliger 1998). Recently, it has

been shown that regulation of iron homeostasis in vertebrates involves an E3 ubiquitin ligase

(FBXL5 gene) with an iron-responsive hemerythrin domain in its structure (Salahudeen, et al. 2009;

Vashisht, et al. 2009), although it is not clear how this hemerythrin domain-containing protein is

related to invertebrate respiratory hemerythrins. Hemocyanins are large proteins that have copper

binding sites to transport oxygen in arthropods and mollusks (Bonaventura and Bonaventura 1980).

Despite their shared name, arthropod and molluscan respiratory hemocyanins are considered to

have evolved independently from a common ancestral copper protein based on protein similarities

(Burmester 2001; van Holde, et al. 2001). A deeper understanding of the evolutionary history of

hemerythrins and hemocyanins, and thus of the different respiratory strategies in animals, is limited

by the absence of data for many invertebrate groups and, in particular, for unicellular holozoans and

other eukaryotes. In this study, we survey a wide phylogenetic distribution set of genomes and

RNAseq data to identify new hemerythrin and hemocyanin proteins, and we then reconstruct their

evolution within the eukaryote tree of life, with particular focus on animal lineages.

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 4: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

4

In addition to the respiratory hemerythrin sequences previously characterized in animals

(Bailly, et al. 2008; Meyer and Lieb 2010; Vanin, et al. 2006), we identified respiratory

hemerythrins in the priapulid Priapulus caudatus, the arthropod Calanus finmarchicus, and the

bryozoans Alcyonidium diaphanum and Membranipora membranacea (see Supplementary File S1).

In bryozoans, no respiratory proteins have been previously described, despite the presence of a

circulatory system in these animals (Schmidt-Rhaesa 2007). Differently from other animal

hemerythrins, the hemerythrin domain shows a Ca2+-binding EF-hand domain in its N-terminal

region in both bryozoan species. Additionally, we identified an E3 ubiquitin ligase containing an F-

box domain together with a hemerythrin domain, as in FBXL5, in cnidarians and across bilaterally

symmetrical animals (see Supplementary Table S1), suggesting an ancient origin of the iron-sensing

system described in vertebrates. Our phylogenetic analyses show two major clades of hemerythrin-

containing proteins (Figure 1), one comprising the metazoan FBXL5 gene and most of the eukaryote

hemerythrins (clade A), and the other comprising the metazoan respiratory hemerythrins (including

the newly identified sequences from this study) (clade B). As shown in a previous report (Bailly, et

al. 2008), respiratory hemerythrins are closely related to some Naegleria gruberi hemerythrins, and

a sequence from the amoebozoan Acanthamoeba castellanii. To eliminate possible bacterial

contamination, we checked the gene structure and confirmed that the Acanthamoeba gene has

introns within the hemerythrin domain. The two major clades (A, non-respiratory, and B,

respiratory) are separated with high nodal support, which is in agreement with observed structural

differences (Histidine 74 being only present in clade B) (Thompson, et al. 2012). Interestingly,

clade B hemerythrins seem to be more common and highly diversified in prokaryotes (French, et al.

2008) than in eukaryotes. Metazoans may have acquired them during an ancient event of lateral

gene transfer (LGT), but, according to our phylogeny, it is more likely that clade B hemerythrins are

ancient and have been lost in many eukaryotic lineages, so far only present in three extant distant

lineages: Amoebozoa, Excavata and Metazoa. Clade B prokaryote hemerythrins have been shown

to bind oxygen as metazoan respiratory hemerythrins (Xiong, et al. 2000), but have been mostly

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 5: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

5

related to oxygen sensing and aerotaxis processes (Isaza, et al. 2006; Xiong, et al. 2000) and oxygen

supply to other metabolic enzymes (Karlsen, et al. 2005). This suggests that the oxygen storage and

transport function of hemerythrins may have evolved independently in metazoans, given the lack of

functional data for the Excavata and the Amoebozoa. Under this scenario, the role of respiratory

hemerythrins in iron storage, metal detoxification and immunity observed in some annelids (e.g. the

leeches Theromyzon tessulatum and Hirudo medicinalis and the polychaete Neanthes diversicolor)

(Baert, et al. 1992; Demuynck, et al. 1993; Vergote, et al. 2004) are secondary specializations of this

type of proteins. Finally, non-respiratory hemerythrins (clade A) are quite common in eukaryotes,

and have recruited several companion domains in different lineages.

Searches for the hemocyanin M domain (arthropod hemocyanins) identified this copper

binding protein in amoebozoans, the fungus Aspergilus niger, the sponge Amphimedon

queenslandica, the ctenophore Mnemiopsis leidyi and the hemichordate Saccoglossus kowalevski

(Figure 2). Therefore it is likely to be a unikont synapomorphy. Our phylogenetic analyses show the

monophyly of all metazoan sequences, as well as of the main arthropod protein families (see

Supplementary Fig. S1). The relationship of the sponge and fungal sequences may be due to an

ancient LGT, though the presence of hemocyanins in the more distant amoeobozoans does not

support that idea. Moreover the A. niger gene has a N-terminal intron and is located between two

other fungal genes, making it less likely to come from a recently incorporated segment of metazoan

DNA. Furthermore, a pseudo-gene with a hemocyanin M domain is present in the fungus

Neosartorya fischeri (a congeneric species despite the name), but absent from all 6 other

Aspergillus genomes and also from all the other fungi sequenced to date. The newly identified

sequences in this study demonstrate that the N-domain of arthropod hemocyanins and related

proteins is a specific molecular signature of the Panarthropoda (Onychophora+Arthropoda),

although there is some degree of similarity of these regions in non-arthropod sequences. The

presence of hemocyanin-like proteins in the tunicate Ciona intestinalis with putative phenoloxidase

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 6: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

6

activity suggested that respiratory hemocyanins evolved from an ancestral prophenoloxidase

(Immesberger and Burmester 2004). Given the absence of functional data for the non-bilaterian

animals (i.e. the ctenophore M. leidyi and the sponge A. queenslandica) and our phylogeny (Figure

2), this is still the most parsimonious functional explanation for the evolution of the respiratory

properties of arthropod hemocyanins.

An extensive search for the tyrosinase domain (molluscan hemocyanin) demonstrated a

wide distribution of this copper-binding protein across metazoan lineages (see Supplemental Table

S1), with the remarkable exception of arthropods. The absence in arthropods could be associated

with the expansion and diversification of the hemocyanin M domain in this lineage, which can

exhibit similar activities to the tyrosinase domain, for example in melanin biosynthesis (Sugumaran

2002). In contrast to a previous analysis (Esposito, et al. 2012), our phylogenetic reconstruction of a

broader dataset shows that the animal tyrosinase domains group in 6 independent clades (clades A–

F) (Figure 3), which are further supported by the domain architecture of the proteins nested in each

clade (e.g. clade D and clade F). The tyrosinase domain that gave rise to the molluscan

hemocyanins is related to brachiopod and tunicate sequences (clade B), and the series of

duplications that lead to the typical arrangement of 8 tyrosinase domains in tandem (Bonaventura

and Bonaventura 1980) is specific to mollusks. With the exception of clade E, which is restricted to

non-bilateral animals (Figure 3), the other clades exhibit an extremely patchy distribution across

bilaterally symmetrical animals (see Supplementary Table S1), not only between major animal

groups, but also within the same group (e.g. in mollusks, Crassostrea gigas has only a clade D

tyrosinase, Lottia gigantea has clade A and D tyrosinases, and Sepia officinalis has both clade B and

D tyrosinases). Despite the poor resolution of deeper nodes, our phylogenetic scenario at least

strongly supports 3 independent origins of metazoan tyrosinases. Clades A, B and F are well

supported (PP>0.9) and nested with non-metazoan sequences. The other 3 clades (clades C, D and

E) are not robustly supported, but have unique domain architectures and do not significantly cluster

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 7: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

7

with other metazoan groups, therefore they might also come from independent origins. Moreover,

our phylogenetic analysis demonstrates that the tyrosinase-containing proteins of plants likely

originated due to a LGT event from bacteria, corroborated by these proteins exhibiting the same

domain architecture (see Figure 3).

Altogether, our data clarify the origins and evolutionary history of the alternative respiratory

strategies observed in animals (Figure 4). Respiratory hemerythrins, arthropod hemocyanins and

molluscan respiratory tyrosinases originated independently from enzymatic domains that were most

likely already present in the last common metazoan ancestor. Although their function in early

branching lineages that do not possess circulatory systems needs to be elucidated (e.g. the function

of hemerythrins in the cnidarian N. vectensis or the hemocyanin M domain in sponges and

ctenophores), the co-option of these domains for respiratory purposes occurred independently, and

most likely took place at the base of the Protostomia (hemerythrin), the (Pan-)Arthropoda

(arthropod hemocyanins), and the Mollusca (molluscan tyrosinase “hemocyanins”). Accordingly,

the similarities observed between arthropod and molluscan hemocyanins (e.g. use of copper to

reversibly bind oxygen as a respiratory strategy, oligomerization and secretion to the hemolymph)

are the result of convergent evolution. The evolutionary history of hemerythrins and hemocyanins is

characterized by frequent losses, even after a respiratory function has been acquired when a higher

selective pressure against loss could be expected. For instance, the use of tyrosinase as an oxygen

transport molecule seems to be absent in some groups of mollusks, such as solenogasters and

pteriomorphids (e.g. C. gigantea, as also shown in this study) (Lieb and Todt 2008), in which it was

probably replaced by other respiratory proteins that have evolved independently in these lineages.

This is the case for gastropods in the group Planorbidae, which lack hemocyanin in their

hemolymph and which utilize an extracellular hemoglobin (evolved from an intracellular

myoglobin present in the gastropod radula muscle) as an alternative strategy for oxygen transport.

This high molecular mass hemoglobin has a higher affinity for oxygen than the ancestral

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 8: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

8

hemocyanin (Lieb, et al. 2006). Similar adaptations are also observed within the Crustacea, such as

in branchiopods, ostracods, copepods, cirripeds and decapods, which lost hemocyanins and evolved

hemoglobins as respiratory proteins (Terwilliger and Ryan 2001). In the water flea Daphnia magna,

for instance, the tandemly duplicated gene cluster of hemoglobin genes shows multiple hypoxia

inducible factor (HIF) binding sites, which dramatically induce the expression of hemoglobins

when daphnids are exposed to hypoxia (Gorr, et al. 2004; Kimura, et al. 1999). The extremely

patchy distribution of these proteins across the animal phylogeny can be partially understood by the

different biochemical properties of their oxygen binding domains and the changing physiological

needs of each particular animal lineage, that make one or the other respiratory protein more

effective in their function as oxygen carriers. Recent studies show that many enzymatic genes have

complex evolutionary histories, with massive gene losses in most of the eukaryote genomes

sampled, but retention in certain tips of the tree of life (Allen, et al. 2011; Attenborough, et al. 2012;

de Mendoza and Ruiz-Trillo 2011; Stairs, et al. 2011). In contrast, transcription factors, signaling

pathways and adhesion molecules, for instance, can be traced back in a congruent phylogenetic

pattern (Pang, et al. 2010; Sebe-Pedros, et al. 2011; Sebe-Pedros, et al. 2010; Srivastava, et al.

2010). In some cases, the patchy phylogenetic distribution observed in enzymatic families could be

explained by multiple events of LGT, although the phylogenetic signal is often not strong enough.

Together with gene structure and synteny analysis we do not find strong evidences of LGT, with the

exception of plant tyrosinases (see above). Moreover, the study of the evolution of respiratory

proteins emerges as an ideal model to study the interplay between molecular evolution, biochemical

constraints and physiological-ecological needs.

Material and Methods

All potential hemerythrin, hemocyanin and tyrosinase sequences were identified by

HMMER searches against the Protein, Genome, and EST databases at the NCBI (National Center

for Biotechnology Information), and against completed genome/transcriptome projects databases

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 9: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

9

publicly available or that are being conducted in our laboratories (sequences available in

Supplementary File S1) with the default parameters and an inclusive E-value of 0.05. The retrieved

sequences were aligned using MAFFT (Katoh, et al. 2002) L-INS-i algorithm, and then manually

inspected to remove those hits fulfilling one of the following conditions: 1) incomplete sequences

with more than 99% sequence identity to a complete sequence from the same taxa; 2) sequences

that showed extremely long branches in the preliminary maximum likelihood trees; and 3) incorrect

gene model predictions. The final alignment was carried out using the MAFFT G-INS-i algorithm

(for global homology). Maximum likelihood (ML) phylogenetic trees were estimated by RaxML

(Stamatakis 2006) and the best tree from 100 replicates was selected. Bootstrap support was

calculated from 1,000 replicates. Bayesian inference (BI) analyses were performed with

PhyloBayes (Lartillot and Philippe 2004), using two parallel runs for 500,000 generations and

sampling every 100. Bayesian posterior probabilities (BPP) were used for assessing the statistical

support of each bipartition. The domain architecture of all retrieved sequences was inferred by

performing a Pfam scan with the gathering threshold as cut-off value. The domain information was

used to assess the reliability of each sequence of the initial dataset, to help define protein families

according to their architectural coherence, and to assess the level of functional and structural

diversification of hemerythrins, hemocyanins and tyrosinases across the eukaryote lineages.

Acknowledgements

The authors thank J. Ryan for his advice and M. leidyi sequences, members of the S9 lab for

collecting material and discussion, G. Richards for reading and improving the manuscript, G.

Torruella and X. Grau-Boved for their help and J. Achatz for his support. This work was funded by

the Sars International Centre for Marine Molecular Biology to J.M.M.-D. and A.H., and an ICREA

contract, an European Research Council Starting Grant (ERC-2007-StG-206883), and a grant

(BFU2011-23434) from Ministerio de Economía y Competitividad (MINECO) to I.R.-T. A.S.-P.’s

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 10: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

10

salary was supported by a pregraduate FPU grant and A.d.M.’s salary from a FPI grant, both from

MICINN.

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 11: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

11

References

Allen AE, Dupont CL, Obornik M, Horak A, Nunes-Nesi A, McCrow JP, Zheng H, Johnson DA,

Hu H, Fernie AR, Bowler C 2011. Evolution and metabolic significance of the urea cycle in

photosynthetic diatoms. Nature 473: 203-207. doi: 10.1038/nature10074

Attenborough RM, Hayward DC, Kitahara MV, Miller DJ, Ball EE 2012. A "neural" enzyme in

nonbilaterian animals and algae: preneural origins for peptidylglycine alpha-amidating

monooxygenase. Mol Biol Evol 29: 3095-3109. doi: 10.1093/molbev/mss114

Baert JL, Britel M, Sautiere P, Malecha J 1992. Ovohemerythrin, a major 14-kDa yolk protein

distinct from vitellogenin in leech. Eur J Biochem 209: 563-569.

Bailly X, Vanin S, Chabasse C, Mizuguchi K, Vinogradov SN 2008. A phylogenomic profile of

hemerythrins, the nonheme diiron binding respiratory proteins. BMC Evol Biol 8: 244. doi:

10.1186/1471-2148-8-244

Blank M, Burmester T 2012. Widespread occurrence of N-terminal acylation in animal globins and

possible origin of respiratory globins from a membrane-bound ancestor. Mol Biol Evol 29:

3553-3561. doi: 10.1093/molbev/mss164

Bonaventura J, Bonaventura C 1980. Hemocyanins - Relationships in Their Structure, Function and

Assembly. American Zoologist 20: 7-17.

Burmester T 2001. Molecular evolution of the arthropod hemocyanin superfamily. Mol Biol Evol

18: 184-195.

de Mendoza A, Ruiz-Trillo I 2011. The mysterious evolutionary origin for the GNE gene and the

root of bilateria. Mol Biol Evol 28: 2987-2991. doi: 10.1093/molbev/msr142

Demuynck S, Li KW, Van der Schors R, Dhainaut-Courtois N 1993. Amino acid sequence of the

small cadmium-binding protein (MP II) from Nereis diversicolor (annelida, polychaeta).

Evidence for a myohemerythrin structure. Eur J Biochem 217: 151-156.

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 12: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

12

Esposito R, D'Aniello S, Squarzoni P, Pezzotti MR, Ristoratore F, Spagnuolo A 2012. New insights

into the evolution of metazoan tyrosinase gene family. PLoS One 7: e35731. doi: 10.1371/

journal.pone.0035731

French CE, Bell JM, Ward FB 2008. Diversity and distribution of hemerythrin-like proteins in

prokaryotes. FEMS Microbiol Lett 279: 131-145. doi: 10.1111/j.1574-6968.2007.01011.x

Gorr TA, Cahn JD, Yamagata H, Bunn HF 2004. Hypoxia-induced synthesis of hemoglobin in the

crustacean Daphnia magna is hypoxia-inducible factor-dependent. J Biol Chem 279:

36038-36047. doi: 10.1074/jbc.M403981200

Immesberger A, Burmester T 2004. Putative phenoloxidases in the tunicate Ciona intestinalis and

the origin of the arthropod hemocyanin superfamily. J Comp Physiol B 174: 169-180. doi:

10.1007/s00360-003-0402-4

Isaza CE, Silaghi-Dumitrescu R, Iyer RB, Kurtz DM, Jr., Chan MK 2006. Structural basis for O2

sensing by the hemerythrin-like domain of a bacterial chemotaxis protein: substrate tunnel

and fluxional N terminus. Biochemistry 45: 9023-9031. doi: 10.1021/bi0607812

Karlsen OA, Ramsevik L, Bruseth LJ, Larsen O, brenner A, Berven FS, Jensen HB, Lillehaug JR

2005. Characterization of a prokaryotic hemerythrin from the methanotrophic bacterium

Methylococcus capsulatus (Bath). FEBS Journal 272: 2428-2440.

Katoh K, Misawa K, Kuma K, Miyata T 2002. MAFFT: a novel method for rapid multiple sequence

alignment based on fast Fourier transform. Nucleic Acids Res 30: 3059-3066.

Kimura S, Tokishita S, Ohta T, Kobayashi M, Yamagata H 1999. Heterogeneity and differential

expression under hypoxia of two-domain hemoglobin chains in the water flea, Daphnia

magna. J Biol Chem 274: 10649-10653.

Lartillot N, Philippe H 2004. A Bayesian mixture model for across-site heterogeneities in the

amino-acid replacement process. Mol Biol Evol 21: 1095-1109. doi: 10.1093/molbev/

msh112

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 13: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

13

Lieb B, Dimitrova K, Kang HS, Braun S, Gebauer W, Martin A, Hanelt B, Saenz SA, Adema CM,

Markl J 2006. Red blood with blue-blood ancestry: intriguing structure of a snail

hemoglobin. Proc Natl Acad Sci U S A 103: 12011-12016. doi: 10.1073/pnas.0601861103

Lieb B, Todt C 2008. Hemocyanin in mollusks--a molecular survey and new data on hemocyanin

genes in Solenogastres and Caudofoveata. Mol Phylogenet Evol 49: 382-385. doi: 10.1016/

j.ympev.2008.06.005

Meyer A, Lieb B 2010. Respiratory proteins in Sipunculus nudus--implications for phylogeny and

evolution of the hemerythrin family. Comp Biochem Physiol B Biochem Mol Biol 155:

171-177. doi: 10.1016/j.cbpb.2009.11.001

Pang K, Ryan JF, Program NCS, Mullikin JC, Baxevanis AD, Martindale MQ 2010. Genomic

insights into Wnt signaling in an early diverging metazoan, the ctenophore Mnemiopsis

leidyi. Evodevo 1: 10. doi: 10.1186/2041-9139-1-10

Roesner A, Fuchs C, Hankeln T, Burmester T 2005. A globin gene of ancient evolutionary origin in

lower vertebrates: evidence for two distinct globin families in animals. Mol Biol Evol 22:

12-20. doi: 10.1093/molbev/msh258

Salahudeen AA, Thompson JW, Ruiz JC, Ma HW, Kinch LN, Li Q, Grishin NV, Bruick RK 2009.

An E3 ligase possessing an iron-responsive hemerythrin domain is a regulator of iron

homeostasis. Science 326: 722-726. doi: 10.1126/science.1176326

Schmidt-Rhaesa A. 2007. The evolution of organ systems. Oxford: Oxford University Press.

Sebe-Pedros A, de Mendoza A, Lang BF, Degnan BM, Ruiz-Trillo I 2011. Unexpected repertoire of

metazoan transcription factors in the unicellular holozoan Capsaspora owczarzaki. Mol Biol

Evol 28: 1241-1254. doi: 10.1093/molbev/msq309

Sebe-Pedros A, Roger AJ, Lang FB, King N, Ruiz-Trillo I 2010. Ancient origin of the integrin-

mediated adhesion and signaling machinery. Proc Natl Acad Sci U S A 107: 10142-10147.

doi: 10.1073/pnas.1002257107

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 14: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

14

Srivastava M, Simakov O, Chapman J, Fahey B, Gauthier ME, Mitros T, Richards GS, Conaco C,

Dacre M, Hellsten U, Larroux C, Putnam NH, Stanke M, Adamska M, Darling A, Degnan

SM, Oakley TH, Plachetzki DC, Zhai Y, Adamski M, Calcino A, Cummins SF, Goodstein

DM, Harris C, Jackson DJ, Leys SP, Shu S, Woodcroft BJ, Vervoort M, Kosik KS, Manning

G, Degnan BM, Rokhsar DS 2010. The Amphimedon queenslandica genome and the

evolution of animal complexity. Nature 466: 720-726. doi: 10.1038/nature09201

Stairs CW, Roger AJ, Hampl V 2011. Eukaryotic pyruvate formate lyase and its activating enzyme

were acquired laterally from a Firmicute. Mol Biol Evol 28: 2087-2099. doi: 10.1093/

molbev/msr032

Stamatakis A 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with

thousands of taxa and mixed models. Bioinformatics 22: 2688-2690. doi: 10.1093/

bioinformatics/btl446

Sugumaran M 2002. Comparative biochemistry of eumelanogenesis and the protective roles of

phenoloxidase and melanin in insects. Pigment Cell Res 15: 2-9.

Terwilliger DP, Ryan M 2001. Ontogeny of crustacean respiratory proteins. American Zoologist 41:

1057-1067.

Terwilliger NB 1998. Functional adaptations of oxygen-transport proteins. J Exp Biol 201:

1085-1098.

Thompson JW, Salahudeen AA, Chollangi S, Ruiz JC, Brautigam CA, Makris TM, Lipscomb JD,

Tomchick DR, Bruick RK 2012. Structural and molecular characterization of iron-sensing

hemerythrin-like domain within F-box and leucine-rich repeat protein 5 (FBXL5). J Biol

Chem 287: 7357-7365. doi: 10.1074/jbc.M111.308684

van Holde KE, Miller KI, Decker H 2001. Hemocyanins and invertebrate evolution. J Biol Chem

276: 15563-15566. doi: 10.1074/jbc.R100010200

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 15: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

15

Vanin S, Negrisolo E, Bailly X, Bubacco L, Beltramini M, Salvato B 2006. Molecular evolution and

phylogeny of sipunculan hemerythrins. J Mol Evol 62: 32-41. doi: 10.1007/

s00239-004-0296-0

Vashisht AA, Zumbrennen KB, Huang X, Powers DN, Durazo A, Sun D, Bhaskaran N, Persson A,

Uhlen M, Sangfelt O, Spruck C, Leibold EA, Wohlschlegel JA 2009. Control of iron

homeostasis by an iron-regulated ubiquitin ligase. Science 326: 718-721. doi: 10.1126/

science.1176333

Vergote D, Sautiere PE, Vandenbulcke F, Vieau D, Mitta G, Macagno ER, Salzet M 2004. Up-

regulation of neurohemerythrin expression in the central nervous system of the medicinal

leech, Hirudo medicinalis, following septic injury. J Biol Chem 279: 43828-43837. doi:

10.1074/jbc.M403073200

Weber RE, Vinogradov SN 2001. Nonvertebrate hemoglobins: functions and molecular adaptations.

Physiol Rev 81: 569-628.

Xiong J, Kurtz DM, Jr., Ai J, Sanders-Loehr J 2000. A hemerythrin-like domain in a bacterial

chemotaxis protein. Biochemistry 39: 5117-5125. by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 16: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

16

Figure Legends

Fig. 1— Maximum likelihood (ML) phylogenetic tree of the hemerythrin domain as obtained by

RAxML. The tree is rooted using the midpoint-rooted tree option. 1,000 replicate bootstrap values

(BV, in black) and Bayesian posterior probabilities (BPP, in red) are shown for each node. A black

dot in the node indicates BV>95% and BPP>0.95. Metazoan hemerythrins are highlighted by

colored rectangles. Domain architectures are shown for major lineages (abbreviations and accession

numbers of each domain are listed in Supplementary File S2).

Fig. 2— Maximum likelihood (ML) phylogenetic tree of the hemocyanin M domain as obtained by

RAxML. The tree is rooted using the amoebozoan hemocyanin genes as outgroup. 1,000 replicate

bootstrap values (BV, in black) and Bayesian posterior probabilities (BPP, in red) are shown for

each node. A black dot in the node indicates BV>95% and BPP>0.95. Metazoan and panarthropod

hemocyanins are highlighted by colored rectangles. The Aspergilus niger hemocyanin sequence is

enclosed by a red circle. Domain architectures are shown for major lineages (abbreviations and

accession numbers of each domain are listed in Supplementary File S2).

Fig. 3— Maximum likelihood (ML) phylogenetic tree of the tyrosinase domain as obtained by

RAxML. The tree is rooted using the midpoint-rooted tree option. 1,000 replicate bootstrap values

(BV, in black) and Bayesian posterior probabilities (BPP, in red) are shown for each node. A black

dot in the node indicates BV>95% and BPP>0.95. Metazoan tyrosinase clades are highlighted by

colored rectangles. Domain architectures are shown for major lineages (abbreviations and accession

numbers of each domain are listed in Supplementary File S2).

Fig. 4– Scenario for the evolution of hemerythrins and hemocyanins in metazoans. As shown in this

study, the metazoan ancestor likely had 2 different hemerythrin domains (oxygen subtype and iron-

related subtype), one hemocyanin M domain and 6 independent tyrosinases (A–F subtypes). In the

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 17: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

17

last common ancestor to cnidarians and bilaterians, the iron-related hemerythrin subtype evolved

into FBXL5, an E3 ubiquitin ligase involved in iron homeostasis. The diversification of bilaterally

symmetrical animals was accompanied by extensive independent losses of these proteins in the

different lineages, which can be partially related to the colonization of new niches and

environments. The adaptation of the oxygen hemerythrin subtype to respiratory purposes occurred

at least in the last common ancestor of protostome animals. On the contrary, the adaptation of the

hemocyanin M domain and the clade B tyrosinases to oxygen transport and storage occurred

independently in panarthropods and mollusks, respectively. Results from groups indicated with a 1

superscript originated from RNAseq data.

Supplementary Material

Table S1– List of main groups analyzed, with special focus on metazoans, and presence (black dot)/

absence (white dot) of the different clades of hemerythrins, hemocyanins and tyrosinases.

1The tyrosinase domain was not found in other flatworm genomes.

2Other annelids, such as Helobdella robusta, and sipunculids contain respiratory hemerythrins in

their genomes.

Table S2– List of domains detected in the study, with the abbreviations used in the main figures and

their respective PFAM accession numbers.

Supplementary File S1– Sequences used in the phylogenetic analyses. For each sequence,

accession number (where possible), species name, and sequence information (gene, domain and/or

phylogenetic clade) is indicated.

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 18: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

Phaeodactylium tricornutum

Thalassiosira pseudonana

Achantamoeba castellaniiAchantamoeba castellanii

Capsaspora owczarzakiAchantamoeba castellanii

Achantamoeba castellanii

Chlorella variabilis

Capsaspora owczarzaki

Bacteria

Ectocarpus siliculosus

Spizellomyces punctatusMicromonas pusilla

Ectocarpus siliculosusClamydomonas reinhardtii

Achantamoeba castellanii

Bacteria

2.0

MetazoanFBXL5

Haptophyta

Fungi

Bacteria

ChlorophytaStramenopile

Amoebozoa

unicellular Holozoa

green algae

Fungi

Fungi

Fungi

Embryophyta

Fungi

Trichomonas vaginalis

unicellular Holozoa

unicellular Holozoa

Archaeplastida

Embryophyta

unicellular Holozoa

Metazoanhemerythrins

Naegleria gruberi

Bacteria

Bacteria

Bacteria

Bacteria

Bacteria

Archaea

Archaea

Naegleria gruberi

Archaea

BacteriaBacteria

BacteriaBacteria

BacteriaBacteria

Archaea

Naegleria gruberi

Ministeria vibrans

Bacteria

Naegleria gruberi

Naegleria gruberiNaegleria gruberi

Ministeria vibrans

Volvox carteri

Volvox carteri

Cyanophora paradoxa

Bacteria

BacteriaMonosiga brevicollis

4810

10

0

518

3

47

23

9

8

00

1

1

1

49

92

88

62

0

96

0

1

2

11

2

85

14

8

47

19

94

6959

39

51

4046

86

3

8242

14

41

21

38

14

16

2089

20

2311

1135

24

31

11

1615

53

3632

1615

96

3

370

41

52

9123

17

2

93

010

0

0

1

2

18

6

30

30

Hry F-boxLRR

Hry

HryEF-hand (Bryozoa)

1

0.99

0.99

0.77

0.97

0.67

0.59

--

-

-

-

-

0.91

-

-

-

-

-

-

-

-

-

-

-

-

0.66

-

1

0.99

-

0.82

-

0.9

-

-

-

-

-

-

-

-

0.97-

-

0.99

0.980.63

1

0.99

0.9

1

0.79

-

-

-

0.970.86

1

-

-

-

-

-0

-

0.99

1

0.55

0.88

-

0.95

0.6

0.81

--

-

0.87

0.52

--

0.62

1

0.82-

-

0.82

0.59

0.95

-

-

Hry

Hry

Hry

Hry

Hry

Hry

Hry

Hry PolyCyc2

Hry

Hry

HryADK Bromo

Hry

Hry

Hry

HryScdA

Hry

Hry

HryMCPsign

Hry AMP bind

HryMCPsign

Hry

HryHry

HryAnk2 Ank

Hry

HryMCPsignHAMP

HryMCPsignLactB cNMPbcNMPb

HryMCPsign

HryMCPsignHAMP

Hry

HryAnk2RasGEFNADbFADbFlavo1

Hry

GST N3 GST C2 Hry

HryGST N3

Hry zfCHY zfRING2Hry Hry

Hry Hry zfRING2

zfCHY zfRING2Hry Hry

zfCHY zfRING2Hry

zfCHYHry Hry

Hry

Hry zfCHY zfRING2Hry Hry zfRING2

HryDUF4395 C4d ma tran

GlutRed HryDUF4395 C4d ma tran

GlutRed HryC4d ma tran

ECH GlutRed HryDUF4395 C4d ma tran

Hry

Iron metabolism

O2 transport

Clade A

Clade B

1

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 19: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

Saccoglossus kowalevskii

Cheliceratehemocyanins

Myriapodhemocyanins

Crustaceancryptocyanins

Crustaceanhemocyanins

Insecthexamerins

InsectproPhenolOxidases

CrustaceanproPhenolOxidases

Tunicata

Ctenophora

Epiperipatus sp.

47

62

74

Aspergillus nigerAmphimedon queenslandica

58

66

28

38

Dictyostelium fasciatumPolysphondylium pallidum

Schistocerca americanaPacifastacus leniuscus

Homarus americanusPalinurus vulgaris – 3

Palinurus vulgaris – 2Palinurus vulgaris – 1

Palinurus elephas – 4

Penaeus vannameiCallinectes sapidusCancer magister

Insect hemocyanins

50

Onychophora

N M C

M C

0.5

Pan-Arthropoda

Amoebozoa

Fungi

0.78

0.97

1

0.78

1

0.56

0.53

0.95

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 20: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

BacteriaBacteriaBacteria BacteriaBacteria

Tyr Tyr

Tyr

Tyr DWL

Tyr DWLTyr

PlantTyr DWL

Tyr

Tyr DWL KFDV

Tyr DWL KFDVTyr

TyrBacteria

Fungi Tyr

Bilateria Tyr

Tunicata

Brachiopoda TyrTyr

MolluscTyrosinases

(Hemocyanins)

Bilateria

Tyr

Bilateria Tyr

Non-bilaterianmetazoans Tyr

Tyr

TyrvW-A

TyrKtetra

SushiSushi Sushi

Cnidarians

Ctenophores

SpongesTyr

Bilateria Tyr vW-C vW-DvW-C vW-C

TyrBacteriaTyr

ProtostomiaTyr

Tyr ShK ShK

Tyr ShK ShK ShK ShK

Ichthyosporea Tyr

Ichthyosporea

Stramenopile Tyr

Ichthyosporea Tyr

Choanoflagelata Tyr Kazal-2 Kazal-2Kunitz

Tyr Tyr Kazal-2 Kazal-2Kunitz

Tyr Tyr

TyrKunitz

Fungi

Tyr

TyrDUF866

TyrNnf1

Tyr zf-MYND

TyrChitin bind 1 Chitin bind 1

Clade A

Clade B

Clade C

Clade E

Clade F

Clade D

57

59

84

2915

36

8

65

12

42

9339

18

58

25

47

35

Bacteria88

26

58

55

StramenopileTyr

Tyr Tyr

Bacteria

14

34

30

10

7

3010

5

4

2

53

12

49

6

6975

42

0.3

0.92

-

--

1

0.93

0.991

0.93

0.54

1

0.79

0.97

0.72

0.98

0.98

-

0.99

0.6

0.78

1

0.77

0.99

0.62

1

0.59

-

-

- 0.84

0.94

11

0.99

0.83

0.92

-

-

Tyr Tyr Tyr Tyr TyrTyr Tyr Tyr

Cubind

Itrans

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from

Page 21: A broad genomic survey reveals multiple origins and frequent …public-files.prbb.org/publicacions/38091a40-cf4e-0130-27... · 2016. 8. 4. · 1 A broad genomic survey reveals multiple

Cnidaria

Xenoturbellida

Acoelomorpha1

Chordata

Echinodermata

Hemichordata

Chaetognatha

Nematoda

Nematomorpha

Tardigrada

Onychophora1

Arthropoda

Priapulida1

Loricifera

Kinorhyncha

Bryozoa1

Entoprocta

Cycliophora

Annelida

Mollusca

Nemertea1

Brachiopoda1

Phoronida

Gastrotricha

Platyhelminthes

Gnathostomulida

Micrognathozoa

Rotifera

Ctenophora

Porifera

Trichoplax

Deu

t.

Bila

teria

Pro

tost

omia

1

Iron HryO2 Hry

Hemocyanin M

A B C D E F

Tyrosinase:

1 Metazoan ancestor:

FBXL5O2 Hry

Hemocyanin M

A B C D F

Tyrosinase:

2 Bilaterian ancestor:

Respiratory hemocyanin

3

FBXL5

Hemocyanin M

C F

Tyrosinase:

3 Deuterostomian ancestor:

FBXL5O2 Hry

Hemocyanin M

A B C D F

Tyrosinase:

4 Protostomian ancestor:

2

Iron Hry FBXL5

Tyr E

4

Respiratoryhemerythrin

Respiratory tyrosinase

by guest on July 15, 2013http://gbe.oxfordjournals.org/

Dow

nloaded from