

The Chloroplast Genomes of the Green Algae Pyramimonas, Monomastix, and Pycnococcus Shed New Light on the Evolutionary History of Prasinophytes and the Origin of the Secondary Chloroplasts of Euglenids

Monique Turmel,* Marie-Christine Gagnon,* Charley J. O’Kelly,† Christian Otis,* and Claude Lemieux*

*Département de Biochimie et de Microbiologie, Université Laval, Québec (Québec), Canada; and †Botany Department, University of Hawaii

Because they represent the earliest divergences of the Chlorophyta and include the smallest known eukaryotes (e.g., the coccoid Ostreococcus), the morphologically diverse unicellular green algae making up the Prasinophyceae are central to our understanding of the evolutionary patterns that accompanied the radiation of chlorophytes and the reduction of cell size in some lineages. Seven prasinophyte lineages, four of which exhibit a coccoid cell organization (no flagella nor scales), were uncovered from analysis of nuclear-encoded 18S rDNA data; however, their order of divergence remains unknown. In this study, the chloroplast genome sequences of the scaly quadriflagellate Pyramimonas parkeae (clade I), the coccoid Pycnococcus provasolii (clade V), and the scaly uniflagellate Monomastix (unknown affiliation) were determined, annotated, and compared with those previously reported for green algae/land plants, including two prasinophytes (Nephroselmis olivacea, clade III and Ostreococcus tauri, clade II). The chlorarachniophyte Bigelowiella natans and the euglenid Euglena gracilis, whose chloroplasts originate presumably from distinct green algal endosymbionts, were also included in our comparisons. The three newly sequenced prasinophyte genomes differ considerably from one another and from their homologs in overall structure, gene content, and gene order, with the 80,211-bp Pycnococcus and 114,528-bp Monomastix genomes (98 and 94 conserved genes, respectively) resembling the 71,666-bp Ostreococcus genome (88 genes) in featuring a significantly reduced gene content. The 101,605-bp Pyramimonas genome (110 genes) features two conserved genes (rpl22 and ycf65) and ancestral gene linkages previously unrecognized in chlorophytes as well as a DNA primase gene putatively acquired from a virus. The Pyramimonas and Euglena cpDNAs revealed uniquely shared derived gene clusters. Besides providing unequivocal evidence that the green algal ancestor of the euglenid chloroplasts belonged to the Pyramimonadales, phylogenetic analyses of concatenated chloroplast genes and proteins elucidated the position of Monomastix and showed that the Mamiellales, a clade comprising Ostreococcus and Monomastix, are sister to the Pyramimonadales + Euglena clade. Our results also revealed that major reduction in gene content and restructuring of the chloroplast genome occurred in conjunction with important changes in cell organization in at least two independent prasinophyte lineages, the Mamiellales and the Pycnococcaceae.

Introduction

The green plants (Viridiplantae) are divided between two major lineages: the Chlorophyta, containing the bulk of the extant green algae, and the Streptophyta, containing the green algae belonging to the Charophyceae sensu Mattox and Stewart (1984) and all land plants (Lewis and McCourt 2004). It is thought that the first green plants were unicellular green algae bearing nonmineralized organic scales on their cell body and/or their flagella (Mattox and Stewart 1984). This hypothesis was put forward when it was recognized that flagellated reproductive cells (zoospores, gametes) of some taxa in both the Chlorophyta and Streptophyta are covered by a layer of square-shaped scales, which also occur as an underlayer in many prasinophytes. Free-living scaly flagellates have been ascribed mainly to the Prasinophyceae, a nonmonophyletic class representing the earliest divergences of the Chlorophyta (Steinkotter et al. 1994; Nakayama et al. 1998; Fawley et al. 2000; Guillou et al. 2004; Proschold and Leliaert 2007). This morphologically heterogeneous assemblage of green algae gave rise to the three advanced classes designated as the Trebouxiophyceae, Ulvophyceae, and Chlorophyceae (Lewis and McCourt 2004). Note that the scaly biflagellate Mesostigma viride, traditionally classified within the Prasinophyceae, has been formally excluded from this class and placed in the Streptophyta (Marin and Melkonian 1999; Lemieux et al. 2007; Rodriguez-Ezpeleta et al. 2007). Prasinophytes have always fascinated phycologists because their study has the potential to shed light on the nature of the last common ancestor of all green plants and on the origin of the advanced chlorophytes.

The concept of the class Prasinophyceae has been under constant revision since its formal description by Moestrup and Throndsen (1988) (Sym and Pienaar 1993); in the last few years, it has profoundly changed with the description of several new taxa and the analysis of environmental sequences. Most prasinophytes are found in marine habitats, and considerable diversity is observed with respect to cell shape and size, flagella number and behavior, mitotic and cytokinetic mechanisms, and biochemical features such as accessory photosynthetic pigments and storage products (Melkonian 1990; O’Kelly 1992; Sym and Pienaar 1993; Latasa et al. 2004). Some species lack flagella, others lack scales, and in some cases, both flagella and scales are absent (e.g., Ostreococcus tauri). The small-sized members of the Prasinophyceae, particularly those belonging to three genera of the Mamiellales (Micromonas, Bathycoccus, and Ostreococcus), are prominent in the oceanic picoplankton (comprising organisms less than 3 μm in diameter) (Guillou et al. 2004). Included in this category is the smallest free-living eukaryote known to date, O. tauri (Courties et al. 1994). Phylogenetic studies using molecular data, in particular the nuclear-encoded small subunit (SSU) rRNA gene, identified seven monophyletic groups of prasinophytes at the base of the Chlorophyta (Steinkotter et al. 1994; Nakayama et al. 1998; Fawley et al. 2000; Guillou et al. 2004); however, their order of divergence could not be resolved. Despite this uncertainty, it appears that the coccoid form evolved more than once in the Prasinophyceae (Fawley et al. 2000; Guillou et al. 2004). Coccoid cells are distributed among four lineages (clade II, Mamiellales; clade V, Pseudocourfieldiales, Pycnococcaceae; clade VI, Prasinococcales; and clade VII, no order assigned to this clade), two of which (clades II and V) exhibit both the coccoid and flagellated cell organizations.

Key words: prasinophyte green algae, euglenids, chloroplast genome evolution, phylogenomics, secondary endosymbiosis, genome reduction, horizontal DNA transfers.

E-mail: [email protected].

Mol. Biol. Evol. 26(3):631–648. 2009. doi:10.1093/molbev/msn285. Advance Access publication December 12, 2008.

© The Author 2008. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Comparative analysis of chloroplast genomes has been helpful to resolve problematic relationships among green algae and land plants (Wolf et al. 2005; Qiu et al. 2006; Jansen et al. 2007; Lemieux et al. 2007; Turmel et al. 2008) although the phylogenetic positions of some green plant lineages have remained contentious (Pombert et al. 2005; Turmel et al. 2006; Lemieux et al. 2007). The only two complete chloroplast DNA (cpDNA) sequences currently available for prasinophytes, those of the scaly biflagellate Nephroselmis olivacea (clade III, Pseudocourfieldiales, Nephroselmidaceae) (Turmel et al. 1999b) and of the tiny coccoid O. tauri (clade II, Mamiellales) (Robbens et al. 2007), have revealed contrasting evolutionary patterns which can be designated as ancestral and reduced derived, respectively. Whereas the 200.8-kb Nephroselmis genome harbors the largest gene repertoire yet reported for a chlorophyte (128 different conserved genes compared with about 138 genes for the deepest branching streptophyte algae) and has retained many ancestral gene clusters, the nearly 3-fold smaller Ostreococcus genome, which is the most compact chlorophyte cpDNA known to date, displays a reduced set of 88 genes whose order is highly scrambled. As in most other chloroplast genomes, two identical copies of a large inverted repeat (IR) are separated by single-copy (SC) regions; however, the two prasinophyte genomes differ remarkably in their quadripartite architectures. The Nephroselmis architectural design closely resembles that found in all streptophyte IR-containing cpDNAs: the SC regions are vastly unequal in size, each SC region is characterized by a highly conserved set of genes, and the rRNA operon encoded by the IR is transcribed toward the small SC (SSC) region. In Ostreococcus, the SC regions have essentially the same number of genes; the few genes (just five) that would be expected to map to the SSC region in streptophyte cpDNAs are confined to the same SC region, and the rRNA operon is transcribed away from the latter SC region (see supplementary fig. 1, Supplementary Material online). This gene partitioning pattern is reminiscent of that reported for the cpDNAs of the ulvophytes Pseudendoclonium akinetum and Oltmannsiellopsis viridis (Pombert et al. 2005, 2006).

To explore the relationships among prasinophyte lineages and to better understand the mode of cpDNA evolution in the Prasinophyceae, we sequenced the cpDNAs of the scaly quadriflagellate Pyramimonas parkeae (clade I, Pyramimonadales), the coccoid Pycnococcus provasolii (clade V, Pseudocourfieldiales, Pycnococcaceae), and the scaly uniflagellate Monomastix (unknown affiliation) and compared these genomes with those previously reported for Nephroselmis (Turmel et al. 1999b), Ostreococcus (Robbens et al. 2007), other chlorophytes (Wakasugi et al. 1997; Maul et al. 2002; Pombert et al. 2005, 2006; Belanger et al. 2006; de Cambiaire et al. 2006, 2007; Brouard et al. 2008), the deep-branching streptophytes Mesostigma (Lemieux et al. 2000) and Chlorokybus atmophyticus (Lemieux et al. 2007), the euglenid Euglena gracilis (Hallick et al. 1993), and the chlorarachniophyte Bigelowiella natans (Rogers et al. 2007). The latter two photosynthetic eukaryotes, which presumably gained their chloroplasts via independent secondary endosymbiotic events (Rogers et al. 2007), were included in our comparisons in an attempt to gain more detailed information about the green algal donors of their chloroplasts. We found that the three newly sequenced prasinophyte genomes differ considerably from one another and from their previously sequenced homologs at the overall structure, gene content, and gene order levels, with both the Monomastix and Pycnococcus genomes featuring a reduced pattern of evolution. Our phylogenetic analyses of sequence data offered significant insights into the phylogeny and evolution of prasinophytes and provided unequivocal evidence that the euglenid chloroplasts were secondarily acquired from a member of the Pyramimonadales.

Materials and Methods

Strains and Culture Conditions

Pyramimonas parkeae (CCMP 726) and Pycnococcus provasolii (CCMP 1203), two marine species, were obtained from the Provasoli–Guillard National Center for Culture of Marine Phytoplankton (West Boothbay Harbor, Maine) and grown in K medium (Keller et al. 1987) under 12 h light–dark cycles. Monomastix sp., a freshwater strain originally collected by H. R. Preisig in New Zealand, originates from the personal collection of C.J.O. This strain, which is available upon request to M.T., was grown in modified Volvox medium (McCracken et al. 1980) under 12 h light–dark cycles.

Cloning and Sequencing of Chloroplast Genomes

The complete cpDNA sequences of Pyramimonas, Monomastix, and Pycnococcus were generated essentially as described previously (Turmel et al. 2005). For each green alga, A + T-rich organelle DNA was separated from nuclear DNA by CsCl–bisbenzimide isopycnic centrifugation of total cellular DNA (Turmel et al. 1999a). The organelle DNA fraction was sheared by nebulization to produce 1,500- to 3,000-bp fragments that were subsequently cloned into a plasmid vector, either pBluescript II KS+ or pSMART-HCKan (Lucigen Corporation, Middleton, WI). After hybridization of the resulting clones with the original DNA used for cloning, plasmids from positive clones were purified with the QIAprep 96 Miniprep kit (Qiagen Inc., Mississauga, Canada) and sequenced using universal primers. DNA assembly was carried out using AUTOASSEMBLER 2.1.1 (Applied BioSystems, Foster City, CA) or SEQUENCHER 4.2 (Gene Codes Corporation, Ann Arbor, MI). Distinct contigs of cpDNA origin were ordered by polymerase chain reaction (PCR) amplification with primers specific to contig ends. The amplified fragments encompassing uncloned regions were sequenced on both strands.

Chloroplast Genome Analyses

Genes and all open reading frames (ORFs) larger than 100 codons were identified as described previously (Turmel et al. 2006). Secondary structures of group I and group II introns were modeled according to Michel and Westhof (1990) and Michel et al. (1989), respectively. Short repeats in the Monomastix genome were identified using REPuter 2.74 (Kurtz et al. 2001), and the number of copies of each repeat was determined with FINDPATTERNS of the Genetics Computer Group package (Accelrys, San Diego, CA). For all three newly sequenced prasinophyte genomes, regions containing nonoverlapping repeated elements were mapped with RepeatMasker (http://www.repeatmasker.org/) running under the WU-Blast 2.0 search engine (http://blast.wustl.edu/), using the repeats ≥30 bp identified with REPuter as input sequences. Conserved gene clusters exhibiting identical gene polarities in selected green algal cpDNAs were identified using a custom-built program.
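The fraction of a genome occupied by mapped repeats can be illustrated with a short calculation. The Python sketch below is not the authors' pipeline: it assumes that repeat hits are already available as 1-based inclusive (start, end) coordinates, and the function names and example coordinates are hypothetical. Overlapping hits are merged first so that no position is counted twice.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end); an assumed input format


def merge_intervals(hits: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals."""
    merged: List[Interval] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous interval instead of starting a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def repeat_fraction(hits: List[Interval], genome_size: int) -> float:
    """Percentage of the genome covered by at least one repeat hit."""
    covered = sum(end - start + 1 for start, end in merge_intervals(hits))
    return 100.0 * covered / genome_size


# Hypothetical hits on a hypothetical 1,000-bp genome.
example_hits = [(1, 30), (20, 50), (100, 130)]
example_fraction = repeat_fraction(example_hits, 1000)
```

Merging before summing is the design choice that matters here: summing raw hit lengths would overestimate coverage wherever two repeat elements overlap.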

Sequencing of the Monomastix 18S rRNA Gene and Phylogenetic Analysis

The nuclear-encoded SSU rRNA gene was amplified from total cellular DNA by PCR using the specific primers NS1 (White et al. 1990) and 18L (Hamby and Zimmer 1991). The resulting PCR product was purified and sequenced directly using these primers and two internal primers. The Monomastix nuclear-encoded SSU rDNA sequence was aligned manually against the alignment prepared by Guillou et al. (2004) from 83 chlorophytes and 12 streptophytes. A data set of 1,663 positions was obtained after removing ambiguously aligned regions using GBLOCKS 0.91b (Castresana 2000) and the same filtration parameters employed by Guillou et al. (2004). Maximum likelihood (ML) trees were inferred using Treefinder (version of April 2008) (Jobb et al. 2004) with the best model fitting the data [TN + I (proportion of invariable sites) + Γ (four discrete rate categories)] under the Akaike information criterion. Bootstrap values were calculated for 100 replications.

Phylogenetic Inferences from Whole-Genome Sequence Data

An amino acid data set and the corresponding nucleotide data set with first and second codon positions were derived from the completely sequenced cpDNAs of Bigelowiella (NC_008408), Euglena (NC_001603), and 22 green plants (species names and accession numbers, except those for Oedogonium cardiacum [NC_011031] and Leptosira terrestris [NC_009681], are provided in table 3 of Lemieux et al. 2007). These data sets were allowed to contain missing data; however, limits were imposed on the proportion of missing data by selecting for analysis the protein-coding genes that are shared by at least 14 taxa. Seventy genes met this criterion: atpA, B, E, F, H, I, ccsA, cemA, chlB, I, L, N, clpP, ftsH, infA, petA, B, D, G, L, psaA, B, C, I, J, M, psbA, B, C, D, E, F, H, I, J, K, L, M, N, T, Z, rbcL, rpl2, 5, 14, 16, 20, 23, 32, 36, rpoA, B, C1, C2, rps2, 3, 4, 7, 8, 9, 11, 12, 14, 18, 19, tufA, ycf1, 3, 4, 12. The amino acid data set was prepared as follows. The deduced amino acid sequences from the 70 individual genes were aligned using MUSCLE 3.7 (Edgar 2004), the ambiguously aligned regions in each alignment were removed using GBLOCKS 0.91b (Castresana 2000) with the –b2 option (minimal number of sequences for a flank position) set to 13, and the protein alignments were concatenated. To obtain the nucleotide data set, the multiple sequence alignment of each protein was converted into a codon alignment, the poorly aligned and divergent regions in each codon alignment were excluded using GBLOCKS 0.91b with the options –b2 = 13 and –t = c (the latter specifying that selected sequences are complete codons), the individual codon alignments were concatenated, and finally third codon positions were excluded with PAUP* 4.0b10 (Swofford 2003). Missing characters represented 5.9% and 5.8% of the amino acid and nucleotide data sets, respectively.
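The data set assembly described above (selecting genes shared by a minimum number of taxa, dropping third codon positions, and concatenating with missing data) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used GBLOCKS and PAUP* for these steps, the function names and toy alignments below are hypothetical, and missing data are counted here only as whole genes absent from a taxon.

```python
from typing import Dict, List

Alignment = Dict[str, str]  # taxon name -> aligned sequence (equal lengths within a gene)


def genes_shared_by(alignments: Dict[str, Alignment], min_taxa: int) -> List[str]:
    """Keep genes whose alignment contains at least min_taxa taxa."""
    return sorted(g for g, aln in alignments.items() if len(aln) >= min_taxa)


def drop_third_positions(codon_seq: str) -> str:
    """Keep first and second codon positions of an in-frame codon alignment row."""
    return "".join(codon_seq[i:i + 2] for i in range(0, len(codon_seq), 3))


def concatenate(alignments: Dict[str, Alignment], genes: List[str],
                taxa: List[str], missing: str = "?") -> Alignment:
    """Concatenate per-gene alignments, padding absent genes with the missing symbol."""
    parts: Dict[str, List[str]] = {t: [] for t in taxa}
    for gene in genes:
        aln = alignments[gene]
        length = len(next(iter(aln.values())))
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, missing * length))
    return {t: "".join(p) for t, p in parts.items()}


def missing_percent(matrix: Alignment, missing: str = "?") -> float:
    """Percentage of cells in the concatenated matrix that are missing."""
    total = sum(len(s) for s in matrix.values())
    return 100.0 * sum(s.count(missing) for s in matrix.values()) / total


# Toy codon alignments for three hypothetical taxa.
toy = {
    "geneA": {"t1": "ATGGCC", "t2": "ATGGCA", "t3": "ATGGCT"},
    "geneB": {"t1": "TTTAAA", "t2": "TTCAAA"},
    "geneC": {"t1": "GGG"},
}
kept = genes_shared_by(toy, min_taxa=2)
pos12 = {g: {t: drop_third_positions(s) for t, s in toy[g].items()} for g in kept}
matrix = concatenate(pos12, kept, ["t1", "t2", "t3"])
```

In this toy case geneC is discarded because only one taxon carries it, and taxon t3 receives a run of "?" characters for geneB, which is how a gene absent from one genome contributes to the missing-data percentage.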

Treefinder (version of April 2008) was used to perform the ML analyses and to identify the best model fitting the data under the Akaike information criterion. The amino acid data set was analyzed using the cpREV + F (observed amino acid frequencies) + Γ (five categories) model of sequence evolution. Trees were inferred from the nucleotide data set using the GTR + Γ (five categories) model. Confidence of branch points was estimated by 500 bootstrap replications.

Bayesian inference was conducted using MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003). The model selected was cpREV + F + Γ for the inference from the amino acid data set and GTR + Γ for the inference from the nucleotide data set. Rates across sites were modeled on a discrete gamma distribution with five categories. Two independent Markov chain Monte Carlo runs, each consisting of three heated chains in addition to the cold chain, were carried out using the default parameters. For the analysis of the nucleotide data set, the length of each run was 3 million generations after a burn-in phase of 500,000 generations; for the amino acid data set, it was 1 million generations after a burn-in phase of 150,000 generations. Trees were sampled every 100 generations. Convergence of the two independent runs was verified according to the output of the "sump" command; this output was also used to determine the burn-in phase. Posterior probability values were estimated from the trees sampled from both runs using the "sumt" command.

Reconstruction of Ancestral Character States

A data set of gene content was prepared from the chloroplast genomes of the streptophytes Mesostigma and Chlorokybus, the prasinophytes, and Euglena by coding the presence and absence of genes as binary characters. Gene order in each of these chloroplast genomes was converted to all possible pairs of signed genes (i.e., taking into account gene polarity) and a gene order data set was obtained by coding as binary characters the presence/absence of the ancestral gene pairs conserved in at least one streptophyte and one prasinophyte. The gene content and gene order data sets were merged to produce a data set of combined ancestral characters. Losses of these characters on the best tree topology inferred from sequence data were mapped using MacClade 4.08 (Maddison and Maddison 2000). The most parsimonious reconstructions of ancestral character states were inferred under the Dollo principle of parsimony (Farris 1977).
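The conversion of gene order into binary characters can be illustrated with a small Python sketch. It is a guess at one reasonable encoding rather than the authors' program: each genome is assumed to be a circular list of (gene, strand) tuples, a pair read from the opposite strand is treated as the same adjacency, and all names and toy gene orders are hypothetical.

```python
from typing import Dict, List, Set, Tuple

SignedGene = Tuple[str, int]            # (gene name, +1 or -1 for strand)
Pair = Tuple[SignedGene, SignedGene]


def canonical(pair: Pair) -> Pair:
    """Strand-independent form of a signed pair: (a, b) equals (-b, -a)."""
    (a, sa), (b, sb) = pair
    flipped = ((b, -sb), (a, -sa))
    return min(pair, flipped)


def signed_pairs(order: List[SignedGene]) -> Set[Pair]:
    """All adjacent signed gene pairs of a circular genome."""
    n = len(order)
    return {canonical((order[i], order[(i + 1) % n])) for i in range(n)}


def binary_matrix(genomes: Dict[str, List[SignedGene]],
                  streptophytes: Set[str], prasinophytes: Set[str]):
    """Code presence (1) or absence (0) of pairs found in both groups."""
    pairs = {name: signed_pairs(order) for name, order in genomes.items()}
    in_strepto = set().union(*(pairs[t] for t in streptophytes))
    in_prasino = set().union(*(pairs[t] for t in prasinophytes))
    kept = sorted(in_strepto & in_prasino)   # putatively ancestral pairs
    rows = {name: [int(p in pairs[name]) for p in kept] for name in genomes}
    return kept, rows


# Toy circular gene orders; prasino1 is the reverse complement of strepto.
toy_genomes = {
    "strepto":  [("psbA", 1), ("rbcL", 1), ("atpB", -1), ("atpE", -1)],
    "prasino1": [("atpE", 1), ("atpB", 1), ("rbcL", -1), ("psbA", -1)],
    "prasino2": [("psbA", 1), ("atpB", -1), ("rbcL", 1), ("atpE", -1)],
}
kept_pairs, rows = binary_matrix(toy_genomes, {"strepto"}, {"prasino1", "prasino2"})
```

Because a circular map can be read from either strand, the canonical form ensures that a genome and its reverse complement yield identical pair sets, so only genuine rearrangements change the character coding.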

Results and Discussion

Pyramimonas cpDNA Features an Ancestral Quadripartite Structure and a Large Repertoire of Genes

Of the three newly sequenced prasinophyte genomes, only that of Pyramimonas displays a large IR (table 1). At 101,605 bp, this genome is 2-fold smaller than its Nephroselmis homolog, a size difference attributable to a much shorter IR, gene losses, and a more compact gene organization. As shown in figure 1, the two copies of the IR sequence, each 13,057 bp in size and encoding 11 genes, are separated by SC regions of 10,338 and 65,153 bp comprising 12 and 76 genes, respectively. In this figure, the genes whose orthologs are usually found within the IR, SSC, and large SC (LSC) regions of streptophyte cpDNAs are color coded. It can be seen that the pattern of gene partitioning among the SC regions of the Pyramimonas genome closely resembles that observed for streptophytes. Considering that the Pyramimonas IR is about 2-fold larger and encodes additional genes relative to that of Mesostigma and that the IR is known to contract and expand through gene conversion events (Goulding et al. 1996), the observation that the termini of the Pyramimonas IR contain genes characteristic of the adjacent SC regions is not surprising. The most important deviation from the highly conserved partitioning pattern displayed by streptophytes concerns the locations of chlL and chlN. These two genes, which would be expected to be present in the SSC region, lie within the IR near the LSC region.
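As a quick consistency check on the sizes quoted above, the total length of a quadripartite genome should equal the LSC plus the SSC plus two copies of the IR. The short Python sketch below applies this to the three IR-containing genomes listed in Table 1; the function and variable names are illustrative only.

```python
def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome made of LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Sizes in bp as listed in Table 1 (IR-containing genomes only).
TABLE1 = {
    "Nephroselmis": {"total": 200_799, "ir": 46_137, "lsc": 92_126, "ssc": 16_399},
    "Pyramimonas":  {"total": 101_605, "ir": 13_057, "lsc": 65_153, "ssc": 10_338},
    "Ostreococcus": {"total": 71_666,  "ir": 6_825,  "lsc": 35_684, "ssc": 22_332},
}


def ir_share(entry: dict) -> float:
    """Percentage of the genome occupied by the two IR copies together."""
    return 100.0 * 2 * entry["ir"] / entry["total"]


# True for every genome whose region sizes add up to the reported total.
consistent = {
    name: quadripartite_total(v["lsc"], v["ssc"], v["ir"]) == v["total"]
    for name, v in TABLE1.items()
}
```

For Pyramimonas this gives 65,153 + 10,338 + 2 × 13,057 = 101,605 bp, matching the reported genome size, with the two IR copies together accounting for roughly a quarter of the genome.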

The Pyramimonas chloroplast genome encodes 110 conserved genes, that is, genes found in several other cpDNAs and usually present in cyanobacteria. The products of these genes consist of 81 proteins and 29 RNA species (2 rRNAs and 27 tRNAs) (table 2). The set of 27 tRNAs is sufficient to decode all 61 sense codons provided that the tRNA species encoded by trnV(uac), trnA(ugc), trnT(ugu), trnS(uga), trnL(uag), and trnP(ugg) recognize all four members of their respective codon family through superwobble pairing between the first position of the anticodon and the third position of the codon (Rogalski et al. 2008). The size of the Pyramimonas chloroplast gene complement closely matches those observed for the trebouxiophytes Chlorella vulgaris and Leptosira and for the ulvophytes Pseudendoclonium and Oltmannsiellopsis (de Cambiaire et al. 2007). Although it is significantly reduced compared with its Nephroselmis counterpart (table 2), the set of Pyramimonas chloroplast genes includes six ndh genes (ndhA and ndhD through ndhH) typically present in streptophytes but previously found only in Nephroselmis in the Chlorophyta, as well as two protein-coding genes reported here for the first time in a chlorophyte chloroplast genome, rpl22 and ycf65 (supplementary table 1, Supplementary Material online). The ycf65 gene is present in both Mesostigma and Chlorokybus but missing in the other investigated streptophytes, whereas rpl22 shows a widespread distribution in the Streptophyta and also resides in the Euglena chloroplasts. Perhaps not surprisingly, most of the 22 chloroplast genes present in Nephroselmis but absent in Pyramimonas are also missing from some chlorophytes belonging to the Trebouxiophyceae, Ulvophyceae, or Chlorophyceae (supplementary table 1, Supplementary Material online). Only five genes (cemA, petD, petL, psbM, and rrf) represent exceptions and, interestingly, all five, except rrf (the 5S rRNA gene), are also lacking in the Ostreococcus and Euglena chloroplasts. The analysis of the nuclear genome from both O. tauri and Ostreococcus lucimarinus revealed that cemA, petD, and psbM have been transferred to the nucleus (Derelle et al. 2006; Palenik et al. 2007; Robbens et al. 2007). Considering that these genes are essential for chloroplast function, they are also likely to be nuclear-encoded in Pyramimonas. Because no case of chloroplast to nucleus transfer has been documented for rrf, the possibility exists that this conserved gene is present in Pyramimonas cpDNA and that its sequence has diverged beyond recognition.

Table 1
General Features of Prasinophyte cpDNAs

Feature                      Nephroselmis  Pyramimonas  Pycnococcus  Monomastix  Ostreococcus
Size (bp)
  Total                      200,799       101,605      80,211       114,528     71,666
  IR                         46,137        13,057       —a           —a          6,825
  LSC                        92,126        65,153       —a           —a          35,684
  SSC                        16,399        10,338       —a           —a          22,332
A+T (%)                      57.9          65.3         60.5         61.0        60.1
Conserved genes (no.)b       128           110          98           94          88
Introns
  Fraction of genome (%)     0             2.7          3.3          4.6         5.2
  Group I (no.)              0             0            0            5           0
  Group II (no.)             0             1            1            1           1
Intergenic sequencesc
  Fraction of genome (%)     32.6          19.6         11.6         43.9        15.1
  Average size (bp)          352           159          102          524         115
Short repeated sequencesd
  Fraction of genome (%)     0.5           0.5          0.1          17.6        0

a Because Pycnococcus and Monomastix cpDNAs lack an IR, only the total sizes of these genomes are given.
b Conserved genes refer to free-standing coding sequences usually present in chloroplast genomes. Genes present in the IR were counted only once.
c In addition to conserved genes, all ORFs ≥100 codons were considered as gene sequences.
d Nonoverlapping repeat elements were mapped on each genome with RepeatMasker using the repeats ≥30 bp identified with REPuter as input sequences.

We found two large ORFs that are not associated with any introns, orf454 and orf510. For the orf510, present in the LSC region near the IR, our Blast searches against the nonredundant protein sequence database of the National Center for Biotechnology Information failed to identify any putative function for the potential encoded protein. However, the product of the orf454 localized in the IR revealed sequence similarity with the conserved domain of phage-associated DNA primases (COG3378, E-value = 1e−06). Interestingly, in the course of the present study, we have found that the orf389 in the Nephroselmis IR (Turmel et al. 1999b) also encodes a putative protein with the conserved domain of phage-associated DNA primases (COG3378, E-value = 2e−12). Given that viruses have been observed in Pyramimonas (Moestrup and Thomsen 1974; Sandaa et al. 2001) and Nephroselmis (Nakayama et al. 2007), it is tempting to speculate that the above-mentioned orf454 and orf389 originated from horizontal transfer of viral genes. There are only a few documented cases of nonstandard, free-standing chloroplast genes that were acquired via horizontal gene transfer, and all these cases involve genes that participate in DNA recombination or replication (Khan et al. 2007; Brouard et al. 2008; Cattolico et al. 2008). Like the orf454 and orf389, the two horizontally transferred genes identified in the chlorophycean green alga Oedogonium cardiacum are housed in the IR (Brouard et al. 2008).

FIG. 1.—Gene map of Pyramimonas cpDNA. The two copies of the IR sequence are represented by thick lines. Genes (filled boxes) on the outside of the map are transcribed in a clockwise direction. Coding sequences not commonly found in cpDNA are shown in gray. The single intron in atpB is represented by an open box. The color code denotes the genomic regions containing the corresponding genes in the cpDNAs of Nephroselmis and streptophytes: magenta, SSC; cyan, LSC; and yellow, IR. Given the variable gene content of the IR in these ancestral-type genomes, only the genes invariably present in this region (i.e., those forming the rRNA operon) were represented in yellow. tRNA genes are indicated by the one-letter amino acid code (Me, elongator methionine; Mf, initiator methionine) followed by the anticodon in parentheses.

In general, the conserved genes present in Pyramimonas cpDNA are densely packed (table 1). Prominent exceptions are those in the regions containing the orf454 and orf510 (fig. 1). There are two cases of overlapping genes (psbC–psbD and ndhC–ndhK); for the remaining genes, intergenic spacers vary between 3 and 2,517 bp, with an average size of 159 bp. Consistent with this high degree of compaction, only a few short repeats, mostly direct repeats, were identified (table 2); they are found mainly in the large spacer adjacent to the orf510.
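Spacer statistics of the kind quoted above (overlapping gene pairs, smallest and largest spacer, average spacer size) can be derived from the coordinates of annotated genes. The following Python sketch illustrates one way to do this on a circular map; it is not the procedure used in this study, the gene names and coordinates are hypothetical, and it assumes that no gene spans the map origin or lies nested within another gene.

```python
from statistics import mean
from typing import List, Tuple

Gene = Tuple[str, int, int]  # (name, start, end), 1-based inclusive coordinates


def neighbor_gaps(genes: List[Gene], genome_length: int) -> List[Tuple[str, str, int]]:
    """Gap in bp between each gene and the next one on a circular map.

    A negative gap means the two coding regions overlap by that many bp.
    Simplification: no gene spans the map origin and no gene is nested in another.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    gaps = []
    for i, (name, _start, end) in enumerate(ordered):
        next_name, next_start, _ = ordered[(i + 1) % len(ordered)]
        if i == len(ordered) - 1:
            # Last gene: wrap around the origin of the circular molecule.
            gap = (genome_length - end) + (next_start - 1)
        else:
            gap = next_start - end - 1
        gaps.append((name, next_name, gap))
    return gaps


def summarize(gaps: List[Tuple[str, str, int]]) -> dict:
    """Separate overlapping pairs from true spacers and report spacer statistics."""
    overlaps = [(a, b) for a, b, gap in gaps if gap < 0]
    spacers = [gap for _, _, gap in gaps if gap >= 0]
    return {
        "overlapping_pairs": overlaps,
        "min": min(spacers),
        "max": max(spacers),
        "mean": mean(spacers),
    }


# Hypothetical 1,000-bp circular map with four made-up genes.
toy_genes = [("geneA", 1, 100), ("geneB", 104, 200), ("geneC", 190, 300), ("geneD", 501, 900)]
toy_summary = summarize(neighbor_gaps(toy_genes, genome_length=1000))
print(toy_summary)
```

In this toy map, geneB and geneC overlap and are therefore reported separately rather than being counted as a spacer, which mirrors the way overlapping pairs such as psbC–psbD are set apart from the spacer range in the text.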

Like its Ostreococcus and Pycnococcus homologs (see below), the Pyramimonas genome features a single intron, a group II intron in atpB. However, the Pyramimonas atpB intron and those of Ostreococcus and Pycnococcus are inserted at different sites and carry distinct ORFs, indicating that they arose from separate events of horizontal DNA transfer. It should be pointed out here that the currently available chloroplast genome data strongly support the notion that no introns were present in the chloroplast of the common ancestor of all green plants (Turmel et al. 1999b; Lemieux et al. 2000, 2007). The orf608 of the Pyramimonas group IIA intron is located within domain IV of the intron secondary structure and carries the reverse transcriptase (cd01651) and maturase (pfam01348) domains, but not the endonuclease domain, of reverse transcriptases encoded by group II introns. The endonuclease domain, which carries out second-strand DNA cleavage during group II intron mobility (Lambowitz and Zimmerly 2004), was most likely lost after the horizontal transfer of the intron into the Pyramimonas chloroplast. The orf608 product shares strong sequence similarity with reverse transcriptases encoded by the genomes of firmicute bacteria and by the mitochondrial cox1 genes of fungi, the brown alga Pylaiella littoralis, and the cryptophyte Rhodomonas salina.

Table 2
Gene Repertoires of Prasinophyte cpDNAs

Gene(a)      Nephroselmis  Pyramimonas  Pycnococcus  Monomastix  Ostreococcus
accD         +             −            −            −           −
ccsA         +             +            −            +           −
cemA         +             −            +            −           −
chlB         +             +            −            −           −
chlI         +             +            +            +           −
chlL         +             +            +            −           −
chlN         +             +            +            −           −
cysA         +             −            −            −           −
cysT         +             −            +            −           −
ftsI         +             −            −            −           −
ftsW         +             −            −            −           −
minD         +             −            −            −           −
ndhA         +             +            −            −           −
ndhB         +             +            −            −           −
ndhC         +             +            −            −           −
ndhD         +             +            −            −           −
ndhE         +             +            −            −           −
ndhF         +             +            −            −           −
ndhG         +             +            −            −           −
ndhH         +             +            −            −           −
ndhI         +             +            −            −           −
ndhK         +             +            −            −           −
petD         +             −            +            +           −
petL         +             −            +            −           −
petN         +             +            +            +           −
psaC         +             +            +            −           +
psaJ         +             +            −            +           +
psaM         −             +            −            +           +
psbM         +             −            −            +           −
rne          +             −            −            −           −
rnpB         +             −            +            +           −
rpl12        +             +            −            −           −
rpl19        +             −            +            −           −
rpl22        −             +            −            −           −
rpl32        +             +            −            +           +
rpoB         +             +            −            +           +
rps9         +             +            −            −           +
rrf          +             −            −            +           +
trnG(gcc)    +             +            +            +           −
trnI(cau)    +             +            −            +           −
trnL(caa)    +             −            −            −           −
trnL(gag)    +             −            +            −           +
trnP(ggg)    −             −            +            −           −
trnR(ccg)    +             +            +            −           −
trnS(cga)    +             −            +            −           −
trnS(gga)    +             −            +            −           −
trnT(ggu)    +             −            −            −           −
ycf4         +             +            +            −           −
ycf20        +(b)          +            −            +           −
ycf47        +             −            −            −           −
ycf62        +             −            −            −           −
ycf65        −             +            −            −           −
ycf81        +             −            −            −           −

(a) Only the genes that are missing in one or more genomes are indicated. A total of 80 genes are shared by all compared cpDNAs: atpA, B, E, F, H, I, clpP, ftsH, infA, petA, B, G, psaA, B, I, psbA, B, C, D, E, F, H, I, J, K, L, N, T, Z, rbcL, rpl2, 5, 14, 16, 20, 23, 36, rpoA, C1, C2, rps2, 3, 4, 7, 8, 11, 12, 14, 18, 19, rrl, rrs, tufA, trnA(ugc), C(gca), D(guc), E(uuc), F(gaa), G(ucc), H(gug), I(gau), K(uuu), L(uaa), L(uag), Me(cau), Mf(cau), N(guu), P(ugg), Q(uug), R(acg), R(ucu), S(gcu), S(uga), T(ugu), V(uac), W(cca), Y(gua), ycf1, 3, 12.
(b) ycf20 is present as a pseudogene in Nephroselmis (Lemieux C, unpublished data); it is located downstream of ndhE and corresponds to orf111 in the gene map reported by Turmel et al. (1999b).

Like Its Ostreococcus Homolog, Pycnococcus cpDNA Has a Reduced Gene Content and Is Highly Compact

The Pycnococcus chloroplast genome is the smallest and most compact of the three prasinophyte genomes sequenced during this study (table 1 and fig. 2). It is only 8.6 kb larger than Ostreococcus cpDNA and contains 10 additional conserved genes, for a total of 98 genes. In terms of size, this gene repertoire, which consists of 65 protein genes and 33 RNA genes encoding 2 rRNAs, 30 tRNAs, and the RNA component of RNase P (table 2), is similar to that observed for chlorophycean green algae (Brouard et al. 2008). The tRNA complement includes one tRNA species not previously documented in any chlorophyte [tRNAPro (GGG)] but, like its Ostreococcus homolog, lacks the tRNA species that reads the AUA codon [i.e., the tRNAIle (CAU) where C is modified posttranscriptionally to lysidine]. As in Pyramimonas cpDNA, the 5S rRNA gene was not detected. Moreover, the Pycnococcus genome is missing the protein-coding genes psaJ and rpoB, which are present in all other investigated chlorophytes. Although the Pycnococcus, Ostreococcus, and Pyramimonas cpDNAs all show a reduced gene content compared with the Nephroselmis genome, their sets of genes show substantial differences (table 2).

No vestigial IR region was identified in Pycnococcus cpDNA. The genes generally found in this region are dispersed throughout the genome; in contrast, several genes usually present within the SSC region in genomes displaying an ancestral quadripartite structure [chlN, chlL, ycf1, cysT, and trnP(ggg)] remained clustered together (supplementary fig. 2, Supplementary Material online). There are two cases of overlapping genes (ycf4–rnpB and psbD–psbC); for the other coding regions, intergenic spacers were found to vary from 0 to 383 bp, for an average length of 102 bp.

FIG. 2.—Gene map of Pycnococcus cpDNA. Genes (filled boxes) on the outside of the map are transcribed in a clockwise direction. The single intron in atpB is represented by an open box. The orf163 and orf175 revealed no detectable similarity with any known gene sequences. The genes whose orthologs are found within the IR, SSC, and LSC regions in Nephroselmis and streptophyte cpDNAs are color coded in supplementary figure 2, Supplementary Material online.

The Pycnococcus atpB intron shares with its Ostreococcus counterpart the same insertion position and a large ORF in domain IV that features the reverse transcriptase (cd01651), maturase (pfam08388), and HNH endonuclease (cd00085) domains of reverse transcriptases encoded by group II introns. The Pycnococcus and Ostreococcus intron ORFs share strong similarity with one another and with reverse transcriptase genes found in several cyanobacterial species as well as in group II introns present in the mitochondrial large subunit (LSU) rRNA gene of the red alga Porphyra purpurea (Burger et al. 1999) and the chloroplast psbA genes of Chlamydomonas sp. CCMP 1619 (Odom et al. 2004) and Euglena myxocylindracea (Sheveleva and Hallick 2004).

The Monomastix Chloroplast Genome Has a Reduced Gene Content but Is Loosely Packed with Genes

FIG. 3.—Gene map of Monomastix cpDNA. Genes (filled boxes) on the outside of the map are transcribed in a clockwise direction. Introns are represented by open boxes. The orf122 and orf125 revealed no detectable similarity with any known gene sequences. The genes whose orthologs are found within the IR, SSC, and LSC regions in Nephroselmis and streptophyte cpDNAs are color coded in supplementary figure 3, Supplementary Material online.

Compared with its Pycnococcus homolog, the Monomastix chloroplast genome is 34 kb larger, has a deficit of four genes, and contains five additional introns (table 1, fig. 3 and supplementary fig. 3, Supplementary Material online). Its increased size is largely accounted for by the expansion of intergenic spacers. The latter vary from 3 to 2,566 bp, for an average size of 524 bp, and contain a myriad of short repeated sequences rich in G + C. The 94 conserved genes specify 64 proteins and 30 RNAs (3 rRNAs, 26 tRNAs, and the RNA component of RNase P) (table 2). The 26 tRNAs can decode all 61 sense codons assuming that tRNAArg(ACG), where A is modified to inosine, recognizes all four codons of the CGX family. The reduced gene content of Monomastix is more like the gene complement of Ostreococcus than that of Pycnococcus (table 2). It features nine genes that are missing from Ostreococcus and lacks only three genes that are present in this alga, including psaC, a gene shared by the chloroplasts of all previously investigated chlorophytes. Although short dispersed repeats were mapped predominantly to intergenic regions, a small fraction was found within the coding regions of five genes (ftsH, rpoB, rpoC1, rpoC2, and ftsH) and within two introns (psbA intron and rrl intron 4) (supplementary fig. 4, Supplementary Material online). This distribution pattern resembles those reported for other chlorophyte cpDNAs rich in short repeats (Maul et al. 2002; Pombert et al. 2005, 2006; Belanger et al. 2006; de Cambiaire et al. 2006, 2007). Ranging from 19 to 58 nucleotides, the most abundant short dispersed repeats of Monomastix were classified into four families (A and A1, B and B1, and C and D) according to their sequence motifs; moreover, some repeats displaying partial sequences characteristic of distinct families were discerned (supplementary fig. 5, Supplementary Material online). The hybrid nature of the latter dispersed repeats, which were assigned to six categories (named AB, AC, AD, A1D, A1B, and BD), suggests they arose through recombination between regions carrying different repeats.
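The gene-content comparisons drawn from Table 2 (total number of conserved genes per genome, and the genes distinguishing Monomastix from Ostreococcus) amount to simple set operations. The Python sketch below is only an illustration: the gene sets are our transcription of the "+" entries of Table 2 for four of the genomes, every other entry is read as absent, and the function names are hypothetical.

```python
# Number of genes shared by all compared cpDNAs (footnote a of Table 2).
SHARED = 80

# Variable genes scored "+" in Table 2, transcribed by hand for four genomes.
present = {
    "Pyramimonas": set(
        "ccsA chlB chlI chlL chlN ndhA ndhB ndhC ndhD ndhE ndhF ndhG ndhH ndhI "
        "ndhK petN psaC psaJ psaM rpl12 rpl22 rpl32 rpoB rps9 trnG(gcc) "
        "trnI(cau) trnR(ccg) ycf4 ycf20 ycf65".split()
    ),
    "Pycnococcus": set(
        "cemA chlI chlL chlN cysT petD petL petN psaC rnpB rpl19 trnG(gcc) "
        "trnL(gag) trnP(ggg) trnR(ccg) trnS(cga) trnS(gga) ycf4".split()
    ),
    "Monomastix": set(
        "ccsA chlI petD petN psaJ psaM psbM rnpB rpl32 rpoB rrf trnG(gcc) "
        "trnI(cau) ycf20".split()
    ),
    "Ostreococcus": set("psaC psaJ psaM rpl32 rpoB rps9 rrf trnL(gag)".split()),
}


def total_genes(taxon: str) -> int:
    """Conserved gene count: shared core plus the variable genes that are present."""
    return SHARED + len(present[taxon])


def only_in(a: str, b: str) -> list:
    """Variable genes present in genome a but missing from genome b."""
    return sorted(present[a] - present[b])


for taxon in present:
    print(taxon, total_genes(taxon))
print("Monomastix only:", only_in("Monomastix", "Ostreococcus"))
print("Ostreococcus only:", only_in("Ostreococcus", "Monomastix"))
```

With this transcription, the totals come to 110, 98, 94, and 88 genes for Pyramimonas, Pycnococcus, Monomastix, and Ostreococcus, and the two set differences contain nine and three genes, in line with the figures given in the text.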

The Monomastix chloroplast genome contains a single group II intron, located in trnK(uuu), and five group I introns, one of which resides in psbA and four in the LSU rRNA gene (rrl) (fig. 3). The IIB trnK intron is inserted within the D arm of the tRNA secondary structure following G23 and lacks an ORF. All other trnK(uuu) introns that have been identified in streptophyte cpDNAs carry an internal ORF with a maturase domain (matK) and are inserted within the anticodon loop (Turmel et al. 2006). In view of their ability to encode a homing endonuclease, the five Monomastix group I introns are likely to be mobile and were probably captured via horizontal intracellular and/or intercellular DNA transfer. The IA2 psbA intron, found at position 525 relative to the corresponding Mesostigma gene, specifies a potential homing endonuclease with the GIY–YIG motif and has chloroplast homologs with the same insertion site and highly similar endonuclease genes in the ulvophytes Oltmannsiellopsis and Pseudendoclonium and the chlorophycean green algae Oedogonium and Chlamydomonas reinhardtii (Brouard et al. 2008). The remaining four group I introns encode potential LAGLIDADG homing endonucleases (Cote et al. 1993; Lucas et al. 2001) and also share identical insertion sites with a large number of chlorophyte (Lucas et al. 2001; Brouard et al. 2008) and cyanobacterial (Haugen et al. 2007) introns. The first and third LSU rDNA introns, whose insertion positions correspond to sites 1931 and 2500 in the E. coli 23S rRNA, fall within subgroup IB4, whereas the second and fourth introns inserted at sites 1951 and 2593 belong to the IA3 family. Like its Chlamydomonas homolog I-CreI, the Monomastix site-2593 intron-encoded homing endonuclease (I-MsoI) has been characterized at the 3D level in the presence of its DNA target site, revealing that the two isoschizomers display strikingly different protein/DNA contacts (Lucas et al. 2001; Chevalier et al. 2003). Interestingly, at sites 1931, 2500, and 2593, the Monomastix mitochondrial LSU rRNA gene features introns with structures and ORFs similar to those found at identical sites in the chloroplast gene (Lucas et al. 2001; Turmel M, Gagnon M-C, Otis C, Lemieux C, unpublished data), highlighting the possibility that mobile group I introns were exchanged between different organellar compartments in the Monomastix lineage. Evidence supporting such intracellular exchanges of group I introns has also been reported for the Nephroselmis (Turmel et al. 1999a) and Pseudendoclonium (Pombert et al. 2006) lineages.

FIG. 4.—Conservation of ancestral gene clusters in prasinophyte and Euglena cpDNAs. Ancestral clusters were defined as those containing genes in the same order and polarity in at least one streptophyte and one prasinophyte. For each genome, the set of genes making up each of the identified ancestral clusters is shown as black boxes connected by a horizontal line. Black boxes that are contiguous but not linked together indicate that the corresponding genes are not adjacent on the genome. Gray boxes denote individual genes that have been relocated elsewhere on the chloroplast genome and empty boxes denote missing genes. The relative polarities of the genes are not represented in this figure; for this information, consult the maps shown in figures 1–3 or that previously reported for the Nephroselmis genome (Turmel et al. 1999b).

Pyramimonas and Euglena cpDNAs Show Striking Similarities in Gene Order

Gene orders in the three newly sequenced prasinophyte chloroplast genomes were compared with one another and with those in previously examined chlorophytes, the streptophytes Mesostigma and Chlorokybus, the euglenid Euglena, and the chlorarachniophyte Bigelowiella. In all pairwise genome comparisons, except that involving Pyramimonas and Euglena, the vast majority of the identified syntenic blocks were composed exclusively of gene clusters commonly found in streptophytes and chlorophytes. Ancestral clusters of this type display substantial variability among the Euglena and prasinophyte genomes (fig. 4). Clearly, the gene-rich genome of Nephroselmis exhibits the highest number of genes (94 genes) mapping to clusters predating the split of the Chlorophyta and Streptophyta. Breakpoints within ancestral clusters proved to be too variable in position to determine which of the compared genomes are the most closely related. Note that our comparisons of the Pyramimonas genome with those of Mesostigma and Chlorokybus disclosed ancestral gene linkages that had not been reported in any chlorophyte cpDNA (e.g., psbH–petB–petD, R(ccg)–rbcL–atpB–atpE). The ancestral rps2–atpI linkage detected in the Euglena genome was also previously unrecognized in chlorophytes.

Comparison of gene orders in the Pyramimonas and Euglena cpDNAs revealed striking similarities between these genomes. Almost two-thirds of the genes in Euglena cpDNA (56 of 87) were found to be part of collinear regions, for a total of 16 syntenic blocks. Thirty-five of these genes form eight blocks that exhibit gene linkages unique to Pyramimonas and Euglena (fig. 5). Four blocks contain exclusively derived linkages, whereas the remaining four also include ancestral gene linkages present in chlorophytes and streptophytes (the rpl23, rpl32, rps12, and rrs clusters). It is interesting to note that in each of the latter four blocks, a pair of adjacent genes was cleanly excised from the Euglena genome following the formation of the derived linkages. The syntenic block containing the triad psbK–ycf12–psaM is not uniquely shared by the Pyramimonas and Euglena chloroplasts. Being also present in Chlorella, Pseudendoclonium, and Oltmannsiellopsis but not in streptophytes, this derived cluster must have arisen in prasinophytes and have been transmitted by vertical descent to the trebouxiophyte and ulvophyte lineages.
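Gene linkages such as those discussed above are shared by two genomes when the same two genes are adjacent in the same relative order and polarity, regardless of which strand the map is read from. The Python sketch below shows one common way of detecting shared adjacencies between two signed gene orders; it is not claimed to be the procedure used for the comparisons reported here, and although the gene names are taken from the text, the two toy gene orders and their polarities are hypothetical.

```python
from typing import List, Set, Tuple

SignedGene = Tuple[str, int]  # (gene name, strand: +1 or -1)
Adjacency = Tuple[SignedGene, SignedGene]


def adjacency(a: SignedGene, b: SignedGene) -> Adjacency:
    """Canonical form of 'a followed by b', identical when read from the other strand."""
    forward = (a, b)
    reverse = ((b[0], -b[1]), (a[0], -a[1]))
    return min(forward, reverse)


def adjacencies(order: List[SignedGene], circular: bool = True) -> Set[Adjacency]:
    """All neighboring gene pairs of a gene order (closing the circle if requested)."""
    n = len(order)
    last = n if circular else n - 1
    return {adjacency(order[i], order[(i + 1) % n]) for i in range(last)}


def shared_adjacencies(g1: List[SignedGene], g2: List[SignedGene]) -> Set[Adjacency]:
    """Gene pairs that are neighbors, with the same relative polarity, in both genomes."""
    return adjacencies(g1) & adjacencies(g2)


# Hypothetical gene orders; genome_y carries the psbH-petB-petD cluster on the other strand.
genome_x = [("psbH", 1), ("petB", 1), ("petD", 1), ("rbcL", -1), ("atpB", 1), ("psaA", -1)]
genome_y = [("petD", -1), ("petB", -1), ("psbH", -1), ("atpB", 1), ("rbcL", 1), ("psaA", 1)]

shared = shared_adjacencies(genome_x, genome_y)
print(len(shared), sorted(shared))
```

In this toy comparison, only the psbH–petB and petB–petD adjacencies are recovered as shared; rbcL and atpB are neighbors in both orders but with a different relative polarity, so that pair is not counted. Runs of consecutive shared adjacencies would then be merged into syntenic blocks, a step not shown here.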

Monomastix Occupies an Early-Diverging Branch of the Mamiellales in 18S rDNA Trees

FIG. 5.—Derived gene clusters uniquely shared by the Euglena and Pyramimonas cpDNAs. The genes shown as gray boxes represent the derived components of these clusters; those shown as black boxes exhibit an ancestral organization. The genes shown as empty boxes are missing in Euglena cpDNA.

Monomastix has been historically affiliated with the Prasinophyceae; however, the finding that its body scales are not typical of those found in prasinophytes but are more like those of the chrysophyte Chromulina placentula (Manton 1967) led to the exclusion of this genus from the Prasinophyceae (Melkonian 1990; Sym and Pienaar 1993). Very limited molecular information has been reported so far for Monomastix, explaining why its phylogenetic status has remained enigmatic. In the present study, we determined the sequence of the Monomastix nuclear-encoded SSU rRNA gene and compared it with those available for other prasinophytes and some representatives of the Trebouxiophyceae, Ulvophyceae, and Chlorophyceae. Trees inferred with ML unambiguously showed that Monomastix represents an early-diverging lineage of the Mamiellales (clade II) (fig. 6). This uniflagellate, which has nonprasinophyte scales, was resolved as the first branch of this morphologically diverse clade. An unquestionable affinity therefore exists between Ostreococcus and Monomastix even though these two taxa belong to different lineages of the Mamiellales. The naked Ostreococcus is closely related to the scaly Bathycoccus and the clade uniting these nonflagellated genera is sister to that containing the flagellated genera Mamiella (two flagella), Mantoniella (one flagellum), Micromonas (naked, one flagellum), and the new genus represented by isolate RCC 391 (two flagella).

FIG. 6.—Phylogenetic position of Monomastix among prasinophytes as inferred from nuclear-encoded SSU rDNA sequences. The figure presents the best ML tree. Bootstrap values are shown on the corresponding nodes. The names of the taxa whose chloroplast genomes were examined in the present study are shown on a black background. Clade numbering follows that of Guillou et al. (2004).

Chloroplast Phylogenomic Analyses Unite the Pyramimonadales with the Mamiellales and Identify the Pyramimonadales as the Source of the Euglenid Chloroplasts

FIG. 7.—Phylogenies inferred from 70 concatenated chloroplast genes (first two codon positions) and their deduced amino acid sequences. (A) Best ML tree inferred from the amino acid data set. (B) Best ML tree inferred from the nucleotide data set. The bootstrap values obtained in ML analyses and the posterior probability values obtained in Bayesian analyses are shown on the left and right, respectively, on the corresponding nodes.

To explore the relationships among prasinophyte lineages (in particular clades I, II, III, and V) as well as the relationships of chlorophyte chloroplasts with the secondarily acquired chloroplasts of Bigelowiella and Euglena, we generated data sets of 70 concatenated proteins and genes (first and second codon positions) from completely sequenced chloroplast genomes and analyzed them using the ML and Bayesian methods (fig. 7). As expected, both the protein and gene trees identified a strongly supported clade uniting the two representatives of the Mamiellales, Monomastix and Ostreococcus. This clade is sister to a robust monophyletic group clustering the Pyramimonas (scaly, four or eight flagella) and Euglena chloroplasts. Although this sister relationship received 87% bootstrap support in the protein ML tree (fig. 7A), exclusion of the long-branch taxa Euglena and Bigelowiella from the analysis resulted in 97% bootstrap support for the Pyramimonas + Monomastix + Ostreococcus clade (data not shown). In all analyses, the scaly biflagellate Nephroselmis was sister to all other chlorophytes analyzed, whereas the position of the naked, nonflagellated Pycnococcus remained equivocal. The latter prasinophyte was resolved as sister to the core chlorophytes in the protein tree (fig. 7A), but was sister to the Mamiellales, Pyramimonadales, and euglenids in the gene tree (fig. 7B). The protein and gene trees thus differ only in the branching position of the core chlorophytes with respect to the prasinophyte lineages.
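The nucleotide data set described above retains only the first and second positions of each codon before the genes are concatenated. A minimal Python sketch of this filtering and concatenation step is given below; it assumes in-frame, codon-aligned sequences of equal length for every taxon, and the gene names, taxon names, and sequences are made up for illustration rather than taken from the actual data set.

```python
from typing import Dict, List


def first_two_codon_positions(aligned_cds: str) -> str:
    """Keep positions 1 and 2 of every codon in an in-frame aligned coding sequence."""
    if len(aligned_cds) % 3 != 0:
        raise ValueError("alignment length must be a multiple of 3")
    return "".join(aligned_cds[i:i + 2] for i in range(0, len(aligned_cds), 3))


def concatenate(alignments: Dict[str, Dict[str, str]], taxa: List[str]) -> Dict[str, str]:
    """Join the filtered per-gene alignments, in sorted gene order, for each taxon."""
    supermatrix = {taxon: "" for taxon in taxa}
    for gene in sorted(alignments):
        for taxon in taxa:
            supermatrix[taxon] += first_two_codon_positions(alignments[gene][taxon])
    return supermatrix


# Hypothetical two-gene, two-taxon alignment differing only at third codon positions.
toy_alignments = {
    "geneA": {"taxon1": "ATGGCT", "taxon2": "ATGGCC"},
    "geneB": {"taxon1": "TTTAAA", "taxon2": "TTCAAG"},
}
toy_matrix = concatenate(toy_alignments, ["taxon1", "taxon2"])
print(toy_matrix)
```

Because the two toy taxa differ only at third codon positions, their filtered sequences become identical, which illustrates why this filter is typically applied: it discards the fastest-evolving and most saturation-prone sites.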

Because phylogenetic analyses based on the whole-genome approach are inherently associated with sparse taxon sampling, they can lead to trees robustly supporting an artifactual clustering of taxa (Brinkmann and Philippe 2008; Heath et al. 2008). Caution must therefore be exercised in the interpretation of the observed topologies. In the case of trees derived from complete genome sequences, structural features of these genomes can be used as independent data to test topologies (Rokas 2006). In the present study, the strong alliance we uncovered between the Pyramimonas and Euglena chloroplasts is strengthened by a number of gene linkages that are unique to the cpDNAs of these algae (fig. 5). Based on this finding, we infer with confidence that the green algal partner in the secondary endosymbiosis that gave rise to euglenids was a member of the Pyramimonadales. Euglenids are unicellular organisms that belong to the Excavata, a supergroup of eukaryotes including diverse nonphotosynthetic groups like diplomonads, retortamonads, parabasalids, oxymonads, and jakobids (Baldauf et al. 2000, 2008; Keeling et al. 2005). Euglenids are the only photosynthetic excavates, and they are specifically related to a subgroup containing the kinetoplastids and diplonemids (Triemer and Farmer 2007). Prior to our study, published data were consistent with the notion that the euglenid chloroplasts evolved from a green algal endosymbiont that was allied to prasinophytes (Turmel et al. 1999b; Ishida et al. 1997; Rogers et al. 2007); however, it remained unknown as to which of the monophyletic groups of prasinophytes harbored the closest relative of the euglenid endosymbiont. In agreement with our results, the ML tree that Ishida et al. (1997) inferred from the amino acid sequences of elongation factor Tu revealed a strongly supported clade clustering Pyramimonas disomata and the euglenids E. gracilis and Astasia longa; however, this Pyramimonas species was the only prasinophyte sampled in this single-gene analysis. Likewise, considering that P. parkeae is the unique representative of the Pyramimonadales in our chloroplast phylogenomic study, there remain uncertainties about the exact pyramimonadalean lineage that was the source of the euglenid chloroplasts.

In the eukaryotic tree of life based on nuclear-encoded genes, euglenids and chlorarachniophytes fall within distinct branches. Like euglenids, chlorarachniophytes belong to a supergroup of eukaryotes that is primarily nonphotosynthetic, the Rhizaria (Keeling et al. 2005; Baldauf 2008). By robustly placing Bigelowiella at a separate position from Euglena, our chloroplast phylogenomic analyses strongly reinforce the hypothesis that the euglenid and chlorarachniophyte chloroplasts trace back to two independent secondary endosymbioses (Rogers et al. 2007; Takahashi et al. 2007) (fig. 7). Although the chloroplast of Bigelowiella was found to be sister to those of the ulvophytes Pseudendoclonium and Oltmannsiellopsis in both the protein and gene trees, broader sampling of core chlorophytes will be required to pinpoint the closest green algal relative of the chlorarachniophyte endosymbiont.

The most unexpected finding that emerged from our study is the observation that the Pyramimonas + Euglena clade is sister to the Monomastix + Ostreococcus clade. Although the existence of a sister relationship between the Pyramimonadales and Mamiellales has not been previously documented, it is compatible with the resemblance that these monophyletic groups display at the level of flagellar scale structure (Melkonian 1984, 1990; O’Kelly 1992; Sym and Pienaar 1993) and with the branching order inferred from 18S rDNA data. The Pyramimonadales emerge just before the Mamiellales in most 18S rDNA trees (Steinkotter et al. 1994; Nakayama et al. 1998; Fawley et al. 2000; Guillou et al. 2004); however, these lineages form a weakly supported clade in the ML tree recently reported by Nakayama et al. (2007). No similarities were found at the chloroplast gene order level that link the Pyramimonadales and Mamiellales to the exclusion of other chlorophyte groups; however, losses of at least four genes (cemA, cysT, petL, and rpl19) could be traced back unambiguously to the common ancestor of the Pyramimonadales and Mamiellales (supplementary table 1, Supplementary Material online).

Because the Pyramimonadales and Mamiellales are distinguished by prominent morphological differences, the existence of a sister relationship between these lineages has important implications for the evolution of prasinophytes. All members of the Pyramimonadales, which represent the five genera indicated in figure 6 and also probably the Tasmanites (a fossil resembling the phycoma stages of Cymbomonas, Pterosperma, and Halosphaera, which has been found in Precambrian deposits), share a number of synapomorphic characters and have at least four flagella and a complex scaly covering consisting of three layers of scales on the cell body and of two layers on the flagella (Melkonian 1984, 1990; Sym and Pienaar 1993). The intermediate scale layer on the cell body consists of spiderweb-shaped scales in Pterosperma and is homologous to the outer scale layer on the flagellum (the limulus scales) and to the spiderweb scales of the Mamiellales. The limuloid scales of Cymbomonas are also reminiscent of the spiderweb scales of the Mamiellales, particularly during morphogenesis (Moestrup et al. 2003). Interestingly, an apparent food-uptake apparatus is present in Cymbomonas, which has been interpreted as a character inherited from a phagotrophic ancestor of the green plants and subsequently lost during evolution of the green algae (Moestrup et al. 2003). On the other hand, the members of the Mamiellales show reduced morphological complexity and are characterized by a progressive simplification of cellular structure and a reduction in cell size that occurred concomitantly with the loss of scales (Nakayama et al. 1998). They lack an underlayer of square-shaped scales (such scales are present in most other prasinophyte lineages and the flagellate reproductive cells of streptophytes) and no microtubular flagellar roots are attached to the basal body no. 2. A sister relationship between the Pyramimonadales and Mamiellales implies that some of the cellular features displayed by the Mamiellales were derived from the more complex organization seen in the Pyramimonadales and presumably in the common ancestor of all chlorophytes. In this context, it is worth mentioning that the nature of the progenitor of all green plants has generated intense debate and is still controversial (Melkonian 1984; O’Kelly 1992; Sym and Pienaar 1993). A better understanding of the relationships among prasinophyte lineages will be required before one can infer with confidence evolutionary scenarios of cellular changes.

At present, the identity of the earliest-diverging chlorophyte lineage remains uncertain. Intriguingly, the trees inferred from 18S rDNA sequences (Guillou et al. 2004; Nakayama et al. 2007) are in discordance with the chloroplast phylogenomic trees reported in this study with regard to the position of the Nephroselmis genus (clade III). The early-diverging position observed for the Nephroselmis representative in chloroplast trees is in agreement with the high degree of ancestral features found in the cpDNA of this taxon (see fig. 8) but contrasts sharply with the much later divergence observed for the genus in 18S rDNA trees. In the latter trees, the branch occupied by Nephroselmis species emerges near the lineage containing Pycnococcus and Pseudocourfieldia marina, the clade VII containing only picoplanktonic species, and the clade containing the core chlorophytes (Chlorodendrales sensu [Melkonian 1990] + Trebouxiophyceae + Ulvophyceae + Chlorophyceae). Together, these lineages form a large clade that is well supported in ML analysis (fig. 6). Given the close relationship observed on the basis of scale structure between Nephroselmis and the genera Tetraselmis and Scherffelia, Nakayama et al. (2007) proposed that the common ancestor of the clade containing Nephroselmis and the core chlorophytes had two layers of small scales on the flagella (square-shaped scales and rod-shaped scales) and cell body (square scales and stellate scales). The above-mentioned discrepancy between nuclear and chloroplast trees highlights the need for analysis of chloroplast genomes from additional prasinophytes. Sampling of chloroplast genomes from all seven known lineages of prasinophytes will be required to determine the exact position of Nephroselmis relative to the Pycnococcaceae, Pyramimonadales, and Mamiellales.

Losses of Multiple Ancestral cpDNA Characters in Independent Prasinophyte Lineages Are Correlated with Major Cellular Remodeling

To trace some of the evolutionary changes that occurred at the chloroplast genome level during the evolution of prasinophytes and euglenids, losses of 62 genes and 75 ancestral gene pairs were mapped on the tree topology inferred from sequence data (fig. 8). In this analysis, the core chlorophytes were excluded and the streptophytes Mesostigma and Chlorokybus were used as outgroup. Although multiple characters were lost in independent lineages, a substantial fraction of losses are uniquely shared. In particular, the monophyletic group containing the Mamiellales + euglenids + Pyramimonadales and the node linking the latter clade with the Pycnococcaceae are supported by several changes that occurred only once. Because the nuclear genome of just one prasinophyte genus (Ostreococcus) has been sequenced so far (Derelle et al. 2006; Palenik et al. 2007), we cannot interpret our results in terms of gene transfers from the chloroplast to the nucleus. Most of the genes that vanished from the chloroplast genome probably fall into this category; however, some might have disappeared entirely from the cell because their requirement is restricted to certain growth and physiological conditions (e.g., the chl genes associated with chlorophyll synthesis in the dark, the cys genes involved in sulfate and thiosulfate transport, and the ndh genes associated with chlororespiration).
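The distinction between unique and convergent losses can be illustrated by placing each loss on the tree at the most inclusive clade in which the gene is missing from every member. The Python sketch below rests on several assumptions of ours and should not be read as the mapping procedure used for figure 8: a gene present in the ancestor is never regained once lost, Euglena and the streptophyte outgroup are omitted, the topology is the prasinophyte portion of the gene tree described in the text, and the presence sets for the three example genes are transcribed from Table 2.

```python
from typing import List, Set, Tuple, Union

Tree = Union[str, tuple]  # a leaf name, or a tuple of subtrees


def tips(node: Tree) -> List[str]:
    """Leaf names below a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in tips(child)]


def loss_events(node: Tree, present: Set[str]) -> List[Tuple[str, ...]]:
    """Clades in which a gene was lost, assuming it is never regained.

    A loss is placed at the most inclusive clade whose members all lack the gene.
    """
    members = tips(node)
    if all(taxon in present for taxon in members):
        return []
    if not any(taxon in present for taxon in members):
        return [tuple(members)]
    events = []
    for child in node:  # mixed presence/absence: look further down the tree
        events.extend(loss_events(child, present))
    return events


def classify(events: List[Tuple[str, ...]]) -> str:
    """Label a gene by the number of inferred loss events."""
    if not events:
        return "retained"
    return "unique loss" if len(events) == 1 else "convergent losses"


# Assumed topology (prasinophytes only, following the gene tree described in the text).
TOPOLOGY = ("Nephroselmis", ("Pycnococcus", (("Monomastix", "Ostreococcus"), "Pyramimonas")))

# Genomes in which each example gene is scored "+" in Table 2.
PRESENT_IN = {
    "cemA": {"Nephroselmis", "Pycnococcus"},
    "chlB": {"Nephroselmis", "Pyramimonas"},
    "psaC": {"Nephroselmis", "Pyramimonas", "Pycnococcus", "Ostreococcus"},
}

for gene, present in PRESENT_IN.items():
    events = loss_events(TOPOLOGY, present)
    print(gene, classify(events), events)
```

Under these assumptions, cemA is lost once in the ancestor of the Pyramimonadales and Mamiellales, consistent with the inference stated earlier in the text, whereas chlB requires two independent losses (in Pycnococcus and in the Mamiellales) and psaC is lost only in Monomastix.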

The chloroplast genome sustained important reduction in gene content in at least three separate lineages, namely, the lineages leading to Euglena, to the mamiellalean genera Monomastix and Ostreococcus, and to Pycnococcus (fig. 8). In light of the close affinity of the Pyramimonas and Euglena chloroplast genomes, we propose that the secondary endosymbiosis that gave rise to the euglenid chloroplasts was accompanied by extensive gene losses. Similar extinction of numerous chloroplast genes has been associated with the secondary endosymbiosis that involved the capture of a red alga and generated the chloroplasts of heterokonts, cryptophytes, and haptophytes (Khan et al. 2007; Oudot-Le Secq et al. 2007; Cattolico et al. 2008). With regard to the Mamiellales, it appears that the common ancestor of Monomastix and Ostreococcus had already experienced multiple chloroplast gene losses (fig. 8), implying that these events might have accompanied the simplification of cell organization that presumably coincided with the emergence of the Mamiellales. Moreover, as indicated by the higher frequency of gene losses in the Ostreococcus lineage compared with the Monomastix lineage, part of the gene losses in the former lineage were likely connected with the evolution of the coccoid cell organization and the reduction in cell size. Pycnococcus represents an independent coccoid lineage that sustained considerable reduction of the chloroplast genome, and as observed for Ostreococcus, there was strong pressure to maintain a compact genome organization. In contrast, the genome of the mamiellalean Monomastix followed a divergent evolutionary pathway and became loosely packed with genes following proliferation of small dispersed repeats (table 1 and supplementary fig. 4, Supplementary Material online).

The pressure to maintain the ancestral quadripartite architecture became relaxed during the evolution of prasinophytes and euglenids. The IR was lost at least three times (fig. 8), an observation that is not surprising given that independent IR losses have been documented for the class Trebouxiophyceae (de Cambiaire et al. 2007) and for land plants (Palmer 1991; Raubeson and Jansen 2005). More unexpected was our finding that the three examined IR-containing prasinophyte cpDNAs differ significantly in the distribution of their genes among the two SC regions and in the orientation of the IR relative to these regions.

Although the gene partitioning pattern of the Nephroselmis genome is the most similar to that observed for streptophytes and some nongreen algae (Turmel et al. 1999b), the reduced genome of Ostreococcus shows a pattern (supplementary fig. 1, Supplementary Material online) more like that observed for the ulvophytes Pseudendoclonium and Oltmannsiellopsis (Pombert et al. 2005, 2006). When the latter pattern was identified in Pseudendoclonium, it was hypothesized that it might represent an intermediate form between the highly derived pattern found in the chlorophycean green alga C. reinhardtii and the ancestral quadripartite structure found in streptophytes, Nephroselmis, and probably early-diverging trebouxiophytes, thus lending support to the notion that the Ulvophyceae is sister to the Chlorophyceae (Pombert et al. 2005). However, the great variability in the quadripartite structure uncovered here for the Prasinophyceae and recently reported for the Chlorophyceae (de Cambiaire et al. 2006; Brouard et al. 2008) casts doubt on the phylogenetic value of this genomic feature. Clearly, these data indicate that chloroplast genome rearrangements led to exchanges of genes between opposite SC regions on multiple occasions during the evolutionary history of chlorophytes.

FIG. 8.—Losses of chloroplast genes and gene pairs during the evolution of prasinophytes and euglenids. Unique losses are indicated by squares, whereas convergent losses in two or more lineages are indicated by triangles. Red and blue symbols refer to losses of genes and gene pairs, respectively. Some gene pairs disappeared as a result of gene losses; those that were not correlated with any gene losses are denoted by dots. The number below each taxon name indicates the total number of conserved genes in the chloroplast genome. Losses of the IR occurred in the three indicated lineages.

Conclusions

The chloroplast genome of prasinophytes exhibits much more fluidity in gene content and arrangement than anticipated from the earlier reports on the Nephroselmis and Ostreococcus genomes. Major reduction and restructuring of the chloroplast genome occurred in conjunction with changes in cell organization in at least two lineages, the Mamiellales and Pycnococcaceae. By disclosing the existence of a sister relationship between the Mamiellales and Pyramimonadales, our study represents a significant step toward a better understanding of prasinophyte evolution. Furthermore, it offers for the first time compelling evidence that the evolutionary history of the prasinophytes was directly linked with the acquisition of photosynthesis through secondary endosymbiosis by a subgroup of excavates, the euglenids. Two independent lines of evidence, trees inferred from sequence data and the presence of uniquely shared derived gene clusters, robustly support the notion that the green algal ancestor of the euglenid chloroplasts belonged to the Pyramimonadales. Although sampling of Bigelowiella has not enabled us to pinpoint the green algal donor of chlorarachniophyte chloroplasts, the inferred trees strengthen the hypothesis that chloroplasts arose independently in chlorarachniophytes and euglenids. Considering that pyramimonadaleans are richer in ancestral characters at the chloroplast genome level and exhibit a more pronounced level of cell asymmetry and complexity compared with the mamiellaleans, it is plausible that cell asymmetry characterized the common ancestor of these lineages. Consistent with the hypothesis that the common ancestor of all chlorophytes also featured an asymmetrical cell architecture is the observation that Nephroselmis occupies the earliest divergence of the Chlorophyta and displays the highest conservation of ancestral characters. Future chloroplast genome investigations incorporating the Chlorodendrales, the two picoplanktonic lineages not sampled in the present study, and a broader range of taxa in each lineage should further resolve the branching pattern of prasinophyte lineages and clarify the number of separate events that gave rise to coccoids and streamlining of the chloroplast genome.

Supplementary Material

Supplementary figures 1–5, supplementary table 1, the data sets used in phylogenetic analyses, and the data set used to infer the evolutionary scenario of character losses are available at Molecular Biology and Evolution online (http://mbe.oxfordjournals.org/). The fully annotated chloroplast genome sequences of Monomastix, Pycnococcus, and Pyramimonas have been deposited in the GenBank database under the accession numbers FJ493497, FJ493498, and FJ493499, respectively. The GenBank accession number for the Monomastix 18S rDNA sequence determined in this study is FJ493496.
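For readers who wish to retrieve the deposited records programmatically, the following minimal sketch builds request URLs for the NCBI E-utilities `efetch` service. The helper name `efetch_url` and the choice of the `nuccore` database with GenBank flat-file output are assumptions made for illustration; the accession numbers are those listed above. No network request is issued by the code itself.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Chloroplast genome accessions as listed in the text.
ACCESSIONS = {
    "Monomastix": "FJ493497",
    "Pycnococcus": "FJ493498",
    "Pyramimonas": "FJ493499",
}


def efetch_url(accession: str, rettype: str = "gb") -> str:
    """Build an efetch URL for one nucleotide record (hypothetical helper)."""
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


# Map each taxon to the URL of its GenBank flat file.
URLS = {taxon: efetch_url(acc) for taxon, acc in ACCESSIONS.items()}
```

Each URL can then be passed to any HTTP client, for example `urllib.request.urlopen`, subject to NCBI usage guidelines.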

Acknowledgments

We thank Mathieu Blais and Bertrand Caillier for their assistance in cloning and sequencing the Pyramimonas chloroplast genome. This study was supported by a grant from the Natural Sciences and Engineering Research Council of Canada (to M.T. and C.L.).

Literature Cited

Baldauf SL. 2008. An overview of the phylogeny and diversity of eucaryotes. J Syst Evol. 46:263–273.

Baldauf SL, Roger AJ, Wenk-Siefert I, Doolittle WF. 2000. A kingdom-level phylogeny of eukaryotes based on combined protein data. Science. 290:972–977.

Belanger A-S, Brouard J-S, Charlebois P, Otis C, Lemieux C, Turmel M. 2006. Distinctive architecture of the chloroplast genome in the chlorophycean green alga Stigeoclonium helveticum. Mol Genet Genomics. 276:464–477.

Brinkmann H, Philippe H. 2008. Animal phylogeny and large-scale sequencing: progress and pitfalls. J Syst Evol. 46:274–286.

Brouard J-S, Otis C, Lemieux C, Turmel M. 2008. Chloroplast DNA sequence of the green alga Oedogonium cardiacum (Chlorophyceae): unique genome architecture, derived characters shared with the Chaetophorales and novel genes acquired through horizontal transfer. BMC Genomics. 9:290.

Burger G, Saint-Louis D, Gray MW, Lang BF. 1999. Complete sequence of the mitochondrial DNA of the red alga Porphyra purpurea. Cyanobacterial introns and shared ancestry of red and green algae. Plant Cell. 11:1675–1694.

Castresana J. 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 17:540–552.

Cattolico R, Jacobs M, Zhou Y, Chang J, Duplessis M, Lybrand T, McKay J, Ong H, Sims E, Rocap G. 2008. Chloroplast genome sequencing analysis of Heterosigma akashiwo CCMP452 (West Atlantic) and NIES293 (West Pacific) strains. BMC Genomics. 9:211.

Chevalier B, Turmel M, Lemieux C, Monnat RJ, Stoddard BL. 2003. Flexible DNA target site recognition by divergent homing endonuclease isoschizomers I-CreI and I-MsoI. J Mol Biol. 329:253–269.

Cote V, Mercier J-P, Lemieux C, Turmel M. 1993. The single group-I intron in the chloroplast rrnL gene of Chlamydomonas humicola encodes a site-specific DNA endonuclease (I-ChuI). Gene. 129:69–76.

Courties C, Vaquer A, Troussellier M, Lautier J, Chretiennot-Dinet MJ, Neveux J, Machado C, Claustre H. 1994. Smallest eukaryotic organism. Nature. 370:255.

de Cambiaire J-C, Otis C, Lemieux C, Turmel M. 2006. The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands. BMC Evol Biol. 6:37.

de Cambiaire J-C, Otis C, Lemieux C, Turmel M. 2007. The chloroplast genome sequence of the green alga Leptosira terrestris: multiple losses of the inverted repeat and extensive genome rearrangements within the Trebouxiophyceae. BMC Genomics. 8:213.

Derelle E, Ferraz C, Rombauts S, et al. (26 co-authors). 2006. Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features. Proc Natl Acad Sci USA. 103:11647–11652.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Farris JS. 1977. Phylogenetic analysis under Dollo’s Law. Syst Zool. 26:77–88.

Fawley MW, Yun Y, Qin M. 2000. Phylogenetic analyses of 18S rDNA sequences reveal a new coccoid lineage of the Prasinophyceae (Chlorophyta). J Phycol. 36:387–393.

Goulding SE, Olmstead RG, Morden CW, Wolfe KH. 1996. Ebb and flow of the chloroplast inverted repeat. Mol Gen Genet. 252:195–206.

Guillou L, Eikrem W, Chretiennot-Dinet M-J, Le Gall F, Massana R, Romari K, Pedros-Alio C, Vaulot D. 2004. Diversity of picoplanktonic prasinophytes assessed by direct nuclear SSU rDNA sequencing of environmental samples and novel isolates retrieved from oceanic and coastal marine ecosystems. Protist. 155:193–214.

Hallick RB, Hong L, Drager RG, Favreau MR, Monfort A, Orsat B, Spielmann A, Stutz E. 1993. Complete sequence of Euglena gracilis chloroplast DNA. Nucleic Acids Res. 21:3537–3544.

Hamby RK, Zimmer EA. 1991. Ribosomal RNA as a phylogenetic tool in plant systematics. In: Soltis P, Soltis D, Doyle J, editors. Molecular systematics in plants. New York: Routledge, Chapman and Hall. p. 50–91.

Haugen P, Bhattacharya D, Palmer JD, Turner S, Lewis LA, Pryer KM. 2007. Cyanobacterial ribosomal RNA genes with multiple, endonuclease-encoding group I introns. BMC Evol Biol. 7:159.

Heath TA, Hedtke SM, Hillis DM. 2008. Taxon sampling and the accuracy of phylogenetic analyses. J Syst Evol. 46:239–257.

Ishida K, Cao Y, Hasegawa M, Okada N, Hara Y. 1997. The origin of chlorarachniophyte plastids, as inferred from phylogenetic comparisons of amino acid sequences of EF-Tu. J Mol Evol. 45:682–687.

Jansen RK, Cai Z, Raubeson LA, et al. (16 co-authors). 2007. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc Natl Acad Sci USA. 104:19369–19374.

Jobb G, von Haeseler A, Strimmer K. 2004. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol. 4:18.

Keeling PJ, Burger G, Durnford DG, Lang BF, Lee RW, Pearlman RE, Roger AJ, Gray MW. 2005. The tree of eukaryotes. Trends Ecol Evol. 20:670–676.

Keller MD, Selvin RC, Claus W, Guillard RRL. 1987. Media for the culture of oceanic ultraphytoplankton. J Phycol. 23:633–638.

Khan H, Parks N, Kozera C, Curtis BA, Parsons BJ, Bowman S, Archibald JM. 2007. Plastid genome sequence of the cryptophyte alga Rhodomonas salina CCMP1319: lateral transfer of putative DNA replication machinery and a test of chromist plastid phylogeny. Mol Biol Evol. 24:1832–1842.

Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. 2001. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 29:4633–4642.

Lambowitz AM, Zimmerly S. 2004. Mobile group II introns. Annu Rev Genet. 38:1–35.

Latasa M, Scharek R, Le Gall F, Guillou L. 2004. Pigment suites and taxonomic groups in Prasinophyceae. J Phycol. 40:1149–1155.

Lemieux C, Otis C, Turmel M. 2000. Ancestral chloroplast genome in Mesostigma viride reveals an early branch of green plant evolution. Nature. 403:649–652.

Lemieux C, Otis C, Turmel M. 2007. A clade uniting the green algae Mesostigma viride and Chlorokybus atmophyticus represents the deepest branch of the Streptophyta in chloroplast genome-based phylogenies. BMC Biol. 5:2.

Lewis LA, McCourt RM. 2004. Green algae and the origin of land plants. Am J Bot. 91:1535–1556.

Lucas P, Otis C, Mercier J-P, Turmel M, Lemieux C. 2001. Rapid evolution of the DNA-binding site in LAGLIDADG homing endonucleases. Nucleic Acids Res. 29:960–969.

Maddison D, Maddison W. 2000. MacClade 4: analysis of phylogeny and character evolution. Sunderland (MA): Sinauer Associates.

Manton I. 1967. Electron microscopical observations on a clone of Monomastix Scherffel in culture. Nova Hedwigia. 14:1–11.

Marin B, Melkonian M. 1999. Mesostigmatophyceae, a new class of streptophyte green algae revealed by SSU rRNA sequence comparisons. Protist. 150:399–417.

Mattox KR, Stewart KD. 1984. Classification of the green algae: a concept based on comparative cytology. In: Irvine DEG, John DM, editors. The systematics of the green algae. London: Academic Press. p. 29–72.

Maul JE, Lilly JW, Cui L, dePamphilis CW, Miller W, Harris EH, Stern DB. 2002. The Chlamydomonas reinhardtii plastid chromosome: islands of genes in a sea of repeats. Plant Cell. 14:2659–2679.

McCracken DA, Nadakavukaren MJ, Cain JR. 1980. A biochemical and ultrastructural evaluation of the taxonomic position of Glaucosphaera vacuolata Korsch. New Phytol. 86:39–44.

Melkonian M. 1984. Flagellar apparatus ultrastructure in relation to green algal classification. In: Irvine DEG, John DM, editors. The systematics of the green algae. London: Academic Press. p. 73–120.

Melkonian M. 1990. Phylum Chlorophyta. Class Prasinophyceae. In: Margulis L, Corliss JO, Melkonian M, Chapman DJ, editors. Handbook of protoctista. The structure, cultivation, habitats and life histories of the eukaryotic microorganisms and their descendants exclusive of animals, plants and fungi. Boston: Jones and Bartlett Publishers. p. 600–607.

Michel F, Umesono K, Ozeki H. 1989. Comparative and functional anatomy of group II catalytic introns – a review. Gene. 82:5–30.

Michel F, Westhof E. 1990. Modelling of the three-dimensional architecture of group I catalytic introns based on comparative sequence analysis. J Mol Biol. 216:585–610.

Moestrup O, Inouye I, Hori T. 2003. Ultrastructural studies on Cymbomonas tetramitiformis (Prasinophyceae). I. General structure, scale microstructure, and ontogeny. Can J Bot. 81:657–671.

Moestrup O, Thomsen HA. 1974. An ultrastructural study of the flagellate Pyramimonas orientalis with particular emphasis on golgi apparatus activity and the flagellar apparatus. Protoplasma. 81:247–269.

Moestrup O, Throndsen J. 1988. Light and electron microscopical studies on Pseudoscourfieldia marina a primitive scaly green flagellate prasinophyceae with posterior flagella. Can J Bot. 66:1415–1434.

Nakayama T, Marin B, Kranz HD, Surek B, Huss VAR, Inouye I, Melkonian M. 1998. The basal position of scaly green flagellates among the green algae (Chlorophyta) is revealed by analyses of nuclear-encoded SSU rRNA sequences. Protist. 149:367–380.

Nakayama T, Suda S, Kawachi M, Inouye I. 2007. Phylogeny and ultrastructure of Nephroselmis and Pseudoscourfieldia (Chlorophyta), including the description of Nephroselmis anterostigmatica sp. nov. and a proposal for the Nephroselmidales ord. nov. Phycologia. 46:680–697.

O’Kelly CJ. 1992. Flagellar apparatus architecture and the phylogeny of “green algae”: chlorophytes, euglenoids, glaucophytes. In: Menzel D, editor. The cytoskeleton of the algae. Boca Raton: CRC Press. p. 315–345.

Odom OW, Shenkenberg DL, Garcia JA, Herrin DL. 2004. A horizontally acquired group II intron in the chloroplast psbA gene of a psychrophilic Chlamydomonas: in vitro self-splicing and genetic evidence for maturase activity. RNA. 10:1097–1107.

Oudot-Le Secq M-P, Grimwood J, Shapiro H, Armbrust EV, Bowler C, Green BR. 2007. Chloroplast genomes of the diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana: comparison with other plastid genomes of the red lineage. Mol Genet Genomics. 277:427–439.

Palenik B, Grimwood J, Aerts A, et al. 2007. The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation. Proc Natl Acad Sci USA. 104:7705–7710.

Palmer JD. 1991. Plastid chromosomes: structure and evolution. In: Bogorad L, Vasil K, editors. The molecular biology of plastids. San Diego: Academic Press. p. 5–53.

Pombert J-F, Lemieux C, Turmel M. 2006. The complete chloroplast DNA sequence of the green alga Oltmannsiellopsis viridis reveals a distinctive quadripartite architecture in the chloroplast genome of early diverging ulvophytes. BMC Biol. 4:3.

Pombert J-F, Otis C, Lemieux C, Turmel M. 2005. The chloroplast genome sequence of the green alga Pseudendoclonium akinetum (Ulvophyceae) reveals unusual structural features and new insights into the branching order of chlorophyte lineages. Mol Biol Evol. 22:1903–1918.

Proschold T, Leliaert F. 2007. Systematics of the green algae: conflict of classic and modern approaches. In: Brodie J, Lewis J, editors. Unravelling the algae: the past, present, and future of algal systematics. Boca Raton: CRC Press, Taylor & Francis. p. 123–153.

Qiu YL, Li LB, Wang B, et al. (21 co-authors). 2006. The deepest divergences in land plants inferred from phylogenomic evidence. Proc Natl Acad Sci USA. 103:15511–15516.

Raubeson LA, Jansen RK. 2005. Chloroplast genomes of plants. In: Henry RJ, editor. Plant diversity and evolution: genotypic and phenotypic variation in higher plants. Wallingford: CABI Publishing. p. 45–68.

Robbens S, Derelle E, Ferraz C, Wuyts J, Moreau H, Van de Peer Y. 2007. The complete chloroplast and mitochondrial DNA sequence of Ostreococcus tauri: organelle genomes of the smallest eukaryote are examples of compaction. Mol Biol Evol. 24:956–968.

Rodriguez-Ezpeleta N, Philippe H, Brinkmann H, Becker B, Melkonian M. 2007. Phylogenetic analyses of nuclear, mitochondrial, and plastid multigene data sets support the placement of Mesostigma in the Streptophyta. Mol Biol Evol. 24:723–731.

Rogalski M, Karcher D, Bock R. 2008. Superwobbling facilitates translation with reduced tRNA sets. Nat Struct Mol Biol. 15:192–198.

Rogers MB, Gilson PR, Su V, McFadden GI, Keeling PJ. 2007. The complete chloroplast genome of the chlorarachniophyte Bigelowiella natans: evidence for independent origins of chlorarachniophyte and euglenid secondary endosymbionts. Mol Biol Evol. 24:54–62.

Rokas A. 2006. Genomics and the tree of life. Science. 313:1897–1899.

Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 19:1572–1574.

Sandaa RA, Heldal M, Castberg T, Thyrhaug R, Bratbak G. 2001. Isolation and characterization of two viruses with large genome size infecting Chrysochromulina ericina (Prymnesiophyceae) and Pyramimonas orientalis (Prasinophyceae). Virology. 290:272–280.

Sheveleva EV, Hallick RB. 2004. Recent horizontal intron transfer to a chloroplast genome. Nucleic Acids Res. 32:803–810.

Steinkotter J, Bhattacharya D, Semmelroth I, Bibeau C, Melkonian M. 1994. Prasinophytes form independent lineages within the Chlorophyta: evidence from ribosomal RNA sequence comparisons. J Phycol. 30:340–345.

Swofford DL. 2003. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4. Sunderland (MA): Sinauer Associates.

Sym SD, Pienaar RN. 1993. The class Prasinophyceae. In: Round FE, Chapman DJ, editors. Progress in phycological research. Bristol: Biopress Ltd. p. 281–376.

Takahashi F, Okabe Y, Nakada T, Sekimoto H, Ito M, Kataoka H, Nozaki H. 2007. Origins of the secondary plastids of Euglenophyta and Chlorarachniophyta as revealed by an analysis of the plastid-targeting, nuclear-encoded gene psbO. J Phycol. 43:1302–1309.

Triemer R, Farmer M. 2007. A decade of euglenoid molecular phylogenetics. In: Brodie J, Lewis J, editors. Unravelling the algae: the past, present, and future of algal systematics. Boca Raton: CRC Press, Taylor & Francis. p. 315–330.

Turmel M, Brouard JS, Gagnon C, Otis C, Lemieux C. 2008. Deep division in the Chlorophyceae (Chlorophyta) revealed by chloroplast phylogenomic analyses. J Phycol. 44:739–750.

Turmel M, Lemieux C, Burger G, Lang BF, Otis C, Plante I, Gray MW. 1999a. The complete mitochondrial DNA sequences of Nephroselmis olivacea and Pedinomonas minor: two radically different evolutionary patterns within green algae. Plant Cell. 11:1717–1729.

Turmel M, Otis C, Lemieux C. 1999b. The complete chloroplast DNA sequence of the green alga Nephroselmis olivacea: insights into the architecture of ancestral chloroplast genomes. Proc Natl Acad Sci USA. 96:10248–10253.

Turmel M, Otis C, Lemieux C. 2005. The complete chloroplast DNA sequences of the charophycean green algae Staurastrum and Zygnema reveal that the chloroplast genome underwent extensive changes during the evolution of the Zygnematales. BMC Biol. 3:22.

Turmel M, Otis C, Lemieux C. 2006. The chloroplast genome sequence of Chara vulgaris sheds new light into the closest green algal relatives of land plants. Mol Biol Evol. 23:1324–1338.

Wakasugi T, Nagai T, Kapoor M, et al. (15 co-authors). 1997. Complete nucleotide sequence of the chloroplast genome from the green alga Chlorella vulgaris: the existence of genes possibly involved in chloroplast division. Proc Natl Acad Sci USA. 94:5967–5972.

White TJ, Bruns T, Lee S, Taylor J. 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In: Innis MA, Gelfand DH, Sninsky JJ, White TJ, editors. PCR protocols: a guide to methods and applications. San Diego: Academic Press. p. 315–322.

Wolf PG, Karol KG, Mandoli DF, et al. 2005. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae). Gene. 350:117–128.

Martin Embley, Associate Editor

Accepted December 8, 2008
