The Plant Journal (2013) 75, 699–709, doi: 10.1111/tpj.12233



Characterization of the LTR retrotransposon repertoire of a plant clade of six diploid and one tetraploid species

Mathieu Piednoël*, Greta Carrete-Vega and Susanne S. Renner

Systematic Botany and Mycology, University of Munich (LMU), Munich 80638, Germany

Received 8 January 2013; accepted 2 May 2013; published online 10 May 2013.

*For correspondence (e-mail [email protected]).

SUMMARY

Comparisons of closely related species are needed to understand the fine-scale dynamics of retrotransposon evolution in flowering plants. Towards this goal, we classified the long terminal repeat (LTR) retrotransposons from six diploid and one tetraploid species of Orobanchaceae. The study species are the autotrophic, non-parasitic Lindenbergia philippensis (as an out-group) and six closely related holoparasitic species of Orobanche [O. crenata, O. cumana, O. gracilis (tetraploid) and O. pancicii] and Phelipanche (P. lavandulacea and P. ramosa). All major plant LTR retrotransposon clades could be identified, and appear to be inherited from a common ancestor. Species of Orobanche, but not Phelipanche, are enriched in Ty3/Gypsy retrotransposons due to a diversification of elements, especially chromoviruses. This is particularly striking in O. gracilis, where tetraploidization seems to have contributed to the Ty3/Gypsy enrichment and led to the emergence of seven large species-specific families of chromoviruses. The preferential insertion of chromoviruses in heterochromatin via their chromodomains might have favored their diversification and enrichment. Our phylogenetic analyses of LTR retrotransposons from Orobanchaceae also revealed that the Bianca clade of Ty1/Copia and the SMART-related elements are much more widely distributed among angiosperms than previously known.

Keywords: next-generation sequencing, polyploidy, genome downsizing, transposable elements, LTR retrotransposons, Ty3/Gypsy, Ty1/Copia, Orobanche, Phelipanche, Orobanchaceae.

INTRODUCTION

In angiosperms, nuclear genome size varies 2400-fold (Pellicer et al., 2010), largely because of different proportions of non-coding DNA, especially repetitive DNA (Leitch, 2007). Apart from whole-genome duplications (polyploidization), the main cause of genome size increase is the accumulation of tandem-repeat DNA families and transposable elements (TEs). Variation in nuclear genome size is of major evolutionary importance because it determines key traits, such as the duration of the cell cycle, that directly impact fitness (Gregory and Hebert, 1999; Meagher and Vassiliadis, 2005; Gruner et al., 2010). The insertion and accumulation of TEs is therefore expected to be counter-selected and transposition activity suppressed, for example, by TE autoregulation (Simmons and Bucholz, 1985; Lohe and Hartl, 1996; Lohe et al., 1996), protein regulators (Adams et al., 1997), RNA silencing (Sarot et al., 2004; Aravin et al., 2007; Brennecke et al., 2007; Olivieri et al., 2010) and methylation (Verbsky and Richards, 2001; Bird, 2002; Slotkin and Martienssen, 2007). Under stable genomic conditions, transposition activity is therefore probably low.

Genomic stresses, however, can facilitate transposition (Zeh et al., 2009), and polyploidization (which often accompanies hybridization) is one such stress thought to promote the proliferation of TEs (Kashkush et al., 2002; Liu and Wendel, 2003; Shan et al., 2005; Chen and Ni, 2006; Renny-Byfield et al., 2011). The resulting temporary increase of genome size is sometimes counterbalanced by rapid genome ‘downsizing’ (Bennetzen, 2002; Leitch and Bennett, 2004; Skalická et al., 2005; Hawkins et al., 2008; Mun et al., 2009; Eilam et al., 2010; Renny-Byfield et al., 2011). Thus far, the evidence for such downsizing via the loss of repeat types, particularly Ty3/Gypsy retrotransposons, comes mainly from the allotetraploid Nicotiana tabacum (Renny-Byfield et al., 2011). A recent analysis of the repetitive DNA in nine species of Orobanchaceae of different life histories (seven holoparasitic species, one hemiparasitic species and one autotrophic species; Piednoël et al., 2012) also pointed to genome downsizing in the tetraploid species included in the sample. The genomic proportions of repetitive DNA varied greatly among the nine species, ranging from 25 to 60%, with long terminal repeat (LTR) retrotransposons making up most of the repetitive DNA; the tetraploid species differed substantially in Ty3/Gypsy families.

© 2013 The Authors. The Plant Journal © 2013 John Wiley & Sons Ltd

Retrotransposons, a TE class specific to eukaryotes, transpose via an RNA intermediate. Based on structural features and phylogenetic relationships, five orders of retrotransposons have been defined (Wicker et al., 2007): LTR retrotransposons; tyrosine recombinase-encoding retrotransposons (e.g. DIRS1-like elements); Penelope elements; long interspersed elements (LINEs); and short interspersed elements (SINEs). The LTR retrotransposons, related to retroviruses (Xiong and Eickbush, 1990), usually encode two open reading frames (ORFs): one called gag, which encodes a structural protein for virus-like particles, and another called pol, which encodes enzymatic domains involved in the transposition cycle, such as an aspartic protease (AP), a reverse transcriptase (RT), an RNase H (RH) and an integrase (INT). The two major superfamilies of plant LTR retrotransposons are Ty1/Copia and Ty3/Gypsy (see Velasco et al., 2010; table S6), which differ in their pol gene order (Capy et al., 1997; Wicker et al., 2007; Eickbush and Jamburuthugoda, 2008): the RT and RH genes are located upstream of the INT gene in Ty3/Gypsy, but downstream in Ty1/Copia.
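The gene-order rule described above can be written as a small decision function. The following Python sketch is purely illustrative and is not the classification procedure used in this study: the function name `classify_superfamily` and the representation of an element as a 5'-to-3' list of domain abbreviations are assumptions made for the example.

```python
from typing import Sequence


def classify_superfamily(pol_domains: Sequence[str]) -> str:
    """Assign a superfamily from the 5'->3' order of pol domains.

    Rule taken from the text: RT and RH lie upstream of INT in Ty3/Gypsy
    and downstream of INT in Ty1/Copia. Anything else is left unclassified.
    """
    order = [domain.upper() for domain in pol_domains]
    # All three informative domains must be present to apply the rule.
    if not {"RT", "RH", "INT"} <= set(order):
        return "unclassified"
    rt, rh, integrase = order.index("RT"), order.index("RH"), order.index("INT")
    if rt < integrase and rh < integrase:
        return "Ty3/Gypsy"
    if rt > integrase and rh > integrase:
        return "Ty1/Copia"
    return "unclassified"


if __name__ == "__main__":
    print(classify_superfamily(["AP", "RT", "RH", "INT"]))  # Ty3/Gypsy order
    print(classify_superfamily(["AP", "INT", "RT", "RH"]))  # Ty1/Copia order
```

In practice the domain order would come from similarity searches against protein domain profiles; that step is outside the scope of this sketch.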

In the present study, we characterize the TE dynamics in Orobanchaceae at a finer scale than previously achieved in this or any other flowering plant clade. For this purpose, we analyzed the Ty1/Copia and Ty3/Gypsy elements phylogenetically, using seven of the nine species to represent closest relatives and one out-group (Lindenbergia philippensis, Orobanche crenata, Orobanche cumana, Orobanche gracilis, Orobanche pancicii, Phelipanche ramosa and Phelipanche lavandulacea; Figure 1). We asked whether specific elements are responsible for the Ty3/Gypsy diversification, and how the Ty1/Copia and Ty3/Gypsy families responded to the tetraploidization in O. gracilis. Earlier studies have taken a similar approach, but were usually based either on a single species (Domingues et al., 2012) or on a single clade of elements (Gao et al., 2012). Comparative classifications of TEs from closely related species have also been performed on species in which TEs had previously been well characterized (Wicker and Keller, 2007; Estep et al., 2013). Our study is the first to exhaustively sample and classify all highly and moderately repeated Ty1/Copia and Ty3/Gypsy families based on next-generation sequencing data from species in which elements have not been previously well characterized. Among the unexpected results is the wide distribution of the Bianca clade and of SMART-related elements across angiosperms.

RESULTS

Clusters are hypothesized to be TE families, each consisting of related elements (based on all-to-all BLAST and graph-based clustering, see Experimental procedures). Within each cluster, contigs were obtained by read assembly, using an identity threshold of 80% over at least 40 bp. The effect of using a single contig per family was tested with a phylogenetic analysis of the largest Ty1/Copia family from O. pancicii, which we called Opan_CL2 (Figure 2). All individual contigs from the Opan_CL2 family formed a highly supported clade (99% bootstrap support). Given this result, we subsequently included only one representative contig per cluster (family).
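To make the assembly criterion concrete, the Python sketch below filters candidate read overlaps with the two thresholds quoted above (at least 80% identity over at least 40 bp). It is an illustration only and does not reproduce the assembler used in this study: the `Overlap` record, its field names and the read identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_IDENTITY_PCT = 80.0  # identity threshold stated in the text
MIN_OVERLAP_BP = 40      # minimum overlap length stated in the text


@dataclass(frozen=True)
class Overlap:
    """A pairwise read overlap (hypothetical record, for illustration)."""
    read_a: str
    read_b: str
    identity_pct: float
    length_bp: int


def passes_thresholds(overlap: Overlap) -> bool:
    """True if the overlap meets both the identity and the length criterion."""
    return (overlap.identity_pct >= MIN_IDENTITY_PCT
            and overlap.length_bp >= MIN_OVERLAP_BP)


def usable_overlaps(overlaps: Iterable[Overlap]) -> List[Overlap]:
    """Keep only the overlaps that could be used to join reads into contigs."""
    return [o for o in overlaps if passes_thresholds(o)]


if __name__ == "__main__":
    candidates = [
        Overlap("read1", "read2", 92.5, 61),  # kept
        Overlap("read2", "read3", 79.9, 80),  # identity too low
        Overlap("read3", "read4", 95.0, 39),  # overlap too short
    ]
    for o in usable_overlaps(candidates):
        print(o.read_a, o.read_b)
```

Both thresholds are treated as inclusive here, which is one reading of "80% over at least 40 bp"; a real assembler may define identity and overlap length differently.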

Phylogenetic structure of Ty1/Copia elements in Orobanchaceae

Phylogenetic analyses were performed for each of the seven Orobanchaceae species using reference TEs from the Ty1/Copia clades found in plants: Hopscotch, Tos17, SIRE1/Maximus, Tnt1, Angela, Tont1 and Bianca (Wicker and Keller, 2007; Llorens et al., 2009; Hřibová et al., 2010). The tree composed of Ty1/Copia families from O. gracilis is shown in Figure 3 (the other species trees are shown in Figures S1–S6).
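Clade support in these trees is reported as a bootstrap percentage, that is, the share of replicate trees in which a given grouping is recovered. The Python sketch below shows only that counting step. It assumes, for illustration, that each replicate tree has already been reduced to a set of clades, each clade being a frozenset of taxon names; the taxon names are hypothetical, and tree inference itself is not covered.

```python
from typing import FrozenSet, Iterable, Sequence, Set


def bootstrap_support(clade: Iterable[str],
                      replicate_trees: Sequence[Set[FrozenSet[str]]]) -> float:
    """Percentage of replicate trees that contain `clade` as a grouping."""
    if not replicate_trees:
        raise ValueError("at least one replicate tree is required")
    target = frozenset(clade)
    hits = sum(1 for tree in replicate_trees if target in tree)
    return 100.0 * hits / len(replicate_trees)


if __name__ == "__main__":
    # 100 replicates; the grouping of contigs c1-c3 is recovered in 99 of them.
    recovered = {frozenset({"c1", "c2", "c3"})}
    not_recovered = {frozenset({"c1", "c2", "ref"})}
    replicates = [recovered] * 99 + [not_recovered]
    print(bootstrap_support({"c1", "c2", "c3"}, replicates))
```

With 100 replicates, as used for the figures here, the percentage equals the number of replicates that recover the clade.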

For each species, most of the seven Ty1/Copia clades have a bootstrap support ≥80% (Figure 3). Only a few clades, such as Tos17 and Tont1, are weakly supported. Five of the Ty1/Copia clades (SIRE1/Maximus, Tnt1, Angela, Tont1 and Bianca) occur in all seven species. Hopscotch is restricted to Phelipanche species and Tos17 is restricted to P. ramosa, where it makes up 0.01% of the genome. The elements FRretro64 and OSCOPIA2, related to the small LTR retrotransposons (SMARTs), cluster together with the element Victim, either as the sister-group of the Hopscotch clade (e.g. Figure 3) or nested within it (e.g. Figure 2). We accordingly considered FRretro64, OSCOPIA2 and Victim to belong to the Hopscotch clade. The phylogenetic analyses also revealed that one element from P. ramosa and two elements from P. lavandulacea are closely related to the SMART retrotransposons.

Species                    GS (pg)  2n  Ty1/Copia  Ty3/Gypsy  Total repetitive DNA
Orobanche cumana           1.45     38  16.01%     17.02%     45.57%
Orobanche gracilis         2.10     76  18.41%     28.34%     60.13%
Orobanche pancicii         3.24     38  18.82%     24.16%     56.09%
Orobanche crenata          2.84     38  21.42%     21.44%     54.94%
Phelipanche lavandulacea   4.38     24  21.13%     15.16%     43.01%
Phelipanche ramosa         4.34     24  22.83%     15.92%     47.02%
Lindenbergia philippensis  0.46     32  17.21%     1.93%      29.63%

Figure 1. Phylogenetic relationships and key genomic parameters of the seven Orobanchaceae species studied. Genomic proportions and species phylogeny from Piednoël et al. (2012), except for recalculated values for Phelipanche lavandulacea.

In each species, most Ty1/Copia families could be included in the phylogenies because their contigs harbored reliable matches with the RT domain; fewer than 13 families per species could not be included. Most of these families were instead assigned to clades using a BLAST-based classification (see Experimental procedures); <0.2% of the genomes remained unassigned to TE clades (Figure 4). The SIRE1/Maximus families make up the largest proportions of Ty1/Copia in Orobanchaceae (Figure 4), representing between 10.5% of the genome in O. cumana and 15.3% in P. ramosa. TEs from the Angela clade are also abundant, contributing 2.2% of the genome in L. philippensis and 4.9% in O. crenata. Families from the Hopscotch clade make up 2.4 and 3.9% of the P. ramosa and P. lavandulacea genomes, respectively, whereas they were undetectable in three of the four Orobanche species, the exception being O. cumana, in which they made up 0.04% of the genome. The Tnt1, Tont1 and Bianca clades also occurred in low genomic proportions (0.1–0.8%), except for Tont1, which makes up ~3.8% of the O. crenata genome.

To date, Bianca elements have been reported from only a few species. Here we found, however, that L. philippensis contains two Bianca families, O. gracilis and O. cumana each contain four families, and the remaining species each have five Bianca families. To test whether the presence of Bianca elements in Orobanchaceae results from an underestimation of their distribution in angiosperms or from horizontal transfer(s), we performed similarity searches against the nr/nt database from NCBI (http://www.ncbi.nlm.nih.gov). Low E-values were obtained for the Elote1 element from maize and for sequences from Arachis hypogaea, Beta vulgaris, Brassica rapa, Capsella rubella, Citrullus lanatus and Vitis vinifera, as well as from Ipomoea trifida and Solanum lycopersicum. When these high-similarity elements were included in a phylogenetic analysis, they clustered together in a single Bianca clade (Figure 5).

Figure 2. Phylogenetic relationships in the Opan_CL2 family, inferred from neighbour-joining analysis of the reverse transcriptase encoding domain. Contigs from the Opan_CL2 family are indicated in dark red. Statistical support (>70%) comes from parametric bootstrapping using 100 replicates. [Tree labels: Hopscotch, Tos17, SIRE1/Maximus, Tont1, Angela, Tnt1, Bianca, ATCOPIA5, PDR1, SC-3; scale bar 0.2.]

Phylogenetic structure of Ty3/Gypsy elements in Orobanchaceae

Similar to our phylogenetic analyses of the seven known Ty1/Copia clades, we compared Orobanchaceae Ty3/Gypsy elements with the seven known plant clades from this TE superfamily, Tekay, Galadriel, CRM, Reina, Athila, Ogre and Tat (Llorens et al., 2009; Hřibová et al., 2010). Figure 6 shows the resulting tree for O. gracilis (the other species trees are shown in Figures S7–S12). The element clades have bootstrap supports of ≥80%, except for Reina and Tat. Families that could not be included in the phylogenetic analyses (~20–30 in each species) were classified using the BLAST-based approach (see Experimental procedures). Only a few families, comprising less than ~1% of the genomes, could not be assigned. Two families (Ocre_CL223 and Lind_CL46), which turned out to be caulimoviruses instead of Ty3/Gypsy, make up 0.01 and 0.08% of the genomes in which they were found (O. crenata and L. philippensis, respectively). The genomic proportion of Ty3/Gypsy in L. philippensis is thus lower (1.85%) than previously calculated (1.93%; Piednoël et al., 2012).

Three of the seven plant Ty3/Gypsy clades (Tekay, CRM and Athila) are found in all seven species (Figure 7). Tekay is more abundant in Orobanche (>9.7%) than in Phelipanche (6.0–6.5%), and makes up 0.78% of the genome of the out-group L. philippensis. CRM elements make up a lower proportion in Phelipanche (<0.1%) than in Orobanche (from 0.3% in O. crenata up to 0.7% in O. gracilis), and the single CRM family in L. philippensis (Lind_CL124; 0.02%) was only detected using BLAST. Athila families, by contrast, are more abundant in Phelipanche (5.6%) than in Orobanche (<3.5%) and comprise a substantial proportion of the L. philippensis genome (0.43%). Tat appears absent from L. philippensis, but is ubiquitously distributed in the other species, where it makes up 2.8–7.1% of the genomes. The remaining elements are rarer, with Reina and Ogre restricted to O. gracilis (0.05 and 0.08% of the genome, respectively), and Galadriel restricted to O. gracilis (~0.7% of the genome), O. pancicii, O. crenata and L. philippensis (<0.1% of their genomes).

LTR retrotransposon dynamics in the tetraploid species O. gracilis

Figure 3. Phylogenetic relationships in Ty1/Copia elements from Orobanche gracilis, inferred from neighbour-joining analysis of the reverse transcriptase encoding domain. Families from O. gracilis are indicated in dark red. Families that are widely distributed in Orobanche but are lost in O. gracilis are indicated in green. Statistical support (>70%) comes from parametric bootstrapping using 100 replicates. [Tree labels: Hopscotch, Tos17, SIRE1/Maximus, Tnt1, Angela, Tont1, Bianca, ATCOPIA5, ATCOPIA63; scale bar 0.2.]

To better understand the genome modifications that occurred in the tetraploid O. gracilis, we investigated in detail both its species-specific LTR retrotransposons and the TE families it has lost (compared with the remaining Orobanche). None of its species-specific families belong to Ty1/Copia, whereas 11 belong to Ty3/Gypsy. These 11 make up 5.85% of the O. gracilis genome (Figure 6), and comprise seven Tekay families, two Athila families and two Galadriel families. The seven Tekay families by themselves make up 5.12% of the O. gracilis genome. The TE families lost in O. gracilis are two Ty3/Gypsy families (Tekay) and four Ty1/Copia families (one Angela, one SIRE1/Maximus and two Bianca; Figures 3 and 6, and BLAST-based classification).
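The counts and proportions in this paragraph can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the values stated above; the variable names are illustrative, and the two derived shares are simple ratios rather than results reported by the study.

```python
# Species-specific Ty3/Gypsy families in O. gracilis, as counted in the text.
SPECIES_SPECIFIC_FAMILIES = {"Tekay": 7, "Athila": 2, "Galadriel": 2}

# Genomic proportions (% of the O. gracilis genome), as stated in the text.
ALL_SPECIFIC_GENOME_PCT = 5.85
TEKAY_SPECIFIC_GENOME_PCT = 5.12

total_families = sum(SPECIES_SPECIFIC_FAMILIES.values())

# Share of the species-specific families that belong to Tekay.
tekay_family_share = 100 * SPECIES_SPECIFIC_FAMILIES["Tekay"] / total_families

# Share of the species-specific genomic fraction contributed by Tekay.
tekay_genome_share = 100 * TEKAY_SPECIFIC_GENOME_PCT / ALL_SPECIFIC_GENOME_PCT

if __name__ == "__main__":
    print(f"species-specific families: {total_families}")
    print(f"Tekay share of families: {tekay_family_share:.1f}%")
    print(f"Tekay share of species-specific genome fraction: {tekay_genome_share:.1f}%")
```

Under these figures, Tekay accounts for 7 of the 11 species-specific families and for most of the genomic fraction they occupy.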

DISCUSSION

This study is a fine-scale analysis of the Ty1/Copia and Ty3/Gypsy LTR retrotransposons in closely related species of Orobanchaceae, including a young tetraploid and a phylogenetically more distant species as an out-group. Most of the TE families belong to clades commonly found in plants (Hřibová et al., 2010; Staton et al., 2012). Two families (Ocre_CL223 and Lind_CL46), which we previously classified as Ty3/Gypsy (Piednoël et al., 2012), are in fact caulimoviruses and make up 0.01 and 0.08% of the genomes in which they were found (O. crenata and L. philippensis, respectively). Before this study, Bianca elements had been reported from only a few species, including Arabidopsis thaliana, Lotus japonicus, Medicago truncatula, Oryza sativa (rice) and Triticeae (Schulman and Kalendar, 2005; Holligan et al., 2006; Wicker and Keller, 2007; Wang and Liu, 2008), which are distantly related to Orobanchaceae. It is now clear, however, that the Bianca clade of Ty1/Copia, which also comprises the Elote1 element from Zea mays (maize), is more widely distributed across angiosperms, occurring in Brassicales (Arabidopsis thaliana, Brassica rapa and Capsella rubella), Caryophyllales (Beta vulgaris), Cucurbitales (Citrullus lanatus), Fabales (Arachis hypogaea, Lotus japonicus and Medicago truncatula), Lamiales (Orobanchaceae spp.), Poales (O. sativa and Triticeae spp.), Solanales (Ipomoea trifida and Solanum lycopersicum) and Vitales (Vitis vinifera). This suggests that Bianca originated early during angiosperm evolution, which fits the hypothesis that Bianca may be the most ancient Ty1/Copia clade in angiosperms (Wicker and Keller, 2007).

Considering the wide distribution of Bianca and the poorly resolved phylogenetic relationships between elements from Orobanchaceae and the other species (Figure 5), we hypothesize that the Bianca clade is vertically inherited in the Orobanchaceae family. No uncorrupted full-length Bianca elements from Orobanchaceae are known, and these elements are therefore probably no longer active, even though the Elote1 element transposed ‘recently’ in inbred maize (Wang and Dooner, 2006). In Musa acuminata (banana; Hřibová et al., 2010), Saccharum officinarum (sugar cane; Domingues et al., 2012) and Glycine max (soybean; Du et al., 2010), Bianca has not been detected, possibly because of low family number, judging from its low diversity in other angiosperms where it has been detected (Orobanchaceae, this study; Triticeae, rice and Arabidopsis, Wicker and Keller, 2007). Likewise, our study reveals the first elements (Plav_CL114 and Plav_180, Figure S4; Pram_CL107, Figure S5) related to SMARTs outside monocotyledons (Gao et al., 2012). As P. lavandulacea and P. ramosa only parasitize dicotyledon species (http://www.farmalierganes.com/Flora/Angiospermae/Orobanchaceae/Host_Orobanchaceae_Checklist.htm, accessed February 2013), the presence of SMART-related elements in these two species probably reflects a wider, previously underestimated distribution of SMART elements among angiosperms rather than horizontal transfer from a monocotyledon host.

Figure 4. Ty1/Copia clade distribution (%) among the seven Orobanchaceae species. NC: not classified. [Bar chart: genomic proportion (0–18%) per clade for O. cumana, O. gracilis, O. pancicii, O. crenata, P. ramosa, P. lavandulacea and L. philippensis.]

Our LTR retrotransposon classification shows that several Ty1/Copia and Ty3/Gypsy clades (SIRE1/Maximus, Tnt1, Angela, Tont1, Bianca, Tekay, CRM, Athila and Tat) are widely distributed in Orobanchaceae (Figures 4 and 7), and may have been present in the family’s most recent common ancestor. We previously found that Orobanche and Phelipanche are characterized by different TE dynamics (Piednoël et al., 2012). The present analysis further illustrates this. For example, Hopscotch is restricted to Phelipanche, whereas Tekay is overabundant in Orobanche. In addition, there are species-specific features. For example, the L. philippensis genome has a high proportion of Ty1/Copia (17.2%), but a very low proportion of Ty3/Gypsy (1.9%), and O. crenata is enriched in Tont1 elements (3.8%), compared with all other species (<0.8%). The two closely related Phelipanche resemble each other in their Ty1/Copia element proportions (21.1% in P. lavandulacea; 22.8% in P. ramosa), but diverge in their Ty1/Copia composition, with P. ramosa enriched for the SIRE1/Maximus elements and P. lavandulacea enriched for Hopscotch. This highlights the need to study TE dynamics at both large and fine scales, and in a comparative context. TE transposition can be activated by stress (Melayah et al., 2001; Fablet and Vieira, 2011) or the colonization of new environments (Vieira et al., 2002), and it has been suggested that the TE repertoire of a gene pool could promote, or be associated with, the emergence of evolutionarily separate lines (Oliver and Greene, 2009, 2011; Jurka et al., 2011).

Figure 5. Phylogenetic relationships in the Bianca clade, inferred from neighbour-joining analysis of the reverse transcriptase encoding domain. Families from Orobanche are indicated in dark red, families from Phelipanche are indicated in red, families from Lindenbergia philippensis are indicated in blue and the Bianca element is indicated in pink. Additional sequences are indicated in green. Statistical support (>70%) comes from parametric bootstrapping using 100 replicates. [Tree labels: Hopscotch, Tos17, SIRE1/Maximus, Tont1, Angela, Tnt1, Bianca, ATCOPIA3; scale bar 0.1.]

The Ty3/Gypsy genome proportions are higher in Orobanche than in Phelipanche (Figure 1), which we have attributed to diversification rather than a burst of transposition (Piednoël et al., 2012). The present results fit that hypothesis. Firstly, Orobanche genomes comprise more diverse element clades than Phelipanche (Galadriel, Reina and Ogre were only detected in Orobanche). Three Galadriel families are present in low genomic proportions in the out-group L. philippensis (Lind_CL123, Lind_CL135 and Lind_CL137), and thus perhaps this clade as well as Reina and Ogre were lost in the common ancestor of Phelipanche. Secondly, Tat, Tekay and CRM are more abundant in Orobanche than in Phelipanche, with the Tekay clade (and to a lesser extent the Tat families) making up most of the Ty3/Gypsy enrichment in Orobanche (Figure 7). This enrichment is accompanied by an increase of the CRM and Tekay family number in Orobanche (4–9 CRM and 25–33 Tekay families), compared with Phelipanche (1–2 CRM and 19–22 Tekay families).
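The genus-level contrast in Ty3/Gypsy content can be summarized from the per-species proportions listed in Figure 1. The Python sketch below simply averages those published values per genus; the helper name `genus_mean` is an assumption made for the example, and an unweighted mean over four and two species is a rough summary, not a statistical test.

```python
from statistics import mean

# Ty3/Gypsy genomic proportions (%) per species, as listed in Figure 1.
TY3_GYPSY_PCT = {
    "Orobanche cumana": 17.02,
    "Orobanche gracilis": 28.34,
    "Orobanche pancicii": 24.16,
    "Orobanche crenata": 21.44,
    "Phelipanche lavandulacea": 15.16,
    "Phelipanche ramosa": 15.92,
}


def genus_mean(genus: str) -> float:
    """Unweighted mean Ty3/Gypsy proportion (%) over the species of a genus."""
    values = [pct for species, pct in TY3_GYPSY_PCT.items()
              if species.split()[0] == genus]
    return round(mean(values), 2)


if __name__ == "__main__":
    for genus in ("Orobanche", "Phelipanche"):
        print(genus, genus_mean(genus))
```

On these values the Orobanche mean is about 7 percentage points above the Phelipanche mean, with the tetraploid O. gracilis contributing the highest single value.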

The Tekay elements, as well as the CRM, Galadriel and Reina elements, are chromoviruses (Gorinsek et al., 2004; Llorens et al., 2009). Chromoviruses are the earliest-diverging branch of Ty3/Gypsy, and are found in plants, fungi and animals (Kordis, 2005). They have a high genomic turnover (Gorinsek et al., 2004), which may result from their ‘strategy’ to escape repression and elimination mechanisms (Kordis, 2005; Baucom et al., 2009; Novikov et al., 2012). Chromoviruses differ from other Ty3/Gypsy elements in harboring a chromodomain at their 3′ end, which is a structural domain commonly found in proteins associated with the remodeling and manipulation of chromatin (Gorinsek et al., 2004). Chromodomains are highly constrained (Novikov et al., 2012), and may promote the integration of TEs in heterochromatin regions (Gao et al., 2008; but see Novikov et al., 2012 for chromoviruses in euchromatin). In accordance with this, CRM families appear to be centromere-specific (Luo et al., 2012). The high genomic turnover and site-specific integration of chromoviruses probably both contribute to their survival and abundance, especially in plants where they often attain large numbers of young copies (Kordis, 2005). It is therefore not surprising that chromoviruses make up a high proportion of the Orobanchaceae Ty3/Gypsy. There is, however, an exception to this global pattern, with the Athila element enrichment in Phelipanche. Once again, this underlines the need to study TE dynamics at both large and fine scales.

Figure 6. Phylogenetic relationships in Ty3/Gypsy elements from Orobanche gracilis, inferred from neighbour-joining analysis of the reverse transcriptase encoding domain. Families from O. gracilis are indicated in dark red, and species-specific families from this species are labeled with an ‘SPE’ tag in their name. Families that are widely distributed in Orobanche, but are lost in O. gracilis, are indicated in green. Statistical support (>70%) comes from parametric bootstrapping using 100 replicates. [Tree image not reproduced; clade labels recoverable from the figure: Galadriel, CRM, Reina, Athila, Ogre, Tekay and Tat; scale bar: 0.1.]

We previously showed that O. gracilis (the sequenced plant was tetraploid, with 2n = 76) has a particular TE composition, probably related to its tetraploidization and subsequent genome downsizing (Piednoël et al., 2012). Although O. gracilis has one of the smallest genomes of the Orobanchaceae studied, it has the highest proportion of TEs, especially of Ty3/Gypsy retrotransposons (Figure 1). The present LTR retrotransposon classification allows a better understanding of the underlying dynamics. As is characteristic of Orobanche, Tekay elements are enriched in O. gracilis: Tekay families make up 17.82% of its genome, and the seven Tekay families unique to O. gracilis account for 5.12% of its genome. These seven families represent almost one-third of the O. gracilis species-specific families, and 75% of the Ty3/Gypsy genomic enrichment compared with O. crenata and O. pancicii. Previous studies have shown that polyploidy can be associated with selective amplification of repetitive DNA (see Parisod et al., 2010 for a review). The O. gracilis polyploidization could thus have promoted the proliferation of specific Tekay families.
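As a rough illustration of the Tekay proportions quoted above, the following sketch converts them into approximate amounts of DNA. It assumes the O. gracilis genome size of 2.05 Gb given in the Experimental Procedures; the variable and function names are ours, and the megabase values are back-of-the-envelope figures, not results reported in the study.

```python
# Back-of-the-envelope conversion of genome proportions into megabases.
# Assumption: O. gracilis genome size of 2.05 Gb (from Experimental Procedures).
GENOME_MB = 2050.0
TEKAY_PCT = 17.82           # all Tekay families, % of genome
TEKAY_SPECIFIC_PCT = 5.12   # the seven species-specific Tekay families


def pct_to_mb(pct, genome_mb=GENOME_MB):
    """Convert a genome proportion (in %) into megabases."""
    return genome_mb * pct / 100.0


tekay_mb = pct_to_mb(TEKAY_PCT)
specific_mb = pct_to_mb(TEKAY_SPECIFIC_PCT)
# Fraction of the Tekay DNA that belongs to species-specific families.
specific_share = TEKAY_SPECIFIC_PCT / TEKAY_PCT

print(f"Tekay total:            ~{tekay_mb:.0f} Mb")
print(f"Species-specific Tekay: ~{specific_mb:.0f} Mb")
print(f"Share of Tekay DNA:     ~{100 * specific_share:.0f}%")
```

Under these assumptions, a little over a quarter of the Tekay DNA in O. gracilis would belong to the seven species-specific families.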

Polyploidy may have been only one of the factors activating the Tekay elements in O. gracilis. In maize, only one of the LTR retrotransposon amplification bursts was initiated by polyploidy, whereas the other element activations were not (Estep et al., 2013). Additionally, the chromodomain of Tekay chromoviruses could have helped their persistence and thus accumulation (as described above). Like the CRM families, Tekay elements may be preferentially located in centromeric and pericentromeric regions of plant chromosomes (Theuri et al., 2005; Domingues et al., 2012). Interestingly, the tetraploid O. gracilis is the only species in our sample that harbors all four chromovirus clades (Tekay, Galadriel, CRM and Reina), with Galadriel elements being especially diverse and abundant (partially via the presence of two specific families; Figures 6 and 7).

The Ty3/Gypsy enrichment in O. gracilis might be related to the elimination of genes, especially redundant ones, as suggested for the triplicated Brassica rapa genome compared with the diploid A. thaliana (Mun et al., 2009). This could be the case because Ty3/Gypsy elements (especially chromoviruses) are preferentially located in heterochromatin, in contrast to genes, which are preferentially located in euchromatin. However, this differential location cannot be the only explanation of the TE enrichment, because genome downsizing in O. gracilis also led to the loss of TE families from various LTR retrotransposon clades (Figures 3 and 6), including the presumably heterochromatin-located Tekay elements. This loss of entire TE families matches results from the allopolyploid Nicotiana tabacum, in which common repeats from the parental species are lost in the polyploid descendant (Renny-Byfield et al., 2011).

Figure 7. Ty3/Gypsy clade distribution (%) among the seven Orobanchaceae species. NC: not classified. [Bar chart not reproduced; y-axis: genomic proportion (0.0–18.0%); x-axis: Ty3/Gypsy clades; series: O. cumana, O. gracilis, O. pancicii, O. crenata, P. ramosa, P. lavandulacea and L. philippensis.]

In conclusion, we classified the Ty1/Copia and Ty3/Gypsy LTR retrotransposons from a rapidly speciating clade of Orobanchaceae, as indicated by the numerous morphologically poorly separated species that differ mainly in flower color and host preferences (Carlón et al., 2008; Pusch and Günther, 2009). Most of the identified LTR retrotransposon clades appear to have been inherited from the most recent common ancestor of Orobanchaceae. Orobanche has the greatest Ty3/Gypsy element diversification, perhaps because chromoviruses (Tekay, Galadriel, CRM and Reina) targeted heterochromatin regions via their chromodomains, and thus were able to persist longer. The tetraploidization event in O. gracilis appears to have promoted the proliferation of seven species-specific Tekay families in this species. In contrast to the Ty3/Gypsy elements, the Ty1/Copia repertoire appears more homogeneous among Orobanchaceae species, although there are striking species-specific TE dynamics. Finally, this study revealed that the Bianca clade of Ty1/Copia elements and SMART-related elements are more widely distributed in angiosperms than previously known.

EXPERIMENTAL PROCEDURES

Plant material

Lindenbergia philippensis (Cham. and Schltd.) Benth. (2n = 32) is a fully autotrophic species from Bangladesh, India, Burma, Thailand, Cambodia, Laos, Vietnam, tropical China and the Philippines. Orobanche cumana Wallr. (2n = 38) is distributed from the Mediterranean region to Central Asia, and its main hosts belong to Asteraceae. Orobanche gracilis Beck (2n = 76) is found in the Mediterranean, and northwards to southern Central Europe; its hosts are exclusively shrubby Fabaceae. Orobanche pancicii Beck (2n = 38) is distributed from the Balkan Peninsula, northwards to the Eastern Alps, and its hosts are species of Knautia and Scabiosa. Orobanche crenata Forssk. (2n = 38) is found in the Mediterranean region and the Near East, and its hosts are legumes, mainly annual crop species. Phelipanche lavandulacea Pomel (2n = 24) is a Mediterranean species, and its sole host is Bituminaria bituminosa, a perennial Fabaceae. Finally, Phelipanche ramosa (L.) Pomel (2n = 24) has been introduced worldwide, but its native distribution is the Mediterranean region and the Near East. Its hosts are a broad range of annual species. Chromosome numbers of the individuals studied were reported in Schneeweiss et al. (2004) and Piednoël et al. (2012).

Sequencing data and assembly

The 454 pyrosequencing reads were obtained from the Sequence Read Archive (accession no. SRA047928). Filtering for plastid contaminants resulted in 76–555 Mb of DNA sequence for each species. This amounts to ~23% of the O. cumana genome (1.42 Gb), ~20% of the O. gracilis genome (2.05 Gb), ~20% of the O. crenata genome (2.78 Gb), ~16% of the O. pancicii genome (3.17 Gb), ~12% of the P. ramosa genome (4.25 Gb) and ~11% of the P. lavandulacea genome (4.29 Gb). The corresponding genome sizes were reported in Weiss-Schneeweiss et al. (2006) and Piednoël et al. (2012).
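The relationship between genome size, sequenced fraction and amount of sequence can be checked with a short calculation. The sketch below uses only the rounded genome sizes and percentages quoted above; the dictionary and function names are ours. L. philippensis is omitted because its genome size is not stated in this passage.

```python
# Genome size (Gb) and approximate sequenced fraction (%) as quoted in the text.
SAMPLES = {
    "O. cumana": (1.42, 23),
    "O. gracilis": (2.05, 20),
    "O. crenata": (2.78, 20),
    "O. pancicii": (3.17, 16),
    "P. ramosa": (4.25, 12),
    "P. lavandulacea": (4.29, 11),
}


def sequenced_mb(genome_gb, fraction_pct):
    """Approximate amount of sequence (Mb) implied by a genome fraction."""
    return genome_gb * 1000.0 * fraction_pct / 100.0


estimates = {sp: sequenced_mb(gb, pct) for sp, (gb, pct) in SAMPLES.items()}

# Print species from the smallest to the largest estimated data set.
for sp, mb in sorted(estimates.items(), key=lambda kv: kv[1]):
    print(f"{sp:16s} ~{mb:4.0f} Mb")
```

Because the percentages are rounded, the estimates are approximate; the largest value (O. crenata, about 556 Mb) is consistent with the upper end of the 76–555 Mb range given above.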

The reads were processed as described in Piednoël et al. (2012). Briefly, they were assembled using a graph-based clustering approach (Novák et al., 2010), in which vertices correspond to sequence reads, and overlapping reads are connected by edges whose weights correspond to their similarity scores. Clusters of frequently connected nodes represent groups of similar sequences (here considered as families of genomic repeats). The number of reads in each family is proportional to its genomic abundance. Within each family, the reads were assembled into contigs, representing chimeric consensus sequences, using TIGR Gene Indices clustering tools (Pertea et al., 2003), with the -O '-p80 -o40' parameters, specifying the overlap percentage identity and minimal overlap length cut-offs for the cap3 assembler.
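The idea of treating reads as vertices and overlaps as weighted edges can be illustrated with a toy example. The sketch below is not the clustering procedure of Novák et al. (2010); as a deliberate simplification, it defines a family as a connected component of the graph after discarding edges below a similarity threshold. The function name, the threshold and the toy reads are all hypothetical.

```python
from collections import defaultdict


def cluster_reads(edges, min_score=0.0):
    """Group reads into connected components of the overlap graph.

    edges: iterable of (read_a, read_b, similarity_score) tuples.
    Edges scoring below min_score are ignored (simplifying assumption).
    Returns a list of sets of read names, largest cluster first.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in edges:
        ra, rb = find(a), find(b)  # also registers both reads as vertices
        if score >= min_score and ra != rb:
            parent[ra] = rb

    clusters = defaultdict(set)
    for read in list(parent):
        clusters[find(read)].add(read)
    return sorted(clusters.values(), key=len, reverse=True)


# Toy overlap graph: two groups joined only by one weak overlap.
toy_edges = [
    ("r1", "r2", 95.0), ("r2", "r3", 92.0),
    ("r4", "r5", 90.0),
    ("r3", "r4", 60.0),  # below the threshold, so it does not merge the groups
]
families = cluster_reads(toy_edges, min_score=80.0)
total_reads = sum(len(f) for f in families)
for i, fam in enumerate(families, 1):
    share = 100 * len(fam) / total_reads
    print(f"family {i}: {sorted(fam)} ({share:.0f}% of clustered reads)")
```

The share of reads per family in the last loop mirrors the statement that the number of reads in a family is proportional to its genomic abundance.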

Family classification

For both Ty1/Copia and Ty3/Gypsy, we reconstructed phylogenetic trees including several previously classified TEs, representative of all LTR retrotransposon clades described from plants. Reference sequences were selected from the matrices used in Hribová et al. (2010), plus the ATCopia28 and OSCOPIA2 elements from Arabidopsis thaliana and rice, all deposited in Repbase (Jurka et al., 2005). Some other elements not represented in Hribová et al. (2010) were also added: (i) Bianca, Eninu, Opie and Victim from the maize TE database (http://maizetedb.org/~maize); (ii) Giepum and Ji from the Retrotransposon database (http://data.genomics.purdue.edu/~pmiguel/projects/retros); and (iii) Ale (HE774675), Araco (AC079131:14472-19329), FRetro64 (JN806224), Ivana (EF067844:429582-434664) and Kielia (EU195798) from GenBank. For each Orobanchaceae LTR retrotransposon family, one contig covering the RT domain was then included in the phylogenetic analyses as a representative of its entire family. The representative contigs were selected as the most conserved considering their similarity scores with known elements obtained using RPS-BLAST (reversed position-specific BLAST; Altschul et al., 1997) and three alignment profiles: pfam07727, pfam00078 and cd01650.

The RT domains of the reference sequences and representative contigs were extracted using a custom Python script and the RT boundaries provided by the results of RPS-BLAST. All RT domains were then translated using Traduit (http://www.snv.jussieu.fr/~wensgen/soft/doc/index.html), and the corresponding sequences of amino acids were aligned using MAFFT (Katoh et al., 2009). Alignments were manually curated, and ambiguously aligned sites were filtered out using BMGE (Criscuolo and Gribaldo, 2010). Phylogenetic analyses were carried out using the neighbour-joining method, 100 bootstrap replicates and the pairwise deletion of gaps option included in MEGA 5.0 (Tamura et al., 2011). The best-fitting model, JTT + G (Jones et al., 1992), was selected using TOPALI 2 (Milne et al., 2009).
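The custom extraction script is not shown in the article. As a minimal sketch of what the extraction and translation steps might look like, the code below cuts a domain out of a nucleotide sequence and translates it with the standard genetic code. It assumes 1-based inclusive coordinates, with start > end denoting the reverse strand; these conventions, the function names and the toy sequences are ours, and the plain translation function stands in for Traduit rather than reproducing it.

```python
# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_domain(seq, start, end):
    """Return the subsequence between 1-based inclusive coordinates.

    Assumption: start > end means the hit lies on the reverse strand,
    in which case the reverse complement is returned.
    """
    if start <= end:
        return seq[start - 1:end]
    return seq[end - 1:start].translate(COMPLEMENT)[::-1]


def translate(nt):
    """Translate in frame 1; incomplete or ambiguous codons become 'X' or are dropped."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


# Toy contigs (hypothetical): the same six-base domain on each strand.
forward = translate(extract_domain("CCATGGCCTT", 3, 8))
reverse = translate(extract_domain("AAGGCCATGG", 8, 3))
print(forward, reverse)
```

In practice the boundaries would come from the RPS-BLAST output, and the reading frame would need to be taken from the hit as well; the sketch assumes the domain starts in frame.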

Some families without any reliable hit on the RT domain could not be included in the phylogenies. We thus classified them considering their BLASTX results on the Gypsy database (Llorens et al., 2011). A family was assigned to a particular clade using two criteria: (i) best hits obtained on a unique clade, and (ii) an E-value difference between these hits and the best hits obtained on other clades of at least 1E−5.
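The two criteria can be sketched as a small decision function. The code below is one possible reading, not the authors' implementation: it takes the E-value difference literally as a subtraction, whereas a fold-change interpretation is also conceivable. The function name, the input format and the example hits are hypothetical.

```python
def assign_clade(hits, min_evalue_gap=1e-5):
    """Assign a family to a clade from its BLASTX hits, or return None.

    hits: list of (clade_name, evalue) tuples.
    Assumption: criterion (ii) is read as
    best_other_clade_evalue - best_clade_evalue >= min_evalue_gap.
    """
    if not hits:
        return None

    # Keep the best (lowest) E-value observed for each clade.
    best_per_clade = {}
    for clade, evalue in hits:
        if clade not in best_per_clade or evalue < best_per_clade[clade]:
            best_per_clade[clade] = evalue

    ranked = sorted(best_per_clade.items(), key=lambda kv: kv[1])
    top_clade, top_evalue = ranked[0]
    if len(ranked) == 1:
        return top_clade  # all hits fall in a single clade

    runner_up_evalue = ranked[1][1]
    if runner_up_evalue - top_evalue >= min_evalue_gap:
        return top_clade
    return None  # ambiguous: another clade scores almost as well


clear_case = assign_clade([("Tekay", 1e-40), ("Tekay", 1e-30), ("CRM", 1e-3)])
ambiguous_case = assign_clade([("Tekay", 1e-40), ("CRM", 1e-38)])
print(clear_case, ambiguous_case)
```

Under the subtraction reading, the rule effectively requires that no other clade produces a hit with an E-value below about 1E−5, which makes it a conservative filter.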

Bianca clade distribution

To determine the Bianca distribution among angiosperms, similarity searches were performed using the ATCopia28 element from Arabidopsis thaliana and elements from Orobanchaceae species against the nr/nt database. For this purpose, both BLASTN and BLASTX were used. Several candidates from Arachis hypogaea (HQ637177), Beta vulgaris (GU057342), Brassica rapa (AC232487), Capsella rubella (DQ103594), Citrullus lanatus (JX027061), Ipomoea trifida (AH013750), Lotus japonicus (AP009656), Medicago truncatula (AC161750), Oryza sativa japonica group (AC018929), Solanum lycopersicum (AF275345), Vitis vinifera (AM477556) and Zea mays (Elote1: DQ493648) were selected and included in a specific phylogenetic analysis.

ACKNOWLEDGEMENTS

This work was supported by the German Science Foundation (RE603/9-1 and -2). We thank Jiri Macas from the Institute of Plant Molecular Biology in Budweis for the RepeatExplorer pipeline.


SUPPORTING INFORMATION

Additional Supporting Information may be found in the online version of this article.

Figure S1. Phylogenetic relationships in Ty1/Copia elements from Orobanche crenata inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S2. Phylogenetic relationships in Ty1/Copia elements from Orobanche pancicii inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S3. Phylogenetic relationships in Ty1/Copia elements from Orobanche cumana inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S4. Phylogenetic relationships in Ty1/Copia elements from Phelipanche lavandulacea inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S5. Phylogenetic relationships in Ty1/Copia elements from Phelipanche ramosa inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S6. Phylogenetic relationships in Ty1/Copia elements from Lindenbergia philippensis inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S7. Phylogenetic relationships in Ty3/Gypsy elements from Orobanche crenata inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S8. Phylogenetic relationships in Ty3/Gypsy elements from Orobanche pancicii inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S9. Phylogenetic relationships in Ty3/Gypsy elements from Orobanche cumana inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S10. Phylogenetic relationships in Ty3/Gypsy elements from Phelipanche lavandulacea inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S11. Phylogenetic relationships in Ty3/Gypsy elements from Phelipanche ramosa inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

Figure S12. Phylogenetic relationships in Ty3/Gypsy elements from Lindenbergia philippensis inferred from neighbor-joining analysis of the reverse transcriptase encoding domain.

REFERENCES

Adams, M.D., Tarng, R.S. and Rio, D.C. (1997) The alternative splicing factor PSI regulates P-element third intron splicing in vivo. Genes Dev. 11, 129–138.

Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.

Aravin, A.A., Hannon, G.J. and Brennecke, J. (2007) The Piwi-piRNA pathway provides an adaptive defense in the transposon arms race. Science, 318, 761–764.

Baucom, R.S., Estill, J.C., Chaparro, C., Upshaw, N., Jogi, A., Deragon, J.-M., Westerman, R.P., Sanmiguel, P.J. and Bennetzen, J.L. (2009) Exceptional diversity, non-random distribution, and rapid evolution of retroelements in the B73 maize genome. PLoS Genet. 5, e1000732.

Bennetzen, J.L. (2002) Mechanisms and rates of genome expansion and contraction in flowering plants. Genetica, 115, 29–36.

Bird, A. (2002) DNA methylation patterns and epigenetic memory. Genes Dev. 16, 6–21.

Brennecke, J., Aravin, A.A., Stark, A., Dus, M., Kellis, M., Sachidanandam, R. and Hannon, G.J. (2007) Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila. Cell, 128, 1089–1103.

Capy, P., Langin, T., Higuet, D., Maurer, P. and Bazin, C. (1997) Do the integrases of LTR-retrotransposons and class II element transposases have a common ancestor? Genetica, 100, 63–72.

Carlón, L., Gómez Casares, G., Laínz, M., Moreno Moral, G., Sánchez Pedraja, Ó. and Schneeweiss, G.M. (2008) Más, a propósito de algunas Phelipanche Pomel, Boulardia F. W. Schultz y Orobanche L. (Orobanchaceae) del oeste del Paleártico. Docum. Jard. Bot. Atlántico, 6, 1–128.

Chen, Z.J. and Ni, Z. (2006) Mechanisms of genomic rearrangements and gene expression changes in plant polyploids. BioEssays, 28, 240–252.

Criscuolo, A. and Gribaldo, S. (2010) BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10, 210.

Domingues, D.S., Cruz, G.M.Q., Metcalfe, C.J., Nogueira, F.T.S., Vicentini, R., de S. Alves, C. and Van Sluys, M.-A. (2012) Analysis of plant LTR-retrotransposons at the fine-scale family level reveals individual molecular patterns. BMC Genomics, 13, 137.

Du, J., Tian, Z., Hans, C.S., Laten, H.M., Cannon, S.B., Jackson, S.A., Shoemaker, R.C. and Ma, J. (2010) Evolutionary conservation, diversity and specificity of LTR-retrotransposons in flowering plants: insights from genome-wide analysis and multi-specific comparison. Plant J. 63, 584–598.

Eickbush, T.H. and Jamburuthugoda, V.K. (2008) The diversity of retrotransposons and the properties of their reverse transcriptases. Virus Res. 134, 221–234.

Eilam, T., Anikster, Y., Millet, E., Manisterski, J. and Feldman, M. (2010) Genome Size in Diploids, Allopolyploids, and Autopolyploids of Mediterranean Triticeae. J. Bot. 2010, 1–12.

Estep, M.C., DeBarry, J.D. and Bennetzen, J.L. (2013) The dynamics of LTR retrotransposon accumulation across 25 million years of panicoid grass evolution. Heredity, 110, 194–204.

Fablet, M. and Vieira, C. (2011) Evolvability, epigenetics and transposable elements. Biomol. Concepts, 2, 333–341.

Gao, X., Hou, Y., Ebina, H., Levin, H.L. and Voytas, D.F. (2008) Chromodomains direct integration of retrotransposons to heterochromatin. Genome Res. 18, 359–369.

Gao, D., Chen, J., Chen, M., Meyers, B.C. and Jackson, S. (2012) A highly conserved, small LTR retrotransposon that preferentially targets genes in grass genomes. PLoS ONE, 7, e32010.

Gorinsek, B., Gubensek, F. and Kordis, D. (2004) Evolutionary genomics of chromoviruses in eukaryotes. Mol. Biol. Evol. 21, 781–798.

Gregory, T.R. and Hebert, P.D. (1999) The modulation of DNA content: proximate causes and ultimate consequences. Genome Res. 9, 317–324.

Gruner, A., Hoverter, N., Smith, T. and Knight, C.A. (2010) Genome Size Is a Strong Predictor of Root Meristem Growth Rate. J. Bot. 2010, 1–4.

Hawkins, J.S., Grover, C.E. and Wendel, J.F. (2008) Repeated big bangs and the expanding universe: directionality in plant genome size evolution. Plant Sci. 174, 557–562.

Holligan, D., Zhang, X., Jiang, N., Pritham, E.J. and Wessler, S.R. (2006) The transposable element landscape of the model legume Lotus japonicus. Genetics, 174, 2215–2228.

Hribová, E., Neumann, P., Matsumoto, T., Roux, N., Macas, J. and Dolezel, J. (2010) Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing. BMC Plant Biol. 10, 204.

Jones, D.T., Taylor, W.R. and Thornton, J.M. (1992) The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8, 275–282.

Jurka, J., Kapitonov, V.V., Pavlicek, A., Klonowski, P., Kohany, O. and Walichiewicz, J. (2005) Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110, 462–467.

Jurka, J., Bao, W. and Kojima, K.K. (2011) Families of transposable elements, population structure and the origin of species. Biol. Direct. 6, 44.

Kashkush, K., Feldman, M. and Levy, A.A. (2002) Gene loss, silencing and activation in a newly synthesized wheat allotetraploid. Genetics, 160, 1651–1659.

Katoh, K., Asimenos, G. and Toh, H. (2009) Multiple alignment of DNA sequences with MAFFT. Methods Mol. Biol. 537, 39–64.

Kordis, D. (2005) A genomic perspective on the chromodomain-containing retrotransposons: Chromoviruses. Gene, 347, 161–173.

Leitch, I.J. (2007) Genome sizes through the ages. Heredity (Edinb), 99, 121–122.

Leitch, I.J. and Bennett, M.D. (2004) Genome downsizing in polyploid plants. Biol. J. Linn. Soc. 82, 651–663.


Liu, B. and Wendel, J.F. (2003) Epigenetic phenomena and the evolution of plant allopolyploids. Mol. Phylogenet. Evol. 29, 365–379.

Llorens, C., Muñoz-Pomer, A., Bernad, L., Botella, H. and Moya, A. (2009) Network dynamics of eukaryotic LTR retroelements beyond phylogenetic trees. Biol. Direct. 4, 41.

Llorens, C., Futami, R., Covelli, L., et al. (2011) The Gypsy Database (GyDB) of mobile genetic elements: release 2.0. Nucleic Acids Res. 39, D70–D74.

Lohe, A.R. and Hartl, D.L. (1996) Autoregulation of mariner transposase activity by overproduction and dominant-negative complementation. Mol. Biol. Evol. 13, 549–555.

Lohe, A.R., Sullivan, D.T. and Hartl, D.L. (1996) Subunit interactions in the mariner transposase. Genetics, 144, 1087–1095.

Luo, S., Mach, J., Abramson, B., Ramirez, R., Schurr, R., Barone, P., Copenhaver, G. and Folkerts, O. (2012) The cotton centromere contains a Ty3-gypsy-like LTR retroelement. PLoS ONE, 7, e35261.

Meagher, T.R. and Vassiliadis, C. (2005) Phenotypic impacts of repetitive DNA in flowering plants. New Phytol. 168, 71–80.

Melayah, D., Bonnivard, E., Chalhoub, B., Audeon, C. and Grandbastien, M.A. (2001) The mobility of the tobacco Tnt1 retrotransposon correlates with its transcriptional activation by fungal factors. Plant J. 28, 159–168.

Milne, I., Lindner, D., Bayer, M., Husmeier, D., McGuire, G., Marshall, D.F. and Wright, F. (2009) TOPALi v2: a rich graphical interface for evolutionary analyses of multiple alignments on HPC clusters and multi-core desktops. Bioinformatics, 25, 126–127.

Mun, J.H., Kwon, S.J., Yang, T.J., et al. (2009) Genome-wide comparative analysis of the Brassica rapa gene space reveals genome shrinkage and differential loss of duplicated genes after whole genome triplication. Genome Biol. 10, R111.

Novák, P., Neumann, P. and Macas, J. (2010) Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data. BMC Bioinformatics, 11, 378.

Novikov, A., Smyshlyaev, G. and Novikova, O. (2012) Evolutionary history of LTR retrotransposon chromodomains in plants. Int. J. Plant Genomics, 2012, 874743.

Oliver, K.R. and Greene, W.K. (2009) Transposable elements: powerful facilitators of evolution. BioEssays, 31, 703–714.

Oliver, K.R. and Greene, W.K. (2011) Mobile DNA and the TE-thrust hypothesis: supporting evidence from the primates. Mob. DNA, 2, 8.

Olivieri, D., Sykora, M.M., Sachidanandam, R., Mechtler, K. and Brennecke, J. (2010) An in vivo RNAi assay identifies major genetic and cellular requirements for primary piRNA biogenesis in Drosophila. EMBO J. 29, 3301–3317.

Parisod, C., Alix, K., Just, J., Petit, M., Sarilar, V., Mhiri, C., Ainouche, M., Chalhoub, B. and Grandbastien, M.A. (2010) Impact of transposable elements on the organization and function of allopolyploid genomes. New Phytol. 186, 37–45.

Pellicer, J., Fay, M.F. and Leitch, I.J. (2010) The largest eukaryotic genome of them all? Bot. J. Linn. Soc. 164, 10–15.

Pertea, G., Huang, X., Liang, F., et al. (2003) TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics, 19, 651–652.

Piednoël, M., Aberer, A.J., Schneeweiss, G.M., Macas, J., Novak, P., Gundlach, H., Temsch, E.M. and Renner, S.S. (2012) Next-generation sequencing reveals the impact of repetitive DNA across phylogenetically closely related genomes of Orobanchaceae. Mol. Biol. Evol. 29, 3601–3611.

Pusch, J. and Günther, K.F. (2009) Familie Orobanchaceae s. str. Sommerwurzgewächse. In Gustav Hegi Illustrierte Flora von Mitteleuropa (Wagenitz, G., ed). Jena: Weissdorn-Verlag, vol. 6, pp. 1–13.

Renny-Byfield, S., Chester, M., Kovařík, A., et al. (2011) Next generation sequencing reveals genome downsizing in allotetraploid Nicotiana tabacum, predominantly through the elimination of paternally derived repetitive DNAs. Mol. Biol. Evol. 28, 2843–2854.

Sarot, E., Payen-Groschene, G., Bucheton, A. and Pélisson, A. (2004) Evidence for a piwi-dependent RNA silencing of the gypsy endogenous retrovirus by the Drosophila melanogaster flamenco gene. Genetics, 166, 1313–1321.

Schneeweiss, G.M., Palomeque, T., Colwell, A.E. and Weiss-Schneeweiss, H. (2004) Chromosome numbers and karyotype evolution in holoparasitic Orobanche (Orobanchaceae) and related genera. Am. J. Bot. 91, 439–448.

Schulman, A.H. and Kalendar, R. (2005) A movable feast: diverse retrotransposons and their contribution to barley genome dynamics. Cytogenet. Genome Res. 110, 598–605.

Shan, X., Liu, Z., Dong, Z., et al. (2005) Mobilization of the active MITE transposons mPing and Pong in rice by introgression from wild rice (Zizania latifolia Griseb.). Mol. Biol. Evol. 22, 976–990.

Simmons, M.J. and Bucholz, L.M. (1985) Transposase titration in Drosophila melanogaster: a model of cytotype in the P-M system of hybrid dysgenesis. Proc. Natl Acad. Sci. USA, 82, 8119–8123.

Skalická, K., Lim, K.Y., Matyasek, R., Matzke, M., Leitch, A.R. and Kovarik, A. (2005) Preferential elimination of repeated DNA sequences from the paternal, Nicotiana tomentosiformis genome donor of a synthetic, allotetraploid tobacco. New Phytol. 166, 291–303.

Slotkin, R.K. and Martienssen, R. (2007) Transposable elements and the epigenetic regulation of the genome. Nat. Rev. Genet. 8, 272–285.

Staton, S.E., Bakken, B.H., Blackman, B.K., et al. (2012) The sunflower (Helianthus annuus L.) genome reflects a recent history of biased accumulation of transposable elements. Plant J. 72, 142–153.

Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M. and Kumar, S. (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739.

Theuri, J., Phelps-Durr, T., Mathews, S. and Birchler, J. (2005) A comparative study of retrotransposons in the centromeric regions of A and B chromosomes of maize. Cytogenet. Genome Res. 110, 203–208.

Velasco, R., Zharkikh, A., Affourtit, J., et al. (2010) The genome of the domesticated apple (Malus × domestica Borkh.). Nat. Genet. 42, 833–839.

Verbsky, M.L. and Richards, E.J. (2001) Chromatin remodeling in plants. Curr. Opin. Plant Biol. 4, 494–500.

Vieira, C., Nardon, C., Arpin, C., Lepetit, D. and Biémont, C. (2002) Evolution of genome size in Drosophila. Is the invader’s genome being invaded by transposable elements? Mol. Biol. Evol. 19, 1154–1161.

Wang, Q. and Dooner, H.K. (2006) Remarkable variation in maize genome structure inferred from haplotype diversity at the bz locus. PNAS, 103, 17644–17649.

Wang, H. and Liu, J.-S. (2008) LTR retrotransposon landscape in Medicago truncatula: more rapid removal than in rice. BMC Genomics, 9, 382.

Weiss-Schneeweiss, H., Greilhuber, J. and Schneeweiss, G.M. (2006) Genome size evolution in holoparasitic Orobanche (Orobanchaceae) and related genera. Am. J. Bot. 93, 148–156.

Wicker, T. and Keller, B. (2007) Genome-wide comparative analysis of copia retrotransposons in Triticeae, rice, and Arabidopsis reveals conserved ancient evolutionary lineages and distinct dynamics of individual copia families. Genome Res. 17, 1072–1081.

Wicker, T., Sabot, F., Hua-Van, A., et al. (2007) A unified classification system for eukaryotic transposable elements. Nat. Rev. Genet. 8, 973–982.

Xiong, Y. and Eickbush, T.H. (1990) Origin and evolution of retroelements based upon their reverse transcriptase sequences. EMBO J. 9, 3353–3362.

Zeh, D.W., Zeh, J.A. and Ishida, Y. (2009) Transposable elements and an epigenetic basis for punctuated equilibria. BioEssays, 31, 715–726.
