Title
Both exact target site sequence and long poly(A) tail are required for precise insertion of the 18S
rDNA-specific non-long terminal repeat retrotransposon R7Ag
Authors
Narisu Nichuguti, Mayumi Hayase and Haruhiko Fujiwara#
Address
Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo,
Bioscience Building 501, Kashiwa, Chiba 277-8562, Japan
#Corresponding author
E-mail: [email protected]
Tel: +81-4-7136-3659, FAX: +81-4-7136-3660

Running title: Retrotransposition mechanism of ribosomal element R7Ag
Word count for abstract section: 180 words
Word count for Abstract, Introduction, Results, Discussion and Figure legends sections: 33844
characters

MCB Accepted Manuscript Posted Online 14 March 2016. Mol. Cell. Biol. doi:10.1128/MCB.00970-15. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Downloaded from http://mcb.asm.org/ on February 1, 2018 by guest

ABSTRACT
Ribosomal elements (R elements) are site-specific non-long terminal repeat (non-LTR)
retrotransposons that target ribosomal DNA. To elucidate how R elements specifically access
their target sites, we isolated and characterized the 18S rDNA-specific R element R7Ag from
Anopheles gambiae. Using an in vivo and ex vivo recombinant baculovirus retrotransposition
system, we found that the exact host 18S rDNA sequence at the target site is essential for the
precise insertion of R7Ag. In addition, a long poly(A) tail is necessary for the accurate initiation
of R7Ag reverse transcription, a novel mechanism found in non-LTR elements. We further
compared the subcellular localizations of the proteins of R7Ag and R1Bm, another R element
that targets 28S rDNA. Although the ORF1 proteins (ORF1ps) of both R7Ag and R1Bm
localized predominantly in the cytoplasm, ORF2 proteins (ORF2ps) co-localized in the nucleus
with the nucleolar marker fibrillarin. The ORF1ps and ORF2ps of both R elements largely
co-localized in the nuclear periphery, and to a lesser extent, within the nucleus. These results
suggest that R7Ag and R1Bm proteins may access nucleolar rDNA targets in an
ORF2p-dependent manner.

INTRODUCTION
Non-long terminal repeat (non-LTR) retrotransposons, also known as long interspersed nuclear
elements (LINEs), are the most abundant mobile elements in many organisms. In particular,
LINEs comprise approximately 21% of the human genome and are the only active transposable
elements that still influence the human genome through their involvement in genome evolution,
genome mutation, and disease etiology (1).
According to a phylogenetic analysis, non-LTR retrotransposons can be categorized into two
groups: the early-branched subtype with a single open reading frame (ORF) and a recently
branched subtype with two ORFs (ORF1 and ORF2) (2, 3). Studies of the retrotransposition of
recently branched non-LTR elements have mainly involved human L1 elements and have
elucidated the following aspects of the retrotransposition process. After transcription from the
host genome, the element mRNA is exported to the cytoplasm and translated into two proteins,
ORF1p and ORF2p. The resulting proteins associate with their own mRNAs in the cytoplasm to
form a ribonucleoprotein (RNP) complex and subsequently translocate to the nucleus. In the
nucleus, the endonuclease domain (EN) of ORF2p nicks the bottom strand of the target DNA,
after which the reverse transcriptase (RT) domain of ORF2p uses the 3′-hydroxyl end of the
nicked DNA as a primer for reverse transcription of the mRNA template. This reverse
transcription initiation process has been termed target-primed reverse transcription (TPRT) (4).
At the final step of retrotransposition, the second strand in the target site is cleaved and
complementary DNA is synthesized. Retrotransposition events of L1 elements feature some
common hallmarks, including variable 5′ end truncation, poly(A) tail termination, and the
creation of target site duplications (TSDs) (5).
Although most non-LTR retrotransposons, including the human L1 element, are randomly
inserted throughout the host genome, some elements are inserted into specific sites of repetitive
genomic sequences (6), such as ribosomal DNA (rDNA) (3), telomeric repeats (7), and
microsatellites (8). Many non-LTR elements in the early-branched group are site specific. In
contrast, among more than 20 clades in the recently branched non-LTR group, only two, R1 and
Tx, include site-specific elements (6). It is noteworthy that the site-specific non-LTR elements
feature rigid sequence specificity not only for specific DNA targets but also for their own mRNA
transcripts in the integration pathway (9-11). Previous studies of R1 clade elements have shown
that the primary determinant of sequence-specific integration is the EN domain, which cleaves
target DNA during TPRT initiation (12, 13). Target specificity also relies on base pairing between
the read-through (downstream region) mRNA product of site-specific non-LTR elements and
the target DNA sequence at the cleaved site (11, 14). In addition, some telomere-specific non-LTR
elements have been reported to localize at the telomere region of the nucleus, suggesting that
unidentified access factors that lead these elements to the target site are involved in site
specificity (15, 16). However, further studies are needed to clarify these determinants of site
specificity and fully elucidate the molecular mechanism of site-specific non-LTR element
integration.
Ribosomal elements (R elements) comprise a class of sequence-specific non-LTR elements
that accumulate in and target the ribosomal DNA (28S, 18S, and 5.8S rDNA) (6). Nine different
R elements with distinct target specificities have been reported: R7 and R8 insert into 18S rDNA,
whereas the remaining 7 R elements, R1, R2, R4, R5, R6, R9, and RT, insert into 28S rDNA
(17-19). Among these, R7 and R1, which are closely related R1 clade elements, are good models
to gain a mechanistic understanding of site-specific integration because, although they belong to
the same clade and have similar structures, the target specificity of R7 has shifted from 28S
rDNA to 18S rDNA (3, 6, 20). Herein, we used R7Ag and R1Bm, which were identified from
Anopheles gambiae and Bombyx mori, respectively (Fig. 1) (3, 14). These two R elements share
common structural features, except for the absence of a poly(A) tail in R1Bm (Fig. 1).
We previously established an in vivo retrotransposition assay to evaluate R1Bm using a
recombinant Autographa californica nuclear polyhedrosis virus (AcNPV) expression system in
fall armyworm Spodoptera frugiperda 9 (Sf9) cells that lack the R7 element and silkworm larvae,
which enabled accurate target-specific integration (21, 22). To compare the mechanisms
associated with R7Ag and R1Bm, we constructed an accurate model of R7Ag retrotransposition
into specific 18S rDNA sites in Sf9 cells using AcNPV. With this system, we demonstrated that
R7Ag prefers to insert into the 18S rDNA sequences of its true host A. gambiae, rather than those
of S. frugiperda, and requires a long poly(A) tail to initiate accurate mRNA reverse transcription;
these were novel findings with respect to the correct integration of site-specific elements. In this
study, we also compared the subcellular localization of R7Ag and R1Bm proteins and found that
the ORF2ps of both elements co-localized with the nucleolar marker fibrillarin. Localization
analyses implied that both R elements gain access to the nucleoli, where the target rDNA resides,
through an ORF2p-dependent mechanism. These findings contribute to a better understanding of
the mechanisms by which site-specific elements access their target regions, recognize DNA
targets, and accurately copy the retrotransposon unit.

MATERIALS AND METHODS
Plasmid construction. For plasmid construction, polymerase chain reaction (PCR) was
conducted with the iProof™ DNA polymerase (Bio-Rad, Hercules, CA, USA); the primers used
for plasmid construction are shown in Table 1.
Constructs for the retrotransposition assay. R7Ag WT (wild type) was constructed based on
R7Ag1 (accession number AB090820) (3). Briefly, R7Ag1, lacking the 5′ UTR and ending with
oligo(A)8, was synthesized by Epoch Biolabs (Sugar Land, TX, USA) and then this R7Ag
construct was subcloned into the NcoI and XbaI sites of the pFastBac HTC plasmid. To construct
R7Ag RTm, R7Ag WT was digested with SmaI to remove a 1077-base pair (bp) portion of the
RT domain and then treated with T4 DNA polymerase to promote self-ligation. R7Ag A20 was
constructed as follows: a portion of the R7Ag A20 sequence was amplified by PCR from R7Ag
WT using the primers R7Ag1(5172)f and R7-3′A20Nsi1Xba1. The resulting PCR product was
subcloned between the NotI and XbaI sites of the pFastBac HTC vector.
Target plasmid construction. Ag 18S rDNA-pGEMT, Ag 18S rDNA mutant-pGEMT, and
Sf9 18S rDNA-pGEMT were constructed as follows. A 120-bp target region was
constructed by incubating each of three pairs of oligodeoxynucleotides (Ag18SrDNAforR7F and
Ag18SrDNAforR7R, Ag18SrDNAforR7F-mt and Ag18SrDNAforR7R-mt, or Sf918SrDNAforR7F
and Sf918SrDNAforR7R) at 96°C for 4 min, followed by an annealing step at 64°C for 2 min and
an extension step at 72°C for 2 min. The resulting products were cloned using the pGEM-T Easy
Vector System (Promega Corp., Madison, WI, USA).
Constructs for the immunofluorescence assay. Initially, the pIZT/V5-His-dEGFP plasmid was
constructed using a pIZT/V5-His Vector (Invitrogen, Carlsbad, CA, USA) from which EGFP had
been deleted. Next, an HA tag (CYPYDVPDYASL) was subcloned between the NotI and XbaI
sites of pIZT/V5-His-dEGFP, and the resulting vector was named
pIZT/V5-HA/His-dEGFP. R7AgORF1-pIZT/V5-HA/His-dEGFP was constructed as follows: a
portion of R7AgORF1 was amplified by PCR from the R7Ag WT plasmid with the primers
EcoRI-KOZAK-R71-F and NotI-R71-R. The resulting PCR product was subcloned between the
EcoRI and NotI sites of pIZT/V5-HA/His-dEGFP. To construct
R1BmORF1-pIZT/V5-HA/His-dEGFP, a portion of R1BmORF1 was amplified by PCR from the
pAcGHLT-B R1ORFs+3′ UTR plasmid (accession number AB182560) (14) using the primers
SpeI-KOZAK-R11-F and EcoRI-R11-R-2. The resulting PCR product was subcloned between the
SpeI and EcoRI sites of pIZT/V5-HA/His-dEGFP.
Next, the 3xFLAG-pIZT/V5-His-dEGFP construct was generated as follows: the 3xFLAG-tag
region, which includes KOZAK sequences (GCCACC) and a 3xFLAG tag
(DYKDHDGDYKDHDIDYKDDDDK), was constructed by incubating phosphorylated
oligodeoxynucleotides with the KpnI-KOZAK-3xFLAG-B-S and KpnI-KOZAK-3xFLAG-B-AS
primers and T4 Polynucleotide Kinase (Toyobo, Osaka, Japan) at 37°C for 1 h, followed by a
5-min incubation at 95°C and cooling to room temperature. The final resulting product was
subcloned between the KpnI and BamHI sites of pIZT/V5-His-dEGFP.
3xFLAG-R7AgORF2-pIZT/V5-His-dEGFP was constructed as follows: a portion of the
R7AgORF2/3′ UTR was amplified by PCR from the R7Ag WT plasmid with the primers
SpeI-R7O2S and EcoRI-R7O2A+3′ UTR. The PCR product was then subcloned between the
SpeI and EcoRI sites of 3xFLAG-pIZT/V5-His-dEGFP. To construct
3xFLAG-R1BmORF2-pIZT/V5-His-dEGFP, a portion of the R1BmORF2/3′ UTR was amplified
by PCR from the pAcGHLT-B -R1ORFs/3′ UTR plasmid (accession number AB182560) (14)
with the primers BamHI-R1O2S and R1 A5136 (NotI). The resulting PCR product was subcloned
between the BamHI and NotI sites of 3xFLAG-pIZT/V5-His-dEGFP.
Recombinant AcNPV generation. Recombinant AcNPV generation was performed according to
the instructions provided with the Bac-to-Bac Baculovirus Expression System (Invitrogen).
Briefly, the above-mentioned recombinant pFastBac HTC constructs, which contained a gene of
interest driven by the polyhedrin promoter, were transformed into DH10Bac E. coli for
transposition into the bacmid. Next, recombinant bacmid DNA was isolated via alkaline lysis
with sodium dodecyl sulfate (SDS), and PCR was used to verify successful transposition to the
bacmid. The isolated recombinant bacmid was then transfected into Sf9 cells using Cellfectin®
Reagent (Invitrogen) according to the instruction manual. Four days later, the medium containing
the virus was collected and centrifuged at 500 × g for 5 min to remove cells and large debris. The
clarified supernatant was transferred to fresh tubes and designated P1 viral stock for use in plaque
assays and virus amplification.
In vivo retrotransposition assay. The in vivo retrotransposition assay was performed as
described previously (21). Approximately 1 × 10⁶ Sf9 cells in a 6-well plate were infected with
R7Ag-containing AcNPV at a multiplicity of infection of 1 plaque-forming unit (pfu) per cell. The
genomic DNA was extracted 72 h post-infection with Gentra Puregene® Kits (QIAGEN, Valencia, CA,
USA). PCR assays were conducted using Ex-Taq with 1 μg of Sf9 DNA and the primers
R7Ag1(6091)f and 18SBm(1381)r. The reaction mixture was denatured at 96°C for 2 min,
followed by 35 cycles of 96°C for 30 s, 62°C for 30 s, and 72°C for 1 min. One microliter of each
mixture was subjected to 2% agarose electrophoresis in Tris-acetate-EDTA buffer and visualized
by ethidium bromide staining. PCR products were directly cloned into the pGEM-T Easy vector
(Promega). The cloned products were sequenced with a BigDye Terminator cycle sequencing kit
(Applied Biosystems, Foster City, CA, USA) on ABI3130xl and 3500/3500xl Genetic Analyzers
(Applied Biosystems). Sequence analysis was performed using the Vector NTI Advance 10
system (Invitrogen).
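The inoculum volume needed to reach a given multiplicity follows from the standard relation volume = (cells × multiplicity) / titer. The following is a minimal sketch of that arithmetic, not part of the published protocol: the function name is illustrative, and the stock titer is a hypothetical placeholder because no titer is reported in this section.

```python
def inoculum_volume_ml(cells: float, moi: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (mL) needed to infect `cells` at `moi` pfu per cell."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return cells * moi / titer_pfu_per_ml


# 1 x 10^6 Sf9 cells at 1 pfu per cell, as stated for the in vivo assay.
# The titer of 1e8 pfu/mL is a hypothetical placeholder, not a measured value.
volume_ml = inoculum_volume_ml(cells=1e6, moi=1.0, titer_pfu_per_ml=1e8)
print(f"{volume_ml * 1000:.1f} uL of viral stock")
```

In practice the titer would come from the plaque assay performed on the viral stock.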
Ex vivo retrotransposition assay. The ex vivo retrotransposition assay was performed as
described previously (23). Approximately 3 × 10⁵ Sf9 cells in a 12-well plate were transfected
with 800 ng of the target plasmid Ag 18S rDNA-pGEMT, Ag 18S rDNA mutant-pGEMT, or
Sf9 18S rDNA-pGEMT with the TransFast™ Transfection Reagent (Promega). Subsequently,
these cells were infected with R7Ag WT or R7Ag A20 AcNPV at a multiplicity of 1 pfu per cell.
Plasmid DNA was extracted 72 h post-infection, and the nested PCR assay was conducted with
Ex-Taq (TaKaRa). The first run of the reaction was denatured at 94°C for 1 min, followed by 35
cycles of 94°C for 20 s, 61°C for 30 s, and 72°C for 50 s. The second run of the reaction was
denatured at 94°C for 1 min, followed by 35 cycles of 94°C for 20 s, 59°C for 30 s, and 72°C for
50 s. The two sets of primers used for the nested PCR assay are shown in Table 1. PCR products were
directly cloned into the pGEM-T Easy vector (Promega). The cloned products were sequenced
with a BigDye Terminator cycle sequencing kit (Applied Biosystems) on ABI3130xl and
3500/3500xl Genetic Analyzers (Applied Biosystems). Sequence analysis was performed using
the Vector NTI Advance 10 system (Invitrogen).
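The two rounds of the nested PCR can be written as a small data structure, which makes the cycling conditions easy to compare. The sketch below encodes only the temperatures and times given above; the class and field names are illustrative assumptions, and ramp times between steps are ignored, so the totals are lower bounds on instrument run time.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # hold time at that temperature


@dataclass(frozen=True)
class PcrProgram:
    initial: Step            # initial denaturation
    cycle: Tuple[Step, ...]  # denaturation, annealing, extension
    n_cycles: int

    def hold_seconds(self) -> int:
        """Total programmed hold time, ignoring ramping between steps."""
        return self.initial.seconds + self.n_cycles * sum(s.seconds for s in self.cycle)


# First and second rounds as described in the text (annealing at 61 and 59 degrees C).
first_round = PcrProgram(Step(94, 60), (Step(94, 20), Step(61, 30), Step(72, 50)), 35)
second_round = PcrProgram(Step(94, 60), (Step(94, 20), Step(59, 30), Step(72, 50)), 35)

total_min = (first_round.hold_seconds() + second_round.hold_seconds()) / 60
print(f"programmed hold time for both rounds: {total_min:.1f} min")
```

Written this way, the two rounds differ only in the annealing temperature.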
Western blotting. Fibrillarin expression in Sf9 cells was detected by western blotting as follows:
Sf9 cell extracts were subjected to 10% SDS-polyacrylamide gel electrophoresis. HeLa cell and
Drosophila melanogaster embryo (Dm embryo 0–18 h) extracts were used as positive controls.
Electrophoresed proteins were transferred to a polyvinylidene difluoride membrane (#66543; Pall
Corporation, Port Washington, NY, USA) with the Trans-Blot Turbo Blotting System
(#170-4155; Bio-Rad) at 0.1 A and 25 V for 60 min. The membrane was subsequently blocked in
5% skim milk in Tris-buffered saline + Tween-20 (TBS-T) overnight at 4°C, washed three times
with TBS-T, and incubated overnight at 4°C with a primary antibody against fibrillarin (ab5821;
Abcam, Cambridge, UK) or α-tubulin (CLT9002; Cedarlane Laboratories, Burlington, ON,
Canada) for 1 h, diluted to 1:10,000 or 1:100,000 in 3% bovine serum albumin (BSA)/Can Get
Signal 1 (NKB-101; Toyobo). Following another wash with TBS-T, the membrane was incubated
for 1 h with a secondary antibody (anti-mouse IgG NA931 at 1:10,000 or anti-rabbit IgG NA934
at 1:10,000; GE Healthcare, Little Chalfont, UK) diluted in 3% BSA/Can Get Signal 2
(NKB-301; Toyobo). Finally, protein band detection was performed using ImmunoStar LD
(290-69904; Wako Pure Chemical Industries, Ltd., Osaka, Japan), and the blot was imaged on an
ImageQuant™ LAS 4000 system (GE Healthcare).
Immunofluorescence assay. Sf9 cells were grown on 12-well plates (Iwaki, Fukushima, Japan)
and were transfected with 2 μg of plasmid DNA in the presence of 12 μl of TransFast Reagent
(Promega). After 72 h, the transfected cells were harvested, allowed to adhere to an EZ SLIDE
8-well glass slide (Millicell; Millipore, Billerica, MA, USA), and then fixed with 4%
paraformaldehyde in phosphate-buffered saline (PBS) for 15 min, followed by ice-cold methanol
for 5 min. Next, cells were blocked by incubating for 30 min at 37°C in 3% BSA in PBS-Tween
20 (PBST). The cells were then incubated with two primary antibodies alone or in a mixture: anti-HA
rabbit polyclonal antibody (A190-108A, 1:2,000; Bethyl Laboratories, Montgomery, TX, USA)
and anti-FLAG M2 monoclonal antibody (F1804, 1:400; Sigma-Aldrich, St. Louis, MO, USA) in
1% BSA in PBST for 1 h at room temperature. Cells were washed three times with 1x PBS (5
min per wash) and subsequently incubated in the dark for 1 h at room temperature with a mixture
of the following fluorescently labeled secondary antibodies in 1% BSA: Alexa Fluor®
488-labeled anti-mouse IgG (1:500; Abcam) and Alexa Fluor® 555 anti-rabbit IgG (1:1000;
Abcam). For fibrillarin detection, an anti-fibrillarin nucleolar marker rabbit polyclonal antibody
(ab5821, 1:200; Abcam) was used as a primary antibody, and Alexa Fluor® 555 anti-rabbit IgG
(1:500, Abcam) was used as the secondary antibody. Finally, the slides were washed with 1x PBS
and mounted with Vectashield mounting medium containing 4′,6-diamidino-2-phenylindole
(DAPI; Vector Laboratories, Inc., Burlingame, CA, USA). Protein localization was analyzed
using a FluoView™ FV1000 confocal microscope (Olympus Corporation, Tokyo, Japan), and
images were taken using FV10-ASW 1.6 software (Olympus Corporation).

RESULTS
Analysis of R7Ag target specificity
Previously, we identified four copies of the R7Ag element from an A. gambiae database and
designated one full-length copy as R7Ag1 (accession no. AB090820) (3). To thoroughly study the
target specificity of R7Ag, we further screened for novel R7Ag clones in an A. gambiae
whole-genome shotgun sequence database (WGS anopheles) using the BLASTN program and the
R7Ag1 3′ UTR and 5′ UTR sequences as queries. By investigating the boundary sequences
between the 18S rDNA host genome and multiple genomic copies of R7Ag, we identified 49
novel clones representing the 3′ junctions (Table 2) and 52 clones representing the 5′ junctions
(Table 3). All clones ended with a 5–25 base poly(A) stretch at the 3′ UTR end and included
15-base pair (bp) target-site duplications (TSDs: AAGAAGTGGAGCTTG) at both ends.
Detailed information about the novel clones identified in this study is summarized in Tables 2
and 3.
Among the 52 5′ junction clones, a 2-bp difference in the 5′ boundaries between the 18S rDNA
target and R7Ag was thought to have resulted from variations in the top strand cleavage site
(Table 3). In contrast, among the 49 3′ junction clones, the 3′ boundaries between R7Ag and the
18S rDNA target were tightly conserved, suggesting that the EN domain in the ORF2p of R7Ag
mediates bottom strand cleavage with a high level of specificity. Table 2 also presents sequence
variations in the 3′ UTRs of the genomic clones of R7Ag: 19 clones (Class I) ended with
…AGTATT, 27 clones (Class II) ended with …AGTGTTAAC, and 3 clones (Class III) ended
with more variable sequences. The full-length R7Ag1 used in this study belongs to Class I. These
sequence variations might represent different original copies of R7Ag or mutations that occurred
after insertion into the target.
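The grouping of 3′ junction clones into classes can be expressed as a suffix test on the 3′ UTR sequence once the terminal poly(A) stretch is removed. The sketch below is an illustration only: the function name is hypothetical, the example sequences are made-up toy strings rather than clones from Table 2, and it assumes that Class III is simply any sequence matching neither of the two stated endings.

```python
def classify_3utr(seq: str) -> str:
    """Assign a 3' UTR end to Class I, II, or III by its sequence before the poly(A) stretch."""
    core = seq.upper().rstrip("A")  # drop the terminal poly(A) stretch
    if core.endswith("AGTGTTAAC"):
        return "II"
    if core.endswith("AGTATT"):
        return "I"
    return "III"  # assumption: everything else is treated as "more variable"


# Toy sequences for illustration only (not real clones).
examples = ["ccgtAGTATTaaaaaaaa", "ccgtAGTGTTAACaaaaa", "ccgtAGGCTTaaaa"]
print([classify_3utr(s) for s in examples])  # ['I', 'II', 'III']

# The class sizes quoted in the text account for all 49 clones.
class_sizes = {"I": 19, "II": 27, "III": 3}
print(sum(class_sizes.values()))  # 49
```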
R7Ag inaccurately retrotransposes into the 18S rDNA of Sf9 cells
To determine whether the full-length R7Ag construct would retain its retrotransposition activity,
we established a retrotransposition assay in a baculovirus-based expression system (21),
as described below. A full-length wild type construct, R7Ag WT, including
the ORF1/ORF2/3′ UTR portion and an 8-nucleotide (nt) poly(A) tail, was synthesized by Epoch
Biolabs (Sugar Land, TX, USA) based on the R7Ag1 genomic sequence. An AcNPV
recombinant clone was constructed for expression in the Sf9 cells under the control of the
AcNPV polyhedrin promoter (Fig. 2A). The de novo retrotransposition of R7Ag WT into the
genomic DNA of Sf9 cells was detected through 40 cycles of PCR using primers R7Ag1(6091)f
and 18SBm(1381)r (Table 1), which were designed to amplify the junction region between the
ORF2 C-terminal/3′ UTR region of R7Ag and the target 18S rDNA sequence (Fig. 2A). As a negative
control, we also generated an R7Ag RT mutant construct (R7Ag RTm) that lacked a large portion
of the RT domain (Fig. 2A).
Using this baculovirus-based assay system, we compared the retrotransposition activities of the
WT and RT mutant R7Ag constructs. We observed multiple PCR product bands for the WT
construct, but no band for the RT mutant or Sf9 cells alone (negative control; Fig. 2B). This result
indicates that the multiple bands yielded by the WT construct represented retrotransposition events
that require a functional R7Ag RT domain, as demonstrated for other non-LTR elements. To
confirm that these bands represented retrotransposed copies, we isolated and sequenced 23 clones
from R7Ag WT PCR products and found that they were inserted around the target sequence of
the Sf9 cell 18S rDNA (Fig. 2C). Importantly, all sequenced clones contained inaccurate
retrotransposition features. First, accurate EN domain-mediated bottom strand cleavage of the
target DNA should occur at the left end of the 18S rDNA target sequence (Fig. 2C, dotted line at
the left end of the black box, EN = 0); however, our results indicated cleavage at various sites
near the 18S target. Second, accurate reverse transcription of mRNA should initiate at the 3′ end of
the non-LTR element (Fig. 2C, dotted line at the right end of the white box, RT = 0). In 96% of
clones (22/23), however, reverse transcription initiated at various sites upstream or downstream
(read-through mRNA into the vector region) of the accurate position. In summary, both the target
cleavage and reverse transcription of R7Ag mRNA occurred inaccurately. The above-mentioned
variations differed from those observed in endogenous R7Ag elements from A. gambiae genomic
DNA (Tables 2 and 3).
Host 18S rDNA sequence is required for the accurate insertion of R7Ag WT
Target site duplication (TSD) occurs at staggered nicks between the bottom and top strands of
target DNA (4, 24). As described above, the suggested TSD sequence of R7Ag was
AAGAAGTGGAGCTTG, based on genomic copy sequences in A. gambiae. However, in Sf9
cells (S. frugiperda genome), the 18S sequence in this region is CAGGAGTGGAGCCTG (Fig.
3A), which differs from the A. gambiae sequence by 3 bp. We also compared the 120-bp 18S
rDNA sequence around the cleavage site (60 bp both upstream and downstream) in both species
and found that approximately 30 bp on either side of the target (33 bp upstream and 27 bp
downstream) were strictly conserved between the species (Fig. 3A), suggesting that the 3-bp
difference within the TSD sequence caused inaccurate target DNA cleavage in Sf9 cells.
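The 3-bp difference between the two TSD sequences can be checked with a position-by-position comparison. The following minimal sketch uses only the two 15-nt sequences quoted above; the function and variable names are illustrative.

```python
AG_TSD = "AAGAAGTGGAGCTTG"   # A. gambiae 18S rDNA, as quoted in the text
SF9_TSD = "CAGGAGTGGAGCCTG"  # S. frugiperda (Sf9) 18S rDNA, as quoted in the text


def mismatches(a: str, b: str):
    """Return (1-based position, base in a, base in b) for every differing position."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = mismatches(AG_TSD, SF9_TSD)
print(diffs)  # [(1, 'A', 'C'), (4, 'A', 'G'), (13, 'T', 'C')]
```

The three substitutions fall at positions 1, 4, and 13 of the 15-nt TSD, and the remaining 12 positions are identical.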
To verify this possibility, we next tested whether the original 18S rDNA sequence could be
recognized accurately using a previously developed ex vivo retrotransposition assay (23). This
method enabled the testing of site-specific non-LTR element retrotransposition into an exogenous
target sequence transfected into Sf9 cells. After the Sf9 cells were transfected with a plasmid
including a 120-bp target sequence from one of two species (Ag 18S rDNA-pGEMT or Sf9 18S
rDNA-pGEMT), the cells were further infected with an R7Ag WT recombinant baculovirus (Fig.
3B). We extracted plasmid DNA from the cells and detected the retrotransposition event using
nested PCR to amplify the 3′ junction between R7Ag WT and the 18S rDNA target on the
plasmid (Fig. 3B).
In both cases, we observed smeared bands that represented R7Ag WT retrotransposition events,
according to the sequence analyses of the PCR products (Figs. 3C, 4A and B). However, bottom
strand cleavage clearly differed between the Ag and Sf9 18S rDNA targets. Whereas 19 of 20
clones exhibited inaccurate Sf9 18S target cleavage by the EN domain of R7Ag (Fig. 4A, right
dotted line, EN = 0), 30 of 31 clones exhibited accurate Ag 18S target cleavage (Fig. 4B, right
dotted line, EN = 0). To exclude the possibility that additional mutations outside of the conserved
60-bp region (mutations; highlighted in grey, Fig. 3A) could cause the inaccurate cleavage, we performed
the ex vivo retrotransposition assay with a mutant Ag 18S rDNA target plasmid (Ag 18S rDNA
mutant) in which only 3 bp in the TSD had been changed to the Sf9 type sequence (A to C, A to
G, and T to C; asterisks in Fig. 3A; Fig. 3B and C). Sequence analyses revealed that all 30 clones
exhibited inaccurate cleavage at the Ag 18S rDNA mutant site (Fig. 4C, right dotted line, EN
= 0), similar to the results obtained with Sf9 18S target cleavage (Fig. 4A). These data suggest
that the sequence specificity of the initial EN cleavage of R7Ag is affected mainly by the 15-bp
TSD sequence on the host 18S rDNA target. Thus, the 3-bp difference within the TSD
sequence between Sf9 and Ag 18S rDNA is critical for the target specificity of R7Ag.
In addition, in most retrotransposed copies (22/31; 71%), the reverse transcription of R7Ag
mRNA initiated at inaccurate positions (downstream of the 3′-end of R7Ag mRNA) even when
an A. gambiae exogenous plasmid target was used (Fig. 4B). Using an S. frugiperda exogenous
plasmid, we observed similar inaccurate reverse transcription in 65% of the retrotransposed
copies (13/20; Fig. 4A).
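As a consistency check, the percentages reported for inaccurate reverse transcription can be recomputed from the clone counts given in the text. The sketch below uses only those counts; the dictionary labels are informal descriptions, not names used in the study.

```python
# (inaccurate clones, total clones, percentage stated in the text)
reported = {
    "in vivo, Sf9 genomic 18S rDNA": (22, 23, 96),
    "ex vivo, A. gambiae target plasmid": (22, 31, 71),
    "ex vivo, S. frugiperda target plasmid": (13, 20, 65),
}


def percent(count: int, total: int) -> int:
    """Proportion expressed as a whole-number percentage."""
    return round(100 * count / total)


for label, (count, total, stated) in reported.items():
    print(f"{label}: {count}/{total} = {percent(count, total)}% (stated: {stated}%)")
```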
The long poly(A) tail at the end of the 3′ UTR is critical for accurate R7Ag reverse transcription
We next questioned why we could not observe accurate reverse transcription of R7Ag mRNA
even when using target 18S rDNA from the true host, A. gambiae (Fig. 4B). Notably, site-specific
non-LTR retrotransposons end with a 3′ poly(A) tail. The telomere-specific non-LTR element
SART1 in B. mori requires a poly(A) tail to initiate target-primed reverse transcription in vivo (11).
In addition, human L1 elements function in trans to promote both processed pseudogene
formation and retrotransposition of some short interspersed nuclear elements (SINEs) by
recognizing the poly(A) tail of mRNA (5, 25). These observations suggest that the poly(A) tail of
R7Ag is involved in its retrotransposition. To clarify the function of poly(A) in R7Ag
retrotransposition, we generated an R7Ag A20 construct with a longer 20-nucleotide (nt)
oligo(A) tail and compared the results obtained with this construct and with an R7Ag WT
construct bearing an 8-nt oligo(A) tail (Fig. 5A). We subsequently conducted an ex vivo
retrotransposition assay using exogenous target plasmids, including the A. gambiae 18S rDNA
target (Fig. 5B). The sequence analyses of the 12 retrotransposed copies cloned from smeared
bands showed not only accurate cleavage of the bottom target strand (Fig. 4C, right dotted line,
EN = 0) but also accurate reverse transcription that initiated exactly from the poly(A) end of the
3′ UTR (Fig. 5C, left dotted line, RT = 0). This finding suggests that a longer poly(A) tail at the
3′ UTR end is critical for the accurate initiation of R7Ag mRNA reverse transcription.
Subcellular localization of ORFps from the rDNA-specific elements R7Ag and R1Bm
During retrotransposition, rDNA-specific elements (R elements) are assumed to translocate to
the nucleolar region, which includes the rDNA cluster, although there is no direct evidence to
support this idea to date. To test this possibility and clarify the mechanism by which proteins of R
elements gain access to the target site, we next attempted to identify the subcellular localization
of the ORF proteins of R7Ag and of the closely related element R1Bm, which targets 28S rDNA. To
detect ORF protein localization signals via immunofluorescence, we generated plasmid
constructs for the two elements, in which ORF1p was tagged with HA and ORF2p was tagged with
3xFLAG (Fig. 6A). These plasmids were transfected transiently into Sf9 cells, and the subcellular
localization of each expressed protein was detected with a confocal microscope after labeling
with antibodies against HA and 3xFLAG.
When R7Ag ORF1p was expressed alone in Sf9 cells, the corresponding signals were observed
broadly throughout the cytoplasm (Fig. 6B-b, Table 4). Similar cellular localization signals were 366
observed for R1Bm ORF1p (Fig. 6B-e, Table 4). In contrast, when ORF2p was expressed alone 367
in Sf9 cells, we observed two types of localization patterns for both R7Ag and R1Bm (Fig. 6C, 368
Table 4). In type I, the localization signals of both R7Ag (Fig. 6C-b) and R1Bm (Fig. 6C-h) were 369
observed broadly in the cytoplasm. In type II, the R7Ag ORF2p signals were observed as a 370
punctate (dotted) pattern in the nuclei (Fig. 6C-e). Although weak R1Bm ORF2p signals were 371
observed in the cytoplasm, some puncta were also observed in the nuclei (Fig. 6C-k). 372
Next, we co-expressed ORF1p and ORF2p simultaneously to monitor the interaction of the two 373
proteins in cells. In this experiment, we also observed two types of co-localized signals for both 374
R elements (Fig. 7A, Table 5). In type I, some signals for ORF1p (localized broadly in the cytoplasm, 375
Fig. 7A-a and -g) and ORF2p (localized mainly in nuclei and weakly in the cytoplasm, Fig. 7A-b 376
and -h) co-localized in peripheral regions just outside of the nuclear membrane (Fig. 7A-c and -i). 377
Notably, 59.3% of cells expressing R7Ag proteins and 40% of cells expressing R1Bm proteins 378
corresponded to this type (Table 5). In type II, fewer ORF1p (localized mainly in the cytoplasm with 379
mild nuclear localization, Fig. 7A-d and -j) and ORF2p signals (localized in nuclei, Fig. 7A-e 380
and -k) co-localized in the nucleus in a punctate pattern (Fig. 7A-f and -l), although the signals for 381
R1Bm were less noticeable. Specifically, 12.5% of cells expressing R7Ag proteins and 15% of 382
cells expressing R1Bm proteins corresponded to this type (Table 5). The remaining cells 383
expressing both R elements exhibited no clear co-localization signals for ORF1p and ORF2p 384
(28% for R7Ag and 45% for R1Bm, Table 5). These results indicate that although most ORF 385
proteins localize separately, some proportion of the ORF proteins of both R elements co-localize 386
in the cytoplasm and accumulate in the peripheral region of the nuclear membrane. The above 387
results suggest that a small portion of the accumulated proteins of the two R elements may 388
translocate into nuclei in an ORF2p-dependent manner. 389
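As a quick arithmetic check on the proportions quoted above, the share of cells without clear co-localization should equal 100% minus the type I and type II shares. The following minimal Python sketch uses only the percentages reported in the text. The dictionary layout and the function name are illustrative assumptions and are not part of the original analysis.

```python
# Percentages of co-expressing cells reported in the text (Table 5).
REPORTED = {
    "R7Ag": {"type_I": 59.3, "type_II": 12.5},
    "R1Bm": {"type_I": 40.0, "type_II": 15.0},
}


def no_colocalization(fractions):
    """Return the percentage of cells left after removing type I and type II."""
    return round(100.0 - fractions["type_I"] - fractions["type_II"], 1)


# Hypothetical helper output: remaining share per element, in percent.
remaining = {name: no_colocalization(f) for name, f in REPORTED.items()}
print(remaining)
```

The remainders computed this way (28.2% for R7Ag and 45.0% for R1Bm) agree with the rounded values of 28% and 45% given in the text.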
R7Ag and R1Bm ORF2p co-localize with the nucleolar marker protein fibrillarin 390
From the above data, R7Ag and R1Bm ORF2p (or co-localized signals of ORF1p and ORF2p) 391
exhibited a punctate nuclear pattern (Figs. 6C and 7A). Because the rDNA targets of R elements 392
reside in the nucleolus, these punctate ORF2p signals might indicate nucleolar localization. To 393
test this possibility, we next used fibrillarin, a nucleolar marker protein, to confirm the nuclear 394
position of ORF2p. The antibody used herein is known to react with fibrillarin from mouse, 395
human, Xenopus laevis and Drosophila melanogaster according to the manufacturer′s manual (see 396
Materials and Methods). Although Xing et al. (26) demonstrated in 2011 that this anti-fibrillarin 397
antibody also reacts with fibrillarin from Sf9 cells, we first confirmed the specificity of the 398
antibody in Sf9 cells via western blotting analysis of whole cell lysates from HeLa cells and 399
Drosophila melanogaster embryos. The antibody yielded a 34 kDa band that represents authentic 400
alpha-fibrillarin (Fig. 7B). The Sf9 cell lysate also yielded a similar band at roughly 34 kDa, 401
suggesting that the antibody recognizes S. frugiperda fibrillarin (Fig. 7B). In Sf9 cells, the 402
antibody detected fibrillarin proteins as punctate signals in nuclear regions (Fig. 7C-c and -g). 403
When R7Ag and R1Bm ORF2ps were expressed in Sf9 cells, we observed a few 404
immunofluorescence co-localization signals of ORF2p with fibrillarin signals (Fig. 7C-d and -h, 405
arrowheads (yellow signals) in merged panel), suggesting that retrotransposable units of the two 406
R elements gain access to the nucleolus through a mechanism involving ORF2p. 407
Despite having different retrotransposition target sites, R7Ag and R1Bm ORFps exhibited 408
similar intracellular localization patterns: cytoplasmic localization of ORF1p, accumulation of 409
ORFps in a peripheral region of the nuclear membrane, punctate nuclear localization of ORF2p, 410
and localization of a small portion of ORF2p in the nucleolus (Figs. 6 and 7). These results 411
indicate that both R elements share similar mechanisms of ORFps interaction, accumulation and 412
access to the rDNA cluster target. 413
414
DISCUSSION 415
In this study, the newly cloned R7Ag from A. gambiae exhibited retrotransposition activity in 416
Sf9 cells via a baculovirus-based in vivo retrotransposition assay. Using this baculovirus-based 417
assay, we have previously succeeded in characterizing the telomere-specific non-LTR elements 418
SART1 and TRAS1 and an rDNA-specific element R1Bm, all of which were identified in the 419
silkworm B. mori (14, 21). B. mori and S. frugiperda are lepidopteran species, whereas A. 420
gambiae is a dipteran species. Therefore, basal host factors required for the retrotransposition of 421
non-LTR elements might be conserved between Lepidoptera and Diptera. Whole-genome 422
sequence data indicate that the rDNA contains approximately 50 genomic copies of R7Ag, 423
although it is unclear whether all sequences represent different genomic copies of R7Ag or 424
merely sequence variations (especially regarding poly(A) length) among cells or individuals. 425
R7Ag is an R1 clade element that targets a specific sequence in 18S rDNA. A previous 426
phylogenetic analysis showed that R1Bm, which targets 28S rDNA, is an ancestral element and 427
that R7Ag represents a more recent branch (3). Thus, it has been speculated that 28S rDNA might 428
have been the ancestral target, and 18S rDNA may have become the target of R7Ag during 429
evolution. However, sequences around the cleavage sites in R7Ag and R1Bm are not highly 430
conserved, with the exception of CCAC just upstream of the top strand in both elements (Fig. 1). 431
In particular, the 15-bp TSD of R7Ag and 14-bp TSD of R1Bm have no sequence similarity, 432
suggesting that target sequence recognition by the EN domain differs between the two R elements. 433
In fact, the present report clarified that the 15-bp TSD of R7Ag is involved in the specific 434
recognition and cleavage of the bottom strand of its 18S rDNA target. We observed inaccurate 435
cleavage of the bottom strand target of endogenous (Fig. 2C) and exogenous (Fig. 4A) 18S rDNA 436
from S. frugiperda (not the original host), which features a 3-bp difference in the TSD sequence. 437
However, precise bottom strand cleavage was observed in an exogenous plasmid that included 438
the 18S rDNA target of the true host, A. gambiae (Fig. 4B) but not in an 18S rDNA mutant 439
plasmid (Fig. 4C). Interestingly, the accuracy of S. frugiperda 18S rDNA bottom strand cleavage 440
by R7Ag appeared to be higher in the exogenous plasmid target (Fig. 4A) than in the 441
endogenous genome (Fig. 2C). This suggests that the chromosomal structure around the target 442
region affects the recognition or cleavage of the bottom strand target by R7Ag EN. 443
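The 3-bp difference between the two target sites discussed above can be made concrete with a position-by-position comparison. The sketch below is illustrative only: the A. gambiae 15-bp TSD is taken from Table 2, whereas the S. frugiperda 15-mer is an assumption obtained by aligning the Sf918SrDNAforR7F and Ag18SrDNAforR7F primers in Table 1 over the same window. The function name is hypothetical.

```python
# 15-bp R7Ag target site duplication (TSD) in A. gambiae 18S rDNA (Table 2).
AG_TSD = "AAGAAGTGGAGCTTG"
# Assumed S. frugiperda counterpart, read off the aligned Table 1 primers.
SF9_TSD = "CAGGAGTGGAGCCTG"


def mismatches(ref, other):
    """Return (1-based position, ref base, other base) for each differing site."""
    if len(ref) != len(other):
        raise ValueError("sequences must have equal length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(ref, other)) if a != b]


diffs = mismatches(AG_TSD, SF9_TSD)
for pos, ag_base, sf9_base in diffs:
    print(f"position {pos}: {ag_base} -> {sf9_base}")
```

Under these assumptions the comparison reports three substitutions (A to C, A to G and T to C), matching the three replacements described for the Ag 18S rDNA mutant in the legend to Fig. 3A.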
Although R7Ag and R1Bm are closely related R1 clade elements, they differ structurally. 444
Similar to other non-LTR elements, R7Ag has a 3′ poly(A) tail that R1Bm lacks. An earlier report 445
demonstrated that R1Bm requires a 147-bp 3′ UTR, but not a poly(A) tail, for efficient 446
retrotransposition (14). In contrast, we demonstrated herein that R7Ag requires a longer poly(A) 447
tail for the precise initiation of mRNA reverse transcription. This observation is consistent with 448
our previous report in which the loss of the poly(A) tail from the telomere-specific element SART1 449
resulted in decreased and inaccurate reverse transcription (11). In the present study, the use of an 450
R7Ag WT construct with an 8-nt oligo(A) tail at its 3′ end led to inaccurate reverse transcription 451
in 96% of the clones retrotransposed into the endogenous S. frugiperda genome (Fig. 2C). 452
Furthermore, using the same construct, 65% and 71% of clones exhibited inaccurate reverse 453
transcription of their exogenous targets in S. frugiperda and A. gambiae rDNA, respectively (Fig. 454
4A and B). However, when an R7Ag construct with a longer (20 nt) oligo(A) tail was used, 100% 455
of the retrotransposed copies exhibited accurate reverse transcription. Based on Table 2, the 456
average length of the poly(A) tail in genomic copies of R7Ag with a 15-bp TSD (class I) is 457
approximately 13.5 nt (range: 8–25 nt). This suggests that a poly(A) tail longer than 8 nt is 458
necessary for correct recognition by the R7Ag RNP complex (particularly the RT domain of 459
ORF2p), thus enabling positioning of the 3′ end of the mRNA exactly onto the cleaved end of the 460
bottom strand DNA. We do not know why the accurate reverse transcription rate increased with 461
the exogenous S. frugiperda rDNA plasmid target (Fig. 4A, 35%) relative to the endogenous S. 462
frugiperda genome target (Fig. 2C, 4%). In the telomeric repeat-specific SART1, short telomeric 463
repeat-like GGUU sequences in the 3' UTR of the mRNA might anneal to the bottom strand 464
target site of (TTAGG)n repeats, allowing reverse transcription to initiate from the poly(A) tail 465
(11). Furthermore, in the I factor of Drosophila melanogaster, tandem UAA repeats at the 3'-end 466
of the transcript are essential for the precise initiation of reverse transcription (27). In addition, 467
almost all R1-clade non-LTR retrotransposons (Mino, RT, R7, R6, Waldo, SART and TRAS) 468
feature a terminal poly(A) tail. These observations suggest that reverse transcription starts from 469
the tandem repeats adjacent to the 3' UTR end in read-through RNA. 470
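The class I poly(A) average cited in this section (approximately 13.5 nt, range 8–25 nt) can be reproduced from the class I rows of Table 2. The sketch below assumes that the average is weighted by clone number; this weighting is our assumption, chosen because it reproduces the reported value, and the function name is hypothetical.

```python
# (poly(A) length in nt, number of clones) for the class I rows of Table 2.
CLASS_I = [(8, 4), (9, 1), (11, 1), (12, 2), (13, 2),
           (14, 3), (15, 2), (16, 2), (25, 2)]


def weighted_mean_length(rows):
    """Mean poly(A) length, weighting each length by its clone count."""
    total_clones = sum(n for _, n in rows)
    total_nt = sum(length * n for length, n in rows)
    return total_nt / total_clones


mean_len = weighted_mean_length(CLASS_I)
shortest = min(length for length, _ in CLASS_I)
longest = max(length for length, _ in CLASS_I)
print(f"{mean_len:.1f} nt (range: {shortest}-{longest} nt)")
```

With 19 class I clones totalling 256 nt of poly(A), the clone-weighted mean is about 13.5 nt, whereas an unweighted mean over the nine types would be about 13.7 nt.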
Based on the above results and discussion, we propose a model to explain the role of the 471
poly(A) tail in R7Ag retrotransposition (Fig. 8). Endogenous R7Ag in the rDNA cluster is 472
co-transcribed by RNA polymerase I (Fig. 8A). During the process of retrotransposition, R7Ag 473
reverse transcription initiates from the poly(A) tail in co-transcribed mRNA, which may be 474
recognized by R7Ag ORF proteins. In this study, however, since we used the baculovirus 475
recombinant system, R7Ag was transcribed by RNA polymerase II from the polyhedrin promoter, 476
read through into the vector sequence and ended with a poly(A) tail at various sites within the 477
SV40pA signal region (Fig. 8B). In this mRNA, therefore, the original poly(A) tail at the 3′ end 478
of R7Ag resides in the internal region. In the R7AgWT construct, it is hypothesized that the 479
reverse transcriptase (RT) (or some other domain) recognizes the 3′-terminal poly(A) tail more 480
effectively than the original short oligo(A) tail (8 nt), and therefore reverse transcription starts at 481
various sites within SV40pA. In R7Ag A20, however, a longer and internal oligo(A) tail (20 nt) 482
might be recognized more effectively than the 3′-terminal poly(A) tail, leading to the accurate 483
insertion of R7Ag. 484
In the non-LTR retrotransposon life cycle, migration of the RNP complex into the nucleus is an 485
essential step for retrotransposition. However, the mechanism by which the RNPs of non-LTR 486
elements cross the nuclear membrane remains unclear. One hypothesis suggests that the entrance 487
of the RNP complex into the nucleus is cell cycle-dependent and occurs when the nuclear 488
membrane breaks down. This hypothesis is supported by previous studies in which cell division 489
promotes the retrotransposition efficiency of L1 and is thus considered a host mechanism by 490
which L1 activity is controlled (28, 29). In contrast, another study showed that L1 491
retrotransposition could occur in non-dividing somatic cells, which suggests nuclear import of the 492
L1 RNP (30). In silkworm SART1, nuclear localization signals (NLSs) in the N-terminal region 493
of ORF1 were found to be involved in RNP transport into the nucleus (15). Furthermore, two 494
Drosophila telomere-specific non-LTR retrotransposons, HeT-A and TART, are delivered to their 495
telomeric target sites by the HeT-A gag protein, a finding that strongly supports a nuclear 496
transport mechanism (31). 497
In this study, we found that the ORF2ps of both R7Ag and R1Bm exhibited a punctate 498
localization pattern in the nucleus; additionally, these proteins occasionally co-localized with the 499
nucleolar component fibrillarin (Figs. 6 and 7). This observation suggests that different R 500
elements share a common process for accessing the nucleolar target site. In both R elements, notably, 501
some of the ORF1p and ORF2p signals co-localized in the nucleus (Figs. 6 and 7), suggesting 502
that ORF1p moved into the target site in an ORF2p-dependent manner. We recently observed that 503
R2Bm and R2Ol, other types of R elements that encode a single ORF and target 28S rDNA, also 504
translocated to the nucleolus (unpublished data). Together with the above data, we speculate that 505
R elements share transport mechanisms by which they access the nucleolus before initiating 506
sequence-specific digestion of the bottom strands of rDNA targets. To clarify these R element 507
transportation mechanisms, we attempted to predict the potential nucleolar localization signal 508
(NoLS) in ORF2p using the NoD server (32) and identified two NoLSs in R7Ag but found no 509
signals in R1Bm (data not shown). The ORF2 of R1Bm may contain a non-canonical NoLS, or 510
the two nuclear localization signals (NLSs) predicted in ORF2p by the cNLS Mapper (data not 511
shown) might act as a NoLS because the NoLSs and NLSs are both rich in basic amino acids and 512
often combine or overlap (33). Combined with the absence of a predicted nuclear export signal (NES) in 513
both R elements (data not shown), this notion implies that the subcellular 514
localization of ORF2p may change from Type I (cytoplasm) to Type II (nuclei), but not from 515
Type II to Type I. Further comparative studies using various R elements will be needed to test this 516
possibility and to clarify the detailed process of RNP localization. 517
518
ACKNOWLEDGEMENTS 519
We are grateful to Tetsuya Kojima, Hiroyuki Watanabe and Akira Ishizuka for assistance with the 520
immunofluorescence assay. We thank Mizuko Osanai-Futahashi and Mariko Kondo for their 521
advice and discussion about the work. 522
This study was supported by grants from the Ministry of Education, Culture, Sports, Science 523
and Technology (MEXT Japan; Grants-In-Aid for Scientific Research (KAKENHI) (A) 524
18207001 and (B) 24370001 to HF) and by the Program for Promotion of Basic Research 525
Activities for Innovative Biosciences (PROBRAIN) to HF. 526
527
REFERENCES 528
1. Beck CR, Garcia-Perez JL, Badge RM, Moran J V. 2011. LINE-1 elements in structural 529
variation and disease. Annu Rev Genomics Hum Genet 12:187–215. 530
doi:10.1146/annurev-genom-082509-141802. 531
2. Malik HS, Burke WD, Eickbush TH. 1999. The age and evolution of non-LTR 532
retrotransposable elements. Mol Biol Evol 16:793–805. 533
3. Kojima KK, Fujiwara H. 2003. Evolution of target specificity in R1 clade non-LTR 534
retrotransposons. Mol Biol Evol 20:351–361. 535
4. Luan DD, Korman MH, Jakubczak JL, Eickbush TH. 1993. Reverse transcription of 536
R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR 537
retrotransposition. Cell 72:595–605. 538
5. Moran J V, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, Kazazian HH Jr. 1996. 539
High frequency retrotransposition in cultured mammalian cells. Cell 87:917–927. 540
6. Fujiwara H. 2015. Site-specific non-LTR retrotransposons. Microbiol Spectr 541
3:MDNA3-0001-2014. doi:10.1128/microbiolspec.MDNA3-0001-2014. 542
7. Fujiwara H, Osanai M, Matsumoto T, Kojima KK. 2005. Telomere-specific non-LTR 543
retrotransposons and telomere maintenance in the silkworm, Bombyx mori. Chromosom Res 544
13:455–467. 545
8. Busseau I, Berezikov E, Bucheton A. 2001. Identification of Waldo-A and Waldo-B, two 546
closely related non-LTR retrotransposons in Drosophila. Mol Biol Evol 18:196–205. 547
9. Christensen SM, Ye J, Eickbush TH. 2006. RNA from the 5’ end of the R2 retrotransposon 548
controls R2 protein binding to and cleavage of its DNA target site. Proc Natl Acad Sci U S A 549
103:17602–17607. 550
10. Matsumoto T, Hamada M, Osanai M, Fujiwara H. 2006. Essential domains for 551
ribonucleoprotein complex formation required for retrotransposition of telomere-specific 552
non-long terminal repeat retrotransposon SART1. Mol Cell Biol 26:5168–5179. 553
11. Osanai M, Takahashi H, Kojima KK, Hamada M, Fujiwara H. 2004. Essential motifs in 554
the 3′ untranslated region required for retrotransposition and the precise start of reverse 555
transcription in non-long-terminal-repeat retrotransposon. Mol Cell Biol 24:7902–7913. 556
12. Anzai T, Takahashi H, Fujiwara H. 2001. Sequence-specific recognition and cleavage of 557
telomeric repeat (TTAGG)(n) by endonuclease of non-long terminal repeat retrotransposon 558
TRAS1. Mol Cell Biol 21:100–108. 559
13. Maita N, Aoyagi H, Osanai M, Shirakawa M, Fujiwara H. 2007. Characterization of the 560
sequence specificity of the R1Bm endonuclease domain by structural and biochemical studies. 561
Nucleic Acids Res 35:3918–3927. 562
14. Anzai T, Osanai M, Hamada M, Fujiwara H. 2005. Functional roles of 3'-terminal 563
structures of template RNA during in vivo retrotransposition of non-LTR retrotransposon, 564
R1Bm. Nucleic Acids Res 33:1993–2002. 565
15. Matsumoto T, Takahashi H, Fujiwara H. 2004. Targeted nuclear import of open reading 566
frame 1 protein is required for in vivo retrotransposition of a telomere-specific non-long 567
terminal repeat retrotransposon, SART1. Mol Cell Biol 24:105–122. 568
16. Rashkova S, Karam SE, Pardue M-L. 2002. Element-specific localization of Drosophila 569
retrotransposon Gag proteins occurs in both nucleus and cytoplasm. Proc Natl Acad Sci U S A 570
99:3621–3626. 571
17. Burke WD, Müller F, Eickbush TH. 1995. R4, a non-LTR retrotransposon specific to the 572
large subunit rRNA genes of nematodes. Nucleic Acids Res 23:4628–4634. 573
18. Jakubczak JL, Burke WD, Eickbush TH. 1991. Retrotransposable elements R1 and R2 574
interrupt the rRNA genes of most insects. Proc Natl Acad Sci U S A 88:3295–3299. 575
19. Kojima KK, Kuma K-I, Toh H, Fujiwara H. 2006. Identification of rDNA-specific 576
non-LTR retrotransposons in cnidaria. Mol Biol Evol 23:1984–1993. 577
20. Xiong Y, Eickbush TH. 1988. The site-specific ribosomal DNA insertion element R1Bm 578
belongs to a class of non-long-terminal-repeat retrotransposons. Mol Cell Biol 8:114–123. 579
21. Takahashi H, Fujiwara H. 2002. Transplantation of target site specificity by swapping the 580
endonuclease domains of two LINEs. EMBO J 21:408–417. 581
22. Kawashima T, Osanai M, Futahashi R, Kojima T, Fujiwara H. 2007. A novel 582
target-specific gene delivery system combining baculovirus and sequence-specific long 583
interspersed nuclear elements. Virus Res 127:49–60. 584
23. Osanai-Futahashi M, Fujiwara H. 2011. Coevolution of telomeric repeats and telomeric 585
repeat-specific non-LTR retrotransposons in insects. Mol Biol Evol 28:2983–2986. 586
24. Feng Q, Moran J V., Kazazian HH Jr, Boeke JD. 1996. Human L1 retrotransposon encodes 587
a conserved endonuclease required for retrotransposition. Cell 87:905–916. 588
25. Wei W, Gilbert N, Ooi SL, Lawler JF, Ostertag EM, Kazazian HH Jr, Boeke JD, Moran 589
J V. 2001. Human L1 retrotransposition: cis preference versus trans complementation. Mol 590
Cell Biol 21:1429–1439. 591
26. Xing Y, Shi Z. 2011. Nucleocapsid protein VP15 of White spot syndrome virus colocalizes 592
with the nucleolar proteins nucleolin and fibrillarin. Can J Microbiol 57:759–764. 593
doi:10.1139/w11-061. 594
27. Chambeyron S, Bucheton A, Busseau I. 2002. Tandem UAA repeats at the 3'-end of the 595
transcript are essential for the precise initiation of reverse transcription of the I factor in 596
Drosophila melanogaster. J Biol Chem 277:17877–17882. 597
28. Xie Y, Mates L, Ivics Z, Izsvák Z, Martin SL, An W. 2013. Cell division promotes efficient 598
retrotransposition in a stable L1 reporter cell line. Mob DNA 4:10. 599
doi:10.1186/1759-8753-4-10. 600
29. Shi X, Seluanov A, Gorbunova V. 2007. Cell divisions are required for L1 retrotransposition. 601
Mol Cell Biol 27:1264–1270. 602
30. Kubo S, Seleme MDC, Soifer HS, Perez JLG, Moran J V, Kazazian HH Jr, Kasahara N. 603
2006. L1 retrotransposition in nondividing and primary human somatic cells. Proc Natl Acad 604
Sci U S A 103:8036–8041. 605
31. Rashkova S, Karam SE, Kellum R, Pardue M. 2002. Gag proteins of the two Drosophila 606
telomeric retrotransposons are targeted to chromosome ends. J Cell Biol 159:397–402. 607
32. Scott MS, Troshin P V, Barton GJ. 2011. NoD: a Nucleolar localization sequence detector 608
for eukaryotic and viral proteins. BMC Bioinformatics 12:317. 609
doi:10.1186/1471-2105-12-317. 610
33. Earley LF, Kawano Y, Adachi K, Sun X-X, Dai M-S, Nakai H. 2015. Identification and 611
Characterization of Nuclear and Nucleolar Localization Signals in the Adeno-Associated 612
Virus Serotype 2 Assembly-Activating Protein. J Virol 89:3038–3048. 613
doi:10.1128/JVI.03125-14. 614
615
FIGURE LEGENDS 616
FIG. 1. The schematic structure and insertion sites of R7Ag and R1Bm in rDNA unit 617
The schematic structures of the full-length R7Ag (top) and R1Bm elements (bottom) are shown. 618
ORFs are indicated by open boxes. The 5′ untranslated region (UTR) and 3′UTR are indicated by 619
horizontal lines. EN and RT (gray box) denote the endonuclease and reverse transcriptase 620
domains, respectively. Vertical lines represent the cysteine–histidine (CCHC) motifs near the 621
C-termini of both ORFs. R7Ag and R1Bm elements insert into 18S rDNA and 28S rDNA repeats, 622
respectively, in the same orientation. R7Ag includes a 3′ poly(A) tail (A8), whereas R1Bm lacks 623
this tract. The R7Ag ORF1 and ORF2 are separated by a 1-bp interspace, whereas the R1Bm ORFs 624
overlap by 11 bp. A diagram of the rDNA unit is shown in the middle. The double strand 625
sequences of the precise insertion sites are shown. Bottom and top strand cleavage sites generated 626
by endonucleases are indicated by vertical lines. The asterisks indicate the conserved CCAC 627
sequence immediately upstream of the top strand near the cleavage site in both R7Ag and R1Bm. 628
629
FIG. 2. In vivo retrotransposition assay for R7Ag element using recombinant baculovirus 630
(A) A construct containing the ORF1/ORF2/3′UTR-Poly(A)8 portion of R7Ag cloned into 631
AcNPV is designated as R7Ag WT (wild type); R7Ag RTm (reverse transcription mutant) is a 632
similar construct in which the RT portion has been deleted (indicated by dashed line). 633
Retrotransposition to the 18S rDNA in Sf9 cells was detected using the designed R7Ag primer 634
R7Ag1(6091)f and 18SBm(1381)r, indicated by arrows. Predicted cleavage sites are indicated by 635
open triangles. AcNPV, Autographa californica nuclear polyhedrosis virus; PpH, polyhedrin 636
promoter; 6xHis, hexa histidine-tag. 637
(B) Results of PCR after in vivo retrotransposition. PCR amplification of the 3′ boundaries 638
between the transposed R7Ag copies and the 18S gene. The smeared PCR product band in lane 639
R7Ag WT indicates the retrotransposition event. 640
(C) 3′ junction clones obtained from an R7Ag WT PCR transposition product. The top of the 641
figure includes a diagram of the R7Ag WT 3′UTR and poly(A)8 sequences (open boxes), 642
followed by the vector sequences and 18S rDNA target site indicated by gray and black boxes, 643
respectively. The accurate reverse transcription initiation site (the left dotted vertical line) is 644
indicated by RT: 0. The putative accurate cleavage site of the target DNA regions (the right dotted 645
vertical line) is indicated by EN: 0. The number of clones containing each insertion type is 646
indicated in the furthest right column. The observed nucleotide positions of the reverse 647
transcription and insertion sites in each clone are numbered according to RT: 0 and EN: 0 as 648
position 0. In the fourth clone, the reverse transcription starts from the accurate site with 649
poly(A)17. 650
651
FIG. 3. Ex vivo retrotransposition assay for the R7Ag element 652
(A) A comparison of the Anopheles gambiae (Ag) and Sf9 cell target site sequences for R7Ag. 653
Vector NTI (Invitrogen) was used to align the 120-bp target site sequences of Ag and Sf9 cells. 654
Variant nucleotides are shaded in gray. The boxed region represents a 15-bp R7Ag target site 655
duplication (TSD) (6), which is caused by staggered nicks on the bottom strand (left end of the 656
box) and the top strand (right end of the box) (see Fig. 1). A three-nucleotide substitution in the 657
TSD sequence is indicated by asterisks. The region of the 18S rDNA target conserved between the two 658
species is underlined. The Ag 18S rDNA mutant was derived from the Ag 18S rDNA target 659
construct by replacing only the three nucleotides marked with asterisks with the 660
corresponding nucleotides in Sf9 rDNA (A to C, A to G, T to C). 661
(B) Schematic of the ex vivo retrotransposition assay. First, the Sf9 18S rDNA-pGEMT, Ag 18S 662
rDNA-pGEMT or Ag 18S rDNA mutant-pGEMT plasmid was transfected independently into Sf9 663
cells. Subsequently, these Sf9 cells were infected with a baculovirus containing R7Ag WT. After purifying 664
the plasmid DNA from these cells, retrotransposition of R7Ag into the plasmids was detected by 665
nested PCR to amplify the 3′ junction region with two sets of primers indicated by arrows. 666
(C) Results of ex vivo retrotransposition of R7Ag into the target plasmid Sf9 18S rDNA-pGEMT 667
(left panel), Ag 18S rDNA-pGEMT (middle panel) and Ag 18S rDNA mutant-pGEMT (right 668
panel). In the middle and right panels, two concentrations of plasmid DNA were used as the template 669
for PCR: lane 2 contains a 10-fold dilution of the plasmid used in lane 1. M indicates marker. 670
671
FIG. 4. Ex vivo retrotransposition analyses revealed target sequence preference of R7Ag 672
(A) 3′ junction clones obtained from the PCR product for Sf9 18S rDNA target (see Fig. 3B). The 673
results showed that both the insertion sites and reverse transcription initiation patterns were 674
inaccurate. 675
(B) 3′ junction clones obtained from the PCR product for Ag 18S rDNA target (see Fig. 3B). The 676
results showed that almost all clones (30 of 31 clones; the bottom clone showed inaccurate 677
insertion) inserted accurately into the target site, but reverse transcription initiation was 678
relatively inaccurate (22 of 31 clones; the top clone showed accurate reverse transcription). 679
(C) 3′ junction clones obtained from the PCR product for Ag 18S rDNA mutant target (see Fig. 680
3B). The results showed that all clones (30 clones) inserted inaccurately into the target site. 681
682
FIG. 5. A long poly(A) tail is essential for accurate reverse transcription of R7Ag 683
(A) R7Ag WT and R7Ag A20 constructs for ex vivo retrotransposition analysis. In contrast to the 684
8-nt poly(A) tail of the former, the latter construct has a 20-nt poly(A) tail. 685
(B) Results of ex vivo retrotransposition of R7Ag A20. The results for R7Ag WT are shown in 686
Fig. 3C (left panel) and Fig. 4B. 687
(C) 3′ junction clones obtained from the R7Ag A20 PCR transposition product. All 12 clones 688
exhibited accurate reverse transcription initiation (RT=0) and accurate insertion into the target 689
(EN = 0). Compare this result with Fig. 4B (R7Ag WT with 8-nt poly(A) tail). 690
691
FIG. 6. Subcellular localization of R7Ag and R1Bm ORFs in transiently transfected Sf9 692
cells 693
(A) R7Ag and R1Bm constructs for the subcellular localization analysis. ORF1p is fused with an 694
HA tag, and ORF2p is fused with a 3xFLAG tag. 695
(B) Subcellular localization of the R7Ag and R1Bm ORF1p. Immunofluorescence of ORF1p in 696
Sf9 cells was analyzed at 72 h post-transfection. Representative images of R7Ag (b) and R1Bm 697
(e) ORF1p (red images) in transfected cells are shown. DAPI (blue) was used to stain nuclear 698
DNA (a and d). A merged image is shown at the far right (c and f). Scale bar; 5 μm. 699
(C) Subcellular localization of the R7Ag and R1Bm ORF2p. Representative images of R7Ag (b 700
and e) and R1Bm (h and k) ORF2p (green) in transfected cells are shown. DAPI (blue) was used 701
to stain nuclear DNA (a, d, g and j). A merged image is shown at the far right (c, f, i and l). Scale 702
bar; 5 μm. Two types of ORF2p localization patterns were observed. Type I, cytoplasmic 703
localization (b and h); Type II, nuclear localization with dotted signals (e and k). In Type II of 704
R1Bm, weak cytoplasmic signals were also observed (k and l). 705
706
FIG. 7. Co-localization of R7Ag and R1Bm ORF2p with their own ORF1p and the nucleolar 707
marker protein fibrillarin 708
(A) Co-localization of ORF1p and ORF2p. Co-transfection of the ORF1p and ORF2p constructs 709
yielded two types of localization patterns for both R7Ag and R1Bm. A 710
merged image is shown at the right, and co-localization of ORF1p (red) and ORF2p (green) is shown 711
as yellow signals. In type I, ORF1p was localized in the cytoplasm (a and g) and ORF2p was 712
localized predominantly in nuclei but partly in the cytoplasm (b and h). A few signals for 713
co-localization of ORF1p and ORF2p were observed in the peripheral region outside the nuclear 714
membrane (c and i, arrowheads). In type II, ORF1p was localized predominantly in the cytoplasm 715
but partly in nuclei (d and j) and ORF2p was localized in nuclei (e and k). Co-localization signals 716
in type II were observed in nuclei (f and l, arrowheads). The co-localization model is shown at 717
the right end. Yellow dot, co-localization signal; C, cytoplasm; N, nucleus. 718
(B) A rabbit polyclonal anti-fibrillarin antibody specifically detects an endogenous fibrillarin 719
protein in extracts from HeLa cells, Drosophila embryos and Sf9 cells. α-tubulin is shown as a gel 720
loading control. 721
(C) Immunofluorescence analysis of Sf9 cells transfected with R7Ag and R1Bm ORF2 constructs. 722
Images of cells stained with antibodies against the endogenous nucleolus component fibrillarin 723
(red) are shown. R7Ag and R1Bm ORF2p (green) are indicated. DAPI (blue) was used to stain 724
nuclear DNA. A merged image is shown in the rightmost columns. Co-localization of ORF2p and 725
fibrillarin is shown by arrows (yellow). 726
727
FIG. 8. Hypothetical models for the role of poly(A) tail in the initiation process of reverse 728
transcription in R7Ag. 729
(A) A model for transcription and reverse transcription of the endogenous R7Ag. 730
(B) A model for transcription and reverse transcription of the recombinant baculovirus R7Ag 731
construct. 732
733
TABLE 1. Primer Sequence List

Name                    Sequence (5' to 3')

For plasmid construction
R7Ag1(5172)f            CAGGGGACTTTCCAGGAGT
R7-3'A20Nsi1Xba1        TTCTAGAATGCATTTTTTTTTTTTTTTTTTTTTAATACTTAAGGATTT
Ag18SrDNAforR7F         GGAAGTATGGTTGCAAAGTTGAAACTTAAAGGAATTGACGGAAGGGCACCACAAGAAGTGGAGCTTGCGG
Ag18SrDNAforR7R         CCTCAATAAGTTCGGACCTGGTAAGTTTTCCCGTGTTGAGTCAAATTAAGCCGCAAGCTCCACTTCTTGT
Ag18SrDNAforR7F-mt      GGAAGTATGGTTGCAAAGTTGAAACTTAAAGGAATTGACGGAAGGGCACCACCAGGAGTGGAGCCTGCGG
Ag18SrDNAforR7R-mt      CCTCAATAAGTTCGGACCTGGTAAGTTTTCCCGTGTTGAGTCAAATTAAGCCGCAGGCTCCACTCCTGGT
Sf918SrDNAforR7F        GGGAGTATGGTTGCAAAGCTGAAACTTAAAGGAATTGACGGAAGGGCACCACCAGGAGTGGAGCCTGCGG
Sf918SrDNAforR7R        CCTTCCGGTGTCCGGGCCTGGTGAGATTTCCCGTGTTGAGTCAAATTAAGCCGCAGGCTCCACTCCTGGT
EcoRI-KOZAK-R71-F       AGAATTCGCCACCATGGATAAGCAACTGAGAGGAAGGAC
NotI-R71-R              AAAAAAAAGCGGCCGCAGTTCGAGGGTTGTCCACCGC
SpeI-KOZAK-R11-F        AACTAGTGCCACCATGTCGGAGGAGGAGAGGGAGC
EcoRI-R11-R-2           AAAAAAAAGAATTCCCATATCCATACTCGACCTGATTTAGGAAG
KpnI-KOZAK-3xFLAG-B-S   CGCCACCATGGACTACAAAGACCATGACGGTGATTATAAAGATCATGACATCGATTACAAGGACGATGACGACAAGG
KpnI-KOZAK-3xFLAG-B-AS  GATCCCTTATCGTCATCGTCCTTGTAATCGATGTCATGATCTTTATAATCACCGTCATGGTCTTTGTAGTCCATGGTGGCGGTAC
SpeI-R7O2S              TTACTAGTATGGAAGTGCTACAGATCAA
EcoRI-R7O2A+3'UTR       AAGAATTCTTTTTTTTAATACTTAAGGATT
BamHI-R1O2S             AAAAAGGATCCATGGATATTAGGCCCCGAC
R1 A5136(NotI)          AAAAAGCGGCCGCTTCCCACCACCTCCCATGGTCCCACCAACCTTGC

For in vivo retrotransposition assay
R7Ag1(6091)f            GGCGTGAGATGGAGCGGCTA
18SBm(1381)r            AGACAAATCGCTCCACCAAC

For ex vivo retrotransposition assay
R7Ag1(6349)f            GATGTCACAAGTGACACATACTCCTGGTTC
R7Ag1(6402)f            CAGGAGCAGGAGTGGGGATTTAAC
Ag18SR                  CTCAATAAGTTCGGACCTGGTAAGTTTTC
Ag18S/pGEM R            CACTAGTGATTCCTCAATAAGTTCGGACCTG
Sf918SR                 GTGTCCGGGCCTGGTGAGATTTC
Sf918S/pGEM R           CACTAGTGATTCCTTCCGGTGTCCG
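
Two properties of the long 18S rDNA oligonucleotides in TABLE 1 can be checked directly from the listed sequences. The Python sketch below is illustrative only and is not part of the authors' methods: it measures how many 3'-terminal bases of Ag18SrDNAforR7F are complementary to the 3' end of Ag18SrDNAforR7R, and it lists the positions at which Ag18SrDNAforR7F-mt differs from Ag18SrDNAforR7F. The function and variable names are hypothetical, and only the uppercase bases A, C, G and T are handled.

```python
# Sequences copied from TABLE 1. All names below are illustrative assumptions.
AG_F = "GGAAGTATGGTTGCAAAGTTGAAACTTAAAGGAATTGACGGAAGGGCACCACAAGAAGTGGAGCTTGCGG"
AG_R = "CCTCAATAAGTTCGGACCTGGTAAGTTTTCCCGTGTTGAGTCAAATTAAGCCGCAAGCTCCACTTCTTGT"
AG_F_MT = "GGAAGTATGGTTGCAAAGTTGAAACTTAAAGGAATTGACGGAAGGGCACCACCAGGAGTGGAGCCTGCGG"

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overlap(forward, reverse):
    """Length of the longest suffix of `forward` that equals a prefix of
    the reverse complement of `reverse` (i.e. the two 3' ends can pair)."""
    rc = reverse_complement(reverse)
    for k in range(min(len(forward), len(rc)), 0, -1):
        if forward[-k:] == rc[:k]:
            return k
    return 0


def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


overlap = three_prime_overlap(AG_F, AG_R)
mismatches = mismatch_positions(AG_F, AG_F_MT)
print("3' overlap (nt):", overlap)
print("mismatch positions in the -mt oligonucleotide:", mismatches)
```

With the sequences as listed, the two oligonucleotides share a 20-nt complementary region at their 3' ends, and the -mt version differs from the wild-type version at three positions, all within the final 20 nt.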
TABLE 2. R7Ag 3' Junction Sequences Determined In Silico

Class  Type  R7Ag 3' UTR                   Poly(A)  Target site (18S rDNA)  Clone number
I      1     AATAAATCCTTAGTATT             8        AAGAAGTGGAGCTTG         4
       2     AATAAATCCTTAGTATT             9        AAGAAGTGGAGCTTG         1
       3     AATAAATCCTTAGTATT             11       AAGAAGTGGAGCTTG         1
       4     AATAAATCCTTAGTATT             12       AAGAAGTGGAGCTTG         2
       5     AATAAATCCTTAGTATT             13       AAGAAGTGGAGCTTG         2
       6     AATAAATCCTTAGTATT             14       AAGAAGTGGAGCTTG         3
       7     AATAAATCCTTAGTATT             15       AAGAAGTGGAGCTTG         2
       8     AATAAATCCTTAGTATT             16       AAGAAGTGGAGCTTG         2
       9     AATAAATCCTTAGTATT             25       AAGAAGTGGAGCTTG         2
II     10    AATAAATCCTTAGTGTTAAC          6        AAGAAGTGGAGCTTG         1
       11    AATAAATCCTTAGTGTTAAC          10       AAGAAGTGGAGCTTG         1
       12    AATAAATCCTTAGTGTTAAC          11       AAGAAGTGGAGCTTG         2
       13    AATAAATCCTTAGTGTTAAC          13       AAGAAGTGGAGCTTG         13
       14    AATAAATCCTTAGTGTTAAC          13       AAGAAGTGGAGCTTG         1
       15    AATAAATCCTTAGTGTTAAC          15       AAGAAGTGGAGCTTG         6
       16    AATAAATCCTTAGTGTTAAC          18       AAGAAGTGGAGCTTG         1
       17    AATAAATCCTTAGTGTTAAT          6        AAGAAGTGGAGCTTG         1
       18    AATAAATCCTTAGTGTTAAT          9        AAGAAGTGGAGCTTG         1
III    19    AATAAATCCTTAGTGTTAACAAAAGAAC  5        AAGAAGTGGAGCTTG         1
       20    AATAAATCCTTAGTGGTAGCGG        11       AAGAAGTGGAGCTTG         1
       21    AATAAATCCTTAGTAGT             13       AAGAAGTGGAGCTTG         1
Total                                                                       49

Class: categorized by 3' UTR variation.
Type: categorized by poly(A) tail variation.
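
The totals in TABLE 2 can be re-derived from the individual rows. The following Python sketch is offered only as a consistency check, assuming that the PolyA column gives the tail length in nucleotides; the tuple layout and variable names are illustrative and do not come from the source.

```python
from collections import Counter

# (type, class, poly(A) value, clone number), transcribed from TABLE 2.
# Assumption: the poly(A) value is the tail length in nucleotides.
ROWS = [
    (1, "I", 8, 4), (2, "I", 9, 1), (3, "I", 11, 1), (4, "I", 12, 2),
    (5, "I", 13, 2), (6, "I", 14, 3), (7, "I", 15, 2), (8, "I", 16, 2),
    (9, "I", 25, 2),
    (10, "II", 6, 1), (11, "II", 10, 1), (12, "II", 11, 2),
    (13, "II", 13, 13), (14, "II", 13, 1), (15, "II", 15, 6),
    (16, "II", 18, 1), (17, "II", 6, 1), (18, "II", 9, 1),
    (19, "III", 5, 1), (20, "III", 11, 1), (21, "III", 13, 1),
]

# Sum the clone numbers within each 3' UTR class.
clones_per_class = Counter()
for _type, cls, _polya, clones in ROWS:
    clones_per_class[cls] += clones

total_clones = sum(clones_per_class.values())
polya_values = [polya for _type, _cls, polya, _clones in ROWS]

print(dict(clones_per_class))
print("total clones:", total_clones)
print("poly(A) range:", min(polya_values), "to", max(polya_values))
```

The per-class sums (19, 27 and 3 clones for classes I, II and III) add up to the reported total of 49, and the listed poly(A) values span 5 to 25.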
TABLE 3. R7Ag 5' Junction Sequences Determined In Silico

Type   Target site (18S rDNA)  R7Ag 5' UTR  Clone number
1      AAGAAGTGGAGCTTG         CAGTCGCAAT   41
2      AAGAAGTGGAGCTTGCG       CAGTCGCAAT   8
3      AAGAAGTGGAGCTTGCG       AGTCGCAAT    2
4      AAGAAGTGGAGCTTGCG       TGTCGCAAT    1
Total                                       52
TABLE 4. Localization Pattern of R7Ag and R1Bm Element Proteins

Construct   nt   Type I (Cytoplasmic)  Type II (Nuclear)
                 nc (%)                nc (%)
R7Ag ORF1   25   25 (100%)             0
R1Bm ORF1   20   20 (100%)             0
R7Ag ORF2   17   8 (47.0%)             9 (53%)
R1Bm ORF2   22   9 (40.9%)             13 (59.1%)

nt: Total number of independent transfections observed.
nc (%): No. of counted cells (% of cells).
R7Ag ORF2p Type I: Broad cytoplasmic localization.
R7Ag ORF2p Type II: Nuclear localization with some dotted pattern.
R1Bm ORF2p Type I: Broad cytoplasmic localization.
R1Bm ORF2p Type II: Broad cytoplasmic localization with some dotted nuclear pattern.
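
The percentages in TABLE 4 follow from the cell counts as nc / nt × 100. The Python sketch below recomputes them and compares each with the reported value. It is a sanity check under stated assumptions, not the authors' procedure: the dictionary layout is hypothetical, the Type II entries printed as "0" are treated as 0%, and a tolerance of 0.1 percentage points is assumed because the rounding convention used in the table is not stated.

```python
# Counts and reported percentages transcribed from TABLE 4.
# Layout: construct -> (nt, [(nc, reported %) for Type I, Type II]).
TABLE4 = {
    "R7Ag ORF1": (25, [(25, 100.0), (0, 0.0)]),
    "R1Bm ORF1": (20, [(20, 100.0), (0, 0.0)]),
    "R7Ag ORF2": (17, [(8, 47.0), (9, 53.0)]),
    "R1Bm ORF2": (22, [(9, 40.9), (13, 59.1)]),
}

TOLERANCE = 0.1  # percentage points; an assumed allowance for rounding


def percent(count, total):
    """Share of `count` in `total`, expressed in percent."""
    return 100.0 * count / total


deviations = {}
for construct, (nt, cells) in TABLE4.items():
    # The Type I and Type II counts should account for every observation.
    assert sum(nc for nc, _ in cells) == nt
    for label, (nc, reported) in zip(("Type I", "Type II"), cells):
        computed = percent(nc, nt)
        deviations[(construct, label)] = abs(computed - reported)
        print(f"{construct} {label}: {computed:.1f}% (reported {reported}%)")

max_deviation = max(deviations.values())
print("largest deviation (percentage points):", round(max_deviation, 3))
```

All recomputed values agree with the table to within the assumed tolerance; the largest difference is for R7Ag ORF2, where 8/17 and 9/17 are about 47.06% and 52.94%.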
TABLE 5. RNP Localization Pattern of R7Ag and R1Bm Elements

Construct            nt   Type I (Cytoplasmic)  Type II (Nuclear)  Non-overlap
                          nc (%)                nc (%)             nc (%)
R7Ag Co-expression   32   19 (59.3%)            4 (12.5%)          9 (28.2%)
R1Bm Co-expression   20   8 (40%)               3 (15.0%)          9 (45%)

nt: Total number of independent transfections observed.
nc (%): No. of counted cells (% of cells).
Type I: Dotted cytoplasmic localization.
Type II: Dotted nuclear localization.
Non-overlap: The fluorescence signals of the two proteins overlap too little for
co-localization to be visually apparent.