paramecium mitochondrial genes - journal of biological ...paramecium mitochondrial genes 11. large...

10
Vol. 259, No. 8, Issue of April 25. pp. 5173-5181. 1984 Printed in U.S.A. THE JOURNAL OF BIOLOGICAL CHEMISTRY fi 1984 hy The American Society of Biological Chemists, Inc. Paramecium Mitochondrial Genes 11. LARGESUBUNITrRNAGENESEQUENCE AND MICROEVOLUTION* (Received for publication, September 6, 1983) Jeffrey J. Seilhamerl, Robin R.GutellOlI, and Donald J. Cummings// From the University of Colorado Health Sciences Center, Department of Microbiology and Immunology, B-175, Denver. Colorado 80262 and TThinmann Laboratories, University of California, Santa Cruz, Santa Cruz, California 95064 Mature Paramecium mitochondrial large subunit rRNA consists of two stable segments: a 20 S segment described previously and a unique 283-base segment similar to 5.8 S rRNAs typically found in eucaryotic cytoplasmic RNA. pBR325 clones of both gene regions from both Paramecium primaurelia and Paramecium tetraurelia were sequenced and aligned. The gene seg- ments lie adjacent to each other very near the replica- tive terminal end of the linear Paramecium mitochon- drial genome and are transcribed from a common 23 S precursor. The precise gene ends were determined us- ing nuclease Sl protection; the large subunit rRNA gene complex (consisting of “5.8 S-like” rRNA, a 19- 26-base excised region, and 20 S rRNA) spans about 2654 base pairs. The gene complex is preceded by a 15-base poly(T)tract and terminates randomly within a 20-base A + T-rich segment immediately preceding the tRNATyr gene. The sequences from the two species were 4% divergent, the changes consisting of 59% transitions, 38% transversions, and 3% insertions or deletions. The sequences were aIigned with Esche- richia coli 23 S rRNA, and a secondary structure model is presented for the entire molecule based on structures proposed for E. coli 23 S rRNA. Increasing attention has been focused upon the structure, function, and evolution of large subunit rRNAs. During the last few years, complete nucleotide sequences for many large subunit rRNAs have beendetermined (1-13). Simultaneously, comparative analysis of these sequences has been used to find a consensus secondary structure for large subunit rRNA (14- 16). The evolution of large subunit rRNAs is particularly interesting in that it has exhibited both fragmentation (i.e. cytoplasmic 5.8 S rRNA (17-24), 7 and 3 S rRNA in Chlu- mydomonas chloroplasts (25), 4.5 S rRNA in plant chloro- plasts (3,26,44)) and the occurrence of one or multiple introns (27-32). Considerations of this fragmented nature of large subunit rRNA evolution with respect to its secondary struc- ture have begun to reveal the strong inter-relationship be- tween the structure and evolution of these rRNAs. We have presented previously (33) the sequence of the * The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. $Supported by National Institutes of Health Grant AG-05221. Present address, California Biotechnology,Inc., 2450 Bayshore Front- age Road, Mountain View, CA 94043. 9 Supported by National Institutes of Health Grant GM-17129 awarded to H. Noller. 11 Supported by National Institutes of Health Grant GM-21948. To whom reprint requests should be addressed. mitochondrial 20 S rRNA from Paramecium aurelia, species 1, or primaurelia. Subsequent work has led to theconclusion that the entirelarge subunit rRNA gene complex consists of this 20 S region plus an additional 283-base region located immediately upstream in the genome. Here we present both the data which support this finding and the sequence of the entire 23 S rRNA gene complex for two divergent species of Paramecium: primaurelia and tetraurelia. Our reasons for sequencing two such divergent species are discussed in the accompanying paper (45) which also describes a similar anal- ysis of Paramecium mitochondrial small subunit rRNA. In addition, the approximate starts and ends of rDNA have been determined by nuclease S1 protection. A model for large subunit rRNA secondary structure is proposed, based on comparative sequence analysis. The Paramecium secondary structure, except for a few deleted or truncated helices and two regions of undetermined structure, is very similar to a proposed model for Escherichia coli 23 S rRNA (14). Finally, transcription of Paramecium mitochondrial rRNA genes is considered. EXPERIMENTAL PROCEDURES All methods used were as described elsewhere (33, 45). The 5’ region of P. primaurelia large subunit rRNA was cloned into E. coli plasmid vector pUR222 by Inara Lazdins in our laboratory, using construction and containment methods compatible with National Institutes of Health guidelines. RESULTS Large Subunit rDNA Location-In order to locate large subunit rDNA, appropriate cloned DNA fragments were la- beled and hybridized to Northern blotsof Paramecium mito- chondrial RNA (Fig. 1). When species 1 HindIII fragment 5 (IH5) or its homologous fragment 3R from species 4 (4H3R) was used as a probe (lane C), the abundant 20 S RNA band hybridized, along with a trace of a larger fragment (labeled 23 S) which could represent a precursor. When the twosmaller HindIII fragments upstream of 1H5 and 4H3R were used as probes (lane A), however, the same 23 S band was present, along with a major small molecular weight band of about 300 bases in size. This result suggested that 20 S rDNA was part of a larger 23 S precursor which gives rise to both the 20 S segment and a -300-base segment and that the cleavage site which separates these two segmentsmust coincide closely with HindIII site C in the DNA sequence (Fig. 1). The DNA sequences surrounding HindIII sites A-C were nearly identi- cal in both species, and the identical result was obtained in both cases. Therefore, the “5.8 S-like” and 20 S rRNA genes lie in tandem, beginning -3.6 kilobase pairs from the replicative terminal end of mitochondrial DNA inbothParamecium species. The 3’ end of the large subunit rRNA gene complex 5173 by guest on May 17, 2020 http://www.jbc.org/ Downloaded from

Upload: others

Post on 17-May-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

Vol. 259, No. 8, Issue of April 25. pp. 5173-5181. 1984 Printed in U.S.A.

THE JOURNAL OF BIOLOGICAL CHEMISTRY fi 1984 hy The American Society of Biological Chemists, Inc.

Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION*

(Received for publication, September 6, 1983)

Jeffrey J. Seilhamerl, Robin R. GutellOlI, and Donald J. Cummings// From the University of Colorado Health Sciences Center, Department of Microbiology and Immunology, B-175, Denver. Colorado 80262 and TThinmann Laboratories, University of California, Santa Cruz, Santa Cruz, California 95064

Mature Paramecium mitochondrial large subunit rRNA consists of two stable segments: a 20 S segment described previously and a unique 283-base segment similar to 5.8 S rRNAs typically found in eucaryotic cytoplasmic RNA. pBR325 clones of both gene regions from both Paramecium primaurelia and Paramecium tetraurelia were sequenced and aligned. The gene seg- ments lie adjacent to each other very near the replica- tive terminal end of the linear Paramecium mitochon- drial genome and are transcribed from a common 23 S precursor. The precise gene ends were determined us- ing nuclease S l protection; the large subunit rRNA gene complex (consisting of “5.8 S-like” rRNA, a 19- 26-base excised region, and 20 S rRNA) spans about 2654 base pairs. The gene complex is preceded by a 15-base poly(T) tract and terminates randomly within a 20-base A + T-rich segment immediately preceding the tRNATyr gene. The sequences from the two species were 4% divergent, the changes consisting of 59% transitions, 38% transversions, and 3% insertions or deletions. The sequences were aIigned with Esche- richia coli 23 S rRNA, and a secondary structure model is presented for the entire molecule based on structures proposed for E. coli 23 S rRNA.

Increasing attention has been focused upon the structure, function, and evolution of large subunit rRNAs. During the last few years, complete nucleotide sequences for many large subunit rRNAs have been determined (1-13). Simultaneously, comparative analysis of these sequences has been used to find a consensus secondary structure for large subunit rRNA (14- 16). The evolution of large subunit rRNAs is particularly interesting in that it has exhibited both fragmentation (i.e. cytoplasmic 5.8 S rRNA (17-24), 7 and 3 S rRNA in Chlu- mydomonas chloroplasts (25), 4.5 S rRNA in plant chloro- plasts (3,26,44)) and the occurrence of one or multiple introns (27-32). Considerations of this fragmented nature of large subunit rRNA evolution with respect to its secondary struc- ture have begun to reveal the strong inter-relationship be- tween the structure and evolution of these rRNAs.

We have presented previously (33) the sequence of the

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

$Supported by National Institutes of Health Grant AG-05221. Present address, California Biotechnology, Inc., 2450 Bayshore Front- age Road, Mountain View, CA 94043.

9 Supported by National Institutes of Health Grant GM-17129 awarded to H. Noller.

11 Supported by National Institutes of Health Grant GM-21948. To whom reprint requests should be addressed.

mitochondrial 20 S rRNA from Paramecium aurelia, species 1, or primaurelia. Subsequent work has led to the conclusion that the entire large subunit rRNA gene complex consists of this 20 S region plus an additional 283-base region located immediately upstream in the genome. Here we present both the data which support this finding and the sequence of the entire 23 S rRNA gene complex for two divergent species of Paramecium: primaurelia and tetraurelia. Our reasons for sequencing two such divergent species are discussed in the accompanying paper (45) which also describes a similar anal- ysis of Paramecium mitochondrial small subunit rRNA. In addition, the approximate starts and ends of rDNA have been determined by nuclease S1 protection. A model for large subunit rRNA secondary structure is proposed, based on comparative sequence analysis. The Paramecium secondary structure, except for a few deleted or truncated helices and two regions of undetermined structure, is very similar to a proposed model for Escherichia coli 23 S rRNA (14). Finally, transcription of Paramecium mitochondrial rRNA genes is considered.

EXPERIMENTAL PROCEDURES

All methods used were as described elsewhere (33, 45). The 5’ region of P. primaurelia large subunit rRNA was cloned into E. coli plasmid vector pUR222 by Inara Lazdins in our laboratory, using construction and containment methods compatible with National Institutes of Health guidelines.

RESULTS

Large Subunit rDNA Location-In order to locate large subunit rDNA, appropriate cloned DNA fragments were la- beled and hybridized to Northern blots of Paramecium mito- chondrial RNA (Fig. 1). When species 1 HindIII fragment 5 ( I H 5 ) or its homologous fragment 3R from species 4 (4H3R) was used as a probe (lane C), the abundant 20 S RNA band hybridized, along with a trace of a larger fragment (labeled 23 S) which could represent a precursor. When the two smaller HindIII fragments upstream of 1H5 and 4H3R were used as probes (lane A ) , however, the same 23 S band was present, along with a major small molecular weight band of about 300 bases in size. This result suggested that 20 S rDNA was part of a larger 23 S precursor which gives rise to both the 20 S segment and a -300-base segment and that the cleavage site which separates these two segments must coincide closely with HindIII site C in the DNA sequence (Fig. 1). The DNA sequences surrounding HindIII sites A-C were nearly identi- cal in both species, and the identical result was obtained in both cases.

Therefore, the “5.8 S-like” and 20 S rRNA genes lie in tandem, beginning -3.6 kilobase pairs from the replicative terminal end of mitochondrial DNA in both Paramecium species. The 3’ end of the large subunit rRNA gene complex

5173

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 2: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

5174 Paramecium L a 1

A B C

"23 S" 20 s

0 1 2 3 4 n k b

1 l k 3 E 1 I 1 1H5 I H

4E3 I E I I 4H3R I I H

A B C D FIG. 1. Paramecium mitochondrial large subunit rRNA and

rDNA map. Upper, total mitochondrial RNA from P. tetraurelia (lane E ) was Northern-blotted from a 0.75% methyl mercury 1.5% agarose gel and hybridized with labeled species 4 HindIII DNA fragment 3R (4H3R) in lane C. Lane A shows the same Northern blot hybridized with the upstream -200-bp DNA fragment (connecting HindIII sites B and C ) . Lower, simplified restriction map of EcoRI ( E ) and HindIII ( H ) sites within the large subunit rDNA region is shown, and key conserved HindIII sites are indicated (A-D). The position of the 5.8 S-like/ZO S discontinuity is indicated by the oertical dotted line.

was delimited in both species by the tRNATyr gene, the last known gene in the genome, at -1 kilobase pair from the mitochondrial DNA end (see Ref. 33). An additional small RNA band is visible just below the 5.8 S-like rRNA band. We suspect this might be a mitochondrial 5 S rRNA based upon its size, but we have not been able to locate its gene on mitochondrial DNA.

Large Subunit rRNA Sequence-The sequence of the entire large subunit rDNA and 3'-flanking region for both species, as determined by the Maxam and Gilbert (46) method, is shown in Fig. 2. Most regions of the sequence have been determined from at least two different label points and from

-ge Subuni t rRNA

both strands wherever possible. Several regions could be de- termined only from one label point, due to the lack of useful restriction sites; in these cases, the integrity of the sequence is supported by homology between the species. The HindIII sites referred to above are located a t bases 300 (site B ) and 500 (site C ) . The species 1 sequence beyond base 500 corre- sponds to the previously published sequence for 20 S rRNA (33). No transcript has been mapped to the rather large region following the tRNA tyr gene. The species 4 sequence extends to an EcoRI site which lies -600 bases from the end of the genome. The remaining genomic terminal segment has proved difficult to isolate and/or clone.

Overall, the rDNA sequence is 63% A + T (identical to the genomic A + T content) and is 4.0% divergent between the two species. The divergence consisted of 59% transitions, 38% transversions, and 3% insertions or deletions.

rDNA Boundary Determination-The ends of rDNA were determined by nuclease S1 protection of appropriate end- labeled DNA fragment probes by total mitochondrial RNA. Appropriate restriction sites were available in all cases, allow- ing precise mapping of the gene boundaries. The nucleotides corresponding to observed protected lengths as determined in Figs. 3-6 are indicated in Fig. 2. The 5.8 S-like rDNA 5' end (Fig. 3), 3' end (Fig. 4), and 20 S rDNA 5' end (Fig. 5) all produced a small number of bands, indicating relatively dis- crete termini. Apparently, these gene boundaries express very precise transcription initiation and/or processing. Precise de- termination of these gene termini ultimately will require RNA end analysis. In contrast, -20 roughly equimolar bands were produced at the 3' end of 20 S rDNA (Fig. 61, spanning bases 2861-2880 in species 4 (Fig. 2). Such a pattern probably reflects heterogeneous transcription termination or process- ing, terminating randomly anywhere within the last 20 bases before the tRNATyr gene. Also, a protected band whose length included the tRNATyr gene was observed (Fig. 6). This band could have resulted either from a polycystronic RNA or two separate but contiguous RNAs. Both cases suggest the tRNA and rRNA may be derived from the same nascent transcript.

In total, the entire large subunit rDNA region includes 2654 bp,' including the 19-26-bp excised region at the 5.8 S-like/ 20 S rDNA junction. Note that although exact alignment of both rDNA ends with E. coli was not possible in Fig. 2 (see below), the lengths of the unaligned segments were nearly identical in both cases; therefore, the E. coli and Paramecium large subunit rRNA termini coincide precisely a t both 5' and 3' ends. Interestingly, the intergenic regions (i.e. the 15 bases immediately upstream of 5.8 S-like rDNA, the 12-17-base excised region, and the 15 bases immediately before the tRNATyr gene) are nearly 100% A + T and consist of frequent runs of Ts.

Large Subunit rRNA Secondary Structure-The Parame- cium sequence was aligned with E. coli 23 S rRNA in Fig. 2 based on homologous primary and secondary structure (1,14). The secondary structure of Paramecium large subunit rRNA was constructed (Fig. 7) using comparative analysis of 12 organisms (two eubacteria, two chloroplasts, four mammalian mitochondria, two fungal mitochondria, and two eucaryotic cytoplasmic rRNAs (1-14)). Alignment of the two sequences was done only where it was supported by secondary structure and/or phylogenetic data. Two blocks of Paramecium se- quence (bases 1523-1642 and 2770-2880, species 1 numbers) remain unaligned and unstructured. These blocks correspond to structurally variable regions, and since there is no obvious comparative evidence that can be used at this time to deduce a consensus structure for these regions, we have presented

The abbreviation used is: bp, base pair.

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 3: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

Paramecium Large Subunit rRNA 5175

TTACTGATGCTACGAGTGCATGCATACGTTCTTTGTTGTTACTGTTCAGTTTTCTGCATTCTTTACTATTGTATTTTGACTCTTCTTACTCTTCTATACT TTACTGATGCTATGAGTGCATGCATACTTTTTTTGTAGTTACTGTACAGTTTTCAGCATTCTTTACTATTGTTTTTTGATTGTTCTTACTCTTCTATACT .................................................................................................... ....................................................................................................

TTTTTTGTTTATGAGAAGTACGAGCACCACTTTAACAATATTACTAAGATGCTCAAAMGCTCATTGATGAGCTTAAGTCCCTCAAGAAGGACTCCTTTT TTTTTTGTTTACGAGAAGTATGAGCACCACTTCGACAATATMCAAAGATGCATAAAAAGCTCATTGAAGAGCTTAAGTCC~AAAGAAGGACTCCTTTT .................................................................................................... ....................................................................................................

t start "58s" . AATTATGATTTTTTTTTTTTTTGAAGTC.. ...... .TAACCAAGCACTAGACGGATGCCTAAAAATCTTGGTTGAGGGCGTAAATATAAATACGAG. ATTTGTATGTAA AATTTTGATTTTTTTTTTTECAGTC.. ....... TAACAAAGCACTAGACGGATGCCTAAAAATCTTGGTTGAGGGCGT~TCTAAATACGAT.ATTTCTATGT~ ............................ GGTTAAGCGACTAAGCGTACACGGTGGATGCCCTGGCAG.TCAGAGGCGATGAAGGA...CGTGCTAATCTGCGATAAGCGT ............................................ G.AC .. G..GGATGCC.....A..T..G..G.G.......A..........A.......TA.G... B . GCTTTTTTTTGATATAT ....AA.. AAATGCTTTG..GATGTGCAACA............................................... ACCCTTAAAA GCTTTCTTTTGATATAT .... AG..AATTGCTTAG..GATGTACAAGG............................................... GCCCTTGAAA CGGT.AAGGTGATATGAACCGTTATAC.CGGCGATTTCCGAATGGGGAACCCAGTCTGTTTCGACACACTATCATTAACTGAATCCATAGGTTAATGAGCCGA ... T ..... TGATAT..........AA.............G............................................................... A

-

ATTTTATGAAGCGAAACATCGTAGTAATAAATCTAAAACAT~TCAA-ATGTTAAAGTAACGGTGAGTGA..AACAAAGTAGCTCAAAAATTAAAAG

ACCGGGGGAACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCAACCGAGATTCCCC.CAGTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAATC'~ ATTTTATGAAGCGAAACATCGTAGTAATAAATCTAAAACATAATTCAACTGAGATGTTAAAGTAACGGTGAGTGA..AACAAAGTAGCTCAAAAATTAGAAA

A ...... GAA..GAAACATC..AGTA........AAAA.A.A........GA........AGTA.CGC.GAG.GA......A.....C...A...T.A....

DdeI .

AAGGGGCTGAACTACTTGGAA AACTAGGCTACAGAGGGTGATAGCCCCGT-TTTAGTTTAATTACTTTTTTTTGTGCAGCAAGGGTGGTCTCAAC AGAGGGCTGAATTACTTGGA .......... AAGTAAGCTACAGTGGGTGATAGCCCCGTAAGCT~TTGATTTMTTATTTTTTTTT~GCTGCTAGGGTGGTCTCAAC GTG.TGT.GTGTTA.GTGGAAGCCTCTGGAA.. ................................................................. AGGCGCGCGATA

stop"(. +Start 20s . ..........

..... G .. G...TA..TGGAA........................................C.....................................G..G....A.. CTGGGTTCATTCCCCAGCT~AT-G~~~T~TTTGTGG~GG~T~AG~GTATATATTGTTAA. .CAAAATAATATGGAA.ATGTT.TTTCAGAAAAAAAG.~ NTGGGTTCATTCCCCACCTMTAAAGTTCGATTCTTTGTAGCGGCTCCGAGTATATATTATTAA..CAAMTAATATAGAA.ATGTT.TTTCAAA~AAG.C CAGGGTGACAGCCCCGTACACAAAMTGCACATGCTGTGAGC.TCGATGAGTAGGGCGGGACACGTGGTATCCTGTCTGAATATGGGGGGACCATCCTCCAAGGC .. GGGT. .... CCCC.. .. A .. AAA.T.C...T..T...AGC..C...GAGTA.........A......A........GAA.ATG......C.A......AA G.C

TAA.TAAAACCAAATT.TTGATAGAAAATAAGTACCGTGAGGGAAAGGTGAAAAGAAATT.TGT.AGAAT.GCTTAAMGATTCTGAAATCTAGTGCAGTGCAGTG~C TAA.TAAAACCAAATT.TTCATAGAAAATAAGTACCGTGAGGGAAAGGTGAAAACAAATT.TGT.AGAAT.GTTTAAAAGATTCTGAAATCTAGTGCAGTGAAAC TAAATACTCCTGACTGACCGATAGTGAACCAGTACCGTGAGGGAAAGGCGAAAAGAACCCCGGCGAGGGGAGTGAAAAAGAACCTGAAACCGTGTACGTACAAGC TAA.TA ... C .. A.T. ... GATAG..AA..AGTACCGTGAGGGAAAGG.GAAAAGAA ..... G..AG....G...AAAAGA..CTGAAA.C..GT.C....A A.C

100 spl 100 sp4 0 e.coli 0 conserved

200 spl 200 sp4 0 e.coli 0 conserved

300 spl 300 sp4 78 e.coli 21 conserved

350 spl 350 sp4 181 e.coli 32 conserved

450 s p l 450 sp4 282 e.coli 73 conserved

550 SDI 550 sp4 322 e.coli 85 conserved

650 spl

426 e.coli 650 sp4

124 conserved

750 spl

571 e.coli 750 sp4

188 conserved

AGTTAAAGC .... GTGTT...GTTTTAACGTACCTTTTGTATAATGGGCCAACTAGTTTATAAAATTAGCGAG ................... CTTTATCAAATCGCG.TAATGAAAA AGTTAAAGC....GTGTT ... GTTTTAACGTACCTTTTGTATAATGGGCCAACTAGTTTATAAAATTAGCGAG.. ................. CTT.ATCAAATCGCG.TAATGAAAA 040 spl

AGTGGGAGCACGCTTAGGCGTGTGACTGCGTACCTTTTGTATAATGGGTCAGCGACTTATATTCTGTAGCAAGGTTAACCGAATAGGGGAGCCGAAGCCAMCCGAGTCTTAACTGG AGT...AGC ..... T. ..... GT ..... CGTACCTTTTGTATAATGGG.CA.C.A.TT.. ...... TAGC.AG ................... C . . . . . . A M . C G . G . . . . . . . . . .

640 e.coli 236 conserved

839 sp4

. . TGTTATAGTTAATTTTATAAGACCCGAAGTCAAGTGATCTAATCATGGCTAGCTAG AA ...................... GATCGAACCCATAAATGTTGCA TGTTATAGTTAATTTTATAGACCCGAAGTCAAGTGATCTAATCATGGCTAGGTAGAA. ..................... GATCGAACCCATAAhTGTTGC,~ GCGTTAAGTTGCAGGGTATAGACCCGAAACCCGGTGATCTAGCCATGGGCAGGTTGAAGGTTGGGTAACACTAACTGGAGGACCGAACCGACTAATGTTGAA ... T..AGTT.. ....... AGACCCGAA..C..GTGATCTA..CATGG..AGGT.GAA ...................... GA.CGAr\CC.A..AATGTTG.A

920 spl 919 sp4 750 e.coli 288 conserved

AAATTTCGGGAGAAGCTGTGATTAGGGGTGAAACGCTAATCAAACTTGACGATAGCTGGTTTTTCGCGAAATCTATCTACGTAGAGTATTTTTTTTCTTTTGTGCGCGGTAGTGTMTCT 1040 spl

AAATTAGCGGATGACTTGTGGCTGGGGGTGAAAGGCCMTCAAACCGGGAGATAGCTGGTTCTCCCCGAAAGCTATTTAGGTAGCGCCTCGTGAATTCATCTCCGGGGGTAGAGCACTGT 870 e.coli AAATTTCGGGAGAAGCTGTGATTAGGGGTGAAAGGCTAATCAAACTTGACGATAGCTGGTTTTTCGCGAAATCTATCTACGTAGAGTATTTTTTTTCTTTTGTGCGGGGTAGTGTAATCT 1039 sp4

AAATT ... GGA .. A..TGTG..T.GGGCTGAAAGGC.AATCAAAC..G..GATAGCTGGTT.T.C.CGAA.CTAT.TA.GTAG.G..T..T...T...T.....GGGCTAG.G.A.T.T 367 conserved

TTTCTTAAGMGATC .... TGATCT.TTTTTCGTGAAGAATCTTCCAATACGCATT.AAAATAATAAATAAAACGGACTGAGAGTGCAAAGATTCTTGGTCGAGAGGGA~CAGCC TTTCTTAAGAAGATC TGATCT.TTTTTCGTGAAGAATCTTCCAATACGCATT.AAAATAATAAATAAAACGGACTGAGAGTGCMAGATTCTTGGTCGAGAGGGAAACAGCC TTCGGCAAGGGGGTCATCCCGACTTACCMCCCGATGCAAACTGCGAATACCGGAGAATGTTATCACGGGAGACACACGGCGCGTCCTAACCTCCGTCGTGAAGAGGGAAACAACC TT .... AAG .. G.TC ..... GA .. T......C......AA.CT.C.AATAC......A...TA..A....A.AC..AC.G.G.GTGC.AA..T.C.T.GT..AGAGGGAAAC A.CC

1150 spl 1149 sp4 .... 986 e.coli 424 conserved

CAGACCGAACGATAAAGTGCATAAA.CAATGCGAAGTAAAAGAATTTTTTTTAAAAAA .... ATATTGGGAGGTAGGCTTAGAATCAGCCAGCCTTTAMGAAAG

CAGACCGCCAGCTAAGGTCCCAAAGTCATGGTTAAGTGGGAAACGATGTGGGAAGGCCCAGACAGCCAGGATGTTGGCTTAGMGCAGCCATCATTTAAAGAAAG CAGACCGTACGATAAAGTGCATAAA.CAATGCGAAGTAAAAGAATTGTTTTTAA AAAA.... ATATCGGGAGGTAGGCTTAGMTCAGCCAGCCTTTAAAG~G CAGACCG ... G.TAA.GT.C ..AA.. CA..G. .AAGT ... A.A. .... T...AA..............GGA.GT.GGCTTAGAA.CAGCCA. C.TTTAAAGAAAG

CGAAACAGCTCACCAATAGCTACGACTATTTGAATAAAACTCGTTATTAAAAACAATTTTTAAATTATAGCCCT~AGCATTGTACTGAAGCGG CGTAACAGCTCACCAATAGCTACGATTATTTGMC~CATCGT.ATTAA~TAATTTTTAAATTAAAGCGCTAAGCATTGTACTGAAGCGG

CG.AA.AGCTCAC T. A..T.A..G.GCTAA.C..TG.AC.GAAGC.G CGTAATAGCTCACTGGT CGAGTCGGCCTGCGCGGAA.GATGTAACGGGGCTAAACCATGCACCG~GCTG ....................... ... ...........................................

1250 spl

1091 e.coli 1249 sp4

404 conserved

1343 spl

1160 e.coli 1341 sp4

516 conserved

CGGGT ............... AACTCGGTAGCGAAACGTTTTGTAGGTCGTTGAAGGTTTATTG~TAGGCTGGAGATATC~ATTGATAATGTTGGCATGAGTAATGG 1440 spl CGGGT ............... MCTCGCTAGCGAAACGTTTTGTAGGTTGTTGAAGGTTTATTGAG~TAGGCTGGAGATATC~TTGATAATGTTGGCATGAGTAATAG 1438 sp4 CGGCAGCGAGGCTTATGCGTTGTTGGGTAGGGGAGC~TCTGTAAGCCTGCGAAGGTGTGCTGTGAGGCATGCTGGAGGTATCAGAAGTGCGAATGCTGACATAAGT~~GA CGG .................... T.GGTAG.G.A.CGTT.TGTA.G ..... GAAGGT.T..TG..A...A.GCTGGAG.TATCA.AA.TG .. AATG.TG.CAT.AGTAA... 1272 e.coli

577 conserved FIG. 2. Large subunit rDNA sequence. The sequence rtf large subunit rDNA and flanking regions was

determined using Maxam and Gilbert (46) chemistry. The identlty strands for P. aurelia species 1 and 4 are shown, mutually aligned. The nuclease SI-protected bases obtained from Figs. 3-6 are indicated by solid lines above the bases for species 1 data and below for species 4. Shown also is the sequence for E. coli 23 S rRNA aligned by secondary structure (see text) and bases conserved between Paramecium and E. coli. Key restriction sites mentioned in the text, including Hind111 sites B and C, are indicated.

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 4: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

5176 Paramecium Large Subunit rRNA

ACAAAI\TCTT.CAAAT.CATTTTTGTTTGATAAGTTAGGGTTGCTTTGTTTTGATCATCTTACAAAGTGTGATTCGGCCTCTAA ............................. ACAAAATGTT.CAAAT.CATTTTTCTTTGATAAGTTAGGGTTGCTTTGTTTTGATCATCTTACAAAGTGTGATCCGGTCTCTAA TAAACC.CGGTGAAAAGCCCGC~CGCCGGAAGACCAAGGGTTCCTGTCCAACGTTAATC.GGGCCAGGGTGAGTCGACCCCTAAGGCGAGGCCGAAAGGCGTAGTCGATGGGA .. AA ... G....AAA..C....T.G...GA..A...AGGGTT.CT.T.....G.T.ATC......AG.GTGA..CG..C.CTAA............................. 616 conserved

1522 spl 1520 sp4 1383 e.coli

.............................

.......................................................................................................................... 1522 spl .......................................................................................................................... 1520 sp4 AACACCTTAATATTCCTGTACTTGGTGTTACTGCGAAGGGGGGACGGAGAAGGCTATGTTGGCCGGGCGACGGTTGTCCCGGTTTAAGCGTGTAGGCTGGTTTTCCAGGCAAATCCGGAAAA 1505 e.coli .......................................................................................................................... 616 conserved

.............................................................................................. TTTTTTAAATATAAATTTTTAAAGATGA 1550 spl

TCAAGGCTGAGCCGTGATGACGAGGCACTACGGTGCTGAAGCAACAAATGCCCTGCTTCCAGGAAAAGCCTCTAAGCATCAGGTAACATCAAAT 1599 e.coli TTTTTTAAATAGATATTTTTAAAGATGA 1548 sp4 .............................................................................................. ............................ .......................................................................................................................... 616 conserved

GGTTTGTATTTTTTTCTGGCTTCCTTTGCGTTCTTCTTTATTTTTTTAGGAGAGTGTGGGGAGGCGGACTGGAAAAATTTTACAAGGGGCACCGTAC.TAA G C T C T G T A T T T T T T T C T G G C C T C C T T T G T G T T C T T T T T N L C A T T T T A C A T G G T G T A C C G T A C . T A A ............................................................................................ CGTACCCCA ............................................................................................ CGTAC...A

CACTAACACAAGTACTTTAGTCGAGCAGATGACGACAGAAGAGCTAATGATATTG~GGAACTCGGCAAAATTACTTTGTAACTTCGGGATAAAAACTGC CACTAACGCAACTACTTTAGTCGAGCAGATGACGACAAAAGAGCTAATGATATTGAAGGAACTCGGCAAAATTACTTTGTAACTTCGGGATAAAAAGTGC AACCCACACAGGTCGTCAGGTAGAGAATACCAAGGCGCTTGAGAGAACTCGGGTG~GGAACTAGGCAAAATGGTGCCGTAACTTCGGGAGAAGGCACGC .AC .. AC.CA.GT..T...GT.GAG.A.A..A.G.C....GAG..AA.. .... TGAAGGAACT.GGCAAAAT......GTAACTTCGGGA.AA..... GC

GT..................... .............. CAGCAAACAAAAAAATGGGGGGTAGCGACTGTTTACTAAAAACATAAGATTTTGCAAAATTTAATTATGATGTATAAA CT..... .............................. CAGCAAACAAAAA~ATGGGGGGTAGCGACTGTTTACTAAAAACATAAGATTTTGCAAAATTTAATTATGATGTATAAA TCATATGTAGGTGACGTCCCTCGCGGATGGAGCTGAAATCAGTCGAAGATACCAGCTGGCTGCAACTGTTTATTAAAAACACAGCACTGTGCAAACACGAA'~GTGG~CGTATACG ............................................. AA.A.A...G..GG .. GC.ACTGTTTA.TAAAAACA.A..A.T.TGCAAA....AA....GA.GTATA..

ATCTGACTCCTGCCCGGTGTTGTCATGCAAAGTTTTAACGGTTAGC ....... CCTCTTGATAACAAGCAGCAATAAACGGCGGCCATAACTCTGATGGTCCTAAGGTAGCAAAATC

GTGTGACGCCTCCCCGGTGCCGGAAGGTTAATTGATGGGGTTAGCGCAAGCGAAGCTCTTGATCGAAGCCCCGGT~ACGGCGGCCGTAACTAT~CGGTCCTAAGG~AGCG~ATT .T.TGAC.CCTGCCCGGTG .. G..A.G..AA.T..T ... G.T. ............... T......AAGC..C..TAAACGGCGGCC.TAACT.T.A.GGTCCTAAGGTAGC.AAA T.

ATCTGACTCCTGCCCGGTGTTGTCATGCAAAGTTTTAACGGTTAGC. ...... GCTCTTGATAACAAGCAGCAATAAACGGCGGCCATAACTCTGATGGTCCTAAGGTAGCAAAATC

CCTTGACGGGTAAGTTCCGTCCTGCACGAATGGAGTAACGACTGCCCTACTGTCTCCAATATCAGCTCTATGAAATTGAATTTGCTGTGAAGATGCAGCT.TTTTACAACT CCTTGACGGGTAAGTTCCGTCCTGCACGAATGGACTAACGACTGCCCTACTGTCTCCAATATCAGCTCTATGAAATTGAATTTGCTGTGAAGATGCAGCT.TTTTACAACT

CCTTG.CGGGTAAGTTCCG.CCTGCACGAATGG.GTAA.GA..GCC...CTGTCTCCA ....... CTC .. TGAAATTCAA.T.GCTGTGAAGATGCAG........C.. C. CCTTGTCGGGTAAGTTCCGACCTGCACGAATGGCGTAATGATGGCCAGGCTGTCTCCACCCGAGACTCAGTGAAATTGAACTCGCTGTGAAGATGCAGTGTACCCGCGGCA

AGACGGAAAGACCCTATGCACCTTTACTGATGCNAGGAACTAAAGAGATATACTGGAGATAAA.TTAACGTAGGAG ................................... AGACGGAAAGACCCCGTGAACCTTTACTATAGCTTGACACTGAACATTGAGCCTTGATGTGTAGGATAGGTGGG.~GGCTTTGAAGTGTGGACGCCAGTCTGCATGGAGCCG AGACGGAAAGACCCTATGCACCTTTACTGATGCTAGGAACTAAAGGGATATACTGGAGATAAA.TTAAGGTAGGAG ................................... AGACGGAAAGACCC .. TG.ACCTTTACT...GC .. G .. ACT.AA.....A..CT.GA..T..A....AGGT.GGAG...................................

1650 spl 1648 sp4 1608 e.co1i 622 conserved

1750 spl

1708 e.coli 1748 sp4

680 conserved

1830 spl

1823 e.coli 1828 sp4

723 conserved

1940 spl 1938 sp4 1940 e.coli 791 conserved

2050 s p l 2048 sp4 2051 e.coli 871 conserved

2125 spl 2123 sp4 2162 e.co1i 919 conserved

... TTAAATCGCAATTGAAAAACTACTTCTCTTTT.ACG..TCTCTTCATAAAAATGACTAGATAATTTTTACTTTTCTCGCTGTT~~GTTTAACTGGGCCGGTTGCCTCCTAAAAACTAAC 2240 spl ACCTTGAAATACCACCCTTTAATGTTTGATGTTCTAACGTTGACCC .. GTAA...TCCGGGTT...GCGGACAGTCTCTGGTGGGTAGTTTGACTGGGGCGGTCTCCTCCTAAAGAGTAAC 2275 e.co1i ... TTAAATCGCAATTGAAAAACTACTTCTCTTTT.ACG..TCTCTTCATAAAAATGACTAGGTTATTTTTACTTTTCTGGCTGTTAGTTTAACTCGGGCGGTTGCCTCCTAAAAAGTAAC 2238 sp4

... TT .AA... C.A ...... AA. ... T .. T.TT.T.ACG. .... C.. .. TAA...T..................T.TCTGG..G.TAGTTT.ACTGGGGCGGT..CCTCCTAA A.AGTAAC 979 conserved

GGAGGTGAGCATAAAGTTACGCTTTGTGAG~TTTTTTTTCGCAA~TGAGTTAATAAAACT.GCGTMTTTGATTAAATTACACACTAGTAATTTAGGGGCTTATTGCC GGAGCTGAGCATAAAGTTACGCTTTGTGAGGAATTTTTCTTTGCAAAATGAGTT~~TAAAACT.GCGTAATTTGATTAAATTACAAACTAGTAATTTAGGGCCTAATTGCC GGAGGAGCACGAAGG.TTGGCT..AATCCTGGTCGGACATCAGGAGGTTAGTGCAATG.GCATAAGCCAGCTTGACTGCGAGCGTGACGGCGCGAGCAGGTGCGAA.AGCA GGAGG.G..C..A...TT ........ T ............ T..G.A...T.. ... AAT... .. T. .... A..TTGA.T.. ....... AC.........AGG.Gc..A..G c. 1014 conserved

2350 spl 2348 sp4 2381 e.coli

TGCTATAATGATCCGCTGTTTTTTTTTGAATAAGACATCGCTCAACGAAT~GGTACGCTAGGGATAACAGGCTTATAAATTCTGAGAGTTCCTATTA TCCTATAATGATCCGGTGTTTTTTTTTGAATAAGACATCGCTCAACGAATAAAAGGTACGCTAGGGATAACAGGCTTATAAATTCTGAGAGTTCCTATTA GGTCATAGTGATCCGGTGGTTCTGAATGGAAGGGCCATCGCTCAACGGATAAAAGGTACTCCGGGGATAACAGGCTGATACCGCCCAAGAGTTCATATCG .G..ATA.TGATCCGGTG.TT.T...TG.A...G.CATCGCTCAACG.ATAAAAGGTAC.C..GGGATAACACGCT.ATA....C..AGAGTTC.TAT..

AAGAATTTGTTTGGCACCTCGATGTCGGCTCATCACATCCTGGTGGTGCAGAATCTGCCAAGGGTTTGGCTGTTCGCCAATTAAAGTGGTACGTGAGCTG AAGAATTTGTTTGGCACCTCGATGTCGGCTCATCACATCTTGGTGGTGCAGAATCTGCCAAGGGTTTGGCTGTTCGCCAATTAAAGTGGTACCTGAGCTG ACGCCGGTGTTTGCCACCTCGATGTCCGCTCATCACATCCTGGGGCTGAAGTAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAAGTGGTACGCGAGCTG A.G .... TGTTTGGCACCTCGATGTCGGCTCATCACATC.TGG.G.TG.AG.A..T.CCAAGGGT.TGGCTGTTCGCCA.TTAAAGTGCTACG.GAGCTG

CTCTGGTTCGGCATCTACTTTAAATAAAGGTATAGAAGCTACGCTAAGCTGGATAGGCTAATTTTTTAATTAAAGTTTT~TAGCTTG~TAATTAGAAGC CTCTGGTTCGGGATCTACTTTAAATAAAGGTATACATGC~ATGCTAAGCTGGATAAGCTAATTTTTTAATTAAAATTTTTTAGCTTAAAGAATTAGAAGC CACTCGTGTTCGCGTTGTCATGC.CAATGGCACTGCCCGGTAGCTAAATGCGGA ........... AGAGATAAGTGCTGAAAGCATCTAAGCACGAAACT C.CTGGT ........ T .... T....AA.GG.A..G.......GCTAA....G...............A..TAA....T...AGC.T..A.......AA..

2450 spl 2448 sp4 2481 e.coli 1086 conserved

2550 spl 2548 sp4 2581 e.coli 1170 conserved

2650 spl 2647 sp4 2680 e.coli 1233 conserved

2750 spl 2747 sp4 2768 e.coli 1265 conserved

FIG. 2"Continued.

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 5: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

Paramecium Large Subunit rRNA 5177

C C m A C C . . A T C T C G T T T C T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2769 spl TTTTTAGC..AACTCCTTTCT..................................................................................................... 2766 sp4 T.CCCCCCACATCACTTCTCCCTCACCCTTTAACCCTCCTCAACCAACCTTC~CACCACCACCfiCATACCCCCCCTCTCT~CCCCACCCATCCCTTCACCTAACCCCTACTAATC,~CC 2889 e.coli .......... A.....T.TC...................................................................................................... 1269 conserved

............... AATAATATTTTACAACACTACTAAAfiAATTTCMCTTTCCTCfiATACCCCTCTTTACCACCCTACCACCTTTCTTTAAA ............... AATACTAT~AGAACACTACTAACTTAATfiCAACCTACCTCTTATACCCCCC~AACCCCCCTACC~~CCTCCCCTTAAA CTCACGCTTAACCTT................................................................................. ................................................................................................

MAATATTAACACTCTTATTATTATTATAT.CAACTI~ATCCCTCACTCCTTAAACCCCCAC,~CTCTAAATCTCTTCCTACTACCCTCCTTCCTTCC~TCC AAAATACTAAGACTCTTAfiATTATTATAT.CAACTAATCCCTCACTCCTTAAACCCCCACACTCT~AATCTC~CCTACTACC~TCCTTCCTTCC~TCC

-Stop -20s 4 r tRNA tyr . .....................................................................................................

2850 spl 2947 sp4 2904 e.coli 1269 conserved

2950 spl 2947 SQ4 2904 e.coli ..................................................................................................... 1269 conserved

.................................................................................................... ....................................................................................................

TTTGGGAACTMCCTACCCCCTCATTTTTTMAATTCACACC~CCTCTTTTTACTATCTTTAATAACCCCATCTCTATCTTTAACTTTTTTTTTATAACA TTTACCAACTAAATTACCTCCTCATTTCTTAAACCTTACAACCCCATTCTTAAA~TCTTTAATMCCCCATCTCTATCTTTAATfiTTTTTTTATAA~ .................................................................................................... ....................................................................................................

GC.TTTCTTTCTTATTTAATTTTTTTTTTTMAAACATTCACAACTCTAAACTTATTTTTTTA~CAC,~TAAAACCATTCACTATACCTATCA‘~TACCTAAA ACATfiCTTTCTTTTTTAATTTTTTTTTTTT~AAACATAAAMMTCAAAACTCCfiTTTATTCCACATAAC,~A~TATACTAT,~CTACCACTACCTTAA ..................................................................................................... .....................................................................................................

CTACTTTMCTATCTTTTCGATACCCA~CTTTTTATAAAAAATATACMTTTATACCTTCTATATTCCACACCCCATCAACCTCTTTAATAAATACCCA CTACTTTAACTACCTCTTTGATACACAACATTTTTATAAAAAATATACATTCTATACCCTATATATTCCCCAACCTATCAACATATTTATTCACCATCCA .................................................................................................... ....................................................................................................

3050 spl 3047 sp4 2904 e.coli 1269 conserved

3150 spl

2904 e.coli 3147 sp4

1269 conserved

3250 spl 3248 sp4 2904 c.coli 1269 conserved

3350 spl

2904 e.coli 1269 conserved

3348 sp4

AACTTTTAmmTTTTfiTACAAAAAATAAMCCCTATTATTAAAC~CCTT............................................................ CACCTTTTfiTTTTTTTfiTTACTAAAAACAAAACCTTCTTAAATCAATTAAACACCATCCCCCTATCCACMATATTATTTT~~TTCACAT~ATAACATCTCCTTTTCT ............................................................................................................... 2904 e.coli

3401 spl 3459 sp4

............................................................................................................... 1269 conserved

............................................................................................................... AATTTTTTTT~MCCTACCCCTCATCATTACT..............................................................................

3401 spl 3492 sp4 2904 e.col1 1269 conserved

............................................................................................................... ............................................................................................................... FIG. 2-Continued.

1 2 3 4 * C T A G C T A G 5 6

FIG. 3. 5.8 S-like rDNA 5‘ end. The 5’ end-labeled DNA fragment extending leftward from HindIII site A (Fig. l), located at base 300 (Fig. 2), was hybridized with total mitochondrial RNA, digested with nuclease SI (see “Experimental Procedures”), and loaded onto sequence gels adjacent to sequence ladders of the same fragment (lanes 4 and 5). The experiment was done with species 1 and 4 RNA and DNA simultaneously. Also, isolated 23 S (lane I ) , 20 S (lane 2), and 5.8 S-like (lanes 3 and 6) RNAs were hybridized to the same probe and loaded into the lanes indicated.

them as a block of unstructured nucleotides. Comparison of the Paramecium and E. coli secondary struc-

ture (Fig. 7) reveals a substantial degree of conservation of structure between these two RNAs. In general, the few differ- ences between the two structures are due to deletions or truncations of helices within the Paramecium structure rela- tive to E. coli. One particularly interesting difference between the two structures is that the helix representing the 5.8/28 S junction in cytoplasmic large subunit rRNAs (labeled z ) , along with two 5’-neighboring helices, are deleted entirely from the Paramecium sequence. There are three large Paramecium inserts relative to E. coli. The largest of these (bases 472-538 in Fig. 2) falls within the loop beginning at bases 469 (Para- mecium species 4) and 301 of E. coli. This loop in the Para- mecium structure is labeled with two light arrows (see below). Within this inserted region lies the HindIII site C and the 5.8 S-like/ZO S rRNA boundary mentioned above. The inserted sequence appears notably devoid of internal base pairing.

Microevolution of Paramecium rDNA Sequence-Several observations can be made by comparing nucleotide sequences of large and small subunit (45) rRNAs and flanking regions from the two Paramecium species. First, the tRNATy‘ gene is 100% conserved, the rRNA genes are conserved to a slightly lesser degree (94 and 96%), and the mRNA and nontran- scribed regions’ are -80% conserved. This hierarchy of se-

* J. Seilhamer, A. Pritchard, and D. Cummings, manuscript in preparation.

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 6: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

5178 Paramecium Large Subunit rRNA

1 -

C T A G 2 3

FIG. 4. 5.8 S-like rDNA 3' end. The DdeI fragment extending rightward from base 398 was 3' end-labeled by fill-in synthesis, and then was subjected to partial digestion by AluI (lane I ) , HindIII (lane 3), and total RNA-protected nuclease S1 (lane 2). The HindIII (also AluI) site corresponds to base 500 in Fig. 2 and site C in Fig. 1. Adjacent sequence lanes were used as molecular weight markers.

quence conservation appears to reflect the relative functional importance of these genes. Second, the rDNAs themselves contain distinct internal regions of differential relative se- quence conservation. There are two very long 100% conserved regions within large subunit rDNA (bases 719-1 157 and 1689- 2095, species 1 numbers) and several 40-bp hypervariable regions within small subunit rDNA. Similarly, these differ- ences may reflect the presence of intragenic regions with different relative structural importance. Some of these changes, however, constitute compensated base changes within stems and therefore are structurally conservative. Third, the small subunit rDNA sequence (45) overall is sub- stantially more (6.4%) divergent than large subunit rDNA (4%). The significance of this observation is not clear.

Although the nontranscribed region downstream of large subunit rDNA is rather divergent overall, one 55-bp region (bases 3003-3053, Fig. 2) immediately following the tRNATyr gene is 100% conserved. This region is mostly A and T runs, and the first half can be folded into a 23-bp hairpin (bases 3003-3030, Fig. 2). Since this region is located just down- stream from the last known gene in the genome (tRNATyr), it could possibly function as a terminator of transcription.

DISCUSSION

Typical procaryotic large subunit rRNA genes consist of a single contiguous segment 2904 bases (in the case of E. coli) in length. Recently, proposed models for rRNA structural evolution contend that the 5' end portion of the large subunit rRNAs found in eucaryotic cytoplasm has evolved into sepa- rate genes, commonly known as 5.8 S rRNAs (17-24). The reason, if any, for this split in large subunit rRNA is not known. The two gene segments may have separated as a manifestation of the evolutionary expansion of nuclear gen- ome size. In contrast, mitochondria of these same organisms have retained procaryotic style (i.e. contiguous) large subunit rRNA genes, an observation which would appear to support the controversial "endosymbiotic" theory of organelle evolu- tion (34, 35). Other discontinuities in mitochondrial large subunit rRNA genes do occur, particularly in fungi, where as many as two distinct introns can occur (11, 27-30). These introns usually occur a t specific sites further into the gene and are typically re-ligated during RNA processing; hence, they represent distinctly different types of discontinuities.

The finding of a discrete 5.8 S-like rRNA in Paramecium mitochondria, then, was quite unexpected. In addition, two coincidences made its presence difficult to detect, leading to improper placement of the 5' end of the gene (33). First, the discontinuity in the RNA happened to coincide with a HindIII site within the DNA sequence (the site used in cloning 20 S rDNA), and both the upstream RNA and DNA fragments were small enough to be missed on Southern blots. Second, a region which strikingly resembled a highly conserved se- quence normally present near large subunit rRNA 5' ends was found near the 5' end of the 20 S rDNA sequence (HR1; see Ref. 33, seen as bases 672-708 here). The homology exhibited was so convincing that we projected that the large subunit rDNA 5' end must be nearby. Later, when a secondary structure for the entire molecule was developed, it became obvious that this placement of the 5' end was incorrect and that the correct 5' end of large subunit rDNA was further upstream. The homologs of the 3' side of two long range and phylogenetically proven helices (denoted x and y in Fig. 7, E. coli structure) were present within the Paramecium structure, but the complementary strands were not present. When se- quence data further upstream was determined, the proper H R l bases (363-404) and the missing 5' halves of the above helices were located. The correct HR1 sequence shows much less homology to other sequences. Perhaps bases 672-708 arose through duplication and subsequent divergence of the HR1 sequence. Both the S1 mapping data (Fig. 3) and the close proximity to the start of E. coli 23 S rRNA support our new placement of the 5' end in Fig. 2.

Recently, the Aspergillus mitochondrial large subunit rRNA sequence has been determined (8, 11). The 5' end of this rRNA was defined as that region of the DNA sequence where the first substantial homology exists with the E. coli 23 S rRNA sequence. This homology begins a t base 180 in E. coli, which, if it represents the true Aspergillus 5' end, prevents the formation of the same two long range helices mentioned above ( x and y, see Fig. 2 in Ref. 11). Inspection of the DNA

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 7: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

Paramecium Large Subunit rRNA 5179

1 2 3 C T A G -

FIG. 5 (left). 20 S rDNA 5’ end. The TaqI fragment extending leftward C T A G 1 from base 578 (Fig. 2) was 5’ end-labeled and subjected to sequence chemistry (left 4 lanes) and S1 analysis ( l a n e Z).

FIG. 6 (right). 20 S rDNA 3‘ end. The AuaII fragment extending rightward from base 2635 (Fig. 2) was 3’ end-la- beled by fill-in synthesis and subjected to partial digestion by TaqI (lane I ) , Hue111 (lane 2) , and S1 analysis (lane 3). The sequence ladders were used as mo- lecular weight reference only. Protection of fragment lengths including the tRNA tyr gene is indicated by band t in lane 3.

sequence upstream from the proposed Aspergillus 5’ end re- veals a new potential 5’ end near base 7020 (Aspergillus numbers, Ref. 8) which allows part of helix x and the entire helix y to form. Nuclease S1 mapping data should allow the 5’ end of Aspergillus large subunit rRNA to be placed with certainty.

The discontinuity (or junction between 5.8 and 28 S rRNA) in cytoplasmic large subunit rRNA of eucaryotes typically occurs within the loop in helix t in Fig. 7 (E. coli structure). Helix z constitutes one of three helices formed jointly by the two RNA segments (the other two helices, x and y in Fig. 7, are discussed above). In contrast, the Paramecium disconti- nuity occurs within the large bottom loop (open arrows), and the helix formed from the sequences directly surrounding the insert contains components from both RNA segments. The unique absence of the normal 5.8128 S junction sequences and structures (i.e. helix t and neighboring helices) and the occur- rence of a discontinuity at a new site nearby in the molecule could be related. Perhaps in the absence of the structures normally associated with RNA processing, a new second proc- essing site has arisen. Alternately, the requirement for cleav- age at the normal junction site could have been relieved by the presence of the new discontinuity slightly further into the gene. We have called the upstream gene segment 5.8 S-like rRNA for reasons of convenience. Note, however, that the eucaryotic 5.8 S rRNA contains -160 nucleotides, while the 5.8 S-like fragment contains 280 nucleotides (240 of which can be aligned with the E. coli 23 S rRNA primary and secondary structure).

This is the first report of a mitochondrial large subunit rRNA discontinuity within this particular region. Interest- ingly, Chlamydomonas chloroplast large subunit rRNA has been reported to contain two discontinuities within this par- ticular loop, resulting in separate 3 and 7 S segments (25). In contrast, the compound helix containing this loop cannot

even be found in mammalian mitochondria (4-7) and is substantially shortened in Aspergillus mitochondrial large subunit rRNA (see Fig. 2 in Ref. 11). This particular helix and loop structure, then, is quite variable in length and structure. Other discontinuities are found in higher plant chloroplast large subunit rRNAs, in which the 3‘ end is contained in a discrete 4.5 S rRNA fragment (3, 26).

Thus, both the large and small subunit rRNAs of Parame- cium contain discontinuities, both of which fall at unique locations. In both cases, the excised RNA segments consist of additional nucleotides when compared to other rRNA se- quences a t homologous positions. Furthermore, the inserts are very A + T-rich, and they fall at the end of helices which are phylogenetically confirmed but variable in length. In both cases, the insert is particularly devoid of obvious secondary structure, such that it probably represents a single-stranded region in vivo. We cannot currently determine whether these inserts were removed as an obligatory part of ribosomal as- sembly or rather as a result of ribosomal disruption during cell fractionation and RNA extraction. However, it is not surprising that these enlarged loops have become more acces- sible to cleavage by random (or specific) ribonucleases. One possibility is that the enlarged loops protrude into more external (more vulnerable) sites within the ribosome. Another possibility is that the additional nucleotides cause steric hindrance to the ribosomal structure and must be removed before the proper structure can be assumed. Once the trend for removal of these inserts is set, they would be free to evolve rapidly. I t would be interesting to see if strains of Paramecium exist which contain differing length versions of (or lack en- tirely) these inserts.

These excised regions resemble introns in that they consti- tute deletable and rapidly evolving RNA, except that the two “exons” are not re-ligated. Ligation would probably be unnec- essary in the case of rRNAs, particularly since the two seg-

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 8: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

5 180 Paramecium Large Subunit rRNA

ments are held together by extensive secondary structure. Also, these inserts occur within variable regions of the rRNA, while introns tend to be located in universally conserved regions of the molecule.

Another interesting aspect of the Paramecium mitochon- drial rRNA gene arrangement is the transcriptional problems associated with distantly located but coordinately regulated

genes. Several aspects of this problem are discussed in a forthcoming paper.' One possibility is that transcription of both rRNA genes involves initiation from separate but similar promoter sequences, both of which would be particularly strong. Indeed, conserved promoter-like sequences were found directly upstream of yeast mitochondrial rRNA genes (36). We have scanned the sequences upstream of Paramecium

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 9: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

Paramecium Large Subunit rRNA 5181

large and small subunit rDNAs and have found no such conserved sequences. Moreover, the structure and evolution- ary patterns at the 5' ends of these genes are totally different. Small subunit rDNA is flanked on both sides by -20-bp hypervariable regions which are not particularly rich in A + T (45); large subunit rDNA is flanked by conserved poly(T) (5' end) and poly(A + T) (3' end) stretches. Poly(T) runs have been postulated to play a role in transcription initiation in both Dictyostelium nuclear protein-coding genes (37) and also in mouse nuclear rRNA genes (38), and therefore these stretches may function similarly here. Other than these poly(T) tracts, Paramecium rDNAs have no obvious con- served intergenic promoter sequence, nor is there much room for one, since in both cases, the preceding gene terminates -20 bp upstream from the rDNA start (45). On the other hand, intragenic promoters such as suggested for tRNAs (39, 40) could be located just inside the start of the rDNA sequence and have not been ruled out. One possibility for such a sequence, ACGGATG, appears at base -+19 in small subunit (45) and -+21 in large subunit rDNA. Its role, if any, in transcription is not known.

Another possibility is that rRNAs are cleaved from larger precursors (as yet undetected by Northern hybridization), analogous to the transcriptional system utilized in mamma- lian mitochondria (41,42). The close proximity of the tRNATY' gene to the 3' end of rDNA is indeed reminiscent of the contiguous gene organization found in these organisms. It is tempting to imagine that the poly(T) stretches which flank large subunit rDNA could act as processing punctuation, allowing the rRNAs to be recognized and cleaved from larger precursors. If the latter were the case, the discontinuity in Paramecium mitochondrial large subunit rRNA might simply reflect the presence of poly(T) (i.e. cleavage signals) at a particularly vulnerable location within the rRNA molecule. This explanation is not so appealing in the case of small subunit rRNA, however, since it is flanked instead by non- conserved sequences which contain no runs of any particular nucleotide.

The sequence comparisons between species 1 and 4 give a good feel for microevolution within a species with respect to each gene. First, it is clear that the same general gene order and structure have been maintained between these two spe- cies, although occasional small (less than 2 bp) insertions and deletions have occurred. Thus, it appears that the genetic differences between the Paramecium subspecies have for the most part been limited to individual base replacements. The mutability of a given nucleotide position apparently depends on function, as illustrated by the observed gene hierarchy and intragenic clustering of mutations. I t is interesting that two genes coded by the same DNA molecule for different subunits of the same cellular structure (i.e. ribosomes) can have differ- ent base replacement rates, as illustrated by the greater di- vergence of small subunit rRNA. Perhaps the greater degree of helicity within small subunit rRNA allows tolerance of more base changes than large subunit rRNA as long as proper base paring is maintained. As far as the nature of these changes, both rRNA genes showed a 1.5-fold preference for transitions over transversions. Transitional mutations appar- ently either arise easier and/or are more evolutionarily stable

than transversions. Sequence comparative analysis also has allowed us to iden-

tify potentially interesting features of the DNA sequence, such as the hairpin following the tRNATY' gene. Terminators similar in structure have been described in other systems (43). Its position within the linear Paramecium mitochondrial gen- ome (distal to the last known gene) makes it particularly interesting.

Acknowledgments-We gratefully acknowledge the help of Harry Noller and Gary Olsen in determining sequence alignment and struc- ture, Arthur Pritchard for careful criticism of the manuscript, and Karin Meng for help in preparing the figures.

REFERENCES 1. Brosius, J., Dull, T., and Noller, H. (1980) Proc. Natl. Acad. Sci. U. S. A.

3. Takaiwa, F., and Sigiura, M. (1982) Eur. J. Biochem. 124 , 13-19 2. Edwards, K., and Kossel, H. (1981) Nucleic Acids Res. 9 , 2853-2869

4. Eperon, I., Anderson, S., and Nierlich, D. (1980) Nature ( L o r d . ) 286,460-

77, 201-204

AG7

5. 6.

7.

8.

9.

10.

11. 12. 13.

14.

15.

16.

VayEtten, R., Wahlberg, M., and Clayton, D. (1980) Cell 2 2 , 157-170 Anderson, S., De Bruijn, M., Coulson, A., Eperon, I., Sanger, F., and Young,

Saccone, C., Cantatore, P., Gadaleta, G., Gallerani, R., Lanave, C., Pepe,

Netzker, R., Kochel, G., Basak, N., and Kuntzel, H. (1982) Nucleic Acids

Georgiev, O., Nikolaev, N., Hadjiolov, A,, Skryahin, K., Zakharyev, V., and

Otsuka, T., Nomiyama, H., Yoshida, H., Kukita, T., Kuhara, S., and Sakaki,

Kochel, H., and Kuntzel, H. (1982) Nucleic Acids Res. 10,4795-4801 Sor, F., and Fukuhara, H. (1983) Nucleic Acids Res. 11,339-348 Veldman, G. M., Klootwijk, J., de Regt, V., C. H. F., Planta, R. J., Branlant,

Noller, H., KO , J , Wheaton, V., Brosius, J., Gutell, R., Kopylov, A,, C., Krol, A,, and Ehel, J-P. (1981) Nucleic Acids Res. 9,6935-6952

Dohme, F., &rr,.W., Stahl, D., Gupta, R., and Woese, C. (1981) Nucleic Acids Res. 9 , 6167-6189

Glotz, C., Zwieb, C., Brimacornbe, R., Edwards, K., and Kossel, H. (1981) Nucleic Acids Res. 9,3287-3306

Branlant, C., Krol, A., Machatt, M., Pouyet, J., Ebel, J., Edwards, K., and Kossel. H. (1981) Nucleic Acids Res. 9. 4303-4324

I. (1982) J. Mol. Biol. 156,683-717

G., and Kroon, A. (1981) Nucleic Acids Res. 9,4139-4148

Res. 10,4783-4794

Bayev, A. (1981) Nucleic Acids Res. 9,6953-6958

Y. (1983) Proc. Natl. Acad. Sci. U. S. A. 80,3163-3167

17. Nazar, R: (1980) FEBS Lett. 119 , 212-214 18. Walker, W. (1981) FEBS Lett. 126 , 150-151 19. Cox, R., and Kelly, M. (1981) FEBS Lett. 130 , 1-6 20. Jacq, B. (1981) Nucleic Acids Res. 9 , 2913-2932 21. Pavlakis, G., Jordan, B., Wurst, R., and Vournakis, J. (1979) Nucleic Acids

22. Jordan, B., Latil-Demotte, M., and Jourdan, R. (1980) Nucleic Acids Res.

23. Olsen, G., and Sogin, M. (1982) Biochemistry 2 1 , 2335-2343 24. Walker, T., and Pace, N. (1983) Cell 3 3 , 320-322 25. Rochaix, J., and Darlix, J. (1982) J. Mol. Biol. 159 , 383-395 26. Edwards, K., Bedbrook, J., Dyer, T., and Kossel, H. (1981) Biochemistry

27. Dujon, B. (1980) Cell 20, 185-197 28. Bos, J., Osinga, K., Van der Horst, G., Hecht, N., Tabak, H., Van Ornmen,

29. Burke, J., and RajBhandary, U. (1982) Cell 31,509-520 30. Wright, R., and Cumrnings, D. (1983) Cum. Genet. 7, 151-157 31. Wild, M., and Sornmer, R. (1980) Nature (Lond.) 2 8 3 , 693-694 32. Nomiyama, H., Sakaki, Y., and Takagi, Y. (1981) Proc. Natl. Acad. Sei.

33. Seilhamer, J., and Curnmings, D. (1981) Nucleic Acids Res. 9 , 6391-6406 34. Gray, M., and Doolittle, F. (1982) Microbiol. Reu. 46 , 1-42 35. Margulis, L. (1975) Symp. SOC. Exp. Biol. 29 , 21-38 36. Osinga, K., and Tabak, H. (1982) Nucleic Acids Res. 1 0 , 3617-3626 37. Kimrnel, A., and Firtel, A. (1983) Nucleic Acids Res. 11 , 541-552 38. Kuehn, M., and Arnheim, N. (1983) Nucleic Acids Res. 11,211-224 39. Galli, G., Hofstetter, H., and Birnstiel, M. (1981) Nature (Lond.) 294 ,

Res. 7,2213-2238

8,3565-3573

Int. 2 , 533-538

G., and Borst, P. (1980) Cell 2 0 , 207-214

U. S. A. 78, 1376-1380

626-fi31 40. Cilberto, G., Castagnoli, L., Melton, D., and Cortese, R. (1982) Proc. Natl.

41. Ojala, D., Montoya, J., and Attardi, G. (1981) Nature (Lond.) 2 9 0 , 470-

"_ ".

Acad. Sci. U. S . A. 7 9 , 1195-1199

474 42. Bakey, J., and Clayton, D. (1980) J. Bid. Chem. 2 5 5 , 11599-11606

44. MacKay, R. (1981) FEBS Lett. 123 , 17-18 43. Rosenberg, M., and Court, D. (1979) Annu. Reu. Genet. 13 , 319-353

45. Seilhamer, J., Olsen, G., and Cummings, D. (1984) J . Biol. Chem. 259 ,

46. Maxam, A., and Gilbert, W. (1980) Methods Enzymol. 6 5 , 499-595 5167-5172

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from

Page 10: Paramecium Mitochondrial Genes - Journal of Biological ...Paramecium Mitochondrial Genes 11. LARGE SUBUNIT rRNA GENE SEQUENCE AND MICROEVOLUTION* (Received for publication, September

J J Seilhamer, R R Gutell and D J Cummingsmicroevolution.

Paramecium mitochondrial genes. II. Large subunit rRNA gene sequence and

1984, 259:5173-5181.J. Biol. Chem. 

  http://www.jbc.org/content/259/8/5173Access the most updated version of this article at

 Alerts:

  When a correction for this article is posted• 

When this article is cited• 

to choose from all of JBC's e-mail alertsClick here

  http://www.jbc.org/content/259/8/5173.full.html#ref-list-1

This article cites 0 references, 0 of which can be accessed free at

by guest on May 17, 2020

http://ww

w.jbc.org/

Dow

nloaded from