
Virginie Orgogozo and Matthew V. Rockman (eds.), Molecular Methods for Evolutionary Genetics, Methods in Molecular Biology, vol. 772, DOI 10.1007/978-1-61779-228-1_15, © Springer Science+Business Media, LLC 2011

Chapter 15

Characterizing cDNA Ends by Circular RACE

Patrick T. McGrath

Abstract

Rapid amplification of cDNA ends (RACE) is a widely used PCR-based method to identify the 5′ and 3′ ends of cDNA transcripts from partial cDNAs. While conceptually simple, this method often requires substantial optimization before accurate end identification is achieved. This is due in part to the anchoring of a universal primer to a cDNA or mRNA for PCR, which can lead to nonspecific amplification. Here, we describe an improvement of the original RACE method, circular RACE, which can be used to simultaneously identify both the 5′ and 3′ ends of a target cDNA.

Key words: 5′ UTR, 3′ UTR, Circular RACE

1. Introduction

The complete isolation of a full-length gene transcript is an important step in studying a gene's function, yet can be difficult even in well-studied model organisms. In eukaryotes, coding regions are divided between multiple exons, which are separated by noncoding introns. The complexity of transcription initiation, termination, and splicing makes ab initio gene prediction from sequence extremely challenging. As an orthogonal approach, expressed sequence tag (EST) libraries containing cDNA fragments can be used to identify short (~500 bp) portions of expressed genes. Typically, these libraries will not cover the entire transcriptome: ESTs are often biased to the 3′ end of transcripts, and transcripts with low to medium levels of transcription often will not be completely covered. Even in organisms with high-quality genome sequence and large numbers of EST sequences, such as the nematode Caenorhabditis elegans, it has been estimated that 20% of the genome is incorrectly annotated (1) and many gene predictions lack 5′ and 3′ UTRs.

Rapid amplification of cDNA ends (RACE) was originally developed in the late 1980s as a method to clone unknown flanking cDNA sequences for a particular gene of interest when a small amount of cDNA sequence is either already known or can be confidently predicted (2–4). A universal anchor sequence is attached to the 5′ or 3′ end of RNA or cDNA that has been isolated from an organism of interest. PCR with one primer specific to the anchor sequence and one primer specific to the known sequence of the transcript of interest then amplifies the region 5′ or 3′ of the gene-specific primer. This product can be cloned or sequenced to identify the unknown sequence 5′ or 3′ of the known sequence.

As originally described, RACE is often fraught with technical difficulties and can require substantial optimization before the desired result is achieved (5). A number of modifications have been developed to overcome some of these difficulties. Here, we describe one of these methods, circular RACE, which offers a few advantages over traditional RACE. Because no universal adaptor needs to be ligated to a cDNA in circular RACE, multiple gene-specific primers can be used to increase the specificity of the PCR. Additionally, this strategy allows the simultaneous identification of both the 5′ and 3′ ends of the target transcript by inverse PCR. While circular RACE can also describe a technique in which mRNA-derived cDNAs are circularized and used as templates for inverse PCR (6), the protocol described here (adapted from Mandl et al. (7)) involves first circularizing RNA and then generating cDNA to use as template for inverse PCR. We prefer this second protocol because it allows for a powerful control to distinguish between inverse PCR products that are generated from intramolecularly ligated intact mRNAs (the goal) and PCR products that are generated from nonspecific amplification or from degraded or incompletely processed mRNAs (the contamination). Before mRNAs can be intramolecularly ligated, or circularized, they must first be “decapped” by removing the 5′-terminal methylated guanine nucleotide. Thus, by running a control reaction without decapping the mRNAs, one can identify the PCR products that are generated from nonspecific hybridization or mRNA degradation products.

2. Materials

2.1. Decapping the RNA

1. High-quality total RNA from your species of interest.
2. Tobacco acid pyrophosphatase (TAP) (10 U/µl) (Epicentre, Madison, WI).
3. 10× TAP buffer (Epicentre).
4. Nuclease-free water (Ambion, Austin, TX).
5. RNAseZAP wipes (Ambion).
6. RNAse-free microfuge tubes (Ambion).
7. RNAse-free tips (Ambion).

2.2. Circularizing the RNA

1. Decapped RNA from previous step.
2. T4 RNA ligase (20 U/µl) (New England Biolabs, Ipswich, MA).
3. 10× T4 RNA ligase buffer (New England Biolabs).
4. RNAse-free microfuge tubes (Ambion).
5. RNAse-free tips (Ambion).

2.3. Reverse Transcribing the Circularized RNA into cDNA

1. Circularized RNA from previous step.
2. Superscript III RT (200 U/µl) (Invitrogen, Carlsbad, CA) (items 2–8 can be purchased as the Superscript III First-Strand Synthesis System for RT-PCR).
3. RNaseOUT recombinant RNAse inhibitor (40 U/µl) (Invitrogen).
4. 5× First-strand buffer (Invitrogen).
5. 0.1 M DTT (Invitrogen).
6. 10 mM dNTP mix (Invitrogen).
7. Random hexamers (50 ng/µl) (Invitrogen).
8. RNAse H (2 U/µl) (Invitrogen).
9. Gene-specific primer (1 µM).
10. Thin-walled RNAse-free PCR tubes (Ambion).
11. Nuclease-free water (Ambion).

2.4. Nested PCR

1. cDNA from previous step.
2. Outer and inner gene-specific primers (10 µM).
3. PfuUltra II Fusion HS DNA polymerase (Agilent Technologies, Santa Clara, CA).
4. 10× PfuUltra II buffer (Agilent).
5. 10 mM dNTP mix (Invitrogen).
6. De-ionized water.
7. Zymoclean Gel DNA Recovery Kit (Zymo Research, Orange, CA).

3. Methods

A schematic of the overall protocol is shown in Fig. 1. The protocol assumes that the user has already purified total RNA from their sample of interest. RNA can be extracted using a Trizol-based protocol or one of the many commercially available RNA purification kits. The RNA is then decapped with TAP (a control reaction without TAP is run in parallel). TAP cleaves the pyrophosphate bond of the 5′-terminal methylated guanine nucleotide “cap” of eukaryotic messenger RNAs. The decapped RNA can then be intramolecularly ligated with T4 RNA ligase to create circularized RNA, in which the 5′ and 3′ ends of the RNA are joined. The circularized RNA is then reverse transcribed using either a gene-specific primer or random hexamers to create a cDNA containing the 5′–3′ junction. Finally, two nested PCR reactions are run from this cDNA template to amplify a PCR product containing the junction for the user’s gene of choice. This product can then be cloned and/or sequenced to identify the 5′ and 3′ ends of the transcript.

Fig. 1. Schematic depicting the overall strategy of circular RACE, starting from total RNA isolated from a sample. The final PCR product should be sequenced to identify the 5′ and 3′ ends of the transcript. While primers are not used in the first step of mRNA uncapping, we show them with respect to the mRNA to aid in primer design. The black bar underneath the mRNA indicates the region of the transcript that has already been identified by ESTs or that can be confidently predicted computationally. Please note that random hexamers can be used to generate cDNA in place of the cDNA_r gene-specific primer.
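The overall strategy can be illustrated in silico. The following Python sketch is not part of the published protocol: the toy transcript, the primer sequences, and the function names are hypothetical, and the model ignores reverse transcription and PCR errors. It only shows why outward-facing primers on a circularized transcript yield a product that reads through the 3′ end, the poly-A tail, and then the 5′ end.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def inverse_pcr_product(transcript, fwd_primer, rev_primer):
    """Sense-strand product of inverse PCR on a circularized transcript.

    transcript: linear transcript written 5'->3' in DNA letters,
                including the poly-A tail (hypothetical input).
    fwd_primer: identical to the sense strand, extends toward the 3' end.
    rev_primer: reverse complement of the sense strand, extends toward
                the 5' end.
    """
    fwd_start = transcript.find(fwd_primer)
    rev_site = revcomp(rev_primer)          # where rev_primer anneals
    rev_start = transcript.find(rev_site)
    if fwd_start < 0 or rev_start < 0:
        raise ValueError("primer does not match the transcript")
    rev_end = rev_start + len(rev_site)
    if rev_end > fwd_start:
        raise ValueError("reverse primer site must lie upstream of the forward primer")
    # After circularization the 3' end is joined to the 5' end, so the
    # product runs: forward primer -> 3' end -> junction -> 5' end -> reverse primer site.
    return transcript[fwd_start:] + transcript[:rev_end]


# Hypothetical toy transcript: unknown 5' end, known region, unknown 3' end, poly-A tail.
five_prime_unknown = "GGCATTCA"
known_region = "ATGGCTAGCAAGGAGCTGTTCACCGGTGTTGTCCCAATTCTGGTTGAG"
three_prime_unknown = "TAAGCTTCGA"
poly_a = "A" * 12
transcript = five_prime_unknown + known_region + three_prime_unknown + poly_a

fwd = known_region[-12:]            # sense primer near the 3' side of the known region
rev = revcomp(known_region[:12])    # antisense primer near the 5' side of the known region
product = inverse_pcr_product(transcript, fwd, rev)
print(product)
```

In this toy case the product contains, in order, the forward primer, the unknown 3′ sequence, the poly-A tail, the unknown 5′ sequence, and the reverse primer site, which is the layout expected of the final sequenced product.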

3.1. Decapping the RNA

The initial quality of the RNA sample is essential to the success of the protocol. Since RNA is easily degraded, standard RNA handling techniques should be applied. A cleaved mRNA can still be intramolecularly ligated and used as a template for cDNA synthesis, creating a band that could be misinterpreted as the 5′ end of the transcript.

1. To prevent RNA degradation, gloves should always be worn, nuclease-free water should be used for reactions, and the bench area and equipment should be wiped down with RNAseZAP wipes (see Notes 1 and 2 for information on how to assess RNA quality).

2. For the +TAP reaction, in an RNAse-free microfuge tube, combine 500 ng of total RNA (or 125 ng of mRNA, see Note 3), 1 µl of TAP, 2 µl of 10× TAP Reaction buffer, and enough nuclease-free water to bring the total reaction volume to 20 µl.

3. Spurious PCR products in downstream steps can also result from degraded or incompletely processed transcripts, or from cross-contamination from other uncircularized RNA species. Amplification of these spurious products does not depend on TAP treatment. Therefore, as a control (the −TAP reaction), in a second RNAse-free microfuge tube, combine 500 ng of total RNA (or 125 ng of mRNA), 2 µl of 10× TAP Reaction buffer, and enough nuclease-free water to bring the total reaction volume to 20 µl.

4. Incubate both reactions at 37°C for 2 h.

3.2. Circularizing the RNA

1. The decapped RNA is used as a substrate for the next reaction (see Note 4). For each of the +TAP and −TAP reactions, combine 17 µl of +TAP or −TAP RNA with 2 µl of 10× T4 RNA ligase reaction buffer and 1 µl of T4 RNA ligase.

2. Incubate both reactions at 37°C for 2 h.

3. Inactivate the T4 RNA ligase by incubating at 65°C for 15 min.
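The +TAP and −TAP decapping reactions differ only in whether enzyme is added, so their set-up reduces to simple volume arithmetic. The short Python sketch below works this out, assuming volumes in µl; the function name `decapping_mix` and the example stock concentration of 250 ng/µl are hypothetical and serve only as an illustration. The RNA amount (500 ng), buffer volume, TAP volume, and final volume are those given in the decapping steps.

```python
def decapping_mix(rna_ng_per_ul, plus_tap, rna_ng=500.0, total_ul=20.0):
    """Volumes (in microliters) for one decapping reaction.

    rna_ng_per_ul: concentration of the RNA stock (hypothetical input).
    plus_tap:      True for the +TAP reaction, False for the -TAP control.
    rna_ng:        500 ng of total RNA by default (use 125 for purified mRNA).
    """
    if rna_ng_per_ul <= 0:
        raise ValueError("RNA concentration must be positive")
    rna_ul = rna_ng / rna_ng_per_ul
    buffer_ul = 2.0                       # 10x TAP Reaction buffer
    tap_ul = 1.0 if plus_tap else 0.0     # enzyme is omitted from the control
    water_ul = total_ul - rna_ul - buffer_ul - tap_ul
    if water_ul < 0:
        raise ValueError("RNA stock is too dilute for a %g ul reaction" % total_ul)
    return {"rna_ul": rna_ul, "buffer_ul": buffer_ul,
            "tap_ul": tap_ul, "water_ul": water_ul}


# Example with a hypothetical RNA stock at 250 ng/ul.
plus = decapping_mix(250.0, plus_tap=True)
minus = decapping_mix(250.0, plus_tap=False)
print(plus)
print(minus)
```

The same bookkeeping applies to the ligation step, where 17 µl of decapped RNA, 2 µl of buffer, and 1 µl of ligase again give a 20 µl reaction.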

3.3. Reverse Transcribing the Circularized RNA into cDNA

1. In order to create a template for PCR, the circular RNA must first be reverse transcribed into linear cDNA. The exact location where linearization occurs is determined by the location of the primer and must not be near the 5′–3′ junction. This means that a poly-T primer cannot be used for the RT reaction. Rather, a gene-specific primer must be chosen in a particular region of the predicted sequence (illustrated in Fig. 1). This primer will be the reverse complement of part of the known mRNA sequence, and should be ~200 bp downstream of the unknown 5′ sequence. Alternatively, random hexamers can be used to generate cDNA. For most genes, template generated from random hexamers is sufficient to identify the 5′ and 3′ ends of a transcript. Since cDNA generated from random hexamers can be used as a template for any gene of interest, we recommend using random hexamers. If difficulty is encountered in subsequent steps, we then recommend switching to a gene-specific primer for the reverse transcription reaction.

2. We recommend following the protocol for RT-PCR exactly to ensure long extensions and sufficient yield. While not necessary, performing the following reaction in a PCR machine is encouraged to ensure accurate incubation times and temperatures. In two nuclease-free PCR tubes, add 10 µl of circularized RNA from the +TAP or −TAP reactions, 2 µl of random hexamers or 2 µl of gene-specific primer, 1 µl of 10 mM dNTP mix, and nuclease-free water to 13 µl. Heat the mixture to 65°C for 5 min and then incubate on ice for at least 1 min.

3. To these tubes, add 4 µl of 5× First-Strand Buffer, 1 µl of 0.1 M DTT, 1 µl of RNaseOUT Recombinant RNase Inhibitor, and 1 µl of SuperScript III RT.

4. Mix by pipetting up and down. When using random hexamers, incubate the tubes at 25°C for 5 min. Then, incubate the tubes at 50°C for 45 min, followed by 55°C for 45 min. Inactivate the reaction by heating at 70°C for 15 min. This can be a useful place to stop for the day (see Note 5).

5. Amplification of some targets may require the removal of the complementary RNA. We recommend adding 1 µl (2 Units) of RNAse H to each tube and incubating at 37°C for 20 min.

3.4. Nested PCR

1. A PCR product containing the unknown regions 5′ and 3′ of the known transcript sequence can now be amplified from the preceding product. Since many transcripts will be found at low levels, either due to low expression or to expression limited to a subset of tissues, we recommend performing a nested PCR reaction to improve specificity. In nested PCR, the target DNA undergoes a first round of amplification using the outer primers (1f and 1r in Fig. 1). This step will amplify DNA from the target cDNA as well as additional nonspecific, unwanted PCR products. A second round of amplification uses two new primers internal to the outer primers used in the first step (2f and 2r in Fig. 1). Note that the inner primers should not overlap with the outer primers. These inner primers will again amplify from the target DNA, but it is unlikely that any nonspecific, contaminating transcripts amplified in the first PCR reaction will contain binding sites for the new inner primers.

2. The inner and outer primers should be chosen from the region of known sequence, as shown in Fig. 1. The primers should be chosen as close to the boundary between known and unknown sequence as possible. This will lead to a shorter PCR product and a higher chance of success in amplifying product from the transcript of interest. If the primers are at least 50 bp away from the boundary between known and unknown sequence, the final product can be sequenced directly. Alternatively, the PCR product can be cloned into a sequencing vector. Care should be taken in selecting the sequence of the primers. We recommend using Primer3 software (http://frodo.wi.mit.edu/primer3/) to choose primers with a melting temperature of around 60°C.

3. While most DNA polymerases can be used, we recommend using PfuUltra II due to its high specificity and processivity. In two 200 µl thin-walled PCR tubes, combine 2 µl of cDNA template from the previous reaction, 39.5 µl of distilled water, 5 µl of 10× PfuUltra II reaction buffer, 1.5 µl of dNTP mix, 1 µl of outer primer 1f (at 10 µM), 1 µl of outer primer 1r (at 10 µM), and 1 µl of PfuUltra II fusion HS DNA polymerase. Mix gently by pipetting up and down.

4. We recommend using a touchdown PCR to reduce amplification of nonspecific sequence. A typical cycling protocol for PCR primers with a 60°C melting temperature would be:

Denature: 95°C for 1 min

Touchdown cycle:
Denature at 95°C for 20 s
Anneal primers at 65°C for 20 s
Primer extension at 72°C for 30 s
Cycle seven times, decreasing the temperature for annealing primers by 1°C each cycle

Subsequent cycle:
Denature at 95°C for 20 s
Anneal primers at 58°C for 20 s
Primer extension at 72°C for 30 s
Cycle 30 times

Final extension: 72°C for 3 min

5. Often, a single PCR reaction will not generate enough specific product for sequencing or cloning. Dilute a 2 µl aliquot of each of the +TAP and −TAP PCR products into 48 µl of de-ionized water or TE buffer. In two 200 µl thin-walled PCR tubes, combine 2 µl of diluted PCR product from the previous reactions, 39.5 µl of distilled water, 5 µl of 10× PfuUltra II reaction buffer, 1.5 µl of dNTP mix (10 mM each dNTP), 1 µl of inner primer 2f (at 10 µM), 1 µl of inner primer 2r (at 10 µM), and 1 µl of PfuUltra II fusion HS DNA polymerase. Mix gently by pipetting up and down.

6. Run a touchdown PCR identical to the previous reaction.

7. The PCR products of the +TAP and −TAP reactions should be analyzed using agarose gel electrophoresis.

8. Any bands observed in the +TAP reaction but not in the −TAP reaction should be cut out and purified using the gel purification kit (Fig. 2 and see Note 6). These products can be sequenced using the 2f and 2r inner primers (see Note 7). Because the poly-A tail within this PCR product is difficult to sequence through, it will likely be necessary to sequence the PCR products from both ends to identify the 5′ and 3′ ends.
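The touchdown cycling protocol used for both PCR rounds can be written out as an explicit list of steps. The Python sketch below is one reading of that protocol, assuming that the first touchdown cycle anneals at 65°C and that each of the seven touchdown cycles is 1°C cooler than the one before (65°C down to 59°C), followed by 30 cycles at 58°C. The function name `touchdown_program` and the tuple layout are hypothetical and do not correspond to any thermocycler's programming interface.

```python
def touchdown_program(start_anneal=65, step=1, touchdown_cycles=7,
                      final_anneal=58, plateau_cycles=30):
    """Return the program as a list of (step name, temperature in C, seconds)."""
    program = [("initial denaturation", 95, 60)]
    # Touchdown phase: annealing temperature drops by `step` each cycle.
    for cycle in range(touchdown_cycles):
        anneal = start_anneal - cycle * step
        program += [("denature", 95, 20),
                    ("anneal", anneal, 20),
                    ("extend", 72, 30)]
    # Plateau phase: fixed annealing temperature.
    for _ in range(plateau_cycles):
        program += [("denature", 95, 20),
                    ("anneal", final_anneal, 20),
                    ("extend", 72, 30)]
    program.append(("final extension", 72, 180))
    return program


program = touchdown_program()
anneal_temps = [temp for name, temp, _ in program if name == "anneal"]
hold_seconds = sum(seconds for _, _, seconds in program)
print(anneal_temps[:8])
print("total hold time (s):", hold_seconds)
```

The total reported here counts hold times only; ramping between temperatures adds to the actual run time and depends on the instrument.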

Fig. 2. A characteristic agarose gel electrophoresis result from circular RACE. Bands observed in the −TAP control reaction are likely contaminating bands. Any bands that are observed only in the +TAP reaction lane should be isolated, purified, and sequenced.

4. Notes

1. RNA degradation can be a serious issue for this protocol. The amount of degradation in an RNA sample can be estimated using agarose gel electrophoresis. The 28S and 18S rRNA products can be visualized by running a total RNA sample, and should appear as two clear bands. If smearing is observed, RNA should be re-isolated.

2. To ensure that the transcript of interest is present in the RNA sample, a Northern blot can be performed.

3. For most genes, total RNA can be used as a starting point for circular RACE. However, rRNAs and incompletely processed transcripts can increase background amplification in later PCR reactions. For extremely rare transcripts, it is recommended that mRNA be purified using one of the many commercially available kits. In this case, 125 ng of purified mRNA should be used in place of the 500 ng of total RNA for the decapping reaction.

4. TAP does not need to be deactivated before proceeding with the circularizing reaction.

5. The decapping, circularization, and reverse transcription reactions (up to the RNAse H treatment) should be run on the same day. At this point, the cDNA template can be stored indefinitely at −20°C.

6. If no bands are detected after the second round of PCR, it is useful to start troubleshooting with the nested PCR step. A positive control, typically a transcript to which the circular RACE protocol has previously been applied successfully, can be particularly helpful in testing the success of the preceding reactions. If the positive control works, then standard troubleshooting measures following the guidelines of the DNA polymerase manufacturer typically resolve the issue. New nested primers can also be tried.

7. A poly-A tail should be identified between the 3′ and 5′ ends of the transcript in the final sequencing reaction.
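As Note 7 indicates, the poly-A tail marks the junction in the sequenced product. The Python sketch below shows one simple way to split a read at that junction. It is an idealized illustration only: the function name `split_at_polya`, the minimum run length of 10, and the example read are hypothetical, and the read is assumed to be in the sense orientation and to pass cleanly through the poly-A tail, which real sequencing reads often do not.

```python
import re


def split_at_polya(read, min_run=10):
    """Split a sense-orientation junction read at its longest run of A's.

    Returns (sequence before the run, run length, sequence after the run),
    or None if no run of at least `min_run` A's is found. The sequence
    before the run ends at the transcript's 3' end; the sequence after it
    starts at the transcript's 5' end.
    """
    runs = list(re.finditer(r"A{%d,}" % min_run, read))
    if not runs:
        return None
    longest = max(runs, key=lambda m: m.end() - m.start())
    # Caveat: A's that belong to the 3' UTR or to the first bases of the
    # 5' end are indistinguishable from the tail and are counted in the run.
    return read[:longest.start()], longest.end() - longest.start(), read[longest.end():]


# Hypothetical read: 3' end of the transcript, 12 A's, then the 5' end.
read = "GGTTGAGTAAGCTTCG" + "A" * 12 + "GGCATTCAATGGC"
three_prime_end, tail_length, five_prime_start = split_at_polya(read)
print(three_prime_end, tail_length, five_prime_start)
```

Because adenines flanking the tail cannot be assigned unambiguously, the exact end positions suggested by such a split should be checked against the genomic sequence.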

Acknowledgment

We thank Andres Bendesky for critical reading of this protocol.

References

1. Salehi-Ashtiani K, Lin C, Hao T et al (2009) Large-scale RACE approach for proactive experimental definition of C. elegans ORFeome. Genome Res 19:2334–2342

2. Frohman MA, Dush MK, Martin GR (1988) Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc Natl Acad Sci USA 85:8998–9002

3. Loh EY, Elliott JF, Cwirla S et al (1989) Polymerase chain reaction with single-sided specificity: analysis of T cell receptor delta chain. Science 243:217–220

4. Ohara O, Dorit RL, Gilbert W (1989) One-sided polymerase chain reaction: the amplification of cDNA. Proc Natl Acad Sci USA 86:5673–5677

5. Schaefer BC (1995) Revolutions in rapid amplification of cDNA ends: new strategies for polymerase chain reaction cloning of full-length cDNA ends. Anal Biochem 227:255–273

6. Maruyama IN, Rakow TL, Maruyama HI (1995) cRACE: a simple method for identification of the 5′ end of mRNAs. Nucleic Acids Res 23:3796–3797

7. Mandl CW, Heinz FX, Puchhammer-Stockl E et al (1991) Sequencing the termini of capped viral RNA by 5′-3′ ligation and PCR. Biotechniques 10:484, 486