

The EMBO Journal vol.5 no. 11 pp.2915-2922, 1986

α-Thalassaemia caused by a poly(A) site mutation reveals that transcriptional termination is linked to 3' end processing in the human α2 globin gene

Emma Whitelaw and Nick Proudfoot

Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford, OX1 3RE, UK

Communicated by N.J.Proudfoot

We have investigated the process of transcriptional termination in the duplicated human α globin genes which lie 4 kb apart on chromosome 16. In the human erythroleukemic cell line, K562, which expresses high levels of α globin, nuclear run-off experiments suggest that termination occurs within a region of 600 bp past the poly(A) site of both α1 and α2 globin genes. However, a thalassaemic α2 globin gene with a non-functional poly(A) site AAUAAG, when transfected into HeLa cells, not only fails to 3' end process but also fails to terminate transcription. Studies on both steady-state RNA and nuclear run-off analysis of the primary transcripts show that transcription of the mutant α2 globin gene reads through into the intergenic sequence past the normal termination site. These results demonstrate that transcriptional termination and 3' end processing of mRNA are coupled events for the α2 globin gene.
Key words: α2 globin gene/mRNA processing/transcription termination/poly(A) site/α-thalassaemia

Introduction
The development of in vivo and in vitro systems that accurately initiate the transcription of cloned genes has brought considerable insight into the promoter sequences of eukaryotic genes and the protein factors associated with initiation (Corden et al., 1980; Dynan and Tjian, 1985). However, far less is known about the events involved in transcriptional termination. RNA polymerase I terminates transcription 500 bp downstream of the 3' end of the mature 28S rRNA at a set of repeated elements interspersed by runs of pyrimidines (Grummt et al., 1985). With RNA polymerase III, the termination process appears to be controlled by a short run of T residues (Bogenhagen and Brown, 1981). The termination sites of RNA polymerase II transcription units have not been accurately identified. For a number of genes, such as mouse globin (Hofer et al., 1982; Citron et al., 1984), rabbit globin (Rohrbaugh et al., 1985), Drosophila histone (Price and Parker, 1984), chicken histone (Krieg and Melton, 1984), sea urchin histone (Birchmeier et al., 1984), adenovirus major late (Fraser et al., 1979; Moore and Sharp, 1985), mouse α amylase (Hagenbuchle et al., 1984) and the chicken ovalbumin gene (LeMeur et al., 1984), the transcribing RNA polymerases read through the sites on the DNA which correspond to the sequence at the 3' end of the stable mRNA. Formation of correct 3' ends requires endonucleolytic cleavage of the primary transcript. Nuclear run-off experiments, in which the primary transcript is labelled with [32P]UTP, show that transcriptional termination can occur a long way [up to 2.0 kb in the case of the mouse α amylase gene (Hagenbuchle et al., 1984)] downstream of the poly(A) site.

Recent studies on the mouse β globin gene termination process indicate that the 0.8-kb DNA fragment in the 3'-flanking region of the gene within which transcriptional termination occurs is not sufficient to cause termination when it is introduced into another gene, the adenovirus E1A gene. However, a larger fragment (1.6 kb) containing both the β globin gene poly(A) site and the termination region does cause termination of transcription of the E1A gene (Falck-Pedersen et al., 1985). These experiments raise the possibility that termination of transcription may only occur downstream of an active polyadenylation site. We have addressed this question directly using a thalassaemic human α globin gene with a point mutation at the AATAAA sequence which results in a failure to generate correct polyadenylated 3' ends (Higgs et al., 1983). By comparing the transcriptional termination of this mutant α2 globin gene with that of the wild-type α2 globin gene, we show that transcriptional termination of the α2 globin gene does require a functional polyadenylation site.

IRL Press Limited, Oxford, England

Results
Transcriptional termination of the α2 and α1 globin genes occurs close to their poly(A) addition sites
To determine the position of transcriptional termination for the human α2 and α1 globin genes, we carried out nuclear run-off analysis on nuclei isolated from K562 cells (Lozzio and Lozzio, 1975), a human erythroleukaemia cell line that expresses high amounts of α globin following haemin induction (Rutherford et al., 1979; Charnay and Maniatis, 1983). The subline of K562 used in these experiments expresses equivalent amounts of α2 and α1 globin mRNAs (data not shown).

Figure 1A shows a diagram of the human α globin genes with the positions of the DNA probes used in the nuclear run-off analysis indicated on the gene map. Thus a 1.6-kb PstI fragment (probe A) was used to detect both α2 and α1 gene transcripts, while two DNA fragments (probes B and C) 3' to the α2 globin gene and one fragment (probe D) 3' to the α1 globin gene were used to detect gene transcripts that extend into the 3'-flanking region. Figure 1B shows the nuclear run-off data obtained using these α gene probes. The left side panel is an agarose gel fractionation of probes A, B and C, while the middle and right panels are hybridizations of 32P-labelled nuclear RNA to blots of these fractionated probes. Probe A is a PstI digest of the α1 PstI 1.6-kb fragment subcloned into pBR322. Both vector band and probe are present in this lane. As indicated, a strong positive signal was obtained for probe A while the vector band gave a background hybridization signal. Probes B, C and D are purified gel bands and gave significantly lower signals than for probe A, only slightly higher than background. Probe C is cross-contaminated with a larger sized DNA fragment that contains an Alu repeat sequence. Although this contaminant is only present in trace amounts, a significant hybridization signal was obtained to this fragment, which illustrates the danger of low level contaminating Alu sequence in nuclear run-off probes.
Table IA quantifies the signals obtained for probes A, B, C and D. Each signal is directly comparable because the number of potentially transcribed nucleotides in all four probes is ~1 kb.


Fig. 1. Nuclear run-off analysis on human α2 and α1 globin gene transcription in haemin-induced K562 nuclei. (A) Map of the human α globin genes. Genes are divided into exons as filled-in boxes, and non-coding or introns as open boxes. Direction of α gene transcription is indicated by arrows. Position of an Alu repeat region between α2 and α1 is indicated. The positions of the four DNA probes used in the analysis are drawn under the map. (B) The hybridization signals obtained to blots of probes A, B, C and D versus [32P]nuclear RNA from K562 cells. On the left-hand panel is shown the ethidium bromide-stained probes separated by agarose gel electrophoresis. Probe A is a PstI digest of the α2 globin gene 1.6-kb PstI fragment in pBR322 at the PstI site. Probes B, C and D (agarose gel fractionation not shown) are purified restriction fragments of α2W3'PS (B and C) and from pα1RB (D) (see Materials and methods for details of plasmids). These probe fractionations were transferred to cellulose nitrate and hybridised to [α-32P]UTP-labelled RNA from K562 nuclei. After extensive washing the autoradiographs were exposed for 3 days at -70°C (see Materials and methods) as shown on the middle and right-hand panels. Duplicate experiments with probes A and B and single experiments with probes C and D are shown.

The larger 1.6-kb PstI probe A, which contains 600 bp of 5'-flanking sequence, similarly covers 1 kb of transcribed gene sequence for both the α2 and α1 globin genes. Therefore the signal to probe A represents approximately twice the true signal for α2 or α1 globin gene transcripts separately. The hybridizations in Table IA are normalized to one half of the probe A signal. These data demonstrate that probes B and C hybridize at 30 and 15% of the α2 gene signal, while probe D hybridizes at 13% of the α1 gene signal. This implies that termination of transcription occurs close to the PvuII sites of both α2 and α1 globin genes, soon after their respective poly(A) sites.
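The normalization used for Table IA can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in the text (subtract the vector background, then express each probe as a percentage of half the probe A signal). The function name and the densitometry readings are hypothetical: the readings are invented, in arbitrary units, and were chosen so that the result reproduces the percentages reported in Table IA.

```python
def percent_of_gene_signal(probe_signal, probe_a_signal, background):
    """Express a probe signal as a percentage of the single-gene signal.

    Probe A hybridizes equally to α2 and α1 transcripts, so the single-gene
    reference is taken as half of its background-corrected signal.
    """
    reference = (probe_a_signal - background) / 2.0
    if reference <= 0:
        raise ValueError("probe A signal must exceed background")
    return 100.0 * (probe_signal - background) / reference


# Hypothetical densitometry readings in arbitrary units (not from the paper).
background = 10.0   # hybridization to the pBR322 vector band
probe_a = 210.0     # single-gene reference = (210 - 10) / 2 = 100
readings = {"B": 40.0, "C": 25.0, "D": 23.0}

percentages = {
    name: percent_of_gene_signal(value, probe_a, background)
    for name, value in readings.items()
}
for name, pct in percentages.items():
    print(f"probe {name}: {pct:.0f}% of single-gene signal")
```

Halving the probe A signal matters for the interpretation: without it, the flanking probes would appear to carry only half as much read-through transcription as they do relative to a single gene.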

Termination of transcription is linked to polyadenylation in a transient expression system

Because transcriptional termination of the human α globin gene occurs unusually close to the 3' end of the mRNA, we decided to investigate the possibility that 3' end processing and termination were coupled events, at least for the human α2 globin gene. Higgs et al. (1983) have described a Saudi Arabian thalassaemic α2 globin gene with a single base change that mutates the AATAAA sequence to AATAAG. Furthermore, they demonstrated that this gene fails to generate mRNA 3' ends when transfected into tissue culture cells. To exploit this mutant gene for studies on α gene termination, we carried out nuclear run-off analysis on HeLa cells transfected with either the thalassaemic (α2M) or wild-type α2 (α2W) globin gene, together with extensive 3'-flanking sequences. These two constructs were cloned into the transient expression vector pSVed (Proudfoot et al., 1984) and are called α2M3'PS and α2W3'PS, where 3'PS refers to the presence of extensive 3'-flanking sequence (see Materials and methods and Figure 4). These two plasmids were then transiently expressed in HeLa cells and nuclei from these transient expression experiments were subjected to nuclear run-off analysis. Because high amounts of non-specific transcription are often associated with transient expression, we subcloned three α gene fragments, one from within the gene and two in the 3'-flanking region, into an M13 vector. Clones in both orientations were obtained for each α gene fragment. The α gene sequence in each M13 clone is ~600 bp in length so that the intensity of hybridization signal is directly comparable. Single strand DNA preparations from these six clones, as well as the parental M13, were immobilized on cellulose nitrate as 'dot blots' and hybridised to 32P-labelled nuclear RNA from α2M or α2W transient expression experiments under conditions of DNA excess.

Table I. Quantitation of hybridization in nuclear run-off assays

A. K562 expression (Figure 1)

DNA fragment                   % Hybridization
A  PstI-PstI (α2 and α1)       100
B  PvuII-DraI (3' α2)          Av 30
C  BglII-SmaI (3' α2)          15
D  PvuII-HincII (3' α1)        13

% Hybridization was calculated from densitometry of the autoradiographs as a % of half the signal obtained with probe A since this probe hybridizes equally to both the α2 and α1 globin genes. Each value has the background signal (hybridization to the pBR322 vector band) subtracted from it.

B. HeLa transient expression (Figure 2)

M13 clone            % Hybridization
                     α2W     α2M
Sense:
  NcoI-BstEII        100     100
  BstEII-BstEII      32      108
  BstEII-DraI        37      84
Antisense:
  NcoI-BstEII        29      57
  BstEII-BstEII      15      53
  BstEII-DraI        18      38

Background was estimated as the amount of hybridization to M13 alone. The signal minus background of sense transcripts to M13 clone NcoI-BstEII is 100% hybridization.

Fig. 2. Nuclear run-off analysis on nuclei of HeLa cells transfected with α2W3'PS and α2M3'PS pSVed. M13 clones were made in both orientations for the three restriction fragment regions of the α2 globin gene as indicated. 5 µg of single strand DNA for each M13 subclone as well as M13 without an insert were immobilized on dot blots and hybridized to [32P]nuclear RNA as described in Materials and methods. The sense and antisense dot blot signals obtained with either α2W or α2M plasmids are placed above and below the α gene map so that their positions correspond to the three different DNA fragments used to make the M13 probes.

Figure 2 shows the hybridization data obtained for these six M13 α gene probes and Table IB quantifies these hybridization signals. A striking difference was obtained between α2W and α2M nuclear run-off signals. α2W gave a 3-fold greater signal for the entirely genic NcoI-BstEII sense probe than for the two 3'-flanking region sense probes, indicating significant levels of transcriptional termination. In contrast, with α2M all three probes gave closely similar signals, indicating that no transcriptional discontinuity occurs between gene and 3'-flanking sequence. Similar data were also obtained from two further nuclear run-off experiments (data not shown). However, the amount of antisense transcripts throughout both the α2W and α2M genes and flanking sequences was quite high, approaching as much as 50% of the sense signal for α2M and nearly equal to the 3'-flanking region signals of the α2W experiment. These antisense transcripts may be due to the SV40 early promoter, which is present in the pSVed vector and transcribes off the opposite strand to the α gene promoter, or to random initiation of transcription around the plasmid. The non-specific transcription, presumably from both strands of the transient expression plasmid, may partly obscure the specific transcription of the α2 globin gene and therefore prevent an accurate assessment of the efficiency of the transcriptional termination process or its exact location. This may give rise to an apparent discrepancy between the K562 and α2W transient nuclear run-off experiments. For K562 some signal was obtained for the immediate 3'-flanking signal (probe B), suggesting that termination occurs at least several 100 bp past the poly(A) site, while for α2W the signal detected in the immediate 3'-flanking M13 probe was not above the non-specific transcription in two out of three experiments. These results clearly demonstrate some degree of transcriptional termination for α2W but not for α2M and therefore strongly argue for a direct link between α2 globin mRNA 3' end processing or polyadenylation and transcriptional termination.
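The comparison drawn from Table IB can be made explicit with a short calculation. The sketch below uses only the percentages printed in Table IB. The two summary statistics (a read-through ratio and an antisense fraction) and all function and variable names are illustrative choices made here, not measures defined by the authors.

```python
# Background-corrected % hybridization values copied from Table IB.
# Keys "a2W" and "a2M" stand for the wild-type and mutant α2 constructs.
TABLE_IB = {
    "a2W": {
        "sense": {"NcoI-BstEII": 100, "BstEII-BstEII": 32, "BstEII-DraI": 37},
        "antisense": {"NcoI-BstEII": 29, "BstEII-BstEII": 15, "BstEII-DraI": 18},
    },
    "a2M": {
        "sense": {"NcoI-BstEII": 100, "BstEII-BstEII": 108, "BstEII-DraI": 84},
        "antisense": {"NcoI-BstEII": 57, "BstEII-BstEII": 53, "BstEII-DraI": 38},
    },
}

GENIC = "NcoI-BstEII"  # the probe lying entirely within the gene


def readthrough_ratio(sense):
    """Mean 3'-flanking sense signal divided by the genic sense signal."""
    flanking = [value for probe, value in sense.items() if probe != GENIC]
    return sum(flanking) / len(flanking) / sense[GENIC]


def antisense_fraction(construct):
    """Summed antisense signal as a fraction of the summed sense signal."""
    return sum(construct["antisense"].values()) / sum(construct["sense"].values())


for name, construct in TABLE_IB.items():
    print(
        name,
        f"readthrough={readthrough_ratio(construct['sense']):.2f}",
        f"antisense/sense={antisense_fraction(construct):.2f}",
    )
```

With these numbers the wild-type read-through ratio is about one third (the 3-fold drop described in the text), the mutant ratio is close to 1, and the mutant antisense signal is roughly half of the sense signal.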

α-Thalassaemic globin gene with a non-functional poly(A) site utilises a cryptic poly(A) site between the human α2 and α1 globin genes

The data presented above suggest that in the absence of a functional poly(A) site, the α-thalassaemic globin gene fails to terminate transcription. To confirm these data, we analysed steady-state mRNA produced by the transient expression of the α2M gene transfected into Cos7 cells to determine how far the α2 globin mRNA extended past the non-functional termination/polyadenylation region of the mutant α2 globin gene. α2M3'PS was transfected into Cos7 cells and the RNA, obtained after 48 h of transient expression, was analysed by S1 mapping (Figure 3). No mRNA 3' ends were detectable using the 600-bp BstEII-BstEII α2 gene 3' fragment probe. Instead, complete protection of the probe was seen, indicating that transcripts extended all the way through this first portion of the α2-α1 intergenic sequence (data not shown). However, using the adjacent α2-α1 intergenic probe (BstEII-BglII), a strong band (S) was obtained ~600 nucleotides long (Figure 3A, lane G). The other smaller bands (A) are S1 artefacts, as discussed in the figure legend. This result obtained by S1 analysis was confirmed using the single strand-specific 3' exonuclease VII (Figure 3C), which gave the same 600 nucleotide long band, absent in the control lane. This 600-nucleotide band was polyadenylated (Figure 3B). Although a band at ~800 bp was seen in the poly(A)- RNA, this corresponds to a very AT-rich DNA sequence. Presumably, this forms an S1 nuclease-sensitive site in the RNA-DNA duplex because it was absent when exonuclease VII was used (Figure 3C) and so it cannot represent RNA with true 3' ends.

Fig. 3. S1 nuclease (A,B,D) and exonuclease VII (C) mapping of the 3' ends of globin mRNA present in cytoplasmic RNA purified from Cos7 cells transfected with α2M3'PS pSVed. The two probes used in these experiments are shown in the diagram. They were a BstEII-BglII fragment and an AvaII-BglII fragment purified from the α2M3'PS DNA by acrylamide gel electrophoresis. The probes were end labelled by filling in with Klenow DNA polymerase and [α-32P]dNTP. A line diagram indicating the sequences protected by the probes is shown at the bottom of the figure. (A) S1 analysis of α2M3'PS cytoplasmic RNA hybridised to the BstEII-BglII probe (lane G). Positions of probe (P), signal (S) and artefact bands (A) are indicated. The two higher artefact bands are due to the 'breathing' of AT-rich sequences in the RNA-DNA hybrid during the S1 reaction. The lower artefact band at 44 bp is a probe alone band and is also apparent in B and C. The minus RNA control is shown in lane Co. (B) S1 analysis of poly(A)-selected α2M3'PS cytoplasmic RNA. Poly(A)+ (lane A+), poly(A)- (lane A-). The probe used was the BstEII-BglII fragment. (C) Exonuclease VII digests of α2M3'PS cytoplasmic RNA hybridized to the BstEII-BglII probe. (D) S1 analysis of α2M3'PS cytoplasmic RNA against the smaller AvaII-BglII probe. A doublet band is obtained at 122 bp according to restriction fragment size markers. This position is ~20 bp 3' to an AATAAA sequence midway between the α2 and α1 globin genes (Hess et al., 1983). The two artefact bands at 100 bp and 72 bp correspond to the artefact bands arising from AT-rich sequences seen in panels A and B with the BstEII-BglII probe.

To define the 3' end of the α2 gene transcript more precisely,
we used a smaller DNA probe labelled at an AvaII site within the BstEII-BglII DNA fragment (see Figure 3). A 122-bp doublet band was obtained together with the same two artefact bands probably caused by AT-rich sequences (Figure 3D). This result positions the 3' end of the mutant α2 gene transcript in the middle of the α2-α1 intergenic sequence ~20 bases 3' to an AAUAAA sequence, the only AAUAAA in the whole intergenic sequence (Hess et al., 1983). Figure 4 shows a detailed map of the human α globin genes and indicates the position of the cryptic poly(A) site in the α2-α1 intergenic sequence. In summary, these data indicate that when the normal poly(A) site of the α2 gene is non-functional, α2 gene transcripts extend 1.5 kb into the 3'-flanking region to form a new polyadenylated 3' end at a cryptic poly(A) site.

Fig. 4. Line diagram depicting the various α2 globin gene constructs with respect to the human α globin gene map. The α2- and α1-globin genes are indicated by rectangles with the filled in exons. The positions of the poly(A) addition sequences are indicated. V denotes sequence added. M denotes mutant and W denotes wild-type.

The inefficiency of the cryptic poly(A) site provides a molecular explanation for the α-thalassaemia phenotype

Two sets of data suggest that the transcriptional termination process of the thalassaemia α2 globin gene is disrupted. First, nuclear run-off analysis indicates that nascent transcripts extend past the normal termination region and, second, a cryptic poly(A) site 1 kb into the 3'-flanking region is utilized by the α2M gene at significant levels. However, we suspected that the amount of globin mRNA that utilises the cryptic poly(A) site is abnormally low, because the patient with this thalassaemic gene expresses reduced levels of α2 globin mRNA (Higgs et al., 1983).

To measure the level of stable α2 globin mRNA produced from
α2M3'PS directly, a quantitative assay of 5' ends was carried out using primer extension analysis as shown in Figure 5. The amount of stable mRNA was ~3-4 times lower in Cos7 cells transfected with α2M3'PS than in those transfected with α2W3'PS (Figure 5A, lanes 1 and 6). The amount could be lower for two reasons: (i) most transcripts are terminating before the cryptic poly(A) site and are unstable, or (ii) transcripts are not terminating before the cryptic poly(A) site but the cryptic poly(A) site is not as effective as normal so that only 20-30% of the transcripts reaching this point are stabilized by cleavage and polyadenylation. To test these alternatives, we constructed α2MH, α2MS, α2MD and α2MB, in which a 300-bp fragment containing the wild-type α1 poly(A) site was placed at the HpaI, SacI, DraI or BglII sites, 3' to the α2M globin gene (see Figure 4). If all these constructs could raise the amounts of mRNA from the α2M gene to equal those of the wild-type gene, this would suggest that termination was not occurring prior to the site of the added poly(A) site, but that the cryptic poly(A) site was four times less efficient in stabilizing mRNA than the normal poly(A) site. Following their transfection into Cos7 cells, each of these four α2M3'PS constructs with an added normal poly(A) site was tested for levels of α2 mRNA synthesis. As shown in Figure 5A (lanes 2-5), each construct gave normal levels of α mRNA, four times higher than with α2M3'PS. As a measure of transfection efficiency, the plasmid β pSVed (see Materials and methods) was used as a co-transfection control with each α2M construct. S1 analysis using a β globin 3' probe gave a band of 210 bp corresponding to the distance between the EcoRI site and the 3' end of the β globin mRNA (Figure 5B). The signal obtained was equivalent in all tracks so that we could directly compare the amounts of α globin mRNA shown in Figure 5A.

To confirm that the amounts of α globin mRNA were rescued by utilizing the added α1 gene poly(A) site, we carried out 3' end S1 analysis of RNA from cells transfected with α2MB, where the added poly(A) site is placed 3' to the cryptic poly(A) site (Figure 4). Figure 5C (lane 1) shows the 3' S1 analysis of α2MB RNA using an α1 globin gene 3' probe. The presence of a band of 220 bp indicated that the added α1 gene poly(A) site was being utilized, while the 120-bp band corresponded to the mismatch between the α2 sequence present in the construct and the α1 sequence in the probe. As expected, RNA from cells transfected with α2M3'PS gave only the 120-bp mismatch band (lane 2). Figure 5D shows the co-transfected β globin gene mRNA signal. We found that the α2MB transfection was twice as efficient as α2M3'PS so that the amount of 3' ends from α2MB was 3- to 4-fold those bands in α2M3'PS, verifying the quantitation at the 5' end (Figure 5A). The 3' ends of the RNA from cells transfected with α2MH, α2MS and α2MD were also analysed with probe C and were all found to be utilizing the added poly(A) site (results not shown). These data reveal two important features of the α2M gene. Firstly, the inefficiency of the cryptic poly(A) site at least in part accounts for the reduced levels of α2 globin gene expression observed in this type of thalassaemia. Secondly, no significant level of transcriptional termination can occur at least up to the BglII site in the α2-α1 intergenic sequence. This latter result confirms our nuclear run-off analysis data and indicates a linkage between 3' end processing and transcriptional termination.
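The correction for transfection efficiency described above amounts to dividing each α globin signal by its co-transfected β globin control before comparing constructs. A minimal Python sketch follows. The band intensities are invented for illustration (arbitrary units) and were picked so that the control differs 2-fold, as the text reports for α2MB versus α2M3'PS; the function name is hypothetical.

```python
def normalized_fold(test_signal, test_control, ref_signal, ref_control):
    """Fold difference of test over reference after control normalization.

    Each α globin signal is divided by the β globin signal from the same
    transfection, which cancels differences in transfection efficiency.
    """
    if min(test_control, ref_control, ref_signal) <= 0:
        raise ValueError("control and reference signals must be positive")
    return (test_signal / test_control) / (ref_signal / ref_control)


# Hypothetical band intensities in arbitrary units (not from the paper).
a2mb_alpha, a2mb_beta = 70.0, 20.0    # α2MB: α 3' end band, β control band
a2m_alpha, a2m_beta = 10.0, 10.0      # α2M3'PS: α 3' end band, β control band

raw_fold = a2mb_alpha / a2m_alpha
corrected_fold = normalized_fold(a2mb_alpha, a2mb_beta, a2m_alpha, a2m_beta)
print(f"raw fold: {raw_fold:.1f}, corrected fold: {corrected_fold:.1f}")
```

In this made-up case a raw 7-fold difference shrinks to 3.5-fold once the 2-fold difference in the β control is taken into account, which is the kind of adjustment that brings the 3' end data into line with the 5' end quantitation.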

DiscussionmRNA 3' end formation in eukaryotes involves both endo-nucleolytic cleavage and polyadenylation. The highly conserved

2919

7

Page 6: oa-Thalassaemia caused by a poly(A) site mutation reveals that

E.Whitelaw and N.J.Proudfoot

C1 2 3 4 5 6 7 M

122118

90

Cap-.-_ aot am76

201190180

160144

M 1 2 3

I _

67

122 t.118 % * a

Primer-_ _~n

D

240

Psignal-_. %qp

p signal -_

201190180

160

144

-~l 20O190

_ 180

_ 160' 144

BstEl PvulII I

Hinfl HaeM BstEflI I

-I

60 ^,120primer ae3

LEco RII

{ __d...

220

probe "

Fig. 5. Primer extension (A) and S1 nuclease (B, C and D) mapping of 5' and 3' ends of α globin mRNA present in cytoplasmic RNA from Cos7 cells transfected with α2MH, α2MS, α2MB and α2M3'PS. The positions of the DNA probes with respect to these plasmids are indicated in the diagram. (A) Primer extension of RNA from Cos7 cells transfected with α2W (1), α2MH (2), α2MS (3), α2MD (4), α2MB (5), α2M3'PS (6) and minus RNA control (7) with probe A. The α primer is an antisense, single-stranded, 3'-end-labelled HinfI-HaeIII fragment from the 5'-non-coding region and 1st exon of α2. The 54-bp primer is extended by 20 bp to the Cap site giving a band of 74 bp. (B) S1 analysis of RNA from Cos7 cells transfected with α2W (1), α2MH (2), α2MS (3), α2MD (4), α2MB (5), α2M3'PS (6) and minus RNA control (7) with the 3' β probe. The 3' β probe B is a 3'-end-labelled, double-stranded probe obtained by linearizing the plasmid pSVod at the unique EcoRI site and filling in with Klenow DNA polymerase and [α-32P]dATP. (C) S1 analysis of RNA from cells transfected with α2MB (1), α2M3'PS (2) and a minus RNA control (3) with the 3' α probe. The 3' α probe was obtained by linearizing α1pSVed at the unique BstEII site and filling in with Klenow DNA polymerase and [α-32P]dGTP. (D) S1 analysis of RNA from cells transfected with α2MB (1), α2M3'PS (2) and a minus RNA control (3) with the 3' β probe.


sequence AAUAAA found 10-30 bases upstream of most polyadenylation sites forms part of the recognition signal for cleavage of the primary transcript (Proudfoot and Brownlee, 1976; Fitzgerald and Shenk, 1981; Higgs et al., 1983; Montell et al., 1983). Recent evidence indicates that sequences located immediately downstream of the poly(A) site are also required (Simonsen and Levinson, 1983; Gil and Proudfoot, 1984; McDevitt et al., 1984). When the DNA sequence corresponding to the AAUAAA signal is altered to AAGAAA in adenovirus (Montell et al., 1983) or AACAAA in a human β-thalassaemic globin gene (Orkin et al., 1985), elongated RNA transcripts are observed. Similarly, it has been shown that when the AATAAA is mutated to AATAAG, as in the case of the human α-thalassaemic globin gene, correct 3' end processing is abolished if the mutant gene is transfected into tissue culture cells (Higgs et al., 1983). We have extended this observation to show that transcription of the mutant α globin gene continues at least 1.0 kb into the 3'-flanking region to give reduced levels of poly(A)+ transcripts which end just downstream of the only AATAAA found in the intergenic sequence between the two α globin genes. Furthermore, by placing the normal polyadenylation site from the α1 globin gene downstream of this cryptic poly(A) site we find that the transcripts read beyond the cryptic poly(A) site and as far as the inserted poly(A) site.

The fact that the normal α1 globin gene poly(A) site functions efficiently when placed 3' to the cryptic poly(A) site argues that the cryptic site is inefficient, only stabilizing α2 globin mRNA to 20% of the wild-type level. The absence of functional signals downstream of the cryptic poly(A) site (McDevitt et al., 1984; Gil and Proudfoot, 1984) may explain its low efficiency. The amount of α2 globin mRNA present in reticulocytes of the thalassaemic patient is ~10-20% that found in a normal individual (Higgs et al., 1983). The inefficiency of the cryptic poly(A) site may in part account for the α-thalassaemic phenotype of the mutant gene.
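The sequence rule discussed above (an AATAAA hexamer lying 10-30 bases upstream of the poly(A) site, and its loss in the AATAAG mutant) can be illustrated with a short Python sketch. The toy sequences, the function names and the convention of measuring the distance from the 3' end of the hexamer to the cleavage position are all assumptions made for illustration; they are not the globin sequences or a method used in this work.

```python
import re
from typing import List

CANONICAL = "AATAAA"  # DNA form of the AAUAAA signal


def find_hexamers(seq: str, hexamer: str = CANONICAL) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) hexamer match."""
    return [m.start() for m in re.finditer("(?=" + re.escape(hexamer) + ")", seq.upper())]


def has_signal_upstream(seq: str, cleavage_site: int,
                        min_dist: int = 10, max_dist: int = 30,
                        hexamer: str = CANONICAL) -> bool:
    """True if a hexamer ends min_dist..max_dist bases upstream of cleavage_site.

    Distance is counted from the base after the hexamer to the cleavage
    position (an assumed convention for this sketch).
    """
    for start in find_hexamers(seq, hexamer):
        distance = cleavage_site - (start + len(hexamer))
        if min_dist <= distance <= max_dist:
            return True
    return False


# Toy sequences, not real globin sequence: hexamer at position 4, "cleavage" at 25.
wild_type = "GGCC" + "AATAAA" + "C" * 15 + "GCAGG"
mutant = wild_type.replace("AATAAA", "AATAAG")  # single A-to-G change in the hexamer

print(find_hexamers(wild_type), has_signal_upstream(wild_type, 25))
print(find_hexamers(mutant), has_signal_upstream(mutant, 25))
```

A scan of this kind only reports the presence and position of the hexamer. As noted above, sequences downstream of the poly(A) site are also required, so a hexamer match alone does not predict an efficient site.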

Nuclear run-off experiments to assess α1 and α2 globin gene transcription in the erythroleukemic cell line, K562, show that transcriptional termination of both the wild-type α globin genes occurs within 100-300 bp of their respective poly(A) sites. Interestingly, transcription of the mouse α1 globin gene also appears to terminate in a region 50-250 bp 3' of the polyadenylation site (Sheffery et al., 1984). In both these cases the transcriptional termination sites are closer to the poly(A) site than in most other polymerase II genes so far studied (Birnstiel et al., 1985).

The fact that termination of transcription of the wild-type α2 globin gene occurs close to the poly(A) site in K562 and yet transcription of the mutant α2 globin gene in HeLa cells goes at least 1.5 kb beyond this region suggested that termination does not occur without active 3' end processing. Nuclear run-off experiments on tissue culture cells transfected with either the mutant or the wild-type gene confirm this suggestion, since termination does occur with the wild-type but not with the mutant gene. Because these two genes differ by only one nucleotide (A to G in the AATAAA sequence), we must assume that transcriptional termination does require a functional poly(A) site. Such a requirement would make good physiological sense as it would prevent termination of transcription within genes. How such a system could work at the mechanistic level is harder to envisage, but further experiments to delineate the exact sequences required for termination of RNA polymerase II genes should clarify this issue.

It is interesting that the severity of homozygous α-thalassaemia in this case is greater than would be expected for a patient with two apparently normal α1 globin genes (D.Higgs, personal communication) and two partially active α2 globin genes. Indeed an α globin output of nearly three functional α globin genes would be expected for this patient rather than the observed single α globin gene output. An intriguing explanation for these apparent discrepancies in α globin levels in the Saudi Arabian α-thalassaemia may be that the inability of the α2 globin gene to terminate transcription results in α2 gene transcripts reading through the intergenic sequence into α1 and thereby inhibiting α1 globin gene expression. A recent paper by Proudfoot (1986) suggests that such a transcriptional interference effect can occur between two adjacent genes and that this effect is alleviated by placing transcriptional termination signals between the two genes.

Materials and methods

α Globin and β globin gene constructs

α2W3'PS. The plasmid α2W3'PS contains the entire α2 globin gene from the PvuII site 1.5 kb 5' from the cap site to the SnaI site 2.0 kb 3' from the poly(A) site. This 4.5-kb fragment was inserted into the vector pSVed between the EcoRI and PvuI sites. The vector pSVed contains the pBR322 replication origin and tetracycline gene and the SV40 replication origin and enhancer sequence (Proudfoot et al., 1984).

α2M3'PS. The plasmid α2M3'PS is identical to α2W3'PS except that the 0.6-kb BstEII fragment containing the wild-type poly(A) site, AATAAA, was replaced by the identical fragment from the mutant gene (Higgs et al., 1983) in which the poly(A) site is AATAAG.

α2MH, α2MS, α2MD, α2MB. These plasmids originated from α2M3'PS. The 300-bp fragment from BstEII to PvuII containing the human α1 globin poly(A) site was inserted into α2M3'PS at the HpaI site (α2MH), the SacI site (α2MS), the DraI site (α2MD) and the BglII site (α2MB) as shown in Figure 4.

paJRB. The plasmid, in which a 4.0-kb EcoRI-BglII fragment from the human α1 globin gene has been inserted between the EcoRI and BamHI sites of pBR322, has been described by Lauer et al. (1980).

α Globin gene M13 clones. Restriction enzyme fragments from the human α2 gene and 3' sequences (as indicated in Figure 2) were purified, flush-ended and ligated into the SmaI site of M13 mp8 (Maniatis et al., 1982). Clones of both orientations were isolated and the single-stranded phage DNA was grown up by standard procedures.

βpSVed. The human β globin gene from HpaI in the 5'-flanking sequence to PstI in the 3'-flanking sequence was inserted into pSVed between the EcoRI site and the PstI site (Proudfoot et al., 1984).

pβ5'SV. This was a gift from Dr F.Grosveld. The rabbit β globin gene is inserted into a plasmid containing the SV40 origin and enhancer sequences and large T antigen (Grosveld et al., 1982). The presence of these sequences allows for replication in HeLa cells of any plasmids containing the SV40 origin.

Transient expression

Transfections into Cos7 cells or HeLa cells were carried out as described previously (Mellon et al., 1981; Whitelaw and Proudfoot, 1983). Cos7 is a CV1 monkey cell line transformed with defective SV40 that expresses sufficient levels of SV40 T antigen to allow replication of plasmids containing SV40 origin sequences, such as pSVod. Experiments with HeLa cells were carried out in the presence of the plasmid pβ5'SV (see above), which allows for replication of other plasmids containing the SV40 origin. Plasmid DNA was precipitated with calcium phosphate and added to subconfluent dishes of cells. After 10-16 h, the medium was changed and the cells allowed to grow for another 30 h. The cells were harvested, lysed in NP-40 detergent buffer, and the cytoplasmic and nuclear fractions separated by centrifugation through a sucrose cushion. Following incubation with proteinase K, cytoplasmic RNA was purified by phenol/chloroform extraction and ethanol precipitation (Whitelaw and Proudfoot, 1983). Poly(A) selection was carried out by standard procedures using an oligo(dT) cellulose column (Maniatis et al., 1982).

RNA mapping

S1 nuclease. Probe DNAs (either double or single stranded) were annealed to cytoplasmic RNAs (10-20 µg) in 30 µl of 80% formamide, 0.04 M Pipes pH 6.4, 0.4 M NaCl, 0.1 mM EDTA by denaturation at 80°C for 10 min, then incubation overnight at 53°C (double-stranded probe) or 30°C (single-stranded probe). 300 µl of ice-cold S1 buffer (0.25 M NaCl, 0.03 M NaAcetate pH 4.6, 2 mM ZnSO4, 50 µg/ml denatured sonicated carrier DNA) plus S1 (3000 units) was quickly added to each hybridisation and incubated for 1 h at 30°C. S1 reactions were ethanol precipitated and fractionated on denaturing, 7 M urea polyacrylamide gels.
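As a worked example of the arithmetic implied by the S1 nuclease protocol, the following Python sketch estimates the final concentrations after 300 µl of S1 buffer is added to a 30 µl hybridisation. It assumes the volumes are simply additive and ignores the volume of the enzyme and carrier DNA; the helper name `mix` and the dictionary keys are illustrative only.

```python
from typing import Dict, List, Tuple

Component = Tuple[float, Dict[str, float]]  # (volume in µl, {solute: concentration})


def mix(components: List[Component]) -> Dict[str, float]:
    """Final concentration of each solute after combining the listed volumes."""
    total_volume = sum(volume for volume, _ in components)
    amounts: Dict[str, float] = {}
    for volume, solutes in components:
        for name, conc in solutes.items():
            # amount = concentration x volume, summed over every component
            amounts[name] = amounts.get(name, 0.0) + conc * volume
    return {name: amount / total_volume for name, amount in amounts.items()}


# Concentrations as stated in the protocol; units are carried in the key names.
hybridisation = (30.0, {"formamide_percent": 80.0, "NaCl_M": 0.4,
                        "Pipes_M": 0.04, "EDTA_mM": 0.1})
s1_buffer = (300.0, {"NaCl_M": 0.25, "NaAcetate_M": 0.03, "ZnSO4_mM": 2.0})

final = mix([hybridisation, s1_buffer])
for name, value in sorted(final.items()):
    print(f"{name}: {value:.4g}")
```

Under these assumptions the formamide is diluted 11-fold, to roughly 7%, and NaCl ends up at about 0.26 M.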


Exonuclease VII. Hybridisations were carried out as for the S1 experiments. 500 µl of exonuclease VII buffer (30 mM KCl, 10 mM Tris pH 7.8 and 10 mM EDTA) with 4 U of exonuclease VII (Bethesda Research Laboratories) was added to hybridisations and incubated at 37°C for 2 h. The reactions were then ethanol precipitated and fractionated as for S1 experiments.

Primer extension. DNA primer and RNA were annealed in 10 µl of 10 mM Pipes pH 6.4, 0.4 M NaCl at 80°C for 10 min and 63°C overnight. 50 µl of reverse transcriptase buffer (50 mM Tris, pH 8.2, 10 mM DTT, 6 mM MgCl2, 0.5 mM dATP, dCTP, dTTP, dGTP) plus reverse transcriptase (5 units) were added to hybridisations and incubated at 42°C for 1 h. The reactions were ethanol precipitated twice and fractionated by electrophoresis on 7 M urea polyacrylamide gels.

Nuclear isolation, transcription and RNA purification

A variant K562 line which forms an adherent monolayer (M.V.Chao, unpublished) was grown in DMEM supplemented with 10% fetal calf serum in the presence of 50 µM haemin for 3-4 days. Nuclear isolation and transcription were carried out according to Groudine et al. (1981) but with some modifications. Cells (~1 × 10^7) were spun down and washed in 1 × SSC (0.15 M NaCl, 0.015 M sodium citrate, pH 7.0). Nuclei were isolated by lysis of cells in RSB (0.01 M Tris pH 7.4, 0.01 M NaCl, 3 mM MgCl2) containing 0.5% NP-40 and centrifugation for 10 min at 2000 r.p.m. The nuclei were washed in RSB without NP-40 and resuspended in 2 × transcription buffer (5 mM DTT, 180 mM KCl, 10 mM MgCl2, 20 mM Tris pH 7.8, 50% glycerol). Elongation was carried out at 30°C for 10 min in the presence of 0.2 mM ATP, GTP, CTP, 200 µCi [α-32P]UTP (3000 Ci/mmol, Amersham). Reactions were terminated by the addition of RNase-free DNase I (Miles) to 100 µg/ml and incubation at 30°C for a further 5 min. Under these conditions, pre-initiated RNA chains are elongated by 100-300 nucleotides (Weber et al., 1977). The RNA was purified according to the method of Treisman and Maniatis (1985). Basically, the reactions were deproteinized in a solution of 0.1 M Tris pH 7.5, 0.15 M NaCl, 1% SDS, 10 mM EDTA with 100 µg of proteinase K per ml, followed by phenol-chloroform extraction and ethanol precipitation. The pellet was taken up in 300 µl of 10 mM Tris, pH 7.5, 10 mM MgCl2 and incubated at 37°C for 30 min in the presence of 50 µg/ml RNase-free DNase I (Miles). The reaction was extracted with phenol/chloroform and ethanol precipitated.

Protocols for nuclear isolation, transcription and RNA purification for experiments using HeLa cells in which plasmids were being transiently expressed were identical to those used for the nuclear isolation, transcription and RNA purification of K562. The transfection of HeLa cells with plasmids was carried out as described above. Nuclei were purified 40-48 h later. Approximately 2 × 10^6 cells were used for each experiment.

Hybridization to nitrocellulose filters

M13 single strand DNA (5 µg) was bound to nitrocellulose filters according to the method of Kafatos et al. (1979) with a miniblot filtration unit.

Experiments on DNA fragments were done by first purifying the fragments from a large agarose gel and then running them (0.2 µg) on a small agarose gel. The DNA was then transferred to nitrocellulose according to Maniatis et al. (1982). Alkali breakage of RNA was performed according to the technique of Hofer and Darnell (1981). The volume of hybridization was 1 ml. Hybridization was carried out at 42°C in the presence of 50% formamide (Treisman and Maniatis, 1985) for 48 h. After hybridization, the filters were washed: 30 min, 25°C, 2 × SSC, 0.1% SDS; 30 min, 65°C, 2 × SSC, 0.1% SDS. Then the filters were incubated in 2 × SSC at 37°C with RNase (DNase-free) at a concentration of 2 µg/ml. The filters were finally washed in 0.2 × SSC, 0.1% SDS at 68°C for 30 min.

Acknowledgements

We gratefully acknowledge Dr Doug Higgs for providing the α2 thalassaemic globin gene used throughout these experiments. We thank both Doug Higgs and Dean Jackson for helpful discussion. Finally, we are indebted to Zena Werb for her critical reading of the manuscript. EW and NJP were supported by a Medical Research Council project grant no. G8108316CB.

References

Birchmeier,C., Schumperli,D., Sconzo,G. and Birnstiel,M.L. (1984) Proc. Natl. Acad. Sci. USA, 81, 1057-1061.
Birnstiel,M.L., Busslinger,M. and Strub,K. (1985) Cell, 41, 349-359.
Bogenhagen,D.F. and Brown,D.D. (1981) Cell, 24, 261-270.
Charnay,P. and Maniatis,T. (1983) Science, 220, 1281-1283.
Citron,B., Falck-Pedersen,E., Salditt-Georgieff,M. and Darnell,J.E.,Jr (1984) Nucleic Acids Res., 12, 8723-8731.
Corden,J., Wasylyk,B., Buchwalder,A., Sassone-Corsi,P., Kedinger,C. and Chambon,P. (1980) Science, 209, 1406-1414.
Dynan,W.S. and Tjian,R. (1985) Nature, 316, 774-778.
Falck-Pedersen,E., Logan,J., Shenk,T. and Darnell,J.E.,Jr (1985) Cell, 40, 897-905.
Fitzgerald,M. and Shenk,T. (1981) Cell, 24, 251-260.
Fraser,N.W., Nevins,J.R., Ziff,E. and Darnell,J.E. (1979) J. Mol. Biol., 129, 643.
Gil,A. and Proudfoot,N.J. (1984) Nature, 312, 473-474.
Grosveld,G.C., de Boer,E., Shewmaker,C.K. and Flavell,R.A. (1982) Nature, 295, 120-126.
Groudine,M., Peretz,M. and Weintraub,H. (1981) Mol. Cell. Biol., 1, 281-288.
Grummt,I., Maier,U., Ohrlein,A., Hassouna,N. and Bachellerie,J.-P. (1985) Cell, 43, 801-810.
Hagenbuchle,O., Wellauer,P.K., Cribbs,D.L. and Schibler,U. (1984) Cell, 38, 737-744.
Hess,J.F., Fox,M., Schmid,C.W. and Shen,C.-K.J. (1983) Proc. Natl. Acad. Sci. USA, 80, 5970-5974.
Higgs,D.R., Goodbourn,S.E.Y., Lamb,J., Clegg,J.B., Weatherall,D.J. and Proudfoot,N.J. (1983) Nature, 306, 398-400.
Hofer,E. and Darnell,J.E. (1981) Cell, 23, 585-593.
Hofer,E., Hofer-Warbinek,R. and Darnell,J.E.,Jr (1982) Cell, 29, 887-893.
Kafatos,F.C., Jones,C.W. and Efstratiadis,A. (1979) Nucleic Acids Res., 7, 1541-1551.
Krieg,P.A. and Melton,D. (1984) Nature, 308, 203-206.
Lauer,J., Shen,C.-K. and Maniatis,T. (1980) Cell, 20, 119-130.
LeMeur,M.A., Galliot,B. and Gerlinger,P. (1984) EMBO J., 3, 2779-2786.
Lozzio,C.B. and Lozzio,B.B. (1975) Blood, 45, 321-334.
Maniatis,T., Fritsch,E.F. and Sambrook,J. (1982) Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory Press, NY.
McDevitt,M.A., Imperiale,M.J., Ali,H. and Nevins,J.R. (1984) Cell, 37, 993-999.
Mellon,P., Parker,V., Gluzman,Y. and Maniatis,T. (1981) Cell, 27, 279-288.
Montell,C., Fisher,E.F., Caruthers,M.H. and Berk,A.J. (1983) Nature, 305, 600-605.
Moore,C.L. and Sharp,P.A. (1985) Cell, 41, 845-855.
Orkin,S.H., Cheng,T.-C., Antonarakis,S.E. and Kazazian,H.H.,Jr (1985) EMBO J., 4, 453-456.
Price,D.H. and Parker,C.S. (1984) Cell, 38, 423-429.
Proudfoot,N.J. (1986) Nature, 321, 730-731.
Proudfoot,N.J. and Brownlee,G.G. (1976) Nature, 263, 211-214.
Proudfoot,N.J., Rutherford,T.R. and Partington,G.A. (1984) EMBO J., 3, 1533-1540.
Rohrbaugh,M.L., Johnson,J., James,M. and Hardison,R. (1985) Mol. Cell. Biol., 5, 147-160.
Rutherford,T.R., Clegg,J.B. and Weatherall,D.J. (1979) Nature, 280, 164-165.
Sheffery,M., Marks,P.A. and Rifkind,R.A. (1984) J. Mol. Biol., 172, 417-436.
Simonsen,C.C. and Levinson,A.D. (1983) Mol. Cell. Biol., 3, 2250-2258.
Treisman,R. and Maniatis,T. (1985) Nature, 315, 72-75.
Weber,J., Jelinek,W. and Darnell,J. (1977) Cell, 10, 611-616.
Whitelaw,E. and Proudfoot,N.J. (1983) Nucleic Acids Res., 11, 7717-7733.

Received on 28 July 1986
