

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1992 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 267, No. 5, Issue of February 15, pp. 2955-2959, 1992 Printed in U.S.A.

The Glycine-rich Domain of Nucleolin Has an Unusual Supersecondary Structure Responsible for Its RNA-Helix-destabilizing Properties*

(Received for publication, June 27, 1991)

Laurence Ghisolfi‡, Gerard Joseph, François Amalric, and Monique Erard

From the Centre de Recherches de Biochimie et de Génétique Cellulaires du Centre National de la Recherche Scientifique, 118 Route de Narbonne, 31062 Toulouse Cédex, France

Nucleolin, a major nucleolar protein implicated in preribosome assembly and transcriptional regulation, possesses a C-terminal domain unusually rich in glycine, arginine, and phenylalanine residues. A polypeptide (p10), corresponding to this domain, has been synthesized by means of an Escherichia coli expression system and purified to homogeneity. Nitrocellulose binding assays have clearly shown that this domain of nucleolin is capable of interacting with RNA, and indeed all nucleic acids tested, in an efficient but nonspecific manner. A combination of circular dichroism and infrared spectroscopy provides strong evidence that repeated β-turns are a major structural component of this polypeptide, which is entirely consistent with its amino acid composition and above all the presence of repeat motifs such as RGGF. Circular dichroism also shows that the interaction of p10 with RNA involves an unstacking of the nucleotide bases and an unfolding of the RNA secondary structure. While the role of the C-terminal domain of nucleolin in vivo has yet to be established, our findings suggest that it may act to unfold regions of ribosomal RNA so that a second domain of nucleolin has access to its specific binding site.

The formation of ribonucleoprotein complexes appears to be an essential step in the processing of mature RNA molecules (mRNA or rRNA) from primary transcripts. In the case of ribosomal RNA, this assembly occurs within the nucleolus and involves, among other factors, an abundant nucleolar-specific protein known as nucleolin. This 100-kDa protein, also implicated in transcriptional regulation (1), can be subdivided into three major domains, each of which seems to have a well defined function. The N-terminal domain comprises long acidic stretches interspersed with basic repeats, similar to the structure of a high mobility group-type protein. We have recently shown that this domain is responsible for the ability of nucleolin to modulate chromatin condensation (2, 3). In contrast, the central domain contains four RNA binding elements (4) which are probably involved in the recognition of a specific binding site in the 5′ external transcribed spacer (5). Finally, the nucleolin C-terminal domain, approximately 85 amino acids long, is strikingly rich in glycine

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) A27441.

‡ To whom correspondence and reprint requests should be addressed. Tel.: 61-33-59-58.

residues interspersed with dimethylarginine and phenylalanine (6). In an attempt to assign a function to this region, we have cloned the corresponding portion of the nucleolin cDNA into an E. coli expression vector and, in this way, obtained a purified preparation of the C-terminal domain (called p10).

In this paper, we investigate the nucleic acid binding properties of p10 and present a spectroscopic analysis of the structure of the polypeptide. Our results suggest that this domain is structured in repeated β-turns, centered on a repeat motif RGGF, leading to a nonspecific interaction with RNA in which the ribonucleic acid is both unstacked and unfolded. We propose that this region of nucleolin has a role in unfolding the specific binding site which is recognized by the central domain of the protein.

MATERIALS AND METHODS

Plasmid Constructions-All plasmids were propagated in Escherichia coli strain DH5α (Bethesda Research Laboratories). Plasmid pT7 Gly-Arg comprises part of the nucleolin cDNA (fragment NcoI-EcoRI of the PUCK 16.2 plasmid (7)) cloned in the BamHI site of the pAR 3040 plasmid (8). Plasmid pSMbMb carries part of the 5′-external transcribed spacer (fragment MboI-MboI of pSPSalB (9)) cloned in the BamHI site of pSP64 (Promega). Plasmid pNSMb-Mb carries the same DNA fragment cloned in the opposite direction.

In Vitro Transcription-RNA fragments to be used for protein binding assays were transcribed from plasmids into which restriction fragments spanning the nucleolin binding site had been cloned adjacent to the phage SP6 RNA polymerase promoter (pSMbMb) or from a control "nonspecific" plasmid (pTZ 18R from Pharmacia LKB Biotechnology Inc.). The plasmid was linearized with a restriction enzyme (HindIII for pSMbMb, cutting at the 3′-end of the cloned fragment, or DdeI for pTZ 18R) in order to produce run-off transcripts of the same length. RNA was synthesized in vitro with SP6 RNA polymerase according to Sambrook et al. (10) and, for the synthesis of radiolabeled RNA, 0.1 mM [α-32P]CTP (20 Ci/mmol) was included. To verify the quality of the transcript and determine its specific activity, electrophoresis was performed in a 2% agarose gel. The RNA was resuspended in TE buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA), and aliquots were used for in vitro binding assays. Unlabeled RNA was purified using a Sephadex G-50 spin column. The RNA was then precipitated, resuspended in phosphate buffer, quantitated by measuring the absorbance at 260 nm, and used for spectroscopy.

Labeling of Polydeoxyribonucleotides-Poly(dA) and double-stranded DNA were 5′-end-labeled with 32P using T4 polynucleotide kinase according to Sambrook et al. (10).

Peptide Production and Purification-E. coli K38 (11) containing both pGP1-2 (8) and the pAR recombinant plasmid pT7 Gly-Arg was grown in LB with 100 μg/ml kanamycin, 300 μg/ml ampicillin, 0.5 μM isopropyl-β-D-thiogalactopyranoside at 30 °C. When the culture reached an A600 = 0.6, the temperature was raised to 42 °C for 25 min. Rifampicin was then added to a final concentration of 200 μg/ml, and the culture was left at 42 °C for a further 25 min. The temperature was then shifted to 37 °C for an additional 2 h before harvesting. The cells were washed in TE buffer, resuspended in 1 ml of lysis buffer (50 mM Tris, pH 8.0, 1 mM EDTA, 1 mM dithiothreitol, 2 mg/ml leupeptin and pepstatin, 1 mM phenylmethylsulfonyl fluoride) and then disrupted by sonication. Glycerol was added to a final concentration of 30% (v/v), and the mixture was left for 30 min at 4 °C with gentle agitation and then centrifuged for 45 min at 100,000 × g. Ammonium sulfate (0.1 M final) and betaine (10% w/v) were added to the supernatant. This crude extract was loaded on a heparin-Sepharose column as for nucleolin (12), and the resulting fractions containing the peptide were further purified by cation exchange chromatography (Mono S from Pharmacia). The 10-kDa peptide "p10" was eluted with a continuous linear KCl gradient (0.2 to 1 M) made in 10 mM phosphate buffer, pH 7.4.

Filter Binding Assays-RNA transcripts at a concentration of 0.1 nM were incubated with 0 to 5 μM p10 in TMK buffer (20 mM Tris-HCl, pH 7.4, 4 mM MgCl2, 0.2 M KCl) containing 20% glycerol, 1 mM dithiothreitol, 0.5 mg/ml tRNA, 4 μg/ml bovine serum albumin, for 30 min at room temperature. The RNA-peptide mixtures were then filtered through a wet nitrocellulose membrane (Schleicher and Schuell BA 85120) under gentle suction. The filter was washed with TMK buffer and dried in an 80 °C oven under vacuum. The percentage of 32P retained on the filter was determined by Cerenkov counting in a scintillation counter. Apparent association constants Ka were calculated according to Gregory et al. (13).
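As a rough illustration of how an apparent association constant relates to a filter binding curve, the sketch below assumes a simple one-site equilibrium with the protein in large excess over the RNA (0.1 nM RNA against micromolar p10). This model is an assumption made for illustration; it is not the procedure of Gregory et al. (13), and the function names are hypothetical.

```python
def fraction_bound(ka_per_molar: float, protein_molar: float) -> float:
    """Fraction of RNA bound in a one-site model: Ka*[P] / (1 + Ka*[P]).

    Assumes free protein is approximately equal to total protein,
    which is reasonable only when protein greatly exceeds RNA.
    """
    x = ka_per_molar * protein_molar
    return x / (1.0 + x)


def half_saturation_concentration(ka_per_molar: float) -> float:
    """Protein concentration (M) at which half of the RNA is bound."""
    return 1.0 / ka_per_molar


if __name__ == "__main__":
    # Apparent Ka values quoted in the Results for the specific RNA.
    for name, ka in (("p10", 0.5e6), ("nucleolin", 2e7)):
        p_half = half_saturation_concentration(ka)
        print(f"{name}: half-saturation at {p_half:.2e} M")
```

Under this assumed model, a larger Ka simply shifts the midpoint of the binding curve to a lower protein concentration; filter retention efficiency is ignored here.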

Circular Dichroic Measurements-Circular dichroic spectra were recorded at 20 °C with a Jobin-Yvon dichrograph V connected to an Apple II microcomputer. A cell of 1-cm optical path length was used to record spectra of RNA and p10-RNA complexes in the near ultraviolet region (320-220 nm) at an RNA concentration of 10 μg/ml in 0.1 M KCl, 10 mM phosphate, pH 7.4. The results are presented as molar ellipticity values in degrees·cm²·dmol⁻¹, based on the nucleotide mean residue mass of 330 Da.

A cell of 0.1-mm optical path length was used to record spectra of p10 in the far ultraviolet region (260-190 nm) at a concentration of 2 mg/ml, in 10 mM phosphate, pH 7.4, containing either 0.7 M KCl or 0.1 M KCl. During scanning, we chose an integration time of 4 s. The results are presented as molar ellipticity values based on the amino acid mean residue mass of 110 Da.
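The conversion from a measured ellipticity to a per-residue molar ellipticity can be sketched as follows. The formula is the standard textbook relation; the assumption that the instrument reports millidegrees, the function name, and the example readings are illustrative and do not come from the paper.

```python
def mean_residue_ellipticity(theta_mdeg: float,
                             mean_residue_mass_da: float,
                             path_cm: float,
                             conc_mg_per_ml: float) -> float:
    """Return molar ellipticity per residue in deg cm^2 dmol^-1.

    [theta] = theta(mdeg) * M_residue / (10 * l(cm) * c(mg/ml))
    """
    return theta_mdeg * mean_residue_mass_da / (10.0 * path_cm * conc_mg_per_ml)


if __name__ == "__main__":
    # Conditions from the text; the observed readings are hypothetical.
    # p10: 0.1-mm cell (0.01 cm), 2 mg/ml, 110 Da per residue.
    print(mean_residue_ellipticity(-10.0, 110.0, 0.01, 2.0))
    # RNA: 1-cm cell, 10 ug/ml (0.01 mg/ml), 330 Da per nucleotide.
    print(mean_residue_ellipticity(6.0, 330.0, 1.0, 0.01))
```

Expressing both spectra per residue is what allows the RNA and peptide signals to be compared independently of concentration and cell path length.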

Infrared Spectroscopy Measurements-Infrared spectra were recorded at 20 °C with a Perkin-Elmer model 1760 Fourier transform spectrophotometer connected to a PE series 7000 computer and equipped with a high sensitivity narrow range mercury cadmium telluride detector. An attenuated total reflectance cell was used, the characteristics of the ZnSe crystal giving it an equivalent optical path length of 15 μm. For each spectrum from 2000 to 800 cm⁻¹, 1000 interferograms were co-added and Fourier-transformed to give a final resolution of 4 cm⁻¹. p10, at a concentration of 2 mg/ml, was dissolved in 10 mM phosphate, pH 7.4 containing either 0.7 M KCl or 0.1 M KCl. The resolution of the bands was enhanced by self-deconvolution, using a Lorentzian line-shape function with a band width at half-height of 15 cm⁻¹ and a resolution enhancement factor of 2 (14).


Molecular Modeling-Computer modeling was carried out using the Desktop Molecular Modeller software (version 1.2), developed by Crabbe and Appleyard (15). The energy minimization procedure used to refine peptide structures is based on the algorithm and data given by Vinter et al. (16).

RESULTS AND DISCUSSION

Nucleolin shares with other nucleic acid binding proteins such as fibrillarin (17-19), the yeast single-stranded DNA binding protein SSB1 (20), and the mammalian protein A1 (21), the feature of having a domain rich in glycine and arginine. It is thus of particular interest to investigate the nucleic acid binding properties of this domain and to compare such properties with those of the whole protein.

To this end, a cDNA corresponding to the C-terminal domain of CHO nucleolin was cloned and expressed in E. coli as detailed under "Materials and Methods" (see also Fig. 1A). The purified polypeptide, whose sequence is presented in Fig. 1B, gives a single band after polyacrylamide gel electrophoresis corresponding to a molecular weight of 14,000 (Fig. 1C). However, since the calculated molecular weight is only 10,000, we will refer to this polypeptide as p10 in the present report. The amino acid composition of p10 is remarkable in that it consists of a region, 52 residues long, composed largely of glycine interspersed with arginine and phenylalanine residues.
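The calculated molecular weight of about 10,000 can be checked approximately from the Fig. 1B sequence. The sketch below sums standard average residue masses and adds one water; the mass table and helper name are supplied here for illustration, and the last few residues of the sequence are omitted because they are not legible in this copy, so the result slightly underestimates the full polypeptide.

```python
# Standard average residue masses in Da (amino acid minus water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.015

# Legible part of the Fig. 1B sequence (vector-derived plus nucleolin residues).
P10_LEGIBLE = (
    "MASMTGGQQMGRGSMEDGEIDGNKVTLDWAKPKGEGGFGGRGGGRGGFGG"
    "RGGGRGGGRGGFGGRGRGGFGGRGGFRGGRGGGGGGGDFKPQGKKTK"
)


def average_mass(sequence: str) -> float:
    """Approximate average mass (Da) of an unmodified linear polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


if __name__ == "__main__":
    print(len(P10_LEGIBLE), "residues,", round(average_mass(P10_LEGIBLE)), "Da")
```

The estimate lands a little below 10 kDa, in line with the calculated value and well under the 14,000 apparent weight seen on the gel.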

Nucleic Acid Binding Properties of the Glycine-Rich C-terminal Domain of Nucleolin-Nucleolin specifically recognizes a site located in the 5′ external transcribed spacer of pre-rRNA (5). As shown in Fig. 2A, an in vitro RNA transcript containing this site is bound to a nitrocellulose membrane filter more efficiently in the presence of the entire nucleolin protein as compared with the C-terminal domain p10. The apparent association constants, Ka, are 0.5 × 10⁶ M⁻¹ in the case of p10 versus 2 × 10⁷ M⁻¹ in the case of nucleolin. Since p10 binds to this RNA with the same affinity as to the nonspecific RNA (Fig. 2A), we can conclude that the nucleolin C-terminal domain does not take part in the specific recognition of RNA, which is a property of the whole protein. In fact, p10 seems to be completely insensitive to the nature of the RNA sequence since filter binding experiments performed

[Fig. 1C gel: lanes a-d; size standards at 94, 67, 43, 30, 21, and 14 kDa; p10 band marked.]

B: MASMTGGQQMGRGSMEDGEIDGNKVTLDWAKPKGEGGFGGRGGGRGGFGG RGGGRGGGRGGFGGRGRGGFGGRGGFRGGRGGGGGGGDFKPQGKKTKE’E

FIG. 1. Expression in E. coli of the cDNA corresponding to the Gly-Arg region of hamster nucleolin. A, the NcoI-EcoRI fragment of nucleolin cDNA (7) has been cloned in the BamHI site of the E. coli expression vector pAR 3040 (8), giving rise to the pT7 Gly-Arg expression vector. B, sequence of the synthesized polypeptide. Amino acids in bold are from nucleolin (residues 629 to 713), and the remainder are derived from the vector pAR 3040. C, cells were transformed either with the plasmid pAR 3040 or with the recombinant plasmid pT7 Gly-Arg. Protein extracts were separated by polyacrylamide gel electrophoresis and stained with Coomassie Brilliant Blue. Lane a, size standards (in kDa). Lane b, total E. coli-pAR 3040 protein extract. Lane c, total E. coli-pT7 Gly-Arg extract. Lane d, the unique band corresponding to the purified p10 peptide.


[Fig. 2 plots: panel A, x axis [protein] (M); panel B, x axis [p10] × 10⁶ M.]

FIG. 2. Binding curves for the interaction of p10 with nucleic acids. The assays were carried out as described under "Materials and Methods" but in the absence of competitor tRNA in the case of B. A, the binding between specific RNA and either nucleolin or p10 is shown by continuous lines, while the dashed line represents the binding between nonspecific RNA and p10. B, the binding between p10 and double-stranded nucleic acids is shown by continuous lines and between p10 and single-stranded nucleic acids by dashed lines. The nucleic acids used are as follows: double-stranded DNA; a hybrid with two complementary RNAs; single-stranded RNA; and poly(dA). With the exception of poly(dA), all nucleic acids are approximately 750 nucleotides long.

with the different homopolyribonucleotides did not reveal any base preference (data not shown). Even more surprisingly, p10 does not appear to be very sensitive to the geometry of the nucleic acid either, as can be seen in Fig. 2B: duplex DNA, single-stranded DNA, or RNA with different degrees of base pairing all seem to be effective ligands of p10.

β-Spiral Formation in the Glycine-rich C-terminal Domain of Nucleolin-As a first approach toward determining the structural conformation of p10, we have used the techniques of circular dichroism (CD) and Fourier Transform Infrared Spectroscopy (FTIR). The CD spectra of p10 (Fig. 3A) in 0.7 M KCl (curve 1) and in 0.1 M KCl (curve 2) show none of the characteristics associated with the spectra of proteins rich in conventional α-helical or β-sheet conformations (for a review, see Ref. 22). They also differ from that associated with random coil, notably with regard to the intensity of the minimum at 200 nm and to the absence of a positive band at 220 nm. On the other hand, these spectra are fully consistent with the presence of type I (I') β-turns as reviewed by Smith and Pease (23). According to a previously established correlation between a low absolute value of negative ellipticity at 200 nm and a high β-turn content of a peptide chain (3), it can be deduced that an increase in ionic strength from 0.1 M KCl to 0.7 M KCl stabilizes the β-turn structure of p10. These conclusions are corroborated by the data from FTIR spectroscopy. The so-called amide I band in the 1700-1600 cm⁻¹

[Fig. 3 plots: panel A, x axis wavelength (nm); panel B, x axis wavenumber (cm⁻¹).]

FIG. 3. Spectroscopic analysis of the p10 peptide. A, circular dichroic spectra of p10 in 0.7 M KCl (curve 1) and 0.1 M KCl (curve 2). B, deconvolved Fourier-transform infrared spectra of p10 in 0.7 M KCl (curve 1) and 0.1 M KCl (curve 2). The two arrows point to the major bands of curve 1 at 1684 cm⁻¹ and 1672 cm⁻¹. In all cases, the buffer was 10 mM phosphate, pH 7.4.

[Fig. 4 sequence alignment (sequences not recoverable). Rows, with reference numbers: nucleolin from human (28), mouse (29), rat (30), hamster (7), chicken (31), and Xenopus (32); fibrillarin from rat (17), Xenopus (18), and Physarum (19); and one further sequence (20).]

FIG. 4. β-Spiral domains of several nucleolar proteins. The β-turn motif RGGF which is modeled in Fig. 5 is double underlined, and other potential β-turn stretches are single underlined. The four mammalian nucleolin sequences have been aligned by introducing a single amino acid gap. The locations of the sequences within the respective proteins are indicated by the numbers in superscript. Both rat and Physarum fibrillarin sequences come from partial amino acid N-terminal analyses, as shown by the series of dots.

frequency range is composed of the vibrational modes of the CO groups of the peptide backbone which are a function of its secondary structure. Two major bands, already present in 0.1 M KCl, stand out more clearly in the infrared spectrum of p10 in 0.7 M KCl (see arrows on curve 1 of Fig. 3B). The band at 1684 cm⁻¹ is associated with the β-turn conformation (24), and the band at 1672 cm⁻¹ has been correlated with the presence of contiguous β-turns (3, 25). In conclusion, both CD and FTIR results indicate that p10 can adopt a repeated β-turn structure.

Detailed studies of a synthetic polypentapeptide (VPGVG)n based on the repeat motif of elastin have provided direct evidence that the stacking of consecutive β-turns can form a "β-spiral" (26). More recently, Matsushima et al. (27) have built computer models of several proteins containing repeated β-turns and have emphasized the plausibility of such spiral structures. Our spectroscopic data strongly suggest that such a β-spiral is the most likely supersecondary structure of p10. Analysis of the sequence of the nucleolin C-terminal domain (Fig. 4) reveals in particular several repeats of the tetrapeptide RGGF, characteristic of nucleolin from different species as well as of other nucleolar proteins. This tetrapeptide is a good candidate to take a β-turn structure since the crystal structure of [Leu]enkephalin YGGFL has been shown to be a "G-G" β-turn with dihedral angles consistent with a type I' conformation (33). The tetrapeptide RGGF can similarly be modeled in a type I' β-turn as shown in Fig. 5A. There is the possibility


FIG. 5. Computer modeling of the repeat motif RGGF. A, a bonds-only model of the tetrapeptide with all the atoms labeled except hydrogen. Potential hydrogen bonding is indicated with stars (see text for more details). B, space-fill models of the same tetrapeptide seen from different angles after rotation about the x axis. Large arrows point to the phenylalanine ring, while arrowheads point to the guanidino-NH2 of the arginine.

of hydrogen bond formation between the CO of the first residue of the turn (arginine) and the NH of the fourth residue (phenylalanine), as in a classical "1-4" turn. The two glycine residues occupy the "corner" positions.

Fig. 5B depicts space-fill models of the tetrapeptide RGGF seen from slightly different angles after rotation around the x axis. In particular, the tight core of the turn is clearly visible in the right-hand image of Fig. 5B. The regular stacking of such turns could generate a β-spiral, with the phenylalanine and arginine side chains projecting away from the spiral axis.

In addition to the RGGF motif, inspection of the sequences shown in Fig. 4 reveals other elements such as RGGG which could form equivalent β-turns. As mentioned above, the two glycine residues form the key element of the β-turn, and on this basis we have indicated the stretches of sequence which are likely to adopt this secondary structure. The amino acids between the stretches (commonly GG) and the overlapping nature of the turn elements probably contribute to the flexibility of the β-spiral.
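The repeat structure discussed above can be made concrete by locating the candidate turn motifs in the glycine-rich stretch of the Fig. 1B sequence. This is a minimal sketch: the helper name is hypothetical, the stretch is transcribed from the legible part of Fig. 1B, and a match only marks a candidate β-turn, not a demonstrated one.

```python
def motif_positions(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every occurrence of motif,
    including overlapping occurrences."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


# Glycine-rich stretch of the hamster polypeptide, from Fig. 1B.
GLY_RICH = (
    "GEGGFGGRGGGRGGFGG"
    "RGGGRGGGRGGFGGRGRGGFGGRGGFRGGRGGGGGGGDFKPQGKKTK"
)

if __name__ == "__main__":
    for motif in ("RGGF", "RGGG"):
        hits = motif_positions(GLY_RICH, motif)
        print(motif, len(hits), hits)
```

On this stretch the scan finds four RGGF and four RGGG occurrences, mostly separated by GG spacers, which is the arrangement the text associates with a flexible spiral.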

RNA-Helix-destabilizing Properties of Nucleolin C-terminal Domain-In order to get further insight into the RNA binding properties of the nucleolin C-terminal domain, we have studied the interaction between p10 and an in vitro RNA transcript of 750 bases using circular dichroism.

CD measurements represent a useful tool to follow the structural behavior of RNA within an RNA-protein complex since at wavelengths above 250 nm only the nucleic acid contributes to the dichroic signal. As displayed in Fig. 6, p10 induces a significant perturbation of the spectrum of free RNA. There is a considerable drop in ellipticity at 267 nm from a value of 2.1 × 10⁴ for free RNA to a plateau value of 1.25 × 10⁴ for complexes with a p10/RNA fragment molar ratio greater than or equal to 4. This decrease in ellipticity indicates unstacking of bases and unfolding of the secondary structure. Similar spectral changes were observed upon binding of T4 phage gene 32 protein (GP32) and fd phage gene V protein to poly(rA) and have been ascribed to an increased base-base distance and a substantial tilt of the bases in the complex (34). The opening up of base-base stacks could certainly be promoted by p10 phenylalanine residues, since it has been shown that an aromatic residue inserts only partially between the bases of a nucleic acid helix, inducing a new pattern of stacking (35).

[Fig. 6 plot: x axis wavelength (nm), 220-300.]

FIG. 6. Spectroscopic analysis of p10-RNA complexes. Circular dichroic spectra of free RNA (curve marked with an arrow) and of p10-RNA complexes (the two superimposed curves), for p10/RNA molecular ratios of 4 and 12. A, absorbance.
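The size of the CD change reported in the text can be put in numbers with a short calculation. The sketch below uses only the values quoted above; the function names are illustrative, and the nucleotides-per-peptide figure assumes, as a simplification, that every p10 molecule is bound at the plateau ratio.

```python
def fractional_decrease(initial: float, final: float) -> float:
    """Fraction by which a signal falls from initial to final."""
    return (initial - final) / initial


def nucleotides_per_peptide(rna_length_nt: int, peptides_per_rna: float) -> float:
    """Average number of nucleotides per bound peptide (assumes full binding)."""
    return rna_length_nt / peptides_per_rna


if __name__ == "__main__":
    # Ellipticity at 267 nm: free RNA versus plateau value in the complex.
    drop = fractional_decrease(2.1e4, 1.25e4)
    print(f"ellipticity drop: {drop:.1%}")
    # 750-base transcript, plateau reached at a molar ratio of 4.
    print(nucleotides_per_peptide(750, 4), "nucleotides per p10")
```

A loss of roughly 40% of the 267 nm signal at about one peptide per 190 nucleotides gives a sense of how few p10 molecules are needed to reach the plateau.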

Function of Nucleolin C-terminal Domain and Relevance to Other Proteins-Because of its particular composition of gly- cine, arginine, and phenylalanine residues regularly distrib- uted within a repeat motif, nucleolin C-terminal domain might adopt a helical conformation made of repeated 0-turns. Although our present results do not give us access to the parameters of the resulting P-spiral (i.e. its pitch and the number of repeats per pitch), we can correlate a certain number of its structural features with the ability of nucleolin C-terminal domain to bind and unwind RNA. We now have to consider the possible function of these domains in nucleolin and in related proteins.

The regularity of arginine and phenylalanine side chains projecting outside the central core of the spiral structure will presumably create electrostatic and hydrophobic ridges prone to interact with the RNA phosphate backbone and bases, respectively. The hydrophobic interaction takes place at the expense of base stacking and induces a subsequent loss of RNA secondary structure. The fact that most of the arginines are present in vivo as p p - d i m e t h y l derivatives ( 6 ) should not impair the RNA binding properties of the nucleolin C - terminal domain as this modification does not alter the posi- tive charge. Finally, as shown by Matsushima et aZ. (27), an important feature of P-spirals is their capacity to adopt dif- ferent, but related, conformational states. As emphasized by the authors, P-spirals possess a degree of flexibility not readily available to conventional secondary structures such as the a- helix. If the glycine-rich domain of nucleolin is indeed highly flexible, this could explain why it does not show any specificity toward RNA sequence or geometry. Nucleolin is associated with preribosomal RNA and is involved in the early stages of preribosome assembly (36,37) . Among the several RNA bind- ing sites which have been mapped, the one for which nucleolin displays the strongest affinity has been located in the external transcribed spacer ( 5 ) . The specific binding of nucleolin to this site can be ascribed to its four RNA binding domains.’

¹ L. Ghisolfi, A. Kharrat, G. Joseph, F. Amalric, and M. Erard, submitted for publication.


Although, as we have shown, the glycine-rich domain cannot be directly involved in the specific recognition mechanism, its presence is nevertheless indispensable, since it increases the affinity of the RNA-binding domains by an order of magnitude.¹ Taking into account its RNA helix-destabilizing properties, one obvious role for the nucleolin C-terminal domain would thus be to render the specific RNA binding site accessible by unfolding those regions responsible for steric hindrance. The fact that a small number of p10 molecules is sufficient to unstack 750 bases of RNA shows the cooperativity of the “melting” process which must be propagated over a long distance.
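As a rough guide to what an order-of-magnitude gain in affinity means energetically, the standard relation between binding constants and free energy can be applied. The following is only an illustrative estimate: the temperature (25 °C) and the symbols K_with and K_without (association constants of the RNA-binding domains with and without the glycine-rich domain) are assumptions introduced here, not values reported in this work.

```latex
% Illustrative estimate only; T = 298.15 K is an assumed temperature.
\Delta\Delta G^{\circ}
  = -RT \ln\!\left(\frac{K_{\mathrm{with}}}{K_{\mathrm{without}}}\right)
  = -RT \ln 10
  \approx -(8.314\ \mathrm{J\,mol^{-1}\,K^{-1}})(298.15\ \mathrm{K})(2.303)
  \approx -5.7\ \mathrm{kJ\,mol^{-1}}
  \approx -1.4\ \mathrm{kcal\,mol^{-1}}
```

Under these assumptions, a 10-fold increase in affinity corresponds to a modest stabilization, comparable in magnitude to a small number of weak non-covalent contacts.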

Several other proteins possess a domain homologous to the nucleolin C-terminal domain (Fig. 4). Fibrillarin, a 34-36-kDa nucleolar protein, also contains glycine-dimethylarginine clusters interspersed with phenylalanine (17-19) and is likely to interact with the small nucleolar-specific snRNA U3 (17). Similar domains are also present in a yeast 32-kDa nucleolar protein SSB1 (20) and in the rat protein A1 (21). The additional presence of a specific RNA-binding domain within the A1 protein (21) and a putative RNA-binding region within Xenopus fibrillarin (18) suggests that, as with nucleolin, such domains are often coupled with a glycine/dimethylarginine RNA unfolding domain. It is intriguing that these latter domains are often situated at the extremities of their respective proteins. Is this a requirement for their activity, or a reflection of the manner in which the proteins have evolved?

There are two other plausible, but not exclusive, functions which could be envisaged for these domains. The first possibility is a role in the formation of protein-protein interactions; for example, β-spirals are well known to intertwine within the elastin fiber (26). Secondly, this C-terminal domain may be important for nucleolar targeting, since it is striking that nucleolin, fibrillarin, and SSB1 are all nucleolar proteins. In line with this, the p68 RNA helicase from yeast (38) contains several repetitions of a RGGY motif and is located in the nucleolus at a specific stage of the cell cycle.
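Repeats such as RGGF or RGGY can be located in a protein sequence with a simple pattern scan. The sketch below is a minimal illustration and not the procedure used in this work: the function name `find_motif` and the sequence `toy_sequence` are hypothetical, and the toy sequence is an invented string, not the sequence of nucleolin, fibrillarin, or p68.

```python
import re


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return the 0-based start positions of every occurrence of `motif`.

    A zero-width lookahead is used so that overlapping occurrences
    are also reported.
    """
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence for illustration only (NOT a real protein sequence).
toy_sequence = "GGRGGFGGRGGYGGFRGGRGGFGG"

rggf_positions = find_motif(toy_sequence, "RGGF")
rggy_positions = find_motif(toy_sequence, "RGGY")

print("RGGF starts:", rggf_positions)
print("RGGY starts:", rggy_positions)
```

Applied to a real sequence retrieved from a database, the same scan would give the number and spacing of the repeats, which is the kind of regularity discussed above.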

Acknowledgments-We are grateful to David Barker for his critical reading of the manuscript and to Edouard Barbey for his photographic skills.

REFERENCES
1. Bouche, G., Caizergues-Ferrer, M., Bugler, B. & Amalric, F. (1984) Nucleic Acids Res. 12, 3025-3035
2. Erard, M., Belenguer, P., Caizergues-Ferrer, M., Pantaloni, A. & Amalric, F. (1988) Eur. J. Biochem. 175, 525-530
3. Erard, M., Lakhdar-Ghazal, F. & Amalric, F. (1990) Eur. J. Biochem. 191, 19-26
4. Bugler, B., Bourbon, H., Lapeyre, B., Wallace, M., Chang, J. H., Amalric, F. & Olson, M. (1987) J. Biol. Chem. 262, 10922-10925
5. Ghisolfi, L., Joseph, G., Erard, M., Escoubas, J. M., Mathieu, C. & Amalric, F. (1990) Mol. Biol. Rep. 14, 113-114
6. Lapeyre, B., Amalric, F., Ghaffari, S., Rao, V., Dumbar, T. & Olson, M. (1986) J. Biol. Chem. 261, 9167-9173
7. Lapeyre, B., Bourbon, H. & Amalric, F. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 1472-1476
8. Tabor, S. & Richardson, C. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 1074-1078
9. Bourbon, H., Michot, B., Hassouna, N., Feliu, J. & Bachellerie, J. P. (1988) DNA (NY) 7, 181-191
10. Sambrook, J., Fritsch, E. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, New York
11. Russel, M. & Model, P. (1984) J. Bacteriol. 159, 1034-1039
12. Caizergues-Ferrer, M., Belenguer, P., Lapeyre, B., Amalric, F., Wallace, M. & Olson, M. (1987) Biochemistry 26, 7876-7883
13. Gregory, R., Cahill, P., Thurlow, D. & Zimmermann, R. (1988) J. Mol. Biol. 204, 295-307
14. Kaupinnen, J., Moffat, D., Mantsch, H. & Cameron, D. (1981) Appl. Spectrosc. 35, 271-276
15. Crabbe, J. & Appleyard, J. (1989) Desktop Molecular Modeller V1.2, Oxford University Press, Oxford
16. Vinter, J., Davis, A. & Saunders, M. (1987) J. Comput. Aided Mol. Des. 1, 31-51
17. Lischwe, M., Ochs, R., Reddy, R., Cook, R., Yeoman, L., Tan, E., Reichlin, M. & Busch, H. (1985) J. Biol. Chem. 260, 14304-14310
18. Lapeyre, B., Mariottini, P., Mathieu, C., Ferrer, P., Amaldi, F., Amalric, F. & Caizergues-Ferrer, M. (1990) Mol. Cell. Biol. 10, 430-434
19. Christensen, M. & Fuxa, K. (1988) Biochem. Biophys. Res. Commun. 155, 1278-1283
20. Jong, A., Clark, M., Gilbert, M., Oehm, A. & Campbell, J. (1987) Mol. Cell. Biol. 7, 2947-2955
21. Cobianchi, F., SenGupta, D., Zmudzka, B. & Wilson, S. (1986) J. Biol. Chem. 261, 3536-3543
22. Johnson, C. (1988) Annu. Rev. Biophys. Biophys. Chem. 17, 145-166
23. Smith, J. & Pease, L. (1980) CRC Crit. Rev. Biochem. 8, 315-399
24. Byler, D. & Susi, H. (1986) Biopolymers 24, 469-487
25. Renugopalakrishnan, V., Strawich, E., Horowitz, P. & Glimcher, M. (1986) Biochemistry 25, 4879-4887
26. Urry, D. (1982) Methods Enzymol. 82, 673-717
27. Matsushima, N., Creutz, C. & Kretsinger, R. (1990) Proteins 7, 125-155
28. Srivastara, M., Fleming, P., Pollard, H. & Burns, A. (1989) FEBS Lett. 250, 99-105
29. Bourbon, H., Lapeyre, B. & Amalric, F. (1988) J. Mol. Biol. 200, 627-638
30. Bourbon, H. & Amalric, F. (1990) Gene (NY) 88, 187-196
31. Maridor, G. & Nigg, E. (1990) Nucleic Acids Res. 18, 1286
32. Caizergues-Ferrer, M., Mariottini, P., Curie, C., Lapeyre, B., Gas, N., Amalric, F. & Amaldi, F. (1989) Genes & Dev. 3, 324-333
33. Smith, D. & Griffin, J. (1978) Science 199, 1214-1216
34. Scheerhagen, M., Blok, J. & van Grondelle, R. (1985) J. Biomol. Struct. Dyn. 2, 821-829
35. Gabbay, E., Sanford, K., Baxter, C. & Kapicak, L. (1973) Biochemistry 12, 4021-4029
36. Bourbon, H., Caizergues-Ferrer, M., Amalric, F. & Zalta, J. P. (1983) Mol. Biol. Rep. 9, 39-47
37. Herrera, A. & Olson, M. (1986) Biochemistry 25, 6258-6263
38. Iggo, R., Jamieson, D., MacNeill, S., Southgate, J., McPheat, J. & Lane, D. (1991) Mol. Cell. Biol. 11, 1326-1333