

THE JOURNAL OF BIOLOGICAL CHEMISTRY Vol. 269, No. 34, Issue of August 26, pp. 21547-21554, 1994 © 1994 by The American Society for Biochemistry and Molecular Biology, Inc. Printed in U.S.A.

The Structure of a 38-kDa Leucine-rich Protein (Chondroadherin) Isolated from Bovine Cartilage*

(Received for publication, April 25, 1994, and in revised form, June 6, 1994)

Peter J. Neame‡§¶, Yngve Sommarin‖, Raymond E. Boynton‡, and Dick Heinegård‖

From the ‡Shriners Hospital for Crippled Children, Tampa, Florida 33612, the §Department of Biochemistry and Molecular Biology, University of South Florida School of Medicine, Tampa, Florida 33612, and the ‖Department of Medical and Physiological Chemistry, University of Lund, Sweden

A leucine-rich protein, chondroadherin, has been isolated from dissociative extracts of articular cartilage, and its primary structure has been determined by both direct protein sequencing and DNA sequence analysis of polymerase chain reaction products and cDNA clones. This protein is identical to the 36-kDa protein which was isolated by Larsson et al. (Larsson, T., Sommarin, Y., Paulsson, M., Antonsson, P., Hedbom, E., Wendel, M., and Heinegård, D. (1991) J. Biol. Chem. 266, 20428-20433). It has 337 amino acids and exists in several isoforms. The two major isoforms are a form with a calculated molecular weight of 38,353 and a pI of 9.76 and a smaller form with a calculated molecular weight of 37,304 and a pI of 9.5. The two isoforms result from a cleavage near the C terminus. A further level of heterogeneity is found in that an extra alanine can be found prior to the N-terminal cysteine. There are 9 cysteines; disulfide bonds have been directly identified between Cys282-Cys324 and Cys284-Cys304. The principal feature of the protein is a series of 10 leucine-rich repeats. The most N-terminal of these repeats contains a cysteine (Cys23) which is not disulfide-bonded and which is difficult to derivatize. It is likely that this free cysteine is involved in structure-stabilizing hydrogen bonding. The mRNA is approximately 1.6 kilobases, of which 511 base pairs is a 3'-untranslated region between the stop codon and the polyadenylation signal. Based on anchored polymerase chain reaction analysis of the mRNA, there is some minor heterogeneity in the position of the 5' end of the message.

Cartilage is largely extracellular matrix. As a result of this, proteins associated with the organization of extracellular matrix are particularly abundant. The majority of the matrix consists of an underhydrated glycosaminoglycan-containing environment. The constraints on diffusion of larger molecules (>20 kDa (1)) that this matrix imposes might also result in an unusual abundance of proteins involved in the regulation of the chondrocyte phenotype. With a view to identifying candidates for either of these functions, we have embarked on a systematic characterization of abundant cartilage matrix macromolecules.

* This work was supported by the Shriners of North America and National Institutes of Health Research Grant AR35322 (to P. J. N.) and the Swedish Medical Research Council, Folksam's Stiftelse, Margaret and Axel Ax:son Johnson's Stiftelse, Greta and Johan Kocks Stiftelse, and Konung Gustaf V:s 80-årsfond (to D. H. and Y. S.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) U08018.

¶ To whom correspondence and reprint requests should be addressed: Shriners Hospital for Crippled Children, 12502 N. Pine Dr., Tampa, FL 33612. Tel.: 813-975-7146; Fax: 813-978-9442 or 813-975-7146; E-mail: p.neame@genie.geis.com.

The main proteinaceous macromolecules of cartilage extracellular matrix are usually considered to be collagen type II and the aggregating keratan/chondroitin sulfate proteoglycan, aggrecan. However, cartilage contains a variety of abundant extracellular proteins and glycoproteins in addition to collagen and the well characterized proteoglycans. The classical example of these proteins is link protein, which stabilizes the complex of aggrecan and hyaluronic acid. Mature articular cartilage contains a more limited spectrum of macromolecules than developing cartilage. Thus, fetal cartilage contains large quantities of pleiotrophin (2), chondromodulin (3), and N-terminal propeptide of type XI collagen α2 (PARP) (4), whereas these are either absent or in considerably lower abundance in mature cartilage. For a review of proteins associated with the cartilage phenotype, see Heinegård and Oldberg (5).

Proteins containing the leucine-rich repeat (LRR)¹ have been discovered in increasing numbers since their identification as a family by Patthy (6). The earliest characterized protein in this family was the serum leucine-rich glycoprotein (7). Other members of this diverse family consist of the leucine-rich proteoglycans, fibromodulin (8), lumican (9), biglycan (10, 11), PG-Lb (12), and decorin (13), as well as adenyl cyclase (14), the Drosophila photoreceptor cell adhesion protein, chaoptin (15), the proteins in the platelet adhesion glycoprotein complex (16), CD14 (17), the Drosophila dorsal-ventral differentiation gene, Toll (18), and ribonuclease inhibitor (19). The crystal structure of the latter has recently been solved and refined at 2.5-Å resolution (20). Both mature and fetal cartilage contain the leucine-rich proteoglycans, biglycan, decorin (21), and fibromodulin (8). The LRR proteins typically contain an N-terminal region with 2 or 4 cysteines arranged in a characteristic pattern. This is followed by the repeating leucine-rich motifs, which vary in number between different members of the LRR family, and these are followed in turn by another cysteine-containing section, with either 2 or 4 cysteines. The LRR proteoglycans have additional amino acids preceding the first cysteine which, in decorin and biglycan, usually have attached glycosaminoglycans.

A 36-kDa protein, derived from cartilage, which is 14% leucine, has been described by Larsson et al. (22), and it was speculated that this protein might also be a member of the LRR protein family. This protein can mediate chondrocyte attachment to plastic culture dishes (23). There are two isoforms of differing molecular weight. The higher and lower molecular weight forms have been separated by ion exchange chromatography and, based on tryptic peptide mapping, are very similar, if not identical (24). The higher molecular weight form is more cationic, as indicated by its enhanced retention on CM-cellulose.

¹ The abbreviations used are: LRR, leucine-rich repeat; RACE, rapid amplification of cDNA ends; PAGE, polyacrylamide gel electrophoresis; HPLC, high performance liquid chromatography; PTH, phenylthiohydantoin; PCR, polymerase chain reaction; UTR, untranslated region; bp, base pair(s).

We report here the structure of an abundant, leucine-rich protein (chondroadherin), derived from bovine articular cartilage, which corresponds to the 36-kDa protein. The structure was derived from a combination of Edman degradation of peptide fragments, sequence analysis of PCR products, and cDNA sequencing.

EXPERIMENTAL PROCEDURES

Materials-Articular cartilage was from the occipital condyles of 2-year-old calves and was obtained fresh from local slaughterhouses. Gel filtration media (Sephacryl S-300 and Superose 12) were from Pharmacia Biotech Inc. Proteases were obtained from Boehringer Mannheim and were sequencing grade. Protein sequencing reagents and HPLC columns for analysis of amino acid derivatives were obtained from Applied Biosystems International. Peptide separation was performed on Brownlee RP-300 or Vydac octadecylsilane (ODS) columns (Rainin, Emeryville, CA) with solvents from Burdick and Jackson (Muskegon, MI) using methods described elsewhere (4). General reagents were from Sigma. Reagents for PCR amplification were obtained from Perkin-Elmer.

Tissue Extraction-Methods for extracting and purifying the cartilage proteins were as described previously (4, 10). Briefly, they were as follows. Cartilage was thinly sliced ( 4 mm thickness) and extracted with 4 M guanidine hydrochloride in the presence of protease inhibitors. Extracts were dialyzed to bring the concentration of guanidine to 0.4 M, reassociating the proteoglycan aggregates. Aggregates were removed by isopycnic cesium chloride density gradient centrifugation (25). The region of the gradient with a density <1.3 g/ml (A4 fraction) was used for subsequent purification.

Protein Purification-Chondroadherin was purified by gel filtration on either a Superose 12 or a Sephacryl S-300 column equilibrated in 4 M guanidine hydrochloride buffered to pH 6.5 with Tris-HCl. Individual fractions were fractionated by reversed-phase HPLC as described previously for the small, cartilage-derived glycoprotein (SCGP) and the N-terminal propeptide of type XI collagen α2 (PARP) (4). Either Brownlee RP-300 (C8) or Vydac ODS (C18) columns (0.46 × 30 cm) were used. Chondroadherin elutes later than link protein on a reversed-phase column. The purification scheme is illustrated in Fig. 1. Alternatively, chondroadherin can be purified by sequential DEAE- and CM-cellulose ion exchange chromatography at neutral pH. A final CM-cellulose step at a pH of 5 results in the protein being pure by SDS-PAGE (22).

Reduction and Carboxymethylation-Reduction and S-carboxymethylation with iodoacetic acid were as described previously (10). The phenylthiohydantoin (PTH)-derivative of S-carboxymethylcysteine elutes between serine and glutamine using Applied Biosystems standard chromatographic conditions for their PTH-derivative analysis columns and a 120A microbore HPLC.

Peptide Fragment Generation-Proteolytic digestion with either endoprotease Lys-C, endoprotease Asp-N, or endoprotease Glu-C was performed at enzyme:substrate ratios of 1:50 according to the manufacturer's literature. Tryptic digestion of proteins which had been Western-blotted onto Immobilon-PSQ (Millipore) was performed using the method of Fernandez et al. (26). Peptides were separated by reversed-phase HPLC on a Vydac C18 column (2.1 × 30 mm), eluted with a gradient of acetonitrile (3-40% in 55 min) in 0.1% trifluoroacetic acid at a flow rate of 0.2 ml/min. Eluant was monitored at 220 nm.

Protein Sequencing Strategy-Initial digests of reduced and carboxymethylated chondroadherin were performed using endoprotease Lys-C and endoprotease Asp-N. Peptides were isolated by reversed-phase HPLC and sequenced. The sequences resulting from alignment of overlapping peptides in these two digests fell into three large, non-overlapping blocks (Fig. 2).

Oligonucleotide Synthesis-Oligonucleotides were designed from the degenerate reverse-translated peptide sequence using the program "Oligo™" (National Biosciences, Plymouth, MN). Oligonucleotides were synthesized on an Applied Biosystems 391A DNA synthesizer and purified on an OPC cartridge (Applied Biosystems) after cleavage and deprotection.

PCR Strategy-PCR amplification was performed using either a Perkin-Elmer thermocycler or a Hybaid Omnigene thermocycler. As the initial protein sequence data divided into three sets of overlapping peptides, a nested PCR strategy was used to both join the protein sequences together in the correct orientation and to obtain a new, DNA-deduced, protein sequence. Two degenerate forward primers (P1 and P2, Table I) were designed from the N-terminal sequence, together with two degenerate reverse primers (AP1 and AP2, Table I) from the other two blocks of protein sequence. The template was bovine chondrocyte oligo(dT)-primed cDNA from newborn epiphyseal cartilage that had been maintained in explant culture for 3 days. RNA for the cDNA synthesis was isolated using the method of Smale and Sasse (27). Initial amplification was for 30 cycles at an annealing temperature of 50 °C using P1 as the forward primer.

The products of the first amplification (1 µl in each case) were then re-amplified for an additional 20 cycles at 50 °C using P2 as the forward primer. The products were analyzed by agarose gel electrophoresis (NuSieve GTG, FMC Corp.). Excised bands were purified on glassmilk (Wizard™ PCR preps, Promega) for reamplification.
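As a rough illustration of what "degenerate" means for the forward primers, the sketch below counts how many distinct oligonucleotides each primer pool contains. The primer strings are those listed in Table I; the mapping of each code to a number of bases follows the standard IUPAC ambiguity codes, and counting inosine as a single base is an assumption of this sketch rather than something stated in the text.

```python
# Number of alternative bases represented by each symbol (standard IUPAC codes).
# Assumption: inosine (I) is one synthesized base, so it contributes a factor of 1.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "Y": 2,  # C/T
    "R": 2,  # A/G
    "W": 2,  # A/T
    "S": 2,  # C/G
    "M": 2,  # A/C
    "H": 3,  # A/C/T
    "B": 3,  # C/G/T
}

# Forward primers as printed in Table I (spaces separate codons).
PRIMERS = {
    "P1": "CCH AAR GTB WSH GAR AAR A",
    "P2": "CAR MGI AAY AAY TTY CCI GT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences in a degenerate primer pool."""
    count = 1
    for symbol in primer.replace(" ", "").upper():
        count *= IUPAC_CHOICES[symbol]
    return count


for name, sequence in PRIMERS.items():
    print(name, degeneracy(sequence))
```

Under these assumptions P1 is an 864-fold degenerate pool and P2 a 32-fold pool, which is one plausible reason for using the less degenerate P2 as the inner primer in the second round.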

RACE Analysis-The procedure was as described by Clontech in the manual for their 5'-Amplifinder™ RACE kit. The signal peptide sequence of chondroadherin was obtained by synthesizing cDNA with primer RACE-P1, followed by ligation of a synthetic anchor oligonucleotide to the cDNA as described in the Clontech manual. The Clontech-supplied forward primer together with primer RACE-P2 were then used to isolate a 220-bp PCR product using the ligated cDNA as template. The band was somewhat diffuse and was sequenced with primer RACE-P2.

PCR Product Sequencing-DNA sequencing was performed by Sequetech (Mountain View, CA) using fluorescent dye termination sequencing chemistry and analysis of the reaction products on an Applied Biosystems 373A DNA sequencer.

Northern Blotting and cDNA Library Screening-Whole RNA was prepared from freshly isolated bovine tracheal chondrocytes by the method of Adams et al. (28). Ten µg of the total RNA was electrophoresed, transferred to nylon membrane (Hybond, Amersham, UK), and hybridized with random-primed, ³²P-labeled probe based on the 350-bp PCR product of primers P2 and AP1 (Table I) according to standard protocols. The filter was washed with 0.2 × SSPE, 0.1% SDS at 42 °C for 15 min, followed by 0.2 × SSPE, 0.1% SDS at 55 °C for 15 min, and finally with 0.1 × SSPE, 0.1% SDS at 55 °C for 15 min.

A λgt11 cDNA library prepared from bovine tracheal chondrocytes (8) was screened with either the PCR product of primers M1 and AP3 or the product of primers P2 and AP1 (Table I). After subcloning of EcoRI fragments into Bluescript II KS vector (Stratagene), DNA sequence was obtained by standard double-stranded dideoxy termination sequencing using T3, T7, and internal sequencing primers.

RESULTS

Sequential gel permeation chromatography and subsequent reversed-phase HPLC chromatography of the low buoyant density fraction from associative cesium chloride gradients of dissociatively extracted bovine articular cartilage is an effective way of preparing samples for N-terminal analysis. We have identified a protein in the 30-40-kDa range which eluted late from a reversed-phase column and which had an N-terminal amino acid sequence which was not in the GenBank™ or NBRF-Protein Identification Resource data bases.

Isolation and Preliminary Characterization-Dissociative gel filtration analysis of the upper fraction of an isopycnic cesium chloride gradient fractionation of dissociative extracts of articular cartilage results in rapid separation of low molecular weight protein (<50,000) from small proteoglycans. Subsequent reversed-phase chromatography enables individual components to be identified by N-terminal sequencing.

A novel protein, eluting later than fibromodulin on a gel filtration column (Fig. 1), was identified by N-terminal sequencing. Chondroadherin elutes late on a reversed-phase column eluted with an acetonitrile gradient using trifluoroacetic acid as the ion-pairing reagent. On SDS-PAGE analysis, the HPLC-purified protein appeared as 2 bands with apparent sizes of 36 and 37 kDa. Sequence analysis of Coomassie blue-stained Western-blotted protein showed that both bands had identical N termini. Tryptic digestion of the bands and separation of the resulting peptides on a reversed-phase column showed them to be essentially identical proteins (not shown).

The directly determined N-terminal amino acid sequence was clear for the first 20-30 cycles of Edman degradation. However, there was what appeared to be a high level of lag. After 30 cycles, the background obliterated the signal. An alanine was consistently observed in the first cycle, as well as carboxymethylcysteine in cases where the sample was reduced and carboxymethylated.

[Figure 1: main chromatogram plotted against fraction number (0-50), with a peak labeled fibromodulin; inset chromatogram plotted against time (minutes).]

FIG. 1. Preparation of chondroadherin from the low buoyant density fraction of a cesium chloride density gradient separation of dissociatively extracted articular cartilage. 2-year old bovine articular cartilage was extracted with 4 M guanidine hydrochloride, proteoglycans were reassociated by dialysis, and the extract was fractionated by cesium chloride density gradient centrifugation. The low density fraction was applied to a Sephacryl S-300 column and monitored at 280 nm as described in the text (main chromatogram). Individual fractions were applied to a reversed-phase column, as described (inset chromatogram shows fraction 24), and individual proteins were identified by N-terminal sequence analysis or by peptide mapping followed by sequence analysis of peptides.

Protein Sequencing-The majority of the protein sequence was obtained by sequence analysis and alignment of overlapping peptides obtained by digesting the reduced and carboxymethylated protein with endoprotease Lys-C, endoprotease Glu-C, and endoprotease Asp-N (Fig. 2). Additional sequence was obtained from tryptic peptides obtained after digesting Western-blotted proteins with trypsin as described by Fernandez et al. (26). Final alignment of sequence fragments was by sequence analysis of PCR fragments (Fig. 2).

Two peptides from a digest with endoprotease Asp-N had identical N termini but differing C termini. The differing C termini could not easily be explained either by incomplete or excessive digestion by the enzyme as the C terminus of the shorter peptide would have to have resulted from a cleavage between Thr and Lys. This is unlikely, in our experience, with this enzyme. It seems likely, therefore, that the protein has been subjected to processing at the C terminus.

Further evidence for heterogeneity at the C terminus came from a digestion of native protein with endoprotease Glu-C (Staphylococcal V8 protease). Digestion products were separated by gel filtration followed by reversed-phase HPLC. A group of peptides with different molecular weights but with identical N termini were subdigested with chymotrypsin. The peptides were derived from the C terminus and included the two C-terminal disulfide bonds (Fig. 3). As one of these disulfide-bonded peptides was larger than the other, the data are consistent with the appearance of two bands on SDS-PAGE. The difference in predicted molecular weight (1,049) is consistent with the difference in size observed on an SDS-PAGE gel.
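The molecular weight difference quoted above follows directly from the two calculated values given in the summary (38,353 and 37,304). The short sketch below reproduces that subtraction and, as a ballpark only, converts it into an approximate number of residues. The average residue mass of 110 Da is a generic rule of thumb assumed here, not a value from the paper, so the residue estimate should not be read as the actual length of the removed C-terminal segment.

```python
# Calculated molecular weights of the two major isoforms, as stated in the summary.
LARGE_FORM_MW = 38_353
SMALL_FORM_MW = 37_304

# Assumption: generic average mass of one amino acid residue in a protein (Da).
AVERAGE_RESIDUE_MASS = 110.0

mass_difference = LARGE_FORM_MW - SMALL_FORM_MW
approx_residues_removed = mass_difference / AVERAGE_RESIDUE_MASS

print(f"Mass difference: {mass_difference} Da")
print(f"Approximate residues removed: {approx_residues_removed:.1f}")
```

A difference of about 1 kDa matches the spacing of the 36- and 37-kDa bands on SDS-PAGE, and corresponds to a C-terminal truncation on the order of ten residues under the stated assumption.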

PCR Product Sequencing-The protein sequence derived from overlapping peptide fragments comprised three blocks of data (Fig. 2). In order to determine the order of these blocks, two sense strand primers (P1 and P2) were designed from the known N-terminal region, and two antisense primers (AP1 and AP2) were designed from the protein sequence in the other two blocks (Table I) and used for nested PCR. Initial amplification of cDNA from RNA derived from newborn calf chondrocytes was performed using P1 as the forward primer. The crude products from this amplification (1 µl) were then analyzed using P2 as the forward primer. A strong 350-bp band was seen with primers P2 and AP1 using either crude P1/AP1 or crude P1/AP2 as template. A faint 600-bp band was seen with P1 and AP2. When the gel-purified 600-bp band seen as the product of P1/AP2 was used as a template for primers P2 and AP1, a 350-bp product was also seen. This conclusively ordered the three non-overlapping blocks of protein sequence.
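The band sizes reported for the nested PCR can be checked against the primer locations listed in Table I. The sketch below is a minimal calculation, assuming that the Table I locations are nucleotide coordinates on a single numbering scheme and that each location is written from the primer's 5' end to its 3' end (so reverse primers run from the higher number to the lower). The function name is illustrative only.

```python
# Primer coordinates (5' end, 3' end) as listed in Table I.
# Reverse (antisense) primers are written high-to-low.
PRIMER_SPANS = {
    "P1": (186, 206),
    "P2": (224, 243),
    "M1": (473, 493),
    "AP1": (590, 570),
    "AP2": (810, 791),
    "AP3": (865, 845),
}


def expected_amplicon_bp(forward: str, reverse: str) -> int:
    """Length from the 5' end of the forward primer to the 5' end of the reverse primer."""
    forward_5prime = PRIMER_SPANS[forward][0]
    reverse_5prime = PRIMER_SPANS[reverse][0]
    return reverse_5prime - forward_5prime + 1


for pair in [("P1", "AP2"), ("P1", "AP1"), ("P2", "AP1"), ("M1", "AP3")]:
    print(pair, expected_amplicon_bp(*pair))
```

With these coordinates the P2/AP1 product comes out at 367 bp and the P1/AP2 product at 625 bp, in reasonable agreement with the roughly 350-bp and 600-bp bands estimated from gels, while M1/AP3 gives 393 bp, the size quoted for that probe.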

The product of P2/AP1 was sequenced in both directions, with clearer sequence being obtained in the sense direction. This sequence, together with peptide-derived sequence, was then used as the basis for designing new primers which made it possible to obtain overlapping sequenceable PCR products in the direction of both the N and C termini. The locations of the sequenced PCR products are shown in Fig. 2.

RACE Analysis-To determine the degree and nature of processing events at the N terminus of chondroadherin, we used the RACE technique (rapid amplification of cDNA ends). cDNA was prepared using primer RACE-P1 (Table I). A synthetic oligonucleotide (supplied by Clontech) was ligated to the cDNA as an anchor sequence, and nested PCR was performed using a primer from the anchor sequence and primer RACE-P2. The product was a somewhat diffuse band of approximately 220 bp. The DNA sequence of this band, obtained using primer RACE-P2, was clear until a point 43 bp 5' to the initiating ATG. The DNA sequence at this point appeared to consist of at least two and probably three products, in that the forward primer sequence could be seen in several overlapping sets of data.

Some ambiguities usually result from direct sequencing of PCR products. In some cases, these could be resolved by analyzing the sequence from the reverse direction and in others they could be resolved by comparison with the protein-derived degenerate DNA sequence. A small proportion of the ambiguity (such as that at bp 69) may result from the fact that we are examining both alleles of the gene.

Northern Blotting-Northern blotting of total RNA from bovine tracheal chondrocytes, using the PCR product of primers P2 and AP1 as a probe, indicated that there is a single mRNA species of approximately 1.9 kb (Fig. 4). This is consistent with the two forms of chondroadherin deriving from a single mRNA species rather than, for example, alternatively spliced forms of mRNA.

cDNA Isolation and Sequencing-The sequence was further confirmed by sequence analysis of two cDNAclones as shown in Fig. 2. Several cDNA clones were identified by probing a cDNA library from bovine tracheal chondrocytes with 32P-labeled PCR products of primers M1 and AP3 (393 bp) or P2 and AP1 (350 bp). Clones isolated using the Ml/AP3 probe included cDNA corresponding to proteins other than chondroadherin, as well as clones which corresponded to the 3' end of the mRNA for chondroadherin. One of these clones ( A l ) ended a t base 481 (Fig. 2). A clone isolated from a bovine articular cartilage cDNA library (A131 using the P2/AP1 probe gave clear sequence over base pairs 354-610 which, when translated, corresponded to the protein sequence of chondroadherin. No full-length clone was found in this library. The longest cDNA covered bases 354-1654. This enabled us to confirm the PCR-derived se-

Page 4: The Structure of a 38-kDa Leucine-rich Protein (Chondroadherin

21550 Structure of a Cartilage LRR Protein

20 40 60 80 100 120 140 160 180 200 2 2 0 240 260 280 300 320 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 . 1 .

Lys-c A ~ P - N - - "- "

"

GIu-C Tryptic peptides

" - - 2 RACE-P1

R A E P Z - M1 AP1 APZ AP3 - b - - A13 - hl

FIG. 2. Primers, direct PCR sequencing, and cDNA sequencing strategy from which the protein and DNAsequences were obtained. The locations of the peptides which were used to derive the majority of the protein sequence of chondroadherin are shown by the arrows immediately under the line representing the protein and numbered from the N-terminal cysteine. They are grouped into sequence obtained from the intact protein (top arrow) and sequence derived from peptides obtained by digestion of reduced and carboxymethylated protein with either endoprotease Asp-N (second line ofarrows), endoprotease Lys-C (third line ofarrows), endoprotease Glu-C (fourth line ofarrows), or trypsin (fif th line ofarrows). The PCR primers discussed in the text and shown in Table I, as well as the PCR products whose sequence was used to align the major blocks of protein sequence, are shown below this. Sequences from cDNA clones are shown as the lowest arrows.

Digestion of intact protein with endoprotease Glu-C

{ I - h 2 9 5 - C - t e n n i ~ l 272-294

Subsequent Digestion with chymotrypsin +

{ K-T+

l!Ll!Lm 272-280

GZIIIW 288-292

lvxmmmF 311-321

281-286 LEWEEWP?EPIW 293-310 RZ*KlTRrdhcqh 3224-tennind

FIG. 3. Schematic of the sequence analysis of the C-terminal endoprotease Glu-C-derived peptide used to derive disulfide bonds. Apeptide with two N termini (upper two sequences) was isolated from an endoprotease Glu-C digest of chondroadherin. The amino acid sequence location is shown on the right (numbers refer to Fig. 5). Sub- digestion of this peptide with chymotrypsin and separation of the prod- ucts by reversed-phase chromatography yielded the lower set of pep- tides, one of which had three N termini. Lower case indicates residues that are not found in the smaller form of chondroadherin. The asterisk indicates PTH-cystine detected during Edman degradation. The dashes indicate blanks in the Edman degradation.

TABLE I
Key primers used to derive PCR products and the sequence of the 5' end of the mRNA

The primers are described in the text and their locations are shown in Fig. 2. The primer sequences are blocked to correspond to their respective amino acids. Degenerate codons synthesized are: Y (C/T), H (A/C/T), R (A/G), B (C/T/G), W (A/T), S (C/G), M (A/C), I (inosine).

Primer     Sequence (IUPAC code)            Location (Fig. 1)
P1         CCH AAR GTB WSH GAR AAR A        186-206
P2         CAR MGI AAY AAY TTY CCI GT       224-243
AP1        TYT CGS WIA RRT AIA RCC AIC      590-570
AP2        GT RTT RTC IAR CCA IAR IGT       810-791
AP3        ARG TTT CAG CGT GGT CAC ACC      865-845
M1         CTGCTCTCCCCTCTGGTCAAC            473-493
RACE-P1    CAG GTG CAG CGA CAC GAG GTT      295-275
RACE-P2    G CCC CAC CTT GTC GCG AGT GA     176-156
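The degenerate codes in Table I imply that each degenerate primer is a pool of related oligonucleotides. As a minimal sketch (not part of the original methods), the pool size can be estimated by multiplying the number of alternatives at each position. The function name `degeneracy` is hypothetical, and counting inosine as a single synthesized base is an assumption of this sketch.

```python
# IUPAC degenerate codes as listed in the Table I legend.
# Inosine (I) is counted as one synthesized base (assumption of this sketch).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "H": "ACT", "R": "AG", "B": "CGT",
    "W": "AT", "S": "CG", "M": "AC", "I": "I",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct oligonucleotides in a degenerate pool."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


# Sense primers P1 and P2 as printed in Table I.
P1 = "CCH AAR GTB WSH GAR AAR A"
P2 = "CAR MGI AAY AAY TTY CCI GT"

print("P1 pool size:", degeneracy(P1))
print("P2 pool size:", degeneracy(P2))
```

Under these assumptions P1 is a considerably more degenerate pool than P2, because P2 uses inosine at the fully ambiguous positions.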

quence, confirm that histidine was the C-terminal amino acid, and obtain the sequence of the 3'-UTR. Identical DNA sequences in the region covering the C terminus of the protein were found in two independent cDNA clones, further supporting the evidence that only one mRNA species is being made. Fig. 5 shows the complete cDNA and PCR-derived sequence and the corresponding protein sequence.

DISCUSSION

Separation of cartilage proteins extracted with 4 M guanidine hydrochloride that are recovered in the low buoyant density

FIG. 4. Northern blot analysis of chondroadherin mRNA. Total RNA, 10 µg, from bovine tracheal chondrocytes was analyzed by Northern blotting using the 32P-labeled PCR product of primers P2 and AP2 as a probe. RNA size standards (9.5, 6.2, 3.9, 2.8, and 1.9) are indicated on the left.

fractions of cesium chloride gradients by a combination of gel filtration and reversed-phase HPLC has proved to be a very rapid and effective way of analyzing the major proteins of 4 0 kDa. We report here the isolation of a leucine-rich protein (chondroadherin) which is abundant in bovine articular cartilages, as well as in tracheal cartilage and the nucleus pulposus of the spine. This protein has previously been described as a 36-kDa basic protein which enhances chondrocyte adhesion to plastic culture dishes (23). It is secreted into the medium in cartilage explant cultures and has a faster turnover than the trimeric cartilage matrix protein (22).

The complete amino acid sequence (Fig. 5) was derived by direct protein sequence analysis of peptides derived from digestion of the reduced and carboxymethylated chondroadherin with endoprotease Lys-C, endoprotease Asp-N, and endoprotease Glu-C, and by sequence analysis both of DNA obtained by PCR amplification of bovine cartilage cDNA and of a cDNA clone corresponding to the majority of the protein. The locations of the peptides characterized and the PCR products sequenced are shown in Fig. 2. The signal peptide sequence was obtained by RACE analysis of cDNA. The primary N terminus of the intact protein is the first cysteine. However, a significant proportion of the protein (16-20% as estimated by N-terminal yields) also includes the preceding alanine.

Chondroadherin is not derived from a larger precursor, as shown by the complete sequence of its mRNA, derived from a


10 20 30 40 50 60 70 80 90 GGACTTGCCG CTGCCCCCTG c c m c c s AGACGTCCTG accATGGccc GCCCGATGCT cmaxcngc cmwm ~~mcxwc

M A R P M L L L S / N L S L G L L A - 9

100 110 120 130 140 150 160 170 180 CAGCCPGCPC; ecc CCAGARtTGc CAclGccAcA GCGAcm GCAmTC TGCGACRAOG -

S L L P A L A A C P Q N C H C H S D L Q H V I C D K V G L Q 2 2

190 200 210 220 230 240 250 260 270 GAAGATCCCC AAGGTGTCAG AGARGACCAA GCTGCTPAAC CTCCAGC- ACAAC'ITCCC TGTGCTOGCA ACCAACPCAT TTCGGGCCAT K I P K V S E K T K L L N L Q R N N F P V L A T N S F R A M 52

280 290 300 310 320 330 340 350 360 GCCCAACCTC GTGTcG@pGc ACCTGCAGCA CTG-TC CGCGAGGfGG C- ClTCCGAGGT CTeAAGCAGc " C A m A C m P N L V S L H L Q H C Q I R E V A A G A F R G L K Q L I Y L 82

370 380 390 400 410 420 430 440 450 GTACCTGTCC CATAACGACA TCCGCGTGC" GCGTGCCGGT GCCTTTGACG ACCTGACCGA GCPCACCTAC CTCTACCPGG ACCACAATAA Y L S H N D I R V L R A G A F D D L T E L T Y L Y L D H N K 1 1 2

460 470 480 490 500 510 520 530 540 GGPGACGGRG cT(;cCCCGGG CCCTCl'GGTC AAC- TCmGCl' CAACAACAAC AAGAWCGAG A- V T E L P R G L L S P L V N L F I L Q L N N N K I R E L R S 1 4 2

550 560 570 580 590 600 610 620 630 AGGTGCCITC CAGGGCGCCA AGGACCPGCG CTGGCTCTAC CTGTCGGAAA ACTCACTCAG TTCCCTGCAG CCCGGCGCTC TGGACGACGT

G A F Q G A K D L R W L Y L S E N S L S S L Q P G A L D D V 1 7 2

640 650 660 670 680 690 700 710 720 GGAAAACCTT GCCAAGTl" ACCTGGACAG GARCCAGCTG TCCAGC?!ACC CCZXEXX TCPGRGCAAG cl%MGPX TCGAGGAGCT E N L A K F Y L D R N Q L S S Y P S A A L S K L R V V E E L 202

730 740 750 760 710 780 790 800 810 GAAGCTGTCC CACAAKCCC " U C A T TCCCXATAAT GCC'ITCCAGT CCTTCGGTAG ATACCTGGAG ACGCTCPGGC TGGACAACAC

K L S H N P L K S I P D N A F Q S F G R Y L E T L W L D N T 2 3 2

820 830 840 850 860 870 880 890 900 CAACCTGGAG " T T C l G ACGGTGCCXT CCTGGGTGTG ACCACGCTGA AACA'IWICCA TcIy;GAGAAC AACXGTC'IGC ACCAGCTGCC N L E R F S D G A F L G V T T L K H V H L E N N R L H Q L P 262

910 920 930 940 950 960 970 980 990 CTCCAACTTC CCCTPTGACA GC-C CCTCACCCTC ACCAACAACC CCTGGAAGTG TACCTGCCAG CTCCGGGGCC 'ITCGCSiGGTG

S N F P F D S L E T L T L T N N P W K C T C Q L R G L R R W 2 9 2

1000 1010 1020 1030 1040 1050 1060 1070 1080 GClGGAAGCF AAGACCTCTC GCCC!lGAW CACTSTGCA TCACCCGCCA A G l ' l ' C A ~ CCAGCACATT CGTGACACGG ACGCCWCCG

L E A K T S R P D A T C A S P A K F R G Q H I R D T D A F R 322

1090 1100 1110 1120 1130 1140 1160 1110 CGGcpGWvLG TITCCCACCA AGAGGTCCAA GAAAGCCGGC CGCCATTAAA CAGGTCCTGA CCCAGCCAGT CCTGGTGACT G G X K T G C C

1150

G C K F P T K R S K K A G R H '

1180 1190 1200 1210 1220 1230 1240 1250 1260 l"KCCGGt+ W C T A C T G ATCCTCl!CAC CTCTGACPCC CATCTTCTCC CARCACCC'l'C TGC'l'GATACA TAGACC%TT CACACCTAGA

1270 1280 1290 1300 1310 1320 1330 1340 1350 CATGTCCl'GG CAGGGCACCX GGGCACPCCA GCACAAACCC AGcTCCACTl' GATGCXAGA GCTCCAGCAC GccTGGccTt ACGGCCACAG

1360 1370 1380 1390 1400 1410 1420 1430 1440 CTCcpcpcAG AGAAGCTGTT GCTAXACCC CCAAGTCCAT TAGGACCAGA A C A m m C CA"CAGGATG GCCACCCTTC CAGAAl'CCTC

1450 1460 1470 1480 1490 1500 1510 1520 1530 CTCTKATl?I' CCcpTcc tTG l'CAmCAA ACFAACAl'CA GA'pccpTGcC Cl'ACCCCRG CTCCl'AGGAA GGGTClGMG CCCCTACCCl'

1540 1550 1560 1570 1580 1590 1600 1610 1620 CAACTCTGCC AGCCCCCACC TGCCAGGRCA CTGcTGCfip GCCACACATC CTCCCACGCT G C A T l " i ' m CCCAGATTTC TATAAATATA

1630 1640 1650 AATTTAl'GTA l"ATAATAA AAM" AAAA


FIG. 5. The primary structure and PCR- and cDNA-derived sequence of chondroadherin. Amino acids are numbered based on the predominant N terminus (cysteine; residue 1). Nucleotides are numbered arbitrarily starting from the first clear data in the RACE experiment described in the text. This is estimated to be 9-28 bp from the 5' end of the cDNA. The underlined amino acids at the C terminus represent the additional sequence from the larger form of chondroadherin. The underlined alanine at amino acid residue -1 indicates the alternative N-terminal residue. Underlined bases highlight the points at which the PCR-derived sequence is ambiguous.

bovine tracheal chondrocyte cDNA library and RACE analysis. The protein found in cartilage is initially synthesized with a 24-amino acid signal sequence. The codon for His337, found as the C-terminal amino acid in the larger form of chondroadherin, is followed by a stop codon (TAA). This is followed by a 511-bp UTR before a polyadenylation signal is found.

Larsson et al. (22) found low levels of carbohydrate associated with chondroadherin. One component of the carbohydrate was xylose, particularly surprising in view of the absence of hexosamines. During Edman degradation, we have been unable to find any sites with significantly low levels of asparagine, serine, or threonine. There are no Asn-Xaa-Ser/Thr N-glycosylation sites. We have obtained clear protein sequence through all candidate glycosylation sites with the exception of serine 122. The identity of this residue was determined unequivocally by DNA sequencing. The protein sequence in this region was determined from a peptide that was a minor component in a mixture of two peptides. The background from the predominant sequence made it impossible to determine unequivocally whether this serine was present in unmodified form.

Heterogeneity-Chondroadherin exists in two forms, differing in molecular mass by about 1 kDa as estimated from SDS-PAGE gels. The larger form is more basic than the smaller form (not shown). On reduction and carboxymethylation, both bands increased slightly in apparent size, indicating the presence of internal disulfide bonds. The bands were electroblotted onto Immobilon-PSQ (Millipore) and analyzed by N-terminal sequence analysis, as well as in situ tryptic digestion by the method of Fernandez et al. (26). Both bands gave the same N-terminal sequence, and HPLC tryptic maps of the two bands were essentially identical (not shown).

Sequence analysis of peptides derived from either endoprotease Asp-N or endoprotease Glu-C digestion of a mixture of both forms of the protein consistently yielded two peptides which differed only in their C termini. The C termini of the peptides were threonine or histidine. Glutamate would be usual after digestion with endoprotease Glu-C. We concluded that the C terminus of the large form of the protein was histidine, while the C terminus of the smaller form was threonine (Thr328). This would implicate a protease cleaving at KFPT-KRSKK.

An additional level of heterogeneity exists as a result of two sites of removal of the signal peptide. On Edman degradation of the intact protein, we consistently observed alanine in the first cycle of sequencing. The sequence data subsequent to this had considerable lag from the previous cycle, suggesting that the first amino acid was either not derivatized easily or not cleaved off easily. An alternative explanation was that the N terminus was heterogeneous. On obtaining the DNA sequence of the RACE-derived PCR product, it was apparent that the N-terminal cysteine was preceded by alanine. Thus, chondroadherin has two alternative N termini. The first of these sites (Pro-Ala-Leu-Ala-↓-Ala-Cys) can be predicted from the "-3,-1" rule for prediction of signal peptide cleavage sites (29), while the second site, preceding the first cysteine, is more likely to result from exopeptidase activity.
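The reasoning about the two N termini can be illustrated with a small sketch. It applies a deliberately simplified form of the "-3,-1" rule, in which both positions must be occupied by a small residue. Restricting the allowed set to Ala, Gly, and Ser is an assumption of this sketch rather than the full rule of ref. 29, and the function name `candidate_sites` is hypothetical. The input is the end of the signal peptide and the start of the mature protein as given in Fig. 5.

```python
# Residues allowed at positions -1 and -3 relative to the cleavage site.
# Limiting both to Ala/Gly/Ser is a simplification made for this sketch.
SMALL = set("AGS")


def candidate_sites(seq: str) -> list[int]:
    """Return indices i where cleavage between seq[i-1] and seq[i]
    leaves a small residue at both position -1 and position -3."""
    return [
        i
        for i in range(3, len(seq))
        if seq[i - 1] in SMALL and seq[i - 3] in SMALL
    ]


# Residues -8 to +4: ...Ser-Leu-Leu-Pro-Ala-Leu-Ala-Ala-Cys-Pro-Gln-Asn
region = "SLLPALAACPQN"

for i in candidate_sites(region):
    print(region[:i] + " | " + region[i:])
```

With this simplified rule the only predicted site lies between the two alanines, which leaves Ala-Cys at the N terminus. The site immediately before the cysteine is not predicted, in line with the suggestion in the text that it arises from exopeptidase activity.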

The calculated molecular weight of the larger form of chondroadherin is 38,353 with an estimated pI of 9.79. The smaller form has a calculated molecular weight of 37,304 with a pI of 9.55. It has been reported that chondroadherin contains 1% carbohydrate (22). By matrix-assisted laser desorption mass spectrometry (not shown), the masses of the larger and smaller forms of the protein are heterogeneous, with at least four species each, but are, on average, 400 Da higher than the calculated values, presumably as a result of carbohydrate substituents which would therefore contain approximately three sugar groups.
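The mass figures above can be cross-checked with simple arithmetic. The sketch below sums standard average residue masses for the nine C-terminal residues that distinguish the larger form (Lys329 to His337 in Fig. 5) and compares the result with the difference between the two calculated molecular weights. It also divides the 400-Da excess by the mass of an anhydro-pentose residue. Using xylose for that estimate is an assumption prompted by the carbohydrate analysis cited in the text, and the names used are illustrative only.

```python
# Average residue masses (Da) for the amino acids found in the tail.
RESIDUE_MASS = {
    "K": 128.1741, "R": 156.1875, "S": 87.0782,
    "A": 71.0788, "G": 57.0519, "H": 137.1411,
}


def tail_mass(tail: str) -> float:
    """Sum of residue masses for a peptide segment (no terminal water)."""
    return sum(RESIDUE_MASS[aa] for aa in tail)


# Residues 329-337, present only in the larger form.
TAIL = "KRSKKAGRH"
delta_calculated = tail_mass(TAIL)
delta_reported = 38353 - 37304

print(f"tail mass: {delta_calculated:.1f} Da; reported difference: {delta_reported} Da")

# Mass of a xylose residue in a glycan (xylose minus water), in Da.
XYLOSE_RESIDUE = 132.115
sugars = 400 / XYLOSE_RESIDUE
print(f"400 Da corresponds to about {sugars:.1f} pentose residues")
```

The tail mass agrees with the difference between the two calculated molecular weights, which supports cleavage after Thr328 as the only difference between the forms.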

Disulfide Bonds-Chondroadherin protein has been shown to form oligomers, in particular a dimer of approximately 70 kDa (22). This dimer forms in conditions which would promote disulfide rearrangement (pH 8) and may result from cross-linking via cysteine. It is not seen using the methods described here, where the protein is kept below a pH of 7. Thus, it is probably occurring in situ rather than after isolation and may indicate that individual molecules are in close proximity to each other.

There are 9 cysteine residues in chondroadherin. However, the majority of chondroadherin is monomeric, as shown both by its migration on an SDS-PAGE gel and by its relative elution position on a gel filtration column. At least 1 of the cysteines must, therefore, be a free cysteine. When the native protein is digested with endoprotease Glu-C and peptides are isolated, the peptide containing the fifth cysteine can be isolated as a discrete entity. Subdigestion of this peptide with trypsin (not shown) yields a peptide containing the free cysteine; sequence analysis of this peptide gave a small PTH-derivative peak near PTH-proline. This is likely to be PTH-cysteine. The reduced and carboxymethylated protein gave a peak of PTH-S-carboxymethylcysteine in this position. This free cysteine may be the reason for low levels of dimerization.

The free cysteine in chondroadherin is found within the first of the leucine-rich repeats. It is probable, given the likely amphipathic nature of the leucine-rich repeats (30), that they form a "sandwich" with the hydrophobic residues in the interior of the protein. With this arrangement, the likely function of the conserved serines and asparagines is to stabilize the structure with hydrogen bonds. The cysteine in the first leucine-rich repeat takes the place of the asparagine and thus may be involved in noncovalent stabilization of the structure, rather than covalent stabilization through a disulfide bond. It is noteworthy that carboxymethylation of the native protein in the presence of EDTA does not result in derivatization of this cysteine, even in the presence of 4 M guanidine hydrochloride.

After reduction and carboxymethylation, the protein migrates more slowly on SDS-PAGE, indicating that the protein has unfolded to a certain extent. By analogy with the small, leucine-rich proteoglycan, biglycan, disulfide bonds can be expected between the first and fourth cysteines (10). We were unable to identify PTH-cystine after sequencing through the fourth cysteine in the N-terminal fragment derived from digestion of the native protein with endoprotease Glu-C. Neither have we unequivocally identified disulfide bonds between the second and third cysteines. However, some species of the LRR protein family, notably the leucine-rich α2-glycoprotein from serum, do not have the Cys-Xaa-Cys motif of the second and third cysteines. It seems reasonable, therefore, to assume that the first and fourth cysteines are likely to be disulfide-bonded in all members of this family. However, the redox state of the second and third cysteines is uncertain.

In biglycan, a further disulfide bond is found between the C-terminal 2 cysteines (10). However, the protein described here has an additional 2 cysteines in this region, for a total of 4. Digestion of native protein with endoprotease Glu-C, isolation of the C-terminal peptide, and subdigestion with chymotrypsin gave a product with three N termini. The peptides comprising this disulfide-bonded mixture contained the 4 C-terminal cysteines; the sixth and seventh cysteines were found in one peptide and the eighth and ninth were in the other two. When the PTH-derivatives released at each cycle were examined, it was clear that an amino acid, eluting near PTH-tyrosine, was released after sequencing through the sixth and ninth cysteines. This was likely to be PTH-cystine. A corresponding, although more difficult to discern, peak could also be found after sequencing through the eighth cysteine and likely corresponds to a disulfide bond between the seventh and eighth cysteines (Fig. 3). Disulfide bonds were therefore positively identified between Cys282-Cys324 and Cys284-Cys304.
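The residue numbers of the four C-terminal cysteines can be recovered directly from the sequence. The following sketch uses residues 263-337 of the mature protein as they appear in Fig. 5 and lists the cysteine positions, then pairs them as described in the text (sixth with ninth, seventh with eighth). The variable and function names are illustrative only.

```python
# C-terminal region of mature chondroadherin as read from Fig. 5.
C_TERM_START = 263
C_TERM = (
    "SNFPFDSLETLTLTNNPWKCTCQLRGLRRW"  # residues 263-292
    "LEAKTSRPDATCASPAKFRGQHIRDTDAFR"  # residues 293-322
    "GCKFPTKRSKKAGRH"                 # residues 323-337
)


def cysteine_positions(seq: str, start: int) -> list[int]:
    """Return the residue numbers of all cysteines in seq."""
    return [start + i for i, aa in enumerate(seq) if aa == "C"]


cys = cysteine_positions(C_TERM, C_TERM_START)

# Pairing reported in the text: sixth-ninth and seventh-eighth cysteines,
# that is, the outer pair and the inner pair of this C-terminal group.
pairs = [(cys[0], cys[3]), (cys[1], cys[2])]

print("C-terminal cysteines:", cys)
print("disulfide pairs:", pairs)
```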

Sequence Homology-Chondroadherin contains 10 leucine-rich repeat sequences (Fig. 6). The leucine-rich repeat (LRR) is a sequence whose consensus is Leu-Xaa-Xaa-Leu-Xaa-Leu-Xaa-Xaa-Asn-Xaa-Leu-Ser-Xaa-Leu, where Leu can also be valine, isoleucine, or methionine, serine can be exchanged for threonine, and cysteine can be exchanged for asparagine.
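The consensus just described can be expressed as a position-specific set of allowed residues. The sketch below scores a 14-residue window against it and scans a repeat for the best-scoring window. The scoring scheme (one point per conserved position matched) and the function names are assumptions of this sketch, not the criterion used to draw Fig. 6. The test sequence is the fourth repeat (residues 93-116) as printed in Fig. 6.

```python
# Conserved positions (0-based) of the 14-residue LRR consensus
# L-x-x-L-x-L-x-x-N-x-L-S-x-L, with the substitutions allowed in the text.
CONSENSUS = {
    0: "LVIM", 3: "LVIM", 5: "LVIM",
    8: "NC", 10: "LVIM", 11: "ST", 13: "LVIM",
}
WINDOW = 14


def score(window: str) -> int:
    """Count conserved positions at which the window matches the consensus."""
    return sum(window[i] in allowed for i, allowed in CONSENSUS.items())


def best_window(seq: str) -> tuple[int, int]:
    """Return (offset, score) of the best-matching 14-residue window."""
    offsets = range(len(seq) - WINDOW + 1)
    best = max(offsets, key=lambda i: score(seq[i:i + WINDOW]))
    return best, score(seq[best:best + WINDOW])


repeat4 = "RAGAFDDLTELTYLYLDHNKVTEL"  # residues 93-116
offset, matched = best_window(repeat4)
print(repeat4[offset:offset + WINDOW], f"{matched}/{len(CONSENSUS)}")
```

Not every repeat matches at all seven positions; for example, the window LIYLYLSHNDIRVL from the third repeat has arginine where serine or threonine is expected.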

A search of the nonredundant data base of the National Center of Biotechnology Information (NCBI) with the program Blastp (31) found, as expected, many members of the LRR family. Most of the LRR family can be subdivided into two groups which contain either a cysteine in the repeats or an asparagine in the repeats (summarized by Malvar et al. (32)). The group which contains asparagine can be further subdivided into 6 groups, depending on the consensus sequence of the repeats and the length of the average repeating unit (32). Chondroadherin does not readily fit into any of these groups, although, like all the connective tissue-derived LRR-containing proteoglycans (decorin, biglycan, fibromodulin, lumican), with the exception of PG-Lb (which contains 6), it contains 10 LRRs.

FIG. 6. Alignment of the leucine-rich repeats. The 10 leucine-rich repeats in chondroadherin are shown aligned with each other. The consensus sequence criterion is that the respective amino acid is present in at least 40% of cases. The amino acid sequence location is shown on the right. The likely extent of the β-sheet, based on the crystal structure of ribonuclease inhibitor (20), is shown below the consensus sequence.

L Q K I P K V S E K T K L L N L Q R N N F P V L
A A N S F R A M P N L V S L H L Q H C Q I R E V
A A G A F R G L K Q L I Y L Y L S H N D I R V L
R A G A F D D L T E L T Y L Y L D H N K V T E L
P R G L L S P L V N L F I L Q L N N N R I R E L
R S G A F Q G A K D L R W L Y L S E N S L S S L
Q P G A L D D V E N L A K F Y L D R N Q L S S Y
P S A A L S R L R V V E E L K L S H N P L K S I
P D N A F Q S F G R Y L E T L W L D N T N L E K F
S D G A F L G V T T L K H V H L E N N R L H Q L

G A F (consensus, N-terminal portion of the repeat)

Some of the members of the LRR family of proteins are specific cell adhesion molecules (15, 33). Chondroadherin also has the property of mediating chondrocyte attachment to surfaces (23). Mammalian members of the LRR family that contain the pattern of cysteines found in the C terminus are platelet glycoproteins V (34), IX (35), Ibα (36), and Ibβ (37), and the insulin-like growth factor acid-labile fragment (38). These are all cell surface-associated proteins and contain between 2 and 15 LRRs. Another secreted protein which contains LRR domains and a similar pattern of cysteines is Drosophila slit protein (39), which is involved in axonal development. However, this is a large and complex protein with 4 sets of LRRs which contain 5 or 3 repeats each.

The number of proteins known to contain LRR domains is increasing rapidly. It appears unlikely that the conserved pattern of hydrophobic residues reflects a conserved function. The LRR domain tends to force the similarity of large numbers of proteins which are not obviously functionally related. Indeed, the conserved amino acid residues, which are hydrophobic (Leu, Ile, Val, Met) or capable of hydrogen bonding (Asn, Cys, Ser, Thr), are likely to be oriented toward the interior of the protein. As suggested by Roth (16), the pattern of residues between the conserved residues is likely to be of greater importance, as these residues will be oriented toward the solvent and will presumably mediate any functions associated with the LRR domain. We tested for proteins which would have a similar solvent-associated structure to chondroadherin by removal of the consensus sequence of the LRR domain (replacing the conserved Leu and Asn residues in the sequence with Xaa) and compared the resulting sequence with the sequences in the Genpept, SwissProt, and PIR data bases, using the NCBI BLAST service. No significant similarities were found, indicating that the protein described here may not be closely related to any other members of the LRR family, in spite of a superficial resemblance. In this respect, the LRR structure may resemble the immunoglobulin domain as a conserved folding motif which can be used for a large number of differing purposes.
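The masking step described above can be sketched as follows. The text does not say exactly which positions were replaced, so this sketch assumes that the Leu-class positions and the Asn/Cys position of the 14-residue consensus are masked whenever they hold an allowed residue. The function name `mask_repeat` and the use of the letter X for Xaa are choices made here for illustration.

```python
# 0-based consensus positions to mask, with the residues that count as
# "conserved" at each one. The choice of positions is an assumption.
MASKED_POSITIONS = {
    0: "LVIM", 3: "LVIM", 5: "LVIM",
    8: "NC", 10: "LVIM", 13: "LVIM",
}


def mask_repeat(window: str) -> str:
    """Replace conserved Leu-class and Asn/Cys residues with X (Xaa)."""
    out = []
    for i, aa in enumerate(window):
        allowed = MASKED_POSITIONS.get(i, "")
        out.append("X" if aa in allowed else aa)
    return "".join(out)


# Consensus-matching window from the fourth repeat (Fig. 6).
window = "LTYLYLDHNKVTEL"
print(window)
print(mask_repeat(window))
```

The masked sequence keeps only the residues expected to face the solvent, which is the part of the repeat the data base comparison was intended to test.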

The crystal structure of porcine ribonuclease inhibitor has recently been described by Kobe and Deisenhofer (20). The 2.5-Å resolution structure shows that in this protein, the LRR

Residue locations of the 10 repeats in Fig. 6, in order: 21-44, 45-68, 69-92, 93-116, 117-140, 141-164, 165-188, 189-212, 213-237, 238-261.

L L x x L Y L x H N x L s x L (consensus)

<--------β-sheet-------->

motif consists of a β-sheet and an α helix in an antiparallel arrangement. The LRR motif forms the β-sheet, while the remainder of the repeat forms a turn and an α helix. It is probable that most LRR structures approximate to this pattern. However, there are significant differences between the sequence of chondroadherin and that of ribonuclease inhibitor. The most obvious difference is that the N- and C-terminal disulfide-bonded structures are missing in ribonuclease inhibitor. The periodicity of the LRR domains in ribonuclease inhibitor is either 28 or 29 amino acids, whereas in chondroadherin it is 24-25 amino acids. While the regions of chondroadherin which are likely, by analogy with ribonuclease inhibitor, to form β sheets are the same length (11 amino acids), the putative α helices are much shorter (7 amino acids in chondroadherin versus 11 in ribonuclease inhibitor). The repeats in ribonuclease inhibitor consist of alternating "A" type (containing a cysteine) and "B" type (containing an asparagine) LRRs. In chondroadherin, the majority of repeats are of the "B" type, and they differ in detail from the consensus sequence described for the crystal structure. Thus, while the structure described for ribonuclease inhibitor is likely to be a paradigm for LRR sequences, there may be significant differences in the overall structure between members of this family.
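The 24-25 residue periodicity quoted for chondroadherin follows from the residue locations printed alongside Fig. 6. A minimal sketch, with illustrative names, that derives the repeat lengths from those locations and checks that the repeats are contiguous:

```python
# Residue locations of the 10 repeats as printed with Fig. 6.
RANGES = (
    "21-44 45-68 69-92 93-116 117-140 "
    "141-164 165-188 189-212 213-237 238-261"
)


def parse_ranges(text: str) -> list[tuple[int, int]]:
    """Turn 'a-b c-d ...' into a list of (start, end) residue numbers."""
    return [tuple(int(n) for n in item.split("-")) for item in text.split()]


repeats = parse_ranges(RANGES)
lengths = [end - start + 1 for start, end in repeats]

# Each repeat should begin directly after the previous one ends.
contiguous = all(
    nxt[0] == prev[1] + 1 for prev, nxt in zip(repeats, repeats[1:])
)

print("repeat lengths:", lengths)
print("contiguous:", contiguous)
```

Nine of the repeats span 24 residues and one (residues 213-237) spans 25, compared with 28 or 29 in ribonuclease inhibitor.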

Chondroadherin was originally described as a rapidly labeled component of organ cultures of bovine tracheal cartilage (40, 41). It has been shown to be as potent as collagen type II and fibronectin in mediating chondrocyte attachment to plastic surfaces coated with the protein. However, it does not promote cell spreading (23). The high molecular weight form was the only form found in tracheal cartilage, possibly indicating the presence of a unique protease in articular cartilage, where both forms can be found. The exact function of chondroadherin remains to be established, but it is clearly part of the macromolecular system that chondrocytes use to interact with the extracellular matrix.

The sequence of chondroadherin does not provide any indication of a characteristic cell binding motif. Comparison with the LRR proteins in cartilage, which do not appear to promote cell binding, could be informative. The pronounced domain structure similarities (10 LRRs) between the LRR-containing proteins isolated from cartilage may indicate related biological functions. Thus decorin and fibromodulin bind to collagens (42), and all bind to transforming growth factor-β (43). It is not, as yet, clear whether chondroadherin binds to collagens and/or transforming growth factor-β. However, caution should be used in extrapolating from similar domain structures. Immunoglobulin domains, for example, have a broad variety of binding functions, and it is not easily predictable as to which regions on the surface of the molecule form a binding site.

Acknowledgments-We are grateful to Ulrika Pettersson for skillful technical assistance and Carmen Young for mass spectrometry.


REFERENCES

1. Maroudas, A. (1976) J. Anat. 122, 335-347
2. Neame, P. J., Young, C. N., Brock, C. W., Treep, J. T., Ganey, T. M., Sasse, J., and Rosenberg, L. C. (1993) J. Orthop. Res. 11, 479-491
3. Neame, P. J., Treep, J. T., and Young, C. N. (1990) J. Biol. Chem. 265, 9628-9633
4. Neame, P. J., Young, C. N., and Treep, J. T. (1990) J. Biol. Chem. 265, 20401-20408
5. Heinegård, D., and Oldberg, Å. (1989) FASEB J. 3, 2042-2051
6. Patthy, L. (1987) J. Mol. Biol. 198, 567-577
7. Takahashi, N., Takahashi, Y., and Putnam, F. W. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 1906-1910
8. Oldberg, Å., Antonsson, P., Lindblom, K., and Heinegård, D. (1989) EMBO J. 8, 2601-2604
9. Blochberger, T. C., Vergnes, J.-P., Hempel, J., and Hassell, J. R. (1992) J. Biol. Chem. 267, 347-352
10. Neame, P. J., Choi, H. U., and Rosenberg, L. C. (1989) J. Biol. Chem. 264, 8653-8661
11. Fisher, L. W., Termine, J. D., and Young, M. F. (1989) J. Biol. Chem. 264, 4571-4576
12. Shinomura, T., and Kimata, K. (1992) J. Biol. Chem. 267, 1265-1270
13. Day, A. A., Ramis, C. I., Fisher, L. W., Gehron-Robey, P., Termine, J. D., and Young, M. F. (1986) Nucleic Acids Res. 14, 9861-9876
14. Kataoka, T., Broek, D., and Wigler, M. (1985) Cell 43, 493-505
15. Krantz, D. E., and Zipursky, S. L. (1990) EMBO J. 9, 1969-1977
16. Roth, G. J. (1991) Blood 77, 5-19
17. Ferrero, E., Hsieh, C.-L., Francke, U., and Goyert, S. M. (1990) J. Immunol. 145, 331-336
18. Hashimoto, C., Hudson, K., and Anderson, K. (1988) Cell 52, 269-279
19. Hofsteenge, J., Kieffer, B., Matthies, R., Hemmings, B. A., and Stone, S. R. (1988) Biochemistry 27, 8537-8544
20. Kobe, B., and Deisenhofer, J. (1993) Nature 366, 751-756
21. Choi, H., Johnson, T., Pal, S., Tang, L., Rosenberg, L., and Neame, P. (1989) J. Biol. Chem. 264, 2876-2884
22. Larsson, T., Sommarin, Y., Paulsson, M., Antonsson, P., Hedbom, E., Wendel, M., and Heinegård, D. (1991) J. Biol. Chem. 266, 20428-20433
23. Sommarin, Y., Larsson, T., and Heinegård, D. (1989) Exp. Cell. Res. 184, 181-192
24. Heinegård, D., and Pimental, E. R. (1992) in Articular Cartilage and Osteoarthritis (Kuettner, K. E., ed) pp. 95-112, Raven Press, New York
25. Sajdera, S. W., and Hascall, V. C. (1969) J. Biol. Chem. 244, 77-87
26. Fernandez, J., DeMott, M., Atherton, D., and Mische, S. M. (1992) Anal. Biochem. 201, 255-264
27. Smale, G., and Sasse, J. (1992) Anal. Biochem. 203, 352-356
28. Adams, M. E., Huang, D. Q., Yao, L. Y., and Sandell, L. J. (1992) Anal. Biochem. 202, 89-95
29. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690
30. Krantz, D. D., Zidovetzki, R., Kagan, B. L., and Zipursky, S. L. (1991) J. Biol. Chem. 266, 16801-16807
31. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) J. Mol. Biol. 215, 403-410
32. Malvar, T., Biron, R. W., Kaback, D. B., and Denis, C. L. (1992) Genetics 132, 951-962
33. Lopez, J., Chung, D., Fujikawa, K., Hagen, F., Davie, E., and Roth, G. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 2135-2139
34. Lanza, F., Morales, M., de La Salle, C., Cazenave, J. P., Clemetson, K. J., Shimomura, T., and Phillips, D. R. (1993) J. Biol. Chem. 268, 20801-20807
35. Hickey, M. J., Deaven, L. L., and Roth, G. J. (1990) FEBS Lett. 274, 189-192
36. Lopez, J., Chung, D., Fujikawa, K., Hagen, F., Papayannopoulou, T., and Roth, G. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 5615-5619
37. Wicki, A. N., Walz, A., Gerber-Huber, S. N., Wenger, R. H., Vornhagen, R., and Clemetson, K. J. (1989) Thromb. Haemostasis 61, 448-453
38. Leong, S. R., Baxter, R. C., Camerato, T., Dai, J., and Wood, W. I. (1992) Mol. Endocrinol. 6, 870-876
39. Rothberg, J. M., Jacobs, J. R., Goodman, C. S., and Artavanis-Tsakonas, S. (1990) Genes & Dev. 4, 2169-2187
40. Paulsson, M., and Heinegård, D. (1982) Biochem. J. 207, 207-213
41. Paulsson, M., Sommarin, Y., and Heinegård, D. (1983) Biochem. J. 212, 659-667
42. Hedbom, E., and Heinegård, D. (1993) J. Biol. Chem. 268, 27307-27312
43. Fukushima, D., Bützow, R., Hildebrand, A., and Ruoslahti, E. (1993) J. Biol. Chem. 268, 22710-22715