

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1985 by The American Society of Biological Chemists, Inc.

Vol. 260, No. 22, Issue of October 5, pp. 11924-11933, 1985 Printed in U.S.A.

The Complete Primary Structure of HLA-Bw58*

(Received for publication, March 22, 1985)

Judy P. Ways‡, Helene L. Copping, and Peter Parham
From the Departments of Cell Biology and §Medical Microbiology, Sherman Fairchild Building, Stanford University School of Medicine, Stanford, California 94305

Serological studies indicate that HLA-B17 molecules are unusually cross-reactive with products of the HLA-A locus. In particular, a mouse monoclonal antibody MA2.1 defines an epitope that is shared by HLA-A2 and the two subtypes (Bw57 and Bw58) of B17. To investigate these relationships at the structural level, we have isolated a gene coding for Bw58 from the WT49 B cell line. The gene was transfected into mouse L cells and its protein product was characterized with a panel of monoclonal anti-HLA antibodies. The nucleotide sequence of 3520 base pairs of DNA encompassing the seven exons coding for Bw58 and associated introns was determined. The deduced protein sequences for Bw58 and eight other HLA-A,B,C molecules were compared. In the first polymorphic domain (α1), Bw58 is unusual in that it is as homologous to HLA-A locus products as to HLA-B locus products. In the second polymorphic domain (α2), Bw58 has greater homology to B locus products. In the α1 domain of Bw58, small segments of amino acid and nucleotide sequence homology with A2 (residues 62-65) and with Aw24 (residues 75-83) are found in the major region of polymorphic diversity (residues 62-83). These similarities provide structural correlates for the serological relationships between Bw58 and A locus molecules, with residues 62-65 possibly being involved in the MA2.1 epitope. From comparisons of four HLA-A and four HLA-B sequences, there is a difference in the patterns of variation for A and B locus molecules. For B locus molecules there is greater variation in the α1 domain than in the α2 domain. For A locus molecules, variation in the two domains is similar and like that for B locus α2 domains. In comparison to other HLA-A,B,C genes, novel inverted repeat sequences were found in the nucleotide sequence of HLA-Bw58. These sequences flank the putative RNA splicing sites at the 3' end of the exons encoding the α2 and α3 protein domains.

The class I molecules encoded by the MHC¹ are a family of homologous, highly polymorphic cell surface antigens present on all nucleated cells. They function in the immune recognition of antigenically foreign target cells by cytolytic thymus-derived lymphocytes. The polymorphic features of class I molecules play an important role in this process and also provide the major immunological barrier to tissue transplantation between individuals (1).

* This research was supported by grants from the Searle Scholars Program and the United States Public Health Services AI 17892. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ Supported by Predoctoral Training Grant CA 09302 from the National Institutes of Health.

¹ The abbreviations used are: MHC, major histocompatibility complex; LB medium, Luria-Bertani medium; β2-m, β2-microglobulin; bp, base pairs; kb, kilobase pairs.

The heavy chains of human class I MHC antigens are encoded by the HLA-A, B, and C loci on chromosome 6 (2). They individually associate noncovalently with β2-m, which is encoded by a single gene on chromosome 15 (3-6). Each heavy chain is composed of three extracellular domains (α1, α2, α3) of ~90 residues in length, a hydrophobic transmembrane region of ~39 residues, and a hydrophilic cytoplasmic domain of ~25 residues. The third extracellular domain (α3) and β2-m share sequence homology with immunoglobulin domains and are relatively conserved within a species (7-9). Most of the polymorphism is due to substitutions within the α1 and α2 extracellular domains (9).

Serological analysis has shown that over 60 different allelic products are encoded by genes at the HLA-A,B,C loci (10). The polymorphism of these molecules and their cross-reactivity have been extensively analyzed with alloantisera. Mouse monoclonal antibodies have proved useful in more precisely defining the antigenic properties of these molecules (11). For example, the monoclonal antibody MA2.1 was found to recognize an epitope shared by HLA-A2 and HLA-B17 molecules (12). Cross-reactivity between products of the HLA-A and HLA-B loci has rarely been studied with alloantisera and has usually involved the public HLA-B specificities Bw4 and Bw6 (13-15). Subsequent to the description of the MA2.1 monoclonal antibody, a re-evaluation of cross-reactive alloantisera showed that the epitope shared by HLA-A2 and HLA-B17 could be defined (16-18) and was the predominant interlocus epitope detected (19). Other interlocus cross-reactions were found and many involve the HLA-B17 molecule. These serological results suggest that the HLA-B17 molecule may show specific structural relationships to products of the HLA-A locus. To investigate these hypothetical relationships, we have determined the primary structure of an HLA-B17 gene and protein and compared it to the structure of HLA-A2 and other class I molecules.

HLA-B17 is divided into two serological subgroups, HLA-Bw57 and HLA-Bw58, with alloantisera (20). Both subtypes bind the monoclonal antibody MA2.1 with equivalent affinity (21). The HLA-Bw58 gene was chosen for analysis because of the availability of the WT49 cell line that is homozygous for HLA-Bw58.

EXPERIMENTAL PROCEDURES

Construction of a WT49 Genomic Library-High molecular weight DNA was isolated from the B lymphoblastoid cell line WT49 using a modification of the procedure of Gross-Bellard et al. (22). The DNA was partially digested with EcoRI and fractionated on 10-40% sucrose gradients. Fractions containing DNA fragments of 15-25 kb, as assessed by agarose gel electrophoresis, were pooled. These fragments were ligated at a 1:1 molar ratio to λ Charon 4 vector arms and then in vitro packaged using E. coli 2688 and 2690 extracts (23). After amplification, this library yielded 3 × 10⁵ independent recombinant clones.

Screening of the Genomic Library-500,000 recombinant phage were screened by the in situ plaque hybridization technique of Benton and Davis using two DNA probes (24). The probe pJYHLA-B7 contains an almost full length HLA-B7 cDNA (25). It detects many HLA class I sequences and was the gift of S. Weissman (Yale University). A second probe, JY150/C5, consists of a 550-bp PvuII fragment which was isolated from an HLA-B7 genomic clone provided by S. Weissman. It contains most of the 3' untranslated region of the HLA-B7 gene and has a specificity similar to the HLA-B locus specific clone of Koller et al. (26). The probes were radioactively labeled with [α-³²P]dATP by nick translation (27).

Subcloning of DNA Fragments and DNA Sequencing-DNA from recombinant phage λC4Bw58 was digested with EcoRI and HindIII and subcloned into the EcoRI site of calf intestinal phosphatase-treated pBR322. Clones with a 6.5-kb EcoRI insert, which contains the class I sequence, were identified by the colony hybridization method of Hanahan and Meselson (28) using the JY150/C5 probe. A partial restriction map of the 6.5-kb insert was obtained from single and multiple digestions of DNA from one subclone, pBw58, with various restriction endonucleases.

To obtain clones for DNA sequencing, pBw58 was digested with restriction enzymes that were selected to give 300-500-bp fragments covering the entire gene. The digested DNA was electrophoresed in 1% agarose gels, the bands corresponding to each fragment were excised, and the fragments were isolated by a modification of the glass powder method of Vogelstein and Gillespie (29). DNA fragments were subcloned into M13mp18 and M13mp19 vectors for sequencing of both strands.

DNA sequencing was by the method of Sanger et al. (30). The orientation of sequences obtained from restriction enzyme fragments was determined by alignment with the partial restriction map of pBw58 and by alignment with the homologous sequences of the HLA-A2 (31) and HLA-B7 (25) genes. Sequences were stored and analyzed using the Microgenie program (32) on an IBM personal computer.
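The orientation step can be illustrated in code. The following is a minimal sketch under stated assumptions, not the procedure of the Microgenie program: it assumes a fragment read and a homologous reference sequence are available as plain strings, and it keeps whichever orientation of the fragment shares more short subsequences (k-mers) with the reference. The function names and the k-mer length of 8 are hypothetical choices made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def shared_kmers(fragment: str, reference: str, k: int = 8) -> int:
    """Count the k-mers of the fragment that also occur in the reference."""
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return sum(
        fragment[i:i + k] in ref_kmers
        for i in range(len(fragment) - k + 1)
    )


def orient(fragment: str, reference: str, k: int = 8) -> str:
    """Return the fragment in the orientation that matches the reference better.

    Ties keep the fragment as read; k = 8 is an arbitrary illustrative value.
    """
    fragment = fragment.upper()
    reference = reference.upper()
    flipped = reverse_complement(fragment)
    if shared_kmers(flipped, reference, k) > shared_kmers(fragment, reference, k):
        return flipped
    return fragment
```

A read that happens to come from the opposite strand is flipped back, so `orient(reverse_complement(ref[5:25]), ref)` returns `ref[5:25]` for a reference string `ref` without internal repeats. Exact k-mer matching only works for closely related sequences; a real comparison between alleles would need an alignment that tolerates mismatches.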

Cell Lines, Transfection, and Assays-Human B cell lines WT49 (HLA-A2; Bw58) and FMB (HLA-A1,w32; Bw57,w44) were cultured in RPMI 1640 supplemented with 10% fetal calf serum and 2 mM glutamine. Murine Ltk- cells were maintained in Dulbecco's minimal essential media containing 10% fetal calf serum (DMEM-10), 2 mM glutamine, 100 μM nonessential amino acids, and 30 μg of BrUrd per ml. Confluent cultures were treated for 2 min at 37 °C with 0.25% trypsin and 0.5 mM EDTA, then washed in DMEM-10 before serial passage or radioimmunoassay.

DNA-mediated gene transfer was by the calcium/phosphate precipitation method of Wigler et al. (33) using 10⁶ Ltk- mouse cells, 1 μg of λC4Bw58 DNA, 50 ng of plasmid pBR322 containing the herpes simplex virus thymidine kinase gene, and 10 μg of high molecular weight DNA from mouse liver. Transfected Ltk- cells were maintained for 24 h in DMEM-10, 2 mM glutamine, and nonessential amino acids with 25% conditioned medium (filtered supernatant harvested from confluent Ltk- cells cultured in the absence of BrUrd). Transfected cells were then maintained in the same medium with hypoxanthine, 10⁻⁴ M; aminopterin, 4 × 10⁻⁷ M; thymidine, 1.6 × 10⁻⁵ M (HAT); and without conditioned medium.

The specificities of the monoclonal antibodies used in this study are summarized by Colombani et al. (34) excepting MB40.7 (35) and K2.2.2 (36). CT1 and CT2 refer to the C21.48A.1 and C23.242 monoclonal antibodies described by Liabeuf et al. (37). The antibody 15.1.5 reacts with H-2Kk, Dk molecules (38). Indirect radioimmune cell binding assays were as described (39).

RESULTS AND DISCUSSION

Isolation of Bw58 HLA Gene from a WT49 Genomic Library-A genomic library was made with DNA from the B lymphoblastoid cell line WT49. Thirty-nine positive clones were detected from 5 × 10⁵ phage screened with the HLA-B7 cDNA probe. Ten were positive with the HLA-B locus probe and were plaque-purified for further analysis. DNA was isolated and digested with EcoRI and the fragments were subjected to Southern blotting analysis (40) using both HLA class I probes. The inserts from all 10 clones gave two fragments with lengths corresponding to 5.6 and 6.5 kb. The class I probes only hybridized to the 6.5-kb fragment. A full length HLA-B7 gene is also contained within an EcoRI fragment of this size (41).

FIG. 1. Indirect cell binding assay of transfected L cells with monoclonal antibodies. L cells or B cells were incubated in the presence of 10% fetal calf serum (open bars) or 10% human serum (open plus shaded bars) before radioimmunoassay. Bars represent the mean of duplicates. Each row is a monoclonal antibody listed with its HLA-A,B specificity; the horizontal axis is radioactivity bound (cpm × 10⁻³). Panel A, FMB cell line (A1,w32; Bw57,w44). Panel B, Ltk+ B17/2C1 cell (clone 2C1 from tk+ plus λC4B17 DNA transfection). Panel C, Ltk- cells (not transfected).

FIG. 2. A partial restriction map and the DNA sequencing strategy used for pBw58. A, the 6.5-kilobase pair EcoRI fragment was digested with a combination of nine restriction enzymes. Exon and intron organization were deduced by comparison of the resulting map to HLA-A2 and -B7 restriction maps. Symbols mark the exons, an untranslated 8th exon, and the 3' untranslated region. B, restriction enzyme fragments spanning 3.5 kb from XbaI to PvuII were subcloned into M13 vectors and the DNA sequence was determined in both directions. C, fragments for which a complete sequence was not obtained were isolated, digested with a second enzyme, and the resulting fragments were subcloned and the DNA sequence was determined.

Transfection and Expression of the HLA-Bw58 Gene on the Surface of Murine L Cells-Due to the multiplicity of cross-hybridizing class I genes within the human genome, unambiguous identification of a particular cloned gene can only be made by serological characterization of its expressed protein product. To do this, DNA from clone λC4Bw58 was co-transfected with the thymidine kinase gene (tk) into tk-deficient mouse L cells.

The serological characteristics of the molecule expressed by the λC4Bw58 gene in mouse L cells were determined by radioimmune assay with a panel of 28 monoclonal antibodies against monomorphic and polymorphic determinants of HLA-A,B,C molecules. In Fig. 1, the results obtained with the transfectant Ltk+ B17/2C1 are compared to the parental mouse cell Ltk- and the human cell line FMB that expresses HLA-B17 of the Bw57 subtype. FMB was used in preference to the WT49 cell from which the λC4Bw58 gene was derived because it does not express HLA-A2, which the antibody MA2.1 that positively defines HLA-B17 also binds. When HLA-A,B,C genes are transfected into mouse L cells, their products associate with mouse β2-m and are then transported to the plasma membrane. At the cell surface, they exchange their mouse β2-m for bovine β2-m that is derived from the fetal calf serum used to supplement the growth media. Human β2-m can be introduced into cell surface class I molecules by incubating the L cells with human serum (42). In the experiment shown in Fig. 1, the open histogram shows the radioactivity bound by the mouse L cells without incubation with human serum. The additional radioactivity bound after incubation with human serum is given by the shaded histogram.

In the absence of human β2-m the mouse parental cells (Ltk-) only bound the control antibody 15-1-5 which reacts with the mouse H-2K,D antigens. After incubation of Ltk- cells with human serum, the binding of monoclonal antibodies with specificity for human β2-m was significantly increased. No comparable increase was seen with antibodies directed against HLA-A,B,C heavy chain determinants or combinatorial determinants involving both heavy chain and β2-m. In contrast, the L cells transfected with λC4Bw58 DNA bound all three antibodies against monomorphic heavy chain determinants in the absence of human β2-m. When human β2-m was present, the binding of antibodies with specificity for human β2-m and combinatorial determinants was selectively increased. These serological results unequivocally show that λC4Bw58 codes for a HLA-A,B,C heavy chain.

Of 17 monoclonal antibodies against polymorphic determinants tested, only MA2.1 with specificity for A2 and B17 gave strong binding to Ltk+ B17/2C1 cells. This result showed that λC4Bw58 coded for A2 or B17 and either was possible because the WT49 cell line from which the genomic library was derived expresses both of these molecules.

To discriminate between A2 and B17, we included three antibodies PA2.1, CR11-351, and A2,A28M1 that are specific for A2 but not B17. They did not react with the transfected cells, giving results analogous to those obtained with the human B cell line FMB that expresses B17 but not A2. The FMB cell line also bound MB40.7, BB7.6, and other antibodies of broad polymorphic specificity, probably due to reactions with HLA-A,B,C molecules other than B17 that are expressed by this cell.

This analysis conclusively shows that the λC4Bw58 clone codes for a B17 gene. On the basis of these results with monoclonal antibodies we cannot distinguish between the Bw57 and Bw58 subtypes of B17. However, repeated HLA typing of the WT49 cell line with well defined alloantisera has shown that it is homozygous for Bw58 and we therefore believe that it is reasonable to assume that the B17 gene derived from this cell line codes for Bw58.

FIG. 3. Complete nucleotide sequence of an HLA-Bw58 gene. Exon-intron boundaries were determined as described in the text. The single letter code for amino acids is shown above protein-encoding regions. In the 5' and 3' end, recognition sequences are underlined and a putative mRNA start site is shown starred. The region homologous to the 8th exon of HLA-A2 (31) is underlined.

FIG. 4. Comparison of the amino acid sequence of HLA-Bw58 to other class I sequences. The complete Bw58 sequence is given in the standard one-letter amino acid code. In the other sequences, those residues that differ from Bw58 are designated with homology being indicated by a horizontal line. All protein encoding regions are shown including cytoplasmic domain three which is not translated in HLA-Bw58, B7, p12.4, PR9, and PR14 class I genes. The sequences HLA-A28 (48), B40 (49), B27 (50), H-2Dd (51), and the three extracellular domains of B7 (52) are from protein analysis. All other sequences were deduced from the corresponding nucleotide sequences. They are: A2 (31), A3 (45), Aw24 (46), B7 (25), Cw3 (47), p12.4 (44), H-2Kb (53), H-2Kd (54), H-2Db (55, 56), H-2Ld (57, 58, 9), and the rabbit class I sequences PR9 and PR14 (59).

TABLE I
Number of amino acid differences in HLA-A,B,C extracellular domains

Pair                  α1    α2    α3    α1,α2,α3
Bw58/B7               16    12     2    30
Bw58/B27              15     8     2    25
Bw58/B40              17    11     1    29
Bw58/A2               17    18     6    41
Bw58/A3               17    16     3    36
Bw58/Aw24             12    19     3    34
Bw58/A28              17    16     5    38
Bw58/Cw3              22     8     5    35
Bw58/p12.4            17    17     4    38
B7/B27                10    10     0    20
B7/B40                13     5     1    19
B7/A2                 16    20     8    44
B7/A3                 14    17     3    34
B7/Aw24               16    19     3    38
B7/A28                11    19     6    36
B7/Cw3                14    13     7    34
B7/p12.4              16    15     4    35
B27/B40               13     9     1    23
B27/A2                16    15     8    39
B27/A3                15    13     3    31
B27/Aw24              14    14     3    31
B27/A28               15    13     6    34
B27/Cw3               20    11     7    38
B27/p12.4             18    13     4    35
B40/A2                20    18     7    45
B40/A3                20    18     2    40
B40/Aw24              21    19     2    42
B40/A28               20    18     6    44
B40/Cw3               18    14     6    38
B40/p12.4             24    17     3    44
A2/A3                  4    10     5    19
A2/Aw24               11    10     5    26
A2/A28                 6     4     2    12
A2/Cw3                20    19     8    47
A2/p12.4              28    17     6    51
A3/Aw24               12    12     0    24
A3/A28                 3     7     5    15
A3/Cw3                19    15     6    40
A3/p12.4              24    14     1    39
Aw24/A28              13     9     5    27
Aw24/Cw3              22    20     6    48
Aw24/p12.4            21    16     1    38
A28/Cw3               19    17     7    43
A28/p12.4             27    14     6    47
Cw3/p12.4             24    16     5    45
Residues per domain   90    92    92    274
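The entries of Table I are pairwise counts of differing residues, taken domain by domain across aligned protein sequences. A minimal sketch of that tabulation follows. It assumes gap-free alignments of equal length, and it assumes domain boundaries of residues 1-90, 91-182, and 183-274, which are inferred here from the residues-per-domain row of Table I (90, 92, 92) rather than stated in the text. The function names are hypothetical.

```python
from itertools import combinations

# Assumed domain boundaries (1-based, inclusive), inferred from the
# residues-per-domain row of Table I: 90 + 92 + 92 = 274.
DOMAINS = {"a1": (1, 90), "a2": (91, 182), "a3": (183, 274)}


def count_differences(seq_a: str, seq_b: str, start: int, end: int) -> int:
    """Count mismatched residues between two aligned sequences in [start, end]."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    region_a = seq_a[start - 1:end]
    region_b = seq_b[start - 1:end]
    return sum(a != b for a, b in zip(region_a, region_b))


def difference_table(sequences: dict, domains: dict = DOMAINS) -> dict:
    """Return per-domain and total difference counts for every pair of sequences."""
    table = {}
    for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
        row = {
            domain: count_differences(seq_a, seq_b, start, end)
            for domain, (start, end) in domains.items()
        }
        # The total is computed from the per-domain counts only.
        row["total"] = sum(row.values())
        table[f"{name_a}/{name_b}"] = row
    return table
```

Because each total is the sum of the three domain counts, the same relation can be used to cross-check a row of the printed table; for example, the Bw58/B7 row reads 16 + 12 + 2 = 30.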

The Structure of the HLA-Bw58 Gene-A 6.5-kb EcoRI fragment of clone λC4Bw58 containing the entire HLA-Bw58 gene was subcloned into pBR322. A partial restriction map of this plasmid clone, pBw58, was derived using endonucleases that cut infrequently in HLA-A2 and HLA-B7 genes. By alignment of restriction enzyme sites shared by HLA-Bw58 and HLA-B7 genes, an estimate of the position of the ~3.5-kb HLA-Bw58 gene within the insert was obtained. Restriction enzyme fragments spanning this region were subcloned into M13 vectors for DNA sequence determination as shown in Fig. 2. The nucleotide sequence of the entire 3520-base pair region is shown in Fig. 3.

The exon-intron organization of the HLA-Bw58 gene was determined by comparing the deduced amino acid sequence of pBw58 in all three reading frames with HLA-A2 and HLA-B7 protein sequences. Splicing signals at the exon-intron boundaries are consistent with those described by Breathnach and Chambon (43). The HLA-Bw58 gene consists of seven protein-encoding exons separated by seven introns of varying length. The organization is similar to that described for HLA-B7 (25) and two HLA class I pseudogenes HLA p12.4 (44) and LN-11A (25). It differs from HLA-A2 (31), -A3 (45), -Aw24 (46), and -Cw3 (47) because translation is terminated in the seventh exon and the eighth exon is therefore not translated.
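Translating a genomic sequence in all three forward reading frames, as described above, can be sketched as follows. This is an illustration only, not the software used in the study: it applies the standard genetic code, marks stop codons with `*` and unrecognized codons with `X`, and includes a hypothetical helper, `frames_containing`, that reports which frames contain a short peptide taken from a reference protein.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame (0, 1, or 2); '*' marks a stop codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna: str) -> dict:
    """Return the translation of each of the three forward reading frames."""
    return {frame: translate(dna, frame) for frame in range(3)}


def frames_containing(dna: str, peptide: str) -> list:
    """List the forward frames whose translation contains the given peptide."""
    translations = three_frame_translation(dna)
    return [frame for frame, protein in translations.items() if peptide in protein]
```

In a gene with introns, the frame that matches the reference protein changes from exon to exon, which is why all three frames have to be inspected along the genomic sequence. Exact substring matching is a simplification, since a polymorphic allele will not match a reference peptide at every position.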

Protein Sequence of HLA-Bw58 and Antigenic Structure-In Fig. 4, the deduced HLA-Bw58 protein sequence is compared to other class I sequences from rabbit, mouse, and human (9, 31, 44-59). The results show that Bw58 is a novel HLA-A,B,C molecule. Considerable differences between Bw58 and all other HLA molecules are found throughout the sequence, for example, in the leader peptide. However, as the antigenic properties of class I molecules are determined by the three external domains (α1, α2, α3), we shall focus on them. At 73 positions in the three external domains, Bw58 differs from one or more of the other eight expressed HLA proteins. With the exception of position 103, all of these positions were previously shown to be sites of polymorphic variation. A comparison of the external domain sequences for the nine expressed HLA molecules shows that Bw58 has specific substitutions at five positions: 45, 67, 70, 103, and 116. One or more of these presumably account for the unique alloantigenic epitopes of Bw58. Such epitopes have so far not been defined with monoclonal antibodies.

A search was made to identify amino acid residues critical for an antigenic determinant shared by HLA-A2 and HLA-Bw58 molecules and defined by the monoclonal antibody MA2.1 (12). We have shown previously that MA2.1 binds with identical affinity to both molecules, which suggests that they share an identical determinant (21).

At 33 of the 73 polymorphic positions in the three external domains, HLA-A2 and HLA-Bw58 have the same residue. Thirteen are in the first domain, fourteen in the second domain, and six in the third domain. However, for 29 of these 33 positions, the residue shared by A2 and Bw58 is also found in the majority of the nine HLA-A,B,C sequences compared. For this reason, we think they are unlikely to be critical for an epitope specific to A2 and B17. The remaining four residues are at positions 62, 65, 97, and 194. Only the glycine residue at position 62 is shared exclusively by A2 and Bw58. The other residues of restricted specificity are each found in two additional molecules: arginine 65 in A28 and A3, arginine 97 in B40 and Cw3, and valine 194 in A28 and Cw3. Glycine 62 is found at the amino-terminal end of the region of highest sequence diversity in class I molecules, residues 62-83. Further comparison of A2 and Bw58 within this region of variability shows sequence identity from residues 62 to 65 involving three polymorphic positions (62, 63, and 65), including 65, which is another of the three residues of restricted distribution. At position 65, all B locus molecules except HLA-Bw58 have a glutamine residue. HLA-Bw58, like three out of four A locus antigens, has an arginine at this position. The sequence from 62-65 therefore provides the best candidate for involvement in the MA2.1 epitope.
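The filtering step used above (find residues shared by A2 and Bw58, then ask how many other molecules carry the same residue) can be sketched as follows. This is an illustrative sketch only; the function name `others_sharing` is hypothetical, and the data cover only position 65 as described in the text, with Aw24 and Cw3 omitted because their residues at this position are not stated here.

```python
def others_sharing(column, a, b):
    """Return the other alleles carrying the residue shared by a and b.

    `column` maps allele name -> residue at one aligned position.
    Returns None when a and b do not share a residue at this position.
    """
    if column[a] != column[b]:
        return None
    shared = column[a]
    return sorted(k for k, v in column.items() if v == shared and k not in (a, b))


# Position 65 as described in the text (R = arginine, Q = glutamine).
position_65 = {
    "A2": "R", "A28": "R", "A3": "R", "Bw58": "R",
    "B7": "Q", "B40": "Q", "B27": "Q",
}

print(others_sharing(position_65, "A2", "Bw58"))
```

A position where the returned list is empty, such as glycine 62 in the comparison above, is shared exclusively by the two molecules.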

Within residues 62-69, Bw58 bears a greater similarity to the A locus molecules A2, A28, A3, and Aw24 than it does to the allelic B locus molecules B7, B40, and B27. These similarities also hold at the nucleotide level: Bw58 has 2/24 nucleotide differences from A2 compared to 8/24 from B7. A continued comparison of this region shows that from position 70 to 83 Bw58 shows no extraordinary homology with A2, A28, and A3 and is in general more similar to the B locus


Poly(A) site and CATG

Gene          Sequence                Position
HLA-Bw58      TTCCA GAATCTGCATGTT     13
HLA-A2        ----- --G--CA-G----     18
HLA-A3        ----- --G--CA-G----     18
HLA-Aw24      ----- --G--CA-G----     18
HLA-Cw3       ----- -----C-------     14
HLA-p12.4     ----- --G--CATG----     18
HLA-B7        ----- A- (d)            13
H-2Kb         ----- -------GG-CAC     17
H-2Kd         ----- -------GG-CAC     17

G + T homology

Gene (reference)          Sequence        Position
HLA-Bw58                  TCTGTTGCAGG     43
HLA-A2 (31)               -T---------     47
HLA-A3 (45)               -T---------     48
HLA-Aw24 (46)             -T---------     47
HLA-Cw3 (47)              -----------     44
HLA-p12.4 (44)            -T---------     47
HLA-B7 (25)
H-2Kb (53)                -------GGAC     48
H-2Kd (54)                -------GGAC     48
Rabbit β-globin (65)      -G-----G-A      25
Human IFN-α1 (66)         -G-----TTC      34
Human IFN-α2 (67)         -A-----T-C      20
Human IFN-γ (67)          -G-----G-C      25
Chicken ovalbumin (68)    -G---A-         53
SV40 early (69)           ----            23

* This sequence was obtained by DNA sequencing in the 3' to 5' direction only of a 5' end SmaI-XbaI fragment of the Bw58 gene (Fig. 2).
CCCA repeats are underlined.
5' end sequence positions refer to the distance from the ATG start codon.
d The start of poly(A) tail for a B7 cDNA (25).
e 3' end sequence positions refer to the distance from the AATAAA polyadenylation signal.

products. However, a striking identity between Bw58, Aw24, and the pseudogene p12.4 is seen from positions 75-83.
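Difference counts such as 2/24 are position-by-position mismatch counts over aligned segments. The nucleotide sequences for residues 62-69 are not reproduced in this section, so the sketch below illustrates the same counting on the poly(A)-region rows of Table II instead, where a dash denotes identity with the Bw58 sequence. The row strings are as read from Table II, and the function names `expand` and `count_differences` are hypothetical.

```python
def expand(reference, row):
    """Replace each '-' in a dash-notation row with the reference base."""
    if len(reference) != len(row):
        raise ValueError("row and reference must be aligned to the same length")
    return "".join(r if c == "-" else c for r, c in zip(reference, row))


def count_differences(a, b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


bw58 = "GAATCTGCATGTT"    # HLA-Bw58 row, segment 3' to the poly(A) site
a2_row = "--G--CA-G----"  # HLA-A2 row in dash notation
a2 = expand(bw58, a2_row)

print(a2, "%d/%d differences" % (count_differences(bw58, a2), len(bw58)))
```

The expanded A2 segment contains CACGTG where Bw58 has CATG, in agreement with the description of these sequences in the text.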

Our comparison of sequences within the major region of polymorphic diversity shows an NH2-terminal segment where Bw58 is more like A locus molecules than B locus molecules and a second segment where Aw24 is more like B locus molecules than A locus molecules and most like Bw58. The existence of localized short segments of homology in polymorphic regions between alleles at different loci is a recurring feature of class I genes (46, 48, 50, 53, 60).

The A locus characteristics of the Bw58 sequence in the major polymorphic region are also apparent when one compares the overall sequence relationships in the α1 domain (Table I). Unlike the other HLA-A and B locus molecules, Bw58 shows no greater homology with allelic molecules in this domain. This is not true for the α2 domain, where Bw58 is clearly more homologous to B locus products.

Overall Organization of Polymorphism in HLA-A,B,C Molecules-The sequence of Bw58 has revealed only a single position of polymorphic variation that was not identified from comparison of the existing eight sequences for bona fide HLA-A,B,C molecules. As an overall comparison of the sequences showed that Bw58 is not a particularly close relative of any of the other molecules, this suggests that the majority of the positions of variability in HLA-A,B,C molecules have been identified. This gives one some confidence in an assessment


TABLE III
Inverted repeat sequences in the HLA-Bw58 gene

Repeat  Start     Sequence               End       Insert       Number of  ΔG     Location
        position                         position  length (bp)  matches
IR 1    277       TCCT CGCCCCC           287       46           11/12      -28.8  Intron 1
        231       AGGACGCGGGGG           220
IR 2    611       CGGCCCGGGGTCGC         624       34           12/14      -27.8  Intron 2
        577       GCCCGGCCCCAGTG         564
IR 3    1102      CCTTCCCCATCTCCTATAGGT  1122      39           16/21      -31.8  Exon/intron 3
        1063      GGAAG~CAA~AGG TCCA     1046
IR 4    1268      CTCTGAGGAAA            1278      39           10/11      -10.4  Intron 3
        1229      GAGTCTCCTTT            1219
IR 5    1912      AGCCCCTCACCCTG         1925      6            12/14      -24.2  Exon 4
        1906      TCGGGGAGTACGAC         1893
IR 6    1942      GGGATGAGGGG            1952      19           10/11      -24.8  Exon/intron 4
        1923      CCC ACTCCCC            1914
IR 7    2024      CCTCCTTTCCCAGAG        2038      46           13/15      -18.4  Intron 4
        1978      GGACGAAAGGGACTC        1964
IR 8    2417      CCTTAGGAGGG            2427      6            10/11      -17.6  Intron 5
        2411      GGAATCGTCCC            2401
IR 9    3350      ATTTGAGAGAGCAAAT       3365      49           14/16      -11.4  3' noncoding region
        3301      TAAACTCTTTTGTTTA       3286

of the overall pattern of structural polymorphism based on the nine sequences in Fig. 4. In analyzing the sequences we have taken into account the numbers and distribution of positions showing variability and also the number of alternative residues found at a given position.

Comparison of the three domains gives 29, 32, and 12 positions of variability for α1, α2, and α3, respectively. Although these numbers are comparable for α1 and α2, the distribution of variability is different, and in general the diversity of residues at each position is greater in α1 than in α2. This can be appreciated by noting that only 6 of the 32 variable positions in α2 have more than two alternative residues, whereas for α1 the number is 14 of 29. This is in part due to locus-specific differences, as described in the next section.
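The two quantities used in this comparison, the number of variable positions and the number of positions with more than two different residues, can be computed directly from an alignment. The sketch below is illustrative only; the function name `variable_positions` and the three eight-residue toy sequences are hypothetical and are not taken from Fig. 4.

```python
def variable_positions(alignment, first_position=1):
    """Map each variable position to the set of residues observed there.

    `alignment` maps sequence name -> aligned sequence (all the same length).
    """
    sequences = list(alignment.values())
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    result = {}
    for i, column in enumerate(zip(*sequences)):
        residues = set(column)
        if len(residues) > 1:  # more than one residue means the position varies
            result[first_position + i] = residues
    return result


# Hypothetical toy alignment (not real HLA sequences).
toy = {
    "allele_1": "GSHSMRYF",
    "allele_2": "GSHSLRYF",
    "allele_3": "GSHTVRYF",
}

var = variable_positions(toy)
highly_variable = [p for p, residues in var.items() if len(residues) > 2]
print(sorted(var), highly_variable)
```

Applied per domain, the length of `var` corresponds to counts such as 29 or 32, and the length of `highly_variable` to counts such as 14 or 6.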

The most significant cluster of variable positions is found between residues 62 and 83 at the COOH-terminal end of the α1 domain. A less extensive and less variable cluster is found between residues 95 and 116 at the NH2-terminal end of the α2 domain. These two clusters are bounded by highly conserved stretches of sequence, and between them is a completely conserved sequence that contains asparagine 86, where the single carbohydrate chain is attached.

The NH2-terminal two-thirds of α1 is for the most part conserved, with a scattering of positions showing modest polymorphism. Exceptions to this generalization are residues 9 and 45, which exhibit five different residues, as high a diversity as seen anywhere in the sequence.

The COOH-terminal two-thirds of α2 is also relatively conserved. Although positions of variability are scattered throughout this part of the sequence, there is no apparent pattern to their organization.

Analysis of the mouse H-2 sequences in Fig. 4 leads to similar conclusions. Although the numbers of polymorphic positions in each domain are comparable for H-2 and HLA, there are some differences. At positions 152-156 in α2 and positions 195-198 in α3, the mouse sequences show greater variability than the corresponding human sequences.

A and B Locus Products Show Distinct Patterns of Polymorphic Variation-With the completion of the Bw58 sequence, it was possible to compare the patterns of polymorphic variation between four HLA-A locus proteins and four HLA-B locus proteins. A potentially important difference was observed. For the four A locus molecules, the amount of polymorphic variation in the α1 and α2 domains is similar. For the four B locus molecules, an equivalent variation in the α2 domain is observed. However, a much greater variation is seen for B locus molecules in the α1 domain. This can be appreciated most simply by comparing the number of positions of polymorphic variation. For the α1 domain of B locus molecules, it is 21 compared with 14 for A locus molecules. For α2 there are 15 polymorphic residues for B and 17 for A. Closer examination shows that there are nine positions in α1 (11, 12, 24, 32, 41, 45, 67, 69, and 71) where no variation is seen for A and where there is variability for B. The most striking of these is position 67, where all four B locus molecules differ. In contrast, there are only two positions in the α1 domain, 76 and 79, where conservation of B locus sequences and variability of A locus sequences is seen. Differences of this type are not found between A and B molecules in the α2 domain.

Another interesting difference between the α1 and α2 domains is the existence of locus-specific residues. In α2 there are five positions (105, 109, 138, 144, and 151) where there is conservation within A and within B locus products but a difference between the loci. In α1, 43 is the only such position. The single C locus sequence shows identity to the B locus products at all positions.
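The three categories distinguished above (variable in B but conserved in A, variable in A but conserved in B, and locus-specific) can be assigned mechanically from two groups of aligned sequences. The following is a minimal sketch under that reading; the function name `classify_positions`, the label strings, and the five-residue toy sequences are hypothetical and do not reproduce the data of Fig. 4.

```python
def classify_positions(locus_a, locus_b, first_position=1):
    """Label positions by how variation is distributed between two loci.

    `locus_a` and `locus_b` are lists of aligned sequences of equal length.
    Positions that are identical in both loci, or variable in both, get no label.
    """
    labels = {}
    columns = zip(zip(*locus_a), zip(*locus_b))
    for i, (col_a, col_b) in enumerate(columns):
        set_a, set_b = set(col_a), set(col_b)
        position = first_position + i
        if len(set_a) == 1 and len(set_b) > 1:
            labels[position] = "variable in B only"
        elif len(set_a) > 1 and len(set_b) == 1:
            labels[position] = "variable in A only"
        elif len(set_a) == 1 and len(set_b) == 1 and set_a != set_b:
            labels[position] = "locus-specific"
    return labels


# Hypothetical toy data (not real HLA sequences).
toy_a = ["GSHAR", "GSHAR", "GTHAR"]
toy_b = ["GSYPR", "GSYPK", "GSYPR"]

print(classify_positions(toy_a, toy_b))
```

With only four sequences per locus, as in the comparison above, a position labelled conserved may still vary among alleles that have not been sequenced.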

Although these observations are of interest, their significance is difficult to assess because of the relatively small sample size of the sequences and the possible biases in their selection. From a serological viewpoint, one would have expected the four B locus molecules to be more closely related


FIG. 5. A potential stem and loop structure formed by residues 1037 to 1130 of the HLA-Bw58 gene. The GT splice site between exon 3 and intron 3 is underlined. Amino acid numbers are given for each codon.

in structure than the four A locus molecules. It therefore seems significant that more extensive variation is seen for the B locus products. This result would also be consistent with the observed larger number of alleles at the B locus (10). Analysis of the smaller number of A and B gene nucleotide sequences is consistent with these observations made from the protein sequence.

Nucleotide Sequence of the HLA-Bw58 Gene-Comparison of the nucleotide sequence of the Bw58 gene with five other HLA class I genes revealed greatest homology (95%) with the allelic gene B7. Homologies of 88% with Cw3, 84% with A2 and A3, and 83% with Aw24 and p12.4 were obtained. In their analysis of the Aw24 gene, N'Guyen et al. (46) found that for A locus alleles homology was greater in the 3' half of the gene than in the 5' half. In contrast, when alleles of different loci were compared, the level of homology remained relatively constant and at a lower level. A similar trend was observed here for the two alleles B7 and Bw58 of the B locus. Thus locus-specific characteristics for both A and B locus alleles exist in their 3' halves.
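A comparison of homology in the 5' and 3' halves of two aligned genes can be sketched as below. This is an assumption-laden illustration: it treats homology as percent identity over gap-free aligned sequences and splits at the midpoint, neither of which is specified in the text; the function names and the 20-base toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def identity_by_half(a, b):
    """Return (5' half identity, 3' half identity), splitting at the midpoint."""
    mid = len(a) // 2
    return percent_identity(a[:mid], b[:mid]), percent_identity(a[mid:], b[mid:])


# Hypothetical toy sequences: two differences, both in the 5' half.
toy_1 = "ACGTACGTACGTACGTACGT"
toy_2 = "ACGAACCTACGTACGTACGT"

print(percent_identity(toy_1, toy_2), identity_by_half(toy_1, toy_2))
```

Real gene comparisons contain insertions and deletions, so the sequences would first have to be aligned and a policy for gap columns chosen.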

Tykocinski and Max (61) reported that MHC genes have an unusually high content of the dinucleotide CG when compared to other genes. They also showed that the CG pairs were clustered toward the 5' end of the gene and were thus correlated with the regions of greatest polymorphic variation. The distribution of CG in Bw58 is consistent with their observation from other MHC genes.
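The distribution of CG dinucleotides along a sequence can be tabulated with a simple windowed count. The sketch below is illustrative and is not the analysis of ref. 61; the function names, the window size, and the 18-base toy sequence are hypothetical.

```python
def cg_count(seq):
    """Count occurrences of the dinucleotide CG in a sequence."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def cg_by_window(seq, window):
    """Count CG in consecutive non-overlapping windows.

    A CG that straddles a window boundary is not counted in either window,
    so the window counts can sum to less than the whole-sequence count.
    """
    return [cg_count(seq[i:i + window]) for i in range(0, len(seq), window)]


# Hypothetical toy sequence with CG concentrated toward the 5' end.
toy = "CGCGACGTTTATTTAAAT"

print(cg_count(toy), cg_by_window(toy, 6))
```

For a 3520-base pair gene, windows of a few hundred bases would show whether the counts fall off from the 5' end toward the 3' end.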

Potential Regulatory Sequences in the Bw58 Gene-Eukaryotic genes that are transcribed by RNA polymerase II share CAAT and TATA sequences in their 5' untranslated regions that are thought to be promoter elements important in the initiation of transcription. Previous analysis of the nucleotide sequences of the A2, A3, and Aw24 genes shows that they contain a CAAT sequence and a variant TATA sequence (TCTAAA) 80 bases and 50 bases, respectively, to the 5' side of the ATG start codon (Table II). These sequences are also found at homologous positions in the Bw58 gene. This contrasts with Cw3, which has a CAAT sequence 117 bases to the 5' side of the start codon and a different variant of the TATA sequence (TCTGAA). In murine class I genes, the sequence ACCC at position -113 from the ATG start codon is similar to a distal promoter sequence found in several eukaryotic genes (53). For example, in the rabbit β-globin gene, three ACCC repeats are found at a similar position (62). In this region of HLA-A,B,C genes, we have found three repeats of the sequence CCCA, which may act as a third component of the RNA polymerase II promoter. Four direct repeats of the same sequence are also found to the 3' side of the TATA box. As shown in Table II, the putative class I mRNA start site (63) at adenosine residue -21 in HLA-Bw58 is highly conserved in homologous positions in murine, rabbit, and human class I genes.
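Positions such as "80 bases to the 5' side of the ATG start codon" can be obtained by locating a motif and expressing its coordinate relative to the start codon. The following is a minimal sketch; the function name `motif_offsets` is hypothetical, the toy sequence is a constructed example with filler bases rather than HLA sequence, and measuring from the first base of the motif to the A of the ATG is an assumed convention.

```python
import re


def motif_offsets(seq, motif, start_codon="ATG"):
    """Offsets of every motif occurrence relative to the first start codon.

    Negative values lie to the 5' side of the start codon. Offsets are measured
    from the first base of the motif to the first base of the start codon.
    """
    seq = seq.upper()
    atg = seq.find(start_codon)
    if atg < 0:
        raise ValueError("no start codon found")
    # The lookahead reports overlapping occurrences as well.
    pattern = "(?=%s)" % re.escape(motif.upper())
    return [m.start() - atg for m in re.finditer(pattern, seq)]


# Constructed toy sequence: CAAT and TCTAAA placed 80 and 50 bases 5' of ATG.
toy = "GG" + "CAAT" + "T" * 26 + "TCTAAA" + "C" * 44 + "ATGGCC"

print(motif_offsets(toy, "CAAT"), motif_offsets(toy, "TCTAAA"))
```

The same call with the motif CCCA would list the repeat positions discussed above on a real 5' flanking sequence.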

The polyadenylation signal AATAAA has been identified in the 3' untranslated region of the HLA-A2, A3, Aw24, and Cw3 genes. In the HLA-Bw58 gene, it occurs 394 bases after the eighth exon, which is used in HLA-A2, -A3, -Aw24, and -Cw3 but not in HLA-Bw58 because of the presence of a TGA termination codon in the seventh exon. An adenosine residue situated 19 nucleotides to the 3' side of the poly(A) signal is likely to be the polyadenylation start site, since it is followed by a poly(A) tail in the HLA-B7 cDNA (25). As shown in Table II, it is conserved in mouse and human class I genes. Directly 3' to the poly(A) site is the sequence CATG in the Bw58 and p12.4 genes, CACGTG in the HLA-A2, A3, and Aw24 genes, and CGCATG in the Cw3 gene. These sequences are similar to a CAYTG consensus sequence that is observed near the poly(A) site of many eukaryotic genes and has homology to the small nuclear RNA U4 (64). Further to the 3' side of this region is a (G + T)-rich region which is conserved in rabbit, mouse, and human genes and is similar to sequences found near the poly(A) site of other eukaryotic and prokaryotic genes (Table II) (65-69). These two regions have been postulated to provide signals for polyadenylation and 3' end processing of mRNA.
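In the consensus CAYTG, Y denotes a pyrimidine (C or T). A consensus written with such ambiguity codes can be turned into a regular expression and searched for, as in the sketch below. The function names and the first test string are hypothetical, and only a subset of the ambiguity codes is included. Note that the class I sequences quoted above are described as similar to the consensus, not identical to it, so an exact search of this kind is not expected to match them.

```python
import re

# Subset of the IUPAC nucleotide ambiguity codes needed for this example.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def iupac_to_regex(consensus):
    """Convert a consensus with ambiguity codes into a regular expression."""
    return "".join(IUPAC[c] for c in consensus.upper())


def find_consensus(seq, consensus):
    """Return 0-based start positions of exact matches to the consensus."""
    pattern = "(?=%s)" % iupac_to_regex(consensus)
    return [m.start() for m in re.finditer(pattern, seq.upper())]


print(iupac_to_regex("CAYTG"))
print(find_consensus("GGCACTGAACATTGTT", "CAYTG"))  # hypothetical test string
print(find_consensus("GAATCTGCATGTT", "CAYTG"))     # Bw58 segment from Table II
```

Scoring similarity rather than exact identity would require allowing a mismatch or a one-base deletion, which is beyond this sketch.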

Inverted Repeat Nucleotide Sequences in the Bw58 Gene-A common feature of class I sequences is that pairs of non-allelic genes share small segments of identity in regions of general polymorphic diversity. For example, in the Bw58 sequence we have found segments of identity with A2 and Aw24 within the major diversity region. These observations have led to the hypothesis that genetic exchange events occur between class I genes and could thereby contribute to their diversification (70). Inverted repeat structures could play a role in such exchange mechanisms, and we therefore analyzed


the nucleotide sequences of HLA-Bw58, -A2, -A3, -Aw24, and -Cw3 genes for their presence.

Nine inverted repeat sequences were found in the Bw58 gene and are shown in Table III. Five are located in introns. Three others show specific orientation at potential regulatory sequences. IR9 is in the 3' untranslated region and flanks the sequence which is homologous to the polyadenylation signal in rabbit class I genes (63). This sequence probably is not used for polyadenylation in HLA class I genes. IR3 symmetrically flanks the GT splice site at the boundary between exon 3 and intron 3. The repeat consists of 21 nucleotides with a homology of 76% and a ΔG of -31.8 (71). The homology of these segments increases when a three-nucleotide segment is deleted. A stem and loop structure as shown in Fig. 5 could be a thermodynamically favorable intermediate. The nucleotide segment within the inverted repeat consists of 18 nucleotides of the third exon and 18 nucleotides of the 3'-flanking intron. Within the ends of the repeat sequences, a short direct repeat of the sequence ATA flanks the repeats themselves. In this respect, this sequence is similar to those of transposable elements. It is of interest that the amino acids encoded by the sequence that is flanked by this inverted repeat are polymorphic and show the characteristics of a region that has undergone genetic exchange (50). The 5' end of the inverted repeat sequence is highly conserved in other class I genes, but the 3' end repeat is poorly conserved and so far is found only within the HLA-Bw58 and -B7 sequences.
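The match counts reported for the inverted repeats can be checked by pairing the two arms base by base. The sketch below does this for IR4 and IR7, whose arms contain no gaps. It assumes that the lower arm in Table III is written in the opposite orientation so that it pairs column by column with the upper arm, which is how the descending position numbers of the lower arms read; the function name `paired_matches` is hypothetical, and only Watson-Crick pairs are counted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def paired_matches(upper_arm, lower_arm):
    """Count Watson-Crick pairs between two arms aligned column by column.

    Returns (number of pairs, arm length). Gapped arms are not handled.
    """
    if len(upper_arm) != len(lower_arm):
        raise ValueError("gapped or unequal arms are not handled in this sketch")
    pairs = sum(
        u.translate(COMPLEMENT) == l for u, l in zip(upper_arm, lower_arm)
    )
    return pairs, len(upper_arm)


# Arm sequences as listed in Table III.
ir4 = paired_matches("CTCTGAGGAAA", "GAGTCTCCTTT")
ir7 = paired_matches("CCTCCTTTCCCAGAG", "GGACGAAAGGGACTC")

print(ir4, ir7)
print(round(100 * 16 / 21))  # IR3: 16 matches over 21 nucleotides, in percent
```

The counts agree with the 10/11 and 13/15 listed for IR4 and IR7, and 16/21 rounds to the 76% quoted for IR3.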

A second inverted repeat found in HLA-Bw58 and in several other HLA class I genes consists of a short 11-nucleotide repeat that symmetrically flanks the GT splice site of exon four. The presence of two such repeats around the GT splice sites of these genes suggests they may play some role in the splicing of class I mRNA.

Acknowledgments-We thank Drs. F. Ward and D. Kostyu for HLA typing and Drs. H. McDevitt, N. Holmes, D. Mathis, C. Benoist, A. Buchman, D. Peattie, R. Fox, and L. Naumovski for their support and advice. We thank Drs. H. Orr, A. Biro, S. Weissman, J. Strominger, J. Lopez de Castro, and B. Jordan for sharing their unpublished results. We thank A. McIntyre for preparation of the manuscript.

REFERENCES

1. Klein, J. (1979) Science 203, 516-521
2. Van Someren, H., Westerveld, A., Hagemeijer, A., Mees, J. R., Meera Khan, P., and Zaalberg, O. B. (1974) Proc. Natl. Acad. Sci. U. S. A. 71, 962-965
3. Goodfellow, P. N., Jones, E. A., van Heyningen, V., Solomon, E., Bobrow, M., Miggiano, V., and Bodmer, W. F. (1975) Nature 254, 267-269
4. Grey, H. M., Kubo, R. T., Colon, S. M., Poulik, M. D., Cresswell, P., Springer, T., Turner, M., and Strominger, J. L. (1973) J. Exp. Med. 138, 1608-1612
5. Vitetta, E. S., Uhr, J. W., and Boyse, E. A. (1975) J. Immunol. 114, 252-254
6. Coligan, J. E., Kindt, T. J., Uehara, H., Martinko, J., and Nathenson, S. G. (1981) Nature 291, 35-39
7. Orr, H. T., Lancet, D., Robb, R. J., Lopez de Castro, J., and Strominger, J. L. (1979) Nature 282, 266-270
8. Peterson, P. A., Cunningham, B. A., Berggard, I., and Edelman, G. M. (1972) Proc. Natl. Acad. Sci. U. S. A. 69, 1697-1701
9. Kimball, E. S., and Coligan, J. E. (1983) Contemp. Topics Mol. Immunol. 9, 1-63
10. Bodmer, W. F., Albert, E., Bodmer, J. G., Dausset, J., Kissmeyer-Nielsen, F., Mayr, W., Payne, R., van Rood, J. J., Trnka, Z., and Walford, R. L. (1984) Human Immunol. 11, 117-125
11. Parham, P., Androlewicz, M. J., Brodsky, F. M., Holmes, N., and Ways, J. P. (1982) J. Immunol. Methods 53, 133-173
12. McMichael, A. J., Parham, P., Rust, N., and Brodsky, F. (1980) Human Immunol. 1, 121-129
13. Belvedere, M., Mattiuz, P., and Curtoni, E. S. (1975) Immunogenetics 1, 538-548
14. Legrand, L., and Dausset, J. (1974) Transplantation 19, 177-180
15. Scalamonga, M., Mercurial, F., Pizzi, C., and Sirchia, G. (1976) Tissue Antigens 7, 125-127
16. Claas, F., Castelli-Visser, R., Schreuder, I., and van Rood, J. (1982) Tissue Antigens 19, 388-391
17. Ahern, A. T., Phelan, D. L., Oldfather, J. W., Parham, P., and Fuller, T. C. (1982) 8th Annual Meeting of the American Association for Clinical Histocompatibility Testing, Abstr. A-1
18. Darke, C. (1984) Tissue Antigens 23, 141-150
19. Richiardi, P., Cambon, A., and Menicucci, A. (1985) in Histocompatibility Testing 1984 (Albert, E. D., Baur, M. P., and Mayr, W. R., eds) pp. 237-238, Springer-Verlag, Berlin
20. Gyodi, E., and Petranyi, G. G. (1980) in Histocompatibility Testing 1980 (Terasaki, P. I., ed) pp. 447-449, UCLA Tissue Typing Laboratory, Los Angeles, CA
21. Ways, J. P., and Parham, P. (1983) Biochem. J. 216, 423-432
22. Gross-Bellard, M., Oudet, P., and Chambon, P. (1973) Eur. J. Biochem. 36, 32-38
23. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
24. Benton, W. D., and Davis, R. W. (1977) Science 196, 180-182
25. Biro, P. A., Pan, J., Sood, A. K., Kole, R., Reddy, V. B., and Weissman, S. M. (1983) Cold Spring Harbor Symp. Quant. Biol. 47, 1079-1086
26. Koller, B. H., Sidwell, B., DeMars, R., and Orr, H. T. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 5175-5178
27. Rigby, P. W. J., Dieckmann, M., Rhodes, C., and Berg, P. (1977) J. Mol. Biol. 113, 237-251
28. Hanahan, D., and Meselson, M. (1983) Methods Enzymol. 100, 333-342
29. Vogelstein, B., and Gillespie, D. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 615-619
30. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
31. Koller, B. H., and Orr, H. T. (1985) J. Immunol. 134, 2727-2733
32. Queen, C., and Korn, L. J. (1984) Nucleic Acids Res. 12, 581-599
33. Wigler, M., Pellicer, A., Silverstein, S., and Axel, R. (1978) Cell 14, 725-731
34. Colombani, J., Dausset, J., Lepage, V., Degos, L., Kalil, J., and Fellous, M. (1982) Tissue Antigens 20, 161-171
35. Parham, P. (1983) Immunogenetics 18, 1-16
36. Boyd, H. C., Smilek, D. E., Spielman, R. S., Zmijewski, C. M., and McKearn, T. J. (1981) Immunogenetics 12, 313-319
37. Liabeuf, A., Leborge de Kaouel, C., Kourilsky, F., Malissen, B., Manuel, Y., and Sanderson, A. R. (1981) J. Immunol. 127, 1542-1548
38. Ozato, K., Mayer, N., and Sachs, D. H. (1980) J. Immunol. 124, 533-540
39. Parham, P. (1983) Methods Enzymol. 92, 110-138
40. Southern, E. M. (1975) J. Mol. Biol. 98, 503-517
41. Barbosa, J. A., Kamarck, M. E., Biro, P. A., Weissman, S. M., and Ruddle, F. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 6327-6331
42. Bernabeu, C., Van de Rijn, M., Lerch, P. G., and Terhorst, C. P. (1984) Nature 308, 642-645
43. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
44. Malissen, M., Malissen, B., and Jordan, B. R. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 893-897
45. Strachan, T., Sodoyer, R., Damotte, M., and Jordan, B. R. (1984) EMBO J. 3, 887-894
46. N'Guyen, C., Sodoyer, R., Trucy, J., Strachan, T., and Jordan, B. R. (1985) Immunogenetics 21, 479-489
47. Sodoyer, R., Damotte, M., Delovitch, T. L., Trucy, J., Jordan, B. R., and Strachan, T. (1984) EMBO J. 3, 879-885
48. Lopez de Castro, J. A., Strominger, J. L., Strong, D. M., and Orr, H. T. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 3813-3817
49. Lopez de Castro, J., Bragado, R., Strong, D. H., and Strominger, J. L. (1983) Biochemistry 22, 3961-3969
50. Ezquerra, A., Bragado, R., Vega, M. A., Strominger, J. L., Woody, J. N., and Lopez de Castro, J. A. (1985) Biochemistry 24, 1733-1741
51. Nairn, R., Nathenson, S. G., and Coligan, J. E. (1981) Biochemistry 20, 4739-4745
52. Orr, H. T., López de Castro, J. A., Lancet, D., and Strominger, J. L. (1979) Biochemistry 18, 5711-5720
53. Weiss, E., Golden, L., Zakut, R., Mellor, A., Fahrner, K., Kvist, S., and Flavell, R. A. (1983) EMBO J. 2, 453-462
54. Kvist, S., Roberts, L., and Dobberstein, B. (1983) EMBO J. 2, 245-254
55. Maloy, W. L., and Coligan, J. E. (1982) Immunogenetics 16, 11-22
56. Reyes, A. A., Schold, M., and Wallace, R. B. (1982) Immunogenetics 16, 1-9
57. Moore, K. W., Sher, B. T., Sun, Y. H., Eakle, K. A., and Hood, L. (1982) Science 215, 679-682
58. Evans, G. A., Margulies, D. H., Camerini-Otero, R. D., Ozato, K., and Seidman, J. G. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 1994-1998
59. Tykocinski, M., Marche, P. N., Max, E. E., and Kindt, T. J. (1984) J. Immunol. 133, 2261-2269
60. Pease, L. R., Schulze, D. H., Pfaffenbach, G. M., and Nathenson, S. G. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 242-246
61. Tykocinski, M. L., and Max, E. E. (1984) Nucleic Acids Res. 12, 4385-4396
62. Dierks, P., van Ooyen, A., Cochran, M. D., Dobkin, C., Reiser, J., and Weissmann, C. (1983) Cell 32, 695-706
63. Marche, P. N., Tykocinski, M. L., Max, E. E., and Kindt, T. J. (1985) Immunogenetics 21, 71-82
64. Berget, S. M. (1984) Nature 309, 179-182
65. Gil, A., and Proudfoot, N. J. (1984) Nature 312, 473-474
66. Nagata, S., Mantei, N., and Weissmann, C. (1980) Nature 287, 401-408
67. Taya, Y., Devos, R., Tavernier, J., Cheroutre, H., Engler, G., and Fiers, W. (1982) EMBO J. 1, 953-958
68. Benoist, C., O'Hare, K., Breathnach, R., and Chambon, P. (1980) Nucleic Acids Res. 8, 127-142
69. Fiers, W., Contreras, R., Haegeman, G., Rogiers, R., Van de Voorde, A., Van Heuverswyn, H., Van Herreweghe, J., Volckaert, G., and Ysebaert, M. (1978) Nature 273, 113-120
70. Baltimore, D. (1981) Cell 24, 592-594
71. Tinoco, I., Jr., Borer, P. N., Dengler, B., Levine, M. D., Uhlenbeck, O. C., Crothers, D. M., and Gralla, J. (1973) Nature New Biol. 246, 40-41