a mosaic of monofunctional domains
TRANSCRIPT
Biochem. J. (1987) 246, 375-386 (Printed in Great Britain)
The pentafunctional arom enzyme of Saccharomyces cerevisiae isa mosaic of monofunctional domainsKenneth DUNCAN,* R. Mark EDWARDSt and John R. COGGINS*j*Department of Biochemistry, University of Glasgow, Glasgow G12 8QQ, and tSearle Research and Development,Lane End Road, High Wycombe, Bucks. HP12 4HL, U.K.
The nucleotide sequence of the Saccharomyces cerevisiae AROI gene which encodes the arom multifunctionalenzyme has been determined. The protein sequence deduced for the pentafunctional arom polypeptide is1588 amino acids in length and has a calculated Mr of 174555. Functional regions within the polypeptidechain have been identified by comparison with the sequences of the five monofunctional Escherichia colienzymes whose activities correspond with those of the arom multifunctional enzyme. The observedhomologies demonstrate that the arom polypeptide is a mosaic of functional domains and are consistentwith the hypothesis that the AROI gene evolved by the linking of ancestral E. coli-like genes.
INTRODUCTIONMultifunctional enzymes are found in all classes of
organisms, including bacteria, higher plants and mam-mals, but they appear to be particularly common on thebiosynthetic pathways ofthe lower eukaryotes (Kirschner& Bisswanger, 1976; Schmincke-Ott & Bisswanger, 1980;Hardie & Coggins, 1986). Most multifunctional enzymescatalyse two or more consecutive reactions on abiosynthetic pathway. In some cases the pathwayintermediates are covalently bound to the enzyme, as inthe case of the fatty acid synthases (Schweizer, 1986;Hardie & McCarthy, 1986), but more frequently theproducts of the individual reactions are free to diffuseaway from the enzyme surface. The arom multifunctionalenzyme (Lambert et al., 1985; Coggins et al., 1985;Coggins & Boocock, 1986) which catalyses the fivecentral steps of the shikimate pathway (see Fig. 1) is anexample of this latter type of multifunctional enzyme.One remarkable feature of the arom system is the
diversity in the patterns of gene and enzyme organizationfound in different species. Genetic and biochemicalstudies have revealed the presence of an 'arom genecluster' in Neurospora crassa (Giles et al., 1967a;Catcheside et al., 1985), Aspergillus nidulans (Ahmed &Giles, 1969; Charles et al., 1986), Saccharomycescerevisiae (de Leeuw, 1967; Larimer et al., 1983),Schizosaccharomycespombe (Strauss, 1979; Nakanishi &Yamamoto, 1984), a number of other fungal and yeastspecies (Ahmed & Giles, 1969; Bode & Birnbaum, 1981),and in Euglena gracilis (Berlyn et al., 1970). In contrast,the corresponding structural genes for the five centralenzymes of the shikimate pathway in Escherichia coli,Salmonella typhimurium and Bacillus subtilis are widelyscattered about the genome (Bachmann, 1983; Sanderson& Roth, 1983; Henner & Hoch, 1980) and in the case ofE. coli the five enzymes have also been shown to beseparable (Berlyn & Giles, 1969; Chaudhuri & Coggins,1985; Coggins et al., 1985). In plants three of theenzymes of the pathway are separable but two,
3-dehydroquinase and shikimate dehydrogenase, co-purify (Polley, 1978; Koshiba, 1979; Fiedler & Schultz,1985; Coggins, 1986; Mousdale et al., 1987) and havebeen shown to occur on a single bifunctional polypeptidechain (Polley, 1978; Fiedler & Schultz, 1985; Mousdaleet al., 1987).
These observations raise the question of whatrelationship there is between the prokaryotic mono-functional shikimate pathway enzymes and the multi-functional eukaryotic enzymes. On the basis of limitedproteolysis experiments on the N. crassa arom multi-functional enzyme (Smith & Coggins, 1983; Cogginset al., 1985; Coggins & Boocock, 1986) and fromknowledge of its subunit molecular mass and the subunitmolecular mass of the five corresponding E. coli enzymeswe have proposed that the arom protein has a mosaicstructure consisting of five autonomous, monofunctionaldomains each one homologous to the appropriate E. clienzyme (Coggins et al., 1985; Chaudhuri & Coggins,1985; Coggins & Boocock, 1986; Hardie & Coggns,1986).To confirm this hypothesis we set out to determine -tbe
complete sequence of the S. cerevisiae arom protein andthe five corresponding monofunctional. E. coli enzymes,.While this work was in progress Hawkins and vi1co-workers reported the partial (Charles et at., 1985) andlater the complete (Charles et al., 1986) sequence of theA. nidulans arom multifunctional enzyme and pointedout that it contained a region homologous to E. coliEPSP synthase (Charles et al., 1986; Duncan et al., 19S4).Here we report the complete sequence of the S. cerevisiaearom multifunctional enzyme and compare it with tiesequences, determined in our laboratory, of all five of thecorresponding monofunctional E. coli enzymes (Duncanet al., 1984, 1986; Millar et al., 1986; Millar & Coggins,1986; Anton & Coggins, 1987). The results confirm thatthe arom polypeptide is a 'mosaic' of five functionaldomains, each of which is homologous to a mono-functional E. coli polypeptide.
Abbreviation used: EPSP, 5-enoylpyruvylshikimate 3-phosphate,t To whom correspondence and reprint requests should be addressed.These sequence data have been submitted to the EMBL/GenBank Data Libraries under the accession number Y003 13.
Vol. 246
375
K. Duncan, R. M. Edwards and J. R. Coggins
GO CHH
HO.
Erytpho
OH co2hrose 4- oJ)sphate QHL0CO 1 O CH2H OH. 2 OH
M2+NAD+
2
HO~ Go-Ks 2
OH3
3-1heptulo.
Phosphoenolpyruvate
CO-2 NADPH NP+
O -OH 4OH
3-Dehydroshikimate
Deoxy-D-arabino-isonic acid-7-phosphate
CO-H AT]
HO OHOH
Shikimate
3-Dehydroquinate
CO-P ADP
OH
Shikimate 3-phosphate
Go-
2
CH2
(E)' co2OH
5-Enolpyruvylshikimate3-phosphate
FMNNADPH
7
2 CH /tPrephenate Phenylalanine
P.AnthranilateoTTryptophano C02- p-Aminobenzoate
OH
Chorismate
Fig. 1. Reactions of the early common pathway of aromadc amino acid biosynthesis
The numbers on the Figure refer to the enzymes of the pathway: (1) 3-deoxy-D-arabino-heptulosonic acid 7-phosphate synthase(EC 4.1.2.15), (2) 3-dehydroquinate synthase (EC 4.6.1.3), (3) 3-dehydroquinase (EC 4.2.1.10), (4) shikimate dehydrogenase (EC1.1.1.25), (5) shikimate kinase (EC 2.7.1.71), (6) 5-enoylpyruvylshikimate 3-phosphate (EPSP) synthase (EC 2.5.1.19, alternativename 3-phosphoshikimate 1-carboxyvinyltransferase), (7) chorismate synthase (EC 4.6.1.4). The arom multifunctional enzymecatalyses reactions 2-6.
MATERIALS AND METHODSMaterials
Restriction enzymes were purchased from a number ofcommercial suppliers and were used in accordance withthe manufacturers' instructions. T4 DNA ligase andE. coli DNA polymerase I were from Bethesda Re-search Laboratories, Paisley, U.K. All reagents forDNA sequencing, including z-[35S]thioATP, werepurchased from Amersham International, Amersham,Bucks., U.K.
Cloning and DNA sequence analysisPlasmid preparations and manipulations were as
described in Maniatis et al. (1982). DNA sequencingmethods have been described previously (Duncan et al.,1984). The paired vectors M13mp8 and M13mp9(Messing & Vieira, 1982) or Ml3mpl8 and Ml3mp19(Norrander et al., 1983) were used.
RESULTS AND DISCUSSION
Nucleotide sequence determination of the ARO1 gene
The characterization of two independently isolatedS. cerevisiae AROJ clones, pFL6 [a derivative ofYpARI (Larimer et al., 1983)] and pME173, has been
described (Duncan et al., 1987). These plasmids, whichby restriction analysis have almost identical genomicinserts, are capable of complementing the auxotrophiclesions in a number of E. coli aromatic pathway mutantstrains, namely aroA, aroB, aroD and aroE strains. Theability of a series of deletion derivatives of pME173 tocomplement the various aromatic pathway mutants leaddirectly to the location on the genomic insert of the'sub-regions' within AROJ, and suggested the likelydirection of transcription by comparison with the knownorder of the activities on the analogous N. crassapolypeptide (Duncan et al., 1987).The nucleotide sequence of pFL6 was determined,
using the M13/dideoxy method (Sanger et al., 1977;Messing & Vieira, 1982; Biggin et al., 1983; Norranderet al., 1983) and the sequence confirmed by sequenceanalysis of the Sau3A fragments between the KpnI andHindIll sites of pME173. The sequencing strategy isoutlined in Fig. 2. Both strands were sequenced in theirentirety and all the restriction sites used to generatefragments for sequencing were overlapped by thesequenced fragments.
Translation of the DNA sequence revealed a singleopen reading frame which is sufficiently long to encodethe arom polypeptide. This 1588 amino acid sequence(shown in Fig. 3) has a calculated Mr of 174555, whichcompares with an estimate by SDS/polyacrylamide-gel
1987
PEP Pi
376
0 0---.<CH 2
Sequence of Saccharomyces cerevisiae arom multifunctional enzyme
ATG TAG
pF L6
Taql 1-ISau3A 1=1 I _~.IBgllPstiHpaIHinc I1:XbaIPvulEco R I -SstI
BamHI I
pME173/Sau3A I '., * *-l
500 bp
, r-or: ---. L 0 -1aI 11I I I I I I
-F--Iw~~~~~~~~~~~~
I8-
4 ]
4- I--.Il
-JI-I--
4 J,c-
1I
Fig. 2. Sequencing strategy for the yeast ARO1 gene
Restriction sites used for sub-cloning into Ml 3 prior to DNA sequence analysis are shown. Arrows indicate the direction andlength of sequence obtained; *indicates clones resulting from digestion at EcoRI* sites, or blunt-end cloning of sheared DNA.
electrophoresis of 165000 for the corresponding enzymeisolated from Neurospora crassa (Lumsden & Coggins,1977). Although direct evidence in support of theassignment of the N-terminal methionine is lacking, thisparticular ATG codon is favoured for two reasons. Thenext methionine in the open reading frame, Met-104, iswithin the region of the arom polypeptide which ishomologous to the product of the E. coli aroB gene (seebelow). Also, it has been shown that the first AUG codonin eukaryotic mRNA usually serves as the translationinitiation codon (Kozak, 1984); evidence has beenobtained that this codon is the first AUG in the AROJtranscript (Duncan et al., 1987).
Overall comparison of S. cerevisiae ARO1 gene with thefive corresponding aro genes of E. coli and the aromgene of Aspergillus nidulansThe subunit Mr of each of the individual E. coli
enzyme is listed in Table 1. The yeast arom subunit Mrof 174555 corresponds closely to the Mr of 159698 forthe combined E. coli enzymes. In order to confirm thatthe arom protein contained all five E. coli domains thepredicted AROJ coding sequence was compared, in turn,with each of the individual E. coli protein sequencescorresponding to the five activities of the complex. Thiswas carried out using the programs 'BESTFIT' and'GAP' (University of Wisconsin Genetics ComputerGroup Package of DNA sequence analysis programs;Devereux et al., 1984). BESTFIT uses the 'localhomology' algorithm of Smith & Waterman (1981) tofind the best segments of similarity between twosequences. The alignments obtained are illustrated inFig. 4. There are very clear homologies between thesequence of each E. coli enzyme and a region ofcorresponding length in the S. cerevisiae multifunctionalenzyme. The order of the activities on the arom poly-peptide chain is the same as that predicted for N. crassa(Giles et al., 1 967a), and for S. pombe (Strauss,1979; Nakanishi & Yamamoto, 1984), and not as
originally deduced for S. cerevisiae by de Leeuw (1967).The computer programs were also used to align thesequence of the A. nidulans arom polypeptide (Charleset al., 1985, 1986) with the S. cerevisiae sequence (Fig. 5).The order of functional regions is the same for these twofungal multifunctional enzymes, which are more hom-ologous to each other than they are to the five individualE. coli enzymes. The number of position identities andthe percentage homologies for pair-wise combinations ofthe bacterial and fungal enzymes is shown in Table 2. TheS. cerevisiae, A. nidulans and E. coli sequences are allclearly homologous although the exact degree ofhomology varies with the different domains as discussedbelow.
Specific homologies between the functional domainsThe first 392 amino acid residues of the S. cerevisiae
arom polypeptide are homologous with the E. coli aroBgene product, 3-dehydroquinate synthase (Millar &Coggins, 1986). In the alignment shown in Fig. 4 thereis 36% identity between the two sequences. Thedistribution of the homology shows two very highlyconserved sub-domains, consisting of residues 100-213and 258-387 in the S. cerevisiae sequence. One of thesesub-domains includes the flafl nucleotide-binding foldpreviously identified between residues 96 and 126 in theE. coli 3-dehydroquinate synthase sequence (Millar &Coggins, 1986). Linking these two conserved sub-domains there is a region of very low homology (residues214-257 in the S. cerevisiae sesquence) which contain-sonly one conserved residue and where, in the S. cere-visiae sequence, there is a 27 amino acid insertion.The A. nidulans polypeptide contains a shorter insertion(13 amino acids compared with the E. coli sequence) inthis region (Fig. 5) which cannot be essential for enzymeactivity.A sequence of 11 amino acids (residues 393-403)
connects the aroB region to a region homologous withthe E. coli aroA gene product, EPSP synthase. The.
Vol. 246
377
378 K. Duncan, R. M. Edwards and J. R. Coggins
1 ATGGTGCAGTTAGCCAAAGTCCCAATTCTAGGAAATGATATTATCCACGTFGGGTATAACATTCATGACCATTGGTFGAAACCATAATT 90MetValGlnLeuAlaLyeValProIleLeuGlyAunAupIleIleHilValGlyTyrAunhleHisAspHisLLuValGluThrIleIle
91 AAACATTGTCCTTCTTCGACATACGTTATTTGCAATGATACGAACTTGAGTAAAGTTCCATACTACCAGCAATTAGTCCTGGAATTCAAG 180LyUHlCy.ProSerSerThrTyrValIleCysAsnAspThrAunLeuSerLymValProTyrTyrGlnG1nLeuValLeuGluPheLys
181 GCTFCITTGCCAGAAGGCTCTCGTTTAC 1rACTTATGTTGTTAAACCAGGTGAGACAAGTAAAAGTAGAGAAACCACAGCGCAGCTAGAA 270AlaSerLeuProGluGlySerArgLeuLeuThrTyrValValLysProGlyGluThrSerLyuSerArgGluThrLysAlaGlnLeuGlu
271 GATTATCTTTTAGTGGAAGGATGTACTCGTGATACGGTTATGGTAGCGATCGGTGGTGGTGTTATTGGTGACATGATTGGGTTCGTTGCA 360AupTyrLeuLeuValGluGlyCymThrArgAspThrValMetValAlaIleGlyGlyGlyValIleGlyAspMetIleGlyPheValAla
361 TCTACA1FATGAGAGGTGTTCGTGTTGTCCAAGTACCAACATCCTTATGGCAATGGTCGATFCCTCCATTGGTGGTAACTGCTATT .50SerThrPheMetArgGlyValArgValValGlnValProThrSerLeuLeuAlaMetValAmpSerSerIleGlyGlyLyeThrA alle
451 GACACTCCTCTAGGTAAAAACTTTA1rGGTGCATTrTGGCAACCAAAAnGTCCTTGTAGATATTAAATGGCTAGAAACGTTAGCCAAG 540AmpThrProLeuGlyLyAsAnPheIleGlyAlaPheTrpGlnProLyePheValLeuValAspIleLysTrpLeuGluThrLeuAlaLys
541 AGAGAGTTTATCAATGGGATGGCAGAAGTTATCAAGACTGCTTGTA1TrGGAACGCTGACGAATTTACTAGATTAGAATCAAACGC7TCG 630ArgGluPheIleAunGlyMetAlaGluValIleLysThrAlaCy IleTrpAsnAlaAmpGluPheThrArgLeuGluSerAenAlaSer
631 TrGTrCTTAAATGTTGT'AATGGGGCAAAAAATGTCAAGGTTACCAATCAATTGACAAACGAGATTGACGAGATATCGAATACAGATATT 720LeuPheLeuAunValValAunGlyAlaLyeAunValLysValThrAunGlnLeuThrAunGluIleAupGluIleSerAunThrAspIle
721 GAAGCTATGTTGGATCATACATATAAGTTAGTTCTTGAGAGTATTAAGGTCAAAGCGGAAGTTGTCTCTTCGGATGAACGTGAATCCAGT 810GluAlaMetLeuAspHiuThrTyrLyuLeuValLeuGluSerIleLyuValLysAlaGluValValSerSerAupGluArgGluSerSer
811 CTAAGAAACC1TITGAACTTCGGACATTCTATTGGTCATGCTTATGAAGCTATACTAACCCCACAAGCATTACATGGTGAATGTGTGTCC 900LeuArgAunLeuLeuAsnPheGlyHisSerIleGlyHiuAlaTyrGluAlaIleLeuThrProGlnAlaLeuHiuGlyGluCyuValSer
901 A?TGGTATGGTTAAAGAGGCGGAATTATCCCGTTATTTCGGTATTCTCTCCCCTACCCAAGTTGCACGTCTATCCAAGATTTGGFTTGCC 990IleGlyMetValLyaGluAlaGluLeuSerArgTyrPheGlylleLeuSerProThrGlnValAlaArgLeuSerLyslleLeuValAla
991 TACGGGTTGCCTGTTTCGCCTGATGAGAAATGGTTTAAAGAGCTAACCTTACATAAGAAAACACCATTGGATATCTTATTGAAGAAAATG 1080TyrGlyLeuProValSerProAupGluLyuTrpPheLyuGluLe'iThrLeuHisLy.LyuThrProLeuAepIleLeuLeuLymLysMet
1081 AGTATTGACAAGAAAAACGAGGGTTCCAAAAAGAAGGTGGTCATrmTAGAAAGTATrGGTAAGTGCTATGGTGACTCCGCTCAATTTGTT 1170SerIleAspLyuLyeAunGluGlySerLymLyuLyuValValIleLeuGluSerIleGlyLyuCy.TyrGlyAspSerAlaGlnPh.Val
1171 AGCGATGAAGACCTGAGArmfATTCTAACAGATGAAACCCTCGTTTACCCCTTCAAGGACATCCCTGCTGATCAACAGAAAGTTGTTATC 1260SerA.pGluAepL.uArgPh.IleLeuThrAepGluThrLeuValTyrProPheLyuAmpIleProAiaAspGlnGlnLyuValVallle
1261 CCCCCTGGTTCTAAGTCCATCTCCAATCCTrm AATTCTTGCTGCCCTCGGTGAAGGTCAATGTAAAATCAAGAACTTATTACATTCT 1350ProProGlySerLysSerIleSerAenArgAlaLeuIleLeuAlaAlaLeuGlyGluGlyGlnCyuLyeIleLyeAunLeuLeuHlSer
1351 GATGATACTAAACATATGTTAACCGCTGTTCATGAATTGAAAGGTGCTACGATATCATGGGAAGATAATGGTGAGACGGTAGTGGTGGAA 1440AupAepThrLyuHKtMetLeuThrAlaValHieGluLeuLyuGlyAlaThrIleSerTrpGluAupAunGlyGluThrValValValGlu
1441 GGACATGGTGGTTCCACATTGTCAGCTTGTGCTGACCCCTTATATCTAGGTAATGCAGGTACTGCATCTAGATTTTCGACTTCCTFGGCT 1530GlyHlGlyGlySerThrLeuSerAlaCysAlaAupProLeuTyrLeuGlyAenAlaGlyThrAlaSerArgPheLeuThrSerLeuAla
1531 GCCTrTGGTCAATTCTACTTCAAGCCAAAAGTATATCGTTTTAACTGGTAACGCAAGAATGCAACAAAGACCAATTGCTCC1TTGGTCGAT 1620AlaLeuValAunSerThrSerSerGlnLyeTyrIleValLeuThrGlyA.nAlaAr5MetGlnGlnArgProIleAlaProLeuValAsp
1621 TCTI7GCGTGCTAATGGTACTAAAATTGAGTACTTGAATAATGAAGGTTCCCTGCCAATCAAAGTTTATACTGATTCGGTATTCAAAGGT 1710SerLeuArgAlaAunGlyThrLyeIleGluTyrLeuAunAunGluGlySerLeuProIleLyuValTyrThrAspSerValPheLyuGly
1711 GGTAGAATTGAATTAGCTGCTACAGTTTCTTCTCAGTACGTATCCTCTATCTTGATGTGTGCCCCATACGCTGAAGAACCTGTAACTTTG 1800GlyArgIllGluLeuAlaAlaThrValSerSerGlnTyrValSerSerIleLeuMetCy.AlaProTyrAlaGluGluProValThrLeu
1801 GCTCT-GTTGGTGGTAAGCCAATCTCTAAATTGTACGTCGATATGACAATAAAAATGATGGAAAAATTCGGTATCAATGTTGAAACTTCT 1890AlaLeuValGlyGlyLysProIleSerLyuLeuTyrValAupMetThrIleLyuMetMetGluLysPheGlylleAunValGluThrSer
1891 ACTACAGAACCTTACACTTATTATATTCCAAAGGGACAT'TATATTAACCCATCAGAATACGTCATTGAAAGTGATGCCTCAAGTGCTACA 1980ThrThrGluProTyrThrTyrTyrIleProLy.GlyHisTyrIleAsnProSerGluTyrValIleGluSerAupAlaSerSerAlaThr
1981 TACCCATTLGCC?rCGCCGCAATGACTCGTACTACCGTAACGGTTCCAAACATiGG1rrrGAGTCGTTACAAGGTGATGCCAGATTTGCA 2070TyrProLeuAlaPheAlaAlaMetThrGlyThrThrValThrValProAenIleGlyPheGluSerLeuGlnGlyAupAlaArSPheAla
2071 AGAGATGTC;TTAGACCTATGGGTTCTAAACCTAAACGGCAACT1TCAACTACTGTrTCGGGTCCTCCTGTAGGTAC11TAAAGCCA 2160ArgAepValLeuLysProMetGlyCyeLyuIleThrGlnThrAlaThrSerThrThrVaJSerGlyProProValGlyThrLeuLysPro
2161 TTAAAACATGTIGATATGGAGCCAATGACTGATGCGTFCTTAACTGCATGTGTTGTTGCCGCTATTTCGCACGACAGTGATCCAAATTCT 2250LeuLyHHluValAspMetGluProMetThrAspAlaPh.LeuThrAlaCyuValValAlaAlalleSerHi AupSerA.pProAunSer
2251 GCAAATACAACCACCATTGAAGGTATTGCAAACCAGCGTGTCAAAGAGTGTAACAGAATITFGGCCATGGCTACAGAGCTCGCCAAATTT 2340Al&AsnThrThrThrIleGluGlylleAlaAenGlnArgValLyuGluCysAsnArgIleLeuAlaNetAlaThrGluLeuAlaLyePhe
2341 GGCGTCAAAACTACAGAATTACCAGATGGTATTCAAGTCCATGGTTTAAACTCGATAAAAGATFTGAAGGTTCCTTCCGACTCTFCTGGA 2430GlyValLyuThrThrGluLeuProAspGlylleGlnValHusGlyLeuAsnSerIleLysAspLeuLyUValProSerACpSerSerGly
1987
Sequence of Saccharomyces cerevisiae arom multifunctional enzyme
2431 CCTGTCGGTGTATGCACATATGATGATCATCGTGTGGCCATGAGTTTCTCGCTTCrTGCAGGAATGGTAAATTCTCAAAATGAACGTGAC 2520ProValGlyValCyuThrTyrAspAspHKiArgValAlaMetSerPheSerLeuLeuAlaGlyMetValAunSerGlnAsnGluArgAsp
2521 GAAGTrGCTAATCCTGTAAGAATACTTGAAAGACATrGTACTGGTAAAACCTGGCCIGGCTGGTGGGATGTG1TACATTCCGAACTAGGT 2610GluValAlaAonProValAr.IleLeuGluArgHliCyuThrGlyLyuThrTrpProGlyTrpTrpAupV&lLeuHisSerGluLeuGly
2611 GCCAAATTAGATGGTGCAGAACCTTTAGAGTGCACATCCAAAAAGAACTCAAAGAAAAGCGTTGTCATTATTGGCATGAGAGCAGCTGGC 2700AlaLyeLeuAupGlyAlaGluProLeuGluCyuThrSerLyLysAmnSerLymLyuSerVlVal1l1le.leGlyMetArgAlaAlaGly
2701 AAAACTACTATAAGTAAATGGTGCGCATCCGCTCTGGGTTACAAATrAGTTGACCTAGACGAGCrITTTGAGCAACAGCATAACAATCAA 2790LyuThrThrIleSerLysTrpCyuAlaSerAlaLeuGlyTyrLyuLeuValAupLeuAupGluLeuPheGluGlnGlnHisAunAmnGln
2791 AGTGTTAAACAATTTGTrGTGGAGAACGGTTGGGAGAAGTTCCGTGAGGAAGAAACAAGAATTTFCAAGGAAGTTATTCAAAATrACGGC 2880SerValLysGlnPheValValGluAunGlyTrpGluLysPheArgGluGluGluThrArgIlePheLysGluValIl eGlnAnTyrGly
2881 GATGATGGATATGTTTTCTCAACAGGTGGCGGTATTGTTGAAAGCGCTGAGTCTAGAAAAGCCTTAAAAGAT m GCCTCATCAGGTGGA 2970AupAspGlyTyrValPheSerThrGlyGlyGlyIleValGluSerAlaGluSerArgLyuAlaLeuLymAepPheAlaSerSerGlyGly
2971 TACGTTTTACACTTACATAGGGATATTGAGGAGACAATTGTCTrTTTACAAAGTGATCCTTCAAGACCTGCCTATGTGGAAGAAATTCGT 3060TyrValLeuHisLeuHiuArgAspIleGluGluThrIleValPheLeuGlnSerAupProSerArSProAlaTyrValGluGluIleArg
3061 GAAGTGGAACAGAAGGGAGGGGTGGTATAAAGAATGCTCAAATTTCTCTTTCTTTGCTCCTCATFGCTCCGCAGAAGCTGAGTTCCAA 3150GluValTrpAunAraArgGluGlyTrpTyrLyuGluCysSerAunPheSerPhePheAlaProHisCysSerAlaGluAlaGluPheGln
3151 GCTCTAAGAAGATCGTTTAGTAAGTACATTGCAACCATTACAGGTGTCAGAGAAATAGAAATTCCAAGCGGAAGATCTGCCTTiGTGTGT 3240AlaLeuAr&ArgSerPheSerLysTyrIleAlaThrIleThrGlyValArgGluIleGluIleProSerGlyArgSerAlaPheValCys
3241 1rAACCFGATGACTTAACTGAACAAACTGAGAATTrGACTCCAATCTGTrATGGTAGTGAGGCTGTAGAGGTCAGAGTAGACCATTTG 3330LeuThrPheAspAupLeuThrGluGlnThrGluAunLeuThrProIleCysTyrGlyCyeGluAlaValGluValArgValAspHisLeu
3331 GCTAATTACTCTGCTGATTTCGTGAGTAAACAGTTATCTATATrGCGTAAAGCCACTGACAGTATTCCTATCATTTTTACTGTGCGAACC 3420AlaAunTyrSerAlaAspPheValSerLyuGlnLeuSerIleLeuArgLyeAlaThrAupSerIleProlleIlePheThrValArgThr
3421 ATGAAGCAAGGTGGCAACrrFCCTGATGAAGAGTTCAAAACCTTGAGAGAGCTATACGATATTGCCTTGAAGAATGGTGTFGAATTCCTT 3510MetLyeGlnGlyGlyAsnPheProAupGluGluPheLyuThrLeuArgGluLeuTyrA.pIleAlaLeuLysAunGlyValGluPheLeu
3511 GACTTAGAACTAACTTTACCTACTGATATCCAATATGAGGTTATTAACAAAAGGGGCAACACCAAGATCATTGGTTCCCATCATGACTTC 3600AupLeuGluLeuThrLeuProThrAspIleGlnTyrGluValIleAsnLysArgGlyAunThrLyuIleIleGlySerHleHisAupPhe
3601 CAAGGATTATACTCCTGGGACGACGCTGAATGGGAAAACAGATTCAATCAAGCGTTAACTCTTGATGTGGATGTTGTAAAATTl;TGGGT 3690GlnGlyLeuTyrSerTrpAupAspAlaGluTrpGluAsnArgPheAunGlnAlaLeuThrLeuAupValAspValValLyuPheValGly
3691 ACGGCTGTTAATTTCGAAGATAATTGAGACTGGAACACTTTAGGGATACACACAAGAATAAGCCTTTAATTGCAGTTAATATGACTTCT 3780ThrAlaValAsnPheGluAspAunLeuArgLeuGluHi PheArgAupThrHisLysAunLysProLeuIleAlaValA.nMetThrSer
3781 AAAGGTAGCATTTCTcGTGATrTGAATAATGATTvAACACCTGTGACATCAGATTTATTGCCTAACTCCGCTGCCCCTGGCCAATTGACA 3870LysGlySerIleSerArgValLeuAsnAunValLeuThrProValThrSerAupLeuLeuProAsnSerAlaAlaProGlyGlnLeuThr
3871 GTAGCACAAATAACAAGATGTATACATCTATGGGAGGTATCGAGCCTAAGGAACTGTTTGTrGTTGGAAAGCCAA7TGGCCACTCTAGA 3960ValAlaGInIleAenLyuMetTyrThrSerMetGlyGlyIleGluProLyuGluLeuPheValValGlyLymProlleGlyHi SerArg
3961 TCGCCAATTTTACATAACACTGGCTATGAAATTTTAGGTTTACCTCACAAAGTCGATAAATTTGAAACTGAATCCGCACAATrGGTGAAA 4050SerProIleLeuHiuAunThrGlyTyrGluIleLeuGlyLeuProHisLysPheAspLyePheGluThrGluSerAlaGlnLeuValLys
4051 GAAAAACTrTTGACGGAAACAAGAACTrrGGCGGTGCTGCAGTCACAATTCCTCTGAAATTAGATATAATGCAGTACATGGATGAATTG 4140GluLyeLeuLeuAspGlyAunLyeAsnPheGlyGlyAlalAaValThrIleProLeuLyaLeuAspIleMetGlnTyrMetAupGluLeu
4141 ACTGATCCTGCTAAAGTTAFFGGTGCTGTAAACACAGTTFATACCAITGGGTAACAAGAAGTTTAAGGGTGATAATACCGACTGGTrAGGT 4230ThrAspAlaAlaLysValIleGlyAlaValAunThrValI leProLeuGlyAunLysLysPheLyuGlyApAunThrAspTrpLeuGly
4231 ATCCGTAATGCCTTAATTAACAATGGCGTTCCCGAATATGTrGGTCATACCGCGMTGGTTATCGGTGCAGGTGGCACTTCTAGAGCC 4320IleArgAmnAlaLeuIleAsnAsnGlyValProGluTyrValGlyHiuThrAlaGlyLeuValIleGlyAlaGlyGlyThrSerArgAla
4321 GCCCTTTACGCCCCAGTTT CACAAAGTCF CAGGACAACTrCGATGAA CATTAAGAGTC^CTT 4410AlaLeuTyrAlaLeuHluSerLeuGlyCysLysLysIlePhellelleAsnArgThrThrSerLyuLeuLysProLeulleGluSerLeu
4411 CCATCTGATTCAACArTArrGGAATAGAGTCCACTAAATCTATAGAAGAGATTAAGGAACACGTTGGCGTTGCTGTCAGCTGTGTACCA 4500ProSerGluPheAmnIleIleGlyIleGluSerThrLyuSerIleGluGluIleLy.GluHiuValGlyValAlaValSerCysValPro
4501 GCCGACAAACCATTAGATGACGAACTT'AAGTAAGCTGGAGAGATTCCGTGAAAGGTGCCCATGCTGCTmrGTACCAACCTTATTG 4590AlaAspLysProLeuAspAupGluLeuLeuSerLyuLeuGluArgPheLeuValLysGlyAlaHisAlaAlaPheValProThrLeuLeu
4591 GAAGCCGCATACAAACCAAGCGTTACTCCCGTTATGACAATrTCACAAGACAAATATCAATGGCACGrrGTCCCTGGATCACAAATGrFA 4680GluAlaAlaTyrLyuProSerValThrProValMetThrIleSerGlnAepLysTyrGlnTrpHiaValValProGlySerGlnMetLeu
4681 GTACACCAAGGTGTAGCTCAGT AGTGGACAGGATTCAGGGCCCT AAGGCCATTTrTGATGCCGTrACGAAAGAGAG 4767
ValHisGInGlyValAl&GlnPheGluLysTrpThrGlyPheLyuGlyProPheLyuAlaIlePheAspAlaValThrLyaGluEnd
Fig. 3. DNA sequence of the AROI coding region, and the corresponding arom protein sequence
Vol. 246
379
K. Duncan, R. M. Edwards and J. R. Coggins
Table 1. Structure of the five E. coi enzymes which correspond to the S. cerevisiae arom activities
Note that the length of shikimate kinase is reported here as 173 amino acids and is 174 amino acids in the text. The N-terminalmethionine is cleaved post-translationally (Millar et al., 1986).
Pathway Calculated Length Quaternarystep Enzyme activity E. coli gene Mr (amino acids) structure
2
34
5
6
3-Dehydroquinatesynthase3-DehydroquinaseShikimatedehydrogenase
ShikimatekinaseEPSP synthase
Total
aroB
aroDaroE
aroL
aroA
38880
2637729380
18937
46112
159689
362
240272
173
427
1475
Monomer
DimerMonomer
Monomer
Monomer
S. cerevisiae EPSP synthase domain is located betweenamino acids 404 and 866. Of the five functional domainsof the arom multifunctional enzyme this is the best con-served, with 38% identity between the E. coli and S. cere-visiae sequences and 55% identity between the fungalsequences (Table 2). As with the 3-dehydroquinatesynthase domain there are two very well conservedsub-domains separated by a region with no homology. Inthe S. cerevisiae sequence this unconserved region of 51residues (Ile-701 to Thr-753), like the unconserved regionseparating the two 3-dehydroquinate synthase sub-domains, contains an 11-residue insertion compared withthe E. coli sequence. This two sub-domain pattern isillustrated in Fig. 6, which shows the alignment of twobacterial and two fungal EPSP synthase sequences.Much attention has been focused recently on EPSP
synthase since the discovery that the commerciallyimportant herbicide glyphosate (N-phosphonomethyl-glycine) acts on plants by inhibiting this enzyme(Amrhein et al., 1980; Mousdale & Coggins, 1984).Glyphosate is also a potent inhibitor of the N. crassa andE. coli enzymes (Boocock & Coggins, 1983; Lewendon &Coggins, 1983). A glyphosate-insensitive form of EPSPsynthase has been isolated from a Salmonella typhi-murium mutant resistant to glyphosate (Comai et al.,1983) and it has been shown that the only alteration inthe enzyme structure is a Pro to Ser change at position101 in the enzyme sequence (Stalker et al., 1985). Pro-l01is conserved between E. coli and S. typhimurium (Fig. 6),but in both the fungal sequences it is replaced by Phe(position 505 in the S. cerevisiae sequence). This positionfollows a highly conserved region in the bacterial andfungal sequences and precedes a less well conservedsequence which includes a 5-amino-acid insertion in bothfungal sequences (Fig. 6). The absence ofa conserved Proat position 505 in the fungal sequences indicates that thisresidue cannot be an essential feature of glyphosate-sensitive forms of the enzyme.
It has been proposed that the mechanism of EPSPsynthase involves a cysteine residue at the active site(Ganem, 1978). The greater than 98% inactivation ofthe multifunctional N. crassa EPSP synthase by N-ethylmaleimide and the protection against inactivationby this reagent observed in the presence of shikimate3-phosphate and glyphosate are consistent with thissuggestion (M. R. Boocock & J. R. Coggins, unpub-
lished work), but it should be noted that cysteine-directedreagents do not completely inactivate the monofunctionalE. coli (Lewendon, 1984) and Aerobacter aerogenes(Steinrucken & Amrhein, 1984) enzymes. This impliesthat an important cysteine residue is near to but notnecessarily at the active site of the enzyme. There is asingle cysteine which is conserved in all four EPSPsynthase (Cys-853 in the S. cerevisiae sequence, seeFig. 6); further experiments are required to establishthe precise functional role of this residue.The EPSP synthase region is linked to a region
homologous to the E. coli aroL gene product, shikimatekinase II, by a 20-amino-acid sequence (residues 867-886in the S. cerevisiae sequence). Homology with E. colishikimate kinase II (Millar et al., 1986) extends toresidue 1059. The homology found in this region, whichis 23% for the E. coli versus fungal sequences and 4000for the two fungal species, is lower than that found in theEPSP synthase region. Although the overall degree ofhomology between the yeast and E. coli shikimate kinasesequences is rather low, there is one well conserved regionwhich has sequence homology with the 'A' sequence ofthe ATP-binding site of phosphofructokinase andadenylate kinase (Walker et al., 1982). This 'A'sequence, G-X4-G-K-(T)-X6-I/V, occurs between resi-dues 895 and 909 in the S. cerevisiae sequence(corresponding to E. coli shikimate kinase residues9-23); the final residue, conserved between these twospecies, is an alanine rather than the usual isoleucine orvaline. Comparing the S. cerevisiae and A. nidulansshikimate kinase domains, the sequences around this 'A'region of the ATP-binding site are very homologous(9/11 matches). However, the A. nidulans enzyme doesnot have the GKT motif; instead it was GKS. Althoughnot unique in this respect (Midgeley & Murray, 1985),this feature is unusual in that the GKT is highlyconserved over a wide species range and over a widerange of different ATP-utilizing enzymes, and it isconserved between the S. cerevisiae and E. coli shikimatekinases (Millar et al., 1986).
Following the shikimate kinase region is a regionwhich shows homology to the E. coli aroD gene product,3-dehydroquinase (Duncan et al., 1986). In the alignmentshown in Fig. 4 the N-terminal amino acid of E. coli3-dehydroquinase overlaps with the C-terminal aminoacid of E. coli shikimate kinase. The percentage
1987
380
Sequence of Saccharomyces cerevisiae arom multifunctional enzyme 381
S.CerevIlsae(l)N V Q L A K VMPI[i?Gj. ND0 1 1KNV C V N IN N LVjT I I K N C P S TT VVI C N D TN..l L S K V Pi 1 Q QflV L 1571B.colIarol NSIVT..J. .KRRSYP IT IA SCGLFNI& SLPK!CQ VKLVEIT M. EKT L ALUL~IMY.D K
1 jI aroBS.c.reviuiae K F K A S L P K S R L L T Y V V K PKUT SK-SJ BUT K A Q L K TD YLL . . V K G C T TV N VA ICCGGCVI G-D1N (I1515
S.cerevisiae I F V A S T F MVN V V V V SLLV I I OSL SV L KT ADL L C K D F GV L V 1V L(15K.colliarog V N I I JS.crevimieKGA K GKINfjNiTj ACjJ ADKfT Ii ~ SAll F KENVVNCjNFI AV fK V T N 0LTVND I 0KW (235)B.colI aroS KT JG PA LS A
GVGLG K T Al CLWLK CNA G ANF V Q PJL D A LUD L 0 C
S-.cereRiriLe KC V S IG NVG [N L Ti AF~IOWLr P TD 01 VlAWIILUZSKIMA AL WTF7N .SNGD[NK VF K K L TLW K K T P L (2353)S.cer*lla -1S.cerevistee SNTDIDILL AK N L0 K N K L S K K K V V ILKSIKA C ML OSAtSS F1VN K LSiFr7 ILGTO TAL VIP F K D IQ An (413)
S.Cerovieiae 00I0KV V1 IDP[FKKj I G S ALKlI IL:Mj7 TCKJCKTGI K I. VMDIDW LRTKI I LT DNKTKL VT ISF K 0 NP[(473)K.coll areA K RL PL SG K9LVLN DCQ A
S.cereviuta. C KT V V KG H G C TLS AC ADL
P AG9GQCT ASI T SLA DAJj INS T55 K0 ZY KIL TGM SA WDQIi(4731K.coliaroA STSCKI IUN~~~~~~~JPLHAVIS AL KUI.N LUAL C . SDZILKL JKKJ
__co__aoA_ VrI P LVOSI L KY L MN KGA LAP1KVT VT D S L VF RGSK LA A TV SS 0 VSSI LKS11 (593
S.cerevisla. A K K P VT:LA LVCff- K PLIS C~V TIG ANH K KlFnGINr VT STL TR KPSVTS VIPKGTIVLTGI APS NQI I[i(85331K.coll aroAIe E 2GLs rLG AGTANPLA ALC i TG.WS9 P R LVWK
S.cerevieiae S A S S A V P LA AN ANT CTEfITV TV P NFn K ATD SV FTKEA CK T VTA STI TVJ S CPJ (7913K.coll are IGNACSWL :JL G AKJI Ktj I QU P LI L Q GG tGGJIVJDCG SIVS SQJF.~JKK~J T LC TCOD LI SCT SC
S.cerevielae PVCM V MLK P LKV OKGK PH TO A FL TAIC V V A AKISHG OS P NSATTN VTI K CIAP VKM K CS rIV IM~(6531K.colliaroA KLN IDDNVNNRIPKUDLVSKNPT IL FGVaIA.QKT QFV S QS G Y L V
S.cer.vilaae K SLAITMK LFC AKT AK PD CITr VTVN L NI G1 D LK ASSCi PVC VKCHTV C KS VTQ AN SF L LRS N V (7133)K.coll areAKoSA GK V AK IC H tTVK GRV. I T PKIKla NF A1.1 K I K A NO S N A N C F SL V TR
S.cereviulse PVGTLN NKSD LKVA VNEN[STDK A F LCTA C CVVDAA fl .NSDSDKLCSAAKLDCTTIG A KPQltVK7C TSKKNIS(68 773
S.cerevisia K VVI AV ATLCK GTIS V C AG L CS V K L KVDMLOK LSF P~ N N NC S V KOF VrVA N C KK G K K (847)
K.CelliArOA R2AL AE.V. Y .... ..RTASTJA KLLTFWNN NNC RIVAVLCAPVD VLVMI 3JLAA
S.cerevislae NVFWIQ V pv I RNr7 KTWnGWW:VLn G D G L E C TS KKVAS(1 71)B.colI aroL K 0 .S P. L. C,K, P L KV0K K 0ALUDV S MAN Q A.TK P . .V I S C I S. A .A . I NC .~
7 1 /aroA~reL(14) 1/ao
S.cerevluiae A 0 F V S K 0 L S~~IGKAT05I IIFTVT TTNl QS CKNFWP0CK NF NTQSK V D I A LV K L (1174)
S.cereviuiae T L PT GV K.. V I NKSC TKIICSN F0CLVSA VYKLD DOE A QKV fEOA VVKFVCTA(2)K.coll aroL FT CDPL V K K TV AC VKA VN ANO K V D S NRRFN K TPD A..Q I IQ AL SKTW N IOS F ADI G AL NR
S.cerevlalac V IF KO EL IL NY G D DK N PSOT K N K LI AV NS NTSK AKC D F LS N NG V LT KVT S DL L DN S A P C (1268)K.colliaroL EnS.cerevisiae L V A 01 TSKP Vi-C-D S P Y E IW : Z WKP ICK NS SFS FIL FTCSVAK AL CL A. .L .[RS KFS OKY FA(1341)K.coll aroL A LRP N LLCK.NKl TI D LLVH4JW J~.JI. JFI I D JT F9 A S QV IN GK L A QTLJ VNC SK V
areD/(240) (1)/aroK ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ arL/171(I/soS.crervIluae T KST A 0V KM K KSL LOC K NFV CCL AF ADV TIP L Kf LO T VI N KLTGAACI VIA V NV I P. Nfl K K F (11401)K.coll aroDK LA I OF INTLKG AP I CS CL NVKI TV F K KL K A F RAS A K LFD K S WIAAL C IAV NT L K .~ LC L
S.cerevluiae K C ONT KV L C I RLNATl : I pN N C TPF VVE C nKTA7CN I C A C C KTS S A L V A LN C C Kn K IDL1F LI(11741K.coll aroDK Lj ~ V LLS1DTM EK.ILFTFRS. GGI IST9A CT GL DNIDL9L
S.cerevisla. TTKLKP 1 L P SF.VFNK GmI C I K ST KM D FKIQ KKL V CW V A WV S CV P NOK P LOOKLV fL . K-FVG (15312
S.cerevislae VLKSFLVKCARNAAF HFPTLLKAHA KNKPSVjLFA V NMTSKIn lSKVQV LNVVCSNLVTP CVATSQJiLP 1AA Q[~ 11572)K.coii aroD .LQIKSPTCISV V 0E FyAD RQ K C TK I.I.. JAF K W AGCKES CS KGNGA N11L C WII CA G9AKA JLLSVWCA
S.cerevsliae C F K C P F KAIF0A V TKNGG fEPK11586)-KFIGHIS [-] HNTGYEI GL ....M DKF(31K.colliaroDAK~JLPG KP....VIKQL ....QK A~VF[JN!T FHSKS Q JNL.....SA ]P
aroD/(2401 (Iare/12 roE
Fi.S.ceAminoaci homologiesLV Etwe theDGS. NerevisiAe ArVom TmLtifunctIonal ezMeD:tTDAKVIandNthecorespondin moo FctoaE1c0lienzymes PE GR
Key.aoB,aOAPINDdeyrountesntae arA,FFSAGGKGANEPSP synthase; aroL, shikimate kiae;ao,L-eydounaearER hkmtdehydroenase Nubrsaoe adblwth eune inicteaIno Aci posITiosi h .cvsanyeadihiniiulaeccl e Nzymes, :Arsetvl.GpiN both sEquence mAinaiGheOV mi with Fig. 5. K I MIiRI14
homoogy amongL th NThre species isRsimilarPtotht it apentAdecGApeptideLIisolaed.fro Th 3-deydrofoudCieteisoTshiKimtLKinaESeP IKIEE MMVPdoaiP(abeL).qunaeDctveofth N.crss aro polypeptide112Confirmation that thisEreio oFA th S. QerevSiaEL (S ChaudhurIN &JG R. CoggPAIns unpublished. work)
sequence truly encodes the 3-dehydroquinase activity This peptide had been radiolabelled, by treatmentwas provided by the observation that there is homology with 3-dehydroquinate and NaB3H4, on the lysine
Vol. 246
K. Duncan, R. M. Edwards and J. R. Coggins
S.c revlse.a Vl L A K V pr-L7 N V G YMN! H ONHL YV:TKT!fl TSFIT I C iTX TN. L 5 K VIFY Y Q L 157)
A.nldulaen JS NPFTK ISIL GIRI1SI IAD FPG LWRN IV A .TIT.L.V!L VTD T NJIG 5I Y
ScIerevIlsa. KF K ~S L PF G L IiTIM1 V K1 GOjiTIS TTilaI Tf 1AJQ L I D L fl. V C.MET liiDT V1 v IR G GTG TV G DIN 11315
A.nidulene K RAW 9 I 1K5R TKADII I LS0N PGRDTV I LGGGVIG
S.cer*vleia. I G V A S6Tl GVRV TI L L AN V D 1I1IG0GK TA!I D T PFL GKMN O-AFi FP K F V L VDMJI wf(175)
A.nlidulans 16 IIW PT KI Y I DL
S.cer*vleia. TA K A V TA Cf VW N A D fla[Li1 Si -71S L F LMN VflM G A K N V K V T N Q L T N I DI 1I (2351
j~~~~~~~~~~~~ K A~~~~~ 1 aIG V TV K TF JG aIl r T AN a F. I T U
S.ceroevtola N T D I MA L D T Y K LVO R I K V -A IDIW L R N L L AIG y 1 [ILT PT]A Lo (2541
R N L L N V H S I C H A I A . I L T F QIHL.11 R L: I r A
.A G L FDK K[W FL H K 5A.wnidulans l A 9LJA R LIG LK V A U5S RI V C UAA Y L PT K D A R I R KILjTJA G151 C S
S.Cer*vislae fIf L K Kr-IS I ID iTT111GWSRTTV fVjI fLu SfTr17K C flGD SfQ F [Fj D aFD L lilF IflT D T L MY~FMF K D I A 14131
UK T RWS V A N D I R5JV V A S I G V A N
S.cerevleiae 0 113K V V I IPGF K SIl NIt ALI L A A TLTG fIQOfK IK N L TiiWDD T K HN -iT MTRV N OiK G !-TF I DiV~ (4731
A.ftidulane S S V I C AIP I S_N _ALJ V L L G SjWT R K LLa .S D Di T VIM LIMN L ALL TJFLS Bigaa
S. cerevielae GI iT V [V-IE GMH GUGS T fla MiC AD0 ~FiFMLLUA GTAl.RFL TIS LRA L V[NUST SnS K Y I L TiG1NA[45l3331A.nldul aneig IV L V V N G K jjG JG L
C .1T EI2a.
S.corevleale. 1I A L V LsflRAliNlG T K IZ Y L N N G S L PI KIV Y TD0 S V [MKJ[T'G fRI KI 1TTIO v s sjI
A.Imidul.ans I G D0L D JAL TjIA N V L PF LNH T S K G AlA ~M~H A~ K
S.cerevelea. P1 VTii LIA CKPIS1K Lrn Di T1I KRW11N NVTMR11 iT Ti"P fTVY FlK IiiM N[Wil5 v TIj9(631A.Imidulan~e P V T L LV K PFI SJ FW I D TT A NJR SIFG DVOW Xi: K T YaML! P
S.coreviala. 1sfTY AFr-AIW4TGTT1VIT GF [IF 10G FAJR LK FHCCK TOUriAI TV fPi (7131
A.nImduItne D A SAJTCYj L : V AJV IIT G.TT.irJC p N IG S A L G R VK[!j P C, CIT V K,0 ElJ~K [
S.c*roviesie V TOLK L K H V D m A IF L T A V V A A I S D S- ID FPM S A T T T I KE G IA QR V K TC N IL[A?Nj7731
A_n_dulan S D GoI RI.aA T S K G G T Lia C V RuFi T SNHRP K T T V S SI IAN RVK CN IKL M
S.cerovlslae A T L A FGVK T T tiL PF[1 I V G L M1K D L VP D S G V Vii1 [FaV 1LLOA G N V (1331A.nldulans K Djr L A K F GCV I C R NJDtD GI. .LI D G I R L R P1 V G V 5V AJF F VL..J.L
S.cer*v1eiae N S Q N E D KV ANM V it1 L a1 N1CM K T GD0VN S LM A K-L D Ar P L I C T S K K N S (8171
A.nidulans S L VT 0 FTL
L KK T TO;J a3L G[CK]K L E K1 P V A A S G D
S.Corevlela* K KrnMV V I IG A KT T I S K V-C A SrA L G K L LF 13Q H N N S V K F V V ET W Kr R (947)
A.nidulane N A I I I G R I A G KS TA N JV SK1A5JNM R FVjD L DJT KL .T V G N T I D I I K WI GI..!.AIM
S.Cerevslae FT MF Kwlg VI N Y DDIG0 V FI T1GR G I F11 A S[a KfA LfK F A S S M V L H L H I T (10031
A.nidulane 5L K±IL KU. T L K RIG V F A G G GV V KjN FP A K L[ LT DY H K T K G V L L L N K K I
S.Cerevlula* VWTF SflFMP RK( V W N rR R 1KG Cf-Y K P H CflA KAK F13QN. .L aS NjS K YIA( O1011
A.nldulans D F L S I 15K .s .P.A. Y.!...5JD0N C V LL2.JK F15JF 13E C N Q Y Y s D A P S CG L A D NR L V
S ov lse oe H LA N S . (1114
A.nldulane D SI
K K K. JF A SL jLF1P R KA D I0 KE V CV CS A V L R V DIL LOK D P A S N N D I S
S.cerev1leao AfiDMF VfiS Kj FL ILURj K A T D S IFI I F T V K[ GINF P DFKFK F K T L ALYIRL KN Vrg F L L1E L (11741.
A.nidulanv V DJ V 11Q LSIF L S
S.corevlslaa T L T I Y .nmINK TNKI F1IIG iF jL Y S!1 WI0 A IK N F M-O rA LIT L 0DV fV F [VnTTM( 12321
KI FI HS NDI P KG A LIS A.N KIFIYC KIL
A.nidulan. AF15 .1AUJ FTK.N KI5.0VI!.±jI.~C11.1 L[W5I _JH~~ JS GVC 13
S. cerevlmlae flV 13 TR CF HV S SI rC IN TL .DKLF Fi (1191)A.nldulann 5S A T L S L I KIP K K F A I F 15]S P IJS S T T P Y L S A I T T PF A W L R T K]
S.coervileI. T [qTlA lV F K L t. N K N F G, A A V T t, lil1D 74( Y D (iTjLlT 0, A A V I A V N
IHP 1. G. N K K F (1401)
A.nldulans C S Lij A15 L T S A A P S V T R S S T S C F S TI LI It R S SE15jI. T S F PC R L A R WL H A SY V
S. -ereulmle* D W 1 I It N A N N r-lY P S1 V 1; H T A (4iT--V] A G GT jS li A A S InC KK[T FV VI N (14581A Nd TeDS W I s FtJ A G V YU P It IF 0 LL... I NH [ J S [ J Y U C 5
S.Cerovm1.e TF LK PFL I Ffi F-%]F NFr -lI( fF sT XFlJI [RF KHV GV A V CV PA'A K D I.fl K (I1312)A.nidulans PI LI N MH V sU F P SI Y15J_Sj V JP S15jF 5JS V V A I T I A I D15T M 5T 5C HN F A Q
S.r-ere ie ce L. F' V 1GJ]M A A V F1 T [I. F A rW7 -K7 1S rI-TTiFP fliT I S FDJK Y QflH- V V rV`GS WL VH rQT]-jV AR F K W T (13/,2)A. n idulans A D E A V K I A P I RjJ t. TI P JIKF LV5y I2 ..
G
S.-erevlelee G F KG("PF KAlI F DA VT (1388)IA. nidulaone S I 1. A C:F 1. T P S N T L S R Y H R V V A P S F N
Fig. 5. Amino acid homologies between the S. cerevisiae and the A. nidulans arom multifunctional enzymes
Numbers indicate amino acid positions in the sequences. Again, gaps in both sequences maintain the format from Fig. 4.
residue which is known to form a imine intermediateduring the enzyme-catalysed reaction (Chaudhuri et al.,1986). This homology predicts that Lys- 1227 in theS. cerevisiae arom polypeptide is the residue involved inimine formation in the 3-dehydroquinase reaction. Thealignment places Lys-170 at the active site of the E. colienzyme; this prediction has also been confirmed by theisolation of an active site peptide (K. Duncan &J. R. Coggins, unpublished work). The mechanism forthe action of 3-dehydroquinase proposed by Walsh
(1979) requires a basic group for proton abstraction.Chaudhuri et al. (1986) have provided evidence that, forboth the E. coli and N. crassa enzymes, this group is theimidazole side chain of a histidine residue. Two histidineresidues are conserved between the E. coli and S. cere-visiae sequences, but only one of these (His-1198) isalso conserved in the 3-dehydroquinase domain of the A.nidulans arom polypeptide (Charles et al., 1985, 1986)(Fig. 5). It is therefore reasonable to propose that this isthe active site histidine residue.-
1987
382
Sequence of Saccharomyces cerevisiae arom multifunctional enzyme
Table 2. Summary of the homologies found between the five E. coli monofunctional enzymes and the S.multifunctional enzymes
cerevisiae and A. nidulans
Number of residuesNumber of residues conserved conserved in thebetween the E. coli mono- two multifunctional Number of residues
Polypeptide functional enzymes and each enzymes and in the conserved in thechain length arom multifunctional enzyme corresponding functional domains(amino acid) E. coli monofunc- of the two arom
E. coli enzyme residues) S. cerevisiae A. nidulans tional enzyme polypeptide chains
3-Dehydroquinate 362 130 (36%) 128 (36%) 96 (27%) 201 (51%)synthaseEPSP synthase 427 162 (38%) 146 (34%) 127 (30%) 254 (55%)Shikimate kinase 174 39 (23%) 35 (20%) 22 (13%) 68 (40%)3-Dehydroquinase 240 50 (21%) 41 (17%) 30 (13%) 97 (42%)Shikimate 272 68 (25%) 41 (15%) 29 (11%) 75 (27%)dehydrogenase
144A1U@ EVHP..CV^HSS 1 L V L A A L G S G T C R I K N L L H S D D T C VM A L lE aA.nldulans EV N pS cer9vl*l&* L V Y P F K ID I P A1D Q Q K V V I P P G S K S I S N R A L I L A A L G L G Q C K I K N L L H S D D T K N V1EH e L&.coll *ro Nl Z S L T L Q P I A R V D G T I N L P G S KE V S N R A L L L A A L A H G K T V L T N L L D S D D V R H N L N A T A LS.tvh aroA N Z S L T L Q P I AR V D G A I N L P G S K V S N AA LL L A A L A C G K T A L T N L L D SID D V R H M L N A L S A L
A.nidulans G A A T F SW eE E GE V L V V NG KGGI .N L QA S S SPLLY L G NA G T S _FL T T VA TLA NS . S T V D S sS.c-r*vlsl&* K A T I S W Z D N G Z T V V V Z G H G G S T L CA C D P L Y L G N AG T AS R F L T S L AAL V N S T S S Q K Y|IVZ.coll aroA .GV S YT L S AD T R C e I I G N G G PL H A E GA L L F L G N A G TA N R P L A AA L C L . . . . . G S N D I VStvDPh aroA .JI N Y T L S A D R T I C D I TG N G G ALR P G LE L F L G N A G T A N R P L AA A L C L . . . . . G Q N E I V
A.nldulans L T GIN N Q P I G D L V D A T A NV L P L N T S K G R A S L P L K I A A S A G G IMN L A A KV SS QIYS.cerevlslae |L T G|N A R N Q P IEP|L V D S L A NT K E N N G S L|P|I K V Y T D K G G RLIE L A A T|V S S Q|YE.coll aroA L T GIE P|R MER P I IL V DAL RL GG AKIITY LIE QIEN Y PIP. .L R L Q fFITIG GIN V D V D G SV S S QIFS.tyh aroA |L T G|C P|R M K|ER P I G L V DS QGANIDYLQEQAE N Y PNJ. . L R L RLGR JTLGGJDWE V D G S V S S Q F
A.nldulan V S SLL M C|A;PYI;PKP VL R L G K P JT ID T A N R S F G I D V Q K S T T C E H T Y H I P Q G RS.cerevlslae V SSZL M|C AP]YAE|E|P V T LA L GIGIK P I 1S5 1JYID |TI K NINEK F G I N VE T S T T EP Y T Y Y I P K G HE.coll aroA L T A LLNTjA PL A PE . DITIV I R I KIGID L VIS K P Y IDI IITIL N LIIK TIF GIV E I E . N Q H Y Q Q F V V K G G Q SS.typh aroA L T ALL NT A P L AP K. D TI I R[EK JE L VIS K P Y I D I TL N L N K T F G V E I A . N H H Y Q Q F V V K G G Q Q
A.nldulans YV NF Evi S;1AT PLV V T T CT P NIS A L Q G D A R F A V R PGC T v cS.cer*vl9lb* |YI N|P|S 9 Y V I 9 S D A S S A T Y P L A F| AA T|G|T|T VT|V|P N|I G|F E|S|L|Q G D A R F A tV L|K P N G|C K T QE.coll aroA IYIQ S|P|G TIY|L VIE|G|D AS S AISIYIFL AIA^AAI K|G|G|TV|K|V|T G|I GI NISINIQG DIIIR F AI.ID V L EKEN G ATIIIC wS.typh aroA NH S P G IRYLVjG D JSA Y F L GAA A I K JGG TVJKLVT G IK J N1G I R F AJ.D VL2E K GA TW T w
A.nidulans T E T S T T VWG P S D G I L . RA T S KR G Y G T N DR C V P R C F R T G S H L P N E K S Q T T P P V S S G IAN QS.cerevluiae T A T S T T V S G P P V G T L K P L K H V D LE P M T D A F L TA C V V A A I S H D S D P N S ANT T IE GGIIIINIQRE.coll aroA G D D Y I S C[TlR G EL N A I D M D M N H IP D A A M T IA . . . . . . . . . . . T A A L FA KGIT TR LR N IYINWRS.tjh &toA G D t F I A CWR G E L H I D M N N H I P D A A M T I ... . . . . . . . T T A L F A K G JT L R N Y N WI]
A.nidulens K CIC N I KA.K.Dr F V I C R H D G .... . . L C I D G I DR S N LR Q P V G G V F CDD H RIVRS.cerevislae |V K C|C N R I L|A N T E L A K F|G V K T- T|C|L P D GrIQ V H G L N S I K D L KVVS D S S G P V G V C rYD D HN |V|AE.coll aroA IV K EIT D|R|L F A NA T E LREKIVIGIAE VEIEIG HIDIYIIY. RI TIPP E K L N FA I AIT YIND HRINAS.Yshb aroA V K E T D RL FAA E LJR JVGJA E V EE JG H l.JY WJ....... I... R I T P A K L Q H A D I G jN D H RJKA
(8661A.nidulans F S V1L ..... . . . . S L V T P Q TL[ LE K E VG KT W 1G W W D T R Q L F K V KS.cerev1lse NSF S L LRG N V N S Q N E R D E VANP VIKILIE R H|CT|GIKTIWI GWWDVIL .N. . . . . HR.coll aroA C F S L V. . . ....... . L S D T|PV|T|I L|D P KIC TIAIK TIFIPD Y F 9 Q L AI S Q DAS.tYeharO JCF S LV.V. . .. . . . . L S DT| P VTIL| D P KCT K T F PD Y FC QU L R N S T P A
('271
Fig. 6. Amino acid homologies in the EPSP synthase domain in four species, S. cerevisiae, A. nidulans, E. coli and S. typhimurium
Only positions which are conserved in at least three of the sequences are boxed.
In some species of micro-organism there is aninducible catabolic pathway-that allows the utilizationof the plant metabolite quinic acid as a carbon source(Giles et al., 1967b; Giles, 1978). One of these catabolicenzymes is a 3-dehydroquinase, and it has been reportedthat there is no discernible homology between thebiosynthetic 3-dehydroquinase domain of the A. nidulansarom polypeptide and the inducible catabolic 3-dehydroquinases of N. crassa and A. nidulans (Da Silvaetal., 1986). Neither the 3-dehydroquinase domain of theS. cerevisiae arom protein nor the E. coli biosynthetic3-dehydroquinase show any homology with the induciblefungal 3-dehydroquinases, which supports the proposal
that the biosynthetic and degradative 3-dehydroquinasefunctions have arisen independently (Da Silva et al.,1986).The C-terminal region of the arom polypeptide (resi-
dues 1306-1588) is homologous to the E. coli aroE geneproduct, shikimate dehydrogenase (Anton & Coggins,1987). In this case, the homology between E. coliand S. cerevisiae (25%) is higher than that for the shiki-mate kinase and 3-dehydroquinase domains; the A. nidu-lans shikimate dehydrogenase domain however hasdiverged substantially, being only 15% homologous withthe E. coli and 27% homologous with the S. cerevisiaesequences (Table 2). This final domain of the arom
Vol. 246
383
K. Duncan, R. M. Edwards and J. R. Coggins
A.nidulansA S.cerevisiae
E.coll
[387) 2A S VfA N[IfiAVVfPAPSIEIE H . .GVAHSSNVICAIPPA QQF SJD [L F I TDFETDLTYLF KDVIJPIADQQKVVII PPDCQSA. .. .MES LTLQP I ARVD TINLT
A.nidulansB S.cereviuiae
E.coll
[8651L RQL l..L AR
L F KVQF .KV.KEGifEELEEEVAASGPDHSE G AKJL DG A LE CT S K K
i ;SAA..... .....
(894)RGN [QRA IVYr I
NSKK VV I.MTQPL LI
1032 )A.nidulans E C S N I
c S.cerevisiae E C S N F-E.coli [JV A H I
A.nidulansD s.cereviulae
E.coli
110691Q YY S RD ASP S G L ARRS ED F N R L Q V A JG Q ID S L SSF F A P H C EA E FQ. .LRRSWSKYIATITGVRE III DAT NE PS Q VI S G IRS AL A Q TI NC KrEV TV K D LVI
112661 [1313)R I FGGFM|TPV|SHPS FAA PGQL SATER GLSL E K K F
RV fL VT SD LL N AAPGQLTV AQ N NJY T SMG GWE MJE L G
RLAGEVFGSGGNFWCGK AV R ANL GK... ...... ...METYVF
Fig. 7. Amino acid homologies in the regions which link the functional domains
The linker sequences shown are between: (A) 3-dehydroquinate synthase and EPSP synthase; (B) EPSP synthase and shikimatekinase; (C) shikimate kinase and 3-dehydroquinase; (D) 3-dehydroquinase and shikimate dehydrogenase.
polypeptide chain is connected to the 3-dehydroquinaseregion by a 12-amino-acid peptide (residues 1294-1305 inthe S. cerevisiae arom sequence).Linkage of the domainsThe S. cerevisiae arom polypeptide chain contains
1588 amino acid residues, which is 113 more than thetotal number of amino acid residues found in the fivecorresponding E. coli polypeptide chains (Table 1).Many of these extra amino acids occur in the regionslinking the various domains (Fig. 7). Zalkin et al. (1984)have postulated that connector regions are probablyessential for the structural integrity of multifunctionalproteins, but that their sequence is not important. TheS. cerevisiae-E. coli homologies break down towards theend of the E. coli sequences, making it impossible to sayprecisely where one domain ends and another begins inthe arom sequence and making it difficult to define wherethe connectors begin and end. There are nonetheless fourobvious connector regions linking the five domains of theS. cerevisiae arom polypeptide chian. These connectorregions are characterized by a lack of homology betweenthe E. coli and S. cerevisiae sequences that extends oversome 30-40 residues and in three of the four cases by aninsertion of from 11 to 20 amino acids in the S. cerevisiaesequence (Fig. 7). Secondary structure predictionsfollowing the method ofChou and Fasman indicates thatthese non-homologous connector regions are essentiallydevoid of secondary structure. In the three cases wherethere are insertions the additional amino acids are mainlyhydrophilic.Codon usageThe codon usage of the AROJ gene is shown in Table
3. The pattern resembles that for other S. cerevisiae genesinvolved in amino acid biosynthesis, for example TRP5(Zalkin & Yanofsky, 1982), HIS] (Hinnesbusch & Fink,1983) and HIS4 (Donahue et al., 1982) suggesting thatAROI is expressed at about the same level as these otheramino acid biosynthetic enzymes. Studies on two highlyexpressed S. cerevisiae genes, alcohol dehydrogenase I
.
Table 3., Codon utilizadon in the ARO1 gene
'Term' indicates translation termination codons.
TI' Phe 32 TCT Ser 32TMC Phe 29 TCC Ser 26
TTA Leu 57 TCA Ser 17TTG Leu 44 ICG Ser 14
TAT Tyr 23
TAC Tyr 22
TAA Term -
TAG Term *
T&T Cys 15
TGC Cys 8
T&A Term -
TGG Trp 17
CIT Leu 17 CCT Pro 5 CAT His 27 OCT Arg 16
CIC Leu 4 CC Pro 33 CAC His 11 (GC Arg 0
CrA Leu 15 CCA Pro 31 CAA Glu 34 CGA Arg 1
CMG Leu 9 OCG Pro 0 CAG Glu 10 OGG Arg 0
ATT lle 66 ACT Thr 46 AAT Asn 37 AGT Ser 19
ATC lie 23 ACC Thr 20 AAC Asn 34 AGC Ser 8
ATA Ile 17 ACA Thr 33 AAA Lys 66 AGA Arg 31
ATG Met 31 ACG Thr 9 AAG Lys 46 AGG Arg 5
GIT Val 67 GCT Ala 48 GAT Asp 54 CCGG Gly 73
GTC Val 25 GCC Ala 31 GAC Asp 27 GGG Gly 17
GrA Val 21 GCA Ala 28 GAA Glu 72 GGA Gly 17
GTG Val 18 (=G Ala 6 GAG Glu 38 GM Gly 6
and glyceraldehyde-3-phosphate dehydrogenase, allowedBennetzen & Hall (1982) to identify the 25 preferredcodons which correspond to the most abundantisoacceptor tRNA species of S. cerevisiae. They havederived a codon bias index which quantifies the degree ofthe bias towards these selected codons and whichcorrelates well with the extent of expression of a gene.The codon bias index was calculated forAROJ; the valueobtained (0.25) indicates that is is a moderatelyexpressed gene. This is consistent with the report that theAROM gene of A. nidulans is also expressed at a lowlevel compared with the highly expressed phospho-glycerate kinase gene (Clements & Roberts, 1986).
DISCUSSIONThere is an increasing body of evidence that long
polypeptide chains have evolved by the fusion of smallerpre-existing functional modules (Hardie & Coggins,
1987
384
Sequence of Saccharomyces cerevisiae arom multifunctional enzyme 385
1986). In some cases, for example the immunoglobulins,the fusions have involved the repetition and diversi-fication of a single structural element, presumablythrough gene duplication followed by divergence (Cush-ley, 1986). In other cases there is evidence that functionswhich in some species are present as separate mono-functional proteins occur in other species as fusedmultifunctional proteins. The amino acid sequences ofthese multifunctional proteins have mosaic structureswith recognizable regions that are closely related to theirmonofunctional counterparts (Hardie & Coggins, 1986).The results presented here for the S. cerevisiae arommultifunctional enzyme demonstrate that its penta-functional polypeptide chain has such a mosaic struc-ture and in this respect is very similar to the A. nidu-lans arom multifunctional enzyme (Charles et al.,1986). The most likely explanation for the origin of thepentafunctional fungal arom polypeptides is that theyhave arisen by the fusion of ancestral E. coli-like genes(Hardie & Coggins, 1986; Charles et al., 1986). Thealternative explanation, that the multifunctional enzymesare more ancient and that the monofunctional bacterialenzymes arose from them by mutational insertion of stopand start codons, cannot however be totally excluded(Hardie & Coggins, 1986).Assuming that the gene fusion hypothesis is correct it
would be expected that at least some of the functionalregions of the arom polypeptide chain would maintain adegree of structural autonomy and that functionaldomains might be isolatable, for example by limitedproteolysis. Although no such studies of the S. cere-visiae arom polypeptide have been reported thedomain structure of the closely related N. crassa arompolypeptide has been studied directly by limitedproteolysis (Smith & Coggins, 1983; Coggins et al., 1985;Coggins & Boocock, 1986). A very stable C-terminaltryptic fragment of Mr 68000 which carries both the3-dehydroquinase and shikimate dehydrogenase acti-vities has been isolated (Smith & Coggins, 1983; Coggins& Boocock, 1986). One particularly interesting propertyof this bifunctional fragnent of the arom polypeptide isthat even after denaturation with 8 M-urea or sodiumdodecyl sulphate it can refold and regain some of itsshikimate dehydrogenase activity (Smith & Coggins,1983; Coggins & Boocock, 1986). This implies that theC-terminal region of the arom polypeptide is a trulyautonomous functional region. Evidence has also beenpresented that expression of the C-terminal region of theA. nidulans AROM gene gives an independently foldingpolypeptide chain carrying 3-dehydroquinase activity(Kinghorn & Hawkins, 1982) and a truncated bi-functional A. nidulans AROM polypeptide carryingEPSP synthase and 3-dehydroquinase activity has beenreported (Charles et al., 1986). The early genetic data forthe N. crassa arom locus, which included the descriptionofmany point mutations lacking a single enzyme activity(Giles et al., 1967a; Rines et al., 1969; Case & Giles,1971) is also consistent with the mosaic model for thearom polypeptide.
Forty years ago Horowitz proposed that biosyntheticpathways, as they occur today, are the result ofretroevolution; that is, they have been progressively builtbackwards from the final metabolite of the pathway(Horowitz, 1945). The mechanism of this processpresumably involved gene duplication followed bydivergence (Horowitz, 1945, 1965) and one would
therefore expect that the evolved proteins would retainsome homology with the ancestral protein at the end ofthe metabolic sequence. Evidence in support of thishypothesis has recently been provided by the demon-stration that two enzymes catalysing successive steps inmethionine biosynthesis in E. coli are homologous(Belfaiza et al., 1986). While the sequence homologiespresented here between the five monofunctional E. colishikimate pathway enzymes and the multifunctionalarom polypeptides imply that the tertiary structures ofthe functional domains are conserved, we have so farbeen unable to identify any homologies, at the primarystructure level, between the five shikimate pathwayenzymes. The question of whether these five enzymeshave common structural features at the tertiary level willhave to await detailed three-dimensional structuralanalysis.
Gaertner and his co-workers have attributed somevery interesting catalytic properties to the N. crassa aromsystem (Gaertner et al., 1970; Welch & Gaertner, 1975,1976). These included 'catalytic facilitation' (Gaertneret al., 1970), 'channelling' (Welch & Gaertner, 1975) and'co-ordinate regulation' (Welch & Gaertner, 1976). It isnow clear that all these experiments were carried out witharom that was not only proteolytically degraded(Gaertner, 1978) but was also seriously deficient in3-dehydroquinate synthase activity (Lambert et al.,1985). Also the kinetic parameters used in the calculationswere very different from those determined more recentlywith well defined preparations of homogeneous enzyme(Lambert et al., 1985; Coggins & Boocock, 1986). At thepresent time we are not aware of any conclusive evidenceof catalytic interactions between the componentenzymes, nor have we obtained any evidence ofco-ordinate activation (G. A. Nimmo, M. R. Boocock,J. M. Lambert & J. R. Coggins, unpublished work). Thislack of evidence for any special catalytic properties forthe arom system has lead us to consider an alternativeadaptive advantage for the occurrence of the penta-functional arom polypeptide chain. By having fiveenzymic functions involved in catalysing five sequentialsteps on a biosynthetic pathway on a single multi-functional polypeptide chain the problem of co-ordinating the expression of the five separate enzymeactivities is avoided. In this connection it is interesting tonote that the turnover numbers of the five arom enzymeactivities for the N. crassa multifunctional enzyme arevery similar (Lambert et al., 1985).We wish to thank Dr. Frank Larimer (Oak Ridge National
Laboratory, U.S.A.) for a gift of plasmid pFL6. The workcarried out in Glasgow was supported by a research grant fromSearle Research and Development.
REFERENCESAhmed, S. I. & Giles, N. H. (1969) J. Bacteriol. 99, 231-237Alton, N. K., Buxton, F. Patel, V., Giles, N. H. & Vapnek, D.
(1982) Proc. Natl. Acad. Sci. U.S.A. 79, 1955-1959Amrhein, N., Schab, J. & Steinrucken, H. C. (1980) Natur-
wissenschaften 67, 356-357Anton, I. A. & Coggins, J. R. (1987) Biochem. J., in the
pressBachmann, B. (1983) Microbiol. Rev. 44, 180-230Belfaiza, J., Parsot, C., Martel, A., Bouthier de la Tour, C.,
Margarita, D., Cohen, G. N. & Saint-Girons, I. (1986) Proc.Natl. Acad. Sci. U.S.A. 83, 867-871
Bennetzen, J. L. & Hall, B. D. (1982) J. Biol. Chem. 257,3026-3031
Vol. 246
386 K. Duncan, R. M. Edwards and J. R. Coggins
Berlyn, M. B. & Giles, N. H. (1969) J. Bacteriol. 99, 222-230Berlyn, M. B., Ahmed, S. I. & Giles, N. H. (1970) J. Bacteriol.
104, 768-774Biggin, M. D., Gibson, T. J. & Hong, G. F. (1983) Proc. Natl.Acad. Sci. U.S.A. 80, 3963-3965
Bode, R. & Birnbaum, D. (1981) Z. Allg. Mikrobiol. 21,417-422
Boocock, M. R. & Coggins, J. R. (1983) FEBS Lett. 154,127-133
Case, M. E. & Giles, N. H. (1971) Proc. Natl. Acad. Sci. U.S.A.68, 58-62
Catcheside, D. E. A., Storer, P. J. & Klein, B. (1985) Mol. Gen.Genet. 199, 446-451
Charles, I. J., Keyte, J. W., Brammar, W. J. & Hawkins, A. R.(1985) Nucleic Acids Res. 13, 8119-8128
Charles, I. J., Keyte, J. W., Brammar, W. J., Smith, M. &Hawkins, A. R. (1986) Nucleic Acids Res. 14, 2201-2213
Chaudhuri, S. & Coggins, J. R. (1985) Biochem. J. 226,217-223
Chaudhuri, S., Lambert, J. M., McColl, L. A. & Coggins, J. R.(1986) Biochem. J. 239, 699-704
Coggins, J. R. (1986) in Biotechnology and Crop Improvementand Protection (Day, P. R., ed.), pp. 101-110, British CropProtection Council, Croydon
Coggins, J. R. & Boocock, M. R. (1986) in MultifunctionalProteins: Structure and Evolution (Hardie, D. G. &Coggins, J. R., eds.), pp. 259-281, Elsevier, Amsterdam
Coggins, J. R., Boocock, M. R. Campbell, M. S., Chaudhuri,S., Lambert, J. M., Lewendon, A., Mousdale, D. M. &Smith, D. D. S. (1985) Biochem. Soc. Trans. 13, 299-303
Comai, L., Sen, L. C. & Stalker, D. M. (1983) Science 221,370-371
Clements, J. & Roberts, G. F. (1986) Gene 44, 97-105Cushley, W. (1986) in Multifunctional Proteins: Structure and
Evolution (Hardie, D. G. & Coggins, J. R., eds.), pp. 13-53,Elsevier, Amsterdam
Da Silva, A. J. F., Whittington, H., Clements, J., Roberts,C. F. & Hawkins, A. R. (1986) Biochem. J. 240, 481-488
de Leeuw, A. (1967) Genetics 56, 554-555Devereux, J., Haeberli, P. & Smithies, 0. (1984) Nucleic Acids.
Res. 12, 387-395Donahue, T. F., Farnbaugh, P. J. & Fink, G. R. (1982) Gene
18, 47-59Duncan, K., Lewendon, A. & Coggins, J. R. (1984) FEBS Lett.
170, 59-63Duncan, K., Chaudhuri, S., Campbell, M. S. & Coggins, J. R.
(1986) Biochem. J. 238, 475-483Duncan, K., Dacey, S. A., Edwards, R. M. & Coggins, J. R.
(1987) FEBS Lett., in the pressFiedler, E. & Schultz, (1985) Plant Physiol. 79, 212-218Gaertner, F. H., Ericson, M. C. & DeMoss, J. A. (1970) J. Biol.Chem. 245, 595-600
Gaertner, F. H. (1978) in Microenvironments and MetabolicCompartmentation (Srere, P. A. & Estabrook, R. W., eds.),pp. 345-353, Academic Press, New York
Ganem, B. (1978) Tetrahedron 34, 3353-3383Giles, N. H. (1978) Am. Naturalist 112,-641-658Giles, N. H., Case, M. E., Partridge, C. W. H. & Ahmed, S. I.
(1967a) Proc. Natl. Acad. Sci. U.S.A. 58, 1453-1460Giles, N. H., Partridge, C. W. H., Ahmed, S. I. & Case, M. E.
(1-967b)-Proc. Natl;- Acad. Sci. U;S.A;-58, 1930-1937Hardie, D. G. & Coggins, J. R. (1986) in Multifunctional
Proteins: Structure and Evolution (Hardie, D. G. &Coggins, J. R., eds.), pp. 332-334, Elsevier, Amsterdam
Hardie, D. G. & McCarthy, T. (1986) in MultifunctionalProteins: Structure and Evolution (Hardie, D. G. &Coggins, J. R., eds.), pp. 229-258, Elsevier, Amsterdam
Henner, D. J. & Hoch, J. A. (1980) Microbiol. Rev. 44, 57-82Hinnebusch, A. G. & Fink, G. R. (1983) J. Biol. Chem. 258,
5238-5247
Horowitz, N. H. (1945) Proc. Natl. Acad. Sci. U.S.A. 31,153-157
Horowitz, N. H. (1965) in Evolving Genes and Proteins(Bryson, V. & Vogel, H. J., eds.), pp. 15-23, Academic Press,New York
Kinghorn, J. R. & Hawkins, A. R. (1982) Mol. Gen. Genet.186, 145-152
Kirschner, K. & Bisswanger, H. (1976) Annu. Rev. Biochem.45, 143-166
Koshiba, T. (1979) Plant Cell Physiol. 2, 667-670Kozak, M. (1984) Nucleic Acids Res. 12, 857-872Lambert, J. M., Boocock, M. R. & Coggins, J. R. (1985)
Biochem. J. 226, 817-829Larimer, F. W., Morse, C. C., Beck, A. K., Cole, K. W. &
Gaertner, F. H. (1983) Mol. Cell. Biol. 3, 1609-1614Lewendon, A. (1984) Ph.D. Thesis, University of GlasgowLewendon, A. & Coggins, J. R. (1983) Biochem. J. 213,
187-191Lumsden, J. & Coggins, J. R. (1977) Biochem. J. 161, 599-607Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular
Cloning: A Laboratory Manual, Cold Spring HarborLaboratory, New York
Messing, J. & Vieira, J. (1982) Gene 19, 269-276Midgley, C. A. & Murray, N. E. (1985) EMBO J. 4, 2695-2703Millar, G. & Coggins, J. R. (1986) FEBS Lett. 200, 11-17Millar, G., Lewendon, A., Hunter, M. & Coggins, J. R. (1986)
Biochem. J. 237, 427-437Miozar, G. F. & Yanofsky, C. (1979) Nature (London) 277,
486-489Mousddle, D. M. & Coggins, J. R. (1984) Planta 160, 78-83Mousdale, D. M., Campbell, M. S. & Coggins, J. R. (1987)
Phytochemistry 26, in the pressNakanishi, N. & Yamamoto, M. (1984) Mol. Gen. Genet. 195,
164-169Norrander, J., Kempe, T. & Messing, J. (1983) Gene 26,
101-106Polley, L. D. (1978) Biochim. Biophys. Acta 526, 259-266Rines, H. W., Case, M. E. & Giles, N. H. (1969) Genetics 61,
789-800Sanderson, K. E. & Roth, J. R. (1983) Microbiol. Rev. 47,
410-553Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl.Acad. Sci. U.S.A. 74, 5463-5467
-Schmincke-Ott, E. & Bisswanger, H. (1980) in MultifunctionalProteins (Bisswanger, H. & Schmincke-Ott, E., eds.), pp. 1-29,Wiley, New York
Schweizer, M. (1986) in Multifunctional Proteins: Structureand Evolution (Hardie, D. G. & Coggins, J. R., eds.), pp.195-227, Elsevier, Amsterdam
Smith, D. D. S. & Coggins, J. R. (1983) Biochem. J. 213,405-415
Smith, T. F. & Waterman, M. S.. (1981) Adv. Appl. Math. 2,482-489
Stalker, D. M., Hiatt, W. R. & Comai, L. (1985) J. Biol. Chem.260, 4724-4728
Steinrucken, H. C. & Amrhein, N. (1984) Eur. J. Biochem. 143,341-349
Strauss, A. (1979) Mol. Gen. Genet. 172, 233-241Walker, J. E., Saraste, M., Runswick, M. J. & Gray, M. J.
(1982) EMBO J. 1, 945-951Walsh,-C. (1979) Enzyme Reaction Mechanisms, pp. 553-556,
Freeman, San FranciscoWelch, G. R. & Gaertner, F. H. (1975) Proc. Natl. Acad. Sci.
U.S.A. 72, 4218-4222Welch, G. R. & Gaertner, F. H. (1976) Arch. Biochem.
Biophys. 172, 476-489Zalkin, H. & Yanofsky, C. (1982) J. Biol. Chem. 257,
1491-1500Zalkin, H., Puluh, J. L., van Cleemput, M., Maoye, W. S. &
Yanofsky, C. (1984) J. Biol. Chem. 259, 3985-3992
Received 5 December 1986/2 April 1987; accepted 13 May 1987
1987