
Proc. Natl. Acad. Sci. USA Vol. 93, pp. 9033-9038, August 1996 Evolution

Duplication and functional divergence in the chalcone synthase gene family of Asteraceae: Evolution with substrate change and catalytic simplification

(anthocyanin/flavonoid genetics/gene phylogeny/secondary metabolism/stilbene synthase)

YRJÖ HELARIUTTA*†, MIKA KOTILAINEN*, PAULA ELOMAA*, NISSE KALKKINEN*, KÅRE BREMER§, TEEMU H. TEERI*, AND VICTOR A. ALBERT‡

*Institute of Biotechnology, University of Helsinki, P.O. Box 45, FIN-00014 Helsinki, Finland; ‡The New York Botanical Garden, Bronx, NY 10458-5126; and §Department of Systematic Botany, Uppsala University, Villavägen 6, S-752 36 Uppsala, Sweden

Communicated by Michael T. Clegg, University of California, Riverside, CA, May 20, 1996 (received for review November 29, 1995)

ABSTRACT Plant-specific polyketide synthase genes constitute a gene superfamily, including universal chalcone synthase [CHS; malonyl-CoA:4-coumaroyl-CoA malonyltransferase (cyclizing) (EC 2.3.1.74)] genes, sporadically distributed stilbene synthase (SS) genes, and atypical, as-yet-uncharacterized CHS-like genes. We have recently isolated from Gerbera hybrida (Asteraceae) an unusual CHS-like gene, GCHS2, which codes for an enzyme with structural and enzymatic properties as well as ontogenetic distribution distinct from both CHS and SS. Here, we show that the GCHS2-like function is encoded in the Gerbera genome by a family of at least three transcriptionally active genes. Conservation within the GCHS2 family was exploited with selective PCR to study the occurrence of GCHS2-like genes in other Asteraceae. Parsimony analysis of the amplified sequences together with CHS-like genes isolated from other taxa of angiosperm subclass Asteridae suggests that GCHS2 has evolved from CHS via a gene duplication event that occurred before the diversification of the Asteraceae. Enzyme activity analysis of proteins produced in vitro indicates that the GCHS2 reaction is a non-SS variant of the CHS reaction, with both different substrate specificity (to benzoyl-CoA) and a truncated catalytic profile. Together with the recent results of Durbin et al. [Durbin, M. L., Learn, G. H., Jr., Huttley, G. A. & Clegg, M. T. (1995) Proc. Natl. Acad. Sci. USA 92, 3338-3342], our study confirms a gene duplication-based model that explains how various related functions have arisen from CHS during plant evolution.

Plant-specific polyketide synthase genes constitute a gene superfamily. Genes encoding chalcone synthase [CHS; malonyl-CoA:4-coumaroyl-CoA malonyltransferase (cyclizing) (EC 2.3.1.74)] and flavonoids, their corresponding reaction products, seem to be universally distributed in plants (1-3). CHS genes have been isolated from a wide taxonomic spectrum from nonflowering seed plants to dicots and monocots (4, 5). Stilbene synthase (SS) genes coding for enzymes with a related activity to CHS have been isolated from species that accumulate stilbene phytoalexin (5-8). In addition to SS genes, other structurally unusual CHS-like genes or gene products have been reported (9-12). Recently, Tropf et al. (13) have provided evidence that SS genes have evolved from CHS genes via independent gene duplication events several times during seed plant evolution. Durbin et al. (14) have demonstrated an analogous mechanism leading to the evolution of structurally unusual genes in the genus Ipomoea.

We have recently isolated a structurally unusual CHS-like gene, GCHS2, from Gerbera hybrida (Asteraceae; ref. 15). GCHS2 is ~70% identical to typical CHS genes and the related SS genes at the level of deduced amino acid sequence. Its expression pattern at both organ and cellular levels is not correlated with anthocyanin pigmentation, for which CHS provides the first committed biosynthetic step. Furthermore, the catalytic properties of the corresponding enzyme differ from CHS and SS, although the GCHS2 catalytic reaction and its role in vivo are not yet completely understood.

In this study we show that the GCHS2-like genes in Asteraceae constitute a gene family, whose corresponding amino acid sequences share some consensus residues. Phylogenetic parsimony analysis of (i) the GCHS2 nucleotide sequence, (ii) further GCHS2-like genes screened from a Gerbera cDNA library, (iii) gene fragments amplified from various Asteraceae using GCHS2 family-specific primers, and (iv) CHS superfamily genes isolated from other angiosperms of subclass Asteridae indicates that GCHS2 probably evolved from CHS via a single gene duplication event that occurred before the diversification of Asteraceae. Structural and functional variation observed among the members of the GCHS2 gene family suggests that subsequent diversification has also taken place. A comparison of the catalytic properties of GCHS2 to parsley CHS shows that both substrate specificity and progressivity of catalytic reaction steps have changed during GCHS2 evolution.

MATERIALS AND METHODS

Plant Material. G. hybrida is a hybrid of two species (G. jamesonii and G. viridifolia) belonging to the tribe Mutisieae (Asteraceae subfamily Cichorioideae; ref. 16). We chose for analysis Leibnitzia (a closely related genus) and Onoseris (a more distantly related genus) from Mutisieae, Taraxacum (tribe Lactuceae) from Cichorioideae, and Dahlia (tribe Heliantheae) from subfamily Asteroideae (16).

Mature plants of G. hybrida var. Regina (obtained from Terra Nigra, De Kwakel, The Netherlands) and seedlings of Leibnitzia anandria and Onoseris sagittatis were grown under standard greenhouse conditions. Leaf material of Dahlia sp. and Taraxacum sp. (collected from gardens in Helsinki) was also used as a source of DNA.

Isolation of GCHS17 and GCHS26 from a Genomic λ Library. Nuclear DNA from Gerbera leaves was prepared by the method of Jofuku and Goldberg (17). A genomic library was constructed with LambdaGEM-11 vector (Promega) and was screened using GCHS1-3 cDNA clones as probes (15, 18). Short fragments (181 bp) from the clones were amplified using primers designed from the conserved region of the CHS genes (15), and their sequences were used for classification and designation of the subcloning strategy. Both strands of the GCHS17 and GCHS26 clones were determined using the deletion strategy both manually and with an automated sequence determination system (ALF; Pharmacia; ref. 18). Sequence alignments were done using the CLUSTAL program of the PC/gene package (IntelliGenetics).

Abbreviations: CHS, chalcone synthase; SS, stilbene synthase; 2-ME, 2-mercaptoethanol.

Data deposition: The sequences reported in this paper have been deposited in the GenBank data base (accession nos. X91339-X91345).

†To whom reprint requests should be sent at present address: Department of Biology, New York University, New York, NY 10003.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Amplification of GCHS2-Like Sequences from Dahlia, Leibnitzia, Onoseris, and Taraxacum. Partially degenerate (inosine containing) primers including restriction enzyme sites were used to amplify fragments from various asteraceous species. 5'-TGCACTCGAGTGA(A/C/G/T)AA(A/G)ACAGC(A/C/G/T)ATAAA(A/G)AA-3' and 5'-ACTGGGATCCACC(A/C/G/T)GGGTG(A/C/G/T)ACCATCCA(A/G)AA-3' corresponding to peptides C(D/E)KTAIKK and FWMVHPGG were used for specific amplification of GCHS2-like genes. For amplification, plant DNA was extracted by the method of Dellaporta et al. (19) and purified in isopycnic CsCl gradients (18). PCR was performed using a "touch-down" strategy: 10 times (94° 75 s; 50° 5 min adding -1° per cycle, slope +22°, 1° per 10 s; 72° 5 min) followed by 31 times (94° 75 s; 53° 2 min; 72° 5 min). To verify that the amplification products did not contain any chimeric artifacts due to recombination among related gene family members (20), a second, independent amplification was performed. After PCR, fragments were separated from primers by gel electrophoresis, purified from agarose and digested with the corresponding restriction enzyme pair for cloning into plasmid pOK12 (21) for sequence analysis of both strands.
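The complexity of the two primer pools can be estimated directly from the notation printed above. The following Python sketch counts the distinct oligonucleotides implied by each primer as written and checks for the restriction sites. It assumes that every parenthesized group is an equimolar mixture of the listed bases; if inosine was used at some of these positions, as the text indicates, the real pools would be smaller. The function names are illustrative and not taken from the paper.

```python
import re
from math import prod

# Primer sequences as printed in the Methods (5' to 3').
FORWARD = "TGCACTCGAGTGA(A/C/G/T)AA(A/G)ACAGC(A/C/G/T)ATAAA(A/G)AA"
REVERSE = "ACTGGGATCCACC(A/C/G/T)GGGTG(A/C/G/T)ACCATCCA(A/G)AA"

# Matches one degenerate position written as (X/Y/...).
GROUP = re.compile(r"\(([ACGT/]+)\)")


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by the (X/Y/...) notation."""
    return prod(len(group.split("/")) for group in GROUP.findall(primer))


def primer_length(primer: str) -> int:
    """Primer length in nucleotides, counting each degenerate group once."""
    return len(GROUP.sub("N", primer))


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, primer_length(primer), "nt,", degeneracy(primer), "variants")

# Recognition sites: CTCGAG (XhoI) and GGATCC (BamHI).
print("CTCGAG" in FORWARD, "GGATCC" in REVERSE)
```

Under these assumptions the forward pool is twice as complex as the reverse pool, because it carries one additional two-fold degenerate position.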

Phylogenetic Analysis of CHS-Like Gene Sequences. Nucleotide sequences representing 20 sequences (corresponding to the amino acid sequence from R11 to S389 in GCHS1) were aligned (using CLUSTAL) based on their deduced amino acid sequences. The aligned sequences (available on request) were subjected to cladistic parsimony analysis using the program PAUP (22). Heuristic search options were random addition of sequences with 100 replicates and subsequent tree-bisection-reconnection branch swapping to generate multiple equally parsimonious trees. Branch support was estimated using parsimony jackknifing with 10,000 replicates (23). The trees were oriented by treating CHS superfamily sequences from taxa of the Asterid I and Asterid II clades of Chase et al. (24) as monophyletic, respectively.
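The 63% jackknife threshold quoted in the legend of Fig. 2 can be motivated with a short calculation. The sketch below assumes that the jackknife of ref. 23 deletes each character independently with probability 1/e; this detail is not stated in the text and is an assumption here. Under that scheme, a group supported by k uncontradicted characters is recovered whenever at least one of them survives resampling. The function name is illustrative.

```python
import math

# Assumed per-character deletion probability of the parsimony jackknife.
DELETION_PROB = math.exp(-1)


def expected_support(k: int, p_delete: float = DELETION_PROB) -> float:
    """Expected jackknife frequency of a group supported by k uncontradicted
    characters, each deleted independently with probability p_delete."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # The group is lost only if all k supporting characters are deleted.
    return 1.0 - p_delete ** k


for k in range(1, 4):
    print(k, round(expected_support(k), 3))
```

With a single supporting character the expected frequency is 1 - 1/e, or about 63%, which is why values near that level are read as support by one uncontradicted synapomorphy.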

Expression of GCHS26 in Escherichia coli and Analysis of Corresponding Enzymatic Activity. Plasmids pHTT402, pHTT406 (15), and pHTT409 express GCHS2, parsley CHS, and GCHS26 genes from the vector pKKtac (25) in E. coli. In pHTT409, the intron in the genomic clone of GCHS26 was removed with the help of an oligonucleotide spanning the putative joint site of the exons. In these expression constructs the initiation of translation takes place at the plant genes' ATG. The vector without an insert served as the control. For enzyme production, E. coli DH5α cells (26) harboring the expression constructs were grown to OD600 = 0.8-1.0, induced for 1.5 h with 1 mM isopropyl β-D-thiogalactoside at 28°C, pelleted, and stored at -70°C. The enzymatic reaction was analyzed exactly as described in ref. 15. In certain experiments, 2-mercaptoethanol (2-ME) was added in the reaction to inhibit its progressivity.

Production, Purification, and UV Spectrum Analysis of Benzoyl-CoA-Derived Products. The ethyl acetate extracts from a large-scale enzymatic reaction [in 6 × 1 ml of 50 mM Hepes-KOH (pH 7)/1 mM EDTA/100 µM benzoyl-CoA/32 µM malonyl-CoA (2 × 1 ml: 800 nM [14C]malonyl-CoA)/200 µg of dialyzed E. coli protein extract] were evaporated in a vacuum centrifuge and redissolved in 10% methanol in water. This material was subjected to reverse-phase HPLC on a 0.4 × 10 cm LiChrospher 100 RP-18 (5 µm) column (Merck) connected to a Beckman 126 System Gold gradient HPLC pump. Chromatography was performed with a flow of 1 ml/min using a linear gradient of methanol (0-60% in 40 min) in H2O. The eluent was monitored at 220 nm using a Waters 990+ diode array detector. For spectral information, data from 200 to 400 nm was collected at a resolution of 1.4 nm. Furthermore, the radioactivity of each fraction was measured by scintillation counter, and the peak fractions were analyzed by TLC.
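The HPLC conditions translate into a simple relation between retention time, mobile-phase composition, and eluted volume. The Python sketch below assumes an ideal linear gradient with no dwell volume or column delay, which the text does not specify; the function names and the decision to reject times outside the 40-min gradient are illustrative choices.

```python
GRADIENT_START_PCT = 0.0    # methanol at t = 0 (% v/v)
GRADIENT_END_PCT = 60.0     # methanol at the end of the gradient (% v/v)
GRADIENT_MINUTES = 40.0     # gradient duration
FLOW_ML_PER_MIN = 1.0       # flow rate


def methanol_percent(t_min: float) -> float:
    """Nominal methanol percentage delivered by the pump at time t_min."""
    if not 0.0 <= t_min <= GRADIENT_MINUTES:
        raise ValueError("time lies outside the described gradient")
    slope = (GRADIENT_END_PCT - GRADIENT_START_PCT) / GRADIENT_MINUTES
    return GRADIENT_START_PCT + slope * t_min


def eluted_volume_ml(t_min: float) -> float:
    """Mobile-phase volume pumped through the column up to time t_min."""
    return FLOW_ML_PER_MIN * t_min


for t in (0, 10, 20, 40):
    print(t, "min:", methanol_percent(t), "% methanol,", eluted_volume_ml(t), "ml")
```

The gradient therefore rises by 1.5% methanol per minute, and at 1 ml/min each minute of retention corresponds to 1 ml of eluent.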

RESULTS

Isolation and Characterization of GCHS17 and GCHS26 Clones. Among 3.4 million plaque-forming units screened, eight λ clones hybridizing to GCHS1-3 cDNA probes were isolated. According to the classification based on the amplification of a 181-bp fragment from a conserved region (15), the clones were deduced to represent four different sequences. Two novel genes, GCHS17 and GCHS26, having a continuous reading frame (except for an intron) showed similarity to the GCHS2 gene. The exon/intron boundaries as well as the start and stop codons of the reading frames were deduced based on the general similarity of CHS enzymes at the amino acid sequence level and by comparison to the GCHS2 cDNA sequence. GCHS17 is a truncated clone missing approximately the first 50 codons. It has a 28-bp first exon, a 1638-bp intron, and a 1016-bp second exon. GCHS26 harbors the entire reading frame: a 196-bp first exon, a 483-bp intron, and a 1016-bp second exon.

We compared the deduced amino acid sequences of GCHS17 and GCHS26 to each other and to the other asteraceous CHS superfamily sequences. The sequences form two subgroups: usual CHS-like sequences (with 88-93% intragroup identity) and GCHS2-like sequences (83-84% intragroup identity). The identity between the two groups is 73-77%. Next, we compared these sequences to a CHS consensus sequence that contains the 260 residues identical in nine functionally verified sequences of a wide evolutionary spectrum (15). GCHS2 deviates at 49, GCHS26 at 47, and the truncated GCHS17 at 42 positions. At 18 comparable sites, all GCHS2-like sequences deviate in an identical way from the CHS consensus. At an additional six sites, they deviate in concert from the consensus of GCHS1, CHS from Dendranthema grandiflora, and GCHS3. For comparison, GCHS1 and the CHS from D. grandiflora deviate from the CHS consensus at five positions, whereas GCHS3 deviates at 10 sites. The sequence analysis suggests that, in the Gerbera genome, GCHS2-like genes form a family, at least in the sense that they share some diagnostic characteristics in the primary structure of the corresponding enzymes.
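The two comparisons described here, pairwise identity and deviation from a strict consensus, can be sketched as follows. The text does not say how gapped positions were scored, so ignoring them is an assumption, and the short sequences are toy fragments for illustration only, not data from the paper. In the toy consensus, "." marks a position where the reference sequences disagree and no consensus residue is defined.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical residues between two aligned sequences,
    ignoring columns where either sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def deviations_from_consensus(seq: str, consensus: str) -> list[int]:
    """1-based alignment positions where seq differs from a defined
    consensus residue; '.' in the consensus means 'not defined'."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have equal length")
    return [
        i
        for i, (s, c) in enumerate(zip(seq, consensus), start=1)
        if c != "." and s != "-" and s != c
    ]


# Hypothetical toy fragments.
toy_chs = "CDKSMIRKRY"
toy_variant = "CEKTAIKKRY"
toy_consensus = "C.K...R.RY"

print(percent_identity(toy_chs, toy_variant))
print(deviations_from_consensus(toy_variant, toy_consensus))
```

Counting deviations only at defined consensus positions mirrors the idea of "comparable sites" used in the text, where a truncated or partial sequence is scored only over the positions it actually covers.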

Isolation and Characterization of Gene Fragments from Other Asteraceae Species Using Primers Designed for the GCHS2 Gene Family. To study the distribution of GCHS2-like genes in the Asteraceae, we designed specific primers for PCR amplification based on the GCHS2 diagnostic sites (Fig. 1). Amplification products of expected size were obtained from Leibnitzia, Onoseris, Taraxacum, Dahlia, and Gerbera (for control). In Fig. 1, deduced amino acid sequences corresponding to the amplification products of the four species are shown. Each fragment has a reading frame without stop codons, and the degree of deviation from the CHS consensus is of the same order of magnitude as that of GCHS2 family of Gerbera, suggesting that the fragments analyzed represent coding regions of CHS superfamily genes. Leibnitzia LACHS1 and Onoseris OSCHS1 sequences share the clear majority of features common to the GCHS2 family. LACHS1 follows the GCHS2 consensus in 13 of 16 comparable sites and OSCHS1 follows this consensus in 11 of 16 comparable sites.

The Taraxacum and Dahlia sequences also share some residues diagnostic for GCHS2-like sequences. TXCHS1 shares four of seven comparable positions with the GCHS2 consensus, DHCHS1 shares three of seven comparable positions, and DHCHS2 shares three of nine comparable positions. Furthermore, the Taraxacum and Dahlia sequences often deviate from the CHS consensus in the same positions as the GCHS2 family of Gerbera, even if the derived residue is not identical. In contrast, there are several sites that are shared by Taraxacum and/or Dahlia clones and the CHS consensus that deviate from the GCHS2 consensus.

cons.   IC KS I R M TEE L NP C Y APSLDI IFAGGTVLR AKD AENN GARV VVCSEITAVTI
GCHS1   CDKSMIRKRYMHITEEYLKQN NMCAYMAPSLD 95 FAGGTVLRLAKDLAENNKGARVLVVCSEITAVT 200
MUMCHS  CDKSMIRKRYMHLTEEYLKEN NLCEYMAPSLD 95 FAGGTVLRLAKDLAENNKIARVLVVCSEITAVT 200
GCHS3   CDKSMIRKRYMHITEEFLKEN SMCK MAPSLD 98 FAGGTVLRLAKDLAENNKGARVLVVCSEITAVT 203
GCHS2   CEK AIKKRYHALTEIYLQEN TMCE MAPSLD 97 AGGTVLRLAKDLAENNKGERVLLVCSEITAE 202
GCHS26  CDK AIKKRY ALTEEYLKQN SMCE MAPSL 98 AGG VLRLAKDLAENNKG RVLVVCIEITA 203
GCHS17  CEK AIKKRY VLTEIYLEK NMCE MAPS 42 AGG VLRLAKDLAENNK RVLVICSEIT 147
        + + * * * * *
LACHS1  --------RYMVLTEEYLKEN NMCEIMAPSL 25 AGGTVLRIAKDLAENNKG RVLVVClEIl A 130
OSCHS1  --------RYMVLTEEYLKEN NMCEgMAPSL 25 AGGt LRLAKDLAENNK RVLVVCSEIT 130
TXCHS1  - RYIHHTEEFLKEN NMCGYN PSL 25 GG LRLAKDVAENNK RVLVVCEE 130
DHCHS1  --------RYMFLTEIFLKDN NMCEYA PSLI 25 AGG VLRLAKDIAENNKG RVLVVCIElIA 130
DHCHS2  G ~Vf~ LA D AENN RVVV1i

cons.   AQT PDS GAIDGHLREVGL FHLLKDVPG SKNI L AF ISD N FW AHPGG
GCHS1   MVSAAQTILPDSEGAIDGHLREVGLTFHLLKDVPGLISKNIEKALTTAFSPLGIDWNS IFWIAHPGG 309
MUMCHS  MISAAQTILPDSEGAIDGHLREVGLTFHLLKDVPGLISKNIEKALTQAFSPLGISDWNSIFWIAHPGG 309
GCHS3   MVSAAQTILPDSEGAIDGHLIEVGLTFHLLKDVP LI KNIEKALIQAFSPLNI DWNSIFWIAHPGG 312
GCHS2   IVST QTILPDIE A XHLRE GLTF L DVP MV KNIENAEKASPLGI DWNSVFWM HPGG 311
GCHS26  IVST QTILPDSE A _HLRE GLTF L DVP MI KNIEDV VKE SPLGISDWNSLFW HPGG 312
GCHS17  IVST QTILPDgE A HLRE GLTF LEDVP MVSKNIEDA MKA SPLGISDYNSLFWM HPGG 256
        + + * * * * + *
LACHS1  IVSA QTILPISE A HLRE GLTFLDVP MIS NIKDVDQAFSPLGISDWNSL ------- 193
OSCHS1  IVS QTLLPDSE A HLRE GLTFjL DVP MIKNIEDVIVKA SPLGISDWNSL ------- 193
DHCHS2  IVSA QTIIPDE AI3HLRE GLKHLDVP MI SNIENILMKA HPLGflDWNSM ------- 33

FIG. 1. Alignment of deduced amino acid sequences for CHS superfamily genes of Asteraceae. Three selected regions are shown, corresponding to the sequence ranges of PCR-amplified genes (the forward and reverse primer sites are denoted with hyphens). Positions of final residues per block are marked for each polypeptide. cons., strict consensus of CHS sequences based on 9 sequences from a wide taxonomic spectrum (15). The amino acid residues deviating from this consensus are shown with a black background. Consensus within the GCHS2 family of Gerbera that represents deviation from CHS is indicated by * when compared with all CHS sequences and + against CHS genes of Asteraceae only. Note that in complete sequences (not shown) two asterisked sites lie between the upper two blocks and three others lie after the last block. MUMCHS, CHS of D. grandiflora (27). LACHS1, OSCHS1, TXCHS1, DHCHS1, and DHCHS2 are the amplified fragments of Leibnitzia, Onoseris, Taraxacum, and Dahlia, respectively. These partial sequences vary in size because of cloning aspects; in Taraxacum, a sequence prematurely ending at a BamHI site was isolated, whereas in Dahlia, two BamHI fragments were isolated separately.
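The residue numbering in Fig. 1 can be cross-checked against the exon lengths reported for the two genomic clones. The Python sketch below compares the difference in first-exon length between GCHS26 and the truncated GCHS17 with the offset between their block-end positions. The block-end values are read from the figure as extracted here, and the variable and function names are illustrative.

```python
# Exon lengths in bp, as reported for the genomic clones.
FIRST_EXON_BP = {"GCHS26": 196, "GCHS17": 28}
SECOND_EXON_BP = {"GCHS26": 1016, "GCHS17": 1016}

# Final residue position of each alignment block, read from Fig. 1.
BLOCK_END = {"GCHS26": (98, 203, 312), "GCHS17": (42, 147, 256)}


def missing_codons(full: str, truncated: str) -> int:
    """Codons absent from the truncated clone, from first-exon lengths."""
    diff_bp = FIRST_EXON_BP[full] - FIRST_EXON_BP[truncated]
    codons, remainder = divmod(diff_bp, 3)
    if remainder:
        raise ValueError("exon length difference is not a whole number of codons")
    return codons


def numbering_offsets(full: str, truncated: str) -> set[int]:
    """Distinct offsets between the block-end positions of two sequences."""
    return {a - b for a, b in zip(BLOCK_END[full], BLOCK_END[truncated])}


codons = missing_codons("GCHS26", "GCHS17")
offsets = numbering_offsets("GCHS26", "GCHS17")
coding_bp = FIRST_EXON_BP["GCHS26"] + SECOND_EXON_BP["GCHS26"]

print("codons missing from GCHS17:", codons)
print("numbering offsets in Fig. 1:", offsets)
print("GCHS26 exon total:", coding_bp, "bp, divisible by 3:", coding_bp % 3 == 0)
```

Both routes give the same figure, which agrees with the statement that GCHS17 lacks approximately the first 50 codons.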

Phylogenetic Analysis of Asteraceae and Asterid CHS Superfamily Genes. To study the phylogenetic relationships of the GCHS2 gene family, we performed a parsimony analysis of 19 CHS-like sequences at the nucleotide level (Fig. 2). As recent publications (13, 14) have shown general overviews of CHS superfamily phylogeny, we included only selected CHS-like sequences available for angiosperm subclass Asteridae. These included CHS for Apiaceae, Asteraceae, Convolvulaceae, Scrophulariaceae, and Solanaceae. Based on prior phylogenetic results from the plastid gene rbcL (24), CHS-like sequences from the first two families were expected to appear as one monophyletic branch of Asteridae, whereas the latter three were expected to occur in another. Genes selected included two unusual UV-inducible CHS genes of Petunia (CHSB and CHSG; ref. 9), the CHSB-related Ipomoea sequences of (14), and the GCHS2-like genes of Gerbera and other Asteraceae.

The parsimony analysis yielded 2 equally most-parsimonious trees of 2720 steps with a consistency index of 0.50 and a retention index of 0.52 (31). One of these trees is shown in Fig. 2. The alternative tree differs only in the placement of Leibnitzia and Onoseris GCHS2-like sequences relative to those of Gerbera (Fig. 2). In every case, GCHS2-like sequences are derived from within CHS as sister-group to CHS3. Jackknife branch support values (23) suggest uncontradicted support for the clade comprising all Asteraceae CHS-like sequences and for each of the two major clades containing GCHS2-like sequences (Fig. 2).

GCHS2-like genes appear to be monophyletic in all most-parsimonious trees, but support for this association is weak (Fig. 2). Nevertheless, it is likely that GCHS2 has emerged from CHS as the result of a single gene duplication event, with subsequent differentiation during the evolution of the Asteraceae. This duplication event must have occurred prior to diversification of the Asteraceae because all GCHS2-like genes (of tribes Mutisieae, Lactuceae, and Heliantheae) would be derived from the CHS3 and CHS1 lineages (Fig. 2), both of which include Gerbera (Mutisieae). Phylogenetic analyses of morphological and other molecular data indicate that tribe Mutisieae is the basal-most lineage of cichorioid-asteroid Asteraceae (~23,000 species), with only subfamily Barnadesioideae (92 species) more primitive in the family (16).

Comparison of the Enzymatic Activities of GCHS2 and GCHS26 with Parsley CHS. To investigate the catalytic properties of GCHS26 and compare them with those of the previously analyzed GCHS2 and parsley CHS (15, 32), we cloned the cDNA into the expression vector pKKtac and produced the enzymes in E. coli. A parsley CHS cDNA was used as a reference for CHS function. As a control for the E. coli background, the vector with no insert was also used.

Fig. 3A shows the products of the three enzymes formed with 4-coumaroyl-CoA as a substrate. The initial product of a typical CHS reaction is chalcone, but in vitro most of it is converted nonenzymatically to naringenin in the course of the reactions, and this is the main product observed with parsley CHS in the chromatogram. In contrast, GCHS2 and GCHS26 reveal no formation of naringenin, but produce a faint (and blurred) signal (P0.08; ref. 15) near the start. The radioactivity at the front detected with low-pH extractions probably represents malonic acid liberated from malonyl-CoA by thioesterases in the extracts (15).

Since in our previous study (15) we found that GCHS2 isable to use benzoyl-CoA as a substrate (leading to the accu-mulation of a product, P0.51) we tested the activity of GCHS26similarly. Fig. 3B shows that GCHS26 produces two signals

Evolution: Helariutta et al.

Dow

nloa

ded

by g

uest

on

Aug

ust 4

, 202

1

Page 4: Duplication and divergence the chalcone synthase gene of ...9033 Thepublication costs ofthis article weredefrayed in part bypage charge payment.Thisarticle mustthereforebeherebymarked"advertisement"

Proc. Natl. Acad. Sci. USA 93 (1996)

CHS (Antirrhinum: Scrophulariaceae)

~~1 CHSB (Petunia: Solanaceae)86 CHSA (Ipomoea: Convolvulaceae)

100 CHSB (Ipomoea: Convolvulaceae)

CHSG (Petunia: Solanaceae)- CHSJ (Petunia: Solanaceae)1- CHSA (Petunia: Solanaceae)

l...... CHS1 (Lycopersicon: Solanaceae)CHS (Petroselinum: Apiaceae)

^^"sC4 -

50

UiL;Hlll (Gerbera: Asteraceae)MUMCHS (Dendranthema: Asteraceae)

GCHS3 (Gerbera: Asteraceae)

81TXCHS1 (Taraxacum: Asteraceae)

DHCHS1 (Dahlia: Asteraceae)

.0 OSCHS1 (Onoseris: Asteraceae)75

LACHS1 (Leibnitzia: Asteraceae)

GCHS26 (Gerbera: Asteraceae)

GCHS2 (Gerbera: Asteraceae)

GCHS1 7 (Gerbera: Asteraceae)

FIG. 2. Phylogenetic relationships of GCHS2-like genes within the context of the CHS gene superfamily of angiosperm subclass Asteridae. Thetree shown is one of two most-parsimonious topologies differing only in the reversed placements of OSCHS1 (from Onoseris) and LACHS1 (fromLeibnitzia). The tree was oriented by treating the two major clades of Asteridae (24) as monophyletic groups. The same orientation was also indicatedby midpoint rooting (28). Branch lengths are proportional to numbers of hypothesized nucleotide changes under the accelerated transformationcharacter optimization (29, 30). The scale bar indicates 50 nucleotide changes. Numbers at nodes are parsimony jackknife support values; valuesclose to or greater than 63% may indicate nodes set off by uncontradicted synapomorphies, whereas values between 50% and 63% indicate nodeswith some robustness to extra steps (23). Support values under 50% are not shown. All Asteraceae CHS-like sequences form a group supportedby 67% ofjackknife replicates. GCHS2-like sequences are shown to derive monophyletically from CHS1 and CHS3, but support for this relationshipis weak. It is therefore possible that all GCHS2-like genes trace to a single duplication event that took place before the diversification of Asteraceae;DHCHS1 from Dahlia (of the derived subfamily Asteroideae) is embedded within an asteraceous CHS-like clade otherwise dominated by Gerberaand other taxa of tribe Mutiseae, which occupies a primitive position in Asteraceae phylogeny (16).

(P0.51 and P0.9) specific for benzoyl-CoA. This indicates that both GCHS2 and GCHS26 differ from typical CHS in their substrate specificity; both enzymes are strikingly inactive with the natural substrate of CHS, 4-coumaroyl-CoA, but both are able to use benzoyl-CoA instead. Furthermore, neither enzyme is able to catalyze the formation of products that are soluble in the organic phase at high pH, which contrasts with parsley CHS (and the two CHS enzymes of Gerbera with both 4-coumaroyl-CoA and benzoyl-CoA as substrates; ref. 15).

To characterize further the product specificity of each enzyme with benzoyl-CoA, we purified the major products by HPLC and analyzed their identity based on mobility in TLC (Fig. 3C) and UV absorption spectrum (Fig. 3D). Parsley CHS produces four products (P0.51a, P0.51b, P0.67, P0.82), one of which (P0.51a) is produced by both GCHS2 and GCHS26 and one of which (P0.82) is also produced by GCHS26, but not significantly by GCHS2. Based on this analysis, the two products soluble at high pH are P0.51b and P0.67.

To understand the relationship of the compound shared between GCHS2 and GCHS26 (P0.51a) to the other reaction products, we studied the effect of 2-ME, a known inhibitor of the progressivity of the CHS reaction (33). With parsley CHS, 2-ME decreases the formation of the normal end-product naringenin (probably by disturbing a cysteine residue at the enzyme's active site; ref. 34) and increases the formation of a byproduct, bis-noryangonin, which is a reaction intermediate (33). As a control, we first tested the effect of 2-ME on the reaction with 4-coumaroyl-CoA. As expected, treatment with 100 mM 2-ME blocked naringenin formation and led to the production of a high-mobility signal likely ascribable to liberated malonic acid (Fig. 3E). With benzoyl-CoA as substrate, increasing the amount of 2-ME in the reaction led to reduced formation of the high-pH soluble compounds P0.51b and P0.67, whereas the accumulation of P0.51a (not soluble in the organic phase at high pH) remained high (Fig. 3F). As with 4-coumaroyl-CoA, the high-mobility signal became stronger. This signal (and no others) increased even in a second control reaction where malonyl-CoA was the only substrate (Fig. 3F). In conclusion, the effect of 2-ME on the parsley CHS reaction with benzoyl-CoA strongly suggests that P0.51a represents an intermediate of the normal CHS reaction. Indeed, NMR structural analysis of the P0.51 compound indicates the absence of the second aromatic ring typical of the CHS reaction, and therefore incomplete cyclization in the GCHS2 reaction (I. Kilpelainen, personal communication). This catalytic truncation with substrate change implies a functional deviation from CHS consistent with the virtually unaltered floral pigmentation of Gerbera plants transgenic for an antisense-GCHS2 Agrobacterium construct (35).
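The product assignments reported above for the benzoyl-CoA reactions can be restated compactly as sets. The following sketch is only a bookkeeping aid: the labels are the TLC mobility values used in Fig. 3, the assignments are those given in the text, and the dictionary layout and variable names are our own illustrative choices.

```python
# Products detected with benzoyl-CoA and malonyl-CoA as substrates.
# Labels are the mobility values from Fig. 3; the data layout is illustrative.
products = {
    "parsley CHS": {"0.51a", "0.51b", "0.67", "0.82"},
    "GCHS2": {"0.51a"},           # 0.82 is not produced significantly
    "GCHS26": {"0.51a", "0.82"},
}

# Products reported as soluble in the organic phase at high pH.
high_ph_soluble = {"0.51b", "0.67"}

# Compound made by every enzyme tested.
shared_by_all = set.intersection(*products.values())

# High-pH soluble products made by each enzyme.
soluble_by_enzyme = {name: made & high_ph_soluble for name, made in products.items()}

# Parsley CHS products that each enzyme fails to make.
missing = {name: products["parsley CHS"] - made for name, made in products.items()}

print("shared by all:", sorted(shared_by_all))
for name in products:
    print(name, "high-pH soluble:", sorted(soluble_by_enzyme[name]),
          "missing:", sorted(missing[name]))
```

Laid out this way, the only compound common to all three enzymes is 0.51a, and neither Gerbera enzyme yields either of the two high-pH soluble products, which is the pattern the text interprets as a truncated reaction.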

DISCUSSION

The GCHS2 gene of G. hybrida has been shown to be a novel member of the CHS gene superfamily, encoding an enzyme





FIG. 3. Chromatographic analysis of in vitro CHS reaction products. (A) TLC analysis with 4-coumaroyl-CoA and malonyl-CoA as substrates. 2, GCHS2; 26, GCHS26; P, parsley CHS expressed in E. coli; V, E. coli control (vector); c, 4-coumaroyl-CoA; m, malonyl-CoA substrates of the reactions. The pH at extraction is labeled lo (pH 4) and hi (pH 8.8). (B) TLC analysis with benzoyl-CoA and malonyl-CoA as substrates. b, benzoyl-CoA. (C) TLC analysis of the radioactive fractions purified by HPLC run adjacent to the nonpurified reactions. 2/0.51, 26/0.51, 0.82, P/0.51, 0.51b, 0.67, 0.82, unknown products of GCHS2, GCHS26, and parsley CHS reactions, respectively. (D) UV spectrum of the fractions analyzed in C. (E) TLC analysis of the effect of 100 mM 2-ME on the parsley CHS reaction. (-) absence/(+) presence of 2-ME. (F) TLC analysis of the effect of increasing concentrations of 2-ME on the parsley CHS reaction. 0, 12, 60, and 300 mM concentrations of 2-ME. S, F: start and front of the chromatograms. Nar, position of naringenin; 0.08, 0.51, positions of unknown products.

with structural and enzymological properties as well as ontogenetic distribution distinct from CHS and SS (15). Here, we describe the evolutionary and functional relationships of GCHS2 to other genes of the superfamily. The Gerbera genome harbors a family of GCHS2-like genes with at least three members, and amplified fragments of GCHS2-like genes have been obtained from other species of Asteraceae. Based on the phylogenetic hypothesis presented (Fig. 2), two strongly supported



lineages of GCHS2-like genes have emerged from CHS by gene duplication and functional divergence during the evolution of Asteraceae. Although the relationship is only weakly supported, maximum parsimony indicates a monophyletic origin for both of these lineages, suggesting that a single gene duplication gave rise to all GCHS2-like genes before Asteraceae diversified.

Together with the recent results of Tropf et al. (13) and Durbin et al. (14), our study confirms a gene duplication-based model that explains how various related functions have arisen from CHS during plant evolution. Based on some common motifs in primary structure and the similar sequential reaction mechanism, it has been suggested that CHS itself shares a common origin with fatty acid synthases of primary metabolism (36, 37). In the reaction of SS, which uses identical malonyl-CoA and 4-coumaroyl-CoA substrates, only the final cyclization step of the reaction is modified (37). In the GCHS2 reaction, as studied here, both the substrate specificity and the progressivity of the reaction have been changed. Altered substrate specificity in a CHS-like enzyme has been reported for acridone synthase, an enzyme in the alkaloid biosynthetic pathway using N-methylanthraniloyl-CoA as substrate (11, 12). Nevertheless, the truncation of the CHS reaction in GCHS2 catalysis is a novel feature. It suggests that the initially relatively complex CHS reaction has been simplified in the course of protein evolution, and that this simplification must have been selectively advantageous to have been retained. If the catalytic steps of the CHS reaction (and those of related fatty acid synthases) evolved stepwise over time, then the truncated reaction of GCHS2 may be considered a reversal to a more primitive condition in enzymatic evolution. Truncation of the CHS reaction leading to the accumulation of novel metabolites of secondary metabolism has been hypothesized previously in the context of p-hydroxyphenylbutan-2-one biosynthesis in raspberry (39) and the biosynthetic origin of bis-noryangonins (33). Our present results with the GCHS2 reaction support these untested suppositions and imply that catalytic simplification (i.e., evolutionary reversal), like substrate change, may be a recurring theme in CHS superfamily evolution.

The GCHS2-like sequences from Taraxacum (tribe Lactuceae, subfamily Cichorioideae) and Dahlia (tribe Heliantheae, subfamily Asteroideae) form a well-supported lineage apart from those of the three species of the tribe Mutisieae (subfamily Cichorioideae), which indicates a further divergence in the GCHS2 family along with the diversification of Asteraceae. Additionally, the three GCHS2-like genes of Gerbera clearly differ from one another. GCHS17 and GCHS26 are generally expressed at a lower level than GCHS2, and the strong expression in floral organs typical of GCHS2 is lacking (data not shown). Furthermore, in the comparison of the catalytic properties of GCHS2 and GCHS26, a slightly different product specificity was observed (Fig. 3B). In the future, in combination with biochemical investigation of the function of GCHS2, these studies on the molecular evolution of GCHS2 will lead to a greater understanding of the role, biological significance, and diversity of GCHS2 as a novel enzyme of secondary metabolism in the Asteraceae.

We thank Hans V. Hansen for the Leibnitzia and Onoseris seed material; Neil Courtney-Gutterson for the CHS sequence of D. grandiflora; and James S. Farris, Jaakko Hyvonen, Ilkka Kukkonen, Barbara Meurer-Grimes, Joachim Schroder, Lena Struwe, and Risto Vainola for valuable discussions or assistance. Eija Holma, Marja Huovila, Paivi Laamanen, and Keijo Virta are acknowledged for their excellent technical assistance. This work was partially funded by the Academy of Finland, The Swedish Natural Science Research Council, and the Lewis B. and Dorothy Cullman Foundation.

1. Swain, T. (1986) in Plant Flavonoids in Biology and Medicine: Biochemical, Pharmacological and Structure-Activity Relationships, eds. Cody, V., Middleton, E., Jr., & Harborne, J. B. (Liss, New York), pp. 1-14.
2. Stafford, H. (1991) Plant Physiol. 96, 680-685.
3. Koes, R. E., Quattrocchio, F. & Mol, J. N. M. (1994) BioEssays 16, 123-132.
4. Niesbach-Klosgen, U., Barzen, E., Bernhardt, J., Rohde, W., Schwarz-Sommer, Z., Reif, H. J., Wienand, U. & Saedler, H. (1987) J. Mol. Evol. 26, 213-225.
5. Fliegmann, J., Schroder, G., Schanz, S., Britsch, L. & Schroder, J. (1992) Plant Mol. Biol. 18, 489-503.
6. Schroder, G., Brown, J. W. S. & Schroder, J. (1988) Eur. J. Biochem. 172, 161-169.
7. Melchior, F. & Kindl, H. (1990) FEBS Lett. 268, 17-20.
8. Sparvoli, F., Martin, C., Scienza, A., Gavazzi, G. & Tonelli, C. (1994) Plant Mol. Biol. 24, 743-755.
9. Koes, R. E., Spelt, C. E., van den Elzen, P. J. M. & Mol, J. N. M. (1989) Gene 81, 245-257.
10. Shen, J. B. & Hsu, F. (1992) Mol. Gen. Genet. 234, 379-389.
11. Baumert, A., Maier, W., Groeger, D. & Deutzmann, R. (1994) Z. Naturforsch. 49, 26-32.
12. Junghans, K. T., Kneusel, R. E., Baumert, A., Maier, W., Groger, D. & Matern, U. (1995) Plant Mol. Biol. 27, 681-692.
13. Tropf, S., Lanz, T., Rensing, S. A., Schroder, J. & Schroder, G. (1994) J. Mol. Evol. 38, 610-618.
14. Durbin, M. L., Learn, G. H., Jr., Huttley, G. A. & Clegg, M. T. (1995) Proc. Natl. Acad. Sci. USA 92, 3338-3342.
15. Helariutta, Y., Elomaa, P., Kotilainen, M., Griesbach, R. J., Schroder, J. & Teeri, T. H. (1995) Plant Mol. Biol. 28, 47-60.
16. Bremer, K. (1994) Asteraceae: Cladistics and Classification (Timber, Portland, OR).
17. Jofuku, D. & Goldberg, R. B. (1988) in Plant Molecular Biology: A Practical Approach, ed. Shaw, C. H. (IRL, Oxford), pp. 37-66.
18. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, NY), 2nd Ed.
19. Dellaporta, S. L., Wood, J. & Hicks, J. B. (1983) Plant Mol. Biol. Rep. 1, 19-21.
20. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., Mullis, K. B. & Erlich, H. A. (1988) Science 239, 487-491.
21. Vieira, J. & Messing, J. (1991) Gene 100, 189-194.
22. Swofford, D. L. (1993) PAUP: Phylogenetic Analysis Using Parsimony (Illinois Nat. Hist. Survey, Champaign), Version 3.1.
23. Farris, J. S., Albert, V. A., Kallersjo, M., Lipscomb, D. & Kluge, A. G. (1996) Cladistics, in press.
24. Chase, M. W., Soltis, D. E., Olmstead, R. G., Morgan, D., Les, D. H., et al. (1993) Ann. Mo. Bot. Gard. 80, 528-580.
25. Takkinen, K., Laukkanen, M.-L., Sizmann, D., Alfthan, K., Immonen, T., Vanne, L., Kaartinen, M., Knowles, J. K. C. & Teeri, T. T. (1991) Protein Eng. 4, 837-841.
26. Hanahan, D. (1983) J. Mol. Biol. 166, 557-580.
27. Courtney-Gutterson, N., Napoli, C., Lemieux, C., Morgan, A., Firoozabady, A. & Robinson, K. E. P. (1994) Bio/Technology 12, 268-271.
28. Farris, J. S. (1972) Am. Nat. 106, 645-668.
29. Farris, J. S. (1970) Syst. Zool. 19, 83-92.
30. Swofford, D. L. & Maddison, W. P. (1987) Math. Biosci. 87, 199-229.
31. Farris, J. S. (1989) Cladistics 5, 417-419.
32. Schuz, R., Heller, W. & Hahlbrock, K. (1983) J. Biol. Chem. 258, 6730-6734.
33. Kreuzaler, F. & Hahlbrock, K. (1975) Arch. Biochem. Biophys. 169, 84-90.
34. Lanz, T., Schroder, G. & Schroder, J. (1990) Planta 181, 169-175.
35. Elomaa, P., Helariutta, Y., Kotilainen, M. & Teeri, T. H. (1996) Mol. Breeding 2, 41-50.
36. Kauppinen, S., Siggaard-Andersen, M. & von Wettstein-Knowles, P. (1988) Carlsberg Res. Commun. 53, 357-370.
37. Verwoert, I. I. G. S., Verbree, E. C., Van der Linden, K. H., Nijkamp, H. J. J. & Stuitje, A. R. (1992) J. Bacteriol. 174, 2851-2857.
38. Kindl, H. (1985) in Biosynthesis and Biodegradation of Wood Components, ed. Higuchi, T. (Academic, New York), pp. 349-377.
39. Borejsza-Wysocki, W. & Hrazdina, G. (1994) Phytochemistry 35, 623-628.
