Directed Molecular Evolution of Proteins: or How to Improve Enzymes for Biocatalysis
Edited by Susanne Brakmann and Kai Johnsson
Copyright © 2002 Wiley-VCH Verlag GmbH & Co. KGaA
ISBNs: 3-527-30423-1 (Hardback); 3-527-60064-7 (Electronic)


Related Titles from Wiley-VCH

Kellner, R.; Lottspeich, F.; Meyer, H. E.
Microcharacterization of Proteins
1999
ISBN 3-527-30084-8

Bannwarth, W.; Felder, E.; Mannhold, R.; Kubinyi, H.; Timmermann, H.
Combinatorial Chemistry. A Practical Approach
2000
ISBN 3-527-30186-0

Gualtieri, F.; Mannhold, R.; Kubinyi, H.; Timmermann, H.
New Trends in Synthetic Medicinal Chemistry
2001
ISBN 3-527-29799-5

Clark, D. E.; Mannhold, R.; Kubinyi, H.; Timmermann, H.
Evolutionary Algorithms in Molecular Design
2000
ISBN 3-527-30155-0

The Editors of this volume

Dr. Susanne Brakmann
AG Angewandte Molekulare Evolution
Institut für Spezielle Zoologie
Universität Leipzig
Talstraße 33
04103 Leipzig, Germany

Prof. Dr. Kai Johnsson
Institute of Molecular and Biological Chemistry
Swiss Federal Institute of Technology Lausanne
CH-1015 Lausanne, Switzerland

Cover Illustration
Recent advances in automation and robotics have greatly facilitated the high throughput screening for proteins with desired functions. Among other devices, liquid handling tools are integral parts of most screening robots. Depicted are 96-channel pipettors for the microliter and submicroliter range (illustrations kindly provided by Cybio AG, Jena).

This book was carefully produced.
Nevertheless, editors, authors and publisher do not warrant the information contained therein to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate.

Library of Congress Card No.: applied for

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Die Deutsche Bibliothek CIP Cataloguing-in-Publication Data
A catalogue record for this publication is available from Die Deutsche Bibliothek.

© Wiley-VCH Verlag GmbH, Weinheim 2002

All rights reserved (including those of translation in other languages). No part of this book may be reproduced in any form by photoprinting, microfilm, or any other means nor transmitted or translated into machine language without written permission from the publishers.

In this publication, even without specific indication, use of registered names, trademarks, etc., and reference to patents or utility models does not imply that such names or any such information are exempt from the relevant protective laws and regulations and, therefore, free for general use, nor does mention of suppliers or of particular commercial products constitute endorsement or recommendation for use.

Printed on acid-free paper.
Printed in the Federal Republic of Germany.

Composition: Mitterweger & Partner Kommunikationsgesellschaft mbH, Plankstadt
Printing: betz-druck GmbH, Darmstadt
Bookbinding: Großbuchbinderei J. Schäffer GmbH & Co. KG, Grünstadt

ISBN 3-527-30423-1

Contents

List of Contributors

1 Introduction

2 Evolutionary Biotechnology: From Ideas and Concepts to Experiments and Computer Simulations
2.1 Evolution in vivo: From Natural Selection to Population Genetics
2.2 Evolution in vitro: From Kinetic Equations to Magic Molecules
2.3 Evolution in silico: From Neutral Networks to Multi-stable Molecules
2.4 Sequence-Structure Mappings of Proteins
2.5 Concluding Remarks

3 Using Evolutionary Strategies to Investigate the Structure and Function of Chorismate Mutases
3.1 Introduction
3.2 Selection versus Screening
3.2.1 Classical solutions to the sorting problem
3.2.2 Advantages and limitations of selection
3.3 Genetic Selection of Novel Chorismate Mutases
3.3.1 The selection system
3.3.2 Mechanistic studies
3.3.2.1 Active site residues
3.3.2.2 Random protein truncation
3.3.3 Structural studies
3.3.3.1 Constraints on interhelical loops
3.3.4 Altering protein topology
3.3.4.1 New quaternary structures
3.3.4.2 Stable monomeric mutases
3.3.5 Augmenting weak enzyme activity
3.3.6 Protein design
3.4 Summary and General Perspectives

4 Construction of Environmental Libraries for Functional Screening of Enzyme Activity
4.1 Sample Collection and DNA Isolation from Environmental Samples
4.2 Construction of Environmental Libraries
4.3 Screening of Environmental Libraries
4.4 Conclusions

5 Investigation of Phage Display for the Directed Evolution of Enzymes
5.1 Introduction
5.2 The Phage Display
5.3 Phage Display of Enzymes
5.3.1 The expression vectors
5.3.1.1 Filamentous bacteriophages
5.3.1.2 Other phages
5.3.2 Phage-enzymes
5.4 Creating Libraries of Mutants
5.5 Selection of Phage-enzymes
5.5.1 Selection for binding
5.5.2 Selection for catalytic activity
5.5.2.1 Selection with substrate or product analogues
5.5.2.2 Selection with transition-state analogues
5.5.2.3 Selection of reactive active site residues by affinity labeling
5.5.2.4 Selection with suicide substrates
5.5.2.5 Selections based directly on substrate transformations
5.6 Conclusions

6 Directed Evolution of Binding Proteins by Cell Surface Display: Analysis of the Screening Process
6.1 Introduction
6.2 Library Construction
6.2.1 Mutagenesis
6.2.2 Expression
6.3 Mutant Isolation
6.3.1 Differential labeling
6.3.2 Screening
6.4 Summary
Acknowledgments

7 Yeast n-Hybrid Systems for Molecular Evolution
7.1 Introduction
7.2 Technical Considerations
7.2.1 Yeast two-hybrid assay
7.2.2 Alternative assays
7.3 Applications
7.3.1 Protein-protein interactions
7.3.2 Protein-DNA interactions
7.3.3 Protein-RNA interactions
7.3.4 Protein-small molecule interactions
7.4 Conclusion

8 Advanced Screening Strategies for Biocatalyst Discovery
8.1 Introduction
8.2 Semi-quantitative Screening in Agar-plate Formats
8.3 Solution-based Screening in Microplate Formats
8.4 Robotics and Automation

9 Engineering Protein Evolution
9.1 Introduction
9.2 Mechanisms of Protein Evolution in Nature
9.2.1 Gene duplication
9.2.2 Tandem duplication
βα-barrels
9.2.3 Circular permutation
9.2.4 Oligomerization
9.2.5 Gene fusion
9.2.6 Domain recruitment
9.2.7 Exon shuffling
9.3 Engineering Genes and Gene Fragments
9.3.1 Protein fragmentation
9.3.2 Rational swapping of secondary structure elements and domains
9.3.3 Combinatorial gene fragment shuffling
9.3.4 Modular recombination and protein folding
9.3.5 Rational domain assembly: engineering zinc fingers
9.3.6 Combinatorial domain recombination: exon shuffling
9.4 Gene Fusion: From Bi- to Multifunctional Enzymes
9.4.1 End-to-end gene fusions
9.4.2 Gene insertions
9.4.3 Modular design in multifunctional enzymes
9.5 Perspectives

10 Exploring the Diversity of Heme Enzymes through Directed Evolution
10.1 Introduction
10.2 Heme Proteins
10.3 Cytochromes P450
10.3.1 Introduction
10.3.2 Mechanism
10.3.2.1 The catalytic cycle
10.3.2.2 Uncoupling
10.3.2.3 Peroxide shunt pathway
10.4 Peroxidases
10.4.1 Introduction
10.4.2 Mechanism
10.4.2.1 Compound I formation
10.4.2.2 Oxidative dehydrogenation
10.4.2.3 Oxidative halogenation
10.4.2.4 Peroxide disproportionation
10.4.2.5 Oxygen transfer
10.5 Comparison of P450s and Peroxidases
10.6 Chloroperoxidase
10.7 Mutagenesis Studies
10.7.1 P450s
10.7.1.1 P450cam
10.7.1.2 Eukaryotic P450s
10.7.2 HRP
10.7.3 CPO
10.7.4 Myoglobin (Mb)
10.8 Directed Evolution of Heme Enzymes
10.8.1 P450s
10.8.2 Peroxidases
10.8.3 CPO
10.8.4 Catalase I
10.8.5 Myoglobin
10.8.6 Methods for recombination of P450s
10.9 Conclusions

11 Directed Evolution as a Means to Create Enantioselective Enzymes for Use in Organic Chemistry
11.1 Introduction
11.2 Mutagenesis Methods
11.3 Overexpression of Genes and Secretion of Enzymes
11.4 High-Throughput Screening Systems for Enantioselectivity
11.5 Examples of Directed Evolution of Enantioselective Enzymes
11.5.1 Kinetic resolution of a chiral ester catalyzed by mutant lipases
11.5.2 Evolution of a lipase for the stereoselective hydrolysis of a meso-compound
11.5.3 Kinetic resolution of a chiral ester catalyzed by a mutant esterase
11.5.4 Improving the enantioselectivity of a transaminase
11.5.5 Inversion of the enantioselectivity of a hydantoinase
11.5.6 Evolving aldolases which accept both D- and L-glyceraldehydes
11.6 Conclusions

12 Applied Molecular Evolution of Enzymes Involved in Synthesis and Repair of DNA
12.1 Introduction
12.2 Directed Evolution of Enzymes
12.2.1 Site-directed mutagenesis
12.2.2 Directed evolution
12.2.3 Genetic damage
12.2.4 PCR mutagenesis
12.2.5 DNA shuffling
12.2.6 Substitution by oligonucleotides containing random mutations (random mutagenesis)
12.3 Directed Evolution of DNA polymerases
12.3.1 Random mutagenesis of Thermus aquaticus DNA Pol I
12.3.1.1 Determination of structural components for Taq DNA polymerase fidelity
12.3.1.2 Directed evolution of a RNA polymerase from Taq DNA polymerase
12.3.1.3 Mutability of the Taq polymerase active site
12.3.2 Random oligonucleotide mutagenesis of Escherichia coli Pol I
12.4 Directed Evolution of Thymidine Kinase
12.5 Directed Evolution of Thymidylate Synthase
12.6 O6-Alkylguanine-DNA Alkyltransferase
12.7 Discussion

13 Evolutionary Generation versus Rational Design of Restriction Endonucleases with Novel Specificity
13.1 Introduction
13.1.1 Biology of restriction/modification systems
13.1.2 Biochemical properties of type II restriction endonucleases
13.1.3 Applications for type II restriction endonucleases
13.1.4 Setting the stage for protein engineering of type II restriction endonucleases
13.2 Design of Restriction Endonucleases with New Specificities
13.2.1 Rational design
13.2.1.1 Attempts to employ rational design to change the specificity of restriction enzymes
13.2.1.2 Changing the substrate specificity of type IIs restriction enzymes by domain fusion
13.2.1.3 Rational design to extend specificities of type II restriction enzymes
13.2.2 Evolutionary design of extended specificities
13.3 Summary and Outlook

14 Evolutionary Generation of Enzymes with Novel Substrate Specificities
14.1 Introduction
14.2 General Considerations
14.3 Examples
14.3.1 Group 1
14.3.2 Group 2
14.3.3 Group 3
14.4 Conclusions

Index

List of Contributors

Prof. Dr. Frances H. Arnold
Chemical Engineering 210-41
California Institute of Technology
1201 East California Boulevard
Pasadena, California 91125, USA

Prof. Dr. Stephen Benkovic, Dr. Stefan Lutz
Department of Chemistry
The Pennsylvania State University
414 Wartik Laboratory
University Park, Pennsylvania 16802, USA

Prof. Dr. Uwe Bornscheuer
Institut für Chemie und Biochemie
Ernst-Moritz-Arndt-Universität
Soldmannstraße 16
17487 Greifswald

Prof. Dr. Virginia W. Cornish
Columbia University
Department of Chemistry
3000 Broadway, MC 3167
New York, NY 10027-6948, USA

Dr. Rolf Daniel
Institut für Mikrobiologie und Genetik
Georg-August-Universität
Grisebachstraße 8
37077 Göttingen

Prof. Dr. Jacques Fastrez
Laboratoire de Biochimie Physique et de Biopolymères
Université Catholique de Louvain
Place L. Pasteur, 1, Bte 1B
B-1348 Louvain-la-Neuve, Belgium

Prof. Dr. Donald Hilvert
ETH Hönggerberg
Laboratorium für Organische Chemie
HCI, F339
CH-8093 Zürich, Schweiz

Prof. Dr. Lawrence A. Loeb
Department of Pathology
School of Medicine
University of Washington
Box 357705
Seattle, Washington 98195-7705, USA

Prof. Dr. Alfred Pingoud
Institut für Biochemie
Justus-Liebig-Universität
Heinrich-Buff-Ring 58
35392 Giessen

Prof. Dr. Manfred T. Reetz
Max-Planck-Institut für Kohlenforschung
Kaiser-Wilhelm-Platz 1
45740 Mülheim

Prof. Dr. Peter K. Schuster
Institut für Theoretische Chemie und Strahlenchemie
Universität Wien
Währingerstraße 17
A-1090 Wien, Österreich

Dr. Andreas Schwienhorst
Institut für Mikrobiologie und Genetik
Georg-August-Universität
Grisebachstraße 8
37077 Göttingen

Prof. Dr. K. Dane Wittrup
Dept. Chemical Engineering & Div. Bioengineering and Environmental Health
Massachusetts Institute of Technology
Cambridge, MA 02139, USA

Subject Index

a
achilles' heel approach 314, 326
acyclovir 298
acyl-enzyme 99
aequorin 170
affinity
  labeling, selection 90, 96–98
    covalent 96
    cysteine 96
  maturation 115
agar plate
  assay 335
  format, screening 165
aldol condensation 97
aldolase 97, 275–277, 340
  D-2-keto-3-deoxygluconate (KDG) 276, 340
  2-keto-3-deoxy-6-phosphogluconate (KDPG) 275
  pyruvate 276
alkaline phosphatase 86
altering protein topology 46–51
amplification 29
  biotechnology, evolution 14
antibiotic
  catalytic, chorismate mutase 33–35, 41, 87
  marker, yeast two-hybrid assay 132
  selection 31
antibody 112
  catalytic (see also chorismate mutase) 33–35, 41, 87, 249
  phage display (see there) 81
aptamers 14
Arg90, chorismate mutase 38
AraC-LexA system 142
AroH-class, chorismate mutase 33
AroQ-class, chorismate mutase 33, 57
arthrobacter species 274
assay
  achilles' heel approach 314, 326
  agar plate assay 335
  automated
    analysis 324
    protocol in microplate format 323
  bacterial two-hybrid assay (see there) 127, 130–143
  biomedical applications 314
  cleavage assay 326
  colony assay 321
  coupled transcription/translation 320
  experimental genetics 314
  fluorescence assay 324, 326
  growth assay 335
  high-throughput approach 325
  His6-tagged 323
  HTS (high throughput screening) 259, 319
    ee assays 259
  molecular breeding 320
  Ni2+-NTA 323
  overlay assay 168
  phage display 320
  pipetting robot 324
  protein complementation assays (see there) 144
  purification 323
  restriction fragment length polymorphisms (RFLPs) 313
  ribosome display 320
  screening
    colorimetric assay 166
    overlay assay 168
    pH indicator assay 274
  shuffling 319
ATP-binding polypeptides 57
augmentation weak enzyme activity 51–52
automation 172–173
auxotrophic marker 136
avidity 82, 107
azidothymidine (AZT) 298

b
B2H (see bacterial two-hybrid) 142–143
BAC libraries 71
bacillus subtilis
  chorismate mutase 33
  lipase 272
bacterial
  display technique 25
  two-hybrid assay (B2H) 142–143
    AraC-LexA system 142
    bacterial activator system 142
    λ-repressor 142
bacteriophage 79–84
  absorption coefficient 84
  concentration 84
  f1 81
  fd 81
  g3p (see there) 81–82
  g8p (see there) 81–82
  M13 81
  infection/infectivity 81, 84
  lambda 83
  morphogenesis 80
  particle 81
  phagemids 81
  polyphage 84
  replicative form 82
  secretion 81
  T4 83
  western blot 84
βα-barrel 185, 186, 193, 337
  (βα)8-barrel 337
BCNU (1,3-bis(2-chloroethyl)-1-nitrosourea) 302
beta
  βα-barrel 185, 186, 193
  (βα)8-barrel 337
  β-galactosidase 86, 144, 338
  β-lactamase 86, 89, 99, 107
Bgl-I 312
bias 288
BIFL (burst integrated fluorescence lifetime) 170–173
binary patterning 53
biocatalysis 249
  screening strategies, biocatalyst discovery (see also screening) 163–164
biological information 8
bioluminescence 172
biopanning 80–82
1,3-bis(2-chloroethyl)-1-nitrosourea (BCNU) 302
blood coagulation 191
breeding of molecules 16
BsCM (B. subtilis, chorismate mutase) 38–40, 42–43
  kinetic studies 43
  redesign 39
  truncation 42–43
b-type enzymes 221
burkholderia cepacia 270
burst integrated fluorescence lifetime (BIFL) 170–173

c
calmodulin 103
carbonic anhydrase 86
carboxyl esterase (see also esterase) 333
carboxy-terminus 82
carotenogenic gene cluster 166
catalase I 220, 230, 240–241
catalysis/catalytic
  antibodies 33–35, 41, 87, 249
  elution, selection 107
  fidelity 283
  transition metal catalysis 249
CCP (cytochrome c peroxidase) 232, 239, 339
Cdc25 145
cDNA display, phage display 82
  fos 82
  jun 82
cell surface display 3, 111–124
  expression (see there) 114
  library construction (see there) 113–114
  mutant isolation (see there) 115–124
chaperonin
  GroEL 84
  GroES 84
chemical
  inducers of dimerization (see protein-small molecule interactions) 153
  tagging 14–15
chemoluminescence 172
chimera 2
chloromethylketones 98
chloroperoxidase (see CPO) 220, 228, 232–233
chorismate mutase 29–59
  AroH-class 33
  AroQ-class 33, 57
  B. subtilis (BsCM) 38–40, 42–43
    kinetic studies 43
    redesign 39
    truncation 42–43
  bacillus subtilis 33
  biochemical characterization 56
  catalytic antibody 33–35, 41, 87
    1F7 33, 41
  combinatorial
    mutagenesis 38–40
    randomization 56
  computation studies 37
  constraints on interhelical loops 44
  E. coli, EcCM 33–35, 41
  electrostatic stabilization 38–40
  engineered
    hexamer, EcCM 48
    monomer, EcCM 48–49
  3₁₀ helix 42
  hydrophobic core 49
  kinetic studies 39, 43
    Kcat values 54
    Km values 54
  loop
    importance of loop length 48
    insertion, EcCM 47
    insertion, MjCM 49
    preferences, EcCM 44
  mechanism 37
  mechanistic studies 37
  18O isotope effect at O 38
  origins of thermostability, EcCM 49
  phenotype-genotype, linkage 36
  reaction catalyzed 33–38
    isotope effects 38
  redesign, BsCM 39
  role of Arg90 38
  saccharomyces cerevisiae 33, 41
  selection system 35
    chorismate mutase-deficient bacterial strain 54
    KA12 34
    liquid 49
    pKIMP-UAUC 35
  size-exclusion chromatography 51
  structural studies 44
  transition state inhibitor 37
  uncatalyzed reaction 37
chromatography, size-exclusion, chorismate mutase 51
circular permutation 182, 186
  βα-barrel 186
  evolutionary advantages 186
CLERY (combinatorial libraries enhanced by recombination in yeast) 242
cloning
  of DNA fragments 31
  of functional genes from natural microbial consortia 65
    contamination of the purified DNA 65
    extraction of DNA from soils 65
CMCM (combinatorial multiple cassette mutagenesis) 267
coefficient of variance (CV) 121
combinatorial
  syntheses of chemical compound libraries 29
  techniques 1
complementation 293
  heterologous 72
compound I 224, 227, 235, 239, 241
coumermycin 153
coupling efficiency of P450s (uncoupling) 226, 234
covalent 96, 99
  intermediate 99
  selection 96
CPO (chloroperoxidase) 220, 228, 232–233
  directed evolution of 240
  instability to peroxide 233
  mutagenesis of 235
  reactions catalyzed 228
  stability to H2O2 233
Cre-Lox 83
cutoff fluorescence selection 119–120
  purity 120
  yield 120
CV (coefficient of variance) 121
cyclin-dependent kinase 2 147
cytochrome
  b-562 44
  c 220
    peroxidase (CCP) 232, 239, 339
  P450 monooxygenase 222–227, 231–235, 237–238, 241–243, 338–339
    catalytic cycle 225
    comparison to peroxidase 231–232
    coupling efficiency 226
    directed evolution of 237–238
    Km for H2O2 231–232
    mutagenesis of 233–235
    P450 1A1 241–242
    P450 1A2 241–242
    P450 2B4 235
    P450 BM3 237–238, 241
    P450cam 222, 226–227, 233–234, 237, 339
    P450 2E1 235
    P450SPa 232
    reactive intermediates 224
    reactions catalyzed 222, 224
    recombination of 241–242

d
D-2-keto-3-deoxygluconate (KDG) 276, 340
Darwinian evolution 8, 51, 57
DBD (see DNA binding domain) 127, 130–134
DD-peptidase 86
de novo design
  protein design 53
  rational design 27
  tailored enzymes 57
degenerate oligonucleotide 87
  DNA syntheses, oligonucleotide-directed 29
degree of neutrality 17
dehydrogenase 35
  activity 72
  oxidative dehydrogenation 230
  prephenate 35
dexamethasone
  dexamethasone-FK506 153
  dexamethasone-methotrexate CID 153
dGTP, 8-oxo-dGTP 288
differential labeling 115–118
digital imaging 167
directed evolution (see also evolution) 56–59, 249–278, 327
  back-crossing 266
  cassette mutagenesis 252
  colony picker 262
  DNA shuffling 252
  enzyme variants 250
  hot regions 267
  hot spots 265
  mutant gene 250
  phage display 261
  protein sequence space 262
  random mutagenesis (see there) 250, 290
  saturation mutagenesis 252
  selection methods 251
  site-directed mutagenesis 250
  specificity 316
  in vivo selection 261
display/display techniques 2, 25
  bacterial 25
  cell surface display 3, 111–124
  mRNA display 2
  phage display (see there) 1, 25, 79–108
  ribosomal 2, 25
  yeast display system 112
dithiothreitol (DTT) 101
diversity 1
DNA
  achilles' heel 314
  applied molecular evolution of enzymes involved in synthesis and repair of DNA 283–307
  artificial chromosomes 314
  binding domain 127
    and transcription activation domains, yeast two-hybrid assay 130–134
  binding protein 314
  cDNA (see there) 82
  cleavage domain 318
  cloning of DNA fragments 31
  degenerate oligonucleotide-directed DNA syntheses 29
  environmental (see there) 64–65, 165, 168
  extraction 66
  gene
    replacement 314
    targeting 314
  isolation of soil and sediment samples 67
  manipulation 326
  methylation 314
  methylphosphonate 317
  non-specific DNA 315
  O6-alkylguanine-DNA alkyltransferase 283, 302–304
  PNA-assisted rare cleavage 314
  polymerase 106, 284, 291–293, 295
    crystal structure 292
    directed evolution of RNA polymerase from Taq DNA polymerase 295
    E. coli 284, 296–297
    exonucleolytic proofreading 292
    fidelity 291
    low fidelity mutants 295
    mismatch repair 292
    motif A 292
    motif B 292
    motif C 293
    mutator 295
    replication 291
    temperature-sensitive mutant 293
    thermus aquaticus 284
      DNA Pol I 293–295
      mutability 296
      plasticity 296
  protein-DNA interactions (see protein) 149–150
  rat DNA Pol-β 293
  recognition domain 318
  repair and replication 283
  shuffling 2, 51, 87, 114, 194–196, 252, 289–290, 338–339
    chimeric genes 289
    directed evolution 252
    exon shuffling 205
    family DNA shuffling 195
    limitation 195
    mutagenesis methods 252
    of mutant DNA fragments 29
  triple-helix 314
    formation 314
domain
  3D-domain swapping 187
  recruitment 182, 188–190
    additional domains 189
    chemistry 190
    deleterious modification 189
    domain swapping 187
    regulation 189
    substrate specificity 189
dPTP 288
DTT (dithiothreitol) 101

e
EcoRI 312
EcoRV 312, 315
enantioselective enzymes 249–278
enantioselectivity (E) 98, 335, 337
  Quick E 337
  selectivity factor E 261
enantioselectivity of enzymes 3
endonuclease, restriction (see there) 311–327
engineering/engineered
  chorismate mutase 48–49
    hexamer, EcCM 48
    monomer, EcCM 48–49
  genes and gene fragments 191–206
    protein folding (see there) 198–203
    protein fragmentation (see there) 192–193
  in silico protein engineering 212
  protein evolution 181–212
  rational protein engineering 193–194
  reverse engineering 185, 192
environmental libraries (see also libraries) 2, 63–76
  collection 65
    sampling sites 65
    storage 65
    transport 65
  construction of 68–71
    activity-based screening 68
    BAC libraries 71
    cosmid library 69
    hosts 68
    insert size 68
    plasmid library 69
    protocols 69
    vector selection 68
  DNA (see also there) 64–65
    environmental DNA libraries 165, 168
    isolation 65
  follow-up analysis 75
  screening of 71–75
    activity-based strategies 71
    sequence-based approaches 74
enzyme 249
  applied molecular evolution of enzymes involved in synthesis and repair of DNA 283–307
  b-type enzymes 221
  enantioselective 249–278
  enantioselectivity of enzymes 3
  enzyme-catalyzed pericyclic reactions 33
  ITCHY (incremental truncation for the creation of hybrid enzymes) 196, 199
  oligomerization, role in enzyme function 187
  proximal ligand of heme enzymes 220, 227–228, 231, 235–238
  screening 164
  secretion of 252
  type II restriction enzymes 312, 315, 318
epicurian coli 335–336
  XL1-Red 273
epoxide hydrolase 254
epPCR (see error-prone)
equilibrium screening 111, 115
  ligand concentration 115–117
error
  model 12
  rate of replication 12
  threshold 12, 14
    critical replication accuracy 14
    phase transition 14
error-prone PCR 1, 87, 113, 251, 338
Escherichia coli (E. coli) 33–35, 41
  chorismate mutase (EcCM) 33–35, 41
    loop insertion 47
    loop preferences 44
  DNA polymerase I 296–297
esterase 273, 333
  carboxyl esterase 333
  lipase-esterase activity 72
evolution/evolutionary process
  amino acid exchange 317
  ancestral proteins 312
  applied molecular evolution of enzymes involved in synthesis and repair of DNA 283–307
  biotechnology 5–27
    amplification 14
    diversification 14
    selection 14
  constants 17
  chorismate mutases (see there) 29–59
  Darwinian 8, 51, 57
  design 6, 327
  directed (see there) 56–59, 249–278, 327
  efficiency 17
  evolutionary pressure 317
  gene shuffling 327
  heme enzymes, directed evolution 237–242
  homologous recombination 318
  neutral 7, 16
  optimization, evolutionary 165
  population 17
  protein biosynthesis 317
  punctuated 24
  selective advantage 314
  in silico 16
  size 17
  strategies 29
  success 17
  techniques 1
  in test tube 8
  in vitro 323
exon shuffling (in nature) 190–191, 204–205
  DNA shuffling 205
  lox-Cre recombination 204
  mechanism 190
  significance 190
  trans-splicing
    group II intron ribozymes 204
    inteins 205
exploitation of natural products 71
expression 114
  eucaryotic secretory pathway 114
  non-eucaryotic 114
  posttranslational events 114
  proteolysis 114
  solubility 114
  stability maturation 114
  surface expression level 114

f
f1 phagemide, bacteriophages 81
FACS-based screening 164
FCS (fluorescence correlation spectroscopy) 171–174
fd phagemide, bacteriophages 81
5FdUR 301
fibrinolysis 191
FIDA (fluorescence intensity distribution analysis) 171–174
fidelity
  of catalysis 283
  conformational changes 294
  DNA polymerases 291
  dNTP binding step 294
fitness 5–6
  differential 5
  mean 6
fitness, landscape 285
FK506 153
FKBP12-rapamycin-associated protein 154
FKBP12-rapamycin-binding domain 154
FkpA 85
flow
  cytometry 120
  reactor 10
fluorescamine 170–172
fluorescent/fluorescence
  assay 324, 326
  burst integrated fluorescence lifetime (BIFL) 170–173
  correlation spectroscopy (FCS) 171–174
  cutoff fluorescence selection (see there) 119–120
  fluorescein fluorescence 122–123
  green fluorescent protein (GFP) 166
  intensity distribution analysis (FIDA) 171–174
  labeling 115
    intensity 115
    two-color fluorescent labeling 124
  polarization 171
  resonance energy transfer (FRET) 129, 171
  screening, fluorescence-based 171
5-fluorouracil (5-FU) 299
Fok-I 312, 318
follow-up analysis, environmental libraries 75
fos 82
mid-Fourier transform infrared spectroscopy (infrared spectroscopy) 173
fractionation method 66
FRET (fluorescence resonance energy transfer) 129, 171
fusion protein 82
  proteolytic degradation 85
  western blot 84

g
g3p
  infection, bacteriophages 81–82
  signal sequence 82
g8p
  infection, bacteriophages 81–82
  signal sequence 82
galactosidase 144, 334
  β-galactosidase 86, 144, 338
  protein complementation assays 144
ganciclovir 298
gene
  carotenogenic gene cluster 166
  combinatorial gene fragment shuffling 194
  duplication 183–184
    βα-barrels 185
    isolated copies 183
    mechanisms 183
    outcomes 183
    proteases 184–185
    tandem 183
  fusion 182, 188
    aromatic amino acid pathway 188
    concerted expression 188
    end-to-end 207
      in-frame 207
      reporter gene 207
      solubility 207
      transporter proteins 207
    insertion 207–209
      allosteric regulation 208
      biosensor 208
    regulatory function 188
    substrate channeling 188
  overexpression of genes 252
  recruitment 184
  sharing 184
  shuffling 6
genetic
  algorithms 277
  damage 287–288
    alkylation 288
    chemicals 288
    deamination 288
    frameshifts 288
    intercalation 288
    transitions 288
    transversions 288
    X-rays 288
  population (see there) 5
  selection 30
genotype-phenotype mapping 10, 17
geotrichum candidum 270
GFP (green fluorescent protein) 166
Gibbs reagent 237
glutathione
  transferase 103
  S-transferase 86
glycinamide ribonucleotide transformylase (PurN) 201
glycosidase 99
growth assay 335
GTPase 86
guaiacol 339
guanyl nucleotide exchange factor 145

h
halo formation screening 167
Hamming distance, sequence space 8
hapten 94
helix/helical
  3₁₀ helix, chorismate mutase 42
  constraints on interhelical loops 44
  interhelical turn sequences 51
  selection advantage 51
heme enzymes 3, 219–242
  chloroperoxidase (see there) 232–233
  comparison of P450s and peroxidases 231–232
  cytochrome P450 (see there) 222–227
  directed evolution (see evolution) 237–242
  heme proteins 220–221
  mutagenesis studies 233–236
  peroxidases (see there) 227–231
  proximal ligand of 220, 227–228, 231, 235–238
hemoglobin (Hb) 228
hepatitis delta virus RNA 21
herpes simplex virus type 1 (HSV-1) 297
heterologous complementation 72
high throughput
  instrumentation 115
  screening (see HTS) 32, 58, 73, 165, 250, 254–262, 274
HIS3 reporter 143
HIS6RPro Mnt variant 149
HIV
  protease 149
  reverse transcriptase 293
homogenous time resolved fluorescence 171–173
horseradish peroxidase (HRP) 167, 227, 230, 232, 235, 239–240
HTS (high throughput screening) 32, 58, 73, 165, 250, 254–262, 274
  automation 259
  capillary array electrophoresis 254
  chemical sensors 261
  circular dichroism 259
  desymmetrization 256
  ee assays 259
  electrospray ionization mass spectrometry 254
  fluorescence 259
  gas chromatography 254
  IR-thermography 254
  laser-induced fluorescence detection (LIF) 257
  microchips 256
  microtiter plates 257
  pH indicator assay 274
  pseudo-enantiomers 254
  pseudo-prochiral compounds 254
  reaction microarrays 260
  robotics 259
hybrid
  bacterial two-hybrid (B2H) 143
  ITCHY (incremental truncation for the creation of hybrid enzymes) 196, 199
  split-hybrid system 137
  yeast n-hybrid systems for molecular evolution 127–158
  yeast two-hybrid assay (see there) 127, 130–141
hydantoinase 274
hypercube, sequence space 8

i
IGPS (indole-3-glycerol phosphate synthase) 337
improving enzyme activity 51–52
in silico
  evolution 16
  protein engineering 212
indigo 338
indirubin 338
indole-3-glycerol phosphate synthase (IGPS) 337
infrared spectroscopy (mid-Fourier transform infrared spectroscopy) 173
instrumentation, high-throughput 115
intersection theorem 19
iron response protein-iron response element interaction 152
ITCHY (incremental truncation for the creation of hybrid enzymes) 196, 199

j
Jun 82

k
KA12 (see also chorismate mutase) 35
Kcat values, chorismate mutase 54
Km values
  chorismate mutase 54
  for H2O2, cytochrome P450s 231
KDG (D-2-keto-3-deoxygluconate) 276, 340
kinetic
  equations 10
    fitness weighting term 10
    parallel reactions 10
    production terms 10
  folding 21
    algorithm 21
    elementary steps 21
  resolution 249
  screening 115–117
    competition 117
    label intensity 117
    unlabeled ligand competitor 117

l
labeling
  differential 115–118
  fluorescent 115
β-lactamase 86, 89, 99, 107
λ-bacteriophage 83
library 1
  of altered genes 288
  BAC libraries 71
  CLERY (combinatorial libraries enhanced by recombination in yeast) 242
  combinatorial 29–30, 87
  combinatorial syntheses of chemical compound libraries 29
  construction 113–114
  cosmid 69
  creation 88
  diversity 88, 11
  environmental (see there) 2, 63–76
  plasmid 69
  quality 88
  screen random libraries of RNA molecules 152
  selection from large combinatorial libraries 33
  transformation 88
lipase 3, 253, 333
  α/β hydrolase fold 270
  bacillus subtilis lipase 272
  lipase-esterase activity 72
  conformational flexibility 270
  oxyanion hole 271
liquid handling 173–175
low copy plasmids 52
luciferase 172
luminescence 172
  bioluminescence 172
  chemoluminescence 172
lysis, direct lysis of cells 66
  method 66
lysozyme 86

m
M13 phagemide, bacteriophages 81
maleimide 106
marker
  antibiotic 132
  auxotrophic 136
  counter-selective 132
master sequence 9–16
Mb (myoglobin) 221, 228, 236–238, 241
  directed evolution of 241
  mutagenesis of 236
  peroxygenase activity 236
metagenome 63
L-methionine 274
methotrexate homodimer 153
Michael addition 94
microbial
  culture collection 164
  diversity 63–66
    cloning of functional genes from natural microbial consortia (see there) 65
    recovery or fractionation of microbial cells 66
microplate (microtiter plate) 168, 175
misincorporations 288
MNNG (N-methyl-N'-nitro-N-nitroguanidine) 302
molecular evolution 1
  biopolymers 5
  concepts 5
  origins 5
  phylogeny 6
monovalent display 83
mRNA display 2
MS2 coat protein-stem-loop RNA interaction 152
multi-stable molecules 16
mutagenesis 113–114, 250–251, 285
  chorismate mutase, combinatorial mutagenesis 38–40
  combinatorial multiple cassette mutagenesis (CMCM) 267
  CPO (chloroperoxidase) 235
  cytochrome P450 monooxygenase 233–235
  directed evolution 320
  DNA shuffling 252, 285
  gene shuffling 327
  heme enzymes, mutagenesis studies 233–236
  methods 251
    cassette mutagenesis 252
    error-prone polymerase chain reaction (epPCR) 251, 327, 338
    saturation mutagenesis 252
  Mb (myoglobin) 236
  molecular breeding 320
  nucleoside analogues 327
  optimal mutagenesis rate 113
  peroxidase 235
  polymerase chain reaction (see PCR) 285, 288–289
  protein folding 202
  random (see there) 194, 250, 285, 290–291
  saturation (see there) 1, 113, 252, 337
  shuffling 320
  site-directed 57, 250, 285
  specificity 316
  spiked oligodeoxynucleotide 323
  in vitro evolution 323
mutant
  cloud 11
  isolation 115–124
    differential labeling 115–118
    screening 119–124
mutation 6, 29, 290
  adaptive 6
  degenerative oligonucleotides 290
  rates 6
  selectively neutral 6
  within limited regions 290
mutator strain 335
myoglobin (see Mb) 221, 228, 236–238, 241

n
Na+/H+ antiporter 72
NADH 222, 226
NAD(P)H 222, 226
nanoplate (silicon wafer) 175–177
natural
  evolution 263
  exploitation of natural products 71
  selection 5
networks, neutral 16, 18–21
  connected/connectedness 19, 21
  extended 21
neural networks 277
neutral
  degree of neutrality 17
  evolution 7, 16
  networks (see there) 16, 18–21
  selectively neutral mutation 6
n-hybrid, yeast n-hybrid systems for molecular evolution 127–158
N-methyl-N'-nitro-N-nitroguanidine (MNNG) 302
non-ribosomal peptide synthetases (see NRPSs) 208–209, 211
normalization 122
novel substrate specificities 331–340
NRPSs (non-ribosomal peptide synthetases) 208–209, 211
  combinatorial approaches 211
  domain exchange 209
  module exchange 209
  rational engineering 209, 211
nuclease 86
nucleotide analogs 288–290

o
18O isotope effect at O 38
Qβ-replicase 8
O4-methylated thymine (O4mt) 304
O6-alkylguanine-DNA alkyltransferase 283, 302–304
O6-benzylguanine (BG) 303
oligomerization 182, 187
  role in enzyme function 187
  substrate shielding 187
oligonucleotide, degenerated 87
  DNA syntheses, degenerate oligonucleotide-directed 29
overlay assay 168
oxidative
  dehydrogenation 230
  halogenation 230
8-oxo-dGTP 288

p
panning procedure 2
PCR (polymerase chain reaction) 1, 87, 113, 251, 288–289
  error-prone (epPCR) 1, 87, 113, 251, 338
    mutagenesis methods 251
  mutagenesis 288–289
    error rate 288
    manganese 288
    mutation frequency 288
    Taq polymerase 288
  sexual PCR 289
PDZ domain with new specificities 149
penicillin
  acylase 86
  penicillin G 167, 170
peptides that bound target proteins 147
percolation phenomenon 19
peroxidase 220–224, 227–231, 235, 238–240
  chloroperoxidase (see there) 220, 232–233
  comparison to cytochrome P450s 231–232
  cytochrome c peroxidase (CCP) 232, 239, 339
  directed evolution of 238–240
  disproportionation 230
  horseradish (HRP) 167, 227, 230, 232, 235, 239–240
  mechanism 227–231
  mutagenesis of 235
  reactions catalyzed 227–228
peroxide
  disproportionation 230
  shunt pathway (see also peroxygenase) 226–227, 231–232, 237–238
peroxygenase 227, 231–233, 236
  activity, myoglobin (Mb) 236
pH indicator 166
  assay 274, 335
  HTS 274
phage display 1, 25, 79–108
  antibodies 81
    Fab 81
    Fv 81
  bacterial Sec system 82
  cDNA display 82
  of enzymes 81–87
  monovalent display 83
  phage shock promotor 83
  polyvalent display 82
  technique 25
phage-enzymes, selection 89–107
  affinity 89
  binding 89
  biopanning 89
phenotype 31
  mapping 10
  space 17
phospholipase 86, 339
phospholipid 334
phosphonate, transition-state analogues 94
phosphonylating agents 102
phosphoribosylanthranilate isomerase (PRAI) 337
pKIMP-UAUC (see also chorismate mutase) 35
PKSs (polyketide synthetases) 208–210
  combinatorial engineering 210
  domain exchange 209
  iterative (type II) PKSs 210
  minimal PKSs 210
  module exchange 209
  rational engineering 209
plasminogen 86, 98
polyketide synthetases (see PKSs) 208–210
polymerase 86, 106
  chain reaction (see PCR) 1, 87, 113, 251, 288–289
  DNA polymerase (see there) 106, 284, 291–293
population genetics 5
  differential equations 5
  recombination 5
PRAI (phosphoribosylanthranilate isomerase) 337
prephenate
  dehydratase 35
  dehydrogenase 35
probability 119
  poisson random number 119
  statistical confidence 119
proFARI 337
protease 86
protein
  ancestral 312
  complementation assays 144
    adenylate cyclase 144
    β-galactosidase 144
    dihydrofolate reductase 144
    ubiquitin 144
  degradation, ubiquitin-mediated 145
  engineering 165, 193–194
  evolution in nature, mechanisms of 182–191
    blood coagulation 191
    circular permutation (see there) 182, 186
    domain recruitment (see there) 182, 188–190
    exon shuffling in nature (see there) 190–191
    fibrinolysis 191
    gene duplication (see there) 183–184
    gene fusion (see there) 182, 188
    modular protein evolution 191
    oligomerization (see there) 182, 187
    tandem duplication 184–186
  folding 198–203
    in the context of protein engineering 198
    folding pathway 202
    modular engineering 200
    mutagenesis 202
    problem of incorrect folding 200
  fragment complementation 192
  fragmentation 192
    βα-barrel 193
    permissive sites 192
  green fluorescent protein (GFP) 144, 166
  homogenous time resolved fluorescence 171–173
  in silico protein engineering 212
  localization, to detect protein-protein interactions 145
    Cdc25 145
    guanyl nucleotide exchange factor 145
    Ras nucleotide exchange factors 145
  protein-DNA interactions 149–150
    contrast to phage display 150
    HIS6RPro Mnt variant 149
    zinc finger variants 150
  protein-protein interactions 147–149
    HIV protease 149
    identify peptides that bound target proteins 147
    inhibit cyclin-dependent kinase 2 147
    PDZ domain with
new specifities 149 for the retinoblastoma protein 147 protein-RNA interactions 150153 iron response protein iron responseelement interaction 152 MS2 coat protein-stem-loop RNAinteraction 152 screen random libraries of RNAmolecules 152 protein-small molecule interactions153155 coumermycin 153 dexamethasone-FK506 153 dexamethasone-methotrexate CID 153Subject Index 353 FKBP12repamycin-binding domain154 FKBP12-repamycin-associated protein154 FK506 153 methotrexate homodimer 153 Tat-TAR interaction 152 sequence space 2526, 53 holes 26 occurence of function 26 simple lattice models 26 SHIPREC (sequence-homology indepen-dent protein recombination) 196,241242 topology, altering 4651proteolytic degradation of fusion protein 85protoporphyrin IX 220221proximal ligand of heme enzymes 220,227228, 231, 235238pseudomonas (p.) p. aeruginosa 252 p. fluorescence 273PurN (glycinamide ribonucleotide trans-formylase) 201push-pull mechanism 228229, 239 of O-O bound cleavage 239putidaredoxin 222qquasi-species 916 mutant distribution 12 stable stationary solution 12Quick E 337rrandom mutagenesis 194, 250, 290291 genetic damage 285 oligonucleotide 284285, 290 promotor sequences 291 protein truncation 4243 replication 13 walk, diffusion 7Ras nucleotide exchange factors 145rat DNA Pol-b 293reactive immunization, selection 97REBASE, Database 313recombination 1, 241242 CLERY (combinatorial libraries enhancedby recombination in yeast) 242 cytochrome P450s 241242 lox-Cre recombination 204 population genetics 5 sexual 29, 51 SHIPREC (sequence-homology indepen-dent protein recombination) 196,241242reductase 86, 144, 222reeingineering quaternary structures 47regioselectivity 94relay series 22repamycin, FKBP12repamycin-bindingdomain 154replica plating 74repressed transactivator system 137resorufin esters 336restriction endonucleases with novel specifi-city (see also RM systems) 311327 8 base pair cutters 313 Bgl-I 312 chimeric nuclease 318 crystal structures 312 DNA cleavage domain 318 DNA 
recognition domain 318 EcoRI 312 EcoRV 312, 315 Fok-I 312, 318 gene technology 313 PNA-assisted rare cleavage 314 rare cutters 314 recognition site 314 restriction fragment length poly-morphisms (RFLPs) 313 type II restriction endonuclease 312, 326 enzymes 312, 315, 318retinoblastoma protein 147retrovirus mediated gene transfer 302reverse engineering 185 Y2H system 137RFLPs (restriction fragment length poly-morphisms) 313rhizomucor miehei 270ribonuclease A 86ribonucleotide transformylase, glycinamide(PurN) 201ribosome/ribosomal display 2, 25 inefficient ribosome binding sites 52ribozymes 14, 278RM systems (see also restriction endo-nucleases) 311327 methyltransferase activity 311 restriction endonuclease activity 311 selfish genetic elements 312Subject Index 354RNA folding 17 hepatitis delta virus RNA 21 information carrier 16 magic molecule 16 minimal free energy 17 mRNA (see there) 2 protein-RNA interactions (see protein)150153 rRNA (see there) 64 secondary structures 17 world 16robotics 173174 screening, robotic 120Rop 4416S rRNA 64ssaccharomyces cerevisae, chorismate mu-tase 33, 41saturation mutagenesis 1, 113, 252, 337 hotspot 113scaffoldings 290SCRATCHY 198199 homology-independent fragment shuf-fling 198screening 2, 3031, 58, 111124, 163176 agar plate format 165 colorimetric assay 166 of environmental libraries (see there)7175 enzyme/enzyme activity 6376, 164 equilibria (see there) 111, 115 facilitated 31 FACS-based 164 fluorescence-based 171 halo formation 167 high throughput (see HTS) 32, 58, 73, 165,250, 254262, 274 kinetic (see there) 111, 115117 overlay-assay 168 oversampling 119 robotic 120 procedure 2 screen random libraries of RNA mole-cules 152 solution-based 167 statistic 111 strategies for biocatalyst discovery163164 visual signal 166scytalone dehydratase 166Sec system, bacterial 82selection 2, 2930, 165167 advantages 3233, 46 in vivo selection schemes 58 affinity 81, 8990, 93 labeling 90, 9698 enantioselectivity of enzymes 3 antibiotic 31 binding 
89 biopanning 89 biotechnology 14 catalytic activity 90107 elution 107 covalent 96, 99 cutoff fluorescence selection (see there)119120 cystein 96 enantioselectivity 98 genetic 30 from large combinatorial libraries 33 limitations 3233 mutation, selectively neutral 6 natural 5 of phage-enzymes (see there) 89107 product analogues 90 reactive immunization 97 regioselectivity 94 substrate 90, 102 turnover 102 with suicide substrates 98102 sulfonamide 93 transition-state analogues (see there)9295 unanticipated or undesired solutions 33 vector 68selectivity factor E 261SELEX 1, 14sequence sequence - structure relations 17 space 2, 53, 284286 Hamming distance 8 hypercube 8 protein sequence space (see there)2526, 53 structure mappings of proteins 2526sexual recombination 29, 51shape space covering 18shikimate pathway 33, 34SHIPREC (sequence-homology independentprotein recombination) 196, 241242shuffling combinatorial gene fragment shuffling194Subject Index 355 DNA (see there) 2, 51, 87, 114, 194196,252, 289290, 338339 family shuffling 289 of mutant DNA fragments 29 mutagenesis 327 SCRATCHY, homology-independentfragment shuffling 198199r-complex 94silicon wafer (nanoplate) 175177single pass 121Skp 85small molecule-protein interactions (see pro-tein) 153155SNase (staphylococcal ribonuclease) 102solution-based screening 167specificity 189, 315320 artificial substrates 326 canonical site 319 chimeric nuclease 318, 326 direct readout 316 directed evolution 316 domain recruitment, substrate specificity189 extended 320321 flanking sequence effect 319 indirect readout 316 intra- and intermodular interactions 326 modified substrates 317 novel substrate specificities (see there)331340 rational design 315 relaxed 317 site directed mutagenesis 316 transition state analogs 315, 320, 327 water-mediated contact 319 zinc fingers 318split-hybrid system 137staggered extension process (StEP) 339staphylococcal ribonuclease (SNase) 102StEP (staggered extension process) 339stereoselective 
1 Introduction

Kai Johnsson and Susanne Brakmann

The application of evolutionary and combinatorial techniques to study and solve
complex biological and chemical problems has become one of the most dynamic fields in chemistry and biology. The book presented here is a loose collection of articles aiming to provide an overview of the current state of the art of the directed evolution of proteins as well as highlighting the challenges and possibilities in the field that lie ahead.

Although the first examples of directed molecular evolution date back to the pioneering experiments of S. Spiegelman et al. and of M. Eigen and W. Gardiner, who proposed that evolutionary approaches be adapted for the engineering of biomolecules [1, 2], it was the success of methods such as phage display for in vitro selection of peptides and proteins as well as the selection of functional nucleic acids using the SELEX procedure (Systematic Evolution of Ligands by Exponential enrichment) that brought the power of this concept to the attention of the general scientific community [3, 4]. In the last decade, directed evolution has become a key technology for biomolecule engineering. The success of the evolutionary approach, however, not only depends on the potency of the method itself but is also a result of the limitations of alternative approaches, as our lack of understanding of the structure-function relationship of proteins in general hinders the rational design of biomolecules with new functions.

What are the prerequisites for a successful directed evolution experiment? In its broadest sense, (directed) evolution can be considered as repeated cycles of variation followed by selection. In the first chapter of the book, the underlying principles of this concept and their application to the evolutionary design of biomolecules are reviewed by P. Schuster, one of the pioneers in the field of molecular evolution. Naturally, the first step of each evolutionary project is the creation of diversity.
The most straightforward approach to create a library of proteins is to introduce random mutations into the gene of interest by techniques such as error-prone PCR or saturation mutagenesis. The success of random mutagenesis strategies is witnessed by their ample appearances in the different chapters of this book describing case studies of particular classes of proteins and enzymes. In addition, recombination of mutant genes by DNA shuffling or related techniques can be used to create additional diversity and to rapidly accumulate beneficial and additive point mutations [5]. This is a key technique that also surfaces in the majority of the chapters. The sequence space searched by these approaches is, however, quite limited. DNA shuffling between homologous genes, which has also been called family shuffling, allows yet unexplored regions of sequence space to be accessed [6]. In the chapter by S. Lutz and S. J. Benkovic, an approach to create chimeras even between non-homologous genes and its application in protein engineering is described.

An interesting alternative to the generation of libraries with in vitro methods is the generation of so-called environmental libraries, described by R. Daniel. Here, advantage is taken of natural microbial diversity by isolating and cloning environmental DNA and by using the resulting libraries to search for novel biocatalysts.

After the creation of diversity, i.e. the generation of a library of different mutants, the protein(s) with the desired phenotype (function or activity) have to be selected from the library. This can be achieved by either selection or screening procedures.
The principal advantage of selection is that much larger libraries can be examined: the number of clones that can be subjected to selection is, in general, five orders of magnitude above the number that can be sorted by advanced screening methods. Impressive examples of the power of true selection, where the survival of the host is directly coupled to the desired phenotype, can be found in the chapters written by D. Hilvert et al. and J. F. Davidson et al. The major challenge of most selection approaches is to couple the desired phenotype, such as the catalysis of an industrially important reaction, to the survival of the host. But what can be done if the desired phenotype cannot provide a direct selective advantage to a given host organism? Different approaches appear feasible: if the desired property is binding to a given molecule, display systems for the protein of interest such as phage display, ribosomal display or mRNA display, and the subsequent in vitro selection of binders by so-called panning procedures are established technologies [3, 7, 8]. A recent publication by the group of J. W. Szostak describes the in vitro selection of functional proteins from libraries of completely randomized 80mers (actual library size approximately $10^{13}$) using mRNA display. This work highlights the power of in vitro selection, and is a striking example of an experiment that would simply be impossible to perform using screening procedures [9]. In the chapter written by P. Soumillion and J. Fastrez, an interesting extension of this approach, the in vitro selection of novel enzymatic activities using phage display, is reviewed. Here, clever selection schemes link the immobilization of the phage to the desired reactivity.

Another approach to the selection of biomolecules with novel functionalities, i.e. binding, or even enzymatic activity, is based on the yeast two- and three-hybrid system.
The potential and limitations of these and related approaches are reviewed in the chapter contributed by V. W. Cornish et al.

Despite their inferiority in terms of the number of clones examined, screening procedures have become increasingly important over the last years. One important reason for this is the enormous technological progress that has been achieved in automation and miniaturization, allowing up to $10^6$ different mutants to be screened in a reasonable timeframe. An overview of advanced screening strategies is given in the article by A. Schwienhorst. In the chapter written by K. D. Wittrup, the prerequisites for a successful screening process are discussed, analyzing the outcome of the directed evolution of proteins displayed on cell surfaces as a function of the screening conditions. The power of intelligently designed screening processes is demonstrated in the following contributions: M. T. Reetz and K.-E. Jaeger describe screening techniques to engineer the enantioselectivity of enzymes; T. Lanio et al. present their approaches for the evolutionary generation of restriction endonucleases; U. T. Bornscheuer reports on the functional optimization of lipases; and last but not least, P. C. Cirino and F. H. Arnold give an overview of directed evolution experiments with heme enzymes.

Clearly, there are various developments and applications in the field of directed evolution that are not covered by any of the articles published in this book. Nevertheless, we hope to provide a snapshot of this rapidly developing field that will inspire and support scientists with different backgrounds and intentions in planning their own experiments.

Finally, we would like to thank all authors for their contributions, and P. Gölitz and K. Kriese of Wiley-VCH for their continuous motivation and help in getting this book published.

References

[1] S. Spiegelman, I. Haruna, I. B. Holland, G. Beaudreau, D. Mills, Proc. Natl. Acad. Sci. USA 1965, 54, 919–927.
[2] M.
Eigen, W. Gardiner, Pure Appl. Chem. 1984, 56, 967–978.
[3] G. P. Smith, Science 1985, 228, 1315–1317.
[4] a) C. Tuerk, L. Gold, Science 1990, 249, 505–510; b) A. D. Ellington, J. W. Szostak, Nature 1990, 346, 818–822.
[5] W. P. Stemmer, Nature 1994, 370, 389–391.
[6] A. Crameri, S. A. Raillard, E. Bermudez, W. P. Stemmer, Nature 1998, 391, 288–291.
[7] J. Hanes, A. Plückthun, Proc. Natl. Acad. Sci. USA 1997, 94, 4937–4942.
[8] R. W. Roberts, J. W. Szostak, Proc. Natl. Acad. Sci. USA 1997, 94, 12297–12302.
[9] A. D. Keefe, J. W. Szostak, Nature 2001, 410, 715–718.

2 Evolutionary Biotechnology - From Ideas and Concepts to Experiments and Computer Simulations

Peter Schuster

Research on biological evolution entered the realm of science in the 19th century with the centennial publications by Charles Darwin and Gregor Mendel. Molecular models for evolution under controlled conditions became available only in the second half of the twentieth century after the initiation of molecular biology. This chapter presents an account of the origins of molecular evolution and develops the concepts that have led to successful applications in the evolutionary design of biopolymers with predefined properties and functions.

2.1 Evolution in vivo - From Natural Selection to Population Genetics

Nature is the unchallenged master in design by variation and selection, and since Charles Darwin's epochal publication of the Origin of Species [1, 2] the basic principles of the mechanism behind natural selection have become known. Darwin deduced his principle of evolution from observations in the field and compared species adapted to their natural habitats with the results achieved through artificial selection by animal breeders and in nursery gardens. Natural selection introduces changes in populations by differential fitness, which is tantamount to the instantaneous differences in the numbers of descendants between two competing variants.
In artificial selection the animal breeder or the gardener interferes with the natural selection process by discarding the part of the progeny with undesired properties. Only shortly after the publication of Darwin's Book of the Century the quantitative rules of genetics were discovered by Gregor Mendel [1, 2]. It took, nevertheless, about seventy years before Darwin's theory was united successfully with the consequences of Mendel's results in the development of population genetics [2, 3].

The differential equations of population genetics are commonly derived for sexually replicating species and thus deal primarily with recombination as the dominant source of variation. Mutation is considered as a rather rare event. In evolutionary design of biopolymers the opposite is true: mutation is the common source of variation and recombination occurs only in special experiments, gene shuffling [4], for example. In the formulation of the problem we shall consider here the asexual case exclusively. The mathematical expression dealing with selection through differential fitness is then of the form

$$\frac{dx_k}{dt} = x_k \Big( f_k - \sum_{j=1}^{n} f_j x_j \Big) = x_k \, ( f_k - \bar{f} ), \qquad k = 1, 2, \ldots, n. \tag{1}$$

The fraction of variant $I_k$ is denoted by $x_k$ with $\sum_k x_k = 1$; $f_k$ is its fitness value. Accordingly, we introduced $\bar{f} = \sum_k f_k x_k$ as the mean fitness of the population. The mathematical role of $\bar{f}$ is to maintain the normalization of variables. The interpretation of Eq. (1) is straightforward: whenever the differential fitness, $f_k - \bar{f}$, of a variant $I_k$ is positive, that is, its fitness is above average, $f_k > \bar{f}$, $dx_k/dt$ is positive and this variant will increase in frequency.
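The behaviour described by Eq. (1) can be checked numerically. The following is a minimal sketch, assuming a forward-Euler integration with a fixed step size and three hypothetical fitness values; the function names, the step size and the numbers are illustrative choices and do not come from the chapter.

```python
def selection_rhs(x, f):
    """Right-hand side of Eq. (1): dx_k/dt = x_k * (f_k - mean fitness)."""
    mean_fitness = sum(fk * xk for fk, xk in zip(f, x))
    return [xk * (fk - mean_fitness) for fk, xk in zip(f, x)]


def integrate_selection(x0, f, dt=0.01, steps=2000):
    """Integrate Eq. (1) with forward-Euler steps (an assumed, simple scheme)."""
    x = list(x0)
    for _ in range(steps):
        dx = selection_rhs(x, f)
        x = [xk + dt * dxk for xk, dxk in zip(x, dx)]
        # Renormalize to guard against accumulated round-off error.
        total = sum(x)
        x = [xk / total for xk in x]
    return x


# Hypothetical fitness values for three variants starting at equal fractions.
fitness = [1.0, 1.5, 2.0]
x_start = [1 / 3, 1 / 3, 1 / 3]
x_final = integrate_selection(x_start, fitness)
```

With these assumed values the variant of highest fitness takes over the population, while the variant whose fitness equals the initial mean neither grows nor shrinks at the first step.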
The opposite is true if $f_k < \bar{f}$ [...]

2.3 Evolution in silico - From Neutral Networks to Multi-stable Molecules

[...]

$$G_k \text{ is connected, if } \bar{\lambda}_k > \lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}, \text{ and} \tag{7a}$$

$$G_k \text{ is partitioned, if } \bar{\lambda}_k < \lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}. \tag{7b}$$

Connectedness of a neutral network, implying that it consists of a single component, is important for evolutionary optimization. Populations usually cover a connected area in sequence space and they migrate (commonly) by the Hamming distance moved. Accordingly, if they are situated on a particular component of a neutral network, they can reach all sequences of this component. If the single component of the connected neutral network of a common structure spans all sequence space, a population on it can travel by random drift through whole sequence space.

Neutral networks connect sequences forming the same secondary structure of minimum free energy. Every sequence, however, forms a great number of sub-optimal structures, which are also computable by suitable algorithms. Seen from a given structure $S_k$, the neutral set $G_k$ is surrounded by the set of compatible sequences $C_k$. This set contains all sequences which form $S_k$ as sub-optimal or minimum free energy structure. By taking two structures at random, say $S_j$ and $S_k$, and considering the two sets of compatible sequences, $C_j$ and $C_k$, it was proven [42] that the intersection is always non-empty: $C_j \cap C_k \neq \emptyset$. In other words, this intersection theorem can be expressed by: Given an arbitrary pair of structures, there will be at least one sequence that can adopt both structures 3).

3) It is important to stress that the intersection theorem cannot be extended to three or more structures: for three or more structures there may but need not exist a sequence that can form all of them [42].

Fig. 2.8. Neutral networks in sequence space. The pre-image of the structure in the lower part of the figure is a connected neutral network spanning whole sequence space. Networks of this class are typical for frequent structures.
The upper part of the figure shows an example of a partitioned network, which consists of one giant component and many small islands. Connectivity is determined by the mean fraction of neutral neighbors, $\bar{\lambda}_k$, of the pre-image of the corresponding structure, $S_k$, in sequence space.

The existence of extended and connected neutral networks in RNA sequence space was proven by an elegant experiment recently published by Erik Schultes and David Bartel [43]. The starting point for their work was a pair of ribozymes of known structure with chain length 88: (i) an RNA ligase evolved in the laboratory [44], and (ii) a natural cleavage ribozyme isolated from hepatitis delta virus RNA [45]. The two structures have no base pair in common and apparently no common phylogenetic history. Then, an RNA sequence was designed and synthesized at the intersection of the two neutral networks of the reference structures. This means that a chimeric sequence was synthesized which was compatible with both structures. The chimera did form both structures on folding and showed both activities, although they were substantially weaker than those of the reference ribozymes, the ligase and the cleavage ribozyme, respectively. Only two or three selected point mutations or base pair exchanges are required, however, to reach full catalytic efficiency. Still, the two optimized RNA molecules have a Hamming distance of about forty from their reference sequences. Next, Schultes and Bartel explored further the mutational neighborhoods and found neutral paths of Hamming distance around 40, by preparing and analyzing series of RNA sequences, along which neighboring sequences differ in a single base or base pair only. Without interruption these two neutral paths lead from the chimeric RNA with both catalytic activities to the two reference ribozymes.
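The notion of a sequence that is compatible with two structures can be made concrete with a small constructive sketch. It assumes structures written in dot-bracket notation and uses the observation that each position has at most one pairing partner per structure, so the combined pairing graph consists of paths and even cycles and can be two-colored with G and C. The function names and the two toy structures are hypothetical. Compatibility only means that every required pair is an allowed base pair; it does not imply that the sequence actually folds into either structure, and loop-size constraints are ignored.

```python
ALLOWED_PAIRS = {("G", "C"), ("C", "G"), ("A", "U"),
                 ("U", "A"), ("G", "U"), ("U", "G")}


def pair_table(structure):
    """Map every paired position of a dot-bracket string to its partner."""
    stack, partner = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def is_compatible(sequence, structure):
    """True if every base pair required by the structure is an allowed pair."""
    table = pair_table(structure)
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in table.items())


def sequence_for_both(structure_1, structure_2):
    """Build one sequence compatible with both structures by two-coloring."""
    tables = (pair_table(structure_1), pair_table(structure_2))
    sequence = [None] * len(structure_1)
    for start in range(len(sequence)):
        if sequence[start] is not None:
            continue
        if all(start not in table for table in tables):
            sequence[start] = "A"  # unpaired in both structures
            continue
        sequence[start] = "G"
        todo = [start]
        while todo:
            i = todo.pop()
            opposite = "C" if sequence[i] == "G" else "G"
            for table in tables:
                j = table.get(i)
                if j is not None and sequence[j] is None:
                    sequence[j] = opposite
                    todo.append(j)
    return "".join(sequence)


# Two hypothetical toy structures of equal length that share positions.
s1 = "((((....))))"
s2 = "(((....))).."
chimera = sequence_for_both(s1, s2)
```

The two-coloring uses only G and C for paired positions, which is the simplest choice; a design aimed at real molecules would also have to consider the energies of both folds.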
The experiment of Schultes and Bartel thus presents a direct proof for a sequence space-wide extension of the two neutral networks as well as an experimental confirmation of the existence of a non-empty intersection of the two compatible sets. The existence of multi-stable RNA molecules has been derived also by means of a recently developed kinetic folding algorithm [46], which resolves the folding process to elementary steps involving single base pairs. Application to sequences at the intersection of structures allows the design of molecules switching between two or more conformations with predefined rate constants [47].

Computer simulations of evolution in sequence space through replication and mutation in populations of RNA molecules under the conditions of a flow reactor (Fig. 2.4) were carried out first in the 1980s [48]. Typical sustainable population sizes are between one thousand and one hundred thousand molecules. The mutation rate, p, is adjusted to the chain lengths of the molecules so that the majority of mutation events leads to single point mutations and double mutations in a single replication event are very rare. Basic to these in silico studies is a straightforward introduction of phenotypes, represented by molecular structures, into the model (Fig. 2.9).
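How the mutation rate and the chain length together determine the share of single and multiple mutations can be estimated with a short calculation. The sketch below assumes that every site mutates independently with the same probability p, so that the number of mutations per replication follows a binomial distribution; the chain length of 76 and the rate of 0.001 are illustrative values, not parameters taken from the simulations cited above.

```python
from math import comb


def mutation_count_probability(k, length, p):
    """Binomial probability of exactly k point mutations in one replication."""
    return comb(length, k) * p**k * (1 - p) ** (length - k)


# Illustrative (assumed) values: chain length and per-site mutation rate.
chain_length = 76
rate = 0.001

p_none = mutation_count_probability(0, chain_length, rate)
p_single = mutation_count_probability(1, chain_length, rate)
p_multiple = 1.0 - p_none - p_single

# Share of all mutation events that are single point mutations.
single_share = p_single / (1.0 - p_none)
```

Under these assumptions most replications are error-free, and among the replications that do introduce errors the large majority carry a single point mutation.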
Every newly formed genotype produced in the population by an off-the-cloud mutation (Fig. 2.5) is folded into its minimum free energy structure and the resulting structure is evaluated to yield the replication rate or fitness value of the new molecular variant. These early studies of evolution in silico already provided clear evidence for the punctuated nature of the optimization process and neutral drift during the epochs of phenotypic stasis, independent of whether the simulations were conceived to aim at one particular target structure or at some property shared by several classes of structures. Later on, further studies on neutral evolution were performed with the goal to check the diffusion approximation of random drift [10]. A more recent investigation [49, 50] explored and revealed the mechanism of punctuated evolution. A typical plot of the course of the mutation-selection process is shown in Fig. 2.10: the mean distance to target of the population (which is a measure of fitness in these simulation experiments) is plotted against time and shows pronounced punctuation. Adaptive periods are interrupted by long epochs of stasis with respect to fitness. Evolution in genotype space, however, neither slows down nor stops on the mean fitness plateaus [51]. Inspection of the sequence distribution of the population provides new insights into the process.

Fig. 2.9. Evolutionary dynamics with phenotypes. The sketch shows a sequence of events following an off-the-cloud mutation and leading to an innovation, which consists in the incorporation of a new mutant into the replication-mutation ensemble: (i) a new variant sequence, $I_k$, is created through a mutation, $I_j \to I_k$, (ii) the sequence $I_k$ is converted into a structure $S_k$, and (iii) the fitness of the new phenotype is determined by means of the mapping $f_k = f(S_k)$. Eventually, the new variant is fully integrated into the replication-mutation ensemble.

An evolutionary trajectory leading from an initial population to the final state is characterized by a uniquely defined time-ordered series of phenotypes, called the relay series [49]. It can also be understood as a series of transitions between pairs of consecutive phenotypes in the relay series. Transitions are off-the-cloud mutations leading to new phenotypes and fall into two classes: (i) minor or continuous transitions and (ii) major or discontinuous transitions 4). Minor transitions between structures occur with high frequency and involve changes that are easy to accomplish with a single point mutation, like opening or closing of single base pairs adjacent to stacks. Opening of stacks with marginal stability also falls into this class. The sequence constraint is low: almost every sequence forming the initial structure yields the final structure of a minor transition on one or a few different single point mutations. Major transitions between structures require simultaneous changes in several adjacent and/or distant base pairs and occur at single point mutations with low probability only. Major transitions are characterized by strong constraints on initial sequences.
In other words, they require special initial sequences and thus occur with low probability when averaged over the entire neutral network.

The dynamics on the plateaus of constant fitness falls into one of two different scenarios: (i) neutral evolution in the conventional sense, consisting of changing genotypes that give rise to the same phenotype, or phenotypic stasis expressed by a single phenotype on the relay series, and (ii) a neutral random walk on a subset of closely related phenotypes of identical fitness, which are accessible from each other through minor transitions, that manifests itself by a sometimes large number of steps in the relay series with frequent repetitions of particular phenotypes. Very rarely, fitness-neutral major transitions are also observed inside fitness plateaus. As we shall see below the two scenarios are not very different in reality: scenario (ii) is readily converted into scenario (i) by an increase in population size. Each quasi-stationary epoch ends with a major transition that is accompanied by a gain in fitness. A straightforward interpretation of this finding suggests that the population undertakes a random search during the epochs of phenotypic stasis until a mutant sequence is produced that initiates a fitness-improving major transition. A cascade of fitness-improving minor transitions commonly follows the major transition, and the close neighborhood of the new variant is thereby instantaneously explored.

The explanation given above is strongly supported by the dynamics observed in genotype space. When the population enters a fitness plateau the distribution of genotypes is very narrow (Fig. 2.10). Then, while the population diffuses on a neutral subspace of sequence space, the width of the mutant cloud increases steadily and seems to approach a saturation phase. Instantaneously, when the population reaches the end of the fitness plateau, the width of the distribution drops as the population passes a bottleneck in genotype space.
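The two genotype-space measures used in this discussion, the width of the population and the movement of its center, can be written down in a few lines. The sketch below takes the width to be the mean pairwise Hamming distance, as in the caption of Fig. 2.10, and assumes that the population center is the position-wise consensus sequence; that definition of the center, the function names and the toy populations are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def population_width(population):
    """Mean pairwise Hamming distance within the population."""
    pairs = list(combinations(population, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def population_center(population):
    """Consensus sequence: the most frequent symbol at every position."""
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*population))


def center_drift(population_now, population_later):
    """Hamming distance between the centers of two population snapshots."""
    return hamming(population_center(population_now),
                   population_center(population_later))


# Hypothetical snapshots: a narrow cloud, a spread cloud, and a shifted cloud.
narrow = ["GGCC", "GGCC", "GGCC"]
spread = ["GGCC", "GGCA", "AGCC"]
shifted = ["AACC", "AACC", "AACC"]
```

In this toy example the spread cloud has a larger width than the narrow one while its center has not moved, whereas the shifted cloud is narrow again but its center has jumped.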
This picture of population dynamics on the neutral subset, slow spread and fast contraction, is complemented by a recording of the migration of the population center through sequence space. On the plateau, during the spread of the distribution, the center is almost stationary or drifts very slowly. At the end of the quasi-stationary epoch, however, the velocity of the population center shows a sharp peak corresponding to a jump in sequence space. Major transitions lead to genotypes, which represent bottlenecks for evolutionary optimization.

Individual trajectories of evolution in the flow reactor are not reproducible in detail. Relay series of different computer runs under identical conditions5) involve different structures and the corresponding genotypes have sequences that diverge from initial conditions. Almost all quantities, for example the number of replications required to reach the target or the number of minor transitions, show widely scattered distributions. Population size effects on the evolutionary processes are pronounced. The number of replications increases with population size; a dramatic effect is seen with the number of minor transitions: it decreases by a factor of about four in the range between N = 1000 and N = 100000 molecules. The number of major transitions, however, shows only small scatter and is remarkably constant in this range of population sizes. Modeling of neutral evolution by means of a birth-and-death process provides a straightforward interpretation of this result: minor transitions have a sufficiently high probability of occurrence such that frequent variants, once formed, stay in a larger population and do not reappear in further steps of the relay series. The low sensitivity of the numbers of major transitions to both population size and sequence of random events, however, makes them candidates for constants of evolution: they represent essential innovations and their number appears to depend only on initial and final state.

Fig. 2.10. Variability in genotype space during punctuated evolution. Shown are the results of a simulation of RNA optimization towards a tRNA target with population size N = 3000 and mutation rate p = 0.001 per site and replication event: (i) the trace of the underlying trajectory recording the average distance from target (gray, left ordinate scaled by 0.22, or full length is 50) and (ii) two plots of different measures of evolution in genotype space, the migration of the population center (with dt = 8000 replications) and the width of the population, against time expressed as the total number of replications performed until time t. The upper plot is a measure of genotype diversity and shows the mean Hamming distance within the population (dotted line, right ordinate). The lower curve presents the Hamming distance between the centers of the population at times t and t + dt (full line, left ordinate) and measures the drift velocity of the population center. The arrow indicates a sharp peak of this drift velocity at the end of the second long plateau, which reaches a height of Hamming distance ten.

4) The choice of the adjectives continuous and discontinuous points to topological relations between the pre-images of the corresponding structures in sequence space [52].

5) Identical conditions here means that everything was chosen to be the same except the seeds of the random number generator.

2.4 Sequence–Structure Mappings of Proteins

In this section we do not aim at a presentation of the current state of the art in the design of proteins by variation and selection. This will be done in great detail in the other chapters of this volume.
What we shall try to do instead is a comparison of results derived for proteins and RNA molecules to point out common features as well as differences.

The experimental results of selection and evolution of molecules derived here came mainly from investigations on RNA molecules, simply because RNA is better suited for such studies: (i) RNA unites the properties of genotype and phenotype in one and the same molecule, and (ii) the bases in the base pairs of the stacking regions of RNA are complementary (AU, GC, and sometimes GU). These relations are fundamental for the simple logic of secondary structure formation, and have no counterpart in proteins. In addition, RNA secondary structures almost always play the role of an intermediate in the kinetic folding process and thus have a physical meaning. A third problem with the evolutionary design of proteins is the need to link messengers and function carriers. This can be solved elegantly by the various display techniques: phage, bacterial, ribosomal display and others. Another elegant method based on a covalent link between RNA and protein has been used in a paper discussed below [53]. Although variation-selection methods are available for proteins, they cannot match the ease of selection procedures when both properties are contained in the same molecule, as in the case of RNA.

Protein sequence space was postulated as a useful tool for discussing protein evolution already in 1970 [54]. Later on, the most extensive model studies were more or less confined to rather simple lattice models [55]. Systematic studies on random sequence model proteins [56] gave two important results: (i) more sequences than structures, and (ii) a few common folds compared to a great variety of rare folds. The second finding was also obtained by different stability considerations [57].
It is worth noticing that the frequency distribution of protein lattice folds is remarkably similar to that of RNA molecules with random sequences of the same chain length [40]. Shape space covering as observed with RNAs does not hold for lattice proteins [41].

Neutral networks [41] represent more or less the basic and most important feature of genotype-phenotype mappings. Although protein structure and function have been discussed with respect to neutrality for a very long time, direct evidence for neutrality and neutral networks came only recently from empirical potentials and neural network studies [58, 59]. Other investigations on protein foldability landscapes are in general agreement with the existence of extended neutral networks too [60, 61].

It is worth mentioning in this context that there seems to be a general difference between RNA and protein landscapes: certain composition ratios between hydrophobic and hydrophilic amino acids presumably give rise to insoluble aggregates and this may lead to holes in protein sequence space. Perhaps the concept of holey adaptive landscapes, as favored in a series of recent papers on models of evolution [62], might be useful in this context.

Finally, two experimental results are highly relevant in this context. The first study on true random sequence proteins [53] revealed that the occurrence of function in protein sequence space has approximately the same probability, 10^-12, as in RNA sequence space. The second remarkable finding of this study was that very different protein structures, with no sequence homology, of course, gave rise to binding affinity for AMP, the target molecule. More studies following along the line of this elegant experiment will provide the desired insight into protein sequence-structure mapping. The second experiment was done four years ago [63]: two protein molecules with 50 % sequence homology have entirely different structures.
A fully β-sheet structure was turned into an α-helix bundle by changing only half of the amino acid residues. Entirely different structures can be found at not too large Hamming distances in sequence space.

2.5 Concluding Remarks

What distinguishes the evolutionary strategy from conventional or rational design? The primary and most important issue is that we need not know the structure that yields the desired function. It is sufficient to derive an assay that allows for testing whether or not a candidate molecule has the desired property. At the current state of the art, de novo rational design of biopolymers gives very poor results, and as long as this deficiency in structure prediction methods cannot be overcome, evolutionary search for function will be superior.

Variation and selection turns out to be an enormously potent tool for improvement also in vitro. Why this is so does not trivially follow from the nature of random searches. Monte Carlo methods may work very poorly, as we know from other optimization problems. The intrinsic regularities of genotype-phenotype mappings with high degrees of neutrality and very wide scatter of the points in sequence space, which lead to the same or very similar solutions, are the clues to evolutionary success.

Acknowledgements

The work reported here was supported financially by the Austrian Fonds zur Förderung der wissenschaftlichen Forschung (FWF), Projects P-13093-GEN, P-13887-MOB, and P-14898-MAT, as well as by the Jubiläumsfonds der Österreichischen Nationalbank, Project No. 7813.

References

[1] K. Sander, Biologie in unserer Zeit, 1988, 18, 161–167 (in German).
[2] G. de Beer, Notes and Records of the Royal Society of London, 1964, 19, 192–226.
[3] R. A. Fisher, The genetical theory of natural selection, Oxford University Press, Oxford (UK), 1930.
[4] W. P. C. Stemmer, Proc. Natl. Acad. Sci. USA, 1994, 91, 10747–10751.
[5] M.
Kimura, The neutral theory of molecular evolution, Cambridge University Press, Cambridge (UK), 1983.
[6] J. L. King, T. H. Jukes, Science, 1969, 788–798.
[7] S. F. Elena, V. S. Cooper, R. E. Lenski, Science, 1996, 272, 1802–1804.
[8] D. Papadopoulos, D. Schneider, J. M. Meier-Eiss, W. Arber, R. E. Lenski, M. Blot, Proc. Natl. Acad. Sci. USA, 1999, 96, 3807–3812.
[9] B. Derida, L. Peliti, Bull. Math. Biol., 1991, 53, 355–382.
[10] M. A. Huynen, P. F. Stadler, W. Fontana, Proc. Natl. Acad. Sci. USA, 1996, 93, 397–401.
[11] M. Eigen, W. C. Gardiner, Pure Appl. Chem., 1984, 56, 967–978.
[12] S. Spiegelman, Quart. Rev. Biophys., 1971, 4, 213–253.
[13] H. F. Judson, The eighth day of creation, Jonathan Cape, London, 1979.
[14] R. W. Hamming, Coding and information theory, 2nd ed., Prentice Hall, Englewood Cliffs, NJ, 1989.
[15] P. F. Stadler, G. P. Wagner, Evol. Comp., 1998, 5, 241–275.
[16] M. Eigen, Naturwissenschaften, 1971, 58, 465–523.
[17] E. Domingo, J. J. Holland, Annu. Rev. Microbiol., 1997, 51, 151–178.
[18] C. K. Biebricher, W. C. Gardiner, Biophys. Chem., 1997, 66, 179–192.
[19] M. Eigen, P. Schuster, Naturwissenschaften, 1977, 64, 541–565.
[20] M. Eigen, J. McCaskill, P. Schuster, Adv. Chem. Phys., 1989, 75, 149–263.
[21] J. W. Drake, Proc. Natl. Acad. Sci. USA, 1991, 88, 7160–7164.
[22] J. W. Drake, Proc. Natl. Acad. Sci. USA, 1993, 90, 4171–4175.
[23] J. W. Drake, B. Charlesworth, D. Charlesworth, J. F. Crow, Genetics, 1998, 148, 1667–1686.
[24] P. Schuster, J. Swetina, Bull. Math. Biol., 1988, 50, 635–660.
[25] C. L. Burch, L. Chao, Nature, 2000, 406, 625–628.
[26] C. O. Wilke, J. L. Wang, C. Ofria, R. E. Lenski, C. Adami, Nature, 2001, 412, 331–333.
[27] A. D. Ellington, J. W. Szostak, Nature, 1990, 346, 818–822.
[28] C. Tuerk, L. Gold, Science, 1990, 249, 505–510.
[29] A. Watts, G. Schwarz, Biophys. Chem., 1997, 66 (2/3), 67–284.
[30] D. S. Wilson, J. W. Szostak, Ann. Rev. Biochem., 1999, 68, 611–147.
[31] L. Gold, C. Tuerk, P. Allen, J. Binkley, D. Brown, L. Green, S. MacDougal, D. Schneider, D. Tasset, S.
R. Eddy. In: R. F. Gesteland, J. F. Atkins, eds. The RNA world. Cold Spring Harbor Press, Plainview, NY, 1993, pp. 497–509.
[32] A. A. Beaudry, G. F. Joyce, Science, 1992, 257, 635–641.
[33] R. R. Breaker, Chem. Rev., 1997, 97, 371–390.
[34] N. Lehman, G. F. Joyce, Current Biology, 1993, 3, 723–734.
[35] D. P. Bartel, J. W. Szostak, Science, 1993, 261, 1411–1418.
[36] R. F. Gesteland, J. F. Atkins, eds. The RNA world. Cold Spring Harbor Press, Plainview, NY, 1993.
[37] P. Schuster, Biol. Chem., 2001, 382, in press.
[38] P. Higgs, Quart. Rev. Biophys., 2000, 33, 199–253.
[39] P. Schuster, P. F. Stadler, In: M. J. C. Crabbe, M. Drew, A. Konopka, Handbook of Computational Chemistry, Marcel Dekker, New York, 2001, in press.
[40] W. Grüner, R. Giegerich, D. Strothmann, C. Reidys, J. Weber, I. L. Hofacker, P. F. Stadler, P. Schuster, Mh. Chem., 1996, 127, 355–389.
[41] P. Schuster, W. Fontana, P. F. Stadler, I. L. Hofacker, Proc. Roy. Soc. London B, 1994, 255, 279–284.
[42] C. Reidys, P. F. Stadler, P. Schuster, Bull. Math. Biol., 1997, 59, 339–397.
[43] E. A. Schultes, D. P. Bartel, Science, 2000, 289, 448–452.
[44] E. H. Ekland, J. W. Szostak, D. P. Bartel, Science, 1995, 269, 364–370.
[45] A. T. Perotta, M. D. Been, J. Mol. Biol., 1998, 279, 361–373.
[46] C. Flamm, W. Fontana, I. L. Hofacker, P. Schuster, RNA, 2000, 6, 325–338.
[47] C. Flamm, I. L. Hofacker, S. Maurer-Stroh, P. F. Stadler, M. Zehl, RNA, 2001, 7, 254–265.
[48] W. Fontana, P. Schuster, Biophys. Chem., 1987, 26, 123–147.
[49] W. Fontana, P. Schuster, Science, 1998, 280, 1451–1455.
[50] P. Schuster, W. Fontana, Physica D, 1999, 133, 427–452.
[51] P. Schuster, A. Wernitznig, Is there a constant number of evolutionary innovations required to reach a given target? Preprint, 2001.
[52] B. M. Stadler, P. F. Stadler, G. P. Wagner, W. Fontana, J. Theor. Biol., 2002, in press.
[53] A. D. Keefe, J. W. Szostak, Nature, 2001, 410, 715–718.
[54] J. Maynard Smith, Nature, 1970, 225, 563–564.
[55] K. Yue, K. M. Fiebig, P. D. Thomas, H. S. Chan, E. I. Shakhnovich, K. A. Dill, Proc. Natl. Acad.
Sci. USA, 1993, 90, 1942–1946.
[56] H. Li, R. Helling, C. Tang, N. Wingreen, Science, 1996, 273, 666–669.
[57] S. Govindarajan, R. A. Goldstein, Proc. Natl. Acad. Sci. USA, 1996, 93, 3341–3345.
[58] A. Babajide, I. L. Hofacker, M. J. Sippl, P. F. Stadler, Folding & Design, 1997, 2, 261–269.
[59] A. Babajide, R. Farber, I. L. Hofacker, J. Inman, A. S. Lapedes, P. F. Stadler, J. Theor. Biol., 2001, 212, 35–40.
[60] S. Govindarajan, R. A. Goldstein, Biopolymers, 1997, 42, 427–438.
[61] S. Govindarajan, R. A. Goldstein, Proteins, 1997, 29, 461–466.
[62] S. Gavrilets, Trends in Ecology and Evolution, 1997, 12, 307–312.
[63] S. Dalal, S. Balasubramanian, L. Regan, Nat. Struct. Biol., 1997, 4, 548–552.

3 Using Evolutionary Strategies to Investigate the Structure and Function of Chorismate Mutases1)

Donald Hilvert, Sean V. Taylor, and Peter Kast

3.1 Introduction

Evolution is the slow and continual process by which all living species diversify and become more complex. Through recursive cycles of mutation, selection and amplification, new traits accumulate in a population of organisms [1]. Those that provide an advantage under prevailing environmental conditions are passed from one generation to the next. Since ancient times, man has exploited evolution in a directed way to produce plants and animals with useful characteristics. Crossbreeding individuals with favorable traits successfully harnesses sexual recombination, one of the most powerful evolutionary strategies to generate new variants. From these crossings, progeny with improved features are chosen for additional breeding cycles, thus channeling the course of development.

Biologists and chemists have recently begun to use evolutionary strategies to study and tailor the properties of individual molecules rather than whole organisms. An array of methods has been developed to generate diversity in populations of molecules.
Depending on the experiment, mutagenesis might entail degen