Adventures in Computational Enzymology
John Mitchell
MACiE Database
Mechanism, Annotation and Classification in Enzymes.
http://www.ebi.ac.uk/thornton-srv/databases/MACiE/
G.L. Holliday et al., Nucl. Acids Res., 35, D515-D520 (2007)
Enzyme Nomenclature and Classification
EC Classification
Class
Subclass
Sub-subclass
Serial number
Enzyme Commission (EC) Nomenclature, 1992, Academic Press, San Diego, 6th Edition
Chemical reaction
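The four EC levels listed above map directly onto the dotted EC number. As a minimal sketch (the names `ECNumber`, `parse_ec`, `truncate` and `format_ec` are illustrative and not part of any EC or MACiE tool), an EC number can be split into its levels and truncated to a coarser level like this:

```python
from typing import NamedTuple, Optional


class ECNumber(NamedTuple):
    """The four EC levels; None stands for an unspecified level ('-')."""
    ec_class: Optional[int]
    subclass: Optional[int]
    sub_subclass: Optional[int]
    serial: Optional[int]


def parse_ec(text: str) -> ECNumber:
    """Parse strings such as 'EC 4.2.1.10' or '4.2.-.-'."""
    text = text.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields, got {text!r}")
    return ECNumber(*(None if f == "-" else int(f) for f in fields))


def truncate(ec: ECNumber, depth: int) -> ECNumber:
    """Keep the first `depth` levels and blank out the rest."""
    return ECNumber(*(v if i < depth else None for i, v in enumerate(ec)))


def format_ec(ec: ECNumber) -> str:
    """Render an ECNumber back into dotted notation."""
    return ".".join("-" if v is None else str(v) for v in ec)


dehydroquinase = parse_ec("EC 4.2.1.10")
print(format_ec(truncate(dehydroquinase, 3)))  # sub-subclass level
```

Truncating to depth 3 gives the sub-subclass, the level at which MACiE is described as choosing its representatives.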
The EC Classification
Reaction direction arbitrary.
Doesn’t deal with structural and sequence information.
Thus, cofactors and active-site residues are ignored.
However, it was never intended to describe mechanism.
Only deals with overall reaction.
A New Representation of Enzyme Reactions?
Should be complementary to, but distinct from, the EC system.
Should take into account:
Reaction Mechanism;
Structure;
Sequence.
Need a database of enzyme mechanisms.
MACiE Database
Mechanism, Annotation and Classification in Enzymes.
http://www.ebi.ac.uk/thornton-srv/databases/MACiE/
Coverage of MACiE
Representative – based on a non-homologous dataset, and chosen to represent each available EC sub-subclass.
EC level                      Structures exist for   MACiE covers
Class (EC 1.-.-.-)                       6                  6
Subclass (EC 1.2.-.-)                   56                 53
Sub-subclass (EC 1.2.3.-)              184                156
Serial number (EC 1.2.3.4)            1312                199
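The two sets of counts above can be turned into coverage percentages per EC level. The following is a small sketch using only the numbers quoted on the slide; the dictionary and function names are illustrative.

```python
# Counts quoted on the slide: EC codes with known structures vs. codes in MACiE.
structures = {"class": 6, "subclass": 56, "sub-subclass": 184, "serial number": 1312}
macie = {"class": 6, "subclass": 53, "sub-subclass": 156, "serial number": 199}


def coverage(covered: dict, available: dict) -> dict:
    """Percentage of available EC codes that are covered, per EC level."""
    return {level: 100.0 * covered[level] / available[level] for level in available}


percentages = coverage(macie, structures)
for level, pct in percentages.items():
    print(f"{level:14s} {macie[level]:5d}/{structures[level]:<5d} {pct:5.1f}%")
```

On these figures coverage is complete at the class level, roughly 95% and 85% at the subclass and sub-subclass levels, and about 15% at the serial-number level, which is consistent with a dataset chosen to represent sub-subclasses rather than individual reactions.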
Repertoire of Enzyme Catalysis
G.L. Holliday et al., J. Molec. Biol., 372, 1261-1277 (2007)
Repertoire of Enzyme Catalysis
[Figure: bar chart. x-axis: Reaction Types (Heterolytic Elimination, Homolytic Elimination, Electrophilic Addition, Nucleophilic Addition, Homolytic Addition, Electrophilic Substitution, Nucleophilic Substitution, Homolytic Substitution); y-axis: Number of steps in MACiE (0 to 140); series: Intramolecular, Bimolecular, Unimolecular.]
Enzyme chemistry is largely nucleophilic
[Figure: bar chart. x-axis: Reaction Types (Proton transfer, AdN2, E1, SN2, E2, Radical reaction, Tautom., Others); y-axis: Number of steps in MACiE (0 to 450).]
Repertoire of Enzyme Catalysis
Residue Catalytic Propensities
Evolution of Enzyme Function
D.E. Almonacid et al., to be published
Work with domains - evolutionary & structural units of proteins.
Map enzyme catalytic mechanisms to domains to quantify convergent and divergent functional evolution of enzymes.
Domains
Functional Classification: EC
Enzyme Commission (EC) Nomenclature, 1992, Academic Press, San Diego, 6th Edition
Chemical reaction
Enzyme Catalysis Databases
G.L. Holliday et al., Nucleic Acids Res., 35, D515 (2007)
S.C. Pegg et al., Biochemistry, 45, 2545 (2006)
N. Nagano, Nucleic Acids Res., 33, D407 (2005)
Coverage of MACiE
Representative – based on a non-homologous dataset, and chosen to represent each available EC sub-subclass.
Coverage of SFLD
Based on a few evolutionarily related families
Coverage of EzCatDB
But without mechanisms.
Structural Classification: CATH
Orengo, C. A., et al. Structure, 1997, 5, 1093
Dataset
                      CATH (single-domain)   Enzymes in PDB
Database entries               395               >>799
EC sub-subclasses              114                 184
EC serial numbers              326                1312
To avoid the ambiguity of multi-domain structures we use only single-domain proteins.
Numbers of CATH code occurrences per EC number
CATH level   c.-.-.-   c.s.-.-   c.s.ss.-   c.s.ss.sn
C              3.17      1.73      1.38       1.11
A             11.00      3.27      1.93       1.60
T             28.00      4.89      2.24       1.19
H             38.33      5.80      2.46       1.22
Results: Convergent Evolution
2.46 CATH/EC reaction: Convergent Evolution
An average reaction has evolved independently in 2.46 superfamilies.
EC reactions/CATH
CATH level   c.-.-.-   c.s.-.-   c.s.ss.-   c.s.ss.sn
C              4.75     19.50     39.25      90.00
A              3.14      7.00     10.48      17.90
T              1.36      1.79      2.08       3.05
H              1.20      1.36      1.46       2.05
database entries/CATH: 2.18
Results: Divergent Evolution
1.46 EC reactions/CATH: Divergent Evolution
An average superfamily has evolved 1.46 different reactions.
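The two headline figures are averages taken in opposite directions over the same set of (EC number, CATH code) annotations: distinct CATH codes per EC code for convergence, and distinct EC codes per CATH code for divergence. The sketch below shows that counting on a toy dataset. The entries are hypothetical and are not MACiE data, and `truncate` and `mean_partners` are illustrative names rather than the authors' code.

```python
from collections import defaultdict
from statistics import mean


def truncate(code: str, depth: int) -> str:
    """Keep the first `depth` dot-separated fields of an EC or CATH code."""
    return ".".join(code.split(".")[:depth])


def mean_partners(pairs, key_index, key_depth, partner_depth):
    """Average number of distinct partner codes observed per key code.

    key_index=0 uses the EC number as key (CATH codes per EC code);
    key_index=1 uses the CATH code as key (EC codes per CATH code).
    """
    partners = defaultdict(set)
    for pair in pairs:
        key = truncate(pair[key_index], key_depth)
        partner = truncate(pair[1 - key_index], partner_depth)
        partners[key].add(partner)
    return mean(len(found) for found in partners.values())


# Hypothetical (EC number, CATH code) pairs for single-domain enzymes.
entries = [
    ("4.2.1.10", "3.20.20.70"),
    ("4.2.1.10", "3.40.50.9100"),
    ("4.2.1.11", "3.20.20.120"),
    ("1.1.1.1", "3.40.50.720"),
    ("1.1.1.2", "3.40.50.720"),
    ("2.7.1.1", "3.40.50.720"),
]

# Convergence: CATH superfamilies (depth 4) per EC sub-subclass (depth 3).
convergent = mean_partners(entries, key_index=0, key_depth=3, partner_depth=4)
# Divergence: EC sub-subclasses (depth 3) per CATH superfamily (depth 4).
divergent = mean_partners(entries, key_index=1, key_depth=4, partner_depth=3)
print(convergent, divergent)
```

Changing `key_depth` and `partner_depth` sweeps out the other cells of the two tables, for example depth 1 to 4 for the C, A, T and H levels of CATH.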
Density Functional Theory Calculations on Dehydroquinase
Mattias Blomberg et al., to be published
DFT – System Size
• System sizes of ~100-150 atoms can be treated using DFT
• That raises the question of how to treat the rest of the protein.
Dielectric Continuum or QM/MM?
• One approach is to cut out the active site residues and treat the rest of the protein as a dielectric continuum.
• Another approach is to treat the active site as QM and the rest of the protein using MM.
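[Figure: two schematics. Left: QM region embedded in a dielectric continuum with ε=4. Right: QM region embedded in an MM region.]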
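To give a feel for what a dielectric continuum with ε=4 does, the sketch below evaluates the textbook Born model for a single spherical ion. This is only the simplest continuum estimate and is not the continuum method used in the calculations described here; the charge and cavity radius are hypothetical.

```python
# e^2 / (4 * pi * eps0) expressed in kcal mol^-1 Angstrom (assumed constant).
COULOMB_KCAL_ANGSTROM = 332.06


def born_solvation_energy(charge: float, radius: float, epsilon: float) -> float:
    """Born estimate (kcal/mol) for moving a spherical ion from vacuum
    into a continuum of relative permittivity `epsilon`.

    charge  -- net charge in units of e
    radius  -- cavity radius in Angstrom
    """
    return -0.5 * COULOMB_KCAL_ANGSTROM * charge ** 2 / radius * (1.0 - 1.0 / epsilon)


# Hypothetical ion: unit charge in a 2 Angstrom cavity.
protein_like = born_solvation_energy(1.0, 2.0, epsilon=4.0)
water_like = born_solvation_energy(1.0, 2.0, epsilon=80.0)
print(f"eps=4: {protein_like:.1f} kcal/mol, eps=80: {water_like:.1f} kcal/mol")
```

The factor (1 - 1/ε) is already 0.75 at ε=4, so even a low-dielectric continuum recovers most of the stabilisation of a charge; what a continuum cannot describe is specific hydrogen bonds and steric constraints, which is the usual argument for QM/MM.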
Dehydroquinase - Part of the Shikimate Pathway
[Figure: shikimate and chorismate pathways; dehydroquinase (shikimate pathway) is labelled.]
Shikimate & Chorismate Pathways
• Biosynthetic pathway for phenylalanine, tyrosine and tryptophan.
• Present in plants, microorganisms and fungi but not in mammals.
• The pathway contains the target of glyphosate, an important herbicide.
• Understanding the mechanisms and developing inhibitors is of great importance for the development of new herbicides, fungicides and antibiotics.
Two Types of Dehydroquinases
• Type I: E. coli and S. typhi (EC 4.2.1.10), MACiE M0054.
Mechanism: cis-dehydration, imine intermediate.
• Type II: S. coelicolor, M. tuberculosis and H. pylori (EC 4.2.1.10), MACiE M0055.
Mechanism: trans-dehydration, enol(ate) intermediate.
Proposed Mechanism of DHQase
[Figure: reaction scheme for the proposed DHQase mechanism, drawn as three structures of the substrate in the active site with residues Arg113, Tyr28, His106, Ala82, Pro15, Asn16 and Asn79.]
Models of DHQase Active Site
Energetics of DHQase
Model A
Does Asn16 Protonate the DHQ Enolate?
Other Things we do
Chemoinformatics for pharmaceutical design …
…using Machine Learning for prediction of solubility, bioavailability and bioactivity.
Machine Learning Methods
• Recognise patterns in data
• Similar inputs give similar outputs
• Make full use of all available information
• One application is solubility
Machine Learning Methods
• Can be used for Classification or for Regression
• Can be used with chemoinformatics, physicochemical or experimental (e.g., assay) data as descriptors
Solubility is an important issue in drug discovery and a major source of attrition
This is expensive for the industry
A good model for predicting the solubility of druglike molecules would be very valuable.
Drug Disc. Today, 10 (4), 289 (2005)
Machine Learning Method
Random Forest
Machine Learning Method
k-Nearest Neighbours
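The idea that similar inputs give similar outputs is most literal in k-nearest neighbours: a query molecule is assigned the average property of the k training molecules closest to it in descriptor space. The sketch below is a plain-Python regression version; the descriptor vectors and solubility values are made up for illustration, and `knn_predict` is not taken from the work presented.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict a property as the mean over the k nearest training points.

    train_x -- list of descriptor tuples
    train_y -- list of measured values, one per training point
    query   -- descriptor tuple of the molecule to predict
    """
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical two-descriptor training set with hypothetical log-solubilities.
descriptors = [(1.2, 0.8), (1.0, 1.1), (3.5, 2.9), (3.8, 3.1), (0.9, 0.7)]
log_solubility = [-1.5, -1.8, -4.2, -4.6, -1.2]

print(knn_predict(descriptors, log_solubility, query=(1.1, 0.9), k=3))
```

In practice descriptors are usually scaled before distances are computed, since otherwise the descriptor with the largest numerical range dominates the neighbour search.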
Machine Learning Method
Winnow (“Molecular Spam Filter”)
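Winnow is an online linear classifier over binary features that updates its weights multiplicatively, which makes it well suited to inputs where only a few features matter. The sketch below is a minimal version of the standard algorithm with promotion and demotion by a factor alpha. The feature vectors here are toy bit strings standing in for molecular fingerprints, which is an assumption about how the method would be applied.

```python
from itertools import product


def winnow_predict(weights, bits, threshold):
    """Return 1 if the summed weights of the active features reach the threshold."""
    return int(sum(w for w, bit in zip(weights, bits) if bit) >= threshold)


def winnow_train(examples, n_features, alpha=2.0, epochs=10):
    """Train on (bits, label) pairs; weights change only when a mistake is made."""
    weights = [1.0] * n_features
    threshold = n_features / 2
    mistakes = 0
    for _ in range(epochs):
        for bits, label in examples:
            if winnow_predict(weights, bits, threshold) == label:
                continue
            mistakes += 1
            # Promote active features after a missed positive, demote after a false alarm.
            factor = alpha if label == 1 else 1.0 / alpha
            weights = [w * factor if bit else w for w, bit in zip(weights, bits)]
    return weights, threshold, mistakes


# Toy target: a molecule is "active" if feature 0 or feature 2 is present.
examples = [(bits, int(bits[0] or bits[2])) for bits in product([0, 1], repeat=4)]
weights, threshold, mistakes = winnow_train(examples, n_features=4)
print(weights, mistakes)
```

After training, the relevant features carry large weights and the irrelevant ones have been demoted, which is the same behaviour that makes this family of algorithms useful for spam filtering.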
Future Directions
Current coverage of MACiE
Representative – based on a non-homologous dataset
Future coverage of MACiE
Adding homologues – to facilitate study of divergent evolution
Divergent Evolution using MACiE
This will use our reaction similarity work to measure changes in chemistry
Using Machine Learning Methods to calculate and predict protein-ligand binding energies
Building on our previous work …
P.M. Marsden et al., Org. Biomol. Chem., 2, 3267 (2004)
Computational Toxicology
Predicting bioavailability problems, off-target activities and side effects of drug candidates
QM, QM/MM and MD Simulation Work
• Using computational chemistry to study enzyme mechanisms
Fosfomycin Resistance Protein A
ACKNOWLEDGEMENTS
Dr Gemma Holliday
Dr Daniel Almonacid
Dr Noel O’Boyle
Dr Mattias Blomberg
Prof. Janet Thornton (EBI)
Dr Peter Murray-Rust
Dr Jochen Blumberger
ACKNOWLEDGEMENTS
Cambridge Overseas Trust
All slides after here are for information only
Similarity of Enzyme Mechanisms
N.M. O'Boyle, et al., J. Molec. Biol., 368, 1484-1499 (2007)
Measuring Similarity of Enzyme Mechanisms
Elimination
  Heterolytic: Unimolecular, Bimolecular, Intramolecular
  Homolytic: Unimolecular, Bimolecular, Intramolecular
Addition
  Electrophilic: Bimolecular, Intramolecular
  Nucleophilic: Bimolecular, Intramolecular
  Homolytic: Bimolecular, Intramolecular
Substitution
  Electrophilic: Unimolecular, Bimolecular, Intramolecular
  Nucleophilic: Unimolecular, Bimolecular, Intramolecular
  Homolytic: Unimolecular, Bimolecular, Intramolecular
Ingold, C. K. Cornell University Press, 1969.
Repertoire of enzyme catalysis
“Other reactions” and Named organic reactions currently supported in MACiE
Aldol Condensation        Hydride Transfer
Amadori Rearrangement     Isomerisation
A-SN1                     Michael Addition
A-SN2                     Nucleophilic Attack
A-SNi                     Pericyclic Reaction
Claisen Rearrangement     Proton Transfer
Condensation              Radical Formation
E1cb                      Radical Propagation
Group Transfer            Radical Termination
Heterolysis               Redox
Homolysis                 Tautomerisation
Repertoire of enzyme catalysis
Functionality for amino acids currently supported in MACiE
Activating residue        Proton acceptor
Charge destabiliser       Proton donor
Charge stabiliser         Proton relay
Covalently attached       Radical acceptor
Electrophile              Radical donor
Hydride relay             Radical relay
Hydrogen bond acceptor    Radical stabiliser
Hydrogen bond donor       Spectator
Leaving group             Steric hindrance
Metal ligand              Unknown function
Nucleophile               Unspecified steric role
Function of catalytic residues
CMLReact
Customisable mark-up language
Allows validation
Uses dictionary technology
Separates content from presentation
Open Source
BUT still under development
An Overview of MACiE and CMLReact
Energetics of DHQase
Model A
TS1 - Proton Transfer
TS2 - Dehydration
Mattias Blomberg
Model C
Model A
Model B
Model C
Models A, B & C
MD and QM/MM Calculations on Fosfomycin Resistance Protein A
Fosfomycin Resistance Protein A
Fosfomycin Resistance Proteins
• Fosfomycin inhibits the first step in the bacterial cell-wall synthesis (MurA).
• FosA is a Mn(II)-dependent soluble glutathione (GSH) transferase.
• FosA homologues in pathogenic bacteria: FosB and FosX.
Impact on Pathogens
• Low toxicity and broad-spectrum activity have resulted in an increased clinical use of fosfomycin
• Fosfomycin is most commonly used in treatments of lower urinary tract infections
• Fosfomycin alone or in combination with other drugs could also be useful against resistant Staphylococci and E. coli, which can cause serious infections in hospitalized patients (pneumonia, urinary tract infections, skin infections and bacteraemia).
Proposed Mechanism
• Mutations of Lys90, Tyr100 and Arg119 have a large effect on the turnover of the enzyme. These residues are all involved in the stabilization of the phosphonate group (Beharry et al, J Biol Chem, 2005, 17786.)
• Recent docking and mutation studies indicate that Trp34, Gln36, Tyr39, Ser50, Lys90 and Arg93 are involved in the binding of GSH (Rigsby et al, Arch. Biochem. Biophys, 2007, 277.)
• Tyr39 has been proposed to participate in the ionization of GSH (Rigsby et al, Arch. Biochem. Biophys, 2007, 277.)
Docking of GSH in FosA
10 structures from the lowest energy conformations. The GSH thiol is placed in the vicinity of FCN.
30 LGA Dockings using AutoDock 4, 1.5 Å clustering.
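Clustering docked poses at a 1.5 Å tolerance means grouping poses whose RMSD to a cluster's best-energy member is at most 1.5 Å. The sketch below is a simplified, energy-ranked greedy clustering in that spirit. It is not AutoDock's own code, and the poses and energies are hypothetical two-atom examples.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two poses with matching atom order (no superposition,
    since docked poses already share the receptor's coordinate frame)."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def cluster_poses(poses, tolerance=1.5):
    """Greedy clustering of (energy, coords) poses.

    Poses are visited from lowest to highest energy; each joins the first
    cluster whose seed (lowest-energy member) lies within `tolerance`,
    otherwise it seeds a new cluster.
    """
    clusters = []
    for energy, coords in sorted(poses, key=lambda pose: pose[0]):
        for cluster in clusters:
            seed_coords = cluster[0][1]
            if rmsd(seed_coords, coords) <= tolerance:
                cluster.append((energy, coords))
                break
        else:
            clusters.append([(energy, coords)])
    return clusters


# Hypothetical poses: (docking energy, list of atom coordinates in Angstrom).
poses = [
    (-6.5, [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]),
    (-7.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-6.0, [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]),
]
clusters = cluster_poses(poses)
print([len(cluster) for cluster in clusters])
```

The first member of each cluster is its lowest-energy pose, so picking structures "from the lowest energy conformations" amounts to reading off the cluster seeds in order.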
MD simulations
• Amber 9.
• FF03 force field, TIP3P water model.
• Truncated octahedron > 10 Å of water around the solute.
• 10 Å cutoff on non-bonding interactions
• Charges and Force constants for the Mn-centre (His, Glu, Mn, FCN) calculated using Gaussian 03.
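Backbone RMSD residue 1-268
[Plot: backbone RMSD against t (ps) for the GS- and GSH simulations.]
Distance GSH(S) – FCN (C) of the different Protonation States of GSH
[Plot: GSH(S) – FCN (C) distance against t (ps) for the GS- and GSH simulations.]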
GS- Leaves the Binding Pocket
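The observation that GS- leaves the pocket comes from following the sulfur-to-carbon distance along the trajectory. A minimal sketch of that analysis is given below. The coordinates, the frame spacing and the 5 Å cutoff are all hypothetical, and the function names are illustrative rather than part of Amber.

```python
import math


def distance_series(positions_a, positions_b):
    """Per-frame distance between two atoms given their coordinates in each frame."""
    return [math.dist(a, b) for a, b in zip(positions_a, positions_b)]


def first_exit(times, distances, cutoff):
    """Return the first time at which the distance exceeds `cutoff`, or None."""
    for time, distance in zip(times, distances):
        if distance > cutoff:
            return time
    return None


# Hypothetical frames: thiol sulfur drifting away from a fixed fosfomycin carbon.
times_ps = [0, 10, 20, 30]
sulfur = [(3.0, 0.0, 0.0), (3.5, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
carbon = [(0.0, 0.0, 0.0)] * 4

distances = distance_series(sulfur, carbon)
exit_time = first_exit(times_ps, distances, cutoff=5.0)
print(distances, exit_time)
```

Running the same analysis on both protonation states gives a direct comparison of how long each form of glutathione stays within reach of the substrate.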
MD snapshot of FosA active site
Residues Shown to Affect FosA Activity and Interactions with the Modelled GSH
Residue   Interacting with GSH   Comments
Arg93     Yes
Lys90     Yes
Ser50     No                     FCN
Tyr39     Yes
Gln36     Yes
Trp34     Yes
Gln91     No
His64     No                     Mn-ligand
Tyr62     Yes
Cys48     No                     FCN
Tyr128    Yes
Arg119    Yes
Trp46     No                     FCN
Tyr65     No
Ser94     No                     FCN
Glu95     Yes
Ser98     No                     FCN
Tyr100    No                     FCN
Asp103    No
His107    No
Glu110    No                     Mn-ligand
Thr9      No                     FCN
Most of the observed changes in FosA activity can be identified with the interactions with FCN or the modelled binding of GSH.
QM/MM-model of FosA
Unrestricted
Restricted
Preliminary Energetics for FosA