Generating Synthetically Accessible Ligands by De Novo Design
Synthetic Sprout
A. Peter Johnson, Krisztina Boda, Attilla Ting, Jon Baber
SPROUT is the De Novo design system developed in Leeds
SPROUT components
Identification of potential interaction sites complementary to the receptor, i.e. H-bonding sites, hydrophobic sites, metal co-ordination sites, etc.
Automated docking of small fragments at the interaction sites.
Generation of hypothetical structures by linking the docked fragments together.
Tools for scoring, sorting and navigating the answer set.
[Figure: example 3D shapes of sites, showing hydrogen bond sites (H-bond acceptor site, H-bond donor site) and the boundary surface]
Docking of small fragments at target sites
Target sites are either generated by the SPROUT module HIPPO (or a similar system) or come from a pharmacophore hypothesis.
Small fragments with complementary functionality are selected by the user and automatically docked into the target site(s).
In addition to these small fragments, it is also possible to dock large fragments which are known to satisfy several of the target sites. Such a large fragment can then act as a “seed” for further growth.
A successful dock must place the small fragment at the target site with the correct orientation to satisfy any directional constraints.
The docking process is very fast and uses a novel hierarchical least squares optimisation procedure.
Structure generation
The SPIDER module links the target sites together in a pairwise fashion to make complete molecular structures which satisfy the target sites. It does this by sequentially adding new fragments in an exhaustive fashion.
There is no element of random choice in this process, which means that various heuristics have to be adopted to avoid a combinatorial explosion.
The main approximations employed are:
– There is a sampling of all the possible conformations about single bonds.
– Growth is only permitted from atoms/bonds which are closest to the target site which is to be reached.
Main algorithm of SPIDER
Multiphase heuristic graph search on a forest (set of trees)
Two trees are searched and removed in each phase, and a new tree is generated which contains skeletons connecting both sets of sites.
Each phase consists of a bi-directional search:
– Breadth First Search (BFS)
– Depth First Search (DFS)
Typical saving from bi-directional search with 10 successors and 6 levels: 2 × 10^3 << 10^6
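The saving quoted for bi-directional search can be checked with a short calculation. This is only a sketch: it assumes a uniform tree with the same number of successors at every node and counts just the deepest level reached by each search, which is how the slide's figures appear to be derived. The function names are illustrative.

```python
def nodes_at_depth(branching: int, depth: int) -> int:
    """Number of nodes on the deepest level of a uniform tree."""
    return branching ** depth


def unidirectional_cost(branching: int, depth: int) -> int:
    """One search has to expand all the way to the far side."""
    return nodes_at_depth(branching, depth)


def bidirectional_cost(branching: int, depth: int) -> int:
    """Two searches, one from each side, meet roughly halfway."""
    forward = depth // 2
    backward = depth - forward
    return nodes_at_depth(branching, forward) + nodes_at_depth(branching, backward)


one_way = unidirectional_cost(10, 6)
two_way = bidirectional_cost(10, 6)
print(one_way, two_way, one_way // two_way)  # 1000000 2000 500
```

With 10 successors and 6 levels, the two half-depth searches together visit about 500 times fewer nodes on their deepest levels than a single full-depth search.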
Connection of Partial Structures
Common template is located in two structures (one from each tree)
Structures are overlaid on the common template
Combined structure is docked to the united set of target sites also considering the steric constraints of the receptor site
Side effect joins are examined for validity (e.g. the fusion shown in the figure)
Navigating the answer sets
Estimated binding energy score
– Ranking the final de novo answer set
– Ranking and pruning (with caution) intermediate trees to reduce the combinatorial problem
Estimated ease of synthesis score
– Ranking the final de novo answer set
– Too slow (~1 structure per minute) to be useful for intermediate pruning
– Need faster methods for intermediate pruning
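Ranking and pruning by score can be sketched as follows. This is a hedged illustration, not SPROUT's implementation: the `Candidate` type, the score values and the convention that a lower (more negative) binding score is better are all assumptions. The last helper just turns the quoted rate of about one structure per minute into hours, to show why the ease of synthesis score is impractical for intermediate pruning.

```python
import heapq
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    binding_score: float  # hypothetical estimate; lower is assumed to be better


def rank(candidates):
    """Order an answer set from best to worst estimated binding score."""
    return sorted(candidates, key=lambda c: c.binding_score)


def prune(candidates, keep):
    """Keep only the `keep` best-scoring partial structures."""
    return heapq.nsmallest(keep, candidates, key=lambda c: c.binding_score)


def synthesis_scoring_hours(n_structures, minutes_per_structure=1.0):
    """Wall-clock cost of a score that handles ~1 structure per minute."""
    return n_structures * minutes_per_structure / 60.0


pool = [
    Candidate("s1", -5.2),
    Candidate("s2", -7.8),
    Candidate("s3", -3.1),
    Candidate("s4", -6.4),
]
print([c.name for c in rank(pool)])      # ['s2', 's4', 's1', 's3']
print([c.name for c in prune(pool, 2)])  # ['s2', 's4']
print(synthesis_scoring_hours(6000))     # 100.0
```

Pruning intermediate trees this way discards partial structures that might have led to good final answers, which is why the slide advises caution.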
Recent Advances
Parallelization of structure generation
– Farm of SGs or PCs
– SPROUT server
– Beowulf cluster, currently 11 dual processor 600 MHz Pentium III
VLSPROUT screens virtual libraries
SYNSPROUT generates synthetically accessible ligands
Receptor SPROUT generates potential synthetic receptors for small molecules
The perennial modeller's problem
Hypothetical ligands, including those predicted to bind very strongly, have no practical value unless they can be readily synthesised.
Our attempts to provide solutions:
– CAESA: post design estimation of synthetic accessibility
– SynSPROUT: synthetic constraints built into the de novo design process
– VLSPROUT: even greater synthetic constraints; only members of a specific virtual library are generated
Synthetic Sprout Approach
Pool of readily available starting materials, e.g. subset of ACD
Knowledge Base of reliable high yielding reactions, e.g. esterification, amide formation, reductive amination
Readily synthesisable putative ligand structures
VIRTUAL SYNTHESIS IN RECEPTOR CAVITY
Creation of Starting Material Libraries
– Obvious classes, e.g. amino acids
– “Drug like” starting materials selected by hand
– “Drug like” starting materials generated automatically by retrosynthetic analysis of drug databases
Retro-Synthetic Knowledge Base: Retro-Synthetic Rules
[Figure sequence: an example drug-like structure is fragmented by applying the rules below to numbered atoms; the ether disconnection yields fragments bearing OH, Cl or Br]
EXPLANATION Ether Formation
IF Ether
THEN disconnect bond between 2 and 3
     add-atom O[Hs=1], Cl, Br to 3 with –
     add-hydrogen to 2
END-THEN
EXPLANATION Amide Formation
IF Amide
THEN disconnect bond between 1 and 3
     add-atom O[Hs=1] to 1 with –
     add-hydrogen to 2
END-THEN
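The Amide Formation retro-synthetic rule can be illustrated with a small sketch on a toy molecular graph. Everything here is an assumption made for illustration: the `Mol` class, the implicit-hydrogen bookkeeping and the helper names are hypothetical, and the slide's atom numbering is not reproduced. The sketch cuts the C-N bond, adds an OH to the carbonyl carbon and adds a hydrogen to the nitrogen, which is one reading of the rule's intended chemistry.

```python
from dataclasses import dataclass, field


@dataclass
class Mol:
    elements: dict                              # atom id -> element symbol
    hydrogens: dict                             # atom id -> implicit H count
    bonds: dict = field(default_factory=dict)   # frozenset({a, b}) -> bond order

    def neighbours(self, a):
        return [next(iter(pair - {a})) for pair in self.bonds if a in pair]


def find_amide_bonds(mol):
    """Return (carbon, nitrogen) pairs joined by a single bond where C also has C=O."""
    hits = []
    for pair, order in mol.bonds.items():
        if order != 1:
            continue
        a, b = tuple(pair)
        for c, n in ((a, b), (b, a)):
            if mol.elements[c] == "C" and mol.elements[n] == "N":
                if any(mol.elements[o] == "O" and mol.bonds[frozenset((c, o))] == 2
                       for o in mol.neighbours(c)):
                    hits.append((c, n))
    return hits


def disconnect_amide(mol, c, n):
    """Cut the C-N bond, add O[Hs=1] to the carbon and a hydrogen to the nitrogen."""
    new = Mol(dict(mol.elements), dict(mol.hydrogens), dict(mol.bonds))
    del new.bonds[frozenset((c, n))]
    oh = max(new.elements) + 1
    new.elements[oh] = "O"
    new.hydrogens[oh] = 1
    new.bonds[frozenset((c, oh))] = 1
    new.hydrogens[n] += 1
    return new


def fragments(mol):
    """Connected components as sorted lists of atom ids."""
    seen, parts = set(), []
    for start in mol.elements:
        if start in seen:
            continue
        stack, comp = [start], []
        while stack:
            a = stack.pop()
            if a in seen:
                continue
            seen.add(a)
            comp.append(a)
            stack.extend(mol.neighbours(a))
        parts.append(sorted(comp))
    return parts


# N-methylacetamide, CH3-C(=O)-NH-CH3, as the test molecule.
nma = Mol(
    elements={1: "C", 2: "C", 3: "O", 4: "N", 5: "C"},
    hydrogens={1: 3, 2: 0, 3: 0, 4: 1, 5: 3},
    bonds={frozenset((1, 2)): 1, frozenset((2, 3)): 2,
           frozenset((2, 4)): 1, frozenset((4, 5)): 1},
)
c, n = find_amide_bonds(nma)[0]
pieces = disconnect_amide(nma, c, n)
print(fragments(pieces))  # [[1, 2, 3, 6], [4, 5]]
```

The two components correspond to acetic acid and methylamine, the kind of starting materials such a rule is meant to recover from a drug-like structure.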
Automatic Template Library Generation
[Figure: flowchart from 2D drug-like structures to the Synthetic Template Library. Labelled elements:]
• Input: 2D Drug-like Structures
• Retro-Synthetic Knowledge Base: retro-synthetic rules, retro-synthetic patterns
• Fragmentation
• Filter
• Clustering
• Ring Perception
• Single 3D Conformer Generation (Corina)
• Multiple Conformer Generation (Omega)
• Perception Knowledge Bases: aromatic, normalisation, hybridisation, H-bonding properties
• Synthetic Knowledge Base: functional groups
• Output: Synthetic Template Library
Automatic Chemical Perception
Information Perceived
– Aromatic atoms and bonds
– Normalised bonds
– Hybridisation including induced hybridisation
– H-Donors / Acceptors
– Number of hydrogens attached to an atom
– Number of connections to an atom
– Number of available electron pairs
– Charge at an atom
Example from Hybridisation knowledge base
CHEMICAL-LABEL <NitrogenWithLP--SP2>
X[SPCENTRE=2]-N[HS=0,1,2];[SPCENTRE=3]
EXPLANATION N with lone pair next to sp2 centre behaves as sp2.
IF NitrogenWithLP--SP2
THEN set-av-eps 2 to 0
     set-hybridisation 2 to 2
END-THEN
Rule based system where rules are encoded using the PATRAN language (similar to SMILES)
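The effect of the NitrogenWithLP--SP2 rule can be mimicked in a few lines. This is a sketch under stated assumptions, not the PATRAN engine: the `Atom` class and `apply_nitrogen_lp_rule` are hypothetical names, atoms are held in a plain dictionary with an adjacency list, and the requirement that the nitrogen still has an available electron pair is inferred from the rule's label.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    element: str
    sp: int          # hybridisation: 1, 2 or 3
    hydrogens: int
    lone_pairs: int  # "available electron pairs" in the slide's wording


def apply_nitrogen_lp_rule(atoms, neighbours):
    """Mark sp3 N atoms with a lone pair next to an sp2 centre as sp2.

    Matches are collected first, on the unmodified state, so the result
    does not depend on the order in which atoms are visited.
    """
    matched = [
        idx for idx, atom in atoms.items()
        if atom.element == "N"
        and atom.sp == 3
        and atom.hydrogens in (0, 1, 2)
        and atom.lone_pairs > 0
        and any(atoms[j].sp == 2 for j in neighbours[idx])
    ]
    for idx in matched:
        atoms[idx].lone_pairs = 0  # set-av-eps ... to 0
        atoms[idx].sp = 2          # set-hybridisation ... to 2
    return matched


# Atom 3 is an amide-like N beside a carbonyl carbon; atom 5 is an amine N on an sp3 carbon.
atoms = {
    1: Atom("C", 2, 0, 0),
    2: Atom("O", 2, 0, 2),
    3: Atom("N", 3, 2, 1),
    4: Atom("C", 3, 2, 0),
    5: Atom("N", 3, 2, 1),
}
neighbours = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1, 5], 5: [4]}
changed = apply_nitrogen_lp_rule(atoms, neighbours)
print(changed, atoms[3].sp, atoms[5].sp)  # [3] 2 3
```

Only the nitrogen attached to the sp2 carbon is relabelled; the amine nitrogen keeps its lone pair and its sp3 hybridisation.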
Perception - Binding Properties
O: original method, single atom based
C: current method, functional group based
– D - H donor
– A - H acceptor
– J - Joinable*
– H - Hydrophobic
– N - None
* According to reaction knowledge base
Synthetic Template
[Figure: synthetic template with binding property labels (A, D, H) on its atoms. Labelled functional groups: Primary Amine (Donor), Carboxylic Acid (Acceptor), Phenol (Acceptor-Donor)]
Synthetic Knowledge Base: Synthetic Rules
EXPLANATION Amide Formation 1
IF Carboxylic Acid INTER Primary Amine
THEN destroy-atom 3
     form-bond - between 1 and 5
     change-hybridization 5 to SP2
     Dihedral 0 0
     Dihedral 0 180
     Bond-length 1.35
END-THEN
Joining Rules
• Steps of formation
• Hybridization change
• Bond type
• Bond length
• Dihedral angles/penalties
[Figure: carboxylic acid and primary amine with atoms numbered 1 to 5 as referenced by the rule]
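One way to picture a joining rule is as a small record holding the items listed under Joining Rules. The sketch below is hypothetical: the `JoiningRule` class and its field names are invented for illustration, the bond length is assumed to be in angstroms, and each `Dihedral` line is read as a (penalty, angle) pair, which the slide does not state explicitly.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class JoiningRule:
    name: str
    groups: tuple              # the two functional groups that react
    bond_type: str             # "-" for a single bond
    bond_length: float         # assumed to be in angstroms
    dihedrals: tuple           # assumed (penalty, angle in degrees) pairs
    hybridisation_change: str  # new hybridisation of the joined atom


AMIDE_FORMATION_1 = JoiningRule(
    name="Amide Formation 1",
    groups=("Carboxylic Acid", "Primary Amine"),
    bond_type="-",
    bond_length=1.35,
    dihedrals=((0, 0), (0, 180)),
    hybridisation_change="SP2",
)


def matches(rule, group_a, group_b):
    """True if the two functional groups are the rule's reacting pair, in either order."""
    return sorted(rule.groups) == sorted((group_a, group_b))


def joinable_pairs(templates, rules):
    """templates maps a template name to its functional group label."""
    found = []
    for (a, group_a), (b, group_b) in combinations(templates.items(), 2):
        for rule in rules:
            if matches(rule, group_a, group_b):
                found.append((a, b, rule.name))
    return found


templates = {"t1": "Primary Amine", "t2": "Carboxylic Acid", "t3": "Phenol"}
print(joinable_pairs(templates, [AMIDE_FORMATION_1]))
# [('t1', 't2', 'Amide Formation 1')]
```

Keeping the geometric parameters on the rule itself means a join can be built directly with the right bond length and candidate dihedral angles once a matching pair is found.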
De-novo Design Using Synthetic Sprout
[Figure sequence: a ligand is grown between an acceptor site and a donor site by virtual synthesis]
1. Dock selected fragments to each site
2. Select two sites for connection
– DF search towards acceptor site
– BF search towards donor site
– Overlapping common fragment
Reactions used in the example:
1. Amide Formation (Carboxylic Acid - Primary Amine)
2. Reductive Amination (Carbonyl - Primary Amine)
New Problems - Hybridisation change
Hybridisation change (SP3 → SP2)
Hybridisation change in Amide Formation 2 (Carboxylic Acid - Secondary Amine): the secondary amine nitrogen becomes SP2
Hybridisation change (SP2 → SP3)
Hybridisation change in Reductive Amination 1 (Carbonyl - Primary Amine): the carbonyl carbon becomes SP3
Selection of Synthetic Reactions
– Amide Formation
– Ether Formation
– Ullmann reaction
– Amine Alkylation
– Ester Formation
– Aldol
– Wittig
– Imine
– C-S-C Formation
– Reductive Amination
CDK2
[Figure: CDK2 example. Figure labels: Docked: 890; Docked: 780; Docked: 935; Docked: 358; 1534; 1079; 71. The generated structure has joins numbered 1 to 5]
Library: 300 fragments / 1055 conformations
Run time: 10 h
1. Amide Alkylation 2 (Secondary Amide – Primary Alkyl Halide)
2. Wittig Reaction (Carbonyl = Primary Alkyl Halide)
3. Ether Formation 1 (Alcohol - Alcohol)
4 & 5. Amine Alkylation 1 (Primary Amine - Primary Alkyl Halide)
Act Score: -7.80
SynSPROUT
Current status
– Works well for small starting material libraries (low hundreds).
– Several libraries now built, including an amino acid library for peptide generation. A library from MDDR is being built.
– Potential for suggesting starting points for new combinatorial libraries.
Future work
– Extend the types of chemistry allowed.
– Develop algorithms which would permit the use of libraries of hundreds of thousands of starting materials (such as ACD).
– Parallelisation helps but on its own is not sufficient to cope with the inevitable combinatorial explosion.
Acknowledgements
Co-workers: Krisztina Boda, Attilla Ting, Jon Baber
Special thanks to OpenEye Scientific Software for providing access to OMEGA