David Evans, Eli Lilly, 'Field-Aligned Matched Pairs'
![Page 1: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/1.jpg)
David Evans and George Papadatos
Lilly Research Centre, Erl Wood Manor, Windlesham, UK
22nd September 2011
![Page 2: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/2.jpg)
• Discover new chemotypes
• Multiobjective space
• Isosteres in activity
• Improvements in properties
• Want to use multiple tools in same environment
• But understand what works when
![Page 3: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/3.jpg)
• Open Source Workflow tool – main client is free
• But support is available and can integrate commercial vendors + in-house code as nodes
• Have released many Erl Wood nodes to KNIME community site
• http://tech.knime.org/community/erlwood
![Page 4: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/4.jpg)
Nodes shown: FieldAlign, Xedmin, Xedex, FieldView
Xedmin
• XED minimization
• 2D -> 3D
Xedex
• Conformational analysis
FieldView
• Launches FieldView
• View field points + energies + other data
All nodes pass SDF
![Page 5: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/5.jpg)
FieldAlign
• Flexible alignment of query molecules onto template
![Page 6: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/6.jpg)
Company Confidential Copyright © 2008 Eli Lilly and Company
Process is more than just the database search
WHY ?
![Page 7: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/7.jpg)
Don’t want to load all databases onto all users’ PCs
Command-line search
SOAP Web Service
• Apache Tomcat
node
Platform-independent communication + secure intranet!
![Page 8: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/8.jpg)
• Read in pre-built hypothesis (MOE, Phase)
• Or sketch from template molecule
• Jmol-based visualizer
• Can also annotate and filter hits, aids manual inspection
Non-proprietary structure
![Page 9: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/9.jpg)
Maximum Unbiased Validation (MUV) dataset
• 17 targets, total 30 ligands and 15000 decoys per target, source: PubChem bioactivity data.
• Wide-ranging targets: hormone receptors, kinases, proteases, GPCRs plus others (e.g. HSP90, HIV RT).
• Unbiased for chemical analogues as MUV ligands pre-clustered with 2D fingerprint
• 1.16 compounds per scaffold class
MUV: J. Chem. Inf. Model., 2009, 49 (2), 169-184.
How well do automated pharmacophore methods do compared to 2D methods?
![Page 10: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/10.jpg)
![Page 11: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/11.jpg)
• Have looked at whole molecule similarity
• Is there more data if we find fragments which maintain activity?
• Matched molecular pair (MMP) analysis
• Fragments compounds and finds pairs where only one fragment differs
![Page 12: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/12.jpg)
The mining and statistical analysis of transformations and their impact on properties of interest (e.g. solubility or activity)
Table columns: left molecule | right molecule | transformation | ΔSolubility (mg/ml)
Example transformations shown: H >> F, Br >> OCH3
Example ΔSolubility values shown: -0.8, +1.2, +2.4
![Page 13: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/13.jpg)
It used to be a slow and computationally expensive process...
• Pair-wise maximum common substructure extraction – O(N²)
Recently a much more efficient algorithm was published
1) Cleave all acyclic single bonds, one by one
2) Index all the fragments (cf. book index)
3) Enumerate the values for each key: Mol A >> Mol B
(*in an automated and unsupervised way)
Hussain and Rea (2010). J. Chem. Inf. Model., 50 (3), 339-348.
Wagener and Lommerse (2006). J. Chem. Inf. Model., 46 (2), 677-685.
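The index and enumerate steps above can be sketched in plain Python. This is a minimal illustration, not the published implementation or the Erl Wood node: it assumes the cleaving step has already been done by a cheminformatics toolkit, and the record layout, function names and SMILES-like strings are all hypothetical.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical single-cut fragmentation output: (molecule ID, context, fragment).
# A toolkit would normally produce these by cleaving each acyclic single bond;
# the strings here are illustrative only.
single_cuts = [
    ("molA", "[*]c1ccccc1", "[*]F"),
    ("molB", "[*]c1ccccc1", "[*]Cl"),
    ("molC", "[*]c1ccccc1", "[*]OC"),
    ("molD", "[*]C1CCCCC1", "[*]F"),
]


def index_fragments(cuts):
    """Step 2: index the fragments, keyed on the unchanged context."""
    index = defaultdict(list)
    for mol_id, context, fragment in cuts:
        index[context].append((mol_id, fragment))
    return index


def enumerate_pairs(index):
    """Step 3: any two different fragments under one key form a matched pair."""
    pairs = []
    for context, entries in index.items():
        # permutations() yields both directions, A >> B and B >> A.
        for (left_id, left_frag), (right_id, right_frag) in permutations(entries, 2):
            if left_frag != right_frag:
                transformation = f"{left_frag}>>{right_frag}"
                pairs.append((left_id, right_id, transformation, context))
    return pairs


pairs = enumerate_pairs(index_fragments(single_cuts))
for pair in pairs:
    print(pair)
```

Because molecules are only compared within a shared key, the cost depends on how many molecules share each context rather than on every pair-wise comparison in the data set.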
![Page 14: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/14.jpg)
In: MolRegnos (IDs), structures (in RDKit format) and property values
Out: Matched pairs (left and right molecule, IDs, transformation, property values, ΔP, context, transformation atom count)
Available as an Erl Wood community contribution node
![Page 15: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/15.jpg)
Find isosteres in chEMBL
chEMBL – Database of published medicinal chemistry activity data
– Using chEMBL_10, total >1,000,000 compounds
Use here just human protein kinase inhibitors
Quality assurance for chEMBL data (SQL statement)
• Med. chem. friendly compounds, parent structure, not downgraded, confidence score = 9, exact IC50 or Ki values only (converted to pIC50/pKi), ~14K data points
• Compare biological values coming from the same assay ID only
Aggregate transformations; calculate and bin ΔpIC50s in 3 bins
• Good – Bad – Neutral (depending on a cut-off c = 0.4 log units)
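As a worked illustration of the conversion and the same-assay rule, the sketch below turns IC50 values into pIC50 and computes ΔpIC50 only when both measurements share an assay ID. The record layout, the function names and the right-minus-left sign convention are assumptions made for the example, not details taken from the workflow.

```python
import math


def pic50_from_nm(ic50_nm):
    """pIC50 = -log10(IC50 in mol/L); the input is in nanomolar."""
    return -math.log10(ic50_nm * 1e-9)


def delta_pic50(left, right):
    """Return pIC50(right) - pIC50(left), or None if the assay IDs differ.

    Each record is a hypothetical tuple: (compound ID, assay ID, IC50 in nM).
    """
    if left[1] != right[1]:
        return None  # values from different assays are not compared
    return pic50_from_nm(right[2]) - pic50_from_nm(left[2])


# Illustrative values only.
cpd_1 = ("cpd_1", "assay_42", 100.0)
cpd_2 = ("cpd_2", "assay_42", 10.0)
cpd_3 = ("cpd_3", "assay_77", 10.0)

print(delta_pic50(cpd_1, cpd_2))  # ten-fold lower IC50, so about +1 log unit
print(delta_pic50(cpd_1, cpd_3))  # different assay IDs, so None
```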
![Page 16: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/16.jpg)
• Each transformation has a neutral count
• Absolute value or percentage: NeutralCount%
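The binning and neutral count can be sketched as follows. The cut-off of 0.4 log units comes from the slides; treating a positive ΔpIC50 as "Good", placing values exactly on the cut-off in "Neutral", and the example Δ values are all assumptions for illustration.

```python
from collections import Counter, defaultdict

CUTOFF = 0.4  # log units, the cut-off c quoted on the slide


def bin_delta(delta, cutoff=CUTOFF):
    """Assign a delta to Good, Bad or Neutral (sign convention is assumed)."""
    if delta > cutoff:
        return "Good"
    if delta < -cutoff:
        return "Bad"
    return "Neutral"


def summarise(pairs):
    """Aggregate (transformation, delta) pairs into per-transformation counts."""
    counts = defaultdict(Counter)
    for transformation, delta in pairs:
        counts[transformation][bin_delta(delta)] += 1
    summary = {}
    for transformation, c in counts.items():
        total = sum(c.values())
        summary[transformation] = {
            "Good": c["Good"],
            "Bad": c["Bad"],
            "Neutral": c["Neutral"],
            "NeutralCount%": 100.0 * c["Neutral"] / total,
        }
    return summary


# Hypothetical delta values; only the transformation names echo the slides.
example_pairs = [
    ("H>>F", 0.1),
    ("H>>F", -0.2),
    ("H>>F", 0.9),
    ("H>>F", 0.3),
    ("Br>>OCH3", -1.1),
]

summary = summarise(example_pairs)
for transformation, row in summary.items():
    print(transformation, row)
```

A transformation with a high NeutralCount% mostly leaves activity unchanged, which is the sense in which the later slides call a fragment pair isosteric.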
![Page 17: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/17.jpg)
chEMBL workflow outputs isosteric fragments
How similar are isosteres in 2D fingerprint space? In field space?
Could fields help us find unexpected isosteres?
![Page 18: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/18.jpg)
• 1802 fragment pairs from chEMBL_10 kinase data set
• 481 with no rotatable bonds left or right
• Simplifies conformational analysis
• For each fragment pair
1. Swap attachment points for adamantyl
2. FieldAlign to get field similarity (use adamantyl to constrain overlay)
3. RDKit fingerprint similarity – topological Daylight-esque
4. Correct similarities for adamantyl
• Are there isosteric pairs with high field similarity but low RDKit similarity?
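Once both similarities have been computed for each fragment pair, the closing question reduces to a simple filter. The sketch below assumes the similarities are already corrected for adamantyl and stored per pair; the class, the field names, the threshold values and the example records are hypothetical and would need tuning against real data.

```python
from dataclasses import dataclass


@dataclass
class FragmentPair:
    transformation: str
    field_sim: float    # field similarity from the alignment step
    rdkit_sim: float    # 2D fingerprint similarity
    neutral_pct: float  # NeutralCount% for the transformation


def non_obvious_isosteres(pairs, min_field=0.7, max_2d=0.3, min_neutral_pct=60.0):
    """Keep isosteric pairs that look alike in field space but not in 2D.

    All three thresholds are illustrative assumptions.
    """
    hits = [
        p for p in pairs
        if p.field_sim >= min_field
        and p.rdkit_sim <= max_2d
        and p.neutral_pct > min_neutral_pct
    ]
    # Largest gap between field and 2D similarity first.
    return sorted(hits, key=lambda p: p.field_sim - p.rdkit_sim, reverse=True)


# Made-up records for illustration.
example = [
    FragmentPair("fragA>>fragB", 0.82, 0.15, 72.0),
    FragmentPair("fragC>>fragD", 0.90, 0.85, 80.0),  # similar in 2D as well
    FragmentPair("fragE>>fragF", 0.75, 0.25, 40.0),  # not isosteric enough
    FragmentPair("fragG>>fragH", 0.71, 0.28, 65.0),
]

hits = non_obvious_isosteres(example)
for hit in hits:
    print(hit.transformation, hit.field_sim, hit.rdkit_sim)
```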
![Page 19: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/19.jpg)
[Scatter plot: Field Sim vs RDKit Sim, point size by Neutral Count % (larger = more isosteric). Highlighted regions: pairs with high field similarity but low 2D similarity; pairs with high field and 2D similarity.]
![Page 20: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/20.jpg)
[Scatter plot: Field Sim vs RDKit Sim, point size by Neutral Count %. Only transformations with >60% isosteric examples. Highlighted: Thiophene -> Phenol]
![Page 21: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/21.jpg)
[Scatter plot: Field Sim vs RDKit Sim, point size by Neutral Count %. Only transformations with >60% isosteric examples. Highlighted: Imidazole -> Morpholine?]
![Page 22: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/22.jpg)
[Scatter plot: Field Sim vs RDKit Sim, point size by Neutral Count %. Only transformations with >60% isosteric examples. Highlighted: some small fragments]
![Page 23: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/23.jpg)
WEE1 kinase, PDB 2I06
Non-proprietary structure (from PDB)
Regions labelled: Solvent-exposed, Buried
![Page 24: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/24.jpg)
[Scatter plot: Field Sim vs RDKit Sim, point size by Neutral Count %. Only transformations with >60% isosteric examples. Highlighted: Me-tetrazole -> oxadiazole]
![Page 25: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/25.jpg)
[Scatter plot: Field Sim vs RDKit Sim, point size by Neutral Count %. Only transformations with >60% isosteric examples. Highlighted: Thiophene -> phenol]
![Page 26: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/26.jpg)
Non-proprietary structure (from PDB)
![Page 27: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/27.jpg)
• 6299 data points from thermodynamic solubility assay
• 423 single-point transformations
• 215 transformations with no rotatable bonds
• Aggregate transformations; calculate and bin ΔlogS in 3 bins
• Good – Bad – Neutral (c = 0.3 log units)
• Are there transformations which increase solubility with low field similarity but high RDKit similarity?
![Page 28: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/28.jpg)
[Scatter plot: Field Sim vs RDKit Sim, point size by Good Count %. Only transformations with >60% boosting examples. Highlighted: ring contraction + twist?]
![Page 29: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/29.jpg)
[Scatter plot: Field Sim vs RDKit Sim, point size by Good Count %. Only transformations with >60% boosting examples. Highlighted: big boost from morpholine]
![Page 30: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/30.jpg)
• Can mine chEMBL data for non-obvious isosteres
• Will other data sets find more?
• Would like to improve workflow to make isostere data set for 3D similarity comparison
• Improve fragmentation/conformer/alignment handling?
• Need to include whole molecule?
• Need 3D binding site data as well to confirm isosterism?
• KNIME platform developing
• Virtual screening and evaluation environment
• Rapid experimentation with varied tools
• http://tech.knime.org/community/erlwood
![Page 31: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/31.jpg)
George Papadatos
Juliette Pradon
Hina Patel
Nikolas Fechner
David Thorner
Michael Bodkin
KNIME, chEMBL + Cresset !
![Page 32: David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs](https://reader035.vdocument.in/reader035/viewer/2022081502/554e833ab4c905f66a8b562b/html5/thumbnails/32.jpg)
ROC curves for retrieval of >66% isosteric groups
• Field similarity performs better than RDKit
• But AUC = 0.68
• Workflow not optimized for this purpose
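For reference, an AUC such as the one quoted above can be computed from similarity scores and binary labels without plotting the curve. This is a generic sketch of the rank-based definition, not the evaluation code used in the workflow; the scores and labels are made up, with a true label standing for a pair that belongs to a >66% isosteric group.

```python
def roc_auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical similarity scores and isostere labels.
scores = [0.9, 0.8, 0.4, 0.6, 0.3]
labels = [True, False, True, False, False]

auc = roc_auc(scores, labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported 0.68 in context.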