Post on 19-Jun-2020

Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME

Iván Solt

Solutions for Cheminformatics

Drug Discovery Strategies for known targets

High-Throughput

Screening (HTS)

Cells or recombinant

protein

Fluorescent or

luminescent readout

Automated,

miniaturized

Thousands of

samples / day

Number of primary actives: ~1%

Virtual Screening (VS)

Ligand or structure based

Virtual or real libraries

Similarity search, 2D or

3D

Can lead to thousands of

possible actives: further

processing needed

Measurement:

Enrichment ratio, ROC

curves for known actives
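The two measurements named above can be sketched in a few lines. This is a minimal illustration, assuming a ranked screening list in which known actives are labelled 1 and decoys 0, best score first; the function names and the toy ranking are hypothetical and not taken from the slides.

```python
from typing import Sequence


def enrichment_factor(ranked_labels: Sequence[int], fraction: float) -> float:
    """Enrichment factor in the top `fraction` of a ranked list.

    ranked_labels: 1 for a known active, 0 for a decoy, best score first.
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, round(n_total * fraction))
    hit_rate_top = sum(ranked_labels[:n_top]) / n_top
    hit_rate_all = n_actives / n_total
    return hit_rate_top / hit_rate_all


def roc_auc(ranked_labels: Sequence[int]) -> float:
    """Area under the ROC curve for a ranked list (score ties are ignored)."""
    n_actives = sum(ranked_labels)
    n_decoys = len(ranked_labels) - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")
    actives_seen = 0
    pairs_ranked_correctly = 0
    for label in ranked_labels:
        if label == 1:
            actives_seen += 1
        else:
            # Every active already seen outranks this decoy.
            pairs_ranked_correctly += actives_seen
    return pairs_ranked_correctly / (n_actives * n_decoys)


# Toy ranking: 3 actives among 10 compounds (illustrative only).
ranking = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
ef_20 = enrichment_factor(ranking, 0.2)
auc = roc_auc(ranking)
print(f"EF(20%) = {ef_20:.2f}, ROC AUC = {auc:.3f}")
```

An enrichment factor above 1 means the top of the list holds more actives than random selection would; an AUC of 0.5 corresponds to random ranking.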

Virtual Library Design Workflow

DB

DB

Databases

Reactions

Molecules

Queries

Compound selection

Similarity searches

Substructure searches

Enumeration

Fuse fragments

R-group composition

Reaction enumeration

Library analysis

Clustering

2D similarity screen

3D Shape similarity

screen

Fragmentation

R-group decomposition

Fragmentation

Reagent clipping

Find or Virtually Create Candidates

• Virtual screening of existing compounds

Pros:

– Fast

– Hits are readily available for in vitro experiments

Cons:

– Limitation on available compounds

• De novo design

Pros:

– No limitation on virtual compound space

– Structural novelty

Cons:

– Are hits synthetically available?

Virtual Screening Workflow

DB

DB

Molecules

in-house or

commercially

available

1. Reactions

virtual synthetic

path

Synthetically

Accessible

Compounds

2. Filtering

3. Similarity

Search

4. 3D alignment

5. Clustering

in vivo experiment?

Step 1: Reaction Enumeration

• Reaction schema for accessible

syntheses

• Combinatorial or sequential enumeration

• Reaction rules: formulate and apply public and in-house chemical knowledge

– Selectivity with tolerance

– Reactivity

– Exclusion rules

EXCLUDE: match(reactant(1), "[Cl,Br,I]C(=[O,S])C=C") or
    match(reactant(0), "[H][O,S]C=[O,S]") or
    match(reactant(0), "[P][H]") or
    (max(pka(reactant(0), filter(reactant(0), "match('[O,S;H1]')"), "acidic")) > 14.5) or
    (max(pka(reactant(0), filter(reactant(0), "match('[#7:1][H]', 1)"), "basic")) > 0)
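How exclusion rules prune a combinatorial enumeration can be sketched in plain Python. This is not the Reactor implementation: the `Reactant` record, its precomputed fields `has_excluded_group` and `max_acidic_pka`, and the two rules are hypothetical stand-ins for the `match(...)` and `pka(...)` terms in the rule above, which a real toolkit would evaluate on the structures themselves.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterator, List, Tuple


@dataclass(frozen=True)
class Reactant:
    name: str
    # Hypothetical precomputed values; a cheminformatics toolkit would derive
    # them from the structure (substructure match, pKa prediction).
    has_excluded_group: bool
    max_acidic_pka: float


# An exclusion rule sees the whole reactant tuple and returns True to reject it.
ExclusionRule = Callable[[Tuple[Reactant, ...]], bool]

exclusion_rules: List[ExclusionRule] = [
    lambda rs: rs[1].has_excluded_group,     # stands in for match(reactant(1), ...)
    lambda rs: rs[0].max_acidic_pka > 14.5,  # stands in for the acidic pKa test
]


def enumerate_pairs(
    pool_a: List[Reactant], pool_b: List[Reactant], rules: List[ExclusionRule]
) -> Iterator[Tuple[Reactant, Reactant]]:
    """Combinatorial enumeration: every A x B pair that no rule excludes."""
    for pair in product(pool_a, pool_b):
        if not any(rule(pair) for rule in rules):
            yield pair


# Illustrative reactant pools (made-up values).
pool_a = [Reactant("a1", False, 10.0), Reactant("a2", False, 16.0)]
pool_b = [Reactant("b1", False, 0.0), Reactant("b2", True, 0.0)]

rules_on = [(a.name, b.name) for a, b in enumerate_pairs(pool_a, pool_b, exclusion_rules)]
rules_off = [(a.name, b.name) for a, b in enumerate_pairs(pool_a, pool_b, [])]
print(len(rules_off), "pairs without rules,", len(rules_on), "with rules:", rules_on)
```

Passing an empty rule list reproduces the full theoretical product, which makes the effect of each rule easy to inspect.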

Step 1: Reaction Enumeration

Reaction rules ON

• Fewer results than

theoretical

• Unfeasible starting

materials eliminated

• Feasible products only

• Custom rules can be

added to increase

selectivity

Reaction rules OFF

• More results

• Best for debugging

purposes

• Products may be incorrect because chemical rules are neglected

Step 1: Reaction Enumeration

Step 2: Filtering

• Lead likeness, drug likeness

– Chemical Terms

• Could it fit to the active centre?

– Basic analysis: size, mass...

• Could it get to the active centre?

ADME properties:

solubility, pKa, polar surface,

partition coefficients...

• Structural filtering

– e.g. reactive groups

• Toxicity, environmental concerns,

etc...
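A property-based filter of this kind can be sketched as follows. The slides do not name a specific rule set; Lipinski's rule of five is used here purely as a familiar example of a drug-likeness filter, and the `Properties` record and the candidate values are hypothetical. In practice the properties would come from calculator plugins rather than being typed in.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Properties:
    name: str
    mol_weight: float   # molecular mass in Da
    logp: float         # octanol/water partition coefficient
    h_donors: int       # hydrogen bond donor count
    h_acceptors: int    # hydrogen bond acceptor count


def passes_rule_of_five(p: Properties, max_violations: int = 1) -> bool:
    """Lipinski's rule of five; by convention one violation is tolerated."""
    violations = sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ])
    return violations <= max_violations


# Illustrative candidates with made-up property values.
candidates: List[Properties] = [
    Properties("cand_1", 320.4, 2.1, 2, 5),   # no violations
    Properties("cand_2", 612.7, 6.3, 4, 9),   # mass and logP violations
    Properties("cand_3", 540.0, 3.0, 1, 8),   # mass violation only
]

kept = [p.name for p in candidates if passes_rule_of_five(p)]
print(kept)
```

Structural filters (reactive groups, toxicophores) would be added as further predicates in the same list comprehension.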

Calculator plugins

• Elemental Analysis: Elemental Analysis
• IUPAC Name: Structure to Name
• Protonation: pKa, Microspecies, Isoelectric Point
• Partitioning: logP, logD
• Charge: Charge, Polarizability, Orbital Electronegativity
• Isomers: Tautomerization, Stereoisomer
• Conformation: Conformer, Flexible 3D Alignment, Molecular Dynamics
• Geometry: Topology Analysis, Geometry, Polar Surface Area (2D), Molecular Surface Area (3D)
• Markush: Markush Enumeration
• Other: Hydrogen Bond Donor-Acceptor, Huckel Analysis, Refractivity, Structural Framework, Resonance

Step 3: Similarity search

Screen 2D +

Descriptor package

Screen against known

bioactives

• Chemical Fingerprints

Topology

• Pharmacophore Fingerprints:

Custom atomic properties + their

topological relationship

• ECFP/FCFP

Similarity searches

• H-bond donors / acceptors

• Cationic / anionic groups

• Hydrophobic groups

• Aromatic groups

• etc.

• Tanimoto, Euclidean, Tversky metrics

• Metrics optimization

[Figure: similarity values under the regular and the optimized Tanimoto metric; values shown: 0.47, 0.55, 0.57, 0.28, 0.20, 0.06]
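The three metrics listed on this slide have simple definitions on binary fingerprints. The sketch below assumes a fingerprint is represented as the set of its on-bit indices; the example bit sets are made up. The Tversky weights `alpha` and `beta` are the kind of parameters a metric optimization could tune, although the slides do not say how the optimized Tanimoto is actually derived.

```python
import math
from typing import Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Shared on-bits divided by on-bits present in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def tversky(a: Set[int], b: Set[int], alpha: float, beta: float) -> float:
    """Asymmetric similarity; alpha weights bits unique to a, beta those unique to b."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


def euclidean(a: Set[int], b: Set[int]) -> float:
    """Euclidean distance between binary vectors: sqrt of the differing bits."""
    return math.sqrt(len(a ^ b))


# Illustrative on-bit sets.
query = {1, 2, 3, 4}
candidate = {3, 4, 5}

t = tanimoto(query, candidate)
tv_sym = tversky(query, candidate, 1.0, 1.0)    # reduces to Tanimoto
tv_dice = tversky(query, candidate, 0.5, 0.5)   # reduces to the Dice coefficient
d = euclidean(query, candidate)
print(t, tv_sym, tv_dice, d)
```

With `alpha = beta = 1` Tversky equals Tanimoto; unequal weights make the score favor substructure-like or superstructure-like matches.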

Step 4: Screen 3D

• Align the candidates to the known active in 3D

• Treat the candidate as flexible!

• Consider pharmacophore atom types (align cationic to cationic, etc.)!

• Problem: complicated conformational space

Step 4: Screen 3D

Simple sampling of the

conformational space:

Minimum and maximum distance

between atom pairs in the full

torsion space

Select atoms

• Colors (e.g. pharmacophore types)

• Topological features (e.g. longest chain start/end/center)

• Ring centers (aromatic, aliphatic)

Calculate

• Min/max internal distance ranges

• Distance histograms for selected

atoms

• Only once for each molecule
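The idea of per-molecule distance ranges can be illustrated with a small sketch. The slide refers to ranges over the full torsion space; the code below only approximates this from a finite, hypothetical sample of conformers given as coordinate lists, so it is a simplification rather than the Screen3D procedure.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
Conformer = Sequence[Point]


def distance_ranges(
    conformers: Sequence[Conformer], selected: Sequence[int]
) -> Dict[Tuple[int, int], Tuple[float, float]]:
    """Min/max distance for each pair of selected atoms across the conformers."""
    ranges: Dict[Tuple[int, int], Tuple[float, float]] = {}
    for i, j in combinations(selected, 2):
        dists: List[float] = [math.dist(conf[i], conf[j]) for conf in conformers]
        ranges[(i, j)] = (min(dists), max(dists))
    return ranges


# Two made-up conformers of a three-atom chain: extended and bent.
extended = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]

ranges = distance_ranges([extended, bent], selected=[0, 1, 2])
for pair, (lo, hi) in ranges.items():
    print(pair, round(lo, 3), round(hi, 3))
```

Because the ranges depend only on the molecule, they can be computed once and reused for every alignment against a different query.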

Step 4: Screen 3D

"Hybrid" alignment: separate translation and rotation from torsions

• Robust and fast

• Needs a good guess for the atom-atom mapping:

– Same colors

– Distance ranges must be allowed for all mapped pairs

– Triangle inequality must be fulfilled for any atom triplet
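The three conditions on a candidate atom-atom mapping can be written as a feasibility check. This is one possible reading of the slide, not the actual Screen3D code: the data layout (color dictionaries, distance ranges keyed by atom pairs) is hypothetical, and the triangle test is applied to the overlap of the query and candidate ranges.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Range = Tuple[float, float]
Ranges = Dict[Tuple[int, int], Range]


def _get(ranges: Ranges, i: int, j: int) -> Range:
    """Look up a pair range regardless of index order."""
    return ranges[(i, j)] if (i, j) in ranges else ranges[(j, i)]


def mapping_is_feasible(
    mapping: List[Tuple[int, int]],      # (query atom, candidate atom) pairs
    colors_q: Dict[int, str], colors_c: Dict[int, str],
    ranges_q: Ranges, ranges_c: Ranges,
) -> bool:
    # 1. Mapped atoms must carry the same color (e.g. pharmacophore type).
    if any(colors_q[q] != colors_c[c] for q, c in mapping):
        return False
    # 2. Distance ranges of every mapped pair must overlap.
    shared: Dict[Tuple[int, int], Range] = {}
    for a, b in combinations(range(len(mapping)), 2):
        q_lo, q_hi = _get(ranges_q, mapping[a][0], mapping[b][0])
        c_lo, c_hi = _get(ranges_c, mapping[a][1], mapping[b][1])
        lo, hi = max(q_lo, c_lo), min(q_hi, c_hi)
        if lo > hi:
            return False
        shared[(a, b)] = (lo, hi)
    # 3. Triangle inequality: no side's lower bound may exceed the sum of
    #    the other two sides' upper bounds.
    for a, b, c in combinations(range(len(mapping)), 3):
        ab, ac, bc = shared[(a, b)], shared[(a, c)], shared[(b, c)]
        if ab[0] > ac[1] + bc[1] or ac[0] > ab[1] + bc[1] or bc[0] > ab[1] + ac[1]:
            return False
    return True


# Illustrative data: a rigid query (min == max) and a flexible candidate.
colors_q = {0: "donor", 1: "acceptor", 2: "aromatic"}
colors_c = {10: "donor", 11: "acceptor", 12: "aromatic"}
ranges_q = {(0, 1): (3.0, 3.0), (0, 2): (4.0, 4.0), (1, 2): (5.0, 5.0)}
ranges_c = {(10, 11): (2.5, 3.5), (10, 12): (3.0, 6.0), (11, 12): (4.5, 7.0)}
mapping = [(0, 10), (1, 11), (2, 12)]

feasible = mapping_is_feasible(mapping, colors_q, colors_c, ranges_q, ranges_c)
print(feasible)
```

Rejecting infeasible mappings this way needs no 3D coordinates, which is why it can serve as a fast pre-check before any actual alignment.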

Screen 3D: Test on DUD

[Figure: average of 1% enrichments; y-axis: % of the actives retrieved (0 to 30)]

Giganti et al. J. Chem. Inf. Model. 2010, 50, 992

Screen 3D: Test on DUD

[Figure: average of 10% enrichments; y-axis: % of the actives retrieved (0 to 100)]

Giganti et al. J. Chem. Inf. Model. 2010, 50, 992

Screen 3D: Test on DUD

Speed: average time per compound (without precalculations)

Method               Time
ChemAxon Screen3D    0.07
ROCS                 0.5
FRED                 1.0
ICMsim               2.4
Surflex-sim          6.7
FlexS                6.9
Surflex-dock         14.6
FLEXX                15.6
ICM                  17.7

Hardware: Intel Xeon 2.4 GHz; Intel Q6600 2.4 GHz

Giganti et al. J. Chem. Inf. Model. 2010, 50, 992

Step 5: Clustering, library analysis

Wide range of methods

• Unsupervised, agglomerative

clustering

• Hierarchical and non-hierarchical

methods

• Similarity based and structure

based techniques

Flexible search options

• Tanimoto and Euclidean metrics,

weighting

• Maximum common substructure

identification

• Chemical property matching including atom type, bond type, hybridization, charge

JKlustor
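As a small illustration of similarity-based, non-hierarchical clustering, the sketch below uses a simple leader algorithm with a Tanimoto threshold. It is not the JKlustor implementation; the fingerprint sets, the molecule names, and the threshold are made up for the example.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as on-bit index sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def leader_cluster(fps: Dict[str, Set[int]], threshold: float) -> List[List[str]]:
    """Assign each molecule to the first cluster whose leader is similar enough.

    A molecule that matches no existing leader starts a new cluster and
    becomes its leader. The result depends on the input order.
    """
    clusters: List[Tuple[Set[int], List[str]]] = []
    for name, fp in fps.items():
        for leader_fp, members in clusters:
            if tanimoto(fp, leader_fp) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((fp, [name]))
    return [members for _, members in clusters]


# Illustrative fingerprints.
fingerprints = {
    "m1": {1, 2, 3, 4},
    "m2": {1, 2, 3, 5},
    "m3": {7, 8, 9},
    "m4": {7, 8, 9, 10},
}

clusters = leader_cluster(fingerprints, threshold=0.5)
print(clusters)
```

A hierarchical method would instead merge the closest clusters step by step and let the user cut the resulting tree at a chosen similarity level.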

JChem Extensions in KNIME

• Workflow management in KNIME

• JChem extension nodes

developed by InfoCom,

Japan

• Constantly developing

palette of available

JChem tools

JChem Extensions in KNIME

• IO – molecule and reaction import, export,

drawing

• Visualization

• Manipulators

Calculator plugins

Reactor

Similarity and structure-based search

Fingerprint calculation

Fragmentation

Clustering

R-group composition, decomposition

Standardization

...

• Database management

• Molecular format conversion

• Web search services

Step 1: Reaction Enumeration

Step 2: Filtering

Step 3: Similarity search

JChem Extensions in KNIME

DB

DB

1. Reactions

virtual synthetic

path

Synthetically

Accessible

Compounds

2. Filtering

3. Similarity

Search

4. 3D alignment

in vivo experiment?

1. Import reactants

2. Enumerate reaction

• Carry out topology

analysis

3. Calculate properties

• Filter

4. Screen for similarity against

known active

5. Export results

Conclusions

• Virtual libraries and virtual screening are essential

tools in modern Drug Discovery

• No special hardware, short experiment cycles,

variety of approaches

• Database of synthetically accessible compounds

can be designed with reaction libraries and custom

in-house synthetic knowledge

• Powerful 3D alignment techniques allow high-

throughput conformational screening with great

efficiency

• Straightforward integration into KNIME

Contributors

• Tímea Polgár

• Attila Tajti

www.chemaxon.com
