chemoinformatics in drug design

Chemoinformatics in Drug Design Biological Sequence Analysis, May 6, 2011 Irene Kouskoumvekaki, Associate Professor, Computational Chemical Biology, CBS, DTU-Systems Biology


Chemoinformatics in Drug Design

Biological Sequence Analysis, May 6, 2011

Irene Kouskoumvekaki, Associate Professor, Computational Chemical Biology, CBS, DTU-Systems Biology

2 CBS, Department of Systems Biology

Computational Chemical Biology group

Irene Kouskoumvekaki

Associate Professor

Olivier Taboureau

Associate Professor

Sonny Kim Nielsen

PhD student

Kasper Jensen

PhD student

Tudor Oprea

Guest Professor

Ulrik Plesner

master student


Definition: Chemoinformatics

Gathering and systematic use of chemical information, and application of this information to predict the behavior of unknown compounds in silico.

data → prediction


Definition: A drug candidate…

... is a (ligand) compound that binds to a biological target (protein, enzyme, receptor, ...) and in this way either initiates a process (agonist) or inhibits it (antagonist).

The structure/conformation of the ligand is complementary to the space defined by the protein’s active site.

The binding is caused by favorable interactions between the ligand and the side chains of the amino acids in the active site (electrostatic interactions, hydrogen bonds, hydrophobic contacts, ...).


Drug Discovery

In vitro / in silico studies → Animal studies → Clinical studies


The Drug Discovery Process

Chemoinformatics


MKTAALAPLFFLPSALATTVYLA

GDSTMAKNGGGSGTNGWGEYL

ASYLSATVVNDAVAGRSAR…(etc)

We know the structure of the biological target

We identify/predict the binding pocket

Challenge:

To design an organic molecule that binds strongly enough to the biological target to modulate its activity.

New drug candidate


Example: Alzheimer’s disease

What is it? Alzheimer's is a disease that causes failure of brain functions and dementia. It starts with poor memory and difficulty performing common everyday activities.

How do you get it? Alzheimer's disease is the result of malfunctioning neurons in different parts of the brain. This, in turn, is due to an imbalance in the concentration of neurotransmitters.


Example: Alzheimer’s disease

How can we treat it?

Acetylcholine neurotransmitter

Drug against Alzheimer’s


Old School Drug discovery process

Screening collection (10^6 compounds)

HTS → Actives (10^3 actives)

Follow-up → Hits (1-10 hits)

Hit-to-lead → Lead series (0-3 lead series)

Lead-to-drug → Drug candidate (0-1) → Clinical trials

High rate of false positives!


Failures


Drug discovery in the 21st Century

in vitro: Diverse set of molecules tested in the lab

in silico + in vitro: Computational methods to select subsets (to be tested in the lab) based on prediction of drug-likeness, solubility, binding, pharmacokinetics, toxicity, side effects, ...


The Lipinski ‘rule of five’ for drug-likeness prediction

• Octanol-water partition coefficient (logP) ≤ 5
• Molecular weight ≤ 500
• Number of hydrogen bond acceptors (HBA) ≤ 10
• Number of hydrogen bond donors (HBD) ≤ 5

If two or more of these rules are violated, the compound might have problems with oral bioavailability. (Lipinski et al., Adv. Drug Delivery Rev., 23, 1997, 3.)
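The four thresholds above can be applied as a simple filter. The following is a minimal sketch in Python, assuming the descriptors (logP, molecular weight, HBA, HBD) have already been computed elsewhere. The function names and the example values are hypothetical and are not taken from the slides.

```python
def lipinski_violations(logp, mol_weight, hba, hbd):
    """Return the rule-of-five criteria that a compound violates."""
    checks = {
        "logP <= 5": logp <= 5,
        "molecular weight <= 500": mol_weight <= 500,
        "HBA <= 10": hba <= 10,
        "HBD <= 5": hbd <= 5,
    }
    return [name for name, passed in checks.items() if not passed]


def may_have_bioavailability_problems(logp, mol_weight, hba, hbd):
    """Flag a compound when two or more of the four rules are violated."""
    return len(lipinski_violations(logp, mol_weight, hba, hbd)) >= 2


# Hypothetical descriptor values, for illustration only.
compounds = {
    "small_polar": (2.0, 350.0, 5, 2),
    "large_greasy": (6.2, 620.0, 8, 3),
}
for name, descriptors in compounds.items():
    print(name, lipinski_violations(*descriptors),
          may_have_bioavailability_problems(*descriptors))
```

A filter like this is cheap enough to run over a whole screening collection before any more expensive prediction is attempted.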


Major Aspects of Chemoinformatics


•Information Acquisition and Management: Methods for collecting data (mainly experimental). Development of databases for storage and retrieval of information.

•Information Use: Data analysis, correlation and model building.

•Information Application: Prediction of molecular properties relevant to chemical and biochemical sciences.


Information Acquisition and Management


Small molecule databases


[Figure: Growth in PubChem Substances & Compounds, May 2005 to September 2007; y-axis from 0 to 20,000,000; series: Compound, Substance]

Recent count: Substance: 72,156,631; Compound: 28,807,320; Rule of 5: 20,692,980


Searching in PubChem


Structural representation of molecules


Information Use

Beyond the Lipinski Rule of 5...

•Chemometrics: The application of mathematical or statistical methods to chemical data (simple, linear methods), e.g. Principal Component Analysis

•Machine Learning: The design and development of algorithms and techniques that allow computers to learn (complex, non-linear algorithms), e.g. Artificial Neural Networks, K-means clustering
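To make the chemometrics bullet concrete, here is a minimal sketch of Principal Component Analysis in plain Python. It finds the first principal component of a small descriptor table by power iteration on the covariance matrix. The function names, the start vector and the descriptor values are assumptions made for illustration. Descriptors are usually standardized first in practice, which this sketch omits.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return the unit vector along which the centered data vary the most."""
    n = len(rows)
    dim = len(rows[0])
    # Center every descriptor column on its mean.
    means = [sum(row[j] for row in rows) / n for j in range(dim)]
    centered = [[row[j] - means[j] for j in range(dim)] for row in rows]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    vec = [1.0 / (j + 1) for j in range(dim)]  # arbitrary start vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to the data variance")
        vec = [x / norm for x in nxt]
    return vec


def project(rows, component):
    """Project each centered row onto the given component (its score)."""
    n = len(rows)
    dim = len(component)
    means = [sum(row[j] for row in rows) / n for j in range(dim)]
    return [sum((row[j] - means[j]) * component[j] for j in range(dim))
            for row in rows]


# Hypothetical two-descriptor table, one row per compound.
descriptors = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]]
pc1 = first_principal_component(descriptors)
print(pc1, project(descriptors, pc1))
```

The scores returned by `project` give each compound a single coordinate along the direction of largest variance, which is the usual starting point for plotting a compound set in reduced chemical space.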

Information Application

Prediction of Solubility, ADME & Toxicity

Solid drug → Dissolution (solubility) → Drug in solution → Membrane transfer (absorption) → Absorbed drug → Liver extraction (metabolism) → Systemic circulation


Prediction of biological activity/selectivity


Prediction models at CBS


Virtual screening


Virtual Screening Flavors

1D filters, e.g. Lipinski's Rule of Five

LIGAND-BASED

TARGET-BASED


Molecular similarity on the Chemical Space

• Similar Property Principle – Molecules having similar structures and properties are expected to exhibit similar biological activity. (Not always true!)

• Thus, molecules that are located closely together in the chemical space are often considered to be functionally related.


Ligand-based VS: Fingerprints

– widely used similarity search tool
– consists of descriptors encoded as bit strings
– bit strings of query and database are compared using a similarity metric such as the Tanimoto coefficient

MACCS fingerprints: 166 structural keys that answer questions of the type:

• Is there a ring of size 4?

• Is at least one F, Br, Cl, or I present?

where the answer is either TRUE (1) or FALSE (0)


Tanimoto Similarity

Tc = c / (a + b - c) = 9 / (10 + 9 - 9) = 0.9

or 90% similarity
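The calculation above can be reproduced in a few lines. This is a minimal sketch, assuming each fingerprint is stored as a Python set of the bit positions that are on, with a and b the number of bits set in each fingerprint and c the number of bits set in both. The function name and the bit positions are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    a = len(fp_a)         # bits set in the first fingerprint
    b = len(fp_b)         # bits set in the second fingerprint
    c = len(fp_a & fp_b)  # bits set in both
    if a + b - c == 0:
        return 0.0        # convention chosen here for two empty fingerprints
    return c / (a + b - c)


# Query with 10 bits on; candidate with 9 bits on, all shared with the query.
query = set(range(10))
candidate = set(range(9))
print(tanimoto(query, candidate))
```

With these two fingerprints a = 10, b = 9 and c = 9, which are the same numbers used in the worked example.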


Ligand-based VS: Pharmacophore


Structure-based Virtual Screening: Docking

Given a protein and a database of ligands, docking scores determine which ligands are most likely to bind.

Binding pocket of target Library of small compounds


Energy of binding

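[Figure: binding pocket of target and library of small compounds, with binding energies of -10 kcal/mol, +1 kcal/mol, -1 kcal/mol and +10 kcal/mol]

ΔG = ΔH - TΔS

Contributions: van der Waals (vdW), hydrogen bonding (Hbond), desolvation energy, electrostatic energy, torsional free energy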
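The relation ΔG = ΔH - TΔS can be evaluated directly to compare compounds. The sketch below assumes ΔH in kcal/mol, ΔS in kcal/(mol·K) and a temperature of 298.15 K. The ligand names and thermodynamic values are hypothetical, chosen only to show that a more negative ΔG ranks first.

```python
def binding_free_energy(delta_h, delta_s, temperature=298.15):
    """Return dG = dH - T*dS (kcal/mol) for dH in kcal/mol and dS in kcal/(mol*K)."""
    return delta_h - temperature * delta_s


# Hypothetical (dH, dS) pairs, for illustration only.
ligands = {
    "ligand_A": (-12.0, -0.010),
    "ligand_B": (-4.0, 0.005),
}

# More negative dG means more favorable binding, so sort in ascending order.
ranked = sorted(ligands, key=lambda name: binding_free_energy(*ligands[name]))
for name in ranked:
    print(name, round(binding_free_energy(*ligands[name]), 3))
```

In this example ligand_A pays an entropic penalty but still binds more favorably because its enthalpy term dominates.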


“Docking” and “Scoring”

•Docking involves the prediction of the binding mode of individual molecules

– Goal: new ligand orientation closest in geometry to the observed X-ray structure (Conformations of ligands in complexes often have very similar geometries to minimum-energy conformations of the isolated ligand)

•Scoring ranks the ligands using some function related to the free energy of association of the two partners, looking at attractive and repulsive regions and taking into account steric and hydrogen bonding interactions

– Goal: new ligand score closest in value to the docking score of the X-ray structure


Docking algorithms

•Most exhaustive algorithms
– Accurate prediction of a binding pose

•Most efficient algorithms
– Docking of small ligand databases in reasonable time

•Rapid algorithms
– Virtual high-throughput screening of millions of compounds


Scoring functions

•Molecular mechanics force field-based

Score is estimated by summing the strength of intermolecular van der Waals and electrostatic interactions between all atoms of the ligand-target complex

– CHARMM, AMBER

•Empirical-based

Based on summing various types of interactions between the two binding partners (hydrogen bonds, hydrophobic, …)

– ChemScore, GlideScore, AutoDock

•Knowledge-based

Based on statistical observations of intermolecular close contacts from large 3D databases, which are used to derive potentials or mean forces

– PMF, DrugScore
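The force field-based idea described above (summing van der Waals and electrostatic terms over ligand-target atom pairs) can be sketched as follows. This is a simplified illustration, not the CHARMM or AMBER implementation. It assumes a 12-6 Lennard-Jones term with a single hypothetical epsilon and sigma for every atom pair, a Coulomb term with a constant dielectric, distances in Å and partial charges in units of e.

```python
import math

# Converts e^2/Å to kcal/mol for charges in e and distances in Å.
COULOMB_CONSTANT = 332.06


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy (kcal/mol) at distance r."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def coulomb(r, q1, q2, dielectric=1.0):
    """Electrostatic energy (kcal/mol) between two point charges."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def force_field_score(ligand_atoms, target_atoms, epsilon=0.1, sigma=3.5):
    """Sum vdW and electrostatic terms over all ligand-target atom pairs.

    Each atom is a tuple (x, y, z, partial_charge). Using one epsilon and
    sigma for all pairs is a simplification made for this sketch.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for tx, ty, tz, tq in target_atoms:
            r = math.dist((lx, ly, lz), (tx, ty, tz))
            total += lennard_jones(r, epsilon, sigma) + coulomb(r, lq, tq)
    return total


# Hypothetical two-atom ligand next to a two-atom pocket fragment.
ligand = [(0.0, 0.0, 0.0, 0.3), (1.5, 0.0, 0.0, -0.3)]
pocket = [(0.0, 4.0, 0.0, -0.4), (1.5, 4.0, 0.0, 0.4)]
print(force_field_score(ligand, pocket))
```

A lower (more negative) total corresponds to a more favorable predicted interaction, which is how such a score would be used to rank docked poses or ligands.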


Ligand-based VS

✓ good enrichment of candidate molecules from the screening of large databases with less computational effort

× too coarse to pick up subtle differences induced by small structural variations in the ligands

✓ many options for model refinement

Structure-based VS

✓ better fit for analyzing smaller sets of compounds, especially in retrospective analysis

✓ includes all possible interactions, thus allowing the detection of unexpected binding modes

× changing parameters for docking algorithms and scores is demanding

Mutants are being developed:

• pharmacophore methods with information about the target’s binding site

• docking programs that incorporate pharmacophore constraints

Combination of pharmacophore, docking and molecular dynamics (MD) screens


http://www.vcclab.org/lab/edragon/


Public Web Chemoinformatics Tools

http://pasilla.health.unm.edu/


ChemSpider: www.chemspider.com


Open Babel: http://openbabel.org/wiki/Main_page

D. Vidal et al., Ligand-based Approaches to In Silico Pharmacology, in Chemoinformatics and Computational Chemical Biology, Ed. J. Bajorath, Springer, 2011


Questions?