
C3 Introduction into CI; SS 03/1st lecture © Gasteiger et al.

History and Challenges of Chemoinformatics

Johann Gasteiger

Computer-Chemie-Centrum

University of Erlangen-Nürnberg

D-91052 Erlangen, Germany

www2.chemie.uni-erlangen.de/


Overview

• the scope of chemoinformatics

• the beginnings

• a field of its own

• scientific challenges

• political challenges


Synthesis of Properties

The most fundamental and lasting objective of synthesis is not production of new compounds but production of properties.

George S. Hammond, Norris Award Lecture, 1968


Fundamental Questions in Chemistry

• What structure do I need for a certain property? (structure-activity relationships)

• How do I make this structure? (synthesis design)

• What is the product of my reaction? (reaction prediction, structure elucidation)


[Figure: diagram relating starting materials, chemical structure, and physical, chemical, and biological properties; labels: synthesis planning, reaction prediction, structure elucidation]


Chemoinformatics - Why?

• complex relationships

– structure - biological activity

– chemical reactivity

• amount of information

– many millions of compounds and reactions

– many millions of publications


Number of Compounds in Chemistry

[Figure: compounds published in CAS; y-axis: compounds (millions), 0 to 50; x-axis: year, 1965 to 2005]


From Data to Knowledge

[Figure: pyramid with the levels data, information, and knowledge (bottom to top); labels: measurement, calculation; context; generalization; deductive learning; inductive learning]


Chemoinformatics: Definition

"The use of information technology and management has become a critical part of the drug discovery process. Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization."

F. K. Brown, Annual Reports in Medicinal Chemistry 1998, 33, 375-384


Chemoinformatics: Definition

The application of informatics methods to solve chemical problems


The Scope of Chemoinformatics

• structure representation and searching

• data analysis and chemometrics

• molecular modeling

• spectra analysis and structure elucidation

• reaction representation and searching

• reaction modeling and synthesis design


Application Areas for Chemoinformatics

• drug design

• analytical chemistry

• chemical engineering

• inorganic chemistry

• medicinal chemistry

• organic chemistry

• physical chemistry

• theoretical chemistry


Chemoinformatics – An Old Discipline

• structure representation: 1965, Morgan

• structure elucidation: 1965, Sasaki, Munk, DENDRAL

• synthesis design: 1970, Corey & Wipke, Ugi, Gelernter, Hendrickson

• molecular modeling: 1970, Langridge, Marshall

• data analysis / chemometrics: 1970, Kowalski, Wold


Structure Representation

• European industry: BASF, Hoechst, ICI, Thomae, BASIC, IDC (1965 - )

• Wiswesser Line Notation (1969 - )

• Chemical Abstracts Service: Morgan Algorithm 1965

• Sheffield: M. Lynch, P. Willett (1970 - )

• Paris: J. E. Dubois, DARC (1970 - )

• Munich: I. Ugi, J. Gasteiger, C. Jochum (1972 - )
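
As a rough illustration of the Morgan algorithm listed on this slide, the following Python sketch computes extended-connectivity values for the atoms of a hydrogen-suppressed molecular graph. It covers only the iterative summation step, not the canonical atom numbering that follows it, and the function name and the adjacency-list input are assumptions made for this example, not material from the lecture.

```python
def morgan_extended_connectivity(adjacency):
    """Return an extended-connectivity value for every atom.

    adjacency maps an atom index to the list of its neighbours
    (hypothetical input format chosen for this sketch).
    """
    # Start with the number of neighbours of each atom.
    values = {atom: len(nbrs) for atom, nbrs in adjacency.items()}
    n_classes = len(set(values.values()))
    while True:
        # New value of an atom = sum of the current values of its neighbours.
        new_values = {atom: sum(values[n] for n in nbrs)
                      for atom, nbrs in adjacency.items()}
        new_n_classes = len(set(new_values.values()))
        # Stop as soon as an iteration no longer distinguishes more atoms,
        # and keep the values from the previous iteration.
        if new_n_classes <= n_classes:
            return values
        values, n_classes = new_values, new_n_classes


# Carbon skeleton of n-pentane: C0-C1-C2-C3-C4
pentane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(morgan_extended_connectivity(pentane))
```

Symmetry-equivalent atoms (the two terminal carbons, or the two carbons next to them) end up with equal values; a complete implementation would still need tie-breaking rules to produce a unique numbering.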


Computer-Assisted Structure Elucidation

• DENDRAL: C. Djerassi, J. Lederberg, E. Feigenbaum (1965)

• CHEMICS: S. Sasaki (1965)

• M. Munk (1968)

• C. Steinbeck (1998)


Computer-Assisted Synthesis Design

1969  Corey + Wipke      OCSS        LHASA, SECS
1973  Ugi + Gasteiger    CICLOPS     WODCA, THERESA
1971  Hendrickson        SYNGEN
1976  Bersohn            SYNSUP
1977  Gelernter          SYNCHEM
1985  Hanessian          CHIRON
1988  Zefirov            FLAMINGOES
1988  Sasaki + Funatsu   AIPHOS


Visualization of Chemical Structures

LHASA 1970


Data Analysis Methods

• Chemometrics: B. Kowalski (1970)

• PLS: S. Wold (1978)

• Self-organizing neural network: Kohonen (1983)

• Backpropagation Algorithm: Rumelhart, Hinton (1987)


Databases

• Chemical Abstracts Service (1975)

• DARC System (1980)

• Cambridge CSD (1984)

• Inorganic Structures Database (1985)

• Beilstein (1990)

• Gmelin (1990)

• ChemInformRX (1991)

• SpecInfo (1991)


Common Topics: Structure Representation

• data storage and retrieval

• property prediction

• drug design

• synthesis design

• spectra analysis and prediction


Common Topics: Data Analysis Methods

• property prediction

• drug design

• analytical chemistry

• spectra analysis and prediction


Common Topics

• representation of chemical structures

• searching structures in databases

• visualization of chemical structures

• representation of chemical reactions

• data analysis methods


Handbook of Chemoinformatics: From Data to Knowledge

J. Gasteiger (Editor)

65 authors, 73 contributions

4 volumes, 1900 pages

Wiley-VCH, Weinheim (August 2003)


Chemical Structures - Challenges

• representation of Markush structures (patents)

• representation of polymers

• conformational flexibility (bioactive conformation)

• similarity searching (beyond fingerprints and Tanimoto coefficient)
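
For readers unfamiliar with the baseline that the similarity-searching challenge wants to go beyond, here is a minimal Python sketch of the Tanimoto coefficient on binary fingerprints. Each fingerprint is represented as the set of its "on" bit positions; that representation, the function name, the example bit positions, and the convention of returning 0.0 for two empty fingerprints are all assumptions made for this illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient c / (a + b - c) for two sets of on-bits.

    a and b are the numbers of bits set in each fingerprint,
    c is the number of bits set in both.
    """
    a, b = len(fp_a), len(fp_b)
    c = len(fp_a & fp_b)
    if a + b - c == 0:
        return 0.0  # both fingerprints empty: a convention, not a standard
    return c / (a + b - c)


# Hypothetical fingerprints given as positions of bits that are set.
fp_1 = {1, 2, 3, 4}
fp_2 = {3, 4, 5}
print(tanimoto(fp_1, fp_2))  # 2 shared bits / 5 distinct bits = 0.4
```

The value ranges from 0 (no bits in common) to 1 (identical fingerprints) and depends only on which substructure bits are present, which is one reason it cannot capture aspects such as conformational flexibility.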


Proteins - Challenges

• scoring functions for docking

• flexibility of proteins


[Figure: gene and protein (Bioinformatics); lead and drug (Chemoinformatics)]


Information Acquisition - Challenges

• electronic laboratory notebooks

• publishing chemical information (3D structures, spectra)

• publishing and searching on the internet

• text mining

• optical character recognition

• input of chemical structure (handwriting, voice)


Data Mining - Challenges

• descriptor elimination

• model validation

• automatic model building

• definition of applicability domain


Chemical Reactions - Challenges

• modeling of chemical reactivity

• prediction of the course of chemical reactions

• synthesis design

• prediction of metabolism/degradation (abiotic and biotic)

• analysis of biochemical pathways


What is a Chemical Reaction?

[Reaction scheme: oxaloacetate + acetyl-CoA → citrate; EC No.: 4.1.3.7]

• the bioinformatician: an event influenced by a gene, a protein

• the computer scientist: a context-sensitive graph rewriting rule

• the chemist: an event breaking and making bonds
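
The chemist's and the computer scientist's views of a reaction can be brought together in a small Python sketch: a reaction is treated as a set of bond-order changes applied to a molecular graph. This is only an illustration under stated assumptions. The atom labels, the helper names `bond` and `apply_bond_changes`, and the reduction of the reaction to its reaction centre (a C-H bond adding across a C=O double bond, with the rest of both molecules omitted) are hypothetical and not taken from the lecture.

```python
def bond(a, b):
    """Unordered atom pair used as the key of a bond (hypothetical helper)."""
    return frozenset((a, b))


def apply_bond_changes(bonds, changes):
    """Apply bond-order changes (a simple rewriting rule) to a bond table.

    bonds:   dict mapping bond(a, b) -> bond order
    changes: dict mapping bond(a, b) -> change in bond order
    """
    result = dict(bonds)
    for pair, delta in changes.items():
        order = result.get(pair, 0) + delta
        if order < 0:
            raise ValueError(f"cannot break a missing bond: {sorted(pair)}")
        if order == 0:
            result.pop(pair, None)  # bond fully broken
        else:
            result[pair] = order    # bond made or bond order changed
    return result


# Reaction centre only; all atom labels are assumptions for this sketch.
reactants = {bond("C_keto", "O_keto"): 2, bond("C_methyl", "H"): 1}
rule = {
    bond("C_keto", "O_keto"): -1,    # C=O becomes C-O
    bond("C_methyl", "H"): -1,       # C-H bond is broken
    bond("C_methyl", "C_keto"): +1,  # new C-C bond is made
    bond("O_keto", "H"): +1,         # new O-H bond is made
}
products = apply_bond_changes(reactants, rule)
print(products)
```

The rule says nothing about genes or enzymes, which is the part of the picture the bioinformatician's view adds.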


Biochemical Pathways


Pathway Searching

[Figure: network of numbered reactions and compounds. Compounds shown: Glucose 6-phosphate, 6-Phosphogluconolactone, 6-Phosphogluconate, Ribulose 5-phosphate, Xylulose 5-phosphate, Ribose 5-phosphate, Glyceraldehyde 3-phosphate, Sedoheptulose 7-phosphate, Erythrose 4-phosphate, Fructose 6-phosphate; also NADP+, NADPH, H+, H2O, CO2]

maximize NADPH production

5[r10] 5[r12] 10[r14] 10[r1] 10[r2] 10[r3] 8[r4] 3[r6] 3[r8]

2[c13] 20[c2] 10[c6] 1[c8] ---> 20[c4] 20[c5] 10[c9] 3[c12]
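
The coefficient lines on this slide appear to give a weight for each reaction and the resulting net conversion. Because the slide's reaction and compound numbering is not recoverable from the transcript, the Python sketch below shows the same bookkeeping on a different, smaller example: the three reactions of the oxidative branch of the pentose phosphate pathway, written in their usual textbook form. The reaction names, compound abbreviations, and unit weights are assumptions made for this illustration, and the optimization step (choosing the weights that maximize NADPH production) is not shown.

```python
from collections import Counter


def net_equation(reactions, weights):
    """Combine weighted reactions into one net stoichiometry.

    reactions: dict name -> {compound: coefficient}, negative = consumed
    weights:   dict name -> how often the reaction runs
    """
    net = Counter()
    for name, weight in weights.items():
        for compound, coeff in reactions[name].items():
            net[compound] += weight * coeff
    # Intermediates that are produced and consumed equally cancel out.
    return {compound: value for compound, value in net.items() if value != 0}


# Textbook stoichiometries; names and abbreviations are chosen for this sketch.
reactions = {
    "G6PDH": {"G6P": -1, "NADP+": -1, "6PGL": 1, "NADPH": 1, "H+": 1},
    "lactonase": {"6PGL": -1, "H2O": -1, "6PG": 1},
    "6PGDH": {"6PG": -1, "NADP+": -1, "Ru5P": 1, "CO2": 1, "NADPH": 1, "H+": 1},
}
weights = {"G6PDH": 1, "lactonase": 1, "6PGDH": 1}
net = net_equation(reactions, weights)
print(net)
```

With these weights, one glucose 6-phosphate yields two NADPH; a pathway search would vary the weights over the whole network, subject to the intermediates balancing, to find the combination with the highest NADPH yield.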


Prediction of Properties - Challenges

• physical

– spectra (CASE)

– color of dyes etc.

• chemical

– chemical reactivity

• biological

– toxicity

• risk assessment (chemical + biological) REACH


Application Areas for Chemoinformatics - Challenges

• drug design

• analytical chemistry

• chemical engineering

• inorganic chemistry

• medicinal chemistry

• organic chemistry

• physical chemistry

• theoretical chemistry


Teaching

Sheffield

UMIST

Strasbourg

Erlangen

Indiana University


Textbooks on Chemoinformatics

• V. Gillet, A. Leach

• J. Gasteiger, T. Engel

• J. Bajorath


Chemoinformatics - A Textbook

J. Gasteiger, T. Engel (Editors)

650 pages

Wiley-VCH, Weinheim (September 2003)


Teaching

• define curriculum in chemoinformatics

• decide which chemoinformatics content has to go into regular chemistry curricula


Cooperation Industry - Academia

• industry: generate data

• academia: develop methods

provide academia access to data


Funding

• increase awareness of the importance of chemoinformatics

• go into committees


Get Organized!

Chemometrics Society

QSAR Society

FECS Working Party: Computational Chemistry

Chemoinformatics Society