/slides/cactvs/acswashington2000.ppt © Ihlenfeldt 2000 C3

Chemical Visualization: The Art of Drawing a Chemical Structure

W. D. Ihlenfeldt

Computer-Chemistry-Center

University of Erlangen-Nuremberg

Erlangen, Germany

Topics

• Motivation

• Drawing One Structure

• Drawing a Set of Structures

• Visualizing Structure Attributes

Motivation: 3D is not Everything

3D structure displays are valuable tools, but ...

• limited to viewing part of structure

• unsuited for quick comparisons

Motivation: Defending 2D Plots

2D structure plots are still the core of chemical information:

• show complete structure

• easy recognition of patterns

Motivation

• So why can't you just draw it with [your editor of choice]?

• Because nobody wants to draw millions of structures!

The trend towards combinatorial libraries and structure design requires fast, reliable, and automatic drawing of enormous numbers of compounds.

Unfortunately, 2D structure drawing is hard. [Helson, Rev. Comp. Chem., 13, 313, 1999]

Topics

• Motivation

• Drawing One Structure

• Drawing a Set of Structures

• Visualizing Structure Attributes

The Rules

• Chemists know what a good plot should look like:

• Not a simple projection from 3D

• Complex, ill-defined set of rules

Basic Structure Data

• Draw a structure from connectivity

• Atom/bond table

• No coordinates to begin with

• Often specified in linear notation such as SMILES:

O1C(/C=C/[C@@](/C=C/[C@@H](C(=C\CC[C@H]([C@@H]1C)C)/C)OC)(O)C)=O
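
An atom/bond table without coordinates can be pictured as follows. This is only a minimal sketch, not the CACTVS data model: the `Atom`, `Bond`, and `neighbors` names are hypothetical, and the example molecule (the heavy atoms of ethanol, SMILES CCO) was picked for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    index: int
    element: str

@dataclass(frozen=True)
class Bond:
    a: int      # index of the first atom
    b: int      # index of the second atom
    order: int  # 1 = single, 2 = double, 3 = triple

# Heavy-atom connection table for ethanol (SMILES: CCO).
# Note that there are no coordinates anywhere in this data.
atoms = [Atom(0, "C"), Atom(1, "C"), Atom(2, "O")]
bonds = [Bond(0, 1, 1), Bond(1, 2, 1)]

def neighbors(atom_index, bonds):
    """Return the sorted indices of all atoms bonded to atom_index."""
    result = []
    for bond in bonds:
        if bond.a == atom_index:
            result.append(bond.b)
        elif bond.b == atom_index:
            result.append(bond.a)
    return sorted(result)
```

A layout algorithm has to derive every 2D position from this connectivity alone.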

The Complications

• Orientation of ring systems

• Close atoms by colliding fragments

The Complications

• Rules for bridge systems

• Cages

The Complications

• 1,3/1,4-embedded rings

• Crowded connection points

The Complications

• Choice of stereo attributes (wedges, etc.)

• Trans-bonds in rings

A Simple Structure?

• dual 1,4-embedded rings

• close contacts

• no solution on 120° grid

• 16-membered ring

• implicit constraints

• with ±60°, ±90°, ±45°, ±35° angles

• 8^14 naive patterns (4.4·10^12)

• max. for real-time response: 250,000 (2.5·10^5)
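
The gap between the naive pattern count and the real-time budget is simple arithmetic. The sketch below reads the pattern count as 8 to the power of 14, that is, 8 angle choices at each of 14 positions; the slide gives only the totals, so that interpretation of the exponent is an assumption.

```python
angle_choices = 8          # +/-60, +/-90, +/-45, +/-35 degrees
free_positions = 14        # assumed number of independent angle choices
naive_patterns = angle_choices ** free_positions
realtime_budget = 250_000  # maximum number of patterns for real-time response

# Factor by which the search space must shrink to stay interactive.
reduction_factor = naive_patterns / realtime_budget

print(f"{naive_patterns:.2e} naive patterns")
print(f"required reduction factor: {reduction_factor:.1e}")
```

The search space has to shrink by roughly seven orders of magnitude, which is why exhaustive enumeration is not an option.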

Controlling the Number of Patterns

• closing distance and angle

• cis/trans information

• total distance criterion

• full loop criterion

• no clustering of non-60° angles

• pseudo energy selector

• favoring 60°, symmetry
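
The effect of one such criterion can be shown on a toy case. The sketch below is not the CACTVS implementation: it enumerates rings as walks of unit-length bonds with a fixed set of turn angles and applies only a closing-distance test (a branch is dropped once the start atom is farther away than the remaining bonds can reach). The function name and the hexagon example are hypothetical.

```python
import math

def count_ring_closures(n_bonds, turns_deg, prune=True):
    """Count closed walks of n_bonds unit-length bonds.

    Returns (closed_walks, states_visited).
    """
    closed = 0
    visited = 0

    def extend(x, y, heading_deg, bonds_placed):
        nonlocal closed, visited
        visited += 1
        remaining = n_bonds - bonds_placed
        dist = math.hypot(x, y)  # distance back to the start atom
        if remaining == 0:
            if dist < 1e-9:
                closed += 1
            return
        # Closing-distance criterion: the start can no longer be reached.
        if prune and dist > remaining + 1e-9:
            return
        for turn in turns_deg:
            h = heading_deg + turn
            extend(x + math.cos(math.radians(h)),
                   y + math.sin(math.radians(h)), h, bonds_placed + 1)

    # The first bond is fixed along the x axis to remove rotational duplicates.
    extend(1.0, 0.0, 0.0, 1)
    return closed, visited

closed_full, visited_full = count_ring_closures(6, (60, -60), prune=False)
closed_pruned, visited_pruned = count_ring_closures(6, (60, -60), prune=True)
```

Both runs find the same closed rings, but the pruned run visits fewer partial patterns. The other criteria on the slide would cut the tree further.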

A Simple Structure?

[Figure: the structure as drawn by Isis/Draw 2.2, ChemSketch 3.5, ChemDraw 5.0, and CACTVS 3.113]

More Examples

Symmetry-preserving optimization by synchronous bond bending

More Examples

Optimization by bond shortening

More Examples

Cage system analysis (not a template)

More Examples

Complex ring system and bond arrangement analysis

More Examples: Complex Ring Systems

C\C(C(CC)C(C)/C([H])=C/C([H])=C(CO3)/C2(O)C3C(OC)C(C)=CC2C4=O)=C([H])/CC1OC5(CCC(C)C(C(C)C)O5)CC(O4)C1

Complex Ring Analysis

• Not really a norbornane-type bridge system

• 3 trans ring bonds

• 2 implicit cis bonds

• 2 implicit trans bonds

The CACTVS Plot Algorithms

• Defines state of the art

• Fully stereo-aware, in chains and rings

• Triple bonds in rings

• Automatic wedge assignment

• Intelligent pseudo-energy optimizer

C[C@H]1[C@H](C)CC\C=C/[C@H](OC)/C=C/[C@@](C)(O)/C=C/C(O1)=O

Open Access

Test it!

http://www2.ccc.uni-erlangen.de/services/gif.html (final version soon)

Not just a Passive Image...

Web Interface

With image maps: portable, user-friendly selection/manipulation alternative

C[C@H]1[C@H](C)CC\C=C/[C@H](OC)/C=C/[C@@](C)(O)/C=C/C(O1)=O

More Examples

[Figure: additional structure plots]

Topics

• Motivation

• Drawing One Structure

• Drawing a Set of Structures

• Visualizing Structure Attributes

Just a typical small screening experiment

1148 structures

346 Clusters

cluster #19218 compounds

fingerprint clustering (144 bits)

A Closer Look

• Plots have similar characteristics

• Structures not optimally aligned

Aligning Multiple Structures

• Move Morgan-center atom to origin

• Operate on hexagonal grid

• Center on grid
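
The first step can be sketched as below. The slides do not define "Morgan-center atom"; this sketch assumes it is the atom with the highest extended-connectivity value from the classic Morgan iteration, and the function names and the five-atom chain are hypothetical.

```python
def morgan_values(adjacency):
    """Extended-connectivity values. Iteration stops as soon as the number
    of distinct values no longer increases (classic Morgan scheme)."""
    values = [len(nbrs) for nbrs in adjacency]  # start from atom degrees
    classes = len(set(values))
    while True:
        new_values = [sum(values[j] for j in nbrs) for nbrs in adjacency]
        new_classes = len(set(new_values))
        if new_classes <= classes:
            return values
        values, classes = new_values, new_classes

def center_on_atom(coords, atom_index):
    """Translate all coordinates so that atom_index sits at the origin."""
    cx, cy = coords[atom_index]
    return [(x - cx, y - cy) for x, y in coords]

# Five-atom zigzag chain: 0-1-2-3-4
adjacency = [[1], [0, 2], [1, 3], [2, 4], [3]]
coords = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0), (3.0, 0.5), (4.0, 0.0)]

values = morgan_values(adjacency)
center = max(range(len(values)), key=values.__getitem__)
centered = center_on_atom(coords, center)
```

For the chain, the middle atom gets the highest value and ends up at the origin. Snapping to the hexagonal grid is not shown.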

Aligning Multiple Structures

Operations on structures:

• Move to a neighboring grid cell (6)

• Rotate by a multiple of 60° (5)

• Rotation around center or hetero atom

• Horizontal and vertical flip

• 120° flip of a non-terminal, non-ring bond

• Shuffle to other sequence position
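
The rigid-body operations in this list are plain 2D transforms. The following sketch assumes structures are stored as lists of (x, y) tuples and that rotation is about the origin; the function names are hypothetical, and the bond flip and the sequence shuffle are not shown.

```python
import math

def rotate(coords, k):
    """Rotate counter-clockwise by k * 60 degrees about the origin."""
    angle = math.radians(60 * k)
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c) for x, y in coords]

def flip_horizontal(coords):
    """Mirror across the vertical axis (x becomes -x)."""
    return [(-x, y) for x, y in coords]

def flip_vertical(coords):
    """Mirror across the horizontal axis (y becomes -y)."""
    return [(x, -y) for x, y in coords]

def neighbor_offsets(spacing=1.0):
    """The six translations to adjacent positions on a hexagonal grid."""
    return [(spacing * math.cos(math.radians(60 * k)),
             spacing * math.sin(math.radians(60 * k))) for k in range(6)]

def translate(coords, offset):
    """Shift all coordinates by offset = (dx, dy)."""
    dx, dy = offset
    return [(x + dx, y + dy) for x, y in coords]
```

With k running from 1 to 5, `rotate` yields the five non-trivial orientations; k = 6 returns the starting orientation up to rounding.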

Aligning Multiple Structures

Merit function:

• Multiple occupation of grid cells

• Bonus for overlay of ring atoms, aromatic atoms, hetero atoms

• Multiplicative rating within cell

• Rate only against earlier entries in sequence, weight 1/n

Optimizer:

• Tabu search
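
One possible reading of this merit function is sketched below. The slides give no formula, so everything concrete here is an assumption: the bonus factors, the trait names, the representation of a structure as a mapping from grid cell to atom traits, and the reading of "weight 1/n" as dividing by the number of earlier entries. The optimizer itself is not shown.

```python
# Hypothetical bonus factors, multiplied together within one grid cell.
BONUS = {"ring": 2.0, "aromatic": 1.5, "hetero": 3.0}

def cell_rating(traits_a, traits_b):
    """Rating for two atoms sharing a cell: 1.0 times a bonus per shared trait."""
    rating = 1.0
    for trait in traits_a & traits_b:
        rating *= BONUS[trait]
    return rating

def merit(structures):
    """structures: list of dicts mapping grid cell -> frozenset of traits.

    Each entry is rated only against the entries before it in the sequence,
    and its score is weighted by 1/n, with n the number of earlier entries.
    """
    total = 0.0
    for n, current in enumerate(structures[1:], start=1):
        score = 0.0
        for earlier in structures[:n]:
            for cell in current.keys() & earlier.keys():
                score += cell_rating(current[cell], earlier[cell])
        total += score / n
    return total

first = {(0, 0): frozenset({"ring", "hetero"}), (1, 0): frozenset({"ring"})}
aligned = {(0, 0): frozenset({"ring", "hetero"}), (2, 0): frozenset()}
shifted = {(1, 0): frozenset({"ring", "hetero"}), (3, 0): frozenset()}

merit_aligned = merit([first, aligned])
merit_shifted = merit([first, shifted])
```

In this toy case the placement that overlays the ring hetero atoms scores higher than the shifted one, which is the behavior the bonus terms are meant to reward.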

A Simple Example

[Figure: structure plots of the example compounds]

The Cleaned Cluster

Topics

• Motivation

• Drawing One Structure

• Drawing a Set of Structures

• Visualizing Structure Attributes

Atomic Charges - Wild Growth...

More Methods to Display Charges...

And Another Method!

2D Voronoi Polygons

• Polygons around atoms

• Full use of available space

• Extra hidden points to limit area

• Encode attribute by color codes

• Immediate recognition
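
The idea can be illustrated with a discrete (pixel-based) Voronoi assignment, which is a simplification of true polygon construction and not necessarily how CACTVS does it. In this sketch each pixel belongs to the nearest site, and pixels owned by a hidden point are left unassigned; the function names, coordinates, and grid size are hypothetical.

```python
def voronoi_owner(px, py, sites):
    """Return the index of the site nearest to pixel (px, py)."""
    return min(range(len(sites)),
               key=lambda i: (sites[i][0] - px) ** 2 + (sites[i][1] - py) ** 2)

def rasterize(sites, n_atoms, width, height):
    """Label each pixel with its owning atom index.

    The first n_atoms entries of sites are atoms; the rest are hidden
    points. Pixels owned by a hidden point are labeled None.
    """
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            owner = voronoi_owner(x, y, sites)
            row.append(owner if owner < n_atoms else None)
        grid.append(row)
    return grid

atom_sites = [(2, 2), (6, 2)]
hidden_sites = [(2, 5), (6, 5)]  # limit how far the atom cells extend
grid = rasterize(atom_sites + hidden_sites, len(atom_sites), 9, 6)
```

Coloring then amounts to looking up the attribute value, for example the atomic charge, of the atom index stored in each pixel.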

Back to the Cluster

• David Covell, NCI, NIH: "What is the essence? How can I SEE what the principle behind the cluster is?"

• What are the similarities?

• What are the differences?

• What is the prototypical compound?

Decoding Fingerprints

• Fingerprints encode presence of fragments

• No count

• No location

• Vector %11010

• Loss of information - can it be reclaimed or ignored?
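
The information loss is easy to demonstrate. In the sketch below the fragment keys and the two molecules are hypothetical, and each molecule is given directly as fragment counts instead of being matched against a real structure.

```python
# Hypothetical fragment keys, one bit per key.
FRAGMENT_KEYS = ["carbonyl", "hydroxyl", "amine", "aromatic ring", "ether"]

def fingerprint(fragment_counts):
    """Set a bit for each fragment that occurs at least once."""
    return [1 if fragment_counts.get(key, 0) > 0 else 0
            for key in FRAGMENT_KEYS]

mol_a = {"carbonyl": 1, "hydroxyl": 1, "aromatic ring": 1}
mol_b = {"carbonyl": 3, "hydroxyl": 2, "aromatic ring": 1}

fp_a = fingerprint(mol_a)
fp_b = fingerprint(mol_b)
bits = "".join(str(bit) for bit in fp_a)
```

Both molecules map to the same vector even though their fragment counts differ, and nothing in the vector says where in the structure a fragment sits.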

Decoding Fingerprints

Approach:

• Perform statistics on all possible matches of all fragments in the fingerprint set and weigh atom participation

• Relative occurrence of fragments on an atom compared to the prototype

• Prototype is either virtual average cluster structure, or specific selected compound
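
The atom-participation part of this approach might look like the sketch below. It covers only the counting step for a single compound; the comparison against a prototype is not shown because it needs an atom mapping that the slides do not describe. The match data, fragment names, and function names are hypothetical.

```python
from collections import Counter

def atom_participation(matches, n_atoms):
    """Count, per atom, how many fragment matches include that atom.

    matches: iterable of (fragment_name, atom_indices) pairs.
    """
    counts = Counter()
    for _fragment, atom_indices in matches:
        for index in set(atom_indices):
            counts[index] += 1
    return [counts[i] for i in range(n_atoms)]

def normalize(counts):
    """Scale counts to the range 0..1 relative to the busiest atom."""
    peak = max(counts) if counts else 0
    if peak == 0:
        return [0.0 for _ in counts]
    return [c / peak for c in counts]

matches = [
    ("carbonyl", (1, 2)),
    ("hydroxyl", (3,)),
    ("carboxylic acid", (1, 2, 3)),
]
participation = atom_participation(matches, 4)
weights = normalize(participation)
```

The resulting per-atom weights are the kind of attribute that can then be shown on the 2D plot with a color code.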

Results: Virtual Median Structure

Results: Choose a Prototype

Results: Choose Another Prototype

Acknowledgements

• Marc Nicklaus, David Covell, et al. at NCI, NIH

• BASF

• Chemical Concepts

• DuPont de Nemours

More Information

W. D. Ihlenfeldt

[email protected]

http://www2.ccc.uni-erlangen.de/wdi/