/slides/cactvs/acswashington2000.ppt© Ihlenfeldt 2000C3
Chemical Visualization: The Art of Drawing a Chemical Structure
W. D. Ihlenfeldt
Computer-Chemistry-Center
University of Erlangen-Nuremberg
Erlangen, Germany
Topics
• Motivation
• Drawing One Structure
• Drawing a Set of Structures
• Visualizing Structure Attributes
Motivation: 3D is not Everything
3D structure displays are valuable tools, but ...
• limited to viewing part of structure
• unsuited for quick comparisons
Motivation: Defending 2D Plots
2D structure plots are still the core of chemical information:
• show complete structure
• easy recognition of patterns
Motivation
• So why can’t you just draw it with [your editor of choice]?
• Because nobody wants to draw millions of structures!
The trend towards combinatorial libraries and structure design requires fast, reliable and automatic drawing of enormous numbers of compounds.
Unfortunately, 2D structure drawing is hard. [Helson, Rev. Comp. Chem., 13, 313, 1999]
Topics
• Motivation
• Drawing One Structure
• Drawing a Set of Structures
• Visualizing Structure Attributes
The Rules
• Chemists know what a good plot should look like:
• Not a simple projection from 3D
• Complex, ill-defined set of rules
Basic Structure Data
• Draw a structure from connectivity
• Atom/bond table
• No coordinates to begin with
• Often specified in linear notation such as SMILES:
O1C(/C=C/[C@@](/C=C/[C@@H](C(=C\CC[C@H]([C@@H]1C)C)/C)OC)(O)C)=O
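As an illustration of what "connectivity only" means, the following minimal sketch holds an atom/bond table for acetic acid (SMILES `CC(=O)O`) with no coordinates at all. The class and function names are hypothetical and chosen for this example; they are not part of CACTVS.

```python
from dataclasses import dataclass

@dataclass
class Atom:
    element: str          # element symbol only; note: no x/y coordinates

@dataclass
class Bond:
    a: int                # index of first atom
    b: int                # index of second atom
    order: int            # 1 = single, 2 = double, 3 = triple

# Acetic acid, CC(=O)O, written out as an atom/bond table (hydrogens implicit).
atoms = [Atom("C"), Atom("C"), Atom("O"), Atom("O")]
bonds = [Bond(0, 1, 1), Bond(1, 2, 2), Bond(1, 3, 1)]

def neighbors(atom_index, bonds):
    """Return the sorted indices of all atoms bonded to atom_index."""
    result = []
    for bond in bonds:
        if bond.a == atom_index:
            result.append(bond.b)
        elif bond.b == atom_index:
            result.append(bond.a)
    return sorted(result)

carbonyl_neighbors = neighbors(1, bonds)
```

A layout program has to invent every 2D coordinate from this table alone, which is why the drawing problem is non-trivial.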
The Complications
• Orientation of ring systems
• Atoms pushed too close together by colliding fragments
The Complications
• Rules for bridge systems
• Cages
The Complications
• 1,3/1,4-embedded rings
• Crowded connection points
The Complications
• Choice of stereo attributes (wedges, etc.)
• Trans-bonds in rings
A Simple Structure?
• dual 1,4-embedded rings
• close contacts
• no solution on 120° grid
• 16-membered ring
• implicit constraints
• with +/- 60, 90, 45, 35° angles
• 8^14 naive patterns (≈4.4·10^12)
• max. for real-time response: 250,000 (2.5·10^5)
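The size of the naive search space can be checked with a few lines of arithmetic. This sketch assumes that the four angle magnitudes with both signs give 8 choices per bond and takes the exponent 14 directly from the slide; the variable names are illustrative.

```python
# Assumption: +/- 60, 90, 45, 35 degrees gives 8 candidate angles per bond.
angle_choices = [60, -60, 90, -90, 45, -45, 35, -35]
free_bonds = 14                     # exponent quoted on the slide

naive_patterns = len(angle_choices) ** free_bonds
realtime_budget = 250_000           # patterns that can be tested in real time

# How many times larger the naive space is than the real-time budget.
overshoot = naive_patterns / realtime_budget

print(naive_patterns)               # 4398046511104
print(f"{overshoot:.2e}")
```

The naive enumeration exceeds the real-time budget by roughly seven orders of magnitude, so the pattern count has to be reduced drastically before any candidate is scored.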
Controlling the Number of Patterns
• closing distance and angle
• cis/trans information
• total distance criterion
• full loop criterion
• no clustering of non-60° angles
• pseudo-energy selector
• favoring 60°, symmetry
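To show how a distance criterion cuts down the enumeration, here is a deliberately simplified sketch and not the CACTVS algorithm: it only allows ±60° turns on a hexagonal lattice, ignores self-intersection, and prunes any partial ring whose open end is further from the start atom than the remaining bonds can bridge. All names are hypothetical.

```python
# Unit bond vectors in axial hex coordinates; index k means a heading of k*60 degrees.
HEX_DIRS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]

def hex_distance(q, r):
    """Minimum number of unit bonds needed to get from (q, r) back to the origin."""
    return max(abs(q), abs(r), abs(q + r))

def count_ring_patterns(n_bonds, prune=True):
    """Enumerate ring layouts with +/-60 degree turns; return (closed, visited)."""
    closed = 0
    visited = 0
    # Each entry: (q, r, heading index, bonds placed). First bond is fixed.
    stack = [(1, 0, 0, 1)]
    while stack:
        q, r, heading, placed = stack.pop()
        visited += 1
        remaining = n_bonds - placed
        # Distance criterion: the open end must still be able to reach the start.
        if prune and hex_distance(q, r) > remaining:
            continue
        if remaining == 0:
            # Closing distance (back at origin) and closing angle (+/-60 at start).
            if (q, r) == (0, 0) and (0 - heading) % 6 in (1, 5):
                closed += 1
            continue
        for turn in (1, -1):
            new_heading = (heading + turn) % 6
            dq, dr = HEX_DIRS[new_heading]
            stack.append((q + dq, r + dr, new_heading, placed + 1))
    return closed, visited

closed_pruned, visited_pruned = count_ring_patterns(6, prune=True)
closed_full, visited_full = count_ring_patterns(6, prune=False)
```

Both runs find the same closed rings, but the pruned run visits fewer partial patterns. Calling `count_ring_patterns(16)` applies the same idea to a 16-membered ring.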
A Simple Structure?
[Figure: the same structure as drawn by Isis/Draw 2.2, ChemSketch 3.5, ChemDraw 5.0, and CACTVS 3.113]
More Examples
Symmetry-preserving optimization by synchronous bond bending
More Examples
Optimization by bond shortening
More Examples
Cage system analysis (not a template)
More Examples
Complex ring system and bond arrangement analysis
More Examples: Complex Ring Systems
C\C(C(CC)C(C)/C([H])=C/C([H])=C(CO3)/C2(O)C3C(OC)C(C)=CC2C4=O)=C([H])/CC1OC5(CCC(C)C(C(C)C)O5)CC(O4)C1
Complex Ring Analysis
• Not really a norbornane-type bridge system
• 3 trans ring bonds
• 2 implicit cis bonds
• 2 implicit trans bonds
The CACTVS Plot Algorithms
• Defines state of the art
• Fully stereo-aware, in chains and rings
• Triple bonds in rings
• Automatic wedge assignment
• Intelligent pseudo-energy optimizer
C[C@H]1[C@H](C)CC\C=C/[C@H](OC)/C=C/[C@@](C)(O)/C=C/C(O1)=O
Open Access
Test it!
http://www2.ccc.uni-erlangen.de/services/gif.html (final version soon)
Not just a Passive Image...
Web Interface
With image maps: a portable, user-friendly alternative for selection/manipulation
C[C@H]1[C@H](C)CC\C=C/[C@H](OC)/C=C/[C@@](C)(O)/C=C/C(O1)=O
More Examples
Topics
• Motivation
• Drawing One Structure
• Drawing a Set of Structures
• Visualizing Structure Attributes
Just a typical small screening experiment: 1148 structures
346 Clusters
cluster #19218 compounds
fingerprint clustering (144 bits)
A Closer Look
• Plots have similar characteristics
• Structures not optimally aligned
Aligning Multiple Structures
• Move Morgan-center atom to origin
• Operate on hexagonal grid
• Center on grid
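One plausible reading of "Morgan-center atom" is the atom with the highest extended-connectivity value from the Morgan algorithm; that reading is an assumption here, not something the slide states. The sketch below iterates neighbor sums until the number of distinct values stops growing and then picks the maximum. The function name is hypothetical.

```python
def morgan_center(neighbors):
    """neighbors: dict atom index -> list of bonded atom indices.

    Returns (center_atom, extended_connectivity_values).
    """
    # Start with the atom degree as the initial connectivity value.
    ec = {atom: len(bonded) for atom, bonded in neighbors.items()}
    n_classes = len(set(ec.values()))
    while True:
        # Each atom's new value is the sum of its neighbors' current values.
        new_ec = {atom: sum(ec[b] for b in bonded)
                  for atom, bonded in neighbors.items()}
        new_classes = len(set(new_ec.values()))
        if new_classes <= n_classes:
            break                   # no further discrimination; keep previous values
        ec, n_classes = new_ec, new_classes
    return max(ec, key=ec.get), ec

# Unbranched five-atom chain 0-1-2-3-4; the middle atom should be the center.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
center, values = morgan_center(chain)
```

Translating every structure so that this atom sits at the grid origin gives all structures a comparable starting position before the alignment moves are applied.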
Aligning Multiple Structures
Operations on structures:
• Move to neighbor grid cell (6)
• Rotate by multiple of 60 degrees (5)
• Rotation around center or hetero atom
• Horizontal and vertical flip
• 120-degree flip of non-terminal, non-ring bond
• Shuffle to other sequence position
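The rigid moves above become exact integer operations if atom positions are stored as axial hexagonal-grid coordinates `(q, r)`. The following sketch assumes that representation (the slides do not say how CACTVS stores grid positions) and covers the neighbor move, the 60° rotations about the origin, and one mirror; the function names are illustrative.

```python
# Six neighbor offsets in axial hex coordinates; index k is a heading of k*60 degrees.
HEX_DIRS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]

def move_to_neighbor(cells, k):
    """Shift every atom by one grid cell in direction k (6 possible moves)."""
    dq, dr = HEX_DIRS[k % 6]
    return [(q + dq, r + dr) for q, r in cells]

def rotate(cells, k):
    """Rotate all atoms about the origin by k*60 degrees (5 non-trivial rotations)."""
    for _ in range(k % 6):
        cells = [(-r, q + r) for q, r in cells]
    return list(cells)

def mirror_x(cells):
    """Mirror across the horizontal axis (Cartesian y -> -y)."""
    return [(q + r, -r) for q, r in cells]

structure = [(0, 0), (1, 0), (1, 1)]
rotated_once = rotate(structure, 1)
shifted = move_to_neighbor(structure, 3)
```

The second mirror is the composition `rotate(mirror_x(cells), 3)`. Rotation about a hetero atom instead of the center can be built by shifting that atom to the origin, rotating, and shifting back.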
Aligning Multiple Structures
Merit function:
• Multiple occupation of grid cells
• Bonus for overlay of ring atoms, aromatic atoms, hetero atoms
• Multiplicative rating within cell
• Rate only against earlier entries in sequence, weight 1/n
Optimizer:
• Tabu search
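A toy version of this merit function and a small tabu search might look as follows. Everything specific here is an assumption made for illustration: the bonus factors, the reading of "weight 1/n" as dividing by the number of earlier structures, the move set (rotations and neighbor shifts only), and all names.

```python
from collections import deque

HEX_DIRS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]
BONUS = {"ring": 2.0, "aromatic": 2.0, "hetero": 3.0}   # hypothetical factors

def place(structure, rot, shift):
    """structure: dict (q, r) -> frozenset of atom flags. Rotate, then shift."""
    placed = {}
    for cell, flags in structure.items():
        for _ in range(rot % 6):
            cell = (-cell[1], cell[0] + cell[1])        # rotate by 60 degrees
        placed[(cell[0] + shift[0], cell[1] + shift[1])] = flags
    return placed

def pair_merit(a, b):
    """Sum over shared cells; shared flags multiply the rating within a cell."""
    total = 0.0
    for cell in a.keys() & b.keys():
        rating = 1.0
        for flag in a[cell] & b[cell]:
            rating *= BONUS.get(flag, 1.0)
        total += rating
    return total

def merit(placed_structures):
    """Rate each structure only against earlier ones, weighted by 1/n."""
    total = 0.0
    for n in range(1, len(placed_structures)):
        earlier = placed_structures[:n]
        total += sum(pair_merit(placed_structures[n], e) for e in earlier) / n
    return total

def tabu_align(reference, moving, steps=50, tabu_size=10):
    """Search rotation/shift states for `moving`, never revisiting recent states."""
    state = (0, (0, 0))
    best_state = state
    best_score = pair_merit(reference, place(moving, *state))
    tabu = deque([state], maxlen=tabu_size)
    for _ in range(steps):
        rot, (q, r) = state
        candidates = [((rot + 1) % 6, (q, r)), ((rot - 1) % 6, (q, r))]
        candidates += [(rot, (q + dq, r + dr)) for dq, dr in HEX_DIRS]
        candidates = [s for s in candidates if s not in tabu]
        if not candidates:
            break
        # Take the best non-tabu neighbor even if it is worse than the current state.
        state = max(candidates, key=lambda s: pair_merit(reference, place(moving, *s)))
        tabu.append(state)
        score = pair_merit(reference, place(moving, *state))
        if score > best_score:
            best_state, best_score = state, score
    return best_state, best_score

ref = {(0, 0): frozenset({"hetero"}),
       (1, 0): frozenset({"ring"}),
       (0, 1): frozenset({"ring"})}
displaced = place(ref, 0, (1, 0))
found_state, found_score = tabu_align(ref, displaced)
```

In the example the displaced copy is moved back onto the reference in a single step. Accepting the best non-tabu neighbor even when it is worse is what lets tabu search leave local optima that plain hill climbing would get stuck in.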
A Simple Example
Topics
• Motivation
• Drawing One Structure
• Drawing a Set of Structures
• Visualizing Structure Attributes
2D Voronoi Polygons
• Polygons around atoms
• Full use of available space
• Extra hidden points to limit area
• Encode attribute by color codes
• Immediate recognition
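The idea can be approximated without computing polygon edges: each sample point of the image belongs to the nearest atom, and hidden points claim the area that should stay blank. The sketch below is such a raster approximation with made-up coordinates and attribute values; it is not the CACTVS implementation.

```python
import math

def owning_atom(point, atoms, hidden):
    """Return the index of the nearest atom, or None if a hidden point is closer."""
    best_index, best_dist = None, math.inf
    for i, (x, y) in enumerate(atoms):
        d = math.hypot(point[0] - x, point[1] - y)
        if d < best_dist:
            best_index, best_dist = i, d
    for x, y in hidden:
        if math.hypot(point[0] - x, point[1] - y) < best_dist:
            return None             # area cut off by a hidden limiting point
    return best_index

def attribute_raster(atoms, values, hidden, width, height):
    """Give every integer grid sample the attribute value of its owning atom."""
    rows = []
    for j in range(height):
        row = []
        for i in range(width):
            owner = owning_atom((i, j), atoms, hidden)
            row.append(None if owner is None else values[owner])
        rows.append(row)
    return rows

atoms = [(0, 0), (4, 0)]            # 2D atom positions (illustrative)
values = [0.1, 0.9]                 # per-atom attribute, e.g. a partial charge
hidden = [(2, 5)]                   # hidden point limiting the cells upward
raster = attribute_raster(atoms, values, hidden, 5, 1)
```

Mapping each value in `raster` to a color then yields the colored cells; samples that come back as `None` stay in the background color.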
Back to the Cluster
• David Covell, NCI, NIH: "What is the essence? How can I SEE what the principle behind the cluster is?"
• What are the similarities?
• What are the differences?
• What is the prototypical compound?
Decoding Fingerprints
• Fingerprints encode presence of fragments
• No count
• No location
• Vector %11010
• Loss of information - can it be reclaimed or ignored?
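The information loss is easy to demonstrate. In this sketch the five-entry fragment dictionary and the fragment counts are invented for illustration, and fragment perception is assumed to have happened already.

```python
# Hypothetical fragment dictionary; bit i is set if fragment i occurs at least once.
FRAGMENTS = ["C=O", "N", "c1ccccc1", "O", "S"]

def fingerprint(fragment_counts):
    """Collapse fragment counts into a presence/absence bit string."""
    return "".join("1" if fragment_counts.get(f, 0) > 0 else "0"
                   for f in FRAGMENTS)

# Two different molecules: same fragments present, very different counts.
small = {"C=O": 1, "N": 1, "O": 2}
large = {"C=O": 3, "N": 2, "O": 7}

fp_small = fingerprint(small)
fp_large = fingerprint(large)
```

Both molecules map to the vector `11010`, so neither the number of matches nor their location in the structure can be read back from the fingerprint alone.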
Decoding Fingerprints
Approach:
• Perform statistics on all possible matches of all fragments in fingerprint set and weigh atom participation
• Relative occurrence of fragments on atom compared to prototype
• Prototype is either virtual average cluster structure, or specific selected compound
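The first step, weighing atom participation, could be sketched as below. The input format (fragment name mapped to a list of matched atom-index tuples) and the normalization by the total number of matches are assumptions for this example; the comparison against a prototype is not attempted here.

```python
from collections import Counter

def atom_participation(matches, n_atoms):
    """matches: dict fragment -> list of atom-index tuples, one tuple per match.

    Returns, per atom, the fraction of all fragment matches that include it.
    """
    counts = Counter()
    for hits in matches.values():
        for hit in hits:
            for atom in set(hit):   # count an atom once per match
                counts[atom] += 1
    total = sum(len(hits) for hits in matches.values())
    return [counts[a] / total if total else 0.0 for a in range(n_atoms)]

# Acetic acid CC(=O)O with atoms 0..3: one C=O match and two oxygen matches.
example_matches = {"C=O": [(1, 2)], "O": [(2,), (3,)]}
weights = atom_participation(example_matches, 4)
```

Atoms with high weights are the ones that explain most of the set bits. Drawing these weights as per-atom attributes puts the fingerprint information back onto the structure plot.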
Acknowledgements
• Marc Nicklaus, David Covell, et al. at NCI, NIH
• BASF
• Chemical Concepts
• DuPont de Nemours
More Information
W. D. Ihlenfeldt
http://www2.ccc.uni-erlangen.de/wdi/