Evaluating the Quality and Performance of Automatic Atom Mapping Algorithms
ACS National Meeting, Philadelphia, USA 20th August 2012
Daniel Lowe and Roger Sayle
NextMove Software
Cambridge, UK
What is Atom-Mapping?
Mapping algorithm
Why Perform Atom-Mapping?
• Assigning roles to reagents
• Normalization of reactions for registration
Why Perform Atom-Mapping?
• More precise database searches
– Solvents/catalysts can be distinguished from reactants
– Allows the relationship between the reactant atoms and product atoms to be made explicit
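As an illustration of how an atom mapping lets reactants be told apart from solvents and catalysts, the following is a minimal sketch, not any vendor's method. It assumes the reaction is already atom-mapped and written as reaction SMILES, and it treats a left-hand component as a reactant only if at least one of its map numbers reappears in the product. The function names and the example reaction (acetyl chloride and ethanol in dichloromethane) are hypothetical.

```python
import re

# Inside a SMILES bracket atom, ":<digits>]" is the atom-map number.
MAP_NUM = re.compile(r":(\d+)\]")


def map_numbers(smiles: str) -> set[int]:
    """Non-zero atom-map numbers found in a SMILES string."""
    return {int(n) for n in MAP_NUM.findall(smiles)} - {0}


def assign_roles(reaction: str) -> list[tuple[str, str]]:
    """Label each left-hand component as 'reactant' or 'reagent'.

    Splitting on '.' is naive: a salt written as several fragments
    would be treated as several components.
    """
    left, _agents, right = reaction.split(">")
    product_maps = map_numbers(right)
    roles = []
    for component in left.split("."):
        contributes = bool(map_numbers(component) & product_maps)
        roles.append((component, "reactant" if contributes else "reagent"))
    return roles


rxn = (
    "[CH3:1][C:2](=[O:3])Cl.[OH:4][CH2:5][CH3:6].ClCCl"
    ">>[CH3:1][C:2](=[O:3])[O:4][CH2:5][CH3:6]"
)
for component, role in assign_roles(rxn):
    print(f"{role:8s} {component}")
```

Here the dichloromethane contributes no atoms to the product, so it is labelled a reagent, while both atom-contributing components are labelled reactants.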
Example
• I want to find reactions converting an alkene to a cyclopropane so I search for C=C>>C1CC1
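To make the difference concrete, here is a small sketch comparing the unmapped query above with an atom-mapped version of it. The mapped query string and the helper names are assumptions made for illustration; the regular expression is a simplified SMILES atom tokenizer, not a full parser.

```python
import re

# Matches one SMILES atom: a bracket atom, or an organic-subset symbol.
ATOM = re.compile(r"\[[^\]]*\]|Br|Cl|[BCNOPSFI]|[bcnops]")
MAP_NUM = re.compile(r":(\d+)\]$")


def atom_maps(smiles: str) -> set[int]:
    """Return the atom-map numbers that appear in a SMILES fragment."""
    maps = set()
    for token in ATOM.findall(smiles):
        m = MAP_NUM.search(token)
        if m:
            maps.add(int(m.group(1)))
    return maps


def carried_over(reaction: str) -> set[int]:
    """Map numbers present on both the reactant and the product side."""
    reactants, _agents, products = reaction.split(">")
    return atom_maps(reactants) & atom_maps(products)


unmapped = "C=C>>C1CC1"
mapped = "[CH2:1]=[CH2:2]>>[CH2:1]1[CH2:2]C1"

print(carried_over(unmapped))  # no atom correspondence is stated
print(carried_over(mapped))    # the two alkene carbons are tracked into the ring
```

The unmapped query only says that an alkene appears on one side and a cyclopropane on the other; the mapped query additionally requires that the two alkene carbons are the ones that end up in the ring.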
Why Perform Atom-Mapping?
• Identifying suspect reactions:
Qualities to look for in an atom mapping algorithm
• Chemically plausible atom mappings
• Ability to distinguish genuine reactants from solvents/catalysts
• Support for unbalanced reactions
– Side product not specified
– Reactant stoichiometry > 1
• Fast run-time
Algorithms Evaluated
| Vendor | Program | Version |
|---|---|---|
| Accelrys | Pipeline Pilot | 8.5.0.200 |
| ChemAxon | Marvin | 5.10.1 |
| GGA | Indigo | 1.1 |
| InfoChem | ICMAP | 5.10 |
| PerkinElmer | ChemDraw Ultra | 12.0 |
Methodology
| Test set | Reactions |
|---|---|
| Pharmaceutical ELN subset | 18,244 |
| ChemReact68 database | 67,926 |
| SPRESI database subset | 5,230 |
| Reactions extracted from 2008-2011 USPTO patent applications* | 562,872 |
* Lowe, D. M. Automated Extraction of Reactions from the Patent Literature. 243rd ACS National Meeting & Exposition, San Diego, CA, March 27, 2012.
Methodology (cont.)
• Reaction SMILES were used as input and output for all algorithms except ICMAP
• For ICMAP, input and output were converted to and from RDF
• Indigo was run with its default configuration and more lenient settings for matching valences, charges and bond orders
• Marvin was configured to use its best quality mapping strategy
Ability to map all product atoms
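One way such a check could be expressed is sketched below. It is an assumption about how coverage might be measured, not the procedure used for the results on this slide. A product atom counts as mapped only if its map number also occurs on a reactant atom; the tokenizer is a simplified regular expression that counts heavy atoms written in the organic subset or in brackets, and the function names and example reactions are hypothetical.

```python
import re

# Matches one SMILES atom: a bracket atom, or an organic-subset symbol.
ATOM = re.compile(r"\[[^\]]*\]|Br|Cl|[BCNOPSFI]|[bcnops]")
MAP_NUM = re.compile(r":(\d+)\]$")


def map_number(atom: str) -> int:
    """Atom-map number of one SMILES atom token, or 0 if it has none."""
    m = MAP_NUM.search(atom)
    return int(m.group(1)) if m else 0


def product_coverage(reaction: str) -> tuple[int, int]:
    """Return (mapped product atoms, total product atoms)."""
    left, _agents, right = reaction.split(">")
    left_maps = {map_number(a) for a in ATOM.findall(left)} - {0}
    product_atoms = ATOM.findall(right)
    mapped = sum(1 for a in product_atoms if map_number(a) in left_maps)
    return mapped, len(product_atoms)


left_side = "[CH3:1][C:2](=[O:3])Cl.[OH:4][CH2:5][CH3:6]"
complete = left_side + ">>[CH3:1][C:2](=[O:3])[O:4][CH2:5][CH3:6]"
partial = left_side + ">>[CH3:1][C:2](=[O:3])OCC"

for rxn in (complete, partial):
    mapped, total = product_coverage(rxn)
    print(f"{mapped}/{total} product atoms mapped")
```

A reaction would count as fully mapped under this sketch when the two numbers are equal.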
C-C bonds broken
Speed Comparison
Average reagents per reaction: 1.7, 3.6, 1.6, 4.0
Simple mappings
Marvin/ChemDraw/Indigo/ICMAP
Simple mappings
Marvin/ChemDraw/Indigo/ICMAP
More complicated mappings
ChemDraw
Marvin
More complicated mappings
ICMAP
Indigo
Reuse of reactants
Reuse of reactants
Marvin
Reuse of reactants
ChemDraw
Reuse of reactants
Indigo
Reuse of reactants
ICMAP
Single Atom Mapping
ICMAP/Marvin
ChemDraw/Indigo
Bugs and quirks
• Marvin
– 2 unsuccessful mappings produced unchecked exceptions rather than checked exceptions
• ChemDraw
– Hydrogen on aromatic atoms missing in SMILES output
• Indigo
– Calculation of valency fails for aromatic sulfur
Bugs and quirks
• ICMAP
– Single atom products are interpreted as empty molecules or occasionally replaced by a product from a previous reaction (bug reported)
– Input files must be smaller than 2 GB and use DOS line endings
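A minimal sketch of preparing an input file under these two constraints follows. It assumes that DOS line endings means CRLF and that the size limit is 2 GiB (2 × 1024³ bytes); the slide does not state the exact threshold, and the function and constant names are hypothetical.

```python
TWO_GB = 2 * 1024**3  # assumption: the "2gb" limit read as 2 GiB


def to_dos_line_endings(data: bytes) -> bytes:
    """Convert LF, CR or mixed line endings to CRLF.

    Existing CRLF pairs are normalised first so they are not doubled.
    """
    normalised = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalised.replace(b"\n", b"\r\n")


def fits_size_limit(data: bytes, limit: int = TWO_GB) -> bool:
    """True if the payload is strictly smaller than the limit."""
    return len(data) < limit


raw = b"first line\nsecond line\r\nthird line\r"
converted = to_dos_line_endings(raw)
print(converted)
print(fits_size_limit(converted))
```

For files near the limit, the conversion would need to be streamed in chunks rather than held in memory, and the size should be checked after conversion because each added carriage return grows the file by one byte.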
Conclusions
• ICMAP produced the best quality mappings on the tested sets
• Atom mapping isn’t as simple as finding a maximum common subgraph mapping
• In all the algorithms there were aspects that could be improved to yield appreciable benefits
Acknowledgements
• Ed Griffen and Nick Tomkinson, AstraZeneca.
• Andrew Wooster, GSK.
• Hans Kraut, InfoChem.
• Thank you for your time.