Improving Online Chemistry One Structure at a Time Antony Williams AstraZeneca, February 10th 2012


Page 1: Improving Online Chemistry One Structure at a Time

Improving Online Chemistry – One Structure at a Time

Antony Williams

AstraZeneca, February 10th 2012

Page 2:

We Have… Too Much Data!!!

Page 3:

It is so difficult to navigate…

What’s the structure?

Are they in our file?

What’s similar?

What’s the target?

Pharmacology data?

Known Pathways?

Working On Now?

Connections to disease?

Expressed in right cell type?

Competitors?

IP?

Page 4:

Literature, Patents, News, Pipeline, SAR, CSRs, Safety, In vivo, etc.

Pharma Information… and the web…

Page 5:

The World of Online Chemistry

Property databases

Compound aggregators

Screening assay results

Scientific publications

Encyclopedic articles (Wikipedia)

Metabolic pathway databases

ADME/Tox data – eTOX for example

Blogs/Wikis and Open Notebook Science

Contributing Open Source code to projects

Page 6:

PubChem

Page 7:

ChEMBL

Page 8:

Collaborative Knowledge Management

Page 9:

e-Science and Primary Data

How much data generated in a lab, that COULD go public, is lost forever?

Page 10:

e-Science and Primary Data

How much data generated in a lab, that COULD go public, is lost forever?

Public Domain reference databases of value?

Syntheses

Properties

Spectra

CIFs

Images

Page 11:

e-Science and Primary Data

How much data generated in a lab, that COULD go public, is lost forever?

Public Domain reference databases of value?

Syntheses

Properties

Spectra

CIFs

Images

Much of chemistry is chemical structure-based – where and how could we host these data?

Page 12:

RSC’s ChemSpider

Page 13:

We Want to Answer Questions

Questions a chemist might ask…

What is the melting point of n-heptanol?

What is the chemical structure of Xanax?

Chemically, what is phenolphthalein?

What are the stereocenters of cholesterol?

Where can I find publications about xylene?

What are the different trade names for Ketoconazole?

What is the NMR spectrum of Aspirin?

What are the safety handling issues for Thymol Blue?

Page 14:

Available Information…

Linked to vendors, safety data, toxicity, metabolism

Page 15:

Available Information…

Page 16:

Crowdsourced “Annotations”

Users can add

Descriptions/Syntheses/Commentaries

Links to PubMed articles

Links to articles via DOIs

Add spectral data

Add Crystallographic Information Files

Add photos

Add MP3 files

Add Videos

Page 17:
Page 18:

Spectra

Page 19:

Data on the Web

Page 20:
Page 21:

Chemistry Data online is messy

We have inherited errors

All public compound databases, including ours, have errors

“Incorrect” structures – assertions, timelines etc

“Incorrect” names associated with structures

Properties

Links

Publications

ENORMOUS CHALLENGE

Page 22:

What could create change?

Harvard Business Review (2010)

“One change would make a substantial difference [to drug R&D]: the creation of agreed-upon standards for digitally representing drug assets.”

Consider drug structures ONLY…

Page 23:

The Structure of Vitamin K?

Page 24:

MeSH

A lipid cofactor that is required for normal blood clotting. Several forms of vitamin K have been identified: VITAMIN K 1 (phytomenadione) derived from plants, VITAMIN K 2 (menaquinone) from bacteria, and synthetic naphthoquinone provitamins, VITAMIN K 3 (menadione). Vitamin K 3 provitamins, after being alkylated in vivo, exhibit the antifibrinolytic activity of vitamin K. Green leafy vegetables, liver, cheese, butter, and egg yolk are good sources of vitamin K

Page 25:

The Structure of Vitamin K1?

Page 26:

What is the Structure of Vitamin K1?

Page 27:

CAS’s Common Chemistry

Page 28:

Wikipedia

Page 29:
Page 30:
Page 31:

ChEBI – Manual Curation

Page 32:
Page 33:
Page 34:
Page 35:

“2-methyl-3-(3,7,11,15-tetramethylhexadec-2-enyl)naphthalene-1,4-dione”

Variants of systematic names on PubChem

2-methyl-3-[(E,7R,11R)-3,7,11,15-tetramethyl

2-methyl-3-[(E,7S,11R)-3,7,11,15-tetramethyl

2-methyl-3-[(E,7R,11S)-3,7,11,15-tetramethyl

2-methyl-3-[(E,7S,11S)-3,7,11,15-tetramethyl

2-methyl-3-[(E,11S)-3,7,11,15-tetramethyl

2-methyl-3-[(E)-3,7,11,15-tetramethyl

2-methyl-3-(3,7,11,15-tetramethyl

2-methyl-3-[(E)-3,7,11,15-tetramethyl
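The variants listed above differ mainly in their stereo descriptors. A minimal sketch, assuming a simple regular expression is enough for these particular (truncated) name prefixes, shows that stripping the descriptors collapses every variant to the same stem. The helper name `strip_stereo` is illustrative only and is not part of PubChem or ChemSpider.

```python
import re

# Truncated systematic-name prefixes, copied as listed on the slide.
NAMES = [
    "2-methyl-3-[(E,7R,11R)-3,7,11,15-tetramethyl",
    "2-methyl-3-[(E,7S,11R)-3,7,11,15-tetramethyl",
    "2-methyl-3-[(E,7R,11S)-3,7,11,15-tetramethyl",
    "2-methyl-3-[(E,7S,11S)-3,7,11,15-tetramethyl",
    "2-methyl-3-[(E,11S)-3,7,11,15-tetramethyl",
    "2-methyl-3-[(E)-3,7,11,15-tetramethyl",
    "2-methyl-3-(3,7,11,15-tetramethyl",
    "2-methyl-3-[(E)-3,7,11,15-tetramethyl",
]

# Matches a bracketed stereo prefix such as "[(E,7R,11R)-" or "[(E)-".
STEREO = re.compile(r"\[\(((?:\d*[EZRS],?)+)\)-")


def strip_stereo(name):
    """Return (stereo-free stem, stereo descriptor string or '')."""
    match = STEREO.search(name)
    descriptor = match.group(1) if match else ""
    stem = STEREO.sub("(", name)
    return stem, descriptor


stems = {strip_stereo(n)[0] for n in NAMES}
descriptors = {strip_stereo(n)[1] for n in NAMES}
print(len(stems), "stem;", len(descriptors), "distinct stereo descriptions")
# prints: 1 stem; 7 distinct stereo descriptions
```

One skeleton, seven different levels of stereo detail: a text search on any single name will miss most of the records.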

Page 36:

What’s Methane?

Page 37:

What’s Methane?

Page 38:

What ELSE is Methane???

Page 39:
Page 40:

NLM’s DailyMed

Page 41:

NLM’s DailyMed

Page 42:

NLM’s DailyMed

Page 43:

With Great Fanfare…

Page 44:

NPC Browser http://tripod.nih.gov/npc/

Page 45:

NPC Browser http://tripod.nih.gov/npc/

Page 46:
Page 47:

Openness and Quality Issues

Williams and Ekins, DDT, 16: 747-750 (2011)

Science Translational Medicine 2011

Page 48:

Content is King and Quality Costs

Curated Chemistry “content” is expensive to create

Patent searching

Structures and properties

Drug databases

Literature databases

Chemical Abstracts Service (CAS), the “Gold Standard” in chemistry-related information

104 years of content

>50 million substances

Proprietary platform

Page 49:

The EXPERTS must get it right?!

Page 50:

Wikipedia, C&E News, PubChem

C&E News (from ACS)

Page 51:

Feedback from Steve Ritter

“Although CAS and C&EN are both part of the ACS Publications Division, we at C&EN still have to pay for our SciFinder access, strangely enough.”

“It would be nice to have an authoritative web-based source of standard, well-drawn structures for chemists to go to so they can freely cut and paste structures into their papers, PowerPoint presentations, and anything else they might need. Maybe Wikipedia will be that source one day.”

Page 52:

Public Domain Databases

Our databases are a mess…

Non-curated databases are proliferating errors

We source and deposit data between databases

Original sources of errors hard to determine

Curation is time-consuming and challenging

Page 53:

Stop Whining – Fix it

Page 54:

Crowdsourced Curation

Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate

Page 55:

Search “Vitamin H”

Page 56:

“Curate” Identifiers

Page 57:

“Curate” Identifiers

Page 58:

“Curate” Identifiers

Page 59:

Standards: Structure Standardization

Page 60:

Standards: Structure Standardization

Page 61:

Standards: Structure Standardization

Page 62:

What needs to happen?

Standards

Standardization of structures ChEBI/PubChem sharing

InChI adoption

Page 63:

The InChI Identifier

Page 64:

Multiple Layers
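As an illustration of the layered design, a standard InChI string can be split on `/` into a version prefix, a formula layer, and further layers tagged by a one-letter prefix. The sketch below handles only simple cases like the one shown and is not a full InChI parser; `split_inchi_layers` is a hypothetical helper name.

```python
def split_inchi_layers(inchi):
    """Split an InChI string into a dict of its layers (simplified)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # Remaining layers start with a one-letter tag,
            # e.g. c = connections, h = hydrogens.
            layers[part[0]] = part[1:]
    return layers


ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
for tag, value in split_inchi_layers(ethanol).items():
    print(tag, value)
```

Because each layer adds detail to the one before it, two records can agree on formula and connectivity while differing in later layers such as stereochemistry.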

Page 65:

InChI Strings Hash to InChIKeys
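The point of hashing is that a long, variable-length InChI string maps to a short, fixed-length key that is easy to index and to search for on the web. The sketch below shows that idea with a plain SHA-256 digest. It does not reproduce the actual InChIKey encoding, and `toy_key` is a hypothetical name.

```python
import hashlib


def toy_key(inchi, length=14):
    """Illustrative fixed-length key. This is NOT the real InChIKey algorithm."""
    digest = hashlib.sha256(inchi.encode("utf-8")).hexdigest()
    return digest[:length].upper()


methane = "InChI=1S/CH4/h1H4"
ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
for inchi in (methane, ethanol):
    print(len(inchi), "characters ->", len(toy_key(inchi)), "characters:", toy_key(inchi))
```

A hash is one-way, so going from a key back to a structure needs a lookup table, which is one reason shared, curated databases matter.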

Page 66:

Vancomycin – Search the Internet

Page 67:

Vancomycin

Search Molecular

SKELETON

Search Full Molecule

Page 68:

Full Skeleton Search: 104 Hits

Page 69:

Full Molecule Search: 4 Hits
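One way to read the two hit counts: the first block of an InChIKey is derived from the molecular skeleton (connectivity), while the later blocks also reflect stereochemistry and other layers, so matching on the first block alone is a broader search than matching the full key. A minimal sketch follows; the index, page names and keys are made-up placeholders, not real vancomycin data.

```python
# Hypothetical index of web pages by InChIKey. All keys are placeholders.
INDEX = {
    "AAAAAAAAAAAAAA-BBBBBBBBSA-N": ["page1", "page2"],
    "AAAAAAAAAAAAAA-CCCCCCCCSA-N": ["page3"],
    "AAAAAAAAAAAAAA-DDDDDDDDSA-N": ["page4", "page5", "page6"],
    "ZZZZZZZZZZZZZZ-DDDDDDDDSA-N": ["page7"],
}


def full_molecule_search(key):
    """Exact match on the whole key."""
    return list(INDEX.get(key, []))


def skeleton_search(key):
    """Match on the first block only, ignoring the later blocks."""
    skeleton = key.split("-")[0]
    hits = []
    for indexed_key, pages in INDEX.items():
        if indexed_key.split("-")[0] == skeleton:
            hits.extend(pages)
    return hits


query = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"
print("full molecule:", len(full_molecule_search(query)), "hits")
print("skeleton:", len(skeleton_search(query)), "hits")
```

The skeleton search returns every stereo variant of the same connectivity, which is why it finds far more pages than the full-molecule search.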

Page 70:

Crowdsourcing Works

>130 people have deposited data and participated in data curation

Curators at different levels check each other

More curators and depositors are encouraged!

Page 71:

What needs to happen?

Standards

Standardization of structures ChEBI/PubChem sharing

InChI adoption

Collaboration

Stop reinventing the wheel

Share data, share efforts and speed the process

Page 72:

Antony Williams vs Identifiers

Passport ID

Dad, Tony, others

SSN

Green Card

License

5 email addresses

ChemSpiderman (blog, Twitter account, Facebook, Friendfeed)

OpenID

….

Page 73:

Aspirin names and synonyms

• Text searches depend on correct association

• 335 suggested identifiers for Aspirin just on PubChem!

• Disambiguation dictionaries are necessary, not just for authors!

Page 74:
Page 75:

The Final Search Strategy

Page 76:

All Those Names, One Structure

Page 77:

Curated Dictionaries Matter

Page 78:

Success Depends on Dictionaries

Page 79:

Validated Name-Structure Dictionaries

Chemical name dictionaries are used for:

Text-mining (publications, patents): used to index PubMed and link to Google Patents

Linking to other databases (think Biology!): when structures are not available, drug names link

Searching the web: names link to structures, which link to InChIs
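The chain from names to structures to InChIs can be pictured as a lookup table from normalized names to a single structure identifier. A minimal sketch with a tiny hand-built dictionary follows; the normalization rule and the function names `normalize` and `lookup` are illustrative assumptions, not ChemSpider's implementation.

```python
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # standard InChIKey for aspirin


def normalize(name):
    """Lower-case and collapse whitespace so trivial variants match."""
    return " ".join(name.lower().split())


# A validated dictionary: every entry has been checked against the structure.
NAME_TO_KEY = {
    normalize(name): ASPIRIN_KEY
    for name in ("Aspirin", "Acetylsalicylic acid", "2-Acetoxybenzoic acid")
}


def lookup(name):
    """Return the InChIKey for a name, or None if the name is not in the dictionary."""
    return NAME_TO_KEY.get(normalize(name))


print(lookup("  ACETYLSALICYLIC   Acid "))  # BSYNRYMUTXBXSQ-UHFFFAOYSA-N
print(lookup("asprin"))                     # None (misspelling not in the dictionary)
```

Everything downstream of the lookup (patents, articles, properties) inherits any error in the name-to-structure association, which is why validation of the dictionary matters.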

Page 80:

I want to know about “Vincristine”

If all algorithms work then everything on the page is correct by default except the name-structure relationship!

Page 81:

Vincristine: Identifiers and Properties

Page 82:

Vincristine: Patents Linked by Name

Page 83:

Vincristine: Articles Linked by Name

Page 84:

ChemSpider Everywhere:

What do computers want?

Web services

Page 85:

Mass Spec Analysis

Page 86:

ChemSpider Interface

Page 87:
Page 88:

Tinuvin 328

Page 89:

Position sorted by references

Page 90:

Position 1 only
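A plausible reading of “sorted by references”: a mass or formula search returns many candidate structures, and ordering them by how many references each record carries tends to bring the well-known compound to position 1. The sketch below uses invented candidate records and counts purely for illustration; none of the numbers come from ChemSpider, and `rank_by_references` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    reference_count: int  # hypothetical number of linked references


def rank_by_references(candidates):
    """Return candidates ordered from most to fewest references."""
    return sorted(candidates, key=lambda c: c.reference_count, reverse=True)


# Invented hits for one mass search; names and counts are placeholders.
hits = [
    Candidate("candidate A", 3),
    Candidate("candidate B", 120),
    Candidate("candidate C", 0),
]
ranked = rank_by_references(hits)
for position, candidate in enumerate(ranked, start=1):
    print(position, candidate.name, candidate.reference_count)
```

The ranking is only a heuristic: a genuinely novel compound would have few references and would sort low.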

Page 91:

Web Services

Page 92:

Web Services Open Up Collaboration

Agilent, Bruker, Waters and Thermo all use our web-based services for compound lookup

Many academic sites integrating directly – metabonomics, name lookup, semantic markup

Mobile app integration

Commercial structure drawing packages

Page 93:

ChemSpider Everywhere: Embed

Page 94:

ChemSpider Everywhere: Spectral Game

Page 95:

ChemSpider Everywhere

Crowdsourced Curation of Spectra

Page 96:

ChemSpider Everywhere: ChemMobi

Page 97:

Structure Database Lookup

Page 98:

ChemSpider Resources for Chemistry

Page 99:

It is so difficult to navigate…

What’s the structure?

Are they in our file?

What’s similar?

What’s the target?

Pharmacology data?

Known Pathways?

Working On Now?

Connections to disease?

Expressed in right cell type?

Competitors?

IP?

Page 100:

Open PHACTS Project

Develop a set of robust standards…

Implement the standards in a semantic integration hub

Deliver services to support drug discovery programs in pharma and public domain

22 partners, 8 pharmaceutical companies, 3 biotechs

36-month project

Guiding principle is open access, open usage, open source

- Key to standards adoption -

Page 101:
Page 102:

Internet Data

The Future

Commercial Software

Pre-competitive Data

Open Science

Open Data

Publishers

Educators

Open Databases

Chemical Vendors

Small organic molecules

Undefined materials

Organometallics

Nanomaterials

Polymers

Minerals

Particle bound

Links to Biologicals

Page 103:

The Future of Chemistry on the Web?

Public compound databases federate & build a linked environment of validated data!

Data validation needs are not ignored

Publishers layer on information to make publications discoverable

Public-Private databases can be linked

Open Data proliferate

The “Semantic Web” in action

Page 104:

Acknowledgments

The ChemSpider team

Our data providers, depositors, collaborators and curators

Software providers – OpenEye, ChemDoodle, ACD/Labs, GGA Software, Open Source (Jmol, JSpecView, OpenBabel)

Page 105:

Thank you

Email: [email protected]

Twitter: ChemConnector

Blog: www.chemspider.com/blog

Personal Blog: www.chemconnector.com

SLIDES: www.slideshare.net/AntonyWilliams