Building a Community Resource of Open Spectral Data

Upload: orcid-0000-0002-2668-4821

Post on 10-May-2015


DESCRIPTION

A presentation given at #FACCS2010 in Raleigh, North Carolina.

ChemSpider is an online database of almost 25 million chemical compounds sourced from over 300 different sources, including government laboratories, chemical vendors, public resources and publications. Developed with the intention of building a community for chemists, ChemSpider allows its users to deposit data including structures, properties, links to external resources and various forms of spectral data. Over the past three years ChemSpider has aggregated almost 3000 spectra, including infrared and Raman data, and continues to expand as the community deposits additional data. The majority of the spectral data is licensed as Open Data, allowing it to be downloaded and reused in presentations, lesson plans and for teaching purposes. This presentation will provide an overview of our efforts to build a structure-indexed online database of spectral data, issue a call to action for the community to participate in improving this resource for the community at large, and discuss how such a resource could be used as the basis of a spectral game to teach students spectral interpretation.

TRANSCRIPT

Page 1: Building a Community Resource of Open Spectral Data

Building a Community Resource of Open Spectral Data

Page 2: Building a Community Resource of Open Spectral Data

If there were an online DB of spectra...

Reference data

Could dramatically reduce rework

Excellent teaching resource

Opportunity for crowdsourced review and annotation of data (“Wikipedia for spectra”)

Other…

Page 3: Building a Community Resource of Open Spectral Data

Where is chemistry online?

Encyclopedic articles (Wikipedia)

Chemical vendor databases

Metabolic pathway databases

Property databases

Patents with chemical structures

Drug Discovery data

Scientific publications

Compound aggregators

Blogs/Wikis and Open Notebook Science

Page 4: Building a Community Resource of Open Spectral Data

And where are spectra online?

Page 5: Building a Community Resource of Open Spectral Data

And where are spectra online?

Page 6: Building a Community Resource of Open Spectral Data

And where are spectra online?

Page 7: Building a Community Resource of Open Spectral Data

And where are spectra online?

Page 8: Building a Community Resource of Open Spectral Data

And where are spectra online?

Page 9: Building a Community Resource of Open Spectral Data

What’s in the way of “Open Spectra”

Spectral databases are revenue generators

Intellectual property

Scientist versus organizational ownership

Legal risks

It’s generally “work” to release data

Confusion about “Free Data” versus “Open Data”: some people will provide “Free Data”, some will provide “Open Data”. They are different.

Page 10: Building a Community Resource of Open Spectral Data

A Pragmatic Vision

“Build a Structure Centric Community to Serve Chemists”

Integrate chemical structure data on the web

Create a “structure-based hub” to information and data

Provide access to structure-based “algorithms”

Let chemists contribute their own data

Allow the community to curate/correct data

Page 11: Building a Community Resource of Open Spectral Data

We Answer Questions for Chemists

Questions a chemist might ask…

What is the melting point of n-heptanol?

What is the chemical structure of Xanax?

Chemically, what is phenolphthalein?

What are the stereocenters of cholesterol?

Where can I find publications about xylene?

What are the different trade names for Aspirin?

What is the IR spectrum of Benzoic Acid?

Page 12: Building a Community Resource of Open Spectral Data

www.chemspider.com

Page 13: Building a Community Resource of Open Spectral Data

Search for a Chemical…by name

Page 14: Building a Community Resource of Open Spectral Data

Available Information…

Linked to vendors, safety data, toxicity, metabolism

Page 15: Building a Community Resource of Open Spectral Data

Available Information….

Page 16: Building a Community Resource of Open Spectral Data

ChemSpider Today

24.8 million structures

400 data sources

Grows daily

Community annotation and curation

We curate, edit, change, enhance data daily

Page 17: Building a Community Resource of Open Spectral Data

ChemSpider: Spectra Linked

Page 18: Building a Community Resource of Open Spectral Data

ChemSpider: Spectra Linked

Page 19: Building a Community Resource of Open Spectral Data

Spectra Linked

Page 20: Building a Community Resource of Open Spectral Data

Spectra Linked

Page 21: Building a Community Resource of Open Spectral Data

Spectra on ChemSpider

Page 22: Building a Community Resource of Open Spectral Data

Sources of Spectra

Sourced from online sources with permission

Private collections

The MAJORITY deposited by ChemSpider users

Page 23: Building a Community Resource of Open Spectral Data

Spectral Uploading

Locate the structure of interest and deposit spectrum

Page 24: Building a Community Resource of Open Spectral Data

Spectral Uploading

Various types of NMR spectra supported

Page 25: Building a Community Resource of Open Spectral Data

Spectral formats supported

Spectra may be uploaded in JCAMP-DX format

Graphical formats such as JPEG, PNG, GIF

PDF files
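Of the formats listed above, JCAMP-DX is the one that carries machine-readable data rather than a picture of it. A JCAMP-DX file is plain text built from labeled data records of the form `##LABEL=value`. The following is a minimal sketch, not ChemSpider's own code, of reading those records into a dictionary. The function name `parse_jcamp_header` and the sample file contents are hypothetical, and the data values are illustrative rather than a real measured spectrum.

```python
def parse_jcamp_header(text):
    """Collect ##LABEL=value records from JCAMP-DX text into a dict.

    Sketch only: data lines are skipped, and repeated labels overwrite
    earlier ones, so compound (multi-block) files are not handled.
    """
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("##"):
            continue  # data lines and blank lines are ignored here
        label, sep, value = line[2:].partition("=")
        if not sep or not label.strip():
            continue  # not a labeled record, or a "##=" comment record
        # "$$" starts an inline comment in JCAMP-DX; drop it.
        value = value.split("$$", 1)[0].strip()
        header[label.strip().upper()] = value
    return header


# Hypothetical sample, for illustration only (not real data).
SAMPLE = """##TITLE=Benzoic acid (illustrative sample)
##JCAMP-DX=4.24
##DATA TYPE=INFRARED SPECTRUM
##XUNITS=1/CM
##YUNITS=TRANSMITTANCE
##FIRSTX=4000
##LASTX=400
##NPOINTS=3
##XYDATA=(X++(Y..Y))
4000 0.95 0.90 0.85
##END=
"""

header = parse_jcamp_header(SAMPLE)
print(header["DATA TYPE"], header["XUNITS"])
```

Because the header states the data type and axis units explicitly, a structure-indexed collection can sort deposited files by technique without anyone opening them by hand, which the image and PDF formats do not allow.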

Page 26: Building a Community Resource of Open Spectral Data

Multiple Spectra for One Structure

Page 27: Building a Community Resource of Open Spectral Data

ChemSpider ID 24528095 H1 NMR

Page 28: Building a Community Resource of Open Spectral Data

ChemSpider ID 24528095 C13 NMR

Page 29: Building a Community Resource of Open Spectral Data

ChemSpider ID 24528095 HHCOSY

Page 30: Building a Community Resource of Open Spectral Data

ChemSpider ID 24528095 HSQC

Page 31: Building a Community Resource of Open Spectral Data

ChemSpider ID 24528095 HMBC

Page 32: Building a Community Resource of Open Spectral Data

Deposit spectra against new structure

If a NEW compound has spectral data then deposit the structure onto ChemSpider first

Page 33: Building a Community Resource of Open Spectral Data

Available Spectra http://www.chemspider.com/spectra.aspx

Page 34: Building a Community Resource of Open Spectral Data

Embedding Data

Page 35: Building a Community Resource of Open Spectral Data

Web Services

Page 36: Building a Community Resource of Open Spectral Data

www.SpectralGame.com

http://www.jcheminf.com/content/1/1/9

Page 37: Building a Community Resource of Open Spectral Data

Spectral Game

Page 38: Building a Community Resource of Open Spectral Data

Increasing Complexity

Page 39: Building a Community Resource of Open Spectral Data

Spectral Game

Page 40: Building a Community Resource of Open Spectral Data

Data Curation

Page 41: Building a Community Resource of Open Spectral Data

Reversed Spectrum

Page 42: Building a Community Resource of Open Spectral Data

Download, reprocess, redeposit
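The slides do not say how a reversed spectrum is reprocessed. As one possible reading, the sketch below assumes that "reversed" means the intensities were stored in the opposite order to the x axis, so the fix is to mirror the intensity list while leaving the axis values alone. The function name `unreverse_spectrum` is hypothetical, and the numbers are made up for illustration.

```python
def unreverse_spectrum(x_values, y_values):
    """Return (x, y) with the intensities mirrored along the x axis.

    Assumption: the deposited file paired the first x value with the
    last intensity, so reversing y restores the intended pairing.
    """
    if len(x_values) != len(y_values):
        raise ValueError("x and y must have the same number of points")
    return list(x_values), list(reversed(y_values))


# Made-up three-point example: the peak belongs at x = 0.0, not x = 2.0.
x_fixed, y_fixed = unreverse_spectrum([0.0, 1.0, 2.0], [0.1, 0.3, 9.5])
print(list(zip(x_fixed, y_fixed)))
```

The corrected points would then be written back to a file and redeposited against the same structure, following the download, reprocess, redeposit cycle named on the slide.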

Page 43: Building a Community Resource of Open Spectral Data

True Curation of Data

Page 44: Building a Community Resource of Open Spectral Data

Building Other Spectral Games!

We would like to build other forms of the spectral game. The database is presently very rich in NMR data

There are presently:

101 Infrared Spectra

46 Raman Spectra

We would like more!!!

Page 45: Building a Community Resource of Open Spectral Data

Invitations

Spectral data are welcomed from associated syntheses, lab experiments, etc.

Upload structures, spectra, analyses, etc. to ChemSpider to share with the community

Use www.SpectralGame.com and encourage your students

Page 46: Building a Community Resource of Open Spectral Data

Thank you

Email: [email protected]

Twitter: ChemConnector

Blog: www.chemspider.com/blog

Personal Blog: www.chemconnector.com

SLIDES: www.slideshare.net/AntonyWilliams