Building a Community Resource of Open Spectral Data
DESCRIPTION
A presentation given at #FACCS2010 in Raleigh, North Carolina.
ChemSpider is an online database of almost 25 million chemical compounds drawn from over 300 different sources, including government laboratories, chemical vendors, public resources and publications. Developed with the intention of building a community for chemists, ChemSpider allows its users to deposit data including structures, properties, links to external resources and various forms of spectral data. Over the past three years ChemSpider has aggregated almost 3000 spectra, including infrared and Raman data, and continues to expand as the community deposits additional data. The majority of the spectral data is licensed as Open Data, allowing it to be downloaded and reused in presentations, lesson plans and for teaching purposes. This presentation will provide an overview of our efforts to build a structure-indexed online database of spectral data, issue a call to action for the community to help improve this resource for the community at large, and discuss how such a resource could be used as the basis of a spectral game to teach students spectral interpretation.
TRANSCRIPT
Building a Community Resource of Open Spectral Data
If there were an online DB of spectra...
Reference data
Could dramatically reduce rework
Excellent teaching resource
Opportunity for crowdsourced review and annotation of data (“Wikipedia for spectra”)
Other…
Where is chemistry online?
Encyclopedic articles (Wikipedia)
Chemical vendor databases
Metabolic pathway databases
Property databases
Patents with chemical structures
Drug Discovery data
Scientific publications
Compound aggregators
Blogs/Wikis and Open Notebook Science
And where are spectra online?
What’s in the way of “Open Spectra”?
Spectral databases are revenue generators
Intellectual property
Scientist versus organizational ownership
Legal risks
It’s generally “work” to release data
Confusion about “Free Data” versus “Open Data” – some people will provide “Free Data”, some will provide “Open Data”. They are different.
A Pragmatic Vision
“Build a Structure Centric Community to Serve Chemists”
Integrate chemical structure data on the web
Create a “structure-based hub” to information and data
Provide access to structure-based “algorithms”
Let chemists contribute their own data
Allow the community to curate/correct data
We Answer Questions for Chemists
Questions a chemist might ask…
What is the melting point of n-heptanol?
What is the chemical structure of Xanax?
Chemically, what is phenolphthalein?
What are the stereocenters of cholesterol?
Where can I find publications about xylene?
What are the different trade names for Aspirin?
What is the IR spectrum of Benzoic Acid?
www.chemspider.com
Search for a Chemical…by name
Available Information…
Linked to vendors, safety data, toxicity, metabolism
Available Information….
ChemSpider Today
24.8 million structures
400 data sources
Grows daily
Community annotation and curation
We curate, edit, change, enhance data daily
ChemSpider: Spectra Linked
Spectra on ChemSpider
Sources of Spectra
Sourced from online sources with permission
Private collections
The MAJORITY deposited by ChemSpider users
Spectral Uploading
Locate the structure of interest and deposit spectrum
Spectral Uploading
Various types of NMR spectra supported
Spectral formats supported
Spectra may be uploaded in JCAMP-DX format
Graphical formats such as JPEG, PNG, GIF
PDF files
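Of the formats listed above, JCAMP-DX is the only one that carries the spectrum as data rather than as a picture. A JCAMP-DX file is plain text whose header consists of labeled records of the form ##LABEL=value. The following is a minimal sketch, not ChemSpider's own code, of reading those header records before depositing a file. The function name parse_jcamp_header and the sample file contents are hypothetical and exist only for illustration.

```python
def parse_jcamp_header(text):
    """Collect the ##LABEL=value records of a JCAMP-DX text into a dict.

    Labels are normalized by upper-casing and dropping spaces, dashes,
    slashes and underscores, so "DATA TYPE" becomes "DATATYPE".
    """
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("##"):
            continue  # skip blank lines and numeric data lines
        label, sep, value = line[2:].partition("=")
        if not sep:
            continue  # not a labeled record
        key = label.upper()
        for ch in " -/_":
            key = key.replace(ch, "")
        # "$$" starts an inline comment in JCAMP-DX; drop it
        header[key] = value.split("$$", 1)[0].strip()
        if key == "END":
            break
    return header


# Hypothetical file contents, made up for this example (not real data)
SAMPLE = """##TITLE=Benzoic acid (hypothetical example)
##JCAMP-DX=4.24
##DATA TYPE=INFRARED SPECTRUM
##XUNITS=1/CM
##YUNITS=TRANSMITTANCE
##FIRSTX=4000
##LASTX=400
##NPOINTS=3
##XYDATA=(X++(Y..Y))
4000 0.98 0.97 0.95
##END=
"""

header = parse_jcamp_header(SAMPLE)
print(header["TITLE"], "|", header["DATATYPE"], "|", header["XUNITS"])
```

A check like this makes it easy to confirm the title, data type and axis units of a file before it is attached to a structure.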
Multiple Spectra for One Structure
ChemSpider ID 24528095 H1 NMR
ChemSpider ID 24528095 C13 NMR
ChemSpider ID 24528095 HHCOSY
ChemSpider ID 24528095 HSQC
ChemSpider ID 24528095 HMBC
Deposit spectra against new structure
If a NEW compound has spectral data then deposit the structure onto ChemSpider first
Available Spectra http://www.chemspider.com/spectra.aspx
Embedding Data
Web Services
www.SpectralGame.com
http://www.jcheminf.com/content/1/1/9
Spectral Game
Increasing Complexity
Spectral Game
Data Curation
Reversed Spectrum
Download, reprocess, redeposit
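The reprocessing step in this workflow could look something like the sketch below. It assumes, and this is only an assumption since the slide does not say how the spectrum was reversed, that the intensities were stored in the opposite order to the x-axis values. The function name flip_intensities and the numbers are hypothetical.

```python
def flip_intensities(x, y):
    """Return the same x axis paired with the intensities in reversed order."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return list(x), list(reversed(y))


# Made-up values standing in for a downloaded spectrum
ppm = [10.0, 7.5, 5.0, 2.5, 0.0]
intensity = [0.0, 0.1, 0.0, 0.9, 0.2]

fixed_ppm, fixed_intensity = flip_intensities(ppm, intensity)
print(list(zip(fixed_ppm, fixed_intensity)))
```

The corrected arrays would then be written back out and redeposited against the same structure.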
True Curation of Data
Building Other Spectral Games!
We would like to build other forms of the spectral game. The database is presently very rich in NMR data
There are presently
101 Infrared Spectra
46 Raman Spectra
We would like more!!!
Invitations
Spectral data from syntheses, lab experiments, etc. are welcome
Upload structures, spectra, analyses, etc. to ChemSpider to share with the community
Use www.SpectralGame.com and encourage your students
Thank you
Email: [email protected]
Twitter: ChemConnector
Blog: www.chemspider.com/blog
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams