eChemInfo2005 © S.J. Coles 2005 Open Archives as a Route for Capture, Dissemination and Access to Chemical Data and Information Simon Coles School of Chemistry, University of Southampton, U.K. [email protected]


Page 1:

eChemInfo2005 © S.J. Coles 2005

Open Archives as a Route for Capture, Dissemination and Access to

Chemical Data and Information

Simon Coles

School of Chemistry,

University of Southampton, U.K.

[email protected]

Page 2:

Data – Information – Knowledge

Experiment

Structure - Property, Prediction

Model


Page 3:

Data Overload!


15,000,000

1,500,000

450,000

Page 4:

Funding Body Mandate

Page 5:

Open Access as the Answer?

Page 6:

Separating Data from Interpretations

Underlying data

Intellect & Interpretation

Page 7:

Crystallography workflow

RAW DATA → DERIVED DATA → RESULTS DATA

Page 8:

Simple & Rapid Deposition

Data manipulation toolbox

Associated Metadata

Value added

Format conversion

Page 9:

An Archive Entry

ecrystals.chem.soton.ac.uk

Page 10:

Access to the underlying data

Page 11:

Metadata Publication

• Using simple Dublin Core
  • Crystal structure
  • Title (Systematic IUPAC Name)
  • Authors
  • Affiliation
  • Creation Date

• Additional chemical information through Qualified Dublin Core
  • Empirical formula
  • International Chemical Identifier (InChI)
  • Compound Class
  • Keywords

• Specifies which ‘datasets’ are present in an entry

• DOI

• Rights
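
The simple Dublin Core fields listed on this slide could be serialised roughly as follows. This is a minimal sketch, not the archive's actual schema: the function name `build_dc_record`, the mapping of affiliation to `dc:publisher`, and all sample values (compound, authors, DOI) are assumptions made for illustration. The Qualified Dublin Core chemical fields (empirical formula, InChI, compound class) are left out because their element names are not given in the slides.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for simple Dublin Core and its OAI container.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)


def build_dc_record(title, creators, affiliation, date,
                    entry_type="Crystal structure",
                    identifier=None, rights=None):
    """Build a simple Dublin Core record for one archive entry.

    The field-to-element mapping is an assumption for illustration,
    e.g. affiliation is stored as dc:publisher.
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")

    def add(name, value):
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value

    add("type", entry_type)
    add("title", title)            # systematic IUPAC name
    for creator in creators:       # one dc:creator per author
        add("creator", creator)
    add("publisher", affiliation)
    add("date", date)              # creation date
    if identifier:
        add("identifier", identifier)  # e.g. a DOI
    if rights:
        add("rights", rights)
    return root


# Hypothetical entry; none of these values come from the archive.
record = build_dc_record(
    title="4-chlorophenol",
    creators=["A. Author", "B. Author"],
    affiliation="Example University",
    date="2005-01-01",
    identifier="doi:10.0000/example",
    rights="Hypothetical rights statement",
)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the record in unqualified Dublin Core means any generic harvester can read it, while the chemistry-specific fields can be layered on separately for specialised services.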

Page 12:

Harvesting & Aggregating: Google

Coles, S.J., Day, N.E., Murray-Rust, P., Rzepa, H.S., Zhang, Y., Org. Biomol. Chem., 2005, (10),1832-1834. DOI: 10.1039/b502828k

Page 13:

Harvesting: OAIster
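
Services such as OAIster collect metadata from repositories over the OAI-PMH protocol. The sketch below shows the general shape of a harvest, assuming the archive exposes an OAI-PMH endpoint: it builds a `ListRecords` request URL and pulls titles out of a response. The base URL, the helper names and the sample response are hypothetical; only the `verb` and `metadataPrefix` parameters and the XML namespaces come from the protocol itself. No network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Return the OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(response_xml):
    """Return every dc:title found in an OAI-PMH response document."""
    root = ET.fromstring(response_xml)
    return [title.text for title in root.iterfind(".//dc:title", NS)]


# Hand-written sample response, not real archive output.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc
            xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example structure</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical endpoint; a real harvester would fetch this URL.
print(list_records_url("https://archive.example.org/oai2"))
print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also follow resumption tokens to page through large result sets, which this sketch omits.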

Page 14:

Linking and aggregating

Page 15:

Embedded in a science portal

Page 16:

eBank/eCrystals Future

Phase 2 completion:
• Robust software
• Full embedding in daily laboratory practice
• Roll out to other institutions
• Full support from host institution
• Final endorsement by IUCr

Phase 3:
• Community acceptance
• Specialised aggregator services (Crystallography)
• Generic aggregator services (Chemistry / Science)
• Heterogeneous sources for aggregators
• Archive development in other disciplines

Page 17:

Laboratory Repositories

Page 18:

R4L: Prototype Repository

Page 19:

Thanks

Chemistry: Mike Hursthouse, Jeremy Frey, Andrew Milsted, Susanne Huth, Wendy King, David Hughes

Electronics and Computer Science: Les Carr, Chris Gutteridge, Tim Miles-Board

UKOLN / PSIgate: Liz Lyon, Rachel Heery, Monica Duke, Michael Day, Andy Powell, John Blundon-Ellis

££££($$$$)’s

Page 20:

Take-Home Message

“The internet wasn't created for mockery! It was created so scientists from different universities could share datasets....”

Simpson, H. The Simpsons (2005), Eds. Groening, M., Brooks, J.L. & Simon, S., Series 16, Episode 8, Original air date (US) 06-Feb-2005.

http://www.tvtome.com/tvtome/servlet/GuidePageServlet/showid-146/epid-346864/