Moving forward our shared data agenda: a view from the publishing industry ICSTI, March 2012

Upload: melvyn-king

Post on 25-Dec-2015

Page 1

Moving forward our shared data agenda: a view from the publishing industry

ICSTI, March 2012

Page 2

Data and the Scientific Article

Researchers perceive data sets as “important, but hard to access”

Source: Publishing Research Consortium, 2010 (researchers, N = 3824)

Important, but hard to access

Page 3

Overview: Data & the Scientific Article

• Current approaches
• Thoughts for the future

Page 4

Supplementary Material

• Authors can upload Supplementary Material with their paper

Pros
• Coupling of data and article
• Peer review
• Citation mechanism
• Preservation (byte-wise)

Cons
• Limited data type support
• Compatibility (format support)
• Limited capacity
• Data not centrally stored

Page 5

Connecting with Data Repositories, 1

Link to CCDC database (indicates that information for this article is available)

Screenshot of journal article on ScienceDirect (http://dx.doi.org/10.1016/j.jfluchem.2009.07.015)

Article Linking example: CCDC

Page 6

Connecting with Data Repositories, 2

... clicking on the CCDC logo takes the reader to a page at the CCDC repository with data related to the article

Screenshot of information page at CCDC (Cambridge Crystallographic Data Centre)

Article Linking example: CCDC

Page 7

Connecting with Data Repositories, 3

Tagged Genbank entry (genetic sequence)

Screenshot of journal article on ScienceDirect (http://dx.doi.org/10.1016/j.biortech.2010.03.063)

Entity Linking example: Genbank Accession Number

Page 8

Connecting with Data Repositories, 4

... clicking on the linked Genbank accession code takes the reader to an information page on the NCBI data repository about that specific genetic sequence

Screenshot of information page at NCBI (National Center for Biotechnology Information)

Entity Linking example: Genbank Accession Number
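
The entity linking shown on these two slides can be approximated in a few lines. What follows is a minimal sketch and not the publisher's actual tagging pipeline. It assumes the classic GenBank nucleotide accession shapes (one letter plus five digits, or two letters plus six digits, with an optional version suffix) and assumes NCBI's public nucleotide record pages as the link target. The names `ACCESSION_RE`, `find_accessions` and `link_accessions` are hypothetical.

```python
import re

# Assumption: classic GenBank nucleotide accessions are one letter plus five
# digits (e.g. U49845) or two letters plus six digits (e.g. AF123456),
# optionally followed by a version suffix such as ".1".
ACCESSION_RE = re.compile(r"\b([A-Z]\d{5}|[A-Z]{2}\d{6})(\.\d+)?\b")

# Assumption: NCBI's public nucleotide record page is the link target.
NCBI_NUCCORE_URL = "https://www.ncbi.nlm.nih.gov/nuccore/"


def find_accessions(text):
    """Return the distinct accession numbers in text, in order of first appearance."""
    seen = []
    for match in ACCESSION_RE.finditer(text):
        accession = match.group(0)
        if accession not in seen:
            seen.append(accession)
    return seen


def link_accessions(text):
    """Map each accession found in text to its assumed NCBI record URL."""
    return {acc: NCBI_NUCCORE_URL + acc for acc in find_accessions(text)}


if __name__ == "__main__":
    sample = "Sequences were deposited in GenBank under U49845 and AF123456.1."
    for acc, url in link_accessions(sample).items():
        print(acc, "->", url)
```

A pattern this simple will also match unrelated codes that happen to share the same shape, so a production tagger would presumably validate candidates against the repository before linking.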

Page 9

Connecting with Data Repositories, 5

Database                      Subject                  Type of Linking
CCDC                          Crystallography          Article-level
PANGAEA                       Earth Sciences           Article-level*
EMBL Molecular Interactions   Chemistry                Entity, tagging
Molecular INTeraction DB      Chemistry                Entity, tagging
Genbank                       Nucleotides              Entity, tagging
UniProt                       Proteins                 Entity, tagging
Protein Data Bank             Proteins                 Entity, tagging
ClinicalTrials                Medicine                 Entity, tagging
TAIR (Arabidopsis)            Model organism           Entity, tagging
Mendelian Inheritance in Man  Genetics, inheritance    Entity, tagging

*: with Application

Page 10

The Article of the Future

Page 11

Discovery and Use via SciVerse Applications

Use information from SciVerse and the web

Support for rich user interfaces

Integrated directly into the online article

Simple to build using Content and Framework APIs

Open standards (Apache Shindig, OpenSocial)

Features & Benefits

Page 12

Discovery and Use via SciVerse Applications

• Give me your data, my way…

Openness and Interoperability

• Know who I am and what I want…

Personalization

• The right contacts, at the right time…

Collaboration and trusted views

Libraries can become a focal point for applications

Researchers can save time and improve their information discovery process

“Apps interacting with results are very important to help save time…”

Applications can target specific information to facilitate content mining and shorten search time, leaving more time for analysis.

“What faculty is really after is something that ties this all together, so it's all in one place…”

Applications help researchers extract all information (content, data, figures, etc.) into a single analysis source, which can be a local database at the customer’s institute.

Page 13

Applications example: NCBI Genome Viewer

• Scans the article and builds list of sequences based on NCBI accession numbers tagged in the article
• View/analyze sequence data from genes in the article using NCBI Sequence Viewer
• See specific information about each strand; zoom in/out; export data

Screenshots of journal article on ScienceDirect (http://dx.doi.org/10.1016/j.ygeno.2007.07.010)

Page 14

Applications example: PANGAEA

• Document identifier sent to PANGAEA data repository for earth sciences
• PANGAEA returns map plotted with locations where cited data was collected
• Push-pins open with details of dataset and direct link to data on PANGAEA.de

Screenshots of journal article on ScienceDirect (http://dx.doi.org/10.1016/S0377-8398(01)00044-5)

Page 15

Elsevier Enables Content Mining

CONTENT

Customers may:

Run extensive searches and use locally loaded content for text mining purposes for their own research.

Perform extensive mining operations on subscribed content:
• Structuring input text
• Deriving patterns within the structured text
• Evaluation and interpretation of the output

Extract semantic entities from Elsevier content for the purpose of recognition and classification of the relations between them

Integrate results on a server used for the customer’s own mining system for access and use by its researchers through the customer’s internal secure network.

We also enable developers who wish to design and implement applications that analyse our content, or who wish to test applications on Elsevier content as part of their research.
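
The three mining steps named on this slide (structuring the input text, deriving patterns within the structured text, and evaluating the output) can be illustrated with a toy pipeline. This is a sketch under stated assumptions, not Elsevier's mining tooling. It assumes the "pattern" of interest is co-occurrence of two terms in the same sentence and that evaluation is a simple frequency cutoff. The stop-word list and the function names `structure_text`, `derive_patterns` and `evaluate` are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "with"}


def structure_text(document):
    """Step 1: structure the input text as one list of tokens per sentence."""
    sentences = re.split(r"[.!?]+", document.lower())
    structured = []
    for sentence in sentences:
        tokens = [t for t in re.findall(r"[a-z0-9]+", sentence) if t not in STOP_WORDS]
        if tokens:
            structured.append(tokens)
    return structured


def derive_patterns(structured):
    """Step 2: count how often two distinct terms occur in the same sentence."""
    counts = Counter()
    for tokens in structured:
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return counts


def evaluate(counts, min_count=2):
    """Step 3: keep pairs seen at least min_count times, most frequent first."""
    kept = [(pair, n) for pair, n in counts.items() if n >= min_count]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    text = (
        "Kinase binds receptor. "
        "The kinase binds the receptor in cells. "
        "Receptor signalling is slow."
    )
    for pair, n in evaluate(derive_patterns(structure_text(text))):
        print(pair, n)
```

Real entity and relation extraction would replace the co-occurrence count with domain-specific recognisers, but the three-stage shape stays the same.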

Page 16

Our Content Mining Solution Suite

Diagram labels: CONTENT DELIVERY; SEARCH & ANALYSIS; WORKFLOW SOLUTIONS

Page 17

Current initiative overview

◦ Supplementary Material
◦ Linking to Data Repositories
◦ Presentation via Article of the Future
◦ Discovery and Use via SciVerse Applications
◦ Empower scientists to mine content and use locally

***************************

◦ Data store (600 terabytes at present)
◦ Executable papers
◦ Workflow tools
◦ Etc.

Page 18

Conclusions: some thoughts for the future

RESEARCHERS

FUNDERS

PUBLISHERS

INSTITUTIONS

Need for aligned strategies and policies, sustainable business models, and concerted collaboration