Data Publication: Discover, Explore, Visualise
Alejandra Gonzalez-Beltran, PhD
Research Lecturer
Oxford e-Research Centre
University of Oxford
Data Visualisation and the Future of Academic Publishing
University of Oxford and Oxford University Press
June 10th 2016
@alegonbel
Philippe Rocca-Serra, PhD, Senior Research Lecturer
Alejandra Gonzalez-Beltran, PhD, Research Lecturer
Milo Thurston, DPhil, Research Software Engineer
Massimiliano Izzo, PhD, Research Software Engineer
Peter McQuilton, PhD, Knowledge Engineer
Our main areas of research and activity:
• Enabling reproducible research through…
• Data collection, curation, representation etc.
• Data publication
• Data provenance
• Development of software, infrastructure
• Open, community ontologies and standards
• Semantic web / linked data
• Training
Allyson Lister, PhD, Knowledge Engineer
Eamonn Maguire, DPhil, Software Engineer contractor
David Johnson, PhD, Research Software Engineer
Susanna-Assunta Sansone, PhD, Principal Investigator, Associate Director
Communities we work with/for:
Outline
• Challenges associated with scholarly data
• Importance of all research outputs / metadata
  o Reproducibility crisis
  o Experiments description
  o Data availability
• Data publication
  o Springer Nature Scientific Data
• Discover, Explore, Visualise Scholarly Data
  o Scientific Data ISA-explorer
Credit to: https://www.digital-science.com/blog/news/five-top-reasons-to-protect-your-data-and-practise-safe-science/
Challenges related to scholarly data
• Outputs are multi-dimensional, diverse, not always well cited / stored
  o Software, codes, workflows etc.; hard(er) to get hold of
• Data often distributed and fragmented to fit (siloed) databases
  o Without enough information for others to understand it
• Uneven level of details and annotation across different databases
  o Specialized, generalist, public and institutional
• Data curation activities are perceived as time-consuming
  o Collection and harmonization of detailed methods and experimental steps is done/rushed at publication stage
But… shared data is not always understandable, reusable
Importance of:
- avoiding selective reporting
- experimental design
- statistical power
- statistical analysis
- code/methods availability
- data availability
• Incentive, credit for sharing
  o Big and small data
  o Unpublished data
  o Long tail of data
  o Curated aggregation
• Peer review of data
• Value of data vs. analysis
• Discoverability and reusability
  o Complementing community databases
Growing number of data papers and data journals
nature.com/scientificdata
Honorary Academic Editor: Susanna-Assunta Sansone, PhD
Managing Editor: Andrew L Hufton, PhD
Editorial Curator: Varsha Khodiyar
Publisher: Iain Hrynaszkiewicz
A new open-access, online-only publication for descriptions of scientifically valuable datasets
[Figure: research papers, data records, Data Descriptors]
Value added: complement between traditional articles & repositories
Scientific hypotheses:
• Synthesis
• Analysis
• Conclusions
Methods and technical analyses supporting the quality of the measurements:
• What did I do to generate the data?
• How was the data processed?
• Where is the data?
• Who did what, when?
Relation with traditional articles – content
Citation of and links to data files and databases
Credit for data producers
A new article type
A new category of publication that provides detailed descriptors of scientifically valuable datasets
Mandates open data, without unnecessary restrictions, as a condition of submission
Summary
• Challenges associated with scholarly data
• Importance of all research outputs / metadata
  o Reproducibility crisis
  o Experiments description
  o Data availability
• Data publication
  o Springer Nature Scientific Data
• Discover, Explore, Visualise Scholarly Data
  o Scientific Data ISA-explorer