WHAT ARE WE GOING TO DO WITH DATA?
Rob L Davidson, GigaScience
@bobbledavidson #WCSJ2015
This presentation DOI:10.6084/m9.figshare.1439750
Up ahead
• The need for Open Data in science
• GigaScience and GigaDB
• Everything is data
• Open is accessible
• Literate programming
• So, what are we going to do with data?
THE NEED FOR OPEN DATA IN SCIENCE
• Researcher bias
• Positive result bias
• 20 teams do studies, 1 publishes p<0.05
• Poorly explained analyses
DOI: 10.1371/journal.pmed.0020124
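The "20 teams, 1 publishes" point can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that all 20 studies test an effect that does not exist, that they are independent, and that each uses a 0.05 threshold. The function name is hypothetical and does not come from the talk.

```python
def prob_at_least_one_false_positive(n_studies: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null studies reaches p < alpha."""
    # Each study avoids a false positive with probability (1 - alpha),
    # so all n avoid one with probability (1 - alpha) ** n.
    return 1.0 - (1.0 - alpha) ** n_studies


if __name__ == "__main__":
    p = prob_at_least_one_false_positive(20)
    print(f"P(at least one p < 0.05 among 20 null studies) = {p:.2f}")
```

Under these assumptions the chance is roughly 64%, so a single published positive result says little unless the unpublished studies are visible too.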
Problem: Reproducibility
Out of 18 microarray papers, results from 10 could not be reproduced
DOI: 10.1038/ng.295
Software?
http://reproducibility.cs.arizona.edu/
“The good news is that I was able to find some code. I am just hoping that it is a stable working version of the code... I have lost some data... The bad news is that the code is not commented and/or clean. So, I cannot really guarantee that you will enjoy playing with it.”
613 papers tested
123 successful reproductions
DOI: 10.1371/journal.pmed.1001747
85% of research resources are wasted!
We must ... favor ... unbiased, transparent, collaborative research with greater standardization
Share data, protocols, materials, software, other tools
What, why Open Data?
• Knowledge is open if anyone is
– free to access,
– use,
– modify,
– and share it
– subject, at most, to measures that preserve provenance and openness.
http://opendefinition.org/od/
FAIR Data
http://datafairport.org/
GIGASCIENCE AND GIGADB
The publishing tradition
1665, 1812, 1869
The publishing tradition
• Aimed at paper product
• Limited length
• Limited detail
• No supporting data
• No supporting code
• Poor images
• Limited figures
Anatomy of a traditional Publication
[Diagram labels: Data, Idea, Study, Analysis, Answer, Metadata]
Anatomy of an Open Data Publication
[Diagram labels: Data, Idea, Study, Analysis, Answer, Metadata]
Multi-faceted publication
Open-access journal
Data Publishing Platform
Data Analysis Platform
Data
Metadata
Methods
Analyses
“Deconstructed” Journal
“Regular” Journal
“Conscientious” Online Journal
Image Source: http://commons.wikimedia.org/wiki/File:System-Mechanic-California.jpg
EVERYTHING IS DATA
Data is data
DOI:10.1186/2047-217X-3-7
Software is data
“For loading data from the provided datasets, a script that can load individual spectra or images is provided”
Metadata is data
Findable, reusable…
• Bioontologies/ISA-Tab
– Standard language
• ORCID
– Unique, traceable authors
• FundRef
– Track funding outputs
• APIs
– Easy search
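One way to picture why metadata makes data findable is a small in-memory search over metadata records. This is only a sketch: the record fields, DOIs, ORCID iDs and funder identifiers below are all made-up placeholders, and `DatasetRecord` and `search` are hypothetical names, not part of GigaDB or any real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    doi: str
    author_orcids: tuple[str, ...]
    funder_ids: tuple[str, ...]
    keywords: tuple[str, ...]


# Made-up catalogue entries (placeholder DOIs, iDs and funder identifiers).
CATALOGUE = [
    DatasetRecord("10.0000/example.1", ("0000-0000-0000-0001",),
                  ("funder-A",), ("genomics",)),
    DatasetRecord("10.0000/example.2",
                  ("0000-0000-0000-0001", "0000-0000-0000-0002"),
                  ("funder-B",), ("imaging",)),
    DatasetRecord("10.0000/example.3", ("0000-0000-0000-0002",),
                  ("funder-A",), ("genomics", "imaging")),
]


def search(records, *, orcid=None, funder=None, keyword=None):
    """Return the records that match every criterion that was supplied."""
    hits = []
    for rec in records:
        if orcid is not None and orcid not in rec.author_orcids:
            continue
        if funder is not None and funder not in rec.funder_ids:
            continue
        if keyword is not None and keyword not in rec.keywords:
            continue
        hits.append(rec)
    return hits


if __name__ == "__main__":
    # Everything one author contributed to, then everything one funder paid for.
    print([r.doi for r in search(CATALOGUE, orcid="0000-0000-0000-0001")])
    print([r.doi for r in search(CATALOGUE, funder="funder-A")])
```

Because authors and funders are stored as unique identifiers rather than free-text names, the same query works however a name happens to be spelled in a given paper.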
ACCESSIBLE, USABLE DATA
Curation
• Not all science data is pretty
• ISA-Tab, SRA help
• Peer-reviewed data is better data
http://bit.ly/1F47YZz
Software pipelines
Gigagalaxy.net
[Screenshot panels: Tool List, Tool Parameters, History/results]
Visualise pipelines
Gigagalaxy.net
Reproducing results?
SOAPdenovo2 S. aureus pipeline
DOI: 10.1186/2047-217X-1-18
Easy installation
• Virtual machine
– Pre-installed
– Peer-reviewed
– Reproducibility, frozen in time
DOI:10.1186/2047-217X-3-23
Literate programming
• Data journalism for all!
• knitr, IPython, Project Jupyter
DOI:10.1186/2047-217X-3-3
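The core idea of literate programming is that the narrative and the code that produces its numbers live in one document. The sketch below is a much-simplified stand-in for that idea in plain Python, not how knitr or Jupyter work internally. The measurements are made up and the names `REPORT_TEMPLATE` and `weave` are hypothetical.

```python
from statistics import mean

# Made-up measurements, used only to illustrate the idea.
MEASUREMENTS = [4.1, 3.9, 4.4, 4.0]

# Narrative text with placeholders that are filled from computed values.
REPORT_TEMPLATE = (
    "We collected {n} measurements. "
    "Their mean was {avg:.2f}, and the largest was {largest:.1f}."
)


def weave(measurements):
    """Compute the statistics and weave them into the narrative."""
    return REPORT_TEMPLATE.format(
        n=len(measurements),
        avg=mean(measurements),
        largest=max(measurements),
    )


if __name__ == "__main__":
    print(weave(MEASUREMENTS))
```

Since every number in the sentence is computed from the data, a reader can rerun the script on the published dataset and check that the text still matches.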
WHAT ARE WE GOING TO DO WITH DATA?
Add value
http://bit.ly/1JyTfxO
Do science?
• Data
– DOI: 10.5524/100034
• Subsequent analysis
– DOI: 10.1126/scitranslmed.3006086
• Science journalism
– http://bit.ly/1AXEkKJ
• Why not do part 2 as well?
Summary
• Science has problems – so how good can science journalism be?
• Things are changing – slowly
• The future is bright
• The future is data-driven
• Data journalists will be the new scientists?
THANKS!
GigaScience team:
• Scott Edmunds
• Peter Li
• Chris Hunter
• Jesse Xiao
• Rob Davidson