reproducible research (rr) - github...
Reproducible Research
Motivations
Acknowledgement of RR
Media
Challenges
Summary
References
Reproducible Research (RR): Some Comments
Greg Voisin and Sahir Bhatnagar¹
June 18, 2015
¹ http://admingreenwoodlab.github.io/tutorials/
Why should we care about RR?
For Science
• Standard to judge scientific claims
• Avoid duplication
• Cumulative knowledge development
For You
• Better work habits
• Better teamwork
• Changes are easier
• Higher research impact
Figure 1 : http://www.nature.com/news/reproducibility-1.17552
Medicine
Figure 2 : Annals of Internal Medicine (Laine et al. 2007)
Bioconductor
Figure 3 : Bioconductor (Gentleman and Lang 2004)
Biostatistics
Figure 4 : Biostatistics (Peng 2009)
Biostatistics requirements for RR
1. data analysis script
2. other code
3. data
4. script for results used in paper
5. knitr file (.Rnw)
6. resulting .tex file from compiling with knitr
7. BibTeX file
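The checklist above lends itself to an automated check before submission. The following is a minimal sketch in Python, not the journal's own tooling: the mapping from checklist items to file extensions is an assumption (the slide names only .Rnw, .tex and BibTeX explicitly), and the file names are hypothetical.

```python
from pathlib import PurePath

# Hypothetical mapping from checklist items to file extensions.
# Only .Rnw, .tex and BibTeX are named on the slide; the rest are assumptions.
REQUIRED_EXTENSIONS = {
    "analysis or other code": {".r"},
    "data": {".csv", ".rds", ".rdata"},
    "knitr source": {".rnw"},
    "compiled LaTeX": {".tex"},
    "bibliography": {".bib"},
}


def missing_items(filenames):
    """Return the checklist items with no matching file, in checklist order."""
    present = {PurePath(name).suffix.lower() for name in filenames}
    return [
        item
        for item, extensions in REQUIRED_EXTENSIONS.items()
        if not (extensions & present)
    ]


# Hypothetical submission that forgot the data and the bibliography.
submission = ["analysis.R", "helpers.R", "paper.Rnw", "paper.tex"]
print(missing_items(submission))
```

A check like this only confirms that the files exist; whether they actually regenerate the paper still has to be tested by recompiling from scratch.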
CRAN has a dedicated Task View for RR
http://cran.r-project.org/web/views/
Coursera
Figure 5 : https://www.coursera.org/course/repdata
Costs
How did they come up with that number?
Challenges
• Large data/computations
• Complicated pipelines
• Privacy issues
• Getting PIs on board
Caution
• Being reproducible doesn’t make a result right
• Not being reproducible doesn’t make it wrong
If you can only take away one thing from today’s discussion...
$$\text{Reproducibility} \propto \frac{1}{\text{copy paste}}$$
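One way to act on this is to let code write every number that appears in the text, which is the idea behind knitr's inline `\Sexpr{}` expressions. The sketch below shows the same principle in Python purely as an illustration; the measurements, the sentence template and the `report` helper are hypothetical and do not come from the talk.

```python
from statistics import mean, stdev
from string import Template

# Hypothetical measurements; a real analysis would read these from the data file.
measurements = [4.1, 3.9, 4.4, 4.0, 4.6]

# The sentence is a template: no number in it is ever typed or pasted by hand.
SENTENCE = Template("The mean response was $mean (SD $sd, n = $n).")


def report(values):
    """Fill the sentence template with statistics computed from the values."""
    return SENTENCE.substitute(
        mean=f"{mean(values):.2f}",
        sd=f"{stdev(values):.2f}",
        n=len(values),
    )


print(report(measurements))
```

If the data change, rerunning the script updates the sentence, so the text cannot drift out of sync with the analysis.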
References I
Christopher Gandrud, Reproducible Research with R and RStudio, Chapman and Hall/CRC The R Series, 2013.
David Smith, Did an Excel error bring down the London Whale?, http://blog.revolutionanalytics.com/2013/02/did-an-excel-error-bring-down-the-london-whale.html.
C. Laine, S. N. Goodman, M. E. Griswold, and H. C. Sox, Reproducible research: moving toward research the public can really trust, Ann. Intern. Med. 146 (2007), no. 6, 450–453.
New York Times, Reporters find science journals harder to trust, but not easy to verify, http://www.nytimes.com/2006/02/13/business/media/13journal.html?_r=0&adxnnl=1&pagewanted=all&adxnnlx=1390399611-aqm52MhkXkIFF7Azx7irCg.
R. D. Peng, Reproducible research and Biostatistics, Biostatistics 10 (2009), no. 3, 405–408.
References II
Sergey Fomel and Jon F. Claerbout, Guest editors’ introduction: Reproducible research, Computing in Science and Engineering (Jan/Feb 2009).
Yihui Xie, Dynamic Documents with R and knitr, Chapman and Hall/CRC The R Series, 2013.