tools for reproducible research in an increasingly digital world

Post on 23-Jan-2017

brian m. bot | sage bionetworks | @BrianMBot

mayo clinic - 2015 jan 28

tools for reproducible research

in an increasingly digital world

sage bionetworks

~40 FTEs

1/2 research - 1/3 platform - 1/6 leadership/support

sage bionetworks

focused on a world where biomedical research will fundamentally change to be more open and collaborative

production

distribution

aggregation

[pie chart: reproducibility outcomes]

54% cannot reproduce

other slices: 6%, 21%, 8%, 11%

other categories: can reproduce in principle; can reproduce w/discrepancies; can reproduce from processed data w/discrepancies; can reproduce partially

the status quo tolerates poor communication of findings

Ioannidis J.P.A. et al. Nature Genetics 2009

208,294,724 datapoints

124 pages supplemental material

?? lines unobtainable source code

?? version or architecture of statistical analysis program (R)

innumerable R packages and package dependencies

key R package “ClaNC” no longer available

1231 citations

often what is in principle reproducible, is not practically reproducible

unidentified publication
‣ from journal with 5 year impact factor of 27
‣ article freely available for download
‣ data freely available for download
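
a minimal sketch of one way to avoid the "?? version" problem above: save the R version, platform, and package versions next to the results. `sessionInfo()` is base R; the helper name `record_session` and the file name `session-info.txt` are hypothetical choices for illustration, not something from the talk.

```r
# record_session() is a hypothetical helper name, not part of any package
record_session <- function(path) {
  info <- sessionInfo()                          # R version, platform, loaded packages
  writeLines(capture.output(print(info)), path)  # save the printed summary as text
  invisible(info)
}

# assumption: a temp directory stands in for the project's results folder
out_file <- file.path(tempdir(), "session-info.txt")
info <- record_session(out_file)
cat(info$R.version$version.string, "\n")
```

keeping this file alongside the analysis code leaves a later reader a record of the environment the analysis ran in.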

“Scientists often study the past as obsessively as historians because few other professions depend so acutely on it. Every experiment is a conversation with a prior experiment, every new theory a refutation of the old”

-Siddhartha Mukherjee, The Emperor of All Maladies

scientific method

1. define a question

2. gather information and resources (background research)

3. form a hypothesis

4. test hypothesis experimentally

5. analyze experimental data

6. draw conclusions based on data

7. publish results

8. retest (frequently done by other scientists)

finite

in

∞...

conducting research for others to consume

(even if the ‘other’ is future you)

reproducible research

tools for reproducible research

code

data

analysis

tools for reproducible research

code

version control

client-server (e.g. svn, cvs)

distributed (e.g. git, mercurial)

tools for reproducible research

data

generic

domain repositories

results

tools for reproducible research

data

digital object identifier (doi)

a unique identifier which remains fixed over the lifetime of a web-accessible object

metadata, including the object’s location, is stored in association with the doi and may change over time

referring to an online document by its doi provides more stable linking than simply referring to a url
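
a small sketch of why a doi links more stably than a url: the doi stays fixed and a resolver looks up the object's current location. this assumes the public resolver at https://doi.org/; the function name `doi_to_url` and the doi shown are placeholders for illustration.

```r
# doi_to_url() is a hypothetical helper: turn a DOI into a resolver link
doi_to_url <- function(doi) {
  doi <- trimws(doi)                                    # drop surrounding whitespace
  doi <- sub("^doi:\\s*", "", doi, ignore.case = TRUE)  # drop an optional "doi:" prefix
  paste0("https://doi.org/", doi)
}

# placeholder DOI, not a citation from the talk
link <- doi_to_url("doi:10.5555/12345678")
print(link)
```

if the publisher moves the object, only the metadata behind the doi changes; the link built here stays the same.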

tools for reproducible research

analysis

R

Sweave: great if you know LaTeX

knitr: great if you are lazy (like me)

tools for reproducible research

analysis

knitr

# Hello World Title
### Author: Brian M. Bot

This is a narrative with inline code execution to tell me that pi is equal to `r pi`. And a plot to show a simple function.

```{r}
x <- 1:100
y <- log(x)/x
plot(x,y)
```

tools for reproducible research

analysis

ipython notebook

tools for reproducible research

other tools

galaxy

docker

packrat

shiny

tools for reproducible research

other tools

enables sharing of all resources (data, code, results) and their relationships to one another

Go Hawks!

brian m. bot | sage bionetworks

brian.bot@sagebase.org @BrianMBot
