I Cannot Reproduce The Work From My Own Laboratory

I Cannot Reproduce The Work From My Own Laboratory. Philip E. Bourne, University of California San Diego. D. Garijo, S. Kinnings, L. Xie, L. Xie, P.E. Bourne & Y. Gil 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE, 8(11): e80278

Upload: philip-bourne

Post on 06-May-2015

DESCRIPTION

Presented at the EBI Data and Literature Integration Workshop, UK, December 11, 2013.

TRANSCRIPT

Page 1: I Cannot Reproduce The Work From My Own Laboratory

I Cannot Reproduce The Work From My Own Laboratory

Philip E. Bourne
University of California San Diego

D. Garijo, S. Kinnings, L. Xie, L. Xie, P.E. Bourne & Y. Gil 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE, 8(11): e80278

Page 2: I Cannot Reproduce The Work From My Own Laboratory

The Case of the Tuberculosis Drugome

Similarities between the binding sites of M.tb proteins (blue), and binding sites containing approved drugs (red)

Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976

Page 3: I Cannot Reproduce The Work From My Own Laboratory

Characteristics of the Original and Current Experiment

• Original and Current:
– Purely in silico
– Uses a combination of public databases and open source software by us and others
• Original:
– http://funsite.sdsc.edu/drugome/TB/
• Current:
– Recast in the Wings workflow system

Page 4: I Cannot Reproduce The Work From My Own Laboratory

Considered the Ability to Reproduce by Four Classes of User

• REP-AUTHOR – original author of the work
• REP-EXPERT – domain expert – can reproduce even with incomplete methods described
• REP-NOVICE – basic domain (bioinformatics) expertise
• REP-MINIMAL – researcher with no domain expertise

Page 5: I Cannot Reproduce The Work From My Own Laboratory

Rule #1: A Conceptual Overview of the Method Should Always Be Included

Page 6: I Cannot Reproduce The Work From My Own Laboratory

Time to Reproduce the Method

Page 7: I Cannot Reproduce The Work From My Own Laboratory

Some Findings

• Reproducibility is a relative term, e.g. p-value by novices
• The scripts reveal features of the method not found in the paper and should be published/accessible
• Missing parameter values confound reproducibility
• Missing intermediate data confounds reproducibility
• Changing public data and software confounds exact reproducibility – more versioning is required, as is more intermediate data
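
The findings above on missing parameter values, missing intermediate data, and changing public data can be illustrated with a small run manifest. This is a minimal sketch, not the method used in the study or in Wings: the function names (`build_manifest`, `write_manifest`, `changed_files`) and any parameter or tool names passed in are hypothetical. It records the parameters, tool versions, and a checksum for each data file of one step, so a later reader can tell whether an input has changed since the original run.

```python
import hashlib
import json
import platform
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(parameters, data_files, tool_versions):
    """Collect what a later reader would need to rerun one step.

    All three arguments are supplied by the caller; nothing here is
    specific to any particular workflow system (hypothetical helper).
    """
    return {
        "parameters": dict(parameters),
        "tool_versions": dict(tool_versions),
        "python_version": platform.python_version(),
        "data_checksums": {str(p): sha256_of(p) for p in data_files},
    }


def write_manifest(manifest, path):
    """Save the manifest as JSON next to the results."""
    with open(path, "w") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)


def changed_files(manifest):
    """List recorded files that are missing or whose checksum differs."""
    changed = []
    for name, recorded in manifest["data_checksums"].items():
        path = Path(name)
        if not path.exists() or sha256_of(path) != recorded:
            changed.append(name)
    return changed
```

A step would call `build_manifest` with its actual parameter values and intermediate files, save the result with `write_manifest`, and publish it alongside the scripts. Someone reproducing the work could then run `changed_files` first to see which public or intermediate data no longer match the original run.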

Page 8: I Cannot Reproduce The Work From My Own Laboratory

Some Thoughts 1/2

• Reproducibility has an associated cost:benefit ratio

• Is there a benefit to making work reproducible pre- vs post-publication?

• Thus do we really care enough about reproducibility?

• How much do workflows increase productivity?

• Tools help but policy change is required. What should that policy be?

Page 9: I Cannot Reproduce The Work From My Own Laboratory

Some Thoughts 2/2

• If I take your experiment and make it reproducible, should I be rewarded? If yes, how much? Isn’t this like taking your data and putting it in a database?

• Should the funders mandate reproducibility?

• Should publishers begin to accept workflows and virtual machines?