I Cannot Reproduce the Work from My Own Laboratory
Presented at the EBI Data and Literature Integration Workshop, UK, December 11, 2013.
I Cannot Reproduce The Work From My Own Laboratory
Philip E. Bourne, University of California San Diego
D. Garijo, S. Kinnings, L. Xie, L. Xie, P.E. Bourne & Y. Gil 2013. Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE 8(11): e80278
The Case of the Tuberculosis Drugome
Similarities between the binding sites of M.tb proteins (blue), and binding sites containing approved drugs (red)
Kinnings et al. 2010 PLoS Comput Biol 6(11): e1000976
Characteristics of the Original and Current Experiment
• Original and Current:
– Purely in silico
– Uses a combination of public databases and open source software by us and others
• Original:
– http://funsite.sdsc.edu/drugome/TB/
• Current:
– Recast in the Wings workflow system
Considered the Ability to Reproduce by Four Classes of User
• REP-AUTHOR – original author of the work
• REP-EXPERT – domain expert – can reproduce even with incomplete methods described
• REP-NOVICE – basic domain (bioinformatics) expertise
• REP-MINIMAL – researcher with no domain expertise
Rule #1: A Conceptual Overview of the Method Should Always Be Included
Time to Reproduce the Method
Some Findings
• Reproducibility is a relative term, e.g. p-value by novices
• The scripts reveal features of the method not found in the paper and should be published/accessible
• Missing parameter values confound reproducibility
• Missing intermediate data confound reproducibility
• Changing public data and software confound exact reproducibility – more versioning is required, as is more intermediate data
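One way to act on the findings above is to save a small manifest next to each workflow step, recording the parameter values used, the software environment, and a checksum of every input and intermediate file. The following is a minimal Python sketch under stated assumptions, not the method used in the study or in Wings: the function names `sha256_of`, `build_manifest` and `write_manifest`, the manifest layout, and the example parameter name are all hypothetical.

```python
import hashlib
import json
import platform
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(parameters, data_files):
    """Collect what a later reader would need to rerun one step.

    parameters: dict of every parameter value used (names are hypothetical).
    data_files: paths of input and intermediate files to fingerprint.
    """
    return {
        # Record the interpreter and platform, since software changes over time.
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        # Record every parameter value, so none is missing from the write-up.
        "parameters": dict(parameters),
        # Fingerprint each file, so changed public or intermediate data is detectable.
        "files": {str(p): sha256_of(p) for p in data_files},
    }


def write_manifest(manifest, out_path):
    """Save the manifest as sorted, indented JSON so diffs stay readable."""
    Path(out_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

A call such as `write_manifest(build_manifest({"similarity_cutoff": 0.05}, ["step1_output.tsv"]), "step1_manifest.json")` would then be run after each step; the parameter name and file names here are illustrative only. Comparing manifests from two runs shows whether a difference in results comes from changed data, changed parameters, or a changed environment.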
Some Thoughts 1/2
• Reproducibility has an associated cost : benefit ratio
• Is there a benefit to establishing reproducibility pre- vs post-publication?
• Thus do we really care enough about reproducibility?
• How much do workflows increase productivity?
• Tools help but policy change is required. What should that policy be?
Some Thoughts 2/2
• If I take your experiment and make it reproducible, should I be rewarded? If yes, how much? Isn’t this like taking your data and putting it in a database?
• Should the funders mandate reproducibility?
• Should publishers begin to accept workflows and virtual machines?