Post on 12-Jan-2017

Standards and tools for model management in biomedical research

Dagmar Waltemath, University of Rostock, Germany

Clickable slides available online from slideshare.

© OpenStreetMap contributors

Junior research group: Management of simulation studies in systems biology

Tool development: SBGN-ED for the graphical representation of networks

Infrastructure project: Data management for systems biology in Germany

Figs: BioModels (top) and DOI: 10.1073/pnas.88.16.7328 (bottom)

Most scientific discoveries rely on previous or other findings.

Fig.: Tyson 2001 (BIOM195)

Fig.: Tyson 1991 (BIOM005)

Goals of scientific publication
– To announce a result
– To convince readers that the result is correct

Traditional science
● Mathematical, complete proofs
● Result description and protocols in experimental sciences

Computer-driven science
● Data analysis with modular software tools/packages
● Workflows
● Databases rather than direct inquiry from in-house laboratories

Mesirov (2010) Science, doi:10.1126/science.1179653

Can we rely on findings that we ourselves cannot evaluate? (Probably not!)

“only in ~20–25% of the projects were the relevant published data completely in line with our in-house findings (Fig. 1c). In almost two-thirds of the projects, there were inconsistencies [..] that either considerably prolonged the duration of the target validation process or, in most cases, resulted in termination of the projects because the evidence [..] was insufficient to justify further investments into these projects.” (Prinz et al (2011))

Reproducibility issues are discussed among key players in science.

Publication: 10.7554/eLife.04333 ; Project progress: https://osf.io/e81xl/wiki/Studies/

Fig.: Chris Ryan/Nature, doi: 10.1038/505612a

We identified key challenges of reproducibility in systems biology and systems medicine.

– Lack of data standards
– Lack of data quality and quantity
– Lack of data availability
– Lack of transparency

53 researchers, 17 countries, various professions

A lack of suitable data standards hinders researchers in providing reproducible results.

Whole Cell meeting (2015)
– Goal: To identify the needs and shortcomings for today's modeling tasks
– Results:
● New developments initiated (databases, data curation tools, training data, modeling approaches, parameter estimation tools, frameworks, parallel simulators, extensions to standard formats)
● New grant proposals and follow-up projects, new networks, better standards, improved tools

Fig.: Waltemath et al (2016) IEEE TBME, accepted for publication

Project homepage: http://bit.ly/wholecell

A lack of data availability makes it impossible for researchers to reproduce results.

Issues
– Simulation studies comprise several files
– Data is heterogeneous, distributed, complex
– Documentation of how the study was performed is often missing

● Model code in BioModels, including supplemental material with a how-to for reproducing the figures given in the original paper
● Online tool makes data available and browsable

TriplexRNA

Recon 2

● Publication backed up with a website containing the supplemental material
● Model code in (noncurated) BioModels
● Visualisation of the model can easily be explored
● References to original works

The COMBINE initiative works towards reproducibility and tool interoperability in computational biology.

Simulation, Guidelines, Ontologies

Coordinate annual meetings
- Next HARMONY: Auckland, June 7-11, 2016
- Next COMBINE: Newcastle, Sep 19-23, 2016

Coordinate standards development
- Common procedures
- Interoperable software tools
- Discussion forums, mailing lists...

Represent community
- Funders
- Other communities

Provide standards resources
- Single entry point
- Resolvable URI
- Web infrastructure

● Model description (network, parameters, kinetics)

Fig.: SBGN-PD map, http://sbgn.org

● Visual representation of network (glyphs)

● Simulation setup

● Definition of observed variables (plots, data tables)

● All files that belong to a (reproducible) simulation study

● Description of archive content

● Have a look at a fully featured COMBINE archive on github
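
The bundling idea in the list above can be sketched in a few lines of Python. This is only an illustration, assuming the usual COMBINE archive layout (a ZIP file with a `manifest.xml` at its root that lists each file and its format). The helper names `build_archive` and `list_contents`, the file names, and the placeholder file contents are hypothetical and not taken from the talk.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace and format identifiers as used in COMBINE archive manifests.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
OMEX_FORMAT = "http://identifiers.org/combine.specifications/omex"
SBML_FORMAT = "http://identifiers.org/combine.specifications/sbml"
SEDML_FORMAT = "http://identifiers.org/combine.specifications/sed-ml"


def build_archive(files):
    """Bundle files (name -> (format URI, bytes)) into an in-memory archive.

    Hypothetical helper: writes a manifest describing every entry, then
    zips the manifest together with the files themselves.
    """
    ET.register_namespace("", OMEX_NS)  # serialise with a default namespace
    manifest = ET.Element(f"{{{OMEX_NS}}}omexManifest")
    content_tag = f"{{{OMEX_NS}}}content"
    # The archive itself and the manifest are listed first.
    ET.SubElement(manifest, content_tag, location=".", format=OMEX_FORMAT)
    ET.SubElement(manifest, content_tag, location="./manifest.xml", format=OMEX_NS)
    for name, (fmt, _data) in files.items():
        ET.SubElement(manifest, content_tag, location=f"./{name}", format=fmt)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.xml", ET.tostring(manifest, encoding="utf-8"))
        for name, (_fmt, data) in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def list_contents(archive_bytes):
    """Return (location, format) pairs read back from the archive manifest."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    return [
        (entry.get("location"), entry.get("format"))
        for entry in root.iter(f"{{{OMEX_NS}}}content")
    ]


# Placeholder payloads stand in for a real model and simulation setup.
study = build_archive({
    "model.xml": (SBML_FORMAT, b"<sbml/>"),
    "experiment.sedml": (SEDML_FORMAT, b"<sedML/>"),
})
for location, fmt in list_contents(study):
    print(location, fmt)
```

A real study would add metadata describing the archive content and use valid model and simulation files. The point of the sketch is only that one file carries everything a tool needs to locate the model and its simulation setup.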

Figs: BioModels

Use of standard formats leads to interoperable software.

Fig.: Workflow screenshots (search for "ubiquitin", results, export)

Query database for annotations, persons, simulation descriptions

Retrieve information about models, simulations, figures, documentation

Export simulation study as COMBINE archive

Download archive and open the study with your favourite simulation tool

Open archive in CAT to modify its contents and to share it with others

Cardiac Electrophysiology Web Lab, Oxford

M2CAT, SEMS

WebCAT, SEMS

JWS Online, Stellenbosch, SA

SED-ML Web Tools, BIOQUANT

We develop tools that help researchers manage standardised data efficiently.

Storage

Search, retrieval & ranking

Using graph databases to integrate standardised model-based data.

doi: 10.1093/database/bau130

doi: 10.1186/s13326-015-0014-4

Search across heterogeneous data, ontologies, and structures.

https://dx.doi.org/10.6084/m9.figshare.3382993.v1

SED-ML DB in JWS Online

Our methods are tested & used in major model repositories.

BioModels, Physiome Model Repository

Transfer of results

Version control & Provenance

Bundling files necessary to reproduce a modeling result.

doi: 10.1093/bioinformatics/btv484

Figure courtesy Martin Scharm, slideshare

Tracking the development of simulation studies over time.

https://dx.doi.org/10.6084/m9.figshare.2543059.v5

How can we bridge the gap between standards for systems biology and systems medicine?

Fig. courtesy Atalag et al (2015) http://hdl.handle.net/2292/27911

Research results must be well documented, comprehensible and reproducible to be trustworthy and reusable.

Current status: Many scientific studies in the life sciences are not reproducible.

Ways out:
- Blogs and databases
- Detailed documentation
- Open data
- Standards
- Reproducibility initiative
- Sustainable Software
- Infrastructure

Desired status: Comprehensible, findable, available, correct models and simulation studies.

Waltemath and Wolkenhauer (2016) How modeling standards, software, and initiatives support reproducibility in systems biology and systems medicine. Accepted for publication, IEEE Transactions on Biomedical Engineering

Thank you for your attention.

http://www.denbi.de/

Gary Bader, Mike Hucka, Chris Myers, David Nickerson, Dagmar Waltemath, Nicolas Le Novère, Martin Golebiewski, Falk Schreiber

@SemsProject

http://co.mbine.org