data and model management in systems biology
![Page 1: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/1.jpg)
Data and model management in Systems Biology
Dagmar Waltemath, University of Rostock, Germany
Kinetics on the move – Happy 10th anniversary to SABIO-RK! Heidelberg, 31st May, 2016
http://www.slideshare.net/dagwa/data-and-model-management-in-systems-biology
![Page 2: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/2.jpg)
Junior research group: Management of simulation studies in systems biology
Tool development: SBGN-ED for the graphical representation of networks
Infrastructure: Data management for systems biology in Germany
Standards and tools for model management
www.sems.uni-rostock.de
![Page 3: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/3.jpg)
© 2009 UNIVERSITÄT ROSTOCK 3
NBI-SysBio: Data management for systems biology in Germany
● Sustainable infrastructure for data management
● Access to documented and reproducible results
● Systems Biology Standards
● Tool Development
● Education
www.denbi.de (training – services – jobs)
![Page 4: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/4.jpg)
Photo: NY - http://nyphotographic.com (CC BY-SA 3.0)
Photo: janneke staaks on flickr
Fig. courtesy 10.1371/journal.pbio.1001779
![Page 5: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/5.jpg)
Data management is …
● Data management describes procedures and actions that help to store, preserve, organize and control the data generated during a (research) project.
● Aspects of data management include:
– Data Ownership
– Metadata Compilation
– Data Lifecycle Control
– Data Quality
– Data Access and Dissemination
Photo: NY - http://nyphotographic.com (CC BY-SA 3.0)
![Page 6: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/6.jpg)
● Data about data
● Improved understanding of encoded data items
● Descriptive details
● Discovery and search for existing data, online browsing of data
● Standardized and structured information
– Purpose, origin, time references, geographic location, creator, access conditions, and terms of use of your data collection
● Often encoded in ontologies
https://www.libraries.psu.edu/psul/pubcur/what_is_dm.html#data-management
Metadata
![Page 7: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/7.jpg)
● Well-structured, controlled vocabularies
● Capture and convey commonly agreed definitions and concepts in a domain
● Communication across people and software tools
● Enable reuse of domain knowledge
● Make implicit domain knowledge explicit and queryable
● Bio-ontologies
– Gene Ontology, ChEBI, UniProt
– Systems Biology Ontology (concepts and terminology for modeling)
Ontologies
![Page 8: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/8.jpg)
Example: Definition of „cell growth“ in the Gene Ontology
id: GO:0016049
name: cell growth
namespace: biological_process
def: "The process in which a cell irreversibly increases in size over time by accretion and biosynthetic production of matter similar to that already present."
synonym: "cell expansion" RELATED []
synonym: "cellular growth" EXACT []
synonym: "growth of cell" EXACT []
is_a: GO:0009987 ! cellular process
is_a: GO:0040007 ! growth
relationship: part_of GO:0008361 ! regulation of cell size
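To illustrate why such a term definition is machine-readable, the following minimal sketch parses the stanza above into a dictionary using only the Python standard library. It assumes the tag-value layout shown on the slide, one pair per line; the function name `parse_stanza` is hypothetical, and a real project would more likely rely on a dedicated ontology library.

```python
from collections import defaultdict

# The stanza from the slide, one tag-value pair per line.
STANZA = """\
id: GO:0016049
name: cell growth
namespace: biological_process
def: "The process in which a cell irreversibly increases in size over time by accretion and biosynthetic production of matter similar to that already present."
synonym: "cell expansion" RELATED []
synonym: "cellular growth" EXACT []
synonym: "growth of cell" EXACT []
is_a: GO:0009987 ! cellular process
is_a: GO:0040007 ! growth
relationship: part_of GO:0008361 ! regulation of cell size
"""


def parse_stanza(text):
    """Collect tag-value pairs; every tag maps to a list of values."""
    term = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, _, value = line.partition(": ")
        # Drop the trailing "! human-readable name" comment, if present.
        value = value.split(" ! ")[0].strip()
        term[tag].append(value)
    return dict(term)


term = parse_stanza(STANZA)
parents = term["is_a"]  # identifiers of the parent terms
```

Because the parent terms are plain identifiers, a tool can follow `is_a` links to answer questions such as "is cell growth a cellular process?" without interpreting free text.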
![Page 9: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/9.jpg)
● Increased confidence and trust in the data
● Better understanding of how to use the data, and of the data itself
● Better data quality
● Coherent data when standards are used
● Improved business processes (saving time, guaranteeing high quality)
● Improved access to data and improved reproducibility
● Better exploitation of data through easier data exchange and integration
Advantages of careful & planned data management
![Page 10: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/10.jpg)
● Reusable
● Exchangeable
● Interoperable
● Long-term available (in open repositories)
● Curateable
● Shareable
Advantages of standardised data
![Page 11: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/11.jpg)
Photo: janneke staaks on flickr
![Page 12: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/12.jpg)
Research data in the modeling life cycle
Models: equations, parameters, data tables
Ideas: text, drawings
Experimental results: text, data tables
Publications: text, figures
Analyses: configuration files, data tables
Fig. courtesy Martin Scharm (adapted)
![Page 13: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/13.jpg)
Research data in the modeling life cycle
● Mathematical formulae
● Networks, diagrams
● Image data
● Publications
● Experiment descriptions
● Experimental results (both lab and simulation)
● Definitions of things (e.g., gene functions, chemical structures...)
Figures top to bottom: (1) By Noah A. Rosenberg et al. Slightly modified by User:Wobble. - Public Library of Science, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=2839383; (2) By http://rsb.info.nih.gov/ij/images/, Public Domain, https://commons.wikimedia.org/w/index.php?curid=655748; (3) BIOM005, generated using CellDesigner 4; (4, 5) PMID:18669651
![Page 14: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/14.jpg)
● Heterogeneous
● Highly connected
● Context-dependent
● Distributed
● Big
Research data in the modeling life cycle
Figures top to bottom: (1) By Noah A. Rosenberg et al. Slightly modified by User:Wobble. - Public Library of Science, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=2839383; (2) By http://rsb.info.nih.gov/ij/images/, Public Domain, https://commons.wikimedia.org/w/index.php?curid=655748; (3) BIOM005, generated using CellDesigner 4; (4, 5) PMID:18669651
![Page 15: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/15.jpg)
The model
● Mathematical equations
● Biological entities
● Kinetic information
● Encoding: SBML & semantic annotations
<bqmodel:isDescribedBy>
  <rdf:Bag>
    <rdf:li rdf:resource="http://identifiers.org/pubmed/18669651"/>
  </rdf:Bag>
</bqmodel:isDescribedBy>
<parameter id="parameter_49" name="L" metaid="metaid_0000078" value="20670"/>
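As a rough illustration of how software can use such a semantic annotation, the sketch below extracts the publication link from the `isDescribedBy` element shown above. The wrapping `<annotation>` element and the function name `described_by` are assumptions made to keep the fragment parseable on its own; in a complete model file the annotation sits inside further RDF elements.

```python
import xml.etree.ElementTree as ET

# Namespace URIs commonly used for these prefixes in annotated models.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bqmodel": "http://biomodels.net/model-qualifiers/",
}

# The annotation from the slide, wrapped in a hypothetical root element
# that declares the two namespace prefixes.
FRAGMENT = """\
<annotation xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
            xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <bqmodel:isDescribedBy>
    <rdf:Bag>
      <rdf:li rdf:resource="http://identifiers.org/pubmed/18669651"/>
    </rdf:Bag>
  </bqmodel:isDescribedBy>
</annotation>
"""


def described_by(xml_text):
    """Return the resource URIs listed under bqmodel:isDescribedBy."""
    root = ET.fromstring(xml_text)
    resource_attr = "{%s}resource" % NS["rdf"]
    items = root.findall(".//bqmodel:isDescribedBy/rdf:Bag/rdf:li", NS)
    return [li.get(resource_attr) for li in items]


uris = described_by(FRAGMENT)
# The last path segment of an identifiers.org URI is the record identifier.
pubmed_ids = [uri.rsplit("/", 1)[-1] for uri in uris]
```

The resulting PubMed identifier is the same one cited in the figure credits (PMID:18669651), which shows how the annotation ties the encoded model to its publication.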
![Page 16: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/16.jpg)
SBML – Standard for model encoding
● Systems Biology Markup Language
● Community-driven de-facto Standard
● Free & open source: www.sbml.org
● Supported by many organizations and tools
● Encodes computational models of biological processes (compartments – species – reactions – parameters)
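The four building blocks named above (compartments, species, reactions, parameters) can be sketched as a small XML skeleton. This is an illustrative outline built with the Python standard library, not a validated SBML file: the identifiers `toy_model`, `cell`, `A`, `B` and `conversion` are hypothetical, the attribute set follows my reading of SBML Level 3 Version 1, and the reaction has no kinetic law. The parameter reuses the values shown on the earlier slide.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"


def build_model():
    """Return a small SBML-like document as an XML string."""
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "3", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": "toy_model"})

    # Compartments: where the species are located.
    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment",
                  {"id": "cell", "size": "1", "constant": "true"})

    # Species: the biological entities.
    species = ET.SubElement(model, "listOfSpecies")
    for sid, amount in (("A", "10"), ("B", "0")):
        ET.SubElement(species, "species", {
            "id": sid, "compartment": "cell", "initialAmount": amount,
            "hasOnlySubstanceUnits": "false", "boundaryCondition": "false",
            "constant": "false",
        })

    # Parameters: id, name and value as shown on the earlier slide.
    parameters = ET.SubElement(model, "listOfParameters")
    ET.SubElement(parameters, "parameter",
                  {"id": "parameter_49", "name": "L", "value": "20670",
                   "constant": "true"})

    # Reactions: A -> B, without a kinetic law to keep the sketch short.
    reactions = ET.SubElement(model, "listOfReactions")
    reaction = ET.SubElement(reactions, "reaction",
                             {"id": "conversion", "reversible": "false",
                              "fast": "false"})
    reactants = ET.SubElement(reaction, "listOfReactants")
    ET.SubElement(reactants, "speciesReference",
                  {"species": "A", "stoichiometry": "1", "constant": "true"})
    products = ET.SubElement(reaction, "listOfProducts")
    ET.SubElement(products, "speciesReference",
                  {"species": "B", "stoichiometry": "1", "constant": "true"})

    return ET.tostring(sbml, encoding="unicode")


xml_text = build_model()

# Read the document back to confirm that it is well-formed XML.
root = ET.fromstring(xml_text)
species_ids = [s.get("id") for s in root.iter("{%s}species" % SBML_NS)]
```

In practice, a dedicated SBML library would typically be used to write and validate models; the sketch only shows how the listed concepts map onto the file structure.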
![Page 17: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/17.jpg)
SBGN – Standard for visual representation
● Systems Biology Graphical Notation
● Standardised glyphs for biological entities
● Three levels
– SBGN-AF | SBGN-ER | SBGN-PD
● Free & open source: www.sbgn.org
● Tool support
● Interpretable Format: SBGN-ML
Fig.: http://sbgn.org
![Page 18: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/18.jpg)
Fig.: SBGN map for BIOM183, CellDesigner
SBGN – Standard for visual representation
Fig.: SBGN map for BIOM005, CellDesigner
![Page 19: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/19.jpg)
● Reproduce behaviour of the model
● Publish and share virtual experiments
– Simulation setup / conditions
– Pre- and post-processing
– Observations
● Encoding: SED-ML & result data in Excel, CSV files
<listOfSimulations>
  <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0" outputEndTime="100" numberOfPoints="100">
    <algorithm kisaoID="KISAO:0000019"/>
  </uniformTimeCourse>
</listOfSimulations>
The analysis
Fig. M. Stefan et al, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2596252/
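To make the simulation setup on this slide concrete, the sketch below reads the `uniformTimeCourse` element and derives the output time grid from it. It assumes that `numberOfPoints` counts the intervals between `outputStartTime` and `outputEndTime`, so the grid has one more entry than that number; this is my reading of the format and should be checked against the SED-ML specification. The function name `read_time_course` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The simulation description from the slide.
FRAGMENT = """\
<listOfSimulations>
  <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0"
                     outputEndTime="100" numberOfPoints="100">
    <algorithm kisaoID="KISAO:0000019"/>
  </uniformTimeCourse>
</listOfSimulations>
"""


def read_time_course(xml_text):
    """Extract the algorithm identifier and the output time grid."""
    sim = ET.fromstring(xml_text).find("uniformTimeCourse")
    start = float(sim.get("outputStartTime"))
    end = float(sim.get("outputEndTime"))
    n = int(sim.get("numberOfPoints"))
    # Assumption: n intervals of equal width, hence n + 1 time points.
    step = (end - start) / n
    times = [start + i * step for i in range(n + 1)]
    return {
        "id": sim.get("id"),
        "algorithm": sim.find("algorithm").get("kisaoID"),
        "times": times,
    }


setup = read_time_course(FRAGMENT)
```

The `kisaoID` attribute refers to a term in the Kinetic Simulation Algorithm Ontology, so a tool can look up which solver is requested instead of relying on a free-text name.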
![Page 20: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/20.jpg)
SED-ML – Standard for model analysis
● Links to models used in an analysis
● Pre- and Post-processing of models
● Type of simulation
● Definition of output
● Free and open source: www.sed-ml.org
● Tool support
→ Showcase your tool support online ←
![Page 21: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/21.jpg)
SED-ML – Standard for model analysis
Fig. M. Stefan et al, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2596252/
Simulation of BIOM183 in SED-ML Web Tools without simulation description
![Page 22: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/22.jpg)
Coordinate annual meetings
- Next HARMONY: Auckland, June 7-11, 2016
- Next COMBINE: Newcastle, Sep 19-23, 2016
Coordinate standards development
- Common procedures
- Interoperable software tools
- Discussion forums, mailing lists...
Represent community
- Funders
- Other communities
Provide standards resources
- Single entry point
- Resolvable URI
- Web infrastructure
Simulation, Guidelines, Ontologies
![Page 23: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/23.jpg)
Standard-compliant software tools for modeling
The path2models project integrated data from different databases into more than 140,000 SBML models.
Fig.: Büchel et al., BMC Sys Biol (2013), http://www.ebi.ac.uk/biomodels-main/path2models
![Page 24: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/24.jpg)
The Systems Biology Workbench is a software framework to help heterogeneous application components communicate with each other.
Modeling
Editing
Simulating
Analysing
http://sbw.sourceforge.net
Standard-compliant software tools for modeling
![Page 25: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/25.jpg)
The decision whether and how to share data often rests with researchers. Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, et al. (2014) Troubleshooting Public Data Archiving: Suggestions to Increase Participation. PLoS Biol 12(1): e1001779. doi:10.1371/journal.pbio.1001779
![Page 26: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/26.jpg)
● Bundling files
● Shipping results
● Exchanging data
● Keeping provenance
● Encoding: zip-like file with a manifest (meta-data)
● Generate, modify & share through WebCAT
COMBINE Archive
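The "zip-like file with a manifest" idea can be sketched with the Python standard library as follows. The file names `model.xml` and `simulation.sedml`, their placeholder contents, and the function name `build_archive` are hypothetical; the manifest layout and format URIs follow my understanding of the COMBINE Archive (OMEX) format and should be checked against its specification before use.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
FORMATS = {
    "omex": "http://identifiers.org/combine.specifications/omex",
    "manifest": OMEX_NS,
    "sbml": "http://identifiers.org/combine.specifications/sbml",
    "sed-ml": "http://identifiers.org/combine.specifications/sed-ml",
}


def build_archive(files):
    """Bundle files into a zip with a manifest listing each entry's format.

    files maps an archive path to a (format key, content bytes) pair.
    """
    manifest = ET.Element("omexManifest", {"xmlns": OMEX_NS})
    # Entries for the archive itself and for the manifest file.
    ET.SubElement(manifest, "content",
                  {"location": ".", "format": FORMATS["omex"]})
    ET.SubElement(manifest, "content",
                  {"location": "./manifest.xml", "format": FORMATS["manifest"]})
    for path, (fmt, _) in files.items():
        ET.SubElement(manifest, "content",
                      {"location": "./" + path, "format": FORMATS[fmt]})

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.xml", ET.tostring(manifest, encoding="utf-8"))
        for path, (_, data) in files.items():
            archive.writestr(path, data)
    return buffer.getvalue()


# Placeholder contents; real archives would hold complete model and
# simulation files.
omex_bytes = build_archive({
    "model.xml": ("sbml", b"<sbml/>"),
    "simulation.sedml": ("sed-ml", b"<sedML/>"),
})

# Read the archive back to list what was bundled.
with zipfile.ZipFile(io.BytesIO(omex_bytes)) as archive:
    entries = sorted(archive.namelist())
    manifest_root = ET.fromstring(archive.read("manifest.xml"))
```

Since the manifest names the format of every file, a receiving tool can decide which entries it is able to open without guessing from file extensions.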
![Page 27: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/27.jpg)
COMBINE Archive
Original publication
SBGN map
SBML model versions
SED-ML files
Open in Webcat
Open in SEEK
![Page 28: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/28.jpg)
Model curation & publication
![Page 30: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/30.jpg)
Model curation, simulation & publication
![Page 31: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/31.jpg)
Introduction to SEEK & FAIRDOM by Olga Krebs.
![Page 32: Data and model management in Systems Biology](https://reader035.vdocument.in/reader035/viewer/2022081515/5876e9ca1a28ab046d8b6d83/html5/thumbnails/32.jpg)
Thank you for your attention.
http://www.denbi.de/ @SemsProject
http://co.mbine.org