Systems Biology Model Semantics and Integration
Allyson Lister
Biology, Neurosciences & Computing Group
Newcastle University
29 July 2011
This presentation is licensed under Creative Commons BY-SA 2.5
Background in Standards
In the beginning, there was syntax... TrEMBL, FuGE, SBML
...then came content... MIGS/MIMS
...and ultimately, semantics: OBI, SBO, BioPAX
Background in Integration
Throughout, an interest in data integration
Redundancy removal in TrEMBL
International Protein Index (IPI)
FuGE-based metadata database and data storage (SyMBA)
Semantic data integration
Rule-Based Mediation (RBM)
Integrate data from multiple data sources into a single, core ontology for reasoning, querying and data extraction back to a chosen (non-OWL) format
RBM (continued)
Resolution of syntactic and semantic heterogeneity occurs separately
The core ontology is a semantically-rich description of the research domain of interest
Syntactic ontologies pass data to the core via SWRL
They are either syntactic translations of data formats into OWL or pre-existing OWL ontologies
Mainly uses existing, independent ontologies and off-the-shelf libraries and applications
RBM (continued)
Is not
Ontology alignment
Alignment is often a prelude to ontology merging, and is used where domains at least partly intersect
We do not intend to merge ontologies, and each data source may be very different from another
Format reconciliation
We are not trying to create a single, overarching format, just to quickly pull data from many formats
Systems Biology and RBM
Add information to models
Add new interactions/pathways to existing models
Add new biological annotation to existing models
Build skeleton models of requested interactions/pathways
RBM Overview
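[Diagram: data sources (UniProtKB, CellML, Pathway Commons, BioModels, ...) are read as XML or OWL (labels shown include BioPAX and MFO). The first stage resolves syntactic heterogeneity. The second stage populates the core ontology with instances and resolves semantic heterogeneity through reasoning and querying. Retrieved data is exported to SBML or other formats if required.]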
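The two-stage flow in the overview can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the talk's implementation uses OWL ontologies and SWRL rules, whereas here `SourceFact`, `CoreOntology`, `RULES`, `mediate`, `export_names`, and all identifiers and property names are hypothetical stand-ins invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceFact:
    """One statement held in a (toy) syntactic layer for a single data source."""
    source: str
    subject: str
    prop: str
    value: str


@dataclass
class CoreOntology:
    """Toy stand-in for the core ontology: a set of (subject, property, value) triples."""
    triples: set = field(default_factory=set)

    def query(self, prop):
        # Return every (subject, value) pair asserted with the given core property.
        return sorted((s, v) for s, p, v in self.triples if p == prop)


# Hypothetical mapping rules: (source, source property) -> core property.
# In RBM this role is played by SWRL rules, not by a Python dict.
RULES = {
    ("uniprot", "recommendedName"): "hasName",
    ("sbml", "speciesName"): "hasName",
    ("biopax", "participant"): "interactsWith",
}


def mediate(facts, rules):
    """Copy only the facts that some rule covers into the core ontology."""
    core = CoreOntology()
    for fact in facts:
        core_prop = rules.get((fact.source, fact.prop))
        if core_prop is not None:
            core.triples.add((fact.subject, core_prop, fact.value))
    return core


def export_names(core):
    """Export step: flatten one core property to a dict (standing in for a non-OWL format)."""
    return dict(core.query("hasName"))


# Made-up input facts; the identifiers are placeholders, not real accessions.
facts = [
    SourceFact("uniprot", "protA", "recommendedName", "Example protein"),
    SourceFact("sbml", "species_1", "speciesName", "Example protein"),
    SourceFact("biopax", "protA", "participant", "protB"),
    SourceFact("sbml", "species_1", "compartment", "nucleus"),  # no rule, so ignored
]
core = mediate(facts, RULES)
names = export_names(core)
```

Only the portions of each source that a rule names reach the core, which mirrors the selective mapping described on the following slides.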
Why have an OWL intermediary where converters exist?
BioPAX-SBML and CellML-SBML converters exist
Lossy (due to different scopes of each format)
Might not get what we need from such conversion
Why have an OWL intermediary where converters exist?
BioPAX-SBML and CellML-SBML converters exist
With SWRL rules we can pull information for exactly those portions of each format we're interested in
Not dependent upon external developers if the meaning or structure of a format changes
Easier to change rules (especially for web applications or novices) and re-run mappings than to rewrite hard-coded Java/Perl etc.
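The point about editing rules rather than code can be illustrated with a small sketch in which the mapping is plain data and the mapping function never changes. It is written in Python with JSON rule files purely for illustration, not in SWRL; `RULES_V1`, `RULES_V2`, `apply_rules`, and the element and class names are all assumptions made up for this example.

```python
import json

# Two versions of a hypothetical rule set, kept as text so that they could be
# edited (for example through a web form) without touching any program code.
RULES_V1 = """
[
  {"source": "cellml", "element": "component", "core_class": "Entity"}
]
"""

RULES_V2 = """
[
  {"source": "cellml", "element": "component", "core_class": "Entity"},
  {"source": "cellml", "element": "variable", "core_class": "Quantity"}
]
"""


def apply_rules(records, rules_json):
    """Map (source, element, identifier) records to core classes using the given rules."""
    rules = {
        (rule["source"], rule["element"]): rule["core_class"]
        for rule in json.loads(rules_json)
    }
    mapped = []
    for source, element, identifier in records:
        core_class = rules.get((source, element))
        if core_class is not None:
            mapped.append((identifier, core_class))
    return mapped


# Made-up records standing in for content parsed from a source format.
records = [
    ("cellml", "component", "membrane"),
    ("cellml", "variable", "voltage"),
]

before = apply_rules(records, RULES_V1)  # only the component is mapped
after = apply_rules(records, RULES_V2)   # re-run after adding one rule
```

Extending the mapping means adding one entry to the rule text and re-running `apply_rules`, with no change to the function itself.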
The core ontology
Ideally, a core ontology should be a tightly-scoped ontology describing the domain of interest
Multiple core ontologies can be created as necessary to address multiple biological questions
We began with an ontology describing the basics of telomere uncapping
Sharing common concepts among core ontologies
To make it easier to swap out core ontologies, use a common ontology from which all can inherit
BioPAX Level 3 (and perhaps the SBPAX3 extension) is being considered for my research
Such an ontology can be selectively enriched with the biological information of interest
Only a small number of domain-specific SWRL rules would be needed with each new core ontology
Visible face of RBM
Saint pulls suggested MIRIAM annotation and possible interactors from web services
Syntactic: integration of data, or direct querying of web services based on query strings built from the SBML/CellML models
Semantic: Saint will also pull information out of RBM-integrated data
Reasoning
Not much reasoning over BioPAX yet, though as a component of my core ontology this will be coming soon
Reasoning over MFO models is quick, which is to be expected given the (deliberate) relative lack of complexity
Reasoning
Reasoning and querying over the core ontology has already discovered new annotations as well as possible identification of unknown species in SBML models
Reasoning tends to be slower than I'd like, although much of it can be done behind the scenes and the results stored for later queries (i.e. with SQWRL)
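The idea of running slow reasoning once and answering later queries from stored results can be sketched as follows. This toy example computes a subclass closure in plain Python and caches it; it does not use an OWL reasoner or SQWRL, and `infer_subclass_closure`, `InferenceStore`, and the small class hierarchy are hypothetical names chosen only for illustration.

```python
def infer_subclass_closure(direct):
    """Compute every superclass implied by the direct subclass assertions."""
    closure = {cls: set(supers) for cls, supers in direct.items()}
    changed = True
    while changed:
        changed = False
        for supers in closure.values():
            # Collect the superclasses of each known superclass.
            inherited = set()
            for parent in supers:
                inherited |= closure.get(parent, set())
            if not inherited <= supers:
                supers |= inherited
                changed = True
    return closure


class InferenceStore:
    """Runs the (potentially slow) inference once and serves queries from the result."""

    def __init__(self, direct):
        self._direct = direct
        self._closure = None
        self.reasoner_runs = 0  # counts how often inference was actually executed

    def materialise(self):
        self._closure = infer_subclass_closure(self._direct)
        self.reasoner_runs += 1

    def superclasses(self, cls):
        if self._closure is None:
            self.materialise()
        return sorted(self._closure.get(cls, set()))


# Toy hierarchy, invented for the example.
store = InferenceStore({
    "Enzyme": {"Protein"},
    "Protein": {"Macromolecule"},
    "Macromolecule": {"PhysicalEntity"},
})

first = store.superclasses("Enzyme")    # triggers inference
second = store.superclasses("Protein")  # answered from the stored closure
```

After both queries `store.reasoner_runs` is still 1, which is the behaviour wanted when reasoning is done behind the scenes ahead of user queries.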
Many interesting projects
Model annotation for Synthetic Biology
Goksel Misirli and others
BioPAX-SBML
SBPAX3 and other work by Oliver Ruebenacker and others
EBI BioPAX-SBML conversion
RBM using both as data sources
Many interesting projects
SBML and OWL
MFO
SBMLHarvester by Robert Hoehndorf and others
CellML and OWL
Related Work from Us
Model annotation for synthetic biology: http://dx.doi.org/10.1093/bioinformatics/btr048
Rule-Based Mediation: http://cisban-silico.cs.ncl.ac.uk/RBM/ http://dx.doi.org/10.1186/2041-1480-1-S1-S3
MFO: http://cisban-silico.cs.ncl.ac.uk/MFO/ doi:10.2390/biecoll-jib-2007-80
Related Work from Us
SyMBA: http://symba.sf.net
Saint: http://saint-annotate.sf.net http://dx.doi.org/10.1093/bioinformatics/btp523
Other Related Work
SBPAX3: http://sourceforge.net/apps/mediawiki/biopax/index.php?title=SBPAX3
SBMLHarvester: http://bioonto.gen.cam.ac.uk/sbmlharvester/
SBML to BioPAX conversion (sbml2biopax): http://www.ebi.ac.uk/compneur-srv/sbml/converters/SBMLtoBioPax.html
CellML and OWL, Wimalaratne et al. doi: 10.1093/bioinformatics/btp391
Thank you!
And thanks also to Phil Lord and Neil Wipat, my PhD supervisors
Biology, Neurosciences & Computing Group at the Computing Science Department, Newcastle University
CISBAN
BBSRC
Contact Me
@allysonlister
http://themindwobbles.wordpress.com
06/29/11