![Page 1: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/1.jpg)
Bio2RDF: A biological knowledge base for the Semantic Web
Michel Dumontier, François Belleau, Marc-Alexandre Nolin, Peter Ansell
![Page 2: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/2.jpg)
Web search for biological information is hit or miss
![Page 3: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/3.jpg)
Introducing...
something you can look up and search for with rich descriptions
![Page 4: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/4.jpg)
Surface web: 167 terabytes
Deep web: 91,000 terabytes
545-to-one
![Page 5: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/5.jpg)
Bio-Portals provide database access and give better results
![Page 6: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/6.jpg)
We want to simultaneously query the 1000+ biological databases
![Page 7: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/7.jpg)
Data silos – not made for sharing
![Page 8: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/8.jpg)
How do we integrate these resources?
![Page 9: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/9.jpg)
Bio2RDF provides the methodology to create and glue these different networks.
![Page 10: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/10.jpg)
Bio2RDF is building the linked data web for biological data
![Page 11: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/11.jpg)
Contributing to a growing linked data web
![Page 12: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/12.jpg)
What is the semantic web?
The Semantic Web is a web of knowledge.
It is about standard formats for representing and querying knowledge drawn from diverse sources and making statements about real objects.
![Page 13: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/13.jpg)
Goals for the Semantic Web
• Provide a common knowledge representation
  – syntax & semantics
• Facilitate publishing, data integration and information retrieval
• Make possible semantically interoperable web applications and services
• Enable the answering of questions across global repositories of knowledge
![Page 14: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/14.jpg)
Resource Description Framework (RDF)
• Allows one to express propositions, and reason about them
• Uniform Resource Identifiers (URIs) are entity names
  – e.g. http://purl.uniprot.org/uniprot/Q16665
• An RDF statement consists of:
  – Subject: resource identified by a URI
  – Predicate: resource identified by a URI
  – Object: resource or literal
Example: u:Q16665 rdf:type Protein
![Page 15: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/15.jpg)
Semantic Knowledge Base
Fact: Q16665 rdf:type Protein
Ontology: Protein rdfs:subClassOf Molecule
Knowledge base (fact + ontology): Q16665 rdf:type Molecule
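The step from fact plus ontology to a knowledge base can be sketched in a few lines of Python. This is only an illustration under simple assumptions: triples are plain tuples, and the single rule applied is the RDFS subclass rule. The function name `infer_types` is hypothetical and not part of Bio2RDF.

```python
# Minimal sketch: RDFS-style subclass inference over triples stored as tuples.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# The fact and the ontology statement shown on the slide.
facts = {("u:Q16665", RDF_TYPE, "Protein")}
ontology = {("Protein", RDFS_SUBCLASS_OF, "Molecule")}


def infer_types(triples):
    """Close the triples under: (x type C) and (C subClassOf D) => (x type D)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current statements so the set can grow while we loop.
        subclass_pairs = [(s, o) for s, p, o in closed if p == RDFS_SUBCLASS_OF]
        typed = [(s, o) for s, p, o in closed if p == RDF_TYPE]
        for instance, cls in typed:
            for sub, sup in subclass_pairs:
                inferred = (instance, RDF_TYPE, sup)
                if cls == sub and inferred not in closed:
                    closed.add(inferred)
                    changed = True
    return closed


knowledge_base = infer_types(facts | ontology)
```

Here `knowledge_base` holds the original fact, the ontology statement, and the inferred statement that Q16665 is a Molecule. A real triple store would apply a much richer rule set.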
![Page 16: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/16.jpg)
RDF/XML
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:u="http://purl.uniprot.org/uniprot/">
  <rdf:Description rdf:about="http://purl.uniprot.org/uniprot/Q16665">
    <rdf:type rdf:resource="http://purl.uniprot.org/uniprot/Protein"/>
  </rdf:Description>
</rdf:RDF>
N3
@prefix u: <http://purl.uniprot.org/uniprot/> .
u:Q16665 a u:Protein .
![Page 17: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/17.jpg)
Syntactic Data Integration
UniProt: u:Q16665 has name HIF1-alpha
+ Gene Ontology: u:Q16665 located in go:nucleus
+ BIND: u:Q16665 interacts with u:vhl
Unified view: u:Q16665 has name HIF1-alpha; located in go:nucleus; interacts with u:vhl
Depends on consistent naming
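As a rough illustration of why syntactic integration depends on consistent naming, the sketch below merges triples from three sources by plain set union. The data structures and the function name `unified_view` are assumptions made for this example, not Bio2RDF code.

```python
# Minimal sketch: syntactic integration as a union of triples that share URIs.
uniprot = {("u:Q16665", "has name", "HIF1-alpha")}
gene_ontology = {("u:Q16665", "located in", "go:nucleus")}
bind = {("u:Q16665", "interacts with", "u:vhl")}


def unified_view(*sources):
    """Merge triple sets and group predicate/object pairs by subject."""
    merged = set().union(*sources)
    view = {}
    for subject, predicate, obj in sorted(merged):
        view.setdefault(subject, {}).setdefault(predicate, []).append(obj)
    return view


# All three sources name the protein identically, so the statements line up.
view = unified_view(uniprot, gene_ontology, bind)

# A source using a different name for the same protein yields a second,
# unconnected subject: the merge silently fails to integrate.
mismatched = unified_view(
    uniprot, {("uniprot:Q16665", "located in", "go:nucleus")}
)
```

In `view` there is a single subject carrying all three statements, while `mismatched` contains two separate subjects for what is really one protein.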
![Page 18: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/18.jpg)
Semantic Data Integration
u:Q16665 rdf:type Protein
u:vhl
Depends on accurate typing
![Page 20: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/20.jpg)
![Page 21: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/21.jpg)
Bio2RDF Design Principles
http://bio2rdf.wiki.sourceforge.net/Banff+Manifesto
![Page 22: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/22.jpg)
Over 1800 namespaces
Compiled From: NAR, BioMoby, UniProt, NCBI, SRS
![Page 23: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/23.jpg)
Naming Convention
http://bio2rdf.org/namespace:identifier
http://bio2rdf.org/pdb:1AM0
http://bio2rdf.org/gi:99
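The naming convention can be sketched as a pair of small helpers, one that builds a URI from a namespace and identifier and one that splits it again. The function names are hypothetical, and the validation rules are assumptions for illustration only.

```python
# Minimal sketch of the http://bio2rdf.org/namespace:identifier pattern.
BASE = "http://bio2rdf.org/"


def bio2rdf_uri(namespace, identifier):
    """Build a URI of the form http://bio2rdf.org/namespace:identifier."""
    if not namespace or not identifier:
        raise ValueError("namespace and identifier must be non-empty")
    return f"{BASE}{namespace}:{identifier}"


def split_bio2rdf_uri(uri):
    """Return (namespace, identifier) for a URI that follows the pattern."""
    if not uri.startswith(BASE):
        raise ValueError(f"not a Bio2RDF URI: {uri}")
    # Split on the first colon only, so identifiers may contain colons.
    namespace, sep, identifier = uri[len(BASE):].partition(":")
    if not sep or not namespace or not identifier:
        raise ValueError(f"expected namespace:identifier in {uri}")
    return namespace, identifier


pdb_uri = bio2rdf_uri("pdb", "1AM0")
gi_parts = split_bio2rdf_uri("http://bio2rdf.org/gi:99")
```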
![Page 24: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/24.jpg)
![Page 25: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/25.jpg)
Bio2RDF network = 2.3 billion triples (BT)
![Page 26: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/26.jpg)
| Namespace | Domain | Updated | Triples | Topics | Namespaces | SPARQL |
|---|---|---|---|---|---|---|
| Affymetrix | Probeset | loading | 45560115 | 1708777 | 20 | affymetrix |
| BIND | Network information | 09/04/1930 | | | | bind |
| BioCYC | Pathway/BioPAX | | 4418699622 + xref | | | biocyc |
| ChEBI@EBI | Chemistry | 09/03/2025 | 4764030 | 50377 | 25 | chebi |
| CPD@KEGG | Chemistry | 09/04/2014 | 177199 | 14071 | 10 | kegg |
| cPath | Pathway/BioPAX | 09/04/2007 | 28052098 | | 51 | cpath |
| DBpedia | Encyclopedia | 09/03/2023 | 190790 | 0 | 21 | dbpedia |
| DR@KEGG | Drug | 09/04/2014 | 116822 | 8117 | 8 | dr |
| EC@KEGG | Enzyme | 09/04/2014 | 556888 | 4245 | 4 | ec |
| EC@UniProt | Enzyme | 09/04/2014 | 36109 | | | enzyme |
| GeneID@NCBI | Gene | loading | 1.73E+08 | | 86 | geneid |
| GL@KEGG | Chemistry | 09/04/2014 | 94148 | 10965 | 2 | kegg |
| GO | Ontology | 09/03/2015 | 8188649 | 804979 | 144 | go |
| HGNC | Genome | 09/03/2025 | 1085662 | 125256 | 14 | hgnc |
| HomoloGene@NCBI | Homolog | 09/03/1931 | 6598206 | | 7 | homologene |
| IProClass@PIR | Protein | loading | 1.92E+08 | | 19 | iproclass |
| MGI | Genome | 09/03/2025 | 3089976 | | 12 | mgi |
| OBO | Ontology | 09/03/2027 | 4507016 | 4954332 | 165 | obo |
| OMIM@NCBI | Disease | 09/03/2024 | 1048053 | 32102 | 7 | omim |
| Path@KEGG | Pathway | 09/03/2028 | 50793314 | | | kegg |
| PDB | Protein | 09/03/2021 | 1215254 | 44569 | 2 | pdb |
| Pubmed@UniProt | Article | 09/03/1931 | | | | pubmed |
| Pubmed@NCBI | Article | 09/03/1931 | | | | pubmed |
| Reactome | Pathway/BioPAX | 09/04/2015 | 57527092 | | 22 | reactome |
| RN@KEGG | Pathway | 09/04/2015 | 110971 | 7755 | 5 | kegg |
| SGD | Genome | 09/04/2015 | 1437648 | | 13 | sgd |
| Taxonomy@UniProt | Taxon | 09/04/2014 | 3230933 | | | taxonomy |
| UniParc@UniProt | Sequence | 09/04/2009 | 5.59E+08 | | 53 | uniparc |
| UniPathway@UniProt | Pathway | 09/04/2014 | 8508 | | | unipathway |
| UniProtKB@UniProt | Protein | 09/04/2016 | 4.56E+08 | | 135 | uniprot |
| UniRef@UniProt | Homolog | 09/04/2008 | 3.9E+08 | | 5 | uniref |
| UniSTS@NCBI | Marker | 09/03/1931 | 7542235 | | 7 | unists |
![Page 27: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/27.jpg)
Mouse and Human Atlas (65 MT)
![Page 28: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/28.jpg)
Free, Open Source software
![Page 29: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/29.jpg)
Bio2RDF Software
• http://sourceforge.net/projects/bio2rdf/
• Virtuoso Triple Store gives SPARQL endpoint
• Bio2RDF software transforms URIs to SPARQL queries directed to one or more endpoints
• RDFizers – transform legacy data into RDF
  – OMIM, KEGG
• SW DBs – rules to create Bio2RDF URIs
  – DBpedia, BioPAX
![Page 30: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/30.jpg)
SPARQL Endpoints
http://ns.bio2rdf.org/sparql
http://atlas.bio2rdf.org/sparql
![Page 31: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/31.jpg)
Services
• Describe a resource
  – http://bio2rdf.org/ns:id
• Global services over federated endpoints
  – http://bio2rdf.org/links/ns:id
  – http://bio2rdf.org/search/term
• Targeted services to a specific endpoint
  – http://bio2rdf.org/linksns/ns/ns2:id
  – http://bio2rdf.org/searchns/ns/term
![Page 32: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/32.jpg)
Describe service
http://bio2rdf.org/ns:id
Corresponding SPARQL query:
CONSTRUCT {
  ?s ?p ?o .
}
WHERE {
  ?s ?p ?o .
  FILTER(?s = <http://bio2rdf.org/ns:id>) .
}
Sent to http://ns.bio2rdf.org/sparql?query=... (DNS subdomain resolution service)
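A hedged sketch of how a describe request might be turned into an endpoint URL is shown below. It assumes that `ns` in the endpoint host name stands for the resource's namespace, which is how the slide's "DNS subdomain resolution" is read here. The function names are hypothetical, and no network request is made.

```python
# Minimal sketch: build the describe query and the endpoint URL that carries it.
from urllib.parse import urlencode


def describe_query(namespace, identifier):
    """Return the CONSTRUCT query for http://bio2rdf.org/namespace:identifier."""
    uri = f"http://bio2rdf.org/{namespace}:{identifier}"
    return (
        "CONSTRUCT { ?s ?p ?o . }\n"
        "WHERE {\n"
        "  ?s ?p ?o .\n"
        f"  FILTER(?s = <{uri}>) .\n"
        "}"
    )


def describe_request_url(namespace, identifier):
    """Return the endpoint URL with the query percent-encoded.

    Assumption: the endpoint lives at http://<namespace>.bio2rdf.org/sparql.
    """
    endpoint = f"http://{namespace}.bio2rdf.org/sparql"
    query = describe_query(namespace, identifier)
    return endpoint + "?" + urlencode({"query": query})


url = describe_request_url("pdb", "1AM0")
```

Percent-encoding the query with `urlencode` keeps braces, angle brackets, and newlines from breaking the URL.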
![Page 34: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/34.jpg)
Virtuoso 6.0 Facet Browsing
http://lod.openlinksw.com/
![Page 35: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/35.jpg)
Multiple Ways To Represent Knowledge
Fig. 2. Three ways to model the relationship between a protein and the volume it occupies.
![Page 36: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/36.jpg)
Fig. 1. From linked data to linked knowledge through syntactic and semantic normalization.
![Page 37: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/37.jpg)
Ontology as Strategy
![Page 38: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/38.jpg)
OWL Has Explicit Semantics
Can therefore be used to capture knowledge in a machine-understandable way
![Page 39: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/39.jpg)
A generalized Biological Data Model
![Page 40: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/40.jpg)
Semantic normalization will improve facet browsing and question answering
![Page 41: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/41.jpg)
You want to join the knowledge web
![Page 42: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/42.jpg)
Share your data
![Page 43: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/43.jpg)
Bridge your data with others in semantic communities (data networks).
![Page 44: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/44.jpg)
![Page 45: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/45.jpg)
Time-sensitive or frequently updated data is one way to encourage more visits.
![Page 46: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/46.jpg)
Bioinformatics Discovery Registry
• Part of SharedName initiative to provide stable URI patterns for data records.
• We add the relationship between entities and records
Discovery Service
• Registry links entities to data records, their formats (RDF/XML, HTML, etc.) and provider (Bio2RDF, UniProt)
  http://registry.semanticscience.org/ns:id
Redirection Service
• Automatic redirection to data provider document
  http://registry.semanticscience.org/doc/provider/format/ns:id
![Page 47: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/47.jpg)
![Page 48: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/48.jpg)
Build a knowledge base from a series of questions
![Page 49: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/49.jpg)
Carole Goble (ISWC 2005)
Web-based Knowledge Discovery a very painful process
![Page 50: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/50.jpg)
The Knowledge Web
• Merging data & services
• Reasoning & question answering
• Persistent (RESTful)
• Trust & Security
Data consumers must be able to rely upon your data to use it as a foundation for their own applications.
![Page 51: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/51.jpg)
2009 Goals
• Add more data!
  – Standardize RDFizers
  – Enrichment from small producer data!
• Design more RESTful services (Workflow)
• Start using Virtuoso 6 cluster
• Add mirrors
• Approval from data providers to distribute RDF dump and publish SPARQL endpoints
  – Confirmed: UniProt, BioCyc, Pathway Commons, BIND
![Page 52: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/52.jpg)
Triplified Data and Virtuoso DB
http://quebec.bio2rdf.org/download
![Page 54: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/54.jpg)
BIO2RDF Materials
![Page 55: Bio2RDF : A biological knowledge base for the Semantic Web](https://reader035.vdocument.in/reader035/viewer/2022062312/554e8e61b4c90526358b4c7c/html5/thumbnails/55.jpg)
Thanks To:
• The Bio2RDF community
• Dumontier Lab
  – Alex De Leon, Jose Cruz, Natalia Villanueva-Rosales
• Quebec Researchers
  – François Belleau, Marc-Alexandre Nolin
• Australian Researchers
  – Peter Ansell
• OpenLink Virtuoso Team