SPARQL: UniProtKB/Swiss-Prot why do it?
Jerven Bolleman Developer Swiss-Prot Group
What is UniProtKB/Swiss-Prot
• Central database in the Life Sciences
– Proteins -> you are made out of them
– Summarises current scientific knowledge
– Links 150+ databases together
• Swiss-Prot & Vital-IT group activities are funded by the Swiss Confederation through the SERI (State Secretariat for Education, Research and Innovation)
Our Goals
• Provide core Bioinformatics resources
– UniProtKB/
–
– …
• Provide services and infrastructure
– Vital-IT: HPC for the life sciences
– …
Why provide a public SPARQL endpoint
• A 10-person wet laboratory cannot afford:
– to host its own in-house database holding all, or even a fraction, of all life science data.
– not to have access to, and make use of, existing life science information.
Not CPU time... but brain time: the right kind of optimisation
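To make concrete what a public endpoint offers a small lab, here is a minimal sketch of preparing a query with nothing but the Python standard library. The endpoint URL and the example query are assumptions, not taken from the slides. The request follows the standard SPARQL 1.1 Protocol (a `query` parameter on an HTTP GET) and is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint URL (not stated in the slides).
ENDPOINT = "https://sparql.uniprot.org/sparql"

# Illustrative query: any ten triples. Real queries would use the data's vocabulary.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and decoding the JSON response. No local database, parser or schema is involved.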
Why provide a public SPARQL endpoint
• Classical SQL can be provided on the web, but it
– Is not practical
– Has no federation
– Has poor standards conformance
• Local SQL is expensive
• Local JSON is no better
• Nor is local XML
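By contrast, SPARQL 1.1 has federation built into the language through the SERVICE keyword. The sketch below only assembles the text of such a query. The remote endpoint URL, the `ex:` vocabulary and the helper name `federated_query` are hypothetical placeholders, not real UniProt terms.

```python
def federated_query(remote_endpoint: str, local_pattern: str, remote_pattern: str) -> str:
    """Return a SPARQL query joining a local pattern with one evaluated remotely."""
    return (
        "PREFIX ex: <http://example.org/vocab#>\n"  # hypothetical vocabulary
        "SELECT * WHERE {\n"
        f"  {local_pattern}\n"
        # SERVICE delegates the enclosed pattern to another SPARQL endpoint.
        f"  SERVICE <{remote_endpoint}> {{\n"
        f"    {remote_pattern}\n"
        "  }\n"
        "}"
    )


query = federated_query(
    "https://example.org/sparql",             # hypothetical second endpoint
    "?protein ex:participatesIn ?pathway .",  # answered by the endpoint receiving the query
    "?pathway ex:label ?name .",              # answered by the remote endpoint
)
print(query)
```

The two patterns share the variable `?pathway`, so the results from both endpoints are joined on it without either side exporting its data.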
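Data Integration Traditional
[Diagram: Pathway.txt and UniProt.txt each pass through their own parser (Pathway Parser, UniProt Parser) into their own schema (Pathway Schema, UniProt Schema); together with Own Lab Data these feed a Data warehouse answered with SQL queries. Six "$" cost markers are placed along the way.]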
Data Integration RDF/SPARQL
[Diagram: Pathway.rdf, UniProt.rdf and Own Lab Data load directly into a Triple Store answered with SPARQL Queries. Cost markers: "$" and "$?".]
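The reason the RDF route needs no per-source parser or schema is that every source is already a set of subject-predicate-object triples, so integration is a union. The following is a toy in-memory stand-in for a triple store, not a real one. All identifiers (`ex:proteinA`, `ex:sample42`, the predicates) and the helper names are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical stand-ins for triples read from Pathway.rdf, UniProt.rdf and lab data.
pathway_triples: Set[Triple] = {("ex:proteinA", "ex:participatesIn", "ex:glycolysis")}
uniprot_triples: Set[Triple] = {("ex:proteinA", "ex:name", "Example kinase")}
lab_triples: Set[Triple] = {("ex:sample42", "ex:measured", "ex:proteinA")}

# Integration is a set union: no per-source schema is needed.
store: Set[Triple] = pathway_triples | uniprot_triples | lab_triples


def match(triples: Iterable[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching a pattern; None is a wildcard, like a SPARQL variable."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


def pathways_for_sample(triples: Set[Triple], sample: str) -> Set[str]:
    """Join lab data with pathway data through the shared protein identifier."""
    return {
        pathway
        for _, _, protein in match(triples, s=sample, p="ex:measured")
        for _, _, pathway in match(triples, s=protein, p="ex:participatesIn")
    }


print(pathways_for_sample(store, "ex:sample42"))
```

In a real setup the three files would be loaded into a SPARQL-capable store and the join written as a query. The point of the sketch is only that shared identifiers do the integration work.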
Why provide a public SPARQL endpoint
• Document-centric REST is not enough
– Swiss-Prot available as REST (over e-mail!!) since 1986
– expasy.ch since 1993
– www.uniprot.org since 2002
• Most users use a GUI, not a CLI
• Developers build a GUI on a CLI
© 2015 SIB
[Chart: queries per month, 2015-01 to 2015-08; y-axis ticks 10, 100, 10'000, 1'000'000; series: queries, ask, select, construct, describe]
Queries per month in 2015; peak: 4 million per month
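The series names correspond to the four SPARQL query forms (ASK, SELECT, CONSTRUCT, DESCRIBE). As a sketch of how such a breakdown could be computed from a query log, and not necessarily how these figures were produced, the function below skips the BASE/PREFIX prologue and reads the first keyword. The log entries are made up, and the approach is naive: it does not handle `#` comments before the keyword.

```python
import re
from collections import Counter

# Prologue declarations (BASE / PREFIX) that may precede the query form keyword.
_PROLOGUE = re.compile(r"\s*(?:BASE\s*<[^>]*>|PREFIX\s+[^\s:]*:\s*<[^>]*>)", re.IGNORECASE)
_FORM = re.compile(r"\s*(SELECT|ASK|CONSTRUCT|DESCRIBE)\b", re.IGNORECASE)


def query_form(query: str) -> str:
    """Return the query form in lower case, or 'unknown' if none is found."""
    pos = 0
    # Skip every prologue declaration at the start of the query.
    while (m := _PROLOGUE.match(query, pos)):
        pos = m.end()
    m = _FORM.match(query, pos)
    return m.group(1).lower() if m else "unknown"


# Made-up log entries, for illustration only.
log = [
    "PREFIX ex: <http://example.org/> SELECT ?s WHERE { ?s ?p ?o }",
    "ASK { ?s ?p ?o }",
    "describe <http://example.org/x>",
    "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }",
    "select * where { ?s ?p ?o }",
]
counts = Counter(query_form(q) for q in log)
print(counts)
```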
Real users
A mix of heavy analytics and highly specific queries
Estimated at somewhere between 300 and 1000 real humans per month
We know they are real because they take holidays ;)
Using the Semantic Web for faster (Bio-) Research http://edu.isb-sib.ch/course/view.php?id=212