Ontologies Come of Age with the iKUP browser Simon Jupp Bio-health informatics group University of Manchester www.kupkb.org


Post on 19-Jul-2020


Page 1: Ontologies Come of Age with the iKUP browser...• KUPK Ontology ~1800 classes. ~40,000 after imports closure Architecture • Sesame and BigOWLIM for the RDF store • Web site developed

Ontologies Come of Age with the iKUP browser

Simon Jupp

Bio-health informatics group

University of Manchester

www.kupkb.org

Page 2

The problem domain

Kidney and Urinary Pathways

Kidney

Ureter

Bladder

Multiple diseases

Dialysis and transplantation

Need to understand how they work for prevention

Need to learn new ways to detect them

Page 3

The problem domain

Hundreds of studies have been conducted by the kidney research community

On different species: human, mouse

On different materials: urine, tissue

On different biological levels: gene, protein, cell

Page 4

Where does the data go?

Research papers

Bespoke kidney laboratory databases

Generalist databases

Scattered, hidden in figures, coming in different formats

Most of the data is lost!

Page 5

What has been observed, where and when?

e.g. Experiment X showed gene TGFB1 over-expressed in location Kidney under condition model of diabetic nephropathy

Experimental factors

Capturing what is known in the form of a nano-publication

Disease ontology

Animal model

Cell type ontology

Mouse anatomy ontology

Ontologies provide the schema
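
As a rough illustration of the nano-publication idea on this slide, the sketch below stores the TGFB1 example as one small record and flattens it into subject/predicate/object triples. The class name, field names, predicate names and the `obs:` identifier scheme are assumptions made for this example; they are not the KUPKB schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One 'what was observed, where and when' statement (hypothetical shape)."""
    experiment: str   # experiment identifier, e.g. "Experiment X"
    gene: str         # gene symbol, e.g. "TGFB1"
    expression: str   # e.g. "over-expressed"
    location: str     # term from an anatomy or cell type ontology
    condition: str    # term from a disease or animal model ontology

    def triples(self):
        """Flatten the record into (subject, predicate, object) triples."""
        subject = f"obs:{self.experiment}/{self.gene}"  # assumed naming scheme
        return [
            (subject, "producedBy", self.experiment),
            (subject, "gene", self.gene),
            (subject, "expression", self.expression),
            (subject, "location", self.location),
            (subject, "condition", self.condition),
        ]


# The example sentence from the slide, expressed as a record.
obs = Observation(
    experiment="Experiment X",
    gene="TGFB1",
    expression="over-expressed",
    location="Kidney",
    condition="model of diabetic nephropathy",
)

for triple in obs.triples():
    print(triple)
```

In a real knowledge base the string values would be ontology term identifiers rather than free text, which is what lets the ontologies act as the schema.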

Page 6

Filling in the gaps

We needed to connect these reference ontologies.

By connecting we build our own application ontology.

Figure: reference ontologies linked into the application ontology

Panels: Anatomy (MAO), Cells (CTO), Gene, Biological processes (GO)

Edges (shown as assertion or inference): subClassOf, part-of, participates-in

Anatomy terms: Kidney cortex, Renal tubule, Renal proximal tubule, Proximal straight tubule, Proximal convoluted tubule

Cell terms: Proximal tubule epithelial cell, Proximal straight tubule epithelial cell, Proximal convoluted tubule epithelial cell

Process terms: Renal sodium absorption, Renal sodium ion absorption
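
The assertion versus inference distinction in this figure can be sketched in a few lines: a cell type is asserted to be part of one anatomy term, and the more general locations follow from the subClassOf hierarchy. The specific edges below are assumptions loosely based on the figure labels, not a copy of the KUP ontology, and the hand-written traversal stands in for what an OWL reasoner would do.

```python
# Asserted subClassOf edges between anatomy terms (illustrative assumptions).
SUBCLASS_OF = {
    "proximal straight tubule": "renal proximal tubule",
    "proximal convoluted tubule": "renal proximal tubule",
    "renal proximal tubule": "renal tubule",
}

# Asserted part-of edges from cell types to anatomy terms (illustrative assumptions).
PART_OF = {
    "proximal straight tubule epithelial cell": "proximal straight tubule",
    "proximal convoluted tubule epithelial cell": "proximal convoluted tubule",
    "proximal tubule epithelial cell": "renal proximal tubule",
}


def ancestors(term, parents):
    """Follow single-parent edges upward and return every term reached."""
    reached = []
    while term in parents:
        term = parents[term]
        reached.append(term)
    return reached


def inferred_locations(cell):
    """Anatomy terms the cell is part of: the asserted one plus its superclasses."""
    asserted = PART_OF[cell]
    return [asserted] + ancestors(asserted, SUBCLASS_OF)


print(inferred_locations("proximal straight tubule epithelial cell"))
```

Only the first location in the result is asserted; the rest are inferred, which is why a search for a general anatomy term can also return data annotated with a more specific cell type.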

Page 7

Separation of concerns

Populous

Knowledge

All Eukaryotic Cells are either nucleated or anucleate; some cells are multinucleate

Ontologically

‘Eukaryotic Cells’ has_nucleation some ‘Nucleation’

‘Nucleation’ subClassOf {mononucleate, binucleate, polynucleate, anucleate}

Differentia

‘Eukaryotic Cells’        ‘Nucleation’
Mononuclear phagocyte     mononucleate
Flight muscle cell        multinucleate
Red blood cell            anucleate

Real Examples
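
The separation of concerns on this slide (domain experts fill in rows, the ontological pattern is applied for them) might look roughly like the sketch below. It turns the three example rows into Manchester-syntax-style text using the `has_nucleation some ...` pattern. The function name and the exact output layout are assumptions for illustration; this is not the output format of Populous.

```python
# Rows as a domain expert might enter them: (cell type, nucleation).
ROWS = [
    ("Mononuclear phagocyte", "mononucleate"),
    ("Flight muscle cell", "multinucleate"),
    ("Red blood cell", "anucleate"),
]


def row_to_axiom(cell, nucleation):
    """Apply the fixed pattern to one row and return Manchester-style text."""
    return (
        f"Class: '{cell}'\n"
        f"    SubClassOf: 'Eukaryotic Cells',\n"
        f"        has_nucleation some '{nucleation}'"
    )


axioms = [row_to_axiom(cell, nucleation) for cell, nucleation in ROWS]
print("\n\n".join(axioms))
```

The pattern lives in one place, so the person filling in the table never needs to see or edit the axiom itself.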

Page 8

Ontologies by stealth

Populous generates simple Excel-based templates

The domain experts are the experts, so get them to build it

Anatomy (MAO)

Biological processes (GO)

Cells (CTO)

http://www.populous.org.uk

Page 9

Ontologies by stealth

Convert from Excel to OWL, classify, then validate in Protégé

Creation of a specialized Kidney and Urinary Pathway Ontology (KUPO)

Page 10

Describing/Collecting experimental data

Gathering good metadata AND data, again by stealth, using RightField

Page 11

Mashing it all together

Kidney and Urinary Pathway Ontology: ~1800 classes (~40,000 after imports closure)

Experimental data: 195 KUP experiments/databases integrated

Bio2RDF Linked data

OWL reasoning

KUP Knowledge Base

RDF triple store: Sesame + OWLIM, ~50M triples

Excel 2 RDF/OWL
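
To give a feel for what asking a question of a triple store means, the sketch below runs a simple pattern match over a handful of in-memory triples. It is a toy stand-in written for this transcript, not the Sesame or OWLIM API; the predicate names and both records are assumptions, with the second record being a pure placeholder.

```python
# Toy triples. "obs1" echoes the TGFB1 example; "obs2" is a placeholder row.
TRIPLES = [
    ("obs1", "gene", "TGFB1"),
    ("obs1", "expression", "over-expressed"),
    ("obs1", "location", "Kidney"),
    ("obs1", "condition", "model of diabetic nephropathy"),
    ("obs2", "gene", "PLACEHOLDER_GENE"),
    ("obs2", "location", "Kidney"),
]


def match(triples, pattern):
    """Return triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


def observations_with(triples, **facts):
    """Subjects that carry every requested predicate=value pair."""
    subjects = None
    for predicate, value in facts.items():
        found = {s for s, _, _ in match(triples, (None, predicate, value))}
        subjects = found if subjects is None else subjects & found
    return sorted(subjects or [])


print(observations_with(TRIPLES, gene="TGFB1", location="Kidney"))
```

A production store would answer the same kind of question with a SPARQL query, and with reasoning switched on it could also match more specific anatomy or disease terms than the ones named in the query.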

Page 12

The iKUP browser

An open-source, collaborative and easy-to-use interface

Page 13

The iKUP browser

Page 14

Anatomy search

Page 15

Disease search

Page 16

Doing some biology

1. A biological question

Accepted for publication in the FASEB J!

Can calreticulin be associated with the development of human kidney disease?

2. No answer with classical tools

Searches in PubMed and Google do not return any relevant results!

3. Querying the KUPKB

4. Validation in the wet-lab

KUPKB in silico result confirmed.

5. Publish an innovative result

Page 17

Summary

The KUPKB RDF store is a mashup of biological knowledge relating to the KUP domain

Ontologies provide the schema and a consistent data annotation mechanism

We expose this knowledge base through a simple web interface that real biologists can use

It is a testament to the tools and APIs that such applications are now being delivered at relatively low cost

Page 18

Thank you for listening…

www.kupkb.org

Page 19

Some rough stats…

• 195 KUP experiments integrated
• KUPKB RDF store ~35M triples
• KUPK Ontology ~1800 classes. ~40,000 after imports closure

Architecture

• Sesame and BigOWLIM for the RDF store
• Web site developed with Google Web Toolkit
• OWL API and HermiT reasoner for classification and faceted browsing