The crusade for Big Data in the AAL domain
Femke Ongenae
Session organizers – Hi!
Femke Ongenae, Knowledge Engineer, IBCN, UGent - iMinds
Femke De Backere, eCare Researcher, IBCN, UGent - iMinds
Griet Verhenneman, Legal Researcher, ICRI, KULeuven - iMinds
Julie Doyle, Research Fellow, CASALA
Ambient-Assisted Living
Trend towards more personalized & context-aware healthcare services
FallRisk
Social-aware and context-aware multi-sensor fall detection platform
Enable the collection of (near) real-life profile and context data
Closing the gap...
Developers
Researchers
Keynote by Dr. Gray
Femke Ongenae
Data Integration in a Big Data Context
Open PHACTS Case Study
Alasdair J G [email protected]
@gray_alasdair
Big Data
@gray_alasdair Big Data Integration
Volume Velocity
Variety Veracity
http://i.kinja-img.com/gawker-media/image/upload/lvzm0afp8kik5dctxiya.jpg
Open PHACTS Use Case
“Let me compare MW, logP and PSA for launched inhibitors of human & mouse oxidoreductases”
• Chemical Properties (Chemspider)
• Launched drugs (Drugbank)
• Human => Mouse (Homologene)
• Protein Families (Enzyme)
• Bioactivity Data (ChEMBL)
• … other info (Uniprot/Entrez etc.)
Open PHACTS Mission: Integrate Multiple Research Biomedical Data Resources Into A Single Open & Free Access Point
Literature, PubChem, Genbank, Patents, Databases, Downloads
Data Integration, Data Analysis, Firewalled Databases
Repeat @ each company
A single, shared solution.
Funded under
• IMI: 2011-14
• ENSO: 2014-16
Pre-competitive Data
http://dx.doi.org/10.1016/j.websem.2014.03.003
• Cloud-Based “Production” Level System.
• Secure & Private
• Guided By Business Questions
• Uses Semantic Web Technology
• Provides REST-ful API
http://dx.doi.org/10.1016/j.drudis.2013.05.008
Discovery Platform
Scientific Results
http://ceur-ws.org/Vol-1114/Demo_Dunlop.pdf
http://dx.doi.org/10.1016/j.drudis.2014.11.006 http://dx.doi.org/10.1002/minf.v31.8
http://dx.doi.org/10.1371/journal.pone.0115460
OPS Discovery Platform
Drug Discovery Platform
Apps
Domain API
Interactive responses
Production quality integration platform
Method Calls
Standard Web Technologies
App Ecosystem
An “App Store”?
Explorer Explorer2 ChemBioNavigator Target Dossier Pharmatrek Helium
MOE Collector Cytophacts Utopia Garfield SciBite
KNIME Mol. Data Sheets PipelinePilot scinav.it Taverna
https://www.openphacts.org/2/sci/apps.html
http://chembionavigator.com
ChemBioNavigator
API Hits
[Chart: number of API hits per month, Jan 2013 to June 2015; y-axis "No of Hits" from 0 to 60,000,000; x-axis "Month"; markers for the public launch of the 1.2 API and for the 1.3, 1.4 and 1.5 APIs]
OPS Discovery Platform
[Architecture diagram; labels only]
• Apps
• Domain Specific Services
• Linked Data API (RDF/XML, TTL, JSON)
• Semantic Workflow Engine
• Identity Resolution Service
• Identifier Management Service
• Chemistry Registration, Normalisation & Q/C
• Indexing
• Core Platform
• Data Cache (Virtuoso Triple Store)
• Data source formats: RDF, Nanopub, Db, VoID
• Content: Public Content, Commercial, Public Ontologies, User Annotations
• Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”
Open PHACTS Data
John Wilbanks consulted for us
A framework built around STANDARD well-understood Creative Commons licences – and how they interoperate
Deal with the problems by:
• Interoperable licences
• Appropriate terms
• Declare expectations to users and data publishers
One size won't fit all requirements
Data Licensing (Or Lack Of!)
API: Complex Interactions
Disease
Tissue
Target
Compound
Pathway
STANDARD_TYPE     UNIT_COUNT
----------------  ----------
AC50                       7
Activity                 421
EC50                      39
IC50                      46
ID50                      42
Ki                        23
Log IC50                   4
Log Ki                     7
Potency                   11
log IC50                   0

STANDARD_TYPE  STANDARD_UNITS  COUNT(*)
-------------  --------------  --------
IC50           nM                829448
IC50           ug.mL-1            41000
IC50                              38521
IC50           ug/ml               2038
IC50           ug ml-1              509
IC50           mg kg-1              295
IC50           molar ratio          178
IC50           ug                   117
IC50           %                    113
IC50           uM well-1             52

~100 units
>5000 types
Implemented using the Quantities, Units, Dimensions and Types ontology (QUDT, http://www.qudt.org/)
Quantitative Data Challenges
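The IC50 counts above list the same unit under several spellings (ug.mL-1, ug/ml, ug ml-1). A minimal sketch of how such spellings might be collapsed before counting follows. The alias table, the canonical labels and the function names are assumptions made for illustration; this is not the QUDT-based implementation the slide refers to.

```python
from collections import Counter

# Hypothetical alias table: spelling variants from the slide's STANDARD_UNITS
# column mapped to one canonical label chosen for this sketch. Keys are
# lower-cased, so the table has to be curated by hand: case is significant
# for units in general (mM is not MM).
UNIT_ALIASES = {
    "ug.ml-1": "ug/mL",
    "ug/ml": "ug/mL",
    "ug ml-1": "ug/mL",
    "nm": "nM",
}


def canonical_unit(raw: str) -> str:
    """Return the canonical label for a raw unit string, or the input if unknown."""
    key = " ".join(raw.split()).lower()  # collapse whitespace, lower-case for lookup
    return UNIT_ALIASES.get(key, raw.strip())


def merge_counts(rows):
    """Sum record counts per canonical unit from (raw_unit, count) pairs."""
    totals = Counter()
    for raw_unit, count in rows:
        totals[canonical_unit(raw_unit)] += count
    return totals


# Counts copied from four of the IC50 rows on the slide.
ic50_rows = [("nM", 829448), ("ug.mL-1", 41000), ("ug/ml", 2038), ("ug ml-1", 509)]
merged = merge_counts(ic50_rows)
```

Unknown spellings pass through unchanged, so a unit missing from the table is never silently merged into another one.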
Quality Assurance
P12047, X31045, P12047, GB:29384, RS_2353
Identity Mapping
Andy Law's Third Law: “The number of unique identifiers assigned to an individual is never less than the number of Institutions involved in the study”
http://bioinformatics.roslin.ac.uk/lawslaws/
Gleevec®: Imatinib Mesylate
Drugbank, ChemSpider, PubChem
Imatinib
Mesylate
Imatinib Mesylate
YLMAHDNUQAMNNX-UHFFFAOYSA-N
Are these records the same?
It depends upon your task!
skos:exactMatch(InChI)
Strict Relaxed
Analysing Browsing
Structure Lens
I need to perform an analysis, give me details of the active compound in Gleevec.
skos:closeMatch(Drug Name)
skos:closeMatch(Drug Name)
skos:exactMatch(InChI)
Strict Relaxed
Analysing Browsing
Name Lens
Which targets are known to interact with Gleevec?
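The Structure Lens and Name Lens slides contrast strict and relaxed notions of "the same record". A minimal sketch of that idea follows: each link between two records carries a predicate and a justification, and a lens is the set of justifications one is willing to accept for the task at hand. The record labels, the choice of which records each link joins, and the function name are all assumptions made for illustration; they are not Open PHACTS data or its API.

```python
# Each link: (source record, target record, predicate, justification).
# The record labels are made up, and which records each link joins is an
# assumption; the slides only show the predicates and justifications.
LINKS = [
    ("chemspider:record", "pubchem:record", "skos:exactMatch", "InChI"),
    ("drugbank:record", "chemspider:record", "skos:closeMatch", "Drug Name"),
    ("drugbank:record", "pubchem:record", "skos:closeMatch", "Drug Name"),
]

# A lens names the justifications it accepts.
LENSES = {
    "structure": {"InChI"},           # strict: suited to analysis
    "name": {"InChI", "Drug Name"},   # relaxed: suited to browsing
}


def equivalents(record, lens):
    """Return every record reachable from `record` over links the lens accepts.

    Links are followed in both directions and transitively. That is a
    simplification: SKOS does not define skos:closeMatch as transitive.
    """
    accepted = LENSES[lens]
    seen = {record}
    frontier = [record]
    while frontier:
        current = frontier.pop()
        for source, target, _predicate, justification in LINKS:
            if justification not in accepted:
                continue
            for a, b in ((source, target), (target, source)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen
```

Under these assumed links, `equivalents("drugbank:record", "structure")` returns only the Drugbank record itself, while the same call with `"name"` also pulls in the ChemSpider and PubChem records. That mirrors the point of the slides: whether records count as the same depends on the task.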
Data Provenance
dev.openphacts.org
Open PHACTS Approach
1. Know your audience
   • Web developers
2. Understand your use cases
   • Prioritised business questions
3. Identify access pathways
   • Identify data
   • Identify connections
   • Implement API
Questions
Alasdair J G [email protected]
@gray_alasdair
Open [email protected]
@open_phacts
Brainstorm session
Femke Ongenae
How do we enable an infrastructure/platform that allows the user-friendly and rapid sharing of Living Lab data?
Brainstorm: 3 Steps
Generation of ideas
Selection of best ideas
Further detailing top ideas
http://www.flandersdc.be/gps
Generating ideas
Table 1: Privacy & Ethics
Table 2: Business Models
Table 3: Data Sharing Infrastructure
Table 4: Quality & Reliability of data
Table 5: Data Usage
Results
Practical arrangements
• Paper indicating table order
• Brainstorm round: +/- 15 minutes
• Moderators
Some tips!
Delay your judgement
Be open to naive and crazy ideas
Openness & enthusiasm
Use associative thinking
Piggyback on ideas of others
Selection of ideas
• Summarize 3 key ideas
• How to select?
  – Keep the goal in mind!
  – Think in opportunities
  – What are you enthusiastic about?
  – Personal engagement
  – What is needed in the short term?
  – Most promising
Selection of ideas
• 5 Votes
• Put your name & e-mail on the sheet if you want to be involved in working out the idea!
THANK YOU FOR YOUR TIME
Contact me @ [email protected]