connecting the dots: drug information and linked data


Upload: tomasz-adamusiak

Post on 05-Jul-2015


DESCRIPTION

Presented as part of the AMIA2014 Knowledge Representation + Semantics and Clinical Information Systems Working Groups Pre-Symposium "Drug Terminology Standards: Meaningful Use and Better Knowledge" November 16, 2014 Washington, DC

TRANSCRIPT

Page 1: Connecting the dots: drug information and Linked Data

Connecting the dots: drug information and Linked Data

Tomasz Adamusiak MD PhD

7omasz

Page 2

Conflict of interest disclosure

• Tomasz Adamusiak is a Senior Data Scientist at Thomson Reuters, provider of intelligent information for pharma and research institutions

Page 3

Tomasz Adamusiak MD PhD

• Former NLM Fellow and bioinformatician at EBI

Page 4

Learning Objectives

• Describe Linked Data and semantic content integration technologies

• Recognize the value of integrating drug information with public resources

Page 5

AS OF 2012, ABOUT 2.5 EXABYTES OF DATA CREATED EACH DAY

Page 6

2.5 exabytes ≈ 7 000 Libraries of Congress

By Carol M. Highsmith (Own work) [CC-BY-SA-3.0]

Page 7

2.5 exabytes ≈ 7 000 Libraries of Congress

Page 8

Tim Berners-Lee: the next Web of open, linked data

If you want to put something on the web there are three rules:

1. All kinds of conceptual things, they have names now that start with HTTP.

2. If I take one of these HTTP names and I look it up [...] I fetch the data using the HTTP protocol from the web, I will get back some data in a standard format.

3. It's got relationships [...] the other thing that it's related to is given one of those names that starts HTTP. So, I can go ahead and look that thing up.

Sir Tim Berners-Lee on the next Web (TED2009)

Page 9

The 5 stars of open linked data

★ Putting anything up there

★★ Machine readable format

★★★ Non-proprietary format

★★★★ Use URLs to identify things

★★★★★ Provide context by linking to others

Gov 2.0 Expo 2010: Tim Berners-Lee, "Open, Linked Data for a Global Community" https://www.youtube.com/watch?v=ga1aSJXCFe0#t=328

Page 10

RDF triple is the core concept underpinning the semantic web

subject: <http://www.example.com/index.html>
predicate: <http://purl.org/dc/elements/1.1/creator>
object: "John Smith"

example:index.html --dc:creator--> John Smith
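The triple on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a full RDF implementation: the `Triple` type and the `expand` and `to_ntriples` helpers are hypothetical names introduced here, and the sketch only handles plain string literals as objects.

```python
from typing import NamedTuple

# Prefix map for the two prefixed names on the slide.
# "example" is the slide's placeholder namespace, not a real vocabulary.
PREFIXES = {
    "example": "http://www.example.com/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


class Triple(NamedTuple):
    subject: str    # full URI
    predicate: str  # full URI
    obj: str        # plain literal (URI objects are not handled in this sketch)


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dc:creator' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriples(t: Triple) -> str:
    """Serialize one triple with a literal object as an N-Triples line."""
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'


triple = Triple(expand("example:index.html"), expand("dc:creator"), "John Smith")
line = to_ntriples(triple)
print(line)
```

The short form (example:index.html, dc:creator) and the long form (full HTTP URIs) name the same things; the prefix map is only a notational convenience.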

Page 11

Several data sources available

Page 12

Caveat 1: missing central URI reconciliation

• Responsibility for URIs:
http://bio2rdf.org/mesh:68009154
http://bio2rdf.org/pubmed:11992264
http://bio2rdf.org/go:0016458
http://purl.org/obo/owl/GO#GO_0016458

• Versioning:
http://sig.uw.edu/fma#Anatomical_entity (FMA 3.1)
http://sig.biostr.washington.edu/fma3.0#Anatomical_entity (FMA 3.0)
http://purl.obolibrary.org/obo/GO_0016458 (Foundry-compliant URI)

• Requires institutional support

• RxNorm in RDF?
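Without a central authority, each consumer ends up reconciling URIs locally. The sketch below shows one way this might look for the three GO URIs listed on this slide. The patterns are derived only from those three examples, and the function name `canonical_go_id` is a hypothetical helper, not part of any of the listed services.

```python
import re
from typing import Optional

# One pattern per URI scheme that appears on the slide for the same GO term.
# Each captures the seven-digit GO accession.
GO_URI_PATTERNS = [
    re.compile(r"^http://bio2rdf\.org/go:(\d{7})$"),
    re.compile(r"^http://purl\.org/obo/owl/GO#GO_(\d{7})$"),
    re.compile(r"^http://purl\.obolibrary\.org/obo/GO_(\d{7})$"),
]


def canonical_go_id(uri: str) -> Optional[str]:
    """Map a known GO URI variant to a 'GO:nnnnnnn' identifier, else None."""
    for pattern in GO_URI_PATTERNS:
        match = pattern.match(uri)
        if match:
            return f"GO:{match.group(1)}"
    return None


uris = [
    "http://bio2rdf.org/go:0016458",
    "http://purl.org/obo/owl/GO#GO_0016458",
    "http://purl.obolibrary.org/obo/GO_0016458",
]
ids = {canonical_go_id(u) for u in uris}
print(ids)
```

All three URIs collapse to a single identifier, which is the reconciliation step a central service would otherwise provide. Every new URI scheme needs a new pattern, which is why the slide argues for institutional support.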

Page 13

Caveat 2: data locality

http://gigaom.com/broadband/the-storage-vs-bandwidth-debate/

Page 14

CONNECTING THE DOTS

Given a therapeutic action (PPAR gamma partial/agonist), what were the related compounds studied, the indications for treatment, the technologies of drug delivery, the related genes, and the affected pathways?
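A question like this is a chain of joins over triples. The following is a toy sketch of that idea: every identifier and predicate (`ex:compoundA`, `ex:hasAction`, and so on) is a made-up placeholder and not real drug data, and `match` and `connect` are hypothetical helpers. In practice a SPARQL query over linked datasets would play this role.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Toy triples; all identifiers and predicates are illustrative placeholders.
TRIPLES: List[Triple] = [
    ("ex:compoundA", "ex:hasAction", "ex:PPARgammaAgonist"),
    ("ex:compoundB", "ex:hasAction", "ex:PPARgammaAgonist"),
    ("ex:compoundC", "ex:hasAction", "ex:otherAction"),
    ("ex:compoundA", "ex:studiedFor", "ex:indication1"),
    ("ex:compoundB", "ex:studiedFor", "ex:indication2"),
    ("ex:compoundA", "ex:targetsGene", "ex:gene1"),
    ("ex:gene1", "ex:participatesIn", "ex:pathway1"),
]


def match(s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for t in TRIPLES:
        if (s is None or t[0] == s) and (p is None or t[1] == p) \
                and (o is None or t[2] == o):
            yield t


def connect(action: str) -> Dict[str, List[str]]:
    """Follow links from an action to compounds, indications, genes, pathways."""
    compounds = sorted(s for s, _, _ in match(p="ex:hasAction", o=action))
    indications = sorted({o for c in compounds
                          for _, _, o in match(s=c, p="ex:studiedFor")})
    genes = sorted({o for c in compounds
                    for _, _, o in match(s=c, p="ex:targetsGene")})
    pathways = sorted({o for g in genes
                       for _, _, o in match(s=g, p="ex:participatesIn")})
    return {"compounds": compounds, "indications": indications,
            "genes": genes, "pathways": pathways}


result = connect("ex:PPARgammaAgonist")
print(result)
```

Each hop reuses the identifiers returned by the previous one, which only works when the datasets share, or can be mapped to, the same URIs.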

Page 15

EBI RDF Platform

• All model elements with annotations to acetylcholine-gated channel complex (GO:0005892)

• Samples treated with alcohol

• Find drug-like (but currently not approved) molecules which bind 7TM1 GPCRs with high affinity

• Under what experimental conditions is Ensembl gene ENSG00000129991 (TNNI3) expressed?

• Pathways that reference Insulin (P01308)

• What are the preferred gene name and disease annotations of all human UniProt entries that are known to be involved in a disease?

★★★★★

Page 16

Open PHACTS Discovery Platform

Freely available pharmacological data from a variety of resources + tools and services to support pharmacological research

★★★★★

Page 17

Bio2RDF: Linked Data for the Life Sciences

• ~11 billion triples across 35 datasets

• Datasets include: clinicaltrials.gov, dbSNP, GenAge, GenDR, LSR, OrphaNet, PubMed, SIDER, WormBase

• Locally hosted endpoints: chembl, linkedSPL, pathwaycommons, reactome, wikipathways

★★★★★

Page 18

NCBO BioPortal RDF

• Provide RDF for each class in BioPortal so that we can have a URL to a concept that resolves to a set of RDF triples that provide essential information about the term

• Provide an RDF dump of each ontology in BioPortal to put them in a triplestore to enable SPARQL access to the ontologies

★★★★★
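SPARQL access to a triplestore typically means sending a query over HTTP. The sketch below only builds the request URL, using the `query` parameter defined by the SPARQL Protocol, and sends nothing over the network. The endpoint `http://example.org/sparql` is a placeholder assumption and not the BioPortal endpoint, and `build_sparql_url` is a hypothetical helper. The class URI is the Foundry-compliant GO URI from the earlier slide.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sparql_url(endpoint: str, class_uri: str, limit: int = 10) -> str:
    """Build a GET URL asking for all predicate/object pairs of one class."""
    query = f"SELECT ?p ?o WHERE {{ <{class_uri}> ?p ?o }} LIMIT {limit}"
    # urlencode percent-encodes the query text so it is safe inside a URL.
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_url(
    "http://example.org/sparql",                  # placeholder endpoint (assumption)
    "http://purl.obolibrary.org/obo/GO_0016458",  # URI taken from the Caveat 1 slide
)

# Decode the URL again to show the query that a server would receive.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
print(decoded)
```

A real endpoint may additionally require an API key or a specific `Accept` header; those details depend on the service and are left out here.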

Page 19

Linked Structured Product Labels http://purl.org/net/linkedSPLs

• LinkedSPLs publishes all sections of FDA-approved prescription and over-the-counter drug package inserts from DailyMed for use by NLP and Semantic Web researchers

• All active moieties and product labels are mapped to RxNorm PURLs provided by the NCBO BioPortal SPARQL endpoint

• LinkedSPLs is provided as a service as part of the Drug Interaction Knowledge Base (DIKB) project

Boyce RD et al. Dynamic enhancement of drug product labels to support drug safety, efficacy, and effectiveness. J Biomed Semantics. 2013 Jan 26;4(1):5. PMID: 23351881.

★★★★★

Page 20

Making public FDA datasets more accessible

• Adverse events. FDA's publicly available drug adverse event and medication error reports, and medical device adverse event reports.

• Recalls. Enforcement report data, containing information gathered from public notices about certain recalls of FDA-regulated products.

• Labeling. Structured Product Labeling (SPL) data for FDA-regulated human prescription drug, OTC drug and biological product labeling.

★★★★

Page 21

RDF Representation of CDISC Foundational Standards

• PhUSE and CDISC Draft RDF Representation

• RDF could provide a foundation for interoperable end-to-end data standards in clinical research

• http://github.com/phuse-org/rdf.cdisc.org

Page 22

Thank You