![Page 1: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/1.jpg)
Open innovation and chemistry data management contributions
from RSC resulting from the Open PHACTS project
Antony Williams, Valery Tkachenko, Ken Karapetyan, Alexey Pshenichnov, Colin Batchelor, Jon Steele & David Sharpe
ACS San Francisco
August 2014
![Page 2: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/2.jpg)
What’s the structure?
Are they in our file?
What’s similar?
What’s the target?
Pharmacology data?
Known Pathways?
Working On Now?
Connections to disease?
Expressed in right cell type?
Competitors?
IP?
![Page 3: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/3.jpg)
Fundamental issue:
•There is a LOT of science online!
•Chaotic, varying quality and very valuable!
•Scientists want to find information quickly and easily
•Often they just “can’t get there” (or don’t even know where “there” is)
•And you have to manage it all (or not)
![Page 4: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/4.jpg)
Pre-competitive Informatics: Pharma are all accessing, processing, storing & re-processing external research data
Literature, PubChem, Genbank, Patents, Databases, Downloads
Data Integration, Data Analysis, Firewalled Databases
Repeat @ each company
Lowering industry firewalls: pre-competitive informatics in drug discovery Nature Reviews Drug Discovery (2009) 8, 701-708 doi:10.1038/nrd2944
![Page 5: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/5.jpg)
ChEMBL
DrugBank
Gene Ontology
Wikipathways
UniProt
ChemSpider
UMLS
ConceptWiki
ChEBI
TrialTrove
GVKBio
GeneGo
TR Integrity
“Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”
“What is the selectivity profile of known p38 inhibitors?”
“Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
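The first question above can be read as a filter over integrated activity records. The following is a minimal sketch under stated assumptions: the record fields (`compound`, `target`, `pathway`, `assay_type`, `potency_um`) and all values are made up for illustration, the phrase "only functional assays" is interpreted as "no non-functional assay on record for that pathway", and nothing here uses the Open PHACTS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    compound: str
    target: str
    pathway: str
    assay_type: str    # e.g. "functional" or "binding"
    potency_um: float  # potency in micromolar


def potent_functional_hits(records, pathway, max_potency_um=1.0):
    """Compounds on `pathway` assayed only in functional assays, with a potency below the cutoff."""
    by_compound = {}
    for record in records:
        if record.pathway == pathway:
            by_compound.setdefault(record.compound, []).append(record)

    hits = []
    for compound, rows in by_compound.items():
        only_functional = all(r.assay_type == "functional" for r in rows)
        potent = any(r.potency_um < max_potency_um for r in rows)
        if only_functional and potent:
            hits.append(compound)
    return sorted(hits)


# Entirely hypothetical records, for illustration only
RECORDS = [
    Activity("cmpd-A", "IKKB", "NFkB", "functional", 0.2),
    Activity("cmpd-B", "IKKB", "NFkB", "binding", 0.05),
    Activity("cmpd-C", "RELA", "NFkB", "functional", 5.0),
    Activity("cmpd-D", "MAPK14", "p38", "functional", 0.1),
]

print(potent_functional_hits(RECORDS, "NFkB"))
```

The point of the sketch is that the question only becomes a simple filter once targets, pathways, assays and compounds from different sources share one record shape, which is the integration problem the project addresses.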
![Page 6: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/6.jpg)
Business Question Driven Approach
![Page 7: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/7.jpg)
• 3-year Innovative Medicines Initiative project
• Integrating chemistry and biology data using semantic web technologies
• Open source code, open data and open standards
• Academics, Pharmas, Publishers…
• To put medicines in the pipeline…
![Page 8: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/8.jpg)
The Open PHACTS community ecosystem
![Page 9: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/9.jpg)
Originally used ChemSpider..
![Page 10: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/10.jpg)
Open PHACTS Deliverables
• Many details but overall…
• Deliver an Open Source chemical registry service, independent of ChemSpider
• Development of Open Source CVSP platform
• Deliver widgets and APIs to the project
• Deliver high quality, standardized Open Data
• Deliver structure data in RDF format
![Page 11: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/11.jpg)
Standardize
• Use the SRS as guidance for standardization
• Adjust as necessary to our needs
![Page 12: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/12.jpg)
Nitro groups
![Page 13: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/13.jpg)
Salt and Ionic Bonds
![Page 14: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/14.jpg)
Depositions Gateway User Interface
![Page 15: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/15.jpg)
Validate and Standardize
![Page 16: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/16.jpg)
CVSP Filtering
![Page 17: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/17.jpg)
CVSP Filtering of DrugBank
![Page 18: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/18.jpg)
ChEMBL (1.3 million records)
• 11,020 records with 4 bonds and zero charge, e.g. CHEMBL501101 or CHEMBL501973
• 271 records with hypervalent oxygen (e.g. CHEMBL2219679), carbon (e.g. 1005895), boron, chlorine, iodine or phosphine
• 6,177 records where direction of bond makes no sense, e.g. CHEMBL12760 and CHEMBL34704
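Checks of this kind (an atom with more bonds than expected at zero charge, or a hypervalent atom) can be expressed as simple rules over atoms. The sketch below is an illustration only: the `Atom` record and the `MAX_NEUTRAL_VALENCE` table are assumptions made for the example, not the CVSP rule set, and charged atoms are deliberately skipped.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    element: str
    bond_order_sum: int  # sum of bond orders around the atom
    charge: int          # formal charge


# Assumed maximum valences for neutral atoms (illustrative, not CVSP's rules)
MAX_NEUTRAL_VALENCE = {"C": 4, "N": 3, "O": 2, "B": 3, "Cl": 1, "I": 1}


def validate_atoms(atoms):
    """Return (index, message) pairs for neutral atoms exceeding the assumed valence limit."""
    issues = []
    for index, atom in enumerate(atoms):
        limit = MAX_NEUTRAL_VALENCE.get(atom.element)
        if limit is None or atom.charge != 0:
            continue  # element or charge state not covered by this sketch
        if atom.bond_order_sum > limit:
            issues.append(
                (index, f"{atom.element} has {atom.bond_order_sum} bonds and zero charge "
                        f"(assumed limit {limit})")
            )
    return issues


# Hypothetical atoms: neutral N with 4 bonds, ammonium-like N+, neutral O with 3 bonds, normal C
SAMPLE = [Atom("N", 4, 0), Atom("N", 4, 1), Atom("O", 3, 0), Atom("C", 4, 0)]

for index, message in validate_atoms(SAMPLE):
    print(index, message)
```

A real validator would work on a parsed connection table and would also need rules for charged atoms, stereo bonds and the like; the table-driven shape is what makes such rules easy to adjust per data source.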
![Page 19: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/19.jpg)
[Figure: structures OPS1 to OPS6 related to DrugBank ID DB07241, drawn with O H, O– and Na+ groups]
ops:OPS1 skos:exactMatch <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB07241> .
ops:OPS2 skos:relatedMatch ops:OPS1 .
ops:OPS3 skos:relatedMatch ops:OPS1 .
ops:OPS3 skos:closeMatch ops:OPS4 .
ops:OPS3 skos:closeMatch ops:OPS5 .
ops:OPS4 skos:closeMatch ops:OPS6 .
ops:OPS5 skos:closeMatch ops:OPS6 .
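A linkset like this can be walked to find every registry structure connected to a source record. The sketch below copies the triples above into plain tuples and does a breadth-first walk; treating the links as undirected follows from the SKOS mapping properties used here being symmetric, while following them transitively is a choice made for this example rather than something the slide states.

```python
from collections import deque

DRUGBANK = "http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB07241"

# Triples copied from the linkset on this slide
TRIPLES = [
    ("ops:OPS1", "skos:exactMatch", DRUGBANK),
    ("ops:OPS2", "skos:relatedMatch", "ops:OPS1"),
    ("ops:OPS3", "skos:relatedMatch", "ops:OPS1"),
    ("ops:OPS3", "skos:closeMatch", "ops:OPS4"),
    ("ops:OPS3", "skos:closeMatch", "ops:OPS5"),
    ("ops:OPS4", "skos:closeMatch", "ops:OPS6"),
    ("ops:OPS5", "skos:closeMatch", "ops:OPS6"),
]


def linked(start, predicates):
    """Breadth-first walk from `start`, following only the given predicates in both directions."""
    neighbours = {}
    for subject, predicate, obj in TRIPLES:
        if predicate in predicates:
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)

    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen - {start})


# Strict lens: only the exact match for the DrugBank record
print(linked(DRUGBANK, {"skos:exactMatch"}))
# Relaxed lens: also follow related and close matches
print(linked(DRUGBANK, {"skos:exactMatch", "skos:relatedMatch", "skos:closeMatch"}))
```

Choosing which predicates to follow lets a caller decide how loosely "the same compound" should be interpreted for a given question.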
Chemical Registry Service
![Page 20: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/20.jpg)
Open Sourcing Data and Code
• All Open PHACTS data is licensed as Open Data and available from the Open PHACTS website – ca. 2 million chemicals
• The Chemical Registration Service, including the Chemical Validation and Standardization Platform, is being prepared for release as Open Source now!
![Page 21: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/21.jpg)
RSC data in Open PHACTS
1. Molecule synonyms and identifiers
2. Linksets between ChEBI, ChEMBL, DrugBank
and OPS identifiers
3. Molecule–molecule relations (“parent–child”) of
interest for drug discovery
4. Calculated physicochemical properties for
compounds (both molecular and macroscopic)
![Page 22: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/22.jpg)
Our RDF schema
Two dozen calculated properties for >10⁶ molecules
• CHEMINF ontology for cheminformatics
• QUDT for units and numeric values
• ChemSpider IDs for molecules
![Page 23: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/23.jpg)
Synonyms and identifiers
Newly added to the CHEMINF ontology:
• Validated ChemSpider synonyms
• Unvalidated ChemSpider synonyms
• Validated database identifiers
• Unvalidated database identifiers
• InChI, InChIKey, SMILES
• Preferred ChemSpider name
![Page 24: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/24.jpg)
Physicochemical properties
• log P
• log D (at pH 5.5 and 7.4)
• bioconcentration factor KOC (at pH 5.5, at pH 7.4)
• index of refraction
• polar surface area
• molar refractivity
• molar volume
• polarizability
• surface tension
• density at STP
• flash point at 1 atm
• boiling point at 1 atm
• enthalpy of vaporization at STP
• vapour pressure at STP
![Page 25: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/25.jpg)
RDF exports from CRS
![Page 26: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/26.jpg)
[Figure: RDF graph describing a calculated log P for benzene. Labels recovered from the diagram:]
• OPS benzene
• benzene’s connection table (rdf:type: CHEMINF connection table)
• calculation process (rdf:type: CHEMINF execution of ACD/Labs PhysChem software library version 12.01)
• calculation result (rdf:type: CHEMINF calculated log P)
• literals: “2.17”^^xsd:float and “0.234”^^xsd:float
• unit: QUDT dimensionless quantity
• properties: IAO is about, OBI has specified input, OBI has specified output, QUDT has value, QUDT has standard uncertainty, QUDT has unit
It is actually more complicated...
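The benzene log P graph can be approximated as a handful of triples. This is a sketch under assumptions: the way nodes are wired together (process takes the connection table as input and yields the result, 2.17 is the value and 0.234 its standard uncertainty) is inferred from the diagram labels, and the node names and prefixed labels are readable placeholders, not the actual CHEMINF, OBI, IAO or QUDT IRIs.

```python
def calculated_property_triples(molecule, prop_label, value, uncertainty,
                                unit_label, software_label):
    """Build placeholder triples linking a molecule, a calculation process and its result."""
    table = f"{molecule}/connection-table"       # hypothetical node names
    process = f"{molecule}/calculation-process"
    result = f"{molecule}/calculation-result"
    return [
        (table, "rdf:type", "CHEMINF:connection table"),
        (table, "IAO:is about", molecule),
        (process, "rdf:type", software_label),
        (process, "OBI:has specified input", table),
        (process, "OBI:has specified output", result),
        (result, "rdf:type", prop_label),
        (result, "QUDT:has value", value),
        (result, "QUDT:has standard uncertainty", uncertainty),
        (result, "QUDT:has unit", unit_label),
    ]


def value_with_uncertainty(triples, result):
    """Look up the value and standard uncertainty attached to a result node."""
    facts = {p: o for s, p, o in triples if s == result}
    return facts["QUDT:has value"], facts["QUDT:has standard uncertainty"]


TRIPLES = calculated_property_triples(
    molecule="ops:benzene",
    prop_label="CHEMINF:calculated log P",
    value=2.17,
    uncertainty=0.234,
    unit_label="QUDT:dimensionless quantity",
    software_label="CHEMINF:execution of ACD/Labs PhysChem software library version 12.01",
)

print(value_with_uncertainty(TRIPLES, "ops:benzene/calculation-result"))
```

Keeping the process as its own node is what lets the same molecule carry several log P values from different software versions without ambiguity about where each number came from.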
![Page 27: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/27.jpg)
What’s built on top of this?
![Page 28: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/28.jpg)
Important for other projects
• Multiple outputs from the project available for reuse to underpin other projects:
• Chemical registry service
• Chemical validation and standardization
• APIs and visualization widgets
![Page 29: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/29.jpg)
New Repository Architecture
doi: 10.1007/s10822-014-9784-5
![Page 30: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/30.jpg)
New Repository Architecture
• Data tier: Compounds, Reactions, Spectra, Materials, Documents
• Data access tier: Compounds API, Reactions API, Spectra API, Materials API, Documents API
• User interface components tier: Compounds Widgets, Reactions Widgets, Spectra Widgets, Materials Widgets, Documents Widgets
• User interface tier (examples): Analytical Laboratory application, Electronic Laboratory Notebook, Chemical Inventory application, Paid 3rd party integrations (various platforms – SharePoint, Google, etc.)
![Page 31: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/31.jpg)
Input data pipeline
Deposition Gateway
• Web UI for unified depositions
• DropBox, Google Drive, SkyDrive, etc.
• LabTrove and other templated data
• Documents
• API, FTP, etc.
Modules: Compounds Module, Spectra Module, Reactions Module, Materials Module, Textmining Module
Staging databases (raw data, validated data): Compounds, Reactions, Spectra, Materials, Articles / CSSP
All databases are sliced by data source/data collection and have a simple security model where each data slice/source is private, public or embargoed
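The private/public/embargoed model for data slices can be sketched as a small access check. Everything beyond the three visibility states is an assumption made for illustration: the `owner` field, the `embargo_until` date, and the rule that an embargoed slice becomes readable once that date has passed are hypothetical and not described on the slide.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Visibility(Enum):
    PRIVATE = "private"
    PUBLIC = "public"
    EMBARGOED = "embargoed"


@dataclass(frozen=True)
class DataSlice:
    name: str
    owner: str                             # hypothetical: the depositing user
    visibility: Visibility
    embargo_until: Optional[date] = None   # hypothetical: release date for embargoed slices


def can_read(data_slice, user, today):
    """Owners always read their slice; others depend on visibility and embargo date."""
    if user == data_slice.owner:
        return True
    if data_slice.visibility is Visibility.PUBLIC:
        return True
    if data_slice.visibility is Visibility.EMBARGOED:
        return data_slice.embargo_until is not None and today >= data_slice.embargo_until
    return False  # private slices are owner-only


thesis = DataSlice("thesis-spectra", "alice", Visibility.EMBARGOED, date(2015, 1, 1))
print(can_read(thesis, "bob", date(2014, 8, 10)))   # before the embargo date
print(can_read(thesis, "bob", date(2015, 6, 1)))    # after the embargo date
```

Attaching visibility to the slice rather than to individual records keeps the check cheap, since one lookup covers every record deposited from the same source.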
![Page 32: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/32.jpg)
Compounds
![Page 33: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/33.jpg)
Reactions
![Page 34: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/34.jpg)
Analytical data
![Page 35: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/35.jpg)
For Deposition of Data
• Quality of data at source
• ensuring chemicals are correct – VALIDATION
• reactions map and balance as appropriate – VALIDATION and STANDARDIZATION
• file format handling for analytical data types – binary file formats are proprietary – STANDARDIZATION
• valid interpretation of data – VALIDATION and ANNOTATION
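As one concrete instance of the reaction validation mentioned above, a deposited reaction can be checked for element balance. This is a minimal sketch, assuming reactions arrive as lists of (coefficient, molecular formula) pairs and that formulas are flat strings such as `C2H6O`; parentheses, charges and isotopes are not handled, and atom mapping is out of scope.

```python
import re
from collections import Counter


def element_counts(formula):
    """Count atoms per element in a flat molecular formula such as 'C2H6O'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_totals(side):
    """Sum element counts over one side of a reaction, weighted by coefficients."""
    totals = Counter()
    for coefficient, formula in side:
        for element, n in element_counts(formula).items():
            totals[element] += coefficient * n
    return totals


def is_balanced(reactants, products):
    """True when both sides contain the same number of atoms of every element."""
    return side_totals(reactants) == side_totals(products)


# 2 H2 + O2 -> 2 H2O
print(is_balanced([(2, "H2"), (1, "O2")], [(2, "H2O")]))
# H2 + O2 -> H2O (unbalanced)
print(is_balanced([(1, "H2"), (1, "O2")], [(1, "H2O")]))
```

A check like this would flag a deposition for review rather than reject it, since many published reaction schemes omit by-products on purpose.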
![Page 36: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/36.jpg)
Input data pipeline
![Page 37: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/37.jpg)
Deposition of Data
![Page 38: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/38.jpg)
User Interface Approach
![Page 39: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/39.jpg)
![Page 40: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/40.jpg)
![Page 41: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/41.jpg)
User Interface Approach
![Page 42: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/42.jpg)
![Page 43: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/43.jpg)
Work in Progress
![Page 44: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/44.jpg)
User Interface Approach
![Page 45: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/45.jpg)
A Compounds Repository Interface
![Page 46: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/46.jpg)
The PharmaSea Website
![Page 47: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/47.jpg)
The Open PHACTS community ecosystem
![Page 48: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/48.jpg)
Open PHACTS Project Partners
Pfizer Limited – Coordinator
Universität Wien – Managing entity
Technical University of Denmark
University of Hamburg, Center for Bioinformatics
BioSolveIT GmbH
Consorci Mar Parc de Salut de Barcelona
Leiden University Medical Centre
Royal Society of Chemistry
Vrije Universiteit Amsterdam
Spanish National Cancer Research Centre
University of Manchester
Maastricht University
Aqnowledge
University of Santiago de Compostela
Rheinische Friedrich-Wilhelms-Universität Bonn
AstraZeneca
GlaxoSmithKline
Esteve
Novartis
Merck Serono
H. Lundbeck A/S
Eli Lilly
Netherlands Bioinformatics Centre
Swiss Institute of Bioinformatics
ConnectedDiscovery
EMBL-European Bioinformatics Institute
Janssen
OpenLink
![Page 49: Open innovation contributions from RSC resulting from the Open Phacts project](https://reader037.vdocument.in/reader037/viewer/2022102815/554e7cfcb4c905f66a8b5258/html5/thumbnails/49.jpg)
Thank you
Email: [email protected]
ORCID: 0000-0002-2668-4821
Twitter: @ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams