Answering scientific questions with linked European nanosafety data
Egon Willighagen
http://chem-bla-ics.blogspot.com/
@egonwillighagen #nmsa2017
ORCID: 0000-0001-7542-0286
NMSA2017, 8 Feb 2017, Malaga
Copyright © 2017 E. Willighagen, CC-BY 4.0 Int.
Data flow (*)
1. research data management
   a) data measurements
   b) data storage
2. data curation
3. data collection and integration
4. data analysis (statistics / machine learning)
Image: https://commons.wikimedia.org/wiki/File:ABS-8301.0-ProductionSelectedConstructionMaterials-QuarterlyCommodities-Quantity-PreMixedConcrete-Australia-A3572492X.svg
Answering research questions
1. Are metal oxide nanoparticles genotoxic?
2. Is this data set complete, or are there knowledge gaps?
http://www.openphactsfoundation.org
Semantic data integration: needs
1. Data in a semantic format
2. Linked Data
3. Use of common ontologies
4. A query language
5. (High quality and complete data)
Linked Data
"Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/"
Results: data completeness
SELECT DISTINCT ?substance ?type ?title ?value ?unit
WHERE {
  BIND (ex:NWKI-002f5129 AS ?substance)
  BIND (pato:PATO_0000117 AS ?propertyType)
  { ?assay a ?propertyType . }
  UNION
  { ?assay a [ rdfs:subClassOf+ ?propertyType ] }
  ?substance a obo:CHEBI_59999 ;
             obo:BFO_0000056 ?mgroup .
  ?mgroup obo:OBI_0000299 ?endpoint .
  ?endpoint sso:has-value ?value ;
            sso:has-unit ?unit .
  ?assay a bao:BAO_0000015, ?type ;
         bao:BAO_0000209 ?mgroup ;
         dc:title ?title .
  FILTER (?type != bao:BAO_0000015)
}
ORDER BY ASC(?substance)
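The completeness query above lists what has been measured for one substance. The knowledge-gap question can be approached from the other side, by asking which substances have no assay of a given property type at all. The following is only a sketch under assumptions: it reuses the predicates from the query above, the prefix IRIs are guesses that are not given on the slides, and FILTER NOT EXISTS is one possible way to express a gap, not necessarily the approach used in the talk.

```sparql
# Assumed prefix IRIs (not stated on the slides)
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX obo:  <http://purl.obolibrary.org/obo/>
PREFIX pato: <http://purl.obolibrary.org/obo/>
PREFIX bao:  <http://www.bioassayontology.org/bao#>

# Substances without any assay typed as PATO_0000117 or one of its subclasses
SELECT DISTINCT ?substance
WHERE {
  ?substance a obo:CHEBI_59999 .
  FILTER NOT EXISTS {
    ?substance obo:BFO_0000056 ?mgroup .
    ?assay bao:BAO_0000209 ?mgroup ;
           a/rdfs:subClassOf* pato:PATO_0000117 .
  }
}
ORDER BY ASC(?substance)
```

The property path a/rdfs:subClassOf* covers both branches of the UNION in the query above (direct type or subclass) in a single pattern.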
Results: genotoxicity of metal oxides
Which metal oxides (NPO_1541) show a form of genotoxicity (BAO_0002167)?
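One way this question might be phrased in SPARQL, following the pattern of the data completeness query, is sketched below. The assumptions are labeled: the prefix IRIs are not given on the slides, substances are assumed to be typed with NPO_1541 or one of its subclasses, and assays are assumed to be typed with BAO_0002167 or one of its subclasses. The actual query used for the talk may differ.

```sparql
# Assumed prefix IRIs (not stated on the slides)
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX obo:  <http://purl.obolibrary.org/obo/>
PREFIX npo:  <http://purl.bioontology.org/ontology/npo#>
PREFIX bao:  <http://www.bioassayontology.org/bao#>
PREFIX sso:  <http://semanticscience.org/resource/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>

SELECT DISTINCT ?substance ?title ?value ?unit
WHERE {
  # Assumption: the material is typed as a metal oxide (NPO_1541) or a subclass
  ?substance a/rdfs:subClassOf* npo:NPO_1541 ;
             obo:BFO_0000056 ?mgroup .
  # Assumption: the assay is typed as genotoxicity (BAO_0002167) or a subclass
  ?assay a/rdfs:subClassOf* bao:BAO_0002167 ;
         bao:BAO_0000209 ?mgroup ;
         dc:title ?title .
  # Same endpoint pattern as in the data completeness query
  ?mgroup obo:OBI_0000299 ?endpoint .
  ?endpoint sso:has-value ?value ;
            sso:has-unit ?unit .
}
ORDER BY ASC(?substance)
```

Because the subclass closure is walked with rdfs:subClassOf*, the ontologies themselves need to be loaded next to the data for the class hierarchy to be found.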
Conclusions
1. established a data workflow: spreadsheets, eNanoMapper, RDF, linked data
2. ontology annotation allows reasoning and supports data curation
3. approach allows unified linking of different, independent datasets
4. scientific questions can be formulated as SPARQL