Answering scientific questions with linked European nanosafety data

Answering scientific questions with linked European nanosafety data. Egon Willighagen, http://chem-bla-ics.blogspot.com/, @egonwillighagen, #nmsa2017, ORCID: 0000-0001-7542-0286. NMSA2017, 8 Feb 2017, Malaga. Copyright © 2017 E. Willighagen, CC-BY 4.0 Int.

Upload: egon-willighagen

Post on 21-Feb-2017


TRANSCRIPT

Answering scientific questions with linked European nanosafety data

Egon Willighagen

http://chem-bla-ics.blogspot.com/
@egonwillighagen #nmsa2017
ORCID: 0000-0001-7542-0286

NMSA2017, 8 Feb 2017, Malaga

Copyright © 2017 E. Willighagen, CC-BY 4.0 Int.

Data flow (*)

1. research data management
   a) data measurements
   b) data storage
2. data curation
3. data collection and integration
4. data analysis (statistics / machine learning)

Image: https://commons.wikimedia.org/wiki/File:ABS-8301.0-ProductionSelectedConstructionMaterials-QuarterlyCommodities-Quantity-PreMixedConcrete-Australia-A3572492X.svg

Answering research questions

1. Are metal oxide nanoparticles genotoxic?

2. Is this data set complete, or are there knowledge gaps?

http://www.openphactsfoundation.org

Semantic data integration: needs

1. Data in a semantic format
2. Linked Data
3. Use of common ontologies
4. A query language
5. (High quality and complete data)
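To make the first and fourth needs concrete, here is a minimal Python sketch. It is not how eNanoMapper stores or queries its data; it only illustrates the idea that facts are kept as subject, predicate, object triples and that a query is a pattern with wildcards. All `ex:` identifiers and the helper `match` are hypothetical and exist only for this example.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All "ex:" identifiers below are made-up placeholders, not real data.
TRIPLES = [
    ("ex:material1", "rdf:type", "ex:MetalOxide"),
    ("ex:material1", "ex:hasAssay", "ex:assay1"),
    ("ex:assay1", "rdf:type", "ex:GenotoxicityAssay"),
    ("ex:material2", "rdf:type", "ex:CarbonNanotube"),
]


def match(pattern, triples=TRIPLES):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


# "Which things are typed as a metal oxide?"
metal_oxides = [s for s, _, _ in match((None, "rdf:type", "ex:MetalOxide"))]
print(metal_oxides)
```

A real setup would replace the list with an RDF store and the `match` helper with a SPARQL query, but the shape of the question stays the same.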

Semantic data (RDF format)

http://egonw.github.com/nmsa/

Linked Data

"Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/"

Ontologies define hierarchies
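A hierarchy matters because a question about a general class should also find items annotated with a more specific one. The Python sketch below illustrates that lookup under simplifying assumptions: every class has at most one parent, and the class names are invented for illustration rather than taken from a real ontology. In SPARQL, a comparable transitive step can be written with a property path such as `rdfs:subClassOf+`.

```python
# Hypothetical class hierarchy: child class -> parent class.
# The class names are illustrative, not taken from a real ontology.
SUBCLASS_OF = {
    "ex:CometAssay": "ex:GenotoxicityAssay",
    "ex:GenotoxicityAssay": "ex:ToxicityAssay",
    "ex:ToxicityAssay": "ex:Assay",
}


def superclasses(cls, hierarchy=SUBCLASS_OF):
    """Walk up the hierarchy and collect every ancestor of cls."""
    found = []
    # The second condition stops the walk if the data contains a cycle.
    while cls in hierarchy and hierarchy[cls] not in found:
        cls = hierarchy[cls]
        found.append(cls)
    return found


def is_a(cls, target, hierarchy=SUBCLASS_OF):
    """True if cls equals target or is a (transitive) subclass of it."""
    return cls == target or target in superclasses(cls, hierarchy)


print(superclasses("ex:CometAssay"))
print(is_a("ex:CometAssay", "ex:ToxicityAssay"))
```

Real ontologies allow several parents per class, so a production version would track a set of parents instead of a single one.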

Results: data completeness

SELECT DISTINCT ?substance ?type ?title ?value ?unit
WHERE {
  BIND (ex:NWKI-002f5129 AS ?substance)
  BIND (pato:PATO_0000117 AS ?propertyType)
  { ?assay a ?propertyType . }
  UNION
  { ?assay a [ rdfs:subClassOf+ ?propertyType ] }
  ?substance a obo:CHEBI_59999 ;
             obo:BFO_0000056 ?mgroup .
  ?mgroup obo:OBI_0000299 ?endpoint .
  ?endpoint sso:has-value ?value ;
            sso:has-unit ?unit .
  ?assay a bao:BAO_0000015, ?type ;
         bao:BAO_0000209 ?mgroup ;
         dc:title ?title .
  FILTER (?type != bao:BAO_0000015)
}
ORDER BY ASC(?substance)

Results: data completeness (TiO2)

http://egonw.github.io/enmSummaries/
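The completeness question ("is this data set complete, or are there knowledge gaps?") boils down to comparing what was measured against what one would like to have. The following Python sketch shows that comparison as a set difference. The list of expected endpoint types, the material identifiers, and the measurements are all hypothetical placeholders and do not reflect the actual content of the summaries linked above.

```python
# Endpoint types we would like to see for every material (hypothetical list).
EXPECTED = {"size", "zeta potential", "cell viability", "genotoxicity"}

# Hypothetical measurements: material -> endpoint types that have a value.
MEASURED = {
    "ex:material1": {"size", "zeta potential"},
    "ex:material2": {"size", "cell viability", "genotoxicity"},
}


def knowledge_gaps(measured=MEASURED, expected=EXPECTED):
    """Map each material to the sorted list of endpoint types without data."""
    return {m: sorted(expected - have) for m, have in measured.items()}


for material, gaps in knowledge_gaps().items():
    print(material, "is missing:", ", ".join(gaps) or "nothing")
```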

Results: genotoxicity of metal oxides

Which metal oxides (NPO_1541) show a form of genotoxicity (BAO_0002167)?
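As a rough illustration of how such a question combines ontology classes with measurement data, the Python sketch below filters toy records by material class and assay class, following subclass links on both sides. Only the identifiers NPO_1541 and BAO_0002167 come from the slide; the subclasses, the records, and the `positive` flag (an assumed way of recording an assay outcome) are invented for the example and carry no scientific meaning.

```python
# Hypothetical subclass links (child -> parent). Only NPO_1541 (metal oxide)
# and BAO_0002167 (genotoxicity) come from the slides; the rest is made up.
SUBCLASS_OF = {
    "ex:MetalOxideA": "NPO_1541",
    "ex:MetalOxideB": "NPO_1541",
    "ex:GenotoxAssayX": "BAO_0002167",
}

# Hypothetical records: (material, material class, assay class, positive?)
RECORDS = [
    ("ex:m1", "ex:MetalOxideA", "ex:GenotoxAssayX", True),
    ("ex:m2", "ex:MetalOxideB", "ex:GenotoxAssayX", False),
    ("ex:m3", "ex:OtherMaterial", "ex:GenotoxAssayX", True),
]


def falls_under(cls, target):
    """True if cls is target or reaches it by following subclass links."""
    seen = set()
    while cls != target and cls in SUBCLASS_OF and cls not in seen:
        seen.add(cls)
        cls = SUBCLASS_OF[cls]
    return cls == target


def genotoxic_metal_oxides(records=RECORDS):
    """Materials under NPO_1541 with a positive assay under BAO_0002167."""
    return sorted({
        material
        for material, mat_cls, assay_cls, positive in records
        if positive
        and falls_under(mat_cls, "NPO_1541")
        and falls_under(assay_cls, "BAO_0002167")
    })


print(genotoxic_metal_oxides())
```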

(High quality data)

http://search.data.enanomapper.net/

Conclusions

1. established a data workflow: spreadsheets, eNanoMapper, RDF, linked data

2. ontology annotation allows reasoning and supports data curation

3. approach allows unified linking of different, independent datasets

4. scientific questions can be formulated as SPARQL queries

Acknowledgments

eNanoMapper (Grant Agreement no. 604134) is a project supported by the European Commission through the 7th Framework Programme

http://www.enanomapper.net/