The Electronic Notebook Ontology
Stuart J. Chalk, Department of Chemistry, University of North Florida
VIVO 2015 – August 2015


Page 1: The Electronic Notebook Ontology

The Electronic Notebook Ontology

Stuart J. Chalk
Department of Chemistry, University of North Florida

VIVO 2015 – August 2015

Page 2: The Electronic Notebook Ontology

Motivation
Inspiration
Electronic Scientific Notebooks
The Experiment Markup Language
VIVO-ISF Ontology
HCLS Community Profiles
Analysis
Important Questions
Ontology
Conclusion

Outline

Page 3: The Electronic Notebook Ontology

There’s something missing from the big data landscape in science…

VIVO captures data about scientists (faculty)… but not about the data they produce

HCLS Community Profile outlines metadata for describing datasets but does not mention laboratory notebooks

Electronic laboratory notebooks are set to become the standard way scientists capture data

How do we link these together?

Motivation

Page 4: The Electronic Notebook Ontology
Page 5: The Electronic Notebook Ontology

Scientists need to move to digital notebooks…

...and record not just the data but the flow and context

Traditional Laboratory Notebooks

How science is done is important for searching, aggregation, meta-analysis

Page 6: The Electronic Notebook Ontology

Developed out of Laboratory Information Management Systems (LIMS)
Content Management System for Scientists
Storage of
  Research data
  Research resources (instruments, samples, scientists)
  The story of the scientific endeavor
Link to external resources
Display chemical structures
Allow aggregation, processing of data
Be compliant with industry-standard record keeping

Electronic Laboratory Notebooks

Page 7: The Electronic Notebook Ontology

Electronic Laboratory Notebooks

Page 8: The Electronic Notebook Ontology

A specification (written in XML) that describes different types of information recorded during the scientific process (http://exptml.sourceforge.net)

Many datatypes (will expand…)

Experiment Markup Language (ExptML)

Annotation, Api, Calculation, Chemical, Citation, Communication, Customer, Data, Dataset, Definition, Element, Equipment, Event, Experiment, Group, Project, Protocol, Quote, Report, Result, Sample, Solution, Space, Specimen, Substance, Task, Template, Timeline, User, Vendor

Page 9: The Electronic Notebook Ontology

ExptML Ontology

Page 10: The Electronic Notebook Ontology

VIVO-ISF Ontology

https://wiki.duraspace.org/download/attachments/51052811/PeopleOrgsRolesGrants.2014-03-14.png

Page 11: The Electronic Notebook Ontology

The Health Care and Life Sciences (HCLS) Community Profile is a Note from the Semantic Web HCLS Interest Group

"Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. This document describes a consensus among participating stakeholders in the Health Care and the Life Sciences domain on the description of datasets using the Resource Description Framework (RDF). This specification meets key functional requirements, reuses existing vocabularies to the extent that it is possible, and addresses elements of data description, versioning, provenance, discovery, exchange, query, and retrieval."

Data Descriptions: HCLS Community Profile

http://www.w3.org/TR/hcls-dataset/

Page 12: The Electronic Notebook Ontology

Describes three levels for description of datasets

Summary Level
  Type declaration (rdf:type = dctypes:Dataset)
  Title (dct:title = rdf:langString)
  Description (dct:description = rdf:langString)
  Publisher (dct:publisher = IRI)

Version Level
  Type declaration (rdf:type = dctypes:Dataset)
  Title (dct:title = rdf:langString)
  Description (dct:description = rdf:langString)
  Creator (dct:creator = IRI)
  Publisher (dct:publisher = IRI)
  Version identifier (pav:version = xsd:string)
  Version linking (dct:isVersionOf = IRI)

Distribution Level
  Type declaration (rdf:type = void:Dataset OR dcat:Distribution)
  Title (dct:title = rdf:langString)
  Description (dct:description = rdf:langString)
  Creator (dct:creator = IRI)
  Publisher (dct:publisher = IRI)
  License (dct:license = IRI)

Data Descriptions: HCLS Community Profile

http://www.w3.org/TR/hcls-dataset/#datasetdescriptionlevels
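
The three description levels can be read as a checklist of required properties. The following is a minimal sketch, not part of the original slides, of how a notebook entry's metadata might be checked against each level. The dictionary layout, the function names, and the example entry are hypothetical, and the license is keyed as dct:license on the assumption that this is the property the HCLS profile intends.

```python
# Required properties per HCLS description level, transcribed from the slide.
# Keying the license as "dct:license" is an assumption of this sketch.
COMMON = {"rdf:type", "dct:title", "dct:description", "dct:publisher"}

REQUIRED = {
    "summary": COMMON,
    "version": COMMON | {"dct:creator", "pav:version", "dct:isVersionOf"},
    "distribution": COMMON | {"dct:creator", "dct:license"},
}

ALLOWED_TYPES = {
    "summary": {"dctypes:Dataset"},
    "version": {"dctypes:Dataset"},
    "distribution": {"void:Dataset", "dcat:Distribution"},
}


def missing_properties(description, level):
    """Return the required properties absent from a description dict."""
    return REQUIRED[level] - set(description)


def satisfies_level(description, level):
    """True if every required property is present and rdf:type is allowed."""
    if missing_properties(description, level):
        return False
    return description["rdf:type"] in ALLOWED_TYPES[level]


# Hypothetical notebook entry described at the summary level only.
entry = {
    "rdf:type": "dctypes:Dataset",
    "dct:title": "Titration of sample A",
    "dct:description": "pH readings recorded during a titration.",
    "dct:publisher": "https://example.org/org/chemistry",
}
```

Calling missing_properties(entry, "version") on this entry reports the creator, version identifier, and version link as absent, which is the kind of gap an ELN could prompt the user to fill.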

Page 13: The Electronic Notebook Ontology

Goal: Automated identification of datasets that could be made searchable and/or distributable

What does an ELN do when it is running?
  Orchestrates access to the system (authentication)
  Supplies a GUI to allow information to be
    Displayed
    Entered
    Processed
  Processes files to bring them into the system
  Sends requests to internal/external servers to get data

Analysis

Page 14: The Electronic Notebook Ontology

Is this information a dataset?
Does the dataset belong to this author?
Is the dataset available?
Is there appropriate metadata?
At what HCLS levels can this dataset be made available?
What mechanism is used to make the dataset available?

Important Questions
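
These questions can be asked in order, stopping at the first one that fails. The sketch below illustrates that idea only; it is not taken from the slides. The Candidate fields, the REQUIRED_METADATA set, and the function name are all hypothetical, and the question about HCLS levels is left out because it needs the full property lists for each level.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical minimum metadata; a real system would use the HCLS level lists.
REQUIRED_METADATA = {"dct:title", "dct:description", "dct:publisher"}


@dataclass
class Candidate:
    """A piece of notebook content that might be a publishable dataset."""
    is_dataset: bool
    owner_id: str
    available: bool
    metadata: dict = field(default_factory=dict)
    mechanism: Optional[str] = None  # e.g. "download" or "api" (hypothetical)


def first_failed_question(candidate, author_id):
    """Return the first question the candidate fails, or None if all pass."""
    checks = [
        ("Is this information a dataset?", candidate.is_dataset),
        ("Does the dataset belong to this author?",
         candidate.owner_id == author_id),
        ("Is the dataset available?", candidate.available),
        ("Is there appropriate metadata?",
         REQUIRED_METADATA.issubset(candidate.metadata)),
        ("What mechanism is used to make the dataset available?",
         candidate.mechanism is not None),
    ]
    for question, passed in checks:
        if not passed:
            return question
    return None
```

Returning the failed question rather than a bare boolean lets the ELN tell the author what is still needed before the dataset can be shared.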

Page 15: The Electronic Notebook Ontology

Actions that deal with datasets
  Software actions
  User actions
Clues that something is research data (not metadata or someone else’s data)
Collection of metadata for annotation of datasets
Inference that an HCLS dataset has been created

Dataset Identification
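
One way to picture this identification step is as a pass over the ELN's action log: keep the actions that produce data, group them by record, and collect the metadata they carry. The sketch below assumes a hypothetical log format and hypothetical action names (file_upload, instrument_import, result_saved). It is an illustration only and is not drawn from ENO or from any particular ELN.

```python
from dataclasses import dataclass, field

# Hypothetical names for actions that bring research data into the notebook.
DATA_ACTIONS = {"file_upload", "instrument_import", "result_saved"}


@dataclass
class Action:
    """One logged ELN action, performed by the software or by a user."""
    actor: str        # "software" or "user"
    name: str         # action name, e.g. "file_upload"
    record_id: str    # notebook record the action touched
    metadata: dict = field(default_factory=dict)


def identify_datasets(log):
    """Map each record touched by a data action to its merged metadata."""
    found = {}
    for action in log:
        if action.name not in DATA_ACTIONS:
            continue  # e.g. logins and page views are not clues
        found.setdefault(action.record_id, {}).update(action.metadata)
    return found


# Hypothetical log: a login, then two data actions on the same record.
log = [
    Action("user", "login", "session-1"),
    Action("user", "file_upload", "r1", {"dct:title": "UV-Vis scan"}),
    Action("software", "result_saved", "r1",
           {"dct:creator": "https://example.org/person/1"}),
]
```

With this log, identify_datasets(log) yields a single candidate record, r1, carrying both the title and the creator, which could then be checked against the HCLS description levels.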

Page 16: The Electronic Notebook Ontology

Electronic Notebook Ontology (ENO)

Page 17: The Electronic Notebook Ontology

ENO

Page 18: The Electronic Notebook Ontology

ENO

Page 19: The Electronic Notebook Ontology

Providing a mechanism to link research data to VIVO profiles would
  Add value to VIVO
  Provide faculty with a resource for their data management plans
  Create opportunities for automatic aggregation of research data into institutional repositories

Needs to be implemented in a test ELN…

Take Home

Page 21: The Electronic Notebook Ontology