deepcarbon.net
Xiaogang Ma, Patrick West, John Erickson, Stephan Zednik, Yu Chen, Han Wang, Hao Zhong, Peter Fox
Tetherless World Constellation, Rensselaer Polytechnic Institute
From data portal to knowledge portal: Leveraging semantic technologies to support interdisciplinary studies
2
Outline
• Deep Carbon Observatory
• Deep Carbon Virtual Observatory (DCvO)
– Architecture of DCvO
– DCO Ontologies
– Boundary activities
– Discovering information by clicking through
• Summary
3
A 10-year (2009-2019) initiative to intensify global attention and scientific effort in the burgeoning field of deep carbon science
4
• Faculty, staff and students from the Tetherless World Constellation (TWC) at Rensselaer Polytechnic Institute (RPI)
• Responsible for
– DCO Architecture and technology infrastructure
– DCO Computer Cluster
– The Deep Carbon Virtual Observatory (DCvO)
Deep Carbon Observatory – Data Science
5
Deep Carbon Virtual Observatory
Scientists – actually ANYONE – should be able to access a global, distributed knowledge base of scientific data and information that:
• appears to be integrated
• appears to be locally available
• is in a language (written, programming, or science) that is understandable and can be shared
Data intensive – volume, complexity, mode, scale, heterogeneity, … in an OPEN WORLD
6
Deep Carbon Virtual Observatory
• A vision of the DCvO:
– A conceptual model of the interplay between data, people, publication, instruments, models, organizations, etc.
– Identify, annotate and link all key entities, agents and activities
– A repository for datasets and associated metadata
– Unique and powerful data and metadata visualization for dissemination of information
– Facilitates the discovery of potential collaborations
– An integrated portal for diverse content and applications
(Fox et al., 2014)
7
DCvO “Architecture”
8
vivo.cornell.edu
VIVO – represents academic research communities
DCO ontology: a model for concept types and relationships
DCO ontologies extend each other and the VIVO ontology
9
Ontologies and schemas used in the DCO web portal
Name                                                     Prefix
Dublin Core Metadata Element Set                         dc
DCMI Metadata Terms                                      dct
VIVO Core                                                vivo
VIVO Scientific Research Ontology                        scires
Data Catalog Vocabulary                                  dcat
Bibliographic Ontology                                   bibo
Citation Counting and Context Characterization Ontology  c4o
Citation Typing Ontology                                 cito
FRBR-Aligned Bibliographic Ontology                      fabio
Event Ontology                                           event
Friend of a Friend                                       foaf
vCard Ontology                                           vcard
Geopolitical Ontology                                    geo
Simple Knowledge Organization System                     skos
DCO Ontology                                             dco
PROV Ontology                                            prov
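As a rough illustration of how the prefixes in this table might be put to work, the sketch below builds a SPARQL PREFIX header from a prefix-to-namespace map and expands compact names into full URIs. The URIs given for dct, foaf, skos, prov and dcat are the standard published namespaces; the `dco` URI is a placeholder assumption (the slides do not state it), and the helper names `sparql_prefix_header` and `expand` are hypothetical.

```python
# Prefix-to-namespace map for a few of the vocabularies listed above.
NAMESPACES = {
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "prov": "http://www.w3.org/ns/prov#",
    "dcat": "http://www.w3.org/ns/dcat#",
    # ASSUMPTION: placeholder namespace; the slides do not give the DCO ontology URI.
    "dco": "http://example.org/dco#",
}


def sparql_prefix_header(prefixes, namespaces=NAMESPACES):
    """Return SPARQL PREFIX declarations for the requested prefixes, one per line."""
    lines = []
    for prefix in prefixes:
        if prefix not in namespaces:
            raise KeyError(f"unknown prefix: {prefix}")
        lines.append(f"PREFIX {prefix}: <{namespaces[prefix]}>")
    return "\n".join(lines)


def expand(curie, namespaces=NAMESPACES):
    """Expand a compact name such as 'foaf:Person' into a full URI."""
    prefix, local = curie.split(":", 1)
    return namespaces[prefix] + local


if __name__ == "__main__":
    print(sparql_prefix_header(["dct", "foaf", "dco"]))
    print(expand("foaf:Person"))
```

Keeping the map in one place means a query author only names the prefixes needed, and a changed namespace is corrected once.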
10
Ontologies and schemas used in the DCO web portal
DCO Boundary Activities are driving the extensions within the DCO Ontologies
11
DCO Extension for Project Updates
12
Dynamically generated list of grants that are part of the Deep Carbon Observatory. Users can click through to learn more, and members can create reports to be sent to funding organizations
13
Grant page lists all projects and reporting updates for each of the projects and field studies
14
DCO Extension for Data Types
15
A Few Boundary Activities
• Given a DOI, pull publication information from CrossRef and/or Web of Science
• DCO IGSN Allocation Agent to work with the IGSN Registry
• Integration with existing data portals and repositories
• Data Rescue activities
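The DOI lookup in the first bullet can be sketched roughly as follows, assuming the public CrossRef REST API (`https://api.crossref.org/works/<doi>`) is the source. This is not the DCO implementation; the function names and the choice of fields kept are assumptions made for illustration, and the sample record in the usage block is made up to show the response shape only.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi):
    """Build the CrossRef REST API URL for a DOI."""
    return CROSSREF_WORKS + quote(doi, safe="/")


def summarize(record):
    """Pick a few bibliographic fields out of a CrossRef 'message' object."""
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in record.get("author", [])
    ]
    date_parts = record.get("issued", {}).get("date-parts", [[None]])
    return {
        "doi": record.get("DOI"),
        "title": (record.get("title") or [None])[0],
        "journal": (record.get("container-title") or [None])[0],
        "year": date_parts[0][0] if date_parts and date_parts[0] else None,
        "authors": authors,
    }


def fetch_publication(doi, timeout=10):
    """Fetch one work from CrossRef and summarize it (needs network access)."""
    with urlopen(crossref_url(doi), timeout=timeout) as response:
        payload = json.load(response)
    return summarize(payload["message"])


if __name__ == "__main__":
    # Made-up record with the same shape as a CrossRef 'message'; not real data.
    sample = {
        "DOI": "10.1234/example",
        "title": ["An example title"],
        "container-title": ["Example Journal"],
        "issued": {"date-parts": [[2014, 5]]},
        "author": [{"given": "A.", "family": "Author"}],
    }
    print(summarize(sample))
```

Separating `summarize` from `fetch_publication` lets the parsing be checked offline against a stored record before any network call is made.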
16
Modern informatics enables a new scale-free framework approach
• Use cases
• Stakeholders
• Modeling
• Ontologies
• Evaluation
17
What does a DCO data publication look like?
18
Identification and annotation
Information on the landing page of a dataset
19
Linking to enable forward and backward tracking
Landing page of Helium Concept
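One way to picture forward and backward tracking is a triple index that can be read in both directions: from a dataset to the concepts it is about, and from a concept back to everything that points at it. The sketch below is a minimal in-memory illustration under that assumption; the class name `LinkIndex` and all `dataset:`, `pub:`, `person:` and `concept:` identifiers are hypothetical, while `dct:subject` and `prov:wasAttributedTo` are existing terms from the vocabularies listed earlier.

```python
from collections import defaultdict


class LinkIndex:
    """Index (subject, predicate, object) triples in both directions."""

    def __init__(self, triples):
        self.forward = defaultdict(list)   # subject -> [(predicate, object)]
        self.backward = defaultdict(list)  # object  -> [(predicate, subject)]
        for s, p, o in triples:
            self.forward[s].append((p, o))
            self.backward[o].append((p, s))

    def outgoing(self, node):
        """Forward tracking: what does this node link to?"""
        return list(self.forward.get(node, []))

    def incoming(self, node):
        """Backward tracking: what links to this node?"""
        return list(self.backward.get(node, []))


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    index = LinkIndex([
        ("dataset:noble-gas", "dct:subject", "concept:Helium"),
        ("pub:1", "dct:subject", "concept:Helium"),
        ("dataset:noble-gas", "prov:wasAttributedTo", "person:A"),
    ])
    print(index.outgoing("dataset:noble-gas"))
    print(index.incoming("concept:Helium"))
```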
20
Landing page of a person
Linking to build Collaborations
21
Landing page of a research area
Linking to build Collaborations
22
DCO Knowledge Graph Analytics
23
Thus… progress…
• Integrative – semantics
• Transparent – semantics
• Collaborative – semantics
• Application integration – Yep – semantics
24
Thank you!
Patrick West, [email protected], https://deepcarbon.net, http://tw.rpi.edu
25
An integrated portal: deepcarbon.net
26
Faceted publication browser
27
Repository for archiving datasets
Archived datasets of ‘Noble gas isotope abundances in terrestrial fluids’
28
Collaboration tools
Group Based Collaboration
Group data deposit and reporting
Listings of group content
Group management and messaging
29
RDA DTR and PIT adoption
The DTR primitives are comparable to a list of BASIC DATA TYPE CLASSES in the DCO ontology, e.g. Dataset, Image, Video, Audio, etc.
A registered DCO dataset is asserted as an instance of one of those basic data type classes.
It is possible to further annotate the dataset with the SPECIFIC DATA TYPES defined within a DTR, and each data type has a unique PID.
A Few Boundary Activities
Results of data type specification
• Updates to the DCO Ontology:
– A new class dco:DataType. Each specific data type is an instance of it
– An object property dco:hasDataType linking a dataset and a data type
– A collection of other classes and properties associated with dco:DataType
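The two-level typing described here (a basic data type class plus a specific data type carrying a PID) can be sketched with a handful of triples. This is a minimal illustration, not the DCO data model: `dco:DataType` and `dco:hasDataType` come from the slide, while `dco:Dataset` as the spelling of the basic class, the `ex:` identifiers, the `ex:pid` property and the PID string are all assumptions made for the example.

```python
RDF_TYPE = "rdf:type"

# ASSUMPTION: everything below except dco:DataType and dco:hasDataType is
# hypothetical, including the PID value and the ex:pid property.
TRIPLES = {
    ("ex:dataset-1", RDF_TYPE, "dco:Dataset"),
    ("ex:dataset-1", "dco:hasDataType", "ex:type-isotope-table"),
    ("ex:type-isotope-table", RDF_TYPE, "dco:DataType"),
    ("ex:type-isotope-table", "ex:pid", "pid:EXAMPLE/0001"),
}


def objects(triples, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def describe_dataset(triples, dataset):
    """List a dataset's basic classes and its specific data types with their PIDs."""
    specific = []
    for dtype in objects(triples, dataset, "dco:hasDataType"):
        # Only accept targets that are asserted as instances of dco:DataType.
        if "dco:DataType" in objects(triples, dtype, RDF_TYPE):
            pids = objects(triples, dtype, "ex:pid")
            specific.append((dtype, pids[0] if pids else None))
    return {
        "basic_classes": objects(triples, dataset, RDF_TYPE),
        "specific_types": specific,
    }


if __name__ == "__main__":
    print(describe_dataset(TRIPLES, "ex:dataset-1"))
```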
30
31
• New datasets available via dataset browser
• Includes citations to the originating publication
• Data files accessible through dataset repository
Thermodynamic Data Rescue
32
DCO Knowledge Store Analytics
33
DCO Knowledge Store Visualizations
34
All information is linked and traceable!
35
Mediation
From: C. Borgman, 2008, NSF Cyberlearning Report, Illustration by Roy Pea and Jillian C. Wallis
Guess
6th Generation
All these generations of mediation are in effect as we collaborate