semantically-enabled environmental data discovery and integration: demonstration using the iceland...

Post on 12-Sep-2014

DESCRIPTION

This presentation describes a framework for semantically-enabled data discovery and integration across multiple Earth science disciplines. Data harmonization is based on the principles of Linked Data. Previous work defines Data Cube extensions that are relevant to particular Earth science disciplines. To provide a generic, domain-independent solution, we propose an upper-level vocabulary (the ENVRI vocabulary) that expresses domain-specific information at a higher level of abstraction. For human users, we provide an interactive Web-based user interface for data discovery and integration across multiple research infrastructures (http://portal.envri.eu). The system is demonstrated on the use case of the Icelandic volcano eruption on April 10, 2010.

TRANSCRIPT

Semantically-Enabled Environmental Data Discovery and Integration: Demonstration Using the Icelandic Volcano Use Case

Tatiana Tarasova (1), Massimo Argenti (2), Maarten Marx (1)

(1) ISLA, University of Amsterdam

(2) European Space Agency

October 7, 2013. System description papers, KESW 2013

Environmental Research

Example Case Study

Icelandic Volcano Eruption [6]

What was the impact of the eruption of the Icelandic volcano (Eyjafjallajokull) in 2010 on the environment?

What data can contribute to the research?

EMSO, IAGOS-ERI, AURORA BOREALIS, EUFAR-COPAL, EPOS, EISCAT-3D, SIOS, LIFEWATCH, EURO-ARGO, ICOS ATC

Technological and Structural Data Heterogeneity

EURO-ARGO, ICOS ATC

Atmospheric measurements, ocean temperature

FTP catalogues, authorized IP access

CSV, NetCDF

Semantic Data Heterogeneity

EURO-ARGO, ICOS ATC

Atmospheric measurements, ocean temperature

FTP catalogues, authorized IP access

CSV, NetCDF

ppm, flask, hourly measurements level 2, platform, good quality, float trajectories

Outline

1 Motivation

2 Environmental Data Discovery

3 Environmental Data Integration

Linked Data Approach to Environmental Data Integration

Demonstration Using the Icelandic Volcano Case Study

Environmental Data Discovery

Approach

discover data through a single harmonized metadata catalogue

enable semantic data discovery through semantic tagging of datasets

Implementation

ENVRI portal http://portal.envri.eu

OpenSearch [11] based catalogue

1350 data series, 288,971 triples stored in SESAME [12]

geospatial metadata model that extends the INSPIRE guidelines [7] with richer semantics http://portal.genesi-dec.eu/news/?id=117

semantic tagging against a set of Earth science vocabularies (GCMD [8], SBA [9], GEMET [10])
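
As a rough illustration of semantic tagging, a dataset description can be matched against vocabulary labels to attach concept URIs. The sketch below is not the portal's implementation: the mini-vocabulary, the example.org URIs, and the function name tag_dataset are all invented for illustration and are not taken from GCMD, SBA, or GEMET.

```python
# Hypothetical mini-vocabulary: label -> concept URI.
# The example.org URIs are placeholders, not real GCMD/SBA/GEMET identifiers.
VOCABULARY = {
    "carbon dioxide": "http://example.org/vocab/carbon-dioxide",
    "ocean temperature": "http://example.org/vocab/ocean-temperature",
    "volcanic ash": "http://example.org/vocab/volcanic-ash",
}


def tag_dataset(description: str, vocabulary: dict[str, str]) -> list[str]:
    """Return the concept URIs whose label occurs in the description."""
    text = description.lower()
    return sorted(uri for label, uri in vocabulary.items() if label in text)


# Tag one (made-up) dataset description.
tags = tag_dataset("Hourly carbon dioxide measurements at Mace Head", VOCABULARY)
print(tags)
```

Plain substring matching is the simplest possible choice; a real catalogue would more likely rely on curated keyword assignments.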

Linked Environmental Data

Linked Data [13]

→ publish data not documents!

Environmental Data

→ datasets with observations

Linked Environmental Data

→ publish observations not datasets!

→ fine-grain representation of environmental data will bring new opportunities to query and integrate environmental data at the level of single observations

Atmospheric Measurements (ICOS) [14]

Dataset: “CO2 concentration measured by Mace Head”

Observation: “CO2 concentration in the air measured by Mace Head on 2010-01-05 was 392.011”

Dimensions: time, geospatial location, unit of measure, observed phenomenon, …
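
To make "publish observations not datasets" concrete, the Mace Head observation above can be written as a handful of subject/predicate/object triples. This is only a sketch: every URI sits under the placeholder namespace example.org, and the predicate names are assumptions, not the identifiers used in the published data.

```python
# All URIs are placeholders under example.org, not the project's real identifiers.
EX = "http://example.org/"

observation = EX + "obs/mace-head/2010-01-05"

# Each triple is (subject, predicate, object); one triple per dimension or value.
triples = [
    (observation, EX + "dataset", EX + "dataset/co2-mace-head"),
    (observation, EX + "time", "2010-01-05"),
    (observation, EX + "location", "Mace Head"),
    (observation, EX + "observedPhenomenon", "CO2 concentration"),
    (observation, EX + "value", "392.011"),
]


def values_of(subject: str, predicate: str) -> list[str]:
    """Look up the objects stored for one subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(values_of(observation, EX + "value"))
```

Because the observation is itself a resource, it can be queried or linked without downloading the whole dataset file.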

Ocean Monitoring (Euro-Argo) [15]

Dataset Observations Dimensions

Ash Cloud Dynamics (ACTRIS) [16]

Dataset Observations Dimensions

Related Work

RDF Data Cube [17] based approaches

Linked Environmental Data [1]

The ACORN-SAT Linked Climate Dataset [2]

A Linked Data Framework for Publishing the UK Environmental Data [4]

Data Cube: core model

Observation: has dataset → Dataset

Dataset: has structure → Dataset Structure

Dataset Structure: has dimension → Dimension
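
The core model can be mirrored in a few plain Python classes to show how the pieces relate. This is a sketch, not the W3C vocabulary itself; there the corresponding terms are qb:Observation, qb:DataSet, qb:DataStructureDefinition, and qb:DimensionProperty. The class names, the dimension names, and the missing_dimensions helper are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class DatasetStructure:
    """Declares which dimensions every observation of a dataset must carry."""
    dimensions: list[str]


@dataclass
class Dataset:
    name: str
    structure: DatasetStructure  # "has structure"


@dataclass
class Observation:
    dataset: Dataset             # "has dataset"
    values: dict[str, str]       # dimension name -> value

    def missing_dimensions(self) -> list[str]:
        """Dimensions declared by the structure but absent from this observation."""
        return [d for d in self.dataset.structure.dimensions if d not in self.values]


# Illustrative dimension names; the unit is deliberately left out below.
structure = DatasetStructure(dimensions=["time", "location", "unit"])
co2 = Dataset(name="CO2 concentration measured by Mace Head", structure=structure)
obs = Observation(dataset=co2, values={"time": "2010-01-05", "location": "Mace Head"})
print(obs.missing_dimensions())
```

The structure tells a consumer which dimensions to expect, but says nothing about what a dimension means across domains.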

But what about semantic data interoperability?

Question

Can we find generic concepts to capture domain semantics ofenvironmental data?

Data Cube Extension

Observation: has dataset → Dataset

Dataset: has structure → Dataset Structure

Dimensions: Location (has location), Time (has time), FeatureOfInterest (has Feature of Interest), . . .

ENVRI vocabulary [18] (based on OGC O&M [19])

Concepts: Observation, Feature of Interest, Time, Location, Result, Procedure, Observed Property

Relations: has Feature of Interest, has Time, has Location, has Result, has procedure, has Observed Property, has Property
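
One way to picture the role of an upper-level vocabulary is a single generic observation record that domain-specific rows are mapped into. The sketch below makes several assumptions: the field names of the input rows are hypothetical, "flask" is assumed to be the sampling procedure, and the Euro-Argo record is invented sample data rather than a real measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Generic record using the concepts named in the ENVRI vocabulary."""
    feature_of_interest: str
    observed_property: str
    procedure: str
    time: str
    location: str
    result: float


def from_icos(row: dict) -> Observation:
    # Hypothetical column names for an atmospheric CSV row.
    return Observation("air", "CO2 concentration", row["sampling"],
                       row["date"], row["station"], float(row["co2"]))


def from_euro_argo(record: dict) -> Observation:
    # Hypothetical field names for an ocean float record.
    return Observation("ocean", "ocean temperature", record["platform"],
                       record["date"], record["position"], float(record["temp"]))


observations = [
    from_icos({"station": "Mace Head", "date": "2010-01-05",
               "co2": "392.011", "sampling": "flask"}),
    # Invented sample values, for illustration only.
    from_euro_argo({"platform": "float-0001", "date": "2010-04-15",
                    "position": "60.0N 20.0W", "temp": "7.5"}),
]

# One question now spans both domains: which phenomena were observed?
phenomena = sorted({o.observed_property for o in observations})
print(phenomena)
```

Once both sources share the same generic fields, a question such as "what phenomena were measured" no longer needs source-specific logic.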

Demonstration

Data

ICOS: CO2 concentration; Euro-Argo: ocean temperature

35 collections, 10,520 observations, 136,556 RDF triples

Implementation

RDF generation - RDF Data Cube plug-in for Google Refine [5]

storage - Virtuoso RDF store

access - http://data.politicalmashup.nl/sparql/

Queries

1: query for individual observations and subsets of observations

Retrieve all the observations for the days of the volcano eruption (from 20 March to 23 June, 2010).

2: exploit the semantics of the terms of the ENVRI vocabulary

What phenomena were measured in 2010 in the area next to the volcano?

What instruments were used to make measurements in 2010 in the area next to the volcano?
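
The first query type could look roughly like the SPARQL text built below. This is a sketch under stated assumptions: the prefix ex: and the terms ex:Observation, ex:hasTime, and ex:hasResult are placeholders, not the actual ENVRI vocabulary URIs, and times are assumed to be stored as xsd:dateTime. Only the endpoint address comes from the slides, and the code builds the request URL without sending it.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

ENDPOINT = "http://data.politicalmashup.nl/sparql/"  # endpoint named on the slide


def observations_between(start: date, end: date) -> str:
    """Build a SPARQL query for observations from start to end, both inclusive."""
    upper = end + timedelta(days=1)  # exclusive upper bound: midnight after `end`
    return f"""PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex: <http://example.org/envri-sketch#>
SELECT ?obs ?time ?result WHERE {{
  ?obs a ex:Observation ;
       ex:hasTime ?time ;
       ex:hasResult ?result .
  FILTER (?time >= "{start.isoformat()}T00:00:00Z"^^xsd:dateTime &&
          ?time < "{upper.isoformat()}T00:00:00Z"^^xsd:dateTime)
}}"""


# Eruption period as given on the slide: 20 March to 23 June, 2010.
query = observations_between(date(2010, 3, 20), date(2010, 6, 23))

# Standard SPARQL protocol: the query text goes in the `query` parameter of a GET.
request_url = ENDPOINT + "?" + urlencode({"query": query})
print(query)
```

The second query type would follow the same pattern, selecting distinct observed properties or procedures and adding a location constraint.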

Conclusion

→ data discovery through a harmonized metadata catalogue based on the geospatial metadata model

→ fine-grain representation of environmental data enables queries that retrieve and integrate data at the level of a single observation instead of pre-defined collections

→ ENVRI vocabulary enables semantically rich queries

Future Work

→ Alignment between data models for data discovery and data harmonization

→ Systematic study of the proposed modelling solution

Questions?

Thank you!

→ ENVRI portal http://portal.envri.eu

→ more about Linked Environmental Data http://staff.science.uva.nl/~ttaraso1/html/envri.html

References I

[1] Ruther, M., Fock, J., and Hubener, J.: Linked Environmental Data. 24th International Conference on Informatics for Environmental Protection (2010)

[2] Ruther, M., Fock, J., and Hubener, J.: The ACORN-SAT Linked Climate Dataset. Semantic Web Journal (2013) http://www.semantic-web-journal.net/system/files/swj457.pdf

[3] The ENVRI vocabulary http://data.politicalmashup.nl/RDF/vocabularies/envri

[4] Shaon, A., Woolf, A., Boczek, R., Rogers, W., and Jackson, M.: An Open Source Linked Data Framework for Publishing Environmental Data under the UK Location Strategy. Proceedings of the Terra Cognita Workshop on Foundations, Technologies and Applications of the Geospatial Web (2011) http://ceur-ws.org/Vol-798/paper6.pdf

[5] The Data Cube plug-in for Google Refine http://refine.deri.ie/qbExport

[6] 2010 eruptions of Eyjafjallajokull on Wikipedia http://en.wikipedia.org/wiki/2010_eruptions_of_Eyjafjallaj%C3%B6kull

References II

[7] State of progress in the development of guidelines to express elements of the Infrastructure for Spatial Information in the European Community (INSPIRE) metadata implementing rules using ISO 15836 (Dublin Core). European Commission (2008) http://inspire.jrc.ec.europa.eu/reports/ImplementingRules/metadata/MD_IR_and_DC_state%20of%20progress.pdf

[8] The Global Change Master Directory (GCMD) http://gcmd.nasa.gov/

[9] The Societal Benefit Area vocabularies (SBA) http://www.earthobservations.org/

[10] The GEneral Multilingual Environmental Thesaurus (GEMET) http://www.eionet.europa.eu/gemet/

[11] The OpenSearch standard protocol http://www.opensearch.org/

[12] Sesame http://www.openrdf.org/

[13] Berners-Lee, T.: Linked data - design issues, 2006. http://www.w3.org/DesignIssues/LinkedData.html

References III

[14] The Integrated Carbon Observation System (ICOS), Atmospheric Measurements System https://icos-atc-demo.lsce.ipsl.fr/

[15] Euro-Argo http://www.argodatamgt.org/

[16] The Aerosols, Clouds, and Trace Gases Research Infrastructure Network (ACTRIS) www.actris.net

[17] The Data Cube vocabulary http://www.w3.org/TR/2013/WD-vocab-data-cube-20130312/

[18] The ENVRI vocabulary. http://data.politicalmashup.nl/RDF/vocabularies/envri

[19] Geographic Information: Observations and Measurements. OGC Abstract Specification http://www.opengeospatial.org/standards/om
