Semantic Technologies and Application to Climate Data M. Benno Blumenthal IRI/Columbia University CDW 2011-03-30/04-01

Page 1: Semantic Technologies and Application to Climate Data

Semantic Technologies and Application to Climate Data

M. Benno Blumenthal

IRI/Columbia University

CDW 2011-03-30/04-01

Page 2: Semantic Technologies and Application to Climate Data

Triplets of

• Subject

• Property (or Predicate)

• Object

URIs identify things, i.e. most of the above

Namespaces are used as a convenient shorthand for the URIs

Inferred triples (RDFS, OWL, rules)

{WOA} dc:title “World Ocean Atlas”

{WOA} iridl:hasPart {Monthly}

{dc:title} rdfs:isDefinedBy {dc:}

RDF: single framework for writing multiple systems
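The triples and namespace shorthand on this slide can be sketched in a few lines of plain Python. This is only an illustration, not the author's tooling: the `dc:` and `rdfs:` namespace URIs are the published ones, while the `iridl:` and `ex:` URIs, the `expand` and `objects` helpers, and the literal-quoting convention are assumptions made for the example.

```python
# Namespace prefixes are shorthand for full URIs.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "iridl": "http://example.org/iridl#",  # hypothetical placeholder URI
    "ex": "http://example.org/data/",      # hypothetical placeholder URI
}


def expand(name):
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, _, local = name.partition(":")
    return NAMESPACES[prefix] + local


# Each triple is (subject, property, object). Literal objects keep their
# double quotes so they can be told apart from URIs.
TRIPLES = [
    (expand("ex:WOA"), expand("dc:title"), '"World Ocean Atlas"'),
    (expand("ex:WOA"), expand("iridl:hasPart"), expand("ex:WOA/Monthly")),
    (expand("dc:title"), expand("rdfs:isDefinedBy"), expand("dc:")),
]


def objects(subject, prop):
    """Return every object stated for the given subject and property."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


print(objects(expand("ex:WOA"), expand("dc:title")))
```

Because properties such as `dc:title` are themselves URIs, they can appear as subjects of further triples, which is what the third statement does.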

Page 3: Semantic Technologies and Application to Climate Data

blind monks examining an elephant

John Godfrey Saxe (1816-1887)

Multiple partial representations of objects described by data

Page 4: Semantic Technologies and Application to Climate Data

Standard Metadata

Users

Datasets

Tools

Standard Metadata Schema/Data Services

Page 5: Semantic Technologies and Application to Climate Data

Standard metadata schema

[Figure: the labels Tools, Users, Datasets, Standard Metadata Schema, and RDF are repeated five times; a single label reads "RDF Data Model Exchange"]

Page 6: Semantic Technologies and Application to Climate Data

Models, Crosswalks, and Objects

Structure of the RDF information that we are using to represent data objects in multiple frameworks (see full figure)

Page 7: Semantic Technologies and Application to Climate Data

Data Server leads to URI

The IRI Data Library is a pure REST interface, so there is a URL for everything: dataset, variable, series of analysis filters on variables, image, data file.

RDF can thus be used to annotate everything.
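As a rough sketch of what "annotate everything" can look like, the snippet below attaches property/value pairs to URLs and follows `dc:source` links from a derived image back to its dataset. The base URL and paths are invented for illustration and do not follow the IRI Data Library's actual URL grammar; `annotate` and `provenance` are hypothetical helper names. Only the Dublin Core element names (`title`, `source`, `format`) are real.

```python
from collections import defaultdict

DC = "http://purl.org/dc/elements/1.1/"
BASE = "http://example.org/datalib"  # hypothetical server, not the IRI one

# Annotations are stored per resource URL: {url: [(property, value), ...]}.
annotations = defaultdict(list)


def annotate(url, prop, value):
    """Record one (property, value) statement about the resource at url."""
    annotations[url].append((prop, value))


# Every level of the hierarchy has its own URL, so each can be a subject.
dataset = BASE + "/dataset/woa"
variable = dataset + "/variable/temperature"
image = variable + "/figure.png"

annotate(dataset, DC + "title", "World Ocean Atlas")
annotate(variable, DC + "source", dataset)
annotate(image, DC + "format", "image/png")
annotate(image, DC + "source", variable)


def provenance(url):
    """Follow dc:source links from url back to the originating resource."""
    chain = [url]
    while True:
        sources = [v for p, v in annotations.get(chain[-1], [])
                   if p == DC + "source"]
        if not sources:
            return chain
        chain.append(sources[0])


print(provenance(image))
```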

Page 8: Semantic Technologies and Application to Climate Data

IRI Data Library Overview

IRI Data Collection

Generalized Data Tools

Specialized Data Tools

Dataset • Dataset • Dataset

Variable • ivar • ivar

multidimensional

Data Viewer

Data Language

Maproom

URL/URI for data, calculations, figs, etc

Page 9: Semantic Technologies and Application to Climate Data

IRI Data Collection

Dataset • Dataset • Dataset

Variable • ivar • ivar

Calculations, “virtual variables”

images, graphics

descriptive and navigational pages

OpenGIS WMS/WCS

KML

Data Files: netcdf, binary, images

Clients

OpenDAP, THREDDS

Tables

Servers: OpenDAP, THREDDS

GRIB, netCDF, images, binary

Database: Tables, queries

spreadsheets, shapefiles

images w/proj

IRI Data Collection

Page 10: Semantic Technologies and Application to Climate Data

Dataset Objects

Page 11: Semantic Technologies and Application to Climate Data

Crosswalk to Faceted Search

Page 13: Semantic Technologies and Application to Climate Data

Crosswalk to DIF-CD Records

Page 14: Semantic Technologies and Application to Climate Data

Sample DIF-CD

<DIF>

<Entry_ID>IRIDL_ENSO_Climate_Impacts_ENSO_PRCP_Prob_Australia</Entry_ID>

<Entry_Title>ENSO Climate Impacts ENSO PRCP Prob Australia</Entry_Title>

<Data_Set_Citation>

<Dataset_Title>ENSO Climate Impacts ENSO PRCP Prob Australia</Dataset_Title>

<Online_Resource>

http://iridl.ldeo.columbia.edu/maproom/.ENSO/.Climate_Impacts/.ENSO_PRCP_Prob/index.html?map.lon.plotfirst=100&map.lon.plotlast=180&map.lat.plotfirst=-55&map.lat.plotlast=0&map.lat.units=degree_north&map.lon.units=degree_east

</Online_Resource>

</Data_Set_Citation>

<Parameters>

<Category>EARTH SCIENCE</Category>

<Topic>ATMOSPHERE</Topic>

<Term>PRECIPITATION</Term>

<Variable_Level_1>PRECIPITATION RATE</Variable_Level_1>

</Parameters>
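To make the target of the crosswalk concrete, here is a minimal sketch that emits the elements shown in the sample from a small metadata record, using Python's standard `xml.etree.ElementTree`. It is not the author's crosswalk, which goes through RDF and the schema translation; it covers only the elements visible in the sample, not a complete DIF record. The `record` keys and the `build_dif` function are hypothetical names, and the query string of the online resource is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_dif(record):
    """Build a DIF element tree holding the fields shown in the sample."""
    dif = ET.Element("DIF")
    ET.SubElement(dif, "Entry_ID").text = record["entry_id"]
    ET.SubElement(dif, "Entry_Title").text = record["title"]

    citation = ET.SubElement(dif, "Data_Set_Citation")
    ET.SubElement(citation, "Dataset_Title").text = record["title"]
    ET.SubElement(citation, "Online_Resource").text = record["url"]

    # Science keywords go from the broadest level to the most specific.
    parameters = ET.SubElement(dif, "Parameters")
    levels = ("Category", "Topic", "Term", "Variable_Level_1")
    for tag, value in zip(levels, record["keywords"]):
        ET.SubElement(parameters, tag).text = value
    return dif


record = {
    "entry_id": "IRIDL_ENSO_Climate_Impacts_ENSO_PRCP_Prob_Australia",
    "title": "ENSO Climate Impacts ENSO PRCP Prob Australia",
    "url": "http://iridl.ldeo.columbia.edu/maproom/.ENSO/"
           ".Climate_Impacts/.ENSO_PRCP_Prob/index.html",
    "keywords": ["EARTH SCIENCE", "ATMOSPHERE", "PRECIPITATION",
                 "PRECIPITATION RATE"],
}

dif = build_dif(record)
print(ET.tostring(dif, encoding="unicode"))
```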

Page 15: Semantic Technologies and Application to Climate Data

XML Schema to OWL Translation

Based on existing software, but extended

• Bi-directional: enough information is preserved to generate conforming XML documents (a Java class extracts XML elements from a triple store)

• Structure is in the schema information, not the instance

• Fixed XSLT converts instance files to RDF

Page 16: Semantic Technologies and Application to Climate Data

XML Schema to OWL Translation

• XML Schema instance translation is essentially an alternate RDF/XML representation where only the properties are nested

– A standard XML file has all blank-node entities

– An XML schema with rdf:about/rdf:resource can have URI entities

• It makes sense that the instance file does not explicitly type all the elements.
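The idea that a plain XML instance maps to triples whose entities are all blank nodes can be sketched as follows. This is an illustrative reading of the slide, not the author's fixed XSLT: every element with children becomes a fresh blank node, every child element name becomes a property, and leaf text becomes a literal. No types are emitted, since the slide places that information in the schema. The function name and the blank-node labels are assumptions.

```python
import itertools
import xml.etree.ElementTree as ET

SAMPLE = """<DIF>
  <Entry_Title>ENSO Climate Impacts ENSO PRCP Prob Australia</Entry_Title>
  <Parameters>
    <Topic>ATMOSPHERE</Topic>
    <Term>PRECIPITATION</Term>
  </Parameters>
</DIF>"""


def xml_to_triples(xml_text):
    """Turn an XML instance into (subject, property, object) triples."""
    counter = itertools.count(1)
    triples = []

    def new_blank():
        # Blank nodes have no URI, only a document-local label.
        return "_:b%d" % next(counter)

    def visit(element, node):
        for child in element:
            if len(child):
                # Nested element: the object is a new blank-node entity.
                child_node = new_blank()
                triples.append((node, child.tag, child_node))
                visit(child, child_node)
            else:
                # Leaf element: the object is a literal value.
                triples.append((node, child.tag, (child.text or "").strip()))

    root = ET.fromstring(xml_text)
    visit(root, new_blank())
    return triples


for triple in xml_to_triples(SAMPLE):
    print(triple)
```

If the instance carried rdf:about or rdf:resource attributes, the blank-node labels could be replaced by those URIs, which is the second case the slide mentions.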

Page 17: Semantic Technologies and Application to Climate Data

Data Servers

Ontologies

MMI

JPL

Standards Organizations

Start Point

RDF/XML-Schema Crawler

XSLT/GRDDL ingest

XML Schema to OWL translation

OWL Semantics

SWRL Rules

SeRQL CONSTRUCT

Search Queries

Location Canonicalizer

Time Canonicalizer

Sesame

Search Interface

bibliography

IRI RDF Architecture

Page 18: Semantic Technologies and Application to Climate Data

SSWAP

Simple Semantic Web Architecture and Protocol

A way of providing a service that semantically describes its domain and range to advertise it. To invoke it, both domain and range are restricted.

Traditionally we specify a chain of processing steps, and provenance documents that effort. SSWAP specifies an object by constraining it: you could specify its provenance to get it “traditionally”, or some other quality.
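The notion of advertising a service by its domain and range, then selecting it by constraining both, can be illustrated with a toy matcher. This is not the SSWAP protocol or its API, which works with OWL/RDF descriptions; the class names, the service registry, and the `is_a` and `find_services` helpers are all hypothetical.

```python
# Hypothetical class hierarchy: each class maps to its parent class.
SUBCLASS_OF = {
    "MonthlyClimatology": "Dataset",
    "PrecipitationMap": "Image",
}

# Each service advertises what it accepts (domain) and produces (range).
SERVICES = [
    {"name": "plot", "domain": "Dataset", "range": "Image"},
    {"name": "average", "domain": "Dataset", "range": "Dataset"},
]


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def find_services(have, want):
    """Names of services that accept `have` and whose range satisfies `want`."""
    return [
        s["name"]
        for s in SERVICES
        if is_a(have, s["domain"]) and is_a(s["range"], want)
    ]


print(find_services("MonthlyClimatology", "Image"))
```

The caller states only what it has and what it wants; which processing step satisfies the request is left to the matcher, which is the contrast with a hand-specified chain drawn above.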

Page 19: Semantic Technologies and Application to Climate Data

Multiplicity of Data Representations

RDF provides a unifying framework to simultaneously hold and deliver dataset metadata according to multiple standards

Models, Crosswalks, and Objects organizes that framework, clarifying the semantic distance spanned

Bidirectional XML Schema to OWL translation enables delivery of inferred metadata to existing XML-based systems

Persistence with inference/transform is the underlying technology

Semantic Service Framework could extend this framework to semantically-informed workflow generation