metadata & brokering - a modern approach for ingv ri


Upload: daniele-bailo

Post on 13-Aug-2015


TRANSCRIPT

Page 1: Metadata & Brokering - a modern approach for INGV RI

METADATA

a modern approach

Daniele Bailo

Page 2: Metadata & Brokering - a modern approach for INGV RI

CHARACTERS

Page 3: Metadata & Brokering - a modern approach for INGV RI

Leading Actor

Digital Data

Sequence of (digital) symbols
- With a meaning
- Can be stored
- Can be transmitted
- Can be computed

Page 4: Metadata & Brokering - a modern approach for INGV RI

Guest Star

Metadata

Data about Data (really?)

Functions
Manage Data (discovery, selection etc.)

Issues (selection of)
- What is metadata to me can be data to others
- Many standards
- Ontologies

Page 5: Metadata & Brokering - a modern approach for INGV RI

Actor

Broker(ing system)

Intermediary software

Functions
- Accesses several systems on your behalf
- Collects data for you (integration)

Issues (selection of)
- Performance
- Works better with metadata

Page 6: Metadata & Brokering - a modern approach for INGV RI

Actor

The Triad

A set of 3 elements to fully manage data

Functions
PID – persistent identifier
Metadata – discovery & selection
DO – data of interest

<PID, metadata, DO>
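As a minimal sketch of how the triad could be represented in software, the Python below models one <PID, metadata, DO> element and selects elements by their metadata. The class name `Triad`, the `select` helper, the `pid:example/...` identifiers and the sample metadata keys are illustrative assumptions, not part of the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triad:
    """One <PID, metadata, DO> element (hypothetical structure)."""
    pid: str                                      # persistent identifier
    metadata: dict = field(default_factory=dict)  # used for discovery & selection
    do: bytes = b""                               # the data of interest


def select(triads, **criteria):
    """Return the triads whose metadata match every key/value in criteria."""
    return [
        t for t in triads
        if all(t.metadata.get(key) == value for key, value in criteria.items())
    ]


# Made-up sample entries, for illustration only.
triads = [
    Triad("pid:example/0001", {"region": "Irpinia", "format": "A"}, b"..."),
    Triad("pid:example/0002", {"region": "Etna", "format": "B"}, b"..."),
]

hits = select(triads, region="Irpinia")
print([t.pid for t in hits])
```

Selection here looks only at the metadata; the data of interest is never opened, which is the division of labour the triad suggests.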

Page 7: Metadata & Brokering - a modern approach for INGV RI

Technical support staff

Data Base

Collection of (organized) Data

Alias
Repository, Data Center etc.

Superpowers
- DBMS (allows definition, creation, querying, update, and administration of databases)

Page 8: Metadata & Brokering - a modern approach for INGV RI

Technical support staff

APIs
Application Programming Interfaces

Standard procedures or instructions to access a service (or function)

Alias
Web service, RESTful service, [thin layer] etc.

Needs
- Standards for requests
- Standards for responses
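To make the two needs concrete, here is a small sketch of a client that follows an agreed request standard and checks an agreed response standard. The endpoint URL, the query parameter names and the required response fields are all assumptions made for this example; no real service is contacted.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; no network call is made in this sketch.
BASE_URL = "https://repository.example.org/api/datasets"

REQUIRED_FIELDS = {"pid", "metadata"}  # assumed response standard


def build_request(region, data_format):
    """Encode a query following the (assumed) request standard."""
    query = urlencode({"region": region, "format": data_format})
    return f"{BASE_URL}?{query}"


def parse_response(body):
    """Decode a JSON response and check it follows the response standard."""
    document = json.loads(body)
    missing = REQUIRED_FIELDS - set(document)
    if missing:
        raise ValueError(f"response lacks required fields: {sorted(missing)}")
    return document


url = build_request("Irpinia", "A")
reply = parse_response('{"pid": "pid:example/0001", "metadata": {"region": "Irpinia"}}')
print(url)
print(reply["pid"])
```

With both sides fixed in advance, a client written once can talk to any repository that honours the same conventions.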

Page 9: Metadata & Brokering - a modern approach for INGV RI

Themes
1. Optimization of resources
2. Single point of access… to several databases and services
3. OPEN ACCESS obligations: Berlin Declaration, DPC…
4. Interoperation for data re-use: new multidisciplinary science
5. Citation and data provenance

Page 10: Metadata & Brokering - a modern approach for INGV RI

Comments?

Questions?

Page 11: Metadata & Brokering - a modern approach for INGV RI

SCENARIOS
1. Friendship based discovery
2. Manual discovery
3. Advanced manual discovery
4. Brokering (canonical form)
5. Metadata driven canonical brokering
6. Metadata driven canonical brokering with contextualization

PREMISE
Structured data (standards)

Page 12: Metadata & Brokering - a modern approach for INGV RI

#0 friendship based discovery

1. Data stored on USB pendrives, CDs etc.

2. Phone calls

3. Emails

Issues

Works well in masonry clubs

Page 13: Metadata & Brokering - a modern approach for INGV RI

#1 Manual discovery

[Diagram: datasets in data formats A, B and C, held in repositories A, B and C; the user is looking for "Data from Irpinia".]

1. User discovers data
2. Repositories do not have web services
3. No metadata (or metadata embedded in the file or directory structure)
4. Manual match & mapping

Issues
Performance, efficiency, error-prone, partial datasets

Page 14: Metadata & Brokering - a modern approach for INGV RI

#2 Advanced manual discovery

[Diagram: datasets in data formats A, B and C, held in repositories A, B and C, each exposing an API; the user is looking for "Data from Irpinia".]

1. User discovers data
2. Repositories have access interfaces (APIs, WS…)
3. Minimal metadata set
4. Manual match & mapping

Issues
- Performance, efficiency, error-prone
- Some standardization in place

Page 15: Metadata & Brokering - a modern approach for INGV RI

#4 Brokering (canonical form)

[Diagram: datasets in data formats A, B and C, held in repositories A, B and C, each exposing an API; a Broker with its own API and a metadata canonical form sits between the repositories and the request "Data from Irpinia".]

1. Broker discovers data
2. Repositories have access interfaces (APIs, WS…)
3. Minimal metadata set
4. Minimal match & mapping
5. Multidisciplinary (ontologies)

Issues
- Single AP
- Development and maintenance
- “Hardcoded” metadata
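The scenario on this slide can be sketched as a broker that calls each repository API and maps every answer into one canonical metadata form. Everything named below (the two repository functions, their field names, the canonical keys and the sample records) is invented for illustration. The per-repository mappings are written by hand on purpose, to show what “hardcoded” metadata means in practice.

```python
# Hypothetical repository APIs, each answering in its own metadata format.
def repo_a_api():
    return [{"Title": "Strong motion 1980", "Area": "Irpinia"}]


def repo_b_api():
    return [
        {"name": "GPS velocities", "region": "Etna"},
        {"name": "Aftershock catalogue", "region": "Irpinia"},
    ]


# One hand-written ("hardcoded") mapping per repository into the canonical form.
ADAPTERS = {
    "A": (repo_a_api, lambda r: {"title": r["Title"], "region": r["Area"]}),
    "B": (repo_b_api, lambda r: {"title": r["name"], "region": r["region"]}),
}


def broker(region):
    """Query every repository and return matching records in canonical form."""
    results = []
    for repository, (api, to_canonical) in ADAPTERS.items():
        for record in api():
            canonical = to_canonical(record)
            if canonical["region"] == region:
                results.append({**canonical, "repository": repository})
    return results


found = broker("Irpinia")
print([r["title"] for r in found])
```

Adding a repository means writing and maintaining one more adapter inside the broker, which is the development and maintenance cost listed under Issues.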

Page 16: Metadata & Brokering - a modern approach for INGV RI

#5 Metadata driven canonical Brokering

[Diagram: datasets in data formats A, B and C, held in repositories A, B and C, each exposing an API; a Broker with its own API and a metadata catalog sits between the repositories and the request "Data from Irpinia".]

1. Broker discovers data
2. Access interfaces
3. Full metadata set
4. Advanced match & mapping
5. Multidisciplinary (ontologies)

Issues
- Single AP
- Stored graph metadata
- Huge metadata superset
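One way to read the metadata catalog in this scenario: the broker answers the discovery question from stored metadata first, and only then decides which repository APIs to call. The sketch below assumes a flat list of catalog entries with made-up PIDs and field names; a real catalog holding graph metadata would be considerably richer.

```python
from collections import defaultdict

# Made-up catalog entries: metadata harvested from the repositories in advance.
CATALOG = [
    {"pid": "pid:example/a-001", "repository": "A", "region": "Irpinia"},
    {"pid": "pid:example/b-001", "repository": "B", "region": "Etna"},
    {"pid": "pid:example/c-001", "repository": "C", "region": "Irpinia"},
    {"pid": "pid:example/c-002", "repository": "C", "region": "Irpinia"},
]


def discover(catalog, **criteria):
    """Match the request against catalog metadata only; no repository is contacted."""
    return [
        entry for entry in catalog
        if all(entry.get(key) == value for key, value in criteria.items())
    ]


def plan_requests(entries):
    """Group the matching PIDs by repository, so each API is called once."""
    plan = defaultdict(list)
    for entry in entries:
        plan[entry["repository"]].append(entry["pid"])
    return dict(plan)


plan = plan_requests(discover(CATALOG, region="Irpinia"))
print(plan)
```

Repository B is never contacted for this request, because the catalog already shows it holds nothing for Irpinia.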

Page 17: Metadata & Brokering - a modern approach for INGV RI

#6 Metadata driven canonical Brokering with contextualization

[Diagram: datasets in data formats A, B and C, held in repositories A, B and C, each exposing an API; a Broker with its own API and a metadata catalog sits between the repositories and the request "Data from Irpinia".]

1. Map & match only contextualization metadata
2. Pointers to detailed metadata

Page 18: Metadata & Brokering - a modern approach for INGV RI

#6 Metadata driven canonical Brokering with contextualization

[Diagram: datasets in data formats A, B and C, held in repositories A, B and C, each exposing an API, described by the 3 layer metadata model.]

1. Map & match only contextualization metadata
2. Pointers to detailed metadata
3. Export metadata in any standard

3 layer metadata model
- Discovery (DC) and (CKAN, eGMS)
- Contextual (CERIF metadata model)
- Detailed (community specific)

[Arrow labels in the diagram: "Generate", "Point to"]
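A rough sketch of the 3 layer idea follows. It assumes that the contextual layer is the one that generates discovery records and points to detailed metadata, which is one reading of the "Generate" and "Point to" labels. The class, its fields and the example URL are hypothetical and far simpler than the CERIF model; only the Dublin Core element names (`identifier`, `title`, `creator`, `coverage`) are taken from the real standard.

```python
from dataclasses import dataclass


@dataclass
class ContextualRecord:
    """Contextual-layer entry (hypothetical, heavily simplified fields)."""
    pid: str
    title: str
    creator: str
    region: str
    detailed_ref: str  # pointer to the community-specific (detailed) metadata

    def to_discovery(self):
        """Generate a flat discovery record using Dublin Core element names."""
        return {
            "dc:identifier": self.pid,
            "dc:title": self.title,
            "dc:creator": self.creator,
            "dc:coverage": self.region,
        }


# Made-up sample record, for illustration only.
record = ContextualRecord(
    pid="pid:example/0001",
    title="Aftershock catalogue",
    creator="Example Observatory",
    region="Irpinia",
    detailed_ref="https://repository.example.org/metadata/0001.xml",
)

discovery = record.to_discovery()
print(discovery["dc:title"], "->", record.detailed_ref)
```

Only the contextual fields are mapped and matched. The detailed metadata stays in its community format and is reached through the pointer, and further export formats would be additional methods alongside `to_discovery`.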

Page 19: Metadata & Brokering - a modern approach for INGV RI

Question

There is a missing actor.

WHO?

Page 20: Metadata & Brokering - a modern approach for INGV RI

[Diagram: datasets behind three APIs, described by the Discovery (DC) and (CKAN, eGMS), Contextual (CERIF metadata model) and Detailed (community specific) metadata layers; a request for "Data from Irpinia" receives a <PID, metadata, DO> response.]

<PID, metadata, DO>
1. PID uniquely identifies a Digital Object
2. Metadata provides a description of the Object
3. DO is the Digital Object… to be defined

Page 21: Metadata & Brokering - a modern approach for INGV RI

Wrapping up

We need
1. Metadata describing data
2. APIs & web services
3. Defined WS output format
4. PID system
5. Brokering system
6. Metadata catalogue supporting
   1. Ontologies
   2. Contextualization

Page 22: Metadata & Brokering - a modern approach for INGV RI

Q&A

Page 23: Metadata & Brokering - a modern approach for INGV RI

#3 Metadata driven canonical brokering

[Diagram: datasets in data formats A, B and C, held in repositories A, B and C, each exposing an API; a Broker with its own API and a metadata catalog sits between the repositories and the request "Data from Irpinia".]

1. Broker discovers data
2. Repositories have access interfaces (APIs, WS…)
3. Significant metadata set
4. Good match & mapping

Issues
- Development and maintenance
- Single AP
- “Hardcoded” metadata

Page 24: Metadata & Brokering - a modern approach for INGV RI

#4 Metadata driven canonical brokering

[Diagram: a Broker with a catalog sits between the request "Data from Irpinia" and datasets in any data format, described in metadata formats A and B.]

Issues
1. Predefined tools for matching and mapping
2. Writing software: n conversion algorithms to canonical form
3. Ontologies
4. Multidisciplinary but many formats
5. Good data discovery
6. Not all metadata used

Page 25: Metadata & Brokering - a modern approach for INGV RI

#1 Conventional Brokering

[Diagram: a Broker sits between the request "Data from Irpinia" and datasets in data formats A, B and C.]

Issues
1. Writing software: n*(n-1) conversion algorithms
2. Does not scale in costs of development and maintenance
3. Matching and mapping
4. Works within a restricted research domain
5. “Complex” data discovery

Page 26: Metadata & Brokering - a modern approach for INGV RI

#2 Brokering with canonical form

[Diagram: a Broker sits between the request "Data from Irpinia" and datasets in data formats A, B and C; legend entry for the broker's output: canonical Format A.]

Issues
1. Writing software: n conversion algorithms to canonical form
2. Works within a restricted research domain
3. Matching and mapping
4. “Complex” data discovery
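The difference between the n*(n-1) converters of conventional brokering and the n converters of the canonical form is easy to check with a few numbers. The function names below are only labels for the two formulas quoted on the slides.

```python
def pairwise_converters(n):
    """Direct conversion between every ordered pair of n formats."""
    return n * (n - 1)


def canonical_converters(n):
    """One conversion per format into the canonical form."""
    return n


for n in (3, 5, 10):
    print(n, pairwise_converters(n), canonical_converters(n))
# prints:
# 3 6 3
# 5 20 5
# 10 90 10
```

With the three formats A, B and C the saving is modest (6 against 3), but at ten formats it is 90 against 10. This is why the conventional approach does not scale in development and maintenance costs.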

Page 27: Metadata & Brokering - a modern approach for INGV RI

#3 Metadata driven simple brokering

[Diagram: a Broker driven by METADATA sits between the request "Data from Irpinia" and datasets in any data format, described in metadata formats A and B.]

Issues
1. Good data discovery
2. Predefined tools for matching and mapping
3. Multidisciplinary but many formats
4. Writing software: n*(n-1) conversion algorithms
5. Ontologies

Page 28: Metadata & Brokering - a modern approach for INGV RI

#2 Metadata driven canonical brokering

[Diagram: a Broker with a METADATA catalog sits between the request "Data from Irpinia" and datasets in any data format, described in metadata formats A and B.]

Issues
1. Predefined tools for matching and mapping
2. Writing software: n conversion algorithms to canonical form
3. Ontologies
4. Multidisciplinary but many formats
5. Good data discovery