A CHAIN-REDS Perspective about Data Access and Metadata Management

Rafael Mayo-García, CIEMAT

Tunis / 12-13 Dec 2013

Co-ordination & Harmonisation of Advanced e-Infrastructures for Research and Education Data Sharing

www.chain-project.eu, proj-office@chain-project.eu, Grant Agreement n. 306819

Roberto Barbera (a, b), Carla Carrubba (b), Giuseppina Inserra (b), Christos Kanellopoulos (c), Kostas Koumantaros (c), Rafael Mayo-García (d), Ognjen Prnjat (c), Rita Ricceri (b), Manuel Rodriguez Pascual (d), Antonio Rubio-Montero (d), Federico Ruggieri (e)

(a) University of Catania
(b) INFN-Catania
(c) GRNET
(d) CIEMAT
(e) GARR & INFN-Roma Tre

CHAIN: Coordination & Harmonisation of Advanced e-INfrastructures

CHAIN-REDS: A legacy from CHAIN

CHAIN-REDS is an EC-funded project (Grant Agreement n. 306819), ~2.1 M€, starting 1 December 2012 and lasting 30 months

Structured in
• WP 1 Project Management
• WP 2 Dissemination, Training and Outreach
• WP 3 Interoperation and coordination of e-Infrastructures
• WP 4 Data Infrastructures
• WP 5 Support to small groups and emerging communities

WP4 in CHAIN-REDS

Partners: INFN, CIEMAT, GRNET, CESNET, UBUNTUNET, CLARA, IHEP, ASREN, SIGMA ORIONIS, C-DAC

Regions: Europe, Africa, Latin America, Asia, Middle East

WP4 ‘Data infrastructures’

Public outreach and dissemination is focused on reporting on Trans-continental Data Infrastructures and Data repositories and on several Use Cases

D4.1 Trans-continental Data Infrastructures and Data repositories

D4.2 Analysis of Data Infrastructures and Data repositories (coming soon)

Available at http://www.chain-project.eu/deliverables

WP4 ‘Data infrastructures’

CHAIN-REDS has established official collaborations (MoUs) with other VRC-related communities

agINFRA, DCH-RP, EarthServer, EIFL, ENGAGE

WP4 ‘Data infrastructures’

Conversations are being held with EUDAT, H3Africa, iMENTORS, IVOA, SAEON, SKA Africa, Univ. Cape Town

WP4 ‘Data infrastructures’

Extend the CHAIN-REDS Knowledge Base (KB) with Data capabilities: http://www.chain-project.eu/knowledge-base

Knowledge Base: Infrastructure

RREN(s), NREN, NGI, CA(s), Ident. Fed(s), ROC(s), Grid site(s), Application(s)

An investigation on the available (Open Access) Data and Document Repositories has been performed

Information has been collected in Africa, Asia, Europe, Latin America and the Middle East

New ones have been incorporated into the Knowledge Base

These new repositories range from databases owned by a single group to huge continental collaborations

Knowledge Base: Document & Data repositories

• 3,200 repos
• >33 M docs

For Open Access Data Repositories, the following standards are being promoted:
• OAI-PMH for metadata retrieval
• Dublin Core as metadata schema
• SPARQL for semantic web search
• VOTable (XML) as potential standard for the interchange of data represented as a set of tables
• Persistent Identifiers (PID)
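To illustrate how OAI-PMH and Dublin Core fit together, here is a minimal Python sketch of a harvesting step. It is not the CHAIN-REDS harvester: the function names, the repository address `http://repository.example.org/oai` and the sample response are assumptions made for illustration. Only the `ListRecords` verb, the `oai_dc` metadata prefix, the `resumptionToken` argument and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request for Dublin Core (oai_dc) metadata."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text: str) -> List[Dict[str, List[str]]]:
    """Extract the identifier and the Dublin Core title/creator of each record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": [e.text for e in rec.iterfind("oai:header/oai:identifier", NS)],
            "title": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "creator": [e.text for e in rec.iterfind(".//dc:creator", NS)],
        })
    return records


# Hand-written sample response used for illustration (not real harvested data).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample dataset</dc:title>
          <dc:creator>Doe, J.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai"))
    print(parse_records(SAMPLE))
```

A real harvester would fetch each URL, parse the response, and repeat with the returned resumption token until the repository reports no further pages.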

Standards

The adopted standards have been implemented in the CHAIN-REDS KB

Developments on (Open Access) Document and Data Repositories:
• A semantic web enrichment
• A semantic search engine

OADRs and DRs

[Figure: Semantic search engine architecture. Components shown: OADRs and Data Repositories (accessed via OAI-PMH), Harvester (running on grid/cloud), Semantic-web enrichment, End-points, Linked-data search engine]

The semantic search engine on CHAIN-REDS linked data is available

Allows searching among the semantically-enriched metadata coming from the OADRs and DRs included in the KB
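As an illustration of what a search over semantically enriched metadata can look like, the following Python sketch builds a SPARQL query and the corresponding GET request as defined by the SPARQL 1.1 Protocol. The slides do not state which vocabulary or endpoint address the CHAIN-REDS service uses, so the use of `dc:title`, the endpoint `http://sparql.example.org/sparql` and the function names are assumptions.

```python
from urllib.parse import urlencode


def title_search_query(keyword: str, limit: int = 10) -> str:
    """Build a SPARQL query matching resources whose dc:title contains keyword."""
    # Escape backslashes and double quotes so the keyword is a valid SPARQL literal.
    safe = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "SELECT ?resource ?title WHERE {\n"
        "  ?resource dc:title ?title .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?title)), LCASE("{safe}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def sparql_get_url(endpoint: str, query: str) -> str:
    """Encode the query as a SPARQL 1.1 Protocol GET request (query parameter)."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Assumed endpoint address, for illustration only.
    query = title_search_query("grid", limit=5)
    print(query)
    print(sparql_get_url("http://sparql.example.org/sparql", query))
```

Matching on the title is only one possible design; a query against enriched metadata could equally filter on subjects, creators or links between resources.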

OADRs and DRs

New knowledge discovery!

Single and Parallel semantic search are available:
• Single: the usual semantic search service described before
• Parallel: the new parallel semantic search service that allows users to search in parallel across the millions of resources contained in the CHAIN-REDS Knowledge Base and in the ENGAGE Platform

Parallel semantic search engines have also been made available in other Science Gateways:
• agINFRA (CHAIN-REDS Knowledge Base & OpenAgris repository)
• DCH-RP (CHAIN-REDS Knowledge Base & Europeana, Cultura Italia and Isidore repositories)
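The idea of a parallel search can be sketched as a fan-out of the same keyword to several back ends, with results collected per source. This is a generic Python illustration using threads, not the Science Gateway implementation: `parallel_search`, `search_kb` and `search_partner` are hypothetical names, and the two back ends are stand-ins that return made-up identifiers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A search back end takes a keyword and returns a list of resource identifiers.
SearchFn = Callable[[str], List[str]]


def parallel_search(keyword: str, sources: Dict[str, SearchFn]) -> Dict[str, List[str]]:
    """Send the same keyword to every source concurrently and group the results."""
    if not sources:
        return {}
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # Submit all searches first so they run concurrently.
        futures = {name: pool.submit(fn, keyword) for name, fn in sources.items()}
        # result() blocks until the corresponding search has finished.
        return {name: fut.result() for name, fut in futures.items()}


# Stand-in back ends (hypothetical); real ones would call remote services.
def search_kb(keyword: str) -> List[str]:
    return [f"kb:{keyword}:1", f"kb:{keyword}:2"]


def search_partner(keyword: str) -> List[str]:
    return [f"partner:{keyword}:1"]


if __name__ == "__main__":
    print(parallel_search("grid", {"kb": search_kb, "partner": search_partner}))
```

Threads suit this case because each search mostly waits on the network; the total time is then close to that of the slowest source rather than the sum of all of them.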

Semantic Search Engine

Performs sequential and parallel searches
• ENGAGE
• agINFRA
• DCH-RP

Semantic Search Engine

Programmatic use of the CHAIN-REDS Semantic Search Engine is also possible by means of a RESTful API

http://www.chain-project.eu/semantic-search-api (CHAIN-REDS webpage, Semantic Search Web)

Example: http://www.chain-project.eu/virtuoso/api/resources?keyword=<KEYWORD>&limit=<NUMBER_OF_RESOURCES>
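A call to the API could be scripted along the following lines. This Python sketch relies only on the URL pattern shown in the example (the `keyword` and `limit` parameters); the function names are illustrative, and since the slides do not describe the response format, the body is returned as raw text rather than parsed. Whether the service is still reachable at this address is not known.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address taken from the example URL in the slides.
API_URL = "http://www.chain-project.eu/virtuoso/api/resources"


def build_search_url(keyword: str, limit: int) -> str:
    """Fill the keyword and limit parameters, URL-encoding the keyword."""
    return API_URL + "?" + urlencode({"keyword": keyword, "limit": limit})


def search(keyword: str, limit: int = 10, timeout: float = 10.0) -> str:
    """Call the API and return the raw response body as text.

    The response format is not documented in the slides, so no parsing is done.
    """
    with urlopen(build_search_url(keyword, limit), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only builds and prints the request URL; call search() to contact the service.
    print(build_search_url("climate change", 5))
```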

Semantic Search Engine

Future developments on:
• A tool for extracting the data associated with OADRs
• The execution of distributed jobs in the Science Gateway

Data Accessibility, Reproducibility and Trustworthiness (DART)

Based on the interoperability demo performed by CHAIN-REDS at EGI TF 2013

Aiming at seamlessly performing the cycle:
• Access to a document
• Extraction of associated raw data
• Execution of a code taking those data as input
• Generation of new results
• Upload of the new results and article
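The cycle above can be pictured as a chain of steps in which each one consumes the output of the previous one. The Python sketch below is purely schematic and says nothing about how the project implements it: `CycleSteps`, `run_cycle` and all four step functions are hypothetical placeholders, and the data are invented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CycleSteps:
    """Pluggable steps of the cycle; every field is a hypothetical placeholder."""
    fetch_document: Callable[[str], str]          # document id -> document content
    extract_data: Callable[[str], List[float]]    # document -> associated raw data
    run_code: Callable[[List[float]], float]      # raw data -> new result
    upload: Callable[[str, float], str]           # (document id, result) -> new id


def run_cycle(document_id: str, steps: CycleSteps) -> str:
    """Run access, extraction, execution and upload in order."""
    document = steps.fetch_document(document_id)
    raw_data = steps.extract_data(document)
    result = steps.run_code(raw_data)
    return steps.upload(document_id, result)


if __name__ == "__main__":
    # Toy steps with invented data, for illustration only.
    demo = CycleSteps(
        fetch_document=lambda doc_id: "1.0 2.0 3.0",
        extract_data=lambda doc: [float(x) for x in doc.split()],
        run_code=lambda data: sum(data) / len(data),
        upload=lambda doc_id, result: f"{doc_id}-result-{result}",
    )
    print(run_cycle("doc42", demo))
```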

Coming actions

In a first phase, CHAIN-REDS has identified several fields of interest in the different regions:
• Agriculture
• Cultural Heritage
• e-Government
• Earth Science
• Astronomy and Astrophysics

Potential collaborations with initiatives and projects working in these areas are being explored

Conclusions

Other fields and groups are also of interest

OADRs’ and DRs’ managers/owners are welcome to contact the project to share their data within the CHAIN Knowledge Base (this is already happening in both Africa and Latin America)

CHAIN-REDS is also looking forward to receiving feedback from all interested organizations on the Knowledge Base and the semantic search service

Conclusions

Data developments have been carried out in the Regions of interest to CHAIN-REDS

A special action in the Middle East is now a priority for CHAIN-REDS

Semantic engine and web-enrichment are powerful tools to link data and retrieve information

DART

Conclusions

Thank you!

www.chain-project.eu, proj-office@chain-project.eu, rafael.mayo@ciemat.es
