Transcript
Page 1: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

Publishing research information as Linked Data: Proposal of Recommendations

EuroCRIS meeting. February 2012

Miguel-Ángel Sicilia


ROADMAP

Introduction & Motivation
Stakeholders
Example Architecture
Basic Principles of the LD Exposure
CERIF Ontology
Recipes for the CERIF LD Exposure
CERIF Model Extension
Key Use Case
Demo
Bootstrapping
Issues and challenges
Conclusions


INTRODUCTION & MOTIVATION


A POINT OF DEPARTURE?

“CERIF and Linked Data are similar, complementary approaches. However, there are significant differences in the way they encode relationships. EXRI-UK reviewed these approaches against higher education needs and recommended that CERIF should be the basis for the exchange of research information in the UK. CERIF is currently better able to encode the rich information required to communicate research information, and has the organisational backing of EuroCRIS, ensuring it is well-managed and sustainable.”

EXRI-UK final report, http://www.jiscinfonet.ac.uk/infokits/research/exri-uk


AN EXAMPLE OF USING LINKED DATA IN RIS


XML DATA INTERCHANGE

[Diagram: one RIS Database (CERIF) generates CERIF XML and sends it; a second RIS Database (CERIF) receives and parses it.]


LIMITATIONS (FROM A LINKED DATA PERSPECTIVE)

[Diagram: an aggregator (harvester or query client) connected to four data sources, A to D, each behind its own Web API.]

Shortcomings

1. APIs provide proprietary interfaces (even though CERIF XML standardizes the interchange format).

2. Aggregators are based on a fixed set of data sources (not necessarily, but they do require some registry of providers).

3. You cannot set hyperlinks between the descriptions of RIS entities (projects, people, organizations, publications), nor from them to other data or terminologies.

Adapted from: Christian Bizer: The Web of Linked Data (26/07/2009)


THE LINKED DATA APPROACH

[Diagram: datasets A, B, C and D, DBpedia and a terminology server each publish RDF and are connected by RDF links labelled rel and cls.]

Use RDF to provide CERIF metadata based on the XML mapping.

Add links using different kinds of relations (rel). A mapping of CERIF link entities?

Connect to terminologies using some classification (cls). An extension of keywords in CERIF?

Link to other LOD datasets instead of repeating information.

Adapted from: Christian Bizer: The Web of Linked Data (26/07/2009)


BROWSING & INTEGRATING

[Diagram: datasets A to E hold research information (RI) and terms (Term), connected by typed links. Three kinds of client consume them:]

Browser

Data integrator (combines information for a given cfPers, cfProj or cfOrgUnit)

Data integrator (combines information from several cfPers, cfProj or cfOrgUnit, e.g. for analyzing country or call outcomes)

Adapted from: Christian Bizer: The Web of Linked Data (26/07/2009)


RELEVANT RECOMMENDATIONS


CERIF COMPONENTS

CERIF

SQL

XML

SEMANTICS

LINKED DATA


STAKEHOLDERS


A PROPOSAL

Higher Education Institutions (HEI) or R&D institutions
Funding bodies (FB)
Research Authorities (RA)
Researchers
Research Information Enterprises (RIE)
General public
Enterprises

…what are their critical use cases and their “killer apps”?


EXAMPLE ARCHITECTURE


STRATEGIES FOR PUBLISHING LINKED DATA

ALTERNATIVES FOR THE EXPOSURE OF LINKED DATA
Providing an endpoint for queries
Serving static RDF files
Serving RDF embedded in HTML files
Serving LD from RDF triple stores
Serving LD by wrapping Web APIs
Serving LD from relational databases

FACTORS AFFECTING THE DECISION
How much data do you want to serve?
How is your data currently stored?
How often does your data change?


RIS ARCHITECTURE

[Diagram: an Internet browser at http://cris.myorganization.org shows a PAPERS page (listing, for example, "Linked Data - The Story So Far", International Journal on Semantic Web and …, 2009), served by the RIS Application Server from the RIS Database (CERIF).]


RIS-LD ARCHITECTURE

[Diagram: the Internet browser (showing the PAPERS page) is still served by the RIS Application Server; a D2R Server on top of the same RIS Database (CERIF) publishes RDF descriptions such as the following.]

<http://cris.myOrganization.org:2020/resource/projects/Organic.Edunet>
    a cerif:Project ;
    rdfs:label "Multilingual Federation of Learning Repositories"@en-uk ;
    cerif:acronym "Organic.Edunet" ;
    cerif:endDate "2010-09-30"^^xsd:date ;
    cerif:internalIdentifier "ff808181300cf99e01300d1a355f0003" ;
    cerif:isLinkedByOrganisationUnit ...


URI SCHEME PUBLISHED BY D2R

http://cris.myorg.org/resource/RESOURCE_ID    LD identifier for a given resource

http://cris.myorg.org/data/RESOURCE_ID        Description of the resource in RDF (N3)

http://cris.myorg.org/page/RESOURCE_ID        Description of the resource in HTML
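
The three URL patterns differ only in the first path segment. The following is a minimal Python sketch of that relationship, assuming the resource/data/page layout shown on this slide; the helper name `d2r_views` is hypothetical and is not part of D2R Server.

```python
from urllib.parse import urlsplit, urlunsplit


def d2r_views(resource_uri: str) -> dict:
    """Derive the RDF and HTML document URLs from a /resource/ identifier.

    Hypothetical helper: it only rewrites the first path segment,
    following the resource/data/page layout described on the slide.
    """
    parts = urlsplit(resource_uri)
    prefix = "/resource/"
    if not parts.path.startswith(prefix):
        raise ValueError("not a /resource/ URI: " + resource_uri)
    local_id = parts.path[len(prefix):]
    return {
        kind: urlunsplit(
            (parts.scheme, parts.netloc, "/" + kind + "/" + local_id, "", "")
        )
        for kind in ("resource", "data", "page")
    }


views = d2r_views("http://cris.myorg.org/resource/projects/VOA3R")
print(views["data"])  # URL of the RDF description
print(views["page"])  # URL of the HTML description
```

A client would dereference the `resource` URI and be redirected to one of the two documents; the sketch only shows how the three addresses relate to each other.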


OPENING OUR CERIF DATASETS

[Diagram: a mashup application at http://mashup.org, viewed in an Internet browser, consumes the RDF description of the Organic.Edunet project (a cerif:Project) published by the RIS-LD server.]


BENEFITS OF OUR ARCHITECTURE

Exposure of Linked Data without altering the current research information system (non-intrusive)

Linked Data interface: RDF descriptions of individual resources stored in the DB, served over HTTP

SPARQL endpoint (the SQL of Linked Data)

Traditional HTML interface: web pages describing resources

Simple way of interchanging data on the Web

Create new third-party applications using open linked data from RIS systems


BASIC PRINCIPLES OF THE LD EXPOSURE


GENERAL PRINCIPLES OF THE LOD APPROACH

1. Use URIs as names for things.

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).

4. Include links to other URIs, so that they can discover more things.


RE-USING OF WELL-KNOWN TERMS

We need an ontology for the CERIF model elements

"Do not reinvent the wheel"

Data can be consumed by applications that may be tuned to well-known vocabularies

Foster interoperability between different datasets


SELF-DESCRIBED AND CONSISTENT TERMS

Logical entities are translated into RDF classes and their attributes into RDF properties

CF prefixes are not necessary for ontology terms; URI namespaces are used instead

Properties and classes are self-described:
rdfs:label (title-case version of the property/class name)
rdfs:comment (a plain-text description of the property/class)


URI DESIGN

Essential to enable interoperability and understanding

Create human-readable and memorable URIs
Avoid using artificial primary keys
Discover URIs using similarity heuristics
Follow a similar schema/pattern for URIs

http://cris.myorg.org/resource/ENTITYNAME/ENTITYID

Example of an identifier for the EU project “Virtual Open Access ...” hosted at University of Athens: http://cris.aua.gr/resource/projects/VOA3R
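
The ENTITYNAME/ENTITYID pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mint_uri` is hypothetical, the default base is the placeholder host from the slide, and percent-encoding is one possible way to keep a readable key safe inside a URI.

```python
from urllib.parse import quote

# Placeholder host taken from the slide; each institution would use its own.
BASE = "http://cris.myorg.org/resource/"


def mint_uri(entity_name: str, entity_id: str, base: str = BASE) -> str:
    """Build base/ENTITYNAME/ENTITYID, percent-encoding unsafe characters.

    `entity_id` is expected to be a readable key such as a project
    acronym, not an artificial primary key.
    """
    return base + quote(entity_name, safe="") + "/" + quote(entity_id, safe="")


print(mint_uri("projects", "VOA3R"))
print(mint_uri("projects", "VOA3R", base="http://cris.aua.gr/resource/"))
```

Because every dataset follows the same pattern, a consumer that knows a project acronym can guess its URI at another institution simply by swapping the base.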


WHERE DO RI DATASETS LIVE?

Higher ed or R&D institutions maintain repositories centred on cfPers and cfOrgUnit (internal), sometimes cfProj, with an emphasis on cfResPubl and cfResPat.

Funding bodies are centred around cfProj, cfOrgUnit (mostly legal bodies, not internal) and cfFundProg and related.

Bibliographic and citation databases focus on cfResPubl, cfResPat and in general provide poor support for cfPers and cfOrgUnit.


DISTRIBUTED DATASETS

Research information is distributed

Frequently, there is duplicated information in different RIS systems.

ID for the VOA3R project in the University of Athens dataset*: http://cris.aua.gr/resource/projects/VOA3R

ID for the VOA3R project in the University of Alcalá dataset*: http://cris.uah.es/resource/projects/VOA3R

No problem: in Linked Data the same concept can be identified by different URIs, which are connected using the owl:sameAs predicate

* Assuming that there is a corporate RIS available at http://cris.....
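
As an illustration of the owl:sameAs idea, the Python sketch below writes the link between the two VOA3R identifiers as an N-Triples line. The function name `same_as_triples` is hypothetical; only the two URIs and the owl:sameAs predicate come from the slides.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triples(uris):
    """Return N-Triples lines linking the first URI to every other one."""
    first, *rest = uris
    return ["<%s> <%s> <%s> ." % (first, OWL_SAME_AS, other) for other in rest]


lines = same_as_triples([
    "http://cris.aua.gr/resource/projects/VOA3R",
    "http://cris.uah.es/resource/projects/VOA3R",
])
print("\n".join(lines))
```

Either institution could publish this statement; a data integrator that finds it can merge the two descriptions of the project.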


CERIF ONTOLOGY


CERIF ONTOLOGY

[Diagram: the CERIF Ontology (http://eurocris.org/cerif), the CERIF Semantic Vocabulary (http://eurocris.org/semcerif) and other vocabularies.]


EUROCRIS WEBSITE FOR PUBLISHING ONTOLOGIES

Current version at http://spi-fm.uca.es/neologism/


THE CERIF ONTOLOGY ON THE WEB


VISUAL REPRESENTATION OF THE ONTOLOGIES

Current version at http://spi-fm.uca.es/neologism/cerif


ONTOLOGY TERMS


RECIPES FOR THE CERIF LD EXPOSURE


RECIPES FOR THE CERIF LD EXPOSURE: MULTIPLE LANGUAGE FEATURES


CERIF MULTIPLE LANGUAGE FEATURES

Predicate                               Object
rdfs:label, foaf:name, cerif:name       *.cfName
rdfs:label, dc:title, cerif:title       *.cfTitle
dc:description                          *.cfDescr
cerif:keyword, dc:subject               *.cfKeyw
dcterms:abstract, cerif:abstract        *.cfAbstract
cerif:researchActivities                cfOrgUnit.cfResAct
cerif:researchInterests                 cfPers.cfResInt
dcterms:alternative                     cfResPublSubtitle.cfSubtitle
foaf:name                               cfResPublNameAbbrev.cfNameAbbrev
bibo:annotates                          cfResPublBiblNote.cfBiblNote
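
The mapping means that one multilingual CERIF attribute becomes several predicates, each carrying one language-tagged literal per translation. A rough Python sketch of that expansion follows; the function name `language_literals` and the sample strings are illustrative assumptions, not part of the recommendation.

```python
def language_literals(predicates, translations):
    """Return Turtle predicate-object lines, one per predicate.

    `translations` maps a language tag to the text in that language;
    every predicate receives the full list of tagged literals.
    """
    def escape(text):
        # Escape backslashes and double quotes for a Turtle string literal.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    objects = " , ".join(
        '"%s"@%s' % (escape(text), tag) for tag, text in translations.items()
    )
    return ["%s %s ;" % (predicate, objects) for predicate in predicates]


# Illustrative values for a *.cfTitle attribute stored in two languages.
lines = language_literals(
    ["rdfs:label", "dc:title", "cerif:title"],
    {"es-es": "Repositorio virtual", "en-uk": "Virtual repository"},
)
print("\n".join(lines))
```

Repeating the value under rdfs:label and dc:title lets generic Linked Data tools display it even if they know nothing about the cerif: namespace.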


RECIPES FOR THE CERIF LD EXPOSURE: SEMANTIC FEATURES


CERIF SEMANTICS DOCUMENT

From a PDF document with the CERIF semantics…


CERIF SEMANTIC VOCABULARY

Current version at http://spi-fm.uca.es/neologism/semcerif

…to an RDF vocabulary with the roles and classification terms


CERIF USING EXTERNAL VOCABULARIES.

The predicates cerif:classification and cerif:role make it possible to use external vocabularies to enrich our data.

[Diagram: the CERIF Ontology refers, through cerif:classification and cerif:role, to terms of the CERIF Semantic Vocabulary and of other vocabularies 1...N.]


RECIPES FOR THE CERIF LD EXPOSURE: ADDITIONAL FEATURES


CERIF ADDITIONAL FEATURES

The current CERIF model contains Dublin Core and Formalised Dublin Core entities and attributes.

We will use external vocabularies through the cerif:role and cerif:classification properties, which avoids the need to store and publish entities related to any terminology.


RECIPES FOR THE CERIF LD EXPOSURE: BASE ENTITIES


CERIF BASE ENTITY PROJECT

The project acronym (cfProj.cfAcro) will be part of the resource identifier (ID): http://cris.myorganization.org/resource/projects/ID

Predicate                     Object
rdf:type                      “cerif:Project”
cerif:internalIdentifier      cfProj.cfProjId
cerif:startDate               cfProj.cfStartDate
cerif:endDate                 cfProj.cfEndDate
cerif:acronym                 cfProj.cfAcro
cerif:uri, foaf:homepage      cfProj.cfURI


PREFIXES USED IN EXAMPLES

# Built-in prefixes

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

@prefix owl: <http://www.w3.org/2002/07/owl#> .

# External vocabularies

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

@prefix dc: <http://purl.org/dc/elements/1.1/> .

@prefix dcterms: <http://purl.org/dc/terms/> .

@prefix bibo: <http://purl.org/ontology/bibo/> .

# CERIF

@prefix cerif: <http://eurocris.org/cerif#> .

@prefix semcerif: <http://eurocris.org/semcerif#> .


DESCRIPTION OF A CERIF PROJECT (I)

<http://cris.myOrganization.org/resource/projects/VOA3R>

a cerif:Project ;

rdfs:label "Repositorio de Agricultura y Acuicultura de acceso abierto virtual"@es-es , "Virtual Open Access Agriculture & Aquaculture Repository"@en-uk ;

dc:title "Repositorio de Agricultura y Acuicultura de acceso abierto virtual"@es-es , "Virtual Open Access Agriculture & Aquaculture Repository"@en-uk ;

cerif:title "Repositorio de Agricultura y Acuicultura de acceso abierto virtual"@es-es , "Virtual Open Access Agriculture & Aquaculture Repository"@en-uk ;

cerif:internalIdentifier "ff8080812ddb916a012ddb9170b60001" ;

cerif:acronym "VOA3R" ;


DESCRIPTION OF A CERIF PROJECT (II)

dcterms:abstract "The general objective of the VOA3R project is to improve the spread of European agriculture and aquaculture research results by using an innovative approach to sharing open access research products. "@en-uk ;

cerif:abstract "The general objective of the VOA3R project is to improve the spread of European agriculture and aquaculture research results by using an innovative approach to sharing open access research products. "@en-uk ;

 

foaf:homepage <http://voa3r.eu/> ; 

cerif:uri <http://voa3r.eu/> ; 

 

cerif:startDate "2010-06-01"^^xsd:date ;

 

cerif:endDate "2013-05-31"^^xsd:date ;

 


CERIF BASE ENTITY ORGANISATION UNIT

The organisation acronym (cfOrgUnit.cfAcro) will be part of the resource identifier (ID): http://cris.myorganization.org/resource/organisationUnits/ID

Predicate                     Object
rdf:type                      “cerif:OrganisationUnit”
cerif:internalIdentifier      cfOrgUnit.cfOrgUnitId
cerif:headcount               cfOrgUnit.cfHeadcount
cerif:turnover                cfOrgUnit.cfTurn
cerif:turnoverCurrencyCode    cfOrgUnit.cfCurrCode
cerif:acronym                 cfOrgUnit.cfAcro
foaf:homepage                 cfOrgUnit.cfURI


DESCRIPTION OF A CERIF ORGANISATION UNIT (I)

<http://cris.myOrganization.org/resource/organisationUnits/UAH>

 

a cerif:OrganisationUnit ;

rdfs:label "Universidad de Alcala"@es-es , "University of Alcala"@en-uk ;

foaf:name "Universidad de Alcala"@es-es , "University of Alcala"@en-uk ;

cerif:name "Universidad de Alcala"@es-es , "University of Alcala"@en-uk ;

 

cerif:internalIdentifier "ff8081812f0d51ed012f2faa5bfe0003" ;

cerif:acronym "UAH" ;


DESCRIPTION OF A CERIF ORGANISATION UNIT (II)

 

foaf:homepage <http://www.uah.es> ;

  cerif:uri <http://www.uah.es> ;

 

cerif:headcount "990" ;

 

cerif:turnover "200800" ;


CERIF BASE ENTITY PERSON

The person’s full name [1] (encoded according to URL rules) will be part of the resource identifier (ID): http://cris.myorganization.org/resource/persons/ID

Predicate                     Object
rdf:type                      “cerif:Person”
cerif:internalIdentifier      cfPers.cfPersId
cerif:birthdate [2]           cfPers.cfBirthdate
foaf:gender                   cfPers.cfGender
foaf:homepage                 cfPers.cfURI
foaf:firstName                cfPersName.cfFirstNames
foaf:familyName               cfPersName.cfFamilyNames
foaf:name                     $FULLNAME$
rdfs:label                    $FULLNAME$

[1] The full name is built by concatenating the attributes cfFirstNames, cfOtherNames and cfFamilyNames of the cfPersName entity.

[2] We have chosen to create a new property cerif:birthdate, since the property foaf:birthday does not support the birth year.
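
The construction of a person's ID from the cfPersName attributes could look roughly like the Python sketch below. It assumes that spaces are replaced with underscores, as in the example URI .../persons/Miguel-Angel_Sicilia, and that the remaining characters are percent-encoded; the function name `person_local_id` is hypothetical.

```python
from urllib.parse import quote


def person_local_id(first_names: str, other_names: str, family_names: str) -> str:
    """Concatenate the cfPersName parts and encode them for use in a URI.

    Empty parts (for example a missing cfOtherNames) are skipped.
    Replacing spaces with underscores is an assumption based on the
    example identifier used in these slides.
    """
    parts = [p.strip() for p in (first_names, other_names, family_names)
             if p and p.strip()]
    full_name = " ".join(parts)
    return quote(full_name.replace(" ", "_"), safe="")


local_id = person_local_id("Miguel-Angel", "", "Sicilia")
print("http://cris.myorganization.org/resource/persons/" + local_id)
```

Two researchers with the same full name would collide under this scheme, so a real deployment would need a tie-breaking rule; the slides do not specify one.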


DESCRIPTION OF A CERIF PERSON (I)

<http://cris.myOrganization.org/resource/persons/Miguel-Angel_Sicilia>

a cerif:Person ;

 

rdfs:label "Miguel-Angel Sicilia" ;

foaf:name "Miguel-Angel Sicilia" ;

cerif:name "Miguel-Angel Sicilia" ;

foaf:familyName "Sicilia" ;

 

foaf:firstName "Miguel-Angel" ;

 

foaf:givenName "Miguel-Angel" ;

 


DESCRIPTION OF A CERIF PERSON (II)

foaf:gender "m" ; 

cerif:gender "m" ; 

foaf:homepage <http://www.cc.uah.es/msicilia/> ;

cerif:uri <http://www.cc.uah.es/msicilia/> ;

cerif:researchInterests "Ontologies, learning technology" ;

 

cerif:internalIdentifier "ff8081812ec8ae9e012f0b02353d0008" ;

 

cerif:birthdate "1973-10-10"^^xsd:date ;


RECIPES FOR THE CERIF LD EXPOSURE: RESULT ENTITIES


CERIF ENTITY RESULT PUBLICATION

The original title of the publication (cfResPublTitle.cfTitle), encoded according to URL rules, will be part of the resource identifier (ID): http://cris.myorganization.org/resource/publications/ID

Later, we will link our bibliographic resources to bibliographic records in other LD systems, using the predicate owl:sameAs.

Predicate                     Object
rdf:type                      “cerif:Publication”
cerif:internalIdentifier      cfResPubl.cfResPublId
dc:date                       cfResPubl.cfResPublDate
foaf:homepage                 cfResPubl.cfURI


CERIF ENTITY RESULT PUBLICATION (II)

Predicate                     Object
bibo:volume                   cfResPubl.cfVol
bibo:edition                  cfResPubl.cfEdition
bibo:issue                    cfResPubl.cfIssue
bibo:pageStart                cfResPubl.cfStartPage
bibo:pageEnd                  cfResPubl.cfEndPage
bibo:isbn                     cfResPubl.cfISBN
bibo:issn                     cfResPubl.cfISSN
bibo:number                   cfResPubl.cfNum
bibo:numPages                 cfResPubl.cfTotalPages
dcterms:isPartOf [1]          “myorg:series/” + cfResPubl.cfSeries [2]

[1] Publications are linked to dynamically generated instances of type bibo:Series.

[2] The prefix myorg refers to the http://cris.myorganization.org/resource/ namespace.

Bibliographic metadata in RDF


CERIF ENTITY RESULT PATENT

The register number of the patent (cfResPat.cfPatentNum) will be part of the resource identifier (ID): http://cris.myorganization.org/resource/patents/ID

Predicate                     Object
rdf:type                      “cerif:Patent”
cerif:internalIdentifier      cfResPat.cfResPatId
cerif:approvalDate            cfResPat.cfApprovDate
cerif:registrationDate        cfResPat.cfRegistrDate
cerif:patentNumber            cfResPat.cfPatentNum
cerif:countryCode             cfResPat.cfCountryCode
foaf:homepage                 cfResPat.cfURI


CERIF ENTITY RESULT PRODUCT

The internal identifier of the product (cfResProd.cfResProdInternId) will be part of the resource identifier (ID): http://cris.myorganization.org/resource/products/ID

Predicate                     Object
rdf:type                      “cerif:Product”
cerif:internalIdentifier      cfResProd.cfResProdId
cerif:productNumber           cfResProd.cfResProdInternId
cerif:registrationDate        cfResProd.cfRegistrDate
foaf:homepage                 cfResProd.cfURI


RECIPES FOR THE CERIF LD EXPOSURE: LINK ENTITIES


CERIF ENTITY LINK RELATIONSHIP (I)

[Diagram: Entity 1 and Entity 2 (CERIF Ontology classes) are connected through a Relationship (CERIF Ontology class); each entity points to the Relationship with cerif:linksToEntity or cerif:isLinkedByEntity. The Relationship carries cerif:startDate and cerif:endDate (xsd:dateTime), cerif:fraction (xsd:float) and cerif:role, which points to an rdf:Resource in an external vocabulary.]


CERIF ENTITY LINK RELATIONSHIP (II)

The Entity Link identifier (ID) must be retrieved/generated from the database: http://cris.myorganization.org/resource/ENTITYLINK/ID

Predicate                     Object
rdf:type                      “cerif:Relationship”
rdfs:label                    “Association ” + ID
cerif:startDate               ENTITYLINK.cfStartDate
cerif:endDate                 ENTITYLINK.cfEndDate
cerif:fraction                ENTITYLINK.cfFraction
cerif:role                    VOCAB-TERM-URI

ENTITYLINK refers to the name of the link entity (e.g. cfOrgUnit_ResPubl). VOCAB-TERM-URI is a URI pointing to a given term of an external vocabulary.


RELATING A PROJECT TO AN ORGANISATION UNIT

<http://cris.myOrganization.org/resource/projects/VOA3R>

...

cerif:isLinkedByOrganisationUnit

<http://cris.myOrganization.org/resource/proj_orgunit/VOA3R-UAH-uuid> ,

….

<http://cris.myOrganization.org/resource/proj_orgunit/VOA3R-GRNET-uuid> ; 

<http://cris.myOrganization.org/resource/organisationUnits/UAH>

...

cerif:linksToProject <http://cris.myOrganization.org/resource/proj_orgunit/VOA3R-UAH-uuid> ,

….

<http://cris.myOrganization.org/resource/proj_orgunit/Organic.Edunet-UAH-uuid> ; 


DESCRIPTION OF AN ENTITY LINK RELATIONSHIP

<http://cris.myOrganization.org/resource/proj_orgunit/VOA3R-UAH-uuid>

a cerif:Relationship;

rdfs:label "Association between VOA3R (Project) and UAH (Organisation Unit)" ;

 

cerif:role <http://eurocris.org/semcerif#participant> ;

 

cerif:startDate "1901-01-01 00:00:00.0" ;

 

cerif:endDate "2099-12-31 23:59:59.0" ;

 

cerif:fraction "0.75" .
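
For illustration, a link-entity row could be rendered into a description like this one with a small Python function. This is a sketch only: the function name `relationship_turtle` and its parameters are hypothetical, the cerif: and rdfs: prefixes are assumed to be declared elsewhere, and the literals are left untyped exactly as in the slide.

```python
def relationship_turtle(link_uri, label, role_uri, start, end, fraction):
    """Render one cerif:Relationship resource as Turtle text.

    Prefix declarations (cerif:, rdfs:) are assumed to be emitted
    separately; values are written as plain literals, as on the slide.
    """
    safe_label = label.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "<%s>" % link_uri,
        "    a cerif:Relationship ;",
        '    rdfs:label "%s" ;' % safe_label,
        "    cerif:role <%s> ;" % role_uri,
        '    cerif:startDate "%s" ;' % start,
        '    cerif:endDate "%s" ;' % end,
        '    cerif:fraction "%s" .' % fraction,
    ]
    return "\n".join(lines)


turtle = relationship_turtle(
    "http://cris.myOrganization.org/resource/proj_orgunit/VOA3R-UAH-uuid",
    "Association between VOA3R (Project) and UAH (Organisation Unit)",
    "http://eurocris.org/semcerif#participant",
    "1901-01-01 00:00:00.0",
    "2099-12-31 23:59:59.0",
    "0.75",
)
print(turtle)
```

With D2R Server the same result would normally come from a mapping file rather than from hand-written code; the function only makes the shape of the output explicit.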


CERIF ENTITY LINK CLASSIFICATION (I)

[Diagram: an Entity (CERIF Ontology class) points through cerif:isClassifiedBy to a Classification (CERIF Ontology class). The Classification carries cerif:startDate and cerif:endDate (xsd:dateTime), cerif:fraction (xsd:float) and cerif:classification, which points to an rdf:Resource in an external vocabulary.]


CERIF ENTITY LINK CLASSIFICATION (II)

The Entity Link identifier (ID) must be retrieved/generated from the database: ENTITY-URI/class/ID

Predicate                     Object
rdf:type                      “cerif:Classification”
rdfs:label                    “Classification ” + ID
cerif:startDate               ENTITYLINK.cfStartDate
cerif:endDate                 ENTITYLINK.cfEndDate
cerif:fraction                ENTITYLINK.cfFraction
cerif:classification          VOCAB-TERM-URI

ENTITYLINK refers to the name of the link entity (e.g. cfOrgUnit_Class). VOCAB-TERM-URI is a URI pointing to a given term of an external vocabulary.


CLASSIFYING A CERIF ORGANISATION UNIT

<http://cris.myOrganization.org/resource/organisationUnits/UAH>

...

cerif:isClassifiedBy

<http://cris.myOrganization.org/resource/organisationUnits/UAH/class/uuid>;


DESCRIPTION OF AN ENTITY LINK CLASSIFICATION

<http://cris.myOrganization.org/resource/organisationUnits/UAH/class/uuid>

 

a cerif:Classification ;

 

rdfs:label "Classification for UAH as an University " ;

 

cerif:classification <http://eurocris.org/semcerif#University> ;

 

cerif:startDate "1901-01-01 00:00:00.0" ;

 

cerif:endDate "2099-12-31 23:59:59.0" ;

 

cerif:fraction "1" .


RECIPES FOR THE CERIF LD EXPOSURE: OTHER CERIF ENTITIES


OTHER CERIF ENTITIES

Shared entities [not exposed as Linked Data resources]: Currency, Country, Language

Infrastructure [exposed as Linked Data resources]: Equipment, Facility, Service

Second level [exposed as Linked Data resources]: Funding, Event, PrizeAward, Metrics, Cite, CurriculumVitae, ExpertiseAndSkill, Qualification, ElectronicAddress, PostalAddress


CERIF SHARED ENTITIES

Country, Language and Currency are globally shared entities, so there is no need to expose them in our CERIF-LD datasets.

Instead, we should use the countries, currencies and languages available on DBpedia:
http://dbpedia.org/ontology/Currency
http://dbpedia.org/ontology/Country
http://dbpedia.org/ontology/Language


CERIF 2ND LEVEL ENTITY FUNDING

The original name [1] associated with the funding will be part of the resource identifier (ID): http://cris.myorganization.org/resource/funding/ID

Predicate                     Object
rdf:type                      “cerif:Funding”
cerif:internalIdentifier      cfFund.cfFundId
cerif:startDate               cfFund.cfStartDate
cerif:endDate                 cfFund.cfEndDate
cerif:amount                  cfFund.cfAmount
cerif:currencyCode            cfFund.cfCurrCode
foaf:homepage                 cfFund.cfURI

[1] Taking into account the condition [cfTrans=o]


CERIF MODEL EXTENSION


CERIF LINKED OPEN DATA ENTITY

Creation of a new entity, cfCERIF-LOD, with the following structure:

internal:   cfEntity, cfInstanceId
predicate:  cfPredicateClassId, cfPredicateClassSchemeId, cfStartDate, cfEndDate
external:   cfxObjectURI, cfxSourceURI, cfxURIKind (opt), cfClassId (opt), cfClassSchemeId (opt)


CERIF LINKED OPEN DATA ENTITY

cfCERIF-LOD example (Example 1)

cfEntity                      cfPerson
cfInstanceId                  miguel-angel-sicilia-uuid
cfPredicateClassId            professor
cfPredicateClassSchemeId      CERIF Semantics 2011
cfStartDate                   2000-06-01 00:00:00.0
cfEndDate                     2004-02-31 23:59:59.0
cfxObjectURI                  http://otherRIS.org/resource/organisationUnit/CarlosIII
cfxSourceURI                  http://otherRIS.org/
cfxURIKind (opt)              absolute
cfClassId (opt)               cfOrgUnit
cfClassSchemeId (opt)         cfCERIF-2011-Entities-Collection

Page 77: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

CERIF LINKED OPEN DATA ENTITY

cfCERIF-LOD example

cfEntity cfPerson

cfInstanceId miguel-angel-sicilia-uuid

cfPredicateClassId cfsameAs-uuid

cfPredicateClassSchemeId cflinkedopendata-2008-1.3-uuid

cfStartDate

cfEndDate

cfxObjectURI http://dblp.l3s.de/d2r/resource/authors/Miguel-Angel_Sicilia

cfxSourceURI http://dblp.l3s.de/d2r/sparql

cfxURIKind (opt)

cfClassId (opt)

cfClassSchemeId (opt)

Example 2
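As a rough illustration of how a cfCERIF-LOD record could be published, the following Python sketch turns Example 2 into a single N-Triples statement. The lookup tables that resolve `cfInstanceId` to a local resource URI and `cfPredicateClassId` to an RDF property are hypothetical; the slides do not specify how these identifiers are resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CerifLod:
    """One cfCERIF-LOD record, with the fields listed on the slide."""
    cfEntity: str
    cfInstanceId: str
    cfPredicateClassId: str
    cfPredicateClassSchemeId: str
    cfxObjectURI: str
    cfxSourceURI: str
    cfStartDate: Optional[str] = None
    cfEndDate: Optional[str] = None
    cfxURIKind: Optional[str] = None
    cfClassId: Optional[str] = None
    cfClassSchemeId: Optional[str] = None


# Hypothetical lookups: a real system would resolve these from the RIS.
LOCAL_URIS = {
    "miguel-angel-sicilia-uuid":
        "http://cris.uah.es/resource/persons/Miguel-Angel_Sicilia",
}
PREDICATE_URIS = {
    "cfsameAs-uuid": "http://www.w3.org/2002/07/owl#sameAs",
}


def to_ntriple(record):
    """Serialise the record as one N-Triples line (URI object only)."""
    subject = LOCAL_URIS[record.cfInstanceId]
    predicate = PREDICATE_URIS[record.cfPredicateClassId]
    return f"<{subject}> <{predicate}> <{record.cfxObjectURI}> ."


example2 = CerifLod(
    cfEntity="cfPerson",
    cfInstanceId="miguel-angel-sicilia-uuid",
    cfPredicateClassId="cfsameAs-uuid",
    cfPredicateClassSchemeId="cflinkedopendata-2008-1.3-uuid",
    cfxObjectURI="http://dblp.l3s.de/d2r/resource/authors/Miguel-Angel_Sicilia",
    cfxSourceURI="http://dblp.l3s.de/d2r/sparql",
)
line = to_ntriple(example2)
print(line)
```

The optional fields are kept in the record but do not affect the emitted triple in this sketch.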

Page 78: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

KEY USE CASE

Page 79: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

KEY USE CASE

Miguel-Angel Sicilia is a researcher who has worked for several organizations in his career. His research information is now spread across several RIS located at different universities. Miguel-Angel would like to have all his information integrated.

Page 80: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

STEP 1: RETRIEVE LOCAL IDENTIFIERS

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?researcher
WHERE {
  ?researcher a <http://eurocris.org/cerif#Person> .
  ?researcher foaf:firstName "Miguel-Angel" .
  ?researcher foaf:familyName "Sicilia" .
}
ORDER BY ?researcher

What is Miguel-Angel Sicilia’s identifier at the University of Alcalá? And at Carlos III University of Madrid?

The same query is sent to the endpoints of all the registered datasets.
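Sending one query to several endpoints can be sketched with the Python standard library and the SPARQL protocol (HTTP GET with a `query` parameter, results requested as SPARQL JSON). The endpoint URLs below are placeholders, and the helper names `build_request`, `parse_bindings` and `ask_all` are introduced only for this illustration.

```python
import json
import urllib.parse
import urllib.request

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?researcher
WHERE {
  ?researcher a <http://eurocris.org/cerif#Person> .
  ?researcher foaf:firstName "Miguel-Angel" .
  ?researcher foaf:familyName "Sicilia" .
}
ORDER BY ?researcher
"""

# Placeholder endpoints; a real deployment would read them from a registry.
ENDPOINTS = [
    "http://cris.uah.example/sparql",
    "http://cris.uc3m.example/sparql",
]


def build_request(endpoint, query):
    """Build a SPARQL-protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(json_text, variable):
    """Extract the values bound to `variable` from a SPARQL JSON result."""
    data = json.loads(json_text)
    return [
        binding[variable]["value"]
        for binding in data["results"]["bindings"]
        if variable in binding
    ]


def ask_all(endpoints, query, variable):
    """Send the same query to every endpoint; return values per endpoint."""
    results = {}
    for endpoint in endpoints:
        request = build_request(endpoint, query)
        with urllib.request.urlopen(request, timeout=30) as response:
            body = response.read().decode("utf-8")
        results[endpoint] = parse_bindings(body, variable)
    return results
```

Calling `ask_all(ENDPOINTS, QUERY, "researcher")` against real endpoints would return one list of local identifiers per dataset; error handling for unreachable endpoints is left out of the sketch.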

Page 81: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

STEP 2: OBTAINING ORGANISATION NAMES

PREFIX cerif: <http://eurocris.org/cerif#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?organisationName
WHERE {
  <http://cris.uah.es/resource/persons/Miguel-Angel_Sicilia> cerif:linksToOrganisationUnit ?ENTITYLINK .
  ?organisation cerif:isLinkedByPerson ?ENTITYLINK .
  ?organisation a <http://eurocris.org/cerif#OrganisationUnit> .
  ?organisation foaf:name ?organisationName .
}
ORDER BY ?organisationName

In which organizations has Miguel-Angel worked? Using the identifiers obtained in the previous step, we send this query to the corresponding dataset endpoints.

Page 82: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

STEP 3: JOINING AND FORMATTING RESULTS

Miguel-Angel_Sicilia has worked in the following universities:
  University of Alcalá
  Carlos III University of Madrid
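The joining step can be sketched in a few lines of Python, assuming the per-endpoint results of Step 2 are already available as lists of organisation names. The endpoint keys and the `join_results` helper are hypothetical.

```python
def join_results(person, results_by_endpoint):
    """Merge organisation names from all endpoints and format a summary.

    Duplicates across endpoints are removed and names are sorted so the
    output does not depend on the order in which endpoints answered.
    """
    names = sorted(
        {name for names in results_by_endpoint.values() for name in names}
    )
    lines = [f"{person} has worked in the following universities:"]
    lines += [f"  {name}" for name in names]
    return "\n".join(lines)


# Hypothetical per-endpoint answers to the Step 2 query.
results = {
    "http://cris.uah.example/sparql": ["University of Alcalá"],
    "http://cris.uc3m.example/sparql": ["Carlos III University of Madrid"],
}
summary = join_results("Miguel-Angel_Sicilia", results)
print(summary)
```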

Page 83: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

DEMO

Page 84: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

FRONT-END FOR CERIF LD DATASET

http://voa3r.cc.uah.es:443/dataset/

Page 86: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

BOOTSTRAPPING

Page 87: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

BOOTSTRAPPING

Installing and configuring the Linked Data server

Page 88: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

CONFIGURATION FILE OF D2R SERVER

# Class map: one cerif:Project resource per cfProj row, with the acronym in the URI
map:Projects a d2rq:ClassMap;
    d2rq:dataStorage map:database;
    d2rq:uriPattern "projects/@@cfProj.cfAcro|urlify@@";
    d2rq:class cerif:Project;
    .

# Property bridge: internal identifier
map:Projects_cfProjId a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Projects;
    d2rq:property cerif:internalIdentifier;
    d2rq:column "cfProj.cfProjId";
    .

# Property bridge: acronym
map:Projects_cfAcro a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Projects;
    d2rq:property cerif:acronym;
    d2rq:column "cfProj.cfAcro";
    .

Page 89: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

REQUIRED ACTIONS IN THE D2R CONFIGURATION

Adaptation to multiple languages
  D2R does not support dynamic language tags. Therefore, it is necessary to replicate the d2rq:PropertyBridge predicates for each language.

Data normalization
  Several CERIF attributes should be normalized before being exposed as Linked Data:
    Non-atomic attributes, e.g. cfKeyw, cfResInt and cfResAct.
    Publication of the cfPers.cfGender attribute.
    Attributes with NULL values require special treatment in some databases.
  Normalization can be done at publication time (via the D2R configuration file) or within the RIS system.
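Replicating property bridges per language lends itself to generation rather than copy and paste. The following Python sketch emits one bridge per language code using the D2RQ terms `d2rq:join`, `d2rq:condition` and `d2rq:lang`. The table and column names (cfProjTitle, cfLangCode, cfTitle) and the `cerif:title` property are assumptions about the local schema and ontology, so adjust them to the actual database.

```python
# Template for one language-specific bridge; {lang} is the only placeholder.
BRIDGE_TEMPLATE = """map:Projects_cfTitle_{lang} a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Projects;
    d2rq:property cerif:title;
    d2rq:column "cfProjTitle.cfTitle";
    d2rq:join "cfProj.cfProjId = cfProjTitle.cfProjId";
    d2rq:condition "cfProjTitle.cfLangCode = '{lang}'";
    d2rq:lang "{lang}";
    .
"""


def language_bridges(languages):
    """Return D2RQ mapping text with one property bridge per language."""
    return "\n".join(BRIDGE_TEMPLATE.format(lang=lang) for lang in languages)


mapping_text = language_bridges(["en", "es"])
print(mapping_text)
```

The generated text can be appended to the D2R configuration file; adding a language then means adding one code to the list instead of duplicating a block by hand.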

Page 90: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

BOOTSTRAPPING

Linking our dataset with external data

Page 91: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

DISCOVERING LINKS WITH EXTERNAL RESOURCES

Basic Linked Data principle: set RDF links pointing to other data sources on the Web.

http://richard.cyganiak.de/2007/10/lod/

[Figure: CERIF db, CERIF dataset and the Silk Link Discovery Framework; link types shown: owl:sameAs, cerif:country]
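To give an idea of what link discovery does, here is a deliberately naive Python sketch that proposes owl:sameAs links when two resources have the same normalised label. It is not the Silk API or its configuration language; Silk supports far richer comparison rules. All URIs and labels below are hypothetical.

```python
import unicodedata


def normalise(label):
    """Lower-case, strip accents and collapse whitespace in a label."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())


def discover_links(local, external, predicate="owl:sameAs"):
    """Return (local_uri, predicate, external_uri) for matching labels.

    `local` and `external` map resource URIs to their labels.
    """
    index = {normalise(label): uri for uri, label in external.items()}
    links = []
    for uri, label in local.items():
        target = index.get(normalise(label))
        if target is not None:
            links.append((uri, predicate, target))
    return links


# Hypothetical local and external resources.
local = {
    "http://cris.myorganization.org/resource/organisationUnits/UAH":
        "Universidad de  Alcalá",
}
external = {
    "http://example.org/resource/Universidad_de_Alcala":
        "universidad de alcala",
    "http://example.org/resource/Other": "Another University",
}
links = discover_links(local, external)
print(links)
```

Exact label matching produces false positives and misses variants, which is why a dedicated framework with configurable similarity measures is used in practice.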

Page 92: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

BOOTSTRAPPING

Publishing metadata

Page 93: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

METADATA FOR LINKED DATA RESOURCES DESCRIPTIONS

Additional metadata is needed for the RDF resource descriptions published in CERIF-LD datasets, in order to:
  Assure data consumers of the origin of the data and enable them to assess its quality.
  Enable external applications to use CERIF data on a secure legal basis.
  Aid the discovery and indexing of our data by crawlers.

Page 94: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

ADDITIONAL METADATA SERVED BY D2R

d2r:documentMetadata [
    # General metadata
    rdf:type foaf:Document ;
    rdfs:comment "This resource description is published according to CERIF LD." ;
    # Provenance metadata
    dc:creator <http://www.myOrganization.org> ;
    dc:publisher <http://www.myOrganization.org> ;
    dc:date "2011-01-01"^^xsd:date ;
    # License metadata
    dc:rights <http://creativecommons.org/licenses/by-nc/3.0> ;
    # Dataset metadata
    void:inDataset <http://cris.myOrganization.org/dataset> ;
];

Page 95: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

BOOTSTRAPPING

Publishing CERIF-LD datasets

Page 96: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

DESCRIBING CERIF-LD DATASETS WITH VOID

<http://cris.myOrganization.org/dataset>
    a void:Dataset ;
    rdfs:label "CERIF Research Information Dataset of MyOrganization" ;
    dc:title "CERIF Research Information Dataset of MyOrganization" ;
    dc:description "Dataset describing CERIF resources from the corporate current research information system" ;
    foaf:homepage <http://cris.myOrganization.org/dataset.html> ;
    foaf:isPrimaryTopicOf <http://cris.myOrganization.org/dataset.rdf> ;
    void:sparqlEndpoint <http://cris.myOrganization.org/sparql> ;
    void:vocabulary <http://xmlns.com/foaf/0.1/> ;
    void:vocabulary <http://eurocris.org/cerif> ;
    void:vocabulary <http://eurocris.org/semcerif> ;
    void:exampleResource <http://cris.myorganization.org/resource/projects/VOA3R> .

Page 97: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

DISSEMINATING OUR DATASET

Register the dataset in open data registries (linked or non-linked).

Using VoID, CERIF-LD datasets can be made discoverable. This:
  Enables further interconnection with other datasets.
  Fosters the development of new web apps.

Page 98: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

ISSUES AND CHALLENGES

Page 99: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

OPEN ISSUES: GENERAL

The link entities introduce “hops” between linked resources in our datasets. This makes navigation via SPARQL a little more complex. Add syntactic sugar?

The D2R configuration is too heavy and repetitive, encouraging intensive use of copy/paste, which is error-prone. Any proposal to further automate the generation of the D2R file?

How can the CERIF model components evolve in a synchronised and sustainable way? Namely: data model, SQL scripts, XML Schema, RDFS ontologies and the D2R pre-configuration file.

Page 100: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

OPEN ISSUES: URI DESIGN

Uniqueness of identifiers: the attributes selected to form part of the resource identifiers (acronyms, titles, full names, etc.) must be unique within their local dataset.
  Alter the CERIF database model by introducing unique keys?

Resource identifiers (URIs) must not change over time. What happens if you change the title of a publication?
  Alter the CERIF model by introducing a new attribute, LDIdentifier, for all entities.

Link entities need short unique identifiers. Their primary key is composed of many attributes.
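The uniqueness and stability concerns can be made concrete with a small Python sketch of a URI registry: the URI is minted once from a label, disambiguated on collision, and then kept even if the label later changes. The stored URI plays the role of the proposed LDIdentifier attribute; the `slug` function and `UriRegistry` class are hypothetical and are not part of CERIF or D2R.

```python
import re
import unicodedata


def slug(text):
    """Turn a label into an ASCII, URL-friendly identifier fragment."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^A-Za-z0-9]+", "_", ascii_text).strip("_")


class UriRegistry:
    """Mint a URI once per primary key and never change it afterwards."""

    def __init__(self, base):
        self.base = base
        self.by_key = {}   # primary key -> minted URI (the "LDIdentifier")
        self.used = set()  # all URIs minted so far

    def uri_for(self, primary_key, label):
        if primary_key in self.by_key:
            # Stable: a later change of label does not alter the URI.
            return self.by_key[primary_key]
        candidate = slug(label)
        uri = self.base + candidate
        suffix = 2
        while uri in self.used:
            # Unique: add a numeric suffix when two labels collide.
            uri = f"{self.base}{candidate}_{suffix}"
            suffix += 1
        self.by_key[primary_key] = uri
        self.used.add(uri)
        return uri


registry = UriRegistry("http://cris.myorganization.org/resource/projects/")
first = registry.uri_for("proj-1", "VOA3R")
second = registry.uri_for("proj-2", "VOA3R")            # same acronym
renamed = registry.uri_for("proj-1", "VOA3R (renamed)")  # label changed
print(first, second, renamed, sep="\n")
```

Persisting the minted URI in the database, rather than recomputing it from the title or acronym at publication time, is what keeps the identifier stable.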

Page 101: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

OPEN ISSUES: LINKING EXTERNAL RESOURCES

How to publish links between our data and external resources?
  Extend the CERIF model with a new entity for storing RDF statements: subject/predicate/object triples.

How to establish the mapping between the terms and classifications of the CERIF database and the external LD vocabularies?
  Reuse the cfURI attribute of the cfClass entity?

Page 102: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

OPEN ISSUES: LINKING BIBLIOGRAPHIC DATABASES

We have used the Bibliographic Ontology to describe linked data for result publications.
  There is another proposal, named SWRC (Semantic Web for Research Communities).

The major bibliographic databases are not yet publishing linked data.
  Elsevier Developer Network is addressing it now.

Page 103: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

CONCLUSIONS

Page 104: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

CONCLUSIONS

Proposal of recommendations for the exposure of Linked Data according to the CERIF Ontology: a basic, official RDF(S) mapping of CERIF.

An example architecture (not the only one) for publishing linked data from RIS databases.

A set of recipes for translating CERIF model entities into RDF classes and properties, and a guided approach for bootstrapping.

There are a number of issues that require further treatment.

Linked Data as a way of interchanging data between the different stakeholders involved in research.

Page 105: Publishing research information as Linked Data Proposal of Recommendations EuroCRIS meeting. February 2012 Miguel-Ángel Sicilia

THANKS

Miguel-Angel Sicilia [email protected]

Iván Ruiz Rube [email protected]

