DRIVER Guidelines and Repository Interoperability

Fasten … Seatbelt Maurice Vanderfeesten - SURFfoundation (NL) 15 November 2008 – Baltimore – DRIVER meeting

Upload: mauricevanderfeesten

Post on 20-May-2015


DESCRIPTION

On 2008-11-15 Maurice Vanderfeesten gave a presentation in Baltimore at the SPARC OpenAccess conference. The presentation explains the need for interoperability among repository systems. DRIVER provides guidelines on how to expose metadata via OAI-PMH in a way that is internationally compliant.

TRANSCRIPT

Page 1: Driver Guidelines and Repository Interoperability

Fasten … Seatbelt

Maurice Vanderfeesten - SURFfoundation (NL)

15 November 2008 – Baltimore – DRIVER meeting

Page 2: Driver Guidelines and Repository Interoperability

Fasten

Excel in Scholarly communication

Page 3: Driver Guidelines and Repository Interoperability

Seatbelt

Get to the finish line safely

Page 4: Driver Guidelines and Repository Interoperability

Innovation towards the intelligent web

[Figure: chart with axes "Productivity of Search" and "Amount of data". Eras: PC Era 1980 - 1990, Web 1.0 1990 - 2000, Web 2.0 2000 - 2010, Web 3.0 2010 - 2020, Web 4.0 2020 - 2030. Labels: The Desktop, The World Wide Web, The Social Web, The Semantic Web, The Intelligent Web; Files & Folders, Databases, Keyword search, Directories, Tagging, Natural language search, Semantic Search, Reasoning. By: Radar Networks / TWINE]

Page 5: Driver Guidelines and Repository Interoperability

Work together: Respect some rules

Page 6: Driver Guidelines and Repository Interoperability

Global Digital Repository Infrastructure

One goal: “Reliable Content Provision”

Page 7: Driver Guidelines and Repository Interoperability

TICER 2008, Tilburg

Widespread metadata standards: Unqualified Dublin Core & OAI-PMH

Problem: interpreting semantics; standard specifications are not enough

Example: electronic theses need context-specific descriptions for date, type, roles & language

Reality: Efforts to interpret and normalize data

Page 8: Driver Guidelines and Repository Interoperability

- Trouble automatically interpreting semantics, e.g. [date] (Cranfield)

Effort interpreting dates

Cranfield:
<dc:contributor>Partington, David(supervisor)</dc:contributor>
<dc:creator>Lupson, Jonathan</dc:creator>
<dc:date>2007-06-06T18:17:13Z</dc:date>
<dc:date>2007-06-06T18:17:13Z</dc:date>
<dc:date>2007-02</dc:date>
<dc:identifier>http://hdl.handle.net/1826/1729</dc:identifier>
<dc:description>

(Publication?) (Graduation?) (Start?)

Humboldt:
<dc:date>2007-06-07</dc:date> (Graduation)
<dc:date>2007-03-06</dc:date> (Publication)
<dc:date>2003-02</dc:date> (Start)

Recommendation: in Unqualified Dublin Core use one date field that represents the Publication date!
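A minimal sketch of how a harvester might test this recommendation, assuming the record is available as an XML fragment. The `<record>` wrapper element and the function names are illustrative assumptions, not part of the DRIVER guidelines or of any validator shown in this talk; only the Dublin Core namespace and the Cranfield date values come from real sources.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set used by oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_dates(record_xml):
    """Return the text of every dc:date element in an XML fragment."""
    root = ET.fromstring(record_xml)
    return [el.text or "" for el in root.iter(f"{{{DC_NS}}}date")]


def has_single_date(record_xml):
    """True when the record exposes exactly one dc:date."""
    return len(dc_dates(record_xml)) == 1


# The three Cranfield dates from the slide, inside a hypothetical wrapper.
CRANFIELD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:date>2007-06-06T18:17:13Z</dc:date>
  <dc:date>2007-06-06T18:17:13Z</dc:date>
  <dc:date>2007-02</dc:date>
</record>"""

print(dc_dates(CRANFIELD))
print(has_single_date(CRANFIELD))  # False: three dates, meaning unclear
```

A check like this only counts the fields; it cannot tell whether the single remaining date really is the publication date.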

Page 9: Driver Guidelines and Repository Interoperability

- Trouble automatically interpreting semantics, e.g. [type]

Effort interpreting types

DIVA:
<dc:type>text.thesis.doctoral</dc:type>

Cranfield:
<dc:type>Thesis or dissertation</dc:type>
<dc:type>Doctoral</dc:type>
<dc:type>PhD</dc:type>

Humboldt:
<dc:type>Text</dc:type>
<dc:type>dissertation</dc:type>

Recommendation: use the following qualifications: “info:eu-repo/semantics/bachelorThesis”, “info:eu-repo/semantics/masterThesis”, “info:eu-repo/semantics/doctoralThesis” (Bologna Convention)
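As an illustration of the normalization effort a service provider faces, the sketch below maps the local type strings from this slide onto the three recommended terms. The lookup table `LOCAL_HINTS` and the function name are hypothetical; a real mapping would have to be agreed with each repository.

```python
# The three thesis terms named in the recommendation.
DRIVER_THESIS_TYPES = {
    "bachelor": "info:eu-repo/semantics/bachelorThesis",
    "master": "info:eu-repo/semantics/masterThesis",
    "doctoral": "info:eu-repo/semantics/doctoralThesis",
}

# Hypothetical hints: local spellings that clearly name a thesis level.
LOCAL_HINTS = {
    "text.thesis.doctoral": "doctoral",
    "doctoral": "doctoral",
    "phd": "doctoral",
}


def normalize_thesis_type(dc_types):
    """Return the recommended term for a list of dc:type values, or None
    when the level cannot be determined from the values alone."""
    for value in dc_types:
        level = LOCAL_HINTS.get(value.strip().lower())
        if level is not None:
            return DRIVER_THESIS_TYPES[level]
    return None


print(normalize_thesis_type(["text.thesis.doctoral"]))                       # DIVA
print(normalize_thesis_type(["Thesis or dissertation", "Doctoral", "PhD"]))  # Cranfield
print(normalize_thesis_type(["Text", "dissertation"]))                       # Humboldt
```

The Humboldt values yield `None` here because "dissertation" alone does not say which level is meant, which is the kind of ambiguity a shared vocabulary removes.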

Page 10: Driver Guidelines and Repository Interoperability

1. Electronic theses need context-specific descriptions

Effort interpreting roles

<dc:contributor>Partington, David(supervisor)</dc:contributor>
<dc:creator>Lupson, Jonathan</dc:creator>
<dc:date>2007-06-06T18:17:13Z</dc:date>
<dc:date>2007-06-06T18:17:13Z</dc:date>
<dc:date>2007-02</dc:date>
<dc:identifier>http://hdl.handle.net/1826/1729</dc:identifier>
<dc:description>

Recommendation: use the contributor field in Dublin Core only for the person who supervised the Doctoral thesis project.

Page 11: Driver Guidelines and Repository Interoperability

Personal notation flavour of a language

Effort interpreting languages

<dc:language>Nederlands</dc:language>

<dc:language>ned</dc:language>

<dc:language>nl</dc:language>

<dc:language>nld/dut</dc:language>

<dc:language>en_UK</dc:language>

<dc:language>mn</dc:language>

Recommendation: use ISO 639-3 as a standard way of writing down a language in a repository
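To make the recommendation concrete, here is a small sketch that maps the notations shown on this slide to three-letter ISO 639-3 codes. The lookup table is hand-written for these examples only and is an assumption; reading "mn" as the ISO 639-1 code for Mongolian is also an assumption about what the repository meant.

```python
# Hand-written, illustrative mapping for the variants on the slide.
# A production system would use the full ISO 639-3 code tables instead.
TO_ISO639_3 = {
    "nederlands": "nld",
    "ned": "nld",
    "nl": "nld",
    "nld/dut": "nld",
    "en_uk": "eng",
    "mn": "mon",  # assumes the two-letter code for Mongolian was intended
}


def to_iso639_3(value):
    """Return the ISO 639-3 code for a known local notation, else None."""
    return TO_ISO639_3.get(value.strip().lower())


for raw in ["Nederlands", "ned", "nl", "nld/dut", "en_UK", "mn", "Dutch"]:
    print(raw, to_iso639_3(raw))
```

Values the table does not know, such as "Dutch" in the loop above, come back as `None` and would need manual review.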

Page 12: Driver Guidelines and Repository Interoperability

Number of repositories increases

DRIVER: Collection of Quality Metadata for OpenAccess Material

Page 13: Driver Guidelines and Repository Interoperability

All service providers must build adaptors for every single repository

Page 14: Driver Guidelines and Repository Interoperability

Interoperability shares workload

Page 15: Driver Guidelines and Repository Interoperability

Global Digital Repository Infrastructure

One goal: “Reliable Content Provision”

Page 16: Driver Guidelines and Repository Interoperability

[Diagram labels: Repository, URL]

Reliability: Broken Links Issue

Page 17: Driver Guidelines and Repository Interoperability

[Diagram labels: Repository, Global Resolver, OAI-PMH, ID, URL, ID + URL updates]

Reliability: Link resolvers

• Use IDs for citation reference

• Obligation to update

• Technology independent (future proof)
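The idea behind a link resolver can be sketched in a few lines: citations carry a stable ID, and the repository is obliged to tell the resolver whenever the URL behind that ID changes. The class below is a toy in-memory model; its name, methods, identifier and URLs are all hypothetical and do not describe any actual resolver service.

```python
class GlobalResolver:
    """Toy in-memory resolver mapping a persistent ID to its current URL."""

    def __init__(self):
        self._locations = {}

    def update(self, identifier, url):
        """Register a new ID or record that an existing ID has moved."""
        self._locations[identifier] = url

    def resolve(self, identifier):
        """Return the current URL for an ID, or raise LookupError."""
        try:
            return self._locations[identifier]
        except KeyError:
            raise LookupError(f"unknown identifier: {identifier}") from None


resolver = GlobalResolver()
resolver.update("urn:example:thesis-1", "https://old.example.org/thesis/1")

# The repository migrates to new software and sends an update.
resolver.update("urn:example:thesis-1", "https://new.example.org/items/1")

# A citation that used the ID still reaches the document.
print(resolver.resolve("urn:example:thesis-1"))
```

Because the citation stores only the ID, nothing in the citing document has to change when the repository moves.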

Page 18: Driver Guidelines and Repository Interoperability

Standards, Agreements, Rules: Interoperability guidelines

Page 19: Driver Guidelines and Repository Interoperability

Towards web-reasoning: data efficiency & interoperability levels

By: Andreas Tolk et al., "Composable M&S Web Services for Net-centric Applications," Journal for Defense Modeling & Simulation (JDMS), Volume 3 Number 1, pp. 27-44, January 2006

Page 20: Driver Guidelines and Repository Interoperability

Interoperability leads towards improved retrieval and recall

[Figure: chart with axes "Productivity of Search" and "Amount of data". Eras: PC Era 1980 - 1990, Web 1.0 1990 - 2000, Web 2.0 2000 - 2010, Web 3.0 2010 - 2020, Web 4.0 2020 - 2030. Labels: The Desktop, The World Wide Web, The Social Web, The Semantic Web, The Intelligent Web; Files & Folders, Databases, Keyword search, Directories, Tagging, Natural language search, Semantic Search, Reasoning. By: Radar Networks / TWINE]

Page 21: Driver Guidelines and Repository Interoperability

We have: Tools for Syntactic & Semantic Interoperability

- Guidelines for content providers, exposing textual resources with OAI-PMH

- Validator, checking the rate of compliance to the “Guidelines for content providers”
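For readers unfamiliar with OAI-PMH, the sketch below builds the kind of harvesting request that such guidelines govern: a `ListRecords` call asking for unqualified Dublin Core (`oai_dc`). The base URL and the set name are placeholders, and the helper function is an illustrative assumption rather than part of any DRIVER tool.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    `verb`, `metadataPrefix` and `set` are request arguments defined by
    OAI-PMH; the base URL is whatever the repository exposes.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name.
print(list_records_url("https://repository.example.org/oai"))
print(list_records_url("https://repository.example.org/oai", set_spec="theses"))
```

The protocol fixes the syntax of the request and response; what the `dc:date`, `dc:type` or `dc:language` values inside the response mean is left to agreements such as the guidelines.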

Page 22: Driver Guidelines and Repository Interoperability
Page 23: Driver Guidelines and Repository Interoperability

Guidelines 2.0

- Build on knowledge from past & current IR projects (EU)

- 26 actively involved contributors (experts and repository managers) from 8 countries.

- Practical answers for IRs on how to:

  - Improve full-text access

  - Standardize metadata quality

  - Create a reliable infrastructure for permanent identification, resolution, traceability and storage

  - Resolve semantic and classification issues

Page 24: Driver Guidelines and Repository Interoperability

Guidelines 2.0 - Chapters

1. Use of OAI-PMH

2. Use of Metadata OAI_DC

3. Use of Best Practices for OAI_DC

4. Use of Compound Object Wrapping

5. Use of Vocabularies and Semantics

6. Use of Quality labels (Long Term Preservation)

7. Use of Persistent Identifiers

8. Use of Usage Statistics Exchange

9. Use of Intellectual Property Rights (IPR)

Page 25: Driver Guidelines and Repository Interoperability

Validator

Page 26: Driver Guidelines and Repository Interoperability

Validator

- Deep validation

- Experimental tool

- Self-test for Repository Managers

- Embedded in DRIVER registration process

- Detects interoperability issues

- Provides explanation per interoperability issue.

- Points to exact location of the issue for easy debugging

- Offers recommendations on how to correctly modify your repository to interoperable standards

- Creates a report for future reference

- Provides a weighted score for balanced effort

- Score influences the result list.
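The slide does not say how the weighted score is computed. One plausible reading, shown here purely as an assumption, is that each rule carries a weight and the score is the weighted share of rules that pass. The rule names and weights below are invented for illustration.

```python
def weighted_score(results, weights):
    """Return the weighted percentage of passed rules.

    results: rule name -> True/False (missing rules count as failed)
    weights: rule name -> positive weight
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    passed = sum(w for rule, w in weights.items() if results.get(rule, False))
    return 100.0 * passed / total


# Hypothetical rules and weights, not taken from the DRIVER validator.
weights = {"single_date": 3, "type_vocabulary": 2, "iso639_3_language": 1}
results = {"single_date": False, "type_vocabulary": True, "iso639_3_language": True}

print(weighted_score(results, weights))  # 50.0
```

Weighting lets a repository manager see which fixes raise the score most, which may be what "balanced effort" refers to.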

Page 27: Driver Guidelines and Repository Interoperability

Looking back on what we have:

- Guidelines for content providers, exposing textual resources with OAI-PMH

- Validator, checking the rate of compliance to the “Guidelines for content providers”

Page 28: Driver Guidelines and Repository Interoperability

What is missing?

Guidelines

Page 29: Driver Guidelines and Repository Interoperability

Trias Politica Model

Legislative

Page 30: Driver Guidelines and Repository Interoperability

[Figure: chart with axes "Productivity of Search" and "Amount of data". Eras: PC Era 1980 - 1990, Web 1.0 1990 - 2000, Web 2.0 2000 - 2010, Web 3.0 2010 - 2020, Web 4.0 2020 - 2030. Labels: The Desktop, The World Wide Web, The Social Web, The Semantic Web, The Intelligent Web; Files & Folders, Databases, Keyword search, Directories, Tagging, Natural language search, Semantic Search, Reasoning. By: Radar Networks / TWINE]

We DON’T have

- A structure for acceptance of Repository Interoperability Guidelines World Wide

- Executive enforcement enabling action on adopting Interoperability Guidelines for Repositories, World Wide, on a National and local level

Page 31: Driver Guidelines and Repository Interoperability

Questions

• What strategies can be used to create a global “Trias Politica” for repositories in order to enforce “reliable content provision” by using interoperability guidelines?

• What strategies are there to maintain repository guidelines? Who is responsible?

• What strategies are known to create an acceptance mechanism for global agreement to repository guidelines?

• What strategies can be used to enforce repository guidelines?

• Who is responsible for the (metadata) quality of the repository output?

Page 32: Driver Guidelines and Repository Interoperability

The end

Thank you

Maurice Vanderfeesten

www.SURFfoundation.nl

[email protected]