Post on 11-Feb-2017

Ansgar Scherp, a.scherp@zbw.eu

104th Bibliothekartag, Nuremberg, Germany, May 2015

About the Challenges of Linked Open Data (LOD) in Libraries

Index Newly Acquired Media

• Ancient world: Library of Alexandria
• Today: database-oriented systems
• Tomorrow: Web Linked Open Data in Libraries

Source: http://en.wikipedia.org/wiki/Library_of_Alexandria

Linked Open Data (LOD) in Libraries
• Publishing and interlinking of data
• Different quality and purpose
• From different sources on the Web

World Wide Web
• Documents
• Hyperlinks
• HTML
• Addresses (URIs)

Example: http://www.bibliothekartag2015.de

Linked Data
• Data
• Typed links
• RDF
• Addresses (URIs)
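
The contrast above can be made concrete: a hyperlink only says that one document points to another, whereas an RDF statement also names the type of the link. The following is a minimal sketch in plain Python, not a full RDF library. The example.org URIs are hypothetical placeholders; the two predicates are the Dublin Core terms `title` and `creator`.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# All example.org URIs are hypothetical placeholders.
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_CREATOR = "http://purl.org/dc/terms/creator"

TALK = "http://example.org/talk/1"

triples = [
    (TALK, DCT_TITLE, "About the Challenges of Linked Open Data (LOD) in Libraries"),
    (TALK, DCT_CREATOR, "http://example.org/person/scherp"),
]


def objects(statements, subject, predicate):
    """Return all objects reachable from `subject` via the typed link `predicate`."""
    return [o for s, p, o in statements if s == subject and p == predicate]


creators = objects(triples, TALK, DCT_CREATOR)
print(creators)
```

Because the link type is itself a URI, a client can follow only the links it understands, for example only `creator` links.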

Linked (Library) Data: A Success Story

Current (Technological) Topics *)

1. Entity resolution
2. Schema matching
3. Distributed data management
4. Automatic indexing
5. Indexing non-textual content
6. Data provenance

Non-technical but equally important:
• Quality management (e.g., automated indexing)
• Legal aspects
• Job market

*) Disclaimer: No guarantee for completeness

Core computer science

1. Entity Resolution
• Intra-library
  • Identify the author of a new publication
• Inter-library
  • Linking records via, e.g., authors

Helmut Kohl vs. Helmut Kohl

Ansgar Scherp vs. Ansgar Scherp

ZBW/DNB DBLP

1995 2005
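
Why identical names are not enough can be sketched as follows: two records with the same author name are compared by the overlap of their co-author sets (Jaccard similarity). This is only an illustration under assumptions, not the method used by any library named here. The record layout (`name`, `coauthors`) and all names in the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def normalize_name(name):
    """Lower-case a name and collapse whitespace."""
    return " ".join(name.lower().split())


def same_author_score(rec_a, rec_b):
    """Score in [0, 1]: 0 if the normalized names differ, else co-author overlap."""
    if normalize_name(rec_a["name"]) != normalize_name(rec_b["name"]):
        return 0.0
    return jaccard(rec_a["coauthors"], rec_b["coauthors"])


# Hypothetical records from two catalogues.
rec_1 = {"name": "Jane Doe", "coauthors": ["A. Miller", "B. Chen"]}
rec_2 = {"name": "jane  doe", "coauthors": ["B. Chen", "C. Okafor"]}
rec_3 = {"name": "Jane Doe", "coauthors": ["D. Silva"]}

score_12 = same_author_score(rec_1, rec_2)  # one shared co-author out of three
score_13 = same_author_score(rec_1, rec_3)  # same name, no shared co-authors
print(score_12, score_13)
```

A score of zero for `rec_1` and `rec_3` does not prove that they are different people, which is exactly why such signals need to be combined or confirmed by a person.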

1. Entity Resolution in LOD
• Use URI aliases to connect LOD resources
• Describing the same things in the real world
• Service for sameAs-links: .org

• Resolution of name, co-authors, title, and venue often not sufficient
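
How URI aliases connect resources can be sketched with a small union-find structure: every sameAs link merges two URIs into one group, and each resulting group stands for one real-world thing. This is a minimal sketch, not the implementation of any sameAs service. All example.org URIs are hypothetical.

```python
def same_as_clusters(links):
    """Group URIs that are connected, directly or transitively, by sameAs links."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps trees shallow
            uri = parent[uri]
        return uri

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical sameAs links between three libraries' person URIs.
links = [
    ("http://example.org/libA/person/1", "http://example.org/libB/author/17"),
    ("http://example.org/libB/author/17", "http://example.org/libC/p/9"),
    ("http://example.org/libA/person/2", "http://example.org/libC/p/4"),
]

clusters = same_as_clusters(links)
for cluster in clusters:
    print(cluster)
```

Note that the grouping is transitive, so a single wrong sameAs link merges two otherwise separate groups.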

Source: J. Neubert, K. Tochtermann: Linked Library Data: Offering a Backbone for the Semantic Web, CiCIS, 2012.

Source                                    Persons       Organizations
DBpedia                                   364,000       148,000
Library of Congress Authorities           3,800,000     900,000
German National Library Authority File    1,797,911     1,262,404
Virtual International Authority File      10 million    3.25 million

2. Schema Matching: STW and TheSoz

Standard Thesaurus Wirtschaft

• Manually created ~5000 mappings (mostly 2004/2005)
• Also connected to GND and AGROVOC
• OAEI Library Track for ontology matching (since 2012)

TheSoz (GESIS)
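
As a rough illustration of what an automatic matcher does before a person reviews the result, the sketch below proposes candidate mappings between two thesauri whenever a normalized label is identical. It is a deliberately naive baseline, not the approach of the OAEI participants or of the manual STW mappings. The concept IDs and labels are hypothetical.

```python
def normalize(label):
    """Lower-case, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(label.lower().replace("-", " ").split())


def candidate_mappings(vocab_a, vocab_b):
    """Return sorted (id_a, id_b) pairs that share at least one normalized label."""
    index = {}
    for concept_id, labels in vocab_b.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(concept_id)

    pairs = set()
    for concept_id, labels in vocab_a.items():
        for label in labels:
            for other_id in index.get(normalize(label), ()):
                pairs.add((concept_id, other_id))
    return sorted(pairs)


# Hypothetical concepts: ID -> preferred and alternative labels.
vocab_a = {"a:1": ["Labour market", "Labor market"], "a:2": ["Inflation"]}
vocab_b = {"b:7": ["labor-market"], "b:8": ["Unemployment"]}

mappings = candidate_mappings(vocab_a, vocab_b)
print(mappings)
```

Exact label matching misses synonyms and differences in scope, so its output is best treated as a list of suggestions for manual confirmation.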

VIAF (Virtual International Authority File)
• Combines multiple name authority files (http://viaf.org/)
• Lowers costs and increases the utility of library authority files
• Matching and linking widely-used authority files and making that information available on the Web

Subject Indexing in
• Auto-completion suggests terms from PND, STW, …
• Author confirms by selecting terms
• Keyword is matched with the semantic concept
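
The three steps above (suggest, confirm, match to a concept) can be sketched as a prefix lookup over thesaurus labels, where each label carries the identifier of its concept. This is a minimal sketch under assumptions, not the code of the actual system. The labels and the `ex:` concept IDs are hypothetical.

```python
import bisect


class Autocomplete:
    """Prefix search over labels; each suggestion carries its concept ID."""

    def __init__(self, label_to_concept):
        # Map lower-cased label -> (original label, concept ID).
        self._entries = {
            label.lower(): (label, concept) for label, concept in label_to_concept.items()
        }
        self._keys = sorted(self._entries)

    def suggest(self, prefix, limit=5):
        """Return up to `limit` (label, concept ID) pairs whose label starts with `prefix`."""
        needle = prefix.lower()
        start = bisect.bisect_left(self._keys, needle)
        results = []
        for key in self._keys[start:]:
            if not key.startswith(needle) or len(results) == limit:
                break
            results.append(self._entries[key])
        return results


# Hypothetical thesaurus excerpt.
index = Autocomplete({
    "Inflation": "ex:c1",
    "Inflation targeting": "ex:c2",
    "Innovation": "ex:c3",
})

suggestions = index.suggest("infl")
print(suggestions)
```

When the author picks a suggestion, the stored value is the concept ID rather than the typed string, which is what links the keyword to the semantic concept.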

4. Automated Indexing in GERHARD

• ~1 million web documents
• ~10,000 concepts from UDC
• 3 languages (EN, DE, FR)

4. Automated Indexing at ZBW
• 1.6 million documents with STW annotations in LOD
• Average of 5 descriptors per document
• Multi-labeling scientific documents using a kNN classifier with entity detection and the HITS algorithm
• Experiments over 62,000 open access documents
• Avg. recall of 40% and precision of 40%
• Outperforms today's approaches such as Maui
• But: does not require expensive training phases
• Integrate automatic classification methods in a semi-automatic workflow (“human in the loop”)
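
To show what kNN multi-labeling and the reported measures mean in practice, the sketch below assigns descriptors to a new document by letting its k most similar annotated documents vote, and then computes precision and recall against the descriptors a human indexer chose. It is a simplified illustration only: it uses plain bag-of-words cosine similarity and leaves out the entity detection and HITS steps of the ZBW approach. The documents, descriptors, and the `k` and `min_votes` parameters are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_labels(text, training, k=2, min_votes=2):
    """Assign every label that at least `min_votes` of the k nearest documents carry."""
    query = Counter(text.lower().split())
    ranked = sorted(
        training,
        key=lambda doc: cosine(query, Counter(doc[0].lower().split())),
        reverse=True,
    )
    votes = Counter(label for _, labels in ranked[:k] for label in labels)
    return {label for label, n in votes.items() if n >= min_votes}


def precision_recall(predicted, gold):
    """Precision: share of predicted labels that are correct. Recall: share of gold labels found."""
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical annotated documents: (text, descriptors).
training = [
    ("inflation and monetary policy in the euro area", {"Inflation", "Monetary policy"}),
    ("central bank monetary policy and interest rates", {"Monetary policy", "Interest rate"}),
    ("labour market reforms and unemployment", {"Labour market", "Unemployment"}),
]

predicted = knn_labels("monetary policy and inflation expectations", training)
gold = {"Inflation", "Monetary policy"}
precision, recall = precision_recall(predicted, gold)
print(predicted, precision, recall)
```

In this toy run only one of the two gold descriptors is predicted, so precision is high and recall is lower; raising `k` or lowering `min_votes` trades one against the other.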

6. Data Provenance
• VIAF: inter-organizational, cross-border and thus cross-lingual record linkage
• Records may come from different libraries
• But how to …
  • track metadata (re)use?
  • refer to original metadata when library A uses (part of) a record from library B?

• Digitally signing and publishing metadata as LOD
• Allows building a network of trust
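
The idea of signing metadata can be sketched with the Python standard library: the record is serialized in a canonical form, a signature is computed over those bytes, and any later change to the record makes verification fail. Note the assumption: the sketch uses a keyed hash (HMAC-SHA256) with a shared secret as a stand-in, because the standard library has no public-key signatures. A real network of trust would need asymmetric signatures and a canonicalization defined for RDF. The record and the key are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(record):
    """Serialize a record deterministically so equal records yield equal bytes."""
    return json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sign(record, key):
    """Return a hex HMAC-SHA256 tag over the canonical form (stand-in for a signature)."""
    return hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()


def verify(record, signature, key):
    """Check the tag in constant time."""
    return hmac.compare_digest(sign(record, key), signature)


# Hypothetical record published by library B, and a hypothetical key.
key = b"library-b-demo-key"
record = {"title": "An Example Title", "creator": "Jane Doe", "source": "Library B"}
signature = sign(record, key)

reused = dict(record)                           # library A reuses the record unchanged
altered = dict(record, creator="Someone Else")  # or changes part of it

print(verify(reused, signature, key), verify(altered, signature, key))
```

With this in place, library A can state that it reuses a record from library B and attach the original signature, so others can check which parts are still the original metadata.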

Summary
• Libraries as innovation driver for Linked Open Data
• Interesting research topics for computer science
• Both data and expertise are available

• Present of representing metadata and record linkage

Got interested? Contact me:
Ansgar Scherp
Email: a.scherp@zbw.eu
Web: http://zbw.eu/en/research/knowledge-discovery
