linked data in the digital humanities skills workshop for realising the opportunities of the...

Post on 11-May-2015

8.967 Views

Category:

Technology

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

This workshop will introduce participants to Linked Data, a key semantic web technology, and its uses in the digital humanities. Through examples of Linked Data websites and applications, we will explore how Linked Data is being used by individual digital humanities scholars, by organisations such as the BBC and the Central Statistics Office, and by cultural heritage institutions worldwide. We will make comparisons to other approaches to structuring data (including markup and metadata approaches such as TEI and XML) and discuss best practices for creating and reusing Linked Data (such as the importance of identifiers and standard vocabularies). Participants will also be introduced to tools for creating and exploring Linked Data. The workshop will also include a hands-on exercise in creating Linked Data. Linked Data in the Digital Humanities was a Skills Workshop http://dri.ie/skills-workshops part of Realising the Opportunities of Digital Humanities http://dri.ie/realising-opportunities-digital-humanities Presenters: Jodi Schneider and Michael Hausenblas with support from Stefan Decker, Nuno Lopes, and Bahareh Heravi all of the Digital Enterprise Research Institute, National University of Ireland Galway

TRANSCRIPT

Copyright 2011 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Linked Data in the Digital Humanities

Jodi Schneider & Michael Hausenblaswith Stefan Decker & Nuno Lopes

Thursday 25th October 2012

1

Realising the Opportunities of Digital HumanitiesNational University of Ireland, Maynooth

(2) 2

Introduction

(3) 3

What is Linked Data? Why use it? What are some examples?

How do Linked Data applications differ from conventional ones?

How is Linked Data different from other structured data used in digital humanities? (e.g. TEI, XML)

What are the best practices for creating Linked Data?

Objectives

(4) 4

Using identifiers to enable access to add structure to link to other stuff

What is Linked Data?

(5) 5

Why use Linked Data?

We do not want data silos!

Photo credit “nepatterson”, Flickr

(7) 7

A “Web” where documents are available for download on the Internet but there would be no hyperlinks among them

Imagine…

Slide credit: Ivan Herman

(10) 10

We need a proper infrastructure for a real Web of Data data is available on the Web

• accessible via standard Web technologies data are interlinked over the Web ie, data can be integrated over the Web

We need Linked Data

Data on the Web is not enough…

Slide credit: Ivan Herman

This is what we want!

Photo credit “kxlly”, Flickr

(12) 12

Where Linked Data is used

Mass Media BBC New York Times Guardian

Scholarly Publishers Nature CrossRef

Data Publishers USData.gov Data.gov.uk Central Statistics Office

Libraries

(13) 13

Libraries where Linked Data is used

(15) 15

Cost and benefits of Open Data – 5 ★ Open Data

(16) 16

http://5stardata.info

(17) 17

How do Linked Data applications differ from conventional ones?

DBpediarecord

data.gov.ierecord

British Libraryrecord

Europeanarecord

core record

apprecordsapp

conventional back-end Linked Data back-end

DBpediarecord

data.gov.ierecord

British Libraryrecord

Europeanarecord

core record

apprecordsapp

conventional back-end Linked Data back-end

up to 80% re-use possible

time & cost reduction!

(20) 20

Examples of Linked Data applicationswith Irish data …

http://county-rank.data-gov.ie

http://county-rank.data-gov.ie

http://school-explorer.data-gov.ie

(24) 24

Anatomy of the Authors N' Books application

http://srvgal85.deri.ie/ab-app/

Dydra

DBpedia

Europeana

user

DERI server-sideclient-side other server

ab-proxy.py(120 LOC)

ab-app.js(181 LOC)

(27) 27

What Would You Use Linked Data for?

(28) 28

How is Linked Data different from XML & TEI

(29) 29

Using identifiers to enable access to add structure to link to other stuff

What is Linked Data?

(30) 30

XML as a tree

Document

Paragraph

Sentence Sentence Sentence

Paragraph

Sentence Sentence

Remember, everything must nest properly!

We use family tree terms: parent, child, sibling, ancestor, and descendent.

Slide credit: Susan Schreibman

(31) 31

RDF: More than trees

(32) 32

What does TEI make explicit?

structural divisions within a texttitle-page, chapter, scene, stanza, line, etc

typographical elementschanges in typeface, special characters, etc

other textual featuresgrammatical structures, location of illustrations, variant

forms, etc

Slide credit: Susan Schreibman

(33) 33

What does TEI make explicit?

structural divisions within a texttitle-page, chapter, scene, stanza, line, etc

typographical elementschanges in typeface, special characters, etc

other textual featuresgrammatical structures, location of illustrations, variant

forms, etc

Slide credit: Susan Schreibman

(34) 34

Best Practice 1: Identifiers

(35) 35

Using identifiers to enable access to add structure to link to other stuff

What is Linked Data?

(36) 36

http://www.youtube.com/watch?v=TJfrNo3Z-DU

Why do we need identifiers?

(37) 37

(38) 38

(39) 39

Language ambiguity

dog =

Slide Credit: Karen Coyle

(40) 40

Language ambiguity

dog (lang=en)

hund (lang=de)

perro (lang=sp)

chien (lang=fr)

IDa87nn3

Slide Credit: Karen Coyle

(41) 41

UPC code

Slide Credit: Karen Coyle

(42) 42

ISBN

Slide Credit: Karen Coyle

(43) 43

URIs as Identifiers

Slide Credit: Michael Hausenblas

A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. [RFC3986]

Syntax URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

Example foo://example.com:8042/over/there?name=ferret#nose \_/ \_________________/\____/ \_______________/ \____/ | | | | | scheme authority path query fragment

(44) 44

Example: DBPedia

http://dbpedia.org/page/Maynooth

(45) 45

Example: DBPedia

http://dbpedia.org/page/Maynooth

(46) 46

Using Linked Data 1: Querying & Filtering

(47) 47

Using identifiers to enable access to add structure to link to other stuff

What is Linked Data?

(49) 49

Linked Data Application Revisited

(50) 50

One Query

(51) 51

Best Practice 2: Vocabularies

(52) 52

Dublin Core

Title Creator Date Subject Contributor Coverage Description Format Identifier Publisher Relation Rights Source Type

(53) 53

Friend of a Friend (FOAF)

Slide credit: Dan Brickley

(54) 54

Friend of a Friend (FOAF)

Slide credit: Dan Brickley

(55) 55

Friend of a Friend (FOAF)

Slide credit: Dan Brickley

(56) 56

Friend of a Friend (FOAF)

Slide credit: Dan Brickley

(57) 57

Friend of a Friend (FOAF)

Slide credit: Dan Brickley

(58) 58

Friend of a Friend (FOAF)

Slide credit: Dan Brickley

(59) 59

Friend of a Friend (FOAF)

Slide credit: Dan Brickley

(60) 60

Which vocabularies might you use? What might you need vocabularies to

represent?

There are *many* vocabularies!

(61) 61

Quick Recap

(62) 62

What is Linked Data? Why use it? What are some examples?

Objectives

(63) 63

Using identifiers to enable access to add structure to link to other stuff

What is Linked Data?

We do not want data silos!

Photo credit “nepatterson”, Flickr

This is what we want!

Photo credit “kxlly”, Flickr

(66) 66

http://5stardata.info

(67) 67

What is Linked Data? Why use it? What are some examples?

How do Linked Data applications differ from conventional ones?

Objectives

DBpediarecord

data.gov.ierecord

British Libraryrecord

Europeanarecord

core record

apprecordsapp

conventional back-end Linked Data back-end

up to 80% re-use possible

time & cost reduction!

(69) 69

What is Linked Data? Why use it? What are some examples?

How do Linked Data applications differ from conventional ones?

How is Linked Data different from other structured data used in digital humanities? (e.g. TEI, XML)

Objectives

(70) 70

RDF: More than trees

(71) 71

What is Linked Data? Why use it? What are some examples?

How do Linked Data applications differ from conventional ones?

How is Linked Data different from other structured data used in digital humanities? (e.g. TEI, XML)

What are the best practices for creating Linked Data?

Objectives

(72) 72

What is Linked Data? Why use it? What are some examples?

How do Linked Data applications differ from conventional ones?

How is Linked Data different from other structured data used in digital humanities? (e.g. TEI, XML)

What are the best practices for creating Linked Data?

Objectives

(73) 73

Best Practice 1: Identifiers

(74) 74

Need Identifiers!

dog =

Slide Credit: Karen Coyle

(75) 75

Example: DBPedia

http://dbpedia.org/page/Maynooth

(76) 76

Best Practice 2: Vocabularies

(77) 77

Dublin Core

Title Creator Date Subject Contributor Coverage Description Format Identifier Publisher Relation Rights Source Type

(78) 78

Best Practice 3: Connect to the community

(79) 79

79

(80) 80

(81) 81

(82) 82

(83) 83

(84) 84

(85) 85

(86) 86

Question Time!

(87) 87

Acknowledgements

Copyright 2011 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Thanks to our funders!

(89) 89

Thanks for slides!

Dan BrickleyKaren CoyleSiegfried HandschuhMichael HausenblasIvan HermanJodi SchneiderSusan Schreibman

(90) 90

Appendix & Backup Slides

(91) 91

Additional Examples

(94) 94

(95) 95

Advanced Topics

(96) 96

Demo on Queries

(97) 97

Making Linked Data 1: Convert Existing Data

(99) 99

Using Linked Data 2: Data Integration

AKA “An introduction to the Semantic Web (Through an Example)” by Ivan Herman

(100) 100

Map the various data onto an abstract data representation make the data independent of its internal

representation… Merge the resulting representations Start making queries on the whole!

queries not possible on the individual data sets

The rough structure of data integration

(102) 102

A simplified bookstore data (dataset “A”)

ISBN Author Title Publisher Year

0006511409X id_xyz The Glass Palace id_qpr 2000

ID Name Homepage

id_xyz Ghosh, Amitav http://www.amitavghosh.com

ID Publisher’s name City

id_qpr Harper Collins London

(103) 103

1st: export your data as a set of relations

http://…isbn/000651409X

Ghosh, Amitav http://www.amitavghosh.com

The Glass Palace

2000

London

Harper Collins

a:title

a:year

a:city

a:p_name

a:namea:homepage

a:authora:publisher

(104) 104

Relations form a graph the nodes refer to the “real” data or contain some

literal how the graph is represented in machine is immaterial

for now

Some notes on the exporting the data

(106) 106

Another bookstore data (dataset “F”)

A B C D

1 ID Titre Traducteur Original2 ISBN 2020286682 Le Palais des Miroirs $A12$ ISBN 0-00-6511409-X3

4

5

6 ID Auteur7 ISBN 0-00-6511409-X $A11$

8

9

10 Nom11 Ghosh, Amitav12 Besse, Christianne

(107) 107

2nd: export your second set of data

http://…isbn/000651409X

Ghosh, Amitav

Besse, Christianne

Le palais des miroirsf:original

f:nom

f:traducteur

f:auteurf:ti

tre

http://…isbn/2020386682

f:nom

(108) 108

3rd: start merging your data

http://…isbn/000651409X

Ghosh, Amitav

Besse, Christianne

Le palais des miroirs

f:original

f:nom

f:traducteur

f:auteur f:titre

http://…isbn/2020386682

f:nom

http://…isbn/000651409X

Ghosh, Amitav

http://www.amitavghosh.com

The Glass Palace

2000

London

Harper Collins

a:title

a:year

a:city

a:p_name

a:namea:homepage

a:author

a:publisher

(109) 109

3rd: start merging your data (cont)

http://…isbn/000651409X

Ghosh, Amitav

Besse, Christianne

Le palais des miroirs

f:origina

l

f:nom

f:traducteur

f:auteur f:titre

http://…isbn/2020386682

f:nom

http://…isbn/000651409X

Ghosh, Amitav

http://www.amitavghosh.com

The Glass Palace

2000

London

Harper Collins

a:title

a:year

a:city

a:p_name

a:namea:homepage

a:author

a:publisher

Same URI!

(110) 110

3rd: start merging your dataa:title

Ghosh, Amitav

Besse, Christianne

Le palais des miroirs

f:original

f:nom

f:traducteur

f:auteur

f:titre

http://…isbn/2020386682

f:nom

Ghosh, Amitav

http://www.amitavghosh.com

The Glass Palace

2000

London

Harper Collins

a:year

a:city

a:p_name

a:namea:homepage

a:author

a:publisher

http://…isbn/000651409X

(111) 111

User of data “F” can now ask queries like: “give me the title of the original”

• well, … « donnes-moi le titre de l’original »

This information is not in the dataset “F”… …but can be retrieved by merging with dataset

“A”!

Start making queries…

(112) 112

We “feel” that a:author and f:auteur should be the same

But an automatic merge doest not know that! Let us add some extra information to the

merged data: a:author same as f:auteur both identify a “Person” a term that a community may have already defined:

• a “Person” is uniquely identified by his/her name and, say, homepage

• it can be used as a “category” for certain type of resources

However, more can be achieved…

(113) 113

3rd revisited: use the extra knowledge

Besse, Christianne

Le palais des miroirsf:original

f:nom

f:traducteur

f:auteur

f:titre

http://…isbn/2020386682

f:nom

Ghosh, Amitav

http://www.amitavghosh.com

The Glass Palace

2000

London

Harper Collins

a:title

a:year

a:city

a:p_name

a:namea:homepage

a:author

a:publisher

http://…isbn/000651409X

http://…foaf/Personr:type

r:type

(114) 114

User of dataset “F” can now query: “donnes-moi la page d’accueil de l’auteur de l’original”

• well… “give me the home page of the original’s ‘auteur’”

The information is not in datasets “F” or “A”… …but was made available by:

merging datasets “A” and datasets “F” adding three simple extra statements as an extra

“glue”

Start making richer queries!

(115) 115

Using, e.g., the “Person”, the dataset can be combined with other sources

For example, data in Wikipedia can be extracted using dedicated tools e.g., the “dbpedia” project can extract the “infobox”

information from Wikipedia already…

Combine with different datasets

(116) 116

Merge with Wikipedia data

Besse, Christianne

Le palais des miroirsf:original

f:nom

f:traducteur

f:auteur

f:titre

http://…isbn/2020386682

f:nom

Ghosh, Amitav http://www.amitavghosh.com

The Glass Palace

2000

London

Harper Collins

a:title

a:year

a:city

a:p_name

a:namea:homepage

a:author

a:publisher

http://…isbn/000651409X

http://…foaf/Personr:type

r:type

http://dbpedia.org/../Amitav_Ghosh

r:type

foaf:name w:reference

(117) 117

Merge with Wikipedia data

Besse, Christianne

Le palais des miroirsf:original

f:nom

f:traducteur

f:auteur

f:titre

http://…isbn/2020386682

f:nom

Ghosh, Amitav http://www.amitavghosh.com

The Glass Palace

2000

London

Harper Collins

a:title

a:year

a:city

a:p_name

a:namea:homepage

a:author

a:publisher

http://…isbn/000651409X

http://…foaf/Personr:type

r:type

http://dbpedia.org/../Amitav_Ghosh

http://dbpedia.org/../The_Hungry_Tide

http://dbpedia.org/../The_Calcutta_Chromosome

http://dbpedia.org/../The_Glass_Palace

r:type

foaf:name w:reference

w:author_of

w:author_of

w:author_of

w:isbn

(118) 118

Merge with Wikipedia data

Besse, Christianne

Le palais des miroirsf:original

f:nom

f:traducteur

f:auteur

f:titre

http://…isbn/2020386682

f:nom

Ghosh, Amitav http://www.amitavghosh.com

The Glass Palace

2000

London

Harper Collins

a:title

a:year

a:city

a:p_name

a:namea:homepage

a:author

a:publisher

http://…isbn/000651409X

http://…foaf/Personr:type

r:type

http://dbpedia.org/../Amitav_Ghosh

http://dbpedia.org/../The_Hungry_Tide

http://dbpedia.org/../The_Calcutta_Chromosome

http://dbpedia.org/../Kolkata

http://dbpedia.org/../The_Glass_Palace

r:type

foaf:name w:reference

w:author_of

w:author_of

w:author_of

w:born_in

w:isbn

w:long w:lat

(119) 119

It may look like it but, in fact, it should not be… What happened via automatic means is done

every day by Web users! The difference: a bit of extra rigour so that

machines could do this, too

Is that surprising?

(120) 120

We could add extra knowledge to the merged datasets e.g., a full classification of various types of library data geographical information etc.

This is where ontologies, extra rules, etc, come in ontologies/rule sets can be relatively simple and small,

or huge, or anything in between… Even more powerful queries can be asked as a

result

It could become even more powerful

(121) 121

What did we do?

Data in various formats

Data represented in abstract format

Applications

Map,Expose,…

ManipulateQuery…

(122) 122

Thanks!

top related