Data-mining the Semantic Web @TCD

Data-mining the Semantic Web and spatially visualising the results. DAH workshop, Trinity College Dublin, 27 May 2015

Upload: frank-lynam

Post on 28-Jul-2015






Page 1: Data-mining the Semantic Web @TCD

Data-mining the Semantic Web and spatially visualising the results
DAH workshop
Trinity College Dublin, 27 May 2015

Page 2: Data-mining the Semantic Web @TCD

2 of 63 | @flynam @bilusaurus | Data-mining the Semantic Web and spatially visualising the results | DAH workshop

Workshop overview

• Morning session: Data-mining
  – Open Data
  – Linked Data
  – Linked Open Data implementation
  – Semantic Web and ontologies
  – Hands-on practical exercises

Page 3: Data-mining the Semantic Web @TCD

Workshop overview

• Afternoon session: Data visualisation
  – Data visualisation concepts introduction
  – Web maps and geo-tagging
  – Hands-on practical
  – Interpretations
  – Hermeneutic circle

Page 4: Data-mining the Semantic Web @TCD

But first, a very quick survey

• Your occupation
  – UG student
  – PG student
  – Professional academic
  – Non-academic

Page 5: Data-mining the Semantic Web @TCD

A quick survey

• Your age group
  – Under 16
  – 16-24
  – 25-34
  – 35-44
  – 45-54
  – 55 and over

Page 6: Data-mining the Semantic Web @TCD

A quick survey

• How familiar are you with Open Access?
  – 1 – Not familiar at all
  – 2
  – 3
  – 4
  – 5 – Very familiar

Page 7: Data-mining the Semantic Web @TCD

A quick survey

• How familiar are you with Open Data?
  – 1 – Not familiar at all
  – 2
  – 3
  – 4
  – 5 – Very familiar

Page 8: Data-mining the Semantic Web @TCD

A quick survey

• How familiar are you with Linked Data?
  – 1 – Not familiar at all
  – 2
  – 3
  – 4
  – 5 – Very familiar

Page 9: Data-mining the Semantic Web @TCD

A quick survey

• How familiar are you with the Semantic Web?
  – 1 – Not familiar at all
  – 2
  – 3
  – 4
  – 5 – Very familiar

Page 10: Data-mining the Semantic Web @TCD

A quick survey

• Have you ever published Open Data?
  – Yes
  – No

Page 11: Data-mining the Semantic Web @TCD

A quick survey

• Have you ever consumed Linked Open Data services?
  – Yes
  – No

Page 12: Data-mining the Semantic Web @TCD

A quick survey

• Please fill in your…
  – Name
  – Email address

Don’t worry – I’m not going to pass them on to anyone

Page 13: Data-mining the Semantic Web @TCD

From the horse’s mouth


Page 14: Data-mining the Semantic Web @TCD

Page 15: Data-mining the Semantic Web @TCD

Terminology

Open Access
Open Data
Big Data
The web of data
The Semantic Web
Linked Data
Data mining

Page 16: Data-mining the Semantic Web @TCD

Asking questions of digital datasets


Page 17: Data-mining the Semantic Web @TCD

Open Access


Page 18: Data-mining the Semantic Web @TCD

Design by Julie Beck for the Harvard University Neuroinformatics dept (source:

Page 19: Data-mining the Semantic Web @TCD

Terminology: Linked Data

The linkages between the major Linked Data datasets (source:

Page 20: Data-mining the Semantic Web @TCD

Terminology: Big Data

Wordle of terms associated with Big Data activity (source:

Page 21: Data-mining the Semantic Web @TCD

5 Stars of Open Data

put your data online under an open license

make it structured (e.g. as an Excel file)

use non-proprietary formats (e.g. XML and not Excel)

use URIs to identify resources

link your data to external datasets

Page 22: Data-mining the Semantic Web @TCD

The RDF Triple

Page 23: Data-mining the Semantic Web @TCD

A Triple Example

‘…the boy’s name is Tom…’




Page 24: Data-mining the Semantic Web @TCD

Triple Linking

‘…Tom is short for Thomas…’
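
The two example statements can be sketched as subject, predicate, object tuples to show how triples link: the object of one triple becomes the subject of the next. This is only an illustration in plain Python. The identifiers ("boy", "name", "shortFor") are made-up placeholders rather than real URIs, and a real RDF model would use a resource, not a bare name string, as the linking node.

```python
# Each triple is a (subject, predicate, object) tuple.
# The identifiers are illustrative placeholders, not URIs from the slides.
triples = [
    ("boy", "name", "Tom"),          # '...the boy's name is Tom...'
    ("Tom", "shortFor", "Thomas"),   # '...Tom is short for Thomas...'
]

def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def follow(subject, *predicates):
    """Follow a chain of predicates from a subject, one hop per predicate."""
    current = [subject]
    for predicate in predicates:
        current = [o for node in current for o in objects(node, predicate)]
    return current

print(follow("boy", "name"))              # ['Tom']
print(follow("boy", "name", "shortFor"))  # ['Thomas']
```

Following two predicates in a row is the same idea as walking two edges of the graph.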




Page 25: Data-mining the Semantic Web @TCD

Graph data

Page 26: Data-mining the Semantic Web @TCD

Serialising RDF

• Turtle



• N-Triples

Page 27: Data-mining the Semantic Web @TCD

RDF Turtle

@base <> .
@prefix rdf: <> .
@prefix rdfs: <> .
@prefix foaf: <> .
@prefix rel: <> .

<green-goblin>
    rel:enemyOf <spiderman> ;
    a foaf:Person ;    # in the context of the Marvel universe
    foaf:name "Green Goblin" .

<spiderman>
    rel:enemyOf <green-goblin> ;
    a foaf:Person ;
    foaf:name "Spiderman", "Человек-паук"@ru .




Page 28: Data-mining the Semantic Web @TCD

As N-Triples

<> <> <> .
<> <> <> .
<> <> "Green Goblin" .
<> <> <> .
<> <> <> .
<> <> "Spiderman" .
<> <> "\u0427\u0435\u043B\u043E\u0432\u0435\u043A-\u043F\u0430\u0443\u043A"@ru .
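
As a rough sketch of what the N-Triples form involves, the Python below writes one triple per line with full URIs and \u escapes for non-ASCII characters. The base URI http://example.org/ is an assumption standing in for the URIs missing from the slide, and the helper names (iri, literal, ntriple) are hypothetical. Only the FOAF namespace is a real vocabulary URI. N-Triples 1.1 also accepts raw UTF-8, so escaping here is a choice, not a requirement.

```python
BASE = "http://example.org/"  # hypothetical base URI, not taken from the slides

def iri(value):
    """Wrap a full URI in angle brackets."""
    return "<" + value + ">"

def literal(text, lang=None):
    """Format a plain or language-tagged literal, escaping as N-Triples allows."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif code > 0xFFFF:
            out.append("\\U%08X" % code)
        elif code > 0x7F:
            out.append("\\u%04X" % code)
        else:
            out.append(ch)
    quoted = '"' + "".join(out) + '"'
    return quoted + "@" + lang if lang else quoted

def ntriple(subject, predicate, obj):
    """Join three already-formatted terms into one N-Triples line."""
    return "%s %s %s ." % (subject, predicate, obj)

name = iri("http://xmlns.com/foaf/0.1/name")
line = ntriple(iri(BASE + "spiderman"), name, literal("Человек-паук", "ru"))
print(line)
```

Each Cyrillic letter becomes a single \u escape of its code point, which matches the escapes in the JSON serialisation of the same data.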

Page 29: Data-mining the Semantic Web @TCD

As JSON

{
  "http:\/\/\/green-goblin": {
    "http:\/\/\/schemas\/relationship\/enemyOf": [
      {"type": "uri", "value": "http:\/\/\/spiderman"}
    ],
    "http:\/\/\/1999\/02\/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http:\/\/\/foaf\/0.1\/Person"}
    ],
    "http:\/\/\/foaf\/0.1\/name": [
      {"type": "literal", "value": "Green Goblin"}
    ]
  },
  "http:\/\/\/spiderman": {
    "http:\/\/\/schemas\/relationship\/enemyOf": [
      {"type": "uri", "value": "http:\/\/\/green-goblin"}
    ],
    "http:\/\/\/1999\/02\/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http:\/\/\/foaf\/0.1\/Person"}
    ],
    "http:\/\/\/foaf\/0.1\/name": [
      {"type": "literal", "value": "Spiderman"},
      {"type": "literal", "value": "\u0427\u0435\u043b\u043e\u0432\u0435\u043a-\u043f\u0430\u0443\u043a", "lang": "ru"}
    ]
  }
}
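
The nested layout above (subject, then predicate, then a list of object records) can be flattened back into triples with a few lines of Python. This is a sketch only: the sample document is a cut-down stand-in, http://example.org/ and the enemyOf URI are placeholders for the URIs missing from the slide, and flatten is a hypothetical helper name.

```python
import json

# Cut-down sample in the same subject -> predicate -> [objects] shape.
# All example.org URIs are placeholders; only the FOAF namespace is real.
SAMPLE = """
{
  "http://example.org/spiderman": {
    "http://example.org/enemyOf": [
      {"type": "uri", "value": "http://example.org/green-goblin"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Spiderman"},
      {"type": "literal", "value": "Человек-паук", "lang": "ru"}
    ]
  }
}
"""

def flatten(rdf_json):
    """Yield (subject, predicate, value, lang) for every object record."""
    for subject, predicates in rdf_json.items():
        for predicate, records in predicates.items():
            for record in records:
                yield (subject, predicate, record["value"], record.get("lang"))

doc = json.loads(SAMPLE)
for row in flatten(doc):
    print(row)
```

One subject with three object records yields three rows, which is the same count as the corresponding N-Triples lines would have.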

Page 30: Data-mining the Semantic Web @TCD

As RDF/XML

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="" xmlns:foaf="" xmlns:ns0="">

  <foaf:Person rdf:about="">
    <ns0:enemyOf>
      <foaf:Person rdf:about="">
        <ns0:enemyOf rdf:resource=""/>
        <foaf:name>Spiderman</foaf:name>
        <foaf:name xml:lang="ru">Человек-паук</foaf:name>
      </foaf:Person>
    </ns0:enemyOf>

    <foaf:name>Green Goblin</foaf:name>
  </foaf:Person>

</rdf:RDF>


Page 31: Data-mining the Semantic Web @TCD

Visualised as a Graph

Page 32: Data-mining the Semantic Web @TCD


Infrastructure

A server farm (source:

Page 33: Data-mining the Semantic Web @TCD

Practical: Making RDF

Q: Create RDF representations of yourself and your relationships

Page 34: Data-mining the Semantic Web @TCD

The Semantic Web and Ontologies

The stages of the Web (source:

Page 35: Data-mining the Semantic Web @TCD

Ontological Classes and Properties

Page 36: Data-mining the Semantic Web @TCD

The British Museum data mapping onto the CIDOC CRM (source:

Page 37: Data-mining the Semantic Web @TCD

The CIDOC CRM basic entity types and their relationships (source:

Page 38: Data-mining the Semantic Web @TCD


Page 39: Data-mining the Semantic Web @TCD

Graph data

Page 40: Data-mining the Semantic Web @TCD

Minna Sundberg (source:

Page 41: Data-mining the Semantic Web @TCD

Querying using SPARQL


SELECT * WHERE { ?s ?p ?o } LIMIT 10
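
A query like this is normally sent to an endpoint over HTTP, with the query text carried in a `query` parameter as the SPARQL Protocol describes. The sketch below only builds the request URL and sends nothing. The endpoint address is a hypothetical placeholder, the query text assumes the slide shows the common "select everything" pattern, and query_url is an illustrative helper name.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint, not from the slides

# Assumed full form of the basic query: any ten triples in the store.
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

def query_url(endpoint, query):
    """Build a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})

url = query_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the original query text.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Pasting the same query into an endpoint's web form does the equivalent encoding behind the scenes.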

Page 42: Data-mining the Semantic Web @TCD

More complex SPARQL

PREFIX rdf: <>
PREFIX letters1916: <>
PREFIX letters1916data: <>
PREFIX schema: <>

SELECT DISTINCT ?letter ?letterName ?recipientPostalAddressName ?recipientLongitude ?recipientLatitude
WHERE {
  ?letter rdf:type letters1916:Letter ;
          schema:name ?letterName ;
          letters1916:recipientLocation ?recipientPostalAddress .

  ?recipientPostalAddress schema:addressRegion ?recipientPostalAddressRegion ;
                          schema:name ?recipientPostalAddressName .
  FILTER regex(?recipientPostalAddressRegion, 'Galway', 'i')

  ?recipientPlace schema:address ?recipientPostalAddress ;
                  schema:geo ?recipientGeoCoordinates .

  ?recipientGeoCoordinates schema:longitude ?recipientLongitude ;
                           schema:latitude ?recipientLatitude .
}





Page 43: Data-mining the Semantic Web @TCD

Practical: Universities on DBpedia

Q: Get a list of all of the universities that DBpedia knows about
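
One possible starting point for this practical, offered as a sketch and not as the workshop's own answer: it assumes universities are typed with the DBpedia ontology class dbo:University and that the public endpoint is https://dbpedia.org/sparql. The LIMIT keeps the first run small and can be raised or removed. The code prepares the HTTP request but does not send it, and build_request is an illustrative helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Assumption: universities are instances of dbo:University.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?university ?label WHERE {
  ?university a dbo:University ;
              rdfs:label ?label .
  FILTER langMatches(lang(?label), "en")
}
LIMIT 100
"""

def build_request(endpoint, query):
    """Prepare (but do not send) a GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

request = build_request(DBPEDIA_ENDPOINT, QUERY)
print(request.get_method())
print(request.full_url)
```

Passing the prepared request to urllib.request.urlopen would run the query. That step is left out here so the sketch works offline.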

Page 44: Data-mining the Semantic Web @TCD


@prefix dct: <> .
@prefix skos: <> .
@prefix cc: <> .

<> a skos:Concept ;
    cc:license <> ;
    cc:attributionURL <> ;
    cc:attributionName "" ;
    skos:inScheme <> ;
    skos:prefLabel "Jar" ;
    skos:scopeNote "A jar concept. Pottery. This isn’t a great scope note." ;
    dct:publisher <> ;
    dct:identifier <> ;
    dct:issued "2015-02-23"^^xsd:date ;
    skos:exactMatch <> .

Page 45: Data-mining the Semantic Web @TCD


SELECT * WHERE {
  ?s rdfs:label ?label .
  FILTER langMatches(lang(?label), "en")
}

Page 46: Data-mining the Semantic Web @TCD


SELECT * WHERE {
  ?s rdfs:label ?label .
  FILTER langMatches(lang(?label), "en") .
  FILTER regex(?label, "bell", "i")
}
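
To make the two filters concrete, here is a rough Python analogue run over a handful of in-memory labels. The label data is invented for illustration, and lang_matches is a simplified stand-in for SPARQL's langMatches (basic language-range matching), not a full implementation.

```python
import re

# (label, language tag) pairs; illustrative data, not from any endpoint.
labels = [
    ("Bell", "en"),
    ("Church bell", "en-GB"),
    ("Cloche", "fr"),
    ("Jar", "en"),
]

def lang_matches(tag, lang_range):
    """Basic range match: 'en' matches 'en' and 'en-GB'; '*' matches any tag."""
    if lang_range == "*":
        return tag != ""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")

# First filter: keep English labels (any regional variant).
english = [(label, tag) for label, tag in labels if lang_matches(tag, "en")]

# Second filter: case-insensitive substring match, like regex(?label, "bell", "i").
bells = [(label, tag) for label, tag in english
         if re.search("bell", label, re.IGNORECASE)]

print(bells)  # [('Bell', 'en'), ('Church bell', 'en-GB')]
```

Each FILTER narrows the result set further, so the order in which they are written does not change the final rows.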

Page 47: Data-mining the Semantic Web @TCD


SELECT * WHERE {
  ?s dct:dateCreated ?dateCreated .
  FILTER (?dateCreated > '1900-01-01')
}

Page 48: Data-mining the Semantic Web @TCD

Practical: British Museum Sarcophagi

Q: Get the find spots of all of the sarcophagi in the British Museum collection

SPARQL endpoint:

Page 49: Data-mining the Semantic Web @TCD

Practical: Archaeological stratigraphy

Q: Get the stratigraphic relationships between the contexts excavated at Priniatikos Pyrgos

SPARQL endpoint:

Page 50: Data-mining the Semantic Web @TCD

Stratigraphy explained (very briefly…)

Sample stratigraphic sequence (source:

Page 51: Data-mining the Semantic Web @TCD

The Priniatikos Pyrgos ontology

Page 52: Data-mining the Semantic Web @TCD

Practical: Archaeological stratigraphy

Q: Get the stratigraphic relationships between the contexts excavated at Priniatikos Pyrgos

SPARQL endpoint:

Hint: you will need to traverse 2 levels of the ontology’s hierarchy to get at the stratigraphy data

Page 53: Data-mining the Semantic Web @TCD

Practical: Nomisma and Ancient Coins

Q: Get the geo-coordinates of all of the coin hoards stored in the Nomisma triplestore

SPARQL endpoint:

Page 54: Data-mining the Semantic Web @TCD

Geo-coding the Find Spots with Google Refine

Page 55: Data-mining the Semantic Web @TCD

The Google Maps API

Address String

Geo-coordinates as JSON
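
The address-to-coordinates step can be sketched as pulling a latitude and longitude out of the geocoder's JSON reply. The sample reply below is hand-written in the general shape of a Google Geocoding API response (a status field plus a results list with geometry.location), with made-up values, and first_coordinates is a hypothetical helper name. No request is sent.

```python
import json

# Hand-written sample in the shape of a geocoding reply; the values are invented.
SAMPLE_RESPONSE = """
{
  "status": "OK",
  "results": [
    {
      "formatted_address": "Example address",
      "geometry": {"location": {"lat": 53.3438, "lng": -6.2546}}
    }
  ]
}
"""

def first_coordinates(response_text):
    """Return (lat, lng) from the first result, or None if nothing matched."""
    data = json.loads(response_text)
    if data.get("status") != "OK" or not data.get("results"):
        return None
    location = data["results"][0]["geometry"]["location"]
    return (location["lat"], location["lng"])

print(first_coordinates(SAMPLE_RESPONSE))  # (53.3438, -6.2546)
```

Returning None for an unmatched address makes it easy to spot find spots that still need manual geo-coding.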

Page 56: Data-mining the Semantic Web @TCD

Export as CSV

Page 57: Data-mining the Semantic Web @TCD

Practical: Getty Concepts

Q: Get all of the Getty URIs that represent concepts related to amphorae

SPARQL endpoint:

Page 58: Data-mining the Semantic Web @TCD

Additional Linked Data Resources

Page 59: Data-mining the Semantic Web @TCD

One final quick survey

• Please rank the practicals by how easy they were to complete (1 for hardest and 5 for easiest)
  – Making your FOAF profile
  – DBpedia universities
  – British Museum sarcophagi hunting
  – Getty vocabularies
  – Nomisma coin hoards

Page 60: Data-mining the Semantic Web @TCD

One final quick survey

• Would you consider publishing Linked Open Data in the future?
  – 1 – Absolutely not
  – 2
  – 3
  – 4
  – 5 – Definitely

Page 61: Data-mining the Semantic Web @TCD

One final quick survey

• Would you consider using Linked Open Data resources (using SPARQL or otherwise) in the future?
  – 1 – Absolutely not
  – 2
  – 3
  – 4
  – 5 – Definitely

Page 62: Data-mining the Semantic Web @TCD

One final quick survey

• Is Linked Open Data a feasible platform on which to undertake humanities research?
  – 1 – Absolutely not
  – 2
  – 3
  – 4
  – 5 – Definitely

Page 63: Data-mining the Semantic Web @TCD

One final quick survey

• Any final comments?

Page 64: Data-mining the Semantic Web @TCD

Thank you!

Martin Lemay (source: