Data-mining the Semantic Web @TCD

Data-mining the Semantic Web and spatially visualising the results DAH workshop Trinity College Dublin 27 May 2015

Upload: frank-lynam

Post on 28-Jul-2015


Page 1: Data-mining the Semantic Web @TCD

Data-mining the Semantic Web and spatially visualising the results
DAH workshop
Trinity College Dublin, 27 May 2015

Page 2: Data-mining the Semantic Web @TCD

2 of 63 | @flynam @bilusaurus | Data-mining the Semantic Web and spatially visualising the results | DAH workshop

Workshop overview

• Morning session: Data-mining
  – Open Data
  – Linked Data
  – Linked Open Data implementation
  – Semantic Web and ontologies
  – Hands-on practical exercises

Page 3: Data-mining the Semantic Web @TCD

Workshop overview

• Afternoon session: Data visualisation
  – Data visualisation concepts introduction
  – Web maps and geo-tagging
  – Hands-on practical
  – Interpretations
  – Hermeneutic circle

Page 4: Data-mining the Semantic Web @TCD

But first, a very quick survey

• Your occupation
  – UG student
  – PG student
  – Professional academic
  – Non-academic

Page 5: Data-mining the Semantic Web @TCD

A quick survey

• Your age group
  – Under 16
  – 16-24
  – 25-34
  – 35-44
  – 45-54
  – 55 and over

Page 6: Data-mining the Semantic Web @TCD

A quick survey

• How familiar are you with Open Access?
  – 1 – Not familiar at all
  – 2
  – 3
  – 4
  – 5 – Very familiar

Page 7: Data-mining the Semantic Web @TCD

A quick survey

• How familiar are you with Open Data?
  – 1 – Not familiar at all
  – 2
  – 3
  – 4
  – 5 – Very familiar

Page 8: Data-mining the Semantic Web @TCD

A quick survey

• How familiar are you with Linked Data?
  – 1 – Not familiar at all
  – 2
  – 3
  – 4
  – 5 – Very familiar

Page 9: Data-mining the Semantic Web @TCD

A quick survey

• How familiar are you with the Semantic Web?
  – 1 – Not familiar at all
  – 2
  – 3
  – 4
  – 5 – Very familiar

Page 10: Data-mining the Semantic Web @TCD

A quick survey

• Have you ever published Open Data?
  – Yes
  – No

Page 11: Data-mining the Semantic Web @TCD

A quick survey

• Have you ever consumed Linked Open Data services?
  – Yes
  – No

Page 12: Data-mining the Semantic Web @TCD

A quick survey

• Please fill in your…
  – Name
  – Email address

Don’t worry – I’m not going to pass them on to anyone

Page 13: Data-mining the Semantic Web @TCD

From the horse’s mouth

(source: www.ted.com/talks/tim_berners_lee_on_the_next_web)

Page 14: Data-mining the Semantic Web @TCD

Page 15: Data-mining the Semantic Web @TCD

Terminology

• Open Access
• Open Data
• Big Data
• The web of data
• The Semantic Web
• Linked Data
• Data mining

Page 16: Data-mining the Semantic Web @TCD

Asking questions of digital datasets

Terminology

Page 17: Data-mining the Semantic Web @TCD

Open Access

Terminology

Page 18: Data-mining the Semantic Web @TCD

Design by Julie Beck for the Harvard University Neuroinformatics dept (source: www.juliebcreative.com/portfolio/open-data-logo/)

Page 19: Data-mining the Semantic Web @TCD

Linked Data

Terminology

The linkages between the major Linked Data datasets (source: lod-cloud.net)

Page 20: Data-mining the Semantic Web @TCD

Big Data

Terminology

Wordle of terms associated with Big Data activity (source: sfdata.startupweekend.org)

Page 21: Data-mining the Semantic Web @TCD

5 Stars of Open Data

• 1 star: put your data online under an open license
• 2 stars: make it structured (e.g. as an Excel file)
• 3 stars: use non-proprietary formats (e.g. XML and not Excel)
• 4 stars: use URIs to identify resources
• 5 stars: link your data to external datasets

Page 22: Data-mining the Semantic Web @TCD

The RDF Triple

Page 23: Data-mining the Semantic Web @TCD

A Triple Example

‘…the boy’s name is Tom…’

subject

predicate

object
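A minimal sketch of the triple above in Python. The example.org URIs for the boy and for the name property are made up for illustration and do not appear in the slides. It shows the three parts of the statement and how they could be written out as a single N-Triples line.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property
    obj: str        # a plain literal value in this sketch


def to_ntriples(t: Triple) -> str:
    """Serialise one triple with a plain-literal object as an N-Triples line."""
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'


# Hypothetical URIs, for illustration only.
boy_name = Triple(
    subject="http://example.org/the-boy",
    predicate="http://example.org/name",
    obj="Tom",
)

print(to_ntriples(boy_name))
```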

Page 24: Data-mining the Semantic Web @TCD

Triple Linking

‘…Tom is short for Thomas…’

subject

predicate

object
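Triples link up when the object of one statement is the subject of another. The sketch below walks that chain using plain Python tuples. It is conceptual only: plain strings stand in for URIs, and the predicate name `shortFor` is invented here. In real RDF a literal such as "Tom" cannot be a subject, so the name would need its own URI before it could be described further.

```python
# Triples as (subject, predicate, object) tuples; plain strings stand in for URIs.
triples = [
    ("the boy", "name", "Tom"),
    ("Tom", "shortFor", "Thomas"),
]


def follow(start, path, data):
    """Walk a chain of predicates from `start`, returning the node reached (or None)."""
    node = start
    for predicate in path:
        matches = [o for (s, p, o) in data if s == node and p == predicate]
        if not matches:
            return None
        node = matches[0]
    return node


print(follow("the boy", ["name", "shortFor"], triples))
```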

Page 25: Data-mining the Semantic Web @TCD

Graph data

Page 26: Data-mining the Semantic Web @TCD

Serialising RDF

• Turtle

• JSON

• RDF/XML

• N-Triples

Page 27: Data-mining the Semantic Web @TCD

RDF Turtle

@base <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

<green-goblin>
    rel:enemyOf <spiderman> ;
    a foaf:Person ;    # in the context of the Marvel universe
    foaf:name "Green Goblin" .

<spiderman>
    rel:enemyOf <green-goblin> ;
    a foaf:Person ;
    foaf:name "Spiderman", "Человек-паук"@ru .

Page 28: Data-mining the Semantic Web @TCD

As N-Triples

<http://example.org/green-goblin> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/spiderman> .
<http://example.org/green-goblin> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/green-goblin> <http://xmlns.com/foaf/0.1/name> "Green Goblin" .
<http://example.org/spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/green-goblin> .
<http://example.org/spiderman> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/spiderman> <http://xmlns.com/foaf/0.1/name> "Spiderman" .
<http://example.org/spiderman> <http://xmlns.com/foaf/0.1/name> "\u0427\u0435\u043B\u043E\u0432\u0435\u043A-\u043F\u0430\u0443\u043A"@ru .

Page 29: Data-mining the Semantic Web @TCD

As JSON

{
  "http://example.org/green-goblin": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/spiderman"}
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Green Goblin"}
    ]
  },
  "http://example.org/spiderman": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/green-goblin"}
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Spiderman"},
      {"type": "literal", "value": "\u0427\u0435\u043b\u043e\u0432\u0435\u043a-\u043f\u0430\u0443\u043a", "lang": "ru"}
    ]
  }
}
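The JSON form nests subject, then predicate, then a list of objects. As a rough sketch (using only Python's standard `json` module and a one-subject fragment of the document above), it can be flattened back into the same subject/predicate/object statements that the N-Triples form lists one per line. The `flatten` helper is a name invented for this example.

```python
import json

# A fragment of the RDF/JSON document above (one subject only).
doc = json.loads("""
{
  "http://example.org/green-goblin": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/spiderman"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Green Goblin"}
    ]
  }
}
""")


def flatten(rdf_json):
    """Yield (subject, predicate, object_value) for every statement."""
    for subject, predicates in rdf_json.items():
        for predicate, objects in predicates.items():
            for obj in objects:
                yield subject, predicate, obj["value"]


triples = list(flatten(doc))
for t in triples:
    print(t)
```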

Page 30: Data-mining the Semantic Web @TCD

As RDF/XML

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:ns0="http://www.perceive.net/schemas/relationship/">

  <foaf:Person rdf:about="http://example.org/green-goblin">
    <ns0:enemyOf>
      <foaf:Person rdf:about="http://example.org/spiderman">
        <ns0:enemyOf rdf:resource="http://example.org/green-goblin"/>
        <foaf:name>Spiderman</foaf:name>
        <foaf:name xml:lang="ru">Человек-паук</foaf:name>
      </foaf:Person>
    </ns0:enemyOf>
    <foaf:name>Green Goblin</foaf:name>
  </foaf:Person>

</rdf:RDF>

Page 31: Data-mining the Semantic Web @TCD

Visualised as a Graph

Page 32: Data-mining the Semantic Web @TCD

Triplestores and Infrastructure

A server farm (source: www.cirrusinsight.com)

Page 33: Data-mining the Semantic Web @TCD

Practical: Making RDF

http://www.franklynam.com/blog.aspx?id=85

Q: Create RDF representations of yourself and your relationships

Page 34: Data-mining the Semantic Web @TCD

The Semantic Web and Ontologies

The stages of the Web (source: urenio.org)

Page 35: Data-mining the Semantic Web @TCD

Ontological Classes and Properties

Page 36: Data-mining the Semantic Web @TCD

The British Museum data mapping onto the CIDOC CRM (source: confluence.ontotext.com/display/ResearchSpace/BM+Mapping)

Page 37: Data-mining the Semantic Web @TCD

The CIDOC CRM basic entity types and their relationships (source: www.cidoc-crm.org/)

Page 38: Data-mining the Semantic Web @TCD

Vocabularies

Page 39: Data-mining the Semantic Web @TCD

Graph data

Page 40: Data-mining the Semantic Web @TCD

Minna Sundberg (source: www.sssscomic.com/comic.php?page=196)

Page 41: Data-mining the Semantic Web @TCD

Querying using SPARQL

SELECT *
WHERE {
  ?s ?p ?o
}
LIMIT 10
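To make the idea behind the query concrete, here is a toy sketch of triple-pattern matching in Python. It is not how a triplestore is implemented; the `match` function and the shortened node names are assumptions made for this example. Terms beginning with `?` act as variables, everything else must match exactly, and `limit` plays the role of LIMIT.

```python
def match(pattern, triples, limit=None):
    """Return variable bindings for one triple pattern.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.get(term, value) != value:
                    break  # same variable bound to two different values
                binding[term] = value
            elif term != value:
                break  # a fixed term did not match
        else:
            results.append(binding)
            if limit is not None and len(results) >= limit:
                break
    return results


data = [
    ("green-goblin", "enemyOf", "spiderman"),
    ("green-goblin", "name", "Green Goblin"),
    ("spiderman", "enemyOf", "green-goblin"),
]

print(match(("?s", "?p", "?o"), data, limit=10))  # every triple, up to 10
print(match(("?s", "enemyOf", "?o"), data))       # only the enemyOf triples
```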

Page 42: Data-mining the Semantic Web @TCD

More complex SPARQL

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX letters1916: <http://letters1916.linkedarc.net/ontology/>
PREFIX letters1916data: <http://letters1916.linkedarc.net/data/>
PREFIX schema: <http://schema.org/>

SELECT DISTINCT ?letter ?letterName ?recipientPostalAddressName ?recipientLongitude ?recipientLatitude
WHERE {
  ?letter rdf:type letters1916:Letter ;
          schema:name ?letterName ;
          letters1916:recipientLocation ?recipientPostalAddress .

  ?recipientPostalAddress schema:addressRegion ?recipientPostalAddressRegion .
  FILTER regex(?recipientPostalAddressRegion, 'Galway', 'i')
  ?recipientPostalAddress schema:name ?recipientPostalAddressName .

  ?recipientPlace schema:address ?recipientPostalAddress ;
                  schema:geo ?recipientGeoCoordinates .

  ?recipientGeoCoordinates schema:longitude ?recipientLongitude ;
                           schema:latitude ?recipientLatitude
}

Page 43: Data-mining the Semantic Web @TCD

Practical: Universities on DBpedia

http://www.franklynam.com/blog.aspx?id=86

Q: Get a list of all of the universities that DBpedia knows about
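One possible starting point for this practical, offered as a sketch rather than the workshop's own answer. It assumes the public DBpedia endpoint lives at https://dbpedia.org/sparql and that universities are typed with the class dbo:University; neither detail is given in the slides. The code only builds the request URL (a SPARQL endpoint accepts the query text in a `query` parameter) and does not send it. The LIMIT is there to keep a first test run small.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

ENDPOINT = "https://dbpedia.org/sparql"  # assumed endpoint address

# Assumes DBpedia types universities as dbo:University.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?university
WHERE {
  ?university a dbo:University .
}
LIMIT 100
"""


def build_request_url(endpoint, query):
    """Build a SPARQL-protocol GET URL; the query goes in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be pasted into a browser or fetched with `urllib.request.urlopen`; asking for the `application/sparql-results+json` media type in the Accept header should return the bindings as JSON.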

Page 44: Data-mining the Semantic Web @TCD

SKOS

@prefix dct: <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://linkedarc.net/vocabs/vessel-jar>
    a skos:Concept ;
    cc:license <http://creativecommons.org/licenses/by/3.0> ;
    cc:attributionURL <http://linkedarc.net> ;
    cc:attributionName "linkedarc.net" ;
    skos:inScheme <http://linkedarc.net/vocabs> ;
    skos:prefLabel "Jar" ;
    skos:scopeNote "A jar concept. Pottery. This isn’t a great scope note." ;
    dct:publisher <http://linkedarc.net> ;
    dct:identifier <http://linkedarc.net/vocabs/vessel-jar> ;
    dct:issued "2015-02-23"^^xsd:date ;
    skos:exactMatch <http://purl.org/heritagedata/schemes/mda_obj/concepts/97609> .

Page 45: Data-mining the Semantic Web @TCD

SPARQL + FILTER

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT * WHERE {
  ?s rdfs:label ?label .
  FILTER langMatches(lang(?label), "en")
}

Page 46: Data-mining the Semantic Web @TCD

SPARQL + FILTER

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT * WHERE {
  ?s rdfs:label ?label .
  FILTER langMatches(lang(?label), "en") .
  FILTER regex(?label, "bell", "i")
}
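The two filters can be read as ordinary predicates applied to each candidate label. The Python sketch below mimics them over a handful of invented labels: `lang_matches` is a simplified stand-in for langMatches (it handles an exact tag or a prefix such as "en" matching "en-GB", but not the "*" wildcard), and `re.search` with `re.IGNORECASE` stands in for regex with the "i" flag.

```python
import re

# Labels as (text, language tag) pairs; invented sample data.
labels = [
    ("Bell", "en"),
    ("church bell", "en-GB"),
    ("Glocke", "de"),
    ("amphora", "en"),
]


def lang_matches(tag, lang_range):
    """Basic language-range match: exact tag or a '-'-separated prefix of it."""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


english_bells = [
    text
    for text, tag in labels
    if lang_matches(tag, "en") and re.search("bell", text, re.IGNORECASE)
]
print(english_bells)
```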

Page 47: Data-mining the Semantic Web @TCD

SPARQL + FILTER

PREFIX dct: <http://purl.org/dc/terms/>

SELECT * WHERE {
  ?s dct:dateCreated ?dateCreated .
  FILTER (?dateCreated > '1900-01-01')
}
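The filter keeps only values strictly later than the cutoff. A small Python sketch of the same comparison, with invented sample values parsed into real dates before comparing. On a live endpoint, whether the comparison behaves this way depends on how the stored date literals are typed, which the slides do not specify.

```python
from datetime import date

# Invented sample values standing in for ?dateCreated bindings.
created = ["1899-12-31", "1900-01-01", "1916-04-24"]

cutoff = date(1900, 1, 1)

# Strictly greater than the cutoff, so 1900-01-01 itself is excluded.
after_cutoff = [d for d in created if date.fromisoformat(d) > cutoff]
print(after_cutoff)
```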

Page 48: Data-mining the Semantic Web @TCD

Practical: British Museum Sarcophagi

Q: Get the find spots of all of the sarcophagi in the British Museum collection

SPARQL endpoint: http://collection.britishmuseum.org/sparql

Page 49: Data-mining the Semantic Web @TCD

Practical: Archaeological stratigraphy

Q: Get the stratigraphic relationships between the contexts excavated at Priniatikos Pyrgos

SPARQL endpoint: http://linkedarc.net/sparql

Page 50: Data-mining the Semantic Web @TCD

Stratigraphy explained (very briefly…)

Sample stratigraphic sequence (source: www.lparchaeology.com)

Page 51: Data-mining the Semantic Web @TCD

The Priniatikos Pyrgos ontology

Page 52: Data-mining the Semantic Web @TCD

Practical: Archaeological stratigraphy

Q: Get the stratigraphic relationships between the contexts excavated at Priniatikos Pyrgos

SPARQL endpoint: http://linkedarc.net/sparql

Hint: you will need to traverse 2 levels of the ontology’s hierarchy to get at the stratigraphy data

Page 53: Data-mining the Semantic Web @TCD

Practical: Nomisma and Ancient Coins

Q: Get the geo-coordinates of all of the coin hoards stored in the Nomisma triplestore

SPARQL endpoint: http://nomisma.org/sparql

Page 54: Data-mining the Semantic Web @TCD

Geo-coding the Find Spots with Google Refine

Page 55: Data-mining the Semantic Web @TCD

The Google Maps API

Address String

Geo-coordinates as JSON
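A sketch of the second half of that round trip: pulling the coordinates out of the JSON that comes back. The sample below follows the documented shape of a Google geocoding response (results, then geometry, then location with lat and lng), but the values are placeholders rather than real API output, and `first_location` is a helper name invented for this example.

```python
import json

# Illustrative response in the shape the Google Geocoding API documents
# (results -> geometry -> location -> lat/lng). The coordinate values are
# placeholders, not real API output.
sample_response = """
{
  "status": "OK",
  "results": [
    {
      "formatted_address": "Galway, Ireland",
      "geometry": {"location": {"lat": 53.27, "lng": -9.05}}
    }
  ]
}
"""


def first_location(response_text):
    """Return (lat, lng) of the first result, or None if there is none."""
    payload = json.loads(response_text)
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


print(first_location(sample_response))
```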

Page 56: Data-mining the Semantic Web @TCD

Export as CSV
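Outside Google Refine, the same export step can be sketched with Python's standard `csv` module. The rows and column names below are invented for illustration, with placeholder coordinates; the point is the header row plus one line per geo-coded place, which is the form most web-mapping tools accept.

```python
import csv
import io

# Invented rows: one place per row with placeholder coordinates.
rows = [
    {"place": "Galway, Ireland", "latitude": 53.27, "longitude": -9.05},
    {"place": "Dublin, Ireland", "latitude": 53.35, "longitude": -6.26},
]


def to_csv(records, fieldnames):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows, ["place", "latitude", "longitude"])
print(csv_text)
```

Values that contain a comma, such as the place names here, are quoted automatically by the writer.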

Page 57: Data-mining the Semantic Web @TCD

Practical: Getty Concepts

Q: Get all of the Getty URIs that represent concepts related to amphorae

SPARQL endpoint: http://vocab.getty.edu/sparql

Page 58: Data-mining the Semantic Web @TCD

Additional Linked Data Resources

http://www.franklynam.com/blog.aspx?id=89

Page 59: Data-mining the Semantic Web @TCD

One final quick survey

• Please rank the practicals by how easy they were to complete (1 for hardest, 5 for easiest)
  – Making your FOAF profile
  – DBpedia universities
  – British Museum sarcophagi hunting
  – Getty vocabularies
  – Nomisma coin hoards

Page 60: Data-mining the Semantic Web @TCD

One final quick survey

• Would you consider publishing Linked Open Data in the future?
  – 1 – Absolutely not
  – 2
  – 3
  – 4
  – 5 – Definitely

Page 61: Data-mining the Semantic Web @TCD

One final quick survey

• Would you consider using Linked Open Data resources (using SPARQL or otherwise) in the future?
  – 1 – Absolutely not
  – 2
  – 3
  – 4
  – 5 – Definitely

Page 62: Data-mining the Semantic Web @TCD

One final quick survey

• Is Linked Open Data a feasible platform on which to undertake humanities research?
  – 1 – Absolutely not
  – 2
  – 3
  – 4
  – 5 – Definitely

Page 63: Data-mining the Semantic Web @TCD

One final quick survey

• Any final comments?

Page 64: Data-mining the Semantic Web @TCD

Thank you!

Martin Lemay (source: twitter.com/martinlemay)