Accessing Cultural Heritage Collections using Semantic Web Techniques. Antoine ISAAC, STITCH Project. SIKS Semantic Web Seminar, Utrecht, April 11th, 2007


Page 1

Accessing Cultural Heritage Collections using Semantic Web Techniques

Antoine ISAAC
STITCH Project

SIKS Semantic Web Seminar, Utrecht
April 11th, 2007

Page 2

Background

• CATCH @ NWO
  • Continuous Access To Cultural Heritage
  • 10 computer science projects applied to the CH field
    • Personalization of access, image/text/audio analysis
  • Integration of projects in CH institutes (museums, archives)
• STITCH
  • SemanTic Interoperability To access Cultural Heritage
  • Exchanging and integrating metadata
  • Vrije Universiteit, Koninklijke Bibliotheek & Max Planck Institute

Page 3

Agenda

• Cultural Heritage and Semantic Web
• Two important issues
  • Representing Cultural Heritage vocabularies on the Semantic Web
  • Vocabulary alignment
• Demo

Page 4

Some Needs for CH Collections

• Representation of objects and knowledge about them
  • Pointing at collection artifacts: books…
  • Describing them: creating metadata
    • Specific metadata structures (metadata schemes)
    • Controlled expert vocabularies (e.g. thesauri)
• Accessing artifacts using metadata
  • E.g. search using information contained in thesauri

Page 5

KB Illustrated Manuscripts – Iconclass vocabulary

Page 6

KB Illustrated Manuscripts

Page 7

Some Needs for CH Collections (2)

• Communicating data to the outside world
  • Web portals
• Integrating different collections
  • Virtual collections
    • The European Library, http://www.theeuropeanlibrary.org
    • Geheugen van Nederland, http://www.geheugenvannederland.nl

Page 8

(Biased) Semantic Web

• Pointing at resources: documents, knowledge objects

• Enabling structured assertions
  • Metadata about entities present on the Web
• Using vocabularies with defined semantics
  Ontologies: formal definitions of shared conceptual vocabularies
  RDF Schema / OWL

<owl:Class rdf:about="#Bird">
  <owl:disjointWith>
    <owl:Class rdf:about="#Mammals"/>
  </owl:disjointWith>
  <rdfs:subClassOf>
    <owl:Class rdf:ID="Animals"/>
  </rdfs:subClassOf>
</owl:Class>

<Bird rdf:about="#tweety"/>

Page 9

(Biased) Semantic Web

• Web-based resources allow division/sharing of
  • document
  • vocabulary
  • metadata

(doc3, hasSubject, Amsterdam)

different owners & locations

http://www.kb.nl/eDepot

http://www.geo.org/voc/

http://www.ned.nl/doc3
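The statement (doc3, hasSubject, Amsterdam) can be sketched as a triple of Web identifiers. This is only an illustration: the document URL comes from the slide, but the full URIs for the Amsterdam concept and the hasSubject property are hypothetical, built here from the base URLs shown above.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The document URL appears on the slide; the two other URIs are
# hypothetical completions of the base URLs shown there.
DOC3 = "http://www.ned.nl/doc3"
HAS_SUBJECT = "http://www.kb.nl/eDepot/hasSubject"   # hypothetical
AMSTERDAM = "http://www.geo.org/voc/Amsterdam"       # hypothetical

statement = Triple(DOC3, HAS_SUBJECT, AMSTERDAM)


def owner(uri):
    """Return the host that publishes the resource named by `uri`."""
    return urlparse(uri).netloc


# Each part of the statement is published by a different owner.
owners = {owner(part) for part in statement}
```

The point of the sketch is that one metadata statement can combine a document, a vocabulary term, and a property that live at three different locations.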

Page 10

Cultural Heritage Collections and Semantic Web

• Categorizing/classifying things

• Structuring descriptions

• Web-based approach

Semantic Web techniques are good candidates for representing and exploiting Cultural Heritage metadata

Page 11

Important line of research

• Long-term projects
  • MuseumFinland, http://www.museosuomi.fi/
  • eCulture, http://e-culture.multimedian.nl/
• Common portals to (many) collections
• Exploiting the data found in the original systems
  • Metadata content: place, date, creator…
  • Semantics of vocabularies used to create this information
    • E.g. hierarchical information
    • “A picture featuring a crow features a bird”
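The crow/bird example can be sketched in code. This is a minimal illustration under stated assumptions: the hierarchy and the picture annotations are toy data invented for the example, not taken from any of the portals mentioned. A search for a concept also returns pictures annotated with a narrower concept.

```python
# Hypothetical toy hierarchy: each concept maps to its broader concept.
BROADER = {"crow": "bird", "bird": "animal"}

# Hypothetical picture annotations.
PICTURES = {"pic1": {"crow"}, "pic2": {"bird"}, "pic3": {"castle"}}


def ancestors(concept):
    """Yield every concept broader than `concept`, nearest first."""
    while concept in BROADER:
        concept = BROADER[concept]
        yield concept


def features(picture, concept):
    """True if the picture is annotated with `concept` or a narrower concept."""
    return any(
        concept == annotated or concept in ancestors(annotated)
        for annotated in PICTURES[picture]
    )


def search(concept):
    """Return the sorted ids of all pictures that feature `concept`."""
    return sorted(p for p in PICTURES if features(p, concept))
```

Here `search("bird")` returns both the picture annotated with "bird" and the one annotated with "crow", which is the behaviour the slide describes.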

Page 13

Agenda

• Cultural Heritage and Semantic Web
• Two important issues
  • Representing Cultural Heritage vocabularies on the Semantic Web
  • Vocabulary alignment
• Demo

Page 14

Representing CH vocabularies on the Semantic Web - Similarities

• Both ontologies and thesauri bring concept hierarchies
  • giving the intended meaning of a vocabulary through links between its items
• “concept/term” ≈? owl:Class
• “broader” ≈? rdfs:subClassOf
• “scope notes” ≈? rdfs:comment

Page 15

Representing CH vocabularies on the Semantic Web - Problems

• Thesauri designed for humans, no formal interpretation

• How to interpret a thesaurus in RDFS/OWL:
  • If “(Story of) Hercules” is a class, what are its instances?
  • Is “Hercules shooting Nessus” a subclass of “Love-affairs of Hercules”?
  Thesaurus hierarchy: subsumption, mereological relation,

Page 16

Representing CH vocabularies on the Semantic Web – Different approaches

• Ontologising
  • Cleaning thesaurus by distinguishing roles, kinds, etc.
  • Cleaning the hierarchical links
• Representing knowledge found in sources as such
  • Informal knowledge represented in RDF/OWL formal framework

Page 17

SKOS

• Simple Knowledge Organization Systems
  • (Future) W3C standard
• Model to represent controlled and structured vocabularies on the Semantic Web
  • Compatible with community needs
  • Core model for representing thesauri, classification schemes, etc.

Page 18

SKOS

• Building blocks (ontology) to create XML/RDF data about controlled vocabularies
  • Classes Concept and ConceptScheme
  • Lexical properties
    • prefLabel
    • altLabel
  • Semantic properties
    • broader, narrower
    • related
  • Properties for notes and comments
    • scopeNote
    • definition

Page 19

SKOS: Brinkman Trefwoorden (KB)

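075607204 geneeskunde
  RT geneesmiddelen
  NT kindergeneeskunde

075607220 geneesmiddelen
  UF medicijnen

075611791 kindergeneeskunde
  BT geneeskunde
  noot: kinderen ouder dan 12 vallen niet onder kindergeneeskunde

medicijnen
  USE geneesmiddelen

(English glosses: geneeskunde = medicine; geneesmiddelen = medicinal drugs; medicijnen = medicines; kindergeneeskunde = pediatrics; noot = note: "children older than 12 do not fall under pediatrics".)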

Page 20

SKOS: Brinkman Trefwoorden (KB)

skos: = http://www.w3.org/2004/02/skos/core#
bk: = http://www.kb.nl/brinkman/

[Graph diagram; nodes and labeled edges:]
bk:075611791  skos:prefLabel  "kindergeneeskunde"
bk:075611791  skos:scopeNote  "kinderen ouder dan 12 vallen niet onder kindergeneeskunde"
bk:075611791  skos:broader    bk:075607204
bk:075607204  skos:prefLabel  "geneeskunde"
bk:075607204  skos:related    bk:075607220
bk:075607220  skos:prefLabel  "geneesmiddelen"
bk:075607220  skos:altLabel   "medicijnen"

Page 21

SKOS: Brinkman Trefwoorden (KB)

<skos:Concept rdf:about="http://www.kb.nl/brinkman/bk075607204">
  <skos:prefLabel>geneeskunde</skos:prefLabel>
  <skos:related rdf:resource="http://www.kb.nl/brinkman/bk075607220"/>
</skos:Concept>
<skos:Concept rdf:about="http://www.kb.nl/brinkman/bk075607220">
  <rdf:type rdf:resource="&skos;Concept"/>
  <skos:prefLabel>geneesmiddelen</skos:prefLabel>
  <skos:altLabel>medicijnen</skos:altLabel>
</skos:Concept>
<skos:Concept rdf:about="http://www.kb.nl/brinkman/bk075611791">
  <rdf:type rdf:resource="&skos;Concept"/>
  <skos:prefLabel>kindergeneeskunde</skos:prefLabel>
  <skos:broader rdf:resource="http://www.kb.nl/brinkman/bk075607204"/>
  <skos:scopeNote>kinderen ouder dan 12 vallen niet onder kindergeneeskunde</skos:scopeNote>
</skos:Concept>
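The step from the Brinkman thesaurus records to SKOS statements can be sketched as a small conversion routine. This is an illustration under assumptions, not the conversion actually used at the KB: the dictionary layout of the records and the mapping of BT/NT/RT/UF to SKOS properties are choices made for this example, and concept URIs follow the bk: prefix notation (namespace plus record number).

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BK = "http://www.kb.nl/brinkman/"

# Records transcribed from the Brinkman example; the dict layout is an
# assumption made for this sketch.
RECORDS = [
    {"id": "075607204", "term": "geneeskunde",
     "RT": ["geneesmiddelen"], "NT": ["kindergeneeskunde"]},
    {"id": "075607220", "term": "geneesmiddelen", "UF": ["medicijnen"]},
    {"id": "075611791", "term": "kindergeneeskunde", "BT": ["geneeskunde"],
     "note": "kinderen ouder dan 12 vallen niet onder kindergeneeskunde"},
]

# Assumed mapping from thesaurus relation codes to SKOS semantic properties.
RELATION_TO_SKOS = {"BT": "broader", "NT": "narrower", "RT": "related"}


def to_triples(records):
    """Turn thesaurus records into (subject, predicate, object) triples."""
    uri_of = {r["term"]: BK + r["id"] for r in records}
    triples = []
    for r in records:
        subject = uri_of[r["term"]]
        triples.append((subject, SKOS + "prefLabel", r["term"]))
        for alt in r.get("UF", []):          # "used for" becomes altLabel
            triples.append((subject, SKOS + "altLabel", alt))
        if "note" in r:
            triples.append((subject, SKOS + "scopeNote", r["note"]))
        for code, prop in RELATION_TO_SKOS.items():
            for target in r.get(code, []):   # link concept to concept by URI
                triples.append((subject, SKOS + prop, uri_of[target]))
    return triples


triples = to_triples(RECORDS)
```

The non-preferred entry "medicijnen USE geneesmiddelen" needs no record of its own in this sketch, since it is already captured by the UF line of the preferred term.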

Page 22

Agenda

• Cultural Heritage and Semantic Web
• Two important issues
  • Representing Cultural Heritage vocabularies on the Semantic Web
  • Vocabulary alignment
• Demo

Page 23

Cultural Heritage Interoperability Problems

• Problem: integrating different databases/metadata schemes/vocabularies

• Syntactic interoperability can be solved
  • Common format: XML (RDF)
  • Common vocabulary model (SKOS)

• How about conceptual heterogeneity?

Page 24

The semantic interoperability problem

• There is no standard thesaurus
  • We don’t really want it
    different vocabularies for different expertise domains, traditions, tasks
• Consequence:
  • “klassieke ruïnes” (classical ruins) vs. “landschap met ruïnes” (landscape with ruins)
  • “maagd Maria” (Virgin Mary) vs. “Heilige Moeder” (Holy Mother)
• Practical problem:
  • Searching for “Heilige Moeder” misses “maagd Maria”
  • Unless we know both vocabularies

Page 25

Old situation

Page 26

Vocabulary alignment

• STITCH aim: find correspondences between vocabulary elements
  • “klassieke ruïnes” ≈ “landschap met ruïnes”
  • “maagd Maria” = “Heilige Moeder”
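How such correspondences help search can be sketched as follows. The two collections and their contents are hypothetical, and only exact equivalence links (the "=" case) are handled; approximate links (the "≈" case) would need a separate policy.

```python
# Equivalence link taken from the example above.
ALIGNMENT = [("maagd Maria", "Heilige Moeder")]

# Hypothetical collections, each indexed with its own vocabulary.
COLLECTION_A = {"manuscript-1": {"maagd Maria"}}
COLLECTION_B = {"manuscript-2": {"Heilige Moeder"}}


def equivalents(term):
    """Return `term` together with every term aligned to it."""
    result = {term}
    for left, right in ALIGNMENT:
        if term == left:
            result.add(right)
        elif term == right:
            result.add(left)
    return result


def search(term, *collections, use_alignment=True):
    """Return sorted ids of items indexed with `term` or an aligned term."""
    terms = equivalents(term) if use_alignment else {term}
    return sorted(
        item
        for collection in collections
        for item, subjects in collection.items()
        if subjects & terms
    )
```

Without the alignment, a query for "Heilige Moeder" only reaches the collection that uses that term; with it, the same query also reaches items indexed with "maagd Maria".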

Page 27

New situation

Page 28

Automatic alignment techniques

• Lexical
  Labels of entities and textual definitions
• Structural
  Structure of the formal definitions of entities, position in the hierarchy
• Statistical
  Object information (e.g. book indexing)
• Background knowledge
  Using a shared conceptual reference to find links

[Diagram: example concept labels “brain”, “Long tumor”, “tumor”, “Long”]

Page 29

Lexical alignment

• Use preferred labels, synonyms, notes
• Heuristic methods to discover equivalence and specialization relations

“Funeral of Patroclus” is more specific than “Patroclus”
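One possible lexical heuristic of this kind is sketched below. It is an assumption made for illustration, not necessarily the rule STITCH applies: two labels that match after normalization are treated as equivalent, and a label that contains the other as a run of whole words is treated as more specific.

```python
def normalize(label):
    """Lower-case a label and split it into word tokens."""
    return label.casefold().split()


def contains(tokens, sub):
    """True if `sub` occurs as a contiguous run of tokens inside `tokens`."""
    n = len(sub)
    return n > 0 and any(
        tokens[i:i + n] == sub for i in range(len(tokens) - n + 1)
    )


def lexical_relation(label_a, label_b):
    """Guess how label_a relates to label_b from the labels alone.

    Returns "equivalent", "more specific", "more general", or None.
    """
    a, b = normalize(label_a), normalize(label_b)
    if a == b:
        return "equivalent"
    if contains(a, b):
        return "more specific"
    if contains(b, a):
        return "more general"
    return None
```

With these rules, "Funeral of Patroclus" comes out as more specific than "Patroclus". The heuristic is deliberately crude: it would also call "Love-affairs of Hercules" more specific than "Hercules", which may or may not match the intended thesaurus semantics.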

Page 30

Automatic Alignment Techniques

• Lexical
  Labels of entities and textual definitions
• Structural
  Structure of the formal definitions of entities, position in the hierarchy
• Statistical
  Object information (e.g. book indexing)
• Shared background knowledge
  Using a conceptual reference to deduce correspondences

[Diagram: example concept labels “brain”, “Long tumor”, “tumor”, “Long”]

Page 31

Statistical alignment

Page 32

Statistical approach: Koninklijke Bibliotheek case

• Situation: 2 overlapping collections indexed with different thesauri
• Comparison means: measuring overlap between concepts from the thesauri
  • Using the sets of books indexed by these concepts
• Results
  1: 9132.9 Schilderijen - schilderkunst
  2: 8088.5 Kwaliteitszorg - kwaliteitsmanagement
  3: 6232.7 Personeelsmanagement - personeelsbeleid
  ...
  17: 3421.8 Diabetes mellitus - suikerziekte
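The idea of comparing concepts through the books they index can be sketched with a simple overlap measure. The slide does not say which measure produced the scores above, so this sketch swaps in Jaccard overlap as a stand-in; the book identifiers and the assignment of concepts to the two thesauri are hypothetical.

```python
from itertools import product

# Hypothetical indexing data: concept -> set of book identifiers.
THESAURUS_A = {"Schilderijen": {1, 2, 3, 4}, "Diabetes mellitus": {7, 8}}
THESAURUS_B = {"schilderkunst": {2, 3, 4, 5}, "suikerziekte": {7, 8, 9}}


def jaccard(x, y):
    """Share of books indexed by both concepts among books indexed by either."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def rank_pairs(a, b):
    """Score every cross-thesaurus concept pair and rank by overlap."""
    scored = [
        (jaccard(a[concept_a], b[concept_b]), concept_a, concept_b)
        for concept_a, concept_b in product(a, b)
    ]
    # Keep only pairs that share at least one book, best score first.
    return sorted((s for s in scored if s[0] > 0), reverse=True)


ranked = rank_pairs(THESAURUS_A, THESAURUS_B)
```

Pairs of concepts that never index the same book drop out, and the remaining pairs form a ranked list of candidate correspondences for a human to review.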

Page 33

Agenda

• Cultural Heritage and Semantic Web
• Two important issues
  • Representing Cultural Heritage vocabularies on the Semantic Web
  • Vocabulary alignment
• Demo

Page 34

Demo

• KB Illuminated Manuscripts
• French National Library Mandragore Manuscripts

Page 35

Manuscripts, 2nd Collection: BNF Mandragore

Page 36

Manuscripts, 2nd Collection: BNF Mandragore

Page 37

Demo

• http://stitch.cs.vu.nl/rp33333/MANDRA-SV-ICE-mandraNewNONE, amphibians

• http://stitch.cs.vu.nl/rp33333/MANDRA-SV-MANDRA-mandraNewNONE, wheat

Page 38

Conclusion: Semantic Web can help Cultural Heritage

• Representation of collections and associated expert vocabularies

• Semantic integration through correspondences between different vocabularies

New opportunities for exploiting cultural heritage information

Page 39

Thanks!

Page 40

Links

• Semantic Web at Vrije Universiteit
  • http://www.cs.vu.nl/ai/kr/
  • http://www.cs.vu.nl/bi/
• SKOS
  • http://www.w3.org/2004/02/skos/
• Other Cultural Heritage and Semantic Web projects
  • MuseumFinland, http://www.museosuomi.fi/
  • eCulture, http://e-culture.multimedian.nl/