21.09.06 Krzysztof Janowicz

Towards a Similarity-Based Identity Assumption Service for Historical Places Establishing Meaningful Links

Krzysztof Janowicz; Muenster Semantic Interoperability Lab (MUSIL)

Outline

• Motivation

• Scenario

• Annotation

• Theory

• Further Work

Image from: http://de.wikipedia.org/wiki/HMS_Victory (Bleiglass, 1998)

Motivation

• For the cultural heritage community

• Incomplete and vague knowledge

• Interchange between external sources is necessary to answer complex scientific questions & to clean up local knowledge

• Local versus global identifiers

Accessible service-based infrastructure!

Motivation

• For semantic similarity research

• Application of similarity in a real-world domain

• Similarity as part of the identity assumption puzzle

• Combination of similarity and classical reasoning

• Using a stable upper-level ontology (CIDOC CRM)

Theory of similarity assumptions for historical places

Motivation

• For an identity assumption service

• To run queries against multiple sources, it must be ensured that they refer to the same real-world phenomena; a common language alone is not enough!

• Non-unique place names (even within the same area)

• Place names refer to cities, rivers, valleys, mountains,…

• Misinterpreted place names (e.g. 'Al Wahat' Oasis)

• Names also refer to varying geopolitical units (e.g. nomads) or prominent (artificial) landmarks (e.g. telegraph stations)

• Outdated place or even country names (e.g. USSR)

Gazetteers can only partially solve these problems

(From discussions with Dr. Karl-Heinz Lampe; ZFMK)

Battle of Trafalgar - Scenario

• Took place at Cape Trafalgar (Province Cadiz) in 1805

• British victory under the command of Horatio Nelson

• HMS Victory was Nelson's flagship

• Nelson was shot during the battle and died afterwards

Should be easy to annotate!?

Spatial relation between naval battleground and terrestrial cape, Province Cadiz,..?

Place names: Cabo Trafalgar, Taraf al-Gharb, رأس الطرف الأغر

Also in a historical source from a French perspective?

Image from: http://en.wikipedia.org/wiki/Horatio_Nelson (painted by Nicholas Pocock)

Vice-Admiral Horatio Nelson, 1st Viscount Nelson?

HMS Victory: Which one?!

Temporal relations?

From: http://en.wikipedia.org/wiki/Image:Trafalgar_aufstellung.jpg

Annotation of Historical Knowledge

• CIDOC conceptual reference model (CRM) as upper-level ontology for the cultural heritage domain

• Specifies an abstract, interrelated vocabulary instead of concrete definitions, e.g. for kinds of exhibits (heterogeneous domain!)

• Describes historical knowledge by relations between places, events, actors and objects

• RDF(S) based representation

• ISO Standard (ISO/PRF 21127)

Annotation Examples (RDF-Triples)

• P89F.falls_within(E53.Place(Cape Trafalgar), E53.Place(Province Cádiz))

Subject-Predicate-Object: The place Cape Trafalgar falls within a place called Province Cádiz

• P7F.took_place_at(E7.Activity(Battle of Trafalgar), E53.Place(Cape Trafalgar))

• P117F.occurs_during(E7.Activity(Battle of Trafalgar), E5.Event(Trafalgar Campaign))

• P14F.carried_out_by(E7.Activity(Battle of Trafalgar), E21.Person(Nelson))

• P2F.has_type(E53.Place(Andalusia), E55.Type(regions))
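
The annotation examples above can be held as plain subject-predicate-object records. The following is only a minimal sketch: the `Triple` type and the `describe` helper are hypothetical names introduced here for illustration, and plain strings stand in for real RDF resources.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# Labels are copied from the annotation examples; strings replace real RDF nodes.
triples = [
    Triple("E53.Place(Cape Trafalgar)", "P89F.falls_within",
           "E53.Place(Province Cádiz)"),
    Triple("E7.Activity(Battle of Trafalgar)", "P117F.occurs_during",
           "E5.Event(Trafalgar Campaign)"),
    Triple("E7.Activity(Battle of Trafalgar)", "P14F.carried_out_by",
           "E21.Person(Nelson)"),
    Triple("E53.Place(Andalusia)", "P2F.has_type", "E55.Type(regions)"),
]


def describe(entity, statements):
    """Return all statements in which the entity appears as subject."""
    return [t for t in statements if t.subject == entity]


battle = describe("E7.Activity(Battle of Trafalgar)", triples)
```

Here `battle` collects the statements that describe the battle; such a set of triples is what later gets compared between a source and a target description.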

Theory

• In practice, semi-automatic disambiguation via gazetteers and other global authorities (such as for historical figures) is often difficult, expensive and error-prone (especially for subordinate geopolitical units, events, actors,…)

Use the links established via the CIDOC CRM annotation between places, actors, objects and events as additional reference points!

Theory

Geoinformation = < x, z > (Mike Goodchild: Geographic Reality)

[Diagram: interpretation via Semantic Reference Systems and Spatiotemporal Reference Systems]

Use thematic information as support for spatiotemporal reference

CIDOC CRM + Reasoning + Similarity

Theory: Framework

Comparing Place Descriptions

1. Extract new triples out of existing ones: Spatiotemporal & Subsumption Reasoning

2. Compute overlap between source and target triples: Semantic Similarity Measurement

3. Compare remaining labels & identifiers: Syntactic Identifier Matching

4. Estimate how probable it is that the compared places correspond: Identity Assumption

Theory: Reasoning

• Entities are described by sets of RDF triples

• Inference rules to generate new triples

Make local knowledge explicit!

More comparable information about entities

• Example: Spatial & temporal inference rules

• Be careful - names are ambiguous!

HMS XYZ (1804) vs. HMS XYZ (1805)?
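
One spatial inference rule of the kind mentioned above is the transitivity of "falls within". The sketch below is an illustration under stated assumptions, not the rule set of the talk: the function name `infer_falls_within` is hypothetical, and the second input triple (Province Cádiz falls within Andalusia) is added here only to give the rule something to chain.

```python
def infer_falls_within(triples):
    """Add transitive 'falls_within' triples until no new triple appears."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        within = [(s, o) for (s, p, o) in known if p == "falls_within"]
        for a, b in within:
            for c, d in within:
                # a within b and b within d implies a within d
                new = (a, "falls_within", d)
                if b == c and a != d and new not in known:
                    known.add(new)
                    changed = True
    return known


source = {
    ("Cape Trafalgar", "falls_within", "Province Cádiz"),
    ("Province Cádiz", "falls_within", "Andalusia"),  # illustrative extra fact
}
closure = infer_falls_within(source)
```

The rule matches entities by exact string, which is where the warning about ambiguous names applies: two different ships or places sharing a label would be chained together wrongly unless they carry distinct identifiers.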

Theory: Similarity

[Figure: network of historical facts linking Nelson, NelsonsDeath, BattleOfTrafalgar, NapoleonicWars, CapeTrafalgar and ProvinceCádiz via relations such as "performed", "died in" and "falls within"]

Source: Cape Trafalgar falls within Province Cádiz

Target: Cape Trafalgar overlaps with Province Cádiz

[Formula: sim_p * sim_s …]

Theory: Network Approach to Similarity

1. For all tuples from the source entity: find equal or similar tuples within the target entity description

2. Define meaningful notions of similarity for given predicates (relations)

• Spatial

• Temporal

• Thematic

3. Define a meaningful notion of similarity for all objects that are not subjects of other triples themselves (e.g. ADL Feature Types)
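
The three steps above can be sketched roughly as follows. Everything numeric here is an assumption made for illustration: the 0.5 score for the neighbouring relations "falls_within" and "overlaps_with", the exact-match object comparison, and the averaging over source tuples are placeholders, not the measures proposed in the talk.

```python
def predicate_similarity(p, q):
    """Step 2: similarity between two relations (hypothetical values)."""
    if p == q:
        return 1.0
    neighbours = {frozenset(("falls_within", "overlaps_with")): 0.5}
    return neighbours.get(frozenset((p, q)), 0.0)


def object_similarity(a, b):
    """Step 3: placeholder that only recognises exact matches."""
    return 1.0 if a == b else 0.0


def description_similarity(source, target):
    """Step 1: for each source tuple, take its best match in the target."""
    if not source:
        return 0.0
    total = 0.0
    for p, o in source:
        total += max(
            (predicate_similarity(p, q) * object_similarity(o, r)
             for q, r in target),
            default=0.0,
        )
    return total / len(source)


# (predicate, object) pairs describing Cape Trafalgar in two sources
source = [("falls_within", "Province Cádiz")]
target = [("overlaps_with", "Province Cádiz")]
score = description_similarity(source, target)
```

With these placeholder values the two descriptions score 0.5: the objects agree, while the relations are only neighbours. A fuller version would apply `object_similarity` recursively when an object is itself described by further triples.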

Theory: Neighborhoods & Hierarchies

Different similarity measures for neighborhoods & hierarchies

[Figures: temporal, spatial and thematic examples; Egenhofer & Al-Taha 1992]

Theory: Syntactic Matching

• After recursively applying (semantic) similarity measurements, only labels, vague appellations and identifiers are left

Requires syntactic matching / measuring

(Getty Thesaurus) ID: 7008751, ID: 7008750

Cape Trafalgar

Wrexham (found at: www.gwjokes.com)
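
A minimal sketch of such syntactic matching, assuming Python's standard `difflib.SequenceMatcher` as the string measure (the slides do not name a specific one). The helper names `label_similarity` and `same_identifier` are hypothetical.

```python
from difflib import SequenceMatcher


def label_similarity(a, b):
    """Case-insensitive similarity of two labels, between 0.0 and 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def same_identifier(a, b):
    """Identifiers are compared exactly; near misses do not count."""
    return a.strip() == b.strip()


close = label_similarity("Cape Trafalgar", "Cabo Trafalgar")
far = label_similarity("Cape Trafalgar", "Wrexham")
ids_match = same_identifier("7008751", "7008750")
```

Labels tolerate small spelling differences, so `close` is high and `far` is low, whereas the two thesaurus identifiers differ in a single digit and must still be treated as different entries.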

Theory: Identity Assumptions

• Two place descriptions probably refer to the same (real-world) place if they are linked via equal or similar relations to equal or similar events, actors, objects, …

• Similar position within a network of historical facts

• Stepwise application of new restrictions to the set of compared historical places

Number of compared tuples is a critical issue!
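
The stepwise restriction of candidate places could look roughly like this. It is a sketch under assumptions: the `restrict` function, the dictionary layout and all three candidate records are hypothetical illustrations, not data from the pilot.

```python
def restrict(candidates, restrictions):
    """Apply restrictions one at a time, stopping once at most one place is left."""
    remaining = list(candidates)
    for check in restrictions:
        remaining = [c for c in remaining if check(c)]
        if len(remaining) <= 1:
            break
    return remaining


# Hypothetical candidate descriptions sharing the same name
candidates = [
    {"name": "Trafalgar", "type": "cape", "within": "Province Cádiz"},
    {"name": "Trafalgar", "type": "square", "within": "London"},
    {"name": "Trafalgar", "type": "cape", "within": "Elsewhere"},
]

restrictions = [
    lambda c: c["type"] == "cape",
    lambda c: c["within"] == "Province Cádiz",
]

matches = restrict(candidates, restrictions)
```

Each restriction shrinks the set that later, more expensive comparisons have to handle, which keeps the number of compared tuples manageable.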

Further Work & Evidence

• Similarity is only one part of the puzzle!

• Other parts: trust, contradictions & consistency,...

• Which inference rules may lead to difficulties?

• How to handle complementary knowledge?

• Connections to Time Map and ECAI

• Evidence! Battle of Trafalgar Scenario?

Develop an identity assumption pilot

Combination of similarity measurement with itineraries

Based on real-world data from ZFMK, Bonn (biodiversity museum)

Questions

• Thank You!

• Special thanks to

• Martin Doerr, Foundation for Research and Technology - Hellas (FORTH), Institute of Computer Science, Heraklion, Crete, Greece

• Karl-Heinz Lampe, Zoologisches Forschungsmuseum Alexander Koenig (ZFMK), Bonn, Germany

• Any Questions?

‘Real World’-Place?

From: http://de.wikipedia.org/wiki/Bild:Atlantis_map_kircher.gif

Gazetteer Feature Types

[Figure: feature types for Andalucía in ADLG and the Getty Thesaurus]