Page 1: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Associative and Spatial Relationships in Thesaurus-based Retrieval

Harith Alani¹, Christopher Jones², Douglas Tudhope¹

¹ School of Computing, University of Glamorgan {halani,dstudhope}@glam.ac.uk

² Department of Computer Science, University of Cardiff [email protected]

Page 2: Associative and Spatial Relationships  in Thesaurus-based Retrieval

OASIS - Ontologically Augmented Spatial Information System

• Aims:

– Investigate terminology systems for thematic and spatial access in digital library applications

and in particular:

• Investigate retrieval potential of geographical metadata schema consisting of rich place name data, with limited locational information

• Explore retrieval potential of reasoning over underlying semantic relationships

Page 3: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Presentation

• Overview of OASIS

• Spatial query expansion

• Semantic distance measures for query expansion

• Role of thesaurus associative relationships

• Conclusions

Page 4: Associative and Spatial Relationships  in Thesaurus-based Retrieval

OASIS datasets

• Art & Architecture Thesaurus (AAT): thematic descriptors - town, arrow, axe, etc. (J. Paul Getty Trust)

• Thesaurus of Geographic Names (TGN): place names, hierarchies and centroid co-ordinates. (J. Paul Getty Trust)

• Bartholomew digital map data: place names, co-ordinates, and adjacency relationships

• Ontological schema implemented using Semantic Index System (SIS), also used to store the data collection

• Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS): dataset of Scottish archaeological sites and objects

Page 5: Associative and Spatial Relationships  in Thesaurus-based Retrieval

OASIS Schema

[Figure: OASIS schema diagram. Recoverable labels: classes Geographical Concept, Place, Name, Standard Name (Preferred Term), Alternative Name (Non Preferred Term), Artefact, Object, Material, Date, Language, String, Integer; attributes Scope Note, latitude, longitude, area, variant spelling, date, language, description; relationships isA, Topological Relationships (meets, overlaps, partOf), found at, made at, date found, date made, type, made of. Data source labels: TGN, TGN & Bartholomew, RCAHMS, AAT.]

Page 6: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Indexing Artefact DE 121

Page 7: Associative and Spatial Relationships  in Thesaurus-based Retrieval

OASIS UI - Selecting Items

Page 8: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Spatial Expansion
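
The slide itself gives no formula, so the following is only a minimal sketch of one way spatial query expansion over a place name hierarchy could work when no footprint data is available. It counts partOf edges between two places via their nearest common ancestor. The hierarchy, the function names and the threshold are illustrative assumptions, not OASIS data or the authors' measure, which may also draw on centroid co-ordinates and adjacency relationships.

```python
# Toy partOf hierarchy (child -> parent); illustrative only, not TGN data.
PART_OF = {
    "Inverness": "Highland",
    "Highland": "Scotland",
    "St Andrews": "Fife",
    "Fife": "Scotland",
}


def ancestors(place, part_of):
    """Return [place, parent, grandparent, ...] by following partOf links."""
    chain = [place]
    while place in part_of:
        place = part_of[place]
        chain.append(place)
    return chain


def hierarchy_distance(a, b, part_of):
    """Count partOf edges on the path from a to b via their nearest common ancestor."""
    steps_from_b = {p: i for i, p in enumerate(ancestors(b, part_of))}
    for steps_from_a, p in enumerate(ancestors(a, part_of)):
        if p in steps_from_b:
            return steps_from_a + steps_from_b[p]
    return None  # the two places share no ancestor


def spatial_expansion(query_place, part_of, max_distance):
    """Return places within max_distance of query_place, nearest first."""
    places = set(part_of) | set(part_of.values())
    scored = []
    for p in places:
        d = hierarchy_distance(query_place, p, part_of)
        if d is not None and d <= max_distance:
            scored.append((d, p))
    return [p for d, p in sorted(scored)]
```

With this toy data, a query on "Inverness" with a threshold of 2 expands to its containing regions but not to places in a sibling region.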

Page 9: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Semantic distance

[Figure: fragment of the AAT hierarchies "Tools & Equipment" and "Weapons & Ammunition", with BT and RT links marked. Terms shown: weapons, edged weapons, axes (weapons), battle-axes, throwing axes, franciscas, tomahawks (weapons), staff weapons, gisarmes, halberds, pollaxes, harpoons, <cutting tools>, axes (tools), hatchets, <wood-cutting and finishing tools>, Pulaskis.]
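
Semantic distance over a thesaurus can be sketched as a cheapest-path search in which each relationship type carries a traversal cost and expansion stops at a distance threshold. The example below is a hedged illustration only: the cost values, the `expand` function and the exact BT/NT edges between the terms are assumptions made for the sketch, not figures or structures taken from the presentation.

```python
import heapq

# Hypothetical traversal costs per relationship type (assumed values).
COST = {"NT": 1.0, "BT": 1.5, "RT": 2.0}

# Toy thesaurus fragment: term -> [(relationship, neighbour)].
# Term names come from the AAT examples; the edge layout is assumed.
EDGES = {
    "weapons": [("NT", "edged weapons")],
    "edged weapons": [("BT", "weapons"), ("NT", "axes (weapons)")],
    "axes (weapons)": [
        ("BT", "edged weapons"),
        ("NT", "throwing axes"),
        ("RT", "axes (tools)"),
    ],
    "throwing axes": [("BT", "axes (weapons)"), ("NT", "franciscas")],
    "franciscas": [("BT", "throwing axes")],
    "axes (tools)": [("RT", "axes (weapons)"), ("NT", "hatchets")],
    "hatchets": [("BT", "axes (tools)")],
}


def expand(start, edges, max_distance, allowed=("BT", "NT", "RT")):
    """Return {term: semantic distance} for terms within max_distance of start."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, term = heapq.heappop(heap)
        if dist > best.get(term, float("inf")):
            continue  # stale heap entry
        for rel, neighbour in edges.get(term, []):
            if rel not in allowed:
                continue
            new_dist = dist + COST[rel]
            if new_dist <= max_distance and new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(heap, (new_dist, neighbour))
    return best
```

Calling `expand("axes (weapons)", EDGES, 3.0)` reaches terms in the tools hierarchy through the RT link, while passing `allowed=("BT", "NT")` keeps the expansion inside the weapons hierarchy.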

Page 10: Associative and Spatial Relationships  in Thesaurus-based Retrieval

RT Expansion Experiments

• Sometimes RTs are used in a very loose way

• This causes problems for term expansion over RTs

• We developed a set of experimental scenarios for more precise control in RT expansion

Page 11: Associative and Spatial Relationships  in Thesaurus-based Retrieval

BT/NT Expansion only

Page 12: Associative and Spatial Relationships  in Thesaurus-based Retrieval

RT Expansion included

Page 13: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Inter-Hierarchical RTs not Traversed
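
The three scenarios named on these slides (BT/NT expansion only, RT expansion included, inter-hierarchical RTs not traversed) can be read as different rules for whether a link may be followed during expansion. The sketch below encodes that reading; the `Scenario` enum, the `may_traverse` function and the term-to-hierarchy table are hypothetical names introduced for illustration, and the actual experimental setup may differ.

```python
from enum import Enum


class Scenario(Enum):
    BT_NT_ONLY = "BT/NT expansion only"
    RT_INCLUDED = "RT expansion included"
    NO_INTER_HIERARCHY_RT = "inter-hierarchical RTs not traversed"


# Which AAT hierarchy each term belongs to (small illustrative table).
HIERARCHY_OF = {
    "axes (weapons)": "Weapons & Ammunition",
    "arrows": "Weapons & Ammunition",
    "bows (weapons)": "Weapons & Ammunition",
    "axes (tools)": "Tools & Equipment",
}


def may_traverse(rel, source, target, hierarchy_of, scenario):
    """Decide whether a link of type rel may be followed under a scenario."""
    if rel in ("BT", "NT"):
        return True  # hierarchical links are followed in every scenario
    if rel != "RT":
        return False  # unknown relationship types are never followed
    if scenario is Scenario.BT_NT_ONLY:
        return False
    if scenario is Scenario.NO_INTER_HIERARCHY_RT:
        # Follow an RT only when both terms sit in the same hierarchy.
        return hierarchy_of[source] == hierarchy_of[target]
    return True  # Scenario.RT_INCLUDED
```

Under the third scenario the RT between axes (weapons) and axes (tools) is blocked because it crosses hierarchies, while the RT between arrows and bows (weapons) is still followed.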

Page 14: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Specialisation of RTs

• Alternative approach is to make use of a richer set of thesaurus relationships by specialising the main relationships

• Allows optional filtering on RT subtypes

• In creating the AAT, a subset of RT types was employed by thesaurus editors

Page 15: Associative and Spatial Relationships  in Thesaurus-based Retrieval

AAT RT Codes & Rules

1. Alternate hierarchical relationship

– Alternative broader/narrower terms (arrows - edged weapons)

2. Whole/Part relationships (arrows - nocks)

3. Interfacet links (19 subtypes)

– Activity - Equipment Needed or Produced (arrows - archery)

4. Distinguished-from links (axes (weapons) - axes (tools))

5. Conjuncted-term links (arrows - bows (weapons))
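
As a hedged illustration of optional filtering on RT subtypes, the five codes above can be modelled as an enumeration and each RT link tagged with its code. The example pairs are the ones listed on this slide; the `RTType` names, the link list layout and the `related_terms` function are assumptions introduced for the sketch.

```python
from enum import IntEnum


class RTType(IntEnum):
    ALTERNATE_HIERARCHICAL = 1
    WHOLE_PART = 2
    INTERFACET = 3
    DISTINGUISHED_FROM = 4
    CONJUNCTED_TERM = 5


# (term, term, RT code) triples using the example pairs from the slide.
RT_LINKS = [
    ("arrows", "edged weapons", RTType.ALTERNATE_HIERARCHICAL),
    ("arrows", "nocks", RTType.WHOLE_PART),
    ("arrows", "archery", RTType.INTERFACET),
    ("axes (weapons)", "axes (tools)", RTType.DISTINGUISHED_FROM),
    ("arrows", "bows (weapons)", RTType.CONJUNCTED_TERM),
]


def related_terms(term, links, allowed_types):
    """Return terms linked to term by an RT whose subtype is allowed.

    RT links are treated as symmetric, so either end of a pair may match.
    """
    found = []
    for a, b, rt_type in links:
        if rt_type not in allowed_types:
            continue
        if a == term:
            found.append(b)
        elif b == term:
            found.append(a)
    return found
```

For instance, restricting expansion from "arrows" to whole/part and conjuncted-term links keeps "nocks" and "bows (weapons)" and drops "archery" and "edged weapons".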

Page 16: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Semantic distance

[Figure: the same fragment of the AAT hierarchies "Tools & Equipment" and "Weapons & Ammunition", now with links marked BT, RT (1) and RT (4). Terms shown: weapons, edged weapons, axes (weapons), battle-axes, throwing axes, franciscas, tomahawks (weapons), staff weapons, gisarmes, halberds, pollaxes, harpoons, <cutting tools>, axes (tools), hatchets, <wood-cutting and finishing tools>, Pulaskis.]

Page 17: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Specialised RTs

Page 18: Associative and Spatial Relationships  in Thesaurus-based Retrieval

Conclusions

• We have explored the use of semantic distance measures for spatial and associative (RT) thesaurus relationships:

– distance measures can be used with place name hierarchies, as used in online gazetteers and geographical thesauri, when footprint data is limited

• The experimental RT scenarios suggest a potential for specialisation of RTs into different sub-types, and optionally linking RT type to query context

• Relationship subclasses used in thesaurus design should be retained for later use in retrieval