Associative and Spatial Relationships in Thesaurus-based Retrieval
Harith Alani1, Christopher Jones2, Douglas Tudhope1
1 School of Computing, University of Glamorgan {halani,dstudhope}@glam.ac.uk
2 Department of Computer Science, University of Cardiff [email protected]
OASIS - Ontologically Augmented Spatial Information System
• Aims:
– Investigate terminology systems for thematic and spatial access in digital library applications
and in particular:
• Investigate retrieval potential of geographical metadata schema consisting of rich place name data, with limited locational information
• Explore retrieval potential of reasoning over underlying semantic relationships
Presentation
• Overview of OASIS
• Spatial query expansion
• Semantic distance measures for query expansion
• Role of thesaurus associative relationships
• Conclusions
OASIS datasets
• Art & Architecture Thesaurus (AAT): thematic descriptors - town, arrow, axe, etc. (J. Paul Getty Trust)
• Thesaurus of Geographic Names (TGN): place names, hierarchies and centroid co-ordinates. (J. Paul Getty Trust)
• Bartholomew digital map data: place names, co-ordinates, and adjacency relationships
• Ontological schema implemented using Semantic Index System (SIS), also used to store the data collection
• Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS): dataset of Scottish archaeological sites and objects
OASIS Schema
[Figure: OASIS schema diagram. Labels recoverable from the slide:]
• Geographical Concept and Place (isA links), with Scope Note, latitude, longitude and area; value types String and Integer
• Name: Standard Name (Preferred Term) and Alternative Name (Non Preferred Term), with variant spelling, date (Date) and language (Language)
• Topological Relationships: meets, overlaps, partOf
• Artefact: found at, made at, date found and date made (Date), type (Object), made of (Material), description (String)
• Source labels: TGN, TGN & Bartholomew, RCAHMS, AAT
Indexing Artefact DE 121
OASIS UI - Selecting Items
Spatial Expansion
Semantic distance
[Figure: fragment of two AAT hierarchies, Weapons & Ammunition and Tools & Equipment, with links marked BT or RT. Terms shown: weapons, edged weapons, axes (weapons), battle-axes, throwing axes, franciscas, tomahawks (weapons), staff weapons, gisarmes, halberds, pollaxes, harpoons, axes (tools), <cutting tools>, hatchets, <wood-cutting and finishing tools>, Pulaskis]
RT Expansion Experiments
• Sometimes RTs are used in a very loose way
• This causes problems for term expansion over RTs
• We developed a set of experimental scenarios for more precise control in RT expansion
BT/NT Expansion only
RT Expansion included
Inter-Hierarchical RTs not Traversed
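The three scenarios above can be read as different settings of a single expansion routine. The following is a minimal sketch in Python, not the OASIS implementation: the `Link` class, the `expand` function, the link costs and the toy thesaurus fragment are all assumptions made for illustration, and the links shown are not actual AAT data. Expansion is a shortest-path search that stops once the accumulated semantic distance exceeds a threshold.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    target: str
    kind: str                      # "BT", "NT" or "RT"
    cross_hierarchy: bool = False  # True if the link leaves the source hierarchy


# Toy fragment: the link structure is an illustrative assumption, not AAT data.
THESAURUS = {
    "axes (weapons)": [
        Link("edged weapons", "BT"),
        Link("battle-axes", "NT"),
        Link("throwing axes", "NT"),
        Link("staff weapons", "RT"),
        Link("axes (tools)", "RT", cross_hierarchy=True),
    ],
    "throwing axes": [Link("axes (weapons)", "BT"), Link("franciscas", "NT")],
    "axes (tools)": [
        Link("hatchets", "NT"),
        Link("axes (weapons)", "RT", cross_hierarchy=True),
    ],
}


def expand(term, max_distance, costs, allow_cross_hierarchy=True):
    """Return {term: distance} for every term within max_distance of `term`.

    `costs` maps a relationship kind to its traversal cost; kinds that are
    missing from `costs` are not traversed at all.
    """
    best = {term: 0}
    queue = [(0, term)]
    while queue:
        dist, current = heapq.heappop(queue)
        if dist > best[current]:
            continue  # stale queue entry
        for link in THESAURUS.get(current, []):
            if link.kind not in costs:
                continue
            if link.cross_hierarchy and not allow_cross_hierarchy:
                continue
            new_dist = dist + costs[link.kind]
            if new_dist <= max_distance and new_dist < best.get(link.target, float("inf")):
                best[link.target] = new_dist
                heapq.heappush(queue, (new_dist, link.target))
    return best


# Hypothetical costs: RT steps are made more expensive than BT/NT steps.
bt_nt_only = expand("axes (weapons)", 2, {"BT": 1, "NT": 1})
with_rt = expand("axes (weapons)", 2, {"BT": 1, "NT": 1, "RT": 1.5})
same_hierarchy = expand(
    "axes (weapons)", 2, {"BT": 1, "NT": 1, "RT": 1.5}, allow_cross_hierarchy=False
)
```

With these settings, `bt_nt_only` stays inside the BT/NT tree, `with_rt` also reaches terms one RT step away, and `same_hierarchy` drops the RT link that crosses into the other hierarchy.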
Specialisation of RTs
• Alternative approach is to make use of a richer set of thesaurus relationships by specialising the main relationships
• Allows optional filtering on RT subtypes
• In creating the AAT, a subset of RT types was employed by thesaurus editors
AAT RT Codes & Rules
1. Alternate hierarchical relationship
– Alternative broader/narrower terms (arrows - edged weapons)
2. Whole/Part relationships (arrows - nocks)
3. Interfacet links (19 subtypes).
– Activity - Equipment Needed or Produced (arrows - archery)
4. Distinguished-from links (axes (weapons) - axes (tools))
5. Conjuncted-term links (arrows - bows (weapons))
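Filtering on these RT codes can be sketched as follows. This is an illustrative Python fragment, not part of OASIS or the AAT: the names `RTType`, `RT_LINKS` and `related_terms` are hypothetical, the links are only the example pairs listed above, and treating every RT as symmetric is an assumption.

```python
from enum import IntEnum


class RTType(IntEnum):
    """The five RT codes listed above."""
    ALTERNATE_HIERARCHICAL = 1
    WHOLE_PART = 2
    INTERFACET = 3
    DISTINGUISHED_FROM = 4
    CONJUNCTED_TERM = 5


# Example pairs taken from the list above, each tagged with its RT code.
RT_LINKS = [
    ("arrows", "edged weapons", RTType.ALTERNATE_HIERARCHICAL),
    ("arrows", "nocks", RTType.WHOLE_PART),
    ("arrows", "archery", RTType.INTERFACET),
    ("axes (weapons)", "axes (tools)", RTType.DISTINGUISHED_FROM),
    ("arrows", "bows (weapons)", RTType.CONJUNCTED_TERM),
]


def related_terms(term, allowed=None):
    """Return the sorted RT neighbours of `term`, restricted to allowed subtypes.

    `allowed=None` means every subtype is accepted. Links are followed in
    both directions (assumption: RTs are symmetric).
    """
    allowed_set = set(RTType) if allowed is None else set(allowed)
    results = set()
    for left, right, rt_type in RT_LINKS:
        if rt_type not in allowed_set:
            continue
        if left == term:
            results.add(right)
        elif right == term:
            results.add(left)
    return sorted(results)


all_rts = related_terms("arrows")
parts_and_pairs = related_terms("arrows", {RTType.WHOLE_PART, RTType.CONJUNCTED_TERM})
no_distinguished = related_terms("axes (weapons)", set(RTType) - {RTType.DISTINGUISHED_FROM})
```

A query context that should not drift from weapons to tools could exclude code 4, as in `no_distinguished`, while still following the other subtypes.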
Semantic distance
[Figure: fragment of two AAT hierarchies, Weapons & Ammunition and Tools & Equipment, with links marked BT, RT (1) or RT (4). Terms shown: weapons, edged weapons, axes (weapons), battle-axes, throwing axes, franciscas, tomahawks (weapons), staff weapons, gisarmes, halberds, pollaxes, harpoons, axes (tools), <cutting tools>, hatchets, <wood-cutting and finishing tools>, Pulaskis]
Specialised RTs
Conclusions
• We have explored the use of semantic distance measures for spatial and associative (RT) thesaurus relationships:
– Distance measures can be used with place name hierarchies, as used in online gazetteers and geographical thesauri, when footprint data is limited
• The experimental RT scenarios suggest a potential for specialisation of RTs into different sub-types, and optionally linking RT type to query context
• Relationship subclasses used in thesaurus design should be retained for later use in retrieval
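As an illustration of the first conclusion, a distance over a place name hierarchy can be computed without any footprint data. The sketch below is one simple possibility in Python and is not the measure used in OASIS: it counts partOf steps from two places to their nearest common ancestor. The `PART_OF` table, the place names and the function names are all hypothetical.

```python
# Hypothetical partOf hierarchy (child -> parent); not taken from TGN.
PART_OF = {
    "Town A": "District 1",
    "Town B": "District 1",
    "Town C": "District 2",
    "District 1": "Region X",
    "District 2": "Region X",
}


def ancestors(place):
    """Return [place, parent, grandparent, ...] up to the root."""
    chain = [place]
    while chain[-1] in PART_OF:
        chain.append(PART_OF[chain[-1]])
    return chain


def hierarchy_distance(a, b):
    """Count partOf steps from a and b to their nearest common ancestor.

    Returns None when the two places share no ancestor.
    """
    steps_from_a = {place: steps for steps, place in enumerate(ancestors(a))}
    for steps_from_b, place in enumerate(ancestors(b)):
        if place in steps_from_a:
            return steps_from_a[place] + steps_from_b
    return None


def rank_by_proximity(query, candidates):
    """Order candidates by hierarchical distance from the query place."""
    scored = [(hierarchy_distance(query, c), c) for c in candidates]
    return [c for d, c in sorted(s for s in scored if s[0] is not None)]


ranking = rank_by_proximity("Town A", ["Town C", "Town B", "District 1"])
```

Where centroid co-ordinates are available, a measure of this kind could be combined with a co-ordinate-based distance; the weighting between the two would be a further design choice.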