Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project
![Page 1: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/1.jpg)
Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project
Semantic Image Retrieval - The User Perspective
Jonathon Hare
Intelligence, Agents, Multimedia Group
School of Electronics and Computer Science
University of Southampton
{jsh2}@ecs.soton.ac.uk
![Page 2: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/2.jpg)
Introduction
The previous talks have described the issues associated with image retrieval from the practitioner perspective -- a problem that has become known as the ‘semantic gap’ in image retrieval.
This presentation explores how novel computational and mathematical techniques can help improve content-based multimedia search by enabling textual search of unannotated imagery.
![Page 3: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/3.jpg)
Unannotated Imagery
Manually constructing metadata in order to index images is expensive.
Perhaps US$1-$5 per image for simple keywording.
More for archival quality metadata (keywords, caption, title, description, dates, times, events).
Every day, the number of images is increasing.
In many domains, manually indexing everything is an impossible task!
![Page 4: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/4.jpg)
Unannotated Imagery An Example
Kennel Club image collection.
Relatively small (~60,000 images).
~7000 of those digitised.
~3000 of those have subject metadata (mostly keywords); the remainder have little or no information.
Each year, after the Crufts dog show, they expect to receive additional (digital) images (of the order of a few thousand) with little, if any, metadata other than date/time (and only then if the camera is set up correctly).
![Page 5: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/5.jpg)
An Overview of Our Approach
Conceptually simple idea: Teach a machine to learn the relationship between visual features of images and the metadata that describes them.
So, two stages:
Use exemplar image/metadata pairs to learn relationships.
Project learnt relationships to images without metadata in order to make them searchable.
![Page 6: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/6.jpg)
Modelling Visual Information
In order to model the visual content of an image we can generate and extract descriptors or feature-vectors.
Feature-vectors can describe many differing aspects of the image content.
Low level features:
Fourier transforms, wavelet decomposition, texture histograms, colour histograms, shape primitives, filter primitives, etc.
Higher-level features:
Faces, objects, etc.
![Page 7: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/7.jpg)
Visual Term Representations
A modern approach to modelling the content of an image is to treat it like a textual document.
Model image as a collection of “visual terms”.
Analogous to words in a text document.
Feature-vectors can be transformed into visual terms through some mapping.
![Page 8: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/8.jpg)
Visual Term Representations Bag-of-Terms
For indexing purposes, we often discount order/arrangement of terms and just count number of occurrences.
The quick brown fox jumped over the lazy dog
brown dog fox jumped lazy over quick the
[ 1 1 1 1 1 1 1 2 ]
1 [ 2 0 0 6 ]
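The counting step for the text example can be sketched in a few lines of Python. This is only an illustration: the function name `bag_of_terms` and the choice to lower-case and split on whitespace are assumptions, not part of the talk.

```python
from collections import Counter

def bag_of_terms(text):
    """Count term occurrences, ignoring word order and letter case."""
    return Counter(text.lower().split())

counts = bag_of_terms("The quick brown fox jumped over the lazy dog")

# Fix an (alphabetical) term order so the counts form a vector.
vocabulary = sorted(counts)
vector = [counts[term] for term in vocabulary]

print(vocabulary)
print(vector)
```

The same idea carries over to images once each image has been turned into a collection of discrete visual terms.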
![Page 9: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/9.jpg)
Visual Term Representations Example: Global Colour Visual Terms
A common way of indexing the global colours used in an image is the colour histogram.
Each bin of the histogram counts the number of pixels of the colour range represented by that bin.
The colour histogram can thus be used directly as a term occurrence vector in which each bin is represented as a visual term.
Example histogram (pixel counts per bin):
[ 1569 3408 491 0 0 902 2146 5026 0 0 56 3633 0 0 0 6827 ]
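A minimal sketch of a global colour histogram used as a term-occurrence vector, assuming NumPy and an RGB image stored as an H x W x 3 `uint8` array. The bin layout (each channel split into `levels` equal ranges) and the name `colour_histogram` are illustrative assumptions; the slides do not say how the bins were defined.

```python
import numpy as np

def colour_histogram(image, levels=4):
    """Global colour histogram of an H x W x 3 uint8 RGB image.

    Each channel is quantised into `levels` equal ranges,
    giving levels**3 bins (one visual term per bin).
    """
    pixels = image.reshape(-1, 3).astype(np.int64)
    q = pixels * levels // 256                       # per-channel index, 0..levels-1
    bin_index = (q[:, 0] * levels + q[:, 1]) * levels + q[:, 2]
    return np.bincount(bin_index, minlength=levels ** 3)

# Synthetic 2 x 2 image: three pure-red pixels and one pure-blue pixel.
image = np.array([[[255, 0, 0], [255, 0, 0]],
                  [[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

hist = colour_histogram(image, levels=2)
print(hist)
```

Each entry of `hist` is the occurrence count of one visual term, so the vector can be indexed exactly like a bag-of-terms vector for text.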
![Page 10: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/10.jpg)
Visual Term Representations Example: Local interest-point based visual terms
Features are based on Lowe’s difference-of-Gaussian region detector and the SIFT feature vector.
A vocabulary of exemplar feature-vectors is learnt by applying k-means clustering to a training set of features.
Feature-vectors can then be quantised to discrete visual terms by finding the closest exemplar in the vocabulary.
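The vocabulary-learning and quantisation steps can be sketched as follows. This is a toy illustration under stated assumptions: random 128-dimensional vectors stand in for real SIFT descriptors, k-means is written out by hand with a simple deterministic farthest-point seeding, and the function names are hypothetical.

```python
import numpy as np

def quantise(features, vocabulary):
    """Map each feature vector to the index of its closest exemplar."""
    distances = np.linalg.norm(
        features[:, None, :] - vocabulary[None, :, :], axis=2)
    return distances.argmin(axis=1)

def learn_vocabulary(features, k, iterations=10):
    """Plain k-means; returns k exemplar vectors (cluster centres)."""
    # Deterministic farthest-point seeding (an illustrative choice).
    centres = [features[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centres], axis=0)
        centres.append(features[d.argmax()])
    centres = np.array(centres, dtype=float)
    for _ in range(iterations):
        labels = quantise(features, centres)
        for j in range(k):
            members = features[labels == j]
            if len(members):              # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    return centres

# Synthetic stand-ins for SIFT descriptors: two well-separated groups.
rng = np.random.default_rng(1)
training = np.vstack([rng.normal(0.0, 0.1, (50, 128)),
                      rng.normal(5.0, 0.1, (50, 128))])
vocabulary = learn_vocabulary(training, k=2)

# Descriptors from one image become visual terms, then a term-count vector.
image_features = np.vstack([rng.normal(0.0, 0.1, (3, 128)),
                            rng.normal(5.0, 0.1, (1, 128))])
terms = quantise(image_features, vocabulary)
term_counts = np.bincount(terms, minlength=len(vocabulary))
print(terms, term_counts)
```

A real vocabulary would be far larger than two terms; the small `k` here only keeps the example easy to follow.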
![Page 11: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/11.jpg)
Semantic Spaces
Basic idea: Create a large multidimensional space in which images, keywords (or other metadata) and visual terms can be placed.
In the training stage, learn how keywords are related to visual terms and images.
Place related visual terms, images and keywords close together within the space.
In the projection stage, unannotated images can be placed in the space based upon the visual terms they contain.
The placement should be such that they lie near keywords that describe them.
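One way such a space could be built is with a truncated singular value decomposition of a term-by-image matrix, in the style of latent semantic indexing. The slides do not state which factorisation is used, so the SVD, the tiny training matrix and the name `keyword_scores` below are all assumptions made for illustration.

```python
import numpy as np

keywords = ["sun", "tree"]
visual_terms = ["yellow", "green", "blue"]   # e.g. colour-histogram bins

# Rows: keywords, then visual terms. Columns: four annotated training images.
training = np.array([
    [1, 1, 0, 0],   # sun
    [0, 0, 1, 1],   # tree
    [5, 4, 0, 1],   # yellow
    [0, 1, 5, 4],   # green
    [1, 1, 1, 1],   # blue
], dtype=float)

# Training stage: keep the k strongest directions of the term-image matrix.
k = 2
U, S, Vt = np.linalg.svd(training, full_matrices=False)
term_basis = U[:, :k]            # one row per keyword / visual term

def keyword_scores(visual_counts):
    """Project an unannotated image into the space and score each keyword."""
    # The keyword rows are unknown for an unannotated image, so they are zero.
    column = np.concatenate([np.zeros(len(keywords)), visual_counts])
    coords = term_basis.T @ column                   # image position in the space
    scores = term_basis[:len(keywords)] @ coords     # affinity to each keyword
    return dict(zip(keywords, scores))

mostly_yellow = keyword_scores(np.array([6.0, 0.0, 1.0]))
mostly_green = keyword_scores(np.array([0.0, 6.0, 1.0]))
print(mostly_yellow)
print(mostly_green)
```

In this toy setting the mostly-yellow image scores higher for "sun" and the mostly-green image scores higher for "tree", because those colours co-occurred with those keywords in the training images.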
![Page 12: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/12.jpg)
Semantic Spaces Conceptual Overview
![Page 13: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/13.jpg)
Semantic Spaces Conceptual Overview
![Page 14: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/14.jpg)
Semantic Spaces Uses of the space
Once constructed, the semantic space has a number of uses:
Finding images (both annotated and unannotated) by keyword(s)/metadata.
Finding images (both annotated and unannotated) by semantically similar images.
Determining likely metadata for an image.
Examining keyword-keyword and keyword-visual term relationships.
Segmenting an image.
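The first three uses reduce to ranking items by how close they lie in the space. The sketch below assumes cosine similarity as the closeness measure and uses made-up two-dimensional coordinates; neither the measure nor the coordinates come from the talk.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 2-D positions, purely for illustration.
keyword_pos = {"sun": np.array([1.0, 0.1]), "train": np.array([0.1, 1.0])}
image_pos = {"img_a": np.array([0.9, 0.2]),
             "img_b": np.array([0.2, 0.8]),
             "img_c": np.array([0.6, 0.5])}

def search_by_keyword(word):
    """Rank all images by closeness to a keyword."""
    return sorted(image_pos,
                  key=lambda i: cosine(image_pos[i], keyword_pos[word]),
                  reverse=True)

def search_by_image(query):
    """Rank the other images by closeness to a query image."""
    others = [i for i in image_pos if i != query]
    return sorted(others,
                  key=lambda i: cosine(image_pos[i], image_pos[query]),
                  reverse=True)

def suggest_keywords(image):
    """Rank keywords by closeness to an image."""
    return sorted(keyword_pos,
                  key=lambda w: cosine(keyword_pos[w], image_pos[image]),
                  reverse=True)

print(search_by_keyword("sun"))
print(search_by_image("img_b"))
print(suggest_keywords("img_b"))
```

Because annotated and unannotated images share the same space, the same ranking functions apply to both.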
![Page 15: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/15.jpg)
Semantic Spaces Searching by Keyword
[Figure labels: SUN, TRAIN]
![Page 16: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/16.jpg)
Semantic Spaces Searching by Keyword
Search for images about “SUN”
Ranked Search Results:
[Figure labels: SUN, TRAIN]
![Page 17: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/17.jpg)
Semantic Spaces Searching by Keyword
Search for images about “SUN”
Ranked Search Results:
[Figure labels: SUN, TRAIN]
![Page 19: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/19.jpg)
Semantic Spaces Searching by Image
![Page 20: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/20.jpg)
Semantic Spaces Searching by Image
Search for images like this:
Ranked Search Results:
![Page 23: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/23.jpg)
Semantic Spaces Suggesting Keywords
[Figure labels: SUN, SKY, MOUNTAIN, TREE, CAR]
![Page 24: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/24.jpg)
Semantic Spaces Suggesting Keywords
Suggest keywords for this image:
Suggested keywords:
[Figure labels: SUN, SKY, MOUNTAIN, TREE, CAR]
![Page 26: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/26.jpg)
Semantic Spaces Suggesting Keywords
Suggest keywords for this image:
Suggested keywords: SKY MOUNTAIN TREE SUN CAR
[Figure labels: SUN, SKY, MOUNTAIN, TREE, CAR]
[Figure labels: CAR, SUN, TREE, SKY, MOUNTAIN]
![Page 27: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/27.jpg)
Semantic Spaces Experimental Retrieval Results - Corel Dataset
Colour histograms used as visual terms (each bin representing a single term).
Standard experimental collection: 500 test images, 4500 training images.
Results are quite impressive: comparable with the Machine Translation auto-annotation technique (but remember we are using much simpler image features).
Works well for query keywords that are easily associated with a particular set of colours, but not so well for the other keywords.
![Page 28: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/28.jpg)
Semantic Spaces Experimental Retrieval Results - Corel Dataset
Top 15 images when querying for ‘sun’
![Page 29: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/29.jpg)
Semantic Spaces Experimental Retrieval Results - Corel Dataset
Top 15 images when querying for ‘horse’
![Page 30: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/30.jpg)
Semantic Spaces Experimental Retrieval Results - Corel Dataset
Top 15 images when querying for ‘foals’
![Page 31: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/31.jpg)
Demo The K9 Retrieval System
We have built a demonstration system around the semantic space idea and applied it to images from the Kennel Club picture library (>7000 images, ∼3000 with keywords).
The system allows annotated images to be retrieved by keywords and concepts (keywords with thesaurus expansion).
Both annotated and unannotated images can also be retrieved using the semantic space and regular content-based techniques.
This brief demo will concentrate on retrieval of annotated images using keyword matching, and unannotated images using the semantic space.
![Page 32: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/32.jpg)
Conclusions
Semantic retrieval of unannotated images is hard!
Our semantic space approach takes us some of the way, but there is still a long way to go.
Retrieval is limited by the choice of visual features, and how well those features relate to the keywords.
![Page 33: Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project](https://reader034.vdocument.in/reader034/viewer/2022051414/55ac89f71a28abeb588b4599/html5/thumbnails/33.jpg)
Questions?