In-depth Exploration of Geotagging Performance using sampling strategies on YFCC100M
George Kordopatis-Zilos, Symeon Papadopoulos, Yiannis Kompatsiaris
Information Technologies Institute, Thessaloniki, Greece
MMCommons Workshop, October 16, 2016 @ Amsterdam, NL

Upload: symeon-papadopoulos

Post on 13-Jan-2017


TRANSCRIPT

Page 1: In-depth Exploration of Geotagging Performance

In-depth Exploration of Geotagging Performance using sampling strategies on YFCC100M
George Kordopatis-Zilos, Symeon Papadopoulos, Yiannis Kompatsiaris
Information Technologies Institute, Thessaloniki, Greece

MMCommons Workshop, October 16, 2016 @ Amsterdam, NL

Page 2: Where is it?

• Depicted landmark: Eiffel Tower
• Location: Paris, Tennessee

The keyword "Tennessee" is very important to correctly place the photo.

Source (Wikipedia): http://en.wikipedia.org/wiki/Eiffel_Tower_(Paris,_Tennessee)

Page 3: Motivation

Evaluating multimedia retrieval systems:
• What do we evaluate?
• How?
• What decisions do we make based on it?

[Diagram: MM system (black box) + Test Collection → Comparison to ground truth → Evaluation measure → Decision]

Page 4: Problem Formulation

• Test collection creation → evaluation bias
• Performance reduced to a single measure → misses many nuances of performance
• Test problem: geotagging = predicting the geographic location of a multimedia item based on its content

Page 5: Example: Evaluating Geotagging

• Test collection #1: 1M images, 700K located in the US
• Assume we use P@1km as the evaluation measure
• System 1: almost perfect precision in the US (100%), very poor for the rest of the world (10%) → P@1km = 0.7·100 + 0.3·10 = 73%
• System 2: approximately the same precision all over the world (65%) → P@1km = 65%
• Test collection #2: 1M images, 500K depicting cats and puppies on a white background → for 50% of the collection, any prediction is essentially random.
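The aggregate P@1km figures above are just population-weighted averages; a minimal sketch (the function name is ours, not from the slides):

```python
def aggregate_p_at_1km(segments):
    """Population-weighted aggregate precision.

    segments: (fraction_of_collection, precision_percent) pairs
    that together cover the whole collection.
    """
    assert abs(sum(f for f, _ in segments) - 1.0) < 1e-9
    return sum(f * p for f, p in segments)

# System 1: 100% precision on the 70% of items in the US, 10% elsewhere.
system1 = aggregate_p_at_1km([(0.7, 100.0), (0.3, 10.0)])  # 73%
# System 2: uniform 65% precision everywhere.
system2 = aggregate_p_at_1km([(0.7, 65.0), (0.3, 65.0)])   # 65%
```

The single aggregate number hides that System 1 is far worse outside the US, which is exactly the evaluation-bias point of this slide.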

Page 6: Multimedia Geotagging

• Problem of estimating the geographic location of a multimedia item (e.g. Flickr image + metadata)
• Variety of approaches:
  • Text-based: use the text metadata (tags)
    • Gazetteer-based
    • Statistical methods (associations between tags & locations)
  • Visual
    • Similarity-based (find the most similar items and use their location)
    • Model-based (learn a visual model of an area)
  • Hybrid: combine text and visual

Page 7: Language Model

• Most likely cell: [formula shown on slide]
• Tag-cell probability: [formula shown on slide]

We will refer to this as: Base LM (or Basic)
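The two formulas on this slide are rendered as images in the transcript. A standard formulation of such a tag-based language model (an assumption based on common practice, not a transcription of the slide) is:

```latex
\hat{c} \;=\; \operatorname*{arg\,max}_{c} \sum_{t \in T} p(t \mid c),
\qquad
p(t \mid c) \;=\; \frac{N_u(t, c)}{\sum_{c'} N_u(t, c')}
```

where T is the set of tags of the query item and N_u(t, c) counts the users who used tag t inside grid cell c.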

Page 8: Language Model Extensions

• Feature selection
  • Discard tags that do not provide any geographical cues
  • Selection criterion: locality > 0
• Feature weighting
  • Give more importance to tags carrying geographic information
  • Linear combination of locality and spatial entropy
• Multiple grids
  • Consider two grids, fine and coarse; if the estimate from the fine grid falls within that of the coarse grid, use the fine-grid estimate
• Similarity search
  • Within the selected cell, use the lat/lon of the most similar item to refine the location estimate

We will refer to this as: Full LM (or Full)
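The selection and weighting steps can be sketched as follows. The slides only state "locality > 0" and "linear combination of locality and spatial entropy", so the entropy inversion and the mixing weight `a` below are illustrative assumptions, not the paper's exact definitions:

```python
import math

def spatial_entropy(cell_counts):
    """Shannon entropy (bits) of a tag's distribution over grid cells."""
    total = sum(cell_counts.values())
    return -sum(
        (n / total) * math.log2(n / total)
        for n in cell_counts.values() if n > 0
    )

def select_tags(tag_locality):
    """Feature selection: keep only tags with locality > 0."""
    return [tag for tag, loc in tag_locality.items() if loc > 0]

def tag_weight(locality, entropy, a=0.5):
    """Feature weighting (assumed form): a linear combination of locality
    and a term that decays with spatial entropy, since tags spread over
    many cells carry little geographic information."""
    return a * locality + (1 - a) / (1 + entropy)
```

A tag like "eiffel" (high locality, low spatial entropy) gets a large weight; a tag like "fun" is discarded by the selection step.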

Page 9: MediaEval Placing Task

• Benchmarking activity in the context of MediaEval
• Dataset: Flickr images and videos (different each year), split into training and test sets
• Also possible to test systems that use external data

Edition   Training Set   Test Set
2015      4,695,149      949,889
2014      5,025,000      510,000
2013      8,539,050      262,000

Page 10: Proposed Evaluation Framework

• Initial (reference) test collection Dref
• Sampling function f: Dref → Dtest
• Performance volatility [formula shown on slide]
• p(D): performance score achieved on collection D
• In our case, we consider two such measures:
  • P@1km
  • Median distance error
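The two measures p(D) can be sketched as below. The haversine great-circle distance is a standard choice for this task, though the slides do not specify the exact distance function used:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    r = 6371.0  # mean earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def evaluate(predictions, ground_truth):
    """Return (P@1km, median distance error in km) over a collection."""
    errors = sorted(haversine_km(*p, *g) for p, g in zip(predictions, ground_truth))
    p_at_1km = sum(e <= 1.0 for e in errors) / len(errors)
    mid = len(errors) // 2
    median = errors[mid] if len(errors) % 2 else (errors[mid - 1] + errors[mid]) / 2
    return p_at_1km, median
```

Running `evaluate` on Dref and on a sampled Dtest gives the two scores whose difference the framework studies.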

Page 11: Sampling Strategies

A variety of approaches for the Placing Task collection:
• Geographical Uniform Sampling
• User Uniform Sampling
• Text-based Sampling
• Text Diversity Sampling
• Geographically Focused Sampling
• Ambiguity-based Sampling
• Visual Sampling

Page 12: Uniform Sampling

• Geographic Uniform Sampling
  • Divide the earth's surface into square areas of approximately the same size (~10x10 km)
  • Select N items from each area (N = median of items/area)
• User Uniform Sampling
  • Select only one item per user

Page 13: Text Sampling

• Text-based Sampling
  • Select only items with more than M terms (M = median of terms/item)
• Text Diversity Sampling
  • Represent items using bag-of-words
  • Use MinHash to generate a binary code per BoW vector
  • Select one item per code (bucket)

Page 14: Other Sampling Strategies

• Geographically Focused Sampling
  • Pick items from a selected place (continent/country)
• Ambiguity-based Sampling
  • Select the set of items associated with ambiguous place names (or the complementary set)
  • Ambiguity is defined with the help of entropy
• Visual Sampling
  • Select only items associated with a given visual concept
  • Select only items associated with concepts related to buildings

Page 15: Experiments - Setup

• Placing Task 2015 dataset: 949,889 images (subset of YFCC100M)
• Test four variants of the Language Model method:
  • Basic-PT: Base LM method trained on the PT dataset (≈4.7M geotagged images released by the task organizers)
  • Full-PT: Full LM method trained on the PT dataset
  • Basic-Y: Base LM method trained on the YFCC dataset (≈40M geotagged images of YFCC100M)
  • Full-Y: Full LM method trained on the YFCC dataset

Page 16: Reference Results

Page 17: Geographical Uniform Sampling

• Initial distribution
• Uniform distribution: select three items/cell

Page 18: User Uniform Sampling

Page 19: Text-based Sampling

Select only images with >7 tags/item

Page 20: Text Diversity Sampling

• After MinHash, 478,817 buckets were created.

Page 21: Geographically Focused Sampling

Results of Full-Y

Page 22: Ambiguity-based Sampling

Page 23: Visual Sampling

Page 24: Summary of Results

Page 25: Thank you!

Data/Code:
• https://github.com/MKLab-ITI/multimedia-geotagging/

Get in touch:
• George Kordopatis-Zilos: [email protected]
• Symeon Papadopoulos: [email protected] / @sympap

With the support of: