relja arandjelović and andrew zisserman · 2014. 10. 28. · visual vocabulary with a semantic...



Visual vocabulary with a semantic twist

Relja Arandjelović and Andrew Zisserman

Visual Geometry Group, Department of Engineering Science, University of Oxford

Motivation and objectives

Semantic vocabulary

Results

Fast Semantic Segmentation via Soft Segments (FSSS)

(paper within a paper)

• Standard large scale instance retrieval:

- Usually based on matching local descriptors, e.g. (Root)SIFT

- Not distinctive enough

- Can't "see the big picture"

• SemanticSIFT:

- Matching: utilize local image semantic content

[Figure panels: before / after]

• Suppose we have pixel-wise semantic segmentation into C classes

• Assign a "semantic word" to a local image patch:

- The patch contains semantic class c if it contains at least one pixel of a class c

- Number of possible semantic words: $K_s = 2^C - 1$

- For our choice: {sky, flora, other} ($C = 3$) there are $K_s = 7$ semantic words: {sky}, {flora}, {other}, {sky, flora}, {sky, other}, {flora, other}, {sky, flora, other}
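The assignment rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the bitmask encoding, the function names `semantic_word` and `word_to_classes`, and the integer class labels 0..C-1 are all assumptions made for the example.

```python
import numpy as np

CLASSES = ("sky", "flora", "other")  # C = 3, the choice stated on the poster


def semantic_word(patch_labels: np.ndarray, num_classes: int = len(CLASSES)) -> int:
    """Map the set of classes present in a patch to an ID in [0, 2^C - 2].

    patch_labels holds one integer class label per pixel (assumed encoding).
    A class counts as present if at least one pixel carries its label.
    """
    labels = np.unique(patch_labels)
    if labels.size == 0:
        raise ValueError("patch must contain at least one pixel")
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError("label outside [0, num_classes)")
    mask = 0
    for c in labels:
        mask |= 1 << int(c)  # one bit per class present
    # A non-empty patch gives mask in [1, 2^C - 1]; shift to a 0-based ID.
    return mask - 1


def word_to_classes(word: int) -> set:
    """Inverse mapping: recover the class names encoded in a semantic word ID."""
    mask = word + 1
    return {name for i, name in enumerate(CLASSES) if mask & (1 << i)}


# A patch straddling a sky/flora boundary
patch = np.array([[0, 0, 1],
                  [0, 0, 1]])
word = semantic_word(patch)
```

Because the empty set is excluded, the IDs cover exactly the $2^C - 1$ non-empty subsets of classes.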

Matching

Product vocabulary

Feature removal

• Patches can match only if their semantic words are identical

• Win #1: Increases precision due to stricter matching
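As a rough sketch of the matching rule, two features are treated as a match only when both their visual words and their semantic words agree. The `Feature` type and the helper names are hypothetical, and the sketch ignores Hamming Embedding and any other refinement used in the actual system.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    visual_word: int    # quantized local descriptor, e.g. (Root)SIFT
    semantic_word: int  # ID of the set of classes present in the patch


def can_match(a: Feature, b: Feature) -> bool:
    """Stricter matching: visual AND semantic words must be identical."""
    return a.visual_word == b.visual_word and a.semantic_word == b.semantic_word


def filter_matches(query: List[Feature], database: List[Feature]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) of query/database features allowed to match."""
    return [
        (i, j)
        for i, q in enumerate(query)
        for j, d in enumerate(database)
        if can_match(q, d)
    ]


query = [Feature(5, 0), Feature(7, 2)]
database = [Feature(5, 0), Feature(5, 1), Feature(7, 2)]
matches = filter_matches(query, database)
```

Here the database feature with visual word 5 but a different semantic word is rejected, which is the kind of false match the semantic filter is meant to remove.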

• SemanticSIFT vocabulary: product vocabulary of the visual and semantic vocabularies; size $K = K_{\text{semantic}} \times K_{\text{visual}}$
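One simple way to realise a product vocabulary is to pack the two word IDs into a single index. The row-major layout below is an assumption chosen for illustration; the poster states only the vocabulary size, not how indices are laid out.

```python
from typing import Tuple


def product_word(visual_word: int, semantic_word: int,
                 k_visual: int, k_semantic: int) -> int:
    """Combine a visual and a semantic word into one ID in [0, K),
    where K = k_semantic * k_visual."""
    if not 0 <= visual_word < k_visual:
        raise ValueError("visual_word out of range")
    if not 0 <= semantic_word < k_semantic:
        raise ValueError("semantic_word out of range")
    return semantic_word * k_visual + visual_word


def split_product_word(word: int, k_visual: int) -> Tuple[int, int]:
    """Inverse mapping: return (semantic_word, visual_word)."""
    return divmod(word, k_visual)


K_VISUAL, K_SEMANTIC = 1000, 7  # illustrative sizes; 7 matches C = 3
combined = product_word(42, 3, K_VISUAL, K_SEMANTIC)
largest = product_word(K_VISUAL - 1, K_SEMANTIC - 1, K_VISUAL, K_SEMANTIC)
```

Two features share a product word exactly when both component words agree, so an ordinary bag-of-words pipeline run on the combined IDs enforces the semantic matching constraint without further changes.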

• Large scale retrieval: ranking via inverted index which exploits bag-of-words sparsity

- Larger vocabulary => shorter posting lists => fewer items to traverse during scoring => faster retrieval

• Win #2: Faster retrieval due to the larger (product) vocabulary
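The effect of vocabulary size on posting-list length can be seen in a toy inverted index. This sketch uses raw term-count dot products for scoring, which is a simplification and not the baseline's actual scoring; the function names and the toy data are invented for the example.

```python
from collections import Counter, defaultdict


def build_inverted_index(images):
    """images: dict image_id -> list of word IDs.
    Returns word -> posting list of (image_id, count)."""
    index = defaultdict(list)
    for image_id, words in images.items():
        for word, count in Counter(words).items():
            index[word].append((image_id, count))
    return index


def score_query(index, query_words):
    """Accumulate scores by traversing only the query words' posting lists.
    Also returns how many postings were visited."""
    scores = Counter()
    traversed = 0
    for word, q_count in Counter(query_words).items():
        for image_id, d_count in index.get(word, ()):
            scores[image_id] += q_count * d_count
            traversed += 1
    return scores, traversed


K_VISUAL = 10  # toy visual vocabulary size
# Each feature is (visual_word, semantic_word)
features = {
    "a": [(1, 0), (1, 2), (2, 0)],
    "b": [(1, 2), (2, 1)],
    "c": [(1, 0), (2, 0)],
}
query = [(1, 0), (2, 0)]


def visual_only(feats):
    return [v for v, _ in feats]


def product(feats):
    return [s * K_VISUAL + v for v, s in feats]


visual_index = build_inverted_index({k: visual_only(f) for k, f in features.items()})
product_index = build_inverted_index({k: product(f) for k, f in features.items()})

_, traversed_visual = score_query(visual_index, visual_only(query))
product_scores, traversed_product = score_query(product_index, product(query))
```

With the visual-only vocabulary the query visits every image in both posting lists; with the product vocabulary, image "b" drops out of the lists entirely because its semantic words differ, so fewer postings are traversed.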

• For a specific task: some features are not useful, or even detrimental

• Can remove features a priori known to be irrelevant

• Win #3: Reduced storage (RAM) costs
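Feature removal amounts to dropping features before they are indexed. The poster does not say which semantic words are discarded, so the set `IRRELEVANT` below (patches containing only sky and/or flora) is a hypothetical choice for a building-retrieval task, and the dictionary layout of a feature is likewise assumed.

```python
# Hypothetical discard set: patches that contain no "other" pixels at all
IRRELEVANT = {
    frozenset({"sky"}),
    frozenset({"flora"}),
    frozenset({"sky", "flora"}),
}


def remove_irrelevant(features, irrelevant=IRRELEVANT):
    """Keep only features whose semantic word is not in the discard set.
    Each feature is a dict with a 'semantic' entry listing its classes."""
    return [f for f in features if frozenset(f["semantic"]) not in irrelevant]


features = [
    {"visual_word": 12, "semantic": {"sky"}},
    {"visual_word": 40, "semantic": {"other"}},
    {"visual_word": 40, "semantic": {"sky", "flora"}},
    {"visual_word": 7, "semantic": {"flora", "other"}},
]
kept = remove_irrelevant(features)
```

Features dropped this way never enter the inverted index, which is where the storage saving comes from.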

Win-Win-Win

• Testing on Oxford 5k and 105k datasets, training on Paris6k

• Baseline: Hamming Embedding + burstiness

• Over 5 random seeds: +1.2%

• Baseline with 7x larger visual vocabulary, Oxford 5k: 54.9%

• Expected speedup for an average query for Oxford 105k and SoftSemanticSIFT: 38.4%

Mean average precision (mAP)

Empirical speedup for the 55 Oxford queries

• State-of-the-art semantic segmentation methods take minutes per image

• We introduce a new method which takes 7 seconds on a single CPU in MATLAB for a 500x500 pixel image

• Code available: www.robots.ox.ac.uk/~vgg/software/fast_semantic_segmentation

• Idea:

- Start with fast soft-segmentation method by Leordeanu et al. ECCV 2012 (takes 1.7s)

- To handle segmentation uncertainty: introduce an "unknown" class and allow it to match all classes

- Minimize an energy which encourages agreement between soft-segments and similar pixels, taking into account soft-segment unary potentials
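The "unknown" class mentioned above relaxes the identical-semantic-word rule. The poster does not specify how the unknown class interacts with semantic words, so this sketch assumes a patch is either assigned a regular semantic word or marked wholly unknown; `UNKNOWN` and `soft_can_match` are hypothetical names.

```python
UNKNOWN = None  # assumed marker for a patch whose segmentation is uncertain


def soft_can_match(a, b) -> bool:
    """Semantic compatibility test with an 'unknown' wildcard.

    a and b are semantic words (here frozensets of class names) or UNKNOWN.
    UNKNOWN is allowed to match every semantic word; otherwise the words
    must be identical.
    """
    if a is UNKNOWN or b is UNKNOWN:
        return True
    return a == b


sky = frozenset({"sky"})
building_edge = frozenset({"sky", "other"})
```

Under this assumption an uncertain patch is never penalised by the semantic filter, at the cost of losing the filter's benefit for that patch.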

• Results:

- Stanford background dataset: 78% @ 3.7s / image

- State-of-the-art: Lempitsky et al. (2011): 81.9% @ minutes per image due to using globalPb

- Tighe & Lazebnik (2010): 77.5% @ 10 min / image

[Figure: false matches based on SIFT that are removed by semantic filtering (no geometric verification)]
