Post on 09-Aug-2020

6.S093 Visual Recognition through Machine Learning Competition

Image by kirkh.deviantart.com

Aditya Khosla

Today’s class

• Part 1: Competition details

• Part 2: Image representation lecture

– Bag-of-words

– Spatial pyramid

• Part 3: Feature extraction tutorial

Competition details: dataset

10 object categories

airplane, bicycle, car, cup/mug, dog(s), guitar, hamburger, sofa, traffic light, person

Competition details: dataset

Training set

8,000 images

Validation set

2,000 images

Testing set

5,000 images

labels provided NO labels provided

Leaderboard set

Competition details: submission

• For each image, you provide the probability of every class belonging in it (as returned by your algorithm)

[Figure: example per-class probabilities on a 0 to 1 axis for airplane, bicycle, car, cup, dog, guitar, hamburger, sofa, traffic light, person]

Competition details: evaluation

• Average precision
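As a rough illustration of the metric, here is a minimal sketch of non-interpolated average precision for a single class. The slides do not say which AP variant the competition uses (for example, an interpolated one), so treat the formula and the function name `average_precision` as assumptions.

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP for one class.

    scores: predicted probability per image.
    labels: 1 if the class is present in the image, else 0.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Rank images from most to least confident.
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]
    if hits.sum() == 0:
        return 0.0  # no positives: AP is undefined, return 0 by convention here
    # Precision after each ranked image.
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    # Average the precision values at the ranks where a positive occurs.
    return float((precision_at_k * hits).sum() / hits.sum())
```

Averaging this value over the 10 categories would give a single score per submission, if the competition uses a mean over classes.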

Competition details: prizes

Cash

first

+ cash

second third

+ cash

Competition details: thank you!

Image representation: bag-of-words

Document representation: bag-of-words

• Orderless document representation: frequencies of words from a dictionary (Salton & McGill, 1983)

US Presidential Speeches Tag Cloud

Image representation: bag-of-words

document

bag-of-words

image bag-of-visual words

Object Bag of ‘words’

Object Ugly bag of ‘words’

Object Stylish bag of ‘words’

visual dictionary

Image representation: bag-of-words

1. Extract descriptors

2. Learn “visual dictionary”

3. Quantize features using visual vocabulary

4. Represent images by frequencies of “visual words”
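Steps 3 and 4 can be sketched in a few lines, assuming the descriptors are already extracted and the visual dictionary is given as an array of cluster centers. The function name `bow_histogram` and the use of Euclidean nearest-neighbor assignment are assumptions for illustration, not something the slides specify.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Quantize descriptors against a vocabulary and return word frequencies.

    descriptors: (n, d) array, one row per local descriptor of one image.
    vocabulary:  (k, d) array, one row per visual word.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    vocabulary = np.asarray(vocabulary, dtype=float)
    # Squared Euclidean distance from every descriptor to every visual word.
    dists = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    # Step 3: each descriptor is assigned to its nearest visual word.
    words = dists.argmin(axis=1)
    # Step 4: count how often each visual word occurs, then normalize.
    counts = np.bincount(words, minlength=len(vocabulary)).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts
```

The result is a fixed-length vector of size k regardless of how many descriptors the image produced, which is what makes it usable as input to a classifier.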

1. Extracting descriptors

• Regular grid

• Interest points
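For the regular-grid option, a minimal sketch of where patches could be sampled. The step and patch size defaults are arbitrary assumptions, and `dense_grid_centers` is a hypothetical helper name.

```python
import numpy as np

def dense_grid_centers(height, width, step=8, patch=16):
    """Return (row, col) centers of patches placed every `step` pixels.

    Only centers whose patch fits fully inside the image are kept.
    """
    half = patch // 2
    ys = np.arange(half, height - half + 1, step)
    xs = np.arange(half, width - half + 1, step)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([grid_y.ravel(), grid_x.ravel()], axis=1)
```

An interest-point detector would replace this function with one that returns only locations selected by the image content.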

Image representation: yesterday

gradient magnitude

gradient orientation

feature vector

Image representation: yesterday

gradient magnitude

gradient orientation

descriptor
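To make the link from gradients to a descriptor concrete, here is a minimal sketch that turns one grayscale patch into a magnitude-weighted orientation histogram. It is a simplified stand-in for descriptors such as HOG or SIFT, not the exact feature from the earlier lecture; the bin count and L2 normalization are assumptions.

```python
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Magnitude-weighted histogram of gradient orientations for one patch."""
    patch = np.asarray(patch, dtype=float)
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    # Orientation mapped into [0, 2*pi).
    orientation = np.arctan2(gy, gx) % (2 * np.pi)
    bins = np.minimum((orientation / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    # Each pixel votes for its orientation bin with its gradient magnitude.
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```

Computing this for every sampled patch gives the set of descriptors that the dictionary-learning step consumes.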

2. Learning “visual dictionary”

Compute descriptor

2. Learning “visual dictionary”

descriptors

Clustering

visual vocabulary
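The slides only say "clustering"; k-means is a common choice for this step, so the sketch below assumes it. It is a plain NumPy version of Lloyd's algorithm, with `learn_vocabulary`, the iteration count, and the random initialization all chosen for illustration.

```python
import numpy as np

def learn_vocabulary(descriptors, k, n_iters=20, seed=0):
    """Cluster descriptors with k-means; the k centers are the visual words."""
    rng = np.random.default_rng(seed)
    descriptors = np.asarray(descriptors, dtype=float)
    # Initialize centers with k distinct descriptors picked at random.
    init = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[init].copy()
    for _ in range(n_iters):
        # Assign every descriptor to its nearest center.
        dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        # Move each center to the mean of its members (skip empty clusters).
        for j in range(k):
            members = descriptors[assign == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers
```

In practice the descriptors pooled from all training images can be numerous, so clustering a random subset is a typical shortcut.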

Example visual vocabulary

Fei-Fei et al. 2005

Image patch examples

Sivic et al. 2005

How to choose the vocabulary size?

Bag-of-words: limitations

• What about the structure of the image?

=?

Image representation: spatial pyramids

level 0 level 1 level 2
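A minimal sketch of the pyramid, assuming level l splits the image into 2^l by 2^l cells and that each descriptor has already been assigned a visual-word index and an image position. The per-level weights used in some published formulations are left out, and `spatial_pyramid` is a hypothetical name.

```python
import numpy as np

def spatial_pyramid(words, positions, height, width, vocab_size, levels=3):
    """Concatenate per-cell visual-word histograms over pyramid levels.

    words:     (n,) visual-word index of each descriptor.
    positions: (n, 2) (row, col) location of each descriptor.
    """
    words = np.asarray(words, dtype=int)
    positions = np.asarray(positions, dtype=float)
    parts = []
    for level in range(levels):
        cells = 2 ** level  # level 0: 1x1, level 1: 2x2, level 2: 4x4
        rows = np.minimum((positions[:, 0] / height * cells).astype(int), cells - 1)
        cols = np.minimum((positions[:, 1] / width * cells).astype(int), cells - 1)
        cell_id = rows * cells + cols
        # One histogram bin per (cell, visual word) pair.
        flat = cell_id * vocab_size + words
        parts.append(np.bincount(flat, minlength=cells * cells * vocab_size))
    feature = np.concatenate(parts).astype(float)
    # Normalize by the number of descriptors so each level sums to 1.
    return feature / max(len(words), 1)
```

With three levels the feature has (1 + 4 + 16) × vocab_size entries; level 0 alone is the plain bag-of-words histogram.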

Tutorial
