6.S093 Visual Recognition through Machine Learning Competition

Image by kirkh.deviantart.com

Aditya Khosla

Today’s class

• Part 1: Competition details

• Part 2: Image representation lecture

– Bag-of-words

– Spatial pyramid

• Part 3: Feature extraction tutorial

Competition details: dataset

10 object categories

airplane, bicycle, car, cup/mug, dog(s), guitar, hamburger, person, sofa, traffic light

Competition details: dataset

Training set

8,000 images

Validation set

2,000 images

Testing set

5,000 images

labels provided NO labels provided

Leaderboard set

Competition details: submission

• For each image, you provide the probability that each class is present in it (as returned by your algorithm)

Competition details: evaluation

• Average precision
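The slide names the metric but not the exact variant. As a minimal sketch, assuming the common non-interpolated definition (the mean of the precision measured at the rank of each positive image), average precision for one class could be computed as below. The scores and labels are made up for illustration.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean of precision@k over the ranks k of the positives."""
    # Rank images from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / rank)  # precision among the top `rank` images
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical probabilities for one class on five images (1 = class present).
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
print(average_precision(scores, labels))  # positives at ranks 1 and 3: (1/1 + 2/3) / 2
```

Because only the ranking matters, any monotonic rescaling of the scores leaves the result unchanged.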

Competition details: prizes

+ cash

second third

+ cash

Competition details: thank you!

Image representation: bag-of-words

Document representation: bag-of-words

• Orderless document representation: frequencies of words from a dictionary (Salton & McGill, 1983)
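A small sketch of this idea for text, using only the Python standard library. The dictionary and the sentence are invented for illustration; the point is that shuffling the words leaves the vector unchanged.

```python
import re
from collections import Counter


def bag_of_words(text, dictionary):
    """Count how often each dictionary word occurs in `text`, ignoring word order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return [counts[word] for word in dictionary]


# Hypothetical dictionary and document.
dictionary = ["freedom", "nation", "people", "war"]
speech = "The people of this nation want freedom; the people want peace."
print(bag_of_words(speech, dictionary))  # [1, 1, 2, 0]
```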

US Presidential Speeches Tag Cloud

document

bag-of-words

image

bag-of-visual words

Object

Bag of ‘words’

Object

Ugly bag of ‘words’

Object

Stylish bag of ‘words’

visual dictionary

1. Extract descriptors

2. Learn “visual dictionary”

3. Quantize features using visual vocabulary

4. Represent images by frequencies of “visual words”
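Steps 3 and 4 above can be sketched as follows, assuming a dictionary has already been learned. The 2-D descriptors and the three-word dictionary are toy values chosen for illustration (real descriptors have many more dimensions), and hard nearest-neighbour assignment is one common choice rather than the only one.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(descriptor, dictionary):
    """Step 3: index of the nearest visual word (hard assignment)."""
    return min(range(len(dictionary)),
               key=lambda k: squared_distance(descriptor, dictionary[k]))


def bovw_histogram(descriptors, dictionary):
    """Step 4: normalized frequency of each visual word in one image."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        counts[quantize(d, dictionary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(dictionary)


# Toy example: a 3-word dictionary and 4 descriptors from one image.
dictionary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]
print(bovw_histogram(descriptors, dictionary))  # [0.25, 0.5, 0.25]
```

Normalizing by the number of descriptors makes histograms comparable between images that yield different numbers of descriptors.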

1. Extracting descriptors

regular grid interest points
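The regular-grid option can be sketched as below: patch centres are placed every `step` pixels and a square patch is cut around each one. The image is a plain list of rows, and the `step` and `patch_size` defaults are illustrative assumptions, not values from the lecture. The interest-point option would replace `grid_keypoints` with a detector and is not shown.

```python
def grid_keypoints(height, width, step=8, patch_size=16):
    """Centres (row, col) of patches on a regular grid that fit fully inside the image."""
    half = patch_size // 2
    return [(r, c)
            for r in range(half, height - half + 1, step)
            for c in range(half, width - half + 1, step)]


def extract_patch(image, center, patch_size=16):
    """Cut a patch_size x patch_size patch around `center` from a list-of-rows image."""
    half = patch_size // 2
    r, c = center
    return [row[c - half:c + half] for row in image[r - half:r + half]]


# Hypothetical 32x32 grayscale image whose pixel value encodes its position.
image = [[r * 32 + c for c in range(32)] for r in range(32)]
keypoints = grid_keypoints(32, 32)
patches = [extract_patch(image, kp) for kp in keypoints]
print(len(keypoints), len(patches[0]), len(patches[0][0]))  # 9 16 16
```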

Image representation: yesterday

gradient magnitude

gradient orientation

feature vector

Image representation: yesterday

gradient magnitude

gradient orientation

descriptor
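One way to turn gradient magnitude and orientation into a descriptor is a magnitude-weighted histogram of orientations over a patch. The sketch below is a simplified, single-cell version of that idea; the number of bins, the central-difference gradient, and the L2 normalization are assumptions and not necessarily the descriptor used in the previous lecture.

```python
import math


def orientation_histogram(patch, n_bins=8):
    """Histogram of gradient orientations over a patch, weighted by gradient magnitude."""
    rows, cols = len(patch), len(patch[0])
    hist = [0.0] * n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = patch[r][c + 1] - patch[r][c - 1]  # horizontal central difference
            gy = patch[r + 1][c] - patch[r - 1][c]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            orientation = math.atan2(gy, gx) % (2 * math.pi)  # angle in [0, 2*pi)
            bin_index = int(orientation / (2 * math.pi) * n_bins) % n_bins
            hist[bin_index] += magnitude
    # L2-normalize so the descriptor is less sensitive to contrast changes.
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist


# Toy patch whose intensity increases from left to right.
ramp = [[c for c in range(8)] for _ in range(8)]
print(orientation_histogram(ramp))  # all weight falls in the bin for orientation 0
```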

2. Learning “visual dictionary”

Compute descriptor

2. Learning “visual dictionary”

descriptors

Clustering

visual vocabulary
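The slide only says "clustering"; k-means is a common choice for this step, so the sketch below assumes it. The cluster centres play the role of the visual words. The points are toy 2-D values, and `n_iters` and `seed` are illustrative parameters.

```python
import random


def kmeans(points, k, n_iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm); returns k cluster centres."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # initialize from the data
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                centers[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers


# Toy descriptors forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
vocabulary = kmeans(points, k=2)
print(sorted(vocabulary))
```

In practice the descriptors pooled from many training images are clustered once, and the resulting centres are reused to quantize every image.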

Example visual vocabulary

Fei-Fei et al. 2005

Image patch examples

Sivic et al. 2005

How to choose the vocabulary size?

Bag-of-words: limitations

• What about the structure of the image?

Image representation: spatial pyramids

level 0, level 1, level 2
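A minimal sketch of the pyramid, assuming each descriptor has already been quantized to a visual-word index and has a known (x, y) position. Level l splits the image into 2^l x 2^l cells, and the per-cell histograms are concatenated. The per-level weighting used in spatial pyramid matching is omitted here for brevity, and all names and values are illustrative.

```python
def spatial_pyramid(positions, words, width, height, vocab_size, levels=3):
    """Concatenate visual-word histograms over 1x1, 2x2, 4x4, ... grids of cells."""
    feature = []
    for level in range(levels):
        cells = 2 ** level  # cells per side at this level
        hists = [[0] * vocab_size for _ in range(cells * cells)]
        for (x, y), w in zip(positions, words):
            col = min(int(x * cells / width), cells - 1)
            row = min(int(y * cells / height), cells - 1)
            hists[row * cells + col][w] += 1
        for h in hists:
            feature.extend(h)
    return feature


# Toy image of size 100x100 with three quantized descriptors and a 2-word vocabulary.
positions = [(10, 10), (90, 10), (90, 90)]
words = [0, 1, 1]
print(spatial_pyramid(positions, words, 100, 100, vocab_size=2, levels=2))
# [1, 2, 1, 0, 0, 1, 0, 0, 0, 1]
```

The first `vocab_size` entries are the ordinary bag-of-words counts (level 0); the remaining entries record where in the image each word occurred, which is the structure the plain bag-of-words discards.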

Tutorial
