COMPUTATION MODEL FOR VISUAL CATEGORIZATION

Bhuwan Dhingra

OVERVIEW

Objective: To study the hierarchy of object categorization using a computational model for vision.

Three levels of categorization – super-ordinate, basic and subordinate.

Basic-level categories maximize cue validity and dominate any taxonomy.

In the current model, categorization is implemented in an unsupervised manner.

HYPOTHESES

Rosch et al. [1] claim that basic-level categories are accessed first.

Macé et al. [2] claim that in a purely visual task super-ordinate categories are accessed first.

The role of expertise is emphasized repeatedly in the literature [3].

THE MODEL

Bag-of-Features:
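As a rough illustration of the bag-of-features idea, the sketch below reduces one image to a histogram of "visual word" counts. It assumes local descriptors have already been extracted at the detected key-points and that a codebook of visual words is available; the function name `bof_histogram`, the Euclidean nearest-codeword rule and the toy values are assumptions made for illustration, not details taken from the slides.

```python
from math import dist

def bof_histogram(descriptors, codebook):
    """Count how many descriptors fall nearest to each codeword, then normalize."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the closest codeword under Euclidean distance (assumed rule).
        nearest = min(range(len(codebook)), key=lambda j: dist(d, codebook[j]))
        counts[nearest] += 1
    total = sum(counts)
    # Normalize so images with different key-point counts are comparable.
    return [c / total for c in counts] if total else counts

# Toy 2-D descriptors and a 3-word codebook (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
hist = bof_histogram(descriptors, codebook)
```

Real descriptors would be much higher-dimensional, but the counting step is the same.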

The extracted histograms are clustered in an unsupervised manner using the k-means algorithm.

Distance metric used: 1 - correlation(h1, h2), where h1 and h2 are two histograms.
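A minimal sketch of k-means under the 1 - correlation distance is given below. It assumes one common way of handling this metric: each histogram is centered and scaled to unit length, and a centroid is the mean of the standardized members of its cluster. The function names, the random initialization and the iteration cap are illustrative assumptions, not the implementation behind the reported results.

```python
import random
from math import sqrt

def standardize(h):
    """Center a histogram and scale it to unit length."""
    mean = sum(h) / len(h)
    centered = [x - mean for x in h]
    norm = sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered] if norm > 0 else centered

def corr_distance(h1, h2):
    """1 - Pearson correlation between two histograms."""
    a, b = standardize(h1), standardize(h2)
    return 1.0 - sum(x * y for x, y in zip(a, b))

def kmeans_correlation(hists, k, iters=50, seed=0):
    """Cluster histograms into k groups; returns one cluster label per histogram."""
    rng = random.Random(seed)
    centroids = rng.sample(hists, k)  # assumed initialization: k random histograms
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid under the correlation distance.
        new_labels = [
            min(range(k), key=lambda j: corr_distance(h, centroids[j]))
            for h in hists
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: mean of the standardized members of each cluster.
        for j in range(k):
            members = [standardize(h) for h, l in zip(hists, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy histograms: the first two peak in bin 0, the last two in bin 2.
toy = [[5, 1, 0], [4, 2, 0], [0, 1, 5], [0, 2, 4]]
toy_labels = kmeans_correlation(toy, k=2)
```

On this toy input the two left-peaked histograms end up in one cluster and the two right-peaked histograms in the other.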

DATASET

30 images were collected for each subordinate category using a Google image search on the category keywords.

Super-ordinate classes | Basic classes | Sub-ordinate classes
Furniture              | Table         | Coffee Table, Picnic Table
Furniture              | Chair         | Rocking Chair, Bar-stool
Animal                 | Bird          | Crow, Pigeon
Animal                 | Dog           | Foxhound, Dalmatian

TESTS

Test 1: Study which type of categorization dominates as the number of detected key-points is varied.

Test 2: Study how the performance of the categorization changes with the number of images.

Test 3: Study the effect of increasing the number of images of one basic category relative to the others.

The different levels of categorization were implemented by setting k = 2, 4 and 8, matching the 2 super-ordinate, 4 basic and 8 subordinate classes.
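To score a clustering at each level, ground-truth labels are needed for the matching value of k. The sketch below encodes the dataset hierarchy as a lookup table and reads k = 2, 4, 8 as the super-ordinate, basic and subordinate levels respectively. That mapping, the lowercase names and the helper `labels_for_k` are illustrative assumptions.

```python
# Sub-ordinate category -> (basic category, super-ordinate category).
HIERARCHY = {
    "coffee table": ("table", "furniture"),
    "picnic table": ("table", "furniture"),
    "rocking chair": ("chair", "furniture"),
    "bar-stool": ("chair", "furniture"),
    "crow": ("bird", "animal"),
    "pigeon": ("bird", "animal"),
    "foxhound": ("dog", "animal"),
    "dalmatian": ("dog", "animal"),
}

def labels_for_k(subordinate_names, k):
    """Ground-truth labels at the level assumed to correspond to k clusters."""
    if k == 8:
        return list(subordinate_names)   # subordinate level
    level = {4: 0, 2: 1}[k]              # 4 -> basic, 2 -> super-ordinate
    return [HIERARCHY[name][level] for name in subordinate_names]

images = ["crow", "dalmatian", "bar-stool"]
basic = labels_for_k(images, 4)
superordinate = labels_for_k(images, 2)
```

The same list of images can then be scored against the k = 2, k = 4 and k = 8 clusterings without relabelling by hand.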

PERFORMANCE INDICES

Rand Index:

TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, counted over pairs of points.
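Assuming the standard, unadjusted Rand index is meant, the score is RI = (TP + TN) / (TP + TN + FP + FN), where a pair of images counts as a "positive" when both fall in the same cluster and as "true" when the clustering agrees with the class labels. The sketch below counts the four cases directly; whether the reported values use this form or an adjusted variant is not stated in the slides, so treat the choice as an assumption.

```python
from itertools import combinations

def rand_index(classes, clusters):
    """Unadjusted Rand index over all pairs of points (needs at least 2 points)."""
    tp = tn = fp = fn = 0
    for i, j in combinations(range(len(classes)), 2):
        same_class = classes[i] == classes[j]
        same_cluster = clusters[i] == clusters[j]
        if same_cluster and same_class:
            tp += 1      # correctly grouped together
        elif same_cluster:
            fp += 1      # grouped together but different classes
        elif same_class:
            fn += 1      # same class but split apart
        else:
            tn += 1      # correctly kept apart
    return (tp + tn) / (tp + tn + fp + fn)

# Four images, with one bird placed in the dog cluster.
ri = rand_index(["dog", "dog", "bird", "bird"], [0, 0, 1, 0])
```

Here the six pairs split into TP = 1, TN = 2, FP = 2 and FN = 1.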

Purity: Percentage of correctly assigned points, assuming majority class for each cluster.

Normalized Mutual Information: Information theoretic mutual information between clusters and classes (normalized to 1).

Silhouette Index: Based on the ratio of the within-class scatter to the between-class scatter.
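The three indices above can be sketched in a few lines each. The code assumes the usual textbook forms. Purity is the fraction of points in their cluster's majority class. NMI is the mutual information divided by the mean of the class and cluster entropies. The silhouette of a point is (b - a) / max(a, b), where a is its mean distance to its own cluster and b is its mean distance to the nearest other cluster. The NMI normalization and the conventions for degenerate cases are assumptions; other variants exist.

```python
from collections import Counter
from math import dist, log

def purity(classes, clusters):
    """Fraction of points that belong to the majority class of their cluster."""
    by_cluster = {}
    for c, k in zip(classes, clusters):
        by_cluster.setdefault(k, Counter())[c] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(classes)

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def nmi(classes, clusters):
    """Mutual information normalized by the mean entropy (assumed normalization)."""
    n = len(classes)
    joint = Counter(zip(classes, clusters))
    pc, pk = Counter(classes), Counter(clusters)
    mi = sum((nij / n) * log(n * nij / (pc[c] * pk[k]))
             for (c, k), nij in joint.items())
    denom = (entropy(classes) + entropy(clusters)) / 2
    return mi / denom if denom > 0 else 1.0

def silhouette(points, clusters, distance):
    """Mean silhouette value over all points."""
    scores = []
    for i, p in enumerate(points):
        per_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                per_cluster.setdefault(clusters[j], []).append(distance(p, q))
        own = per_cluster.pop(clusters[i], None)
        if not own or not per_cluster:
            scores.append(0.0)  # convention for singletons or a single cluster
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in per_cluster.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

classes = ["a", "a", "b", "b"]
p = purity(classes, [0, 0, 0, 1])
s = silhouette([(0,), (1,), (10,), (11,)], [0, 0, 1, 1], dist)
```

Passing the distance as a function lets the same silhouette code run with the 1 - correlation metric used for clustering instead of the Euclidean distance used in this toy call.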

RESULTS

Variation of the performance metrics with the peak threshold, or the number of key-points detected.

[Figure, four panels: Purity vs Peak Threshold; Silhouette Index vs Peak Threshold; Rand Index vs Peak Threshold; NMI vs Peak Threshold. x-axis: Peak Threshold. Legend: Super-ordinate, Basic, Sub-ordinate.]


RESULTS

Variation of the performance metrics with the number of images.

[Figure, four panels: NMI vs Number of Images; Purity vs Number of Images; Rand Index vs Number of Images; Silhouette Index vs Number of Images. x-axis: Images per sub-ordinate category (10 to 30).]


RESULTS

Effect of expertise

Two subordinate categories and one basic-level category are taken together, e.g. {{dalmatian, foxhound}, bird}.

Trial 1: The subordinate categories have half as many training samples as the basic category.

Trial 2: The subordinate categories have as many training samples as the basic category.

[Figure: Effect of Expertise. Rand Index vs Number of images in Basic Category (30 and 60). Legend: dog, bird, chair, table.]

SOME PROBLEMS

Images with a white background are sometimes clustered separately from images with a cluttered background. Solution: foreground extraction.

High variability in Normalized Mutual Information (NMI).

The effect of expertise is not clear. Solution: test with an exponential increase in the number of images.

REFERENCES

[1] Rosch, E., Mervis, C., Gray, W., Johnson, D., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology.

[2] Macé, M.J.-M., Joubert, O.R., Nespoulous, J.-L., & Fabre-Thorpe, M. (2009). The time-course of visual categorizations: you spot the animal faster than the bird. PLoS ONE.

[3] Johnson, K.E., & Mervis, C.B. (1997). Effects of varying levels of expertise on the basic level of categorization. Journal of Experimental Psychology: General.
