IIIT Hyderabad
Efficient Image Retrieval Methods For Large Scale Dynamic Image Databases
Suman Karthik
200407013
Advisor: Dr. C.V.Jawahar
Images
• Cheap imaging hardware
• Plummeting storage costs
• User-generated content
Image Databases
• Large scale – millions to billions of images
• Dynamic – highly dynamic in nature
[Chart: number of images on Flickr, December 2005 to November 2007, in millions]
CBIR
• Content-based IR – uses image content
• Pros – good quality; annotation agnostic
• Cons – inefficient; not scalable
[Figure: shape, color, and texture features]
Bag of Words
[Plate diagram: pLSA model with words w, topics z, documents d (N, D) – Hofmann, 2001]
Pipeline: Feature Extraction → Vector Quantization → Semantic Indexing → Index
• Compute SIFT descriptors [Lowe '99]
[Diagram: inverted index mapping a word W to documents D1, D2, D3]
* J. Sivic & A. Zisserman, 2003; D. Nistér & H. Stewénius, 2006; J. Philbin, J. Sivic, A. Zisserman et al., 2008
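The inverted index sketched above maps each visual word to the documents that contain it; a minimal illustration in Python (the identifiers are ours, not from the thesis):

```python
from collections import defaultdict

# Minimal inverted index over bag-of-visual-words documents.
# Each document is a list of quantized visual-word ids.
index = defaultdict(set)

def insert(doc_id, words):
    """Register every visual word of a document in the index."""
    for w in words:
        index[w].add(doc_id)

def query(words):
    """Return ids of documents sharing at least one visual word."""
    result = set()
    for w in words:
        result |= index[w]
    return result

insert("D1", [3, 7, 7, 42])
insert("D2", [7, 9])
insert("D3", [1, 42])
print(sorted(query([7, 42])))  # ['D1', 'D2', 'D3']
```

Insertion touches only the posting lists of the new document's words, which is what makes the index attractive for dynamic databases.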
Dynamic Databases
• Large scale
• New images added continuously
• High rate of change
• Nature of data not known a priori
[Diagram: images and videos arriving from the Internet]
Text vs. Images: Dynamic Databases
Text:
• Vocabulary known
• Rate of change of vocabulary low
• Stable vocabulary
Images:
• Vocabulary unknown
• Rate of change of vocabulary high
• Unstable vocabulary
Quantization and Semantic Indexing in Dynamic Databases
• As DB changes vocabulary is outmoded
• Updating vocabulary is too costly
• Not incremental
• Cannot keep up with rate of change
• As DB changes semantic index is invalid
• Updating semantic index is resource intensive
• Not incremental
• Cannot keep up with rate of change or scale
Dynamic Databases
[Diagram: Internet images and videos → Dynamic Database → Feature Extraction → Vector Quantization → Semantic Indexing → Index]
Quantization and semantic indexing methods are a bottleneck
Objective 1
A. Motivation: CBIR is inefficient and not scalable
B. Objective: Develop methods to improve the efficiency and scalability of CBIR
C. Contributions:
   C 1.1 – Virtual Textual Representation
   C 1.2 – A new efficient indexing structure
   C 1.3 – Relevance feedback methods that improve performance
Objective 2
A. Motivation: Quantization is a bottleneck for BoW when dealing with dynamic image databases
B. Objective: Develop an incremental quantization method for the BoW model to successfully deal with dynamic image databases
C. Contributions:
   C 2.1 – Incremental Vector Quantization
   C 2.2 – Comparison of retrieval performance with existing methods
   C 2.3 – Comparison of incremental quantization with existing methods
Objective 3
A. Motivation: Semantic indexing is not scalable for BoW when dealing with dynamic image databases
B. Objective: Develop an incremental semantic indexing method for the BoW model to successfully deal with dynamic image databases
C. Contributions:
   C 3.1 – Bipartite Graph Model
   C 3.2 – An algorithm for semantic indexing on the BGM
   C 3.3 – Search engines for images
CBIR
Literature
• Global image retrieval
• Region-based image retrieval
• Region-based relevance feedback
Costly nearest-neighbor-based retrieval
Spatial indexing
Relevance feedback heavily used
* Image Retrieval: Past, Present, and Future, Yong Rui, Thomas S. Huang, Shih-Fu Chang, International Symposium on Multimedia Information Processing, 1997
* Blobworld: A System for Region-Based Image Indexing and Retrieval, Chad Carson, Megan Thomas, Serge Belongie, Joseph M. Hellerstein, Jitendra Malik, Third International Conference on Visual Information Systems, 1999
* Region-Based Relevance Feedback in Image Retrieval, Feng Jing, Mingjing Li, Hong-Jiang Zhang, Bo Zhang, Proc. IEEE International Symposium on Circuits and Systems, 2002
Search
Transformation
[Figure: feature space quantized into bins represented by strings or words; axes: color, compactness, position]
Virtual Textual Representation
• Quantization
  – Uniform quantization (grid)
  – Density-based quantization (k-means)
• Each cell is a string
[Diagram: Image → Segmentation → Segments → Words, analogous to Document → Text → Words]
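Mapping a region's feature vector to a word string via uniform grid quantization can be sketched as follows (the cell size and string format are illustrative assumptions, not the thesis's exact scheme):

```python
def feature_to_word(features, cell=0.25):
    """Map a normalized feature vector (values in [0, 1)) to a string
    by uniform grid quantization: each dimension's cell index becomes
    one token of the resulting 'word'."""
    return "-".join(str(int(f // cell)) for f in features)

# A region described by (color, compactness, position), all in [0, 1)
print(feature_to_word([0.1, 0.6, 0.9]))  # 0-2-3
```

Regions falling in the same grid cell share the same string, so standard text indexing machinery applies directly.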
CBIR Indexing
• Spatial databases
• Relevance feedback skews the feature space, rendering spatial databases inefficient*.
* Indexing for Relevance Feedback Image Retrieval, Jing Peng, Douglas R. Heisterkamp, Proceedings of the IEEE International Conference on Image Processing (ICIP '03)
Elastic Bucket Trie
[Diagram: insert – strings such as CAB and CBA fill a bucket until an overflow triggers a split into child nodes; query – BBC follows trie edges to its retrieved bucket]
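A simplified bucket trie illustrating the overflow-and-split behavior sketched above (the bucket capacity and per-symbol branching are our assumptions, not the exact EBT structure):

```python
class EBTNode:
    """Bucket-trie node: either a leaf holding a bucket of (key, value)
    pairs, or an internal node with children keyed by one symbol."""
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.bucket = []       # leaf storage
        self.children = {}     # symbol -> EBTNode
        self.leaf = True

    def insert(self, key, value, depth=0):
        if self.leaf:
            self.bucket.append((key, value))
            if len(self.bucket) > self.capacity:   # overflow: split
                self.leaf = False
                items, self.bucket = self.bucket, []
                for k, v in items:
                    self._child(k, depth).insert(k, v, depth + 1)
        else:
            self._child(key, depth).insert(key, value, depth + 1)

    def _child(self, key, depth):
        sym = key[depth] if depth < len(key) else ""
        if sym not in self.children:
            self.children[sym] = EBTNode(self.capacity)
        return self.children[sym]

    def lookup(self, key, depth=0):
        """Return the bucket reached by following the key's symbols."""
        if self.leaf:
            return self.bucket
        sym = key[depth] if depth < len(key) else ""
        child = self.children.get(sym)
        return child.lookup(key, depth + 1) if child else []

trie = EBTNode()
for word in ["CAB", "CBA", "BBC", "BBA"]:
    trie.insert(word, word)
print([v for _, v in trie.lookup("BBC")])  # ['BBC', 'BBA']
```

Splitting only the overflowing bucket keeps inserts local, which is what makes the structure suitable for incrementally growing word collections.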
Relevance Feedback
[Diagram: query → retrieved results → relevance feedback loop]
Region-Importance-Based Relevance Feedback
[Diagram: relevant images → extracted words (KEYWORDS) → keyword selection → pseudo-image for the next iteration; errors in retrieval]
Discriminative Relevance Feedback
• Classification is given precedence over clustering.
• Discriminative segments become the keywords.
• Non-discriminative segments are ignored.
[Figure: SURFERS vs. WAVES, ROSES vs. FLOWERS]
Discriminative Relevance Feedback
[Diagram: relevant and irrelevant images → extracted words (KEYWORDS) → keyword selection → pseudo-image for the next iteration; no errors in retrieval]
Performance
Discriminative relevance feedback consistently outperforms the region-importance-based method.
[Charts: high F-score for discriminative relevance feedback, low F-score for region importance]
[Taxonomy: global vs. local image retrieval × spatial vs. non-spatial indexing × global/no relevance feedback vs. region-based relevance feedback – early CBIR, Blobworld (no indexing), SIMPLIcity (no indexing), and our work]
Analysis
• Relevance feedback algorithms need to be modified to work with text.
• Keywords emerge with relevance feedback signifying association between key segments.
• EBT can be used without any modifications with discriminative relevance feedback.
• Advent of Bag of Words model for image retrieval
Quantization
Literature
• K-means
• Hierarchical k-means
• K-means with soft assignment
Time-consuming offline quantization
Representative data available a priori
Quantization is not incremental
* Video Google: A Text Retrieval Approach to Object Matching in Videos, Josef Sivic, Andrew Zisserman, ICCV 2003
* Scalable Recognition with a Vocabulary Tree, D. Nistér and H. Stewénius, CVPR 2006
* Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases, James Philbin, Ondrej Chum, Michael Isard, Josef Sivic, Andrew Zisserman, CVPR 2008
Quantization Losses
• Perceptual loss – under-quantization, synonymy, poor precision
• Binning loss – over-quantization, polysemy, poor recall
Incremental Vector Quantization
• Control perceptual loss
• Minimize binning loss
• Create quality code books
• Data dependent
• Incremental in nature
Algorithm
[Figure: a quantization cell of radius r; L = 2]
• L: minimum cardinality of a cell
• Puts an upper bound on perceptual loss
• Builds quality codebooks by ignoring noise
• Soft bin assignment minimizes binning loss
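The incremental flavor of IVQ can be illustrated as follows: a feature joins an existing cell if it lies within radius r (bounding perceptual loss), otherwise it seeds a new cell, and cells with fewer than L members are dropped as noise. This is a hedged reconstruction from the slide's bullets, not the thesis's exact algorithm:

```python
import math

class IVQ:
    """Illustrative incremental vector quantizer: radius r bounds
    perceptual loss; cells with fewer than L members are treated as
    noise when the codebook is read out."""
    def __init__(self, r=1.0, L=2):
        self.r, self.L = r, L
        self.cells = []  # list of [centroid, members]

    def add(self, point):
        for cell in self.cells:
            c = cell[0]
            if math.dist(c, point) <= self.r:
                cell[1].append(point)
                n = len(cell[1])
                # incremental centroid update: c += (p - c) / n
                cell[0] = [ci + (pi - ci) / n for ci, pi in zip(c, point)]
                return
        self.cells.append([list(point), [point]])  # seed a new cell

    def codebook(self):
        return [c for c, members in self.cells if len(members) >= self.L]

q = IVQ(r=1.0, L=2)
for p in [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0)]:
    q.add(p)
print(len(q.codebook()))  # 1: the isolated point (5, 5) is noise
```

Each insertion touches at most one cell, which is why the per-update cost stays constant as the database grows.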
An Experiment
• Given – all possible feature points in a feature space that could be generated by natural processes
• Quantize
  – k-means with a priori knowledge of the entire data
  – IVQ with no a priori information
• Performance
  – F-score
  – Time taken for incremental quantization
F-score
[Chart: F-score (0–0.8) vs. x-axis values 200–1000 for k-means, k-means soft, and IVQ; IVQ: 1115 bins, k-means: 1000 bins]
IVQ outperforms k-means
Time
IVQ outperforms k-means
• IVQ quantizes in 0.1 seconds; its time complexity is linear
• K-means takes 1000 seconds; its time complexity is exponential
Holiday Dataset
• Dataset: Holiday dataset – 1491 images, 500 categories
• Pre-processing
  – SIFT feature extraction
  – quantization using k-means
  – quantization using IVQ
Incremental Quantization
• Dataset: ALOI dataset – 100,000 images, added sequentially in 1000 batches of 100 images each
• Pre-processing
  – SIFT feature extraction
  – quantization using k-means / online k-means
  – quantization using IVQ
[Table: time per batch – S = seconds, D = days]
Analysis
• IVQ produces more bins than k-means (constant perceptual loss)
• IVQ is efficient because updates are local
• LSH is used to accelerate IVQ
• Semantic indexing can improve mAP
[Taxonomy: offline vs. online quantization × density-based vs. non-density-based × incremental vs. non-incremental – k-means, online k-means, regular lattice, adaptive vocabulary tree (global), IVQ (local)]
Semantic Indexing
Semantic Indexing
[Diagram: P(w|d) factored through latent topics – LSI, pLSA, and LDA cluster words (whippet, GSD, doberman; daffodil, tulip, rose) around latent topics (Animal, Flower)]
• Words clustered around latent topics
• Visual words clustered around latent topics
* Hofmann 1999; Blei, Ng & Jordan 2003; R. Lienhart and M. Slaney 2007
Literature
• Visual pLSA
• Visual LDA
• Spatial semantic indexing
High space complexity due to large matrix operations
Slow, resource-intensive offline processing
* Discovering Objects and Their Location in Images, Josef Sivic, Bryan Russell, Alexei A. Efros, Andrew Zisserman, Bill Freeman, ICCV 2005
* Image Retrieval on Large-Scale Image Databases, Eva Hörster, Rainer Lienhart, Malcolm Slaney, CIVR 2007
* Spatial Latent Dirichlet Allocation, X. Wang and E. Grimson, Proceedings of Neural Information Processing Systems Conference (NIPS) 2007
Bipartite Graph Model
• The vector space model is encoded as a bipartite graph of words and documents.
• TF values are retained as edge weights.
• IDF values are retained as term weights.
[Diagram: Cash Flow algorithm – words (subprime, reforms, war, Iraq, elections, democrats) with IDF term weights linked by TF-weighted edges to documents (Saddam Captured, Iraq Pullout, Obama Elected, Bush Popularity, Financial Crisis)]
BGM with BoW
• Feature extraction – local detectors, SIFT
• Vector quantization – k-means
• BGM insertion – words, documents, TF, IDF
Why is the BGM Superior?
[Diagram: a query image with words w1, w2; Cash Flow result vs. inverted index result over documents containing words w1–w5]
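The Cash Flow ranking can be sketched as follows: cash starts on the query's words, flows to documents along TF-weighted edges, and flows back to co-occurring words, so a document sharing no word with the query can still accumulate cash. The damping factor and round count below are illustrative assumptions, not the thesis's exact parameters:

```python
from collections import defaultdict

def cash_flow(edges, query_words, rounds=2, damping=0.5):
    """Rank documents on a word-document bipartite graph.
    edges: {(word, doc): tf_weight}. Cash starts on the query words
    and flows word -> doc -> word, attenuated by `damping` per hop."""
    w2d, d2w = defaultdict(dict), defaultdict(dict)
    for (w, d), tf in edges.items():
        w2d[w][d] = tf
        d2w[d][w] = tf
    word_cash = {w: 1.0 for w in query_words}
    doc_cash = defaultdict(float)
    for _ in range(rounds):
        next_words = defaultdict(float)
        for w, cash in word_cash.items():
            total = sum(w2d[w].values()) or 1.0
            for d, tf in w2d[w].items():
                share = cash * tf / total      # cash flowing into doc d
                doc_cash[d] += share
                dtotal = sum(d2w[d].values())
                for w2, tf2 in d2w[d].items():  # flow back to words
                    next_words[w2] += damping * share * tf2 / dtotal
        word_cash = next_words
    return sorted(doc_cash, key=doc_cash.get, reverse=True)

edges = {("w1", "d1"): 2, ("w1", "d2"): 1, ("w2", "d2"): 3, ("w3", "d3"): 1}
print(cash_flow(edges, ["w1"]))  # ['d1', 'd2']
```

Note that d2 also collects cash via the co-occurring word w2, which is exactly the semantic association an inverted index alone cannot exploit.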
Naïve vs. BGM
• Dataset: 9000 images from Flickr – 9 sports categories, 5 animal categories
• Pre-processing
  – SIFT feature extraction
  – quantization using k-means
• F-score: 2*(p*r)/(p+r)
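The F-score above is the harmonic mean of precision and recall:

```python
def f_score(p, r):
    """F1 score: 2*(p*r)/(p+r), the harmonic mean of precision p
    and recall r; defined as 0 when both are 0."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

print(round(f_score(0.5, 1.0), 3))  # 0.667
```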
BGM vs. pLSA, IpLSA
• pLSA – cannot scale to large databases; cannot update incrementally; latent topic initialization is difficult; high space complexity
• IpLSA – cannot scale to large databases; cannot add new latent topics; latent topic initialization is difficult; high space complexity
• BGM + Cash Flow – efficient; low space complexity

Number of concepts known:
      | mAP   | Time  | Space
pLSA  | 0.553 | 5062s | 3267MB
IpLSA | 0.567 | 56s   | 3356MB
BGM   | 0.594 | 42s   | 57MB

Number of concepts unknown:
      | mAP   | Time  | Space
pLSA  | 0.649 | 5144s | 3267MB
IpLSA | 0.612 | 63s   | 3356MB
BGM   | 0.594 | 42s   | 57MB

• Dataset: Holiday dataset – 1491 images, 500 categories
• Pre-processing: SIFT feature extraction; quantization using k-means
Near-Duplicate Retrieval
• Dataset: 500,000 movie frames – SIFT vectors, k-means quantization
• Indexed using the text search library Ferret – efficient indexing and retrieval; effectively scalable to large data
• A query frame is given as a query to the Ferret index
• Cash is propagated to every node until a cut-off
Sample Retrieval
[Figure: query and retrieved frames from Fastest Indian, Fight Club, and Harry Potter]
Analysis
• Low index insert time for new images – less than 200 seconds to insert 1000 images into a million-image index
• Marginally higher retrieval time – due to multiple levels of graph traversal
• Memory usage is minimal
• Works without knowing the number of concepts a priori
• BGM is a hybrid model – generative and discriminative
[Taxonomy: offline vs. online semantic indexing × generative vs. discriminative × incremental vs. non-incremental – pLSA, LDA, IpLSA, BGM TF, BGM IDF, BGM (generative + discriminative)]
Conclusion
• Efficient methods for retrieval in large scale dynamic image databases
• Scalability and adaptability have been addressed
• A step closer to real world image retrieval
• Features and their mixture, a long way to go
Future Work
• Quality and quantity of features
• Automatic feature modeling
• Text search engines for image search
• GPU based quantization methods
• Multiple vocabularies for image retrieval
• Multimodal semantic indexing with BGM
List of Publications
• Suman Karthik, C.V. Jawahar, "Incremental On-line Semantic Indexing for Image Retrieval in Dynamic Databases", 4th International Workshop on Semantic Learning and Applications, CVPR, 2008, Florida.
• Suman Karthik, C.V. Jawahar, "Analysis of Relevance Feedback in Content Based Image Retrieval", Proceedings of the 9th International Conference on Control, Automation, Robotics and Vision (ICARCV), 2006, Singapore.
• Suman Karthik, C.V. Jawahar, "Virtual Textual Representation for Efficient Image Retrieval", Proceedings of the 3rd International Conference on Visual Information Engineering (VIE), 26-28 September 2006, Bangalore, India.
• Suman Karthik, C.V. Jawahar, "Efficient Region Based Indexing and Retrieval for Images with Elastic Bucket Tries", Proceedings of the International Conference on Pattern Recognition (ICPR), 2006.
The End
Intuitive Way of Learning Content
Over-segmentation and subsequent deduction of content through relevance feedback.
[Diagram: Image → Segmentation → Segments → Words, analogous to Document → Text → Words]
Discriminative relevance feedback leverages this advantage to achieve better performance than standard techniques.
K-means
• Pros – simple; efficient
• Cons – computationally expensive; requires a representative training set; sensitive to the parameter K
A Naïve Quantization Scheme
[Figure: uniform quantization of a feature space with features F1, F2, F3]
Advantages: high speed, no quantization overhead; as dataset size grows, precision increases
Disadvantages: not data dependent, no notion of visual concept; information loss due to hard assignment
* Suman Karthik, C.V. Jawahar, Virtual Textual Representation for Efficient Image Retrieval, VIE 2006
* T. Tuytelaars and C. Schmid, Vector Quantizing Feature Space with a Regular Lattice, ICCV 2007
C 2.1 Methodology
• Data
  – 1000 random feature vectors generated from each of 1000 normal distributions in a 2-D feature space: a total of 1 million feature points.
  – 100,000 virtual images falling into 100 categories, where each category image is generated by drawing random numbers from 10 of the above normal distributions.
• Algorithms
  – k-means (quantized with the entire data and ideal K = 1000)
  – IVQ
  – k-means with soft assignment
• Measures
  – F-score for retrieval performance
  – Time estimates for incremental quantization
Performance
Performance
Image Retrieval
• Contemporary approach – uses textual cues
• Pros – simple; efficient
• Cons – images are subjective; text cues are unscalable; quality suffers
[Figure: a rose image tagged rose, petals, red, green, bud, gift, love, flower]
Losses
[Figure: quantizations with high perceptual loss, high binning loss, and optimal quantization]
Image Retrieval as Text Retrieval
Can an image be indexed, queried for, and retrieved as a text document?
[Figure: an image transformed into a text document]
Relevance Feedback
• Statistical – delta mean algorithm; query point movement; inverse variance; membership criterion
• Kernel based – Parzen windows; SVM; kernel BDA
• Entropy based – KL divergence
[Charts: performance (0–140) on datasets D1–D4 for Inverse Sigma, Delta Mean, MC, QPM, KL Divergence, Parzen, KBDA, SVM]
Semantic Indexing for Images
• Objects and their location in images – J. Sivic, B.C. Russell, A.A. Efros, A. Zisserman, W. Freeman
• Large-scale image databases – R. Lienhart, M. Slaney
• Web image selection – Keiji Yanai
• Spatial Latent Dirichlet Allocation – Xiaogang Wang, Eric Grimson
• Image auto-annotation – Florent Monay, Daniel Gatica-Perez
High space complexity due to large matrix operations.
Slow, resource-intensive offline processing.