Indexing Techniques
Mei-Chen Yeh
Last week
• Matching two sets of features
– Strategy 1: Convert to a fixed-length feature vector (bag-of-words), then use a conventional proximity measure
– Strategy 2: Build point correspondences
Last week: bag-of-words
[Figure: bag-of-words histogram — frequency of each codeword in the visual vocabulary]
Matching local features: building patch correspondences
To generate candidate matches, find patches that have the most similar appearance (e.g., lowest SSD)
Image 1 Image 2
Slide credits: Prof. Kristen Grauman
Matching local features: building patch correspondences
Simplest approach: compare them all, take the closest (or closest k, or within a thresholded distance)
Image 1 Image 2
Slide credits: Prof. Kristen Grauman
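As a concrete illustration of this brute-force strategy, here is a minimal NumPy sketch; the random descriptors, the SSD threshold, and all names are illustrative placeholders, not from the slides:

```python
import numpy as np

def match_brute_force(desc1, desc2, max_ssd=2.0):
    """Match each descriptor in image 1 to its lowest-SSD counterpart in image 2."""
    matches = []
    for i, d in enumerate(desc1):
        ssd = np.sum((desc2 - d) ** 2, axis=1)  # SSD against every descriptor in image 2
        j = int(np.argmin(ssd))                 # take the closest
        if ssd[j] < max_ssd:                    # keep only sufficiently similar pairs
            matches.append((i, j))
    return matches

# Toy stand-ins for 128-d SIFT descriptors (L2-normalized random vectors)
rng = np.random.default_rng(0)
desc1 = rng.normal(size=(50, 128)); desc1 /= np.linalg.norm(desc1, axis=1, keepdims=True)
desc2 = rng.normal(size=(60, 128)); desc2 /= np.linalg.norm(desc2, axis=1, keepdims=True)
print(len(match_brute_force(desc1, desc2)), "candidate matches")
```

The cost is linear in the product of the two set sizes, which is exactly what the indexing techniques below avoid.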
Indexing local features
• Each patch / region has a descriptor, which is a point in some high-dimensional feature space (e.g., SIFT)
[Figure: descriptors from database images as points in the descriptor’s feature space]
Indexing local features
• When we see close points in feature space, we have similar descriptors, which indicates similar local content.
[Figure: a query image’s descriptors falling close to database descriptors in the feature space]
Problem statement
• With potentially thousands of features per image, and hundreds to millions of images to search, how to efficiently find those that are relevant to a new image?
[Figure: live retrieval demo on a database of 50 thousand images — what about 110 million images?]
Slide credit: Nistér and Stewénius
Scalability matters!
The Nearest-Neighbor Search Problem
• Given
– A set S of n points in d dimensions
– A query point q
• Which point in S is closest to q?
• Time complexity of linear scan: O(dn)
The Nearest-Neighbor Search Problem
• r-nearest neighbor
– for any query q, returns a point p ∈ S s.t. ‖p − q‖ ≤ r
• c-approximate r-nearest neighbor
– for any query q, returns a point p′ ∈ S s.t. ‖p′ − q‖ ≤ cr
Today
• Indexing local features
– Inverted file
– Vocabulary tree
– Locality-sensitive hashing
Indexing local features: inverted file
Indexing local features: inverted file
• For text documents, an efficient way to find all pages on which a word occurs is to use an index.
• We want to find all images in which a feature occurs.
– page ~ image
– word ~ feature
• To use this idea, we’ll need to map our features to “visual words”.
Text retrieval vs. image search
• What makes the two problems similar, and what makes them different?
Visual words
• Extract some local features from a number of images …
e.g., SIFT descriptor space: each point is 128-dimensional
Slide credit: D. Nister, CVPR 2006
Visual words
Each point is a local descriptor, e.g. SIFT vector.
Example: quantize into 3 words
• Map high-dimensional descriptors to tokens/words by quantizing the feature space
• Quantize via clustering; let the cluster centers be the prototype “words”
• Determine which word to assign to each new image region by finding the closest cluster center.
[Figure: descriptor’s feature space partitioned into 3 words, e.g. “Word #2”]
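As a rough sketch of this clustering step, here is a minimal Python example using scikit-learn’s KMeans; the random descriptors, the 3-word vocabulary size, and all variable names are toy choices for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy stand-ins for SIFT descriptors pooled from many training images
rng = np.random.default_rng(1)
train_descriptors = rng.normal(size=(5000, 128))

# Quantize the feature space: the cluster centers become the prototype "words"
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(train_descriptors)

# Assign a new image region to a word by finding the closest cluster center
new_descriptor = rng.normal(size=(1, 128))
word_id = int(kmeans.predict(new_descriptor)[0])
print("region assigned to visual word", word_id)
```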
Visual words
• Each group of patches belongs to the same visual word!
Figure from Sivic & Zisserman, ICCV 2003
Visual vocabulary formation
Issues:
• Sampling strategy: where to extract features? Fixed locations or interest points?
• Clustering / quantization algorithm
• What corpus provides features (universal vocabulary?)
• Vocabulary size, number of words
• Weight of each word?
Inverted file index
The index maps each visual word to the ids of the images containing it.
Why does the index give us a significant gain in efficiency?
A query image is matched only against database images that share visual words with it (see the sketch below).
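A minimal sketch of such an index in Python, assuming images have already been quantized into lists of visual-word ids; the image ids and word ids are made up for illustration:

```python
from collections import defaultdict

# Inverted file: visual-word id -> set of ids of images containing that word
index = defaultdict(set)

def add_image(image_id, word_ids):
    """Register every visual word occurring in an image."""
    for w in word_ids:
        index[w].add(image_id)

def candidate_images(query_word_ids):
    """Only images sharing at least one word with the query need to be scored."""
    candidates = set()
    for w in query_word_ids:
        candidates |= index[w]
    return candidates

add_image("img1", [3, 17, 42])
add_image("img2", [17, 99])
print(candidate_images([17, 5]))   # -> {'img1', 'img2'}
```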
tf-idf weighting
• Term frequency – inverse document frequency
• Describes the frequency of each word within an image, and decreases the weights of words that appear often in the database
– weight ↗ for discriminative regions (cf. “economic”, “trade”, …)
– weight ↘ for common regions (cf. “the”, “most”, “we”, …)
tf-idf weighting
• The weight of visual word i in image (document) d:

  t_i = (n_id / n_d) × log(N / n_i)

– n_id: number of occurrences of word i in document d
– n_d: number of words in document d
– n_i: number of documents word i occurs in, in the whole database
– N: total number of documents in the database
Slide credit: Xin Yang
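A direct transcription of this formula into Python (a minimal sketch; the toy word ids and document frequencies are invented for illustration):

```python
import math
from collections import Counter

def tfidf_vector(word_ids, doc_freq, n_docs):
    """t_i = (n_id / n_d) * log(N / n_i) for every word i in one image."""
    counts = Counter(word_ids)      # n_id: occurrences of each word in this image
    n_d = len(word_ids)             # n_d: total number of words in this image
    return {w: (c / n_d) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()}

# Toy statistics: doc_freq[w] = number of database images containing word w
doc_freq = {3: 1, 17: 2, 42: 1}
print(tfidf_vector([3, 17, 17, 42], doc_freq, n_docs=2))
# word 17 occurs in every image, so log(2/2) = 0 zeroes out its weight
```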
Bag-of-Words + Inverted file
[Figure: the full pipeline — local descriptors from training images are clustered in feature space into a vocabulary of K visual words (VW1, VW2, VW3, …, VWk); each local descriptor is mapped to its nearest word, giving a frequency histogram over visual words per image; the inverted file maps each word to the images containing it (e.g., VW1 → image i, image k, …; VW2 → image i, image j, image k, …; VW3 → image m, image n, …), from which a matching score is accumulated]
http://www.robots.ox.ac.uk/~vgg/research/vgoogle/index.html
[Figure: a bag-of-words representation and the corresponding inverted file]
http://people.cs.ubc.ca/~lowe/keypoints/
D. Nistér and H. Stewénius. Scalable Recognition with a Vocabulary Tree, CVPR 2006.
Visualize as a tree
Vocabulary Tree
• Training: Filling the tree
Slide credit: David Nister
[Nister & Stewenius, CVPR’06]
Vocabulary Tree
• Recognition
Slide credit: David Nister
[Nister & Stewenius, CVPR’06]
Retrieve the candidate images; optionally perform geometric verification on them.
Think about the computational advantage of the hierarchical tree vs. a flat vocabulary!
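With branching factor b and depth D, quantizing a descriptor costs only b·D comparisons walking down the tree, versus b^D comparisons against a flat vocabulary of the same size. Below is a minimal hierarchical k-means sketch in Python; the branching factor, depth, and random data are arbitrary stand-ins, not the authors’ settings:

```python
import numpy as np
from sklearn.cluster import KMeans

class VocabularyTreeNode:
    """Hierarchical k-means: each node splits its descriptors into `branching` children."""
    def __init__(self, descriptors, branching=4, depth=3):
        self.children, self.kmeans = [], None
        if depth > 0 and len(descriptors) >= branching:
            self.kmeans = KMeans(n_clusters=branching, n_init=4).fit(descriptors)
            for c in range(branching):
                subset = descriptors[self.kmeans.labels_ == c]
                self.children.append(VocabularyTreeNode(subset, branching, depth - 1))

    def quantize(self, d, path=()):
        """Descend the tree: b comparisons per level instead of b**depth in total."""
        if self.kmeans is None:
            return path                                   # leaf = visual word id
        c = int(self.kmeans.predict(d.reshape(1, -1))[0])
        return self.children[c].quantize(d, path + (c,))

rng = np.random.default_rng(2)
tree = VocabularyTreeNode(rng.normal(size=(2000, 128)), branching=4, depth=3)
print("leaf (visual word) path:", tree.quantize(rng.normal(size=128)))
```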
Hashing
Direct addressing
• Create a direct-address table with m slots
[Figure: a direct-address table with slots 0–9 — the actual keys K = {2, 3, 5, 8} drawn from the universe U each occupy their own slot, storing the key and its satellite data; the remaining slots are empty]
Direct addressing
• Search operation: O(1)
• Problem: the range of keys can be large!
– 64-bit numbers => 18,446,744,073,709,551,616 different keys
– SIFT: 128 × 8 bits
Hashing
• O(1) average-case time
• Use a hash function h to compute the slot from the key k
[Figure: keys k1, k3, k4, k5 from the universe U hashed into a table T with slots 0 … m−1; the slot h(k1) may not contain k1 alone — e.g., k1 and k3 may share a bucket when h(k1) = h(k3)]
Hashing
• A good hash function
– satisfies the assumption of simple uniform hashing: each key is equally likely to hash to any of the m slots.
• How to design a hash function for indexing high-dimensional data (e.g., a 128-d SIFT descriptor)?
Locality-sensitive hashing
• Indyk and Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality, STOC 1998.
Locality-sensitive hashing (LSH)
• Hash functions are locality-sensitive, if, for any pair of points p, q we have:– Pr[h(p)=h(q)] is “high” if p is close to q– Pr[h(p)=h(q)] is “low” if p is far from q
For a family F of such functions: Pr[h(x) = h(y)] = sim(x, y), where h is drawn at random from F.
Locality Sensitive Hashing
• A family H of functions h: R^d → U is called (r, cr, P1, P2)-sensitive if, for any p, q:
– if ‖p − q‖ ≤ r, then Pr[h(p) = h(q)] > P1
– if ‖p − q‖ ≥ cr, then Pr[h(p) = h(q)] < P2
LSH Function: Hamming Space
• Consider binary vectors
– points from {0, 1}^d
– Hamming distance D(p, q) = # positions on which p and q differ
Example (d = 3):
D(100, 011) = 3
D(010, 111) = 2
LSH Function: Hamming Space
• Define hash function h as h_i(p) = p_i, where p_i is the i-th bit of p
Example: select the 1st dimension
h(010) = 0, h(111) = 1
Pr[h(010) ≠ h(111)] = ⅔ = D(p, q)/d
In general: Pr[h(p) = h(q)] = 1 − D(p, q)/d
Clearly, h is locality sensitive.
LSH Function: Hamming Space
• A k-bit locality-sensitive hash function is defined as g(p) = [h1(p), h2(p), …, hk(p)]^T
– Each h_i(p) is chosen randomly
– Each h_i(p) results in a single bit
• Pr(similar points collide) ≥ 1 − (1 − P1^k)^L (over the L hash tables introduced below)
• Pr(dissimilar points collide) ≤ P2^k
Indyk and Motwani [1998]
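A minimal sketch of such a k-bit bit-sampling function in Python; the vectors, d, k, and the seed are toy choices for illustration:

```python
import random

def make_g(d, k, seed):
    """g(p) = (h1(p), ..., hk(p)): k randomly chosen bit positions of p."""
    rng = random.Random(seed)
    bits = [rng.randrange(d) for _ in range(k)]
    return lambda p: tuple(p[i] for i in bits)

p, q = "0101110", "0101010"    # d = 7, Hamming distance D(p, q) = 1
g = make_g(d=7, k=3, seed=42)
print(g(p), g(q), "collide:", g(p) == g(q))
```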
LSH Function: R^2 space
• Consider 2-d vectors
• Let h(v) indicate on which side of a random hyperplane v falls
• The probability that a random hyperplane separates two unit vectors depends on the angle between them: Pr[h(p) ≠ h(q)] = θ(p, q)/π
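A quick empirical check of this angle relation (a sketch; the angle of 0.3 rad and the trial count are arbitrary):

```python
import numpy as np

def hyperplane_hash(a, v):
    """h(v) = sign(a·v): which side of the hyperplane with normal a the point v lies on."""
    return int(np.dot(a, v) >= 0)

rng = np.random.default_rng(3)
theta = 0.3                                   # angle between the two unit vectors
p = np.array([1.0, 0.0])
q = np.array([np.cos(theta), np.sin(theta)])

# Estimate Pr[h(p) != h(q)] over many random hyperplanes; it approaches theta/pi
hits = sum(hyperplane_hash(a, p) != hyperplane_hash(a, q)
           for a in rng.normal(size=(100_000, 2)))
print(hits / 100_000, "vs theta/pi =", theta / np.pi)
```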
LSH Pre-processing
• Each image is entered into L hash tables indexed by independently constructed g1, g2, …, gL
• Preprocessing Space: O(LN)
LSH Querying
• For each hash table, return the bin indexed by gi(q), 1 ≤ i ≤ L.
• Perform a linear search on the union of the bins.
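Putting the pieces together, a minimal sketch of LSH preprocessing and querying with bit-sampling hashes in Python; D, K, L, and all helper names are illustrative choices, not prescribed values:

```python
from collections import defaultdict
import random

D, K, L = 128, 12, 8           # bits per descriptor, bits per g, number of tables

def make_g(seed):
    """One K-bit bit-sampling function over binary descriptors of length D."""
    bits = random.Random(seed).sample(range(D), K)
    return lambda p: tuple(p[i] for i in bits)

gs = [make_g(s) for s in range(L)]              # independently constructed g1 ... gL
tables = [defaultdict(list) for _ in range(L)]  # one hash table per g

def insert(image_id, p):
    """Preprocessing: enter p into all L tables -- O(L*N) space overall."""
    for g, table in zip(gs, tables):
        table[g(p)].append(image_id)

def query(q):
    """Union of the L bins indexed by g_i(q); linear-scan only this short list."""
    return set().union(*(set(table[g(q)]) for g, table in zip(gs, tables)))

p = [random.Random(0).getrandbits(1) for _ in range(D)]
insert("img1", p)
print(query(p))   # -> {'img1'}
```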
W.-T. Lee and H.-T. Chen. Probing the local-feature space of interest points, ICIP 2010.
Hash family
h(v) = ⌊(a·v + b) / r⌋
– a: random vector sampled from a Gaussian distribution
– b: real value chosen uniformly from the range [0, r]
– r: segment width
• The dot product a·v projects each vector v onto a line
Building the hash table
• Segment width r = (max − min) / t
• For each random projection, we get t buckets.
Building the hash table
• Generate K projections
• How many buckets do we get? t^K
• Combining them gives an index into the hash table.
Building the hash table
• Example
– 5 projections (K = 5)
– 15 segments (t = 15)
• 15^5 = 759,375 buckets in total!
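A sketch of this construction in Python. All constants are illustrative, and the modulo clamp that keeps each digit in [0, t) is my simplification — the slides instead derive r from the data range so that every projection lands in one of t segments directly:

```python
import numpy as np

D, K, t, r = 128, 5, 15, 0.8    # dims, projections, segments per projection, width

rng = np.random.default_rng(7)
A = rng.normal(size=(K, D))      # a: Gaussian random projection vectors
B = rng.uniform(0, r, size=K)    # b: offsets chosen uniformly from [0, r]

def bucket_index(v):
    """h_k(v) = floor((a_k . v + b_k) / r); K digits in base t give one of t**K buckets."""
    digits = np.floor((A @ v + B) / r).astype(int) % t   # clamp each digit into [0, t)
    return int(sum(d * t**k for k, d in enumerate(digits)))

v = rng.normal(size=D)
print("bucket", bucket_index(v), "of", t**K)   # 15**5 = 759,375 buckets in total
```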
• Collect three sets of image patches of different sizes: 16×16, 32×32, 64×64. Each set consists of 200,000 patches.
– Natural image patches (from the Berkeley segmentation database)
– Noise image patches (randomly generated noise patches)
Sketching the Feature Space
[Figure: patch distribution over buckets]
Summary
• Indexing techniques are essential for organizing a database and for enabling fast matching.
• For indexing high-dimensional data– Inverted file– Vocabulary tree– Locality sensitive hashing
Resources and extended readings
• LSH Matlab Toolbox
– http://www.cs.brown.edu/~gregory/download.html
• Yeh et al., “Adaptive Vocabulary Forests for Dynamic Indexing and Category Learning,” ICCV 2007.