Large-scale Visual Search: CBIR Features
Weimin Tan, Fudan University, Sep. 2014
Image features
Levels of image content:
High-level features: semantics
Shape
Texture
Color, lightness
Low-level features / visual features (signatures, descriptors)
Image features
Textual (annotations and metadata):
– tags/keywords;
– creation date;
– geotags;
– name of the file;
– photography conditions (exposure, aperture, flash…).
Visual (low-level), features extracted from pixel values:
– color descriptors;
– texture descriptors;
– shape descriptors;
– spatial layout descriptors.
Visual features (low-level)
Global: describe the whole image:
– average intensity;
– average amount of red;
– …
All pixels of an image are processed.
Local: describe one part of the image:
– average intensity for the upper-left part;
– average amount of red in the center of an image;
– …
The image is segmented, and the pixels of a particular segment are processed to extract features.
Popular Visual Features
Global Feature
– Color: color space, color histogram, color moment
– Texture: GLCM
– Shape Context
– GIST
– Color Name
Local Feature
– Detector: Harris, LoG, DoG, MSER, Hessian-Affine, KAZE, FAST
– Descriptor: SIFT, SURF, GLOH, LIOP, BRIEF, ORB, FREAK, BRISK, CARD, Edge-SIFT
Color Histogram
• Quantization of color space
– Quantization is important: it determines the size of the feature vector.
– When no color similarity function is used:
  • Too many bins: similar colors are treated as dissimilar.
  • Too few bins: dissimilar colors are treated as similar.
[Figure: histogram with bins h1, h2, …, hN]
Color Histogram Retrieval
Color Histogram
Advantage:
• The color histogram is easy to compute and effective in
characterizing both the global and local distribution of colors in an
image.
• Robust to translation and rotation about the view axis and changes only slightly with the scale, occlusion and viewing angle.
Disadvantage:
• Does not capture the spatial distribution of colors in the image.
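The effect of the bin count can be illustrated with a minimal sketch. It assumes 8-bit RGB pixels, uniform quantization per channel, and a bin-by-bin L1 distance; the function names and the sample colors are hypothetical and chosen only for illustration.

```python
def color_histogram(pixels, bins_per_channel):
    """Normalized RGB histogram with bins_per_channel**3 bins (8-bit input assumed)."""
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Uniform quantization: map each 0..255 value to a bin index
        qr = r * bins_per_channel // 256
        qg = g * bins_per_channel // 256
        qb = b * bins_per_channel // 256
        hist[(qr * bins_per_channel + qg) * bins_per_channel + qb] += 1
    return [count / len(pixels) for count in hist]


def l1_distance(h1, h2):
    """Bin-by-bin distance; it ignores how close neighbouring bins are in color."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


red_a = [(200, 10, 10)] * 4   # two nearly identical reds
red_b = [(220, 10, 10)] * 4
blue = [(10, 10, 200)] * 4

# 16 bins per channel: the two reds fall into different bins
too_fine = l1_distance(color_histogram(red_a, 16), color_histogram(red_b, 16))
# 4 bins per channel: the reds share a bin, red and blue do not
balanced_same = l1_distance(color_histogram(red_a, 4), color_histogram(red_b, 4))
balanced_diff = l1_distance(color_histogram(red_a, 4), color_histogram(blue, 4))
# 1 bin per channel: even red and blue become indistinguishable
too_coarse = l1_distance(color_histogram(red_a, 1), color_histogram(blue, 1))
```

With 16 bins per channel the two reds are as far apart as two histograms can be, while with a single bin per channel red and blue coincide, which is the trade-off described in the quantization bullets.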
Color Moments
• Color moments have proved efficient and effective in representing the color distributions of images:
– First order (mean)
– Second order (variance)
– Third order (skewness)
Color Moments
• Consider spatial layout
– Block-wise
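A minimal sketch of block-wise color moments for one channel follows. It assumes the channel is a 2D list of numbers and takes skewness as the signed cube root of the third central moment, which is one common convention and not necessarily the one used on the slide; the function names are hypothetical.

```python
import math


def color_moments(values):
    """Return (mean, variance, skewness) of one color channel."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    # Assumption: skewness as the signed cube root of the third central moment
    skewness = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, variance, skewness


def blockwise_moments(channel, rows, cols):
    """Concatenate the three moments of every block in a rows x cols grid."""
    h, w = len(channel), len(channel[0])
    features = []
    for br in range(rows):
        for bc in range(cols):
            block = [channel[y][x]
                     for y in range(br * h // rows, (br + 1) * h // rows)
                     for x in range(bc * w // cols, (bc + 1) * w // cols)]
            features.extend(color_moments(block))
    return features


# A 4x4 channel whose left half is dark and right half is bright
channel = [[0, 0, 10, 10] for _ in range(4)]
global_feature = color_moments([v for row in channel for v in row])
block_feature = blockwise_moments(channel, 1, 2)
```

The global moments cannot tell which half is bright, whereas the block-wise feature keeps one mean per half and therefore retains some spatial layout.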
Texture Feature
Some pattern of color or intensity changes
[Figure: natural texture]
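Gray Level Co-occurrence Matrix (GLCM)
Idea
– Texture is formed by gray-level distributions that recur at spatial positions.
– Two pixels separated by a given distance in a textured image have a certain gray-level relationship, i.e. the spatial correlation of gray levels.
– The co-occurrence matrix method uses conditional probabilities to characterize texture; it expresses the gray-level correlation of neighboring pixels.
Method
– Build a matrix from the positional relationship (distance, direction) between image pixels and use it as the texture description.
– The row and column indices of the matrix represent gray levels; the frequency with which each pair of pixel values occurs is the corresponding matrix element.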
GLCM
• The GLCM is defined by:
P(p, q, d, \theta) = n_{ij}
– where n_{ij} is the number of occurrences of the pixel values (p, q) lying at distance d with angle \theta in the image; entry (p, q) of the matrix holds n_{ij}.
– The co-occurrence matrix P has dimension n × n, where n is the number of gray levels in the image.
• Normalized form, with S the set of pixel pairs at distance d and angle \theta:
p(p, q, d, \theta) = \frac{\#\{[(x_1, y_1), (x_2, y_2)] \in S \mid f(x_1, y_1) = p \;\&\; f(x_2, y_2) = q\}}{\#S}
Example:
Image A          Image B
0 1 2 3 0 1      0 0 0 0 1 1
1 2 3 0 1 2      0 0 0 0 1 1
2 3 0 1 2 3      0 0 0 0 1 1
3 0 1 2 3 0      0 0 0 0 1 1
0 1 2 3 0 1      2 2 2 2 3 3
1 2 3 0 1 2      2 2 2 2 3 3
Gray levels: g1 = 0, g2 = 1, g3 = 2, g4 = 3
Directions: 0°, 45°, 90°, 135°

P_A(d = 1, 0°)     P_A(d = 1, 45°)
0 8 0 7            12  0  0  0
8 0 8 0             0 14  0  0
0 8 0 7             0  0 12  0
7 0 7 0             0  0  0 12

P_B(d = 1, 0°)     P_B(d = 1, 45°)
24 4  0 0          18 3 3 0
 4 8  0 0           3 6 1 1
 0 0 12 2           3 1 6 1
 0 0  2 4           0 1 1 2
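The matrices in the example above can be reproduced with a short sketch. It assumes symmetric counting (each pixel pair is counted in both directions) and image coordinates with y pointing down, so that 45° corresponds to the offset (dx, dy) = (1, -1); the function name `glcm` is hypothetical.

```python
def glcm(image, dx, dy, levels):
    """Symmetric co-occurrence counts for pixel pairs offset by (dx, dy)."""
    P = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                p, q = image[y][x], image[y2][x2]
                P[p][q] += 1
                P[q][p] += 1  # count the pair in both directions
    return P


# Image A: gray level (row + column) mod 4; Image B: four constant regions
image_a = [[(r + c) % 4 for c in range(6)] for r in range(6)]
image_b = [[0, 0, 0, 0, 1, 1] for _ in range(4)] + [[2, 2, 2, 2, 3, 3] for _ in range(2)]

P_A_0 = glcm(image_a, 1, 0, 4)     # d = 1, 0 degrees
P_A_45 = glcm(image_a, 1, -1, 4)   # d = 1, 45 degrees (one right, one up)
P_B_0 = glcm(image_b, 1, 0, 4)
P_B_45 = glcm(image_b, 1, -1, 4)
```

Dividing every entry by the sum of the matrix gives the normalized form p(p, q, d, θ).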
Gray Level Co-occurrence Matrix
• Contains information about the positions of pixels having similar gray level values.
• Robust to translation and rotation about the view axis, and changes only slowly with scale, occlusion and viewing angle.
Shape Context Descriptor [Belongie et al. ’02]
• What points on these two sampled contours are most similar? How do you know?
(Shape context slides from Belongie et al.)
Count the number of points inside each bin, e.g.:
Count = 4
Count = 10
Compact representation of the distribution of points relative to each point
NIPS ’00, PAMI ’02
Shape Context Descriptor
Shape histogram
Comparing Shape Contexts
Compute matching costs using the chi-squared distance:
C_{ij} = \frac{1}{2} \sum_{k} \frac{[h_i(k) - h_j(k)]^2}{h_i(k) + h_j(k)}
Recover correspondences by solving for the least-cost assignment, using costs C_{ij}
(Then use a deformable template match, given the correspondences.)
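The matching step can be sketched as follows, assuming each point is already described by a small shape-context histogram. The histograms below are made-up numbers, and the brute-force search over permutations stands in for a proper assignment solver (such as the Hungarian algorithm) only because the example has three points.

```python
from itertools import permutations


def chi2_cost(h_i, h_j):
    """0.5 * sum_k (h_i[k] - h_j[k])^2 / (h_i[k] + h_j[k]), skipping empty bins."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h_i, h_j) if a + b > 0)


def least_cost_assignment(cost):
    """Brute-force assignment; feasible only for a handful of points."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return list(best)


# Hypothetical shape-context histograms (3 bins) for 3 points on each shape
shape_1 = [[4, 10, 0], [0, 2, 12], [7, 7, 0]]
shape_2 = [[0, 2, 11], [7, 6, 1], [4, 10, 0]]

cost = [[chi2_cost(h_i, h_j) for h_j in shape_2] for h_i in shape_1]
assignment = least_cost_assignment(cost)  # assignment[i] = index of the match in shape_2
```

Here point 0 of the first shape is matched to point 2 of the second, point 1 to point 0, and point 2 to point 1.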
GIST Feature
• Definition and Background
– Essence, holistic characteristics of an image
– Context information obtained within an eye saccade (approx. 150 ms)
– Evidence of place-recognizing cells in the Parahippocampal Place Area (PPA)
– Biologically plausible models of gist are yet to be proposed
• Nature of tasks done with gist
– Scene categorization/context recognition
– Region priming/layout recognition
– Resolution/scale selection
C. Siagian and L. Itti, "Rapid Biologically-Inspired Scene Classification Using Features Shared with Visual Attention," IEEE Transactions on PAMI, vol. 29, no. 2, pp. 300-312, Feb. 2007.
Human Vision Architecture
• Visual Cortex:
– Low-level filters, center-surround, and normalization
• Saliency Model:
– Attend to pertinent regions
• Gist Model:
– Compute general image characteristics
• High-Level Vision:
– Object recognition
– Layout recognition
– Scene understanding
Gist Model Implementation
Raw image feature maps
• Orientation channel: Gabor filters at 4 angles (0°, 45°, 90°, 135°) on 4 scales = 16 sub-channels
• Color channel: red-green and blue-yellow center-surround, each with 6 scale combinations = 12 sub-channels
• Intensity channel: dark-bright center-surround with 6 scale combinations = 6 sub-channels
= Total of 34 sub-channels
Gist Model Implementation
• Gist Feature Extraction
– Average values of a predetermined grid (4×4)
• Dimension Reduction
– Original: 34 sub-channels × 16 features = 544 features
– PCA/ICA reduction: 80 features
• Kept >95% of variance
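The grid-averaging step and the resulting dimensionality can be sketched as below. The feature maps are placeholders (constant 8×8 maps), since computing the Gabor and center-surround sub-channels is outside this sketch; the function name is hypothetical, and the PCA/ICA reduction is not shown.

```python
def grid_averages(feature_map, grid=4):
    """Average a 2D feature map over a grid x grid layout of cells."""
    h, w = len(feature_map), len(feature_map[0])
    features = []
    for gr in range(grid):
        for gc in range(grid):
            cell = [feature_map[y][x]
                    for y in range(gr * h // grid, (gr + 1) * h // grid)
                    for x in range(gc * w // grid, (gc + 1) * w // grid)]
            features.append(sum(cell) / len(cell))
    return features


# 16 orientation + 12 color + 6 intensity sub-channels
sub_channels = 16 + 12 + 6
# Placeholder maps: one constant 8x8 map per sub-channel
maps = [[[float(k)] * 8 for _ in range(8)] for k in range(sub_channels)]
gist_vector = [value for m in maps for value in grid_averages(m)]

# A horizontal ramp shows what one map contributes: one mean per grid cell
ramp = [[float(x) for x in range(8)] for _ in range(8)]
ramp_features = grid_averages(ramp)
```

Each sub-channel contributes 16 values, so the raw gist vector has 34 × 16 = 544 entries before dimension reduction.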
System Example Run
Global Features
• Advantages:
– Simple.
– Low computational complexity.
• Disadvantages:
– Low accuracy
Local Features
• Why local features?
– Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
– Distinctiveness: individual features can be matched to a large database of objects
– Quantity: many features can be generated for even small objects
– Efficiency: close to real-time performance
– Extensibility: can easily be extended to a wide range of differing feature types, with each adding robustness
Local Features
• Main components:
– Detection of interest points
– Local feature descriptor
Image -> Interest Points -> Local Feature
Local Feature
• Corners as distinctive interest points
− We should easily recognize the point by looking through a small window
− Shifting a window in any direction should give a large change in intensity
“Flat” region: no change in all directions
“Edge”: no change along the edge direction
“Corner”: significant change in all directions
Harris Corner Detector
Consider shifting the window W by (u,v)
• How do the pixels in W change?
• Compare each pixel before and after by summing up the squared differences:
E(u,v) = \sum_{(x,y) \in W} [I(x+u, y+v) - I(x,y)]^2
Taylor series expansion of I:
I(x+u, y+v) = I(x,y) + I_x u + I_y v + higher-order terms
If the motion (u,v) is small, then the first-order approximation is good:
E(u,v) \approx \sum_{(x,y) \in W} [I_x u + I_y v]^2
This can be rewritten as
E(u,v) \approx [u \; v] \, M \, [u \; v]^T, \quad M = \sum_{(x,y) \in W} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}
For the example above
• You can move the center of the blue window to anywhere on the yellow unit circle
• Which directions will result in the largest and smallest E values?
• We can find these directions by looking at the eigenvectors of M
Eigenvalues and eigenvectors of M
• Define shifts with the smallest and largest change (E value)
• x_+ = direction of largest increase in E
• \lambda_+ = amount of increase in direction x_+
• x_- = direction of smallest increase in E
• \lambda_- = amount of increase in direction x_-
M x_+ = \lambda_+ x_+
M x_- = \lambda_- x_-
Harris Corner Detector
“Flat” region: λ1 and λ2 are small
“Edge”: λ1 >> λ2 or λ2 >> λ1
“Corner”: λ1 and λ2 are large, λ1 ~ λ2
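Feature Detection: Mathematics
Classification of image points using eigenvalues of M (plot axes: \lambda_1, \lambda_2):
– “Corner”: \lambda_1 and \lambda_2 are large, \lambda_1 ~ \lambda_2; E increases in all directions
– “Edge”: \lambda_1 >> \lambda_2 or \lambda_2 >> \lambda_1
– “Flat” region: \lambda_1 and \lambda_2 are small; E is almost constant in all directions
Corner response function:
f = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2} \quad or \quad f = \lambda_1 \lambda_2 - k (\lambda_1 + \lambda_2)^2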
Harris Corner Detector
• Procedure:
− Compute the M matrix for each image window to get its cornerness score
− Find points whose surrounding window gave a large corner response
− Take the points of local maxima, i.e., perform non-maximum suppression
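The first two steps of the procedure can be sketched for a single window. This is a minimal illustration that assumes central-difference gradients, an unweighted 3×3 window, and the response det(M) - k·trace(M)^2 = λ1λ2 - k(λ1 + λ2)^2 with k = 0.04, a commonly used empirical value that is not taken from the slides. Non-maximum suppression is not shown.

```python
def harris_response(image, x0, y0, half=1, k=0.04):
    """Corner response det(M) - k * trace(M)^2 for the window centred at (x0, y0)."""
    sxx = syy = sxy = 0.0
    for y in range(y0 - half, y0 + half + 1):
        for x in range(x0 - half, x0 + half + 1):
            # Central-difference image gradients I_x and I_y
            ix = (image[y][x + 1] - image[y][x - 1]) / 2.0
            iy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy   # equals lambda1 * lambda2
    trace = sxx + syy             # equals lambda1 + lambda2
    return det - k * trace * trace


# Synthetic 10x10 image: a bright square whose top-left corner is at (3, 3)
image = [[1.0 if x >= 3 and y >= 3 else 0.0 for x in range(10)] for y in range(10)]

corner = harris_response(image, 3, 3)   # on the corner of the square
edge = harris_response(image, 7, 3)     # on the top edge, away from the corner
flat = harris_response(image, 7, 7)     # inside the square
```

On this synthetic image the response is positive at the corner, negative on the edge and zero in the flat region, matching the eigenvalue classification.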
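Advantages: (A) rotation invariance; (B) partial invariance to affine changes of image intensity.
Disadvantages: (A) very sensitive to scale, with no geometric scale invariance; (B) the extracted corners are only pixel-accurate.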
Harris Detector Example
The tops of the horns are detected in both images
Harris Corner (in red)
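Laplacian of Gaussian
The LoG edge detection operator was proposed jointly by David Courtnay Marr and Ellen Hildreth (1980) [1]. It is therefore also called the Marr-Hildreth edge detection algorithm or the Marr & Hildreth operator.
The algorithm first applies a Gaussian filter to the image and then takes the Laplacian (second derivative) of the result; in other words, the image is filtered with the Laplacian of the Gaussian function. Finally, the edges of the image or object are obtained by detecting the zero crossings of the filter output. For this reason it is commonly abbreviated as the Laplacian-of-Gaussian (LoG) operator.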
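Laplacian of Gaussian
Consider \frac{\partial^2}{\partial x^2}(h * f), where f is the signal and h is a Gaussian; \frac{\partial^2}{\partial x^2} h is the Laplacian of Gaussian operator.
Where is the edge? At the zero-crossings of the bottom graph (the filtered signal).
\nabla^2 is the Laplacian operator: \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}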
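A one-dimensional sketch of zero-crossing edge detection follows. It assumes a step edge as input and uses the second derivative of a 1-D Gaussian as the 1-D analogue of the LoG; the kernel radius of 4σ and all function names are assumptions made for illustration.

```python
import math


def log_kernel_1d(sigma, radius):
    """Second derivative of a 1-D Gaussian, sampled at integer positions."""
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)
        kernel.append((x * x / sigma ** 4 - 1.0 / sigma ** 2) * g)
    return kernel


def log_response(signal, sigma):
    """Filter the signal; border samples that lack full support stay at 0."""
    radius = int(4 * sigma)
    kernel = log_kernel_1d(sigma, radius)
    response = [0.0] * len(signal)
    for i in range(radius, len(signal) - radius):
        response[i] = sum(signal[i + t] * kernel[t + radius]
                          for t in range(-radius, radius + 1))
    return response


def zero_crossings(response):
    """Indices i where the response changes sign between i and i + 1."""
    return [i for i in range(len(response) - 1)
            if response[i] * response[i + 1] < 0]


# Step edge between samples 19 and 20
signal = [0.0] * 20 + [1.0] * 20
edges = zero_crossings(log_response(signal, sigma=2.0))
```

The response is positive just before the step and negative just after it, so the only sign change sits at the edge location.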
[Figure: 2D filters: Gaussian, derivative of Gaussian, Laplacian of Gaussian]
Laplacian-of-Gaussian (LoG)
We define the characteristic scale as the scale that produces the peak of the Laplacian response.
LoG Blob Detection: Example
Interest points can be defined as the centers of blobs.
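Technical detail
We can approximate the Laplacian with a difference of Gaussians, which is more efficient to implement.
(Laplacian) \sigma^2 \nabla^2 G = \sigma^2 (G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma))
(Difference of Gaussians) G(x, y, k\sigma) - G(x, y, \sigma)
DoG Image Pyramid
[Figure: DoG image pyramid; scale labels given in terms of \sigma_0]
Local Extrema Detection
Maxima and minima: compare x with its 26 neighbors at 3 scales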
D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) ∗ I(x, y) = L(x, y, kσ) − L(x, y, σ).
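The difference and the 26-neighbor test can be sketched as follows. The sketch assumes the Gaussian-smoothed images L are already available as 2D lists and that the DoG images of one octave are stacked in a list indexed by scale; the function names are hypothetical.

```python
def difference_of_gaussians(L_ksigma, L_sigma):
    """D = L(x, y, k*sigma) - L(x, y, sigma), computed pixel by pixel."""
    return [[a - b for a, b in zip(row_k, row_s)]
            for row_k, row_s in zip(L_ksigma, L_sigma)]


def scale_space_neighbours(dog, s, y, x):
    """The 26 neighbours of dog[s][y][x]: 8 at the same scale, 9 above, 9 below."""
    return [dog[s + ds][y + dy][x + dx]
            for ds in (-1, 0, 1) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (ds, dy, dx) != (0, 0, 0)]


def is_local_extremum(dog, s, y, x):
    """True if the sample is strictly above or strictly below all 26 neighbours."""
    centre = dog[s][y][x]
    neighbours = scale_space_neighbours(dog, s, y, x)
    return centre > max(neighbours) or centre < min(neighbours)


# Three 3x3 DoG levels that are flat except for one peak in the middle level
dog = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
dog[1][1][1] = 1.0
peak_found = is_local_extremum(dog, 1, 1, 1)
```

Only interior samples with a scale above and below can be tested this way, which is why each octave needs extra DoG levels at both ends.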
SIFT Descriptor
• Making the descriptor rotation invariant
– Rotate the patch according to its dominant gradient orientation
– This puts patches into a canonical orientation.
Scale Invariant Feature Transform (SIFT) descriptor
• Basic idea:
− Take a 16×16 square window around the detected feature
− Compute the edge orientation (angle of the gradient − 90°) for each pixel
− Throw out weak edges (threshold the gradient magnitude)
− Create a histogram of the surviving edge orientations
[Figure: angle histogram over 0 to 2\pi]
Orientation
Gradient magnitude and angle:
m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}
\theta(x, y) = \tan^{-1}\left( \frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)} \right)
Orientation selection
• Full version:
− Divide the 16×16 window into a 4×4 grid of cells (2×2 case shown below)
− Compute an orientation histogram for each cell
− 16 cells × 8 orientations = 128-dimensional descriptor
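The 4×4 × 8 layout can be sketched as follows. This is a simplified illustration rather than the full SIFT descriptor: it bins the gradient angle directly (not the edge orientation), weights each vote by the gradient magnitude, and leaves out Gaussian weighting, interpolation between bins, thresholding and rotation to the dominant orientation. The input is assumed to be an 18×18 smoothed patch, i.e. the 16×16 window plus a one-pixel border for the finite differences.

```python
import math


def sift_like_descriptor(L):
    """128-D descriptor from an 18x18 patch: 4x4 cells, 8 orientation bins each."""
    desc = [0.0] * 128
    for y in range(1, 17):
        for x in range(1, 17):
            dx = L[y][x + 1] - L[y][x - 1]
            dy = L[y + 1][x] - L[y - 1][x]
            m = math.hypot(dx, dy)                       # gradient magnitude m(x, y)
            theta = math.atan2(dy, dx) % (2 * math.pi)   # gradient angle in [0, 2*pi)
            o = int(theta / (2 * math.pi) * 8) % 8       # one of 8 orientation bins
            cell = ((y - 1) // 4) * 4 + (x - 1) // 4     # one of 16 cells
            desc[cell * 8 + o] += m
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc


# Horizontal intensity ramp: every gradient points along +x (angle 0)
patch = [[float(x) for x in range(18)] for _ in range(18)]
descriptor = sift_like_descriptor(patch)
```

For the ramp, each of the 16 cells puts all of its weight into orientation bin 0, so the normalized descriptor has 16 equal non-zero entries.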
Scale Invariant Feature Transform (SIFT) descriptor
• Invariant to
– Scale
– Rotation
• Partially invariant to
– Illumination changes
– Camera viewpoint
– Occlusion, clutter
SURF: Speeded Up Robust Features
• Using integral images for a major speed-up
– An integral image (summed-area table) is an intermediate representation of the image: each entry contains the sum of the grayscale pixel values above and to the left of that position.
– Integral images allow fast computation of box-type convolution filters.
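A minimal sketch of the integral image and a box sum follows. It assumes a grayscale image given as a 2D list and pads the table with a leading row and column of zeros so that no boundary checks are needed; the function names are hypothetical.

```python
def integral_image(img):
    """ii[y][x] = sum of img over all rows < y and columns < x (zero-padded table)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1, using four table lookups."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
whole = box_sum(ii, 0, 0, 3, 3)          # all nine pixels
lower_right = box_sum(ii, 1, 1, 3, 3)    # 5 + 6 + 8 + 9
```

The cost of a box sum does not depend on the box size, which is what makes large box filters as cheap as small ones.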
ECCV 2006, CVIU 2008
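• The SURF interest point detection algorithm is an improvement on SIFT, mainly in speed: it is more efficient. Its main difference from SIFT is the way the multi-scale image space is constructed.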
A comparison of SIFT, PCA-SIFT and SURF
Method    Time    Scale   Rotation  Blur    Illumination  Affine
SIFT      common  best    best      common  common        good
PCA-SIFT  good    good    good      best    good          best
SURF      best    common  common    good    best          good
SURF
• Hessian-based interest point localization
• L_xx(x, y, σ) is the Laplacian of Gaussian of the image
• It is the convolution of the Gaussian second-order derivative with the image
Constructing the Gaussian pyramid scale space
SURF
• Second-order derivatives are approximated with box filters (mean/average filters)
Partial derivatives and convolutions are computed with these templates to obtain the Hessian determinant map, which is analogous to the DoG map in SIFT.
Detection
• Scale analysis with constant image size
1st octave: 9 × 9, 15 × 15, 21 × 21, 27 × 27
2nd octave: 39 × 39, 51 × 51, …
Description
• Orientation assignment
– Circular neighborhood of radius 6s around the interest point (s = the scale at which the point was detected)
– Haar wavelet responses in x and y with side length = 4s; each response costs 6 operations to compute
[Figure: x response, y response]
Unlike SIFT, SURF sums the horizontal and vertical Haar wavelet responses of all points within a 60-degree sector.
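GLOH: Gradient location-orientation histogram (Mikolajczyk and Schmid 2005)
17 location bins × 16 orientation bins -> 272-D histogram -> 128-D by PCA
[Figure: SIFT grid vs. GLOH log-polar grid]
GLOH replaces the 4-quadrant grid used by SIFT with a log-polar binning structure. Spatially, radii of 6, 11 and 15 are used, with eight angular intervals (except for the central region). The resulting 272-dimensional (17 × 16) histogram is trained on a large database and projected by PCA to a 128-dimensional vector.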
LIOP: Local intensity order pattern for feature description
Zhenhua Wang, Bin Fan, and Fuchao Wu, "Local intensity order pattern for feature description," ICCV, 2011.
Motivation: orientation estimation error in SIFT
Edge-SIFT: discriminative binary descriptor for scalable partial-duplicate mobile search (IEEE TIP, 2014)
Histogram-based descriptors:
• Good for classification tasks
• Expensive, not optimal for partial-duplicate search
Motivation (the edge map):
• Preserves structural and spatial clues
• Sparse, fast to compute
• Has potential for local descriptor extraction
Extraction of Edge-SIFT
[Figure: pipeline: keypoint detection -> image patch extraction & normalization (orientation, scale) -> edge extraction & edge descriptor computation; edge maps at 0, 45, 90 and 135 degrees; 8 × 8 × 4 = 256-bit descriptor]
The Matching Performance
The matching result of SIFT
The matching result of Edge-SIFT
Thank you !