Large-scale Visual Search: CBIR Features. Weimin Tan, Fudan University, Sep. 2014

Upload: tanweimin666

Post on 14-Aug-2015

Page 1: Cbir ‐ features

Large-scale Visual Search: CBIR Features

Weimin Tan, Fudan University, Sep. 2014

Page 2: Cbir ‐ features

Image features

Levels of image content (from high-level to low-level):

– Semantics (high-level features)
– Shape
– Texture
– Color, lightness

Shape, texture, and color are low-level features / visual features (signatures, descriptors).

Page 3: Cbir ‐ features

Image features

Textual: annotations and metadata:
– tags/keywords;
– creation date;
– geo tags;
– name of the file;
– photography conditions (exposure, aperture, flash…).

Visual (low-level): features extracted from pixel values:
– color descriptors;
– texture descriptors;
– shape descriptors;
– spatial layout descriptors.

Page 4: Cbir ‐ features

Visual features (low-level)

Global: describe the whole image
– average intensity;
– average amount of red;
− …
All pixels of an image are processed.

Local: describe one part of the image
– average intensity for the upper-left part;
– average amount of red in the center of an image;
− …
Segmentation of the image is performed; the pixels of a particular segment are processed to extract features.
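The difference between a global and a local feature can be sketched in a few lines. This is only an illustration: the 4x4 image is made up, `average_intensity` is a hypothetical helper, and a fixed rectangle stands in for the segment that a real system would obtain from segmentation.

```python
def average_intensity(image, top=0, left=0, height=None, width=None):
    """Mean pixel value over a rectangular region (defaults to the whole image)."""
    height = len(image) - top if height is None else height
    width = len(image[0]) - left if width is None else width
    region = [image[r][c]
              for r in range(top, top + height)
              for c in range(left, left + width)]
    return sum(region) / len(region)

# Hypothetical 4x4 grayscale image (values 0-255), bright in the upper-left part.
image = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

global_feature = average_intensity(image)             # all pixels are processed
local_feature = average_intensity(image, 0, 0, 2, 2)  # only the upper-left part
```

The global value averages the bright patch away, while the local value still shows it.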

Page 5: Cbir ‐ features

Popular Visual Features

Global Feature
– Color: color space, color histogram, color moment
– Texture: GLCM
– Shape Context
– GIST
– Color Name

Local Feature
– Detector: Harris, LoG, DoG, MSER, Hessian-Affine, KAZE, FAST
– Descriptor: SIFT, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD, Edge-SIFT

Page 6: Cbir ‐ features

Color Histogram

• Quantization of color space
– Quantization is important: it determines the size of the feature vector.
– When no color similarity function is used:
• Too many bins: similar colors are treated as dissimilar.
• Too few bins: dissimilar colors are treated as similar.

[Figure: histogram with bins h1, h2, …, hN]
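The effect of the bin count can be seen in a small sketch. It assumes uniform quantization of 8-bit RGB values, which is only one possible choice of color space and quantizer. `color_histogram` and the four sample pixels are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram with bins_per_channel**3 bins (uniform quantization)."""
    step = 256 / bins_per_channel            # width of one bin on each channel
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        ri, gi, bi = (int(v / step) for v in (r, g, b))
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    return [count / len(pixels) for count in hist]

# Two reddish and two bluish pixels (hypothetical data).
pixels = [(250, 10, 10), (240, 20, 5), (10, 10, 250), (12, 8, 245)]

coarse = color_histogram(pixels, bins_per_channel=4)   # 64 bins
fine = color_histogram(pixels, bins_per_channel=64)    # 262144 bins

coarse_used = sum(1 for v in coarse if v > 0)
fine_used = sum(1 for v in fine if v > 0)
```

With 4 bins per channel the two reds share one bin and the two blues share another. With 64 bins per channel every pixel lands in its own bin, so similar colors are treated as dissimilar.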

Page 7: Cbir ‐ features

Color Histogram Retrieval

Page 8: Cbir ‐ features

Color Histogram

Advantages:
• The color histogram is easy to compute and effective in characterizing both the global and local distribution of colors in an image.
• It is robust to translation and rotation about the view axis, and changes only slightly with scale, occlusion and viewing angle.

Disadvantage:
• It does not capture the spatial distribution of colors in an image.


Page 10: Cbir ‐ features

Color Moments

• Color moments have proven efficient and effective in representing the color distributions of images:
– First order (mean)
– Second order (variance)
– Third order (skewness)
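A minimal sketch of the three moments for one image, computed per channel. The skewness here is the signed cube root of the third central moment, which is one common convention and an assumption on my part; the slides do not give the formula. `color_moments` and the sample pixels are hypothetical.

```python
import math

def color_moments(values):
    """Mean, variance, and skewness (signed cube root of the third central moment)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    skewness = math.copysign(abs(third) ** (1 / 3), third)
    return mean, variance, skewness

# Hypothetical RGB pixels; only the red channel varies.
pixels = [(10, 0, 0), (20, 0, 0), (30, 0, 0), (100, 0, 0)]

feature = []
for channel in range(3):
    feature.extend(color_moments([p[channel] for p in pixels]))
```

Three moments on three channels give a 9-dimensional feature vector, far more compact than a histogram.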

Page 11: Cbir ‐ features

Color Moments

• Consider spatial layout
– Block-wise
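The block-wise idea can be sketched by computing a moment per block instead of per image. For brevity this example uses only the first moment (the mean) on a grayscale image and a 2x2 grid. `blockwise_means` and both test images are hypothetical.

```python
def blockwise_means(image, grid=2):
    """Mean intensity of each cell in a grid x grid partition, in row-major order."""
    rows, cols = len(image), len(image[0])
    features = []
    for by in range(grid):
        for bx in range(grid):
            r0, r1 = by * rows // grid, (by + 1) * rows // grid
            c0, c1 = bx * cols // grid, (bx + 1) * cols // grid
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(block) / len(block))
    return features

# Two images with the same global mean but mirrored layouts.
left_bright = [[255, 255, 0, 0] for _ in range(4)]
right_bright = [[0, 0, 255, 255] for _ in range(4)]

left_features = blockwise_means(left_bright)
right_features = blockwise_means(right_bright)
```

A single global mean cannot tell these two images apart, but the block-wise vectors differ.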


Page 13: Cbir ‐ features

Texture Feature

Some pattern of color or intensity changes

Page 14: Cbir ‐ features

Natural Texture

Page 15: Cbir ‐ features

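GLCM (Gray-Level Co-occurrence Matrices)

Idea
– Texture is formed by gray-level distributions that recur across spatial positions.
– In a texture image, two pixels separated by a given distance have a certain gray-level relationship, i.e. the spatial correlation of gray levels.
– The co-occurrence matrix method describes texture with conditional probabilities and expresses the gray-level correlation of neighboring pixels.

Method
– Based on the positional relationship between image pixels (distance, direction), construct a matrix that serves as the texture description.
– The row and column indices of the matrix denote gray levels; each matrix element is the frequency with which the corresponding pair of gray levels occurs.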

Page 16: Cbir ‐ features

GLCM

• The GLCM is defined by:

$P(p, q, d, \theta) = n_{ij}$

– where $n_{ij}$ is the number of occurrences of the pixel values $(p, q)$ lying at distance $d$ with angle $\theta$ in the image.
– The co-occurrence matrix $P$ has dimension $n \times n$, where $n$ is the number of gray levels in the image.

Normalized form, with $S$ the set of pixel pairs at distance $d$ and angle $\theta$:

$p(p, q, d, \theta) = \frac{\#\{[(x_1, y_1), (x_2, y_2)] \in S \mid f(x_1, y_1) = p \;\&\; f(x_2, y_2) = q\}}{\#S}$

Page 17: Cbir ‐ features

Example:

Image A:
0 1 2 3 0 1
1 2 3 0 1 2
2 3 0 1 2 3
3 0 1 2 3 0
0 1 2 3 0 1
1 2 3 0 1 2

Image B:
0 0 0 0 1 1
0 0 0 0 1 1
0 0 0 0 1 1
0 0 0 0 1 1
2 2 2 2 3 3
2 2 2 2 3 3

Gray levels: g1 = 0, g2 = 1, g3 = 2, g4 = 3. Directions: 0°, 45°, 90°, 135°.

P_A(d = 1, 0°):
0 8 0 7
8 0 8 0
0 8 0 7
7 0 7 0

P_A(d = 1, 45°):
12 0 0 0
0 14 0 0
0 0 12 0
0 0 0 12

P_B(d = 1, 0°):
24 4 0 0
4 8 0 0
0 0 12 2
0 0 2 4

P_B(d = 1, 45°):
18 3 3 0
3 6 1 1
3 1 6 1
0 1 1 2
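The four matrices of this example can be reproduced with a short sketch. It assumes the symmetric convention, in which each pixel pair is counted in both directions; that convention is what matches the numbers in the example. The function name `glcm` and the (row, column) offset encoding are my own choices.

```python
def glcm(image, d_row, d_col, levels):
    """Symmetric gray-level co-occurrence counts for the pixel offset (d_row, d_col)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                p, q = image[r][c], image[r2][c2]
                counts[p][q] += 1
                counts[q][p] += 1   # count the pair in both directions
    return counts

image_a = [[(r + c) % 4 for c in range(6)] for r in range(6)]
image_b = [[0, 0, 0, 0, 1, 1]] * 4 + [[2, 2, 2, 2, 3, 3]] * 2

# d = 1 at 0 degrees is one step to the right; 45 degrees is one step up and right.
p_a_0 = glcm(image_a, 0, 1, 4)
p_a_45 = glcm(image_a, -1, 1, 4)
p_b_0 = glcm(image_b, 0, 1, 4)
p_b_45 = glcm(image_b, -1, 1, 4)
```

Image A is constant along the 45° direction, which is why its 45° matrix is purely diagonal.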

Page 18: Cbir ‐ features

GLCM

Gray Level Co-occurrence Matrix
• Contains information about the positions of pixels having similar gray level values.
• Robust to translation and rotation about the view axis, and changes only slowly with scale, occlusion and viewing angle.


Page 20: Cbir ‐ features

• What points on these two sampled contours are most similar? How do you know?

Page 21: Cbir ‐ features

Shape Context Descriptor [Belongie et al. '02]

Count the number of points inside each bin, e.g.:

Count = 4

...

Count = 10

Compact representation of the distribution of points relative to each point.

NIPS'00, PAMI'02 (shape context slides from Belongie et al.)
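The bin counting can be sketched as a log-polar histogram around one contour point. The bin layout below (5 rings, 12 sectors, radii between 0.125 and 2 times the mean pairwise distance) is an assumption chosen for illustration, and `shape_context` is a hypothetical name.

```python
import math

def shape_context(points, index, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of all other points relative to points[index]."""
    n = len(points)
    # Normalize radii by the mean pairwise distance so the descriptor is scale invariant.
    dists = [math.hypot(points[a][0] - points[b][0], points[a][1] - points[b][1])
             for a in range(n) for b in range(a + 1, n)]
    mean_dist = sum(dists) / len(dists)
    # Outer radius of each ring, log-spaced between r_inner and r_outer.
    edges = [r_inner * (r_outer / r_inner) ** (k / (n_r - 1)) for k in range(n_r)]
    hist = [[0] * n_theta for _ in range(n_r)]
    px, py = points[index]
    for k, (x, y) in enumerate(points):
        if k == index:
            continue
        r = math.hypot(x - px, y - py) / mean_dist
        ring = next((i for i, edge in enumerate(edges) if r <= edge), None)
        if ring is None:
            continue                      # farther away than the outermost ring
        theta = math.atan2(y - py, x - px) % (2 * math.pi)
        sector = int(theta / (2 * math.pi) * n_theta) % n_theta
        hist[ring][sector] += 1
    return hist

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
hist = shape_context(square, 0)
total = sum(sum(ring) for ring in hist)
```

Each point gets one such histogram, and every other point of the shape falls into exactly one bin unless it lies beyond the outermost ring.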

Page 22: Cbir ‐ features

Shape Context Descriptor

Shape histogram

Page 23: Cbir ‐ features

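Global Feature: Comparing Shape Contexts

Compute matching costs using the chi-squared distance between the normalized histograms $h_i$ and $h_j$ of points $i$ and $j$:

$C_{ij} = \frac{1}{2} \sum_{k} \frac{[h_i(k) - h_j(k)]^2}{h_i(k) + h_j(k)}$

Recover correspondences by solving for the least-cost assignment, using the costs $C_{ij}$.

(Then use a deformable template match, given the correspondences.)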
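A small sketch of the cost computation and the assignment step. The brute-force search over permutations is a stand-in that only works for a handful of points; practical systems use a dedicated assignment solver such as the Hungarian algorithm. `chi2_cost`, `best_assignment`, and the toy histograms are hypothetical.

```python
from itertools import permutations

def chi2_cost(h_i, h_j):
    """Chi-squared distance between two normalized histograms given as flat lists."""
    cost = 0.0
    for a, b in zip(h_i, h_j):
        if a + b > 0:               # skip empty bins to avoid dividing by zero
            cost += (a - b) ** 2 / (a + b)
    return 0.5 * cost

def best_assignment(costs):
    """Least-cost one-to-one assignment by exhaustive search (tiny inputs only)."""
    n = len(costs)
    return min(permutations(range(n)),
               key=lambda perm: sum(costs[i][perm[i]] for i in range(n)))

# Toy descriptors: shape B holds the same histograms as shape A in a different order.
shape_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
shape_b = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

costs = [[chi2_cost(ha, hb) for hb in shape_b] for ha in shape_a]
match = best_assignment(costs)   # match[i] is the index in shape_b assigned to point i
```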


Page 25: Cbir ‐ features

GIST Feature

• Definition and Background
– Essence, holistic characteristics of an image
– Context information obtained within an eye saccade (approx. 150 ms)
– Evidence of place-recognizing cells in the Parahippocampal Place Area (PPA)
– Biologically plausible models of gist are yet to be proposed
• Nature of tasks done with gist
– Scene categorization/context recognition
– Region priming/layout recognition
– Resolution/scale selection

C. Siagian and L. Itti, "Rapid Biologically-Inspired Scene Classification Using Features Shared with Visual Attention," IEEE Transactions on PAMI, Vol. 29, No. 2, pp. 300-312, Feb. 2007.

Page 26: Cbir ‐ features

Human Vision Architecture

• Visual Cortex:
– Low-level filters, center-surround, and normalization
• Saliency Model:
– Attend to pertinent regions
• Gist Model:
– Compute general image characteristics
• High Level Vision:
– Object recognition
– Layout recognition
– Scene understanding

Page 27: Cbir ‐ features

Gist Model Implementation

Raw image feature maps:
• Orientation channel: Gabor filters at 4 angles (0°, 45°, 90°, 135°) on 4 scales = 16 sub-channels
• Color: red-green and blue-yellow center-surround, each with 6 scale combinations = 12 sub-channels
• Intensity: dark-bright center-surround with 6 scale combinations = 6 sub-channels

Total of 34 sub-channels

Page 28: Cbir ‐ features

Global Feature: Gist Model Implementation

• Gist Feature Extraction
– Average values over a predetermined grid (4×4)
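The grid averaging can be sketched as follows. The feature maps here are constant placeholders; a real system would feed the filter responses of each sub-channel. `grid_averages` and `gist_vector` are hypothetical names.

```python
def grid_averages(feature_map, grid=4):
    """Average a 2-D feature map over each cell of a grid x grid partition."""
    rows, cols = len(feature_map), len(feature_map[0])
    averages = []
    for gy in range(grid):
        for gx in range(grid):
            r0, r1 = gy * rows // grid, (gy + 1) * rows // grid
            c0, c1 = gx * cols // grid, (gx + 1) * cols // grid
            cell = [feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            averages.append(sum(cell) / len(cell))
    return averages

def gist_vector(feature_maps, grid=4):
    """Concatenate the grid averages of every sub-channel feature map."""
    vector = []
    for feature_map in feature_maps:
        vector.extend(grid_averages(feature_map, grid))
    return vector

# Placeholder input: 34 constant 8x8 maps, one per sub-channel.
maps = [[[float(k)] * 8 for _ in range(8)] for k in range(34)]
vector = gist_vector(maps)
```

With 34 sub-channels and a 4×4 grid, the raw gist vector has 34 × 16 = 544 entries.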

Page 29: Cbir ‐ features

Global Feature: Gist Model Implementation

• Dimension Reduction
– Original: 34 sub-channels × 16 features = 544 features
– PCA/ICA reduction: 80 features
• Kept >95% of variance

Page 30: Cbir ‐ features

System Example Run

Page 31: Cbir ‐ features

Global Features

• Advantages:
– Simple.
– Low computational complexity.
• Disadvantages:
– Low accuracy

Page 32: Cbir ‐ features

Local Features

• Why Local Feature?
– Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
– Distinctiveness: individual features can be matched to a large database of objects
– Quantity: many features can be generated for even small objects
– Efficiency: close to real-time performance
– Extensibility: can easily be extended to a wide range of differing feature types, with each adding robustness

Page 33: Cbir ‐ features

Local Features

• Main Components:
– Detection of interest points
– Local feature descriptor

[Figure: Image → Interest Points → Local Feature]


Page 35: Cbir ‐ features

Local Feature: Harris Corner Detector

• Corners as distinctive interest points
− We should easily recognize the point by looking through a small window
− Shifting a window in any direction should give a large change in intensity

"Flat" region: no change in all directions

"Edge": no change along the edge direction

"Corner": significant change in all directions

Page 36: Cbir ‐ features

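Local Feature: Harris Corner Detector

Consider shifting the window W by (u, v):
• How do the pixels in W change?
• Compare each pixel before and after by summing up the squared differences:

$E(u, v) = \sum_{(x, y) \in W} [I(x + u, y + v) - I(x, y)]^2$

Taylor series expansion of I, where $I_x$ and $I_y$ are the partial derivatives of the image:

$I(x + u, y + v) = I(x, y) + I_x u + I_y v + \text{higher-order terms}$

If the motion (u, v) is small, then the first-order approximation is good:

$I(x + u, y + v) \approx I(x, y) + I_x u + I_y v$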

Page 37: Cbir ‐ features

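Local Feature: Harris Corner Detector

Substituting the first-order approximation into the sum of squared differences over the window W gives

$E(u, v) \approx \sum_{(x, y) \in W} [I_x u + I_y v]^2$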

Page 38: Cbir ‐ features

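Local Feature: Harris Corner Detector

This can be rewritten as

$E(u, v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}, \quad M = \sum_{(x, y) \in W} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$

where $I_x$ and $I_y$ are the image derivatives inside the window W.

For the example above:
• You can move the center of the blue window to anywhere on the yellow unit circle.
• Which directions will result in the largest and smallest E values?
• We can find these directions by looking at the eigenvectors of M.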

Page 39: Cbir ‐ features

Local Feature: Harris Corner Detector

Eigenvalues and eigenvectors of M

• Define shifts with the smallest and largest change (E value)
• $x_+$ = direction of largest increase in E
• $\lambda_+$ = amount of increase in direction $x_+$
• $x_-$ = direction of smallest increase in E
• $\lambda_-$ = amount of increase in direction $x_-$

$M x_+ = \lambda_+ x_+$

$M x_- = \lambda_- x_-$

Page 40: Cbir ‐ features

Local Feature: Harris Corner Detector

"Flat" region: λ1 and λ2 are small

"Edge": λ1 >> λ2 or λ2 >> λ1

"Corner": λ1 and λ2 are large; λ1 ~ λ2

Page 41: Cbir ‐ features

Feature Detection: Mathematics

Classification of image points using eigenvalues of M (plot axes: λ1, λ2):
• "Corner": λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions
• "Edge": λ1 >> λ2
• "Edge": λ2 >> λ1
• "Flat" region: λ1 and λ2 are small; E is almost constant in all directions

Corner response function: $f = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2}$ or $f = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$

Page 42: Cbir ‐ features

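Harris Corner Detector

• Procedure:
− Compute the M matrix for each image window to get its cornerness score
− Find points whose surrounding window gave a large corner response
− Take the points of local maxima, i.e., perform non-maximum suppression

Advantages: (a) rotation invariance; (b) partial invariance to affine changes in image intensity.

Disadvantages: (a) very sensitive to scale, with no invariance to geometric scale; (b) the extracted corners are at pixel-level precision.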
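The procedure can be sketched on a tiny synthetic image. This is a simplified illustration under several assumptions: central-difference gradients, an unweighted 3x3 window, the response det(M) − k·trace(M)² with an assumed constant k = 0.05, and a hand-picked threshold. `harris_response` and `local_maxima` are hypothetical names.

```python
def harris_response(image, k=0.05, radius=1):
    """Corner response det(M) - k * trace(M)^2 for each pixel with a full window."""
    rows, cols = len(image), len(image[0])
    ix = [[0.0] * cols for _ in range(rows)]
    iy = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):            # central differences, zero on the border
        for c in range(1, cols - 1):
            ix[r][c] = (image[r][c + 1] - image[r][c - 1]) / 2
            iy[r][c] = (image[r + 1][c] - image[r - 1][c]) / 2
    response = [[0.0] * cols for _ in range(rows)]
    for r in range(radius, rows - radius):
        for c in range(radius, cols - radius):
            a = b = d = 0.0                 # M = [[a, b], [b, d]] summed over the window
            for wr in range(r - radius, r + radius + 1):
                for wc in range(c - radius, c + radius + 1):
                    a += ix[wr][wc] ** 2
                    b += ix[wr][wc] * iy[wr][wc]
                    d += iy[wr][wc] ** 2
            response[r][c] = (a * d - b * b) - k * (a + d) ** 2
    return response

def local_maxima(response, threshold):
    """Non-maximum suppression: keep pixels above the threshold that beat all 8 neighbors."""
    corners = []
    for r in range(1, len(response) - 1):
        for c in range(1, len(response[0]) - 1):
            neighbors = [response[r + dr][c + dc]
                         for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
            if response[r][c] > threshold and response[r][c] > max(neighbors):
                corners.append((r, c))
    return corners

# 8x8 image with a bright square in the lower-right quadrant; its corner is at (4, 4).
image = [[1.0 if r >= 4 and c >= 4 else 0.0 for c in range(8)] for r in range(8)]
response = harris_response(image)
corners = local_maxima(response, threshold=0.5)
```

Pixels along the straight sides of the square get a small or negative response, so only the corner survives thresholding and suppression.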

Page 43: Cbir ‐ features

Harris Detector Example

Page 44: Cbir ‐ features

The tops of the horns are detected in both images

Harris Corner (in red)


Page 46: Cbir ‐ features

Laplacian of Gaussian

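The LoG edge detection operator was proposed jointly by David Courtnay Marr and Ellen Hildreth (1980) [1]. It is therefore also called the edge detection algorithm or the Marr & Hildreth operator.

The algorithm first applies Gaussian filtering to the image and then takes the Laplacian (second derivative) of the result; in other words, the image is filtered with the Laplacian of the Gaussian function. Finally, the edges of the image or object are obtained by detecting the zero crossings of the filtered result. For this reason it is commonly abbreviated as the Laplacian-of-Gaussian (LoG) operator.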

Page 47: Cbir ‐ features

Laplacian of Gaussian

Consider the Laplacian of Gaussian operator applied to the signal.

Where is the edge? At the zero-crossings of the bottom graph.

Page 48: Cbir ‐ features

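Laplacian of Gaussian

$\nabla^2$ is the Laplacian operator: $\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$

[Figure panels: Gaussian, derivative of Gaussian, Laplacian of Gaussian]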

Page 49: Cbir ‐ features

Laplacian-of-Gaussian (LoG)

We define the characteristic scale as the scale that produces the peak of the Laplacian response.
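The characteristic scale can be illustrated on a synthetic blob. The sketch evaluates the scale-normalized LoG response (the LoG multiplied by σ²) at the center of a binary disk for a few candidate scales and picks the strongest one. The disk radius, the candidate scales, and the function names are all assumptions made for this example.

```python
import math

def log_value(x, y, sigma):
    """Laplacian of a 2-D Gaussian with standard deviation sigma, evaluated at (x, y)."""
    s2 = sigma * sigma
    r2 = x * x + y * y
    return (r2 - 2 * s2) / (2 * math.pi * s2 ** 3) * math.exp(-r2 / (2 * s2))

def normalized_response(radius, sigma):
    """Scale-normalized LoG response at the center of a binary disk of the given radius."""
    total = 0.0
    for y in range(-radius, radius + 1):
        for x in range(-radius, radius + 1):
            if x * x + y * y <= radius * radius:   # pixel belongs to the disk
                total += log_value(x, y, sigma)
    return sigma ** 2 * total

sigmas = [2, 4, 6, 9, 12]
responses = {s: abs(normalized_response(8, s)) for s in sigmas}
characteristic_scale = max(responses, key=responses.get)
```

For an ideal disk of radius r the continuous response peaks at σ = r/√2, about 5.66 for r = 8, so the nearest candidate wins.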

Page 50: Cbir ‐ features

Laplacian of Gaussian

Page 51: Cbir ‐ features

LoG Blob Detection: Example

Interest points can be defined as the centers of blobs.


Page 53: Cbir ‐ features

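Technical detail

We can approximate the Laplacian with a difference of Gaussians, which is more efficient to implement.

$L = \sigma^2 \left( G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma) \right)$ (Laplacian)

$DoG = G(x, y, k\sigma) - G(x, y, \sigma)$ (Difference of Gaussians)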
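The quality of this approximation can be checked numerically. The sketch compares the difference of Gaussians with (k − 1) times the scale-normalized Laplacian at a few points; the factor (k − 1) follows from treating the difference as a finite-difference derivative with respect to σ. The values of σ and k and the function names are assumptions for illustration.

```python
import math

def gaussian(x, y, sigma):
    """2-D Gaussian G(x, y, sigma)."""
    s2 = sigma * sigma
    return math.exp(-(x * x + y * y) / (2 * s2)) / (2 * math.pi * s2)

def laplacian_of_gaussian(x, y, sigma):
    """G_xx + G_yy evaluated at (x, y)."""
    s2 = sigma * sigma
    r2 = x * x + y * y
    return (r2 - 2 * s2) / (s2 * s2) * gaussian(x, y, sigma)

def dog_to_log_ratio(x, y, sigma, k):
    """Ratio of G(k*sigma) - G(sigma) to (k - 1) * sigma^2 * Laplacian of Gaussian."""
    dog = gaussian(x, y, k * sigma) - gaussian(x, y, sigma)
    scaled_log = (k - 1) * sigma ** 2 * laplacian_of_gaussian(x, y, sigma)
    return dog / scaled_log

ratio_center = dog_to_log_ratio(0, 0, sigma=2.0, k=1.02)
ratio_offset = dog_to_log_ratio(1, 0, sigma=2.0, k=1.02)
```

Both ratios come out close to 1 for k near 1. The agreement degrades for larger k and near the zero crossing of the Laplacian, where both quantities are close to zero.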

Page 54: Cbir ‐ features

DoG Image Pyramid

[Figure: DoG image pyramid with scale labels σ0 and 2σ0]

Page 55: Cbir ‐ features

[Figure: scale-space octaves; the scale σ(o, s) of level s in octave o is obtained from σ0 by a power of 2]

Page 56: Cbir ‐ features

Local Extrema Detection

Maxima and minima: compare x with its 26 neighbors at 3 scales.
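The 26-neighbor test can be written directly. The sketch assumes the DoG images are stored as a nested list indexed as `dog[scale][row][col]` and that the candidate is not on the border of the stack; `is_extremum` is a hypothetical name.

```python
def is_extremum(dog, s, r, c):
    """True if dog[s][r][c] is strictly above or strictly below all 26 neighbors."""
    value = dog[s][r][c]
    neighbors = [dog[s + ds][r + dr][c + dc]
                 for ds in (-1, 0, 1)        # scale below, same scale, scale above
                 for dr in (-1, 0, 1)
                 for dc in (-1, 0, 1)
                 if (ds, dr, dc) != (0, 0, 0)]
    return value > max(neighbors) or value < min(neighbors)

# Three hypothetical 3x3 DoG layers at adjacent scales with a single peak in the middle.
dog = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
dog[1][1][1] = 1.0
peak_found = is_extremum(dog, 1, 1, 1)

# A stronger response at a neighboring scale removes the extremum.
dog[2][0][0] = 2.0
peak_after = is_extremum(dog, 1, 1, 1)
```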

Page 57: Cbir ‐ features

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) ∗ I(x, y) = L(x, y, kσ) − L(x, y, σ).

Page 58: Cbir ‐ features

Popular Visual Features

Global Feature
– Color: color space, color histogram, color moment
– Texture: GLCM
– Shape Context
– GIST
– Color Name

Local Feature
– Detector: Harris, LoG, DoG, MSER, Hessian-Affine, KAZE, FAST
– Descriptor: SIFT, SURF, GLOH, LIOP, BRIEF, ORB, FREAK, BRISK, CARD, Edge-SIFT

Page 59: Cbir ‐ features

Local Feature: SIFT Descriptor

• Making the descriptor rotation invariant
– Rotate the patch according to its dominant gradient orientation
– This puts patches into a canonical orientation.

Page 60: Cbir ‐ features

Scale Invariant Feature Transform (SIFT) descriptor

• Basic idea:
− Take a 16x16 square window around the detected feature
− Compute the edge orientation (angle of the gradient minus 90°) for each pixel
− Throw out weak edges (threshold the gradient magnitude)
− Create a histogram of the surviving edge orientations

[Figure: angle histogram over 0 to 2π]

Page 61: Cbir ‐ features

Orientation

Gradient and angle:

$m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}$

$\theta(x, y) = \tan^{-1}\left( \frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)} \right)$

Orientation selection

Page 62: Cbir ‐ features

Scale Invariant Feature Transform (SIFT) descriptor

• Full version:
− Divide the 16x16 window into a 4x4 grid of cells (2x2 case shown in the figure)
− Compute an orientation histogram for each cell
− 16 cells × 8 orientations = 128-dimensional descriptor
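The layout of the 128-dimensional descriptor can be sketched as follows. This is a simplified illustration rather than the full method: it omits the Gaussian weighting, the interpolation between bins, the rotation to the dominant orientation, and the clipping step. The function name and the 18x18 input (a 16x16 window plus a one-pixel border for the differences) are assumptions.

```python
import math

def sift_like_descriptor(patch, cells=4, bins=8):
    """Gradient-orientation histograms on a cells x cells grid, L2-normalized."""
    size = len(patch) - 2                   # 16 for an 18x18 patch
    cell_size = size // cells
    descriptor = [0.0] * (cells * cells * bins)
    for r in range(1, size + 1):
        for c in range(1, size + 1):
            dx = patch[r][c + 1] - patch[r][c - 1]
            dy = patch[r + 1][c] - patch[r - 1][c]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0:
                continue
            theta = math.atan2(dy, dx) % (2 * math.pi)
            orientation = int(theta / (2 * math.pi) * bins) % bins
            cell = ((r - 1) // cell_size) * cells + (c - 1) // cell_size
            descriptor[cell * bins + orientation] += magnitude  # magnitude-weighted vote
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm > 0 else descriptor

# Hypothetical patch: a horizontal intensity ramp, so every gradient points along +x.
patch = [[float(c) for c in range(18)] for _ in range(18)]
descriptor = sift_like_descriptor(patch)
```

For the ramp, all the energy of each cell lands in its first orientation bin, and the 16 cells contribute equally.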

Page 63: Cbir ‐ features

Scale Invariant Feature Transform (SIFT) descriptor

• Invariant to
– Scale
– Rotation
• Partially invariant to
– Illumination changes
– Camera viewpoint
– Occlusion, clutter


Page 65: Cbir ‐ features

SURF: Speeded Up Robust Features (ECCV 2006, CVIU 2008)

• Using integral images for a major speed-up
– An integral image (summed-area table) is an intermediate representation of the image; each entry holds the sum of the grayscale pixel values above and to the left of that position.
– Integral images allow fast computation of box-type convolution filters.
• The SURF corner detection algorithm is an improvement on SIFT, mainly in speed: it is more efficient. Its main difference from SIFT is the way the multi-scale image space is constructed.
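The integral image and the constant-time box sum it enables can be sketched as below. The table is padded with an extra zero row and column, a common convenience that avoids special cases at the border. The function names and the 3x3 test image are hypothetical.

```python
def integral_image(image):
    """Summed-area table: ii[r][c] is the sum of all pixels above and left of (r, c)."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = image[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of a height x width box with upper-left pixel (top, left), using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ii = integral_image(image)
whole = box_sum(ii, 0, 0, 3, 3)
lower_right = box_sum(ii, 1, 1, 2, 2)
```

The cost of `box_sum` does not depend on the box size, which is why box filters of any scale cost the same to evaluate.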

Page 66: Cbir ‐ features

SURF

A comparison of SIFT, PCA-SIFT and SURF

Method   | Time   | Scale  | Rotation | Blur   | Illumination | Affine
SIFT     | common | best   | best     | common | common       | good
PCA-SIFT | good   | good   | good     | best   | good         | best
SURF     | best   | common | common   | good   | best         | good

Page 67: Cbir ‐ features

SURF: constructing the Gaussian pyramid scale space

• Hessian-based interest point localization
• $L_{xx}(x, y, \sigma)$ is the convolution of the Gaussian second-order derivative in x with the image

Page 68: Cbir ‐ features

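Local Feature: SURF

• Second-order derivatives are approximated with box filters (mean/average filters)
• The box-filter templates are convolved with the image to obtain the partial derivatives, yielding the Hessian determinant map, which is analogous to the DoG map in SIFT.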

Page 69: Cbir ‐ features

Local Feature: Detection

• Scale analysis with constant image size
– 1st octave: 9 × 9, 15 × 15, 21 × 21, 27 × 27
– 2nd octave: 39 × 39, 51 × 51, …

Page 70: Cbir ‐ features

Local Feature: Description

• Orientation assignment
– Circular neighborhood of radius 6s around the interest point (s = the scale at which the point was detected)
– Haar wavelets with side length 4s give the x response and the y response; each response costs 6 operations to compute
– Unlike SIFT, SURF sums the horizontal and vertical Haar wavelet responses of all points within a 60-degree sector.


Page 72: Cbir ‐ features

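Local Feature: GLOH, Gradient Location-Orientation Histogram (Mikolajczyk and Schmid 2005)

17 location bins × 16 orientation bins → 272-D histogram → 128-D by PCA

[Figure: spatial binning of SIFT vs. GLOH]

GLOH uses a log-polar binning structure in place of the 4 quadrants used by SIFT. Spatially, the radii are 6, 11, and 15, and the angle is divided into eight sectors (except in the central region). The resulting 272-dimensional (17 × 16) histogram is trained on a large database and projected with PCA to a 128-dimensional vector.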
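The location binning can be sketched to confirm the bin count. The radii 6, 11, 15 and the eight sectors come from the description above; the bin numbering, the sector boundaries starting at 0°, and the function name are my own assumptions.

```python
import math

def gloh_location_bin(dx, dy, radii=(6, 11, 15), sectors=8):
    """Log-polar location bin for an offset (dx, dy) from the keypoint, or None if outside."""
    r = math.hypot(dx, dy)
    if r > radii[-1]:
        return None                      # outside the descriptor region
    if r <= radii[0]:
        return 0                         # the central bin is not split by angle
    ring = 1 if r <= radii[1] else 2
    theta = math.atan2(dy, dx) % (2 * math.pi)
    sector = int(theta / (2 * math.pi) * sectors) % sectors
    return 1 + (ring - 1) * sectors + sector

bins = {gloh_location_bin(dx, dy) for dx in range(-15, 16) for dy in range(-15, 16)}
bins.discard(None)
location_bins = len(bins)
histogram_size = location_bins * 16      # 16 orientation bins per location bin
```

One central bin plus two rings of eight sectors gives 17 location bins, and 17 × 16 = 272 histogram entries before PCA.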


Page 74: Cbir ‐ features

LIOP

Zhenhua Wang, Bin Fan, and Fuchao Wu, "Local intensity order pattern for feature description," ICCV, 2011.

Motivation: orientation estimation error in SIFT

Page 75: Cbir ‐ features

LIOP: Local intensity order pattern for feature description


Page 77: Cbir ‐ features

Edge-SIFT: discriminative binary descriptor for scalable partial-duplicate mobile search (IEEE TIP, 2014)

Histogram-based descriptors:
• Good for classification tasks
• Expensive, not optimal for partial-duplicate search

Motivation: the edge map
• preserves structural and spatial cues
• is sparse and fast to compute
• has potential for local descriptor extraction

Page 78: Cbir ‐ features

Extraction of Edge-SIFT

[Figure: Edge-SIFT extraction pipeline. Keypoint detection; image patch extraction and normalization (scale, orientation); edge extraction and edge descriptor computation on 0-degree, 45-degree, 90-degree and 135-degree edge maps; 8 × 8 × 4 = 256-bit descriptor.]

Page 79: Cbir ‐ features

The Matching Performance

The matching result of SIFT

The matching result of Edge-SIFT

Page 80: Cbir ‐ features

Thank you!