Zhenhua Guo, Graduate School at Shenzhen, Tsinghua University
Robust texture image representation by scale selective local binary patterns (TIP 2016)
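Graduate School at Shenzhen, Tsinghua University
The Graduate School at Shenzhen, Tsinghua University was founded in 2001. It is the only off-site campus of Tsinghua University. Same university, same brand, upholding the same cultural tradition.
It has more than 3,000 full-time graduate students, including more than 380 doctoral students.
It has more than 150 full-time faculty members, more than 80 postdoctoral researchers, more than 280 dual-base faculty members, and more than 40 adjunct faculty members.
http://www.sz.tsinghua.edu.cn
It comprises seven divisions: Life and Health, Energy and Environment, Information Science and Technology, Logistics and Transportation, Advanced Manufacturing, Ocean Science and Technology, and Social Sciences and Management.
It focuses on developing disciplines including information, advanced manufacturing, network and media technology, environment, materials, new energy, logistics, and ocean science.
The campus has a floor area of 100,000 square meters, and a further 100,000 square meters is planned for the innovation base.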
Outline
Texture definition and challenge
Texton (Statistical vs. Binary)
Overview of LBP and CLBP
Proposed SSLBP
Experimental Results and Discussion
Texture Definition
Definition
Wiki Dictionary:
The feel or shape of a surface or substance;
the smoothness, roughness, softness, etc. of
something.
In fact, the definition of texture is still an open
issue.
Texture is everywhere
Grass (texture)
Woods (texture)
Buildings (texture)
Sky (texture)
Wide Application
Desert vs Mountain
Normal vs Abnormal
Defect Detection
Common issues
Lightness, rotation and scale
Structural Approach
Structural approach: texture is a set of texels arranged in some regular or repeated pattern
Limitation of the Structural Approach
How do you decide what is a texel?
Examples: grass, leaves
What/Where are the texels?
Statistical Texton
Binary Texton
Two advantages:
Fast
Insensitive to training set
LBP in spatial domain
T. Ojala, M. Pietikäinen, and T. T. Mäenpää. Multiresolution gray-scale and rotation invariant texture classification with Local Binary Patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(7), pp. 971-987, 2002.
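As a rough illustration of the basic LBP operator cited above, the sketch below thresholds the 8 neighbours of a 3x3 block against the centre pixel and packs the results into one code, then histograms the codes over an image. The function names, the clockwise neighbour order, and the bit assignment are assumptions made for this example and are not taken from the slides.

```python
def lbp_code(block):
    """Return the 8-bit LBP code of the centre pixel of a 3x3 block."""
    center = block[1][1]
    # Neighbours visited clockwise from the top-left corner
    # (the starting point and direction are an assumed convention).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if block[r][c] >= center:  # threshold the neighbour against the centre
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            block = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(block)] += 1
    total = sum(hist)
    return [h / total for h in hist]
```

Because only comparisons with the centre pixel are used, adding a constant to every grey level leaves the code unchanged, which is one reason the operator tolerates lighting changes.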
LBP and contrast operators
Circle-LBP
Multiscale LBP
An example of LBP image and histogram
Rotation Invariance (ri)
Rotation Invariance
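One common way to obtain the rotation-invariant ("ri") mapping is to treat the P-bit pattern as circular and keep the smallest value among all of its bit rotations. The sketch below follows that idea; the function names are hypothetical, and the slides do not show the exact mapping used.

```python
def rotate_right(code, bits):
    """Circularly rotate a `bits`-wide pattern one position to the right."""
    return (code >> 1) | ((code & 1) << (bits - 1))


def rotation_invariant(code, bits=8):
    """Map a pattern to the smallest value among its circular rotations."""
    best = code
    for _ in range(bits - 1):
        code = rotate_right(code, bits)
        best = min(best, code)
    return best
```

Rotating the image by a multiple of the angular sampling step only shifts the bits circularly, so every rotated version of a pattern maps to the same representative. For 8 neighbours this collapses the 256 raw codes into 36 rotation-invariant classes.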
Completed LBP
[Figure: central pixel and its P circularly and evenly spaced neighbours with radius R.]
[Figure: (a) a 3×3 sample block; (b) the local differences; (c) the sign and (d) magnitude components.]
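The sign/magnitude decomposition in the figure can be sketched as follows. The neighbour order and the function names are assumptions for illustration. The threshold passed to `clbp_m` is left as a parameter; in the cited CLBP paper it is, as far as I understand it, the mean magnitude over the whole image.

```python
# Clockwise neighbour order from the top-left corner (an assumed convention).
OFFSETS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def sign_magnitude(block):
    """Split the local differences of a 3x3 block into signs and magnitudes."""
    center = block[1][1]
    diffs = [block[r][c] - center for r, c in OFFSETS]
    signs = [1 if d >= 0 else -1 for d in diffs]
    magnitudes = [abs(d) for d in diffs]
    return diffs, signs, magnitudes


def clbp_s(signs):
    """Binary code built from the sign component (equivalent to plain LBP)."""
    return sum(1 << p for p, s in enumerate(signs) if s == 1)


def clbp_m(magnitudes, threshold):
    """Binary code built from the magnitude component."""
    return sum(1 << p for p, m in enumerate(magnitudes) if m >= threshold)
```

For a hypothetical block `[[9, 12, 34], [10, 25, 28], [99, 64, 56]]`, the differences are `[-16, -13, 9, 3, 31, 39, 74, -15]`, and each difference equals its sign times its magnitude, so no information is lost by the split.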
Completed LBP
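[Figure: CLBP framework. Original image -> center gray level -> CLBP_C. Original image -> local difference -> LDSMT -> sign component S -> CLBP_S, and magnitude component M -> CLBP_M. CLBP_C, CLBP_S and CLBP_M -> CLBP map -> CLBP histogram -> classifier.]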
Representation Example
[Figure panels: histogram of CLBP_S of a sample; histogram of CLBP_M of the sample; histogram of CLBP_S_M; histogram of CLBP_S/M.]
Zhenhua Guo, Lei Zhang, David Zhang. A Completed Modeling of Local Binary Pattern Operator for Texture Classification. IEEE Transactions on Image Processing, vol. 19, no. 6, pp. 1657-1663, 2010.
Scale Variation
Scale invariance is more difficult.
Two popular ways:
Local scale invariance
Global scale invariance
Scale Invariance (I)
Local scale invariance
Detect a Harris or Laplacian region -> normalize the region -> feature extraction
Scale Invariance (I)
Local scale invariance
Estimating local scale or extracting local fractal features
Scale Invariance (II)
Global scale invariance
Global fractal feature
Scale Invariance (II)
Global scale invariance
Polar transform
Scale Invariance (II)
Global scale invariance
Scale shift matching
Assumption
Statistically dominant local patterns provide discriminant information for texture classification.
When an image changes scale, the percentage of dominant patterns does not change.
S. Liao, M. W. K. Law, and A. C. S. Chung. Dominant local binary patterns for texture classification. IEEE
Transactions on Image Processing 18(5), pp. 1107-1118, 2009.
Algorithm 1:
Feature Learning
Step 1: for each training sample, build a scale space with a 2D Gaussian filter;
Step 2: compute the local pattern histogram for each image in the scale space;
Step 3: for each pattern, keep only the maximal frequency across scales;
Step 4: compute the average frequency over the whole training set;
Step 5: learn the dominant patterns, i.e. the patterns with high average frequency.
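\[ s_l^{f_i} = \begin{cases} f_i, & l = 1 \\ s_{l-1}^{f_i} * g, & 1 < l \le L \end{cases} \qquad \text{("*" is the convolution operator)} \]
\[ H^{f_i}_{CLBP\_S/C}(k) = \max\left( H^{s_1}_{CLBP\_S/C}(k),\ H^{s_2}_{CLBP\_S/C}(k),\ \ldots,\ H^{s_L}_{CLBP\_S/C}(k) \right) \]
\[ H^{T}_{CLBP\_S/C}(k) = H^{T}_{CLBP\_S/C}(k) + \frac{H^{f_i}_{CLBP\_S/C}(k)}{N} \]
(The last line is an accumulation over the training samples f_i; N is taken here to be the number of training samples.)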
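Steps 3 to 5 of the feature learning procedure can be sketched as below. The sketch assumes the per-scale pattern histograms (Steps 1 and 2) have already been computed by some other routine. The function names, the input layout, and the use of a plain top-K cut for "high frequency" are assumptions for this example.

```python
def max_over_scales(scale_histograms):
    """Step 3: for each pattern, keep the largest frequency across scales."""
    return [max(column) for column in zip(*scale_histograms)]


def learn_dominant_patterns(training_set, num_dominant):
    """Return the indices of the `num_dominant` most frequent patterns.

    `training_set` holds one entry per training sample; each entry is a
    list of L per-scale histograms, all of the same length.
    """
    num_samples = len(training_set)
    num_patterns = len(training_set[0][0])
    average = [0.0] * num_patterns
    for scale_histograms in training_set:
        sample_hist = max_over_scales(scale_histograms)
        for k, value in enumerate(sample_hist):
            average[k] += value / num_samples  # Step 4: accumulate the mean
    # Step 5: rank patterns by average frequency and keep the top K.
    ranked = sorted(range(num_patterns), key=lambda k: average[k], reverse=True)
    return sorted(ranked[:num_dominant])
```

The selected indices are returned in ascending order so that features extracted later always list the dominant patterns in the same order.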
Algorithm 2:
Feature Extraction
Step 1: for each test sample, build a scale space with a 2D Gaussian filter;
Step 2: at each scale, compute the histogram of the dominant patterns selected by Algorithm 1;
Step 3: for each pattern, keep only the maximal frequency across scales.
\[ s_l^{I} = \begin{cases} I, & l = 1 \\ s_{l-1}^{I} * g, & 1 < l \le L \end{cases} \qquad \text{("*" is the convolution operator)} \]
\[ DPH^{I}_{CLBP\_S/C}(k) = \max\left( DPH^{s_1}_{CLBP\_S/C}(k),\ DPH^{s_2}_{CLBP\_S/C}(k),\ \ldots,\ DPH^{s_L}_{CLBP\_S/C}(k) \right) \]
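The extraction steps can be sketched end to end as follows. This is a minimal pure-Python sketch: `histogram_fn` is a hypothetical callback that returns the pattern histogram of one image, the kernel radius is an assumed value, and the default sigma of 2^0.25 is only my reading of the parameter slide.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve every row with the kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    smoothed = []
    for row in image:
        last = len(row) - 1
        smoothed.append([
            sum(kernel[j + radius] * row[min(max(i + j, 0), last)]
                for j in range(-radius, radius + 1))
            for i in range(len(row))
        ])
    return smoothed


def gaussian_smooth(image, kernel):
    """Separable 2D Gaussian: smooth rows, transpose, smooth, transpose back."""
    half = smooth_rows(image, kernel)
    transposed = [list(col) for col in zip(*half)]
    return [list(col) for col in zip(*smooth_rows(transposed, kernel))]


def build_scale_space(image, num_scales, sigma, radius):
    """Step 1: s_1 is the image; each later scale smooths the previous one."""
    kernel = gaussian_kernel(sigma, radius)
    scales = [image]
    for _ in range(num_scales - 1):
        scales.append(gaussian_smooth(scales[-1], kernel))
    return scales


def sslbp_feature(image, dominant, histogram_fn, num_scales=4,
                  sigma=2 ** 0.25, radius=2):
    """Steps 2-3: dominant-pattern frequencies per scale, then the max.

    The defaults for sigma and radius are assumptions, not confirmed values.
    """
    scales = build_scale_space(image, num_scales, sigma, radius)
    histograms = [histogram_fn(s) for s in scales]
    return [max(h[k] for h in histograms) for k in dominant]
```

Running this once with the CLBP_S/C histogram and once with the CLBP_M/C histogram, then concatenating the two outputs, would give a feature of size 2K for one radius.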
Scale selective LBP (SSLBP)
[Figure: SSLBP pipeline. The original image I (scale s_1) is repeatedly convolved with the 2D Gaussian function g to form the image scale space s_1, s_2, ..., s_L. At each scale, the frequencies of the K dominant patterns learned from the training set T are computed for CLBP_S/C and for CLBP_M/C (rotation-invariant, with P neighbours at radius R), forming the feature scale space. A max operation across scales, applied to each pattern, gives the output feature for image I, of size 2K.]
An example
[Figure: four histogram panels, each plotting Frequency (%) against Pattern Index (0 to 50); the values 0.35 and 0.17 are annotated in the figure.]
5 Dominant Patterns
Scale Estimation
[Figure: Average Feature Scale (y-axis, about 1.7 to 2.1) plotted against Image Scale (x-axis, 1 to 9).]
Scale parameter of KTH-TIPS.
\[ FS_I = \frac{1}{K} \sum_{k=1}^{K} ScaleIndex_I(k) \]
\[ AFS_z = \frac{\sum_{I \in S_z} FS_I}{90}, \qquad z = 1, 2, \ldots, 9 \]
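A possible reading of the feature scale, sketched below, is that ScaleIndex is the 1-based index of the scale at which a pattern's frequency reaches its maximum, that FS averages this index over the patterns of one image, and that AFS averages FS over the images captured at one scale. This interpretation and the function names are assumptions, since the slide gives only the formulas.

```python
def feature_scale(scale_histograms):
    """Mean, over patterns, of the 1-based scale index where each peaks."""
    indices = []
    for column in zip(*scale_histograms):
        # Index of the first scale that reaches the maximal frequency.
        indices.append(column.index(max(column)) + 1)
    return sum(indices) / len(indices)


def average_feature_scale(feature_scales):
    """Mean feature scale over all images captured at one image scale."""
    return sum(feature_scales) / len(feature_scales)
```

Under this reading, a feature scale between 1 and L is consistent with the plotted range of roughly 1.7 to 2.1 for L = 4.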
Texture Databases
Texture database | Imaging property | Image size | Number of classes | Number of samples per class
CUReT | The images are captured under different illumination and viewing directions. | Fixed, 200*200 | 61 | 92
KTH-TIPS | It extends CUReT by imaging new samples of ten of the CUReT textures at a subset of the viewing and lighting angles used in CUReT but also over a range of scales. | Varied, 196*201 | 10 | 81
UIUC | Textures are acquired under significant scale and viewpoint changes, arbitrary rotations, and uncontrolled illumination conditions, even including textures with non-rigid deformation. | Fixed, 640*480 | 25 | 40
UMD | It has been designed in a similar way to UIUC, while the image resolution is 4 times that of UIUC. | Fixed, 1280*960 | 25 | 40
ALOT | It is systematically collected with varied viewing angles, illumination angles, and illumination colors for each material. | Varied, 1536*891 | 250 | 100
CUReT Database
KTH-TIPS Database
UIUC Database
UMD Database
ALOT Database
Parameters
Scale space size: L = 4
2D Gaussian filter: standard deviation 2^0.25
Radius: 3, 9
Neighbors: 24
Feature extractors: CLBP_S/C, CLBP_M/C
Feature length: 2400
Classifier: nearest neighbor classifier (NNC) with chi-square distance
Feature preprocessing: \( H_k = \sqrt{H_k},\ k = 1, 2, \ldots, K \)
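The preprocessing and matching steps listed above can be sketched as follows. The chi-square form used here, the sum of (x - y)^2 / (x + y) over bins, is the variant commonly used with LBP histograms and is an assumption, as are the function names.

```python
import math


def sqrt_preprocess(hist):
    """Feature preprocessing: replace every bin by its square root."""
    return [math.sqrt(h) for h in hist]


def chi_square(a, b):
    """Chi-square distance between two histograms (empty bins are skipped)."""
    total = 0.0
    for x, y in zip(a, b):
        if x + y > 0:
            total += (x - y) ** 2 / (x + y)
    return total


def nearest_neighbour(test_hist, train_hists, train_labels):
    """NNC: return the label of the closest training histogram."""
    distances = [chi_square(test_hist, h) for h in train_hists]
    return train_labels[distances.index(min(distances))]
```

The square root compresses large bin values, so a few very frequent patterns dominate the distance less.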
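Nearest Subspace Classifier (NSC)
There are C classes of textures.
With n training samples in each class, the set of histograms for class c is:
\[ H_c = [h_{c,1}, h_{c,2}, \ldots, h_{c,n}] \]
Project the test histogram h_y onto the subspace spanned by H_c:
\[ \alpha_c = (H_c^T H_c)^{-1} H_c^T h_y \]
The projection residual is computed as:
\[ err_c = \left\| h_y - H_c \alpha_c \right\|_2 \]
The test sample is assigned to the class with the smallest residual.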
K. Lee, J. Ho, and D. Kriegman. Acquiring linear subspaces for face recognition under variable lighting. IEEE
Transactions on Pattern Analysis and Machine Intelligence 27(5), pp. 684–698, 2005.
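The NSC decision rule can be sketched in plain Python by solving the normal equations directly. The helper names and the dictionary layout of the training data are assumptions for this example, and the sketch further assumes that the training histograms of each class are linearly independent so that the system is solvable.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def nsc_residual(class_hists, test_hist):
    """Distance from test_hist to the span of one class's histograms."""
    n = len(class_hists)
    gram = [[dot(class_hists[i], class_hists[j]) for j in range(n)]
            for i in range(n)]                       # H_c^T H_c
    rhs = [dot(h, test_hist) for h in class_hists]   # H_c^T h_y
    alpha = solve_linear(gram, rhs)
    projection = [sum(alpha[i] * class_hists[i][k] for i in range(n))
                  for k in range(len(test_hist))]
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(test_hist, projection)))


def nsc_classify(training, test_hist):
    """Return the label whose subspace leaves the smallest residual."""
    residuals = {label: nsc_residual(hists, test_hist)
                 for label, hists in training.items()}
    return min(residuals, key=residuals.get)
```

For real feature lengths a least-squares routine from a numerical library would be the more robust choice; the explicit normal equations are used here only because they mirror the formula on the slide.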
Experimental Results
Method CUReT (46) KTH-TIPS (40) UIUC (20) UMD (20) ALOT (50)
SRP [ICCV2011] (SVM) 99.37 99.29 98.56 99.30 -
RP [TPAMI2012] (NNC) 98.52 97.71 96.27 99.13 -
Caputo et al. [IVC2010] (SVM) 98.46 94.8 92.0 - -
BIF [IJCV2010] (Shift Matching NNC) 98.6 98.5 98.8 - -
OTF [CVIU2010] (SVM) - - 97.44 98.42 95.6
WMFS [TIP2013] (SVM) - - 97.62 98.68 96.94
PLS [CVPR2014](SVM) - 98.4 96.57 98.99 93.35
PFS [IVC2014](SVM) - 97.35 97.92 99.38 97.5
LEP [TIP2013] (Shift Matching NNC) - 97.56 - - -
scLBP [TIP2015] (SVM) 99.29 - 98.45 99.25 -
COV-LBPD [TIP2014] (NNC) - 98.0 - - -
RPICoLBP [TPAMI2014] (SVM) 98.4 98.4 - - -
RLBP [BMVC2013] (NNC) - - 96.7 - -
DLBP [TIP2009] (NNC) 84.93 86.99 60.73 89.87 78.38
LBPSRI [TIP2012] (NNC) 85.00 89.73 70.05 91.71 71.29
LBP [TPAMI2002] (NNC) 80.63 82.67 55.26 88.23 63.33
CLBP [TIP2010] (NNC) 97.40 97.19 93.26 98.00 93.28
SSLBP (NNC) 98.55 97.80 97.02 98.84 96.69
SSLBP (NSC) 99.51 99.39 99.36 99.46 99.71
Time Cost (I)
Method | Scale Space Building | Pattern/Patch Processing | Histogramming | Classification (NNC)
RP [TPAMI2012] | - | SP·SR·Ip | SR·C·ST·Ip | C·ST·(C·Tn)
VZ_Patch [TPAMI2009] | - | - | SP·C·ST·Ip | C·ST·(C·Tn)
SSLBP | (L-1)·Sg·Ip | 2·L·P·Ip | 2·L·Ip | 4·K·(C·Tn)
Here L=4 is the size of scale space, P=24 is the number of neighbours for LBP, Ip denotes the number of pixels
per image sample, Sg denotes the size of Gaussian smooth kernel, K=600 is the number of selected dominant
patterns. SP represents the size of a local patch, usually 7*7, SR is the dimension of random projection, usually
15, C denotes the number of classes, ST represents the number of clustered textons per class, here C·ST≈4·K. Tn
is the number of training samples per class.
Time Cost (II)
Task (unit) | CUReT | KTH-TIPS | UIUC | UMD | ALOT
Feature extraction of MFS (ICCV2009) (second) | 0.09 | 0.08 | 0.62 | 2.60 | 2.67
Feature extraction of VZ_MR8 (ICCV2005) (second) | 1.03 | 0.93 | 9.98 | 37.35 | 40.11
Feature extraction of VZ_Patch (TPAMI2009) (second) | 12.44 | 11.28 | 96.97 | 309.51 | 346.64
Feature extraction of the proposed scheme (second) | 0.24 | 0.23 | 1.80 | 7.63 | 8.46
Matching (NNC) (millisecond) | 177.47 | 25.09 | 30.78 | 31.1 | 297.42
Matching (NSC) (millisecond) | 2.25 | 0.64 | 0.90 | 0.92 | 5.05
Robust to image size
Test on UMD
Image size \ number of training samples | 20 | 15 | 10 | 5
1280*960 | 99.46±0.46 | 99.31±0.51 | 98.81±0.70 | 96.41±1.28
640*480 | 99.71±0.26 | 99.48±0.37 | 98.77±0.68 | 95.68±1.45
320*240 | 99.38±0.45 | 98.80±0.69 | 97.41±1.06 | 92.74±1.84
Dominant patterns analysis
Training set | CUReT | KTH-TIPS | UMD | UIUC
CUReT | 100% | 80.33% | 73% | 74.54%
KTH-TIPS | - | 100% | 76.12% | 80.16%
UMD | - | - | 100% | 87.87%
UIUC | - | - | - | 100%
Percentage of identical dominant patterns between different training sets.
Robust to pattern selection
Discussion
Traditional methods try to extract local or global scale-invariant features.
From an implementation view, the proposed method extracts local scale-variant features first, then applies a global transformation to achieve invariance.
From a scale-space view, instead of analyzing the scale space locally, it analyzes the scale space globally.
Conclusion
A simple and effective method to address the scale variation issue for texture images.
Fast enough for many applications: 0.24 second for a 200*200 image.
LBP with scale selection achieves promising results on challenging databases, such as UIUC and ALOT.
Thank you! For any inquiries, please contact [email protected]