Online Robust Dictionary Learning
Cewu Lu, Jianping Shi, and Jiaya Jia
The Chinese University of Hong Kong
Dictionary Learning
Denoising [Mairal et al. 2008]
Upsampling [Couzinie-Devy 2010]
Image Classification [Wang et al. 2010]
Background Subtraction [Cong et al. 2010]
Dictionary Learning
Let $D = \{d_1, \dots, d_q\}$ be a set of “basis vectors”.
Let $X = \{x_1, \dots, x_n\}$ be a set of signals.
Dictionary Learning
$D$ is “adapted” to $X$ if it can represent $X$ with a few basis vectors:
$x_1 \approx D\beta_1,\ x_2 \approx D\beta_2,\ x_3 \approx D\beta_3,\ \dots,\ x_n \approx D\beta_n$, where each $\beta_i$ is sparse.
Dictionary Learning
$\min_{D,\,\beta_1,\dots,\beta_n} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\beta_i\|_2^2 + \lambda \|\beta_i\|_1$
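To make the objective concrete, here is a minimal sketch that evaluates it for given signals, dictionary, and codes. The function name `dictionary_objective`, the column-wise data layout, and the tiny example values are assumptions made for illustration; they are not part of the talk.

```python
import numpy as np

def dictionary_objective(X, D, B, lam):
    """Average squared L2 fitting error plus L1 sparsity penalty.

    X: (p, n) signals stored as columns (assumed layout)
    D: (p, q) dictionary with q basis vectors as columns
    B: (q, n) codes, one column beta_i per signal x_i
    lam: weight of the sparsity penalty
    """
    residual = X - D @ B                         # x_i - D beta_i, per column
    fit = np.sum(residual ** 2, axis=0)          # ||x_i - D beta_i||_2^2
    penalty = lam * np.sum(np.abs(B), axis=0)    # lam * ||beta_i||_1
    return float(np.mean(fit + penalty))

# Made-up example: two basis vectors, three signals that are fitted exactly.
D = np.array([[1.0, 0.0], [0.0, 1.0]])
B = np.array([[2.0, 0.0, 0.0], [0.0, 3.0, -1.0]])
X = D @ B
value = dictionary_objective(X, D, B, lam=0.1)
```

Because the signals are reproduced exactly, only the sparsity penalty contributes to `value` in this example.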
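Robust Dictionary Learning
$\min_{D,\,\beta_1,\dots,\beta_n} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\beta_i\|_2^2 + \lambda \|\beta_i\|_1$
A toy example: X = {2, 5, 6, 9, 10, 12, 14, 15, 18}
$\min_{\mu} \sum_i (x_i - \mu)^2$, with solution $\mu = \frac{1}{9} \sum_i x_i = 10.1$
L2 norm data fitting is not a robust measure.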
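Robust Dictionary Learning
$\min_{D,\,\beta_1,\dots,\beta_n} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\beta_i\|_2^2 + \lambda \|\beta_i\|_1$
A toy example: X = {2, 5, 6, 9, 10, 12, 14, 15, 80000}
$\min_{\mu} \sum_i (x_i - \mu)^2$, with solution $\mu = \frac{1}{9} \sum_i x_i = 8897$
L2 norm data fitting is not a robust measure.
[Figure: inliers and outliers.]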
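The two toy examples can be checked with a few lines of arithmetic. This sketch only uses the fact that the minimizer of a sum of squared deviations is the arithmetic mean; the variable names are illustrative.

```python
import numpy as np

inliers = np.array([2, 5, 6, 9, 10, 12, 14, 15, 18], dtype=float)
with_outlier = np.array([2, 5, 6, 9, 10, 12, 14, 15, 80000], dtype=float)

# The minimizer of sum_i (x_i - mu)^2 is the arithmetic mean.
mu_clean = inliers.mean()
mu_outlier = with_outlier.mean()

print(f"{mu_clean:.1f} {mu_outlier:.0f}")  # prints "10.1 8897"
```

A single outlier moves the L2 estimate from about 10 to about 8897, far from every inlier.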
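Robust Dictionary Learning
$\min_{D,\,\beta_1,\dots,\beta_n} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\beta_i\|_1 + \lambda \|\beta_i\|_1$
L1 norm is a robust measure.
[Wagner et al. 2009], [Wang et al. 2012], [Zhao et al. 2011]
A toy example: X = {2, 5, 6, 9, 10, 12, 14, 15, 80000}
$\min_{\mu} \sum_i |x_i - \mu| = \min_{\mu} \sum_i \frac{1}{|x_i - \mu|} (x_i - \mu)^2$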
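Robust Dictionary Learning
$\min_{D,\,\beta_1,\dots,\beta_n} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\beta_i\|_1 + \lambda \|\beta_i\|_1$
A toy example: X = {2, 5, 6, 9, 10, 12, 14, 15, 80000}
$\min_{\mu} \sum_i |x_i - \mu|$ is solved as $\min_{\mu} \sum_i \omega_i (x_i - \mu)^2$ with $\omega_i = \frac{1}{|x_i - \mu|}$
$\mu = \frac{\sum_i \omega_i x_i}{\sum_i \omega_i} = 10$, and the outlier receives the weight $\omega_9 = \frac{1}{|x_9 - \mu|} \approx 1.26 \times 10^{-5}$
[Wagner et al. 2009], [Wang et al. 2012], [Zhao et al. 2011]
L1 norm is a robust measure.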
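The reweighting idea in the toy example can be sketched as a short iteration. This is an illustrative implementation under stated assumptions: the function name `irls_l1_location`, the iteration count, and the small smoothing constant `delta` (which keeps the weights finite when a residual is zero) are choices made here, not values from the talk.

```python
import numpy as np

def irls_l1_location(x, n_iter=200, delta=1e-8):
    """Approximate argmin_mu sum_i |x_i - mu| by reweighted least squares."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()  # start from the (non-robust) L2 solution
    for _ in range(n_iter):
        # Weight each point by the inverse of its current absolute residual.
        w = 1.0 / np.sqrt((x - mu) ** 2 + delta)
        # Weighted least squares in one dimension is a weighted mean.
        mu = np.sum(w * x) / np.sum(w)
    w = 1.0 / np.sqrt((x - mu) ** 2 + delta)
    return mu, w

x = [2, 5, 6, 9, 10, 12, 14, 15, 80000]
mu, w = irls_l1_location(x)
print(f"mu = {mu:.3f}, outlier weight = {w[-1]:.2e}")
```

The estimate settles near the median of the data, and the outlier ends up with a weight several orders of magnitude smaller than the inliers.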
Robust Dictionary Learning
$\min_{D,\,\beta_1,\dots,\beta_n} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\beta_i\|_1 + \lambda \|\beta_i\|_1$
[Figure: Non-Robust Dictionary Learning. Labels: inliers, outliers, incorrect dictionary.]
[Figure: Robust Dictionary Learning. Labels: inliers, outliers, correct dictionary.]
Robust Dictionary Learning
[Wagner et al. 2009], [Wang et al. 2012], [Zhao et al. 2011]
[Figure: Non-Robust Dictionary Learning. Labels: inliers, outliers, dictionary.]
But it is not widely used. Why?
Online Dictionary Learning
Because L1 norm data fitting has no closed-form solution, it requires heavy computation.
Large-scale data, dynamic data
We need an online method.
$\min_{D,\,\beta_1,\dots,\beta_n} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\beta_i\|_2^2 + \lambda \|\beta_i\|_1$
Online Solver [Mairal et al. 2010]:
[Figure: data stream. Labels: Current, Dictionary Update.]
Online Dictionary Learning
Online Solver [Mairal et al. 2010]:
[Figure: animation over the data stream. Labels: History, Current, Dictionary Update.]
Online Dictionary Learning
Online Solver [Mairal et al. 2010]:
[Figure: final animation frame. Labels: History, Current, Dictionary Update, Dictionary.]
Online Dictionary Learning
$\min_{D,\,\beta_1,\dots,\beta_n} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\beta_i\|_2^2 + \lambda \|\beta_i\|_1$
Online Solver [Mairal et al. 2010]:
                     Batch Method    Online Method
Time Complexity      polynomial      O(n)
Memory Complexity    O(n)            O(1)
Online Robust Dictionary Learning
Our goal: make robust dictionary learning online.
Less computation
Less memory
Online Robust Dictionary Learning
[Figure: inliers, outliers, and the dictionary; a data stream split into History and Current drives the dictionary update.]
Online: forget history data.
Robust: require the whole data.
Challenging
Online Robust Dictionary Learning
• Our Online Approach (Online)
• Robustness Analysis (Robust)
• Discussion
Online Robust Dictionary Learning
• Our Online Approach
• Robustness Analysis
• Discussion
Online Dictionary Learning
Settings:
We keep two parameter matrices, $M_t^j$ and $C_t^j$.
Each mini-batch contains $h$ data points.
[Figure: the data stream divided into mini-batches.]
Online Dictionary Learning
Initialization: $M_0^j$ and $C_0^j$ are zero matrices.
Dictionary $D$ is a random matrix.
[Figure: the first mini-batch (Current) updates $(M_0^j, C_0^j)$ to $(M_1^j, C_1^j)$.]
Online Dictionary Learning
General Framework
[Figure: animation over the data stream, split into History and Current. Each mini-batch updates the statistics and the dictionary $D$: $(M_1^j, C_1^j) \to (M_2^j, C_2^j) \to (M_3^j, C_3^j) \to \dots \to (M_t^j, C_t^j) \to (M_{t+1}^j, C_{t+1}^j) \to \dots \to (M_n^j, C_n^j)$. The last frame shows the final dictionary $D$.]
Our Online Approach
Time Complexity O(n/h) = O(n)
Memory Complexity O(1)
…
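Our Online Approach
In the $t$-th step:
Data (mini-batch): $x_{t-h+1}, \dots, x_t$
Sparse codes: $\beta_{t-h+1}, \dots, \beta_t$
Previous dictionary: $D$
Previous statistics: $M_{t-1}^j$, $C_{t-1}^j$
[Figure: data stream split into History and Current at position $t$.]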
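Our Online Approach
Data (mini-batch): $x_{t-h+1}, \dots, x_t$
Sparse codes: $\beta_{t-h+1}, \dots, \beta_t$
Previous statistics: $M_{t-1}^j$, $C_{t-1}^j$
For $j = 1, \dots, q$:
$\omega_i^j = \dfrac{1}{\sqrt{(x_{ij} - D(j,:)\beta_i)^2 + \delta}}, \quad i = t-h+1, \dots, t$
$M_t^j = M_{t-1}^j + \sum_{i=t-h+1}^{t} \omega_i^j \beta_i \beta_i^T$
$C_t^j = C_{t-1}^j + \sum_{i=t-h+1}^{t} \omega_i^j x_{ij} \beta_i^T$
Solve $C_t^j = D(j,:) M_t^j$ for $D(j,:)$.
Current dictionary: $D$, formed by stacking the rows $D(1,:), D(2,:), \dots, D(q,:)$.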
Our Online Approach
For $j = 1, \dots, q$:
$M_t^j = M_{t-1}^j + \sum_{i=t-h+1}^{t} \omega_i^j \beta_i \beta_i^T$
$C_t^j = C_{t-1}^j + \sum_{i=t-h+1}^{t} \omega_i^j x_{ij} \beta_i^T$
History information: $M_{t-1}^j$ and $C_{t-1}^j$
New data information: the sums over the current mini-batch, $i = t-h+1, \dots, t$
Solve $C_t^j = D(j,:) M_t^j$ to obtain the current dictionary $D$.
Our Online Approach
Time Complexity O(n/h) = O(n)
Memory Complexity O(1)
Record $M_t^j$ and $C_t^j$ only.
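The per-row update can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: the function name `online_row_update`, the names `n_rows` and `n_atoms`, the value of `delta`, and the use of a least-squares solver are assumptions, and the codes are taken as given (and dense) rather than computed by a sparse coding step.

```python
import numpy as np

def online_row_update(D, M, C, X_batch, B_batch, delta=1e-6):
    """One mini-batch update of every dictionary row.

    D: (n_rows, n_atoms) current dictionary, updated in place
    M: (n_rows, n_atoms, n_atoms) running statistics, one matrix per row
    C: (n_rows, n_atoms) running statistics, one vector per row
    X_batch: (n_rows, h) mini-batch, B_batch: (n_atoms, h) codes (assumed given)
    """
    residual = X_batch - D @ B_batch
    W = 1.0 / np.sqrt(residual ** 2 + delta)       # weight per row and sample
    for j in range(D.shape[0]):
        M[j] += (B_batch * W[j]) @ B_batch.T       # sum_i w_i beta_i beta_i^T
        C[j] += B_batch @ (W[j] * X_batch[j])      # sum_i w_i x_ij beta_i
        # Solve C^j = D(j,:) M^j; M^j is symmetric, so M^j d = C^j.
        D[j] = np.linalg.lstsq(M[j], C[j], rcond=None)[0]
    return D, M, C

# Hypothetical noise-free stream: the true dictionary should be recovered.
rng = np.random.default_rng(0)
n_rows, n_atoms, n, h = 5, 3, 60, 10
D_true = rng.normal(size=(n_rows, n_atoms))
B = rng.normal(size=(n_atoms, n))
X = D_true @ B
D = rng.normal(size=(n_rows, n_atoms))             # random initial dictionary
M = np.zeros((n_rows, n_atoms, n_atoms))           # zero initial statistics
C = np.zeros((n_rows, n_atoms))
for start in range(0, n, h):
    D, M, C = online_row_update(D, M, C, X[:, start:start + h], B[:, start:start + h])
recovery_error = np.max(np.abs(D - D_true))
```

Only `D`, `M`, and `C` persist between mini-batches, so the stored state does not grow with the number of samples processed.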
Online Robust Dictionary Learning
• Our Online Approach
• Robustness Analysis
• Discussion
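Our Online Approach
Data (mini-batch): $x_{t-h+1}, \dots, x_t$
Sparse codes: $\beta_{t-h+1}, \dots, \beta_t$
Previous statistics: $M_{t-h}^j$, $C_{t-h}^j$
For $j = 1, \dots, q$:
$\omega_i^j = \dfrac{1}{\sqrt{(x_{ij} - D(j,:)\beta_i)^2 + \delta}}, \quad i = t-h+1, \dots, t$
$M_t^j = M_{t-h}^j + \sum_{i=t-h+1}^{t} \omega_i^j \beta_i \beta_i^T$
$C_t^j = C_{t-h}^j + \sum_{i=t-h+1}^{t} \omega_i^j x_{ij} \beta_i^T$
Solve $C_t^j = D(j,:) M_t^j$ to obtain the current dictionary $D$.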
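Robustness Analysis
The updates
$\omega_i^j = \dfrac{1}{\sqrt{(x_{ij} - D(j,:)\beta_i)^2 + \delta}}, \quad i = t-h+1, \dots, t$
$M_t^j = M_{t-h}^j + \sum_{i=t-h+1}^{t} \omega_i^j \beta_i \beta_i^T$
$C_t^j = C_{t-h}^j + \sum_{i=t-h+1}^{t} \omega_i^j x_{ij} \beta_i^T$
$C_t^j = D(j,:) M_t^j$
solve
$\min_{D(j,:)} \sum_{i=1}^{t-h} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2 + \sum_{i=t-h+1}^{t} |x_{ij} - D(j,:)\beta_i|$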
Robustness Analysis (Proof)
The problem
$\min_{D(j,:)} \sum_{i=1}^{t-h} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2 + \sum_{i=t-h+1}^{t} |x_{ij} - D(j,:)\beta_i|$
is handled by Iterative Reweighted Least Squares:
$\min_{D(j,:)} \sum_{i=1}^{t-h} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2 + \sum_{i=t-h+1}^{t} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2$
with
$\omega_i^j = \dfrac{1}{\sqrt{(x_{ij} - D(j,:)\beta_i)^2 + \delta}}, \quad i = t-h+1, \dots, t$
Robustness Analysis (Proof)
With the weights $\omega_i^j = \dfrac{1}{\sqrt{(x_{ij} - D(j,:)\beta_i)^2 + \delta}}$, $i = t-h+1, \dots, t$, fixed, the reweighted least-squares problem is solved by $C_t^j = D(j,:) M_t^j$, where
$M_t^j = \sum_{i=1}^{t-h} \omega_i^j \beta_i \beta_i^T + \sum_{i=t-h+1}^{t} \omega_i^j \beta_i \beta_i^T$
$C_t^j = \sum_{i=1}^{t-h} \omega_i^j x_{ij} \beta_i^T + \sum_{i=t-h+1}^{t} \omega_i^j x_{ij} \beta_i^T$
Define $M_k^j = \sum_{i=1}^{k} \omega_i^j \beta_i \beta_i^T$ and $C_k^j = \sum_{i=1}^{k} \omega_i^j x_{ij} \beta_i^T$. The first sums then equal $M_{t-h}^j$ and $C_{t-h}^j$, the statistics from the previous step (written $M_{t-1}^j$ and $C_{t-1}^j$ when indexing by step), so
$M_t^j = M_{t-1}^j + \sum_{i=t-h+1}^{t} \omega_i^j \beta_i \beta_i^T$
$C_t^j = C_{t-1}^j + \sum_{i=t-h+1}^{t} \omega_i^j x_{ij} \beta_i^T$
Solve $C_t^j = D(j,:) M_t^j$.
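The key step of the proof is that the weighted sums split into a history part and a new-data part, so accumulating them batch by batch gives the same statistics as summing over all data at once. The following check illustrates this for one fixed row $j$; the random codes, data, and weights are made up for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_atoms, n, h = 4, 30, 5
B = rng.normal(size=(n_atoms, n))     # codes beta_i as columns
x_row = rng.normal(size=n)            # x_ij for one fixed row j
w = rng.uniform(0.1, 2.0, size=n)     # fixed weights for that row

# Sums over the whole data set at once.
M_batch = (B * w) @ B.T
C_batch = B @ (w * x_row)

# The same sums accumulated one mini-batch at a time.
M_online = np.zeros((n_atoms, n_atoms))
C_online = np.zeros(n_atoms)
for start in range(0, n, h):
    s = slice(start, start + h)
    M_online += (B[:, s] * w[s]) @ B[:, s].T
    C_online += B[:, s] @ (w[s] * x_row[s])

same_statistics = np.allclose(M_batch, M_online) and np.allclose(C_batch, C_online)
```

The equality holds because the history weights are frozen once computed; only the weights of the current mini-batch depend on the current dictionary.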
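Robustness Analysis
$\min_{D(j,:)} \sum_{i=1}^{t-h} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2 + \sum_{i=t-h+1}^{t} |x_{ij} - D(j,:)\beta_i|$
The first sum is the history data term; the second sum is the new data term.
Robustness Analysis
New data term: $\sum_{i=t-h+1}^{t} |x_{ij} - D(j,:)\beta_i|$ is solved as $\sum_{i=t-h+1}^{t} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2$ with $\omega_i^j = \dfrac{1}{\sqrt{(x_{ij} - D(j,:)\beta_i)^2 + \delta}}$.
Outliers have small weights.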
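A quick numeric illustration of why outliers receive small weights under an inverse-residual weighting of this kind. The helper name `irls_weight`, the residual values, and the value of `delta` are assumptions chosen for the example.

```python
import numpy as np

def irls_weight(residual, delta=1e-6):
    """Inverse-residual weight: large residuals get small weights."""
    residual = np.asarray(residual, dtype=float)
    return 1.0 / np.sqrt(residual ** 2 + delta)

residuals = np.array([0.1, 1.0, 100.0])   # the last value plays the outlier
weights = irls_weight(residuals)

# Weighted squared residuals stay close to the absolute residuals.
weighted_sq = weights * residuals ** 2
```

The weighted squared error of each point is approximately its absolute error, which is how a least-squares solve can stand in for the L1 term.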
Robustness Analysis
History data term: $\sum_{i=1}^{t-h} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2$ helps the new data term.
Robustness Analysis
Robustness boosting: the weights $\omega_i^j$, $i = t-h+1, \dots, t$, of the new data (robust information) become history weights.
Robustness Analysis
In $\min_{D(j,:)} \sum_{i=1}^{t-h} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2 + \sum_{i=t-h+1}^{t} |x_{ij} - D(j,:)\beta_i|$, the history data term and the new data term work together.
Online Robust Dictionary Learning
• Our Online Approach
• Robustness Analysis
• Discussion
Discussion
[Figure: inliers, outliers, and the dictionary; a data stream split into History and Current drives the dictionary update.]
Online: forget history data.
Robust: require the whole data.
Challenging
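Discussion
Online: forget history data. Robust: require the whole data.
Robustness: the weights $\omega_i^j$ encode robust information.
Robust information update:
$M_t^j = M_{t-1}^j + \sum_{i=t-h+1}^{t} \omega_i^j \beta_i \beta_i^T$
$C_t^j = C_{t-1}^j + \sum_{i=t-h+1}^{t} \omega_i^j x_{ij} \beta_i^T$
Statistic parameters: $\min_{D(j,:)} \sum_{i=1}^{t-h} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2 + \sum_{i=t-h+1}^{t} |x_{ij} - D(j,:)\beta_i|$ is solved by $C_t^j = D(j,:) M_t^j$.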
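Discussion (Limitation)
Online: $\min_{D(j,:)} \sum_{i=1}^{t-h} \omega_i^j (x_{ij} - D(j,:)\beta_i)^2 + \sum_{i=t-h+1}^{t} |x_{ij} - D(j,:)\beta_i|$ (history data term plus new data term)
Batch: $\min_{D(j,:)} \sum_{i=1}^{t-h} |x_{ij} - D(j,:)\beta_i| + \sum_{i=t-h+1}^{t} |x_{ij} - D(j,:)\beta_i|$
Initial bias: it cannot handle the extreme case where the initial data are primarily outliers.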
Experiments
Synthetic Data
Generate:
Sparse coefficients: $\beta_1, \dots, \beta_n$; dictionary: $D$
Clean data: $x_i^0 = D\beta_i$, $i = 1, \dots, n$
Training data: $x_i = x_i^0 + \varepsilon_i$, $i = 1, \dots, n$, where $\varepsilon_i$ is the outlier (Laplace noise)
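A data generator along these lines might look as follows. Every size and constant here (signal dimension, number of atoms, sparsity level, fraction of corrupted entries, Laplace scale) is a placeholder chosen for illustration, as is the choice to corrupt only a random subset of entries; the talk does not state these details.

```python
import numpy as np

def make_synthetic(n_rows=16, n_atoms=32, n=200, sparsity=3,
                   outlier_frac=0.1, scale=5.0, seed=0):
    """Sparse codes, a random dictionary, clean data, and corrupted data."""
    rng = np.random.default_rng(seed)
    D = rng.normal(size=(n_rows, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)    # unit-norm atoms
    B = np.zeros((n_atoms, n))
    for i in range(n):
        support = rng.choice(n_atoms, size=sparsity, replace=False)
        B[support, i] = rng.normal(size=sparsity)    # a few active atoms
    X_clean = D @ B
    # Assumed corruption model: Laplace noise on a random subset of entries.
    mask = rng.random(size=X_clean.shape) < outlier_frac
    noise = rng.laplace(loc=0.0, scale=scale, size=X_clean.shape)
    X_train = X_clean + mask * noise
    return D, B, X_clean, X_train

D, B, X_clean, X_train = make_synthetic()
corrupted_fraction = np.mean(X_train != X_clean)
```

Keeping `X_clean` alongside `X_train` makes it possible to score a learnt dictionary against the uncorrupted signals.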
Synthetic Data
Error: $\frac{1}{n} \sum_{i=1}^{n} \|x_i^0 - D\beta_i\|_{L_2}^2$, where $D$ is the learnt dictionary.
        BRDL     ORDL     KSVD     ODL
15dB    0.051    0.067    0.412    0.505
20dB    0.034    0.042    0.224    0.282
25dB    0.021    0.030    0.121    0.159
KSVD [Aharon et al. 2006] and ODL [Mairal et al. 2010] use L2 norm data fitting.
Synthetic Data
Synthetic Data:
Digit Recognition
Digit Recognition:
Digits with outliers
Data set: MNIST and USPS
Digit Recognition
Digit Recognition:
Our Learnt Dictionary
Learnt Dictionary of L2 norm data-fitting
Online Robust Dictionary Learning
Digit Recognition:
         BRDL    ORDL    KSVD    ODL
MNIST    18.1    22.7    39.3    34.3
USPS     27.3    29.4    45.3    42.5
KSVD [Aharon et al. 2006] and ODL [Mairal et al. 2010] use L2 norm data fitting.
Digit Recognition
Digit Recognition:
Time comparison (in seconds) for digit recognition on the MNIST dataset (with outliers), for each digit
Digit   0      1      2      3      4      5      6      7      8      9
Size    5323   6742   5958   6131   5842   5421   5918   6265   5851   5949
ORDL    187    214    188    194    185    171    187    198    185    188
BRDL    2424   3466   2440   2587   2392   1979   2359   4115   2344   2402
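The timings in the table translate directly into per-digit speedups. This small calculation uses only the numbers listed above; the variable names are illustrative.

```python
# Running times in seconds, copied from the table (digits 0 to 9).
ordl = [187, 214, 188, 194, 185, 171, 187, 198, 185, 188]
brdl = [2424, 3466, 2440, 2587, 2392, 1979, 2359, 4115, 2344, 2402]

# Ratio of batch time to online time for each digit.
speedup = [b / o for o, b in zip(ordl, brdl)]
for digit, s in enumerate(speedup):
    print(f"digit {digit}: BRDL/ORDL = {s:.1f}")
```

On these numbers the online method is between about 11 and 21 times faster than the batch method, depending on the digit.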
Background Subtraction
Video frames(Training Data)
Background Outlier
Background Subtraction
Dataset [Zhao 2011]
KSVD [Aharon et al. 2006]
ORDL: Online Robust Dictionary Learning
BRDL: Batch Robust Dictionary Learning [Zhao 2011]
KSVD ORDL BRDL
Background Subtraction
Traffic light status 1
ORDL BRDL
Background Subtraction
Traffic light status 2
ORDL BRDL
Background Subtraction
Comparison of time and memory