Online Dictionary Learning for Sparse Coding
Julien Mairal¹, Francis Bach¹, Jean Ponce², Guillermo Sapiro³
¹INRIA–Willow project  ²École Normale Supérieure  ³University of Minnesota
ICML, Montréal, June 2009
What this talk is about
- Learning dictionaries (basis sets) for sparse coding efficiently.
- Solving a large-scale matrix factorization problem.
- Making some large-scale image processing problems tractable.
- Proposing an algorithm that extends to NMF, sparse PCA, ...
1 The Dictionary Learning Problem
2 Online Dictionary Learning
3 Extensions
The Dictionary Learning Problem

$$ \underbrace{\mathbf{y}}_{\text{measurements}} = \underbrace{\mathbf{x}_{\text{orig}}}_{\text{original image}} + \underbrace{\mathbf{w}}_{\text{noise}} $$
The Dictionary Learning Problem [Elad & Aharon ('06)]

Solving the denoising problem:
- Extract all overlapping 8 × 8 patches x_i.
- Solve a matrix factorization problem:
$$ \min_{\alpha_i,\, D \in \mathcal{C}} \sum_{i=1}^{n} \underbrace{\tfrac{1}{2}\|x_i - D\alpha_i\|_2^2}_{\text{reconstruction}} + \underbrace{\lambda\|\alpha_i\|_1}_{\text{sparsity}}, $$
with n > 100,000.
- Average the reconstructions of each patch.
The Dictionary Learning Problem [Mairal, Bach, Ponce, Sapiro & Zisserman ('09)]
Denoising result
The Dictionary Learning Problem [Mairal, Sapiro & Elad ('08)]
Image completion example
The Dictionary Learning Problem: What does D look like?
The Dictionary Learning Problem
$$ \min_{\alpha \in \mathbb{R}^{k \times n},\, D \in \mathcal{C}} \sum_{i=1}^{n} \tfrac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1 $$
$$ \mathcal{C} \triangleq \{ D \in \mathbb{R}^{m \times k} \ \text{s.t.}\ \forall j = 1, \ldots, k,\ \|d_j\|_2 \le 1 \}. $$
Classical optimization alternates between D and α.
Good results, but very slow!
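The classical alternating scheme can be sketched in a few lines of numpy: fix D and solve the lasso for each α_i (here via ISTA, one common choice), then fix α and take a projected gradient step on D onto the constraint set C. Problem sizes, step sizes, and iteration counts below are illustrative assumptions, not the slow batch baseline the talk benchmarks against.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code(X, D, lam, n_iter=100):
    """ISTA for min_A 0.5||X - D A||_F^2 + lam ||A||_1 (column-wise lasso)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        A = soft_threshold(A - (D.T @ (D @ A - X)) / L, lam / L)
    return A

def project_columns(D):
    """Project onto C: rescale any column with ||d_j||_2 > 1."""
    norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
    return D / norms

rng = np.random.default_rng(0)
X = rng.standard_normal((25, 200))                   # n=200 signals, m=25
D = project_columns(rng.standard_normal((25, 50)))   # k=50 atoms
lam = 0.1
for _ in range(10):                                  # alternate; slow in practice
    A = sparse_code(X, D, lam)                       # alpha-step: lasso
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-8)  # D-step: projected gradient
    D = project_columns(D - step * ((D @ A - X) @ A.T))
```

Each outer iteration requires a full pass over all n signals, which is exactly why this batch approach scales poorly and motivates the online method of the next section.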
1 The Dictionary Learning Problem
2 Online Dictionary Learning
3 Extensions
Online Dictionary Learning
Classical formulation of dictionary learning:
$$ \min_{D \in \mathcal{C}} f_n(D) = \min_{D \in \mathcal{C}} \frac{1}{n} \sum_{i=1}^{n} \ell(x_i, D), $$
where
$$ \ell(x, D) \triangleq \min_{\alpha \in \mathbb{R}^k} \tfrac{1}{2}\|x - D\alpha\|_2^2 + \lambda\|\alpha\|_1. $$
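Evaluating ℓ(x, D) for one signal is a plain lasso problem, so it can be checked against an off-the-shelf solver. One detail worth noting: scikit-learn's `Lasso` minimizes (1/(2m))‖x − Dα‖² + α‖·‖₁, so its regularization parameter must be λ/m to match the formulation above. The dictionary and signal here are random stand-ins.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
m, k, lam = 25, 50, 0.1
D = rng.standard_normal((m, k))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms, D in C
x = rng.standard_normal(m)

# sklearn scales the quadratic term by 1/m, hence alpha = lam / m.
lasso = Lasso(alpha=lam / m, fit_intercept=False, max_iter=5000)
alpha = lasso.fit(D, x).coef_

# l(x, D): the optimal lasso objective value for this signal.
loss = 0.5 * np.sum((x - D @ alpha) ** 2) + lam * np.sum(np.abs(alpha))
```

The ℓ₁ penalty makes the minimizing code sparse: typically only a small fraction of the k coefficients are nonzero.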
Online Dictionary Learning
Which formulation are we interested in?
$$ \min_{D \in \mathcal{C}} \Big[ f(D) \triangleq \mathbb{E}_x[\ell(x, D)] \approx \lim_{n \to +\infty} \frac{1}{n} \sum_{i=1}^{n} \ell(x_i, D) \Big] $$
Online Dictionary Learning
Online learning can:
- handle potentially infinite datasets,
- adapt to dynamic training sets,
- be dramatically faster than batch algorithms [Bottou & Bousquet ('08)].
Online Dictionary Learning: Proposed approach

1: for t = 1, ..., T do
2:   Draw x_t
3:   Sparse coding:
$$ \alpha_t \leftarrow \operatorname*{arg\,min}_{\alpha \in \mathbb{R}^k} \tfrac{1}{2}\|x_t - D_{t-1}\alpha\|_2^2 + \lambda\|\alpha\|_1, $$
4:   Dictionary update:
$$ D_t \leftarrow \operatorname*{arg\,min}_{D \in \mathcal{C}} \frac{1}{t} \sum_{i=1}^{t} \Big( \tfrac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1 \Big), $$
5: end for
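A key observation behind step 4 is that the dictionary update only depends on the running statistics A_t = Σ α_i α_iᵀ and B_t = Σ x_i α_iᵀ, so past samples never need to be stored. The sketch below implements the loop with those statistics and a block-coordinate update of the columns of D; sparse coding uses ISTA for self-containedness (the talk uses LARS), and all sizes and iteration counts are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(x, D, lam, n_iter=100):
    """Sparse coding step: min_a 0.5||x - D a||^2 + lam||a||_1 via ISTA."""
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = soft_threshold(a - D.T @ (D @ a - x) / L, lam / L)
    return a

def dict_update(D, A, B, n_iter=5):
    """Block-coordinate descent on the columns of D, warm-started at D."""
    for _ in range(n_iter):
        for j in range(D.shape[1]):
            if A[j, j] < 1e-12:            # atom never used yet: skip
                continue
            u = (B[:, j] - D @ A[:, j]) / A[j, j] + D[:, j]
            D[:, j] = u / max(np.linalg.norm(u), 1.0)   # project onto C
    return D

rng = np.random.default_rng(0)
m, k, lam, T = 25, 50, 0.1, 100
D = rng.standard_normal((m, k))
D /= np.linalg.norm(D, axis=0)
A = np.zeros((k, k))                       # running sum of alpha alpha^T
B = np.zeros((m, k))                       # running sum of x alpha^T
for t in range(T):
    x = rng.standard_normal(m)             # draw x_t (toy data here)
    a = lasso_ista(x, D, lam)              # sparse coding
    A += np.outer(a, a)                    # accumulate statistics
    B += np.outer(x, a)
    D = dict_update(D, A, B)               # dictionary update
```

Memory stays O(mk + k²) regardless of how many samples stream by, which is what makes the "potentially infinite datasets" claim above concrete.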
Online Dictionary Learning: Proposed approach

Implementation details:
- Use LARS for the sparse coding step,
- Use a block-coordinate approach for the dictionary update, with warm restart,
- Use a mini-batch.
Online Dictionary Learning: Proposed approach

Which guarantees do we have? Under a few reasonable assumptions,
- we build a surrogate function f̂_t of the expected cost f verifying
$$ \lim_{t \to +\infty} \hat{f}_t(D_t) - f(D_t) = 0, $$
- D_t is asymptotically close to a stationary point.
Online Dictionary Learning: Experimental results, batch vs online
[Plots for m = 8 × 8 (k = 256), m = 12 × 12 × 3 (k = 512), and m = 16 × 16 (k = 1024).]
Online Dictionary Learning: Experimental results, ODL vs SGD
[Plots for m = 8 × 8 (k = 256), m = 12 × 12 × 3 (k = 512), and m = 16 × 16 (k = 1024).]
Online Dictionary Learning: Inpainting a 12-Mpixel photograph
[Image sequence omitted.]
1 The Dictionary Learning Problem
2 Online Dictionary Learning
3 Extensions
Extension to NMF and sparse PCA
NMF extension:
$$ \min_{\alpha \in \mathbb{R}^{k \times n},\, D \in \mathcal{C}} \sum_{i=1}^{n} \tfrac{1}{2}\|x_i - D\alpha_i\|_2^2 \quad \text{s.t.} \quad \alpha_i \ge 0,\ D \ge 0. $$
SPCA extension:
$$ \min_{\alpha \in \mathbb{R}^{k \times n},\, D \in \mathcal{C}'} \sum_{i=1}^{n} \tfrac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1 $$
$$ \mathcal{C}' \triangleq \{ D \in \mathbb{R}^{m \times k} \ \text{s.t.}\ \forall j,\ \|d_j\|_2^2 + \gamma\|d_j\|_1 \le 1 \}. $$
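For the NMF extension, the same alternating structure carries over with the two extra constraints handled by projection: the sparse-coding step becomes a nonnegative lasso (projected gradient instead of soft-thresholding), and dictionary columns are clipped to be nonnegative before renormalizing. This is an illustrative sketch of that modification, not the authors' exact update rules.

```python
import numpy as np

def nn_sparse_code(x, D, lam=0.0, n_iter=200):
    """Projected gradient for min_a 0.5||x - D a||^2 + lam*sum(a), a >= 0.
    (For a >= 0, the l1 penalty lam||a||_1 reduces to a linear term.)"""
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = np.maximum(a - (D.T @ (D @ a - x) + lam) / L, 0.0)
    return a

def project_nn_columns(D):
    """Project onto {D >= 0} and rescale columns with ||d_j||_2 > 1."""
    D = np.maximum(D, 0.0)
    return D / np.maximum(np.linalg.norm(D, axis=0), 1.0)

rng = np.random.default_rng(0)
D = project_nn_columns(rng.random((25, 50)))
x = rng.random(25)
a = nn_sparse_code(x, D, lam=0.1)
```

The SPCA variant only changes the dictionary constraint set, so its update would replace the column projection with one onto {‖d_j‖₂² + γ‖d_j‖₁ ≤ 1}.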
Extension to NMF and sparse PCA: Faces (Extended Yale Database B)
(a) PCA (b) NNMF (c) DL
(d) SPCA, τ = 70% (e) SPCA, τ = 30% (f) SPCA, τ = 10%
Extension to NMF and sparse PCA: Natural Patches
(a) PCA (b) NNMF (c) DL
(d) SPCA, τ = 70% (e) SPCA, τ = 30% (f) SPCA, τ = 10%
Conclusion
Take-home message:
- Online techniques are well suited to the dictionary learning problem.
- Our method makes some large-scale image processing tasks tractable ...
- ... and extends to various matrix factorization problems.