
An Introduction to Sparse Coding, Sparse Sensing, and Optimization

Speaker: Wei-Lun Chao

Date: Nov. 23, 2011

DISP Lab, Graduate Institute of Communication Engineering, National Taiwan University

Outline

• Introduction

• The fundamentals of optimization

• The idea of sparsity: coding vs. sensing

• The solution

• The importance of dictionary

• Applications


Introduction


Introduction

• What is sparsity?

• Usage:

Compression

Analysis

Representation

Fast / sparse sensing

[Figure: projection bases vs. reconstruction bases]

Introduction

• Why do we use the Fourier transform and its modifications for image and acoustic compression?

Differentiability (theoretical)

Intrinsic sparsity (data-dependent)

Human perception (human-centric)

• Better bases for compression or representation?

Wavelets

How about data-dependent bases?

How about learning?


Introduction

• Optimization

Frequently faced in algorithm design

Used to implement your creative ideas

• Issue

What kinds of mathematical forms and their corresponding optimization algorithms guarantee convergence to local or global optima?


The Fundamentals of Optimization


A Warming-up Question

• How do you solve the following problems?

(1)  min_w f(w) = (w - 5)^2

(2)  min_w f(w), for a general 1-D f with several local minima and one global minimum

(a) Plot the function.  (b) Take the derivative and check where it equals 0.

[Figure: plots of the objectives, with the local minima and the global minimum marked; the quadratic in (1) attains its global minimum at w = 5]

An Advanced Question

• How about the following questions?

(3)  min_w Σ_{n=1}^N f_n(w)

     (a) Plot? (b) Take the derivative and set it to 0?

(4)  min_w f(w) = (w - 5)^2,  s.t. w ≤ 3

     Derivative?

(5)  min_w Σ_{n=1}^N f_n(w),  s.t. w_i ≥ b_i for each i

     How to do it?

[Figure: the quadratic in (4), with the unconstrained minimum at w = 5 and the constraint boundary at w = 3]

Illustration

• 2-D case:

min_{w_1, w_2} f(w_1, w_2),  s.t. g(w_1, w_2) = b

[Figure: level sets f(w_1, w_2) = 1, 2, ..., 6 in the (w_1, w_2) plane together with the constraint curve g(w_1, w_2) = b; the constrained optimum lies where a level set touches the curve]

How to Solve?

• Thanks to:

Lagrange multipliers

Linear programming, quadratic programming, and, more recently, convex optimization

• Standard form:

min_w f_0(w)

s.t. h_i(w) = b_i,  i = 1, ......, m

     g_i(w) ≤ c_i,  i = 1, ......, n
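As a quick illustration (not part of the slides), problem (4) above can be written in this standard form and handed to an off-the-shelf solver. A minimal sketch, assuming SciPy as the solver of my own choosing:

```python
import numpy as np
from scipy.optimize import minimize

# Problem (4) in standard form: minimize f0(w) = (w - 5)^2  s.t.  g(w) = w - 3 <= 0
f0 = lambda w: (w[0] - 5.0) ** 2

# SciPy expects inequality constraints as fun(w) >= 0, so encode w <= 3 as 3 - w >= 0.
constraints = [{"type": "ineq", "fun": lambda w: 3.0 - w[0]}]

res = minimize(f0, x0=np.array([0.0]), constraints=constraints)
print(res.x)  # expected to be close to w = 3: the constraint is active
```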

Fallacy

• A quadratic programming problem with constraints

min_x ||A x - b||_2^2,  s.t. x_i ≥ 0,  where A = [a_1 a_2 ...... a_N]

  x: the importance (amount) of each food
  b: personal nutrient need
  a_i: nutrient content of each food

(1) Take the derivative (✗): x = (A^T A)^{-1} A^T b, then keep only x with x_i ≥ 0

(2) Quadratic programming (✓)

(3) Sparse coding (✓)
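The problem above is a non-negative least-squares problem. A minimal sketch (not from the slides) with synthetic data, using SciPy's nnls as a quadratic-programming-style solver of my own choosing; the sizes and random data are placeholders:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
d, N = 6, 10                       # d nutrients, N foods (synthetic sizes)
A = rng.random((d, N))             # columns a_i: nutrient content of food i
b = rng.random(d)                  # personal nutrient need

# (1) "Take the derivative": the unconstrained least-squares solution
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print((x_ls < 0).any())            # often True: negative amounts of food, i.e. infeasible

# (2) Least squares with x >= 0 enforced (non-negative least squares)
x_nn, residual = nnls(A, b)
print(x_nn.min() >= 0, residual)   # all entries non-negative; the active-set solution
                                   # typically also has few nonzeros, hinting at sparsity
```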

The Idea of Sparsity


What is Sparsity?

• Think about the following problem:

min_x ||A x - b||_2^2,  where A = [a_1 a_2 ...... a_N] ∈ R^{d×N}, x ∈ R^N

Assume A has full rank and N > d. Then many x can achieve min_x ||A x - b||_2^2 = 0. Which one do you want?

Choose the x with the fewest nonzero components:

arg min_x ||x||_0,  s.t. ||A x - b||_2^2 = 0

Why Sparsity?

• The more concise, the better.

• In some domains, there naturally exists a sparse latent vector that controls the data we observe (e.g., MRI, music).

• In some domains, samples from the same class share the sparsity property.

• The domain can be learned.

b = A x (+ noise),  where most entries x_j = 0

A k-sparse domain means that each b can be constructed from an x vector with at most k nonzero elements.

Sparse Sensing vs. Sparse Coding

• Assume that:

We have A ∈ R^{d×N}, N > d. Now an observation b ∈ R^d comes in, with

b = A x,  with x sparse

Sparse coding:

x* = arg min_x ||x||_0,  s.t. ||A x - b||_2^2 = 0

Sparse sensing:

y = W b,  W ∈ R^{p×d},  p < d,  so that  y = W b = W A x = Q x, with the same sparse x

x** = arg min_x ||x||_0,  s.t. ||Q x - y||_2^2 = 0

Then x* = x**.

Note: p is chosen based on the sparsity of the data (on k).

Sparse Sensing

[Figure: b ∈ R^d is compressed to y = W b ∈ R^p, W ∈ R^{p×d}, p < d; since b = A x with x sparse, y = W b = W A x = Q x with the same sparse x, and x* = x**]

Sparse Sensing vs. Sparse Coding

• Sparse sensing (compressed sensing):

Acquiring b directly costs a lot of time or money, so we measure the shorter y first and then recover b.

• Sparse coding (sparse representation):

We believe the data have a sparse structure; otherwise, a sparse representation means nothing.

x can serve as a feature of b.

x can be used to store b efficiently and to reconstruct b.
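A minimal numpy sketch (not from the slides) of the relation y = W b = W A x = Q x; the sizes, the random Gaussian dictionary, and the random measurement matrix are my own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, k, p = 64, 128, 3, 20        # assumed sizes; p < d, chosen with the sparsity k in mind

A = rng.standard_normal((d, N))    # over-complete dictionary
x = np.zeros(N)
x[rng.choice(N, size=k, replace=False)] = rng.standard_normal(k)   # a k-sparse x
b = A @ x                          # the full observation we would like to avoid measuring

W = rng.standard_normal((p, d))    # cheap measurement matrix, p << d
y = W @ b                          # what we actually measure
Q = W @ A

print(np.allclose(y, Q @ x))       # True: y = W b = W A x = Q x, with the same sparse x
```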

The Solution


How to Get The Sparse Solution?

• There is no algorithm other than exhaustive search to solve:

x* = arg min_x ||x||_0,  s.t. ||A x - b||_2^2 = 0

• In some situations (e.g., special forms of A), the solution of l1 minimization approaches that of l0 minimization:

x*** = arg min_x ||x||_1 = arg min_x Σ_{n=1}^N |x_n|,  s.t. ||A x - b||_2^2 = 0

x*** ≈ x*
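The l1 problem above can be rewritten as a linear program by splitting x into its positive and negative parts. A minimal sketch (not from the slides) using SciPy's linprog; the sizes and seed are my own assumptions, and exact recovery is only expected with high probability for such a sparse x:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
d, N, k = 20, 50, 3
A = rng.standard_normal((d, N))
x_true = np.zeros(N)
x_true[rng.choice(N, size=k, replace=False)] = rng.standard_normal(k)
b = A @ x_true

# min ||x||_1  s.t.  A x = b, with x = u - v and u, v >= 0:
#   min  sum(u) + sum(v)   s.t.  [A, -A] [u; v] = b,  u, v >= 0
c = np.ones(2 * N)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=b, bounds=(0, None))
x_hat = res.x[:N] - res.x[N:]
print(np.abs(x_hat - x_true).max())   # typically tiny: the sparse x is recovered
```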

Why l1?

• Question 1: Why does l1 minimization result in a sparse solution?

Compare  arg min_x ||x||_1, s.t. ||A x - b||_2^2 = 0  with  arg min_x ||x||_2, s.t. ||A x - b||_2^2 = 0.

[Figure: in the (w_1, w_2) plane, the solutions of A x = b form a line; the l1 ball ||x||_1 = c (a diamond) first touches this line at a corner on an axis, i.e., at a sparse point, while the l2 ball ||x||_2 = c (a circle) touches it at a point that is generally not sparse]
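A tiny numerical check of Question 1 (not from the slides): the minimum l2-norm solution of A x = b spreads energy over essentially every entry, so it is almost never sparse; sizes and data are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 10, 30
A = rng.standard_normal((d, N))
x_true = np.zeros(N)
x_true[[2, 17]] = [1.0, -2.0]          # a 2-sparse ground truth
b = A @ x_true

x_l2 = np.linalg.pinv(A) @ b           # arg min ||x||_2  s.t.  A x = b
print(np.sum(np.abs(x_true) > 1e-8))   # 2
print(np.sum(np.abs(x_l2) > 1e-8))     # ~30: the l2 solution touches almost every entry
```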

Why l1?

• Question 2: Why does the sparse solution achieved by l1 minimization approach the one obtained by l0 minimization?

This is a matter of mathematics.

In any case, sparse representation based on l1 minimization has been widely used for pattern recognition.

In addition, if one does not care about using the sparse solution as a representation (feature), it seems acceptable that the two solutions differ, since both reconstruct b:

b = A x***    and    b = A x*

Noise

• Sometimes, the data are observed with noise:

b̃ = b + noise = A x* + noise,  where x* is sparse (most x*_i = 0)

Is the solution of l0 (l1) minimization on b̃ still x*?

• The answer seems to be negative:

b̃ usually cannot be written exactly as A x with a sparse x, and

arg min_x ||x||_1,  s.t. ||A x - b̃||_2^2 = 0

is neither equal to nor close to x*, and is possibly not sparse.

Noise

• Several ways to overcome this:

arg min_x ||x||_1, s.t. ||A x - b||_2^2 = 0   →   arg min_x ||x||_1, s.t. ||A x - b||_2^2 ≤ c

arg min_x ||x||_1, s.t. ||A x - b||_1 ≤ c

arg min_z ||z||_1, s.t. ||[A I] z - b||_2^2 = 0,  where z = [x^T t^T]^T (t absorbs the noise)

• What is the difference between

||A x - b||_2 ≤ c    and    ||A x - b||_1 ≤ c ?

Equivalent form

• You may also see several forms for the problem:

arg min_x ||x||_1,  s.t. ||A x - b||_2^2 ≤ c

⇔  arg min_x ||A x - b||_2^2 + λ ||x||_1

⇔  arg min_x ||A x - b||_2^2,  s.t. ||x||_1 ≤ d

• These equivalent forms are derived via Lagrange multipliers.

• There have been several publications on how to solve the l1 minimization problem efficiently.
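The middle (penalized) form is the familiar LASSO. A minimal sketch (not from the slides) using scikit-learn's Lasso, which is my own library choice; note that scikit-learn scales the data-fit term by 1/(2·n_samples), so its alpha is a rescaled λ, and the sizes, noise level, and alpha below are assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
d, N, k = 40, 100, 4
A = rng.standard_normal((d, N))
x_true = np.zeros(N)
x_true[rng.choice(N, size=k, replace=False)] = rng.standard_normal(k)
b = A @ x_true + 0.01 * rng.standard_normal(d)     # noisy observation

# scikit-learn minimizes (1 / (2 * n_samples)) * ||A x - b||_2^2 + alpha * ||x||_1
lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
lasso.fit(A, b)
x_hat = lasso.coef_

print(np.sum(np.abs(x_hat) > 1e-6))    # a sparse estimate, typically close to k nonzeros
print(np.abs(x_hat - x_true).max())    # reasonably small despite the noise and shrinkage
```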

The Importance of Dictionary


Dictionary generation

• In the preceding sections, we generally assumed that the (over-complete) basis A exists and is known.

• However, in practice we usually need to build it:

Wavelet + Fourier + Haar + ……

Learning based on data

• How to learn?

Given a training set {b^(i)}_{i=1}^N, b^(i) ∈ R^d, form B = [b^(1) b^(2) ...... b^(N)], and solve

A*, X* = arg min_{A, X} ||B - A X||_F^2 + λ Σ_{i=1}^N ||x^(i)||_1,  where X = [x^(1) x^(2) ...... x^(N)]

• May result in over-fitting
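A minimal sketch of the learning step (not from the slides), using scikit-learn's DictionaryLearning as my own library choice; sizes, sparsity level, and parameters are assumptions. Scikit-learn stores samples in rows, so it factors the transposed problem B ≈ X A, with X the sparse codes and the rows of A the dictionary atoms:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
N, d, n_atoms = 300, 20, 40            # N training signals b^(i) in R^d, 40 atoms

# Synthetic training set: each b^(i) is a sparse combination of a hidden dictionary.
A_hidden = rng.standard_normal((d, n_atoms))
codes = rng.standard_normal((N, n_atoms)) * (rng.random((N, n_atoms)) < 0.1)
B = codes @ A_hidden.T                 # rows are the training signals b^(i)

# Roughly minimizes ||B - X A||_F^2 + alpha * ||X||_1 in the scikit-learn convention
# (rows of B are samples, so the learned dictionary is the transpose of the slide's A).
dico = DictionaryLearning(n_components=n_atoms, alpha=0.5, max_iter=200)
X = dico.fit_transform(B)              # sparse codes, one row per training signal
A_learned = dico.components_           # shape (n_atoms, d)

print(np.linalg.norm(B - X @ A_learned) / np.linalg.norm(B))   # reasonably small error
```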

Applications


Back to the problem we have

• A quadratic programming problem with constraints

min_x ||A x - b||_2^2,  s.t. x_i ≥ 0,  where A = [a_1 a_2 ...... a_N]

  x: the importance (amount) of each food
  b: personal nutrient need
  a_i: nutrient content of each food

(1) Take the derivative (✗): x = (A^T A)^{-1} A^T b, then keep only x with x_i ≥ 0

(2) Quadratic programming (✓)

(3) Sparse coding (✓)

Face Recognition (1)


Face Recognition (2)


An important issue

• When using sparse representation as a way of feature extraction, you may wonder: even if the sparsity property really exists in the data, does the sparse feature actually lead to better results? Does it carry any semantic meaning?

• Successful areas:

Face recognition

Digit recognition

Object recognition (with careful design):

  Ex. K-means → sparse representation

De-noising

Learn a patch dictionary. For each patch b, compute the sparse representation x*, then use it to reconstruct the patch:

x* = arg min_x ||A x - b||_2^2 + λ ||x||_1

b ≈ A x*
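A minimal patch-based sketch in the spirit of this slide (not the slide's own code), using scikit-learn's patch utilities and a dictionary learned on the noisy patches; the image, patch size, and parameters are placeholders of my own choosing:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d

rng = np.random.default_rng(0)
clean = rng.random((64, 64))                        # placeholder image; use a real one in practice
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

patch_size = (7, 7)
patches = extract_patches_2d(noisy, patch_size)     # one patch per pixel location
P = patches.reshape(len(patches), -1)
mean = P.mean(axis=1, keepdims=True)
P = P - mean                                        # work on zero-mean patches

# Learn a patch dictionary A and sparse-code each patch (the slide's x* for each b).
dico = MiniBatchDictionaryLearning(n_components=100, alpha=1.0, batch_size=200)
codes = dico.fit_transform(P)

# Reconstruct every patch as A x* and reassemble the image (overlapping patches are averaged).
P_hat = codes @ dico.components_ + mean
denoised = reconstruct_from_patches_2d(P_hat.reshape(patches.shape), noisy.shape)
print(denoised.shape)
```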

Detection based on reconstruction

Learn a patch dictionary for a specific object. For each patch b in the image, compute the sparse representation x* and use it to reconstruct the patch. Check the error for each patch, and identify those with small error as the detected object:

x* = arg min_x ||A x - b||_2^2 + λ ||x||_1

check ||b - A x*||_2^2

(here the dictionary A is maybe not over-complete)

Other cases: Foreground-background detection, pedestrian detection, ……
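A minimal sketch of the error check itself (not from the slides): score a patch by its reconstruction error against the object dictionary and threshold it. The dictionary, the sparse code (obtained from any l1 solver, e.g. the Lasso sketch earlier), and the threshold below are hypothetical placeholders:

```python
import numpy as np

def detection_score(A, b, x_star):
    """Reconstruction error ||b - A x*||_2^2 used by the detection rule above."""
    r = b - A @ x_star
    return float(r @ r)

# Hypothetical inputs: a learned object dictionary A, a patch b, and its sparse code x_star.
rng = np.random.default_rng(0)
A = rng.standard_normal((49, 30))            # 7x7 patches, 30 atoms (maybe not over-complete)
x_star = np.zeros(30); x_star[[1, 7]] = [0.8, -0.3]
b = A @ x_star + 0.05 * rng.standard_normal(49)

threshold = 0.5                               # assumed value; tune on validation data
print(detection_score(A, b, x_star) < threshold)   # True -> label the patch as the object
```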

Conclusion


What you should know

• What is the standard form of an optimization problem?

• What is sparsity?

• What is sparse coding and sparse sensing?

• What kinds of optimization methods solve it?

• Try to use it!


Thank you for listening

