low-rank models - nyu courantcfgranda/pages/obda_spring16... · 1 25(5) 5 superman2....

87
Low-rank models Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/OBDA_spring16 Carlos Fernandez-Granda 5/2/2016

Upload: others

Post on 22-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low-rank models

Optimization-Based Data Analysishttp://www.cims.nyu.edu/~cfgranda/pages/OBDA_spring16

Carlos Fernandez-Granda

5/2/2016

Page 2: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 3: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 4: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Netflix Prize

? ? ? ?

?

?

??

??

???

?

?

Page 5: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completion

Bob Molly Mary Larry

1 ? 5 4 The Dark Knight? 1 4 5 Spiderman 34 5 2 ? Love Actually5 4 2 1 Bridget Jones’s Diary4 5 1 2 Pretty Woman1 2 ? 5 Superman 2

Page 6: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completion as an inverse problem

[1 ? 5? 3 2

]

For a fixed sampling pattern, underdetermined system of equations

1 0 0 0 0 0

0 0 0 1 0 0

0 0 0 0 1 0

0 0 0 0 0 1

M11

M21

M12

M22

M13

M23

=

1

3

5

2

Page 7: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Isn’t this completely ill posed?

Assumption: Matrix is low rank, depends on ≈ r (m + n) parameters

As long as data > parameters recovery is possible (in principle)

1 1 1 1 ? 11 1 1 1 1 11 1 1 1 1 1? 1 1 1 1 1

Page 8: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix cannot be sparse

0 0 0 0 0 00 0 0 23 0 00 0 0 0 0 00 0 0 0 0 0

Page 9: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Singular vectors cannot be sparse

1111

[1 1 1 1]

+

0001

[1 2 3 4]

=

1 1 1 11 1 1 11 1 1 12 3 4 5

Page 10: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Incoherence

The matrix must be incoherent: its singular vectors must be spread out

For 1/√

n ≤ µ ≤ 1

max1≤i≤r ,1≤j≤m

|Uij | ≤ µ

max1≤i≤r ,1≤j≤n

|Vij | ≤ µ

for the left U1, . . . ,Ur and right V1, . . . ,Vr singular vectors

Page 11: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Measurements

We must see an entry in each row/column at least1 1 1 1? ? ? ?1 1 1 11 1 1 1

=

1?11

[1 1 1 1]

Assumption: Random sampling (usually does not hold in practice!)

Page 12: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Underdetermined inverse problems

Measurements Class of signals

Compressedsensing

Gaussian, randomFourier coeffs.

Sparse

Super-resolution Low passSignals with min.

separation

Matrixcompletion Random sampling

Incoherent low-rankmatrices

Page 13: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 14: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix inner product

The trace of a n × n matrix is defined as

Trace (A) :=n∑

i=1

Aii

The inner product between two m × n matrices is defined as

〈A,B〉 = Trace(ATB

)=

m∑i=1

n∑i=1

AijBij

For any matrices A, B , C with appropriate dimensions

Trace (ABC ) = Trace (BCA) = Trace (CAB)

Page 15: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix norm

Let σ1 ≥ σ2 ≥ . . . ≥ σn be the singular values of M ∈ Rm×n, m ≥ n,

Operator norm

||M|| := max||u||2≤1

||M u||2 = σ1

Frobenius norm

||M||F :=

√∑i

M2ij =

√Trace (MTM) =

√√√√ n∑i=1

σ2i

Nuclear norm

||M||∗ :=n∑

i=1

σi

Page 16: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Characterization of nuclear norm

||A||∗ = sup||B||≤1

〈A,B〉

Consequence: Nuclear norm satisfies triangle inequality

||A + B||∗ ≤ ||A||∗ + ||B||∗

Page 17: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Proof of characterization

For any M ∈ Rm×n, U ∈ Rm×m, V ∈ Rn×n, if UTU = I , V TV = I then

||UMV || = ||M||

For any M ∈ Rn×n

max1≤i≤n

|Mii | ≤ ||M||

Page 18: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Experiment

Compare rank, operator norm, Frobenius norm and nuclear norm of

M (t) :=

0.5 + t 1 10.5 0.5 t0.5 1− t 0.5

for different values of t

Page 19: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix norms vs rank

1.0 0.5 0.0 0.5 1.0 1.5t

1.0

1.5

2.0

2.5

3.0Rank

Operator norm

Frobenius norm

Nuclear norm

Page 20: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low-rank matrix estimation

First idea:

minX∈Rm×n

rank(X)

such that XΩ = y

Ω: indices of revealed entriesy : revealed entries

Computationally intractable because of missing entries

Tractable alternative:

minX∈Rm×n

∣∣∣∣∣∣X ∣∣∣∣∣∣∗

such that XΩ = y

Page 21: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low-rank matrix estimation

If data are noisy

minX∈Rm×n

∣∣∣∣∣∣XΩ − y∣∣∣∣∣∣2

2+ λ

∣∣∣∣∣∣X ∣∣∣∣∣∣∗

where λ > 0 is a regularization parameter

Page 22: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completion via nuclear-norm minimization

Bob Molly Mary Larry

1 2 (1) 5 4 The Dark Knight

2 (2) 1 4 5 Spiderman 34 5 2 2 (1) Love Actually5 4 2 1 Bridget Jones’s Diary4 5 1 2 Pretty Woman1 2 5 (5) 5 Superman 2

Page 23: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 24: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Exact recovery

Guarantees by Gross 2011, Candès and Recht 2008, Candès and Tao 2009

minX∈Rm×n

∣∣∣∣∣∣X ∣∣∣∣∣∣∗

such that XΩ = y

achieves exact recovery with high probability as long as the number ofsamples is proportional to r (n + m) up to log terms

The proof is based on the construction of a dual certificate

Page 25: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Subgradient of nuclear norm

Let M = UΣV T . Any matrix of the form

G := UV T + W

where

||W || ≤ 1

UTW = 0W V = 0

is a subgradient of the nuclear norm at M, so that

||M + H||∗ ≥ ||M||∗ + 〈G ,H〉 for any H

Page 26: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Proof

Follows from

||A||∗ = sup||B||≤1

〈A,B〉

Page 27: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Dual certificate

Let M = UΣV T . A dual certificate Q of the optimization problem

minX∈Rm×n

∣∣∣∣∣∣X ∣∣∣∣∣∣∗

such that XΩ = y

is any matrix supported on Ω such that

Q = UV T + W

||W || < 1

UTW = 0W V = 0

Page 28: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Dual certificate

UV T = Q −W where Q is supported on Ω and ||W || < 1

If U or V are not incoherent, UV T might have large entries not in Ω

Proof of existence relies on concentration bounds

Page 29: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 30: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Proximal gradient method

Method to solve the optimization problem

minimize f (x) + g (x) ,

where f is differentiable and proxg is tractable

Proximal-gradient iteration:

x (0) = arbitrary initialization

x (k+1) = proxαk g

(x (k) − αk ∇f

(x (k)

))

Page 31: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Proximal operator of nuclear norm

The solution X to

minX∈Rm×n

12

∣∣∣∣∣∣Y − X∣∣∣∣∣∣2

F+ τ

∣∣∣∣∣∣X ∣∣∣∣∣∣∗

is obtained by soft-thresholding the SVD of Y

X = Dτ (Y )

Dτ (M) := U Sτ (Σ) V T where M = U ΣV T

Sτ (Σ)ii :=

Σii − τ if Σii > τ

0 otherwise

Page 32: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Proximal gradient method

Proximal gradient method for the problem

minX∈Rm×n

∣∣∣∣∣∣XΩ − y∣∣∣∣∣∣2

2+ λ

∣∣∣∣∣∣X ∣∣∣∣∣∣∗

X (0) = arbitrary initialization

M(k) = X (k) − αk

(X (k)

Ω − y)

X (k+1) = Dαkλ

(M(k)

)

Page 33: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 34: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low-rank matrix completion

Intractable problem

minX∈Rm×n

rank(X)

such that XΩ ≈ y

Nuclear norm: convex (,) but computationally expensive (/)due to SVD computations

Page 35: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Alternative

I Fix rank k beforehand

I Parametrize the matrix as AB where A ∈ Rm×k and B ∈ Rk×n

I Solve

minA∈Rm×k ,B∈Rk×n

∣∣∣∣∣∣(AB)

Ω− y∣∣∣∣∣∣

2

by alternating minimization

Page 36: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Alternating minimization

Sequence of least-squares problems (much faster than computing SVDs)

I To compute A(k) fix B(k−1) and solve

minA∈Rm×k

∣∣∣∣∣∣(AB(k−1))

Ω− y∣∣∣∣∣∣

2

I To compute B(k) fix A(k) and solve

minB∈Rk×n

∣∣∣∣∣∣(A(k)B)

Ω− y∣∣∣∣∣∣

2

Theoretical guarantees: Jain, Netrapalli, Sanghavi 2013

Page 37: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 38: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 39: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Collaborative filtering

A :=

Bob Molly Mary Larry

1 1 5 5 The Dark Knight1 1 5 5 Spiderman 35 5 1 1 Love Actually5 5 1 1 Bridget Jones’s Diary5 5 1 1 Pretty Woman1 1 5 5 Superman 2

Page 40: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

SVD

A− A = UΣV T = U

9.798 0 0 00 0 0 00 0 0 00 0 0 0

V T

Page 41: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

First left singular vector

U1 =D. Knight Sp. 3 Love Act. B.J.’s Diary P. Woman Sup. 2

( )−0.4082 −0.4082 0.4082 0.4082 0.4082 −0.4082

Interpretations:

I Score atom: Centered scores for each person are proportional to U1

I Coefficients: They cluster movies into action (-) and romantic (+)

Page 42: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

First right singular vector

V1 =Bob Molly Mary Larry

( )0.5 0.5 −0.5 −0.5

Interpretations:

I Score atom: Centered scores for each movie are proportional to V1

I Coefficients: They cluster people into action (-) and romantic (+)

Page 43: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Outliers

A :=

Bob Molly Mary Larry

5 1 5 5 The Dark Knight1 1 5 5 Spiderman 35 5 1 1 Love Actually5 5 1 1 Bridget Jones’s Diary5 5 1 1 Pretty Woman1 1 5 1 Superman 2

Page 44: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

SVD

A− A = UΣV T = U

8.543 0 0 00 4.000 0 00 0 2.649 00 0 0 0

V T

Without outliers

A− A = UΣV T = U

9.798 0 0 00 0 0 00 0 0 00 0 0 0

V T

Page 45: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

First left singular vector

U1 =D. Knight Sp. 3 Love Act. B.J.’s Diary P. Woman Sup. 2

( )−0.2610 −0.4647 0.4647 0.4647 0.4647 −0.2610

Without outliers

U1 =D. Knight Sp. 3 Love Act. B.J.’s Diary P. Woman Sup. 2

( )−0.4082 −0.4082 0.4082 0.4082 0.4082 −0.4082

Page 46: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

First right singular vector

V1 =Bob Molly Mary Larry

( )0.4352 0.5573 −0.5573 −0.4352

Without outliers

V1 =Bob Molly Mary Larry

( )0.5 0.5 −0.5 −0.5

Page 47: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

PCA without outliers

σ1√n

= 1.042σ2√n

= 0.192

u2

u1

Page 48: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

PCA with outliers

σ1√n

= 1.774σ2√n

= 0.633

u2

u1

Page 49: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 50: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low rank + sparse model

Sum of a low-rank component L and a sparse component S

+ =

L S M

Page 51: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low-rank component cannot be sparse

0 0 0 0 0 00 0 0 23 0 00 0 0 0 0 00 0 0 0 0 0

+

0 0 0 0 0 00 0 0 0 0 00 0 0 0 0 00 0 0 0 47 0

=

0 0 0 0 0 00 0 0 0 0 00 0 0 0 0 00 0 0 0 0 0

+

0 0 0 0 0 00 0 0 23 0 00 0 0 0 0 00 0 0 0 47 0

Page 52: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Incoherence

Low-rank component must be incoherent

For L = U Σ V T

max1≤i≤r ,1≤j≤m

|Uij | ≤ µ

max1≤i≤r ,1≤j≤n

|Vij | ≤ µ

where 1/√

n ≤ µ ≤ 1

Page 53: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Sparse component cannot be low rank

1 1 1 1 1 11 1 1 1 1 11 1 1 1 1 11 1 1 1 1 1

+

0 0 0 0 0 00 0 0 0 0 00 0 0 0 0 01 1 1 1 1 1

=

1 1 1 1 1 11 1 1 1 1 11 1 1 1 1 12 2 2 2 2 2

+

0 0 0 0 0 00 0 0 0 0 00 0 0 0 0 00 0 0 0 0 0

Assumption: Support distributed uniformly at random(doesn’t hold in practice!)

Page 54: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Nuclear norm + `1-norm

We want to promote a low-rank L and a sparse S

minL,S∈Rm×n

∣∣∣∣∣∣L∣∣∣∣∣∣∗

+ λ∣∣∣∣∣∣S∣∣∣∣∣∣

1such that L + S = Y

Here ||·||1 is the `1 norm of the vectorized matrix

Page 55: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Choice of λ

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣1 1 1 11 1 1 11 1 1 11 1 1 1

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∗

= n

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣1 1 1 11 1 1 11 1 1 11 1 1 1

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣1

= n2

λ >1n

Page 56: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Choice of λ

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣0 0 0 00 1 0 00 0 0 00 0 0 0

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∗

= 1

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣0 0 0 00 1 0 00 0 0 00 0 0 0

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣1

= 1

λ < 1

Page 57: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Choice of λ

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣1 1 1 10 0 0 00 0 0 00 0 0 0

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∗

=√

n

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣1 1 1 10 0 0 00 0 0 00 0 0 0

∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣1

= n

λ ≈ 1√n

Page 58: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

L + S

Page 59: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

λ = 1√n

L S

Page 60: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Large λ

L S

Page 61: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Small λ

L S

Page 62: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 63: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Exact recovery

Guarantees by Candès, Li, Ma, Wright 2011

minL,S∈Rn×n

∣∣∣∣∣∣L∣∣∣∣∣∣∗

+ λ∣∣∣∣∣∣S∣∣∣∣∣∣

1such that L + S = Y

achieves an exact decomposition with high probability for

I rank (L) of order n if L is incoherent

I a sparsity level of S of order n2 if its support is random

The proof is based on the construction of a dual certificate

Page 64: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Dual certificate

Let L = UΣV T and Ω be the support of S

A dual certificate Q of the optimization problem

minL,S∈Rm×n

∣∣∣∣∣∣L∣∣∣∣∣∣∗

+ λ∣∣∣∣∣∣S∣∣∣∣∣∣

1such that L + S = Y

is any matrix such that

Q = UV T + W = λ sign (S) + F

||W || < 1 UTW = 0 W V = 0

FΩ = 0 ||F ||∞ < λ

Page 65: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 66: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Convex program with equality constraints

Canonical problem with linear equality constraints

minimize f (x)

subject to Ax = y

Lagrangian

L (x , z) := f (x) + 〈z ,Ax − y〉

z is a Lagrange multiplier

Dual function

g (z) := infx

f (x) + 〈z ,Ax − y〉

Page 67: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Convex program with equality constraints

If strong duality holds for the optimal x∗ and z∗

f (x∗) = g (z∗)= inf

xL (x , z∗)

≤ f (x∗)

If x∗ is unique and we know z∗, we can compute x∗ by solving

minimize L (x , z∗)

Page 68: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Dual-ascent method

Find z∗ using gradient ascent

Iterations:

I Primal variable update

x (k) = argminxL(x , z(k)

)I Compute gradient of dual function at z(k)

∇g(z(k)

)= Ax (k) − y

I Dual variable update

z(k+1) = z(k) + α(k)∇g(z(k)

)

Page 69: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Augmented Lagrangian

Aim: Make dual-ascent method more robust

Augmented Lagrangian

Lρ (x , z) := f (x) + 〈z ,Ax − y〉+ρ

2||Ax − y ||22

Lagrangian of modified problem

minimize f (x) +ρ

2||Ax − y ||22

subject to Ax = y

Page 70: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Method of multipliers

Iterations:

I Primal variable update

x (k) = argminxLρ(x , z(k)

)I Compute z(k+1) such that

∇xL(x (k), z(k+1)

)= 0

Page 71: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Dual update

We have

∇xLρ(x (k), z(k)

)= 0

∇xLρ(x (k), z(k)

)= ∇x f

(x (k)

)+ AT

(z(k) + ρ (Ax − y)

)

∇xL(x (k), z(k) + ρ (Ax − y)

)= ∇x f

(x (k)

)+ AT

(z(k) + ρ (Ax − y)

)So we can use the dual-ascent update with αk = ρ

z(k+1) = z(k) + ρ (Ax − y)

Page 72: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Alternating direction method of multipliers (ADMM)

Apply same ideas to

minimize f1 (x1) + f2 (x2)

subject to Ax1 + Bx2 = y

Page 73: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Alternating direction method of multipliers (ADMM)

Iterations:

I Primal variable updates

x (k)1 = argmin

xLρ(x , x (k−1)

2 , z(k))

x (k)2 = argmin

xLρ(x (k)1 , x , z(k)

)I Dual variable update

z(k+1) = z(k) + ρ(Ax (k)

1 + Bx (k)2 − y

)

Page 74: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

ADMM for robust PCA

Robust PCA problem

minL,S∈Rn×n

||L||∗ + λ ||S ||1 such that L + S = Y

Augmented Lagrangian

||L||∗ + λ ||S ||1 + 〈Z , L + S − Y 〉+ρ

2||L + S −M||2F

Page 75: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Primal updates

L(k) = argminLLρ(L, S (k−1),Z (k)

)= argmin

L||L||∗ +

⟨Z (k), L

⟩+ρ

2

∣∣∣∣∣∣L + S (k−1) −M∣∣∣∣∣∣2

F

= D1/ρ

(1ρZ (k) + S (k−1) −M

)

S (k) = argminSLρ(S , L(k−1),Z (k)

)= argmin

Sλ ||S ||1 +

⟨Z (k), S

⟩+ρ

2

∣∣∣∣∣∣L(k−1) + S −M∣∣∣∣∣∣2

F

= Sλ/ρ(1ρZ (k) + L(k) −M

)

Page 76: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

ADMM for robust PCA

Iterations:

I Primal variable updates

L(k) = D1/ρ

(1ρZ (k) + S (k−1) −M

)

S (k) = Sλ/ρ(1ρZ (k) + L(k) −M

)I Dual variable update

Z (k+1) = Z (k) + ρ(L(k) + S (k) −M

)

Page 77: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Matrix completionThe matrix completion problemNuclear normTheoretical guaranteesAlgorithmsAlternating minimization

Robust PCAOutliersLow rank + sparse modelTheoretical guaranteesAlgorithmsBackground subtraction

Page 78: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Background subtraction

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 79: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Frame 17

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 80: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low-rank component

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 81: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Sparse component

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 82: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Frame 42

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 83: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low-rank component

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 84: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Sparse component

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 85: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Frame 75

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 86: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Low-rank component

20 40 60 80 100 120 140 160

20

40

60

80

100

120

Page 87: Low-rank models - NYU Courantcfgranda/pages/OBDA_spring16... · 1 25(5) 5 Superman2. Matrixcompletion Thematrixcompletionproblem Nuclearnorm Theoreticalguarantees Algorithms Alternatingminimization

Sparse component

20 40 60 80 100 120 140 160

20

40

60

80

100

120