TRANSCRIPT
ELEG 867 - Compressive Sensing and Sparse Signal Representations
Introduction to Matrix Completion and Robust PCA
Gonzalo Garateguy
Department of Electrical and Computer Engineering
University of Delaware
Fall 2011
Matrix Completion Problems - Motivation

Recommender Systems

         Items
User 1:  x  x  ?  ?  x  x
User 2:  ?  ?  x  x  ?  ?
   .     ?  x  ?  x  x  ?
   .     x  ?  ?  x  ?  x
   .     x  ?  x  ?  ?  x
   .     ?  x  ?  ?  x  ?
   .     ?  ?  x  x  x  ?
User n:  x  x  ?  ?  ?  x

Collaborative filtering (Amazon, last.fm)
Content based (Pandora, www.nanocrowd.com)
The Netflix prize competition boosted interest in the area
http://www.ima.umn.edu/videos/index.php?id=1598
http://sahd.pratt.duke.edu/Videos/keynote.html
Matrix Completion Problems - Motivation

Sensor location estimation in Wireless Sensor Networks

[Figure: a network of 7 nodes with some pairwise distances observed
(d12, d13, d24, d34, d45, d56, d57, d67) and others unknown (d23, d43, d74, d64, ...)]

Distance matrix
      1        2        3        4        5        6        7
1     0      d_{1,2}  d_{1,3}    ?        ?        ?        ?
2   d_{2,1}    0        ?      d_{2,4}    ?        ?        ?
3   d_{3,1}    ?        0      d_{3,4}    ?        ?        ?
4     ?      d_{4,2}  d_{4,3}    0      d_{4,5}    ?        ?
5     ?        ?        ?      d_{5,4}    0      d_{5,6}  d_{5,7}
6     ?        ?        ?        ?      d_{6,5}    0      d_{6,7}
7     ?        ?        ?        ?      d_{7,5}  d_{7,6}    0

The problem is to find the positions of the sensors in $\mathbb{R}^2$ given partial
information about the relative distances.
A distance matrix like this has rank 2 in $\mathbb{R}^2$.
For certain types of graphs the problem can be solved if we know the whole distance matrix.
Matrix Completion Problems - Motivation

Image reconstruction from incomplete data

[Figure: reconstructed image vs. incomplete image with 50% of the pixels]
Robust PCA - Motivation

Foreground identification for surveillance applications

E.J. Candes, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" http://arxiv.org/abs/0912.3599
Robust PCA - Motivation

Image alignment and texture recognition

Z. Zhang, X. Liang, A. Ganesh, and Y. Ma, "TILT: Transform Invariant Low-rank Textures," Computer Vision - ACCV 2010
Robust PCA - Motivation

Camera calibration with radial distortion

J. Wright, Z. Lin, and Y. Ma, "Low-Rank Matrix Recovery: From Theory to Imaging Applications," tutorial presented at the International Conference on Image and Graphics (ICIG), August 2011
Motivation

Many other applications:
System identification in control theory
Covariance matrix estimation
Machine learning
Computer vision

Videos to watch:
Matrix Completion via Convex Optimization: Theory and Algorithms, by Emmanuel Candes
http://videolectures.net/mlss09us_candes_mccota/
Low Dimensional Structures in Images or Data, by Yi Ma, Workshop on Signal Processing with Adaptive Sparse Structured Representations (June 2011)
http://ecos.maths.ed.ac.uk/SPARS11/YiMa.wmv
Problem Formulation

Matrix completion
minimize    $\mathrm{rank}(A)$                                (1)
subject to  $A_{ij} = D_{ij}$,  $(i,j) \in \Omega$

Robust PCA
minimize    $\mathrm{rank}(A) + \lambda\|E\|_0$               (2)
subject to  $A_{ij} + E_{ij} = D_{ij}$,  $(i,j) \in \Omega$

Very hard to solve in general without any assumptions, sometimes NP-hard.
Even if we can solve them, are the solutions always what we expect?
Under which conditions can we have exact recovery of the real matrices?
Outline

Convex optimization concepts
Matrix Completion
  Exact recovery from incomplete data by convex relaxation
  ALM method for nuclear norm minimization
Robust PCA
  Exact recovery from incomplete and corrupted data by convex relaxation
  ALM method for low-rank and sparse separation
Convex sets and Convex functions

Convex set
A set $C$ is convex if the line segment between any two points in $C$ lies in $C$:
for any $x_1, x_2 \in C$ and any $\theta$ with $0 \le \theta \le 1$ we have
$\theta x_1 + (1-\theta)x_2 \in C$.

[Figure: examples of a convex set and two non-convex sets]
Convex sets and Convex functions

Convex combination
A convex combination of $k$ points $x_1, \dots, x_k$ is defined as
$\theta_1 x_1 + \dots + \theta_k x_k$, where $\theta_i \ge 0$ and $\theta_1 + \dots + \theta_k = 1$.

Convex hull
The convex hull of $C$ is the set of all convex combinations of points in $C$:
$\mathrm{conv}\,C = \{\theta_1 x_1 + \dots + \theta_k x_k \mid x_i \in C,\ \theta_i \ge 0,\ i = 1,\dots,k,\ \theta_1 + \dots + \theta_k = 1\}$.
Convex sets and Convex functions

Operations that preserve convexity

Intersection
If $S_1$ and $S_2$ are convex, then $S_1 \cap S_2$ is convex.
In general, if $S_\alpha$ is convex for every $\alpha \in \mathcal{A}$, then $\bigcap_{\alpha \in \mathcal{A}} S_\alpha$ is convex.
Subspaces, affine sets and convex cones are therefore closed under arbitrary intersections.

Affine functions
Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be affine, $f(x) = Ax + b$, where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$.
If $S \subseteq \mathbb{R}^n$ is convex, then the image of $S$ under $f$,
$f(S) = \{f(x) \mid x \in S\}$,
is convex.
Convex sets and Convex functions

Convex functions
A function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if $\mathrm{dom}\,f$ is a convex set and if for all
$x, y \in \mathrm{dom}\,f$ and $\theta$ with $0 \le \theta \le 1$ we have
$f(\theta x + (1-\theta)y) \le \theta f(x) + (1-\theta)f(y)$.
We say that $f$ is strictly convex if the strict inequality holds whenever $x \ne y$
and $0 < \theta < 1$.
Operations that preserve convexity

Composition with an affine mapping
Suppose $f : \mathbb{R}^n \to \mathbb{R}$, $A \in \mathbb{R}^{n \times m}$ and $b \in \mathbb{R}^n$. Define $g : \mathbb{R}^m \to \mathbb{R}$ by
$g(x) = f(Ax + b)$, with $\mathrm{dom}\,g = \{x \mid Ax + b \in \mathrm{dom}\,f\}$. Then if $f$ is convex, so is $g$.

Pointwise maximum
If $f_1$ and $f_2$ are convex functions then their pointwise maximum $f$, defined by
$f(x) = \max\{f_1(x), f_2(x)\}$ with $\mathrm{dom}\,f = \mathrm{dom}\,f_1 \cap \mathrm{dom}\,f_2$, is also convex.
This also extends to the case where $f_1, \dots, f_m$ are convex; then
$f(x) = \max\{f_1(x), \dots, f_m(x)\}$ is also convex.
Pointwise maximum of convex functions

[Figure: two convex functions $f_1(x)$, $f_2(x)$ and their pointwise maximum $f(x) = \max\{f_1(x), f_2(x)\}$]
Convex sets and Convex functions

Convex differentiable functions
If $f$ is differentiable (i.e. its gradient $\nabla f$ exists at each point in $\mathrm{dom}\,f$), then $f$
is convex if and only if $\mathrm{dom}\,f$ is convex and
$f(y) \ge f(x) + \nabla f(x)^T(y - x)$
holds for all $x, y \in \mathrm{dom}\,f$.
Second order conditions
If $f$ is twice differentiable, i.e. its Hessian $\nabla^2 f$ exists at each point in $\mathrm{dom}\,f$, then
$f$ is convex if and only if $\mathrm{dom}\,f$ is convex and its Hessian is positive semidefinite
for all $x \in \mathrm{dom}\,f$:
$\nabla^2 f(x) \succeq 0$
Convex non-differentiable functions
The concept of gradient can be extended to non-differentiable functions by introducing
the subgradient.

Subgradient of a function
A vector $g \in \mathbb{R}^n$ is a subgradient of $f : \mathbb{R}^n \to \mathbb{R}$ at $x \in \mathrm{dom}\,f$ if for all $z \in \mathrm{dom}\,f$
$f(z) \ge f(x) + g^T(z - x)$
Subgradients

Observations
If $f$ is convex and differentiable, then its gradient at $x$, $\nabla f(x)$, is its only subgradient.

Subdifferentiable functions
A function $f$ is called subdifferentiable at $x$ if there exists at least one subgradient at $x$.

Subdifferential at a point
The set of subgradients of $f$ at the point $x$ is called the subdifferential of $f$ at $x$,
and is denoted $\partial f(x)$.

Subdifferentiability of a function
A function $f$ is called subdifferentiable if it is subdifferentiable at all $x \in \mathrm{dom}\,f$.
Basic properties

Existence of the subgradient of a convex function
If $f$ is convex and $x \in \mathrm{int}\,\mathrm{dom}\,f$, then $\partial f(x)$ is nonempty and bounded.

The subdifferential $\partial f(x)$ is always a closed convex set, even if $f$ is not convex. This
follows from the fact that it is the intersection of an infinite set of halfspaces:
$\partial f(x) = \bigcap_{z \in \mathrm{dom}\,f} \{g \mid f(z) \ge f(x) + g^T(z - x)\}$.
Basic properties

Nonnegative scaling
For $\alpha \ge 0$, $\partial(\alpha f)(x) = \alpha\,\partial f(x)$.

Subgradient of the sum
Given $f = f_1 + \dots + f_m$, where $f_1, \dots, f_m$ are convex functions, the subgradient of $f$
at $x$ is given by $\partial f(x) = \partial f_1(x) + \dots + \partial f_m(x)$.

Affine transformations of domain
Suppose $f$ is convex, and let $h(x) = f(Ax + b)$. Then $\partial h(x) = A^T\partial f(Ax + b)$.

Pointwise maximum
Suppose $f$ is the pointwise maximum of convex functions $f_1, \dots, f_m$,
$f(x) = \max_{i=1,\dots,m} f_i(x)$; then $\partial f(x) = \mathrm{Co}\,\bigcup\{\partial f_i(x) \mid f_i(x) = f(x)\}$,
the convex hull of the subdifferentials of the active functions.
Subgradient of the pointwise maximum of two convex functions

[Figure, shown in three steps: two convex functions $f_1(x)$, $f_2(x)$, their pointwise maximum
$f(x) = \max\{f_1(x), f_2(x)\}$, and the subgradients at a point $x$ where both functions are active]
Examples

Consider the function $f(x) = |x|$. At $x_0 = 0$, the subdifferential is defined by the inequality
$f(z) \ge f(x_0) + g(z - x_0)$, $\forall z \in \mathrm{dom}\,f$
$|z| \ge gz$, $\forall z \in \mathbb{R}$
$\partial f(0) = \{g \mid g \in [-1, 1]\}$
Then for all $x$
$\partial f(x) = \begin{cases} \{-1\} & \text{for } x < 0 \\ \{1\} & \text{for } x > 0 \\ \{g \mid g \in [-1, 1]\} & \text{for } x = 0 \end{cases}$
Example: $\ell_1$ norm

Consider $f(x) = \|x\|_1 = |x_1| + \dots + |x_n|$, and note that $f$ can be expressed as the
maximum of $2^n$ linear functions
$\|x\|_1 = \max\{f_1(x), \dots, f_{2^n}(x)\}$
$\|x\|_1 = \max\{s_1^Tx, \dots, s_{2^n}^Tx \mid s_i \in \{-1, 1\}^n\}$
The active functions $f_i(x)$ at $x$ are the ones for which $s_i^Tx = \|x\|_1$. Then, denoting
$s_i = [s_{i,1}, \dots, s_{i,n}]^T$, $s_{i,j} \in \{-1, 1\}$,
the set of indices of the active functions at $x$ is
$A_x = \{i : s_{i,j} = -1 \text{ for } x_j < 0,\ s_{i,j} = 1 \text{ for } x_j > 0,\ s_{i,j} = -1 \text{ or } 1 \text{ for } x_j = 0,\ j = 1, \dots, n\}$
Subgradient of the $\ell_1$ norm

The subgradient of $\|x\|_1$ at a generic point $x$ is defined by
$\partial\|x\|_1 = \mathrm{co}\{\nabla f_i(x) \mid i \in A_x\}$
$\partial\|x\|_1 = \mathrm{co}\{s_i \mid i \in A_x\}$
$\partial\|x\|_1 = \{g \mid g = \sum_{i \in A_x} \theta_i s_i,\ \theta_i \ge 0,\ \sum_i \theta_i = 1\}$
or equivalently
$\partial\|x\|_1 = \{g : g_j = -1 \text{ for } x_j < 0,\ g_j = 1 \text{ for } x_j > 0,\ g_j \in [-1, 1] \text{ for } x_j = 0\}$
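A minimal Python sketch of the entrywise characterization above (not from the slides; the point $x$ and the random test directions are illustrative): one valid subgradient of $\|x\|_1$ is $g = \mathrm{sign}(x)$, and it satisfies the subgradient inequality.

```python
# One subgradient of ||x||_1 and a check of f(z) >= f(x) + g^T (z - x).
import numpy as np

def l1_subgradient(x):
    # sign(x) picks g_j = -1 or 1 where x_j != 0 and g_j = 0 (which lies in [-1, 1]) where x_j = 0
    return np.sign(x)

rng = np.random.default_rng(0)
x = np.array([1.5, 0.0, -2.0, 0.0])
g = l1_subgradient(x)
for _ in range(1000):
    z = 3.0 * rng.standard_normal(x.shape)
    assert np.sum(np.abs(z)) >= np.sum(np.abs(x)) + g @ (z - x) - 1e-12
```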
$\ell_1$ norm on $\mathbb{R}^2$

In $\mathbb{R}^2$, at the origin, the set of subgradients is the convex hull of
$s_1 = [1, 1]^T$, $s_2 = [-1, 1]^T$, $s_3 = [-1, -1]^T$, $s_4 = [1, -1]^T$

[Figure: the subdifferential of the $\ell_1$ norm on $\mathbb{R}^2$]
Convex optimization problems

An optimization problem is convex if its objective is a convex function, the inequality
constraint functions $f_i$ are convex and the equality constraint functions $h_j$ are affine
minimize$_x$   $f_0(x)$  (convex function)
s.t.           $f_i(x) \le 0$  (convex sets)
               $h_j(x) = 0$  (affine)
or equivalently
minimize$_x$   $f_0(x)$  (convex function)
s.t.           $x \in C$,  $C$ is a convex set
               $h_j(x) = 0$  (affine)
Theorem
If $x$ is a local minimizer of a convex optimization problem, it is a global minimizer.

Optimality conditions
A point $x$ is a minimizer of a convex function $f$ if and only if $f$ is subdifferentiable
at $x$ and
$0 \in \partial f(x)$
Convex optimization problems

Given the convex problem
minimize$_x$   $f_0(x)$
s.t.           $f_i(x) \le 0$, $i = 1, \dots, k$
               $h_j(x) = 0$, $j = 1, \dots, l$
its Lagrangian function is defined as
$L(x, \lambda, \nu) = f_0(x) + \sum_{j=1}^{l} \nu_j h_j(x) + \sum_{i=1}^{k} \lambda_i f_i(x)$
where $\lambda_i \ge 0$, $\nu_j \in \mathbb{R}$.
Augmented Lagrangian Method

Considering the problem
minimize$_x$   $f(x)$
s.t.           $x \in C$,  $h(x) = 0$                          (3)
the augmented Lagrangian is defined as
$L(x, \lambda, \mu) = f(x) + \lambda^Th(x) + \frac{\mu}{2}\|h(x)\|_2^2$
where $\mu$ is a penalty parameter and $\lambda$ is the multiplier vector.
Augmented Lagrangian Method

The augmented Lagrangian method consists of solving a sequence of problems of the form
minimize$_x$   $L(x, \lambda_k, \mu_k) = f(x) + \lambda_k^Th(x) + \frac{\mu_k}{2}\|h(x)\|_2^2$
s.t.           $x \in C$
where $\{\lambda_k\}$ is a bounded sequence in $\mathbb{R}^l$ and $\{\mu_k\}$ is a penalty parameter sequence
satisfying
$0 < \mu_k < \mu_{k+1}$ for all $k$, with $\mu_k \to \infty$.
Augmented Lagrangian Method

The exact solution to problem (3) can be found using the following iterative algorithm
set $\rho > 1$
while not converged do
    solve $x_{k+1} = \arg\min_{x \in C} L(x, \lambda_k, \mu_k)$
    $\lambda_{k+1} = \lambda_k + \mu_k h(x_{k+1})$
    $\mu_{k+1} = \rho\mu_k$
end while
Output $x_k$
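A minimal numerical sketch of this generic ALM loop (not from the slides; the toy objective, penalty schedule, and stopping rule are illustrative assumptions): minimize $f(x) = x^2$ subject to $h(x) = x - 1 = 0$, whose solution is $x = 1$, and whose inner minimization has a closed form.

```python
# Generic augmented Lagrangian method (ALM) on a toy problem:
#   minimize x^2   subject to x - 1 = 0   (true solution: x = 1)
def alm_toy(mu0=1.0, rho=2.0, tol=1e-8, max_iter=100):
    lam, mu, x = 0.0, mu0, 0.0
    for _ in range(max_iter):
        # x_{k+1} = argmin_x  x^2 + lam*(x - 1) + (mu/2)*(x - 1)^2  (closed form here)
        x = (mu - lam) / (2.0 + mu)
        h = x - 1.0              # constraint violation
        lam = lam + mu * h       # multiplier update
        mu = rho * mu            # penalty update
        if abs(h) < tol:
            break
    return x

print(alm_toy())   # -> approximately 1.0
```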
Matrix completion

Optimization problem
minimize    $\mathrm{rank}(A)$                                (4)
subject to  $A_{ij} = D_{ij}$,  $(i,j) \in \Omega$

We look for the simplest explanation for the observed data.
Given a large enough number of samples, the likelihood of the solution being unique
should be high.
Matrix completion

minimize    $\mathrm{rank}(A)$
subject to  $A_{ij} = D_{ij}$,  $(i,j) \in \Omega$

The minimization of the rank$(\cdot)$ function is a combinatorial problem, with exponential
complexity in the size of the matrix!
Need for a convex relaxation:
$\mathrm{rank}(A) = \|\mathrm{diag}(\Sigma)\|_0$, where $A = U\Sigma V^T$
$\|A\|_* = \|\mathrm{diag}(\Sigma)\|_1$

Convex relaxation
minimize    $\|A\|_*$                                         (5)
subject to  $A_{ij} = D_{ij}$,  $(i,j) \in \Omega$
Matrix Completion

Nuclear Norm
The nuclear norm of a matrix $A \in \mathbb{R}^{m \times n}$ is defined as $\|A\|_* = \sum_{i=1}^{r} \sigma_i(A)$, where
$\{\sigma_i(A)\}_{i=1}^{r}$ are the elements of the diagonal matrix $\Sigma$ from the SVD decomposition
$A = U\Sigma V^T$.

Observations
$r = \mathrm{rank}(A)$ can be $r < m, n$. If this is the case we say that the matrix is low rank.
The singular values $\sigma_i(A) = \sqrt{\lambda_i(A^TA)}$ are obtained as the square roots of the
eigenvalues of $A^TA$ and always satisfy $\sigma_i \ge 0$.
The left singular vectors $U$ are the eigenvectors of $AA^T$.
The right singular vectors $V$ are the eigenvectors of $A^TA$.
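A short NumPy sketch of these quantities (not from the slides; the random rank-2 test matrix is illustrative): the SVD gives the singular values, the numerical rank, the nuclear norm, and the relation to the eigenvalues of $A^TA$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))   # rank-2 matrix in R^{6x5}

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) V^T
rank = int(np.sum(s > 1e-10))                      # numerical rank (should be 2)
nuclear_norm = s.sum()                             # ||A||_* = sum of singular values

# sigma_i(A)^2 equals the eigenvalues of A^T A (up to numerical error)
eigs = np.linalg.eigvalsh(A.T @ A)
assert np.allclose(np.sort(s**2), np.sort(eigs), atol=1e-8)

print(rank, nuclear_norm)
```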
Matrix Completion

Spectral Norm
The spectral norm of a matrix $A \in \mathbb{R}^{m \times n}$ is defined as $\|A\|_2 = \sigma_{\max}(A)$, where
$\sigma_{\max} = \max(\{\sigma_i(A)\}_{i=1}^{r})$.

Dual Norm
Given an arbitrary norm $\|\cdot\|$ in $\mathbb{R}^n$, its dual norm $\|\cdot\|^*$ is defined as
$\|z\|^* = \sup\{z^Tx \mid \|x\| \le 1\}$

Observations
The nuclear norm is the dual norm of the spectral norm:
$\|A\|_* = \sup\{\mathrm{tr}(A^TX) \mid \|X\|_2 \le 1\}$
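A small numerical check of this duality (not from the slides; the test matrix and the spot-check loop are illustrative): the supremum is attained at $X = UV^T$, which is feasible since $\|UV^T\|_2 = 1$, and it yields $\mathrm{tr}(A^TUV^T) = \sum_i \sigma_i = \|A\|_*$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

X = U @ Vt                                        # candidate maximizer with ||X||_2 = 1
assert np.isclose(np.linalg.norm(X, 2), 1.0)
assert np.isclose(np.trace(A.T @ X), s.sum())     # tr(A^T X) = ||A||_*

for _ in range(100):                              # other feasible X give no larger value
    Y = rng.standard_normal(A.shape)
    Y = Y / np.linalg.norm(Y, 2)                  # scale so that ||Y||_2 = 1
    assert np.trace(A.T @ Y) <= s.sum() + 1e-8
```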
Matrix Completion

Convex relaxation of the rank

Convex envelope of a function
Let $f : C \to \mathbb{R}$ where $C \subseteq \mathbb{R}^n$. The convex envelope of $f$ (on $C$) is defined as the largest
convex function $g$ such that $g(x) \le f(x)$ for all $x \in C$.

Theorem
The convex envelope of the function $\phi(X) = \mathrm{rank}(X)$ on $C = \{X \in \mathbb{R}^{m \times n} \mid \|X\|_2 \le 1\}$
is $\phi_{\mathrm{env}}(X) = \|X\|_*$.

Observations
The convex envelope of $\mathrm{rank}(X)$ on the set $\{X \mid \|X\|_2 \le M\}$ is given by $\frac{1}{M}\|X\|_*$.
By solving the heuristic problem we obtain a lower bound on the optimal value of the
original problem (provided we can identify a bound $M$ on the feasible set).
M. Fazel, H. Hindi and S. Boyd, "A Rank Minimization Heuristic with Application to Minimum Order System Approximation," American Control Conference, 2001.
Matrix completion

Convex relaxation
minimize    $\|A\|_*$                                         (6)
subject to  $A_{ij} = D_{ij}$,  $(i,j) \in \Omega$

The original problem is now a problem with a non-smooth but convex function as the objective.
The remaining question is: how many measurements, and in which positions, must be taken
in order to guarantee that the solution is equal to the matrix $D$?
Matrix completion

Which types of matrices can be completed exactly?

Consider the matrix
$M = e_1 e_n^T = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$

Can it be recovered from 90% of its samples?
Is the sampling set important?
Which sampling sets work and which ones don't?
Matrix completion

Sampling set
The sampling set $\Omega$ is defined as $\Omega = \{(i,j) \mid D_{ij} \text{ is observed}\}$.

Consider
$D = xy^T$, $x \in \mathbb{R}^m$, $y \in \mathbb{R}^n$, so that $D_{ij} = x_iy_j$.

If the sampling set avoids row $i$, then $x_i$ cannot be recovered by any method whatsoever.

Observation
No columns or rows of $D$ can be avoided in the sampling set.
There is a need for a characterization of the sampling operator with respect to the set
of matrices that we want to recover.
Matrix completion

Intuition
The singular vectors need to be sufficiently spread, i.e. uncorrelated with the standard
basis, in order to minimize the number of observations needed to recover a low rank matrix.

Coherence of a subspace
Let $U$ be a subspace of $\mathbb{R}^n$ of dimension $r$ and $P_U$ be the orthogonal projection onto $U$.
Then the coherence of $U$ is defined to be
$\mu(U) = \frac{n}{r} \max_{1 \le i \le n} \|P_Ue_i\|^2$

Observations
The minimum value that $\mu(U)$ can achieve is 1, for example if $U$ is spanned by vectors
whose entries all have magnitude $1/\sqrt{n}$.
The largest possible value for $\mu(U)$ is $n/r$, corresponding to a subspace that contains
a standard basis element.
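A short NumPy sketch of this definition (not from the slides; the function name `coherence` and the random rank-5 test matrix are illustrative): for a subspace given by an orthonormal basis $U$, $\|P_Ue_i\|^2$ is simply the squared norm of the $i$-th row of $U$.

```python
import numpy as np

def coherence(U):
    """Coherence mu(U) = (n/r) * max_i ||P_U e_i||^2 for an n x r orthonormal basis U."""
    n, r = U.shape
    row_norms_sq = np.sum(U**2, axis=1)   # ||P_U e_i||^2 when the columns of U are orthonormal
    return (n / r) * row_norms_sq.max()

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 80))   # rank-5 matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = 5
mu_U = coherence(U[:, :r])       # coherence of the column space
mu_V = coherence(Vt[:r, :].T)    # coherence of the row space
print(mu_U, mu_V)                # both values lie in [1, n/r]
```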
Matrix completion

$\mu_0$ coherence
A matrix $D = \sum_{1 \le k \le r} \sigma_ku_kv_k^T$ is $\mu_0$-coherent if for some positive $\mu_0$
$\max(\mu(U), \mu(V)) \le \mu_0$.

$\mu_1$ coherence
A matrix $D = \sum_{1 \le k \le r} \sigma_ku_kv_k^T$ has $\mu_1$ coherence if
$\|UV^T\|_\infty \le \mu_1\sqrt{r/(mn)}$
for some $\mu_1 > 0$.

Observation
If $D$ is $\mu_0$-coherent then it is $\mu_1$-coherent for $\mu_1 = \mu_0\sqrt{r}$.
Coherence of a rank 300 approximation of kowalski

[Figure: $\|P_Ue_i\|$ and $\|P_Ve_i\|$ plotted against the index $i$]

$\mu(U) = 1.9588$, $\mu(V) = 2.2290$, $\mu_0 = 2.2290$
$\mu_1 = \sqrt{mn/r}\,\|UV^T\|_\infty = 13.412$
Matrix completion

Theorem
Let $D \in \mathbb{R}^{m \times n}$ of rank $r$ be $(\mu_0, \mu_1)$-coherent and let $N = \max(m, n)$. If we observe
$M$ entries of $D$ with locations sampled uniformly at random, then there exist constants
$C$ and $c$ such that if
$M \ge C \max(\mu_1^2,\ \mu_0^{1/2}\mu_1,\ \mu_0N^{1/4})\,Nr\,(\beta\log N)$
for some $\beta > 2$, then the minimizer of (6) is unique and equal to $D$ with probability
at least $1 - cN^{-\beta}$. If in addition $r \le \mu_0^{-1}N^{1/5}$, then the number of observations
can be improved to
$M \ge C\,\mu_0\,N^{6/5}r\,(\beta\log N)$.
Candes, E.J. and Recht, B., "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, 2009.
Matrix completion

For the kowalski example:
$\mu_1^2 = 179.99$, $\mu_0^{1/2}\mu_1 = 12.5139$, $\mu_0N^{1/4} = 4.7682$
$\max(\mu_1^2,\ \mu_0^{1/2}\mu_1,\ \mu_0N^{1/4})\,Nr\,(2.1\log N) = 6.6076 \times 10^8$

What is the value of $C$? It must be $C > 0$.
In the limit case $M = mn$, $C = mn / (6.6076 \times 10^8) = 9.194 \times 10^{-4}$.
For the bound to be useful, $0 < C < 9.194 \times 10^{-4}$.
Reconstruction examples

[Figures: image reconstructions from incomplete data at different sampling rates]
SNR = 23.74 dB, 10% of the samples
SNR = 22.52 dB, 25% of the samples
SNR = 25.89 dB, 35% of the samples
SNR = 30.55 dB, 50% of the samples
SNR = 39.51 dB, 70% of the samples
SNR = 42.75 dB, 75% of the samples
SNR = 47.10 dB, 80% of the samples
Completion Performance

[Figure: SNR (dB), from 10 to 70, versus percentage of samples, from 0.1 to 0.9]
Matrix completion

Recovery performance for random matrices

Figure: the x axis corresponds to $\mathrm{rank}(A)/\min\{m, n\}$ and the y axis to
$\rho_s = 1 - M/(mn)$ (the probability that an entry is omitted from the observations).
Emmanuel J. Candes, Xiaodong Li, Yi Ma, John Wright, "Robust Principal Component Analysis?" http://arxiv.org/abs/0912.3599
Matrix completion

Other bounds on the number of measurements and sampling operators:
Emmanuel J. Candes, Xiaodong Li, Yi Ma, John Wright, "Robust Principal Component Analysis?" http://arxiv.org/abs/0912.3599
Venkat Chandrasekaran, Sujay Sanghavi, Pablo A. Parrilo, Alan S. Willsky, "Rank-Sparsity Incoherence for Matrix Decomposition," http://arxiv.org/abs/0906.2220
Zihan Zhou, Xiaodong Li, John Wright, Emmanuel Candes, Yi Ma, "Stable Principal Component Pursuit," http://arxiv.org/abs/1001.2363
Raghunandan H. Keshavan, Andrea Montanari, Sewoong Oh, "Matrix Completion from a Few Entries," http://arxiv.org/abs/0901.3150
Sahand Negahban, Martin J. Wainwright, "Restricted strong convexity and weighted matrix completion: Optimal bounds with noise," http://arxiv.org/abs/1009.2118v2
Yonina C. Eldar, Deanna Needell, Yaniv Plan, "Unicity conditions for low-rank matrix recovery," http://arxiv.org/abs/1103.5479
Solving the problem

Rewriting the problem
minimize    $\|A\|_*$
subject to  $A_{ij} = D_{ij}$,  $(i,j) \in \Omega$

is equivalent to

minimize    $\|A\|_*$
subject to  $A + E = D$,  $\pi_\Omega(E) = 0$

where
$[\pi_\Omega(E)]_{ij} = E_{ij}$ if $(i,j) \in \Omega$, and $0$ if $(i,j) \notin \Omega$
$D_{ij}$ is kept if $(i,j) \in \Omega$, and set to $0$ if $(i,j) \notin \Omega$

The new problem can be solved by the Augmented Lagrangian Method in an efficient way.
Z. Lin, M. Chen, L. Wu and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," http://arxiv.org/abs/1009.5055
Solving the problem

The augmented Lagrangian for the problem
minimize    $\|A\|_*$
subject to  $A + E = D$,  $\pi_\Omega(E) = 0$
is
$L(A, E, Y, \mu) = \|A\|_* + \langle Y, D - A - E\rangle + \frac{\mu}{2}\|D - A - E\|_F^2$    (7)
The traditional iterative method to minimize the augmented Lagrangian can be used here,
but at each iteration the constraint $\pi_\Omega(E) = 0$ has to be fulfilled.
Solving the problem

Algorithm
input: observation samples $D_{ij}$, $(i,j) \in \Omega$
$Y_0 = 0$; $E_0 = 0$; $\mu_0 > 0$; $\rho > 1$; $k = 0$
while not converged
    $A_{k+1} = \arg\min_A L(A, E_k, Y_k, \mu_k)$
    $E_{k+1} = \arg\min_{E,\ \pi_\Omega(E)=0} L(A_{k+1}, E, Y_k, \mu_k)$
    $Y_{k+1} = Y_k + \mu_k(D - A_{k+1} - E_{k+1})$
    $\mu_{k+1} = \rho\mu_k$; $k = k + 1$
end while
Output: $(A_k, E_k)$
Solving the subproblems

Solving for $A_{k+1}$
$A_{k+1} = \arg\min_A L(A, E_k, Y_k, \mu_k)$
$A_{k+1} = \arg\min_A \|A\|_* + \langle Y_k, D - A - E_k\rangle + \frac{\mu_k}{2}\|D - A - E_k\|_F^2$
$A_{k+1} = \arg\min_A \frac{1}{\mu_k}\|A\|_* + \frac{1}{2}\|D - A - E_k + \frac{1}{\mu_k}Y_k\|_F^2$
which has the general form
$\arg\min_A \tau\|A\|_* + \frac{1}{2}\|X - A\|_F^2$
Solving the subproblems

Singular value shrinkage operator
Given a matrix $X = U\Sigma V^T$, the operator $\mathcal{D}_\tau(\cdot) : \mathbb{R}^{m \times n} \to \mathbb{R}^{m \times n}$ is defined as
$\mathcal{D}_\tau(X) = U\,\mathcal{S}_\tau(\Sigma)\,V^T$,  $\mathcal{S}_\tau(\Sigma) = \mathrm{sign}(\Sigma)\max\{|\Sigma| - \tau, 0\}$

Theorem
For each $\tau \ge 0$ and $Y \in \mathbb{R}^{m \times n}$, the singular value shrinkage operator obeys
$\mathcal{D}_\tau(Y) = \arg\min_X \tau\|X\|_* + \frac{1}{2}\|Y - X\|_F^2$
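A minimal NumPy sketch of the operator $\mathcal{D}_\tau$ defined above (not from the slides; the function name `svt` and the test matrix are illustrative):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: D_tau(X) = U * max(Sigma - tau, 0) * V^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)     # soft-threshold the singular values
    return (U * s_shrunk) @ Vt              # same as U @ diag(s_shrunk) @ Vt

rng = np.random.default_rng(0)
Y = rng.standard_normal((8, 6))
A = svt(Y, tau=1.0)   # minimizer of tau*||X||_* + 0.5*||Y - X||_F^2, per the theorem above
```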
Solving the subproblems

Proof:
Consider the function $h_0(X) = \tau\|X\|_* + \frac{1}{2}\|X - Y\|_F^2$.
A sufficient condition for optimality of $X$ is that
$0 \in X - Y + \tau\,\partial\|X\|_*$
where $\partial\|X\|_*$ is the set of subgradients of the nuclear norm at $X$.
We know that for an arbitrary $X = U\Sigma V^T \in \mathbb{R}^{m \times n}$
$\partial\|X\|_* = \{UV^T + W : W \in \mathbb{R}^{m \times n},\ U^TW = 0,\ WV = 0,\ \|W\|_2 \le 1\}$
If we set $\hat{X} = \mathcal{D}_\tau(Y)$ and prove that $Y - \hat{X} \in \tau\,\partial\|\hat{X}\|_*$, then the theorem is concluded.
Decompose $Y = U_0\Sigma_0V_0^T + U_1\Sigma_1V_1^T$, where $U_0, V_0$ are the singular vectors associated
with singular values $> \tau$ and $U_1, V_1$ are the ones associated with values $\le \tau$.
Since $\hat{X} = \mathcal{D}_\tau(Y)$ we can write
$\hat{X} = U_0(\Sigma_0 - \tau I)V_0^T$.
Then
$Y - \hat{X} = U_1\Sigma_1V_1^T + \tau U_0V_0^T = \tau(U_0V_0^T + W)$,  with $W = \tau^{-1}U_1\Sigma_1V_1^T$.
By definition $U_0^TW = 0$, $WV_0 = 0$, and since the diagonal elements of $\Sigma_1$ have magnitudes
bounded by $\tau$, we also have $\|W\|_2 \le 1$. Hence $Y - \hat{X} \in \tau\,\partial\|\hat{X}\|_*$, which concludes the proof.
Solving the subproblems

Solving for $E_{k+1}$
$E_{k+1} = \arg\min_{E,\ \pi_\Omega(E)=0} L(A_{k+1}, E, Y_k, \mu_k)$
$E_{k+1} = \arg\min_{E,\ \pi_\Omega(E)=0} \langle Y_k, D - A_{k+1} - E\rangle + \frac{\mu_k}{2}\|D - A_{k+1} - E\|_F^2$
$E_{k+1} = \arg\min_{E,\ \pi_\Omega(E)=0} \frac{1}{2}\|D - A_{k+1} - E + \frac{1}{\mu_k}Y_k\|_F^2$
$E_{k+1} = \pi_{\bar\Omega}(D - A_{k+1} + \frac{1}{\mu_k}Y_k)$
Here $\bar\Omega$ is the complementary set of $\Omega$, $\bar\Omega = \{(i,j) \mid (i,j) \notin \Omega\}$.
Solving the problem

The algorithm is reduced to
Input: observation samples $D_{ij}$, $(i,j) \in \Omega$
$Y_0 = 0$; $E_0 = 0$; $\mu_0 > 0$; $\rho > 1$; $k = 0$
while not converged
    $A_{k+1} = \mathcal{D}_{1/\mu_k}(D - E_k + \frac{1}{\mu_k}Y_k)$
    $E_{k+1} = \pi_{\bar\Omega}(D - A_{k+1} + \frac{1}{\mu_k}Y_k)$
    $Y_{k+1} = Y_k + \mu_k(D - A_{k+1} - E_{k+1})$
    $\mu_{k+1} = \rho\mu_k$; $k = k + 1$
end while
Output: $(A_k, E_k)$
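A NumPy sketch of this reduced matrix-completion loop (not from the slides; the `(mu0, rho)` schedule, stopping rule, and the usage example are illustrative assumptions; `D` holds the observed entries with zeros elsewhere and `Omega` is a boolean mask of observed positions):

```python
import numpy as np

def svt(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def matrix_completion_alm(D, Omega, mu0=1.0, rho=1.5, tol=1e-7, max_iter=200):
    Y = np.zeros_like(D)
    E = np.zeros_like(D)
    mu = mu0
    for _ in range(max_iter):
        A = svt(D - E + Y / mu, 1.0 / mu)          # A-step: singular value shrinkage
        E = np.where(Omega, 0.0, D - A + Y / mu)   # E-step: free on unobserved entries only
        R = D - A - E                              # residual (nonzero only on Omega)
        Y = Y + mu * R                             # multiplier update
        mu = rho * mu
        if np.linalg.norm(R, 'fro') <= tol * max(np.linalg.norm(D, 'fro'), 1.0):
            break
    return A

# usage: recover a random rank-2 matrix from 60% of its entries
rng = np.random.default_rng(0)
L = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
Omega = rng.random(L.shape) < 0.6
D = np.where(Omega, L, 0.0)
A_hat = matrix_completion_alm(D, Omega)
print(np.linalg.norm(A_hat - L) / np.linalg.norm(L))   # relative recovery error
```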
Robust PCA

Optimization problem
minimize    $\mathrm{rank}(A) + \lambda\|E\|_0$               (8)
subject to  $A_{ij} + E_{ij} = D_{ij}$,  $(i,j) \in \Omega$

We look for the best rank-$k$ approximation of the matrix $D$, which is corrupted by sparse noise.
Similar problems and conditions as in the case of matrix completion.
Robust PCA

The original problem is very hard to solve, so we look again for a convex relaxation of the problem:
$\mathrm{rank}(A) = \|\mathrm{diag}(\Sigma)\|_0$ (with $A = U\Sigma V^T$) and $\|E\|_0$
are relaxed to
$\|A\|_* = \|\mathrm{diag}(\Sigma)\|_1$ and $\|E\|_1$

Convex relaxation
minimize    $\|A\|_* + \lambda\|E\|_1$                         (9)
subject to  $A_{ij} + E_{ij} = D_{ij}$,  $(i,j) \in \Omega$
Robust PCA

Conditions for exact recovery of the convex relaxation
In order to have exact recovery we need to impose that the low rank part is not sparse
and also that the sparse part is not low rank.

Incoherence condition of the low rank part
The incoherence condition of a matrix $A = USV^T \in \mathbb{R}^{m \times n}$ with parameter $\mu$ states that
$\max_i \|U^Te_i\|^2 \le \frac{\mu r}{m}$,  $\max_i \|V^Te_i\|^2 \le \frac{\mu r}{n}$
$\|UV^T\|_\infty \le \sqrt{\frac{\mu r}{mn}}$
Robust PCA

Theorem
If $A_0$ obeys the incoherence condition with parameter $\mu$, the sampling set $\Omega$ is uniformly
distributed among all sets of cardinality $M$ obeying $M = 0.1mn$, and each observed entry is
corrupted with probability $\tau$ independently of the others, then for $N = \max(m, n)$ there
exists a constant $c$ such that with probability at least $1 - cN^{-10}$, problem (9) with
$\lambda = 1/\sqrt{0.1N}$ recovers the exact solution $(A_0, E_0)$, provided that
$\mathrm{rank}(A_0) \le \rho_rN\mu^{-1}(\log N)^{-2}$  and  $\tau \le \tau_s$
where $\rho_r$ and $\tau_s$ are positive numerical constants.
E.J. Candes, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" http://arxiv.org/abs/0912.3599
Solving the problem

Rewriting the problem
minimize    $\|A\|_* + \lambda\|E\|_1$
subject to  $A_{ij} + E_{ij} = D_{ij}$,  $(i,j) \in \Omega$

is equivalent to

minimize    $\|A\|_* + \lambda\|E\|_1$
subject to  $A + E + Z = D$,  $\pi_\Omega(Z) = 0$

where
$[\pi_\Omega(Z)]_{ij} = Z_{ij}$ if $(i,j) \in \Omega$, and $0$ if $(i,j) \notin \Omega$
$D_{ij}$ is kept if $(i,j) \in \Omega$, and set to $0$ if $(i,j) \notin \Omega$

Z. Lin, M. Chen, L. Wu and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," http://arxiv.org/abs/1009.5055
Solving the problem

The augmented Lagrangian for the problem
minimize    $\|A\|_* + \lambda\|E\|_1$
subject to  $A + E + Z = D$,  $\pi_\Omega(Z) = 0$
is
$L(A, E, Z, Y, \mu) = \|A\|_* + \lambda\|E\|_1 + \langle Y, D - A - E - Z\rangle + \frac{\mu}{2}\|D - A - E - Z\|_F^2$    (10)
with the additional constraint $\pi_\Omega(Z) = 0$.
Solving the problem

Algorithm
input: observation samples $D_{ij}$, $(i,j) \in \Omega$
$Y_0 = 0$; $E_0 = 0$; $A_0 = D$; $Z_0 = 0$; $\mu_0 > 0$; $\rho > 1$; $k = 0$
while not converged
    $A_{k+1} = \arg\min_A L(A, E_k, Y_k, Z_k, \mu_k)$
    $E_{k+1} = \arg\min_E L(A_{k+1}, E, Y_k, Z_k, \mu_k)$
    $Z_{k+1} = \arg\min_{Z,\ \pi_\Omega(Z)=0} L(A_{k+1}, E_{k+1}, Y_k, Z, \mu_k)$
    $Y_{k+1} = Y_k + \mu_k(D - A_{k+1} - E_{k+1} - Z_{k+1})$
    $\mu_{k+1} = \rho\mu_k$; $k = k + 1$
end while
Output: $(A_k, E_k, Z_k)$
Solving the subproblems

Solving for $A_{k+1}$
$A_{k+1} = \arg\min_A L(A, E_k, Y_k, Z_k, \mu_k)$
$A_{k+1} = \arg\min_A \|A\|_* + \langle Y_k, D - A - E_k - Z_k\rangle + \frac{\mu_k}{2}\|D - A - E_k - Z_k\|_F^2$
$A_{k+1} = \arg\min_A \frac{1}{\mu_k}\|A\|_* + \frac{1}{2}\|D - A - E_k - Z_k + \frac{1}{\mu_k}Y_k\|_F^2$
which has the closed form solution
$A_{k+1} = \mathcal{D}_{1/\mu_k}(D - E_k - Z_k + \frac{1}{\mu_k}Y_k)$
Solving the subproblems

Solving for $E_{k+1}$
$E_{k+1} = \arg\min_E L(A_{k+1}, E, Y_k, Z_k, \mu_k)$
$E_{k+1} = \arg\min_E \lambda\|E\|_1 + \langle Y_k, D - A_{k+1} - E - Z_k\rangle + \frac{\mu_k}{2}\|D - A_{k+1} - E - Z_k\|_F^2$
$E_{k+1} = \arg\min_E \frac{\lambda}{\mu_k}\|E\|_1 + \frac{1}{2}\|D - A_{k+1} - E - Z_k + \frac{1}{\mu_k}Y_k\|_F^2$
which has the form
$\arg\min_E \tau\|E\|_1 + \frac{1}{2}\|X - E\|_F^2$
Solving the subproblems

Shrinkage operator
Given a matrix $Y \in \mathbb{R}^{m \times n}$, the operator $\mathcal{S}_\tau(\cdot) : \mathbb{R}^{m \times n} \to \mathbb{R}^{m \times n}$ is defined as
$\mathcal{S}_\tau(Y) = \mathrm{sign}(Y)\max(|Y| - \tau, 0)$
where $\mathrm{sign}(Y)\max(|Y| - \tau, 0)$ is applied componentwise to $Y$.

Theorem
For each $\tau \ge 0$ and $Y \in \mathbb{R}^{m \times n}$, the shrinkage operator obeys
$\mathcal{S}_\tau(Y) = \arg\min_X \tau\|X\|_1 + \frac{1}{2}\|Y - X\|_F^2$
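A minimal NumPy sketch of the componentwise shrinkage (soft-thresholding) operator $\mathcal{S}_\tau$, with a brute-force scalar check of the theorem above (not from the slides; the function name `shrink`, the test value, and the grid are illustrative):

```python
import numpy as np

def shrink(Y, tau):
    """S_tau(Y) = sign(Y) * max(|Y| - tau, 0), applied entrywise."""
    return np.sign(Y) * np.maximum(np.abs(Y) - tau, 0.0)

# scalar check: S_tau(y) minimizes tau*|x| + 0.5*(y - x)^2 over x
y, tau = 1.7, 0.5
grid = np.linspace(-5, 5, 200001)
x_star = grid[np.argmin(tau * np.abs(grid) + 0.5 * (y - grid) ** 2)]
assert abs(x_star - shrink(np.array(y), tau)) < 1e-3
```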
Proof:
Consider the function $h(X) = \tau\|X\|_1 + \frac{1}{2}\|X - Y\|_F^2$.
A sufficient condition for optimality of $X$ is that
$0 \in X - Y + \tau\,\partial\|X\|_1$
where $\partial\|X\|_1$ is the set of subgradients of the $\ell_1$ norm $\|X\|_1$ at $X$.
All the subgradients of $\|X\|_1$ at $X$ are given by
$\partial\|X\|_1 = \{G \in \mathbb{R}^{m \times n} : G_{ij} = -1 \text{ for } X_{ij} < 0,\ G_{ij} = 1 \text{ for } X_{ij} > 0,\ G_{ij} \in [-1, 1] \text{ for } X_{ij} = 0\}$
If we prove that $Y - \hat{X} \in \tau\,\partial\|\hat{X}\|_1$, then $\hat{X}$ is the unique minimizer of the problem.
Consider the candidate $\hat{X} = \mathcal{S}_\tau(Y)$; then
$[Y - \mathcal{S}_\tau(Y)]_{ij} = Y_{ij} - \mathrm{sign}(Y_{ij})\max(|Y_{ij}| - \tau, 0)$
$[Y - \mathcal{S}_\tau(Y)]_{ij} = \tau\,\mathrm{sign}(Y_{ij})$ if $|Y_{ij}| > \tau$, and $Y_{ij}$ if $|Y_{ij}| \le \tau$
$\tau\,\partial\|\mathcal{S}_\tau(Y)\|_1 = \{G \in \mathbb{R}^{m \times n} : G_{ij} = -\tau \text{ for } Y_{ij} < -\tau,\ G_{ij} = \tau \text{ for } Y_{ij} > \tau,\ G_{ij} \in [-\tau, \tau] \text{ for } |Y_{ij}| \le \tau\}$
$\tau\,\partial\|\mathcal{S}_\tau(Y)\|_1 = \{G \in \mathbb{R}^{m \times n} : G_{ij} = \tau\,\mathrm{sign}(Y_{ij}) \text{ for } |Y_{ij}| > \tau,\ G_{ij} \in [-\tau, \tau] \text{ for } |Y_{ij}| \le \tau\}$
$\Rightarrow Y - \mathcal{S}_\tau(Y) \in \tau\,\partial\|\mathcal{S}_\tau(Y)\|_1 \Rightarrow \mathcal{S}_\tau(Y)$ is the optimal solution.
Solving the subproblems

Solving for $Z_{k+1}$
$Z_{k+1} = \arg\min_{Z,\ \pi_\Omega(Z)=0} L(A_{k+1}, E_{k+1}, Y_k, Z, \mu_k)$
$Z_{k+1} = \arg\min_{Z,\ \pi_\Omega(Z)=0} \langle Y_k, D - A_{k+1} - E_{k+1} - Z\rangle + \frac{\mu_k}{2}\|D - A_{k+1} - E_{k+1} - Z\|_F^2$
$Z_{k+1} = \arg\min_{Z,\ \pi_\Omega(Z)=0} \frac{1}{2}\|D - A_{k+1} - E_{k+1} - Z + \frac{1}{\mu_k}Y_k\|_F^2$
$Z_{k+1} = \pi_{\bar\Omega}(D - A_{k+1} - E_{k+1} + \frac{1}{\mu_k}Y_k)$
Here $\bar\Omega$ is the complementary set of $\Omega$, $\bar\Omega = \{(i,j) \mid (i,j) \notin \Omega\}$.
Solving the problem

Algorithm
input: observation samples $D_{ij}$, $(i,j) \in \Omega$
$Y_0 = 0$; $E_0 = 0$; $Z_0 = 0$; $\mu_0 > 0$; $\rho > 1$; $k = 0$
while not converged
    $A_{k+1} = \mathcal{D}_{1/\mu_k}(D - E_k - Z_k + \frac{1}{\mu_k}Y_k)$
    $E_{k+1} = \mathcal{S}_{\lambda/\mu_k}(D - A_{k+1} - Z_k + \frac{1}{\mu_k}Y_k)$
    $Z_{k+1} = \pi_{\bar\Omega}(D - A_{k+1} - E_{k+1} + \frac{1}{\mu_k}Y_k)$
    $Y_{k+1} = Y_k + \mu_k(D - A_{k+1} - E_{k+1} - Z_{k+1})$
    $\mu_{k+1} = \rho\mu_k$; $k = k + 1$
end while
Output: $(A_k, E_k, Z_k)$
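A NumPy sketch of this Robust PCA loop (not from the slides; the choices of `lam`, `mu0`, `rho`, the stopping rule, and the usage example are illustrative assumptions; `Omega` is a boolean mask of observed entries, all True when the matrix is fully observed):

```python
import numpy as np

def svt(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def shrink(X, tau):
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_alm(D, Omega, lam=None, mu0=1.0, rho=1.5, tol=1e-7, max_iter=300):
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))      # a common default weight for the l1 term
    Y = np.zeros_like(D); E = np.zeros_like(D); Z = np.zeros_like(D)
    mu = mu0
    for _ in range(max_iter):
        A = svt(D - E - Z + Y / mu, 1.0 / mu)              # low-rank step
        E = shrink(D - A - Z + Y / mu, lam / mu)           # sparse step
        Z = np.where(Omega, 0.0, D - A - E + Y / mu)       # free on unobserved entries
        R = D - A - E - Z                                  # constraint residual
        Y = Y + mu * R
        mu = rho * mu
        if np.linalg.norm(R, 'fro') <= tol * max(np.linalg.norm(D, 'fro'), 1.0):
            break
    return A, E

# usage: low-rank plus sparse corruption, fully observed
rng = np.random.default_rng(0)
L = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
S = np.where(rng.random(L.shape) < 0.05, 5.0 * rng.standard_normal(L.shape), 0.0)
D = L + S
A_hat, E_hat = rpca_alm(D, np.ones_like(D, dtype=bool))
```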