MC Complete Version

Uploaded by gonzalo-garateguy on 06-Apr-2018

TRANSCRIPT

  • 8/3/2019 MC Complete Version

    ELEG 867 - Compressive Sensing and Sparse Signal

    Representations

    Introduction to Matrix Completion and Robust PCA

    Gonzalo Garateguy

Department of Electrical and Computer Engineering

    University of Delaware

    Fall 2011

    ELEG 867 (MC and RPCA problems) Fall, 2011 1 / 91

Matrix Completion Problems - Motivation

Recommender Systems

    Items
    User 1   x x ? ? x x
    User 2   ? ? x x ? ?
    .        ? x ? x x ?
    .        x ? ? x ? x
    .        x ? x ? ? x
    .        ? x ? ? x ?
    .        ? ? x x x ?
    User n   x x ? ? ? x

    (x = known rating, ? = missing rating)

Collaborative filtering (Amazon, last.fm)
Content based (Pandora, www.nanocrowd.com)

The Netflix Prize competition boosted interest in the area

http://www.ima.umn.edu/videos/index.php?id=1598
http://sahd.pratt.duke.edu/Videos/keynote.html
Matrix Completion Problems - Motivation

Sensor location estimation in Wireless Sensor Networks

[Figure: a network of seven nodes; some pairwise distances (d12, d13, d24, d34, d45, d56, d57, d67) are measured, while others (d23, d43, d74, d64, ...) are unknown.]

Distance matrix

         1      2      3      4      5      6      7
    1    0    d1,2   d1,3    ?      ?      ?      ?
    2  d2,1    0      ?    d2,4     ?      ?      ?
    3  d3,1    ?      0    d3,4     ?      ?      ?
    4    ?    d4,2   d4,3    0     d4,5    ?      ?
    5    ?     ?      ?    d5,4     0     d5,6   d5,7
    6    ?     ?      ?      ?     d6,5    0     d6,7
    7    ?     ?      ?      ?     d7,5   d7,6    0

The problem is to find the positions of the sensors in R² given the partial information about relative distances.

A distance matrix like this has rank 2 in R².

For certain types of graphs the problem can be solved if we know the whole distance matrix.
Matrix Completion Problems - Motivation

Image reconstruction from incomplete data

[Figure: reconstructed image alongside the incomplete image with 50% of the pixels observed.]
Robust PCA - Motivation

Foreground identification for surveillance applications

E.J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" http://arxiv.org/abs/0912.3599
Robust PCA - Motivation

Image alignment and texture recognition

Z. Zhang, X. Liang, A. Ganesh, and Y. Ma, "TILT: Transform Invariant Low-rank Textures," Computer Vision - ACCV 2010
Robust PCA - Motivation

Camera calibration with radial distortion

J. Wright, Z. Lin, and Y. Ma, "Low-Rank Matrix Recovery: From Theory to Imaging Applications," tutorial presented at the International Conference on Image and Graphics (ICIG), August 2011
Motivation

Many other applications:

System identification in control theory
Covariance matrix estimation
Machine learning
Computer vision

Videos to watch:
Matrix Completion via Convex Optimization: Theory and Algorithms, by Emmanuel Candès
http://videolectures.net/mlss09us_candes_mccota/
Low Dimensional Structures in Images or Data, by Yi Ma, Workshop on Signal Processing with Adaptive Sparse Structured Representations (June 2011)
http://ecos.maths.ed.ac.uk/SPARS11/YiMa.wmv
Problem Formulation

Matrix completion

    minimize   rank(A)                        (1)
    subject to Aij = Dij, (i,j) ∈ Ω

Robust PCA

    minimize   rank(A) + ‖E‖0                 (2)
    subject to Aij + Eij = Dij, (i,j) ∈ Ω

(Ω denotes the set of observed entries.)

Very hard to solve in general without any assumptions; sometimes NP-hard.

Even if we can solve them, are the solutions always what we expect?

Under which conditions can we have exact recovery of the real matrices?
Outline

Convex Optimization concepts

Matrix Completion
  Exact recovery from incomplete data by convex relaxation
  ALM method for nuclear norm minimization

Robust PCA
  Exact recovery from incomplete and corrupted data by convex relaxation
  ALM method for low rank and sparse separation
Convex sets and Convex functions

Convex set

A set C is convex if the line segment between any two points in C lies in C: for any x1, x2 ∈ C and any θ with 0 ≤ θ ≤ 1 we have

    θx1 + (1 - θ)x2 ∈ C.

[Figure: one convex set and two non-convex sets.]

Convex sets and Convex functions

Convex combination

A convex combination of k points x1, ..., xk is defined as

    θ1 x1 + ... + θk xk,  where θi ≥ 0 and θ1 + ... + θk = 1

Convex hull

The convex hull of C is the set of all convex combinations of points in C:

    conv C = { θ1 x1 + ... + θk xk | xi ∈ C, θi ≥ 0, i = 1, ..., k, θ1 + ... + θk = 1 }
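These definitions are easy to sanity-check numerically. The following sketch (not part of the original slides; the unit disk and the random sampling are our own choices) verifies that convex combinations of points in the unit disk stay in the disk:

```python
import numpy as np

rng = np.random.default_rng(0)

def in_unit_disk(x, tol=1e-12):
    """Membership test for the closed unit disk in R^2, a convex set."""
    return np.linalg.norm(x) <= 1 + tol

# Line segments between points of the disk stay in the disk.
for _ in range(1000):
    x1 = rng.standard_normal(2)
    x2 = rng.standard_normal(2)
    x1 = x1 / max(1.0, np.linalg.norm(x1))   # project into the disk
    x2 = x2 / max(1.0, np.linalg.norm(x2))
    theta = rng.uniform(0.0, 1.0)
    assert in_unit_disk(theta * x1 + (1 - theta) * x2)

# A convex combination of k points: nonnegative weights summing to 1.
k = 5
pts = rng.standard_normal((k, 2))
pts = pts / np.maximum(1.0, np.linalg.norm(pts, axis=1))[:, None]
w = rng.uniform(0.0, 1.0, k)
w = w / w.sum()                              # theta_i >= 0, sum theta_i = 1
comb = w @ pts
assert in_unit_disk(comb)
print("all convex combinations stayed in the unit disk")
```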

Convex sets and Convex functions

Operations that preserve convexity

Intersection

If S1 and S2 are convex, then S1 ∩ S2 is convex. In general, if Sα is convex for every α ∈ A, then ∩{Sα | α ∈ A} is convex. Subspaces, affine sets and convex cones are therefore closed under arbitrary intersections.

Affine functions

Let f : R^n → R^m be affine, f(x) = Ax + b, where A ∈ R^(m×n) and b ∈ R^m. If S ⊆ R^n is convex, then the image of S under f,

    f(S) = { f(x) | x ∈ S },

is convex.

Convex sets and Convex functions

Convex functions

A function f : R^n → R is convex if dom f is a convex set and if for all x, y ∈ dom f, and θ with 0 ≤ θ ≤ 1, we have

    f(θx + (1 - θ)y) ≤ θf(x) + (1 - θ)f(y)

We say that f is strictly convex if the strict inequality holds whenever x ≠ y and 0 < θ < 1.

Operations that preserve convexity

Composition with an affine mapping

Suppose f : R^n → R, A ∈ R^(n×m) and b ∈ R^n. Define g : R^m → R by

    g(x) = f(Ax + b)

with dom g = { x | Ax + b ∈ dom f }. Then if f is convex, so is g.

Pointwise maximum

If f1 and f2 are convex functions then their pointwise maximum f, defined by

    f(x) = max{ f1(x), f2(x) }

with dom f = dom f1 ∩ dom f2, is also convex. This also extends to the case where f1, ..., fm are convex: then f(x) = max{ f1(x), ..., fm(x) } is also convex.
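The pointwise-maximum rule can be spot-checked numerically. A minimal sketch, assuming two convex functions of our own choosing (f1(x) = (x - 1)² and f2(x) = |x|), verifies the convexity inequality for f = max{f1, f2} on random pairs:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two convex functions and their pointwise maximum.
f1 = lambda x: (x - 1.0) ** 2
f2 = lambda x: np.abs(x)
f = lambda x: np.maximum(f1(x), f2(x))

# Check f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y) on many random pairs
# (a numerical spot-check of convexity, not a proof).
x = rng.uniform(-5, 5, 10000)
y = rng.uniform(-5, 5, 10000)
t = rng.uniform(0, 1, 10000)
lhs = f(t * x + (1 - t) * y)
rhs = t * f(x) + (1 - t) * f(y)
violations = int(np.sum(lhs > rhs + 1e-12))
print("violations:", violations)
```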

Pointwise maximum of convex functions

[Figure: two convex functions f1(x) and f2(x) and their pointwise maximum f(x) = max{f1(x), f2(x)}.]

Convex sets and Convex functions

Convex differentiable functions

If f is differentiable (i.e. its gradient ∇f exists at each point in dom f), then f is convex if and only if dom f is convex and

    f(y) ≥ f(x) + ∇f(x)^T (y - x)

holds for all x, y ∈ dom f.
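A quick numerical illustration of this first-order condition, using a smooth convex function of our own choosing (f(x) = log(1 + eˣ) + x²/2):

```python
import numpy as np

rng = np.random.default_rng(2)

# A smooth convex function and its derivative.
f = lambda x: np.log(1 + np.exp(x)) + 0.5 * x ** 2
gradf = lambda x: 1 / (1 + np.exp(-x)) + x   # sigmoid(x) + x

# First-order characterization: f(y) >= f(x) + f'(x)*(y - x) for all x, y.
x = rng.uniform(-4, 4, 5000)
y = rng.uniform(-4, 4, 5000)
gap = f(y) - (f(x) + gradf(x) * (y - x))
print("smallest gap over all sampled pairs:", gap.min())
```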

Second order conditions

If f is twice differentiable, i.e. its Hessian ∇²f exists at each point in dom f, then f is convex if and only if dom f is convex and its Hessian is positive semidefinite for all x ∈ dom f:

    ∇²f(x) ⪰ 0

Convex non-differentiable functions

The concept of gradient can be extended to non-differentiable functions by introducing the subgradient.

Subgradient of a function

A vector g ∈ R^n is a subgradient of f : R^n → R at x ∈ dom f if for all z ∈ dom f

    f(z) ≥ f(x) + g^T (z - x)

Subgradients

Observations

If f is convex and differentiable, then its gradient at x, ∇f(x), is its only subgradient.

Subdifferentiable functions

A function f is called subdifferentiable at x if there exists at least one subgradient at x.

Subdifferential at a point

The set of subgradients of f at the point x is called the subdifferential of f at x, and is denoted ∂f(x).

Subdifferentiability of a function

A function f is called subdifferentiable if it is subdifferentiable at all x ∈ dom f.

Basic properties

Existence of the subgradient of a convex function

If f is convex and x ∈ int dom f, then ∂f(x) is nonempty and bounded.

The subdifferential ∂f(x) is always a closed convex set, even if f is not convex. This follows from the fact that it is the intersection of an infinite set of halfspaces:

    ∂f(x) = ∩{z ∈ dom f} { g | f(z) ≥ f(x) + g^T (z - x) }.

Basic properties

Nonnegative scaling

For α ≥ 0, ∂(αf)(x) = α ∂f(x)

Subgradient of the sum

Given f = f1 + ... + fm, where f1, ..., fm are convex functions, the subgradient of f at x is given by ∂f(x) = ∂f1(x) + ... + ∂fm(x)

Affine transformations of domain

Suppose f is convex, and let h(x) = f(Ax + b). Then ∂h(x) = A^T ∂f(Ax + b).

Pointwise maximum

Suppose f is the pointwise maximum of convex functions f1, ..., fm, f(x) = max{i=1,...,m} fi(x). Then ∂f(x) = Co{ ∂fi(x) | fi(x) = f(x) }, the convex hull of the subdifferentials of the functions active at x.

Subgradient of the pointwise maximum of two convex functions

[Figure: two convex functions f1(x) and f2(x), their pointwise maximum f(x) = max{f1(x), f2(x)}, and subgradients of f at the point x where the maximum switches from one function to the other.]

Examples

Consider the function f(x) = |x|. At x0 = 0, the subdifferential is defined by the inequality

    f(z) ≥ f(x0) + g(z - x0), ∀z ∈ dom f
    |z| ≥ gz, ∀z ∈ R

    ∂f(0) = { g | g ∈ [-1, 1] }

Then for all x

    ∂f(x) = { -1                   for x < 0
            { 1                    for x > 0
            { { g | g ∈ [-1, 1] }  for x = 0
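The subdifferential of |x| derived above can be checked directly: every g ∈ ∂f(x) must satisfy the subgradient inequality |z| ≥ |x| + g(z - x). A small sketch:

```python
import numpy as np

def subdiff_abs(x):
    """Subdifferential of f(x) = |x|, returned as an interval (lo, hi)."""
    if x < 0:
        return (-1.0, -1.0)
    if x > 0:
        return (1.0, 1.0)
    return (-1.0, 1.0)    # at 0: every g in [-1, 1] is a subgradient

# Verify |z| >= |x| + g*(z - x) for subgradients sampled from the interval.
z = np.linspace(-3, 3, 601)
for x in (-2.0, 0.0, 1.5):
    lo, hi = subdiff_abs(x)
    for g in np.linspace(lo, hi, 5):
        assert np.all(np.abs(z) >= np.abs(x) + g * (z - x) - 1e-12)
print("subgradient inequality holds at every tested point")
```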

Example: ℓ1 norm

Consider f(x) = ‖x‖1 = |x1| + ... + |xn|, and note that f can be expressed as the maximum of 2^n linear functions:

    ‖x‖1 = max{ f1(x), ..., f2^n(x) }
    ‖x‖1 = max{ s1^T x, ..., s2^n^T x | si ∈ {-1, 1}^n }

The active functions fi(x) at x are the ones for which si^T x = ‖x‖1. Then, denoting

    si = [si,1, ..., si,n]^T,  si,j ∈ {-1, 1},

the set of indices of the active functions at x is

    Ax = { i |  si,j = -1 for xj < 0,
                si,j = 1 for xj > 0,
                si,j = -1 or 1 for xj = 0,  for j = 1, ..., n }

Subgradient of the ℓ1 norm

The subdifferential of ‖x‖1 at a generic point x is defined by

    ∂‖x‖1 = co{ ∇fi(x) | i ∈ Ax }
    ∂‖x‖1 = co{ si | i ∈ Ax }
    ∂‖x‖1 = { g | g = Σ{i ∈ Ax} θi si,  θi ≥ 0,  Σi θi = 1 }

or equivalently

    ∂‖x‖1 = { g |  gj = -1 for xj < 0,
                   gj = 1 for xj > 0,
                   gj ∈ [-1, 1] for xj = 0 }
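The last characterization gives a cheap way to produce a subgradient of the ℓ1 norm: take gj = sign(xj), with gj = 0 (any value in [-1, 1] would do) where xj = 0. A sketch verifying the subgradient inequality:

```python
import numpy as np

rng = np.random.default_rng(3)

def l1_subgradient(x):
    """One element of the subdifferential of ||x||_1: sign(x_j) where
    x_j != 0, and 0 (an allowed choice in [-1, 1]) where x_j = 0."""
    return np.sign(x)

# Verify f(z) >= f(x) + g^T (z - x) for f = l1 norm.
for _ in range(1000):
    x = rng.standard_normal(6)
    x[rng.integers(6)] = 0.0          # force a non-differentiable coordinate
    g = l1_subgradient(x)
    z = rng.standard_normal(6)
    lhs = np.abs(z).sum()
    rhs = np.abs(x).sum() + g @ (z - x)
    assert lhs >= rhs - 1e-12
print("l1 subgradient inequality verified")
```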

ℓ1 norm in R²

In R², the set of subgradients of ‖x‖1 at x = 0 is the convex hull of the four sign vectors

    s1 = [1, 1]^T,  s2 = [-1, 1]^T,  s3 = [1, -1]^T,  s4 = [-1, -1]^T

[Figure: the subgradients of the ℓ1 norm in R².]

Convex optimization problems

An optimization problem is convex if its objective is a convex function, the inequality constraints fi are convex and the equality constraints hj are affine:

    minimize_x  f0(x)       (convex function)
    s.t.        fi(x) ≤ 0   (convex sets)
                hj(x) = 0   (affine)

or equivalently

    minimize_x  f0(x)       (convex function)
    s.t.        x ∈ C       (C is a convex set)
                hj(x) = 0   (affine)

Theorem

If x is a local minimizer of a convex optimization problem, it is a global minimizer.

Optimality conditions

A point x is a minimizer of a convex function f if and only if f is subdifferentiable at x and

    0 ∈ ∂f(x)
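The condition 0 ∈ ∂f(x) lets us minimize non-differentiable functions in closed form. As an illustration not taken from the slides, consider f(x) = λ|x| + ½(x - a)²: writing 0 ∈ ∂f(x) and using ∂|x| from the earlier example yields the soft-thresholding formula, which the sketch below checks against a grid search:

```python
import numpy as np

lam = 1.0

def soft_threshold(a, lam):
    """Minimizer of f(x) = lam*|x| + 0.5*(x - a)^2, obtained by solving
    0 in lam*d|x| + (x - a)."""
    return np.sign(a) * max(abs(a) - lam, 0.0)

# Numerical check: f(x*) is no larger than f on a fine grid, for several a.
xs = np.linspace(-5, 5, 2001)
for a in (-3.0, -0.5, 0.0, 0.7, 2.5):
    f = lam * np.abs(xs) + 0.5 * (xs - a) ** 2
    xstar = soft_threshold(a, lam)
    fstar = lam * abs(xstar) + 0.5 * (xstar - a) ** 2
    assert fstar <= f.min() + 1e-9
print("soft-thresholding minimizes f for every tested a")
```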

Convex optimization problems

Given the convex problem

    minimize_x  f0(x)
    s.t.        fi(x) ≤ 0,  i = 1, ..., k
                hj(x) = 0,  j = 1, ..., l

its Lagrangian function is defined as

    L(x, λ, μ) = f0(x) + Σ{j=1..l} μj hj(x) + Σ{i=1..k} λi fi(x)

where λi ≥ 0 and μj ∈ R.

Augmented Lagrangian Method

Considering the problem

    minimize_x  f(x)
    s.t.        x ∈ C
                h(x) = 0                      (3)

the augmented Lagrangian is defined as

    L(x, λ, σ) = f(x) + λ^T h(x) + (σ/2) ‖h(x)‖2²

where σ is a penalty parameter and λ is the multiplier vector.

Augmented Lagrangian Method

The augmented Lagrangian method consists of solving a sequence of problems of the form

    minimize_x  L(x, λk, σk) = f(x) + λk^T h(x) + (σk/2) ‖h(x)‖2²
    s.t.        x ∈ C

where {λk} is a bounded sequence in R^l and {σk} is a penalty parameter sequence satisfying

    0 < σk < σk+1  ∀k,  σk → ∞

Augmented Lagrangian Method

The exact solution to problem (3) can be found using the following iterative algorithm:

    set ρ > 1
    while not converged do
        solve xk+1 = argmin{x ∈ C} L(x, λk, σk)
        λk+1 = λk + σk h(xk+1)
        σk+1 = ρ σk
    end while
    Output: xk
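The iteration above can be sketched in a few lines. The toy problem (minimize ½‖x‖² subject to Ax = b, with C = R^n) is our own choice, convenient because the inner argmin has a closed form; its exact solution is the minimum-norm point on {x : Ax = b}, i.e. the pseudoinverse solution:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy instance: f(x) = 0.5*||x||^2, h(x) = A x - b, C = R^n.
m, n = 3, 6
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

lam = np.zeros(m)        # multiplier vector
sigma, rho = 1.0, 2.0    # penalty parameter and growth factor (rho > 1)

for k in range(20):
    # Inner step: the augmented Lagrangian is quadratic in x, so the
    # argmin solves (I + sigma*A^T A) x = sigma*A^T b - A^T lam.
    x = np.linalg.solve(np.eye(n) + sigma * (A.T @ A),
                        sigma * (A.T @ b) - A.T @ lam)
    lam = lam + sigma * (A @ x - b)   # multiplier update
    sigma = rho * sigma               # increase the penalty

x_exact = np.linalg.pinv(A) @ b       # minimum-norm solution of Ax = b
print("constraint residual:", np.linalg.norm(A @ x - b))
print("distance to minimum-norm solution:", np.linalg.norm(x - x_exact))
```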

Matrix completion

Optimization problem

    minimize   rank(A)                        (4)
    subject to Aij = Dij, (i,j) ∈ Ω

We look for the simplest explanation for the observed data.

Given a large enough number of samples, the likelihood that the solution is unique should be high.

Matrix completion

    minimize   rank(A)
    subject to Aij = Dij, (i,j) ∈ Ω

The minimization of the rank(·) function is a combinatorial problem, with exponential complexity in the size of the matrix!

Need for a convex relaxation:

    rank(A) = ‖diag(Σ)‖0,  where A = UΣV^T
    ‖A‖* = ‖diag(Σ)‖1

Convex relaxation

    minimize   ‖A‖*                           (5)
    subject to Aij = Dij, (i,j) ∈ Ω

Matrix Completion

Nuclear Norm

The nuclear norm of a matrix A ∈ R^(m×n) is defined as ‖A‖* = Σ{i=1..r} σi(A), where {σi(A)} are the elements of the diagonal matrix Σ from the SVD decomposition A = UΣV^T.

Observations

r = rank(A) can be r < m, n. If this is the case we say that the matrix is low rank.

The singular values σi(A) = √(λi(A^T A)) are obtained as the square roots of the eigenvalues of A^T A and always satisfy σi ≥ 0.

The left singular vectors U are the eigenvectors of AA^T.

The right singular vectors V are the eigenvectors of A^T A.

Matrix Completion

Spectral Norm

The spectral norm of a matrix A ∈ R^(m×n) is defined as ‖A‖2 = σmax(A), where σmax = max({σi(A)}).

Dual Norm

Given an arbitrary norm ‖·‖ in R^n, its dual norm ‖·‖* is defined as

    ‖z‖* = sup{ z^T x | ‖x‖ ≤ 1 }

Observations

The nuclear norm is the dual norm of the spectral norm:

    ‖A‖* = sup{ tr(A^T X) | ‖X‖2 ≤ 1 }
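The duality claim can be illustrated numerically: for A = UΣV^T, the supremum is attained at X = UV^T, since tr(A^T U V^T) = tr(V Σ V^T) = Σi σi. A sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Candidate maximizer of sup { tr(A^T X) : ||X||_2 <= 1 }: X = U V^T.
X = U @ Vt
assert np.linalg.norm(X, 2) <= 1 + 1e-10    # feasible: spectral norm 1
attained = np.trace(A.T @ X)
assert np.isclose(attained, s.sum())         # equals the nuclear norm

# No random feasible X does better (a numerical spot-check).
best = 0.0
for _ in range(2000):
    Y = rng.standard_normal((5, 4))
    Y = Y / max(1.0, np.linalg.norm(Y, 2))   # scale into the spectral unit ball
    best = max(best, np.trace(A.T @ Y))
assert best <= attained + 1e-9
print("sup tr(A^T X) over ||X||_2 <= 1 is attained at U V^T:", round(attained, 4))
```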

Matrix Completion

Convex relaxation of the rank

Convex envelope of a function

Let f : C → R where C ⊆ R^n. The convex envelope of f (on C) is defined as the largest convex function g such that g(x) ≤ f(x) for all x ∈ C.

Theorem

The convex envelope of the function φ(X) = rank(X) on C = {X ∈ R^{m×n} : ||X||_2 ≤ 1} is φ_env(X) = ||X||_*.

Observations

The convex envelope of rank(X) on the set {X : ||X||_2 ≤ M} is given by (1/M)||X||_*.

By solving the heuristic problem we obtain a lower bound on the optimal value of the original problem (provided we can identify a bound M on the feasible set).

M. Fazel, H. Hindi and S. Boyd, "A Rank Minimization Heuristic with Application to Minimum Order System Approximation," American Control Conference, 2001.
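Because the nuclear norm is the convex envelope of the rank on the unit spectral-norm ball, it can never exceed the rank there. A quick numerical check of that underestimator property (numpy sketch, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 8, 6, 3

# random rank-r matrix, rescaled so that ||X||_2 = 1
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
X /= np.linalg.norm(X, 2)

s = np.linalg.svd(X, compute_uv=False)
rank = int((s > 1e-10).sum())
nuclear = s.sum()

# envelope property: ||X||_* <= rank(X) whenever ||X||_2 <= 1
assert nuclear <= rank + 1e-9
```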

Matrix completion

Convex relaxation

    minimize    ||A||_*                                  (6)
    subject to  A_ij = D_ij,  (i,j) ∈ Ω

The original problem is now a problem with a non-smooth but convex objective function.

The remaining question is how many measurements are needed, and in which positions they have to be taken, in order to guarantee that the solution is equal to the matrix D.

Matrix completion

Which types of matrices can be completed exactly?

Consider the rank-one matrix

    M = e_1 e_n^T =
        [ 0 0 ··· 0 1 ]
        [ 0 0 ··· 0 0 ]
        [ :  :     :  : ]
        [ 0 0 ··· 0 0 ]

Can it be recovered from 90% of its samples?

Is the sampling set important?

Which sampling sets work and which ones don't?

Matrix completion

Sampling set

The sampling set Ω is defined as Ω = {(i,j) : D_ij is observed}.

Consider

    D = x y^T,  x ∈ R^m, y ∈ R^n
    D_ij = x_i y_j

If the sampling set avoids row i, then x_i cannot be recovered by any method whatsoever.

Observation

No columns or rows of D can be omitted from the sampling set.

There is a need for a characterization of the sampling operator with respect to the set of matrices that we want to recover.


Matrix completion

Intuition

The singular vectors need to be sufficiently spread, i.e. uncorrelated with the standard basis, in order to minimize the number of observations needed to recover a low-rank matrix.

Coherence of a subspace

Let U be a subspace of R^n of dimension r and P_U be the orthogonal projection onto U. Then the coherence of U is defined to be

    μ(U) = (n/r) max_{1 ≤ i ≤ n} ||P_U e_i||^2

Observations

The minimum value that μ(U) can achieve is 1, attained for example if U is spanned by vectors whose entries all have magnitude 1/√n.

The largest possible value for μ(U) is n/r, corresponding to a subspace that contains a standard basis element.
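The definition above can be computed directly from an orthonormal basis of U, since ||P_U e_i|| is just the norm of the i-th row of the basis matrix. A short sketch (numpy; the bounds 1 ≤ μ(U) ≤ n/r are checked numerically):

```python
import numpy as np

def coherence(U):
    """mu(U) = (n/r) * max_i ||P_U e_i||^2 for an orthonormal basis U (n x r)."""
    n, r = U.shape
    # ||P_U e_i||^2 = ||U^T e_i||^2 = squared norm of the i-th row of U
    row_norms_sq = (U ** 2).sum(axis=1)
    return (n / r) * row_norms_sq.max()

rng = np.random.default_rng(2)
n, r = 100, 5

# generic random subspace: coherence close to the minimum value 1
Q, _ = np.linalg.qr(rng.standard_normal((n, r)))
mu = coherence(Q)
assert 1 - 1e-9 <= mu <= n / r + 1e-9

# subspace containing standard basis vectors: maximal coherence n/r
E = np.zeros((n, r)); E[:r, :r] = np.eye(r)
assert np.isclose(coherence(E), n / r)
```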

Matrix completion

μ0 coherence

A matrix D = Σ_{1 ≤ k ≤ r} σ_k u_k v_k^T is μ0-coherent if for some positive μ0

    max(μ(U), μ(V)) ≤ μ0

μ1 coherence

A matrix D = Σ_{1 ≤ k ≤ r} σ_k u_k v_k^T is μ1-coherent if

    ||UV^T||_∞ ≤ μ1 √(r/(mn))

for some μ1 > 0.

Observation

If D is μ0-coherent then it is μ1-coherent with μ1 = μ0 √r.
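Both coherence parameters, and the relation μ1 ≤ μ0√r between them, can be checked numerically. A sketch (numpy; illustrative, not from the slides — μ1 below is taken as the smallest value satisfying the bound):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 60, 40, 4

D = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-r matrix
U, s, Vt = np.linalg.svd(D, full_matrices=False)
U, V = U[:, :r], Vt[:r, :].T

def coherence(B):
    nn, rr = B.shape
    return (nn / rr) * (B ** 2).sum(axis=1).max()

mu0 = max(coherence(U), coherence(V))                 # mu0-coherence parameter
mu1 = np.abs(U @ V.T).max() * np.sqrt(m * n / r)      # smallest mu1 in the bound

# a mu0-coherent matrix is mu1-coherent with mu1 = mu0 * sqrt(r)
assert mu0 >= 1 - 1e-9
assert mu1 <= mu0 * np.sqrt(r) + 1e-9
```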

Coherence of a rank 300 approximation of kowalski

Figure: ||P_U e_i|| versus index i (left) and ||P_V e_i|| versus index i (right) for the rank-300 approximation.

    μ(U) = 1.9588,  μ(V) = 2.2290,  μ0 = 2.2290
    μ1 = √(mn/r) ||UV^T||_∞ = 13.412

Matrix completion

Theorem

Let D ∈ R^{m×n} of rank r be (μ0, μ1)-coherent and let N = max(m, n). Suppose we observe M entries of D with locations sampled uniformly at random. Then there exist constants C and c such that if

    M ≥ C max(μ1^2, μ0^{1/2} μ1, μ0 N^{1/4}) N r (β log N)

for some β > 2, then the minimizer of (6) is unique and equal to D with probability at least 1 − cN^{−β}. If in addition r ≤ μ0^{−1} N^{1/5}, then the number of observations can be improved to

    M ≥ C μ0 N^{6/5} r (β log N)

Candès, E.J. and Recht, B., "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, 2009.

For the kowalski example:

    μ1^2 = 179.99,  μ0^{1/2} μ1 = 12.5139,  μ0 N^{1/4} = 4.7682
    max(μ1^2, μ0^{1/2} μ1, μ0 N^{1/4}) N r (2.1 log N) = 6.6076 × 10^8

What is the value of C? It must be C > 0. In the limit case M = mn, C = 9.194 × 10^{−4}. For the bound to be useful, 0 < C < 9.194 × 10^{−4}.

Completion results on the kowalski image (reconstruction quality versus sampling rate):

    SNR = 23.74 dB at 10% of the samples
    SNR = 22.52 dB at 25% of the samples
    SNR = 25.89 dB at 35% of the samples
    SNR = 30.55 dB at 50% of the samples
    SNR = 39.51 dB at 70% of the samples
    SNR = 42.75 dB at 75% of the samples
    SNR = 47.10 dB at 80% of the samples

Completion Performance

Figure: SNR (dB) versus percentage of samples.

Matrix completion

Recovery performance for random matrices

Figure: The x axis corresponds to rank(A)/min{m, n} and the y axis to ρ_s = 1 − M/mn (the probability that an entry is omitted from the observations).

Emmanuel J. Candès, Xiaodong Li, Yi Ma, John Wright, "Robust Principal Component Analysis?" http://arxiv.org/abs/0912.3599

Matrix completion

Other bounds on the number of measurements and sampling operators:

Emmanuel J. Candès, Xiaodong Li, Yi Ma, John Wright, "Robust Principal Component Analysis?" http://arxiv.org/abs/0912.3599

Venkat Chandrasekaran, Sujay Sanghavi, Pablo A. Parrilo, Alan S. Willsky, "Rank-Sparsity Incoherence for Matrix Decomposition," http://arxiv.org/abs/0906.2220

Zihan Zhou, Xiaodong Li, John Wright, Emmanuel Candès, Yi Ma, "Stable Principal Component Pursuit," http://arxiv.org/abs/1001.2363

Raghunandan H. Keshavan, Andrea Montanari, Sewoong Oh, "Matrix Completion from a Few Entries," http://arxiv.org/abs/0901.3150

Sahand Negahban, Martin J. Wainwright, "Restricted strong convexity and weighted matrix completion: Optimal bounds with noise," http://arxiv.org/abs/1009.2118v2

Yonina C. Eldar, Deanna Needell, Yaniv Plan, "Unicity conditions for low-rank matrix recovery," http://arxiv.org/abs/1103.5479

Solving the problem

Rewriting the problem

    minimize    ||A||_*
    subject to  A_ij = D_ij,  (i,j) ∈ Ω

is equivalent to

    minimize    ||A||_*
    subject to  A + E = D,  π_Ω(E) = 0

where

    [π_Ω(E)]_ij = E_ij if (i,j) ∈ Ω,  0 if (i,j) ∉ Ω
    [D]_ij      = D_ij if (i,j) ∈ Ω,  0 if (i,j) ∉ Ω

The new problem can be solved by the Augmented Lagrangian Method in an efficient way.

Z. Lin, M. Chen, L. Wu and Y. Ma, "The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices," http://arxiv.org/abs/1009.5055

Solving the problem

The augmented lagrangian for the problem

    minimize    ||A||_*
    subject to  A + E = D,  π_Ω(E) = 0

is

    L(A, E, Y, μ) = ||A||_* + ⟨Y, D − A − E⟩ + (μ/2)||D − A − E||_F^2        (7)

The traditional iterative method to minimize the augmented lagrangian can be used here, but at each iteration the constraint π_Ω(E) = 0 has to be fulfilled.

Solving the problem

Algorithm

    input: observation samples D_ij, (i,j) ∈ Ω
    Y_0 = 0; E_0 = 0; μ_0 > 0; ρ > 1; k = 0
    while not converged
        A_{k+1} = argmin_A L(A, E_k, Y_k, μ_k)
        E_{k+1} = argmin_{E, π_Ω(E)=0} L(A_{k+1}, E, Y_k, μ_k)
        Y_{k+1} = Y_k + μ_k (D − A_{k+1} − E_{k+1})
        μ_{k+1} = ρ μ_k;  k ← k + 1
    end while
    output: (A_k, E_k)

Solving the subproblems

Solving for A_{k+1}

    A_{k+1} = argmin_A L(A, E_k, Y_k, μ_k)
    A_{k+1} = argmin_A ||A||_* + ⟨Y_k, D − A − E_k⟩ + (μ_k/2)||D − A − E_k||_F^2
    A_{k+1} = argmin_A μ_k^{−1}||A||_* + (1/2)||D − A − E_k + μ_k^{−1} Y_k||_F^2

which has the general form

    argmin_A τ||A||_* + (1/2)||X − A||_F^2

Solving the subproblems

Singular value shrinkage operator

Given a matrix X = UΣV^T, the operator D_τ(·) : R^{m×n} → R^{m×n} is defined as

    D_τ(X) = U S_τ(Σ) V^T,   S_τ(Σ) = sign(Σ) · max{|Σ| − τ, 0}

Theorem

For each τ ≥ 0 and Y ∈ R^{m×n}, the singular value shrinkage operator obeys

    D_τ(Y) = argmin_X τ||X||_* + (1/2)||Y − X||_F^2
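A direct implementation of D_τ, with a numerical sanity check of the theorem: at X = D_τ(Y) the objective τ||X||_* + (1/2)||Y − X||_F² should be no larger than at randomly perturbed points (numpy sketch, illustrative only):

```python
import numpy as np

def svt(Y, tau):
    """Singular value shrinkage D_tau(Y) = U * max(Sigma - tau, 0) * V^T."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def objective(X, Y, tau):
    # tau * ||X||_* + (1/2) * ||Y - X||_F^2
    return tau * np.linalg.svd(X, compute_uv=False).sum() \
        + 0.5 * np.linalg.norm(Y - X, 'fro') ** 2

rng = np.random.default_rng(4)
Y = rng.standard_normal((7, 5))
tau = 0.8

Xhat = svt(Y, tau)
f_hat = objective(Xhat, Y, tau)

# the minimizer should beat random perturbations of itself
for _ in range(200):
    P = 0.1 * rng.standard_normal(Y.shape)
    assert f_hat <= objective(Xhat + P, Y, tau) + 1e-9
```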

Solving the subproblems

Proof:

Consider the function h_0(X) = τ||X||_* + (1/2)||X − Y||_F^2.

A sufficient condition for optimality of X̂ is that

    0 ∈ X̂ − Y + τ ∂||X̂||_*

where ∂||X̂||_* is the set of subgradients of the nuclear norm at X̂.

We know that for an arbitrary X = UΣV^T ∈ R^{m×n}

    ∂||X||_* = { UV^T + W : W ∈ R^{m×n}, U^T W = 0, WV = 0, ||W||_2 ≤ 1 }

If we set X̂ = D_τ(Y) and prove that Y − X̂ ∈ τ ∂||X̂||_*, then the theorem is concluded.

Decompose Y = U_0 Σ_0 V_0^T + U_1 Σ_1 V_1^T, where U_0, V_0 are the singular vectors associated with singular values > τ and U_1, V_1 are the ones associated with singular values ≤ τ. Since X̂ = D_τ(Y) we can write

    X̂ = U_0 (Σ_0 − τI) V_0^T

Then

    Y − X̂ = U_1 Σ_1 V_1^T + τ U_0 V_0^T
           = τ (U_0 V_0^T + W),   W = τ^{−1} U_1 Σ_1 V_1^T

By definition U_0^T W = 0, W V_0 = 0, and since the diagonal elements of Σ_1 have magnitudes bounded by τ, we also have ||W||_2 ≤ 1. Hence Y − X̂ ∈ τ ∂||X̂||_*, which concludes the proof.

Solving the subproblems

Solving for E_{k+1}

    E_{k+1} = argmin_{E, π_Ω(E)=0} L(A_{k+1}, E, Y_k, μ_k)
    E_{k+1} = argmin_{E, π_Ω(E)=0} ⟨Y_k, D − A_{k+1} − E⟩ + (μ_k/2)||D − A_{k+1} − E||_F^2
    E_{k+1} = argmin_{E, π_Ω(E)=0} (1/2)||D − A_{k+1} − E + μ_k^{−1} Y_k||_F^2
    E_{k+1} = π_Ω̄(D − A_{k+1} + μ_k^{−1} Y_k)

Here Ω̄ is the complementary set of Ω, Ω̄ = {(i,j) : (i,j) ∉ Ω}.

Solving the problem

The algorithm is reduced to

    input: observation samples D_ij, (i,j) ∈ Ω
    Y_0 = 0; E_0 = 0; μ_0 > 0; ρ > 1; k = 0
    while not converged
        A_{k+1} = D_{μ_k^{−1}}(D − E_k + μ_k^{−1} Y_k)
        E_{k+1} = π_Ω̄(D − A_{k+1} + μ_k^{−1} Y_k)
        Y_{k+1} = Y_k + μ_k (D − A_{k+1} − E_{k+1})
        μ_{k+1} = ρ μ_k;  k ← k + 1
    end while
    output: (A_k, E_k)

Robust PCA

Optimization problem

    minimize    rank(A) + λ||E||_0                        (8)
    subject to  A_ij + E_ij = D_ij,  (i,j) ∈ Ω

We look for the best rank-k approximation of the matrix D, which is corrupted by sparse noise.

Similar problems and conditions arise as in the case of matrix completion.

Robust PCA

The original problem is very hard to solve, so we look again for a convex relaxation of the problem:

    rank(A) = ||diag(Σ)||_0,  A = UΣV^T   →   ||A||_* = ||diag(Σ)||_1
    ||E||_0   →   ||E||_1

Convex relaxation

    minimize    ||A||_* + λ||E||_1                        (9)
    subject to  A_ij + E_ij = D_ij,  (i,j) ∈ Ω

Robust PCA

Conditions for exact recovery of the convex relaxation

In order to have exact recovery we need to impose that the low rank part is not sparse and also that the sparse part is not low rank.

Incoherence condition of the low rank part

The incoherence condition of a matrix A = USV^T ∈ R^{m×n} with parameter μ states that

    max_i ||U^T e_i||^2 ≤ μr/m,   max_i ||V^T e_i||^2 ≤ μr/n
    ||UV^T||_∞ ≤ √(μr/(mn))

Robust PCA

Theorem

Suppose A_0 obeys the incoherence condition with parameter μ, the sampling set Ω is uniformly distributed among all sets of cardinality M obeying M = 0.1mn, and each observed entry is corrupted with probability τ independently of the others. Then for N = max(m, n) there exists a constant c such that with probability at least 1 − cN^{−10}, problem (9) with λ = 1/√(0.1N) recovers the exact solution (A_0, E_0), provided that

    rank(A_0) ≤ ρ_r N μ^{−1} (log N)^{−2}   and   τ ≤ τ_s

where ρ_r and τ_s are positive numerical constants.

E.J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" http://arxiv.org/abs/0912.3599

Solving the problem

Rewriting the problem

    minimize    ||A||_* + λ||E||_1
    subject to  A_ij + E_ij = D_ij,  (i,j) ∈ Ω

is equivalent to

    minimize    ||A||_* + λ||E||_1
    subject to  A + E + Z = D,  π_Ω(Z) = 0

where

    [π_Ω(Z)]_ij = Z_ij if (i,j) ∈ Ω,  0 if (i,j) ∉ Ω
    [D]_ij      = D_ij if (i,j) ∈ Ω,  0 if (i,j) ∉ Ω

Z. Lin, M. Chen, L. Wu and Y. Ma, "The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices," http://arxiv.org/abs/1009.5055

Solving the problem

The augmented lagrangian for the problem

    minimize    ||A||_* + λ||E||_1
    subject to  A + E + Z = D,  π_Ω(Z) = 0

is

    L(A, E, Y, Z, μ) = ||A||_* + λ||E||_1 + ⟨Y, D − A − E − Z⟩ + (μ/2)||D − A − E − Z||_F^2    (10)

with the additional constraint π_Ω(Z) = 0.

Solving the problem

Algorithm

    input: observation samples D_ij, (i,j) ∈ Ω
    Y_0 = 0; E_0 = 0; A_0 = D; Z_0 = 0; μ_0 > 0; ρ > 1; k = 0
    while not converged
        A_{k+1} = argmin_A L(A, E_k, Y_k, Z_k, μ_k)
        E_{k+1} = argmin_E L(A_{k+1}, E, Y_k, Z_k, μ_k)
        Z_{k+1} = argmin_{Z, π_Ω(Z)=0} L(A_{k+1}, E_{k+1}, Y_k, Z, μ_k)
        Y_{k+1} = Y_k + μ_k (D − A_{k+1} − E_{k+1} − Z_{k+1})
        μ_{k+1} = ρ μ_k;  k ← k + 1
    end while
    output: (A_k, E_k, Z_k)

Solving the subproblems

Solving for A_{k+1}

    A_{k+1} = argmin_A L(A, E_k, Y_k, Z_k, μ_k)
    A_{k+1} = argmin_A ||A||_* + ⟨Y_k, D − A − E_k − Z_k⟩ + (μ_k/2)||D − A − E_k − Z_k||_F^2
    A_{k+1} = argmin_A μ_k^{−1}||A||_* + (1/2)||D − A − E_k − Z_k + μ_k^{−1} Y_k||_F^2

which has the closed form solution

    A_{k+1} = D_{μ_k^{−1}}(D − E_k − Z_k + μ_k^{−1} Y_k)

Solving the subproblems

Solving for E_{k+1}

    E_{k+1} = argmin_E L(A_{k+1}, E, Y_k, Z_k, μ_k)
    E_{k+1} = argmin_E λ||E||_1 + ⟨Y_k, D − A_{k+1} − E − Z_k⟩ + (μ_k/2)||D − A_{k+1} − E − Z_k||_F^2
    E_{k+1} = argmin_E λμ_k^{−1}||E||_1 + (1/2)||D − A_{k+1} − Z_k + μ_k^{−1} Y_k − E||_F^2

which has the general form

    argmin_E τ||E||_1 + (1/2)||X − E||_F^2

Solving the subproblems

Shrinkage operator

Given a matrix Y ∈ R^{m×n}, the operator S_τ(·) : R^{m×n} → R^{m×n} is defined as

    S_τ(Y) = sign(Y) · max{|Y| − τ, 0}

where the operation is applied componentwise to Y.

Theorem

For each τ ≥ 0 and Y ∈ R^{m×n}, the shrinkage operator obeys

    S_τ(Y) = argmin_X τ||X||_1 + (1/2)||Y − X||_F^2
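S_τ is elementwise soft-thresholding, and the prox characterization in the theorem can be checked directly by comparing the objective at S_τ(Y) against perturbed candidates (numpy sketch, illustrative only):

```python
import numpy as np

def shrink(Y, tau):
    """Elementwise soft-thresholding S_tau(Y) = sign(Y) * max(|Y| - tau, 0)."""
    return np.sign(Y) * np.maximum(np.abs(Y) - tau, 0.0)

def objective(X, Y, tau):
    # tau * ||X||_1 + (1/2) * ||Y - X||_F^2
    return tau * np.abs(X).sum() + 0.5 * np.linalg.norm(Y - X, 'fro') ** 2

rng = np.random.default_rng(6)
Y = rng.standard_normal((5, 4))
tau = 0.5

Xhat = shrink(Y, tau)
f_hat = objective(Xhat, Y, tau)

# no perturbation of the candidate improves the objective
for _ in range(200):
    P = 0.1 * rng.standard_normal(Y.shape)
    assert f_hat <= objective(Xhat + P, Y, tau) + 1e-9
```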

Proof:

Consider the function h(X) = τ||X||_1 + (1/2)||X − Y||_F^2.

A sufficient condition for optimality of X̂ is that

    0 ∈ X̂ − Y + τ ∂||X̂||_1

where ∂||X̂||_1 is the set of subgradients of the l1 norm at X̂. All the subgradients of ||X||_1 at X are given by

    ∂||X||_1 = { G ∈ R^{m×n} : G_ij = −1 for X_ij < 0,  G_ij = 1 for X_ij > 0,  G_ij ∈ [−1, 1] for X_ij = 0 }

If we prove that Y − X̂ ∈ τ ∂||X̂||_1, then X̂ is the unique minimizer of the problem.

Consider the candidate X̂ = S_τ(Y); then

    [Y − S_τ(Y)]_ij = Y_ij − sign(Y_ij) · max(|Y_ij| − τ, 0)
                    = τ sign(Y_ij)  if |Y_ij| > τ
                    = Y_ij          if |Y_ij| ≤ τ

On the other hand,

    τ ∂||S_τ(Y)||_1 = { G ∈ R^{m×n} : G_ij = τ sign(Y_ij) for |Y_ij| > τ,  G_ij ∈ [−τ, τ] for |Y_ij| ≤ τ }

Hence Y − S_τ(Y) ∈ τ ∂||S_τ(Y)||_1 and S_τ(Y) is the optimal solution.

Solving the subproblems

Solving for Z_{k+1}

    Z_{k+1} = argmin_{Z, π_Ω(Z)=0} L(A_{k+1}, E_{k+1}, Y_k, Z, μ_k)
    Z_{k+1} = argmin_{Z, π_Ω(Z)=0} ⟨Y_k, D − A_{k+1} − E_{k+1} − Z⟩ + (μ_k/2)||D − A_{k+1} − E_{k+1} − Z||_F^2
    Z_{k+1} = argmin_{Z, π_Ω(Z)=0} (1/2)||D − A_{k+1} − E_{k+1} − Z + μ_k^{−1} Y_k||_F^2
    Z_{k+1} = π_Ω̄(D − A_{k+1} − E_{k+1} + μ_k^{−1} Y_k)

Here Ω̄ is the complementary set of Ω, Ω̄ = {(i,j) : (i,j) ∉ Ω}.

Solving the problem

Algorithm

    input: observation samples D_ij, (i,j) ∈ Ω
    Y_0 = 0; E_0 = 0; Z_0 = 0; μ_0 > 0; ρ > 1; k = 0
    while not converged
        A_{k+1} = D_{μ_k^{−1}}(D − E_k − Z_k + μ_k^{−1} Y_k)
        E_{k+1} = S_{λμ_k^{−1}}(D − A_{k+1} − Z_k + μ_k^{−1} Y_k)
        Z_{k+1} = π_Ω̄(D − A_{k+1} − E_{k+1} + μ_k^{−1} Y_k)
        Y_{k+1} = Y_k + μ_k (D − A_{k+1} − E_{k+1} − Z_{k+1})
        μ_{k+1} = ρ μ_k;  k ← k + 1
    end while
    output: (A_k, E_k, Z_k)

ELEG 867 (MC and RPCA problems) Fall, 2011 91 / 91