Random matrix theory in sparse recovery
Maryia Kabanava, RWTH Aachen University
CoSIP Winter Retreat 2016

Page 1: Random matrix theory in sparse recovery

Maryia Kabanava

RWTH Aachen University

CoSIP Winter Retreat 2016

Page 2: Compressed sensing

Goal: reconstruction of (high-dimensional) signals from a minimal amount of measured data

Key ingredients:

Exploit low complexity of signals (e.g. sparsity/compressibility)

Efficient algorithms (e.g. convex optimization)

Randomness (random matrices)


Page 3: Signal recovery problem

Signal x ∈ R^d is unknown.

Given:

Linear measurement map: M : R^d → R^m, m ≪ d.

Measurement vector: y = Mx + w ∈ R^m, ‖w‖2 ≤ η.

Goal: recover x from y.
Idea: recovery is possible if x belongs to a set of low complexity.

Standard compressed sensing: sparsity (small number of nonzero coefficients)

Cosparsity: sparsity after transformation

Structured sparsity: e.g. block sparsity

Low rank matrix recovery

Low rank tensor recovery


Page 4: Noiseless model

[Figure: y = Mx, an under-determined m × d linear system; the support S = supp x ⊂ {1, 2, . . . , d} and its complement S^c are highlighted.]

ℓ0-minimization:

min_{z ∈ R^d} ‖z‖0 s.t. Mz = y

NP-hard

ℓ1-minimization:

min_{z ∈ R^d} ‖z‖1 s.t. Mz = y

efficient minimization methods

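The ℓ1 program above is a linear program in disguise, so it can be solved with off-the-shelf LP solvers. A minimal sketch (not part of the slides; the dimensions are illustrative, and exact recovery of this particular random instance is only highly likely, not guaranteed):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
d, m, s = 60, 30, 3

# Gaussian measurement matrix and an s-sparse signal
M = rng.standard_normal((m, d)) / np.sqrt(m)
x = np.zeros(d)
x[rng.choice(d, size=s, replace=False)] = rng.standard_normal(s)
y = M @ x

# min ||z||_1 s.t. Mz = y, rewritten with variables (z, t):
#   min sum(t)  s.t.  z - t <= 0,  -z - t <= 0,  Mz = y
c = np.concatenate([np.zeros(d), np.ones(d)])
A_ub = np.block([[np.eye(d), -np.eye(d)],
                 [-np.eye(d), -np.eye(d)]])
b_ub = np.zeros(2 * d)
A_eq = np.hstack([M, np.zeros((m, d))])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * (2 * d))
x_hat = res.x[:d]
print("recovery error:", np.linalg.norm(x - x_hat))
```

With m = 30 ≈ 2s ln(ed/s) Gaussian measurements, the LP returns the sparse vector itself rather than some other solution of Mz = y.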

Page 5: Nonuniform vs. uniform recovery

Nonuniform recovery: a fixed sparse (compressible) vector is recovered with high probability using M.

Sufficient conditions on M: the descent cone of the ℓ1-norm at x intersects ker M trivially; construct an (approximate) dual certificate.

Uniform recovery: with high probability on M, every sparse (compressible) vector is recovered.

Sufficient conditions on M: the null space property; the restricted isometry property.


Page 6: Nonuniform recovery: descent cone

For fixed x ∈ R^d, we define the convex cone

T(x) = cone{z − x : z ∈ R^d, ‖z‖1 ≤ ‖x‖1}.

Theorem

Let M ∈ R^{m×d}. A vector x ∈ R^d is the unique minimizer of ‖z‖1 subject to Mz = Mx if and only if ker M ∩ T(x) = {0}.

[Figure: the affine space x + ker M meeting the shifted descent cone x + T(x) only at x.]

Let S^{d−1} = {x ∈ R^d : ‖x‖2 = 1} and set T := T(x) ∩ S^{d−1}. If

inf_{x ∈ T} ‖Mx‖2 > 0, (1)

then ker M ∩ T = ∅ and ker M ∩ T(x) = {0}.

Page 7: Uniform recovery: null space property (NSP)

M ∈ R^{m×d} is said to satisfy the stable NSP of order s with 0 < ρ < 1, if for any S ⊂ [d] with |S| ≤ s it holds

‖vS‖1 < ρ‖vS^c‖1 for all v ∈ ker M \ {0}. (2)

Theorem

Let M ∈ R^{m×d} satisfy (2). Then, for any x ∈ R^d, the solution x̂ of

min_{z ∈ R^d} ‖z‖1 subject to Mz = y,

with y = Mx, approximates x with ℓ1-error

‖x − x̂‖1 ≤ (2(1 + ρ))/(1 − ρ) · σs(x)1, (3)

where σs(x)1 := inf{‖x − z‖1 : z is s-sparse}.

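The error bound (3) is easy to evaluate: σs(x)1 is just the sum of all but the s largest magnitudes of x. A small sketch with hypothetical numbers:

```python
import numpy as np

def sigma_s_l1(x, s):
    """Best s-term approximation error in l1: sum of all but the s largest |x_i|."""
    return np.sort(np.abs(x))[::-1][s:].sum()

x = np.array([3.0, -2.0, 0.5, 0.1, -0.05])  # compressible: two dominant entries
s, rho = 2, 0.5
error_bound = 2 * (1 + rho) / (1 - rho) * sigma_s_l1(x, s)
print(error_bound)  # 6 * (0.5 + 0.1 + 0.05) = 3.9
```

For an exactly s-sparse x the bound is zero, i.e. the NSP guarantees exact recovery.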

Page 8: Strategy to check NSP

Lemma

Let

Tρ,s := {w ∈ R^d : ‖wS‖1 ≥ ρ‖wS^c‖1 for some S ⊂ [d], |S| ≤ s}.

Set T := Tρ,s ∩ S^{d−1}. If

inf_{w ∈ T} ‖Mw‖2 > 0,

then for any v ∈ ker M \ {0} and any S ⊂ [d] with |S| ≤ s it holds

‖vS‖1 < ρ‖vS^c‖1.


Page 9: Uniform recovery: restricted isometry property (RIP)

Definition

The restricted isometry constant δs of a matrix M ∈ R^{m×d} is defined as the smallest δ ≥ 0 such that

(1 − δ)‖x‖2² ≤ ‖Mx‖2² ≤ (1 + δ)‖x‖2² (4)

for all s-sparse x ∈ R^d.

Requires that all s-column submatrices of M are well-conditioned:

δs = max_{|S| ≤ s} ‖M_S^T M_S − Id‖_{2→2}.

A sufficiently small δ2s implies the stable NSP.

We say that M satisfies the restricted isometry property if δs is small for reasonably large s.

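For tiny dimensions the characterization δs = max_{|S| ≤ s} ‖M_S^T M_S − Id‖ can be checked by brute force. A sketch (illustrative sizes only; the number of supports grows combinatorially, so this is infeasible beyond toy d):

```python
import numpy as np
from itertools import combinations

def rip_constant(M, s):
    """delta_s = max over |S| = s of ||M_S^T M_S - I||_{2->2} (brute force).
    Supports of size exactly s suffice: delta is nondecreasing in |S|."""
    _, d = M.shape
    return max(
        np.linalg.norm(M[:, list(S)].T @ M[:, list(S)] - np.eye(s), 2)
        for S in combinations(range(d), s)
    )

rng = np.random.default_rng(1)
M = rng.standard_normal((40, 10)) / np.sqrt(40)  # normalized Gaussian matrix
print(rip_constant(M, 1), rip_constant(M, 2))
```

Even at this scale one sees δ1 ≤ δ2 and both well below 1, matching the "well-conditioned submatrices" reading of (4).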

Page 10: RIP implies recovery by ℓ1-minimization

(1 − δs)‖x‖2² ≤ ‖Mx‖2² ≤ (1 + δs)‖x‖2² (5)

Theorem

Assume that the restricted isometry constant of M ∈ R^{m×d} satisfies

δ2s < 1/√2 ≈ 0.7071.

Then ℓ1-minimization reconstructs every s-sparse vector x ∈ R^d from y = Mx.


Page 11: Matrices satisfying recovery conditions

Open problem: give explicit matrices M ∈ R^{m×d} that satisfy recovery conditions.

Goal: successful recovery with M ∈ R^{m×d}, if

m ≥ C s ln^α(d),

for constants C and α.

Deterministic matrices are known only in the quadratic regime m ≥ C s².

Way out: consider random matrices.


Page 12: Gaussian random variables

Gaussian random variables

A standard Gaussian random variabel X ∼ N(0, 1) has probabilitydensity function

ψ(x) =1√2π

e−x2/2. (6)

1 The tail of X decays super-exponentially

P(|X | > t) ≤ e−t2/2, t > 0. (7)

2 The absolute moments of X can be computed as

(E |X |p)1/p =√2

(

Γ((1 + p)/2)

Γ(1/2)

)1/p

= O(√p), p ≥ 1.

3 The moment generating function of X equals

E exp(tX ) = et2/2, t ∈ R.

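Property 2 can be sanity-checked against the classical even moments EX² = 1 and EX⁴ = 3, and its O(√p) growth verified directly. A sketch using the Gamma-function formula from the slide:

```python
import math

def abs_moment(p):
    """(E|X|^p)^(1/p) for X ~ N(0,1), via the Gamma-function formula."""
    return math.sqrt(2.0) * (math.gamma((1 + p) / 2) / math.gamma(0.5)) ** (1.0 / p)

print(abs_moment(2))  # 1.0, since E X^2 = 1
print(abs_moment(4))  # 3**0.25, since E X^4 = 3
# O(sqrt(p)) growth: the ratio abs_moment(p) / sqrt(p) stays bounded
print(max(abs_moment(p) / math.sqrt(p) for p in range(1, 101)))
```

The ratio in the last line in fact decreases toward 1/√e, which is the Stirling-asymptotics limit of the formula.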

Page 13: Subgaussian random variables

Lemma

Let X be a random variable with EX = 0. Then the following properties are equivalent.

1. Tails: there exist β, κ > 0 such that

P(|X| > t) ≤ βe^{−κt²} for all t > 0. (8)

2. Moments: there exists C > 0 such that

(E|X|^p)^{1/p} ≤ C√p for all p ≥ 1. (9)

3. Moment generating function: there exists c > 0 such that

E exp(tX) ≤ e^{ct²} for all t ∈ R. (10)

A random variable X with EX = 0 that satisfies one of the properties above is called subgaussian.


Page 14: Subgaussian random variables: examples

1. Gaussian

2. Bernoulli: P{X = −1} = P{X = 1} = 1/2

3. Bounded: |X| ≤ M almost surely for some M


Page 15: Hoeffding-type inequality

Theorem

Let X1, . . . , XN be a sequence of independent subgaussian random variables,

E exp(tXi) ≤ e^{ct²} for all t ∈ R and i ∈ {1, . . . , N}. (11)

For a ∈ R^N, the random variable Z := Σ_{i=1}^N ai Xi is subgaussian, i.e.

E exp(tZ) ≤ exp(c‖a‖2² t²) for all t ∈ R (12)

and

P(|Σ_{i=1}^N ai Xi| ≥ t) ≤ 2 exp(−t²/(4c‖a‖2²)) for all t > 0. (13)

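Inequality (13) can be checked empirically for Rademacher signs, where (11) holds with c = 1/2 (since E exp(tX) = cosh t ≤ e^{t²/2}). A Monte Carlo sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials, c = 50, 20000, 0.5
a = rng.standard_normal(N)

# Z = sum_i a_i X_i for Rademacher signs X_i
X = rng.choice([-1.0, 1.0], size=(trials, N))
Z = X @ a

t = 2.0 * np.linalg.norm(a)  # a 2-sigma threshold: Var(Z) = ||a||_2^2
empirical = np.mean(np.abs(Z) >= t)
hoeffding = 2 * np.exp(-t ** 2 / (4 * c * np.linalg.norm(a) ** 2))
print(empirical, hoeffding)  # the empirical tail sits below the bound 2 e^{-2}
```

The bound 2e^{−2} ≈ 0.27 is loose here (the true tail is close to the Gaussian value ≈ 0.046), which is typical: Hoeffding trades sharpness for generality over all subgaussian summands.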

Page 16: Subexponential random variables

A random variable X with EX = 0 is called subexponential if there exist β, κ > 0 such that

P(|X| > t) ≤ βe^{−κt} for all t > 0. (14)

Theorem (Bernstein-type inequality)

Let X1, . . . , XN be a sequence of independent subexponential random variables,

P(|Xi| > t) ≤ βe^{−κt} for all t > 0 and i ∈ {1, . . . , N}. (15)

Then

P(|Σ_{i=1}^N Xi| ≥ t) ≤ 2 exp(−((κt)²/2)/(2βN + κt)) for all t > 0. (16)


Page 17: Random matrices

Definition

Let M ∈ R^{m×d} be a random matrix.

If the entries of M are independent Bernoulli variables (i.e. taking values ±1 with equal probability), then M is called a Bernoulli random matrix.

If the entries of M are independent standard Gaussian random variables, then M is called a Gaussian random matrix.

If the entries of M are independent subgaussian random variables,

P(|Mjk| ≥ t) ≤ βe^{−κt²} for all t > 0,

then M is called a subgaussian random matrix.


Page 18: RIP for subgaussian random matrices

Theorem

Let M ∈ R^{m×d} be a subgaussian random matrix. Then there exists C = C(β, κ) > 0 such that the restricted isometry constant of (1/√m)M satisfies δs ≤ δ with probability at least 1 − ε provided

m ≥ Cδ^{−2}(s ln(ed/s) + ln(2ε^{−1})). (17)

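The bound (17) shows that m only needs to scale like s ln(ed/s), i.e. logarithmically in the ambient dimension. A sketch evaluating the right-hand side (the constant C from the theorem is unspecified; C = 1 below is a purely illustrative stand-in):

```python
import math

def rip_bound(s, d, delta=0.3, eps=1e-3, C=1.0):
    """m >= C delta^-2 (s ln(ed/s) + ln(2/eps)).
    C stands in for the unspecified absolute constant of the theorem."""
    return math.ceil(C / delta ** 2 * (s * math.log(math.e * d / s)
                                       + math.log(2 / eps)))

for d in (10 ** 3, 10 ** 4, 10 ** 5):
    print(d, rip_bound(s=10, d=d))  # grows only logarithmically in d
```

Multiplying d by 1000 barely moves the required m, while doubling s roughly doubles it.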

Page 19: Random matrices with subgaussian rows

Let Y ∈ R^d be random.

If E|〈Y, x〉|² = ‖x‖2² for all x ∈ R^d, then Y is called isotropic.

If, for all x ∈ R^d with ‖x‖2 = 1, the random variable 〈Y, x〉 is subgaussian,

E exp(t〈Y, x〉) ≤ e^{ct²} for all t ∈ R (with c independent of x),

then Y is called a subgaussian random vector.

Theorem

Let M ∈ R^{m×d} be random with independent, isotropic, subgaussian rows with the same parameter c. If

m ≥ Cδ^{−2}(s ln(ed/s) + ln(2ε^{−1})), (18)

then the restricted isometry constant of (1/√m)M satisfies δs ≤ δ with probability at least 1 − ε.


Page 20: Ingredients of the proof: concentration inequality

Let M ∈ R^{m×d} be random with independent, isotropic, subgaussian rows. Then, for all x ∈ R^d and every t ∈ (0, 1),

P(|m^{−1}‖Mx‖2² − ‖x‖2²| ≥ t‖x‖2²) ≤ 2 exp(−ct²m). (19)

Proof.

Let x ∈ R^d, ‖x‖2 = 1. Denote the rows of M by Y1, . . . , Ym ∈ R^d. Define

Zi = |〈Yi, x〉|² − ‖x‖2², i = 1, . . . , m.

Then EZi = 0 and P(|Zi| ≥ r) ≤ β exp(−κr), and

m^{−1}‖Mx‖2² − ‖x‖2² = m^{−1} Σ_{i=1}^m Zi.

Bernstein's inequality yields

P(|m^{−1} Σ_{i=1}^m Zi| ≥ t) ≤ 2 exp(−(κ²/(4β + 2κ)) m t²).


Page 21: Ingredients of the proof: covering argument

Let M ∈ R^{m×d} be random and

P(|m^{−1}‖Mx‖2² − ‖x‖2²| ≥ t‖x‖2²) ≤ 2 exp(−ct²m) for all x ∈ R^d.

Define M̃ = (1/√m)M. Then

P(|‖M̃x‖2² − ‖x‖2²| ≥ t‖x‖2²) ≤ 2 exp(−ct²m) for all x ∈ R^d.

For S ⊂ {1, . . . , d}, |S| = s and δ, ε ∈ (0, 1), if

m ≥ Cδ^{−2}(7s + 2 ln(2ε^{−1})), (20)

then with probability at least 1 − ε

‖M̃_S^T M̃_S − Id‖_{2→2} < δ. (21)


Page 22: Ingredients of the proof: union bound

Let M̃ ∈ R^{m×d} be random and

P(|‖M̃x‖2² − ‖x‖2²| ≥ t‖x‖2²) ≤ 2 exp(−ct²m) for all x ∈ R^d.

If for δ, ε ∈ (0, 1),

m ≥ Cδ^{−2}[s(9 + 2 ln(d/s)) + 2 ln(2ε^{−1})], (22)

then with probability at least 1 − ε, the restricted isometry constant δs of M̃ satisfies δs < δ.


Page 23: Gaussian width

For T ⊂ R^d we define its Gaussian width by

ℓ(T) := E sup_{x ∈ T} 〈x, g〉, where g ∈ R^d is a standard Gaussian random vector. (23)

[Figure: the width of a set T in a direction u.]

Due to rotation invariance, (23) can be written as

ℓ(T) = E‖g‖2 · E sup_{x ∈ T} 〈x, u〉,

where u is uniformly distributed on S^{d−1}.

Examples:

ℓ(S^{d−1}) = E sup_{‖x‖2=1} 〈x, g〉 = E‖g‖2 ∼ √d

D := conv{x ∈ S^{d−1} : |supp x| ≤ s}, ℓ(D) ∼ √(s ln(d/s))

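The sphere example ℓ(S^{d−1}) = E‖g‖2 ∼ √d is easy to verify by simulation, since the supremum of 〈x, g〉 over the unit sphere is attained at x = g/‖g‖2. A Monte Carlo sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
d, trials = 100, 5000

# sup_{||x||_2 = 1} <x, g> = ||g||_2, so the width of the sphere is E||g||_2
g = rng.standard_normal((trials, d))
width = np.linalg.norm(g, axis=1).mean()
print(width, np.sqrt(d))  # the two agree up to lower-order terms
```

The concentration of ‖g‖2 around √d (its standard deviation is O(1), independent of d) is the same phenomenon exploited in Gordon's theorem on the next slide.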

Page 24: Gordon's escape through a mesh

ℓ(T) := E sup_{x ∈ T} 〈x, g〉, g ∈ R^d standard Gaussian.

Em := E‖g‖2 = √2 Γ((m + 1)/2)/Γ(m/2), g ∈ R^m standard Gaussian,

m/√(m + 1) ≤ Em ≤ √m.

Theorem

Let M ∈ R^{m×d} be Gaussian and T ⊂ S^{d−1}. Then, for t > 0, it holds

P(inf_{x ∈ T} ‖Mx‖2 > Em − ℓ(T) − t) ≥ 1 − e^{−t²/2}. (24)

The proof relies on the concentration of measure inequality for Lipschitz functions.

The number of measurements m is determined by requiring

Em ≥ m/√(m + 1) ≥ ℓ(T) + t,

so that m ≳ ℓ(T)².


Page 25: Estimates for Gaussian widths of T(x)

T(x) = cone{z − x : z ∈ R^d, ‖z‖1 ≤ ‖x‖1} (25)

N(x) := {z ∈ R^d : 〈z, w − x〉 ≤ 0 for all w s.t. ‖w‖1 ≤ ‖x‖1} (26)

ℓ(T(x) ∩ S^{d−1}) ≤ E min_{z ∈ N(x)} ‖g − z‖2, where g ∈ R^d is a standard Gaussian random vector.

Let supp(x) = S. Then

N(x) = ⋃_{t ≥ 0} {z ∈ R^d : zi = t sgn(xi), i ∈ S, |zi| ≤ t, i ∈ S^c}

and

[ℓ(T(x) ∩ S^{d−1})]² ≤ 2s ln(ed/s).

Page 26: Nonuniform recovery with Gaussian measurements

Theorem

Let x ∈ R^d be an s-sparse vector and let M ∈ R^{m×d} be a randomly drawn Gaussian matrix. If, for some ε ∈ (0, 1),

m²/(m + 1) ≥ 2s (√(ln(ed/s)) + √(ln(ε^{−1})/s))², (27)

then with probability at least 1 − ε the vector x is the unique minimizer of ‖z‖1 subject to Mz = Mx.

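Condition (27) pins down concrete measurement counts. A sketch that finds the smallest m satisfying it (the parameter values in the example are illustrative):

```python
import math

def required_m(s, d, eps):
    """Smallest m with m^2/(m+1) >= 2s (sqrt(ln(ed/s)) + sqrt(ln(1/eps)/s))^2."""
    rhs = 2 * s * (math.sqrt(math.log(math.e * d / s))
                   + math.sqrt(math.log(1 / eps) / s)) ** 2
    m = 1
    while m * m / (m + 1) < rhs:
        m += 1
    return m

print(required_m(s=10, d=10 ** 4, eps=1e-3))  # roughly 2 s ln(ed/s) measurements
```

Since m²/(m + 1) ≈ m for large m, the condition is essentially m ≥ 2s ln(ed/s) plus a lower-order term for the failure probability, in contrast to the m ~ s² regime of known deterministic constructions.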

Page 27: Estimates for Gaussian widths of Tρ,s

Tρ,s := {w ∈ R^d : ‖wS‖1 ≥ ρ‖wS^c‖1 for some S ⊂ [d], |S| = s} (28)

D := conv{x ∈ S^{d−1} : |supp(x)| ≤ s} (29)

Tρ,s ∩ S^{d−1} ⊂ (1 + ρ^{−1})D

ℓ(D) ≤ √(2s ln(ed/s)) + √s

ℓ(Tρ,s ∩ S^{d−1}) ≤ (1 + ρ^{−1})(√(2s ln(ed/s)) + √s)


Page 28: Uniform recovery with Gaussian measurements

Theorem

Let M ∈ R^{m×d} be Gaussian, 0 < ρ < 1 and 0 < ε < 1. If

m²/(m + 1) ≥ 2s (1 + ρ^{−1})² (√(ln(ed/s)) + 1/√2 + √(ln(ε^{−1})/(s(1 + ρ^{−1})²)))²,

then with probability at least 1 − ε, for every x ∈ R^d a minimizer x̂ of ‖z‖1 subject to Mz = Mx approximates x with ℓ1-error

‖x − x̂‖1 ≤ (2(1 + ρ))/(1 − ρ) · σs(x)1.


Page 29

Thank you for your attention!
