
Compressed Sensing / Compressive Sampling

Daniel Weller, June 30, 2015

2

Introduction to Sampling

[Block diagram: x(t) → sampling filter s(-t) → sample at t = tm → y[m]]

• Sampling is composed of two operations:
– Discretization: continuous-time to discrete-time
– Quantization: discrete-time to digital
• Discretization can be written as an inner product with the sampling kernel (see the sketch below).
• Usually, we assume shift invariance.
Question: What is s(-t) in classical DSP?
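
A hedged reconstruction of the discretization model implied above (standard inner-product sampling; the slide's exact formula is not in the transcript, so the notation is an assumption):

    y[m] = \int x(t)\, s(t_m - t)\, dt = \langle x, s(t_m - \cdot) \rangle
    y[m] = \int x(t)\, s(mT - t)\, dt = (x * s)(mT) \quad \text{(shift-invariant case, } t_m = mT\text{)}

In classical DSP, s(-t) plays the role of the anti-aliasing / acquisition filter applied before uniform sampling.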

3

Sampling and Reconstruction
• Given a set of samples {y[m]}, how do we reconstruct x(t)?
• Classical approach:
– Sampling period T
– Interpolate with a filter h(t)
– Another interpretation: {h(t-nT)} is a basis for xr(t).
• Inverse problem form with coefficients {αn} (see the sketch below):
– We will mainly consider the finite M, N case.
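
A hedged sketch of the inverse-problem form referenced above (standard interpolation model; the matrix entries are my assumption):

    x_r(t) = \sum_n \alpha_n\, h(t - nT)
    y[m] = \langle x_r, s(t_m - \cdot) \rangle \;\Rightarrow\; y = A\alpha, \qquad A_{mn} = \int h(t - nT)\, s(t_m - t)\, dt

In the finite case, y \in \mathbb{R}^M, \alpha \in \mathbb{R}^N, and A is an M×N matrix.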

[Block diagram: y[m] × impulse train → interpolation filter h(t) → xr(t)]

4

Classical Approximation Theory
• Least squares problem: the normal equations are given below.
• For A'A to be positive definite, A must have at least N rows: need M ≥ N.
• For infinite-length signals:
– Shannon sampling theory: bandlimited signals
– Solution minimizes error power
• How about signals with structure?
– How many samples do we need to reconstruct a sine wave?
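
A hedged reconstruction of the least-squares problem and its normal equations (standard form):

    \hat{\alpha} = \arg\min_\alpha \| y - A\alpha \|_2^2
    A'A\,\hat{\alpha} = A'y \;\Rightarrow\; \hat{\alpha} = (A'A)^{-1} A'y \quad \text{when } A'A \text{ is positive definite}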

5

Compression and Approximation
• The more we know about a signal, the fewer samples we need to approximate it!
• This basic idea underlies signal compression methods like MP3, JPEG2000, and others.

[Figures: Peppers (uncompressed); Peppers with JPEG compression]

6

Compressible Signals
• What kinds of structure are useful to us?
– Low dimensionality: x = Φα, where Φ is N×K and K << N
– Union of subspaces: Φ1, Φ2, … are each subspaces
– Sparsity: the set XK of all K-sparse signals in the N-dimensional space (not the same as a Grassmannian…); see the sketch below
– Others, like finite rate of innovation, are also possible…
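
A hedged reconstruction of the sparsity set referenced above (standard notation):

    X_K = \{ x \in \mathbb{R}^N : \|x\|_0 \le K \}, \qquad \|x\|_0 = |\{ i : x_i \ne 0 \}|

X_K is a union of \binom{N}{K} K-dimensional subspaces rather than a single subspace.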

7

What signals are compressible?
• A signal is K-sparse if x ∈ XK.
• A signal is approximately K-sparse if its best K-term approximation error σK(x) (defined below) is small enough.
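
A hedged reconstruction of the approximation error used above and in the plot (standard definition, as in Davenport et al.):

    \sigma_K(x)_p = \min_{\hat{x} \in X_K} \| x - \hat{x} \|_p

For p = 2, the best K-term approximation keeps the K largest-magnitude entries of x and zeros the rest.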

[Figure: relative error σK(x)2/||x||2 vs. sparsity ratio K/N for the grayscale Peppers image; keeping 10% of the coefficients gives about 2% error. Images: Peppers (grayscale), Peppers (10% coefficients).]

8

Sparsity
• The set of signals with ℓp-norm ≤ K is called the K-ball (see below).
• Suppose we have a measurement y1 = a1'x. Among all x consistent with this measurement, where is the x that minimizes the ℓp-norm?
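
A hedged sketch of the K-ball and the single-measurement question above (notation assumed):

    B_p(K) = \{ x : \|x\|_p \le K \}, \qquad \|x\|_p = \Big( \textstyle\sum_i |x_i|^p \Big)^{1/p}
    \hat{x} = \arg\min_x \|x\|_p \quad \text{s.t.} \quad a_1' x = y_1

For p ≤ 1 the ball is pointed along the coordinate axes, so the constrained minimizer tends to land on an axis (a sparse solution); for p = 2 it generally does not.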

[Figure: ℓp balls in R2 for p = 1, p = 2, p = ½, and p = ¼, plotted on axes from -4 to 4.]

9

The Compressed Sensing Problem
• Finite measurements y are generated by y = Ax, with M×N sensing matrix A and M < N.
• If we know x is K-sparse (K << N), when is x determined uniquely by y?
– Null space condition: no two distinct K-sparse signals may map to the same y (spelled out below).
– This is true when the null space of A contains no nonzero 2K-sparse vectors. Why?
• More formal conditions follow.
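
A hedged spelling-out of the null space condition and the "Why?" above:

    Ax_1 = Ax_2 \text{ with } x_1, x_2 \in X_K \;\Rightarrow\; A(x_1 - x_2) = 0 \text{ and } \|x_1 - x_2\|_0 \le 2K

So every K-sparse x is determined uniquely by y = Ax iff \mathcal{N}(A) \cap X_{2K} = \{0\}, i.e., the null space contains no nonzero 2K-sparse vector.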

10

A Spark for Compressed Sensing
• Spark: the minimum number of columns of A that are linearly dependent.
• Theorem: every K-sparse x is unique iff spark(A) > 2K.
• What if x is compressible instead?
– We need to modify our condition to ensure the null space of A is not too compressible (the null space property, sketched below).
– This condition is related to the recovery guarantee.
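
A hedged reconstruction of the modified condition alluded to above, in the form of the null space property (NSP) from Davenport et al. (the constant C and notation are assumptions):

    \| h_\Lambda \|_2 \le C\, \frac{\| h_{\Lambda^c} \|_1}{\sqrt{K}} \quad \text{for all } h \in \mathcal{N}(A) \text{ and all index sets } |\Lambda| \le K

Informally, null-space vectors must spread their energy over many coordinates, so they cannot themselves be even approximately sparse.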

11

R.I.P., Compressed Sensing
• We assumed y = Ax exactly, with no noise.
• The restricted isometry property (RIP) extends the idea to noisy recovery (definition below).
• An isometry preserves the norm, so RIP states that A approximately preserves the norm of every K-sparse x.
• RIP is necessary for stability with noise e:
– Here, δ2K ≥ 1 - 1/C^2.
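
A hedged reconstruction of the RIP definition and the stability notion referenced above (following the standard statements in Davenport et al.):

    \text{RIP-}K: \quad (1 - \delta_K)\,\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_K)\,\|x\|_2^2 \quad \text{for all } x \in X_K
    C\text{-stability:} \quad \|\Delta(Ax + e) - x\|_2 \le C\,\|e\|_2 \text{ for all } x \in X_K \text{ and all noise } e

If some decoder Δ is C-stable for A, then (1/C)\|x\|_2 \le \|Ax\|_2 must hold for all 2K-sparse x, which is the lower RIP inequality with constant δ2K = 1 - 1/C^2.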

12

RIP Measurement Bounds
• If A satisfies RIP-2K with constant δ2K ≤ ½, then the number of measurements M must grow like K log(N/K) (see below).
• Proof sketch: First, construct a well-separated subset X of XK.
Since A satisfies RIP-2K, the images {Ax : x ∈ X} stay well separated and bounded.
These bounds allow us to state the measurement bound via a sphere-packing argument.
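
A hedged reconstruction of the measurement bound and the proof quantities sketched above (as I recall them from Davenport et al.; the constants should be checked against the reference):

    M \ge C\, K \log(N/K), \qquad C = \frac{1}{2\log(\sqrt{24} + 1)} \approx 0.28

The set X ⊂ XK can be chosen with \|x\|_2 \le \sqrt{K}, pairwise distances at least \sqrt{K/2}, and \log|X| \ge (K/2)\log(N/K); RIP-2K with δ2K ≤ ½ then keeps the images {Ax} separated by at least \sqrt{K/4} while confined to a ball of radius \sqrt{3K/2}, and counting how many such points can fit (sphere packing) yields the bound on M.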

Details are in M. A. Davenport et al., "Introduction to Compressed Sensing," in Compressed Sensing: Theory and Applications, Y. C. Eldar and G. Kutyniok, eds., Cambridge University Press, 2012, pp. 45-47.

13

Mutual (In-)Coherence
• The coherence μ(A) of a matrix A is the largest normalized inner product between two different columns (see below).
• It is possible to show spark(A) ≥ 1 + 1/μ(A).
• Thus, we have a coherence bound for exact recovery of K-sparse signals: K < (1 + 1/μ(A))/2 suffices.
• Also, A with unit-norm columns satisfies RIP-K with δK = (K-1)μ(A) for all K < 1/μ(A).
• Thus, the less coherent A is, the better RIP is.
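
A hedged reconstruction of the coherence definition referenced above (standard definition):

    \mu(A) = \max_{1 \le i < j \le N} \frac{ |\langle a_i, a_j \rangle| }{ \|a_i\|_2\, \|a_j\|_2 }

Combining spark(A) ≥ 1 + 1/μ(A) with the uniqueness theorem spark(A) > 2K gives the sufficient condition K < (1 + 1/μ(A))/2 stated above.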

14

Matrices for CS
• Some deterministic matrices have properties like minimum coherence or maximum spark:
– Equiangular tight frames (ETFs)
– Vandermonde matrices
• Random matrices can satisfy these properties without the limitations of construction (see the sketch below):
– i.i.d. Gaussian and Bernoulli matrices satisfy RIP with high probability.
– Such constructions are universal, in that RIP is satisfied irrespective of the signal basis.
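
A minimal Python sketch (my own illustration, not from the slides) that draws an i.i.d. Gaussian sensing matrix and empirically compares its mutual coherence with the Welch lower bound; the dimensions are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)
    M, N = 64, 256                                       # measurements << signal length
    A = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, N))   # i.i.d. N(0, 1/M) entries

    A_unit = A / np.linalg.norm(A, axis=0)               # normalize columns
    G = np.abs(A_unit.T @ A_unit)                        # pairwise column correlations
    np.fill_diagonal(G, 0.0)
    mu = G.max()                                         # mutual coherence mu(A)

    welch = np.sqrt((N - M) / (M * (N - 1)))             # Welch lower bound on coherence
    print(f"coherence = {mu:.3f}, Welch bound = {welch:.3f}")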

[Figure: Mercedes-Benz ETF, three equiangular unit vectors in R2 plotted on axes from -1 to 1.]

15

Matrices for CS
• For image processing, an i.i.d. random A may be extremely large.
• Instead, we can randomly subsample a deterministic sensing matrix (see the sketch below):
– The Fourier transform is used for MRI and some optical systems.
– The coherence and RIP bounds are not quite as good, and not universal.
– Some work (e.g., SparseMRI) empirically verifies the incoherence of a random sensing matrix.
• We can also construct dictionaries from data.
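
A minimal Python sketch (assumed, not from the slides) of a randomly subsampled DFT used as an implicit sensing operator, so the full M×N matrix is never stored; the dimensions and sampling pattern are arbitrary:

    import numpy as np

    rng = np.random.default_rng(1)
    N = 4096                                     # signal length
    M = 512                                      # number of retained frequencies
    rows = rng.choice(N, size=M, replace=False)  # random frequency subset

    def A_forward(x):
        """y = randomly subsampled, orthonormal DFT of x (length M)."""
        return np.fft.fft(x, norm="ortho")[rows]

    def A_adjoint(y):
        """Adjoint: zero-fill the M samples and apply the inverse DFT (length N)."""
        full = np.zeros(N, dtype=complex)
        full[rows] = y
        return np.fft.ifft(full, norm="ortho")

    # adjoint consistency check: <Ax, y> should equal <x, A'y>
    x = rng.standard_normal(N)
    y = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    print(np.allclose(np.vdot(A_forward(x), y), np.vdot(x, A_adjoint(y))))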

16

CS Reconstruction Formulation
• Consider the sparse recovery problem (formulations sketched below):
– Exact
– Noisy
• The convex relaxation yields a sparse solution.
• An unconstrained version is also popular.
• The matrix A may include a dictionary Φ.
• We will describe several standard approaches.
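
A hedged reconstruction of the formulations referenced above (standard forms; the parenthetical names are my additions):

    \text{Exact:} \quad \min_x \|x\|_0 \ \text{s.t.}\ Ax = y
    \text{Noisy:} \quad \min_x \|x\|_0 \ \text{s.t.}\ \|Ax - y\|_2 \le \varepsilon
    \text{Convex relaxation (BPDN):} \quad \min_x \|x\|_1 \ \text{s.t.}\ \|Ax - y\|_2 \le \varepsilon
    \text{Unconstrained:} \quad \min_x \tfrac{1}{2}\|Ax - y\|_2^2 + \lambda \|x\|_1

With a dictionary Φ, replace A by AΦ and recover the coefficient vector.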

17

(Orthogonal) Matching Pursuit
• At each step, include the next column of A that best correlates with the residual.
• The columns of A must have unit norm.
• The ith step chooses the atom λi whose column has the largest correlation |aλ'r_{i-1}| with the current residual r_{i-1}.
• If Ai is the collection of atoms λ1, …, λi, the new signal estimate xi is the least-squares fit of y on those columns.
• The new residual is ri = y - Axi. This residual is orthogonal to the columns in Ai.
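
A minimal OMP sketch in Python (my own illustration of the steps above, not the authors' code; the test dimensions are arbitrary):

    import numpy as np

    def omp(A, y, K):
        """Orthogonal matching pursuit: greedily recover a K-sparse x from y ~ Ax.
        Assumes the columns of A have unit norm."""
        N = A.shape[1]
        support = []                       # chosen atoms (lambda_1, ..., lambda_i)
        r = y.copy()                       # current residual
        x = np.zeros(N)
        for _ in range(K):
            corr = np.abs(A.T @ r)         # correlation of every atom with the residual
            corr[support] = 0.0            # never pick the same atom twice
            support.append(int(np.argmax(corr)))
            # least-squares fit on the current support: makes r orthogonal to A_i
            coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
            x = np.zeros(N)
            x[support] = coef
            r = y - A @ x                  # new residual
        return x

    # quick check with a random Gaussian A and a 5-sparse ground truth
    rng = np.random.default_rng(0)
    M, N, K = 60, 200, 5
    A = rng.standard_normal((M, N))
    A /= np.linalg.norm(A, axis=0)         # unit-norm columns
    x_true = np.zeros(N)
    x_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
    x_hat = omp(A, A @ x_true, K)
    print("max recovery error:", np.max(np.abs(x_hat - x_true)))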

18

Iterative Hard Thresholding
• Another approach repeatedly thresholds away the non-sparse coefficients.
• At each step, the normal residual A'(y - Axi-1) is added to xi-1, and the result is hard-thresholded (keeping the K largest-magnitude entries) to form xi.
• We can also view IHT as thresholding the minimizer of a separable quadratic approximation to the least-squares objective ||y - Ax||2^2 (see the sketch below).

19

Convex Optimization
• The unconstrained version is the l1-regularized least-squares problem min_x ½||Ax - y||2^2 + λ||x||1.
• Iterative soft thresholding is similar to IHT, but uses the soft-thresholding (shrinkage) operator in place of hard thresholding (see the sketch below).
• Split Bregman iteration is also popular:
– This method is the same as ADMM (with one split).
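
A minimal iterative soft-thresholding (ISTA) sketch in Python for the unconstrained problem above (the 1/L step size is my assumption, not from the slides):

    import numpy as np

    def soft_threshold(z, t):
        """Shrinkage operator: sign(z) * max(|z| - t, 0)."""
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def ista(A, y, lam, n_iter=500):
        """Minimize 0.5*||Ax - y||_2^2 + lam*||x||_1 by proximal gradient steps."""
        L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            grad = A.T @ (A @ x - y)           # gradient of the quadratic term
            x = soft_threshold(x - grad / L, lam / L)
        return x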

20

Convex Optimization
• The constrained version (BPDN) is min_x ||x||1 subject to ||Ax - y||2 ≤ ε.
• When ε = 0 and x is real, the problem is an LP.
• SPGL1 solves the related LASSO problem min_x ||Ax - y||2 subject to ||x||1 ≤ τ, and by finding the τ that maps to ε, solves BPDN (see below).
• SPGL1 uses a linesearch-based projected gradient approach to solve the LASSO.
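
A hedged sketch of the problem pair referenced above (following van den Berg and Friedlander; the Pareto-curve notation is an assumption):

    \text{BPDN}(\varepsilon): \quad \min_x \|x\|_1 \ \text{s.t.}\ \|Ax - y\|_2 \le \varepsilon
    \text{LASSO}(\tau): \quad \min_x \|Ax - y\|_2 \ \text{s.t.}\ \|x\|_1 \le \tau

Let φ(τ) be the optimal residual norm of LASSO(τ) (the Pareto curve). SPGL1 applies Newton root finding to φ(τ) = ε, solving each LASSO(τ) subproblem with spectral projected gradient steps.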

Details are in E. van den Berg and M. P. Friedlander, "Probing the Pareto frontier for basis pursuit solutions," SIAM J. Sci. Comput., 31(2), pp. 890-912, 2008.
