Bregman Iterative Methods, Lagrangian Connections, Dual Interpretations, and Applications

Ernie Esser
UCLA
6-30-09



Outline

• Bregman Iteration Overview
  – Method for Constrained Optimization
  – Compare to Denoising Application
• Linearized Bregman for l1-Minimization
  – Derivation and Equivalent Forms
• Lagrangian Connections
  – Bregman Iteration / Method of Multipliers
  – Linearized Bregman / Uzawa Method
• Dual Interpretations
  – Proximal Point Algorithm
  – Gradient Ascent


Outline Continued

• Split Bregman Idea
  – TV-l2 Example
• More General Separable Convex Programs
  – Split Bregman Connection to ADMM
  – Convergence
  – Dual Interpretation
  – TV-l1 Minimization Example
• Decoupling Variables for More Explicit Algorithms
  – TV Deblurring Example
  – Compressive Sensing Example
• Connection to PDHG
  – Main Idea and Derivation
• Further Applications...


A Model Constrained Minimization Problem

$$\min_u J(u) \quad \text{s.t.} \quad Ku = f$$

where $J : \mathbb{R}^m \to (-\infty, \infty]$ is closed, proper, and convex, $u \in \mathbb{R}^m$, $K \in \mathbb{R}^{s \times m}$, $f \in \mathbb{R}^s$.

Examples:
• $J(u) = \|u\|_1$ (Basis Pursuit)
• $J(u) = \|u\|_{TV}$


Bregman Distance

$$D_J^{p^k}(u, u^k) = J(u) - J(u^k) - \langle p^k, u - u^k \rangle,$$

where $p^k \in \partial J(u^k)$. By definition of the subdifferential, $p^k \in \partial J(u^k)$ means

$$J(v) - J(u^k) - \langle p^k, v - u^k \rangle \ge 0 \quad \forall v.$$
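As a concrete illustration (mine, not from the original slides), here is a minimal numpy sketch of the Bregman distance for $J(u) = \|u\|_1$, using the valid subgradient $p^k = \mathrm{sign}(u^k)$ (with $\mathrm{sign}(0) = 0$):

```python
import numpy as np

def bregman_distance_l1(u, uk):
    """D_J^{p^k}(u, u^k) for J(u) = ||u||_1 with the subgradient p^k = sign(u^k)."""
    pk = np.sign(uk)  # a valid element of the subdifferential of ||.||_1 at uk
    return np.sum(np.abs(u)) - np.sum(np.abs(uk)) - np.dot(pk, u - uk)

u = np.array([1.0, -2.0, 0.5])
uk = np.array([0.5, -1.0, 0.0])
print(bregman_distance_l1(u, uk))  # nonnegative, by the subdifferential inequality
```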


Bregman Iteration

$$u^{k+1} = \arg\min_u D_J^{p^k}(u, u^k) + \frac{\delta}{2}\|Ku - f\|^2$$
$$p^{k+1} = p^k - \delta K^T(Ku^{k+1} - f) \in \partial J(u^{k+1})$$

Equivalent form of the $u^{k+1}$ update:

$$u^{k+1} = \arg\min_u J(u) - \langle p^k, u \rangle + \frac{\delta}{2}\|Ku - f\|^2$$

Initialization: $p^0 = 0$, $u^0$ arbitrary.

Ref: Yin, W., Osher, S., Goldfarb, D., and Darbon, J., Bregman Iterative Algorithms for l1-Minimization with Applications to Compressed Sensing, UCLA CAM Report [07-37], 2007.
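As an illustration not in the original slides, here is a minimal sketch of Bregman iteration for the simple choice $J(u) = \frac{1}{2}\|u\|^2$, where the $u$ update has a closed form; the iterates approach the minimum-norm solution of $Ku = f$ (the matrix $K$, the data $f$, and $\delta$ below are arbitrary demo choices):

```python
import numpy as np

np.random.seed(0)
K = np.random.randn(3, 5)              # underdetermined system
f = np.random.randn(3)
delta = 1.0

m = K.shape[1]
u, p = np.zeros(m), np.zeros(m)        # p^0 = 0, u^0 arbitrary
M = np.eye(m) + delta * K.T @ K        # Hessian of the u-subproblem

for k in range(200):
    # u^{k+1} = argmin_u 1/2||u||^2 - <p^k, u> + delta/2 ||Ku - f||^2
    u = np.linalg.solve(M, p + delta * K.T @ f)
    # p^{k+1} = p^k - delta K^T (K u^{k+1} - f)
    p = p - delta * K.T @ (K @ u - f)

print(np.linalg.norm(K @ u - f))                  # constraint residual -> 0
print(np.linalg.norm(u - np.linalg.pinv(K) @ f))  # -> minimum-norm solution
```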


Denoising Example

$$\min_u \|u\|_{TV} \quad \text{s.t.} \quad \|u - f\|^2 \le \sigma^2$$

Apply Bregman iteration to

$$\min_u \|u\|_{TV} \quad \text{s.t.} \quad u = f$$

$$\Rightarrow \quad u^{k+1} = \arg\min_u \|u\|_{TV} + \frac{\delta}{2}\Big\|u - f - \frac{p^k}{\delta}\Big\|^2$$
$$p^{k+1} = p^k - \delta(u^{k+1} - f)$$

• $\|u^k - f\| \to 0$ monotonically
• $\|u^k - u^*\|$ is non-increasing while $\|u^k - f\| \ge \|u^* - f\|$

⇒ Stop iterating when the constraint is satisfied.
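To see this behavior without needing a TV solver, the following sketch (mine) swaps $\|\cdot\|_{TV}$ for $\|\cdot\|_1$, so the $u$ update becomes closed-form soft thresholding; the printed residual $\|u^k - f\|$ is non-increasing and goes to 0:

```python
import numpy as np

def shrink(z, t):
    """Soft thresholding: argmin_u ||u||_1 + 1/(2t) ||u - z||^2."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

np.random.seed(1)
f = np.random.randn(100)
delta = 1.0
u, p = np.zeros_like(f), np.zeros_like(f)

for k in range(8):
    u = shrink(f + p / delta, 1.0 / delta)  # u-update (l1 in place of TV)
    p = p - delta * (u - f)                 # p-update
    print(k, np.linalg.norm(u - f))         # monotonically decreasing residual
```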


Linearized Bregman for l1-Minimization

Apply Bregman iteration to

$$\min_u \|u\|_1 \quad \text{s.t.} \quad Ku = f,$$

but replace $\frac{\delta}{2}\|Ku - f\|^2$ with $\langle \delta K^T(Ku^k - f), u \rangle + \frac{1}{2\alpha}\|u - u^k\|^2$:

$$\Rightarrow \quad u^{k+1} = \arg\min_u \|u\|_1 + \frac{1}{2\alpha}\big\|u - u^k - \alpha p^k + \delta\alpha K^T(Ku^k - f)\big\|^2$$
$$p^{k+1} = -\frac{u^{k+1}}{\alpha} + \frac{u^k}{\alpha} + p^k - \delta K^T(Ku^k - f)$$

Initialization: $p^0 = 0$, $u^0$ arbitrary.

Ref: Osher, S., Mao, Y., Dong, B., and Yin, W., Fast Linearized Bregman Iteration for Compressive Sensing and Sparse Denoising, UCLA CAM Report [08-37], 2008.


Equivalent Form

Let $v^k = p^{k+1} + \frac{u^{k+1}}{\alpha}$, $v^0 = \delta K^T f$. The linearized Bregman steps can then be rewritten as

$$u^{k+1} = \arg\min_u \|u\|_1 + \frac{1}{2\alpha}\|u - \alpha v^k\|^2$$
$$v^{k+1} = v^k - \delta K^T(Ku^{k+1} - f)$$

Remark 1: The algorithm actually solves

$$\min_u \|u\|_1 + \frac{1}{2\alpha}\|u\|^2 \quad \text{s.t.} \quad Ku = f$$

Remark 2: In practice, use $\mu\|u\|_1$ instead of $\|u\|_1$ for numerical reasons.
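Since the $u$ update above is exactly soft thresholding, the equivalent form is fully explicit. A minimal sketch (my own illustration; the sparse test problem and the conservative step size $\delta$, guided by the Lipschitz constant $\alpha\|K\|^2$ discussed later, are assumptions for the demo):

```python
import numpy as np

def shrink(z, t):
    """argmin_u ||u||_1 + 1/(2t) ||u - z||^2  (soft thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

np.random.seed(0)
K = np.random.randn(20, 100)
u_true = np.zeros(100); u_true[[3, 40, 77]] = [2.0, -1.5, 3.0]
f = K @ u_true

alpha = 10.0
delta = 1.0 / (alpha * np.linalg.norm(K, 2) ** 2)  # conservative step size

v = delta * K.T @ f                      # v^0 = delta K^T f
u = np.zeros(100)
for k in range(5000):
    u = shrink(alpha * v, alpha)         # u^{k+1} = argmin ||u||_1 + 1/(2a)||u - a v^k||^2
    v = v - delta * K.T @ (K @ u - f)    # v^{k+1}

print(np.linalg.norm(K @ u - f))         # constraint residual, decreasing toward 0
```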


Soft Thresholding

Explicit formula:

$$S_\alpha(z) = \arg\min_u \|u\|_1 + \frac{1}{2\alpha}\|u - z\|_2^2 = \begin{cases} z - \alpha\,\mathrm{sign}(z) & \text{if } |z| > \alpha \\ 0 & \text{otherwise,} \end{cases}$$

applied componentwise. The Moreau decomposition can be used to reinterpret $S_\alpha(z)$ in terms of a projection:

$$S_\alpha(z) = z - \alpha\,\Pi_{\{p : \|p\|_\infty \le 1\}}\Big(\frac{z}{\alpha}\Big),$$

where $\Pi(z) = \frac{z}{\max(|z|, 1)}$ is the orthogonal projection onto $\{p : \|p\|_\infty \le 1\}$.
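A small numpy check (mine) that the direct formula and the Moreau/projection form agree:

```python
import numpy as np

def shrink(z, a):
    """Direct formula: z - a*sign(z) where |z| > a, 0 otherwise (componentwise)."""
    return np.sign(z) * np.maximum(np.abs(z) - a, 0.0)

def shrink_via_projection(z, a):
    """Moreau form: z - a * (projection of z/a onto the l-infinity unit ball)."""
    proj = (z / a) / np.maximum(np.abs(z / a), 1.0)
    return z - a * proj

z = np.random.randn(1000)
assert np.allclose(shrink(z, 0.3), shrink_via_projection(z, 0.3))
```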


Some Convex Optimization References

• Bertsekas, D., Constrained Optimization and Lagrange Multiplier Methods, Athena Scientific, 1996.
• Bertsekas, D., Nonlinear Programming, Second Edition, Athena Scientific, 1999.
• Bertsekas, D., and Tsitsiklis, J., Parallel and Distributed Computation, Prentice Hall, 1989.
• Boyd, S., and Vandenberghe, L., Convex Optimization, Cambridge University Press, 2004.
• Ekeland, I., and Temam, R., Convex Analysis and Variational Problems, SIAM, Classics in Applied Mathematics 28, 1999.
• Rockafellar, R. T., Convex Analysis, Princeton University Press, Princeton, NJ, 1970.


Legendre-Fenchel Transform

$$J^*(p) = \sup_w \langle p, w \rangle - J(w)$$

Special case when $J$ is a norm, $J(w) = \|w\|$:

$$J^*(p) = \sup_w \langle p, w \rangle - \|w\| = \begin{cases} 0 & \text{if } \langle p, w \rangle \le \|w\| \ \forall w \\ \infty & \text{otherwise} \end{cases}$$
$$= \begin{cases} 0 & \text{if } \sup_{\|w\| \le 1} \langle p, w \rangle \le 1 \\ \infty & \text{otherwise} \end{cases} = \begin{cases} 0 & \text{if } \|p\|_* \le 1 \text{ (by the dual norm definition)} \\ \infty & \text{otherwise} \end{cases}$$


Moreau Decomposition

Let $f \in \mathbb{R}^m$ and let $J$ be a closed proper convex function on $\mathbb{R}^m$. Then

$$f = \arg\min_u \Big[J(u) + \frac{1}{2\alpha}\|u - f\|_2^2\Big] + \alpha \Big[\arg\min_p J^*(p) + \frac{\alpha}{2}\Big\|p - \frac{f}{\alpha}\Big\|_2^2\Big]$$

Sometimes written as

$$f = \mathrm{prox}_{\alpha J}(f) + \alpha\,\mathrm{prox}_{\frac{J^*}{\alpha}}\Big(\frac{f}{\alpha}\Big)$$

Ref: Combettes, P., and Wajs, V., Signal Recovery by Proximal Forward-Backward Splitting, Multiscale Modeling and Simulation, 2005.


Bregman / Method of Multipliers

Bregman iteration for $\min_u J(u)$ s.t. $Ku = f$:

$$u^{k+1} = \arg\min_u J(u) - \langle p^k, u \rangle + \frac{\delta}{2}\|Ku - f\|^2$$
$$p^{k+1} = p^k - \delta K^T(Ku^{k+1} - f), \quad p^0 = 0$$

This is equivalent to the method of multipliers:

$$u^{k+1} = \arg\min_u J(u) + \langle \lambda^k, Ku - f \rangle + \frac{\delta}{2}\|Ku - f\|^2$$
$$\lambda^{k+1} = \lambda^k + \delta(Ku^{k+1} - f), \quad \lambda^0 = 0,$$

with $p^k = -K^T\lambda^k$ for all $k$.

Ref: Yin, W., Osher, S., Goldfarb, D., and Darbon, J., Bregman Iterative Algorithms for l1-Minimization with Applications to Compressed Sensing, UCLA CAM Report [07-37], 2007.
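A quick numerical check of this equivalence (my own sketch, again taking $J(u) = \frac{1}{2}\|u\|^2$ so both subproblems are closed-form linear solves):

```python
import numpy as np

np.random.seed(0)
K = np.random.randn(3, 5); f = np.random.randn(3); delta = 1.0
m = K.shape[1]
M = np.eye(m) + delta * K.T @ K

p = np.zeros(m)              # Bregman subgradient variable, p^0 = 0
lam = np.zeros(K.shape[0])   # Lagrange multiplier, lambda^0 = 0

for k in range(50):
    u_b = np.linalg.solve(M, p + delta * K.T @ f)            # Bregman u-update
    p = p - delta * K.T @ (K @ u_b - f)
    u_m = np.linalg.solve(M, -K.T @ lam + delta * K.T @ f)   # multipliers u-update
    lam = lam + delta * (K @ u_m - f)
    assert np.allclose(u_b, u_m) and np.allclose(p, -K.T @ lam)  # p^k = -K^T lambda^k
```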


Linearized Bregman / Uzawa

Linearized Bregman iteration for $\min_u J(u) + \frac{1}{2\alpha}\|u\|^2$ s.t. $Ku = f$:

$$u^{k+1} = \arg\min_u J(u) + \frac{1}{2\alpha}\|u - \alpha v^k\|^2$$
$$v^{k+1} = v^k - \delta K^T(Ku^{k+1} - f), \quad v^0 = \delta K^T f$$

This is equivalent to Uzawa's method:

$$u^{k+1} = \arg\min_u J(u) + \frac{1}{2\alpha}\|u\|^2 + \langle \lambda^k, Ku - f \rangle$$
$$\lambda^{k+1} = \lambda^k + \delta(Ku^{k+1} - f), \quad \lambda^0 = -\delta f,$$

with $v^k = -K^T\lambda^k$ for all $k$.

Ref: Cai, J.-F., Candès, E., and Shen, Z., A Singular Value Thresholding Algorithm for Matrix Completion, UCLA CAM Report [08-77], 2008.


Relevant Dual Functionals

The Lagrangian for $\min_u J(u)$ s.t. $Ku = f$ is

$$L(u, \lambda) = J(u) + \langle \lambda, Ku - f \rangle$$

The dual functional is

$$q(\lambda) = \inf_u L(u, \lambda) = -J^*(-K^T\lambda) - \langle \lambda, f \rangle$$

Dual problem: $\max_\lambda q(\lambda)$.

Augmented Lagrangian:

$$L_\delta(u, \lambda) = L(u, \lambda) + \frac{\delta}{2}\|Ku - f\|^2, \qquad q_\delta(\lambda) = \inf_u L_\delta(u, \lambda)$$


Proximal Point Interpretation

$$L_\delta(u, \lambda^k) = \max_y L(u, y) - \frac{1}{2\delta}\|y - \lambda^k\|^2 \quad \Rightarrow \quad y^* = \lambda^k + \delta(Ku - f)$$

$\min_u \max_y L(u, y) - \frac{1}{2\delta}\|y - \lambda^k\|^2$ is attained at $(u^{k+1}, \lambda^{k+1})$

$$\Rightarrow \quad \max_y q(y) - \frac{1}{2\delta}\|y - \lambda^k\|^2 \text{ is attained at } \lambda^{k+1}$$
$$\Rightarrow \quad \lambda^{k+1} = \arg\max_y q(y) - \frac{1}{2\delta}\|y - \lambda^k\|^2$$

(the proximal point algorithm for maximizing $q(\lambda)$)


Gradient Ascent Interpretation

$q_\delta(\lambda) = \max_y q(y) - \frac{1}{2\delta}\|y - \lambda\|^2$ can be shown to be differentiable, with

$$\nabla q_\delta(\lambda^k) = -\frac{1}{\delta}\Big[\lambda^k - \arg\max_y \Big(q(y) - \frac{1}{2\delta}\|y - \lambda^k\|^2\Big)\Big] = \frac{\lambda^{k+1} - \lambda^k}{\delta}$$
$$\Rightarrow \quad \lambda^{k+1} = \lambda^k + \delta\,\nabla q_\delta(\lambda^k)$$
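A sketch (mine) checking this identity for $J(u) = \frac{1}{2}\|u\|^2$, where $q_\delta$ can be evaluated exactly by an inner linear solve: the finite-difference gradient of $q_\delta$ matches the multiplier step direction $Ku^{k+1} - f$:

```python
import numpy as np

np.random.seed(0)
K = np.random.randn(3, 5); f = np.random.randn(3); delta = 0.7
M = np.eye(5) + delta * K.T @ K

def q_delta(lam):
    """q_delta(lam) = inf_u 1/2||u||^2 + <lam, Ku - f> + delta/2 ||Ku - f||^2."""
    u = np.linalg.solve(M, -K.T @ lam + delta * K.T @ f)
    r = K @ u - f
    return 0.5 * u @ u + lam @ r + 0.5 * delta * r @ r

lam = np.random.randn(3)
u_next = np.linalg.solve(M, -K.T @ lam + delta * K.T @ f)  # multipliers u-step

eps = 1e-6
grad_fd = np.array([(q_delta(lam + eps * e) - q_delta(lam - eps * e)) / (2 * eps)
                    for e in np.eye(3)])
print(np.allclose(grad_fd, K @ u_next - f, atol=1e-4))     # True
```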


Dual Functional for Linearized Bregman

Let $J_{LB}(u) = J(u) + \frac{1}{2\alpha}\|u\|^2$. The Lagrangian for $\min_u J_{LB}(u)$ s.t. $Ku = f$ is

$$L_{LB}(u, \lambda) = J_{LB}(u) + \langle \lambda, Ku - f \rangle$$

The dual functional is

$$q_{LB}(\lambda) = -J_{LB}^*(-K^T\lambda) - \langle \lambda, f \rangle$$

Remark: From the strict convexity of $J_{LB}$, $J_{LB}^*$ is differentiable, and $\nabla J_{LB}^*$ is Lipschitz with constant $\alpha\|K\|^2$.


Gradient Ascent Interpretation

From the optimality condition for the Lagrangian form of the $u^{k+1}$ update,

$$u^{k+1} = \arg\min_u J(u) + \frac{1}{2\alpha}\|u\|^2 + \langle \lambda^k, Ku - f \rangle,$$

we get $0 \in \partial J_{LB}(u^{k+1}) + K^T\lambda^k$. Using the definitions of the Legendre transform and the subdifferential,

$$u^{k+1} = \nabla J_{LB}^*(-K^T\lambda^k), \quad \text{so} \quad \nabla q_{LB}(\lambda^k) = Ku^{k+1} - f.$$

We can therefore interpret $\lambda^{k+1} = \lambda^k + \delta(Ku^{k+1} - f)$ as gradient ascent:

$$\lambda^{k+1} = \lambda^k + \delta\,\nabla q_{LB}(\lambda^k).$$

Ref: Yin, W., Analysis and Generalizations of the Linearized Bregman Method, UCLA CAM Report [09-42], May 2009.


Split Bregman Idea

Example: Total Variation Denoising

$$\min_u \text{``}\|\nabla u\|_1\text{''} + \frac{\lambda}{2}\|u - f\|^2$$

Reformulate as

$$\min_{w,u} \text{``}\|w\|_1\text{''} + \frac{\lambda}{2}\|u - f\|^2 \quad \text{s.t.} \quad w = \nabla u$$

Apply Bregman iteration to the constrained problem, but use alternating minimization with respect to $w$ and $u$.

Ref: Goldstein, T., and Osher, S., The Split Bregman Method for L1 Regularized Problems, UCLA CAM Report [08-29], April 2008.
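A minimal sketch of the idea (mine, not from the slides) for a 1D anisotropic TV-l2 denoising problem, so the $w$ subproblem is soft thresholding and the $u$ subproblem is a linear solve; the signal, $\lambda$, and the penalty parameter $\delta$ are arbitrary demo choices:

```python
import numpy as np

def shrink(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

np.random.seed(0)
n = 200
clean = np.repeat([0.0, 1.0, -0.5, 2.0], n // 4)   # piecewise-constant signal
f = clean + 0.1 * np.random.randn(n)

D = np.diff(np.eye(n), axis=0)                     # 1D forward differences
lam, delta = 10.0, 5.0
M = lam * np.eye(n) + delta * D.T @ D

u = f.copy()
w = np.zeros(n - 1)                                # splitting variable, w ~ Du
b = np.zeros(n - 1)                                # Bregman variable

for k in range(100):
    u = np.linalg.solve(M, lam * f + delta * D.T @ (w - b))  # u-subproblem
    w = shrink(D @ u + b, 1.0 / delta)                       # w-subproblem
    b = b + D @ u - w                                        # Bregman update
```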


Discrete TV Seminorm Notation

$$\|u\|_{TV} = \sum_{p=1}^{M_r} \sum_{q=1}^{M_c} \sqrt{(D_1^+ u_{p,q})^2 + (D_2^+ u_{p,q})^2}$$

Vectorize the $M_r \times M_c$ matrix by stacking columns: the $(p, q)$ element of the matrix corresponds to the $(q-1)M_r + p$ element of the vector.

Define a grid-shaped graph with $m$ nodes corresponding to the elements $(p, q)$. Index the nodes by $(q-1)M_r + p$ and the edges arbitrarily. For each edge $\eta$ with endpoint indices $(i, j)$, $i < j$, define

$$D_{\eta,k} = \begin{cases} -1 & \text{for } k = i, \\ 1 & \text{for } k = j, \\ 0 & \text{for } k \ne i, j. \end{cases}$$

Also define $E \in \mathbb{R}^{e \times m}$ such that

$$E_{\eta,k} = \begin{cases} 1 & \text{if } D_{\eta,k} = -1, \\ 0 & \text{otherwise.} \end{cases}$$


TV Notation (continued)

Define the norm $\|w\|_E = \sum_{k=1}^m \sqrt{\big(E^T(w^2)\big)_k}$ (with $w^2$ taken componentwise). Then the TV seminorm can be rewritten as

$$\|u\|_{TV} = \|Du\|_E$$

The dual norm is defined by

$$\|p\|_{E^*} = \big\|\sqrt{E^T(p^2)}\big\|_\infty$$


Convex Programs with Separable Structure

$$\min_u J(u) \quad \text{s.t.} \quad Ku = f, \qquad J(u) = H(u) + \sum_{i=1}^N G_i(A_i u + b_i)$$

Rewrite as

$$\min_{z,u} F(z) + H(u) \quad \text{s.t.} \quad Bz + Au = b,$$

where

$$F(z) = \sum_{i=1}^N G_i(z_i), \quad z = \begin{bmatrix} z_1 \\ \vdots \\ z_N \end{bmatrix}, \quad B = \begin{bmatrix} -I \\ 0 \end{bmatrix}, \quad A = \begin{bmatrix} A_1 \\ \vdots \\ A_N \\ K \end{bmatrix}, \quad b = \begin{bmatrix} -b_1 \\ \vdots \\ -b_N \\ f \end{bmatrix}.$$


Application of Bregman Iteration

Apply Bregman iteration to

$$\min_{z,u} F(z) + H(u) \quad \text{s.t.} \quad Bz + Au = b:$$

$$(z^{k+1}, u^{k+1}) = \arg\min_{z \in \mathbb{R}^n,\, u \in \mathbb{R}^m} F(z) - F(z^k) - \langle p_z^k, z - z^k \rangle + H(u) - H(u^k) - \langle p_u^k, u - u^k \rangle + \frac{\alpha}{2}\|b - Au - Bz\|^2$$
$$p_z^{k+1} = p_z^k + \alpha B^T(b - Au^{k+1} - Bz^{k+1})$$
$$p_u^{k+1} = p_u^k + \alpha A^T(b - Au^{k+1} - Bz^{k+1})$$

Initialization: $p_z^0 = 0$, $p_u^0 = 0$.


Augmented Lagrangian Form

The augmented Lagrangian is given by

$$L_\alpha(z, u, \lambda) = F(z) + H(u) + \langle \lambda, Au + Bz - b \rangle + \frac{\alpha}{2}\|Au + Bz - b\|^2$$

Then $(z^{k+1}, u^{k+1})$ can be equivalently updated by

$$(z^{k+1}, u^{k+1}) = \arg\min_{z,u} L_\alpha(z, u, \lambda^k)$$
$$\lambda^{k+1} = \lambda^k + \alpha(Au^{k+1} + Bz^{k+1} - b), \quad \lambda^0 = 0,$$

which is the method of multipliers. The equivalence to Bregman iteration again follows from $p_z^k = -B^T\lambda^k$ and $p_u^k = -A^T\lambda^k$.


ADMM / Split Bregman

Alternate minimization with respect to $u$ and $z$:

Theorem 1 (Eckstein, Bertsekas). Suppose $B$ has full column rank and $H(u) + \|Au\|^2$ is strictly convex. Let $\lambda^0$ and $u^0$ be arbitrary and let $\alpha > 0$. Suppose we are also given sequences $\{\mu_k\}$ and $\{\nu_k\}$ such that $\mu_k \ge 0$, $\nu_k \ge 0$, $\sum_{k=0}^\infty \mu_k < \infty$ and $\sum_{k=0}^\infty \nu_k < \infty$. Suppose that

$$\Big\|z^{k+1} - \arg\min_{z \in \mathbb{R}^n} F(z) + \langle \lambda^k, Bz \rangle + \frac{\alpha}{2}\|Au^k + Bz - b\|^2\Big\| \le \mu_k \quad (1)$$
$$\Big\|u^{k+1} - \arg\min_{u \in \mathbb{R}^m} H(u) + \langle \lambda^k, Au \rangle + \frac{\alpha}{2}\|Au + Bz^{k+1} - b\|^2\Big\| \le \nu_k \quad (2)$$
$$\lambda^{k+1} = \lambda^k + \alpha(Au^{k+1} + Bz^{k+1} - b). \quad (3)$$

If there exists a saddle point of $L(z, u, \lambda)$, then $z^k \to z^*$, $u^k \to u^*$ and $\lambda^k \to \lambda^*$, where $(z^*, u^*, \lambda^*)$ is such a saddle point. If no such saddle point exists, then at least one of the sequences $\{u^k\}$ or $\{\lambda^k\}$ must be unbounded.

Ref: Eckstein, J., and Bertsekas, D., On the Douglas-Rachford Splitting Method and the Proximal Point Algorithm for Maximal Monotone Operators, Mathematical Programming 55, North-Holland, 1992.


Dual Functional

$$q(\lambda) = \inf_{z,u} F(z) + H(u) + \langle \lambda, Au + Bz - b \rangle = -F^*(-B^T\lambda) - \langle \lambda, b \rangle - H^*(-A^T\lambda)$$

$\lambda^*$ is optimal if

$$0 \in -B\,\partial F^*(-B^T\lambda^*) + b - A\,\partial H^*(-A^T\lambda^*)$$

Let $\Psi(\lambda) = -B\,\partial F^*(-B^T\lambda) + b$ and $\phi(\lambda) = -A\,\partial H^*(-A^T\lambda)$.


Douglas Rachford Splitting

Formally apply Douglas-Rachford splitting with $\alpha$ as the time step:

$$0 \in \frac{r^{k+1} - \lambda^k}{\alpha} + \Psi(r^{k+1}) + \phi(\lambda^k)$$
$$0 \in \frac{\lambda^{k+1} - \lambda^k}{\alpha} + \Psi(r^{k+1}) + \phi(\lambda^{k+1})$$

Remark: There are possibly many ways to satisfy the above iterations, but ADMM satisfies them in a particular way:

$$r^{k+1} = (I + \alpha\Psi)^{-1}(\lambda^k + \alpha A u^k)$$
$$\lambda^{k+1} = (I + \alpha\phi)^{-1}(r^{k+1} - \alpha A u^k)$$


Reformulation of DR Splitting

$$r^{k+1} = \arg\min_r F^*(-B^T r) + \langle r, b \rangle + \frac{1}{2\alpha}\|r - \lambda^k + \alpha q^k\|^2$$
$$\lambda^{k+1} = \arg\min_\lambda H^*(-A^T\lambda) + \frac{1}{2\alpha}\|\lambda - r^{k+1} - \alpha q^k\|^2$$
$$q^{k+1} = q^k + \frac{1}{\alpha}(r^{k+1} - \lambda^{k+1})$$

Remark: The 'full column rank' and 'strictly convex' assumptions are not needed to guarantee that $\lambda^k$ converges to a solution of the dual problem.

Ref: Eckstein, J., Splitting Methods for Monotone Operators with Applications to Parallel Optimization, Ph.D. Thesis, Massachusetts Institute of Technology, Dept. of Civil Engineering, http://hdl.handle.net/1721.1/14356, 1989.


TV-l1 Example

$$\min_u \|u\|_{TV} + \beta\|Ku - f\|_1$$

Rewrite as

$$\min_u \|Du\|_E + \beta\|Ku - f\|_1$$

Let

$$z = \begin{bmatrix} w \\ v \end{bmatrix} = \begin{bmatrix} Du \\ Ku - f \end{bmatrix}, \quad B = -I, \quad A = \begin{bmatrix} D \\ K \end{bmatrix}, \quad b = \begin{bmatrix} 0 \\ f \end{bmatrix}$$

to put the problem in the form $\min_{z,u} F(z) + H(u)$ s.t. $Bz + Au = b$. Introduce the dual variable

$$\lambda = \begin{bmatrix} p \\ q \end{bmatrix}.$$

A solution exists assuming $\ker(D) \cap \ker(K) = \{0\}$.


Augmented Lagrangian and ADMM Iterations

$$L(z, u, \lambda) = \|w\|_E + \beta\|v\|_1 + \langle p, Du - w \rangle + \langle q, Ku - f - v \rangle + \frac{\alpha}{2}\|w - Du\|^2 + \frac{\alpha}{2}\|v - Ku + f\|^2$$

The ADMM iterations are given by

$$w^{k+1} = \arg\min_w \|w\|_E + \frac{\alpha}{2}\Big\|w - Du^k - \frac{p^k}{\alpha}\Big\|^2$$
$$v^{k+1} = \arg\min_v \beta\|v\|_1 + \frac{\alpha}{2}\Big\|v - Ku^k + f - \frac{q^k}{\alpha}\Big\|^2$$
$$u^{k+1} = \arg\min_u \frac{\alpha}{2}\Big\|Du - w^{k+1} + \frac{p^k}{\alpha}\Big\|^2 + \frac{\alpha}{2}\Big\|Ku - v^{k+1} - f + \frac{q^k}{\alpha}\Big\|^2$$
$$p^{k+1} = p^k + \alpha(Du^{k+1} - w^{k+1})$$
$$q^{k+1} = q^k + \alpha(Ku^{k+1} - f - v^{k+1}),$$

where $p^0 = q^0 = 0$, $u^0$ is arbitrary and $\alpha > 0$.


Explicit Iterations

The explicit formulas for $w^{k+1}$, $v^{k+1}$ and $u^{k+1}$ are given by

$$w^{k+1} = \tilde{S}_{\frac{1}{\alpha}}\Big(Du^k + \frac{p^k}{\alpha}\Big)$$
$$v^{k+1} = S_{\frac{\beta}{\alpha}}\Big(Ku^k - f + \frac{q^k}{\alpha}\Big)$$
$$u^{k+1} = (-\Delta + K^T K)^{-1}\Big(D^T w^{k+1} - \frac{D^T p^k}{\alpha} + K^T(v^{k+1} + f) - \frac{K^T q^k}{\alpha}\Big) = (-\Delta + K^T K)^{-1}\big(D^T w^{k+1} + K^T(v^{k+1} + f)\big),$$

where $-\Delta = D^T D$, and the simplification holds because $D^T p^k + K^T q^k = 0$ by the optimality condition for the previous $u$ update. Here

$$\tilde{S}_c(f) = f - c\,\Pi_{\{p : \|p\|_{E^*} \le 1\}}\Big(\frac{f}{c}\Big), \qquad \Pi_{\{p : \|p\|_{E^*} \le 1\}}(p) = \frac{p}{E\max\big(\sqrt{E^T(p^2)}, 1\big)}$$
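For the standard two-differences-per-pixel TV discretization, $\tilde S_c$ is just a per-pixel vector shrinkage. A hypothetical numpy sketch, assuming $w$ is stored as an array of shape (2, Mr, Mc) holding the two difference components at each pixel:

```python
import numpy as np

def project_E_dual_ball(p):
    """Project onto {p : ||p||_{E*} <= 1}: clamp each pixel's 2-vector magnitude to 1."""
    mag = np.sqrt((p ** 2).sum(axis=0))      # per-pixel magnitude, shape (Mr, Mc)
    return p / np.maximum(mag, 1.0)

def shrink_E(f, c):
    """S~_c(f) = f - c * Pi(f / c): isotropic (vector) soft thresholding."""
    return f - c * project_E_dual_ball(f / c)

w = np.random.randn(2, 8, 8)
out = shrink_E(w, 0.5)   # pixels whose 2-vector magnitude is <= 0.5 map to zero
```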


TV-l1 Results

[Figure: input $f$ and result $u$ for TV-l1 minimization of a $512 \times 512$ synthetic image]

Image Size   Iterations   Time
64 × 64      40           1s
128 × 128    51           5s
256 × 256    136          78s
512 × 512    359          836s

Iterations until $\|u^k - u^{k-1}\|_\infty \le .5$, $\|Du^k - w^k\|_\infty \le .5$ and $\|v^k - Ku^k + f\|_\infty \le .5$; $\beta = .6, .3, .15, .075$ and $\alpha = .02, .01, .005, .0025$ for the four sizes, respectively.


Decoupling Variables

One can add additional proximal-like penalties to the ADMM iterations and obtain a more explicit algorithm that still converges. Given a step of the ADMM algorithm of the form

$$u^{k+1} = \arg\min_u J(u) + \langle \lambda^k, Ku - f \rangle + \frac{\alpha}{2}\|Ku - f\|^2,$$

modify the objective functional by adding

$$\frac{1}{2}\Big\langle u - u^k, \Big(\frac{1}{\delta} - \alpha K^T K\Big)(u - u^k)\Big\rangle,$$

where $\delta$ is chosen such that $0 < \delta < \frac{1}{\alpha\|K^T K\|}$. The modified update is given by

$$u^{k+1} = \arg\min_u J(u) + \langle \lambda^k, Ku - f \rangle + \frac{1}{2\delta}\|u - u^k + \alpha\delta K^T(Ku^k - f)\|^2.$$

Ref: Zhang, X., Burger, M., Bresson, X., and Osher, S., Bregmanized Nonlocal Regularization for Deconvolution and Sparse Reconstruction, UCLA CAM Report [09-03], 2009.
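When $J$ is l1-like, the modified update is a single explicit prox step. A sketch (mine) for $J = \|\cdot\|_1$, absorbing the linear terms into the quadratic:

```python
import numpy as np

def shrink(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def decoupled_u_update(u, lam, K, f, alpha, delta):
    """Modified (linearized) u-step for J = ||.||_1:
    u+ = prox_{delta J}( u - delta K^T lam - alpha delta K^T (Ku - f) )."""
    return shrink(u - delta * K.T @ lam - alpha * delta * K.T @ (K @ u - f), delta)

np.random.seed(0)
K = np.random.randn(10, 30); f = np.random.randn(10)
alpha = 1.0
delta = 0.9 / (alpha * np.linalg.norm(K, 2) ** 2)  # 0 < delta < 1/(alpha ||K^T K||)
u, lam = np.zeros(30), np.zeros(10)
u = decoupled_u_update(u, lam, K, f, alpha, delta)
```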


Convex Constraints as Indicator Functions

Given a constraint of the form $u \in S$ where $S$ is convex, we can enforce the constraint by adding to the objective functional the indicator function for $S$,

$$H(u) = \begin{cases} 0 & \text{if } u \in S \\ \infty & \text{otherwise.} \end{cases}$$

One can then develop algorithms that project onto the constraint set at each iteration, or, in combination with the decoupling trick, handle the constraint in a more explicit manner.


Example Constraint

Suppose the constraint is $\|Ku - f\| \le \epsilon$, so $S = \{u : \|Ku - f\| \le \epsilon\}$. Then

$$\Pi_S(z) = (I - K^\dagger K)z + K^\dagger \begin{cases} Kz & \text{if } \|Kz - f\| \le \epsilon \\ f + r\Big(\dfrac{Kz - KK^\dagger f}{\|Kz - KK^\dagger f\|}\Big) & \text{otherwise,} \end{cases}$$

where $r = \sqrt{\epsilon^2 - \|(I - KK^\dagger)f\|_2^2}$.

By decoupling variables, the projection step can be simplified to

$$\Pi_{\{z : \|z - f\|_2 \le \epsilon\}}(z) = f + \frac{z - f}{\max\Big(\frac{\|z - f\|_2}{\epsilon}, 1\Big)}.$$

This is useful when $K^\dagger$ is not easy to compute.
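A small sketch (mine) of the simplified projection:

```python
import numpy as np

def project_eps_ball(z, f, eps):
    """Project z onto {z : ||z - f||_2 <= eps}."""
    return f + (z - f) / max(np.linalg.norm(z - f) / eps, 1.0)

f = np.zeros(5)
z = np.ones(5)                        # ||z - f|| = sqrt(5) > 1
print(project_eps_ball(z, f, 1.0))    # lands on the boundary of the ball
```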


TV Deblurring Example

$$\min_u \|u\|_{TV} \quad \text{s.t.} \quad \|Ku - f\| \le \epsilon$$

Rewrite as

$$\min_u \|Du\|_E + H(Ku), \quad \text{where} \quad H(z) = \begin{cases} 0 & \|z - f\| \le \epsilon \\ \infty & \text{otherwise.} \end{cases}$$

Let $T = \{z : \|z - f\| \le \epsilon\}$ and $X = \{p : \|p\|_{E^*} \le 1\}$.

Saddle point problem from the Lagrangian:

$$\max_{p,q} \inf_{u,w,z} \|w\|_E + \langle p, Du - w \rangle + H(z) + \langle q, Ku - z \rangle$$


Split Inexact Uzawa Method for TV Deblurring

$$u^{k+1} = \arg\min_u \frac{\delta}{2}\|Du - w^k\|^2 + \frac{\delta}{2}\|Ku - z^k\|^2 + \langle p^k, Du \rangle + \langle q^k, Ku \rangle + \frac{1}{2}\Big\langle u - u^k, \Big[\Big(\frac{1}{\alpha} - \delta D^T D\Big) + \Big(\frac{1}{\alpha} - \delta K^T K\Big)\Big](u - u^k)\Big\rangle$$
$$w^{k+1} = \arg\min_w \|w\|_E + \frac{\delta}{2}\Big\|w - Du^{k+1} - \frac{p^k}{\delta}\Big\|^2$$
$$z^{k+1} = \arg\min_z H(z) + \frac{\delta}{2}\Big\|z - Ku^{k+1} - \frac{q^k}{\delta}\Big\|^2$$
$$p^{k+1} = p^k + \delta(Du^{k+1} - w^{k+1})$$
$$q^{k+1} = q^k + \delta(Ku^{k+1} - z^{k+1})$$

If $D \sim \nabla$ and $K$ is a normalized blurring operator, we just need $0 < \alpha < \frac{1}{4\delta}$.

Ref: Zhang, X., A Unified Primal-Dual Algorithm Based on l1 and Bregman Iteration, (private communication), April 2009.


TV Deblurring Algorithm (continued)

Use two applications of the Moreau decomposition to rewrite the previous algorithm in terms of projections onto $T$ and $X$:

$$u^{k+1} = u^k - \frac{\alpha}{2}\Big[D^T(2p^k - p^{k-1}) + K^T(2q^k - q^{k-1})\Big]$$
$$p^{k+1} = \Pi_X(p^k + \delta Du^{k+1})$$
$$q^{k+1} = (q^k + \delta Ku^{k+1}) - \delta\,\Pi_T\Big(\frac{q^k}{\delta} + Ku^{k+1}\Big)$$

Remark: This can require more iterations than a more implicit algorithm, but it has the advantage of only requiring matrix multiplications and simple projections.
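A 1D sketch of these explicit iterations (my own construction; the blur $K$, the noise level, and the parameters are demo assumptions, with $\alpha < \frac{1}{4\delta}$ as suggested above; in 1D the projection $\Pi_X$ reduces to clamping each dual component to $[-1, 1]$):

```python
import numpy as np

np.random.seed(0)
n = 100
# simple normalized blur: 3-tap moving average, written as a (circulant) matrix
K = (np.eye(n) + np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)) / 3.0
D = np.diff(np.eye(n), axis=0)

u_true = np.repeat([0.0, 1.0, 0.0, -1.0], n // 4)
f = K @ u_true + 0.01 * np.random.randn(n)
eps = 0.01 * np.sqrt(n)

delta = 1.0
alpha = 0.2 / delta                  # heuristic, satisfying 0 < alpha < 1/(4 delta)

def proj_T(z):                       # projection onto {z : ||z - f|| <= eps}
    return f + (z - f) / max(np.linalg.norm(z - f) / eps, 1.0)

u = np.zeros(n)
p = p_old = np.zeros(n - 1)          # dual variable for D
q = q_old = np.zeros(n)              # dual variable for K

for k in range(500):
    u = u - (alpha / 2) * (D.T @ (2 * p - p_old) + K.T @ (2 * q - q_old))
    p_old, q_old = p, q
    p = np.clip(p + delta * (D @ u), -1.0, 1.0)                   # Pi_X in 1D
    q = (q + delta * (K @ u)) - delta * proj_T(q / delta + K @ u)

print(np.linalg.norm(K @ u - f))     # approaches the constraint level eps
```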


Compressive Sensing Example

$$\min_z \|\Psi z\|_1 \quad \text{s.t.} \quad \|R\Gamma z - f\|_2 \le \epsilon,$$

where $R\Gamma$ is the measurement matrix and we expect $\Psi z$ to be sparse. Let $J = \|\cdot\|_1$, $A = \Psi$, $K = R\Gamma$ and

$$H(x) = \begin{cases} 0 & \text{if } \|x - f\|_2 \le \epsilon \\ \infty & \text{otherwise.} \end{cases}$$

⇒ This has the same structure as the deblurring example.

• If $\Psi^T\Psi = I$ (tight frame), one can choose to handle $A = \Psi$ implicitly.
• If $\Gamma$ is a discrete Fourier transform, $K$ can be handled implicitly too.


Connections to PDHG

Since $J^{**} = J$, $J(Au) = J^{**}(Au) = \sup_p \langle p, Au \rangle - J^*(p)$. We can therefore obtain the following saddle point problem from $\min_u J(Au) + H(u)$:

$$\min_u \sup_p -J^*(p) + \langle p, Au \rangle + H(u).$$

The Primal Dual Hybrid Gradient algorithm then alternates primal and dual proximal steps of the form

$$p^{k+1} = \arg\max_p -J^*(p) + \langle p, Au^k \rangle - \frac{1}{2\delta_k}\|p - p^k\|_2^2$$
$$u^{k+1} = \arg\min_u H(u) + \langle A^T p^{k+1}, u \rangle + \frac{1}{2\alpha_k}\|u - u^k\|_2^2$$

Ref: Zhu, M., and Chan, T., An Efficient Primal-Dual Hybrid Gradient Algorithm for Total Variation Image Restoration, UCLA CAM Report [08-34], May 2008.
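A minimal sketch (mine) of these two steps for 1D TV denoising, taking $J = \|\cdot\|_1$, $A = D$ (1D differences), and $H(u) = \frac{\lambda}{2}\|u - f\|^2$: the $p$ step is a projection onto the $l^\infty$ ball (since $J^*$ is its indicator) and the $u$ step is an explicit quadratic solve. Fixed steps are used for simplicity; Zhu and Chan adapt $\delta_k$, $\alpha_k$ per iteration:

```python
import numpy as np

np.random.seed(0)
n = 200
f = np.repeat([0.0, 1.0, -0.5, 2.0], n // 4) + 0.1 * np.random.randn(n)
D = np.diff(np.eye(n), axis=0)   # A = D, J = ||.||_1, H(u) = lam/2 ||u - f||^2
lam = 8.0
delta, alpha = 1.0, 0.05         # fixed steps (assumed; Zhu-Chan vary these)

u, p = f.copy(), np.zeros(n - 1)
for k in range(300):
    # dual step: argmax_p <p, Du^k> - J*(p) - 1/(2 delta)||p - p^k||^2  =  projection
    p = np.clip(p + delta * (D @ u), -1.0, 1.0)
    # primal step: argmin_u H(u) + <D^T p, u> + 1/(2 alpha)||u - u^k||^2  (explicit)
    u = (lam * f + u / alpha - D.T @ p) / (lam + 1.0 / alpha)
```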


Figure 1: PDHG-Related Algorithm Framework (diagram reconstructed as a list)

Problems:
(P) Primal: $\min_u F_P(u)$, $F_P(u) = J(Au) + H(u)$
(D) Dual: $\max_p F_D(p)$, $F_D(p) = -J^*(p) - H^*(-A^T p)$
(PD) Primal-Dual: $\min_u \sup_p L_{PD}(u, p)$, $L_{PD}(u, p) = \langle p, Au \rangle - J^*(p) + H(u)$
(SPP) Split Primal: $\max_p \inf_{u,w} L_P(u, w, p)$, $L_P(u, w, p) = J(w) + H(u) + \langle p, Au - w \rangle$
(SPD) Split Dual: $\max_u \inf_{p,y} L_D(p, y, u)$, $L_D(p, y, u) = J^*(p) + H^*(y) + \langle u, -A^T p - y \rangle$

Connections:
• AMA on (SPP) ⇔ PFBS on (D); AMA on (SPD) ⇔ PFBS on (P)
• Adding $\frac{1}{2\alpha}\|u - u^k\|_2^2$ resp. $\frac{1}{2\delta}\|p - p^k\|_2^2$ gives Relaxed AMA on (SPP) resp. (SPD)
• Adding $\frac{\delta}{2}\|Au - w\|_2^2$ resp. $\frac{\alpha}{2}\|A^T p + y\|_2^2$ gives ADMM on (SPP) ⇔ Douglas-Rachford on (D), resp. ADMM on (SPD) ⇔ Douglas-Rachford on (P)
• Primal-Dual Proximal Point on (PD) ⇔ PDHG
• Adding $\frac{1}{2}\langle u - u^k, (\frac{1}{\alpha} - \delta A^T A)(u - u^k)\rangle$ resp. $\frac{1}{2}\langle p - p^k, (\frac{1}{\delta} - \alpha A A^T)(p - p^k)\rangle$, or equivalently substituting $p^{k+1} \to 2p^{k+1} - p^k$ resp. $u^k \to 2u^k - u^{k-1}$ in PDHG, gives Split Inexact Uzawa on (SPP) ⇔ PDHGMp, resp. on (SPD) ⇔ PDHGMu

Legend: (P): Primal; (D): Dual; (PD): Primal-Dual; (SPP): Split Primal; (SPD): Split Dual. AMA: Alternating Minimization Algorithm (4.2.1); PFBS: Proximal Forward Backward Splitting (4.2.1); ADMM: Alternating Direction Method of Multipliers (4.2.2); PDHG: Primal Dual Hybrid Gradient (4.2); PDHGM: Modified PDHG (4.2.3); ⇒: well understood convergence properties.


Types of Applications

• Convex programs that decompose into problems of the form
$$\min_{z,u} F(z) + H(u) \quad \text{s.t.} \quad Au + Bz = b$$
• Especially useful for problems involving convex constraints and separable l2 and l1-like terms
• l1-like terms include the TV seminorm, Besov norms, and even the nuclear norm, which is the l1 norm of the singular values of a matrix
• These algorithms can also be applied to convex relaxations of non-convex problems


Sparse Approximation

These algorithms are useful for functionals involving multiple l1-like terms, which can arise when modeling signals as sums of sparse signals in different representations:

• TV-l1
• Cartoon / Texture Decomposition: sparse $\nabla$ plus sparse Fourier coefficients
• Background Video Detection: low rank (background) plus sparse error (foreground),
$$\min \|A\|_{\text{nuclear}} + \lambda\|E\|_1 \quad \text{s.t.} \quad A + E \sim \text{original video}$$

Ref: Osher, S., Sole, A., and Vese, L., Image Decomposition and Restoration Using Total Variation Minimization and the $H^{-1}$ Norm, UCLA CAM Report [02-57].
Ref: Talk by John Wright.


Nonlocal Total Variation

The graph definition of the discrete TV seminorm makes it straightforward to extend these algorithms to nonlocal TV minimization problems:

$$\|u\|_{TV} = \|Du\|_E$$

Simply redefine the edge-node adjacency matrix $D$: let $A$ be the adjacency matrix for the new set of edges, redefine $E$ accordingly, and let $W$ be a diagonal matrix of precomputed nonnegative weights on the edges. Then

$$\|u\|_{NLTV} = \|\sqrt{W} A u\|_E$$


Convexification of Image Segmentation

The binary segmentation problem

$$\min_u \|u\|_{TV} + \lambda\|u(c_1 - f)\|^2 + \lambda\|(1 - u)(c_2 - f)\|^2 \quad \text{s.t.} \quad u \text{ binary}$$

is relaxed to the convex problem

$$\min_u \|u\|_{TV} + \lambda\langle (c_1 - f)^2 - (c_2 - f)^2, u \rangle \quad \text{s.t.} \quad 0 \le u \le 1$$

The convexification idea also extends to active contours and multiphase segmentation.

Ref: Burger, M., and Hintermüller, M., Projected Gradient Flows for BV / Level Set Relaxation, UCLA CAM Report [05-40], 2005.
Ref: Goldstein, T., Bresson, X., and Osher, S., Geometric Applications of the Split Bregman Method: Segmentation and Surface Reconstruction, UCLA CAM Report [09-06], 2009.


Convexification of Image Registration

Given images $u$ and $\phi$, minimize

$$\|\phi(x - v) - u(x)\|^2 + \frac{\gamma}{2}\|\nabla v_1\|^2 + \frac{\gamma}{2}\|\nabla v_2\|^2$$

with respect to the displacement field $v$. Obtain a convex relaxation by adding edges with unknown weights $c_{i,j}$ such that

$$(v_1^i, v_2^i) = \Big(x_1^i - \sum_{j \sim i} c_{i,j}\, y_1^j,\;\; x_2^i - \sum_{j \sim i} c_{i,j}\, y_2^j\Big)$$

[Figure: image $u$ over coordinates $(x_1, x_2)$ and image $\phi$ over coordinates $(y_1, y_2)$, joined by candidate correspondence edges]

$$F(c) = \|A_\phi c - u\|^2 + \frac{\gamma}{2}\|D(A_{y_1} c - x_1)\|^2 + \frac{\gamma}{2}\|D(A_{y_2} c - x_2)\|^2$$

such that $c_{i,j} \ge 0$ and $\sum_{j \sim i} c_{i,j} = 1$.
