Deconvolution with ADMM
Gordon Wetzstein, Stanford University
EE367/CS448I: Computational Imaging and Display, stanford.edu/class/ee367
Lecture 6
Lens as Optical Low-pass Filter
• point source on focal plane maps to point
• away from focal plane: out-of-focus blur
[Figure: lens imaging a point source; labels: focal plane, blurred point]
Lens as Optical Low-pass Filter
• shift-invariant convolution
Lens as Optical Low-pass Filter
[Figure: sharp image x, point spread function (PSF) c, and measured, blurred image b]
$$ b = c * x $$
• the convolution kernel is called the point spread function (PSF)
Lens as Optical Low-pass Filter
diffraction-limited PSF of circular aperture (aka “Airy” pattern):
PSF, OTF, MTF
• point spread function (PSF) is a fundamental concept in optics
• optical transfer function (OTF) is the (complex) Fourier transform of the PSF
• modulation transfer function (MTF) is the magnitude of the OTF
• example: PSF, $\mathrm{OTF} = \mathcal{F}\{\mathrm{PSF}\}$, $\mathrm{MTF} = |\mathrm{OTF}|$
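The chain PSF → OTF → MTF can be sketched numerically. This is a minimal illustration, not the lecture's code: the Gaussian PSF and the helper names `gaussian_psf` and `psf_to_otf` are assumptions made for this example.

```python
import numpy as np

def gaussian_psf(size=64, sigma=2.0):
    """Hypothetical example PSF: a normalized 2D Gaussian centered in the array."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def psf_to_otf(psf):
    """OTF = F{PSF}; ifftshift first moves the PSF center to index (0, 0)."""
    return np.fft.fft2(np.fft.ifftshift(psf))

psf = gaussian_psf()
otf = psf_to_otf(psf)   # complex-valued in general
mtf = np.abs(otf)       # MTF = |OTF|
```

Because the PSF sums to one, the MTF equals one at zero frequency and cannot exceed one anywhere else, which is the low-pass behavior the lens slides describe.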
Deconvolution
• given measurements b and convolution kernel c, what is x?
[Figure: x * c = b, with x unknown]
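Deconvolution with Inverse Filtering
• naive solution: apply inverse kernel
$$ \tilde{x} = c^{-1} * b = \mathcal{F}^{-1}\left\{ \frac{\mathcal{F}\{b\}}{\mathcal{F}\{c\}} \right\} $$
[Figure: target image $x$ and inverse-filtered estimate $\tilde{x}$]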
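Deconvolution with Inverse Filtering & Noise
• naive solution: apply inverse kernel
$$ \tilde{x} = c^{-1} * b = \mathcal{F}^{-1}\left\{ \frac{\mathcal{F}\{b\}}{\mathcal{F}\{c\}} \right\} $$
• Gaussian noise, $\sigma = 0.05$
[Figure: inverse-filtered estimate $\tilde{x}$ from noisy measurements]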
Deconvolution with Inverse Filtering & Noise
• results: terrible!
• why? this is an ill-posed problem (division by (close to) zero in frequency domain) → noise is drastically amplified!
• need to include prior(s) on images to make up for lost data
• for example: noise statistics (signal to noise ratio)
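The noise amplification can be reproduced in a few lines. This is a sketch under assumptions that are not in the slides: a synthetic square test image, a Gaussian PSF with standard deviation 1 pixel, circular boundary conditions, and the helper names `gaussian_otf` and `inverse_filter`.

```python
import numpy as np

def gaussian_otf(shape, sigma=1.0):
    """OTF of a hypothetical normalized Gaussian PSF (circular boundaries)."""
    rows = np.arange(shape[0]) - shape[0] // 2
    cols = np.arange(shape[1]) - shape[1] // 2
    yy, xx = np.meshgrid(rows, cols, indexing="ij")
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    psf /= psf.sum()
    return np.fft.fft2(np.fft.ifftshift(psf))

def inverse_filter(b, otf):
    """Naive deconvolution: divide the spectra, with no safeguard at all."""
    return np.real(np.fft.ifft2(np.fft.fft2(b) / otf))

rng = np.random.default_rng(0)
x = np.zeros((64, 64))
x[16:48, 16:48] = 1.0                          # synthetic sharp image
otf = gaussian_otf(x.shape)
b_clean = np.real(np.fft.ifft2(np.fft.fft2(x) * otf))
b_noisy = b_clean + 0.05 * rng.standard_normal(x.shape)   # sigma = 0.05

err_clean = np.abs(inverse_filter(b_clean, otf) - x).max()
err_noisy = np.abs(inverse_filter(b_noisy, otf) - x).max()
```

Without noise the division recovers the image up to rounding error. With noise, the frequencies where the OTF is tiny dominate the result, so `err_noisy` ends up far larger than the image's own value range.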
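Deconvolution with Wiener Filtering
• apply inverse kernel and don’t divide by 0
$$ \tilde{x} = \mathcal{F}^{-1}\left\{ \frac{|\mathcal{F}\{c\}|^2}{|\mathcal{F}\{c\}|^2 + \frac{1}{\mathrm{SNR}}} \cdot \frac{\mathcal{F}\{b\}}{\mathcal{F}\{c\}} \right\} $$
• the first factor is an amplitude-dependent damping factor!
$$ \mathrm{SNR} = \frac{\text{mean signal} \approx 0.5}{\text{noise std} = \sigma} $$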
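A minimal sketch of the Wiener filter above, compared against the naive inverse filter. The test image, the Gaussian PSF, and the helper names are assumptions for illustration. The filter is written in the algebraically equivalent form $\mathcal{F}\{c\}^* / (|\mathcal{F}\{c\}|^2 + 1/\mathrm{SNR})$ so that no division by $\mathcal{F}\{c\}$ is needed.

```python
import numpy as np

F, Fi = np.fft.fft2, np.fft.ifft2

def gaussian_otf(shape, sigma=1.0):
    """OTF of a hypothetical normalized Gaussian PSF (circular boundaries)."""
    rows = np.arange(shape[0]) - shape[0] // 2
    cols = np.arange(shape[1]) - shape[1] // 2
    yy, xx = np.meshgrid(rows, cols, indexing="ij")
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    psf /= psf.sum()
    return F(np.fft.ifftshift(psf))

def wiener_deconv(b, otf, snr):
    """|C|^2 / (|C|^2 + 1/SNR) * B / C, rewritten as conj(C) * B / (|C|^2 + 1/SNR)."""
    H = np.conj(otf) / (np.abs(otf) ** 2 + 1.0 / snr)
    return np.real(Fi(H * F(b)))

def rmse(a, ref):
    return np.sqrt(np.mean((a - ref) ** 2))

rng = np.random.default_rng(0)
sigma_noise = 0.05
x = np.zeros((64, 64))
x[16:48, 16:48] = 1.0
otf = gaussian_otf(x.shape)
b = np.real(Fi(F(x) * otf)) + sigma_noise * rng.standard_normal(x.shape)

snr = b.mean() / sigma_noise          # mean signal / noise std, as on the slide
x_wiener = wiener_deconv(b, otf, snr)
x_inverse = np.real(Fi(F(b) / otf))   # naive inverse filter for comparison

rmse_wiener, rmse_inverse = rmse(x_wiener, x), rmse(x_inverse, x)
```

Here the mean signal is estimated from the measurement `b` itself, which is one possible choice; the slide simply assumes a mean of about 0.5.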
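Deconvolution with Wiener Filtering
[Figure: target $x$; estimates $\tilde{x}$ from naïve inverse filter and Wiener filter]
Deconvolution with Wiener Filtering
[Figure: Wiener filtering results for noise levels σ = 0.01, σ = 0.05, σ = 0.1]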
Deconvolution with Wiener Filtering
• results: not too bad, but noisy
• this is a heuristic → dampen noise amplification
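Total Variation
• idea: promote sparse gradients (edges)
$$ \underset{x}{\text{minimize}} \; \|Cx - b\|_2^2 + \lambda\, TV(x) = \underset{x}{\text{minimize}} \; \|Cx - b\|_2^2 + \lambda \|\nabla x\|_1 $$
$$ \|x\|_1 = \sum_i |x_i| $$
• $\nabla$ is finite differences operator, i.e. matrix
$$ \nabla = \begin{bmatrix} -1 & 1 & & \\ & -1 & 1 & \\ & & \ddots & \ddots \\ & & & -1 \end{bmatrix} $$
Rudin et al. 1992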
Total Variation
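[Figure: image $x$ and its gradient images $\nabla_x x$ and $\nabla_y x$]
$$ \nabla_x x = x * \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \qquad \nabla_y x = x * \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 1 & 0 \end{bmatrix} $$
• express (forward finite difference) gradient as convolution!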
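A small check that forward finite differences are a convolution. It assumes circular boundary conditions (an assumption made here so the FFT version is exact), and the function names are hypothetical.

```python
import numpy as np

def grad(x):
    """Forward differences with circular boundaries."""
    dx = np.roll(x, -1, axis=1) - x    # x[i, j+1] - x[i, j]
    dy = np.roll(x, -1, axis=0) - x    # x[i+1, j] - x[i, j]
    return dx, dy

def grad_via_fft(x):
    """Same operator, applied as a multiplication in the Fourier domain."""
    kx = np.zeros(x.shape)
    kx[0, 0], kx[0, -1] = -1.0, 1.0    # circular kernel of the horizontal difference
    ky = np.zeros(x.shape)
    ky[0, 0], ky[-1, 0] = -1.0, 1.0    # circular kernel of the vertical difference
    X = np.fft.fft2(x)
    dx = np.real(np.fft.ifft2(np.fft.fft2(kx) * X))
    dy = np.real(np.fft.ifft2(np.fft.fft2(ky) * X))
    return dx, dy

rng = np.random.default_rng(0)
x = rng.random((8, 8))
dx_direct, dy_direct = grad(x)
dx_fft, dy_fft = grad_via_fft(x)
```

Both routes give the same gradient images, which is what later allows $\nabla^T \nabla$ to be inverted by a per-frequency division.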
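Total Variation
[Figure: image $x$ with its isotropic and anisotropic gradient magnitude]
• better: isotropic, $\sqrt{(\nabla_x x)^2 + (\nabla_y x)^2}$
• easier: anisotropic, $\sqrt{(\nabla_x x)^2} + \sqrt{(\nabla_y x)^2}$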
Total Variation
• for simplicity, this lecture only discusses anisotropic TV:
$$ TV(x) = \|\nabla_x x\|_1 + \|\nabla_y x\|_1 = \left\| \begin{bmatrix} \nabla_x \\ \nabla_y \end{bmatrix} x \right\|_1 $$
• problem: l1-norm is not differentiable, can’t use inverse filtering
• however: simple solution for data fitting alone and simple solution for TV alone → split problem!
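To make the two TV variants concrete, the sketch below evaluates both on a small piecewise-constant image and on a noisy copy. Circular boundaries and the function names are assumptions for this example.

```python
import numpy as np

def grad(x):
    """Forward differences with circular boundaries."""
    return np.roll(x, -1, axis=1) - x, np.roll(x, -1, axis=0) - x

def tv_anisotropic(x):
    """||grad_x x||_1 + ||grad_y x||_1"""
    dx, dy = grad(x)
    return np.abs(dx).sum() + np.abs(dy).sum()

def tv_isotropic(x):
    """Sum over pixels of sqrt((grad_x x)^2 + (grad_y x)^2)"""
    dx, dy = grad(x)
    return np.sqrt(dx**2 + dy**2).sum()

x = np.zeros((16, 16))
x[4:12, 4:12] = 1.0                        # 8x8 square: gradients only on its border
rng = np.random.default_rng(0)
x_noisy = x + 0.1 * rng.standard_normal(x.shape)

tv_aniso_clean = tv_anisotropic(x)         # 32.0: 16 horizontal + 16 vertical unit jumps
tv_iso_clean = tv_isotropic(x)             # 30 + sqrt(2): one corner pixel has both jumps
tv_aniso_noisy = tv_anisotropic(x_noisy)
```

The clean image has a small TV because its gradients are sparse, while the noisy copy has a much larger one. That difference is what the TV prior penalizes.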
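Deconvolution with ADMM
• split deconvolution with TV prior:
$$ \text{minimize} \quad \|Cx - b\|_2^2 + \lambda \|z\|_1 \qquad \text{subject to} \quad \nabla x = z $$
• general form of ADMM (alternating direction method of multipliers):
$$ \text{minimize} \quad f(x) + g(z) \qquad \text{subject to} \quad Ax + Bz = c $$
$$ f(x) = \|Cx - b\|_2^2, \qquad g(z) = \lambda \|z\|_1, \qquad A = \nabla, \quad B = -I, \quad c = 0 $$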
ADMM
$$ \text{minimize} \quad f(x) + g(z) \qquad \text{subject to} \quad Ax + Bz = c $$
• Lagrangian (bring constraints into objective = penalty method):
$$ L(x, y, z) = f(x) + g(z) + y^T (Ax + Bz - c) $$
  where $y$ is the dual variable or Lagrange multiplier
• augmented Lagrangian:
$$ L_\rho(x, y, z) = f(x) + g(z) + y^T (Ax + Bz - c) + (\rho / 2) \|Ax + Bz - c\|_2^2 $$
  where the last term is an additional penalty term
ADMM
• augmented Lagrangian is differentiable under mild conditions (usually better convergence etc.)
$$ \text{minimize} \quad f(x) + g(z) \qquad \text{subject to} \quad Ax + Bz = c $$
$$ L_\rho(x, y, z) = f(x) + g(z) + y^T (Ax + Bz - c) + (\rho / 2) \|Ax + Bz - c\|_2^2 $$
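ADMM
$$ \text{minimize} \quad f(x) + g(z) \qquad \text{subject to} \quad Ax + Bz = c $$
• ADMM consists of 3 steps per iteration k:
$$ x^{k+1} := \arg\min_x \; L_\rho(x, y^k, z^k) $$
$$ z^{k+1} := \arg\min_z \; L_\rho(x^{k+1}, y^k, z) $$
$$ y^{k+1} := y^k + \rho (Ax^{k+1} + Bz^{k+1} - c) $$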
ADMM
$$ \text{minimize} \quad f(x) + g(z) \qquad \text{subject to} \quad Ax + Bz = c $$
• ADMM consists of 3 steps per iteration k:
$$ x^{k+1} := \arg\min_x \left( f(x) + (\rho / 2) \|Ax + Bz^k - c + u^k\|_2^2 \right) $$
$$ z^{k+1} := \arg\min_z \left( g(z) + (\rho / 2) \|Ax^{k+1} + Bz - c + u^k\|_2^2 \right) $$
$$ u^{k+1} := u^k + Ax^{k+1} + Bz^{k+1} - c $$
• scaled dual variable: $u = (1 / \rho) y$
• in the x-update, $Bz^k - c + u^k$ is constant
• split f(x) and g(z) into independent problems! (u connects them)
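The step from the augmented Lagrangian to the scaled form is a completion of the square. A short derivation follows; the residual symbol $r$ is introduced here only for brevity and does not appear in the slides.

```latex
% Let r = Ax + Bz - c and u = (1/\rho) y. Then
\begin{aligned}
y^T r + \frac{\rho}{2}\|r\|_2^2
  &= \frac{\rho}{2}\left( \|r\|_2^2 + 2 u^T r + \|u\|_2^2 \right) - \frac{\rho}{2}\|u\|_2^2 \\
  &= \frac{\rho}{2}\|r + u\|_2^2 - \frac{\rho}{2}\|u\|_2^2 .
\end{aligned}
% The last term does not depend on x or z, so it drops out of both argmin steps.
% Dividing the dual update y^{k+1} = y^k + \rho\, r^{k+1} by \rho gives
u^{k+1} = u^k + r^{k+1} = u^k + A x^{k+1} + B z^{k+1} - c .
```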
Deconvolution with ADMM
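$$ \text{minimize} \quad \tfrac{1}{2} \|Cx - b\|_2^2 + \lambda \|z\|_1 \qquad \text{subject to} \quad \nabla x - z = 0 $$
• ADMM consists of 3 steps per iteration k:
$$ x^{k+1} := \arg\min_x \left( \tfrac{1}{2} \|Cx - b\|_2^2 + (\rho / 2) \|\nabla x - z^k + u^k\|_2^2 \right) $$
$$ z^{k+1} := \arg\min_z \left( \lambda \|z\|_1 + (\rho / 2) \|\nabla x^{k+1} - z + u^k\|_2^2 \right) $$
$$ u^{k+1} := u^k + \nabla x^{k+1} - z^{k+1} $$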
Deconvolution with ADMM
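1. x-update:
$$ x^{k+1} := \arg\min_x \left( \tfrac{1}{2} \|Cx - b\|_2^2 + (\rho / 2) \|\nabla x - z^k + u^k\|_2^2 \right) $$
• $z^k - u^k$ is constant in this step, say $v = z^k - u^k$
• solve normal equations:
$$ \left( C^T C + \rho \nabla^T \nabla \right) x = C^T b + \rho \nabla^T v $$
$$ \nabla^T v = \begin{bmatrix} \nabla_x \\ \nabla_y \end{bmatrix}^T v = \nabla_x^T v_1 + \nabla_y^T v_2 $$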
Deconvolution with ADMM
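1. x-update:
$$ x^{k+1} := \arg\min_x \left( \tfrac{1}{2} \|Cx - b\|_2^2 + (\rho / 2) \|\nabla x - z^k + u^k\|_2^2 \right) $$
• inverse filtering, with $v = z^k - u^k$ constant:
$$ x = \left( C^T C + \rho \nabla^T \nabla \right)^{-1} \left( C^T b + \rho \nabla^T v \right) $$
$$ x^{k+1} = \mathcal{F}^{-1}\left\{ \frac{ \mathcal{F}\{c\}^* \cdot \mathcal{F}\{b\} + \rho \left( \mathcal{F}\{\nabla_x\}^* \cdot \mathcal{F}\{v_1\} + \mathcal{F}\{\nabla_y\}^* \cdot \mathcal{F}\{v_2\} \right) }{ \mathcal{F}\{c\}^* \cdot \mathcal{F}\{c\} + \rho \left( \mathcal{F}\{\nabla_x\}^* \cdot \mathcal{F}\{\nabla_x\} + \mathcal{F}\{\nabla_y\}^* \cdot \mathcal{F}\{\nabla_y\} \right) } \right\} $$
→ may blow up, but that’s okay
• precompute! (the denominator and $\mathcal{F}\{c\}^* \cdot \mathcal{F}\{b\}$ do not change across iterations)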
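The Fourier-domain x-update can be sketched as below. Assumptions for this example: circular boundary conditions, a 5x5 box PSF, random stand-ins for $b$ and $v$, and the hypothetical helper names `make_otfs` and `x_update`.

```python
import numpy as np

F, Fi = np.fft.fft2, np.fft.ifft2

def make_otfs(shape, psf):
    """Transfer functions of the blur c and of the two forward-difference kernels."""
    c = np.zeros(shape)
    c[:psf.shape[0], :psf.shape[1]] = psf
    # shift the PSF center to index (0, 0) so the blur does not translate the image
    c = np.roll(c, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)), axis=(0, 1))
    dx = np.zeros(shape)
    dx[0, 0], dx[0, -1] = -1.0, 1.0
    dy = np.zeros(shape)
    dy[0, 0], dy[-1, 0] = -1.0, 1.0
    return F(c), F(dx), F(dy)

def x_update(b, v1, v2, otf_c, otf_dx, otf_dy, rho):
    """Solve (C^T C + rho grad^T grad) x = C^T b + rho grad^T v by per-frequency division."""
    numerator = np.conj(otf_c) * F(b) + rho * (
        np.conj(otf_dx) * F(v1) + np.conj(otf_dy) * F(v2))
    # In an ADMM loop, the denominator and conj(otf_c) * F(b) would be precomputed once.
    denominator = np.abs(otf_c) ** 2 + rho * (
        np.abs(otf_dx) ** 2 + np.abs(otf_dy) ** 2)
    return np.real(Fi(numerator / denominator))

rng = np.random.default_rng(0)
rho = 10.0
b = rng.random((32, 32))
v1, v2 = rng.standard_normal((2, 32, 32))
psf = np.ones((5, 5)) / 25.0
otf_c, otf_dx, otf_dy = make_otfs(b.shape, psf)
x_new = x_update(b, v1, v2, otf_c, otf_dx, otf_dy, rho)
```

With a PSF that sums to one, the denominator equals one at zero frequency and is strictly positive elsewhere for $\rho > 0$, so the division is well defined in this setup.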
Deconvolution with ADMM
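2. z-update:
$$ z^{k+1} := \arg\min_z \left( \lambda \|z\|_1 + (\rho / 2) \|\nabla x^{k+1} - z + u^k\|_2^2 \right) = \arg\min_z \left( \lambda \|z\|_1 + (\rho / 2) \|z - a\|_2^2 \right) $$
• $\nabla x^{k+1} + u^k$ is constant in this step, say $a = \nabla x^{k+1} + u^k$
• l1-norm is not differentiable! yet, closed-form solution via element-wise soft thresholding:
$$ z^{k+1} := S_{\lambda / \rho}(a), \qquad S_\kappa(a) = \begin{cases} a - \kappa & a > \kappa \\ 0 & |a| \le \kappa \\ a + \kappa & a < -\kappa \end{cases} \; = (a - \kappa)_+ - (-a - \kappa)_+ $$
$$ \kappa = \lambda / \rho $$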
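The soft thresholding operator takes one line in NumPy. A minimal sketch with made-up input values; the function name `soft_threshold` is an assumption.

```python
import numpy as np

def soft_threshold(a, kappa):
    """Element-wise S_kappa(a) = (a - kappa)_+ - (-a - kappa)_+."""
    return np.maximum(a - kappa, 0.0) - np.maximum(-a - kappa, 0.0)

a = np.array([-2.0, -0.5, 0.0, 0.3, 1.5])
z = soft_threshold(a, kappa=0.5)   # -> [-1.5, 0.0, 0.0, 0.0, 1.0]
```

Entries with magnitude at most $\kappa$ are set to zero and all others are shrunk toward zero by $\kappa$, which is how the z-update produces sparse gradients.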
Deconvolution with ADMM
$$ \text{minimize} \quad \tfrac{1}{2} \|Cx - b\|_2^2 + \lambda \|z\|_1 \qquad \text{subject to} \quad \nabla x - z = 0 $$
for k=1:max_iters
$$ x^{k+1} := \arg\min_x \; \tfrac{1}{2} \left\| \begin{bmatrix} C \\ \sqrt{\rho}\, \nabla \end{bmatrix} x - \begin{bmatrix} b \\ \sqrt{\rho}\, v \end{bmatrix} \right\|_2^2 \qquad \text{(inverse filtering)} $$
$$ z^{k+1} := S_{\lambda / \rho}(\nabla x^{k+1} + u^k) \qquad \text{(element-wise threshold)} $$
$$ u^{k+1} := u^k + \nabla x^{k+1} - z^{k+1} \qquad \text{(trivial)} $$
→ easy!
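Putting the three updates together gives a complete loop. The following is a sketch under stated assumptions, not the course's reference implementation: circular boundary conditions, a synthetic square image, a 5x5 box PSF, and hypothetical function names throughout.

```python
import numpy as np

F, Fi = np.fft.fft2, np.fft.ifft2

def psf_to_otf(psf, shape):
    """Zero-pad a small PSF to `shape` and shift its center to index (0, 0)."""
    c = np.zeros(shape)
    c[:psf.shape[0], :psf.shape[1]] = psf
    c = np.roll(c, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)), axis=(0, 1))
    return F(c)

def grad(x):
    """Forward differences with circular boundaries, stacked as (2, M, N)."""
    return np.stack([np.roll(x, -1, axis=1) - x, np.roll(x, -1, axis=0) - x])

def soft_threshold(a, kappa):
    return np.maximum(a - kappa, 0.0) - np.maximum(-a - kappa, 0.0)

def admm_tv_deconv(b, psf, lam=0.05, rho=10.0, max_iters=100):
    otf_c = psf_to_otf(psf, b.shape)
    dx = np.zeros(b.shape)
    dx[0, 0], dx[0, -1] = -1.0, 1.0
    dy = np.zeros(b.shape)
    dy[0, 0], dy[-1, 0] = -1.0, 1.0
    otf_dx, otf_dy = F(dx), F(dy)
    # Precompute everything that does not change across iterations.
    ctb = np.conj(otf_c) * F(b)
    denom = np.abs(otf_c) ** 2 + rho * (np.abs(otf_dx) ** 2 + np.abs(otf_dy) ** 2)
    x = np.zeros_like(b)                 # initialize X, Z, U with 0
    z = np.zeros((2,) + b.shape)
    u = np.zeros_like(z)
    for _ in range(max_iters):
        v = z - u
        num = ctb + rho * (np.conj(otf_dx) * F(v[0]) + np.conj(otf_dy) * F(v[1]))
        x = np.real(Fi(num / denom))            # x-update: inverse filtering
        gx = grad(x)
        z = soft_threshold(gx + u, lam / rho)   # z-update: element-wise threshold
        u = u + gx - z                          # u-update: trivial
    return x

rng = np.random.default_rng(0)
x_true = np.zeros((64, 64))
x_true[16:48, 16:48] = 1.0
psf = np.ones((5, 5)) / 25.0
b = np.real(Fi(psf_to_otf(psf, x_true.shape) * F(x_true)))
b = b + 0.05 * rng.standard_normal(b.shape)
x_hat = admm_tv_deconv(b, psf, lam=0.05, rho=10.0, max_iters=100)
```

The values of `lam`, `rho`, and `max_iters` are only starting points taken from the range used on the result slides; they would need tuning for other images and noise levels.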
Deconvolution with ADMM
$$ \text{minimize} \quad \tfrac{1}{2} \|Cx - b\|_2^2 + \lambda \|z\|_1 \qquad \text{subject to} \quad \nabla x - z = 0 $$
[Figure: Wiener filtering vs. ADMM with anisotropic TV, λ = 0.01, ρ = 10]
Deconvolution with ADMM
[Figure: ADMM results for λ = 0.01, λ = 0.05, λ = 0.1, each with ρ = 10]
• too much TV: “patchy”, too little TV: noisy
Deconvolution with ADMM
$$ \text{minimize} \quad \tfrac{1}{2} \|Cx - b\|_2^2 + \lambda \|z\|_1 \qquad \text{subject to} \quad \nabla x - z = 0 $$
[Figure: Wiener filtering vs. ADMM with anisotropic TV, λ = 0.1, ρ = 10]
Deconvolution with ADMM
[Figure: ADMM results for λ = 0.01, λ = 0.05, λ = 0.1, each with ρ = 10]
• too much TV: okay because image actually has sparse gradients!
Outlook ADMM
$$ \underset{x}{\text{minimize}} \; \underbrace{\tfrac{1}{2} \|Ax - b\|_2^2}_{\text{data fidelity}} + \underbrace{\Gamma(x)}_{\text{regularization}} $$
$$ \underset{\{x, z\}}{\text{minimize}} \; f(x) + g(z) \qquad \text{subject to} \quad Ax = z $$
• powerful tool for many computational imaging problems
• include generic prior in g(z), just need to derive proximal operator
• example priors: noise statistics, sparse gradient, smoothness, …
• weighted sum of different priors also possible
• anisotropic TV is one of the easiest priors
Remember!
• implement matrix-free operations for $Ax$ and $A^T x$ if efficient (e.g. multiplications and divisions in frequency space)
• split difficult problems (e.g., inverse problems with non-differentiable priors) into easier subproblems: ADMM
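A matrix-free blur operator and its adjoint can be written as two small functions, and a dot-product test is a common way to check that the pair is consistent. This is a sketch that assumes circular boundaries and a random asymmetric PSF; the name `make_blur_operators` is hypothetical.

```python
import numpy as np

def make_blur_operators(psf, shape):
    """Return functions computing A x and A^T y without ever forming the matrix A."""
    c = np.zeros(shape)
    c[:psf.shape[0], :psf.shape[1]] = psf
    c = np.roll(c, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)), axis=(0, 1))
    otf = np.fft.fft2(c)

    def A(x):
        return np.real(np.fft.ifft2(otf * np.fft.fft2(x)))

    def At(y):
        # the adjoint of a circular convolution uses the conjugate transfer function
        return np.real(np.fft.ifft2(np.conj(otf) * np.fft.fft2(y)))

    return A, At

rng = np.random.default_rng(0)
psf = rng.random((5, 5))
psf /= psf.sum()                      # asymmetric PSF, so A and A^T really differ
A, At = make_blur_operators(psf, (32, 32))
x = rng.standard_normal((32, 32))
y = rng.standard_normal((32, 32))
lhs = np.sum(A(x) * y)                # <A x, y>
rhs = np.sum(x * At(y))               # <x, A^T y>
```

If `lhs` and `rhs` agree to rounding error, the two functions form a valid operator and adjoint pair. A mismatch usually points to a missing conjugate or a wrongly centered PSF.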
Homework 3
• implement:
  • filtering
  • inverse filtering and Wiener filtering
  • deconvolution with ADMM + (anisotropic) TV prior
• notes for ADMM implementation:
  • initialize U, Z, X with 0; for an image $I \in \mathbb{R}^{M \times N}$: $X \in \mathbb{R}^{MN \times 1}$, $U \in \mathbb{R}^{2MN \times 1}$, $Z \in \mathbb{R}^{2MN \times 1}$
  • implement with matrix-free form: all FT multiplications / divisions
  • in 2D, finite differences matrix becomes $\nabla = \begin{bmatrix} \nabla_x \\ \nabla_y \end{bmatrix}$ (anisotropic form), use matrix-free operations as well!
  • see notes in HW
  • check ADMM example scripts: http://web.stanford.edu/~boyd/papers/admm/
Notes for Homework 3
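• signal-to-noise ratio (SNR):
$$ \mathrm{SNR} = \frac{P_{\text{signal}}}{P_{\text{noise}}}, \qquad \mathrm{SNR}_{\text{dB}} = 10 \cdot \log_{10}\left( \frac{P_{\text{signal}}}{P_{\text{noise}}} \right) $$
• peak signal-to-noise ratio (PSNR), always in dB:
$$ \mathrm{MSE} = \frac{1}{mn} \sum_m \sum_n \left( x_{\text{target}} - x_{\text{est}} \right)^2 $$
$$ \mathrm{PSNR} = 10 \cdot \log_{10}\left( \frac{\max(x_{\text{target}})^2}{\mathrm{MSE}} \right) = 10 \cdot \log_{10}\left( \frac{1}{\mathrm{MSE}} \right) $$
  (the second equality holds for images normalized so that $\max(x_{\text{target}}) = 1$)
• residual is value of objective function:
  • not regularized: $\tfrac{1}{2} \|Cx - b\|_2^2$
  • regularized: $\tfrac{1}{2} \|Cx - b\|_2^2 + \lambda \left\| \begin{bmatrix} \nabla_x \\ \nabla_y \end{bmatrix} x \right\|_1$
• convergence: residual for increasing iterations (should always decrease!)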
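The quality metrics above translate directly into code. In this sketch the function names are assumptions, and signal power is taken to be the mean squared value, which is one common convention that the slide does not specify.

```python
import numpy as np

def mse(x_target, x_est):
    return np.mean((x_target - x_est) ** 2)

def psnr(x_target, x_est):
    """Peak signal-to-noise ratio in dB."""
    return 10.0 * np.log10(np.max(x_target) ** 2 / mse(x_target, x_est))

def snr_db(signal, noise):
    """SNR in dB, with power defined as the mean squared value (an assumption)."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

x_target = np.zeros((8, 8))
x_target[2:6, 2:6] = 1.0              # peak value 1
x_est = x_target + 0.1                # constant error of 0.1, so MSE = 0.01
psnr_value = psnr(x_target, x_est)    # 10 * log10(1 / 0.01) = 20 dB
```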
References and Further Reading
• Boyd, Parikh, Chu, Peleato, Eckstein, “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers”, Foundations and Trends in Machine Learning, 2011
• Chambolle, Pock, “A first-order primal-dual algorithm for convex problems with applications in imaging”, Journal of Mathematical Imaging and Vision, 2011
• Boreman, “Modulation Transfer Function in Optical and ElectroOptical Systems”, SPIE Publications, 2001
• Rudin, Osher, Fatemi, “Nonlinear total variation based noise removal algorithms”, Physica D: Nonlinear Phenomena 60, 1
• http://www.imagemagick.org/Usage/fourier/