accelerated image reconstruction for nonlinear diffractive imaging - arxiv.org … ·...

11
Accelerated Image Reconstruction for Nonlinear Diffractive Imaging Yanting Ma, Student Member, IEEE, Hassan Mansour, Senior Member, IEEE, Dehong Liu, Senior Member, IEEE, Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE Abstract—The problem of reconstructing an object from the measurements of the light it scatters is common in numerous imaging applications. While the most popular formulations of the problem are based on linearizing the object-light relation- ship, there is an increased interest in considering nonlinear formulations that can account for multiple light scattering. In this paper, we propose an image reconstruction method, called CISOR, for nonlinear diffractive imaging, based on a nonconvex optimization formulation with total variation (TV) regularization. The nonconvex solver used in CISOR is our new variant of fast iterative shrinkage/thresholding algorithm (FISTA). We provide fast and memory-efficient implementation of the new FISTA variant and prove that it reliably converges for our nonconvex optimization problem. In addition, we systematically compare our method with other state-of-the-art methods on simulated as well as experimentally measured data in both 2D and 3D settings. Index Terms—Diffraction tomography, proximal gradient method, total variation regularization, nonconvex optimization I. I NTRODUCTION Estimation of the spatial permittivity distribution of an ob- ject from the scattered wave measurements is ubiquitous in nu- merous applications. Conventional methods usually rely on lin- earizing the relationship between the permittivity and the wave measurements. For example, the first Born approximation [1] and the Rytov approximation [2] are linearization techniques commonly adopted in diffraction tomography [3]–[8]. Other imaging systems that are based on a linear forward model in- clude optical projection tomography (OPT), optical coherence tomography (OCT), digital holography, and subsurface radar [9]–[18]. One attractive aspect of linear methods is that the inverse problem can be formulated as a convex optimization problem and solved by various efficient convex solvers [19]– [22]. However, linear models are highly inaccurate when the physical size of the object is large compared to the wavelength of the incident wave or the permittivity contrast of the object compared to the background is high [23]. Therefore, in order to be able to image strongly scattering objects such as human tissue [24], nonlinear formulations that can model multiple scattering need to be considered. The challenge is then to develop fast, memory-efficient, and reliable inverse methods Y. Ma is with the Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27606. This work was completed while Y. Ma was with MERL. H. Mansour, D. Liu, and P. T. Boufounos are with Mitsubishi Electric Research Laboratories (MERL), 201 Broadway, Cambridge, MA 02139. U. S. Kamilov (email: [email protected]) is with Computational Imaging Group (CIG), Washington University in St. Louis, St. Louis, MO 63130. IL FB CISOR 0.2% 8.2% 16.2% 24% 32.2% Fig. 1. Comparison between linear method with first Born approximation (FB, first row), iterative linearization method (IL, second row), and our proposed nonlinear method (CISOR, third row). Each column represents one contrast value as indicated at the bottom of the images on the third row. CISOR is stable for all tested contrast values, whereas FB and IL fail for large contrast. that can account for the nonlinearity. Note that the nonlinearity in this work refers to the fundamental relationship between the scattered wave and the permittivity contrast, rather than that introduced by the limitations of sensing systems such as missing phase. A standard way for solving inverse problems is via op- timization, where a sequence of estimates is generated by iteratively minimizing a cost function. For ill-posed problems, a cost function usually consists of a quadratic data-fidelity term and a regularization term, which incorporates prior in- formation such as transform-domain sparsity to mitigate the ill-posedness. Total variation (TV) [25] is one of the most commonly used regularizers in image processing, as it captures the property of piece-wise smoothness in natural objects. The challenge of such formulation for nonlinear diffractive imaging is that the data-fidelity term is nonconvex due to the nonlinearity and that the TV regularizer is nondifferentiable. For such nonsmooth and nonconvex problems, the prox- imal gradient method, also known as iterative shrink- age/thresholding algorithm (ISTA) [26]–[28], is a natural choice. ISTA is easy to implement and is proved to converge under some technical conditions. However, its convergence speed is slow. FISTA [22] is an accelerated variant of ISTA, which is proved to have the optimal worst case convergence rate for convex problems. Unfortunately, its convergence anal- ysis for nonconvex problems has not been established. arXiv:1708.01663v2 [cs.CV] 14 Dec 2017

Upload: others

Post on 07-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

Accelerated Image Reconstruction for NonlinearDiffractive Imaging

Yanting Ma, Student Member, IEEE, Hassan Mansour, Senior Member, IEEE, Dehong Liu, Senior Member, IEEE,Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

Abstract—The problem of reconstructing an object from themeasurements of the light it scatters is common in numerousimaging applications. While the most popular formulations ofthe problem are based on linearizing the object-light relation-ship, there is an increased interest in considering nonlinearformulations that can account for multiple light scattering. Inthis paper, we propose an image reconstruction method, calledCISOR, for nonlinear diffractive imaging, based on a nonconvexoptimization formulation with total variation (TV) regularization.The nonconvex solver used in CISOR is our new variant of fastiterative shrinkage/thresholding algorithm (FISTA). We providefast and memory-efficient implementation of the new FISTAvariant and prove that it reliably converges for our nonconvexoptimization problem. In addition, we systematically compareour method with other state-of-the-art methods on simulated aswell as experimentally measured data in both 2D and 3D settings.

Index Terms—Diffraction tomography, proximal gradientmethod, total variation regularization, nonconvex optimization

I. INTRODUCTION

Estimation of the spatial permittivity distribution of an ob-ject from the scattered wave measurements is ubiquitous in nu-merous applications. Conventional methods usually rely on lin-earizing the relationship between the permittivity and the wavemeasurements. For example, the first Born approximation [1]and the Rytov approximation [2] are linearization techniquescommonly adopted in diffraction tomography [3]–[8]. Otherimaging systems that are based on a linear forward model in-clude optical projection tomography (OPT), optical coherencetomography (OCT), digital holography, and subsurface radar[9]–[18]. One attractive aspect of linear methods is that theinverse problem can be formulated as a convex optimizationproblem and solved by various efficient convex solvers [19]–[22]. However, linear models are highly inaccurate when thephysical size of the object is large compared to the wavelengthof the incident wave or the permittivity contrast of the objectcompared to the background is high [23]. Therefore, in orderto be able to image strongly scattering objects such as humantissue [24], nonlinear formulations that can model multiplescattering need to be considered. The challenge is then todevelop fast, memory-efficient, and reliable inverse methods

Y. Ma is with the Department of Electrical and Computer Engineering,North Carolina State University, Raleigh, NC 27606. This work was completedwhile Y. Ma was with MERL.

H. Mansour, D. Liu, and P. T. Boufounos are with Mitsubishi ElectricResearch Laboratories (MERL), 201 Broadway, Cambridge, MA 02139.

U. S. Kamilov (email: [email protected]) is with Computational ImagingGroup (CIG), Washington University in St. Louis, St. Louis, MO 63130.

IL

FB

𝜆

CISOR

0.2% 8.2% 16.2% 24% 32.2%

Fig. 1. Comparison between linear method with first Born approximation (FB,first row), iterative linearization method (IL, second row), and our proposednonlinear method (CISOR, third row). Each column represents one contrastvalue as indicated at the bottom of the images on the third row. CISOR isstable for all tested contrast values, whereas FB and IL fail for large contrast.

that can account for the nonlinearity. Note that the nonlinearityin this work refers to the fundamental relationship betweenthe scattered wave and the permittivity contrast, rather thanthat introduced by the limitations of sensing systems such asmissing phase.

A standard way for solving inverse problems is via op-timization, where a sequence of estimates is generated byiteratively minimizing a cost function. For ill-posed problems,a cost function usually consists of a quadratic data-fidelityterm and a regularization term, which incorporates prior in-formation such as transform-domain sparsity to mitigate theill-posedness. Total variation (TV) [25] is one of the mostcommonly used regularizers in image processing, as it capturesthe property of piece-wise smoothness in natural objects.The challenge of such formulation for nonlinear diffractiveimaging is that the data-fidelity term is nonconvex due to thenonlinearity and that the TV regularizer is nondifferentiable.

For such nonsmooth and nonconvex problems, the prox-imal gradient method, also known as iterative shrink-age/thresholding algorithm (ISTA) [26]–[28], is a naturalchoice. ISTA is easy to implement and is proved to convergeunder some technical conditions. However, its convergencespeed is slow. FISTA [22] is an accelerated variant of ISTA,which is proved to have the optimal worst case convergencerate for convex problems. Unfortunately, its convergence anal-ysis for nonconvex problems has not been established.

arX

iv:1

708.

0166

3v2

[cs

.CV

] 1

4 D

ec 2

017

Page 2: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

2

A. Contributions

In this paper, we propose a new image reconstructionmethod called Convergent Inverse Scattering using Optimiza-tion and Regularization (CISOR) for fast and memory-efficientnonlinear diffractive imaging.

The key contributions of this paper are as follows:• A novel nonconvex formulation that precisely models the

fundamental nonlinear relationship between the scatteredwave and the permittivity contrast, while enabling fastand memory-efficient implementation, as well as rigorousconvergence analysis.

• A new relaxed variant of FISTA for solving our non-convex problem with rigorous convergence analysis. Ournew variant of FISTA may be of interest on its own as ageneral nonconvex solver.

• Extension of the proposed formulation and algorithm, aswell as the convergence analysis, to the 3D vectorial case,which makes our method applicable in a broad range ofengineering and scientific areas.

B. Related Work

Many methods that attempt to integrate the nonlinearity ofwave scattering have been proposed in the literature. The iter-ative linearization (IL) method [29], [30] iteratively computesthe forward model using the current estimated permittivity, andestimates the permittivity using the field from the previouslycomputed forward model. Hence, each sub-problem at eachiteration is a linear problem. Contrast source inversion (CSI)[31]–[33] defines an auxiliary variable called the contrastsource, which is the product of the permittivity contrast andthe field. CSI alternates between estimating the contrast sourceand the permittivity. Hybrid methods (HM) [34]–[36] combineIL and CSI, aiming to benefit from each of their advantages.A comprehensive comparison of these three methods can befound in the review paper [35]. Recently, the idea of neuralnetwork unfolding has inspired a class of methods that updatesthe estimates using error backpropagation [37]–[41]. Whilesuch methods can in principle model the precise nonlinearity,in practice, the accuracy may be limited by the availability ofmemory to store the iterates needed to perform unfolding.

Figure 1 provides a visual comparison of the reconstructedimages obtained by the first Born (FB) linear approximation[1], the iterative linearization (IL) method [29], [30], andour proposed nonlinear method CISOR; detailed experimentalsetup will be presented in Section IV. We can see that whenthe contrast of the object is small, all methods achieve similarreconstruction quality. As the contrast value increases, theperformance of the linear method degrades significantly andthe iterative linearization method only succeeds up to a certainlevel, whereas our nonlinear method CISOR provides reliablereconstruction for all tested contrast values.

While we were concluding this manuscript, we becameaware of very recent related work in [42], which considereda similar problem as in this paper. Our work differs from[42] in the following aspects: (i) Only the 2D case hasbeen considered in [42], whereas our work extends to the3D vectorial case. (ii) FISTA [22], which does not have

𝑟

𝑟 = (𝑥, 𝑦)

Fig. 2. Visual representation of the measurement scenario considered inthis paper. An object with a real permittivity contrast f(r) is illuminatedwith an input wave uin(r), which interacts with the object and results inthe scattered wave usc at the sensor domain Γ⊂R2. The complex scatteredwave is captured at the sensor and the algorithm proposed here is used forestimating the contrast f .

convergence guarantee for nonconvex problems, is applied in[42] as the nonconvex solver, whereas our method appliesour new variant of FISTA with rigorous convergence analysisestablished here.

II. PROBLEM FORMULATION

The problem of inverse scattering based on the scalartheory of diffraction [1], [43] is described as follows andillustrated in Figure 2; the formulation for the 3D vectorialcase is presented in Appendix VI-A. Let d=2 or 3, supposethat an object is placed within a bounded domain Ω⊂Rd.The object is illuminated by an incident wave uin, and thescattered wave usc is measured by the sensors placed in asensing region Γ⊂Rd. Let u denote the total field, whichsatisfies u(r)=uin(r)+usc(r),∀r∈Rd. The scalar Lippmann-Schwinger equation [1] establishes the relationship betweenwave and permittivity contrast:

u(r)=uin(r)+k2

∫Ω

g(r−r′)u(r′)f(r′)dr′, ∀r∈Rd.

In the above, f(r)=(ε(r)−εb) is the permittivity contrast,where ε(r) is the permittivity of the object, εb is the permit-tivity of the background, and k=2π/λ is the wavenumberin vacuum. We assume that f is real, or in other words, theobject is lossless. The free-space Green’s function is definedas follows:

g(r)=

− j4H

(1)0 (kb‖r‖), if d=2

ejkb‖r‖

4π‖r‖ , if d=3, (1)

where ‖·‖ denotes the Euclidean norm, H(1)0 is the Hankel

function of first kind, and kb=k√εb is the wavenumber of

the background medium. The corresponding discrete systemis then

y=Hdiag(u)f +e, (2)

u=uin +Gdiag(f)u, (3)

where f ∈RN , u∈CN , uin∈CN are N uniformly spacedsamples of f(r), u(r), and uin(r) on Ω, respectively, andy∈CM is the measured scattered wave at the sensors with

Page 3: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

3

measurement error e∈CM . For a vector a∈RN , diag(a)∈RN×N is a diagonal matrix with a on the diagonal. Thematrix H∈CM×N is the discretization of Green’s functiong(r−r′) with r∈Γ and r′∈Ω, whereas G∈CN×N is thediscretization of Green’s function with r,r′∈Ω. The inversescattering problem is then to estimate f given y, H, G, anduin. Define

A :=I−Gdiag(f). (4)

Note that (2) and (3) define a nonlinear inverse problem,because u depends on f through u=A−1uin, where A isdefined in (4). In this paper, the linear method refers tothe formulation where u=uin, and the iterative linearizationmethod replaces u in (2) with the estimate u computed from(3) using the current estimate f and assumes that u is aconstant with respect to f .

To estimate f from the nonlinear inverse problem (25)and (26), we consider the following nonconvex optimizationformulation. Define

Z(f) :=Hdiag(u)f , (5)

which is the (clean) scattered wave from the object withpermittivity contrast f . Moreover, let C⊂RN be a boundconvex set that contains all possible values that f may take.Then f is estimated by minimizing the following compositecost function with a nonconvex data-fidelity term D(f) and aconvex regularization term R(f):

f∗=arg minf∈RN

F(f) :=D(f)+R(f) , (6)

where

D(f)=1

2‖y−Z(f)‖22, (7)

R(f)=τ

N∑n=1

√√√√ 2∑d=1

|[Ddf ]n|2 +χC(f). (8)

In (8), Dd is the discrete gradient operator in the dth dimen-sion, hence the first term in R(f) is the total variation (TV)cost, and the parameter τ >0 controls the contribution of theTV cost to the total cost. The second term χC(·) is defined as

χC(f) :=

0, if f ∈C∞, if f 6∈C

.

Note that D(·) is differentiable if A is non-singular, R(·) isproper, convex, and lower semi-continuous if C is convex andclosed.

III. PROPOSED METHOD

As mentioned in Section I that for a cost function like(6), the class of proximal gradient methods, including ISTA[26]–[28] and FISTA [22], can be applied. However, ISTA isempirically slow and FISTA has only been proved to convergefor convex problems. A variant of FISTA has been proposed in[44] for nonconvex optimization with convergence guarantees.This algorithm computes two estimates from ISTA and FISTA,respectively, at each iteration, and selects the one with lowerobjective function value as the final estimate at the current it-eration. Therefore, both the gradient and the objective function

100 200 300 400 500Iteration

10-1

100

Cos

t Fun

ctio

n V

alue =0 (ISTA)

=0.88=0.96=1 (FISTA)

Fig. 3. Empirical convergence speed for relaxed FISTA with various α valuestested on experimentally measured data.

need to be evaluated at two different points at each iteration.While such extra computation may be insignificant in someapplications, it can be prohibitive in the inverse scatteringproblem, where additional evaluations of the gradient andthe objective function require the computation of the entireforward model. Another accelerated proximal gradient methodthat is proved to converge for nonconvex problems is proposedin [45], which is not directly related to FISTA for non-smoothobjective functions like (6).

A. Relaxed FISTA

We now introduce our new variant of FISTA to solve (6).Starting with some initialization f0∈RN and setting s1 = f0,t0 =1, α∈ [0,1), for k≥1, the proposed algorithm proceedsas follows:

fk=proxγR (sk−γ∇D(sk)) (9)

tk+1 =

√4t2k+1+1

2(10)

sk+1 = fk+α

(tk−1

tk+1

)(fk− fk−1), (11)

where the choice of the step-size γ to ensure convergencewill be discussed in Section III-B. Notice that the algorithm(9)-(11) is equivalent to ISTA when α=0 and is equivalentto FISTA when α=1. For this reason, we call it relaxedFISTA. Figure 3 shows that the empirical convergence speedof relaxed FISTA improves as α increases from 0 to 1. The plotwas obtained by using the experimentally measured scatteredmicrowave data collected by the Fresnel Institute [46]. Ourtheoretical analysis of relaxed FISTA in Section III-B estab-lishes convergence for any α∈ [0,1) with appropriate choiceof the step-size γ.

The two main elements of relaxed FISTA are the computa-tion of the gradient ∇D and the proximal mapping proxγR.Given ∇D(sk), the proximal mapping (9) can be efficientlysolved [47], [48]. The following proposition provides anexplicit formula for ∇D, which enables fast and memory-efficient computation of ∇D.

Proposition 1. Let Z(f) be defined in (5) and w=Z(f)−y.Then

∇D(f)=Re

diag(u)H (HHw+GHv), (12)

where u and v are obtained from the linear systems

Au=uin, and AHv=diag(f)HHw. (13)

Page 4: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

4

Proof. See Appendix VI-B1.

In the above, u and v can be efficiently solved by theconjugate gradient method. Note that the formulation forcomputing the gradient in Proposition 1 is also known as theadjoint state method [49]. In our implementation, A is anoperator rather than an explicit matrix, and the convolutionwith Green’s function in A is computed using the fast Fouriertransform (FFT) algorithm. Note that the formula for thegradient defined in (12) is for the scalar field case. The formulafor the 3D vectorial case is provided in (30) in Appendix VI-A.

B. Theoretical Analysis

We now provide the convergence analysis of relaxed FISTAapplied to our nonconvex optimization problem (6).

The following proposition shows that the data-fidelity term(7) has Lipschitz gradient on a bounded domain. Note thatLipschitz gradient of the smooth term in a composite costfunction like (6) is essential to prove the convergence ofrelaxed FISTA, which we will establish in Proposition 3.

Proposition 2. Suppose that U⊂RN is bounded. Assume that‖uin‖<∞ and the matrix A defined in (4) is non-singular forall s∈U . Then D(s) has Lipschitz gradient on U . That is,there exists an L∈(0,∞) such that

‖∇D(s1)−∇D(s2)‖≤L‖s1−s2‖, ∀s1,s2∈U . (14)

Proof. See Appendix VI-B2.

Notice that all fk obtained from (9) are within a bounded setC, and each sk+1 obtained from (11) is a linear combinationof fk and fk−1, where the weight α

(tk−1tk+1

)∈ [0,1) since

α∈ [0,1) and tk−1tk+1≤1 by (10). Hence, the set that covers all

possible values for fkk≥0 and skk≥1 is bounded. Usingthis fact, we have the following convergence guarantee forrelaxed FISTA applied to solve (6).

Proposition 3. Let U in Proposition 2 be the bounded set thatcovers all possible values for fkk≥0 and skk≥1 obtainedfrom (9) and (11), L be the corresponding Lipschitz constantdefined in (14). Choose 0<γ≤ 1−α2

2L for any fixed α∈ [0,1).Define the gradient mapping as

Gγ(f) :=f−proxγR (f−γ∇D(f))

γ, ∀f ∈RN . (15)

Then, relaxed FISTA achieves the stationary points of thecost function F defined in (6) in the sense that the gradientmapping norm satisfies

limk→∞

‖Gγ(fk)‖=0. (16)

Moreover, Let F∗ denote the global minimum of F and weassume its existence. Then for any K>0,

mink∈1,...,K

‖Gγ(fk)‖2≤ 2L(3+γL)2 (F(f0)−F∗)KγL(1−γL)

. (17)

Proof. See Appendix VI-B3.

Note that for any f ∈RN , Gγ(f)=0 implies 0∈∂F(f),where ∂F(f) is the limiting subdifferential [50] of F definedin (6) at f , hence f is a stationary point of F .

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35Contrast Value

0

5

10

15

20

SN

R [d

B]

CISORCSIILFB

Fig. 4. Comparison of different reconstruction methods for various contrastlevels tested on simulated data.

Relaxed FISTA can be used as a general nonconvex solverfor any cost function that has a smooth nonconvex termwith Lipschitz gradient and a nonsmooth convex term whoseproximal mapping can be easily computed. The convergenceanalysis of relaxed FISTA does not require the estimates to beconstrained on a bounded domain. The condition of bounded Uin the statement of Proposition 3 is to ensure that the gradientof the specific function D(f) defined in (7) is Lipschitz, asdiscussed in Proposition 2.

IV. EXPERIMENTAL RESULTS

We now compare our method CISOR with several state-of-the-art methods, including iterative linearization (IL) [29],[30], contrast sourse inversion (CSI) [31]–[33], and SEAGLE[40], as well as a conventional linear method, the first Bornapproximation (FB) [1]. All algorithms use additive total vari-ation regularization. In our implementation, CSI uses Polak-Ribiere conjugate gradient. CISOR uses the relaxed FISTAdefined in SectionIII-A with α=0.96 and fixed step-size γ,which is manually tuned. The other methods use the standardFISTA [22], also with manually tuned and fixed step-sizes.

Comparison on simulated data. The wavelength of theincident wave in this experiment is 7.49 cm. Define thecontrast of an object with permittivity contrast distribution f asmax(|f |). We consider the Shepp-Logan phantom and changeits contrast to the desired value to obtain the ground-truth ftrue.We then solve the Lippmann-Schwinger equation to generatethe scattered waves that are then used as measurements. Thecenter of the image is the origin and the physical size of theimage is set to 120 cm × 120 cm. Two linear detectors areplaced on two opposite sides of the image at a distance of95.9 cm from the origin. Each detector has 169 sensors witha spacing of 3.84 cm. The transmitters are placed on a line48.0 cm to the left of the left detector, and they are spaceduniformly in azimuth with respect to the origin within a rangeof [−60,60] at every 5. The reconstructed SNR, which isdefined as 20log10(‖ftrue‖/‖f− ftrue‖), is used as the compari-son criterion. The size the reconstructed images f is 128×128pixels. For each contrast value and each algorithm, we run thealgorithm with five different regularization parameter valuesand select the result that yields the highest reconstructed SNR.

Figure 4 shows that as the contrast increases, the recon-structed SNR of FB and IL decreases, whereas that of CSI andCISOR is more stable. While it is possible to further improvethe reconstructed SNR for CSI by running more iterations,

Page 5: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

5

λ 2.2

0

0.5

0

CISOR IL

0.5

0

FB

2.2

0

λ CSISEAGLE

Fig. 5. Reconstructed images obtained by different algorithms from 2D experimentally measured data. The first and second rows use the FoamDielExtTMand the FoamDielIntTM objects, respectively. From left to right: ground truth, reconstructed images by CISOR, SEAGLE, IL, CSI, and FB. The color-mapfor FB is different from the rest, because FB significantly underestimated the contrast value. The size of the reconstructed objects are 128×128 pixels.

1.76𝜆

0

𝑦

𝑥

𝑧

𝑥

𝑧=0 mm

𝑦=0 mm

𝑧

𝑦 𝑥=-25 mm

ILCISOR CSI

Fig. 6. Reconstructed images obtained by CISOR, IL, and CSI from 3D experimentally measured data for the TwoShperes object. From left to right: groundtruth, reconstructed image slices by CISOR, IL, and CSI, the reconstructed contrast distribution along the dashed lines showing in the image slices on the firstcolumn. From top to bottom: image slices parallel to the x-y plane with z=0 mm, parallel to the x-z plane with y=0 mm, and parallel to the y-z planewith x=−25 mm. The size of the reconstructed objects are 32×32×32 pixels for a 150×150×150 mm cube centered at (0,0,0).

CSI is known to be slow as the comparisons shown in [35].A visual comparison between FB, IL, and CISOR at severalcontrast levels has been presented in Figure 1 at the beginningof the paper.

Comparison on experimental data. We test our method inboth 2D and 3D settings using the public dataset provided bythe Fresnel Institute [46], [51]. Two objects for the 2D setting,FoamDielExtTM and FoamDielintTM, and two objects for the3D setting, TwoSpheres and TwoCubes, are considered.

In the 2D setting, the objects are placed within a 150mm × 150 mm square region centered at the origin of thecoordinate system. The number of transmitters is 8 and thenumber of receivers is 360 for all objects. The transmittersand the receivers are placed on a circle centered at the originwith radius 1.67 m and are spaced uniformly in azimuth. Onlyone transmitter is turned on at a time and only 241 receivers

are active for each transmitter. That is, the 119 receivers thatare closest to a transmitter are inactive for that transmitter.While the dataset contains multiple frequency measurements,we only use the ones corresponding to 3 GHz, hence thewavelength of the incident wave is 99.9 mm. The pixel sizeof the reconstructed images is 1.2 mm.

In the 3D setting, the transmitters are located on a spherewith radius 1.769 m. The azimuthal angle θ ranges from 20

to 340 with a step of 40, and the polar angle φ rangesfrom 30 to 150 with a step of 15. The receivers are onlyplaced on a circle with radius 1.769 m in the azimuthalplane with azimuthal angle ranging from 0 to 350 witha step of 10. Only the receivers that are more than 50

away from a transmitter are active for that transmitter. Avisual representation of this setup is shown in Figure 8 in theAppendix. We use the data corresponding to 4 GHz for the

Page 6: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

6

0

1.43𝜆

𝑦

𝑥 𝑧=33 mm

𝑧

𝑥 𝑦=-17 mm

𝑧

𝑦 𝑥=17 mm

ILCISOR CSI

Fig. 7. Reconstructed images obtained by CISOR, IL, and CSI from 3D experimentally measured data for the TwoCubes object. From left to right: groundtruth, reconstructed image slices by CISOR, IL, and CSI, the reconstructed contrast distribution along the dashed lines showing in the image slices on thefirst column. From top to bottom: image slices parallel to the x-y plane with z=33 mm, parallel to the x-z plane with y=−17 mm, and parallel to the y-zplane with x=17 mm. The size of the reconstructed objects are 32×32×32 pixels for a 100×100×100 mm cube centered at (0,0,50) mm.

TwoSpheres object and 6 GHz for the TwoCubes object, hencethe wavelength of the incident wave is 74.9 mm and 50.0 mm,respectively. The pixel size is 4.7 mm for the TwoSpheres and3.1 mm for TwoCubes.

Figure 5 provides a visual comparison of the reconstructedimages obtained by different algorithms for the 2D data. Foreach object and each algorithm, we run the algorithm withfive different regularization parameter values and select theresult that has the best visual quality. Figure 5 shows that allnonlinear methods CISOR, SEAGLE, IL, and CSI obtainedreasonable reconstruction results in terms of both the contrastvalue and the shape of the object, whereas the linear methodFB significantly underestimated the contrast value and failed tocapture the shape. These results demonstrate that the proposedmethod is competitive with several state-of-the-art methods.

Figures 6 and 7 present the results for the TwoSpheres andthe TwoCubes objects, respectively. Again, the results showthat CISOR is competitive with the state-of-the-art methodsfor the 3D vectorial setting as well.

V. CONCLUSION

In this paper, we proposed a nonconvex formulation fornonlinear diffractive imaging based on the scalar theory ofdiffraction, and further extended our formulation to the 3Dvectorial field setting. The nonconvex optimization problemwas solved by our new variant of FISTA. We provided anexplicit formula for fast computation of the gradient at eachFISTA iteration and proved that the algorithm converges forour nonconvex problem. Numerical results demonstrated thatthe proposed method is competitive with several state-of-the-art methods. The advantages of CISOR over other methodsare mainly in the following two aspects. First, CISOR is more

memory-efficient due to the explicit formula for the gradientas provided in Proposition 1. The formula allows using theconjugate gradient method to accurately compute the gradientwithout the need of storing all the iterates, which was requiredin SEAGLE [41]. Second, CISOR enjoys rigorous theoreticalconvergence analysis as established in Proposition 3, whileother methods only have reported empirical convergence per-formance.

VI. APPENDIX

A. 3D Diffractive Imaging

1) Problem Formulation: The measurement scenario forthe 3D case follows that in [51] and is illustrated in Fig-ure 8. The fundamental object-wave relationship in case ofelectromagnetic wave obeys Maxwell’s equations. For time-harmonic electromagnetic field and under the Silver-Mullerradiation condition, it can be shown that the solution toMaxwell’s equations is equivalent to that of the followingintegral equation [52]:

~E(r)= ~Ein(r)+(k2I+∇∇·)∫

Ω

g(r−r′)f(r′) ~E(r′)dr′,

(18)which holds for all r∈R3. In (18), ~E(r)∈C3 is the electricfield at spatial location r, which is the sum of the incidentfield ~Ein(r)∈C3 and the scattered field ~Esc(r)∈C3. Thescalar permittivity contrast distribution is defined as f(r)=(ε(r)−εb), where ε(r) is the permittivity of the object, εbis the permittivity of the background, and k=2π/λ is thewavenumber in vacuum. We assume that f(r) is real. The3D free-space scalar Green’s function g(r) is defined in (1).

Page 7: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

7

𝑥

𝑦

𝑧

Γ

𝑂

Ω

Tx

TxRx𝜃

𝜙

𝐸inpolarization

polarization𝐸in

polarization

𝐸sc

Fig. 8. The measurement scenario for the 3D case considered in this paper.The object is placed within a bounded image domain Ω. The transmitterantennas (Tx) are placed on a sphere and are linearly polarized. The arrowsin the figure define the polarization direction. The receiver antennas (Rx) areplaced in the sensor domain Γ within the x-y (azimuth) plane, and are linearlypolarized along the z direction.

As illustrated in Figure 8, the measurements in our problemare the scattered field measured in the sensor region Γ,

~Esc(r)=(k2I+∇∇·)∫

Ω

g(r−r′)f(r′) ~E(r′)dr′,∀r∈Γ,

(19)where the total field ~E in (19) is obtained by evaluating (18)at r∈Ω. In the experimental setup considered in this paper,Γ∩Ω=∅, hence Green’s function is non-singular within theintegral region in (19). Therefore, we can conveniently movethe gradient-divergence operator ∇∇· inside the integral. Then(19) becomes

~Esc(r)=

∫Ω

G(r−r′)f(r′) ~E(r′)dr′,∀r∈Γ, (20)

where G(r−r′)=(k2I+∇∇)g(r−r′) is the dyadic Green’sfunction in free-space, which has an explicit form:

G(r−r′)=k2

((3

k2d2− 3j

kd−1

)(r−r′

d⊗ r−r′

d

)+

(1+

j

kd− 1

k2d2

)I

)g(r−r′), (21)

where ⊗ denotes the Kronecker product, d=‖r−r′‖, and Iis the unit dyadic.

2) Discretization: To obtain a discrete system for (18)and (20), we define the image domain Ω as a cube anduniformly sample Ω on a rectangular grid with sampling stepδ in all three dimensions. Let the center of Ω be the originand let rl,s,t :=((l−J/2−0.5)δ,(s−J/2−0.5)δ,(t−J/2−0.5)δ) for r,s, t=1, . . . ,J . To simplify the notation, we use aone-to-one mapping (l,s, t) 7→n and denote the samples by rnfor n=1, . . . ,N , where N=J3. Assume that M detectors areplaced at rm for m=1, . . . ,M . Note that the detectors do nothave to be placed on a regular grid, because the dyadic Green’sfunction 21 can be evaluated at any spatial points, whereas

the rectangular grid on Ω is important in order for discretedifferentiation to be easily defined. Using these definitions, thediscrete system that corresponds to (20) and (18) with r∈Ωcan then be written as

~Esc(rm)=δ3N∑n=1

G(rm−rn)f(rn) ~E(rn), (22)

~E(rn)= ~Ein(rn)+(k2 +∇∇·) ~B(rn), (23)

where m=1, . . . ,M , n=1, . . . ,N , and

~B(rn)=δ3N∑k=1

g(rn−rk)f(rk) ~E(rk), n=1, . . . ,N.

Let us organize ~En and ~Einn for n=1, . . . ,N into column

vectors as follows:

u=

E(1)

E(2)

E(3)

, uin =

Ein,(1)

Ein,(2)

Ein,(3)

, (24)

where E(i)∈CN is a column vector whose nth coordinateis E

(i)n ; similar notation applies to Ein,(i)∈CN . Then the

discretized inverse scattering problem is defined as follows:

y=H(I3⊗diag(f)

)u+ e, (25)

u= uin +(k2I+D)(I3⊗(Gdiag(f))

)u, (26)

where Ip is a p×p identity matrix and we drop the subscriptp if the dimension is clear from the context, f ∈RN is thevectorized permittivity contrast distribution, which is assumedto be real in this work, D∈R3N×3N is the matrix representa-tion of the gradient-divergence operator ∇∇·, y∈CM is thescattered wave measurement vector with measurement noisee∈CM , and G∈CN×N is the matrix representation of theconvolution operator induced by the 3D free-space Green’sfunction (1). Note that according to the polarization of thereceiver antennas in the experimental setup we consider in thispaper, the ideal noise-free measurements ym− em are Esc,(3)

m ,for m=1, ...,M , which are the scattered wave ~Esc along thez-dimension measured at M different locations in the sensorregion Γ. Therefore, H∈CM×3N is the matrix representationof the convolution operator induced by the third row of thedyadic Green’s function (21). Define

A :=I−(k2I+D)(I3⊗(Gdiag(f))

). (27)

Similar to the scalar field case, we can see that the mea-surement vector y is nonlinear in the unknown f , becauseu depends on f according to u=A−1uin, which follows from(26).

Next, we define the discrete gradient-divergence operator∇∇·, for which the matrix representation is D in (26). For ascalar function f of the spatial location rl,s,t, denote f(rl,s,t)by fl,s,t. Following the finite difference rule,

∂2

∂x2fl,s,t :=

fl−1,s,t−2fl,s,t+fl+1,s,t

δ2,

∂2

∂x∂yfl,s,t :=

fl−1,s−1,t−fl−1,s+1,t

4δ2

+fl+1,s+1,t−fl+1,s−1,t

4δ2.

Page 8: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

8

The definitions of ∂2

∂x∂z and ∂2

∂y∂z are similar to that of ∂2

∂x∂y .

Moreover, for a vector ~Em,n,p∈C3, we use E(i)m,n,p to denote

its ith coordinate. Then we have that the first coordinate of~Em,n,p is

E(1)m,n,p=Ein,(1)

m,n,p+k2B(1)m,n,p

+∂

∂x1

(∂

∂x1B(1)m,n,p+

∂x2B(2)m,n,p+

∂x3B(3)m,n,p

)=Ein,(1)

m,n,p+k2B(1)m,n,p+

B(1)m+1,n,p−2B

(1)m,n,p+B

(1)m−1,n,p

δ2

+B

(2)m−1,n−1,p−B

(2)m−1,n+1,p−B

(2)m+1,n−1,p+B

(2)m+1,n+1,p

4δ2

+B

(3)m−1,n,p−1−B

(3)m−1,n,p+1−B

(3)m+1,n,p−1 +B

(3)m+1,n,p+1

4δ2.

The second and third coordinates of ~Em,n,p can be obtainedin a similar way.

3) CISOR for 3D: The data-fidelity term D(·) in theobjective function (6) is now defined as D(f) := 1

2‖y−Z(f)‖2,where Z(f) :=H

(I3⊗diag(f)

)u. Let w= Z(f)− y, and

g=diag(u)H(HHw+(I3⊗GH)(k2I+DH)v

), (28)

where u and v are obtained from the linear systems

Au= uin, and AHv=(I3⊗diag(f))HHw, (29)

where A is defined in (27). Then the gradient of D can bewritten as

∇D(f)=Re

3∑i=1

g(i)

(30)

with g(i) =(g(i−1)N+1, . . . ,giN )∈CN for i=1,2,3.

B. Proofs for Theoretical Results

1) Proof for Proposition 1: The gradient of D(·) definedin (7) is Re

JHZw

, where JZ is the Jocobian matrix of Z(·)5, which is defined as

JZ=

∂Z1

∂f1. . . ∂Z1

∂fN...

. . ....

∂ZM

∂f1. . . ∂ZM

∂fN

.Recall that A=(I−Gdiag(f)) and u=A−1uin, hence both Aand u are functions of f . We omit such dependencies in ournotation for brevity. Following the chain rule of differentiation,

∂Zm∂fn

=∂

∂fn

N∑i=1

Hm,ifiui

=Hm,nun+

N∑i=1

[∂ui∂fn

]Hm,ifi.

Using the definition w=Z(f)−y and summing over m=1, ...,M ,

[∇D(f)]n=

M∑m=1

[∂Zm∂fn

]wm

=un

M∑m=1

Hm,nwm+

N∑i=1

[∂ui∂fn

]fi

M∑m=1

Hm,iwm

=un[HHw

]n

+

N∑i=1

[∂ui∂fn

]fi[HHw

]i, (31)

where a denotes the complex conjugate of a∈C. Label thetwo terms in (31) as T1 and T2, then

T1 =[diag(u)HHHw

]n, (32)

and

T2(a)= (uin)H

[∂A−1

∂fn

]H

diag(f)HHw

(b)=−(uin)HA−H

[∂A

∂fn

]H

A−Hdiag(f)HHw

(c)=−uH(f)

[∂A

∂fn

]H

v(d)= [diag(u)HGHv]n. (33)

In the above, step (a) holds by plugging in ui=[A−1uin

]i.

Step (b) uses the identity

∂A−1

∂fn=−A−1 ∂A

∂fnA−1, (34)

which follows by differentiating both sides of AA−1 =I,

∂A

∂fnA−1 +A

∂A−1

∂fn=0.

From step (b) to step (c), we used the fact that u=A−1uin anddefined v :=A−Hdiag(f)HHw, which matches (13). Finally,step (d) follows by plugging in A=I−Gdiag(f). Combining(31), (32), and (33), we have obtained the expression in (12).

Note that the extension to the 3D vectorial field case(30) is straightforward. We highlight the differences from thescalar field case in the following. Let H=[H(1)|H(2)|H(3)],where H(i)∈CM×N for i=1,2,3. Following the chain ruleof differentiation, we have

∂Zm∂fn

=

3∑k=1

H(k)m,nu

(k)n +

3∑k=1

N∑i=1

[∂u

(k)i

∂fn

]H

(k)m,ifi.

Using the definition w and summing over m=1, ...,M ,

[∇D(f)

]n

=

M∑m=1

[∂Zm∂fn

]wm=

3∑k=1

u(k)n

[(H(k))Hw

]n

+

3∑k=1

N∑i=1

[∂u

(k)i

∂fn

]fi

[(H(k))Hw

]i. (35)

Page 9: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

9

Label the two terms in (35) as T1 and T2, then

T1 =

3∑k=1

[diag(u(k))H(H(k))Hw

]n, (36)

T2(a)=

3∑k=1

([∂A−1

∂fnuin](k)

)H

diag(f)(H(k))Hw

(b)=

3∑k=1

[−A−1 ∂A

∂fnu

](k)H

diag(f)(H(k))Hw

(c)=

3∑k=1

−uH

[∂A

∂fn

]Hv

(k)

(d)=

3∑k=1

[diag(u(k))HGH[(k2I+DH)v](k)

]n. (37)

In the above, step (a) follows from u(k)i =

[A−1uin

](k)

i. Step

(b) follows from (34). In step (c), we defined v :=A−H(I3⊗diag(f))HHw, which matches (29). Finally, step (d) followsby plugging in the definition of A in (27). Combining (35),(36), and (37), we have obtained the expression in (30).

2) Proof for Proposition 2: For a vector a that is a functionof s, ai denotes the value of a evaluated at si. Using thisnotation, we have

‖∇D(s1)−∇D(s2)‖≤‖diag(u1)HHHw1−diag(u2)HHHw2‖+‖diag(u1)HGHv1−diag(u2)HGHv2‖.

Label the two terms on the RHS as T1 and T2. We will proveT1≤L1‖s1−s2‖ for some L1∈(0,∞), and T2≤L2‖s1−s2‖can be proved in a similar way for some L2∈(0,∞). Theresult (14) is then obtained by letting L=L1 +L2.

T1≤‖diag(u1)HHHw1−diag(u2)HHHw1‖+‖diag(u2)HHHw1−diag(u2)HHHw2‖≤‖u1−u2‖‖H‖op‖w1‖+‖A−1

2 ‖‖uin‖‖H‖op‖w1−w2‖,

where ‖·‖op denotes the operator norm and the last inequalityuses the fact that ‖diag(d)‖op =maxn∈[N ] |dn|≤‖d‖. We nowbound ‖u1−u2‖ and ‖w1−w2‖.

‖u1−u2‖=‖A−11 uin−A−1

2 uin‖≤‖A−11 −A−1

2 ‖op‖uin‖=‖A−1

1 (A2−A1)A−12 ‖op‖uin‖

≤‖A−11 ‖op‖G‖op‖s1−s2‖‖A−1

2 ‖op‖uin‖,‖w1−w2‖≤‖Hdiag(s1)u1−Hdiag(s1)u2‖

+‖Hdiag(s1)u2−Hdiag(s2)u2‖≤‖H‖op‖s1‖‖u1−u2‖+‖H‖op‖s1−s2‖‖A−1

2 ‖op‖uin‖.

Then the result T1≤L1‖s1−s2‖ follows by noticingthat ‖s1‖, ‖uin‖, ‖G‖op, ‖H‖op, and ‖A−1

i ‖op fori=1,2 are bounded, and the fact that ‖w1‖≤‖y‖+‖H‖op‖s1‖‖A−1

1 ‖op‖uin‖<∞.The Lipschitz property of the gradient (30) in the 3D case

can be proved in a similar way.

3) Proof for Proposition 3: First, we show that the Lips-chitz gradient condition (14) implies that for all x,y∈U ,

D(y)−D(x)−〈∇D(x),y−x〉≥−L2‖x−y‖2. (38)

Define a function h :R→R as h(λ) :=D(x+λ(y−x)),then h′(λ)=〈∇D(x+λ(y−x)),y−x〉. Notice that h(1)=D(y) and h(0)=D(x). Using the equality h(1)=h(0)+∫ 1

0h′(λ)dλ, we have that

D(y)=D(x)+

∫ 1

0

〈∇D(x+λ(y−x)),y−x〉dλ

=D(x)+

∫ 1

0

〈∇D(x),y−x〉dλ

+

∫ 1

0

〈∇D(x+λ(y−x))−∇D(x),y−x〉dλ

(a)

≥ D(x)−∫ 1

0

λL‖y−x‖2dλ+〈∇D(x),y−x〉

=D(x)− L2‖y−x‖2 +〈∇D(x),y−x〉,

where step (a) uses Cauchy-Schwarz inequality and the Lip-schitz gradient condition (14).

Next, by (9), we have that for all x∈U , t≥0,

R(x)≥R(ft)+〈st− ftγ−∇D(st),x− ft〉. (39)

Let y= fk, x= fk+1 in (38) and x= fk, t=k+1 in (39).Adding (38) and (39), we have

F(fk+1)−F(fk)≤〈∇D(fk+1)−∇D(sk+1), fk+1− fk〉

+1

γ〈sk+1− fk+1, fk+1− fk〉+

L

2‖fk+1− fk‖2

(a)

≤ L

2‖sk+1− fk+1‖2 +

L

2‖fk+1− fk‖2 +

1

2γ‖sk+1− fk‖2

− 1

2γ‖sk+1− fk+1‖2−

1

2γ‖fk+1− fk‖2 +

L

2‖fk+1− fk‖2

(b)

≤(

1

2γ−L)‖fk− fk−1‖2−

(1

2γ−L)‖fk+1− fk‖2

−(

1

2γ− L

2

)‖sk+1− fk+1‖2.

In the above, step (a) uses Cauchy-Schwarz, Proposition 2,as well as the fact that 2ab≤a2 +b2 and 2〈a−b,b−c〉=‖a−c‖2−‖a−b‖2−‖b−c‖2. Step (b) uses the conditionin the proposition statement that γ≤ 1−α2

2L and (11), whichimplies ‖sk+1− fk‖≤α tk−1

tk+1‖fk− fk−1‖, where we notice that

tk−1tk+1≤1 by (10), and α<1 by our assumption. Summing both

sides from k=0 to K:(1

2γ− L

2

)K−1∑k=0

‖sk+1− fk+1‖2≤F(f0)−F(fK)

+

(1

2γ−L)(‖f0− f−1‖2−‖fK− fK−1‖2

)≤F(f0)−F∗,

where F∗ is the global minimum. The last step follows byletting f−1 = f0, which satisfies (11) for the initialization s1 =f0, and the fact that F∗≤F(fK).

Page 10: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

10

Recall the definition of the gradient mapping Gγ in (15), wehave Gγ(sk)= sk−fk

γ . Therefore,

K∑k=1

‖Gγ(sk)‖2≤ 2L(F(f0)−F∗)γL(1−γL)

, (40)

which implies that limK→∞∑Kk=1 ‖Gγ(sk)‖2<∞, hence

limk→∞

‖Gγ(sk)‖=0. (41)

Note that (41) establishes that the gradient mapping actingon the sequence skk≥0 generated from relaxed FISTAconverges to 0. In the following, we show that the gra-dient mapping acting on fkk≥0 converges to 0 as well,which is the desired result stated in (16). By (9) and (15),we have Gγ(sk)= 1

γ (sk− fk). Therefore, (41) implies thatlimk→∞ ‖sk− fk‖=0. Moreover,

‖Gγ(fk)−Gγ(sk)‖(a)

≤ 1

γ‖sk− fk‖

+1

γ‖proxγR(fk−γ∇D(fk))−proxγR(sk−γ∇D(sk))‖

(b)

≤ 1

γ‖sk− fk‖+

1

γ‖fk−γ∇D(fk)−(sk−γ∇D(sk))‖

(c)

≤ 1

γ‖sk− fk‖+

1

γ‖sk− fk‖+L‖sk− fk‖, (42)

where step (a) follows by (15) and the triangle inequality,step (b) follows by the well-known property that the proximaloperators for convex functions are Lipschitz-1, and step (c)follows by the triangle inequality and the result that ∇Dis Lipschitz-L as we established in Proposition 2. Takingthe limit as k→∞, we have limk→∞ ‖Gγ(fk)−Gγ(sk)‖=0,hence, limk→∞Gγ(fk)=limk→∞Gγ(sk)=0.

Moreover, by (42),

‖Gγ(fk)‖−‖Gγ(sk)‖≤‖Gγ(fk)−Gγ(sk)‖≤(2+γL)‖Gγ(sk)‖,

hence, ‖Gγ(fk)‖2≤(3+γL)2‖Gγ(sk)‖2 for all k≥0. Then by(40),

K∑k=1

‖Gγ(fk)‖2≤(3+γL)2 2L(F(f0)−F∗)γL(1−γL)

,

which implies

K mink∈1,...,K

‖Gγ(fk)‖2≤(3+γL)2 2L(F(f0)−F∗)γL(1−γL)

,

hence

mink∈1,...,K

‖Gγ(fk)‖2≤(3+γL)2 2L(F(f0)−F∗)KγL(1−γL)

.

REFERENCES

[1] M. Born and E. Wolf, Principles of Optics, 7th ed. Cambridge Univ.Press, 2003, ch. Scattering from inhomogeneous media, pp. 695–734.

[2] A. J. Devaney, “Inverse-scattering theory within the Rytov approxima-tion,” Opt. Lett., vol. 6, no. 8, pp. 374–376, August 1981.

[3] M. M. Bronstein, A. M. Bronstein, M. Zibulevsky, and H. Azhari,“Reconstruction in diffraction ultrasound tomography using nonuniformFFT,” IEEE Trans. Med. Imag., vol. 21, no. 11, pp. 1395–1401,November 2002.

[4] J. W. Lim, K. R. Lee, K. H. Jin, S. Shin, S. E. Lee, Y. K. Park, andJ. C. Ye, “Comparative study of iterative reconstruction algorithms formissing cone problems in optical diffraction tomography,” Opt. Express,vol. 23, no. 13, pp. 16 933–16 948, June 2015.

[5] Y. Sung and R. R. Dasari, “Deterministic regularization of three-dimensional optical diffraction tomography,” J. Opt. Soc. Am. A, vol. 28,no. 8, pp. 1554–1561, August 2011.

[6] V. Lauer, “New approach to optical diffraction tomography yieldinga vector equation of diffraction tomography and a novel tomographicmicroscope,” J. Microsc., vol. 205, no. 2, pp. 165–176, 2002.

[7] Y. Sung, W. Choi, C. Fang-Yen, K. Badizadegan, R. R. Dasari, andM. S. Feld, “Optical diffraction tomography for high resolution live cellimaging,” Opt. Express, vol. 17, no. 1, pp. 266–277, December 2009.

[8] T. Kim, R. Zhou, M. Mir, S. Babacan, P. Carney, L. Goddard, andG. Popescu, “Supplementary information: White-light diffraction tomog-raphy of unlabelled live cells,” Nat. Photonics, vol. 8, pp. 256–263,March 2014.

[9] J. Sharpe, U. Ahlgren, P. Perry, B. Hill, A. Ross, J. Hecksher-Sørensen,R. Baldock, and D. Davidson, “Optical projection tomography as a toolfor 3D microscopy and gene expression studies,” Science, vol. 296, no.5567, pp. 541–545, April 2002.

[10] W. Choi, C. Fang-Yen, K. Badizadegan, S. Oh, N. Lue, R. R. Dasari, andM. S. Feld, “Tomographic phase microscopy: Supplementary material,”Nat. Methods, vol. 4, no. 9, pp. 1–18, September 2007.

[11] T. S. Ralston, D. L. Marks, P. S. Carney, and S. A. Boppart, “Inversescattering for optical coherence tomography,” J. Opt. Soc. Am. A, vol. 23,no. 5, pp. 1027–1037, May 2006.

[12] B. J. Davis, S. C. Schlachter, D. L. Marks, T. S. Ralston, S. A. Boppart,and P. S. Carney, “Nonparaxial vector-field modeling of optical coher-ence tomography and interferometric synthetic aperture microscopy,” J.Opt. Soc. Am. A, vol. 24, no. 9, pp. 2527–2542, September 2007.

[13] D. J. Brady, K. Choi, D. L. Marks, R. Horisaki, and S. Lim, “Com-pressive holography,” Opt. Express, vol. 17, no. 15, pp. 13 040–13 049,2009.

[14] L. Tian, N. Loomis, J. A. Dominguez-Caballero, and G. Barbastathis,“Quantitative measurement of size and three-dimensional position offast-moving bubbles in airwater mixture flows using digital holography,”Appl. Opt., vol. 49, no. 9, pp. 1549–1554, March 2010.

[15] W. Chen, L. Tian, S. Rehman, Z. Zhang, H. P. Lee, and G. Barbastathis,“Empirical concentration bounds for compressive holographic bubbleimaging based on a Mie scattering model,” Opt. E, vol. 23, no. 4, p.February, 2015.

[16] H. M. Jol, Ed., Ground Penetrating Radar: Theory and Applications.Amsterdam: Elsevier, 2009.

[17] M. Leigsnering, F. Ahmad, M. Amin, and A. Zoubir, “Multipathexploitation in through-the-wall radar imaging using sparse reconstruc-tion,” IEEE Trans. Aerosp. Electron. Syst., vol. 50, no. 2, pp. 920–939,April 2014.

[18] D. Liu, U. S. Kamilov, and P. T. Boufounos, “Compressive tomographicradar imaging with total variation regularization,” in Proc. IEEE 4thInternational Workshop on Compressed Sensing Theory and its Appli-cations to Radar, Sonar, and Remote Sensing (CoSeRa 2016), Aachen,Germany, September 19-22, 2016, pp. 120–123.

[19] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Univ.Press, 2004.

[20] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed. Springer,2006.

[21] J. M. Bioucas-Dias and M. A. T. Figueiredo, “A new TwIST: Two-stepiterative shrinkage/thresholding algorithms for image restoration,” IEEETrans. Image Process., vol. 16, no. 12, pp. 2992–3004, December 2007.

[22] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholdingalgorithm for linear inverse problems,” SIAM J. Imaging Sciences, vol. 2,no. 1, pp. 183–202, 2009.

[23] B. Chen and J. J. Stamnes, “Validity of diffraction tomography basedon the first born and the first rytov approximations,” Appl. Opt., vol. 37,no. 14, pp. 2996–3006, May 1998.

[24] V. Ntziachristos, “Going deeper than microscopy: the optical imagingfrontier in biology,” Nat. Methods, vol. 7, no. 8, pp. 603–614, August2010.

[25] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation basednoise removal algorithms,” Physica D, vol. 60, no. 1–4, pp. 259–268,November 1992.

[26] M. A. T. Figueiredo and R. D. Nowak, “An EM algorithm for wavelet-based image restoration,” IEEE Trans. Image Process., vol. 12, no. 8,pp. 906–916, August 2003.

Page 11: Accelerated Image Reconstruction for Nonlinear Diffractive Imaging - arXiv.org … · 2017-12-15 · Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

11

[27] I. Daubechies, M. Defrise, and C. D. Mol, “An iterative thresholding al-gorithm for linear inverse problems with a sparsity constraint,” Commun.Pure Appl. Math., vol. 57, no. 11, pp. 1413–1457, November 2004.

[28] J. Bect, L. Blanc-Feraud, G. Aubert, and A. Chambolle, “A `1-unifiedvariational framework for image restoration,” in Proc. ECCV, Springer,Ed., vol. 3024, New York, 2004, pp. 1–13.

[29] K. Belkebir, P. C. Chaumet, and A. Sentenac, “Superresolution in totalinternal reflection tomography,” J. Opt. Soc. Am. A, vol. 22, no. 9, pp.1889–1897, September 2005.

[30] P. C. Chaumet and K. Belkebir, “Three-dimensional reconstruction fromreal data using a conjugate gradient-coupled dipole method,” Inv. Probl.,vol. 25, no. 2, p. 024003, 2009.

[31] P. M. van den Berg and R. E. Kleinman, “A contrast source inversionmethod,” Inv. Probl., vol. 13, no. 6, pp. 1607–1620, December 1997.

[32] A. Abubakar, P. M. van den Berg, and T. M. Habashy, “Application ofthe multiplicative regularized contrast source inversion method tm- andte-polarized experimental fresnel data,” Inv. Probl., vol. 21, no. 6, pp.S5–S14, 2005.

[33] M. T. Bevacquad, L. Crocco, L. Di Donato, and T. Isernia, “Non-linearinverse scattering via sparsity regularized contrast source inversion,”IEEE Trans. Comp. Imag., 2017.

[34] K. Belkebir and A. Sentenac, “High-resolution optical diffraction mi-croscopy,” J. Opt. Soc. Am. A, vol. 20, no. 7, pp. 1223–1229, July 2003.

[35] E. Mudry, P. C. Chaumet, K. Belkebir, and A. Sentenac, “Electromag-netic wave imaging of three-dimensional targets using a hybrid iterativeinversion method,” Inv. Probl., vol. 28, no. 6, p. 065007, April 2012.

[36] T. Zhang, C. Godavarthi, P. C. Chaumet, G. Maire, H. Giovannini,A. Talneau, M. Allain, K. Belkebir, and A. Sentenac, “Far-field diffrac-tion microscopy at λ/10 resolution,” Optica, vol. 3, no. 6, pp. 609–612,June 2016.

[37] U. S. Kamilov, I. N. Papadopoulos, M. H. Shoreh, A. Goy, C. Vonesch,M. Unser, and D. Psaltis, “Learning approach to optical tomography,”Optica, vol. 2, no. 6, pp. 517–522, June 2015.

[38] ——, “Optical tomographic image reconstruction based on beam prop-agation and sparse regularization,” IEEE Trans. Comp. Imag., vol. 2,no. 1, pp. 59–70,, March 2016.

[39] U. S. Kamilov, D. Liu, H. Mansour, and P. T. Boufounos, “A recursiveBorn approach to nonlinear inverse scattering,” IEEE Signal Process.Lett., vol. 23, no. 8, pp. 1052–1056, August 2016.

[40] H.-Y. Liu, U. S. Kamilov, D. Liu, H. Mansour, and P. T. Boufounos,“Compressive imaging with iterative forward models,” in Proc. IEEEInt. Conf. Acoustics, Speech and Signal Process. (ICASSP 2017), NewOrleans, LA, USA, March 5-9, 2017, pp. 6025–6029.

[41] H.-Y. Liu, D. Liu, H. Mansour, P. T. Boufounos, L. Waller, andU. S. Kamilov, “SEAGLE: Sparsity-driven image reconstruction undermultiple scattering,” May 2017, arXiv:1705.04281 [cs.CV].

[42] E. Soubies, T.-A. Pham, and M. Unser, “Efficient inversion of multiple-scattering model for optical diffraction tomography,” Opt. Express,vol. 25, no. 18, pp. 21 786–21 800, Sep 2017.

[43] J. W. Goodman, Introduction to Fourier Optics, 2nd ed. McGraw-Hill,1996.

[44] H. Li and Z. Lin, “Accelerated proximal gradient methods for nonconvexprogramming,” in Proc. Advances in Neural Information ProcessingSystems 28, Montreal, Canada, December 7-12 2015.

[45] S. Ghadimi and G. Lan, “Accelerated gradient methods for nonconvexnonlinear and stochastic programming,” Math. Program. Ser. A, vol. 156,no. 1, pp. 59–99, March 2016.

[46] J.-M. Geffrin, P. Sabouroux, and C. Eyraud, “Free space experimentalscattering database continuation: experimental set-up and measurementprecision,” Inv. Probl., vol. 21, no. 6, pp. S117–S130, 2005.

[47] A. Beck and M. Teboulle, “Fast gradient-based algorithm for constrainedtotal variation image denoising and deblurring problems,” IEEE Trans.Image Process., vol. 18, no. 11, pp. 2419–2434, November 2009.

[48] U. S. Kamilov, “A parallel proximal algorithm for anisotropic totalvariation minimization,” IEEE Trans. Image Process., vol. 26, no. 2,pp. 539–548, February 2017.

[49] R.-E. Plessix, “A review of the adjoint-state method for computing thegradient of a functional with geophysical applications,” GeophysicalJournal International, vol. 167, no. 2, pp. 495–503, 2006.

[50] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis. SpringerScience & Business Media, 2009.

[51] J.-M. Geffrin and P. Sabouroux, “Continuing with the Fresnel database:experimental setup and improvements in 3D scattering measurements,”Inv. Probl., vol. 25, no. 2, p. 024001, 2009.

[52] D. Colton and R. Kress, Inverse Acoustic and Electromagnetic ScatteringTheory. Springer Science & Business Media, 1992, vol. 93.