

Uncertainty Quantification in Inverse Scattering Problems with Bayesian Convolutional Neural Networks

Zhun Wei and Xudong Chen

Abstract—Recently, tremendous progress has been made in applying deep learning schemes (DLSs) to solve inverse scattering problems (ISPs), where state-of-the-art performance has been achieved. However, little attention has been paid to the uncertainty quantification of these deep-learning methods in solving ISPs. In other words, the error of the prediction is not known since the ground truth is not available in practice. In this paper, a Bayesian convolutional neural network (BCNN) is used to quantify the uncertainties in solving ISPs. With Monte Carlo dropout, the proposed BCNN is able to directly predict the pixel-based uncertainties of the widely used DLSs in ISPs. To quantitatively evaluate the performance of the uncertainty predictions, both the correlation coefficient and the nonlinear correlation distribution between the predicted uncertainty and the true absolute error are calculated. Tests on both synthetic and experimental data show that the predictive uncertainty is highly correlated with the true absolute error calculated from the ground truth. Besides, when DLSs fail to solve an ISP, the predicted uncertainty increases significantly, which offers a pixel-based “confidence level” in solving ISPs.

Index Terms—Inverse scattering problems, Uncertainty quantification, Bayesian convolutional neural network, Deep learning schemes.

I. INTRODUCTION

ELECTROMAGNETIC inverse scattering problems (ISPs) aim to reconstruct the information of an unknown scatterer, such as its position, shape, and electrical properties, by probing the unknown target with electromagnetic energy and collecting the response of the target to the electromagnetic excitation [1]. Solving ISPs makes it possible to obtain information about an unknown object even when the targets are far from the sensors or barriers lie between the targets and the sensors. Consequently, ISPs are widely seen in through-wall imaging [2], [3], geophysics [4], nondestructive evaluation [5], remote sensing [6], and so on. Nevertheless, due to the ill-posed and nonlinear properties of ISPs, pixel-based reconstruction in ISPs is challenging. To solve ISPs, several non-iterative inversion methods have been proposed, such as the Rytov approximation inversion method [7], the back-propagation method [8], and the extended Born approximation method [9]. One advantage of the non-iterative methods is that they are able to solve ISPs fast, but they suffer

Corresponding author: Zhun Wei.
Z. Wei is with the Key Laboratory of Advanced Micro/Nano Electronic Devices and Smart Systems of Zhejiang, College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China; X. Chen is with the Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583. (Email: Z. Wei: [email protected]; X. Chen: [email protected].)

from low image quality of the reconstructions, especially for strong scatterers. Besides non-iterative methods, some iterative inversion methods have also been proposed, such as the distorted Born iterative method [10], the contrast source inversion (CSI) method [11], the contrast source extended Born method [12], and the subspace optimization method (SOM) [13]–[15]. In addition to these objective-function approaches, learning approaches have also been proposed to solve ISPs. For example, artificial neural networks (ANNs) [16], [17] and support vector machines [18] have been used in ISPs, where only a few parameters are extracted from the scattered electromagnetic field.

Due to the rapid development of artificial intelligence, several DLSs have been proposed for ISPs recently. Generally speaking, DLSs can be classified into the following categories. The first category is the direct learning approach, where the measured scattered field and the unknown parameters (permittivity contrast) are directly treated as the input and output of a DNN, respectively. The “black box” strategy in the direct learning approach enables the trained DNN to reconstruct only some simple information [19], such as the basic location and shape of the scatterers. The second category is the learning-assisted objective-function approach, where ISPs are solved in the framework of the traditional objective-function approach but a DNN is employed to learn some components of iterative solvers [20]–[23]. The third category is the physics-assisted learning approach, where domain knowledge is incorporated either in the inputs or in the internal structures of the DNN [19], [24]–[28]. It is also shown by comparisons in [19] that the physics-assisted learning approach has advantages over the direct learning approach, and the former has also been applied to biomedical imaging [29], [30]. Although DNNs have been widely used in ISPs [31], the reconstructed results are highly dependent on training data and network structures. All of the above methods have assumed that the reconstructed results from the network are trustworthy, but this is not always the case. In this paper, a Bayesian convolutional neural network (BCNN) is used to quantify the uncertainties in solving ISPs, where pixel-based uncertainties are obtained by replacing the deterministic weights in traditional DLS structures with distributions over these weights. The main contributions of this paper are summarized as follows:

• The proposed BCNN is able to directly output pixel-based uncertainties of nonlinear reconstructions without any additional calculations, which offers us a “confidence level” of the solutions when solving ISPs.

• Different from the previous work [32] that solves ISPs under sparse and weak scatterer conditions, the proposed method in our work does not have such constraints on the scatterers.

Fig. 1. The proposed framework of BCNN for uncertainty quantification in ISPs. Tx1, Tx2, ..., and Rx1, Rx2, ... represent the transmitting and receiving antennas, respectively. ξ^a is the approximated contrast obtained by back-propagation (BP) and dominant current (DC) in BPS and DCS, respectively. For an arbitrary pixel j, BCNN is able to reconstruct the contrast of the scatterers at the jth pixel y_j in the DOI (D) and, at the same time, quantify the uncertainty at the jth pixel σ^t_j during the reconstruction.

• Different from the theoretical works in computer vision [33], [34] and the application work in phase imaging [35], the proposed framework is designed for ISPs and is verified under DLS-based ISP solvers, including the back-propagation scheme (BPS) and the dominant current scheme (DCS) [19], with both synthetic and experimental data.

• The performance of the predicted uncertainties is quantitatively evaluated with both the correlation coefficient and the nonlinear correlation distribution. It is also shown that when the DLSs fail in reconstructions, the predicted uncertainty increases significantly.

• The previous work [32] is a model-based method under the framework of the first-order Born approximation. On the contrary, our method is devoted to quantifying uncertainty when using DLSs to solve ISPs. Thus, the outlined BCNN can be applied to other DLS-based ISP solvers, such as the ones presented in [24]–[28], [36], to quantify the uncertainties of their reconstructed results.

Notations throughout the paper are as follows: A and A are used to denote the matrix and vector forms of the discretized parameter A, respectively. We use superscripts H, T, and ∗ to denote conjugate transpose, transpose, and complex conjugate, respectively. In addition, |·| and ||·||_F are used to denote taking the element-wise absolute value and the Frobenius norm of a matrix, respectively.

II. METHOD

As shown in Fig. 1, a two-dimensional transverse-magnetic case is considered in this paper. A domain of interest (DOI) D is illuminated by N_i line sources located at r^i_p with p = 1, 2, ..., N_i, where the electromagnetic waves interact with the scatterers in the DOI. The scattered field is then collected by N_r antennas at r^s_q with q = 1, 2, ..., N_r on the surface S.

A. Formulation of the problem

In ISPs, the forward model can be formulated by the following two discretized equations based on the Lippmann–Schwinger equation:

$$E^t = E^i + G_D \cdot \xi \cdot E^t, \quad (1)$$

$$E^s = G_S \cdot J, \quad (2)$$

where E^t, E^i, and E^s denote the discretized forms of the total electric field, incident field, and scattered field, respectively. G_D and G_S are the free-space Green's functions at the domain D and the boundary surface S, respectively. ξ is the contrast in the domain D, and the contrast current J is defined as J = ξ · E^t [1], [36]. When solving ISPs with the above equations, the contrast ξ is the unknown parameter to be reconstructed based on the measured E^s, whereas E^i, G_D, and G_S are known

parameters that are independent of the contrast ξ.

In typical learning approaches, such as BPS and DCS [19], a training dataset containing M_t pairs of the ground-truth contrasts and their corresponding measured scattered fields is usually generated and used for the training purpose. Specifically, the learning algorithm is expressed as

$$R_{al} = \arg\min_{R_W, W} \sum_{m=1}^{M_t} f(R_W(\xi^a_m), \xi_m) + g(W), \quad (3)$$

with R_W and W denoting the DNN architecture and the weights to be learned, respectively. f and g(W) are the measure of mismatch and the regularizer to avoid overfitting, respectively. ξ^a_m is the approximated contrast obtained by back-propagation (BP) and dominant current (DC) in BPS and DCS, respectively, which converts the inputs of the DNN from the measurement domain into the contrast domain and consequently makes the learning process easier. ξ_m is the ground-truth contrast. Contrast parameters, instead of electrical properties such as permittivities, are used as inputs and outputs because the contrast subtracts the background relative permittivity from the relative permittivity, so that the network focuses on learning the interesting part and avoids learning known background values.
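As an illustration, the discretized forward model in (1) and (2) can be sketched numerically. The sketch below is a minimal NumPy example under stated assumptions: the Green's function matrices `G_D` and `G_S`, the incident field, and the grid size are small random placeholders, not the actual operators of this paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 64   # number of pixels in the DOI (placeholder)
Nr = 32  # number of receivers (matching the paper's measurement setup)

# Placeholder operators: in practice, G_D (M x M) and G_S (Nr x M) are the
# discretized free-space Green's functions for the DOI and the surface S.
G_D = 0.01 * (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))
G_S = 0.01 * (rng.standard_normal((Nr, M)) + 1j * rng.standard_normal((Nr, M)))

E_i = np.exp(1j * rng.uniform(0, 2 * np.pi, M))  # incident field, one source
xi = rng.uniform(0.5, 1.0, M)                    # contrast in the DOI

# Eq. (1): E^t = E^i + G_D * diag(xi) * E^t  <=>  (I - G_D diag(xi)) E^t = E^i
E_t = np.linalg.solve(np.eye(M) - G_D * xi[None, :], E_i)

# Contrast current J = xi * E^t, then Eq. (2): E^s = G_S * J
J = xi * E_t
E_s = G_S @ J

# Consistency check: the computed total field satisfies Eq. (1)
assert np.allclose(E_t, E_i + G_D @ (xi * E_t))
```

Given a measured E^s, the inverse problem is to recover ξ; the nonlinearity is visible here in that E^t itself depends on ξ through the matrix solve.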

B. BCNN for ISPs

As shown in Fig. 1, different from a traditional convolutional neural network, BCNN replaces the deterministic weights in the network by distributions over these parameters [33]. Following the BPS and DCS, BCNN still chooses the approximated contrasts obtained from BP/DC and the ground truth as the inputs and labels, respectively.

Fig. 2. Example One (in-range tests for BPS): Reconstructed relative permittivity profiles from scattered fields with 5% noise. The true absolute error is calculated from the ground truth and the output following (14), and the predicted uncertainty is output from BCNN. The predicted uncertainty captures most of the features presented in the true absolute error without knowledge of the ground truth.

TABLE I
AVERAGE RELATIVE ERROR (Re) AND CORRELATION COEFFICIENT (CC) OF 100 TESTS FOR EACH NUMERICAL TEST (INCLUDING IN-RANGE TESTS, 15% NOISE TESTS, 30% NOISE TESTS, CYLINDER DATA, AND FAILURE CASES) AND EXPERIMENTAL TESTS.

          In-Range Tests  15% Noise  30% Noise  Cylinder data  Highly nonlinear data  Experimental Test
BPS (Re)  0.096           0.098      0.11       0.1            0.467                  0.14
DCS (Re)  0.093           0.094      0.098      0.076          0.544                  0.13
BPS (CC)  0.78            0.77       0.76       0.74           0.6                    0.81
DCS (CC)  0.83            0.83       0.83       0.81           0.31                   0.83

We denote a total number of N_s training data pairs as (X, Y) = {x_n, y_n}_{n=1}^{N_s}, with X and Y representing the inputs and labels in BCNN, respectively. Given a specific test input x, the predictive distribution p(y|x, X, Y) can be modeled by averaging over all possible weights (also known as marginalization):

$$p(y|x, X, Y) = \int p(y|x, W)\, p(W|X, Y)\, dW. \quad (4)$$

Here, p(y|x, W) describes the predictive distribution given the input x and weights W, which can be understood as data uncertainty (or aleatoric uncertainty). The posterior distribution p(W|X, Y) captures the set of possible weights given the training data pairs (X, Y), which can be understood as model uncertainty (or epistemic uncertainty) [35].

To model the data uncertainty in ISPs with BCNN, we define a multivariate Gaussian distributed likelihood function as

$$p(y|x, W) = \prod_{j=1}^{M} p(y_j|x, W) \quad (5)$$

with

$$p(y_j|x, W) = \frac{1}{\sigma_j \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y_j - \bar{y}_j}{\sigma_j}\right)^2}. \quad (6)$$

In the training stage, for each input x, BCNN has its corresponding label y, with y_j denoting its jth pixel value. The mean ȳ_j and standard deviation σ_j of the Gaussian distribution are the first and second channels of the output layer in BCNN at the jth pixel, respectively. M is the total number of pixels in y.

Gaussian distributed data uncertainties are common in ISPs. In practice, other likelihood functions, such as a Laplacian distributed likelihood function, can also be used in accordance with practical situations.

Fig. 3. Example two (in-range tests for DCS): Reconstructed relative permittivity profiles from scattered fields with 5% noise. The true absolute error is calculated from the ground truth and the output following (14), and the predicted uncertainty is output from BCNN. The predicted uncertainty captures most of the features presented in the true absolute error without knowledge of the ground truth.

By taking the negative logarithm of (5), the negative log-likelihood, i.e., the loss function L(W), is obtained [33]:

$$L(W) = \frac{1}{N_s M} \sum_{n=1}^{N_s} \sum_{j=1}^{M} \left[ \frac{1}{2} \sigma_{j,n}^{-2} |y_{j,n} - \bar{y}_{j,n}|^2 + \frac{1}{2} \log \sigma_{j,n}^2 \right]. \quad (7)$$

Here, a total number of N_s data pairs in {X, Y} is considered in the training stage. In ISPs, y_{j,n} is the label, i.e., the ground-truth contrast, when training the BCNN. It is important to note that ȳ_{j,n} and σ_{j,n} are the first and second channels in the output layer, respectively, where both of them depend on W and update with the change of W during the process of minimizing (7). The two terms in (7) are directly calculated from the output of the network for each trial W. In other words, there is no need to know the ground-truth standard deviation σ_{j,n} in (7).

In (7), the loss function consists of a reconstruction error term ½ σ_{j,n}^{-2} |y_{j,n} − ȳ_{j,n}|² and an uncertainty term ½ log σ_{j,n}², where the latter serves as a regularization term to balance the residue and the uncertainty. More specifically, without the regularization term, the loss could be made arbitrarily small simply by predicting a large uncertainty everywhere. In practice, to avoid a potential zero in the denominator of the loss function, the variance in the loss function is usually replaced by the log variance with

$$v_{j,n} = \log \sigma_{j,n}^2. \quad (8)$$

The final loss function is therefore defined as

$$L_f(W) = \frac{1}{N_s M} \sum_{n=1}^{N_s} \sum_{j=1}^{M} \left[ \frac{1}{2} e^{-v_{j,n}} |y_{j,n} - \bar{y}_{j,n}|^2 + \frac{1}{2} v_{j,n} \right]. \quad (9)$$
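A minimal sketch of the loss in (9), with NumPy arrays standing in for the network's two output channels (predicted mean `y_pred` and log variance `v`); the toy check at the end is an illustration we add here, not part of the paper's procedure:

```python
import numpy as np

def bcnn_loss(y_true, y_pred, v):
    """Negative log-likelihood loss of Eq. (9).

    y_true: ground-truth contrast labels, shape (Ns, M)
    y_pred: first output channel (predicted mean), shape (Ns, M)
    v:      second output channel (log variance, Eq. (8)), shape (Ns, M)
    """
    residual = 0.5 * np.exp(-v) * np.abs(y_true - y_pred) ** 2
    regularizer = 0.5 * v  # penalizes inflating the predicted uncertainty
    return np.mean(residual + regularizer)

# Toy check: for a fixed residual, the loss is minimized when exp(v) equals
# the squared error, i.e., the predicted variance matches the error level.
y_true = np.zeros((1, 1))
y_pred = np.full((1, 1), 0.5)  # squared error is 0.25
vs = np.linspace(-6, 2, 2001)
losses = [bcnn_loss(y_true, y_pred, np.full((1, 1), v)) for v in vs]
v_star = vs[int(np.argmin(losses))]  # close to log(0.25)
```

This makes the balancing role of the ½ v term concrete: too small a v blows up the residual term, too large a v pays the regularization penalty.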

To evaluate the model uncertainty, it is important to evaluate the posterior p(W|X, Y) given our observation pairs {X, Y} in (4). However, p(W|X, Y) is not known analytically; thus, it is hard to perform exact inference in BCNN, and an approximation has to be made instead.

In this paper, we cast dropout, i.e., dropping individual neural network nodes out of the net with a specific probability (the dropout rate) so that a reduced network is left, as approximate inference in Bayesian neural networks. Specifically, dropout variational inference (DVI) [34] is used to approximate the posterior p(W|X, Y) by a simple distribution q(W), where dropout is performed on the convolution layers in both the training and testing stages (also referred to as Monte Carlo dropout). The detailed expression of q(W), which can be found in [34], is omitted in this work since one can extend our method to other deep-learning-based ISP solvers without knowledge of it. It has been shown that approximate inference in a Bayesian neural network is related to dropout through a Bernoulli approximating variational distribution, where dropout and the Bayesian neural network result in the same model parameters that best explain the data [34]. Therefore, the predictive distribution p(y|x, X, Y) in (4) can be approximated by Monte Carlo integration as:

$$p(y|x, X, Y) \approx \int p(y|x, W)\, q(W)\, dW \approx \frac{1}{K} \sum_{k=1}^{K} p(y|x, W^{(k)}), \quad (10)$$

where K is the total number of times tested with dropout.

Fig. 4. Example three (out-of-range tests for BPS): Reconstructed relative permittivity profiles from scattered fields. The first two columns have a different noise level from the training data. The third column has a different category of profile from the training data, and the fourth column has different profiles and a much higher relative permittivity to test the failure cases. The predicted uncertainty of the failure case is much larger than those of the successful cases.

To summarize, in the test stage, we can obtain the predicted uncertainty map σ^t, with its jth pixel value σ^t_j derived from the law of total variance [35]:

$$(\sigma^t_j)^2 = \mathrm{Var}(y|x, X, Y) = \mathbb{E}[\mathrm{Var}(y|W, x, X, Y)] + \mathrm{Var}(\mathbb{E}[y|W, x, X, Y]) = \mathbb{E}[\mathrm{Var}(y|W, x)] + \mathrm{Var}(\mathbb{E}[y|W, x]) \approx \frac{1}{K} \sum_{k=1}^{K} (\sigma^{(k)}_j)^2 + \frac{1}{K} \sum_{k=1}^{K} (\bar{y}^{(k)}_j - \bar{y}_j)^2. \quad (11)$$

Here, X and Y in the second equality are eliminated since, once conditioned on the model weights, the prediction no longer depends on the training data pairs {X, Y}. The approximate equality is due to the use of Monte Carlo integration.

In the test stage, the standard deviation σ^{(k)}_j and mean ȳ^{(k)}_j in (11) are directly predicted from the second and first channels of the output layer at the kth test, respectively. Further, we have the reconstructed contrast map ξ̄, with its jth pixel value ȳ_j being

$$\bar{y}_j = \frac{1}{K} \sum_{k=1}^{K} \bar{y}^{(k)}_j. \quad (12)$$

The square roots of the first term (1/K) Σ_{k=1}^{K} (σ^{(k)}_j)² and the second term (1/K) Σ_{k=1}^{K} (ȳ^{(k)}_j − ȳ_j)² in (11) can be understood as the data and model uncertainties, respectively. The data uncertainty is able to capture data imperfections such as data noise and out-of-range testing data. The model uncertainty accounts for uncertainty in the model parameters and is able to characterize the robustness of DNN structures.
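The test-stage aggregation in (11) and (12) can be sketched as follows. Here `stochastic_forward` is a hypothetical stand-in for one dropout-enabled forward pass of the trained BCNN; the jittery toy "network" at the end only mimics the effect of Monte Carlo dropout and is not the paper's model.

```python
import numpy as np

def mc_dropout_predict(stochastic_forward, x, K=16):
    """Aggregate K stochastic forward passes into a predicted contrast map
    and an uncertainty map, following Eqs. (11) and (12).

    stochastic_forward(x) -> (mean_map, sigma_map), each of shape (M,),
    sampled with a fresh dropout mask on every call.
    """
    means, sigmas = [], []
    for _ in range(K):
        mu_k, sigma_k = stochastic_forward(x)
        means.append(mu_k)
        sigmas.append(sigma_k)
    means = np.stack(means)    # shape (K, M)
    sigmas = np.stack(sigmas)  # shape (K, M)

    y_hat = means.mean(axis=0)                          # Eq. (12)
    data_var = np.mean(sigmas ** 2, axis=0)             # first term of Eq. (11)
    model_var = np.mean((means - y_hat) ** 2, axis=0)   # second term of Eq. (11)
    sigma_t = np.sqrt(data_var + model_var)             # predicted uncertainty map
    return y_hat, sigma_t

# Toy stand-in whose outputs jitter across calls, like dropout sampling.
rng = np.random.default_rng(1)
def fake_forward(x):
    return x + 0.1 * rng.standard_normal(x.shape), np.full(x.shape, 0.05)

y_hat, sigma_t = mc_dropout_predict(fake_forward, np.ones(4), K=200)
```

Because the K passes are independent, this loop parallelizes trivially, which is the basis of the computational-cost remark in Section II-D.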

C. Implementation Procedures

One of the advantages of the proposed BCNN for ISPs is that it is easy to implement and flexible enough to be adapted to existing deep learning schemes for solving ISPs. Specifically, for the BPS and DCS, the implementation procedures of solving ISPs with BCNN are summarized as follows:

• Step 1) Prepare the training data {X, Y}: The inputs X are the approximated contrasts that are generated from BP/DC. The labels Y are the ground-truth contrasts. Namely, same as in (3), X = ξ^a_m and Y = ξ_m.

• Step 2) Build the BCNN: The network structure is the same as the BPS/DCS detailed in [19], except that one more channel is added to the output layer to represent the uncertainty of the prediction, as presented in Fig. 1. Moreover, dropout is added to the convolutional layers, which changes the concept of learning all the weights together to learning a fraction of the weights in the network in each training iteration.

• Step 3) Train BCNN with a specific dropout rate: The custom loss function in (9) is used, with ȳ_j and σ_j being the first and second channels of the output layer, respectively. The approximated contrasts, i.e., ξ^a_m in (3), that are generated from BP/DC are fed into the input of the neural network. y_j serves as the label of the training process, i.e., y_j equals the jth pixel value of the ground truth y, i.e., ξ_m in (3).

• Step 4) Test the trained BCNN with a specific input x for K times by randomly dropping the weights in the convolution layers with a specific dropout rate: Due to the dropout, both channels of the output layer differ each time for a specific input x. To obtain the predicted contrast map ξ̄, the outputs in the first channel are averaged over the K times as ȳ_j = (1/K) Σ_{k=1}^{K} ȳ^{(k)}_j following (12), where the physical meaning of ȳ_j is the jth pixel of the predicted contrast map ξ̄. With the output in the second channel σ^{(k)}_j, the predicted uncertainty map σ^t is consequently calculated following (11).

Fig. 5. Example four (out-of-range tests for DCS): Reconstructed relative permittivity profiles from scattered fields. The first two columns have a different noise level from the training data. The third column has a different category of profile from the training data, and the fourth column has different profiles and a much higher relative permittivity to test the failure cases. The predicted uncertainty of the failure case is much larger than those of the successful cases.

D. Computational Cost

In the proposed BCNN for ISPs, the training procedures are exactly the same as those in BPS and DCS except that dropout layers are added, whose computational cost is much smaller than that of the convolution layers. Thus, the computational cost of the proposed method is the same as that of BPS and DCS in the training stage. In the testing stage, as presented in Step 4) of the above section, it is true that we need to test the trained network K times for a specific input by randomly dropping the weights. However, all these tests are independent of each other and can be run in parallel. Therefore, when parallel computation is available, the computational cost of the proposed method is also the same as that of the existing BPS or DCS in the testing stage.

III. TESTS WITH NUMERICAL AND EXPERIMENTAL DATA

A. Numerical Setup

In this work, we evaluate the proposed uncertainty quantification framework under the two recently proposed deep learning schemes, i.e., BPS and DCS in [19]. Specifically, in the training process, we randomly generate 5000 profiles with relative permittivity in the range of 1.5–2 using the handwritten digits in the MNIST dataset to represent scatterers in the DOI, among which 500 profiles are used for the validation process. Some representative examples of these profiles are presented in the first row of Fig. 2. To quantify the uncertainties in BPS and DCS, all the other parameter setups are kept the same as those in [19].

When generating the data, the scattered fields are collected by 32 receivers, where 16 line sources are used to illuminate the DOI. In the training stage, 5% Gaussian noise is added to the scattered field. In this work, we set the hyperparameters of the network structures as follows: the learning rate, dropout rate, batch size, total number of test times K, and weight decay equal 1e−4, 0.1, 20, 16, and 1e−6, respectively. A maximum of 100 epochs is set in the training stage. It is noted that these hyperparameters are empirically chosen, and a range of values, instead of the single values presented here, works for the proposed method. More specifically, for the learning rate, batch size, and weight decay, we simply use the values most commonly seen in the computer science community. For the dropout rate, according to our experience, values between 0.05 and 0.3 work well for the proposed method. For the total number of test times K, since it represents the number of samples, any sufficiently large value should work; according to our observations, all values larger than 10 work well for the proposed method.

To quantitatively evaluate the image qualities of the reconstructions, the average relative error Re is defined as

$$R_e = \frac{1}{L_t} \sum_{L_t} \frac{||\varepsilon_r^t - \varepsilon_r^r||_F}{||\varepsilon_r^t||_F} \quad (13)$$

with ε_r^t and ε_r^r denoting the true and reconstructed relative permittivity profiles, respectively. L_t denotes the total number of tests conducted. In order to quantitatively evaluate the predicted uncertainty, the true absolute error T^e is first calculated as

$$T^e = |\varepsilon_r^t - \varepsilon_r^r|. \quad (14)$$

Fig. 6. Representative histograms illustrating the individual relative errors of the proposed BCNN in Table I for (a) in-range tests and (b) highly nonlinear data under BPS.
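A sketch of the metrics in (13) and (14) for a single test, with small NumPy arrays standing in for the permittivity maps:

```python
import numpy as np

def relative_error(eps_true, eps_rec):
    """Per-test relative error, i.e., one summand of Eq. (13).
    np.linalg.norm on a 2-D array defaults to the Frobenius norm."""
    return np.linalg.norm(eps_true - eps_rec) / np.linalg.norm(eps_true)

def true_absolute_error(eps_true, eps_rec):
    """Pixel-wise true absolute error map T^e of Eq. (14)."""
    return np.abs(eps_true - eps_rec)

# Toy 2x2 permittivity maps (placeholders, not the paper's profiles)
eps_true = np.array([[2.0, 1.0], [1.0, 1.5]])
eps_rec = np.array([[1.8, 1.0], [1.1, 1.5]])
Re = relative_error(eps_true, eps_rec)       # sqrt(0.05) / sqrt(8.25)
Te = true_absolute_error(eps_true, eps_rec)  # [[0.2, 0.0], [0.1, 0.0]]
```

Averaging `relative_error` over the L_t tests then gives Re of (13); `Te` is the per-pixel map against which the predicted uncertainty is compared.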

Then, the correlation coefficient (CC) between the predicted uncertainty σ^t and the true absolute error is defined as

$$CC = \frac{\sum_{j=1}^{M} (\sigma^t_j - \bar{\sigma})(T^e_j - \bar{T}^e)}{\sqrt{\sum_{j=1}^{M} (\sigma^t_j - \bar{\sigma})^2 \sum_{j=1}^{M} (T^e_j - \bar{T}^e)^2}}, \quad (15)$$

where σ̄ and T̄^e denote the means of σ^t and T^e, respectively, and T^e_j is the jth value in T^e. To further evaluate the performance of the BCNN, the nonlinear correlation distribution (NC) between the predicted uncertainty σ^t and the true absolute error T^e is also defined [37], [38]:

$$NC = |F^{-1}(|\Theta|^l \times \Theta / |\Theta|)|^2, \quad (16)$$

where the notations “×” and “/” denote element-wise product and division, respectively. l is the nonlinear strength, with l = 0.3 throughout the paper [37], [38]. Θ equals F(T^e)[F(σ^t)]^∗, with F and F^{−1} denoting the Fourier and inverse Fourier transforms, respectively.
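The evaluation in (15) and (16) can be sketched as follows; the value l = 0.3 follows the paper, while the small maps and the zero-guard in the phase term are placeholder choices of ours.

```python
import numpy as np

def correlation_coefficient(sigma_t, T_e):
    """Pearson correlation between uncertainty and absolute error, Eq. (15)."""
    s, t = sigma_t.ravel(), T_e.ravel()
    s = s - s.mean()
    t = t - t.mean()
    return np.sum(s * t) / np.sqrt(np.sum(s ** 2) * np.sum(t ** 2))

def nonlinear_correlation(sigma_t, T_e, l=0.3):
    """Nonlinear correlation distribution of Eq. (16)."""
    theta = np.fft.fft2(T_e) * np.conj(np.fft.fft2(sigma_t))
    mag = np.abs(theta)
    mag = np.where(mag == 0, 1.0, mag)  # guard against division by zero
    return np.abs(np.fft.ifft2(mag ** l * theta / mag)) ** 2

# Sanity check: if the predicted uncertainty equals the true error map,
# CC is 1 and the NC map concentrates into a sharp peak at the origin.
rng = np.random.default_rng(2)
T_e = rng.uniform(0.0, 0.5, (16, 16))
cc = correlation_coefficient(T_e, T_e)
nc = nonlinear_correlation(T_e, T_e)
```

A sharp, single peak in the NC map thus indicates that the uncertainty map tracks the error map, which is the behavior illustrated in Fig. 7.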

B. Numerical Results

In the first and second examples, we test the proposedBCNN on BPS and DCS with the in-range data. For eachscheme, 100 random tests are conducted. As presented in Figs.2 and 3, four representative results are presented for BPS andDCS, respectively. It is seen that both BPS and DCS obtainsatisfying reconstructions for in-range tests. More importantly,it is found that the true absolute error highly correlates withthe predicted uncertainty from BCNN. In other words, thepredicted uncertainty is able to capture the features of theerrors without knowing the ground truth. It is also interestingto find that the boundary of scatterers, shape edges, and innercylinders tend to have larger uncertainty or true absolute errorthan other parts of the reconstructed scatterers. To furtherquantitatively evaluate the performance, 100 test are conductedand the average relative error Re are presented in Table 1. The

Fig. 7. Nonlinear correlation distribution (NC) map of out-of-range testsfor test #5, test #6, test #7, and test #8 in (a) Fig. 4 (b) Fig. 5. Thesharp peak in the correlation distribution means the predicted uncertaintyis highly correlated with ground-truth error, whereas the noisy correlationdistributions means the correlation between them is low. The horizontal axesrefer to “pixels”, and vertical axis is denoted as “a.u.”.

average CC values are also included in Table I to show the correlation between the predicted uncertainty and true absolute error.
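The pixel-wise uncertainty maps used throughout these tests come from Monte Carlo dropout: the network is run repeatedly with dropout kept active at test time, and the per-pixel spread of the outputs is taken as the predicted uncertainty. A minimal sketch of this inference loop (an illustration, not the paper's implementation), with `stochastic_forward` standing in for one dropout-enabled forward pass of the trained network (hypothetical name):

```python
import numpy as np

def mc_dropout_predict(stochastic_forward, x, n_samples=50):
    """Monte Carlo dropout inference: run the dropout-enabled network
    n_samples times on the same input and return the per-pixel mean
    (the reconstruction) and standard deviation (the predicted
    pixel-wise uncertainty)."""
    samples = np.stack([stochastic_forward(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)
```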

In the third and fourth examples, we test the proposed method with out-of-range data under the frameworks of BPS and DCS. We consider four out-of-range cases, where the level of noise, the shape of the scatterer, or the contrast of the scatterer is out of the training range. For each out-of-range case, 100 random tests are conducted. Figs. 4 and 5 show the representative results for BPS and DCS, respectively. Test #5 and test #6 in Figs. 4 and 5 show the reconstructed results and predicted uncertainties when 15% and 30% Gaussian


Fig. 8. The ground-truth profile of “FoamDielExt” [39]: the large cylinder and small cylinder have relative permittivities of εr = 1.45 ± 0.15 and εr = 3 ± 0.3, respectively.

noise are added to the scattered field. Satisfying results are obtained for both high-level noise contamination cases, which also suggests the robustness of the proposed method to noise contamination. Test #7 shows the results when testing on cylinder data with the network trained previously on MNIST data. It is found that the predicted uncertainty of the proposed BCNN is still able to capture the features shown in the true absolute error. These results all verify the generalization capability of the proposed method.

In test #8, we purposely increase the difficulty of the reconstructions by largely increasing the relative permittivity of the testing scatterers while keeping the training data unchanged. Therefore, not only do the testing profiles differ from the training data significantly, but the nonlinearities of the ISPs (multiple scattering effects) are also dramatically increased. It is shown in test #8 of Figs. 4 and 5 that both methods fail in the reconstructions and that both the true absolute error and predicted uncertainty increase significantly, where the latter also gives us an indicator to monitor the “failure” results. It is known that current DLSs easily fail in solving general ISPs for out-of-range scatterers with permittivity larger than 3. To statistically evaluate the performance when the BCNN fails the reconstructions, we randomly generated 100 cylinder scatterers with relative permittivity 3.5, marked as “highly nonlinear data”. The quantitative results for all the out-of-range tests are also included in Table I, averaged over 100 tests for each case. Further, two representative histograms illustrating the individual relative errors of the proposed BCNN in Table I are presented in Figs. 6(a) and (b).
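Table I reports a per-test relative error Re averaged over the 100 tests of each case. Assuming the conventional pixel-averaged definition (the paper's exact formula for Re is given earlier in the text, so this is an assumption for illustration), the aggregation can be sketched as:

```python
import numpy as np

def relative_error(eps_recon, eps_true):
    """Pixel-averaged relative error between a reconstructed and a
    ground-truth permittivity profile (one common definition; assumed
    here for illustration)."""
    eps_recon = np.asarray(eps_recon, dtype=float)
    eps_true = np.asarray(eps_true, dtype=float)
    return float(np.mean(np.abs(eps_recon - eps_true) / np.abs(eps_true)))

def average_relative_error(pairs):
    """Average Re over a batch of (reconstruction, ground truth) tests,
    as reported per case in Table I."""
    return float(np.mean([relative_error(r, t) for r, t in pairs]))
```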

In order to present the correlation between the predicted uncertainty and true absolute error in a more straightforward way, we further calculate the nonlinear correlation distribution (NC) for the out-of-range tests and present it in Fig. 7. In NC, when the two parameters are highly correlated with each other, a sharp peak appears, whereas when the two parameters are unrelated, noisy sidelobes appear. It is seen from Fig. 7 that, for both BPS and DCS, the nonlinear correlation distributions of test #5, test #6, and test #7 show sharp peaks, whereas noisy sidelobes are observed for test #8. For the failure cases, both the CC values in Table I and the NC in Fig. 7 suggest that although one can observe a significant increase of the predicted uncertainty when the reconstruction fails, the predicted uncertainty is no longer correlated with the true absolute error. Usually, to quantitatively evaluate

Fig. 9. Tests with experimental data: image reconstruction and uncertainty quantification of “FoamDielExt” using the proposed BCNN under the framework of (a) BPS and (b) DCS, where the network is trained with MNIST digits. For both (a) and (b), the first row presents the input (left) and output (right) of the network, and the second row presents the true absolute error (left) and predicted uncertainty (right).

Fig. 10. Nonlinear correlation distribution (NC) of the experimental tests for (a) BPS and (b) DCS. A sharp peak in the correlation distribution means the predicted uncertainty is highly correlated with the ground-truth error. The horizontal axes refer to “pixels”, and the vertical axis is denoted as “a.u.”.

correlation effects in NC, the peak-to-correlation energy (PCE) [38] is used:

$$\mathrm{PCE}=\mathrm{NC}_{m}\Big/\sum_{j=1}^{M}\mathrm{NC}_{j},$$

where $\mathrm{NC}_{m}$ denotes the maximum value of NC. In our results, it is observed that when PCE < 0.03 (representing low correlation between the two), the shapes and positions of the scatterers also cannot be recognized from the reconstructed results. Specifically, the PCEs for test #5, test #6, test #7, and test #8 in Fig. 7(a) are 0.1, 0.13, 0.25, and 0.01, respectively. The PCEs for test #5, test #6, test #7, and test #8 in Fig. 7(b) are 0.22, 0.16, 0.36, and 0.004, respectively.
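A minimal numpy sketch of the PCE criterion (an illustration, taking the NC map as a 2-D array):

```python
import numpy as np

def peak_to_correlation_energy(nc):
    """PCE = NC_m / sum_j NC_j: the NC maximum divided by the total
    energy of the correlation distribution. Values near 1 indicate a
    sharp correlation peak; small values (e.g. below 0.03) indicate a
    noisy, weakly correlated distribution."""
    nc = np.asarray(nc, dtype=float)
    return float(nc.max() / nc.sum())
```

A delta-like NC map (all energy in one pixel) gives PCE = 1, while a flat, noisy map gives PCE near 1/M.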

C. Tests with Experimental Data

In order to further verify the proposed BCNN for ISPs, experimental data from Institut Fresnel [39] are used in the test stage, where, as presented in Fig. 8, a “FoamDielExt” profile with TM illumination is considered. The ground-truth profile consists of a large cylinder with a diameter of 80 mm (εr = 1.45 ± 0.15) and a small cylinder with a diameter of 31 mm (εr = 3 ± 0.3). Specifically, the DOI is illuminated by electromagnetic waves from 8 transmitters, and 241 receivers are used to receive the scattered field after the interaction between the scatterers and the incident field. The scattered field is then used to reconstruct the scatterers and predict the uncertainty of the reconstructions. In all tests, reconstructions are conducted at 3 GHz, a single frequency without frequency hopping.

In the training stage, we use the same MNIST profiles, including the range of relative permittivities, as those in [36]. We present the reconstructed scatterers and predicted uncertainties in Fig. 9. It is seen that the predicted uncertainties for both BPS and DCS are highly correlated with the true absolute errors. To quantitatively verify the correlation, CC and NC are calculated and presented in Table I and Fig. 10, respectively. It is seen that the CC values for BPS and DCS are 0.81 and 0.83, respectively. In Fig. 10, sharp peaks are present in the NC for both BPS and DCS, which also suggests that the proposed method is able to predict pixel-based uncertainty in the reconstructions of DLSs without knowledge of the ground-truth profile.

IV. CONCLUSION

It is critical to understand what an algorithm does not know when it is used for reconstructions in ISPs. Most recently, deep learning has become an attractive, powerful, and promising approach to solving ISPs. Nevertheless, all of these methods assume that the reconstructions from the trained network are trustworthy, which is not true in practice. Different from traditional objective-function-based approaches that rely on physical principles, the learning approach is data-driven, where the reconstructions depend highly on whether the test data are close to the training data and also on the architecture of the learning model. Therefore, it is extremely desirable to build a framework that allows us to estimate the uncertainty or error of the reconstructions predicted by the network.

In this paper, we proposed a BCNN framework that is able to solve ISPs and at the same time predict the pixel-based uncertainty of the reconstructions. Firstly, a close correlation between the predicted uncertainty and the true absolute error indicates that the proposed framework is able to capture most of the features shown in the true absolute error, where the latter is known only if we have the ground truth. Secondly, the proposed method is highly flexible and can be applied to most of the DLSs in the community of ISPs; in this paper, we have tested it on two recent deep learning schemes to illustrate the concept. Thirdly, the proposed framework is robust to noise contamination in both BPS and DCS, as verified by tests with 15% and 30% Gaussian noise. Fourthly, to quantitatively evaluate the proposed framework, the correlation coefficient and nonlinear correlation distribution are defined to calculate the correlation between the true absolute error and predicted uncertainty; they quantitatively demonstrate that the predicted uncertainty is highly correlated with the true absolute error. Fifthly, the predicted uncertainty is directly output from the network, which avoids complicated computations.

The proposed uncertainty-quantification framework is worth further study for challenging topics in ISPs, which will be our future work. For example, highly nonlinear ISPs [40] are difficult to deal with, and failures in the reconstructions are common when solving these problems. Therefore, building a model that can supervise the uncertainty and give instructions to the algorithms is highly desirable for highly nonlinear ISPs.

REFERENCES

[1] X. Chen, Computational Methods for Electromagnetic Inverse Scattering. Wiley-IEEE, 2018.

[2] L.-P. Song, C. Yu, and Q.-H. Liu, “Through-wall imaging (TWI) by radar: 2-D tomographic results and analyses,” IEEE Transactions on Geoscience and Remote Sensing, vol. 43, no. 12, pp. 2793–2798, 2005.

[3] R. Chen, Z. Wei, and X. Chen, “Three dimensional through-wall imaging: Inverse scattering problems with an inhomogeneous background medium,” in 2015 IEEE 4th Asia-Pacific Conference on Antennas and Propagation (APCAP), 2015, pp. 505–506.

[4] R. Persico, Introduction to Ground Penetrating Radar: Inverse Scattering and Data Processing. John Wiley & Sons, 2014.

[5] R. Zoughi, Microwave Non-Destructive Testing and Evaluation Principles. Springer Science & Business Media, 2012.

[6] H. Kagiwada, R. Kalaba, S. Timko, and S. Ueno, “Associate memories for system identification: Inverse problems in remote sensing,” Mathematical and Computer Modelling, vol. 14, pp. 200–202, 1990.

[7] A. J. Devaney, “Inverse-scattering theory within the Rytov approximation,” Optics Letters, vol. 6, no. 8, pp. 374–376, 1981.

[8] K. Belkebir, P. C. Chaumet, and A. Sentenac, “Superresolution in total internal reflection tomography,” Journal of the Optical Society of America A, vol. 22, no. 9, pp. 1889–1897, 2005.

[9] T. M. Habashy, R. W. Groom, and B. R. Spies, “Beyond the Born and Rytov approximations: A nonlinear approach to electromagnetic scattering,” Journal of Geophysical Research: Solid Earth, vol. 98, no. B2, pp. 1759–1775, 1993.

[10] W. C. Chew and Y. M. Wang, “Reconstruction of two-dimensional permittivity distribution using the distorted Born iterative method,” IEEE Transactions on Medical Imaging, vol. 9, no. 2, pp. 218–225, 1990.

[11] P. M. van den Berg and R. E. Kleinman, “A contrast source inversion method,” Inverse Problems, vol. 13, no. 6, p. 1607, 1997.

[12] T. Isernia, L. Crocco, and M. D'Urso, “New tools and series for forward and inverse scattering problems in lossy media,” IEEE Geoscience and Remote Sensing Letters, vol. 1, no. 4, pp. 327–331, 2004.

[13] X. Chen, “Subspace-based optimization method for solving inverse-scattering problems,” IEEE Transactions on Geoscience and Remote Sensing, vol. 48, no. 1, pp. 42–49, 2010.

[14] Y. Zhong and X. Chen, “An FFT twofold subspace-based optimization method for solving electromagnetic inverse scattering problems,” IEEE Transactions on Antennas and Propagation, vol. 59, no. 3, pp. 914–927, 2011.

[15] K. Xu, Y. Zhong, and G. Wang, “A hybrid regularization technique for solving highly nonlinear inverse scattering problems,” IEEE Transactions on Microwave Theory and Techniques, vol. 66, no. 1, pp. 11–21, 2018.


[16] S. Caorsi and P. Gamba, “Electromagnetic detection of dielectric cylinders by a neural network approach,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 2, pp. 820–827, 1999.

[17] I. T. Rekanos, “Neural-network-based inverse-scattering technique for online microwave medical imaging,” IEEE Transactions on Magnetics, vol. 38, no. 2, pp. 1061–1064, 2002.

[18] E. Bermani, A. Boni, S. Caorsi, and A. Massa, “An innovative real-time technique for buried object detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 41, no. 4, pp. 927–931, 2003.

[19] Z. Wei and X. Chen, “Deep-learning schemes for full-wave nonlinear inverse scattering problems,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 4, pp. 1849–1860, 2019.

[20] Y. Sanghvi, Y. Kalepu, and U. K. Khankhoje, “Embedding deep learning in inverse scattering problems,” IEEE Transactions on Computational Imaging, vol. 6, pp. 46–56, 2020.

[21] R. Guo, X. Song, M. Li, F. Yang, S. Xu, and A. Abubakar, “Supervised descent learning technique for 2-D microwave imaging,” IEEE Transactions on Antennas and Propagation, vol. 67, no. 5, pp. 3550–3554, 2019.

[22] G. Chen, P. Shah, J. Stang, and M. Moghaddam, “Learning-assisted multimodality dielectric imaging,” IEEE Transactions on Antennas and Propagation, vol. 68, no. 3, pp. 2356–2369, 2020.

[23] J. Adler and O. Oktem, “Solving ill-posed inverse problems using iterative deep neural networks,” Inverse Problems, vol. 33, no. 12, p. 124007, 2017.

[24] L. Li, L. G. Wang, F. L. Teixeira, C. Liu, A. Nehorai, and T. J. Cui, “DeepNIS: Deep neural network for nonlinear electromagnetic inverse scattering,” IEEE Transactions on Antennas and Propagation, vol. 67, no. 3, pp. 1819–1825, 2019.

[25] Y. Sun, Z. Xia, and U. S. Kamilov, “Efficient and accurate inversion of multiple scattering with deep learning,” Optics Express, vol. 26, no. 11, pp. 14678–14688, 2018.

[26] J. Xiao, J. Li, Y. Chen, F. Han, and Q. H. Liu, “Fast electromagnetic inversion of inhomogeneous scatterers embedded in layered media by Born approximation and 3-D U-Net,” IEEE Geoscience and Remote Sensing Letters, pp. 1–5, 2019.

[27] V. Khoshdel, A. Ashraf, and J. LoVetri, “Enhancement of multimodal microwave-ultrasound breast imaging using a deep-learning technique,” Sensors, vol. 19, no. 18, 2019.

[28] Y. Khoo and L. Ying, “SwitchNet: A neural network model for forward and inverse scattering problems,” SIAM Journal on Scientific Computing, vol. 41, no. 5, pp. A3182–A3201, 2019.

[29] Z. Wei, D. Liu, and X. Chen, “Dominant-current deep learning scheme for electrical impedance tomography,” IEEE Transactions on Biomedical Engineering, vol. 66, no. 9, pp. 2546–2555, 2019.

[30] Z. Wei and X. Chen, “Induced-current learning method for nonlinear reconstructions in electrical impedance tomography,” IEEE Transactions on Medical Imaging, vol. 39, no. 5, pp. 1326–1334, 2020.

[31] A. Massa, D. Marcantonio, X. Chen, M. Li, and M. Salucci, “DNNs as applied to electromagnetics, antennas, and propagation: A review,” IEEE Antennas and Wireless Propagation Letters, vol. 18, no. 11, pp. 2225–2229, 2019.

[32] L. Poli, G. Oliveri, and A. Massa, “Microwave imaging within the first-order Born approximation by means of the contrast-field Bayesian compressive sensing,” IEEE Transactions on Antennas and Propagation, vol. 60, no. 6, pp. 2865–2879, 2012.

[33] A. Kendall and Y. Gal, “What uncertainties do we need in Bayesian deep learning for computer vision?” in Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017, pp. 5574–5584.

[34] Y. Gal and Z. Ghahramani, “Dropout as a Bayesian approximation: Representing model uncertainty in deep learning,” in Proceedings of the 33rd International Conference on Machine Learning, 2016, pp. 1050–1059.

[35] Y. Xue, S. Cheng, Y. Li, and L. Tian, “Reliable deep-learning-based phase imaging with uncertainty quantification,” Optica, vol. 6, no. 5, pp. 618–629, 2019.

[36] Z. Wei and X. Chen, “Physics-inspired convolutional neural network for solving full-wave inverse scattering problems,” IEEE Transactions on Antennas and Propagation, vol. 67, no. 9, pp. 6138–6148, 2019.

[37] B. Javidi, “Nonlinear joint power spectrum based optical correlation,” Applied Optics, vol. 28, no. 12, pp. 2358–2367, 1989.

[38] Y. Xiao, L. Zhou, and W. Chen, “Single-pixel imaging authentication using sparse Hadamard spectrum coefficients,” IEEE Photonics Technology Letters, vol. 31, no. 24, pp. 1975–1978, 2019.

[39] J.-M. Geffrin, P. Sabouroux, and C. Eyraud, “Free space experimental scattering database continuation: Experimental set-up and measurement precision,” Inverse Problems, vol. 21, no. 6, p. S117, 2005.

[40] Y. Zhong, M. Lambert, D. Lesselier, and X. Chen, “A new integral equation method to solve highly nonlinear inverse scattering problems,” IEEE Transactions on Antennas and Propagation, vol. 64, no. 5, pp. 1788–1799, 2016.