Preconditioners for Structured Matrices Arising in

Subsurface Object Detection

Misha Kilmer, Eric Miller, Carey Rappaport

Center for Electromagnetics Research

Electrical and Computer Engineering Department

Northeastern University

Boston, MA 02115

email: [email protected], [email protected], [email protected]

KEYWORDS: preconditioner, Helmholtz equation, iterative methods, QMR, sine transform,

scattering, parallel

AMS Subject Classifications: 65F10, 65F15, 65N22, 78-08, 78A45, 78A40

This work was supported by the Army Research Office Demining MURI under Grant DAAG55-97-1-0013.

Under review, "Journal of Computational Physics"

RUNNING HEAD: Preconditioners for Matrices in Object Detection

CONTACT INFORMATION:

name: Dr. Misha E. Kilmer

mailing:

235 Forsyth Bldg.

Northeastern University

Boston, MA 02115

fax: 617-373-8627

email: [email protected]


Abstract

In this paper we focus on computationally efficient methods for solving the 2-D Helmholtz-type equation with piecewise constant, complex wavenumber. To approximate the solution to this equation, we replace the infinite boundaries with a perfectly matched layer (PML) and discretize the continuous problem in space using finite differences. The corresponding matrix is structured, large, sparse, and complex, and is neither Hermitian nor symmetric. We examine preconditioners for use with iterative Krylov subspace methods in solving the corresponding discrete systems. Our preconditioners are nearly Toeplitz block approximations to the original matrices. The preconditioners can be applied efficiently, and in parallel, at each iteration using 1-D sine transforms and band solves. We show that the eigenvalues of the preconditioned matrix can be computed once the eigenvalues of certain 1-D matrices are known; therefore we are able to compute eigenvalues for rather large 2-D problems. We present results illustrating the effectiveness of the preconditioners under various conditions.

1 Introduction

In this work we are interested in developing an accurate and highly efficient forward scattering model; that is, a computational code which determines observed scattered fields from a hypothesized distribution of subsurface, or underground, scatterers. Ultimately, one would like to use the forward scattering data generated in this way to design inversion algorithms aimed at detecting and localizing buried objects, such as landmines.

The mathematical formulation of the forward scattering problem of interest here is given by a Helmholtz-type equation, derived from the time harmonic form of Maxwell's Equations [13, pp. 6, 12]. In particular, if we assume a 2-D problem with the subsurface illuminated by a plane wave impinging at an angle from the normal, then it is simple to show that the partial differential equation describing the scattering problem is given by

$$(\nabla^2 + k^2(x,y))\,e(x,y) = m(x,y)\,e_0(x,y) \qquad (1)$$

where $k^2(x,y) = \omega^2\mu_0\epsilon(x,y)$ is the square of the wave number, with $\omega$ representing angular frequency and $\mu_0$ a constant denoting the magnetic permeability. Here, $e(x,y)$ denotes the scattered electric field and $e_0(x,y)$ the field in the absence of a scatterer, which is known. The function $\epsilon(x,y)$, called the electrical permittivity, is related to the electrical properties of the media, while $m(x,y)$ describes the properties of the buried object and has support only over the region in which the object is located. In our problem, we will assume that the wavenumber is piecewise constant; that is,

$$k(x,y) = \begin{cases} k_0 & (x,y) \in \text{air} \\ k_s & (x,y) \in \text{soil} \\ k_m & (x,y) \in \text{subsurface object} \end{cases}.$$

Under this notation, $m(x,y) = (k_s^2 - k_m^2)$ if $(x,y)$ is in the object and 0 otherwise.

The computational problem of interest in this paper is the stable and rapid solution of a discrete version of (1). Although there are also integral equation approaches to solving the forward scattering problem, the approach of solving the finite difference discretized partial differential equation (PDE) is more appealing to us for several reasons, not the least of which is its flexibility. For instance, we can readily incorporate a rough air-soil interface, volume inhomogeneities, and complex target geometry into the model, and we have the added ability to easily and effectively deal with a space-varying wavenumber. Further, the PDE is easily discretized with finite differences, and the forward solver, namely the preconditioned iterative method described herein, is straightforward to understand, reasonably easy to implement, and fast.

Now (1) represents a scattering problem over an infinite domain, but we are interested in the solution over the rectangular region $\Omega = [-a_1, a_1] \times [-b_1, b_1]$ for some predefined real values $a_1, b_1 > 0$. Thus, we begin by replacing the infinite boundaries with a perfectly matched layer (PML) [1, 11] whose mathematical formulation we describe in §2. We discretize the continuous problem in space using finite differences. This leads to a matrix equation of the form

$$Af = g$$

where the matrix $A$ is an $n \times n$, sparse, complex, structured matrix that is neither symmetric nor Hermitian. The entries in $A$ and the right hand side $g$ depend upon frequency, and both $A$ and $g$ are sparse. The values of $g$ also depend on the incident angle.

Typically, the matrix has some eigenvalues clustered around 0 whose real parts are both positive and negative. Hence, direct solution methods like Gaussian elimination will require partial pivoting for stability. Even without pivoting, the number of flops and the amount of storage needed to compute a solution directly can be prohibitive. In our applications, $n$ will be quite large. However, since the matrix is structured and sparse, the system is a good candidate for solution via iterative methods. Due to the eigenstructure of the matrix, an effective preconditioner (that is, one which can be applied at each iteration for about the same cost as a matrix-vector product and which clusters eigenvalues away from zero) must be used.

In this paper, we examine preconditioners for use with iterative Krylov subspace methods for solving the resulting linear systems. Preconditioning techniques derived from fast direct methods for the single medium case with Sommerfeld-like boundary condition have been explored for 2-D [4] and 3-D [3] cases. Our preconditioner is somewhat similar to a domain decomposition based preconditioning idea in [4]. However, the PML boundary condition we employ here is significantly different from the Sommerfeld-like boundary condition assumed in [3, 4] and requires special treatment. Our preconditioner approximates $A$ with a matrix that can be treated by fast direct methods, but unlike most of the approaches in [3, 4], our development does not require that we replace the PML with another boundary condition. Also, since $A$ is neither complex symmetric nor Hermitian, neither is our preconditioner. A further difference from the methods in [3, 4] is that we consider piecewise constant wavenumber.

Despite the additional difficulties presented by incorporating the PML and piecewise constant wavenumber into our model, we are able to use fast direct methods to apply our preconditioner by exploiting the special structure of $A$. The preconditioner is a nearly Toeplitz block approximation to the original matrix and can be applied efficiently and in parallel at each iteration using 1-D sine transforms and band solves. Moreover, we take a similar approach to that in [2] and show that the eigenvalues of the 2-D preconditioned matrix can be obtained by solving a few 1-D eigenproblems. In this way, we are able to develop intuition on the clustering of the eigenvalues of our preconditioned matrix, and in turn, on the rate of convergence of the iterative method.

Our paper is organized as follows. In §2, we discuss the formulation and structure of the matrix. In §3 we show how this structure can be exploited to form preconditioners which can be applied efficiently using fast sine transforms and 1-D solves. The eigenanalysis of our preconditioned matrix is given in §4 and we present numerical results in §5. Conclusions and future work are the subject of §6.

2 Formulation and Structure of A

In this section, we describe how the matrix is formed using finite differences and how a particular ordering of the unknowns gives rise to a matrix with special structure. We begin by introducing the PML boundary conditions.

2.1 PML

As mentioned in the introduction, we are interested in the solution to (1) on a rectangular region $\Omega = [-a_1, a_1] \times [-b_1, b_1]$. The PML is an absorbing boundary condition in the sense that inside the PML, waves are forced to attenuate outward from $\partial\Omega$ with a minimum amount of reflection from the $\Omega$-PML interface, so that the solution to the finite dimensional problem resembles the solution to the original infinite space problem in the region $\Omega$.

In order to mathematically define the PML, we introduce some notation. The complex-valued quantity $\epsilon$ is

$$\epsilon(x,y) = \epsilon_0\epsilon_{rel}(x,y) + \frac{i\sigma(x,y)}{\omega},$$

for some real $\sigma \ge 0$, $\epsilon_{rel} \ge 1$, with $i = \sqrt{-1}$ and $\epsilon_0$ a constant (the permittivity of free space). The value $\sigma$ is the conductivity of the material. In air, $\sigma = 0$, while in soil, $\epsilon$ is usually complex valued. High values of $\sigma$ mean that the wave is rapidly attenuated in that medium.

The basic idea behind the perfectly matched layer approach is to place an artificial layer around the computational grid in which the conductivity $\sigma$ is strictly increasing from $\partial\Omega$ outward to the edge of the PML (see Figure 1 for an illustration). By gradually increasing $\sigma$ away from the inner boundary $\partial\Omega$, the wave is forced to attenuate with a minimum amount of artificial reflection off the $\Omega$-PML interface.

Formally, the equation being solved inside the PML is as follows [11]:

$$c(x)\frac{\partial}{\partial x}\left(c(x)\frac{\partial e}{\partial x}\right) + c(y)\frac{\partial}{\partial y}\left(c(y)\frac{\partial e}{\partial y}\right) + k^2(x,y)\,e = 0 \qquad (2)$$

where $k$ takes the value of the wave number of the medium which it surrounds and

$$c(t) = \begin{cases} \dfrac{\epsilon_0\omega}{\epsilon_0\omega + i\sigma(t)} & t \text{ is the normal direction} \\ 1 & \text{otherwise} \end{cases}.$$

The key to the success of the PML at attenuating the wave while introducing nominal reflection is a prudent choice of $\sigma(t)$. We take

$$\sigma(t) = \sigma_f\left(\frac{|t| - \xi}{hN}\right)^p,$$

where $\xi$ is $a_1$ or $b_1$, depending on whether the normal direction $t$ is $x$ or $y$, $N$ is the number of desired layers in the PML, $h$ is the mesh width, and $\sigma_f$ and $p$ are user defined parameters. The interested reader is referred to [10] for appropriate choices of these parameters. Note that the width of the PML in meters, say $\delta$, is defined in terms of $h$ and $N$; that is, $\delta = hN$. Likewise, we have a different function $c(t)$ depending on our choice of $h$ and $N$.
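As a rough illustration of how this profile behaves, the following Python sketch evaluates a polynomial conductivity ramp of the form $\sigma_f((|t| - \xi)/(hN))^p$ and the resulting coefficient $c(t)$. The function names and the values chosen for `sigma_f`, `p` and the half-width `xi` are assumptions made for the example only; appropriate parameter choices are the subject of [10].

```python
import numpy as np

EPS0 = 8.8541878128e-12  # permittivity of free space in F/m

def pml_sigma(t, xi, h, N, sigma_f, p):
    """Polynomial conductivity ramp: 0 for |t| <= xi, sigma_f at depth h*N."""
    depth = np.maximum(np.abs(np.asarray(t, dtype=float)) - xi, 0.0)
    return sigma_f * (depth / (h * N)) ** p

def pml_c(t, xi, h, N, sigma_f, p, omega):
    """Coefficient c(t) = eps0*omega / (eps0*omega + i*sigma(t))."""
    sigma = pml_sigma(t, xi, h, N, sigma_f, p)
    return EPS0 * omega / (EPS0 * omega + 1j * sigma)

# Hypothetical parameters, chosen only to exercise the formulas.
omega = 2 * np.pi * 480e6
xi, h, N, sigma_f, p = 1.0, 0.0245, 8, 0.5, 2
layers = xi + h * np.arange(0, N + 1)   # interface plus the N PML grid lines
sigma_vals = pml_sigma(layers, xi, h, N, sigma_f, p)
c_vals = pml_c(layers, xi, h, N, sigma_f, p, omega)
```

In this sketch $c$ equals 1 at the interface and its magnitude drops below 1 inside the layer, which is the mechanism by which the wave is damped.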

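Because the wave has theoretically attenuated in the PML, the final outer boundary condition is a Dirichlet condition:

$$e = 0 \text{ on } \partial\tilde\Omega,$$

where $\tilde\Omega$ is the rectangular region $\tilde\Omega = (-a_1 - \delta, a_1 + \delta) \times (-b_1 - \delta, b_1 + \delta)$.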

2.2 Discretization

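The problem that we would like to discretize is

$$\begin{cases}
(\nabla^2 + k^2(x,y))\,e(x,y) = m(x,y)\,e_0(x,y) & (x,y) \in \Omega \\[4pt]
c^2(x)\dfrac{\partial^2 e}{\partial x^2} + c^2(y)\dfrac{\partial^2 e}{\partial y^2} + c(x)\dfrac{\partial c}{\partial x}\dfrac{\partial e}{\partial x} + c(y)\dfrac{\partial c}{\partial y}\dfrac{\partial e}{\partial y} + k^2(x,y)\,e = 0 & (x,y) \in \tilde\Omega \setminus \Omega \\[4pt]
e = 0 & \partial\tilde\Omega
\end{cases} \qquad (3)$$

We illustrate this in Figure 1.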

In the interior, we use the standard 5-point difference operator on a uniform mesh with grid spacing $h$ in both the $x$ and $y$ directions. Over the PML, we use standard centered differences for both the first and second order derivatives of $e$ in the $x$ and $y$ directions, and the coefficients $c'(x), c(x), c'(y), c(y)$ are evaluated at the corresponding midpoints in the $x$ and $y$ directions, respectively.

To obtain the problem in matrix form, we will order the unknowns lexicographically, one row (left to right) at a time, bottom to top (refer to Figure 1). For ease of exposition, we will take $\Omega$ (hence $\tilde\Omega$) to be a square so that the number of grid points in both the $x$ and $y$ directions is the same and equal to $m_w$ on $\tilde\Omega$. Hence, $n = m_w^2$. We will adjust $\delta$ above so that the number of grid points across the width of the PML is $N$. Note that if $n_1$ denotes the number of grid points in $\Omega$ then $m_w = n_1 + 2N$. We will let $n_a$ and $n_s$ denote the number of grid points in the $y$-direction in the air and in the soil, respectively, so that for square $\Omega$ we have $n_a + n_s = n_1$. Finally, we use $n_r$ to denote the number of grid points on the object in the $y$ direction and $n_c$ to denote the number of grid points on the object in the $x$ direction. Therefore, the total number of unknowns corresponding to the object is $n_rn_c$. Throughout the remainder of the paper, we assume $n_r, n_c \ll n_1$; such is the case, for instance, in buried landmine detection. We summarize this notation in Table I.

Discretized in this way, we observe that $A$ can be written as a tensor sum:

$$A = ((I \otimes A_1 + H \otimes I) + \gamma E)$$

where $\gamma = (k_m^2 - k_s^2)$, $A_1, H$ are tridiagonal matrices we will describe shortly, $I$ denotes the identity matrix of size $m_w$, and $E$ is a rank $n_rn_c$ diagonal matrix which contains a 1 in a given position on the diagonal if that position corresponds to an unknown in the buried object and a 0 otherwise.
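To make the Kronecker (tensor sum) structure concrete, the sketch below assembles a matrix of the same form with SciPy sparse matrices. It is deliberately simplified and is not the matrix used in the paper: the PML is dropped, so `A1` is just the Toeplitz matrix tridiag(1, -2, 1), and the grid size, the diagonal entries standing in for $k^2h^2$, the object location and the value of `gamma` are all hypothetical.

```python
import numpy as np
import scipy.sparse as sp

def tensor_sum_matrix(m_w, d_diag, gamma, object_mask):
    """Assemble I (x) A1 + H (x) I + gamma * E for a toy problem without PML."""
    A1 = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(m_w, m_w), dtype=complex)
    H = A1 + sp.diags(d_diag)                      # H = A1 + D
    I = sp.identity(m_w, dtype=complex)
    # Row-major ravel matches the row-by-row lexicographic ordering.
    E = sp.diags(object_mask.ravel().astype(complex))
    return (sp.kron(I, A1) + sp.kron(H, I) + gamma * E).tocsr()

# Hypothetical toy data: 6 x 6 grid, bottom 3 rows "soil", top 3 rows "air".
m_w = 6
d_diag = np.array([0.3 + 0.05j] * 3 + [0.1] * 3)   # stand-ins for k^2 h^2
mask = np.zeros((m_w, m_w), dtype=bool)
mask[1:3, 2:4] = True                               # object occupies 2 x 2 points
gamma = 0.2 - 0.04j
A = tensor_sum_matrix(m_w, d_diag, gamma, mask)
```

Each row of `A` then carries the familiar 5-point stencil, with the diagonal shifted by the row's entry of $D$ and, on object points, by `gamma`.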

Now let $T$ denote the $n_1 \times n_1$ tridiagonal Toeplitz matrix $\mathrm{tridiag}[1, -2, 1]$, and let $e_j$ denote the $j$th unit vector. We will use $P_r$ to denote the $r \times r$ matrix with ones on the main anti-diagonal and zeros otherwise; notice that $P_r = P_r^T$. Let $x_i = a_1 + ih$, $i = 1 : N$, and denote $c_i = c(x_i)$, $c_i' = c'(x_i)$. Finally, set $s_i = c_i^2 - .5hc_ic_i'$ and $\hat s_i = c_i^2 + .5hc_ic_i'$. Then we can write the $m_w \times m_w$ matrix $A_1$ as

$$A_1 = \begin{bmatrix} A_{1,1} & s_1R_1^T & 0 \\ R_1 & T & P_{n_1}R_1P_N \\ 0 & s_1P_NR_1^TP_{n_1} & P_NA_{1,1}P_N \end{bmatrix}$$

where the $n_1 \times N$ matrix $R_1$ is

$$R_1 = [0, 0, \ldots, 0, e_1]$$

and the $N \times N$ matrix $A_{1,1}$ is

$$A_{1,1} = \begin{bmatrix} -2c_N^2 & s_N & & & 0 \\ \hat s_{N-1} & -2c_{N-1}^2 & s_{N-1} & & \\ & \ddots & \ddots & \ddots & \\ & & \hat s_2 & -2c_2^2 & s_2 \\ 0 & & & \hat s_1 & -2c_1^2 \end{bmatrix}.$$

Observe that the entries of $P_NA_{1,1}P_N$ are just the entries of $A_{1,1}$ in reverse order.

Further, the matrix $H$ is given by

$$H = A_1 + D$$

where $D$ is a diagonal matrix whose first $N + n_s$ entries are $k_s^2h^2$ and whose last $N + n_a$ entries are $k_a^2h^2$.

Note that because of the PML, the matrix will not be symmetric or Hermitian.

3 The Preconditioner

Observe that if $N = 0$, $A_1$ would be a symmetric, tridiagonal Toeplitz matrix. In that case, the system could be solved directly in an efficient manner by using fast 1-D sine transforms to decouple the system followed by solving $m_w$ 1-D tridiagonal systems. However, appropriate values of $N$ are greater than or equal to 8 for the types of problems we are solving [10], so $A$ does not quite have Toeplitz blocks. Fortunately, the near-Toeplitz structure of the blocks can be exploited to develop a preconditioner, described below, which can be applied efficiently and in parallel at each iteration. Initially, we tried various preconditioners derived algebraically by replacing $A_1$ with approximations which could be diagonalized by fast orthogonal transforms; for example, the tridiagonal matrix obtained by averaging along the diagonals of $A_1$ is symmetric and Toeplitz and can be diagonalized by a fast discrete sine transform (DST). Unfortunately, we found that these types of preconditioners were not nearly as effective as the one we now describe.

Our idea is similar to the domain decomposition approach described in Section 3.3.1 of [4] for the single-medium Helmholtz problem, but it varies in that we must deal with the PML rather than the Sommerfeld-like boundary as well as with a varying wavenumber $k$.

The key is to isolate the symmetric, Toeplitz block portion of the matrix so that we may use fast transforms to decouple the system further. To do this we will reorder the matrix so that the unknowns in the right and left portion of the PML are ordered last. Under this reordering scheme, $A$ will have the following form:

$$A = \begin{bmatrix} \mathcal{T} + \gamma\bar E & B \\ s_1B^T & G \end{bmatrix}$$

where $\mathcal{T} = (I_{m_w} \otimes T + H \otimes I_{n_1})$, $T = \mathrm{tridiag}(1, -2, 1)$. Also,

$$G = \begin{bmatrix} G_1 & 0 \\ 0 & G_2 \end{bmatrix}, \quad G_1 = I_{m_w} \otimes A_{1,1} + H \otimes I_N, \quad G_2 = I_{m_w} \otimes P_NA_{1,1}P_N + H \otimes I_N$$

and

$$B = [I_{m_w} \otimes R_1,\ I_{m_w} \otimes R_2], \quad \text{with } R_2 = P_{n_1}R_1P_N.$$

Note that $\bar E$ is just the reordered version of $E$ with the last $2Nm_w$ zero rows truncated.

Thus $B$ is a rank $2m_w$ matrix that contains the connections between unknowns in the right and left PML and the unknowns on the left and right boundary of $\Omega$.

We will be interested in solving the following right preconditioned system:

$$AM^{-1}y = g, \quad \text{with } y = Mf.$$

We choose to precondition on the right, rather than the left, simply because the residual for the right preconditioned system is the same as the residual for the unpreconditioned system. We define our preconditioner as

$$M = \begin{bmatrix} \mathcal{T} & B \\ 0 & G \end{bmatrix}.$$

Hence, solving systems involving the preconditioner can be accomplished according to the following algorithm:

Algorithm 1: Solving $Mv = z$

1. Partition $v$ and $z$ into vectors with lengths $m_wn_1$ and $2Nm_w$, respectively: $v = [v^{(1)}, v^{(2)}]^T$, $z = [z^{(1)}, z^{(2)}]^T$.

2. Solve $Gv^{(2)} = z^{(2)}$.

3. Solve $\mathcal{T}v^{(1)} = z^{(1)} - Bv^{(2)} \equiv y$.
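The three steps amount to block back-substitution for a block upper triangular matrix with diagonal blocks $\mathcal{T}$ and $G$ and off-diagonal block $B$. A minimal Python sketch follows, assuming small dense random blocks and `numpy.linalg.solve` as stand-ins for the banded and transform-based solves used in the paper; the function name and test data are hypothetical.

```python
import numpy as np

def solve_with_M(T_block, B, G, z):
    """Solve M v = z for M = [[T_block, B], [0, G]] by block back-substitution."""
    n_top = T_block.shape[0]
    z1, z2 = z[:n_top], z[n_top:]          # step 1: partition
    v2 = np.linalg.solve(G, z2)            # step 2: G v2 = z2
    y = z1 - B @ v2
    v1 = np.linalg.solve(T_block, y)       # step 3: T v1 = y
    return np.concatenate([v1, v2])

# Hypothetical dense test blocks (shifted to keep them well conditioned).
rng = np.random.default_rng(1)
n_top, n_bot = 7, 4
T_block = rng.standard_normal((n_top, n_top)) + 6 * np.eye(n_top)
G = rng.standard_normal((n_bot, n_bot)) + 6 * np.eye(n_bot)
B = rng.standard_normal((n_top, n_bot))
z = rng.standard_normal(n_top + n_bot)
v = solve_with_M(T_block, B, G, z)
```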

Note that $T = SDS$ where $S$ is the normalized discrete sine transform matrix of size $n_1$,

$$S_{kj} = \sqrt{\frac{2}{n_1 + 1}}\,\sin\left(\frac{kj\pi}{n_1 + 1}\right),$$

and $D$ is a diagonal matrix with known entries

$$d_j = -2 + 2\cos(j\pi/(n_1 + 1)), \quad j = 1, \ldots, n_1.$$

Define $Q$ to be the matrix of size $m_wn_1$ which reorders the unknowns as $1, m_w + 1, 2m_w + 1, \ldots, (n_1 - 1)m_w + 1, 2, m_w + 2, \ldots, (n_1 - 1)m_w + 2, \ldots, n_1m_w$. Using the structure of $\mathcal{T}$, step 3 is therefore equivalent to solving the $n_1$ tridiagonal problems

$$(d_jI + H)\tilde v_j^{(1)} = \tilde y_j$$

where $\tilde y = Q(I \otimes S)y$, $v^{(1)} = (I \otimes S)Q^T\tilde v^{(1)}$, and the subscript on $\tilde v^{(1)}$ and $\tilde y$ denotes the $j$th subvector of the respective vector when it is sequentially partitioned into $n_1$ subvectors of length $m_w$.
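The transform-then-solve idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Matlab code: the unknowns are stored as an $m_w \times n_1$ array (one grid row per array row) instead of applying the permutation $Q$ explicitly, `scipy.fft.dst` with `type=1, norm="ortho"` plays the role of $S$, and the tridiagonal matrix $H$ is a random complex stand-in passed by its three diagonals.

```python
import numpy as np
from scipy.fft import dst
from scipy.linalg import solve_banded

def sine_transform(X):
    """Apply the orthonormal DST-I (the symmetric matrix S) along each row of X."""
    X = np.asarray(X, dtype=complex)
    return (dst(X.real, type=1, norm="ortho", axis=1)
            + 1j * dst(X.imag, type=1, norm="ortho", axis=1))

def solve_kron_system(h_sub, h_diag, h_sup, Y):
    """Solve (I (x) T + H (x) I) v = y with T = tridiag(1, -2, 1), H tridiagonal.

    Y has shape (m_w, n_1); row r holds the n_1 unknowns of grid row r.
    """
    m_w, n_1 = Y.shape
    d = -2.0 + 2.0 * np.cos(np.arange(1, n_1 + 1) * np.pi / (n_1 + 1))
    Y_hat = sine_transform(Y)                   # decouple the x direction
    V_hat = np.empty_like(Y_hat)
    ab = np.zeros((3, m_w), dtype=complex)      # banded storage of d_j I + H
    ab[0, 1:] = h_sup
    ab[2, :-1] = h_sub
    for j in range(n_1):                        # n_1 independent tridiagonal solves
        ab[1, :] = h_diag + d[j]
        V_hat[:, j] = solve_banded((1, 1), ab, Y_hat[:, j])
    return sine_transform(V_hat)                # S is its own inverse

# Hypothetical test data: a complex, nonsymmetric tridiagonal H.
rng = np.random.default_rng(2)
m_w, n_1 = 6, 5
h_diag = -2.0 + 0.3 + 0.1j + 0.05 * rng.standard_normal(m_w)
h_sub = 1.0 + 0.1 * rng.standard_normal(m_w - 1)
h_sup = 1.0 + 0.1 * rng.standard_normal(m_w - 1)
Y = rng.standard_normal((m_w, n_1)) + 1j * rng.standard_normal((m_w, n_1))
V = solve_kron_system(h_sub, h_diag, h_sup, Y)
```

The loop over `j` is where parallelism enters: the tridiagonal systems are independent of one another, as are the row-wise transforms.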

Implemented in this way, the cost of solving a system with $M$ is the cost of solving two 1-D problems in step 2, each of size $Nm_w$ and each having bandwidth $N$ (solving with $G$ means solving two uncoupled systems involving $G_1$ and $G_2$ in step 2 above), in addition to the cost of $m_w$ 1-D DSTs followed by $n_1$ tridiagonal solves, for a total of $O(m_wn_1\lg(n_1) + Nm_w)$ operations. To this end, we can prefactor $G_1$ (hence, by definition, $G_2$) and the tridiagonal matrices $d_jI + H$ so that the solution of the 1-D systems only requires forward and backward substitutions at each iteration.

Observe that our preconditioner $M$ and the matrix $A$ differ by a matrix of rank $2m_w + n_rn_c \ll n$. Note also that since our matrix was neither Hermitian nor symmetric, the preconditioner need not have that structure.

4 Eigenanalysis

The iterative method which we employ in our examples is the coupled two-term recurrence version of QMR (quasi-minimal residual) [7] without lookahead. Freund and Nachtigal [6] give the following error bound on the $k$th residual when QMR is applied to the system $AM^{-1}y = g$:

Theorem 1 ([7]) Let $H_m$ be the $m \times m$ matrix generated by the unsymmetric Lanczos algorithm after $m$ steps, and assume that $H_m$ is diagonalizable. Then for $k = 1, 2, \ldots, m - 1$ the residual vectors of the QMR algorithm satisfy

$$\|r_k\|_2 \le \|r_0\|_2\,\kappa(H_k)\,\varepsilon_k\sqrt{k + 1} \qquad (4)$$

where

$$\varepsilon_k = \min_{p(0) = 1}\ \max_{\lambda \in \lambda(AM^{-1})} |p(\lambda)|$$

and $\lambda(Z)$ denotes the set of eigenvalues of a matrix $Z$, $\kappa(\cdot)$ denotes the condition number with respect to the 2-norm, and $p$ is a polynomial of degree $k$.

In particular, since our preconditioner and matrix differ by a matrix of rank $2m_w + n_rn_c < n$, it is easy to show that at least $n - 2m_w - n_rn_c$ eigenvalues of the preconditioned matrix are identically one, and therefore the theorem says that in exact arithmetic, QMR must terminate after at most $2m_w + n_rn_c$ iterations. Therefore, to understand what is happening in the first few iterations, we must focus on characterizing the non-unit eigenvalues of the preconditioned matrix.

The approach we use in this section is based on that of [2], where the authors are interested in the case when $A$ comes from the discretization of the Helmholtz problem with Sommerfeld-like boundary condition in a single layer medium.

Before proceeding, let us set up the notation. We will use $I_j$ to denote the identity matrix of size $j$. As before, we use $\lambda(\cdot)$ to denote the set of eigenvalues of the argument.

Theorem 2 Let $p = 2m_w + n_rn_c$ and note $p < n$. Assume $H$ and $A_{1,1}$ are diagonalizable. Then

$$\mathcal{T}^{-1}(\gamma\bar E - s_1BG^{-1}B^T) = XW,$$

where $X$ is of size $n_1m_w \times p$, $W$ is of size $p \times n_1m_w$, and both have rank $p$.

Because the proof of Theorem 2 is constructive and somewhat tedious, we defer the proof and the precise definition of $X$ and $W$ until §4.1 and turn to the main result of this section.

Theorem 3 The matrix $AM^{-1}$ has at least $n - p$ eigenvalues which are identically 1. Further, the $p$ non-unit eigenvalues are given by $1 + \lambda(WX)$ where $X$ and $W$ are the matrices from Theorem 2.

Proof: By a similarity transform, the eigenvalues of $AM^{-1}$ are the eigenvalues of $M^{-1}A$. Now it can readily be checked that the matrix $M^{-1}$ is given by

$$M^{-1} = \begin{bmatrix} \mathcal{T}^{-1} & -\mathcal{T}^{-1}BG^{-1} \\ 0 & G^{-1} \end{bmatrix}.$$

Therefore

$$M^{-1}A = \begin{bmatrix} I_{n_1m_w} + \mathcal{T}^{-1}(-s_1BG^{-1}B^T + \gamma\bar E) & 0 \\ s_1G^{-1}B^T & I_{2Nm_w} \end{bmatrix}.$$

Since $M^{-1}A$ is block lower triangular, it follows that the eigenvalues of $M^{-1}A$ are the union of the eigenvalues of the blocks on the diagonal [8, Lemma 7.1.1]. Therefore, the set of eigenvalues of $M^{-1}A$ must contain at least $2Nm_w$ ones plus the eigenvalues of $I_{n_1m_w} + \mathcal{T}^{-1}(-s_1BG^{-1}B^T + \gamma\bar E)$. The eigenvalues of the latter are just $1 + \lambda(\mathcal{T}^{-1}(-s_1BG^{-1}B^T + \gamma\bar E))$. But by Theorem 2, $\lambda(\mathcal{T}^{-1}(-s_1BG^{-1}B^T + \gamma\bar E)) = \lambda(XW)$. Finally, by Lemma 1 of [2], the non-zero eigenvalues of $XW$ are precisely the $p$ eigenvalues of $WX$. It follows that $M^{-1}A$ has $n - p$ unit eigenvalues and $p$ non-unit eigenvalues given by $1 + \lambda(WX)$. $\Box$
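The low-rank argument behind this result can be checked numerically on a small dense toy problem. The sketch below is only an illustration: the blocks are random stand-ins with the same block pattern (a diagonal indicator `E` of rank 3, a coupling block `B` of rank 2), and the scalars `gamma` and `s1` are hypothetical values. Under those assumptions the preconditioned matrix differs from the identity by a matrix of rank 5, and its non-unit eigenvalues coincide with one plus eigenvalues of the small top-left block.

```python
import numpy as np

rng = np.random.default_rng(3)
n_top, n_bot = 12, 4
gamma, s1 = 0.7 - 0.2j, 0.9 + 0.1j             # hypothetical scalars

T_block = rng.standard_normal((n_top, n_top)) + 8 * np.eye(n_top)
G = rng.standard_normal((n_bot, n_bot)) + 8 * np.eye(n_bot)
B = np.zeros((n_top, n_bot))
B[0, 1] = 1.0                                  # coupling to the first unknown
B[-1, 2] = 1.0                                 # coupling to the last unknown
E = np.zeros((n_top, n_top))
E[[4, 5, 6], [4, 5, 6]] = 1.0                  # three "object" unknowns

A = np.block([[T_block + gamma * E, B], [s1 * B.T, G]])
M = np.block([[T_block, B], [np.zeros((n_bot, n_top)), G]])

P = A @ np.linalg.inv(M)                       # right-preconditioned matrix
p = np.linalg.matrix_rank(P - np.eye(n_top + n_bot))   # rank(E) + rank(B) = 5
eigs = np.linalg.eigvals(P)
nonunit = eigs[np.abs(eigs - 1.0) > 1e-6]

# Non-unit eigenvalues predicted from the (1,1) block alone.
K = np.linalg.solve(T_block, gamma * E - s1 * B @ np.linalg.solve(G, B.T))
predicted = 1.0 + np.linalg.eigvals(K)
```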

The import of Theorem 3 is that if $WX$ has a simple form, it becomes possible to compute the eigenvalues of the fully 2-D preconditioned matrix in terms of a problem of size $p$, provided the eigenvalues of the $N \times N$ matrix $A_{1,1}$ and the $m_w \times m_w$ matrix $H$ are known or can be computed. As we show in the next section, $WX$ does indeed have a compact representation from which we can compute those $p$ eigenvalues.

4.1 Defining X and W

In order to define $X$ and $W$ we proceed in three steps. Let us assume that $H$ and $A_{1,1}$ are diagonalizable with $H = U\Lambda U^{-1}$ and $A_{1,1} = F\Theta F^{-1}$. Further, let $f_N$ denote the $N$th row of $F$ and let $\tilde f_N$ denote the $N$th column of $F^{-1}$. Let the $j$th component of $f_N$ (resp. $\tilde f_N$) be given by $f_N(j)$ (resp. $\tilde f_N(j)$). The first step is the proof of the following lemma:

Lemma 1 The matrix $BG^{-1}B^T$ can be written

$$(U \otimes I_{n_1})(\Phi \otimes [e_1, e_{n_1}])\left(I_{m_w} \otimes \begin{bmatrix} e_1^T \\ e_{n_1}^T \end{bmatrix}\right)(U^{-1} \otimes I_{n_1}), \qquad (5)$$

where $\Phi$ is a diagonal matrix of size $m_w$ whose $j$th diagonal element is

$$\sum_{i=1}^{N} \frac{f_N(i)\tilde f_N(i)}{\theta_i + \lambda_j}.$$

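Proof: We will let $I$ denote $I_{m_w}$ unless otherwise specified. Recalling the definitions in §3,

$$BG^{-1}B^T = [I \otimes R_1,\ I \otimes R_2]\begin{bmatrix} G_1^{-1} & 0 \\ 0 & G_2^{-1} \end{bmatrix}\begin{bmatrix} I \otimes R_1^T \\ I \otimes R_2^T \end{bmatrix} = (I \otimes R_1)G_1^{-1}(I \otimes R_1^T) + (I \otimes R_2)G_2^{-1}(I \otimes R_2^T). \qquad (6)$$

Now using the eigendecomposition of $H$ and the relation of $G_1$ to $G_2$ specified in §3, it follows that

$$G_1^{-1} = (U \otimes I_N)(I_{m_w} \otimes A_{1,1} + \Lambda \otimes I_N)^{-1}(U^{-1} \otimes I_N)$$

$$G_2^{-1} = (U \otimes P_N)(I \otimes A_{1,1} + \Lambda \otimes I)^{-1}(U^{-1} \otimes P_N^T).$$

Substituting the above equations into (6) and using the relation $R_2P_N = P_{n_1}R_1$ and the eigendecomposition of $A_{1,1}$ we have

$$BG^{-1}B^T = (U \otimes I_{n_1})\left[(I \otimes R_1F)(I \otimes \Theta + \Lambda \otimes I)^{-1}(I \otimes F^{-1}R_1^T) + (I \otimes PR_1F)(I \otimes \Theta + \Lambda \otimes I)^{-1}(I \otimes F^{-1}R_1^TP^T)\right](U^{-1} \otimes I_{n_1}) = (U \otimes I_{n_1})Z(U^{-1} \otimes I_{n_1}). \qquad (7)$$

Now clearly $R_1 = e_1e_N^T$ where the first unit vector is understood to have length $n_1$ and $e_N$ is the $N$-length unit vector with a 1 in the $N$th position. Thus, $R_1F = e_1f_N$ and $F^{-1}e_Ne_1^T = \tilde f_Ne_1^T$. By brute force one can readily show that the first term in $Z$, $(I \otimes e_1f_N)(I \otimes \Theta + \Lambda \otimes I)^{-1}(I \otimes \tilde f_Ne_1^T)$, is a block diagonal matrix with $n_1 \times n_1$ blocks, and in the $j$th block, only the $(1,1)$ element, given by $\sum_{i=1}^{N} \frac{f_N(i)\tilde f_N(i)}{\theta_i + \lambda_j}$, is non-zero. Similarly, the second term in $Z$, $(I \otimes Pe_1f_N)(I \otimes \Theta + \Lambda \otimes I)^{-1}(I \otimes \tilde f_Ne_1^TP^T)$, is a block diagonal matrix with $n_1 \times n_1$ blocks, and in the $j$th block, only the $(n_1, n_1)$ element, also given by $\sum_{i=1}^{N} \frac{f_N(i)\tilde f_N(i)}{\theta_i + \lambda_j}$, is non-zero.

Therefore,

$$Z = \Phi \otimes [e_1, e_{n_1}]\begin{bmatrix} e_1^T \\ e_{n_1}^T \end{bmatrix} = (\Phi \otimes [e_1, e_{n_1}])\left(I_{m_w} \otimes \begin{bmatrix} e_1^T \\ e_{n_1}^T \end{bmatrix}\right)$$

and substituting this expression into (7), the proof is complete. $\Box$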

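For the second step, we need to get an expression for $\gamma\bar E - s_1BG^{-1}B^T$. To do this, we note that $\bar E$ can be written in the following tensor form

$$\bar E = \begin{bmatrix} 0 \\ I_{n_r} \\ 0 \end{bmatrix}[0, I_{n_r}, 0] \otimes \begin{bmatrix} 0 \\ I_{n_c} \\ 0 \end{bmatrix}[0, I_{n_c}, 0].$$

Now using (5) and this expression for $\bar E$, we pull terms involving $U$ outside of the sum to obtain

$$\gamma\bar E - s_1BG^{-1}B^T = (U \otimes I_{n_1})\tilde X\tilde W(U^{-1} \otimes I_{n_1})$$

where $\tilde X$ and $\tilde W^T$ are the $m_wn_1 \times (n_rn_c + 2m_w)$ matrices

$$\tilde X = \left[\gamma\,U^{-1}\begin{bmatrix} 0 \\ I_{n_r} \\ 0 \end{bmatrix} \otimes \begin{bmatrix} 0 \\ I_{n_c} \\ 0 \end{bmatrix},\ \ -s_1\Phi \otimes [e_1, e_{n_1}]\right]$$

$$\tilde W = \begin{bmatrix} [0, I_{n_r}, 0]U \otimes [0, I_{n_c}, 0] \\[4pt] I \otimes \begin{bmatrix} e_1^T \\ e_{n_1}^T \end{bmatrix} \end{bmatrix}.$$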

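For the third and final step, consider $\mathcal{T}^{-1}(U \otimes I_{n_1})\tilde X$. Recall $\mathcal{T} = (I \otimes T + H \otimes I)$, so that we have

$$\mathcal{T}^{-1}(U \otimes I_{n_1})\tilde X = (U \otimes I)(I \otimes T + \Lambda \otimes I)^{-1}\tilde X.$$

Thus, we may define

$$X \equiv (U \otimes I)(I \otimes T + \Lambda \otimes I)^{-1}\tilde X$$

and

$$W \equiv \tilde W(U^{-1} \otimes I_{n_1})$$

and the proof of Theorem 2 is complete.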

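Now by Theorem 3 we are interested in the $p$ eigenvalues of $WX$. But these are the eigenvalues of $\tilde W(I \otimes T + \Lambda \otimes I)^{-1}\tilde X$. Fortunately, using the fact that $T = SDS$, the entries of this matrix are not difficult to compute; indeed, the matrix $C \equiv \tilde W(I \otimes T + \Lambda \otimes I)^{-1}\tilde X$ has the structure

$$C = \begin{bmatrix} C_1 & C_2 \\ C_3 & C_4 \end{bmatrix}$$

where $C_4$ has dimension $2m_w$ and is block diagonal with $2 \times 2$ blocks on the diagonal and $C_1$ has dimension $n_rn_c$. Specifically, the sub-blocks are defined as follows. Let $\mathbf{s}_1, \mathbf{s}_{n_1}$ denote the first and last columns of the size $n_1$ normalized discrete sine transform matrix $S$. Let $\tilde U$ represent the matrix $[0, I_{n_r}, 0]U$, let $\hat U = U^{-1}([0, I_{n_r}, 0]^T)$, and put $\tilde S = [0, I_{n_c}, 0]S$. Then

$$C_1 = \gamma(\tilde U \otimes \tilde S)(I_{m_w} \otimes D + \Lambda \otimes I_{n_1})^{-1}(\hat U \otimes \tilde S^T),$$

$$C_2 = -s_1(\tilde U \otimes \tilde S)(I_{m_w} \otimes D + \Lambda \otimes I_{n_1})^{-1}(\Phi \otimes [\mathbf{s}_1, \mathbf{s}_{n_1}]),$$

$$C_3 = \gamma\left(I_{m_w} \otimes \begin{bmatrix} \mathbf{s}_1^T \\ \mathbf{s}_{n_1}^T \end{bmatrix}\right)(I_{m_w} \otimes D + \Lambda \otimes I_{n_1})^{-1}(\hat U \otimes \tilde S^T).$$

The $j$th block of $C_4$ is defined as

$$-s_1\phi_j\begin{bmatrix} \sum_{i=1}^{n_1} \dfrac{\mathbf{s}_1(i)^2}{d_i + \lambda_j} & \sum_{i=1}^{n_1} \dfrac{\mathbf{s}_1(i)\mathbf{s}_{n_1}(i)}{d_i + \lambda_j} \\[8pt] \sum_{i=1}^{n_1} \dfrac{\mathbf{s}_1(i)\mathbf{s}_{n_1}(i)}{d_i + \lambda_j} & \sum_{i=1}^{n_1} \dfrac{\mathbf{s}_{n_1}(i)^2}{d_i + \lambda_j} \end{bmatrix},$$

where $\phi_j$ is the $j$th diagonal element of $\Phi$.

Fortunately, upon close examination, it becomes clear that it is possible to construct the entries of the sub-blocks in no more flops than it takes to find the eigenvalues of $H$ and the matrix $C$. Hence, rather than directly calculating the eigenvalues of the $m_w^2 \times m_w^2$ matrix $AM^{-1}$ at a cost of $O((m_w^2)^3)$ flops for the 2-D preconditioned matrix, we can equivalently find the eigenvalues of $C$ in $O(m_w^3)$ flops, assuming $n_r, n_c$ are sufficiently small relative to $n_1$.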

5 Numerical Results

In this section, we give results for the performance of our preconditioner for two different examples. All computations were run in Matlab using IEEE double precision floating point arithmetic. In both examples, we will assume a rectangular landmine of dimension 5 cm by 6 cm filled with TNT is buried so that the top of the landmine is 3 cm below the surface. The landmine was centered in the grid and a plane wave was assumed to be incident at 0 degrees (refer to Figure 1); from this information, $e_0$ at grid points over the mine could be calculated and thus used to form the right hand side vectors $g$ (refer to (3)). For the landmine, at the frequencies at which we were working, we had $\epsilon_{rel} = 2.9$, $\sigma = .0005$. We primarily took our initial guess $f_0$ to be the vector of all zeros; however, for comparison purposes, average iteration counts obtained by setting $f_0$ to be a vector with real and imaginary parts consisting of uniformly distributed random numbers in $[-\frac{1}{2}, \frac{1}{2}]$ are given in the tables. We stopped iterating QMR when the relative residual norm, $\|g - Af_k\|_2/\|g\|_2$, where $f_k = M^{-1}y_k$, was less than $10^{-7}$.

For a given soil type, permittivity $\epsilon_{rel}$ and conductivity $\sigma$ of the soil vary with frequency in a complicated manner beyond the scope of this paper; therefore, in this work we do not attempt to judge the performance of the preconditioner as a function of frequency, but leave this as a topic for future research. Moreover, we note that to be efficient, the PML conductivity profile must change as a function of sampling rate; yet how this profile changes is not well understood. The values we have used give good performance for $h$ at 10-50 points per wavelength, which is sufficient sampling for the class of scattering problems in which we are interested.

5.1 Example 1

The soil we modeled in this example was Puerto Rican clay loam. Thus, for the soil, $\epsilon_{rel} = 6.5$ and $\sigma = .019$, which are the estimated values for this soil type when $\omega = (2\pi)480$ MHz [9]. One must sample all the media at a rate of at least 10 points per wavelength; that is, we need $h \le \min_{i \in \{\text{air}, \text{soil}, \text{mine}\}} \lambda_i/10$. We conducted two sets of tests, one with $h = \lambda_{soil}/10$ and one with $h = \lambda_{soil}/20$. These values for $h$ ensured that soil, air and the mine were all sampled at a rate of at least 10 (resp. 20) points per wavelength, or roughly at 2.45 cm and 1.23 cm increments, respectively. We used 8 layers of PML, so that $N = 8$. We then calculated results for 4 different grid sizes $m_w$, $m_w = 2^q - 1 + 2N$ for $q = 6, 7, 8, 9$. The convergence curves for QMR using our preconditioner are given in Figures 2 and 3. The non-unit eigenvalues of $AM^{-1}$ were calculated using the method described in §4, and are displayed in Figures 4 and 5.
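The mesh widths and grid sizes quoted for this example can be reproduced with a short calculation. The sketch below assumes the wavelength in the soil is taken as $2\pi/\mathrm{Re}(k)$ with $k = \omega\sqrt{\mu_0\epsilon}$ and $\epsilon = \epsilon_0\epsilon_{rel} + i\sigma/\omega$; the text does not spell out this convention, so it is an assumption, as are the function and variable names.

```python
import numpy as np

EPS0 = 8.8541878128e-12      # permittivity of free space (F/m)
MU0 = 4e-7 * np.pi           # permeability of free space (H/m)

def wavelength(eps_rel, sigma, freq_hz):
    """Wavelength 2*pi / Re(k), k = omega * sqrt(mu0 * (eps0*eps_rel + i*sigma/omega))."""
    omega = 2 * np.pi * freq_hz
    eps = EPS0 * eps_rel + 1j * sigma / omega
    k = omega * np.sqrt(MU0 * eps)
    return 2 * np.pi / k.real

lam_soil = wavelength(6.5, 0.019, 480e6)
h_10, h_20 = lam_soil / 10, lam_soil / 20        # 10 and 20 points per wavelength

N = 8
grid_sizes = [2**q - 1 + 2 * N for q in (6, 7, 8, 9)]   # m_w for each q
unknowns = [m_w**2 for m_w in grid_sizes]               # n = m_w^2
```

Under that convention `h_10` and `h_20` come out close to the 2.45 cm and 1.23 cm increments quoted above, and the largest grid ($q = 9$) has more than a quarter of a million unknowns.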

In Table II for $f_0 = 0$ we summarize the convergence results for both values of $h$ and all four mesh sizes. For comparison, the average number of iterations for convergence when $f_0$ was initialized as random are also given in the table. The averages were computed from iteration counts obtained for five different initial guesses, and in most cases all counts were within two of the average. Note from the table and the figures that with $f_0 = 0$, the convergence behavior does not appear to deteriorate much with an increase in mesh size, even though the rank of $I - AM^{-1}$ (refer to §4) is rapidly increasing. The convergence behavior for $f_0 = 0$ is also little affected by a decrease in $h$. The numbers in the table indicate that if the initial guess is random, convergence behavior is somewhat more sensitive to changes in $h$ and mesh size, yet the number of iterations is still much smaller than the rank of $I - AM^{-1}$. Choosing $f_0 = 0$ consistently resulted in significantly fewer iterations: we believe this behavior is related to the underlying smoothness and structure of the solution.

In light of Theorem 1, a look at the distribution of the eigenvalues as $q$ varies helps explain this behavior. We observe that a majority of the non-unit eigenvalues displayed in the figures have real part clustered between .6 and 1 and imaginary part nearly zero, while the relatively small number of the remaining eigenvalues tend to lie on or near smooth curves in the plane. For example, for $q = 6$ at 20 points per wavelength, there are 46 eigenvalues with real part $< .6$ out of 194 total non-unit eigenvalues; for $q = 7, 8, 9$ there are 66, 111, and 199, respectively, out of 322, 578, and 1090 non-unit eigenvalues. Thus, a decreasing percentage of the non-unit eigenvalues fall outside this range. Moreover, those outside the range also tend to cluster (e.g., in Figure 5 for large $q$, those with imaginary part around .5 and real part around .35). Also, not obvious from the pictures, many of the non-unit eigenvalues appear to have algebraic multiplicity 2.
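The reported counts can be cross-checked against the rank bound $p = 2m_w + n_rn_c$ with a few lines of arithmetic. In the sketch below, the eigenvalue counts are the ones quoted in the text and $N = 8$ as in this example; the implied value of $n_rn_c$ is inferred from those counts and is not stated in the paper.

```python
# q: (eigenvalues with real part < .6, total non-unit eigenvalues), as reported
reported = {
    6: (46, 194),
    7: (66, 322),
    8: (111, 578),
    9: (199, 1090),
}

N = 8
rows = []
for q, (outside, total) in reported.items():
    m_w = 2**q - 1 + 2 * N
    implied_object_unknowns = total - 2 * m_w   # n_r * n_c if total = 2 m_w + n_r n_c
    rows.append((q, m_w, implied_object_unknowns, outside / total))
```

Each row yields the same implied number of object unknowns, consistent with a fixed mine size at a fixed sampling rate, and the fraction of eigenvalues with real part below .6 shrinks as $q$ grows.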

From Theorem 1, one can obtain fast convergence if there is a low degree polynomial with $p(0) = 1$ which has a small maximum absolute value when evaluated over all the eigenvalues of the preconditioned matrix. By the preceding discussion, for each value of $q$, one can easily imagine a low degree polynomial, say with roots taken as the distinct "outlying" eigenvalues plus a few of those eigenvalues with near-zero imaginary part and real part in the range .6 to 1.1, which is small over all the eigenvalues of $AM^{-1}$. Hence, the $\varepsilon_k$ term will be small in (4). As an example, consider the eigenvalues for the case with $q = 9$ and sampling at 20 points per wavelength, given by the '+'s in Figure 6, and let the circles in the figure be the roots of a $k = 56$ degree polynomial with $p(1) = 0$. The maximum over all the eigenvalues of $AM^{-1}$ is then $1.22 \times 10^{-5}$, indicating from the theorem that we expect the residual to be fairly small after 56 iterations even though there are more than a quarter of a million unknowns. Clearly, by changing the roots, this upper bound could be decreased; however, this particular choice illustrates our point.
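The polynomial argument can be illustrated on synthetic data. In the sketch below the spectrum is hypothetical (a unit eigenvalue, a tight cluster near 0.8 and three outliers) and is not taken from the paper's figures; the polynomial is written as $p(z) = \prod_i (1 - z/r_i)$ so that $p(0) = 1$ holds by construction, and its roots are placed at 1, at the outliers, and inside the cluster.

```python
import numpy as np

def residual_polynomial(z, roots):
    """Evaluate p(z) = prod_i (1 - z / r_i), which satisfies p(0) = 1."""
    z = np.asarray(z, dtype=complex)
    return np.prod(1.0 - z[..., None] / np.asarray(roots, dtype=complex), axis=-1)

# Synthetic spectrum (hypothetical): unit eigenvalue, a tight cluster, a few outliers.
rng = np.random.default_rng(4)
radius = 0.05 * np.sqrt(rng.uniform(size=200))
angle = rng.uniform(0.0, 2.0 * np.pi, size=200)
cluster = 0.8 + radius * np.exp(1j * angle)
outliers = np.array([0.3 + 0.4j, 0.35 + 0.5j, 0.2 - 0.1j])
spectrum = np.concatenate([[1.0], cluster, outliers])

# Degree-7 polynomial: roots at 1, at each outlier, and three in the cluster.
roots = np.concatenate([[1.0], outliers, [0.8, 0.8, 0.8]])
bound = np.max(np.abs(residual_polynomial(spectrum, roots)))
```

Even with this low degree, `bound` is small because each root placed inside the cluster contributes a factor proportional to the cluster radius, which is the same effect the eigenvalue plots are used to argue for.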

5.2 Example 2

In this example, we use a different soil type (referred to as "Seabee" in the literature [12]) with $\epsilon_{rel} = 21.3078$ and $\sigma = .2273$ at a slightly lower frequency, $\omega = (2\pi)475$ MHz. We calculated results for two sets of experiments, one with $h = 1.34$ cm, or $\lambda_{soil}/10$, and the other with $h = .671$ cm, or $\lambda_{soil}/20$. Our computations were done for the same 4 different grid sizes as in the previous example. A comparison of the convergence curves for the four different grid sizes for the larger $h$ is given in Figure 7 and the corresponding non-unit eigenvalues are displayed in Figure 9.
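As a sanity check on the two grid spacings, the wavelength in the soil can be estimated from the stated parameters. The sketch below assumes the standard wavenumber of a lossy, nonmagnetic medium, $k^2 = \omega^2 \mu_0 \epsilon (1 - i\sigma/(\omega\epsilon))$, and assumes the conductivity is given in S/m; neither assumption is spelled out in this section, and the function name `soil_wavelength` is hypothetical.

```python
import cmath
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU0 = 4.0e-7 * math.pi    # vacuum permeability, H/m (nonmagnetic soil assumed)


def soil_wavelength(eps_rel, sigma, freq_hz):
    """Wavelength in metres in a lossy medium, taken from the real part of k."""
    omega = 2.0 * math.pi * freq_hz
    eps = EPS0 * eps_rel
    # Complex wavenumber; the sign of the loss term does not affect Re(k).
    k = omega * cmath.sqrt(MU0 * eps * (1.0 - 1j * sigma / (omega * eps)))
    return 2.0 * math.pi / abs(k.real)


lam = soil_wavelength(eps_rel=21.3078, sigma=0.2273, freq_hz=475e6)
h10_cm = 100.0 * lam / 10.0   # grid spacing at 10 points per wavelength, in cm
h20_cm = 100.0 * lam / 20.0   # grid spacing at 20 points per wavelength, in cm
print(f"wavelength = {100.0 * lam:.2f} cm, h10 = {h10_cm:.3f} cm, h20 = {h20_cm:.3f} cm")
```

Under these assumptions the computed spacings agree with the values of $h$ quoted above to the precision given in the text.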

The convergence behavior for this example is summarized in Table III. Again, we observe that increasing the grid size does not appear to degrade the performance of the preconditioner when $f_0 = 0$; when $f_0$ was randomly chosen, there was some sensitivity. Additionally, decreasing $h$ by half seems to have only a very mild effect on the rate of convergence, with the effect being slightly more pronounced in the case of a random starting guess. As before, this convergence behavior can be analyzed by looking at the non-unit eigenvalues for this example. We again see that a majority of the non-unit eigenvalues for each value of $q$ and for both sampling rates are clustered between .6 and 1.1 on the real axis and near 0 on the imaginary axis. There are other smaller clusters of eigenvalues away from $(0, 0)$, and everything else lies on or near a continuous curve in the plane. Therefore, as in Example 1, we deduce that a low-degree polynomial with a judicious choice of roots taken from among those in the figure(s) can be found which has small maximum magnitude over all the eigenvalues; hence the bound in Theorem 1 will be small in many fewer iterations than $2m_w + n_r n_c$, the rank of $I - AM^{-1}$.
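The rank formula can be cross-checked against the "rank" columns of Tables II and III. The sketch below subtracts $2m_w$ from each tabulated rank; if the formula holds, the remainder $n_r n_c$ should be the same for every grid size within one experiment. The remainders are inferred here from the tables rather than stated in this section, and the names `table_rows` and `object_gridpoints` are illustrative.

```python
import math

# (number of unknowns m_w^2, rank of I - A M^{-1}) as listed in Tables II and III
table_rows = {
    "Example 1, h = 2.45 cm": [(6241, 174), (20449, 302), (73441, 558), (277729, 1070)],
    "Example 1, h = 1.23 cm": [(6241, 194), (20449, 322), (73441, 578), (277729, 1090)],
    "Example 2, h = 1.34 cm": [(6241, 188), (20449, 316), (73441, 572), (277729, 1084)],
    "Example 2, h = 0.671 cm": [(6241, 248), (20449, 376), (73441, 632), (277729, 1144)],
}


def object_gridpoints(unknowns, rank):
    """Return rank - 2 * m_w, which should equal n_r * n_c if the formula holds."""
    m_w = math.isqrt(unknowns)
    if m_w * m_w != unknowns:
        raise ValueError("number of unknowns is not a perfect square")
    return rank - 2 * m_w


# For each experiment, collect the distinct implied values of n_r * n_c.
implied = {
    case: {object_gridpoints(n, r) for n, r in rows}
    for case, rows in table_rows.items()
}

for case, values in implied.items():
    print(case, "-> implied n_r * n_c:", sorted(values))
```

Each experiment yields a single implied value of $n_r n_c$ across all four grid sizes, which is consistent with a buried object of fixed size on grids of growing extent.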

6 Conclusions and Future Work

We developed an effective preconditioner for solving the Helmholtz-type problem with piecewise-constant complex wavenumber with PML boundary on a rectangular grid. Additionally, we gave an efficient method for applying the preconditioner based on 1-D discrete sine transforms and 1-D band solvers. For the case of a square grid, the cost of applying our preconditioner is $O(m_w^2 \lg(m_w + 1))$ when $m_w + 1$ is a power of 2. We showed that our preconditioned matrix is a perturbation of the identity of low rank, say $k$, guaranteeing convergence in as many iterations in exact arithmetic. We illustrated a technique for determining the $k$ non-unit eigenvalues of the preconditioned matrix based on solving a few 1-D eigenproblems, thereby making it computationally feasible to analyze the convergence behavior of the 2-D problem. We applied the technique to our examples and found that a significant number of those eigenvalues are still clustered in such a way as to ensure convergence in even fewer iterations than the theorem guarantees. Indeed, the behavior anticipated by the eigenanalysis was illustrated in our convergence results.
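The combination of 1-D sine transforms and 1-D band solves can be illustrated on a simplified model problem. The sketch below is not the preconditioner of this paper; it solves $TU + UB^T = F$, where $T$ is a symmetric tridiagonal Toeplitz matrix (diagonalized by the discrete sine transform) and $B$ is a general tridiagonal matrix. For readability the transform is applied as a dense matrix product rather than a fast $O(n \lg n)$ transform, and all names and test data are hypothetical.

```python
import math


def dst_matrix(n):
    """Orthonormal DST-I matrix S; it is symmetric and satisfies S S = I."""
    c = math.sqrt(2.0 / (n + 1))
    return [[c * math.sin(math.pi * (j + 1) * (k + 1) / (n + 1))
             for k in range(n)] for j in range(n)]


def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm without pivoting (assumes a well-conditioned band)."""
    d, r = list(diag), list(rhs)
    for i in range(1, len(d)):
        w = lower[i - 1] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = [0j] * len(d)
    x[-1] = r[-1] / d[-1]
    for i in range(len(d) - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def apply_operator(t_diag, t_off, b_lower, b_diag, b_upper, U):
    """Return T U + U B^T with T = tridiag(t_off, t_diag, t_off), B tridiagonal."""
    n, m = len(U), len(U[0])
    out = [[0j] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            val = (t_diag + b_diag[j]) * U[i][j]
            if i > 0:
                val += t_off * U[i - 1][j]
            if i < n - 1:
                val += t_off * U[i + 1][j]
            if j > 0:
                val += b_lower[j - 1] * U[i][j - 1]
            if j < m - 1:
                val += b_upper[j] * U[i][j + 1]
            out[i][j] = val
    return out


def sine_transform_solve(t_diag, t_off, b_lower, b_diag, b_upper, F):
    """Solve T U + U B^T = F with one sine transform pair and n band solves."""
    n = len(F)
    S = dst_matrix(n)
    G = matmul(S, F)                       # sine transform of every column of F
    V = []
    for j in range(n):                     # the n band solves are independent
        lam = t_diag + 2.0 * t_off * math.cos(math.pi * (j + 1) / (n + 1))
        shifted = [b + lam for b in b_diag]
        V.append(solve_tridiagonal(b_lower, shifted, b_upper, G[j]))
    return matmul(S, V)                    # S is its own inverse


# Hypothetical, diagonally dominant test data with complex entries.
n, m = 7, 5
t_diag, t_off = 4.0 - 0.5j, -1.0
b_lower = [-1.0] * (m - 1)
b_upper = [-0.8] * (m - 1)
b_diag = [4.0 + 0.1j * j for j in range(m)]
U_true = [[complex(i + 1, j - 2) for j in range(m)] for i in range(n)]
F = apply_operator(t_diag, t_off, b_lower, b_diag, b_upper, U_true)
U = sine_transform_solve(t_diag, t_off, b_lower, b_diag, b_upper, F)
max_error = max(abs(U[i][j] - U_true[i][j]) for i in range(n) for j in range(m))
print(f"max error = {max_error:.2e}")
```

Because the band solves decouple after the transform, they can be carried out independently of one another, which is the property that makes this kind of preconditioner application amenable to parallel execution.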

We note that our preconditioner is also efficient for iterative methods such as BLQMR (block QMR) [5] for solving 2-D problems involving multiple right-hand sides, which arise when multiple incidence angles are used. In the future, we plan to examine preconditioners for the 3-D forward scattering problem, which is also derived from the time-harmonic form of Maxwell's equations but has a more complicated mathematical formulation than a 3-D Helmholtz-type problem.

Acknowledgments. We wish to thank Howard Elman and Dianne O'Leary for their helpful

comments on an early draft of this paper.

References

[1] J. Berenger, A perfectly matched layer for the absorption of electromagnetic waves, J. Comp. Phys., 114 (1994), pp. 185-200.

[2] H. Elman and D. O'Leary, Eigenanalysis of some preconditioned Helmholtz problems, Numer. Math., to appear.

[3] H. Elman and D. O'Leary, Efficient iterative solution of the three-dimensional Helmholtz equation, J. Comp. Phys., 142 (1998), pp. 163-181.

[4] O. Ernst and G. Golub, A domain decomposition approach to solving the Helmholtz equation with a radiation boundary condition, Contemporary Mathematics, 157 (1994), pp. 177-192.

[5] R. Freund and M. Malhotra, A block-QMR algorithm for non-Hermitian linear systems with multiple right hand sides, Linear Algebra Appl., 254 (1997), pp. 197-257.

[6] R. Freund and N. Nachtigal, QMR: A quasi-minimal residual method for non-Hermitian linear systems, Numer. Math., 60 (1991), pp. 315-339.

[7] R. Freund and N. Nachtigal, An implementation of the QMR method based on coupled two-term recurrences, SIAM J. Sci. Comput., 15 (1994), p. 313.

[8] G. Golub and C. Van Loan, Matrix Computations, second ed., Johns Hopkins Press, 1989.

[9] J. Hipp, Soil electromagnetic parameters as functions of frequency, soil density, and soil moisture, Proceedings of the IEEE, 62 (1974), pp. 98-103.

[10] E. Marengo, C. Rappaport, and E. Miller, Optimum PML ABC conductivity profile in FDFD, IEEE Trans. Magn., (1999), to appear.

[11] C. Rappaport, Interpreting and improving the PML absorbing boundary condition using anisotropic lossy mapping of space, IEEE Trans. Magn., 32 (1996), pp. 968-974.

[12] E. M. Rosen and T. W. Altshuler, Analysis of UXO and clutter signatures from the DARPA background clutter experiment, in UXO Forum '98, May 1998.

[13] D. H. Staelin, A. Morgenthaler, and J. A. Kong, Electromagnetic Waves, Prentice Hall, 1994.

[Figure 1 diagram. Labels: axes -x, +x, -y, +y; Ω, Ο, γ, a1, b1; "air PML" and "soil PML"; c(x) = 1, c(y) = 1, e = 0; "c(x), c(y) non-constant".]

Figure 1: Illustration of problem setup. denotes the whole of the larger square, while denotes

the region inside the inner square only.

$n_a$    no. gridpoints in vertical direction in air
$n_s$    no. gridpoints in vertical direction in soil
$n_r$    no. gridpoints in vertical direction of buried object
$n_c$    no. gridpoints in horizontal direction of buried object
$n_1$    no. gridpoints on the horizontal interval $[-a_1, a_1]$
$N$      no. PML layers
$m_w$    total no. gridpoints in horizontal (vertical) direction $= n_1 + 2N$
$n$      total no. of unknowns $= m_w^2$

Table I: Summary of variables.

[Plot "Example 1, 10 ppw": relative residual norm ($10^{-8}$ to $10^{0}$, log scale) versus iterations (0 to 45). Solid: $q = 6$; dashed: $q = 7$; dash-dotted: $q = 8$; dotted: $q = 9$.]

Figure 2: Relative residual norm per iteration for case $h = 2.45$ cm and $q = 6, 7, 8, 9$, Example 1.

[Plot "Example 1, 20 ppw": relative residual norm ($10^{-8}$ to $10^{0}$, log scale) versus iteration (0 to 40). Solid: $q = 6$; dashed: $q = 7$; dash-dotted: $q = 8$; dotted: $q = 9$.]

Figure 3: Relative residual norm per iteration for case $h = 1.23$ cm and $q = 6, 7, 8, 9$, Example 1.

                     $h = 2.45$ cm                        $h = 1.23$ cm
$m_w^2$ unknowns     $f_0 = 0$   $f_0$ random   rank      $f_0 = 0$   $f_0$ random   rank
6,241                31          42.8           174       35          51.8           194
20,449               30          48             302       34          56             322
73,441               32          57.4           558       33          65.2           578
277,729              41          71             1070      39          76.2           1090

Table II: Number of iterations (with starting guess $f_0 = 0$) and average number of iterations (with random starting guess) until convergence for varying number of unknowns and two different values of $h$, Example 1. Columns headed by "rank" give the corresponding rank of $I - AM^{-1}$, the upper bound on the number of iterations until convergence in exact arithmetic.

[Four scatter plots, panels $q = 6$, $q = 7$, $q = 8$, $q = 9$; horizontal axis "real" (0 to 2.5), vertical axis "imag" (-0.5 to 1).]

Figure 4: Non-unit eigenvalues of $AM^{-1}$, case $h = 2.45$ cm and $q = 6, 7, 8, 9$, Example 1.

[Four scatter plots, panels $q = 6$, $q = 7$, $q = 8$, $q = 9$; horizontal axis "real" (0 to 2.5), vertical axis "imag" (-0.5 to 1).]

Figure 5: Non-unit eigenvalues of $AM^{-1}$, case $h = 1.23$ cm and $q = 6, 7, 8, 9$, Example 1.

[Scatter plot in the complex plane; horizontal axis from -0.4 to 1.2, vertical axis from -0.4 to 0.8.]

Figure 6: x's represent non-unit eigenvalues of $AM^{-1}$, case $h = 1.23$ cm and $q = 9$, Example 1, and o's represent those eigenvalues selected to serve as roots of a polynomial.

[Plot "Example 2, 10 ppw": relative residual norm ($10^{-8}$ to $10^{0}$, log scale) versus iteration (0 to 30). Solid: $q = 6$; dashed: $q = 7$; dash-dotted: $q = 8$; dotted: $q = 9$.]

Figure 7: Relative residual norm per iteration for case $h = 1.34$ cm and $q = 6, 7, 8, 9$, Example 2.

[Plot "Example 2, 20 ppw": relative residual norm ($10^{-8}$ to $10^{0}$, log scale) versus iteration (0 to 35). Solid: $q = 6$; dashed: $q = 7$; dash-dotted: $q = 8$; dotted: $q = 9$.]

Figure 8: Relative residual norm per iteration for case $h = 0.671$ cm and $q = 6, 7, 8, 9$, Example 2.

[Four scatter plots, panels $q = 6$, $q = 7$, $q = 8$, $q = 9$; horizontal axis "real" (-0.5 to 1), vertical axis "imag" (-0.8 to 0.6).]

Figure 9: Non-unit eigenvalues of $AM^{-1}$, case $h = 1.34$ cm and $q = 6, 7, 8, 9$, Example 2.

[Four scatter plots, panels $q = 6$, $q = 7$, $q = 8$, $q = 9$; horizontal axis "real" (-0.5 to 1), vertical axis "imag" (-0.8 to 0.6).]

Figure 10: Non-unit eigenvalues of $AM^{-1}$, case $h = 0.671$ cm and $q = 6, 7, 8, 9$, Example 2.

                     $h = 1.34$ cm                        $h = 0.671$ cm
$m_w^2$ unknowns     $f_0 = 0$   $f_0$ random   rank      $f_0 = 0$   $f_0$ random   rank
6,241                26          55.6           188       32          61.6           248
20,449               27          61.2           316       30          71.4           376
73,441               27          74.8           572       33          82.6           632
277,729              26          88             1084      34          104.6          1144

Table III: Number of iterations (with starting guess $f_0 = 0$) and average number of iterations (with random starting guess) until convergence for varying number of unknowns and two different values of $h$, Example 2. Columns headed by "rank" give the corresponding rank of $I - AM^{-1}$, the upper bound on the number of iterations until convergence in exact arithmetic.