sebastien loisel´ schwarzand2 … · × 2 block upper triangular matrix with the identity matrix...

29
Heriot-Watt University Research Gateway Optimized Schwarz and 2-Lagrange Multiplier Methods for Multiscale Elliptic PDEs Citation for published version: Loisel, S, Nguyen, TH & Scheichl, R 2015, 'Optimized Schwarz and 2-Lagrange Multiplier Methods for Multiscale Elliptic PDEs', SIAM Journal on Scientific Computing, vol. 37, no. 6, pp. A2896–A2923. https://doi.org/10.1137/15M1009676 Digital Object Identifier (DOI): 10.1137/15M1009676 Link: Link to publication record in Heriot-Watt Research Portal Document Version: Publisher's PDF, also known as Version of record Published In: SIAM Journal on Scientific Computing General rights Copyright for the publications made accessible via Heriot-Watt Research Portal is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy Heriot-Watt University has made every reasonable effort to ensure that the content in Heriot-Watt Research Portal complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Download date: 28. Jul. 2021

Upload: others

Post on 01-Mar-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Heriot-Watt University Research Gateway

Optimized Schwarz and 2-Lagrange Multiplier Methods forMultiscale Elliptic PDEs

Citation for published version:Loisel, S, Nguyen, TH & Scheichl, R 2015, 'Optimized Schwarz and 2-Lagrange Multiplier Methods forMultiscale Elliptic PDEs', SIAM Journal on Scientific Computing, vol. 37, no. 6, pp. A2896–A2923.https://doi.org/10.1137/15M1009676

Digital Object Identifier (DOI):10.1137/15M1009676

Link:Link to publication record in Heriot-Watt Research Portal

Document Version:Publisher's PDF, also known as Version of record

Published In:SIAM Journal on Scientific Computing

General rightsCopyright for the publications made accessible via Heriot-Watt Research Portal is retained by the author(s) and /or other copyright owners and it is a condition of accessing these publications that users recognise and abide bythe legal requirements associated with these rights.

Take down policyHeriot-Watt University has made every reasonable effort to ensure that the content in Heriot-Watt ResearchPortal complies with UK legislation. If you believe that the public display of this file breaches copyright pleasecontact [email protected] providing details, and we will remove access to the work immediately andinvestigate your claim.

Download date: 28. Jul. 2021

Page 2: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

SIAM J. SCI. COMPUT. c© 2015 Society for Industrial and Applied MathematicsVol. 37, No. 6, pp. A2896–A2923

OPTIMIZED SCHWARZ AND 2-LAGRANGE MULTIPLIERMETHODS FOR MULTISCALE ELLIPTIC PDES∗

SEBASTIEN LOISEL† , HIEU NGUYEN‡ , AND ROBERT SCHEICHL§

Abstract. In this article, we formulate and analyze a two-level preconditioner for optimizedSchwarz and 2-Lagrange multiplier methods for PDEs with highly heterogeneous (multiscale) diffu-sion coefficients. The preconditioner is equipped with an automatic coarse space consisting of low-frequency modes of approximate subdomain Dirichlet-to-Neumann maps. Under a suitable changeof basis, the preconditioner is a 2 × 2 block upper triangular matrix with the identity matrix inthe upper-left block. We show that the spectrum of the preconditioned system is included in thedisk having center z = 1/2 and radius r = 1/2 − ε, where 0 < ε < 1/2 is a parameter that we canchoose. We further show that the GMRES algorithm applied to our heterogeneous system convergesin O(1/ε) iterations (neglecting certain polylogarithmic terms). The number ε can be made arbi-trarily large by automatically enriching the coarse space. Our theoretical results are confirmed bynumerical experiments.

Key words. domain decomposition, coefficient dependent coarse space, adaptive coarse spaceenrichment, Dirichlet to Neumann generalized eigenproblem, multiscale PDEs, heterogeneous media

AMS subject classifications. 65N55, 65F10, 65N30, 65N22

DOI. 10.1137/15M1009676

1. Introduction. Simulations with heterogeneous media arise naturally in manyproblems in science and engineering, e.g., modeling of flows in oil reservoirs, porousmedia, and heat conduction in composite materials [7, 51, 4, 40]. Realistic simulationsfor such problems often require high-resolution (very fine) meshes. Direct solvers canbe expensive for these very large sparse systems of linear equations. In addition,heterogeneity in media can make the associated linear systems severely ill-conditionedand pose a challenge for traditional iterative schemes. Consequently, there has beena lot of research on development of efficient and robust iterative parallel solvers forheterogeneous media, especially in the setting of multigrid, multilevel, and domaindecomposition methods [11, 22, 27, 1, 23, 44, 54, 53, 18, 19, 35, 45, 47, 8, 46, 49, 48].

Domain decomposition splits a problem into coupled subproblems on smaller sub-domains forming a partition of the original domain [41, 52, 34]. It is one of the mostpopular approaches to solving large-scale problems on parallel supercomputers. Indomain decomposition, a coarse grid is an essential ingredient to achieve scalability.Early works, e.g., [6, 12, 11, 27, 52, 33, 54], show that many domain decompositionmethods work for heterogeneous media. However, these methods all require a geo-metric coarse grid which resolves the discontinuities in the properties of the media.

∗Submitted to the journal’s Methods and Algorithms for Scientific Computing section February 23,2015; accepted for publication (in revised form) October 9, 2015; published electronically December10, 2015. The work of the first and second authors was supported by the Numerical Algorithms andIntelligent Software Centre funded by the UK EPSRC grant EP/G036136 and the Scottish FundingCouncil.

http://www.siam.org/journals/sisc/37-6/M100967.html†Corresponding author. Department of Mathematics, Heriot-Watt University, Riccarton, Edin-

burgh EH14 4AS, United Kingdom ([email protected]).‡CIMNE - Centre Internacional de Metodes Numerics en Enginyeria, Parc Mediterrani de la

Tecnologia, Universitat Politecnica de Catalunya, Esteve Terradas 5, 08860 Castelldefels, Spain([email protected]).

§Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, United Kingdom([email protected]).

A2896

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 3: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2897

In practice, this is a strong requirement as the properties of the media might havecomplicated variation on many scales and be difficult to capture by a geometric coarsegrid. Recently, there have been works on coarse grids that do not resolve the hetero-geneity in the media [22, 44, 23, 37], especially automatic coarse spaces that dependon the properties of the media [18, 19, 35, 45, 47, 8, 46, 49, 48]. In the latter, thecoarse spaces are constructed from eigenfunctions associated with small eigenvalues(low-frequency modes) of appropriated generalized eigenvalue problems. They areindeed “energy minimizing spaces with constraints” and can be analyzed using theunified theory in [45]. In this paper, we utilize automatic coarse spaces similar to theones in [35, 8] to formulate an efficient preconditioner for the optimized Schwarz and2-Lagrange multiplier (2LM) domain decomposition methods.

2LM methods are generalized versions of optimized Schwarz methods [21, 20, 29,30, 14]. They use Robin transmission conditions across the “artificial interface,” andthe Robin parameter can be optimized to obtain the fastest convergence. The firstform of 2LM method was introduced in [16], and its relation with optimized Schwarzmethods was established in [42]. In [28], Loisel gave a rigorous formulation of themethods for general domains and cross points. The methods have been successfullyused for different academic and engineering large-scale applications [39, 25, 3].

2LM methods work with a nonoverlapping decomposition. Their formulations aresimilar to that of the FETI method [17]. However, instead of having local Neumannproblems, 2LM methods have local Robin problems (one for each subdomain). Thesolvabilty of these Robin problems is guaranteed, and thus there is no need for specialtreatment of floating subdomains as in the FETI method. In 2LM methods, the globallinear system is reduced to a system on the interface for the Lagrange multipliers.The 2LM interface system, which is nonsymmetric in the considered methods, is ofmuch smaller size but can still be difficult to solve by iterative solvers. One-leveland two-level preconditioners for this system were studied in [28, 24]. For the two-level preconditioner, the coarse space is spanned by piecewise constant functions onthe trace space. These are indeed eigenfunctions associated with the eigenvalue 0 ofthe subdomain Dirichlet-to-Neumann (DtN) maps [8]. Under a suitable change ofbasis, the two-level preconditioner in [24] has a 2 × 2 block diagonal structure. Thispreconditioner appears to work well for homogeneous media, but its performancedeteriorates for heterogeneous media (see numerical experiments in subsections 5.4and 5.5).

In this work, not only the piecewise constant functions but also other low-frequencyeigenfunctions of the subdomain DtN maps are included in the coarse space, as sug-gested by [35, 8]. However, the DtN maps are only computed approximately, with thecoefficient on the interface being replaced by its average over adjacent subdomains.Our preconditioner is formulated, under a suitable change of basis, as a 2 × 2 blockupper triangular matrix with the identity matrix in the upper-left block. The changesin the coarse space and the form of the preconditioner mean the analysis in [24] isno longer valid. With a new analysis, we are able to show that the spectrum of thepreconditioned system, except for the isolated eigenvalue 1, is included in the diskhaving center (1/2, 0) and radius 1/2 − ε, where 0 < ε < 1/2 is a parameter. Undersuitable assumptions, this leads to explicit upper bounds for the relative residual normof the GMRES algorithm. Asymptotically, these bounds decrease to 0 linearly withthe same rate at which (1−2ε)k−2 converges to 0, where k is the iteration number. Inaddition, the parameter ε, and consequently the rate of convergence, can be calculateda priori once a tentative coarse space is chosen. If ε is too small (slow convergence), it

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 4: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2898 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

can be made bigger (faster convergence) by enriching the tentative coarse space witheigenmodes of the subdomain DtN maps associated with the next (larger) frequencies.In other words, the coarse space can be adapted automatically to the variation of thecoefficient and the difficulty of the problem to ensure a good rate of convergence.

The rest of this paper is organized as follows. We first state the model problem,derive the 2LM system, and introduce the spectral coarse space preconditioner insection 2. In section 3, we discuss the motivations as well as the structure of ourpreconditioner through studying a transform of the 2LM system. Convergence anal-ysis and optimal choice of the Robin parameter is given in section 4. In section 5,we present an extensive set of numerical experiments for different types of diffusioncoefficients with different configurations of the mesh size, the number of subdomains,and the partition to verify our theoretical results. We end with a short conclusion insection 6.

2. Method formulation.

2.1. Preparatory material. We consider the model problem

(2.1)−∇ · (α(x) ∇u(x)) = f(x) in Ω,

u(x) = 0 on ∂Ω,

where the diffusion coefficient α(x) is a positive function that may have large variationswithin Ω.

Let Th be a mesh of size h of Ω. We assume that Th resolves any discontinuityin α(x); i.e., α(x) is piecewise constant on Th. When (2.1) is discretized, e.g., usingpiecewise linear finite elements with basis {φj}nj=1, we obtain the following system oflinear equations:

(2.2) Au = f.

Assume the domain Ω has a nonoverlapping decomposition Ω = ∪pi=1Ωi\∂Ω with

the “artificial interface” Γ = ∪pi=1∂Ωi\∂Ω. The partition can have floating subdo-

mains. Let H = maxi{diam(Ωi)}. For each subdomain Ωi, we consider the localproblem

(2.3)

−∇ · (α(x) ∇ui(x)) = f(x) in Ωi,(a+ ∂

∂ni) (α(x)ui(x)) = λi(x) on ∂Ωi\∂Ω,

ui(x) = 0 on ∂Ωi ∩ ∂Ω,

where a > 0 is the “Robin parameter,” ni is the outward normal vector of the bound-ary ∂Ωi\∂Ω, and λi is a Lagrange multiplier.

Multiplying the first equation of (2.3) with a test function v ∈ H10 (Ωi, ∂Ωi ∩ ∂Ω),

applying the divergence theorem, and using the second and third equations, we getthe variational formulation of the local problem: find ui ∈ H1

0 (Ωi, ∂Ωi∩∂Ω) such that

(2.4)

∫Ωi

α(∇ui · ∇v)dx + a

∫∂Ωi

αuivdx =

∫Ωi

fvdx+

∫∂Ωi

λivdx

for all v ∈ H10 (Ωi, ∂Ωi ∩ ∂Ω).

Discretizing (2.4) using the finite element method with {φ(i)j }ni

j=1, the subset ofthe basis associated with Ωi, we have

(2.5) (A(i) + aB(i))ui = f (i) + λ(i),

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 5: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2899

where

A(i)kl =

∫Ωi

α(∇φl · ∇φk)dx, B(i)kl =

∫∂Ωi

αφlφkdx,(2.6a)

f(i)k =

∫Ωi

fφkdx, λ(i)k =

∫∂Ωi

λiφkdx.(2.6b)

We would like to find λi so that each local discrete solution ui is the restriction of theglobal discrete solution u on Ωi, namely,

(2.7) Riu = ui.

Here Ri is the restriction matrix, which restricts any n-dimensional vector u (associ-ated with a grid function on the mesh Th of Ω) to an ni-dimensional vector Riu thatcontains only the components of u corresponding to Ωi.

2.2. 2LM system. Relabeling degrees of freedom (dofs) to separate those inthe interior of Ωi (corresponding to subscript I) and those on the boundary ∂Ωi

(corresponding to subscript Γ), (2.5) becomes

(2.8)

[A

(i)II A

(i)IΓ

A(i)ΓI A

(i)ΓΓ + aBi

][u(i)I

u(i)Γ

]=

[f(i)I

f(i)Γ + λi

].

Theoretically, Bi and λi are, respectively, the submatrix and “subvector” of B(i) andλ(i) associated only with dofs on ∂Ωi. However, we will show later in Lemma 2.1 thatour method formulation does not rely on the formulation of B(i) in (2.6). Therefore,we let Bi be an arbitrary symmetric positive definite matrix of the appropriated size.

Eliminating the interior unknowns u(i)I in (2.8), we arrive at the following system

for the unknowns on the interface:

(2.9) (Si + aBi)u(i)Γ = gi + λi,

where Si = A(i)ΓΓ − A

(i)ΓI(A

(i)II )

−1A(i)IΓ and gi = f

(i)Γ − A

(i)ΓI(A

(i)II )

−1f(i)I are the Schur

complement and the accumulated right-hand side, respectively.Let S and B be the block diagonal matrices S = diag{S1, S2, . . . , Sp} and B =

diag{B1, . . . , Bp}. In addition, denote g = [gT1 , . . . , gTp ]

T , λ = [λT1 , . . . , λ

Tp ]

T , and

uΓ = [u(i)Γ

T, . . . , u

(p)Γ

T]T . Since the matrices Si are symmetric positive semidefinite,

the matrices Bi are symmetric positive definite, and a > 0, the matrices Si + aBi areinvertible. Therefore, (2.9) is equivalent to

(2.10) aBuΓ = Q(g + λ),

where

(2.11) Q = aB(S + aB)−1 =

⎡⎢⎣aB1(S1 + aB1)

−1

. . .

aBp(Sp + aBp)−1

⎤⎥⎦ .

If we think of the vector [uT1 , . . . , u

Tp ]

T as a functions which is defined on Ω,continuous inside each Ωi, but with jump discontinuities across Γ, then the vector uΓ

is actually its multivalued or many-sided trace. For each vertex xj ∈ Γ, let mj be its

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 6: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2900 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

multiplicity, namely, the number of subdomains adjacent to xj . In order for uΓ tocorrespond to a continuous function, e.g., the solution of (2.1), the following relationmust hold:

(2.12) KuΓ = uΓ,

where

(2.13) K = ΠT diag

{1

mj1

1mj1×mj1, . . . ,

1

mjp

1mjnΓ×mjnΓ

}Π,

with 1 the matrix of ones, jk, 1 ≤ k ≤ nΓ, the global indices of the dofs on theinterface, and Π the permutation that rearranges these dofs so that they have thesame ordering as in uΓ. Since K2 = K and KT = K, K is an orthogonal projection(it projects onto the space of vectors with continuous many-sided trace).

We also need to equate the fluxes of the subsolutions across the artificial interfaceΓ. Using φj , a nodal basis function associated with Γ, as a test function for (2.3), wefind that

(2.14)

∫∂Ωi

α∂ui

∂niφjdx =

∫Ωi

α(∇ui · ∇φj)dx−∫Ωi

fvdx.

Consequently, the “discrete flux vector” η(i) of the local solution ui across ∂Ωi canbe computed by

(2.15) η(i) = A(i)ΓIu

(i)I +A

(i)ΓΓu

(i)Γ − f

(i)Γ = λi − aBiu

(i)Γ (using (2.8)).

The discrete weighted fluxes are matched when they have zero average or equivalently

(2.16) aKBuΓ = Kλ.

The following result is purely algebraic, namely, (2.2) and (2.6) do not need to comefrom (2.1).

Lemma 2.1. Let us assume that A is invertible, A =∑p

i=1 RTi A

(i)Ri, andf =

∑pi=1 R

Ti fi. We also assume that the matrices on the left-hand side of (2.8)

are invertible. Then there exists a unique solution u1, . . . , up, λ1, . . . , λp to the simul-taneous equations (2.8), (2.12), and (2.16). Such a solution u1, . . . , up implies theunique solution u of (2.2) through (2.7), and vice versa.

Proof. Assume u is the unique solution of (2.2). Letting ui = Riu and substitutingthem into (2.8), we obtain λi. Clearly, (2.12) holds. In addition,

Au =

p∑i=1

RTi A

(i)Riu =

p∑i=1

RTi A

(i)ui =

p∑i=1

RTi (f

(i) + λ(i) − aB(i)ui)(2.17)

= f +

p∑i=1

RTi

[0

λi − aBiu(i)Γ

].

This implies (2.16).Now assume that u1, . . . , up, λ1, . . . , λp is a solution to the simultaneous equations

(2.8), (2.12), and (2.16). As (2.12) holds, there is clearly a unique u that satisfies (2.7).The fact that this u is also the solution of (2.2) comes from arguments in (2.17).

Assume u∗1, . . . , u

∗p, λ

∗1, . . . , λ

∗p is another solution to the simultaneous equations

(2.8), (2.12), and (2.16). From (2.8), if u∗i = ui, then λ∗

i = λ. If u∗i �= ui for some

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 7: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2901

1 ≤ i ≤ p, then we obtain u∗ satisfying (2.7), and consequently Au∗ = f . Since A isinvertible, u∗ = u. Hence u∗

i = Riu∗ = Riu = u, which contradicts u∗

i �= ui.Remark 2.2. We again emphasize that the result in Lemma 2.1 is purely algebraic.

Especially, B(i), and thus Bi, do not need to come from (2.1); i.e., they do not haveto be defined as in (2.6).

As there is freedom in choosing Bi, we make the following assumption.Assumption 2.3. The matrices B and K commute, i.e.,

KB = BK.

In fact, we will choose B to be the diagonal matrix satisfying

(2.18) diag(B) = K diag(B),

where B = diag{B1, . . . , Bp}, with Bi being the lumped mass matrix on ∂Ωi. Inother words, B is the “average” of the lumped mass matrix associated with dofs onthe interface. In some sense, our choice of B means that the interface mass matrixand the local generalized eigenproblem introduced later in (3.5) take into accountsome information about the behavior of the coefficient in the vicinity of the interfacein adjacent subdomains.

Using Assumption 2.3, (2.16), (2.12), and (2.10), after some algebra we find thenonsymmetric 2LM system

(2.19) A2LMλ := (I − 2K)(Q−K)λ = −(I − 2K)Qg.

Here, we note that Q−K is always nonsingular, as shown in [10, Theorem 1] and [28,Lemma 3.2]. Therefore, λ is uniquely defined by (2.19). The 2LM system (2.19) canbe regarded as a generalization of optimized Schwarz methods to the case where thepartition has cross points (cf. [28, 24]).

2.3. The spectral coarse space preconditioner. The system (2.19) can besolved iteratively using GMRES [43]. In order to accelerate the convergence of GM-RES, we now briefly introduce a preconditioner with a spectral coarse space. Theidea is to choose an auxiliary coarse (or deflation) space and solve the reduced prob-lem on it exactly. Since the second-level preconditioner consisting of the subdomainsolves will only need to be effective on the complement space, faster convergence canbe guaranteed if the coarse space contains all or most of the functions that are notcaptured well by the local subdomain solves. For more information about this typeof preconditioners, we refer the reader to [36, 13, 5, 32, 50, 2] and references therein.

A preconditioner with a coarse space is already formulated for the homogeneouscase [24]. In this case, the coarse space consists of the kernel of S (i.e., the piecewiseconstant functions). When the problem is heterogeneous, we use the same piecewiseconstant functions, as well as any functions that are “almost” in the kernel of S.

We choose a “truncation parameter” smin for the coarse space, and we considerall the generalized eigenvectors

Sv = sBv, where s < smin.

We collect all such column vectors into the columns of a matrix J , which is B-orthonormalized:

JTBJ = I.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 8: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2902 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

We define the B−1-orthogonal projection E = BJJT and the spectral coarse spacepreconditioner by

(2.20) P = (I − E) +A2LME.

The preconditioned system is

(2.21) P−1A2LMλ = −P−1(I − 2K)Qg.

An efficient strategy for evaluating the matrix-vector product P−1λ is as follows (seesubsection 3.3 for an explanation). Let

Z = JTA2LMBJ(2.22)

be the “coarse matrix.” Then

P−1 =(I − (I − E)A2LME

)(I − E +BJZ−1JT

)=

(I − (I −BJJT )A2LMBJJT

)(I −BJJT +BJZ−1JT

).(2.23)

As the dimension of the coarse space is generally much smaller than the size of the2LM system, (2.23) suggests an efficient way to apply P−1. The coarse matrix Z,whose size is the dimension of the coarse space, can be factorized at low cost. Thematrix-vector products involving J , JT , and B can also be executed inexpensivelybecause the number of columns in J is much smaller than the number of rows and Bis a diagonal matrix.

3. Transformed 2LM system.

3.1. The transformation. It is clear from (2.11) that Q is nonsymmetric. Inorder to exploit symmetry in our analysis, we consider the following similarity trans-formation.

Definition 3.1. Given a matrix C and a vector v, we define their “hat-associates”:

(3.1) C := B−1/2CB1/2, v := B−1/2v.

Here we note that B1/2 and B−1/2 are well-defined because B is a diagonal matrixwith positive entries.

Remark 3.2. σ(C) = σ(C), where σ(·) denotes the spectrum of a matrix.Multiplying (2.19) with B−1/2 from the left, and using the fact that B1/2 and K

commute, we obtain the transformed 2LM system

A2LM λ = (I − 2K)(Q−K)λ = −(I − 2K)Qg.(3.2)

Using the definition of Q in (2.11), it follows that

Q = B−1/2QB1/2 = B−1/2aB(S + aB)−1B1/2 = a(B−1/2SB−1/2 + aI)−1.

Clearly, Q is symmetric. In addition,

(3.3) σ(Q) = σ(Q) =

{a

a+ s

∣∣∣ s ∈ σ(B−1/2SB−1/2)

}.

Furthermore, the spectrum of S = B−1/2SB−1/2 is exactly the same as the spectrumof the following generalized eigenvalue problem:

(3.4) Sv = sBv.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 9: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2903

Due to the block structure of S and B, the spectrum of (3.4) is the union of thespectra of the following generalized eigenvalue problems on the subdomains:

(3.5) Siv(i) = sBiv

(i).

As Si is symmetric positive semidefinite and Bi is symmetric positive definite, all theeigenvalues of (3.5) are nonnegative. In addition, when Ωi is a floating subdomain,(3.5) has exactly one eigenvalue that is 0 with constant eigenvectors. This togetherwith (3.3), (3.4), and (3.5) implies that

(3.6) σ(Q) = σ(Q) ⊂ (0, 1].

According to [8], if Bi is the submatrix of B(i) associated with dofs on ∂Ωi and is com-puted exactly as in (2.6), then (3.5) is the discrete form of the following eigenproblemin function space:

(3.7) DtNi(v(i)Γ ) = s α v

(i)Γ , where DtNi(v

(i)Γ ) = α

∂v(i)

∂ni

∣∣∣∂Ωi\∂Ω

,

and v(i) is the α-harmonic extension of v(i)Γ to Ωi. Here, we note that DtNi is the

Dirichlet-to-Neumann operator on the boundary of Ωi. The coarse spaces in [35, 8]are spanned by eigenfunctions associated with low-frequency modes of (3.7). We usethe same type of coarse space, but with Bi being the “averaged” lumped mass matrixon the boundary (cf. (2.18)). In other words, the α in (3.7) at each dof on Γ ∩ ∂Ωi isreplaced by its domainwise averaged counterpart.

3.2. Block structure of the preconditioner. Assume σ(Q) = σ(Q) = [ε, 1],where 0 < ε < 1/2. The coarse space V0 is defined as the subspace spanned byeigenvectors of Q corresponding to the eigenvalues in the interval (1 − ε, 1] (theseeigenvalues correspond to the small eigenvalues of (3.5) and (3.7)). Let J be a matrixwhose columns are orthonormal eigenvectors of Q spanning V0. Also let

E = J JT ,

i.e., E is the orthogonal projection onto the coarse space V0. We define our precondi-tioner for the transformed 2LM system (3.2) as follows:

P = (I − E) + A2LM E.(3.8)

We consider a change of basis from the standard one to a new one consisting ofeigenvectors of Q. If the new basis is arranged so that the eigenvectors associatedwith eigenvalues in [ε, 1 − ε] are listed first, then the transformed matrices Q and Ktake the form

Q =

[Q1 O

O Q2

]and K =

[K11 K12

K21 K22

],(3.9)

where Q1, Q2, and K are symmetric with σ(Q1) ⊂ [ε, 1 − ε], σ(Q2) ⊂ (1 − ε, 1], andσ(K) = {0, 1}. Under the same change of basis, the transformed 2LM matrix is

(3.10) A2LM =

[(I − 2K11)Q1 +K11 −2K12Q2 +K12

−2K21Q1 +K21 (I − 2K22)Q2 +K22

],

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 10: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2904 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

the projection is

E =

[O OO I

],(3.11)

and the preconditioner is

(3.12) P =

[I −2K12Q2 +K12

O (I − 2K22)Q2 +K22

].

It can be seen that the preconditioner P is a 2 × 2 block upper triangular matrixwhich is obtained from A2LM (cf. (3.10)) by “zeroing out” the lower-left block andreplacing the top-left block by I.

3.3. Connection with the original 2LM system. The preconditioned trans-formed 2LM system is

(3.13) P−1(I − 2K)(Q−K)λ = P−1(I − 2K)Qg.

Multiplying (3.13) from the left by B1/2, using Definition 3.1, the fact that K andB1/2 commute, and B−1/2B1/2 = I, we obtain

B1/2P−1B−1/2(I − 2K)(Q−K)λ = −B1/2P−1B−1/2(I − 2K)Qg.(3.14)

This implies that using P as the preconditioner for the transformed system (3.2)is equivalent to using P = B1/2PB−1/2 as the preconditioner for the original 2LMsystem (2.19). In addition, σ(P−1A2LM ) = σ(P−1A2LM ).

Using the definition of P in (3.2), we recover the formulation of P,

P = B1/2PB−1/2 = B1/2(I − E) + A2LM EB−1/2 = (I − E) +A2LME,(3.15)

where

E = B1/2EB−1/2 = B1/2J JTB−1/2 = BJJT , J = B−1/2J .

In addition, if v is a normalized eigenvector of Q (a column of J), then B−1/2v is aneigenvector of the generalized eigenvalue problem (3.4). Furthermore, as the columnsof J are orthonormal, the columns of J are orthonormal with respect to the B-norm:

JTBJ = JT J = I.

These explain the formulation of our spectral coarse space preconditioner given inadvance in section 2.

We now explain how to efficiently compute the matrix-vector product P−1λ.First, we note that Z is the lower-right block of P and A2LM :

Z = JTA2LMBJ = JTB1/2A2LMB−1/2BJ = JT A2LM J

= (I − 2K22)Q2 +K22.(3.16)

Then, considering P given by (3.12), we find that

P−1 =

[I −2K12Q2 +K12

O I

] [I OO Z−1

],

=(I − (I − E)A2LM E

)(I − E + JZ−1JT

)=

(I − (I − J JT )A2LM J JT

)(I − J JT + JZ−1JT

).(3.17)

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 11: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2905

As (3.17) is very much similar to (2.23), we refer the reader to the discussion rightafter (2.23) for an explanation of why the action of P−1 can be efficiently implementedusing (3.17).

In addition, it is worth noting that conjugating (3.17) by B1/2 gives us P−1 in(2.23).

4. Convergence analysis. Even though the analysis in this section is presentedexclusively for 2LM methods, the results also hold for optimized Schwarz methodswhen the partition has no cross point.

We first study the transformed 2LM system. The following lemma concerns thespectrum of the preconditioned system P−1A2LM .

Lemma 4.1. If Q2 − K22 is invertible, then the spectrum of the preconditionedsystem P−1A2LM , except for the isolated eigenvalue 1, is included in the disk havingcenter (1/2, 0) and radius 1− ε, i.e.,

σ(P−1A2LM ) ⊂ {z ∈ C : |z − 1/2| ≤ 1/2− ε} ∪ {1} =: Sε.(4.1)

Proof. The invertibility of Q2 −K22 will be discussed later in Remark 4.2. Nowwe focus on showing that (4.1) holds.

Since K2 = K, it implies that (I − 2K)−1 = I − 2K. Thus, for any θ ∈ C, wehave

rank(P−1A2LM − θI

)= rank

(A2LM − θP

)= rank

((I − 2K)(Q−K)− θ(I − E)− θ(I − 2K)(Q−K)E

)= rank

((Q −K)− θ(I − 2K)(I − E)− θ(Q −K)E

)= rank

([Q1 −K11 − θ(I − 2K11) (θ − 1)K12

(2θ − 1)K21 (1− θ)(Q2 −K22)

]).(4.2)

The number θ is an eigenvalue of P−1A2LM if and only if the rank of P−1A2LM − θIis deficient. This obviously occurs when θ = 1. Let us now consider the case θ �= 1.Note that the matrix (4.2) has an invertible lower-right block, so we can use a Schurcomplement and study the rank deficiency of the matrix

W =

(Q1 −K11 − θ(I − 2K11) + (2θ − 1)K12

(Q2 −K22

)−1

K21

).(4.3)

Setting θ = 12 + s and Q1 = 1

2I +M , we find that

(4.4) W = M + sG, where G =

(2

(K12

(Q2 −K22

)−1

K21 +K11

)− I

).

We assume that

(4.5) σ(G) ⊂ (−∞,−1] ∪ [1,∞).

According to (4.2), (4.3), and (4.4), θ �= 1 is an eigenvalue of P−1A2LM only ifσ(M + sG) 0. This only happens when

(4.6) |θ − 1/2| = |s| ≤ 1/2− ε,

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 12: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2906 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

because of Lemma 3 in [10] and the fact that σ(M) ⊂ [ε − 1/2, 1/2− ε]. The desiredrelation follows immediately.

Now we need to show that the assumption (4.5) is actually true. Let K =[UV

] [UT V T

]with UTU + V TV = I. The Woodbury identity gives

K12

(Q2 −K22

)−1

K21 +K11 = UV T (Q2 − V V T )−1V UT + UUT

= U(V T (Q2 − V V T )−1V + I

)UT

= U(I − V T Q−12 V )−1UT =: F.(4.7)

Applying the Woodbury identity one more time, we have

(I − 2F )−1 =(I − 2U(I − V T Q−1

2 V )−1UT)−1

= I − (−2U)(I − V T Q−1

2 V + UT (−2U))−1

UT

= I + 2U(I − V T Q−12 V − 2UTU)−1UT

= I − 2U(V T (Q−12 − I)V + UTU)−1UT ≤ I.

Here, for square matrices G,H of equal size, we use G ≤ H to express that xTGx ≤xTHx for all x �= 0 of appropriate size. We note that U(V T (Q−1

2 −I)V +UTU)−1UT ≤PU , where PU is the orthogonal projection onto the range of U , and hence (I−2F )−1 ≥−I. This completes the proof.

Remark 4.2. We refer the reader to [24, Remarks 2.5 and 2.7] and [28, subsection3.1] for a discussion of the invertibility of Q2−K22 when the coarse space is the kernelof S (containing only piecewise constant functions). It is clear that Q2 −K22 is alsoinvertible when the coarse space is the whole space, as Q −K as well as Q −K areinvertible. Here, the matrix Q2 −K22 typically corresponds to an intermediate spacebetween the kernel of S and the whole space. It can be regarded as the discretizationof a certain coupled Robin problem for the subdomains, using some complicated(boundary element–like) multiscale basis functions. Since the coupled Robin problemis indefinite or even nonsymmetric, depending on the formulation, we have not beenable to rigorously prove the invertibility of Q2 − K22 in the general case. However,we did not encounter any problem related to this in our diverse set of numericalexperiments. So we believe that the matrix is either always invertible or fails to beinvertible only in very rare situations.

The following lemma presents an estimate needed in the derivation of convergencebounds for GMRES applied to 2LM systems.

Lemma 4.3. For θ ∈ C, the resolvent of P−1A2LM at θ is defined by

R(θ) := (P−1A2LM − θI)−1.

Then, for θ /∈ Sε, the resolvent norm ‖R(θ)‖ can be bounded as follows:

‖R(θ)‖ ≤|1− θ|+ |θ − 1

2 | −12 + ε+ (1− 2ε)‖Z−1‖

|1− θ|(|θ − 1

2 | −12 + ε

)(4.8)

=: Rb(θ).(4.9)

Proof. Performing block-row eliminations on (3.12), we find that[I O

O ((I − 2K22)Q2 +K22)−1

]P =

[I −2K12Q2 +K12

O I

],

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 13: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2907[I 2K12Q2 −K12

O I

] [I O

O ((I − 2K22)Q2 +K22)−1

]P = I.

Hence,

P−1 =

[I (2K12Q2 −K12)((I − 2K22)Q2 +K22)

−1

O ((I − 2K22)Q2 +K22)−1

].

This together with (3.10) implies that the preconditioned matrix P−1A2LM is[(I − 2K11)Q1+K11+(2K12Q2 −K12)((I− 2K22)Q2+K22)

−1(K21− 2K21Q1) O

((I − 2K22)Q2 +K22)−1(K21 − 2K21Q1) I

].

We now simplify the top-left and bottom-left blocks of the preconditioned matrixusing Q1 = 1

2 +M , where ‖M‖ ≤ 12 − ε, which gives us

(P−1A2LM )11 =1

2I +

(I − 2K11 − 2K12(2Q2 − I)((I − 2K22)Q2 +K22)

−1K21

)M.

We use the Woodbury identity to compute the inverse

(I + V T (2Q2 − I)((I − 2V V T )Q2 + V V T )−1V )−1

= I − V T (2Q2 − I)(((I − 2V V T )Q2 + V V T ) + V V T (2Q2 − I))−1V

= I − V T (2I − Q−12 )V.

This leads to

(P−1A2LM )11 =1

2I +

(I − 2U

(I − V T (2I − Q−1

2 )V)−1

UT

)M =:

1

2I + YM,

where Y = I − 2U(I − V T (2I − Q−1

2 )V)−1

UT . Using arguments similar to the onesused in estimating (I − 2F )−1 in the proof of Lemma 4.1, we obtain Y ≤ I.

In order to save space, we set X := P−1A2LM such that

R(θ) =

[X11 − θI 0

X21 (1− θ)I]

]−1

=

[(X11 − θI)−1 0

−(1− θ)−1X21(X11 − θI)−1 (1 − θ)−1I

].

Using the triangle inequality, it follows that

(4.10) ‖R(θ)‖ ≤ ‖(X11 − θI)−1‖+ ‖(1− θ)−1I‖+ ‖(1− θ)−1X21(X11 − θI)−1‖.

We begin with the upper-left block R11(θ) = (X11 − θI)−1. We find that ‖YM‖ =‖R11(1/2)‖ ≤ ‖M‖ ≤ 1

2 − ε and hence

σmin

((1/2− θ)I + YM

)≥ |1/2− θ| − (1/2− ε).

In other words,

‖R11(θ)‖ ≤ 1

|θ − 12 | −

12 + ε

for |θ − 1

2| > 1

2− ε.(4.11)

We now look at the lower-left entry R21(θ) and find

R21(θ) = −2(1− θ)−1((I − 2K22)Q2 +K22)−1K21M (R11(θ)).(4.12)

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 14: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2908 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

From (4.12), (4.11), (3.16), and the fact that ‖M‖ ≤ 1/2− ε, we have

‖R21(θ)‖ ≤ 2|1− θ|−1‖(I − 2K22)Q2 +K22)−1‖‖K21M‖‖R11(θ)‖

≤(

1− 2ε

|1− θ|(|θ − 1

2 | −12 + ε

)) ‖Z−1‖.(4.13)

Then the resolvent norm estimate (4.8) follows from (4.10), (4.11), and (4.13).We now state the main convergence result for the transformed 2LM system. We

will use A(k) � B(k) to express that A(k) ≤ C B(k) for some constant C which isindependent of k and the choice of the coarse space.

Theorem 4.4. Let CZ = (1 + ‖Z−1‖)/2 with Z as defined in (3.16). Then therelative residual norm in solving (3.2) by GMRES with preconditioner (3.8) satisfies

‖rk‖‖r0‖

� min{1, CZ kk(k − 2)2−k(1− 2ε)k−2} for k ≥ max{3, 2ε−1 − 2

}.(4.14)

Furthermore, the residual in solving the transformed 2LM system by GMRES is re-duced by a fixed tolerance factor of 10−2 every

kf = max{((1− log ε)2)ε−1 + log2 CZ , 2ε

−1 − 2}

(4.15)

iterations.Proof. The first part of the proof will follow an approach similar to that in [15,

subsection 2.3] to derive convergence bounds.It is known (see [15] and references therein) that the residuals in the GMRES

algorithm satisfy the minimum residual property, i.e.,

‖rk‖ = minp∈Pk

‖p(P−1A2LM )r0‖,

where Pk = {polynomials p of degree ≤ k with p(0) = 1 }. This implies that ‖rk‖ ≤‖r0‖.

Let Sε be the disk with center (1/2, 0) and radius 1/2 − ε, where 0 ≤ ε < ε.Clearly, Sε ⊃ Sε, i.e., Sε contains the whole spectrum of P−1A2LM , except the isolatedeigenvalue 1. Denote by Γε the (circular) boundary of Sε and recall the definition ofRb(·) in (4.9). Using the estimates which are popular in pseudospectral analysis, e.g.,in [15, subsection 2.3], we have

‖rk‖‖r0‖

≤ minp∈Pk

‖p(P−1A2LM )‖ ≤ minp∈Pk

1

∫Γε

|p(z)|‖Rb(z)‖dz

≤ L(Γε)

2πmaxz∈Γε

‖Rb(z)‖ minp∈Pk

maxz∈Sε

|p(z)|, (L(Γε) : length of Γε)

≤ 1

2maxz∈Γε

‖Rb(z)‖ minp∈Pk

maxz∈Sε

|p(z)|.(4.16)

We emphasize here that (4.16) holds as long as Sε ⊃ Sε, i.e., for any 0 ≤ ε < ε.Consequently,

‖rk‖‖r0‖

≤ min0≤ε<ε

{1

2maxz∈Γε

‖Rb(z)‖ minp∈Pk

maxz∈Sε

|p(z)|}.(4.17)

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 15: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2909

Fig. 1. Different upper bounds, CZ ρ(ε, ε, k), of the relative residual as functions of k for

CZ = 1, ε = 0.2 and different values of ε, ε = 0, 0.02, . . . , 0.18 (the larger ε is, the faster the bounddecreases).

This family of bounds is common in analyzing GMRES residuals with pseudospec-tra [9, 15]. Figure 1 gives an example of how these bounds might look. Each individualbound is “sharpest” only for a certain range of iterations. For the first few iterations,the relative residual is followed closest by the bound associated with ε = 0, becausein this case Sε contains the whole spectrum of P−1A2LM , including the outlier 1.After that, the outlier has no effect and GMRES enters a regime with faster conver-gence, with the sharpest bounds being the ones associated with ε ≈ ε. Since it isdifficult to quantify the minimum of all the bounds in (4.17)—especially the secondterm on the right-hand side—we focus on bounds associated with the regime withfaster convergence. More specifically, we consider ε/2 ≤ ε < ε.

Recalling the definition of Rb(·) in (4.9) and noting that for z ∈ Γε, Rb(z) is largewhen either z = ε or z = 1− ε, we have a rough estimate for the second term in (4.17):

maxz∈Γε

‖Rb(z)‖ = max{‖Rb(ε)‖, ‖Rb(1− ε)‖

}= max

{1− 2ε+ ε+ (1− 2ε)‖Z−1‖(1− ε)(ε − ε)

,ε+ (1− 2ε)‖Z−1‖

ε(ε− ε)

}≤ 2CZ (ε − ε)−2,(4.18)

where CZ is defined in the statement of the theorem and ε/2 ≤ ε < ε < 1/2.In addition, as Sε is a disk, according to [9], the last term in (4.17) can be

estimated sharply for sufficiently large k by

(4.19) minp∈Pk

maxz∈Sε

|p(z)| ≤ Cρk, where ρ = 1− 2ε, C ≈ 1.

From (4.17), (4.19), and (4.18), we find that

‖rk‖‖r0‖

� CZ

ρ(ε,ε,k)︷ ︸︸ ︷((ε − ε)−2(1− 2ε)k

).(4.20)

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 16: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2910 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

0.10.2

0.30.4

0.5

0.10.2

0.30.4

0

2

4

6

x 10−3

ε1/C

Fig. 2. Residual norm ρ(ε, CZ) after k0 = ((1− log ε)2)ε−1 + log2 CZ iterations, as a function

of ε and 1/CZ .

Figure 1 shows some of these bounds for CZ = 1, ε = 0.2. It can be noticedthat in the regime with faster convergence, the overall bound behaves very much likeCs(1− 2ε)k−2 for some constant Cs.

Now, solving ∂∂ερ(ε, ε, k) = 0 reveals that the best choice of ε is ε = max

{εk−1k−2 , 0

}.

In order for this best choice to be valid, i.e., ε/2 ≤ ε, it is required that k ≥ 2ε−1 − 2.Substituting the best choice of ε into (4.20) gives (4.14). To obtain the iterationestimate, substitute k = k0 = ((1− log ε)2)ε−1 + log2 CZ into (4.14) and obtain

ρ(ε, CZ) =CZ

ε2

⎛⎝(

(1 − log ε)2 + ε log2 CZ

)((1−log ε)2+ε log2 CZ)ε−1

×(

1− 2ε

(1 − log ε)2 + ε(log2 CZ − 2)

)((1−log ε)2+ε(log2 CZ−2))ε−1⎞⎠ .

We have plotted ρ(ε, CZ) in Figure 2. From this plot, we see that the residual isreduced by a factor of ρ(ε, CZ) < 0.01 every kf = max

{k0, 2ε

−1 − 2}iterations, as

required.We are now ready to give our main result on the convergence of the GMRES

algorithm for solving the preconditioned 2LM system (2.21).Theorem 4.5. Let CZ =

√κ(B) (1 + ‖Z−1‖)/2 with Z as defined in (3.16).

Then the relative residual norm in solving (2.21) by GMRES satisfies

‖rk‖‖r0‖

� min{1, CZ kk(k − 2)2−k(1− 2ε)k−2} for k ≥ max{3, 2ε−1 − 2

}.(4.21)

Furthermore, the residual in solving the 2LM system by GMRES is reduced by a fixedtolerance factor of 10−2 every

kf = max{((1− log ε)2)ε−1 + log2 CZ , 2ε

−1 − 2}

(4.22)

iterations.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 17: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2911

Proof. The residual norms of GMRES in solving (2.21) satisfy the minimumresidual property, namely,

‖rk‖ = minp∈Pk

‖p(P−1A2LM )r0‖.

Therefore,

‖rk‖‖r0‖

≤ minp∈Pk

‖p(P−1A2LM )‖ = minp∈Pk

‖B1/2p(P−1A2LM )B−1/2‖

≤ κ(B1/2)minp∈Pk

‖p(P−1A2LM )‖ =√κ(B) min

p∈Pk

‖p(P−1A2LM )‖.(4.23)

The proof is finished using the same estimate in the proof for Theorem 4.4 for thesecond factor of the last term in (4.23).

4.1. Coarse space construction and the optimal Robin parameter. In[35, 8], all eigenvectors of (3.5) associated with eigenvalues of size less than diam(Ωi)

−1

are included in the coarse space. This seems to work well for all of their consideredtest problems. The same approach can be utilized for our proposed method. However,in our method, the rate of convergence can be estimated a priori, and we will exploitthis feature to decide the cut-off for our spectral coarse space.

Assume that the spectrum of the generalized eigenvalue problem can be decom-posed as follows (3.4):

(4.24) σ(S,B) = {0 < · · · < s0} ∪ {smin < · · · < smax},

where the coarse space is constructed using the eigenfunctions associated with eigen-values in the set {0 < · · · < s0}. We recall that the eigenvalues of Q are of the forma/(a+s) with s ∈ σ(S,B). In addition, the spectrum ofQ is σ(Q) = [ε, 1−ε]∪(1−ε, 1].Therefore, it is required that

a

a+ s⊂ [ε, 1− ε] for smin < s < smax,

or, equivalently,

(4.25) ε ≤ min

{smin

a+ smin,

a

a+ smax

}.

Up until this point, we have assumed that the Robin parameter a is given. However,an attractive feature of 2LM methods is that this parameter can be optimized on thefly to improve the convergence of the methods. According to Theorem 4.5, the largerε is, the faster the 2LM methods converge. Therefore, we would like to choose a sothat ε is largest. This happens when the two quantities on the right-hand side of(4.25) are equal, or

(4.26) aop =√sminsmax,

as one of the ratios in (4.25) is increasing and the other decreasing with respect to a.Consequently, the optimal value of ε is

(4.27) εop =1

1 +√κeff(S)

, where κeff(S) =smax

smin.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 18: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2912 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

It now remains for us to discuss how we compute or choose the parameterssmax, smin, ε, and a. The parameter smax can be calculated by solving for the largesteigenvalue of the eigenvalue problem on (3.4). The process of choosing smin is moredelicate. First, we decide (roughly) how fast we would like the convergence to be bychoosing a lower bound for ε. For example, in our experiments we require ε > 0.1. Inother words, we demand that the rate of convergence (which is less than 1 − 2ε, asshown in Theorems 4.4 and 4.5) be at least 0.8. With the restriction on ε, from (4.27),we obtain a lower bound for smin. Then smin is determined as the first eigenvalue of(3.4) that is bigger than the lower bound. During the process, all of the eigenvectorsof (3.4) associated with eigenvalues up to smin are computed and saved to build thecoarse space. Finally, we note that the eigenvalues and eigenvectors of (3.4) can befound by solving in parallel the eigenproblems in subdomains (3.5).

4.2. Convergence estimate in terms of mesh parameters. Although wecannot estimate the norm of the coarse problem ‖Z−1‖ and thus CZ when we haveno information about the coarse space, it is worthwhile discussing what the estimate(4.14) reveals about the classical case where the coarse space consists of piecewise con-stant functions and where the problem is homogeneous, with benign variations in thediffusion coefficient (or even in the case where the elliptic problem is the Laplacian).

In this case, the condition number of the local Schur complement S (modulo thecoarse space of constant functions) is O(H/h), yielding the value ε−1 = O(

√H/h);

and CZ = 2‖((I − 2K22)Q2 +K22)−1‖ = 2‖(I −K22)

−1‖ = O(H−2) [28]. As a result,the iteration count estimate (4.15) becomes

O(√

H/h log2√H/h+ log2(H)) iterations,(4.28)

which is consistent with the Fourier analysis done in [14].The above analysis also applies to the heterogeneous case where the diffusion co-

efficient is “quasi-monotone” [38]. Recall that the diffusion coefficient α(x) is quasi-monotone, roughly, if for any x ∈ Ω there is a path γ(t) from x to y = argmaxx α(x)such that α(γ(t)) is monotonically increasing. In that situation, the generalized con-dition number of the pencil (S,B), modulo the coarse space of piecewise constantfunctions, is also O(H/h) and the estimate is again (4.28).

If the diffusion coefficient is heterogeneous and not quasi-monotone, then thepencil (S,B) is likely to have some extreme eigenvalues apart from those relatedto the kernel of S. In that case, using a “classical” coarse space gives very slowconvergence. Our new spectral coarse space automatically adapts to this difficultheterogeneous case and gives arbitrarily good convergence by automatically enrichingthe coarse space.

5. Numerical experiments. In this section, we will use our proposed methodto solve the model problem (2.1) for different types of variation in the coefficient α.The considered types of α are similar to the ones in [35].

In all of the experiments, the domain Ω is the unit square Ω = (0, 1)2. We useuniform triangular meshes of size h = 1/64, 1/128, 1/256. Unless stated otherwise,the regular tile partitions 4× 4, 8× 8, and 16× 16 will be considered.

The transformed 2LM system (3.13) is solved by the GMRES algorithm [43] withrelative residual tolerance of 10−9 and maximum number of iterations of nΓ or 500,whichever is smaller (nΓ is the size of the 2LM systems). We consider three cases:without any preconditioner, with the two-level preconditioner P0 in [24], and with

our preconditioner P in (3.8). The λ obtained from λ is used as data for the local

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 19: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2913

Fig. 3. log10(α(x)) in the continuous test case (left) and alternating test case (right).

discrete problems (2.8). These are solved directly to obtain u, the approximationof the discrete solution. We will report the number of GMRES iterations in solving(3.13) for the Lagrange multiplier λ and the relative error of the approximation of

the discrete solution ‖u−uex‖‖uex‖ , where uex is computed by a direct solver. We will also

provide “dim,” the dimension of the coarse space and the value of parameter ε for thecase with preconditioner P . In all of the experiments, we will start with

dim = min{max

{4p, round(0.1nΓ)

}, round(0.2nΓ)

}and increase “dim” (through adding more eigenfunctions to the coarse space) by0.05nΓ if ε < 0.1.

For the last three experiments, the configuration with h = 1/64 and partition 4×4is more thoroughly studied. We plot eigenvalues of the generalized eigenvalue problem(3.4), the spectrum of the preconditioned system P−1A2LM , and the convergencehistory of solving (3.13) by GMRES. A plot of Cs(1 − 2ε)k−2 for a suitable Cs isprovided along the convergence history for comparison.

5.1. Continuous variations of the coefficient. In this experiments, we con-sider a continuous function αc(x), where

log10(αc(x)) = κ sin(wπ(x(1) + x(2))),

with κ = 3, w = 4, and x(i) is the ith coordinate of x.The coefficient α(x) is a piecewise constant approximation of αc(x) with inter-

polation points at the centroids of elements. The contrast ratio in this experimentis 106. Figure 3 (left) shows log10(α(x)) on the uniform mesh of size 1/64 for thecontinuous test case.

In Table 1, we can see that the preconditioner P helps to substantially reduce thenumber of GMRES iterations while delivering better accuracy for the approximation ofthe discrete solution. The convergence rate (which is a function of ε) and consequentlythe iteration count in the preconditioned case are stable with respect to changes inthe mesh size h and the number of subdomains p. The size of the coarse space doesgrow as h becomes smaller and p becomes bigger. However, this is inevitable as theproblem becomes harder and the coarse space must adapt to maintain a reasonableiteration count.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 20: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2914 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

Table 1

Convergence in continuous test case.

No preconditioner Preconditioner P

p Its.‖u−uex‖‖uex‖ Its.

‖u−uex‖‖uex‖ dim ε

h−1 = 6416 76 4.5e-8 21 2.1e-10 128 2.56e-164 134 2.5e-8 24 9.3e-11 353 2.67e-1

256 265 7.6e-1 25 3.2e-10 756 2.62e-1h−1 = 128

16 88 4.0e-9 21 9.3e-11 229 2.78e-164 157 9.4e-9 22 3.1e-10 533 2.74e-1

256 288 1.2e-8 21 9.3e-10 1524 3.02e-1h−1 = 256

16 106 6.3e-9 18 1.9e-10 459 2.82e-164 193 1.1e-9 18 6.3e-10 1071 2.83e-1

256 337 8.8e-9 22 2.2e-10 2295 2.83e-1

Table 2

Convergence in alternating test case.

No preconditioner Preconditioner P

p Its.‖u−uex‖‖uex‖ Its.

‖u−uex‖‖uex‖ dim ε

h−1 = 6416 70 3.1e-8 16 1.6e-10 128 3.08e-164 148 2.0e-8 18 4.8e-10 353 3.24e-1

256 214 9.3e-9 24 5.8e-10 756 2.98e-1h−1 = 128

16 88 1.7e-8 17 7.1e-11 229 2.95e-164 157 2.2e-8 20 2.0e-10 533 2.95e-1

256 337 7.1e-9 20 3.0e-10 1524 3.12e-1h−1 = 256

16 108 1.2e-8 16 1.2e-10 459 2.98e-164 198 1.1e-8 17 2.3e-10 1071 2.95e-1

256 402 9.5e-9 20 2.3e-10 2295 2.96e-1

5.2. Highly heterogeneous coefficient: Alternating case. We consider thecase where the coefficient α alternates between 1 and 108 in eleven horizontal stripes(cf. Figure 3 (right)). More precisely, α|τ = 108 if every point x in τ satisfies

mod(floor(11 x(2)), 2

)= 1,

and ατ = 1 otherwise. Here “mod” and “floor” denote the modulo and rounding (tothe nearest integer toward −∞) operators, respectively. The contrast ratio in thisexperiment is 108.

In Table 2, P shows a performance similar to that in the continuous test case. Itrequires a small, stable number of iterations while delivering better accuracy.

5.3. Highly heterogeneous coefficient: Skyscraper case. In this experi-ment, we consider

α|τ = 102∗mod(floor(10 cxτ(1)),2)−1, cxτ is the centroid of τ

if every point x of τ satisfies

mod(floor(10 x(i)), 2

)= 1, i = 1, 2.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 21: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2915

Fig. 4. log10(α(x)) in the skyscraper test case (left) and channels and inclusions test case (right).

Table 3

Convergence in skyscraper test case.

No preconditioner Preconditioner P

p Its. ‖u−uex‖‖uex‖ Its. ‖u−uex‖

‖uex‖ dim ε

h−1 = 6416 301 2.7e-2 21 3.7e-6 128 1.74e-164 442 3.7e-2 19 3.8e-6 353 2.33e-1

256 500 7.6e-1 19 2.1e-6 756 2.16e-1h−1 = 128

16 457 1.7e-2 20 6.4e-6 229 1.67e-164 500 5.2e-2 19 2.2e-6 533 1.98e-1

256 500 8.8e-1 18 1.7e-6 1524 2.41e-1h−1 = 256

16 500 1.1e-0 17 2.5e-5 459 1.70e-164 500 8.9e-1 16 6.7e-6 1071 2.01e-1

256 500 1.0e-0 15 4.1e-6 2295 2.16e-1

Figure 4 (left) shows log10(α) for the skyscraper test case. Basically, in the islands,we have α = 10(2k−1), k = 1, . . . , 5, from left to right. In the rest of the domain,α = 1. The contrast ratio in this experiment is 109.

According to Table 3, in this test case, (3.13) is very difficult to solve without apreconditioner. In many cases, the desired tolerance of 10−9 cannot be achieved evenafter the maximum GMRES iterations min{nΓ, 500}. Consequently, the computeddiscrete solutions are inaccurate with relative errors often bigger than 10−2. On theother hand, the preconditioner P keeps the number of GMRES iterations below 21.The computed discrete solutions are also fairly accurate with relative errors of around10−6. We also do not see big changes in iteration count as the mesh size and thenumber of subdomains vary.

For the case study, where h = 1/64 and p = 16, the eigenvalues of the generalizedeigenvalue problem are plotted in Figure 5, with those selected for the coarse spacemarked by red circles in the right plot (see electronic version for color figures). Thenumber of selected eigenvalues is small compared to the size of the 2LM system. Thespectrum of A2LM and P−1A2LM is illustrated in Figure 6. We can see that all ofthe eigenvalues of P−1A2LM lie inside the circle Sε, as proved by Lemma 4.1. Theconvergence history and predicted rate of convergence (dashed-dotted line) is shown

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 22: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2916 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

0 100 200 300 400 500 600 700 80010−15

10−10

10−5

100

105

0 100 200 300 400 500 600 700 80010−20

10−15

10−10

10−5

100

105

Fig. 5. Skyscraper: Eigenvalues of the generalized eigenvalue problem (3.4) are sorted inascending order and plotted against their indexes, with those selected for the coarse space markedby red circles in the right plot (color available online).

10−10 10−8 10−6 10−4 10−2 100−0.4

−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1−0.4

−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

Fig. 6. Skyscraper: Spectrum of A2LM (left) and P−1A2LM (right).

0 5 10 15 2010−6

10−4

10−2

100

102

104

106

Fig. 7. Skyscraper: Convergence history with predicted rate of convergence (slope of the dotted

line) in solving P−1A2LM λ = −P−1(I − 2K)Qg.

in Figure 7. It can be seen that the rate of convergence agrees with the prediction inTheorem 4.4.

5.4. Channels and inclusions. The setup for the coefficient α in this exper-iment is similar to the one in subsection 5.3 with α = 10k, k = 1, . . . , 5, in theislands from left to right. In addition, there are three channels with α = 106 (see

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 23: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2917

Table 4

Convergence in channels and inclusions test case.

No preconditioner Preconditioner P0 Preconditioner P

p Its.‖u−uex‖‖uex‖ Its.

‖u−uex‖‖uex‖ Its.

‖u−uex‖‖uex‖ ε

h−1 = 6416 205 1.4e-6 185 1.4e-5 32 3.0e-8 1.84e-164 332 1.2e-6 224 1.3e-2 30 2.3e-8 2.28e-1

256 500 9.0e-2 500 3.6e-1 36 2.6e-8 2.08e-1h−1 = 128

16 299 3.4e-7 185 1.4e-5 33 2.4e-8 1.73e-164 414 2.9e-7 239 4.6e-3 35 1.4e-8 2.01e-1

256 500 1.9e-2 500 1.6e-1 30 1.2e-8 2.41e-1h−1 = 256

16 408 2.3e-7 377 6.9e-6 37 1.4e-8 1.74e-164 498 1.6e-7 302 1.4e-3 30 6.2e-9 2.06e-1

256 500 5.9e-2 500 1.3e-1 30 1.0e-8 2.16e-1

0 100 200 300 400 500 600 700 80010−15

10−10

10−5

100

105

0 100 200 300 400 500 600 700 80010−15

10−10

10−5

100

105

Fig. 8. Channels and inclusions: Eigenvalues of the generalized eigenvalue problem (3.4) aresorted in ascending order and plotted against their indexes, with the ones selected for the coarsespace marked by red circles in the right plot (color available online).

Figure 4 (right)). The contrast ratio of the coefficient α in this experiment is 106.This is a test problem with known difficulties for many common preconditioners.

In this experiment, we also compare our new preconditioner P with the moreclassical preconditioner P0 based on a piecewise constant coarse space [24]. Due tolimited space, we omit the coarse space dimensions. They are actually the same as inthe previous experiments.

From Table 4, it can be seen that P0 only helps to reduce the GMRES iterationcount minimally, and that its performance and accuracy quickly deteriorate when pincreases. Our preconditioner P , on the other hand, keeps the iteration count stableand reasonably small while delivering superior accuracy.

For the case study, where h = 1/64 and p = 16, the eigenvalues of the generalizedeigenvalue problem are plotted in Figure 8, and the spectrum of the preconditionedsystems are shown in Figure 9. It can be seen that the convergence history in Figure 10agrees with Theorem 4.4.D

ownl

oade

d 01

/13/

16 to

137

.195

.8.2

1. R

edis

trib

utio

n su

bjec

t to

SIA

M li

cens

e or

cop

yrig

ht; s

ee h

ttp://

ww

w.s

iam

.org

/jour

nals

/ojs

a.ph

p

Page 24: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2918 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

10−8 10−6 10−4 10−2 100−8

−6

−4

−2

0

2

4

6

8x 10−13

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1−0.4

−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

Fig. 9. Channels and inclusions: Spectrum of P−10 A2LM (left) and P−1A2LM (right).

0 5 10 15 20 25 30 3510−8

10−6

10−4

10−2

100

102

Fig. 10. Channels and inclusions: Convergence history with predicted rate of convergence (slope

of the dotted line) in solving P−1A2LM λ = −P−1(I − 2K)Qg.

5.5. Lognormal. In this experiment, α = α(x,w) = 10Z(x,w), where Z(x,w) isa Gaussian random field with zero mean and Gaussian covariance

C(x, y) = σ2 exp

(−‖x− y‖2

�2

), with σ = 1, �2 = 1e-3.

Our realization of α is generated by the spectral decomposition method described in[31]. An example of log10(α) for the mesh of size 1/128 is shown in Figure 11 (left).The contrast ratio in this example is 108.

Similar to the channels and inclusions test case in subsection 5.4, the precondi-tioner P0 only helps to reduce the GMRES iteration count minimally (cf. Table 5). Itsperformance and accuracy quickly deteriorate when p increases. Our preconditionerP , on the other hand, keeps the iteration count stable and small while delivering goodaccuracy. It is also robust with the changes in mesh size and number of subdomains.The coarse space dimensions are again the same as in the first three test cases.

For the case study, where h = 1/64 and p = 16, the eigenvalues of the generalizedeigenvalue problem are plotted in Figure 12. The spectrum of the preconditionedsystems and the convergence history are shown in Figures 13 and 14, respectively. Inthese figures, we see a good agreement with Lemma 4.1 and Theorem 4.4.

In order to make sure that our preconditioner works for general partitions, weuse Metis [26] to generate the partitions used in our last test (see Table 6). Figure 11(right) shows the partition for the mesh with h = 1/64 and p = 64.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 25: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2919

Fig. 11. log10(α(x)) in the lognormal test case (left) and a partition generated by Metis (right).

Table 5

Convergence in lognormal test case.

No preconditioner Preconditioner P0 Preconditioner P

p Its. ‖u−uex‖‖uex‖ Its. ‖u−uex‖

‖uex‖ Its. ‖u−uex‖‖uex‖ ε

h−1 = 6416 99 1.5e-9 66 3.3e-4 23 1.2e-9 2.65e-164 161 1.5e-9 113 2.1e-3 23 2.1e-9 2.79e-1

256 500 7.6e-1 308 1.7e-2 19 2.1e-6 2.16e-1h−1 = 128

16 128 2.8e-9 74 6.2e-5 23 2.0e-9 2.40e-164 216 1.5e-9 141 2.9e-4 28 4.5e-9 2.22e-1

256 451 7.9e-10 372 1.2e-2 27 4.0e-9 2.46e-1h−1 = 256

16 158 1.5e-9 102 7.6e-6 20 3.4e-9 2.59e-164 267 7.9e-9 159 1.4e-4 21 3.4e-9 2.58e-1

256 500 3.0e-9 367 1.3e-3 23 3.5e-9 2.50e-1

0 500 1000 1500 2000 2500 3000 3500 400010−15

10−10

10−5

100

105

0 500 1000 1500 2000 2500 3000 3500 400010−15

10−10

10−5

100

105

Fig. 12. Lognormal: Eigenvalues of the generalized eigenvalue problem (3.4) are sorted inascending order and plotted against their indexes, with the ones selected for the coarse space markedby red circles in the right plot (color available online).

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 26: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2920 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

10−8 10−6 10−4 10−2 100−4

−3

−2

−1

0

1

2

3

4x 10−10

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1−0.4

−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

Fig. 13. Lognormal: Spectrum of P−10 A2LM (left) and P−1A2LM (right).

0 5 10 15 20 25 3010−8

10−6

10−4

10−2

100

102

Fig. 14. Lognormal: Convergence history with predicted rate of convergence (slope of the dotted

line) in solving P−1A2LM λ = −P−1(I − 2K)Qg.

Table 6

Convergence in lognormal test case with Metis partitions.

No preconditioner Preconditioner P0 Preconditioner P

p Its. ‖u−uex‖‖uex‖ Its. ‖u−uex‖

‖uex‖ Its. ‖u−uex‖‖uex‖ ε

h−1 = 6416 130 9.7e-10 80 1.0e-4 39 4.9e-9 1.53e-164 202 1.1e-9 155 3.5e-4 46 6.9e-9 1.51e-1

256 390 8.2e-10 463 3.9e-3 64 1.2e-8 8.65e-2h−1 = 128

16 156 1.1e-9 112 1.6e-5 30 5.9e-9 1.91e-164 249 9.0e-10 186 1.9e-4 32 8.1e-9 1.84e-1

256 465 1.1e-9 454 4.7e-3 33 9.0e-9 1.68e-1h−1 = 256

16 181 1.5e-9 133 2.0e-6 22 5.2e-9 2.25e-164 294 7.9e-10 192 1.1e-4 25 4.7e-9 2.10e-1

256 500 9.1e-9 478 1.3e-3 27 5.8e-9 2.19e-1

The preconditioner P is still the winner, with good accuracy and a much smalleriteration count. In comparison with the case where regular partitions are used, theiteration counts are bigger, especially when the mesh is coarse (h = 1/64). However,they become reasonably small for finer meshes.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 27: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2921

6. Conclusion. We have formulated and analyzed a two-level preconditionerfor optimized Schwarz and 2-Lagrange methods. With a coarse space that can auto-matically adapt to diffusion coefficient and achieve any a priori given linear rate ofconvergence, our preconditioner is very efficient and robust with highly heterogeneousdiffusion coefficient. Numerical results have verified our theoretical findings.

Acknowledgment. We would like to thank the anonymous referees whose con-structive remarks helped improve the paper significantly.

REFERENCES

[1] J. Aarnes and T. Y. Hou, Multiscale domain decomposition methods for elliptic problemswith high aspect ratios, Acta Math. Appl. Sin. Engl. Ser., 18 (2002), pp. 63–76.

[2] M. Benzi, A. Frommer, R. Nabben, and D. B. Szyld, Algebraic theory of multiplicativeSchwarz methods, Numer. Math., 89 (2001), pp. 605–639.

[3] H. Berninger, S. Loisel, and O. Sander, The 2-Lagrange multiplier method applied to non-linear transmission problems for the Richards equation in heterogeneous soil with crosspoints, SIAM J. Sci. Comput., 36 (2014), pp. A2166–A2198.

[4] E. J. Braga and M. J. S. de Lemos, Heat transfer in enclosures having a fixed amount ofsolid material simulated with heterogeneous and homogeneous models, Internat. J. HeatMass Trans., 48 (2005), pp. 4748–4765.

[5] J. H. Bramble, J. E. Pasciak, J. P. Wang, and J. Xu, Convergence estimates for productiterative methods with applications to domain decomposition, Math. Comp., 57 (1991),pp. 1–21.

[6] T. F. Chan and T. P. Mathew, Domain decomposition algorithms, Acta Numer., 3 (1994),pp. 61–143.

[7] K. A. Cliffe, I. G. Graham, R. Scheichl, and L. Stals, Parallel computation of flow inheterogeneous media modelled by mixed finite elements, J. Comput. Phys., 164 (2000),pp. 258–282.

[8] V. Dolean, F. Nataf, R. Scheichl, and N. Spillane, Analysis of a two-level Schwarz methodwith coarse spaces based on local Dirichlet-to-Neumann maps, Comput. Methods Appl.Math., 12 (2012), pp. 391–414.

[9] T. A. Driscoll, K.-C. Toh, and L. N. Trefethen, From potential theory to matrix iterationsin six steps, SIAM Rev., 40 (1998), pp. 547–578.

[10] S. W. Drury and S. Loisel, Sharp condition number estimates for the symmetric 2-Lagrangemultiplier method, in Domain Decomposition Methods in Science and Engineering XX,Springer, Heidelberg, 2013, pp. 255–261.

[11] M. Dryja, M. V. Sarkis, and O. B. Widlund, Multilevel Schwarz methods for elliptic prob-lems with discontinuous coefficients in three dimensions, Numer. Math., 72 (1996), pp. 313–348.

[12] M. Dryja, B. F. Smith, and O. B. Widlund, Schwarz analysis of iterative substructuringalgorithms for elliptic problems in three dimensions, SIAM J. Numer. Anal., 31 (1994),pp. 1662–1694.

[13] M. Dryja and O. B. Widlund, Towards a unified theory of domain decomposition algorithmsfor elliptic problems, in Proceedings of the Third International Symposium on DomainDecomposition Methods for Partial Differential Equations (Houston, TX, 1989), SIAM,Philadelphia, 1990, pp. 3–21.

[14] O. Dubois, M. J. Gander, S. Loisel, A. St-Cyr, and D. B. Szyld, The optimized Schwarzmethod with a coarse grid correction, SIAM J. Sci. Comput., 34 (2012), pp. A421–A458.

[15] M. Embree, How Descriptive Are GMRES Convergence Bounds?, Tech. report, Oxford Uni-versity Computing Laboratory, 1999.

[16] C. Farhat, A. Macedo, M. Lesoinne, F.-X. Roux, F. Magoules, and A. de La Bour-

donnaie, Two-level domain decomposition methods with Lagrange multipliers for the fastiterative solution of acoustic scattering problems, Comput. Methods Appl. Mech. Engrg.,184 (2000), pp. 213–239.

[17] C. Farhat and F.-X. Roux, A method of finite element tearing and interconnecting and itsparallel solution algorithm, Internat. J. Numer. Methods Engrg., 32 (1991), pp. 1205–1227.

[18] J. Galvis and Y. Efendiev, Domain decomposition preconditioners for multiscale flows inhigh-contrast media, Multiscale Model. Simul., 8 (2010), pp. 1461–1483.

[19] J. Galvis and Y. Efendiev, Domain decomposition preconditioners for multiscale flows in

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 28: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

A2922 SEBASTIEN LOISEL, HIEU NGUYEN, AND ROBERT SCHEICHL

high contrast media: Reduced dimension coarse spaces, Multiscale Model. Simul., 8 (2010),pp. 1621–1644.

[20] M. J. Gander, Optimized Schwarz methods, SIAM J. Numer. Anal., 44 (2006), pp. 699–731.[21] M. J. Gander, L. Halpern, and F. Nataf, Optimized Schwarz methods, in Domain Decom-

position Methods in Sciences and Engineering (Chiba, 1999), DDM.org, Augsburg, 2001,pp. 15–27.

[22] I. G. Graham and M. J. Hagger, Unstructured additive Schwarz–conjugate gradient methodfor elliptic problems with highly discontinuous coefficients, SIAM J. Sci. Comput., 20(1999), pp. 2041–2066.

[23] I. G. Graham, P. O. Lechner, and R. Scheichl, Domain decomposition for multiscale PDEs,Numer. Math., 106 (2007), pp. 589–626.

[24] A. Karangelis and S. Loisel, Condition number estimates and weak scaling for 2-level 2-Lagrange multiplier methods for general domains and cross points, SIAM J. Sci. Comput.,37 (2015), pp. C247–C267.

[25] A. Karangelis, S. Loisel, and C. Maynard, Solving large systems on hector using the 2-Lagrange multiplier methods, in Domain Decomposition Methods in Science and Engineer-ing XXI, Springer, Heidelberg, 2014, pp. 497–505.

[26] G. Karypis and V. Kumar, A Software Package for Partitioning Unstructured Graphs, Par-titioning Meshes, and Computing Fill-Reducing Orderings of Sparse Matrices, Universityof Minnesota, Department of Computer Science and Engineering, Army HPC ResearchCenter, Minneapolis, MN, 1998.

[27] A. Klawonn and O. B. Widlund, Dual and dual-primal FETI methods for elliptic problemswith discontinuous coefficients in three dimensions, in Domain Decomposition Methods inSciences and Engineering (Chiba, 1999), DDM.org, Augsburg, 2001, pp. 29–39.

[28] S. Loisel, Condition number estimates for the nonoverlapping optimized Schwarz method andthe 2-Lagrange multiplier method for general domains and cross points, SIAM J. Numer.Anal., 51 (2013), pp. 3062–3083.

[29] S. Loisel and D. B. Szyld, On the convergence of optimized Schwarz methods by way ofmatrix analysis, in Domain Decomposition Methods in Science and Engineering XVIII,Lecture Notes in Comput. Sci. 70, Springer, Berlin, 2009, pp. 363–370.

[30] S. Loisel and D. B. Szyld, On the geometric convergence of optimized Schwarz methods withapplications to elliptic problems, Numer. Math., 114 (2010), pp. 697–728.

[31] G. J. Lord, C. E. Powell, and T. Shardlow, An Introduction to Computational StochasticPDEs, Cambridge Texts Appl. Math., Cambridge University Press, New York, 2014.

[32] J. Mandel, Hybrid domain decomposition with unstructured subdomains, in Domain Decom-position Methods in Science and Engineering (Como, 1992), Contemp. Math. 157, AMS,Providence, RI, 1994, pp. 103–112.

[33] J. Mandel and C. R. Dohrmann, Convergence of a balancing domain decomposition by con-straints and energy minimization, Numer. Linear Algebra Appl., 10 (2003), pp. 639–659.

[34] T. P. A. Mathew, Domain Decomposition Methods for the Numerical Solution of PartialDifferential Equations, Lect. Notes Comput. Sci. Eng. 61, Springer, Berlin, 2008.

[35] F. Nataf, H. Xiang, V. Dolean, and N. Spillane, A coarse space construction based onlocal Dirichlet-to-Neumann maps, SIAM J. Sci. Comput., 33 (2011), pp. 1623–1642.

[36] R. A. Nicolaides, Deflation of conjugate gradients with applications to boundary value prob-lems, SIAM J. Numer. Anal., 24 (1987), pp. 355–365.

[37] C. Pechstein and R. Scheichl, Analysis of FETI methods for multiscale PDEs, Numer.Math., 111 (2008), pp. 293–333.

[38] C. Pechstein and R. Scheichl, Weighted Poincare inequalities, IMA J. Numer. Anal., 33(2013), pp. 652–686.

[39] A. Qaddouri, L. Laayouni, S. Loisel, J. Cote, and M. J. Gander, Optimized Schwarzmethods with an overset grid for the shallow-water equations: Preliminary results, Appl.Numer. Math., 58 (2008), pp. 459–471.

[40] R. Qiao and P. He, Simulation of heat conduction in nanocomposite using energy-conservingdissipative particle dynamics, Molecular Simul., 33 (2007), pp. 677–683.

[41] A. Quarteroni and A. Valli, Domain decomposition methods for partial differential equa-tions, in Numerical Mathematics and Scientific Computation, The Clarendon Press, OxfordUniversity Press, New York, 1999.

[42] F.-X. Roux, F. Magoules, S. Salmon, and L. Series, Optimization of interface operatorbased on algebraic approach, in Domain Decomposition Methods in Science and Engineer-ing, Natl. Auton. Univ. Mex., Mexico, 2003, pp. 297–304.

[43] Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for solvingnonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856–869.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 29: SEBASTIEN LOISEL´ Schwarzand2 … · × 2 block upper triangular matrix with the identity matrix in ... be expensive for these very large sparse systems of linear equations

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

OSM AND 2LM FOR MULTISCALE PDES A2923

[44] R. Scheichl and E. Vainikko, Additive Schwarz with aggregation-based coarsening for ellipticproblems with highly variable coefficients, Computing, 80 (2007), pp. 319–343.

[45] R. Scheichl, P. S. Vassilevski, and L. T. Zikatanov, Weak approximation properties ofelliptic projections with functional constraints, Multiscale Model. Sim., 9 (2011), pp. 1677–1699.

[46] R. Scheichl, P. S. Vassilevski, and L. T. Zikatanov, Multilevel methods for elliptic problemswith highly varying coefficients on nonaligned coarse grids, SIAM J. Numer. Anal., 50(2012), pp. 1675–1694.

[47] N. Spillane, V. Dolean, P. Hauret, F. Nataf, C. Pechstein, and R. Scheichl, A robusttwo-level domain decomposition preconditioner for systems of PDEs, C. R. Math. Acad.Sci. Paris, 349 (2011), pp. 1255–1259.

[48] N. Spillane, V. Dolean, P. Hauret, F. Nataf, C. Pechstein, and R. Scheichl, Abstractrobust coarse spaces for systems of PDEs via generalized eigenproblems in the overlaps,Numer. Math., 126 (2014), pp. 741–770.

[49] N. Spillane, V. Dolean, P. Hauret, F. Nataf, and D. J. Rixen, Solving generalized eigen-value problems on the interfaces to build a robust two-level FETI method, C. R. Math.Acad. Sci. Paris, 351 (2013), pp. 197–201.

[50] J. M. Tang, R. Nabben, C. Vuik, and Y. A. Erlangga, Comparison of two-level precon-ditioners derived from deflation, domain decomposition and multigrid methods, J. Sci.Comput., 39 (2009), pp. 340–370.

[51] A. F. B. Tompson and L. W. Gelhar, Numerical simulation of solute transport in three-dimensional, randomly heterogeneous porous media, Water Resour. Res., 26 (1990),pp. 2541–2562.

[52] A. Toselli and O. Widlund, Domain Decomposition Methods—Algorithms and Theory,Springer Ser. Comput. Math. 34, Springer-Verlag, Berlin, 2005.

[53] J. Xu and Y. Zhu, Uniform convergent multigrid methods for elliptic problems with stronglydiscontinuous coefficients, Math. Models Methods Appl. Sci., 18 (2008), pp. 77–105.

[54] Y. Zhu, Domain decomposition preconditioners for elliptic equations with jump coefficients,Numer. Linear Algebra Appl., 15 (2008), pp. 271–289.

Dow

nloa

ded

01/1

3/16

to 1

37.1

95.8

.21.

Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php