eﬃcient update of hierarchical matrices in the case of ...like application of h-matrices in the...

Efficient Update of Hierarchical Matrices

in the case of

Adaptive Discretisation Schemes

Von der Fakultat fur Mathematik und Informatikder Universitat Leipzig

angenommene

D I S S E R T A T I O N

zur Erlangung des akademischen Grades

DOCTOR RERUM NATURALIUM(Dr.rer.nat.)

im Fachgebiet

Mathematik

vorgelegt

von Diplommathematikerin Jelena Djokicgeboren am 23.01.1978 in Kragujevac (Serbien)

Die Annahme der Dissertation haben empfohlen:

1. Professor Dr. Dr. h.c. Wolfgang Hackbusch (MPIMN Leipzig)

2. Professor Dr. Sergej Rjasanow (Universitat des Saarlands)

3. Professor Dr. Stefan Sauter (Universitat Zurich, CH)

Die Verleihung des akademischen Grades erfolgt auf Beschluss des Rates der Fakultat furMathematik und Informatik vom 17.07.2006 mit dem Gesamtpradikat magna cum laude.

Acknowledgment

I thank

Prof. Dr. Dr. h.c. Wolfgang Hackbusch for giving me the possibility

to study and work at the Max-Planck-Institute for Mathematics in

the Sciences, as well for useful ideas and discussions,

Dr. Lars Grasedyck and Dr. Steffen Borm for scientific support,

patience and nice friendship,

Mrs. Valeria Hunniger for helping me to start my life in Germany.

Finally, I would like to thank my family in Serbia for teaching me to

love mathematics and for an unselfish support in all the years of my

studies. Last but not least I would like to thank my friends all over

the world for staying in contact with me and having faith in me.

I dedicate this work to my late grandparents.

In loving memory of Ljubica and Milovan Jaglicic

Contents

1. Introduction 9

1.1. Model Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121.2. Numerical Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.3. Discretisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

1.4. Boundary Element Method . . . . . . . . . . . . . . . . . . . . . . . . . . 15

1.4.1. Triangulation of the Surface Γ . . . . . . . . . . . . . . . . . . . . 161.4.2. Ansatz Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

1.5. Degenerate Kernel Function . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2. Clustering 23

2.1. Cluster Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.1.1. Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232.1.2. Cluster Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.2. Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.2.1. Geometric Clustering . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.2.2. Cardinality Balanced Clustering . . . . . . . . . . . . . . . . . . . 282.3. Box Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.3.1. Box Tree Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.4. Admissibility Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332.5. Block Cluster Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

2.6. Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

3. Hierarchical Matrices 47

3.1. Rk-Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

3.2. H-Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3.3. Low-Rank Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . 513.3.1. Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.3.2. Adaptive Cross Approximation . . . . . . . . . . . . . . . . . . . . 56

3.3.3. Hybrid Cross Approximation . . . . . . . . . . . . . . . . . . . . . 583.4. H2-Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

3.5. Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3.5.1. Motivation for the Problem . . . . . . . . . . . . . . . . . . . . . . 63

3.5.2. Source of the Problem: One-dimensional Example . . . . . . . . . 643.5.3. Source of the Problem: Generalisation . . . . . . . . . . . . . . . . 66

3.6. Adaptive Refinement of the Grid . . . . . . . . . . . . . . . . . . . . . . . 68

5

Contents

3.6.1. Red, Green, Blue Refinement . . . . . . . . . . . . . . . . . . . . . 68

3.6.2. Refinement Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4. Update of the Cluster Tree 71

4.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

4.2. Update Algorithm for Cluster Trees . . . . . . . . . . . . . . . . . . . . . 73

4.2.1. Update of the Cluster Trees . . . . . . . . . . . . . . . . . . . . . . 73

4.2.2. Indirect Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

4.2.3. Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

4.2.4. Fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

4.2.5. Update of the Admissibility Tree . . . . . . . . . . . . . . . . . . . 83

4.3. Update of the Block Cluster Tree . . . . . . . . . . . . . . . . . . . . . . . 84

5. Update of Hierarchical Matrices 89

5.1. From an Updated Block Cluster Tree to the Update of an H-Matrices . . 89

5.2. Update of Low-rank Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . 91

5.2.1. Update of Low-rank Blocks by Interpolation . . . . . . . . . . . . . 91

5.2.2. Update of Low-rank Blocks by Adaptive Cross Approximation . . 94

5.2.3. Update of Low-rank Blocks by Hybrid Cross Approximation . . . 98

5.3. Update of Full Matrix Blocks . . . . . . . . . . . . . . . . . . . . . . . . . 104

5.3.1. Extended Update . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

5.4. Update of H-Matrices in the case of Piecewise Linear Functions . . . . . . 107

5.4.1. Update of Low-rank Blocks . . . . . . . . . . . . . . . . . . . . . . 107

5.4.2. Update of Full Matrix Blocks . . . . . . . . . . . . . . . . . . . . . 109

5.5. Update of H2-Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

5.5.1. Update of the Cluster Bases . . . . . . . . . . . . . . . . . . . . . . 110

5.5.2. Update of Uniform Matrices . . . . . . . . . . . . . . . . . . . . . . 111

5.6. Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

6. Applications 117

6.1. Applications in Boundary Element Methods . . . . . . . . . . . . . . . . . 117

6.1.1. Green’s Representation Formula . . . . . . . . . . . . . . . . . . . 117

6.1.2. Numerical Results for H-Matrices: Interpolation . . . . . . . . . . 119

6.1.3. Numerical Results for H-Matrices using Adaptive Cross Approxi-mation (ACA) for Computing Rk-Matrix Blocks . . . . . . . . . . 121

6.2. Numerical Results for H2-Matrices . . . . . . . . . . . . . . . . . . . . . . 124

6.3. Numerical Results for Hybrid Cross Approximation . . . . . . . . . . . . . 125

6.3.1. HCA(I) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

6.3.2. Numerical Results for HCA(II) . . . . . . . . . . . . . . . . . . . . 127

6.4. Numerical Results for Non-local Refinements . . . . . . . . . . . . . . . . 130

6.5. Error Estimators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

6.5.1. Averaging Error Estimators . . . . . . . . . . . . . . . . . . . . . . 132

6

Contents

7. Implementation 1377.1. H-Matrix Library-HLib . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1377.2. Implementation of the H-Matrix Structure . . . . . . . . . . . . . . . . . . 138

7.2.1. Implementation of the Cluster Tree . . . . . . . . . . . . . . . . . . 1387.2.2. Box Tree Clustering Implementation . . . . . . . . . . . . . . . . . 1397.2.3. Implementation of Full and Rk-matrices . . . . . . . . . . . . . . . 1417.2.4. Implementation of H-Matrix . . . . . . . . . . . . . . . . . . . . . 1427.2.5. Implementation of H2-Matrices . . . . . . . . . . . . . . . . . . . . 143

7.3. Update of Cluster Tree-Implementation . . . . . . . . . . . . . . . . . . . 1447.3.1. Implementation of Indirect Clustering . . . . . . . . . . . . . . . . 1447.3.2. Implementation of Reduction . . . . . . . . . . . . . . . . . . . . . 1477.3.3. Implementation of Fusion . . . . . . . . . . . . . . . . . . . . . . . 148

7.4. Implementation of Matrix Blocks Update . . . . . . . . . . . . . . . . . . 1517.4.1. Implementation of Update for Admissible Blocks . . . . . . . . . . 1527.4.2. Implementation of Update for Inadmissible Matrix Blocks . . . . . 154

A. APPENDIX 157A.1. Piecewise Constant Ansatz . . . . . . . . . . . . . . . . . . . . . . . . . . 157

A.1.1. Entries for the SLP Matrix . . . . . . . . . . . . . . . . . . . . . . 157A.1.2. Entries for the DLP Matrix . . . . . . . . . . . . . . . . . . . . . . 159

A.2. Piecewise Linear Ansatz . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162A.2.1. Entries for the SLP Matrix . . . . . . . . . . . . . . . . . . . . . . 162A.2.2. Entries for the DLP Matrix . . . . . . . . . . . . . . . . . . . . . . 163

7

Contents

8

1. Introduction

Computer technology has begun to change everyday life, but its effect on scientific andmathematical research has been even more profound. This would be impossible withoutefficient algorithms. What follows is a small contribution to the vast field of efficientnumerical algorithms.

The setting of the problem originates from an elliptic partial differential equation. Cer-tain PDEs (in the simplest case Laplace’s equation with Dirichlet (Neumann) boundaryconditions) can be reformulated as integral equations (e.g. Fredholm integral equationof the first kind) which possess a unique but, for practical applications, not explicitlycomputable solution. The main reason for the reformulation is the simplification of theoriginal task. While the PDE should be solved in the domain Ω ⊂ Rd, d ∈ 2, 3, theintegral equation is posed on a (lower dimensional) manifold Γ ⊆ Rd. The numericaltask is to provide an approximated but useful solution of the integral equation using theproperties of the kernel function g(x, y) that defines the integral operator. In almost allpractical problems we consider, the kernel function of the integral operator has singu-larities only on the diagonal, i.e., on the set (x, y) ∈ Γ × Γ | x = y.

Very popular and widely used methods of solving integral equations numerically arediscretisation schemes like Ritz-Galerkin or collocation techniques. These methods solvethe integral equation approximately by solving a system of linear equations. The prob-lem that arises is in the matrix of the system of linear equations which is, in the gen-eral case, densely populated. In order to overcome this obstacle we can approximatethat matrix using methods like multipole [47, 37], wavelets [24] or hierarchical matrices[40, 42, 41, 27, 31] that will be our choice in this work.

The hierarchical matrix technique (or briefly H-matrix technique) has been developedduring the past ten years and was built on the basis of panel clustering [44]. The mainproperty of the hierarchical matrices is their data-sparse structure and the main ad-vantage is that H-matrix arithmetic can be performed in almost optimal complexityO(n logc n) for n × n systems and small positive integer c ([27, 31]). The complexityO(n log n) achieved for H-matrices can be improved to O(n) if we use H2-matrices,which are a special class of H-matrices, and which will be also considered in this work.There are many research areas where H-matrices can be efficiently applied. Here weshall mention some of them.

For finite element methods (FEM) the paper [31] considers the construction of theH-matrices for standard finite element applications. A proof of existence of the inverse

9

1. Introduction

of the FEM stiffness matrix in the H-matrix format that is carefully discussed in [6].Further on, there are some applications of FEM combined with the H-matrix techniquelike application of H-matrices in the case of convection-dominated elliptic PDE’s thatcan be found in [30] and black box domain decomposition developed in [36]. Also there isan application of FEM and H-matrix techniques for the EEG and MEG inverse problemthat has been published in [51].In control theory one applies the H-matrix technique for a special kind of matrix equa-tions. The existence of an H-matrix approximant to the solution of a Sylvester equationis presented in [28]. A work related to the efficient treatment of Lyapunov equationscan be found in [4]. Also the multigrid method is combined with a data-sparse matrixrepresentation and used for solving Sylvester’s equation. This idea is developed in [32].Besides Sylvester’s equation, also the Riccati equation can be treated by H-matrices.For more details we refer to [34].The H-matrix technique has the largest application field in boundary element meth-ods (BEM). There are many works related to this topic starting with the constructionof H-matrices for standard BEM applications presented in [31, 11, 14]. There are alsocontributions to the cross approximation techniques introduced in [26, 50] and furtherdeveloped in [7, 25] into adaptive cross approximation as a method for constructing thelow-rank approximation. Based on the adaptive cross approximation, the hybrid crossapproximation introduced in [12] represents an improvement for the construction of thelow-rank approximations. The LU-decomposition has been considered first in [46], fur-ther development can be found [30, 36]. There are also improvements in the constructionof H-matrices. A recompression technique that minimises the storage requirements andspeeds up the H-matrix arithmetic is presented in [29]. Ideas of updating hierarchicalmatrices in [35, 33] are the basis of this work. Concerning the H2-matrices, basic ideascan be found in [43, 15]. Theory of variable-order interpolation is discussed in [18] whilethe recompression techniques are considered in [16, 9]. An application of H2-matrices toMaxwell’s equations is presented in [19]. There are also papers on H2-matrix arithmeticsgiven in [8]. Further H-matrix application can be found in [1] where this technique isapplied to the Helmholtz operator.

Data-sparsity means that H-matrices can be described by few data. The name “hi-erarchical matrices” comes from the specific hierarchical structure of these matrices.The structure depends on the discretisation and on the properties of the kernel function.Observing the kernel function we notice that in certain parts of the domain, which arefar away from singularities, the kernel function is smooth. These parts of the domain wename “far-field” while the parts of the domain where the kernel possesses singularitieswe name “near-field”. The natural idea is to approximate matrix blocks correspondingto the “far-field” by low-rank matrices. Following this idea we obtain a block matrixwhose blocks are either low-rank matrices or full matrices, and we name it hierarchicalor H-matrix. The structure of the H-matrix is not trivial and its construction requiresseveral auxiliary structures. Before we start constructing the needed structures we intro-duce the finite index set I. It corresponds to the basis functions from the discretisationscheme and their supports. The H-matrix constructing algorithm has three major steps:

10

1. In the first step the hierarchical partitioning of the index set I is determined andstored in the structure called cluster tree. It is followed by

2. the partition of the product index set I×I that is stored in form of a block clustertree, which defines the structure of the H-matrix.

3. The last step of the construction of an H-matrix includes the computation of a low-rank approximation for “far-field” blocks. We will consider three different methodsfor constructing the low-rank approximation: interpolation, adaptive cross approx-imation (ACA) and hybrid cross approximation (HCA).

The central part of this work is the update of hierarchical matrices. The motivation forthe update of H-matrices comes from two directions. One of them is the discretisationscheme and on the other hand it is the H-matrix itself. A discretisation scheme canbecome finer in different ways but an especially interesting one is when it is refinedonly locally. More precisely, if the new discretisation scheme contains, compared to theold discretisation scheme, only few new elements, solving the system of equations willdemand the full assembly of an H-matrix. Assembling the H-matrix is not just memorybut also time consuming. Therefore we try to improve both time and storage require-ments. If we assume that the H-matrix corresponding to the old discretisation schemeexists, in order to obtain the H-matrix corresponding to the new discretisation scheme,we choose an alternative way which includes recycling the “old” H-matrix. This methodwe will call update of hierarchical matrices. The description of this method andtesting its efficiency will be the central part of this work. The update of an H-matrix isconsiderably faster than constructing a new one.

The update algorithm is similar to the construction of a hierarchical matrix. Thereare three major points:

1. update of the cluster tree,

2. update of the block cluster tree and

3. update of the low-rank blocks for each approximation scheme separately (interpo-lation, ACA, HCA).

The recycling strategy will be applied in each step of the update process. The updateof the cluster tree and the update of the block cluster tree use the “old” cluster treeand “old” block cluster tree absorbing from them everything useful. In the case of theupdate of low-rank blocks our strategy is to copy all entries that can be copied whichcorrespond to the basis functions from both (new and old) discretisation schemes. Theefficiency of the method is measured by the time needed to assemble the matrix. Ouraim is to achieve the following: if the discretisation scheme contains p% new elementsand if t is the time in seconds needed to assemble the matrix using the original algorithmfor constructing the H-matrix, then we want to assemble the H-matrix by the updatealgorithm in time t p

100 .

11

1. Introduction

This thesis is divided into seven chapters. This one will contain the precise defini-tion of the problem we try to solve and some general properties of the boundary elementmethod (BEM). Chapter 2 contains the definition of the cluster tree, several clusteringalgorithms and the definition of the block cluster tree. The third chapter contains thedefinition of hierarchical matrices and algorithms for obtaining the low-rank approxi-mation. In the last section of this chapter we explain the motivation for the updateand the purpose of the update algorithm. The update algorithm is explained in twoseparate chapters. Chapter 4 will consider the update of the cluster tree, while Chapter5 explains the update of hierarchical matrices. Chapter 6 contains the numerical testsof our method and the last Chapter is devoted to implementational details.

1.1. Model Problem

Let K be a field of real (R) or complex (C) numbers, and d ∈ N a fixed number. Further,let Ω be a bounded and connected Lipschitz domain in Rd and Γ := ∂Ω its boundary.If Ω ⊂ R3, then Γ is a surface. We consider integral operators of the form

G : X(Γ) −→ Y (Γ), u 7→(x 7→

∫

Γg(x, y)u(y)dΓy

)(1.1)

where X(Γ) and Y (Γ) are function spaces over Γ and g : Γ × Γ −→ K is a givenkernel function. The function u : Γ → K is a so-called density function. Depending onthe kernel function and its singularities we define different kinds of integral operators.In this work we will restrict ourselves to integral operators that arise from Laplace’sequation in Ω ⊂ R3 with Dirichlet boundary conditions

−∆u = 0 in Ω, (1.2)

u = fD on Γ, (1.3)

or with Neumann boundary conditions

−∆u = 0 in Ω,

∂u

∂n= fN on Γ. (1.4)

The fundamental solution of (1.2) is

g(x, y) :=1

4π

1

‖x− y‖ (1.5)

where x resp. y refers to x := (x1, x2, x3) resp. y := (y1, y2, y3). This function dependsonly on the Euclidean norm of x−y and solves (1.2) in R3 \y. Using the kernel (1.5) andits (normal) derivatives we specify the single-layer potential, the double-layer potentialand the hyper-singular operator.

12

1.2. Numerical Solution

Definition 1.1 Let s ∈ [−12 ,

12 ]. The integral operators corresponding to Laplace’s equa-

tion are:

1. Single-layer potential operator V : Hs− 12 → Hs+ 1

2 with kernel

gV : Γ × Γ → R, (x, y) → 1

4π

1

‖x− y‖ .

2. Double-layer potential operator K : Hs− 12 → Hs− 1

2 with kernel

gK : Γ × Γ → R, (x, y) → 1

4π

〈ny, x− y〉‖x− y‖3

.

3. Hyper-singular operator W : Hs+ 12 → Hs− 1

2 with kernel

gW : Γ × Γ → R,

(x, y) → 1

4π

[〈ny, x− y〉‖x− y‖3

− 3〈nx, x− y〉〈ny, x− y〉

‖x− y‖5

].

From the definition of the kernel functions we observe that all of them have singularitiesalong the diagonal, i.e. in the set (x, y) | x, y ∈ Γ ∧ x = y.The next step is to define an integral equation. As the name suggests the unknownfunction is under the integral sign, i.e. we seek the solution of the equation

Gu(x) + λu(x) = f(x) (1.6)

where G is an integral operator (1.1), f ∈ Y (Γ), λ ∈ K, and u ∈ X(Γ) is the soughtfunction. If λ = 0 then we have an integral equation of the first kind. In the casethat the integral is taken over a fixed interval of R or more general a fixed subset of Rd

(including curves and surfaces) we have a Fredholm integral equation.

Example 1.2 The solution of the boundary value problem (1.2, 1.3) can be obtained bysolving the corresponding integral equation:

1

4π

∫

Γ

1

‖x− y‖u(y)dΓy = fD(x), x ∈ Γ. (1.7)

The solution of (1.7) exists (Theorem 8.1.21 from [39]) and it is unique (Theorem 8.1.20from [39]).

1.2. Numerical Solution

The history of solving integral equations is spread across several centuries. The work ofvarious mathematicians (Fredholm, Galerkin, etc) provided different methods for solvingthe integral equations ([39]). The discretisation methods (like Ritz-Galerkin, collocation

13

1. Introduction

or Nystrøm schemes) are very popular and widely used. These methods approximatethe solution of the integral equation by the solution of a system of linear equations. Themajor advantage of discretisation schemes is in their practical applications. However,the computational costs (either for assembling the matrix or the costs for solving thesystem of equations) of a naive approach are rather high. If in the optimal case thecosts are of the order O(n), for a fixed number of degrees of freedom n, the costs of thediscretisation methods are of the order O(n2). From the implementational point of viewthe disadvantage is a high storage requirement.

In the last twenty years several methods have been developed that reduce the high costsof the standard discretisation methods. Among them are the multipole method,panel-clustering, wavelets and— the one that we consider here—the hierarchicalmatrix technique. All of them can be regarded as extended discretisation methods,that reduce the costs by approximating the full matrix (of the system of linear equations)trying to keep the same order of error as the discretisation error.

1.3. Discretisation

This section gives a short explanation of the Galerkin discretisation scheme. We startwith the integral equation

Gu(x) + λu(x) = f(x)

and give its equivalent variational formulation: we seek u ∈ X(Γ) such that

〈Gu+ λu, ψ〉 = 〈f, ψ〉, for all 〈., ψ〉 ∈ Y ′(Γ). (1.8)

Since the spaces X(Γ) as well as Y ′(Γ) are of infinite dimension, the approximativesolution of (1.8) will be sought in a finite dimensional subspace Xn(Γ) that is spannedby functions ϕ1, . . . , ϕn. The solution u can be represented in the space Xn(Γ) in theform un :=

∑ni=1 xiϕi for unknown coefficients xi. We choose also a finite dimensional

subspace Y ′n(Γ) with basis 〈., ψ1〉, . . . , 〈., ψn〉 and replace (1.8) by the discrete formulation.

We seek un ∈ Xn(Γ) such that

〈Gun + λun, ψi〉 = 〈f, ψi〉 for all i ∈ 1, . . . , n. (1.9)

Remark 1.3 The functions ψi, i ∈ 1, . . . , n, can be identical to the basis functionsϕi, i.e. ϕi = ψi, for all i ∈ 1, . . . , n and in this case we have the Ritz-Galerkindiscretisation method. Otherwise we have the Petrov-Galerkin discretisationmethod.The further considerations will be done for a Ritz-Galerkin discretisation scheme.

Inserting the representation of un in the basis of Xn(Γ) into (1.9) we obtain:

n∑

i=1

xi〈Gϕi + λϕi, ϕj〉 = 〈f, ϕj〉 for all j ∈ 1, . . . , n. (1.10)

14

1.4. Boundary Element Method

From the last equation we define the matrices

Gn ∈ Kn×n with (Gn)ij := 〈Gϕi, ϕj〉,Mn ∈ Kn×n with (Mn)ij := 〈ϕi, ϕj〉, and the right-hand side vector

fn ∈ Kn with fi := 〈f, ϕi〉.

The equation (1.10) can be written as a linear system:

(Gn + λMn)x = fn. (1.11)

The matrix Mn is the mass matrix, the vector fn is the right-hand side vector andthe matrix Gn is the stiffness matrix.

Example 1.4 Let Xn ⊂ L2(Γ). Each function ϕi defines a functional in L2(Γ)′

L2(Γ) → K, f 7→∫

Γf(x)ϕi(x)dΓx.

So the entries for the matrices Mn and Gn and for the vector fn can be computed asfollows:

(Mn)ij :=

∫

Γϕi(x)ϕj(x)dΓx, (1.12)

(Gn)ij :=

∫

Γϕi(x)

∫

Γg(x, y)ϕj(y)dΓxdΓy, (1.13)

(fn)i :=

∫

Γf(x)ϕi(x)dΓx. (1.14)

The complexity and convergence of the discretisation method depend on the choice ofthe basis functions. There are some standard choices for basis function and they will beconsidered in the next section.


From Example 1.4 we notice that evaluating the entries for the matrix Gn includes com-puting the surface integrals. Aiming to simplify this computation we can choose ansatzspaces (for the discretisation method) whose basis functions have small supports. Suchbasis functions are called finite elements. Discretisation schemes for integral equations,which use these finite elements could be named finite element methods, but since thisname is associated with the approximation of partial differential equations, the combina-tion of the boundary integral method with the discretisation by finite elements is calledthe boundary element method or abbreviated BEM.

Example 1.2 demonstrates the possibility of transforming the partial differential equa-tion defined in a d-dimensional domain into the integral equation defined on the (d− 1)-dimensional boundary. This transition from d-dimensional domains to (d−1)-dimensional

15

1. Introduction

A

B

C

χ(A,B,C)

(0, 0) (1, 0)

(0, 1)

Figure 1.1.: The affine mapping that defines an open triangle in R3

boundaries is the main advantage of the boundary element methods.

The construction of the boundary elements involves two tasks. The first of them isthe decomposition of Γ := ∂Ω into finitely many pieces (rectangles or triangles) theso-called triangulation, and will be considered in the following subsection. The secondtask is the definition of polynomial functions on each of the pieces. In Subsection 1.4.2we will discuss the possibilities for the ansatz space.

1.4.1. Triangulation of the Surface Γ

Before we define a triangulation let us define a triangle in R3. We start with the definitionof the reference triangle.

Definition 1.5 (Reference triangle) The reference triangle τref is an open subset ofR2 defined as

τref := (x, y) | x ∈ (0, 1), y ∈ (0, 1 − x).

Definition 1.6 (Triangle in R3) Let A,B,C ∈ R3 be three points such that the vectorsB −A and C −A are linearly independent. The image of the affine mapping

χ(A,B,C) : τref → R3, (s, t) 7→ A+ (B −A)s + (C −A)t,

defines a triangle in R3 that we will denote by τ(A,B,C). We say that: “τ is a triangle”if there exist A,B,C ∈ R3 such that the vectors B − A,C − A are linearly independentand τ = χA,B,C(τref ). The points A,B,C are the vertices of the triangle τ .

Figure 1.1 shows a reference triangle τref and a triangle τ in R3. Based on the definitionof triangles in R3 we can proceed to specify a triangulation.

Definition 1.7 (Triangulation) Let Γ be the boundary of a polygonal domain Ω ⊂ R3.A triangulation of Γ is a set of triangles

T := τ ⊂ Γ | τ is a triangle

with the following properties:

16


1. For τ1, τ2 ∈ T holds τ1 = τ2 or τ1 ∩ τ2 = ∅.

2. Let A1, B1, C1, A2, B2, C2 ∈ R3 and τ1 := τ(A1,B1,C1) and τ2 := τ(A2,B2,C2), τ1, τ2 ∈T . If τ1 ∩ τ2 6= ∅ and τ1 6= τ2 holds, then either

a) the triangles τ1 and τ2 have one vertex in common, i.e., #A1, B1, C1, A2, B2, C2 =5, or

b) the triangles τ1 and τ2 have one edge in common, i.e., #A1, B1, C1, A2, B2, C2 =4.

3. Γ =⋃

τ∈T τ .

The set of all vertices of all triangles is denoted by VT . The cardinality of the triangu-lation T is denoted by #T . The cardinality of VT will be similarly denoted by #VT .

Remark 1.8 If the domain Ω is not a polygon we approximate it by a polygonal domainand work further on with this approximation.

The triangles in T are either disjoint or identical, or have exactly one edge or one vertexin common. Figure 1.2 shows allowed connections between triangles in a triangulation,while Figure 1.3 contains examples of forbidden connections.

Figure 1.2.: Allowed connections between triangles from T

hanging nodepartial overlapping

Figure 1.3.: Not allowed connections between triangles from T

17

1. Introduction

1.4.2. Ansatz Space

The next task is to define the “ansatz space”, i.e., the space of the basis functions.

Definition 1.9 (Ansatz Space) Let k ∈ N0 be fixed. The space of polynomials ofdegree k over R2 is defined as:

Pk(R2) := spanx 7→ xα := xα11 · xα2

2 | α ∈ N20, α1 + α2 ≤ k

The space of polynomials of degree k over the triangle τ is defined as:

Pk(τ) := p : τ → R | ∃q ∈ Pk(R2) with p = q χ−1τ

The ansatz space over the boundary Γ with triangulation T , parameters k, r ∈ N0, andspace dimension n ∈ N is defined as follows:

Xk,−1n := ϕ : Γ → R | ϕ|τ ∈ Pk(τ) ∀τ ∈ T ,Xk,r

n := ϕ : Γ → R | ϕ ∈ Cr(Γ), ϕ|τ ∈ Pk(τ) ∀τ ∈ T .

Example 1.10 (Piecewise Constant Ansatz) With X0,−1n we denote the space of

piecewise constant functions. In this case the dimension of the space is equal to thenumber of triangles in T . Each basis function is defined on the corresponding triangle,

ϕi(x) :=

1 if x ∈ τi0 if x ∈ Γ \ τi.

We conclude that supp(ϕi) = τi.

Figure 1.4.: Constant basis function

Example 1.11 (Piecewise Linear Ansatz) With X1,0n we will denote the space of

piecewise linear functions. In this case the dimension of the space is equal to the numberof vertices in T . Each basis function corresponds to exactly one vertex xj ∈ VT

ϕj(x) :=

1 if x = xj

0 if x ∈ VT \ xj.Here we have that supp(ϕj) =

⋃τ∋xj

τ , i.e. the support is the union of all trianglesfrom T that have the point xj as a vertex.

18


Figure 1.5.: Linear basis function

Definition 1.12 (General indexing) The triangulation (grid) T and set of basis func-tions are indexed by the index set I if there exist mappings PI and QI defined as

PI : I → P (Γ) s.t. PI(i) = suppϕi

QI : I → Xk,rn s.t. QI(i) = ϕi

Example 1.13 (Indexing) Example 1.10 and Example 1.11 show that the dimensionof the standard ansatz space (p.w. constant or p.w. linear) is determined by the trian-gulation. The dimension is either equal to the number of triangles #T or to the numberof vertices #VT . According to Definition 1.12 we can introduce the index sets I and Jand then define mappings PI and PJ

PI : I → P (Γ) s.t. PI(i) = τi

PJ : J → P (Γ) s.t. PJ (j) =⋃

xj∈τ

τ,

and

QI : I → X0,−1n s.t. QI(i) = ϕi,

QJ : I → X1,0n s.t. QJ (j) = ϕj .

Now we can write T = τi | i ∈ I and VT = xj | j ∈ J .

Remark 1.14 (Other boundary elements) We have chosen a triangulation in orderto decompose the boundary Γ. Instead of triangles we could as well choose rectangles orparallelograms. In this case we would use the square as a reference element.

Remark 1.15 (Computational costs for Gn) The basis functions have local sup-ports but the kernel function is non-local. This leads to a densely (or fully) populatedmatrix Gn. Even if it is possible to compute the integrals exactly in O(1) there will stillbe O(n2) entries to compute. If we assume that the computation of integrals will includea quadrature rule (e.g. Gaußian quadrature ) the costs will be of order O(n2log(n)α) forα > 0. The parameter α depends on the kernel function, discretisation and triangulation.

19

1. Introduction

1.5. Degenerate Kernel Function

The computation of the matrix entries (Gn)ij is not an easy task, due to the singularbehaviour of the kernel function. For the Laplacian kernel the singularities occur onlyon the diagonal. In the parts of the domain where the kernel does not have singularities(cf. Figure 1.5) we can approximate the kernel function by a degenerate one. The idea

τ

σ

σ

τ1

1

Figure 1.6.: If the function has the singularities only on the diagonal then in the shad-owed parts it is smooth.

of a degenerate kernel expansion is to separate the variables x and y as

g(x, y) :=M∑

ν=1

h1ν(x)h

2ν(y). (1.15)

This expansion shows that integration with respect to the x-variable is separated fromthe one with respect to the y-variable. If we insert (1.15) into (1.13) we obtain:

(Gn)ij :=

∫

Γ

∫

Γϕi(x)g(x, y)ϕj(y)dΓxdΓy

=

M∑

ν=1

∫

Γ

∫

Γϕi(x)h

1ν(x)h2

ν(y)ϕj(y)dΓxdΓy

=M∑

ν=1

∫

Γϕi(x)h

1ν(x)dΓx

∫

Γϕj(y)h

2ν(y)dΓy

= (ABT )ij,

where

Aiν =

∫

Γh1

ν(x)ϕi(x)dΓx,

Bjν =

∫

Γh2

ν(x)ϕj(y)dΓy.

Let t, s ⊆ I. If τ =⋃

i∈t suppϕi and σ =⋃

j∈s suppϕj contain the supports of the basisfunctions and g|τ×σ allows a degenerate kernel approximation, then the matrix block

20

1.5. Degenerate Kernel Function

(Gn)|t×s can be approximated by

(Gn)|t×s ≈ ABT , A ∈ R#t×k, B ∈ R#s×k

A

BT

Example 1.16 We consider the example for a degenerate kernel expansion in the caseof the kernel function g(x, y) = log |x − y|. For simplicity we take Γ = [0, 1]. Letτ := [a, b], σ := [c, d], τ × σ ⊆ [0, 1] × [0, 1] be a subdomain with the property b < c suchthat the intervals are disjoint: τ ∩ σ = ∅. Then the kernel function is nonsingular inτ × σ. We apply Taylor expansion in one variable to the kernel function and obtain theapproximation g,

g(x, y) :=

k−1∑

ν=0

∂νxg(x0, y)(x− x0)

ν .

Then the following lemma holds

Lemma 1.17 For each k ∈ N the function g(x, y) (truncated Taylor series) approxi-mates the kernel g(x, y) = log |x− y| with an error

|g(x, y) − g(x, y)| ≤ |x0 − a||c− b|

(1 +

|c− b||x0 − a|

)−k.

Proof: Lemma 1.3 in [14].If b → c then the estimate for the remainder tends to infinity and the approximationcan be arbitrarily bad. This can be improved by replacing the condition b < c, i.e thedisjointness of the intervals by the stronger admissibility condition

0 1

dist

diamστ diam(τ) ≤ dist(τ, σ), (1.16)

in which case the approximation error can be estimated by

|g(x, y) − g(x, y)| ≤ 3

2(1 +

2

1)−k ≤ 3

23−k.

This means we get a uniform bound for the approximation error independently of theintervals as long as the admissibility condition is fulfilled. The error decays exponentiallywith respect to the order k.

As the previous example indicates the approximation error depends on diam(τ)dist(τ,σ) . There-

fore we have to find parts of the domain where this ration is bounded. This task isdivided into two parts that will be the main topic of the next chapter. The partitioningof the domain (Γ) is the first task that involves also storing these parts. For this purposewe use a special tree structure that we call cluster tree. When the partition is made

21

1. Introduction

we determine the parts of the integration domain (Γ × Γ) where the degenerate kernelexpansion can be applied. This information we store in the tree structure called blockcluster tree, that, as the name says, contains information about the (block) structureof the H-matrix we want to construct, as an approximation of the densely populatedmatrix Gn. The definition and construction of the cluster tree and the block clustertree are presented in the next chapter. The construction of the H-matrix, i.e. variousmethods for assembling the low-rank blocks, will be described in Chapter 3.

22

2. Clustering

This chapter is devoted to the cluster tree, an important component in constructing H-matrices. Since the update of the H-matrices also depends on the update of the clustertree, we will carefully discuss this item. In the first section we define the cluster treestructure starting with the general definition of a graph. Then we shall present standardways of clustering, pointing out their weaknesses which are the motivation for definingother kinds of trees, the “box trees”. Using box trees we will define in Section 2.3 a newway of clustering, named box tree clustering. In the last section we shall introduce theblock cluster tree that defines the structure of an H-matrix.

2.1. Cluster Tree

2.1.1. Tree

Beside the definition of a tree, this subsection will also contain definitions of some itemsthat describe the tree structure in more detail.

Definition 2.1 (Tree) A tuple T = (r, V,E, , L) is a (labeled) tree with label set L androot r if the following holds:

1. V is a non-empty set of vertices and E ⊆ V × V is a set of edges.

2. r ∈ V , and for all v ∈ V there exists a unique path from r to v , i.e, a tuple ofvertices (vi)

li=0 such that (vi−1, vi) ∈ E holds for all i ∈ 1, . . . , l and vi 6= vj if

i, j ∈ 1, . . . , l, i 6= j, and v0 = r, vl = v.

3. ˆ: V → L is a mapping.

r is called the root of T and denoted by root(T ), the set of vertices and edges is denotedby V(T) and E(T) respectively, and the set of labels by L. For each v ∈ V , v ∈ L iscalled the label of v.A tree T is finite if the set of vertices V (T ) is finite. The next definition introduces thestandard notations for trees.

Definition 2.2 (Level, Depth, Sons, Leaf) For each vertex v ∈ V (T ), there is aunique path (vi)

li=0 with v0 = root(T ) and vl = v. This sequence is called the sequence

of ancestors of v.The number l ∈ N0 is the level of v and it is denoted by level(v).The maximal level is called the depth of T and it is denoted by

depth(T ) := maxlevel(v) : v ∈ V (T )

23

2. Clustering

If l > 0, the node vl−1 is called the father of v(= vl) and it is denoted by father(v). Forall v ∈ V (T ), we set

sons(v) := v′ ∈ V (T ) \ root(T ) : father(v′) = v.A vertex v ∈ V (T ) is a leaf if sons(v) = ∅ and we define the set

L(T ) := v ∈ V (T ) | v is a leaf.

Remark 2.3 For v ∈ V (T ) we have level(v) = 0 if and only if v = root(T ).

2.1.2. Cluster Tree

For a finite index set I, we define a special kind of tree called cluster tree.

Definition 2.4 (Cluster Tree) Let I be a finite, nonempty set and let TI = (V,E) bea tree with vertex set V and edge set E.The label set is L := P(I) and the corresponding label-mappingˆ: V → L will be definedas:

ˆ: V → P(I), i.e. v ⊆ I for all v ∈ V. (2.1)

The tree TI is called a cluster tree if the following conditions hold:

1. The root of TI is labelled by I, i.e. root(T ) = I.

2. For all v ∈ V it holdssons(v) = ∅ or v =

⋃

s∈sons(v)

s.

The vertices v ∈ V are called clusters.

Remark 2.5 The set of vertices in the cluster tree TI will be denoted by V (TI) or VTI.

If it is necessary to stress that a vertex v belongs to TI we shall write vTIor vV (TI).

If the set of vertices is not specially emphasised, we will identify V (TI) and TI , i.e. wewill write v ∈ TI instead of v ∈ V (TI).

Remark 2.6 The labeling function v 7→ v is in general neither injective nor surjective.

Definition 2.4 of the cluster tree is a standard one, and can be found in this or a similarform in various articles about H-matrices ([14],[13],[31]), where it is sometimes calledH-tree. Also other elements of the tree defined in Definition 2.2 can be applied to thecluster tree.

Definition 2.7 (Level, Cardinality of a Cluster, Leafsize) Let TI be a cluster tree.

• The levels of the tree TI are defined as

TI(0) := I, TI

(l) := v ∈ TI | father(v) ∈ TI(l−1) for l ∈ N

and we write level(v) := l if v ∈ TI(l). The leaves on level l = 0, . . . , depth(T ) are

L(TI , l) := L(TI) ∩ TI (l).

24

2.2. Clustering

• The cardinality of the cluster v, denoted by #v, is the cardinality of v, i.e.#v := #v.

• nmin ∈ N0 is a leafsize for the cluster tree TI if #v ≤ nmin holds for all v ∈ L(TI).

Example 2.8 (Cluster Tree)

On the right-hand side is an example of a cluster treebased on the index set I = 0, 1, 2, 3, 4, 5, 6, 7. Thedepth of the cluster tree is depth(TI) = 3, the leafsizeis nmin = 2.

3,4,2,5,6,7,1,0

3,4,2 5,6,7,1,0

3,4 2 5,6,7 1,0

5,6 7

2.2. Clustering

Having defined the items of the cluster tree, we can go one step further and constructa cluster tree. Before we do this we should give answers on two important questions:what is the index set I? What do we cluster?We have defined in Definition 1.13 the index set I. Each element of the index set isbijectively mapped to one basis function (corresponding to the discretisation scheme).Therefore, the cardinality of the index set is equal to the dimension of the discretisationspace.Also, from Definition 1.13 we know that the index set is mapped to the set of geometricalobjects (supports of basis functions). We cluster these geometrical objects obtaining inthis way (an overlapping) partitioning of Γ. This partitioning stored in the cluster treeprovide the candidates for checking in which parts of the domain low rank approximationis possible.There are different methods for clustering, two of them were developed first and arefrequently used:

• cardinality-balanced clustering and

• geometric clustering.

Independently of the clustering method, the root of the cluster tree is, by definition, thefull index set I. Also, all clustering methods have the same basic algorithm: find a wayto partition the index set into two disjoint subsets, and use them to create son clusters.Repeat this procedure recursively for the son clusters.For each i ∈ I there is a corresponding basis function ϕi and its support, which wedenote by Ωi := suppϕi. Since it might be too complicated to work directly with thesupports, we choose a point xi ∈ Ωi for each i ∈ I and further work with them insteadof the supports. Considering the discretisation with piecewise constant functions wenotice that clustering of the index set is also a partitioning of the grid since there is aone-one correspondence between the index set I and the supports of the basis functionsΩi (Example 1.13).

25

2. Clustering

2.2.1. Geometric Clustering

This clustering method is based on partitioning the smallest d-dimensional box thatcontains all chosen points xi. We define a box BI := [a1, b1] × [a2, b2] × . . . × [ad, bd],such that xi ∈ BI holds for all i ∈ I. We observe that this box does not necessarilycontain all supports Ωi. Then we split the box BI in the direction of the maximalextent, i.e. in the direction l := argmaxj=1,...,d |bj − aj |. In this way we split thebox and the index set at the same time, obtaining a disjoint partition of the formI = I1 ∪ I2, such that xi ∈ [a1, b1]× . . . [al,

al+bl

2 ) × . . .× [ad, bd] holds for all i ∈ I1 and

xi ∈ [a1, b1]× . . . [al+bl

2 , bl]× . . .× [ad, bd] holds for all i ∈ I2. If we recursively apply thesame procedure we obtain a cluster tree. Figure 2.1 illustrates the previously describedmethod. We can summarise the steps of the geometric clustering algorithm:

Figure 2.1.: Geometric clustering method: The red boxes are used for clustering, andthey contain only the points we have chosen to cluster.

Algorithm 2.9 (Geometrical clustering algorithm)

INPUT Cluster c with index set I, set of points xi | i ∈ I, the leafsize nmin.

IF #I > nmin THEN

1. Determine the smallest d-dimensional axial parallel box BI that contains allpoints xi, i ∈ I.

2. Determine the maximal extent of BI i.e. find l := argmaxj=1,...,d |bj − aj|.3. Split BI in the direction l into two boxes BI1 and BI2.

26

2.2. Clustering

4. Determine the clusters c1, c2 such that sons(c) = c1, c2 with c1 = I1 andxi ∈ BI1 for all i ∈ I1 and c2 = I2 and xi ∈ BI2 for all i ∈ I2.

5. If #I1 > nmin repeat the algorithm for the cluster c1.

6. If #I2 > nmin repeat the algorithm for the cluster c2.

OUTPUT Cluster tree TI .

Example 2.10 We consider a very simple discretisation space which contains only eightpiecewise constant basis functions, which are consecutively numbered by I = 0, . . . , 7.Each triangle is the support of one basis function. We cluster the index set I geometri-cally (Algorithm 2.9) and obtain the cluster tree TI . Figure 2.2 shows the grid and thecluster tree.

3,4,2

23,4 1,0

3 4

3,4,2,5,6,7,1,0

5,6,7,1,0

5,6,7

5 06,7 1

6 7

12

3

45

6

0

7

TTI

Figure 2.2.: The grid T with cluster tree TI obtained by geometrical clustering.

This clustering method works well in many cases but has also disadvantages, e.g., thereare cases in which this method does not provide the desired candidates for checking theadmissibility condition.

Example 2.11 Here is an example in which geometrical clustering leads to full matrixblocks of the size O(n2). We consider the grid T from Figure 2.3 that is refined inthe direction of the edge γ, obtaining in this way O(n) elements on each side. Thegeometrical clustering yields the cluster tree presented in Figure 2.4. We prove that thematrix block G|t∗×s∗ is densely populated. The clusters t∗ and s∗ (Figure 2.4) contain nelements each. The pair (t∗, s∗) is obviously not admissible, since the distance betweenthem is equal to zero. Therefore we consider all other pairs of clusters that contain thedescendants(sons, sons of sons,...) of clusters t∗, s∗. For all descendants s′ of the clusters∗ there holds that the block clusters (t∗, s′) are also not admissible for η = 1.0. On theother hand the descendants t′ of the cluster t∗ will not be paired with the descendantss′ of the cluster s∗, because level t′ > level s′. This is represented in Figure 2.4. Alldescendants of the cluster t∗ are on the levels from 0 to log n, while the descendantsof the cluster s∗ are on the levels from log n to 2 log n. Therefore G|t∗×s∗ is a denselypopulated matrix block with size O(n) ×O(n).This example shows that geometrical clustering may lead to a block matrix structure thatcontains large densely populated blocks.

27

2. Clustering

h ∼ n−1

H ∼ n−12

O(n) elements

γ

Figure 2.3.: The grid T is obtained by refinement of the L-shaped domain. The grid isrefined towards the the edge γ.

2.2.2. Cardinality Balanced Clustering

As the name says this cardinality clustering method provides a balanced cluster tree.We start, as in the previous method, with a d-dimensional box that contains all chosenpoints xi. Then we split this box in two boxes such that each box contains the samenumber of points. This process is repeated as long as the size of the correspondingclusters is greater than a given leafsize. The precise algorithm is presented below.

Algorithm 2.12 (Cardinality balanced clustering)

INPUT Cluster c with index set I, set of points xi | i ∈ I and nmin.

IF #I > nmin THEN

1. Compute the smallest d-dimensional box [a1, b1]× [a2, b2]× . . .× [ad, bd] whichcontains the points xi, for all i ∈ I.

2. Find the direction j = argmaxj=1,...,d |bj − aj|.3. Sort all indices I = i1, i2, . . . , i#I in a such way that for 1 ≤ k ≤ l ≤ #I

holds xik,j ≤ xil,j.

4. Form two clusters t, s labeled by the index sets I1 = i1, i2, . . . , i[#I/2] andI2 = i[#I/2]+1, . . . , i#I such that sons(c) = t, s.

28

2.2. Clustering

t∗

s∗

depth log n

depth 2 log n

Figure 2.4.: Geometrical clustering of the grid T gives the cluster tree whose parts arepresented in this Figure.

5. If #I1 > nmin then repeat the algorithm for the cluster t.

6. If #I2 > nmin then repeat the algorithm for the cluster s.

OUTPUT The cluster tree TI .

The advantage of this clustering method is that the depth of the tree is minimal. On theother hand, the geometry is not taken into account since the boxes are split irregularly.The following Figure 2.5 illustrates this method.

Example 2.13 The setup we use for this example is identical as in Example 2.10. Weconsider the same discretisation space and same grid which are indexed by the same indexset I = 0, . . . , 7. This time we cluster the index set in the cardinality balanced way(Algorithm 2.12) and we obtain the cluster tree TI . Figure 2.6 illustrates this clusteringmethod.

29

2. Clustering

Figure 2.5.: Cardinality balanced clustering: the box is split irregularly in the directionof maximal extent, so that the son clusters have the same cardinality.

12

3

45

6

7

0

65

3,2,4,1,5,6,7,0

3,2,4,1 5,6,7,0

3,2 4,1

3 2

7,05,6

7 04 1

T TI

Figure 2.6.: The grid T with cluster tree TI obtained by cardinality balanced clustering.

2.3. Box Tree

The first step in the two previously introduced clustering methods, geometric and car-dinality balanced clustering, is to determine a (d-dimensional) box that contains thechosen points. Those boxes depend on the triangulation of the boundary Γ and arerecomputed for each cluster. Our idea is to fix those boxes in the following way: atthe beginning we determine a d-dimensional box that contains the whole boundary Γ,cluster this box, and for a given set of points we determine, on each level, where thepoints belong to.At the beginning we will introduce a structure similar to the cluster tree, the box tree.This structure will be similar to the cluster tree and it will contain a partitioning of thebox that contains the boundary Γ. This box we name a boundary box.

Definition 2.14 (Boundary box) Let Γ be the boundary of the domain Ω in Rd. Thesmallest d-dimensional axial parallel box that contains Γ is called a boundary box and it

30

2.3. Box Tree

is denoted as BΓ.

Figure 2.3 shows the boundary box for Γ. The box tree has a tree structure and it is

Ω

Γ

BΓ

Figure 2.7.: Boundary box for the boundary Γ.

based on the boundary box BΓ.

Definition 2.15 (Box Tree) Let Γ be the boundary of the domain Ω in Rd, and BΓ

the boundary box for Γ. We denote by Bd := ⊗di=1[ai, bi] | ai ≤ bi the set of all d-

dimensional boxes. Let TBΓ= (V,E) be a tree with vertex set V and edge set E and the

set of labels L := b ∈ Bd | b ⊆ BΓ. A tree TBΓis a box tree if the following conditions

hold:

1. root(TBΓ) = BΓ and v is a subset of BΓ for all v ∈ V .

2. For all v ∈ V it holds

sons(v) = ∅ or v =⋃

s∈sons(v)s.

Remark 2.16 The vertices of a box tree are called box clusters and they are labeledby d-dimensional boxes. On each level l holds BΓ =

⋃v∈T

(l)BΓ

v. The depth of the box tree

is infinite, i.e., depth(BΓ) = ∞.

In practice, the box tree will never be constructed. Only finite parts are used as a toolfor constructing the cluster tree.

Let I be a finite index set for the basis functions of a discretisation scheme whosesupports belong to Γ. Further, let the box tree TBΓ

be defined. Our aim is to determinethe cluster tree TI based on the given box tree. Using the box trees in constructing thecluster trees leads to a new clustering technique which we call box tree clustering. Asimilar clustering method has been introduced in [31] and [33] and called geometricallyregular clustering.

2.3.1. Box Tree Clustering

This method is, as the name says, based on the box tree. We assume to have not just anindex set I that corresponds to the basis functions, but also the (virtual) box tree TBΓ

.

31

2. Clustering

When needed we compute the finite part. The algorithm for constructing the clustertree is simple. To split the index set I we use the same technique as in the previousclustering methods—we cluster the chosen points xi, i ∈ I, (xi ∈ Ωi). This time we needto figure out in which box clusters on each level the points xi belong to. We present thebox tree clustering algorithm in the following steps.

Algorithm 2.17 (Box tree clustering method)

INPUT The cluster t, t ⊂ I, the virtual box tree TBΓ, chosen points xi, i ∈ I and

the leafsize nmin, box Bt ∈ TBΓ.

DO 1. Split the index set t into two disjoint subsets t1 and t2 such that xi ∈ Bt1 forall i ∈ t1 and xi ∈ Bt2 for all i ∈ t2, where sons(Bt) = Bt1 , Bt2.

2. Form two clusters t1, t2 labeled by t1 and t2, such that sons(t) = t1, t2.3. If #t1 > nmin repeat the first and second step for the cluster t1.

4. If #t2 > nmin repeat the first and second step for the cluster t2.

STOP if the cardinality of the clusters is below the desired leafsize nmin.

OUTPUT The cluster tree TI .

Lemma 2.18 Let Γ be the boundary of the domain Ω. Let ϕi, i ∈ I be the set of basisfunctions, such that suppϕi ⊂ Γ holds for all i ∈ I. If the box tree TBΓ

is based on theboundary Γ then the cluster tree TI constructed by the box tree clustering is unique.

Proof: By Definition 2.14 we have Γ ⊂ BΓ. Therefore there holds suppϕi ⊂ BΓ for alli ∈ I and xi ∈ BΓ for all i ∈ I. The last observation gives us root(TI) = I.Further, let l be any level in the box tree TBΓ

and i ∈ I be an arbitrarily chosen index.The point xi belongs to exactly one box cluster on the level l since BΓ =

⋃v∈T

(l)BΓ

v. The

last statement gives us that on the fixed level l one index will belong to exactly onecluster. Therefore the cluster tree TI based on I and constructed by box tree clusteringis unique.

Definition 2.19 (Box cluster) Let TI be a cluster tree obtained by box tree clusteringand let TBΓ

be the corresponding box tree. A box cluster v is associated to the clustert ∈ TI if there holds

xi ∈ v for all i ∈ t.

Instead of v we shall write Ct for box clusters.

Figures 2.8 and 2.9 illustrate the box tree and the box tree clustering method in thecase that the discretisation space is spanned by piecewise constant function.

Example 2.20 We consider again the simple setup from Example 2.13 and 2.10. Thistime we cluster the index set I by the box tree clustering method (Algorithm 2.17). Figure2.10 illustrates this clustering method.

32

2.4. Admissibility Condition

Γ

Figure 2.8.: On the left-hand side is an example for a box tree; on the right-hand sidewe have a triangulation which is clustered by box tree clustering (Algorithm2.17)


The cluster tree stores the information on the partitioning of the grid and it provides can-didates for checking the admissibility condition. The admissibility condition serves tocheck in which parts of the domain it is possible to approximate the kernel function. Atthe beginning, we define the admissibility condition in the general case for two arbitrarysubsets of Rd, and then we specify it for the cluster tree.

Definition 2.21 (Admissibility) Let τ, σ be two subsets of Rd. They are min(max)-η-admissible if the following inequality holds:

mindiam(τ),diam(σ) ≤ η dist(τ, σ)

(maxdiam(τ),diam(σ) ≤ η dist(τ, σ))

To measure the subsets of Rd and the distance between them we use the Euclidean norm(or supremum norm). Let τ, σ ⊂ Rd. Then

diam(τ) := max‖x′ − x′′‖ : x′, x′′ ∈ τ,dist(τ, σ) := min‖x− y‖ : x ∈ τ, y ∈ σ.

In order to define the admissibility condition for pairs of clusters, we introduce thegeometrical objects associated to the clusters—the cluster supports.

33

2. Clustering

10

2 3

4 5

0

10

1716 18

1514

1312119

876

6

910

1112

2 3

7 8

1314

1815

17

1

4 5 1617

109 11

12 2

7 8

313

1415

18

01

4 5 6

16

14,15,18

10,11,16,17,12,2,3,7,8,13,14,15,180,1,4,5,6,9,

2,3,7,8,13

16,17,129,10,110,1,4,5,6 2,3,7,8 13,14,15,18

0,1,4,5,6,9,1011,16,17,12

Figure 2.9.: Discretisation is performed by piecewise constant functions, i.e. each tri-angle is the support of one basis function; the cluster tree is based on theindex set I, obtained by the box tree clustering method.

Definition 2.22 (Cluster support) Let TI be a given cluster tree based on the indexset I. To each cluster t ∈ TI we associate the set

Ωt :=⋃

i∈t

Ωi, Ωt ⊂ Rd, (2.2)

and we name it cluster support.

Since it would be inconvenient to measure the cluster using the set of the type Ωt, wecompute the smallest d-dimensional box Bt that contains Ωt, the so-called boundingbox.

Definition 2.23 (Bounding box) The bounding box Bt for a cluster t ∈ TI is thesmallest d-dimensional axial parallel box that contains the cluster support Ωt, i.e. Ωt ⊂Bt.

Figure 2.11 shows the cluster support and bounding box for an arbitrary cluster.

Definition 2.24 (Admissibility condition) Let η > 0, t ∈ TI and s ∈ TJ two clus-ters and Ωt,Ωs the corresponding cluster supports. The cluster pair (t, s) is η-admissibleif there holds

mindiam(Ωt),diam(Ωs) ≤ η dist(Ωt,Ωs). (2.3)

34


12

3

45

7

6

0

3,4,2,6,5,7,1,0

3,4,2 6,5,7,1,0

23,4 6,5,7 1,0

3 4 6,5 7 1,0

1,0

6 5

1 0

T

TI

Figure 2.10.: The grid T with cluster tree TI obtained by box tree clustering.

points used for clustering

Ωt

Bt

Figure 2.11.: Cluster support and bounding box for cluster

Remark 2.25 The admissibility condition from Definition 2.24 will be referred to asmin-admissibility. There exists also the max-admissibility condition defined as

maxdiam(Ωt),diam(Ωs) ≤ η dist(Ωt,Ωs). (2.4)

For practical application of the admissibility condition we use the corresponding bound-ing boxes. Let t, s ∈ TI be two cluster and Bt, Bs associated bounding boxes. Then themin-(max-)admissibility condition reads:

mindiam(Bt),diam(Bs) ≤ η dist(Bt, Bs) (2.5)(maxdiam(Bt),diam(Bs) ≤ η dist(Bt, Bs)

).

There also holds the following lemma:

Lemma 2.26 If the admissibility condition (2.5) holds then the admissibility condition(2.3) holds as well.

35

2. Clustering

Proof: Since Ωt ⊂ Bt there holds diam(Ωt) ≤ diam(Bt). On the other hand dist(Bt, Bs) ≤dist(Ωt,Ωs). Then there holds

mindiam(Ωt),diam(Ωs) ≤ mindiam(Bt),diam(Bs) ≤ η dist(Bt, Bs) ≤ η dist(Ωt,Ωs).

The last inequality gives the admissibility condition (2.3).The clustering methods presented in the previous sections do not cluster whole triangles

Bt

Bs

Ωt

Ωs

Figure 2.12.: Diameter and distance of clusters t and s

but only the chosen points. If the cluster tree TI was constructed by geometrical clus-tering (Algorithm 2.9) or by cardinality balanced clustering (Algorithm 2.12), the boxesused for clustering are chosen to contain only the chosen points. Figure 2.1 illustratesthe last observation. Since those boxes do not necessarily contain the cluster supports,they are also not suitable for checking the admissibility condition. Therefore, we needto compute the bounding box for each cluster in the cluster tree. All those boundingboxes we store in a special tree called bounding box tree.

Definition 2.27 (Bounding box tree) Let TI be the cluster tree based on the indexset I and BI := [a1, b1]× . . .× [ad, bd] the smallest d-dimensional box which contains thepoints xi | i ∈ I used for clustering. The bounding box tree TBI

, based on the clustertree TI, is defined as

1. TBI:= (V,E) is a labeled tree with the set of vertices V (TBI

) := V (TI) and theset of edges E(TBI

) := E(TI).

2. The set of labels L := b ∈ Bd | b ⊆ BI and the labeling function [·] is defined

[·] : V (TBI) → L [v] := bounding box forΩv.

3. [root(TBI)] = BI .

Remark 2.28 The bounding box tree TBIcan be regarded as a cluster tree whose clusters

are labeled by bounding boxes. It will be used for checking the admissibility condition.

Algorithm 2.29 (Bounding box construction)

36


INPUT The cluster tree TI, the triangulation T .

DO ∀t ∈ TI, compute the d-dimensional bounding box Bt, s.t. Ωt ⊂ Bt, and let [t] := Bt.

OUTPUT The bounding box tree TBIfor TI .

If the cluster tree TI is constructed by the box tree method, then to each cluster t ∈ TIone box cluster Ct (Definition 2.19) is associated. The box Ct is also an axial parallelbox, but in the general case

• it does not necessarily contain the cluster support, i.e., Ωt * Ct, and

• it may be smaller than Bt, Ct ⊆ Bt.

Figure 2.13 shows the case when the box cluster is smaller than the bounding box.Therefore, the boxes Ct are also not suitable for checking the admissibility condition.

coincides with the box used for clustering.the bounding box for the root

Ct Bt

Figure 2.13.: Example for Ct * Bt

There are two ways to overcome this shortcoming. First we could ignore the existence ofthe boxes Ct and recompute the bounding box tree using the Algorithm 2.29. Second,we could extend Ct at least to the size of the bounding boxes. We choose the secondsolution and define extended box clusters.

Definition 2.30 (Extended box cluster) Let TI be the cluster tree obtained by boxtree clustering. For each cluster t ∈ TI and corresponding box cluster Ct we define theextended box cluster Cex

t , depending on the parameter ρ ≥ 1 as

Cext := Ct + ρ[−ht, ht]

d, where

ht := maxi∈t

diam Ωi.

37

2. Clustering

Remark 2.31 The parameter ρ is chosen to enlarge the box cluster Ct such that laterinsertion of the indices in the cluster does not require an update of the extended boxclusters.

As for the bounding boxes, we define the admissibility condition for the extended boxclusters. Let t, s ∈ TI be two clusters and Cex

t , Cexs two extended box clusters. The

clusters t, s are min-(max)-admissible if there holds:

mindiam(Cext ),diam(Cex

s ) ≤ η dist(Cext , Cex

s ) (2.6)(maxdiam(Cex

t ),diam(Cexs ) ≤ η dist(Cex

t , Cexs )

).

Motivated by Definition 2.27 we define the extended box cluster tree.

Definition 2.32 (Extended box cluster tree) Let TI be the cluster tree based on theindex set I and on the box tree TBΓ

. The extended box cluster tree T exBΓ

based on TI andTBΓ

is defined as

1. T exBΓ

:= (V,E) is a labeled tree with the set of vertices V (T exBΓ

) := V (TI) and theset of edges E(T ex

BΓ) := E(TI).

2. The set of labels L := b ∈ Bd | b ⊆ BΓ + ρ[−hmax, hmax]d, where hmax :=maxi∈I diam Ωi. The labeling function [·] is defined as:

[·] : V (T exBΓ

) → L, [v] := Cexv for all v ∈ V (T ex

BΓ).

3. [root(T exBΓ

)] = BΓ + ρ[−hmax, hmax]d.

The construction of the extended box cluster tree is similar to the construction of thebounding box tree.

Algorithm 2.33 (Extended box cluster tree construction)

INPUT The cluster tree TI, the box tree TBΓ.

DO

1. ∀t ∈ TI compute ht = maxi∈t diam Ωi.

2. ∀t ∈ TI compute the d-dimensional box Cext = Ct + ρ[−ht, ht]

d.

OUTPUT The extended box cluster tree T exBΓ

for TI .

In order to compute the diameter of the extended box cluster and the distance betweenthem we make the following assumption. We assume that the boundary box BΓ is thed-dimensional cube [0, hmax)d. Then, in the box tree TBΓ

, and the corresponding clustertree TI on each level l the box clusters are translated versions of the box [0, 1

2lhmax)d.The following lemma contains the estimates of the diameter and distance.

38


Lemma 2.34 For any two clusters t, s ∈ TI from level l there holds

Ωt ⊂ Cext

diam(Cext ) =

√d(2−lhmax + 2ρht)

dist(Cext , Cex

s ) ≥ dist(Ct, Cs) − ρ√d(ht + hs).

Proof: The first part Ωt ⊂ Cext is trivial. To prove the second statement we recall that

all box clusters on level l are translated versions of the box [0, 12lhmax). In this case the

computation of the diameter of Cext is equivalent to the computation of the diameter of

the box [−ρht,12l + ρht). Thus, we have

diam(Cext ) =

√√√√d∑

i=1

(2−lhmax + 2ρht)2 =√d(2−lhmax + 2ρht).

If the box cluster Ct is given as [a1, b1]× [a2, b2]× . . .× [ad, bd] and the box cluster Cs as[c1, d1] × [c2, d2] × . . .× [cd, dd] then their distance can be computed by

dist(Ct, Cs) =( d∑

i=1

dist([ai, bi], [ci, di])2) 1

2.

Since the extended box clusters are defined as Cext = [a1 − ρht, b1 + ρht] × . . . × [ad −

ρht, bd + ρht] and Cexs = [c1 − ρhs, d1 + ρhs] × . . .× [cd − ρhs, dd + ρhs] we conclude

1. Cext and Cex

s are closed subsets ⇒ there exist x∗ ∈ Cext and y∗ ∈ Cex

s such that|x∗ − y∗| = dist(Cex

t , Cexs ).

2. dist(Cext , Ct) ≤

√dρht ⇒ there exist x ∈ Ct such that |x−x∗| ≤

√dρht. Similarly,

dist(Cexs , Cs) ≤

√dρhs ⇒ there exist y ∈ Cs such that |y − y∗| ≤

√dρhs. From 1

and 2 we have

dist(Ct, Cs) = minx′∈Ct

miny′∈Cs

|x′ − y′|

≤ |x− y|≤ |x− x∗| + |x∗ − y∗| + |y∗ − y|≤ dist(Cex

t , Cexs ) +

√dρ(ht + hs).

Lemma 2.35 If the admissibility condition (2.6) is fulfilled then the admissibility con-dition (2.5) also holds.

Proof: SinceBt ⊂ Cext , we have diam(Bt) ≤ diam(Cex

t ) and dist(Cext , Cex

s ) ≤ dist(Bt, Bs).Therefore, the following inequality holds

mindiam(Bt),diam(Bs) ≤ mindiam(Cext , Cex

s ) ≤ η dist(Cext , Cex

s ) ≤ η dist(Bt, Bs).

39

2. Clustering

Ωt

Bt

Ct

Cext

Figure 2.14.: The blue box is Ct used for clustering; it does not contain the whole Ωt.The red box is Cex

t ; it contains Ωt as well as Bt.

With the last equation and due to Lemma 2.26, we have that the admissibility condition(2.3) is also fulfilled.Figure 2.14 shows an arbitrary cluster t and its boxes Bt, Ct and Cex

t .

Remark 2.36 The bounding box tree and the extended box cluster tree will be called “ge-ometrical” trees, referring to the labeling sets that are taken to be either the correspondingbounding boxes or the extended box clusters.

Depending on the clustering algorithms we have defined two different “geometrical” treeswhich will be used for checking the admissibility.

Definition 2.37 (Admissibility tree) Let TI be the cluster tree. The admissibilitytree ATI

is

• the bounding box tree, i.e., ATI:= TBI

if TI is constructed by geometrical clustering(Algorithm 2.9) or cardinality balanced clustering (Algorithm 2.12).

• the extended box cluster tree, i.e., ATI:= T ex

BΓif TI is constructed by box tree

clustering (Algorithm 2.17).

The vertices of the admissibility tree we name admissibility boxes, and to each vertext ∈ TI we associate the admissibility box At where At := Bt or At := Cex

t respectively.

Checking the admissibility requires a pair of clusters and as a result we obtain “the pairof clusters is admissible” if the admissibility condition is satisfied or “the pair of clustersis not admissible” if the admissibility condition is not satisfied. The last observation isat the same time the motivation for introducing the admissibility function.

40

2.5. Block Cluster Tree

Definition 2.38 (Admissibility function) Let TI and TJ be two cluster trees andATI

and ATJthe corresponding admissibility trees. The admissibility function is a map-

pingAdm : TI × TJ → admissible, inadmissible (2.7)

defined as

Adm(t, s) := “admissible“ if mindiam(At),diam(As) ≤ η dist(At, As)

“inadmissible“ otherwise

and analogously Admmax(t, s) with “min“ replaced by “max“.

2.5. Block Cluster Tree

The goal of constructing the cluster tree is to find candidates for testing the admissibilitycondition. Testing all candidates will be too expensive since the number of all possiblecandidates for a cluster tree TI is (#TI)2 = O(n2) if #I = n.One way to avoid high costs is to check the admissibility on each level, and only in thecase of inadmissibility use clusters that belong to the lower levels. All admissible clusterpairs are stored in a cluster tree for I × J , called the block cluster tree.

Definition 2.39 (Block Cluster Tree) Let TI be a cluster tree for the index set I,and let TJ be a cluster tree for the index set J . A tree TI×J is a block cluster treeif the following conditions hold:

1. TI×J = (V,E) is a labeled tree with vertex set V (TI×J ) and edge set E(TI×J ).

2. The set of labels is L := P(I) × P(J ) and the labeling function is defined as

ˆ: V → L, b ⊂ I × J for all b ∈ V.

3. root(TI×J ) = I × J .

4. Each vertex b ∈ V (TI×J ) has the form b = (t, s) for clusters t ∈ TI and s ∈ TJ

and it is labelled as b = (t, s) := t× s.

5. For each vertex (t, s) ∈ V (TI×J ) with sons((t, s)) 6= ∅, we have

sons((t, s)) = (t′, s′) : t′ ∈ sons(t), s′ ∈ sons(s), t′ 6= ∅, s′ 6= ∅

In most cases, we consider a block cluster tree TI×I based only on one cluster treeTI = TJ . To construct the block cluster for given cluster trees TI and TJ , we use thefollowing steps:

Algorithm 2.40 (Block cluster tree construction)

INPUT The cluster trees TI and TJ , the admissibility trees ATIand ATJ

, admissibilityfunction Adm.

41

2. Clustering

START Check the admissibility of the root clusters root(TI) and root(TJ ) and ifAdm(root(TI), root(TJ )) = inadmissible then for levell = 1, . . . ,mindepth(TI),depth(TJ )

DO Check the admissibility of the clusters t ∈ T(l)I and s ∈ T

(l)J , and if Adm(t, s) =

inadmissible, we define the successors in the following way:

sons((t, s)) = (t′, s′) : t′ ∈ sons(t), s′ ∈ sons(s), t′ 6= ∅, s′ 6= ∅

and proceed by recursion for (t′, s′) ∈ sons((t, s)).

STOP The algorithm terminates if all leaves (t, s) of TI×I are either admissible orsons(t) = ∅ and sons(s) = ∅.

Example 2.41 In order to show the construction of the block cluster tree we choose thesimple discretisation scheme from Example 2.10. The cluster tree TI is obtained by boxtree clustering as it is shown in Figure 2.15. Based on TI we construct the block clustertree TI×I by Algorithm 2.40. The complete process of constructing the block cluster treeTI is presented in Figure 2.16.

12

3

45

6

7

0

3,4,2,6,5,7,1,0

3,4,2 6,5,7,1,0

23,4 6,5,7 1,0

3 4 6,5 7 1,0

1,0

6 5

1 0

0

1 2

3 4 65

7 8 9 10 11

12 13 14

15 16

T

Figure 2.15.: The index set I, of the grid τ , is clustered by the box tree clustering and thecluster tree TI is obtained (Example 2.20). The cluster tree is numberedas the last tree in the row shows. The non-numbered clusters are emptyclusters that are not involved in the construction of the block cluster tree.

2.6. Complexity

For a matrix MI×I one can count the number of non-zero entries per row

c := maxi∈I

#j ∈ I |Mij 6= 0**

***

****

* **

42

2.6. Complexity

0

1

2

0

21

7 8 9 10 11

7

8

9

10

11

12 13 14

14

13

12

3 4 5 6

6

5

4

3

0

1

2

0

21

0

0

0

1

2

0

21

3 4 5 6

6

5

4

3

7 8 9 10 11

7

8

9

10

11

0

1 2

3 4 6

7 8 9

12 13 14

15 16

1110

5

0

1

2

3

6

912

15

1614 11

10

13

8

7

4

50

1

2

0

21

3 4 5 6

6

5

4

3

Color legend

inadmissible block cluster

admissible block cluster

Figure 2.16.: The process of constructing the block cluster tree. We start with checkingthe admissibility for the pair (root(TI), root(TI)), that is not admissible.Therefore we continue checking the admissibility for the pairs of clusters(here presented by numbers) (1, 2), (1, 1), (2, 1), (2, 2), and stop if there isan admissible pair or continue if the admissibility condition is not satisfiedand the clusters are not leaves.

such that the number of nonzero entries in the whole matrix is at most c#I. Theconstant c measures the sparsity of the matrix M . The block cluster tree has a similarsparsity property which is measured by the sparsity constant Csp.

Definition 2.42 (Sparsity constant) Let TI×J be a block cluster tree based on TIand TJ . We define the sparsity constant Csp (cf. Figure 2.17) of TI×J by

Csp := max

maxt∈TI

#s ∈ TJ | (t, s) ∈ TI×J ,

maxs∈TJ

t ∈ TI | (t, s) ∈ TI×J .

Further on, we introduce the set of leaves for TI×J , this time with respect to the valueof the admissibility function.

Definition 2.43 Let TI×J be a block cluster tree based on TI and TJ . The set ofleaves is defined by L(TI×J ) := (t, s) ∈ TI×J | sons(t, s) = ∅. We define also the

43

2. Clustering

Figure 2.17.: The sparsity constant for this matrix is Csp = 8.

set of “inadmissible” leaves, denoted by L−(TI×J ) := (t, s) ∈ L(TI×J ) | Adm(t, s) =“inadmissible“ and the set of “admissible” leaves is denoted as L+(TI×J ) := L(TI×J )\L−(TI×J ) or L+(TI×J ) := (t, s) ∈ L(TI×J ) | Adm(t, s) = “admissible“.

There holds also the following lemma:

Lemma 2.44 Let TI×J be a block cluster tree based on TI and TJ with sparsity constantCsp.

1. If TI and TJ satisfy sons(v) 6= 1 for all vertices v ∈ TI ∪ TJ , then

#TI ≤ 2#L(TI) + 1, #TI×J ≤ 2Csp min#I,#J .

2. Let p := depth(TI×J ) > 1. Even if sons(v) 6= 1 is not fulfilled, we have

#TI ≤ 2p#I, #TI×J ≤ 2pCsp min#I,#J .

3. The previous estimates provide a bound for #L(TI×J ) ≤ #TI×J .

Proof: Lemma 2.2 in [31].

Lemma 2.45 Let TI×J be the block cluster tree. Then ∀(t, s) ∈ TI×J holds

t× s =⋃

(t′,s′)∈L(TI×J )(t′ × s′) ∩ (t× s)

Proof: Let p = depth(TI×J ). We prove this statement by induction over the levelsl = p, p−1, . . . , 0 of the block cluster tree. “⊇” holds trivially and for all leaves (t′, s′) 6=(t′′, s′′) ∈ L(TI×J ) holds t′ × s′ ∩ t′′ × s′′ = ∅. Therefore we prove only “⊆ ”.

Induction start: Let (t, s) ∈ TI×J such that level((t, s)) = p. Then (t, s) ∈ L(TI×J ) ⇒t× s ⊆ t× s.

44

2.6. Complexity

Induction step: We claim our assumption holds ∀(t, s) ∈ T(l+1)I×J . On level l we consider

two possibilities:

1. if sons((t, s)) = ∅ then (t, s) ∈ L(TI×J ) ⇒ t× s ⊆ t× s.

2. if sons((t, s)) 6= ∅ then

t× s =⋃

(t′,s′)∈sons((t,s))t′ × s′

=⋃

(t′,s′)∈sons((t,s))

⋃

(t′′,s′′)∈L(TI×J )

(t′′ × s′′) ∩ (t′ × s′)

=⋃

(t′′,s′′)∈L(TI×J )

(t′′ × s′′) ∩ (t× s).

The last equation proves the statement. We have used the fast that (t′, s′) ∈sons(t, s) are on the level l + 1 and the induction assumption.

Lemma 2.46 The block cluster tree TI×J fulfills

I × J =⋃

(t,s)∈L(TI×J )

t× s,

i.e., the leaves of the block cluster tree form a partition of I × J on the one side, andat the same time the partition of any matrix M ∈ RI×J .

Proof: Apply the previous lemma for the root cluster.So far, we have constructed the cluster tree TI that contains the information aboutthe subdivision of the domain, and the block cluster tree TI×I that provides the blockstructure of the matrix according to Lemma 2.46. Further steps in constructing an H-matrix will consider the block clusters. If a block cluster is admissible, then we canconstruct a low-rank approximation of the corresponding matrix block using the schemepresented in Subsection 1.5. In the case that the block cluster is an inadmissible leaf, thenthe corresponding matrix block can be represented as full matrix. If the block cluster isinadmissible and has sons, then the corresponding matrix block can be represented as ablock matrix. This is, roughly described the H-matrix structure, that will be topic ofthe next chapter.

45

2. Clustering

46

3. Hierarchical Matrices

In the previous chapter we have introduced the cluster tree and the block cluster tree.These two structures are the key components in constructing hierarchical or H-matrices(the last name will be used throughout this work). The first of them, the cluster tree,gives us the candidates for checking the admissibility condition, while the second, theblock cluster tree, contains the structure of the H-matrix. The structure of H-matricesis the main topic of this chapter. Also, this chapter includes the definitions of rank kmatrices (shortly Rk-matrices) that will be used to approximate all admissible matrixblocks, the definition of H-matrices and three different algorithms for assembling the Rk-matrices. One section will be devoted to the description of H2-matrices whose updatewill be considered as well. Finally, the last section will give a description of adaptiverefinement of the grid and its influence on the corresponding hierarchical matrices.

3.1. Rk-Matrix

In this section let m,n, k ∈ N0 be natural numbers including zero.

Definition 3.1 (R≤k-Matrix) A matrix M ∈ Rn×m is an R≤k-matrix if its rank is atmost k. The set of n×m matrices of rank at most k is denoted by

R(k, n,m) = M ∈ Rn×m| rank(M) ≤ k.

Knowing the rank of the matrix is useful, but the representation is also important.There are different representations possible for R≤k-matrices and we choose the set ofRk-matrices.

Definition 3.2 (Rk-Matrix) A matrix M ∈ Rn×m is represented in Rk-matrix formatif

M =k∑

i=1

ai(bi)T = ABT

for ai ∈ Rn and bi ∈ Rm, i = 1, . . . , k. The matrix A ∈ Rn×k contains the vectors ai,while the matrix B ∈ Rm×k contains the vectors bi.

Figure 3.1 illustrates an arbitrary matrix represented in Rk-matrix format.

Remark 3.3 (Storage) The storage requirement NR,St(n,m, k) for an n×m Rk-matrixM is

NR,St(n,m, k) = k(n +m).

47


= AB

m m

n n

k

k

T

Figure 3.1.: The Rk-matrix M ∈ Rn×m is represented as the product of the matricesA ∈ Rn×k and B ∈ Rm×k.

Remark 3.4 (Storage) The storage requirement NF,St(n,m) for an n×m full-matrix(full-matrix: array of n ·m entries, the standard dense matrix format) M is

NF,St(n,m) = n ·m.

3.2. H-Matrix

In this section we give the definition, storage requirements and an algorithm for theconstruction of H-matrices.Let TI , TJ be cluster trees based on the index sets I, J . Further, let nmin be the leafsizefor both cluster trees. Let TI×J be a block cluster tree based on TI and TJ .

Definition 3.5 (H-Matrix) Let nmin ∈ N0. The set of H-matrices induced by a blockcluster tree TI×J with blockwise rank k and leafsize nmin is defined as

H(TI×J , k) := M ∈ RI×J |∀(t, s) ∈ L(TI×J ) : rank(M |t×s) ≤ k

or #t ≤ nmin or #s ≤ nmin.

A matrix M ∈ H(TI×J , k) is said to be given in H-matrix representation, if for all leaves(t, s) of TI×J with #t ≤ nmin or #s ≤ nmin the corresponding matrix block M |t×s isgiven in full-matrix representation and in Rk-matrix representation for the other leaves.

Remark 3.6 The H-matrices have a block structure organised in the following way:

• If a leaf (t, s) is admissible then the corresponding matrix block M |t×s is representedin the Rk-matrix format.

• If a leaf (t, s) is inadmissible then the corresponding matrix block M |t×s is repre-sented in the full-matrix format.

A common graphical representation of an H-matrix is presented in Figure 3.2.

48

3.2. H-Matrix

Figure 3.2.: The green color is used for representing the Rk-matrices, while the red blocksrepresent full matrix blocks.

Remark 3.7 (Properties of H-Matrices) Hierarchical matrices are data-sparse, i.e.,they can be described by few data and stored efficiently. The advantage of using thehierarchical matrices for approximating densely populated matrices is in the arithmetic(matrix-vector multiplication, matrix-matrix multiplication, inversion, matrix functions)that can be performed in almost linear complexity. For H-matrices the arithmetic hasbeen carefully developed and explained particularly in [27], and in various articles (e.g.[31, 14, 13]). For the H2-matrix arithmetic we refer to some promising results publishedin [8], [10].

In this work we shall not discuss any of these arithmetic, but we shall entirely focus on thestructure of hierarchical matrices. Therefore, we define the algorithm for constructingan H-matrix.

Algorithm 3.8 (Hierarchical matrix construction)

INPUT Index sets I, J , admissibility parameter η, algorithms for computing entriesfor low-rank blocks and inadmissible blocks.

DO

1. Construct the cluster trees TI and TJ from the index sets I and J respectively.

2. Construct the block cluster tree TI×J from the cluster trees TI and TJ .

3. Compute the entries for admissible and inadmissible matrix blocks.

OUTPUT Hierarchical matrix in H(TI×J , k).

The first two steps of Algorithm 3.8 are covered in Chapter 2. The last one will bepresented in the forthcoming sections.The storage requirements and the costs of matrix-vector multiplication are estimated inthe two following lemmas.

49


Lemma 3.9 (Storage) Let TI and TJ be cluster trees with leafsize nmin and TI×J ablock cluster tree based on TI and TJ with sparsity constant Csp and depth p. Then thestorage requirements NH,St(TI×J , k) for an H-matrix M ∈ H(TI×J , k) are bounded by

NH,St(TI×J , k) ≤ Csp maxk, nmin(p+ 1)(#I + #J )

Proof: The proof is based on the following facts:

1. The set of leaves L(TI×J ) can be presented as disjoint union of admissible andinadmissible leaves

L(TI×J ) := L+(TI×J ) ∪ L−(TI×J ).

or as disjoint union of the leaves on the each level

L(TI×J ) :=

·⋃

l=0,...,p

L(TI×J , l).

2. The storage requirements for admissible blocks are k(#t + #s) while the storagerequirements for inadmissible blocks are #t · #s.

3. There holds ∑

t∈T(i)I

#t ≤ #I and∑

s∈T(i)J

#s ≤ #J

Proof: We will carry out this proof by contradiction. We assume that on contraryholds ∑

t∈T(i)I

#t > #I.

This would mean that there exist clusters t, s ∈ T(i)I such that t 6= s and t∩ s 6= ∅.

This statement contradicts with Lemma 3.11.

Nst(TI×J , k) =∑

(t,s)∈L+(TI×J )

k(#t+ #s) +∑

(t,s)∈L−(TI×J )

#t · #s

≤∑

(t,s)∈L+(TI×J )

k(#t+ #s) +∑

(t,s)∈L−(TI×J )

nmin(#t+ #s)

≤p∑

i=0

∑

(t,s)∈T(i)I×I

maxk, nmin(#t+ #s)

=

p∑

i=0

∑

(t,s)∈T(i)I×J

maxk, nmin#t+

p∑

i=0

∑

(t,s)∈T(i)I×J

maxk, nmin#s

≤ Csp maxk, nminp∑

i=0

( ∑

t∈T(i)I

#t+∑

s∈T(i)J

#s)

≤ Csp maxk, nmin(p+ 1)(#I + #J ).

50

3.3. Low-Rank Approximation

Lemma 3.10 (Matrix-vector multiplication) Let TI×J be a block cluster tree. Thecomplexity NH·v(TI×J , k) of the matrix-vector product in the set of H-matrices can bebounded from above and below by

NH,St(TI×J , k) ≤ NH·v(TI×J , k) ≤ 2NH,St(TI×J , k).

Proof: Lemma 2.5 in [31].

Lemma 3.11 Let TI be a cluster tree. For all t, s ∈ TI with t 6= s and level(t) = level(s),we have t ∩ s = ∅.

Proof: In order to prove this claim we use induction over level(t) = level(s) ∈ N0. Forlevel(t) = level(s) = 0 we have t = s = root(TI) and our statement is trivial. Let l ∈ N0

be such that

t 6= s⇒ t ∩ s = ∅holds for all t, s ∈ TI with level(t) = level(s) = l.Let t, s ∈ TI with t 6= s and level(t) = level(s) = l + 1. Since level(t) = level(s) > 0there are clusters t+, s+ ∈ TI with t ∈ sons(t+), sons(s+) and level(t+) = level(t) − 1 =l = level(s) − 1 = level(s+).If t+ = s+, we have s ∈ sons(t+), i.e., t, s are different sons of the same cluster, andDefinition 2.4 implies t ∩ s = ∅.If t+ 6= s+, we can apply the induction assumption in order to find t+ ∩ s+ = ∅. Defini-tion 2.4 yields t ⊆ t+ and s ⊆ s+, which implies t ∩ s ⊆ t+ ∩ s+ = ∅ and concludes theinduction.


In Section 1.3 we have introduced the stiffness matrix Gn, which arises from the (Ritz-Galerkin) discretisation of an integral operator. The index n is a notation for the di-mension #I and it will be omitted in the further considerations. The entries for thematrix G are defined in the following way:

Gij :=

∫

Γ

∫

Γϕi(x)g(x, y)ϕj(y)dΓxdΓy. (3.1)

The matrix G is a dense or full matrix due to the non-locality of the kernel g. Also, thekernel function possesses singularities on the diagonal. In the parts of the domain thatare far from singularities —the “far field”— we can apply a degenerate kernel expansion(1.5) that will lead to a low-rank approximation, represented in the Rk-matrix format.Exploiting the last observation, we would like to approximate the stiffness matrix G by

51


an H-matrix G. In the following we consider a degenerate approximation of the kernelfunction g(x, y):

g(x, y) :=k∑

ν=1

h1ν(x)h

2ν(y). (3.2)

Inserting this into (3.1) we obtain :

Gij :=

∫

Γ

∫


=k∑

ν=1

∫

Γ

∫

Γϕi(x)h

1ν(x)h2

ν(y)ϕj(y)dΓxdΓy

=k∑

ν=1

∫

Γϕi(x)h

1ν(x)dΓx

∫

Γϕj(y)h

2ν(y)dΓy

= (ABT )ij , where

Aiν :=

∫

Γϕi(x)h

1ν(x)dΓx, (3.3)

Bjν :=

∫

Γϕj(y)h

2ν(y)dΓy. (3.4)

(3.5)

Since a global degenerate approximation of the kernel function g(·, ·) can not be found,we work with local approximations on admissible pairs of clusters, i.e., i ∈ t and j ∈ s,where (t, s) is an admissible block.Let the block b = (t, s) be admissible. Then the matrix block G|t×s can be approximated

by a low-rank matrix of the form ABT where A ∈ R#t×k and B ∈ R#s×k with k as in(3.3), (3.4).There are several possibilities to compute a low-rank approximation for a matrix blockG|t×s. One possibility has been presented in Example 1.16. In this example we haveused the Taylor expansion to separate the variables. This method requests computationof the first k derivatives of the kernel function (which might be complicated to compute).Instead of using the Taylor expansion, we can use different methods:

• interpolation,

• adaptive cross approximation (ACA), or

• hybrid cross approximation (HCA), a combination of the previous two.

In the following subsections we shall assume that the cluster tree TI is based on I,the index set of the basis functions, the block cluster tree TI×I is based on TI andconstructed by Algorithm 2.40. We shall focus on the admissible leaves correspondingto admissible matrix blocks. Therefore, we fix one admissible block cluster (t, s) andinvestigate the low-rank approximation of the matrix block G|t×s.

52


3.3.1. Interpolation

The block (t, s) is admissible by means of Definition 2.3, and without loss of generalitywe assume that diam(At) ≤ diam(As).Let (xν)ν∈K be a family of interpolation points in Rd, and let (Lν)ν∈K be the corre-sponding Lagrange polynomials satisfying for ν, µ ∈ K:

Lν(xµ) = δνµ.

For a fixed y ∈ Rd we interpolate the function x 7→ g(x, y) and obtain a degenerateapproximation separating the variables x and y:

g(x, y) =∑

ν∈K

g(xtν , y)Lt

ν(x) (3.6)

The matrix block G|t×s is approximated by the matrix block G|t×s of the form:

(G|t×s)ij :=∑

ν∈K

∫

Γϕi(x)Lt

ν(x)dΓx

∫

Γϕj(y)g(x

tν , y)dΓy.

We set

Aiν :=

∫

Γϕi(x)Lt

ν(x)dΓx,

Bjν :=

∫

Γϕj(y)g(x

tν , y)dΓy.

It is obvious that the rank of G|t×s = ABT is bounded from above by #K.The degenerate kernel expansion can be computed for an admissible pair of clusters (t, s).In order to obtain the degenerate kernel we apply an interpolation. Therefore we haveto find a set of interpolation points and a set of corresponding Lagrange polynomials foreach cluster t ∈ TI such that the approximation error on the corresponding domain Ωt

is small enough. Since Ωt is a general domain, it would be difficult to compute a goodinterpolation operator. Therefore, we approximate the kernel function g(x, y) not on thegeneral domain Ωt but on the corresponding admissibility box At ⊇ Ωt. Let us recallthat the same simplification has been used for checking the admissibility condition.For simplicity we first discuss a one-dimensional interpolation scheme. For the in-terval [−1, 1] we take the m-th order Chebyshev interpolation points (xν)

mν=0 =(

cos(

2ν+12m+2π

))m

ν=0. For this set of interpolation points we define the Lagrange polyno-

mial and the corresponding interpolation operator by:

Lν(x) =∏m

µ=0,µ6=νx−xµ

xν−xµ

Jm : C[−1, 1] → Pm, f 7→m∑

ν=0

f(xν)Lν .

The admissibility box At is a tensor product of arbitrary intervals of the type [ai, bi],i.e., At := [a1, b1]× . . .× [ad, bd]. The interpolation points and interpolation operator onsuch domains are defined by the following algorithm:

53


Algorithm 3.12 (Interpolation scheme for tensor product domains)

INPUT The cluster t and its label t, the corresponding admissibility box At, Chebyshevpoints xν, Lagrange functions Lν and interpolation operators Jm.

DO

1. Transformation of the Chebyshev points for the interval [ai, bi]:

Φ[ai,bi] : [−1, 1] → [ai, bi], x 7→ bi+ai

2 + bi−ai

2 x,

x[ai,bi]ν := Φ[ai,bi](xν).

2. Computation of the Lagrange functions for the interval [ai, bi]:

L[ai,bi]ν := Lν Φ−1

[ai,bi], L[ai,bi]

ν (x) =m∏

µ=0,µ6=ν

x− x[ai,bi]µ

x[ai,bi]ν − x

[ai,bi]µ

.

3. Definition of the interpolation operator for the interval [ai, bi]:

J [ai,bi]m : C[ai, bi] → Pm,

J [ai,bi]m [f ] := (Jm[f Φ[ai,bi]]) Φ−1

[ai,bi],

J [ai,bi]m [f ] =

∑mν=0 f(x

[ai,bi]ν )L[ai,bi]

ν .

4. Definition of the set of multiindices:

K := ν ∈ Nd0 : νi ≤ m for all i ∈ 1, . . . , d = 0, . . . ,md.

5. Computation of the interpolation points for the admissibility box [a1, b1]×. . .×[ad, bd] corresponding to the cluster t:

xtν := (x[a1,b1]

ν1, . . . , x[ad,bd]

νd), ν ∈ K. (3.7)

6. Lagrange functions corresponding to the admissibility box:

Ltν(x) =

(L[a1,b1]

ν1⊗ . . .⊗ L[ad,bd]

νd

)(x) =

d∏

i=1

L[ai,bi]νi

(xi).

7. Tensor-product interpolation operator

J tm := J [a1,b1]

m ⊗ . . .⊗ J [ad,bd]m , J t

m[f ](x) =∑

ν∈K

f(xtν)Lt

ν(x).

OUTPUT The transformed Chebyshev points xtν and the Lagrange polynomials Lt

ν thatdefine the interpolation operator J t

m.

54


A similar low-rank approximation can be obtained in the case that diam(As) ≤ diam(At).Then, for a fixed x ∈ Rd we interpolate the function y → g(x, y) and obtain

g(x, y) :=∑

ν∈K

g(x, ysν)Ls

ν(y). (3.8)

As in the previous case we approximate the matrix block G|t×s by the block G|t×s whoseentries we compute by

(G|t×s)ij :=∑

ν∈K

∫

Γϕi(x)g(x, y

sν)dΓx

∫

Γϕj(y)Ls

ν(y)dΓy ,

where i ∈ t, j ∈ s and ν = 1, . . . , k. The conclusions about low-rank approximationachieved through interpolation are the same as in the previous case. The entries of thematrices A and B can be computed as:

Aiν :=

∫

Γϕi(x)g(x, y

sν)dΓx,

Bjν :=

∫

Γϕj(y)Ls

ν(y)dΓy.

Convergence of the interpolation schemeIn the general case for an arbitrary function f and cluster t with d-dimensional admis-sibility box At and corresponding interpolation operator J t

m there holds:

‖f − J tmf‖∞,At ≤

d(m+ 1)d−1

22m+1(m+ 1)!diam(At)

m+1 max‖∂m+1i f‖∞,At : i ∈ 1, . . . , d

(3.9)We assume that the kernel function g ∈ C∞(At ×As) is asymptotically smooth, i.e.,that

|∂αx ∂

βy g(x, y)| ≤ C(α+ β)!c

|α|+|β|0 ‖x− y‖−|α|−|β|−σ (3.10)

holds for some constants C, c0, σ ∈ R>0 and all multiindices α, β ∈ Nd0.

In the case of diam(At) ≤ diam(As) we have

|g(x, y) − g(x, y)| ≤ 2Cd(m+ 1)d−1

2 dist(At, As)σ

(c0η4

)m+1. (3.11)

This estimate is obtained using the estimate (3.9) and the property (3.10) of the kernelfunction .In a similar fashion we obtain the same estimate in the case diam(As) ≤ diam(At). Ifwe choose η < 4

c0, we have (c0η/4) < 1, and the approximation of the kernel function

converges exponentially in m. The precise proof for the convergence of the interpolationscheme can be found in [11] and [18]. Convergence for all η > 0 has been proven in [18].

Remark 3.13 (Admissibility condition and interpolation) The last estimates showthat the admissibility condition is responsible for the exponential convergence of the in-terpolation scheme.

55


If we assume that the block cluster (t, s) is admissible by means of the max-admissibility(Definition 2.4) then the kernel function can be interpolated in both directions. Weinterpolate the kernel function (x, y) → g(x, y) by

g(x, y) :=∑

ν∈K

∑

µ∈K

g(xtν , x

sµ)Lt

ν(x)Lsµ(y). (3.12)

The entries for the matrix block G|t×s are computed by

(G|t×s)ij :=∑

ν∈K

∑

µ∈K

g(xtν , x

sµ)

∫

Γϕi(x)Lt

ν(x)dΓx

∫

Γϕj(y)Ls

µ(y)dΓy

= V tSt,s(W s)T . (3.13)

The matrices V t ∈ Rt×K , W s ∈ Rs×K and St,s ∈ RK×K are given by

sW

VS

t

t,s

Figure 3.3.: Representation of the factorisation V tSt,s(W s)T .

(V t)iν :=

∫

Γϕi(x)Lt

ν(x)dΓx, i ∈ t,

(W s)jµ :=

∫

Γϕj(y)Ls

µ(y)dΓy, j ∈ s, (3.14)

(St,s)νµ := g(xtν , y

sµ) ν, µ ∈ K, (3.15)

where k is defined as k = (m+ 1)d. The matrix St,s is also called the coupling matrixand is of dimension k. That means, the rank of the factorised matrix block G|t×s is k.In the following we discuss the convergence of the interpolation in both variables. Underthe assumption that the kernel function g is asymptotically smooth (3.10) and that themax-admissibility (2.4) holds we have the following estimate:

|g(x, y) − g(x, y)| ≤ 4Cd(m+ 1)2d−1

dist(At, As)σ

(c0η4

)m+1. (3.16)

The proof of the last estimate can be found in [15] Lemma 5.1, Remark 5.2, and in [12].

3.3.2. Adaptive Cross Approximation

Adaptive cross approximation (or shortly ACA) is an algebraic method that tries toapproximate a given full matrix by a rank k matrix up to precision ǫ. The existence of

56


cross approximations has been proven and carefully discussed in [26]. Based on theseresults the adaptive cross approximation algorithm was developed in [5, 7]. Convergencehas been proven in special cases and the method provides a usable heuristic in generalsituations. Here we present just the basic algorithm from [7]. This algorithm uses thefull pivoting strategy that provides a complexity of order O(n2).

Let M ∈ Rn×m be a full matrix and ǫ > 0 a precision. The aim is to construct an approx-imation of the form

∑kν=1 aνb

Tν of M up to a relative error ‖M−∑k

ν=1 aνbTν ‖2 ≤ ǫ‖M‖2.

Algorithm 3.14 (Adaptive Cross Approximation)

INPUT A function that returns the matrix entry Mij for an index pair (i, j), the pre-cision ǫ.

STEP ν = 1 . . . k :

1. Determine a pivot index pair (i∗, j∗) maximising δ = |Mi∗j∗−∑ν−1

µ=1(aµ)i∗(bµ)j∗ |.2. Stop if δ = 0.

3. Compute the entries of the two vectors aν ∈ Rn, bν ∈ Rm by

(aν)i := Mij∗ −ν−1∑

µ=1

(aµ)i(bµ)j∗ ,

(bν)j :=1

δ

(Mi∗j −

ν−1∑

µ=1

(aµ)i∗(bµ)j

).

STOP IF ‖aν‖2‖bν‖2 ≤ ǫ‖a1‖2‖b1‖2.

OUTPUT The factorisation ABT ≈M .

In practice a modified algorithm is used that involves partial pivoting which reduces thecomplexity to O(nk2). The partial pivoting strategy fixes the column index j∗ (or rowindex i∗) and seeks the maximum in the column Mij∗.There are examples in which standard pivoting strategies fail [12]. A modified versionof ACA, called ACA+, which uses an improved strategy can be found in [29] and [12].

Remark 3.15 (Properties of ACA) The rank k is not fixed but depends on the choiceof the pivot elements. Related to the H-matrix technique, we notice that each admissibleblock can be approximated by ACA obtaining the desired low-rank approximation. Dif-ferently from the interpolation the admissibility boxes are not needed. Numerical resultsshow that ACA is faster than interpolation and the separation rank is almost optimal.

57


#t#s

MMM

k

k

U t

(V s)TABT

Figure 3.4.: Illustration of the HCA(I) algorithm. Matrices A and B are obtained asmatrix-vector multiplication between U t and A, and V s and B.

3.3.3. Hybrid Cross Approximation

The hybrid cross approximation (or abbreviated HCA) is a method which combines thebest properties of interpolation and ACA; it is convergent and fast. The aim remains thesame: find a low-rank approximation of the matrix block G|t×s where the block cluster(t, s) is max-admissible (Definition 2.4). There are two different algorithms known asHCA(I) and HCA(II), where the first one is closer to the interpolation method and thesecond one is closer to ACA.

(a) HCA(I) is based on the (tensor) interpolation of the kernel function in both variables(3.12). The factorization

G|t×s := U tSt,s(V s)T (3.17)

has a fixed rank k′. The matrix St,s ∈ Rk′×k′is in general a small full matrix and

will be replaced by the ACA approximation

St,s ≈ ABT (3.18)

where A, B ∈ Rk′×k. Inserting the factorisation (3.18) into (3.17) we obtain

G|t×s = U tABT (V s)T = (U tA)(V sB)T (3.19)

and define matricesA := U tA B := V sB. (3.20)

We conclude that A ∈ R#t×k and B ∈ R#s×k, and we notice that the computationof the matrices A and B involves a matrix-vector multiplication. This operationwill be introduced in Section 3.4. Figure 3.3.3 shows how the construction of thelow-rank approximation by HCA(I) looks like.

(b) HCA(II) does not require an ACA approximation of the matrix St,s as in (3.18) butonly the pivot indices from the ACA algorithm.The idea of HCA(II) is to approximate the kernel function g(x, y) in an admissiblebox At ×As by products of the form

g1(x, y) = g(x, yj1)g(xi1 , y)/g(xi1 , yj1),

58


where xi1 and yj1 are appropriate interpolation points from an m-th order inter-polation scheme in At ×As. The pivot elements i1 and j1 are those from an ACAapproximation of St,s.Successively, we approximate the remainder g−∑i

l=1 gl in the same way and obtainin the end an approximation of the form

g(x, y) :=k∑

l=1

( l∑

q=1

g(x, yjq)Cl,q

)( l∑

q=1

g(xiq , y)Dl,q

),

where Cl,q,Dl,q are given by recursion formula. If the kernel function g is of theform

g := DxDyγ

for a so-called generating kernel γ that is asymptotically smooth (g may be not),then the degenerate kernel is defined by

g := DxDy γ

=k∑

l=1

( l∑

q=1

(Dxγ)(x, yjq)Cl,q

)( l∑

q=1

(Dyγ)(xiq , y)Dl,q

).

The double integrals∫Γ

∫Γ ϕi(x)g(x, y)ϕj (y)dΓxdΓy now split into single integrals

of the form ∫

Γϕi(x)Dxγ(x, yjl

)dΓx,

∫

Γϕj(y)Dyγ(xil , y)dx.

Here, we can notice an additional advantage of HCA(II) as compared to ACA: ACArequires the evaluation of double integrals in order to compute approximations foradmissible blocks, while HCA(II) needs only single integrals that can be evaluatedby simpler quadrature rules.The admissible matrix block G|t×s can be approximated in the form G|t×s = ABT

for A ∈ R#t×k and B ∈ R#s×k.

A := UCT , B := V DT . (3.21)

The entries for the matrices U ∈ R#t×k and V ∈ R#s×k are given by

Uil :=

∫

Ωt

ϕi(x)Dxγ(x, yjl)dx Vjl :=

∫

Ωs

ϕj(y)Dyγ(xil , y)dy. (3.22)

The entries of the k × k matrices C and D are computed using Algorithm 3.3.3.

Remark 3.16 (Complexity) The total complexity of the HCA(I) algorithm is O((#t+#s)k′2k) while the complexity of the HCA(II) algorithm is O((#t+ #s)k2 + k′k2).

Remark 3.17 (Convergence) The convergence of the HCA algorithms has been provenin [12]. The proof is based on the convergence of the interpolation scheme and the resultsthat prove convergence of ACA in special cases.

59


Algorithm 1 HCA(II)

procedure HCA2(St,s, var A, B)Compute ACA approximation of St,s, (St,s ≈ ABT ) with A, B ∈ Rk′×k so that

‖St,s − ABT ‖2 ≤ ε‖St,s‖2

and store the pivot indices (iℓ)kℓ=1, (jℓ)

kℓ=1

Initialise C,D ∈ Rk×k and c, d ∈ Rk by zerofor ℓ = 1, . . . , k do

for i = 1, . . . , ℓ− 1 dodi := 0, ci := 0for q = 1, . . . , i doci := ci + Ci,qg(xiℓ , yjq)di := di +Di,qg(xiq , yjℓ

)end for

end forCℓ,ℓ := 1/

√|(aℓ)iℓ |, Dℓ,ℓ := sign((aℓ)iℓ)/

√|(aℓ)iℓ |

for q = 1, . . . , ℓ− 1 doCℓ,q := 0, Dℓ,q := 0for i = q, . . . , ℓ− 1 doCℓ,q := Cℓ,q − Ci,qdiCℓ,ℓ

Dℓ,q := Dℓ,q −Di,qciDℓ,ℓ

end forend for

end for

3.4. H2-Matrices

The idea of H2-matrices is similar to the idea of H-matrices. The aim remains the same:approximation of the full matrix by a specially structured block matrix. The blockstructure of H2-matrices is constructed using the block cluster tree that is based on thecluster tree and the max-admissibility condition. In the construction of the block clustertree appears the first difference between H-matrices and H2-matrices. In Algorithm2.40 the block cluster becomes a leaf if one of the clusters is leaf. Such a constructionleads to the storage requirement of O(n log n) for H-matrices. In the aim to reducestorage requirements to O(n) we change the construction of the block cluster tree in thefollowing way:If Adm((t, s)) =′′ inadmissible′′ then sons((t, s)) :=

(t′, s′) : t′ ∈ sons(t), s′ ∈ sons(s), t′, s′ 6= ∅ if sons(t) 6= ∅ and sons(s) 6= ∅(t′, s) : t′ ∈ sons(t), sons(s) = ∅, t′ 6= ∅ if sons(t) 6= ∅ and sons(s) = ∅(t, s′) : sons(t) = ∅, s′ ∈ sons(s), s′ 6= ∅ if sons(t) = ∅ and sons(s) 6= ∅∅ otherwise

60

3.4. H2-Matrices

#s#t

k k

kk

kCT D

U

V T

Figure 3.5.: Graphical representation of the low-rank approximation obtained by theHCA(II) algorithm. The matrices U and CT define the matrix A while thematrix product of V and DT defines the matrix B.

Concerning admissible blocks we can use interpolation of the kernel function in bothdirections (3.12), due to the max-admissibility (3.12) and obtain the following factorisa-tion

G|t×s = V tSt,s(W s)T ,

where the matrices V t, W s and St,s are defined in (3.14) and (3.15).The entries of the matrix St,s are just pointwise evaluations of the kernel function. Wecan notice that the matrix V t depends only on the cluster t and the matrix W s dependsonly on the cluster s. The last fact can be used to reduce the computational timesince the matrix V t(W s) needs to be computed (and stored) only once for each clustert ∈ TI(s ∈ TI). The second advantage is in computing the matrix-vector multiplication,and will be described below.

Remark 3.18 The coupling matrix St,s depends only on the admissibility boxes At andAs. This is easy to prove since the entries of the matrix St,s are obtained by pointwiseevaluation of the kernel function at transformed Chebyshev points, which depend only onthe admissibility boxes At and As.

Let t′ ∈ TI be another cluster. Since we use the same polynomial space for all clusterswe have

spanLtν : ν ∈ K = spanLt′

ν′ : ν ′ ∈ K,so there must be coefficients T t′,t

ν′,ν ∈ R such that

Ltν =

∑

ν′∈K

T t′,tν′,νLt′

ν′ (3.23)

holds, i.e., we can represent the Lagrange polynomials corresponding to the cluster tby the Lagrange polynomials corresponding to the cluster t′. The computation of thecoefficients is especially simple since we are working with Lagrange polynomials:

T t′,tν′,ν = Lt

ν(xt′

ν′). (3.24)

61


Let t, t′ ∈ TI . If we have an index i ∈ t with i ∈ t′ the equation (3.23) implies:

V tiν =

∫

Γ

ϕi(x)Ltν(x)dΓx =

∑

ν′∈K

T t′,t

ν′,ν

∫

Γ

ϕi(x)Lt′

ν′ (x)dΓx =∑

ν′∈K

T t′,t

ν′,νVt′

iν′ = (V t′T t′,t)iν . (3.25)

This equation allows us to speed up the matrix-vector computation: computing V tyt

directly for a vector yt ∈ Rk requires O(k#t) operations. If t is not a leaf, i.e., ifsons(t) 6= ∅, there is a t′ ∈ sons(t) for each i ∈ t such that i ∈ t′, and this implies(V tyt)i = (V t′T t′,tyt)i. So instead of computing V tyt directly, we can compute T t′,tyt

for all sons t′ ∈ sons(t), and this will require only O(k2) operations.

The matrices T t′,t are also called transformation matrices.Since the computation of the matrices V t for each clustert ∈ TI can be expensive, we use the transformation matricesto reduce costs. The Figure on the right-hand side illus-trates the approach. The matrices V t are computed onlyfor the leaves and all other matrices are obtained using theappropriate transformation matrices.

t

t’ t’’

t’2t’ t’’ t’’1 21

T T

T T T T

V V VVRemark 3.19 The matrix T t′,t depends only on the admissibility boxes At and At′ sincethe entries are obtained by evaluation of Lagrange polynomials at the transformed Cheby-shev points.

Although we have concluded that it is enough to compute the matrices V t only for theleaves we define a new structure based exactly on such matrices and the cluster tree TI .

Definition 3.20 (Cluster basis) Let TI be a cluster tree for the index set I. A family(V t)t∈TI

of matrices is a cluster basis, if for each t ∈ TI there is a number kt such thatV t ∈ R#t×kt

.A cluster basis (V t)t∈TI

is of constant order, if there is a k ∈ N0 such that k = kt holdsfor each t ∈ TI.

H2-matrices are a special kind of H-matrices whose low-rank blocks can be computedby interpolation of the kernel function in both variables as it has been done in (3.12). Ifthe H-matrices are based on the block cluster tree TI×I then the H2-matrices are basedon the cluster basis defined in Definition 3.20.

Definition 3.21 (Uniform H-matrix) Let TI×I be a block cluster tree and let (V t)t∈TI

and (W s)s∈TIbe cluster bases for the index set I. We define the set of uniform H-

matrices with row basis (V t)t∈TIand column basis (W s)s∈TI

as

H(TI×I , V,W ) := G ∈ RI×I | G|t×s = V tSt,s(W s)T for a matrix St,s ∈ Rkt×ks.

A uniform H-matrix is of constant order, if the cluster bases (V t)t∈TIand (W s)s∈TI

areof constant order.

62

3.5. Problem Description

Definition 3.22 (Nested cluster basis) A cluster basis (V t)t∈TIis nested, if for each

non-leaf cluster t ∈ TI and each son cluster t′ ∈ sons(t), there is a transfer matrix

T t′,t ∈ Rkt′×ktsatisfying

(V tyt)i = (V t′T t′,tyt)i

for all vectors yt ∈ Rktand all indices i ∈ t′.

Definition 3.23 (H2-matrix) A uniform H-matrix whose column and row cluster ba-sis are nested is called an H2-matrix.

Remark 3.24 H2-matrices have also the same block structure as H-matrices but theadmissible blocks are approximated by uniform matrices instead of Rk-matrices.


In this section we try to solve the problem how to update the hierarchical matrices.Wetry to answer the questions what the update of the hierarchical matrices is, what wasthe motivation for the problem and how to solve it. Precisely, we shall give answers tothe following questions:

Why have we chosen to solve this particular problem? (3.26)

What is the motivation for the update, resp.,

where is the source of the problem? (3.27)

How to perform the update efficiently? (3.28)

We shall divide this section into three subsections and each of them will give an answerto one of the stated questions. For simplicity we shall describe the source of the prob-lem for the one-dimensional case, and then it will be generalised. The solution will bepresented in form of an algorithm in the last subsection.

3.5.1. Motivation for the Problem

Before we explain why we have chosen to solve this particular problem we recall thebasic schemes introduced so far.In Section 1.1 we introduced the model problem (efficient approximation of integral op-erators that arise from solving Laplace’s equation) and one solution (a discretisationscheme that leads to the hierarchical matrix technique). The full matrix G ∈ Rn×n isapproximated by an H-matrix G. n is the number of basis functions, which span the(finite dimensional) discretisation space Vn, whose supports are defined by a grid withmesh size h. The aim of the discretisation scheme is to solve the integral equation in thefinite dimensional space. We will denote the discrete solution by uh. The natural ques-tion that arises is, how good is the discrete solution, resp. how good is the approximation‖u − uh‖, where u is the original solution. Similarly, we can ask whether the solution

63


K[u](x) =R

Γg(x, y)u(y)dy

Discretisation H-matrix technique

Figure 3.6.: The illustration of the model problem. The discretisation leads to a denselypopulated matrix that is approximated by an H-matrix.

will be better if the grid becomes finer, especially in the case if the grid is only locally(or partially) refined. Is it necessary, in the case of partial refinement of the grid, tocompute the entire H-matrix again? Figure 3.7 illustrates the last question. Finally

refinement

discretisationdiscretisation

?

T T ′

G G′

Figure 3.7.: The graphical representation of the problem. The grid T ′ is obtained byrefining the grid T . The H-matrices G, G′ can be obtained directly, apply-ing the discretisation scheme and H-matrix technique to the grids T ,T ′,respectively. The question is whether it is possible to obtain the matrix G′

from T ,T ′, and G without directly applying the discretisation scheme.

we can answer the question (3.26): because we would like to decrease the discretisationerror. In the aim to achieve this we have to assemble the H-matrix indirectly.

3.5.2. Source of the Problem: One-dimensional Example

Let T be a one-dimensional grid1, where τi := [ i8 ,

i+18 ] = suppϕτi

, i = 0, . . . , 7 , is thesupport of the basis function. We define a set of piece-wise constant functions indexedby I = 0, . . . , 7, i.e., there is a mapping QI : I → X0,−1

8 such that QI(i) = ϕτi.

1so far we considered the grids triangulating the surface Γ, in this case we make an exception in orderto simplify the explanation

64


0 1 2 3 4 5 6 7

0 1

Figure 3.8.: One-dimensional grid.

Further we cluster I obtainingthe cluster tree TI , that willserve for creating the blockcluster tree TI×I . The laststep will be to create the H-matrix G.

43210 5

76543210

765432

6 7

0 1 2 3 4 5 6 7

0 1

0 1

Now we refine the grid T obtaining the grid τ ′. Figure 3.9 illustrates the refined gridT ′. As in the previous case we define a set of piece-wise constant functions and we index

0 1 2 3 4 5 6 78

0 1

Figure 3.9.: Refined one-dimensional grid from Figure 3.8.

them by I ′ = 0, . . . , 7, 8 introducing the mapping Q′I′ : I ′ → X0,−1

9 , with Q′I′(i) = ϕ′

τi.

Similar to above we clusterthe index set I ′ and obtainthe cluster tree TI′ . Fol-lowing the algorithm we con-struct the block cluster treeTI′×I′ and the H-matrix G′.

0 1 2 3 4 5 6

2 3 4 5 6

7 8

7

7 8

87

8

7 8

1

0 1 2 3 4 5 6

0 1 2 3 4 5 6

0

0 1

The index mappings QI and Q′I′ coincide for all i ∈ 0, . . . , 6. Since we have already

clustered all indices from I and stored them in TI it is a natural question whether it ispossible to reuse TI in the construction of TI′ . Also, the same question can be asked forthe block cluster tree TI′×I′ . Finally, observing the H-matrices G and G′ we can noticethat there are matrix blocks that are identical. Therefore we ask whether it is possibleto use the matrix G in constructing the matrix G′.

65


3.5.3. Source of the Problem: Generalisation

Following the one-dimensional problem we describe the update problem in the generalcase. The problem and the solution will be represented in the form of an algorithm.

INPUT DATA

1. We start with the model problem introduced in Section 1.1: discretisation ofthe integral operator.

2. The geometry on the boundary Γ is defined by the triangulation T ( Subsection1.4.1) that contains the supports of the basis functions.

3. We introduce the finite dimensional discretisation space (dim(Vn) = n) spannedby basis functions ϕi | i ∈ I indexed by the index set I.

4. Applying the H-matrix technique we obtain the H-matrix G ∈ H(TI×I , k)that approximates the densely populated matrix G arising from the discreti-sation scheme.

APPLY a refinement scheme to the triangulation T , which defines a new triangulationT ′.

DEFINE the new discretisation space of dimension n′ ≥ n spanned by the set of basisfunctions ϕi | i ∈ I ′ indexed by I ′, #I ′ = n′.

CONSTRUCT a new H-matrix, G′, which approximates the densely populated matrixcorresponding to the new discretisation scheme.

IDEA is to indirectly compute the H-matrix G′, using the already given matrix G,under the assumption that only few basis functions have changed. This idea canbe considered as recycling the H-matrix instead of constructing a completely newone. This recycling process we name the update of a hierarchical matrix.

The idea of updating can be described by the following algorithm.

Algorithm 3.25 (Update algorithm for hierarchical matrices)

INPUT

1. The H-matrix G ∈ H(TI×I , k),

2. the corresponding block cluster tree TI×I and the cluster tree TI ,

3. the grid T indexed by the index set I.

DO

1. Apply the refinement scheme to the grid T and obtain a grid denoted by T ′.

2. Define the index set I ′ based on the grid T ′.

3. Construct the cluster tree TI′, based on I ′, as an update of the cluster treeTI (in Chapter 4).

66


4. Construct the block cluster tree TI′×I′ as an update of TI×I, using the al-ready updated cluster tree TI′ (also in Chapter 4).

5. UPDATE admissible and inadmissible matrix blocks from G (Chapter 5).

OUTPUT

1. The grid T ′, index set I ′.

2. The cluster tree TI′ and the block cluster tree TI′×I′.

3. The H-matrix G′ ∈ H(TI′×I′ , k′).

In the case that we perform the update of H2-matrices there are some additionalsteps that need to be carried out.

Remark 3.26 The algorithm for the update of H2-matrices contains additional elementscompared to Algorithm 3.25. Those elements are:

1. The INPUT contains also the cluster basis VTI, WTI

and instead of the H-matrix

G we have the H2-matrix, also denoted by G as the input data .

2. The update of the cluster basis VTI(WTI

) is performed in the aim to obtain thecluster basis VTI′ (WTI′ ). This update is also based on the update of the clustertree.

3. The OUTPUT contains the updated cluster basis VTI′ (WTI′ )and the H2-matrix

G′.

The second step of the algorithm, updating the cluster tree and block cluster tree willbe discussed in Chapter 4. The update of admissible and inadmissible matrix blocks willbe the topic of Chapter 5.If the matrix G′ is assembled by the update procedure there are some questions thatarise:

1. If a matrix G′new is assembled by the direct method and based on the same block

cluster tree as G′, do those matrices coincide?

2. Is the update algorithm efficient?

3. Are there some cases when it is not useful to apply the update algorithm?

4. To which kind of practical problems can it be applied?

The answer on the first question will be given in Chapter 5. Other questions will beanswered in Chapter 6 where various numerical results will show the efficiency of theupdate algorithm.Before we investigate the steps of Algorithm 3.25, we go back to the problem descriptionand focus in the next section on the refinement scheme that is applied to the giventriangulation T .

67


3.6. Adaptive Refinement of the Grid

It has been explained in the previous section that the aim of this work is to reduce thediscretisation error. One way to do this is by refining the grid since this will reduce thediscretisation error due to Cea’s lemma (Lemma 4.2 in [20]). In this section we shallgive a brief overview of the refinement methods which we shall apply in this work.

3.6.1. Red, Green, Blue Refinement

The grid T , defined in Definition 1.4.1, is a system of triangles with special properties(Definition 1.7). The grid, i.e., triangulation can become finer without destroying thetriangulation properties. This can be achieved by applying refinement schemes. Beforewe introduce different refinement schemes, we shortly recall the three simple trianglerefinement patterns.Red, blue, and green (also known as bisection) refinements of a triangle T are defined asfollows

Definition 3.27 (Red, blue, green triangle refinement) Let τA,B,C be a given tri-angle and A1, B1, C1 mid points of the edges BC,CA,AB, respectively. Then

Red refinement divides a given triangle τA,B,C into four congruent triangles, whosevertices are the vertices of the triangle τA,B,C and the mid points of the edges.

τA,B,Cred−→ τA,C1,B1 , τB,A1,C1 , τC,B1,A1, τA1,B1,C1

Blue refinement divides a given triangle τA,B,C into three triangles, whose vertices arethe vertices of the triangle τA,B,C and the mid points of two edges.

τA,B,Cblue−→ τA,B,A1, τA,A1,B1 , τB1,A1,C or τA,B,B1 , τB,A1,B1, τA1,C,B1

Green refinement divides a given triangle τA,B,C into two triangles, whose vertices arethe vertices of the triangle τA,B,C and the mid point of one edge.

τA,B,Cgreen−→ τA,C1,C1, τC,C1,B or τA,B,A1 , τA,A1,C or τA,B,B1 , τB1,B,C

Figure 3.10 illustrates the refinement strategies of the triangle defined above.

3.6.2. Refinement Schemes

There are different refinement schemes. Here we shall mention some of them. Let T be agiven triangulation whose cardinality is n = #T . We distinguish between the followingrefinement processes:

The uniform refinement scheme that includes only red refinement for each trianglefrom the grid. The number of the triangles in the new grid is 4n. The regularityparameter κ (the maximum ratio of circumcircle radius to the radius of an inscribedcircle) remains unchanged and the mesh size is halved. Figure 3.11 illustrates theuniform refinement scheme.

68

3.6. Adaptive Refinement of the Grid

or

or or

A

AAA

A

A

A B

BBB

B

B

B C

C

C C

A1

A1A1

A1

B1

B1

B1

B1

C1

C1

blue

g

e

ee

r

rn

d

Figure 3.10.: Red, blue and green refinement scheme.

uniform refinement

Figure 3.11.: Example for uniform refinement of the grid: for each triangle in the gridon the left side the red refinement strategy is applied.

Bisection uses only the green refinement for each triangle in the grid. The cardinalityof the new grid is 2n, if bisection is applied to the whole grid. The proof that thisscheme does not lead to a degeneration of the triangulation can be found in [3].Figure 3.12 gives an example for bisection.

The red-green closure creates the grid using red and green refinement. This refine-ment rule, which can be found in [2], guarantees that each of the angles in theoriginal triangulation is bisected at most once. One example of red-green closureis represented in Figure 3.13.

The last two refinement schemes can be applied locally, i.e., to a part of the domain.Therefore we will call them local refinement schemes. All these schemes may yield agrid which is not uniform even if it was before the refinement. Figure 3.14 presents anexample for local refinement.

Definition 3.28 (Refined Triangulations) The triangulation T ′ is obtained from thetriangulation T by refinement if

∀τ ′ ∈ T ′ holds ∃τ ∈ T such that τ ′ ⊆ τ.

69


bisection

Figure 3.12.: Example for bisection: for each triangle from the grid on the left side thegreen refinement strategy is applied.

red green closure

Figure 3.13.: Example for red-green closure.

adaptive refinement

Figure 3.14.: The grid left is adaptively refined.

70

4. Update of the Cluster Tree

This chapter is devoted to the first two steps of the H-matrix update algorithm 3.25:update of the cluster tree and update of the block cluster tree. We will explain whatthe update of the cluster tree is, and why it is useful to perform an update at all. Thenwe will present the algorithm for updating the cluster trees assuming that the clustertrees we update are constructed by box tree clustering. Similarly to the construction,the update of the block cluster tree will be based on the update of the cluster tree.The first section of this chapter contains a brief introduction to the basic concepts in theupdate. The rest of the chapter is organised in the following way: since the update ofcluster trees requires various operations with cluster trees, we devote the second sectionto defining those operations. The third section contains the update algorithm for theblock cluster tree.

4.1. Introduction

The input data of Algorithm 3.25 contain, besides the hierarchical matrix, the grid T ,the cluster tree TI and the block cluster tree TI×I based on the index set I. The indexmappings defined in Definition 1.13 associate each index i ∈ I to a basis function ϕi

on the one hand, and on the other hand an element from P(Γ). In the following, thetriangulation T and the set of basis functions ϕi | i ∈ I will be referred to as the oldtriangulation and the old discretisation space (old set of basis functions).The refinement process applied to the triangulation T defines a new triangulation T ′ onthe boundary Γ. It corresponds to a new discretisation space with the new index set I ′.According to Definition 1.12 we set the new index mappings P ′

I′ and Q′I′ as

P ′I′ : I ′ → P(T ′) P ′

I′(i′) = suppϕi′ ,

Q′I′ : I ′ → Xk,r

n′ (Xk,−1n′ ) Q′

I′(i′) = ϕ′i′ .

Concerning the index sets I and I ′ we can define the set Istay as

Istay := i ∈ I | QI(i) = Q′I′(i′), for i′ ∈ I ′. (4.1)

If Istay 6= ∅ then there are basis functions from the old and the new discretisation schemethat coincide. In this case, we introduce the mapping

R : Istay → I ′ s.t R(i) = i′ ⇔ ϕi ≡ ϕ′i′ . (4.2)

Remark 4.1 The mapping R is injective and there holds R(Istay) ⊂ I ′.

71


We define the set

I ′new := I ′ \R(Istay) (4.3)

which contains only basis functions from the new discretisation scheme that do notcoincide with any basis function from the old discretisation scheme.

Remark 4.2 The index set Istay will be referred to as the set of “old” indices pointingout that the basis functions corresponding to those indices are identical in both discreti-sations. The index set I ′

new will be called the set of “new” indices, for they correspondto the basis functions that belong only to the new discretisation.

Remark 4.3 (Standard ansatz) If the basis functions are from the set X0,−1n , i.e. if

they are piecewise constant, then the mapping R fulfills

R(i) = i′ ⇔ τi = τi′ .

Our aim is to cluster the index set I ′ corresponding to the new discretisation space. Wecluster the index set I ′ indirectly using the already given cluster tree TI and the setsIstay and I ′

new.

Example 4.4 We introduce the set of piecewise constant basis functions in X0,−18 that

are consecutively numbered, i.e., indexed by I = 0, 1, 2, 3, 4, 5, 6, 7. The grid T containseight triangles and each of them is the support of one basis function. Figure 4.1 illustratesthe triangulation T and the corresponding cluster tree.If we refine the triangles 0 and 7 of T , we obtain the new grid T ′ which is numberedby the index set I ′ = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Figure 4.2 shows the new grid and thecluster tree based on the index set I ′. The set of old indices is Istay = 1, 2, 3, 4, 5, 6,i.e., R(Istay) = 2, 3, 4, 5, 6, 7, while the set of new indices is I ′

new = 0, 1, 8, 9.

0

1

2

3

45

6

7

T

3,4,2,6,5,7,1,0

3,4,2 6,5,7,1,0

23,4 6,5,7 1,0

3 4 6,5 7 1,0

1,0

6 5

1 0

Figure 4.1.: The left figure contains the grid T whose triangles are consecutively num-bered by the index set I = 0, 1, 2, 3, 4, 5, 6, 7. The corresponding basisfunctions are chosen to be piecewise constant. To the right we have thecluster tree based on the index set I and obtained by box tree clustering.

72

4.2. Update Algorithm for Cluster Trees

01

2

3

4

56

78

9

T ′

4,5,3 7,6,8,2,1,0,9

4,5 3 7,6,8 2,1,0,9

4 5 7,6 8

7 6

2,1 0,9

2,1

2 1

0,9

0,9

0 9

4,5,3,7,6,8,2,1,0,9

Figure 4.2.: The left figure contains the grid T ′ which is obtained by refining the grid T .To the right we have the cluster tree based on the index set I ′ and obtainedby box tree clustering.


This section will contain the update algorithm of the cluster tree. In the first subsectionwe shall explain the idea of indirect clustering. It will be followed by formulating theupdate algorithm which gives an answer to the question what the update of the clustertree means.

4.2.1. Update of the Cluster Trees

In Chapter 2 we have introduced different methods for clustering (geometrical clustering,cardinality balanced clustering and box tree clustering). These methods will be referredto as direct clustering methods, for all of them construct the cluster tree TI fromthe given index set I and the geometrical data (chosen points xi, i ∈ I) .The aim is to construct the cluster tree TI′ without using any of the direct methods butusing the already constructed cluster tree TI and the index set I ′.This construction of the new cluster tree based on the old cluster tree we call the in-direct clustering method. The motivation for introducing the indirect clusteringmethods is to construct the cluster tree with a cost of order O(#I ′

new) instead of O(I ′).The starting point is the cluster tree TI based on the index set I, obtained by a stan-dard clustering scheme (geometrical clustering, cardinality balanced clustering, box treeclustering) and the index set I ′. Also there are given the sets Istay and I ′

new introducedin (4.1) and (4.3) respectively. Additionally we define the set Iout ⊂ I of the indicesthat correspond to the basis functions from an old discretisation scheme that are not inthe new one,

Iout := I \ Istay.

The index set I ′ can be written in terms of the sets I, Istay and I ′new as

I ′ = R(Istay) ∪ I ′new. (4.4)

Motivated by the previous equation we can define the cluster tree TI′ as

TI′ = TR(Istay)∪TI′new

. (4.5)

73


where TR(Istay) is a cluster tree based on the set R(Istay) and TI′new

is the cluster treebased on the set I ′

new. The operation ∪ will be defined in Subsection 4.2.4, and it definesthe union of two cluster trees which we will call fusion.If the cluster trees TR(Istay) and TI′

neware obtained by the indirect clustering method

from TI then the construction of the cluster tree TI′ as in (4.5) is called update of thecluster tree.Before we define the algorithm for the update of the cluster tree, we recall that we haveassumed that the cluster tree TI is constructed by box tree clustering. Later we shallalso comment on the update in the case that the cluster tree TI was constructed by adifferent clustering scheme.

Algorithm 4.5 (Update of the cluster tree)

INPUT The cluster tree TI , the box tree TBΓused for clustering, the index sets I ′, Istay,

I, the leafsize nmin

DO

1. Construct the cluster tree TR(Istay).

2. Construct the cluster tree TI′new

.

3. Construct the cluster tree T ′I′ as the fusion of TR(Istay) and TI′

new.

FOR t ∈ L(T ′I′) DO

IF #t ≥ nmin THEN

cluster t using the box tree TBΓ.

OUTPUT The cluster tree TI′

The update of the cluster tree is trivial in two cases.

1. If Istay = ∅, i.e., I ′new = I ′. In this case the cluster tree TI′ has to be constructed

using a direct clustering scheme.

2. If Istay = I ′, i.e., the old and new discretisation space coincide, i.e., there was norefinement of the grid. In this case we have TI′ = TI .

Remark 4.6 1. The first step of the algorithm, the construction of the cluster treeTR(Istay), we name reduction. This cluster tree will be constructed using only thecluster tree TI.

2. The second step, the construction of the cluster tree TI′new

will be performed usingonly the box cluster tree that is associated to the cluster tree TI.

Before we introduce indirect clustering methods we introduce relations between clustertrees.

74


Definition 4.7 (Identically structured cluster trees) The cluster trees TI and TJhave isomorph structure if there holds:

1. V (TI) = V (TJ ).

2. E(TI) = E(TJ ).

Figure 4.2.1 shows two cluster trees with isomorph structure.

0,6

c

a b

a,b,c,d

da,b,c

a,b

4,8

2

2,0,6,4,8

2,0,6

2,0,6

Figure 4.3.: An example for isomorphically structured cluster trees. Although the labelsets are not the same, both cluster trees have the same number of vertices,edges and the set of sons is identical as well.

Remark 4.8 Two cluster trees of isomorph structure do not have necessarily the sameleafsize.

Definition 4.9 (Identical cluster trees) The cluster trees TI and TJ are identicalif they have isomorph structure and if the labeling function is the same.

Remark 4.10 For two cluster trees that have isomorph structure we say that they areidentical up to the labels.

4.2.2. Indirect Clustering

The construction of the cluster trees TIout, TR(Istay) and TI′new

needed for the update willnot be done by any of the direct methods (geometrical clustering, cardinality balancedclustering and box tree clustering) but it will be performed using the already constructedcluster tree TI .Here we shall present two constructions and algorithms that belong to the indirectclustering methods. The first of them will be the construction of the cluster tree TIout

and the second is the construction of the cluster tree TI′new

.The cluster tree TIout is based on the index set Iout which contains the indices that haveto be taken away from the cluster tree TI , since they correspond to basis functions thatare not a part of the new discretisation scheme.Let K ⊂ I. We construct the cluster tree TK based on the cluster tree TI as follows.

Construction 4.11 (The cluster tree TK)

1. We define the cluster tree TK := (V,E) such that V (TK) := V (TI) and E(TK) :=E(TI).

75


2. We define the set of labels L := P(K) and the labeling function

ˆ : V → L, t ⊂ K for all t ∈ V (TK).

3. ∀t ∈ V (TK) we define tV (TK) := K ∩ tV (TI).

For given Iout and TI the corresponding algorithm determining the cluster tree TIout hasthe following steps.

Algorithm 4.12 (The cluster tree TIout)

INPUT The index set Iout, the cluster tree TI

DO For all t ∈ TI construct the new cluster t′and label it by t′ := t ∩ Iout. Definethe mapping M : V (TI) 7→ V (TIout) such that M(t) = t′.

IF # sons(t) = 0 THEN sons(t′) := ∅.

ELSE Set sons(t′) := s′ | s ∈ sons(t) and s′ = M(s).

OUTPUT The cluster tree TIout.

We can see that the cluster tree TIout , except for the labels, has the same structure asthe cluster tree TI . Therefore all other dependencies in the cluster tree TIout (set ofleaves, levels, etc.) do not have to be defined separately.

Remark 4.13 It is a question whether it is necessary at all to construct the clustertree TIout, because the set of “old” indices, Istay, is used for the update. To answer thisquestion we have to observe the refinements of the given grid T that we shall also call“initial” or “starting” grid and denote by T (0). We apply the (local) refinement schemeto the grid T (0), obtaining the grid T (1), and we continue with the refinement process.In this way we obtain the system of grids

T (0) → T (1) → · · · → T (n) → · · ·

such that the grid T (k) and the corresponding discretisation space are finer than T (k−1)

and its discretisation space. Concerning the H-matrix technique, especially the updateof the cluster tree, we have developed the algorithm for reusing the elements of the olddiscretisation scheme, i.e., if we know which basis function remains unchanged after theupdate we can use them directly to construct the new discretisation. Using the languageof the index sets and cluster trees, we can denote the index set in the k-th step by Ikandthe corresponding cluster tree by TIk . According to Algorithm 4.5 we can construct thecluster tree TIk using the cluster tree TIk−1 if the sets Ik−1

old and Iknew are known.

If we apply the reverse process to the system of grids, coarsening, then the grid T (k−1)

and the corresponding discretisation scheme are coarser than the grid T (k) and its dis-cretisation. Concerning the update of the cluster tree we could construct the cluster treeTIk−1 from TIk if we know the set Ik

new and the set Ik−1out . Therefore, it is useful to mark

the set Iout and the corresponding cluster tree.

76


Example 4.14

In Example 4.4, Figure 4.1 shows the grid τ andthe corresponding cluster tree TI . If we refinethe triangles numbered by 0 and 7 we know thatthe basis functions corresponding to those twoindices will not be a part of the new discretisa-tion scheme. Therefore, we construct the clustertree TIout based on the set Iout = 0, 7.

07

7,0

7,0

7 0

0

0

The next task is to construct the cluster tree TI′new

based on the index set I ′new that

corresponds to the basis functions that are not yet clustered. We cluster these indicesusing a method similar to box tree clustering. The aim is to obtain the cluster treewhich has the same structure as the cluster tree TI .

Construction 4.15 (The cluster tree TI′new

)

1. We define the cluster tree TI′new

:= (V,E) such that V (TI′new

) := V (TI) andE(TI′

new) := E(TI).

2. We define the set of labels L := P(I ′new) and the labeling function

ˆ : V → L, t ⊂ I ′new for all t ∈ V (TI′

new).

3. i′ ∈ t′ if xi′ ∈ Ct, where Ct is the box cluster that corresponds to the cluster t, andxi′ is the chosen point that corresponds to the index i′.

The definition of the labels will be clear from the forthcoming algorithm. The algorithmfor constructing the cluster tree TI′

newis based on the reduced box tree. We recall that

box tree clustering is based on the box tree of infinite depth which is reduced to thenecessary boxes corresponding to clusters in TI , i.e., each cluster is associated with abox Ct from the box tree.

Algorithm 4.16 (The cluster tree TI′new

)

INPUT The index set I ′new, the set of the points xi | i ∈ I ′

new, the box tree TBΓ, the

cluster tree TI.

DO For all clusters t ∈ TI construct the cluster t′ and label it by t′ := i ∈ I ′new | xi ∈

Ct. Define the mapping M : V (TI) 7→ V (TI′new

) such that M(t) = t′.

IF # sons(t) = 0 THEN sons(t′) := ∅.

ELSE Set sons(t′) := s′ | s ∈ sons(t) and s′ = M(s).

OUTPUT The cluster tree TI′new

.

77


Example 4.17

Figure 4.2 in Example 4.4 shows the refined gridwhich contains four new triangles that are sup-ports of the four new piece-wise constant func-tions in the new discretisation. Those trianglesand the corresponding functions are indexed byI ′

new := 0, 1, 8, 9. The cluster tree TI′new

isconstructed using the cluster tree TI and thebox tree TBΓ

.

1,0,9

8,1,0,9

8

8 1 0,9

1

1

8,1,0,9

4.2.3. Reduction

The next step is to construct the cluster tree TR(Istay). The construction of this clustertree is the motivation for introducing the operation called reduction.

Definition 4.18 (Reduction) Let TI and TJ be two cluster trees that have identicalstructure. The reduction is the binary function

\ : TI , TJ → TI\J (4.6)

that defines the cluster tree TI\J which has an isomorph structure to TI and TJ .The labeling set LTI\J

is defined as

LTI\J:= P(I \ J ),

and the label of a cluster t ∈ TI\J with corresponding clusters t′ ∈ TI and t′′ ∈ TJ is

t := t′ \ t′′

If we assume that the set Iout and the cluster tree TI are given then the cluster tree TIout

can be constructed by indirect clustering. The cluster trees TI and TIout have isomorphstructure and using the reduction we can construct the cluster tree TI\Iout

. This clustertree contains the indices that correspond to the basis functions which are identical inboth discretisation schemes and is identical to the cluster tree TIstay . We write

TIstay = TI \ TIout.

Remark 4.19 The cluster tree TIstay obtained by reduction from the cluster trees TIand TIout is identical to the cluster tree TIstay obtained by the indirect clustering 4.12.

Proof: Independently of the construction, in both cases we obtain the cluster tree thathas the identical structure as the cluster tree TI . The labeling sets are also identicalsince there holds Istay = I \ Iout.

78


Construction 4.20 (The cluster tree TR(Istay))

1. We define the cluster tree TR(Istay) := (V,E) such that V (TR(Istay)) := V (TIstay)and E(TR(Istay)) := E(TIstay).

2. The labeling set is defined as L := R(t) | t ∈ P(Istay) (R is a mapping definedin 4.2) and the labeling mappingˆ is defined as

ˆ : V (TR(Istay)) → L, t ⊂ R(Istay) for all t ∈ V (TR(Istay)).

3. t := R(Istay ∩ t′) for t′ corresponding to t in TI .

The corresponding algorithm is similar to Algorithm 4.12.

Example 4.21 We consider once again Example 4.4. The triangles from τ (Figure 4.1)numbered by the indices Istay = 1, 2, 3, 4, 5, 6 are not refined and they remain the samein the grid τ ′. The corresponding basis functions, whose supports are those triangles,remain the same in both discretisation schemes. The mapping R : 1, 2, 3, 4, 5, 6 → I ′

takes Istay into the set 2, 3, 4, 5, 6, 7. Figure 4.4 shows the reduction. We reduce thecluster tree TI by the cluster tree TIout and obtain the cluster tree TI\Iout

. Applying themapping R we obtain the desired cluster tree TR(Istay) presented in Figure 4.5.

3,4,2

23,4

3 4 6,5

6 5

1

6,5 1

1

3,4,2,6,5,1

6,5,1

1

7

7,0

7,0

7 0

0

0

0

3,4,2,6,5,7,1,0

3,4,2 6,5,7,1,0

23,4 6,5,7 1,0

3 4 6,5 7 1,0

1,0

6 5

1 0

=\

Figure 4.4.: The cluster tree TI is reduced by the cluster tree TIout. As the result weobtain the cluster tree TI\Iout

.

4.2.4. Fusion

The fusion is a binary operation defined for cluster trees with isomorph structure.

Definition 4.22 (Fusion) Let TI and TJ be two cluster trees with isomorph structure.The fusion is a binary operation

∪ : TI , TJ −→ TI∪J

that defines the cluster tree TI∪J , such that it has the same structure as the cluster treesTI and TJ .

79


3,4,2

23,4

3 4 6,5

6 5

1

1

6,5,1

6,5 1

1

3,4,2,6,5,1

4 5

4,5 3

4,5,3

7,6 2

7,6

7

2

2

2

4,5,3,7,6,2

7,6,2

6

Figure 4.5.: Applying the mapping R we “rename” the cluster tree TI\Ioutand obtain

the cluster tree TR(Istay).

If the labeling sets of the cluster trees TI and TJ are denoted by LTIand LTJ

, then thelabeling set of the cluster tree TI∪J , LTI∪J

, is defined as:

LTI∪J:= X ∪ Y | X ∈ LTI

, Y ∈ LTJ

and the labels are defined by

t := t′ ∪ t′′ ∀t ∈ TI∪J

for corresponding clusters t′ ∈ TI and t′′ ∈ TJ .

This operation can be applied to the cluster trees TR(Istay) and TI′new

in the aim toobtain the cluster tree T ′

I′ . Both cluster trees have the same structure since both ofthem are based on the cluster tree TI . The label set of the cluster tree T ′

I′ is L :=P(R(Istay) ∪ I ′

new) = P(I ′). The algorithm for the fusion is similar to Algorithm 4.12.

Example 4.23 (Fusion) We consider once again Examples 4.4, 4.21 and 4.17. If weperform the fusion on the cluster trees TR(Istay) and TI′

new, we obtain the cluster tree T ′

I′

that has the same structure as the cluster tree TI but some leaves might be larger thanthe desired leafsize, nmin. Figure 4.6 illustrates the fusion.

4,5,3,7,6,8,2,1,0,9

4,5,3 7,6,8,2,1,0,9

4,5 3 7,6,8 2,1,0,9

4 5 7,6 8

7 6

2,1 0,9

2,1

2 1

8,1,0,9

8

8 1 0,9

1

1

8,1,0,9

1,0,9

2

7 6 2

2

4,5,3,7,6,2

4,5,3 7,6,2

4,5 3 7,6 2

4 5 7,6

∪ =

Figure 4.6.: The cluster tree TR(Istay) is fused with the cluster tree TI′new

. The result isthe cluster tree T ′

I′ .

The cluster tree T ′I′ as well as the cluster tree TI′ are based on the index set I ′. One

of the properties of the cluster tree is the leafsize. So far, in the constructions of the

80


previously introduced cluster trees, we have not changed the structure, not allowingthat the number of vertices increases. If we construct the cluster tree in the standardway, one of the input parameters is the leafsize, which determines when the clusteringterminates. Therefore we have to check whether the leaves of the obtained cluster treehave exceeded the desired leafsize after fusion and have to be subdivided further.

Algorithm 4.24

INPUT The cluster tree TI′, the leafsize nmin, the box tree TBΓ

DO ∀t ∈ L(T ′I′) check:

IF #t > nmin THEN apply the box tree clustering to the cluster t.

OUTPUT The cluster tree TI′.

Example 4.25 The cluster tree T ′I′ obtained by fusion of the cluster trees TR(Istay) and

TI′new

contains some leaves that are larger than the desired leafsize (we have taken nmin =1). Therefore, we cluster only those leaves whose size is larger than the leafsize usingthe box tree. Figure 4.7 illustrates this process.

4,5,3,7,6,8,2,1,0,9

4,5,3 7,6,8,2,1,0,9

4,5 3 7,6,8 2,1,0,9

4 5 7,6 8

7 6

2,1 0,9

2,1

2 1

4,5,3 7,6,8,2,1,0,9

4,5 3 7,6,8 2,1,0,9

4 5 7,6 8

7 6

2,1 0,9

2,1

2 1

0,9

0,9

0 9

4,5,3,7,6,8,2,1,0,9

Figure 4.7.: From the cluster tree T ′I′ we obtain the cluster tree TI′ clustering those

leaves whose leafsize is larger than the given nmin. The clustering of thoseleaves is marked in red.

Remark 4.26 The cluster tree TI′ obtained by the previously described process will becalled the updated cluster tree based on the initial cluster tree TI.

Remark 4.27 The cluster tree TI′ does not necessarily have the same structure as thecluster tree TI .

Is the updated cluster tree identical to the directly constructed one? We assume thatthe index set I indexes an old discretisation and an old grid τ , while I ′ indexes the new

81


discretisation and the new grid τ ′, that is obtained by refinement of the grid τ . Beforewe state a theorem we remark that the grids τ and τ ′ triangulate the same boundary Γand therefore the boundary box used for constructing the box tree is the same.

Theorem 4.28 Let TI be the cluster tree based on the index set I, obtained by the boxclustering method and T up

I′ the cluster tree based on I ′ and obtained as update of thecluster tree TI . Further, let T new

I′ be the cluster tree also based on I ′ and obtained bybox tree clustering. Then the cluster trees T up

I′ and T newI′ have isomorph structure and

identical labels.

Proof: First we shall prove that the cluster trees T upI′ and T new

I′ have isomorph structure.Since both cluster trees are based on the same box tree, each chosen point xi, i ∈ I ′, oneach level belongs exactly to one box cluster. To prove the last statement we considerthe set I ′. In the update scheme we have written this set as I ′ = R(Istay) ∪ I ′

new.All points xi, i ∈ R(Istay), are clustered directly, for they index the basis functionsthat are identical in both discretisation schemes. The points xi, i ∈ I ′

new, were by theupdate algorithm also directly clustered using the box tree. Therefore, the structures ofboth cluster trees is isomorph. The labels are identical because of the unique one-onecorrespondence between indices i ∈ I ′ and points xi to the corresponding box cluster.

We have assumed that the cluster tree TI was constructed by box tree clustering. Toeach cluster t ∈ TI we associate the box cluster Ct (Definition 2.19). If the cluster treeTI′ is an update of the cluster tree TI , then the box clusters remain the same sincethe box tree we used for updating is the same. Those box clusters are not suitable forchecking the admissibility condition since they, in general, do not contain the clustersupport Ωt. In order to remove this obstacle we have introduced extended box clustersCex

t (Definition 2.30). The parameter ρ used to define the extended cluster box hasbeen introduced to enlarge the box cluster such that the insertion of indices (arisingfrom adaptive grid refinement) does not require an update of the extended box cluster.Therefore, we assume that the extended box clusters also do not change in the updateprocedure. With this assumption we can define updated, identical and new clusters.

Definition 4.29 (Updated, identical and new cluster) Let TI and TI′ be two clus-ter trees such that TI′ is an update of TI . Let t ∈ TI and t′ ∈ TI′ be two clusters.

• The cluster t′ is an update of the cluster t if Ct = Ct′ and QI(t) 6= Q′I′(t′) hold.

• The clusters t and t′ are identical if Ct = Ct′ and QI(t) = Q′I′(t′).

• t′ ∈ TI′ is a new cluster if it is neither the update of some cluster from TI noridentical to some cluster from TI .

Remark 4.30 New clusters are obtained by clustering of the leaves from TI that becamelarger than nmin during the update process.

Lemma 4.31 The label of every cluster t′ from an updated cluster tree TI′ is the disjointunion of the sets t′new that contains the indices from the new discretisation scheme and

t′old that contains the indices from the old discretisation scheme.

82


Proof: The proof is based on two facts. t′ ⊂ I ′ for all t ∈ TI′ and I ′ can be written asdisjoint union of I ′ and R(Istay).

t′ = t′ ∩ I ′ = t′ ∩ (I ′new ∪R(Istay))

= (t′ ∩ I ′new) ∪ (t′ ∩R(Istay)).

We set

t′new := t′ ∩ I ′new and

t′old := t′ ∩R(Istay).

Finally we obtain the desired formula

t′ = t′new ∪ t′old.

Remark 4.32 The index sets t′new and t′old do not necessarily correspond to clusterst′new and t′old, they are just subsets of I ′.

4.2.5. Update of the Admissibility Tree

Before we consider the update of the block cluster tree, we shall shortly discuss theupdate of the admissibility tree. Let TI be the cluster tree, constructed by box treeclustering, and let ATI

be the corresponding admissibility tree. Further, let TI′ be theupdate of the cluster tree TI . The admissibility tree ATI′ can be obtained as an updateof the admissibility tree ATI

by performing the steps of the following algorithm.

Algorithm 4.33 (Update of the Admissibility Tree)

INPUT The cluster tree TI′, the box tree TBΓ, the admissibility tree ATI

.

DO Construct the admissibility tree ATI′ that has identical structure as the cluster treeTI′ and label the vertices according to the following rule:

1. If t′ ∈ TI′ is the update of a cluster from TI or it is identical to a cluster fromTI , then At′ := At.

2. If t′ ∈ TI′ is a new cluster, then At′ := Cext′ .

OUTPUT The admissibility tree ATI′ .

Remark 4.34 At′ = At = Cext but Cex

t 6= Cext′ holds for updated clusters.

83


4.3. Update of the Block Cluster Tree

The update of the block cluster tree is, similar to its construction, completely based onthe update of the cluster tree. The initial data are the cluster tree TI′ , update of thecluster tree TI and the corresponding admissibility tree ATI′ , update of ATI

. Further,there is TI×I the block cluster tree based on TI . The aim is to construct the blockcluster tree TI′×I′ .One way, that we shall not follow, is the direct construction from the cluster tree TI′

applying the Algorithm 2.40 without using the block cluster tree TI×I . The other wayis to update the block cluster tree following the idea of the indirect clustering.The algorithm for the update of the block cluster tree will not be as complex as thealgorithm for updating the cluster tree, that involves the construction of several trees andintroducing certain operations with cluster trees. We know that in the updated clustertree all old and updated clusters have identical admissibility boxes as the correspondingclusters in the initial trees. Therefore, for all block clusters that involve old and updatedclusters, the admissibility function takes the same value. The last observation will beprecisely formulated in the following lemmas.

Lemma 4.35 Let t, s ∈ TI be two clusters such that (t, s) ∈ TI×I is an admissible leaf.Let t′, s′ ∈ TI′ be two clusters that are updates of the clusters t, s respectively. Then theblock cluster (t′, s′) is also an admissible leaf.

Proof: If (t, s) ∈ TI×I is an admissible leaf that means that (F(t),F(s)) is not anadmissible leaf. If At and As are the admissibility boxes for clusters t and s then, since(t, s) is an admissible leaf, the admissibility function takes the value “admissible”. Thedefinition of the updated clusters gives us that At = At′ and As = As′ . Thereforethe admissibility function takes the value “admissible” for (t′, s′). The block cluster(F(t′),F(s′)) can not be admissible because (F(t),F(s)) is not admissible. Thereforethe block cluster (t′, s′) is also an admissible leaf.

Lemma 4.36 Let (t, s) ∈ TI×I be an inadmissible leaf. Let t′, s′ ∈ TI′ be two clustersthat are updates of the clusters t, s respectively. If sons(t′) = ∅ or sons(s′) = ∅ holds then(t′, s′) is also an inadmissible leaf.

Proof: The block cluster (t, s) is inadmissible that means t, s ∈ L(TI) and the admis-sibility condition (2.3) is not satisfied. We assume that sons(t′) = ∅ or sons(s′) = ∅.Since, t′, s′ are updates of t, s there holds At = At′ and As = As′ . Therefore the admis-sibility condition for (t′, s′) is also not satisfied. That means the block cluster (t′, s′) isan inadmissible leaf.Starting with the block cluster tree T ′

I′×I′ and the updated cluster tree TI′ we can con-struct the block cluster tree TI′×I′ . Similar like in Algorithm 2.40 we shall also use theadmissibility tree ATI′ and corresponding admissibility function.

Algorithm 4.37 (Updated Block Cluster Tree)

INPUT The updated cluster tree TI′, the block cluster tree TI×I

84


DO

1. root(TI′×I′) := I ′ × I ′.

2. Construct TI′×I′ recursively: let (t′, s′) ∈ TI′×I′ and (t, s) the correspondingblock cluster from TI×I exists.

a) If (t, s) is admissible then (t′, s′) is admissible too, and (t′, s′) ∈ L+(TI′×I′).In this case (t′, s′) is an update of (t, s).

b) If (t, s) is inadmissible and #t′ ≤ nmin or #s′ ≤ nmin then is (t′, s′)also inadmissible and (t′, s′) ∈ L−(TI′×I′). In this case (t′, s′) is anupdate of (t, s).

c) If (t, s) is inadmissible and #t′ > nmin and #s′ > nmin then

sons((t′, s′)) := (t′′, s′′) | t′′ ∈ sons(t′), s′′ ∈ sons(s′)

by recursion if # sons((t, s)) 6= 0. If # sons((t, s)) = 0 then there exists nocorresponding old block cluster and the constructed block cluster is new.

OUTPUT The block cluster tree TI′×I′.

This indirect way of constructing the block cluster tree TI′×I′ motivates to define iden-tical, updated and new block clusters with respect to the block cluster tree TI×I .

Definition 4.38 (Identical, Updated and New Block Cluster) Let (t′, s′) be a blockcluster from TI′×I′ that is updated from TI×I.

• If there exists the block cluster (t, s) ∈ TI×I and if the cluster t′ is identical to thecluster t and s′ is identical to the cluster s then the block cluster (t′, s′) is identicalto the block cluster (t, s).

• If there exists the block cluster (t, s) ∈ TI×I and if

the cluster t′ is an update of the cluster t and s′ is identical to s

or the cluster t′ is identical to t and s′ is an update of the cluster s

or the cluster t′ is an update of the cluster t and s′ is an update of the cluster s

and if ((t, s)) = Adm((t′, s′)) then the block cluster (t′, s′) is update of the blockcluster (t, s).

• A block cluster (t′, s′) ∈ TI′×I′ is new if it is neither identical nor an update.

Again, similar like with the cluster tree, there is an open question whether the indirectlyconstructed block cluster tree coincides with the directly constructed one. The answerto this question is given in the following theorem.

Theorem 4.39 Let TI′ be the updated cluster tree from TI . Further let TI×I be theblock cluster tree based on TI. If T up

I′×I′ is the block cluster tree based on TI′ obtainedby Construction 4.37 while T new

I′×I′ is the block cluster tree based on TI′ constructed byAlgorithm 2.40, then T up

I′×I′ and T newI′×I′ have identical structure.

85


Proof: Both block cluster trees T upI′×I′ and T new

I′×I′ are based on the same cluster treeTI′ . For checking the admissibility condition we use the same admissibility functionand admissibility tree ATI′ . The admissibility function takes the same value for blockclusters constructed directly and for the block clusters from T up

I′×I′. The last statementis clear from Construction 4.37. If the block cluster (t′, s′) ∈ T up

I′×I′ is not constructeddirectly then it is, as updated or identical block cluster, copied from the block clustertree TI×I .

Example 4.40 In the aim to present the update of the block cluster tree we considerExample 2.41 and Example 4.4. Figure 4.8 shows all elements needed for the constructionof the block cluster tree and the updated block cluster tree. The grid τ is locally refinedand as a result we obtain the grid τ ′. The basis functions for both discretisation schemesare chosen to be piece-wise constant and we index with I = 0, . . . , 7, resp. withI ′ = 0, . . . , 9. The cluster tree TI is obtained by box tree clustering while the clustertree TI′ is constructed as an update of TI. We number both cluster trees and constructthe block cluster tree TI′×I′ by Construction 4.37.

86


0

1 2

3 4 65

7 8 9 10 11

12 13 14

15 16

1

3 4

7 8 9

13 14

10 12

16

17 18 19

20 21

0

2

5 6

11

15

12

3

45

7

6

0 01

8

9

5

4

3

2

76

4,5,3 7,6,8,2,1,0,9

4,5 3 7,6,8 2,1,0,9

4 5 7,6 8

7 6

2,1 0,9

2,1

2 1

0,9

0,9

0 9

4,5,3,7,6,8,2,1,0,93,4,2,6,5,7,1,0

3,4,2 6,5,7,1,0

23,4 6,5,7 1,0

3 4 6,5 7 1,0

1,0

6 5

1 0

T T ′

TI TI′

local refinement

update

box tree clustering

numberingnumbering

old cluster

new cluster

updated cluster

colour legend

block cluster tree construction

87


0

1 2

3 4 6

7 8 9

12 13 14

15 16

1110

5

0

1

2

3

6

912

15

1614 11

10

13

8

7

4

5

0

1

2

3

4

5

6

8

7

9

12

11

10

13

14

16

1517

18

1921

20

0

1 2

4 5 6

87 9 10 11 12

13 14 1615

17 18 19

20 21

3

update

block cluster tree construction

old inadmissible block cluster

old admissible block cluster

new inadmissible block cluster

new admissible block cluster

updated admissible block cluster

block cluster colour legend

Figure 4.8.: The whole process of constructing the update of the block cluster tree in-cluding the update of the cluster tree and local refinement of the grid.

88

5. Update of Hierarchical Matrices

In this chapter we will present the last step of the update algorithm: the update ofhierarchical matrices. This includes:

• update of admissible and inadmissible matrix blocks for H-matrices, and

• update of clusterbases and uniform matrices for H2-matrices.

Concerning the H-matrices, we have introduced several methods for assembling theadmissible matrix blocks in Chapter 3. Therefore, we will also consider the update ofadmissible blocks for each method separately. The other sections of this chapter areorganised in the following way: in the first section we will describe the general updatealgorithm. It will be followed by the update of admissible and inadmissible blocks. Onesection will be devoted to the update of admissible blocks in the case of piecewise linearbasis functions. The last section contains the update of H2-matrices.

5.1. From an Updated Block Cluster Tree to the Update of an

H-Matrices

At the beginning of this section we recall the problem setting we have. We discretisethe integral operator with n basis functions and obtain a system of equations, such thatthe n× n matrix G of the system is densely populated. In order to avoid computationswith a full matrix we approximate G by the H-matrix G.

The matrix G is based on the block cluster tree TI×I that is constructed using thecluster tree TI , and an appropriate admissibility condition. The cluster tree TI is basedon the index set I, and constructed by the box tree clustering method. The index setI, #I = n enumerates the basis functions used for the discretisation, and indexes thegrid T , that contains the supports of the basis functions. An example for indexing canbe found in (1.13).

The grid T ′ is obtained by local refinement of the grid T (Algorithm 3.6) and will bereferred to as the “new” one. Correspondingly to the new grid we introduce the new setof basis functions that is indexed by I ′, #I ′ = n′. Discretising the integral operator withnew basis functions will lead again to a densely populated matrix. As in the previouscase we will avoid treating the densely populated matrices by applying the H-matrixtechnique. This time, instead of constructing a completely new H-matrix G′ we updatethe H-matrix G.

89


Independently of the construction, the matrix G′ is based on the block cluster treeTI′×I′ which is based on the cluster tree TI′ and on the admissibility condition. Thecluster tree TI′ , based on the index set I ′, and the block cluster tree TI′×I′ are obtainedas updates of the cluster tree TI (Algorithm 4.5) and block cluster tree TI×I (Algorithm4.37). The update algorithms for the cluster tree and block cluster tree are presented inthe previous chapter. We focus now on the update of the admissible and inadmissiblematrix blocks defining the appropriate update algorithm.

Algorithm 5.1 (Update of Admissible and Inadmissible Matrix Blocks)

INPUT The block cluster trees TI×I and TI′×I′, the H-matrix G.

DO

COPY If (t, s) and (t′, s′) are identical (Definition 4.38) then the (inadmissible

or admissible) matrix block G′|t′×s′ can be obtained as a copy of the matrix

block G|t×s.

copy

t

s

t’

s’

Figure 5.1.: Update of the H-matrix by copying a matrix block.

UPDATE If (t′, s′) is the update of (t, s) (Definition 4.38) then the matrix block

G′|t′×s′ will be updated from G|t×s.

update

t

s

t’

s’

Figure 5.2.: Update of the H-matrix by updating of a matrix block.

90

5.2. Update of Low-rank Blocks

RECOMPUTING If (t′, s′) is a new block cluster then the matrix block G|t′×s′ hasto be reassembled.

t’

s’

Figure 5.3.: Update of the H-matrix by assembling a new matrix block.

OUTPUT The H-matrix G′ ∈ H(TI′×I′, k′).


In Section 3.3 we have considered three algorithms for assembling the low-rank blocks:interpolation, adaptive cross approximation (ACA) and hybrid cross approximation(HCA). Similarly like in the Chapter 3 we will consider the update in the case of thesingle-layer potential operator. Although, in the update algorithms we do not specifythe type of basis functions, in the separate Section 5.4, we shall discuss the update oflow-rank blocks if the discretisation involves piecewise linear functions.

5.2.1. Update of Low-rank Blocks by Interpolation

Let (t, s) ∈ TI×I be an admissible block cluster. Further, we assume that diam(At) ≤diam(As). Let G|t×s be an admissible matrix block whose entries were assembled usingan interpolation scheme. The interpolation scheme defines the factorisation of the matrixblock G|t×s as follows:

G|t×s = ABT , A ∈ R#t×k, B ∈ R#s×k,

where the entries of the factors A and B, for the SLP kernel function, are

Aiν :=

∫

Γϕi(x)Lt

ν(x)dΓx,

Bjν :=

∫

Γϕj(y)g(x

tν , y)dΓy.

91


Let us recall that the rank k of the matrix block is fixed and uniquely determined ask := (m+ 1)d, where m is the degree of interpolation. Further, let (t′, s′) be an updateof the block cluster (t, s). Since (t, s) is an admissible leaf, we have that (t′, s′) is an

admissible leaf (Lemma 4.35), and therefore the matrix block G′|t′×s′ is admissible. For

the matrix block G′|t′×s′ we seek a low-rank approximation of the form

G′|t′×s′ = A′B′T , A′ ∈ R#t′×k, B ∈ R#s′×k.

The entries for the matrices A′ and B′, will be computed indirectly, using the fact thatt′ (s′) contains new and old indices. This indirect method of constructing the low-rankapproximation using the already existing low-rank approximation is called update. Theupdate algorithm contains three steps.

Algorithm 5.2 (Update of low-rank blocks computed by interpolation)

INPUT Clusters t, s, t′, s′, the matrix block G|t×s.

DO

COPY If the index i belongs to t′old (j ∈ s′old) then we obtain the entry A′iν (B′

jν)by copying the appropriate entry from A (B) for all ν = 1, . . . , k:

A′iν = AR−1(i)ν , ν = 1, . . . , k,

B′jν = BR−1(j)ν , ν = 1, . . . , k.

Since the rank k is fixed, all entries AR−1(i)ν , BR−1(j)ν , ν = 1, . . . , k, can becopied, i.e., the whole row can be copied. This case is illustrated in Figure5.4.

BT T

A

A’

Bcopyi

i’

Figure 5.4.: The index i is unchanged, and thus all entries from the row AR−1(i)∗ will becopied into the row A′

i∗.

RECOMPUTE If i ∈ t′new (j ∈ s′new) then the entry has to be computed directlyby:

A′iν =

∫

Γϕi(x)L

tν(x) dΓx, ν = 1, . . . , k,

B′jν =

∫

Γϕj(y)g(x

tν , y) dΓx, ν = 1, . . . , k.

92


All entries of the rows Aiν , Bjν, ν = 1, . . . , k, have to be reassembled. Inthis case we cannot reuse the existing low-rank approximation since the basisfunction corresponding to i (j) is new. Figure 5.5 illustrates this case.

A’

BT

A

TBinew

insert

Figure 5.5.: The index i is new and thus the whole row Ai∗ has to be recomputed.

REMOVE If the index i belongs to t and R(i) = −1 (j ∈ s and R(j) = −1) thenthere is nothing to be computed. This is the case when i (j) corresponds to abasis function which was refined. The row Ai∗(Bj∗) will be removed from thematrix A(B). Figure 5.6 represents this case.

BA’

T TB

Aiout remove

Figure 5.6.: The index i is associated to a triangle (basis function) that was refined, andtherefore the whole row Ai∗ is removed.

OUTPUT The matrix block G′|t′×s′.

Remark 5.3 The previous algorithm defines updates in the general case, assuming thatboth clusters t′ and s′ are updated. This algorithm can be also applied in the followingtwo cases.

• In the case that (t′, s′) is an admissible leaf and an update of (t, s) but t′ is identicalto t, then all entries of A will be copied to A′, i.e., there holds A′ = A.

• In the case that (t′, s′) is an admissible leaf and an update of (t, s) but s′ is identicalto s, then all entries of B will be copied to B′, i.e., there holds B′ = B.

Theorem 5.4 Let (t, s) be an admissible block cluster. The corresponding admissiblematrix block G|t×s has a low-rank approximation of the form G|t×s = ABT obtained

93


by interpolation. Further, let (t′, s′) be an update of (t, s). By G′up|t′×s′ we denote the

matrix block obtained via an update of the matrix G|t×s. Let G′new|t′×s′ be the matrix

block obtained by interpolation (Subsection 3.3.1). Then there holds

A′up = A′

new,

B′up = B′

new,

G′up|t′×s′ = G′

new|t′×s′ ,

where G′new|t′×s′ = A′

newB′new

T and G′up|t′×s′ = A′

upB′up

T .

Proof: We shall prove that A′up = A′

new. The set of indices t′ is the disjoint union of

old and new indices t′ = t′new ∪ t′old (Lemma 4.31). The old indices correspond to theunchanged basis functions, while the new indices correspond to new basis functions. Inthe matrix A′

up all new entries are computed using an interpolation scheme. This means

(A′up)iν = (A′

new)iν , ∀i ∈ t′new, ν = 1, . . . , k. (5.1)

All entries (A′up)iν corresponding to the old indices are obtained as copies of the cor-

responding entries from the matrix A. Since these entries were computed using aninterpolation scheme, for basis functions ϕi which remained unchanged there holds:

(A′up)iν = (A′

new)iν , ∀i ∈ t′old, ν = 1, . . . , k. (5.2)

From (5.1) and (5.2) we have:

(A′up)iν = (A′

new)iν , ∀i ∈ t′old ∪ t′new = t′, ν = 1, . . . , k. (5.3)

In a similar way we can show that B′ = Bnew, which yields G′|t′×s′ = Gnew|t′×s′ .

Remark 5.5 (Complexity) The complexity of the update algorithm is O((#t′new +

#s′new)k).

Proof: We recall that the storage requirements for low-rank blocks are (#t′ +#s′)k. Inthe update procedure we recompute only entries corresponding to the new basis func-tions and therefore the complexity is (#t′new + #s′new)k.

5.2.2. Update of Low-rank Blocks by Adaptive Cross Approximation

Let (t, s) be an admissible block cluster in the sense of the min-admissibility conditionfor the boxes At and As. In the case that an admissible block G|t×s is assembled usingadaptive cross approximation (ACA) (Subsection 3.3.2), the factorisation we obtain isof the same form as in the previous case:

G|t×s = ABT , A ∈ R#t×k, B ∈ R#s×k.

94


The rank is not the same for all admissible matrix blocks in the H-matrix as it was inthe case of the interpolation scheme, but it is determined separately for each admissiblematrix block and it depends on the choice of the pivot elements.From the construction of the approximation for the block (t, s) we have used the set ofpivot pairs

PACA := (i∗q , j∗q )kq=1. (5.4)

The cardinality of the set PACA is k, the rank of the low-rank approximation obtained byACA. Among all pivot pairs, only the first pivot j∗1 is chosen arbitrarily, while all otherpivots depend on the previous one. The Figure 5.7 presents the pivoting. Let t′, s′ ∈ TI′

i∗1i∗1 i∗1

j∗1j∗1j∗1

i∗2

j∗2j∗2

Figure 5.7.: The pivot process starts with an arbitrarily chosen index j∗1 . Then we choosethe index i∗1 as argmaxi∈t′ |Gij∗1

|. This is the way how we chose the first pivotpair (i∗1, j

∗1 ). We continue the procedure choosing j∗2 as argmaxj∈s′ |Gi∗1j|.

be updates for clusters t, s ∈ TI , respectively. Since (t, s) is an admissible leaf, (t′, s′)

is an admissible leaf, too (Lemma 4.35). The matrix block G′|t′×s′ is admissible and

we will compute its entries indirectly using the matrix block G|t×s. Compared to theinterpolation scheme, the update procedure for ACA is not straight-forward. It is not

enough to have an index, e.g., i ∈ t′old and then to copy the entry, as it was the case in theinterpolation scheme. Here each Aiν or Biν entry depends on the particular index i (j)and on all pivot pairs (i∗q , j

∗q ), q = 1, . . . , ν. Therefore, the set PACA will be considered

for the update beside the clusters t′, s′, and used as presented in the following algorithm:

Algorithm 5.6 (Update of low-rank blocks computed by ACA)

INPUT The clusters t, s ∈ TI , t′, s′ ∈ TI′, the set of pivot pairs PACA, the ACA-

factorisation G|t×s =∑k

ν=1 aν(bν)T .

ASSUMPTION We assume that the first p, p ≤ k pairs can be reused, i.e., R(i∗p) ∈ t′

and R(j∗p) ∈ s′. Additionally we assume δ = maxi∈t′ |G′iR(j∗p ) −

∑νµ=1 aiµbµR(j∗p)|.

START ν = 1, . . . , k′

IF ν ≤ p THEN we perform the UPDATE

95


COPY In the case i ∈ t′old (j ∈ s′old) we copy the entry aR−1(i)ν(bR−1(j)ν) intoaiν(bjν):

a′iν = aR−1(i)ν ,

b′jν = bR−1(j)ν .

RECOMPUTE In the case i ∈ t′new (j ∈ s′new) we compute the entry directly

a′iν = G′iR(j∗ν ) −

ν−1∑

µ=1

(a′µ)i(b′µ)R(j∗ν ),

b′jν =1

δ

(G′

R(i∗ν )j −ν−1∑

µ=1

(a′µ)R(i∗ν )(b′µ)j

).

STOP IF ‖G′|t′×s′ −∑ν

µ=1 a′µ(b′µ)T ‖ ≤ ǫ‖G′|t′×s′‖.

IF ‖G′|t′×s′ −∑ν

µ=1 a′µ(b′µ)T ‖ > ǫ‖G′|t′×s′‖ AND ν = p THEN continue the construc-

tion of the low-rank approximation as in Algorithm 3.14.

OUTPUT The factorisation G′|t×s = ABT where A ∈ R#t′×k′contains columns (a′i)

k′

i=1

and B ∈ R#s×k′contains columns (b′i)

k′

i=1.

Figure 5.8 shows how the update for low-rank blocks looks like in the case that ACAwas used.

copied entries

new entries

a

b1

a’

b’

1

1

1B’

A’

T

A

BT

Figure 5.8.: The pivot pair (i∗1, j∗1) can be reused. The column vector a′1(b

′1) in the

matrix A′(B′) is an update of a1(b1) from A(B). Graphically the green colorin the vector a′1(b

′1) represents copied entries from a1(b1) that correspond

to unchanged indices. The red color represents all entries that had to becomputed directly, since they correspond to the new indices.

Remark 5.7 In the case R(i∗1) /∈ t′ or R(j∗1) /∈ s′, it is not possible to use the previousalgorithm. In this case, although (t′, s′) is an updated cluster, the low-rank approximationof the block G′|t′×s′ will be done by Algorithm 3.14.

96


Remark 5.8 The previous algorithm is developed for the optimal case, taking into ac-count that the inherited pivot elements PACA maximise the chosen column or row. Inthe case that the pivot pair can be reused but it does not maximise the column (row),then there are two possibilities:

(a) we can keep the pivot index but in this case the accuracy ǫ might not be achieved,or

(b) we can change the pivot pair (and all others that follow) by finding the maximum ofthe row or the column; this process will lead to the higher complexity of the updatealgorithm.

Theorem 5.9 Let (t, s) be an admissible block cluster and G|t×s the corresponding ma-trix block, whose low-rank approximation is computed by ACA. Further, let (t′, s′) be an

update of (t, s). By G′up|t′×s′ we denote the matrix block obtained through ACA update

of the matrix G|t×s. Let G′new|t′×s′ be the matrix block assembled by the (original) ACA

algorithm. We assume that the pivot elements used in the ACA algorithm for computing

G′new|t′×s′ and in update algorithm for G′

up|t′×s′, are identical. Then, there holds:

G′up|t′×s′ = G′

new|t′×s′ .

Proof: The block cluster (t′, s′) is admissible as an update of an admissible block cluster

(Lemma 4.35). Therefore the matrix blocks G′up|t′×s′ and G′

new|t′×s′ are admissible aswell.The low-rank approximation of the matrix block G|t×s is provided by ACA, and it defines

the factorisation of the form G|t×s =∑k

i=1 ai(bi)T and the set of pivot pairs PACA.

The first pivot element in the ACA algorithm is chosen arbitrary, and therefore it can beany. For the updated block the choice of the (first) pivot elements is determined by the

set PACA. Since we have chosen the same first pivot element for matrix blocks G′up|t′×s′

and G′new|t′×s′ , the rank of the low-rank approximation is the same, k′, and the sets of

the pivot elements P upACA, Pnew

ACA are identical.The next step is to prove that

aupν = anew

ν ,

bupν = bnew

ν ,

where G′new|t′×s′ =

∑k′

ν=1 anewν bnew

νT and G′

up|t′×s′ =∑k′

ν=1 aupν bup

νT.

This can be proven by induction. It is obvious that (aupi1 ) = anew

i1 since both vectors areobtained directly from the matrix block G′|t′×s′ . If we assume that for all i = 1, . . . , k′1holds (aup

i ) = (anewi ) then the induction step i→ i+ 1 is straight forward because from

the construction of the vectors and the induction assumption we obtain (aupi+1) = (anew

i+1 ).

97


Remark 5.10 (Complexity) The complexity of the update algorithm in the case that

ACA is used is O((#t′new + #s′new)p2 + (#t′ + #s′)(k′ − p)2), where p is the number ofpivot pairs from PACA that were reused for the update, and k′ is the rank of the updatedmatrix block.

Proof: The complexity of the ACA algorithm for assembling the matrix block G′|t′×s′

is O((#t′ + #s′)k′2). Let p ≤ k′ be a number of pivot pairs that can be reused in theupdate algorithm. In this case only the last k′−p vectors will be computed by the (orig-inal) ACA algorithm with costs O((#t′ +#s′)(k′−p)2). In the first p vectors which willbe updated, only the entries corresponding to the new indices will be computed directlywith costs O((#t′new + #s′new)p2). Therefore the total costs for the update algorithm

are of the order O((#t′new + #s′new)p2 + (#t′ + #s′)(k′2 − p2)).

5.2.3. Update of Low-rank Blocks by Hybrid Cross Approximation

The hybrid cross approximation (HCA(I) and HCA(II)) algorithms for constructing alow-rank approximation are a combination of the interpolation scheme and ACA. There-fore, the update algorithm for this scheme will be based on the updates introduced inthe previous subsections. In both cases we have the same starting point. The leaf(t, s) ∈ TI×I is admissible (in the sense of max-admissibility (2.4) which is satisfied forboxes At and As) and it determines the admissible matrix block G|t×s.

HCA(I)

The expansion used for obtaining the low-rank form G|t×s = ABT is based on inter-polation in both directions such that we obtain the intermediate form

G(1)|t×s = U tSt,s(V s)T .

Since this approximation does not have the desired form and the rank is too large, weapply the ACA scheme to the M ×M matrix St,s obtaining the low-rank factorisationSt,s ≈ ABT , A ∈ RM×k, B ∈ RM×k, k ≤M . This yields

G|t×s = U tABT (V s)T = ABT , A = U tA, B = V sB. (5.5)

Let us assume that (t′, s′) is the update of an admissible block cluster (t, s). In order to

obtain the low-rank approximation of the matrix block G′|t′×s′ of the form

G′|t′×s′ = A′B′T

with A′ = U t′A′, B = V s′B′,

where A′ ∈ R#t′×k, B′ ∈ R#s′×k,

we can try to reuse the matrices A and B defined in equation (5.5). In order to obtainthe matrices A′ and B′ indirectly we look closer at the structure of the matrices A and B

98


which involves the matrices A, U t and B, V s. The matrix St,s does not (directly) dependon the block cluster (t, s) but only on the boxes At, As. Since t′, s′ are updated clusterswe have that At = At′ and As = As′ hold. Therefore, the ACA approximation of thematrix St,s = St′,s′ = ABT remains the same, i.e. A = A′ and B = B′. The matricesU t′ and V s′ will be updated using the update algorithm for interpolation presented inSection 5.2.1.

Algorithm 5.11 (Update Algorithm for HCA(I))

INPUT The clusters t, s ∈ TI and t′, s′ ∈ TI′, the matrix block G|t×s = ABT .

UPDATE will be performed first for the matrices U t′ , V s′ as in the Algorithm 5.2: theentry will be copied if the index is old, or it will be assembled newly if the index isa new one:

U t′

iν =

U t

R−1(i)ν i ∈ t′old,∫Γ ϕi(x)L

t′ν (x) dΓx i ∈ t′new,

V s′jν =

V s

R−1(j)ν j ∈ s′old,∫Γ ϕj(y)L

s′ν (y) dΓy j ∈ s′new.

Having updated the matrices U t′ and V s′ we can go one step further and updatethe matrices A′ and B′.Similarly like in the previous case we copy an entry if the index is old, or in thecase that the index is a new one we perform the matrix-vector multiplication usingthe updated matrices U t′ and V s′ and the ACA-factorisation of St′,s′ ≈ ABT :

A′iν =

AR−1(i)ν i ∈ t′old,

(U t′i A)ν i ∈ t′new,

B′jν =

BR−1(j)ν j ∈ s′old,

(V s′j B)ν j ∈ s′new.

Since the old entries in A′ (B′) are obtained as copies of the corresponding entriesfrom A (B), the old entries for matrices U t′ (V s′) need not to be computed at all.

OUTPUT The HCA(I) factorisation A′B′T of G′|t′×s′.

Remark 5.12 (Storage) In order to perform the update for the HCA(I) algorithm, weneed to store the matrices A and B. The storage requirements are of the order O(Mk).

Remark 5.13 (Complexity) The complexity of the update algorithm for HCA(I) is

O((#t′new + #s′new)Mk).

Proof: The complexity of the HCA(I) algorithm for an admissible matrix block G′|t′×s′

is O((#t′+#s′)Mk). Since only new entries will be computed by the original algorithm,

the complexity for update algorithm is O((#t′new + #s′new)Mk).

99


#t#s

#t′ #s′

MMM

MMM

k

k

k

k

U t

(V s)T

U t′

(V s′)TA

ABT

BT

update

Figure 5.9.: Illustration of the HCA(I) update algorithm: the matrices A and B remainunchanged. The matrices U t′ and V s′ are obtained as updates of U and V .The pink stripes denote the new entries that correspond to the new basisfunctions.

Theorem 5.14 Let (t, s) be an admissible block cluster and G|t×s the correspondingmatrix block, whose low-rank approximation is computed by HCA(I). Further, let (t′, s′)

be an update of (t, s). By G′up|t′×s′ we denote the matrix block obtained as HCA(I)-update

of the matrix G|t×s. Let G′new|t′×s′ be the matrix block assembled by the original HCA(I)

algorithm. Then there holds G′new|t′×s′ = G′

up|t′×s′.

Proof: The block cluster (t′, s′) is an admissible leaf (Lemma 4.35). The low-rank

matrix blocks G′new|t′×s′ and G′

up|t′×s′ are represented in the form:

G′new|t′×s′ = A′

newB′new

T,

G′up|t′×s′ = A′

upB′up

T.

We need to prove that A′new = A′

up and B′new = B′

up. A′new and A′

up are represented as

A′new = U t′

newA′new,

A′up = U t′

upA′up.

The matrices U t′new and U t′

up are identical according to Lemma (5.4). If we assume that

for the ACA approximation of the matrix St′,s′ , we fix the first pivot element then the

100


matrices A′new and A′

up are identical. Therefore we conclude that A′new = A′

up, yielding

G′new|t′×s′ = G′

up|t′×s′ .

HCA(II)

Similar to the previous case we have that a low-rank approximation of an admissiblematrix block G|t×s is of the form ABT . The HCA(II) algorithm defines these matricesas

A := UCT , B := V DT .

The entries for the matrices U , V are defined in (3.22) and in Algorithm 3.3.3.If we now assume that the clusters t′, s′ ∈ TI′ are updates of the clusters t, s ∈ TI ,such that the pair (t′, s′) is a leaf, then the corresponding matrix block G′|t′×s′ is also

admissible. To compute the factorisation of the matrix block G′|t′×s′ we use the already

assembled matrices A and B from the factorisation of the matrix block G|t×s. The aimis to obtain the factorisation of the form :

G′|t′×s′ = A′B′T ,

with A′ = U ′C ′T , B = V ′D′T ,

where A′ ∈ R#t′×k, B′ ∈ R#s′×k.

First we can determine which part of the factorisation can be completely copied, if thereis any at all.The matrices C and D are coefficient matrices whose entries are computed by evaluatingthe kernel function at the points whose coordinates are determined by the pivot pairsused in the ACA factorisation of the matrix St,s. In order to explain this in more detailwe start with the ACA approximation of the matrix St,s, which depends only on thebounding boxes At, As. From this approximation we save the set of pivot elements PSt,s

ACA.This set will be used for computing the entries of the matrices C and D. Since the matrixSt,s is identical to the matrix St′,s′ due to the fact that At = At′ and As = As′ , the ACAapproximation and the set of pivot pairs will not be changed. Therefore, the matrices Cand D can be inherited from the matrix block G|t×s and there holds C = C ′ and D = D′.The update algorithm for HCA(II) is similar to the update algorithm for HCA(I), andit is based on the update Algorithm 5.2.

Algorithm 5.15 (Update Algorithm for HCA(II))

INPUT The clusters t, s ∈ TI , t′, s′ ∈ TI′, the HCA(II) factorisation of the matrix block

G|t×s = ABT = (UCT )(V DT )T .

UPDATE The matrices U and V are directly dependent on the clusters t and s. Hence,we will for the construction of the matrices U ′ and V ′ reuse all indices from t′

and s′ which are copied from t and s. The entries of the matrices U ′ and V ′ are

101


computed for l = 1, . . . , k in the following way:

U ′il =

UR−1(i)l i ∈ t′old,∫Ωt′

ϕi(x)Dxγ(x, yjl) dx i ∈ t′new,

V ′jl =

VR−1(j)l j ∈ s′old,∫Ωs′

ϕj(y)Dyγ(xil , y) dy j ∈ s′new.

The second step will be computing the entries of the matrices A′ and B′. For allν = 1, . . . , k we have:

A′iν =

AR−1(i)ν i ∈ t′old,

(U ′iC

T )ν i ∈ t′new,

B′jν =

BR−1(j)ν j ∈ s′old,

(V ′jD

T )ν j ∈ s′new.

All new entries in the matrix A′ (B′) are obtained via matrix-vector multiplicationof the matrix C (D) by the updated row U ′

i (V ′j ). Since the old entries in A′ (B′)

are obtained as copies of the corresponding entries from A (B), the old entries formatrices U ′ (V ′) need not to be computed at all.

OUTPUT HCA(II) factorisation of the matrix block G′|t′×s′ = A′B′T = (U ′CT )(V ′DT )T .

Remark 5.16 (Storage) In order to compute the entries in the matrices A′, B′, we

need to store the vectors U ′i , ∀i ∈ t′new and V ′

j , ∀j ∈ s′new. The amount of memory we

require is of the order O(#t′newk + #s′newk). The matrices C and D will be stored aswell, and the storage requirements are of the order O(k2).

Remark 5.17 (Complexity) The complexity of the update algorithm for HCA(II) is

O((#t′new + #s′new)k2).

Proof: The complexity of the HCA(II) algorithm for admissible matrix block G′|t′×s′ is

O((#t′ + #s′)k2). In the update algorithm we apply the original algorithm in order tocompute the matrix entries corresponding to the new indices. Therefore the complexityof the update algorithm is O((#t′new + #s′new)k2).

Theorem 5.18 Let (t, s) be an admissible block cluster and G|t×s the correspondingmatrix block, whose low-rank approximation is computed by HCA(II). Further, let (t′, s′)

be an update of (t, s). By G′up|t′×s′ we denote the matrix block obtained as HCA(II)

update of the matrix G|t×s. Let G′new|t′×s′ be the matrix block assembled by the original

HCA(II) algorithm. If we assume that the ACA approximation of the coupling matrix

St′,s′ is fixed, then, there holds G′new|t′×s′ = G′

up|t′×s′.

102


#s#t

#s′#t′

k

k

k

k

k

k

k k

kk

k

k

CT

CT

D

D

U

V T

U ′

V ′T

update

Figure 5.10.: Illustration of HCA(II) update algorithm: the matrices C and D remainunchanged. The matrices U ′ and V ′ are obtained as update of the matricesU and V . The pink stripes denote the entries that correspond to the newindices. Since the old entries in A′(B′) will be directly copied form A(B),the matrices U ′ and V ′ will not be fully constructed.

Proof: The block cluster (t′, s′) is an admissible leaf (Lemma 4.35). The low-rank

matrix blocks G′new|t′×s′ and G′

up|t′×s′ are represented in the standard form:

G′new|t′×s′ = A′

newB′new

T,

G′up|t′×s′ = A′

upB′up

T.

We need to prove that A′new = A′

up and B′new = B′

up. In the HCA(II) algorithm, thematrices A′

new and A′up have the form

A′new = U t′

newC′new

T,

A′up = U t′

upC′up

T.

The assumption that the ACA approximation of the coupling matrix St′,s′ is fixed, gives

that the set of pivots PSt′,s′

ACA is fixed, as well. The last observation leads to the conclusionthat C ′

up = C ′new, resp. D′

up = D′new.

103


The matrices U t′new and U t′

up are identical due to the fact that the set PSt′,s′

ACA is fixed and

according to Lemma (5.4). Therefore we have G′new|t′×s′ = G′

up|t′×s′ .

5.3. Update of Full Matrix Blocks

Let us assume that (t, s) is an inadmissible leaf and that G|t×s is an inadmissible matrixblock represented in the full matrix format. The size of the matrix is of the orderO(n2

min), i.e., it is a small matrix whose entries are computed directly, using appropriatequadrature techniques (e.g. [17]):

(G|t×s)ij =

∫

Γ

∫


=

∫

Ωi

∫

Ωj

ϕi(x)g(x, y)ϕj(y)dxdy.

Let us assume that (t′, s′) is an update of (t, s). Further, let us assume that sons(t′) = ∅or sons(s′) = ∅. The last assumption implies that (t′, s′) is a leaf too, i.e., that G′|t′×s′ isinadmissible. The entries of this matrix block are computed indirectly using the fact thatboth clusters t′ and s′ are updated. The algorithm for this update of a full matrix ispresented below.

Algorithm 5.19 (Update algorithm for full matrices)

INPUT The clusters t, s ∈ TI, t′, s′ ∈ TI′, the matrix block G|t×s.

UPDATE contains two parts.

COPY In the case that i ∈ t′old and j ∈ s′old, the entry G′ij will be copied from

G|t×s:

(G′|t′×s′)ij := (G|t×s)R−1(i)R−1(j)

This case is presented in Figure 5.11.

i

j

R−1(i)

R−1(j)

copy

Figure 5.11.: Update of full matrices: The indices i and j correspond to the unchangedbasis functions and therefore the entry (G|t×s)R−1(i)R−1(j) is copied, with

possibly different position, into the matrix block G′|t′×s′ .

104

5.3. Update of Full Matrix Blocks

RECOMPUTE If i ∈ t′new or j ∈ s′new then we compute the entry directly.

OUTPUT The full matrix block G′|t′×s′.

Remark 5.20 (Complexity) The complexity of the update algorithm for one inadmis-

sible block in full matrix representation is O(#t′new#s′ + #s′new#t′).

Proof: Let i′ ∈ s′ be a new index. Then all entries in the matrix block G′|t′×s′ involvingthe index i′ must be computed by the original algorithm. Therefore the total costs forcomputing all new entries are of the order O(#t′news

′ + #s′newt′).

Theorem 5.21 Let (t, s) ∈ TI×I be an inadmissible leaf and G|t×s a full matrix block.

Further, let (t′, s′) ∈ TI′×I′ be an update of (t, s). Then the full matrix G′up|t′×s′ obtained

by update coincides with the full matrix G′new obtained by the direct method.

Proof: Our aim is to show that (G′up)ij = (G′

new)ij , for all i ∈ t′, j ∈ s′. We split the

sets of indices t′ and s′ in the disjoint unions of subsets t′ = t′new ∪ t′old, s′ = s′new ∪ s′old.

If i ∈ t′new, j ∈ s′new or i ∈ t′old, j ∈ s′new or i ∈ t′new, j ∈ s′old, then it holds

(G′up)ij = (G′

new)ij (5.6)

because those entries were computed in the same way by direct application of quadra-

ture. The entries (G′new)ij for i ∈ t′old, j ∈ s′old were copies of the entries from the old

matrix G but they also coincide with the entries (G′new)ij. This claim is based on the fact

that the basis functions and quadrature methods are in both cases the same. Therefore,

there holds (G′up)ij = (G′

new)ij for all i ∈ t′, j ∈ s′.

5.3.1. Extended Update

Algorithm 5.1 considers the update of matrix blocks only for updated block clusters. Forall new block clusters (t′, s′) the corresponding block matrices G|t′×s′ are constructed bythe original algorithms.Still, there are certain cases where it is also possible to assemble the new matrix block byan update procedure. We consider the situation presented in Figure 5.12. The clusterst, s ∈ TI are leaves and we assume that the block cluster (t, s) is inadmissible and definesthe full matrix block G|t×s. The clusters t′, s′ from an updated cluster tree TI′ are up-dates from t, s respectively, but the block cluster (t′, s′) is not an update of (t, s) becauset′, s′ /∈ L(TI′). If t′1, t

′2 ∈ sons(t′) and s′1, s

′2 ∈ sons(s′) and t′1, t

′2, s

′1, s

′2 ∈ L(TI′) we

can assume that the block clusters (t′1, s′1), (t

′1, s

′2), (t

′2, s

′1), (t

′2, s

′2) are inadmissible. Our

idea is to assemble the corresponding full matrix blocks G′|t′1×s′1

, G′|t′1×s′2

, G′|t′2×s′1

and

G′|t′2×s′2

updating the full matrix block G|t×s. This kind of update we name extended

update and we apply it only for certain full matrix blocks. The precise algorithm ispresented below.

105


TI TI′

t s

t1 s1t2 s2

t′ s′

update

Figure 5.12.: This is an example where due to the update some leaves become largerthan nmin. The clusters t, s ∈ TI are leaves. Their updates t′, s′ ∈ TI′ arenot leaves of TI′ .

Algorithm 5.22 (Extended update algorithm)

INPUT The clusters t′ν , s′µ, the inadmissible matrix block G|t×s

IF i ∈ (t′ν)old and j ∈ (s′µ)old then (G′|t′ν×s′µ)ij = (G|t×s)R−1(i)R−1(j).

ELSE the entry (G′t′ν×s′µ

)ij will be computed by the original algorithm.

OUTPUT The full matrix block G′t′ν×s′µ

.

t

s t′

s′

s′µ

t′ν

Figure 5.13.: The matrix block G|t×s can be used for assembling the block G′|t′ν×s′µal-

though the block cluster (t′ν , s′µ) is not the update of (t, s).

Theorem 5.23 Let G ∈ H(TI′×I′ , k) be an H-matrix. Further, let TI′×I′ be a blockcluster tree based on the cluster tree TI′, which is obtained by updating the cluster tree TI .With Gnew ∈ H(TI′×I′ , k) we denote the H-matrix whose low-rank block are assembled

by the same method as for the matrix G. G′ ∈ H(TI′×I′ , k) is the H-matrix obtained as

update of the matrix G. Then the matrices G′ and Gnew are identical.

106

5.4. Update of H-Matrices in the case of Piecewise Linear Functions

Proof: The matrices G′ and Gnew do have the same block structure since they are basedon the same block cluster tree TI′×I′ . To prove that these matrices are identical we needto prove that they are blockwise identical.All old and new blocks from G′ are identical to the corresponding blocks in Gnew becausethey are computed identically. All updated blocks are identical to the correspondingblocks in the new matrix as it was proven in Theorem 5.4 for admissible blocks and inTheorem 5.21 for inadmissible blocks.

Remark 5.24 The previous theorem will also hold if the extended update is applied.

5.4. Update of H-Matrices in the case of Piecewise LinearFunctions

In this section we shall present the update algorithm for H-matrices in the case thatthe discretisation space is spanned by piecewise linear functions. This choice impliessome changes in the definition of the index set, and construction of the cluster tree. Fora given grid T , the index set J is identified with the set of the vertices by mappingPJ : J → P(T ), PJ (j) = xj (the set J will be also identified with the set of basisfunctions). The cluster tree TJ is obtained by box tree clustering whereby the verticesare clustered. If T ′ is a grid obtained through local refinement of the grid T , we denoteby J ′ the index set corresponding to all vertices in the new grid. The cluster tree TJ ′

is obtained as an update of the cluster tree TJ in the way presented in the previouschapter. The block cluster tree TJ ′×J ′ is also obtained as update of TJ×J .

In the aim to obtain the H-matrix G′ based on TJ ′×J ′ we update G, an H-matrix basedon the block cluster tree TJ×J . The low-rank blocks of the matrix G are assembled usingthe interpolation scheme for a fixed number of interpolation points in each direction. Thefull matrix blocks are computed using quadrature rules for singular integrals.

5.4.1. Update of Low-rank Blocks

Let (t, s) ∈ TJ×J be an admissible leaf, and we assume that diam(At) ≤ diam(As). Theadmissible matrix block G|t×s is represented in the Rk-matrix form G|t×s = ABT , A ∈R#t×k, B ∈ R#s×k, where the rank k is fixed and uniquely determined as (m+ 1)d, form+ 1 interpolation points in each direction. We define the following sets:

ST (ϕi) := l | τl ⊆ suppϕi,ST := l | τl ∈ T .

107


The entries of the matrices A and B (for SLP operator) are computed in the followingway:

Aiν =∑

l∈ST (ϕi)

∫

τl

Ltν(x)dΓx =

∑

l∈ST (ϕi)

al

Bjν =∑

l∈ST (ϕi)

∫

τl

g(xtν , y)dΓy =

∑

l∈ST (ϕi)

bl

where al =

∫

τl

Ltν(x)dΓx

and bl =

∫

τl

g(xtν , y)dΓy.

Let (t′, s′) ∈ TJ ′×J ′ be the update of (t, s). The block cluster is also admissible (Lemma4.35) and we seek the factorisation of the form A′B′T , A′ ∈ R#t′×k, B′ ∈ R#s′×k for the

matrix block G′|t′×s′ . This factorisation is computed indirectly.

Algorithm 5.25 (Update Algorithm for Low-Rank Matrices)

INPUT t, s ∈ TI, t′, s′ ∈ TI′, the matrix block G|t×s.

DO

COPY If i′ ∈ t′ and ϕi = ϕi′ for i = R−1(i′) ∈ t then

Ai′ν = Aiν ∀ν = 1, . . . , k,

i.e., the whole row Ai∗ can be copied. Similar for the matrix B′. If j′ ∈ s′

and ϕj = ϕj′ for j = R−1(j′) ∈ s then

Bj′ν = Bjν ∀ν = 1, . . . , k,

i.e., the whole row Bj∗ can be copied.

UPDATE If i′ ∈ t′ and ST (ϕi)∩ST 6= ∅ then there exists i ∈ t such that ST (ϕi)∩ST ′(ϕi′) 6= ∅. Then we can compute the entries Aiν ∀ν = 1, . . . , k as:

Ai′ν =∑

l /∈ST ′(ϕi′ )

∫

τl

Ltν(x)dΓx +

∑

l∈§T (ϕi)

al.

Similarly: If j′ ∈ s′ and ST (ϕj) ∩ ST 6= ∅ then there exists j ∈ s such thatST (ϕj) ∩ ST ′(ϕj′) 6= ∅. Then we can compute the entries Bjν ∀ν = 1, . . . , kas:

Bj′ν =∑

l /∈ST ′(ϕj′

∫

τl

g(xtν , y)dΓy +

∑

l∈ST (ϕj)|

bl.

108

5.5. Update of H2-Matrices

RECOMPUTE In the case that i ∈ t′new or (and) j ∈ s′new we compute the entriesfor A and B from scratch.

OUTPUT The matrix block G′|t′×s′.

We can notice that the “update” part of the algorithm is a combination of copied addendsand newly computed elements.

Lemma 5.26 Let (t′, s′) ∈ TJ ′×J ′ be an admissible leaf updated from (t, s) ∈ TJ×J .

The low rank approximation of the matrix block G′|t′×s′ is obtained as update of the low

rank approximation of the block G|t×s. If Gnew|t′×s′ is the matrix block whose low-rank

approximation was obtained directly, then there holds G′|t′×s′ = Gnew|t′×s′.

5.4.2. Update of Full Matrix Blocks

If (t, s) ∈ TJ×J is an inadmissible leaf block then its corresponding matrix block G|t×s

is represented in the full matrix format. Let (t′, s′) ∈ TJ ′×J ′ be a leaf block that is

an update of (t, s). The matrix block G′|t′×s′ is also inadmissible, and its full matrix

representation will be computed using the full matrix G|t×s. The update algorithm hastwo steps:

Algorithm 5.27 (Update of Full Matrix Blocks)

COPY If i′ ∈ t′old and j′ ∈ s′old then the entry will be copied.

UPDATE If i′ ∈ t′new or j′ ∈ s′new then the entry will be computed directly in thefollowing way:

(G′|t′×s′)ij =∑

l∈ST ′(ϕi′ )

∑

h∈ST ′ (ϕj′ )

∫

τl

∫

τh

g(x, y)dΓxdΓy.

Lemma 5.28 Let (t′, s′) ∈ TJ ′×J ′ be an inadmissible leaf updated from (t, s) ∈ TJ×J .

The full matrix block G′|t′×s′ is obtained as update of the full matrix block G|t×s. If

Gnew|t′×s′ is the full matrix block whose entries are obtained directly, then there holds

G′|t′×s′ = Gnew|t′×s′.


In this section we present an update algorithm for H2-matrices. Although the ideabehind H-matrices and H2-matrices is similar, the construction is different. While H-matrices are based on the cluster tree and block cluster tree, the construction of H2-matrices relies additionally on cluster bases. Therefore, the update of H2-matrices startswith updates of cluster bases. It will be followed by the update of uniform matrices thatare used for the approximation of admissible matrix blocks.

109


5.5.1. Update of the Cluster Bases

Let TI be a cluster tree based on the index set I, and TI′ be the cluster tree TI′ basedon the index set I ′ and obtained as an update of TI . Let VTI

be the set of all clustersfrom TI and VTI′ be the set of all clusters from TI′ . Let (V t)t∈V (TI) be the cluster basis

for TI . We define cluster bases (V t′)t′∈TI′ by the following algorithm.

Algorithm 5.29 (Update of the Cluster Basis)

INPUT The cluster tree TI′, the cluster basis (V t)t∈V (TI).

COPY If t′ ∈ TI′ is a cluster identical to the cluster t ∈ TI then V t′ := V t.

UPDATE If t′ ∈ TI′ is an update of the cluster t ∈ TI the corresponding matrix V t canbe updated in the aim to obtain the matrix V t′ . This can be done in the followingway:

1. If i ∈ t′old then we set:

V t′iν := V t

R−1(i)ν ∀ν = 1, . . . , k,

2. If i ∈ t′new then we compute:

V t′

iν :=

∫

ΓLt′

ν (x)ϕi(x)dΓx ∀ν = 1, . . . , k.

RECOMPUTE If t′ ∈ TI′ is a new cluster we compute the matrix V t′ from scratch.

OUTPUT The cluster basis (V t′)t′∈TI′ .

The matrices V t are computed only for leaf clusters while the matrices V t for non-leafclusters are obtained by using the transformation matrices. Therefore, we would like toknow how to update the transfer matrices. The following lemma gives an answer:

Lemma 5.30 Let t ∈ TI be a non-leaf cluster and let t′ ∈ TI′ be also a non-leaf clusterwhich is an update of the cluster t. Then, there holds T t,t1 = T t′,t′1 and T t,t2 = T t′,t′2

where sons(t) = t1, t2 and sons(t′) = t′1, t′2.

Proof: The entries of the matrix T t,t1 are defined as T t,t1ν,ν1 = Lt

ν(xt1ν1

), i.e., they areobtained as evaluation of Lagrange polynomial at transformed Chebyshev points. Wenotice that the entries do not depend on the clusters t and t1, but only on the corre-sponding admissibility boxes At and At1 .Since the cluster t′ is an update of the cluster t, there holds At = At′ .From the previous observations and Algorithm 5.29 we conclude:

1. If t ∈ L(TI) and t′ ∈ L(TI′) and t and t′ are identical clusters then V t = V t′ holds.

2. If t ∈ L(TI) and t′ ∈ L(TI′) and t′ is an update of t then V t′ can be obtained asan update of the matrix V t.

110


3. If t ∈ L(TI) and sons(t′) 6= ∅ then we construct the new transformation matricescorresponding to the edges between t′ and sons(t′) and if sons(t′) ∈ L(TI′) thenwe construct the matrices V for all son clusters. If sons(t′) /∈ L(TI′) we continuewith assembling the transformation matrices down to the leaves.

This construction process is illustrated in Figure 5.14.

t

V−matrices

t

V−matrix

New V−matrices

New transfer matrices

is discarded

’

t′t′

t′′t′′

t′1t′1t′2 t′′1t′′1 t′′2 t′′2

t′2new

Figure 5.14.: Cluster tree with transfer matrices and V -matrices before and afterchanges.

Theorem 5.31 Let TI′ be the updated cluster tree and (V t′up)t′∈VT

I′the cluster basis

obtained as update of (V t)t∈VTI. If the cluster basis (V t′

new)t′∈VTI′

is constructed directly,

then it is identical to the updated cluster basis.

Proof: If t′ ∈ TI′ is a new cluster then it is obvious that V t′up = V t′

new, for they werecomputed in the same way. The same holds for the clusters t′ ∈ TI′ which are identicalto clusters t ∈ TI . If the cluster t′ ∈ TI′ is the update of some cluster t ∈ TI , the equalityV t′

up = V t′new also holds, as it was proven in Theorem 5.4.

5.5.2. Update of Uniform Matrices

In the definition of uniform matrices also a coupling matrix appears beside the clusterbasis structures which we denoted by St,s for clusters t and s. If the clusters t′, s′ ∈ TI′

are updates of the clusters t, s, it would be useful to know how to update the matrix St,s

in order to obtain the matrix St′,s′ .

Lemma 5.32 Let t, s ∈ TI be two clusters such that (t, s) is an admissible leaf. Further,let t′, s′ ∈ TI′ be two clusters that are updates of the clusters t and s. The pair (t′, s′) isalso an admissible leaf and the coupling matrices St,s and St′,s′ are identical.

111


Proof: The statement that (t′, s′) is an admissible leaf can be proven directly by Lemma4.35. The coupling matrix is assembled by evaluation of the kernel function at trans-formed Chebyshev points. Since the clusters t′, s′ are updates of the clusters t, s thismeans At = At′ and As = As′ . The last observation is equivalent to the statement thatthe transformation of the Chebyshev points remains the same for clusters t and t′, s ands′. Therefore the entries of the matrices St,s and St′,s′ are equal.

Remark 5.33 Let (t, s) ∈ TI×I be an admissible block cluster and St,s the correspondingcoupling matrix. Further, let (t′, s′) ∈ TI′×I′ be an admissible block cluster. Directly fromthe previous lemma we can conclude:

1. If t′ ∈ TI′ is an update of t ∈ TI , and s′ ∈ TI′ is identical to s ∈ TI thenSt,s = St′,s′ holds as well.

2. If t′ ∈ TI′ is identical to t ∈ TI , and s′ ∈ TI′ is is update s ∈ TI then St,s = St′,s′

holds as well.

3. If t′ ∈ TI′ is identical to t ∈ TI, and s′ ∈ TI′ is identical s ∈ TI then St,s = St′,s′

holds as well.

4. If t′ ∈ TI′ is a new cluster and s′ ∈ TI′ an arbitrary cluster (either identical tosome cluster from TI or update of some cluster from TI or a new cluster) then thecoupling matrix St′,s′ has to be assembled.

Having updated the cluster basis and the coupling matrix, we can perform the updateof uniform matrices.

Algorithm 5.34 (Update of Uniform Matrices)

INPUT The cluster basis (V t′)t′∈TI′ , the clusters t, s ∈ TI and t′, s′ ∈ TI′, the uniform

matrix representation of the block G|t×s = V tSt,s(W s)T .

DO

COPY If (t′, s′) is an identical block cluster to (t, s) then G′|t′×s′ := G|t×s.

UPDATE If (t′, s′) is an update of (t, s) then G′|t′×s′ := V t′St,s(W s′)T , where

V t′ ,W s′ are updates of V t,W s.

RECOMPUTE If (t′, s′) is a new block cluster then G′|t′×s′ := V t′St′,s′(W s′)T .

OUTPUT The uniform matrix G′|t′×s′ = V t′St′,s′(W s′)T .

The next important point is the property of nested cluster bases.

Theorem 5.35 If the cluster tree TI′ is an update of the cluster tree TI , and the clusterbasis (V t′

up)t′∈VTI′

is an update of the nested cluster basis (V tt∈VTI

), then the cluster basis

(V t′up)t′∈VT

I′is also nested.

112


Proof: If (V t′new)t′∈VT

I′is the cluster basis constructed directly then it is nested. Since,

there holds V t′new = V t′

up for all t′ ∈ V (TI′) (Theorem 5.31), it implies that (V t′up)t′∈TI′ is

also nested.

Theorem 5.36 Let (t′, s′) ∈ L(TI′×I′) be an admissible updated block cluster of (t, s) ∈L(TI×I). Further, let the matrix block G′|t×s be represented in the uniform matrixformat V t′St′,s′W t′, obtained as update of the uniform matrix V tSt,sW t. If the uniform

matrix V t′newS

t′,s′newW t′

new was constructed directly, then it is identical to the updated uniformmatrix.

Proof: To prove that the matrices St′,s′ and St′,s′new are identical, we can use the same

argumentation as in the proof of lemma 5.32. The matrices V t′ (W s′) are identical to thematrices V t′

new (W s′new) as it was proven in Lemma 5.4. Therefore the directly computed

uniform matrix is identical to the updated one.Finally, we define the update algorithm for H2-matrices which is similar to the updatealgorithm for H-matrices. We assume that the cluster tree TI′ is an update of the clustertree TI , that the block cluster tree TI′×I′ is an update of the block cluster tree TI×I andthat the cluster basis (V t′)t′∈VT

I′((W s′)s′∈VT

I′) is an update of (V t)t∈VTI

((W s)s∈VTI).

Algorithm 5.37 (Update of H2-Matrices)

INPUT The block cluster trees TI×I, TI′×I′, the cluster bases (V t)t∈, the H2-matrix G.

DO 1. Update cluster basis (Algorithm 5.29).

2. Update uniform matrix blocks (Algorithm 5.34)

3. Update full matrix blocks by Algorithm 5.19 or apply the extended update usingAlgorithm 5.22.

OUTPUT The H2-matrix G′.

Theorem 5.38 Let G be an H2-matrix based on the block cluster tree TI×I and thecluster basis (V t

t∈VTI). Let TI′×I′ and (V t′)t′∈VT

I′be the updated block cluster tree and

updated cluster basis, respectively. The H2-matrix G′up is obtained by the update al-

gorithm based on TI′×I′ and (V t′)t′∈VTI′

. If Gnew is the H2-matrix obtained by direct

computations, also based on the updated block cluster tree and updated cluster basis, it isidentical to G′.

Proof: The structure of the H2-matrices is identical since both are based on the sameblock cluster trees. Therefore we have to prove blockwise, i.e. for each (t, s) ∈ L(TI′×I′)that the matrices are identical.If (t′, s′) is either identical to (t, s) or a new leaf block cluster, it is clear that the

corresponding matrix block Gnew|t′×s′ will be identical to G′|t′×s′ . If the leaf blockcluster (t′, s′) was updated then if it was an inadmissible leaf and the full matrix blocks

113


are identical as it was proven in Theorem 5.21. If it is an admissible leaf then the uniformmatrices are also identical as it was proven in Theorem 5.36.Moreover, there holds even a stronger statement.

Remark 5.39 If the H2-matrix Gnew is based on the block cluster tree TI′×I′ which isconstructed from a directly computed cluster tree TI′, then the statement of the previoustheorem also holds, i.e. the updated H2 matrix is identical to the directly computed matrixGnew.

Proof: The proof is based on the previous theorem and on Theorem 4.39 that provesthat directly computed cluster tree and updated one (both based on the same index set)are identical.

5.6. Costs

In this section we shall briefly discuss the costs of the update procedure. At the beginningwe recall Lemma 3.9 that contains the estimates of the storage requirement for an H-matrix G ∈ H(TI×J , k). If G′ ∈ H(TI′×J ′ , k) is the update of G′ then the estimate ofthe storage requirements remains the same: if TI′×J ′ is an update of TI×J with sparsity

constant C ′sp and depth p, then the storage of G′ can be estimated by

NH,St(TI′×J ′ , k) ≤ C ′sp maxk, nmin(p′ + 1)(#I ′ + #J ′).

Our aim is to estimate the costs of the update Nupdate. For this purpose we investigatethe structure of L(TI′×J ′). Beside the standard decomposition of the set of leaves, wedecompose the set L(TI′×J ′) in the following way:

L(TI′×J ′) = Lup(TI′×J ′) ∪ Lold(TI′×J ′) ∪ Lnew(TI′×J ′).

where

Lup(TI′×J ′) is the set of the updated leaf blocks,

Lold(TI′×J ′) is the set of the identical leaf blocks and

Lnew(TI′×J ′) is the set of the new leaf blocks.

Lemma 5.40 Let TI′ (TJ ′) be an update of TI (TJ ) and let TI′×J ′, based on TI′ andTJ ′ , be the update of TI×J with sparsity constant C ′

sp and depth p′. Let k ∈ N0. Thenthe costs of the update can be estimated by

Nupdate(TI′×J ′ , k) ≤ C ′sp maxk, nmin(p′ + 1)(#I ′

new + #J ′new),

where the sets I ′new and J ′

new contains the indices corresponding to the new basis func-tions.

114

5.6. Costs

Proof: The proof is similar to the proof of Lemma 3.9. We assume also that thecomputation of the matrix blocks G′|t′×s′ , (t

′, s′) ∈ Lold(TI′×J ′) costs nothing, becausethe corresponding matrix block will be just copied.

Nupdate(TI′×J ′ , k) =∑

(t′,s′)∈L+new(TI′×J ′ )

k(#t′ + #s′) +∑

(t′,s′)∈L−new(TI′×J ′ )

#t′ · #s′

+∑

(t′,s′)∈L+up(TI′×J ′)

k(#t′ + #s′) +∑

(t′,s′)∈L−up(TI′×J ′)

#t′ · #s′

≤∑

(t′,s′)∈L+up∪L

−up∪L

+new∪L−

new

maxk, nmin(#t′new + #s′new)

≤ C ′sp maxk, nmin(p′ + 1)(#I ′

new + #J ′new).

115


116

6. Applications

6.1. Applications in Boundary Element Methods

In this chapter we will present the numerical results, which on the one hand, validate thetheoretical results from the previous chapters, and on the other hand show the efficiencyof the update method.At the beginning we will introduce the model problem which is used for testing theupdate. Then we will present some technical tools, that will be used for graphicalrepresentations of the numerical results. The rest of the chapter is organised as follows:since we have introduced three different methods for assembling and for updating thelow-rank matrices there will be for each of those methods a separate section for presentingthe numerical results. One section will contain the numerical results for H2-matrices.A further section will contain the explanation of the error estimator we used and thecorresponding numerical tests.

6.1.1. Green’s Representation Formula

If Ω is a normal domain (cf. [38]) then a solution of the Laplace equation satisfies Green’srepresentation formula

u(x) =

∫

Γ

g(x, y)

∂

∂nu(y) − u(y)

∂

∂nyg(x, y)

dΓy, x ∈ Ω. (6.1)

Inserting the fundamental solution (1.5) in (6.1) we obtain

u(x) =1

4π

∫

Γ

∂nu(y)

‖y − x‖dΓy +1

4π

∫

Γ

〈n(y), y − x〉‖y − x‖3

u(y)dΓy, x ∈ Ω, (6.2)

where the first integral makes use of the Neumann boundary condition (1.4). From (6.2)we have that on the boundary the integral equation

1

2u(x) = V[∂un] −K[u](x), x ∈ Γ, (6.3)

holds. From equation (6.3) we can find the Dirichlet (Neumann) dates. Equation (6.3)will serve as the model problem, which can be rewritten as

G[u](x) = f(x), x ∈ Γ := ∂Ω,

where Γ is (for example) either the surface of the cube Ω := [−1, 1]3 or the surface ofthe unit ball Ω := (x, y, z) ∈ R3 | x2 + y2 + z2 ≤ 1, i.e., the sphere.

117

6. Applications

G is the double layer potential operator added by one half of the identity operator

G[u](x) :=1

2u(x) + K[u](x)

=1

2u(x) +

1

4π

∫

Γ

〈n(y), x− y〉u(x)‖x− y‖3

dΓy,

where u are the Dirichlet data of the harmonic function in the domain Ω. The right-handside f is given by the single-layer potential operator V applied to the Neumann dataf := V[∂nu],

V[u](x) :=1

4π

∫

Γ

u(y)

‖x− y‖dΓy.

We test our method for the harmonic function

u(x) :=1

‖x− y0‖,

where y0 /∈ Γ is a chosen fixed point.If T is a triangulation of Γ then the discretisation error will be computed in every trianglefrom T . In the adaptive scheme we refine p% of the grid where the discretisation erroris the largest, such that roughly 2p% of the degrees of freedom on the refined grid arenew.There are also some technical details we would like to point out. The numerical resultspresented in the following tables are produced on a Sun UltraSparc III with 900 MHzCPU clock rate and 150 MHz memory clock rate. The graphical representation of theH- and H2-matrices, presented in Figure 3.2 will be extended for presenting the updatedmatrices. For this purpose we introduce the “colour legend” in Figure 6.1. The items of

old fullmatrix

new fullmatrix

updated fullmatrix

old rkmatrix

new rkmatrix

updated rkmatrix(uniformmatrix)

old uniformmatrix

new uniformmatrix

Figure 6.1.: Colours used for representing H-matrix

the tables used to present our numerical results have the following meaning:

n1 is the number of degrees of freedom we start with.

n2 is the number of degrees of freedom we obtain after refinement.

118


new is the percentage of new elements in the discretisation scheme with n2 degrees offreedom.

update is the time in seconds needed to update the matrix, i.e., the time needed toobtain, by update, the n2 × n2 H-matrix from the n1 × n1 H-matrix.

reassembly is the time in seconds needed to assemble the n2 × n2 H-matrix directly.

savings is the percentage of time we saved by updating the matrix instead of assemblingit directly.

costs is the percentage of time we need to update the matrix compared to reassembly.

The refinement scheme we apply is a bisection, and we start the numerical tests forn1 ∈ 49152, 786432 degrees of freedom for the cube, or n1 ∈ 32768, 524288 for thesphere. We refine p ∈ 1%, 5%, 25% of the grid which has the largest discretisationerror obtaining approximately 2%, 10%, 50% new degrees of freedom.

6.1.2. Numerical Results for H-Matrices: Interpolation

This subsection contains the numerical results for the update of H-matrices, whose low-rank matrix blocks are assembled using interpolation. Before we present the results forthe update algorithm we test the accuracy of the interpolation method. The accuracyof the interpolation is measured for n1 = 49152 degrees of freedom in the following way:we compute the relative error between the coarser H-matrix Gc

SLP , whose interpolation

order is pc, and the finer H-matrix GfSLP whose interpolation order is pf . The following

table contains the results of the previously described numerical test. We notice that theaccuracy of the method becomes better with increasing the order of the interpolation.

pc = 1,pf = 2 pc = 2,pf = 3 pc = 3,pf = 4 pc = 4,pf = 5‖ eGf

SLP− eGcSLP ‖

‖ eGfslp

‖3 × 10−2 4.1 × 10−3 4.2 × 10−4 5.5 × 10−5

In the following tables we will present the numerical results computed for the followingset of parameters:

• The operator is the single-layer potential.

• The geometry is defined by the cube discretised either by 49152 or 786342 triangles,or the geometry is defined by the sphere discretised either by 32768 or 524288triangles.

• The admissibility condition is the min-admissibility condition.

• The interpolation order is p = 2.

• The quadrature order is fixed q = 2, since it does not influence the efficiency ofthe algorithm.

119

6. Applications

• The leafsize is nmin = 32.

Cube

In order to test the efficiency of the update algorithm we consider two problems; amiddle-sized one with n1 = 49152 degrees of freedom and a relatively big one withn1 = 786432 degrees of freedom.We refine 1%, 5%, 25% of the grid obtaining approximately 2%, 10%, 50% new elementsin the refined grid. We measure the time needed to update the H-matrix and the timeneeded to reassemble the matrix.

n1 = 49152 n2 =49682 n2 =51880 n2 =62544new 2.1% 10.5% 42.8%

update 6.05 31.8 121.78reassembly 169 209.6 252.7

savings 97 % 85% 52%costs 3% 15% 48%

n1 =786432 n2 =794594 n2 =827282 n2 =988798new 2.05% 9.88% 41%

update 106.55 535.65 1710.33reassembly 4192.29 4388.35 5016.01

savings 97.5 % 87.8% 65.9%costs 2.5% 12.2% 34.1%

Sphere

Similar tests are carried out in the case when the boundary Γ is the sphere. We considertwo problems: one for 32768 degrees of freedom and the other with 524288 degrees offreedom.

n1 = 32768 n2 =33158 n2 = 34622 n2 =41918new 2.3% 10.5% 42.3%

update 2.7 11.7 57.9reassembly 84.5 89 115.1

savings 96.8 % 86.9% 49.7%costs 3.2% 13.1% 50.3%

n1 =524288 n2 =529994 n2 =551956 n2 =661544new 2.1 % 9.9% 41.2%

update 51.5 234 1151.5reassembly 1926.8 2032.8 1744.7

savings 97.3 % 88.5% 66%costs 2.7% 11.5% 34%

Figures 6.2, 6.3 and 6.4 show the structure of the H-matrices after the update.

120


Figure 6.2.: The red marked triangles on the cube are those that will be refined (bybisection). The matrix on the right-hand side is the update of the H-matrixassembled for 3072 degrees of freedom. Since the refinement is local, thenew and updated blocks in the matrix are concentrated.

6.1.3. Numerical Results for H-Matrices using Adaptive CrossApproximation (ACA) for Computing Rk-Matrix Blocks

The low-rank blocks in an H-matrix obtained by adaptive cross approximation (ACA) arecomputed with a given accuracy ǫ. We compare the updated matrix with the originallyassembled one assuming that, for each admissible block, the pivot elements are identical.For practical purposes it would be too expensive to store all pivot elements for everyadmissible block. Therefore we test the accuracy of updated matrices in the followingway: we assemble the H-matrix with higher accuracy and compare it with the originallyassembled matrix on the one side and with the updated matrix on the other side.In the tables below the numerical results are presented for the following inputs:

• The operator is the single-layer potential.

• The boundary Γ is either the cube discretised with 12288 or 49152 triangles or thesphere discretised with 8192 or 32768 triangles.

• Gor is the original H-matrix assembled with accuracy parameter ǫ = 10−3,

• Gad is an updated H-matrix whose low-rank blocks were computed using the sameaccuracy parameter ǫ as Gor.

121

6. Applications

Figure 6.3.: The cube will be refined locally by bisection. This time we refine an areaaround an edge as the red marked triangles show. The matrix on the right-hand side is the update of the H-matrix assembled for 3072 degrees of free-dom.

• Gexact is an H matrix whose low-blocks were computed with higher accuracy pa-rameter ǫ1 than in Gad and Gor. Here we have chosen ǫ1 := 10−5.

Cube

n1 =12288 n2 =12422 n2 =13014 n2 = 15790new 2.2% 11.2% 44.4%


savings 94.4% 76.3% 25.7%costs 5.6% 23.7% 74.3%

‖Gad − Gexact‖2 1.52×10−5 1.72×10−5 1.63×10−5

‖Gor − Gexact‖2 1.52×10−5 1.78×10−5 1.59×10−5

122


Figure 6.4.: The cube will be refined locally by bisection. This time we refine an areain the middle of the surface as the red marked triangles show. The matrixon the right-hand side is the update of the H-matrix assembled for 3072degrees of freedom.

n1 =49152 n2 = 49682 n2 =51870 n2 =62494new 2.1% 10.5% 42.7%

update 17 94.5 390.21reassembly 427.6 451 573.6

savings 96% 79% 32%costs 4% 21% 68%

‖Gad − Gexact‖2 4.68×10−6 6.02×10−6 4.95×10−6

‖Gor − Gexact‖2 4.68×10−6 4.8×10−6 4.72×10−6

Sphere

n1 = 8192 n2 =8304 n2 = 8696 n2 = 10618new 2.6% 11.2% 44.5%

update 13.8 37.6 142reassembly 166.9 197.6 173

savings 91.7% 81% 20%costs 8.3% 19% 80%

‖Gup − Gexact‖2 2.9×10−7 2.5×10−7 4.67×10−7

‖Gor − Gexact‖2 2.9×10−7 2.7×10−7 4.72×10−7

123

6. Applications

n1 =32768 n2 = 33176 n2 =34690 n2 =42078new 2.4% 10.8% 43.5%


savings 95.75% 75.44% 20%costs 4.25% 14.56% 80%

‖Gup − Gexact‖2 8.7×10−7 7.13×10−8 5.2×10−8

‖Gor − Gexact‖2 7.2×10−7 1.02×10−7 5.62×10−8

6.2. Numerical Results for H2-Matrices

The numerical results for the update of H2-matrices are provided using the same schemeas for testing the update for H-matrices in the case that the low-rank approximation iscomputed by interpolation. Similar like for the H-matrices we compare the time neededfor the update of the H2-matrix with the time needed to assemble the H2-matrix usingthe original algorithm. The setup we use for numerical tests, is the standard one.

• For the boundary Γ we choose the cube or sphere that is triangulated into n1

panels.

• The order of quadrature is q = 2 and the order of interpolation is p = 2.

• The leafsize is taken as nmin = 32.

• The admissibility condition is the max-admissibility.

• We refine 1%,5%,25% of the grid obtaining approximatively 2%, 10%, 50% newelements in the refined grid.

• The operator is the double-layer potential, Kor is the corresponding H2-matrixapproximation obtained by the original H2-matrix algorithm and Kup is the cor-responding H2-matrix approximation obtained by update. The last row in eachtable shows the relative error for Kor and Kup computed in the supremum norm.

Cube

We consider two discretisations for the cube; a middle sized one with n1 = 49152 degreesof freedom, while the second with n1 = 786432 unknowns is a larger one.

n1 =49152 n2 =49660 n2 =51798 n2 =62874new 2.05% 10.21% 43.65%


savings 96.6 % 86.6% 46.6%costs 3.4% 13.4% 53.4%

‖ eKor− eKup‖

‖ eKor‖1.3 × 10−16 1.3 × 10−16 1.2 × 10−16

124

6.3. Numerical Results for Hybrid Cross Approximation

n1 =786432 n2 =794462 n2 = 826590 n2 =991242new 2.02% 9.7% 41.3%


savings 97.9 % 89% 61.8%costs 2.1% 11% 38.2%

‖ eKor− eKup‖

‖ eKor‖9.4 × 10−17 9.2 × 10−17 9.2 × 10−17

Sphere

Similarly like for the cube we consider two discretisations; a middle sized one with n1 =32768 unknowns, while the other contains half a million degrees of freedom: n1 = 524288.

n1 = 32768 n2 =33182 n2 =34652 n2 =42020new 2.4% 10.6 % 43.2%


savings 94% 84.2% 47.9%costs 6 % 15.8% 52.1%

‖ eKor− eKup‖

‖ eKor‖1.3 × 10−16 1.3 × 10−16 1.2 × 10−16

n1 =524288 n2 = 530340 n2 = 552048 n2 = 661544new 2.3% 10 % 41.2%


savings 95.6 % 87.9% 66%costs 4.4% 12.1% 34%

‖ eKor− eKup‖

‖ eKor‖3.5 × 10−16 3.4 × 10−16 3 × 10−16


Testing the update algorithm for HCA methods contains first a short explanation aboutthe choice of the parameters we use. Since the HCA methods are constructed on thebase of interpolation and ACA, the approximation error is influenced by the interpola-tion error on the one side and by the ACA-accuracy ǫ on the other side. Therefore it isimportant to choose the parameters for the HCA method properly. Our first numericaltests are done in order to choose an appropriate interpolation order p and ACA-accuracyǫ. The approximation error is, according to theoretical results, dominated by the interpo-lation error. Hence, in the numerical tests we fix the interpolation order p, and decreaseǫ as long as it influences the total approximation error.Inputs for the following numerical tests are:

• The operator is the single-layer potential,

125

6. Applications

• The corresponding H-matrix is denoted by GHCA1 or GHCA2 depending on themethod we test, while G is the directly computed densely populated matrix,

• The geometry is the surface of the cube refined into 3072 triangles,

• The leafsize is nmin = 32, order of the quadrature is q = 2, the admissibilitycondition is the max-admissibility.

HCA(I)

p=2 ǫ = 10−1 ǫ = 10−2 ǫ = 10−3 ǫ = 10−4 ǫ = 10−5

‖G−GHCA1‖‖G‖ 2.5 × 10−2 5.6 × 10−3 4.8 × 10−3 4.8 × 10−3 4.8 × 10−3

p=3 ǫ = 10−1 ǫ = 10−2 ǫ = 10−3 ǫ = 10−4 ǫ = 10−5

‖G−GHCA1‖‖G‖ 1.03 × 10−2 2.45 × 10−3 5.3 × 10−4 5.3 × 10−4 5.3 × 10−4

We conclude that for testing the update of HCA(I) we should choose the ACA-accuracyǫ := 10−p if p is the given interpolation order.

HCA(II)

p=2 ǫ = 10−1 ǫ = 10−2 ǫ = 10−3 ǫ = 10−4 ǫ = 10−5

‖G−GHCA2‖‖G‖ 2.3 × 10−2 1.9 × 10−3 4.9 × 10−4 3.9 × 10−4 4.8 × 10−4

p=3 ǫ = 10−1 ǫ = 10−2 ǫ = 10−3 ǫ = 10−4 ǫ = 10−5

‖G−GHCA2‖‖G‖ 1.02 × 10−2 1.9 × 10−3 7.5 × 10−5 1.1 × 10−5 6.2 × 10−5

For testing the update of HCA(II) we use ǫ = 10−p−1 for a given interpolation order p.

6.3.1. HCA(I)

In this subsection we present the numerical results for the update of H-matrices in thecase that low-rank blocks are computed using HCA(I). As the following tables show thesenumerical results are not as good as the ones obtained when only interpolation was used.The reason is in the expensive matrix-vector multiplication via the clusterbasis appliedfor computing the new entries in the low-rank blocks.The setup we use for the numerical tests is the same as for interpolation. Shortly wehave the following input data:

• the operator is the single-layer potential; the corresponding H-matrix assembledby the original algorithm is Gor while the updated matrix is denoted by Gup,

• the geometry is the cube discretised with n1 = 12288 or n1 = 49152 triangles,

• the quadrature order is q = 2, the leafsize is nmin = 32,

126


• we refine 1% of the grid where the discretisation error is the largest.

n1=12288 p = 2 p = 3 p = 4n2=12418, new 2.1% ǫ = 10−2 ǫ = 10−3 ǫ = 10−4


savings 92.4 % 79.8% 67.07%costs 7.6% 20.2 % 32.93 %

‖Gor − Gup‖2/‖Gor‖2 7.4 × 10−17 1.2 × 10−16 1.6 × 10−16

n1=49152 p = 2 p = 3 p = 4n2 =49664, new 2.05% ǫ = 10−2 ǫ = 10−3 ǫ = 10−4


savings 92.2 % 83.2% 58.5%costs 7.6% 20.2 % 41.5 %

‖Gor − Gup‖2/‖Gor‖2 7.3 × 10−17 1.1 × 10−16 1.2 × 10−16

These results confirm our predictions that the HCA(I) update algorithm is not as effi-cient as other update algorithms. Still, we can get considerable savings by our updatealgorithm compared to a complete reassembly.

6.3.2. Numerical Results for HCA(II)

In this subsection we present the results for the HCA(II) update algorithm. Comparingto all previously tested update algorithms, the HCA(II) update algorithm is the mostefficient one. This statement will be justified in the following tables. The inputs for thenumerical test are similar to the one we used to carry out the other update algorithms:

• The operator is the single-layer potential; Gor is the H-matrix for the SLP operatorassembled by the original HCA(II) algorithm while Gup denotes the H-matrix forthe SLP operator obtained by update.

• The geometry is either the cube with n1 = 49152 or n1 = 786432 panels, or thesphere triangulated with n1 = 32768 or n1 = 524288 triangles.

• The interpolation order is p, while the ACA-accuracy is ǫ = 10−p−1;

• The quadrature order is fixed q = 2, and the leafsize is nmin = 32.

Cube

n1 = 49152, refined 1% of the grid, new= 2.1%.

127

6. Applications

n2 = 49682 49682 49682 49682m = 2 m = 3 m = 4 m = 5ǫ = 10−3 ǫ = 10−4 ǫ = 10−5 ǫ = 10−6

update 2.88 4.74 7.2 14.93reassembly 124 288.94 486.2 1141.5

savings 97.7% 98.4% 98.5% 98.7%costs 2.3% 1.6% 1.5% 1.3%

‖gGor− gGup

‖gGor‖8.2 × 10−17 1.1 × 10−16 1.6 × 10−16 2.1 × 10−16


n2 = 51824 51818 51820 51820

m = 2 m = 3 m = 4 m = 5ǫ = 10−3 ǫ = 10−4 ǫ = 10−5 ǫ = 10−6

update 15.2 25.5 38.4 78.2reassembly 150.5 422.7 711.7 1228.7

savings 89.1% 94% 94.6% 94.6%costs 10.9% 6% 5.4% 4.6%

‖gGor− gGup‖

‖gGor‖9.3 × 10−17 1.2 × 10−16 1.4 × 10−16 1.7 × 10−16

n1 = 49152, refined 25% of the grid, new = 43.5%.

n2 = 62822 62800 62800

m = 2 m = 3 m = 4ǫ = 10−3 ǫ = 10−4 ǫ = 10−5


savings 63.1% 66% 69.6%costs 36.9% 34% 30.4%

‖gGor− gGup‖

‖gGor‖9.4 × 10−17 1.2 × 10−16 1.5 × 10−16


n2 = 794506 794448 79445

m = 1 m = 2 m = 3ǫ = 10−2 ǫ = 10−3 ǫ = 10−4


savings 97.7% 98% 98.25%costs 2.3% 2% 1.75%

‖gGor− gGup‖

‖gGor‖2.5 × 10−17 5.2 × 10−17 7.4 × 10−16

Sphere

128



n2 = 33178 33176 33176 33176

m = 2 m = 3 m = 4 m = 5ǫ = 10−3 ǫ = 10−4 ǫ = 10−5 ǫ = 10−6


savings 96.25% 98% 98.03% 98.61%costs 3.75% 2% 1.93% 1.39%

‖gGor− gGup‖

‖gGor‖1.2 × 10−16 1.6 × 10−16 1.9 × 10−16 1.9 × 10−16


n2 = 34680 34690 34690 34690

m = 2 m = 3 m = 4 m = 5ǫ = 10−3 ǫ = 10−4 ǫ = 10−5 ǫ = 10−6


savings 88.7% 92.04% 92.69% 94 %costs 11.3% 7.96% 7.31% 6%

‖gGor− gGup‖

‖gGor‖1.1 × 10−16 1.6 × 10−16 1.9 × 10−16 2.3 × 10−16


n2 = 42082 42076 42076 42076

m = 2 m = 3 m = 4 m = 5ǫ = 10−3 ǫ = 10−4 ǫ = 10−5 ǫ = 10−6

update 77 152 187 379.5reassembly 217.3 466.3 739 1372.6

savings 64.6% 67.5% 74.5% 72.4 %costs 35.6% 32.5% 25.5% 27.6%

‖gGor− gGup‖

‖gGor‖1.1 × 10−16 1.1 × 10−16 1.9 × 10−16 2.8 × 10−16


n2 = 540182 530230 530138

m = 1 m = 2 m = 3ǫ = 10−2 ǫ = 10−3 ǫ = 10−4


savings 97.1% 97.6% 98.03%costs 2.9% 2.4% 1.97%

‖gGor− gGup‖

‖gGor‖3.4 × 10−17 7.5 × 10−17 1.1 × 10−16

129

6. Applications

6.4. Numerical Results for Non-local Refinements

In the previous numerical tests we have assumed that the grid T is refined locally. Asa result, especially in the case when p < 10%, we obtain that the H-matrix is updatedor recomputed only for a small number of blocks. The following table illustrates thisobservation; clearly most of the blocks were simply copied and only a very small numberof blocks was recomputed or updated. In this example we started with 3072 degrees offreedom and we refined 1% of the grid.

old rkmatrix 96.25% old full 95.84%adapt rkmatrix 3.19% adapt full 0%new rkmatrix 0.56% new full 4.16%

In the case that the refinement is done globally, we have in general the situation thatarbitrarily many blocks are recomputed or updated. Now we consider the same examplewith 3072 degrees of freedom where the underlying grid is refined globally. As the follow-ing table shows there will be definitely more blocks to update or recompute comparedto the local case.

old rkmatrix 37% old fullmatrix 53.74%adapt rkmatrix 61.42% adapt fullmatrix 39.53%new rkmatrix 1.58% new fullmatrix 6.73%

Figure 6.4 shows the difference between local and global update. Since the global refine-

Figure 6.5.: The matrix left is an update of the H-matrix in the case that 1% of the gridis refined locally. The matrix right is an update of the H-matrix in the casethat 1% of the grid is refined globaly. In both cases the initial discretisationcontains 3072 degrees of freedom.

ment demands more update and recomputing, it is expected that the update results willnot be as good as in the case of the local refinement. The following tables will justify

130

6.5. Error Estimators

this observation. For the setup we use the same one as for the interpolation scheme(H-matrix for SLP operator, nmin = 32, p = 2, q = 2) but we start with less degrees offreedom (n1 = 3072, n1 = 12288).

n1 =3072 n2 =3132 n2 =3360 n2 =4284new 3.8% 17.1% 56.6 %


savings 91.25% 70% 25%costs 8.75% 30% 75 %

n1 =12288 n2 =12526 n2 =13444 n2 =17128new 3.8% 17.2% 56.6 %


savings 87.6% 66.3% 22.6%costs 12.4% 33.6% 77.4 %

Still, if we use the HCA(II) algorithm for assembling the original H-matrix and forupdate consecutively, the time for the update is closer to the optimum.

n1 =3072 n2 =3132 n2 =3132 n2 =3132 n2=3132new 3.8% 3.8% 3.8% 3.8%

m = 2 m = 3 m = 4 m = 5ǫ = 10−3 ǫ = 10−4 ǫ = 10−5 ǫ = 10−6


savings 88.1% 90.7% 92.75% 95.6%costs 11.9% 9.3% 7.25% 4.4%


In the previous sections we have simply refined those elements whose local discretisationerror was largest. In order to test the update algorithm for real problems we have to useproper error estimators.

In [22] can be found a transparent introduction to and a state-of-the-art review of themathematical theory of a posteriori error estimates for an operator equation Au = f ona one-(or two-) dimensional boundary surface Γ. In this paper most of the presentederror estimates are residual based. There are two major challenges these error estimatorstry to achieve. The main challenge is the localization of the Sobolev norm. The secondtask is to prove reliability of the resulting error indicators in particular for the Dirichletboundary value problem where the related boundary integral operator is of order minusone.

131

6. Applications

On the other hand there are error estimators (also a posteriori) presented in [48, 49]that are based on truncated Neumann series to solve a second kind boundary integralequation.

Our choice of the error estimator was closely connected with implementational possi-bilities. Therefore we have chosen an error estimator for boundary integral methodsfrom the class of averaging error estimators introduced in [23]. We will not discuss indetail the properties of the chosen error estimator but, for the comprehensible applicationwe will present the basic ideas and the algorithm.

6.5.1. Averaging Error Estimators

In this subsection we will shortly explain the basic idea of the averaging error estimators.The idea to use averaging techniques for providing error estimators comes from the finiteelement method community. It has been shown that in the FEM case any averagingtechnique leads to a reliable error estimator (in [21]). In [23] this idea is extended forthe Galerkin boundary element method.The theory is developed for the following case: we are given the right-hand side f andan approximation uh for the unknown exact solution u of the given SLP operator

V[u](x) = f(x) in H−1/2(Γ), (6.4)

where Γ = ∂Ω is the Lipschitz boundary of the bounded domain Ω in Rd, d = 2, 3.The class of four error estimators is introduced. In the simple case we start with thecoarse grid TH and the fine grid Th obtained by applying a uniform refinement on TH .Let uh ∈ P0(Th) be a Th-piecewise constant Galerkin approximation of the exact solutionu ∈ H−1/2(Γ) of (6.4). We introduce also two operators for TH-piecewise linear functions(Section 3.1 in [23]):

Galerkin projection GH : H−1/2(Γ) → P1(TH) and

L2-projection AH : H−1/2(Γ) → P1(TH).

Under the assumption that the mesh size h is small enough compared to the mesh sizeH, then the error estimator with the energy norm ‖ · ‖V

ηM := ‖uh − GHuh‖V := 〈V(uh − GHuh), uh − GHuh〉12 (6.5)

is always reliable and efficient up to terms of higher order, cf. Theorem 5.2 in [23].Since GH is the best approximation operator with respect to the energy norm, thereholds ‖uh − GHuh‖V ≤ ‖uh −AHuh‖V . Particularly, the error estimator

ηA := ‖uh −AHuh‖V (6.6)

is reliable. By interpolation and inverse estimates it is shown that the error estimators

µM := ‖H1/2(1l − GH)uh‖L2(Γ) and ηA := ‖H1/2(1l −AH)uh‖L2(Γ) (6.7)

132


are equivalent to ηM and µA, respectively, i.e.

C−11 µM ≤ ηM ≤ C2µM and C−1

1 µA ≤ ηA ≤ C2µA (6.8)

with constants C1 and C2 that do not depend on the size or the number of elements inTh and TH . This statement is proven in Corollary 5.4 and Corollary 5.5 in [23]. Theproof is carried out under certain regularity assumptions on the exact solution. Sincethe L2 norm is local in the sense that ‖ · ‖2

L2(Γ) =∑

τ∈TH‖ · ‖2

L2(τ), µM and ηA can beused as indicators for an adaptive mesh refinement.

Algorithm 6.1 (Averaging Error Estimator)

INPUT The coarse grid TH

REFINE the grid TH uniformly and obtain the fine grid Th where h = H2 .

SOLVE the system of linear equations Vhuh = fh, where Vh is an SLP operator discre-tised with piecewise constant functions whose supports are the triangles from thefine triangulation Th.

COMPUTE the Galerkin projection of the solution uh in the discretisation space basedon the coarse grid TH that is spanned by piecewise linear functions. We denotethis projection by uH .

COMPUTE the quantities µM,τ for all τ ∈ TH :

µ2M,τ := ‖H

12τ (uh|τ − uH |τ )‖2

L2(τ) = H12τ

∫

τ(uh|τ − uH |τ )2 (6.9)

and determine the error estimator by

µM =∑

τ∈TH

µM,τ . (6.10)

ADAPTIVELY REFINE elements τ ∈ TH whose error indicator µM,τ is relatively large.In this way we produce the new coarse grid TH′ and we start the algorithm fromthe beginning.

Figure 6.6 illustrates how the error estimator may influence the grid. We apply thisalgorithm in the following way:

• We consider the equation Vn = f := Kd, where d and n are Dirichlet, respectivelyNeumann data for the harmonic function u. The H-matrix corresponding to theSLP operator will be denoted by Vh. n will be computed from the discretisedequation using GMRES and this solution will be denoted by nh. The solution isrepresented in a piecewise constant basis defined on the fine grid.

• We compute the solution nh in the piecewise linear basis defined on the coarse gridTH . This solution we denote nH .

133

6. Applications

TH Th

TH′ Th′

Figure 6.6.: Th is a fine grid obtained by uniform refinement of the coarse grid TH . Theerror indicators indicated the triangles in the down right corner as thosewith a relatively big error. We refine these triangles by bisection obtainingthe grid TH′ . By uniform refinement of TH′ we obtain the grid Th′ . Bothgrids are locally refined which gives a possibility of applying the updatealgorithm.

• We compute the error indicators, refine the coarse grid adaptively obtaining thegrid TH′ , and refine the obtained grid uniformly, obtaining the grid Th′. In thenext step the matrix Vh′ will be updated from Vh.

Graphically this looks as presented in Figure 6.7. The following table contains the datathat underline the efficiency of the update algorithm in the case the previously introducederror estimator µM was used as error indicator. The input data are:

• TH is a grid on the boundary of the cube containing 768 panels. The fine grid Th,obtained by the uniform refinement of TH contains 3072 panels.

• The operators are the single-layer and double-layer potential.

• We refine 5% of the grid and as Figure 6.8 shows, the refinement is a combinationof local and non-local refinement.

• The H-matrices are assembled by the original algorithm (using the interpolationfor assembling the low-rank blocks) only in the step 0. Further, in every other stepwe update the matrices.

134


TH Th

TH′ Th′uniform refinement

uniform refinement

Galerkin projection

Galerkin projectionuH uh

uh′

Vhuh = fh

Vh′uh′ = fh′

adaptive refinement H-matrix update

uH′

Vh

Vh′

error estimator

error estimator

Figure 6.7.: Schematic representation of Algorithm 6.1.

step n (new %) DLP SLP µM l2update reassembly update reassembly

0 3072 5.8s 5.5s 6.40 3.13

1 3296(13.6%) 1.6s 6.5s 1.6s 6.2s 5.3 3.13savings 74.15% 73.7%

2 3560(14.8%) 2.1s 7.2s 1.9s 6.6s 3.89 2.12savings 70.7% 71.3%

3 3848(15%) 1.97s 7.95s 1.7s 7.14s 3.21 1.59savings 75.22% 76%

4 4168(15.35%) 2.31s 8.85s 1.94s 7.90s 2.56 1.27savings 73.90% 75.44%

5 4448(12.4%) 2.36s 9.56s 1.91s 8.45s 2.27 1.28savings 75.31% 77.4%

135

6. Applications

Figure 6.8.: An illustration of the refinements indicated by the error estimator.

136

7. Implementation

The translation of mathematical algorithms into a programming language is not aneasy task. One of the main problems are data structures. The basic data structuresavailable in the mostly used programming languages (like int, double) do not matchdirectly to mathematical structures (like vector space). Nevertheless the implementationof the mathematical algorithms is possible and can be efficiently performed with existingdata structures. One way to overcome the problem of “not enough mathematical” datastructures is proposed in [45].For the implementation of the algorithms presented in the previous chapters we havechosen the C programming language, that satisfies the ANSI/ISO C standards. Further,we use the H-matrix library, HLIB, that contains the definition of all structures neededfor constructing the H-matrix. The basics of the HLib will be presented in the firstsection of this chapter. The second section will contain the implementation details thatconcern the geometry needed for update algorithms. In the third section we shall presentthe implementation of the update algorithms for the cluster tree and inadmissible andadmissible matrix blocks. All these sections will contain fragments of the source code.

7.1. H-Matrix Library-HLib

The HLib has been developed in the last years as support to the various theoreticalresults. The library was basically written for H-matrix arithmetics and it was extendedby H2-matrix structure. The HLib contains:

• Routines for the construction of hierarchical matrix structures (i.e., of cluster trees,block cluster trees, low-rank matrices and block matrices)

• Discretisation functions filling these structures by approximations of FEM or BEMoperators

• Arithmetic algorithms performing approximative matrix operations like addition,multiplication, factorizations and inversion

• Conversion routines turning sparse matrices and dense matrices into H-matricesand H-matrices or H2-matrices into H2-matrices

• service functions displaying matrix structures, perform numerical quadrature orhandle files

More informations about the library and what it offers, and how it can be used canbe found on the address http://www.hmatrix.org. For performing the basic algebraic

137

7. Implementation

operations the library uses BLAS and LAPACK, written in FORTRAN, which is also a stan-dardised ANSI/ISO programming language.Since this work was mostly focused on the structure of the hierarchical matrices, we shallmention only the way of implementation of basic structures: cluster tree, full matrix,rkmatrix, and H-matrix, which in the HLIB becomes supermatrix. For H2-matrices weneed the cluster basis and uniform matrices, and the implementation of these construc-tional elements will be presented as well.

7.2. Implementation of the H-Matrix Structure

In this section we shall give a brief introduction into the implementation of the struc-ture needed for constructing the H-matrices. It includes the implementation of thecluster tree, low-rank matrices in rkmatrix format and densely populated matrices infullmatrix format. We shall also present the implementation of H-matrix that willdirectly follow from the matrix block structure.

7.2.1. Implementation of the Cluster Tree

The first part of this subsection will contain the implementation of the index set I,whose cardinality is equal to the number of basis functions, e.g. n. So far, we have notstated what the index set I can be. For implementation purposes we chose that the setI is consecutively numbered, i.e. I = 0, . . . , n − 1. The array of integers will be thestructure used to represent the index set, that together with the integer n defines a newstructure.

typedef INDEX_SET struct _INDEX_SET;

typedef pINDEX_SET *INDEX_SET;

struct _INDEX_SET

int n;

int *index;

;

The definition of the index set is done straightforward:

for(i=0;i<n;i++)

index[i]=i;

The complex structure of the cluster tree is represented using a standard (binary) treestructure. Each cluster is a vertex of the cluster tree and useful informations should bestored. Those useful informations are:

• the number of sons (an integer),

• if the cluster has any sons, we would like to have a connection to the sonclusters (it will be a pointer)

138


• the information about the mappingˆ: VTI−→ I. Of interest is the size of t ⊂ I

for every t ∈ TI (an integer). Besides, it is useful to know which elements of theindex set are in t, and therefore we introduce an other variable, start (also aninteger) which shows where the elements of t start. Having start and size we definet = index[start],...,index[start+size].

struct _cluster

int start; /* First element of the associated index set */

int size; /* Size of the associated index set */

int sons; /* Number of sons */

pcluster *son; /* Array of sons */

;

Finally we can define the cluster tree structure that contains the previously definedstructures.

struct _CLUSTER_TREE

pINDEX_SET index;

pcluster root;

;

So far we have defined the cluster tree and a way to implement it. The next questionis how to construct such cluster tree. In Chapter 2 we presented several methods toconstruct the cluster tree. The implementation of geometrical clustering (Algorithm2.9) and cardinality balanced clustering (Algorithm 2.12) can be found in [13]. Theimplementation of the box tree clustering (Algorithm 2.17) will be presented in thenext subsection. Before we proceed we will briefly mention the implementation of thegeometry.The triangulation of the surface is stored in a list of triangles or/and in an array oftriangles. Each triangle is constructed in a way which is well known in various FEMor BEM packages. The only exception we introduce concerns the adaptive scheme. Wedefine triangles in the way they know their predecessors and successors.

7.2.2. Box Tree Clustering Implementation

The algorithm which performs the box tree clustering will be presented in the followingcode. The parameter list is rather long and contains besides necessary elements forconstructing one cluster, also the grid elements (points x and triangles arr). The integerleafsize is equal to nmin. The double arrays bmin, bmax are not bounding boxes forthe cluster, but the boxes that we need for clustering are stored in these arrays.

pcluster

split_boundingbox(double *bmin, double *bmax, double **x,

int d, int* index, int leafsize,

139

7. Implementation

int start, int size,pTriangle *arr)

/*... some initialisations ..*/

/*split bounding box in the direction of the maximal distance */

if(size <= leafsize)

root = new_cluster(start,size,d,0);

for(j=0;j<d;j++)

root->bmin[j] = bmin[j];

root->bmax[j] = bmax[j];

If the size of the cluster is less or equal to leafsize we create a new cluster that is aleaf (no sons). If not, we determine the splitting direction jnext as the direction of themaximal extent.

else /* we determine the splitting direction*/

pom = bmax[0] - bmin[0];

jnext = 0;

for(j = 1; j<d; j++)

if((bmax[j]-bmin[j])>pom)

pom = bmax[j] - bmin[j];

jnext = j;

pom = bmin[jnext] + 0.5*pom;

The next step is to rearrange the indices, i.e., to determine whether they are in the leftor right box.

/* rearranging the degrees of freedom*/

l = start; r = start+size-1;

while(l < r)

while(l < (size+start) && x[index[l]][jnext] <= pom)

l++;

while(r >= start && x[index[r]][jnext] > pom)

r--;

if(l < r)

h = index[l]; index[l] = index[r]; index[r] = h;

140


Now, we can create the cluster, which by default will have two sons. As a consequenceit might happen that one cluster is empty, but so far it is not a problem because ourdefinition of the cluster tree allows existence of empty clusters.

root = new_cluster(start, size, d, 2);

for(j=0;j<d;j++)

root->bmin[j] = bmin[j];

root->bmax[j] = bmax[j];

old = bmax[jnext];

bmax[jnext] = pom;

root->son[0] =

split_boundingbox(bmin,bmax,x,d, index, leafsize, start, l-start,arr);

bmax[jnext] = old;

old = bmin[jnext];

bmin[jnext] = pom;

root->son[1] =

split_boundingbox(bmin,bmax,x,d, index,leafsize, l, start+size-l,arr);

bmin[jnext]=old;

return root;

7.2.3. Implementation of Full and Rk-matrices

Implementation 7.1 (Full matrix representation) The elements of an n ×m fullmatrix F

F =

F11 . . . F1m...

. . ....

Fn1 . . . Fnm

will be stored in an array of doubles in the coloumnwise order F11, . . . , Fn1F12, . . . , Fn2, . . . , Fnm

(this corresponds to the LAPACK conventions).

typedef struct _fullmatrix fullmatrix;

typedef fullmatrix* pfullmatrix;

struct _fullmatrix

int n;

int m;

double *e;

;

Implementation 7.2 (Rk matrix representation of n×m matrix) The Rk-matrixR = ABT will be stored also coloumnwise in the arrays a for A ∈ Rn×kt and b forB ∈ Rm×kt.

141

7. Implementation

typedef struct _rkmatrix rkmatrix;

typedef rkmatrix* prkmatrix;

struct _rkmatrix

int k; /* maximal rank */

int kt: /* current rank */

int rows;

int cols;

double *a;

double *b;

;

Remark 7.3 The current rank kt is needed for optimised adaptive arithmetic.

7.2.4. Implementation of H-Matrix

Implementation 7.4 The H-matrix structure is represented in the supermatrix struc-ture, which is implemented as follows:

typedef struct _supermatrix supermatrix;

typedef supermatrix* psupermatrix;

struct _supermatrix

int rows;

int cols:

int block_rows;

int block_cols;

prkmatrix rk;

pfullmatrix full;

psupermatrix *s;

;

A supermatrix M consists of block rows×block cols submatrices. The size of the matrixis rows× cols, i.e., M ∈ Rrows×cols. The matrix can be

• an rkmatrix-then rk 6= 0 × 0, full= 0 × 0 and s= 0 × 0;

• a fullmatrix-then full 6= 0 × 0, rk= 0 × 0 and s= 0 × 0;

• a supermatrix-then s 6= 0 × 0, full= 0 × 0 and rk= 0 × 0: the array s containsthe pointers to the submatrices Mi,j of

M =

M1,1 . . . M1,block cols...

. . ....

Mblock rows,1 . . . Mblock rows,block cols

in columnwise order.

142


7.2.5. Implementation of H2-Matrices

The construction of H2-matrices involves defining the new structures cluster basis anduniform matrix.

Implementation 7.5 (Clusterbasis) The clusterbasis structure is defined as fol-lows:

typedef struct _clusterbasis clusterbasis;

typedef clusterbasis* pclusterbasis;

struct _clusterbasis

pcluster t;

double **T;

double *V;

int k;

int kt;

int n;

int sons;

pclusterbasis *son;

;

The pointer t is the cluster this cluster basis is used for.The fields sons and son are used to form a tree of the clusterbasis similar to thecluster tree.The integer n is identical to t->size, the field k gives the maximal possible rank forwhich memory has been allocated, while the field kt gives the current rank.The array T contains the transfer matrix T t′,t. The entry T[i] corresponds to the i-thson son[i] and represents a matrix in FORTRAN format with son[i]->kt rows andkt columns.

The next step is to present an implementation of a uniform matrix which will be straight-forward from the cluster basis.

Implementation 7.6 (Uniform matrix) The structure of uniformmatrix is similarto the structure of rkmatrix:

typedef struct _uniformmatrix uniformmatrix;

typedef uniformmatrix* puniformmatrix;

struct _uniformmatrix

pclusterbasis row;

pclusterbasis col;

143

7. Implementation

int n; /* n = row->n */

int m; /* m = col->n */

int kr; /* maximal rank for row cluster basis */

int kc; /* maximal rank for column cluster basis */

int ktr;/* current rank for row cluster basis */

int ktc;/* current rank for column cluster basis */

double *S;

;

The pointers row and col give us the cluster basis corresponding to this block; V t corre-sponds to row and W s corresponds to col.The array S contains the coefficient matrix St,s in standard FORTRAN format with ktr

rows and ktc columns.

Implementation 7.7 In order to be able to treat uniform H-matrices and H2-matriceswe have to add three fields to the supermatrix structure:

pclusterbasis row;

pclusterbasis col;

puniformmatrix u;

The fields row and col give us the cluster basis corresponding to this supermatrix, if itdescribes a uniform H-matrix. If it describes an admissible block of a uniform H-matrix,the field u contains the corresponding coefficient matrix St,s.

7.3. Update of Cluster Tree-Implementation

This section will contain the implementation of algorithms that we have introduced inChapter 4. We shall present the implementation of indirect clustering methods as wellas the implementation of the operations reduction and fusion.

7.3.1. Implementation of Indirect Clustering

At the beginning we present the implementation of Algorithm 4.12. In general we con-struct the cluster tree TJ starting from the cluster tree TI and the index set J ⊂ I.The following code contains on the parameter list the cluster tree Ti, and the indexset J containing the positions of the elements we want to cluster. In Example 4.14 weindirectly clustered the set J = 0, 7 from the cluster tree TI obtained by box treeclustering as in Example 4.4. The implementation needs the positions of the indices wewould like to cluster, i.e., as the parameter we would pass the set J = 5, 7 . Theprocedure create TjfromTi returns the cluster tree Tj which has the same structure asthe cluster tree Ti.

144


pcluster

create_TjFromTi(pcluster Ti, int* J,int size,int start, int d)

/* some initialisations*/

if(Ti->sons >0)

size1 =0;

n1 = Ti->son[0]->start + Ti->son[0]->size;

for(i=start; i<(size+start); i++)

if(J[i] <n1)

size1++;

if(Ti->sons==2)

Tj = new_cluster(start,size,d,2);

Tj->son[0] = create_TjFromTi(Ti->son[0],J,size1,start,d);

Tj->son[1] = create_TjFromTi(Ti->son[1],J,size-size1,start+size1,d);

return Tj;

else


return Tj;

The second indirect implementation we used for obtaining the tree TI′new

follows Algo-rithm 4.16. The implementation of this routine is closer to the direct clustering methodthan to the indirect clustering of the previous type. The parameter list contains thecluster tree Ti, the set of indices we need to cluster stored in the array of integers int*index and the set of triangles that contain the chosen points we cluster.

pcluster

create_TjFromBoxTi(pcluster Ti, int* index,

int size, int start,int d,

pTriangle* arr)

/* some initialisation */

if(Ti->sons>0)

/* if the cluster has sons then the have to check

whether the points we cluster are in the left or

right cluster son */

145

7. Implementation

size1 =0;

bmin = Ti->son[0]->bmin;

bmax = Ti->son[0]->bmax;

for(i=start; i<(start+size); i++)

k=0;

for(j=0;j<d;j++)

if(arr[index[i]]->mpoint->v[j] <=bmax[j] &&

arr[index[i]]->mpoint->v[j] >=bmin[j])

k++;

if(k==d)

size1++;

/* rearranging the indices depending in which box cluster points are*/

l = start; r = start+size-1;

while(l < r)

while(l < (size+start) && where_BB(l,arr,index,d,bmax,bmin) == 1)

l++;

while(r >= start && where_BB(r,arr,index,d,bmax,bmin) == 0)

r--;

if(l < r)

h = index[l]; index[l] = index[r]; index[r] = h;

/* we construct the cluster tree Tj using the structure

of the cluster tree and recursively call the routine

to construct the entire structure*/

if(Ti->sons ==2)


for(i=0;i<d;i++)

Tj->bmin[i] = Ti->bmin[i];

Tj->bmax[i] = Ti->bmax[i];

Tj->son[0] =

create_TjFromBoxTi(Ti->son[0],index,size1,start,d,arr);

Tj->son[1] =

create_TjFromBoxTi(Ti->son[1],index,size-size1,start+size1,d,arr);

if(Ti->sons == 1)

146



Tj = create_TjFromBoxTi(Ti->son[0],index,size1,start,d,arr);

else


for(i=0;i<d;i++)

Tj->bmin[i] = Ti->bmin[i];

Tj->bmax[i] = Ti->bmax[i];

return Tj;

7.3.2. Implementation of Reduction

Reduction as an operation involving two cluster trees will be implemented in the fol-lowing way: at the beginning we construct the cluster tree Tiout using the routinecreate TjFromTi. Using Tiout we update the start and the size of Ti storing theresult in Ti. The index set index reduced corresponds to R(Istay).

void

reduced_Cluster(pcluster Ti, int n, int *index_out,

int size, int start, int d,

int *index_prime, int *index_reduced)

Tiout = create_TjFromTi(Ti,index_out,size,start, d);

update_StartSizeMinus(Ti,Tiout,0);

del_cluster(Tiout); Tiout = NULL;

k=0;

for(i=0;i<n;i++)

if(index_prime[i] != -1)

index_reduced[k] = index_prime[i];

k++;

The recursive procedure update StartSizeMinus, as the name says, updates the clustertree, performing basically the operation Ti:= Ti \ Tj.

void

update_StartSizeMinus(pcluster Ti,pcluster Tj,int count)

147

7. Implementation

int i,size_i,size_j;

size_i = Ti->size;

size_j = Tj->size;

Ti->size = size_i - size_j;

Ti->start = count;

if(Ti->sons>0)

for(i=0; i<Ti->sons; i++)

update_StartSizeMinus(Ti->son[i],Tj->son[i],count);

count +=Ti->son[i]->size;

7.3.3. Implementation of Fusion

The implementation of the fusion is done recursively and it performs the operation Ti:=

Ti ∪ Tj. The parameter list contains the identically structured cluster trees Ti and Tj,as well, the index sets index I and index J.

void Fusion(pcluster Ti,pcluster Tj,int count,

int* index_IUJ, int* k,int *ki,int *kj,

int *index_I, int* index_J)

int size_I,size_i,size_j,size_J,i,j;

int ki_tmp,kj_tmp;

Ti->start = count;

size_i = Ti->size;

size_j = Tj->size;

Ti->size = size_i+size_j; /* updating the size */

if(Ti->sons==0 && Ti->size>0)

size_I = Ti->size;

size_J = Tj->size;

ki_tmp = ki[0];

for(i=ki_tmp; i<(ki_tmp+size_I-size_J);i++)

index_IUJ[k[0]] = index_I[i];

k[0]++;

148


ki[0]++;

kj_tmp = kj[0];

for(j=kj_tmp; j<(kj_tmp+size_J); j++)

index_IUJ[k[0]] = index_J[j];

k[0]++;

kj[0]++;

else

for(i=0; i<Ti->sons; i++)

Fusion(Ti->son[i],Tj->son[i],count,

index_IUJ,k,ki,kj,index_I,index_J);

count += Ti->son[i]->size;

In the updated cluster tree TI′ we distinguish between identical, updated and newclusters with respect to the initial cluster tree TI . For the update of admissible andinadmissible blocks it is important to have some flag indicating the type of the cluster.Therefore we extend the cluster structure by adding the integer flag that indicatesthe type of the cluster.

struct _cluster

...

...

int flag; /* number of new indices */

;

To flag the cluster tree means, in terms of implementation, to mark the clusters in theupdated cluster tree as updated, identical or new.

void

flag_Cluster(pcluster Tiprim, int *index_old)

int k,i;

size = Tiprim->size;

start = Tiprim->start;

/* here will be counted how many old indices are in the cluster*/

k = 0;

for(i=start; i<(start+size); i++)

149

7. Implementation

if(index_old[i] != -1)

k++;

/*if the number of the old indices is not equal to the

size of the cluster then the flag takes the value size-k. If

k=0 then the cluster is new. If k=size then the cluster is old

and its flag is 0*/

Tiprim->flag= size-k;

/*procedure will be recursively called for sons if there are any*/

if(Tiprim->sons>0)

for(i=0;i<Tiprim->sons;i++)

flag_Cluster(Tiprim->son[i],index_old);

Beside the index set, which characterises the basis functions and their supports, inthe update procedure, we need the connection between “old” and “new” discretisationschemes. This information will be stored in the Triangle structure, in the way thateach triangle possesses different names.

Implementation 7.8 (Triangle) In the structure triangle we define the properties ofthe geometry we use.

typedef struct _Triangle Triangle;

typedef Triangle* pTriangle;

struct _Triangle

pVertex t[3];

int name;

int old_name;

int new_name;

;

pVertex is a structure that defines the vertex as ordered triple of doubles.name is the current name of the triangle, and it is the index for the basis function whosesupport is this particular triangle (in the case of piece-wise constant ansatz)old name is the name of the triangle, in the case the triangle was a copy of some trianglefrom an old discretisation scheme. If the triangle is not a copy of some other trianglethen the old name is by default −1.new name is the name of the son triangle if the triangle has not been refined. In the casethe triangle was refined the value of new name is −1.

Having the triangulation (stored into either a list or an array) we can define the index setsindex old and index new. Those two sets contain the information about the indexing

150

7.4. Implementation of Matrix Blocks Update

in the old resp. new cluster tree. If index is given for some cluster tree TI and arraysof triangles pTriangle *arr then we define index old and index new as

for(i=0;i<n;i++)

index_old[i] = arr[index[i]]->old_name;

index_new[i] = arr[index[i]]->new_name;

There is one more index set of interest and this is index pos that contains the positionof indices from the index set. If index corresponds to some cluster tree TI then:

for(i=0;i<n;i++)

index_pos[index[i]]=i;

Example 7.9 We consider Example 4.4. The index for the cluster tree TI is

index = 3,4,2,6,5,7,1,0.

If the triangles from the old discretisation scheme are stored in the array of trianglespTriangle* arr, the index new can be in general defined as

for(i=0;i<n;i++)

index_new[i]=arr[index[i]]->new_name;

In our example the index new takes the values

index_new = 4,5,3,7,6,-1,2,-1.

If we now consider the updated cluster tree TI′ then the corresponding index is definedas

index = 4,5,3,7,6,8,2,1,0,9.

Having a new triangulation stored in the triangle array pTriangle* arr new we candefine the index set index old.

for(i=0;i<m;i++)

index_old[i]=arr_new[index[i]]->old_name;

index_old = 3,4,2,6,5,-1,1,-1,-1,-1


In this section we will present the implementation of admissible and inadmissible matrixblocks. We will restrict us to the case when the interpolation scheme is used to assemblelow-rank blocks and even more, we will assume that the interpolation is done in the firstvariable x of g(x, y).Some structures and functions which appear in the following procedures are not definedso far. Among them is the structure DataFactory which contains the following data:the index sets index, index old, index new and index pos, quadrature points qp andthe triangulation pTriangle *arr. We also use the functions

151

7. Implementation

• triangle FromArray that as the name says, returns the triangle for a particularindex from the array of triangles.

• transformed ChebyshevPoints that transforms Chebyshev points for a given clus-ter.

• get EntryLagrange returns the value∫τiLt

ν(x)dΓx.

• get EntryKernel returns the value of∫τig(xt

ν , y)dΓy.

• dcopy is a LAPACK function that copies elements of one vector into an other.

In the following procedures the clusters row and col correspond to clusters t′ and s′,while the clusters row old and col old correspond to the clusters t and s. We notice thatwe do not use block clusters (as in the original algorithm) but clusters directly. Thereforewe assume that the pair of clusters (row,col) is admissible resp. inadmissible.

7.4.1. Implementation of Update for Admissible Blocks

We present the implementation of the update of admissible blocks in the procedureadapt RkMatrix. On the parameter list is beside the elements mentioned before alsothe old rkmatrix r old.

prkmatrix

adapt_RkMatrix(pccluster row, pccluster col,

pccluster row_old, pccluster col_old,

pDataFactoryExtended dfe,

prkmatrix r_old)

/*some initialisations*/

/* we determine K*/

M = (int*) malloc (sizeof(int)*d);

for(i=0; i<d; i++)

if(fabs(row->bmax[i]-row->bmin[i])<1e-16)

M[i] = 1;

else

M[i] = p;

K = M[0]*M[1]*M[2];

transformed_ChebyshevPoints(dfe->qp->x,row,&l_row,M);

n = row->size;

m = col->size;

n_old = row_old->size;

m_old = col_old->size;

152


start_n = row->start;

start_m = col->start;

start_nold = row_old->start;

start_mold = col_old->start;

r = new_rkmatrix(K,n,m);

r->kt = K;

/* we check: if cluster row is old then we copy whole matrix r_old->a into

r->a; in the similar way we check the cluster col later */

if(row->name == 0)

r->a = r_old->a;

else

for(i=0; i<n; i++)

if(dfe->index_old[i+start_n] !=-1)

dcopy_(&K,&r_old->a[dfe->index_pos[i+start_n]-start_nold],

&n_old,&r->a[i],&n);

else

for(nu1=0; nu1<M[0]; nu1++)

for(nu2=0; nu2<M[1]; nu2++)

for(nu3=0; nu3<M[2]; nu3++)

Ti =triangle_FromArray(arr,ind->index[i+start_n]);

NU = nu1 + M[0]*nu2 + M[0]*M[1]*nu3;

nu[0] = nu1;

nu[1] = nu2;

nu[2] = nu3;

r->a[i+NU*n] = get_EntryLagrange(dfe->qp->hq,M,Ti,nu,l_row);

if(col->name == 0)

r->b = r_old->b;

else

for(i=0; i<m; i++)

if(ind->index_old[i+start_m] != -1)

dcopy_(&K,&r_old->b[ind->index_i[i+start_m]-start_mold],

&m_old,&r->b[i],&m);

else

for(nu1=0; nu1<M[0]; nu1++)

for(nu2=0; nu2<M[1]; nu2++)

for(nu3=0; nu3<M[2]; nu3++)

153

7. Implementation

Tj =triangle_FromArray(arr,ind->index[i+start_m]);

NU = nu1 + M[0]*nu2 + M[0]*M[1]*nu3;

nu[0] = nu1;

nu[1] = nu2;

nu[2] = nu3;

r->b[i+NU*m] = get_EntryKernel(dfe->qp->hq,Tj,nu,l_row);

free(M); M = NULL;

free(l_row); l_row = NULL;

return r;

7.4.2. Implementation of Update for Inadmissible Matrix Blocks

The implementation of the update of inadmissible matrix blocks presented in the rou-tine adapt FullMatrix follows step by step the algorithm for updating full matri-

ces. pfullmatrix full corresponds to the matrix block G|t×s. We use the functionget EntryFull that returns the entry for the full matrix defined as

∫τi

∫τjg(x, y)dΓxdΓy.

pfullmatrix

adapt_FullMatrix(pccluster row, pccluster col,

pccluster row_old, pccluster col_old,

pDataFactoryExtended dfe,

pfullmatrix full)

/*some initialisations*/

n = row->size;

m = col->size;

start_n = row->start;

start_m = col->start;

n_old = row_old->size;

m_old = col_old->size;

start_nold = row_old->start;

start_mold = col_old->start;

f = new_fullmatrix(n,m);

for(i=0; i<n; i++)

for(j=0; j<m; j++)

154


if(dfe->index_old[i+start_n] != -1 &&

dfe->index_old[j+start_m] != -1)

f->e[i+n*j] =

full->e[dfe->index_pos[i+start_n]-start_nold+

n_old*(def->index_pos[j+start_m]-start_mold)];

else

ti = triangle_FromArray(dfe->arr,dfe->index[i+start_n]);

tj = triangle_FromArray(dfe->arr,dfe->index[j+start_m]);

f->e[i+n*j] = get_EntryFull(ti,tj,dfe->qp->HQ);

return f;

155

7. Implementation

156

A. APPENDIX

In Chapter 3 we have introduced several approximation schemes for computing low-rankapproximations. In this appendix we will give a precise formula for the entries in thelow-rank approximations obtained by interpolation. The first section of this appendixcontains the computation formulas forthe single-layer potential (SLP) and double-layerpotential (DLP) in the case of a piecewise constant ansatz while the second sectiondescribes the computation of the entries for SLP and DLP in the case of a piecewiselinear ansatz.

A.1. Piecewise Constant Ansatz

Let T be a triangulation of the boundary Γ and let I be an index set. In case thediscretisation space is spanned by piecewise constant functions, the dimension of thespace is equal to the number of triangles in T . There holds #T = #I. Each basisfunction ϕi, i ∈ I, is defined on the corresponding triangle, i.e., suppϕi(x) = τi.

A.1.1. Entries for the SLP Matrix

If (t, s) is an admissible pair of clusters, the kernel function g(x, y) can be replacedby its approximation (3.6). Then the entries Gij will be replaced by Gij , leading to

G|t×s ≈ G|t×s.

(G|t×s)ij =

∫

Γϕi(x)

∫

Γg(x, y)ϕj(y)dΓydΓx

=∑

ν∈K

∫

Γϕi(x)

∫

Γg(xt

ν , y)Ltν(x)ϕj(y)dΓydΓx

=∑

ν∈K

∫

Γϕi(x)Lt

ν(x)dΓx

∫

Γg(xt

ν , y)ϕj(y)dΓy

= (ABT )ij .

The matrix block G|t×s can be represented as a product of two matrices A and B.

G|t×s = ABT ,

Aiν =

∫

Γϕi(x)Lt

ν(x)dΓx,

Bjν =

∫

Γg(xt

ν , y)ϕj(y)dΓy.

157

A. APPENDIX

The assembly of the matrices A and B will be further simplified if we recall the definitionof the functions ϕi(x), that are chosen to be characteristic function of τi, ϕi(x) = χτi

(x).

Aiν =

∫

τi

Ltν(x)dx,

Bjν =

∫

τj

g(xtν , y)dy.

These computations can be done in two steps. In the first step we apply a mapping Φ,which transforms the surface integral over the triangle τi (τj resp.) into a double integralover the unit triangle (Figure A.1). The matrix entries Aiν , Bjν are:

C

A

B

(0, 0) (1, 0)

(0, 1)

Φ

Figure A.1.: Φ : R2 −→ R3, Φ(x1, y1) = A+ (B −A)x1 + (C −A)y1

Aiν =

∫Lt

ν(Φ(x1, y1))grτi(x1, y1)dx1dy1,

Bjν =

∫g(xt

ν ,Φ(x1, y1))grτj(x1, y1)dx1dy1.

where grτi(grτj

resp.) is the Gram determinant depending on the triangle τi (τj resp.)

grτi= grτi

(x1, y1) =

√det

(〈 ∂Φ

∂xi,∂Φ

∂yj〉i,j=1,2

).

Since grτi(grτj

resp.) is a constant it can be taken outside of the integral.In the second step we apply the mapping Ψ which transforms the integral over the unittriangle into an integral over the unit square [0, 1] × [0, 1] (Figure A.2). This is done inorder to apply the standard Gauß quadrature formula.

Aiν = grτi

∫ 1

0

∫ 1

0Lt

ν(Φ(Ψ(x2, y2)))(1 − x2)dx2dy2,

Bjν = grτj

∫ 1

0

∫ 1

0g(xt

ν ,Φ(Ψ(x2, y2)))(1 − x2)dx2dy2,

∣∣∣DΨ(x, y)

D(x, y)

∣∣∣ = 1 − x.

158


(0, 0) (0, 0)(1, 0) (1, 0)

(0, 1) (0, 1)

Ψ

(1, 1)

Figure A.2.: Ψ : [0, 1] × [0, 1] −→ R2, Ψ(x2, y2) = (x2, (1 − x2)y2)

If the order of the quadrature is q, we will denote the number of quadrature points by nq.Points xl, yq are the quadrature points while wl are the corresponding weights. Finally,we obtain the formula for entries of the matrices A and B.

Aiν ≈ grτi

nq∑

ι=1

nq∑

θ=1

∑

ν∈K

Ltν(Φ(Ψ(xι, yθ)))(1 − xι)wιwθ,

Bjν ≈ grτj

nq∑

ι=1

nq∑

θ=1

∑

ν∈K

g(xtν ,Φ(Ψ(xι, yθ)))(1 − xι)wιwθ.

Finally, we have a practical formula for computing the entries for the matrix Gij for theadmissible pair of clusters (t, s) ∈ TI×I .

A.1.2. Entries for the DLP Matrix

The double-layer potential operator is defined as

K[u](x) =

∫

Γ〈n(y),∇yg(x, y)〉u(y)dΓy .

Applying the standard discretisation scheme for the DLP operator we obtain a denselypopulated matrix K whose entries are

Kij =

∫

Γ

∫

Γϕi(x)〈n(y),∇yg(x, y)〉ϕj(y)dΓxdΓy. (A.1)

Let (t, s) be an admissible pair of clusters and let us assume that diam(At) ≤ diam(As).We approximate the kernel function g by a degenerate kernel function

g(x, y) =∑

ν∈K

〈n(y),∇yg(xtν , y)〉Lt

ν(x). (A.2)

159

A. APPENDIX

Inserting the previous equation in (A.1) we obtain

(K|t×s)ij =

∫

Γ

∫

Γϕi(x)〈n(y),∇y g(x, y)〉ϕj(y)dΓxdΓy

=∑

ν∈K

∫

Γ

∫

Γϕi(x)〈n(y),∇yg(x

tν , y)〉Lt

ν(x)ϕj(y)dΓxdΓy

=∑

ν∈K

∫

Γϕi(x)Lt

ν(x)dΓx

∫

Γ〈n(y),∇yg(x

tν , y)〉ϕj(y)dΓy.

We summarise:

K|t×s = ABT ,

Aiν :=∑

ν∈K

∫

Γϕi(x)Lt

ν(x)dΓx,

Bjν :=∑

ν∈K

∫

Γ〈n(y),∇yg(x

tν , y)〉ϕj(y)dΓy.

As in the previous subsection we apply the transformations Φ and Ψ and a quadraturerule for nq quadrature points (assuming q is the quadrature order). This yields thesimplified computation of the entries of the matrices A and B:

Aiν :=∑

ν∈K

∫

Γϕi(x)Lt

ν(x)dΓx

=∑

ν∈K

∫

τi

Ltν(x)dΓx

=∑

ν∈K

∫Lt

ν(Φ(x1, y1))grτidx1dy1

=∑

ν∈K

grτi

∫ 1

0

∫ 1

0(1 − x2)Lt

ν(Φ(Ψ(x2, y2)))dx2dy2

≈∑

ν∈K

grτi

nq∑

k=1

nq∑

j=1

(1 − xk)Ltν(Φ(Ψ(xk, yj)))ωkωj.

160


Bjν :=∑

ν∈K

∫

Γϕj(y)〈n(y),∇yg(x

tν , y)〉dΓy

=∑

ν∈K

∫

τj

〈n(y),∇yg(xtν , y)〉dΓy

=∑

ν∈K

∫〈n(y),∇yg(x

tν ,Φ(x1, y1))grτi

dx1dy1

=∑

ν∈K

grτj

∫ 1

0

∫ 1

0(1 − x1)〈n(y),∇yg(x

tν ,Φ(Ψ(x2, y2)))dx2dy2

≈∑

ν∈K

grτj

nq∑

k=1

nq∑

l=1

(1 − xk)〈n(y),∇yg(xtν ,Φ(Ψ(xk, yl)))〉ωkωl.

In the case that diam(As) ≤ diam(At) we apply the degenerate kernel expansion in asimilar way, approximating the kernel function g by

g(x, y) =∑

ν∈K

g(x, xsν)Ls

ν(y).

The insertion of this expansion in (A.1) gives

K|t×s = ABT ,

Aiν :=∑

ν∈K

∫

Γϕi(x)g(x, x

sν)dΓx,

Bjν :=∑

ν∈K

∫

Γ〈n(y),∇yLs

ν(y)〉ϕj(y)dΓy.

Before we show how to practically compute the entries of the matrices A and B wewill closer discuss the evaluation of the term 〈n(y),∇yLs

ν(y)〉. Let y = (y1, . . . , yd). Bydefinition we have

Lsν(y) := Ld(y) =

d∏

j=1

Lνj(yj), and

〈n(y),∇yLsν(y)〉 = Nd := 〈∇Ld, (n1, . . . , nd)〉,∂mLm = Lm−1 ⊗ L′

m,

∂iLm = (∂iLm−1) ⊗ Lm.

161

A. APPENDIX

We compute Nd recursively by

Nd =

d∑

j=1

nj∂jLd(y)

= nd∂dLd(y) +

d−1∑

j=1

nj∂jLd(y)

= nd∂dLd(y) +

d−1∑

j=1

nj(∂jLd−1(y)) ⊗ Ld

= ndLd−1(y)L′

d + Nd−1Ld.

Once again we apply the quadrature rule for nq quadrature points, and obtain a formulafor the entries for the matrices A and B.

Aiν =∑

ν∈K

∫

Γϕi(x)g(x, x

tν)dΓx

=∑

ν∈K

∫

τi

g(x, xtν)dΓx

≈∑

ν∈K

grτi

nq∑

k=1

nq∑

j=1

g(Φ(Ψ(xk, yj)), xtν)ωkωj,

Bjν =∑

ν∈K

∫

Γϕj(y)〈n(y),∇yLs

ν(y)〉dΓy

=∑

ν∈K

∫

τj

〈n(y),∇yLsν(y)〉dΓy

≈∑

ν∈K

grτj

nq∑

k=1

nq∑

l=1

〈n(y),∇yLsν(Φ(Ψ(xk, yl)))〉ωkωl.

A.2. Piecewise Linear Ansatz

If the discretisation space is spanned by piecewise linear basis functions, then the numberof vertices is equal to the dimension of the space. The support of the basis function ϕi(x)is the union of the triangles which contain the vertex xi:

suppϕi(x) =⋃

τk∋xi

τk

A.2.1. Entries for the SLP Matrix

If (t, s) is an admissible pair of clusters we assume without loss of generality thatdiam(At) ≤ diam(As). The kernel function will be approximated as in (A.2) leading

162

A.2. Piecewise Linear Ansatz

to the low rank approximation of the matrix block G|t×s as G|t×s = ABT . The entriesof the matrices A and B will be computed as follows:

Aiν =∑

ν∈K

∫

supp ϕi(x)ϕi(x)Lt

ν(x)dΓx

=∑

ν∈K

∑

τk∋xi

∫

τk

ϕi(x)Ltν(x)dΓx

=∑

ν∈K

∑

τk∋xi

grτk

∫(1 − x1 − y1)Lt

ν(Φ(x1, y1))dx1dy1

=∑

ν∈K

∑

τk∋xi

grτk

∫ 1

0

∫ 1

0(1 − x2 − y2(1 − x2))Lt

ν(Φ(Ψ(x2, y2)))(1 − x2)dx2dy2

≈∑

ν∈K

∑

τk∋xi

grτk

nq∑

j=1

nq∑

p=1

(1 − xj − yp(1 − xj))Ltν(Φ(Ψ(xj , yp)))(1 − xj)ωjωp,

Bjν =∑

ν∈K

∫

supp ϕj(y)ϕj(y)g(x

tν , y)dΓy

=∑

ν∈K

∑

τk∋xj

∫

τk

ϕj(y)g(xtν , y)dΓy

=∑

ν∈K

∑

τk∋xj

grτk

∫(1 − x1 − y1)g(x

tν , (Φ(x1, y1)))dx1dy1

=∑

ν∈K

∑

τk∋xj

grτk

∫ 1

0

∫ 1

0(1 − x2 − y2(1 − x2))g(x

tν , (Φ(Ψ(x2, y2))))(1 − x2)dx2dy2

≈∑

ν∈K

∑

τk∋xj

grτk

nq∑

i=1

nq∑

p=1

(1 − xi − yp(1 − xi))g(xtν , (Φ(Ψ(xi, yp))))(1 − xi)ωiωp.

A.2.2. Entries for the DLP Matrix

As in the previous subsection, we obtain the low-rank approximation of the matrix blockK|t×s = ABT . We assume diam(At) ≤ diam(As). Then we compute the entries of the

163

A. APPENDIX

matrices A and B as:

Aiν =∑

ν∈K

∫

supp ϕi(x)g(x, xt

ν)ϕi(x)dΓx

=∑

ν∈K

∑

τk∋xi

∫

τk

g(x, xtν)ϕi(x)dΓx

=∑

ν∈K

∑

τk∋xi

grτk

∫(1 − x1 − y1)g(Φ(x1, y1), x

tν)dΓx

=∑

ν∈K

∑

τk∋xi

grτk

∫ 1

0

∫ 1

0(1 − x2 − y2(1 − x2))g(Φ(Ψ(x2, y2))))(1 − x2)dx2dy2

≈∑

ν∈K

∑

τk∋xi

grτk

nq∑

i=1

nq∑

p=1

(1 − xi − yp(1 − xi))g(xtν , (Φ(Ψ(xi, yp))))(1 − xi)ωiωp.

Bjν =∑

ν∈K

∫

supp ϕj(y)ϕj(y)〈n(y),∇Ls

ν(y)〉dΓy

=∑

ν∈K

∑

τk∋xj

∫

τk

ϕj(y)〈n(y),∇Lsν(y)〉dΓy

=∑

ν∈K

∑

τk∋xj

grτk

∫(1 − x1 − y1)〈n(y),∇Ls

ν((Φ(x1, y1)))〉dx1dy1

=∑

ν∈K

∑

τk∋xj

grτk

∫ 1

0

∫ 1

0(1 − x2 − y2(1 − x2))〈n(y),∇Ls

ν(Φ(Ψ(x2, y2))))〉(1 − x2)dx2dy2

≈∑

ν∈K

∑

τk∋xj

grτk

nq∑

i=1

nq∑

p=1

(1 − xi − yp(1 − xi))〈n(y),∇Lsν((Φ(Ψ(xi, yp))))〉(1 − xi)ωiωp.

164

Bibliography

[1] L. Banjai and W. Hackbusch. H- and H2-matrices for low and high frequencyHelmholz equation. Technical Report 17, Max Planck Institute for Mathematics inthe Sciences, 2005.

[2] R. E. Bank. PLTMG: A Software Package for Solving Elliptic Partial DifferentialEquations. User’s Guide 6.0. SIAM, 1990.

[3] E. Bansch. Local mesh refinement in 2 and 3 dimensions. Impact of Computing inScience and Engineering, 3:181–191, 1991.

[4] U. Baur and P. Benner. Factorised solution of Lyapunov equations based on hier-archical matrix arithmetics. Preprint 116, DFG Research Center ”Mathematics forKey Technologies”, Berlin, 2004.

[5] M. Bebendorf. Effiziente numerische Losung von Randintegralgleichungen unterVerwendung von Niedrigrang-Matrizen. PhD thesis, Universitat Saarbrucken, 2000.

[6] M. Bebendorf and W. Hackbusch. Existence of H-matrix approximants to the in-verse FE-matrix of elliptic operators with L∞-coefficients. Numerische Mathematik,95:1–28, 2003.

[7] M. Bebendorf and S. Rjasanow. Adaptive Low-Rank Approximation of CollocationMatrices. Computing, 70:1–24, 2003.

[8] S. Borm. H2-matrix arithmetics in linear complexity. Technical Report 47, MaxPlanck Institute for Mathematics in the Sciences, 2004. To appear in Computing.

[9] S. Borm. Approximation of integral operators by H2-matrices with adaptive bases.Computing, 74:249–271, 2005.

[10] S. Borm. Data-sparse approximation of non-local operators by H2-matrices. Tech-nical Report 44, Max Planck Institute for Mathematics in the Sciences, 2005. Sub-mitted to Linear Algebra and its Applications.

[11] S. Borm and L. Grasedyck. Low-rank approximation of integral operators by inter-polation. Computing, 72(3–4), 2002.

[12] S. Borm and L. Grasedyck. Hybrid cross approximation of integral operators. Nu-merische Mathematik, 101(2):221–249, 2005.

165

Bibliography

[13] S. Borm, L. Grasedyck, and W. Hackbusch. Hierarchical Matrices. Lecture notes,2003.

[14] S. Borm, L. Grasedyck, and W. Hackbusch. Introduction to hierarchical matriceswith applications. Engineering Analysis with Boundary Elements, 27:405–422, 2003.

[15] S. Borm and W. Hackbusch. H2-matrix approximation of integral operators byinterpolation. Applied Numerical Mathematics, 43:129–143, 2002.

[16] S. Borm and W. Hackbusch. Approximation of boundary element operators by adap-tive H2-matrices. Foundations of Computational Mathematics, 312:58–75, 2004.

[17] S. Borm and W. Hackbusch. Hierarchical quadrature of singular integrals. Com-puting, 74:75–100, 2004.

[18] S. Borm, M. Lohndorf, and J. M. Melenk. Approximation of integral operators byvariable-order interpolation. Numerische Mathematik, 99(4), 2005.

[19] S. Borm and J. Ostrowski. Fast evaluation of boundary integral operators arisingfrom an eddy current problem. Journal of Computational Physics, (193):67–85,2003.

[20] D. Braess. Finite Elemente: Theorie, schnelle Loser und Andwendungen in derElastizitatstheorie. Springer Verlag, 1997.

[21] C. Carstensen and S. Bartels. Each averaging technique yields reliable a poste-riori error control in FEM on unstructured grids. Part I: low order conforming,nonconforming, and mixed FEM. Math. Comp., 71:945–969, 2002.

[22] C. Carstensen and B.Faermann. Mathematical foundation of a posteriori errorestimates and adaptive mesh-refining algorithms for boundary integral equations ofthe first kind. Engineering Analysis with Boundary Elements, (25):497–509, 2001.

[23] C. Carstensen and D. Praetorius. Averaging techniques for the effective numericalsolution of symm’s integral equation of the first king. Technical report, TechnischeUniversitat Wien, 2004. submitted to SIAM J.Sci.Comp.

[24] W. Dahmen and R. Schneider. Wavelets on manifolds I: Construction and domaindecomposition. SIAM Journal of Mathematical Analysis, 31:184–230, 1999.

[25] J. M. Ford and E. E. Tyrtyshnikov. Combining kronecker product approximationwith discrete wavelet transforms to solve dense, function-related linear systems.SIAM J. Sci. Comput., 25(3):961–981, 2003.

[26] S. A. Goreinov, E. E. Tyrtyshnikov, and N. L. Zamarashkin. A theory of psedoskele-ton approximations. Lin. Alg. Appl., 261:1–22, 1997.

[27] L. Grasedyck. Theorie und Anwendungen Hierarchischer Matrizen. PhD thesis,Universitat Kiel, 2001.

166

Bibliography

[28] L. Grasedyck. Existence of a low rank or H approximant to the solution of aSylvester equation. Numer. Linear Algebra Appl., 11:371–389, 2004.

[29] L. Grasedyck. Adaptive recompression of H-matrices for BEM. Computing, 74:205–223, 2005.

[30] L. Grasedyck and S. Le Borne. H-matrix preconditioners in convection-dominatedproblems. Technical Report 62, Max Planck Institute for Mathematics in Sciences,2004. to appear in SIMAX.

[31] L. Grasedyck and W. Hackbusch. Construction and arithmetics of H-matrices.Computing, 70:295–334, 2003.

[32] L. Grasedyck and W. Hackbusch. A multigrid method to solve large scale Sylvesterequations. Technical Report 48, Max Planck Institute for Mathematics in Sciences,2004. submitted to SIAM J.Matrix Anal. Appl.

[33] L. Grasedyck, W. Hackbusch, and S.Le Borne. Adaptive geometrically balancedclustering of H-matrices. Computing, 73:1–23, 2004.

[34] L. Grasedyck, W. Hackbusch, and B. Khoromskij. Solution of large scale algebraicmatrix Riccati equations by use of hierarchical matrices. Computing, 70:121–165,2003.

[35] L. Grasedyck, W. Hackbusch, and S. LeBorne. Adaptive refinement and clusteringof H-matrices. Technical Report 106, Max Planck Institute of Mathematics in theSciences, 2001.

[36] L. Grasedyck, R. Kriemann, and S. Le Borne. Paraller Black Box Domain Decom-position Based H-LU Preconditioning. Technical Report 115, Max Planck Institutefor Mathematics in Sciences, 2005. submitted to Mathematics of Computation.

[37] L. Greengard and V. Rokhlin. A new version of the fast multipole method for theLaplace in three dimensions. In Acta Numerica 1997, pages 229–269. CambridgeUniversity Press, 1997.

[38] W. Hackbusch. Elliptic Differential Equations. Theory and Numerical Treatment.Springer-Verlag Berlin, 1992.

[39] W. Hackbusch. Integral Equations. Theory and Numerical Treatment. Birkhauser,1995.

[40] W. Hackbusch. A sparse matrix arithmetic based on H-matrices. Part I: Introduc-tion to H-matrices. Computing, 62:89–108, 1999.

[41] W. Hackbusch and B. Khoromskij. A sparse H-matrix arithmetic: General com-plexity estimates. J. Comp. Appl. Math., 125:479–501, 2000.

167

Bibliography

[42] W. Hackbusch and B. Khoromskij. A sparse matrix arithmetic based on H-matrices.Part II: Application to multi-dimensional problems. Computing, 64:21–47, 2000.

[43] W. Hackbusch, B. Khoromskij, and S. A. Sauter. On H2-matrices. In H. Bungartz,R. Hoppe, and C. Zenger, editors, Lectures on Applied Mathematics, pages 9–29.Springer-Verlag, Berlin, 2000.

[44] W. Hackbusch and Z. P. Nowak. On the fast matrix multiplication in the boundaryelement method by panel clustering. Numerische Mathematik, 54:463–491, 1989.

[45] K. Helms. Implementierungstechniken in der Numerik, Hierarchische Darstellun-gen. PhD thesis, dissertation.de, 2004.

[46] M. Lintner. The eigenvalue problem for 2d Laplacian in H-matrix arithmetics andapplication to the heat and wave equation. Computing, (72):293–323, 2004.

[47] V. Rokhlin. Rapid solution of integral equations of classical potential theory. Journalof Computational Physics, 60:187–207, 1985.

[48] H. Schulz and O.Steinbach. A new aposteriori error estimator in adaptive directboundary element methods: the Dirichlet problem. Calcolo, (37):79–96, 2000.

[49] O. Steinbach. Adaptive Boundary Element Methods based on the computationalSchemes for Sobolev norms. SIAM J. Sci. Comput., 22(2):604–616, 2000.

[50] E. Tyrtyshnikov. Incomplete cross approximation in the mosaic-skeleton method.Computing, 64:367–380, 2000.

[51] C. H. Wolters, L. Grasedyck, and W. Hackbusch. Efficient Computation of LeadField Bases and Influence Matrix for the FEM-based EEG and MEG Inverse Prob-lem. Part I: Complexity Considerations. Inverse Problems, 20:1099–1116, 2004.

168

Daten zur Autorin:

Name Jelena Djokic

geboren am 23. Januar 1978 in Kragujevac, Serbien und Montenegro

10/1996-10/2000 Studium an der Fakultat fur Mathematik, Universitat in Belgrad,Schwerpunkt: Theoretische Mathematik mit Anwendungen, Abschluss: Diplom(9.61 von 10.00)

01/2001- Doktorandin am Max-Planck-Institut fur Mathematikin den Naturwissenschaften

Bibiliographische Daten

Efficient Update of Hierarchical Matrices in the case of Adaptive Discretisation Schemes(Effizient Aufdatierung von hierarchischen Matrizen bei adaptiven Diskretisierungen)Jelena DjokicUniversitat Leipzig, Dissertation169 Seiten, 66 Abbildungen, 51 Referenzen

Selbststandigkeitserklarung

Hiermit erklare ich, die vorliegende Dissertation selbstandig und ohne unzulassige fremdeHilfe angefertigt zu haben. Ich habe keine anderen als die angefuhrten Quellen und Hilfs-mittel benutzt und samtliche Textstellen, die wortlich oder sinngemaß aus veroffentlichenoder unveroffentlichten Schriften entnommen wurden, und alle Angaben, die auf mundlichenAuskunften beruhen, als solche kenntlich gemacht. Ebenfalls sind alle von anderen Per-sonen bereitgestellten Materialien oder erbrachten Dienstleistungen als solche gekennze-ichnet.

Leipzig, 31. Juli 2006

. . . . . . . . . . . . . . . . . . . . . . . . . . .(Jelena Djokic)

eﬃcient update of hierarchical matrices in the case of ...like application of h-matrices in the...

Documents