Computing Functions of Matrices via

Contour Integrals and the Trapezoid Rule

Anthony Kellems

6/21/2005

Abstract

Matrix functions f(A) are often motivated by PDEs, but computation of these functions is not always easy. The Cauchy integral theorem is a useful tool in complex analysis for computing f(a), but it can also be effective for computing f(A). Using a contour which encloses the spectrum of A, this integral can be approximated by the trapezoid rule, which is known to be exponentially accurate under suitable conditions. The simple case of computing a matrix square root is explored to unlock the potential of this method, and then other functions are considered. Particular focus is devoted to the matrix exponential e^A and its use in the heat equation. Incorporating new research from Weideman [8], we demonstrate that the optimized Talbot contour provides extraordinary accuracy for very few trapezoid rule points.

1 Introduction

Contour integrals are important to much of complex analysis, forming the key component of the Cauchy integral theorem. The trapezoid rule is important for the approximation of integrals because of its exponential accuracy under certain circumstances. However, the technique of combining the trapezoid rule with contour integration is rather sparsely referenced or used. Combining the Cauchy integral theorem and the trapezoid rule yields a recipe for a very powerful algorithm to compute matrix functions.

These functions are generalizations of the scalar case, but they are not always easy to compute. The matrix square root is not the square root of each entry of A, but rather the matrix B such that B^2 = A. Similarly, the matrix exponential e^A is not e raised to the (j, k)-entry of A, but the matrix B = I + A + A^2/2! + A^3/3! + ⋯. Both of these functions require significant effort to produce highly accurate solutions, and they can be very prone to rounding error in some cases. An easier and more accurate way to compute these and similar functions may sometimes be the technique we describe.

The Cauchy integral theorem states that the value of a function can be computed by an integral. Given a function f(z) and a value z = a, we can compute f(a) by

f(a) = (1/2πi) ∫_Γ f(z)/(z − a) dz,

where Γ is a contour in C such that Γ encloses a and f(z) is analytic on and inside Γ. Generalizing this formula to the matrix case yields

f(A) = (1/2πi) ∫_Γ f(z) (zI − A)^{−1} dz,   (1)

where (zI − A)^{−1} is the resolvent of A at z and where Γ encloses the spectrum of A.

Cauchy's integral formula implies that the contour Γ can be deformed to any shape as long as it encloses all eigenvalues of A. The above formula can be simplified by taking Γ to be a circle of radius r centered at some point z_c, defined by z = z_c + re^{iθ}. This substitution gives

f(A) = (1/2π) ∫_0^{2π} f(z) (zI − A)^{−1} re^{iθ} dθ.

Notice that we can equivalently write re^{iθ} = z − z_c, and substituting this into the equation produces a simple integral for a matrix function:

f(A) = (1/2π) ∫_0^{2π} (z − z_c) f(z) (zI − A)^{−1} dθ.

Approximating this integral using the trapezoid rule proves to be clean and accurate. Let the integrand be the function g(θ). Taking N equally spaced points on Γ (and hence step length h = 2π/N), and noting that g(0) = g(2π), gives

f(A) ≈ (1/N) Σ_{j=0}^{N−1} g(θ_j).   (2)

This is the general formula for computing a matrix function when Γ is a circle.
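Formula (2) translates almost line-for-line into code. The experiments in this paper were run in MATLAB; the following is our own minimal NumPy sketch (the function name contour_funm and the test matrix are ours, not from the paper). It forms the resolvent explicitly at each quadrature node, which is wasteful but mirrors the formula:

```python
import numpy as np

def contour_funm(f, A, zc, r, N):
    """Approximate f(A) by the N-point trapezoid rule on the circle
    z = zc + r*exp(i*theta) enclosing the spectrum of A, per formula (2):
    f(A) ~= (1/N) * sum_j (z_j - zc) * f(z_j) * (z_j I - A)^{-1}."""
    n = A.shape[0]
    I = np.eye(n)
    F = np.zeros((n, n), dtype=complex)
    for theta in 2 * np.pi * np.arange(N) / N:
        z = zc + r * np.exp(1j * theta)
        F += (z - zc) * f(z) * np.linalg.inv(z * I - A)
    return F / N

# Sanity check with an entire function: for f(z) = z^2 the quadrature
# should reproduce A @ A up to rounding, since all poles of the
# integrand (the eigenvalues of A) lie well inside the contour.
A = np.array([[3.0, 0.2], [0.1, 3.2]])   # eigenvalues near 2.93 and 3.27
B = contour_funm(lambda z: z**2, A, zc=3.0, r=2.0, N=64)
```

Because the eigenvalues sit deep inside the circle, the quadrature error decays geometrically in N, so the imaginary part of B should be at rounding level.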

2 Matrix Square Root

The first test case for this method is the matrix square root, √A, or equivalently A^{1/2}. The function f(z) = z^{1/2} can be defined to be analytic everywhere in C except along a branch cut through z = 0, so Γ must avoid the origin.¹ Consider a matrix A whose eigenvalues lie approximately in the unit disk centered at z_c = 3. The contour is a circle of radius r = 2, parameterized as z = 3 + 2e^{iθ}, and from (2) we obtain

A^{1/2} ≈ (1/N) Σ_{j=0}^{N−1} (z_j − 3) z_j^{1/2} (z_j I − A)^{−1}.

Figure 1 displays the convergence of this method for matrices of dimension 4, 16, and 64. In each case exponential accuracy to machine precision is observed. Few points need to be used in this case because the eigenvalues are well-clustered. For larger spectral radii or more scattered eigenvalues the convergence will be slower. The value of N required to achieve machine precision accuracy can range from just a few points for easier problems to hundreds of thousands for quite difficult ones [1]. Comparing the trapezoid rule method with MATLAB's sqrtm.m reveals both an advantage and a drawback (which can be corrected) to this method.

Figure 1: Convergence curves for the trapezoid rule applied to f(A) = A^{1/2}. [Axes: N = number of points in the trapezoid rule vs. ‖B^2 − A‖, for dim(A) = 4, 16, 64.]

¹Analyticity of a function can be verified via the Cauchy–Riemann equations ∂f/∂x = −i ∂f/∂y.

The most important aspect of the trapezoid rule approximation is that it achieves better accuracy than the current standard matrix square root algorithm. As shown in Figure 2a, it consistently attains about 1 digit greater accuracy than sqrtm.m, measured in either the Frobenius norm or the 2-norm.² The drawback is that it takes a great deal longer to compute f(A), with the disparity growing in a seemingly exponential fashion (Fig. 2b). Looking at the mechanics of sqrtm.m provides a way to eliminate this drawback in practice.

Figure 2: a) Convergence curves for the trapezoid rule method with N = 128 versus MATLAB's algorithm; solid lines are 2-norms, dashed lines are Frobenius norms. b) Time comparisons for the two methods (sqrtmtk.m, the trapezoid rule code, vs. sqrtm.m) as dim(A) increases. [Axes: N = dimension of A vs. ‖B^2 − A‖ (a) and time in seconds (b).]

MATLAB’s algorithm computes the matrix square root by the following process:

• Compute the Schur factorization A = QTQ^T.

• If T is diagonal, set R_{j,j} = sqrt(T_{j,j}).

• Else operate column-by-column on T to produce the upper triangular square root matrix R.

The matrix square root is then returned as B = QRQ^T. It is easy to verify that

T = R^2  ⇒  QTQ^T = QR^2Q^T  ⇒  A = (QRQ^T)(QRQ^T)  ⇒  A = B^2.

This suggests a way to speed up the trapezoid rule method: implement a preliminary factorization of A, operate on the factored matrix, then combine the factors at the end of the computation.

²MATLAB's function returns ‖B^2 − A‖_F/‖A‖_F for the residual.


Three candidate factorizations were explored: Schur, eigenvalue, and Hessenberg. Using the Schur factorization and the orthogonality of Q we write

z_j I − A = z_j I − QTQ^* = Q(z_j I − T)Q^*.

Substituting this into (2) yields

f(A) ≈ (1/N) Q [ Σ_{j=0}^{N−1} (z_j − z_c) f(z_j) (z_j I − T)^{−1} ] Q^*.   (3)

Similar analysis holds for the eigenvalue decomposition A = V ΛV^{−1} and the Hessenberg factorization A = PHP^*. Thus we can compute the resolvent of a simpler matrix, a nearly triangular one. A comparison of the residual errors and timings for these methods is shown in Figure 3.

Figure 3: a) Time to compute A^{1/2} using various factorization methods as dim(A) increases. b) Convergence curves for the factorization methods. c) The same curves as in (b) but with the eigenvalue decomposition curve removed to show detail of the others. [Axes: N = dimension of A vs. time in seconds (a) and ‖B^2 − A‖_F (b, c); methods: no factorization, Schur, Hessenberg, eigenvalue, and sqrtm.m (MATLAB).]
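Of the three factorizations, the eigenvalue decomposition is the easiest to sketch, because the resolvent of the factored matrix is diagonal. The NumPy code below is our own illustration (names and test matrix are ours; the paper's experiments are in MATLAB), and it inherits the robustness caveat discussed in the text: it assumes A is diagonalizable.

```python
import numpy as np

def contour_funm_eig(f, A, zc, r, N):
    """Trapezoid-rule contour evaluation of f(A) after the eigenvalue
    decomposition A = V diag(lam) V^{-1}.  The resolvent of the factored
    matrix is diagonal, (zI - Lambda)^{-1} = diag(1/(z - lam_k)), so each
    quadrature node costs O(n) instead of an O(n^3) dense inversion."""
    lam, V = np.linalg.eig(A)
    acc = np.zeros_like(lam, dtype=complex)
    for theta in 2 * np.pi * np.arange(N) / N:
        z = zc + r * np.exp(1j * theta)
        acc += (z - zc) * f(z) / (z - lam)      # diagonal resolvent
    return V @ np.diag(acc / N) @ np.linalg.inv(V)

# Matrix square root of a matrix with spectrum near z = 3, as in Section 2.
# The principal branch of sqrt is analytic on the whole disk |z - 3| < 2.
A = np.array([[3.0, 0.2], [0.1, 3.2]])
B = contour_funm_eig(np.sqrt, A, zc=3.0, r=2.0, N=128)
```

The per-node cost drops from a dense solve to a vector divide; the price, as noted above, is that defective matrices have no such decomposition and ill-conditioned eigenvector matrices V cost accuracy.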

Clearly the eigenvalue decomposition is the fastest of the three factorizations, nearing the speed of MATLAB's function, but it comes at a cost. If we are willing to lose a digit or so of accuracy we can use the eigenvalue decomposition to compute f(A), but then the method would be worse than the current standard, sqrtm.m. Also, only non-defective matrices have eigenvalue decompositions, while all matrices have Schur and Hessenberg factorizations, so using the former would make for a less robust method [6]. Note, however, that the Schur factorization has almost exactly the same error as MATLAB's algorithm (the two curves directly coincide in Fig. 3c). The Hessenberg factorization yields the best accuracy of the three methods, and we would like to take advantage of that.

In practice we often do not want f(A) itself, but rather some quantity which uses f(A), like the matrix-vector product f(A)b. Multiplying (3) by b produces

f(A)b ≈ (1/N) Q Σ_{j=0}^{N−1} (z_j − z_c) f(z_j) (z_j I − T)^{−1} v,   (4)

where v = Q^*b is a vector. Gaussian elimination can now be used to compute (z_j I − T)^{−1}(Q^*b), implemented in MATLAB via the backslash command \, which gives the trapezoid rule method the speed-up it needs. MATLAB's built-in routine has no way to benefit from this technique: B = f(A) must be explicitly computed and then multiplied by b.

Figure 4: a) Time to compute A^{1/2}b using various factorization methods as dim(A) increases. b) Convergence curves for the factorization methods, using the unfactorized trapezoid rule method as the best estimate of Bb. [Axes: N = dimension of A vs. time in seconds (a) and ‖B^2 b − B_est^2 b‖ (b); methods: Schur (Trap.), Hess. (Trap.), MATLAB.]

The benefits of applying Gaussian elimination to the trapezoid rule when computing f(A)b are shown in Figure 4. The Hessenberg factorization turns out to be the best choice for both speed and accuracy. In fact, as dim(A) → ∞ the Hessenberg factorization is faster than MATLAB! Therefore, depending on the spectrum of A, the trapezoid rule for computing the matrix square root can be more effective than MATLAB's algorithm.
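The idea behind (4) — replace each resolvent application by one linear solve against a vector — can be sketched in NumPy as follows (our sketch, not the paper's MATLAB code; np.linalg.solve plays the role of MATLAB's backslash, and for brevity we solve with A itself rather than its Schur or Hessenberg form, so each solve is dense rather than nearly triangular):

```python
import numpy as np

def contour_funm_vec(f, A, b, zc, r, N):
    """Approximate f(A) @ b on a circular contour without ever forming a
    resolvent matrix: each node contributes (z_j - zc) f(z_j) x_j, where
    x_j solves the shifted system (z_j I - A) x_j = b."""
    n = A.shape[0]
    I = np.eye(n)
    acc = np.zeros(n, dtype=complex)
    for theta in 2 * np.pi * np.arange(N) / N:
        z = zc + r * np.exp(1j * theta)
        acc += (z - zc) * f(z) * np.linalg.solve(z * I - A, b)
    return acc / N

# Example: A^{1/2} b for the spectrum-near-3 setup of Section 2.
A = np.array([[3.0, 0.2], [0.1, 3.2]])
b = np.array([1.0, -1.0])
x = contour_funm_vec(np.sqrt, A, b, zc=3.0, r=2.0, N=128)
```

With a preliminary Hessenberg or Schur reduction, each of these solves becomes a (nearly) triangular back-substitution, which is where the speed-up reported in Figure 4 comes from.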

3 Matrix Exponential

Another important example of a matrix function is the matrix exponential, e^A. It is defined analogously to the scalar case as the infinite sum

e^A = Σ_{n=0}^{∞} A^n/n! = I + A + A^2/2! + A^3/3! + ⋯.

There are many ways to compute the matrix exponential, each with varying degrees of success and accuracy [4]. The standard method, which tends to be quite accurate, uses a Padé approximation together with scaling and squaring; this is implemented in MATLAB as the function expm.m.³

The motivation for this function comes from differential equations. Consider the heat equation, the second-order partial differential equation

u_t = u_xx   (5)

with Dirichlet boundary conditions and

x ∈ [−π, π],  t ≥ 0,  u(x, 0) = g(x).

Discretizing u_xx with the standard second-order finite-difference scheme, incorporating the boundary conditions, yields u_xx ≈ Du, and (5) becomes

∂u/∂t = Du.   (6)

It is known that a differential equation of this form has the exact solution

u(x, t) = e^{tD}u_0,   (7)

and hence it requires the computation of a matrix exponential e^A = e^{tD}. The eigenvalues of A lie on the negative real axis in the interval (−4t/h^2, 0), where h is the mesh width, and are known to be exactly

λ_j = (t/h^2) [2 cos(jπ/(N + 1)) − 2],  N = dim(A),  j = 1, …, N.   (8)

The function f(z) = e^z is analytic everywhere, so the conditions for the Cauchy integral theorem are satisfied. Thus via (4) the trapezoid rule method can be applied to solve this matrix equation.
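As a concrete check of (8), one can build the second-difference matrix D and compare the scaled eigenvalues of tD with the formula. The construction below is our own sketch of the standard discretization (the grid and names are ours):

```python
import numpy as np

def second_difference(m, a=-np.pi, b=np.pi):
    """Standard second-order finite-difference matrix for u_xx on (a, b)
    with homogeneous Dirichlet boundary conditions: m interior points,
    mesh width h, and D = (1/h^2) * tridiag(1, -2, 1)."""
    h = (b - a) / (m + 1)
    D = (np.diag(-2.0 * np.ones(m)) +
         np.diag(np.ones(m - 1), 1) +
         np.diag(np.ones(m - 1), -1)) / h**2
    return D, h

# Eigenvalues of tD per formula (8): lam_j = (t/h^2)(2 cos(j pi/(N+1)) - 2),
# all lying on the negative real axis in (-4t/h^2, 0).
m, t = 50, 1.0
D, h = second_difference(m)
j = np.arange(1, m + 1)
lam_formula = (t / h**2) * (2 * np.cos(j * np.pi / (m + 1)) - 2)
```

Comparing lam_formula against the numerically computed spectrum of tD confirms the interval claimed above.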

Figure 5: a) Residuals for computing e^A b (MATLAB vs. trapezoid rule method with Hessenberg factorization) and b) time to compute e^A b as dim(A) increases, where A = tD. [Axes: N = dimension of A vs. ‖u_M − u_T‖_2 (a) and time in seconds (b).]

Assessing the accuracy of the method in this case is more difficult because tD does not have a principal natural logarithm, due to the spectrum lying on the negative real axis. Residuals for matrices with principal natural logarithms do achieve better accuracy than those from MATLAB's algorithm expm.m [3]. However, computing residuals for matrices such as tD produces incorrect results, so for the heat equation examples we will compute the residual by comparing ‖u_M − u_T‖, where u_M and u_T are the solutions computed by MATLAB's algorithm and the trapezoid rule method, respectively. Admittedly this has its drawbacks, but it is the best metric we have so far.

³Higham has improved upon this version of the matrix exponential, but his code was not used in this study. He may also have an improved matrix square root function, although his webpage did not mention such a code.

Timing results for computing e^A b via the trapezoid rule method are not as good as those of the matrix square root case. Although Gaussian elimination does yield a speed increase, it is not enough, as Figure 5b shows, to compete with MATLAB's expm.m algorithm as dim(A) → ∞. For problems of small to moderate size, however, the times are so small that the trapezoid rule is competitive.

Figure 6: a) Residual norm at each time step for computing e^A b (MATLAB vs. trapezoid rule method), where A = tD, for N = 64, 128, 256, 512, 1024 trapezoid rule points (∆t = 0.1, 50 x-intervals). b) Radius of Γ as a function of time for the plots in (a).

For the heat equation, Figures 5a and 6a demonstrate a problem with computing e^{tD}b: the residuals get worse as dim(D) increases and as time evolves. This problem is surmountable, and it stems from the location of the eigenvalues of A. As t or dim(D) increases, the interval containing the eigenvalues grows, and thus Γ must grow accordingly (Fig. 6b). However, recall that the spectral decomposition of A = tD gives

e^{tD} = e^{tλ_1}P_1 + ⋯ + e^{tλ_N}P_N,   (9)

where P_j is the spectral projection for λ_j. Since tλ_j → −∞ as t → ∞ and ‖P_j‖_2 = 1, the terms e^{tλ_j}P_j corresponding to the largest-magnitude eigenvalues decay to zero rapidly as t increases. In fact, the jth term of (9) will decay to below machine precision when

tλ_j < −15 ln(10),  j = 1, …, N.   (10)

This tells us that as t increases there are fewer eigenvalues that contribute significantly to the whole function e^{tD}. The critical observation for the trapezoid rule method applied to the heat equation is that Γ does not need to enclose the whole spectrum of tD, just those eigenvalues that contribute at least machine precision in (9). In fact, if we fix the center z_c and radius r of Γ to be

z_c = −15 ln(10)/2,  r = |z_c| + 1,   (11)
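A sketch of the fixed-contour prescription (11), assuming A = tD has its spectrum on the negative real axis (the implementation and test problem below are ours, not the paper's MATLAB code):

```python
import numpy as np

def expm_vec_fixed_contour(A, b, N=512):
    """e^A b for A with spectrum on the negative real axis, via the N-point
    trapezoid rule on the fixed circle of (11): zc = -15 ln(10)/2,
    r = |zc| + 1.  Eigenvalues of A to the left of this circle contribute
    less than machine precision to e^A (criterion (10)), so they may be
    safely left outside the contour."""
    zc = -15.0 * np.log(10.0) / 2.0
    r = abs(zc) + 1.0
    n = A.shape[0]
    I = np.eye(n)
    acc = np.zeros(n, dtype=complex)
    for theta in 2 * np.pi * np.arange(N) / N:
        z = zc + r * np.exp(1j * theta)
        acc += (z - zc) * np.exp(z) * np.linalg.solve(z * I - A, b)
    return (acc / N).real

# Heat-equation example: A = tD, D = second-difference matrix on [-pi, pi].
m, t = 20, 5.0
h = 2 * np.pi / (m + 1)
D = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1) +
     np.diag(np.ones(m - 1), -1)) / h**2
x = np.linspace(-np.pi + h, np.pi - h, m)
u0 = np.sin(x) + np.pi - x**2 / np.pi
u = expm_vec_fixed_contour(t * D, u0)
```

Because the contour is fixed, the number of quadrature points needed no longer grows with t or with dim(D).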

Choosing the function u(x, 0) = sin(x) + π − x^2/π to be the initial condition g(x) for the heat equation, Fig. 7a displays the residuals computed with Γ using the values in (11), and Fig. 7b shows the computed solution for N = 512 points in the trapezoid rule. Observe that the solution is near machine precision with N = 512. Fixing r and z_c for the contour Γ produced a drastic improvement over the method with an ever-growing contour. In fact, Fig. 6a indicates that many thousands of points would be required to reach high accuracy if Γ is allowed to grow to accommodate the spectrum of tD, but this would be inefficient. The curve in Fig. 6a with N = 1024 took 203.3 seconds to compute and was not even accurate over t ∈ [0, 10]. By contrast, using the values in (11) to fix Γ, the curve in Fig. 7a for N = 512 (and hence the solution in Fig. 7b) took only 98.6 seconds.

Figure 7: a) Residual norm at each time step for computing e^A b with different N for the trapezoid rule (∆t = 0.1, 50 x-intervals; N = 64, 128, 256, 512). b) Solution to u_t = u_xx computed via the trapezoid rule method with N = 512.

To give a better idea of the accuracy of the trapezoid rule method we present an example for which the exact solution of (6) is known. Let u(x, 0) = sin(x), so that the exact solution of the heat equation is

u(x, t) = e^{−t} sin(x).   (12)

Plotting the 2-norm of (12) at each timestep against the 2-norms of the trapezoid rule method solution and MATLAB's solution results in the linear loglog graph in Figure 8a, with the residual norms of the solution (trap. rule vs. MATLAB) in Fig. 8b.

Figure 8: a) 2-norms of the solution to the heat equation at each time step (∆t = 0.1, 50 x-intervals, N = 512): MATLAB, trapezoid rule on the uniform mesh, and the exact solution. b) Residual norm at each time step for the trapezoid rule method (compared to MATLAB).

The behavior of the solution norms as time evolves is captured well by both the trapezoid rule method and MATLAB's method, but there is a small, nearly constant log-space gap between the exact solution and these two approximations. I hypothesize that this is due to discretization error, since tD is a spatial differentiation matrix. However, this gap increases as dim(A) → ∞, contrary to the expectation that the discretization error of the second-order scheme should shrink like (∆x)^2 = h^2. This phenomenon was not resolved at the conclusion of our research, so the nature of this gap between true and approximate solutions remains a question for further study.

Thus the matrix exponential can be accurately calculated by the trapezoid rule method, though it requires more effort than calculating the matrix square root. However, it is extremely accurate for calculating e^A when A has its spectrum on the negative real axis. Since this occurs naturally in many PDE problems, the trapezoid rule method can be widely applied to real-life problems.

4 Talbot Contours

The trapezoid rule as demonstrated with circular Γ is very effective and accurate, yet with a different contour even better performance can be achieved. In a 1979 paper, Talbot proposed a contour based on cot(θ) for use in computing the inverse Laplace transform [5]. This contour, which is a deformation of the Bromwich contour, completely encircles the negative real axis, and thus it can be used with the trapezoid rule method to compute matrix functions faster and more accurately for A whose eigenvalues lie on the negative real axis. It is particularly applicable, then, to the solution of PDEs such as the heat equation.

The original Talbot contour is defined as

z(θ) = σ + μ(θ cot θ + νiθ),  −π ≤ θ ≤ π,   (13)

with parameters (σ, μ, ν), where σ controls translation of the contour right or left, μ controls the location of the extreme points in the trapezoid rule, and μν controls the relative spacing of points on the contour. Weideman optimized these parameters for use with parabolic PDEs, finding that

σ = −0.4841 N/t,  μ = 0.6443 N/t,  ν = 0.5653,

where n = 2N is the number of trapezoid rule points and t is the time variable in the PDE [8]. This contour is shown in Figure 9 with the points used in the trapezoid rule overlaid on top.

Figure 9: The optimized Talbot contour, as derived by Weideman, with n = 32. [Axes: real axis vs. imaginary axis.]

To use this contour as Γ in the trapezoid rule method for approximating matrix functions we differentiate (13):

dz = μ(cot θ − θ/sin^2 θ + νi) dθ.   (14)

Inserting this into the Cauchy integral formula from (1) yields

f(A) = (1/2πi) ∫_{−π}^{π} f(z) (zI − A)^{−1} μ(cot θ − θ/sin^2 θ + νi) dθ.   (15)

Defining g(θ) to be the integrand, and recognizing that g(−π) and g(π) are not necessarily equal, gives the n-point trapezoid rule formula

f(A) ≈ (h/2πi) [ (1/2)g(θ_0) + Σ_{j=1}^{n−2} g(θ_j) + (1/2)g(θ_{n−1}) ].   (16)

This new formula is in fact very accurate as well, but more importantly it is much faster. In fact, we reach near machine precision with just N = 16 (n = 32), as shown in Figure 10a. For the heat equation the acceleration is extraordinary, attested to in Figure 10b, with greater than 10 digits of accuracy reached over the whole interval t ∈ [0, 10] with initial condition u(x, 0) = sin(x) + π − x^2/π.

Figure 10: a) Convergence of the trapezoid rule method using the optimized Talbot contour as Γ, as N varies (N = midpoint rule nodes, n = 2N points, with scaling by t). b) Residual norm at each time step for computing e^A b with N = 16 and N = 32 for the trapezoid rule method with Talbot contours (∆t = 0.1, 50 x-intervals).
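The Talbot-contour method of (13)–(16) can be sketched as follows. We use midpoint nodes on (−π, π), as in Figure 10, which keeps the singular endpoints θ = ±π out of the sum; the parameter values are the ones quoted above, and the implementation and test problem are ours (exercised here at t = 1, so no ambiguity arises about whether the N/t scaling lives in the z-plane or the Laplace variable):

```python
import numpy as np

def expm_vec_talbot(A, t, b, Nhalf=24):
    """e^A b for A = tD with spectrum on the negative real axis, using
    z(theta) = sigma + mu*(theta*cot(theta) + i*nu*theta) with Weideman's
    parameters sigma = -0.4841 N/t, mu = 0.6443 N/t, nu = 0.5653 and
    n = 2N midpoint nodes on (-pi, pi), our sketch of (13)-(16)."""
    N = Nhalf
    n = 2 * N
    sigma = -0.4841 * N / t
    mu = 0.6443 * N / t
    nu = 0.5653
    dim = A.shape[0]
    I = np.eye(dim)
    hstep = 2 * np.pi / n
    acc = np.zeros(dim, dtype=complex)
    for j in range(n):
        theta = -np.pi + (j + 0.5) * hstep
        cot = np.cos(theta) / np.sin(theta)
        z = sigma + mu * (theta * cot + 1j * nu * theta)
        dz = mu * (cot - theta / np.sin(theta)**2 + 1j * nu)   # z'(theta)
        acc += np.exp(z) * np.linalg.solve(z * I - A, b) * dz
    return (hstep * acc / (2j * np.pi)).real

# Heat-equation test problem: A = tD, D = second-difference matrix.
m, t = 20, 1.0
h = 2 * np.pi / (m + 1)
D = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1) +
     np.diag(np.ones(m - 1), -1)) / h**2
x = np.linspace(-np.pi + h, np.pi - h, m)
u0 = np.sin(x)
u = expm_vec_talbot(t * D, t, u0)
```

Near θ = ±π the factor e^z underflows to zero long before z'(θ) blows up, so the extreme nodes contribute nothing, exactly the behavior that lets the contour wrap around an arbitrarily long negative spectrum.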

To underscore the effectiveness of using Talbot contours, Table 1 summarizes the speed and accuracy of the unmodified trapezoid rule method, the modified (fixed Γ) method, and the Talbot contour method, as applied to the heat equation.

Table 1: Effectiveness of trapezoid rule methods for solution of the heat equation

Type       | N    | Speed     | Avg. accuracy over t ∈ [0, 10]
Unmodified | 1024 | 203.3 sec | Inaccurate
Fixed Γ    | 512  | 98.6 sec  | 1e-13
Talbot     | 16   | 10.2 sec  | 1e-13

For these experiments dim(A) = 50; however, it is obvious from the acceleration provided by the Talbot contour that f(A) can be computed for larger A in reasonable amounts of time. The overall impact of Talbot contours is thus the ability to solve larger problems faster and more accurately.


Matrix Cosine and Sine

More esoteric functions can be computed using the trapezoid rule; good examples are the matrix cosine and sine, cos(A) and sin(A). These functions are defined in the complex plane as

cos(A) = (e^{iA} + e^{−iA})/2,  sin(A) = (e^{iA} − e^{−iA})/2i.   (17)

This naturally leads to the conclusion that perhaps these functions could be computed by first computing e^{iA} and e^{−iA} and combining them, but Higham notes that this suffers from cancellation errors in floating-point arithmetic [2]. Computational tests revealed that this is indeed true. The Talbot contours are essentially useless here as well, since neither cos(z) nor sin(z) decays as z → ∞. The standard circular contour is used to compute these matrix functions, which means that speed and accuracy will depend highly upon the location of the eigenvalues of A. However, because of the periodicity of these functions, an elliptic contour may be a better choice.

The motivation to compute trigonometric matrix functions can be found once again in PDEs, this time in the solution of a second-order system. The differential system

d^2y/dt^2 + Ay = 0,  y(0) = y_0,  y′(0) = y′_0,   (18)

can be thought of as a spatially-discretized wave equation

v^2 ∂^2ψ/∂x^2 = ∂^2ψ/∂t^2   (19)

with boundary conditions ψ(a, t) = c_1, ψ(b, t) = c_2 and initial conditions ψ(x, 0) = f(x), ∂ψ/∂t(x, 0) = f′(x). The matrix A is thus the discretization matrix scaled by −v^2, so A = −v^2 D.

The solution to (18) is given exactly as

y(t) = cos(√A t) y_0 + (√A)^{−1} sin(√A t) y′_0,   (20)

and thus arises the need to calculate the matrix cosine and matrix sine. Numerical results have indicated that not only is the trapezoid rule slow for the solution of the wave equation, but it is also quite inaccurate for large v and t.
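Formula (20) itself can be verified directly, without contours, by evaluating cos(√A t) and (√A)^{−1} sin(√A t) through an eigendecomposition of the symmetric positive definite A = −v^2 D. The sketch below is ours, and is a plain direct evaluation, not the contour method:

```python
import numpy as np

def wave_solution(A, y0, yp0, t):
    """Evaluate the exact solution (20) of y'' + A y = 0:
        y(t) = cos(sqrt(A) t) y0 + (sqrt(A))^{-1} sin(sqrt(A) t) y0',
    via an eigendecomposition of the symmetric positive definite A.
    This is a direct evaluation used only to illustrate formula (20)."""
    lam, V = np.linalg.eigh(A)
    s = np.sqrt(lam)                    # sqrt(A) has eigenvalues sqrt(lam)
    c0 = V.T @ y0
    c1 = V.T @ yp0
    return V @ (np.cos(s * t) * c0 + (np.sin(s * t) / s) * c1)

# Wave-equation setup: A = -v^2 D is symmetric positive definite.
v, m = 2.0, 20
h = 2 * np.pi / (m + 1)
D = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1) +
     np.diag(np.ones(m - 1), -1)) / h**2
A = -v**2 * D
x = np.linspace(-np.pi + h, np.pi - h, m)
y = wave_solution(A, np.sin(x), np.zeros(m), t=0.7)
```

A finite-difference check of y'' + Ay ≈ 0 at the evaluation time confirms that (20) solves (18).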

Hence although there is a theoretical basis for cos(A) and sin(A), using the trapezoid rule method to compute them in practice suffers from a lack of speed and accuracy. Even with small matrices and well-clustered spectra, the best accuracy attained so far, with N = 2^15 points, was about 5 digits. We mention these matrix functions, however, because they can arise in practice and they give good examples of places where the trapezoid rule method performs very poorly or fails.

5 Conclusion and Further Study

The trapezoid rule applied to the Cauchy integral theorem to compute matrix functions is a powerful algorithm, yet it has been sparsely mentioned or studied in the scientific community. For matrices A with clustered eigenvalues and functions f(z) that are well-behaved, this method is highly accurate and typically fast. Even for scattered spectra this method performs well, as is the case with the matrix square root √A. For the special case when the eigenvalues of A lie on the negative real axis, Talbot contours offer significant speed improvements while maintaining near machine-precision accuracy. The most impressive application of the trapezoid rule method thus far is for computing e^A, which can be used in solutions to PDEs, particularly the heat equation. The most important result from a numerical analysis perspective is that, as shown in Section 2, the trapezoid rule method can yield one more digit of accuracy than current standard algorithms for certain functions f(A), particularly √A.

Improvements to this method will have to be found, however, because it is currently not robust enough to deal effectively with some special functions, such as cos(A). Judging from recent research and the literature on this subject, these improvements will likely come from finding better contours of integration. Talbot's method has been optimized by Weideman for the case of parabolic PDEs, but other types of PDEs or applications may have different optimized parameters. Other contours, such as hyperbolas and parabolas, have recently been shown by Weideman to yield excellent convergence for these PDE problems, but they have yet to be optimized and rigorously compared to Talbot contours [7].

Three factorizations of A were compared for improving the trapezoid rule method, but this comparison has not been as exhaustive as it could be. The most pressing issue is to develop a rigorous theory for why the Hessenberg factorization offers not only a great speed-up but, more importantly to numerical analysts, an improvement in accuracy over other factorizations and indeed over MATLAB's algorithms. There are likely other factorizations that could be considered for improving the method, and these ought to be experimented with to determine their effectiveness. Matrices may also have certain structures that can be exploited by different factorizations to improve both speed and convergence.

The potential of the trapezoid rule method and the Cauchy integral theorem is just being tapped, and there is much more to study in terms of real-life applications. Talbot applied it to the inverse Laplace transform, Weideman applied it to parabolic PDEs, and we have offered the possibility of its use in a hyperbolic PDE, though currently with minimal success. Other areas of mathematics may benefit from the application of the trapezoid rule method to their respective problems, so the challenge is now to learn which areas these are, what problems they are trying to solve, and to analyze whether this method is effective for them.

References

[1] Philip I. Davies and Nicholas J. Higham. Computing f(A)b for matrix functions f. Technical Report 436, School of Mathematics, University of Manchester, December 2004.

[2] Gareth I. Hargreaves and Nicholas J. Higham. Efficient algorithms for the matrix cosine and sine. Technical Report 461, School of Mathematics, University of Manchester, February 2005.

[3] Anthony Kellems. Work journal, February 6, 2005.

[4] Cleve Moler and Charles Van Loan. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Review, 45(1):3–49, March 2003.

[5] A. Talbot. The accurate numerical inversion of Laplace transforms. J. Inst. Math. Appl., 23:97–120, 1979.

[6] Lloyd N. Trefethen and David Bau, III. Numerical Linear Algebra. SIAM, 1997.

[7] J. A. C. Weideman. Preliminary unpublished research, spring/summer 2005.

[8] J. A. C. Weideman. Optimizing Talbot's contours for the inversion of the Laplace transform. Technical Report 05/05, Numerical Analysis Group, Oxford University, May 2005.
