a fast cauchy-riemann solver · 2018. 11. 16. · a fast cauchy-riemann solver by michael ghil* and...

MATHEMATICS OF COMPUTATION, VOLUME 33, NUMBER 146APRIL 1979, PAGES 585-635

A Fast Cauchy-Riemann Solver

By Michael Ghil* and Ramesh Balgovind**

Abstract. We present a solution algorithm for a second-order accurate discrete form

of the inhomogeneous Cauchy-Riemann equations. The algorithm is comparable in

speed and storage requirements with fast Poisson solvers. Error estimates for the dis-

crete approximation of sufficiently smooth solutions of the problem are established;

numerical results indicate that second-order accuracy obtains even for solutions which

do not have the required smoothness. Different combinations of boundary conditions

are considered and suitable modifications of the solution algorithm are described and

implemented.

1. Introduction. Inhomogeneous Cauchy-Riemann equations appear naturally inmany fluid-dynamical problems, as the divergence and the vorticity equations of atwo-dimensional steady flow field (u, v) = (t/(x, y), u(x, y)). The velocity componentsu, v axe usually called in this context primitive variables, in contradistinction to thederived variables \¡i, f in the stream function-vorticity formulation of the flow equa-tions (e.g., Roache [28]). In the latter formulation, the stream function \p satisfies aPoisson equation; and computations with this formulation have greatly benefited fromthe rapid development of fast direct methods for the solution of Poisson's equation, orPoisson solvers (Buneman [4], Buzbee, Golub and Nielson [5], Dorr [9], Fischer,Golub, Hald, Leiva and Widlund [12], Golub [18], Hockney [21], [22], Widlund [32] ).

Working in the primitive variables, however, permits the treatment of more gen-eral flows. Indeed, either nondivergence or irrotationality of the flow are required inorder to introduce a stream function \p or a velocity potential 0, and obtain a Poissonequation for them. There are many situations of practical interest in which neither ofthese assumptions holds. Furthermore, the formulation of boundary conditions isoften easier in terms of the primitive variables, by using physical considerations whicharise naturally from the problem. On the other hand, a boundary condition on thevorticity f for instance is at times hard to formulate (Langlois [23]); the constructionof appropriate discrete versions of such a boundary condition is often even more diffi-cult (Öliger and Sundström [27]). Hence, the desirability of simple, physically mean-ingful boundary conditions and, thus, of the use of primitive variables.

Lomax and Martin [24] have developed a fast Cauchy-Riemann solver and

Received April 10, 1978.AMS (MOS) subject classifications (1970). Primary 65F05, 65N15, 65N20; Secondary

65N04, 65N05, 76B05, 86A10.Key words and phrases. Fast direct solvers, Cauchy-Riemann equations, elliptic first-order

systems, transonic flow.*The work of this author was supported in part by NASA, Grant No. NSG-5130.**The work of this author was supported in part by NASA, Grant No. NSG-5034.

© 1979 American Mathematical Society0025-5718/79/0000-0058/$! 3.75

585

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use

586 MICHAEL GHIL AND RAMESH BALGOVIND

applied it to a quasilinear problem in aerodynamics ([24], [25] ). Additional versions oftheir solver are given in [26]. Our interest in the problem stems from a different applica-tion, dealing with a two-dimensional version of the equations of dynamic meteorology(Ghil [14], [15] ). The solver we present also differs from those of [24], [26] in a num-ber of ways: the boundary conditions and their numerical treatment, the decoupling ofu, v and the reduction to a discrete Poisson problem, and finally the method of solutionof the resulting discrete Poisson equation are all different. We substantiate by numericalresults that the present solver is second-order accurate, even for solutions which do nothave the formally required degree of smoothness. The solvers of [24], [26] seem to beonly first-order accurate according to our numerical tests.

The intended application of our solver is to a fully nonlinear first-order system,rather than a quasilinear one. This system is a generalization of a Monge-Ampére equation[15], [17]. Eventually, we hope to apply the solver to cases where the nonlinear equa-tions are of mixed elliptic-hyperbolic type [16], [17]. Prehminary results are encourag-ing, and we expect to pursue the nonlinear problem in a future publication.

The organization of the article is the following: Section 2 contains the descriptionin continuous and then in discrete form of the model problem of which we seek a fastsolution. Section 3 contains the derivation and description of the solution algorithm.Section 4 presents numerical results for test computations with the model problem. Sec-tion 5 presents modifications of the model problem arising from changes in boundaryconditions. Section 6 contains a comparison of results with the solvers of [24], [26].Section 7 gives conclusions and a discussion of possible extensions and generalizations.Finally, Appendix A presents an error estimate for the method, and Appendix B con-tains a listing of the basic program.

Acknowledgements. It is gratifying to acknowledge useful discussions with Profes-sors Eugene Isaacson and Olof Widlund. Numerical calculations were performed in parton the CDC 6600 of the Courant Mathematics and Computing Laboratory, New YorkUniversity under Contract EY-76-C-02-3077 with the U. S. Department of Energy.

2. The Model Problem.The Differential Equations. We wish to study the fast numerical solution of the

elliptic system of two first-order linear equations in two independent variables,

(2.1a) ux + vy = dix, y),

(2.1b) Uy - vx = e(x, y).

The dependent variables u, v can be thought of as velocity components in the x, y direc-tions, respectively. In this interpretation d, e axe the divergence and vorticity of theflow, which are assumed to be known. If d = 0 = e, (2.1) are the Cauchy-Riemann equa-tions, and u, v axe analytic. We are interested in the inhomogeneous case, \d\ + \e\ ^ 0,and we concentrate on real-valued d, e, u, v, although the method is applicable withminor changes to complex-valued functions as well.

We consider a rectangular domain R, taken without loss of generality to be R ={(x, y): 0 < x < 27T, 0 < y < 7r}. The boundary conditions are that d is given on the


A FAST CAUCHY-RIEMANN SOLVER 587

lower and upper side of the rectangle,

(2.2a) v = t/°>(x), y = 0,

(2.2b) V = v^0c), y = 77,

and that both u and v axe periodic in the x-direction,

(2-3a) u(x + 2tt, y) = u(x, y),

(2.3b) v(x + 2ir, y) = v(x, y).

The Gauss divergence theorem implies that d(x, y), v^°\x), and i/^ix) have to satisfy

(2.3c) J02"/0" d(x, y)dxdy = f*" {v^(x) - „(*)} dx.

These conditions, together with (2.1), determine v completely and u up to an additiveconstant (compare Ghil [15]). The latter indeterminacy in the solution can be elimi-nated, for instance, by prescribing u at one arbitrary point (x0, y0) in the rectangle.

The boundary conditions we use are associated with a standard channel-flowproblem in geophysical fluid dynamics (e.g., Elvius and Sundström [10], Gustafsson[20] ), from which the nonlinear problem mentioned in Section 1 is derived (Ghil [15],[17]). Extensions to different boundary conditions and to irregular domains will bediscussed in Sections 5 and 7.

777e Difference Equations. The discretization of the problem we chose is toapproximate the derivatives in (2.1) by finite differences. Let U, V, D, E stand forthe mesh functions which approximate the continuous functions 77, v, d, e; and leth, k stand for the mesh size in the x, y-directions, respectively. It is natural to usecentered differences to replace the corresponding derivatives in (2.1). We write

(A) U(x + b, y) - U(x - b, y) s 2St7x(x, y),

U(x, y + e) - U(x, y - e) s 2euy(x, y),

for « and similar formulas for v; this yields a second-order accurate approximation ofthe derivatives and allows us to expect that in some adequate norm ||-||,

||U- u|| + || V - u|| = 0(h2) + 0(k2).

The use of centered differences in a straightforward manner, on an unstaggeredmesh x¡ = ih, y ■ = jk, leads, however, to the existence of spurious null vectors (U, V),i.e., to zero eigenvectors of the discrete matrix operator which approximates the dif-ferential Cauchy-Riemann operator. To avoid dealing with these null vectors and to ob-tain an invertible discrete matrix operator, we used a staggered mesh (Figure 1). Sucha mesh, suggested already by Lomax and Martin [24], can be formulated for Eqs. (2.1)in a particularly efficient way.

Let u, v denote points at which U, V axe defined, and let -, x (for the vectoroperators divergence and curl) denote the points at which the discrete versions (2.4a, b)(see below) of equations (2.1a, b) are written and at which D, E are defined. Thus,


588

yiiU

MICHAEL GHIL AND RAMESH BALGOVIND

IT *

l,N-*

-»-

1,3

}1,2

1,1

^2^1

l,N ~2,N

V

1,2

M V2,l

1,1 2,

2,N

2,1 V3,l

2,1 3,1 3,1

v¡, ta v

-tf-

-»i-tír

V

-tÖ-•-

^SLIT l(27T,ir)

M,N

M,l

(0,0) '.o 2,0 3,0 M,0 277

Figure 1

x(¡)

u, v alternate on diagonals of the mesh and so do -, x on diagonals parallel to and al-ternating with those of u, v; in other words, v, x, u, -, in clockwise direction, occupythe corners of an elementary mesh cell of area hk/4 (see Figure 1). No averagingof U, V is necessary in writing (2.4a, b) on this staggered mesh. Furthermore, theboundary conditions (2.2) and the periodicity condition (2.3a, b) can be easily han-dled. Indeed, let U, V be indexed independently, with y = 0, 7T being horizontalF-lines corresponding to F-indices / = 0, N, and with x = 0, 27r being vertical F-lineswith F-indices i = 1, M + 1. Then the computational domain includes the points((/ - l)h, jk), at which Vfj is defined, and the points ((/' - l/2)h, (j - l/2)k), at whichUjj is defined; in other words, Vtj approximates v((i - l)h, jk), 1 < i


Notice that S, e in (A) have been replaced after staggering by h/2, k/2, rather than byh, k. The boundary conditions become

(2.5a)

(2.5b)

Vi¡0 = ¿°Xii - l)h), Ki


From (2.5) we have that V0 and VN axe known; accordingly, we redefine Dx asDj + &-1V0 in the first one, and D^ as D^ - k~l\N in the last one of the equa-tions (3.1a). After these changes of notation and a change of order, the vector equa-tions (3.1) can be put into the block matrix form

N N - 1

N-l

(3.3)

7V<

-I I

I-I

U,

N-l

N

"N-l .

= *

^TV-l

D,

LoJwhere / is the M x M identity matrix. This form is roughly analogous to writing(2.1a, b) in reversed order,

(2-03/3y -3/3*3/3x 3/3y

We notice that the matrix in (3.3) has only four nonzero diagonals of M x Mblocks and that the blocks are at most scalar tridiagonal. Furthermore, all the non-zero entries are ± 1 or ±p. In the present form, however, we cannot take full advan-tage of the extreme sparsity and simplicity of the matrix in order to obtain a solutionmethod comparable to fast Poisson solvers.

Before bringing (3.3) to a more advantageous form we remark that its matrixindeed has a one-dimensional null space: the sum of the MN rows in the lower half ofthe matrix is zero. The corresponding compatibility condition that 2/.-D« = 0 is thediscrete counterpart of (2.3c).

Decoupling of U and V. It is a well-known fact that each of the functions 77, uwhich satisfy (2.1) will also satisfy a Poisson equation obtained from (2.1) by elimi-nating the other dependent variable by cross-differentiation. This suggests the attemptto eliminate Fin system (3.3) in order to obtain a linear algebraic system for Ualone.The matrix of this system will be similar to a discrete Laplacian, and to it we shall beable to apply fast solution techniques.

The elimination of F proceeds as follows. In system (3.3) add the Mh blockrow to the (N + l)st, then the (N + l)st to the (N + 2)nd and so on. This discretesummation procedure is analogous to integration with respect to y in (2.1a). The lowerpart of the system thus becomes



N N-l

(3.4) N>

UN

LViv-i.

DiD,+D2

?>/ JAfter this transformation, (3.3) can be written as

N N- 1

(3.5)

N

N

¥■■

P I ßI _

I

T i 0---0

U E'

D'TVLJ L*£>d

here U = (U*..... U*)*, V = (Vf, ..., V*_,)*, 7iV_1 is the M(N - 1) x M(N - 1)identity matrix, the definitions of P, Q and E' are easily read from Eqs. (3.3), whilethose of R and D' follow from (3.4). In particular, note that D' and E' contain thefactor k.

From (3.5) it follows that

PV + ßV = E',(3.6a)

(3.6b) RV + V = D',

where we omit for the moment the last vector equation. Substituting V from (3.6b)into (3.6a), we obtain

(3.7)

where

(3.8)

(P-QR)V = E'-ßD',

N

QRTT*

TT* TT* 0N- 1

since ß is block diagonal with constant equal blocks and T*T = TT*. Attaching nowthe last equation of (3.5) to (3.7) yields

(3.9)P-QRT... T

1 -i--"-2-0-"

or, changing sign in all but the last equation,



+

(3.9')

= k

0.

T*D

TT*

TT*

JEi

UTT*

T*(BX + D2) - E2

L"i i

Let S = TT*. Notice that

(3.10) S = p(T+ T*)or, in the familiär notation for tridiagonal matrices,

(3.10') T=p(-1,1,0), 7* = p(0, 1,-1), S = p2(-1,2,-1);this is a slight extension of standard notation: because of periodicity these matrices areactually circulants, rather than tridiagonal proper (cf. (3.2), (3.10)). Thus, S corre-sponds to a central second difference, carrying further the analogy with the continu-ous case. However, (3.9') is still not in a form suitable for fast solution.

Discrete Poisson Equation for U. This form is now easily achieved. First, mul-tiply the last block row by T*. Then subtract the second block row from the first,the third from the second, and so on up to and including the last one; the new lastrow is obtained by adding all the new rows, from 1 to (N - 1), to the last row. Fi-nally, the new last row is taken as the first row of the system and all the signs are re-versed, yielding

S + I -I-I S+ 21 -I

(3.11)

-/ S+ 21 -I-I S + I

u

r*D -i7;*D, + E, -E,

T*DN_X + EN_2 -E N-lT*DN + E7V-1 mN



the vector B is introduced for future notational convenience.We notice that the matrix in (3.11) has almost, but not quite the form of a dis-

crete Laplacian (compare for instance Buzbee, Golub and Nielson [5], for Laplacianwith Dirichlet, Neumann and periodic boundary conditions). The matrix is symmet-ric and nonnegative semidefinite; in fact, LT.- = const satisfies the homogeneous sys-tem. The system would be positive definite if any one diagonal element actuallyexceeded the sum of the off-diagonal elements in the same row (e.g., Collatz [6,pp. 43—47]). This immediately suggests that we make use of the possibility of pre-scribing U( ■ = uQ, i.e., that we increase the element at the position Mj0 + iQ along

the diagonal by an arbitrary amount a > 0, and add at70 to the right-hand side of thatequation at the same time. We shall call the system thus modified (3.11a) for futurereference.

The system (3.11a) could now be solved by modifying any one of the differentmethods or combination of methods available for Poisson's equation, as reviewed forinstance by Buzbee, Golub and Nielson [5], Dorr [9], or Widlund [32]. In this sense,the elimination of V is analogous to a single step of odd/even reduction as proposedby Hockney [21], [22], although the role of this step is more crucial in the presentcontext. We describe in the sequel the particular fast algorithm chosen to solve (3.11a).

Fast Direct Solution for U. Essentially, our algorithm is based on the one de-scribed in greater generality by Buzbee, Golub and Nielson [5] as matrix decomposi-tion. It relies on the fact that the eigenvalues Xk and eigenvectors %k of S axe known,i.e., that

S%k ~ ^k%k'for

(3.12a) Xk = 2p2(l - cos(2n(k - 1)¡M)), Kk


Clearly, (3.14) corresponds to a discrete Fourier transform in the x-direction.Notice in particular that

Ui,i L ui,ji^M>1=1

where £/• , denotes the first component of the vector Üy.Premultiplying (3.11) by a block-diagonal matrix with all the diagonal blocks

equal to Q*, and using (3.14), we obtain

(3.15)

A+ 7 -I-I A+ 21

-I A+ 21 -I-I A + I uN BA^

In order to turn the matrix in (3.15) from block-tridiagonal with diagonal blocksinto a tridiagonal matrix, it suffices to reorder the components of U and of B by ver-tical Unes rather than by Fourier transformed vectors of horizontal Unes, i.e., let

(3.16) *., U,i,k> Bk.j Bj,k' 1


discussion of (3.11a), we can deal with the singularity of Ax by prescribing any com-

ponent of Ûj, Ux . say; this means prescribing Ux • = Uj x = T,f=x t/. • I\[M,

rather than one given mesh value U¡ ¡ . Again we call the suitably modified systems

(3.15a), (3.17a) and (3.18a).The LU factorization of the tridiagonal matrices in (3.18a) is performed next,

say Ak = Lkiik, storing the nontrivial entries of Lk and Uk. The diagonal elementsof Lk and the superdiagonal elements of Uk axe identically 1, while the subdiagonal ele-ments of Lk and the diagonal elements of Uk are reciprocals of each other. Hence onlythe diagonal elements of Uk in fact need to be stored.

Having computed B and thus B, from the original data, we can solve for U andhence for U. The computation of B from B and of U from U involves matrix multi-plication by Q* and by Q, respectively. Since these correspond to a direct and an in-verse Fourier transform, they can be performed efficiently by a fast Fourier transform(FFT) algorithm. The algorithm used was the one published by Cooley, Lewis andWelch [7], modified to operate in the real.

Summary and Programming Considerations. The solution algorithm developedin this section reduces in practice to the following:

(i) From the data of the problem D, E, compute

Bx=kiT*Dx-Ex),

B, = x(r*D; + E,_, - E,.), 2


conjugate allows us to solve only M/2 + 1 systems, so that the operation count for thisstep is 773 = 4MN real operations. Notice that no reindexing of B as indicated by(3.16) is actually necessary, and that at this stage U can overwrite B. The shifting ofB elements and U elements required to use the LU subroutine is relatively fast.

(iv) From the Uk computed in (iii) obtain U (again with no need for reindexing)by carrying out (3.14a) with another application of the FFT, in 774 = 2MN log2M realoperations, and with U stored instead of U.

(v) Compute V from U obtained in (iv), using (3.1a). This requires tis = 4MNoperations. Notice that in this recursion V does not depend on \JN, DN, since VN isknown. Using (3.1a) as a backward recursion would yield independence of V on UpDp since V0 is also known. This redundancy, however, is only apparent and compu-tations with (3.1a) used as either forward or backward recursion yielded results withthe same accuracy, which only depended on the accuracy of U.

The final operation count is

577 = X nk = 4MN(3 + log2M),

ito obtain both Uand Fat all the grid points at which they are defined.

The storage requirement for the L, U factors is s, = MN/2, using again the factthat we only solve for half the number of components of U; the storage requirementfor the right-hand sides D, E, is s2 = 2MN; and these locations are successively over-written until U, V are stored in them. Hence, the total storage requirement is s =sx + s2= 5MN/2.

We conclude this section with a diagram of the algorithm:

E FFT ~ R „ LU *Of, E;, V0, VN -^* B, ^Xo^m B7 AT Bk -^ Vk

R ~ FFT E—► u.-► u.-► v.-0 ' 2MN\og2M i 4MN >'

hexe E stands for evaluations by linear recursion, FFT for an application of the FastFourier Transform, R for reindexing, and LU for factorization. The ranges of the in-dices are 1 < / < N, 1 < k < M, and the operation counts are given under the corre-sponding step of the algorithm.

4. Numerical Results. In this section we shall present results of test computa-tions for model problem (2.1)—(2.3). The results will provide evidence for the shortcomputation time required by the method, for its second-order accuracy and for itsinsensitivity to errors in the data.

The experiments consisted in evaluating analytically d, e in (2.1) for known77, v, and then comparing the numerical solutions U, V with the correct analyticalu, v. The computations were carried out on an IBM 360/95 computer, with aFORTRAN IV, level H compiler, the code optimization parameter OPT being set tothe value OPT = 2. Double precision arithmetic was used throughout our numericalexperiments. Test comparisons on an Amdahl 470V/6 computer and on a CDC 6600



gave entirely similar results; the program is currently being developed into a packageand we expect to test it on other computers as well.

Initially, three different versions of the program were run. The first versionsolved the positive definite modification (3.11a) of the system (3.11) by Choleskyfactorization. It is denoted by C in the tables, and given for orientation purposesonly. This version required much longer computer time; moreover, results were lessaccurate than with the other two versions, since the larger number of arithmetic oper-ations introduced more round-off error.

The second version utilized diagonalization by the discrete Fourier transform asgiven in (3.14) and the reindexing given in (3.16), but did not use the FFT in steps(ii) and (iv) of the algorithm. It is thus somewhat comparable to Hockney's originalmethod [21] and is also included for comparison purposes. Its numerical results wereequal at least to within three significant digits to those of the third version and areUsted separately only in Table III. This version is denoted by P in the tables.

Table ICPU execution times for the versions C, P and F on IBM 360/95 with compileroptimization OPT = 2. The number of points is MN = M2 ¡2.

16 .07 .09 .01

32 .88 .65 .04

64 14.35 5.17 .13

128 .51

256 2.11

Table IICPU execution times for version F with different optimization levels, OPT = 0,1, 2, 3, on IBM 360/95, Amdahl 470V/6, and CDC 6600. The compiler used onthe CDC computer was FTN 4.6; on the other two computers, a FORTRAN IV,level H compiler was used.

360/95 470V/6 6600

16

32

64

128

OPT/0

0.02

0.08

0.35

1.40

0.02

0.07

0.20

0.88

0.01

0.04

0.13

0.51

OPT/0

0.00 0.02

0.03 0.09

0.13 0.38

0.52 1.53

0.02

0.07

0.30

1.25

0.01

0.05

0.23

0.96

OPT/0

0.01 0.059

0.06

0.24

0.232

0.969

0.96 4.065

0.031

0.120

0.494

2.034

0.025

0.093

0.382

1.615

256 5.91 3.70 2.11 2.13 6.52 5.46 4.07 4.14

The third version is the one actuaUy described in the summary of Section 3 andUsted in Appendix B; it represents the proposed method. This version is denoted by Fin the tables.



Tables I and II contain timing results. These are essentiaUy independent of thesolution, and depend only on the method, or program version, and on the number ofpoints. Table I compares CPU execution times for the three program versions C, Pand F on the IBM 360/95 computer with compiler optimization OPT = 2. As thenumber of points increases, the advantages of version F become more and more obvious.We can also see clearly that the execution time for version F is proportional to thenumber of points MN.

Table II gives CPU execution times for version F with different optimizationlevels, OPT = 0, 1, 2, 3, on the IBM 360/95, Amdahl 470V/6, and CDC 6600 comput-ers. This is meant to give an idea of the order of magnitude of running time improve-ments which could still be achieved by code optimization. We notice that timing withOPT = 3 is not superior to OPT = 2.

The other tables are organized as follows: the first column contains the knowntest solution (u, v), and for Table V only, its L2 and L„ norms. The norms used foru are the continuous norms

(4.1a) \\u\f\ = \ fon j2n u2ix, y) dx dy,2lT

(4.1b) ||id|„ä sup |r/(x,y)|,0


Table IIINumerical results with versions C, P and F of the algorithm for a number of sim-ple test cases. The labeUing of rows and columns is explained in the text.

*_


Table IV

Numerical results with the proposed algorithm (version F) for problems withslightly perturbed boundary conditions i/°*(x), v^l\x), £7(x0, y0) and right-handsides d(x, y), e(x, y). The subscript ( ) stands for the correct values.

t-(V) ».("I *

u = sin(x) sin(y), v = sin(4x) 12701732295224171071

7.0594 -21.6540 -24.0377 -39.9953 -42.4878 -4

5.06651.20912.97907.40641.8473

4.17571.67935.23451.45373.8233

1.0958 -12.6040 -26.4284 -31.6021 -34.0020 -4

d s l.ld , e = l.le„ 8127 -26760 -24525 -24003 -23874 -2

1.4141 -18.1393 -26.6998 -26.3252 -26.2204 -2

1.0866 -16.9773 -26.0982 -25.8773 -25.8172 -2

1.0417 -11.0102 -11.0028 -11.0009 -11.0003 -1

2.1951 -11.2814 -11.0667 -11.0138 -11.0007 -1

v = l.lv, v(1» = l.lv'1' 8156 -34438 -28546 -29594 -29858 -2

7.75032.60591.85481.85821.9146

5.30182.08851.E5471.90991.9507

6.26414.70867.55988.87929.4717

1.10616.11236.99348.25109.0694

u = sin(x) sin(y), v = sin(4x)

„"» = l.M v. JV= l-Ol „. 1.01 v¿°>. ,

u(x0,y0) = 1.01 uc(x0,y0)

2.27847.77325.36115.06125.0134

7.8326 -22.3701 -21.1088 -28.0548 -37.3146 -3

5.6034 -21.7407 -28.6658 -36.7151 -36.2660 -3

4.4085 -21.7937 -21.0386 -21.0096 -21.0024 -2

1.2068 -13.6301 -21.6493 -21.1618 -21.0404 -2

u ■ sin(x) sin{y)v = sin(4x) sin (4y)

3.22738.04092.00855.02031.2550

5.9183 -21.3515 -23.2789 -38.1045 -42.0164 -4

4.0498 -29.4190 -32.3045 -35.7193 -41.4258 -4

6.2089 -31.5927 -34.0074 -41.0035 -42.5096 -5

1.1072 -12.6172 -26.4545 -31.6082 -34.0171 -4

d = 1.01 dc(0)v

u(x.,y.) = 11 u (*.n,yn)

8.25975.81235.20305.05085.0128

6.5120 -21.8814 -28.3918 -35.8582 -35.2234 -3

4.4892 -21.3738 -26.9572 -35.4663 -35.1188 -3

1.5891 -21.1513 -21.0381 -21.0096 -21.0024 -2

1.2183 -13.6434 -21.6519 -21.1625 -21.0406 -2

1.1 e-ldc1.1 vv

c

u(xn,y.) - l.lu (xn,y„)*0"0' c'-O'-'O'

5.35505.08855.02215.00555.0014

1.1855 -16.6507 -25.4407 -25.1287 -25.0418 -2

8.9935 -25.8963 -25.2323 -25.0670 -25.0216 -2

1.0302 -11.0079 -11.0020 -11.0005 -11.0001 -1

2.2179 -11.2879 -11.0710 -11.0177 -11.0044 -1

errors become very nearly equal to the bound in some such cases (e.g., u = y sin(x), v =y2 sin(x)), which seems to indicate that the bound is rather close to being sharp. Thetheoretical bound provides also an indication of error magnitude even in some caseswhere the data are not sufficiently differentiable (e.g., u = y4 sin(8x), v =x(2tt - x) sin(8y); u = v = x(n - x)(2n - x) sin(y)).



TABLE IV (Continued)

»,(0) l,(U,V) »_(V)

i -= sin(x) sin (y)

, = sin(4x) sin(4y)

,(0) _ „(0)

,(1) „

+ 0.1 sin(32x)

-l- 0.1 sin(32x)

6.71636.9041

3.59654.7732

5.39895.9392

3.6605 -26.3567 -2

2.71034.7378

u = sin(x) sin(y)v = sin(4x) sin(4y)„(0) „ v + o.lsin(31x)v(1) - v'1'* 0.1sin(31x)

3.21803.24617.44206.28947.0221

7.08954.30974.53543.70444.9065

5.38313.79816.48495.50556.0615

7.24408.18924.14265.25016.9534

1.7657 -19.5994 -21.8729 -22.7606 -24.8358 -2

u = sin(x) sin(y)v = sin(4x)v + 0.1 sin(31x)

2.28895.53491.37163.48411.1054

9.25222.14165.10321.30945.8169

1.78714.24631.04102.63978.8422

5.15581.76794.96071.64139.8365

2.80776.7504 -11.6534 -14.1227 -24.9620 -2

The formal truncation error of the finite-difference scheme, in the interior asweU as on the boundary, is 0(h2 + k2)A hence, we expect that the scheme will pro-duce second-order accurate results. A good way of testing this numerically is by com-puting the ratio

| lp-h,2k) \lpi.k)

and similar quantities for v and (77, u); here || || is either the l2 or the /„ norm of(4.2). For sufficiently smaU h, k and twice continuously differentiable (77, v) theseratios should be very close to 4 if the method is indeed second-order accurate, and itshould be close to 2 if the method is first-order accurate.

The indicated ratios are computed and entered in additional rows in Table V; theentry in column 3 indicates the values of M and 2M, rather than of h and A/2, towhich the norms whose ratio was taken correspond. The interesting result is that forthose cases tested the method seems to have second-order accuracy even when v ismerely continuous and u once continuously differentiable and that it is first-order ac-curate when both u and v axe merely continuous. The continuity of u, v is given instandard notation as a column to the left of the usual first column; i.e., u, v E C°,Cl, C2, ..., C°°, indicate the number of continuous derivatives of u, v. Continuity isunderstood for u, v and their derivatives as extended x-periodic functions.

T A more detailed discussion of accuracy appears in Appendix A.



Table V

Numerical results indicating the order of accuracy of the proposed algorithm (ver-sion F) and discretization procedure.

if") *.(v> e(u)

CCc°°u = y sin(x)v = y2sin(x)

lul^ 3.1416Ivl = 9.8696

163264

4.0151.0492.6576.664

2.8797.0431.7224.248

3.573 -29.049 -32.254 -35.607 -4

8.0412.7948.1242.193

4.8791.3153.4148.566

4.455 -11.114 -12.784 -26.961 -3

8/1616/3232/64

3.8263.9503.987

4.0884.0904.054

3.9484.0164.019

2.8783.4403.706

3.7113.8513.986

u«CUc"

= y sin(4x)= y sin(4x)

ul,= 1.2825

lul^» 3.1416Ivl ■ 9.8696

163264

128

1.75094.3261.0832.7087

2.55736.18811.51333.752

2.1684 -15.292 -21.3123 -23.268 -3

4.97511.5734.61871.2895

5.56341.46993.68859.2309

1.426 +13.5648.910 -12.227 -1

16/3232/6464/128

4.0663.9963.997

4.1524.0694.033

4.0974.0334.015

3.1693.4073.582

3.8053.9643.996

kiec

|vec°°lui 2=Ivl 2=

Ivl =

y sin(8x)y2sin(8x)1.28253.12103.14169.8696

163264

128256

1.02351.67104.02189.99412.4956

1.05752.9737.0881.74494.3369

7.579 -12.3912 -12.739 -21.419 -33.835 -3

1.63475.68031.75655.04631.4018

1.11887.62161.8564.65011.1612

2.103 +25.257 +11.314 +13.2869.215 -1

16/32 6.1254.1554.0244.005

3.55714.1944.0624.023

3.1414.16634.0454.015

2.8783.2343.4813.600

1.4674.1063.9924.004

use

vec"lui 2=Ivl 2=

Ivl -

y sin(16x)y2sin(16x)1.28253.12103.14169.8696

163264

128256

5.8205 +11.03211.5734 -13.7506 -29.2864 -3

18864585206060978749

4.7763 +17.4243 -12.5130 -15.9846 -21.4774 -2

16/3232/6464/128

128/256

5.6397 +16.55964.19504.0388

8415702621310587

6.4334 +12.95444.19914.0499

30817140060685442893

0451 +16038 -28298 -11426 -13174 -2

3.290 +3.224 +2

2.056 +25.140 +11.285 +1

6319828026825060

0788 +33465 -212100294

hr£C

lul.

Ivl

sin(x)sin(y)sin(4x)sin(4y)

0.5

0.5

1.0

1.0

8163264

1.306 -23.227 -38.0409 -42.0085 -4

142919835152789

9.8921 -34.0498 -29.419 -32.3045 -3

2339208959270074

2730 -11072 -16170 -24545 -3

268 -13.171 -2

27 -31.982 -3

8/1616/3232/64

4.05484.01354.0034

62837901219

152.4434.29964.0872

598089829745

76 -1523050548

y sin(x)x(2ir-x) sin(y)22.9595

5.096697.4091

9.8696

1

163264

8/1616/3232/64

3.586 -19.332 -22.36 -25.92 -3

335096703

4.951.243.067.61

4449821617

16814 -189 -2984 -2

7.8981.975

937 -11.234 -1

3.84243.95413.9880

195911500603

4.00774.03464.0244

89229734

722197939761

Table VI contains a study of the number of grid points per wave length whichthe method necessitates for given numerical accuracy. The results are given here asrelative errors, l2(u — U)/\\u\\2, rather than as absolute errors, l2(u - U), and similarlyfor v and for /„. It seems that roughly 4 points per wave length wiU give 10 _1 rela-tive error, 8 points wiU give 5 x 10~2, and 16 wiU give 10-2. We notice again thatif u osciUates less than v, the error in u wiU be considerably smaUer than that in v.



TABLE V (Continued)

l,(U) £,(U,V) i (U) *„ e(u)

lui 2=Ivl 2=,U,«T

Ivl =

y sin(8x)x(2tt-x) sin(8y)22.9595

5.096697.4091

9.8690

163264

128256

1.732 +12.4595.853 -11.446 -13.604 -2

3.85877.1378 -11.636 -13.989 -29.892 -3

1.292 +11.83524.3267 -11.064 -12.647 -2

3.932 +16.22852.00355.3048 -11.3449 -1

7.26142.17615.130 -11.2653 -13.165 -2

3.856 +19.640

4106.025 -11.506 -1

16/3232/6464/128

128/256

7.04284.2014.0484.012

5.4064.3634.1014.032

7.0404.2424.06564.020

312109777944

3.3374.2424.0553.997

lui 2=Ivl 2=

Ivl -

y sin(16x)x{2tt-x) sin (16y)

22.95955.0966

97.40919.8690

163264

128256

4.661.8162.51785.9641.471

1.1742.2416.16771.43473.518

16/3232/6464/128

128/256

2.5667.2124.2214.053

5.2363.6344.2994.078

8.71051.31081.84584.3531.0716.6307.1184.2414.061

4677 +16798 +188417007139 -1

2.055 +24.57981.7834.106 -11.006 -1

8.0672.0175.0421.2603.151

3~86~781172798

4.4872.5684.3434.080

Ivl „=lut, =

y sin{32x)x(2n-x)sin(32y)

5.096622.9595

9.869697.4091

163264

128256

9.329.1761.8432.5355.997

7.5112.9011.21215.8131.3647

5.1762.1231.3161.84594.356

2935 +2277 +2092 +124312600

1.564 +37.936 +22.76781.49563.4708 -1

1.467 +43.669 +39.172 +22.293 -.3

732 +1

16/3232/6464/128

128/256

1.01574.97957.26814.228

2.58882.392.0854.259

2.43781.6137.13004.237

01285081030205

1.97092.867 +21.8514.309

u = x(tt-x) (2ir-x)sin y

v = x{ïï-x)(2ïï-x)sin y

163264

128256

16/3232/6464/128

128/256

1.44313.66929.22692.31045.77833.93303.97663.99373.9984

9.76882.44996.13871.53683.8452

1.24743.13897.86041.9651

91163.98743.99103.99433.9968

3.97393.99334.00004.0002

3.43169.50672.48026.31631.5928

2.4159 -16.3250 -21.6099 -24.0366 -31.0101 -3

3.60973.83303.92673.9656

3.81963.92873.98843.5960

3.784 -19.461 -22.365 -25.913 -31.478 -3

u ■ x(it-x) (2,r-x)sin y

v = x(2n-x)sin y

lul„= 11.9343

Ivl " 9.8696

1632

128256

16/3232/6464/128

128/256

1.38833.47338.68702.17205.4303

1.22071.90114.81901.2.1.333.0443

1.31272.82417.05391.76294.4065

3.06758.37542.18365.56921.4059

2.5198 -16.0616 -;1.53633.8529 -39.6442 -

3.784 -19.461 -22.365 -25.913 -31.478 -3

3.99703.99833.99953.9999

6.42103.94503.97173.9856

4.64844.00364.00144.0006

3.66253.83573.92073.9613

4.15703.94563.98733.9950

= x(2tt-x) sin y

= x(2n-x)sin y

lui

163264

128256

16/3232/6464/128

128/256

4.65422.32711.16395.81982.9100

5.07332.36761.20306.06293.0434

2.00001.99941.99992.0000

2.14281.96801.98421.9922

4.85432.34681.18335.94172.9772

1.02415.67442.96861.51447.6427

2.06851.98321.99161.9958

1.86471.91151.96031.9814

1.13065.7633 -12.9230 -11.4679 -17.3474 -21.961S1.97171.99131.9978

1.565 -13.912 -29.780 -32.445 -36.112 -4

These conclusions are also supported by some of the results in Table V. It is inter-esting that experiments with solutions containing odd wave numbers give results whichare only slightly worse than those for even wave numbers, if at aU; in other words,using M, N which are powers of 2 is not detrimental to accuracy, even when oddwave numbers are present in the solution.



TABLE V (Continued)

i (U) 1_(V) e(u)

2 2u=x (2tt-x) sin (y)4 4v - x (2n-x) sin(y)

lul_ ■ 97.4091Ivl = 9488.531

8163264

1.6926 +13.57788.674 -12.155 -1

1.2066 +22.7330 +16.51761.614

8.0018 +11.8850 +14.61311.427

8/ié16/3232/64

4.73194.12364.0258

4.41484.15934.0706

4.24464.08674.0370

3.3808 +18.3112.0094.9799 -14.06784.13684.0344

2.909 +27.162 +11.783 +14.45384.06184.01604.0040

2.310 +15.7741.4433.609 -1

u£C

vec°

uSC

uec"

u=x (rr-x) (2ïï-x) sin (y)

v - 0163264

128256

2.13484.72641.08052.55366.1632

9.54642.39856.01501.50653.7699

1.56043.39967.71211.81584.3888

4.27301.11742.85797.22381.8161

2.4159 -16.3250 -21.6099 -24.0366 -31.0101 -3

3.784 -19.461 -22.365 -25.913 -31.478 -3

16/3232/6464/128

128/256

4.51674.37434.23134.1299

3.98003.98773.99273.9961

4.58994.40814.24714.1375

3.82413.91063.95543.9777

3.81963.92873.98843.9960

u = x(2tt-x) sin(y)A

v = y sin(x)

lul^= 9.8696

Ivl = 97.4091

163264

128256

16/3232/6464/128

128/256

1.17874.97572.36811.1695.8262.91072.10112.02572.00652.0016

1.00904.8892.4331.2179"6.09883.0522.00931.99781.99721.9982

1.10914.9362.39991.1935.9632.9822.05662.01102.00131.9996

2.82761.31476.40053.14811.5597.754

1.97991.12025.7897 -12.9311 -11.4697 -17.352 -2

2.05402.03322.01942.0105

1.93491.97521.99441.9990

1.2653.1627.9041.9764.9401.235

In particular, these results also show that the present solver would perform veryweU on Unearized versions of the original geophysical fluid dynamic problem we wereinterested in ([16], [17]). We shaU return to this point in Section 7.

5. Changes in Boundary Conditions. In Section 2 we have formulated the modelproblem (2.1)—(2.3) which motivated this study. The algorithm of Section 3 has ob-vious appUcations to many other situations; it is of interest, therefore, to consider anumber of different boundary conditions which could be associated with the Cauchy-Riemann equations (2.1) in a rectangle.

We shaU assume throughout this section that the boundary conditions on y = 0,77 are still (2.2), i.e., v is prescribed there as v^°\x) and u^\x), respectively. The twodifferent combinations of boundary conditions we consider explicitly are: (1) that u isgiven on the left boundary of the rectangle and v is given on the right boundary, and(2) that u is given on both vertical sides.

It is clear that if v is given on aU sides, the problem should be formulated asPoisson's equation for v with Dirichlet boundary conditions; similarly if u is given onall the sides, a Dirichlet problem for u is more suitable. A moment's thought wiUshow thaf the two situations we shaU discuss can easUy be transformed into a con-siderable number of others, by reflections or by interchanging the roles of x and y,and of u and v. In fact, aU situations in which u as weU as v axe prescribed on someof the sides of the rectangle can be handled by sUght modifications of the algorithmswe present, yielding second-order accurate numerical solutions.



Table VI

Numerical results indicating the resolution (mesh points per wave length) requiredby the proposed discretization procedure and solution algorithm (versions C andF) to obtain prescribed accuracy.

t,(U)/lul. i2(V)/lvl2 H^lUl/lul,

u = y sin(x) v = y sin(x) 6.6212 -45.1958 -4

1.36111.3611

8.65486.9789

u = y sin(4x),

u * y sin(8x),

u = y sin(64x),

v = y sin(4x)2v = y sin(8x)2v = y sin(64x)

8.444 -33.1358 -21.9450 +28.0699 -11.1422 -1

4.849 -32.271 -24.4180 +14.4600 -31.0867 -1

1.475.592.00705.64402.0238

u = sintx)sin(y) , v = sin(4x)sin(4y) 4.10004.0170

6.55806.5578

4.42504.0074

u » y sin(x),

u = y sin(8x),4u » y sin(16x),4u = y sin{32x),4u - y sin(64x),

v = x(2n-x)sin(y)

v = x(2ir-x)sin(8y)v ■= x(2ir-x)sin(16y)v ■ x(2ir-x)sin(32y)v « x(2Tf-x)sin(64y)

4u * y sin(128x), v - x(2tt-x) sin(128y)

u « x(2ir-x)sin(y), v = y sin(x)

3.0218 -42.5784 -42.5490 -21.097 -18.027 -17.9628.0577 -11.1064 -11.592 +11.5906 +11.0707 -1

3.394 -22.2937 -2

1.7716 -31.7716 -33.2100 -21.210 -12.378 -11.2598 +21.2376 -11.1180 -16.956 +22.6429 -126.514 -2

5.270 -35.270 -3

1.35929.87282.05707.0675.2272.6175.46157.62925.2345.33605.5526

4.1043.1896

u - x2(2ir-x)2sin(y) , y - x4 (2ir-x) 4sin(y)

u ■ x{tt-x) (2n-x)sin{y) , v ■ 0

1.121 -24.9081 -31.785 -2

1.29371.2937

9.47555.11245.267

u - y"sin(3x) v = x(2rr-x)sin(3y) 5.20791.28513.20227.99891.9993

1.12052.63956.44951.59693.9748

3.68009.63282.43396.09001.5225

u ■ y sin(5x) v = x(2rr-x)sin{5y) 1.66043.88829.56802.38265.9508

2.76045.98661.43503.53628.7915

1.20793.13517.84891.98584.9629

u = y sin(7x) v ■ x(2ir-x) sin(7y) 3.74558.01041.93184.78691.1941

5.73151.08791.96216.20871.5410

2.78136.59151.66054.15781.0382

u - y sin(9x) , v = x{2ir-x) sin(9y) 7.48431.38933.25678.01601.9963

1.14031.76683.97339.64582.3893

5.54511.17042.83907.13691.7829

We proceed now with the description of the algorithm for the two cases men-tioned.

Case 1. u Given on the Left Side. The rectangular domain is now taken asR j = {(x, y): -A/2 < x < 2tt, 0 < y < 77}. This is merely done for notational con-venience, so as to leave Figure 1 unchanged. The boundary conditions are first (2.2),



which we repeat here as

(5.1a) u = u(0>(x), y.= 0,

(5.1b) v = u(1)(x), y = 77,

and also

(5.2a) " = "(0)00. x - ~hl2'

(5.2b) v = v(x)(y), x = 2ir.

Thus Eqs. (5.2) replace the periodicity conditions (2.3a, b). Eqs. (2.1), together with(5.1), (5.2) completely determine u, v, and u need not and should not be prescribedany more at an interior point of R x.

The difference equations are still (2.4), which we repeat for convenience as

(5.3a) (Uu - t/,._w)/A + iVUj - Vu_x)lk = Dip Ki


here T is an N x (N - 1) matrix of rank (N - 1),

(5.7)

1-1

and we define p = h/k. In (5.6a) U0 is known from (5.5a), and in (5.6b) VM + 1 isknown from (5.5b). Hence it is convenient to redefine D1 as Dj + A_IU0, and EM

•M h 1 \M + j. Furthermore, we redefine D¡ 0 as D,. 0 + k 1 V¡ 0 and D¡ N asas EDiN - k~lViN. After these changes of notation (5.6) can be given the block matrixform

(5.8)M M

M.

M'

T

'tV-1 *N-1

'N-l *N-1

TI

I

JN

~In ^n

^n *n

jn-i |-I i

A7-1 I

yM

Ui= A

JLU*J

D,

DM

-M

where IL is the L x L identity matrix.The decoupling of U and V in this case proceeds as foUows. In the upper half of

system (5.8) we add the first block row to the second, then the second to the third,and so on, untü the new (M - l)st row is added to the Mh. We obtain a new system,which we write in condensed form as

(5.9)

where

IMq\r

Vu

= A F

E

(5.10a) P =

TT T

T ■ ■ T T



(5.10b)

(5.10c)

(5.10d)

ß =

"^JV-l ^JV-1

'N-l ¡N-l

-IN-l

R =

D,

Dx+D2

E>,

and I is the MN x MN identity matrix. From (5.9) we have

(5.11a)

(5.11b)

with

(5.11c)

U = AF - PV,

(Q - RP)y = h(E - RE),

RP

SS s

s s]

hexe S = T*T is an (N-l) x (N - 1) matrix,

(5.12) S = p2

-1 2j

Notice that 5, in contradistinction to S of Section 3, is nonsingular. We shall return toit later.

Written out expUcitly, (5.11b) becomes after a change of sign,


A FAST CAUCHY-RIEMANN SOLVER

(5.13)

S + I -I

S S+I -I

-I

S+I

T*DX-EX

T*(DX+D2)-E2

Lt*z>,---AÍwhere / is now the (TV - 1) x (/V - 1) identity. This system can be brought into block-tridiagonal form simply by subtracting the rth row from the (/ + l)st, starting from thetop. This produces the system

(5.14)

S+I -I-I S+ 21 -I

S+ 21-I

T*DX-EX

T*D2+E1-E2

J*0M + KM^X DM JS + 27_We see that the decoupling resulted in this case in the elimination of U, rather

than of V. Equation (5.14) is rather similar to (3.11). Some of the differences havealready been pointed out; as a result of these differences, the matrix of (5.14) is non-singular. It can be brought to scalar tridiagonal form by diagonalizing S and then re-indexing. We shaU comment on the fast solution of (5.14) further at the end of thissection, together with the fast solution of the matrix equation obtained in the secondcase we wish to discuss.

Case 2. u Given on Both Vertical Sides. We fit the grid to the rectangular do-main so that R2 = {(x, y): -A/2 < x < 27r - A/2, 0 < y < n}. The boundary condi-tions are (5.1) on the horizontal sides of the rectangle, and

(5.15a) u = 77(0)(y), x = -A/2,

(5.15b) u = u(1)(y), x = 2?r - A/2,

on the vertical sides. Equations (2.1), (5.1) and (5.15) determine u and v completely,subject to the requirement that the data d(x, y), u,0^(y), u,xAy), i/0)(x), v^(x)satisfy the Gauss divergence theorem:

ffdix, y) dxdy = f^-"'2 {v^\x) - u(*)} dx(5.16)

+ /;{"(1)(v)-«(0)(y)}dy.

The difference equations are (5.3), with (5.3b) only being written for 1 < / <M-l,l


Hence, there remain (M - l)N í/-values to be determined, and M(N - 1) F-values, i.e.2MN - M - N unknowns. In (5.3) we have MN D-equations and (M - l)(N - 1) in-equations, i.e., 2MN - M - N + 1 equations in aU.

It would seem that the number of equations exceeds the number of unknownsby one. We expect, however, from the continuous case that one compatibiUty condi-tion has to be imposed on the data, analogous to (5.16), and that the matrix of sys-tem (5.3) have rank equal to the number of unknowns. This can be checked directlyin the block form (5.18) to which we shaU bring this matrix below; the compatibiUtycondition turns out to be the one obtained when computing the integrals in (5.16) by themidpoint rule. A similar statement is true in the case in which u is prescribed on the hori-zontal sides of the rectangle, and u on the vertical sides; in that case the compatibility con-dition is the discrete analog of Stokes' curl theorem, involving e(x, y) rather than d(x, y).

We introduce the TV-vectors U,-, D,- and the (N - l)-vectors V,-, Ef as in the pre-vious case. Again, the first and last components of D,, D. 0 and DiN, are modified bythe addition of ph -1 V¡ 0 and of -ph _1 V¡ N, respectively. Also, Dj is modified bythe addition of A-1U0 and D^ by the addition of -A-1UM.

After these changes of notation, system (5.3) becomes

M M- 1

M<

M- 1

(5.18)

T

T

-/ N-l 'N-l

'N-l 'N-l

'N

~'n ^n

T*

'n-I

N

v,v.

T*J Lu*-U

D,

= ADM

-EM-l



where T is the TV x (N - 1) matrix defined by (5.7), and IN-X,IN are identity ma-trices of the appropriate dimensions. Clearly, the sum of the MN rows in the upperhalf of the matrix in (5.18) is zero. The corresponding compatibiUty condition that2,. :Dj . = 0 is exactly the one we expected; we only need to remember that the D¡-close to the boundary have been redefined to include the boundary data with appro-priate coefficients.

The elimination of V proceeds in a manner analogous to Case 1, by summing theblocks of the upper half of (5.18), in a discrete form of integration with respect to x.The result is

(5.19)M- 1 M- 1

M-V

M-U

TT T

T ■ •'■ i" T

T

■ 0

T

"^jv-i ^v-i

"^7V-l ^tV-1

'N

'N

0 • ■ • 0

T*

^Af-l

'M

Ui

uM-l

DiD1+D2

-E,

-EM-l

We rewrite this, with the obvious identifications, as

(5.20)

P i Ii

••• T ¡0 ■•0_ l

q r r u

F

*M

-E

Notice that P and Q have block dimension (M-l) x M, P has blocks of dimensionN x (N - I), and Q has blocks of dimension (N - 1) x (TV - 1).

From (5.20) we obtain

(5.21a) u = AF - py,

(5.21b) ßV + RU = -AE.



This aUows us to eliminate U and write

(5.22a) (Q - RP)y = -A(E + RE),

M(5.22b) T*(TT-"T)V = AT*£D/;

1

here we introduced the block row missing in (5.21) as (5.22b). The elimination ofthe single redundant equation in (5.18) was done naturaUy by multiplication of (5.22b)with the (N - 1) x N matrix T *.

System (5.22) becomes, after carrying out the matrix multipUcations,

(5.23)

5+7 -/

S S+I -I

S+I V = A

rT*D, + ExT*(DX + D2) + E2

r*z;M-l D. + EM-l

IT£>,with S being the (N - 1) x (N - 1) matrix defined by (5.12). The matrix of thissystem is the same as that of (5.13), except for the lower right corner block; also theright-hand sides differ only sUghtly. Applying the block-tridiagonalization procedureused in Case 1, which corresponds to differencing in x, one obtains

(5.24)

S+I -I-I S + 21

-I

-I

S + 27-I

-I

S+I

V = A

T*D1 +EX

T*D2+E2-E1

T*r> + p _ p1 UM-1 T EM-1 c'M-2

We are now prepared to discuss the fast solution of (5.14) and of (5.24).Fast Sine Transform. The fast solution of (5.14) and of (5.24) involves bringing

the corresponding matrices to scalar tridiagonal form. This is done in two steps: thefirst and crucial step is to diagonalize S; the second is to bring the two diagonals whichare identicaUy -1 from the position of block subdiagonal and block superdiagonal tothat of scalar sub and superdiagonal, i.e., immediately adjacent to the main diagonal.

We shaU write the procedure for a sUghtly more general system, which includes(5.14) and (5.24) as special cases, to wit:



(5.25)

Sx+axI VS2 + a2I 7i'

ß'M-l1

7m-iI

>MSM + a-M1

W= 8;

W and 8 are partitioned to conform with the blocks of the matrix,

W = (W*,W*,...,W*f)*, 8 = (Bf, 82*,..., 3%)*, and S, = 6,.S.The eigenvalues pk and eigenvectors r\k of S,

(5.26a) Svk = ßkT}k, Kk


the scalar tridiagonaUzation of (5.25) is completed in the form

(5.31)

here each Ck is a nohsingular scalar tridiagonal M x M matrix,

5,/ifc+a, 7,

0i 52Mfc + a2 72

"lM-1

&M-1 SM^fc + 0:M_

1


periodic boundary conditions. Furthermore, the asymmetry with regard to the con-tinuity requirements on u and v does not appear to depend on whether U or V axeeliminated in the algorithm, or on the boundary conditions imposed.

6. Comparison with Existing Solvers. A fast direct solver for the inhomogeneousCauchy-Riemann equations was published by Lomax and Martin [24]. Some applica-tions and extensions are given in [25], [26]. We carried out a comparison betweenour solvers and those published previously. The comparison was made first for thetest case of [24], [26], then for some of the test cases in our Tables III through VIand similar ones. These computations were carried out on a CDC 6600 computer witha FTN 4.6 compiler; single precision arithmetic was used throughout.

Thin Biconvex Airfoil. The example problem used in [24], [26] to illustrate theuse of a Cauchy-Riemann solver in aerodynamics is that of steady, irrotational, sub-sonic, inviscid flow over a thin symmetrical parabolic-arc biconvex airfoil in the small-perturbation approximation. The linearized Prandtl-Glauert transformation (e.g., [25])yields for this problem the formulation

(6.1a) "* + vy = °.

(6.1b) ","u*"0' J'>0>-00


the rectangle not containing the airfoil. Hence, we used both the solvers of Section 5,Case 1, as well as Case 2.

Among the solvers of [26], we chose for comparison purposes version D, option(d). Version D is the one suggested by the authors themselves for the application athand, and its option (d) seemed, of all the versions and options offered, to be the onlyone which could be second-order accurate. This version corresponds to our Case 1 inthe choice of boundary conditions.

For the test case (6.1), (6.2) neither the norms of the solution over R3, nor theorder of accuracy of the solvers is particularly relevant, because of the singularities atthe edges of the airfoil, x = ±0.5, y = 0. As a means of comparison we chose, there-fore, the accuracy in computing u(x, 0). To compute U on y = 0, which is a F-lineand not a (7-line in the staggered mesh, [24] suggests the second-order accurate extra-polation formula

(6.5) Uixl2 = (l/S){9Uix - UU2 + 3(x/A)(F,. 0 - F,.+10) - 3kEi>0};we used this formula in our program, as well as in theirs.

Because of the singularity at the edges, [24] did computations both with x =±0.5 being F-lines, i.e., with computational mesh points coinciding with the edges, andwith x = ±0.5 being ¿/-lines, i.e., with the computational mesh straddling the edges.In Table VII we give results for the computation with both these meshes.

For each mesh, Table VII contains in successive columns the x-coordinates ofthe points along y = 0 at which U was calculated, the exact 77 there, the solution bythe [26] solver, version D, option (d), and its error, the solution by our solver Case 1,its error, and finally the solution by our solver Case 2, with its error. We were onlyinterested in comparing the different solvers, and did not wish to study the question ofcomputing in a finite domain the solution to a half-plane problem; hence, only thecomparison when using exact boundary data is given. The computations with homo-geneous boundary data were carried out and gave slightly poorer results for all solvers.

For the mesh points coinciding with the edges, our Case 1 solver has a smallererror than the [26] solver at aU points except four. The error, in particular, is smallerby a considerable factor over the airfoil |x| < 0.5, and it is smaller at the edges, wheremost of the error occurs. Our Case 2 solver has mostly smaUer error than the Case 1solver, but they are quite comparable.

For the computational mesh straddling the edges, our Case 1 solver seems tohave mostly larger errors than [26] outside the airfoil, 0.5 < |x|, but smaller errors in-side. The Case 2 solver is still slightly better than the Case 1 solver, except at a fewpoints.

As a conclusion, our solvers of Section 5, Case 1 and Case 2, are quite successfulon the example problem of [24]. If anything, they have slightly smaller error than the[26] solver which appeared most promising. A better test would probably be to ex-tract the singularities from the solution analytically (cf. the example in [13], for in-stance) and compute only the regular part of the solution. We felt, however, that [24]stressed the importance of computing the neighborhood of the singularities as accurate-ly as possible, and made the comparison accordingly.



Table VIINumerical results for the u component of velocity along y = 0 in the solution ofthe example problem (6.1), (6.2). The heading LM indicates results with the sol-ver of [26], version D, option (d); headings GBuv and GBuu indicate results withthe solvers of Section 5, Cases 1 and 2, respectively. The solutions were obtainedon a 64 x 64 grid; every second point is given for reasons of space economy, ex-cept near the singularity at x = ±0.5, where every point is given; the "extra"points are marked by stars. The terms "coinciding mesh" and "straddling mesh"are explained in the text.

VIIA. Coinciding mesh

u exact LM LMerror GB GBerrorGB GB

error-0.-0.-0-0-0-0-0-0-0-0-0-0-0-0-0-0.-0.-0-0.-0.

0.0.0.0.0.0.0.0.0.0.0.0.0,0.00000

953125890625828125765625703125640625609375578125546875515625484375453125421875390625328125265625203125140625078125015625046875109375171875234375296875359375390625421875453125484375515625546875578125609375671875734375796875859375921875

1.409-11.666-12.009-12.486-13.193-14.341-4

243-1588-1895-1467282646-1250-2302-1163-1729-1050170

1.2421.272

262211116698-1566-1

4.450-12.302-15.250-24.646-11.2821.4678.895-1

588-1243-1689-1802-1227-1825-1529-1

-1.'411-1-1.670-1-2.016-1-2.497-1-3.210-1-4.372-1-5.286-1-6.640-1-8.863-1-1.684-1.480-4.560-1-5.493-22.277-16.150-18.704-11.050

169241271261210115684-1548-1422-1266-1613-2573-1482

1.6858.880-1

659-1306-1736-1844-1270-1872-1584-1

1.962-44.049-46.647-41.044-31.703-33.083-34.290-35.230-3

-3.208-32.166-11.983-1

-8.664-32.434-32.490-31.323-38.401-4

.658-4

.182-4

.331-4

.869-4

.729-48.950-4

069-3338-3812-3796-3580-3635-3

7.346-31.998-12.182-11.490-37.102-36.330-34.675-34.205-34.292-34.720-35.454-3

1.411-11.670-12.015-12.496-13.208-14.370-15.283-16.637-18.860-11.6841.4804.556-15.456-22.281-16.154-18.725-1

050170242272262

1.2111.1179.695-17.560-14.435-12.280-15.464-24.557-11.4801.6848.862-16.639-15.285-13.711-12.816-12.237-11.833-11.536-1

1.612-43.346-45.586-49.012-41.522-32.864-34.051-34.970-3

-3.488-32.163-11.980-1

-9.009-32.066-32.099-38.841-4

.496-4

.209-4

.488-5

.309-5

.743-5

.587-5

.145-68.743-52.539-46.116-41.463-32.174-32.149-3

-8.918-31.981-12.164-1

-3.367-35.103-34.197-32.231-31.374-39.723-47.678-46.693-4

-1.411-1-1.670-1-2.015-1-2.496-1-3.208-1-4.370-1-5.283-1-6.637-1-8.860-1-1.684-1.480-4.556-1-5.454-22.281-16.155-18.726-11.0501.1701.2421.2721.2621.2111.1169.696-1

.560-1,436-1.281-1.454-2.556-1

-1.480-1.684-8.860-1-6.637-1

283-1709-1814-1234-1829-1531-01

1.595-43.311-45.533-48.942-4

513-3853-3039-3957-3503-3163-1980-1027-3047-3078-3606-4228-4045-5953-5

7.198-59.136-58.551-5

196-5391-5818-4297-4369-3074-3042-3033-3

1.980-12.163-13.511-34.948-34.029-3

031-3134-3777-4980-4902-4

General Purpose Comparison. Lomax and Martin developed their solvers [24],[26] with certain aerodynamical problems in mind [25], and we developed ours bearingin mind certain problems in geophysical fluid dynamics [15], [17]. On the otherhand, the first Poisson solvers were also formulated for specific applications [4], [21],and only later developed into general purpose algorithms and into packages. Therefore,it seemed reasonable to test our solvers on solutions which had an appropriately generalcharacter (Tables III through VI, and discussion at the end of Section 5); these testsshowed that our solvers are second-order accurate, given rather minimal continuity prop-erties of the solution.



VIIB. Straddling mesh

u exact LM LMerror GBGB"Verror GB'

GB""error

-0.-0.

0.0.0,

-0.0.0.0.0.0.0.0.0.0.0.0.0.

-0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.

9375875081257500687562505937556255312550004687543754062537503125250018751250062500625125018752500312537504062543754687550005312-5562559375625068757500812587509375

1.467-1■1.743-1■2.114-1•2.637-1■3.425-1■4.753-1■5.840-1•7.559-1■1.092

•7.763-1•2.353-1

.975-2441-1898-1235-1085192253273253192085235-1898-1441-1353-1353-1763-1

■1.092-7.559-1-5.840-1-4.753-1-3.425-1-2.637-1-2.114-1-1.743-1-1.467-1

■1.467-1■1.742-1■2.112-1■2.633-1•3.417-1■4.730-1■5.793-1■7.417-1-1.020

027-1219-1036-1455-1900-1234-1085192253273

1.2531.1911.0849.229-16.894-13.448-11.027-12.228-17.037-1

-1.021-7.430-1-5.807-1■4.746-1-3.435-1-2.655-1■2.138-1-1.774-1-1.506

986-5054-4127-4073-4356-4236-3734-3422-2206-2

353-2338-2796-3406-3765-4697-4290-4268-4998-4610-4144-4578-4804-4476-4365-4466-4957-3246-2252-2

-7.086-2-1.291-2-3.304-3-6.781-41.013-31.796-32.432-33.105-33.911-3

467-1741-1111-1632-1416-1729-1791-1415-1020

-7.025-1-2.217-1

.038-1

.458-1

.903-1

.237-1

.085,192.253.273.253.192.085

9.237-16.903-13.459-11.039-1

-2.216-1-7.024-1

019414-1790-1727-1414-1630-1109-1739-1464-1

063-5.474-4765-4

,939-4,463-4,372-3884-3439-2224-2

7.374-21.361-24.037-3

665-3726-4669-4183-5463-6715-5928-5971-6301-5314-5095-4274-4

-1.733-3-4.112-3-1.369-2-7.383-2

-7.235-2-1.450-2-5.011-3-2.510-3-1.108-3-6.830-4-4.957-4-3.994-4-3.457-4

1.467-11.741-1

111-1632-1416-1729-1791-1415-1020

.025-1

.217-1

.038-1

.458-1

.903-1

.237-1

.085

.192

.253

.273

.2531.1921.0859.237-16.903-13.458-11.038-1

-2.217-1-7.025-1

.020

.415-1

.791-1

.729-1

.416-1

.632-1-2.111-1-1.741-1-1.467-1

•5.903-5•1.442-4-2.717-4■4.874-4■9.380-4■2.362-3■4.873-3•1.427-2■7.223-2

.373-2

.359-2

.019-3

.646-3

.505-4

.417-4

.327-5

.980-5

.376-5

.068-5

.376-5

.990-5

.327-5

.417-4

.505-4

.646-3

.019-3-1.359-2-7.373-2

7.223-21.437-24.873-3

362-3380-4874-4717-4442-4903-5

The intent of [24], [25], [26] was explicitly restricted to the solution of spe-cific aerodynamic problems. It appeared worthwhüe, however, to consider the moregeneral applicabiüty of their solvers.

We carried out a number of tests with version D, option (d) of [26] on solu-tions with a generic character. The results are given in Table VIII. This table is or-ganized in the same way as Tables III and V of Section 4, and we refer to the de-scription there.

For very simple test cases such as u, v constant or quadratic, the [26] solver isessentiaUy exact to machine accuracy; the round-off error increases sUghtly with thenumber of grid points used. The difference between these results and those in Table IIIis due to the computer used: double-precision arithmetic on an IBM 360/95 is slightlymore accurate than single precision on a CDC 6600.

The results with more severe test cases seem to indicate that the [26] solver isonly first-order accurate in general. This is apparently due to the formulation of thealgorithm at the boundaries. An indication that boundary inaccuracies are the causeof lower than second-order accuracy are those exceptional test cases which show high



Table VIIINumerical results for general-purpose test cases using the solver of [26], versionD, option (d). The smaller regions, such as 7?4 = {(x, y): 0


TABLE VIII (Continued)

M *,(U) H2(U,V) * (u) * (v)u,v E C u = y sin(x)

+ 0.1v = sin( y)

+ 0.1

163264

128

1.7261.0023.696-11.087-1

1.9374.854-11.210-13.018-2

1.8357.876-12.750-17.981-2

6.2292.2096.000-11.750-1

4.6151.1822.969-17.427-2

16/3232/6464/128

1.7222.7123.399

3.9904.0124.010

2.3302.8643.446

2.8203.6823.429

3.9043.9803.998

u,v e c 10 . , ,u = y sin(x)v = sin(y)

163264

128

1.099+31.740+21.065+23.603+1

3.209+28.664+12.186+15.470

8.115+21.374+27.685+12.577+1

1.944+34.049+21.687+25.635+1

16/3232/6464/128

6.3207.6342.950

3.7963.9633.997

5.9051.7882.982

4.8002.4002.994

8.179+22.312+25.875+11.474+1

3.5383.9353.985

u,v e C = y sin(x)= sin(y )

163264

128

5.742-21.556-23.835-39.507-4

1.128-12.690-26.571-31.632-3

8.951-22.197-25.379-31.335-3

2.299-18.305-22.533-27.042-3

16/3232/6464/128

3.6904.0584.033

4.1954.0944.027

4.0744.0844.029

2.7683.2793.597

3.015-17.070-21.892-24.703-3

4.2643.7374.023

u,v € C u = y sin(x)• i 4,v = sm(y )

163264

128

1.261+16.5174.9132.752

1.892+11.044+17.5423.623

1.608+18.7036.3653.217

4.736+13.188+12.726+11.833+1

16/3232/6464/128

1.9351.3261.785

1.8121.3842.0828

1.8471.3671.978

1.4861.1691.487

3.853+12.896+11.494+11.026+11.3311.9391.456

t_(0> *_


TABLE VIII (Continued)

M *2(u) I (u) i (v)

u,vharmonic

u=cos(x)sinh{yv=sin(x)cosh (y

163264

128

7.981-24.464-22.379-21.234-2

2.924-21.647-28.727-34.491-3

6.013-23-365-21.792-29-285-3

3.669-12-559-11.050-11.012-1

1.105-17.177-24.246-22.383-2

16/3232/6464/12E

1.7891.8761.928

1.7751.888I.943

I.7871.878I.930

I.4341.5511.630

I.5391.6901.783

u,vharmonic

u=cos(x)sinh(yv=sin (x)cosh (y

n/8

163264

128

9.154-34.861-32.515-31.283-3

3.840-32.110-31.105-35.653-4

7.019-33.747-3I.943-39.914-4

4.605-23.087-21.938-21.166-2

1.31B-27.809-34.181-32.168-3

tf/4 16/3232/6464/128

1.883I.932I.96I

1.820I.910I.954

1.873I.9291-950

I.492I.5931.661

1.7261.868I.928

u=sin(y)cosh(xv=cos (y) sinh (x)|

163264

12816/3232/6464/12E

1.17.03.92.1

43+1655101

3.9832.5691.4637.809-1

8.5575.3152.9791.585

5.535+14.188+12.760+11.696+1

1.776+11.457+19.4925.490

1.61.71.8

18 1.5501.7561.873

1.6101.7841.879

1.3221.5171.627

1.2191.5351.745

harmonicu=cos(y)sinh{xi

V = - sin(y)cosh(x)

163264

128

1.78.94.52.2

42+1110060

8.3944.6812.4761.274

1.367+17.1173.6321.835

7.865+14.542+12.443+11.264+1

3.313+12.071+11.165+16.183

16/3232/6464/12Í

1.91.91.9

1.7931.8911.944

1.9211.9601.970

1.7311.8601.928

1.6001.7791.883

Subject to further testing, we are forced to conclude at this point that ouralgorithms, compared to previously published ones, are at least as good when appliedto specific problems of interest, and that they are more suitable for development intogeneral-purpose Cauchy-Riemann solvers.

7. Concluding Remarks. The inhomogeneous Cauchy-Riemann equations in arectangle have been discretized by a finite-difference approximation. A number ofdifferent boundary conditions have been treated explicitly, leading to algorithms whichhave overall second-order accuracy. All boundary conditions with either u or v pre-scribed along a side of the rectangle can be treated by similar methods. A rigorousproof of the second-order accuracy of the algorithm was given for one combination ofboundary conditions, and numerical experiments substantiate this result for aU theboundary conditions tested.

The algorithms presented here have nearly minimal time and storage require-ments and seem suitable for development into a general-purpose direct Cauchy-Riemannsolver for arbitrary boundary conditions. This could be done for instance along theUnes of the capacitance matrix methods of discrete potential theory (Widlund [33]);generalizations to nonrectangular domains can also be made by this approach and re-lated ones. More experience with different applications should help in formulating acode which gives a reasonable compromise between efficiency and range of applica-biUty.

It is well known [30], [31], [32] that fast solvers can be formulated for a sin-gle separable second-order eUiptic equation with variable coefficients. Clearly, the



same generalizations can be carried out for a first-order system (2.1) in which 9/9xand 9/9y are replaced by a(x)9/9x and by A(y)9/9y, and in which lower-order termscan also be introduced. A special case of such an extension appears in [26]. Effi-cient algorithms for generalizations of this nature can be based on appropriate modifi-cations and combinations of matrix decomposition, cyclic reduction and Toeplitz fac-torization [30], [31], [32].

Linear problems with variable coefficients, as well as nonlinear problems, canalso be handled by semidirect methods, i.e., by splitting the given operator to be in-verted into one whose inverse is easily computed using a fast direct method, and an-other one which is small in some suitable sense [8], [19], [25], [32]. It is in this direc-tion that we shall seek to extend the work presented here, in order to solve the non-linear problem of [15], [17] and related ones. The straightforward iterative methodof [17] wUl then be compared to semidirect methods.

Appendix A. An Error Estimate. We saw in Sections 4 and 5 that the proposedalgorithm gives a second-order accurate solution to the inhomogeneous Cauchy-Riemannequations under the various boundary conditions considered. The second-order ac-curacy of the discrete solutions was obtained in our numerical experiments even forcases in which the solution to the continuous problem was merely C1, more precisely,u E C1, v E C°. We shall give now a rigorous error estimate for the model problem ofSection 2, making the stronger and more customary assumption that u, v E C4, i.e.,that the solution to (2.1)—(2.3) has continuous derivatives up to fourth order.

The discrete operator Lh we consider is that of (3.1 la),

(A.l)

S + I-I

-I

S + 21

-I

-I

+ aee*,S + 21 -I

-I S + I

where a > 0 and e is a vector of length MN with all entries zero but one,

e, = 0 for / # Mj0 + i0, eMj0+i0 = • i

Lh acts on the grid function U. We wish to show that

(A.2a) ||i7 - £/|L = 0(h2 + k2).

From such an estimate and from (3.1a) it will follow immediately that

(A.2b) Hu - F|L = 0(h2 + k2)

as well; it suffices to observe that (3.1a) is a second-order accurate quadrature formulafor

(A.3) v(x, y) = u(x) + f¿{d(x, v) - ux(x, r,)} dV,

namely the midpoint rule. We only need to apply the result of Bramble and Hubbard[2] that (A.2a) implies similar estimates for the difference quotients of u.



The matrix operator Lh of (A.l) is of monotonie type (Collatz, [6, p. 42 ff.])or of positive type with diagonal dominance (Forsythe and Wasow [11, p. 181]). Toavoid confusion in the terminology, we shall simply state that Lh satisfies the following

Lemma 1 (Maximum Principle). IfLhVl = H, a77»i H > 0 (in the sense thatall the components of H are nonnegative), then W > 0.

From this, one easüy shows thatLemma 2 (Comparison Theorem). // \LhVi\ < Lh


conclude based on Lemma 2 that (A.2) follows from (A.6). We remark at this pointthat, in order to make the notation of (A.6) transparent, we defined

(A.7a) L = -k2A in R, L = -kb/by on bR,

(A.7b) f=-k2(dx+ ey) in R, f= -k(e + vx) on bR;

furthermore, F is not merely / at the grid points.A number of slight difficulties arise in carrying out the program above. First,

(A.4) is essentiaUy a Neumann problem, rather than a Dirichlet problem as in [13] andin most of the literature on elliptic systems. The work on boundary conditions of thesecond and third kind most relevant to the estimation which follows is that of Batsche-let [1], who gives an 0(h) estimate, and that of Bramble and Hubbard [3], who givean 0(A2|log A|) estimate, their assumptions being that u E C4. The latter article alsocontains further references to estimates for the Neumann problem. The boundary con-dition approximations in these works are different from ours.

Second, (A.6) actually fails close to the boundary, i.e., for / = 1, TV, where thelocal truncation error is of first order only. The work of Bramble and Hubbard ([3]and references therein), combined with our numerical results, led us to expect that(A.2) could stiU be proved, with some additional effort; this turned out to be the case.

Our method to obtain (A.2) is actuaUy a modification of that in [1], which ex-ploits the fact that the boundary condition we use is second-order accurate, while thatin [1] is only first-order. We shall see below that the failure of (A.6) for / = 1, /Vstems from Lh there being a linear combination of the discrete analog to (A.4b) andof that to (A.4c). Hence the truncation error, as defined by (A.6), is of first orderthere, although both (A.4b) and (A.4c) are separately approximated to second order.In spite of this formally first-order truncation error, the fact that both the equationand the boundary condition are approximated by second-order discrete analogs yieldsan overall 0(h2 + k2) error estimate (A.2).

After these observations, we proceed with the business at hand. To start, wederive (A.6) for 2


in writing 0(hpkq) depend on the derivatives of order p + q for the functions involved;they wiU not be written down explicitly, since those derivatives are known to bebounded from our assumptions.

Here, as in the derivation of (A.2b) from (A.3) and as in the sequel, it is im-portant to remember the staggering of Figure 1, which essentially guarantees that aUdifferences are centered. Notice also that, for e- # 0, (A.8b) will only be modified

by the term ae,- e,*u, in Lhu, and by the term ae; et U,- in F; the condition weio >o *o " >o 'o >ochose to eliminate the indeterminacy in the Neumann problem is exactly that thesetwo terms cancel, and hence the estimate is not affected. If instead of ee* we haveA, the same observation holds as after Lemma 2; i.e., prescribing the average of U- ,

rather than one of its components, U¡ ¡ , does not affect the estimate either.'o-VTo analyze the situation for / = 1, it is convenient to rewrite (A.5) for / = 1 as

a linear combination of two equations, involving an auxiliary vector U0; an entirelysimilar procedure can be carried out for / = N, introducing U^ +,, but we shall omitwriting out the latter analysis and merely draw the conclusions we need from it. Thetwo equations for / = 1 are

(A.9a) -U0 + (S + 2I)\JX - U2 = k(T*Dx + E0 - Ej),

(A.9b) U0-Uj =-xE0 + r*V0;

here we revert to the original definition of Dj, which had been replaced by Dj +k~l V0 in writing Eq. (3.3). Equation (A.9a) is now simply the discrete analog of (A.4b)for/ = 1, or y = k/2, with E0 defined on y = 0 in the usual manner. Equation (A.9b)is the discrete analog of (A.4c) written on y = 0. If the auxiliary vector U0 is thought ofas given on y =-k¡2, then both (A.9a) and (A.9b) are formally second-order accurate.

With this motivation in mind, it is easy to obtain

(LhW)x =(5 + /)Wj-W2

= (S + I)ux - u2 + k2Au\y = k/2 + fc(9/9y)77|y = 0

(A.lOa) -k2(dx + ey)\y = k/2 - k(T*Dx + EQ - Ex)

-k(e + vx)\y=i0+kE0-T*yo

= [(Lh-L)u]x + [f-F]x.

We have from (A. 10a) that

[(Lh - L)u], = 0(k2(h2 + k)), [f~F]x= 0(k2(h2 + k2)) + 0(kh2).

We assume throughout that k/h = 0(1) and drop the fourth-order terms; thus finaUy

(A. 10b) (/.„IV), = 0(k(h2 + k2)).

For / = N we obtain in the same way



(A.llb) L\¡/ =

(A-10c) (LhW)N = 0(k(h2 + k2)).

At this point we have to exhibit a comparison function


(A. 14b) 777 = CrA2,for (A.l2b) to hold. The constant Cr depends only on the derivatives of u, and on thedomain R.

In fact, one can write down \p = \¡/(x, y; a, b, a, ß) explicitly and optimize theparameters a, b, a, ß subject to (A.llc). This yields

(A.15a) IB'I


The main part of the program is subroutine FASTCAR. For user convenience,the driver subroutine, as well as the FFT subroutine and the tridiagonal solver we usedare included. Clearly, these subroutines can be changed to others which the user forone reason or another finds are better adapted to his needs or preferences; e.g., cyclicreduction can be used instead of the FFT, or Toeplitz factorization instead of the LUfactorization we use for the scalar tridiagonal systems, etc. The use of a more generalFFT algorithm would remove the restriction of M being a power of two.

The driver program contains some diagnostics on error norms, which would haveto be modified for problems whose solution is not known in closed form. FORMATstatements are grouped together and can thus easily be modified. Execution speed canbe increased if MN additional storage locations are available, by avoiding the shiftingrequired for the reindexing process within the same array. Other trade-offs betweenstorage and execution time are also possible.

The implementation foUows rather closely the algorithm description in the lastsubsection of Section 3. With the help of the COMMENTS provided in the pro-gram, we hope that it is fairly readable.

Questions and comments by users are warmly invited.

C DRIVER FOR FASTCAk * 1C A 2

COMMON /KEEP/ Z(S25bl A 3COMMON /HAND/ t(6292),D(U292) A 4COMMON /WET/ V0(12B>


XZ«XJV!I)K=(J-1)*M+I

SETTING UP R.H.S. UF THE INHOMOGENEOUS CAUCHY-RIEMANNEQUATIONS.E IS THE R.H.S. OF THE VORTICITY EOUATIOND IS THE R.H.S. OF THE DIVERGENCE EQUATION

E(K)«DUDY


WRITE (6,90)WRITE (6,90)WRITE (6,80) N,M,XL,XU,YL,YUWRITE (6,90)WRITE (6,90)

60 CONTINUEWRITE (6,90)DO 70 J = l,5W< J,3)=W


MHALFP=M8Y2+1MPLUS2=M+2IL«M*NILM»IL-KALAM'DELTAY/DELTAXALSQ*ALAM*ALAM

SETTING UP THE R.H.S. OF THE LINEAR SYSTEM

G2«2.0*DELTAYDO 10 K=1,ILME(KI*E(K)*G2

10 D(K)»D(K)*G2DO 20 1 = 1,MKP-I+ILMD(KP)*D(KPI*G2-A(I,2)D(1)-D(II+A(I,1)

20 A( 1,11*0.0IP*1IS*2DO *0 J*1,ILM,MKM=J-1DO 30 1*1,MlKl-KM+IA(I,IS)»E(K1)

30 E(K1)=ALAM*(D(K1+1)-Ü(K1))+E(K1)-A(I,IP)KMMP=K1+1A(M,I3)=t(KMMP)E(KMMP)=ALAM*(D(KM+1)-D(KMMP>)+E(KMHP)-A(M,IP)IEX=IPIP*IS

*0 IS-IEXDO 50 I-1,M1K1«ILM+I

50 E(K1)»ALAM*(D(K1+1)-D(K1))-A(I»IP)E(IL)=ALAM*(D(ILM+l)-D(IL))-A(f1,IP)

PERFORMING THE FAST FOURIER TRANSFORM BLOCK BY BLOCK

SIG—l.ODO 80 K1*1,NJF«(K1-1)*MDO 60 1*1,MJ»JF+IA(1,1)=E(J)

60 A( 1,2)-0.0CALL FFT2 (SIG,MPOw,PI)00 70 I*1,MBY2IQF*JF+I*2E(IQF-1)»A(1,1)

70 E(IQF)=A(I,2)X(K1)"AIMHALFP,1)

80 Y(K1)*AIMHALFP,2)

SOLVING EACH SUB-SYSTEM

DO 120 J*2,MBY2IF (ICODE.EO.3) GO TO 100X1«2.0*ALSO*(COS(2.0*PI*(FLOAT(J)-1.0)*AM)-1.01-2.0A(1,1)-X1+1.0A(N,1)=X1+1.000 90 K*2,N1

90 A(K,1)*X1100 DO 110 K*1,N

I»(K-1)*M+2*JIQF«K+NA(K,2)«E(I-1I

110 A(I0F,2)*E(I)CALL CONSOL (N,NEXT,ICODE)DO 120 K-1,NI«(K-1)*M+2*JE(I-1)»A(K,2)IQF*K+N

120 E(I)-A(IQF,2)IF (ICODE.EO.3) GO TO 1*0Xl*-2.0*(ALSO*2.0+l.J)A(1,1)»X1+1.0A(N,1)*X1+1.0DO 130 K*2,N1

130 A(K,1)=X11*0 DO 150 J*1,N

A(J,2)-X(J)IOF-J+N

5556575t59606162636*656b67be697C7172737


150 A(I0F»2)«Y(J)CALL CONSOL (N,NEXT,1C0DL)DO 160 J*1,NIQF=N+JX(J)*A(J,2)

160 Y(J)=A(10F,2)IF (ICODE.EQ.3) GO TO 1B0All,11—1.0A(N,1)*-1.0DO 170 K=2,N1

170 A(K,l)«-2.0180 DO 190 K»1,N

I*(K-1)*M+1A(K,2)*E(I)IOF=N+K

190 A(IOF,2)=E(1+1)AIIR0W,1)=A(IRÛW,11-1.0A(IR0W,2)*A(IR0W.21-S0LCALL CONSOL (N,NEXT,ICû0E)00 200 K»1,NI=(K-1)*M+1E(I)=A(K,2)IQF*N+K

200 E(I+1)*A(I0F,2)

FAST FOURIER TRANSFORM OF SOLUTION U

SIOí)JF00IQMMDJL"A(A(A(A(CA00J =

230 E(

210

220.

G*l.230(Kl210

F*JF1,1)1,2)

220MPLUL,llL,2)KHALMHALLL F

230JF+IJ)=A

0Kl

-1)1 =

♦ I*•E(*E(

1 =S2->A(*-AFP,FP,FT2

1 =

1,N*M1,MSY22IQF-1)IOF)2,M3Y2I1,1)(1,2)1)=X(K1)2)*Y(K1)

(SIG,MPOW,PI)1,M

(1,1)»AM

SOLUTION V WILL BE IN ARRAY D AND U WILL BE IN E

(ISAM.EQ.O) GU TO 300(ISAM.EC.-1) GO TO 260250 K*1,N1

= K*MMP=KM-M1,1)=ALAM»(E(KM)-E(KMMP + D)

2*0 J"2,M=KMMP+JJ,1)=ALAM*(L(KJ-l)-E(KJ))

250 1*1,MKMMP+IJ)=D(J)+«(1,1)

(K.LL.l) GO TO 250I = J-MJ)=0(J)+D(KlI)NTINUE

IFIFDOKMKMA(DOKJ

2*0 A(DOJ*D(IFKl0(

250 COGO TO 300

2.60 Oq 270 1*1,MJ-I+ILM

270 A(I,1)*0(J)IS = 1IP*2DO 290 KD*1,N1IEX«ISIS = IPIP"IEXKM«M*(N-KD+1)KMMP*KM-MA(1,IP)=ALAM»(E(KM)-E(KMMP+1))+A(1»IP)DO 280 J*2,MKJ*KMMP+J

280 A(J,IP)=ALAM*(£(KJ-1)-E(KJ))+A(J,IP)DO 290 1=1,MJ-KMMP+I

1361391*01*11*21*31**1*5

6 1*6S 1*7B l*bB 1*9B 150B 151B 152B 153B 15*B 155B 1568 157B 15EB 159B 160B 161B 162B 163B 16*B 165B 1663 167B 16CB 169B 170B 171B 172

173B 17*B 175B 176B 1778 176B 179B 18 08 161B 162B 183B 18*

185B 186B 167B 186B 189B 190B 191B 192B 1938 19*8 195B 196B 197B 198B 1998 200B 201

20220320*20520620720t209210211

B 212B 213B 21*B 215B 21bB 217B 2i6

B

B



K1I-J-MA(I,IS)'0(K1I)0(K1I)—A(I,IP)IF (KD.GT.l) D(K1I)»D(J)+D(K1I)

290 CONTINUE300 CONTINUE

RETURNENDSUBROUTINE FFT2 (SIG,M,PI)COMMON /WET/ X(12f ),Y-(l26)

FAST FOURIER TRANSFORM

N=2*»MNV2-N/2NM1=N-1J = lDO 30 1=1,NM1IF ( I.GE.J) GO TO 10TX-XÍJ)TY*Y(J)X(J)=X(I)Y(J)=Y(I)X(1)=TXY(I)*TY

10 K=NV220 IF (K.GE.J) GO TO 30

J-J-KK=K/2GO TO 20

30 J*J+KDO 50 L=1,MLE*2**LLEl*LE/2UX-1.0UY*0.0ANG*PI/FL0AT(LE1)WX=COS(ANG)WY=SIG*SIN(ANG)DO 50 J=1,LE1DO *0 I*J,N,LEIP=I+LE1TX*X(IP)*UX-Y(IP)»UYTY=X(IP)»UY+YI1P)*UXX(IPI*X(Il-TXY(IP)*Y(I)-TYX(I)*X(I1+TX

*0 Y(1)=Y(I)+TYUS*UX»WX-UY*WYUY=UX*WY+UY*WX

50 UX=USRETURNENDSUBROUTINE CONSOL (N,NEXT,1 CODE)COMMON /KEEP/ 2(8256)COMMON /WET/ X(126),Y(126)

»♦»»♦»»♦»t****************************************************

* SOLVES A SPECIAL TP1-DIAGJNAL LINEAR SYSTEM BY LU* FACTORIZATION .* TWO R.H.S. ARE GIVEN AND STORED IN Y, STARTING AT Y(l)« AND AT YIN+1) RESPECTIVELY .* THE TWO CORRESPONDING VECTOR UNKNOWNS ARE RETURNED IN X,* STARTING AT X(l) AND AT XIN+i) RESPECTIVELY.

B 2193 22CB 221B 222o 223à 22*B 225

26-1

IF ICODE = 3 THEN Z ARRAY MUST BE KEPT RESERVED FORUSE BY CONSOL.

ttl>tltt(>tttl

IF (ICODE.EO.3) GO TO 20Z(NEXT+1)*X(1)00 10 K«2»NNEXTPL-NEXT+KKl-K-1X(K1)=1.0/Z(N[XTPL-1)

10 Z(NEXTPL)*X(KI-X(K1)20 DO 30 K-2,N

KPL=K+NKNEXT«NEXT+K-1Y(K)=Y(K)-Y(K-1)/Z(KNEXT)

23*56789

101112131*1516171619202122232*2526272e29303132333*3536373639*0*1*2*3***5-

123

67f9

101112131*1516171619202122232*252627



30 Y(KPLI*Y(KPL)-Y(KPL-1)/Z(KNEXT)NEXTPL-NEXT+NY(N)«Y(N)/Z(NEXTPL)Y(2»N)=Y(2»N)/Z(NEXTPL)DO *0 J*2,NK-N-J+lKPL-K+NKNEXT=K+NEXTY(K)*(Y(K)-Y(K+1))/Z(KNEXT)

'""(KPL>=(Y(KPL)-Y(KPL+1))/Z


20

a fast cauchy-riemann solver · 2018. 11. 16. · a fast cauchy-riemann solver by michael ghil* and...

Documents