Numerical Methods for Partial Differential Equations 13, 77–92 (1997). © 1997 John Wiley & Sons, Inc. CCC 0749-159X/97/010077-16

Accelerated Multigrid High Accuracy Solution of the Convection–Diffusion Equation with High Reynolds Number

Jun Zhang
Department of Mathematics, The George Washington University, Washington, D.C. 20052

Received 17 January 1996; revised manuscript received 11 August 1996

A fourth-order compact finite difference scheme is employed with the multigrid algorithm to obtain highly accurate numerical solution of the convection–diffusion equation with very high Reynolds number and variable coefficients. The multigrid solution process is accelerated by a minimal residual smoothing (MRS) technique. Numerical experiments are employed to show that the proposed multigrid solver is stable and yields accurate solution for high Reynolds number problems. We also show that the MRS acceleration procedure is efficient and the acceleration cost is negligible. © 1997 John Wiley & Sons, Inc.

I. INTRODUCTION

Numerical simulation of the convection–diffusion equation plays a very important role in computational fluid dynamics. The general convection–diffusion equation satisfying the Dirichlet boundary conditions is of the form

\[
\begin{aligned}
u_{xx} + u_{yy} + p(x, y)\,u_x + q(x, y)\,u_y &= f(x, y), & (x, y) &\in \Omega, \\
u(x, y) &= g(x, y), & (x, y) &\in \partial\Omega.
\end{aligned}
\tag{1}
\]

Here the convection coefficients p(x, y) and q(x, y) are functions of the variables x and y, Ω is a convex domain, and ∂Ω is the boundary of Ω.

This equation often appears in a variety of situations, e.g., in the numerical solution of the steady incompressible Navier–Stokes equations [1–3]. The magnitudes of p(x, y) and q(x, y) determine the ratio of the convection to diffusion. In many problems of practical interest, the convective terms dominate the diffusion. Many numerical simulations of Eq. (1) become increasingly difficult (converge slowly or even diverge) as the ratio of the convection to diffusion increases.

Suppose that Eq. (1) is discretized by some finite difference scheme, which results in a system of linear equations

\[
A^h u^h = f^h, \tag{2}
\]

where $h$ is the uniform grid spacing of the discretized domain $\Omega^h$. In practice, the matrix $A^h$ in (2) is sparse and of large dimension; it is usually nonsymmetric and indefinite for large Re, where Re is the cell Reynolds number defined as

\[
\mathrm{Re} = \max\Bigl( \sup_{(x,y)\in\Omega} |p(x, y)|,\ \sup_{(x,y)\in\Omega} |q(x, y)| \Bigr)\,\frac{h}{2}. \tag{3}
\]

For Re ≤ 1, we say that Eq. (1) is diffusion-dominated; otherwise it is convection-dominated.

If the discretization scheme of Eq. (1) is the central difference scheme (CDS), the resulting linear system (2) is a five-point formula (FPF) with a truncation error of order $h^2$. In the case of the FPF scheme, classical iterative methods for solving the resulting linear system do not converge when the convective terms dominate and the cell Reynolds number (Re) is greater than a certain constant. On the other hand, the conventional upwind difference approximation is computationally stable, but only first-order accurate, and the resulting solution exhibits the effects of artificial viscosity. The second-order upwind method suffers from similar problems, and the higher-order finite difference methods of conventional type are computationally inefficient. In addition, direct methods for solving the resulting linear system may give erroneous results for large Re.
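As a rough illustration of definition (3), the following Python sketch estimates the cell Reynolds number on the unit square. The function names `cell_reynolds` and `regime`, and the idea of approximating the suprema by sampling on a uniform lattice, are assumptions of this sketch and not part of the original method. The example coefficients are those of Test Problem 1 in Section VI.

```python
def cell_reynolds(p, q, h, samples=101):
    """Estimate Re = max(sup|p|, sup|q|) * h / 2 on the unit square.

    The suprema in definition (3) are approximated by sampling p and q on a
    uniform samples-by-samples lattice (an assumption of this sketch).
    """
    pts = [i / (samples - 1) for i in range(samples)]
    sup_p = max(abs(p(x, y)) for x in pts for y in pts)
    sup_q = max(abs(q(x, y)) for x in pts for y in pts)
    return max(sup_p, sup_q) * h / 2.0


def regime(re):
    """Classify a problem by the Re <= 1 criterion stated in the text."""
    return "diffusion-dominated" if re <= 1.0 else "convection-dominated"


# Coefficients of Test Problem 1 with P = 100 on the h = 1/64 grid.
P = 100.0
re = cell_reynolds(lambda x, y: P * x * (1 - y),
                   lambda x, y: P * y * (1 - x), h=1.0 / 64)
print(re, regime(re))
```

For these coefficients the supremum of |p| is P, so the estimate agrees with the closed form Re = Ph/2.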

Due to its importance in practical applications, various attempts have been made to solve Eq. (1) with iterative (especially multigrid) methods based on FPF. Recently, Brandt and Yavneh [2, 3] used the first differential approximation (based on FPF) with added dissipation terms to solve Eq. (1) in the context of high Reynolds number flow problems. They proposed over-weighted residual and defect-correction techniques to accelerate the multigrid convergence for high Reynolds number incompressible entering flow and recirculating flow problems. De Zeeuw [4, 5] developed a black-box multigrid solver with some matrix-dependent prolongations and restrictions. A multigrid method based on the Schur complement of the coefficient matrix and the matrix-dependent prolongation operator was recently published by Reusken [6, 7].

In addition, Elman and Golub [8] proposed methods that consist of applying one step of cyclic reduction, resulting in a reduced system of half the order of the original discrete problem. These methods combine a reordering of the grid points in a checkerboard fashion with a block iterative technique for solving the reduced system, and converge for any value of the cell Reynolds number.

Theoretically, all of the above-mentioned methods have (at most) the second-order convergence rate. The theoretical error bounds were implied by the discretization error bounds of the central difference scheme. But there have been no numerical results to show that the theoretical order of convergence rate was achieved by these methods.

For the convection–diffusion equation with constant coefficients, Gupta et al. [9] proposed a fourth-order nine-point compact finite difference formula (NPF), which was shown to be computationally efficient and stable and to yield highly accurate numerical solutions. The NPF scheme was extended to solve the convection–diffusion equation with variable coefficients in [10]. The new scheme also has a truncation error of order $h^4$ and the resulting linear system can be solved by classical iterative methods for large values of p(x, y) and q(x, y). Recently, similar fourth-order compact schemes have been developed by Dennis and Hudson [11], Li, Tang and Fornberg [12], and Spotz and Carey [13].

Altas and Burrage [14] used this compact NPF with the defect-correction multigrid technique to solve the steady incompressible Navier–Stokes equations and obtained accurate solution with small Reynolds number (Re = 100). They used NPF only to evaluate the residuals on the finest grid and did not show if the target fourth-order convergence rate was achieved for the convection–diffusion equation. The current work is to develop a multigrid solver with NPF to solve Eq. (1) with high Reynolds number and variable coefficients and to introduce a minimal residual smoothing (MRS) technique to accelerate the convergence of the proposed multigrid method. Through numerical examples, we will show that the multigrid method does achieve the fourth-order convergence rate for some convection–diffusion equations.

In this article, we present the nine-point compact finite difference discretization formula (NPF) for Eq. (1) in Section 2. In Section 3, we introduce the general philosophy of the multigrid method, including the V-cycle and W-cycle algorithms. In Section 4, a minimal residual smoothing technique is employed to accelerate the multigrid convergence. The accelerated NPF multigrid solver for the convection–diffusion equation is formally designed in Section 5. In Section 6, numerical experiments are employed to show the stability and the effectiveness of the NPF multigrid solver, and the efficiency and the cost of the minimal residual smoothing acceleration procedure. Some concluding remarks are given in Section 7.

II. NPF FINITE DIFFERENCE SCHEME

The approximate value of a function u(x, y) at a mesh point (x, y) is denoted by $u_0$. The approximate values at its eight immediate neighboring points are denoted by $u_i$, i = 1, 2, ..., 8. The nine-point compact grid points are labeled as follows:

\[
\begin{matrix}
u_6 & u_2 & u_5 \\
u_3 & u_0 & u_1 \\
u_7 & u_4 & u_8
\end{matrix}
\]

The discretized values $p_i$, $q_i$, and $f_i$, i = 0, 1, ..., 4, have their obvious meanings. The compact finite difference formula for the mesh point (x, y) involves the nearest eight neighboring mesh points with the mesh spacing $h$ and is given by (for details of deriving this formula, readers are referred to [10]):

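\[
\sum_{j=0}^{8} \alpha_j u_j = \frac{h^2}{2}\,[8f_0 + f_1 + f_2 + f_3 + f_4] + \frac{h^3}{4}\,[p_0(f_1 - f_3) + q_0(f_2 - f_4)], \tag{4}
\]

where the coefficients $\alpha_i$, i = 0, 1, ..., 8, are given as

\[
\begin{aligned}
\alpha_0 &= -[20 + h^2(p_0^2 + q_0^2) + h(p_1 - p_3) + h(q_2 - q_4)], \\
\alpha_1 &= 4 + \frac{h}{4}[4p_0 + 3p_1 - p_3 + p_2 + p_4] + \frac{h^2}{8}[4p_0^2 + p_0(p_1 - p_3) + q_0(p_2 - p_4)], \\
\alpha_2 &= 4 + \frac{h}{4}[4q_0 + 3q_2 - q_4 + q_1 + q_3] + \frac{h^2}{8}[4q_0^2 + p_0(q_1 - q_3) + q_0(q_2 - q_4)], \\
\alpha_3 &= 4 - \frac{h}{4}[4p_0 - p_1 + 3p_3 + p_2 + p_4] + \frac{h^2}{8}[4p_0^2 - p_0(p_1 - p_3) - q_0(p_2 - p_4)], \\
\alpha_4 &= 4 - \frac{h}{4}[4q_0 - q_2 + 3q_4 + q_1 + q_3] + \frac{h^2}{8}[4q_0^2 - p_0(q_1 - q_3) - q_0(q_2 - q_4)], \\
\alpha_5 &= 1 + \frac{h}{2}(p_0 + q_0) + \frac{h}{8}(q_1 - q_3 + p_2 - p_4) + \frac{h^2}{4}p_0 q_0, \\
\alpha_6 &= 1 - \frac{h}{2}(p_0 - q_0) - \frac{h}{8}(q_1 - q_3 + p_2 - p_4) - \frac{h^2}{4}p_0 q_0, \\
\alpha_7 &= 1 - \frac{h}{2}(p_0 + q_0) + \frac{h}{8}(q_1 - q_3 + p_2 - p_4) + \frac{h^2}{4}p_0 q_0, \\
\alpha_8 &= 1 + \frac{h}{2}(p_0 - q_0) - \frac{h}{8}(q_1 - q_3 + p_2 - p_4) - \frac{h^2}{4}p_0 q_0.
\end{aligned}
\]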
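The coefficient formulas translate directly into code. The sketch below is a plain transcription of Eq. (4) into Python; the function names `npf_coefficients` and `npf_rhs` and the tuple-based calling convention are hypothetical choices made here for illustration. Two quick sanity checks follow from the formulas themselves: with p = q = 0 the weights reduce to −20, 4, and 1, and for any coefficient values the nine weights sum to zero, so a constant function is annihilated by the left-hand side.

```python
def npf_coefficients(p, q, h):
    """Return [alpha_0, ..., alpha_8] of the nine-point formula (4).

    p and q are sequences (p0, p1, p2, p3, p4) and (q0, ..., q4) holding the
    convection coefficients at the center point and its four edge neighbors,
    numbered as in the stencil diagram.
    """
    p0, p1, p2, p3, p4 = p
    q0, q1, q2, q3, q4 = q
    mixed = q1 - q3 + p2 - p4  # combination shared by the four corner weights
    a0 = -(20 + h**2 * (p0**2 + q0**2) + h * (p1 - p3) + h * (q2 - q4))
    a1 = (4 + h / 4 * (4*p0 + 3*p1 - p3 + p2 + p4)
          + h**2 / 8 * (4*p0**2 + p0*(p1 - p3) + q0*(p2 - p4)))
    a2 = (4 + h / 4 * (4*q0 + 3*q2 - q4 + q1 + q3)
          + h**2 / 8 * (4*q0**2 + p0*(q1 - q3) + q0*(q2 - q4)))
    a3 = (4 - h / 4 * (4*p0 - p1 + 3*p3 + p2 + p4)
          + h**2 / 8 * (4*p0**2 - p0*(p1 - p3) - q0*(p2 - p4)))
    a4 = (4 - h / 4 * (4*q0 - q2 + 3*q4 + q1 + q3)
          + h**2 / 8 * (4*q0**2 - p0*(q1 - q3) - q0*(q2 - q4)))
    a5 = 1 + h / 2 * (p0 + q0) + h / 8 * mixed + h**2 / 4 * p0 * q0
    a6 = 1 - h / 2 * (p0 - q0) - h / 8 * mixed - h**2 / 4 * p0 * q0
    a7 = 1 - h / 2 * (p0 + q0) + h / 8 * mixed + h**2 / 4 * p0 * q0
    a8 = 1 + h / 2 * (p0 - q0) - h / 8 * mixed - h**2 / 4 * p0 * q0
    return [a0, a1, a2, a3, a4, a5, a6, a7, a8]


def npf_rhs(f, p0, q0, h):
    """Right-hand side of Eq. (4); f = (f0, f1, f2, f3, f4)."""
    f0, f1, f2, f3, f4 = f
    return (h**2 / 2 * (8*f0 + f1 + f2 + f3 + f4)
            + h**3 / 4 * (p0*(f1 - f3) + q0*(f2 - f4)))


# Poisson limit (p = q = 0): the weights no longer depend on h.
print(npf_coefficients((0.0,) * 5, (0.0,) * 5, 0.1))
```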


When Re ≡ 0, Eq. (1) reduces to the Poisson equation, and Eq. (4) reduces to the well-known Mehrstellen formula [15]. Multigrid applications of the Mehrstellen formula have been investigated by Schaffer [16] and by Gupta, Kouatchou, and Zhang [17]. The latter authors also showed that the multigrid method with NPF is much more efficient than that with FPF for solving the Poisson equation on both serial and vector computers.
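For reference, setting p = q = 0 in the coefficients of Eq. (4) gives the weights −20, 4, and 1, so the reduced formula can be written out explicitly. This is a direct substitution into Eq. (4), shown here as a worked step:

```latex
\[
4(u_1 + u_2 + u_3 + u_4) + (u_5 + u_6 + u_7 + u_8) - 20u_0
  = \frac{h^2}{2}\,(8f_0 + f_1 + f_2 + f_3 + f_4).
\]
```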

III. MULTIGRID METHOD

The multigrid method is among the fastest and most efficient algorithms for solving linear systems arising from discretized elliptic partial differential equations. The multigrid algorithm iterates on a hierarchy of successively coarser grids until the convergence is reached (the residual equations are approximately solved on the coarse grids); considerable computational time is saved by doing major computational work on the coarse grids.

One iteration of a simple multigrid cycling algorithm consists of smoothing the error using a relaxation technique (e.g., Gauss–Seidel and Jacobi methods, which are called smoothers in the multigrid literature), projecting (restricting) the residuals to the coarse grid, solving an approximation to the smooth error equation on the coarse grid, interpolating (prolongating) the coarse grid error correction back to the fine grid, and finally adding the error correction into the approximation. An important aspect of the multigrid method is that the coarse grid solution can be approximated by recursively using the multigrid idea. That is, on the coarse grid, relaxation is performed to reduce the high frequency errors followed by the projection of the residuals on yet another coarser grid, and so on. Thus, the multigrid method requires a series of (different) problems to be solved on a hierarchy of grids with different mesh-sizes. A multigrid V-cycle is the computational process that goes from the finest grid down to the coarsest grid and back from the coarsest up to the finest. A common variation of the V-cycle is to do two correction cycles at each level before returning to the next fine level; this is the W-cycle. A V($\nu_1$, $\nu_2$)-cycle is a multigrid V-cycle algorithm which performs $\nu_1$ relaxation sweeps at each level before projecting the residuals to the coarse grid space (pre-smoothing sweeps), and performs $\nu_2$ relaxation sweeps after interpolating the coarse grid correction back to the fine grid space (post-smoothing sweeps). A W($\nu_1$, $\nu_2$)-cycle can be defined similarly. For more details on the motivation, philosophy, and processes of the multigrid method, readers are referred to [18–21] and the references therein.

We assume that the discretized grid space is naturally (lexicographically) ordered. We will use the point Gauss–Seidel method as the smoother, which is one of the cheapest smoothers available. (Different but much more costly smoothers are used by Brandt and Yavneh [3], de Zeeuw [5] and Reusken [6, 7].) The bi-linear interpolation will be used in our algorithm to interpolate the coarse grid correction from the coarse grid back to the fine grid. Specifically, the values at the common mesh points will be directly transferred, while the values at the new grid points will be obtained by averaging either two or four nearest mesh points. The full-weighting operator with the stencil

\[
\frac{1}{16}
\begin{bmatrix}
1 & 2 & 1 \\
2 & 4 & 2 \\
1 & 2 & 1
\end{bmatrix}
\]

will be used to transfer the residuals from the fine grid to the coarse grid. Note that the projection operator is the transpose of the interpolation operator up to the difference of a constant factor.
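The two grid-transfer operators described above can be sketched as follows. This is a minimal pure-Python illustration, assuming square grids stored as lists of lists with (n+1) × (n+1) points including the boundary and n even; the function names `restrict_full_weighting` and `prolong_bilinear` are hypothetical.

```python
def restrict_full_weighting(r):
    """Project a fine-grid residual to the coarse grid with the 1/16 stencil.

    r is an (n+1) x (n+1) list of lists (n even) including boundary points.
    Coarse boundary entries are left at zero, since the residual of a
    Dirichlet problem vanishes on the boundary.
    """
    n = len(r) - 1
    m = n // 2
    rc = [[0.0] * (m + 1) for _ in range(m + 1)]
    for I in range(1, m):
        for J in range(1, m):
            i, j = 2 * I, 2 * J
            rc[I][J] = (4.0 * r[i][j]
                        + 2.0 * (r[i-1][j] + r[i+1][j] + r[i][j-1] + r[i][j+1])
                        + r[i-1][j-1] + r[i-1][j+1]
                        + r[i+1][j-1] + r[i+1][j+1]) / 16.0
    return rc


def prolong_bilinear(ec):
    """Interpolate a coarse-grid correction to the next finer grid."""
    m = len(ec) - 1
    n = 2 * m
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for I in range(m + 1):            # common mesh points: copy directly
        for J in range(m + 1):
            e[2*I][2*J] = ec[I][J]
    for i in range(0, n + 1, 2):      # new points on coarse rows: two neighbors
        for j in range(1, n, 2):
            e[i][j] = 0.5 * (e[i][j-1] + e[i][j+1])
    for i in range(1, n, 2):          # new rows: average the rows above and below
        for j in range(n + 1):
            e[i][j] = 0.5 * (e[i-1][j] + e[i+1][j])
    return e


# A unit correction at the single interior point of a 3 x 3 coarse grid.
coarse = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
fine = prolong_bilinear(coarse)
print(fine[2][2], fine[2][1], fine[1][1])
```

In the last loop, a point at the center of a coarse cell averages two values that are themselves two-point averages, which yields the four-point average mentioned in the text.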


Because the residual equations are solved on the coarse grids in the multigrid method, the right-hand side as it appears in Eq. (4) is evaluated only once, on the finest grid, when the initialization of data (boundary conditions) is performed. We may define $F_0$ by

\[
F_0 = \frac{h^2}{2}\,[8f_0 + f_1 + f_2 + f_3 + f_4] + \frac{h^3}{4}\,[p_0(f_1 - f_3) + q_0(f_2 - f_4)].
\]

Now Eq. (4) becomes

\[
\sum_{i=0}^{8} \alpha_i u_i = F_0.
\]

The computation of $F_0$ for grid points close to the boundary requires the knowledge of f(x, y) on the boundary. We assume that f(x, y) is extended naturally on to ∂Ω. Note that there is no special formula needed for computing grid points close to the boundary, since the NPF discretization scheme given by Eq. (4) is of compact type.

To access the computer's memory more efficiently, practical multigrid solvers usually use a single long array to store the discretized values of u and f (here F) for all grid levels. On the coarse grids, the locations of u and f are used to store coarse grid correction and the residual, respectively.

It is also economical to precompute the values of p(x, y) and q(x, y) on each grid point. We use another long array to store these values at each grid point and on each grid level. This is similar to the long array used above to store values of the approximate solution u(x, y) and the right-hand side f(x, y). The same pointers may be used for all the arrays.

There is an option of precomputing all values of the coefficient matrix $A^h$, but this requires four and a half times more storage space than usually required for the nine-point multigrid solver to store the information of matrix $A^h$. There is a trade-off between the storage space and the computational efficiency. If Eq. (4) can be solved in a few multigrid cycles to the required accuracy, as is normally the case for many practical multigrid algorithms, computing the coefficient matrix $A^h$ in the iteration process may be more attractive in practical applications.

A popular practice to obtain stable second-order accurate solution for the convection–diffusion equation is to employ the defect-correction techniques of various kinds, which use the upwind scheme for relaxation (stability) and the central difference scheme for residual evaluation (accuracy) [22–25]. It was demonstrated by those and other investigators that, if the basic discretization scheme is of $O(h)$ and the target discretization is of $O(h^2)$, then the resulting solution is of $O(h^2)$.

Several defect-correction techniques have been developed and used with some success by many authors to obtain stable second-order accurate solution of Eq. (1) (and of the Navier–Stokes equations) [3, 25, 26]. However, since most authors published their methods with numerical results on the convergence rate of the residual norms only, it is not clear if these methods actually achieved the theoretical second-order convergence rate in practice. (An exception has been found in [25], where some pictures were used to show the accuracy improvement resulting from the defect-correction techniques.) Our numerical examples (see Section 6) show that the multigrid method with the CDS scheme, even when it converges for some convection–diffusion equations, may not achieve the second-order convergence rate for those problems with rapidly varying convection coefficients. Brandt and Yavneh [27] also demonstrated that the first-order upwind scheme may not be adequate for some convection–diffusion equations. These numerical experiments indirectly raise some skepticism about the computed accuracy of the multigrid method based on the upwind scheme or CDS.

Since Gupta et al. [10] have shown that NPF is stable for all Re and yields solution of $O(h^4)$, it is not necessary to use any defect-correction technique for the sake of combining stability and accuracy. Hence, in our implementation, we use NPF for both relaxation and residual evaluation; this will guarantee that our computed solution is highly accurate.

Hereinafter, we refer to the algorithms using the multigrid cycling techniques and the NPF smoother as NPF-MG; those using FPF (CDS) smoothers are referred to as FPF-MG.

IV. ACCELERATION BY MINIMAL RESIDUAL SMOOTHING

Suppose that NPF-MG generates a sequence $\{u^h_k\}$ with the associated residual sequence $\{r^h_k\}$. Since it is generally not possible to measure the convergence of the error directly, the quality of the iterates is usually judged by the behavior of the residual norm sequence $\{\|r^h_k\|\}$, where $\|\cdot\|$ is some norm. Usually, it is desirable that $\{\|r^h_k\|\}$ converge "smoothly" to zero.

The standard multigrid method is extremely efficient for solving elliptic problems such as the Poisson type equations and the convection–diffusion equation with small Re [18]. But convergence is slowed down when it is used to solve nonelliptic problems or problems containing nonelliptic components, e.g., Eq. (1) with high Re [2, 3]. In the case of slow multigrid convergence, acceleration schemes are usually needed to obtain a satisfactory convergence rate. Various acceleration schemes have been proposed by many investigators to accelerate different multigrid procedures in different situations [2, 3, 28, 29]. (These acceleration schemes have been discussed and categorized in detail by Zhang in [30].) To accelerate the convergence of our NPF-MG solver, we introduce an acceleration technique called the minimal residual smoothing (MRS).

The minimal residual smoothing technique was introduced by Schönauer [31] and analyzed extensively by Weiss [32]. In this approach, an auxiliary sequence $\{v_k\}$ (with the associated residual sequence $\{s_k\}$) is generated from $\{u^h_k\}$ by the recurrence relations

\[
\text{MRS}\quad
\begin{cases}
v_0 = u^h_0, \quad s_0 = r^h_0, \\
v_k = (1 - \beta_k)\,v_{k-1} + \beta_k u^h_k, \quad k = 1, 2, \ldots, \\
s_k = (1 - \beta_k)\,s_{k-1} + \beta_k r^h_k,
\end{cases}
\tag{5}
\]

in which each $\beta_k$ is chosen to minimize $\|f^h - A^h[(1 - \beta)v_{k-1} + \beta u^h_k]\|_2$ over the set of all real numbers, i.e.,

\[
\beta_k = -\frac{s_{k-1}^T (r^h_k - s_{k-1})}{\|r^h_k - s_{k-1}\|_2^2},
\]

where $\|\cdot\|_2$ is the Euclidean norm and $s_{k-1} = f^h - A^h v_{k-1}$. The resulting residuals $s_k$ obviously have monotonically decreasing Euclidean norms, i.e., $\|s_k\|_2 \le \|s_{k-1}\|_2$ and $\|s_k\|_2 \le \|r^h_k\|_2$ for each k.

Residual smoothing techniques such as MRS have been used extensively in the context of the Krylov subspace methods to stabilize the residuals of the underlying methods [33]. In the widely used generalized minimal residual (GMRES) method [34], the residual norm sequence $\{\|r^h_k\|_2\}$ converges to zero optimally among all Krylov subspace methods. Other methods, such as biconjugate gradient (BCG) [35] and conjugate gradient squared (CGS) [36], have certain advantages over GMRES but often exhibit very irregular residual-norm behavior. This irregular behavior has provided an incentive for the development of methods that have similar advantages but produce better behaved residual norms, such as the biconjugate gradient stabilized (Bi-CGSTAB) methods [37] and methods based on the quasi-minimal residual (QMR) approach [38]. Some of these stabilized methods may also be obtained from more basic methods by using some residual stabilization techniques, see [33, 39].


The MRS technique in the form of recurrence relations (5) can be used in the multigrid method to generate a sequence with "smoothed" residuals. However, application of MRS requires the values of residuals that are not generally computed at each step of the multigrid process. The evaluation of residuals on a given grid is roughly equivalent to one relaxation sweep on that grid and should be avoided whenever possible. We incorporate the MRS procedure in the multigrid solver just after the pre-smoothing sweeps are done and the residuals are computed. We use MRS to smooth the residuals before they are projected to the coarse grid. We then replace the original (vector) multigrid iterate $u^h_k$ and its residual $r^h_k$ by the MRS sequence $v_k$ and its associated residual $s_k$, and project $s_k$ to the coarse grid to form the coarse grid sub-problem. Note that we must replace both the multigrid iterate $u^h_k$ and its residual $r^h_k$ at the same time, otherwise the coarse grid sub-problem would provide a wrong correction to the fine grid. In this way, we give the coarse grid smoothed residuals, which are essential for the coarse grid to provide a good coarse-grid-correction to the fine grid [18, 40]. Therefore, we expect that the acceleration by MRS will be favorable, if the residuals are not efficiently smoothed by the NPF smoother in the case of high Re.

Since a continuous multigrid iteration is only formed on the finest grid, we only apply the MRS acceleration scheme on the finest grid. On the coarse grids, each iteration solves a different coarse grid sub-problem, so no continuous sequence is formed, and we do not apply the MRS acceleration scheme as described above.

The cost of the MRS acceleration scheme in the form of recurrence relations (5) is 9 operations on the finest grid and is independent of the original matrix operator $A^h$. When the evaluation of $A^h$ is complicated, i.e., when $A^h$ is evaluated in the iteration process, the cost of the MRS acceleration is negligible in the context of NPF-MG. Even if the coefficient matrix is precomputed and stored, the cost of the MRS acceleration is roughly equal to one NPF Gauss–Seidel relaxation on the finest grid.

V. DESIGN OF THE ACCELERATED NPF-MG SOLVER

We design our NPF-MG solver with the MRS acceleration procedure as follows:

1. Start on the finest grid from some initial guess and perform $\nu_1$ NPF Gauss–Seidel relaxation sweeps.

2. Compute the residuals and use the MRS acceleration procedure to generate a new iterate with the associated residual vector.

3. Replace the multigrid iterate and its residual vector by the MRS iterate and the smoothed residual vector.

4. Use the full-weighting residual restriction scheme to project the smoothed residual vector to the coarse grid.

5. Perform µ multigrid cycles on the coarse grid to solve the coarse grid sub-problem.

6. Interpolate the coarse grid correction to the fine grid by bi-linear interpolation and add the correction to the fine grid solution.

7. Perform $\nu_2$ NPF Gauss–Seidel relaxation sweeps on the fine grid.

If µ = 1, the multigrid algorithm is called the V-cycle. If µ = 2, it is the W-cycle. For convection-dominated problems, it has been shown by Brandt and Yavneh [3] that the W-cycle algorithm is usually more efficient than the V-cycle algorithm. $\nu_1$ and $\nu_2$ are the numbers of pre-smoothing and post-smoothing sweeps. A pseudo-code of the NPF-MG µ-cycle algorithm is as follows:

NPF-MG µ-cycle algorithm with the MRS acceleration procedure.

$u^h \leftarrow$ NPF-MG$(u^h, f^h)$

Given any initial guess $u^h_0$.
For k = 0, 1, 2, ..., do:
    If $\Omega^h$ = the coarsest grid, then
        Solve $u^h_k = (A^h)^{-1} f^h$.
    Else
        Relax $\nu_1$ times on $A^h u^h_k = f^h$ with the given initial guess $u^h_k$.
        Compute $r^h_k = f^h - A^h u^h_k$.
        If $\Omega^h$ = the finest grid, then
            If k = 0, then
                Set $v_0 = u^h_0$ and $s_0 = r^h_0$.
            Else
                Compute $\beta_k = -s_{k-1}^T (r^h_k - s_{k-1}) / \|r^h_k - s_{k-1}\|_2^2$.
                Set $s_k = s_{k-1} + \beta_k (r^h_k - s_{k-1})$;
                    $v_k = v_{k-1} + \beta_k (u^h_k - v_{k-1})$.
                Set $u^h_k = v_k$ and $r^h_k = s_k$.
            End if.
        End if.
        Restrict $r^{2h}_k = R\,r^h_k$.
        Set $f^{2h} = r^{2h}_k$.
        Set $u^{2h}_k = 0$.
        Do $u^{2h}_k \leftarrow$ NPF-MG$(u^{2h}_k, f^{2h})$ µ times.
        Correct $u^h_{k+1} = u^h_k + P\,u^{2h}_k$.
        Relax $\nu_2$ times on $A^h u^h_{k+1} = f^h$ with the initial guess $u^h_{k+1}$.
    End if.
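The recursive structure of the µ-cycle may be easier to follow on a much simpler model problem. The Python sketch below is not the NPF-MG solver: it swaps in the one-dimensional Poisson equation u'' = f with the three-point central difference stencil, uses the one-dimensional analogues of full weighting and linear interpolation, and omits the MRS branch on the finest grid. All function names are hypothetical. What it keeps is the control flow of the pseudo-code: pre-smoothing, residual restriction, µ recursive calls on a zero initial correction, prolongation, and post-smoothing.

```python
import math


def relax(u, f, h, sweeps):
    """Lexicographic Gauss-Seidel for (u[i-1] - 2u[i] + u[i+1]) / h^2 = f[i]."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] - h * h * f[i])


def residual(u, f, h):
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (h * h)
    return r


def restrict(r):
    """1D full weighting (1/4)[1 2 1] onto the grid with doubled mesh-size."""
    nc = (len(r) - 1) // 2
    rc = [0.0] * (nc + 1)
    for I in range(1, nc):
        rc[I] = 0.25 * (r[2 * I - 1] + 2.0 * r[2 * I] + r[2 * I + 1])
    return rc


def prolong(ec, n):
    """Linear interpolation of a coarse correction to n fine intervals."""
    e = [0.0] * (n + 1)
    for I in range(len(ec)):
        e[2 * I] = ec[I]
    for i in range(1, n, 2):
        e[i] = 0.5 * (e[i - 1] + e[i + 1])
    return e


def mu_cycle(u, f, h, nu1=1, nu2=1, mu=2):
    """One mu-cycle (mu = 1: V-cycle, mu = 2: W-cycle); u is updated in place."""
    n = len(u) - 1
    if n == 2:  # coarsest grid: one unknown, solved exactly
        u[1] = 0.5 * (u[0] + u[2] - h * h * f[1])
        return u
    relax(u, f, h, nu1)                      # pre-smoothing
    fc = restrict(residual(u, f, h))         # coarse right-hand side
    ec = [0.0] * (n // 2 + 1)                # zero initial correction
    for _ in range(mu):
        mu_cycle(ec, fc, 2.0 * h, nu1, nu2, mu)
    e = prolong(ec, n)
    for i in range(1, n):
        u[i] += e[i]                         # coarse grid correction
    relax(u, f, h, nu2)                      # post-smoothing
    return u


def norm2(v):
    return math.sqrt(sum(x * x for x in v))


# Model problem: u'' = -pi^2 sin(pi x) on (0, 1), exact solution sin(pi x).
n = 64
h = 1.0 / n
x = [i * h for i in range(n + 1)]
f = [-math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
u = [0.0] * (n + 1)
r0 = norm2(residual(u, f, h))
cycles = 0
while norm2(residual(u, f, h)) > 1e-10 * r0 and cycles < 50:
    mu_cycle(u, f, h)
    cycles += 1
err = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
print(cycles, err)
```

The stopping test mirrors the criterion used in the experiments of Section VI (reduction of the initial residual by a factor of $10^{10}$); the cap of 50 cycles is only a safeguard for this sketch.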

VI. NUMERICAL EXPERIMENTS

We test the accelerated NPF-MG solver by several numerical experiments. We test the convergence, the computed accuracy, the acceleration rate of the MRS acceleration procedure, and the cost of the acceleration.

The test problems given here are solved using a uniform mesh-size $h$ on a square domain Ω = (0, 1) × (0, 1). The boundary values of the solution are assumed to be known and the initial guess is u(x, y) = 0. The NPF-MG solver with and without the MRS acceleration procedure is applied with one pre-smoothing and one post-smoothing sweep ($\nu_1 = \nu_2 = 1$). For comparison, FPF-MG with CDS is also tested. The W(1, 1)-cycle algorithm (µ = 2) is applied, and the lexicographical Gauss–Seidel is used as the smoother for both FPF and NPF multigrid solvers. The number of the multigrid W(1, 1) cycles (W-cycle), the discrete error in $L_\infty$-norm (Error), and the CPU time in seconds, are reported. All computations are done on an SGI (Silicon Graphics Indy) workstation using the Fortran 77 programming language in double precision. The computations are terminated when the initial residual (in $L_2$-norm) on the finest grid is reduced by a factor of $10^{10}$.

For both FPF-MG and NPF-MG solvers, the standard coarsening technique [20, 21] (the mesh-size of the coarse grid doubles that of the fine grid) is used and the coarsest grid contains only one unknown (nine points in total including the boundary points).

For all test problems, we first compare the computed accuracy and the convergence rate of FPF-MG and NPF-MG with some fixed convection coefficients on different mesh-sizes (without the MRS acceleration). We will show that the computed accuracy of NPF-MG improves rapidly as the mesh-size is refined, but that of FPF-MG fails to improve in some test problems. After that, we test the effect of Re on the computed accuracy and on the convergence of NPF-MG with and without the MRS acceleration procedure. We demonstrate the acceleration rate and show that the cost of the acceleration is negligible.

Since the recorded CPU times may not be the same when the same program is run at different times due to the influence of other users on the same computer, we run the same program several times and report the averaged CPU timings. However, there are still some variations in the reported CPU timings, especially when the CPU timings are small.

A. Test Problem 1

Consider the boundary value problem (1) with

\[
\begin{aligned}
p(x, y) &= Px(1 - y), \\
q(x, y) &= Py(1 - x), \\
u(x, y) &= xy(1 - x)(1 - y).
\end{aligned}
\]
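The right-hand side f(x, y) is not written out in the text; it follows from substituting the exact solution into Eq. (1). The sketch below does this by differentiating u by hand, so the expressions in `rhs_f` are a derivation made here and not a formula quoted from the paper. The function names are hypothetical.

```python
def exact_u(x, y):
    """Exact solution of Test Problem 1; it vanishes on the boundary."""
    return x * y * (1 - x) * (1 - y)


def rhs_f(x, y, P):
    """f = u_xx + u_yy + p u_x + q u_y for Test Problem 1."""
    ax, ay = x * (1 - x), y * (1 - y)            # u = ax * ay
    u_x, u_y = (1 - 2 * x) * ay, ax * (1 - 2 * y)
    u_xx, u_yy = -2 * ay, -2 * ax
    p, q = P * x * (1 - y), P * y * (1 - x)
    return u_xx + u_yy + p * u_x + q * u_y


# At the center of the domain both first derivatives vanish, so f does not
# depend on P there.
print(rhs_f(0.5, 0.5, 10.0), rhs_f(0.5, 0.5, 100.0))
```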

In order to show the high accuracy and stability of the NPF-MG solver, we compare NPF-MG with FPF-MG (with central difference scheme) for different values of P and mesh-size $h$. The test results are reported in Table I. For P = 10, both NPF-MG and FPF-MG converge satisfactorily, while NPF-MG converges faster. By observing the fifth column of Table I, we note that NPF-MG achieves the fourth-order convergence rate as the error is reduced approximately by a factor of 16 when the mesh-size $h$ is halved. On the other hand, the data in the third column of Table I indicate that FPF-MG gives solutions of very poor accuracy. FPF-MG does not achieve the second-order convergence rate as the error does not decrease when the mesh-size is refined. It was shown by Berger et al. [41] that both the fourth-order compact scheme and the second-order central difference scheme may produce approximate solution of $O(1)$ when Re ≈ 1/2. However, in our numerical experiments, only FPF-MG shows $O(1)$ convergence, while NPF-MG demonstrates $O(h^4)$ convergence.
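The convergence orders quoted above can be recomputed from the errors in Table I. When the mesh-size is halved, the observed order is the base-2 logarithm of the ratio of successive errors. The following sketch applies this to the P = 10 columns of Table I; the function name `observed_orders` is hypothetical.

```python
import math


def observed_orders(errors):
    """Observed order log2(e_h / e_{h/2}) for errors on successively halved h."""
    return [math.log2(e_coarse / e_fine)
            for e_coarse, e_fine in zip(errors, errors[1:])]


# Errors from Table I, P = 10, for h = 1/8, 1/16, ..., 1/256.
npf_p10 = [5.48e-06, 3.32e-07, 2.09e-08, 1.30e-09, 8.15e-11, 5.09e-12]
fpf_p10 = [1.38e-02, 1.54e-02, 1.57e-02, 1.58e-02, 1.58e-02, 1.58e-02]

print([round(o, 2) for o in observed_orders(npf_p10)])
print([round(o, 2) for o in observed_orders(fpf_p10)])
```

The NPF-MG orders stay close to 4, while the FPF-MG values stay close to 0, which is the $O(1)$ behavior described in the text.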

When the magnitude of the convection coefficients increases, i.e., when P = 100, we find that FPF-MG diverges while NPF-MG converges acceptably. It is clear that the convergence rate and the computed accuracy of NPF-MG are affected inversely by the magnitude of the cell Reynolds number Re (implied by the magnitude of P). Since Re is a function of the mesh-size $h$ [see definition (3)] and NPF is stable for all Re, the convergence is improved when $h$ is refined. This does not happen to FPF-MG, because the effective cell Reynolds number Re on the coarse grids does not change, and the refinement of the mesh-size on the finest grid does not bring convergence on the coarse grids. We note that for $h \le 1/64$, the effective Re on the finest grid is less than 1 and FPF should have converged, if the Gauss–Seidel iterations were conducted on a single (finest) grid.

Table II contains test results with various Re (P) and fixed mesh-size $h = 1/64$. Computations are reported for NPF-MG with and without the MRS acceleration scheme. We note that the magnitude of the Reynolds number affects the convergence and the computed accuracy of NPF-MG inversely. When Re ≤ 1, the method converges rapidly and gives reasonably accurate solution. When Re > 1, the convergence and the computed accuracy are severely degraded. The nice thing is that NPF-MG still converges for large Reynolds number. It is well known that the CDS based methods diverge when Re becomes greater than a certain constant (usually 1).

NPF-MG usually gives the same computed accuracy with or without the MRS acceleration; the only exception is that when P = 0 (the Poisson equation), NPF-MG with MRS gives a slightly better accuracy. It is clear that the high-order multigrid method is stable for all Re with or without MRS. The acceleration by the MRS procedure is not obvious for small Re, where NPF-MG converges satisfactorily. When Re is large, NPF-MG converges very slowly. In this case, the MRS procedure accelerates the convergence significantly, usually by more than 40%. The cost of the acceleration can be seen as negligible.
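The size of the acceleration can be read off Table II by comparing W-cycle counts with and without MRS. The short sketch below does this arithmetic for the rows with P ≥ 100; the list layout and the function name `cycle_saving` are choices made here for illustration.

```python
# (P, W-cycles without MRS, W-cycles with MRS), copied from Table II.
table_ii = [
    (1e2, 17, 16), (1e3, 55, 52), (1e4, 841, 524), (1e5, 2016, 1140),
    (1e6, 2084, 1180), (1e7, 2088, 1183), (1e10, 2088, 1178),
    (1e20, 2088, 1178),
]


def cycle_saving(without_mrs, with_mrs):
    """Fraction of W-cycles saved by the MRS acceleration."""
    return (without_mrs - with_mrs) / without_mrs


savings = {P: cycle_saving(a, b) for P, a, b in table_ii}
for P, s in savings.items():
    print(f"P = {P:.0e}: {100 * s:.1f}% fewer W-cycles")
```

By this measure the saving is small for P up to $10^3$, a little under 40% at P = $10^4$, and above 40% for all larger P in the table.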

We note that there is little change in the performance of our NPF-MG with and without MRS for Re > $10^5$. This suggests that NPF-MG is stable and accurate for all Re and that the inverse effect of the magnitude of Re ceases to increase beyond some threshold.

TABLE I. Test problem 1: Comparison of accuracy and convergence of FPF and NPF.

                      P = 10                                    P = 100
         FPF-MG              NPF-MG              FPF-MG              NPF-MG
h        W-cycle  Error      W-cycle  Error      W-cycle  Error      W-cycle  Error
1/8      13       1.38(−02)  11       5.48(−06)  div^a    -          21       2.35(−04)
1/16     13       1.54(−02)  10       3.32(−07)  div      -          23       1.59(−05)
1/32     11       1.57(−02)  10       2.09(−08)  div      -          22       9.97(−07)
1/64     11       1.58(−02)  9        1.30(−09)  div      -          17       6.23(−08)
1/128    11       1.58(−02)  9        8.15(−11)  div      -          12       3.89(−09)
1/256    11       1.58(−02)  9        5.09(−12)  div      -          10       2.43(−10)

a 'div' stands for 'divergence'.


TABLE II. Test problem 1: Comparison of acceleration rate and cost of MRS with NPF.

h = 1/64                NPF-MG without MRS    NPF-MG with MRS
P        Re             W-cycle  CPU          W-cycle  CPU       Error
0        0.0000(+00)    9        0.967        9        1.040     8.04(−15)^a
1        7.8125(−03)    9        0.957        9        1.023     1.32(−11)
10       7.8125(−02)    9        0.938        9        1.052     1.30(−09)
10^2     7.8125(−01)    17       1.829        16       1.764     6.23(−08)
10^3     7.8125(+00)    55       5.855        52       5.888     1.13(−06)
10^4     7.8125(+01)    841      92.12        524      52.32     1.20(−05)
10^5     7.8125(+02)    2016     215.3        1140     129.7     3.27(−05)
10^6     7.8125(+03)    2084     221.1        1180     133.3     3.29(−05)
10^7     7.8125(+04)    2088     222.8        1183     135.3     3.28(−05)
10^10    7.8125(+07)    2088     222.8        1178     134.8     3.28(−05)
10^20    7.8125(+17)    2088     222.7        1178     134.0     3.28(−05)

^a This is for no MRS. The computed accuracy for P = 0 with MRS is 7.71(−15).

B. Test Problem 2

In test problem 1, the convection coefficients p(x, y) and q(x, y) are polynomials of both the x and y variables, and, therefore, vary somewhat rapidly in the domain Ω. Next, we consider a simpler test problem where the convection coefficients p(x, y) and q(x, y) are linear functions of a single variable, either x or y:

\[
p(x, y) = Px, \qquad q(x, y) = -Py, \qquad u(x, y) = xy(1 - x)(1 - y)\exp(x + y).
\]
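The right-hand side f is not written out for this problem. A minimal sketch, assuming that f is obtained by substituting the exact solution u into Eq. (1), i.e., f = u_xx + u_yy + p u_x + q u_y, is given below. The function names are illustrative, and the finite difference version is included only as a cross-check of the hand-derived derivatives.

```python
import math


def u_exact(x, y):
    """Exact solution of test problem 2."""
    return x * y * (1 - x) * (1 - y) * math.exp(x + y)


def source_term(x, y, P):
    """f = u_xx + u_yy + p u_x + q u_y with p = P x and q = -P y (ASSUMED from Eq. (1))."""
    ex, ey = math.exp(x), math.exp(y)
    # u separates as a(x) * b(y) with a(t) = t (1 - t) exp(t)
    a, b = x * (1 - x) * ex, y * (1 - y) * ey
    da, db = (1 - x - x * x) * ex, (1 - y - y * y) * ey   # a'(x), b'(y)
    d2a, d2b = -x * (x + 3) * ex, -y * (y + 3) * ey       # a''(x), b''(y)
    return d2a * b + a * d2b + P * x * da * b - P * y * a * db


def source_term_fd(x, y, P, d=1e-4):
    """Same quantity from central differences of u_exact (cross-check only)."""
    u = u_exact
    uxx = (u(x + d, y) - 2 * u(x, y) + u(x - d, y)) / d**2
    uyy = (u(x, y + d) - 2 * u(x, y) + u(x, y - d)) / d**2
    ux = (u(x + d, y) - u(x - d, y)) / (2 * d)
    uy = (u(x, y + d) - u(x, y - d)) / (2 * d)
    return uxx + uyy + P * x * ux - P * y * uy


f_analytic = source_term(0.3, 0.7, 10.0)
f_numeric = source_term_fd(0.3, 0.7, 10.0)
print(f_analytic, f_numeric)
```

Since u vanishes on the boundary of the unit square, the Dirichlet data g would be zero there if the domain is the unit square, which this section does not state explicitly.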

We test the accuracy and the stability of the NPF-MG solver and compare NPF-MG with FPF-MG for different values of P and mesh-size h. The test results are contained in Table III. For P = 10, both NPF-MG and FPF-MG converge satisfactorily and NPF-MG converges faster. By examining the fifth column of Table III, we note that NPF-MG again achieves the fourth-order convergence rate. The data in the third column of Table III indicate that FPF-MG achieves the second-order convergence rate as the error is decreased by a factor of 4 when the mesh-size h is halved. This should be compared with the results of test problem 1 (Table I),

TABLE III. Test problem 2: Comparison of accuracy and convergence of FPF and NPF.

                  P = 10                                      P = 100
          FPF-MG               NPF-MG               FPF-MG               NPF-MG
h         W-cycle  Error       W-cycle  Error       W-cycle  Error       W-cycle  Error
1/8       12       5.20(−03)   11       2.48(−04)   div^a    -           18       3.84(−03)
1/16      13       1.37(−03)   11       1.58(−05)   div      -           24       3.25(−04)
1/32      13       3.45(−04)   11       9.93(−07)   div      -           24       2.21(−05)
1/64      13       8.67(−05)   11       6.21(−08)   div      -           19       1.40(−06)
1/128     12       2.17(−05)   10       3.88(−09)   div      -           14       8.74(−08)
1/256     12       5.42(−06)   10       2.43(−10)   div      -           12       5.46(−09)

^a 'div' stands for 'divergence'.


where FPF-MG failed to yield the second-order convergence rate. The reason may be that, in test problem 2, the coefficients p(x, y) and q(x, y) are relatively simple and do not change rapidly in Ω. These results show that NPF-MG is more robust than FPF-MG based on CDS with regard to the variation of the convection coefficients p(x, y) and q(x, y). For P = 100, FPF-MG diverges again and NPF-MG converges satisfactorily. But the inverse effect of Re on the convergence and accuracy is clear.
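The convergence orders quoted here can be checked directly from the tabulated errors: when h is halved, the observed order is log2 of the ratio of successive errors. The sketch below applies this to the P = 10 error columns of Table III and, for contrast, to the FPF-MG column of Table I. The data are copied from those tables; the helper name is illustrative.

```python
import math

# Errors for h = 1/8, 1/16, ..., 1/256 with P = 10
fpf_table_iii = [5.20e-03, 1.37e-03, 3.45e-04, 8.67e-05, 2.17e-05, 5.42e-06]
npf_table_iii = [2.48e-04, 1.58e-05, 9.93e-07, 6.21e-08, 3.88e-09, 2.43e-10]
fpf_table_i = [1.38e-02, 1.54e-02, 1.57e-02, 1.58e-02, 1.58e-02, 1.58e-02]


def observed_orders(errors):
    """Observed order between successive grids, assuming h is halved each time."""
    return [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]


for name, errs in [("FPF-MG, Table III", fpf_table_iii),
                   ("NPF-MG, Table III", npf_table_iii),
                   ("FPF-MG, Table I", fpf_table_i)]:
    print(name, [round(p, 2) for p in observed_orders(errs)])
```

The Table III columns give orders close to 2 and 4, respectively, while the FPF-MG errors of Table I stagnate, so the observed order there is close to zero.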

We point out that test problem 2 was used by Gupta et al. [10] with SOR to test both NPF and CDS. In their tests, SOR with CDS converged for P = 100 with h = 1/8, 1/16, 1/32. It seems the divergence of FPF-MG was caused by the divergence on the coarsest grids where h = 1/2, 1/4.

Table IV shows basically the same phenomenon as Table II. The convergence and the computed accuracy of NPF-MG are inversely affected by the magnitude of the cell Reynolds number Re. The convergence and the computed accuracy of test problem 2 are comparable with those of test problem 1. Once again it is clear that when Re → ∞, the convergence and the computed accuracy approach some limits and do not degrade any further beyond some threshold.

C. Test Problem 3

In the previous two test problems, the convection coefficients p(x, y) and q(x, y) are polynomials. In the third test problem, we choose the coefficients as multiples of the trigonometric functions

\[
p(x, y) = P\sin(\pi x), \qquad q(x, y) = P\cos(\pi y), \qquad u(x, y) = x^2 + y^2.
\]

In this case, the coefficients are varying more rapidly than those chosen for test problems 1 and 2. The test conditions are set to be the same as for the first two test problems.
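For reference, the forcing term implied by this choice can be worked out by hand. The derivation below assumes that f is obtained by substituting the exact solution into Eq. (1), which is not stated explicitly here. With $u_x = 2x$, $u_y = 2y$, and $u_{xx} = u_{yy} = 2$,

\[
f(x, y) = u_{xx} + u_{yy} + p\,u_x + q\,u_y
        = 4 + 2Px\sin(\pi x) + 2Py\cos(\pi y).
\]

Under the same assumption, the Dirichlet data are $g(x, y) = x^2 + y^2$ on the boundary. Note that f grows linearly with P, while the exact solution itself does not depend on P.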

For small P (= 10), FPF-MG does not achieve the second-order convergence rate, as in the case of the first test problem. The absolute errors in all test conditions are worse than those of test problems 1 and 2. This supports our remarks made earlier that the variation of the convection coefficients p(x, y) and q(x, y) affects the computed accuracy of FPF-MG inversely. On the other hand, the NPF-MG solution demonstrates the fourth-order convergence rate, although the absolute errors are somewhat larger than those of the first two test problems.

TABLE IV. Test problem 2: Comparison of acceleration rate and cost of MRS with NPF.

h = 1/64                NPF-MG without MRS    NPF-MG with MRS
P        Re             W-cycle  CPU          W-cycle  CPU       Error
0        0.0000(+00)    10       1.097        10       1.165     5.59(−09)
1        7.8125(−03)    10       1.101        10       1.183     6.30(−09)
10       7.8125(−02)    11       1.190        10       1.152     6.21(−08)
10^2     7.8125(−01)    19       2.082        17       1.949     1.40(−06)
10^3     7.8125(+00)    65       7.089        64       7.218     1.44(−05)
10^4     7.8125(+01)    733      79.14        423      47.46     8.49(−05)
10^5     7.8125(+02)    1910     205.0        1180     120.6     1.32(−04)
10^6     7.8125(+03)    2121     230.9        1197     132.8     1.33(−04)
10^7     7.8125(+04)    2121     232.5        1198     133.5     1.33(−04)
10^10    7.8125(+07)    2121     231.5        1198     133.2     1.33(−04)
10^20    7.8125(+17)    2121     231.1        1198     133.4     1.33(−04)


TABLE V. Test problem 3: Comparison of accuracy and convergence of FPF and NPF.

                  P = 10                                      P = 100
          FPF-MG               NPF-MG               FPF-MG               NPF-MG
h         W-cycle  Error       W-cycle  Error       W-cycle  Error       W-cycle  Error
1/8       16       7.60(−01)   16       1.21(−03)   div^a    -           31       1.36(−02)
1/16      17       7.49(−01)   15       7.61(−05)   div      -           41       1.06(−03)
1/32      17       7.44(−01)   14       4.76(−06)   div      -           35       7.08(−05)
1/64      17       7.43(−01)   14       2.98(−07)   div      -           25       4.45(−06)
1/128     17       7.43(−01)   14       1.86(−08)   div      -           16       2.78(−07)
1/256     18       7.43(−01)   15       1.16(−09)   div      -           15       1.74(−08)

^a 'div' stands for 'divergence'.

For moderate P (= 100), FPF-MG diverges, as in the previous two test problems. NPF-MG converges for all mesh sizes and consistently yields highly accurate solutions.

Table VI shows that the convergence and the computed accuracy of NPF-MG are worse than those of the first two test problems with or without the MRS acceleration procedure. This is again caused by the rapid change of the convection coefficients. We note that the convergence for large Reynolds number is degraded by almost half, compared with the first two test problems. However, we find that the MRS procedure accelerates the convergence for large Reynolds number problems by more than 40% with negligible cost.

For small Reynolds number, all three test problems showed that the MRS acceleration procedure did not provide detectable acceleration. This is because the multigrid W-cycle algorithm is very powerful for elliptic problems with small Re and may be too powerful to be cost-effective. Actually, a multigrid V-cycle algorithm is more cost-effective than a W-cycle algorithm for the diffusion-dominated problems (small Re), because the cost of a W-cycle is about one and a half times that of a V-cycle. Furthermore, we used the full-weighting scheme (3) as the residual restriction operator, which averages the residuals at the neighboring grid points and, thus, has some smoothing effect. More cost-effective algorithms may be designed by using MRS to accelerate the V-cycle

TABLE VI. Test problem 3: Comparison of acceleration rate and cost of MRS with NPF.

h = 1/64                NPF-MG without MRS    NPF-MG with MRS
P        Re             W-cycle  CPU          W-cycle  CPU       Error
0        0.0000(+00)    14       1.491        14       1.541     3.06(−14)
1        7.8125(−03)    14       1.492        14       1.541     3.73(−09)
10       7.8125(−02)    14       1.526        14       1.549     2.98(−08)
10^2     7.8125(−01)    25       2.744        22       2.476     4.45(−06)
10^3     7.8125(+00)    181      19.54        133      14.71     5.31(−05)
10^4     7.8125(+01)    1251     133.0        731      81.01     3.22(−04)
10^5     7.8125(+02)    4177     444.5        2388     264.2     1.22(−03)
10^6     7.8125(+03)    4002     425.4        2288     253.9     1.22(−03)
10^7     7.8125(+04)    3975     424.8        2266     250.4     1.21(−03)
10^10    7.8125(+07)    3972     422.4        2276     252.7     1.21(−03)
10^20    7.8125(+17)    3972     422.1        2275     252.0     1.21(−03)


algorithm and by using MRS with the residual injection operator (direct injection of residuals from the fine grid to the coarse grid without any averaging scheme).
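The difference between the two restriction operators can be illustrated on a small grid. The following is a minimal sketch, assuming the standard nine-point full-weighting stencil (1/16) [1 2 1; 2 4 2; 1 2 1] at interior coarse points; whether this matches the exact scheme used in the solver is an assumption, and the function names are illustrative.

```python
def restrict_injection(fine):
    """Coarse value = fine-grid value at the coincident point (no averaging)."""
    n = (len(fine) - 1) // 2
    return [[fine[2 * i][2 * j] for j in range(n + 1)] for i in range(n + 1)]


def restrict_full_weighting(fine):
    """Weighted average of the 9 surrounding fine points at interior coarse points.

    Boundary coarse points are simply injected in this sketch.
    """
    n = (len(fine) - 1) // 2
    coarse = restrict_injection(fine)
    for i in range(1, n):
        for j in range(1, n):
            I, J = 2 * i, 2 * j
            edges = fine[I - 1][J] + fine[I + 1][J] + fine[I][J - 1] + fine[I][J + 1]
            corners = (fine[I - 1][J - 1] + fine[I - 1][J + 1]
                       + fine[I + 1][J - 1] + fine[I + 1][J + 1])
            coarse[i][j] = (4.0 * fine[I][J] + 2.0 * edges + corners) / 16.0
    return coarse


# Highest-frequency (checkerboard) residual on a 9 x 9 fine grid
fine = [[(-1.0) ** (i + j) for j in range(9)] for i in range(9)]
injected = restrict_injection(fine)
weighted = restrict_full_weighting(fine)
print(injected[2][2], weighted[2][2])
```

On this oscillatory residual, injection passes the full amplitude to the coarse grid, while full weighting averages it out at interior points. This is the smoothing effect of the averaging referred to above.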

VII. CONCLUDING REMARKS

A nine-point compact discretization formula has been used in conjunction with the multigrid technique to develop a high-order multigrid solver (NPF-MG) to solve the general convection–diffusion equation with variable coefficients. A minimal residual smoothing (MRS) procedure is inserted in the NPF-MG iteration to accelerate the convergence of NPF-MG, especially for high Reynolds number problems. The proposed MRS acceleration procedure is independent of the original matrix operator, and the cost of the acceleration is negligible for high Reynolds number problems. Several test problems have been solved to demonstrate the efficiency and the computed accuracy of our NPF-MG solver. The numerical experiments with small and high Re showed that the MRS procedure did give significant acceleration for high Reynolds number problems. Although the convergence and the computed accuracy of NPF-MG are affected inversely by the magnitude of the Reynolds number, our numerical examples showed that the performance of NPF-MG does not degrade further beyond some threshold. This clearly demonstrates the stability and the robustness of NPF-MG with respect to the convergence and the accuracy.
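The MRS update itself can be sketched independently of the multigrid solver. The following is a minimal sketch, assuming the standard form of minimal residual smoothing analyzed by Zhou and Walker [33]: given a new iterate u with residual r, the smoothed pair (v, s) is moved toward (u, r) by the step β that minimizes the 2-norm of s + β(r − s). The helper names, the small nonsymmetric tridiagonal model matrix, and the Gauss–Seidel driver are illustrative assumptions; how the smoothed iterate is coupled back into the NPF-MG cycle is not reproduced here.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return dot(a, a) ** 0.5


def residual(A, f, u):
    return [fi - dot(row, u) for row, fi in zip(A, f)]


def gauss_seidel_sweep(A, f, u):
    """One lexicographic Gauss-Seidel sweep (stand-in for the underlying iteration)."""
    u = list(u)
    for i, row in enumerate(A):
        off_diag = sum(row[j] * u[j] for j in range(len(u)) if j != i)
        u[i] = (f[i] - off_diag) / row[i]
    return u


def mrs_step(v, s, u, r):
    """One MRS update: v, s are the smoothed iterate/residual; u, r the new ones."""
    d = [ri - si for ri, si in zip(r, s)]
    dd = dot(d, d)
    if dd == 0.0:                      # r == s: nothing to smooth
        return list(v), list(s)
    beta = -dot(s, d) / dd             # minimizes ||s + beta * (r - s)||_2
    v_new = [vi + beta * (ui - vi) for vi, ui in zip(v, u)]
    s_new = [si + beta * di for si, di in zip(s, d)]
    return v_new, s_new


# Hypothetical model problem: nonsymmetric tridiagonal matrix, c plays the role of Re
n, c = 8, 0.5
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0
    if i > 0:
        A[i][i - 1] = -(1.0 - c)
    if i < n - 1:
        A[i][i + 1] = -(1.0 + c)
f = [1.0] * n

u = [0.0] * n
v, s = list(u), residual(A, f, u)
raw_norms, smoothed_norms = [], []
for _ in range(20):
    u = gauss_seidel_sweep(A, f, u)
    r = residual(A, f, u)
    v, s = mrs_step(v, s, u, r)
    raw_norms.append(norm(r))
    smoothed_norms.append(norm(s))
```

Because β = 0 and β = 1 are both admissible, the smoothed residual norm can never exceed either the previous smoothed residual norm or the current unsmoothed one. The update needs only a few inner products and vector updates per step and never applies the matrix, which is consistent with the remark that the procedure is independent of the original matrix operator.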

The implementation of NPF-MG is simple, because it employs the same NPF discretization scheme on all grids. The proposed NPF-MG solver requires neither a preconditioner nor added dissipation terms for high Reynolds number problems. It also does not require more memory than FPF-MG when the coefficient matrix is computed in the iteration process. There is no need to use the upwind scheme for convergence, as NPF is stable for all Re. The computed accuracy of NPF-MG is much better than that of FPF-MG.

From the numerical experiments, we also found that NPF-MG yields the fourth-order convergence rate in all test problems with moderate Reynolds number. We showed that the multigrid method based on the central difference scheme failed to achieve the second-order convergence rate even for some small Reynolds number problems when the convection coefficients vary rapidly in the domain. These numerical results, together with the inadequacy of the first-order upwind scheme [27], essentially demonstrated the advantages of the fourth-order compact schemes.

For more detailed discussions on the implementation of the minimal residual smoothing technique to accelerate the multigrid method, readers are referred to Zhang [42].

The author thanks three anonymous referees, whose valuable comments helped improve the presentation of this article.

References

1. M. M. Gupta, "High accuracy solutions of incompressible Navier–Stokes equations," J. Comput. Phys. 93, 343 (1991).

2. A. Brandt and I. Yavneh, "On multigrid solution of high-Reynolds incompressible entering flows," J. Comput. Phys. 101, 151 (1992).

3. A. Brandt and I. Yavneh, "Accelerated multigrid convergence and high-Reynolds recirculating flows," SIAM J. Sci. Comput. 14, 607 (1993).

4. P. M. de Zeeuw and E. J. van Asselt, "The convergence rate of multigrid algorithms applied to the convection–diffusion equation," SIAM J. Sci. Stat. Comput. 6, 492 (1985).

5. P. M. de Zeeuw, "Matrix-dependent prolongations and restrictions in a blackbox multigrid solver," J. Comput. Appl. Math. 33, 1 (1990).


6. A. Reusken, "Multigrid with matrix-dependent transfer operators for convection–diffusion problems," in Multigrid Methods IV, Proc. of 4th European Multigrid Conference, Amsterdam, P. W. Hemker and P. Wesseling, Eds., Birkhäuser Verlag, Basel, 1994, p. 267.

7. A. Reusken, "Fourier analysis of a robust multigrid method for convection-diffusion equations," Numer. Math. 71, 365 (1995).

8. H. C. Elman and G. H. Golub, "Iterative method for cyclically reduced non-self-adjoint linear systems," Math. Comp. 54, 671 (1990).

9. M. M. Gupta, R. P. Manohar, and J. W. Stephenson, "A fourth order, cost effective and stable finite difference scheme for the convection–diffusion equation," in Numer. Properties & Methodologies in Heat Transfer, Proc. 2nd National Symp., Hemisphere Pub. Co., Washington, DC, 1983, p. 201.

10. M. M. Gupta, R. P. Manohar, and J. W. Stephenson, "A single cell high order scheme for the convection–diffusion equation with variable coefficients," Int. J. Numer. Methods Fluids 4, 641 (1984).

11. S. C. R. Dennis and J. D. Hudson, "Compact h^4 finite difference approximation to operators of Navier–Stokes type," J. Comput. Phys. 85, 390 (1989).

12. M. Li, T. Tang, and B. Fornberg, "A compact fourth-order finite difference scheme for the steady incompressible Navier–Stokes equations," Int. J. Numer. Methods Fluids 30, 1137 (1995).

13. W. F. Spotz and G. F. Carey, "High-order compact scheme for the steady stream-function vorticity equations," Int. J. Numer. Methods Eng. 38, 3497 (1995).

14. I. Altas and K. Burrage, "A high accuracy defect-correction multigrid method for the steady incompressible Navier–Stokes equations," J. Comput. Phys. 114, 227 (1994).

15. M. M. Gupta, "A fourth-order Poisson solver," J. Comput. Phys. 55, 166 (1984).

16. S. Schaffer, "High order multi-grid methods," Math. Comp. 43, 89 (1984).

17. M. M. Gupta, J. Kouatchou, and J. Zhang, "Comparison of 2nd and 4th order discretizations for the multigrid Poisson solver," J. Comput. Phys., to appear.

18. A. Brandt, "Multi-level adaptive solution to boundary-value problems," Math. Comp. 31, 333 (1977).

19. A. Brandt, "Guide to multigrid development," in Multigrid Methods, W. Hackbusch and U. Trottenberg, Eds., Lecture Notes in Math., Vol. 960, Springer–Verlag, New York, 1981, p. 220.

20. W. Briggs, A Multigrid Tutorial, SIAM, Philadelphia, 1987.

21. P. Wesseling, An Introduction to Multigrid Methods, John Wiley & Sons, Chichester, 1992.

22. W. Auzinger and H. J. Stetter, "Defect correction and multigrid iteration," in Multigrid Methods, W. Hackbusch and U. Trottenberg, Eds., Lecture Notes in Math., Vol. 960, Springer–Verlag, New York, 1981, p. 327.

23. W. Auzinger, "Defect corrections for multigrid solutions of the Dirichlet problem in general domains," Math. Comp. 48, 471 (1987).

24. W. Hackbusch, "On multigrid iterations with defect correction," in Multigrid Methods, W. Hackbusch and U. Trottenberg, Eds., Lecture Notes in Math., Vol. 960, Springer–Verlag, New York, 1981, p. 461.

25. P. W. Hemker, "Mixed defect correction iteration for the accurate solution of the convection–diffusion equation," in Multigrid Methods, W. Hackbusch and U. Trottenberg, Eds., Lecture Notes in Math., Vol. 960, Springer–Verlag, New York, 1981, p. 485.

26. N. G. Wright and P. H. Gaskell, "An efficient multigrid approach to solving highly recirculating flows," Computers & Fluids 24, 63 (1995).

27. A. Brandt and I. Yavneh, "Inadequacy of first-order upwind difference scheme for some recirculating flows," J. Comput. Phys. 93, 128 (1991).

28. A. Brandt and V. Mikulinsky, "On recombining iterants in multigrid algorithms and problems with small islands," SIAM J. Sci. Comput. 16, 20 (1995).


29. J. Zhang, "Acceleration of five-point Red–Black Gauss–Seidel in multigrid for two dimensional Poisson equation," Appl. Math. Comput., to appear.

30. J. Zhang, "Residual scaling techniques in multigrid, I: equivalence proof," Appl. Math. Comput., to appear.

31. W. Schönauer, Scientific Computing on Vector Computers, North–Holland, Amsterdam, 1987.

32. R. Weiss, Convergence Behavior of Generalized Conjugate Gradient Methods, Ph.D. Thesis, University of Karlsruhe, Germany, 1990.

33. L. Zhou and H. F. Walker, "Residual smoothing techniques for iterative methods," SIAM J. Sci. Comput. 15, 297 (1994).

34. Y. Saad and M. H. Schultz, "GMRES: a generalized minimal residual method for solving nonsymmetric linear systems," SIAM J. Sci. Stat. Comput. 7, 856 (1986).

35. C. Lanczos, "Solution of systems of linear equations by minimized iterations," J. Res. Nat. Bur. Stand. 49, 33 (1952).

36. P. Sonneveld, "CGS, a fast Lanczos-type solver for nonsymmetric linear systems," SIAM J. Sci. Stat. Comput. 10, 36 (1989).

37. H. A. van der Vorst, "BI-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems," SIAM J. Sci. Stat. Comput. 13, 631 (1992).

38. R. W. Freund, "A transpose-free quasi-minimal residual algorithm for non-Hermitian linear systems," SIAM J. Sci. Comput. 14, 470 (1993).

39. H. F. Walker, "Residual smoothing and peak/plateau behavior in Krylov subspace methods," Appl. Numer. Math. 19, 279 (1995).

40. A. Brandt, "Rigorous quantitative analysis of multigrid, I: constant coefficients two-level cycle with L2-norm," SIAM J. Numer. Anal. 31, 1695 (1994).

41. A. E. Berger, J. M. Solomon, M. Ciment, S. H. Leventhal, and B. C. Weinberg, "Generalized OCI schemes for boundary layer problems," Math. Comp. 35, 659 (1980).

42. J. Zhang, "Minimal residual smoothing in multi-level iterative method," Appl. Math. Comput., to appear.