Multigrid Algorithms for Three-Dimensional RANS Calculations - The SUmb Solver
Juan J. Alonso
Department of Aeronautics & Astronautics Stanford University
CME342 Lecture 14 May 21, 2012
Outline
• Non-linear multigrid algorithm - Review
• FAS multigrid + modified Runge-Kutta scheme
• Software and parallel implementation
• SUmb solver and results
• Unsteady algorithms
• Future
Non-linear Multigrid Algorithm
Solve the non-linear equation
on a mesh with spacing h
by finding a correction to the current iterate
with some algebra
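The equations on this slide did not survive transcription; written in standard FAS notation (the symbols below are assumed here, not necessarily those used on the original slide), the steps are: the non-linear fine-grid problem, the correction to the current iterate, and the equation the correction satisfies,
\[
A_h(u_h) = f_h, \qquad u_h = v_h + \delta u_h ,
\]
\[
A_h(v_h + \delta u_h) - A_h(v_h) = f_h - A_h(v_h) \equiv r_h .
\]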
Non-linear Multigrid Algorithm
and after smoothing the right hand side we can transfer the equation, solution, and residual to a coarser mesh
After relaxation in the coarse mesh, the coarse grid correction is
which can be interpolated to the finer mesh and the procedure repeated until convergence (possibly recursively)
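Again in standard FAS notation (assumed symbols): the coarse-grid problem driven by the restricted fine-grid residual, the coarse-grid correction after relaxation, and its prolongation back to the fine mesh are
\[
A_H(u_H) = A_H\!\left(I_h^H v_h\right) + I_h^H r_h ,
\]
\[
\delta u_H = u_H - I_h^H v_h , \qquad v_h \leftarrow v_h + I_H^h\, \delta u_H .
\]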
Modified Runge-Kutta Time Stepping
Semi-discrete NS equations
Convective and dissipative residuals
Modified Runge-Kutta scheme
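The formulas themselves are missing from the transcript; a common way to write them (Jameson-style notation, assumed here rather than copied from the slide) is the semi-discrete equation, the split of the residual into convective and dissipative parts, and an m-stage modified Runge-Kutta update in which the dissipation is re-evaluated and blended only at selected stages (with \(D^{(0)} = D(w^{(0)})\) and \(\beta_k = 0\) at the stages where the dissipation is frozen):
\[
V \frac{dw}{dt} + R(w) = 0, \qquad R(w) = Q(w) - D(w),
\]
\[
w^{(0)} = w^n, \qquad
w^{(k)} = w^{(0)} - \alpha_k \frac{\Delta t}{V}\left( Q\!\left(w^{(k-1)}\right) - D^{(k-1)} \right), \qquad
w^{n+1} = w^{(m)},
\]
\[
D^{(k)} = \beta_k\, D\!\left(w^{(k)}\right) + (1 - \beta_k)\, D^{(k-1)} .
\]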
Modified Runge-Kutta Time Stepping
Modified Runge-Kutta scheme
• Coefficients for Q and D are created separately to
– Minimize CPU time
– Improve convergence properties
• Time accuracy is lost...do we care?
FAS Multigrid Algorithm
• Coarse grid driven by residuals transferred from fine mesh
• We impose boundary conditions on every mesh
• 1st order artificial dissipation on coarser meshes
Volume/Area-weighted solution coarsening
Coarse grid residual forcing term
Modified R-K in coarse grid
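In standard FAS form (assumed notation, the original equations were not transcribed), the three items above are a volume-weighted restriction of the solution, a coarse-grid forcing term built from the restricted fine-grid residual, and the same modified R-K smoother applied to the forced coarse-grid equation:
\[
w_H^{(0)} = \frac{\sum_{c} V_c\, w_c}{\sum_{c} V_c} \quad \text{(sum over the fine cells forming each coarse cell)},
\]
\[
P_H = I_h^H R_h(w_h) - R_H\!\left(w_H^{(0)}\right), \qquad
V_H \frac{dw_H}{dt} + R_H(w_H) + P_H = 0 .
\]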
FAS Multigrid Algorithm
• Full-coarsened multigrid
• V or W cycles of arbitrary depth (a cycle-schedule sketch follows this list)
• Bi-linear or tri-linear interpolation of solution to finer meshes
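As a rough illustration of how a V- or W-cycle of arbitrary depth can be driven, the following self-contained free-form Fortran sketch (hypothetical routine and variable names, not SUmb source) prints the level-visiting schedule of a recursive FAS cycle; gamma = 1 reproduces a V-cycle and gamma = 2 a W-cycle.

      program cycle_schedule
      ! Hypothetical sketch, not SUmb code: prints the grid levels visited
      ! by a recursive FAS cycle.  Level 1 is the fine mesh.
        implicit none
        print *, 'V-cycle, 4 levels:'
        call fas_cycle(1, 4, 1)
        print *, 'W-cycle, 4 levels:'
        call fas_cycle(1, 4, 2)
      contains
        recursive subroutine fas_cycle(level, nlevels, gamma)
          integer, intent(in) :: level, nlevels, gamma
          integer :: i
          print *, '  smooth on level', level            ! modified R-K stages would run here
          if (level < nlevels) then
             print *, '  restrict to level', level + 1   ! volume-weighted transfer + forcing term
             do i = 1, gamma                             ! gamma visits of the coarser level
                call fas_cycle(level + 1, nlevels, gamma)
             end do
             print *, '  prolong correction to level', level  ! bi/tri-linear interpolation
          end if
        end subroutine fas_cycle
      end program cycle_schedule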
Additional Convergence Acceleration Schemes
• Local time stepping
• Implicit Residual Smoothing (standard forms of this and of local time stepping are sketched after this list)
• Enthalpy Damping
• Others...
– Block-Jacobi preconditioner
– Low-speed preconditioning
– J-coarsening or semi-coarsening
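For the first two items, the standard textbook forms (not taken from the SUmb source) are a local time step scaled by the cell volume and the sum \(\lambda_i\) of the spectral radii of the flux Jacobians over the cell faces, and a residual smoothing step (shown here in one dimension) that is solved with a tridiagonal sweep in each coordinate direction and allows a larger CFL number:
\[
\Delta t_i = \mathrm{CFL}\, \frac{V_i}{\lambda_i}, \qquad
-\varepsilon\,\bar R_{i-1} + (1 + 2\varepsilon)\,\bar R_i - \varepsilon\,\bar R_{i+1} = R_i ,
\]
with the smoothed residuals \(\bar R\) replacing \(R\) in the Runge-Kutta stages.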
FAS Multigrid Algorithm

C     ******************************************************************
C     *                                                                *
C     *   TRANSFERS THE SOLUTION TO A COARSER MESH                     *
C     *                                                                *
C     ******************************************************************
      ……….
      DO N=1,4
         JJ = 1
         DO J=2,JL,2
            JJ = JJ + 1
            II = 1
            DO I=2,IL,2
               II = II + 1
               WWR(II,JJ,N) = (DW(I,J,N)*VOL(I,J) + DW(I+1,J,N)*VOL(I+1,J)
     .                      +  DW(I,J+1,N)*VOL(I,J+1) + DW(I+1,J+1,N)*VOL(I+1,J+1))/
     .                        (VOL(I,J) + VOL(I+1,J) + VOL(I,J+1) + VOL(I+1,J+1))
            END DO
         END DO
      END DO
      ……….
C     ******************************************************************
C     *                                                                *
C     *   COMPLETE THE CALCULATION OF THE MGRID FORCING TERMS          *
C     *   ON FIRST ENTRY TO ANY COARSER MESH DURING THE CYCLE          *
C     *   TRANSFERS THE SOLUTION TO A COARSER MESH                     *
C     *                                                                *
C     ******************************************************************
C
      DO 32 J=2,JL
      DO 32 I=2,IL
      WR(I,J,N) = FCOLL*WR(I,J,N) - DW(I,J,N)
   32 CONTINUE
C
C     ADD THE MULTIGRID FORCING TERMS TO THE RESIDUALS
C
   33 DO 34 J=2,JL
      DO 34 I=2,IL
      DW(I,J,N) = DW(I,J,N) + WR(I,J,N)
   34 CONTINUE
   40 CONTINUE
• Implementation follows description of FAS algorithm closely
• Most of the code is unaware of what grid level it is working on
• Pointer kept for current, finer, and coarser grid levels
Is this good enough?
• Two-dimensional Euler results converge with an average residual contraction ratio of ~ 0.65 (almost “textbook” multigrid)
• Three-dimensional Euler calculations also converge with an average contraction ratio ~ 0.75
• Turbulent NS calculations with integration to the wall usually converge at ~ 0.98-0.99 (see the arithmetic after this list)
• What has happened to our “textbook” multigrid?
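For perspective (simple arithmetic, not from the slides), the number of cycles needed to drop the residual by ten orders of magnitude for a contraction ratio \(\rho\) is
\[
n = \frac{10}{-\log_{10}\rho}: \qquad
\rho = 0.65 \Rightarrow n \approx 53, \quad
\rho = 0.75 \Rightarrow n \approx 80, \quad
\rho = 0.98 \Rightarrow n \approx 1150, \quad
\rho = 0.99 \Rightarrow n \approx 2300 ,
\]
so the turbulent cases need well over an order of magnitude more multigrid cycles than the Euler cases for the same residual drop.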
Parallel Implementation
• For cell-centered schemes such as the one in SUmb, parallel implementations are “straightforward”
• Chop up the domain and distribute it “evenly” among processors (using Metis and a graph representation of the multiblock mesh)
• Whole blocks get assigned to processors
• Possibility of block splitting to improve load balancing
• Double halo approach to exchange information
Parallel Implementation
Parallel Implementation - Main Issues
• In our algorithm, bandwidth is important since we must communicate
– At the end of each stage in the mod R-K sequence
– At all multigrid levels
• Double level halo is necessary on each communication
• Double precision messages (8 words = 5 cons. vars + 2 turbulence vars + pressure)
• Non-blocking receives and sends are used throughout the code (a halo-exchange sketch follows this list)
• Bandwidth is not the most stringent requirement
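To make the communication pattern concrete, here is a minimal free-form Fortran sketch of a non-blocking halo exchange in one index direction. The routine and variable names (exchange_halo, w, left, right) are illustrative only, and for brevity a single-cell halo of nvar variables is exchanged rather than SUmb's actual double halo of eight words per cell.

      subroutine exchange_halo(w, il, jl, kl, nvar, left, right, comm)
      ! Hypothetical sketch, not SUmb source: exchange one layer of cells
      ! with the left and right neighbors in the i-direction.
        use mpi
        implicit none
        integer, intent(in)    :: il, jl, kl, nvar, left, right, comm
        real(8), intent(inout) :: w(0:il+1, 0:jl+1, 0:kl+1, nvar)
        real(8), allocatable   :: sbufl(:,:,:), sbufr(:,:,:), rbufl(:,:,:), rbufr(:,:,:)
        integer :: req(4), ierr, n

        n = (jl+2)*(kl+2)*nvar
        allocate(sbufl(0:jl+1,0:kl+1,nvar), sbufr(0:jl+1,0:kl+1,nvar))
        allocate(rbufl(0:jl+1,0:kl+1,nvar), rbufr(0:jl+1,0:kl+1,nvar))

        ! Post the receives first so incoming messages can be matched immediately.
        call MPI_Irecv(rbufl, n, MPI_REAL8, left,  1, comm, req(1), ierr)
        call MPI_Irecv(rbufr, n, MPI_REAL8, right, 2, comm, req(2), ierr)

        ! Pack the owned faces adjacent to each neighbor and send them.
        sbufl = w(1,  :, :, :)
        sbufr = w(il, :, :, :)
        call MPI_Isend(sbufl, n, MPI_REAL8, left,  2, comm, req(3), ierr)
        call MPI_Isend(sbufr, n, MPI_REAL8, right, 1, comm, req(4), ierr)

        ! Local work could overlap here; then complete all four requests.
        call MPI_Waitall(4, req, MPI_STATUSES_IGNORE, ierr)

        ! Unpack the received faces into the halo cells.
        w(0,    :, :, :) = rbufl
        w(il+1, :, :, :) = rbufr
        deallocate(sbufl, sbufr, rbufl, rbufr)
      end subroutine exchange_halo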
Parallel Implementation - Main Issues
• In our algorithm, latency is VERY important, and its impact grows at the coarse mesh levels
– Fine meshes: 95% of the communication cost is bandwidth
– Coarsest meshes: 65% of the communication cost is latency
• The only way to avoid the latency cost is to communicate less often. We have tried this and found that the improvement in parallel performance is more than offset by the degradation in multigrid convergence.
• Bottom line: for good multigrid convergence in parallel you will need a high-performance network...100BaseT switches will not cut it beyond 8 processors.
Parallel Implementation - Main Issues
• In the literature, there are options for vertical and horizontal multigrid
– How often does one communicate at the coarser levels of the multigrid sequence?
– Only on the fine mesh? At all mesh levels? Sometimes at the coarse levels?
• Most researchers now agree that:
– A parallel multigrid implementation that mimics the convergence history of a serial implementation is the best approach
– A parallel performance hit is taken (but it is overcome by the resulting improvements in performance)
– Some uses of multigrid as a preconditioner can ameliorate the parallel performance impact
Multigrid Semi-coarsening / J-coarsening
• There are indications that something can be done about the lack of convergence of multigrid methods in RANS calculations
• Encouraging 2D results first shown by N. A. Pierce and reproduced by Darmofal
• Results shown in this talk are from N. A. Pierce's PhD Thesis
– N. A. Pierce, Preconditioned Multigrid Methods for Compressible Flow Calculations on Stretched Meshes, PhD Thesis, Christ Church, University of Oxford, 1997.
• 3D remains an elusive challenge
Why does multigrid not converge well for RANS?
• Stiffness comes in from a variety of sources
– Propagative speed disparity
– Cell stretching (illustrated after this list)
– Flow alignment
– Turbulence models
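As an illustration of the cell-stretching item (standard reasoning, not taken from the slides): the local time step is limited by the sum of the directional spectral radii,
\[
\Delta t \le \mathrm{CFL}\, \frac{V}{\lambda_I + \lambda_J + \lambda_K},
\qquad \lambda_J \gg \lambda_I \ \text{when the cell aspect ratio } \frac{\Delta x}{\Delta y} \gg 1 ,
\]
so in strongly stretched boundary-layer cells the wall-normal direction dictates the time step, error modes aligned with the stretching are damped very slowly, and isotropic full coarsening does not help those modes; this motivates the semi-/J-coarsening discussed later.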
Euler Convergence, Full Multigrid
• Euler solutions with full multigrid converge reasonably well, although performance can be improved
• Convergence behaves appropriately for larger meshes and higher Mach numbers, although it can always be improved
Euler / NS Multigrid Convergence Analysis
[Figures: Standard Full Coarsened Multigrid; Standard Full Coarsened Multigrid + Block-Jacobi Preconditioner]
Euler / NS Multigrid Convergence Analysis
• Standard full coarsening has problems with
– All convective modes
– High-x / Low-y acoustic modes
• Adding a Block-Jacobi preconditioner still has problems with
– High-x / Low-y acoustic modes
• J-Coarsening (semi-coarsening) appears to be able to damp all modes.
• Does it?
[Figure: J-Coarsened Multigrid with Block-Jacobi Preconditioner]
NS Multigrid Convergence Test Cases
• RANS solutions can be made to converge as efficiently (contraction ratios ~ 0.75) as Euler calculations!!!
NS Multigrid Convergence Test Cases
• Good convergence comes from a combination of techniques that provide damping across the whole spectrum of modes.
• What about for 3D?
NS Multigrid Convergence Test Cases
• In theory, the same can be proven for 3D calculations
• In practice, this has only been shown for VERY smooth meshes (no wing tips, no tip gaps, no bad stretching)
• More work remains to be done
Conclusions
• Multigrid has been one of the most successful algorithms for compressible fluid flow (of course, also for elliptic equations)
• It can be parallelized efficiently
• It can be used for both steady and unsteady computations (dual-time stepping)
• Performance degrades with high degrees of mesh stretching
• Work remains to be done to obtain “textbook” multigrid convergence