Multigrid Algorithms for Three-Dimensional RANS Calculations - The SUmb Solver
Juan J. Alonso
Department of Aeronautics & Astronautics Stanford University
CME342 Lecture 14 May 21, 2012
Outline
• Non-linear multigrid algorithm - Review
• FAS multigrid + modified Runge-Kutta scheme
• Software and parallel implementation
• SUmb solver and results
• Unsteady algorithms
• Future
Non-linear Multigrid Algorithm
Solve the non-linear equation
on a mesh with spacing h
by finding a correction to the current iterate
with some algebra
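The equations on this slide did not survive transcription; written in standard FAS notation (the symbols below are assumed here, not necessarily those used on the original slide), the steps are: the non-linear fine-grid problem, the correction to the current iterate, and the equation the correction satisfies,
\[
A_h(u_h) = f_h, \qquad u_h = v_h + \delta u_h ,
\]
\[
A_h(v_h + \delta u_h) - A_h(v_h) = f_h - A_h(v_h) \equiv r_h .
\]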
Non-linear Multigrid Algorithm
and after smoothing the right hand side we can transfer the equation, solution, and residual to a coarser mesh
After relaxation in the coarse mesh, the coarse grid correction is
which can be interpolated to the finer mesh and the procedure repeated until convergence (possibly recursively)
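Again in standard FAS notation (assumed symbols): the coarse-grid problem driven by the restricted fine-grid residual, the coarse-grid correction after relaxation, and its prolongation back to the fine mesh are
\[
A_H(u_H) = A_H\!\left(I_h^H v_h\right) + I_h^H r_h ,
\]
\[
\delta u_H = u_H - I_h^H v_h , \qquad v_h \leftarrow v_h + I_H^h\, \delta u_H .
\]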
Modified Runge-Kutta Time Stepping
Semi-discrete NS equations
Convective and dissipative residuals
Modified Runge-Kutta scheme
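The formulas themselves are missing from the transcript; a common way to write them (Jameson-style notation, assumed here rather than copied from the slide) is the semi-discrete equation, the split of the residual into convective and dissipative parts, and an m-stage modified Runge-Kutta update in which the dissipation is re-evaluated and blended only at selected stages (with \(D^{(0)} = D(w^{(0)})\) and \(\beta_k = 0\) at the stages where the dissipation is frozen):
\[
V \frac{dw}{dt} + R(w) = 0, \qquad R(w) = Q(w) - D(w),
\]
\[
w^{(0)} = w^n, \qquad
w^{(k)} = w^{(0)} - \alpha_k \frac{\Delta t}{V}\left( Q\!\left(w^{(k-1)}\right) - D^{(k-1)} \right), \qquad
w^{n+1} = w^{(m)},
\]
\[
D^{(k)} = \beta_k\, D\!\left(w^{(k)}\right) + (1 - \beta_k)\, D^{(k-1)} .
\]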
Modified Runge-Kutta Time Stepping
Modified Runge-Kutta scheme
• Coefficients for Q and D are created separately to
– Minimize CPU time
– Improve convergence properties
• Time accuracy is lost...do we care?
FAS Multigrid Algorithm
• Coarse grid driven by residuals transferred from fine mesh
• We impose boundary conditions on every mesh
• 1st order artificial dissipation on coarser meshes
Volume/Area-weighted solution coarsening
Coarse grid residual forcing term
Modified R-K in coarse grid
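In standard FAS form (assumed notation, the original equations were not transcribed), the three items above are a volume-weighted restriction of the solution, a coarse-grid forcing term built from the restricted fine-grid residual, and the same modified R-K smoother applied to the forced coarse-grid equation:
\[
w_H^{(0)} = \frac{\sum_{c} V_c\, w_c}{\sum_{c} V_c} \quad \text{(sum over the fine cells forming each coarse cell)},
\]
\[
P_H = I_h^H R_h(w_h) - R_H\!\left(w_H^{(0)}\right), \qquad
V_H \frac{dw_H}{dt} + R_H(w_H) + P_H = 0 .
\]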
FAS Multigrid Algorithm
• Full-coarsened multigrid
• V or W cycles of arbitrary depth (a cycle-schedule sketch follows this list)
• Bi-linear or tri-linear interpolation of solution to finer meshes
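As a rough illustration of how a V- or W-cycle of arbitrary depth can be driven, the following self-contained free-form Fortran sketch (hypothetical routine and variable names, not SUmb source) prints the level-visiting schedule of a recursive FAS cycle; gamma = 1 reproduces a V-cycle and gamma = 2 a W-cycle.

      program cycle_schedule
      ! Hypothetical sketch, not SUmb code: prints the grid levels visited
      ! by a recursive FAS cycle.  Level 1 is the fine mesh.
        implicit none
        print *, 'V-cycle, 4 levels:'
        call fas_cycle(1, 4, 1)
        print *, 'W-cycle, 4 levels:'
        call fas_cycle(1, 4, 2)
      contains
        recursive subroutine fas_cycle(level, nlevels, gamma)
          integer, intent(in) :: level, nlevels, gamma
          integer :: i
          print *, '  smooth on level', level            ! modified R-K stages would run here
          if (level < nlevels) then
             print *, '  restrict to level', level + 1   ! volume-weighted transfer + forcing term
             do i = 1, gamma                             ! gamma visits of the coarser level
                call fas_cycle(level + 1, nlevels, gamma)
             end do
             print *, '  prolong correction to level', level  ! bi/tri-linear interpolation
          end if
        end subroutine fas_cycle
      end program cycle_schedule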
Additional Convergence Acceleration Schemes
• Local time stepping
• Implicit Residual Smoothing (standard forms of this and of local time stepping are sketched after this list)
• Enthalpy Damping
• Others...
– Block-Jacobi preconditioner
– Low-speed preconditioning
– J-coarsening or semi-coarsening
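For the first two items, the standard textbook forms (not taken from the SUmb source) are a local time step scaled by the cell volume and the sum \(\lambda_i\) of the spectral radii of the flux Jacobians over the cell faces, and a residual smoothing step (shown here in one dimension) that is solved with a tridiagonal sweep in each coordinate direction and allows a larger CFL number:
\[
\Delta t_i = \mathrm{CFL}\, \frac{V_i}{\lambda_i}, \qquad
-\varepsilon\,\bar R_{i-1} + (1 + 2\varepsilon)\,\bar R_i - \varepsilon\,\bar R_{i+1} = R_i ,
\]
with the smoothed residuals \(\bar R\) replacing \(R\) in the Runge-Kutta stages.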
FAS Multigrid Algorithm

C     ******************************************************************
C     *                                                                *
C     *   TRANSFERS THE SOLUTION TO A COARSER MESH                     *
C     *                                                                *
C     ******************************************************************
      ……….
      DO N=1,4
         JJ = 1
         DO J=2,JL,2
            JJ = JJ + 1
            II = 1
            DO I=2,IL,2
               II = II + 1
               WWR(II,JJ,N) = (DW(I,J,N)*VOL(I,J) + DW(I+1,J,N)*VOL(I+1,J)
     .                      +  DW(I,J+1,N)*VOL(I,J+1) + DW(I+1,J+1,N)*VOL(I+1,J+1))/
     .                        (VOL(I,J) + VOL(I+1,J) + VOL(I,J+1) + VOL(I+1,J+1))
            END DO
         END DO
      END DO
      ……….
C     ******************************************************************
C     *                                                                *
C     *   COMPLETE THE CALCULATION OF THE MGRID FORCING TERMS          *
C     *   ON FIRST ENTRY TO ANY COARSER MESH DURING THE CYCLE          *
C     *   TRANSFERS THE SOLUTION TO A COARSER MESH                     *
C     *                                                                *
C     ******************************************************************
C
      DO 32 J=2,JL
      DO 32 I=2,IL
      WR(I,J,N) = FCOLL*WR(I,J,N) - DW(I,J,N)
   32 CONTINUE
C
C     ADD THE MULTIGRID FORCING TERMS TO THE RESIDUALS
C
   33 DO 34 J=2,JL
      DO 34 I=2,IL
      DW(I,J,N) = DW(I,J,N) + WR(I,J,N)
   34 CONTINUE
   40 CONTINUE
• Implementation follows description of FAS algorithm closely
• Most of the code is unaware of what grid level it is working on
• Pointer kept for current, finer, and coarser grid levels
Is this good enough?
• Two-dimensional Euler results converge with an average residual contraction ratio of ~ 0.65 (almost “textbook” multigrid)
• Three-dimensional Euler calculations also converge with an average contraction ratio ~ 0.75
• Turbulent NS calculations with integration to the wall usually converge at ~ 0.98-0.99 (see the arithmetic after this list)
• What has happened to our “textbook” multigrid?
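For perspective (simple arithmetic, not from the slides), the number of cycles needed to drop the residual by ten orders of magnitude for a contraction ratio \(\rho\) is
\[
n = \frac{10}{-\log_{10}\rho}: \qquad
\rho = 0.65 \Rightarrow n \approx 53, \quad
\rho = 0.75 \Rightarrow n \approx 80, \quad
\rho = 0.98 \Rightarrow n \approx 1150, \quad
\rho = 0.99 \Rightarrow n \approx 2300 ,
\]
so the turbulent cases need well over an order of magnitude more multigrid cycles than the Euler cases for the same residual drop.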
Parallel Implementation
• For cell-centered schemes such as the one in SUmb, parallel implementations are “straightforward”
• Chop up the domain and distribute it “evenly” among processors (using Metis and a graph representation of the multiblock mesh)
• Whole blocks get assigned to processors
• Possibility of block splitting to improve load balancing
• Double halo approach to exchange information
Parallel Implementation
Parallel Implementation - Main Issues
• In our algorithm, bandwidth is important since we must communicate
– At the end of each stage in the mod R-K sequence
– At all multigrid levels
• Double level halo is necessary on each communication
• Double precision messages (8 words = 5 cons. vars + 2 turbulence vars + pressure)
• Non-blocking receives and sends are used throughout the code (a halo-exchange sketch follows this list)
• Bandwidth is not the most stringent requirement
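To make the communication pattern concrete, here is a minimal free-form Fortran sketch of a non-blocking halo exchange in one index direction. The routine and variable names (exchange_halo, w, left, right) are illustrative only, and for brevity a single-cell halo of nvar variables is exchanged rather than SUmb's actual double halo of eight words per cell.

      subroutine exchange_halo(w, il, jl, kl, nvar, left, right, comm)
      ! Hypothetical sketch, not SUmb source: exchange one layer of cells
      ! with the left and right neighbors in the i-direction.
        use mpi
        implicit none
        integer, intent(in)    :: il, jl, kl, nvar, left, right, comm
        real(8), intent(inout) :: w(0:il+1, 0:jl+1, 0:kl+1, nvar)
        real(8), allocatable   :: sbufl(:,:,:), sbufr(:,:,:), rbufl(:,:,:), rbufr(:,:,:)
        integer :: req(4), ierr, n

        n = (jl+2)*(kl+2)*nvar
        allocate(sbufl(0:jl+1,0:kl+1,nvar), sbufr(0:jl+1,0:kl+1,nvar))
        allocate(rbufl(0:jl+1,0:kl+1,nvar), rbufr(0:jl+1,0:kl+1,nvar))

        ! Post the receives first so incoming messages can be matched immediately.
        call MPI_Irecv(rbufl, n, MPI_REAL8, left,  1, comm, req(1), ierr)
        call MPI_Irecv(rbufr, n, MPI_REAL8, right, 2, comm, req(2), ierr)

        ! Pack the owned faces adjacent to each neighbor and send them.
        sbufl = w(1,  :, :, :)
        sbufr = w(il, :, :, :)
        call MPI_Isend(sbufl, n, MPI_REAL8, left,  2, comm, req(3), ierr)
        call MPI_Isend(sbufr, n, MPI_REAL8, right, 1, comm, req(4), ierr)

        ! Local work could overlap here; then complete all four requests.
        call MPI_Waitall(4, req, MPI_STATUSES_IGNORE, ierr)

        ! Unpack the received faces into the halo cells.
        w(0,    :, :, :) = rbufl
        w(il+1, :, :, :) = rbufr
        deallocate(sbufl, sbufr, rbufl, rbufr)
      end subroutine exchange_halo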
Parallel Implementation - Main Issues
• In our algorithm, latency is VERY important, and its impact grows at the coarse mesh levels
– Fine meshes: 95% of the communication cost is bandwidth
– Coarsest meshes: 65% of the communication cost is latency
• The only way to avoid the latency cost is to communicate less often. We have tried this and found that the improvement in parallel performance is more than offset by the degradation in multigrid convergence.
• Bottom line: for good multigrid convergence in parallel you will need a high-performance network...100BaseT switches will not cut it beyond 8 processors.
Parallel Implementation - Main Issues
• In the literature, there are options for vertical and horizontal multigrid
– How often does one communicate at the coarser levels of the multigrid sequence?
– Only on the fine mesh? At all mesh levels? Sometimes at the coarse levels?
• Most researchers now agree that:
– A parallel multigrid implementation that mimics the convergence history of a serial implementation is the best approach
– A parallel performance hit is taken (but it is overcome by the resulting improvements in performance)
– Some uses of multigrid as a preconditioner can ameliorate the parallel performance impact
Multigrid Semi-coarsening / J-coarsening
• There are indications that something can be done about the lack of convergence of multigrid methods in RANS calculations
• Encouraging 2D results first shown by N. A. Pierce and reproduced by Darmofal
• Results shown in this talk are from N. A. Pierce's PhD Thesis
– N. A. Pierce, Preconditioned Multigrid Methods for Compressible Flow Calculations on Stretched Meshes, PhD Thesis, Christ Church, University of Oxford, 1997.
• 3D remains an elusive challenge
Why does multigrid not converge well for RANS?
• Stiffness comes in from a variety of sources
– Propagative speed disparity
– Cell stretching (illustrated after this list)
– Flow alignment
– Turbulence models
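As an illustration of the cell-stretching item (standard reasoning, not taken from the slides): the local time step is limited by the sum of the directional spectral radii,
\[
\Delta t \le \mathrm{CFL}\, \frac{V}{\lambda_I + \lambda_J + \lambda_K},
\qquad \lambda_J \gg \lambda_I \ \text{when the cell aspect ratio } \frac{\Delta x}{\Delta y} \gg 1 ,
\]
so in strongly stretched boundary-layer cells the wall-normal direction dictates the time step, error modes aligned with the stretching are damped very slowly, and isotropic full coarsening does not help those modes; this motivates the semi-/J-coarsening discussed later.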
Euler Convergence, Full Multigrid
• Euler solutions with full multigrid converge reasonably well, although performance can be improved
• Convergence behaves appropriately for larger meshes and higher Mach numbers, although it can always be improved
Euler / NS Multigrid Convergence Analysis
[Figures: Standard Full Coarsened Multigrid; Standard Full Coarsened Multigrid + Block-Jacobi Preconditioner]
Euler / NS Multigrid Convergence Analysis
• Standard full coarsening has problems with
– All convective modes
– High-x / Low-y acoustic modes
• Adding a Block-Jacobi preconditioner still has problems with
– High-x / Low-y acoustic modes
• J-Coarsening (semi-coarsening) appears to be able to damp all modes.
• Does it?
[Figure: J-Coarsened Multigrid with Block-Jacobi Preconditioner]
NS Multigrid Convergence Test Cases
• RANS solutions can be made to converge as efficiently (contraction ratios ~ 0.75) as Euler calculations!!!
NS Multigrid Convergence Test Cases
• Good convergence comes from a combination of techniques that provide damping across the whole spectrum of modes.
• What about for 3D?
NS Multigrid Convergence Test Cases
• In theory, the same can be proven for 3D calculations
• In practice, this has only been shown for VERY smooth meshes (no wing tips, no tip gaps, no bad stretching)
• More work remains to be done
Conclusions
• Multigrid has been one of the most successful algorithms for compressible fluid flow (of course, also for elliptic equations)
• It can be parallelized efficiently
• It can be used for both steady and unsteady computations (dual-time stepping)
• Performance degrades with high degrees of mesh stretching
• Work remains to be done to obtain “textbook” multigrid convergence