design of optimal explicit runge kutta schemes for the

Design of optimal explicit Runge–Kutta schemes for thehigh-order spectral difference method

Matteo Parsani

NASA Langley Research Center*

King Abdullah University of Science and Technology

National Institute of AerosapceComputational Fluid Dynamics Seminar

April 2, 2013

(LaRC/KAUST) Optimzed ERK schemes for SD April 2, 2013 1 / 40

Outline

1 IntroductionAcknowledgementsBackground

2 Spectral differenceBasic ideaSpectrum of the 2D advection equation

3 Optimization of explicit Runge-Kutta methodsStability function optimizationMinimization of the leading truncation error coefficientMeasure of the efficiencies

4 ApplicationsAdvection equationLEECompressible Euler equationsCompressible Navier-Stokes equations

5 Conclusions


Introduction Acknowledgements

Acknowledgements

David Ketcheson (KAUST, KSA)

Willem Deconinck (ECMWF, UK)

Michael Bilka (UND, USA)

Aron Ahamadia (Continuum Analytics, USA)


Introduction Background

Low-order methods

A major achievement of the period from 1970 to 1990 was theidentification of so called high-resolution shock capturing schemes whichare

non-oscillatory

second-order accurate almost everywhere

first-order accurate in the vicinity of extrema

As a result of their favorable properties, the use of low-order schemes hasbecome widespread in both academia and industry, and they form thebasis of all commercially available fluid flow solvers.



Low-order methods

Despite these successes, there exists a range of important flow problems(requiring very low numerical dissipation) for which they are not wellsuited, e.g.,

vortex dominated flows, e.g.I flow around rotor-craft bladesI flapping wings

aeroacoustics

turbulent flows :-)

It may be advantageous to use a high-order spatial discretizations whichoffer increased accuracy for a comparable computational cost.



High-order methods

Why high-order accurate methods could outperform the low-orders ones?

Less dissipative and dissipative on a coarser grids for the same costthan lower-order schemes on finer grids

0 0.5 1 1.5 2 2.5 3−4.5

−4

−3.5

−3

−2.5

−2

−1.5

−1

−0.5

0

0.5

Wave Number/(p+1)

Mo

dif

. D

iss. R

ate

/(p

+1)

SV/SD2SV/SD3SV/SD4SV/SD5SV/SD6Exact

(a) Diffusive properties

0 0.5 1 1.5 2 2.5 30

0.5

1

1.5

2

2.5

3

Mo

dif

. A

ng

. F

req

./(p

+1)

Wave Number/(p+1)

SD/SV2SD/SV3SD/SV4SD/SV5SD/SV6Exact

(b) Dispersive properties

Figure: Example of dispersive and diffusive properties for high-order compactcell-wise continuous polynomials methods.



Considerable effort has been devoted to the development of compactspatially high order methods on unstructured grids for system of PDEs,e.g.

Discontinuous Galerkin

Spectral difference

Spectral volume

Flux reconstruction

etc.

Before this era, available high-order methods were designed to be used onCartesian or very smooth structured curvilinear meshes, e.g. spectralelement



Open issues for high-order methods

The use of high-order methods in academia remains limited, and it ispractical inexistent in industry.

Reasons:

lack of robust high-order mesh generation software

lack of robust and accurate shock capturing algorithms

(lack of subgrid-scale models)

lack of simple and efficient time integration schemes tailoredspecifically for high-order spatial discretizations


Spectral difference Basic idea

Spectral difference

General system of conservation laws in physical space:

∂q

∂t+ ~∇ ·

(~fC (q)−~fD

(q, ~∇q

))=∂q

∂t+ ~∇ ·~f = 0

Mapped coordinate system on each cell →

~f~ξi =

f~ξi

g~ξi

h~ξi

= Ji~~J −1

i

figi

hi

= Ji~~J −1

i~fi

∂ (Jiqi )

∂t≡ ∂q

~ξi

∂t= −∂f

~ξi

∂ξ1− ∂g

~ξi

∂ξ2− ∂h

~ξi

∂ξ3= −~∇~ξ ·~f ~ξi


Spectral difference Basic idea

Spectral difference

Solution (degree p) and flux polynomials (degree p + 1):

qi ≈ Qi

(~ξ)

=Ns∑j=1

Qi ,j Lsj

(~ξ), ~F

~ξi

(~ξ)

=N f∑l=1

~F~ξi ,l L

fl

(~ξ)

dQi ,j

dt= − 1

Ji ,j~∇~ξ · ~F ~ξ

i

∣∣∣j

= Ri ,j

Riemann flux to handle the solution discontinuity.

−1 0 1

−1

0

1

(a) p = 1.

−1 0 0.58 1

−1

0

1

α3

(b) p = 2.


Spectral difference Spectrum of the 2D advection equation

Model problem

2D linear advection equation: ∂q∂t + ~∇ · (~aq) = 0

~r2

i + 1, j + 1

i + 1, j

i, j + 1i − 1, j + 1

i − 1, j

i − 1, j − 1 i, j − 1

i, j

i + 1, j − 1

~r1

Figure: Generating pattern.

Insert Fourier wave: Qi ,j (t) = Q̃ (t) e I~k·(i~r ′

1 +j~r ′2 )∆r = Q̃ (t) e I

~K ·(i~r ′1 +j~r ′

2 )

→ dQ̃

dt+|~a|∆r

LQ̃ = 0, ∆r = |~r1|


Spectral difference Spectrum of the 2D advection equation

Why lack of efficient time integrations?

It is desirable to use the largest step size possible if the temporaldiscretization error is acceptable.

For higher order schemes, the spectrum of the Jacobian of thesemi-discretization often has increasingly large eigenvalues.

(a) 2nd-order SD (b) 3rd-order SD (c) 4th-order SD

Figure: Fourier footprint of the 2D advection equation.


Optimization of explicit Runge-Kutta methods

Time integration schemes for high-order methods

For explicit time stepping schemes the step size is often limited bystability requirements, which become stricter with higher ordermethods

Implicit methods allow the use of much larger step sizes, but lead tovery large memory requirements that may not be feasible

The development of efficient algebraic solvers for high-order implicitdiscretizations remains challenging

Goal

=⇒ Explicit time integration methods that allow large stepsizes and require less memory seem to be an appealing alter-native


Optimization of explicit Runge-Kutta methods Stability function optimization

Consider a Runge-Kutta method with Butcher’s pair (A,b),

yi = un + ∆ts∑

j=1

aij f (yj) (1 ≤ i ≤ s),

un+1 = un + ∆ts∑

j=1

bj f (yj).

By applying such a method to the following linear, constant-coefficient IVP

Q′(t) = LQ, Q(t) : R→ RN ,L ∈ RN×N ,

Q(0) = Q0,

the following iteration is obtained:

Qn+1 = ψ(∆tL)Qn, ψ(z) = 1 +s∑

j=0

bTAj−1e z j ,



Assume that L is diagonalizable with eigenvalues let λi , i = 1, . . . ,N.Then the solution is absolutely stable for the time step ∆t if

∆tλi ∈ {z ∈ C : |ψ(z)| ≤ 1} for 1 ≤ i ≤ N.

if L is non-normal (i.e. if L∗ = L> does not hold) its pseudospectrumis used



Stability function of explicit Runge–Kutta methods

For an s-stage, order p explicit Runge–Kutta method

ψ(z) =s∑

j=0

βjzj =

p∑j=0

1

j!z j +

s∑j=p+1

βjzj

Since the exact solution of the IVP is

Q(t) = exp(tL)Q0

if the method is accurate to order p, the stability polynomial must beidentical to the Taylor expansion of the exponential function up to termsof at least order p.



Design of optimal stability polynomials

We are interested in designing RK methods that maximize the stepsize (or ν = |~a| ∆t

∆r ) for which the given stability constraints aresatisfied

In general, the cost of taking a time step is proportional to thenumber of stages s

We use the *effective step size* ∆t/s to compare the computationalefficiency of methods with different numbers of stages




Design optimal methods by first choosing the coefficients βj tomaximize the linearly stable time step

ψ(z) =

p∑j=0

1

j!z j +

s∑j=p+1

βjzj

Problem (Natural ψ(z) optimization, dQ̃dt

+ |~a|∆r

LQ̃ = 0)

Given a spectrum {λi}, an order p, and a number of stages s,

maximize ν

subject to |ψ(νλi )| − 1 ≤ 0, i = 1, . . . ,N




If instead, we assume ν is fixed, we can consider:

Problem (Smart ψ(z) optimization)

Given a spectrum {λi}, an order p, and a number of stages s,

minimize max|ψ(νλi )| − 1 ≤ 0, i = 1, . . . ,N

Where we minimize over the coefficients βj and the maximum is takenover all eigenvalues in the spectrum.

Convex feasibility problem, and therefore computationallyefficient to solve

Solve a sequence of convex feasibility problems to find thelargest ν for which the spectrum lies within a stabilitypolynomial for that step size



Example of optimal 4th-order ERK scheme

(a) Kutta’s ERK(4,4) (b) Optimal ERK(18,4)

Effective CFL number (νstab/s)

Kutta’s ERK(4,4): 3.9534× 10−2

Optimal ERK(18,4): 6.5233× 10−2 (65% improvement!)


Optimization of explicit Runge-Kutta methods Minimization of the leading truncation error coefficient

Low storage Runge–Kutta methods

The optimal stability polynomial does not uniquely define the Butchercoefficients:

Find the Butcher pair A,b which minimizes the truncation errorcoefficients C (p+1), satisfies the order conditionsτ

(j)i (A,b) = 0, (0 ≤ j ≤ p) and can be written in low-storage form

with 3 registers Γ(A,b) = 0.

Problem (C (p+1)(A,b) optimization)

minimize C (p+1)(A,b) > 0

subject to

τ(j)i (A,b) = 0 (0 ≤ j ≤ p)

Γ(A,b) = 0

bTAj−1e = βj (0 ≤ j ≤ s)


Optimization of explicit Runge-Kutta methods Measure of the efficiencies

Stability and accuracy efficiencies

If the stability is the more restrictive concern, the relative efficiency oftwo RK methods of order p can be measured as

χstab =σ ν1/s1

σ ν2/s2=ν1/s1

ν2/s2

σ: safe factor.

If the accuracy is the more restrictive concern, the relative efficiencyis given by

χacc =

(C

(p+1)2

C(p+1)1

) 1ps2

s1



(a) Opt. 2nd-order vs. ERK(2,2) (b) Opt. 3rd-order vs. ERK(3,3)

(c) Opt. 4th-order vs. ERK(4,4) (d) Opt. 5th-order vs. ERKF(6,5)

Figure: Efficiencies and maximum linearly stable CFL number.(LaRC/KAUST) Optimzed ERK schemes for SD April 2, 2013 23 / 40


Internal stability

In the application of ERK schemes with many stages to time-dependentPDEs, there can be a serious accumulation of errors that may even rendermethods unusable.

εn+1 =(vs+1 + (αs+1 + zβs+1) (I − α1:s − zβ1:s)−1 v1:s

)εn

+ (αs+1 + zβs+1) (I − α1:s − zβ1:s)−1 r̃

= P(z)εn + Q(z)r̃ .

(1)


Applications Advection equation

Linear advection equation with variable velocity

The conservation law reads

∂q

∂t+ ~u · ~∇q = 0, ~u (x , y) =

[uv

]= ω

[−yx

]

Figure: Gaussian wave advected in an annulus; 4th-order SD method and optimalERK(18,4) scheme.




(a) Optimal 2nd-order vs.ERK(2,2)

(b) Optimal 3rd-order vs.ERK(3,3)

(c) Optimal 4th-order vs.ERK(4,4)

(d) Optimal 5th-order vs.ERK(6,5)

Figure: Influence of the CFL number on the maximum error norm of a 2DGaussian wave advecting in an annulus.




25 30 35 40 45 50CPU [s]

10−5

10−4

10−3

10−2

10−1

||ε|| L

∞

2,23,28,2

3,35,3

17,3

4,49,4

18,4

6,510,520,5

(a) Error versus CPU time.

2nd-order 3rd-order 4th-order 5th-order0

10

20

30

40

50

CP

U[s

]

2,2 3,3 4,4 6,5

3,2 5,3

9,410,5

17,318,4 20,5

8,2

(b) CPU time for each simulation.

Figure: Error and CPU time for the advection problem.

Unit increment of the order of accuracy → reduction of theerror of one order of magnitudeNew ERK schemes: 42% to 65% more efficient for 4th- and5th-order schemes


Applications LEE

Linearized Euler Equations

Propagation of an acoustic pulse:

Figure: Propagation of a Gaussian pulse; 4th-order SD method and optimalERK(18,4) scheme.


Applications LEE

Linearized Euler Equations

20 25 30 35 40 45 50CPU [s]

10−9

10−8

10−7

10−6

10−5

10−4

||ε|| L

∞

2,23,28,2

3,35,317,3

4,49,418,4

6,510,520,5



10

20

30

40

50

CP

U[s

]

2,23,3

4,4

6,5

3,25,3 9,4

10,5

17,3 18,48,2


Figure: Error and CPU time for the acoustic pulse problem.


Applications Compressible Euler equations

2D compressible Euler Equations

Flow past a wedge:

Figure: Density contour of the flow past a wedge; 4th-order SD method andoptimal ERK(18,4) scheme.



Compressible Euler Equations

35 40 45 50 55 60 65 70CPU [s]

10−9

10−8

10−7

10−6

10−5

10−4

10−3

||ε|| L

∞

2,23,28,2

3,35,317,3

4,49,418,4

6,510,5

20,5



10

20

30

40

50

60

70

CP

U[s

]

2,23,3

4,4

6,5

3,25,3

9,4

10,5

17,3 18,4

20,58,2


Figure: Error and CPU time for the wedge problem.



Optimization of ERK methods for high aspect ratio

We are extending the previous approach to optimize ERK schemes forunstructured grids with aspect ratio (AR) >> 1

AR from 10 to 500 (for now)

Rectangular domain with periodic and Dirichlet BC for the 2Dadvection diffusion equation


Applications Compressible Navier-Stokes equations

2D laminar cavity flow

Re∞ = 1500

M∞ = 0.15

Evolution of the dragcoefficient, CD vs. time:

−20 −10 0 10 20−1

−0.5

0

0.5

1

t |~u∞| /D

CD

Solution St 〈CD〉〈CPD 〉

DNS (Larsson et al.) 0.243 0.377 0.402

SD p = 2 (104, 274 DOFs) 0.246 0.385 0.409

SD p = 3 (185, 376 DOFs) 0.243 0.379 0.403

M0.22

0.2

0.18

0.16

0.14

0.12

0.1

0.08

0.06

0.04

0.02



2D laminar cavity flow

Radiated sound fieldsimulation:

Nine observer points

Solution for one flowperiod (laminar flow)

FW-H approach

S=1.1 D

1 2 3 4 5 6 7 8 9

3rd − order SD scheme, 43% speed-up

−2 −1 0 1 2 3 4 5 6113

114

115

116

117

118

119

120

121

122

x1/D

SPLin

dB

DNS, Larsson et al.

Curle: dipole, Larsson et al.

Curle: dipole + quadrupole, Larsson et al.

dipole

Perm: dipole + quadupole

4th − order SD scheme, 62% speed-up

−2 −1 0 1 2 3 4 5 6113

114

115

116

117

118

119

120

121

122

123

x1/D

SPLin

dB

DNS, Larsson et al.

Curle: dipole, Larsson et al.

Curle: dipole + quadrupole, Larsson et al.

dipole

Perm: dipole + quadupole



Flow in a muffler

Test case settings:

Red = 4.64× 104, M = 0.05, Pr = 0.72

3D N-S equations and WALE model

Mesh with 36, 612 hexahedral elements

4th-order SD schemes ( 2.3× 106 DOFs; the same used at VKI)

CFL ≈ 0.7

Figure: 3D muffler.Two preliminary results:

∼ 50% speed-up

Influence of SGS models (?)



Flow in a muffler

Figure: Average u velocity



Flow in a muffler

Figure: Time averaged velocity profiles.



Flow in a muffler

Figure: Reynolds stress.


Conclusions

Conclusions

Design of more robust and efficient tailored explicit RK for SD:I Linear propertiesI Minimization of the leading truncation error constantI Low-storage form with 3 registers

Numerical verification on unstructured grids

New schemes are effectively with unstructured grids at CFLmax

Stability efficiency improvements of 42% to 65%

No significant sacrifices in accuracy (SD error dominates)

Extension of the procedure to viscous laminar and turbulent flows issatisfactory (work in progress)

Etension to embedded Runge–Kutta pair in the very near future


Conclusions

Thank you for your attention!


design of optimal explicit runge kutta schemes for the

Documents