terra-neo a two-scale approach for e cient on-the- y

32
Motivation Stencils Method Numerical Study Performance Mantle Convection Terra-Neo A two-scale approach for efficient on-the-fly operator assembly on curved geometries Simon Bauer 1 Markus Huber 2 Marcus Mohr 1 Ulrich R¨ ude 3 Jens Weism¨ uller 1,5 Markus Wittmann 4 Barbara Wohlmuth 2 1 Dept. of Earth and Environmental Sciences, Ludwig-Maximilians-Universit¨ at M¨ unchen 2 Institute for Numerical Mathematics (M2), Technische Universit¨ at M¨ unchen 3 Computer Science 10, FAU Erlangen-N¨ urnberg 4 Erlangen Regional Computing Centre (RRZE), FAU Erlangen-N¨ urnberg 5 Leibniz Supercomputing Centre (LRZ) SPPEXA Annual Plenary Meeting 21.03.2017 Bauer et al. Terra-Neo 1/25

Upload: others

Post on 20-Jul-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Terra-NeoA two-scale approach for efficient on-the-fly operator

assembly on curved geometries

Simon Bauer1 Markus Huber2 Marcus Mohr1 Ulrich Rude3

Jens Weismuller1,5 Markus Wittmann4 Barbara Wohlmuth2

1Dept. of Earth and Environmental Sciences, Ludwig-Maximilians-Universitat Munchen2Institute for Numerical Mathematics (M2), Technische Universitat Munchen

3Computer Science 10, FAU Erlangen-Nurnberg4Erlangen Regional Computing Centre (RRZE), FAU Erlangen-Nurnberg

5Leibniz Supercomputing Centre (LRZ)

SPPEXA Annual Plenary Meeting21.03.2017

Bauer et al. Terra-Neo 1/25

Page 2: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Geophysicist

Earth Mantle represented as athick spherical shell

Bauer et al. Terra-Neo 2/25

Page 3: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Geophysicist

Earth Mantle represented as athick spherical shell

Mathematician/ComputerScientist

+ well-suited formatrix-free FE implementations

+ regular grid →constant access patterns

Bauer et al. Terra-Neo 2/25

Page 4: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Geophysicist

Earth Mantle represented as athick spherical shell

Mathematician/ComputerScientist

+ well-suited formatrix-free FE implementations

+ regular grid →constant access patterns

Bauer et al. Terra-Neo 2/25

Page 5: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Hierarchical Hybrid Grids (HHG)

Hybrid approach

• initial coarse unstructured mesh (macroelements)

• regular refinement of macro elements

• constant access patterns (FE stencils)within each macro element for therefined grids

• matrix-free implementation

• compute stencils on-the-fly (only ONCEper macro element), reduces memorytraffic

• excellent parallel scalability and nodeperformance [Gmeiner et al., 2015]

Projection of fine grid nodes

• accurate representation of geometry onall levels

• FE stencils are not constant withinmacro elements

• need to assemble stencil for EACH finegrid node → SLOW

Bauer et al. Terra-Neo 3/25

Page 6: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Sequence of Mappings (2D sketch)

reference domain original HHG domain projected HHG domain

Bauer et al. Terra-Neo 4/25

Page 7: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Sequence of Mappings (2D sketch)

reference domain original HHG domain projected HHG domain

Can we get the performance of the original HHGalso on the projected domain?

Bauer et al. Terra-Neo 4/25

Page 8: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Outline

1) Introduce novel approach

2) Numerical study and performance results

3) Geophysical application

−∆u = f

Bauer et al. Terra-Neo 5/25

Page 9: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Novel Approach forEfficient Stencil Assembly on Curved

Geometries

Bauer et al. Terra-Neo 6/25

Page 10: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

HHG Stencil Layout

Layout of the 15pt stencil in HHG Selected interior points of a macrotetrahedron in 3D reference domain

Bauer et al. Terra-Neo 7/25

Page 11: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Original (blue) vs. projected (red) stencil weights

weights (TC, . . . ) of 15pt stencil for nodes in gray plane (previous slide)

TC TS

MW MNBauer et al. Terra-Neo 8/25

Page 12: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Novel Approach

Idea: Approximate stencils by low order polynomials

Replace expensive stencil assembly via evaluation of local elementmatrices by low-cost approximation

• Employ quadratic or cubic (or even higher order) polynomials

• Pre-compute polynomial coefficients in setup phase and store them

• Define one polynomial for each stencil weight at each level and foreach macro element

• Fit polynomial coefficients by interpolation (IPOLY) or least-squaresapproach (LSQP)

Bauer et al. Terra-Neo 9/25

Page 13: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Sampling Points for Determination of PolynomialCoefficients

IPOLY: (quadratic) 10 samplingpoints → interpolation polynomial

LSQP: employ more sampling pointsand solve least-squares problem

Bauer et al. Terra-Neo 10/25

Page 14: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Polynomial Evaluation

1D: a quadratic polynomialpi := p(zi ) = a2z

2i + a1zi + a0

can be evaluated incrementallywith only two flops

pi+1 = pi+δpi , δpi+1 = δpi+δk

where

δp(zi ) := p′(zi )h +p′′(zi )

2h2

δk := 2ah2

Polynomial degree q > 2: Formulacan be generalized to higher poly-nomial degrees (based on Taylorseries expansion)

3D: ansatz transfers directly tohigher dimensions

Bauer et al. Terra-Neo 11/25

Page 15: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Numerical Study and Performance Results

Bauer et al. Terra-Neo 12/25

Page 16: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Model Problem

−∆u = f in Ω, u = 0 on ∂Ω

Exact solution:

u(x , y , z) = g(x , y , z)·h(x , y , z)

with

g(x , y , z) = (r − rmin)(r − rmax)

h(x , y , z) = sin(10x) sin(4y) sin(7z)

Bauer et al. Terra-Neo 13/25

Page 17: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Numerical Experiments (1/2)

• Employ geometric multigrid solver with V(3,3)-cycle andGauss-Seidel (GS) smoother

• Test different numerical strategies:

CONS original HHG (non-projected)

IFEM projected HHG (assemble FE stencils exactly)

IPOLY employ interpolation polynomial(quadratic)

LSQP employ approximation polynomial (quadratic)

LSQP (q=3) employ approximation polynomial (cubic)

DD use IFEM for residual and(Double Discretization) IPOLY/LSQP for the smoother

Bauer et al. Terra-Neo 14/25

Page 18: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Numerical Experiments (2/2)

Discretization error for differentlevels of refinement

• LSQP better than IPOLY, butboth deteriorate after some levels

• cubic more accurate thanquadratic polynomials (but moreexpensive)

• DD (Double Discretization)schemes are exact for all levels

• critical level LC depends onmacro element size H, forsmaller H we get larger LC

• Disc. error is in O(h2) +O(Hq+1)q : polynomial degree

Bauer et al. Terra-Neo 15/25

Page 19: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Numerical Experiments - Take Home Message

Disc. error is in O(h2) +O(Hq+1)q : polynomial degree

Task: Identify (h,H)-parameter regime such that ”O(Hq+1) 6 O(h2)”Large Scale: O(104) MPI-processes → lower bound for number of macroelements → small H → O(Hq+1) is also small

LSQP (q=2) is accurate for at least 6 levels ofrefinement.

This allows a global resolution of the Earth mantleof 1km.

Bauer et al. Terra-Neo 16/25

Page 20: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Node Level OptimizationExecution-Cache-Memory (ECM) model (single core)

Reported cycles from IACA for the 3 loops normalized to 8 stencil updates. (maximum values of each port of a certain category)

loop total arithmetic data type commentLoop 1 22 20 22 throughput vectorizedLoop 2 56 56 36 throughput vectorizedLoop 3 40 4 12 throughput scalar w/ recursion

measurement (data in memory)

mul/fma

ld

add/fma

div (56 cy)

add/fma

ld

loop latency (40 cy)

st

ld

mul/fma

st st

36 cy 92 cy 132 cy

0 20 40 60 80 100 120 140 160cy

L1/L2 L2/L3 L3/mem @ 2.3 GHz

loop 1 (36 cy) loop 2 (56 cy) loop 3 (40 cy)

132 cy

2 x

2 c

y/cl

2 x

2 c

y/cl

1 x

5.7

cy/

cl

2 x

2 c

y/cl

2 x

2 c

y/cl

1 x

5.7

cy/

cl

model 132 cy

155 cy

Bauer et al. Terra-Neo 17/25

Page 21: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Weak Scaling on SuperMUC Phase 2

Minimal CPU time [s] for one V(3,3)-cycle. 16 macro elements areassigned per core with 3.3 · 105 DOFs per macro element.

Bauer et al. Terra-Neo 18/25

Page 22: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Geophysical Application

Bauer et al. Terra-Neo 19/25

Page 23: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Mantle Convection: Physical Model

Conservation of momentum, mass and energy:

divσ + %g = 0 (1a)

∂t%+ div(%u) = 0 (1b)

∂t(%e) + div(%eu) + div q− H − σ : ε = 0 (1c)

with

σ = 2µ

(ε− tr ε

3I

)− pI , ε =

1

2

(∇u + (∇u)T

)and an equation of state % = %(p,T ).

% density u velocity p pressureg gravitational acceleration µ dynamic viscosity σ Cauchy stress tensorε rate of strain tensor T temperature e internal energyq heat flux per unit area H heat production rate

Bauer et al. Terra-Neo 20/25

Page 24: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Mantle Convection: Generalised Stokes System

Re-casting in terms of key quantities velocity, pressure and temperature,we get a generalised Stokes problem for the quasi-stationary flow (1a),(1b)(with an elliptic viscous operator L)

L(u;µ)−∇p = F (T )

div(u) = 0µ = µ(x,T ,u)

coupled to the energy equation (1c)

∂tT + u · ∇T + div(κ∇T )− 1

cp%(H − σ : ε) = 0

only via the buoyancy term F (T ).

cp specific heat capacity κ heat conductivity

Bauer et al. Terra-Neo 21/25

Page 25: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

LSQP for Stokes System

Mixed Finite Element discretisation:[A(µ) BT

B −C

][u

q

]=

[f

g

]

Challenges:

1) Curved geometry → employ LSQP approachI system of PDEs → approximate stencils of each operator block

individually with a quadratic polynomial

2) How do we deal with non-constant µ?

Bauer et al. Terra-Neo 22/25

Page 26: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

LSQP with Variable Coefficients

Remember: stencil weight aI ,Jk,m is computed via:

aI ,Jk,m =∑

t∈N (I )∩N (J)

µtEI ,Jt (2)

• Approximate stencil weight including µ (”classical” LSQP approach)→ only for ”sufficiently smooth” µ

• Approximate local element matrices → replace expensive E I ,Jt with

polynomial approximation (LSQP LocEl)

+ works for general µ– more expensive, but still much faster than computation of E I ,J

t

I , J node indices N (I ) neighbourhood of node I

k,m operator component, 1 6 k,m 6 3 EI,Jt contribution of local element matrix

t local element µt averaged viscosity on element t

Bauer et al. Terra-Neo 23/25

Page 27: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Application - Velocity Streamlines

classic FE assembly LSQP LocEl

Viscosity model [Davies et al. 2012] with jump at da = 660/6371

µ(x,T ) = exp

(4.61

1− ‖x‖2

1− rcmb− 2.99T

)1/10 · 6.3713d3

a for ‖x‖2 > 1− da,

1 else.

Present day temperature field T [Grand et. al 1997, Stixrude and Lithgow-Bertelloni 2005],plate velocities [GPlates] on surface and free-slip BC’s on core-mantle boundary (cmb)

Bauer et al. Terra-Neo 24/25

Page 28: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Application - Velocity Streamlines

classic FE assembly

=⇒speedup

5.9x(excl. coarsegrid solver)

LSQP LocEl

Viscosity model [Davies et al. 2012] with jump at da = 660/6371

µ(x,T ) = exp

(4.61

1− ‖x‖2

1− rcmb− 2.99T

)1/10 · 6.3713d3

a for ‖x‖2 > 1− da,

1 else.

Present day temperature field T [Grand et. al 1997, Stixrude and Lithgow-Bertelloni 2005],plate velocities [GPlates] on surface and free-slip BC’s on core-mantle boundary (cmb)

Bauer et al. Terra-Neo 24/25

Page 29: Terra-Neo A two-scale approach for e cient on-the- y

Motivation Stencils Method Numerical Study Performance Mantle Convection

Thank you for your attention

Bauer et al. Terra-Neo 25/25

Page 30: Terra-Neo A two-scale approach for e cient on-the- y

Bauer et al. Terra-Neo 26/25

Page 31: Terra-Neo A two-scale approach for e cient on-the- y

Original (blue) vs. projected (red) stencil weights

weights (MC, . . . ) of 15pt stencil for nodes in gray plane (previous slide)

MC MSE

BNW BEBauer et al. Terra-Neo 27/25

Page 32: Terra-Neo A two-scale approach for e cient on-the- y

Numerical Experiments

Residual in discrete L2-norm Discretization error in discreteL2-norm

Bauer et al. Terra-Neo 28/25