A Massively-Parallel Framework for Finite-Volume Simulation of Global Atmospheric Dynamics
Willem Deconinck¹, Mats Hamrud¹, Piotr K. Smolarkiewicz¹, Joanna Szmelter², Zhao Zhang²
¹European Centre for Medium-Range Weather Forecasts, Reading, UK; ²Wolfson School, Loughborough University, UK
2014 PDEs Workshop
The T3999 (∼5 km global resolution) spectral transform model IFS, operational in ∼2024, scales well into the petascale.
[Figure: forecast days per day versus number of cores (up to ∼250,000).]
Beyond exascale (∼2030, at T7999): gridpoint method with local communication
MPDATA: horizontally unstructured, vertically structured, lon-lat domain, 2nd order in time and space. Evolutionary introduction into IFS.
Slide 1/2 PDEs 2014 — Deconinck et al.
Developments and Results
Development of a flexible dynamic data structure:
• Object-oriented C++ design with Fortran interface
• MPI / halo exchanges (a minimal sketch follows below)
• Interpolation, mesh generation, product delivery
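The halo exchange between neighbouring subdomains can be illustrated with the following minimal sketch; it is not the actual framework code, and the names (HaloPattern, halo_exchange, send_points, recv_points) are hypothetical. Each MPI task packs the owned values its neighbours need, exchanges them with non-blocking sends and receives, and unpacks the received values into its halo points.

```cpp
// Minimal halo-exchange sketch between subdomains (hypothetical names,
// not the actual framework code).
#include <mpi.h>
#include <vector>

struct HaloPattern {
  std::vector<int>              neighbours;   // ranks of adjacent subdomains
  std::vector<std::vector<int>> send_points;  // local indices to send, per neighbour
  std::vector<std::vector<int>> recv_points;  // halo indices to fill, per neighbour
};

void halo_exchange(const HaloPattern& halo, std::vector<double>& field, MPI_Comm comm)
{
  const std::size_t nb = halo.neighbours.size();
  std::vector<std::vector<double>> sendbuf(nb), recvbuf(nb);
  std::vector<MPI_Request> requests;

  for (std::size_t i = 0; i < nb; ++i) {
    // Pack the owned values this neighbour needs.
    for (int p : halo.send_points[i]) sendbuf[i].push_back(field[p]);
    recvbuf[i].resize(halo.recv_points[i].size());

    MPI_Request rr, rs;
    MPI_Irecv(recvbuf[i].data(), (int)recvbuf[i].size(), MPI_DOUBLE,
              halo.neighbours[i], 0, comm, &rr);
    MPI_Isend(sendbuf[i].data(), (int)sendbuf[i].size(), MPI_DOUBLE,
              halo.neighbours[i], 0, comm, &rs);
    requests.push_back(rr);
    requests.push_back(rs);
  }
  MPI_Waitall((int)requests.size(), requests.data(), MPI_STATUSES_IGNORE);

  // Unpack the received values into the halo points.
  for (std::size_t i = 0; i < nb; ++i)
    for (std::size_t k = 0; k < halo.recv_points[i].size(); ++k)
      field[halo.recv_points[i][k]] = recvbuf[i][k];
}
```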
[Figure: orographic zonal flow, meridional velocity.]
[Figures: speedup (×1000) versus number of cores (×1000) for linear scaling, MPDATA, and IFS; parallel efficiency (%) versus number of cores (10²–10⁵) for MPDATA and IFS.]
T2047 (∼10 km) with 137 isentropic levels – scaling to 40,000 cores
Slide 2/2 PDEs 2014 — Deconinck et al.
A Massively-Parallel Framework for Finite-Volume Simulation of Global Atmospheric Dynamics
Willem Deconinck¹, Mats Hamrud¹, Piotr K. Smolarkiewicz¹, Joanna Szmelter², Zhao Zhang²
¹European Centre for Medium-Range Weather Forecasts, Reading, UK; ²Wolfson School, Loughborough University, UK
willem.deconinck@ecmwf.int
Scalability of Current IFS Dynamical Core
5 km horizontal resolution with 137 levels.
[Figure: forecast days per day versus number of cores (up to ∼250,000) for the current IFS dynamical core.]
• We will reach exascale in ∼2030
• The Spectral Transform Method does not scale to exascale because of global communications
• The semi-Lagrangian time-stepping implementation is non-conservative
Alternative Dynamical Core: Unstructured Edge-based Finite Volume MPDATA
• Non-oscillatory forward-in-time scheme, capable of accommodating a wide range of scales and conservation problems
• Unstructured prismatic meshes allow irregular spatial resolution and enhancement of polar regions
• Formulation for time-dependent non-orthogonal curvilinear coordinates on the manifold:
$$\frac{\partial G\psi}{\partial t} + \nabla\cdot(G\mathbf{v}^*\psi) = GR$$
See Szmelter and Smolarkiewicz (2010, JCP) for further discussion; a generic discretisation sketch follows.
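As a rough illustration of the edge-based finite-volume approach (a generic sketch introduced here, not the specific MPDATA algorithm), the flux divergence around node $i$ of the median-dual mesh can be approximated as

$$\nabla\cdot(G\mathbf{v}^*\psi)\Big|_i \;\approx\; \frac{1}{V_i}\sum_{j\in\mathcal{E}(i)} \mathbf{S}_j\cdot(G\mathbf{v}^*\psi)_j,$$

where $V_i$ denotes the dual-cell volume, $\mathcal{E}(i)$ the set of edges meeting node $i$, and $\mathbf{S}_j$ the edge-normal area vector (notation introduced here for illustration). A first-order forward-in-time update then reads

$$(G\psi)_i^{n+1} \;=\; (G\psi)_i^{n} \;-\; \frac{\Delta t}{V_i}\sum_{j\in\mathcal{E}(i)} \mathbf{S}_j\cdot(G\mathbf{v}^*\psi)_j^{\,n} \;+\; \Delta t\,(GR)_i^{\,n},$$

with the edge values evaluated by upwinding in the first pass; MPDATA then applies iterative antidiffusive corrections to reach second-order accuracy while remaining non-oscillatory (see the cited reference).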
Massively Parallel Implementation
• Multiple levels of parallelism
[Diagram: the data are partitioned across MPI tasks (Task 1–4); each task is further parallelised with OpenMP threads (Thread 1–3).]
• Optimal equal-area domain decomposition
• Local computations in every subdomain
• A small halo needs to be exchanged with surrounding subdomains for distributed-memory algorithms
• Shared-memory parallelisation avoids further subdivision of subdomains
• Structured treatment of the vertical direction discounts the cost of horizontal indirect addressing (see the sketch below)
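As a rough illustration of the last two points, the following sketch (hypothetical names such as nb_nodes, nb_levels, edge_begin, edge_node; not the actual framework code) shows OpenMP threads looping over horizontal nodes with indirect neighbour addressing, while the innermost loop runs contiguously over the structured vertical levels, so the cost of the indirection is amortised over all levels. An MPI halo exchange, as sketched earlier, would precede this computation so that neighbour values from other subdomains are available.

```cpp
// Hybrid shared-memory sketch: OpenMP over horizontal nodes, structured
// vertical loop innermost. Computes a generic edge-weighted stencil
// (not the actual MPDATA flux computation).
#include <vector>

void update_field(int nb_nodes, int nb_levels,
                  const std::vector<int>&    edge_begin,   // CSR offsets per node
                  const std::vector<int>&    edge_node,    // neighbour node indices
                  const std::vector<double>& edge_weight,  // geometric edge weights
                  const std::vector<double>& psi,          // field, layout [node][level]
                  std::vector<double>&       flux_div)     // result, layout [node][level]
{
  #pragma omp parallel for
  for (int n = 0; n < nb_nodes; ++n) {
    for (int l = 0; l < nb_levels; ++l)
      flux_div[n*nb_levels + l] = 0.0;
    for (int e = edge_begin[n]; e < edge_begin[n+1]; ++e) {
      const int    m = edge_node[e];          // indirect horizontal access
      const double w = edge_weight[e];
      for (int l = 0; l < nb_levels; ++l)     // structured, contiguous vertical loop
        flux_div[n*nb_levels + l] += w * (psi[m*nb_levels + l] - psi[n*nb_levels + l]);
    }
  }
}
```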
Evolutionary Introduction into IFS
• Construction of the unstructured mesh using the same data points as used by IFS's Spectral Transform Method
• Integration with ECMWF's infrastructure for archiving, post-processing, visualisation
Flexible Dynamic Framework
• Complex requirements for unstructured meshes
• Handling of distributed-memory parallelisation
• Mesh-specific routines: construction of the dual mesh, periodicity, reading/writing fields, interpolation
• Multiple meshes to handle multigrid implementations
• Object-oriented design using C++:
  • Hierarchical nesting of topological objects
  • Meshes, field sets, fields
  • Multiple halo-exchange patterns
• Fortran interface allows direct access to internal data (a minimal sketch of such a hierarchy follows)
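As an illustration of the hierarchical object design and the Fortran access to internal data, here is a minimal sketch with hypothetical class and function names (Field, FieldSet, Mesh, field__data); it is not the framework's actual API. A Fortran interface could declare the extern "C" accessors via iso_c_binding and map the returned pointer to a Fortran array with c_f_pointer.

```cpp
// Minimal sketch of hierarchically nested mesh/field objects with a
// C-bound accessor usable from a Fortran interface (hypothetical names).
#include <map>
#include <memory>
#include <string>
#include <vector>

class Field {
public:
  Field(std::string name, std::size_t size) : name_(std::move(name)), data_(size, 0.0) {}
  const std::string& name() const { return name_; }
  double*            data()       { return data_.data(); }
  std::size_t        size() const { return data_.size(); }
private:
  std::string         name_;
  std::vector<double> data_;
};

class FieldSet {
public:
  Field& create_field(const std::string& name, std::size_t size) {
    fields_[name] = std::make_unique<Field>(name, size);
    return *fields_[name];
  }
  Field& field(const std::string& name) { return *fields_.at(name); }
private:
  std::map<std::string, std::unique_ptr<Field>> fields_;
};

class Mesh {
public:
  FieldSet& nodes() { return nodes_; }   // fields located at mesh nodes
  FieldSet& edges() { return edges_; }   // fields located at mesh edges
private:
  FieldSet nodes_, edges_;
};

// C-bound accessors so Fortran can obtain the raw data pointer and size.
extern "C" {
  double* field__data(Field* f) { return f->data(); }
  int     field__size(Field* f) { return static_cast<int>(f->size()); }
}
```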
Shallow Water Equations on the Sphere
$$\frac{\partial GD}{\partial t} + \nabla\cdot(G\mathbf{v}^*D) = 0$$

$$\frac{\partial GQ_x}{\partial t} + \nabla\cdot(G\mathbf{v}^*Q_x) = G\left(-\frac{g}{h_x}D\frac{\partial H}{\partial x} + fQ_y - \frac{1}{GD}\frac{\partial h_x}{\partial y}Q_xQ_y\right)$$

$$\frac{\partial GQ_y}{\partial t} + \nabla\cdot(G\mathbf{v}^*Q_y) = G\left(-\frac{g}{h_y}D\frac{\partial H}{\partial y} - fQ_x + \frac{1}{GD}\frac{\partial h_x}{\partial y}Q_xQ_x\right)$$
Meridional wind component for flow over a 2 km mountain at mid-latitudes; result obtained using a Reduced Gaussian mesh with 16 km resolution.
3D Hydrostatic Equations in Isentropic Coordinates
Froude number = 2, zonal wind U = 10 m/s, Brunt–Väisälä frequency = 0.04 s⁻¹.
[Figures: isentrope height perturbation at He = λz/8; isentropes in a vertical plane at the equator (contour values 220–320).]
Result obtained using a Reduced Gaussian mesh with 1 km horizontal resolution and 40 m vertical resolution on a small planet with radius 64 km.
Parallel Scaling Results
[Figures: speedup (×1000) versus number of cores (×1000) for linear scaling, MPDATA, and IFS; parallel efficiency (%) versus number of cores (10²–10⁵) for MPDATA and IFS.]
Scaling results obtained with a 10 km Reduced Gaussian mesh and 137 levels.
Acknowledgements