Ultra-Scalable Multiphysics Simulations for Solidification Processes in Metals (SKAMPY)
HPC-Status-Konferenz der Gauß-Allianz, 29 November 2016, Hamburg
Harald Köstler, Bauer, Schornbaum, Godenschwager, Rüde, Hammer, Wellein, Hötzer, Nestler
Chair for System Simulation, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
SKAMPY Project
Harald Köstler - Chair for System Simulation, FAU Erlangen-Nürnberg, 2016
Outline
• waLBerla Framework
• SKAMPY Project
The waLBerla Framework
waLBerla Framework
• widely applicable Lattice-Boltzmann from Erlangen
• HPC software framework, originally developed for CFD simulations with the Lattice Boltzmann Method (LBM)
• evolved into a general framework for algorithms on block-structured grids
• www.walberla.net
Vocal Fold Study (Florian Schornbaum)
Fluid Structure Interaction (Simon Bogner)
Free Surface Flow
Block-structured Grids
• Complex geometry given by surface
• Add regular block partitioning
• Discard empty blocks
• Allocate block data
• Load balancing
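The setup steps listed above can be sketched in a few lines of Python. This is an illustrative toy only, not waLBerla's API: the function names (`partition`, `is_empty`, `balance`), the 2D domain, the disc-shaped geometry, and the round-robin balancing strategy are all assumptions made for the example.

```python
from itertools import product

def partition(domain_cells, block_cells):
    """Split a square 2D domain into square blocks of block_cells x block_cells.

    Returns the lower-left cell index (x0, y0) of every block.
    """
    steps = range(0, domain_cells, block_cells)
    return list(product(steps, steps))

def is_empty(block, block_cells, inside):
    """A block is empty if none of its cells lies inside the geometry."""
    x0, y0 = block
    return not any(
        inside(x, y)
        for x in range(x0, x0 + block_cells)
        for y in range(y0, y0 + block_cells)
    )

def balance(blocks, num_processes):
    """Round-robin assignment (hypothetical strategy): all blocks have the
    same size, so equal block counts mean equal work."""
    assignment = {rank: [] for rank in range(num_processes)}
    for i, block in enumerate(blocks):
        assignment[i % num_processes].append(block)
    return assignment

# Hypothetical geometry: a disc of radius 12 centered in a 32 x 32 domain.
def inside_disc(x, y):
    return (x + 0.5 - 16) ** 2 + (y + 0.5 - 16) ** 2 <= 12 ** 2

blocks = partition(domain_cells=32, block_cells=8)             # regular block partitioning
kept = [b for b in blocks if not is_empty(b, 8, inside_disc)]  # discard empty blocks
data = {b: [[0.0] * 8 for _ in range(8)] for b in kept}        # allocate block data
assignment = balance(kept, num_processes=4)                    # load balancing
```

Memory is allocated only for the blocks that survive the emptiness test, which is what makes the approach attractive for sparse, complex geometries.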
Block-structured Grids
• Domain Decomposition & Distribution to Processes:
• regular decomposition into blocks containing uniform grids
• grid refinement: octree-like decomposition
In most cases, if a regular decomposition of a uniform grid is used, exactly one block is assigned to each process.
Forest of octrees: each block contains a uniform grid of the same size → 2:1 balance between neighboring cells on level transitions
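To illustrate the 2:1 balance constraint, here is a minimal one-dimensional sketch. In 1D the octree degenerates to a binary tree, and each leaf block is represented only by its refinement level. The function name `enforce_2to1` and this representation are assumptions for illustration, not waLBerla data structures.

```python
def enforce_2to1(levels):
    """Refine leaf blocks until neighboring leaves differ by at most one level.

    `levels` lists the refinement levels of the leaf blocks from left to
    right. Splitting a leaf of level l replaces it by two leaves of level l + 1.
    """
    levels = list(levels)
    changed = True
    while changed:
        changed = False
        for i in range(len(levels) - 1):
            left, right = levels[i], levels[i + 1]
            if abs(left - right) > 1:
                # Split the coarser of the two neighbors.
                j = i if left < right else i + 1
                levels[j:j + 1] = [levels[j] + 1, levels[j] + 1]
                changed = True
                break  # indices shifted, so rescan from the start
    return levels

balanced = enforce_2to1([1, 3, 3, 2])
```

Because every block holds a uniform grid of the same size, a level difference of one between neighboring blocks corresponds to a 2:1 ratio in cell size across the transition.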
Hybrid Parallelization
• Distributed Memory Parallelization: MPI
• data exchange on borders between blocks via ghost layers
• support for overlapping communication and computation
• some advanced models require more complex communication patterns (e.g. free-surface and fluid-structure interaction)
[Figure: ghost-layer exchange between sender process and receiver process]
(slightly more complicated for non-uniform domain decompositions, but the same general ideas still apply)
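A ghost-layer exchange between two neighboring blocks can be sketched as follows. This is a single-process toy in plain Python: the helper names (`make_block`, `exchange`, `smooth`), the 1D layout, and the ghost width of one cell are assumptions. In an MPI code, the copy inside `exchange` would be a send on one process and a matching receive on the other.

```python
GHOST = 1  # width of the ghost layer in cells (assumption for this sketch)

def make_block(values):
    """Store interior values padded with one ghost cell on each side."""
    return [0.0] * GHOST + list(values) + [0.0] * GHOST

def exchange(left, right):
    """Copy the border cells of each block into the neighbor's ghost layer."""
    left[-1] = right[GHOST]       # right ghost of left block <- first interior cell of right block
    right[0] = left[-1 - GHOST]   # left ghost of right block <- last interior cell of left block

def smooth(block):
    """Three-point average over the interior cells; ghost cells are only read."""
    new = block[:]
    for i in range(GHOST, len(block) - GHOST):
        new[i] = (block[i - 1] + block[i] + block[i + 1]) / 3.0
    return new

left = make_block([1.0, 2.0, 3.0])
right = make_block([4.0, 5.0, 6.0])
exchange(left, right)
left, right = smooth(left), smooth(right)
```

After the exchange, each block can apply its stencil without knowing anything about the neighbor except the contents of its own ghost cells.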
SKAMPY Project: Application
Overview
Johannes Hötzer - Institute of Applied Materials – Computational Material Science, KIT, 2016
Application Setting
• ternary eutectic alloys
• directional solidification
• analytically moving temperature gradient
• massively parallel phase-field simulations
• large domain sizes
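The slides do not give the formula for the analytically moving temperature gradient. One common choice in directional solidification models is a linear profile that travels with a fixed velocity, sketched below. The linear form and the parameter names `T0`, `G`, and `v` are assumptions made for illustration.

```python
def temperature(z, t, T0, G, v):
    """Temperature at height z and time t for a gradient G moving with velocity v.

    Assumed linear profile: T(z, t) = T0 + G * (z - v * t).
    """
    return T0 + G * (z - v * t)

# Hypothetical, unitless example values.
at_front = temperature(z=2.0, t=1.0, T0=100.0, G=0.5, v=2.0)
ahead = temperature(z=3.0, t=1.0, T0=100.0, G=0.5, v=2.0)
```

Under this assumption the isotherm `T = T0` sits at `z = v * t`, so the temperature field never has to be solved for; it is simply evaluated in every cell.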
Phase-field model
Microstructure prediction Al-Ag-Cu
Pattern features in Al-Ag-Cu
Spiral growth in ternary systems
SKAMPY Project: Performance Engineering
Work packages
Single Node Tuning
80× faster than the original version
Intranode Scaling
intranode weak scaling on SuperMUC
Single Node Optimization Summary
Single Node Optimizations
• replace/remove expensive operations like square roots and divisions
• pre-compute and buffer values where possible
• SIMD intrinsics
Percent Peak on SuperMUC
• 𝜙-Sweep: 21 %
• μ-Sweep: 27 %
• Complete Program: 25 %
Why not 100% Peak?
• unbalanced number of multiplications and additions
• divisions are counted as 1 FLOP but cost 43 times as much as a multiplication or addition
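The two reasons given for staying below peak can be made concrete with a toy cost model. Only the division cost of 43 comes from the slide. The rest is an assumption for illustration: the core issues one addition and one multiplication per cycle, so peak is 2 FLOP per cycle, and the function name `fraction_of_peak` is hypothetical. SIMD width, memory traffic, and latencies are ignored.

```python
def fraction_of_peak(adds, muls, divs, div_cost=43):
    """Upper bound on the achievable fraction of peak FLOP/s in a toy model.

    Assumed model: one addition and one multiplication can be issued per
    cycle (peak = 2 FLOP per cycle); a division counts as 1 FLOP but
    occupies `div_cost` cycles.
    """
    flops = adds + muls + divs
    cycles = max(adds, muls) + div_cost * divs
    return flops / (2 * cycles)

balanced = fraction_of_peak(adds=100, muls=100, divs=0)
unbalanced = fraction_of_peak(adds=100, muls=60, divs=0)
with_divisions = fraction_of_peak(adds=100, muls=60, divs=2)
```

Even in this simplified model, a handful of divisions per cell update lowers the bound noticeably, which is consistent with the optimization of replacing or removing divisions listed above.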
Scaling
• scaling on SuperMUC up to 32,768 cores
• ghost layer based communication
• communication hiding
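Communication hiding can be sketched as follows: start the ghost-layer exchange, update the inner cells that do not depend on it, then finish the border cells once the data has arrived. This is a single-process toy in which a background thread stands in for non-blocking MPI communication. The function names and the three-point averaging stencil are assumptions, not the project's kernels.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_ghosts(left_neighbor, right_neighbor):
    """Stand-in for an MPI exchange: returns the two ghost values."""
    return left_neighbor[-1], right_neighbor[0]

def average(left, center, right):
    return (left + center + right) / 3.0

def step_with_hiding(u, left_neighbor, right_neighbor):
    """One three-point averaging step on the local block `u` (len(u) >= 2)."""
    new = u[:]
    with ThreadPoolExecutor(max_workers=1) as pool:
        # 1. Start the ghost-layer exchange in the background.
        pending = pool.submit(fetch_ghosts, left_neighbor, right_neighbor)
        # 2. Meanwhile update the inner cells, which need no ghost data.
        for i in range(1, len(u) - 1):
            new[i] = average(u[i - 1], u[i], u[i + 1])
        # 3. Wait for the exchange to complete.
        ghost_left, ghost_right = pending.result()
    # 4. Update the two border cells that depend on the ghost values.
    new[0] = average(ghost_left, u[0], u[1])
    new[-1] = average(u[-2], u[-1], ghost_right)
    return new

result = step_with_hiding([4.0, 5.0, 6.0, 7.0], [1.0, 2.0, 3.0], [8.0, 9.0, 10.0])
```

The result is identical to exchanging first and updating afterwards; only the ordering changes, so communication time can overlap with the inner-cell update.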
Execution-Cache-Memory Model
Julian Hammer, Georg Hager, Gerhard Wellein – RRZE HPC group, FAU Erlangen-Nürnberg, 2016
• Automatic Layer Conditions Model
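The slide names layer conditions without detail. As a rough illustration of the idea, a layer condition asks whether enough consecutive grid layers fit into a cache level for a stencil to reuse them. The sketch below checks this for a 3D stencil. The function name, the default of 8 bytes per cell, and the safety factor of one half (a commonly used rule of thumb) are assumptions, not values from the talk.

```python
def layer_condition_holds(nx, ny, cache_bytes, radius=1,
                          bytes_per_cell=8, safety=0.5):
    """Check whether 2 * radius + 1 layers of nx * ny cells fit into the cache.

    Only a fraction `safety` of the cache is assumed to be usable for
    the layers (assumption for this sketch).
    """
    layers = 2 * radius + 1
    needed = layers * nx * ny * bytes_per_cell
    return needed < safety * cache_bytes

fits_small_cache = layer_condition_holds(100, 100, cache_bytes=256 * 1024)
fits_large_cache = layer_condition_holds(100, 100, cache_bytes=20 * 1024 * 1024)
```

When the condition fails for a given cache level, neighboring layers have to be reloaded from the next level, which raises the data traffic that a performance model has to account for.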
SKAMPY Project: Software Engineering and Parallelization
• Written in C++ with Python extensions
• Hybrid parallelization (MPI + OpenMP)
• No data structures growing with number of processes involved
• Scales from laptop to recent petascale machines
• Parallel I/O
• Portable (Compiler/OS)
• Automated tests / CI servers
• Open Source release planned
Continuous Integration
llvm/clang
Outlook: Selected Work packages I
SA2: Steering & Prototyping Interface
• based on Python
SA3: Adaptivity and Dynamic Load Balancing
Outlook: Selected Work packages II
MC1: Many-Core Data Structures
MC2: Communication Optimization and Multi-scale Data Reduction
• Communication strategies for heterogeneous systems
• Problem-specific data compression
MC3: Asynchronous Execution and Resilience
Acknowledgements
Funded by:
• Bundesministerium für Bildung und Forschung (German Federal Ministry of Education and Research)
• KONWIHR (Bavarian project)
• DFG SPP 1648/1 – Software for Exascale computing
• Industry
• Supercomputing centers
http://www.exastencils.org/
Thank you!
Questions?