The Highly Scalable Iterative Solver Library PHIST
Achim Basermann, Melven Röhrig-Zöllner and Jonas Thies
German Aerospace Center (DLR)
Simulation and Software Technology
Linder Höhe, Cologne, Germany
> ASE21 > Achim Basermann • PHIST_ASE21_U_Tokyo_2_12_15.pptx > 02.12.2015 DLR.de • Chart 1
DFG Project ESSEX
DLR German Aerospace Center
• Research Institution
• Space Agency
• Project Management Agency
DLR Locations and Employees
Approx. 8000 employees across 33 institutes and facilities at 16 sites.
Offices in Brussels, Paris, Tokyo and Washington.
Cologne
Oberpfaffenhofen
Braunschweig
Goettingen
Berlin
Bonn
Neustrelitz
Weilheim
Bremen
Trauen
Lampoldshausen
Stuttgart
Stade
Augsburg
Hamburg
Juelich
DLR Institute Simulation and Software Technology: Scientific Themes and Working Groups
Departments:
• Software for Space Systems and Interactive Visualization
• Distributed Systems and Component Software
Working groups:
• Software Engineering
• Distributed Software Systems
• High-Performance Computing
• Embedded Systems
• Modeling and Simulation
• Scientific Visualization
• 3D Interaction
Survey
• Motivation for extreme scale computing
• The DFG project ESSEX
• The ESSEX software infrastructure
• The iterative solver library PHIST
• Iterative solvers from PHIST
• Methods
• Performance
• Conclusions
Hypothetical Exascale System
“Aggressive Strawman” (2007), DARPA (the U.S. Defense Advanced Research Projects Agency)
Characteristics:
• Flops, peak (PF): 997
• Microprocessors: 223,872
• Cores/microprocessor: 742
• Cache (TB): 37.2
• DRAM (PB): 3.58
• Total power (MW): 67.7
• Memory bandwidth / Flops: 0.0025
• Network bandwidth / Flops: 0.0008
170 million cores!
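The headline core count follows from two entries of the table: the number of microprocessors times the cores per microprocessor. A quick check, using only the two values quoted on the chart (the variable names are illustrative), gives roughly 166 million, which the chart rounds to 170 million:

```python
# Values quoted on the chart for the 2007 DARPA "Aggressive Strawman" design.
microprocessors = 223_872
cores_per_microprocessor = 742

# Total core count of the hypothetical exascale system.
total_cores = microprocessors * cores_per_microprocessor
print(f"{total_cores:,} cores (about {total_cores / 1e6:.0f} million)")
```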
Today's Workstations are Hundredfold Parallel
• Example: Intel® Haswell architecture
  • 1-2 CPU sockets
  • with 18 cores each
  • Hyperthreading, 2 threads/core
  • 8 operations performed concurrently (SIMD, FMA)
• GPUs offer parallelism with tens of thousands of asynchronous threads.
Conclusion: Highly scalable software is not only relevant for high-end computing, but has many applications on common hardware available for everyone.
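The "hundredfold" figure can be reproduced from the Haswell numbers on this chart. The sketch below multiplies them out for a two-socket machine. It assumes that SIMD width counts once per core rather than once per hyperthread, since both threads of a core share the same vector units. The variable names are illustrative.

```python
# Figures from the Haswell example on this chart.
sockets = 2                # upper end of "1-2 CPU sockets"
cores_per_socket = 18
threads_per_core = 2       # Hyperthreading
simd_ops_per_core = 8      # operations performed concurrently (SIMD, FMA)

cores = sockets * cores_per_socket
hardware_threads = cores * threads_per_core

# Assumption: SIMD lanes multiply per core, not per hyperthread.
concurrent_ops = cores * simd_ops_per_core

print(cores, hardware_threads, concurrent_ops)
```

Under these assumptions a single workstation already exposes a few hundred concurrent operations, before any GPU is taken into account.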
Accelerator Hardware Makes HPC Mainstream
• High parallelism and flop rates
• Expert know-how necessary for porting (e.g. CUDA knowledge)
• Higher memory bandwidth
• New bottleneck: CPU→device
• Common representatives: Nvidia® GPUs, Intel® Xeon Phi
[Figure: TOP500 development]
Software Challenges
Problems:
• Only a few algorithms are designed for extreme parallelism.
• Application software is, as a rule, incrementally adapted to new technologies.
Extreme parallelism requires:
• Extremely scalable algorithms
• New concepts for
  • fault tolerance
  • programming models
  • frameworks for modelling and simulation
• Focus on suitable software engineering methods for parallel codes
  • New test methods
  • New tools for development and analysis
The DFG Project ESSEX
DFG programme Software for Exascale Computing
Project ESSEX: Equipping Sparse Solvers for the Exascale
Participating universities:
• RRZE Erlangen, Computer Science (Prof. Wellein, Hager)
• Wuppertal, Numerical Analysis (Prof. Lang)
• Greifswald, Physics (Prof. Fehske)
Period: 2013-2015, extended to 2018
International contacts:
• Sandia (Trilinos project)
• Tennessee (Dongarra)
• Japan: Tsukuba, Tokyo
• The Netherlands: Groningen, Utrecht
ESSEX develops open-source software.
ESSEX Motivation
Quantum physics/information applications:
$$i\,\frac{\partial}{\partial t}\,\psi(r,t) = H\,\psi(r,t)$$
Large, sparse eigenvalue problem:
$$H x = \lambda x$$
$$\lambda_1, \lambda_2, \ldots, \lambda_k, \ldots, \lambda_{n-1}, \lambda_n$$
Sparse eigenvalue solvers of broad applicability:
• “Few” (1,…,100s) eigenpairs
• “Bulk” (100s,…,1000s) eigenpairs
• Good approximation to full spectrum (e.g. Density of States)
• and beyond…
Enabling Extreme Parallelism through Software Codesign
Applications
Computational Algorithms
Building Blocks
Fault Tolerance
Scalability
Numerical Reliability
Programming Models for Heterogeneous HPC Systems
• Flat MPI + off-loading
• Runtime (e.g. MAGMA, OmpSs)
  • Dynamic scheduling of small tasks → good load balancing
• Kokkos (Trilinos)
  • High level of abstraction (C++11)
• MPI+X strategy in ESSEX
  • X: OpenMP, CUDA, SIMD intrinsics, e.g. AVX
  • Tasking for bigger asynchronous functions → functional parallelism
  • Experts implement the kernels required.
The ESSEX Software Infrastructure
The ESSEX Software Infrastructure: Test-Driven Algorithm Development
Optimized ESSEX Kernel Library
GHOST: General, Hybrid, and Optimized Sparse Toolkit
• MPI + OpenMP + SIMD + CUDA
• Sparse matrix-(block-)vector multiplication
• Dense block-vector operations
• Task-queue for functional parallelism
• Asynchronous checkpoint-restart
Status: beta version, suitable for experienced HPC C programmers
http://bitbucket.org/essex/ghost
BSD License
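To make the central kernel concrete, here is a minimal sketch of a sparse matrix-(block-)vector multiplication. It is not the kernel library's API or implementation: the function name `crs_spmmv`, its arguments, and the example matrix are assumptions chosen for illustration, and plain Python stands in for the optimized C kernels. The matrix is stored in CRS format and the block vector in row-major order, so each matrix entry is loaded once and reused for every column of the block.

```python
def crs_spmmv(row_ptr, col_idx, values, X, n_vec):
    """Y = A * X for a CRS matrix A and a row-major block vector X.

    X and Y are flat lists: entry (i, j) lives at index i * n_vec + j,
    so the n_vec values belonging to one row are contiguous in memory.
    """
    n_rows = len(row_ptr) - 1
    Y = [0.0] * (n_rows * n_vec)
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a, c = values[k], col_idx[k]
            # One matrix entry is loaded once and reused for all n_vec columns.
            for j in range(n_vec):
                Y[i * n_vec + j] += a * X[c * n_vec + j]
    return Y


# 3x3 example matrix [[2, -1, 0], [-1, 2, -1], [0, -1, 2]] in CRS form.
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]

# Block vector with two columns, (1, 1, 1) and (1, 2, 3), stored row-major.
X = [1.0, 1.0,
     1.0, 2.0,
     1.0, 3.0]

Y = crs_spmmv(row_ptr, col_idx, values, X, n_vec=2)
print(Y)
```

With a single vector (`n_vec=1`) the same routine reduces to an ordinary sparse matrix-vector product. The benefit of blocking is that the matrix data, which dominates memory traffic, is read only once for several vectors.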
The Iterative Solver Library PHIST
PHIST: Pipelined Hybrid parallel Iterative Solver Toolkit
• Iterative solvers for sparse matrices
  • Eigenproblems: Jacobi-Davidson, FEAST
  • Systems of linear equations: GMRES, MINRES, CARP-CG
• Provides some abstraction from data layout, process management, tasking etc.
• Adapts algorithms to use block operations
• Implements asynchronous and fault-tolerant solvers
• Simple functional interface (C, Fortran, Python)
• Systematically tests kernel libraries for correctness and performance
• Various possibilities for integration into applications
Status: beta version with extensive test framework
http://bitbucket.org/essex/phist
BSD License
Integration of PHIST into Applications
Selection of kernel library
[Chart axes: required flexibility (low, medium, high) versus hardware awareness (low to high)]
• PHIST builtin: CPU only; F'03 + OpenMP; CRS format
• Various architectures; large C++ code base
• No easy access to matrix elements
• Own data structures: adapter of approx. 1000 lines of code
Interoperability of PHIST and Trilinos
ESSEX project:
• Iterative solvers: PHIST
• Basic operations: PHIST builtin
Trilinos project:
• Iterative solvers: Anasazi (eigenproblems), Belos (lin. eq. syst.)
• Basic operations: Epetra, Tpetra
The two sides are connected through a C wrapper (“can use” relation).
Iterative Solvers from PHIST: Jacobi-Davidson QR method (Fokkema, 1998)
Iterative Solvers from PHIST: Block JDQR method
Iterative Solvers from PHIST: Complete Block JDQR method
Block JDQR method: Required Linear Algebra Operations
Block JDQR method: Block Vector Operations
Block JDQR method: Result of Correction Kernel Optimization
10-core Intel Ivy Bridge CPU; CRS format; matrix: $10^7$ rows; $1.5 \cdot 10^8$ nonzeros; 120 correction operations
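Taking the matrix dimensions as $10^7$ rows and $1.5 \cdot 10^8$ nonzeros, a back-of-the-envelope estimate shows why this benchmark is dominated by memory traffic. The storage sizes below are assumptions that the chart does not state: 8-byte double-precision values, and 4-byte column indices and row pointers.

```python
rows = 10**7
nonzeros = 15 * 10**7   # 1.5e8 nonzeros

avg_nnz_per_row = nonzeros / rows

# Assumed CRS storage: 8-byte values, 4-byte column indices,
# plus one 4-byte row pointer per row (and one extra at the end).
crs_bytes = nonzeros * (8 + 4) + (rows + 1) * 4

# One double-precision vector of matching length.
vector_bytes = rows * 8

print(f"average nonzeros per row: {avg_nnz_per_row:.0f}")
print(f"CRS matrix size: {crs_bytes / 1e9:.2f} GB")
print(f"one vector: {vector_bytes / 1e6:.0f} MB")
```

Under these assumptions the matrix occupies roughly 1.8 GB, far more than fits in cache, while a single vector needs only 80 MB. Each pass over the matrix is therefore expensive relative to the vector work, which is the motivation for applying it to several vectors at once.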
Block JDQR method: Overall Speedup through Blocking
Node: 2x10-core Intel Ivy Bridge CPU; SELL-C-σ format; blocked preconditioning; residual reduction: $10^{-8}$
Iterative Solvers from PHIST: FEAST eigensolver (Polizzi '09)
Iterative Solvers from PHIST: Linear Systems for FEAST/graphene
Iterative Solvers from PHIST: The CGMN Linear Solver
Iterative Solvers from PHIST: Parallelization Strategies for CGMN
Iterative Solvers from PHIST: Scaling of CARP-CG Intel Ivy Bridge
MC-CARP-CG: Cache Coherence Kills Performance on Socket Level
Conclusions
• PHIST provides a pragmatic, flexible and hardware-aware programming model for heterogeneous systems.
  • Includes highly scalable sparse iterative solvers for eigenproblems and systems of linear equations
  • Well suited for iterative solver development and solver integration into applications
• Block operations distinctly increase the performance of JDQR.
  • Slight increase of operations
  • Impact of memory layout: row- rather than column-major for block vectors
  • Higher node-level performance
  • Inter-node advantage: message aggregation
• CGMN with CARP and multi-coloring parallelization is suitable for the robust iterative solution of nearly singular equations.
  • Appropriate iterative solver for FEAST in order to find interior eigenpairs,
  • in particular for problems from graphene design
• Future: AMG preconditioning for blocked JDQR & FEAST (Kengo Nakajima, University of Tokyo); exploitation of the non-linear Sakurai-Sugiura method (Tetsuya Sakurai, University of Tsukuba)
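The row-projection idea behind CGMN and CARP can be illustrated with a plain Kaczmarz sweep: each equation in turn projects the current iterate onto its hyperplane. This is only the basic building block, sketched here with dense rows and a hypothetical function name `kaczmarz_sweep`. CGMN additionally accelerates a symmetric (forward plus backward) sweep with conjugate gradients, and CARP distributes the sweeps across processes and averages shared components; neither step is shown.

```python
def kaczmarz_sweep(A, b, x, omega=1.0):
    """One forward sweep of Kaczmarz row projections for A x = b (in place).

    A is a list of dense rows, omega is the relaxation parameter.
    """
    for row, b_i in zip(A, b):
        norm_sq = sum(a * a for a in row)
        if norm_sq == 0.0:
            continue  # skip empty rows
        residual = b_i - sum(a * x_j for a, x_j in zip(row, x))
        scale = omega * residual / norm_sq
        # Move x along the row direction towards the hyperplane of this equation.
        for j, a in enumerate(row):
            x[j] += scale * a
    return x


# Small consistent test system with exact solution (1/11, 7/11).
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]
x = [0.0, 0.0]
for _ in range(200):
    kaczmarz_sweep(A, b, x)
print(x)
```

Because every update only needs one row of the matrix, the method does not require a well-conditioned or diagonally dominant system, which is one reason row projections are attractive for nearly singular equations.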
References
• Röhrig-Zöllner, Thies, Basermann et al.: Increasing the performance of Jacobi-Davidson by blocking; SISC (in print)
• Thies, Basermann, Lang et al.: On the parallel iterative solution of linear systems arising in the FEAST algorithm for computing inner eigenvalues; Parallel Computing 49 (2015) 153–163
Thanks to all partners from the ESSEX project and to DFG for the support through the Priority Programme 1648 “Software for Exascale Computing”.
Computer Science, Univ. Erlangen
Applied Computer Science, Univ. Wuppertal
Institute for Physics, Univ. Greifswald
Erlangen Regional Computing Center
Many thanks for your attention!
Questions?
Dr.-Ing. Achim Basermann
German Aerospace Center (DLR)
Simulation and Software Technology
Department Distributed Systems and Component Software
Team High Performance Computing
http://www.DLR.de/sc