Performance of the hybrid MPI/OpenMP version of the HERACLES code on the Curie « Fat nodes » system
Edouard Audit, Matthias Gonzalez, Pierre Kestener and Pierre-François Lavallé
SIAM meeting, Savannah, February 2012
The HERACLES code
(Magneto)hydrodynamics: finite volume, 2nd-order Godunov, explicit or implicit
Multigroup radiative transfer: moment method, implicit
Gravity: fully coupled to hydro or split
Thermochemistry and/or heating/cooling function (local)
Turbulent forcing (local)
Fixed-grid finite-volume code working in 1D, 2D and 3D in Cartesian, cylindrical and spherical coordinates. Fortran + MPI, domain decomposition.
Used in astrophysics (star formation, interstellar medium studies, …) and to interpret laser-generated plasma experiments.
Domain Decomposition
[Figure: the domain divided into four subdomains, each assigned to one MPI process]
Domain Decomposition
[Figure: subdomain edges labelled either "Physical boundaries" or "Communications"]
The HERACLES code
Read simulation parameters
Split domain over the MPI processes
Initial conditions
Loop over time
  Fill the ghost cells: boundary conditions or communications
  Compute time step
  Hydro step
    Loop over chunks
      Loop over cells (slope, Riemann solver, …)
  Compute cooling (local)
  Stirring (local)
  Output
End
[Slide annotations: four steps are marked "OpenMP"; one is marked "Not multi-threaded".]
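The structure of the time loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not HERACLES itself: the 2nd-order Godunov solver is replaced by a first-order upwind advection update, the MPI communications are replaced by plain list copies between subdomains on a periodic 1D domain, and all names (`fill_ghost_cells`, `hydro_step`, `run`, `total_mass`, `chunk_size`) are hypothetical. The loop over chunks is the kind of loop that would typically be handed to OpenMP threads.

```python
def fill_ghost_cells(subdomains):
    """Copy edge cells into the neighbours' ghost cells (periodic domain).

    Stands in for the MPI communications. Each subdomain is laid out as
    [ghost, real cells..., ghost].
    """
    n = len(subdomains)
    for r, u in enumerate(subdomains):
        left = subdomains[(r - 1) % n]
        right = subdomains[(r + 1) % n]
        u[0] = left[-2]    # last real cell of the left neighbour
        u[-1] = right[1]   # first real cell of the right neighbour


def hydro_step(u, cfl, chunk_size):
    """First-order upwind update of the real cells, chunk by chunk."""
    old = list(u)                                   # values at the old time level
    n_real = len(u) - 2
    for start in range(1, n_real + 1, chunk_size):  # loop over chunks
        stop = min(start + chunk_size, n_real + 1)
        for i in range(start, stop):                # loop over cells
            u[i] = old[i] - cfl * (old[i] - old[i - 1])


def run(subdomains, n_steps, cfl=0.5, chunk_size=4):
    """Loop over time: fill ghost cells, then update every subdomain."""
    for _ in range(n_steps):
        fill_ghost_cells(subdomains)
        for u in subdomains:
            hydro_step(u, cfl, chunk_size)
    return subdomains


def total_mass(subdomains):
    """Sum of the real (non-ghost) cells over all subdomains."""
    return sum(sum(u[1:-1]) for u in subdomains)


# Four stand-in "MPI processes", each with 8 real cells and 2 ghost cells.
subdomains = [[0.0] * 10 for _ in range(4)]
subdomains[0][1:5] = [1.0] * 4          # initial square pulse
mass_before = total_mass(subdomains)
run(subdomains, n_steps=20)
mass_after = total_mass(subdomains)
```

Because every chunk reads only old-time-level values, the result does not depend on `chunk_size` or on the order in which chunks are processed, which is the property that makes such a loop safe to share among threads.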
Pure MPI vs MPI/OpenMP
MPI: 16 messages of size 1
MPI + 4 threads: 4 messages of size 2
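One way to reproduce these two counts is the small sketch below. It assumes a reading of the slide that the transcript does not spell out: a square 2 × 2 block of cores in 2D, each MPI process exchanging one message per face with its 4 neighbours, and the threads of one process covering a square patch of cores. The function name `halo_messages` and its parameters are hypothetical.

```python
import math


def halo_messages(cores_per_side, threads_per_process):
    """Return (number of messages, size of each message) for a square
    block of cores in 2D, one message per face of every MPI process.

    Sizes are in units of the edge length of one core's subdomain.
    """
    t_side = math.isqrt(threads_per_process)
    if t_side * t_side != threads_per_process:
        raise ValueError("threads_per_process must be a perfect square")
    if cores_per_side % t_side != 0:
        raise ValueError("thread patch must tile the block of cores")
    procs_per_side = cores_per_side // t_side
    n_processes = procs_per_side ** 2
    n_messages = 4 * n_processes      # 4 faces per process in 2D
    message_size = t_side             # a merged subdomain has longer edges
    return n_messages, message_size


pure_mpi = halo_messages(cores_per_side=2, threads_per_process=1)
hybrid = halo_messages(cores_per_side=2, threads_per_process=4)
```

Under these assumptions `pure_mpi` is `(16, 1)` and `hybrid` is `(4, 2)`: the hybrid version sends fewer, larger messages, and the total volume exchanged drops from 16 to 8 units.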
The Curie system
Fat nodes: 360 BullX S6010; Intel Nehalem-EX, 2.26 GHz; 11 520 cores; 32 cores/node; 128 GB/node; 105 TFlops
Thin nodes: 5040 BullX B510; Intel new generation (SNB); 80 640 cores; 16 cores/node; 4 GB/core; 128 GB SSD; 1.5+ PFlops
Hybrid nodes: 144 BullX B505; 288 Nvidia M2090; 184 + 11 TFlops
Interconnect: Infiniband QDR
1st-level Lustre: 6 PB, 150 GB/s
Timeline: February 2011, October 2011, March 2012
Strong Scaling (900³ run)
[Figure: strong-scaling curves for pure MPI, 2 threads, 4 threads and 8 threads]
Weak scaling: 256³ / node (32 cores)
[Figure: weak-scaling curves for pure MPI, 2 threads, 4 threads and 8 threads]
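Strong- and weak-scaling curves of this kind are commonly summarised as parallel efficiencies. The sketch below uses the standard definitions; the function names are hypothetical and the timings in the usage lines are invented purely to exercise the formulas. They are not measurements from the talk.

```python
def strong_scaling_efficiency(t_ref, n_ref, t, n):
    """Fixed total problem size: ideal run time scales as 1/n.

    t_ref is the run time on n_ref cores, t the run time on n cores.
    """
    return (t_ref * n_ref) / (t * n)


def weak_scaling_efficiency(t_ref, t):
    """Fixed problem size per node: ideal run time stays constant."""
    return t_ref / t


# Made-up timings (seconds), for illustration only.
strong_eff = strong_scaling_efficiency(t_ref=100.0, n_ref=32, t=62.5, n=64)
weak_eff = weak_scaling_efficiency(t_ref=10.0, t=12.5)
```

With these invented numbers both efficiencies come out at 0.8, meaning the run keeps 80 % of the ideal performance.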
Scaling on BlueGene-IDRIS (strong scaling)
IO – the craftsman way
All processes write their output at the same time…
Failure beyond a few 10³ processes
Write in packets, with a delay between packets
Ncpu_write ~ 100 – 1000, T_wait ~ 2 – 10 seconds
One output ~ 5 time steps
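The packet-plus-delay idea can be sketched as follows. The slide gives only the two parameters, so the exact scheme is an assumption here: ranks are grouped into consecutive packets of `ncpu_write` processes, and packet number k starts writing k × `t_wait` seconds after the first one. The function names are hypothetical.

```python
def write_delay(rank, ncpu_write, t_wait):
    """Seconds a rank waits before writing its output file."""
    packet = rank // ncpu_write        # index of the packet this rank is in
    return packet * t_wait


def last_packet_delay(n_ranks, ncpu_write, t_wait):
    """Delay before the last packet starts writing."""
    n_packets = -(-n_ranks // ncpu_write)   # ceiling division
    return (n_packets - 1) * t_wait


# Example values chosen inside the ranges quoted on the slide.
delay_rank_1000 = write_delay(rank=1000, ncpu_write=512, t_wait=2.0)
stagger_4096 = last_packet_delay(n_ranks=4096, ncpu_write=512, t_wait=2.0)
```

In this example rank 1000 falls in the second packet and waits 2 s, and the eighth and last packet of a 4096-process run starts 14 s after the first.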
IO – the professional approach
P. Wautelet and P. Kestener
4 different IO approaches were tested:
POSIX: 1 file per MPI process
MPI-IO
HDF5
Parallel-NetCDF
STEP 1: Optimizing the MPI-IO hints
MPI-IO hints can have a dramatic effect on IO performance
Best parameters depend on the application
7 of the 23 available hints were tested!
STEP 2: Strong scaling test
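Since the best hint values depend on the application, Step 1 amounts to a parameter sweep. The sketch below only enumerates the combinations to try. The talk does not say which 7 hints were tested, so the four hint names used here (common ROMIO and MPI reserved hints) and their candidate values are assumptions chosen for illustration, and `hint_combinations` is a hypothetical helper.

```python
from itertools import product

# Assumed candidates; MPI hint values are always passed as strings.
candidate_hints = {
    "romio_cb_write": ["enable", "disable"],   # collective buffering on writes
    "romio_ds_write": ["enable", "disable"],   # data sieving on writes
    "cb_buffer_size": ["4194304", "16777216"], # collective buffer size in bytes
    "striping_factor": ["4", "16"],            # number of I/O devices to stripe over
}


def hint_combinations(candidates):
    """Yield one {hint: value} dict per combination of candidate values."""
    keys = list(candidates)
    for values in product(*(candidates[k] for k in keys)):
        yield dict(zip(keys, values))


combos = list(hint_combinations(candidate_hints))
```

In a real benchmark each dictionary would be copied into an info object with `MPI_Info_set` and passed to `MPI_File_open`, with the write time recorded for every combination. The count grows quickly (16 combinations for only four two-valued hints), which is one reason to test a subset of the available hints.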
Conclusions
Multi-threading necessary for large number of cores
OpenMP is “easy” to implement but not always to understand…
Multi-threaded communications probably necessary
Good results for a small number of threads.