High-performance Application and Computers
Studying PErformance and Correctness In Simulation
Paris, May 30th 2018
HAC SPECIS mid-term review 1/31
Meeting Schedule
10:00-13:00
General presentation of HAC SPECIS
Administrative Information
Positions, dissemination, collaborations, publications, . . .
A few success stories
13:00 Lunch at ”Le Repere”
14:30-17:00
3 scientific focuses
Perspectives & Discussions
Outline
General Presentation
  Context: Modern HPC
  HAC SPECIS and SimGrid
Administrative Facts
Success Stories
  Capacity planning with SMPI (Arnaud)
  StarPU-SimGrid (Samuel)
  Exact and Statistical Model Checking in SimGrid (Stephan)
  Improving State Equality with Static Analysis (Emmanuelle)
  Collateral Projects (Frederic)
Focus
  Predicting Energy Consumption (Anne-Cecile & Arnaud)
  StarPU-SimGrid (Samuel & Emmanuel)
  Formal Aspects in HAC SPECIS (Martin & Thierry)
Perspectives
Modern Supercomputers
Machine: nodes × cores per node (CPU) + accelerators; interconnect; power
TaihuLight: 40,960 × 260 (RISC); Fat-Tree; 15 MW
Tianhe-2: 32,000 × 12 (Xeon) + 48,000 Xeon Phi; Fat-Tree; 18 MW
Piz Daint: 5,272 × 12 (Xeon) + 5,272 P100; DragonFly; 2 MW
Gyoukou: 1,248 × 16 (Xeon D) + 9,984 PEZY-SC2 (1,984); ???; 1 MW
Titan: 6,274 × 16 (AMD) + 18,688 K20; 3D-Torus; 8 MW
Sequoia: 98,304 × 12 (Power A2); 5D-Torus; 8 MW
Up to 20,000,000 cores, accelerators, a complex high speed interconnect
Co-design: Which technology for which application ? Energy/power management ?
Programming Supercomputers
MPI (1994) → regular algorithmic patterns
But many other APIs to exploit cores, GPUs, KNC, KNL, FPGAs, ...:
OpenMP, CUDA, OpenCL, Cilk, Charm++, KAAPI, StarPU, QUARK, PaRSEC, OmpSs, ...
Exploitation is a programming (coding + algorithm) nightmare
Handling Supercomputers at the Application Level
Larger and larger scale hybrid machines are a pain for application developers
Programming models do not mix well
Possible programming approaches
“Rigid, hand tuned”: SuperLU
“Task-based” and dynamic: MUMPS
Analysis and Comparison of Two Distributed Memory Sparse Solvers. Amestoy, Duff, L'Excellent, Li. ACM Trans. on Math. Software, Vol. 27, No. 4, 2001.
→ Applications are more and more adaptive
Toward Exascale
Modern Application Structure
1. Task-based Runtime (e.g., StarPU)
2. Task implementation Auto-Tuning
Typical Domains
Dense linear solvers (e.g., Chameleon)
Sparse linear solvers (e.g., qr_mumps), low-rank
Fast Multipole Methods
Seismic application
M.L. for Climate/Weather prediction
→ adaptive applications, synchronizations, complex optimizations/scheduling
Performance ??? Correctness ???
Simulation & Model Checking
Performance Evaluation challenges
MPI/StarPU Simulation: what for ?
1. Helping application/runtime developers
Non-intrusive tracing, repeatable execution with classical debugging tools
Save computing resources (runs on your laptop if possible)
Provide a sound baseline for performance non-regression testing
2. Helping application end-users
How many resources should I ask for? (scaling)
Configure MPI collective operations (tuning)
Provide baseline (did something go wrong?)
3. Capacity planning
Energy consumption: can we save on components?
Hardware upgrade: what if the network was this?
Could we have a tool that allows for this on a daily basis?
Formal verification challenges
MPI/StarPU Model Checking: what for ?
Checking dynamic properties (deadlock, progression) of real HPC applications and runtimes
Heisenbugs are common
HPC applications are dirty but also have specific rigid structures
Adaptive applications require exploring all possible executions
Quantitative properties (e.g., swapping)
Exact computation is painful
Estimate probability of particular events
Could we have a tool that allows for this on a daily basis?
The need for specific tools/languages is a showstopper for adoption in the HPC community → Many almost unexplored problems on real apps
Global meetings help us discriminate between short-term and long-term challenges
Simulation vs. Model Checking
Simulation explores one possible execution of the program according to the features/limitations of the platform
Model checking explores all possible executions of the program
[Figures: state space with simulation; state space with model checking]
Simulation and model checking are complementary:
Simulation for performance evaluation
Model checking for the verification of execution properties
Both run automatically
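The contrast between one execution and all executions can be illustrated with a toy sketch. Everything below is hypothetical (the process steps, the function names, and the random choice standing in for the platform model); it is not SimGrid code.

```python
import random
from typing import Iterator, List, Tuple

def interleavings(p: Tuple[str, ...], q: Tuple[str, ...]) -> Iterator[List[str]]:
    """All orderings of the steps of p and q that keep each process's own order."""
    if not p or not q:
        yield list(p + q)
        return
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest

def one_execution(p, q, rng: random.Random) -> List[str]:
    """A simulator follows a single schedule. Here the schedule is picked at
    random, as a stand-in for whatever the platform model would dictate."""
    p, q, trace = list(p), list(q), []
    while p or q:
        src = p if p and (not q or rng.random() < 0.5) else q
        trace.append(src.pop(0))
    return trace

# Two hypothetical processes with two steps each.
P = ("p.send", "p.recv")
Q = ("q.send", "q.recv")

all_runs = list(interleavings(P, Q))             # what a model checker must cover
single_run = one_execution(P, Q, random.Random(1))  # what a simulator explores
print(len(all_runs), single_run)
```

Even for two processes with two steps each, the exhaustive side already has several schedules to cover, which hints at why state space explosion is a recurring concern for model checking.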
HAC Specis
HAC SPECIS Goal
Bridge the gap between the HPC*, formal verification† and performance evaluation§ communities
Use SimGrid as a prototyping/integration platform
Partners
Rhone Alpes: AVALON*§, POLARIS*§ (+ CEA*)
Bretagne Atlantique: MYRIADS*†, SUMO†
Sud Ouest: HIEPACS*, STORM*
Ile de France: MEXICO†
Grand Est: VERIDIS†
A few dates
Initial version in Jan 2014
Final version in Dec 2016 (accepted April 2016)
Kickoff June 2016 @ Rennes
Why Simgrid ?
SimGrid: an 18-year-old open-source project. Collaboration between France (INRIA, CNRS, Univ. Lyon, Nancy, Grenoble, ...), USA (UCSD, U. Hawaii), UK, Austria (Vienna), ...
http://simgrid.gforge.inria.fr
500 cite, 300 use, 60 extend
Initially focused on Grid settings, we argue that the same tool/techniques can be used for P2P, HPC and cloud
Goals: A usable tool with predictive capability
SimGrid has offered model checking capabilities for a few years
SimGrid 4 HPC
SMPI (2008-...): BigDFT, Ondes3D, SpecFEM3D, ...
StarPU-SimGrid (2013-...): Dense solvers, early work on qr_mumps
A Flourishing state of the Art
Many simulation projects for HPC
Dimemas (BSC, probably one of the earliest)
PSINS (SDSC, used to rely on Dimemas)
BigSim (UIUC): BigNetSim or BigFastSim
LogGopSim (UIUC/ETHZ)
SST (Sandia Nat. Lab.)
XSim (Oak Ridge Nat. Lab.)
CODES (Argonne Nat. Lab.)
Verification for HPC
CIVL (Delaware)
MUST (Aachen)
ISP and DAMPI (Utah)
Very few can work directly with real applications
Publications and Dissemination
Co-publications
CCPE15, ICPADS15, Cluster17, TPDS17, CCPE18
SC17: Correctness, VPA
Dissemination
SC16, SC17, ISC18 booths
SIAMPP 2018: BoF on Simulation
Presentation to the DGDT
Maison simulation
. . .
HPL and the Top500
Stampede, U.S.A., #20 with ≈ 5 Pflops
56 Gbit/s FDR InfiniBand, Fat tree topology
6,400 × (8 cores + 1 Xeon Phi)
Context
Real execution (qualification benchmark)
Matrix of rank 3,875,000: ≈ 120 Terabytes
6,006 MPI processes for 2 hours: 500 CPU-days
Simulation/Emulation with SMPI
1 Xeon E5-2620 server (Nova, Grid'5000)
≈ 47 hours and 16 GB
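The orders of magnitude quoted above can be checked with a few lines of arithmetic. This is only a sanity check under stated assumptions: 8 bytes per matrix entry (double precision), decimal terabytes, one CPU counted per MPI process, and the simulation server counted as a single unit.

```python
# Back-of-the-envelope check of the figures quoted on the slide.
BYTES_PER_DOUBLE = 8  # assumption: the matrix is stored in double precision

def dense_matrix_bytes(n: int) -> int:
    """Memory needed for a dense n x n matrix of doubles."""
    return n * n * BYTES_PER_DOUBLE

def cpu_days(processes: int, hours: float) -> float:
    """Aggregate compute time, counting one CPU per process (assumption)."""
    return processes * hours / 24.0

matrix_tb = dense_matrix_bytes(3_875_000) / 1e12  # decimal terabytes
real_run = cpu_days(6_006, 2.0)                   # real qualification run
simulated_run = cpu_days(1, 47.0)                 # single simulation server

print(f"matrix size : {matrix_tb:.1f} TB")
print(f"real run    : {real_run:.1f} CPU-days")
print(f"simulation  : {simulated_run:.2f} CPU-days")
```

Under these assumptions the results agree with the ≈ 120 Terabytes and 500 CPU-days stated above, while the simulation stays at roughly two days on one server.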
Accuracy (evaluation on Taurus, Grid'5000)
[Figure: HPL duration (seconds) for different numbers of processes (0 to 200); matrix size: 20,000]
[Figure: HPL energy consumption (joules) for different numbers of processes (0 to 200); matrix size: 20,000]
Experiment type: optimized simulation, vanilla simulation, real execution
Mismatch with the Stampede qualification run (Intel HPL)
Perspective: capacity planning, tuning real applications, co-design, ...
StarPU-Simgrid principle
Calibration: run StarPU natively once to obtain a performance profile (run once!)
Simulation: StarPU on top of SimGrid reuses the performance profile to quickly simulate many times
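The calibrate-once, simulate-many idea can be sketched with a toy model. This is not the StarPU or SimGrid API: the function names, the kernel names used as labels, and the timings are all hypothetical, and tasks are assumed independent (no dependency graph) to keep the sketch short.

```python
import heapq
from statistics import mean

def calibrate(measured_runs):
    """Build a performance profile: mean duration per task kind.
    In a real setup these durations would come from one native execution."""
    return {kind: mean(samples) for kind, samples in measured_runs.items()}

def simulate(tasks, profile, workers):
    """Greedy list scheduling of independent tasks on identical workers.
    Returns the simulated makespan; no task is actually executed."""
    free_at = [0.0] * workers        # min-heap of worker availability times
    heapq.heapify(free_at)
    makespan = 0.0
    for kind in tasks:
        start = heapq.heappop(free_at)   # earliest available worker
        end = start + profile[kind]
        makespan = max(makespan, end)
        heapq.heappush(free_at, end)
    return makespan

# Hypothetical timings (seconds) gathered during a single calibration run.
profile = calibrate({"potrf": [2.0, 2.2], "gemm": [1.0, 1.0, 1.0]})
tasks = ["potrf"] + ["gemm"] * 6

# Cheap what-if study: the same profile is reused for every configuration.
results = {w: simulate(tasks, profile, w) for w in (1, 2, 4)}
print(results)
```

The point of the design is that the expensive step (measuring kernels on the real machine) happens once, while each additional simulated configuration only costs a pass over the task list.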
StarPU-Simgrid on dense linear algebra
Accurate simulated time results
Already required a lot of care
Extensively used for scheduling research
[Figure: GFlop/s vs. matrix dimension (20,000 to 80,000) for Cholesky on Conan and LU on Attila; experimental conditions: SimGrid (naive runtime modeling), SimGrid (smart), Native]
StarPU-Simgrid for CI
Execution on one node of sirocco (12 cores + 3 GPU K40M)
StarPU-Simgrid for CI with MPI
Execution on sirocco with 4 nodes (2×12 cores + 4 GPU K40M)
StarPU Visualization
StarPU QR-Mumps
QR-MUMPS multi-frontal sparse factorization on top of StarPU
Tree parallelism
Node parallelism
Variable matrix geometry
Fully dynamic scheduling w. StarPU
[Figure: allocated memory (GiB, 0 to 3) over time (0 to 40,000 ms) for three native runs (Native 1, Native 2, Native 3) and one SimGrid run; task types: Activate, Assemble, Panel, Update, Deactivate]
[Figure: Extrapolation; duration (s) vs. number of threads (4 to 400), Native vs. SimGrid; measured time: overall makespan, idle time per thread]
Perspective: tune application and scheduler, capacity (memory) planning
Exact model checking
Explore reachable system states
• check invariant properties in all states
• ensure liveness properties of infinite executions
Avoid computing and storing states when possible
• depth-bounded state space exploration
• save history of actions, recompute state when backtracking
• DPOR: avoid exploring redundant interleavings of independent actions
• liveness checking: compress states, semantic equality checking
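Two of the points above (depth-bounded exploration, and saving the history of actions so that states are recomputed when backtracking) can be sketched on a toy system. The two-counter system, the action names, and the invariant are hypothetical; this is not how McSimGrid is implemented, and DPOR is deliberately left out.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, int]  # toy state: two counters

def initial() -> State:
    return (0, 0)

def enabled(state: State) -> List[str]:
    """Actions that may fire in this state (hypothetical toy system)."""
    return ["inc_a", "inc_b"]

def step(state: State, action: str) -> State:
    a, b = state
    return (a + 1, b) if action == "inc_a" else (a, b + 1)

def replay(history: List[str]) -> State:
    """Recompute a state from its action history instead of storing it."""
    state = initial()
    for action in history:
        state = step(state, action)
    return state

def explore(invariant: Callable[[State], bool], max_depth: int) -> Optional[List[str]]:
    """Depth-bounded DFS over histories; returns a violating history or None."""
    stack: List[List[str]] = [[]]
    while stack:
        history = stack.pop()
        state = replay(history)      # backtracking = replay, no stored snapshot
        if not invariant(state):
            return history           # counterexample trace
        if len(history) < max_depth:
            for action in enabled(state):
                stack.append(history + [action])
    return None

# Invariant: the two counters never differ by 3 or more.
counterexample = explore(lambda s: abs(s[0] - s[1]) < 3, max_depth=4)
print(counterexample)
```

Only action histories live on the stack, so memory stays small at the price of re-executing prefixes; the depth bound keeps the search finite even though this toy system has infinitely many states.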
McSimGrid: Evaluation
McSimGrid is part of the SimGrid distribution
Success stories
• identified the cause of a known bug in a Chord implementation
• analyzed conformance tests proposed for MPI implementations
• verification of send-determinism
Drawbacks
• state space explosion restricts applicability to a handful of processes
• liveness properties are even harder to verify
• certain errors correspond to unrealistic border cases
Ongoing work
• use unfolding techniques to improve reduction (PhD thesis of The Anh Pham)
• static analysis for improving state equality detection (Emmanuelle Saillard)
Statistical Model Checking
Alternative to exhaustive verification of Boolean properties
• interval estimation: approximate a parameter value
• hypothesis testing: statistical evidence for acceptance or rejection
• quantify confidence in the result / probability of error
Perform Monte-Carlo simulation, apply statistical techniques
• no need to construct or store transition system
• applicable without prior knowledge about probability distributions
• simulation and analysis techniques are embarrassingly parallel
Combine performance evaluation and verification
• estimate quantitative parameters, e.g. response time or memory consumption
• closely related to existing techniques within SimGrid
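Interval estimation by Monte-Carlo simulation can be sketched as follows. The sample size comes from the standard Hoeffding bound, which is one common choice and not necessarily the one used in the project; the simulated system (a lookup made of three hops that each succeed with probability 0.95) is purely hypothetical.

```python
import math
import random

def required_samples(epsilon: float, delta: float) -> int:
    """Hoeffding bound: n >= ln(2/delta) / (2 * epsilon^2) runs guarantee
    |estimate - p| <= epsilon with probability at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

def simulate_once(rng: random.Random) -> bool:
    """Hypothetical black-box simulation: True if the property holds.
    Here a lookup succeeds when each of 3 hops succeeds independently."""
    return all(rng.random() < 0.95 for _ in range(3))

def estimate(epsilon: float, delta: float, seed: int = 0) -> float:
    """Estimate the probability that the property holds."""
    rng = random.Random(seed)
    n = required_samples(epsilon, delta)
    successes = sum(simulate_once(rng) for _ in range(n))
    return successes / n

n = required_samples(0.01, 0.05)   # precision 0.01, confidence 95%
p_hat = estimate(0.01, 0.05)
print(n, round(p_hat, 3))
```

No transition system is built or stored: each run is independent, which is also why the runs could be distributed over many machines.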
Preliminary Results
Prototype implementation and application to Chord
• estimate average lookup time
• estimate percentage of failed lookups
• estimate network resilience (capacity to reconnect after failure)
Mitigating the State Space Explosion
System-Level State Equality
‣ Detect when a given state was previously explored
‣ Introspect the application state similarly to gdb
‣ Also with Memory Compaction
Reduce McSimGrid state space
Improve Dynamic State Equality Detection
[Figure: two execution branches from init that differ only in the value written to x (x=42 vs. x=43); a second example adds y=x, with program steps WRITE(x), READ(x), WRITE(y)]
First example: x is dead*
Second example: x,y are dead*
*redefined before it is used or never used in the future
=> Set dead variables to 0 (x=0, y=0)
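The effect of zeroing dead variables on state equality can be sketched in a few lines. The state layout (a dictionary with a `pc` entry) and the liveness set are hypothetical; in the approach described here, the liveness information would come from a static analysis, which is simply assumed to be available.

```python
from typing import Dict, FrozenSet, Set, Tuple

State = Dict[str, int]

def canonicalize(state: State, live: Set[str]) -> FrozenSet[Tuple[str, int]]:
    """Set every dead variable to 0 so that it cannot distinguish two states."""
    return frozenset(
        (name, value if name in live else 0) for name, value in state.items()
    )

# Two branches wrote different values to x, and y copied x ...
branch_a = {"x": 42, "y": 42, "pc": 7}
branch_b = {"x": 43, "y": 43, "pc": 7}

# ... but a (hypothetical) liveness analysis says only pc is live at this point.
live_here = {"pc"}

naive_equal = branch_a == branch_b
canonical_equal = canonicalize(branch_a, live_here) == canonicalize(branch_b, live_here)
print(naive_equal, canonical_equal)
```

A byte-wise comparison treats the two branches as different states, whereas the canonical form merges them, so the model checker does not have to explore the second one again.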
Give type information in the heap
=> Static Analysis in LLVM
[Figure: two heaps syntactically different but semantically identical]
BatSim
A Job and Resource Management System Simulator
A key component in HPC systems
Decouple the decision making from the simulation
Uses SimGrid as a backend
Developed in the Datamove team (Grenoble)
https://github.com/oar-team/batsim
Wrench
A Workflow Management System Simulation Workbench
Objective
Provide high-level building blocks for developing custom simulators
Targets:
Scientists: make quick and informed choices when executing workflows
Software developers: implement more efficient software infrastructures to support workflows
Researchers: develop novel efficient algorithms
Coupled with BatSim
http://wrench-project.org
Collaboration with ISI/USC and UH Manoa
Funded by the NSF (grant numbers 1642369 and 1642335) and CNRS (PICS 7239)
Scientific Perspectives
Contribute to next-generation HPC tools & Gather strengths from 3 communities
Key scientific challenges:
Automatic performance modeling (building models, parameter learning)
Platform (MPI)
Computation kernels (StarPU)
OpenMP/SimGrid (→ PhD IPL 2018)
Proxy Apps: 23/40 functional (mostly because of OpenMP)
Capturing/modeling the impact of compiler/runtime, memory interferences
Improving DPOR (→ PhD IPL 2016)
Good semantic formalization
Optimality, merge with state equality, liveness, ...
Applying “classical” safety MC techniques to HPC code (→ PostDoc IPL 2017)
Op. system challenges, improving state equality w. compiler
Handle threads, checking applications and runtimes
Technical Considerations
Complex and Dynamic Code Base
Only 100k sloc, but complex due to versatile efficiency + formal verification
Implemented in C++/C (+ assembly); Bindings: Java, Lua and Fortran
Active project: commits every day by ≈ 6 committers, 4 releases a year
Ongoing full rewrite in C++ along with “release soon, release often”
Well Tested
740 integration tests, 10k unit tests (coverage: 80%)
Each commit: 22 configurations (4 OS, 3 compilers, 2 archs; 3 providers)
Nightly: 2 dynamic + 2 static analyzers; StarPU, BigDFT and Proxy Apps
We cultivate our garden: simplify to grow further
Building a Community
Communication and Animation
SimGrid User Days: Welcome newcomers & Take feedback since 2010
Scientific tutorials, Booth at SuperComputing & others; Companies Courses
500 cite 300 use 60 extend; 30 mails/month; 5 bugs/month; Stack Overflow
Preliminary Industrial Contacts
CERN: currently testing the LHC DataGrid before production
Intel/KAUST: internal project (est. at SC’17)
Octo: sizing Ceph infrastructures for their clients (past attempt)
Amazon/Nice: very preliminary contacts for sizing, service to clients
My dream: make open-source IT infra (Samba, Ceph) testable with SimGrid
Possible Income: subscription of 6-8 supporting institutions/companies
Toward Education
Teach SimGrid now to the researchers and engineers of tomorrow
Done: SMPI CourseWare, PeerSimGrid; Ongoing: Cloud, Wrench and more?