Post on 21-Dec-2015
![Page 1: 1 MPI Verification Ganesh Gopalakrishnan and Robert M. Kirby Students Yu Yang, Sarvani Vakkalanka, Guodong Li, Subodh Sharma, Anh Vo, Michael DeLisi, Geof](https://reader035.vdocument.in/reader035/viewer/2022081515/56649d545503460f94a3181c/html5/thumbnails/1.jpg)
1
MPI Verification
Ganesh Gopalakrishnan and Robert M. Kirby
Students: Yu Yang, Sarvani Vakkalanka, Guodong Li, Subodh Sharma, Anh Vo, Michael DeLisi, Geof Sawaya
(http://www.cs.utah.edu/formal_verification)
School of Computing, University of Utah
Supported by: Microsoft HPC Institutes
NSF CNS 0509379
2
“MPI Verification”
or
How to exhaustively verify MPI programs
without the pain of model building and considering only “relevant interleavings”
3
Computing is at an inflection point
(photo courtesy of Intel)
4
Our work pertains to these:
MPI programs
MPI libraries
Shared Memory Threads based on Locks
5
Name of the Game: Progress Through Precision
1. Precision in Understanding
2. Precision in Modeling
3. Precision in Analysis
4. Doing Modeling and Analysis with Low Cost
6
1. Need for Precision in Understanding:
The “crooked barrier” quiz
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( ANY )
MPI_Barrier
Will P1’s Send Match P2’s Receive ?
7
Need for Precision in Understanding:
The “crooked barrier” quiz
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( ANY )
MPI_Barrier
It will! The original deck animates the match over the next five slides.
13
Would you rather explain each conceivable situation in a large API with an elaborate “bee dance” and informal English … or would you rather specify it mathematically and let the user calculate the outcomes?
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( ANY )
MPI_Barrier
14
TLA+ Spec of MPI_Wait (Slide 1/2)
15
TLA+ Spec of MPI_Wait (Slide 2/2)
16
Executable Formal Specification can help validate our understanding of MPI …
04/18/23
Verification Environment (diagram labels):
TLA+ MPI Library Model, TLA+ Prog. Model, TLC Model Checker
MPIC Program Model, MPIC IR, MPIC Model Checker
Visual Studio 2005, Phoenix Compiler
FMICS 07, PADTAD 07
17
The Histrionics of FV for HPC (1)
18
The Histrionics of FV for HPC (2)
19
Error-trace Visualization in VisualStudio
20
2. Precision in Modeling: The “Byte-range Locking Protocol” Challenge
Asked to see if new protocol using MPI 1-sided was OK…

    lock_acquire (start, end) {
      /* Stage 1 */
      val[0] = 1; /* flag */ val[1] = start; val[2] = end;
      while (1) {
        lock_win
        place val in win
        get values of other processes from win
        unlock_win
        for all i, if (Pi conflicts with my range)
          conflict = 1;

        /* Stage 2 */
        if (conflict) {
          val[0] = 0
          lock_win
          place val in win
          unlock_win
          MPI_Recv(ANY_SOURCE)
        }
        else {
          /* lock is acquired */
          break;
        }
      } // end while
    }

Window contents (one row per process):

    flag  start  end
    0     -1     -1
    0     -1     -1
    0     -1     -1
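The Stage 1 conflict test ("for all i, if Pi conflicts with my range") can be sketched as below. This is only an illustration: the function name `has_conflict`, the list-of-tuples window, and the choice of inclusive byte ranges are assumptions, not details given on the slide.

```python
def has_conflict(my_rank, window):
    """Return True if another process with its flag set overlaps my byte range.

    window is a list of (flag, start, end) rows, one per process, mirroring
    the flag/start/end table on the slide. Ranges are treated as inclusive,
    which is an assumption made for this sketch.
    """
    _, my_start, my_end = window[my_rank]
    for rank, (flag, start, end) in enumerate(window):
        if rank == my_rank or flag == 0:
            continue  # skip myself and processes that hold or want no lock
        if start <= my_end and my_start <= end:
            return True  # the two inclusive ranges overlap
    return False


idle = [(0, -1, -1), (0, -1, -1), (0, -1, -1)]
overlapping = [(1, 0, 9), (1, 5, 14), (0, -1, -1)]
disjoint = [(1, 0, 9), (1, 10, 19), (0, -1, -1)]
print(has_conflict(0, overlapping), has_conflict(0, disjoint))
```

A process that sees a conflict would then go on to Stage 2 (reset its flag and wait); one that sees none has acquired the lock.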
21
Precision in Modeling: The “Byte-range Locking Protocol” Challenge
• Studied code
• Wrote Promela Verification Model (a week)
• Applied the SPIN Model Checker
• Found Two Deadlocks Previously Unknown
• Wrote Paper (EuroPVM / MPI 2006) with Thakur and Gropp – won one of the three best-paper awards
• With new insight, Designed Correct AND Faster Protocol!
• Still, we felt lucky … what if we had missed the error while hand-modeling?
• Also hand-modeling was NO FUN – how about running the real MPI code “cleverly”?
22
Measurement under Low Contention
23
Measurement under High Contention
24
4. Modeling and Analysis with Reduced Cost…
0: 1: 2: 3: 4: 5:
Card Deck 0 Card Deck 1
0: 1: 2: 3: 4: 5:
• Only the interleavings of the red cards matter
• So don’t try all riffle-shuffles: 12! / (6! · 6!) = 924
• Instead just try TWO shuffles of the decks!!
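The card-deck arithmetic can be checked by brute force. The sketch below assumes exactly one red card per deck, at a position chosen arbitrarily (`RED_POS_0`, `RED_POS_1`); the slide does not say which cards are red, only that two shuffles suffice.

```python
from itertools import combinations
from math import comb

DECK_SIZE = 6
RED_POS_0 = 2  # assumed position of the red card in deck 0
RED_POS_1 = 2  # assumed position of the red card in deck 1


def interleavings(n0, n1):
    """Yield every riffle shuffle of two decks as a tuple of (deck, position)."""
    total = n0 + n1
    for slots in combinations(range(total), n0):
        slot_set = set(slots)  # time slots taken by cards from deck 0
        i0 = i1 = 0
        order = []
        for t in range(total):
            if t in slot_set:
                order.append((0, i0))
                i0 += 1
            else:
                order.append((1, i1))
                i1 += 1
        yield tuple(order)


red_cards = {(0, RED_POS_0), (1, RED_POS_1)}
all_shuffles = list(interleavings(DECK_SIZE, DECK_SIZE))
# Project each shuffle onto the relative order of the two red cards.
red_orders = {tuple(c for c in s if c in red_cards) for s in all_shuffles}

print(len(all_shuffles), comb(12, 6), len(red_orders))
```

All 924 shuffles collapse to just two distinguishable orders once only the red cards are considered.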
25
What works for cards works for MPI (and for PThreads also)!!

P0 (owner of window)
0: MPI_Init
1: MPI_Win_lock
2: MPI_Accumulate
3: MPI_Win_unlock
4: MPI_Barrier
5: MPI_Finalize

P1 (non-owner of window)
0: MPI_Init
1: MPI_Win_lock
2: MPI_Accumulate
3: MPI_Win_unlock
4: MPI_Barrier
5: MPI_Finalize

• These are the dependent operations
• 504 interleavings without POR in this example
• 2 interleavings with POR!!
26
4. Modeling and Analysis with Reduced Cost: The “Byte-range Locking Protocol” Challenge
• Studied code → DID NOT STUDY CODE
• Wrote Promela Verification Model (a week) → NO MODELING
• Applied the SPIN Model Checker → NEW ISP VERIFIER
• Found Two Deadlocks Previously Unknown → FOUND SAME!
• Wrote Paper (EuroPVM / MPI 2007) with Thakur and Gropp – won one of the three best-paper awards → DID NOT WIN
• Still, we felt lucky … what if we had missed the error while hand-modeling → NO NEED TO FEEL LUCKY (NO LOST INTERLEAVING – but also did not foolishly do ALL interleavings)
• Also hand-modeling was NO FUN – how about running the real MPI code “cleverly”? → DIRECT RUNNING WAS FUN
27
3. Precision in Analysis: The “crooked barrier” quiz again …
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( ANY )
MPI_Barrier
Our Cluster NEVER gave us the P0 to P2 match !!!
Elusive Interleavings !!
They bite you the hardest when you port to a new platform!!
28
3. Precision in Analysis: The “crooked barrier” quiz again …
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( ANY )
MPI_Barrier
SOLVED!! Using the new POE Algorithm
Partial Order Reduction in the presence of Out of Order Operations and Elusive Interleavings
29
Precision in Analysis
• POE Works Great (all 41 Umpire Test-Suites Run)
• No need to “pad” delay statements to jiggle schedule and force “the other” interleaving
  – This is a very brittle trick anyway!
• Prelim Version Under Submission
  – Detailed Version for EuroPVM…
• Jitterbug uses this approach
  – We don’t need it
• Siegel (MPI_SPIN): Modeling effort
• Marmot: Different Coverage Guarantees..
30
1-4: Finally! Precision and Low Cost in Modeling and Analysis, taking advantage of MPI semantics (in our heads…)
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( ANY )
MPI_Barrier
This is how POE does it
31
Discover All Potential Senders by Collecting (but not issuing) operations at runtime…
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( ANY )
MPI_Barrier
32
Rewrite “ANY” to ALL POTENTIAL SENDERS
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( P0 )
MPI_Barrier
33
Rewrite “ANY” to ALL POTENTIAL SENDERS
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( P1 )
MPI_Barrier
34
Recurse over all such configurations !
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( P1 )
MPI_Barrier
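The collect, rewrite, and recurse steps on the preceding slides can be sketched roughly as follows. This is not ISP's actual implementation: `explore`, the tuple encoding of receives, and the `senders_for` table are hypothetical names for illustration, and the sketch ignores that each send can be consumed only once.

```python
def explore(receives, senders_for, chosen=()):
    """Yield every deterministic rewriting of the wildcard receives.

    receives:    list of (receiver, source); source may be "ANY".
    senders_for: dict mapping a receiver to the senders discovered for it
                 at runtime (collected, but not yet issued).
    chosen:      the (receiver, sender) matches fixed so far.
    """
    if len(chosen) == len(receives):
        yield chosen  # every receive now names a specific sender
        return
    receiver, source = receives[len(chosen)]
    candidates = senders_for[receiver] if source == "ANY" else [source]
    for sender in candidates:
        # Rewrite ANY to one specific sender, then recurse on the rest.
        yield from explore(receives, senders_for, chosen + ((receiver, sender),))


crooked_barrier = [("P2", "ANY")]
potential = {"P2": ["P0", "P1"]}  # both sends can match, per the quiz
configs = list(explore(crooked_barrier, potential))
print(configs)
```

For the crooked-barrier quiz this produces the two configurations shown on the slides: one run with `MPI_Irecv(P0)` and one with `MPI_Irecv(P1)`.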
35
If we now have P0-P2 doing this, and P3-5 doing the same computation between themselves, no need to interleave these groups…
P0---
MPI_Isend ( P2 )
MPI_Barrier
P1---
MPI_Barrier
MPI_Isend( P2 )
P2---
MPI_Irecv ( * )
MPI_Barrier
P3---
MPI_Isend ( P5 )
MPI_Barrier
P4---
MPI_Barrier
MPI_Isend( P5 )
P5---
MPI_Irecv ( * )
MPI_Barrier
36
Why is all this worth doing ?
37
MPI is the de-facto standard for programming cluster machines
Our focus: Help Eliminate Concurrency Bugs from HPC Programs
Apply similar techniques for other APIs also (e.g. PThreads, OpenMP)
(BlueGene/L - Image courtesy of IBM / LLNL) (Image courtesy of Steve Parker, CSAFE, Utah)
38
The success of MPI (Courtesy of Al Geist, EuroPVM / MPI 2007)
39
The Need for Formal Semantics for MPI
– Send
– Receive
– Send / Receive
– Send / Receive / Replace
– Broadcast
– Barrier
– Reduce
– Rendezvous mode
– Blocking mode
– Non-blocking mode
– Reliance on system buffering
– User-attached buffering
– Restarts/Cancels of MPI Operations
– Non Wildcard receives
– Wildcard receives
– Tag matching
– Communication spaces
An MPI program is an interesting (and legal) combination of elements from these spaces
40
The core count rises but the number of pins on a socket is fixed. This accelerates the decrease in the bytes/flops ratio per socket.
The bandwidth to memory (per core) decreases
The bandwidth to interconnect (per core) decreases
The bandwidth to disk (per core) decreases
Multi-core – how it affects MPI (Courtesy, Al Geist)
MPI Library Implementations Would Also Change
Need Formal Semantics for MPI, because we can’t imitate any existing implementation…
41
Look for commonly committed mistakes automatically
– Deadlocks
– Communication Races
– Resource Leaks
We are only after “low hanging” bugs…
42
Deadlock pattern…
P0        P1
---       ---
s(P1);    s(P0);
r(P1);    r(P0);

P0        P1
---       ---
Bcast;    Barrier;
Barrier;  Bcast;
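The first pattern (both processes send before they receive) can be checked with a tiny simulation. The sketch assumes unbuffered, rendezvous-style sends, so a send completes only when the peer is at the matching receive; the encoding of operations as `("s", peer)` and `("r", peer)` tuples and the function name `deadlocks` are illustrative choices. The collective pattern (Bcast/Barrier) is not modeled here.

```python
def deadlocks(programs):
    """Return True if rendezvous execution of the programs gets stuck.

    programs maps a rank to its list of ("s", peer) / ("r", peer) operations.
    Assumes no system buffering and no wildcard receives, so matching
    is deterministic and a greedy scan is enough.
    """
    pc = {rank: 0 for rank in programs}
    progress = True
    while progress:
        progress = False
        for rank, ops in programs.items():
            if pc[rank] >= len(ops):
                continue
            kind, peer = ops[pc[rank]]
            if kind != "s":
                continue
            peer_ops = programs[peer]
            # A send completes only if the peer is waiting to receive from us.
            if pc[peer] < len(peer_ops) and peer_ops[pc[peer]] == ("r", rank):
                pc[rank] += 1
                pc[peer] += 1
                progress = True
    return any(pc[rank] < len(ops) for rank, ops in programs.items())


head_to_head = {0: [("s", 1), ("r", 1)], 1: [("s", 0), ("r", 0)]}
reordered = {0: [("s", 1), ("r", 1)], 1: [("r", 0), ("s", 0)]}
print(deadlocks(head_to_head), deadlocks(reordered))
```

Swapping the order on one side, so that one process receives first, removes the deadlock under these assumptions.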
43
Communication Race Pattern…
P0        P1        P2
---       ---       ---
r(*);     s(P0);    s(P0);
r(P1);

OK: r(*) matches P2’s send, so r(P1) can still match P1’s send
NOK: r(*) matches P1’s send, so r(P1) can never be matched
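The two outcomes can be enumerated mechanically. In this sketch, which uses a hypothetical helper `race_outcomes`, each sender is assumed to issue exactly one send to P0, as in the pattern above.

```python
def race_outcomes(senders, second_source):
    """For each possible match of the wildcard receive, report whether the
    following specific receive can still be satisfied.

    senders:       ranks that each issue exactly one send to the receiver.
    second_source: the source named by the receive after the wildcard.
    """
    results = {}
    for first_match in senders:
        remaining = [s for s in senders if s != first_match]
        results[first_match] = "OK" if second_source in remaining else "NOK"
    return results


outcomes = race_outcomes(["P1", "P2"], "P1")
print(outcomes)
```

The program is correct only on the schedule where the wildcard consumes P2's message, which is why every possible match has to be explored.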
44
Resource Leak Pattern…
P0
---
some_allocation_op(&handle);
FORGOTTEN DEALLOC !!
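One way a runtime checker might catch this pattern is to record every allocation and report whatever is still live at the end of the run. The class below is a minimal sketch with made-up names (`HandleTracker`, `allocate`, `free`, `leaks`); it does not reflect how any particular tool tracks MPI handles.

```python
class HandleTracker:
    """Record allocated handles and report those never deallocated."""

    def __init__(self):
        self.live = {}  # handle name -> description of the allocation site

    def allocate(self, handle, site):
        self.live[handle] = site

    def free(self, handle):
        # Remove the handle; freeing an unknown handle is silently ignored.
        self.live.pop(handle, None)

    def leaks(self):
        """Return the handles still live, sorted by name."""
        return sorted(self.live)


tracker = HandleTracker()
tracker.allocate("request_a", "P0: some_allocation_op")
tracker.allocate("request_b", "P0: some_allocation_op")
tracker.free("request_a")
print(tracker.leaks())
```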
45
Bugs are hidden within huge state-spaces…
46
Partial Order Reduction Illustrated…
• With 3 processes, the size of an interleaved state space is p^s = 27
• Partial-order reduction explores representative sequences from each equivalence class
• Delays the execution of independent transitions
• In this example, it is possible to “get away” with 7 states (one interleaving)
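The two counts can be reproduced with a small enumeration. The sketch assumes each of the 3 processes takes two mutually independent transitions (hence 3 local states each); that assumption is chosen because it yields the 27 and 7 quoted on the slide.

```python
from itertools import product

PROCS = 3
STEPS = 2  # transitions per process, so each process has STEPS + 1 local states

# Full interleaved state space: every combination of local states.
full_states = set(product(range(STEPS + 1), repeat=PROCS))

# One representative interleaving: run each process to completion in turn.
state = (0,) * PROCS
visited = [state]
for p in range(PROCS):
    for _ in range(STEPS):
        state = state[:p] + (state[p] + 1,) + state[p + 1:]
        visited.append(state)

print(len(full_states), len(visited))
```

Because all transitions are independent here, the single interleaving reaches the same final state as every other one.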
47
A Deadlock Example… (off by one deadlock)
    // Add-up integrals calculated by each process
    if (my_rank == 0) {
        total = integral;
        /* BUG (off by one): source starts at 0, so rank 0 waits for a
           message from itself that is never sent */
        for (source = 0; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source,
                     tag, MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        MPI_Send(&integral, 1, MPI_FLOAT, dest,
                 tag, MPI_COMM_WORLD);
    }
p1:to 0 p2:to 0 p3:to 0
p0:fr 0 p0:fr 1 p0:fr 2
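The mismatch between the receives rank 0 posts and the sends that actually exist can be modeled in a few lines. The helper name `unmatched_receives` is hypothetical, and the model only compares sources with senders; it does not simulate MPI itself.

```python
def unmatched_receives(p, first_source):
    """Return the sources rank 0 waits on that never send to it.

    p:            number of processes.
    first_source: first value of the receive loop's `source` variable.
    Ranks 1 .. p-1 each send one message to rank 0; rank 0 sends nothing.
    """
    senders = set(range(1, p))
    return [src for src in range(first_source, p) if src not in senders]


print(unmatched_receives(4, 0), unmatched_receives(4, 1))
```

Starting the loop at 0 leaves a receive from rank 0 with no matching send, so the blocking `MPI_Recv` never returns; starting at 1 leaves nothing unmatched.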
48
Organization of ISP
[Diagram: the MPI program goes through simplifications to a simplified MPI program, which is compiled into an executable. Its processes (Proc 1 … Proc n) exchange request/permit messages with a scheduler and reach the actual MPI library and runtime through PMPI calls.]
49
Summary (have posters for each)
• Formal Semantics for a large subset of MPI 2.0
  – Executable semantics for about 150 MPI 2.0 functions
  – User interactions through VisualStudio API
• Direct execution of user MPI programs to find issues
  – Downscale code, remove data that does not affect control, etc
  – New Partial Order Reduction Algorithm
    » Explores only Relevant Interleavings
  – User can insert barriers to contain complexity
    » New Vector-Clock algorithm determines if barriers are safe
  – Errors detected
    » Deadlocks
    » Communication races
    » Resource leaks
• Direct execution of PThread programs to find issues
  – Adaptation of Dynamic Partial Order Reduction reduces interleavings
  – Parallel implementation – scales linearly
50
Also built POR explorer for C / Pthreads programs, called “Inspect”
[Diagram: the multithreaded C/C++ program goes through instrumentation to an instrumented program, which is compiled with a thread library wrapper into an executable. Its threads (thread 1 … thread n) exchange request/permit messages with a scheduler.]
51
Dynamic POR is almost a “must” !
( Dynamic POR as in Flanagan and Godefroid, POPL 2005)
52
Why Dynamic POR ?
a[ j ]++ a[ k ]--
• Ample Set depends on whether j == k
• Can be very difficult to determine statically
• Can determine dynamically
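The point about `a[j]++` and `a[k]--` can be made concrete: once the indices are known at runtime, the dependence check is a simple comparison. The function below is an illustrative sketch with a hypothetical name, not code from Inspect.

```python
def orders_to_explore(j, k):
    """Return the orderings of a[j]++ and a[k]-- that must be explored.

    The two updates are dependent only when they touch the same element,
    which is known at runtime once j and k have concrete values.
    """
    inc = ("a[j]++", j)
    dec = ("a[k]--", k)
    if j == k:
        return [(inc, dec), (dec, inc)]  # same location: try both orders
    return [(inc, dec)]  # different locations: one order represents both


print(len(orders_to_explore(1, 1)), len(orders_to_explore(1, 2)))
```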
53
Why Dynamic POR ?
The notion of action dependence (crucial to POR methods) is a function of the execution
54
Computation of “ample” sets in Static POR versus in DPOR
Ample determined using “local” criteria
Current State
Next move of Red process
Nearest Dependent Transition Looking Back
Add Red Process to “Backtrack Set”
This builds the Ample set incrementally based on observed dependencies
Blue is in “Done” set
{ BT }, { Done }
55
Putting it all together …
• We target C/C++ PThread Programs
• Instrument the given program (largely automated)
• Run the concurrent program “till the end”
• Record interleaving variants while advancing
• When the number of recorded backtrack points reaches a soft limit, spill work to other nodes
• In one larger example, an 11-hour run was finished in 11 minutes using 64 nodes
• A heuristic to avoid recomputations was essential for speed-up
• First known distributed DPOR
56
A Simple DPOR Example
{}, {}
t0:
lock(t)
unlock(t)
t1:
lock(t)
unlock(t)
t2:
lock(t)
unlock(t)
57
For this example, DPOR explores all the paths
For other programs, it will explore a proper subset
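For the three-thread lock example, the feasible schedules can be enumerated directly. The sketch below is a plain depth-first enumeration under mutex semantics, not DPOR itself; it only shows how many distinct paths exist when every thread contends for the same lock `t`.

```python
def schedules(n_threads):
    """Enumerate all feasible interleavings of n threads that each run
    lock(t); unlock(t) on one shared mutex."""
    results = []

    def dfs(pcs, holder, trace):
        if all(pc == 2 for pc in pcs):
            results.append(tuple(trace))
            return
        for t in range(n_threads):
            if pcs[t] == 0 and holder is None:
                # The lock is free, so thread t may acquire it.
                dfs(pcs[:t] + (1,) + pcs[t + 1:], t, trace + [(t, "lock")])
            elif pcs[t] == 1 and holder == t:
                # Thread t holds the lock and may release it.
                dfs(pcs[:t] + (2,) + pcs[t + 1:], None, trace + [(t, "unlock")])

    dfs((0,) * n_threads, None, [])
    return results


paths = schedules(3)
print(len(paths))
```

Every pair of lock operations is dependent, so none of the 3! = 6 orders of critical sections can be pruned.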
58
Idea for parallelization: Explore computations from the backtrack set in other processes.
“Embarrassingly Parallel” – it seems so, anyway !
59
We then devised a work-distribution scheme…
[Diagram labels: load balancer, worker a, worker b; messages: request unloading, idle node id, work description, report result]
60
Speedup on aget
61
Speedup on bbuf
62
Historical Note
• Model Checking
  – Proposed in 1981
  – 2007 ACM Turing Award for Clarke, Emerson, and Sifakis
• Bug discovery facilitated by
  – The creation of simplified models
  – Exhaustively checking the models
    » Exploring only relevant interleavings
63
Looking ahead…
Plans for one year out…
64
Finish tool implementation for MPI and others…
• Static Analysis to reduce some cost
• Inserting Barriers (to contain cost) using new vector-clocking algorithm for MPI
• Demonstrate on meaningful apps (e.g. Parmetis)
• Plug into MS VisualStudio
• Development of PThread (“Inspect”) tool with same capabilities
• Evolving these tools to Transactional Memory, Microsoft TPL, OpenMP, …
65
Thanks Microsoft! and Dennis Crain, Shahrokh Mortazavi
In these times of unpredictable NSF funding, the HPC Institute Program made it possible for us to produce a great cadre of Formal Verification Engineers
Robert Palmer (PhD – to join Microsoft soon), Sonjong Hwang (MS), Steve Barrus (BS), Salman Pervez (MS)
Yu Yang (PhD), Sarvani Vakkalanka (PhD), Guodong Li (PhD), Subodh Sharma (PhD), Anh Vo (PhD), Michael DeLisi (BS/MS), Geof Sawaya (BS)
(http://www.cs.utah.edu/formal_verification)
Microsoft HPC Institutes
NSF CNS 0509379
66
Extra Slides
67
Looking Further Ahead: Need to clear “idea log-jam in multi-core computing…”
“There isn’t such a thing as Republican clean air or Democratic clean air. We all breathe the same air.”
There isn’t such a thing as an architectural-only solution, or a compilers-only solution to future problems in multi-core computing…
68
Now you see it; Now you don’t !
On the menace of non reproducible bugs.
• Deterministic replay must ideally be an option
• User programmable schedulers greatly emphasized by expert developers
• Runtime model-checking methods with state-space reduction hold promise in meshing with current practice…