Argonne National Laboratory School of Computing and SCI Institute, University of Utah
Practical Model-Checking Method for Verifying Correctness of MPI Programs
Salman Pervez, Ganesh Gopalakrishnan, Robert M. Kirby, Robert Palmer
School of Computing
University of Utah
Rajeev Thakur, William Gropp
Mathematics and Computer Science Division
Argonne National Laboratory
• Concurrent algorithms are notoriously hard to design and verify.
• Formal methods, and in particular finite-state model checking, provide a means of reasoning about concurrent algorithms.
• Principal advantages of the model checking approach:
– Provides a formal framework for reasoning
– Allows coverage: examination of all possible process interleavings
• Principal challenges of the model checking approach:
– Requires a modeling step
– Can lead to “state explosion”
Thesis of the Talk
Thesis: In-situ model checking with dynamic partial-order reduction provides the advantages of the model checking approach while ameliorating the challenges.

2/28
Why MPI is Complex: Collision of Features
– Send
– Receive
– Send / Receive
– Send / Receive / Replace
– Broadcast
– Barrier
– Reduce
– Rendezvous mode
– Blocking mode
– Non-blocking mode
– Reliance on system buffering
– User-attached buffering
– Restarts/Cancels of MPI Operations
– Non-wildcard receives
– Wildcard receives
– Tag matching
– Communication spaces
An MPI program is an interesting (and legal) combination of elements from these spaces.
3/28
Conventional Debugging of MPI
• Inspection
– Difficult to carry out on MPI programs (low-level notation)
• Simulation based
– Run given program with manually selected inputs
– Can give poor coverage in practice
• Simulation with runtime heuristics to find bugs
– Marmot: timeout-based deadlock detection, random executions
– Intel Trace Collector: similar checks with data checking
– TotalView: better trace viewing, but still no “model checking”(?)
– We don’t know if any formal coverage metrics are offered
4/28
What is Model Checking?
Navier-Stokes Equations are a mathematical model of fluid flow physics
“V&V” (Validation and Verification): “Validate Models, Verify Codes”

“Formal models”, which translate and abstract algorithms and implementations, can be generated either automatically or by a modeler.
5/28
Related work on FV for MPI programs
• Main related work is that by Siegel and Avrunin
• Provide synchronous channel theorems for blocking and non-blocking MPI constructs
– Deadlocks are caught iff caught using synchronous channels
• Provide a state-machine model for MPI calls
– Have built a tool called MPI_Spin that uses C extensions to Promela to encode the MPI state machine
• Provide a symbolic execution approach to check computational results of MPI programs
• Define a static POR algorithm which ameliorates the second challenge (state explosion)
– Schedules processes in a canonical order
– Schedules sends when receives are posted (synchronous-channel effect)
– Wildcard receives handled through over-approximation
6/28
Traditional Execution Checking Versus Model Checking
“Execution Checking”
“Model Checking”
In current practice, concrete executions on a few diverse platforms are often used to verify algorithms/codes.

Consequence: many feasible executions might not be manifested.
Model checking forces all executions of a judiciously down-scaled model to be examined.
Current focus of our research: minimize modeling effort and error.
7/28
Solution – Runtime (i.e. “In Situ”) Model Checking
• Pioneered by Patrice Godefroid (at Bell Labs)
• Developed in the context of his VeriSoft project; he called it runtime model checking
• Flanagan and Godefroid introduced the dynamic partial-order reduction (DPOR) algorithm in 2005
“In Situ” Model Checking
Fundamental challenges of model checking:
• Model creation (and validation)
• Managing state explosion

Ameliorate the first challenge by running instrumented versions of the code.

Ameliorate the second challenge by pruning the state space based upon the independence of operations.
8/28
[Architecture diagram: Processes 0–3 each connect to a central Scheduler via socket communication]
Our Contribution: In Situ Model Checker For MPI
Consider wildcard receives and their interleavings.
9/28
Code to handle MPI_Win_unlock (in general, this is how every MPI_SomeFunc wrapper is structured):

MPI_Win_unlock(arg1, arg2, ..., argN) {
    /* report the call to the scheduler */
    sendToSocket(pID, Win_unlock, arg1, ..., argN);
    /* block until the scheduler grants a go-ahead, poking the
       MPI progress engine while waiting */
    while (recvFromSocket(pID) != go-ahead)
        MPI_Iprobe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, ...);
    /* issue the real call through the PMPI profiling interface */
    return PMPI_Win_unlock(arg1, arg2, ..., argN);
}

An innocuous progress-engine “poker” (the MPI_Iprobe call) is introduced for handling one-sided MPI.
10/28
Current MPI Constructs Examined
• MPI constructs examined:
– MPI_Init
– MPI_Send
– MPI_Ssend
– MPI_Recv
– MPI_Barrier
– MPI_Finalize
– MPI_Win_lock
– MPI_Win_unlock
– MPI_Put
– MPI_Get
– MPI_Accumulate
11/28
Required creating code that communicated with the scheduler.

Required understanding how the progress engine works in MPICH (with adjustments to the scheduler to employ this information judiciously).
MPI One-Sided Example

Process P0 and Process P1 each execute:
0: MPI_Init
1: MPI_Win_lock
2: MPI_Accumulate
3: MPI_Win_unlock
4: MPI_Barrier
5: MPI_Finalize

The scheduler advances the two processes one granted call at a time (positions shown as P0 / P1):

Current Position: NULL / NULL    Scheduler Options: P0:0, P1:0    Choice: P1:0
Current Position: NULL / P1:0    Scheduler Options: P0:0, P1:1    Choice: P1:1
Current Position: NULL / P1:1    Scheduler Options: P0:0, P1:2    Choice: P1:2
Current Position: NULL / P1:2    Scheduler Options: P0:0, P1:3    Choice: P1:3
Current Position: NULL / P1:3    Scheduler Options: P0:0, P1:4    Choice: P1:4
Current Position: NULL / P1:4    Scheduler Options: P0:0          Choice: P0:0
Current Position: P0:0 / P1:4    Scheduler Options: P0:1          Choice: P0:1 - P0:4
Current Position: P0:4 / P1:4    Scheduler Options: P0:5, P1:5    Choice: ?

Does it matter which choice it makes? Are these independent?

23/28
Partial-Order Reduction
• With 3 processes, the size of the interleaved state space is 3^3 = 27
• Partial-order reduction explores representative sequences from each equivalence class
• Delays the execution of independent transitions
• In this example, it is possible to “get away” with 7 states (one interleaving)
24/28
[Diagram: each state along the explored trace carries three sets, Full = { … }, Enabled = { … }, and Backtrack = { … }, linked by Transitions 1–3]
Run the “instrumented” program to populate the full set of transitions and the enabled set of transitions at each state.
Dynamic Partial-Order Reduction
Given enabled sets E, we want to find backtrack sets B such that B is a proper subset of E and such that B captures representatives of all equivalent executions (under the notion of independence).
25/28
Defining Dependence

MPI Function      Dependent With
MPI_Init          None
MPI_Send          MPI_Send, MPI_Ssend, MPI_Recv
MPI_Ssend         MPI_Send, MPI_Ssend, MPI_Recv
MPI_Recv          MPI_Send, MPI_Ssend
MPI_Barrier       None
MPI_Win_lock      None
MPI_Win_unlock    MPI_Win_unlock
MPI_Win_free      None
MPI_Finalize      None
26/28
Example Benefits: One-Sided Byte-Range Protocol

Program                      Procs    Interleavings without DPOR    Interleavings with DPOR
Byte-range (reduced depth)   2        2289                          119
Byte-range (full depth)      2        -                             1522
27/28
• Formal methods, and in particular finite-state model checking, provide a means of reasoning about concurrent algorithms.
• Principal challenges of the model checking approach:
– Requires a modeling step
– Can lead to “state explosion”
Both of which can be ameliorated by In-Situ Model Checking
Future Work:
• Expand the number of MPI primitives (and the corresponding dependence table)
• Exploit code slicing to remove ancillary operations
Funding Acknowledgements:
• NSF (CSR–SMA: Toward Reliable and Efficient Message Passing Software Through Formal Analysis)
• Microsoft (Formal Analysis and Code Generation Support for MPI)
• Office of Science, Department of Energy
Summary
28/28