Systematic Development of Programs for Parallel and Cloud Computing:
Towards a Framework
NII Lectures Series
Frederic Loulergue
Universite d’Orleans – LIFO – PaMDA Team
October-November 2013
F. Loulergue SyDPaCC – Towards a Framework – Lecture 1 October-November 2013 1 / 63
Joint Work with (alphabetical order)
- Dr. Wadoud Bousdira (Universite d’Orleans)
- Dr. Frederic Dabrowski (Universite d’Orleans)
- Sylvain Dailler (KUT & Universite d’Orleans)
- Dr. Kento Emoto (Kyushu University of Technology)
- Pr. Zhenjiang Hu (National Institute of Informatics)
- Dr. Sylvain Jubertie (Universite d’Orleans)
- Dr. Hab. Frederic Gava (Universite Paris-Est Creteil)
- Dr. Louis Gesbert (OCamlPro)
- Hideki Hashimoto (The University of Tokyo)
- Joeffrey Legaux (Universite d’Orleans)
- Dr. Kiminori Matsuzaki (Kochi University of Technology)
- Dr. Virginia Niculescu (Babes-Bolyai University of Cluj-Napoca)
- Thomas Pinsard (Universite d’Orleans)
- Simon Robillard (Universite d’Orleans)
- Pr. Masato Takeichi (The University of Tokyo)
- Dr. Julien Tesson (Universite Paris-Est Creteil)
Universite d’Orleans, LIFO
Outline
1 Motivations and Background
  - Our Goal
  - Parallel Programming
  - Program Correctness
2 Systematic Development of Programs for Parallel and Cloud Computing
3 An Overview of Next Lectures
  - Lecture 2: Program Verification with the Coq Proof Assistant
  - Lecture 3: Parallel Programming in Coq
  - Lecture 4: Constructive Algorithms in Coq
  - Lecture 5: PowerLists in Coq
  - Lecture 6: Bulk Synchronous Parallel Homomorphisms
  - Lecture 7: Generate-Test-Aggregate in Coq
4 Summary
5 Bibliography
Our Goal
To ease the development of correct and verified parallel programs
with predictable performance,
using theories and tools that allow a user to develop an application
by combining building blocks and implementing short programs that satisfy
conditions which are easily or automatically proved
Parallel Programming
Automatic Parallelization
Structured Parallelism
- Algorithmic Skeletons
- Bridging Models
- Declarative Parallel Programming
- ...
Concurrent & Distributed Programming
Parallel Programming – Bridging Models I
Bridging Model
- Leslie Valiant, in his 1990 CACM paper “A Bridging Model for Parallel Computation” (http://dx.doi.org/10.1145/79173.79181):
  “The von Neumann model is the connecting bridge that enables programs from the diverse and chaotic world of software to run efficiently on machines from the diverse and chaotic world of hardware”
- Valiant’s proposal: Bulk Synchronous Parallelism (BSP)
- Other models: LogP and variants, BSP variants, ...
Parallel Programming – Bridging Models II
Research on BSP: 1990s, by Valiant & McColl
Three models
- abstract architecture
- execution model
- cost model
BSP computer
- p processor / memory pairs (of speed r)
- a communication network (of speed g)
- a global synchronisation unit (of speed L)
Execution model
Cost model
$$T(s) = \max_{0 \le i < p} w_i + h \times g + L \quad \text{where} \quad h = \max_{0 \le i < p} \{h_i^+, h_i^-\}$$
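As a rough illustration of the cost formula above, here is a minimal Python sketch. The function names `superstep_cost` and `program_cost`, and the convention that `sent[i]` and `received[i]` hold the number of words sent and received by processor i (the h_i^+ and h_i^- of the formula), are assumptions made for this example. Summing superstep costs to obtain the cost of a whole program is the usual BSP convention and is not stated on the slide.

```python
def superstep_cost(work, sent, received, g, L):
    """Cost of one BSP superstep: max_i w_i + h * g + L.

    work[i]     : local computation time w_i of processor i
    sent[i]     : number of words sent by processor i (h_i^+)
    received[i] : number of words received by processor i (h_i^-)
    g, L        : network and synchronisation parameters of the BSP computer
    """
    if not (len(work) == len(sent) == len(received)):
        raise ValueError("one entry per processor is expected")
    w = max(work)
    # h is the largest number of words sent or received by any processor
    h = max(max(s, r) for s, r in zip(sent, received))
    return w + h * g + L


def program_cost(supersteps, g, L):
    """Assumed convention: the cost of a program is the sum of its superstep costs."""
    return sum(superstep_cost(w, s, r, g, L) for (w, s, r) in supersteps)
```

With three processors, `superstep_cost([10, 20, 15], [3, 1, 2], [1, 4, 2], g=2, L=50)` gives 20 + 4 × 2 + 50 = 78: the slowest processor and the busiest communication link determine the cost.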
Parallel Programming – Bridging Models III
Applications
- scientific computation [3]
- genetic algorithms [4]
- genetic programming [7]
- neural networks [18]
- parallel databases [1]
- parallel constraints solvers [10]
- ...
Parallel Programming – Algorithmic Skeletons
Algorithmic Skeletons
- Coined by Murray Cole in Algorithmic Skeletons: Structured Management of Parallel Computation, MIT Press, 1989
  http://homepages.inf.ed.ac.uk/mic/Pubs/skeletonbook.ps.gz
- Popular skeletons: Google’s MapReduce
Skeletal Parallelism
- Skeleton = pattern of a parallel algorithm: a higher-order function implemented in parallel, with familiar sequential semantics
- Program = composition of skeletons
Parallel Programming – Algorithmic Skeletons
Libraries of Algorithmic Skeletons
- For C++: SkeTo (http://sketo.ipl-lab.org), OSL (http://traclifo.univ-orleans.fr/OSL), Muesli, QUAFF, ...
- For C: eSkel, SKElib
- For Java: Lithium, Muskel, Calcium, ...
- For functional languages:
  - OCaml: OCamlP3L, Parmap
  - Erlang: Skel
  - Haskell: HaskSkel, Eden skeletons
Algorithmic Skeletons Theory
- List homomorphisms for parallel programming (Cole 1993)
- Many further developments, in particular in Tokyo
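To make the link between list homomorphisms and parallelism concrete, here is a small Python sketch. It relies on the standard definition: h is a list homomorphism when h (x ++ y) = h x ⊙ h y for an associative operator ⊙ with a unit, so h can be written as a reduce after a map. The names `hom`, `chunks` and `hom_by_chunks` are hypothetical, and the chunks are processed sequentially here; the sketch only shows why independent per-chunk evaluation gives the same result.

```python
from functools import reduce


def hom(f, op, unit, xs):
    """Sequential specification: h [] = unit, h [a] = f a, h (x ++ y) = op(h x, h y)."""
    return reduce(op, map(f, xs), unit)


def chunks(xs, p):
    """Split xs into p consecutive chunks (some may be empty)."""
    n = len(xs)
    bounds = [(i * n) // p for i in range(p + 1)]
    return [xs[bounds[i]:bounds[i + 1]] for i in range(p)]


def hom_by_chunks(f, op, unit, xs, p):
    """Evaluate the homomorphism on each chunk independently, then combine the results.

    Correct whenever op is associative with unit as its identity;
    commutativity is not needed because chunk order is preserved.
    """
    partial = [hom(f, op, unit, c) for c in chunks(xs, p)]
    return reduce(op, partial, unit)
```

For instance, the sum of squares of `[3, 1, 4, 1, 5, 9, 2]` is 137 whatever the number of chunks, and the same holds for a non-commutative operator such as string concatenation.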
Parallel Programming – Algorithmic Skeletons
An Example: Variance
$$\frac{1}{n} \sum_{k=0}^{n-1} \left( x_k - \frac{1}{n} \sum_{k=0}^{n-1} x_k \right)^2$$
Variance as an OSL Program
double avg = reduce(plus<double>(), x) / x.getSize();
double variance =
reduce(plus<double>(),
map(bind(minus<double>(),avg, _2), x));
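For comparison, the variance formula can be written with sequential `map` and `reduce` in Python. This is only a reference sketch, not OSL code: the OSL excerpt above, as transcribed, shows the mean and a map/reduce over the deviations, while the sketch below spells out every term of the formula, including the squaring and the final division by n. The function name `variance` is chosen for this example.

```python
from functools import reduce
import operator


def variance(xs):
    """(1/n) * sum_k (x_k - avg)^2, written as two reductions and one map."""
    n = len(xs)
    avg = reduce(operator.add, xs) / n                  # first reduction: the mean
    squared_dev = map(lambda x: (x - avg) ** 2, xs)     # map: squared deviations
    return reduce(operator.add, squared_dev) / n        # second reduction, then / n
```

Each of the three steps is a skeleton call in the skeletal style: the program is a composition of `reduce`, `map` and `reduce`.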
Program Correctness?
Program Verification I
Verification Methods
Some properties:
- Model checking
- Static analysis, in particular abstract interpretation
Functional correctness {P} c {Q}:
- interactive provers:
  - Coq
  - Isabelle/HOL
- deductive program verification: verification condition generator + (SMT) solvers
  - Why3
  - Boogie
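To illustrate what a triple {P} c {Q} states, the following Python sketch checks a precondition P and a postcondition Q at run time around a small program c. The decorator `checked` and the example `maximum` are hypothetical. Note the difference in strength: run-time checks only test the triple on the inputs actually supplied, whereas the provers and verification condition generators listed above establish it for all inputs satisfying P.

```python
def checked(pre, post):
    """Decorator: check precondition P before the call and postcondition Q after it."""
    def wrap(c):
        def run(*args):
            assert pre(*args), "precondition P violated"
            result = c(*args)
            assert post(result, *args), "postcondition Q violated"
            return result
        return run
    return wrap


# P: the list is non-empty.  Q: the result belongs to the list and bounds every element.
@checked(pre=lambda xs: len(xs) > 0,
         post=lambda r, xs: r in xs and all(r >= x for x in xs))
def maximum(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best
```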
Program Verification II
The seL4 operating system kernel Klein et al. [12]
[Excerpt from [12], reproduced on the slide; bracketed numbers inside the excerpt are those of [12]]
Figure 1: The seL4 design process (diagram labels: Design Cycle; Design; Haskell Prototype; Formal Executable Spec; High-Performance C Implementation; User Programs; Hardware Simulator; Proof; Manual Implementation)
[...] prototype in a near-to-realistic setting, we link it with software (derived from QEMU) that simulates the hardware platform. Normal user-level execution is enabled by the simulator, while traps are passed to the kernel model which computes the result of the trap. The prototype modifies the user-level state of the simulator to appear as if a real kernel had executed in privileged mode.
This arrangement provides a prototyping environment that enables low-level design evaluation from both the user and kernel perspective, including low-level physical and virtual memory management. It also provides a realistic execution environment that is binary-compatible with the real kernel. For example, we ran a subset of the Iguana embedded OS [37] on the simulator-Haskell combination. The alternative of producing the executable specification directly in the theorem prover would have meant a steep learning curve for the design team and a much less sophisticated tool chain for execution and simulation.
We restrict ourselves to a subset of Haskell that can be automatically translated into the language of the theorem prover we use. For instance, we do not make any substantial use of laziness, make only restricted use of type classes, and we prove that all functions terminate. The details of this subset are described elsewhere [19,41].
While the Haskell prototype is an executable model and implementation of the final design, it is not the final production kernel. We manually re-implement the model in the C programming language for several reasons. Firstly, the Haskell runtime is a significant body of code (much bigger than our kernel) which would be hard to verify for correctness. Secondly, the Haskell runtime relies on garbage collection which is unsuitable for real-time environments. Incidentally, the same arguments apply to other systems based on type-safe languages, such as SPIN [7] and Singularity [23]. Additionally, using C enables optimisation of the low-level implementation for performance. While an automated translation from Haskell to C would have simplified verification, we would have lost most opportunities to micro-optimise the kernel, which is required for adequate microkernel performance.
Figure 2: The refinement layers in the verification of seL4 (diagram labels: Abstract Specification; Executable Specification; High-Performance C Implementation; Haskell Prototype; Isabelle/HOL; Automatic Translation; Refinement Proof)
2.3 Formal verification
The technique we use for formal verification is interactive, machine-assisted and machine-checked proof. Specifically, we use the theorem prover Isabelle/HOL [50]. Interactive theorem proving requires human intervention and creativity to construct and guide the proof. However, it has the advantage that it is not constrained to specific properties or finite, feasible state spaces, unlike more automated methods of verification such as static analysis or model checking.
The property we are proving is functional correctness in the strongest sense. Formally, we are showing refinement [18]: A refinement proof establishes a correspondence between a high-level (abstract) and a low-level (concrete, or refined) representation of a system.
The correspondence established by the refinement proof ensures that all Hoare logic properties of the abstract model also hold for the refined model. This means that if a security property is proved in Hoare logic about the abstract model (not all security properties can be), refinement guarantees that the same property holds for the kernel source code. In this paper, we concentrate on the general functional correctness property. We have also modelled and proved the security of seL4’s access-control system in Isabelle/HOL on a high level. This is described elsewhere [11,21], and we have not yet connected it to the proof presented here.
Fig. 2 shows the specification layers used in the verification of seL4; they are related by formal proof. Sect. 4 explains the proof and each of these layers in detail; here we give a short summary.
The top-most layer in the picture is the abstract specification: an operational model that is the main, complete specification of system behaviour. The abstract level contains enough detail to specify the outer interface of the kernel, e.g., how system-call arguments are encoded in binary form, and it [...]
from [12]
- Proof assistant: Isabelle/HOL
- Memory model: Sequential Consistency
Program Verification III
Verified compilation
[Diagram]
- Source Language: syntax, operational semantics
- Target Language: syntax, operational semantics
- Compilation Pass: from the source program AST to the target program AST
- Proof of Semantic Preservation: between the execution of the source program and the execution of the target program
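A toy instance of this picture can be sketched in Python: a source language of arithmetic expressions, a target stack machine, one compilation pass, and a check that both semantics agree. Everything here (the expression encoding, the instruction names `push`, `add`, `mul`, and the functions `evaluate`, `compile_expr`, `run`) is hypothetical and chosen for the example. In a verified compiler the agreement is proved once for all programs in a proof assistant; the sketch can only test it on given expressions.

```python
def evaluate(e):
    """Source semantics. An expression is an int or a tuple (op, left, right), op in '+', '*'."""
    if isinstance(e, int):
        return e
    op, left, right = e
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a * b


def compile_expr(e):
    """Compilation pass: expression AST to a list of stack-machine instructions."""
    if isinstance(e, int):
        return [("push", e)]
    op, left, right = e
    instr = ("add",) if op == "+" else ("mul",)
    return compile_expr(left) + compile_expr(right) + [instr]


def run(code):
    """Target semantics: execute the instructions on a stack, return the top of the stack."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "add" else a * b)
    return stack[-1]
```

Semantic preservation for this pass is the statement that `run(compile_expr(e)) == evaluate(e)` for every expression `e`.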
Program Verification IV
The Compcert [13] and CompcertTSO Compilers [19]
from http://compcert.inria.fr
- Proof assistant: Coq
- Transformation passes are proved to preserve semantics
- The compiler is extracted from the formal development
- CompcertTSO: a first step towards parallelism
Program Refinement & Program Calculation
B Method
Abstract Model (B language)
⇓
Refined Model (B language)
⇓
...
Concrete Model (B language)
⇓
Programming Language Source Code
Program Calculation
Problem Specification
⇓
Derivation based on Transformation Theory
⇓
Optimised Algorithm
⇓
Programming Language Source Code
Verification of Parallel Programs?
- Functional correctness:
  - concurrent extension of C
  - BSP-Why
- Some properties:
  - MPI in TLA+
  - ...
SyDPaCC – Systematic Development of Programs for Parallel and Cloud Computing
SyDPaCC: Program Development
Past and Ongoing Projects
Projects: SDPP (J. Tesson PhD), PAPDAS
- Task 1: Algorithmic skeletons based on constructive algorithms
- Task 2: Efficient C++ algorithmic skeleton libraries
- Task 3: A verified compiler for Atomic Sections C
[Diagram labels]
- Problem specification
- Derivation based on proved transformation theory
- Algorithm as composition of Algorithmic Skeletons
- SkeTo (C++); OSL (C++); OSL Prototype (Coq & BSML); SPEED
- Algorithmic Skeletons Verified in Coq
- BSML Axiomatisation; BSML Implementation; OCaml Source Code
- Program generation: Atomic Sections C
- Certified Compiler in Coq
- Parallel Programs: assembly code + system calls for parallelism and/or communications
- Certified Parallel Programs: assembly code + system calls for parallelism and/or communications
The Coq Proof Assistant I
ACM SIGPLAN Software Award 2013
The Coq proof assistant provides a rich environment for interactive development of machine-checked formal reasoning. Coq is having a profound impact on research on programming languages and systems [...] It has been widely adopted as a research tool by the programming language research community [...] Last but not least, these successes have helped to spark a wave of widespread interest in dependent type theory, the richly expressive core logic on which Coq is based.
[...] The Coq team continues to develop the system, bringing significant improvements in expressiveness and usability with each new release.
In short, Coq is playing an essential role in our transition to a new era of formal assurance in mathematics, semantics, and program verification.
The Coq Proof Assistant II
Foundations
- Calculus of inductive constructions
- Curry-Howard correspondence
The Coq Proof Assistant III
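Natural Deduction
$$(v)\ \frac{A \in \Gamma}{\Gamma \vdash A} \qquad (l)\ \frac{\Gamma, A \vdash B}{\Gamma \vdash A \to B} \qquad (a)\ \frac{\Gamma \vdash A \to B \quad \Gamma \vdash A}{\Gamma \vdash B}$$
Simply Typed λ-Calculus
$$(V)\ \frac{x : A \in \Gamma}{\Gamma \vdash x : A} \qquad (L)\ \frac{\Gamma, x : A \vdash e : B}{\Gamma \vdash (\lambda x{:}A.\, e) : A \to B} \qquad (A)\ \frac{\Gamma \vdash e : A \to B \quad \Gamma \vdash e' : A}{\Gamma \vdash (e\ e') : B}$$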
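The three typing rules can be read as a recursive algorithm. The Python sketch below implements them directly; the term and type encodings (tuples tagged `"var"`, `"lam"`, `"app"`, and `("->", A, B)` for A → B) and the names `arrow` and `type_of` are assumptions made for this example, not Coq's internal representation.

```python
def arrow(a, b):
    """The type A -> B."""
    return ("->", a, b)


def type_of(ctx, term):
    """Return the type of term in context ctx (a dict from variable names to types).

    Raises TypeError when no rule applies.
    """
    kind = term[0]
    if kind == "var":                     # rule (V): x : A is in the context
        _, x = term
        if x not in ctx:
            raise TypeError("unbound variable " + x)
        return ctx[x]
    if kind == "lam":                     # rule (L): type the body with x : A added
        _, x, a, body = term
        b = type_of({**ctx, x: a}, body)
        return arrow(a, b)
    if kind == "app":                     # rule (A): function type must match the argument
        _, e1, e2 = term
        t1 = type_of(ctx, e1)
        t2 = type_of(ctx, e2)
        if isinstance(t1, tuple) and t1[0] == "->" and t1[1] == t2:
            return t1[2]
        raise TypeError("ill-typed application")
    raise TypeError("unknown term")
```

Read through the correspondence between the two rule sets, the term λf:A→B. λx:A. f x, whose type is (A → B) → (A → B), is also a natural-deduction proof of that formula.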
The Coq Proof Assistant IV
Curry-Howard Isomorphism
For every formula, there exists a proof of this formula in natural deduction if and only if there exists a λ-term that has this formula as its type.
- Theorem statement ⇔ Type
- Proof ⇔ Program
Coq in practice
- Functional programming language
- Rich type system: allows logical properties to be expressed
- Language for building proofs (i.e. proof terms)
- Program extraction
The Coq Proof Assistant V
The Lecture
- Functional programming in Coq
- Writing specifications
- Proving correctness
- Program extraction
Parallel Programming in Coq I
High-level Building Blocks
- Candidates: algorithmic skeletons
- How to implement them? With Bulk Synchronous Parallel ML (BSML), a dialect/library for OCaml
- How to prove the correctness of the implementation? Using the Coq Proof Assistant
Parallel Programming in Coq II
The Bulk Synchronous Parallel ML Approach
- an efficient functional programming language with formal semantics and easy reasoning about the performance of programs (strict evaluation): ML (Objective Caml flavor)
- a restricted model of parallelism with no deadlock, very limited cases of non-determinism, and a simple cost model: Bulk Synchronous Parallelism
The result is: Bulk Synchronous Parallel ML (BSML)
Parallel Programming in Coq III
Bulk Synchronous Parallel ML
Design principles:
- Small set of parallel primitives
- Universal for bulk synchronous parallelism
- Global view of programs
- Simple formal semantics
BSML:
a sequential functional language
+ a parallel data structure (non nestable)
+ parallel operations on this data structure
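The idea of a parallel data structure with a few operations on it can be simulated sequentially in Python. The sketch below is not BSML: it borrows the names `mkpar`, `apply` and `proj` from BSML's primitives, but the class `Par`, the fixed processor count `P` and the example `parallel_sum` are hypothetical, nothing runs in parallel, and nesting of parallel vectors is not prevented here.

```python
P = 4  # hypothetical, fixed number of processors for this simulation


class Par:
    """A parallel vector: one value per processor, simulated by a plain list."""

    def __init__(self, values):
        assert len(values) == P
        self.values = list(values)


def mkpar(f):
    """Build the vector <f 0, ..., f (P-1)>: processor i holds f(i)."""
    return Par([f(i) for i in range(P)])


def apply(fs, vs):
    """Pointwise application of a vector of functions to a vector of values."""
    return Par([f(v) for f, v in zip(fs.values, vs.values)])


def proj(vs):
    """Turn a parallel vector into a function from processor number to value.

    In the BSP model this is where communication and synchronisation happen.
    """
    values = tuple(vs.values)
    return lambda i: values[i]


def parallel_sum(xs):
    """Sum a list: distribute blocks, sum locally, then gather the partial sums."""
    n = len(xs)
    blocks = mkpar(lambda i: xs[(i * n) // P:((i + 1) * n) // P])
    partial = apply(mkpar(lambda i: sum), blocks)
    total = proj(partial)
    return sum(total(i) for i in range(P))
```

The program is written with a global view: `parallel_sum` is one ordinary function manipulating whole parallel vectors, not a per-processor program.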
Papers and software
- http://traclifo.univ-orleans.fr/BSML
Parallel Programming in Coq IV
The Lecture
- Tutorial on programming with BSML
- BSML in Coq
- Reasoning about BSML programs in Coq
- Applications and experiments
Based on [21, 20]
Constructive Algorithms in Coq I
Program calculation
maximum = hd ∘ sort
if law: ∀ f, f (if a then b else c) = if a then f b else f c
maximum (a :: x)
= { def. of maximum }
  hd (sort (a :: x))
= { def. of sort }
  hd (if a > hd (sort x) then a :: sort x
      else hd (sort x) :: insert a (tail (sort x)))
= { by if law }
  if a > hd (sort x) then hd (a :: sort x)
  else hd (hd (sort x) :: insert a (tail (sort x)))
= { def. of hd }
  if a > hd (sort x) then a else hd (sort x)
= { def. of maximum }
  if a > maximum x then a else maximum x
= { define x ↑ y = if x > y then x else y }
  a ↑ (maximum x)
From: Tesson, Hashimoto, Hu, Loulergue, Takeichi, Program Calculation in Coq, AMAST, June 2010
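The result of the calculation can be checked on examples with a small Python sketch: the specification takes the head of a list sorted in decreasing order, and the derived program uses maximum (a :: x) = a ↑ maximum x. The insertion sort, the singleton base case of `maximum_derived` and all function names are assumptions made for this example; the calculation above proves the equality for all non-empty lists, while the sketch only tests it.

```python
def insert(a, xs):
    """Insert a into a list that is sorted in decreasing order."""
    for i, x in enumerate(xs):
        if a > x:
            return xs[:i] + [a] + xs[i:]
    return xs + [a]


def sort(xs):
    """Insertion sort in decreasing order, so that the head is the largest element."""
    result = []
    for a in xs:
        result = insert(a, result)
    return result


def maximum_spec(xs):
    """Specification: maximum = hd . sort (quadratic time with this sort)."""
    return sort(xs)[0]


def up(x, y):
    """x ↑ y = if x > y then x else y."""
    return x if x > y else y


def maximum_derived(xs):
    """Derived program: maximum (a :: x) = a ↑ maximum x (linear time)."""
    a, rest = xs[0], xs[1:]
    if not rest:          # assumed base case: the maximum of a singleton is its element
        return a
    return up(a, maximum_derived(rest))
```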
Constructive Algorithms in Coq II
Example
Lemma maximum_over_list : ∀ a x d,
  maximum d (a::x) = if a ?> (maximum a x) then a else maximum a x.
Proof.
  Begin.
  LHS
  = { by_def maximum }
    (hd d (sort (a::x))).
  = { unfold sort }
    (hd d (if a ?> (hd a (sort x)) then a::(sort x)
           else (hd a (sort x))::(insert a (tail (sort x))))).
  = { rewrite (if_law (hd d)) }
    (if a ?> hd a (sort x) then hd d (a :: (sort x))
     else hd d (hd a (sort x) :: insert a (tail (sort x)))).
  = { by_def hd; simpl_if }
    (if a ?> hd a (sort x) then a else hd a (sort x)).
  = { by_def maximum }
    (if a ?> (maximum a x) then a else maximum a x).
  [].
Qed.
Constructive Algorithms in Coq III
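Correct Parallelisation
[Diagram: correct parallelisation. Labels: join, fp, f, join. The parallel function fp followed by join gives the same result as join followed by the sequential function f.]
[Diagram: composable correct parallelisation. Labels: join, fp, f, join, gp, g, join. The same square for fp and f, followed by a second one for gp and g.]
From: J. Tesson, Developpement et preuve de correction de programmes paralleles fonctionnels, PhD defence, 2011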
Constructive Algorithms in Coq IV
The Lecture
- Program calculation in Coq
- Examples
- Correct parallelisation
- Application to BSML skeletons/applications
Based on [21, 20]
Powerlists in Coq I
Powerlists (J. Misra)
- singleton powerlist
- p, q powerlists of the same length:
  - tie: p | q, concatenation of p and q
  - zip: p ⋈ q, alternating items from p and q
Powerlists in Coq II
Powerlist laws
- L0 (identical base cases): 〈x〉 | 〈y〉 = 〈x〉 ⋈ 〈y〉
- L1 (dual deconstruction): for P a non-singleton powerlist, there exist r, s, u, v such that:
  (a) P = r | s
  (b) P = u ⋈ v
- L2 (unique deconstruction):
  (a) (〈x〉 = 〈y〉) ≡ (x = y)
  (b) (p | q = u | v) ≡ (p = u ∧ q = v)
  (c) (p ⋈ q = u ⋈ v) ≡ (p = u ∧ q = v)
- L3 (commutativity): (p | q) ⋈ (u | v) = (p ⋈ u) | (q ⋈ v)
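The two constructors and some of the laws can be tried out on Python lists standing in for powerlists. This is only an illustrative model: the names `tie`, `zip_`, `untie` and `unzip` are chosen for this sketch, and the requirement that lengths are powers of two is not enforced beyond checking that both arguments have the same length.

```python
def tie(p, q):
    """p | q : concatenation of two powerlists of the same length."""
    assert len(p) == len(q)
    return p + q


def zip_(p, q):
    """p ⋈ q : alternate the items of p and q (same length required)."""
    assert len(p) == len(q)
    out = []
    for a, b in zip(p, q):
        out += [a, b]
    return out


def untie(pl):
    """Deconstruction for tie: the two halves r, s with pl = r | s."""
    n = len(pl) // 2
    return pl[:n], pl[n:]


def unzip(pl):
    """Deconstruction for zip: the items at even and odd positions, pl = u ⋈ v."""
    return pl[0::2], pl[1::2]
```

For example, with p = [1, 2], q = [3, 4], u = [5, 6], v = [7, 8], both sides of law L3 evaluate to [1, 5, 2, 6, 3, 7, 4, 8].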
Powerlists in Coq III
The Lecture
- Powerlist axiomatisation in Coq
- Programming with powerlists in Coq
- Reasoning about powerlists in Coq
- Applications
Based on [15, 14]
Bulk Synchronous Parallel Homomorphisms I
BSP Homomorphism
[Excerpt reproduced on the slide, translated from French: p. 118, Chapter 7, “Correct parallelisation with the BSP homomorphism”]
[...] the function to the result of its application to a set of sublists corresponds precisely to the definition of a homomorphism.
The last step finally consists in applying the local computation described above, using the global summaries obtained from the preceding computations as the elements l and r. The link between BH and the BSP model becomes explicit in this scheme: the computation has a first phase of local computations, followed by a phase of global communication, then a last phase of local computations; it can therefore be expressed directly as BSP supersteps.
[Figure 7.2: Computing BH in parallel. Processors 0, ..., i, ..., p − 1, each running BHseq. BHSteps: computation of the summaries by g1 and g2; communications; merging of the summaries by ⊕ and ⊗; computation of BHlocal.]
7.1.2 Transforming a mapAround function into a BH
BSP homomorphisms are very powerful because they make it possible to parallelise a computation correctly; however, it is difficult to transform the specification of a program into a BH. Instead of checking directly that a given function can be expressed as a BH, we can use an intermediate representation that is much easier to manipulate intuitively, based on the mapAround skeleton.
This function, close to map, also expresses independent computations on the elements of a list, but allows more complex computations involving all the elements of the list. Intuitively, mapAround applies a function separately to each element of a list, but this function can extract information from the sublists to the left and to the right of the element under consideration [...]
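A sequential reading of mapAround can be sketched in a few lines of Python. The exact signature is an assumption: here the user function receives a triple (left sublist, element, right sublist), and the names `map_around` and `sums_around` are hypothetical.

```python
def map_around(f, xs):
    """Apply f to every element together with the sublists on its left and right."""
    return [f(xs[:i], xs[i], xs[i + 1:]) for i in range(len(xs))]


def sums_around(left, x, right):
    """Example function: pair each element with the sums of its two neighbourhoods."""
    return (sum(left), x, sum(right))
```

`map_around(sums_around, [1, 2, 3])` returns `[(0, 1, 5), (1, 2, 3), (3, 3, 0)]`. This naive version recomputes the sums for every element; in the excerpt's terms, a BH would compute summaries of the left and right neighbourhoods once and feed them to the local computation as l and r.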
Bulk Synchronous Parallel Homomorphisms II
Applications
- Tower building [figure: positions xL, x1, x2, x3, ..., xi, ..., xn−1, xn, xR with heights hL, h1, h2, h3, ..., hi, ..., hn−1, hn, hR]
- Array packing
  $$\mathit{pack}\ p\ [x_0, \dots, x_n] = [\underbrace{x_{i_0}, \dots, x_{i_m}}_{\forall x.\ p\ x = \mathit{true}}, \underbrace{x_{k_0}, \dots, x_{k_{m'}}}_{\forall x.\ p\ x = \mathit{false}}]$$
- All nearest smaller values
  ansv [3, 1, 4, 1, 5, 9, 2] = [(⊥, 1), (⊥, ⊥), (1, 1), (⊥, ⊥), (1, 2), (5, 2), (1, ⊥)]
- Sparse matrix-vector multiplication
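Two of these applications have short sequential specifications, sketched below in Python. The function names are chosen for this example, `None` plays the role of ⊥, and keeping the original order inside each group of `pack` is an assumption. These are naive specifications, not the BSP homomorphism versions.

```python
def pack(p, xs):
    """Array packing: elements satisfying p first, then the others (order kept in each group)."""
    return [x for x in xs if p(x)] + [x for x in xs if not p(x)]


def ansv(xs):
    """All nearest smaller values: for each element, the nearest strictly smaller
    value on its left and on its right, or None when there is none."""
    def nearest_smaller(candidates, x):
        for y in candidates:
            if y < x:
                return y
        return None

    return [(nearest_smaller(reversed(xs[:i]), x), nearest_smaller(xs[i + 1:], x))
            for i, x in enumerate(xs)]
```

On the slide's input, `ansv([3, 1, 4, 1, 5, 9, 2])` returns `[(None, 1), (None, None), (1, 1), (None, None), (1, 2), (5, 2), (1, None)]`, which matches the result shown above. Each element looks at the sublists on both of its sides, so `ansv` fits the mapAround pattern.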
Bulk Synchronous Parallel Homomorphisms III
The Lecture
- BSP homomorphisms theory
- BSP homomorphisms in Coq
- The BH skeleton:
  - in Coq
  - in OSL
- Applications
Generate-Test-Aggregate in Coq I
Generate-Test-Aggregate (K. Emoto, S. Fischer, Z. Hu)
User’s perspective:
G Generator to produce all candidates
T Tester for validity check of candidates
A Aggregator to build the final result
Optimization:
- Filter embedding: G T A ⇒ G A
- Semiring Fusion: G A ⇒ D&C
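The user's perspective can be illustrated with a knapsack problem, chosen here as a hypothetical example. The Python sketch below first writes the naive specification as generate, test, aggregate, then a hand-written efficient version. The second function is not the output of the GTA theorems; it is only meant to suggest the direction of the optimisation, where the test is folded into the aggregation (one entry per total weight) and no candidate list is ever built. All names are assumptions.

```python
from itertools import combinations


def gta_naive(items, capacity):
    """items: list of (weight, value) pairs. Exponential: enumerates every selection."""
    candidates = (c for k in range(len(items) + 1)
                  for c in combinations(items, k))                      # G: all sublists
    valid = (c for c in candidates
             if sum(w for w, _ in c) <= capacity)                       # T: weight limit
    return max(sum(v for _, v in c) for c in valid)                     # A: best total value


def gta_fused(items, capacity):
    """Same result without generating candidates.

    best[w] is the best total value among selections of total weight w <= capacity.
    """
    best = {0: 0}
    for w, v in items:
        new = dict(best)
        for total_w, total_v in best.items():
            if total_w + w <= capacity:
                key = total_w + w
                new[key] = max(new.get(key, total_v + v), total_v + v)
        best = new
    return max(best.values())
```

With items `[(1, 30), (2, 20), (2, 50), (3, 10)]` and capacity 4, both functions return 80 (the items of weight 1 and value 30, and weight 2 and value 50).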
Generate-Test-Aggregate in Coq II
The Lecture
- Overview of GTA
- GTA in Coq:
  - Data structures
  - GTA optimisation theorem
  - Support for automatic parallelisation
  - Program extraction?
Based on [8, 9] and ongoing work
Bibliography I
[1] M. Bamha and M. Exbrayat. Pipelining a Skew-Insensitive Parallel Join Algorithm. Parallel Processing Letters, 13(3):317–328, 2003.
[2] Y. Bertot and P. Castéran. Interactive Theorem Proving and Program Development. Springer, 2004.
[3] R. Bisseling. Parallel Scientific Computation. A structured approach using BSP and MPI. Oxford University Press, 2004.
[4] A. Braud and C. Vrain. A parallel genetic algorithm based on the BSP model. In Evolutionary Computation and Parallel Processing GECCO & AAAI Workshop, Orlando (Florida), USA, 1999.
[5] M. Cole. Algorithmic Skeletons: Structured Management of Parallel Computation. MIT Press, 1989. Available at http://homepages.inf.ed.ac.uk/mic/Pubs.
[6] M. Cole. Parallel Programming with List Homomorphisms. Parallel Processing Letters, 5(2):191–203, 1995.
Bibliography II
[7] D. C. Dracopoulos and S. Kent. Speeding up genetic programming: A parallel BSP implementation. In First Annual Conference on Genetic Programming. MIT Press, July 1996.
[8] K. Emoto, S. Fischer, and Z. Hu. Generate, Test, and Aggregate – A Calculation-based Framework for Systematic Parallel Programming with MapReduce. In ESOP, volume 7211 of LNCS, pages 254–273. Springer, 2012. doi:10.1007/978-3-642-28869-2_13.
[9] K. Emoto, S. Fischer, and Z. Hu. Filter-embedding semiring fusion for programming with MapReduce. Formal Asp. Comput., 24(4-6):623–645, 2012. doi:10.1007/s00165-012-0241-8.
[10] L. Granvilliers, G. Hains, Q. Miller, and N. Romero. A system for the high-level parallelization and cooperation of constraint solvers. In Y. Pan, S. G. Akl, and K. Li, editors, Proceedings of International Conference on Parallel and Distributed Computing and Systems (PDCS), pages 596–601, Las Vegas, USA, 1998. IASTED/ACTA Press.
Bibliography III
[11] Z. Hu, H. Iwasaki, and M. Takeichi. Formal derivation of efficient parallel programs by construction of list homomorphisms. ACM Trans. Program. Lang. Syst., 19(3):444–461, 1997. ISSN 0164-0925. doi:10.1145/256167.256201.
[12] G. Klein, K. Elphinstone, G. Heiser, J. Andronick, D. Cock, P. Derrin, D. Elkaduwe, K. Engelhardt, R. Kolanski, M. Norrish, T. Sewell, H. Tuch, and S. Winwood. seL4: formal verification of an OS kernel. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles (SOSP’09), pages 207–220. ACM, 2009. ISBN 978-1-60558-752-3. doi:10.1145/1629575.1629596.
[13] X. Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363–446, 2009. doi:10.1007/s10817-009-9155-4.
[14] F. Loulergue, V. Niculescu, and S. Robillard. Powerlists in Coq: Programming and Reasoning. In First International Symposium on Computing and Networking (CANDAR). IEEE Computer Society, 2013. To appear.
Bibliography IV
[15] J. Misra. Powerlist: A structure for parallel recursion. ACM Trans. Program. Lang. Syst., 16(6):1737–1767, November 1994.
[16] A. Morihata, K. Matsuzaki, Z. Hu, and M. Takeichi. The third homomorphism theorem on trees: downward & upward lead to divide-and-conquer. In Z. Shao and B. C. Pierce, editors, Symposium on Principles of Programming Languages (POPL), pages 177–185. ACM Press, 2009. doi:10.1145/1480881.1480905.
[17] K. Morita, A. Morihata, K. Matsuzaki, Z. Hu, and M. Takeichi. Automatic Inversion Generates Divide-and-Conquer Parallel Programs. In Conference on Programming Language Design and Implementation (PLDI), pages 146–155. ACM Press, 2007. doi:10.1145/1250734.1250752.
[18] R. O. Rogers and D. B. Skillicorn. Using the BSP cost model to optimise parallel neural network training. Future Generation Computer Systems, 14(5-6):409–424, 1998.
Bibliography V
[19] J. Ševčík, V. Vafeiadis, F. Z. Nardelli, S. Jagannathan, and P. Sewell. CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency. Journal of the ACM, 60(3):22, 2013. doi:10.1145/2487241.2487248.
[20] J. Tesson. Environnement pour le développement et la preuve de correction systématiques de programmes parallèles fonctionnels. PhD thesis, LIFO, University of Orleans, November 2011. URL http://hal.archives-ouvertes.fr/tel-00660554/en/.
[21] J. Tesson, H. Hashimoto, Z. Hu, F. Loulergue, and M. Takeichi. Program Calculation in Coq. In Thirteenth International Conference on Algebraic Methodology And Software Technology (AMAST 2010), LNCS 6486, pages 163–179. Springer, 2010. doi:10.1007/978-3-642-17796-5_10.
[22] The Coq Development Team. The Coq Proof Assistant. http://coq.inria.fr.
[23] L. G. Valiant. A bridging model for parallel computation. Commun. ACM, 33(8):103, 1990. doi:10.1145/79173.79181.
Coq Hands-On
- To try during the presentation, and/or
- For hands-on practice with Coq after the lecture
Installing Coq
Coq website: http://coq.inria.fr
Recommended installation:
- Windows: installer from Coq’s website
- Linux: Ubuntu packages (coq, coqide, proofgeneral-coq)
- MacOS: MacPorts or Homebrew
Recommended IDE: Emacs + Proof General