![Page 1: Parallel Object Programming in POP- C++: a case study for ...pmaa06.irisa.fr/pres/13-Jiogo-PMAA06.pdf · Method call semantics : definition 1 - An arriving concurrent call can be](https://reader036.vdocument.in/reader036/viewer/2022081613/5f9ef6cb36131522af35a72c/html5/thumbnails/1.jpg)
Parallel Object Programming in POP-C++: a case study for sparse matrix-vector multiplication
Clovis Dongmo Jiogo, Pierre Manneback (Faculté polytechnique de Mons)
Pierre Kuonen (University of Fribourg)
Purpose of this work
Test POP-C++ for some scientific computations on Grids
• Present the parallel programming model POP-C++
• Evaluate its performance in a Grid environment
• Show how POP-C++ can improve matrix computations
Agenda
• Overview of POP-C++
• Sparse Matrix/Vector product
• Programming in POP-C++
• Experimental results
• Future work
POP: Parallel Object Programming
[Figure: an object-oriented application, made of several objects, executes on a Grid environment]
Object-oriented application:
• Distributed objects
• Heterogeneous
• Dynamic
Grid environment:
• Heterogeneous
• Large scale
• Unstructured
• Dynamic and unknown topology
Approach of POP-C++
• Service-oriented approach
• Resource allocation driven by object requirements
• Various invocation semantics
• Object-oriented parallel programming paradigm (parallel objects)
• Object-oriented programming system
POP-C++ Programming Model
• Extension of the C++ language
• Data transmission via shared objects
• Two levels of parallelism
    • Inter-object parallelism
    • Intra-object parallelism
• Transparent and dynamic object allocation guided by each object's resource needs
• Capacity to glue to Grid toolkits
Invocation semantics: interface side
Two ways to call a method:
• Synchronous: the method returns when the execution is finished. Same semantics as a sequential invocation.
• Asynchronous: the method returns immediately. Allows parallelism, but no value is returned.
[Figure: Object 1 calling Object 2 in each mode; the asynchronous call leads to parallel execution]
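The two invocation modes can be imitated in standard C++. This is only a sketch and not POP-C++ syntax: `Worker`, `compute`, `computeAsync` and `demo` are hypothetical names, and a local thread stands in for a remote object. The synchronous call blocks until its result is available; the asynchronous call returns at once and, as stated above, carries no return value.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for a remote object; not POP-C++ syntax.
struct Worker {
    std::atomic<int> done{0};

    // Synchronous style: the caller blocks until the value is returned.
    int compute(int x) { return x * x; }

    // Asynchronous style: the caller does not wait, so no value is returned.
    // The future is only a handle used to join the background work later.
    std::future<void> computeAsync(int x) {
        return std::async(std::launch::async, [this, x] {
            std::this_thread::sleep_for(std::chrono::milliseconds(20));
            done += x;
        });
    }
};

int demo() {
    Worker w;
    int r = w.compute(3);               // finished when the call returns
    auto pending = w.computeAsync(4);   // returns immediately
    int meanwhile = w.compute(2);       // caller keeps working in parallel
    pending.wait();                     // join before reading shared state
    return r + meanwhile + w.done.load();
}
```

The caller has to synchronize explicitly (here with `pending.wait()`) before it reads state modified by the asynchronous call.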
Method call semantics : definition
1 - An arriving concurrent call can be executed concurrently (time sharing) as soon as it arrives, unless mutex calls are pending or executing. In the latter case it is executed after completion of all previously arrived mutex calls.
2 - An arriving sequential call is executed after completion of all sequential and mutex calls previously arrived.
3 - An arriving mutex call is executed after completion of all calls previously arrived.
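The three rules can be written as a small predicate. The sketch below is an illustration in standard C++, not part of the POP-C++ runtime; `Kind`, `mustWait` and `delayedOnArrival` are hypothetical names. `delayedOnArrival` additionally assumes, for simplicity, that no earlier call has completed when the next one arrives.

```cpp
#include <cassert>
#include <vector>

enum class Kind { Conc, Seq, Mutex };

// Returns true if a call of kind `arriving` must be delayed, given the kinds
// of the calls that arrived earlier and have not completed yet.
bool mustWait(Kind arriving, const std::vector<Kind>& unfinished) {
    for (Kind k : unfinished) {
        switch (arriving) {
            case Kind::Conc:   // rule 1: waits only for earlier mutex calls
                if (k == Kind::Mutex) return true;
                break;
            case Kind::Seq:    // rule 2: waits for earlier seq and mutex calls
                if (k != Kind::Conc) return true;
                break;
            case Kind::Mutex:  // rule 3: waits for every earlier call
                return true;
        }
    }
    return false;
}

// Marks which calls of a sequence are delayed on arrival, assuming (for
// illustration only) that no earlier call has completed yet.
std::vector<bool> delayedOnArrival(const std::vector<Kind>& calls) {
    std::vector<bool> delayed;
    std::vector<Kind> unfinished;
    for (Kind c : calls) {
        delayed.push_back(mustWait(c, unfinished));
        unfinished.push_back(c);
    }
    return delayed;
}
```

Under that assumption, the sequence seq, conc, seq, conc, seq, mutex, conc delays the third, fifth, sixth and seventh calls.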
Method call semantics : example
All calls are asynchronous. O1 issues the following calls to O2:
O2.Mseq(), O2.Mconc(), O2.Mseq(), O2.Mconc(), O2.Mseq(), O2.Mmut(), O2.Mconc()
[Figure: timeline of the calls between O1 and O2; four of the calls are marked "Delayed"]
POP-C++ Syntax
POP-C++ is an implementation of the parallel object model as an extension of C++ with six new keywords:
• parclass : to declare a parallel class
• async : asynchronous method call
• sync : synchronous method call
• conc : concurrent method execution
• seq : sequential method execution
• mutex : mutex method execution
POP-C++ architecture
A multi-layer architecture
Integration of new middleware into the system in a plug-and-play fashion
[Figure: layered architecture] Layers shown:
• POP-C++ programming
• POP-C++ essential service abstractions
• Customizable service implementations: POP-C++ services for Globus, POP-C++ services for XtremWeb, POP-C++ services for testing, other customizable services
• Computational environment: Globus Toolkit (Grid), XtremWeb (Web computing), standalone POP-C++ (testing distributed environment), other toolkits (other distributed environments)
Requirement-driven objects
• Each parallel object has a user-specified object description (OD)
• The OD describes the requirements of the parallel object
• The OD is used as a guideline for allocating resources and for object migration
• The OD can be expressed in terms of:
    • Maximum computing power (e.g. Mflops)
    • Communication bandwidth with its interface
    • Memory needed
• The OD can be parameterized on each parallel object (based on the actual input)
Object description example
    parclass Matrix {
        Matrix(int n) @{
            od.power(300, 100);
            od.memory(n*n*sizeof(double)/1E6);
            od.protocol("socket http");
            …
        }
    }
The creation of an object of the Matrix parallel class requires:
• A computing power of 300 Mflops, although 100 Mflops is acceptable
• A memory capacity of n*n*sizeof(double)/1E6 MB
• Socket or HTTP as the communication protocol
Sparse storage format : CRS
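    11   0  14   0   0
     0  22   0   0   0
     0   0   0   0   0
    14   0   0   0  45
    15   0   0  45   0
Row_ptr[*] = [1; 3; 4; 4; 6; 8]
Col_ind[*] = [1; 3; 2; 1; 5; 1; 4]
Mat_val[*] = [11; 14; 22; 14; 45; 15; 45]
The CRS (Compressed Row Storage) data structure uses three vectors. Indices are 1-based here. Row_ptr has one entry per row plus a final end marker, so the empty third row appears as a repeated pointer (4; 4).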
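As an illustration of how the three CRS vectors are filled, the following sketch converts a dense matrix to CRS in standard C++. The names `Crs` and `toCrs` are hypothetical, and the indices are 0-based (the C++ convention) rather than 1-based as on the slide.

```cpp
#include <cassert>
#include <vector>

struct Crs {
    std::vector<int> row_ptr;     // n+1 entries; row i spans [row_ptr[i], row_ptr[i+1])
    std::vector<int> col_ind;     // column index of each stored value
    std::vector<double> mat_val;  // nonzero values, row by row
};

// Builds 0-based CRS arrays from a dense row-major matrix.
Crs toCrs(const std::vector<std::vector<double>>& dense) {
    Crs m;
    m.row_ptr.push_back(0);
    for (const auto& row : dense) {
        for (int j = 0; j < static_cast<int>(row.size()); ++j) {
            if (row[j] != 0.0) {
                m.col_ind.push_back(j);
                m.mat_val.push_back(row[j]);
            }
        }
        // End of this row is the start of the next one.
        m.row_ptr.push_back(static_cast<int>(m.mat_val.size()));
    }
    return m;
}
```

For the 5×5 example matrix this yields `row_ptr = [0, 2, 3, 3, 5, 7]` and `col_ind = [0, 2, 1, 0, 4, 0, 3]`; an empty row shows up as two equal consecutive pointers.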
Sparse Matrix/vector partitioning
[Figure: a sparse matrix split into row blocks assigned to resources R1, R2 and R3, multiplied by a vector to give the result vector]
Sparse matrix is partitioned according to the resource power
Distribution model
[Figure: sparse matrix A split into row blocks A1, A2, A3, A4; a chart of the execution time of each block, with the level Tminimal marked]
Find a matrix partitioning which minimizes the total execution time?
Balancing Heuristic
Objectives:
• Load balancing: $$W_i \approx (1+\varepsilon)\,\frac{p_i}{k\,p_{avg}}\,W = (1+\varepsilon)\,\frac{p_i}{\sum_{j} p_j}\,W$$
• Fast : linear computing time
• Efficient : ε << 1
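One possible linear-time heuristic in this spirit is sketched below. It is not necessarily the authors' algorithm. It assumes that the work of a row block is its number of nonzeros, that W is the total number of nonzeros, that p_i is the power of resource i and that blocks are contiguous row ranges; `partitionRows` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns k+1 row boundaries b: resource i receives rows [b[i], b[i+1]).
// row_ptr is a CRS row pointer array with n+1 entries; power has k entries.
std::vector<int> partitionRows(const std::vector<int>& row_ptr,
                               const std::vector<double>& power) {
    const int n = static_cast<int>(row_ptr.size()) - 1;
    const double total_work = row_ptr[n] - row_ptr[0];
    const double total_power = std::accumulate(power.begin(), power.end(), 0.0);

    std::vector<int> bounds{0};
    double cumulative_power = 0.0;
    int row = 0;  // never moves backwards, so the sweep is linear in n + k
    for (std::size_t i = 0; i + 1 < power.size(); ++i) {
        cumulative_power += power[i];
        // Nonzeros that resources 0..i should hold together.
        const double target = total_work * cumulative_power / total_power;
        while (row < n && row_ptr[row + 1] - row_ptr[0] <= target) ++row;
        bounds.push_back(row);
    }
    bounds.push_back(n);  // the last resource takes the remaining rows
    return bounds;
}
```

With ten rows of ten nonzeros each, powers {1, 1} split the rows 5/5 and powers {3, 1} split them 7/3. The deviation from the ideal share plays the role of ε and stays small when no single row dominates the workload.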
The parallel class SparseMatrix
    parclass SparseMatrix {
    public:
        SparseMatrix(int wanted, int min) @{ od.power(wanted, min); };
        seq async void Init([in, size=n+1] int *row_ptr, int n, …);
        seq async void MvMultiply([in, size=n] double *vector, int n);
        mutex sync int GetResult([out, size=m] double *V, int m);
    private:
        double *mat_val, *vect_res;
        int *col_ind, *row_ptr;
        …
    };
The object requirements are defined by the constructor
Minimal extension of C++
C++:
    class Foo { …
        Foo(…);
        void Mymethod(…);
    };
POP-C++:
    parclass Foo { …
        Foo(…) @{ power = 100; };
        conc async void Mymethod(…);
    };
Shared implementation:
    Constructor: Foo::Foo(…) { … }
    Method: void Foo::Mymethod(…) { … }
Methods are implemented in C++
    …
    void SparseMatrix::MvMultiply(double *vector, int n) {
        for (int i = 0; i < n; i++) {
            vect_res[i] = 0.0;
            for (int j = row_ptr[i]; j < row_ptr[i+1]; j++)
                vect_res[i] += mat_val[j] * vector[col_ind[j]];
        }
    }
    …
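The same kernel can be tried outside POP-C++ as a free function. The sketch below is a standalone standard C++ version, assuming 0-based CRS arrays; `mvMultiply` is a hypothetical name and is not the class method itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes y = A * x for a matrix A stored in 0-based CRS form.
std::vector<double> mvMultiply(const std::vector<int>& row_ptr,
                               const std::vector<int>& col_ind,
                               const std::vector<double>& mat_val,
                               const std::vector<double>& x) {
    const std::size_t n = row_ptr.size() - 1;  // number of rows
    std::vector<double> y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        // Only the stored nonzeros of row i contribute to y[i].
        for (int j = row_ptr[i]; j < row_ptr[i + 1]; ++j) {
            y[i] += mat_val[j] * x[col_ind[j]];
        }
    }
    return y;
}
```

Each row is computed independently of the others, which is what allows the matrix to be split by rows across several parallel objects.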
Execution steps
[Figure: execution steps across resources R1 to R4. Labels shown: data files (row_ptr, mat_val, col_ind, vector); power_ptr (5, 7, 2, 1; R4, R3, R2, R1); SetMatVarDataMatDist; PartitionMatrix; Init; MvMultiply; GetResult; ComputeResult]
Experimental Platform
PC properties
• AMD Athlon, 2 GHz
• 256 MB of RAM
• Fast Ethernet
Cluster properties
• Sun Fire V20 cluster
• 10 dual-Opteron nodes
• 1.8 GHz
• 1 GB of RAM
• Gigabit Ethernet
Test matrices
| | Matrix name | Application domain | Size (n) | Nonzeros (m) |
|---|---|---|---|---|
| (a) | fidap | Finite element modeling | 16614 | 1091362 |
| (b) | poisson3Db | Finite element modeling | 85623 | 2374949 |
| (c) | Stanford-web | Web crawling | 281903 | 2382912 |
| (d) | Stanford-w.b. | Web crawling | 685230 | 8006115 |

Matrix Market format: each entry is stored as <i> <j> <Aij>
Experimental results
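Total execution time for 1000 iterations, by number of processors

| Type | #1 | #2 | #4 | #8 | #12 |
|---|---|---|---|---|---|
| POP-C++ | 108.2 | 62.8 | 31.4 | 22.9 | 22.7 |
| LAM/MPI | 96.5 | 52.6 | 39.2 | 20.7 | 16.9 |
| POP-C++ | 230.3 | 120.0 | 63.3 | 41.4 | 36.4 |
| LAM/MPI | 215.6 | 111.6 | 73.8 | 43.2 | 33.6 |
| POP-C++ | 267.7 | 112.4 | 80.5 | 49.2 | 48.4 |
| LAM/MPI | 173.5 | 101.3 | 64.5 | 46.2 | 46.8 |

Each POP-C++ and LAM/MPI pair of rows corresponds to one test matrix. The matrix labels on the slide read (d), (c), (b); their pairing with the rows is not recoverable from the layout.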
Experimental results
[Figure: two plots of execution time in seconds against the number of processors (#1, #2, #4, #8, #12), each comparing POP-C++ and LAM/MPI. Panels: Matrix (b) and Matrix (d)]
Future work
• Improve the performance by coupling POP-C++ with MPI
• Set up a scheduler for task assignment
• Implement iterative methods in a Grid environment, based on a load-balancing heuristic
• Evaluate POP-C++ performance in a Globus environment