pagrid: a mesh partitioner for computational grids
![Page 1: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/1.jpg)
PaGrid: A Mesh Partitioner for Computational Grids
Virendra C. Bhavsar
Professor and Dean
Faculty of Computer Science
UNB, Fredericton
[email protected]
This work is done in collaboration with Sili Huang and Dr. Eric Aubanel.
![Page 2: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/2.jpg)
Outline
Introduction
Background
PaGrid Mesh Partitioner
Experimental Results
Conclusion
![Page 3: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/3.jpg)
Advanced Computational Research Laboratory
Virendra C. Bhavsar
![Page 4: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/4.jpg)
ACRL Facilities
![Page 5: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/5.jpg)
ACEnet Project
ACEnet (Atlantic Computational Excellence Network) is Atlantic Canada's entry into this national fabric of HPC facilities.
A partnership of seven institutions, including UNB, MUN, MTA, Dalhousie, StFX, SMU, and UPEI.
ACEnet was awarded $9.9M by the CFI in March 2004. The project will be worth nearly $28M.
![Page 6: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/6.jpg)
Mesh Partitioning Problem
[Figure: (a) Heat distribution problem: a metal plate with an enlarged view of grid point (i, j) and its four neighbours; (b) Corresponding application graph]
$$h_{i,j} = \frac{h_{i-1,j} + h_{i+1,j} + h_{i,j-1} + h_{i,j+1}}{4}$$
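The averaging rule for the heat distribution problem can be sketched as a Jacobi sweep over a small plate. The grid size, the boundary temperatures, and the function name `jacobi_step` are illustrative assumptions, not taken from the slides.

```python
def jacobi_step(h):
    """Return a new grid where each interior point is the average of its four neighbours."""
    n, m = len(h), len(h[0])
    new = [row[:] for row in h]  # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = (h[i - 1][j] + h[i + 1][j] + h[i][j - 1] + h[i][j + 1]) / 4.0
    return new


# Assumed example: 5 x 5 plate, top edge held at 100, all other edges at 0.
plate = [[100.0] * 5] + [[0.0] * 5 for _ in range(4)]
for _ in range(200):
    plate = jacobi_step(plate)
```

Each grid point depends only on its four neighbours, which is why the application graph has an edge between every pair of adjacent grid points.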
![Page 7: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/7.jpg)
Mesh Partitioning Problem
Mapping of the mesh onto the processors while minimizing the inter-processor communication cost
Balance the computational load among processors
[Figure: (a) Homogeneous system graph: processors p0, p1, p2, p3 with every link of weight 1; (b) A partition with homogeneous partitioning]
Cut Edges: p0: 8, p1: 8, p2: 8, p3: 8
Total Cut Edges: 16
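Counting cut edges for a given partition can be sketched as below. The slides do not state the mesh size; an 8 x 8 grid split into four quadrants is assumed here because it reproduces the counts listed on this slide. The function names are hypothetical.

```python
def grid_edges(n):
    """Edges of an n x n grid graph; vertices are (row, col) pairs."""
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                edges.append(((r, c), (r, c + 1)))
            if r + 1 < n:
                edges.append(((r, c), (r + 1, c)))
    return edges


def cut_edge_counts(edges, part):
    """Return (cut edges touching each processor, total cut edges)."""
    per_proc = {}
    total = 0
    for u, v in edges:
        if part[u] != part[v]:
            total += 1  # each cut edge is counted once in the total
            per_proc[part[u]] = per_proc.get(part[u], 0) + 1
            per_proc[part[v]] = per_proc.get(part[v], 0) + 1
    return per_proc, total


n = 8  # assumed mesh size
quadrants = {(r, c): 2 * (r // 4) + (c // 4) for r in range(n) for c in range(n)}
per_proc, total = cut_edge_counts(grid_edges(n), quadrants)
```

Every cut edge touches two processors, so the per-processor counts sum to twice the total.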
![Page 8: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/8.jpg)
Computational Grids
The slide is from the Centre for Unified Computing, University College Cork, Ireland
![Page 9: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/9.jpg)
Computational Grid Applications
Computational Fluid Dynamics
Computational Mechanics
Bioinformatics
Condensed Matter Physics Simulation
The slide is from Fluent.com, University of California San Diego, George Washington University, Ohio State University
![Page 10: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/10.jpg)
A Computational Grid Model
Computational Grids and their heterogeneity in both processors and networks
[Figure: processors p0 to p9 grouped into two clusters, Cluster 1 and Cluster 2; an edge weight of 4 is shown]
![Page 11: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/11.jpg)
Mesh Partitioning Problem
[Figure: (a) Processor graph: processors p2, p0, p1, p3 with link weights 1, 1, 2; (b) Optimal partition with a homogeneous partitioner; (c) Optimal partition with a heterogeneous partitioner]
(b) Total Cut Edges: 16, Total Communication Cost: 40
(c) Total Cut Edges: 24, Total Communication Cost: 32
Equation: Total communication cost
$$\sum_{(u,v) \in E_c} |(u,v)| \cdot |(\pi(u), \pi(v))|$$
where $|(u,v)|$ denotes the weight of the edge $(u,v)$ in the application graph, $E_c$ is the set of edges cut by the partition, and $\pi(v)$ represents the processor to which vertex $v$ is assigned in the mapping.
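The total communication cost can be sketched as a sum over cut edges of edge weight times the path length between the two processors. This is a minimal sketch under several assumptions that are not stated on the slides: the processor graph is read as a chain p2, p0, p1, p3 with link weights 1, 1, 2; the mesh is an 8 x 8 grid with unit edge weights; and the strip and quadrant placements are chosen because they reproduce the totals shown on this slide.

```python
# Assumed processor graph: a chain p2 - p0 - p1 - p3 with link weights 1, 1, 2.
CHAIN = ["p2", "p0", "p1", "p3"]
LINK = [1, 1, 2]


def path_length(a, b):
    """Sum of link weights between processors a and b along the chain."""
    i, j = sorted((CHAIN.index(a), CHAIN.index(b)))
    return sum(LINK[i:j])


def grid_edges(n):
    """Unit-weight edges (u, v, weight) of an n x n grid graph."""
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                edges.append(((r, c), (r, c + 1), 1))
            if r + 1 < n:
                edges.append(((r, c), (r + 1, c), 1))
    return edges


def cut_edges(edges, pi):
    return sum(1 for u, v, _ in edges if pi[u] != pi[v])


def total_communication_cost(edges, pi):
    """Edge weight times processor path length, summed over the cut edges."""
    return sum(w * path_length(pi[u], pi[v]) for u, v, w in edges if pi[u] != pi[v])


n = 8  # assumed mesh size
edges = grid_edges(n)
cells = [(r, c) for r in range(n) for c in range(n)]

# Four vertical strips placed in chain order.
strips = {(r, c): CHAIN[c // 2] for r, c in cells}
# Four quadrants with one assumed placement of the processors.
layout = {(0, 0): "p0", (0, 1): "p1", (1, 0): "p3", (1, 1): "p2"}
quads = {(r, c): layout[(r // 4, c // 4)] for r, c in cells}
```

Under these assumptions the strips cut more edges than the quadrants but place every cut on a short processor link, which is the trade-off the two partitions on this slide illustrate.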
![Page 12: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/12.jpg)
Background
Generic Multilevel Partitioning Algorithm
The slide is from the CEPBA-IBM Research Institute, Spain.
![Page 13: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/13.jpg)
Background
Coarsening phase
Matching and contraction.
Heavy Edge Matching heuristic.
[Figure: heavy edge matching example on a graph with weighted edges and vertices; vertices v1 and v2 are matched and contracted into a single vertex u]
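Matching and contraction with the heavy edge heuristic can be sketched as follows. The random visiting order, the dictionary-based graph layout, and the function names are assumptions made for illustration; the slides do not give implementation details.

```python
import random


def heavy_edge_matching(adj, seed=0):
    """adj: dict vertex -> {neighbour: edge weight}. Returns dict vertex -> matched partner."""
    order = list(adj)
    random.Random(seed).shuffle(order)  # visit vertices in random order
    match = {}
    for v in order:
        if v in match:
            continue
        candidates = [(w, u) for u, w in adj[v].items() if u not in match]
        if candidates:
            _, u = max(candidates)  # unmatched neighbour joined by the heaviest edge
            match[v], match[u] = u, v
        else:
            match[v] = v  # no free neighbour: the vertex stays unmatched
    return match


def contract(adj, vweight, match):
    """Merge each matched pair into one coarse vertex, summing vertex and edge weights."""
    rep = {v: min(v, match[v]) for v in adj}  # coarse vertex named after the smaller label
    coarse_w, coarse_adj = {}, {}
    for v in adj:
        coarse_w[rep[v]] = coarse_w.get(rep[v], 0) + vweight[v]
        coarse_adj.setdefault(rep[v], {})
        for u, w in adj[v].items():
            if rep[u] != rep[v]:  # the edge inside a matched pair disappears
                coarse_adj[rep[v]][rep[u]] = coarse_adj[rep[v]].get(rep[u], 0) + w
    return coarse_adj, coarse_w


# Assumed example: path a - b - c - d with edge weights 5, 1, 4 and unit vertex weights.
adj = {"a": {"b": 5}, "b": {"a": 5, "c": 1}, "c": {"b": 1, "d": 4}, "d": {"c": 4}}
match = heavy_edge_matching(adj)
coarse_adj, coarse_w = contract(adj, {v: 1 for v in adj}, match)
```

In this example the light edge b-c is the only one that survives into the coarse graph, so a partition of the coarse graph can only cut the cheap edge.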
![Page 14: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/14.jpg)
Background
Refinement (Uncoarsening Phase)
Kernighan-Lin/Fiduccia-Mattheyses (KL-FM) refinement
  Refine partitions under load balance constraint.
  Compute a gain for each candidate vertex.
  Each step, move a single vertex to a different subdomain.
  Vertices with negative gains are allowed for migration.
Greedy refinement
  Similar to KL-FM refinement
  Vertices with negative gains are not allowed to move
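A greedy refinement pass can be sketched as follows. This is a minimal sketch that assumes a plain edge-cut gain, unit vertex weights, and a simple cap `max_size` on part size as the balance constraint; none of these details come from the slides. A KL-FM pass would differ by also accepting negative-gain moves so that it can escape local minima.

```python
def edge_cut(adj, part):
    """Total weight of edges whose endpoints lie in different parts."""
    return sum(w for v in adj for u, w in adj[v].items() if v < u and part[v] != part[u])


def greedy_refine(adj, part, max_size):
    """Repeatedly move single vertices to the part with the best strictly positive gain."""
    part = dict(part)
    sizes = {}
    for p in part.values():
        sizes[p] = sizes.get(p, 0) + 1
    improved = True
    while improved:
        improved = False
        for v in adj:
            conn = {}  # weight of v's edges towards each part
            for u, w in adj[v].items():
                conn[part[u]] = conn.get(part[u], 0) + w
            here = conn.get(part[v], 0)
            best_gain, best_q = 0, None
            for q, w in conn.items():
                if q != part[v] and sizes[q] + 1 <= max_size and w - here > best_gain:
                    best_gain, best_q = w - here, q
            if best_q is not None:  # zero- and negative-gain moves are rejected
                sizes[part[v]] -= 1
                sizes[best_q] += 1
                part[v] = best_q
                improved = True
    return part


# Assumed example: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
adj = {0: {1: 1, 2: 1}, 1: {0: 1, 2: 1}, 2: {0: 1, 1: 1, 3: 1},
       3: {2: 1, 4: 1, 5: 1}, 4: {3: 1, 5: 1}, 5: {3: 1, 4: 1}}
start = {0: "A", 1: "A", 2: "B", 3: "A", 4: "B", 5: "B"}
refined = greedy_refine(adj, start, max_size=4)
```

Because every accepted move strictly reduces the edge cut, the loop always terminates.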
![Page 15: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/15.jpg)
Background
(Computational) Load balancing
  To balance the load among the processors.
  Small imbalance can lead to a better partition.
Diffusion-based Flow Solutions
  Determine how much load is to be transferred among processors.
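One way to obtain such a flow solution is a first-order diffusion iteration, in which every processor repeatedly exchanges a fraction of its load difference with each neighbour. The slides do not say which diffusion scheme is used, so the scheme, the step size `alpha`, and the function name are assumptions.

```python
def diffusion_flow(neighbours, load, alpha=0.25, iterations=200):
    """Return (balanced loads, flow); flow[(i, j)] > 0 means i should send that much load to j."""
    load = dict(load)
    flow = {}
    for _ in range(iterations):
        delta = {i: 0.0 for i in load}
        for i in neighbours:
            for j in neighbours[i]:
                if i < j:  # handle each link once
                    f = alpha * (load[i] - load[j])  # load moves from heavier to lighter
                    flow[(i, j)] = flow.get((i, j), 0.0) + f
                    delta[i] -= f
                    delta[j] += f
        for i in load:
            load[i] += delta[i]
    return load, flow


# Assumed example: four processors in a chain, all load initially on processor 0.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
balanced, flow = diffusion_flow(neighbours, {0: 40.0, 1: 0.0, 2: 0.0, 3: 0.0})
```

The accumulated flow only says how much load crosses each link; choosing which vertices to migrate is left to the refinement step.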
![Page 16: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/16.jpg)
Mesh Partitioning Tools
METIS (Karypis and Kumar, 1995)
JOSTLE (Walshaw, 1997)
CHACO (Hendrickson and Leland, 1994)
PART (Chen and Taylor, 1996)
SCOTCH (Pellegrini, 1994)
PARTY (Preis and Diekmann, 1996)
MiniMax (Kumar, Das, and Biswas, 2002)
![Page 17: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/17.jpg)
METIS
A widely used partitioning tool.
Developed from 1995.
Uses multilevel partitioning algorithm.
  Heavy Edge Matching for the coarsening phase
  Greedy refinement algorithm
Does not consider the network heterogeneity.
![Page 18: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/18.jpg)
JOSTLE
Developed from 1997.
A heterogeneous partitioner.
Uses multilevel partitioning algorithm.
  Heavy Edge Matching
  KL-type refinement algorithm
Does not factor in the ratio of communication time and computation time.
![Page 19: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/19.jpg)
PaGrid Mesh Partitioner
Grid System Modeling
Refinement Cost Function
KL-type Refinement
Estimated Execution Time Load Balancing
![Page 20: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/20.jpg)
Grid System Modeling
Grid system that contains a set of processors (P) connected by a set of edges (C) -> weighted processor graph S.
Vertex weight = relative computational power: if p0 is twice as powerful as p1 and |p1| = 0.5, then |p0| = 1.
Path length = accumulated weights along the shortest path.
Weighted matrix W of size |P| x |P| is constructed, where
$$W_{ij} = |(p_i, p_j)|^2$$
[Figure: Grid system model: chain p0, p1, p2 with link weights 1 and 2]
Path lengths: |(p0, p1)| = 1, |(p1, p2)| = 2, |(p0, p2)| = 3
Weighted matrix W:
$$W = \begin{pmatrix} 0 & 1 & 9 \\ 1 & 0 & 4 \\ 9 & 4 & 0 \end{pmatrix}$$
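The construction of W can be sketched with an all-pairs shortest-path computation followed by squaring. Floyd-Warshall is used here as one convenient choice; the slides do not say how PaGrid computes path lengths, and the function name is hypothetical.

```python
def weighted_matrix(n, links):
    """links: dict (i, j) -> link weight. Returns W with W[i][j] = (shortest path length) ** 2."""
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for (i, j), w in links.items():
        d[i][j] = d[j][i] = min(d[i][j], w)  # links are undirected
    for k in range(n):  # Floyd-Warshall relaxation
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return [[d[i][j] ** 2 for j in range(n)] for i in range(n)]


# The three-processor chain from this slide: p0 -1- p1 -2- p2.
W = weighted_matrix(3, {(0, 1): 1, (1, 2): 2})
```

Squaring the path length penalises mappings that place communicating vertices on distant processors more heavily than a linear cost would.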
![Page 21: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/21.jpg)
Refinement Cost Function
Given a processor mapping cost matrix W, the total mapping cost for a partition is given by
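$$\sum_{(u,v) \in E_c} |(u,v)| \cdot W_{\pi(u)\pi(v)}$$
[Figure: vertices u and v of the application graph map to processors $\pi(u)$ and $\pi(v)$ among p0 to p3; the cut edge has weight $|(u,v)|$ and the processor pair has cost $W_{\pi(u)\pi(v)}$]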
![Page 22: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/22.jpg)
Refinement Cost Function
Let $E_q(v)$ denote the set of cut edges from vertex $v$ to vertices assigned to processor $q$,
$$E_q(v) = \{(v,u) \mid \pi(u) = q\},$$
and $|E_q(v)|$ represent the sum of the weights of these edges.
The gain (mapping cost reduction) of the migration of vertex $v$ from original processor $p$ to processor $q$ is given by:
$$gain(v,q) = \sum_{r \in P} |E_r(v)| \, (W_{pr} - W_{qr})$$
KL-type refinement is done with vertex migrations determined by the gains.
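The gain can be checked against a direct recomputation of the mapping cost. This is a minimal sketch that assumes the gain is the sum over processors r of |E_r(v)| times (W[p][r] - W[q][r]); the small application graph, the mapping `pi`, and the function names are made up for illustration.

```python
def mapping_cost(adj, pi, W):
    """Sum of edge weight times W[pi(u)][pi(v)]; uncut edges contribute W[p][p] = 0."""
    return sum(w * W[pi[v]][pi[u]] for v in adj for u, w in adj[v].items() if v < u)


def gain(adj, pi, W, v, q):
    """Mapping cost reduction obtained by moving vertex v from pi[v] to processor q."""
    p = pi[v]
    e = {}  # e[r] = |E_r(v)|, total weight of v's edges to vertices on processor r
    for u, w in adj[v].items():
        e[pi[u]] = e.get(pi[u], 0) + w
    return sum(weight * (W[p][r] - W[q][r]) for r, weight in e.items())


# W for the chain p0 -1- p1 -2- p2 with squared path lengths.
W = [[0, 1, 9], [1, 0, 4], [9, 4, 0]]
# Assumed application graph: path 0 - 1 - 2 - 3 with edge weights 2, 1, 3.
adj = {0: {1: 2}, 1: {0: 2, 2: 1}, 2: {1: 1, 3: 3}, 3: {2: 3}}
pi = {0: 0, 1: 2, 2: 1, 3: 2}

before = mapping_cost(adj, pi, W)
g = gain(adj, pi, W, v=1, q=1)
after = mapping_cost(adj, {**pi, 1: 1}, W)
```

Only the edges incident to the moved vertex enter the gain, so it can be evaluated without recomputing the whole cost.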
![Page 23: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/23.jpg)
Multilevel Partitioning Algorithm
Coarsening Phase.
  Heavy Edge Matching
  Iterate until the number of vertices in the coarsest graph is the same as the given number of processors.
Initial Partitioning Phase.
  Assign each vertex to a processor, while minimizing the cost function.
Uncoarsening Phase.
  Load balancing based on vertex weights
  KL-type refinement algorithm.
Load balancing based on estimated execution time.
![Page 24: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/24.jpg)
Estimated Execution Time Load Balancing
Input is the final partition after the refinement stage.
Tries to improve the quality of final partition in terms of estimated execution time.
Execution time for a processor is the sum of time required for computation and the time required for communication.
Execution time is a more accurate metric for the quality of a partition.
Uses KL-type algorithm
![Page 25: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/25.jpg)
Estimated Execution Time Load Balancing
For a processor p with one of its edges (p, q) in the processor graph, let
$$R(p,q) = t_{comm}^{(p,q)} / t_{comp}^{p}$$
where $t_{comp}^{p}$ represents the computation time of processor p for processing a vertex that has the smallest weight in the application graph, and $t_{comm}^{(p,q)}$ denotes the communication time for a vertex from processor p to processor q.
Estimated execution time for processor p is given as
$$t^{p} = \sum_{v:\, \pi(v) = p} \Big( |v| + \sum_{r \in P} |E_r(v)| \, R(p,r) \Big)$$
Estimated execution time of the application is:
$$t = \max \{ t^{p} \mid p \in P \}$$
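The estimated execution time can be sketched as follows. The equation on this slide is only partly legible, so the sketch assumes that a processor's time is the sum of the weights of its vertices plus, for every edge leaving the processor, the edge weight times R(p, r), with no communication charged for edges inside a processor. The example graph, the ratio values, and the function name are hypothetical.

```python
def estimated_execution_times(adj, vweight, pi, R, processors):
    """Return dict processor -> estimated time (computation plus communication)."""
    t = {p: 0.0 for p in processors}
    for v in adj:
        p = pi[v]
        t[p] += vweight[v]  # computation term |v|
        for u, w in adj[v].items():
            r = pi[u]
            if r != p:  # assumption: edges inside a processor cost nothing
                t[p] += w * R[(p, r)]  # communication term |E_r(v)| * R(p, r)
    return t


# Assumed example: path 0 - 1 - 2 - 3 with unit weights, split across two processors.
adj = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
vweight = {v: 1 for v in adj}
pi = {0: "p", 1: "p", 2: "q", 3: "q"}
R = {("p", "q"): 0.5, ("q", "p"): 2.0}  # assumed communication/computation ratios

times = estimated_execution_times(adj, vweight, pi, R, ["p", "q"])
t_app = max(times.values())
```

The application time is the maximum over processors, so load balancing on this metric tries to lower the slowest processor rather than the total cost.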
![Page 26: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/26.jpg)
Experimental Results
Test application graphs
Grid system graphs
Comparison with METIS and JOSTLE
![Page 27: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/27.jpg)
Test Application Graphs
Graph    |V|        |E|        |E|/|V|    Description
598a     110971     741934     6.69       3D finite element mesh (Submarine I)
144      144649     1074393    7.43       3D finite element mesh (Parafoil)
m14b     214765     1679018    7.82       3D finite element mesh (Submarine II)
auto     448695     3314611    7.39       3D finite element mesh (GM Saturn)
mrng2    1017253    2015714    1.98       (description not available)
|V| is the total number of vertices and |E| is the total number of edges in the graph.
![Page 28: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/28.jpg)
Grid Systems
32-processor Grid system
64-processor Grid system
![Page 29: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/29.jpg)
Metrics
Total Communication Cost
$$\sum_{(u,v) \in E_c} |(u,v)| \cdot |(\pi(u), \pi(v))|$$
Maximum Estimated Execution Time
$$t = \max \{ t^{p} \mid p \in P \}$$
![Page 30: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/30.jpg)
Total Communication Cost
32-processor Grid System
![Page 31: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/31.jpg)
Total Communication Cost
Average values of Total Communication Cost of PaGrid are similar to those of METIS.
Average values of Total Communication Cost of PaGrid are slightly worse than for Jostle.
![Page 32: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/32.jpg)
Maximum Estimated Execution Time
32-processor Grid System
![Page 33: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/33.jpg)
Maximum Estimated Execution Time
The minimum and average values of Execution Time for PaGrid are always lower than for Jostle and METIS, except for graph mrng2, where PaGrid is slightly worse than METIS.
Even though the results of PaGrid are worse than those of Jostle in terms of average Total Communication Cost, PaGrid's Estimated Execution Time Load Balancing generates lower average Execution Time than Jostle in all cases.
![Page 34: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/34.jpg)
Total Communication Cost
64-processor Grid System
![Page 35: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/35.jpg)
Total Communication Cost
Average values of Total Communication Cost of PaGrid are better than METIS in most cases, except for graph mrng2 (because of the low ratio of |E|/|V|).
Average values of Total Communication Cost of PaGrid are much worse than Jostle in three of five test application graphs.
![Page 36: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/36.jpg)
Maximum Estimated Execution Time
64-processor Grid System
![Page 37: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/37.jpg)
Maximum Estimated Execution Time
The differences between PaGrid and Jostle are amplified: even though the results of PaGrid are much worse than those of Jostle in terms of average Total Communication Cost, the minimum and average values of Execution Time for PaGrid are much lower than for Jostle.
The minimum Estimated Execution Times for PaGrid are always much lower than for METIS, and the average Execution Times for PaGrid are almost always lower than those of METIS, except for application graph mrng2.
![Page 38: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/38.jpg)
Conclusion
Strong need for a mesh partitioner that considers the heterogeneity of the processors and networks in a computational Grid environment.
Current partitioning tools provide only a limited solution.
PaGrid: a heterogeneous mesh partitioner
  Considers both processor and network heterogeneity.
  Uses a multilevel graph partitioning algorithm.
  Incorporates load balancing that is based on estimated execution time.
Experimental results indicate that load balancing based on estimated execution time improves the quality of partitions.
![Page 39: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/39.jpg)
Future Work
Cost function can be modified to be based on estimated execution time.
Algorithms can be developed to address the repartitioning problem.
Parallelization of PaGrid.
![Page 40: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/40.jpg)
Publications
S. Huang, E. Aubanel, and V.C. Bhavsar, "PaGrid: A Mesh Partitioner for Computational Grids", Journal of Grid Computing, 18 pages, in press, 2006.
S. Huang, E. Aubanel and V. Bhavsar, ‘Mesh Partitioners for Computational Grids: a Comparison’, in V. Kumar, M. Gavrilova, C. Tan, and P. L'Ecuyer (eds.), Computational Science and Its Applications, Vol. 2269 of Lecture Notes in Computer Science, Springer Inc., Berlin Heidelberg New York, pp. 60–68, 2003.
![Page 41: PaGrid: A Mesh Partitioner for Computational Grids](https://reader035.vdocument.in/reader035/viewer/2022062321/56813d0e550346895da6c584/html5/thumbnails/41.jpg)
Questions?