scalable algebraic multigrid on blue gene/lhdesterc/websitew/data/presentations/pre… · 5....

32
Hans De Sterck, Jeff Butler Department of Applied Mathematics University of Waterloo Ulrike Meier Yang Center for Applied Scientific Computing Lawrence Livermore National Lab, USA Scalable Algebraic Multigrid on Blue Gene/L CAIMS Annual Meeting, Toronto, 19 June 2006

Upload: others

Post on 19-Oct-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

Hans De Sterck, Jeff ButlerDepartment of Applied Mathematics

University of WaterlooUlrike Meier Yang

Center for Applied Scientific ComputingLawrence Livermore National Lab, USA

Scalable Algebraic Multigrid onBlue Gene/L

CAIMS Annual Meeting,Toronto, 19 June 2006

Page 2: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

2CAIMS 2006

Outline

1. introduction: algebraic multigrid (AMG)2. classical coarsening may lead to complexity

growth3. Parallel Modified Independent Set (PMIS)

coarsening4. improving interpolation5. scaling results on Blue Gene/L

Page 3: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

3CAIMS 2006

(1) introduction: algebraic multigrid (AMG)

solve

from 3D PDE – sparse!

large problems (109 dof) - parallel

unstructured grid problems

fAu =

A

Page 4: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

4CAIMS 2006

algebraic multigrid (AMG)

multi-level iterative algebraic: suitable for unstructured grids!

Page 5: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

5CAIMS 2006

— Select coarse “grids”— Define interpolation,— Define restriction and coarse-grid operators

AMG building blocks

Setup Phase:

Solve Phasemm(m)fuA =Relax

mm(m)fuA =Relax

1m(m)mePe

+= eInterpolat

!

Compute rm

= fm"A

(m)um

mmm euu +! Correct

1m1m1)(mreA

+++=

Solve

m(m)1mrRr =

+Restrict

1,2,...m ,P(m)

=

(m)(m)(m)T 1)(m(m)T(m)PAPA PR ==

+

Page 6: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

6CAIMS 2006

2D model problem:

high-frequency error is removed by relaxation low-frequency error needs to be removed by coarse-grid correction low-frequency error on fine grid becomes higher frequency error on

coarse grid

Page 7: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

7CAIMS 2006

AMG complexity - scalability

Operator complexity Cop=

e.g., 3D, ideally: Cop = 1 + 1/8 + 1/64 + … < 8 / 7

measure of memory use, and work in solvephase

scalable algorithm:O(n) operations per V-cycle (Cop bounded)

ANDnumber of V-cycles independent of n

(ρAMG independent of n)

Page 8: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

8CAIMS 2006

O(n) scalability

Page 9: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

9CAIMS 2006

AMG coarsening and interpolation large aij, ‘strong connections’ are important

define strength matrix S:

consider the undirected graph of S

apply parallel maximal independent set

algorithm to graph(S) [Luby, 1986]

!

A =

x x x

x x x

x x x x

x x x

x x x

"

#

$ $ $ $ $ $

%

&

' ' ' ' ' '

!

S =

1 1 0

1 0 1

0 0 1 1

1 0 1

1 1 0

"

#

$ $ $ $ $ $

%

&

' ' ' ' ' '

Page 10: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

10CAIMS 2006

classical AMG coarsening (CLJP)

(C1) Maximal Independent Set:Independent: no two C-points are

connectedMaximal: if one more C-point is

added, the independence is lost

(C2) All F-F connections requireconnections to a common C-point (for good interpolation)

F-points have to be changed intoC-points, to ensure (C2); (C1) isviolated

more C-points, higher complexity

Page 11: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

11CAIMS 2006

classical coarsening: scalability results

9

9

9

Iter

256

64

16

Procs ttotCop

5.014.50

3.854.50

2.894.48

example: finite difference Laplacian, parallelCLJP coarsening algorithm

2D (5-point): near-optimal scalability (2502 dof/proc)

Page 12: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

12CAIMS 2006

(2) classical coarsening may lead tocomplexity problems

643

323

dof Cop

22.51

16.17

3D (7-point): complexity growth

Page 13: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

13CAIMS 2006

(3) Parallel Modified Independent Set(PMIS) coarsening

our approach to reduce complexity: do not add C points for strong F-F connectionsthat do not have a common C point

less C points, reduced complexity, but worseconvergence factors expected

compensate by GMRES acceleration

Page 14: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

14CAIMS 2006

parallel PMIS results: 7-point finitedifference Laplacian in 3D, 403 dof per proc

CLJP and PMIS-GMRES(10)

35.831017.02512

17.99282.371331

46.251017.191331

3.35614.391

2513

Iter ttotalCopproc

12.772.375121.282.321

Page 15: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

15CAIMS 2006

convergence problems on PMIS-coarsened grids

PMIS coarsening works well for many problems,but requires GMRES acceleration

for some problems, too many iterations arenecessary because interpolation is not accurateenough (“not enough C-points”)

one solution: add C-points (CLJP…) other solution: use distance-two C-points for

interpolation = long-range interpolation F-F interpolation

Page 16: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

16CAIMS 2006

convergence problems

PMIS

CLJP

686

17

Iter

211.79

52.48

ttot Cop

2.40

17.00

3D elliptic PDE with jumps in coefficient a

!

(aux )x + (auy )y + (auz )z =1

1000 processors, 403 dof/proc

remedy: improve interpolation used with PMIS

Page 17: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

17CAIMS 2006

(4) improving interpolation:F-F interpolation

when strong F-F connection without a commonC-point is detected, do not add C-point, butextend interpolation stencil to distance-two C-points

no C-points added, but larger interpolationstencils

Page 18: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

18CAIMS 2006

results using long-range F-F interpolation

94.9021.4PMIS + F-F

PMIS

CLJP

188

7

Iter

94.6

48.0

ttot Cop

2.46

21.54

3D elliptic PDE with jumps in coefficient a

!

(aux )x + (auy )y + (auz )z =1

1 processor, AMG+GMRES, 803 dof

Page 19: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

19CAIMS 2006

reduce complexity: FF1 Interpolation

Modified FF Interpolation (FF1)

To reduce operator complexity, only include onedistance-two C-point when a strong FFconnection is encountered

Setup time, complexity are reduced

Page 20: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

20CAIMS 2006

66.2922.0744.22153.68FF1

106.6722.8683.81134.80FF

102.5685.9316.63772.36Regular

ttotal (s)tsolve (s)tsetup (s)#iterations

Cop

results: 7-pt Laplacian Problem

PMIS coarsening, 1 processor, 1283 dof

AMG

Page 21: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

21CAIMS 2006

scalability of PMIS-FF1

Page 22: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

22CAIMS 2006

57.8322.4735.36183.84FF1

83.4920.5462.95144.94FF

converge

Slow to>> 2002.44Regular

ttotal (s)tsolve (s)tsetup (s)#iterations

Cop

results: 3D elliptic PDE with jumps

AMG, 1 processor, 1203 dof

!

(aux )x + (auy )y + (auz )z =1

Page 23: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

23CAIMS 2006

(5) scaling results on Blue Gene/L

Top 500 Supercomputer list (November 2005)

Page 24: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

24CAIMS 2006

LLNL Blue Gene/L

dual-processor nodes optimized for data access each node: one processor for simulation, one for

communication; only 256MB ram per processor lightweight, single-process linux kernel

Page 25: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

25CAIMS 2006

LLNL Blue Gene/L results

Page 26: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

26CAIMS 2006

LLNL Blue Gene/L results on full machine

7-pt Laplacian, total execution time, AMG-CG, total problem size ~2 billion

Page 27: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

27CAIMS 2006

PMIS: select 1

select C-pts withmaximalmeasure locally

make neighbourF-pts

removeneighbour edges

3.7 5.3 5.0 5.9 5.4 5.3 3.4

5.2 8.0 8.5 8.2 8.6 8.9 5.1

5.9 8.1 8.8 8.9 8.4 8.2 5.9

5.7 8.6 8.3 8.8 8.3 8.1 5.0

5.3 8.7 8.3 8.4 8.3 8.8 5.9

5.0 8.8 8.5 8.6 8.7 8.9 5.3

3.2 5.6 5.8 5.6 5.9 5.9 3.0

Page 28: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

28CAIMS 2006

PMIS: remove and update 1

select C-pts withmaximal measurelocally

make neighboursF-pts

removeneighbour edges

3.7 5.3 5.0 5.9

5.2 8.0

5.9 8.1

5.7 8.6 8.1 5.0

8.4

8.6

5.6

Page 29: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

29CAIMS 2006

PMIS: select 2

select C-pts withmaximal measurelocally

make neighboursF-pts

removeneighbour edges

3.7 5.3 5.0 5.9

5.2 8.0

5.9 8.1

5.7 8.1 5.0

8.4

8.6

5.6

8.6

Page 30: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

30CAIMS 2006

PMIS: remove and update 2

select C-pts withmaximal measurelocally

make neighboursF-pts

removeneighbour edges

3.7 5.3

5.2 8.0

Page 31: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

31CAIMS 2006

PMIS: final grid

select C-pts withmaximal measurelocally

make neighbourF-pts

remove neighbouredges

parallel algorithm

Page 32: Scalable Algebraic Multigrid on Blue Gene/Lhdesterc/websiteW/Data/presentations/pre… · 5. scaling results on Blue Gene/L. CAIMS 2006 3 (1) introduction: algebraic multigrid (AMG)

32CAIMS 2006

LLNL Blue Gene/L results