Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Mark F. Adams, 22 October 2004



Page 1: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Mark F. Adams

22 October 2004

Page 2: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Outline
• Algebraic multigrid (AMG) introduction
• Industrial applications
• Micro-FE bone modeling
• Olympus Parallel FE framework
• Scalability studies on IBM SPs
  – Scaled speedup
  – Plain speedup
  – Nodal performance

Page 3: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Multigrid smoothing and coarse grid correction (projection)

[Figure: the multigrid V-cycle. Labels: smoothing; finest grid; restriction ($R$); first coarse grid (note: smaller grid); prolongation ($P = R^T$).]

Page 4: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Multigrid V($\nu_1$, $\nu_2$)-cycle

Function u = MG-V(A, f)
    if A is small
        $u \leftarrow A^{-1} f$
    else
        $u \leftarrow S^{\nu_1}(f, u)$  -- $\nu_1$ steps of smoother (pre)
        $r_H \leftarrow P^T (f - A u)$
        $u_H \leftarrow$ MG-V($P^T A P$, $r_H$)  -- recursion (Galerkin)
        $u \leftarrow u + P u_H$
        $u \leftarrow S^{\nu_2}(f, u)$  -- $\nu_2$ steps of smoother (post)

Iteration matrix: $T = S (I - P (R A P)^{-1} R A) S$ (multiplicative)
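The recursion on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the solver used in the talk: the 1D Poisson matrix, the linear-interpolation prolongator, the damped Jacobi smoother (with damping 2/3), and the names `poisson_1d`, `linear_prolongation`, `smooth`, `mg_v`, and `solve` are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

def poisson_1d(n):
    """Tridiagonal (-1, 2, -1) matrix: an assumed 1D model problem."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def linear_prolongation(n_coarse):
    """Linear interpolation from n_coarse points to 2 * n_coarse + 1 points."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j : 2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def smooth(A, f, u, steps, omega=2.0 / 3.0):
    """Damped Jacobi, standing in for the smoother S (an assumed choice)."""
    d = np.diag(A)
    for _ in range(steps):
        u = u + omega * (f - A @ u) / d
    return u

def mg_v(A, f, nu1=2, nu2=2, coarsest=3):
    """One V(nu1, nu2)-cycle for A u = f, starting from u = 0."""
    n = A.shape[0]
    if n <= coarsest:                         # "if A is small": direct solve
        return np.linalg.solve(A, f)
    u = smooth(A, f, np.zeros(n), nu1)        # pre-smoothing
    P = linear_prolongation((n - 1) // 2)
    r_H = P.T @ (f - A @ u)                   # restrict the residual with P^T
    u_H = mg_v(P.T @ A @ P, r_H, nu1, nu2, coarsest)  # Galerkin recursion
    u = u + P @ u_H                           # coarse grid correction
    return smooth(A, f, u, nu2)               # post-smoothing

def solve(A, f, tol=1e-8, max_cycles=50):
    """Repeat V-cycles on the residual equation until the residual is small."""
    u = np.zeros_like(f)
    for k in range(max_cycles):
        r = f - A @ u
        if np.linalg.norm(r) <= tol * np.linalg.norm(f):
            return u, k
        u = u + mg_v(A, r)
    return u, max_cycles
```

The sketch assumes grid sizes of the form 2^k - 1 (for example 63) so that each level coarsens cleanly; calling `solve(poisson_1d(63), np.ones(63))` returns the solution and the number of cycles used.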

Page 5: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Smoothed Aggregation
• Coarse grid space & smoother → MG method
• Piecewise constant functions: “plain” aggregation ($P_0$)
  – Start with kernel vectors $B$ of the operator, e.g., 6 rigid body modes (RBMs) in elasticity
  – Nodal aggregation: $B \rightarrow P_0$
• “Smoothed” aggregation: lower the energy of the functions
  – One Jacobi iteration: $P \leftarrow (I - D^{-1} A) P_0$
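The two steps on this slide, plain aggregation followed by one Jacobi step, can be sketched as follows. This is an illustration under assumptions, not the Prometheus implementation: it uses a 1D Laplacian with a single constant kernel vector instead of six rigid body modes, fixed-size aggregates of consecutive nodes, and a damping factor `omega` that does not appear in the slide's formula. The function names are hypothetical.

```python
import numpy as np

def tentative_prolongator(n, agg_size, B):
    """Plain aggregation P0: copy the kernel vector B into one column per aggregate."""
    n_agg = (n + agg_size - 1) // agg_size
    P0 = np.zeros((n, n_agg))
    for i in range(n):
        P0[i, i // agg_size] = B[i]
    # With a single kernel vector, orthonormalising each aggregate block
    # reduces to normalising each column.
    return P0 / np.linalg.norm(P0, axis=0)

def smooth_prolongator(A, P0, omega=2.0 / 3.0):
    """One damped Jacobi step on the columns of P0 (omega is an assumed damping)."""
    D_inv = 1.0 / np.diag(A)
    return P0 - omega * (D_inv[:, None] * (A @ P0))

def energy(A, P):
    """Sum of the energies p^T A p of the columns of P."""
    return float(np.trace(P.T @ A @ P))

n = 12
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
B = np.ones(n)                       # constant vector: near-kernel of the 1D Laplacian
P0 = tentative_prolongator(n, 3, B)  # four aggregates of three nodes each
P = smooth_prolongator(A, P0)

print("energy of plain aggregation basis:   ", energy(A, P0))
print("energy of smoothed aggregation basis:", energy(A, P))
```

The smoothed columns overlap their neighbours and have lower energy than the piecewise constant ones, which is the property the slide refers to.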

Page 6: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Parallel Smoothers
• CG/Jacobi: additive (requires damping for MG)
  – Damped by CG (Adams SC1999)
  – Dot products, non-stationary
• Gauss-Seidel: multiplicative (optimal MG smoother)
  – Complex communication and computation (Adams SC2001)
• Polynomial smoothers: additive
  – Chebyshev is ideal for multigrid smoothers
  – Chebyshev chooses $p(y)$ such that $|1 - p(y)\, y|$ is minimal over the interval $[\lambda^*, \lambda_{\max}]$
  – Estimate of $\lambda_{\max}$ is easy
  – Use $\lambda^* = \lambda_{\max} / C$ (no need for the lowest eigenvalue)
  – $C$ is related to the rate of grid coarsening
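A Chebyshev smoother of the kind described here can be sketched as a short sequence of Richardson steps whose step sizes are the reciprocals of the Chebyshev roots on the interval [λ_max / C, λ_max]. The code below is an assumption-laden sketch: the Gershgorin row-sum bound is one possible "easy" estimate of λ_max, the default `C = 10.0` is a placeholder rather than a value from the talk, no diagonal scaling is applied, and both function names are hypothetical.

```python
import numpy as np

def gershgorin_lambda_max(A):
    """Cheap upper bound on the largest eigenvalue of a symmetric matrix."""
    return float(np.abs(A).sum(axis=1).max())

def chebyshev_smooth(A, f, u, degree, lam_max, C=10.0):
    """Apply a degree-`degree` Chebyshev polynomial smoother to A u = f.

    The error polynomial 1 - p(y) y has its roots at the Chebyshev points of
    [lam_max / C, lam_max], so it is uniformly small on that interval.
    """
    a, b = lam_max / C, lam_max
    for j in range(1, degree + 1):
        root = 0.5 * (b + a) + 0.5 * (b - a) * np.cos(np.pi * (2 * j - 1) / (2 * degree))
        u = u + (f - A @ u) / root      # Richardson step with step size 1 / root
    return u
```

Only matrix-vector products are needed, with no dot products inside the loop, which is what makes this smoother attractive in parallel. The product form used here is adequate for the low degrees typical of smoothers; a three-term recurrence is the more stable choice at high degree.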

Page 8: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Aircraft carrier
• 315,444 vertices
• Shell and beam elements (6 DOF per node)
• Linear dynamics – transient (time domain)
• About 1 min. per solve (rtol = $10^{-6}$)
  – 2.4 GHz Pentium 4/Xeon processors
  – Matrix vector product runs at 254 Mflops

Page 9: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Solve and setup times (26 Sun processors)

Page 10: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Adagio: “BR” tire
• ADAGIO: quasi-static solid mechanics application (Sandia)
• Nearly incompressible visco-elasticity (rubber)
  – Augmented Lagrange formulation with Uzawa-like update
• Contact (impenetrability constraint)
• Saddle point solution scheme:
  – Uzawa-like iteration (pressure) with contact search
  – Nonlinear CG (with linear constraints and constant pressure)
  – Preconditioned with linear solvers (AMG, FETI, …) or nodal (Jacobi)

Page 11: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

“BR” tire

Page 12: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Displacement history

Page 14: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Trabecular Bone

[Figure labels: 5-mm cube; cortical bone; trabecular bone]

Page 15: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Micro-Computed Tomography

Methods: FE modeling
• CT @ 22 µm resolution
• 3D image
• FE mesh: 2.5 mm cube, 44 µm elements
• Mechanical testing: E, yield, ult, etc.

Page 17: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Olympus: Computational Architecture
• Athena: parallel FE
• ParMetis: parallel mesh partitioner (University of Minnesota)
• Prometheus: multigrid solver
• FEAP: serial general-purpose FE application (University of California)
• PETSc: parallel numerical libraries (Argonne National Laboratory)

[Figure: architecture diagram. Labels: FE mesh input file; Athena with ParMetis; FE input file (in memory); partition to SMPs; per-process files; FEAP with material card; pFEAP; Prometheus; PETSc; ParMetis; METIS; Silo DB; VisIt.]

Page 19: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Scalability

80 µm w/o shell
• Inexact Newton
• CG linear solver
  – Variable tolerance
• Smoothed aggregation AMG preconditioner
• Nodal block diagonal smoothers:
  – 2nd order Chebyshev (additive)
  – Gauss-Seidel (multiplicative)

Page 20: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Vertebral Body With Shell

80 µm w/ shell
• Large deformation elasticity
• 6 load steps (3% strain)
• Scaled speedup
  – ~131K dof/processor
  – 7 to 537 million dof
  – 4 to 292 nodes
• IBM SP Power3
  – 15 of 16 procs/node used
  – Double/Single Colony switch

Page 21: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Computational phases
• Mesh setup (per mesh):
  – Coarse grid construction (aggregation)
  – Graph processing
• Matrix setup (per matrix):
  – Coarse grid operator construction: sparse matrix triple product RAP (expensive for S.A.)
  – Subdomain factorizations
• Solve (per RHS):
  – Matrix vector products (residuals, grid transfer)
  – Smoothers (matrix vector products)

Page 22: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Linear solver iterations

Rows: load step. Columns: Newton iteration.

        Small (7.5M dof)        Large (537M dof)
Load    1   2   3   4   5       1   2   3   4   5   6
1       5  14  20  21  18       5  11  35  25  70   2
2       5  14  20  20  20       5  11  36  26  70   2
3       5  14  20  22  19       5  11  36  26  70   2
4       5  14  20  22  19       5  11  36  26  70   2
5       5  14  20  22  19       5  11  36  26  70   2
6       5  14  20  22  19       5  11  36  26  70   2

Page 23: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

[Figure: flops/sec/proc at 131K dof/proc; 0.47 Teraflops on 4088 processors]

Page 24: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

End to end times and (in)efficiency components

Page 25: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Sources of scale inefficiencies in solve phase

              7.5M dof    537M dof
#iteration    450         897
#nnz/row      50          68
Flop rate     76          74
#elems/pr     19.3K       33.0K
model         1.00        2.78
Measured      1.00        2.61
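The "model" row can be reproduced from the rows above it. The slide does not state the formula, so the product used below (iteration ratio times nonzeros-per-row ratio times inverse flop-rate ratio) is an assumption; it is shown only because it yields the quoted 2.78. The dictionary keys and the function name are hypothetical.

```python
# Values copied from the table for the two problem sizes.
small = {"iterations": 450, "nnz_per_row": 50, "flop_rate": 76}
large = {"iterations": 897, "nnz_per_row": 68, "flop_rate": 74}

def relative_solve_cost(run, base):
    """Assumed cost model: more iterations, denser rows, and a lower flop rate
    each scale the solve time proportionally."""
    return (run["iterations"] / base["iterations"]
            * run["nnz_per_row"] / base["nnz_per_row"]
            * base["flop_rate"] / run["flop_rate"])

model = relative_solve_cost(large, small)
print(round(model, 2))  # prints 2.78
```

Under this reading, the doubling of the iteration count accounts for most of the modelled slowdown, with the denser matrix rows contributing most of the rest.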

Page 26: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

164K dof/proc

Page 27: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

First try: Flop rates (265K dof/processor)
• 265K dof per proc.
• IBM switch bug
  – Bisection bandwidth plateau at 64-128 nodes
• Solution: use more processors
  – Fewer dof per proc.
  – Less pressure on switch

[Figure label: bisection bandwidth]

Page 29: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Speedup with 7.5M dof problem (1 to 128 nodes)

Page 31: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Nodal Performance of IBM SP Power3 and Power4
• IBM Power3, 16 processors per node
  – 375 MHz, 4 flops per cycle
  – 16 GB/sec bus (~7.9 GB/sec w/ STREAM benchmark)
  – Implies ~1.5 Gflop/s MB peak for Mat-Vec
  – We get ~1.2 Gflop/s (15 × 0.08 Gflop/s)
• IBM Power4, 32 processors per node
  – 1.3 GHz, 4 flops per cycle
  – Complex memory architecture
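The Power3 figures can be cross-checked with a little arithmetic. In this sketch every input is a number quoted on the slide; reading "MB peak" as a memory-bandwidth-bound peak is an assumption, as are the variable names.

```python
# Figures quoted on the slide for one IBM Power3 node.
clock_hz = 375e6              # processor clock
flops_per_cycle = 4
procs_per_node = 16
procs_used = 15               # processors actually used per node
stream_bandwidth = 7.9e9      # bytes/s measured with the STREAM benchmark
bandwidth_bound_peak = 1.5e9  # flop/s, the slide's "MB peak" for Mat-Vec
measured_per_proc = 0.08e9    # flop/s per processor in Mat-Vec

# Derived quantities.
measured_node = procs_used * measured_per_proc
hardware_peak_node = procs_per_node * clock_hz * flops_per_cycle
fraction_of_bound = measured_node / bandwidth_bound_peak
fraction_of_hardware_peak = measured_node / hardware_peak_node
bytes_per_flop = stream_bandwidth / bandwidth_bound_peak  # traffic implied by the bound

print(f"measured Mat-Vec rate per node:   {measured_node / 1e9:.2f} Gflop/s")
print(f"fraction of bandwidth-bound peak: {fraction_of_bound:.2f}")
print(f"fraction of hardware peak:        {fraction_of_hardware_peak:.3f}")
print(f"implied memory traffic:           {bytes_per_flop:.2f} bytes/flop")
```

On these numbers the matrix-vector product runs close to the quoted bandwidth-bound rate while using only a small fraction of the node's arithmetic peak, which is the usual signature of a memory-bound sparse kernel.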

Page 32: Applications of Algebraic Multigrid to Large Scale Mechanics Problems

Speedup