recent advances in hpc with freefem+++/uploads/main/p-jolivet.pdf · 2013-01-31 · recent advances...

42
Recent advances in HPC with FreeFem++ Pierre Jolivet Laboratoire Jacques-Louis Lions Laboratoire Jean Kuntzmann Fourth workshop on FreeFem++ December 6, 2012 With F. Hecht, F. Nataf, C. Prud’homme.

Upload: others

Post on 16-Mar-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Recent advances in HPC with FreeFem++

Pierre Jolivet

Laboratoire Jacques-Louis LionsLaboratoire Jean Kuntzmann

Fourth workshop on FreeFem++

December 6, 2012

With F. Hecht, F. Nataf, C. Prud’homme.

Page 2: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Outline

1 IntroductionFlashback to last yearA new machine

2 New solversMUMPSPARDISO

3 Domain decomposition methodsThe necessary toolsPreconditioning Krylov methodsNonlinear problems

4 Conclusion

2 / 26 Pierre Jolivet HPC and FreeFem++

Page 3: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Where were we one year ago ?

What we had:• FreeFem++ version 3.17,• a “simple” toolbox for domain decomposition methods,• 3 supercomputers (SGI, Bull, IBM).

What we achieved:• linear problems up to 120 million unkowns in few minutes,• good scaling up to ∼2048 processes on a BlueGene/P.

3 / 26 Pierre Jolivet HPC and FreeFem++

Page 4: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

FreeFem++ is working on the following parallel architectures:N of cores Memory Peak performance

hpc1@LJLL [email protected] GHz 640 Go ∼ 10 TFLOP/stitane@CEA [email protected] GHz 37 To 140 TFLOP/sbabel@IDRIS 40960@850 MHz 20 To 139 TFLOP/s

curie@CEA [email protected] GHz 320 To 1.6 PFLOP/s

http://www-hpc.cea.fr, Bruyères-le-Châtel, France.http://www.idris.fr, Orsay, France (grant PETALh).

http://www.prace-project.eu (grant HPC-PDE).

4 / 26 Pierre Jolivet HPC and FreeFem++

Page 5: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

FreeFem++ is working on the following parallel architectures:N of cores Memory Peak performance

hpc1@LJLL [email protected] GHz 640 Go ∼ 10 TFLOP/stitane@CEA [email protected] GHz 37 To 140 TFLOP/sbabel@IDRIS 40960@850 MHz 20 To 139 TFLOP/scurie@CEA [email protected] GHz 320 To 1.6 PFLOP/s

http://www-hpc.cea.fr, Bruyères-le-Châtel, France.http://www.idris.fr, Orsay, France (grant PETALh).http://www.prace-project.eu (grant HPC-PDE).

4 / 26 Pierre Jolivet HPC and FreeFem++

Page 6: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

FreeFem++ and the linear solvers

set(A, solver = sparsesolver) in FreeFem++⇓

instance of VirtualSolver<T> from which inherits a solver.

Interfacing a new solver basically consists in implementing:1 the constructor MySolver(const MatriceMorse<T>&)2 the pure virtual method

void Solver(const MatriceMorse<T>&, KN_<T>&,const KN_<T>&) const

3 the destructor MySolver(const MatriceMorse<T>&)

real[int] x = Aˆ-1 * b will call MySolver::Solver.

5 / 26 Pierre Jolivet HPC and FreeFem++

Page 7: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

FreeFem++ and the linear solvers

set(A, solver = sparsesolver) in FreeFem++⇓

instance of VirtualSolver<T> from which inherits a solver.

Interfacing a new solver basically consists in implementing:1 the constructor MySolver(const MatriceMorse<T>&)

2 the pure virtual methodvoid Solver(const MatriceMorse<T>&, KN_<T>&,

const KN_<T>&) const3 the destructor MySolver(const MatriceMorse<T>&)

real[int] x = Aˆ-1 * b will call MySolver::Solver.

5 / 26 Pierre Jolivet HPC and FreeFem++

Page 8: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

FreeFem++ and the linear solvers

set(A, solver = sparsesolver) in FreeFem++⇓

instance of VirtualSolver<T> from which inherits a solver.

Interfacing a new solver basically consists in implementing:1 the constructor MySolver(const MatriceMorse<T>&)2 the pure virtual method

void Solver(const MatriceMorse<T>&, KN_<T>&,const KN_<T>&) const

3 the destructor MySolver(const MatriceMorse<T>&)

real[int] x = Aˆ-1 * b will call MySolver::Solver.

5 / 26 Pierre Jolivet HPC and FreeFem++

Page 9: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

FreeFem++ and the linear solvers

set(A, solver = sparsesolver) in FreeFem++⇓

instance of VirtualSolver<T> from which inherits a solver.

Interfacing a new solver basically consists in implementing:1 the constructor MySolver(const MatriceMorse<T>&)2 the pure virtual method

void Solver(const MatriceMorse<T>&, KN_<T>&,const KN_<T>&) const

3 the destructor MySolver(const MatriceMorse<T>&)

real[int] x = Aˆ-1 * b will call MySolver::Solver.

5 / 26 Pierre Jolivet HPC and FreeFem++

Page 10: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

FreeFem++ and the linear solvers

set(A, solver = sparsesolver) in FreeFem++⇓

instance of VirtualSolver<T> from which inherits a solver.

Interfacing a new solver basically consists in implementing:1 the constructor MySolver(const MatriceMorse<T>&)2 the pure virtual method

void Solver(const MatriceMorse<T>&, KN_<T>&,const KN_<T>&) const

3 the destructor MySolver(const MatriceMorse<T>&)

real[int] x = Aˆ-1 * b will call MySolver::Solver.

5 / 26 Pierre Jolivet HPC and FreeFem++

Page 11: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

MUltifrontal Massively Parallel Sparse direct Solver

http://graal.ens-lyon.fr/MUMPS

Distributed memory direct solver.Compiled by FreeFem++ with --enable-download.Renumbering via AMD, QAMD, AMF, PORD, (Par)METIS,

(PT-)SCOTCH.

Solves unsymmetric and symmetric linear systems !

6 / 26 Pierre Jolivet HPC and FreeFem++

Page 12: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

MUMPS and FreeFem++

1 load "MUMPS"int[int] l = [1, 1, 2, 2];mesh Th;if(mpirank != 0) // no need to store the matrix on ranks other than 0

5 Th = square(1, 1, label = l);else

Th = square(150, 150, label = l);fespace Vh(Th, P2);

9 varf lap(u, v) = int2d(Th)(dx(u)∗dx(v) + dy(u)∗dy(v)) + int2d(Th)(v)+ on(1, u = 1);

real[int] b = lap(0, Vh);matrix A = lap(Vh, Vh);set(A, solver = sparsesolver);

13 Vh x; x[] = A−1 ∗ b;plot(Th, x, wait = 1, dim = 3, fill = 1, cmm = "sparsesolver", value =

1);

7 / 26 Pierre Jolivet HPC and FreeFem++

Page 13: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Intel MKL PARDISO

http://software.intel.com/en-us/intel-mkl

Shared memory direct solver.Part of the Intel Math Kernel Library.Renumbering via Metis or threaded nested dissection.

Solves unsymmetric and symmetric linear systems !

8 / 26 Pierre Jolivet HPC and FreeFem++

Page 14: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

PARDISO and FreeFem++

load "PARDISO"2 include "cube.idp"int n = 35; int[int] NN = [n, n, n]; real[int, int] BB = [[0,1], [0,1],

[0,1]]; int[int, int] L = [[1,1], [3,4], [5,6]];mesh3 Th = Cube(NN, BB, L);fespace Vh(Th, P2);

6 varf lap(u, v) = int3d(Th)(dx(u)∗dx(v) + dz(u)∗dz(v) + dy(u)∗dy(v))+ int3d(Th)(v) + on(1, u = 1);

real[int] b = lap(0, Vh);matrix A = lap(Vh, Vh);real timer = mpiWtime();

10 set(A, solver = sparsesolver);cout << "Factorization: " << mpiWtime() − timer << endl;Vh x; timer = mpiWtime();x[] = A−1 ∗ b;

14 cout << "Solve: " << mpiWtime() − timer << endl;

9 / 26 Pierre Jolivet HPC and FreeFem++

Page 15: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Is it really necessary ?

If you are using direct methods within FreeFem++: yes !

Why ? Sooner or later, UMFPACK will blow up:

UMFPACK V5.5.1 (Jan 25, 2011): ERROR: out of memory

umfpack_di_numeric failed

10 / 26 Pierre Jolivet HPC and FreeFem++

Page 16: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Motivation

One of the most straightforward way to solve BVP in parallel.

Based on the “divide and conquer” paradigm:1 assemble,2 factorize and3 solve smaller problems.

11 / 26 Pierre Jolivet HPC and FreeFem++

Page 17: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Motivation

One of the most straightforward way to solve BVP in parallel.

Based on the “divide and conquer” paradigm:1 assemble,2 factorize and3 solve smaller problems.

11 / 26 Pierre Jolivet HPC and FreeFem++

Page 18: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

A short introduction I

Consider the following BVP in R2:

∇ · (κ∇u) = F in Ω

B(u) = 0 on ∂ΩΩ

Then, solve in parallel:

∇ · (κ∇um+11 ) = F1 in Ω1

B(um+11 ) = 0 on ∂Ω1 ∩ ∂Ω

um+11 = um

2 on ∂Ω1 ∩ Ω2

∇ · (κ∇um+12 ) = F2 in Ω2

B(um+12 ) = 0 on ∂Ω2 ∩ ∂Ω

um+12 = um

1 on ∂Ω2 ∩ Ω1

12 / 26 Pierre Jolivet HPC and FreeFem++

Page 19: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

A short introduction I

Consider the following BVP in R2:

∇ · (κ∇u) = F in Ω

B(u) = 0 on ∂Ω

Ω2Ω1

Then, solve in parallel:

∇ · (κ∇um+11 ) = F1 in Ω1

B(um+11 ) = 0 on ∂Ω1 ∩ ∂Ω

um+11 = um

2 on ∂Ω1 ∩ Ω2

∇ · (κ∇um+12 ) = F2 in Ω2

B(um+12 ) = 0 on ∂Ω2 ∩ ∂Ω

um+12 = um

1 on ∂Ω2 ∩ Ω1

12 / 26 Pierre Jolivet HPC and FreeFem++

Page 20: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

A short introduction II

We have effectively divided, but we have yet to conquer.

Duplicated unknowns coupled via a partition of unity:

I =N∑

i=1RT

i DiRi ,

where RTi is the prolongation from Vi to V .

Then,

um+1 =N∑

i=1RT

i Dium+1i .

13 / 26 Pierre Jolivet HPC and FreeFem++

Page 21: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

A short introduction II

We have effectively divided, but we have yet to conquer.

Duplicated unknowns coupled via a partition of unity:

I =N∑

i=1RT

i DiRi ,

where RTi is the prolongation from Vi to V .

Then,

um+1 =N∑

i=1RT

i Dium+1i .

13 / 26 Pierre Jolivet HPC and FreeFem++

Page 22: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

One-level preconditioners

A common preconditioner is:

M−1RAS :=

N∑i=1

RTi Di(RiART

i )−1Ri .

For future references, let Aij := RiARTj .

These preconditioners don’t scale as N increases:

κ(M−1A) 6 C 1H2

(1 +

Hδh

)

They act as low-pass filters (think about mulgrid methods).

14 / 26 Pierre Jolivet HPC and FreeFem++

Page 23: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

One-level preconditioners

A common preconditioner is:

M−1RAS :=

N∑i=1

RTi Di(RiART

i )−1Ri .

For future references, let Aij := RiARTj .

These preconditioners don’t scale as N increases:

κ(M−1A) 6 C 1H2

(1 +

Hδh

)

They act as low-pass filters (think about mulgrid methods).

14 / 26 Pierre Jolivet HPC and FreeFem++

Page 24: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Two-level preconditionersA common technique in the field of DDM, MG, deflation:

introduce an auxiliary “coarse” problem.

Let Z be a rectangular matrix so that the “bad eigenvectors”of M−1A belong to the space spanned by its columns. Define

E := Z T AZ Q := ZE−1Z T .

Z has O(N) columns, hence E is relatively smaller than A.The following preconditioner can scale theoretically:

P−1A-DEF1 := M−1(I − AQ) + Q .

15 / 26 Pierre Jolivet HPC and FreeFem++

Page 25: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Two-level preconditionersA common technique in the field of DDM, MG, deflation:

introduce an auxiliary “coarse” problem.

Let Z be a rectangular matrix so that the “bad eigenvectors”of M−1A belong to the space spanned by its columns. Define

E := Z T AZ Q := ZE−1Z T .

Z has O(N) columns, hence E is relatively smaller than A.

The following preconditioner can scale theoretically:

P−1A-DEF1 := M−1(I − AQ) + Q .

15 / 26 Pierre Jolivet HPC and FreeFem++

Page 26: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Two-level preconditionersA common technique in the field of DDM, MG, deflation:

introduce an auxiliary “coarse” problem.

Let Z be a rectangular matrix so that the “bad eigenvectors”of M−1A belong to the space spanned by its columns. Define

E := Z T AZ Q := ZE−1Z T .

Z has O(N) columns, hence E is relatively smaller than A.The following preconditioner can scale theoretically:

P−1A-DEF1 := M−1(I − AQ) + Q .

15 / 26 Pierre Jolivet HPC and FreeFem++

Page 27: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Preconditioners in actionBack to our Krylov method of choice: the GMRES.

procedure GMRES(input vector x0, right-hand side b)r0 ← P−1

A-DEF1(b−Ax0)v0 ← r0/||r0||2for i = 0, . . . ,m − 1 do

w ← P−1A-DEF1Avi

for j = 0, . . . , i dohj,i ← 〈w , vj〉

end forvi+1 ← w −

∑ij=1 hj,i vj

hi+1,i ← ||vi+1||2vi+1 ← vi+1/hi+1,iapply Givens rotations to h:,i

end forym ← arg min||Hmym − βe1||2 with β = ||r0||2return x0 + Vmym

end procedure

global sparse matrix-vector multiplication

global dot products

global preconditioner-vector computation

16 / 26 Pierre Jolivet HPC and FreeFem++

Page 28: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Preconditioners in actionBack to our Krylov method of choice: the GMRES.

procedure GMRES(input vector x0, right-hand side b)r0 ← P−1

A-DEF1(b−Ax0)v0 ← r0/||r0||2for i = 0, . . . ,m − 1 do

w ← P−1A-DEF1Avi

for j = 0, . . . , i dohj,i ← 〈w , vj〉

end forvi+1 ← w −

∑ij=1 hj,i vj

hi+1,i ← ||vi+1||2vi+1 ← vi+1/hi+1,iapply Givens rotations to h:,i

end forym ← arg min||Hmym − βe1||2 with β = ||r0||2return x0 + Vmym

end procedure

global sparse matrix-vector multiplication

global dot products

global preconditioner-vector computation

16 / 26 Pierre Jolivet HPC and FreeFem++

Page 29: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Preconditioners in actionBack to our Krylov method of choice: the GMRES.

procedure GMRES(input vector x0, right-hand side b)r0 ← P−1

A-DEF1(b−Ax0)v0 ← r0/||r0||2for i = 0, . . . ,m − 1 do

w ← P−1A-DEF1Avi

for j = 0, . . . , i dohj,i ← 〈w , vj〉

end forvi+1 ← w −

∑ij=1 hj,i vj

hi+1,i ← ||vi+1||2vi+1 ← vi+1/hi+1,iapply Givens rotations to h:,i

end forym ← arg min||Hmym − βe1||2 with β = ||r0||2return x0 + Vmym

end procedure

global sparse matrix-vector multiplication

global dot products

global preconditioner-vector computation

16 / 26 Pierre Jolivet HPC and FreeFem++

Page 30: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Preconditioners in actionBack to our Krylov method of choice: the GMRES.

procedure GMRES(input vector x0, right-hand side b)r0 ← P−1

A-DEF1(b−Ax0)v0 ← r0/||r0||2for i = 0, . . . ,m − 1 do

w ← P−1A-DEF1Avi

for j = 0, . . . , i dohj,i ← 〈w , vj〉

end forvi+1 ← w −

∑ij=1 hj,i vj

hi+1,i ← ||vi+1||2vi+1 ← vi+1/hi+1,iapply Givens rotations to h:,i

end forym ← arg min||Hmym − βe1||2 with β = ||r0||2return x0 + Vmym

end procedure

global sparse matrix-vector multiplication

global dot products

global preconditioner-vector computation

16 / 26 Pierre Jolivet HPC and FreeFem++

Page 31: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

In practice

A scalable DDM framework must include routines for:• p2p communications to compute spmv,• p2p communications to apply a one-level preconditioner,• global transfer between master and slaves processes toapply “coarse” corrections,

• direct solvers for the “coarse” problem and local problems.

17 / 26 Pierre Jolivet HPC and FreeFem++

Page 32: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

In FreeFem++

subdomain A(S, D, restrictionIntersection, arrayIntersection,communicator = mpiCommWorld); // build buffers for spmv

2 preconditioner E(A); // build M−1RAS

if(CoarseOperator) ...attachCoarseOperator(E, A, A = GEVPA, B = GEVPB, parameters

= parm); // build P−1A-DEF1

6 GMRESDDM(A, E, u[], rhs, dim = 100, iter = 100, eps = eps);

18 / 26 Pierre Jolivet HPC and FreeFem++

Page 33: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Nonlinear elasticity

We solve an evolution problem of hyperelasticity on a beam,find u such that: ∀t ∈ [0; T ], ∀v ∈ [H1(Ω)]

3,∫

Ωρu · v +

∫Ω

(I +∇u) Σ︸ ︷︷ ︸=:F(u)

: ∇v =∫

Ωf · v +

∫∂ΩS

g · v

u = 0 on ∂ΩD

σ(u) · n = 0 on ∂ΩN

whereΣ = λtr(E )I + 2µE

19 / 26 Pierre Jolivet HPC and FreeFem++

Page 34: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

In FreeFem++

Add another operator rebuild (currently only rebuilds Aii−1).

The rest is (almost) the same.1 for(int k = 0; k < 200; ++k)

SchemeInit(u)if(k > 0)

S = vPbNonlinear(Wh, Wh, solver = CG);5 rhs = vPbLinear(0, Wh);

rebuild(S, A, E);GMRESDDM(A, E, h[], rhs, dim = 100, iter = 100, eps = eps);

9 SchemeAdvance(u, h)

20 / 26 Pierre Jolivet HPC and FreeFem++

Page 35: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Outline

1 IntroductionFlashback to last yearA new machine

2 New solversMUMPSPARDISO

3 Domain decomposition methodsThe necessary toolsPreconditioning Krylov methodsNonlinear problems

4 Conclusion

21 / 26 Pierre Jolivet HPC and FreeFem++

Page 36: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Amuses bouche before lunch I

1 0242 048

4 0966 144

8 192

1

2

4

6

8

#processes

Tim

ingrelative

to102

4processes

Linear speedup 10

15

20

25

#iterations

Linear elasticity in 2D with P3 FE.1 billion unknowns solved in 31 seconds at peak performance.

22 / 26 Pierre Jolivet HPC and FreeFem++

Page 37: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Amuses bouche before lunch II

1 0242 048

4 0966 144

8 192

1

2

4

6

8

#processes

Tim

ingrelative

to102

4processes

Linear speedup 10

15

20

25

#iterations

Linear elasticity in 3D with P2 FE.80 million unknowns solved in 35 seconds at peak performance.

23 / 26 Pierre Jolivet HPC and FreeFem++

Page 38: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Amuses bouche before lunch III

1 0242 048

4 0966 144

8 19212 288

0%

20%

40%

60%

80%

100%

#processes

Weakeffi

ciency

relativeto

1024processes

844

10 176

#d.o.f.(inmillion

s)

Scalar diffusivity in 2D with P3 FE.10 billion unknowns solved in 160 seconds on 12 228 processes.

24 / 26 Pierre Jolivet HPC and FreeFem++

Page 39: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

Amuses bouche before lunch IV

1 0242 048

4 0966 144

8 1920%

20%

40%

60%

80%

100%

#processes

Weakeffi

ciency

relativeto

1024processes

130

1 051

#d.o.f.(inmillion

s)

Scalar diffusivity in 3D with P2 FE.1 billion unknowns solved in 150 seconds on 8 192 processes.

25 / 26 Pierre Jolivet HPC and FreeFem++

Page 40: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

To sum up

1 FreeFem+++ = easy framework to solve large systems.

Two-level DDM2 New solvers give performance boost on smaller systems.

New problems being tackled:• more complex nonlinearities,• reuse of Krylov and deflation subspaces,• BNN and FETI methods.

Thank you for your attention.

26 / 26 Pierre Jolivet HPC and FreeFem++

Page 41: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

To sum up

1 FreeFem+++ = easy framework to solve large systems.

Two-level DDM2 New solvers give performance boost on smaller systems.

New problems being tackled:• more complex nonlinearities,• reuse of Krylov and deflation subspaces,• BNN and FETI methods.

Thank you for your attention.

26 / 26 Pierre Jolivet HPC and FreeFem++

Page 42: Recent advances in HPC with FreeFem+++/uploads/Main/P-Jolivet.pdf · 2013-01-31 · Recent advances in HPC with FreeFem++ PierreJolivet Laboratoire Jacques-Louis Lions Laboratoire

Introduction New solvers Domain decomposition methods Conclusion

To sum up

1 FreeFem+++ = easy framework to solve large systems.

Two-level DDM2 New solvers give performance boost on smaller systems.

New problems being tackled:• more complex nonlinearities,• reuse of Krylov and deflation subspaces,• BNN and FETI methods.

Thank you for your attention.26 / 26 Pierre Jolivet HPC and FreeFem++