![Page 1: A Parallel Implementation of the BDDC Method for the ...users.math.cas.cz/~sistek/talks/Sistek-2010-ICCFD-talk.pdf · A Parallel Implementation of the BDDC Method for the Stokes Flow](https://reader034.vdocument.in/reader034/viewer/2022042119/5e9880c867fab03eb348912b/html5/thumbnails/1.jpg)
A Parallel Implementation of the BDDC Method for the Stokes Flow
Jakub Šístek
joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík
Institute of Mathematics of the AS CR, Prague
Czech Technical University, Prague
University of Colorado Denver
July 16th, 2010, ICCFD6, St. Petersburg
Table of contents
Stokes problem and mixed FEM
BDDC method
Parallel implementation
Numerical results
Conclusion
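Steady Stokes problem
Find flow velocity $u(x) \in [C^2(\Omega)]^d$ and pressure $p(x) \in C^1(\Omega)/\mathbb{R}$ satisfying
$$
\begin{aligned}
-\nu \Delta u + \nabla p &= f && \text{in } \Omega, \\
-\nabla \cdot u &= 0 && \text{in } \Omega, \\
u &= g && \text{on } \partial\Omega_g, \\
-\nu (\nabla u)\, n + p\, n &= 0 && \text{on } \partial\Omega_h,
\end{aligned}
$$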
- $d = 2, 3$ ... spatial dimension
- $\Omega \subset \mathbb{R}^d$ ... domain with Lipschitz boundary $\partial\Omega$, filled with an incompressible viscous fluid
- $\nu$ ... constant positive kinematic viscosity of the fluid
- $f(x)$ ... vector of intensity of volume forces per mass unit
- $\partial\Omega_g$ and $\partial\Omega_h$ ... subsets of $\partial\Omega$ satisfying $\partial\Omega = \partial\Omega_g \cup \partial\Omega_h$
- $n$ ... unit outer normal vector to the boundary $\partial\Omega$
Weak formulation
Function spaces
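$$
V_g = \left\{ v = (v_1, \dots, v_d) \;\middle|\; v \in [H^1(\Omega)]^d;\ \operatorname{Tr} v_i = g_i,\ i = 1, \dots, d, \text{ on } \partial\Omega_g \right\},
$$
$$
V = \left\{ v = (v_1, \dots, v_d) \;\middle|\; v \in [H^1(\Omega)]^d;\ \operatorname{Tr} v_i = 0,\ i = 1, \dots, d, \text{ on } \partial\Omega_g \right\}.
$$
Find $u(x) \in V_g$, $u - u_g \in V$, and $p(x) \in L^2(\Omega)/\mathbb{R}$ satisfying
$$
\begin{aligned}
\nu \int_\Omega \nabla u : \nabla v \, d\Omega - \int_\Omega p \, \nabla \cdot v \, d\Omega &= \int_\Omega f \cdot v \, d\Omega && \forall v \in V, \\
-\int_\Omega \psi \, \nabla \cdot u \, d\Omega &= 0 && \forall \psi \in L^2(\Omega).
\end{aligned}
$$
- $u_g \in V_g$ satisfies the Dirichlet boundary condition $g$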
Approximation of the problem by finite element method
Triangulation $\tau_h$ of the domain $\Omega$ by Taylor–Hood finite elements.
[Figure: domain $\Omega$ with triangulation $\tau_h$ and a finite element $K$. One node marker denotes nodes carrying velocity components and pressure; the other denotes nodes carrying velocity components only.]
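Approximation of the problem by finite element method
Taylor–Hood finite elements satisfy the Babuška–Brezzi condition
$$
\exists\, C_B > 0 \text{ const.},\ \forall \psi_h \in Q_h: \quad \sup_{v_h \in V_h} \frac{(\psi_h, \nabla \cdot v_h)_0}{\|v_h\|_1} \ge C_B \|\psi_h\|_0
$$
Function spaces for approximation:
- velocities
$$
V_{gh} = \left\{ v_h \in [C(\Omega)]^d;\ v_{hi}|_K \in R_2(K),\ i = 1, \dots, d;\ v_h = g \text{ on } \partial\Omega_g \right\}
$$
- pressure and test functions for the continuity equation
$$
Q_h = \left\{ \psi_h \in C(\Omega);\ \psi_h|_K \in R_1(K) \right\}
$$
- test functions for momentum equations
$$
V_h = \left\{ v_h \in [C(\Omega)]^d;\ v_{hi}|_K \in R_2(K),\ i = 1, \dots, d;\ v_h = 0 \text{ on } \partial\Omega_g \right\}
$$
where
$$
R_m(K) =
\begin{cases}
P_m(K), & \text{if } K \text{ is a triangle/tetrahedron,} \\
Q_m(K), & \text{if } K \text{ is a quadrilateral/hexahedron.}
\end{cases}
$$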
Matrix problem
Discretization leads to the saddle point problem
$$
\begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix}
\begin{bmatrix} u \\ p \end{bmatrix}
=
\begin{bmatrix} f \\ 0 \end{bmatrix}
$$
- $u$ ... velocity unknowns
- $p$ ... pressure unknowns
- $A$ ... vector-Laplacian matrix
- $B$ ... divergence matrix
- $f$ ... discrete vector of intensity of volume forces per mass unit
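The block structure of the saddle point problem can be illustrated on a toy system. The sketch below is not the talk's discretization: the blocks `A`, `B` and the vector `f` are small hypothetical matrices chosen only so that `A` is symmetric positive definite and `B` has full row rank. It assembles the block matrix, solves it, and counts the signs of the eigenvalues to show that the assembled matrix is symmetric but indefinite.

```python
import numpy as np

def assemble_saddle_point(A, B):
    """Return K = [[A, B^T], [B, 0]] for an SPD block A and a full-rank block B."""
    m = B.shape[0]
    return np.block([[A, B.T], [B, np.zeros((m, m))]])

# Hypothetical toy blocks (not from the talk): 3 velocity unknowns, 1 pressure unknown.
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
B = np.array([[1.0, 0.0, -1.0]])
f = np.array([1.0, 0.0, 0.0])

K = assemble_saddle_point(A, B)
rhs = np.concatenate([f, np.zeros(B.shape[0])])
solution = np.linalg.solve(K, rhs)
u, p = solution[:3], solution[3:]

# K is symmetric, so eigvalsh applies; its eigenvalues have mixed signs.
eigenvalues = np.linalg.eigvalsh(K)
n_positive = int(np.sum(eigenvalues > 0))
n_negative = int(np.sum(eigenvalues < 0))
print("u =", u, "p =", p)
print("positive eigenvalues:", n_positive, "negative eigenvalues:", n_negative)
```

In this toy case the number of positive eigenvalues equals the number of velocity unknowns and the number of negative eigenvalues equals the number of pressure unknowns.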
Brief overview of BDDC method
- Balancing Domain Decomposition by Constraints
- introduced in 2003 by C. Dohrmann (Sandia), theory with J. Mandel (UCD)
- nonoverlapping primal domain decomposition method
- equivalent to FETI-DP [Mandel, Dohrmann, Tezaur 2005]
- for SPD problems, the condition number $\kappa$ satisfies
$$
\kappa \le C \log^2\left(1 + \frac{H}{h}\right)
$$
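To get a feeling for how slowly this bound grows, the factor $\log^2(1 + H/h)$ can be tabulated for a few ratios $H/h$ (subdomain size over element size). This is only a sketch: the constant $C$ is not given in the talk, so the function below (a name introduced here for illustration) evaluates the factor without it, using the natural logarithm.

```python
import math

def polylog_factor(H_over_h):
    """Return log^2(1 + H/h), the factor multiplying C in the bound on kappa."""
    if H_over_h <= 0:
        raise ValueError("H/h must be positive")
    return math.log(1.0 + H_over_h) ** 2

# C is unknown here, so only the growth of the factor is shown.
for ratio in (4, 8, 16, 32, 64):
    print(f"H/h = {ratio:3d}  ->  log^2(1 + H/h) = {polylog_factor(ratio):.3f}")
```

Increasing $H/h$ sixteenfold, from 4 to 64, raises the factor less than sevenfold, which is why the bound is described as polylogarithmic in the subdomain size.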
The abstract problem in BDDC
Variational setting
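$$
u \in U : \quad a(u, v) = \langle f, v \rangle \quad \forall v \in U
$$
- $a(\cdot, \cdot)$ is a symmetric positive definite form on $U$
- $\langle \cdot, \cdot \rangle$ is an inner product on $U$
- $U$ is a finite dimensional space
Matrix form
$$
u \in U : \quad A u = f
$$
- $A$ is a symmetric positive definite matrix on $U$
Linked together
$$
\langle A u, v \rangle = a(u, v) \quad \forall u, v \in U
$$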
BDDC set-up
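- division into subdomains
- selection of coarse problem nodes (also called corners)
[Figure: domain $\Omega$ divided into subdomains, with labels for subdomain $i$, the interface, the coarse problem nodes, the finite elements, the subdomain size $H$, and the element size $h$.]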
Function spaces in BDDC
$$
U \subset W^c \subset W
$$
where $U$ is continuous, $W^c$ is continuous at coarse problem nodes, and $W$ has no continuity.
- enough coarse nodes to fix floating subdomains, so that rigid body modes are captured
- $a(\cdot, \cdot)$ is a symmetric positive definite form on $W^c$
- the corresponding matrix $A^c$ is symmetric positive definite, has an almost block diagonal structure, and has larger dimension than $A$
- projection operator $E : W^c \to U$, $\operatorname{Range}(E) = U$, e.g. averaging across interfaces (arithmetic, weighted)
The BDDC preconditioner with corners
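Define $M_{BDDC} : r \in U \longrightarrow u \in U$
variational form
$$
M_{BDDC} : r \longmapsto u = E w, \quad w \in W^c : \quad a(w, z) = \langle r, E z \rangle \quad \forall z \in W^c
$$
matrix form
$$
\begin{aligned}
A^c w &= E^T r \\
M_{BDDC}\, r &= E w
\end{aligned}
$$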
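The matrix form can be tried on a toy problem. The sketch below is an assumption-laden illustration, not the talk's implementation: it uses a 1D Laplacian split into two subdomains, each clamped at its outer end, so that no corner is needed and $A^c$ is simply block diagonal with a duplicated interface unknown. The helper names (`local_matrix`, `apply_bddc`, the injection `R`) are introduced here for illustration, and $E$ is arithmetic averaging at the interface.

```python
import numpy as np

def local_matrix(n, free_end):
    """1D Laplacian (unit mesh size) on one subdomain with n unknowns.

    The subdomain is clamped at its outer end; the unknown at `free_end`
    ('first' or 'last') is the interface copy and sees only one element.
    """
    K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    idx = 0 if free_end == "first" else n - 1
    K[idx, idx] = 1.0
    return K

n1, n2 = 3, 5                    # unknowns per subdomain, each including its interface copy
n_global = n1 + n2 - 1           # the interface unknown is shared in U

# A^c: one block per subdomain, no coupling across the interface.
Ac = np.zeros((n1 + n2, n1 + n2))
Ac[:n1, :n1] = local_matrix(n1, "last")
Ac[n1:, n1:] = local_matrix(n2, "first")

# R injects a continuous vector of U into W^c by copying the interface value.
R = np.zeros((n1 + n2, n_global))
for local in range(n1):
    R[local, local] = 1.0
for local in range(n2):
    R[n1 + local, n1 - 1 + local] = 1.0

A = R.T @ Ac @ R                          # assembled global matrix on U
E = np.diag(1.0 / R.sum(axis=0)) @ R.T    # arithmetic averaging W^c -> U

def apply_bddc(r):
    """M_BDDC r = E w, where w solves A^c w = E^T r."""
    w = np.linalg.solve(Ac, E.T @ r)
    return E @ w

# Build M column by column to inspect the preconditioned spectrum.
M = np.column_stack([apply_bddc(e) for e in np.eye(n_global)])
eigs = np.sort(np.linalg.eigvals(M @ A).real)
print("eigenvalues of M_BDDC A:", np.round(eigs, 4))
```

Because $E$ acts as the identity on continuous vectors, the eigenvalues of the preconditioned matrix in this sketch are bounded below by 1; the upper end of the spectrum is what the choice of constraints influences.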
Fictitious mesh
- $A^c$ can be constructed using an auxiliary mesh
BDDC for the Stokes problem
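In the abstract form
$$
A u = f,
$$
simply put
$$
A = \begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix},
\qquad
u = \begin{bmatrix} u \\ p \end{bmatrix},
\qquad \text{and} \qquad
f = \begin{bmatrix} f \\ 0 \end{bmatrix},
$$
where the symbols on the left-hand sides are those of the abstract problem and the blocks on the right-hand sides are those of the Stokes saddle point problem.
- system matrix and preconditioner are symmetric indefinite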
Parallel implementation
- built on a multifrontal solver
  - MUltifrontal Massively Parallel sparse direct Solver (MUMPS), http://mumps.enseeiht.fr
- based on $W^c$
- Fortran 90 programming language, MPI library
- experiments on
  - SGI Altix 4700, CTU, Prague, CR; 72 Intel Itanium 2 processors, OS Linux
Stokes flow in 2D lid driven cavity
- $128^2 = 16\,384$ Taylor–Hood elements, 115 971 dof
- 8 subdomains, 14 corners
Stokes flow in 2D lid driven cavity
- 8 processors of SGI Altix 4700
- 59 PCG iterations, 17.2 s (serial frontal algorithm: 231 s)
[Figures: streamlines; pressure. Both axes range from 0 to 1.]
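The iteration counts refer to the preconditioned conjugate gradient (PCG) method with a relative residual stopping test of the form $\|r\|_2 / \|g\|_2 < \text{tol}$, as used elsewhere in the talk. A minimal sketch of such a loop follows, assuming a generic callable preconditioner. The SPD toy matrix and the Jacobi (diagonal) preconditioner are stand-ins chosen for this sketch; they are not the talk's Stokes system or its BDDC code, and the function name `pcg` is introduced here.

```python
import numpy as np

def pcg(A, g, apply_preconditioner, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradients for A x = g.

    Stops when ||r||_2 / ||g||_2 < tol; returns (x, number of iterations).
    """
    x = np.zeros_like(g, dtype=float)
    r = g - A @ x
    norm_g = np.linalg.norm(g)
    if norm_g == 0.0:
        return x, 0
    z = apply_preconditioner(r)
    p = z.copy()
    rz = r @ z
    for iteration in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) / norm_g < tol:
            return x, iteration
        z = apply_preconditioner(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # new search direction
        rz = rz_new
    return x, max_iter

# Toy SPD system (hypothetical): shifted 1D Laplacian with a varying diagonal.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + np.diag(np.linspace(0.0, 5.0, n))
g = np.ones(n)
diag_A = np.diag(A).copy()

x, iterations = pcg(A, g, lambda r: r / diag_A)
print("iterations:", iterations)
print("relative residual:", np.linalg.norm(g - A @ x) / np.linalg.norm(g))
```

In a BDDC setting the lambda would be replaced by a routine that applies $M_{BDDC}$. Note that standard CG theory assumes an SPD matrix and preconditioner, whereas the Stokes system in the talk is symmetric indefinite.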
Stokes flow in 2D lid driven cavity
- other iterative methods and preconditioners
- $\|r\|_2 / \|g\|_2 < 10^{-8}$
- Matlab results

| method | no prec. | BDDC $W^c$ | BDDC $W^c$+F | ILUT $\tau = 10^{-3}$ | ILUT $\tau = 10^{-4}$ | ILUT $\tau = 10^{-5}$ |
|---|---|---|---|---|---|---|
| BICGSTAB | n/a | 45 | 22 | n/a | 331 | 10 |
| GMRES | 759 | 49 | 38 | 472 | 87 | 18 |

- n/a: no convergence
- F: continuity of arithmetic averages on faces
Stokes flow in 3D channel
- a quarter of the channel
- 3 393 Taylor–Hood finite elements, 54 248 unknowns
- division into 4 subdomains
Stokes flow in 3D channel
- $\|r\|_2 / \|f\|_2 < 10^{-6}$ reached in 33 PCG iterations
- pressure with streamlines
Stokes flow in 3D lid driven cavity
- 3D extension of the 2D lid driven cavity flow
- unit cube
- tangential velocity rotated by $\pi/8$
- kinematic viscosity 0.01
[Figure: unit cube with axes $x$, $y$, $z$ and the lid velocity $u$ rotated by $\pi/8$.]
Stokes flow in 3D lid driven cavity
- $32^3 = 32\,768$ Taylor–Hood finite elements, 457 380 unknowns
- division into 32 subdomains by the METIS graph partitioner
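The unknown counts quoted for the two cavity problems can be cross-checked with a short calculation. The talk does not state which velocity nodes are used or whether Dirichlet unknowns are included, so the following is an inference rather than a documented fact: the counts are reproduced if one assumes uniform $n^d$ grids, velocity nodes at grid vertices and edge midpoints (8-node quadrilaterals, 20-node hexahedra), pressure nodes at vertices, and no unknowns removed for boundary conditions. The function name is hypothetical.

```python
def taylor_hood_unknowns(n, dim):
    """Unknowns on an n^dim grid of elements, ASSUMING serendipity velocity nodes.

    Velocity nodes: grid vertices plus edge midpoints.
    Pressure nodes: grid vertices only.
    No unknowns are removed for Dirichlet boundary conditions.
    """
    vertices = (n + 1) ** dim
    edges = dim * n * (n + 1) ** (dim - 1)
    velocity_nodes = vertices + edges
    return dim * velocity_nodes + vertices

print(taylor_hood_unknowns(128, 2))   # 115971, the 2D cavity count
print(taylor_hood_unknowns(32, 3))    # 457380, the 3D cavity count
```

The agreement with both reported numbers makes the assumption plausible, but other element and numbering choices are not ruled out by the slides.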
Stokes flow in 3D lid driven cavity
- stopping criterion $\|r\|_2 / \|g\|_2 < 10^{-6}$
- 46 PCG iterations
- 731 s on 32 processors
- streamline through the point with coordinates [0.5, 0.55, 0.5]
Conclusion and future direction
- implementation based on MUMPS is simpler than the subdomain-by-subdomain approach
- lack of theoretical background for BDDC on indefinite problems for some elements
- study of applying the BDDC implementation to the Stokes problem 'as is'
- BDDC is applicable to problems other than SPD and to methods other than PCG
- more sophisticated (adaptive) selection of constraints: ongoing research
- application as a block preconditioner
Introduce matrix $G$ with constraints
- each row of $G$ corresponds to a continuity constraint between two subdomains
- introduces new coupling between subdomains
Example: for arithmetic averages on an edge between subdomains $i$ and $j$, a row of $G$ is
$$
g_k = [\,0 \ \dots \ 0 \ \underbrace{1 \ 1 \ 1 \ 1}_{\text{edge dof on } \Omega_i} \ 0 \ \dots \ 0 \ \underbrace{-1 \ -1 \ -1 \ -1}_{\text{edge dof on } \Omega_j} \ 0 \ \dots \ 0\,]
$$
Define the intermediate space as
$$
\widetilde{W} = \left\{ w \in W^c : G w = 0 \right\}
$$
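A row of $G$ for an edge average can be built and tested directly. The sketch below uses a hypothetical numbering (12 dofs, four edge dofs per subdomain at made-up indices) and a helper name introduced here; it only illustrates that $Gw = 0$ holds exactly when the two subdomains agree on the edge average, even if the pointwise edge values differ.

```python
import numpy as np

def edge_average_row(n_dofs, edge_dofs_i, edge_dofs_j):
    """Row g_k of G: +1 on the edge dofs of subdomain i, -1 on those of subdomain j."""
    if len(edge_dofs_i) != len(edge_dofs_j):
        raise ValueError("both subdomains must see the same number of edge dofs")
    row = np.zeros(n_dofs)
    row[list(edge_dofs_i)] = 1.0
    row[list(edge_dofs_j)] = -1.0
    return row

# Hypothetical numbering: 12 dofs; the shared edge has 4 dofs,
# stored at indices 2..5 on subdomain i and 8..11 on subdomain j.
g_k = edge_average_row(12, range(2, 6), range(8, 12))
G = g_k[np.newaxis, :]

w_equal = np.zeros(12)
w_equal[2:6] = [1.0, 2.0, 3.0, 4.0]    # edge values seen from subdomain i (average 2.5)
w_equal[8:12] = [4.0, 3.0, 2.0, 1.0]   # edge values seen from subdomain j (average 2.5)

w_unequal = w_equal.copy()
w_unequal[8] = 5.0                     # changes the average on subdomain j

print("equal averages:   G w =", G @ w_equal)
print("unequal averages: G w =", G @ w_unequal)
```

Since both subdomains see the same number of edge dofs, equal sums are the same as equal arithmetic averages, which is why the row uses plain entries of 1 and -1.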
Enforcing additional constraints
Change of variables on each subdomain, such that averages appear as single node constraints.
$$
\bar{w} = T w, \qquad w = B \bar{w}, \qquad B = T^{-1}
$$
Matrix $T$ is invertible and contains the weights of the averages.
Compute $M_{BDDC}\, r = E B \bar{w}$, where $\bar{w}$ is the solution to
$$
\begin{aligned}
B^T A^c B \bar{w} + B^T G^T \lambda &= B^T E^T r \\
G B \bar{w} &= 0.
\end{aligned}
$$
Transformed averages may be handled as corners and further assembled [Li, Widlund 2006].
Drawback: the distinction between $W^c$ and $\widetilde{W}$ is lost.
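Projected change of variables
Combination of projected BDDC and change of variables:
- introduce matrix $G$ with constraints,
- define matrix $\bar{G} = G B$, which reduces to one $1$ and one $-1$ in each row.
Projection onto $\operatorname{null}(\bar{G})$
$$
P = I - \bar{G}^T \left( \bar{G} \bar{G}^T \right)^{-1} \bar{G}
$$
Construct matrix
$$
\widetilde{A} = P B^T A^c B P + t (I - P)
$$
BDDC preconditioner as
$$
\begin{aligned}
\widetilde{A} \bar{w} &= P B^T E^T r \\
M_{BDDC}\, r &= E B \bar{w}
\end{aligned}
$$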
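The projection step can be checked numerically on a small example. In the sketch below everything except the two formulas is an assumption made for illustration: the transformed constraint matrix has two made-up rows with one 1 and one -1 each, the matrix `S` stands in for $B^T A^c B$, the right-hand side stands in for $B^T E^T r$, and $t = 1$ is an arbitrary positive scalar. The code forms the projector onto the null space of the constraints, builds the projected matrix, and confirms that the solution satisfies the constraints.

```python
import numpy as np

def nullspace_projector(G_bar):
    """Orthogonal projector P = I - G^T (G G^T)^{-1} G onto null(G_bar)."""
    n = G_bar.shape[1]
    return np.eye(n) - G_bar.T @ np.linalg.solve(G_bar @ G_bar.T, G_bar)

# Hypothetical data: 5 dofs and two constraints, each with one +1 and one -1.
G_bar = np.array([[1.0, 0.0, -1.0, 0.0,  0.0],
                  [0.0, 1.0,  0.0, 0.0, -1.0]])
P = nullspace_projector(G_bar)

# S stands in for B^T A^c B; any symmetric positive definite matrix works here.
S = 2.0 * np.eye(5) - np.eye(5, k=1) - np.eye(5, k=-1)
t = 1.0                                          # arbitrary positive scaling
A_tilde = P @ S @ P + t * (np.eye(5) - P)

rhs = P @ np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # plays the role of P B^T E^T r
w_bar = np.linalg.solve(A_tilde, rhs)

print("constraint residual:", G_bar @ w_bar)
print("smallest eigenvalue of A_tilde:", np.linalg.eigvalsh(A_tilde).min())
```

The term $t(I - P)$ fills in the complement of the null space, so the projected matrix stays nonsingular while the right-hand side, being in the range of $P$, keeps the solution inside the constrained space.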