Danny Dunlavy, Andy Salinger
Sandia National Laboratories
Albuquerque, New Mexico, USA
SIAM Parallel Processing
February 23, 2006
SAND2006-1075C
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
Preconditioners for the Space-Time Solution
of Large-Scale PDE Applications
SIAM Parallel Processing 2006
Motivation
• Large-scale Transient Applications
• Space-Time Formulations
– Transient calculations:
• Initial conditions and parameter
– Space-time formulations:
• Parallelism in time (and space)
• Intermediate/final values
• Integrated values
• Periodic orbits
– Applications
• Current: Fluid flow (MPSalsa)
• Planned: Semiconductor devices (Charon), Fluid/structure problems (Aria/Sierra)
Space-Time Formulation
Transient Simulation of:
First solve:
Then solve:
Then solve:
Instead, solve for all solutions
at once:
where
… and with Newton solve:
Solve system with GMRES (right preconditioning)
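The steps above can be written out as a worked sketch. The model equation, the backward Euler discretization, and all symbols (f, g, G, X, J, A_n, B_n, M, Δt, N) are assumptions chosen for illustration, not taken from the slide.

```latex
% Assumed model: \dot{x} = f(x), backward Euler with step \Delta t, N time steps
g(x_n, x_{n-1}) \equiv x_n - x_{n-1} - \Delta t\, f(x_n) = 0, \qquad n = 1, \dots, N

% Time stepping: solve g(x_1, x_0) = 0, then g(x_2, x_1) = 0, and so on, one step at a time.
% Space-time formulation: solve for all time steps at once
G(X) = \begin{bmatrix} g(x_1, x_0) \\ g(x_2, x_1) \\ \vdots \\ g(x_N, x_{N-1}) \end{bmatrix} = 0,
\qquad
X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}

% Newton step: the Jacobian is block lower bidiagonal
J\, \Delta X = -G(X), \qquad
J = \begin{bmatrix} A_1 & & & \\ B_2 & A_2 & & \\ & \ddots & \ddots & \\ & & B_N & A_N \end{bmatrix},
\qquad A_n = I - \Delta t\, f'(x_n), \quad B_n = -I

% Right-preconditioned GMRES with preconditioner M
J M^{-1} y = -G(X), \qquad \Delta X = M^{-1} y
```

Under these assumptions each diagonal block A_n is the Jacobian of a single implicit time step, so a preconditioner can reuse whatever block solve the transient code already has.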
Space-Time Preconditioners
• Global
• Sequential
• Parallel
• Block Diag
• “Parareal” (Multilevel)
[Diagram legend: Solve/Precondition; Multiply, Add]
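A minimal sketch of how three of these preconditioners might differ, under one plausible reading: Sequential sweeps forward through every time step, Block Diag solves each time step in isolation, and Parallel sweeps forward inside each time domain while ignoring the coupling between domains. The scalar model problem (backward Euler on x' = -a x), the function names, and the use of a plain Richardson iteration in place of GMRES are all assumptions made to keep the example short. Global and Parareal are not covered.

```python
def apply_A(x, d):
    """Multiply by the space-time matrix: d on the diagonal, -1 below it."""
    return [d * x[n] - (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]


def apply_preconditioner(r, d, n_time_domains):
    """Forward substitution inside each time domain, ignoring coupling
    between domains. n_time_domains=1 acts like a sequential sweep;
    n_time_domains=len(r) acts like a block-diagonal preconditioner."""
    n_steps = len(r)
    assert n_steps % n_time_domains == 0, "time steps must divide evenly"
    steps_per_domain = n_steps // n_time_domains
    z = [0.0] * n_steps
    for n in range(n_steps):
        first_in_domain = (n % steps_per_domain == 0)
        previous = 0.0 if first_in_domain else z[n - 1]
        z[n] = (r[n] + previous) / d
    return z


def richardson(rhs, d, n_time_domains, tol=1e-12, max_iter=100):
    """Preconditioned Richardson iteration (a stand-in for GMRES).
    Returns the solution and the number of preconditioner applications."""
    x = [0.0] * len(rhs)
    for k in range(max_iter):
        r = [b - ax for b, ax in zip(rhs, apply_A(x, d))]
        if max(abs(v) for v in r) < tol:
            return x, k
        z = apply_preconditioner(r, d, n_time_domains)
        x = [xi + zi for xi, zi in zip(x, z)]
    return x, max_iter


d = 1.1                    # 1 + a*dt with a*dt = 0.1 (illustrative value)
rhs = [1.0] + [0.0] * 7    # 8 time steps, initial condition x_0 = 1 in the first equation
for n_domains in (1, 2, 4, 8):
    _, iters = richardson(rhs, d, n_domains)
    print(n_domains, "time domains:", iters, "iterations")
```

On this model problem the iteration count equals the number of time domains, because each application of the preconditioner carries information across only one domain boundary. More time domains give more parallel work per iteration but require more iterations.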
Space and Time Partitioned Independently
Ex: 4 Time Steps on 4 Procs
[Figure: Spatial Domains and Space-Time Domains, with blocks labeled proc 0 to proc 3, for three partitionings]
• 4 time domains, 1 spatial domain: each processor owns 1 time step for the entire spatial domain
• 1 time domain, 4 spatial domains: each processor owns 4 time steps for ¼ of the spatial domain
• 2 time domains, 2 spatial domains: each processor owns 2 time steps for ½ of the spatial domain
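The three partitionings above can be enumerated with a short sketch. The function name `owner_map` and the rank ordering (ranks are grouped by time domain first, then by spatial domain) are assumptions for illustration; the actual rank layout is not stated on the slide.

```python
def owner_map(n_procs, n_time_domains, n_time_steps):
    """Map each processor rank to (spatial_domain, list_of_time_steps).

    Assumed convention: consecutive ranks share a time domain and
    differ in spatial domain.
    """
    assert n_procs % n_time_domains == 0, "procs must split evenly over time domains"
    assert n_time_steps % n_time_domains == 0, "time steps must split evenly"
    n_spatial_domains = n_procs // n_time_domains
    steps_per_domain = n_time_steps // n_time_domains
    layout = {}
    for rank in range(n_procs):
        time_domain, spatial_domain = divmod(rank, n_spatial_domains)
        first = time_domain * steps_per_domain
        layout[rank] = (spatial_domain, list(range(first, first + steps_per_domain)))
    return layout


# 4 time steps on 4 processors, for 4, 1, and 2 time domains
for n_time_domains in (4, 1, 2):
    print(n_time_domains, "time domain(s):", owner_map(4, n_time_domains, 4))
```

With 2 time domains, for example, each rank holds one of 2 spatial domains and 2 of the 4 time steps.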
Preliminary Analysis – Computational Time
Time Integration
Sequential (preconditioning only, 1 time domain)
Sequential (preconditioning only, Nproc time domains)
Parallel (Nproc time domains)
Parareal (Nproc time domains)
Global (Nproc time domains)
Demonstration Problem
• Frank-Kamenetskii explosion model
– Extended to include reactant consumption term
– 5 scalar PDEs
– 5 unknowns:
[Figure: problem domain with boundaries labeled "insulated" and "axis of symmetry"]
Numerical Experiments
• Methods
– MPSalsa: FEM: 64 x 48 elements, time steps: 32, unknowns: 509,600
– Trilinos: Newton (NOX): 4–7 iterations
GMRES (Aztec): 400 max. outer, 200 max. inner iterations
ILUk (Ifpack): k=1 (fill)
Continuation in (LOCA): 1 step
• Fixed Number of Spatial Domains (4)
– Processors: 4 8 16 32 64 128
– Time Domains: 1 2 4 8 16 32
– How much can parallelism in time speed up the solve?
• Fixed Number of Processors (32)
– Spatial domains: 1 2 4 8 16 32
– Time domains: 32 16 8 4 2 1
– How can space-time parallelism be used most effectively?
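The problem sizes quoted above can be checked with a little arithmetic. This sketch assumes one node per element vertex (for example bilinear quadrilateral elements), which is not stated on the slide but reproduces the quoted unknown count; the function name is hypothetical.

```python
def space_time_unknowns(nx_elements, ny_elements, unknowns_per_node, n_time_steps):
    """Total unknowns in the space-time system.

    Assumption: a structured mesh with one node per element vertex,
    so an nx-by-ny element mesh has (nx + 1) * (ny + 1) nodes.
    """
    nodes = (nx_elements + 1) * (ny_elements + 1)
    return nodes * unknowns_per_node * n_time_steps


# 64 x 48 elements, 5 unknowns per node, 32 time steps
print(space_time_unknowns(64, 48, 5, 32))

# Processor count = spatial domains * time domains in both experiments
time_domains = [1, 2, 4, 8, 16, 32]
print([4 * t for t in time_domains])             # fixed 4 spatial domains
print([(32 // t, t) for t in time_domains])      # fixed 32 processors
```

The first call gives 65 × 49 = 3,185 nodes, 15,925 unknowns per time step, and 509,600 unknowns in total, matching the figure quoted for MPSalsa.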
Results – Fixed Number of Spatial Domains (4)
Processors                 4     8    16    32    64   128
Time Domains               1     2     4     8    16    32
Sequential (1e-6, P)     236   164   131   115   108   104
Sequential (1e-2, P)     217   139    94    74    67    65
Sequential (P, 1e-3)     931   636   477   380   352   357
Parallel (1e-6, 1e-3)    331   210   148   116    98    93
Parallel (P, 1e-3)       943   477   246   108    61    53
Block Diag (P, 1e-3)    1027   523   263   110    64    53
Global (1e-3)            958   491   244   105    57    46

Parareal (four values reported per row):
Parareal (1e-6, P)       237   112   145   119
Parareal (P, 1e-3)       950   277   181   106

Preconditioner (block solve tolerance, GMRES tolerance); P = preconditioning only
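A short sketch of how speedup and parallel efficiency can be read off the table above. It treats the tabulated values as solve times (the slide does not state units) and uses the 4-processor column as the baseline; both choices, and the function names, are assumptions. The Parareal rows are left out because they are incomplete.

```python
processors = [4, 8, 16, 32, 64, 128]

# Three rows copied from the table (4 spatial domains, 1 to 32 time domains)
times = {
    "Sequential (1e-2, P)": [217, 139, 94, 74, 67, 65],
    "Parallel (P, 1e-3)": [943, 477, 246, 108, 61, 53],
    "Global (1e-3)": [958, 491, 244, 105, 57, 46],
}


def speedup(row):
    """Speedup relative to the first column (4 processors, 1 time domain)."""
    return [row[0] / t for t in row]


def efficiency(row, procs):
    """Speedup divided by the growth in processor count."""
    return [s / (p / procs[0]) for s, p in zip(speedup(row), procs)]


for name, row in times.items():
    s = speedup(row)[-1]
    e = efficiency(row, processors)[-1]
    print(f"{name}: speedup {s:.1f} on 32x the processors, efficiency {e:.2f}")
```

For the Global row this gives a speedup of about 20.8 going from 4 to 128 processors, an efficiency of about 0.65 relative to the 4-processor run.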
Results – Fixed Number of Spatial Domains (4)
Best Results
Sequential (1e-2, P)
Parallel (P, 1e-3)
Global (1e-3)
Results – Fixed Number of Processors (32)
Spatial Domains           32    16     8     4     2     1
Time Domains               1     2     4     8    16    32
Sequential (1e-6, P)      72    71    87   100   168   122
Sequential (1e-2, P)      55    52    59    66   103    84
Sequential (P, 1e-3)     551   310   339   359   548   625
Parallel (1e-6, 1e-3)    117    95    99   107   154   170
Parallel (P, 1e-3)       548   217   162   135    84    70
Block Diag (P, 1e-3)     550   204   161   137    88    69
Global (1e-3)            365   172   143   125    81    57

Parareal (four values reported per row):
Parareal (1e-6, P)        70    75   110   226
Parareal (P, 1e-3)       551   188   184   399

Preconditioner (block solve tolerance, GMRES tolerance); P = preconditioning only
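To see which space-time split works best for each method at 32 processors, the table above can be scanned for its row minima. This is a minimal sketch that again reads the values as solve times (an assumption, since units are not stated); the function name is hypothetical and the incomplete Parareal rows are omitted.

```python
time_domains = [1, 2, 4, 8, 16, 32]   # spatial domains = 32 // time domains

rows = {
    "Sequential (1e-6, P)": [72, 71, 87, 100, 168, 122],
    "Sequential (1e-2, P)": [55, 52, 59, 66, 103, 84],
    "Sequential (P, 1e-3)": [551, 310, 339, 359, 548, 625],
    "Parallel (1e-6, 1e-3)": [117, 95, 99, 107, 154, 170],
    "Parallel (P, 1e-3)": [548, 217, 162, 135, 84, 70],
    "Block Diag (P, 1e-3)": [550, 204, 161, 137, 88, 69],
    "Global (1e-3)": [365, 172, 143, 125, 81, 57],
}


def best_split(row):
    """Return (spatial_domains, time_domains, value) for the smallest entry."""
    i = min(range(len(row)), key=row.__getitem__)
    return 32 // time_domains[i], time_domains[i], row[i]


for name, row in rows.items():
    print(name, best_split(row))

overall = min(rows, key=lambda name: min(rows[name]))
print("lowest value overall:", overall, best_split(rows[overall]))
```

In this data the Sequential variants are fastest with few time domains, while Parallel (P, 1e-3), Block Diag, and Global are fastest with all 32 processors assigned to time domains.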
Summary
• Conclusions
– Several preconditioners improve performance of space-time solves
– Achieve time parallelism for serial codes (fixed spatial domains)
• Future Work
– More time steps (study limits of time parallelism)
– Comparison of analysis to experimental timing results
– Periodic orbit tracking
– Initial guesses for Newton (mesh refinement/preconditioning)
– Other time discretizations (p-refinement)
– Adaptive time steps (r-adaptivity) and time domain partitioning
Thank You
MS44 – Parallel Space-Time Algorithms
Friday, 9:45 – 11:45 AM (Carmel Room)
Space-Time Solution of Large-Scale PDE Applications
Andy Salinger, 11:15 – 11:40 AM
Danny Dunlavy
Andy Salinger