

PECOS: Predictive Engineering and Computational Sciences

Exascale Algorithms for Large-Scale Solvers and Uncertainty Quantification

George Biros, Roger Ghanem, Omar Ghattas

The University of Texas at Austin

January 15, 2013

ICES

Biros, Ghanem, Ghattas Exascale Algorithms January 15, 2013 1 / 15


“Moore’s Law” for MHD simulations

From SCaLeS Report, Vol. 2, D. Keyes et al., eds., 2004


A brief history of parallel algorithms research by our team
As measured in the “SC’XY awards norm”

Team members have published 18 SC’XY papers on scalable parallel algorithms since the late 90s, receiving a number of SC honors:

• SC02: Inexact Newton-Krylov for inverse problems (Best Paper Award)
• SC03: Inverse problems in wave propagation (Gordon Bell Prize)
• SC03: Kernel independent fast multipole method (Gordon Bell Finalist, Best Paper Finalist, Best Student Paper Award)
• SC06: Integrated simulation and visualization (Best Student Paper Finalist)
• SC06: Online meshing, simulation, and visualization (HPC Analytics Award)
• SC07: Non-uniform multigrid (Best Paper Finalist)
• SC08: AMR on octrees (Gordon Bell Finalist)
• SC09: KIFMM for heterogeneous systems (Best Student Paper Finalist)
• SC09: High-order AMR on complex geometry (Best Poster Award)
• SC10: High-order AMR on complex geometry (Gordon Bell Finalist)
• SC10: Fast multipole for complex fluids (Gordon Bell Prize)
• SC12: UQ for inverse problems (Gordon Bell Finalist)


A hybrid geometric-algebraic multigrid method

• Multigrid is the gold standard for linear solvers

• AMG: ideal for unstructured meshes, but difficulty scaling to extreme core counts (e.g., ML from Trilinos, BoomerAMG from hypre)

• GMG: demonstrated good scaling to O(10^5) cores, but challenges for unstructured meshes

• Hybrid AMG-GMG:

  - Hexahedral coarse mesh to resolve geometry

  - GMG using forest-of-octrees adaptivity to define finer meshes and prolongation & restriction operators

  - AMG as the coarse mesh solver

  - Weak-scales to 262K cores with 71% efficiency

[Figure: strong scaling on Jaguar XK6 for variable-coefficient Poisson on an adapted spherical mesh with 124M elements; time (sec) vs. number of cores (10^3 to 10^5). Curves: AMG strong (blue, ML/Trilinos) and GMG strong (red, our hybrid method).]

H. Sundar, G. Biros, C. Burstedde, J. Rudi, O. Ghattas, G. Stadler, Parallel geometric-algebraic multigrid on unstructured forests of octrees, Proceedings of SC12.
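The structure of the hybrid method (geometric levels with prolongation and restriction, smoothing on each level, and a separate solver on the coarsest mesh) can be illustrated with a minimal serial sketch. This is not the authors' implementation: it assumes a 1D Poisson problem on a uniform grid, weighted Jacobi smoothing, full-weighting restriction, linear interpolation, and an exact one-unknown solve standing in for the AMG coarse solver. All function names are hypothetical.

```python
import math

def apply_A(u, h):
    """Apply the 1D operator -u'' (3-point stencil, zero Dirichlet boundaries)."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / (h * h))
    return out

def residual(u, f, h):
    return [fi - ai for fi, ai in zip(f, apply_A(u, h))]

def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi: u <- u + omega * D^{-1} (f - A u), with D = 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * (h * h / 2.0) * ri for ui, ri in zip(u, r)]
    return u

def restrict(r):
    """Full weighting from 2m+1 fine points to m coarse points."""
    m = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range(m)]

def prolong(e, n_fine):
    """Linear interpolation from m coarse points to 2m+1 fine points."""
    out = [0.0] * n_fine
    for j, ej in enumerate(e):
        out[2 * j] += 0.5 * ej
        out[2 * j + 1] += ej
        out[2 * j + 2] += 0.5 * ej
    return out

def v_cycle(u, f, h, nu=4):
    """One V-cycle with nu pre- and nu post-smoothing sweeps."""
    n = len(u)
    if n == 1:
        # Coarsest level: exact solve (stand-in for the AMG coarse solver).
        return [f[0] * h * h / 2.0]
    u = smooth(u, f, h, nu)
    coarse_rhs = restrict(residual(u, f, h))
    coarse_err = v_cycle([0.0] * len(coarse_rhs), coarse_rhs, 2.0 * h, nu)
    u = [ui + ei for ui, ei in zip(u, prolong(coarse_err, n))]
    return smooth(u, f, h, nu)

# Solve -u'' = pi^2 sin(pi x) on (0, 1); the exact solution is sin(pi x).
n = 2 ** 6 - 1
h = 1.0 / (n + 1)
f = [math.pi ** 2 * math.sin(math.pi * (i + 1) * h) for i in range(n)]
u = [0.0] * n
norms = []
for cycle in range(5):
    u = v_cycle(u, f, h)
    norms.append(max(abs(r) for r in residual(u, f, h)))
    print(f"cycle {cycle + 1}: max residual = {norms[-1]:.3e}")
```

In the hybrid method described on this slide, the geometric hierarchy comes from the forest-of-octrees refinement of a hexahedral coarse mesh, and AMG is applied only once the mesh can no longer be coarsened geometrically.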


Weak scalability of hybrid geometric-algebraic multigrid based on forest-of-octrees adaptivity

Cores              64      512     4096   32,768  262,144
Setup            2.97     2.64      3.1     3.76      8.6
Smoother        289.7    301.5    336.3    391.3    409.1
Transfer         7.45     8.47     11.5    11.35    15.88
Coarse Setup     1.85     2.13     0.82     1.27     1.63
Coarse Solve     24.3     30.8    18.47     30.1    26.01
Total Time      326.3    345.5    370.2    437.8    461.2

• Weak scaling of Poisson solve with 400K elements per core (largest problem = 100 billion DOF)
• 45K octrees coarse mesh
• 4 pre- and post-smoothing steps
• ML AMG solver (from Trilinos) used as coarse grid solver
• 71% parallel efficiency for 4000× increase in problem size & core count from 64 to 262,144 cores
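The 71% figure can be checked from the Total Time row on this slide. The sketch below assumes the usual weak-scaling definition, efficiency = T(smallest run) / T(larger run), which the slide does not state explicitly; the function name is hypothetical.

```python
# Total Time row from this slide, keyed by core count.
total_time = {64: 326.3, 512: 345.5, 4096: 370.2, 32768: 437.8, 262144: 461.2}

def weak_scaling_efficiency(times):
    """Return {cores: T(base) / T(cores)} relative to the smallest core count."""
    base = min(times)
    return {cores: times[base] / t for cores, t in sorted(times.items())}

efficiency = weak_scaling_efficiency(total_time)
growth = max(total_time) / min(total_time)  # growth in core count and problem size

for cores, eff in efficiency.items():
    print(f"{cores:>7} cores: {100.0 * eff:5.1f}% efficiency")
print(f"core-count growth: {growth:.0f}x")
```

Under this definition the last entry reproduces the quoted 71%, and the growth factor of 4096 is what the slide rounds to 4000×.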


Research issues for exascale multigrid solvers

• Extending our method to anisotropic and rough operators

• Extending our method to high-order discretizations

• Fault tolerance

• Performance tuning, particularly for heterogeneous architectures

• Deployment within rvdDNS and GRINS

• Alternative to Newton-MG: Nonlinear multigrid


p4est: Parallel forest-of-octrees AMR library

[Figure: forest of octrees partitioned among processes p0, p1, p2, with tree coordinate axes (x0, y0) and (x1, y1).]

Details in: C. Burstedde, L.C. Wilcox, and O. Ghattas, p4est: Scalable algorithms for parallel adaptive mesh refinement on forests of octrees, SIAM Journal on Scientific Computing, 33(3):1103–1133, 2011.

Open-source release, interface w/deal.II: W. Bangerth, C. Burstedde, T. Heister, and M. Kronbichler, Algorithms and data structures for massively parallel generic adaptive finite element codes, ACM Transactions on Mathematical Software, 30, 2011.


Weak scalability of p4est-only operations on full Jaguar
Excellent scalability of pure AMR operations over 18,360X range of core count

[Figure, left panel: percentage of runtime vs. number of CPU cores (12, 60, 432, 3444, 27540, 220320) for Partition, Balance, Ghost, and Nodes. Right panel: seconds per (million elements/core) vs. number of CPU cores for Balance and Nodes.]

Left: Runtime dominated by Balance and Nodes while Partition and Ghost take less than 10% (New and Refine are negligible and not shown).

Right: Weak scaling for 2.3 million elements/core; ideal scaling would result in bars of constant height. Largest mesh created contains over 513 billion elements and is balanced in 21 s.

Details in: C. Burstedde, O. Ghattas, M. Gurnis, T. Isaac, G. Stadler, T. Warburton, L.C. Wilcox, Extreme-Scale AMR, Proceedings of ACM/IEEE SC10 (Gordon Bell Prize Finalist)


Scalable methods for polynomial chaos expansions

• Conventional PCEs suffer from the curse of dimensionality

• Stems from attempting to approximate entire spatio-temporal field of solution

• Yet QoIs are generally low-dimensional functionals of solution

• Coordinate rotation of stochastic space results in most of the probabilistic content of QoI being concentrated about a single dimension

• Work to compute transformation scales linearly in parameter space dimension

• These ideas will be developed and applied to target combustion problem

R. Tipireddy and R. Ghanem, Basis Adaptation in Homogeneous Chaos Spaces, submitted.
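Two of the points on this slide can be made concrete with a small sketch. First, a total-degree PCE of order p in d variables has C(d + p, p) basis functions, which is the curse of dimensionality the slide refers to. Second, if a QoI depends on the random vector ξ only through one direction a·ξ, an orthogonal change of variables whose first coordinate is aligned with a makes the QoI one-dimensional, and that change of variables can be applied in O(d) work. The code below uses a Householder reflection as the orthogonal map, chosen here for illustration; it is not the construction from the cited paper, and the direction `a` is a hypothetical stand-in for sensitivity information.

```python
import math
import random

def pce_terms(dim, order):
    """Number of basis functions in a total-degree PCE: C(dim + order, order)."""
    return math.comb(dim + order, order)

def rotate_to_direction(a, xi):
    """Return eta = H xi, where H is the Householder reflection with H e1 = a/||a||.

    H is symmetric and orthogonal, so eta[0] = (a . xi) / ||a||.
    Cost is O(d): H is never formed explicitly.
    """
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    v = [-ai / norm_a for ai in a]
    v[0] += 1.0                      # v = e1 - a/||a||
    vv = sum(vi * vi for vi in v)
    if vv == 0.0:                    # a is already aligned with e1
        return list(xi)
    scale = 2.0 * sum(vi * x for vi, x in zip(v, xi)) / vv
    return [x - scale * vi for x, vi in zip(xi, v)]

rng = random.Random(1)
d = 1000
a = [rng.gauss(0.0, 1.0) for _ in range(d)]    # hypothetical dominant direction
xi = [rng.gauss(0.0, 1.0) for _ in range(d)]   # one sample of the germ

eta = rotate_to_direction(a, xi)
norm_a = math.sqrt(sum(ai * ai for ai in a))
a_dot_xi = sum(ai * x for ai, x in zip(a, xi))

print("order-3 PCE terms in d = 1000 variables:", pce_terms(d, 3))
print("order-3 PCE terms in 1 variable:        ", pce_terms(1, 3))
print("a . xi            =", a_dot_xi)
print("||a|| * eta[0]    =", norm_a * eta[0])
```

Because the map is orthogonal, η is again a standard Gaussian vector when ξ is, so a one-dimensional expansion in η[0] can replace the d-dimensional expansion for any QoI of the form g(a·ξ).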


Bayesian framework for inverse problems:
Quest for knowledge from data and models

Input parameters, computational model, and output observables

Uncertainty is a fundamental feature of ill-posed inverse problems:

• Deterministic approach to ill-posedness: employ regularization to penalize unwanted solution features, guarantee unique solution

• Bayesian approach to ill-posedness: describe probability of all parameters that are consistent with the data, the model, and any prior knowledge of the parameters

• Unfortunately, solution of Bayesian inverse problems via MCMC (method of choice) is intractable for high dimensional parameter spaces and expensive forward models!


Stochastic Newton MCMC sampling
Goal: exploit problem structure in form of Hessian of parameter-to-observable map

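Sample posterior probability density

$$\pi_{\text{post}}(m) \propto \exp\left( -\tfrac{1}{2} \left\| f(m) - d_{\text{obs}} \right\|^2_{\Gamma_{\text{noise}}^{-1}} - \tfrac{1}{2} \left\| m - m_{\text{pr}} \right\|^2_{\Gamma_{\text{pr}}^{-1}} \right)$$

MCMC: propose from distribution $q(m_k, \cdot)$; accept with probability

$$\alpha = \min\left( 1, \frac{\pi(y)\, q(y, m_k)}{\pi(m_k)\, q(m_k, y)} \right)$$

Convergence comparison: Stochastic Newton vs. DRAM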

[Figure, two panels, each with x and y axes from −0.5 to 1. First panel: random walk proposal (isotropic Gaussian). Second panel: stochastic Newton proposal (local Hessian-tailored Gaussian).]
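The acceptance rule on this slide, and the difference between the two proposals, can be sketched on a toy problem. The code below is an illustration under strong assumptions, not the authors' sampler: the target is a 2D Gaussian with a constant Hessian `H` and mean `MU` (both hypothetical values), so the Hessian-tailored proposal N(m − H⁻¹g(m), H⁻¹) coincides with the target itself. For a non-Gaussian posterior the Hessian varies with m, and the proposal density would also need its m-dependent normalization.

```python
import math
import random

MU = (0.5, 0.25)                    # hypothetical posterior mean
H = ((50.0, 30.0), (30.0, 50.0))    # hypothetical constant SPD Hessian

def quad(v):
    """Return 0.5 * v^T H v."""
    return 0.5 * (H[0][0] * v[0] ** 2 + 2.0 * H[0][1] * v[0] * v[1]
                  + H[1][1] * v[1] ** 2)

def neg_log_post(m):
    return quad((m[0] - MU[0], m[1] - MU[1]))

def newton_point(m):
    """Return m - H^{-1} g(m), where g is the gradient of neg_log_post."""
    d0, d1 = m[0] - MU[0], m[1] - MU[1]
    g0 = H[0][0] * d0 + H[0][1] * d1
    g1 = H[1][0] * d0 + H[1][1] * d1
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return (m[0] - (H[1][1] * g0 - H[0][1] * g1) / det,
            m[1] - (H[0][0] * g1 - H[1][0] * g0) / det)

# Cholesky factor of H (H = L L^T), used to draw samples with covariance H^{-1}.
L00 = math.sqrt(H[0][0])
L10 = H[1][0] / L00
L11 = math.sqrt(H[1][1] - L10 * L10)

def propose_newton(m, rng):
    c = newton_point(m)
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x1 = z1 / L11                   # solve L^T x = z by back substitution
    x0 = (z0 - L10 * x1) / L00
    return (c[0] + x0, c[1] + x1)

def log_q_newton(m, y):
    """Log density (up to a constant) of proposing y from m."""
    c = newton_point(m)
    return -quad((y[0] - c[0], y[1] - c[1]))

STEP = 0.3                          # hypothetical random-walk step size

def propose_rw(m, rng):
    return (m[0] + STEP * rng.gauss(0.0, 1.0), m[1] + STEP * rng.gauss(0.0, 1.0))

def log_q_rw(m, y):
    return 0.0                      # symmetric proposal: q cancels in the ratio

def metropolis_hastings(propose, log_q, m0, n_steps, rng):
    m, accepted, chain = m0, 0, []
    for _ in range(n_steps):
        y = propose(m, rng)
        # log of pi(y) q(y, m) / (pi(m) q(m, y))
        log_ratio = (-neg_log_post(y) + log_q(y, m)) - (-neg_log_post(m) + log_q(m, y))
        if math.log(1.0 - rng.random()) < min(0.0, log_ratio):
            m, accepted = y, accepted + 1
        chain.append(m)
    return chain, accepted / n_steps

rng = random.Random(0)
start = (-0.5, 1.0)
rw_chain, rw_rate = metropolis_hastings(propose_rw, log_q_rw, start, 2000, rng)
sn_chain, sn_rate = metropolis_hastings(propose_newton, log_q_newton, start, 2000, rng)
print(f"random walk acceptance rate:       {rw_rate:.2f}")
print(f"stochastic Newton acceptance rate: {sn_rate:.2f}")
```

In this Gaussian special case the acceptance ratio for the Hessian-based proposal is 1 up to rounding, so essentially every proposal is accepted and successive samples are independent; the isotropic random walk has no knowledge of the anisotropy in `H` and rejects a substantial fraction of its proposals.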


Million-dimensional example

• 1.07 million uncertain acoustic wave speed parameters
• 630 million state variables, 2400 time steps
• Up to 100K cores on Jaguar XK6 (single forward solve is 1 minute on 64K cores)
• 2000× reduction in problem dimension (488 dominant eigenvectors)
• Top row: Samples from prior
• Bottom row: Samples from the posterior
• Right: “true” earth model (black dots=5 sources, white dots=100 receivers)

J. Martin, L.C. Wilcox, C. Burstedde, and O. Ghattas, A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion, SIAM Journal on Scientific Computing, 34(3):A1460-A1487, 2012.

T. Bui-Thanh, C. Burstedde, O. Ghattas, J. Martin, G. Stadler, and L.C. Wilcox, Extreme-scale UQ for Bayesian inverse problems governed by PDEs, Proceedings of SC12.


Research challenges for intrusive MCMC sampling

• Scalable prior operators

• Reuse of Hessian information to improve Gaussian proposals

• Devise problem-specific Hessian approximations when even low rank approximation is too expensive

• Develop trust region methods to enhance robustness of stochastic Newton for strongly non-Gaussian distributions

• All of the above in extreme-scale setting


Synergistic projects

• QUEST: Quantification of Uncertainty in Extreme Scale Computations, DOE ASCR SciDAC Institutes program, 2011–2016. (SNL, LANL, Duke, MIT, USC, UT Austin)
• DiaMonD: An Integrated Multifaceted Approach to Mathematics at the Interfaces of Data, Models, and Decisions, DOE ASCR MMICCs program, 2012–2017. (UT Austin, MIT, FSU, CSU, Stanford, ORNL, LANL)
• Ultra-Scalable Algorithms for Large-Scale Uncertainty Quantification in Inverse Wave Propagation, AFOSR Computational Mathematics program, 2012–2015.
• Ultra-High Resolution Dynamic Earth Models Through Joint Inversion of Seismic and Geodynamic Data, NSF CDI program, 2010–2014.
• Stochastic Prediction for the Design and Management of Interacting Complex Systems, NSF EFRI, 2010–2013.
• Dynamics of Ice Sheets: Advanced Simulation Models, Large-Scale Data Inversion, and Quantification of Uncertainty in Sea Level Rise Projections, NSF CDI program, 2009–2013.
• Uncertainty Quantification for Large-Scale Ice Sheet Modeling, DOE ASCR SciDAC program, 2009–2013.
• Analysis and Reduction of Complex Networks Under Uncertainty, DOE ASCR, 2009–2013.
• Software for Integral Equation Solvers on Manycore and Heterogeneous Architectures, NSF SI2 program, 2009–2013.


Extra slides
