
Lawrence Livermore National Laboratory

Parallel spectral element-based agglomeration algebraic multigrid for porous media flow

LLNL-PRES-667969. This work was performed under the auspices of the U.S. Department of Energy under Contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC.

A. T. Barker, P. S. Vassilevski (Lawrence Livermore National Laboratory)

D. Kalchev (University of Colorado, Boulder)

Achieving resilience

I Rely on resilient kernels (matrix-vector multiplication).

I Rely on resilient libraries (black box, well tested).

Our goal is to use these generic components to build more specialized algorithms.

Outline

I Model problem

I Algebraic multigrid

I Algebraic multigrid for the model problem.

I Putting it all together.

Model problem

Equations for porous media flow:

−∇ · (κ∇u) = f

I κ is discontinuous, has high contrast, and is generally nasty (a toy discretization is sketched below).

I Applications in petroleum engineering, among other places.
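Below is a minimal sketch (not the authors' code) of what such a problem looks like in practice: a five-point finite-volume discretization of −∇ · (κ∇u) = f on a uniform grid, with a randomly generated discontinuous κ spanning about six orders of magnitude. The grid size, boundary conditions, and coefficient field here are illustrative assumptions only.

    import numpy as np
    import scipy.sparse as sp

    def assemble_diffusion(kappa):
        """Five-point stencil for -div(kappa grad u) with harmonic face averages.

        kappa: (n, n) array of cell coefficients; homogeneous Dirichlet boundary.
        """
        n = kappa.shape[0]
        idx = lambda i, j: i * n + j
        rows, cols, vals = [], [], []
        for i in range(n):
            for j in range(n):
                diag = 0.0
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n and 0 <= jj < n:
                        # harmonic average of kappa across the shared face
                        k_face = 2.0 * kappa[i, j] * kappa[ii, jj] / (kappa[i, j] + kappa[ii, jj])
                        rows.append(idx(i, j))
                        cols.append(idx(ii, jj))
                        vals.append(-k_face)
                        diag += k_face
                    else:
                        diag += kappa[i, j]  # boundary face (homogeneous Dirichlet)
                rows.append(idx(i, j))
                cols.append(idx(i, j))
                vals.append(diag)
        return sp.csr_matrix((vals, (rows, cols)), shape=(n * n, n * n))

    # discontinuous, high-contrast coefficient (roughly six orders of magnitude)
    n = 64
    rng = np.random.default_rng(0)
    kappa = 10.0 ** rng.integers(-3, 4, size=(n, n)).astype(float)
    A = assemble_diffusion(kappa)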

Algebraic multigrid

An idealized view of AMG treats it as a black-box solver for

Ax = b

I In practice all AMG algorithms use some assumptions about the “near nullspace” of A.

I We want to use out-of-the-box AMG in situations where these assumptions are not satisfied (a black-box solve is sketched below).
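As a point of reference, a sketch of the “black box” usage, with PyAMG's smoothed aggregation standing in for a package such as Hypre (an illustrative substitution, not the setup used for the results below); the default construction assumes the constant vector as the near nullspace.

    import numpy as np
    import pyamg

    b = np.zeros(A.shape[0])                       # A as assembled in the sketch above
    x0 = np.random.rand(A.shape[0])                # random initial guess
    ml = pyamg.smoothed_aggregation_solver(A)      # default near-nullspace candidate: constants
    x = ml.solve(b, x0=x0, tol=1e-12, accel='cg')  # AMG-preconditioned CG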

AMG packages

AMG packages such as Hypre, ML, and SAMG have a lot of advantages:

I Well-tested.

I Fast and scalable.

I Resilient?

[Baker et al. 2011; Drummond and Marques 2005]

AMG and the near nullspace

I Coarse grids must represent smooth error modes from fine grids.

I We don’t know in advance what these modes are.

I We have to make a priori assumptions.

I Traditionally, we assume the vector of all ones is the near nullspace.

I This assumption is true for many (not all!) discretizations of many (not all!) Poisson-like problems.

I What if it’s not true? (One workaround is sketched below.)
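One illustration of the issue and a possible workaround (a sketch only, not the approach developed on the following slides): smoothed-aggregation packages usually accept extra near-nullspace candidate vectors. Here they come from a global eigensolve, which is exactly the expensive computation that the local spectral construction below avoids.

    import numpy as np
    import scipy.sparse.linalg as spla
    import pyamg

    # a few eigenvectors with the smallest eigenvalues approximate the near nullspace of A
    vals, vecs = spla.eigsh(A, k=4, sigma=0, which='LM')
    B = np.hstack([np.ones((A.shape[0], 1)), vecs])   # candidate near-nullspace vectors
    ml = pyamg.smoothed_aggregation_solver(A, B=B)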

AMG for the model problem

−∇ · (κ∇u) = f

I High contrast in κ tends to add additional components to the near nullspace.

I We need to represent these components on at least the first coarse grid.

Smoothed aggregation spectral AMGe

I Identifying near-nullspace components is a (global) eigenvalue problem.

I We approximate them by solving local eigenvalue problems.

This is the basis of an algorithm called smoothed aggregation spectral element-based AMG [Brezina-Vassilevski 2011].

Element-based algebraic multigrid

Spectral AMGe

[Chartier et al. 2003]

I Subdivide mesh into agglomerates using a graph partitioner.

I On an agglomerate τ, assemble a local matrix A_τ (includes coefficients).

I Solve a local eigenvalue problem to find coarse basis functions on A_τ.

I These basis functions become the columns of P (see the sketch below).
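A rough serial sketch of this construction, under simplifying assumptions: the agglomerates are given as index sets (e.g. from a graph partitioner such as METIS), and the local matrix is taken here as the principal submatrix of A rather than being assembled from element matrices, so the local problems are not the true Neumann problems of the actual method.

    import numpy as np
    import scipy.sparse as sp
    from scipy.linalg import eigh

    def tentative_prolongator(A, agglomerates, theta=1e-2):
        """Columns of P = local eigenvectors with relative eigenvalue below theta."""
        rows, cols, vals = [], [], []
        col = 0
        for dofs in agglomerates:                  # dofs: list of fine indices in tau
            A_tau = A[dofs, :][:, dofs].toarray()  # local matrix on agglomerate tau
            w, V = eigh(A_tau)
            keep = V[:, w <= theta * w.max()]      # usually 1-3 vectors per agglomerate
            for k in range(keep.shape[1]):
                rows.extend(dofs)
                cols.extend([col] * len(dofs))
                vals.extend(keep[:, k])
                col += 1
        return sp.csr_matrix((vals, (rows, cols)), shape=(A.shape[0], col))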

Spectral AMGe with smoothed aggregation

[Brezina-Vassilevski 2011]

I The domain we assemble A_τ on is a subset of the agglomerated element (partition the degrees of freedom).

I We smooth the resulting basis functions when we construct P (sketched below).

I We set a tolerance and include all eigenvectors corresponding to eigenvalues smaller than this tolerance (usually 1 to 3 eigenvectors per agglomerated element).

I The local problems are Neumann problems.
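The smoothing step is the standard smoothed-aggregation one. A sketch, with an assumed damped-Jacobi prolongator smoother and a crude power-iteration estimate of the spectral radius:

    import numpy as np
    import scipy.sparse as sp

    def smooth_prolongator(A, P_tent, omega=4.0 / 3.0, power_its=20):
        """P = (I - omega/rho(D^-1 A) * D^-1 A) * P_tent."""
        Dinv = sp.diags(1.0 / A.diagonal())
        DinvA = (Dinv @ A).tocsr()
        v = np.random.default_rng(1).random(A.shape[0])
        for _ in range(power_its):                 # estimate rho(D^-1 A) by power iteration
            v = DinvA @ v
            v /= np.linalg.norm(v)
        rho = float(v @ (DinvA @ v))
        return (P_tent - (omega / rho) * (DinvA @ P_tent)).tocsr()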

Element-based algebraic multigrid

I Only requires mesh on finest level.

I Includes information about mesh, coefficients, smoother.

I Preserves multigrid approximation properties.

I Everything is built on matrix-vector multiplication.
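To make the last point concrete, here is a sketch of the resulting two-level preconditioner built from nothing but matrix-vector products; damped Jacobi smoothing and a CG coarse solve are illustrative choices (the results below also use CG on the coarse problem).

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def two_level_preconditioner(A, P, smoothing_its=2, omega=0.7):
        """One two-grid cycle: Jacobi smoothing on the fine level, CG on P^T A P."""
        Ac = (P.T @ A @ P).tocsr()
        Dinv = sp.diags(1.0 / A.diagonal())

        def apply(r):
            r = np.asarray(r).ravel()
            x = np.zeros_like(r)
            for _ in range(smoothing_its):          # pre-smoothing
                x = x + omega * (Dinv @ (r - A @ x))
            xc, _ = spla.cg(Ac, P.T @ (r - A @ x))  # coarse-grid correction by CG
            x = x + P @ xc
            for _ in range(smoothing_its):          # post-smoothing
                x = x + omega * (Dinv @ (r - A @ x))
            return x

        return spla.LinearOperator(A.shape, matvec=apply)

    # usage: x, info = spla.cg(A, b, M=two_level_preconditioner(A, P))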

The two level method works

Using CG to solve the coarse problem (1 million unknowns on the fine grid, approx. 30,000 coarse):

 processors |  1m: iterations  conv. factor |  9m: iterations  conv. factor
          4 |        24            0.31     |         -             -
          8 |        25            0.32     |         -             -
         16 |        30            0.39     |        26            0.34
         32 |        28            0.36     |        28            0.36
         64 |        24            0.31     |        24            0.30
        128 |        34            0.44     |        32            0.42
        256 |        34            0.44     |        32            0.42
        512 |         -             -       |        25            0.33

Recursion is not straightforward

The agglomeration itself is not problematic, but multilevel extensions of this kind of procedure tend to suffer from some combination of the following three problems:

I Increasing operator complexities (dense coarse matrices).

I Ill-conditioned coarse matrices (linearly dependent coarse basis functions).

I Very complicated constructions of P .

Avoid recursion

I One approach is to use a resilient parallel hierarchical semi-separable solver from Berkeley Lab.

I Another way to avoid recursion is to simply apply Hypre to the coarse problem...

Naively apply Hypre to spectral problem

PCG-Hypre for coarse problem (tight tolerance)

 processors |  1m: coarse iterations  time |  9m: coarse iterations  time
          4 |           7k           405.7 |            -              -
          8 |          10k           406.2 |            -              -
         16 |          14k           326.1 |           28k          11180
         32 |          20k           285.0 |           38k           9696
         64 |          19k           102.4 |           35k           3606
        128 |          21k            53.0 |           41k           2228
        256 |          22k            43.9 |           71k           2346
        512 |           -              -   |           44k            662.4

The resulting spectral grid problem

The coarse-grid problem coming from smoothed aggregation spectral element-based AMG:

I is not nodal,

I has “spectral” degrees of freedom,

I and the vector of ones is not in the near nullspace, which is a key assumption of the BoomerAMG solver.

Where are we?

I We have a spectral element-based AMG that makes a nice coarse level for the original problem.

I It is difficult to make further coarse levels with the spectral AMG.

I The problem on the spectral level is not well suited to black-box AMG.

The “three-level” strategy

[Hierarchy diagram: fine level (A) → spectral level (A_s) → nullspace level (A_n) → Hypre, with interpolation P between the fine and spectral levels.]

The strategy

We would like to formulate some version of the coarse problem that has the right near nullspace for Hypre.

Suppose we have a vector c on the spectral level with

P c = 1_fine.

Now define a diagonal matrix P̂ so that

c = P̂ 1_nul.

The strategy

P c = 1_fine

c = P̂ 1_nul

Then

P̂^T A_s P̂ 1_nul = P̂^T A_s c = P̂^T P^T A 1_fine ≈ 0,

that is, 1_nul is in the near nullspace of the operator A_n = P̂^T A_s P̂.
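A minimal sketch of this rescaling, assuming a spectral-level vector c is already available (the next slide sketches one way to compute it): with P̂ = diag(c), the nullspace-level operator has the constant vector as an approximate near-nullspace component and can be handed to a standard AMG package.

    import scipy.sparse as sp

    def nullspace_level_operator(A_s, c):
        """Return A_n = P_hat^T A_s P_hat with P_hat = diag(c)."""
        P_hat = sp.diags(c)
        A_n = (P_hat.T @ A_s @ P_hat).tocsr()
        return A_n, P_hat   # A_n is what gets handed to the black-box AMG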

Practical details

Finding c with

P c = 1_fine

is a global (and in general not well-defined) problem. We consider instead

P̃ c = 1_fine,

where P̃ is the “tentative” interpolator. We solve each block in a least-squares sense (see the sketch below).
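A sketch of the blockwise least-squares solve, assuming the block structure of the tentative interpolator is available as (fine dofs, coarse dofs) index pairs, one per agglomerate (the names and data layout here are illustrative):

    import numpy as np

    def blockwise_c(P_tilde, blocks):
        """Solve P_tilde c ≈ 1 block by block in the least-squares sense.

        blocks: list of (fine_dofs, coarse_dofs) index lists, one pair per agglomerate.
        """
        c = np.zeros(P_tilde.shape[1])
        for fine, coarse in blocks:
            Pb = P_tilde[fine, :][:, coarse].toarray()  # local block of the tentative P
            sol, *_ = np.linalg.lstsq(Pb, np.ones(len(fine)), rcond=None)
            c[coarse] = sol
        return c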

Numerical results

I We use the SPE10 model, which is (very) standard in petroleum engineering.

I About 1 million elements (refined to 9 or 72 million); κ is discontinuous and highly varying.

I Parallel results are on Sierra, a fairly standard Linux cluster with 12 cores and 24 GB of memory per node and an InfiniBand interconnect.

I We solve with a zero right-hand side, a random initial guess, and a residual tolerance of 10⁻¹².

Three level technique

Corrected nullspace, PCG-Hypre for the problem on the nullspace level (tight tolerance):

 processors |  1m: fine its  nul its  time |  9m: fine its  nul its  time
          4 |       30        434     73.9 |        -         -        -
          8 |       31        397     51.8 |        -         -        -
         16 |       36        397     28.4 |       49        632    533.5
         32 |       35        324     17.8 |       51        572    349.6
         64 |       31        256      8.9 |       46        470    141.4
        128 |       40        287      6.2 |       55        448     77.9
        256 |       41        252      4.6 |       52        421     45.1
        512 |        -         -        -  |       49        307     23.3

Scaling

Further scaling

Optimize for time (V-cycle on nullspace level, less smoothing)

 processors |  1m: its   time |  9m: its   time |  72m: its    time
          4 |      78    88.7 |       -      -  |       -        -
          8 |      83    63.8 |       -      -  |       -        -
         16 |      82    33.5 |     181   688.0 |       -        -
         32 |      87    23.1 |     180   465.2 |       -        -
         64 |      82    11.8 |     182   213.8 |       -        -
        128 |      83     6.7 |     186   107.5 |       -        -
        256 |      75     3.8 |     176    65.7 |     386    1102.8
        512 |      86     3.2 |     190    34.7 |     392     461.4
       1024 |       -      -  |     193    22.5 |     396     280.4
       2048 |       -      -  |     187     9.9 |     392     126.3

Conclusions

I We demonstrate a method for porous media flow that has the potential for resilience.

I We investigate ways to get recursion and further scalability from a two-level method.

I For further scalability we need a better smoother.